Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#26337 closed Cleanup/optimization (fixed)

Translations - No fallback used if requesting english variant

Reported by: Cristiano Coelho
Owned by: nobody
Component: Documentation
Version: 1.9
Severity: Normal
Keywords: i18n translations english
Cc: cristianocca@…, Claude Paroz
Triage Stage: Ready for checkin
Has patch: yes
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description (last modified by Cristiano Coelho)

This is a new ticket raised for an issue noticed while reporting #26328.

The issue is exactly at this line https://github.com/django/django/blob/master/django/utils/translation/trans_real.py#L190

It is currently done this way because Django's own translations do not define values for English variants (see #24413), since the message IDs are enough. If the fallback language were used, all of Django's English strings would fail whenever the default language is not English, resulting in incorrect translations all over the place.

The fix for the ticket above solved the problem for Django's own English translations, but it creates a new issue (the one reported in this ticket).

The issue is that if you use a non-English default language and expect it to be used when an English translation is requested (i.e. you have no English translation at all in your app, or parts of it are missing), all your translated text falls back to the message ID rather than to the default language's translation (which may differ from the message ID).
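To make the rule concrete, here is a minimal sketch of the decision described above. It is not Django's code: `gets_default_fallback` is a hypothetical helper, and the `startswith("en")` test is an assumption based on how this ticket describes the check (any language code beginning with `en` is treated as an English variant).

```python
def gets_default_fallback(language, default_language):
    """Return True if the default-language catalog would be used as a fallback.

    Assumed rule, as described in this ticket: no fallback is attached for the
    default language itself or for any English variant.
    """
    return language != default_language and not language.startswith("en")


# Hypothetical project whose default language is Spanish.
for code in ("fr", "pt-br", "es", "en", "en-gb", "en-us"):
    print(code, gets_default_fallback(code, default_language="es"))
```

Under this assumed rule, only the non-English, non-default languages (`fr`, `pt-br`) would pick up Spanish strings when their own catalog lacks an entry.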

How to fix this? Well, no idea! The best bet would be to make sure all translation values are always defined rather than relying on message IDs as defaults, and then remove the English check. But that would mean going through all the English-variant translations shipped by Django and Django apps, and there might be people relying on the special English treatment.

Another fix, which involves no code changes, is to at least document that you should always have a translation file (with values) for English for all your apps; otherwise you may get unexpected results when using a non-English default language.

The last idea, which is quite complicated, is to mark translation .po files so that the compiler sets each translation value to its message ID. The English check could then be removed and Django's translations would still work (although Django's translation files would need to be marked).
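As a rough illustration of that last idea only, the sketch below fills every empty translation with its message ID. It works on a plain dict rather than on real .po or .mo files, and `fill_identity` is a hypothetical name; no such compiler option is known to exist.

```python
def fill_identity(catalog):
    """Return a copy of the catalog where each empty msgstr becomes its msgid."""
    return {msgid: (msgstr or msgid) for msgid, msgstr in catalog.items()}


# Hypothetical English-variant catalog with one untranslated entry.
english = {"Save": "", "Colour": "Color"}
print(fill_identity(english))
```

With every entry populated this way, a lookup would always find a value in the English catalog, so no special case for English would be needed in the fallback logic.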

Final note: This might not be really important, since most of the time if you are using translations, english will be one of them, and you will always have translation values if the default language is not the default, so this issue would not exist as long as this is always true.

Change History (15)

comment:1 by Cristiano Coelho, 9 years ago

Description: modified (diff)

comment:2 by Claude Paroz, 9 years ago

I think the scenarios you presented are really corner cases, and I'm not sure we should do anything about them. Note also that gettext is limited when translating from a language other than English, as soon as plural rules are different.

If you have an idea about some patch to suggest (code or docs), we'll gladly review it. Otherwise, I'm tempted to close it as won't fix.
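The plural-rule limitation mentioned in this comment can be seen with the standard library alone. The sketch below uses `gettext.NullTranslations`, which behaves as if no catalog were loaded, with Russian source strings chosen purely for illustration. Only two source forms can be passed to `ngettext`, while Russian needs three.

```python
import gettext

# NullTranslations models "no catalog loaded": strings come from the source.
untranslated = gettext.NullTranslations()


def describe(count):
    # Only a singular and a plural source form can be supplied.
    word = untranslated.ngettext("файл", "файлов", count)
    return f"{count} {word}"


for n in (1, 2, 5):
    print(describe(n))
```

For a count of 2, the correct Russian form would be "файла", which the two source forms cannot express. This is why a translation catalog for the base language itself is needed when its plural rules differ from English.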

comment:3 by Cristiano Coelho, 9 years ago

You are right about it being a corner case, but I think it should at least be mentioned in the docs, along with a recommendation to always provide translation values.

comment:4 by Tim Graham, 9 years ago

Component: Internationalization → Documentation
Triage Stage: Unreviewed → Accepted
Type: Bug → Cleanup/optimization

Could you propose some draft text?

comment:5 by Cristiano Coelho, 9 years ago

Hi, I'm not really good at writing, but here's an attempt.

I would put a note somewhere starting from here, probably after the msgid and msgstr definitions: https://docs.djangoproject.com/en/dev/topics/i18n/translation/#message-files

"Due to certain limitations of the GNU gettext toolset, English has special treatment. Make sure all your English (and variants) translation files correctly define a msgstr value and do not rely on your default language as fallback, or you may experience unexpected results such as the msgid being returned rather than the default language msgstr (which might or might not be the expected result)."

Would that work? Does it make any sense at all?

comment:6 by Claude Paroz, 9 years ago

Not convinced :-/
Would it be possible to provide a concrete example of what you try to explain, with real strings and real languages?

comment:7 by Cristiano Coelho, 9 years ago

Example (high level, no actual code)

  • Using message IDs that are actually codes rather than text. This applies to any case where the msgstr for your default language is not always the same as the msgid. (Another example: you avoid Unicode characters in msgids because you do not want Unicode in Python source strings, so you always need a msgstr value, even for the default language, for translations that contain Unicode characters.)
  • default language = 'en-gb'
  • translations: en-gb = { 'BUTTON_1': 'Colour', 'BUTTON_2': 'Monday' }, en-us = { 'BUTTON_1': 'Color', 'BUTTON_2': '' } (note how I'm lazy and don't translate the BUTTON_2 msgid, because 'Monday' is the same in every language variant, so I'm OK with falling back to the default).
  • Client requests the 'en-us' translation. Whoops, it starts with 'en', so the default language is not loaded as a fallback and you never get the actual BUTTON_2 translation; when requested, you end up with the ugly msgid.

comment:8 by Claude Paroz, 9 years ago (in reply to: 7)

Replying to cristianocca:

Example (high level, no actual code)

  • using message ids that are actually codes rather than text, but can be done with any case where the actual msgstr is not always the same as msgid for your default language (another example, if you do not use unicode characters as msgids for some reason like not wanting unicode on python code strings, so you always need the msgstr value even for default language for translations that have unicode characters)

(...)

I'm not convinced by either use case. Having codes as msgids is IMHO very bad practice, and avoiding Unicode is a thing of the past. I really need a legitimate use case to accept this.

comment:9 by Cristiano Coelho, 9 years ago

Hmmm, what about the use case where your msgids are in a different language than your default language setting? This is pretty much the reason for the hardcoded English check in Django.

Example again:

  • default language = Spanish
  • msgid language = Russian (Russian developers making a Spanish site)
  • requested language = English. If you don't have an English catalog/translation, you expect to fall back to the default language (as you would with any other language that's not English).
  • Actual result --> Russian msgids rather than Spanish translated values, because the default language is never loaded.

Perhaps there can be a simplified note, something like: "Keep in mind that due to certain framework limitations, when an English variant is requested, the default language will not be loaded and msgids will be returned if a catalog or translation is not available."

comment:10 by Claude Paroz, 9 years ago

Thanks, that use case is much more realistic!

I would suggest adding the following paragraphs under https://docs.djangoproject.com/en/1.9/topics/i18n/translation/#implementation-notes :

Non-English base language
-------------------------

Django makes the general assumption that the original strings in a translatable
project are written in English.
You can choose another language, but you must be aware of certain limitations:

* ``gettext`` only provides two plural forms for the original messages, so you
  will also need to provide a translation for the base language to include all
  plural forms if the plural rules for the base language are different from
  English.

* When an English variant is activated and English strings are missing, the
  fallback language will not be the :setting:`LANGUAGE_CODE` of the project,
  but the original strings. For example, an English user visiting a site whose
  default language is Spanish and whose original strings are written in Russian
  will fall back to Russian, not to Spanish.
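The Spanish/Russian example in the second bullet can be sketched with the standard library's ``gettext`` fallback chain. This is an illustration under stated assumptions, not Django's implementation: ``DictTranslations`` and ``build`` are hypothetical helpers, the catalogs are plain dicts, and the ``startswith("en")`` test stands in for the English check discussed in this ticket.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical catalog backed by a dict of msgid -> msgstr."""

    def __init__(self, catalog):
        super().__init__()
        self._catalog = catalog

    def gettext(self, message):
        if message in self._catalog:
            return self._catalog[message]
        # Defers to the fallback if one was added, else returns the msgid.
        return super().gettext(message)


def build(language, catalogs, default_language):
    trans = DictTranslations(catalogs.get(language, {}))
    # Assumed rule: no default-language fallback for English variants.
    if language != default_language and not language.startswith("en"):
        trans.add_fallback(DictTranslations(catalogs[default_language]))
    return trans


# Original strings in Russian, default language Spanish, no English catalog.
catalogs = {"es": {"Добро пожаловать": "Bienvenido"}, "de": {}}

print(build("de", catalogs, "es").gettext("Добро пожаловать"))
print(build("en-us", catalogs, "es").gettext("Добро пожаловать"))
```

In this sketch the German user gets the Spanish string through the fallback, while the English user gets the Russian original because no fallback is attached.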

Thoughts?

comment:11 by Cristiano Coelho, 9 years ago

I think that's good, and it will clear up any doubt of the "Why am I not getting my texts translated?" kind when it happens.

comment:12 by Claude Paroz, 9 years ago

Has patch: set

comment:13 by Tim Graham, 9 years ago

Triage Stage: Accepted → Ready for checkin

comment:14 by Claude Paroz <claude@…>, 9 years ago

Resolution: fixed
Status: new → closed

In f6fefbf:

Fixed #26337 -- Added i18n note about using a non-English base language

Thanks Cristiano Coelho for the report and Tim Graham for the review.

comment:15 by Claude Paroz <claude@…>, 9 years ago

In 5641a80:

[1.10.x] Fixed #26337 -- Added i18n note about using a non-English base language

Thanks Cristiano Coelho for the report and Tim Graham for the review.
Backport of f6fefbf8cb from master.
