#8626 closed Uncategorized (wontfix)
Translations from "en_US" locale being used even though request.LANGUAGE_CODE is "en"
Reported by: | francisoreilly | Owned by: | nobody |
---|---|---|---|
Component: | Internationalization | Version: | dev |
Severity: | Normal | Keywords: | locale language en-us en-US |
Cc: | | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | no | UI/UX: | no |
Description
I've got a situation where even though the template has request.LANGUAGE_CODE=="en", the "en_US" translations are being rendered instead of the "en" translations. Furthermore, if request.LANGUAGE_CODE=="en-gb", the "en_GB" translation is being pulled back, correctly. In summary:
- if LANGUAGE_CODE=="en" -> pulls back "en_US" translations (incorrect)
- if LANGUAGE_CODE=="en-gb" -> pulls back "en_GB" translations (expected result)
- if LANGUAGE_CODE=="en-us" -> pulls back "en_US" translations (expected result)
To demonstrate the problem I put together and attached a tar.gz of a simple project directory:
- A homepage that has a dropdown control for selecting/setting the user's chosen language; the choices are "en", "en-gb" and "en-us". The form sets request.LANGUAGE_CODE via the set-language view (django.conf.urls.i18n). urls.py is set up to activate the set-language view when the user clicks Submit. The homepage itself lives at /index.html/
- Three corresponding locale translations, i.e. "en", "en_GB" and "en_US", in the locale subdir. I've localized the text "Homepage" with a different string in each locale's django.po.
- A settings.py which specifies the three LANGUAGES and a LANGUAGE_CODE of "en". It also pulls in the LocaleMiddleware, as is necessary for locale translations.
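For readers without the attachment, the relevant settings might look roughly like the sketch below. This is an assumption reconstructed from the description above, not a copy of the attached file: the display names in LANGUAGES and the presence of SessionMiddleware are illustrative, and MIDDLEWARE_CLASSES is the setting name used by Django releases of that era.

```python
# Hypothetical excerpt of a settings.py matching the description above.
# The attached project's real file may differ in names and ordering.

# Default language when no preference has been chosen yet.
LANGUAGE_CODE = "en"

# The three languages offered by the dropdown on the homepage.
# The human-readable names are illustrative.
LANGUAGES = (
    ("en", "English"),
    ("en-gb", "British English"),
    ("en-us", "American English"),
)

# LocaleMiddleware picks the active language for each request.
# SessionMiddleware is assumed here so the chosen language can be remembered.
MIDDLEWARE_CLASSES = (
    "django.contrib.sessions.middleware.SessionMiddleware",
    "django.middleware.locale.LocaleMiddleware",
)
```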
I think the other settings/files included are not relevant to the problem (e.g. the sqlite_db database file, etc), they're only included to form a runnable project.
I've been able to show this behaviour in 1.0-beta_2-SVN-8643 - simply go to the homepage at /index.html/ and choose the different language values and submit. The page refreshes to show the current value of request.LANGUAGE_CODE and also which translation has been pulled back.
Attachments (4)
Change History (16)
by , 16 years ago
Attachment: | locale_test.tar.gz added |
---|
comment:1 by , 16 years ago
milestone: | → 1.0 |
---|
comment:2 by , 16 years ago
Keywords: | locale language en-us en-US added |
---|
comment:3 by , 16 years ago
I can confirm this is happening (using the project provided). After a lot of poking around, the problem seems to arise from something inside django.utils.translation.trans_real.translation(). The Django development server makes a call to activate("en_US") as part of its initial setup, and then the correct locale is set for each request. Somehow, as part of that initial setup, the en_US translation is being cached for en.
I'll pick this up again in the morning if nobody solves it in the interim.
comment:4 by , 16 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:5 by , 16 years ago
This is very weird. From what I can see, the fault isn't on Django's part. I have the exact same setup working perfectly (after fixing #7163) for providing Serbian language in two scripts (sr and sr_LATN).
Unless I'm missing something, Python gettext (or gettext in general, don't know anything about the implementation) will choose the en_US translation over the en translation even if they both exist. But it only happens for the en locale. I'm attaching a "diagnostic" diff for trans_real and a tgz of your app with two Serbian locales added so you can see for yourself.
Phew, for once not having English as my primary language seems to make things easier for me ;)
by , 16 years ago
Attachment: | locale_test.tar.2.gz added |
---|
by , 16 years ago
Attachment: | trans_real_r8739.diff added |
---|
comment:6 by , 16 years ago
Okay, that was fun... This is a problem in Python's locale module, where (basically) en is mapped to en_US in the locale.locale_alias dict. This in turn causes the wrong .mo file to be loaded.
In detail: gettext.translation (which is used in django.utils.translation.trans_real.translation) calls gettext.find to locate the correct .mo file to load, which calls (the misnamed) gettext._expand_lang (gettext._expand_lang returns a list of possible locale names for the given locale). Finally, gettext._expand_lang calls locale.normalize for the normalized name of a locale, and locale.normalize uses the locale.locale_alias dict. And it just so happens that en is mapped to en_US.ISO8859-1...
    >>> import gettext
    >>> gettext.find('django', 'locale', ['en'], all=1)
    ['locale/en_US/LC_MESSAGES/django.mo', 'locale/en/LC_MESSAGES/django.mo']
    >>> gettext._expand_lang('en')
    ['en_US.ISO8859-1', 'en_US', 'en.ISO8859-1', 'en']
    >>>
    >>> # Monkey patch our way out of trouble.
    >>> import locale
    >>> locale.locale_alias['en'] = 'en.ISO8859-1'
    >>> gettext.find('django', 'locale', ['en'], all=1)
    ['locale/en/LC_MESSAGES/django.mo']
    >>> gettext._expand_lang('en')
    ['en.ISO8859-1', 'en']
(The output is from a session in the project directory.)
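The lookup order can also be reproduced without the attached project. What follows is a minimal sketch, assuming a CPython whose locale.locale_alias table still maps en to en_US.ISO8859-1 as quoted in this comment. The helper name find_candidates and the temporary directory layout are made up for illustration, and the .mo files are empty placeholders, which is enough because gettext.find only checks that each candidate path exists.

```python
import gettext
import os
import tempfile


def find_candidates(lang, available=("en", "en_US", "en_GB")):
    """Return the locale directories gettext.find matches for `lang`, in order.

    `find_candidates` is a hypothetical helper, not part of gettext or Django.
    """
    with tempfile.TemporaryDirectory() as localedir:
        for name in available:
            msg_dir = os.path.join(localedir, name, "LC_MESSAGES")
            os.makedirs(msg_dir)
            # Empty placeholder catalog: gettext.find only tests for existence.
            open(os.path.join(msg_dir, "django.mo"), "wb").close()
        paths = gettext.find("django", localedir, [lang], all=True)
        # Keep only the locale directory name from each matched path.
        return [os.path.relpath(p, localedir).split(os.sep)[0] for p in paths]


if __name__ == "__main__":
    print("en    ->", find_candidates("en"))
    print("en_GB ->", find_candidates("en_GB"))
```

With that alias table in place, the en_US directory is listed ahead of en for a request for "en", which is why gettext.translation loads the en_US catalog first.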
I'll attach a patch.
comment:7 by , 16 years ago
Has patch: | set |
---|---|
Patch needs improvement: | set |
I don't see a nicer way to fix this, but if anybody has a bright idea...
comment:8 by , 16 years ago
Wow guys, that was smart and digging at very low level.
Following a suggestion from Malcolm, I had even been trying to unconditionally deepcopy the Python translation objects in translation._fetch (that is, without checking whether the new one shared a lang spec prefix with a previously cached one) and littering the trans_real.py file with debugging statements, all without success and without a clue as to what the real cause of the problem could be.
comment:9 by , 16 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
This is excellent work and it's nice to understand what's going on. However, at the end of the day, I suspect this is probably not a bug. Firstly, the locale.locale_alias dictionary maps every single "language only" locale specifier to a language + country version. It's unfortunate that English as spoken by the English isn't the canonical mapping and the US version is used instead, but a choice had to be made. The fact is that the designator "en" is ambiguous. It needs to be mapped to "English as spoken in XYZ" for some value of XYZ.
With, say, Norwegian, the intuitively right thing is what actually happens because Norwegian as spoken by the people of Norway is relatively obvious (the English case was obvious, too, dagnabbit! But they chose poorly! Yes, I know there are historical reasons why this is done in POSIX systems; doesn't make it right :-( )
The same problem as noted here will occur if somebody creates differing "es" and "es_ES" translations. They will always get back the "es_ES" version. Again, because the initial designator has to have the ambiguity resolved somehow.
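The alias resolution behind this can be checked directly. The snippet below is a small sketch, assuming the alias table shipped with CPython still contains the en and es entries discussed here; resolved_locale is a hypothetical helper name, not a standard-library function.

```python
import locale


def resolved_locale(lang):
    """Return the language_COUNTRY part that locale.normalize picks for `lang`.

    Hypothetical helper: it simply drops the encoding suffix
    (for example ".ISO8859-1") from the normalized name.
    """
    return locale.normalize(lang).split(".")[0]


if __name__ == "__main__":
    # Language-only designators are resolved to one specific country variant.
    for lang in ("en", "es"):
        print(lang, "->", resolved_locale(lang))
```

Because gettext tries the resolved name before the bare language code, a catalog stored under the country-specific directory wins whenever both exist.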
Conclusion: this is a wontfix situation, because it's arguably not a bug and the behaviour is consistent. If we patch English, why not start patching all the other locales (en_uk -> en_GB, but english_uk -> en_EN ... huh?!)? A documentation patch (in a separate ticket so that it can be handled in isolation) might be appropriate to explain this. I realise the whole i18n situation can get pretty convoluted when you're trying to do the right thing.
comment:10 by , 16 years ago
To make this work, you can use en-en as the LANGUAGE_CODE instead of en.
comment:11 by , 14 years ago
Easy pickings: | unset |
---|---|
Severity: | → Normal |
Type: | → Uncategorized |
UI/UX: | unset |
I ran into this problem today.
With the following import:
    from django.utils.translation import activate
you can call:
    activate(request.LANGUAGE_CODE)
This will properly set the language when request.LANGUAGE_CODE is 'en' (instead of the improper en-us default).
tar.gz of the project demonstrating the problem