Opened 17 years ago

Closed 16 years ago

Last modified 15 years ago

#7163 closed (fixed)

Region locales' translation objects get overwritten

Reported by: oggy Owned by: Malcolm Tredinnick
Component: Internationalization Version: dev
Severity: Keywords: translation gettext region locale
Cc: ognjen.maric@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Internal caching performed by Python's gettext module causes translation objects for different region locales of the same language (e.g. en-us and en-gb) to get overwritten.

Consider the scenario of having en-us and en-gb localized versions of a project or an app. When the translation for one of these locales is first loaded, Django loads the translation objects by processing .mo files in this order:

  1. From root django conf/locale dir
  2. From project locale dir
  3. From settings.LOCALE_PATHS
  4. From apps' locale dirs

Due to the way that Python's ugettext module works, since no en_US or en_GB subdirs are found in step 1, the base language 'en' translation in conf/locale/en will be used. Hence, both of these translations will refer to the same .mo file. However, this will cause ugettext to cache, returning two translation objects which share the same _catalog dictionary.

In [15]: t1 = gettext.translation('django', '/usr/lib/python2.5/site-packages/django/conf/locale/', ['en_US'])

In [16]: t2 = gettext.translation('django', '/usr/lib/python2.5/site-packages/django/conf/locale/', ['en_GB'])

In [17]: id(t1._catalog)
Out[17]: 3079088844L

In [18]: id(t2._catalog)
Out[18]: 3079088844L

Therefore, any changes to one translation at stages 2-4 will effectively change the other one as well, rendering regional translations useless at the project/LOCALE_DIRS/app level. To prevent this, it's necessary to perform a deep copy of the translation object at stage 1. Since Django implements its own caching of translation objects, this should not affect the performance too much.

Attachments (3)

trans_real_v7513.diff (590 bytes ) - added by oggy 17 years ago.
trans_real_v7513v2.diff (711 bytes ) - added by oggy 17 years ago.
trans_real_v7513v3.diff (774 bytes ) - added by oggy 16 years ago.

Download all attachments as: .zip

Change History (11)

by oggy, 17 years ago

Attachment: trans_real_v7513.diff added

comment:1 by oggy, 17 years ago

Oops, didn't realize that the "partition" string method was only introduced in Python 2.5. This one's not quite as elegant, but it works for elderly snakes as well...

by oggy, 17 years ago

Attachment: trans_real_v7513v2.diff added

comment:2 by Simon Greenhill, 16 years ago

Triage Stage: UnreviewedReady for checkin

by oggy, 16 years ago

Attachment: trans_real_v7513v3.diff added

comment:3 by oggy, 16 years ago

Argh. My patch broke Django when run under 2.4, due to the differences in deepcopy between 2.4 and 2.5. This one has been tested to work under 2.4, but I don't have any installs of 2.3 around, could somebody please verify before commiting the patch?

comment:4 by Malcolm Tredinnick, 16 years ago

Owner: changed from nobody to Malcolm Tredinnick
Patch needs improvement: set
Triage Stage: Ready for checkinAccepted

This is some very nice debugging and analysis. However, I think you might be going too far in the patch. It looks like only a copy() is needed, not a deepcopy(). The reason being that both _catalog and _info are dictionaries where keys are updated or accessed, but we (Django or Python) are never doing in-place updates of the values in the dictionaries -- only atomic replacements.

When I replace your deepcopy() calls with plain dictionary copies, it seems to work for me (I saw the problem without any patches and it went away with patches). You clearly have a failing case as well. Can you make sure it looks fixed for you, too. Just replace your deepcopy() calls with

res._catalog = res._catalog.copy()
res._info = res._info.copy()

I'm pretty sure that's a fix, but I'd like verification. Also, how do you want to be credited in the AUTHORS file?

(Don't worry about submitting a new patch or anything; I'll see when you reply and can modify things in my local branch appropriately.)

comment:5 by oggy, 16 years ago

You're right of course, deepcopy() doesn't make sense since those dicts contain only unicode (i.e. immutable) values. It was a leftover from my previous attempt - figured it'd be nicer to deep copy the entire object instead of messing with other people's _-ed variables, but sometimes a man's gotta do what a man's gotta do ;)

Anyway I tested your patch and it works fine. As for the AUTHORS, could you make it to "oggy <ognjen.maric@…>"? Thanks

comment:6 by Malcolm Tredinnick, 16 years ago

Resolution: fixed
Status: newclosed

Fixed in [7857].

comment:8 by spoof, 16 years ago

what if

 res = _translation(globalpath) 

returns None? The following code:

base_lang = lambda x: x.split('-', 1)[0] 
if base_lang(lang) in [base_lang(trans) for trans in _translations]: 
  res._info = res._info.copy() 
  res._catalog = res._catalog.copy() 

will fail with error:

AttributeError: 'NoneType' object has no attribute '_info'

is it right?

comment:9 by anonymous, 16 years ago

Yes it's right but it should not return None unless there's no translation of the Django core for that particular language.

Note: See TracTickets for help on using tickets.
Back to Top