Opened 16 years ago

Closed 6 years ago

Last modified 6 years ago

#10852 closed Uncategorized (wontfix)

Add no-fuzzy-matching option to makemessages

Reported by: graham.carlyle@…
Owned by: nobody
Component: Internationalization
Version: dev
Severity: Normal
Keywords:
Cc: chris@…, Simon Charette, אורי
Triage Stage: Unreviewed
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

I'd like an option added to makemessages to invoke msgmerge without fuzzy matching. Sometimes I'd prefer that a translator not be given a "fuzzy" default, and I'd also like the non-translated text to stick out more in the web app (by being untranslated).

msgmerge with the "-N" option switches off fuzzy-matching.

Change History (19)

comment:1 by Ramiro Morales, 16 years ago

AFAIK literals marked as 'fuzzy' aren't regarded by msgmft(1) as translated and the original untranslated msgid gets used, so your second reason ("to be able to make the non-translated text stick out more in the web app (by being untranslated)") is already covered.

Could you give us a bit more detail about the rationale behind the first reason ("because sometimes I'd prefer a translator not to have a fuzzy default")?

comment:2 by graham.carlyle@…, 16 years ago

msgmft? don't know what you're referring to there. Sorry I'm new to dealing with translations via po files so might well be misunderstanding things.

When I said "I'd prefer a translator not to have a "fuzzy" default" I meant that when running makemessages to create the po file then the msgstr sometimes seems to be speculatively filled in using existing translations.

For example, say I have this in a German translation po file...

#: templates/randa/map.html:56
msgid "Country info"
msgstr "Länderinfo"

then I add some new text in a template

  {% trans 'Country' %}

and re-run the makemessages script, I get...

#: templates/randa/map.html:91
#, fuzzy
msgid "Country"
msgstr "Länderinfo"

which worried me as it seemed a bit speculative to provide a default for the translator.

However it seems I was mistaken in thinking that the web app would show this, presumably the "#, fuzzy" line stops that happening (maybe that's what you are referring to by msgmft?).

But if I hack django

Index: django/core/management/commands/makemessages.py
===================================================================
--- django/core/management/commands/makemessages.py	(revision 10575)
+++ django/core/management/commands/makemessages.py	(working copy)
@@ -185,7 +185,7 @@
                 raise CommandError("errors happened while running msguniq\n%s" % errors)
             open(potfile, 'w').write(msgs)
             if os.path.exists(pofile):
-                (stdin, stdout, stderr) = os.popen3('msgmerge -q "%s" "%s"' % (pofile, potfile), 't')
+                (stdin, stdout, stderr) = os.popen3('msgmerge -N -q "%s" "%s"' % (pofile, potfile), 't')
                 msgs = stdout.read()
                 errors = stderr.read()
                 if errors:

then it generates the po...

#: templates/randa/map.html:91
msgid "Country"
msgstr ""

maybe I worry too much and the #fuzzy stuff is clear to a translator :)
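The patch above relies on os.popen3, which no longer exists in Python 3. As a rough sketch only, the same msgmerge call could be expressed with subprocess as below. The function names msgmerge_args and merge_catalog are made up for illustration and are not part of Django; the sketch assumes GNU gettext's msgmerge is on the PATH and that, as in the patch, the merged catalog is read from stdout.

```python
import subprocess


def msgmerge_args(pofile, potfile, fuzzy=True):
    """Build the msgmerge command line (hypothetical helper, not Django API).

    -q suppresses progress output; -N disables fuzzy matching.
    """
    args = ["msgmerge", "-q"]
    if not fuzzy:
        args.append("-N")
    return args + [pofile, potfile]


def merge_catalog(pofile, potfile, fuzzy=True):
    """Run msgmerge and return the merged catalog text from stdout.

    Raises subprocess.CalledProcessError if msgmerge exits with an error.
    """
    result = subprocess.run(
        msgmerge_args(pofile, potfile, fuzzy),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the arguments as a list also avoids the manual quoting of file names that the original string-formatted command needed.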

in reply to:  2 ; comment:3 by Ramiro Morales, 16 years ago

Replying to graham.carlyle@maplecroft.com:

However it seems I was mistaken in thinking that the web app would show this, presumably the "#, fuzzy" line stops that happening (maybe that's what you are referring to by msgmft?).

Exactly. The entries marked as fuzzy by msgmerge are generated by a simple lexicographic comparison, and the probability of them being not totally accurate is high, so a) they require intervention from the translator (to review/correct them and remove the fuzzy flag) and b) they aren't used in the final translation.

msgfmt is the utility from the GNU gettext suite that compiles .po files to .mo files (the ones that get finally used by the i18n machinery) and it is executed by the compilemessages Django management command. See http://www.gnu.org/software/gettext/manual/gettext.html#msgfmt-Invocation (particularly the --use-fuzzy command line switch, which isn't used by compilemessages)

comment:4 by Malcolm Tredinnick, 16 years ago

Resolution: wontfix
Status: new → closed

Okay, this is a non-issue. As Ramiro points out, the fuzzy annotation prevents the string from being used as a translation. However, it should be kept in the PO file because it saves translators work, particularly in the update phase, as they can see what a likely match is going to be. Particularly in crowd-sourced translations, where multiple people are going to be working on the same file, that's a huge win. In single-sourced cases, it's never going to be a hindrance, either, once people understand what fuzzy means.

So, no, we're not going to provide an option not to include those. They're normal parts of GNU PO files and experienced translators expect them, are used to working with them and gain benefit from them. Less experienced translators become more experienced as time goes by.

in reply to:  4 ; comment:5 by EmilStenstrom, 13 years ago

Easy pickings: unset
Severity: Normal
Type: Uncategorized
UI/UX: unset

Replying to mtredinnick:

"However, it should be kept in the PO file because it saves translators work, particularly in the update phase, as they can see what a likely match is going to be"

I've seen many instances where translators are confused by the "almost correct" versions of strings, and instead miss them entirely. We've had greater success when manually going through the translation files and removing all the fuzzy strings entirely. For this reason, I'm strongly in favour of a flag for disabling fuzzy strings.

in reply to:  5 comment:6 by justinkhill@…, 13 years ago

Resolution: wontfix
Status: closed → reopened

Replying to EmilStenstrom:

Replying to mtredinnick:

"However, it should be kept in the PO file because it saves translators work, particularly in the update phase, as they can see what a likely match is going to be"

I've seen many instances where translators are confused by the "almost correct" versions of strings, and instead miss them entirely. We've had greater success when manually going through the translation files and removing all the fuzzy strings entirely. For this reason, I'm strongly in favour of a flag for disabling fuzzy strings.

I second that. Our translator actually asked me to disable fuzzy, saying "django probably has a way to turn that off". He didn't elaborate, but said fuzzy has caused havoc for them in the past on other projects. Since this isn't an option, I'll be removing the fuzzy translations manually, before sending them off.

comment:7 by Claude Paroz, 13 years ago

Resolution: wontfix
Status: reopened → closed

Please do not reopen a ticket closed by a core committer unless you either have a good new argument (and "fuzzy has caused havoc in the past" is certainly not one) or have discussed it on the django-dev mailing list.

in reply to:  3 comment:8 by anonymous, 12 years ago

Just wanted to chime in with a "me too".

Replying to ramiro:

Exactly, the entries marked as fuzzy by msgmerge are generated by a simple lexicographic comparison and the probability of them being not totally accurate is high

At least in my case, this is just not true. It's filling things in that are similar but not the same, and it shouldn't be. I'd rather they be empty than speculatively filled in.

There are already many options available for makemessages. I really don't see the harm in adding another option.

For now, I'm patching Django locally, which I really hate to do.

comment:9 by Ramiro Morales, 12 years ago

Resolution: wontfix → duplicate

Duplicate of #18714.

comment:10 by Ramiro Morales, 12 years ago

Resolution: duplicate → wontfix

Restoring ticket status. Sorry for the noise.

comment:11 by Chris Adams, 11 years ago

Cc: chris@… added

comment:12 by Markus Konrad, 10 years ago

I stumbled upon this discussion because "fuzzy" entries in the PO files caused big problems for me, too. I mean, "fuzzy" is way too fuzzy in makemessages. Some examples: The string "has geolocation" gets a fuzzy match to the string "Translation". "Further description" is believed to be the same as "Add an inscription". "edit" is supposed to be the same as "edition". It's just ridiculous. Because Django doesn't have a "non-fuzzy" option for makemessages, I ended up hacking ./core/management/commands/makemessages.py and adding -N to the list of msgmerge arguments on line 197. I honestly don't understand why no option to disable fuzzy matching is added to makemessages.

comment:13 by Claude Paroz, 10 years ago

It's not ridiculous, it's fuzzy matching. It helps the translator ~80% of the time, and ~20% of the time it's completely wrong. But it doesn't matter that much because fuzzy strings are not included in compiled messages. So it's the translator's work to evaluate accuracy of the fuzzy proposal, and simply dismiss it when it does not make sense.

comment:14 by Sergey Kolosov, 10 years ago

Those still looking for an option to customise msgmerge arguments, have a look at changes introduced in https://github.com/django/django/commit/06efeae598c6dafbe56d2ea323a0dccdd5bf2b8e (Django 1.7 and above); now it is feasible by subclassing the makemessages command, and overriding msgmerge_options.

comment:15 by Simon Charette, 9 years ago

Cc: Simon Charette added

comment:16 by אורי, 6 years ago

Resolution: wontfix
Status: closed → new

I also don't like the "fuzzy" keyword, and I have spent hours deleting fuzzy entries from our django.po files after running makemessages. I would like to disable them completely since they don't make sense in our project. Every time I run makemessages I have to search for the "fuzzy" keywords, delete them, and also delete the translations that are incorrect. I would like every translation without an exact match to be left blank.

I'm reopening this ticket since it has been 10 years and I think it's important.

I didn't understand what you meant by "msgmerge with the "-N" option switches off fuzzy-matching.".

comment:17 by אורי, 6 years ago

Cc: אורי added

comment:18 by Tim Graham, 6 years ago

Resolution: wontfix
Status: new → closed