Opened 19 years ago
Closed 17 years ago
#1355 closed enhancement (duplicate)
Internationalisation(charset) problems with FileField file names and core.db.backend.mysql
Reported by: | little | Owned by: | nobody |
---|---|---|---|
Component: | Core (Other) | Version: | |
Severity: | normal | Keywords: | |
Cc: | | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
The function django.utils.text.get_valid_filename() is not friendly to non-Latin file names:
s = s.strip().replace(' ', '_')
return re.sub(r'[^-A-Za-z0-9_.]', '', s)
It reduces such a file name to underscores only, for example "__________.txt".
Let it return a Unicode object built from the string s instead:
return unicode(s, 'utf8')
Or make it possible for the application programmer to override this function.
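To make the report easier to reproduce, here is a minimal sketch of the behaviour described above. It copies the two statements quoted in the description into a standalone function and is written for Python 3, so it is an illustration rather than the exact Django code of the time. The sample file name is made up.

```python
import re


def get_valid_filename(s):
    # Same two steps as the snippet quoted in the description:
    # spaces become underscores, then everything outside the
    # ASCII whitelist is silently dropped.
    s = s.strip().replace(' ', '_')
    return re.sub(r'[^-A-Za-z0-9_.]', '', s)


# A Cyrillic name (written with escapes): two words separated by one space.
name = '\u043c\u043e\u0439 \u043e\u0442\u0447\u0451\u0442.txt'
print(get_valid_filename(name))            # _.txt
print(get_valid_filename('report 1.txt'))  # report_1.txt
```

Only the underscore that replaced the space and the ASCII extension survive, so a name with more spaces ends up as a longer run of underscores.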
Change History (6)
comment:1 by , 19 years ago
Description: | modified (diff) |
---|
comment:2 by , 19 years ago
Another good way to name files is to give them names of the form [database id].ext,
for example 12345.txt, 34567.doc and so on.
File names made of numbers are better than underscores, anyway.
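A rough sketch of the [database id].ext idea, assuming the object's id is already known when the file is saved. The helper name id_based_filename is hypothetical and not part of Django.

```python
import os


def id_based_filename(object_id, original_name):
    # Hypothetical helper: keep only the extension of the uploaded
    # name and use the database id as the base name.
    _, ext = os.path.splitext(original_name)
    return '%d%s' % (object_id, ext)


print(id_based_filename(12345, 'annual report.txt'))  # 12345.txt
print(id_based_filename(34567, 'notes.doc'))          # 34567.doc
```

The extension is taken from the uploaded name unchanged, so it would still need the same validation as any other user-supplied text.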
comment:3 by , 19 years ago
Component: | Internationalization → Core framework |
---|---|
Owner: | changed from | to
comment:4 by , 18 years ago
Yeah, I have run into this problem too. I think how an i18n filename is handled should be decided by the end user rather than processed automatically. Or we could add a flag to the save_FIELD_file() method, like this:
object.save_FIELD_file(i18n_filename, content, safety=True)
This will use get_valid_filename to process the filename. If the user instead calls:
object.save_FIELD_file(i18n_filename, content, safety=False)
This will not use get_valid_filename. The safety parameter can default to True to stay compatible with the existing behaviour.
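The proposed flag could look roughly like the sketch below. It is a standalone illustration only: choose_filename is a hypothetical helper standing in for the name handling inside save_FIELD_file(), and get_valid_filename is re-declared from the snippet in the description so that the example runs on its own.

```python
import re


def get_valid_filename(s):
    # Re-declared from the snippet quoted in the description.
    s = s.strip().replace(' ', '_')
    return re.sub(r'[^-A-Za-z0-9_.]', '', s)


def choose_filename(filename, safety=True):
    # Hypothetical helper: safety=True keeps the current behaviour,
    # safety=False passes the caller's name through untouched.
    if safety:
        return get_valid_filename(filename)
    return filename


print(choose_filename('my file.txt'))                # my_file.txt
print(choose_filename('my file.txt', safety=False))  # my file.txt
```

With safety=False the name is not checked at all, including any path separators it contains, so the caller would be responsible for validating it.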
comment:5 by , 18 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Accepted. Seems obvious something needs to change.
comment:6 by , 17 years ago
Resolution: | → duplicate |
---|---|
Status: | new → closed |
Closing in favor of #3119, which has a patch.
The problem here: we can't assume anything about the server's filesystem besides the fact that us-ascii can be used in filenames. So utf-8 isn't an option, because it might produce unreadable filenames. And since some characters, such as / and ., have a special function in paths, we can't just accept any character we want, or we would open the door to filesystem traversal hackery.
One way would be to just turn non-ascii chars into a uXXXX form, so that at least the filename isn't all underscores.
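A minimal sketch of the uXXXX idea, assuming the existing ASCII whitelist is kept and only non-ASCII characters are escaped. The function name escape_non_ascii is hypothetical, and this is not the patch attached to #3119.

```python
import re

# Characters that the existing whitelist already accepts.
SAFE_CHAR = re.compile(r'[-A-Za-z0-9_.]')


def escape_non_ascii(name):
    # Hypothetical helper: keep whitelisted ASCII, write each non-ASCII
    # character as 'u' plus its hex code point, drop everything else.
    name = name.strip().replace(' ', '_')
    out = []
    for ch in name:
        if SAFE_CHAR.match(ch):
            out.append(ch)
        elif ord(ch) > 127:
            out.append('u%04X' % ord(ch))
        # Remaining ASCII characters such as '/' are dropped.
    return ''.join(out)


print(escape_non_ascii('caf\u00e9 menu.txt'))  # cafu00E9_menu.txt
print(escape_non_ascii('../etc/passwd'))       # ..etcpasswd
```

The result contains only us-ascii characters and no path separators, but the mapping is not reversible: a name that already contains the literal text "u00E9" cannot be told apart from an escaped character.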
I moved the database stuff into its own ticket, as that isn't i18n related but database backend related.