Opened 7 years ago
Closed 5 years ago
#29086 closed Bug (wontfix)
Add a warning when saving bytestrings to CharFields
Reported by: | Collin Anderson | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 2.0 |
Severity: | Release blocker | Keywords: | |
Cc: | Collin Anderson | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
Django 1.11 behavior (python 3):
o = MyModel() o.myfield = b'123' o.save() o.refresh_from_db() o.myfield == "123"
Django 2.0 behavior:
o = MyModel() o.myfield = b'123' o.save() o.refresh_from_db() o.myfield == "b'123'"
Could we restore the old behavior, maybe with a deprecation warning? Either that or raise an error?
Change History (13)
comment:1 by , 7 years ago
Description: | modified (diff) |
---|
comment:2 by , 7 years ago
Cc: | added |
---|
comment:3 by , 7 years ago
comment:4 by , 7 years ago
As I've mentioned in #28748's PR I feel like this should be the the correct behavior in a Python 3 only code base and this was probably caused by an overlooked case during the porting of the affected code bases from Python 2 to Python 3.
In other words, if you are passing binary data to an interface expecting text you should expect bad things to happen -- Python 3 is stricter about this concept and it's a good thing IMHO.
Now, this change should at least have been mentioned in the 2.0 release notes. The fact it wasn't makes me think there might other instances changed by #27795 related commits that might have introduced similar regressions.
comment:5 by , 7 years ago
Now, this change should at least have been mentioned in the 2.0 release notes.
The release notes say:
To support native Python 2 strings, older Django versions had to accept both bytestrings and unicode strings. Now that Python 2 support is dropped, bytestrings should only be encountered around input/output boundaries (handling of binary fields or HTTP streams, for example). You might have to update your code to limit bytestring usage to a minimum, as Django no longer accepts bytestrings in certain code paths.
For example,
reverse()
now usesstr()
instead offorce_text()
to coerce theargs
andkwargs
it receives, prior to their placement in the URL. For bytestrings, this creates a string with an undesiredb
prefix as well as additional quotes (str(b'foo')
is"b'foo'"
). To adapt, calldecode()
on the bytestring before passing it toreverse()
.
Does that cover what you're looking for?
comment:6 by , 7 years ago
"Django no longer accepts bytestrings in certain code paths"
"Python 3 is stricter about this concept and it's a good thing IMHO." - Right, it raises an error when you pass bytes, it seems to me Django should do the same thing.
It seems to me the current behavior is really bad for projects that are upgrading, because there's no warning or error, just silent bad data. Maybe the force_text() calls (#27795) need to be bytes_not_allowed() or something. (Too late to raise an error for 2.0, but maybe we could start raising a warning?)
comment:7 by , 7 years ago
It seems to me the current behavior is really bad for projects that are upgrading, because there's no warning or error, just silent bad data. Maybe the force_text() calls (#27795) need to be bytes_not_allowed() or something. (Too late to raise an error for 2.0, but maybe we could start raising a warning?)
This idea of warn/error when mixing bytes with text is actually built into Python with the `-b` and `-bb` options. I've used it to great success in my project to discover mishandling of bytes and text. I don't think Django needs to duplicate this functionality, but maybe the release notes could be modified to recommend using -b
to help discover missed cases when upgrading.
comment:8 by , 7 years ago
Thanks for release note mention Jon, I completely missed it when I skimmed through the docs.
+1 on mentioning the -b
option!
comment:9 by , 7 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Summary: | Bytestrings on CharFields getting saved as "b'data'" → Add a warning when saving bytestrings to CharFields |
PR to mention -b
in the release notes.
comment:12 by , 5 years ago
Resolution: | wontfix |
---|---|
Status: | closed → new |
I'd like to re-open this for discussion, if possible.
Ideally, saving bytes to a CharField wouldn't give a warning. It would just trust that the developer knows what they're doing. Namely, that they took care of the decode step, and to just pass the bytes right onto the database.
I have a situation here where I have a CharField, I have some byte data, I know it's encoded in the same manner as the destination varchar field in the database. All these things are known to me. And yet, Django attempts to coerce the byte value into a string (adding a "b" in front and wrapping the thing in quotes as it goes), just before it's converted back into bytes when its sent to the database. That doesn't really make sense.
The best course of action here is to just trust the developer. If the CharField value is bytes, leave it be.
comment:13 by , 5 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Trac is not a place for a discussion.
Follow triaging guidelines with regards to wontfix tickets.
How does that mistake (assigning bytes) happen? I doubt that it's very common but am open to being persuaded otherwise.