#22943 closed Bug (fixed)
Some validators defined in django.core.validators can't be serialized
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Core (Other) | Version: | 1.7-rc-1 |
Severity: | Release blocker | Keywords: | validators, migrations |
Cc: | Simon Charette | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | yes |
Easy pickings: | yes | UI/UX: | no |
Description
In django.core.validators
, validate_slug
and validate_ipv4_address
are defined as RegexValidator
instances constructed with precompiled regexes. Since the migrations framework can't serialize compiled regexes (as explained only in the release notes for 1.7; shouldn't this also be documented in the validators reference?), it is impossible to use these validators on model fields in a migrated app - the migration will fail with ValueError: Cannot serialize: <_sre.SRE_Pattern object at 0x0274AAD0>
. Presumably Django's own built-in validators should work with migrations out of the box.
I would assume all that needs to be done here is pass the regex arguments as strings instead of compiled regexes to the RegexValidator
constructor where these validators are defined.
Attachments (4)
Change History (21)
comment:1 by , 10 years ago
Keywords: | migrations added |
---|
by , 10 years ago
Attachment: | validator_patch.diff added |
---|
comment:2 by , 10 years ago
Has patch: | set |
---|
Well, I decided to try my own hand at a patch for this. First-time contributor, so please excuse if it's a bit shaky; in particular, I'm not sure if it's prudent to import from django.db.migrations
in the validator tests or if dependencies to other parts of Django should be kept to a minimum (and if so how it would be best to solve this). Ran the entire test suite in case the change caused problems somewhere else, and everything passed (bar expected failures).
comment:3 by , 10 years ago
Severity: | Normal → Release blocker |
---|---|
Triage Stage: | Unreviewed → Accepted |
Hi antialiasis,
Unfortunately defining those patterns as string instead of compiled re
might be breaking backward compatibility since they're expected to be compiled.
I think the correct way of solving this issue would be to either:
- Instruct the migration framework how to deal with compiled
re
instance. You can have a look at how django.db.migrations.writer.MigrationWriter work for this; - Implement
RegexValidator.deconstruct
instead of using thedeconstructible
decorator to make sure the compiledself.regex
attribute is deconstructed to returnregex=self.regex.pattern
andflags=self.regex.flags
kwargs.
comment:4 by , 10 years ago
You might have to rely on ducktyping to implement the first option since it seems there's no way to reliably detect that a given object is a compiled pattern.
Edit: compiled_re_type = type(re.compile(''))
should work.
comment:5 by , 10 years ago
There is also another option: passing slug_re.pattern
, ipv4_re.pattern
, etc. to the validator instances. But the solutions proposed by Simon have the advantage of being more general.
comment:6 by , 10 years ago
As for the tests, I think it would be better to add them in tests/migrations/test_writer.py, as there are already some validator-related tests in test_serialize
(just create a separate test for all validator tests).
by , 10 years ago
Attachment: | validator_patch_2.diff added |
---|
Allow MigrationWriter to serialize compiled regex objects.
comment:7 by , 10 years ago
claudep's solution is more along the lines of what I meant to do here (it stupidly slipped my mind that slug_re
etc. were public API and not just intermediate variable names for cleaner constructor calls). Then again, the reason that's what I meant to do was that I had assumed actually making validators with compiled regexes serializable was impossible, or at least a lot trickier than it sounded, given it hadn't been done and the release notes were telling people to just stop passing precompiled regexes to RegexValidator. I've attached my attempt at making compiled regexes serializable (plus factoring out the validator tests and removing that item from the documentation); if it really is that simple, then of course that's a more general and useful solution.
comment:8 by , 10 years ago
Cc: | added |
---|---|
Patch needs improvement: | set |
Patch is looking good! Two nitpicks:
- I would try to cache
type(re.compile(''))
instead of computing it on every instance check; - Could you rebase your patch on the master branch, I can't apply it with
patch -p0 < validator_patch_2.diff
.
comment:10 by , 10 years ago
Hmm I still couldn't apply your patch on top of the lastest master so I had to manually merge all the changes.
I created a PR with it. Could you sure make I didn't miss anything?
comment:11 by , 10 years ago
I updated the patch to work under Py3. Some tests were failing because of the fact that re.compile(u'').flags == 32
on Py3 and 0
on Py2.
I would appreciate if someone else could make sure the changes I made make sense.
comment:12 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
comment:14 by , 10 years ago
I reviewed the patch and it looks fine to me; I've merged it and backported it to the 1.7 stable branch too.
by , 10 years ago
Attachment: | regex_py3_wip.diff added |
---|
Fix Python 3 regex serialization, but a test currently fails (see comment below)
comment:16 by , 10 years ago
Sorry to disappear from the discussion; I was on a trip. Thanks for correcting and accepting the patch. However, I believe charettes' Py3 correction is incomplete - the patch still checks if regex.flags is truthy to determine if it should include the flags parameter in the serialization, so if one created a regex with re.compile('', 0)
on Py3, it would currently write out just re.compile('')
(which Py3 will treat as re.compile('', 32)
).
I've attached a version that checks if regex.flags is equal to the default flags instead, but this causes a test to fail, because even though re.compile('', re.U).flags == re.compile('').flags
(on Py3), re.compile('', re.U) != re.compile('')
(and the latter is what comes out of the serialization of the former). Is there any chance this could potentially cause problems in practice or should we just change the test?
comment:17 by , 10 years ago
Okay, I looked into it a bit better. To correct/clarify: the real issue is that the flags attribute on a compiled regex does not simply contain the flags passed into re.compile
; it contains those flags, plus any inline flags defined within the pattern itself, plus (on Python 3) the implicit Unicode flag. This means if we naïvely serialize regexes based on the flags attribute, the serialization may explicitly specify flags that were not actually passed into re.compile
when the regex was created. While as far as I can tell this should not affect how the resulting regex functions, it does mean it won't compare equal to the original. Could this cause problems?
I believe this means my patch above is obsolete. We could compare the flags on the actual regex to re.compile(regex.pattern).flags
to see if we can omit the flags argument, but the serialized-deserialized regex would then still compare unequal to the original if the user did explicitly specify a flag such as re.U
. If this is a problem, we may need to rethink this a bit.
Use strings rather than precompiled regexes for built-in RegexValidator instances.