#19463 closed New feature (fixed)
Add UUID Field to core
Reported by: | Thomas Güttler | Owned by: | Marc Tamlyn |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | trbs@…, matt@…, mike@…, Marc Aymerich, cyphase@…, jonathan+django@…, tomek@…, saxix.rome@…, loic@…, galuszkak@…, ashwoods, anubhav9042@…, lukas-hetzenecker | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
On django-dev, Dec 2012: "If someone can come up with a good patch I'd be fine considering it for core." (Jacob Kaplan-Moss)
Related: #4682 was closed five years ago.
I (Thomas Güttler) want to moderate this ticket, but won't create a patch.
Change History (26)
comment:1 by , 12 years ago
Cc: | added |
---|
comment:2 by , 12 years ago
Note that in databases other than PostgreSQL, it might be desirable to store the UUID value internally as binary, not as a char, both for performance reasons and for compatibility with Postgres' uuid (stored as a 128-bit binary value). So we might need to solve #2417 beforehand...
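To make the size difference concrete, here is a minimal sketch using only Python's standard uuid module. It compares the text, hex, and raw binary forms of one UUID; it says nothing about how any particular database backend would store them.

```python
import uuid

u = uuid.UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479")

as_text = str(u)      # canonical hyphenated form
as_hex = u.hex        # hex digits without hyphens
as_binary = u.bytes   # raw big-endian bytes (128 bits)

print(len(as_text), len(as_hex), len(as_binary))  # prints: 36 32 16

# The binary form round-trips losslessly back to the same UUID.
restored = uuid.UUID(bytes=as_binary)
assert restored == u
```

A char column therefore needs at least twice the space of a 16-byte binary column for the same value.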
comment:3 by , 12 years ago
Cc: | added |
---|
One thing I found with my UUIDField is that I needed to supply code to enable South to (a) handle this field type on migrations, and (b) prevent it trying to create a default at the time a migration is run.
Specifically, https://bitbucket.org/schinckel/django-uuidfield/commits/69f7c0cdf91d28da2cceaff6f46ece34f733b560 shows how to do this.
I would assume we wouldn't want to have any code related to providing data to South in Django core, so perhaps we would need to ensure that South releases a version at around the same time as, or shortly after, this patch is included.
comment:4 by , 12 years ago
Cc: | added |
---|
comment:5 by , 12 years ago
I've been thinking that we would likely want to have a new field type: GeneratedField. This is like AutoField - the field gets a value on save() if it doesn't already have a value, and this field type is always a primary key (I am not 100% sure of the PK requirement, but it could simplify things). GeneratedField would have a backing field (the db storage type) and some generator, where the generator could fetch the value from DB using RETURNING, could generate the value in Python (like default, but with access to connection), or it could fetch the value after save from the DB (AutoField does this using select currval(someseq) on some backends).
I think such a field type would cover a lot of requests we have currently - unsigned serial fields, tiny/big/...integer serial fields, UUID fields (no matter what the UUID generator function is), and likely some more.
I don't know how hard such a field will be to write, or what the exact API should be - so this is mostly hand waving at the moment. Still, it seems there are only two public API places where this would affect current code - model.save() and bulk_create(), so it seems this should not be totally out of reach as a feature.
comment:6 by , 12 years ago
Triage Stage: | Unreviewed → Accepted |
---|
Quoting Jacob from the recent django-developers discussion: "If someone can come up with a good patch I'd be fine considering it for core.".
So, marking as accepted based on that.
comment:7 by , 12 years ago
Cc: | added |
---|
comment:8 by , 12 years ago
Cc: | added |
---|
comment:9 by , 12 years ago
Cc: | added |
---|
comment:10 by , 12 years ago
Cc: | added |
---|
comment:11 by , 12 years ago
Cc: | added |
---|
comment:12 by , 12 years ago
Cc: | added |
---|
Big +1 on @akaariai's GeneratedField idea.
For example I use extensively what I call a "readable unique ID", similar to YouTube video IDs (i.e. "sc5vraPpTcA"), for which I made a custom Field. It functions like a UUID but trades the creation convenience (guaranteed uniqueness) for usage convenience (being able to read it out loud, shorter URL, etc.). A GeneratedField would allow me to implement that cleanly.
That said, some databases have native support for UUIDs and it's pretty much the standard for sharding, so we could have the generic GeneratedField and a UUIDField subclass.
I'd work on a patch with some guidance from @akaariai.
comment:13 by , 11 years ago
Cc: | added |
---|---|
Version: | 1.4 → master |
comment:14 by , 11 years ago
Cc: | added |
---|
comment:15 by , 11 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
For postgres at least, this will form part of my upcoming work on django.contrib.postgres. Support for bigserial is also likely to come in with that, so a more general base class for AutoField might be useful. That said, a UUIDField does not always want to be autogenerated (unlike an autoincrementing field, which probably should be) - it is a reasonable use case for an API client to generate a UUID (using the uuid4 approach, which has a very high probability of avoiding clashes) and expect that to be saved by a Django-backed API.
Supporting a simple UUIDField(default=uuid.uuid4) should be a good start.
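The intended semantics (a callable default that can be overridden by a client-supplied value) can be illustrated without Django. The sketch below uses a standard-library dataclass as a stand-in; the Record class is hypothetical and is not the proposed field implementation.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Record:
    # Hypothetical stand-in for a model declaring UUIDField(default=uuid.uuid4):
    # the callable runs once per new instance unless a value is supplied.
    id: uuid.UUID = field(default_factory=uuid.uuid4)


# Server-side creation: the default callable generates the identifier.
server_made = Record()

# Client-side creation: the API client generates the UUID and sends it along.
client_id = uuid.uuid4()
client_made = Record(id=client_id)
```

Passing the function itself rather than calling it matters: a called default would be evaluated once and shared by every instance.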
comment:16 by , 11 years ago
I have written a UUID Field for django that supports 1.7 and its features, migrations serialization etc.
The field can be set with a UUID instance or with a str, hyphenated or not. It can also be created from bytes if that is needed. It can auto-generate the UUID (uuid4) and supports the other versions that Python's uuid module offers (1, 3, 4, 5). Queries work with either str or UUID instances but not with bytes, because who is ever going to query by the bytes, am I right? https://github.com/japrogramer/django-uuid-contour
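As a rough illustration of accepting those input types, the following sketch normalizes a value with the standard uuid module. The to_uuid helper is hypothetical and is not taken from the linked package.

```python
import uuid


def to_uuid(value):
    """Normalize a UUID instance, a str, or 16 raw bytes to uuid.UUID."""
    if isinstance(value, uuid.UUID):
        return value
    if isinstance(value, bytes):
        # uuid.UUID raises ValueError unless exactly 16 bytes are given.
        return uuid.UUID(bytes=value)
    if isinstance(value, str):
        # The constructor accepts both hyphenated and plain hex strings.
        return uuid.UUID(value)
    raise TypeError("cannot convert %r to a UUID" % type(value).__name__)


expected = uuid.UUID("f47ac10b-58cc-4372-a567-0e02b2c3d479")
from_hyphenated = to_uuid("f47ac10b-58cc-4372-a567-0e02b2c3d479")
from_plain = to_uuid("f47ac10b58cc4372a5670e02b2c3d479")
from_bytes = to_uuid(expected.bytes)
```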
P.S.
Many tests are included, and it supports Python 3.4 ;)
comment:17 by , 11 years ago
Has patch: | set |
---|
PR available at https://github.com/django/django/pull/2923
comment:19 by , 10 years ago
Patch needs improvement: | set |
---|
There seems to be one issue that needs solving: should we use SubfieldBase or not? SubfieldBase is used so that the field's to_python method is called any time a value is assigned to a model instance. In particular this happens when setting a value in model.__init__. So, if a database value is just bytes or a string, then when the model is initialized from the database we correctly get a UUID instance in the uuid field because to_python is called.
There isn't any field in core that uses SubfieldBase. There are some disadvantages when using SubfieldBase:
- It doesn't work when using .values('uuid_field')
- There is a small performance penalty when setting the field value, in particular model.__init__ will be 10-20% slower for each field that uses SubfieldBase.
- Fields with SubfieldBase work a bit differently from other core fields. SubfieldBase fields do value conversion on assignment, so:
  >>> s = SomeModel()
  >>> s.uuid_field = "f47ac10b-58cc-4372-a567-0e02b2c3d479"
  >>> s.uuid_field
  OUT: uuid("f47ac10b-58cc-4372-a567-0e02b2c3d479") when using SubfieldBase
  OUT: "f47ac10b-58cc-4372-a567-0e02b2c3d479" when not using SubfieldBase
Now, one could consider this to be a feature. But, no other field in core or contrib does this kind of conversion on assignment, so we should avoid this if possible.
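The conversion-on-assignment behaviour can be sketched in plain Python with a descriptor. This is only an approximation of the idea for illustration, not Django's SubfieldBase code; ConvertOnAssign and both model-like classes are hypothetical.

```python
import uuid


class ConvertOnAssign:
    """Descriptor that runs a converter every time the attribute is set."""

    def __init__(self, converter):
        self.converter = converter

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        # Conversion happens on every assignment, including in __init__.
        instance.__dict__[self.name] = self.converter(value)


def to_python(value):
    return value if isinstance(value, uuid.UUID) else uuid.UUID(value)


class WithConversion:
    uuid_field = ConvertOnAssign(to_python)


class WithoutConversion:
    uuid_field = None  # plain attribute: stores whatever is assigned


raw = "f47ac10b-58cc-4372-a567-0e02b2c3d479"
a = WithConversion()
a.uuid_field = raw
b = WithoutConversion()
b.uuid_field = raw
```

Here a.uuid_field is a uuid.UUID while b.uuid_field is still the str that was assigned. The extra call on every assignment is also where the per-field overhead comes from.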
Other ways forward are:
- Add a more generic field value conversion framework: add field.from_db_value(value, connection). This is a larger amount of work, but is needed in any case. This solution would work in .values(), and it would also be considerably faster than the current SubfieldBase way of doing things. Unfortunately this means that we can't merge this ticket before we have added the from_db_value method.
- Use backend specific converters. Unfortunately it seems one needs to create custom compilers for each backend (see django/db/backends/oracle/compiler.py for example)
So, in the end there seem to be just two choices: wait for field.from_db_value() or use SubfieldBase (with the possibility of removing use of SubfieldBase when field.from_db_value is introduced).
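A rough sketch of the from_db_value idea follows, written without Django. The method signature is the one proposed in this comment, from_db_value(value, connection); the API that eventually ships may take different arguments. SketchUUIDField and load_rows are hypothetical names.

```python
import uuid


class SketchUUIDField:
    """Hypothetical field that converts raw database values when loading."""

    def from_db_value(self, value, connection):
        # Called once per value read from the database, not on assignment.
        if value is None or isinstance(value, uuid.UUID):
            return value
        return uuid.UUID(value)


def load_rows(raw_rows, fields, connection=None):
    """Apply each field's converter to its column, as a row loader might."""
    return [
        tuple(f.from_db_value(v, connection) for f, v in zip(fields, row))
        for row in raw_rows
    ]


rows = load_rows(
    [("f47ac10b58cc4372a5670e02b2c3d479",), (None,)],
    [SketchUUIDField()],
)
```

Because the conversion is attached to loading rather than to attribute assignment, the same hook could serve .values() queries, and plain assignment keeps the behaviour of other core fields.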
I'll mark patch needs improvement, for lack of a better marker, to signal that this isn't ready for merge before we agree on a solution to the SubfieldBase issue.
comment:20 by , 10 years ago
Cc: | added |
---|
comment:21 by , 10 years ago
Cc: | added; removed |
---|
comment:22 by , 10 years ago
Patch needs improvement: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
comment:23 by , 10 years ago
Cc: | added |
---|
comment:24 by , 10 years ago
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:26 by , 10 years ago
@deronnax please open a new feature request instead of commenting on a closed ticket.
For reference: https://github.com/django-extensions/django-extensions/issues/277