Opened 6 weeks ago

Last modified 4 weeks ago

#35941 assigned New feature

Add composite GenericForeignKey support — at Version 13

Reported by: Csirmaz Bendegúz Owned by: Csirmaz Bendegúz
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Csirmaz Bendegúz, Peter Thomassen Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Csirmaz Bendegúz)

This is a follow up to #373 (CompositePrimaryKey).

Proposal:

My proposal is to implement GenericForeignKey support with JSON.

  1. object_id is a CharField (or TextField)
  2. CompositePrimaryKey is stored as a JSON array in object_id e.g. object_id='[1, "abc"]'
  3. JOINs can be achieved with JSON functions (varies per db backend)

Joins:

After some experimentation with JSON functions, I believe the simplest solution is to construct JOINs like this:

JOIN ON ((object_id::jsonb)->>0)::integer = id
    AND ((object_id::jsonb)->>1)::uuid = uuid

Casting is a pain point, especially when joining on DateTimeFields, as we need to make sure the two columns are in the same format.

Risks:

What if someone is using a JSON array as the primary key (but it's not a composite primary key)?
Before deserializing the JSON array, we need to check if the content type has a composite primary key or not.

What if the db backend doesn't support JSON functions?
All supported databases support JSON functions. MariaDB has some limitations handling surrogate pairs.

JOIN on JSON functions is not efficient
When storing a composite primary key in a CharField / TextField, JSON is the best option we have because it's widely supported by database backends. To achieve better performance, the composite pk shouldn't be stored in a CharField / TextField in the first place (see: alternatives).

Notes:

  1. JOINs must work with Unicode characters
  2. int, date, datetime, uuid, text fields must be supported
  3. Django admin's LogEntry has its own implementation of "generic foreign keys". The approach we take with GenericForeignKey should also apply to LogEntry.

Alternatives:

It's possible to implement other strategies to deal with composite generic foreign keys.

  1. JSONField - instead of CharField / TextField, we could make "object_id" a JSONField. This achieves the same thing but saves the ::jsonb cast, so it's slightly faster.
  2. GenericForeignKey with multiple object_ids - less flexible, can't store both regular primary keys and composite primary keys. Fastest JOINs.

There's no reason we couldn't implement other strategies. I believe my proposal provides the most flexibility at the cost of performance. It should be implemented for backwards-compatibility, so systems using a CharField / TextField object_id (e.g. django-guardian) can easily integrate with composite primary keys. Other strategies can then be implemented to improve performance.

Any feedback is appreciated!

Change History (13)

comment:1 by Natalia Bidart, 6 weeks ago

Triage Stage: UnreviewedAccepted

Thank you Ben for creating this ticket. Accepting as described in ticket:373#comment:204.

comment:2 by Csirmaz Bendegúz, 6 weeks ago

Description: modified (diff)

Thanks Natalia!

comment:3 by Simon Charette, 6 weeks ago

I think the proposal looks great Ben, thanks for creating this ticket.

One thing that comes to mind as potentially problematic in JSON array comparison for JOIN is the treatment of Unicode characters as we're using json.dumps(ensure_ascii=False) as well as quotes and other characters that need to be escaped in JSON. It'd be great to add test coverage for these.

As for the risks I appreciate you raising them but I think that supporting composite generic foreign key composed of standard members (int, date, datetime, uuid, text) is already a good stretch and I would feel comfortable saying that anything that falls in the exotic primary key category should not necessarily be supported in the first place.

comment:4 by Csirmaz Bendegúz, 6 weeks ago

Description: modified (diff)
Owner: set to Csirmaz Bendegúz
Status: newassigned

Yes, thanks for raising these points Simon!

comment:5 by Peter Thomassen, 6 weeks ago

Cc: Peter Thomassen added

comment:6 by Csirmaz Bendegúz, 5 weeks ago

Description: modified (diff)
Has patch: set

comment:7 by Csirmaz Bendegúz, 5 weeks ago

Description: modified (diff)

comment:8 by Csirmaz Bendegúz, 5 weeks ago

Description: modified (diff)

comment:9 by Csirmaz Bendegúz, 5 weeks ago

Description: modified (diff)

comment:10 by Csirmaz Bendegúz, 5 weeks ago

Description: modified (diff)

comment:11 by Csirmaz Bendegúz, 5 weeks ago

Description: modified (diff)

comment:12 by Csirmaz Bendegúz, 5 weeks ago

Description: modified (diff)

comment:13 by Csirmaz Bendegúz, 4 weeks ago

Description: modified (diff)
Note: See TracTickets for help on using tickets.
Back to Top