Opened 2 years ago

Closed 2 years ago

#33937 closed Cleanup/optimization (fixed)

Optimize m2m serialization to avoid loading full model instances

Reported by: Gordon Wrigley Owned by: mark evans
Component: Core (Serialization) Version: 4.0
Severity: Normal Keywords: performance
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Adam Johnson)

When not using natural keys, the handle_m2m_field function(source) loads the full object for every entry in the m2m model, when it only needs the pks. The pk's can even be obtained from the m2m intermediate table, without joining the target table.

In my case the table we are m2m'ing to has files in it, so that's a weighty fetch.
We are using django-reversion which stores a serialized version of each save.
On the workload that flagged this up enabling reversion incurs a 300x performance hit (from half a second to 2.5 minutes) and it's almost entirely because of this.

Change History (9)

comment:1 by Adam Johnson, 2 years ago

Description: modified (diff)
Summary: Serialize is loading full objects when serializing m2m fields.Optimize m2m serialization to avoid loading full model instances
Triage Stage: UnreviewedAccepted
Type: UncategorizedCleanup/optimization

comment:2 by Mariusz Felisiak, 2 years ago

When not using natural keys, the handle_m2m_field function(​source) loads the full object for every entry in the m2m model, when it only needs the pks.

What about natural keys? They can contain any combination of fields (also non-unique in general).

in reply to:  2 comment:3 by Gordon Wrigley, 2 years ago

Replying to Mariusz Felisiak:

When not using natural keys, the handle_m2m_field function(​source) loads the full object for every entry in the m2m model, when it only needs the pks.

What about natural keys? They can contain any combination of fields (also non-unique in general).

It loads the whole objects for the natural key case as well, but I don't know if that is avoidable, whereas the PK case seems like it should be easily avoidable (and is also the one I care about :P )

comment:4 by mark evans, 2 years ago

Owner: changed from nobody to mark evans
Status: newassigned

Going to take this on as my first Django ticket. Should have a patch shortly.

comment:5 by mark evans, 2 years ago

Has patch: set

comment:6 by mark evans, 2 years ago

I've submitted a patch here: https://github.com/django/django/pull/16028

comment:7 by Mariusz Felisiak, 2 years ago

Patch needs improvement: set

comment:8 by Mariusz Felisiak, 2 years ago

Patch needs improvement: unset
Triage Stage: AcceptedReady for checkin

comment:9 by Mariusz Felisiak <felisiak.mariusz@…>, 2 years ago

Resolution: fixed
Status: assignedclosed

In 19e0587e:

Fixed #33937 -- Optimized serialization of related m2m fields without natural keys.

Note: See TracTickets for help on using tickets.
Back to Top