#26749 closed Cleanup/optimization (fixed)
Preserve behavior of use_for_related_field during deprecation
Reported by: | Julien Hartmann | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.10 |
Severity: | Normal | Keywords: | manager |
Cc: | Loic Bistuer, Shai Berger | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I am in the process of testing Django 1.10 and upgrading the third-party package I maintain, django-hvad.
This ticket is related to the two following commits:
- Streamlined manager inheritance.: https://github.com/django/django/commit/3a47d42fa33012b2156bf04058d933df6b3082d2
- Introduced new APIs to specify models' default and base managers: https://github.com/django/django/commit/ed0ff913c648b16c4471fc9a9441d1ee48cb5420
The issue I have with the way inheritance and customization now works is it is very hard to have a different manager as the base_manager
and the one used for related field descriptors.
Why the need?
Django-hvad is one of the modules that change the default querysets. It does it by replacing the default manager with one that brings extra features. Namely, automatic joining onto a translations model.
Having those features available on related fields is essential to provide a consistent interface, and after years of having it, it is now essential to backward compatibility as well.
It does not break the basic rule of “a base manager must not hide objects”, as it merely adds automatic select_related()
rules and iterator()
encapsulation.
Yet, those extra features cannot be on the base_manager
, due to the way ORM internals use it (for instance, triggering additional joins on a delete would not work).
Why the ticket?
Up to Django 1.9, this was possible using use_for_related_fields = True
, and fixing the _base_manager
in the class_prepared
hook. Not clean but workable.
With the new API though, all the manager machinery uses property descriptors on the Options
class, making it difficult and fragile to override. And at the same time, using use_for_related_fields
got deprecated, which means there is no option to use a different manager as the base manager and as the related field descriptor's manager anymore.
Unless there is another way to do this?
Change History (23)
comment:1 by , 8 years ago
Cc: | added |
---|
comment:2 by , 8 years ago
comment:3 by , 8 years ago
Triage Stage: | Unreviewed → Accepted |
---|
After discussion with loic in IRC, I think it makes sense to go ahead and add a single related_manager_name
option. That would cover this use case (and all the ones I can think of), and wouldn't preclude later layering on many_related_manager_name
or single_related_manager_name
if someone finds compelling needs for them later on.
comment:4 by , 8 years ago
Could it be possible to clarify the intended use of base_manager
? All over Django codebase, it's only used at 4 points:
- when finding out the list of related objects in the deletion collector
- when saving an existing model, the internal
_update
method is called on a queryset created by te base manager. - when saving a newly created model, the internal
_insert
method is called on the base manager. - in the
contrib.contenttypes
application, for ̀GenericForeignKey`.
- And well, since the changes of the linked commits, it's now also used for related fields, but hopefully that should be gone or overridable soon.
My understanding is use cases 1-3 are internal uses, and are the intended users of the base manager.
On the other hand, use cases 4-5 expose the manager to user code, and should be overridable to offer custom behavior to user code without altering ORM internals.
The change you suggest, with related_manager_name
, seems in line with my understanding? If so, should GenericForeignKey
use it as well?
(by the way, in the default manager documentation, I think it should say default_manager_name
instead of base_manager_name
?)
comment:5 by , 8 years ago
Could it be possible to clarify the intended use of exhaustive?
I was originally planning to document the exhaustive list of base manager usages, but Tim actually wrote the docs for that patch and went with a terser description. I'm not against adding a bit more details, patch welcome.
since the changes of the linked commits, it's now also used for related fields, but hopefully that should be gone or overridable soon
The previous implementation of one-related descriptors weren't representative of intent:
- The forward M2O descriptor had its own logic with regards to
use_for_related_fields
because its implementation predated the introduction of_base_manager
(7d9b29a56ddc1a793a02d55743c372a8256b97aa). - The reverse O2O descriptor as originally implemented only used
_base_manager
until I changed it in Django 1.8 (e043aae9bb2e3fa244f59b07fc16f57476cb80ff).
In practice use_for_related_fields
meant "the default manager can also be used as base manager" since the introduction of _base_manager
, as demonstrated by your need to monkeypatch _base_manager
back to a plain manager during the class preparation. Also _base_manager
was introduced to generalize the use_for_related_fields
mechanism as documented in the historical note that was commited the same day _base_manager
was introduced.
My understanding is use cases 1-3 are internal uses, and are the intended users of the base manager.
I don't see a distinction between internal and external use. There is nothing internal about the concept of inserting data and/or wanting to guarantee unfiltered results (e.g. dumpdata, collection for deletion, etc.), and for that you'd want to use a plain manager. Granted those are low-level operations that most projects don't need to worry about, but 3rd-party libraries are definitely among the intended users.
The change you suggest, with related_manager_name, seems in line with my understanding? If so, should GenericForeignKey use it as well?
Definitely, GenericForeignKey and GenericRelation actually don't pick the most adequate managers due to their reliance on the get_object_for_this_type
and get_all_objects_for_this_type
utilities which use _base_manager
. I'm planning to undocument these and remove every usage in Django (deprecating them would break too much working code) in favor of inlining their code.
(by the way, in the default manager documentation, I thinkg it should say default_manager_name instead of base_manager_name?)
Well spotted, want to make a patch?
comment:6 by , 8 years ago
Before changing anything, I must be certain to fully understand.
What I meant by "internal" is the way the manager is used does not expose it to user code in cases 1-3. In those use cases, the manager provides a service to Django ORM, which could completely be implemented with no manager interaction, user code would not notice.
→ On the other hand, in cases 4-5, Django simply picks the manager and passes it to user code, which makes it possible for the manager to expose additional features to user code.
So we have two very different use patterns. Therefore it makes sense to separate them — related_manager
and base_manager
. Actually, I'd even say that related_manager
is closer to default_manager
than base_manager
in purpose and use pattern.
Let's get back to base_manager
. Actually, cases 1-3 are not the whole story, as you pointed out. It also serves the purpose of providing a unfiltered-access-to-model guarantee.
I believe that's a big red flag. Because those two purposes are different: cases 1-3 are consumed by the ORM while the unfiltered guarantee is consumed by user code.
Now, this unfiltered-guarantee is clearly defined and definitely a good addition to the public side of the API. No problem with that. Now, though, the purpose of customizing the inner workings of the ORM seems to be inconsistently served:
- Overriding insertion:
- Either override
MyModel._do_insert
- or just let it forward the call to the base_manager and override
MyBaseManager._insert
- Either override
- Overriding updates:
- Override
Queryset.update
(for queryset-based updates) - and either:
- override
MyModel._do_update
- or let it forward the call to base_manager's queryset and override
QuerySet._update
- override
- Override
- Overriding deletion:
- If object can be fast deleted, override base_manager's queryset
_raw_delete
method. - otherwise, no luck, you cannot override deletion, it is a hardcoded
sql.DeleteQuery(model)
.
- If object can be fast deleted, override base_manager's queryset
There is no object clearly responsible for updating the database. Sometimes the queryset does it, sometimes the base_manager, sometimes the deletion collector bypasses both. Sometimes the calls go through the model, with a default implementation that simply forwards them and sometimes they don't.
Also, though base_manager_name
is becoming a public API, doing anything useful with it requires overriding private, undocumented API methods.
So, the final breakdown, as I understand it at the moment, is as such:
default_manager
: makes it possible to expose additional features to user code, in the general case.related_manager
: makes it possible to expose additional features to user code, on related fields.base_manager
: must provide the unfiltered, untampered guarantee for use in things such as dumpdata.
- unspecified API clinging to
base_manager
and living in undocumented, private API methods of the manager or its queryset: can be used to override low-level operations. That is, cases 1-3.
I would advocate splitting base_manager
and the unspecified API, for several reasons:
- it would make the
base_manager
's role and responsibilities clearly bound and defined. - there is no reason for those methods to live on the manager/queryset in the first place.
- the fact they live on the manager(/manager's queryset) means every manager has them, yet they are only ever used/usable on the base manager(/manager's queryset).
- the unspecified API is inconsistent and incomplete anyway, certainly not mature enough to go public with the
base_manager
.
Namely, I'd completely remove Manager._insert
and QuerySet._update
and let Model._do_insert
and Model._do_update
do the job.
I hope this didn't sound too bold, I'm trying my best to make my meaning unambiguous, somewhat at the expense of political correctness.
If you think this has some value, I can try to make a proof of concept patch.
comment:7 by , 8 years ago
If I understand correctly, this is an accepted ticket asking for design changes in a new feature. Shouldn't it be a blocker?
comment:8 by , 8 years ago
Cc: | added |
---|
comment:9 by , 8 years ago
I was taking a shot at adding the related_manager_name
.
I was wondering what it should default to, if unspecified: the base_manager
or default_manager
? To me, default_manager
makes more sense, but base_manager
is closer to former behavior.
comment:10 by , 8 years ago
@spectras, I already have a branch that introduce related_manager_name
, but there are a bunch of design decisions to make that I'm not comfortable making this late in the 1.10 cycle. I don't want to introduce new APIs in a rush.
Instead I'd like to adapt the current deprecation shim for use_for_related_field
so django-hvad can use it to work on Django 1.10, and then introduce definitive APIs in Django 1.11.
comment:11 by , 8 years ago
Alright, well maybe it's better to just summarize what is needed:
If user code did not change any manager, then:
→ hvad must override the default manager and the manager used by related fields, for an hvad-enabled one.
And in all cases, even if the user did override a manager, then:
→ hvad must ensure the base manager is not hvad-enabled: isinstance(model._base_manager, TranslationManager)
is False
).
This is because hvad-enabled manager is not suitable as a base manager. Though it does provide the unfiltered-guarantee, it breaks because it alters the behavior of the deletion collector.
I'll investigate whether it's possible to prevent that in a future version, but that's probably a full rewrite with backward incompatibilities, something not happening in time for Django 1.10.
comment:12 by , 8 years ago
@spectras, could you please try https://github.com/django/django/pull/6825.
Then on django-hvad:
Model._plain_manager = models.Manager()
.- Instead of patching
Model._base_manager
patchModel._meta.base_manager_name = '_plain_manager'
. TranslationManager.silence_use_for_related_fields_deprecation = True
comment:13 by , 8 years ago
Passes the test suite. I'll fiddle with it a bit more, but unless I find odd corner cases, it's good.
I like the way the PR only changes the deprecation path, allowing a clean base to refine the API on next release.
comment:14 by , 8 years ago
I guess the downside of this approach is that it'll trigger a migration for each model. Think it's acceptable?
comment:15 by , 8 years ago
The migration process has been very well streamlined since South functionality has been merged into Django, so yes — I'll make sure to mention it in the release notes. That will be the opportunity to include another change that requires a migration as well.
comment:16 by , 8 years ago
As the release is getting closer, I'm coming back here to know whether the POC fix will be merged into stable/1.10.x?
comment:17 by , 8 years ago
Type: | Uncategorized → Cleanup/optimization |
---|
I expect it should be, but the patch is missing a test. If you want to try writing one, that would be helpful.
comment:18 by , 8 years ago
Can do.
I checked current tests, this corner case is not covered. I'll try to add that today.
comment:19 by , 8 years ago
I opened another pull request as I cannot update loic's one. It is here: https://github.com/django/django/pull/6889
I'm open to all suggestions to improve it.
comment:20 by , 8 years ago
Has patch: | set |
---|---|
Patch needs improvement: | set |
comment:21 by , 8 years ago
Patch needs improvement: | unset |
---|---|
Summary: | Different manager for _base_manager and related descriptors → Preserve behavior of use_for_related_field during deprecation |
Triage Stage: | Accepted → Ready for checkin |
At some point in the design of the new API we had another model option called
related_manager_name
to do basically what OP is asking (discussions at https://github.com/django/django/pull/6175). There was some design decisions to make on whether it should only affect to-one rels, or every rels, and on whether we should introduce even more granularity with options liketo_one_related_manager_name
andto_many_related_manager_name
.