Opened 14 months ago

Closed 14 months ago

Last modified 12 months ago

#34884 closed Cleanup/optimization (wontfix)

Half bug/half enhancement : inconsistent behavior of get_or_create() regarding related attributes cache

Reported by: Laurent Lyaudet Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: ORM get_or_create
Cc: laurent.lyaudet@… Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hello,
The sample script below demonstrates that get_or_create() avoid unnecessary DB query for related object given as argument only if created is True.
It would be nice if the case created is False could be made coherent with it.
Setting the cache when possible would avoid having to do all over the code:

my_object = MyModel.objects.get_or_create(related_object=related_object)
my_object.related_object = related_object

I can code a patch and provide it in a few days if my idea is accepted :)
Here is the script:

In [1]: from django.db import connection

In [2]: from django.contrib.contenttypes.models import ContentType

In [3]: from django.contrib.auth.models import Permission

In [4]: content_type = ContentType.objects.get_for_model(Permission)

In [5]: permission, created = Permission.objects.get_or_create(name="Test", content_type=content_type, codename="Test")

In [6]: created
Out[6]: True

In [7]: connection.queries
Out[7]: 
[{'sql': "SELECT t.oid, typarray FROM pg_type t JOIN pg_namespace ns ON typnamespace = ns.oid WHERE typname = 'hstore'",
  'time': '0.001'},
 {'sql': "SELECT typarray FROM pg_type WHERE typname = 'citext'",
  'time': '0.000'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE ("django_content_type"."app_label" = \'auth\' AND "django_content_type"."model" = \'permission\')',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES (\'Test\', 2, \'Test\') RETURNING "auth_permission"."id"',
  'time': '0.001'}]

In [8]: permission.content_type
Out[8]: <ContentType: permission>

In [9]: connection.queries
Out[9]: 
[{'sql': "SELECT t.oid, typarray FROM pg_type t JOIN pg_namespace ns ON typnamespace = ns.oid WHERE typname = 'hstore'",
  'time': '0.001'},
 {'sql': "SELECT typarray FROM pg_type WHERE typname = 'citext'",
  'time': '0.000'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE ("django_content_type"."app_label" = \'auth\' AND "django_content_type"."model" = \'permission\')',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES (\'Test\', 2, \'Test\') RETURNING "auth_permission"."id"',
  'time': '0.001'}]

In [10]: permission, created = Permission.objects.get_or_create(name="Test", content_type=content_type, codename="Test")

In [11]: created
Out[11]: False

In [12]: connection.queries
Out[12]: 
[{'sql': "SELECT t.oid, typarray FROM pg_type t JOIN pg_namespace ns ON typnamespace = ns.oid WHERE typname = 'hstore'",
  'time': '0.001'},
 {'sql': "SELECT typarray FROM pg_type WHERE typname = 'citext'",
  'time': '0.000'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE ("django_content_type"."app_label" = \'auth\' AND "django_content_type"."model" = \'permission\')',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES (\'Test\', 2, \'Test\') RETURNING "auth_permission"."id"',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'}]

In [13]: permission.content_type
Out[13]: <ContentType: permission>

In [14]: connection.queries
Out[14]: 
[{'sql': "SELECT t.oid, typarray FROM pg_type t JOIN pg_namespace ns ON typnamespace = ns.oid WHERE typname = 'hstore'",
  'time': '0.001'},
 {'sql': "SELECT typarray FROM pg_type WHERE typname = 'citext'",
  'time': '0.000'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE ("django_content_type"."app_label" = \'auth\' AND "django_content_type"."model" = \'permission\')',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES (\'Test\', 2, \'Test\') RETURNING "auth_permission"."id"',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE "django_content_type"."id" = 2',
  'time': '0.001'}]

Best regards,

Laurent Lyaudet

Change History (10)

comment:1 by Laurent Lyaudet, 14 months ago

I coded a monkey patch for my use case where it will not be soon to upgrade Django and in case some bikeshedding occurs that puts to trash coherency and efficiency in yet another open source software where some people against free software have influence.
Here it is :

get_or_create_monkey_patch.py

from django.db.models import QuerySet, ForeignKey


original_get_or_create = QuerySet.get_or_create


def patched_get_or_create(self, defaults=None, **kwargs):
     result, created = original_get_or_create(self, defaults=defaults, **kwargs)
     if defaults is not None:
         for key, value in defaults.items():
             if isinstance(result._meta.get_field(key), ForeignKey):
                 # isinstance handles OneToOneField also.
                 setattr(result, key, value)
     for key, value in kwargs.items():
         if isinstance(result._meta.get_field(key), ForeignKey):
             # isinstance handles OneToOneField also.
             setattr(result, key, value)
     return result, created


QuerySet.get_or_create = patched_get_or_create

You can execute it at the end of the previous script + some test and obtain this:

In [15]: from django.db.models import QuerySet, ForeignKey

In [16]: original_get_or_create = QuerySet.get_or_create

In [17]: def patched_get_or_create(self, defaults=None, **kwargs):
    ...:      result, created = original_get_or_create(self, defaults=defaults, **kwargs)
    ...:      if defaults is not None:
    ...:          for key, value in defaults.items():
    ...:              if isinstance(result._meta.get_field(key), ForeignKey):
    ...:                  # isinstance handles OneToOneField also.
    ...:                  setattr(result, key, value)
    ...:      for key, value in kwargs.items():
    ...:          if isinstance(result._meta.get_field(key), ForeignKey):
    ...:              # isinstance handles OneToOneField also.
    ...:              setattr(result, key, value)
    ...:      return result, created
    ...: 

In [18]: QuerySet.get_or_create = patched_get_or_create

In [19]: permission, created = Permission.objects.get_or_create(name="Test", content_type=content_type, codename="Test")

In [20]: created
Out[20]: False

In [21]: connection.queries
Out[21]: 
[{'sql': "SELECT t.oid, typarray FROM pg_type t JOIN pg_namespace ns ON typnamespace = ns.oid WHERE typname = 'hstore'",
  'time': '0.001'},
 {'sql': "SELECT typarray FROM pg_type WHERE typname = 'citext'",
  'time': '0.000'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE ("django_content_type"."app_label" = \'auth\' AND "django_content_type"."model" = \'permission\')',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES (\'Test\', 2, \'Test\') RETURNING "auth_permission"."id"',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE "django_content_type"."id" = 2',
  'time': '0.000'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'}]

In [22]: permission.content_type
Out[22]: <ContentType: permission>

In [23]: connection.queries
Out[23]: 
[{'sql': "SELECT t.oid, typarray FROM pg_type t JOIN pg_namespace ns ON typnamespace = ns.oid WHERE typname = 'hstore'",
  'time': '0.001'},
 {'sql': "SELECT typarray FROM pg_type WHERE typname = 'citext'",
  'time': '0.000'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE ("django_content_type"."app_label" = \'auth\' AND "django_content_type"."model" = \'permission\')',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES (\'Test\', 2, \'Test\') RETURNING "auth_permission"."id"',
  'time': '0.001'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'},
 {'sql': 'SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE "django_content_type"."id" = 2',
  'time': '0.000'},
 {'sql': 'SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = \'Test\' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = \'Test\')',
  'time': '0.001'}]

In [24]: 

Version 0, edited 14 months ago by Laurent Lyaudet (next)

comment:2 by Laurent Lyaudet, 14 months ago

The previous patch lowered the running time of my automatic tests from 195 s to 181 s which is quite good for such a small patch.
I tried also a slightly more powerful patch but it yields no mesurable gain.
The second patch on get can totally replace the previous patch, you don't need both.

original_get = QuerySet.get


def patched_get(self, *args, **kwargs):
    result = original_get(self, *args, **kwargs)
    for key, value in kwargs.items():
        try:
            if isinstance(result._meta.get_field(key), ForeignKey):
                # isinstance handles OneToOneField also.
                setattr(result, key, value)
        except FieldDoesNotExist:
            pass
    return result


QuerySet.get = patched_get

This patch could still be improved to handle args also.
But I don't see a use case where one use:

my_object = MyModel.objects.get(Q(related_object=related_object))

instead of

my_object = MyModel.objects.get(related_object=related_object)

And clearly it could not be patched if one use ~Q() or Q() | Q().

comment:3 by Natalia Bidart, 14 months ago

Resolution: invalid
Status: newclosed

Hello Laurent, thank you for your ticket.

As far as I understand your report, the key to your issue is about using select_related when getting the Permission you need. Specifically, see how if I create a permission using the content_type's PK, the resulting object does not have the ContentType instance fetched by default:

>>> permission, created = Permission.objects.get_or_create(name="Test", content_type_id=content_type.pk, codename="OtherTest")
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

BEGIN

INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES ('Test', 2, 'OtherTest') RETURNING "auth_permission"."id"

COMMIT

>>> permission.content_type
<ContentType: Authentication and Authorization | permission>
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

BEGIN

INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES ('Test', 2, 'OtherTest') RETURNING "auth_permission"."id"

COMMIT

SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE "django_content_type"."id" = 2 LIMIT 21

You can see the extra query at the end when the content_type attribute is accessed. Similarly, you can avoid the "extra" query when the object exists by doing:

>>> permission, created = Permission.objects.select_related("content_type").get_or_create(name="Test", content_type_id=content_type.pk, codename="OtherTest")
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename", "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "auth_permission" INNER JOIN "django_content_type" ON ("auth_permission"."content_type_id" = "django_content_type"."id") WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

>>> permission.content_type
<ContentType: Authentication and Authorization | permission>
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename", "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "auth_permission" INNER JOIN "django_content_type" ON ("auth_permission"."content_type_id" = "django_content_type"."id") WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

in reply to:  3 comment:4 by Laurent Lyaudet, 14 months ago

Hello Natalia,

Your solution is suboptimal to say the least and denotes a lack of thought on what happens under the hood.
I was somehow prepared for such an answer that proves that BS reigns suprems in most FOSS.

Here are the problems :

  • First: adding a select_related is as cumbersome to write than the line:
    my_object.related_object = related_object
    

in my first ticket description :

my_object = MyModel.objects.get_or_create(related_object=related_object)
my_object.related_object = related_object
  • Second: adding a select_related is worse than the line above when it comes to the extra computation that you add both in Python and in the DBMS.

This join is useless, it is superfluous work for the machine, bad programming.

  • Third and most important: for the consistency of your data, with select_related you now have 2 objects of type ContentType.

If you update one of these 2 objects in Python, the other is not updated unless, you save the first and refresh from db the second.
This is plain bad programming.

  • Fourth: setting arrogantly * status: new => closed * resolution: => invalid is a full proof of how BS this kind of answer is.

Replying to Natalia Bidart:

Hello Laurent, thank you for your ticket.

As far as I understand your report, the key to your issue is about using select_related when getting the Permission you need. Specifically, see how if I create a permission using the content_type's PK, the resulting object does not have the ContentType instance fetched by default:

>>> permission, created = Permission.objects.get_or_create(name="Test", content_type_id=content_type.pk, codename="OtherTest")
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

BEGIN

INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES ('Test', 2, 'OtherTest') RETURNING "auth_permission"."id"

COMMIT

>>> permission.content_type
<ContentType: Authentication and Authorization | permission>
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename" FROM "auth_permission" WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

BEGIN

INSERT INTO "auth_permission" ("name", "content_type_id", "codename") VALUES ('Test', 2, 'OtherTest') RETURNING "auth_permission"."id"

COMMIT

SELECT "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "django_content_type" WHERE "django_content_type"."id" = 2 LIMIT 21

You can see the extra query at the end when the content_type attribute is accessed. Similarly, you can avoid the "extra" query when the object exists by doing:

>>> permission, created = Permission.objects.select_related("content_type").get_or_create(name="Test", content_type_id=content_type.pk, codename="OtherTest")
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename", "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "auth_permission" INNER JOIN "django_content_type" ON ("auth_permission"."content_type_id" = "django_content_type"."id") WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

>>> permission.content_type
<ContentType: Authentication and Authorization | permission>
>>> print("\n\n".join(i["sql"] for i in connection.queries))
SELECT "auth_permission"."id", "auth_permission"."name", "auth_permission"."content_type_id", "auth_permission"."codename", "django_content_type"."id", "django_content_type"."app_label", "django_content_type"."model" FROM "auth_permission" INNER JOIN "django_content_type" ON ("auth_permission"."content_type_id" = "django_content_type"."id") WHERE ("auth_permission"."codename" = 'OtherTest' AND "auth_permission"."content_type_id" = 2 AND "auth_permission"."name" = 'Test') LIMIT 21

comment:5 by Natalia Bidart, 14 months ago

Laurent, first of all, please maintain a professional tone and adhere to the Django Code of Conduct when composing/sending messages.

Secondly, if you disagree with this resolution, you are welcomed to follow the documentation and start a new conversation on the Django Forum, where you'll reach a wider audience and likely get more feedback.

comment:6 by Tim Graham, 14 months ago

Hi Laurent,

To echo what Natalian said, please avoid comments like "in case some bikeshedding occurs that puts to trash coherency and efficiency in yet another open source software where some people against free software have influence". It adds nothing helpful. Assuming bad faith on the part of the Django community is no way to introduce yourself and get people to want to work with you. Thank you.

As for the issue, I think it's a bit of a niche case. I mocked up a patch using your proposal. I'm not sure that the extra complexity is worthwhile, as well as the performance penalty of iterating through the kwargs (which is unnecessary when a foreign key isn't present). The patch may need consideration for other relations like GenericForeignKey. I'm not sure offhand. I wonder if update_or_create() has a similar inconsistency. I'm afraid that trying to make the ORM smarter about this might open a can of worms.

Cheers!

  • django/db/models/query.py

    diff --git a/django/db/models/query.py b/django/db/models/query.py
    index 1125302933..818d31faac 100644
    a b class QuerySet(AltersData):  
    939939        Return a tuple of (object, created), where created is a boolean
    940940        specifying whether an object was created.
    941941        """
     942        from django.db.models.fields.related import ForeignKey
    942943        # The get() needs to be targeted at the write database in order
    943944        # to avoid potential transaction consistency problems.
    944945        self._for_write = True
    945946        try:
    946             return self.get(**kwargs), False
     947            obj, created = self.get(**kwargs), False
     948            for key, value in kwargs.items():
     949                try:
     950                    if isinstance(obj._meta.get_field(key), ForeignKey):
     951                        # isinstance handles OneToOneField also.
     952                        setattr(obj, key, value)
     953                except exceptions.FieldDoesNotExist:
     954                    pass
     955            return obj, created
    947956        except self.model.DoesNotExist:
    948957            params = self._extract_model_params(defaults, **kwargs)
    949958            # Try to create an object using passed params.
    class QuerySet(AltersData):  
    953962                    return self.create(**params), True
    954963            except IntegrityError:
    955964                try:
     965                    # Another get() case that would need to be updated.
    956966                    return self.get(**kwargs), False
    957967                except self.model.DoesNotExist:
    958968                    pass
  • tests/get_or_create/tests.py

    diff --git a/tests/get_or_create/tests.py b/tests/get_or_create/tests.py
    index 0b56d6b1a2..4aac3ff580 100644
    a b class GetOrCreateTests(TestCase):  
    129129        self.assertEqual(book.authors.count(), 2)
    130130
    131131        # Try creating a book through an author.
    132         _, created = ed.books.get_or_create(name="Ed's Recipes", publisher=p)
     132        eds_recipes, created = ed.books.get_or_create(name="Ed's Recipes", publisher=p)
    133133        self.assertTrue(created)
     134        # In the created=True case, the publisher is set on the new object.
     135        with self.assertNumQueries(0):
     136            self.assertEqual(eds_recipes.publisher.name, 'Acme Publishing')
     137
     138        # In the case of created=False, the publisher isn't set on the object.
     139        eds_recipes, created = ed.books.get_or_create(name="Ed's Recipes", publisher=p)
     140        self.assertFalse(created)
     141        # This assertion fails due to the query required to fetch the
     142        # publisher, demonstrating #34884.
     143        with self.assertNumQueries(0):
     144            self.assertEqual(eds_recipes.publisher.name, 'Acme Publishing')
    134145
    135146        # Now Ed has two Books, Fred just one.
    136147        self.assertEqual(ed.books.count(), 2)

in reply to:  6 comment:7 by Laurent Lyaudet, 12 months ago

Hi Natalia, Hi Tim,

I did what was needed : https://pypi.org/project/django-monkey-patches/

Replying to Tim Graham:

Hi Laurent,

To echo what Natalian said, please avoid comments like "in case some bikeshedding occurs that puts to trash coherency and efficiency in yet another open source software where some people against free software have influence". It adds nothing helpful. Assuming bad faith on the part of the Django community is no way to introduce yourself and get people to want to work with you. Thank you.

Sorry, that's the result of previous interactions with the FOSS community among which Django here:
https://groups.google.com/g/django-developers/c/257nJTrf1Qk/m/7wz1AVj_AwAJ
where the silence closed the discussion.

As for the issue, I think it's a bit of a niche case. I mocked up a patch using your proposal. I'm not sure that the extra complexity is worthwhile, as well as the performance penalty of iterating through the kwargs (which is unnecessary when a foreign key isn't present). The patch may need consideration for other relations like GenericForeignKey. I'm not sure offhand. I wonder if update_or_create() has a similar inconsistency.

In my tests, as I previously said, there is a global performance gain of more than 7 % ! (for such a small patch), despite the very small penalty in some cases.
There is several orders of magnitude between iterating over some kwargs and avoiding a DB query (a fortiori if you have an n-tiers architecture).
update_or_create() is already patched since it uses get_or_create() internally: https://github.com/django/django/blob/f7389c4b07ceeb036436e065898e411b247bca78/django/db/models/query.py#L967

I'm afraid that trying to make the ORM smarter about this might open a can of worms.

FUD, if it's not precise, it's not constructive and does not help having a clear view of the question.
Maybe you do not it intentionally, but maybe you do, I can't know.
https://en.wikipedia.org/wiki/Fear,_uncertainty,_and_doubt

Replying to Natalia Bidart:

Laurent, first of all, please maintain a professional tone and adhere to the ​Django Code of Conduct when composing/sending messages.

Yes but it says "This code applies equally to founders, mentors and those seeking help and guidance."
And closing my ticket without an open discussion before is a kind of slap in the face that is unfortunately current on the Internet.
It says also "Be friendly and patient", "Be welcoming", "Be considerate", "Be respectful".
I'm not the only one who does not strictly followed this code of conduct here.

Secondly, if you disagree with this resolution, you are welcomed to follow ​the documentation and start a new conversation on the ​Django Forum, where you'll reach a wider audience and likely get more feedback.

Well if the forum is as responsive as the dev google-group...
Sorry, I like truth and progress, my energy can bring a lot of good to the world, but it cannot overcome people that do not want to try to see the problem at hand like me for 5 minutes.

Best regards,
Laurent Lyaudet

comment:8 by Tim Graham, 12 months ago

Resolution: invalidwontfix
Type: UncategorizedCleanup/optimization

where the silence closed the discussion.

Unfortunately Django cannot accommodate every feature request. If no one finds your proposal compelling enough to respond, perhaps it is too niche. In the mailing list thread you mentioned, I'm afraid that the lack of indentation may have prevented some engagement. I guess the forum doesn't suffer from that issue. I'm not active on the forum, but I have heard that it's more active than the mailing list.

I'm afraid that trying to make the ORM smarter about this might open a can of worms.

What I meant is that other changes may be required to make created and non-created model instances identical. The patch I offered already adds non-trivial complexity and only handles foreign keys.

And closing my ticket without an open discussion before is a kind of slap in the face that is unfortunately current on the Internet.

I understand that it can come off that way, but as you see, a closed ticket doesn't end the discussion. We don't like to leave tickets in an untriaged state while the discussion takes place. I am going to change the ticket resolution to "wontfix" because I think it's something that could be changed if there is clear community consensus. I offer my opinion that I don't think the proposal would be accepted, but no one is stopping you from engaging further if this feature is important enough to you. A more compelling proposal would address my doubt about whether or not foreign key caching is the only difference between created and non-created model instances.

comment:9 by Laurent Lyaudet, 12 months ago

Hello Tim,
Thanks for your constructive answer.
I did not think about making the created and the non-created model instances strictly identical.
I had a much simpler goal, my approach was more agile and more bazaar on this topic.
But I like the cathedral way of thinking.
And it's maybe more appropriate to think about it for a framework where you have no control on the code using the framework.
This should not be possible to make it so, since the "API"/convention of get_or_create() is to apply the defaults values (from "defaults" kwarg) only on creation.
As of now it is not clear yet, in my mind, how hacky a "field taken from a query_filter" can be processed.
I mean, for example, is it possible to have a field that is a valid component of a query filter but also has a "setter" hidden in the class code ?
Can you have :

my_counter = models.IntegerField(...)
my_foreign_key = models.ForeignKey(...)

with some custom class code such that
whenever my_foreign_key is set the field my_counter is incremented.
I would be able to do it with a property but the property would have a name distinct from "my_foreign_key".
If I can be provocative, someone that do this deserves to debug his code when he updates his framework ;) XD.
More seriously, almost anything is possible in the code using Django,
and thus it is about direction.
Breaking changes can happen in any framework on major version releases.
I think my goal/direction was reasonable.
I am just trying to correct a cache inconsistency,
whilst having identical instances in both cases seems to be impossible.
In my point of view, the trade-off is positive,
but I'm open to think about it further.
In any case, correcting this cache inconsistency would be a step toward identical instances in both cases
Can you restrict your goal of identical instances in both cases to something more "possible" ?
What kind of compatibility problems do you foresee ?
Best regards,
Laurent Lyaudet

comment:10 by Tim Graham, 12 months ago

I have said my piece. I don't think the additional complexity in Django is warranted for what looks like a niche case. The workaround of setting the attribute after get_or_create() isn't particularly burdensome. You can try to solicit the opinions of others if you are so motivated.

Note: See TracTickets for help on using tickets.
Back to Top