Opened 9 months ago

Last modified 6 months ago

#35167 assigned Bug

JSONField get_db_prep_value being called with `Cast` types

Reported by: Samantha Hughes Owned by: HeejunShin
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords: JSONField
Cc: Samantha Hughes, Simon Charette, Krish, Jacob Walls Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Running django 4.2.9.

We have a customized version of the standard JsonField where we have overridden from_db_value and get_prep_value.

The worked in django ~4.1. Migrating to 4.2.9, the get_prep_value function started receiving Cast objects as the value, exploding json serializing/encryption , only during bulk_update.

We fixed it by overriding `get_db_prep_save to

def get_db_prep_save(self, value, connection):
         if hasattr(value, "as_sql"):
             return value
         return self.get_db_prep_value(value, connection=connection, prepared=False)

The jsonfield implementation does not check for as_sql. I'm not sure if this is intentional or not (unmodified JSONField columns don't seem to explode), possibly our implementation has messed up the jsonfield?

Change History (24)

comment:1 by Simon Charette, 9 months ago

Triage Stage: UnreviewedAccepted
Type: UncategorizedBug

This highly likely relates to #34539 which was about making sure get_db_prep_value always calls get_prep_value as it use to be case in 4.1.

Given the default implementation of get_db_prep_save doesn't delegate to get_db_prep_value when provided a compilable expression I believe we should also make sure JSONField.get_db_prep_save does something equivalent.

The challenge here is that Value(output_field=JSONField()), which represents wrapped literal JSON values which could also be JSON strings, must be able to make their way to connection.ops.adapt_json_value(value, self.encoder). At least they needed during the deprecation period maybe it's no longer the case since Value(value).as_sql will do the right thing by itself.

comment:2 by Simon Charette, 9 months ago

You mentioned that you are using Django 4.2 so the following patch won't apply cleanly against your branch but against main it gets the tests passing

  • django/db/models/fields/json.py

    diff --git a/django/db/models/fields/json.py b/django/db/models/fields/json.py
    index 571e6e79f3..73b4a4dd50 100644
    a b def get_internal_type(self):  
    9999    def get_db_prep_value(self, value, connection, prepared=False):
    100100        if not prepared:
    101101            value = self.get_prep_value(value)
    102         if isinstance(value, expressions.Value) and isinstance(
    103             value.output_field, JSONField
    104         ):
    105             value = value.value
    106         elif hasattr(value, "as_sql"):
    107             return value
    108102        return connection.ops.adapt_json_value(value, self.encoder)
    109103
    110104    def get_db_prep_save(self, value, connection):
     105        # This slightly convoluted logic is to allow for `None` to be used to
     106        # store SQL `NULL` while `Value(None, JSONField())` can be used to
     107        # store JSON `null` while preventing compilable `as_sql` values from
     108        # making their way to `get_db_prep_value` which is what the `super()`
     109        # implementation does.
    111110        if value is None:
    112111            return value
    113         return self.get_db_prep_value(value, connection)
     112        elif (
     113            isinstance(value, expressions.Value)
     114            and value.value is None
     115            and isinstance(value.output_field, JSONField)
     116        ):
     117            value = None
     118        return super().get_db_prep_save(value, connection)
    114119
    115120    def get_transform(self, name):
    116121        transform = super().get_transform(name)

Basically now that the deprecation shims have been removed the logic can be slightly simplified to only have to special case the Value(None, JSONField()) which is truly the only problematic case in the implementation of JSONField due to ambiguous nature of None with regards to SQL NULL and JSON null.

comment:3 by Simon Charette, 9 months ago

Cc: Simon Charette added

comment:4 by Samantha Hughes, 9 months ago

Thanks, the quick response is really appreciated. Any idea if this will make it to a 4.2 patch or will we need to upgrade to ~5?

comment:5 by Jacob Walls, 9 months ago

Component: Error reportingDatabase layer (models, ORM)
Owner: set to nobody

comment:6 by Simon Charette, 9 months ago

Any idea if this will make it to a 4.2 patch or will we need to upgrade to ~5?

Assuming someone prepares a patch it should make the cut for 5.1 meant to be released in August 2024. It won't be backported to 4.2 or 5.0 at this point unfortunately.

comment:7 by Krish, 9 months ago

Cc: Krish added

comment:8 by Jacob Walls, 9 months ago

Cc: Jacob Walls added
Summary: JSONFIeld get_db_prep_value being called with `Cast` typesJSONField get_db_prep_value being called with `Cast` types

comment:9 by HeejunShin, 9 months ago

Could i try this ticket?

comment:10 by bcail, 9 months ago

@HeejunShin, nobody owns this ticket, so you're welcome to work on this ticket. You can assign it to yourself and/or add comments as you work on it.

comment:11 by Almaz, 9 months ago

Owner: changed from nobody to Almaz
Status: newassigned

comment:12 by Almaz, 8 months ago

Is this a relevant problem?

Because, after some period of exploration, I see that most of the code were updated.

comment:13 by Almaz, 8 months ago

Owner: Almaz removed
Status: assignednew

comment:14 by HeejunShin, 8 months ago

Is this ticket solved?

in reply to:  14 comment:15 by Simon Charette, 8 months ago

Replying to HeejunShin:

Is this ticket solved?

No. While it has an inline tentative patch no one has opened a PR on Github with the proposed changes including a regression test yet.

comment:18 by HeejunShin, 8 months ago

Owner: set to HeejunShin
Status: newassigned

in reply to:  16 comment:19 by HeejunShin, 8 months ago

Replying to Simon Charette:

thank you for your response.

Last edited 8 months ago by HeejunShin (previous) (diff)

comment:20 by HeejunShin, 7 months ago

Sorry for the slow progress. I am currently analyzing the code looking at the customizing doc of the Field class.

comment:21 by Natalia Bidart, 7 months ago

Ticket #35381 is related to this.

in reply to:  21 comment:22 by HeejunShin, 7 months ago

Replying to Natalia Bidart:

Ticket #35381 is related to this.

Is this a problem that needs to be resolved as quickly as possible?
Should I also refer to the ticket you mentioned to solve the problem?
Any advice would be greatly appreciated.

comment:23 by Natalia Bidart, 7 months ago

Hello HeejunShin, thank you for asking! You can take the time you need to work on this ticket, in general no ticket is urgent unless is a release blocker, which is not the case for this one.
Also, if you can't work on this anymore, that is also fine and you can just de-assign yourself from the ticket and eventually someone else would pick it up.

Django is built and maintained by its community and everyone contributes their time and energy when they can.

in reply to:  23 comment:24 by HeejunShin, 7 months ago

Replying to Natalia Bidart:

Hello HeejunShin, thank you for asking! You can take the time you need to work on this ticket, in general no ticket is urgent unless is a release blocker, which is not the case for this one.
Also, if you can't work on this anymore, that is also fine and you can just de-assign yourself from the ticket and eventually someone else would pick it up.

Django is built and maintained by its community and everyone contributes their time and energy when they can.

thank you for your response.
I'll try my best.

comment:25 by HeejunShin, 6 months ago

It's really late though
I was just analyzing the code and reproduced the bug.

comment:26 by HeejunShin, 6 months ago

i have a question.
maybe The bug reproduction seems to be incorrect.

i tested using 2 ways.

tests[1].json = {"test": Cast('good', output_field=CharField())}
Test.objects.bulk_update(tests_will_be_updated, fields=["json"])
tests[1].json = Cast("good", output_field=CharField())
Test.objects.bulk_update(tests_will_be_updated, fields=["json"])

first one, i got "raise TypeError(f'Object of type {o.class.name} 'TypeError: Object of type Cast is not JSON serializable" error.
but value argument type in get_db_prep_value is dict.

second one, i could not get error.
value argument type in get_db_prep_value is <class 'django.db.models.expressions.Case'>. (CASE WHEN <WhereNode: (AND: Exact(Col(polls_test, polls.Test.id), 2))> THEN Cast(Col(polls_test, polls.Test.json)), ELSE Value(None)).

i thought first one is type dict, so this is wrong.
is second one wrong way that reproduce bug case?

Note: See TracTickets for help on using tickets.
Back to Top