Opened 4 years ago

Last modified 3 weeks ago

#32491 assigned Bug

Updating a field based on a JSONField's sub-value adds extra quotes [postgres]

Reported by: Baptiste Mispelon Owned by: YashRaj1506
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Giannis Terzopoulos Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Consider the following model:

class JSONNamedModel(models.Model):
    name = models.CharField(max_length=10, blank=True)
    data = models.JSONField()

The following testcase seems to pass under sqlite but fails under postgres (postgresql version 13.1, psycopg2 version 2.8.6). I haven't tested other backends:

class ReproductionTestCase(TestCase):
    def test_issue_update(self):
        JSONNamedModel.objects.create(data={'name': 'django'})
        JSONNamedModel.objects.update(name=F('data__name'))
        self.assertQuerysetEqual(
            JSONNamedModel.objects.all(),
            ['django'],
            transform=lambda m: m.name,
        )

To summarize what happens in the test, if you have an instance with data={'name': 'django'} and you do update(name=F('data__name')), you end up with an instance that has name='"django"' (the original name with extra double quotes around it).

Interestingly, if you start with django", you end up with "django\"" (the original " is backslash-escaped and the whole thing is wrapped by double quotes).

Not sure how relevant this is, but while digging into this issue I noticed that the test failure changed somewhat recently (bisected down to commit 8b040e3cbbb2e81420e777afc3ca48a1c8f4dd5a).
Before that commit, the error was django.core.exceptions.FieldError: Joined field references are not permitted in this query

Change History (8)

comment:1 by Baptiste Mispelon, 4 years ago

Initially, I thought that adding an explicit Cast (using update(name=Cast(F('data__name'), CharField()))) might solve the issue, but the test still fails in the same way.

I got it to work using KeyTextTransform (which doesn't seem to be documented based on a quick git grep): update(name=KeyTextTransform('name', 'data')).

Adding to my confusion was the fact that this behavior doesn't appear when using annotate: annotate(dataname=F('data__name')) correctly yields django without the extra quotes.

I'm not sure if this is really a bug or how fixable it is (I don't see how Django could guess that it should use KeyTextTransform over KeyTransform) but silently adding extra " seems pretty bad.
Having KeyTransform/KeyTextTransform documented could help, though I'm not sure if I would have discovered their existence even if they were.

comment:2 by Simon Charette, 4 years ago

Triage Stage: UnreviewedAccepted

Bonjour Baptiste, always a pleasure to see you around :)

Initially, I thought that adding an explicit Cast (using update(name=Cast(F('dataname'), CharField()))) might solve the issue, but the test still fails in the same way.

yeah that's expected since ::json::text is pretty much the equivalent of json.dumps (e.g. see #27257 for a similar problem)

Adding to my confusion was the fact that this behavior doesn't appear when using annotate: annotate(dataname=F('dataname')) correctly yields django without the extra quotes.

Right this happens because JSONField.from_db_value attempts a json.loads on the return value so '"foo"' -> 'foo'

I'm not sure if this is really a bug or how fixable it is (I don't see how Django could guess that it should use KeyTextTransform over KeyTransform) but silently adding extra " seems pretty bad. Having KeyTransform/KeyTextTransform documented could help, though I'm not sure if I would have discovered their existence even if they were.

It could be done by having transforms used for_save have access to output_field of the left hand side of the UPDATE/INSERT (e.g a CharField) in this case and decide whether ->> or -> should be used but that would be slightly backward incompatible.

An alternative could be to introduce an __astext lookup on JSONField that would translate to the usage of KeyTextTransform. That would make it explicit which SQL operator must be used and a similar approach could be used for __asint and __asfloat to remove the lookups hacks for textual and numeric values as right now it's not possible to do things like jsonfield__name__gt="Simon"

Last edited 4 years ago by Simon Charette (previous) (diff)

comment:3 by Nick Pope, 4 years ago

Also FYI: There is an open ticket about documenting KeyTransform() and KeyTextTransform(): #26511

comment:4 by m_moeinzadeh, 17 months ago

Owner: changed from nobody to m_moeinzadeh
Status: newassigned

comment:5 by m_moeinzadeh, 16 months ago

Owner: m_moeinzadeh removed
Status: assignednew

neither models.JSONField() functions or backend converters of django support changing the value of the models.JSONField()'s value
both save() and get() functions of django area working without any bugs and the extra set of quotations is being added by while converting the query data to flat data
in order to fix it, a new update is required on converters so they can apply on the value of a models.JSONField() value not only on the key of it's value
therefore this ticket requires major changes that are outside of this tickets tritory that might cause great problems in other features and i suggest adding full support for models.JSONField() to converters

comment:6 by Giannis Terzopoulos, 11 months ago

Cc: Giannis Terzopoulos added

comment:7 by YashRaj1506, 3 months ago

Owner: set to YashRaj1506
Status: newassigned

comment:8 by YashRaj1506, 3 weeks ago

So it was mentioned that it works well with annotate but when i am trying to do some tests, it adds a ' " ' after django, so i get -> 'django"'

 class ReproductionTestCase(TestCase):
     def test_issue_update(self):
         JSONNamedModel.objects.create(data={'name': 'django"'})
         queryset = JSONNamedModel.objects.annotate(dataname=F('data__name'))
         self.assertEqual([obj.dataname for obj in queryset], ['django'])

this is the result on the terminal for the above mentioned testcase:

======================================================================
FAIL: test_issue_update (users.tests.ReproductionTestCase.test_issue_update)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/yash/Desktop/django-trial/trial/users/tests.py", line 20, in test_issue_update
    self.assertEqual([obj.dataname for obj in queryset], ['django'])
AssertionError: Lists differ: ['django"'] != ['django']

First differing element 0:
'django"'
'django'

- ['django"']
?         -

+ ['django']

----------------------------------------------------------------------
Ran 1 test in 0.002s

FAILED (failures=1)
Destroying test database for alias 'default'...

so ig the issue still persists from the annotate side too.

Note: See TracTickets for help on using tickets.
Back to Top