Opened 2 months ago

Closed 3 hours ago

#35842 closed Bug (fixed)

JSONField has_key, has_keys, has_any_keys lookups do not properly handle quotes on Oracle and SQLite

Reported by: Simon Charette Owned by: Sage Abdullah
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: oracle sqlite json key quote
Cc: Sage Abdullah Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

The following test reproduces issues on both Oracle and SQLite

  • tests/model_fields/test_jsonfield.py

    diff --git a/tests/model_fields/test_jsonfield.py b/tests/model_fields/test_jsonfield.py
    index ff42b1a14c..b1090b30ac 100644
    a b def test_has_key_number(self):  
    636636                    [obj],
    637637                )
    638638
     639    def test_has_key_special_chars(self):
     640        value = {
     641            'double"': "",
     642            "single'": "",
     643            "dollar$": "",
     644            "mixed$\"'.": "",
     645        }
     646        obj = NullableJSONModel.objects.create(
     647            value=value
     648        )
     649        obj.refresh_from_db()
     650        self.assertEqual(obj.value, value)
     651        for key in value:
     652            with self.subTest(key=key):
     653                self.assertSequenceEqual(
     654                    NullableJSONModel.objects.filter(value__has_key=key),
     655                    [obj],
     656                )
     657
    639658    @skipUnlessDBFeature("supports_json_field_contains")
    640659    def test_contains(self):
    641660        tests = [

In the case of Oracle the issue arise because it doesn't supporting binding variables in JSON_EXISTS (original discussion) so while the json.dumps of the key is believed to protect from SQL injections it can still result in crashes if the key contains a single quote character. Using the PASSING clause could possibly allow us to bypass this limitation or using a different escaping strategy could possibly be used to adjust the Oracle implementation.

---

In the case of SQLite the problem is with the double-quote character " because escapes generated by json.dumps are not properly interpreted by SQLite.

In other words "foo\"bar" is not properly interpreted as 'foo"bar and while SQLite allows you not to quote keys (e.g. JSON_TYPE(%s, '$.foo\"bar') IS NOT NULL) the solution is not viable for keys that contain both a double-quote and a symbol such as . as exemplified by the mixed key in the provide test.

Change History (21)

comment:1 by Mariusz Felisiak, 2 months ago

It's related to the #32213.

comment:2 by Natalia Bidart, 2 months ago

Triage Stage: UnreviewedAccepted

comment:3 by Sage Abdullah, 4 weeks ago

Cc: Sage Abdullah added

comment:4 by Sage Abdullah, 4 weeks ago

Owner: set to Sage Abdullah
Status: newassigned

comment:5 by Sage Abdullah, 3 weeks ago

Has patch: set
Patch needs improvement: set
Version: 5.0dev

PR, with more explanation in the PR description. Marking as needs improvement, I still need to write some more tests for the KeyTransform as I'm pretty sure it has the same issue.

Thanks Simon for such detailed description and suggestions!

comment:6 by Sage Abdullah, 3 weeks ago

Patch needs improvement: unset

I can confirm that KeyTransform is affected by the same issue, and I've added the tests in the PR.

I think #32213 is a different issue, though. That one is about the deserialization of the values, whereas the issue in this ticket is about the escaping of function parameters.

comment:7 by Simon Charette, 3 weeks ago

Thanks for the patch Sage, I do agree that #32213 is a different issue.

comment:8 by Mariusz Felisiak, 2 weeks ago

Triage Stage: AcceptedReady for checkin

comment:9 by Sarah Boyce <42296566+sarahboyce@…>, 2 weeks ago

In b28438f:

Refs #35842 -- Fixed handling of quotes in JSONField key lookups on Oracle.

comment:10 by Sarah Boyce, 2 weeks ago

Has patch: unset
Triage Stage: Ready for checkinAccepted

comment:11 by Sage Abdullah, 2 weeks ago

Apparently it's fixed in SQLite 3.37.0 (released 2024-10-21)!
It's not in the release notes, but you can see the changeset. Looks like it was backported to the 3.36 branch too, but given that 3.37.0 and even 3.37.1 have been released, it looks like there won't be any more 3.36.x releases (at least based on the chronology of past releases).

I can confirm this by downloading a precompiled SQLite from the download page and running the following query:

select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."foo\".bar"');

I have yet to run the Django test, as updating the SQLite version is a bit more complicated – I'll do that this weekend. Anyhow, if it works, I think we should remove the expected test failure if the SQLite version is >= 3.37.0.

Not sure how to update it in CI, though. It's tricky because you need to compile Python with the newer SQLite installed on the system so that the resulting libsqlite3.so.0 inside Python's lib directory is compiled with the newer SQLite. Or, you can compile it separately and then replace the lib file in the Python installation (or use the LD_LIBRARY_PATH environment variable). Alternatively, we can just wait until the Python that's distributed in the docker (?) image used in our Jenkins setup gets updated.

comment:12 by Simon Charette, 2 weeks ago

Apparently it's fixed in SQLite 3.37.0 (released 2024-10-21)!

The test_has_key_special_chars test attached to this ticket still fails against against SQLite 3.37.0 and on 3.46.0 on macOS (x86) and on Linux. I built it from source on both platform and used set env DYLD_INSERT_LIBRARIES on macOS and LD_PRELOAD on Linux.

I can confirm this by downloading a precompiled SQLite from ​the download page and running the following query:

I get different results on my side

SQLite version 3.37.0 2021-12-09 01:34:53
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."foo\".bar"');

sqlite> select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."x"');
y

To make sure that the built library is used you access sqlite3.sqlite_version in the Python terminal from the environment you used to run your tests

>>> import sqlite3
>>> sqlite3.sqlite_version
'3.46.0'
Version 0, edited 2 weeks ago by Simon Charette (next)

comment:13 by Simon Charette, 2 weeks ago

Re-reading your message and the release date (2024-10-21) I assume you meant 3.47.0 and not 3.37.0.

I also realized that my tests on macOS were bogus since SQLite is statically linked on this platform.

I can confirm the issue is solved on 3.47.0+

SQLite version 3.47.0 2024-10-21 16:30:22
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.
sqlite> select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."foo\".bar"');
ba"z

I guess this means we could conditionally include entries in django_test_expected_failures based on sqlite3.sqlite_version?

Last edited 2 weeks ago by Simon Charette (previous) (diff)

in reply to:  13 comment:14 by Sage Abdullah, 2 weeks ago

Replying to Simon Charette:

Re-reading your message and the release date (2024-10-21) I assume you meant 3.47.0 and not 3.37.0.

I guess this means we could conditionally include entries in django_test_expected_failures based on sqlite3.sqlite_version?

Oops, yep! I meant 3.47.0, sorry. Thanks for figuring it out 😄

I think so, we already have some conditionals in django_test_skips, so we should be OK with changing django_test_skips into a cached_property and check Database.sqlite_version_info.

Although it's going to be tricky to prove it in CI without adding/updating one of the jobs to use a newer SQLite version, as you experienced yourself.

I think it would still be useful to have at least one configurable SQLite version though, so we can test with the latest SQLite independently of the installed Python. That would be separate to this ticket, but I'd be happy to make the PR if we want to have that.

comment:15 by Simon Charette, 13 days ago

I think it would still be useful to have at least one configurable SQLite version though, so we can test with the latest SQLite independently of the installed Python. That would be separate to this ticket, but I'd be happy to make the PR if we want to have that.

I'm not sure how complex it would be on CI but having it on django-docker-box would be certainly valuable and since it's Linux based LD_PRELOAD should work just fine.

comment:16 by Sage Abdullah, 10 days ago

Has patch: set

I did manage to hack it on django-docker-box, but I'm not familiar enough with the docker setup to add a clean way to customise the SQLite version, so I added a GitHub Actions job in django/django instead.

comment:17 by Mariusz Felisiak, 5 days ago

Triage Stage: AcceptedReady for checkin

comment:18 by Sarah Boyce <42296566+sarahboyce@…>, 2 days ago

In 47eafd1:

Refs #35842 -- Fixed test_lookups_special_chars_double_quotes on SQLite 3.47+.

comment:19 by Sarah Boyce, 31 hours ago

Triage Stage: Ready for checkinAccepted

comment:20 by Sage Abdullah, 5 hours ago

May I ask what the expected next step is for this ticket?

As demonstrated by my other PR, this issue is fixed on SQLite 3.47+, and I see no way for Django to fix this on our side for SQLite < 3.47. FWIW, I still think it's worthwhile to have a CI job that tests against the latest SQLite release. Happy to open a forum discussion if that's necessary to get the PR merged.

Is the intention to wait for now and close the ticket when we bump the minimum version of SQLite to 3.47+?

comment:21 by Mariusz Felisiak, 3 hours ago

Resolution: fixed
Status: assignedclosed

As for me, it's fixed.

Note: See TracTickets for help on using tickets.
Back to Top