Opened 2 months ago
Last modified 85 minutes ago
#35842 assigned Bug
JSONField has_key, has_keys, has_any_keys lookups do not properly handle quotes on Oracle and SQLite
Reported by: | Simon Charette | Owned by: | Sage Abdullah |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | oracle sqlite json key quote |
Cc: | Sage Abdullah | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The following test reproduces issues on both Oracle and SQLite
-
tests/model_fields/test_jsonfield.py
diff --git a/tests/model_fields/test_jsonfield.py b/tests/model_fields/test_jsonfield.py index ff42b1a14c..b1090b30ac 100644
a b def test_has_key_number(self): 636 636 [obj], 637 637 ) 638 638 639 def test_has_key_special_chars(self): 640 value = { 641 'double"': "", 642 "single'": "", 643 "dollar$": "", 644 "mixed$\"'.": "", 645 } 646 obj = NullableJSONModel.objects.create( 647 value=value 648 ) 649 obj.refresh_from_db() 650 self.assertEqual(obj.value, value) 651 for key in value: 652 with self.subTest(key=key): 653 self.assertSequenceEqual( 654 NullableJSONModel.objects.filter(value__has_key=key), 655 [obj], 656 ) 657 639 658 @skipUnlessDBFeature("supports_json_field_contains") 640 659 def test_contains(self): 641 660 tests = [
In the case of Oracle the issue arise because it doesn't supporting binding variables in JSON_EXISTS
(original discussion) so while the json.dumps
of the key is believed to protect from SQL injections it can still result in crashes if the key contains a single quote character. Using the PASSING
clause could possibly allow us to bypass this limitation or using a different escaping strategy could possibly be used to adjust the Oracle implementation.
---
In the case of SQLite the problem is with the double-quote character "
because escapes generated by json.dumps
are not properly interpreted by SQLite.
In other words "foo\"bar"
is not properly interpreted as 'foo"bar
and while SQLite allows you not to quote keys (e.g. JSON_TYPE(%s, '$.foo\"bar') IS NOT NULL
) the solution is not viable for keys that contain both a double-quote and a symbol such as .
as exemplified by the mixed key in the provide test.
Change History (20)
comment:1 by , 2 months ago
comment:2 by , 2 months ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:3 by , 4 weeks ago
Cc: | added |
---|
comment:4 by , 4 weeks ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:5 by , 3 weeks ago
Has patch: | set |
---|---|
Patch needs improvement: | set |
Version: | 5.0 → dev |
PR, with more explanation in the PR description. Marking as needs improvement, I still need to write some more tests for the KeyTransform
as I'm pretty sure it has the same issue.
Thanks Simon for such detailed description and suggestions!
comment:6 by , 3 weeks ago
Patch needs improvement: | unset |
---|
I can confirm that KeyTransform
is affected by the same issue, and I've added the tests in the PR.
I think #32213 is a different issue, though. That one is about the deserialization of the values, whereas the issue in this ticket is about the escaping of function parameters.
comment:8 by , 2 weeks ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:10 by , 2 weeks ago
Has patch: | unset |
---|---|
Triage Stage: | Ready for checkin → Accepted |
comment:11 by , 2 weeks ago
Apparently it's fixed in SQLite 3.37.0 (released 2024-10-21)!
It's not in the release notes, but you can see the changeset. Looks like it was backported to the 3.36 branch too, but given that 3.37.0 and even 3.37.1 have been released, it looks like there won't be any more 3.36.x releases (at least based on the chronology of past releases).
I can confirm this by downloading a precompiled SQLite from the download page and running the following query:
select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."foo\".bar"');
I have yet to run the Django test, as updating the SQLite version is a bit more complicated – I'll do that this weekend. Anyhow, if it works, I think we should remove the expected test failure if the SQLite version is >= 3.37.0.
Not sure how to update it in CI, though. It's tricky because you need to compile Python with the newer SQLite installed on the system so that the resulting libsqlite3.so.0
inside Python's lib
directory is compiled with the newer SQLite. Or, you can compile it separately and then replace the lib file in the Python installation (or use the LD_LIBRARY_PATH
environment variable). Alternatively, we can just wait until the Python that's distributed in the docker (?) image used in our Jenkins setup gets updated.
comment:12 by , 2 weeks ago
Apparently it's fixed in SQLite 3.37.0 (released 2024-10-21)!
The test_has_key_special_chars
test attached to this ticket still fails against against SQLite 3.37.0 and on 3.46.0 on macOS (x86) and on Linux. I built it from source on both platform and used set env DYLD_INSERT_LIBRARIES
on macOS and LD_PRELOAD
on Linux.
I can confirm this by downloading a precompiled SQLite from the download page and running the following query:
I get different results on my side
SQLite version 3.37.0 2021-12-09 01:34:53 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."foo\".bar"'); sqlite> select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."x"'); y
But 3.47.1 seems to have fixed it?
SQLite version 3.47.1 2024-11-25 12:07:48 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."foo\".bar"'); ba"z
To make sure that the built library is used you access sqlite3.sqlite_version
in the Python terminal from the environment you used to run your tests
>>> import sqlite3 >>> sqlite3.sqlite_version '3.46.0'
follow-up: 14 comment:13 by , 2 weeks ago
Re-reading your message and the release date (2024-10-21) I assume you meant 3.47.0 and not 3.37.0.
I also realized that my tests on macOS were bogus since SQLite is statically linked on this platform.
I can confirm the issue is solved on 3.47.0+
SQLite version 3.47.0 2024-10-21 16:30:22 Enter ".help" for usage hints. Connected to a transient in-memory database. Use ".open FILENAME" to reopen on a persistent database. sqlite> select json_extract(json('{"foo\".bar": "ba\"z", "x": "y"}'), '$."foo\".bar"'); ba"z
I guess this means we could conditionally include entries in django_test_expected_failures
based on sqlite3.sqlite_version
?
comment:14 by , 13 days ago
Replying to Simon Charette:
Re-reading your message and the release date (2024-10-21) I assume you meant 3.47.0 and not 3.37.0.
I guess this means we could conditionally include entries in
django_test_expected_failures
based onsqlite3.sqlite_version
?
Oops, yep! I meant 3.47.0, sorry. Thanks for figuring it out 😄
I think so, we already have some conditionals in django_test_skips, so we should be OK with changing django_test_skips
into a cached_property
and check Database.sqlite_version_info
.
Although it's going to be tricky to prove it in CI without adding/updating one of the jobs to use a newer SQLite version, as you experienced yourself.
I think it would still be useful to have at least one configurable SQLite version though, so we can test with the latest SQLite independently of the installed Python. That would be separate to this ticket, but I'd be happy to make the PR if we want to have that.
comment:15 by , 13 days ago
I think it would still be useful to have at least one configurable SQLite version though, so we can test with the latest SQLite independently of the installed Python. That would be separate to this ticket, but I'd be happy to make the PR if we want to have that.
I'm not sure how complex it would be on CI but having it on django-docker-box
would be certainly valuable and since it's Linux based LD_PRELOAD
should work just fine.
comment:16 by , 10 days ago
Has patch: | set |
---|
I did manage to hack it on django-docker-box
, but I'm not familiar enough with the docker setup to add a clean way to customise the SQLite version, so I added a GitHub Actions job in django/django instead.
comment:17 by , 5 days ago
Triage Stage: | Accepted → Ready for checkin |
---|
comment:19 by , 28 hours ago
Triage Stage: | Ready for checkin → Accepted |
---|
comment:20 by , 85 minutes ago
May I ask what the expected next step is for this ticket?
As demonstrated by my other PR, this issue is fixed on SQLite 3.47+, and I see no way for Django to fix this on our side for SQLite < 3.47. FWIW, I still think it's worthwhile to have a CI job that tests against the latest SQLite release. Happy to open a forum discussion if that's necessary to get the PR merged.
Is the intention to wait for now and close the ticket when we bump the minimum version of SQLite to 3.47+?
It's related to the #32213.