Opened 7 years ago
Closed 3 years ago
#29482 closed Cleanup/optimization (wontfix)
simplify KeyTransform to always use the 'nested_operator'
Reported by: | David De Sousa | Owned by: | nobody |
---|---|---|---|
Component: | contrib.postgres | Version: | 2.0 |
Severity: | Normal | Keywords: | |
Cc: | James Howe | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Currently, the KeyTransform
and KeyTextTransform
classes in the JSONB Field lookups are making a distinction in the way the query is being generated based on the depth of the key inside the JSON, using a different operator depending on if the key is in the root of the json or not.
This is an issue specially if you create indexes based on jsonb traversals, for some reason if you create an index with the path operators (#> or #>>) and then query with the "regular" operators (-> or ->>) the indexes are not always used.
I've tested removing the use of the operator
and basically just leaving the return in https://github.com/django/django/blob/2.0.6/django/contrib/postgres/fields/jsonb.py#L105 as the only return in the function and it works for me, I'd be happy to create a PR but I don't know if that distinction is there for a reason.
Change History (9)
follow-up: 3 comment:1 by , 7 years ago
follow-up: 4 comment:2 by , 7 years ago
Does it need something to do with array elements vs object keys? Or does the postgres nested operator already handle that?
If you had a top-level (and non-nested) JSON array in a field, you'd probably want to stick to a simpler index/operator, I think.
I haven't looked into this, it's just thoughts.
comment:3 by , 7 years ago
Replying to Simon Charette:
From my understanding the
->
and->>
operators are used instead of their#
variant when the key lookup depth is 1 for readability purposes.
Is there a reason you can't create your non-nested attribute lookups using the
->
and->>
operators instead? Are you suggesting we favor the#
syntax because a functional index ondata#>'{a,b}'
would be used when doing adata#>'{a}'
lookup?
No, that won't happen AFAIK
I'm asking because changing it at this point is likely to break any previously created functional index created using these operators for the same reason you are asking them to be changed here. I assume this is something that would eventually be fixed in PostgreSQL.
It would be awesome if PostgreSQL dealt with this, but I asked in the #postgres IRC and was told there's no way for the query planner to do this, although I got confirmation the #
and -
operators are equivalent.
No, the issue is that django uses both operators based on if it's a root key or a nested one, I agree a change here would potentially break previously created indexes using the ->
or -->
operators, so developers should take this into consideration when building indexes, it should at least be documented, I found out the hard way.
comment:4 by , 7 years ago
Replying to Matthew Schinckel:
Does it need something to do with array elements vs object keys? Or does the postgres nested operator already handle that?
#>>
is a path operator, so it will work also with array indexes, #>> '{2}' would give you the 3rd element of the root array.
If you had a top-level (and non-nested) JSON array in a field, you'd probably want to stick to a simpler index/operator, I think.
As I said before I could agree with this, although I'd vouch for simplicity in the code, using only the path operator makes the code a lot simpler IMHO
comment:5 by , 7 years ago
The suggested change doesn't solve the problem, it just creates a new one.
The problem is that an index will only be used if the JSON operators are the same in both the index definition and the query.
The four path traversal operators (->
, ->>
, #>
, #>>
) are all incompatible.
Standardizing on a single operator for all queries would make things easier to deal with, but people may still have indices they are unable to use.
Allowing the operator to be controlled may be a quick workaround.
Ideally, the Index
classes should support JSON paths (and partial indexes), so the definition can be standardized to the same operators.
comment:6 by , 7 years ago
Cc: | added |
---|
follow-up: 9 comment:7 by , 7 years ago
I agree, the ideal solution would be to support JSON paths and partial indexes in the Index
class, just be aware that -> and ->> return different types, so instead of using the four operators, django could use two, either -> and ->> or #> and #>>, depending on the user's desire.
Although if support is added to the Index
class, this discussion is irrelevant since the index can be created with the same logic the KeyTransform
classes use, using the -> and ->> operators when dealing with root-level keys and the #> and #>> operators when traversing, this is what I'm doing in a management command that handles the index creation for my use case after I found out about this behavior.
comment:8 by , 7 years ago
Component: | Uncategorized → contrib.postgres |
---|---|
Triage Stage: | Unreviewed → Accepted |
Type: | Uncategorized → Cleanup/optimization |
I'm not sure what the way forward is, but it sounds like Django can do something, whether it's a behavior change or some documentation.
comment:9 by , 3 years ago
Resolution: | → wontfix |
---|---|
Status: | new → closed |
Triage Stage: | Accepted → Unreviewed |
Replying to David De Sousa:
Although if support is added to the
Index
class, this discussion is irrelevant since the index can be created with the same logic theKeyTransform
classes use, using the -> and ->> operators when dealing with root-level keys and the #> and #>> operators when traversing, this is what I'm doing in a management command that handles the index creation for my use case after I found out about this behavior.
Index()
supports creating functional and partial indexes based on JSONField
key transforms (see tests), so as far as I'm aware this can be closed.
Closing as "wontfix", because we will not change the current operators.
From my understanding the
->
and->>
operators are used instead of their#
variant when the key lookup depth is 1 for readability purposes.Is there a reason you can't create your non-nested attribute lookups using the
->
and->>
operators instead? Are you suggesting we favor the#
syntax because a functional index ondata#>'{a,b}'
would be used when doing adata#>'{a}'
lookup?I'm asking because changing it at this point is likely to break any previously created functional index created using these operators for the same reason you are asking them to be changed here. I assume this is something that would eventually be fixed in PostgreSQL.