Opened 5 years ago
Closed 5 years ago
#31228 closed Bug (fixed)
DISTINCT with GROUP_CONCAT() and multiple expressions raises NotSupportedError on SQLite.
Reported by: | Andy Terra | Owned by: | Hongtao Ma |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | sqlite |
Cc: | Simon Charette | Triage Stage: | Ready for checkin |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Contrary to what is suggested in lines 60-64 of django.db.backends.sqlite3.operations.py
, SQLite does support DISTINCT on aggregate functions.
One such example is GROUP_CONCAT, which is quite similar to PostgreSQL's STRING_AGG.
I can't find any canonical links which provide a useful explanation of GROUP_CONCAT, but this should be good enough: https://www.w3resource.com/sqlite/aggregate-functions-and-grouping-group_concat.php
I came across this issue when trying to create my own GroupConcat
function subclassing Aggregate
(similar to the StringAgg implementation from postgres) and noticed that skipping the check in django.db.backends.sqlite3.operations.py
would allow my queries to run as advertised.
My code for GroupConcat
is:
from django.db.models import Value from django.db.models.aggregates import Aggregate class GroupConcat(Aggregate): function = 'GROUP_CONCAT' template = '%(function)s(%(distinct)s %(expressions)s)' allow_distinct = True def __init__(self, expression, delimiter=None, **extra): if delimiter is not None: self.allow_distinct = False delimiter_expr = Value(str(delimiter)) super().__init__(expression, delimiter_expr, **extra) else: super().__init__(expression, **extra) def as_sqlite(self, compiler, connection, **extra_context): return super().as_sql( compiler, connection, function=self.function, template=self.template, **extra_context )
For the record, as the code above suggests, a separate issue is that GROUP_CONCAT only allows DISTINCT when a delimiter isn't specified.
After some discussion on #django, an argument was raised in favor of changing the message to say "Django doesn't support...", but I would imagine that skipping the check entirely would simply result in an OperationalError
for malformed queries while still allowing users to extend the ORM as needed.
Change History (10)
comment:1 by , 5 years ago
Cc: | added |
---|---|
Summary: | SQLite does support DISTINCT on aggregate functions → DISTINCT with GROUP_CONCAT() and multiple expressions raises NotSupportedError on SQLite. |
Triage Stage: | Unreviewed → Accepted |
Version: | 3.0 → master |
comment:2 by , 5 years ago
Cc: | added |
---|
comment:3 by , 5 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
comment:4 by , 5 years ago
There seems to be problem for concatenation style aggregation in general, here is what i experimented on postgresql:
In [18]: Blog.objects.annotate(StringAgg('title', delimiter='x')) --------------------------------------------------------------------------- TypeError Traceback (most recent call last) ~/django/django/db/models/query.py in annotate(self, *args, **kwargs) 1071 try: -> 1072 if arg.default_alias in kwargs: 1073 raise ValueError("The named annotation '%s' conflicts with the " ~/django/django/db/models/aggregates.py in default_alias(self) 64 return '%s__%s' % (expressions[0].name, self.name.lower()) ---> 65 raise TypeError("Complex expressions require an alias") 66 TypeError: Complex expressions require an alias During handling of the above exception, another exception occurred: TypeError Traceback (most recent call last) <ipython-input-18-b924dae7712a> in <module> ----> 1 Blog.objects.annotate(StringAgg('title', delimiter='x')) ~/django/django/db/models/manager.py in manager_method(self, *args, **kwargs) 80 def create_method(name, method): 81 def manager_method(self, *args, **kwargs): ---> 82 return getattr(self.get_queryset(), name)(*args, **kwargs) 83 manager_method.__name__ = method.__name__ 84 manager_method.__doc__ = method.__doc__ ~/django/django/db/models/query.py in annotate(self, *args, **kwargs) 1075 % arg.default_alias) 1076 except TypeError: -> 1077 raise TypeError("Complex annotations require an alias") 1078 annotations[arg.default_alias] = arg 1079 annotations.update(kwargs) TypeError: Complex annotations require an alias In [19]:
follow-up: 6 comment:5 by , 5 years ago
An error is quite descriptive IMO, "TypeError: Complex expressions require an alias".
comment:6 by , 5 years ago
Replying to felixxm:
An error is quite descriptive IMO, "TypeError: Complex expressions require an alias".
Not quite sure about what kind of 'expressions' deserve being called complex. But I assume StringAgg('title', delimiter='x') shouldn't be one of them. The problem here is that the delimiter part of StringAgg is included in the expressions, which probably makes it complex.
From the syntax defined by Postgresql, the expression is explicitly separated from the delimiter:
string_agg(expression, delimiter)
string_agg
comment:7 by , 5 years ago
Yes, because delimiter
is treated as an expression in StringAgg
. Please don't add unrelated comments/questions to tickets.
comment:9 by , 5 years ago
Needs tests: | unset |
---|---|
Triage Stage: | Accepted → Ready for checkin |
Thanks for this ticket. I agree that it shouldn't raise an error in some cases, but removing this check is not an option, IMO.
Regression in bc05547cd8c1dd511c6b6a6c873a1bc63417b111.