Opened 5 years ago

Closed 5 years ago

#31228 closed Bug (fixed)

DISTINCT with GROUP_CONCAT() and multiple expressions raises NotSupportedError on SQLite.

Reported by: Andy Terra Owned by: Hongtao Ma
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: sqlite
Cc: Simon Charette Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Contrary to what is suggested in lines 60-64 of django.db.backends.sqlite3.operations.py, SQLite does support DISTINCT on aggregate functions.

One such example is GROUP_CONCAT, which is quite similar to PostgreSQL's STRING_AGG.

I can't find any canonical links which provide a useful explanation of GROUP_CONCAT, but this should be good enough: https://www.w3resource.com/sqlite/aggregate-functions-and-grouping-group_concat.php

I came across this issue when trying to create my own GroupConcat function subclassing Aggregate (similar to the StringAgg implementation from postgres) and noticed that skipping the check in django.db.backends.sqlite3.operations.py would allow my queries to run as advertised.

My code for GroupConcat is:

from django.db.models import Value
from django.db.models.aggregates import Aggregate

class GroupConcat(Aggregate):
    function = 'GROUP_CONCAT'
    template = '%(function)s(%(distinct)s %(expressions)s)'
    allow_distinct = True

    def __init__(self, expression, delimiter=None, **extra):
        if delimiter is not None:
            self.allow_distinct = False
            delimiter_expr = Value(str(delimiter))
            super().__init__(expression, delimiter_expr, **extra)        
        else:
            super().__init__(expression, **extra)        

    def as_sqlite(self, compiler, connection, **extra_context):
        return super().as_sql(
            compiler, connection,
            function=self.function,
            template=self.template,
            **extra_context
        )

For the record, as the code above suggests, a separate issue is that GROUP_CONCAT only allows DISTINCT when a delimiter isn't specified.

After some discussion on #django, an argument was raised in favor of changing the message to say "Django doesn't support...", but I would imagine that skipping the check entirely would simply result in an OperationalError for malformed queries while still allowing users to extend the ORM as needed.

Change History (10)

comment:1 by Mariusz Felisiak, 5 years ago

Cc: Simon Charette added
Summary: SQLite does support DISTINCT on aggregate functionsDISTINCT with GROUP_CONCAT() and multiple expressions raises NotSupportedError on SQLite.
Triage Stage: UnreviewedAccepted
Version: 3.0master

Thanks for this ticket. I agree that it shouldn't raise an error in some cases, but removing this check is not an option, IMO.

Regression in bc05547cd8c1dd511c6b6a6c873a1bc63417b111.

comment:2 by Hongtao Ma, 5 years ago

Cc: Hongtao Ma added

comment:3 by Hongtao Ma, 5 years ago

Owner: changed from nobody to Hongtao Ma
Status: newassigned

comment:4 by Hongtao Ma, 5 years ago

There seems to be problem for concatenation style aggregation in general, here is what i experimented on postgresql:

In [18]: Blog.objects.annotate(StringAgg('title', delimiter='x'))                                                                 
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/django/django/db/models/query.py in annotate(self, *args, **kwargs)
   1071             try:
-> 1072                 if arg.default_alias in kwargs:
   1073                     raise ValueError("The named annotation '%s' conflicts with the "

~/django/django/db/models/aggregates.py in default_alias(self)
     64             return '%s__%s' % (expressions[0].name, self.name.lower())
---> 65         raise TypeError("Complex expressions require an alias")
     66 

TypeError: Complex expressions require an alias

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
<ipython-input-18-b924dae7712a> in <module>
----> 1 Blog.objects.annotate(StringAgg('title', delimiter='x'))

~/django/django/db/models/manager.py in manager_method(self, *args, **kwargs)
     80         def create_method(name, method):
     81             def manager_method(self, *args, **kwargs):
---> 82                 return getattr(self.get_queryset(), name)(*args, **kwargs)
     83             manager_method.__name__ = method.__name__
     84             manager_method.__doc__ = method.__doc__

~/django/django/db/models/query.py in annotate(self, *args, **kwargs)
   1075                                      % arg.default_alias)
   1076             except TypeError:
-> 1077                 raise TypeError("Complex annotations require an alias")
   1078             annotations[arg.default_alias] = arg
   1079         annotations.update(kwargs)

TypeError: Complex annotations require an alias

In [19]:                                                                                                                          

comment:5 by Mariusz Felisiak, 5 years ago

An error is quite descriptive IMO, "TypeError: Complex expressions require an alias".

in reply to:  5 comment:6 by Hongtao Ma, 5 years ago

Replying to felixxm:

An error is quite descriptive IMO, "TypeError: Complex expressions require an alias".

Not quite sure about what kind of 'expressions' deserve being called complex. But I assume StringAgg('title', delimiter='x') shouldn't be one of them. The problem here is that the delimiter part of StringAgg is included in the expressions, which probably makes it complex.

From the syntax defined by Postgresql, the expression is explicitly separated from the delimiter:
string_agg(expression, delimiter) string_agg

comment:7 by Mariusz Felisiak, 5 years ago

Yes, because delimiter is treated as an expression in StringAgg. Please don't add unrelated comments/questions to tickets.

comment:8 by Hongtao Ma, 5 years ago

Cc: Hongtao Ma removed
Has patch: set
Needs tests: set

comment:9 by Mariusz Felisiak, 5 years ago

Needs tests: unset
Triage Stage: AcceptedReady for checkin

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 5 years ago

Resolution: fixed
Status: assignedclosed

In cbb6531:

Fixed #31228 -- Reallowed aggregates to be used with multiple expressions and no DISTINCT on SQLite.

Regression in bc05547cd8c1dd511c6b6a6c873a1bc63417b111.

Thanks Andy Terra for the report.

Note: See TracTickets for help on using tickets.
Back to Top