Opened 6 days ago

Last modified 2 days ago

#36051 assigned New feature

Declare arity on aggregate functions

Reported by: Jacob Walls Owned by: Jacob Walls
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords:
Cc: Simon Charette Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Calling a database function with an incorrect number of arguments raises TypeError if the function declares its arity. We could improve the DX around CompositePrimaryKey by leveraging this feature, instead of just advising to be careful.

The docs sort of imply some kind of error is raised ("# ERROR"), but this test passes:

  • tests/composite_pk/test_aggregate.py

    diff --git a/tests/composite_pk/test_aggregate.py b/tests/composite_pk/test_aggregate.py
    index b5474c5218..750618a277 100644
    a b  
    11from django.db import NotSupportedError
    2 from django.db.models import Count, Q
     2from django.db.models import Count, Max, Q
    33from django.test import TestCase
    44
    55from .models import Comment, Tenant, User
    class CompositePKAggregateTests(TestCase):  
    3232        cls.comment_5 = Comment.objects.create(id=5, user=cls.user_3, text="barbaz")
    3333        cls.comment_6 = Comment.objects.create(id=6, user=cls.user_3, text="baz")
    3434
     35    def test_max_pk(self):
     36        # Could raise TypeError instead.
     37        self.assertEqual(Comment.objects.aggregate(Max("pk")), {'pk__max': 1})
     38
    3539    def test_users_annotated_with_comments_id_count(self):
    3640        user_1, user_2, user_3 = User.objects.annotate(Count("comments__id")).order_by(
    3741            "pk"

Relatedly, most of the built-in aggregate functions should set arity=1, which I'll include in a separate commit, but we can also discuss whether that needs to go through a deprecation path.

Change History (6)

comment:1 by Jacob Walls, 6 days ago

Has patch: set

comment:2 by Tim Graham, 6 days ago

Triage Stage: UnreviewedAccepted

comment:3 by Simon Charette, 3 days ago

Cc: Simon Charette added

I understand the desire to streamline the composite primary key reference experience but this approach just seems plain wrong. Function arity is tied to the number of argument passed to a function and Max(F("pk")) is a single argument where the argument itself is composite.

In other words, there is a difference between calls of the form foo(1, 2) and foo((1,2)) (or foo(bar) where bar: int | tuple[int, int] since F is a reference) and trying to automatically turn one into the other seems like it's going to cause more harm than good.

To me #36042 and and this ticket should be addressed by introducing a BaseExpression.allows_composite_expressions: bool = False that we set to True on Count and TupleLookupMixin and make BaseExpression.resolve_expression raise a ValueError when it's set to False and any of its source expression is an instance of ColPairs.

Don't get me wrong I think we should define arity for Aggregate subclasses to guard against improper calls but I don't think that making Func.resolve_expression unpack composite expressions to trigger arity check failures is something we should do for the aforementioned reasons. In essence this is the same problem as model_instance.pk now returning a tuple and requiring some special care but at the field resolving level.

Last edited 3 days ago by Simon Charette (previous) (diff)

comment:4 by Jacob Walls, 3 days ago

Patch needs improvement: set
Summary: Count CompositePrimaryKey field targets toward function arity checkDeclare arity on aggregate functions

Super helpful. I'll update #36042 to do as you describe here, and then we can refocus this ticket on adding arity checks to aggregates, since it sounds like we have consensus on that (and that doesn't have to land in 5.2).

In other words, there is a difference between calls of the form foo(1, 2) and foo((1,2))

but I don't think that making Func.resolve_expression unpack composite expressions to trigger arity check failures is something we should do

Certainly there is a difference between those two calls, but on the main branch the unpacking was already happening, i.e. Max(F("pk")) already compiles to Max(col1, col2), not Max((col1, col2)), which is why I thought adding arity checking wasn't a big deal here. But if the solve for #36042 is going to be to lock down resolve_expression() for all but Count and TupleLookupMixin, then F("pk") won't even be allowed here.

Thanks for letting me kick the tires (this is helping me get more comfortable with the ORM internals!)

comment:5 by Simon Charette, 3 days ago

Super helpful. I'll update #36042 to do as you describe here, and then we can refocus this ticket on adding arity checks to aggregates, since it sounds like we have consensus on that (and that doesn't have to land in 5.2).

That makes sense!

Certainly there is a difference between those two calls, but on the main branch the unpacking was already happening, i.e. Max(F("pk")) already compiles to Max(col1, col2), not Max((col1, col2))

Max("pk") (we can drop the explicit F) compiles to MAX(col1, col2) at the SQL level due to the quirky way ColPairs.as_sql is implemented to support SELECT and GROUP BY references on backends that don't have native ROW / tuple support but it resolves to Max(ColPairs("col1", "col2")) which is a distinction that makes me believe we should not use arity and unpacking for this purpose.

Thanks for letting me kick the tires (this is helping me get more comfortable with the ORM internals!)

Happy my feedback is useful, thanks for your time spent making the composite primary key usage experience better. The current awkwardness around ColPairs.as_sql and Tuple will vanish once we have proper support for composite fields but there might be a way to simplify things before that by getting rid of the the hacky way SELECT and GROUP BY deal with ColPairs compilation. I'll see if I can get something working tomorrow.

comment:6 by Jacob Walls, 2 days ago

Patch needs improvement: unset
Note: See TracTickets for help on using tickets.
Back to Top