Opened 3 years ago

Closed 3 years ago

Last modified 3 years ago

#33769 closed Bug (duplicate)

F() expressions - OperationalError

Reported by: Pablo Owned by: nobody
Component: Documentation Version: 3.2
Severity: Normal Keywords:
Cc: Mariusz Felisiak Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

AppName: shifts

class WorkShift(RegistroMixin):
    season = models.ForeignKey(Season, on_delete=models.CASCADE)
    type = models.ForeignKey('WorkShiftType', null=True, on_delete=models.PROTECT)
    horario = models.ForeignKey('HorarioTurno', null=True, blank=True,  on_delete=models.PROTECT)
    start_date = models.DateTimeField()
    end_date = models.DateTimeField()

class Season(RegistroMixin):
    chief_tipes = models.ManyToManyField('WorkShiftType')

WorkShift.objects.filter(
                          season__in=seasons,
                          type__in=F('season__chief_tipes'),
                          end_date__gte=today
                      ).filter(
            Q(start_date__lte=today) | Q(start_date__lte=today + datetime.timedelta(hours=12))
                      )

After upgrade Django 2.2 -> 3.2 this Queryset returns:

OperationalError at /
no such table: shifts_season_chief_tipes.workshifttype_id

The ORM is not correctly constructing the Where statement. In other code queries, it does work correctly after migration.

Change History (8)

comment:1 by Simon Charette, 3 years ago

I'm pretty sure this was never meant to be supported or that it possibly generated wrong results but I'll let others chimes in. In the mean type you can get an equivalent query with

WorkShift.objects.filter(
    season__in=seasons,
    type__in=Season.chief_tipes.through.objects.filter(
        season=OuterRef('season'),
    ).values('workshifttype'),
    end_date__gte=today,
).filter(
    Q(start_date__lte=today) | Q(start_date__lte=today + datetime.timedelta(hours=12))
)

comment:2 by Carlton Gibson, 3 years ago

Cc: Mariusz Felisiak added
Component: UncategorizedDocumentation
Triage Stage: UnreviewedAccepted

Given the docs at Filters can reference fields on the model, I might expect this to work:

You can also use the double underscore notation to span relationships in an F() object. An F() object with a double underscore will introduce
any joins needed to access the related object. For example, to retrieve all the entries where the author’s name is the same as the blog name, we could issue the query:

>>> Entry.objects.filter(authors__name=F('blog__name'))

Nothing there to say it works with __exact but not __in — Questions: Why not? What are the limitations here? And maybe, Why? 🤔 If it's not supported OK, but can we document that better? Can we add Simon's Use a subquery example as a note maybe?

I'll provisionally accept on that basis.

comment:3 by Simon Charette, 3 years ago

I'd be curious to see what kind of queries it generated on Django 2.2.

I would assume that a LEFT JOIN is generated for shifts_season_chief_tipes and that the where clause has a shifts_workshift.type_id IN (shifts_season_chief_tipes.workshifttype_id) which is basically equivalent to shifts_workshift.type_id = shifts_season_chief_tipes.workshifttype_id.

Given we've taken a stance that we don't want to explicitly support __exact=Queryset (single-valued lhs with multi-valued rhs) I thought we'd want to do the same with __in=F('m2m') (multi-valued lhs with single-valued rhs).

comment:4 by Carlton Gibson, 3 years ago

I thought we'd want to do the same with __in=F('m2m') (multi-valued lhs with single-valued rhs).

Yes, that makes sense — __in should take a collection/list.

Once phrased that way, I wonder if it really does need clarifying.
Likely an added __in example would be sufficient guidance.


comment:5 by Carlton Gibson, 3 years ago

Resolution: invalid
Status: newclosed

OK, the in lookup docs already have everything I'd expect here: both that an iterable is required, and the nested subquery example.

I'll close on that basis.

I'd be curious to see what kind of queries it generated on Django 2.2.

Pablo if you wanted to post the difference in the generated SQL, and maybe bisect to find the change, it may be that we could say more. Thanks.

comment:6 by Pablo, 3 years ago

I'm sorry, but I don't understand the exact way to Bisecting a regression.

I leave the SQL statement with the original models to avoid making any changes (I have deleted some non-transcendent fields for this case) that would allow you to see a possible explanation. It's funny but the two SQL statements are the same, so it could be that the problem is how it resolves the name of the table, since it concatenates the name of its primary key?

Seasson -> Temporada
WorkShift -> Turno
seasonchief_tipes -> jefatura_tipos_turno_publicos
AppName -> turnos

Original error

OperationalError at /

no such table: turnos_temporada_jefatura_tipos_turno_publicos.tipoturno_id

The models.

class Temporada(models.Model):
    jefatura_tipos_turno_publicos = models.ManyToManyField('turnos.TipoTurno',
                                                           related_name='turnos_temporadas_jefatura',
                                                           verbose_name=_('Jefatura - Tipos de turno públicos'),
                                                           blank=True)

class Turno(models.Model):
    temporada = models.ForeignKey(Temporada, on_delete=models.CASCADE)
    tipo = models.ForeignKey('TipoTurno', null=True, on_delete=models.PROTECT)
    fecha_inicio = models.DateTimeField(_('Inicio turno'))
    fecha_fin = models.DateTimeField(_('Fin turno'))

class TipoTurno(models.Model):
    nombre = models.CharField(_('Nombre'), max_length=30)

The Queryset

        turnos_jefatura = Turno.objects.filter(temporada__in=temporadas,
                                               tipo__in=F('temporada__jefatura_tipos_turno_publicos'),
                                               fecha_fin__gte=hoy
                                               ).filter(
            Q(fecha_inicio__lte=hoy) | Q(fecha_inicio__lte=hoy + datetime.timedelta(hours=12))
        )

Django 2.2 SQL

SELECT "turnos_turno"."id",
       "turnos_turno"."temporada_id",
       "turnos_turno"."tipo_id",
       "turnos_turno"."fecha_inicio",
       "turnos_turno"."fecha_fin",
FROM "turnos_turno"
         INNER JOIN "turnos_temporada"
                    ON ("turnos_turno"."temporada_id" = "turnos_temporada"."id")
         INNER JOIN "turnos_temporada_jefatura_tipos_turno_publicos"
                    ON ("turnos_temporada"."id" = "turnos_temporada_jefatura_tipos_turno_publicos"."temporada_id")
WHERE ("turnos_turno"."fecha_fin" >= '''2022-06-06 17:18:46.405382''' AND
       "turnos_turno"."temporada_id" IN (SELECT V0."id"
                                         FROM "turnos_temporada" V0
                                                  LEFT OUTER JOIN "turnos_temporada_jefatura_parques_afectacion" V1
                                                                  ON (V0."id" = V1."temporada_id")
                                         WHERE (V0."activo" = 'True' AND V0."es_jefatura" = 'True' AND
                                                V0."fecha_fin" >= '''2022-06-06''' AND
                                                V0."fecha_inicio" <= '''2022-06-06''' AND
                                                (V1."parque_id" IN (SELECT DISTINCT U0."id"
                                                                    FROM "parques_parque" U0
                                                                    WHERE U0."id" IN
                                                                          (...)) OR
                                                 V1."parque_id" IS NULL))) AND
       "turnos_turno"."tipo_id" IN ("turnos_temporada_jefatura_tipos_turno_publicos"."tipoturno_id") AND
       ("turnos_turno"."fecha_inicio" <= '''2022-06-06 17:18:46.405382''' OR
        "turnos_turno"."fecha_inicio" <= '''2022-06-07 05:18:46.405382'''))

Django 3.2 SQL

SELECT "turnos_turno"."id",
       "turnos_turno"."temporada_id",
       "turnos_turno"."tipo_id",
       "turnos_turno"."fecha_inicio",
       "turnos_turno"."fecha_fin",
FROM "turnos_turno"
         INNER JOIN "turnos_temporada"
                    ON ("turnos_turno"."temporada_id" = "turnos_temporada"."id")
         INNER JOIN "turnos_temporada_jefatura_tipos_turno_publicos"
                    ON ("turnos_temporada"."id" = "turnos_temporada_jefatura_tipos_turno_publicos"."temporada_id")
WHERE ("turnos_turno"."fecha_fin" >= '''2022-06-07 09:58:33.644086''' AND
       "turnos_turno"."temporada_id" IN (SELECT V0."id"
                                         FROM "turnos_temporada" V0
                                                  LEFT OUTER JOIN "turnos_temporada_jefatura_parques_afectacion" V1
                                                                  ON (V0."id" = V1."temporada_id")
                                         WHERE (V0."activo" AND V0."es_jefatura" AND
                                                V0."fecha_fin" >= '''2022-06-07''' AND
                                                V0."fecha_inicio" <= '''2022-06-07''' AND
                                                (V1."parque_id" IN (SELECT DISTINCT U0."id"
                                                                    FROM "parques_parque" U0
                                                                    WHERE U0."id" IN
                                                                          (...)) OR
                                                 V1."parque_id" IS NULL))) AND
       "turnos_turno"."tipo_id" IN "turnos_temporada_jefatura_tipos_turno_publicos"."tipoturno_id" AND (
       "turnos_turno"."fecha_inicio" <= '''2022-06-07 09:58:33.644086''' OR
       "turnos_turno"."fecha_inicio" <= '''2022-06-07 21:58:33.644086'''))

comment:7 by Simon Charette, 3 years ago

I didn't run proper bisecting but by comparing the queries the only semantic difference between both is wrapping of the right-hand-side of the IN operator in parentheses.

So I suspect this was broken by 3a505c70e7b228bf1212c067a8f38271ca86ce09 in 3.0 and fixed back in 4.0 by 170b006ce82b0ecf26dc088f832538b747ca0115 for reasons unrelated to the rationale mentioned here.

All that to say that we don't explicitly support/test for __in=single_valued_expression and this was inadvertently broken in 3.0 and fixed later on.

comment:8 by Mariusz Felisiak, 3 years ago

Resolution: invalidduplicate
Triage Stage: AcceptedUnreviewed

Duplicate of #31135.

Note: See TracTickets for help on using tickets.
Back to Top