Opened 3 years ago

Closed 3 years ago

#33749 closed Bug (invalid)

Queries take exponential time when filtering in a loop

Reported by: Enhaloed Owned by: nobody
Component: Database layer (models, ORM) Version: 3.2
Severity: Normal Keywords:
Cc: Enhaloed Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

I have a loop that builds up various queries, then applies them with a filter.

I was having some performance issues and started looking into it more closely and it seemed to be exponential time instead of something relatively linear.

I made the query execute on each loop by printing the results and used django-debug-toolbar to time each query.

Here is the important loop:

text = 'Choose one, if there are ten or more fallen'
words = text.split(' ')
for word in words:
    word_query = Q()
    for search_field in search_fields:
        word_query |= Q(**{search_field + '__icontains': word})

        if search_field == 'name':
            # Also check the alternative name
            word_query |= Q(**{'name_without_punctuation__icontains': word})

    if exactness_option == CONS.TEXT_CONTAINS_ALL:
        cards = cards.filter(word_query)
        print(word_query)
        print(cards)

The word_query object scome out as expected, essentially checking if each word exists in any of the relationships, one word at a time

(OR: ('name__icontains', 'Choose'), ('name_without_punctuation__icontains', 'Choose'), ('name_without_punctuation__icontains', 'Choose'), ('ability_texts__text__icontains', 'Choose'), ('races__name__icontains', 'Choose'))
(OR: ('name__icontains', 'one,'), ('name_without_punctuation__icontains', 'one,'), ('name_without_punctuation__icontains', 'one,'), ('ability_texts__text__icontains', 'one,'), ('races__name__icontains', 'one,'))
(OR: ('name__icontains', 'if'), ('name_without_punctuation__icontains', 'if'), ('name_without_punctuation__icontains', 'if'), ('ability_texts__text__icontains', 'if'), ('races__name__icontains', 'if'))
(OR: ('name__icontains', 'there'), ('name_without_punctuation__icontains', 'there'), ('name_without_punctuation__icontains', 'there'), ('ability_texts__text__icontains', 'there'), ('races__name__icontains', 'there'))
(OR: ('name__icontains', 'are'), ('name_without_punctuation__icontains', 'are'), ('name_without_punctuation__icontains', 'are'), ('ability_texts__text__icontains', 'are'), ('races__name__icontains', 'are'))
(OR: ('name__icontains', 'ten'), ('name_without_punctuation__icontains', 'ten'), ('name_without_punctuation__icontains', 'ten'), ('ability_texts__text__icontains', 'ten'), ('races__name__icontains', 'ten'))
(OR: ('name__icontains', 'or'), ('name_without_punctuation__icontains', 'or'), ('name_without_punctuation__icontains', 'or'), ('ability_texts__text__icontains', 'or'), ('races__name__icontains', 'or'))
(OR: ('name__icontains', 'more'), ('name_without_punctuation__icontains', 'more'), ('name_without_punctuation__icontains', 'more'), ('ability_texts__text__icontains', 'more'), ('races__name__icontains', 'more'))
(OR: ('name__icontains', 'fallen'), ('name_without_punctuation__icontains', 'fallen'), ('name_without_punctuation__icontains', 'fallen'), ('ability_texts__text__icontains', 'fallen'), ('races__name__icontains', 'fallen'))

Using django-debug-toolbar to check the times for these queries however go as follows:

0.47s
0.83s
1.04s
2.43s
4.74s
26.27s
60.18s
124.46s
175.75s

As far as I can see these values should be somewhat similar since each query is checking the same 4 columns for a string of comparable length. Instead, they are going somewhat exponential.

This is using PostgreSQL 13.4 with the engine django.db.backends.postgresql, Django v3.2.6

Let me know if there's any other details I can provide to help with this.

Change History (3)

comment:1 by Enhaloed, 3 years ago

Cc: Enhaloed added

comment:2 by Enhaloed, 3 years ago

Type: UncategorizedBug

comment:3 by Carlton Gibson, 3 years ago

Resolution: invalid
Status: newclosed

Hi. There's not enough to demonstrate an issue in Django here.

TBH it looks more like a usage issue. (If cards is a QuerySet that never gets reset you're asking it do ever more work in the loop. Either way, you need to be using explain to see the work your DB is actually performing.) Please see TicketClosingReasons/UseSupportChannels for appropriate places to get help.

If you can identify a concrete issue with a reproduce in Django, then here is the appropriate place for that.

Good luck.

Note: See TracTickets for help on using tickets.
Back to Top