Opened 2 years ago
Closed 2 years ago
#33749 closed Bug (invalid)
Queries take exponential time when filtering in a loop
Reported by: | Enhaloed | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 3.2 |
Severity: | Normal | Keywords: | |
Cc: | Enhaloed | Triage Stage: | Unreviewed |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
I have a loop that builds up various queries, then applies them with a filter.
I was having some performance issues and started looking into it more closely and it seemed to be exponential time instead of something relatively linear.
I made the query execute on each loop by printing the results and used django-debug-toolbar to time each query.
Here is the important loop:
text = 'Choose one, if there are ten or more fallen' words = text.split(' ') for word in words: word_query = Q() for search_field in search_fields: word_query |= Q(**{search_field + '__icontains': word}) if search_field == 'name': # Also check the alternative name word_query |= Q(**{'name_without_punctuation__icontains': word}) if exactness_option == CONS.TEXT_CONTAINS_ALL: cards = cards.filter(word_query) print(word_query) print(cards)
The word_query object scome out as expected, essentially checking if each word exists in any of the relationships, one word at a time
(OR: ('name__icontains', 'Choose'), ('name_without_punctuation__icontains', 'Choose'), ('name_without_punctuation__icontains', 'Choose'), ('ability_texts__text__icontains', 'Choose'), ('races__name__icontains', 'Choose')) (OR: ('name__icontains', 'one,'), ('name_without_punctuation__icontains', 'one,'), ('name_without_punctuation__icontains', 'one,'), ('ability_texts__text__icontains', 'one,'), ('races__name__icontains', 'one,')) (OR: ('name__icontains', 'if'), ('name_without_punctuation__icontains', 'if'), ('name_without_punctuation__icontains', 'if'), ('ability_texts__text__icontains', 'if'), ('races__name__icontains', 'if')) (OR: ('name__icontains', 'there'), ('name_without_punctuation__icontains', 'there'), ('name_without_punctuation__icontains', 'there'), ('ability_texts__text__icontains', 'there'), ('races__name__icontains', 'there')) (OR: ('name__icontains', 'are'), ('name_without_punctuation__icontains', 'are'), ('name_without_punctuation__icontains', 'are'), ('ability_texts__text__icontains', 'are'), ('races__name__icontains', 'are')) (OR: ('name__icontains', 'ten'), ('name_without_punctuation__icontains', 'ten'), ('name_without_punctuation__icontains', 'ten'), ('ability_texts__text__icontains', 'ten'), ('races__name__icontains', 'ten')) (OR: ('name__icontains', 'or'), ('name_without_punctuation__icontains', 'or'), ('name_without_punctuation__icontains', 'or'), ('ability_texts__text__icontains', 'or'), ('races__name__icontains', 'or')) (OR: ('name__icontains', 'more'), ('name_without_punctuation__icontains', 'more'), ('name_without_punctuation__icontains', 'more'), ('ability_texts__text__icontains', 'more'), ('races__name__icontains', 'more')) (OR: ('name__icontains', 'fallen'), ('name_without_punctuation__icontains', 'fallen'), ('name_without_punctuation__icontains', 'fallen'), ('ability_texts__text__icontains', 'fallen'), ('races__name__icontains', 'fallen'))
Using django-debug-toolbar to check the times for these queries however go as follows:
0.47s 0.83s 1.04s 2.43s 4.74s 26.27s 60.18s 124.46s 175.75s
As far as I can see these values should be somewhat similar since each query is checking the same 4 columns for a string of comparable length. Instead, they are going somewhat exponential.
This is using PostgreSQL 13.4 with the engine django.db.backends.postgresql, Django v3.2.6
Let me know if there's any other details I can provide to help with this.
Change History (3)
comment:1 by , 2 years ago
Cc: | added |
---|
comment:2 by , 2 years ago
Type: | Uncategorized → Bug |
---|
comment:3 by , 2 years ago
Resolution: | → invalid |
---|---|
Status: | new → closed |
Hi. There's not enough to demonstrate an issue in Django here.
TBH it looks more like a usage issue. (If
cards
is a QuerySet that never gets reset you're asking it do ever more work in the loop. Either way, you need to be using explain to see the work your DB is actually performing.) Please see TicketClosingReasons/UseSupportChannels for appropriate places to get help.If you can identify a concrete issue with a reproduce in Django, then here is the appropriate place for that.
Good luck.