#11017 closed (fixed)
Oracle LIKEC query doesn't use index
Reported by: | Jani Tiainen | Owned by: | nobody |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Keywords: | oracle | |
Cc: | Matt Boersma | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
It seems (not confirmed from Oracle officially) LIKEC query doesn't use index at all, you get always full table scan that is costly when number of rows starts to grow.
Like queries uses index as expected with same parameters.
Change was done when fixing #5985
Attachments (1)
Change History (18)
comment:1 by , 16 years ago
milestone: | → 1.2 |
---|---|
Triage Stage: | Unreviewed → Accepted |
comment:2 by , 16 years ago
Can you please supply some documentation or test case that supports this? Is this the case for NVARCHAR2 as well as VARCHAR2?
follow-up: 5 comment:3 by , 16 years ago
Scratch that, I confirmed it myself. I can't get Oracle 10g XE to formulate a plan using an index for LIKEC queries at all.
However, I still need to see some evidence that changing it back to LIKE (and finding another fix for #5985) would be an improvement. The three types of lookups that use LIKEC are contains
, startswith
, and endswith
. contains
and endswith
lookups always start with '%'
, so they will never be indexed according to the Oracle docs. startswith
lookups take the form 'prefix%'
and might theoretically use the index, but in my testing I was unable to find an actual example of that form that did.
comment:4 by , 16 years ago
Cc: | added |
---|
follow-up: 6 comment:5 by , 16 years ago
Replying to ikelly:
Scratch that, I confirmed it myself. I can't get Oracle 10g XE to formulate a plan using an index for LIKEC queries at all.
However, I still need to see some evidence that changing it back to LIKE (and finding another fix for #5985) would be an improvement. The three types of lookups that use LIKEC are
contains
,startswith
, andendswith
.contains
andendswith
lookups always start with'%'
, so they will never be indexed according to the Oracle docs.startswith
lookups take the form'prefix%'
and might theoretically use the index, but in my testing I was unable to find an actual example of that form that did.
Both lookups (i)startswith
and (i)endswith
takes an advantage of index. In case of endswith
index must be created with keyword "REVERSE". contains
lookups can be only indexed with Oracle Text.
And it doesn't matter is is VARCHAR2 or NVARCHAR2 - no effect.
Small sample how table + indices should be created:
CREATE TABLE TEST_DATA( ID NUMBER PRIMARY KEY, TXT VARCHAR(200) ); # Forward index (startswith) CREATE INDEX TEST_DATA_TXT TEST_DATA(TXT); # Reverse index (endswith) CREATE INDEX TEST_DATA_TXT TEST_DATA(TXT) REVERSE; # Function based uppercase forward index # (if istartswith uses UPPER otherwise change to lower accordingly) CREATE INDEX TEST_DATA_TXT TEST_DATA(UPPER(TXT)); # Function based uppercase reverse index. See previous CREATE INDEX TEST_DATA_TXT TEST_DATA(UPPER(TXT)) REVERSE;
Now running different queries (except contains
should hit indices).
Few things that might affect it:
- Number of rows is too low.
- Tables are not analyzed.
- Too large columns (2000 chars/bytes I think) doesn't get indexed.
If this doesn't get working, I could rise TAR to Oracle Metalink to see is there some issues regading usage of LIKEC (it doesn't seem to be much used anywhere nor documented very well - there might be reason for that)
follow-up: 7 comment:6 by , 16 years ago
Replying to jtiai:
In case of
endswith
index must be created with keyword "REVERSE".
From what I understand (I'm at home right now and don't have an Oracle database handy to actually test on), this doesn't work. Reverse-key indexes are used to provide an optimization for monotonically increasing columns with lots of concurrent inserts, at the cost of not being able to use the index for range scans -- not to allow reverse LIKE queries. See http://richardfoote.wordpress.com/2008/01/16/introduction-to-reverse-key-indexes-part-ii-another-myth-bites-the-dust/
An actual solution (as mentioned in the same article) is to create an index of the REVERSE
function. This would require a refactoring of the way the lookups are implemented in Django backends, which is something that should be done anyway.
The suggestion of creating an uppercase index in addition to the regular index is also a useful one, for a separate ticket.
Back on the LIKEC
topic, there is no question that a query of the form WHERE TXT LIKE 'foo%'
will take advantage of an index under the right circumstances. The main obstacle as I see it is #5985. We weren't able to reproduce that bug ourselves, and so it would be challenging to confirm the effectiveness of an alternative fix. That's why I would like to see an example confirming that LIKE
will use an index, but LIKEC
does not, before considering this change.
comment:7 by , 16 years ago
Replying to ikelly:
Replying to jtiai:
In case of
endswith
index must be created with keyword "REVERSE".
From what I understand (I'm at home right now and don't have an Oracle database handy to actually test on), this doesn't work. Reverse-key indexes are used to provide an optimization for monotonically increasing columns with lots of concurrent inserts, at the cost of not being able to use the index for range scans -- not to allow reverse LIKE queries. See http://richardfoote.wordpress.com/2008/01/16/introduction-to-reverse-key-indexes-part-ii-another-myth-bites-the-dust/
Straight from Oracle documentation:
"Creating a reverse key index, compared to a standard index, reverses the bytes of each column indexed (except the rowid) while keeping the column order."
An actual solution (as mentioned in the same article) is to create an index of the
REVERSE
function. This would require a refactoring of the way the lookups are implemented in Django backends, which is something that should be done anyway.
The suggestion of creating an uppercase index in addition to the regular index is also a useful one, for a separate ticket.
Back on the
LIKEC
topic, there is no question that a query of the formWHERE TXT LIKE 'foo%'
will take advantage of an index under the right circumstances. The main obstacle as I see it is #5985. We weren't able to reproduce that bug ourselves, and so it would be challenging to confirm the effectiveness of an alternative fix. That's why I would like to see an example confirming thatLIKE
will use an index, butLIKEC
does not, before considering this change.
All of the above are optimizations that could be done. I'm using legacy application so in my particular case indices are already there. I'm hitting this problem because I've "small" address table - around 21 million rows. There you clearly see when query hits index and when they do not - and it applies to performance as well.
I try to produce sample set of test data and script that would demonstrate index usage. Note that performance problem is easily hidden with powerful hardware.
by , 16 years ago
Attachment: | index_testing.sql added |
---|
Script to test index usage in case of LIKE and LIKEC
follow-ups: 9 10 comment:8 by , 16 years ago
Thanks for the test case. This confirms that the problem happens at least with VARCHAR2 columns. However, when I change the column type to NVARCHAR2, I get full table scans for both queries. I may have something misconfigured -- it seems bizarre that this sort of query would ignore the index just because the column is NVARCHAR2.
To test the reverse-key index suggestion, I also changed the original script by adding a reverse-key index and changing the search strings to '%Alley'. The result: Oracle wouldn't let me create both a normal and reverse-key index on the same column. So I deleted the normal index and tried again, and I got full table scans for both queries.
comment:9 by , 16 years ago
Replying to ikelly:
Thanks for the test case. This confirms that the problem happens at least with VARCHAR2 columns. However, when I change the column type to NVARCHAR2, I get full table scans for both queries. I may have something misconfigured -- it seems bizarre that this sort of query would ignore the index just because the column is NVARCHAR2.
I recall that there was some "catch" again here. Orace is so delicate what comes to indices.
To test the reverse-key index suggestion, I also changed the original script by adding a reverse-key index and changing the search strings to '%Alley'. The result: Oracle wouldn't let me create both a normal and reverse-key index on the same column. So I deleted the normal index and tried again, and I got full table scans for both queries.
There was some "catch" to make reverses working - it's not main concern right now anyway.
comment:10 by , 16 years ago
Replying to ikelly:
Thanks for the test case. This confirms that the problem happens at least with VARCHAR2 columns. However, when I change the column type to NVARCHAR2, I get full table scans for both queries. I may have something misconfigured -- it seems bizarre that this sort of query would ignore the index just because the column is NVARCHAR2.
Full table scan was chosen by oracle optimizer because there was too few rows. Change that most inner loop for c in 1..100
to something like 10000 and you start getting index usage with NVARCHAR2 too. Still it doesn't fix LIKEC problem...
Of course I always can fallback to RAW SQL here.
comment:11 by , 15 years ago
LIKEC runs terribly slow in django for my case. The same sql runs in Toad about 6-7 sec, but in Django about 30 - 32 sec.
comment:12 by , 15 years ago
It seems that there is some bug in Oracle 9.2.0.5 and previous versions that didn't correctly handled escape. It's been fixed in later patchsets.
Because original report in #5985 didn't state which version Oracle was used it is hard to tell was it Oracle bug or something else.
Known workaround for Oracle versions 9.2.0.5 and earlier is to use:
LIKE TRANSLATE(%s USING NCHAR_CS) ESCAPE TRANSLATE('\\' USING NCHAR_CS)
comment:13 by , 15 years ago
Quick note: we reverted changes made at [7412] and are now testing any side-effects that might have caused.
comment:14 by , 15 years ago
Status update: We've been using reverted version for a good while and no anomalies have been occuring. Indices gets used as expected. My suggestion is just to revert changes made in [7412].
comment:15 by , 15 years ago
Thanks for tracking down that bug. I'm fairly sure that Frank was using Oracle 9 when he reported #5985, so this is quite likely the same bug. I'm reluctant to just revert the change since that would cause a regression of #5985, but I'm definitely willing to consider the workaround. I also wonder if it might be fixed a bit more efficiently by:
LIKE %s ESCAPE N'\\'
Unfortunately, I don't have a 9.2.0.5 database available to test on. Do you?
comment:16 by , 15 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
This is too late for 1.1.