#9136 closed (fixed)
Oracle backend: slicing is using row_number() instead of rownum
Reported by: | Owned by: | nobody | |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | 1.0 |
Severity: | Keywords: | ||
Cc: | Triage Stage: | Unreviewed | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Doing Entity.objects.all()[:10] is generating a query like:
SELECT * FROM (SELECT (ROW_NUMBER() OVER (ORDER BY "ENTITY"."ID")) AS "_RN", "ENTITY"."ID", "ENTITY"."NAME" FROM "ENTITY") WHERE "_RN" > 0 AND "_RN" <= 10
This is suboptimal as a full table scan is forced (at least in 10g).
Using rownum instead of row_number() is faster (it can use indexes), and doesn't require an order by clause when not required. The query would look like:
SELECT "ENTITY"."ID", "ENTITY"."NAME" FROM "ENTITY" WHERE ROWNUM > 0 AND ROWNUM <= 10 ORDER BY "ENTITY"."ID"
It won't make a noticeable difference with tables with a couple of rows; but when you are reaching the million rows, you go from minutes to milliseconds.
Attachments (2)
Change History (10)
by , 16 years ago
Attachment: | rownum.patch added |
---|
comment:1 by , 16 years ago
Has patch: | set |
---|
comment:3 by , 16 years ago
I tried to do some limited benchmarking on this on an Oracle XE installation and got conflicting results. The model I used was this:
class Test(models.Model): class Meta: ordering = ['id']
With 1,000,000 rows in the table, I tried taking slices of 100 rows from the beginning, middle, and end of the table. Using rownum, the time was consistently about 870 ms/query. Using row_number(), the time was about 20 ms/query from the beginning of the table, about the same as rownum from the middle, and about twice as long as rownum from the end. Removing the default ordering from the model slightly sped up rownum, but slightly slowed down row_number().
It's difficult to judge, but I guess that the most common scenario for slicing large table would be taking small slices from near the beginning of an ordered query. From that point of view, I'm in favor of keeping row_number() at this point, but I could be persuaded to change my mind with fresh evidence.
Note that I recently changed the overall structure of the query in [9221], which could have an impact on the results.
comment:4 by , 16 years ago
The change to the query in [9221] fixes my main gripes about slicing with oracle.
My own test on a set of table with some real data do not show any significant difference between rownum and row_number() for the first few rows, however as the number of rows increase row_number() gets significantly slower.
If you look at the result of an explain plan you will easily understand why: row_number() is an analytic function and the database needs to apply the function first; rownum on the oder hand is built in the query and costs nothing. Depending on data and indexes the cost difference between rownum and row_number(), as calculated by the optimizer, can reach several order of magnitude. (At least for my release of Oracle: 10.2 Enterprise.)
(As a side note: 870ms per query for the table you mentioned is really bad. Are you sure you did execute the tests with warm and/or cold cache for both rownum and row_number() ?)
(Second side note: looks like the spam filter is a tad overzealous.)
comment:5 by , 16 years ago
Yeah, that database is running on an old Pentium 3 desktop, and it's probably horrendously misconfigured to boot -- I'm not a DBA. I just tried running the same test on one of our real development databases, and in that scenario both methods run about 10k queries/second regardless of position, which probably just means that the network is now the bottleneck and I need to do a larger test.
comment:6 by , 16 years ago
Scratch that, I messed up the second test and the queries weren't actually running (I knew 10k / second was too good to be true). I am seeing row_number() running significantly slower than rownum now.
comment:7 by , 16 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
patch replacing row_number() with row