Opened 12 years ago

Closed 5 years ago

#20393 closed Cleanup/optimization (wontfix)

django.db.models.query.QuerySet.__repr__ should not have side-effects

Reported by: justin@… Owned by: nobody
Component: Database layer (models, ORM) Version: dev
Severity: Normal Keywords: QuerySet, repr, side-effect
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

In trying to track down some timeouts and strange queries in our apps we saw some strange queries being run by django. We were querying for a single record with a FOR UPDATE, getting an error, then django would run the same query 3 times but with a limit of 21 and then fetch all of the records, each time.

Eventually I tracked this down to our error-handling middleware calling repr() on a QuerySet, which then queried the database. __repr__ is supposed to be a string-representation of an object in its current state and should not gave side-effects like making database queries, especially in the case where the query uses FOR UPDATE or similar logic.

I suggest changing django.db.models.query.QuerySet.__repr__ to only touch the cached records and not use the local iterator, which can and will query the DB, potentially causing issues like this.

Change History (11)

comment:1 by anonymous, 12 years ago

Has patch: set

Patch in pull request

comment:2 by Claude Paroz, 12 years ago

Which pull request?

comment:3 by Luke Plant, 12 years ago

The reason for the current behaviour is to help interactive coding and learning of Django - if you type an expression that returns a QuerySet, you get something helpful. See the tutorial for example:

https://docs.djangoproject.com/en/1.5/intro/tutorial01/

This would have to be considered when weighing up this ticket - a lot of Django tutorials (our own and others) would need significant modification, and the habits of people used to the current behaviour.

comment:4 by anonymous, 12 years ago

pull request: https://github.com/django/django/pull/1055

I realize that it may inconvenience simple usage, but functions like str and repr really should not be causing side effects in general.

In particular, debuggers and error reporting can cause strange side-effects and errors. In our application, we've implemented a mysql connection pool in the db backend. This works fine until an error happens and the error-catching middleware we're using (sentry/raven) starts accessing these query objects after the connection has been returned to the pool.

comment:5 by Luke Plant, 12 years ago

Resolution: wontfix
Status: newclosed

Well, you can argue that semantically these methods don't have side effects. They do 'SELECT', not 'UPDATE'. They do happen to use IO, and can therefore fail in all kinds of ways. "SELECT FOR UPDATE" is an unfortunate edge case.

Looking at these docs:

http://docs.python.org/2/reference/datamodel.html#object.__repr__

... you have a conflict between the goal of being information rich and with always being useful for debugging. In general, automatically seeing the results is information rich and is useful in debugging, but sometimes it causes further problems.

I have thought about it, and with this conflict of goals, I think we are going to have to err on the side of the current behaviour. It is possible to work around by not calling repr on QuerySets in your middleware, and there are too many people who will be upset (or have problems with their own debugging facilities) by changing this.

comment:6 by Luke Plant, 12 years ago

I should have added - feel free to persuade us otherwise on the DevelopersMailingList

comment:7 by Patryk Zawadzki, 12 years ago

What if we changed __repr__ to be safe (only print query params) and allowed a slice with Ellipsis to return a list of first N elements, possibly executing the query if not evaluated earlier?

Example:

>>> x = Foo.objects.filter(age__gte=5)
<QuerySet(Foo, …)>
>>> x[...]
[<Foo>, <Foo>, <Foo>, <Foo>, '11 more...']

comment:8 by anonymous, 12 years ago

That sounds like a great change to me.

comment:9 by Matt Johnson, 5 years ago

I opened a discussion on the dev mailing list about this:

https://groups.google.com/forum/#!topic/django-developers/WzRZ0IfBipc

comment:10 by Matt Johnson, 5 years ago

Resolution: wontfix
Status: closednew

Re-opening based on the mailing list discussions.

comment:11 by Mariusz Felisiak, 5 years ago

Resolution: wontfix
Status: newclosed
Type: UncategorizedCleanup/optimization
Version: 1.4master

2 days of discussion doesn't look sufficient for reopening ticket that was closed 6 years ago and is also backward incompatible. We have basically only 2 responses, let's wait for a strong consensus.

Note: See TracTickets for help on using tickets.
Back to Top