Opened 17 years ago
Closed 10 years ago
#7074 closed Bug (worksforme)
MySQL error/warning when 'gt' field lookup with a datetime field and fulltext search.
Reported by: | anonymous | Owned by: | ANUBHAV JOSHI |
---|---|---|---|
Component: | Database layer (models, ORM) | Version: | dev |
Severity: | Normal | Keywords: | qsrf-cleanup mysql, fulltext search, gt |
Cc: | cortland@…, anubhav9042@… | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description (last modified by )
This query produces the following traceback:
Keyword.objects.filter(keyword__search=term, keyworddata__updated__gt=datetime.datetime.now(), keyworddata__source="1")
term = 'test' In [28]: Keyword.objects.filter(keyword__search=term, keyworddata__updated__gt=datetime.datetime.now(), keyworddata__source="1" ).select_related() Out[29]: --------------------------------------------------------------------------- Warning Traceback (most recent call last) /home/lybp/dev/lybp/<ipython console> in <module>() /usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/Prompts.py in __call__(self, arg) 533 534 # and now call a possibly user-defined print mechanism --> 535 manipulated_val = self.display(arg) 536 537 # user display hooks can change the variable to be stored in /usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/Prompts.py in _display(self, arg) 559 return IPython.generics.result_display(arg) 560 except TryNext: --> 561 return self.shell.hooks.result_display(arg) 562 563 # Assign the default display method: /usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/hooks.py in __call__(self, *args, **kw) 132 #print "prio",prio,"cmd",cmd #dbg 133 try: --> 134 ret = cmd(*args, **kw) 135 return ret 136 except ipapi.TryNext, exc: /usr/local/lib/python2.5/site-packages/ipython-0.8.2-py2.5.egg/IPython/hooks.py in result_display(self, arg) 160 161 if self.rc.pprint: --> 162 out = pformat(arg) 163 if '\n' in out: 164 # So that multi-line strings line up with the left column of /usr/local/lib/python2.5/pprint.py in pformat(self, object) 109 def pformat(self, object): 110 sio = _StringIO() --> 111 self._format(object, sio, 0, 0, {}, 0) 112 return sio.getvalue() 113 /usr/local/lib/python2.5/pprint.py in _format(self, object, stream, indent, allowance, context, level) 127 self._readable = False 128 return --> 129 rep = self._repr(object, context, level - 1) 130 typ = _type(object) 131 sepLines = _len(rep) > (self._width - 1 - indent - allowance) /usr/local/lib/python2.5/pprint.py in _repr(self, object, context, level) 193 def _repr(self, object, context, level): 194 repr, readable, recursive = self.format(object, context.copy(), --> 195 self._depth, level) 196 if not readable: 197 self._readable = False /usr/local/lib/python2.5/pprint.py in format(self, object, context, maxlevels, level) 205 and whether the object represents a recursive construct. 206 """ --> 207 return _safe_repr(object, context, maxlevels, level) 208 209 /usr/local/lib/python2.5/pprint.py in _safe_repr(object, context, maxlevels, level) 290 return format % _commajoin(components), readable, recursive 291 --> 292 rep = repr(object) 293 return rep, (rep and not rep.startswith('<')), False 294 /usr/local/lib/python2.5/site-packages/django/db/models/query.py in __repr__(self) 106 107 def __repr__(self): --> 108 return repr(self._get_data()) 109 110 def __len__(self): /usr/local/lib/python2.5/site-packages/django/db/models/query.py in _get_data(self) 484 def _get_data(self): 485 if self._result_cache is None: --> 486 self._result_cache = list(self.iterator()) 487 return self._result_cache 488 /usr/local/lib/python2.5/site-packages/django/db/models/query.py in iterator(self) 187 188 cursor = connection.cursor() --> 189 cursor.execute("SELECT " + (self._distinct and "DISTINCT " or "") + ",".join(select) + sql, params) 190 191 fill_cache = self._select_related /usr/local/lib/python2.5/site-packages/django/db/backends/util.py in execute(self, sql, params) 16 start = time() 17 try: ---> 18 return self.cursor.execute(sql, params) 19 finally: 20 stop = time() /home/lybp/dev/lybp/build/bdist.linux-i686/egg/MySQLdb/cursors.py in execute(self, query, args) /home/lybp/dev/lybp/build/bdist.linux-i686/egg/MySQLdb/cursors.py in _warning_check(self) /usr/local/lib/python2.5/warnings.py in warn(message, category, stacklevel) 60 registry = globals.setdefault("__warningregistry__", {}) 61 warn_explicit(message, category, filename, lineno, module, registry, ---> 62 globals) 63 64 def warn_explicit(message, category, filename, lineno, /usr/local/lib/python2.5/warnings.py in warn_explicit(message, category, filename, lineno, module, registry, module_globals) 100 101 if action == "error": --> 102 raise message 103 # Other actions 104 if action == "once": Warning: Truncated incorrect DOUBLE value: '2008-04-23 14:39:36.133203' In [30]: Keyword.objects.filter(keyword__search=term, keyworddata__updated=datetime.datetime.now(), keyworddata__source="1" ).select_related() Out[31]: [] In [12]: Keyword.objects.filter(keyword=term, keyworddata__updated__gt=datetime.datetime.now(), keyworddata__source="1" ).select_related() Out[13]: []
If I change keyword__search=
to keyword=
, I don't get the error. Also if I use keyworddata__updated=
instead of keyworddata__updated__gt
I don't get the error.
KeywordData
is a model with a foreign key pointing to keyword - so I guess it's a reverse lookup.
I'm using MySQL 5.0.45.
Change History (23)
comment:1 by , 17 years ago
comment:3 by , 16 years ago
Keywords: | qsrf-cleanup added |
---|
comment:4 by , 16 years ago
milestone: | → 1.0 |
---|
comment:5 by , 16 years ago
This seems to be trying to report two bugs in the one report, which is a no-no. Let's have one ticket about the __gt
comparison and one about the full text searching, each with a small model (as small as possible) demonstrating the problem.
Note that Erik's comment isn't the source of the problem, since connection.queries
shows the query prior to it being passed to the Python DB wrapper, which is where any quoting takes place.
This needs confirmation and a simpler example and I'm in a triage mode at the moment, so I'm not going to do that right now. If somebody else has time and inclination to reproduce this, it would be appreciated.
comment:6 by , 16 years ago
I think it's only one (MySQL) bug, some interaction with fulltext search and greater than (or gte) comparison involving dates. I can recreate in just plain mysql, but if I change the gte to lte or the fulltext search to a like search, the warning goes away.
mysql> select count(Clue) from Clues natural join Puzzles where match (Clue) against ('Marceau') and `Date` >= '2008-06-26'; +-------------+ | count(Clue) | +-------------+ | 1 | +-------------+ 1 row in set, 1 warning (0.00 sec) mysql> show warnings; +---------+------+------------------------------------------------+ | Level | Code | Message | +---------+------+------------------------------------------------+ | Warning | 1292 | Truncated incorrect DOUBLE value: '2008-06-26' | +---------+------+------------------------------------------------+ 1 row in set (0.00 sec) mysql> select count(Clue) from Clues natural join Puzzles where match (Clue) against ('Marceau') and `Date` <= '2008-06-26'; +-------------+ | count(Clue) | +-------------+ | 23 | +-------------+ 1 row in set (0.01 sec) mysql> select count(Clue) from Clues natural join Puzzles where Clue like '%Marceau%' and `Date` >= '2008-06-26'; +-------------+ | count(Clue) | +-------------+ | 1 | +-------------+ 1 row in set (0.00 sec)
That's just bizarre, so I searched for MySQL bugs involving this message and found this:
http://bugs.mysql.com/bug.php?id=34374
which looks like a pretty good match. They've verified it's a bug but have no fix yet. They list a workaround to use CAST(date_val AS DATE)...somehow I'm thinking that wouldn't be trivial for the ORM to start doing?
comment:7 by , 16 years ago
Owner: | changed from | to
---|---|
Triage Stage: | Unreviewed → Accepted |
Impressive debugging (and searching), again, Karen. Thanks. But... good grief! :-(
Since we've also got integers needing to be cast to text for PostgreSQL 8.3 in another ticket, it looks like it's time to crank out some general code to explicitly cast to the right type in a bunch of cases. Maybe it's not too horrible, since we really only convert Python values to SQL lookup values in one place. I'll look into it.
comment:8 by , 16 years ago
Owner: | removed |
---|
There's no point this staying on my list at the moment.. I have a bunch of other stuff I'm doing first. So reassigning back to the pool in case anybody else wants to pick it up. I'm hoping it's possible to do something similar to [8242] in this case.
comment:9 by , 16 years ago
I took a look at this. The cast needs to be added on the right-hand-side, not the left as is done by the use of lookup_cast
in [8242]. There doesn't seem to be any general backend hook for adding casting sql to the right-hand-side.
What there is is datetime_cast_sql
(currently only implemented by the Oracle backend) that is called from source:django/trunk/django/db/models/sql/where.py in make_atom
in the case where value_annot is datetime.datetime
:
if value_annot is datetime.datetime: cast_sql = connection.ops.datetime_cast_sql() else: cast_sql = '%s'
Implementing this for mysql as:
def datetime_cast_sql(self): return 'CAST(%s AS DATETIME)'
will actually fix the problem exactly as reported, but not the identical problem that exists for just plain DATE fields, nor the case where the values are passed in as strings not date[time] objects. Keeping the above datetime_cast_sql
, plus defining a new general right-hand-side casting hook and implementing it for mysql like so:
def rhs_cast_sql(self, db_type): if db_type in ['date', 'datetime']: return 'CAST(%%s AS %s)' % db_type return '%s'
plus changing the code in where.py make_atom
to call it:
if value_annot is datetime.datetime: cast_sql = connection.ops.datetime_cast_sql() else cast_sql = connection.ops.rhs_cast_sql(db_type)
works to fix the reported problem for both DATE and DATETIME fields (and covers the case where values are passed as strings, not date[time] objects).
But that puts the mysql fix in two different places and that if value_annot...else
is a bit mysterious looking. I'd think it would be better to have a single general hook that covers both what the existing Oracle datetime_cast_sql
function does and what's needed for mysql for this bug. Unfortunately I have zero knowledge of Oracle so I'm not sure what the value of db_type is for which the current datetime_cast_sql
function is being called. (And value_annot is apparently not useful for datetime.date values -- it's set to 'True' for them? So value_annot can't be used in the general hook case.)
So, in summary I think a fix for this would be to implement a general right-hand-side casting hook for the database backends but doing that correctly requires some Oracle knowledge I don't have. I'll investigate a little more if anyone cares to provides clues for me and thinks this approach is worthwhile.
[Also, I don't know how to write a test for this. The MySQL bug only manifests itself when you've got a full-text index involved in the WHERE clause. So it would have to be a test that ran only when the backend was MySQL with the MyISAM storage engine, that created a full-text index (via custom sql?) and used it in a lookup in combination with a date/datetime gt/gte lookup. Not sure it's worth figuring out how to do all that in a test? Actually I'm not sure this problem is worth fixing in Django since it's really a MySQL bug, but as of today the MySQL bug is still open with no fix in sight other than the casting workaround, so it's still there for Django users to hit.]
comment:10 by , 16 years ago
In my patch for [8242], I initially looked into creating a cast function for values, in addition to the field cast function there currently. A couple notes:
- The
db_type
values are defined per Field type in the backend's creation module (DATE
andTIMESTAMP
for Oracle). - I tried to implement a generic value casting method, whilst doing away with
datetime_cast_sql
. The problem is then you need at least the value's type to be passed through (asvalue_annot
does now) to the cast method. - It seems like
value_annot
should be split to avoid it's current dual function of knowing when a value evaluates to True/False, and specifying the value's type for datetimes. However, that tuple that gets passed around is also used by GIS, so it's not completely internal.
So a generic rhs_cast_sql
function for Oracle may look something like this:
def rhs_cast_sql(self, db_type, value_type): if db_type == 'TIMESTAMP' and value_type is datetime.datetime: return "TO_TIMESTAMP(%s, 'YYYY-MM-DD HH24:MI:SS.FF')" return "%s"
comment:11 by , 16 years ago
milestone: | 1.0 → post-1.0 |
---|
This needs to wait on a proper rhs-cast solution, which won't be in 1.0. So, for anyone reading this far down using 1.0: sorry -- try not to do this kind of lookup, and we'll fix this bug in the next release!
comment:12 by , 16 years ago
Cc: | added |
---|
I'm also expreiencing this issue when using djangosearch r21 with a filter on a resulting search; the filter has a gte on a datetime.now.
comment:13 by , 16 years ago
You can get around the bug by using .extra(). Instead of
task_set.filter(field__gt=time)
use
task_set.extra(where=['field > DATE(%s)'], params=[time])
.
comment:15 by , 14 years ago
I have just also experienced this problem, I am using software with below versions:
- Django 1.2.1
- MySQL 5.1.41
- python-mysqldb 1.2.2
Full traceback below:
In [20]: MessageSearch.objects.using('search').all().count() Out[20]: 826 In [21]: MessageSearch.objects.using('search').filter(search_full__search='newsy').filter(created__gt='2010-10-05').count() ERROR: An unexpected error occurred while tokenizing input The following traceback may be corrupted or invalid The error message is: ('EOF in multi-line statement', (215, 0)) ERROR: An unexpected error occurred while tokenizing input The following traceback may be corrupted or invalid The error message is: ('EOF in multi-line statement', (21, 0)) ERROR: An unexpected error occurred while tokenizing input The following traceback may be corrupted or invalid The error message is: ('EOF in multi-line statement', (70, 0)) --------------------------------------------------------------------------- Warning Traceback (most recent call last) /home/bluszcz/projekty/property/rynek_pierwotny/<ipython console> in <module>() /home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/query.pyc in count(self) 324 return len(self._result_cache) 325 --> 326 return self.query.get_count(using=self.db) 327 328 def get(self, *args, **kwargs): /home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/sql/query.pyc in get_count(self, using) 392 393 obj.add_count_column() --> 394 number = obj.get_aggregation(using=using)[None] 395 396 # Apply offset and limit constraints manually, since using LIMIT/OFFSET /home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/sql/query.pyc in get_aggregation(self, using) 364 query.related_select_fields = [] 365 --> 366 result = query.get_compiler(using).execute_sql(SINGLE) 367 if result is None: 368 result = [None for q in query.aggregate_select.items()] /home/bluszcz/projekty/property/rynek_pierwotny/django/db/models/sql/compiler.pyc in execute_sql(self, result_type) 725 726 cursor = self.connection.cursor() --> 727 cursor.execute(sql, params) 728 729 if not result_type: /home/bluszcz/projekty/property/rynek_pierwotny/django/db/backends/util.pyc in execute(self, sql, params) 13 start = time() 14 try: ---> 15 return self.cursor.execute(sql, params) 16 finally: 17 stop = time() /home/bluszcz/projekty/property/rynek_pierwotny/django/db/backends/mysql/base.pyc in execute(self, query, args) 84 def execute(self, query, args=None): 85 try: ---> 86 return self.cursor.execute(query, args) 87 except Database.IntegrityError, e: 88 raise utils.IntegrityError, utils.IntegrityError(*tuple(e)), sys.exc_info()[2] /usr/lib/pymodules/python2.6/MySQLdb/cursors.pyc in execute(self, query, args) 166 self.errorhandler(self, exc, value) 167 self._executed = query --> 168 if not self._defer_warnings: self._warning_check() 169 return r 170 /usr/lib/pymodules/python2.6/MySQLdb/cursors.pyc in _warning_check(self) 80 self.messages.append((self.Warning, w)) 81 for w in warnings: ---> 82 warn(w[-1], self.Warning, 3) 83 elif self._info: 84 self.messages.append((self.Warning, self._info)) Warning: Truncated incorrect DOUBLE value: '2010-10-05 00:00:00' In [22]:
Someone mentioned above, that it could related with MySQL bug: http://bugs.mysql.com/bug.php?id=34374
and looks like that it has been fixed there for versions 5.5+. Anyone maybe checked this?
comment:16 by , 14 years ago
Severity: | → Normal |
---|---|
Type: | → Bug |
comment:19 by , 12 years ago
Component: | Core (Other) → Database layer (models, ORM) |
---|
comment:20 by , 11 years ago
Owner: | set to |
---|---|
Status: | new → assigned |
Working on this in my GSoC project.
comment:21 by , 10 years ago
Can someone help me in reproducing this now. Because __gt, __gte
works for both DateTimeField
as well as DateField
. If that's not the only issue here, kindly point me to what I have been missing.
I did:
s = "2014-06-12 11:57" Comment.objects.filter(article__pub_date__gte=s)
comment:22 by , 10 years ago
models.py
from django.db import models class Test(models.Model): name = models.CharField(max_length=40) created = models.DateField() created2 = models.DateTimeField()
shell:
In [1]: from t19508.models import * In [2]: import datetime In [3]: d = datetime.datetime.now() In [4]: d2 = d.date() In [5]: t=Test.objects.using('mysql').create(name='anubhav', created=d.date(), created2=d) In [6]: t=Test.objects.using('mysql').filter(created2__gte="2008-04-23 14:39:36.133203") In [7]: t[0].created Out[7]: datetime.date(2014, 7, 28) In [8]: t[0].created2 Out[8]: datetime.datetime(2014, 7, 28, 23, 36, 13, tzinfo=<UTC>) In [9]: t=Test.objects.using('mysql').filter(name__search="anubhav").filter(created2__gte="2008-04-23 14:39:36.133203") In [10]: t[0].created Out[10]: datetime.date(2014, 7, 28) In [11]: t[0].created2 Out[11]: datetime.datetime(2014, 7, 28, 23, 36, 13, tzinfo=<UTC>)
I think that the problem now doesn't exists.
comment:23 by , 10 years ago
Cc: | added |
---|---|
Resolution: | → worksforme |
Status: | assigned → closed |
It looks to me like the date isn't being quoted - or at least that's what connection.queries says...