Opened 17 months ago

Closed 17 months ago

Last modified 16 months ago

#34828 closed Bug (wontfix)

DateTimeField breaks when given datetime that would be invalid in UTC

Reported by: Denis Cornehl Owned by: nobody
Component: Database layer (models, ORM) Version: 4.2
Severity: Normal Keywords:
Cc: Triage Stage: Unreviewed
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

We had a strange production error which I think is due to a Django bug, but please correct me if I'm wrong :)

We have USE_TZ active, our own timezone set (Europe/Berlin), and a user entered 0001-01-01 00:00:00 into a datetimefield in a form.

This value is now put into the postgres timestamp with tz field including its timezone ( 0001-01-01 00:00:28+00:53:28 for example).

When I now try to read the model instance from the database, I'm getting a ValueError: year -1 is out of range exception:

  File "[...]/lib/python3.11/site-packages/django/db/models/manager.py", line 87, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/django/db/models/query.py", line 633, in get
    num = len(clone)
          ^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/django/db/models/query.py", line 380, in __len__
    self._fetch_all()
  File "[...]/lib/python3.11/site-packages/django/db/models/query.py", line 1881, in _fetch_all
    self._result_cache = list(self._iterable_class(self))
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/django/db/models/query.py", line 91, in __iter__
    results = compiler.execute_sql(
              ^^^^^^^^^^^^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 1595, in execute_sql
    return list(result)
           ^^^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 2093, in cursor_iter
    for rows in iter((lambda: cursor.fetchmany(itersize)), sentinel):
  File "[...]/lib/python3.11/site-packages/django/db/models/sql/compiler.py", line 2093, in <lambda>
    for rows in iter((lambda: cursor.fetchmany(itersize)), sentinel):
                              ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "[...]/lib/python3.11/site-packages/django/db/utils.py", line 98, in inner
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
ValueError: year -1 is out of range

I'm aware that the postgres timestamp supports a wider range than the python datetime, but then I would have assumed that the field wouldn't even accept values it can't handle.
In this specific case I see that we have a timezone aware datetime in python, store it into the database with timezone, and psycopg2 fetches it as timezone aware datetime. So I'm not sure why it has to be valid in UTC too.

This can be reproduced using a simple model:

class TestModel(models.Model):
    d = models.DateTimeField()

And the following code:

from brokendate.models import TestModel
from datetime import datetime, date
from pytz import timezone

TestModel.objects.all().delete()
ok = datetime(1,1,1, 0,0,0, tzinfo=timezone("Europe/Berlin"))
t = TestModel()
# assign datetime that would be invalid in UTC
t.d = ok
t.save()

print("fetch object")
# this raises the exception
t = TestModel.objects.get()
print(t.d)

I've created a small django project / app with a break management command that will reproduce this problem.

Attachments (1)

djangotz.zip (8.8 KB ) - added by Denis Cornehl 17 months ago.
test project to reproduce the issue

Download all attachments as: .zip

Change History (7)

by Denis Cornehl, 17 months ago

Attachment: djangotz.zip added

test project to reproduce the issue

comment:1 by David Sanders, 17 months ago

The problem we're seeing here is that the time of 0001-01-01 00:00:00+1 is stored in UTC, which will be a "negative" year in postgres. You can test this by examining the timestamp (be sure to set your timezone to utc) and observing the "BC" suffix.

When timestamps are retrieved from the database, they're retrieved as-is, ie in UTC then converted back to the specified timezone upon request.

This assumption here:

In this specific case I see that we have a timezone aware datetime in python, store it into the database with timezone, and psycopg2 fetches it as timezone aware datetime.

sounds like the common misconception that pg stores the timezone and it can be retrieved… it can't because pg doesn't store the timezone. For psycopg2 to create a datetime with a timezone it uses the timezone specified for the connection. When USE_TZ is set, this is UTC hence the error you see.

Some more information here: https://docs.djangoproject.com/en/4.2/topics/i18n/timezones/#postgresql

in reply to:  1 comment:2 by Denis Cornehl, 17 months ago

Replying to David Sanders:

The problem we're seeing here is that the time of 0001-01-01 00:00:00+1 is stored in UTC, which will be a "negative" year in postgres. You can test this by examining the timestamp (be sure to set your timezone to utc) and observing the "BC" suffix.

When timestamps are retrieved from the database, they're retrieved as-is, ie in UTC then converted back to the specified timezone upon request.

This assumption here:

In this specific case I see that we have a timezone aware datetime in python, store it into the database with timezone, and psycopg2 fetches it as timezone aware datetime.

sounds like the common misconception that pg stores the timezone and it can be retrieved… it can't because pg doesn't store the timezone. For psycopg2 to create a datetime with a timezone it uses the timezone specified for the connection. When USE_TZ is set, this is UTC hence the error you see.

Some more information here: https://docs.djangoproject.com/en/4.2/topics/i18n/timezones/#postgresql

Thank you for the quick response!

Yes, when the system timezone is UTC, postgres will have BC in the DB, you're absolutely right.

But shouldn't then the DateTimeField fail and not accept this datetime? Or is it expected behaviour like this?

comment:3 by David Sanders, 17 months ago

But shouldn't then the DateTimeField fail and not accept this datetime? Or is it expected behaviour like this?

I don't think it's necessarily "expected" but it's a reasonable exception to be getting given the -1 year. In this case func() is a psycopg2 (that's what I tested with) function – since Python's datetime min year is 1 then this is where errors will be thrown when it attempts to convert it to a Python type.

I'm not sure what you mean by DateTimeField failing but fields don't validate data from the database 🤔 Is there something you had in mind?

in reply to:  3 comment:4 by Denis Cornehl, 17 months ago

Replying to David Sanders:

But shouldn't then the DateTimeField fail and not accept this datetime? Or is it expected behaviour like this?

I don't think it's necessarily "expected" but it's a reasonable exception to be getting given the -1 year. In this case func() is a psycopg2 (that's what I tested with) function – since Python's datetime min year is 1 then this is where errors will be thrown when it attempts to convert it to a Python type.

I'm not sure what you mean by DateTimeField failing but fields don't validate data from the database 🤔 Is there something you had in mind?

I'm sorry, I could have elaborated in more detail.

I see that when someone puts a timestamp into postgres that is outside of the range of a python datetime, we will see failures when retrieving this data.

But seeing the example above, the invalid data is not bypassing anything but getting into the database via normal DateTimeField operations.

Personally I would have expected that the model-field or at least the formfield would raise an exception when the user passes data that would lead to problems when trying to read it.
In our case a user entered this value and then every request failed that tried to read this object in any way.

( of course my expectation might not fit what Django has in mind here, which is totally OK, then we would work around it)

comment:5 by David Sanders, 17 months ago

Resolution: wontfix
Status: newclosed

Yep so essentially when going to the database it is a valid datetime, when coming out of the database it isn't.

I believe the workaround in this case is to temporarily set USE_TZ = False for scenarios where you expect to be retrieving negative dates that were converted from positive dates. For validation one could check whether the timestamp is within 24 hours (or your timezone) of the year 0 🤷‍♂️

The decision as to whether or not Django should validate & detect things going into the database that would be converted to negative timestamps is probably more something to raise on the Django forum. If that's something you'd like to raise please feel free to start a thread: https://www.djangoproject.com/community/ The process from here is to mark the ticket wontfix and if the forum decides yes then we reopen this ticket.

Hope that helps 🤷‍♂️🙂

in reply to:  5 comment:6 by Denis Cornehl, 16 months ago

Replying to David Sanders:

The decision as to whether or not Django should validate & detect things going into the database that would be converted to negative timestamps is probably more something to raise on the Django forum. If that's something you'd like to raise please feel free to start a thread: https://www.djangoproject.com/community/ The process from here is to mark the ticket wontfix and if the forum decides yes then we reopen this ticket.

Hope that helps 🤷‍♂️🙂

Of course this helps, thank you for the quick & helpful responses.

Then I'll first find a workaround for us, and then start the discussion in the forum.

Note: See TracTickets for help on using tickets.
Back to Top