Opened 14 years ago

Last modified 8 years ago

#14904 new Bug

TextField with unique (or in unique_together) constraint breaks for large inputs in Postgres

Reported by: jorn Owned by: nobody
Component: Database layer (models, ORM) Version: 1.2
Severity: Normal Keywords: postgresql, index, textfield
Cc: Triage Stage: Accepted
Has patch: no Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Hi,

i have a model like this one:

class foo(Model):
  a = CharField(max_length=8)
  b = TextField()
  
  class Meta:
    unique_together = (("a", "b"),)

Now this leads to a sql create table statement like the following in Postgresql:

CREATE TABLE "myapp_foo" (
    "id" serial NOT NULL PRIMARY KEY,
    "a" varchar(8) NOT NULL,
    "b" text NOT NULL,
    UNIQUE ("a", "b")
);

The problem here is that b is a textfield which can get arbitrarily big (that's what they are there for), while values in a btree index can not:
If the input for b is above a certain size postgresql will complain:

django.db.utils.DatabaseError: index row size 3032 exceeds maximum 2712 for index "myapp_a_key"
HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
Consider a function index of an MD5 hash of the value, or use full text indexing.

The problem here is that the used unique constraint as of Postgres 8.4.5 uses a btree internally, which as stated above doesn't work with values above a certain size.

The solution for this problem would be: If a TextField has a / is part of a unique constraint or has an explicit index_db=True: _always_ index an md5 or better sha256 sum of TextField columns only (note that sha256 is in pgcrypto module only).
In this case:
UNIQUE ("a", md5("b"))

As mentioned this also applies to the universal index_db=True field argument.

Change History (4)

comment:1 by Ramiro Morales, 14 years ago

Triage Stage: UnreviewedAccepted

It would be helpful to have a complete simple sample case description especially a hint at how much text content should be assigned to the field for this condition triggering, and thus can be reproduced and/or tested against other postgresql versions.

comment:2 by James Addison, 14 years ago

Severity: Normal
Type: Bug

comment:3 by anonymous, 13 years ago

Easy pickings: unset
UI/UX: unset

Have the same problem with mysql,

_mysql_exceptions.OperationalError: (1071, 'Specified key was too long; max key length is 767 bytes')

With

name = models.CharField(null=True, blank=True, max_length=300)
user = models.ForeignKey(User,null=True, blank=True)    

class Meta :
        unique_together = (('name', 'user'),)

comment:4 by Sarah Messer, 8 years ago

Have also observed this with Django 1.8 and postgres 9.3

I'm linking references for fixes / workarounds found in a quick search:

At the moment, it looks like any workaround involves writing some raw SQL. Digging through Django's code, indices are created by django.db.backends.base.schema.BaseDatabaseSchemaEditor._sql_create_index() That's pretty far outside the public API.

Version 0, edited 8 years ago by Sarah Messer (next)
Note: See TracTickets for help on using tickets.
Back to Top