#5806 closed (wontfix)
Would like a callback on models.Field during SQL generation
Reported by: | | Owned by: | nobody
---|---|---|---
Component: | Uncategorized | Version: | dev
Severity: | | Keywords: |
Cc: | | Triage Stage: | Unreviewed
Has patch: | no | Needs documentation: | no
Needs tests: | no | Patch needs improvement: | no
Easy pickings: | no | UI/UX: | no
Description
It would be useful if Fields had the opportunity to specify SQL code to be included when generating the SQL for their model.
I am currently trying to add some stuff to the SQL generated by Django to implement full text indexes in PostgreSQL for several models. Each model requires the following to be added:
```sql
-- full text indexing on blogs_post
CREATE INDEX blogs_post_fts ON blogs_post USING gin (vector);

CREATE FUNCTION blogs_post_fts_update () RETURNS trigger AS $$
BEGIN
    NEW.vector = ts2.to_tsvector ('default', NEW.title || ' ' || NEW.body);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER blogs_post_fts_update BEFORE INSERT OR UPDATE ON blogs_post
    FOR EACH ROW EXECUTE PROCEDURE blogs_post_fts_update ();
-- end of full text indexing on blogs_post
```
Entering this for every table that I want to create an index upon is really annoying!
It is possible to generate these SQL statements based on information in the model definition:
```python
class Blog(models.Model):
    title = models.TextField()
    body = models.TextField()
    vector = TsvectorField(fields=['title', 'body'])
```
In order to do this, I created a custom field type that adds the actual field to the CREATE TABLE statement:
```python
from django.db import models

class TsvectorField(models.Field):
    def __init__(self, fields, *args, **kwargs):
        if kwargs.pop('editable', False):
            raise ValueError('TsvectorFields can not be editable')
        super(TsvectorField, self).__init__(*args, **kwargs)
        # do something with fields!

    def db_type(self):
        return 'ts2.tsvector'
```
But at this point I am stuck: as far as I can see, there is nowhere to place the code to generate the above SQL.
I propose that `django.core.management.sql.sql_indexes_for_model` be modified to call a method on each field, named something like `get_sql_index`. This method would generate and return the above SQL statements.
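To illustrate the idea, here is a rough sketch of what such a hook could look like on the field above; the method name and signature are hypothetical, and it assumes the constructor remembers the source field names as `self.fields`:

```python
# Hypothetical sketch of the proposed hook; sql_indexes_for_model() would
# call it for each field and append whatever statements it returns.
class TsvectorField(models.Field):
    # ... __init__ and db_type as above, with the source field names
    # remembered as self.fields ...

    def get_sql_index(self, model):
        table = model._meta.db_table
        concat = " || ' ' || ".join('NEW.%s' % name for name in self.fields)
        return [
            'CREATE INDEX %s_fts ON %s USING gin (vector);' % (table, table),
            # the CREATE FUNCTION / CREATE TRIGGER statements from the
            # description would be built the same way, using `concat`
        ]
```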
Thoughts?
Change History (5)
comment:1 by , 17 years ago
Summary: | Would like a callback on field creation → Would like a callback on models.Field during SQL generation
---|---
comment:2 by , 17 years ago
Personally, I think that trying to support every conceivable SQL feature people might want to use during table creation is a hopeless task -- the API would need to bloat up to the level of something like SQLAlchemy, and the complexity would quickly get overwhelming. Meanwhile, you can pipe the output of `manage.py sql` into a file, edit it to add absolutely anything you like, and then execute it against your database.
comment:3 by , 17 years ago
The problem with having to manually execute the SQL statements is that they can't be run automatically when a project is installed, including when `manage.py test` is run. Not to mention how tedious and error-prone it is, compared with (in this case) adding a single TsvectorField to each model.
I actually think that this will help prevent the database API from becoming bloated. It will allow users to make use of entire classes of features that require additional, custom database statements such as these, without having to add APIs for each individual feature.
This is particularly desirable in cases such as ours, where the actual SQL generated is quite specific to our requirements; there are many, many different strategies for implementing things such as full-text indexing, none of which is suitable for everyone.
Having done some more thinking about this, I think that rather than hijacking the `sqlindexes` command, a new command such as `sqlextra` should be added; this would call each field's `get_sql_extra` method. The `sqlextra` command would be called as part of the `sqlall` command.
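As a sketch only (none of these names exist in Django), the command side could then be a small helper that collects the extra statements from every field providing the hook:

```python
# Hypothetical helper for the proposed 'sqlextra' command: gather extra SQL
# from any field that defines get_sql_extra(); other fields are skipped.
def sql_extra_for_model(model):
    output = []
    for field in model._meta.fields:
        get_extra = getattr(field, 'get_sql_extra', None)
        if get_extra is not None:
            output.extend(get_extra(model))
    return output
```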
I'm happy to work on this myself, of course, but I don't really want to do so if I'm going to have to maintain the patch out-of-tree.
comment:4 by , 17 years ago
Resolution: | → wontfix
---|---
Status: | new → closed
The post_syncdb signal exists for purposes like this. Catch it for the models you care about and then start talking directly to the database to add whatever extra stuff you need. It is called once per app, and you are passed a list of models created in that app.
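For instance, a minimal sketch of that approach; the app, model and index names are made up, and it assumes a Django version that still ships the post_syncdb signal:

```python
# A minimal sketch of the suggested approach: run the extra DDL once the
# app's tables have been created. App, model and index names are made up.
from django.db import connection
from django.db.models.signals import post_syncdb

import blogs.models


def install_fts(sender, created_models, **kwargs):
    # Only act when this app's Post table has just been created.
    if blogs.models.Post not in created_models:
        return
    cursor = connection.cursor()
    cursor.execute('CREATE INDEX blogs_post_fts ON blogs_post USING gin (vector)')
    # ... followed by the CREATE FUNCTION / CREATE TRIGGER statements ...


post_syncdb.connect(install_fts, sender=blogs.models)
```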
comment:5 by , 17 years ago
Thanks for the hint; that works well, except for one problem: the function is called twice when I run the 'test' management command.
It seems that both 'syncdb' and 'flush' emit the 'post_syncdb' signal. Since 'test' calls 'syncdb' and then 'flush', the DDL statements all fail the second time they are called (and any subsequent times, too).
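One way to work around that (a sketch extending the handler above, not the only option) is to make it idempotent by checking the PostgreSQL catalogs before running the DDL:

```python
# Make the handler idempotent: skip the DDL if the index is already there,
# so a second post_syncdb emission does not fail on duplicate objects.
cursor.execute('SELECT 1 FROM pg_indexes WHERE indexname = %s', ['blogs_post_fts'])
if cursor.fetchone() is None:
    cursor.execute('CREATE INDEX blogs_post_fts ON blogs_post USING gin (vector)')
    # ... and the rest of the statements ...
```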