Opened 18 years ago

Closed 11 years ago

#2284 closed Bug (duplicate)

[patch] Extra /sql files with whitespace break syncdb

Reported by: scottanderson Owned by: nobody
Component: Core (Management commands) Version: dev
Severity: Normal Keywords: syncdb
Cc: scottanderson@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: yes Patch needs improvement: yes
Easy pickings: no UI/UX: no

Description

Extra .sql files in /sql directories break syncdb when the files have whitespace.

Example:

CREATE TABLE "forum_post_fulltext" (
etc...

) WITHOUT OIDS;

CREATE INDEX idxcontent ON forum_post_fulltext USING gist (content_idx);

CREATE INDEX idxposttree ON forum_post (thread_id, leftid, rightid);

CREATE FUNCTION f_post_insert() RETURNS trigger AS '
... etc. ...
' LANGUAGE plpgsql
;

The above will cause syncdb to display errors.

Attachments (3)

syncdb-whitespace-fix.diff (785 bytes ) - added by anonymous 18 years ago.
Patch to fix whitespace problem
post.sql (318 bytes ) - added by scottanderson@… 18 years ago.
Test file for problem
new-patch.diff (1.9 KB ) - added by scottanderson@… 18 years ago.
Reverts old regex but stil fixes escaped quotes problem from ticket #2119

Download all attachments as: .zip

Change History (31)

comment:1 by anonymous, 18 years ago

Summary: Extra /sql files with whitespace break syncdb[patch] Extra /sql files with whitespace break syncdb

by anonymous, 18 years ago

Attachment: syncdb-whitespace-fix.diff added

Patch to fix whitespace problem

comment:2 by anonymous, 18 years ago

Component: Core frameworkTools

comment:3 by Malcolm Tredinnick, 18 years ago

Can you post an example of something that includes the whitespace you are seeing breakage for, please. It's a bit hard to work out from your original post: is the whitespace hidden inside the "etc" bits, or is it the linebreaks, or ...? To create a regression test for this, we first need a failing example. Thanks.

comment:4 by scottanderson, 18 years ago

Apparently the bug report software stripped the linebreaks. Just add linebreaks between the statements.

comment:5 by anonymous, 18 years ago

Cc: scottanderson@… added

comment:6 by Malcolm Tredinnick, 18 years ago

I'm still not able to repeat this. Can you please attach a text file of a failing example. If I add blank lines between statements in tests/regressiontests/inital_sql_regress/sql/simple.sql in the source, it does not cause any errors, for example.

The get_sql_initial_data_for_model() function in management.py already strips each SQL statement and does not append blank ones, so I am wondering how these extra blank statements are getting through.

What subversion checkout version are you running and which database is this failing under (shouldn't be relevant, but just in case)?

comment:7 by scottanderson, 18 years ago

To be honest, this is a patch we've been using for a while against magic removal. I've just created a new database and the patch doesn't seem to be fixing it any longer. The whitespace no longer seems to be significant; it's choking on the stored procedure syntax.


Here's a minimal testcase (in a file called Post.sql; I'll attach this as well):


CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE

v_level integer;
v_rightid integer;
nowtime timestamp(3);
vrec RECORD;

BEGIN

nowtime := CURRENT_TIMESTAMP;

RETURN NEW;

END;
' LANGUAGE plpgsql
;

CREATE TRIGGER post_insert BEFORE INSERT ON forum_post FOR EACH ROW
EXECUTE PROCEDURE f_post_insert() ;



Here's the output I'm getting now:


Installing initial data for Post model
Executing 'CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE

v_level integer;'

Failed to install initial SQL data for Post model: ERROR: unterminated quoted string at or near "'
DECLARE

v_level integer;" at character 52

CREATE FUNCTION f_post_insert() RETURNS trigger AS '
DECLARE

v_level integer;Installing initial data for Collection model



I apologize, but I don't know what version of m-r we were using when the fix was actually working. It's been a while since I've had to rebuild the database, and I wanted to send the patch in before I forgot about it.


From svn info: Last Changed Rev: 3275

This is PostgreSQL 8.1.

by scottanderson@…, 18 years ago

Attachment: post.sql added

Test file for problem

comment:8 by scottanderson@…, 18 years ago

I should also add that this file works fine when cut and pasted into a psql session.

comment:9 by scottanderson@…, 18 years ago

I've discovered the problem, I think.


Your change seen in 'svn diff -r 3177:3178' is what fixed the problem my patch also fixed, but yours ironically breaks my SQL file in a different fashion. :-)


Reverting to the old regular expression allows my code to go through. The problem bit is that this guy is missing now:


| (?:'[']*') # something in single quotes



Which was preserving the text of the stored procedure as a single executable statement. Now it's being broken up at the semi-colons by the new regex.

by scottanderson@…, 18 years ago

Attachment: new-patch.diff added

Reverts old regex but stil fixes escaped quotes problem from ticket #2119

comment:10 by scottanderson@…, 18 years ago

This new patch seems to work for both escaped quotes and stored procedures in PostgreSQL. I can't promise it works in other databases.

comment:11 by Malcolm Tredinnick, 18 years ago

Reverting [3178] is a bit desperate. The old reg-exp was extremely fragile and broke in a number of cases (I can think of about half a dozen bugs that were open more or less simultaneously because of problesm in that code). Basically, trying to parse SQL using a reg-exp is pretty doomed (it looks like there might be problems with properly SQL-escaped quotes -- things like -- in that reg-exp, for example). The multi-line nature of the whole thing is part of the cause of the problems.

So we'll need to think about this a bit more and try to come up with something more robust. Since speed isn't a real issue for initial SQL, we can afford to do a little more inspection and be more careful than in other places. So a solution based on splitting and inspection might work, too. I'll try some things out.

comment:12 by Malcolm Tredinnick, 18 years ago

Oh, #$%! Bitten by wiki-formatting. Just before italics get turned on, it should have two single quotes next to each other (the spec-compliant SQL quote escape).

comment:13 by scottanderson@…, 18 years ago

Well, it's not so much a reversion as fixing the old regex in a different fashion. Note that it takes care of escaped quotes now. But yes, I agree with you that it can get to be a bit difficult. However, since all we need to do is figure out statement endings, the problem is simpler. Actually parsing the SQL is an order of magnitude more difficult given the different engines that it needs to support. Perhaps pushing this sort of thing off into the individual database wrappers is in order, so that the idiosyncratic differences won't need a general solution.

At any rate, adding a stored procedure definition to the regression tests is probably a good idea. :-)

comment:14 by anonymous, 18 years ago

This takes care of sql-style quoting.

          | (?:'(?:[^']|\\'|'')*') # something in single quotes, but not escaped
          | (?:"(?:[^"]|\\"|"")*") # something in double quotes, but not escaped

comment:15 by Gary Wilson <gary.wilson@…>, 18 years ago

Needs tests: set
Triage Stage: UnreviewedAccepted

comment:16 by Adrian Holovaty, 17 years ago

Version: magic-removalSVN

comment:17 by Adrian Holovaty, 16 years ago

Component: Toolsdjango-admin.py

in reply to:  description comment:18 by anonymous, 16 years ago

Component: django-admin.pydjango.contrib.databrowse
milestone: post-1.0
Needs documentation: set
Owner: changed from nobody to anonymoushgjdhj vf
Patch needs improvement: set
Version: SVN1.0-alpha

Replying to scottanderson:

Extra .sql files in /sql directories break syncdb when the files have whitespace.

Example:jgnjn njngjnj jnj koijiojij jiojiojibvj iojiojbio jiojioji ji jiojio joi joijoijovijfiji ijoijkioj iojoij oij oijioj iojio joijiojioj iojioj ioj

CREATE TABLE "forum_post_fulltext" (
etc...

) WITHOUT OIDS;

CREATE INDEX idxcontent ON forum_post_fulltext USING gist (content_idx);

CREATE INDEX idxposttree ON forum_post (thread_id, leftid, rightid);

CREATE FUNCTION f_post_insert() RETURNS trigger AS '
... etc. ...
' LANGUAGE plpgsql
;

The above will cause syncdb to display errors.

comment:19 by Russell Keith-Magee, 16 years ago

Component: django.contrib.databrowsedjango-admin.py
milestone: post-1.0
Needs documentation: unset
Owner: changed from anonymoushgjdhj vf to nobody
Patch needs improvement: unset
Version: 1.0-alphaSVN

Reverting some unhelpful status changes

in reply to:  19 comment:20 by anonymous, 16 years ago

Replying to russellm:

Reverting some unhelpful status changes

no not this time

comment:21 by Adam Nelson, 15 years ago

Patch needs improvement: set

From Jeremy Dunck (http://groups.google.com/group/django-developers/browse_thread/thread/a60a582d1e0a562b/cfca983bfe633b80):

... the current status is captured in
comments 11 through 14 -- that the logic for executing initial SQL
breaks in cases where extended SQL syntax (such as pgsql's DECLARE)
includes quoted text, etc.
You can see the Malcolm objected to the latest patch in 11, and
ScottAnderson objected to it in 13.
Certainly, the patch needs improvement because it doesn't include any
test cases. Possibly, the patch as-is isn't good for the reasons
Malcolm listed, but it's hard for me to tell without doing my own
testing. I'd mark patch-needs-improvement. If you're feeling more
ambitious, I'd suggest coming up with a set of initial SQL files that
exercise the assumptions of the loading code.

comment:22 by Łukasz Rekucki, 14 years ago

Type: defectBug

comment:23 by Łukasz Rekucki, 14 years ago

Severity: normalNormal

comment:24 by Aymeric Augustin, 13 years ago

UI/UX: unset

Change UI/UX from NULL to False.

comment:25 by Aymeric Augustin, 13 years ago

Easy pickings: unset

Change Easy pickings from NULL to False.

comment:26 by Claude Paroz, 12 years ago

I think that the parsing solution leads us nowhere. It is not fair to penalize all backends because some cannot execute multiple statements. IMHO, we should have a backend-specific load_sql that take care of either splitting the statements or to feed the database with the read file at once.

comment:27 by merb, 11 years ago

this ticket seems to be fixed.

comment:28 by Claude Paroz, 11 years ago

Resolution: duplicate
Status: newclosed

In any case, custom sql loading is handled in #3214.

Note: See TracTickets for help on using tickets.
Back to Top