Opened 19 years ago

Last modified 18 years ago

#87 closed enhancement

[patch] Oracle database support — at Version 22

Reported by: jrhuggins@… Owned by: Adrian Holovaty
Component: Database layer (models, ORM) Version:
Severity: normal Keywords: oracle database db sequence
Cc: Triage Stage: Unreviewed
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: UI/UX:

Description (last modified by Adrian Holovaty)

Here's a patch for Oracle support.

Change History (33)

comment:1 by gwyn.evans@…, 19 years ago

Note that you could just have a single sequence that provides the nextvals for all the administered tables, although to auto-increment, you'll want a trigger per table, e.g.

create table MAPPING (
  ID         integer not null,
  DPCP_ID    integer not null,
  primary key (ID)
);

create or replace trigger MAPPING_TRIGGER
before insert on MAPPING
for each row
begin
select SEQUENCE.NEXTVAL into :new.ID from DUAL;
end;
;

comment:2 by Jason Huggins, 19 years ago

I *mostly* have the Oracle driver working... (I use one sequence and one trigger per table to simulate auto-increment. Literature is mixed on this, but it appears 1 table - 1 sequence is the "Oracle Way".)

With my Oracle driver, I can initialize the site (django-admin init), create the superuser, log in as the superuser on the website, install an "app", and edit users and groups through the web.

Two things are bugging me, though.

1) Oracle's equivalent of "LIMIT and OFFSET" is very kludgy... I have to completely rewrite normal Django SQL statements that use "LIMIT" to use Oracle's "rank()" or "row_number()" analytic functions. The kludge is that you have to wrap the original sql statement inside a subquery, and then apply a rownumber filter on the outer/main query. I haven't found a great example for how to do this when *also* joining to multiple tables. When selecting content edit history, Django uses a join on auth_admin_log and auth_users and a LIMIT clause. I punted on this for now by commenting out the LIMIT and OFFSET clause code in the "function_get_sql_clause" of django.core.meta.py. :-(

2) When trying to save edits to default content in the admin site (i.e. 'users', 'groups', etc), in many different places, I'm getting "Cannot insert NULL into.. <schema>.<table_name.<field_name>" errors. My workaround for this for now is hacking model field definitions by adding "null=True" to allow nulls in the database. I need to find a better default answer to this.

After I work out the LIMIT issue and insert NULL issue, the Oracle driver will probably be in a "alpha code condition" for checking in. Next step would be to work on fleshing out major unit tests for the db backends.

comment:3 by rmunn@…, 19 years ago

Apparently Oracle treats empty strings as equivalent to NULLs. I was surprised to discover this, too, but it's true. From the official Oracle documentation at http://www.oracle.com/technology/documentation/index.html, click on Oracle Database 10g Release 2 (10.2), then SQL Reference. (Registering with the oracle.com Web site ends up being required at some point in the process). Then click on "Nulls" in the table of contents. Straight from the horse's mouth, as it were:

"Note: Oracle Database currently treats a character value with a length of zero as null. However, this may not continue to be true in future releases, and Oracle recommends that you do not treat empty strings the same as nulls."

My own personal reaction on learning this: "And people pay *HOW MUCH* for this thing?!!?"

But that's the way Oracle currently is, and we have to work around it. So here's my suggestion: hack the Oracle backend to replace all empty strings with a special value. Some people apparently use " " -- but I suggest U+FEFF (ZERO WIDTH NO-BREAK SPACE). And on the way out of the database, replace all occurrences of a single U+FEFF with the empty string. As long as Oracle handles Unicode properly, that should be a decent workaround for #2.

Note that the workaround will take quite a bit of work to make it actually function. It's not just INSERTs and UPDATEs that you'll need to tweak, it's all the WHERE clauses as well. When the user does "WHERE field = ", you'll have to intercept that one and replace with U+FEFF, or he'll get no rows back from a query that should have returned data.

comment:4 by rmunn@…, 19 years ago

The "#2" above was not meant to be a link to ticket 2, it was just meant as a reference to the second point of Jason's rant. Ah well.

comment:5 by Jason Huggins, 19 years ago

Here's a good link describing the different ways to do LIMIT/OFFSET in several databases (including Oracle):
http://troels.arvin.dk/db/rdbms/#select-limit-offset

comment:6 by Jason Huggins, 19 years ago

I wrote: " I need to find a better default answer to this."

It appears that my above statement was more clever than I thought. It appears that "DEFAULT" is supported in Oracle. It is used when rows that are inserted into a table omit values for a column. This is one part of the problem I'm having with Django on Oracle. Some of the errors I'm seeing are when inserts or updates are leaving out a field and an empty strinig is trying to be inserted in its place. So I could perhaps set the default to U+FEFF (Another option is the good old space character (' '), which is how my PeopleSoft on Oracle systems deal with the empty string problem--- all empty strings are converted to one single space.

rmunn wrote: "It's not just INSERTs and UPDATEs that you'll need to tweak, it's all the WHERE clauses as well. "

This might not be a problem, most good queries in Django use parameter passing. I could catch all parameters that are "" and convert them to U+FEFF on the fly. The problem is trickier if parameter passing is not being used everywhere.

What if we came up with a db backend-specific value called "DB_EMPTY_STR_EQUIV". In MySQL, PostgreSQL, and SQLlite, it would be "", in Oracle, it would be U+FEFF or " " or even "this string left intentially blank"... All queries would then use DB_EMPTY_STR_EQUIV instead of hard-coding "". This might be the price we have to pay for cross database compatibility. :-/

comment:7 by Jason Huggins, 19 years ago

I think I have the NULL vs empty string issue mostly fixed.

I added an EMPTY_STR_EQUIV configuration setting to core/db/backends/oracle.py. For now, I have the option set to " " (a single space character). This can replaced with a unicode character (U+FEFF) if the database supports it. See below for how the setting is used.

The other thing I might need to change for this "Oracle can't deal with empty strings" bug is to add a "DEFAULT" clause to string fields in the table creation SQL if/when insert statements try to insert a value into a string field without defining a value for it.

Also, we might not need to intercept the WHERE clauses in select queries. Users of Oracle will simply need to understand that queries looking for "" will look for NULLS, not empty strings... Users could always use "db.EMPTY_STR_EQUIV" in their select queries to make "select fields with empty string" queries cross-database-platform safe.

I changed get_db_prep_save in meta.py to use the new setting like so:

class Field(object):
...
    def get_db_prep_save(self, value, add):
        "Returns field's value prepared for saving into a database."
        
        # Oracle treats empty strings ('') the same as NULLs
        # To get around this wart, we need to change it to something else
        if value == '':
            string_replacement = getattr(db,'EMPTY_STR_EQUIV','')
            value = string_replacement
        return value
...

comment:8 by Jason Huggins, 19 years ago

Offset and limit are working now!

I had to hack "function_get_sql_clause" in meta.py, but it's working. In the old code, function_get_sql_clause returns parts of a final query ('select' - the fields, 'sql' - the from, where, order by, and limit/offset clauses, and 'params' - the query parameters). These parts are returned and the calling query does final assembly to create and execute a query. Because Oracle requires the order by and limit/offset clauses to go in "weird" places... I thought it is better for function_get_sql_clause to return the entire rendered query, instead of just the parts... For backwards compatibility, it'll return the parts (select, sql, and params), plus the final rendered query.

Here's an example before and after for dealing with limits:

Before:

SELECT 
auth_sessions.id,
auth_sessions.user_id,
auth_sessions.session_md5,
auth_sessions.start_time,
auth_users.id,
auth_users.username,
auth_users.first_name,
auth_users.last_name,
auth_users.email,
auth_users.password_md5,
auth_users.is_staff,
auth_users.is_active,
auth_users.is_superuser,
auth_users.last_login,
auth_users.date_joined 
FROM auth_sessions,auth_users WHERE 
auth_sessions.session_md5 = '68df88f01fa4ccb4b85bfaeb4b05ff2b' 
AND auth_sessions.user_id = auth_users.id
ORDER BY auth_sessions.id DESC LIMIT 10 OFFSET 10

After:

SELECT * FROM (
  SELECT
    auth_sessions.id,
    auth_sessions.user_id,
    auth_sessions.session_md5,
    auth_sessions.start_time,
    auth_users.id,
    auth_users.username,
    auth_users.first_name,
    auth_users.last_name,
    auth_users.email,
    auth_users.password_md5,
    auth_users.is_staff,
    auth_users.is_active,
    auth_users.is_superuser,
    auth_users.last_login,
    auth_users.date_joined,
    ROW_NUMBER() OVER (ORDER BY auth_sessions.id DESC) AS rn
    FROM auth_sessions,auth_users WHERE 
    auth_sessions.session_md5 = '68df88f01fa4ccb4b85bfaeb4b05ff2b' 
    AND auth_sessions.user_id = auth_users.id
)
WHERE rn <= (limit+offset) AND rn > offset

comment:9 by Jason Huggins, 19 years ago

With the Nulls vs. empty strings and limit/offset issues solved, I've hit a new snag-- Boolean field types... As far as I can tell, Oracle doesn't support a native Boolean field type of "True" or "False" like in PostgreSQL and MySQL. In Oracle, like in older versions of Python, you have to use a numeric field type, where 1 = True, and 0 = False. So, anywhere a query is looking for a Boolean, I have to intercept it and convert where clause logic values to ones and zeroes.

I'm slowly getting there... I think I can see the finish line from here...

comment:10 by Adrian Holovaty, 19 years ago

Regarding boolean fields: MySQL doesn't support those, either (internally, they're just ones and zeroes), so we could just do the same stuff for Oracle boolean support.

comment:11 by Jason Huggins, 19 years ago

Oh, Django... how I love thee... let me count the ways... :-)

Converting True to 1 and False to 0 was *eeeeeeeeasy*, thanks to the sweet pre/post hooks in the API.

I just needed to add a get_db_prep_lookup method def to the BooleanField class (in meta.py) like so:

class BooleanField(Field):
    ...

    def get_db_prep_lookup(self, lookup_type, value):
        if db.DATABASE_ENGINE == 'oracle':
            if value == 'True':
                value = 1
            elif value == 'False':
                value = 0
        return Field.get_db_prep_lookup(self, lookup_type, value)

    ...

So for MySQL support, perhaps we should add a check for MySQL, too?

comment:12 by Jason Huggins, 19 years ago

P.S. My example "before and after" queries are more than a bit *odd*--- You would never want to select more than one session for a user, let alone a limit of 10.

However, the example is useful for showing the syntactical differences between PostgreSQL and Oracle for limit/offset queries.

comment:13 by Jason Huggins, 19 years ago

Adrian, I'll look into the typecast you mentioned for MySQL... However, I'm now confused about when typecasts should be used vs. the pre/post hooks in the Field class definitions. More than one way to do it?

comment:14 by Adrian Holovaty, 19 years ago

Jason: The get_db_prep_lookup hooks aren't intended for database-specific stuff -- they're for field-type-specific stuff. The boolean typecasting should go in the database-specific module in django.core.db.*.

comment:15 by Jason Huggins, 19 years ago

Adrian, what about this line in fields.py?

class DateField(Field):
    ...
    def get_db_prep_save(self, value):
        # Casts dates into string format for entry into database.
        if value is not None:
            value = value.strftime('%Y-%m-%d')
        return Field.get_db_prep_save(self, value)
    ...

To get this to work in Oracle, I modifed it like so:

class DateField(Field):
    ...
    def get_db_prep_save(self, value):
        # Casts dates into string format for entry into database.
        if value is not None:
            if db.DATABASE_ENGINE != 'oracle':
            #Oracle does not need a string conversion
                value = value.strftime('%Y-%m-%d')
        return Field.get_db_prep_save(self, value)
    ...

Even though I haven't changed the lookup hooks into typecasts, yet... I think the Oracle backend is "good" enough for intitial upload into the repository. Well, good enough that it works for me. :-) The sequences issue, the Null/Empty String issue, the offset&limit issue, and boolean issue are all now resolved for Oracle. I'll post the diffs and files shortly.

by Jason Huggins, 19 years ago

Attachment: __init__.py added

by Jason Huggins, 19 years ago

Attachment: mysql.py added

by Jason Huggins, 19 years ago

Attachment: postgresql.py added

by Jason Huggins, 19 years ago

Attachment: sqlite3.py added

by Jason Huggins, 19 years ago

Attachment: oracle.py added

New db backend for Oracle.

by Jason Huggins, 19 years ago

Attachment: management.py added

by Jason Huggins, 19 years ago

Attachment: __init__.2.py added

django\core\meta\init.py

by Jason Huggins, 19 years ago

Attachment: fields.py added

by Jason Huggins, 19 years ago

Unified diff showing all changes... The diff does not include oracle.py, of course, since it is a new file.

comment:16 by rmunn@…, 19 years ago

Summary: Oracle database support.[patch] Oracle database support.

Added "[patch]" to summary to flag it for inclusion in the Active tickets with patches list.

by Jason Huggns, 19 years ago

Updated unified diff for Oracle backend

comment:17 by Jason Huggins, 19 years ago

I have attached an updated patch to deal with the Django codebase changes between revisions 477 (when the diff was first created) and revision 656 (which was the latest as of September 21, 2005)

Also added new support for date filtering in the admin views.

comment:18 by Adrian Holovaty, 19 years ago

Status: newassigned
Summary: [patch] Oracle database support.[patch] Oracle database support

Excellent! Thanks for the update, Jason.

comment:19 by Jacob, 19 years ago

milestone: Version 1.0

by Jason Huggins, 19 years ago

Updated patch for subversion rev #2074

comment:20 by Jason Huggins, 19 years ago

I updated the patch to work with Django as of revision 2074.

There are several "if db=='oracle'" type of statements in the patch, which Adrian has said should be removed as much as possible. I have Oracle support working for me fairly well now, but I'm not sure how to make this patch more elegant. I'm sure there are still some gremlins in the patch somehwere, but the common CRUD operations in the admin app are working well... So So, feedback welcome.

Also, if folks want to test out the patch here are a few links I recommended to get up and running with Oracle quickly:
http://www.vmware.com/vmtn/vm/oracle.html

or

http://www.vmware.com/vmtn/vm/ubuntu.html + http://csee.wvu.edu/~ccole/oracle10g-ubuntu.html

My patch has been tested on Oracle 9.2.0.1 on Windows XP with Python 2.4.1 and Django svn rev 2074.

comment:21 by Jason Huggins, 19 years ago

Just a side comment, the issue summary for this ticket should probably be changed. Although it is interesting that Adrian and I first chatted about this over IRC, the summary should probably be truncated to something pithy. I don't have access to make that edit.
Thanks, Jason

comment:22 by Adrian Holovaty, 19 years ago

Description: modified (diff)
Note: See TracTickets for help on using tickets.
Back to Top