Opened 13 years ago

Closed 12 years ago

#17758 closed Bug (fixed)

Do not depend on dict order in test suite

Reported by: Łukasz Rekucki Owned by: Aymeric Augustin
Component: Core (Other) Version: 1.3
Severity: Release blocker Keywords:
Cc: taylor.mitchell@… Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This includes things like attribute order in HTML output, key order in URL query strings, order in Form's error, etc.

Change History (39)

comment:1 by Carl Meyer, 13 years ago

Component: UncategorizedCore (Other)
Severity: NormalRelease blocker
Triage Stage: UnreviewedAccepted

Attribute order in HTML output should have been addressed by r17414 - if some cases were missed, assertHTMLEqual ought to make the fix simple in those cases.

comment:2 by Łukasz Rekucki, 13 years ago

OK, I found 2 major bugs so far:

1) django.test.simple.dependency_order()

There are two problems here:

a) The result of this functions depends on the order of aliases for given DB signature (which in turn depends on the order in settings.DATABASES dict) - i.e. if you have a DB with aliases ["A", "B"] and "A" depends on "B" then it will raise an error. If the alias list is ["B", "A"] it will pass (I think it should always fail, right?).

b) The way setup_databases() in the runner treat databases with no NAME - they get the same signature. There's a special case later to fix this, but it messes up dependency_order(). In particular, the Django test suite has two DBs: "default" and "other". If they're assigned the same test_signature, they're treated as aliases (which isn't correct). As documented, the "other" database has an implicit dependency on "default" because we gave no explicit TEST_DEPENDENCIES. And that's how we arrive at a).

So right now, you have about 50% chance of even running the test suite ;) I don't have a patch for a general case yet. ATM, I modified the backend code to return NaN instead of empty string for databases with no NAME, which prevents them from being aliases. This should also be the case for SQLite databases where NAME == ":memory:". And of course, dependency_order() needs fixing.

2) An ORM bug

FAIL: test_tickets_8921_9188 (regressiontests.queries.tests.Queries6Tests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/lrekucki/django/django_lqc/tests/regressiontests/queries/tests.py", line 1443, in test_tickets_8921_9188
    ['<Tag: t1>', '<Tag: t4>', '<Tag: t5>']
  File "/home/lrekucki/django/django_lqc/django/test/testcases.py", line 774, in assertQuerysetEqual
    return self.assertEqual(map(transform, qs), values)
AssertionError: Lists differ: ['<Tag: t4>', '<Tag: t5>'] != ['<Tag: t1>', '<Tag: t4>', '<T...

I tracked the problem down to QuerySet.alias_refcount and QuerySet.split_exclude which assumes a specific order on the former. The order can be preserved using SortedDict, but there are a few places that relabel aliases and ruin the order, like QuerySet.change_aliases(). (this iterates through change_map argument which is a dictionary, so the new aliases are inserted in random order). I fixed this, by using change_map = SortedDict() in QuerySet.split_exclude. This should probably apply to all change_maps when working with aliases.

comment:3 by Claude Paroz, 13 years ago

Note that #17762 addresses the issue about test_db_signature when NAME is empty or ":memory:".

comment:4 by Łukasz Rekucki, 13 years ago

Almost done, a WIP branch is at https://github.com/lqc/django/tree/randomhash_fixes. All tests pass except for 3 fixture related, which compare serialized XML. Also, I didn't get to fixing the dependency_order() function yet and I need an extra test case for the ORM bug (i.e. one that shows the issue exists even with seed=0).

comment:5 by Alex Gaynor, 13 years ago

Owner: changed from nobody to Alex Gaynor

Assigning to myself.

comment:6 by Łukasz Rekucki, 13 years ago

Fixed the dependency_order() function: https://github.com/lqc/django/commit/1b87e80abeee1257cc248f8c1c438d4efee39610. I gave up trying to come up with a test case for the ORM bug - it would require generating "U%d" tables aliases sequence which get ordered such that "U0" is returned last, not sure if it's even possible with HASHSEED=0. But the issue is quite visible with HASHSEED=42 ;)

I'm only missing assertXMLEqual() implementation. The doctests OutputChecker has one called "check_output_xml", but I didn't want to just copy & paste this and at the moment, I don't have enough time to refactor it. Hope this helps :)

Last edited 13 years ago by Łukasz Rekucki (previous) (diff)

comment:7 by Taylor Mitchell, 13 years ago

I picked this effort up at the PyCon sprints. Found a few more edge cases over various other seeds. The ORM changes were reverted so we can investigate the core issue and see if there's another way to attack it.

I also found a difference in 2.7.3 that's not a result of the randomized hashing. This maybe should be a separate ticket.

My branches:

comment:8 by Taylor Mitchell, 13 years ago

Cc: taylor.mitchell@… added

comment:9 by Łukasz Rekucki, 13 years ago

Thanks :), I'm still buried in other stuff, but could you give some examples of "more edge cases over various other seeds.", so I can reproduce them ? I'll probably have some time at the end of the week to work on this (if no one beats me to it). For the record, I originally tested on Python 2.6.8;

As for the ORM bug, do you guys have any concrete approach in mind? My feeling is, that avoiding this issue in another way would involve larger changes in how the ORM constructs joins and subqueries. As this is were most ORM bugs tend appear, I wanted to make as little changes as possible to avoid introducing new ones. Maybe it would be better to split this out into a new ticket.

comment:10 by Taylor Mitchell, 13 years ago

Any known edge cases should now be resolved in my branch, sorry to imply that I left things hanging. I've run the test case over about 16 different seeds and it passes all, minus the ORM one. I did not have time to dive into the underlying source of the ORM bug. I will try to carve some time out to noodle on it this week while the sprints are happening.

comment:11 by Łukasz Rekucki, 13 years ago

Has patch: set

comment:12 by Aymeric Augustin, 13 years ago

Severity: Release blockerNormal

Unfortunately, it seems unlikely that this patch will make it into 1.4.

comment:13 by Aymeric Augustin, 13 years ago

In [17777]:

Used SortedDict instead of dict to avoid random errors that may occur when dict randomization is enabled in Python. Refs #17758. Thanks Łukasz Rekucki.

comment:14 by Łukasz Rekucki, 13 years ago

I created a seperate ticket for the django.test.simple.dependency_order() bug: #17954 and updated the pull request to reflect changes in trunk.

comment:15 by Simon Charette, 13 years ago

I think dict randomization just kicked in ubuntu precise's updates. I getting quite a lot of weird failures all related to dict ordering.

comment:16 by Alex Gaynor, 13 years ago

I'm on Ubuntu precise as well; it's only on if you pass the -R flag to Python.

comment:17 by Aymeric Augustin, 12 years ago

Severity: NormalRelease blocker

We should try to fix this in 1.5. Especially since we claim supporting Python 3.3.

comment:18 by Claude Paroz, 12 years ago

I factored out the assertXML[Not]Equal part from the existing branches to this pull request: https://github.com/django/django/pull/401

comment:19 by Ian Clelland, 12 years ago

I 've been working on this for a day or so -- I want to get 3.3 supported, and this obviously blocks that :)

I have cleaned up and updated all of the changes in lrekucki's and tmitchell's branches to work with trunk -- dropping python 2.5 support and integrating the six library utils, and I've re-organized the commits functionally. There were a couple of other tests that were missed (that's part of the problem with randomization -- it's really easy for the tests to accidentally succeed) and I've fixed those as well.

My branch is at https://github.com/clelland/django/tree/ticket_17758. I've tested against Python 2.6.7, 2.7.3, 3.2.3 and 3.3.0b1, all with *and* without hash randomization. All tests pass (except, on 3.3, which I'm working on in a separate branch at the moment.

comment:21 by Anssi Kääriäinen, 12 years ago

There is a nasty dict-randomization problem in the ORM related to how F() expressions deal with multijoins: see #18375 comment 3 and comment:5 for details.

comment:22 by Claude Paroz <claude@…>, 12 years ago

In 117e99511e0985701780ed1bcd3afd456e244ae3:

Added assertXML[Not]Equal assertions

This is especially needed to compare XML when hash randomization
is on, as attribute order may vary. Refs #17758, #19038.
Thanks Taylor Mitchell for the initial patch, and Ian Clelland for
review and cleanup.

comment:23 by Aymeric Augustin <aymeric.augustin@…>, 12 years ago

In 6fd321652aba307c86ac0df3c25c920e0e948d1e:

[1.5.x] Fixed timezone tests when dict randomization is on

Refs #17758.

Backport of 1d6b7f3 from master.

comment:24 by Aymeric Augustin, 12 years ago

The following tests still fail when hash randomization is enabled:

  • fixtures, fixtures_regress, m2m_through_regress: they compare serialised JSON and XML representations. Possible solution: add assertJSONEqual to Django's TestCase.
  • admin_filters, admin_views: they compare query strings, whose order is no longer deterministic. Possible solution: add assertQueryStringEqual to Django's TestCase. This technique also works but it isn't practical.
  • forms: <span class="foo bar"> and <span class="bar foo"> are equivalent, but assertHTMLEquals doesn't knows.
  • feedgenerator: this test looks for a hard coded string in XML, which depends on attributes order.

in reply to:  24 comment:25 by Łukasz Rekucki, 12 years ago

Replying to aaugustin:

The following tests still fail when hash randomization is enabled:

  • fixtures, fixtures_regress, m2m_through_regress: they compare serialised JSON and XML representations. Possible solution: add assertJSONEqual to Django's TestCase.

Uhmm, didn't I fix that already? :P - https://github.com/lqc/django_old/commit/a9b52c101c2f9e902728c2816767971d86718f94

  • admin_filters, admin_views: they compare query strings, whose order is no longer deterministic. Possible solution: add assertQueryStringEqual to Django's TestCase. This technique also works but it isn't practical.

https://github.com/lqc/django_old/commit/7fdeaa52db50de5bafaac4dc57f5c731c43b4fe0
https://github.com/lqc/django_old/commit/7ef7420ba051a981f9f7bd888fd6b5f65eecbf31

  • forms: <span class="foo bar"> and <span class="bar foo"> are equivalent, but assertHTMLEquals doesn't knows.

Ok, this one I don't have.

  • feedgenerator: this test looks for a hard coded string in XML, which depends on attributes order.

https://github.com/lqc/django_old/commit/d16c3248a03e6785e6cea9187f906b81c85d6c89

I guess my pull request against the old repo got messed up (https://github.com/django/django-old/pull/128), so I'll rebase the mentioned changes against the new repo.

comment:26 by Łukasz Rekucki, 12 years ago

Hmm, I just noticed clelland already did that for me. Well, I rebased some of it anyway https://github.com/lqc/django/commits/issue17758_randomhash.

Worth mentioning is that I added assertQueryStringEqual instead of sorting in urlencode (see http://bugs.python.org/issue15719) and reworked the link checks in admin_inlines a bit.

Version 0, edited 12 years ago by Łukasz Rekucki (next)

comment:27 by Aymeric Augustin, 12 years ago

Several patches were proposed by different people, and some of the problems were fixed without reference to this ticket, making the situation a bit complicated.

Currently:

  • there are patches for everything, either in clelland's branch or in lrekucki's;
  • both Luke and myself prefer to review and merge them piecewise -- at least, that's what we've done until now;
  • the next step is that core devs must review and merge these patches.

Thanks for your work, and sorry for the mess :(

in reply to:  27 comment:28 by Łukasz Rekucki, 12 years ago

Replying to aaugustin:

  • both Luke and myself prefer to review and merge them piecewise -- at least, that's what we've done until now;

That's cool, but It would be nice if you cherry-pick'ed the commits, so it's easier to track what got committed already. Thanks for your hard work reviewing :)

Last edited 12 years ago by Łukasz Rekucki (previous) (diff)

comment:29 by Luke Plant <L.Plant.98@…>, 12 years ago

In 8bc410b44536e03ee38a0087256faf367dd98dd9:

Fixed HTML comparisons of class="foo bar" and class="bar foo" in tests

Refs #17758

comment:30 by Luke Plant <L.Plant.98@…>, 12 years ago

In 2164cd00ecdb61774d05f5f278078a28c926779f:

[1.5.x] Fixed HTML comparisons of class="foo bar" and class="bar foo" in tests

Refs #17758

Backport of 8bc410b44536e03ee38a0087256faf367dd98dd9 from master

comment:31 by Luke Plant, 12 years ago

Resolution: fixed
Status: newclosed

I think this is fixed now. However, these things keep crawling out of the woodwork, it seems, because hash randomization is ... random.

Please re-open if you find more problems.

Thanks to clelland and lreckucki for all their work on this.

comment:32 by Aymeric Augustin, 12 years ago

Resolution: fixed
Status: closedreopened

Thanks Luke!

I just set up the CI server to run the test suite under Python 3.3, and I got three failures during the first run:
http://ci.djangoproject.com/job/Django/2137/database=sqlite3,python=python3.3/testReport/

In case that's relevant — the CI server uses Python 3.3 provided by http://ppa.launchpad.net/fkrull/deadsnakes/ubuntu/

comment:33 by Aymeric Augustin, 12 years ago

I managed to reproduce these failures locally:

  • cache.CacheI18nTest.test_middleware_with_streaming_response always fails
  • multiple_database.AuthTestCase.test_dumpdata fails randomly

comment:34 by Aymeric Augustin <aymeric.augustin@…>, 12 years ago

In e8f07f0378cef5a133028c502ff4064b91cd0611:

Fixed a randomly failing test under Python 3.

Refs #17758.

comment:35 by Aymeric Augustin <aymeric.augustin@…>, 12 years ago

In 1114d8203ea48f4a1021f59a87954d321d8b1fa5:

[1.5.x] Fixed a randomly failing test under Python 3.

Refs #17758.

Backport of e8f07f0 from master.

comment:36 by Aymeric Augustin, 12 years ago

Owner: changed from Alex Gaynor to Aymeric Augustin
Status: reopenednew

The last failure actually reveals a bug in [4b278131].

comment:37 by Aymeric Augustin <aymeric.augustin@…>, 12 years ago

In cd914175c8209c5f366e87d94f8f6d050347757d:

[1.5.x] Prevented caching of streaming responses.

The test introduced in 4b278131 accidentally passed because of a
limitation of Python < 3.3.

Refs #17758, #7581.

Backport of 1c8be95 from master.

comment:38 by Aymeric Augustin <aymeric.augustin@…>, 12 years ago

In 1c8be95a864540d416602577d1aa03d58ba33168:

Prevented caching of streaming responses.

The test introduced in 4b278131 accidentally passed because of a
limitation of Python < 3.3.

Refs #17758, #7581.

comment:39 by Aymeric Augustin, 12 years ago

Resolution: fixed
Status: newclosed

All known issues are fixed now.

Please re-re-open if you find more.

Note: See TracTickets for help on using tickets.
Back to Top