Opened 11 years ago

Closed 11 years ago

#21874 closed Cleanup/optimization (fixed)

Consistent handling for app modules with no (or empty) __path__

Reported by: Carl Meyer Owned by: Carl Meyer
Component: Core (Other) Version: dev
Severity: Normal Keywords: app-loading
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

This ticket is a followup from discussion that occurred on https://github.com/django/django/pull/2212 in the course of resolving #17304. The description here is lengthy as I've tried to be comprehensive, but really the issue impacts only very unusual edge-case app modules.

Currently in master, if an app module has no __path__ attribute, its appconfig gets a path of None. This is intended to indicate that the app has no directory path that can be used for finding things like templates, static assets, etc. It is likely to break some naive code that iterates over app-configs using their paths. (For instance, the current code in AppDirectoriesFinder breaks with a TypeError if any app-config has a path of None).

If an app module would have an empty __path__, currently an obscure IndexError would be raised.

If an app module has a __path__ with more than one element in it (namespace package), ImproperlyConfigured is raised.

All of the above is bypassed if the app uses an AppConfig subclass that has an explicit path as a class attribute - in that case, that path is unconditionally used and no path detection occurs.

The __file__ attribute is not currently used at all for determining an app's filesystem path.

Here are some types of app-modules that could be referenced in INSTALLED_APPS, and the contents of their __path__ and __file__ attributes:

A) An ordinary package, e.g 'myapp' where myapp is a directory containing an __init__.py. In this case, the module object's __path__ will be ['/path/to/myapp/'] and its __file__ will be '/path/to/myapp/__init__.pyc'

B) A namespace package with only a single location. In this case, __path__ is length-one, just as in (A), and there is no __file__ attribute.

C) A namespace package with multiple locations. __path__ is length > 1, and there is no __file__.

D) A single-file module, e.g. 'myapp' where there is just myapp.py. In this case, the module will have no __path__ attribute, and __file__ will be /path/to/myapp.pyc.

E) A package inside a zipped egg, e.g. 'myapp' where there is a zipped egg myapp-0.1-py2.7.egg added to sys.path in a pth file, and the zipfile internally contains a file myapp/__init__.py. In this case, __path__ will contain the bogus non-path ['/path/to/myapp-0.1-py2.7.egg/myapp/'] and __file__ will similarly be a bogus non-path '/path/to/myapp-0.1-py2.7.egg/myapp/__init__.pyc'. (A package or module within an unzipped egg is no different from case (A) or (D).)

F) A module loaded by some hypothetical custom loader which leaves __path__ empty.

So currently for cases (A), (B), and (E) we use the single entry in __path__ (even though in case (E) that path doesn't exist), and for case (C) we raise ImproperlyConfigured and require the user to explicitly supply a path on the AppConfig. For case (D) we set appconfig.path to None, and for case (F) we raise an obscure IndexError. I consider our handling of cases (D) and (F) sub-optimal.

I think it would be preferable if appconfig.path were never set to None. The algorithm I would propose for appconfig.path is this: (i) if AppConfig.path is set explicitly, use it, (ii) If app_module.__path__ exists and is length 1, use its only element, (iii) if app_module.__file__ exists, use os.path.dirname(__file__), and (iv) if no path can be deduced from the __path__ and __file__ attributes, then raise ImproperlyConfigured and tell the user they need to set the path explicitly on their AppConfig.

Raising ImproperlyConfigured for any type of app module that we supported in previous versions of Django would be a backwards-incompatible change. As far as I am aware, only cases (A) and (E) were previously explicitly supported, and neither of those would raise ImproperlyConfigured with my proposal.

Case (D) seems to work in previous versions of Django, though I don't believe that it was ever explicitly supported. In any case, it would be covered in my proposal by the fallback to os.path.dirname(__file__).

Case (F) is a bit vague, and I don't believe is a backwards-compatibility concern - I think my proposal provides reasonable handling of any combination of __path__ and __file__ that any custom loader might set.

If my proposal is rejected for whatever reason and we opt to stick with something closer to the status quo in master, then the following actions should be taken to resolve this ticket: set appconfig.path to None when __path__ exists but is empty to avoid the IndexError in case (F), and update AppDirectoriesFinder in contrib.staticfiles to not blow up if it its an appconfig.path that is None (and perhaps audit other uses of appconfig.path in Django to ensure they can handle it being None).

One issue that will need to be resolved for my proposal to move forward is that currently the test suite instantiates AppConfig in a number of places with app_module=None, which currently passes silently but under my proposal would cause ImproperlyConfigured to be raised. I plan to look into this issue and propose a patch.

Feedback on the concept welcome, in particular if I've missed any important cases.

Change History (7)

comment:1 by Aymeric Augustin, 11 years ago

Triage Stage: UnreviewedAccepted

Django has an explicity django.template.loaders.eggs.Loader to load templates from eggs, but nothing to load other files (static files, translations, etc.) So eggs don't work for loading files except for a specific case that requires explicit configuration of the template loaders. app_config.path will have a bogus value; live with it or upgrade to a more palatable package format ;-)

I assume wheels don't matter here because they're expanded to regular files during installation (but I could be wrong).

I agree with your proposal. I don't see any holes in the reasoning.

As far as I know, only the AppConfigStub subclass of AppConfig has app_module set to None. This is a dumb class provided as a convenience for the migrations system. Simply adjust its __init__ so that the tests pass -- it's an private API. EDIT: simply setting path = None on the class should do. Migrations don't use path.

Last edited 11 years ago by Aymeric Augustin (previous) (diff)

in reply to:  1 comment:2 by Carl Meyer, 11 years ago

Replying to aaugustin:

Django has an explicity django.template.loaders.eggs.Loader to load templates from eggs, but nothing to load other files (static files, translations, etc.) So eggs don't work for loading files except for a specific case that requires explicit configuration of the template loaders. app_config.path will have a bogus value; live with it or upgrade to a more palatable package format ;-)

Indeed, I agree. That's why I didn't include case (E) in my list of cases where I think we need better handling. If a module loader is going to return bogus paths, I think Django should just pass them along. (Plus, almost all uses of appconfig.path are likely to take roughly the form mypath = os.path.join(appconfig.path, 'something'); if os.path.exists(mypath): ... and a bogus path won't blow up that code.

I assume wheels don't matter here because they're expanded to regular files during installation (but I could be wrong).

That's correct - wheels and unzipped eggs are both loaded by the normal filesystem module loading process, so there is nothing special about them from this perspective.

I agree with your proposal. I don't see any holes in the reasoning.

As far as I know, only the AppConfigStub subclass of AppConfig has app_module set to None. This is a dumb class provided as a convenience for the migrations system. Simply adjust its __init__ so that the tests pass -- it's an private API. EDIT: simply setting path = None on the class should do. Migrations don't use path.

Yep, I discovered this later last night. I also discovered why Django's test suite may have misleadingly made it seem that egg-loaded modules would have no __path__; because our egg-template-loader tests have a function that creates fake eggs and stuffs them into sys.modules, and those fake eggs (unlike real ones) have no __path__.

comment:3 by Aymeric Augustin, 11 years ago

Keywords: app-loading added

comment:4 by Carl Meyer, 11 years ago

Has patch: set
Owner: changed from nobody to Carl Meyer
Status: newassigned

comment:5 by Carl Meyer, 11 years ago

It is possible that I am wrong about which way we should go on this, and that ultimately we will want to provide a way for app authors to explicitly say "this app has no filesystem location, please don't try to load any files (e.g. templates or static assets) from it."

Even with this pull request, that's possible by just giving a bogus path, like eggs do - but that's a bit ugly as an official recommendation for a real use case.

My inclination is to go with this strict (but backwards compatible) approach for 1.7 and see if use cases pop up for apps without a filesystem location. But that means that third-party code written for 1.7 and assuming that appconfig.path must be a string would later be broken if we decide that None is a valid value for appconfig.path after all.

Version 1, edited 11 years ago by Carl Meyer (previous) (next) (diff)

comment:6 by Aymeric Augustin, 11 years ago

Apps without a path are unlikely to be found in the wild; the most common problematic case is eggs that have an invalid path. Let's not agonize over a virtually non-existent problem ;-)

Patch looks good and well tested.

I have three small suggestions, bordering on bikeshedding:

  • Put a reference to this ticket in a comment, your original explanation is worth referencing.
  • Extract the logic that determines the path in a separate function, because it's getting a bit long for __init__ which is already quite long.
  • Avoid relying on errmsg in the conditionals; it may be marginally less efficient, but it's more readable:
    # Convert paths to list because Python 3.3 _NamespacePath does
    # not support indexing.
    paths = list(getattr(app_module, '__path__', []))

    # Fallback to __file__ if no suitable path is found.
    if len(paths) != 1:
        filename = getattr(app_module, '__file__', None)
        if filename is not None:
            paths = [os.path.dirname(filename)]

    # Handle errors
    # ...

comment:7 by Carl Meyer <carl@…>, 11 years ago

Resolution: fixed
Status: assignedclosed

In 88a2d39159872f6a7fced1bae591550650fd7f38:

Fixed #21874 -- Require Django applications to have a filesystem path.

Wherever possible this filesystem path is derived automatically from the app
module's path and file attributes (this avoids any
backwards-compatibility problems).

AppConfig allows specifying an app's filesystem location explicitly, which
overrides all autodetection based on path and file. This
permits Django to support any type of module as an app (namespace packages,
fake modules, modules loaded by other hypothetical non-filesystem module
loaders), as long as the app is configured with an explicit filesystem path.

Thanks Aymeric for review and discussion.

Note: See TracTickets for help on using tickets.
Back to Top