#35187 closed Bug (fixed)
@sensitive_variables/sensitive_post_parameters decorators crash with .pyc-only builds.
Reported by: | Marcus Hoffmann | Owned by: | Mariusz Felisiak |
---|---|---|---|
Component: | Core (Other) | Version: | 5.0 |
Severity: | Release blocker | Keywords: | |
Cc: | Marcus Hoffmann, William Schwartz, Jon Janzen | Triage Stage: | Accepted |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
Since upgrading django to version 5.0.1 (5.0 untested currently) in buildroot (an embedded rootfs build toolkit) django stopped working correctly in the buildroot default configuration. By default buildroot compiles all installed python modules to .pyc files and removes the .py files.
There's been some discussion around this but the conclusion seems to be, that this is an officially supported way of running cpython: https://github.com/python/cpython/issues/95827
Here's a test run output of creating a new django project and invoking manage.py from it inside a minmal buildroot created image: https://gitlab.com/buildroot.org/buildroot/-/jobs/6148209453#L181
Creating the project works but manage.py fails with:
Traceback (most recent call last): File "/opt/testsite/./manage.py", line 22, in <module> main() File "/opt/testsite/./manage.py", line 18, in main execute_from_command_line(sys.argv) File "/usr/lib/python3.12/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line File "/usr/lib/python3.12/site-packages/django/core/management/__init__.py", line 416, in execute File "/usr/lib/python3.12/site-packages/django/__init__.py", line 24, in setup File "/usr/lib/python3.12/site-packages/django/apps/registry.py", line 91, in populate File "/usr/lib/python3.12/site-packages/django/apps/config.py", line 193, in create File "/usr/lib/python3.12/importlib/__init__.py", line 90, in import_module File "<frozen importlib._bootstrap>", line 1387, in _gcd_import File "<frozen importlib._bootstrap>", line 1360, in _find_and_load File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked File "<frozen importlib._bootstrap>", line 935, in _load_unlocked File "<frozen importlib._bootstrap_external>", line 994, in exec_module File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed File "/usr/lib/python3.12/site-packages/django/contrib/admin/__init__.py", line 2, in <module> File "/usr/lib/python3.12/site-packages/django/contrib/admin/filters.py", line 12, in <module> File "/usr/lib/python3.12/site-packages/django/contrib/admin/options.py", line 33, in <module> File "/usr/lib/python3.12/site-packages/django/contrib/auth/__init__.py", line 96, in <module> File "/usr/lib/python3.12/site-packages/django/views/decorators/debug.py", line 50, in decorator File "/usr/lib/python3.12/inspect.py", line 1264, in getsourcelines File "/usr/lib/python3.12/inspect.py", line 1093, in findsource OSError: could not get source code
This seems to specifically fail somewhere in loading contrib.auth, if I remove (in a local test) both the 'django.contrib.admin' and 'django.contrib.auth' apps from the list of installed apps then manage.py starts to work correctly.
Change History (17)
comment:2 by , 9 months ago
Component: | Uncategorized → Core (Other) |
---|
Hello Marcus, thank you for your report and for helping making Django better.
I'm trying to triage this bug properly but I would like to understand a few things before making any further decision:
- In the CPython thread, while they say that source-les imports are supported, there seems to be a general pushback around it ("we historically discourage them unless you have a pressing use-case", ".pyc-only imports are CPython-specific", "that doesn't mean any 3rd party packages or other tools support a source-less mode of operation"). You affirmed that "this is an officially supported way of running cpython", where do you read that exactly?
- While I can see a value and valid use cases for running Python in embedded systems, I'm not sure there are use cases for running a Django web server in an embedded system. Would you have examples for wanting to run Django in a "minified" rootfs?
I'm trying to determine whether this report pertains to a specific need arising from a niche use case or if it applies to the broader ecosystem. Django is a framework designed to offer robust and accurate solutions for common scenarios.
comment:3 by , 9 months ago
Cc: | added |
---|
This may be related to #32316, #30950 &co. If so William Schwartz was actively using _frozen apps_ so may be able to advise.
comment:4 by , 9 months ago
I agree that what's happening here is the use (where the traceback shows) of inspect.getsourcelines, the documentation of which says, "An OSError
is raised if the source code cannot be retrieved." I also agree that the problem started with https://github.com/django/django/pull/16831, which says,
To persist the sensitive variables list for async functions, we can use a global dict with the hash of the filename and line number of the function as the key (more formally:
hash(f"{file_path}:{line_number}")
and the value is the tuple or sensitive variables (or__ALL__
). Then, during traceback cleanup we can read this dict by reconstructing the same key withf_code.co_filename
andf_code.co_firstlineno
to lookup the relevant value in the dict.
I don't have the bandwidth these days to offer an alternative solution. All I can say is that the implementation assumes the existence of a file path that Python does not guarantee exists. To be clear on this: Python does not guarantee that there is a source file or cache file associated with any object. (See discussion at #30950 and other linked tickets.) Modules without source files exist in *all* CPython instances because of the sys
module, and in most CPython instances because of the frozen importlib
submodules. They occur frequently in embedded instance of Python, e.g., when running code directly from pyc cache files (which I have always [back to 2010] understood was a fully supported solution) or using a compiler such as PyOxidizer, which I use. So a correct solution to whatever https://github.com/django/django/pull/16831 was trying to fix for async functions cannot rely on source files. I think what went wrong was essentially what comment:10:ticket:30950 warned of:
I could be wrong but I believe that whatever we do to tackle this ticket, it's important ensure that file isn't added back by new contributions in the future. Ideally this shouldn't be a burden on reviewers, there should be a linting rule, or a test or something that ensures that any contributions following the resolution of this ticket isn't adding this back.
Ideally there'd be a test run of Django in CI that runs in embedded form on an embedded project/app (whether cache-only, frozen, or PyOxidered or whatever). That would have caught this issue since __file__
isn't being directly used, just indirectly via inspect.getsourcelines
. Unfortunately I don't have the time to do this myself.
PS, I'm excited that someone else is using Django embedded! FWIW I've been successfully using Django 3.2.13 with PyOxidizer Python applications for awhile. However, I haven't embedded Django itself, which continues to exist in source form next to my applications, i.e., only my own application code is embedded in the PyOxidizer-generated binary, not Django itself. Moreover, we do not use the admin or auth contrib apps. That said, I would certainly prefer to embed Django one day. If I recall correctly, at the time I ran out of budget to work on this, the barrier to embedding Django for us was the loaddata
command.
comment:5 by , 9 months ago
Cc: | added |
---|
As far as I'm aware, the only way to fix this is to revert 38e391e95fe5258bc6d2467332dc9cd44ce6ba52.
follow-up: 7 comment:6 by , 9 months ago
As far as I'm aware, the only way to fix this is to revert
I think there are alternatives, I don't really have time to explore them right now but:
- Use a different cache key, like swap the line number for the function name perhaps
- I think the code object stores the line numbers right? So perhaps there's a way to fetch the code object and get the line number that way.
If the above commit is reverted I think we'd have to think about reverting other things that depend on it like https://github.com/django/django/pull/16636
follow-ups: 9 10 comment:7 by , 9 months ago
Replying to Jon Janzen:
I think there are alternatives, I don't really have time to explore them right now but:
- Use a different cache key, like swap the line number for the function name perhaps
Great idea! Marcus, does it work for you?
-
django/views/decorators/debug.py
diff --git a/django/views/decorators/debug.py b/django/views/decorators/debug.py index 7ea8a540de..6540fc0651 100644
a b def sensitive_variables(*variables): 47 47 48 48 try: 49 49 file_path = inspect.getfile(wrapped_func) 50 _, first_file_line = inspect.getsourcelines(wrapped_func)51 50 except TypeError: # Raises for builtins or native functions. 52 51 raise ValueError( 53 52 f"{func.__name__} cannot safely be wrapped by " … … def sensitive_variables(*variables): 55 54 "Python file (not a builtin or from a native extension)." 56 55 ) 57 56 else: 58 key = hash(f"{file_path}:{first_file_line}") 57 first_line_number = wrapped_func.__code__.co_firstlineno 58 key = hash(f"{file_path}:{first_line_number}") 59 59 60 60 if variables: 61 61 coroutine_functions_to_sensitive_variables[key] = variables
comment:8 by , 9 months ago
Severity: | Normal → Release blocker |
---|---|
Summary: | Django 5.0+ doesn't work with pyc files only → @sensitive_variables/sensitive_post_parameters decorators crash with .pyc-only builds. |
Triage Stage: | Unreviewed → Accepted |
comment:9 by , 9 months ago
Replying to Mariusz Felisiak:
Great idea! Marcus, does it work for you?
django/views/decorators/debug.py
diff --git a/django/views/decorators/debug.py b/django/views/decorators/debug.py index 7ea8a540de..6540fc0651 100644
a b def sensitive_variables(*variables): 47 47 48 48 try: 49 49 file_path = inspect.getfile(wrapped_func) 50 _, first_file_line = inspect.getsourcelines(wrapped_func)51 50 except TypeError: # Raises for builtins or native functions. 52 51 raise ValueError( 53 52 f"{func.__name__} cannot safely be wrapped by " … … def sensitive_variables(*variables): 55 54 "Python file (not a builtin or from a native extension)." 56 55 ) 57 56 else: 58 key = hash(f"{file_path}:{first_file_line}") 57 first_line_number = wrapped_func.__code__.co_firstlineno 58 key = hash(f"{file_path}:{first_line_number}") 59 59 60 60 if variables: 61 61 coroutine_functions_to_sensitive_variables[key] = variables
Thanks for the patch, I'll try and test this tomorrow!
comment:10 by , 9 months ago
Replying to Mariusz Felisiak:
Great idea! Marcus, does it work for you?
I can confirm, this fixes the Problem for us. :-)
comment:11 by , 9 months ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
I can confirm, this fixes the Problem for us. :-)
Thanks for checking!
comment:15 by , 9 months ago
I've put up a PR to add pyc-only builds to Django's daily tests.
Marcus, if you wouldn't mind can you take a look and verify that the checks are sufficient to ensure buildroot/pyc-only builds won't break again?
This was very likely introduced by https://github.com/django/django/pull/16831
I don't exactly understand what this is doing but
inspect.getsourcelines(wrapped_func)
will throw in case there's no corresponding source file (which we don't in a pyc only distribution). The question now is, can this be solved another way?