Opened 11 months ago

Closed 11 months ago

Last modified 11 months ago

#35187 closed Bug (fixed)

@sensitive_variables/sensitive_post_parameters decorators crash with .pyc-only builds.

Reported by: Marcus Hoffmann Owned by: Mariusz Felisiak
Component: Core (Other) Version: 5.0
Severity: Release blocker Keywords:
Cc: Marcus Hoffmann, William Schwartz, Jon Janzen Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

Since upgrading django to version 5.0.1 (5.0 untested currently) in buildroot (an embedded rootfs build toolkit) django stopped working correctly in the buildroot default configuration. By default buildroot compiles all installed python modules to .pyc files and removes the .py files.

There's been some discussion around this but the conclusion seems to be, that this is an officially supported way of running cpython: https://github.com/python/cpython/issues/95827

Here's a test run output of creating a new django project and invoking manage.py from it inside a minmal buildroot created image: https://gitlab.com/buildroot.org/buildroot/-/jobs/6148209453#L181

Creating the project works but manage.py fails with:

Traceback (most recent call last):
  File "/opt/testsite/./manage.py", line 22, in <module>
    main()
  File "/opt/testsite/./manage.py", line 18, in main
    execute_from_command_line(sys.argv)
  File "/usr/lib/python3.12/site-packages/django/core/management/__init__.py", line 442, in execute_from_command_line
  File "/usr/lib/python3.12/site-packages/django/core/management/__init__.py", line 416, in execute
  File "/usr/lib/python3.12/site-packages/django/__init__.py", line 24, in setup
  File "/usr/lib/python3.12/site-packages/django/apps/registry.py", line 91, in populate
  File "/usr/lib/python3.12/site-packages/django/apps/config.py", line 193, in create
  File "/usr/lib/python3.12/importlib/__init__.py", line 90, in import_module
  File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 994, in exec_module
  File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
  File "/usr/lib/python3.12/site-packages/django/contrib/admin/__init__.py", line 2, in <module>
  File "/usr/lib/python3.12/site-packages/django/contrib/admin/filters.py", line 12, in <module>
  File "/usr/lib/python3.12/site-packages/django/contrib/admin/options.py", line 33, in <module>
  File "/usr/lib/python3.12/site-packages/django/contrib/auth/__init__.py", line 96, in <module>
  File "/usr/lib/python3.12/site-packages/django/views/decorators/debug.py", line 50, in decorator
  File "/usr/lib/python3.12/inspect.py", line 1264, in getsourcelines
  File "/usr/lib/python3.12/inspect.py", line 1093, in findsource
OSError: could not get source code

This seems to specifically fail somewhere in loading contrib.auth, if I remove (in a local test) both the 'django.contrib.admin' and 'django.contrib.auth' apps from the list of installed apps then manage.py starts to work correctly.

Change History (17)

comment:1 by Marcus Hoffmann, 11 months ago

This was very likely introduced by https://github.com/django/django/pull/16831
I don't exactly understand what this is doing but inspect.getsourcelines(wrapped_func) will throw in case there's no corresponding source file (which we don't in a pyc only distribution). The question now is, can this be solved another way?

Last edited 11 months ago by Marcus Hoffmann (previous) (diff)

comment:2 by Natalia Bidart, 11 months ago

Component: UncategorizedCore (Other)

Hello Marcus, thank you for your report and for helping making Django better.

I'm trying to triage this bug properly but I would like to understand a few things before making any further decision:

  • In the CPython thread, while they say that source-les imports are supported, there seems to be a general pushback around it ("we historically discourage them unless you have a pressing use-case", ".pyc-only imports are CPython-specific", "that doesn't mean any 3rd party packages or other tools support a source-less mode of operation"). You affirmed that "this is an officially supported way of running cpython", where do you read that exactly?
  • While I can see a value and valid use cases for running Python in embedded systems, I'm not sure there are use cases for running a Django web server in an embedded system. Would you have examples for wanting to run Django in a "minified" rootfs?

I'm trying to determine whether this report pertains to a specific need arising from a niche use case or if it applies to the broader ecosystem. Django is a framework designed to offer robust and accurate solutions for common scenarios.

comment:3 by Carlton Gibson, 11 months ago

Cc: William Schwartz added

This may be related to #32316, #30950 &co. If so William Schwartz was actively using _frozen apps_ so may be able to advise.

(FWIW, if that's correct, and it is related, it's not clear Django really supports running frozen, although that would be a good to have)

Last edited 11 months ago by Carlton Gibson (previous) (diff)

comment:4 by William Schwartz, 11 months ago

I agree that what's happening here is the use (where the traceback shows) of inspect.getsourcelines, the documentation of which says, "An OSError is raised if the source code cannot be retrieved." I also agree that the problem started with https://github.com/django/django/pull/16831, which says,

To persist the sensitive variables list for async functions, we can use a global dict with the hash of the filename and line number of the function as the key (more formally: hash(f"{file_path}:{line_number}") and the value is the tuple or sensitive variables (or __ALL__). Then, during traceback cleanup we can read this dict by reconstructing the same key with f_code.co_filename and f_code.co_firstlineno to lookup the relevant value in the dict.

I don't have the bandwidth these days to offer an alternative solution. All I can say is that the implementation assumes the existence of a file path that Python does not guarantee exists. To be clear on this: Python does not guarantee that there is a source file or cache file associated with any object. (See discussion at #30950 and other linked tickets.) Modules without source files exist in *all* CPython instances because of the sys module, and in most CPython instances because of the frozen importlib submodules. They occur frequently in embedded instance of Python, e.g., when running code directly from pyc cache files (which I have always [back to 2010] understood was a fully supported solution) or using a compiler such as PyOxidizer, which I use. So a correct solution to whatever https://github.com/django/django/pull/16831 was trying to fix for async functions cannot rely on source files. I think what went wrong was essentially what comment:10:ticket:30950 warned of:

I could be wrong but I believe that whatever we do to tackle this ticket, it's important ensure that file isn't added back by new contributions in the future. Ideally this shouldn't be a burden on reviewers, there should be a linting rule, or a test or something that ensures that any contributions following the resolution of this ticket isn't adding this back.

Ideally there'd be a test run of Django in CI that runs in embedded form on an embedded project/app (whether cache-only, frozen, or PyOxidered or whatever). That would have caught this issue since __file__ isn't being directly used, just indirectly via inspect.getsourcelines. Unfortunately I don't have the time to do this myself.

PS, I'm excited that someone else is using Django embedded! FWIW I've been successfully using Django 3.2.13 with PyOxidizer Python applications for awhile. However, I haven't embedded Django itself, which continues to exist in source form next to my applications, i.e., only my own application code is embedded in the PyOxidizer-generated binary, not Django itself. Moreover, we do not use the admin or auth contrib apps. That said, I would certainly prefer to embed Django one day. If I recall correctly, at the time I ran out of budget to work on this, the barrier to embedding Django for us was the loaddata command.

Last edited 11 months ago by William Schwartz (previous) (diff)

comment:5 by Mariusz Felisiak, 11 months ago

Cc: Jon Janzen added

As far as I'm aware, the only way to fix this is to revert 38e391e95fe5258bc6d2467332dc9cd44ce6ba52.

comment:6 by Jon Janzen, 11 months ago

As far as I'm aware, the only way to fix this is to revert

I think there are alternatives, I don't really have time to explore them right now but:

  1. Use a different cache key, like swap the line number for the function name perhaps
  2. I think the code object stores the line numbers right? So perhaps there's a way to fetch the code object and get the line number that way.

If the above commit is reverted I think we'd have to think about reverting other things that depend on it like https://github.com/django/django/pull/16636

in reply to:  6 ; comment:7 by Mariusz Felisiak, 11 months ago

Replying to Jon Janzen:

I think there are alternatives, I don't really have time to explore them right now but:

  1. Use a different cache key, like swap the line number for the function name perhaps

Great idea! Marcus, does it work for you?

  • django/views/decorators/debug.py

    diff --git a/django/views/decorators/debug.py b/django/views/decorators/debug.py
    index 7ea8a540de..6540fc0651 100644
    a b def sensitive_variables(*variables):  
    4747
    4848            try:
    4949                file_path = inspect.getfile(wrapped_func)
    50                 _, first_file_line = inspect.getsourcelines(wrapped_func)
    5150            except TypeError:  # Raises for builtins or native functions.
    5251                raise ValueError(
    5352                    f"{func.__name__} cannot safely be wrapped by "
    def sensitive_variables(*variables):  
    5554                    "Python file (not a builtin or from a native extension)."
    5655                )
    5756            else:
    58                 key = hash(f"{file_path}:{first_file_line}")
     57                first_line_number = wrapped_func.__code__.co_firstlineno
     58                key = hash(f"{file_path}:{first_line_number}")
    5959
    6060            if variables:
    6161                coroutine_functions_to_sensitive_variables[key] = variables

comment:8 by Mariusz Felisiak, 11 months ago

Severity: NormalRelease blocker
Summary: Django 5.0+ doesn't work with pyc files only@sensitive_variables/sensitive_post_parameters decorators crash with .pyc-only builds.
Triage Stage: UnreviewedAccepted

in reply to:  7 comment:9 by Marcus Hoffmann, 11 months ago

Replying to Mariusz Felisiak:

Great idea! Marcus, does it work for you?

  • django/views/decorators/debug.py

    diff --git a/django/views/decorators/debug.py b/django/views/decorators/debug.py
    index 7ea8a540de..6540fc0651 100644
    a b def sensitive_variables(*variables):  
    4747
    4848            try:
    4949                file_path = inspect.getfile(wrapped_func)
    50                 _, first_file_line = inspect.getsourcelines(wrapped_func)
    5150            except TypeError:  # Raises for builtins or native functions.
    5251                raise ValueError(
    5352                    f"{func.__name__} cannot safely be wrapped by "
    def sensitive_variables(*variables):  
    5554                    "Python file (not a builtin or from a native extension)."
    5655                )
    5756            else:
    58                 key = hash(f"{file_path}:{first_file_line}")
     57                first_line_number = wrapped_func.__code__.co_firstlineno
     58                key = hash(f"{file_path}:{first_line_number}")
    5959
    6060            if variables:
    6161                coroutine_functions_to_sensitive_variables[key] = variables

Thanks for the patch, I'll try and test this tomorrow!

in reply to:  7 comment:10 by Marcus Hoffmann, 11 months ago

Replying to Mariusz Felisiak:

Great idea! Marcus, does it work for you?

I can confirm, this fixes the Problem for us. :-)

comment:11 by Mariusz Felisiak, 11 months ago

Owner: changed from nobody to Mariusz Felisiak
Status: newassigned

I can confirm, this fixes the Problem for us. :-)

Thanks for checking!

comment:12 by Mariusz Felisiak, 11 months ago

Has patch: set

comment:13 by GitHub <noreply@…>, 11 months ago

Resolution: fixed
Status: assignedclosed

In d1be05b3:

Fixed #35187 -- Fixed @sensitive_variables/sensitive_post_parameters decorators crash with .pyc-only builds.

Thanks Jon Janzen for the implementation idea.

Thanks Marcus Hoffmann for the report.

Regression in 38e391e95fe5258bc6d2467332dc9cd44ce6ba52.

comment:14 by Mariusz Felisiak <felisiak.mariusz@…>, 11 months ago

In 41a4bba:

[5.0.x] Fixed #35187 -- Fixed @sensitive_variables/sensitive_post_parameters decorators crash with .pyc-only builds.

Thanks Jon Janzen for the implementation idea.

Thanks Marcus Hoffmann for the report.

Regression in 38e391e95fe5258bc6d2467332dc9cd44ce6ba52.
Backport of d1be05b3e9209fd0787841c71a95819d81061187 from main

comment:15 by Jon Janzen, 11 months ago

I've put up a PR to add pyc-only builds to Django's daily tests.

Marcus, if you wouldn't mind can you take a look and verify that the checks are sufficient to ensure buildroot/pyc-only builds won't break again?

comment:16 by Marcus Hoffmann, 11 months ago

@Jon: The test looks good to me! Thanks for adding it!

comment:17 by Mariusz Felisiak <felisiak.mariusz@…>, 11 months ago

In 6feaad9:

Refs #30950, Refs #35187 -- Added tests for byte-compiled Django to daily builds.

Note: See TracTickets for help on using tickets.
Back to Top