Opened 22 months ago

Closed 20 months ago

Last modified 18 months ago

#34322 closed Bug (fixed)

ManifestStaticFilesStorage crashes on commented JavaScript import statements

Reported by: Adam Johnson Owned by: Mariusz Felisiak
Component: contrib.staticfiles Version: 4.2
Severity: Release blocker Keywords:
Cc: Triage Stage: Accepted
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description (last modified by Adam Johnson)

#32319 added module support to ManifestStaticFilesStorage. It can crash with imports in comments, which are used for Typescript (docs) but don't necessarily resolve when code is bundled.

Example from htmx:

//** @type {import("./htmx").HtmxApi} */

Leads to:

whitenoise.storage.MissingFileError: The file 'example/dist/htmx' could not be found with <whitenoise.storage.CompressedManifestStaticFilesStorage object at 0x16ff630a0>.

The JS file 'example/dist/app.js' references a file which could not be found:
  example/dist/htmx

Please check the URL references in this JS file, particularly any
relative paths which might be pointing to the wrong location.

The regex should be adjusted to only select imports that are alone on a line, with whitespace.

This may be a challenge as comments can be multi-line like:

        /**
         * @param {HTMLElement} elt
         * @returns {import("./htmx").HtmxTriggerSpecification[]}
         */

Attachments (1)

js-commented-regions.patch (3.1 KB ) - added by Adam Johnson 20 months ago.
PoC patch to avoid commented regions

Download all attachments as: .zip

Change History (32)

comment:1 by Carlton Gibson, 22 months ago

Severity: NormalRelease blocker
Triage Stage: UnreviewedAccepted

Grrr. OK. Sigh. Thanks for the report Adam. (I didn't run it, but I'm going to trust you)

(Mariusz was right to be sceptical that this was going to be OK. . :)

Let's look at a fix here but:

I think we need to be prepared to say where we're not going to try more complex solutions.
(https://code.djangoproject.com/ticket/32319#comment:21)

comment:2 by Adam Johnson, 22 months ago

Summary: ManifestStaticFilesStorage crashesManifestStaticFilesStorage crashes on commented JavaScript import statements

comment:3 by Adam Johnson, 21 months ago

Description: modified (diff)

Added more detail to description about possible failures

comment:4 by Adam Johnson, 21 months ago

I think it's worth noting that this kind of failure was always possible with the other replaced regexes. Commented url() or @import lines in a CSS file could always fail in a similar way. It seems that the TypeScript syntax has made it way more likely that there will be commented unresolvable imports in JavaScript files.

comment:5 by Claude Paroz, 21 months ago

I wouldn't be shocked if we would document that imports in comments are currently not supported in ManifestStaticFilesStorage. Is the situation worse than when the import directive wasn't supported at all?

comment:6 by Adam Johnson, 21 months ago

In the case I found, imports in comments aren't really controllable in the toolchain. Either you use the webpack dev mode and output them, or you remove all comments and minify etc. So in dev you don't want the comments there.

comment:7 by Carlton Gibson, 21 months ago

Adam, do you have any ideas for the way forward? — A regex vs multi-line comments 😬 — "...you remove all comments..." — is that something we can require?

Current status is that without some approach we may need to revert (looking at 4.2b1 next week).

comment:8 by Claude Paroz, 21 months ago

This is in fact just one more occurrence of #21080, so the fact that ManifestStaticFileStorage is broken with commented code, be it in CSS or JS, is a known fact since some time. Every time we'll extend ManifestStaticFileStorage functionality, we are going to worsen the situation a bit more.

One possible mitigation could be to condition JS import support to a class attribute, so it would at least allow people wanting to opt out of this new functionality to subclass ManifestStaticFileStorage to fix any possible issue with commented imports. I plead to keep this functionality in 4.2 to allow those writing modern module-based JS in their projects to safely use ManifestStaticFileStorage.

in reply to:  8 comment:9 by Mariusz Felisiak, 21 months ago

Replying to Claude Paroz:

This is in fact just one more occurrence of #21080, so the fact that ManifestStaticFileStorage is broken with commented code, be it in CSS or JS, is a known fact since some time. Every time we'll extend ManifestStaticFileStorage functionality, we are going to worsen the situation a bit more.

I'm happy to close it as a duplicate of #21080. It's a long standing and well known issue, ManifestStaticFilesStorage crashes on references inside comments.

comment:10 by Mariusz Felisiak, 21 months ago

I think documenting this issue is justified as it's been 9 years since it was opened, see PR.

comment:11 by GitHub <noreply@…>, 21 months ago

In bae053d4:

Refs #21080, Refs #34322 -- Added warning to ManifestStaticFilesStorage docs about paths in comments.

comment:12 by Mariusz Felisiak <felisiak.mariusz@…>, 21 months ago

In e1c74bf4:

[4.2.x] Refs #21080, Refs #34322 -- Added warning to ManifestStaticFilesStorage docs about paths in comments.

Backport of bae053d497ba8a8de7e4f725973924bfb1885fd2 from main.

comment:13 by Mariusz Felisiak <felisiak.mariusz@…>, 21 months ago

In 60be590:

[4.1.x] Refs #21080, Refs #34322 -- Added warning to ManifestStaticFilesStorage docs about paths in comments.

Backport of bae053d497ba8a8de7e4f725973924bfb1885fd2 from main.

comment:14 by Mariusz Felisiak, 21 months ago

Resolution: duplicate
Severity: Release blockerNormal
Status: newclosed

Duplicate of #21080.

comment:15 by Adam Johnson, 21 months ago

I'm not really happy about closing this with just a warning. The previous issue covered references inside CSS comments - something that is rare. The new issue with JS comments is something we can expect to be very common. Currently Django 4.2 will make using two fairly common things - htmx (non-minified) and WhiteNoise - crash collectstatic with an obscure error message.

We can do better, we have django.utils.jslex which can be used to identify import statements, rather than relying on fragile regexes. I started working on a patch but didn't get far, I don't get much time for open source at the moment.

comment:16 by Claude Paroz, 21 months ago

Adam, is your current patch shareable?

If the current state is not acceptable for 4.2, we could still make that feature togglable with an opt-in/opt-out flag on the class as I suggested on comment:8.

by Adam Johnson, 20 months ago

Attachment: js-commented-regions.patch added

PoC patch to avoid commented regions

comment:17 by Adam Johnson, 20 months ago

I just attached the exploratory patch. I'll explain the status below, but first some fresh shower thoughts on impact.

This bug probably affects many projects. Lots of projects use a bunch of JavaScript for their frontend, and the number of packages a typical JS project *probably* means at least one uses TypeScript import comments. It's also mostly outside of users' control - they need React, it pulls in 50 other things, they can't easily remove the comments.

Despite the impact, I think it's also *unlikely* to be detected by many people in Django's beta phase. The standard advice is to test with your project's test suite, which would does not include collectstatic. (In fact, I only "accidentally" ran collectstatic, it wasn't on my checklist.)

Case in point: the CSS source map change to HashedFilesMixin in Django 4.1 (#33353) broke Django Rest Framework, but no one found that until the final release. The fix was simple in that case: add the missing source map file to DRF (https://github.com/encode/django-rest-framework/pull/8591). But in this case, there's no simple fix. Users are unlikely to be able to edit out every typescript comment from all their libraries.

Okay, on the patch I started on... it's quite exploratory really. It tries to find all commented regions within the JS file, and skip processing import statements within those. But I realized I would *really* need to expand the approach to all kinds of string literals that might mention the word "import", and then I got to the point of realizing that regexes just won't cut it (not a huge surprise in retrospect).

To avoid all false positives, we need to properly lex/tokenize the JS so we can partially-parse only real import statements. Luckily Django already has a JS lexer, but it would still be some work.

The patch currently fails with JS regex literals that contain the word "import", or quote marks/backticks (because they match a string start):

const importRe = /import/;
const doubleQuoteRe = /"/;

django.utils.jslex should be able to parse past those.

I would suggest that we *stop* processing files with regexes, at least JavaScript files, and instead tokenize with jslex. Then we can process the rules individually.

An optimization would be to skip tokenzing files that do not contain the words sourceMappingURL, import, or export (which a first-pass regex could detect). This would probably be required to avoid noticeable slowdown, since the tokenization process is almost certainly slower than the current regex approach.

This change feels quite heavy though, I'm not sure it's something we'd want to do in a beta release?

I was using a test project with an app.js containing:

// These should not be processed
// @returns {import("./non-existent-1").something}
/* @returns {import("./non-existent-2").something} */
'import("./non-existent-3")'
"import('./non-existent-4')"
`import("./non-existent-5")`
r = /import/;
// This should be processed
import example from "./module.js";

And an empty file module.js next to it.

comment:18 by Claude Paroz, 20 months ago

This change feels quite heavy though, I'm not sure it's something we'd want to do in a beta release?

Indeed, it seems late for such a change.

Here's a possible opt-in implementation of that new functionality, without docs, just for exploration and opinion: https://github.com/claudep/django/commit/0eaf4f3542a64c8df66dc92e8574ab45d16053fd

in reply to:  18 comment:19 by Mariusz Felisiak, 20 months ago

Replying to Claude Paroz:

This change feels quite heavy though, I'm not sure it's something we'd want to do in a beta release?

Indeed, it seems late for such a change.

Here's a possible opt-in implementation of that new functionality, without docs, just for exploration and opinion: https://github.com/claudep/django/commit/0eaf4f3542a64c8df66dc92e8574ab45d16053fd

Agreed, we can change this to an experimental opt-in feature. Thanks both!

comment:20 by Sebastiaan Zeeff, 20 months ago

I'm also running into this problem and I don't thinks it's confined to comments. Unfortunately, it only comes up when you really start working with the 4.2b1 instead of just running test suites without running collectstatic.

I added a comment to the old issue (ten years old), but since discussion is going on here as well, here's a link: https://code.djangoproject.com/ticket/21080#comment:26

I fear that this change in scanning for imports may break a lot more projects and a lot of those project are probably using third-party JavaScript/CSS dependencies that they can't easily change to circumvent false positives in the import scanning procedure of Django. The third-party package I'm using includes a widget that shows documentation, including examples of imports in JavaScript strings, and it's probably not the only third-party package that includes import examples somewhere in files (either in comments or in the "empty states" of widgets that have not been initialized yet).

in reply to:  20 ; comment:21 by Mariusz Felisiak, 20 months ago

Replying to Sebastiaan Zeeff:

I'm also running into this problem and I don't thinks it's confined to comments. Unfortunately, it only comes up when you really start working with the 4.2b1 instead of just running test suites without running collectstatic.

I added a comment to the old issue (ten years old), but since discussion is going on here as well, here's a link: https://code.djangoproject.com/ticket/21080#comment:26

I fear that this change in scanning for imports may break a lot more projects and a lot of those project are probably using third-party JavaScript/CSS dependencies that they can't easily change to circumvent false positives in the import scanning procedure of Django. The third-party package I'm using includes a widget that shows documentation, including examples of imports in JavaScript strings, and it's probably not the only third-party package that includes import examples somewhere in files (either in comments or in the "empty states" of widgets that have not been initialized yet).

Thanks for your comment. What do you think about https://code.djangoproject.com/ticket/34322#comment:18?

comment:22 by Klaas van Schelven, 20 months ago

This bug probably affects many projects. Lots of projects use a bunch of JavaScript for their frontend, and the number of packages a typical JS project *probably* means at least one uses TypeScript import comments. It's also mostly outside of users' control - they need React, it pulls in 50 other things, they can't easily remove the comments.

+1 on this sentiment

in reply to:  21 comment:23 by Sebastiaan Zeeff, 20 months ago

Replying to Mariusz Felisiak:

Replying to Sebastiaan Zeeff:

(snipped for brevity)_

Thanks for your comment. What do you think about https://code.djangoproject.com/ticket/34322#comment:18?

It's a good option for end-users, although they'll have to discover the option. The staticfiles error itself is probably going to be a bit confusing, especially to less experienced users.

It's a bit more difficult for package/plugin maintainers, as they can't control the settings of the users using their package. For my current client, I'm maintaining a package that vendors a components library and integrates it into Django's form system to make using these widgets in forms and generic edit views easy. It's used by a lot of projects, which means all of them will have to make sure that they're using the opt-in when upgrading to Django 4.2 LTS (which is probably going to be mandatory in this large organisation).

comment:24 by Mariusz Felisiak, 20 months ago

It's a good option for end-users, although they'll have to discover the option. The staticfiles error itself is probably going to be a bit confusing, especially to less experienced users.

The proposition is to make it opt-in not opt-out, so they'll have to discover the option to turn it on. Moreover, it will be documented as experimental with warning (already documented) about comments.

in reply to:  24 comment:25 by Sebastiaan Zeeff, 20 months ago

Replying to Mariusz Felisiak:

It's a good option for end-users, although they'll have to discover the option. The staticfiles error itself is probably going to be a bit confusing, especially to less experienced users.

The proposition is to make it opt-in not opt-out, so they'll have to discover the option to turn it on. Moreover, it will be documented as experimental with warning (already documented) about comments.

That means I misread the earlier conversation. I interpreted it as "it's to late to make major changes to the behavior that's currently present in the beta, but there's an opt-in patch that will solve issues for people who are experience issues with the new import detection".

Apologies for taking up your time.

comment:26 by Mariusz Felisiak, 20 months ago

Resolution: duplicate
Severity: NormalRelease blocker
Status: closednew

comment:27 by Mariusz Felisiak, 20 months ago

Owner: changed from nobody to Mariusz Felisiak
Status: newassigned

comment:28 by Mariusz Felisiak, 20 months ago

Has patch: set

comment:29 by GitHub <noreply@…>, 20 months ago

Resolution: fixed
Status: assignedclosed

In e10c1688:

Fixed #34322 -- Made ES module support to ManifestStaticFilesStorage optional.

Co-authored-by: Author: Claude Paroz <claude@…>

comment:30 by Mariusz Felisiak <felisiak.mariusz@…>, 20 months ago

In f2923306:

[4.2.x] Fixed #34322 -- Made ES module support to ManifestStaticFilesStorage optional.

Co-authored-by: Author: Claude Paroz <claude@…>
Backport of e10c1688f96e3b2d202fe401472b7b25f6105969 from main

comment:31 by Adam Johnson, 18 months ago

I have learned that these comments are part of the JSDoc standard: https://jsdoc.app/

Also a data point, Svelte is moving its process to use such JSDoc comments over TypeScript, to avoid the build process: https://github.com/sveltejs/svelte/pull/8569 . So such comments may become more common if JavaScirpt developers decide to follow this popular project’s approach.

Note: See TracTickets for help on using tickets.
Back to Top