#30669 closed Bug (fixed)
Can FILE_UPLOAD_MAX_MEMORY_SIZE be set to None?
Reported by: | Lincoln | Owned by: | nobody |
---|---|---|---|
Component: | HTTP handling | Version: | dev |
Severity: | Normal | Keywords: | FILE_UPLOAD_MAX_MEMORY_SIZE, None, settings |
Cc: | Andrew Godwin | Triage Stage: | Accepted |
Has patch: | no | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
The settings reference for FILE_UPLOAD_MAX_MEMORY_SIZE doesn't say whether this setting can be set to None.
The first usage of it, in asgi.py, does an `is None` check on it, suggesting it can be None.
But the second usage, in uploadhandler.py, does a `<=` comparison, which raises a TypeError when the setting is None, suggesting it shouldn't be None.
I would be happy to submit a patch if someone with more knowledge of Django can answer the question.
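The inconsistency can be demonstrated in plain Python (a minimal sketch; the variable names are illustrative, not Django's actual code):

```python
# Suppose FILE_UPLOAD_MAX_MEMORY_SIZE (default 2.5 MB) is set to None.
file_upload_max_memory_size = None

# The asgi.py-style check tolerates None:
if file_upload_max_memory_size is None:
    print("asgi.py branch: treated as 'no limit'")

# The uploadhandler.py-style check does not: Python 3 forbids
# ordering comparisons between NoneType and int.
content_length = 1024
try:
    if content_length <= file_upload_max_memory_size:
        print("kept in memory")
except TypeError as exc:
    print(f"TypeError: {exc}")
```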
Change History (12)
comment:1 by , 5 years ago
Triage Stage: | Unreviewed → Accepted |
---|
comment:2 by , 5 years ago
Cc: | added |
---|
follow-up: 4 comment:3 by , 5 years ago
Nice spot. This is a good one...
The purpose here is to provide a buffer for the request body prior to it being `read()` by the client of the request object itself.
Otherwise, we read the body into memory for each request, regardless of whether it is consumed, or how big it is, or... — which is a DoS waiting to happen.
`RequestDataTooBig` is raised normally if and when `body_file` is consumed during request processing, so we don't want to raise it here. The check is only being used for "how much memory should I consume before using the disk?"
In one sense, it's a bit naughty re-using the existing setting this way...
- Do we even need the `is None` branch, i.e. can't we just always use the `SpooledTemporaryFile`?
- It's not 100% clear why either `FILE_UPLOAD_MAX_MEMORY_SIZE` or `DATA_UPLOAD_MAX_MEMORY_SIZE` is the right value for the roll-over to disk.
- I wonder if a hook on `ASGIHandler` for `get_body_file()` or similar is worth adding (yet)?
  - Such a hook would enable me to set the roll-over as I wanted without a setting.
  - And, if I ever get the bandwidth to actually work on it (sigh), there's an idea (on the Channels repo) to wrap the `receive` awaitable in a file-like, to avoid prematurely reading the body at all. It would be easy to plug in/try out such a method with a hook in place.
  - Equally, I might say, "Always use memory", and any other strategy I like...
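The roll-over behaviour discussed above can be sketched with the standard library's `SpooledTemporaryFile` (a minimal illustration; the 1 KB threshold is arbitrary, and `_rolled` is a CPython implementation detail peeked at here only to observe the spill):

```python
import tempfile

# Buffer that keeps data in memory until max_size bytes have been
# written, then transparently spills to a real temporary file on disk.
buf = tempfile.SpooledTemporaryFile(max_size=1024)

buf.write(b"x" * 512)   # still within the in-memory limit
print(buf._rolled)      # False: data lives in an in-memory buffer

buf.write(b"x" * 1024)  # crosses the threshold
print(buf._rolled)      # True: rolled over to a disk file
buf.close()
```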
follow-up: 6 comment:4 by , 5 years ago
Replying to Carlton Gibson:
- Do we even need the `is None` branch, i.e. can't we just always use the `SpooledTemporaryFile`?

+1, receiving into a `BytesIO()` without an upper limit looks like a simple DoS attack vector.
I'm not sure about your explanation of the later size check. What if a request has "unlimited" (i.e. *very* big) length?
comment:5 by , 5 years ago
I agree that there's no situation in which it should be None; the `is None` check is likely a leftover from my testing and development.
DATA_UPLOAD_MAX_MEMORY_SIZE may well be the right thing to use - I would appreciate other people's opinion on this one since it's not totally clear to me how the failure/enforcement modes of those two settings are defined.
comment:6 by , 5 years ago
Component: | File uploads/storage → HTTP handling |
---|
Replying to Claude Paroz:
I'm not sure about your explanation of the later size check. What if a request has "unlimited" (i.e. *very* big) length?
OK, so this is where it gets tricky. 🙂
As it stands if the request is "unlimited" we'd end up using a lot™ of disk space. Whatever frontend server is in play should put an absolute limit on request sizes, and we could document the need for that in the deployment checklist, but we should have some kind of ceiling here.
`DATA_UPLOAD_MAX_MEMORY_SIZE` is not really the correct measure. At this point we want to limit the total size of the request. `DATA_UPLOAD_MAX_MEMORY_SIZE` is applied only to the `POST` data, with `FILES` being handled separately (in `MultiPartParser`, as you point out, Claude.) The danger is that we limit total request size to `DATA_UPLOAD_MAX_MEMORY_SIZE` and in so doing stop people handling large file uploads.
So, total request size should be limited to (something like) the amount of POST data I want to handle in memory (`DATA_UPLOAD_MAX_MEMORY_SIZE`) plus the maximum file size I want to allow to be uploaded (which isn't `FILE_UPLOAD_MAX_MEMORY_SIZE` either).
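That arithmetic can be written out as a tiny sketch (hypothetical values; MAX_ALLOWED_FILE_SIZE is a made-up site policy, not a Django setting):

```python
# Hypothetical ceiling for buffering an incoming request, per the
# "POST-in-memory plus largest allowed file" reasoning above.
DATA_UPLOAD_MAX_MEMORY_SIZE = 2_621_440      # Django's default: 2.5 MB
MAX_ALLOWED_FILE_SIZE = 100 * 1024 * 1024    # e.g. allow files up to 100 MB

# Total request size to tolerate before rejecting the request outright:
request_size_ceiling = DATA_UPLOAD_MAX_MEMORY_SIZE + MAX_ALLOWED_FILE_SIZE
print(request_size_ceiling)  # 107479040
```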
Whilst we need to read in the request here, I guess we can decide on a generous but sensible default, and allow folks to set their own value. Personally I'd rather avoid a setting and let people pass a parameter in their asgi.py when instantiating the application (or use an `ASGIHandler` subclass which specifies it), but what do others think?
comment:7 by , 5 years ago
As it stands if the request is "unlimited" we'd end up using a lot™ of disk space.
But this applies equally to the existing WSGI code right? Django doesn't have a limit on the total file data that can be uploaded. (Does it? — I've always set a sensible value at the nginx level...)
The only difference here is we have this extra buffer layer. (So we read the bytes, presumably to disk, twice, for the length of the request.)
Update: A quick survey of the landscape suggests the nginx-like layer is *where this is handled*. Neither werkzeug, nor gunicorn, nor Daphne, nor uvicorn enforce total request size, it seems because they don't buffer the whole request to be able to check. Like Django, Flask requires you to consume the request before you can check the size. (Issue on Werkzeug summarising some discussions.)
Comment from Tom Christie when I asked his opinion:
I think the right answer to this is: middleware for app-level protection. Nginx or whatever platform-level gateway you're on for added robustness.
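The app-level half of that suggestion could look like a Django-style middleware that rejects over-sized requests up front via the Content-Length header (a hypothetical sketch: the cap and the plain-string 413 response are stand-ins so it runs without Django, and chunked requests with no Content-Length would need separate handling):

```python
MAX_REQUEST_BODY_BYTES = 10 * 1024 * 1024  # hypothetical 10 MB app-level cap

class BodySizeLimitMiddleware:
    """Reject requests whose declared body size exceeds the cap.

    Sketch only: real Django would return HttpResponse(status=413)
    in the over-limit branch.
    """

    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        try:
            declared = int(request.META.get("CONTENT_LENGTH") or 0)
        except ValueError:
            declared = 0  # absent or malformed header: let it through
        if declared > MAX_REQUEST_BODY_BYTES:
            return "413 Payload Too Large"
        return self.get_response(request)
```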
comment:8 by , 5 years ago
PR removing the `is None` branch.
Not sure currently if/what we want to do about the rest of the discussion.
comment:10 by , 5 years ago
Thanks for the code update and the explorations. I think we can close this ticket.
I'll let you decide if you think the request size limits stuff would need a doc ticket.
comment:11 by , 5 years ago
Resolution: | → fixed |
---|---|
Status: | new → closed |
Agreed, we removed the incorrect branch, so this question is no longer valid.
comment:12 by , 5 years ago
I think leave it for now, but we're aware of it, i.e. let's see how the async work rolls out. (Most likely it's not an issue...)
I don't think there is a use case for a `None` value. To deactivate memory-based file uploads, I think that the official method is to remove the `MemoryFileUploadHandler` from the upload handlers.
In my opinion, the usage by asgi.py might be a bug. It should use `DATA_UPLOAD_MAX_MEMORY_SIZE`, and mimic the `MultiPartParser.parse()` code to raise `RequestDataTooBig` when it is exceeded. To be confirmed…
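The behaviour being proposed (count the bytes read and raise once a limit is passed) can be sketched in plain Python; the exception class below is a stand-in for `django.core.exceptions.RequestDataTooBig`, and the chunking is illustrative:

```python
class RequestDataTooBig(Exception):
    """Stand-in for django.core.exceptions.RequestDataTooBig."""

DATA_UPLOAD_MAX_MEMORY_SIZE = 2_621_440  # Django's default, 2.5 MB

def buffer_body(chunks, limit=DATA_UPLOAD_MAX_MEMORY_SIZE):
    """Accumulate body chunks, raising once the limit is exceeded,
    in the spirit of the MultiPartParser.parse() size check."""
    received = bytearray()
    for chunk in chunks:
        received += chunk
        if limit is not None and len(received) > limit:
            raise RequestDataTooBig(
                "Request exceeded settings.DATA_UPLOAD_MAX_MEMORY_SIZE."
            )
    return bytes(received)
```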