Opened 10 years ago
Closed 10 months ago
#23759 closed New feature (fixed)
Storage.get_available_name should preserve all file extensions, not just the first one
Reported by: | thenewguy | Owned by: | Adam Zapletal |
---|---|---|---|
Component: | File uploads/storage | Version: | dev |
Severity: | Normal | Keywords: | |
Cc: | Triage Stage: | Ready for checkin | |
Has patch: | yes | Needs documentation: | no |
Needs tests: | no | Patch needs improvement: | no |
Easy pickings: | no | UI/UX: | no |
Description
get_available_name
only preserves the first file extension. If the file is foo.tar.gz
then get_available_name
will return foo.tar_RANDOM.gz
instead of foo_RANDOM.tar.gz
https://github.com/django/django/blob/master/django/core/files/storage.py#L71
def get_available_name(self, name): """ Returns a filename that's free on the target storage system, and available for new content to be written to. """ dir_name, file_name = os.path.split(name) file_root, file_ext = os.path.splitext(file_name) # If the filename already exists, add an underscore and a random 7 # character alphanumeric string (before the file extension, if one # exists) to the filename until the generated filename doesn't exist. while self.exists(name): # file_ext includes the dot. name = os.path.join(dir_name, "%s_%s%s" % (file_root, get_random_string(7), file_ext)) return name
Change History (10)
comment:1 by , 10 years ago
comment:2 by , 10 years ago
Sorry for the delay. I just found out status updates on tickets I am watching have been going to spam. I guess that is why I never get updates.
In this case, I do not think it is an issue if a valid part of the filename is assumed to be an extension because get_available_name
does not promise to return the same filename, and it doesn't matter where the random string is in the name to be "available".
However, the mimetypes module could be used to determine if an extension was actually an extension. If mimetypes returns a type or an encoding, add the extension to the extensions list, otherwise break and assume there are no more extensions.
Also, this only happens if the submitted filename already exists.
I think it would be very nice to be able to maintain reliable file extensions out of the box. This is important for many cases, including when a user downloads the file.
Here is a sample mixin I've used in the past (keep in mind it does not query mimetypes, but that is trivial)
class GetUUIDAvailableNameStorageMixin(object): def get_available_name(self, name): """ Returns a hex encoded uuid filename, using the extensions provided by the input filename, that's suitable for use in the target storage system. """ exts = [] root = ext = name while ext: root, ext = splitext(root) exts.insert(0, ext) name = "%s%s" % (uuid4().hex, "".join(exts)) return super(GetUUIDAvailableNameStorageMixin, self).get_available_name(name)
comment:3 by , 10 years ago
I think that if we can do it without introducing too much complexity, we should do it. Could you provide a patch with tests?
comment:4 by , 10 years ago
Owner: | changed from | to
---|---|
Status: | new → assigned |
Triage Stage: | Unreviewed → Accepted |
assigned and marked accepted per https://code.djangoproject.com/ticket/23759#comment:3
comment:5 by , 10 years ago
Easy pickings: | unset |
---|---|
Type: | Uncategorized → New feature |
Version: | 1.7 → master |
comment:6 by , 4 years ago
Owner: | removed |
---|---|
Status: | assigned → new |
comment:7 by , 10 months ago
Owner: | set to |
---|---|
Status: | new → assigned |
comment:8 by , 10 months ago
Has patch: | set |
---|
In general, I think a period is a valid character in a filename, so we cannot simply assume everything to the right of the first period is the file extension. Some different strategies for the issue are detailed on stackoverflow. I am not sure its up to Django to pick one, but we could at least document the caveat and link how to write your own storage subclass so someone can implement the behavior they desire. Thoughts?