Opened 10 years ago

Closed 11 months ago

#23759 closed New feature (fixed)

Storage.get_available_name should preserve all file extensions, not just the first one

Reported by: thenewguy Owned by: Adam Zapletal
Component: File uploads/storage Version: dev
Severity: Normal Keywords:
Cc: Triage Stage: Ready for checkin
Has patch: yes Needs documentation: no
Needs tests: no Patch needs improvement: no
Easy pickings: no UI/UX: no

Description

get_available_name only preserves the first file extension. If the file is foo.tar.gz then get_available_name will return foo.tar_RANDOM.gz instead of foo_RANDOM.tar.gz

https://github.com/django/django/blob/master/django/core/files/storage.py#L71

    def get_available_name(self, name):
        """
        Returns a filename that's free on the target storage system, and
        available for new content to be written to.
        """
        dir_name, file_name = os.path.split(name)
        file_root, file_ext = os.path.splitext(file_name)
        # If the filename already exists, add an underscore and a random 7
        # character alphanumeric string (before the file extension, if one
        # exists) to the filename until the generated filename doesn't exist.
        while self.exists(name):
            # file_ext includes the dot.
            name = os.path.join(dir_name, "%s_%s%s" % (file_root, get_random_string(7), file_ext))

        return name

Change History (10)

comment:1 by Tim Graham, 10 years ago

In general, I think a period is a valid character in a filename, so we cannot simply assume everything to the right of the first period is the file extension. Some different strategies for the issue are detailed on stackoverflow. I am not sure its up to Django to pick one, but we could at least document the caveat and link how to write your own storage subclass so someone can implement the behavior they desire. Thoughts?

comment:2 by thenewguy, 10 years ago

Sorry for the delay. I just found out status updates on tickets I am watching have been going to spam. I guess that is why I never get updates.

In this case, I do not think it is an issue if a valid part of the filename is assumed to be an extension because get_available_name does not promise to return the same filename, and it doesn't matter where the random string is in the name to be "available".

However, the mimetypes module could be used to determine if an extension was actually an extension. If mimetypes returns a type or an encoding, add the extension to the extensions list, otherwise break and assume there are no more extensions.

Also, this only happens if the submitted filename already exists.

I think it would be very nice to be able to maintain reliable file extensions out of the box. This is important for many cases, including when a user downloads the file.

Here is a sample mixin I've used in the past (keep in mind it does not query mimetypes, but that is trivial)

class GetUUIDAvailableNameStorageMixin(object):
    def get_available_name(self, name):
        """
        Returns a hex encoded uuid filename, using the extensions provided by the
        input filename, that's suitable for use in the target storage system.
        """
        exts = []
        root = ext = name
        while ext:
            root, ext = splitext(root)
            exts.insert(0, ext)
        name = "%s%s" % (uuid4().hex, "".join(exts))
        return super(GetUUIDAvailableNameStorageMixin, self).get_available_name(name) 

comment:3 by Claude Paroz, 10 years ago

I think that if we can do it without introducing too much complexity, we should do it. Could you provide a patch with tests?

comment:4 by thenewguy, 10 years ago

Owner: changed from nobody to thenewguy
Status: newassigned
Triage Stage: UnreviewedAccepted

Assigned and marked accepted per previous comment.

Last edited 8 years ago by Tim Graham (previous) (diff)

comment:5 by Tim Graham, 10 years ago

Easy pickings: unset
Type: UncategorizedNew feature
Version: 1.7master

comment:6 by Mariusz Felisiak, 4 years ago

Owner: thenewguy removed
Status: assignednew

comment:7 by Adam Zapletal, 11 months ago

Owner: set to Adam Zapletal
Status: newassigned

comment:8 by Adam Zapletal, 11 months ago

Has patch: set

comment:9 by Mariusz Felisiak, 11 months ago

Triage Stage: AcceptedReady for checkin

comment:10 by Mariusz Felisiak <felisiak.mariusz@…>, 11 months ago

Resolution: fixed
Status: assignedclosed

In eb2d49b:

Fixed #23759 -- Preserved all file extensions in Storage.get_available_name().

Note: See TracTickets for help on using tickets.
Back to Top