Opened 5 months ago

Last modified 5 months ago

#35572 assigned Cleanup/optimization

Improve performance by replacing os.listdir() with os.scandir()

Reported by: Paolo Melchiorre
Owned by: Amir Karimi
Component: Core (Other)
Version: dev
Severity: Normal
Keywords: scandir listdir python os
Cc: Paolo Melchiorre
Triage Stage: Accepted
Has patch: no
Needs documentation: no
Needs tests: no
Patch needs improvement: no
Easy pickings: no
UI/UX: no

Description

Use os.scandir() instead of os.listdir() in the remaining occurrences in the code:
https://github.com/search?q=repo%3Adjango%2Fdjango+os.listdir&type=code

According to the Python documentation:

Using scandir() instead of listdir() can significantly increase the performance of code that also needs file type or file attribute information, because os.DirEntry objects expose this information if the operating system provides it when scanning a directory.
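A minimal sketch of the difference the documentation describes, assuming a task that needs to know which entries are directories. The helper names `subdirs_listdir` and `subdirs_scandir` are hypothetical and do not come from the Django code base:

```python
import os
import tempfile


def subdirs_listdir(path):
    # os.listdir() returns names only, so every entry needs a separate
    # os.path.isdir() call (an extra stat system call) to learn its type.
    return sorted(
        name
        for name in os.listdir(path)
        if os.path.isdir(os.path.join(path, name))
    )


def subdirs_scandir(path):
    # os.scandir() yields os.DirEntry objects. entry.is_dir() can usually
    # answer from information gathered during the scan, if the operating
    # system provides it, so no extra system call is needed in most cases.
    with os.scandir(path) as entries:
        return sorted(entry.name for entry in entries if entry.is_dir())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.mkdir(os.path.join(tmp, "locale"))
        open(os.path.join(tmp, "README"), "w").close()
        # Both helpers report the same result; only the cost differs.
        assert subdirs_listdir(tmp) == subdirs_scandir(tmp) == ["locale"]
```

The gain only applies when the type or attribute check is actually needed; code that uses the names alone sees no such saving.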

Change History (4)

comment:1 by Sarah Boyce, 5 months ago

Triage Stage: Unreviewed → Accepted

Accepting, as this is similar to #29689. Thank you.
Note that additional benchmarks in django-asv are always welcome 👍

comment:2 by Amir Karimi, 5 months ago

Owner: set to Amir Karimi
Status: new → assigned

comment:3 by Tim Graham, 5 months ago

Component: Uncategorized → Core (Other)

The description makes it sound like this is a simple find-and-replace. However, do all usages "also need file type or file attribute information"?

comment:4 by Amir Karimi, 5 months ago, in reply to comment:3

Replying to Tim Graham:

The description makes it sound like this is a simple find-and-replace. However, do all usages "also need file type or file attribute information"?

Good point! Except for this case: https://github.com/django/django/blob/aa74c4083e047473ac385753e047e075e8f04890/scripts/manage_translations.py#L42
I didn't find any other usages where file attributes (is_dir, etc.) are needed; the rest only need the entry names or the number of entries returned by os.listdir(). The only edge that os.scandir() may still have is its lower memory consumption for large directories (though I am not sure that applies to any of these usages).
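To illustrate the memory point, here is a rough sketch, assuming a usage that only needs the number of entries or an emptiness check. The helper names are hypothetical and are not taken from the Django code base:

```python
import os
import tempfile


def count_entries_listdir(path):
    # os.listdir() builds the full list of names in memory before len().
    return len(os.listdir(path))


def count_entries_scandir(path):
    # os.scandir() returns an iterator, so entries are consumed one at a
    # time and no list of all names is built.
    with os.scandir(path) as entries:
        return sum(1 for _ in entries)


def has_entries(path):
    # An emptiness check can stop after the first entry.
    with os.scandir(path) as entries:
        return next(entries, None) is not None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name in ("a.po", "b.po", "c.po"):
            open(os.path.join(tmp, name), "w").close()
        assert count_entries_listdir(tmp) == count_entries_scandir(tmp) == 3
        assert has_entries(tmp)
```

For small directories the difference is unlikely to be measurable, which is why a benchmark would be needed before claiming a gain for name-only usages.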
