#34356 closed Bug (invalid)

Memory leak when generating PDFs

Reported by:	Robin (Robert) Thomas	Owned by:	nobody
Component:	Core (Other)	Version:	4.1
Severity:	Normal	Keywords:	memory memory-leak pdf weasyprint
Cc:		Triage Stage:	Unreviewed
Has patch:	no	Needs documentation:	no
Needs tests:	no	Patch needs improvement:	no
Easy pickings:	no	UI/UX:	no

Description (last modified by Robin (Robert) Thomas)

Context

Our app generates a one-page PDF report for users. It contains a few small SVG and PNG icons and four large text tables. The PDF is generated once, after which it is put in a storage bucket for subsequent retrieval.

Problem

The app runs Django 4.1.6 and WeasyPrint 57.2 on Heroku (heroku-22). We're not having any issues retrieving previously generated PDFs, but each time the app generates a new PDF (file size 38 KB), its memory RSS increases by 20-40 MB, as reported by Heroku. This memory usage doesn't go down until the server is restarted.

Unfortunately, Heroku doesn't automatically restart the server until both memory RSS and swap exceed the 512 MB limit, so once RSS is used up we start getting a lot of pings about OOM errors and have to restart it manually.

What we've tried

Even after removing all images, fonts, and CSS (file size 32 KB), each generation still increases the memory RSS by about 17 MB.

If we remove everything from the report template, leaving just <!DOCTYPE html><html lang="en"><head><title>Test</title><body></body></html> (file size 863 bytes), each generation increases the memory RSS by about 1.3 MB.

I opened a bug ticket about this with WeasyPrint (https://github.com/Kozea/WeasyPrint/issues/1496). They say that because they cannot reproduce this when running WeasyPrint by itself from the command line, the memory leak must be elsewhere in the ecosystem.
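To narrow down where the growth comes from, the generation step can be run in a plain loop outside any web framework while sampling process memory. The following is only a minimal sketch: `fake_generate_pdf` is a hypothetical stand-in (the ticket does not show the app's rendering code), and in a real check it would be replaced by the actual call, for example something along the lines of `weasyprint.HTML(string=html).write_pdf()`. It relies on the Unix-only `resource` module and reports peak RSS, which can only grow or stay flat.

```python
import resource


def peak_rss():
    # Peak resident set size of this process so far.
    # Units are platform-dependent: kilobytes on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def sample_peak_rss(generate, runs=5):
    """Call generate() `runs` times and record peak RSS before and after each call."""
    samples = [peak_rss()]
    for _ in range(runs):
        generate()
        samples.append(peak_rss())
    return samples


def fake_generate_pdf():
    # Hypothetical stand-in for the real PDF generation call (an assumption,
    # not the code from the ticket). It just builds and discards a buffer.
    return b"%PDF-1.7\n" + b"0" * 100_000


if __name__ == "__main__":
    for i, value in enumerate(sample_peak_rss(fake_generate_pdf)):
        print(f"after {i} generations: peak RSS = {value}")
```

If the samples keep climbing by a similar amount on every iteration in a bare script, the growth is unlikely to be caused by the web framework; if they flatten out after the first few runs, the per-request setup in the app is the next place to look.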

Reproduce

I deployed a little test app to show this in action, with a link to the source code: https://weasyprint-mem.herokuapp.com/

You can see from the attached image that every time a PDF is generated it increases the memory usage, although not always by the same amount. I would expect the data for each PDF to be garbage-collected once it has rendered.

Attachments (1)

219976184-2e826b19-eb1d-40a8-926a-b9751468f0eb.jpg (97.3 KB ) - added by Robin (Robert) Thomas 2 years ago.: Screenshot of memory usage


Change History (9)

by Robin (Robert) Thomas, 2 years ago

Attachment:	219976184-2e826b19-eb1d-40a8-926a-b9751468f0eb.jpg added

Screenshot of memory usage

comment:1 by Robin (Robert) Thomas, 2 years ago

Description:	modified (diff)

comment:2 by Robin (Robert) Thomas, 2 years ago

Description:	modified (diff)

comment:3 by Mariusz Felisiak, 2 years ago

Resolution:	→ needsinfo
Status:	new → closed

Hi, I don't think you've explained the issue in enough detail to confirm a bug in Django. Please reopen the ticket if you can debug your issue and provide details about why and where Django is at fault.

This may be a duplicate of #16022, see PR for a possible solution.

comment:4 by Robin (Robert) Thomas, 2 years ago

@mariusz The ticket you referenced is for DB file fields, and the given source code and example do not use models or a database at all.

I'll try this in Flask to see if there's a similar result. If not, then the issue must be with Django in which case I'll open a new ticket.

comment:5 by Mariusz Felisiak, 2 years ago

"If not, then the issue must be with Django in which case I'll open a new ticket."

Please don't reopen the ticket without providing extra details, i.e. why and where Django is at fault.

comment:6 by Carlton Gibson, 2 years ago

Often this is Python's garbage collection not kicking in as soon as you want. The first thing I'd do is add a gc.collect() after processing the PDF, to see if you can bring it down by hand. (If the memory is collected, it's not a leak per se. Unless there's something specific, gc behaviour is a Python issue, rather than anything Django can do.)
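The suggestion above can be tried with a small experiment like the one below. This is a hedged sketch, not code from the ticket: `fake_render` is a hypothetical stand-in that deliberately creates a reference cycle, which is the kind of garbage that reference counting alone cannot free and that waits for the cyclic collector. Note that `tracemalloc` only sees allocations made through Python's allocator, so memory held by native libraries, or RSS that the allocator does not return to the operating system, will not show up here.

```python
import gc
import tracemalloc


def traced_after(render, collect):
    """Run render() once and return the traced Python heap size (bytes) afterwards."""
    gc.collect()  # start from a clean slate
    tracemalloc.start()
    try:
        render()
        if collect:
            gc.collect()  # force a full collection by hand
        current, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return current


def fake_render():
    # Hypothetical stand-in for PDF rendering (an assumption for illustration).
    # The dict references itself, so the 1 MB buffer survives until the
    # cyclic garbage collector runs.
    node = {"payload": bytearray(1_000_000)}
    node["self"] = node


if __name__ == "__main__":
    print("without gc.collect():", traced_after(fake_render, collect=False))
    print("with gc.collect():   ", traced_after(fake_render, collect=True))
```

If the number drops sharply once gc.collect() is added, the memory was collectable garbage rather than a leak in the strict sense. If it stays high, or if RSS keeps growing while the traced Python heap stays flat, the growth is more likely to be outside the Python object graph.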

comment:7 by Robin (Robert) Thomas, 2 years ago

Getting the same behavior in Django:

https://weasyprint-mem.herokuapp.com/

...and in Flask:

https://weasyprint-mem-flask.herokuapp.com/

...so I'll leave this closed. Thanks! :)

comment:8 by Mariusz Felisiak, 2 years ago

Resolution:	needsinfo → invalid
