high memory usage during GET for large files and slow links #129

Closed
opened 2007-09-12 17:17:12 +00:00 by warner · 2 comments

Load testing revealed that doing a GET of a large file through a slow link
causes the memory footprint of the decoding node to balloon to the size of
the file being downloaded. The cause is simple: decode is outpacing the
download, and we're doing a naive twisted.web transport.write() for each segment.
This forces the transport to buffer all of the data that we've written and
which the client (in this case a browser on the other end of a DSL line) has
not yet received.

I can think of two possible solutions:

  • make sure that decoding is a producer/consumer process. This means we hold
    off on downloading the shares for a given segment until the consumer (in
    this case the HTTP connection) says they want more (because their buffer
    size has dropped below some value). This changes the control flow in
    download, not coincidentally mirroring a similar change in upload (to
    support offloaded-uploading #116).

  • have the decode process write the data to a temporary file on disk, and
    then pass that off to the web transport to read at its leisure (and
    delete it when finished, using an anonymous filehandle)

Doing producer/consumer probably raises the memory footprint by 1MB for each
active download (holding one segment of plaintext in memory while we wait for
the client to download it, maybe 2MB if we pipeline the next segment's
shares).
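
For illustration, here is a minimal sketch of what the producer/consumer wiring could look like with twisted.web. The downloader object and its get_next_segment() method are hypothetical stand-ins (none of these names come from the actual tahoe code); error handling is elided.

```python
from zope.interface import implementer
from twisted.internet.interfaces import IPushProducer


@implementer(IPushProducer)
class SegmentProducer:
    """Feed decoded segments to the HTTP request one at a time, honoring
    pause/resume signals from the transport (error handling elided)."""

    def __init__(self, downloader, request):
        self._downloader = downloader  # hypothetical: fetches+decodes one segment per call
        self._request = request
        self._paused = False
        self._stopped = False
        self._fetching = False

    def start(self):
        # streaming=True means twisted will call pauseProducing() when its
        # write buffer fills up, and resumeProducing() once it drains.
        self._request.registerProducer(self, streaming=True)
        self._fetch_next()

    def _fetch_next(self):
        if self._paused or self._stopped or self._fetching:
            return
        self._fetching = True
        d = self._downloader.get_next_segment()  # Deferred -> plaintext bytes, or None at EOF
        d.addCallback(self._deliver)

    def _deliver(self, segment):
        self._fetching = False
        if self._stopped:
            return                      # client went away; drop the rest
        if segment is None:             # end of file
            self._request.unregisterProducer()
            self._request.finish()
            return
        self._request.write(segment)
        self._fetch_next()

    # IPushProducer methods, driven by the transport's buffer state:
    def pauseProducing(self):
        self._paused = True

    def resumeProducing(self):
        self._paused = False
        self._fetch_next()

    def stopProducing(self):
        self._stopped = True
```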

The tempfile approach means downloads run full-throttle and then finish,
avoiding the memory overhead, but of course then we have a disk overhead of
the full file size for the duration of the download. In practice, the kernel
will cache these disk files until they get too large, then push them to an
actual disk, with a cache size varying according to whatever else is using
memory.
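
A comparable sketch of the tempfile alternative, assuming a hypothetical decode_to(filehandle) helper that writes all the plaintext into the file: decode at full speed, then let twisted's FileSender stream the result at the client's pace. tempfile.TemporaryFile() provides the anonymous filehandle, so the data vanishes when it is closed.

```python
import tempfile

from twisted.protocols.basic import FileSender


def download_via_tempfile(downloader, request):
    """Decode everything up front, then stream the tempfile at the client's pace."""
    tmp = tempfile.TemporaryFile()      # the "anonymous filehandle": gone on close
    d = downloader.decode_to(tmp)       # hypothetical: writes all plaintext segments into tmp

    def _send(_):
        tmp.seek(0)
        # FileSender is a pull producer: it only reads more of the file when
        # the request's transport asks for it, so the kernel page cache (not
        # our process) holds whatever the client hasn't fetched yet.
        return FileSender().beginFileTransfer(tmp, request)

    def _done(_):
        tmp.close()                     # the anonymous tempfile disappears here
        request.finish()

    d.addCallback(_send)
    d.addCallback(_done)
    return d
```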

I'm inclined to implement the producer/consumer thing, but when I think about
it, the kernel is in the best position to make the tradeoff between disk and
memory, so it might be a better approach to simply let it do its job. Client
behavior has an effect too: if people download half of a large file and then
quit and never come back, the tempfile approach means a lot of wasted
fetch/decode effort. On the other hand, the tempfile approach makes it a
*lot* easier to keep the tempfile around for a couple of hours in case
the client comes back to finish the job. (we'd have to implement
Content-Range: on the GET command, but that might not be all that difficult).
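
For the resume idea, a rough sketch of the Content-Range handling that would be needed, assuming the tempfile approach (hypothetical helper; only the simple "bytes=N-" form is handled):

```python
def apply_range(request, tmpfile, total_size):
    """Handle a simple 'Range: bytes=N-' request header; return the start offset."""
    header = request.getHeader("range")             # e.g. "bytes=1048576-"
    if header is None or not header.startswith("bytes="):
        return 0                                    # no Range header: serve from the start
    spec = header[len("bytes="):].split("-")[0]
    if not spec.isdigit():
        return 0                                    # suffix/multi-range forms ignored in this sketch
    start = int(spec)
    request.setResponseCode(206)                    # Partial Content
    request.setHeader("Content-Range",
                      "bytes %d-%d/%d" % (start, total_size - 1, total_size))
    tmpfile.seek(start)
    return start
```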

warner added the code-frontend-web, critical, defect, 0.5.1 labels 2007-09-12 17:17:12 +00:00
warner added this to the undecided milestone 2007-09-12 17:17:12 +00:00
Author

I've added an automated memory test for this: check out the buildbot "memcheck" builder for the current numbers. As of right now, downloading a 50MB file and pushing it over a slow HTTP 'GET' link causes the node to peak at 89MB.

warner self-assigned this 2007-09-19 04:21:44 +00:00
Author

Fixed, in changeset:1340c484c6c60c52. The producer/consumer stuff works great, and the memory footprint is now down to 29MB for a stalled download of a 50MB file (this is within 7% of the footprint of our other 50MB tests).

The new code also handles interrupted downloads extremely gracefully. The segment that is currently downloading completes, then the rest are skipped and the download finishes with a DownloadStopped exception.
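
An illustrative fragment (not the code from changeset:1340c484c6c60c52) of how that interruption path can work: stopProducing() just sets a flag, the segment already in flight finishes normally, and the download's Deferred errbacks with DownloadStopped so the remaining segments are skipped.

```python
from twisted.internet import defer


class DownloadStopped(Exception):
    """The HTTP client went away before the download finished."""


class InterruptibleDownload:
    """Sketch only; these names do not come from the real download code."""

    def __init__(self):
        self._stopped = False
        self.done = defer.Deferred()        # fires when the whole file is delivered

    def stopProducing(self):
        # Called by twisted.web when the client connection is lost. We do not
        # abort the segment already being fetched; we just remember that
        # nothing after it should be started.
        self._stopped = True

    def _segment_done(self, plaintext, deliver):
        deliver(plaintext)                  # the in-flight segment still completes
        if self._stopped:
            self.done.errback(DownloadStopped())   # skip the remaining segments
        else:
            self._start_next_segment()      # hypothetical: kick off the next fetch

    def _start_next_segment(self):
        pass                                # elided in this sketch
```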

warner added the fixed label 2007-09-19 08:12:20 +00:00
warner modified the milestone from undecided to 0.6.0 2007-09-19 08:12:20 +00:00
Reference: tahoe-lafs/trac-2024-07-25#129