very large memory usage when adding to large directories #379

Closed
opened 2008-04-12 05:44:48 +00:00 by warner · 3 comments

We've seen the webapi servers participating in the allmydata migration
effort hit memory-usage spikes of an incredible 3.1 GB over the past two
days. CPU usage goes to 100% for about 5 minutes while this happens, and it
occurs roughly once every 15 minutes, making the migration process run
significantly slower.

We've isolated the problem to the following items:

  • our RemoteInterface for mutable share writes (i.e.
    slot_testv_and_readv_and_writev) declares a maximum share data size of 1
    MiB, i.e. 1048576 bytes, but the maximum size of a mutable file (3.5 MB)
    leads to shares that can exceed this. This occurs for directories of
    about 9600 entries
  • Adding new children to such a directory causes an outbound foolscap
    Violation, because the share-size constraint is being violated
  • the Violation is delivered as a Failure object, and Twisted (specifically
    maybeDeferred) causes the cleanFailure method to be run, which turns
    the entire stack trace (including the local variables of each frame) into
    a bunch of strings
  • the 1MB-ish shares are present in the argument list for about a dozen
    stack frames (in the locals). Every one of these strings gets repr'ed,
    and since they're binary data, each repr'ing gets a 4x expansion.
  • the memory usage comes from the dozens of copies of this expanded
    share data inside the Failure's .stack and .frames attributes (a sketch
    of this mechanism follows this list).
  • in addition, the way we're using DeferredList fails to catch
    errors in callRemote: servers fail to accept shares, but we don't
    notice.
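
To make the memory blow-up concrete, here is a small, self-contained Python sketch (not Tahoe code; the function names are invented) of the mechanism: a ~1 MiB binary string sitting in the locals of a dozen frames becomes a dozen multi-megabyte reprs once the Failure stringifies its frames. Modern Twisted only captures locals when asked (captureVars=True); the Twisted of this era did it unconditionally.

```python
import os
from twisted.python.failure import Failure

def deep_call(share, depth):
    # 'share' sits in the locals of every frame on the way down
    if depth == 0:
        raise ValueError("simulated share-size Violation")
    return deep_call(share, depth - 1)

share = os.urandom(1024 * 1024)   # ~1 MiB of binary data; repr() expands it ~4x
try:
    deep_call(share, 12)          # roughly "a dozen stack frames"
except ValueError:
    f = Failure(captureVars=True) # old Twisted captured locals unconditionally
    f.cleanFailure()              # stringifies the locals of every frame

# each captured frame in f.frames (and f.stack) now holds its own
# multi-megabyte repr of the same share
total = sum(len(value)
            for (_, _, _, local_vars, _) in f.frames
            for (_, value) in local_vars)
print("characters held in stringified locals: %d" % total)
```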

What we need to do to address this:

  • remove the size constraint on ShareData. This will stop these
    particular Violations from happening
  • fix Twisted ticket Twisted#2466
    (http://twistedmatrix.com/trac/ticket/2466) somehow, then require the
    resulting release of Twisted
  • add fireOnOneErrback=True to the uses of DeferredList in mutable.py,
    to properly catch exceptions (see the sketch after this list)
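
For the DeferredList item above, here is a minimal sketch (push_share is a made-up stand-in for the real callRemote) of why the default behaviour hides errors and what fireOnOneErrback=True changes:

```python
from twisted.internet import defer

def push_share(server_ok):
    # stand-in for callRemote("slot_testv_and_readv_and_writev", ...)
    if server_ok:
        return defer.succeed("share accepted")
    return defer.fail(RuntimeError("share rejected"))

# default behaviour: the DeferredList itself always fires its callback;
# failures are tucked away in (False, Failure) pairs, so the caller never
# sees an errback and the lost share goes unnoticed
d1 = defer.DeferredList([push_share(True), push_share(False)],
                        consumeErrors=True)
d1.addCallback(lambda results: print("default:", [ok for (ok, _) in results]))

# with fireOnOneErrback=True the first failure propagates to the caller as a
# FirstError, so a server rejecting a share is no longer silently ignored
d2 = defer.DeferredList([push_share(True), push_share(False)],
                        fireOnOneErrback=True, consumeErrors=True)
d2.addErrback(lambda f: print("strict: caught", f.value.subFailure.value))
```

(consumeErrors=True is only there to keep the already-handled failures from being logged as unhandled errors.)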

Any exception in share transmission is likely to consume this sort of memory.
Workarounds we could conceivably implement before the Twisted problem gets
fixed:

  • avoid holding large strings in local variables: pass them as instance
    attributes instead, be careful to delete them from locals as soon as
    possible
  • hold large strings in some sort of object (with a repr() that doesn't
    show the whole thing) instead of a real string, and teach Foolscap
    (via ISliceable) to serialize them as strings (a rough sketch follows
    this list).
  • have Foolscap trim Failures in callRemote somehow
  • try to break up the stacks by using fireEventually in more places
  • rewrite Foolscap's outbound constraint checking to not do everything
    on the same stack
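
As an illustration of the "opaque big-string holder" idea from the list above, a rough sketch (the BigString name is invented, and the ISliceable adapter that would let Foolscap still send it over the wire as a plain string is omitted):

```python
class BigString:
    """Wraps a large byte string but keeps repr() tiny, so that a Failure
    which stringifies every frame's locals copies a few dozen characters
    per frame instead of several megabytes."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __repr__(self):
        return "<BigString of %d bytes>" % len(self.data)

share = BigString(b"\x00" * (1024 * 1024))
print(repr(share))   # "<BigString of 1048576 bytes>", not 4 MB of \x00 escapes
```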

Our current plan is to fix the constraint and then hope that we don't trigger
other failures while we contribute to the Twisted ticket and wait for a new
release. Later refactorings of share management will probably move more data
into instance attributes rather than passing it through method arguments.

If necessary, we can ship tahoe with a patched version of Twisted.

warner added the code, critical, defect, 1.0.0 labels 2008-04-12 05:44:48 +00:00
Author

I changed the remote interface to remove the size constraint, and I fixed the use of DeferredList to propagate errors back to the caller.

I updated our production webapi servers with these changes, and they stopped using 3GB of memory every time someone tried to exceed the directory size limit.

I didn't update our storage servers to match, however, so what's probably happening right now is that the remote end is raising the same Violation. This should involve much less memory, though: the inbound constraint is tested on each token and the message is rejected immediately (precisely what the foolscap constraint mechanism is designed for). So the caller is still likely to get a very weird error (a foolscap constraint Violation) rather than hitting the earlier "mutable files must be < 3.5MB" check.
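
Roughly what the interface-side change looks like, as a hedged sketch rather than the exact interfaces.py code (RIStorageSketch and the simplified method signature are invented for illustration, and exact foolscap import paths / constraint class names vary between foolscap releases):

```python
from foolscap.api import RemoteInterface
from foolscap.schema import StringConstraint  # newer foolscap spells this ByteStringConstraint

# before: ShareData = StringConstraint(2**20)  # 1 MiB cap -> outbound Violations
ShareData = StringConstraint(None)             # no maximum length any more

class RIStorageSketch(RemoteInterface):
    # heavily simplified stand-in for slot_testv_and_readv_and_writev
    def write_share(data=ShareData):
        return bool
```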

Author

The newly-refactored mutable files no longer use DeferredList.

I think this ticket has served its purpose, and hopefully future searchers will find it in the archives when they run into this sort of problem again.

warner added the fixed label 2008-04-24 23:54:36 +00:00
warner added code-dirnodes and removed code labels 2008-04-24 23:54:43 +00:00

Milestone 1.0.1 deleted
zooko added this to the 1.1.0 milestone 2008-05-05 21:08:36 +00:00
Reference: tahoe-lafs/trac-2024-07-25#379