very large memory usage when adding to large directories #379

Closed
opened 2008-04-12 05:44:48 +00:00 by warner · 3 comments

We've seen the webapi servers participating in the allmydata migration
effort hit memory-usage spikes of an incredible 3.1 GB over the past two
days. CPU usage goes to 100% for about 5 minutes while this happens, and it
occurs roughly once every 15 minutes, making the migration process run
significantly slower.

We've isolated the problem to the following items:

  • our RemoteInterface for mutable share writes (i.e.
    slot_testv_and_readv_and_writev) declares a maximum share data size of 1
    MiB, i.e. 1048576 bytes, but the maximum size of a mutable file (3.5 MB)
    leads to shares that can exceed this. This occurs for directories of
    about 9600 entries
  • Adding new children to such a directory causes an outbound foolscap
    Violation, because the share-size constraint is being violated
  • the Violation is delivered as a Failure object, and Twisted (specifically
    maybeDeferred) causes the cleanFailure method to be run, which turns
    the entire stack trace (including the local variables of each frame) into
    a bunch of strings
  • the 1MB-ish shares are present in the argument list for about a dozen
    stack frames (in the locals). Every one of these strings gets repr'ed,
    and since they're binary data, each repr'ing gets a 4x expansion.
  • the memory usage comes from the dozens of copies of this expanded
    share data inside the Failure's .stack and .frames attributes (a sketch
    of this mechanism follows this list).
  • in addition, the way we're using DeferredList fails to catch
    errors in callRemote: servers fail to accept shares, but we don't
    notice.
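
To make the memory blow-up concrete, here is a small, self-contained Python sketch (not Tahoe code; the function names are invented) of the mechanism: a ~1 MiB binary string sitting in the locals of a dozen frames becomes a dozen multi-megabyte reprs once the Failure stringifies its frames. Modern Twisted only captures locals when asked (captureVars=True); the Twisted of this era did it unconditionally.

```python
import os
from twisted.python.failure import Failure

def deep_call(share, depth):
    # 'share' sits in the locals of every frame on the way down
    if depth == 0:
        raise ValueError("simulated share-size Violation")
    return deep_call(share, depth - 1)

share = os.urandom(1024 * 1024)   # ~1 MiB of binary data; repr() expands it ~4x
try:
    deep_call(share, 12)          # roughly "a dozen stack frames"
except ValueError:
    f = Failure(captureVars=True) # old Twisted captured locals unconditionally
    f.cleanFailure()              # stringifies the locals of every frame

# each captured frame in f.frames (and f.stack) now holds its own
# multi-megabyte repr of the same share
total = sum(len(value)
            for (_, _, _, local_vars, _) in f.frames
            for (_, value) in local_vars)
print("characters held in stringified locals: %d" % total)
```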

What we need to do to address this:

  • remove the size constraint on ShareData. This will stop these
    particular Violations from happening
  • fix Twisted ticket Twisted#2466
    (http://twistedmatrix.com/trac/ticket/2466) somehow, then require the
    resulting release of Twisted
  • add fireOnOneErrback=True to the uses of DeferredList in mutable.py,
    to properly catch exceptions (see the sketch after this list)
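
For the DeferredList item above, here is a minimal sketch (push_share is a made-up stand-in for the real callRemote) of why the default behaviour hides errors and what fireOnOneErrback=True changes:

```python
from twisted.internet import defer

def push_share(server_ok):
    # stand-in for callRemote("slot_testv_and_readv_and_writev", ...)
    if server_ok:
        return defer.succeed("share accepted")
    return defer.fail(RuntimeError("share rejected"))

# default behaviour: the DeferredList itself always fires its callback;
# failures are tucked away in (False, Failure) pairs, so the caller never
# sees an errback and the lost share goes unnoticed
d1 = defer.DeferredList([push_share(True), push_share(False)],
                        consumeErrors=True)
d1.addCallback(lambda results: print("default:", [ok for (ok, _) in results]))

# with fireOnOneErrback=True the first failure propagates to the caller as a
# FirstError, so a server rejecting a share is no longer silently ignored
d2 = defer.DeferredList([push_share(True), push_share(False)],
                        fireOnOneErrback=True, consumeErrors=True)
d2.addErrback(lambda f: print("strict: caught", f.value.subFailure.value))
```

(consumeErrors=True is only there to keep the already-handled failures from being logged as unhandled errors.)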

Any exception in share transmission is likely to consume this sort of memory.
Workarounds we could conceivably implement before the Twisted problem gets
fixed:

  • avoid holding large strings in local variables: pass them as instance
    attributes instead, be careful to delete them from locals as soon as
    possible
  • hold large strings in some sort of object (with a repr() that doesn't
    show the whole thing) instead of a real string, and teach Foolscap
    (via ISliceable) to serialize them as strings (a rough sketch follows
    this list).
  • have Foolscap trim Failures in callRemote somehow
  • try to break up the stacks by using fireEventually in more places
  • rewrite Foolscap's outbound constraint checking to not do everything
    on the same stack
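
As an illustration of the "opaque big-string holder" idea from the list above, a rough sketch (the BigString name is invented, and the ISliceable adapter that would let Foolscap still send it over the wire as a plain string is omitted):

```python
class BigString:
    """Wraps a large byte string but keeps repr() tiny, so that a Failure
    which stringifies every frame's locals copies a few dozen characters
    per frame instead of several megabytes."""

    def __init__(self, data):
        self.data = data

    def __len__(self):
        return len(self.data)

    def __repr__(self):
        return "<BigString of %d bytes>" % len(self.data)

share = BigString(b"\x00" * (1024 * 1024))
print(repr(share))   # "<BigString of 1048576 bytes>", not 4 MB of \x00 escapes
```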

Our current plan is to fix the constraint and then hope that we don't trigger
other failures while we contribute to the Twisted ticket and wait for a new
release. Later refactorings of share management will probably move more data
into instance attributes rather than passing it through method arguments.

If necessary, we can ship tahoe with a patched version of Twisted.

warner added the code, critical, defect, 1.0.0 labels 2008-04-12 05:44:48 +00:00
Author

I changed the remote interface to remove the size constraint, and I fixed the use of DeferredList to propagate errors back to the caller.

I updated our production webapi servers with these changes, and they stopped using 3GB of memory every time someone tried to exceed the directory size limit.

I didn't update our storage servers to match, however, so what's probably happening right now is that the remote end is raising the same Violation. This should involve much less memory, though: the inbound constraint is tested on each token and the message is rejected immediately (precisely what the foolscap constraint mechanism is designed for). So the caller is still likely to get a very weird error (a foolscap constraint Violation) rather than hitting the earlier "mutable files must be < 3.5MB" check.
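
Roughly what the interface-side change looks like, as a hedged sketch rather than the exact interfaces.py code (RIStorageSketch and the simplified method signature are invented for illustration, and exact foolscap import paths / constraint class names vary between foolscap releases):

```python
from foolscap.api import RemoteInterface
from foolscap.schema import StringConstraint  # newer foolscap spells this ByteStringConstraint

# before: ShareData = StringConstraint(2**20)  # 1 MiB cap -> outbound Violations
ShareData = StringConstraint(None)             # no maximum length any more

class RIStorageSketch(RemoteInterface):
    # heavily simplified stand-in for slot_testv_and_readv_and_writev
    def write_share(data=ShareData):
        return bool
```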

Author

The newly-refactored mutable files no longer use DeferredList.

I think this ticket has served its purpose, and hopefully future searchers will find it in the archives when they run into this sort of problem again.

warner added the fixed label 2008-04-24 23:54:36 +00:00
warner added code-dirnodes and removed code labels 2008-04-24 23:54:43 +00:00

Milestone 1.0.1 deleted
zooko added this to the 1.1.0 milestone 2008-05-05 21:08:36 +00:00
Reference: tahoe-lafs/trac-2024-07-25#379