pycryptopp uses up too much RAM #419

Closed
opened 2008-05-12 21:39:46 +00:00 by zooko · 9 comments

Brian pointed out that the graph which shows the time to create a mutable ("SSK") file has recently shown a dramatic reduction in time:

http://allmydata.org/tahoe-figleaf-graph/hanford.allmydata.com-tahoe_speedstats_SSK_creation.html

The time to create an SSK is supposed to be dominated by the time to create a new RSA public/private keypair, which is a randomized process involving iteratively creating a big number and testing it for primality.
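That generation loop (draw a random candidate of the right size, test it for primality, repeat until one passes) can be sketched in pure Python. This is an illustrative stand-in using Miller-Rabin, not pycryptopp's or Crypto++'s actual implementation:

```python
import random

def is_probably_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # found a witness that n is composite
    return True

def random_prime(bits):
    """Iteratively draw random odd candidates until one tests prime."""
    while True:
        # Force the top bit (so the prime has exactly `bits` bits) and
        # the low bit (so the candidate is odd).
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probably_prime(candidate):
            return candidate
```

Because the number of candidates tried before hitting a prime is random, keypair-creation time varies run to run, which is why a sustained 10x speedup in the graph points at a code change rather than luck.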

Brian writes:

> so, the munin performance graphs show a considerable speedup in mutable file creation time that occurred last Tuesday around noon. It affected both the colo and DSL tests, so I assume it was a code thing instead of a buildslave getting moved or something.
>
> in addition, our 32-bit initial memory footprint went up by 6MB. The only code change that appears relevant was the new reliance upon pycryptopp >= 0.5.
>
> I can somewhat believe that the memory increase is due to the inclusion of the EC-DSA code, but can you think of any reason why RSA key generation might have sped up by nearly a factor of 10?

zooko added the
code-encoding
major
task
1.0.0
labels 2008-05-12 21:39:46 +00:00
zooko added this to the 1.1.0 milestone 2008-05-12 21:39:46 +00:00
zooko self-assigned this 2008-05-12 21:39:46 +00:00

So if I understand what Zooko told me correctly, then pycryptopp was recently
changed to include a full copy of the underlying Crypto++ library, and to use
it in preference to any version that's installed in /usr/lib/.

I'm still investigating, but I'm suspicious that this approach is responsible
for the increased memory usage, for two reasons. The obvious one is that the
.so files can't be shared between multiple users: if there are multiple Tahoe
processes on a single box, or a Tahoe process and something else that's using
Crypto++, then they won't be able to share Crypto++ code pages unless that
code is coming from the same place.

The other reason is that, lacking a stable shared library to reference, the
pycryptopp glue .so files must each statically link against the Crypto++
libraries. My not-fully-investigated evidence for this is that cipher/aes.so
is 12MB in size, and /usr/bin/size reports that it includes 2.7MB of .text
(i.e. code pages). Since AES can be implemented in far less than this, I'm
suspecting that a whole bunch of extra C++ baggage has been copied into that
file. The other glue .so files (sha256.so, rsa.so) have similar sizes, so I
think that there may be multiple copies of that C++ baggage.

When the system's Crypto++ was used from /usr/lib/libcrypto++.so, these glue
.so's would reference that file instead, and any of the C++ overhead would be
shared between the different python modules. So that might be a reason for
the 6MB increase in memory footprint.

I'm not sure what I think about pycryptopp insisting upon using its own code
in preference to the version installed on the system. This isn't a new topic
of discussion; we've talked about this one a lot. So perhaps this is just
another datapoint in that discussion: using private copies of libraries
instead of linking against a system library increases both disk usage and
memory footprint. Disk usage may not be a very compelling argument these
days, but memory size still might be.


More investigation: aes.so contains symbols for RSA, EC-DSA, SHA-512, and a
whole bunch of other stuff.

So I believe that each of our glue .so files contains a complete copy of
Crypto++, and only differ in the tiny amount of python-to-C++ code that is
added on top of that.

The whole .so (including the whole copy of Crypto++) gets added to the Tahoe
node's memory space for each module that gets imported (aes.so, sha256.so,
rsa.so). That's why the vmsize is so large: duplicate copies of Crypto++
code.

Obviously, most of that code will never get used: there is no way for the
aes.so glue code to provide access to, say, the Blowfish cipher. But the
linker doesn't know that, so it has to include the whole thing in the .so. I
believe that the runtime code is not able to selectively map pieces of an .so
into memory, so it is forced to map the whole thing too, which might be why
the RSS size grows too. It's C++, so there are static constructors and things
that use memory when the code is loaded, so memory consumption gets pretty
complicated.

The answer is probably to just ignore this and regretfully accept that a
Tahoe node will consume more memory (or at least appear to consume more
memory) than it used to. We'll add more things like this over time, and
Tahoe's memory usage will grow and grow. I hope this doesn't happen.

Some other possibilities:

  • pycryptopp could have a single .so file, containing all of the glue code
    for all algorithms (RSA, DSA, AES, SHA256, etc). Then it could have
    separate .py modules that provide access to this glue, perhaps as simple
    as "from _pycryptopp import AES" or something, just a mapping of the
    names. This would get us a single copy of Crypto++ instead of multiple
    ones.
  • pycryptopp could link statically against its copy of Crypto++ instead of
    dynamically. This would probably result in just a minimal subset of
    Crypto++ being copied into the glue .so files: i.e. only the AES code (and
    necessary support) would wind up in aes.so, nothing else. This would give
    end-users the smallest memory footprint: they would not pay the memory
    penalty for RSA unless they actually needed it.
  • using a system Crypto++ instead of a private copy might help, because then
    aes.so would reference /usr/lib/libcrypto++.so rather than incorporating
    a full copy, so aes.so and rsa.so could reference the same thing, and
    a python process which used both aes.so and rsa.so would only get one copy
    of Crypto++ instead of two.
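The first option (one `_pycryptopp` extension module plus thin `.py` name-mapping modules) is a common packaging pattern. A rough sketch, faking the hypothetical `_pycryptopp` extension with a plain module object since no such compiled module exists here:

```python
import sys
import types

# Stand-in for the single compiled extension that would hold all of the
# glue code (in reality this would be one _pycryptopp.so containing the
# only copy of Crypto++).
_pycryptopp = types.ModuleType("_pycryptopp")

class AES:
    """Placeholder for the real AES glue class."""

class SHA256:
    """Placeholder for the real SHA256 glue class."""

_pycryptopp.AES = AES
_pycryptopp.SHA256 = SHA256
sys.modules["_pycryptopp"] = _pycryptopp

# pycryptopp/cipher/aes.py would then be nothing but a name mapping:
from _pycryptopp import AES  # noqa: E402
```

Importing `pycryptopp.cipher.aes` and `pycryptopp.publickey.rsa` would then both pull in the same single extension module, so Crypto++ would be loaded exactly once per process.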
Author

Thanks for investigating this. I'll try to do something to reduce this size at some point. Your three ideas of how to reduce it are three good ones.

zooko changed title from mysterious speed-up in creation of SSK files to pycryptopp uses up too much space 2008-05-15 12:40:46 +00:00
warner changed title from pycryptopp uses up too much space to pycryptopp uses up too much RAM 2008-05-15 18:17:28 +00:00
warner modified the milestone from 1.1.0 to 1.2.0 2008-05-29 22:19:41 +00:00
Author

If you love this ticket (#419), then you might like tickets #54 (port memory usage tests to windows), #227 (our automated memory measurements might be measuring the wrong thing), #478 (add memory-usage to stats-provider numbers), and #97 (reducing memory footprint in share reception).

Author

<http://allmydata.org/trac/pycryptopp/ticket/9> (link against existing (system) libcrypto++.so) has been fixed in pycryptopp. I think that this implements the first and the third of Brian's suggestions above, if you pass `--disable-embedded-cryptopp`. If you don't, then pycryptopp has always implemented the second of Brian's suggestions -- to link statically against its own copy of libcryptopp. Once the buildslaves that run the memory measurements (on The Dev Page) are upgraded to `pycryptopp >= 0.5.13`, then if whoever builds the pycryptopp package uses `--disable-embedded-cryptopp`, we'll see whether this changes those measurements.

Remember, though, that I don't think the number produced by those measurements correlates with any behavior that anyone cares about -- see #227 (our automated memory measurements might be measuring the wrong thing). If we measured resident set size (and even better if we turned off swap in order to prevent the resident set size measurement from dipping down randomly) then the number would correlate with something we care about: how many Tahoe nodes can you have in RAM and doing this sort of task simultaneously.
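Resident set size can be sampled from within the process itself. A minimal Unix-only sketch using the stdlib `resource` module (this reports *peak* RSS; instantaneous RSS would come from `/proc/self/status` on Linux):

```python
import resource
import sys

def max_rss_bytes():
    """Peak resident set size of this process, normalized to bytes.

    getrusage reports ru_maxrss in kilobytes on Linux but in bytes
    on macOS, so normalize per platform.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss if sys.platform == "darwin" else rss * 1024

print("peak RSS: %.1f MiB" % (max_rss_bytes() / (1024 * 1024)))
```

A measurement like this, taken while the node is actually doing work, would track "how many Tahoe nodes fit in RAM" far better than vmsize does.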

I'm moving this from Milestone 1.5 to Milestone undecided, but I don't know if we should instead close it as "fixed" or "invalid" or "wont-fix".

zooko added
minor
defect
and removed
major
task
labels 2009-06-15 20:02:35 +00:00
zooko modified the milestone from 1.5.0 to undecided 2009-06-15 20:02:35 +00:00
Author

Oh, and I forgot that Matt Mackall has invented "smem", which provides measurements of memory usage that are actually useful: <http://lwn.net/Articles/329458/>.

Brian Warner <warner@lothar.com> commented 2017-06-05 09:37:37 +00:00
Owner

In [42d8a79/trunk](/tahoe-lafs/trac-2024-07-25/commit/42d8a79c9d0f0f369a1a7d61bba4a14e4e6ea0fc):

```
Merge PR419: update docs: OpenSolaris->Illumos

closes #419
```
tahoe-lafs added the
fixed
label 2017-06-05 09:37:37 +00:00
Brian Warner <warner@lothar.com> closed this issue 2017-06-05 09:37:37 +00:00

Oops, I forgot to use the don't-close-Trac style of github commit message. This ticket might not be worth keeping around, but it shouldn't have been closed like that. Sorry!

warner removed the
fixed
label 2017-06-05 09:39:52 +00:00
warner reopened this issue 2017-06-05 09:39:52 +00:00

Tahoe-LAFS switched from pycryptopp to cryptography in ticket:3031.

exarkun added the
wontfix
label 2019-07-25 13:10:11 +00:00
Reference: tahoe-lafs/trac-2024-07-25#419