pycryptopp uses up too much RAM #419

Closed
opened 2008-05-12 21:39:46 +00:00 by zooko · 9 comments

Brian pointed out that the graph which shows the time to create a mutable ("SSK") file has recently shown a dramatic reduction in time:

http://allmydata.org/tahoe-figleaf-graph/hanford.allmydata.com-tahoe_speedstats_SSK_creation.html

The time to create an SSK is supposed to be dominated by the time to create a new RSA public/private keypair, which is a randomized process involving iteratively creating a big number and testing it for primality.
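That generation loop (draw a random candidate of the right size, test it for primality, repeat until one passes) can be sketched in pure Python. This is an illustrative stand-in using Miller-Rabin, not pycryptopp's or Crypto++'s actual implementation:

```python
import random

def is_probably_prime(n, rounds=40):
    """Miller-Rabin probabilistic primality test."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37):
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d, r = d // 2, r + 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # found a witness that n is composite
    return True

def random_prime(bits):
    """Iteratively draw random odd candidates until one tests prime."""
    while True:
        # Force the top bit (so the prime has exactly `bits` bits) and
        # the low bit (so the candidate is odd).
        candidate = random.getrandbits(bits) | (1 << (bits - 1)) | 1
        if is_probably_prime(candidate):
            return candidate
```

Because the number of candidates tried before hitting a prime is random, keypair-creation time varies run to run, which is why a sustained 10x speedup in the graph points at a code change rather than luck.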

Brian writes:

> so, the munin performance graphs show a considerable speedup in mutable file creation time that occurred last Tuesday around noon. It affected both the colo and DSL tests, so I assume it was a code thing instead of a buildslave getting moved or something.
>
> in addition, our 32-bit initial memory footprint went up by 6MB. The only code change that appears relevant was the new reliance upon pycryptopp >= 0.5.
>
> I can somewhat believe that the memory increase is due to the inclusion of the EC-DSA code, but can you think of any reason why RSA key generation might have sped up by nearly a factor of 10?

zooko added the
code-encoding
major
task
1.0.0
labels 2008-05-12 21:39:46 +00:00
zooko added this to the 1.1.0 milestone 2008-05-12 21:39:46 +00:00
zooko self-assigned this 2008-05-12 21:39:46 +00:00

So if I understand what Zooko told me correctly, then pycryptopp was recently
changed to include a full copy of the underlying Crypto++ library, and to use
it in preference to any version that's installed in /usr/lib/.

I'm still investigating, but I'm suspicious that this approach is responsible
for the increased memory usage, for two reasons. The obvious one is that the
.so files can't be shared between multiple users: if there are multiple Tahoe
processes on a single box, or a Tahoe process and something else that's using
Crypto++, then they won't be able to share Crypto++ code pages unless that
code is coming from the same place.

The other reason is that, lacking a stable shared library to reference, the
pycryptopp glue .so files must each statically link against the Crypto++
libraries. My not-fully-investigated evidence for this is that cipher/aes.so
is 12MB in size, and /usr/bin/size reports that it includes 2.7MB of .text
(i.e. code pages). Since AES can be implemented in far less than this, I'm
suspecting that a whole bunch of extra C++ baggage has been copied into that
file. The other glue .so files (sha256.so, rsa.so) have similar sizes, so I
think that there may be multiple copies of that C++ baggage.

When the system's Crypto++ was used from /usr/lib/libcrypto++.so, these glue
.so's would reference that file instead, and any of the C++ overhead would be
shared between the different python modules. So that might be a reason for
the 6MB increase in memory footprint.

I'm not sure what I think about pycryptopp insisting upon using its own code
in preference to the version installed on the system. This isn't a new topic
of discussion; we've talked about this one a lot. So perhaps this is just
another datapoint in that discussion: using private copies of libraries
instead of linking against a system library increases both disk usage and
memory footprint. Disk usage may not be a very compelling argument these
days, but memory size still might be.


More investigation: aes.so contains symbols for RSA, EC-DSA, SHA-512, and a
whole bunch of other stuff.

So I believe that each of our glue .so files contains a complete copy of
Crypto++, and only differ in the tiny amount of python-to-C++ code that is
added on top of that.

The whole .so (including the whole copy of Crypto++) gets added to the Tahoe
node's memory space for each module that gets imported (aes.so, sha256.so,
rsa.so). That's why the vmsize is so large: duplicate copies of Crypto++
code.

Obviously, most of that code will never get used: there is no way for the
aes.so glue code to provide access to, say, the Blowfish cipher. But the
linker doesn't know that, so it has to include the whole thing in the .so. I
believe that the runtime code is not able to selectively map pieces of an .so
into memory, so it is forced to map the whole thing too, which might be why
the RSS size grows too. It's C++, so there are static constructors and things
that use memory when the code is loaded, so memory consumption gets pretty
complicated.

The answer is probably to just ignore this and regretfully accept that a
Tahoe node will consume more memory (or at least appear to consume more
memory) than it used to. We'll add more things like this over time, and
Tahoe's memory usage will grow and grow. I hope this doesn't happen.

Some other possibilities:

  • pycryptopp could have a single .so file, containing all of the glue code
    for all algorithms (RSA, DSA, AES, SHA256, etc). Then it could have
    separate .py modules that provide access to this glue, perhaps as simple
    as "from _pycryptopp import AES" or something, just a mapping of the
    names. This would get us a single copy of Crypto++ instead of multiple
    ones.
  • pycryptopp could link statically against its copy of Crypto++ instead of
    dynamically. This would probably result in just a minimal subset of
    Crypto++ being copied into the glue .so files: i.e. only the AES code (and
    necessary support) would wind up in aes.so, nothing else. This would give
    end-users the smallest memory footprint: they would not pay the memory
    penalty for RSA unless they actually needed it.
  • using a system Crypto++ instead of a private copy might help, because then
    aes.so would reference /usr/lib/libcrypto++.so rather than incorporating
    a full copy, so aes.so and rsa.so could reference the same thing, and
    a python process which used both aes.so and rsa.so would only get one copy
    of Crypto++ instead of two.
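The first option (one `_pycryptopp` extension module plus thin `.py` name-mapping modules) is a common packaging pattern. A rough sketch, faking the hypothetical `_pycryptopp` extension with a plain module object since no such compiled module exists here:

```python
import sys
import types

# Stand-in for the single compiled extension that would hold all of the
# glue code (in reality this would be one _pycryptopp.so containing the
# only copy of Crypto++).
_pycryptopp = types.ModuleType("_pycryptopp")

class AES:
    """Placeholder for the real AES glue class."""

class SHA256:
    """Placeholder for the real SHA256 glue class."""

_pycryptopp.AES = AES
_pycryptopp.SHA256 = SHA256
sys.modules["_pycryptopp"] = _pycryptopp

# pycryptopp/cipher/aes.py would then be nothing but a name mapping:
from _pycryptopp import AES  # noqa: E402
```

Importing `pycryptopp.cipher.aes` and `pycryptopp.publickey.rsa` would then both pull in the same single extension module, so Crypto++ would be loaded exactly once per process.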
Author

Thanks for investigating this. I'll try to do something to reduce this size at some point. Your three ideas of how to reduce it are three good ones.

zooko changed title from mysterious speed-up in creation of SSK files to pycryptopp uses up too much space 2008-05-15 12:40:46 +00:00
warner changed title from pycryptopp uses up too much space to pycryptopp uses up too much RAM 2008-05-15 18:17:28 +00:00
warner modified the milestone from 1.1.0 to 1.2.0 2008-05-29 22:19:41 +00:00
Author

If you love this ticket (#419), then you might like tickets #54 (port memory usage tests to windows), #227 (our automated memory measurements might be measuring the wrong thing), #478 (add memory-usage to stats-provider numbers), and #97 (reducing memory footprint in share reception).

Author

<http://allmydata.org/trac/pycryptopp/ticket/9> (link against existing (system) libcrypto++.so) has been fixed in pycryptopp. I think that this implements the first and the third of Brian's suggestions above, if you pass `--disable-embedded-cryptopp`. If you don't, then pycryptopp has always implemented the second of Brian's suggestions -- to link statically against its own copy of libcryptopp. Once the buildslaves that run the memory measurements (on The Dev Page) are upgraded to `pycryptopp >= 0.5.13`, then if whoever builds the pycryptopp package uses `--disable-embedded-cryptopp`, we'll see whether this changes those measurements.

Remember, though, that I don't think the number produced by those measurements correlates with any behavior that anyone cares about -- see #227 (our automated memory measurements might be measuring the wrong thing). If we measured resident set size (and even better if we turned off swap in order to prevent the resident set size measurement from dipping down randomly) then the number would correlate with something we care about: how many Tahoe nodes can you have in RAM and doing this sort of task simultaneously.
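Resident set size can be sampled from within the process itself. A minimal Unix-only sketch using the stdlib `resource` module (this reports *peak* RSS; instantaneous RSS would come from `/proc/self/status` on Linux):

```python
import resource
import sys

def max_rss_bytes():
    """Peak resident set size of this process, normalized to bytes.

    getrusage reports ru_maxrss in kilobytes on Linux but in bytes
    on macOS, so normalize per platform.
    """
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return rss if sys.platform == "darwin" else rss * 1024

print("peak RSS: %.1f MiB" % (max_rss_bytes() / (1024 * 1024)))
```

A measurement like this, taken while the node is actually doing work, would track "how many Tahoe nodes fit in RAM" far better than vmsize does.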

I'm moving this from Milestone 1.5 to Milestone undecided, but I don't know if we should instead close it as "fixed" or "invalid" or "wont-fix".

zooko added
minor
defect
and removed
major
task
labels 2009-06-15 20:02:35 +00:00
zooko modified the milestone from 1.5.0 to undecided 2009-06-15 20:02:35 +00:00
Author

Oh, and I forgot that Matt Mackall has invented "smem", which provides measurements of memory usage that are actually useful: <http://lwn.net/Articles/329458/>.

Brian Warner <warner@lothar.com> commented 2017-06-05 09:37:37 +00:00
Owner

In [42d8a79/trunk](/tahoe-lafs/trac-2024-07-25/commit/42d8a79c9d0f0f369a1a7d61bba4a14e4e6ea0fc):

```
Merge PR419: update docs: OpenSolaris->Illumos

closes #419
```
tahoe-lafs added the
fixed
label 2017-06-05 09:37:37 +00:00
Brian Warner <warner@lothar.com> closed this issue 2017-06-05 09:37:37 +00:00

Oops, I forgot to use the don't-close-Trac style of github commit message. This ticket might not be worth keeping around, but it shouldn't have been closed like that. Sorry!

warner removed the
fixed
label 2017-06-05 09:39:52 +00:00
warner reopened this issue 2017-06-05 09:39:52 +00:00

Tahoe-LAFS switched from pycryptopp to cryptography in ticket:3031.

exarkun added the
wontfix
label 2019-07-25 13:10:11 +00:00
Reference: tahoe-lafs/trac-2024-07-25#419