pycryptopp uses up too much RAM #419
Reference: tahoe-lafs/trac-2024-07-25#419
Brian pointed out that the graph which shows the time to create a mutable ("SSK") file has recently shown a dramatic reduction in time:
http://allmydata.org/tahoe-figleaf-graph/hanford.allmydata.com-tahoe_speedstats_SSK_creation.html
The time to create an SSK is supposed to be dominated by the time to create a new RSA public/private keypair, which is a randomized process involving iteratively creating a big number and testing it for primality.
Brian writes:
So if I understand what Zooko told me correctly, then pycryptopp was recently
changed to include a full copy of the underlying Crypto++ library, and to use
it in preference to any version that's installed in /usr/lib/ .
I'm still investigating, but I'm suspicious that this approach is responsible
for the increased memory usage, for two reasons. The obvious one is that the
.so files can't be shared between multiple users: if there are multiple Tahoe
processes on a single box, or a Tahoe process and something else that's using
Crypto++, then they won't be able to share Crypto++ code pages unless that
code is coming from the same place.
The other reason is that, lacking a stable shared library to reference, the
pycryptopp glue .so files must each statically link against the Crypto++
libraries. My not-fully-investigated evidence for this is that cipher/aes.so
is 12MB in size, and /usr/bin/size reports that it includes 2.7MB of .text
(i.e. code pages). Since AES can be implemented in far less than this, I'm
suspecting that a whole bunch of extra C++ baggage has been copied into that
file. The other glue .so files (sha256.so, rsa.so) have similar sizes, so I
think that there may be multiple copies of that C++ baggage.
When the system's Crypto++ was used from /usr/lib/libcrypto++.so, these glue
.so's would reference that file instead, and any of the C++ overhead would be
shared between the different python modules. So that might be a reason for
the 6MB increase in memory footprint.
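The size comparison described above can be repeated with a short script. The following is only a sketch: it assumes the Berkeley-style table that GNU `size` prints by default, and the numbers in `SAMPLE` are invented for illustration, not measurements from this ticket.

```python
import subprocess

def parse_size_output(output):
    """Parse Berkeley-format `size` output into {filename: text_bytes}."""
    sizes = {}
    for line in output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # not a data row
        sizes[fields[5]] = int(fields[0])  # column 0 is .text, column 5 is the filename
    return sizes

def text_sizes(paths):
    """Run `size` on the given files (requires binutils to be installed)."""
    result = subprocess.run(["size"] + list(paths), capture_output=True,
                            text=True, check=True)
    return parse_size_output(result.stdout)

# Hypothetical sample output, invented for illustration only.
SAMPLE = """\
   text    data     bss     dec     hex filename
2700000   90000   12000 2802000  2ac150 aes.so
2710000   90000   12000 2812000  2ae860 sha256.so
"""

for name, text_bytes in parse_size_output(SAMPLE).items():
    print("%-12s %.1f MB of .text" % (name, text_bytes / 1e6))
```

If several glue files report nearly identical .text sizes, that is consistent with each one carrying its own copy of the same library code.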
I'm not sure what I think about pycryptopp insisting upon using its own code
in preference to the version installed on the system. This isn't a new topic
of discussion.. we've talked about this one a lot. So perhaps this is just
another datapoint in that discussion: using private copies of libraries
instead of linking against a system library increases both disk usage and
memory footprint. Disk usage may not be a very compelling argument these
days, but memory size still might be.
More investigation: aes.so contains symbols for RSA, EC-DSA, SHA-512, and a
whole bunch of other stuff.
So I believe that each of our glue .so files contains a complete copy of
Crypto++, and only differ in the tiny amount of python-to-C++ code that is
added on top of that.
The whole .so (including the whole copy of Crypto++) gets added to the Tahoe
node's memory space for each module that gets imported (aes.so, sha256.so,
rsa.so). That's why the vmsize is so large: duplicate copies of Crypto++
code.
Obviously, most of that code will never get used: there is no way for the
aes.so glue code to provide access to, say, the Blowfish cipher. But the
linker doesn't know that, so it has to include the whole thing in the .so. I
believe that the runtime code is not able to selectively map pieces of an .so
into memory, so it is forced to map the whole thing too, which might be why
the RSS size grows too. It's C++, so there are static constructors and things
that use memory when the code is loaded, so memory consumption gets pretty
complicated.
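One rough way to see how much an import adds to vmsize and RSS is to read the process's own status file before and after the import. This is a Linux-only sketch, not the method used by the Tahoe memory tests. The helper names are made up for illustration, and the result is noisy because importing a module also pulls in whatever it depends on.

```python
import importlib

def parse_status(text):
    """Return {'VmSize': kB, 'VmRSS': kB} from the text of /proc/<pid>/status."""
    wanted = ("VmSize", "VmRSS")
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            values[key] = int(rest.split()[0])  # the value is reported in kB
    return values

def read_own_status():
    """Read VmSize and VmRSS for the current process (Linux only)."""
    with open("/proc/self/status") as f:
        return parse_status(f.read())

def import_cost(module_name):
    """Return the growth in VmSize and VmRSS (kB) caused by importing a module."""
    before = read_own_status()
    importlib.import_module(module_name)
    after = read_own_status()
    return {key: after[key] - before[key] for key in before}
```

Calling `import_cost("pycryptopp.cipher.aes")` in a fresh interpreter (assuming the module path mirrors the cipher/aes.so file mentioned above) would give a first estimate. A module that is already imported will show a delta near zero.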
The answer is probably to just ignore this and regretfully accept that a
Tahoe node will consume more memory (or at least appear to consume more
memory) than it used to. We'll add more things like this over time, and
Tahoe's memory usage will grow and grow. I hope this doesn't happen.
Some other possibilities:
for all algorithms (RSA, DSA, AES, SHA256, etc). Then it could have
separate .py modules that provide access to this glue, perhaps as simple
as "from _pycryptopp import AES" or something, just a mapping of the
names. This would get us a single copy of Crypto++ instead of multiple
ones.
dynamically. This would probably result in just a minimal subset of
Crypto++ being copied into the glue .so files: i.e. only the AES code (and
necessary support) would wind up in aes.so, nothing else. This would give
end-users the smallest memory footprint: they would not pay the memory
penalty for RSA unless they actually needed it.
aes.so would reference /usr/lib/libcrypto++.so rather than incorporating
a full copy, so aes.so and rsa.so could reference the same thing, and
a python process which used both aes.so and rsa.so would only get one copy
of Crypto++ instead of two.
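The single-glue-module idea can be sketched in pure Python. Everything here is a stand-in: `_pycryptopp` is faked with `types.ModuleType` so that the example runs without a compiled extension, and the class names and attributes are hypothetical, not pycryptopp's actual API.

```python
import sys
import types

def make_fake_glue():
    """Build a stand-in for the single compiled glue module (hypothetical)."""
    glue = types.ModuleType("_pycryptopp")

    class AES(object):
        algorithm = "AES"

    class SHA256(object):
        algorithm = "SHA256"

    glue.AES = AES
    glue.SHA256 = SHA256
    return glue

# In the real proposal this would be one .so containing one copy of Crypto++.
sys.modules["_pycryptopp"] = make_fake_glue()

# A thin wrapper such as a hypothetical aes.py would contain only this line:
from _pycryptopp import AES
# ...and a hypothetical sha256.py only this one:
from _pycryptopp import SHA256

# Both names resolve to the same module object, so the code behind them is
# loaded once per process no matter how many wrappers are imported.
print(AES.algorithm, SHA256.algorithm)
```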
Thanks for investigating this. I'll try to do something to reduce this size at some point. Your three ideas of how to reduce it are three good ones.
Title changed from "mysterious speed-up in creation of SSK files" to "pycryptopp uses up too much space", and then from "pycryptopp uses up too much space" to "pycryptopp uses up too much RAM".
If you love this ticket (#419), then you might like tickets #54 (port memory usage tests to windows), #227 (our automated memory measurements might be measuring the wrong thing), #478 (add memory-usage to stats-provider numbers), and #97 (reducing memory footprint in share reception).
(http://allmydata.org/trac/pycryptopp/ticket/9) (link against existing (system) libcrypto++.so) has been fixed in pycryptopp. I think that this implements the first and the third of Brian's suggestions above, if you pass --disable-embedded-cryptopp. If you don't, then pycryptopp has always implemented the second suggestion of Brian's -- to link statically against its own copy of libcryptopp. Once the buildslaves that run the memory measurements (on The Dev Page) are upgraded to pycryptopp >= 0.5.13, then if whoever builds the pycryptopp package uses --disable-embedded-cryptopp we'll see if this changes those measurements.
Remember, though, that I don't think the number produced by those measurements correlates with any behavior that anyone cares about -- see #227 (our automated memory measurements might be measuring the wrong thing). If we measured resident set size (and even better if we turned off swap in order to prevent the resident set size measurement from dipping down randomly) then the number would correlate with something we care about: how many Tahoe nodes can you have in RAM and doing this sort of task simultaneously.
I'm moving this from Milestone 1.5 to Milestone undecided, but I don't know if we should instead close it as "fixed" or "invalid" or "wont-fix".
Oh, and I forgot that Matt Mackall has invented "smem" which provides measurements of memory usage that are actually useful: http://lwn.net/Articles/329458/ .
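smem's headline figure is the proportional set size (PSS), which charges each shared page to a process in proportion to how many processes share it. On Linux the same number can be approximated by summing the `Pss:` lines of /proc/<pid>/smaps. The following is a minimal sketch with made-up helper names, not a replacement for smem.

```python
def total_pss_kb(smaps_text):
    """Sum the Pss fields (in kB) found in the text of /proc/<pid>/smaps."""
    total = 0
    for line in smaps_text.splitlines():
        # Match "Pss:" exactly so that "SwapPss:" lines are not counted.
        if line.startswith("Pss:"):
            total += int(line.split()[1])
    return total

def process_pss_kb(pid="self"):
    """Return the PSS of a process in kB (Linux only, needs read access)."""
    with open("/proc/%s/smaps" % pid) as f:
        return total_pss_kb(f.read())
```

Because shared pages are split among the processes that map them, two Tahoe nodes that load Crypto++ from the same file should each show a lower PSS than two nodes that each map a private copy.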
In 42d8a79/trunk:
Oops, I forgot to use the don't-close-Trac style of github commit message. This ticket might not be worth keeping around, but it shouldn't have been closed like that. Sorry!
Tahoe-LAFS switched from pycryptopp to cryptography in ticket:3031.