our automated memory measurements might be measuring the wrong thing #227

Closed
opened 2007-12-09 13:20:41 +00:00 by zooko · 9 comments

As visible in [the memory usage graphs](http://allmydata.org/tahoe-figleaf-graph/hanford.allmydata.com-tahoe_memstats.html), pycryptopp increased the static memory footprint by about 6 MiB when we added it in early November (I think it was November 6, although [the Performance page](wiki/Performance) says November 9), and removing pycrypto on 2007-12-03 seems to have had almost no benefit in reducing memory footprint.

This reminds me of the weirdness about the 64-bit version using way more memory than we expected.

Hm. I think maybe we are erring by using "VmSize" (from /proc/*/status) as our proxy for memory usage. That number is the total size of the virtual address space requested by the process, if I understand correctly. So for example, mmap'ing a file adds the file's size to your VmSize, although it does not (by itself) use any memory.

Linux kernel hackers seem to be in universal agreement that it is a bad idea to use VmSize for anything:

http://bmaurer.blogspot.com/2006/03/memory-usage-with-smaps.html
http://lwn.net/Articles/230975/

But what's the alternative? We could read "smaps" and see if we can get a better metric out of that.
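
To make that concrete, here is a minimal sketch (not what the buildbot currently runs) that reads VmSize from /proc/&lt;pid&gt;/status and sums the per-mapping Pss lines from /proc/&lt;pid&gt;/smaps; it assumes a kernel new enough to report a Pss field:

```python
# Hedged sketch: compare VmSize from /proc/<pid>/status with the sum of the
# per-mapping Pss ("proportional set size") lines from /proc/<pid>/smaps.
# Assumes a kernel that exposes Pss; not part of the buildbot code.

def vmsize_kib(pid="self"):
    with open("/proc/%s/status" % pid) as f:
        for line in f:
            if line.startswith("VmSize:"):
                return int(line.split()[1])  # value is reported in kB

def pss_kib(pid="self"):
    total = 0
    with open("/proc/%s/smaps" % pid) as f:
        for line in f:
            if line.startswith("Pss:"):
                total += int(line.split()[1])  # each mapping's Pss, in kB
    return total

print("VmSize: %d KiB, Pss total: %d KiB" % (vmsize_kib(), pss_kib()))
```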

By the way, if anyone wants to investigate more closely the memory usage, the valgrind tool named massif has been rewritten so maybe it will work this time.

zooko added the unknown, major, defect, 0.7.0 labels 2007-12-09 13:20:41 +00:00
zooko added this to the eventually milestone 2007-12-09 13:20:41 +00:00
zooko changed title from pycryptopp uses up a 6 MB of memory (or at least it increases VmSize by 6M) to our automated memory measurements might be measuring the wrong thing 2008-01-01 22:48:21 +00:00
Author

Here is a way to test whether your memory measurement is giving you useful answers. Take a machine with little physical RAM -- I have one here with 500 MB -- turn off swap, and start more and more Tahoe clients, each one doing the "upload" operation, until eventually you get malloc failures or Linux OOM kills or whatever.

Now divide your physical RAM by the number of Tahoe clients that you were able to run without incurring memory problems. The result of that division is a reasonable approximation of the "memory requirements" of the current Tahoe client.
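
The arithmetic itself is trivial; here is a sketch, where the client count (12 below) is a made-up example standing in for whatever you observed:

```python
# Hedged sketch of the division step: physical RAM divided by the number of
# Tahoe clients that ran without memory problems. The 12 is a made-up example
# value; read the real count off your experiment.

def approx_mem_per_client_mib(n_clients):
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                total_kib = int(line.split()[1])
                break
    return total_kib / 1024.0 / n_clients

print("approx memory requirement: %.1f MiB per client" % approx_mem_per_client_mib(12))
```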

This sounds like fun -- I'll accept this ticket.

zooko added dev-infrastructure and removed unknown labels 2008-01-01 22:52:18 +00:00
Author

Ooh, and as Seb just reminded me, I can turn off overcommit first too, to make it more deterministic/analyzable.

Author

Please see:

http://allmydata.org/pipermail/tahoe-dev/2008-January/000341.html

Zandr: how would you feel about turning off swap for tahoeslave-feisty and for zandr-64? I believe that turning off swap is necessary in order to get a useful measurement of memory. (Personally, I turn off swap on my Linux systems anyway.)

I think that turning off memory overcommit isn't strictly necessary for doing measurements, but it might help by showing memory exhaustion errors in a more deterministic way than the Linux OOM killer.
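
For reference, here is a sketch of the two knobs involved (run as root); these are just the standard Linux settings, nothing Tahoe-specific:

```python
# Hedged sketch (run as root): disable all swap and switch the kernel to
# strict overcommit accounting, so allocations fail with ENOMEM up front
# instead of triggering the OOM killer later. Equivalent to `swapoff -a`
# and `sysctl vm.overcommit_memory=2`.
import subprocess

subprocess.check_call(["swapoff", "-a"])
with open("/proc/sys/vm/overcommit_memory", "w") as f:
    f.write("2\n")
```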

Author

Adding Cc: zandr

Author

By the way, I don't think I succeeded at boiling down the results of my research for the consumption of others. Here's the boiled-down version: measuring the vsize as we do in [our Performance page](wiki/Performance) gives a number much higher than what we actually want to know, and it changes even when the thing that we care about hasn't changed, so it is either useless or only barely useful. Measuring the resident set size would give something probably smaller or possibly larger than the thing we want to know, and it too would change randomly when the thing we care about hasn't changed. The two of them put together and then eyeballed might give you insight, or might just mislead you.

The idea that I had and wrote up in this ticket (above) was a third option: turn off swap and measure resident. That gives you a number that is probably pretty close to what you care about, if what you care about is something like "How much RAM do I need in my machine to run one Tahoe node without it needing to swap?" (If you have a different idea of what you want to know then by all means speak up.)
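
A sketch of that third option, once swap is off, is just to read VmRSS from /proc/&lt;pid&gt;/status (again, this is not what the buildbot currently records):

```python
# Hedged sketch: with swap turned off, the resident set size (VmRSS in
# /proc/<pid>/status) is close to "how much RAM this node needs right now".

def resident_kib(pid="self"):
    with open("/proc/%s/status" % pid) as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1])  # value is reported in kB

print("VmRSS: %d KiB" % resident_kib())
```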

Anyway, that's all my attempt to restate the history of this ticket and explain why you shouldn't pay much if any attention to the numbers on the Performance page. The new news is that Matt Mackall has been working on this problem and has a new tool that can help (on Linux):

http://lwn.net/SubscriberLink/329458/d28c2d45a663045a

Author

If you love this ticket (#227), then you might like tickets #54 (port memory usage tests to windows), #419 (pycryptopp uses up too much RAM), #478 (add memory-usage to stats-provider numbers), and #97 (reducing memory footprint in share reception).

Author

Here's the permanent URL for that LWN.net article: Matt Mackall has invented "smem", which provides measurements of memory usage that are actually useful: <http://lwn.net/Articles/329458/>


I took a quick look at smem today; it seems pretty nice. I think the "USS" (Unique Set Size) might be a good thing to track: it's the amount of memory you'd get back by killing the process. For Tahoe, the main thing we care about is that the client process isn't leaking or over-allocating the memory used to hold files during the upload/download process, and that memory isn't going to be shared with any other process. So even if it doesn't answer "can I fit this tahoe node/workload on my NN-MB computer?", it does answer the question of whether we're meeting our memory-complexity design goals.

Installing smem requires a bunch of other stuff (python-gtk2, python-tk, matplotlib), since it has a graphical mode that we don't care about, but that's not a big deal. There's a process-filter thing which I can't find documentation on, which we'd need in order to limit the output to the tahoe client's own PID. And then the main downside I can think of is that you have to shell out to a not-small python program for each sample (vs reading /proc/self/status, which is basically free), so somebody might be worried about the performance impact.
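
If the shell-out cost turns out to matter, one option is to compute USS directly from /proc/&lt;pid&gt;/smaps, since it should just be the sum of the Private_Clean and Private_Dirty lines; a minimal sketch, assuming a kernel that exposes those fields:

```python
# Hedged sketch: USS ("unique set size") computed straight from smaps, as the
# sum of Private_Clean + Private_Dirty across all mappings, avoiding a shell-out
# to smem on every sample. Assumes the kernel exposes these per-mapping fields.

def uss_kib(pid="self"):
    total = 0
    with open("/proc/%s/smaps" % pid) as f:
        for line in f:
            if line.startswith(("Private_Clean:", "Private_Dirty:")):
                total += int(line.split()[1])  # value is reported in kB
    return total

print("USS: %d KiB" % uss_kib())
```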


This ticket is about operational visibility for operations that are no longer operational.

exarkun added the wontfix label 2020-12-09 14:13:06 +00:00