Performance problems testing on a 45-disk, dual-core 3.3 GHz, 4 GB memory box #962
When uploading a 200 MB file encoded into 10 shares with 3 needed to recover, the system load spiked to 100% for ~20 minutes. During this time the system was also swapping.
Performance was extremely slow.
So did you have 45 different Tahoe-LAFS storage server processes? Could you check how much RAM each one is using with top?
(Or, for bonus points, with smem instead: http://www.selenic.com/smem/ )
It would be helpful to know how you have this configured.
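For example, something like this would list per-process resident memory, largest first (assuming the nodes show up as python processes; adjust the -C pattern if they appear as twistd instead):
$ ps -C python -o pid,rss,args --sort=-rss | head
$ ps -C python -o rss= | awk '{sum += $1} END {print sum " KiB total"}'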
System Configuration (http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage)
1 Intel(R) Core(TM)2 Duo CPU E8600 @ 3.33 GHz
1 80 GB boot drive mounted as /
45 Seagate 1.5 TB drives mounted as /tahoe/node1-45
4 GB memory
eth0 is at 100 Mb/s full duplex
Default tahoe configuration.
SMEM :
root@firetruck1:~/app/smem-0.9# ./smem |egrep "PID|python"
================
Note:
shares.needed = 32
shares.happy = 35
shares.total = 40
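(For anyone following along: these keys live in the [client] section of the gateway's tahoe.cfg. A quick way to confirm what every node on this box is actually configured with, assuming the /tahoe/node1-45 layout above:)
$ grep -H 'shares\.' /tahoe/node*/tahoe.cfg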
Performance has greatly increased. The 3.3x expansion ("blowup") from the old 3-of-10 encoding may have been the largest contributing factor.
I'm still concerned about memory utilization per node as the number of nodes in the grid increases, but that may be a separate ticket.
Performance is still much less than I would like to see.
bash-3.2$ time curl -T 524stottlemyer.zip http://10.20.0.151:3470/uri/URI:DIR2:whzhc6kxh5mfem4basdoioewji:4qpwpea2s46tg3lnd5znwsyqqenwjoddjxpsvr733sk7n4on6uua/
URI:CHK:fsmf5rn6me7mrueq2ri4zvboay:sd5lnugn2gnisrvodxtzp7zvkx4f2skyk6ri5ncs65rmq6qkdoqa:32:40:106554122
real 1m17.962s
bash-3.2$ ls -larth 524stottlemyer.zip
-rw-r--r-- 1 justin staff 102M Nov 20 15:44 524stottlemyer.zip
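Back-of-the-envelope on those numbers (rough figures derived from the transfer above, not separate measurements):
expansion factor: shares.total / shares.needed = 40 / 32 = 1.25x (versus 10/3 ≈ 3.3x with the old 3-of-10 encoding)
bytes pushed to the storage servers: ≈ 102 MB * 1.25 ≈ 128 MB
client-observed throughput: ≈ 102 MB / 78 s ≈ 1.3 MB/s, roughly 10 Mbit/s on the 100 Mbit link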
I'll have to collect more stats during an upload to present what is going on. A snapshot indicated heavy CPU, nearly no swap in use at this time, and the disks were much better able to handle the load, with great distribution.
Here is a brief network snapshot, though the time spent on internal operations was much greater.
eth0: 0.003 In, 0.000 Out Mbps | 4 In, 0 Out pps
eth0: 40.860 In, 0.963 Out Mbps | 3709 In, 1912 Out pps
eth0: 92.249 In, 2.226 Out Mbps | 8409 In, 4354 Out pps
eth0: 37.314 In, 0.893 Out Mbps | 3430 In, 1761 Out pps
eth0: 0.006 In, 0.015 Out Mbps | 8 In, 3 Out pps
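(If it helps to reproduce these snapshots, the raw per-interface counters can be sampled straight from /proc/net/dev; this just prints the cumulative RX/TX byte counters once a second, so successive samples have to be diffed to get a rate:)
$ while true; do awk '/eth0:/ {print $2, $10}' /proc/net/dev; sleep 1; done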
Is there just one upload happening at a time? Or are you doing many in parallel?
One of the "recent uploads and downloads" stats that you pasted in IRC showed a mutable upload, rather than the normal immutable upload (which the curl command above would use). We've documented once or twice how memory usage for mutable files is pretty bad (N/k * filesize, as opposed to the well-limited N/k * 128 KiB for immutable files). source:docs/performance.txt mentions about half the story; I guess we should probably improve that section.
If you had a lot of simultaneous large-memory uploads going on, I can easily see how the overall memory usage would spike, and you'd get a lot of swapping and stalls. It sounds like there were a couple different kinds of tests taking place, though, so I don't want to misdiagnose problem A if the symptoms actually occurred during test B.
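To put rough numbers on that (estimates from the formulas above, not measurements, and assuming the original 200 MB upload was a mutable file under the initial 3-of-10 encoding):
mutable upload: peak gateway RAM ≈ N/k * filesize = 10/3 * 200 MB ≈ 670 MB (possibly ~2x that with copying)
immutable upload: encoded data in flight ≈ N/k * 128 KiB = 10/3 * 128 KiB ≈ 430 KiB per segment
A handful of concurrent mutable uploads of that size would easily push a 4 GB box that is also hosting 45 storage nodes into swap.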
Updating with attached straces
http://testgrid.allmydata.org:3567/uri/URI%3ADIR2RO%3Axqm7hlme3oncrzpoqr5thvsh7a%3A64dg3rttsh6nqujoyr62nt7kssmmqaf5rv5nozyuvqocpi4mzzba/
Attachment node1.pid.9394 (5020912 bytes) added
Gateway strace of a tahoe put of a 106 MB file.
Attachment node2.pid.3225 (173852 bytes) added
server node strace file.
Are node1.pid.9394 and node2.pid.3225 from the same upload of a 106 MB file?
I should have specified: this latest operation was a tahoe get of that 106 MB file.
Yes, these two straces are from the same get, just watched from two places, the gateway and a storage server, for the same operation.
Sorry for the confusion caused by the bad note on the attachment. I also have the strace from the get, where you can see the 24 s wait with no correlating hangs in the other processes.
The strace of the get is in the testgrid link above; it was too large to attach to Trac.
Replying to stott:
What is this URL? It appears to be a broken link.
Replying to stott:
Sorry, I'm still confused. So node1.pid.9394 is the strace of the gateway during a 106 MB GET, and node2.pid.3225 is the strace of a random server during that same GET? And the encoding parameters were: shares.needed = 32, shares.happy = 35, shares.total = 40?
And, when you did that GET, how well did it perform? Do you know about the "Recent Uploads and Downloads" page? From there you can find timing information about the transfer (as observed by the Tahoe-LAFS gateway itself) that might be useful. Could you attach the timing results from that particular GET (if it is still there on the Recent Uploads and Downloads) or from a new similar GET?
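(For reference, that page is served from the gateway's web port at the standard /status/ path, so on this grid it should be reachable with something like the following; I haven't verified it against this particular box:)
$ curl -s http://10.20.0.151:3470/status/ | less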
Here is the relevant recent-GET status for that download: http://codepad.org/cxuaFwz2
I'll attempt to attach the HTML.
Attachment codepad-tahoeget.html (7436 bytes) added
recent stats of the get
Okay, thank you for the timings in codepad-tahoeget.html. I see that it was a mutable file of size 106 MB, with K=32, N=40 encoding. Therefore I would expect the gateway to (temporarily) use up to 2 times 106 MB of RAM when downloading the file.
(Aside: hmm, [docs/performance.txt]source:docs/performance.txt@4240#L51 says that it will take N/K*A. I don't think that's right; I think it will take just A (possibly times a factor of 2 or so, because it does a certain amount of copying the data in RAM). I'll change the doc to say just A there.)
Okay, and the timings reported by the gateway node for each server will just have a single data point, because shares of a mutable file are transferred in one big block of size A/K. (As contrasted with shares of an immutable file, which are transferred in a succession of blocks, each of size 128 KiB/K.) Each of these blocks should be about 106 MB / 32 ≅ 3.33 MB.
The timing results are that one of the servers, cvpcsjio, sent its 3.33 MB block in 442 ms, and every other server sent its 3.33 MB block in 22 sec.
(Aside: I would really like to know if that was exactly 22,000 ms for every one of them or what. The display of these measurements should include a fixed and appropriate amount of precision instead of rounding off to seconds.)
What is special about cvpcsjio?
Okay, so does this make sense so far? The next question in my mind is how much of that 22 seconds was spent doing what. That 22 seconds encompasses the following sequence of actions:
Looking at the strace of one of the servers (which one was it? Was it cvpcsjio?) might help explain how much of the 22 seconds was spent in which of those steps. Also, turning on verbose logging (per [docs/logging.txt]source:docs/logging.txt) on one of the servers would give information, but it would also make that server use a lot more CPU and possibly respond more slowly.
Hm... It might be worth adding to source:docs/performance.txt a note about the fact that when downloading a mutable file you first have to wait for the entire file data to transfer from the storage servers to the gateway, and then the data streams from the gateway to your client. By "streams" here, I mean that your client can receive the first part of the data before the gateway has sent the last part of the data. Observe that the data of a mutable file does not "stream" from the storage servers all the way to your client: your client won't see the first part of the mutable file until after the gateway has received the last part of the mutable file from the storage servers.
By contrast, if you download an immutable file it will "stream" all the way from the storage servers to your client—your client can see the first part of the immutable file as soon as the gateway receives the first block of each share from the storage servers, long before the gateway has seen the last part of the immutable file.
This is potentially another reason to prefer immutable files over mutable files for performance when you're using Tahoe-LAFS v1.6. (How big of a performance issue this is depends on the details of your network and servers and the size of the file.)
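(A rough illustration of what that means for time-to-first-byte on this box, treating the 100 Mbit link as the only bottleneck; back-of-the-envelope numbers, not measurements:)
mutable 106 MB GET: the client's first byte waits for the gateway to receive all 106 MB, i.e. at least 106 MB / 100 Mbit/s ≈ 8.5 s, plus per-server and decode time
immutable 106 MB GET: the client's first byte can arrive once the gateway has fetched one segment's worth of blocks (about 128 KiB in total), a small fraction of a second on this link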
Replying to zooko:
It would be interesting to see the timing results of downloading a 106 MB immutable file on stott's petabyte-on-a-budget box. It would also be interesting to know the total time from the perspective of the client (which is different from the perspective of the Tahoe-LAFS gateway). For example, if you are downloading the file with "tahoe get" then use the bash "time" builtin to tell you how long from when you pressed ENTER on the "tahoe get" command to when that command finished. Or if you are downloading the file with wget or curl then capture the output that those tools can emit to show elapsed time and observe throughput of their download.
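(A sketch of what that client-side measurement could look like; the cap, port, and file names below are placeholders, not the real ones from this grid:)
$ time tahoe get tahoe:bigfile.bin /tmp/bigfile.bin
$ curl -o /tmp/bigfile.bin -w "elapsed %{time_total}s, average %{speed_download} bytes/s\n" http://127.0.0.1:3456/uri/URI%3ACHK%3A.../bigfile.bin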
I was collecting those stats as well; the curl and the tahoe get were both taking the same amount of time as indicated by the gateway.
If you read the strace output, you can see that each server actually appears to send its data in ~450-600 ms; however, the 22 s measurement appears to be the total time across all servers, not the actual time per server. There is nothing special about cvpcsjio except that I believe it actually gave an accurate measure.
I'll test again tomorrow with an immutable file and provide more details.
Oh, I see! Yes, this is suggestive of either a bug in mutable file download performance or a bug in the timings report for mutable file downloads, or both.
I'll examine the source code for the timing reports for mutable file downloads.
I have upgraded to 1.6.0. The upload/download speeds are the same.
Recent stats no longer show uploads.
Recent stats now show downloads, as the newest attachment shows.
Attachment 1.6DL.html (71866 bytes) added
Okay, what's the status of this issue? I think it is time to open new tickets for more specific issues now that we can tell them apart.
The first issue that began this ticket was that uploading a large mutable file caused high CPU load on the storage servers. I'll open a new ticket for that issue.
Okay, I've created #983 (high CPU load on storage servers when uploading large mutable file) to track that issue.
Stott: I'm not sure what the next step is to make progress on this ticket or to close it. How about you rerun your performance tests using the shiny new Tahoe-LAFS v1.8.0c3? :-)
It should be no worse than the previous version for mutable files or for uploading immutable files, and better for downloading immutable files. It might be a little better if your servers are all similarly performant and your shares are well distributed, or a lot better if there is more heterogeneity in your servers' performance, or if some servers are overloaded compared to others.
Stott: would you please run your performance tests using Tahoe-LAFS v1.8.2? If you're not going to do that, let's just close this ticket because it is old and because I don't know what to do to make progress towards closing it. If you have some other request or can explain more specifically one particular issue that you had then we could use that as a starting point for making progress.
Reading back over this ticket, I don't see a lot of evidence about a performance issue specifically in immutable files. Most of the problems turned out to be with mutables, including ticket #983, which has been fixed. We don't know for sure that there aren't still performance issues in the current version when used on a 45-drive box, but until we get more evidence we'll leave this issue closed.