pipeline upload segments to make upload faster #392
In ticket #252 we decided to reduce the max segment size from 1MiB to 128KiB. But this caused in-colo upload speed to drop by at least 50%.
We should see if we can pipeline two segments for upload, to get back the extra round-trip times that we lost with having more segments.
It's also possible that some of the slowdown is just from the extra overhead of computing more hashes, but I suspect the turnaround time more than the hashing overhead.
We need to do something similar for download too, since download speed was also reduced drastically by the segsize change.
Oh, and I just thought of the right place to do this too: in the WriteBucketProxy. It should be allowed to keep a Nagle-like cache of write vectors, and send them out in a batch when the cache gets larger than some particular size (that will coalesce small writes into a single call, reducing the round-trip time). In addition, it should be allowed to have multiple calls outstanding as long as the total amount of data that it has sent (and that therefore might be in the transport buffer) is below some amount, say 128KiB. If k=3, then that should allow three segments to be on the wire at once, mitigating the slowdown due to round trips. As long as bandwidth*RTT is less than the window size, this should keep the pipe full.
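A rough sketch of the batching half of that idea, using Twisted Deferreds, is below. The class name and the "writev" remote call are hypothetical, for illustration only, not the actual Tahoe-LAFS storage protocol:

```python
# Hypothetical sketch of a Nagle-like write cache: small (offset, data)
# vectors are held locally and flushed as one remote call once the cache
# exceeds a threshold, so many tiny writes cost a single round trip.
from twisted.internet import defer

class BatchingWriteProxy:
    def __init__(self, rref, batch_size=50*1024):
        self._rref = rref              # remote reference to the storage server
        self._batch_size = batch_size
        self._pending = []             # cached (offset, data) write vectors
        self._pending_bytes = 0

    def write(self, offset, data):
        self._pending.append((offset, data))
        self._pending_bytes += len(data)
        if self._pending_bytes >= self._batch_size:
            return self._send_batch()
        return defer.succeed(None)     # small writes are absorbed locally

    def _send_batch(self):
        batch, self._pending, self._pending_bytes = self._pending, [], 0
        # one round trip carries all of the coalesced write vectors;
        # "writev" is an assumed remote method, not the real protocol
        return self._rref.callRemote("writev", batch)

    def flush(self):
        # push out whatever is still cached (call this before close())
        if self._pending:
            return self._send_batch()
        return defer.succeed(None)
```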
#320 is related, since the storage-server protocol changes we talked about would make it easier to implement the pipelining.
Attachment pipeline.diff (14391 bytes) added
patch to add pipelining to immutable upload
So, using the attached patch, I added pipelined writes to the immutable upload operation. The Pipeline class allows up to 50KB in the pipe before it starts blocking the sender (specifically, the calls to WriteBucketProxy._write return defer.succeed until there is more than 50KB of unacknowledged data in the pipe, after which they return regular Deferreds until some of those writes get retired). A terminal flush() call causes the Upload to wait for the pipeline to drain before it is considered complete.
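For reference, here is a rough sketch of that gating behaviour (not the patch itself; names and details are illustrative): sends get a pre-fired Deferred while less than the capacity is unacknowledged, an unfired Deferred once the pipe is full, and flush() fires when everything has been retired.

```python
from twisted.internet import defer

class PipelineSketch:
    """Illustrative stand-in for the Pipeline behaviour described above."""

    def __init__(self, capacity=50*1000):
        self._capacity = capacity
        self._in_flight = 0        # bytes sent but not yet acknowledged
        self._waiters = []         # Deferreds handed out while over capacity
        self._drain_waiters = []   # Deferreds handed out by flush()

    def add(self, size, send_fn, *args):
        """Start send_fn(*args) (which returns a Deferred) immediately."""
        self._in_flight += size
        d = send_fn(*args)
        d.addBoth(self._retired, size)
        if self._in_flight <= self._capacity:
            return defer.succeed(None)   # caller may keep sending
        w = defer.Deferred()             # caller must wait for room
        self._waiters.append(w)
        return w

    def _retired(self, result, size):
        self._in_flight -= size
        while self._waiters and self._in_flight <= self._capacity:
            self._waiters.pop(0).callback(None)
        if self._in_flight == 0:
            while self._drain_waiters:
                self._drain_waiters.pop(0).callback(None)
        return result

    def flush(self):
        """Fires once every pipelined write has been acknowledged."""
        if self._in_flight == 0:
            return defer.succeed(None)
        d = defer.Deferred()
        self._drain_waiters.append(d)
        return d
```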
A quick performance test (in the same environments that we do the buildbot
performance tests on: my home DSL line and tahoecs2 in colo) showed a
significant improvement in the DSL per-file overhead, but only about a 10%
improvement in the overall upload rate (for both DSL and colo).
Basically, the 7 writes used to write a small file (header, segment 0,
crypttext_hashtree, block_hashtree, share_hashtree, uri_extension, close) are
all put on the wire together, so they take bandwidth plus 1 RTT instead of
bandwidth plus 7 RTT. The savings of 6 RTT appears to save us about 1.8
seconds over my DSL line. (my ping time to the servers is about 11ms, but
then there's kernel/python/twisted/foolscap/tahoe overhead on top of that).
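As a quick sanity check on those numbers: 1.8 seconds saved across 6 round trips works out to roughly 300ms of effective turnaround per call, far more than the raw 11ms ping, which is consistent with per-call overhead being the dominant cost:

```python
# back-of-the-envelope check on the figures quoted above
saved_seconds = 1.8        # measured savings from pipelining a small file
round_trips_removed = 6    # 7 sequential writes collapsed into 1
raw_ping = 0.011           # ~11ms ping to the servers

effective_turnaround = saved_seconds / round_trips_removed   # ~0.30 s/call
per_call_overhead = effective_turnaround - raw_ping          # ~0.29 s/call
print(effective_turnaround, per_call_overhead)
```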
For a larger file, pipelining might increase the utilization of the wire,
particularly if you have a "long fat" pipe (high bandwidth but high latency).
However, with 10 shares going out at the same time, the wire is probably
pretty full already: the ratio of interest is (segsize*N/k/BW) / RTT. You send
N blocks for a single segment at once, then you wait for all the replies to
come back, then generate the next blocks. If the time it takes to send a
single block is greater than the server's turnaround time, then N-1 responses
will be received before the last block is finished sending, so you've only
got one RTT of idle time (while you wait for the last server to respond).
Pipelining will fill this last RTT, but my guess is that this isn't much of a
help, and that something else is needed to explain the performance hit we saw
in colo when we moved to smaller segments.
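To illustrate that ratio with the ticket's parameters (segsize=128KiB, N=10, k=3) and a hypothetical 100 kB/s upstream (the actual link speeds are not given here), the time to push one segment's blocks dwarfs a DSL-scale RTT, so the wire is already nearly full and pipelining only hides the trailing round trip per segment:

```python
segsize = 128 * 1024      # bytes per segment (post-#252 default)
N, k = 10, 3              # shares sent vs. shares needed
bandwidth = 100 * 1000    # bytes/sec upstream -- assumed for illustration
rtt = 0.011               # seconds, the DSL ping time quoted above

block = segsize / k                   # bytes sent to each server per segment
send_time = (block * N) / bandwidth   # time to push one segment's N blocks
print(send_time, rtt)                 # ~4.4 s vs 0.011 s
```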
DSL no pipelining:
DSL with pipelining:
The in-colo tests showed roughly the same improvement to upload speed, but
very little change to the per-file time. The RTT there is shorter (ping
time is about 120us), which might explain the difference. But I think the
slowdown lies elsewhere. Pipelining shaves about 30ms off each file, and
increases the overall upload speed by about 10%.
colo no pipelining:
colo with pipelining:
I want to run some more tests before landing this patch, to make sure it's
really doing what I thought it should be doing. I'd also like to improve the
automated speed-test to do a simple TCP transfer to measure the available
upstream bandwidth, so we can compare tahoe's upload speed against the actual
wire.
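Something like the following raw-TCP measurement would do for that baseline; this is only a sketch (the host, port, and transfer size are placeholders, and the real version would live in the automated speed-test harness):

```python
import socket, time

def measure_upstream(host, port, nbytes=5 * 1024 * 1024):
    """Push nbytes over a plain TCP connection and report bytes/sec.

    Approximate only: sendall() returns once data reaches the kernel
    buffer, so very short transfers will overstate the bandwidth.
    """
    payload = b"\x00" * 65536
    sent = 0
    with socket.create_connection((host, port)) as s:
        start = time.time()
        while sent < nbytes:
            s.sendall(payload)
            sent += len(payload)
        elapsed = time.time() - start
    return sent / elapsed

# e.g. print(measure_upstream("discard.example.net", 9))
```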
I pushed this patch anyway; I think it'll help, just not as much as I was hoping for.
In 5e1d464/trunk: