add readv() API to immutable-share storage-server protocol, use in downloader #1545
Reference: tahoe-lafs/trac-2024-07-25#1545
One of the most obvious fixes for the immutable-download performance problems tracked in #1264 (and shown in the Performance/Sep2011 results) is to implement a scatter/gather readv() method for immutable shares. The graphs show MDMF downloads running just as fast with k=60 as with k=3, whereas immutable files slow down drastically (about 10x) between k=3 and k=60. We're still investigating, but I suspect that Foolscap's message-serialization performance is to blame, and an easy way to mitigate that is to send fewer messages.
The interface should probably be just like the mutable share's remote_readv() API: the argument is a vector of (offset, length) tuples, and the return value is a vector of data strings. (A future HTTP-based interface will probably pack these vectors into a single string, but we might experiment with doing that here too: basically do the marshalling before handing anything to Foolscap, trading off generality for performance.)
David-Sarah mentioned that some of their new storage-backend code (for LAE) provides this interface, so we're likely to have the back half of this feature fairly soon. The rest of the work is to change immutable/downloader/share.py to turn a Request span into a read vector, instead of looping over all pieces of the span and sending a separate read() request for each.
Early results suggest that doing this would speed up high-k immutable downloads by about 24%. For example, k=54 with trunk takes roughly 414s to download a 100MB file (6 servers, 3 hosts, LAN connections). When basic readv() is used, this drops to 317s. For small k (like the default k=3), the effect is less clear, but there still seems to be a significant improvement (k=3 trunk 100MB takes maybe 38s, with-readv takes 32s).
The effect is roughly halfway between unmodified trunk CHK and trunk MDMF (which prefetches the whole block_hash_tree and doesn't even have a crypttext_hash_tree).
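For readers who want to see the shape of the proposal, here is a minimal sketch of the read-vector interface described above: a vector of (offset, length) tuples goes in, a vector of data strings comes out, and the whole span costs one message instead of one per piece. The FakeShare class, its calls counter, and the two fetch helpers are hypothetical names invented for illustration. They are not taken from the Tahoe-LAFS code or from the attached patch, and no Foolscap call is actually made.

```python
from typing import List, Sequence, Tuple

# A read vector: a sequence of (offset, length) pairs.
ReadVector = Sequence[Tuple[int, int]]


class FakeShare:
    """Hypothetical in-memory stand-in for an immutable share on a server."""

    def __init__(self, data: bytes) -> None:
        self._data = data
        self.calls = 0  # counts simulated remote messages

    def read(self, offset: int, length: int) -> bytes:
        # One message per piece, like the existing read() loop.
        self.calls += 1
        return self._data[offset:offset + length]

    def readv(self, read_vector: ReadVector) -> List[bytes]:
        # One message for the whole vector; results keep the request order.
        self.calls += 1
        return [self._data[o:o + n] for (o, n) in read_vector]


def fetch_with_read(share: FakeShare, pieces: ReadVector) -> List[bytes]:
    """Current approach: loop over the pieces and send one read() each."""
    return [share.read(o, n) for (o, n) in pieces]


def fetch_with_readv(share: FakeShare, pieces: ReadVector) -> List[bytes]:
    """Proposed approach: turn the pieces into a single read vector."""
    return share.readv(list(pieces))


if __name__ == "__main__":
    data = bytes(range(256))
    pieces = [(0, 4), (10, 2), (250, 10)]  # last piece runs past the end
    a, b = FakeShare(data), FakeShare(data)
    assert fetch_with_read(a, pieces) == fetch_with_readv(b, pieces)
    print("read() messages:", a.calls, "readv() messages:", b.calls)
```

Both helpers return the same data; the only difference in this toy model is the message count, which is the quantity the ticket suspects dominates at high k.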
Attachment readv.diff (8229 bytes) added
add readv() support to server, use it from the client if available
That patch is just a proof-of-concept, not actually ready or recommended for landing. I'm attaching it so others can reproduce my results.
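The "use it from the client if available" part can be pictured as a simple fallback. The sketch below is not the contents of readv.diff. It assumes, purely for illustration, that support can be detected by checking for a readv attribute on a local object; how a real client would detect support on a remote Foolscap reference is not shown in this ticket. OldServer, NewServer, and read_pieces are hypothetical names.

```python
from typing import List, Sequence, Tuple

ReadVector = Sequence[Tuple[int, int]]  # (offset, length) pairs


class OldServer:
    """Hypothetical server that only offers per-piece read()."""

    def __init__(self, data: bytes) -> None:
        self._data = data

    def read(self, offset: int, length: int) -> bytes:
        return self._data[offset:offset + length]


class NewServer(OldServer):
    """Hypothetical server that also offers readv()."""

    def readv(self, read_vector: ReadVector) -> List[bytes]:
        return [self._data[o:o + n] for (o, n) in read_vector]


def read_pieces(server, pieces: ReadVector) -> List[bytes]:
    # Assumption: attribute lookup stands in for real capability detection.
    readv = getattr(server, "readv", None)
    if readv is not None:
        return list(readv(list(pieces)))
    # Fall back to one read() per piece for servers without readv().
    return [server.read(o, n) for (o, n) in pieces]


if __name__ == "__main__":
    data = b"abcdefghij"
    pieces = [(0, 3), (5, 2)]
    print(read_pieces(OldServer(data), pieces))
    print(read_pieces(NewServer(data), pieces))
```

Keeping the fallback in one helper means the rest of the downloader does not need to know which kind of server it is talking to.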