Protocol is potentially high-latency and high bandwidth overhead for small files #3766
Imagine uploading a new, small file. As I understand it, this will require:

1. One HTTP query to create the storage index.
2. One HTTP query per share to upload the share data; with the default 3-of-10 encoding, that's 10 more queries, for 11 in total.
One can't do all of these queries in parallel, only the share uploads, because the uploads race against the storage index coming into existence. So even a clever, async client implementation will still require two sequential HTTP round trips per file upload: one to create the storage index, then one round of parallel share uploads, as in the sketch below.
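To make the two round trips concrete, here is a minimal sketch of such an async client. The endpoint paths, JSON fields, and headers are loose assumptions modeled on the proposed HTTP storage protocol, not the actual API:

```python
# A minimal sketch of the two-round-trip flow, assuming hypothetical
# Great-Black-Swamp-style endpoints.  The paths, JSON fields, and
# headers here are illustrative assumptions, not the actual API.
import asyncio
import httpx

async def upload_immutable(base_url: str, storage_index: str,
                           shares: dict[int, bytes]) -> None:
    async with httpx.AsyncClient(base_url=base_url) as client:
        # Round trip 1: create the storage index / allocate shares.
        # Nothing below can start until this succeeds, which is why
        # the upload can never collapse to a single round trip.
        resp = await client.post(
            f"/storage/v1/immutable/{storage_index}",
            json={
                "share-numbers": sorted(shares),
                "allocated-size": max(map(len, shares.values())),
            },
        )
        resp.raise_for_status()

        # Round trip 2: upload all shares concurrently.  A naive
        # sequential client would instead pay one round trip per
        # share here (10 of them, 11 in total).
        async def put_share(num: int, data: bytes) -> None:
            r = await client.patch(
                f"/storage/v1/immutable/{storage_index}/{num}",
                content=data,
                headers={"content-range":
                         f"bytes 0-{len(data) - 1}/{len(data)}"},
            )
            r.raise_for_status()

        await asyncio.gather(*(put_share(n, d) for n, d in shares.items()))
```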
In addition to the doubled latency (or 11× latency for a naive client, which maybe we don't care about), there's also per-query HTTP protocol overhead: request lines, headers, and TLS framing, multiplied across every one of those queries.
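Some rough arithmetic (the header size is an assumption, not a measurement): a 1 KiB file with 3-of-10 encoding produces shares of roughly 1024/3 ≈ 342 bytes each, while a plausible HTTP request carries a few hundred bytes of headers. For files this small, the per-query protocol overhead is on the same order as the share data itself.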
One can imagine an optimized variant of the API that combines storage index creation and share upload in a single HTTP call for smaller files (sketched below). This is, however, an optimization, and probably needn't exist in the first version.
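Purely as illustration of the idea, a combined call might look something like this; the endpoint, the query parameter, and the multipart field names are all invented for this sketch and exist nowhere in the protocol:

```python
# Purely hypothetical: a combined "allocate and upload" call for small
# files.  No such endpoint exists; the path, the ?create=true query
# parameter, and the multipart field names are all invented here.
import httpx

def upload_small_immutable(base_url: str, storage_index: str,
                           shares: dict[int, bytes]) -> None:
    files = {
        f"share-{num}": (f"share-{num}", data, "application/octet-stream")
        for num, data in shares.items()
    }
    # One round trip: storage index creation and all share bodies
    # travel in a single multipart request.
    resp = httpx.post(
        f"{base_url}/storage/v1/immutable/{storage_index}?create=true",
        files=files,
    )
    resp.raise_for_status()
```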
Just on the point of such a naive client specifically: there are other motivations not to be this naive. Primarily, all shares are produced at the same time as the cleartext is processed. If you upload only one of them at a time, you have to store all the rest locally until you're ready to upload them. If you upload in parallel (which the current Tahoe-LAFS does over the Foolscap protocol), you never have to store any of them locally; you can stream them all up as they're generated, as in the sketch below.
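A rough sketch of that streaming shape, using zfec directly. The real Tahoe-LAFS encoder does much more (segment hashing, padding metadata, and so on), and `send_block` stands in for a hypothetical per-share network stream, such as one PATCH body per share:

```python
# A rough sketch of streaming upload using zfec directly.  The real
# Tahoe-LAFS encoder does much more (segment hashing, padding metadata,
# etc.), and send_block stands in for a hypothetical per-share network
# stream.
import zfec

K, M = 3, 10            # default Tahoe-LAFS encoding parameters
SEGMENT = 128 * 1024    # segment size; this exact value is an assumption

def stream_upload(cleartext, send_block):
    encoder = zfec.Encoder(K, M)
    while True:
        segment = cleartext.read(SEGMENT)
        if not segment:
            break
        # Pad so the segment splits into K equal-length input blocks
        # (the real protocol records the true length elsewhere).
        blocklen = -(-len(segment) // K)  # ceiling division
        segment = segment.ljust(blocklen * K, b"\x00")
        inblocks = [segment[i * blocklen:(i + 1) * blocklen]
                    for i in range(K)]
        # One encoding step yields all M share blocks at once...
        blocks = encoder.encode(inblocks, list(range(M)))
        # ...so each block can go straight out on the wire instead of
        # being buffered locally until its share's turn comes.
        for sharenum, block in enumerate(blocks):
            send_block(sharenum, block)
```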
For small files, who cares. But for large files this is likely to be pretty crummy, especially given ZFEC expansion, which means you might end up storing 2× or 3× the file size or more (technically the maximum is 255×, I think, but that's not a very likely client configuration).
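For concreteness: with k-of-m encoding, the total share data is m/k times the input, so the default 3-of-10 encoding expands a file by 10/3 ≈ 3.33×, and the 255× worst case corresponds to 1-of-255 encoding.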