tolerate simultaneous uploads better #2409
Reference: tahoe-lafs/trac-2024-07-25#2409
In the Nuts+Bolts meeting this morning, we discussed what would happen if an application (in particular the "magic folder / drop-upload" feature) were to upload two copies of the same file at the same time. We thought about this a long time ago, but I can't seem to find a ticket on the particular issue.
I believe there's a race condition on the storage servers which would make the upload go less smoothly than we'd like. The first upload will see no pre-existing shares for the storage index, so it will allocate BucketWriters and start writing the shares. The second upload will compute the same storage-index, ask the server about pre-existing shares, and then... probably get a yes?
The answer is uncertain, and depends upon the server implementation. The server's read-side might look on disk for the partially-written files, or the server's write-side might be using the write-to-tempfile atomic-swap technique, or the read-side might be looking in a leasedb for evidence of the share. Some of these will result in a "no" answer to the DYHB (do-you-have-block) query, in which case the second upload will try to allocate new BucketWriters to fill the shares (which might fail because of the existing writers, or might succeed with hilarious results as the two writers attempt to write the same file with hopefully the same data). It might get a "yes", in which case I think the uploader will ignore the shares and assume that they'll be present in the future.
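To make the race concrete, here is a minimal sketch (not the real server code) of how different "do you have shares?" checks can disagree while an upload is still in progress. The directory layout and function names here are hypothetical, chosen only to illustrate the point that the answer depends on whether the read-side counts half-written shares.

```python
import os

def shares_on_disk(storage_dir, storage_index):
    """Finished shares: these appear only after the first writer closes them."""
    d = os.path.join(storage_dir, "shares", storage_index)
    return set(os.listdir(d)) if os.path.isdir(d) else set()

def shares_incoming(storage_dir, storage_index):
    """Partially-written shares still held by an open BucketWriter."""
    d = os.path.join(storage_dir, "incoming", storage_index)
    return set(os.listdir(d)) if os.path.isdir(d) else set()

def answer_dyhb(storage_dir, storage_index, count_incoming):
    # If the read-side only looks at finished shares, a concurrent
    # half-written upload is invisible and the answer is "no shares".
    found = shares_on_disk(storage_dir, storage_index)
    if count_incoming:
        # If it also counts in-progress shares (or consults a leasedb),
        # the answer flips to "yes" even though the bytes aren't all there.
        found |= shares_incoming(storage_dir, storage_index)
    return found
```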
We should probably do the following:
The Uploader could read the pre-existing shares as it goes, comparing them against locally-generated ones. If they match, great, those shares can count against the servers-of-happiness criteria. If they don't, or if they aren't complete, then oops. The simplest way to deal with such problems is to treat them like a share write that failed (as if the server disconnected before the upload was complete), which may flunk the servers-of-happiness test and mark the upload as failing. A more sophisticated approach (which hopefully is ticketed elsewhere) is to have a second pass which writes out a new copy of any share that wasn't successfully placed during the first pass.
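A rough sketch of that verify-during-upload idea, using hypothetical helper names (get_bucket, allocate_bucket, happiness_tracker) rather than the real uploader API: compare a pre-existing remote share against the locally generated one, count it toward servers-of-happiness only if the bytes match, and otherwise treat it like a failed placement.

```python
def check_existing_share(remote_bucket, local_share_bytes):
    """Return True if the remote share is complete and byte-identical."""
    # Read one extra byte so an over-long remote share also fails the check.
    remote = remote_bucket.read(0, len(local_share_bytes) + 1)
    return remote == local_share_bytes

def place_share(server, sharenum, local_share_bytes, happiness_tracker):
    existing = server.get_bucket(sharenum)        # hypothetical accessor
    if existing is not None:
        if check_existing_share(existing, local_share_bytes):
            # Matching share: count it toward servers-of-happiness.
            happiness_tracker.record_placed(server, sharenum)
        else:
            # Mismatched or partial: treat it like a failed write, as if
            # the server had disconnected before the upload completed.
            happiness_tracker.record_failed(server, sharenum)
        return
    writer = server.allocate_bucket(sharenum)     # hypothetical allocator
    writer.write(0, local_share_bytes)
    writer.close()
    happiness_tracker.record_placed(server, sharenum)
```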
If we implement that verify-during-upload scheme, we'll need to think carefully about how simultaneous uploads ought to work. I think we'll need a way to mark shares as "in-progress", which tells the second uploader that it can't verify the share yet, but probably shouldn't upload a duplicate either.
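As a sketch of what such an "in-progress" marker buys us (this is a hypothetical protocol extension, not anything the storage protocol offers today), the server would report three states instead of two, and the second uploader would branch on them:

```python
SHARE_ABSENT, SHARE_IN_PROGRESS, SHARE_COMPLETE = range(3)

def decide_action(share_state):
    if share_state == SHARE_COMPLETE:
        return "verify"                 # read it back and compare bytes
    if share_state == SHARE_IN_PROGRESS:
        # Can't verify yet; probably shouldn't start a duplicate write either.
        return "skip-and-recheck-later"
    return "upload"                     # no share at all: allocate and write
```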
This will get better when we make the storage-index be a hash of the share (or the root of a merkle tree with the shares in the leaves), because then the storage-index won't even be defined until the upload is complete, and the intermediate in-progress state will disappear. Simultaneous uploads will then turn into two uploads of the exact same share, detected at close(), which is inefficient but sound, I think.
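A minimal sketch of that future scheme, assuming a flat hash over the share bytes (the ticket suggests a merkle-tree root over the shares would serve the same purpose); the index simply doesn't exist until the bytes are final, so two simultaneous uploads of the same file converge on the same index at close() and the duplicate is dropped:

```python
import hashlib

def storage_index_for(share_bytes):
    # Hypothetical derivation: the storage-index is a hash of the finished
    # share, so an in-progress share has no storage-index at all.
    return hashlib.sha256(share_bytes).hexdigest()[:32]

def close_writer(store, share_bytes):
    si = storage_index_for(share_bytes)
    if si in store:
        return si          # second upload of identical bytes: discard duplicate
    store[si] = share_bytes
    return si
```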
Related tickets:
zooko wrote in #952:
zooko also wrote in #952: