uploader confuses self-write-dedup with "server is full" #2110

Open
opened 2013-11-20 23:05:24 +00:00 by zooko · 5 comments

I just got this on my LeastAuthority.com S4 server, and then I verified by code inspection that it is also present in trunk. I haven't yet verified if it is also present in 1382-rewrite-2. If you write to a server and ask to allocate a bucket, and it writes back saying that it won't allocate that bucket for you, and by the way that it doesn't already have that bucket available to you, the client reasonably-enough concludes that the server is full. It reports that server as having been full if the upload fails. This gave me a bit of a start, since I couldn't figure out how Amazon S3 could be "full" so I thought there was a bug in my S4 service. ☺ But the truth appears to be that I was already uploading the same (immutable) file and the upload was in-progress, so the server was unwilling to start a new upload and also unwilling/unable to let me do a download.

I suspect the only real solution to this is going to be to extend the "get_bucket"/"allocate_buckets" protocol for immutable files so the server can mention to the client "... and by the way the reason that I won't take it and also won't give it to you is that there is a partial upload of that same file sitting here. So you might want to report to your human that they could try again in a few minutes and see if that one has finished".

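As a rough illustration of the proposed extension (not part of the current storage protocol), here is a minimal Python sketch of how a client could tell these two cases apart if the allocate-buckets response carried an "upload already in progress" hint. The `AllocationResult` container and its `in_progress` field are hypothetical names invented for this sketch.

```python
from collections import namedtuple

# Hypothetical container for an extended allocate-buckets response.  The
# existing protocol returns roughly (shares already held, writers for newly
# allocated shares); the `in_progress` field is the proposed addition and
# does not exist today.
AllocationResult = namedtuple(
    "AllocationResult", ["already_have", "bucket_writers", "in_progress"])

def classify_refusal(result, requested_shnums):
    """Explain why the server declined some of the requested share numbers."""
    declined = (set(requested_shnums)
                - set(result.already_have)
                - set(result.bucket_writers))
    if not declined:
        return "all requested shares were accepted or already present"
    if declined & set(result.in_progress):
        # Not a "server full" condition: another upload of the same
        # immutable file is still being written to this server.
        return ("another upload of this file is in progress on the server; "
                "retrying in a few minutes may succeed")
    return "the server refused the remaining shares (most likely it is full)"
```

The point is that only the server knows why it declined the shares, so the extension just needs to carry that one extra piece of information back to the uploader, which can then turn it into a sensible message for the human.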
zooko added the unknown, normal, defect, 1.10.0 labels 2013-11-20 23:05:24 +00:00
zooko added this to the undecided milestone 2013-11-20 23:05:24 +00:00
markberger was assigned by zooko 2013-11-20 23:05:24 +00:00
daira commented 2013-11-23 01:54:11 +00:00
Owner

We should certainly do better error reporting, but ideally both uploads should succeed. That seems a bit complicated to implement with the current protocol though.

zooko (Author) commented:

Replying to daira:

We should certainly do better error reporting, but ideally both uploads should succeed. That seems a bit complicated to implement with the current protocol though.

That does sound desirable but complicated to me. I guess it would require the second uploader (the one who is not actually transferring the ciphertext) to check back later and see whether the first uploader finished transferring the correct ciphertext or not. There might also need to be some kind of timeout/conflict/retry/2PC protocol in case the second uploader decides that the first alleged uploader is unacceptably slow/stalled/DoS'ing. How about if we add this issue to #1851?

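A rough sketch of the kind of check-back-later loop the second uploader might run; all names here are hypothetical, assuming some way for the client to ask whether a partial upload of a given share is still being written:

```python
import time

def wait_for_other_upload(server, storage_index, shnum,
                          poll_interval=30, max_wait=600):
    """Poll until the competing upload finishes, is abandoned, or we give up."""
    deadline = time.time() + max_wait
    while time.time() < deadline:
        # get_buckets() is assumed to return the share numbers that are
        # fully uploaded and readable on this server.
        if shnum in server.get_buckets(storage_index):
            return "complete"    # the other uploader finished; reuse its share
        # upload_in_progress() is hypothetical: is someone still holding an
        # open bucket writer for this share?
        if not server.upload_in_progress(storage_index, shnum):
            return "abandoned"   # nobody is writing it; upload it ourselves
        time.sleep(poll_interval)
    return "timed-out"           # treat the first uploader as stalled/DoS'ing
```

A real implementation would also need the conflict/2PC handling mentioned above for the case where a stalled uploader comes back to life; this sketch just times out and leaves the decision to the caller.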
zooko (Author) commented:

So this ticket is just about error-reporting: report "your upload didn't happen because the server says there is another already-initiated, but not-yet-completed, upload of the same file" separately from "the server was full and refused to start your upload".

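For the error-reporting-only fix, the client-side change could be as small as keeping two separate per-server outcome codes and wording the failure message accordingly. A minimal sketch with made-up outcome codes (the real upload-results plumbing in Tahoe-LAFS is not shown here):

```python
SERVER_FULL = "server-full"
PENDING_UPLOAD = "pending-upload"   # hypothetical new outcome code

def summarize_failure(server_outcomes):
    """Turn per-server outcome codes into the message shown to the user."""
    pending = [s for s, o in server_outcomes.items() if o == PENDING_UPLOAD]
    full = [s for s, o in server_outcomes.items() if o == SERVER_FULL]
    parts = []
    if pending:
        parts.append("%d server(s) already hold a partial upload of this "
                     "file; retrying in a few minutes may succeed"
                     % len(pending))
    if full:
        parts.append("%d server(s) were full and refused new shares"
                     % len(full))
    return "; ".join(parts) or "upload failed for an unknown reason"
```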
daira commented 2013-11-24 18:29:06 +00:00
Owner

Replying to zooko (comment:2):

Replying to daira:

We should certainly do better error reporting, but ideally both uploads should succeed. That seems a bit complicated to implement with the current protocol though.

That does sound desirable but complicated to me. [...] How about if we add this issue to #1851?

+1

zooko (Author) commented:

That does sound desirable but complicated to me. [...] How about if we add this issue to #1851?

+1

Done: [comment:90131](/tahoe-lafs/trac-2024-07-25/issues/1851#issuecomment-90131)
