uploader confuses self-write-dedup with "server is full" #2110

Open
opened 2013-11-20 23:05:24 +00:00 by zooko · 5 comments

I just got this on my LeastAuthority.com S4 server, and then I verified by code inspection that it is also present in trunk. I haven't yet verified if it is also present in 1382-rewrite-2. If you write to a server and ask to allocate a bucket, and it writes back saying that it won't allocate that bucket for you, and by the way that it doesn't already have that bucket available to you, the client reasonably-enough concludes that the server is full. It reports that server as having been full if the upload fails. This gave me a bit of a start, since I couldn't figure out how Amazon S3 could be "full" so I thought there was a bug in my S4 service. ☺ But the truth appears to be that I was already uploading the same (immutable) file and the upload was in-progress, so the server was unwilling to start a new upload and also unwilling/unable to let me do a download.

I suspect the only real solution to this is going to be to extend the "get_bucket"/"allocate_buckets" protocol for immutable files so the server can mention to the client "... and by the way the reason that I won't take it and also won't give it to you is that there is a partial upload of that same file sitting here. So you might want to report to your human that they could try again in a few minutes and see if that one has finished".

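As a rough illustration of the proposed extension (not part of the current storage protocol), here is a minimal Python sketch of how a client could tell these two cases apart if the allocate-buckets response carried an "upload already in progress" hint. The `AllocationResult` container and its `in_progress` field are hypothetical names invented for this sketch.

```python
from collections import namedtuple

# Hypothetical container for an extended allocate-buckets response.  The
# existing protocol returns roughly (shares already held, writers for newly
# allocated shares); the `in_progress` field is the proposed addition and
# does not exist today.
AllocationResult = namedtuple(
    "AllocationResult", ["already_have", "bucket_writers", "in_progress"])

def classify_refusal(result, requested_shnums):
    """Explain why the server declined some of the requested share numbers."""
    declined = (set(requested_shnums)
                - set(result.already_have)
                - set(result.bucket_writers))
    if not declined:
        return "all requested shares were accepted or already present"
    if declined & set(result.in_progress):
        # Not a "server full" condition: another upload of the same
        # immutable file is still being written to this server.
        return ("another upload of this file is in progress on the server; "
                "retrying in a few minutes may succeed")
    return "the server refused the remaining shares (most likely it is full)"
```

The point is that only the server knows why it declined the shares, so the extension just needs to carry that one extra piece of information back to the uploader, which can then turn it into a sensible message for the human.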
zooko added the unknown, normal, defect, 1.10.0 labels 2013-11-20 23:05:24 +00:00
zooko added this to the undecided milestone 2013-11-20 23:05:24 +00:00
markberger was assigned by zooko 2013-11-20 23:05:24 +00:00
daira commented 2013-11-23 01:54:11 +00:00
Owner

We should certainly do better error reporting, but ideally both uploads should succeed. That seems a bit complicated to implement with the current protocol though.

zooko (Author) commented:

Replying to daira:

We should certainly do better error reporting, but ideally both uploads should succeed. That seems a bit complicated to implement with the current protocol though.

That does sound desirable but complicated to me. I guess it would require the second uploader (the one who is not actually transferring the ciphertext) to check back later and see whether the first uploader finished transferring the correct ciphertext or not. There might also need to be some kind of timeout/conflict/retry/2PC protocol in case the second uploader decides that the first alleged uploader is unacceptably slow/stalled/DoS'ing. How about if we add this issue to #1851?

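A rough sketch of the kind of check-back-later loop the second uploader might run; all names here are hypothetical, assuming some way for the client to ask whether a partial upload of a given share is still being written:

```python
import time

def wait_for_other_upload(server, storage_index, shnum,
                          poll_interval=30, max_wait=600):
    """Poll until the competing upload finishes, is abandoned, or we give up."""
    deadline = time.time() + max_wait
    while time.time() < deadline:
        # get_buckets() is assumed to return the share numbers that are
        # fully uploaded and readable on this server.
        if shnum in server.get_buckets(storage_index):
            return "complete"    # the other uploader finished; reuse its share
        # upload_in_progress() is hypothetical: is someone still holding an
        # open bucket writer for this share?
        if not server.upload_in_progress(storage_index, shnum):
            return "abandoned"   # nobody is writing it; upload it ourselves
        time.sleep(poll_interval)
    return "timed-out"           # treat the first uploader as stalled/DoS'ing
```

A real implementation would also need the conflict/2PC handling mentioned above for the case where a stalled uploader comes back to life; this sketch just times out and leaves the decision to the caller.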
zooko (Author) commented:

So this ticket is just about error-reporting: report "your upload didn't happen because the server says there is another already-initiated, but not-yet-completed, upload of the same file" separately from "the server was full and refused to start your upload".

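For the error-reporting-only fix, the client-side change could be as small as keeping two separate per-server outcome codes and wording the failure message accordingly. A minimal sketch with made-up outcome codes (the real upload-results plumbing in Tahoe-LAFS is not shown here):

```python
SERVER_FULL = "server-full"
PENDING_UPLOAD = "pending-upload"   # hypothetical new outcome code

def summarize_failure(server_outcomes):
    """Turn per-server outcome codes into the message shown to the user."""
    pending = [s for s, o in server_outcomes.items() if o == PENDING_UPLOAD]
    full = [s for s, o in server_outcomes.items() if o == SERVER_FULL]
    parts = []
    if pending:
        parts.append("%d server(s) already hold a partial upload of this "
                     "file; retrying in a few minutes may succeed"
                     % len(pending))
    if full:
        parts.append("%d server(s) were full and refused new shares"
                     % len(full))
    return "; ".join(parts) or "upload failed for an unknown reason"
```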
daira commented 2013-11-24 18:29:06 +00:00
Owner

Replying to zooko (comment:2):

Replying to daira:

We should certainly do better error reporting, but ideally both uploads should succeed. That seems a bit complicated to implement with the current protocol though.

That does sound desirable but complicated to me. [...] How about if we add this issue to #1851?

+1

zooko (Author) commented:

That does sound desirable but complicated to me. [...] How about if we add this issue to #1851?

+1

Done: [comment:90131](/tahoe-lafs/trac-2024-07-25/issues/1851#issuecomment-90131)
