cloud backend: for a very large upload, the accounting crawler deletes shares before they are leased #1987

Closed
opened 2013-05-25 03:06:45 +00:00 by daira · 8 comments
daira commented 2013-05-25 03:06:45 +00:00
Owner

I tested uploading a 10 GB (sic) file to the cloud backend on Azure. Before the upload had finished, the accounting crawler ran and started deleting the uploaded chunks. I thought this had been fixed, but it clearly hasn't. (A share is supposed to be considered leased while it is being uploaded, i.e. while it is in STATE_COMING.)

I took a copy of leasedb.sqlite while it was doing this so that I can examine the share state.
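
The invariant described above can be sketched roughly as follows. This is only an illustration of the intended rule, not the actual accounting crawler code: `ShareRecord`, `is_deletable`, the `lease_expiry` field, the numeric state values, and `STATE_STABLE` are all hypothetical names introduced here. Only `STATE_COMING` is taken from this ticket.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical numeric values; only the name STATE_COMING comes from the ticket.
STATE_COMING = 0  # share is still being uploaded
STATE_STABLE = 1  # assumed name for a share whose upload has finished


@dataclass
class ShareRecord:
    """Hypothetical stand-in for one share's row in the lease database."""
    storage_index: str
    shnum: int
    state: int
    lease_expiry: Optional[float]  # None means the share has no lease at all


def is_deletable(share: ShareRecord, now: float) -> bool:
    """Return True if a crawler following the intended rule may delete this share."""
    # A share that is still being uploaded counts as leased, whatever its lease rows say.
    if share.state == STATE_COMING:
        return False
    # Otherwise the share is protected only by an unexpired lease.
    has_live_lease = share.lease_expiry is not None and share.lease_expiry > now
    return not has_live_lease
```

Under this rule, a share with no lease rows is still safe from the crawler for as long as it remains in `STATE_COMING`.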

tahoe-lafs added the
code-storage
major
defect
cloud-branch
labels 2013-05-25 03:06:45 +00:00
tahoe-lafs added this to the soon milestone 2013-05-25 03:06:45 +00:00
daira commented 2013-05-25 22:32:09 +00:00
Author
Owner

#1921 may be the same bug as this. I'm not marking them as duplicates because I'm not sure of that yet.

#1833 would fix this. I would be happy with that method of fixing this, because I really like #1833.

daira commented 2013-05-27 21:08:18 +00:00
Author
Owner

I would like to understand why the current code is failing, anyway. It may be a symptom of shares being in the wrong state when they are being written, or something similar.

daira commented 2013-05-28 01:07:35 +00:00
Author
Owner

After examining the logs more closely, I think I misinterpreted the problem. The upload failed because four consecutive HTTP PUT requests to Azure failed (with ~~`ConnectionLost`~~ `TimeoutError` exceptions). Then the share chunks were deleted because that is the behaviour coded in [BucketWriter._abort](https://github.com/LeastAuthority/tahoe-lafs/blob/1819-cloud-merge/src/allmydata/storage/bucket.py#L88).
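
The sequence of events described above (repeated PUT timeouts, then cleanup of the chunks already written) can be illustrated with a minimal sketch. It is not the code in `BucketWriter._abort`: the functions `put_with_retries` and `upload_share`, the callables `put_chunk` and `delete_chunk`, and the limit of four attempts are assumptions made for illustration only.

```python
def put_with_retries(put_chunk, key, data, max_attempts=4):
    """Try to store one chunk, retrying on TimeoutError up to max_attempts times."""
    last_error = None
    for _attempt in range(max_attempts):
        try:
            put_chunk(key, data)
            return
        except TimeoutError as e:
            last_error = e
    # Every attempt timed out: give up and report the last failure.
    raise last_error


def upload_share(chunks, put_chunk, delete_chunk, max_attempts=4):
    """Upload (key, data) pairs in order; on failure, delete what was already written."""
    written = []
    try:
        for key, data in chunks:
            put_with_retries(put_chunk, key, data, max_attempts)
            written.append(key)
    except TimeoutError:
        # Abort path: remove the chunks uploaded so far, then propagate the error.
        for key in written:
            delete_chunk(key)
        raise
    return written
```

Seen from the outside, this abort path looks the same as chunks disappearing mid-upload, which is why it was easy to attribute the deletions to the accounting crawler.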
daira commented 2013-05-28 01:09:13 +00:00
Author
Owner

I'm retrying the 10 GB upload; if it succeeds this time then I'll reenable share deletion.

daira commented 2013-05-30 17:25:54 +00:00
Author
Owner

The 10 GB upload failed but for an unrelated reason (#1991), and uploads up to 2 GB succeeded concurrently with an accounting crawler run.

tahoe-lafs added the
invalid
label 2013-05-30 17:25:54 +00:00
daira commented 2013-05-30 18:13:44 +00:00
Author
Owner

I made share deletion by the accounting crawler conditional in [416e91ed](https://github.com/LeastAuthority/tahoe-lafs/commit/416e91ed0948ee2802e0e2ea20dd48befcaae94c), and reenabled it in [98b4d8ee](https://github.com/LeastAuthority/tahoe-lafs/commit/98b4d8ee3cfeccbdd56b45507e4c0e06d8c5bb10) on the 1819-cloud-merge branch.
I put a comment on <https://github.com/LeastAuthority/tahoe-lafs/commit/416e91ed0948ee2802e0e2ea20dd48befcaae94c#commitcomment-3322692>
Reference: tahoe-lafs/trac-2024-07-25#1987