cloud backend fails with DataUnavailable when uploading+downloading a 10 GB file #1991

Closed
opened 2013-05-28 11:24:38 +00:00 by daira · 8 comments
daira commented 2013-05-28 11:24:38 +00:00
Owner
```
$ bin/tahoe put ~/tahoe/grid/random azure:random
201 Created
URI:CHK:[censored]:1:1:10000000000
$ bin/tahoe webopen azure:
```

The upload appeared to succeed, taking about 4 hours. (There were errors on some HTTP PUT requests but they were all successfully retried.) The file is listed in the directory. However:

```
$ time bin/tahoe get URI:CHK:[censored]:1:1:10000000000
Error during GET: 410 Gone
"NoSharesError: no shares could be found. Zero shares usually indicates a corrupt URI, or that no servers were connected, but it might also indicate severe corruption. You should perform a filecheck on this object to learn more.

The full error message is:
no shares (need 1). Last failure: [Failure instance: Traceback (failure with no frames): <class 'allmydata.immutable.downloader.share.DataUnavailable'>: need len=1160: [10008388708-10008388739],[10008388772-10008388803],
[10008388900-10008388931],[10008389156-10008389187],[10008389668-10008389699],
[10008390692-10008390723],[10008392740-10008392771],[10008396836-10008396867],
[10008405028-10008405059],[10008421412-10008421443],[10008454180-10008454211],
[10008519716-10008519747],[10008650788-10008650819],[10008912932-10008912963],
[10009437220-10009437251],[10010485796-10010485827],[10012582916-10012582979],
[10016777284-10016777315],[10016777348-10016777379],[10016777476-10016777507],
[10016777732-10016777763],[10016778244-10016778275],[10016779268-10016779299],
[10016781316-10016781347],[10016785412-10016785443],[10016793604-10016793635],
[10016809988-10016810019],[10016842756-10016842787],[10016908292-10016908323],
[10017039364-10017039395],[10017301508-10017301539],[10017825796-10017825827],
[10018874372-10018874403],[10020971492-10020971555],[10025165830-10025165837]
 but will never get it
]"

real	0m4.025s
user	0m0.268s
sys	0m0.044s
```

The entry in the leasedb `shares` table is:

```
storage_index               shnum  prefix  backend_key  used_space   sharetype  state
ohcac6xn5ot7hxwfcstdeqcf4e  0      oh                   10016777607  0          1
```

(sharetype 0 is immutable; state 1 is STABLE.)

The disk backend is capable of storing files this size.
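For reference, the leasedb row above can be decoded mechanically. The following is a minimal sketch that models the `shares` row in an in-memory SQLite database; the column names are copied from the table above, and the code meanings are only those stated in this ticket (sharetype 0 = immutable, state 1 = STABLE). Nothing here is verified against the cloud-backend branch's actual schema.

```python
# Sketch: model the leasedb `shares` row from this ticket in SQLite and
# decode the sharetype/state codes. Schema and codes are assumptions taken
# from the ticket text, not from the cloud-backend branch itself.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE shares (
    storage_index TEXT, shnum INTEGER, prefix TEXT,
    backend_key TEXT, used_space INTEGER, sharetype INTEGER, state INTEGER)""")
con.execute("INSERT INTO shares VALUES (?,?,?,?,?,?,?)",
            ("ohcac6xn5ot7hxwfcstdeqcf4e", 0, "oh", None, 10016777607, 0, 1))

SHARETYPE = {0: "immutable"}   # only the code used in this ticket
STATE = {1: "STABLE"}          # other states not listed here

row = con.execute("SELECT storage_index, used_space, sharetype, state "
                  "FROM shares").fetchone()
si, used, st, state = row
print(si, used, SHARETYPE[st], STATE[state])
# ohcac6xn5ot7hxwfcstdeqcf4e 10016777607 immutable STABLE
```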

tahoe-lafs added the
code-storage
normal
defect
1.10.0
labels 2013-05-28 11:24:38 +00:00
tahoe-lafs added this to the undecided milestone 2013-05-28 11:24:38 +00:00
daira commented 2013-05-28 13:37:12 +00:00
Author
Owner

Note that the `used_space` in the cloud backend is 10016777607, but the requested ranges go up to 10025165837. (The ranges are data offsets, so they exclude the 12-byte immutable header.) The `used_space` passed to the disk backend for an upload of the same share is 10025166838.

Next step is to determine whether the share got truncated as stored in the cloud, or whether it was stored correctly but the end can't be read.
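The arithmetic in this comment can be spelled out; the sketch below uses only the numbers quoted in this ticket (nothing is re-measured), and the "just over 2^23" observation is an inference from those figures, not a confirmed diagnosis.

```python
# Sketch: reproduce the size arithmetic from this ticket. All constants
# are copied from the report above; nothing here is measured independently.

FILE_SIZE = 10_000_000_000          # bytes uploaded (10 GB)
IMMUTABLE_HEADER = 12               # immutable share header, excluded from data offsets

cloud_used_space = 10016777607      # leasedb used_space with the cloud backend
disk_used_space = 10025166838       # used_space recorded by the disk backend
last_needed_offset = 10025165837    # highest data offset the downloader requested

# The cloud backend's share is ~8 MiB shorter than the disk backend's:
shortfall = disk_used_space - cloud_used_space
print(shortfall)                    # 8389231, just over 2**23 (8 MiB)

# The downloader asks for data past the end of the truncated share:
print(last_needed_offset + IMMUTABLE_HEADER > cloud_used_space)  # True
```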

daira commented 2013-05-28 20:36:20 +00:00
Author
Owner

This also happens with a 5 GB file, but not with a 1 GB file.

daira commented 2013-05-28 20:41:40 +00:00
Author
Owner

Replying to daira:

> This also happens with a 5 GB file, but not with a 1 GB file.

... assuming it is deterministic, that is.

Oh, maybe it's a 2^32-byte (4 GiB) or 2^31-byte (2 GiB) threshold issue. It would still have to be something that is handled differently by the disk and cloud backends, though.
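The threshold hypothesis is easy to sanity-check with arithmetic: a 10 GB upload crosses both the 2^31 and 2^32 boundaries, while a 1 GB upload crosses neither, which is consistent with the observed pass/fail split. This sketch is purely illustrative and does not come from the codebase.

```python
# Sketch of the threshold hypothesis: which power-of-two boundaries does a
# 10 GB file cross that a 1 GB file does not? (Illustrative arithmetic only.)

LARGE = 10_000_000_000   # failing upload (10 GB)
SMALL = 1_000_000_000    # succeeding upload (1 GB)

for bits in (31, 32):
    boundary = 2 ** bits
    print(bits, LARGE > boundary, SMALL > boundary)
# 31 True False
# 32 True False

# So any arithmetic done in a 32-bit (signed or unsigned) integer type, or
# any size cap near these boundaries, would be exercised by the 10 GB upload
# but not by the 1 GB one -- consistent with the failure pattern reported.
```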

tahoe-lafs modified the milestone from undecided to 1.12.0 2013-07-22 20:50:23 +00:00

Milestone renamed

warner modified the milestone from 1.12.0 to 1.13.0 2016-03-22 05:02:25 +00:00

renaming milestone

warner modified the milestone from 1.13.0 to 1.14.0 2016-06-28 18:17:14 +00:00

I tested a 5 GiB upload against the 2237.cloud-backend-merge.0 branch and it succeeded. Was this fixed in the branch?


Moving open issues out of closed milestones.

exarkun modified the milestone from 1.14.0 to 1.15.0 2020-06-30 14:45:13 +00:00

The established line of development on the "cloud backend" branch has been abandoned. This ticket is being closed as part of a batch-ticket cleanup for "cloud backend"-related tickets.

If this is a bug, it is probably genuinely no longer relevant. The "cloud backend" branch is too large and unwieldy to ever be merged into the main line of development (particularly now that the Python 3 porting effort is significantly underway).

If this is a feature, it may be relevant to some future efforts - if they are sufficiently similar to the "cloud backend" effort - but I am still closing it because there are no immediate plans for a new development effort in such a direction.

Tickets related to the "leasedb" are included in this set because the "leasedb" code is in the "cloud backend" branch and fairly well intertwined with the "cloud backend". If there is interest in lease implementation change at some future time then that effort will essentially have to be restarted as well.

exarkun added the
wontfix
label 2020-10-30 12:35:44 +00:00
Reference: tahoe-lafs/trac-2024-07-25#1991