Not enough available servers are found #2016
Reference: tahoe-lafs/trac-2024-07-25#2016
When uploading a file, it fails with the following error:
<class 'allmydata.interfaces.UploadUnhappinessError'>: shares could be placed on only 4 server(s) such that any 3 of them have enough shares to recover the file, but we were asked to place shares on at least 5 such servers. (placed all 5 shares, want to place shares on at least 5 servers such that any 3 of them have enough shares to recover the file, sent 6 queries to 6 servers, 4 queries placed some shares, 2 placed none (of which 2 placed none due to the server being full and 0 placed none due to an error))
There are 12 servers connected to this grid (pubgrid), yet only 6 queries are sent, and because two of the queried servers are full the upload fails (if I interpreted the error correctly).
Shouldn't there be another round of queries if the first round does not yield enough available servers?
Replying to kapiteined:
Somehow attaching a file to this ticket failed, so I put the error report (incident-2013-07-05--19-34-13Z-7o6admq.flog.bz2) at URI:CHK:7tbpjhxokkmpere6nxwfa5cvey:37ypgfhpwg67veqpyhjve22edmh3w3jwpbds47yfnvjussvalmaq:3:5:74128 in the pubgrid.
Here's the most important part of the log:
Here's my interpretation: with h = N = 5, as soon as the Tahoe2ServerSelector decides to put two shares on the same server (here sh1 and sh2 on lxmst5bx), the upload is doomed. The shares all have to be on different servers whenever h = N, but the termination condition is just that all shares have been placed, not that they have been placed in a way that meets the happiness condition.
If that's the problem, then #1382 should fix it. This would also explain why VG2 was unreliable with h close to N.
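To make the h = N argument concrete, here is a minimal sketch that computes happiness as the size of a maximum matching between servers and shares. It is not the Tahoe2ServerSelector code. The function name `happiness` and every server name except lxmst5bx are hypothetical, and both placements are made-up illustrations rather than data from the incident log.

```python
def happiness(placement):
    """Size of a maximum matching between servers and shares.

    placement maps a server id to the set of share numbers it holds.
    Each server counts at most once, and each share counts at most once.
    """
    match = {}  # share number -> server currently matched to it

    def try_assign(server, seen):
        # Kuhn's augmenting-path step: find a share for this server,
        # re-homing an already matched server if that frees one up.
        for share in placement[server]:
            if share in seen:
                continue
            seen.add(share)
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(try_assign(server, set()) for server in placement)


# Hypothetical 3-of-5 placement with h = 5: sh1 and sh2 share one server.
doomed = {
    "server_a": {0},
    "lxmst5bx": {1, 2},
    "server_c": {3},
    "server_d": {4},
}

# Hypothetical 3-of-10 placement with h = 7: two servers hold two shares each.
relaxed = {
    "server_a": {0, 1},
    "server_b": {2, 3},
    "server_c": {4},
    "server_d": {5},
    "server_e": {6},
    "server_f": {7},
    "server_g": {8},
    "server_h": {9},
}

print(happiness(doomed), happiness(relaxed))
```

In the first placement all 5 shares are placed, but only 4 servers can each be credited with a distinct share, so a happiness threshold of 5 cannot be met. In the second, doubling up on two servers still leaves a happiness of 8, which clears a threshold of 7.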
Daira: excellent work diagnosing this!! Ed: thanks so much for the bug report. Daira: it looks like you are right, and I think this does explain those bugs that the volunteergrid2 people reported and that I never understood. Thank you!
And to check if that is the case, i changed to 3-7-10 encoding, and now the upload succeeds!
Success: file copied
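For reference, a sketch of what the 3-7-10 encoding looks like in the client's tahoe.cfg, assuming the three numbers are read in the order needed, happy, total:

```ini
[client]
# k: any 3 shares are enough to recover the file
shares.needed = 3
# h: shares must land on at least 7 distinct servers
shares.happy = 7
# N: 10 shares are produced in total
shares.total = 10
```

With h well below N, a couple of shares ending up on the same server no longer makes the happiness condition unreachable.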
Does this call for a change in code, or for a big warning sticker: "don't choose h and N too close together"?
We intend to fix it for v1.11 (Mark Berger's branch for #1382 already basically works), but there would be no harm in pointing out this problem on tahoe-dev in the meantime.
Same bug as #1791?
Replying to daira:
Yes, that bug also had h = N and two shares that were placed on the same server, so almost identical. I'll copy the conclusions here to that ticket.