start downloading as soon as you know where to get K shares #928
The current code (changeset:2a2cc5c4b8a490ae) performs immutable file download in four phases. In phase 1 it sends a `get_buckets()` remote invocation message to every storage server that it knows about ([CiphertextDownloader._get_all_shareholders()]source:src/allmydata/immutable/download.py@4164#L871). In phase 2 it requests the CEB (a.k.a. UEB) from each server in turn until it gets a valid CEB ([CiphertextDownloader._obtain_uri_extension()]source:src/allmydata/immutable/download.py@4164#L956). In phase 3 it requests the crypttext hash tree from each server in turn until it gets a valid hash tree ([CiphertextDownloader._get_crypttext_hash_tree()]source:src/allmydata/immutable/download.py@4164#L1002). In phase 4 it actually downloads the blocks of data from servers in parallel. Out of the shares which it learned about in phase 1, which shares does it choose? It chooses "primary shares" first (their sharenum is < K) because those can be erasure-decoded at no computational cost, then it chooses randomly from among secondary shares until it has K.

Currently phase 1 does not end, and therefore phase 2 does not begin, until all servers have answered the `get_buckets()` query! To close this ticket, make it so that phase 1 ends as soon as at least K buckets have been found.

The nice, non-invasive way to do this is to replace the DeferredList in [CiphertextDownloader._get_all_shareholders()]source:src/allmydata/immutable/download.py@4164#L871 with an object that acts like a DeferredList but fires once at least K shares (or possibly once at least K servers) have been found.

Here's a patch with no tests. David-Sarah offered last night to write tests; if they don't, I'll write tests this evening.
Attachment dontwait.darcspatch.txt (46922 bytes) added
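To make the proposed approach concrete, here is a minimal sketch (not the attached patch and not actual Tahoe-LAFS code; the class name `EnoughSharesDeferred` and its methods are hypothetical) of a DeferredList-like helper that fires as soon as responses covering at least K distinct share numbers arrive, or once every query has resolved. It assumes each `get_buckets()`-style deferred fires with a dict mapping share number to bucket, and that all queries are added before any of them resolves.

```python
from twisted.internet import defer

class EnoughSharesDeferred(object):
    """Hypothetical stand-in for a DeferredList that can fire early.

    `done` fires with True as soon as at least k distinct share numbers
    have been seen, or with False once every query has resolved without
    reaching k. Assumes all queries are added before any of them fires.
    """
    def __init__(self, k):
        self.k = k
        self.sharenums = set()   # distinct share numbers seen so far
        self.outstanding = 0     # queries that have not yet resolved
        self.done = defer.Deferred()
        self._fired = False

    def add_query(self, d):
        # Each d is assumed to fire with a dict mapping sharenum -> bucket.
        self.outstanding += 1
        d.addCallbacks(self._got_response, self._got_error)
        return d

    def _got_response(self, buckets):
        self.sharenums.update(buckets)
        self._query_resolved()
        return buckets

    def _got_error(self, f):
        # A failed server just stops being something we wait for; it does
        # not abort the whole download.
        self._query_resolved()
        return None

    def _query_resolved(self):
        self.outstanding -= 1
        if not self._fired and (len(self.sharenums) >= self.k
                                or self.outstanding == 0):
            self._fired = True
            self.done.callback(len(self.sharenums) >= self.k)
```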
That patch had a typo that was caught by the current unit tests. Here is the patch with the typo fixed.
Attachment dontwait2.darcspatch.txt (46923 bytes) added
Amber, driving me to work and listening to me talk about this patch, pointed out that we need to count the number of unique share numbers instead of the total number of buckets. For example, if we contact all servers, and they all reply "Oh yes! I do have a bucket -- here it is, for share number 0.", then with the current patch -- dontwait2.darcspatch.txt -- it will think that we got enough shares to download the file, although we didn't. Here is a new patch, dontwait3.darcspatch.txt, which fixes this.
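A toy illustration of the pitfall, with made-up data: five servers that each hold only a copy of share 0 pass a total-bucket count against K = 3 but fail a distinct-share count.

```python
# Hypothetical responses: every server holds only a copy of share number 0.
responses = [{0: "bucket"} for _ in range(5)]
K = 3

total_buckets = sum(len(r) for r in responses)            # 5 -- looks sufficient
unique_shares = len({sh for r in responses for sh in r})  # 1 -- actually insufficient

assert total_buckets >= K and unique_shares < K
```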
Looking at the code to fix this, it appears to me that the current trunk has a similar problem. It measures `len(self._share_buckets)` to determine whether or not it should abort in [CiphertextDownloader._got_all_shareholders()]source:src/allmydata/immutable/download.py@4164#L938, and if there are insufficient buckets then it errors out with `Failed to get enough shareholders: have %d, need %d`. However, if it has enough buckets but it then turns out that it doesn't have enough unique shares, this is caught later, by the check of `len(self._shares)` in [CiphertextDownloader._activate_enough_buckets()]source:src/allmydata/immutable/download.py@4164#L1025, and reported with the error message `Unable to activate enough shares: have %d, need %d`.

I think this patch will eliminate the possibility of that second case, so all such failures will be detected and reported in the same way.
We should add a test of this case.
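A minimal sketch of the kind of test meant here, written against a small stand-in function rather than the real downloader (the tests attached later in this ticket exercise the downloader itself); the function name `have_enough_shares` is made up for illustration.

```python
import unittest

def have_enough_shares(responses, k):
    """Stand-in for the downloader's check. `responses` is a list of dicts
    mapping sharenum -> bucket, one per server that answered get_buckets()."""
    distinct = set()
    for buckets in responses:
        distinct.update(buckets)
    return len(distinct) >= k

class EnoughSharesTest(unittest.TestCase):
    def test_duplicate_shares_are_not_enough(self):
        # Plenty of buckets, but they all hold share 0.
        responses = [{0: object()} for _ in range(5)]
        self.assertFalse(have_enough_shares(responses, k=3))

    def test_distinct_shares_are_enough(self):
        responses = [{0: object()}, {1: object()}, {2: object()}]
        self.assertTrue(have_enough_shares(responses, k=3))

if __name__ == "__main__":
    unittest.main()
```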
So here is the latest version. This fails the current tests, not to mention the new tests that we need to write for it, possibly because the deferred returned by `CiphertextDownloader._got_all_shareholders()` can now errback in addition to callback, whereas on trunk it can only callback. Looking at the chain in [CiphertextDownloader.start()]source:src/allmydata/immutable/download.py?rev=4164#L833, it seems to me like an errback from `_get_all_shareholders()` should be handled nicely by `_finished()` and then by `_failed()`. Hm.

Attachment dontwait3.darcspatch.txt (49287 bytes) added
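As a generic Twisted illustration (not Tahoe-LAFS code) of the behaviour being discussed: when a deferred errbacks, the failure skips the remaining callbacks and is delivered to the next errback on the chain, so an errback from the shareholder-gathering step would bypass the intermediate stages and land in whatever plays the role of `_failed()`.

```python
from twisted.internet import defer

def next_stage(result):
    # Skipped entirely when the deferred errbacks instead of callbacks.
    print("next stage ran with " + repr(result))
    return result

def finished(result):
    print("finished with " + repr(result))
    return result

def failed(failure):
    # First errback on the chain: this is where the failure lands.
    print("failed: " + failure.getErrorMessage())
    return None  # swallow the failure after reporting it

d = defer.Deferred()
d.addCallback(next_stage)
d.addCallback(finished)
d.addErrback(failed)

# Simulate running out of servers before finding K shares.
d.errback(RuntimeError("ran out of outstanding queries before finding K shares"))
```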
FWIW, my Downloader rewrite will include this feature.
Also, for the currently-proposed patch, it would be a good idea to make sure that we grab the UEB from each share (and validate it) before starting the download. Specifically, I want to make sure that if the storage server can open the file (and therefore returns YES to the get_buckets query) but then experiences errors or corruption when actually reading the share file, the downloader will correctly switch over to a different share. I don't know if the current downloader can switch to a new share after it starts the download process for real.
I figured your Downloader rewrite would. It will probably also include lots of other improvements, such as more pipelining. Is there a list of your intended improvements to Downloader? I have vague recollections of lots of good ideas you had for Downloader improvements.
I still think it is a good idea to apply this patch now (after thorough tests and code review), because:

1. It fixes the hang caused by a hung server (one that is connected but never answers its `get_buckets()` query). With current trunk, as long as there is any such server connected to a grid then all downloads from that grid will hang.
2. It means the download will use the first K servers to answer `get_buckets` (assuming only one share per server) instead of the K servers that have primary shares (or, in the case that you don't get K servers with primary shares, random servers with secondary shares). This sounds like a potentially nice performance improvement, especially for heterogeneous and geographically spread-out grids.

Here is a version of the patch which omits the status-reporting change and makes the deferred returned by `_get_all_shareholders()` callback instead of errback when it runs out of outstanding queries.

Attachment dontwait4.darcspatch.txt (49992 bytes) added
Replying to warner:
The current downloader cannot switch to a new share after it starts the download process. What you suggest here would be a good idea, but it would require a bigger, more disruptive patch than the current one. The current patch basically changes only one thing: when the deferred returned from `_get_all_shareholders()` fires. It changes it from firing once all queries have resolved to firing as soon as K shares are found (or all queries are resolved). To make the improvement you suggest here, we would have to chain the verification of the UEB (and presumably also of the ciphertext hash tree) onto the remote `get_buckets` query, which would reorder the control flow.

Now, I don't think it is a big loss if we get only K chances to find an uncorrupted UEB (and ciphertext hash tree) versus getting ~N chances. In practice I guess we've never seen a corrupted UEB, so we always succeed on the first UEB we try. But the issue that this ticket is intended to address is a big issue in practice, so I would be happy with the trade-off.
Better handling of corrupted UEBs, ciphertext hash trees, and blocks would then be an improvement for your Downloader rewrite.
There was an edit-o in a comment in the previous version of the patch. Here it is with that edit-o fixed.
Attachment dontwait5.darcspatch.txt (49989 bytes) added
Brian reviewed this and made a couple of comments on IRC which I took into account in this next version (6) of the patch.
Attachment dontwait6.darcspatch.txt (51432 bytes) added
Attachment no-network-hang-diff.txt (2259 bytes) added
Support for hanging servers until a Deferred fires in a NoNetworkGrid
Attachment test_hung_server.py (8458 bytes) added
Incomplete tests for download when some servers are hung
Attachment new-tests-for-928-darcspatch.txt (67419 bytes) added
New tests for #928 (patch bundle includes dontwait6). Problem with errback instead of callback fixed; all existing tests run green. Known bug in _copy_all_shares_from.
Attachment dontwait7.darcspatch.txt (75983 bytes) added
Here is a fix (dontwait7.darcspatch.txt) for the problem in _copy_share(); it also rewrites line-endings to be '\n' and adds a comment. (Perhaps you should re-record these patches into a single patch that we can then apply to close this ticket.)
Brian: David-Sarah's patch, included in "new-tests-for-928-darcspatch.txt", fixes two bugs in my patch and is cleaner without being more invasive. Please re-review! (I've already reviewed it and will commit it, but it wouldn't hurt for you to review it as well.)
The two bugs were:

* the deferred returned by `_get_all_shareholders()` could still errback instead of calling back (the "problem with errback instead of callback" noted in the test-bundle description above), and
* the "have we found enough shares yet?" check was made inside the `for sharenum, bucket in buckets.iteritems():` loop, so if the final server answered the query but said it had zero buckets then that check would be wrongly omitted.

Attachment dontwait8.darcspatch.txt (90010 bytes) added
All patches for #928
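For illustration, here is a hypothetical sketch (made-up names, not David-Sarah's actual fix) of the loop-placement bug described above: a "have we found enough shares yet?" check placed inside the loop over `buckets.iteritems()` never runs for a response containing zero buckets, whereas hoisting it out of the loop makes it run for every response. The Python 2 `iteritems()` idiom is kept to match the loop quoted above.

```python
class ResponseTracker(object):
    """Hypothetical stand-in for the patch's get_buckets() response handler."""
    def __init__(self, k):
        self.k = k
        self.sharenums = set()
        self.outstanding = 1
        self.done = False

    def _check_if_done(self):
        if len(self.sharenums) >= self.k or self.outstanding == 0:
            self.done = True

    def got_response_buggy(self, buckets):
        self.outstanding -= 1
        for sharenum, bucket in buckets.iteritems():
            self.sharenums.add(sharenum)
            self._check_if_done()   # never reached when buckets == {}

    def got_response_fixed(self, buckets):
        self.outstanding -= 1
        for sharenum, bucket in buckets.iteritems():
            self.sharenums.add(sharenum)
        self._check_if_done()       # runs even for an empty response
```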
Fixed by changeset:2bd9dfa5bdacb381, changeset:baa11a0ad4f4bafe, changeset:d62428c1e6e291da, changeset:37a242e01af6cf76. Thank you, David-Sarah!