corrupted filesize in CHK filecap causes unexpected "bad hash error" #1529
IRC user "joepie91" noticed that changing a filecap like:
URI:CHK:wfnqcotc4ceokosmfobofjzoya:qcof5dae7usfth4zvti7fmievpsofu25lgcmdq7q2cnebg3txyfq:3:10:29680
to
URI:CHK:wfnqcotc4ceokosmfobofjzoya:qcof5dae7usfth4zvti7fmievpsofu25lgcmdq7q2cnebg3txyfq:3:10:29681
(by changing the last character, raising the claimed filesize by one) causes a BadHashError from the ciphertext_hash_tree check to be raised during download.
I was able to duplicate it locally against current trunk.
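For concreteness, here is a minimal Python sketch of the mangling step (this is not the Tahoe-LAFS API; `mangle_filecap_size` is a hypothetical helper). A CHK read-cap has the shape `URI:CHK:<readkey>:<UEB hash>:<k>:<N>:<size>`, so bumping the last character changes only the trailing size field:

```python
def mangle_filecap_size(filecap, delta=1):
    # split at the five rightmost colons: head, readkey, UEB hash, k, N, size
    head, readkey, ueb_hash, k, n, size = filecap.rsplit(":", 5)
    return ":".join([head, readkey, ueb_hash, k, n, str(int(size) + delta)])

good = ("URI:CHK:wfnqcotc4ceokosmfobofjzoya:"
        "qcof5dae7usfth4zvti7fmievpsofu25lgcmdq7q2cnebg3txyfq:3:10:29680")
assert mangle_filecap_size(good).endswith(":3:10:29681")
```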
It looks like (1) we're relying upon something in the filecap that should merely be a hint, and (2) we're throwing a particularly weird error message. A hash mismatch in the ciphertext_hash_tree either indicates a failure in zfec decoding, or a deliberately malformed file (in which the original ciphertext_hash_tree was computed over the wrong data, then included in the UEB). So my guess is that we're computing the predicted segment size from the filecap hint, but then not updating it when the shares' UEBs tell us that the hint was wrong, and trying to decode with the wrong data.
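As a toy illustration of that guess (this is assumed mechanics, not Tahoe-LAFS internals, and plain SHA-256 stands in for the tagged hashes Tahoe actually uses): if the downloader trims the zfec-decoded tail segment to a length derived from the filecap's filesize before hashing it, an off-by-one filesize shifts the trim point, so the hash can no longer match the stored ciphertext_hash_tree leaf even though the shares themselves are fine.

```python
import hashlib

def tail_leaf_hash(decoded_padded_tail, claimed_tail_len):
    # trim the k-padded, zfec-decoded tail segment to the length implied by
    # the (possibly wrong) filesize, then hash it for the ciphertext hash tree
    return hashlib.sha256(decoded_padded_tail[:claimed_tail_len]).hexdigest()

real_tail = b"x" * 1234                    # the tail segment the uploader hashed
decoded = real_tail + b"\x00\x00"          # padded to a multiple of k=3 by the codec
stored_leaf = hashlib.sha256(real_tail).hexdigest()

assert tail_leaf_hash(decoded, 1234) == stored_leaf   # correct filesize: hashes match
assert tail_leaf_hash(decoded, 1235) != stored_leaf   # filesize off by one: "bad hash"
```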
The test case for this should be pretty straightforward: upload a file with a couple of segments, produce a mangled filecap, attempt a download. The download ought to succeed. If we want to be picky about the filecap being the source of truth, then we're allowed to fail, but we need a better error message than BadHashError (maybe something about "corrupt filecap" or "file size mismatch"). But I think success is arguably more correct. (Note: the same argument may apply to mismatches in k and N.)
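A rough sketch of such a test might look like this (the helpers `upload_to_grid`/`download_from_grid` and the error class `CorruptFilecapError` are placeholders, not existing Tahoe-LAFS test APIs; `mangle_filecap_size` is the hypothetical helper sketched earlier):

```python
def test_download_with_mangled_filesize():
    data = b"a" * (3 * 128 * 1024 + 17)          # a couple of segments plus a short tail
    filecap = upload_to_grid(data)               # placeholder upload helper
    mangled = mangle_filecap_size(filecap, +1)   # raise the size field by one

    try:
        # success is arguably the most correct outcome
        assert download_from_grid(mangled) == data
    except CorruptFilecapError:
        # a clear, filecap-specific failure would also be acceptable
        pass
    # what must NOT happen is the bare ciphertext_hash_tree BadHashError
    # that this ticket describes
```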
Brian: I think we are supposed to rely on that number in the cap--it is canonical and not just a hint.
Although actually, now that I think about it, that was one of those things that I "tightened" and that you, I think, changed back to some degree when you wrote Brian's New Downloader.
IIRC my "tightened" interpretation of all those fields is still intact in the verifier but not in your new downloader. Might be interesting to see what the verifier says about that cap.
I reproduced this by uploading a file to the Test Grid, downloading it, changing the file size in the cap, and re-downloading it. Then I ran the verifier on it and got an informative error message. The informative part at the end says:
allmydata.immutable.checker.BadURIExtension: inconsistent erasure code params: utcpss: 38424 != self.tail_segment_size: 3, self._verifycap.size: 38425, self.segment_size: 38424, self._verifycap.needed_shares: 3
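Here is a rough Python reconstruction of the kind of cross-check behind that message (the real code lives in allmydata.immutable.checker and is more thorough; the arithmetic here is an assumption): the verifier derives the tail-segment size implied by the verifycap's filesize and compares it with the tail codec parameters recorded in the UEB.

```python
import math

class BadURIExtension(Exception):
    pass

def check_tail_codec_params(utcpss, verifycap_size, segment_size, k):
    # tail segment implied by the cap's filesize, padded up to a multiple
    # of k for the erasure codec
    tail = verifycap_size % segment_size or segment_size
    expected = k * math.ceil(tail / k)
    if utcpss != expected:
        raise BadURIExtension("inconsistent erasure code params: utcpss: %d != "
                              "tail segment size implied by the cap: %d"
                              % (utcpss, expected))

# the numbers from the verifier output above: the UEB says the tail codec
# segment size is 38424, but a claimed size of 38425 with one 38424-byte
# segment implies a 3-byte (k-padded) tail
try:
    check_tail_codec_params(utcpss=38424, verifycap_size=38425,
                            segment_size=38424, k=3)
except BadURIExtension as e:
    print(e)
```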
After discussing it with Zooko, I agree with his description: the filecap is authoritative/canonical. If we fetch a share that contains a UEB with a filesize that disagrees with the filecap, we should treat that share as invalid, and throw a useful error (FilecapFilesizeMismatchError?, although BadURIExtension is even better). The filecap's filesize is not just a hint.

The "new" immutable downloader currently ignores all the share fields it doesn't need, including the UEB's filesize field. In this case, I think it's better that the downloader explicitly (one might say gratuitously) compare the UEB's filesize against the filecap's filesize and throw FilecapFilesizeMismatchError, rather than pretending that the UEB's filesize is a mistake and attempting to decode the file with the filecap's filesize (which will result in the unhelpful ciphertext_hash_tree error in this ticket). Think of it as bringing the error forward to a better place.

Note that the root issue here is that the filesize is redundant: the file can be completely specified by just the readkey and the UEB hash. But it's awfully handy to include the filesize, k, and N in there, both to speed up downloads (parallelizing an appropriate number of DYHB queries) and to let deep-size run without actually fetching file data. Ideally, filecaps would only contain the minimal set of data necessary to retrieve the file. Having extra data there means we have to specify what happens when the extra data in the filecap disagrees with the data in a validated share, which means doing more work during download that can only possibly cause a failure.
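A minimal sketch of the proposed early check, assuming a UEB already parsed into a dict and a new exception type named after the one suggested above (neither is claimed to match existing Tahoe-LAFS code):

```python
class FilecapFilesizeMismatchError(Exception):
    pass

def validate_ueb_size(ueb, filecap_size):
    # fail early, as soon as the share's UEB is parsed, instead of letting the
    # mismatch surface later as a confusing ciphertext_hash_tree BadHashError
    ueb_size = ueb["size"]
    if ueb_size != filecap_size:
        raise FilecapFilesizeMismatchError(
            "filecap says %d bytes but this share's UEB says %d bytes; "
            "treating the share as invalid" % (filecap_size, ueb_size))
```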
Replying to warner:
+1.