mutable file: survive encoding variations #312
Reference: tahoe-lafs/trac-2024-07-25#312
The current mutable.py has a nasty bug lurking: since the encoding parameters
(k and N) are not included in the URI, a copy is put in each share. The
Retrieve code latches on to the first version it sees, and ignores the values
from all subsequently-fetched shares. If (for whatever reason) some clients
have uploaded the file with different parameters (specifically different
values of k, say 3-of-10 vs 2-of-6), then we could wind up feeding 3-of-10
shares into a zfec decoder configured for 2-of-6, which would cause silent
data corruption.
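To make the failure mode concrete, here is a minimal sketch using zfec's Encoder/Decoder classes (the payload, block size, and parameters are invented for illustration): decoding 3-of-10 shares with a decoder configured for 2-of-6 raises no exception, it just returns the wrong bytes.

```python
# Illustration only: decoding with the wrong (k, N) fails silently.
import zfec

data = b"Twelve bytes"                     # 12 bytes -> k=3 blocks of 4
blocks = [data[i:i + 4] for i in range(0, 12, 4)]

# upload as 3-of-10: any 3 of the 10 shares should recover the file
shares = zfec.Encoder(3, 10).encode(blocks, list(range(10)))

# a reader that latched onto 2-of-6 "succeeds" without any error:
recovered = b"".join(zfec.Decoder(2, 6).decode(shares[:2], [0, 1]))
assert recovered != data                   # silently corrupted
```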
The first fix for this is to reject, with a CorruptShareError, any share whose encoding parameters differ from the values we pulled from the first share. That will at least prevent the possible data corruption.
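The guard amounts to something like this (a hypothetical sketch, not the actual mutable.py code; the method and attribute names are invented):

```python
class CorruptShareError(Exception):
    pass

class Retrieve:
    def __init__(self):
        self._params = None        # (k, N) latched from the first share

    def _validate_share(self, peerid, shnum, k, n):
        # reject any share whose encoding parameters disagree with the
        # values we pulled from the first share we saw
        if self._params is None:
            self._params = (k, n)
        elif self._params != (k, n):
            raise CorruptShareError(
                "share %d from %s is %d-of-%d, expected %d-of-%d"
                % (shnum, peerid, k, n, self._params[0], self._params[1]))
```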
The longer-term fix is to refactor Retrieve to treat k and N as part of the
'verinfo' index, along with seqnum and roothash and the salt. This
refactoring also calls for building up a table of available versions, and
then deciding which one (or ones) to decode on the basis of available shares
and highest seqnum. The new Retrieve class should be able to return multiple
versions, or indicate the presence of newer versions (that might not be
recoverable).
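A minimal sketch of that version table (the tuple layout and function names are assumptions for illustration, not the real Retrieve code):

```python
from collections import defaultdict

# verinfo -> {shnum: share}, where verinfo is the full version index:
# (seqnum, roothash, salt, k, N)
versions = defaultdict(dict)

def add_share(verinfo, shnum, share):
    versions[verinfo][shnum] = share

def recoverable_versions():
    # a version becomes recoverable once it has at least k distinct shares
    return [v for v, shares in versions.items() if len(shares) >= v[3]]

def best_version():
    # decode the recoverable version with the highest seqnum; comparing
    # against the highest seqnum seen overall would also reveal newer
    # versions that are visible but not (yet) recoverable
    candidates = recoverable_versions()
    return max(candidates, key=lambda v: v[0]) if candidates else None
```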
I've pushed the first fix for this. We still need to come up with a unit testing scheme for this stuff, addressed in #207.
Having that first fix in place addresses the immediate problem, so I'm lowering the severity and pushing the rest of this ticket out a release.
If we want #332 to go into the 0.9.0 release, then we also need to fix #312. Do you agree? My concern is that existing dirnodes will wind up with multiple encodings, but maybe I'm wrong.
Hm... yes it would be good to fix this, so that dirnodes produced by v0.8.0 can survive into v0.9.0 and get converted into K=1 dirnodes.
This is our first backwards compatibility decision. :-)
Fixed in changeset:10d3ea504540ae2f. This retains the property that Retrieve will return whatever version was recoverable first: it classifies all shares it sees into buckets indexed by their full "verinfo" tuple (seqnum, roothash, encoding parameters). Whichever bucket gets enough valid shares to decode first will win.
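In other words, the flow looks roughly like this (a hypothetical sketch with invented names; decode_version stands in for the actual decoding step):

```python
def got_share(buckets, verinfo, shnum, share):
    # verinfo = (seqnum, roothash, salt, k, N); k sits at index 3
    shares = buckets.setdefault(verinfo, {})
    shares[shnum] = share
    if len(shares) >= verinfo[3]:
        # this is the first bucket to accumulate k valid shares: it wins
        return decode_version(verinfo, shares)
    return None  # not recoverable yet; keep fetching
```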
The rest of the refactoring (to actually fetch and return multiple versions, and handle the "epsilon" anti-rollback parameter, etc) is left for ticket #205.
Oh, also note that this change does nothing whatsoever about "rebalancing" mutable files to use more shares upon each successive update. In fact the code retains the behavior that shares are always updated in place rather than being moved, so if you upload 10 shares when there are only three peers on the network, those shares will remain bunched up on those three peers even after more peers have been added.
I don't know if we have an enhancement ticket to rebalance bunched-up shares when we find enough peers to do so.
This was fixed in changeset:791482cf8de84a91 (the Trac changeset now known as changeset:791482cf8de84a91 was formerly known as changeset:10d3ea504540ae2f; until now the Trac timeline listed two patches, which have since been obliterated from our trunk).