Can't repair read-only dirnodes/mutable-files #625
Reference: tahoe-lafs/trac-2024-07-25#625
Running a deep-check operation on a dirnode created with "tahoe backup" fails with this exception.
Deep-check should probably silently avoid walking into read-only dirnodes instead of raising any exception.
Attachment exception.html (8299 bytes) added
Oops.
hm, check ought to be able to handle readonly dirnodes. Repair can't (because of the write-enabler problem), but check should.
I'll start by adding a test case. Thanks!
Ah, looking more carefully at your exception, I see that your node was trying to repair the readonly dirnode in question. I added a test case to make sure that checker can handle readonly dirnodes, and it passes.
So, yeah, deep-repair should refrain from trying to repair readonly mutable files and readonly dirnodes. I'll look at the deep-check-results and see if there's a good way to report that there were objects that need repair but which could not be repaired because they're readonly.
Title changed from "Deep-check chokes on readonly dirnodes" to "Deep-check chokes on readonly dirnodes that need repair".
Drat, it looks like this is just too tricky to fix in time for 1.3.0: it requires a better test harness than we currently have, and some more thought about how the problem should be reported (I'm thinking that "repair_attempted" should be incremented, but "repair_succeeded" should not, and the per-file check-and-repair results should include a CannotRepairReadonlyMutableFileError failure).
So we've added a note to NEWS to mention the bug, and we'll put it high on the list for the 1.3.1 release.
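A rough sketch of the reporting idea above, in standalone Python rather than actual Tahoe-LAFS code. The counter names "repair_attempted" and "repair_succeeded" and the exception name come from the comment; the DeepRepairCounters class, the repair_one helper, and the do_repair callback are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


class CannotRepairReadonlyMutableFileError(Exception):
    """Illustrative stand-in for the failure named in the comment above."""


@dataclass
class DeepRepairCounters:
    # Hypothetical container; only the two counter names come from the ticket.
    repair_attempted: int = 0
    repair_succeeded: int = 0
    failures: dict = field(default_factory=dict)  # path -> exception


def repair_one(path, is_readonly, counters, do_repair):
    """Try to repair one object and record the outcome in `counters`.

    `do_repair` is a hypothetical callback that performs the real repair.
    """
    # The attempt is counted even when it is known in advance to fail.
    counters.repair_attempted += 1
    try:
        if is_readonly:
            raise CannotRepairReadonlyMutableFileError(path)
        do_repair(path)
    except CannotRepairReadonlyMutableFileError as e:
        # Keep the failure in the per-file results instead of aborting the walk.
        counters.failures[path] = e
        return False
    counters.repair_succeeded += 1
    return True
```

With this shape, a deep-repair over one read-only and one writable object would end with repair_attempted at 2, repair_succeeded at 1, and one recorded failure, so the caller can tell that something still needs repair.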
Nope, this is getting bumped again: we haven't made enough progress on it. The new test harness is in place (NoNetworkTestGrid), but I'm still not sure how we should handle it. The root issue is that readonly dirnodes don't give us enough information to compute the Write Enabler (by design), and when repair needs to create new shares, it must give those shares a Write Enabler so that later writers can modify those shares.
We've considered changing the storage-server protocol to have the server validate shares (by checking their signatures): this would remove the need for Write Enablers, but would also obligate servers to know more about the client's data format (causing version dependencies between clients and servers) and increase the workload of the servers. Despite these tradeoffs, I think we're likely to move in this direction when we implement Accounting, since Write Enablers are a nuisance (and require an encrypted transport layer, which it would be nice to avoid).
But until then, we may have to decide between being able to repair read-only mutable files, and being able to modify those shares later. One possible change (that wouldn't involve modifying any protocols) would be to have the repairer provide an all-zeros Write Enabler, and then if anyone tries to modify the share again later, have the server validate the signatures and replace the WE if they check out. This would have some security implications, in particular making it possible to cause rollbacks in certain situations.
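To make the all-zeros proposal concrete, here is a minimal sketch of the server-side decision, assuming a toy in-memory slot. Nothing here is the real storage-server protocol: the Slot class, the server_write function, the signature_ok callback, and the 32-byte Write Enabler length are all assumptions for illustration.

```python
import hmac

WE_SIZE = 32              # assumed Write Enabler length, for illustration only
ZERO_WE = bytes(WE_SIZE)  # the all-zeros placeholder a repairer would supply


class Slot:
    """Toy stand-in for one mutable share held by a server."""

    def __init__(self, write_enabler, data):
        self.write_enabler = write_enabler
        self.data = data


def server_write(slot, presented_we, new_data, signature_ok):
    """Decide whether to accept a write to `slot`.

    `signature_ok` is a hypothetical callable(new_data) -> bool standing in
    for the server checking the share's signature.
    """
    if hmac.compare_digest(slot.write_enabler, ZERO_WE):
        # Share was placed by a repairer without the writecap: fall back to
        # validating the signature, then adopt the writer's Write Enabler.
        if not signature_ok(new_data):
            return False
        slot.write_enabler = presented_we
    elif not hmac.compare_digest(slot.write_enabler, presented_we):
        # Normal case: the presented Write Enabler must match the stored one.
        return False
    slot.data = new_data
    return True
```

The sketch does not address the rollback concern raised above: a validly signed but older version would pass signature_ok, so a real design would also need some check on sequence numbers.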
Title changed from "Deep-check chokes on readonly dirnodes that need repair" to "Can't repair read-only dirnodes/mutable-files".
Bumping again, this won't be finished in June.
I'm going to try to make the june release at least tolerate (i.e. skip over) read-only dircaps, so deep-check-and-repair can work on all the files.
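As a sketch of what "skip over" could look like, the toy traversal below keeps walking the whole tree, repairs unhealthy objects it is allowed to repair, and merely records read-only ones. The Node class and the deep_check_and_repair function are hypothetical and do not mirror the real dirnode API; in this model each node carries its own readonly flag.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical model of a file or directory reachable from the root.
    name: str
    healthy: bool = True
    readonly: bool = False
    children: list = field(default_factory=list)


def deep_check_and_repair(root):
    """Walk the tree; repair what can be repaired, skip read-only objects."""
    repaired, skipped = [], []
    stack = [("", root)]
    while stack:
        prefix, node = stack.pop()
        path = f"{prefix}/{node.name}"
        if not node.healthy:
            if node.readonly:
                # Tolerate it: note the path and keep going instead of raising.
                skipped.append(path)
            else:
                node.healthy = True  # stand-in for a real repair
                repaired.append(path)
        # Children are still visited, even below a skipped read-only node.
        for child in reversed(node.children):
            stack.append((path, child))
    return repaired, skipped
```

The skipped list is what a caller could later surface as "needs repair but could not be repaired".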
I've also got an idea about a relatively clean way to address this: use an all-zeros WE to ask the server to please validate the new share instead of relying upon the WE for access control. This will require significant (but compatible) changes to both the client and the server. Also note that the current mutable share format doesn't allow the server to validate the encrypted private key, but it can validate all the other bits. One criterion for the new DSA-based mutable file design (#217) is that every bit of the slot data must be validatable by the server (i.e. if we must embed key material in the share, it must be covered by the signature).
#746 may be an instance of this.
The current code seems to raise RepairRequiresWritecapError (see source:src/allmydata/mutable/repairer.py#L103), rather than the error in the description.
See also #568 (make immutable check/verify/repair and mutable check/verify work given only a verify cap).
From comment:69655: "This will require significant (but compatible) changes to both the client and the server.". Well I guess that means it isn't going to happen for v1.7 then! :-) Bumping to v1.8.0 instead of to "eventually" because it seems somewhat urgent and important to fix this.
Once this ticket is fixed then (as mentioned on #1234), we should change the user interfaces to diminish write caps to read caps when appropriate when emitting them or logging them, such as when reporting that a cap is unrecoverable.