Can't repair read-only dirnodes/mutable-files #625

Open
opened 2009-02-13 18:47:42 +00:00 by francois · 12 comments
francois commented 2009-02-13 18:47:42 +00:00
Owner

Running a deep-check operation on a dirnode created with "tahoe backup" fails with this exception.

MutableFileNode instance has no attribute '_writekey'

Deep-check should probably silently avoid walking into ready-only running instead of raising any exception.

Running a deep-check operation on a dirnode created with "tahoe backup" fails with this exception. ``` MutableFileNode instance has no attribute '_writekey' ``` Deep-check should probably silently avoid walking into ready-only running instead of raising any exception.
tahoe-lafs added the
code-storage
minor
defect
1.3.0
labels 2009-02-13 18:47:42 +00:00
tahoe-lafs added this to the undecided milestone 2009-02-13 18:47:42 +00:00
francois commented 2009-02-13 18:48:33 +00:00
Author
Owner

Attachment exception.html (8299 bytes) added

**Attachment** exception.html (8299 bytes) added

Oops.

hm, check ought to be able to handle readonly dirnodes. Repair can't (because of the write-enabler problem), but check should.

I'll start by adding a test case. Thanks!

Oops. hm, check ought to be able to handle readonly dirnodes. Repair can't (because of the write-enabler problem), but check should. I'll start by adding a test case. Thanks!
warner self-assigned this 2009-02-13 20:41:21 +00:00

Ah, looking more carefully at your exception, I see that your node was trying to repair the readonly dirnode in question. I added a test case to make sure that checker can handle readonly dirnodes, and it passes.

So, yeah, deep-repair should refrain from trying to repair readonly mutable files and readonly dirnodes. I'll look at the deep-check-results and see if there's a good way to report that there were objects that need repair but which could not be repaired because they're readonly.

Ah, looking more carefully at your exception, I see that your node was trying to repair the readonly dirnode in question. I added a test case to make sure that checker can handle readonly dirnodes, and it passes. So, yeah, deep-repair should refrain from trying to repair readonly mutable files and readonly dirnodes. I'll look at the deep-check-results and see if there's a good way to report that there were objects that need repair but which could not be repaired because they're readonly.
warner added
code-dirnodes
and removed
code-storage
labels 2009-02-13 20:53:00 +00:00
warner changed title from Deep-check chokes on readonly dirnodes to Deep-check chokes on readonly dirnodes that need repair 2009-02-13 20:53:00 +00:00

Drat, it looks like this is just too tricky to fix in time for 1.3.0: it requires a better test harness than we currently have, and some more thought about how the problem should be reported (I'm thinking that "repair_attempted" should be incremented, but "repair_succeeded" should not, and the per-file check-and-repair results should include a CannotRepairReadonlyMutableFileError failure).

So we've added a note to NEWS to mention the bug, and we'll put it high on the list for the 1.3.1 release.

Drat, it looks like this is just too tricky to fix in time for 1.3.0: it requires a better test harness than we currently have, and some more thought about how the problem should be reported (I'm thinking that "repair_attempted" should be incremented, but "repair_succeeded" should not, and the per-file check-and-repair results should include a CannotRepairReadonlyMutableFileError failure). So we've added a note to NEWS to mention the bug, and we'll put it high on the list for the 1.3.1 release.

Nope, this is getting bumped again.. we haven't made enough progress on it. The new test harness is in place (NoNetworkTestGrid), but I'm still not sure how we should handle it. The root issue is that readonly dirnode don't give us enough information to compute the Write Enabler (by design), and when repair needs to create new shares, it must give those shares a Write Enabler so that later writers can modify those shares.

We've considered changing the storage-server protocol to have the server validate shares (by checking their signatures): this would remove the need for Write Enablers, but would also obligate servers to know more about the client's data format (causing version dependencies between clients and servers) and increasing the workload of the servers. Despite these tradeoffs, I think we're likely to move in this direction when we implement Accounting, since Write Enablers are a nuisance (and require an encrypted transport layer, which it would be nice to avoid).

But until then, we may have to decide between being able to repair read-only mutable files, and being able to modify those shares later. One possible change (that wouldn't involve modifying any protocols) would be to have the repairer provide an all-zeros Write Enabler, and then if anyone tries to modify the share again later, have the server validate the signatures and replace the WE if they check out. This would have some security implications, in particular making it possible to cause rollbacks in certain situations.

Nope, this is getting bumped again.. we haven't made enough progress on it. The new test harness is in place (`NoNetworkTestGrid`), but I'm still not sure how we should handle it. The root issue is that readonly dirnode don't give us enough information to compute the Write Enabler (by design), and when repair needs to create new shares, it must give those shares a Write Enabler so that later writers can modify those shares. We've considered changing the storage-server protocol to have the server validate shares (by checking their signatures): this would remove the need for Write Enablers, but would also obligate servers to know more about the client's data format (causing version dependencies between clients and servers) and increasing the workload of the servers. Despite these tradeoffs, I think we're likely to move in this direction when we implement Accounting, since Write Enablers are a nuisance (and require an encrypted transport layer, which it would be nice to avoid). But until then, we may have to decide between being able to repair read-only mutable files, and being able to modify those shares later. One possible change (that wouldn't involve modifying any protocols) would be to have the repairer provide an all-zeros Write Enabler, and then if anyone tries to modify the share again later, have the server validate the signatures and replace the WE if they check out. This would have some security implications, in particular making it possible to cause rollbacks in certain situations.
warner added this to the 1.5.0 milestone 2009-04-08 02:28:24 +00:00
warner changed title from Deep-check chokes on readonly dirnodes that need repair to Can't repair read-only dirnodes/mutable-files 2009-04-08 02:28:24 +00:00

bumping again, this won't be finished in June

bumping again, this won't be finished in June
warner modified the milestone from 1.5.0 to 1.6.0 2009-06-19 18:44:51 +00:00

I'm going to try to make the june release at least tolerate (i.e. skip over) read-only dircaps, so deep-check-and-repair can work on all the files.

I've also got an idea about a relatively clean way to address this: use an all-zeros WE to ask the server to please validate the new share instead of relying upon the WE for access control. This will require significant (but compatible) changes to both the client and the server. Also note that the current mutable share format doesn't allow the server to validate the encrypted private key, but it can validate all the other bits. One criteria for the new DSA-based mutable file design (#217) is that every bit of the slot data must be validateable by the server (i.e. if we must embed key material in the share, it must be covered by the signature).

I'm going to try to make the june release at least tolerate (i.e. skip over) read-only dircaps, so deep-check-and-repair can work on all the files. I've also got an idea about a relatively clean way to address this: use an all-zeros WE to ask the server to please validate the new share instead of relying upon the WE for access control. This will require significant (but compatible) changes to both the client and the server. Also note that the current mutable share format doesn't allow the server to validate the encrypted private key, but it can validate all the other bits. One criteria for the new DSA-based mutable file design (#217) is that every bit of the slot data must be validateable by the server (i.e. if we must embed key material in the share, it must be covered by the signature).
warner added
code-mutable
and removed
code-dirnodes
labels 2009-06-29 22:54:46 +00:00
tahoe-lafs added
major
and removed
minor
labels 2009-12-06 00:00:10 +00:00
davidsarah commented 2009-12-06 00:02:55 +00:00
Author
Owner

#746 may be an instance of this.

#746 may be an instance of this.
davidsarah commented 2009-12-30 00:30:57 +00:00
Author
Owner

The current code seems to raise RepairRequiresWritecapError (see source:src/allmydata/mutable/repairer.py#L103), rather than the error in the description.

The current code seems to raise RepairRequiresWritecapError (see source:src/allmydata/mutable/repairer.py#L103), rather than the error in the description.
zooko modified the milestone from 1.6.0 to eventually 2010-01-26 15:39:46 +00:00
davidsarah commented 2010-02-11 01:20:30 +00:00
Author
Owner

See also #568 (make immutable check/verify/repair and mutable check/verify work given only a verify cap).

See also #568 (make immutable check/verify/repair and mutable check/verify work given only a verify cap).
tahoe-lafs modified the milestone from eventually to 1.7.0 2010-02-27 08:54:09 +00:00

From comment:69655: "This will require significant (but compatible) changes to both the client and the server.". Well I guess that means it isn't going to happen for v1.7 then! :-) Bumping to v1.8.0 instead of to "eventually" because it seems somewhat urgent and important to fix this.

From [comment:69655](/tahoe-lafs/trac-2024-07-25/issues/625#issuecomment-69655): "This will require significant (but compatible) changes to both the client and the server.". Well I guess that means it isn't going to happen for v1.7 then! :-) Bumping to v1.8.0 instead of to "eventually" because it seems somewhat urgent and important to fix this.
zooko modified the milestone from 1.7.0 to 1.8.0 2010-05-08 19:41:02 +00:00
zooko modified the milestone from 1.8.0 to soon 2010-08-09 22:17:28 +00:00

Once this ticket is fixed then (as mentioned on #1234), we should change the user interfaces to diminish write caps to read caps when appropriate when emitting them or logging them, such as when reporting that a cap is unrecoverable.

Once this ticket is fixed then (as mentioned on #1234), we should change the user interfaces to diminish write caps to read caps when appropriate when emitting them or logging them, such as when reporting that a cap is unrecoverable.
Sign in to join this conversation.
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: tahoe-lafs/trac-2024-07-25#625
No description provided.