K=1 for mutable files #332

Closed
opened 2008-03-08 01:40:22 +00:00 by zooko · 8 comments

Per this discussion (http://allmydata.org/pipermail/tahoe-dev/2008-March/000416.html), we're changing mutable files to have K=1.

Some documentation needs to be updated. See also #254 (need better user output on UncoordinatedWriteError) and #207 (unit tests for failure modes of small mutable files).

zooko added the unknown, major, defect, 0.8.0 labels 2008-03-08 01:40:22 +00:00
zooko added this to the 0.9.0 (Allmydata 3.0 final) milestone 2008-03-08 01:40:22 +00:00

Also, we need to fix #312 before we change K, otherwise we risk data unavailability for existing files (which are encoded at 3-of-10).

Author

We hesitate to make this change until #207 (unit tests for failure modes of small mutable files) is in place to assure us that this change doesn't destroy any data from the 0.8.0-based Allmydata.com 3.0 beta production grid.


Oh, another concern with k=1 is that this makes it a lot easier to experience
an accidental rollback attack when a single server is offline during an
update. Specifically:

  • I'm experimenting with 1-of-8, since that gets the availability that I
    want (relative to the old 3-of-10)
  • the mutfilenode is created, and ver1 shares are pushed to 8 servers
  • later, we update to ver2, but one of the servers is offline at that moment
    • ver2 shares go to 7 of the original servers and one new one.
  • now the offline server comes back. We now have 8 ver2 shares and 1 ver1
    share
  • now a retrieve occurs. If it hits the once-offline server first, we'll
    finish with ver1, and the accidental rollback will have occurred.

If a server was offline, then the chances of experiencing a rollback are
1-out-of-8 (since it requires that the fastest server in the later retrieval
group be the one with the old version).

When we refactor Retrieve to grab multiple versions (#205), we plan to introduce the
"epsilon" parameter as protection against both this and intentional rollback
attacks. But we're likely to switch to k=1 before we finish that work.
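
The scenario above can be sketched in a few lines of Python. This is only an illustrative model, not Tahoe's Retrieve code: the function names are hypothetical, and it assumes that the retrieval group is the 8 original servers, that each of them is equally likely to answer first, and that a k=1 retrieve accepts whatever version the first responder holds.

```python
from fractions import Fraction


def first_responder_rollback_chance(share_versions):
    """k=1 only: chance that the first responder holds a stale version.

    Assumption (not from the ticket): every server in the retrieval
    group is equally likely to be the fastest one.
    """
    newest = max(share_versions)
    stale = sum(1 for v in share_versions if v < newest)
    return Fraction(stale, len(share_versions))


def recoverable_versions(share_versions, k):
    """Versions that have at least k shares in the retrieval group."""
    counts = {}
    for v in share_versions:
        counts[v] = counts.get(v, 0) + 1
    return {v for v, n in counts.items() if n >= k}


# Retrieval group: 7 servers were updated to ver2, 1 was offline and
# still holds its ver1 share.
group = [2] * 7 + [1]

print(first_responder_rollback_chance(group))    # 1/8
print(sorted(recoverable_versions(group, k=1)))  # [1, 2]
print(sorted(recoverable_versions(group, k=3)))  # [2]
```

Under these assumptions the single stale share is enough to reconstruct ver1 when k=1, whereas with k=3 one leftover ver1 share cannot be decoded on its own.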

Author

If I understand correctly, we're pushing this one out of v0.9.0.

zooko modified the milestone from 0.9.0 (Allmydata 3.0 final) to undecided 2008-03-12 15:50:10 +00:00
Author

If we change the Prime Directive of Uncoordinated Writes: "Don't Do That", then we also need to change the user output that is visible from the wui on UncoordinatedWriteError, as was described in now-closed ticket #254 (need better user output on UncoordinatedWriteError).

Author

I think we've given up on the idea of using K=1 at all. Let's close this as invalid or wontfix or fixed. :-)

Author

Putting this into Milestone 1.1.0 so that Brian will notice it. Justification: this was a proposed robustness improvement to mutable files which was obviated by Brian's excellent "new mutable files" work which is going into 1.1.0.

zooko modified the milestone from undecided to 1.1.0 2008-05-31 00:15:44 +00:00

This is definitely not a 1.1.0 thing.

I think there may still be value in switching to K=1. It needs more testing than we can give it this week, and it's lower priority than anything we have in the next month. Giving up on it requires some thinking time, and there are higher-priority demands on thinking time right now. So, having noticed it, I'm going to move it all the way back out to the Undecided category.

warner modified the milestone from 1.1.0 to undecided 2008-06-03 06:13:43 +00:00
warner added code-mutable and removed unknown labels 2008-06-03 06:13:56 +00:00
zooko added the invalid label 2008-09-24 13:20:31 +00:00
zooko closed this issue 2008-09-24 13:20:31 +00:00
Reference: tahoe-lafs/trac-2024-07-25#332