need better user output on UncoordinatedWriteError #254
Reference: tahoe-lafs/trac-2024-07-25#254
The current strategy for solving the eternal puzzle of availability vs. consistency in the presence of multiple uncoordinated writes is to tell the user "Don't Do That." -- don't allow multiple people, or multiple processes belonging to the same person, to write to the same directory at the same time. (For example, make a different directory for each person who needs to write.)
This strategy has many benefits, including being easy to implement, easy to understand, offering perfect availability, and being flexible for many different use cases. However, it has the drawback that so far no actual user has read the doc (source:doc/mutable.txt) and planned in advance to avoid uncoordinated writes. In fact, even I, one of the architects of the "Don't Do That." strategy, have often forgotten, and Done It.
This shows that for some but not all of those aforementioned many use cases, people are going to want a more automated way to trade away some availability in order to get consistency. This automation might not need to live in Tahoe proper, but might be more of a feature of the user interface or application layer.
Anyway, the very next improvement, which we should do ASAP, is to make the error message clearly explain to the user (a) that this is the expected result of uncoordinated writes, not an internal error in the Tahoe implementation, (b) who wrote what and when (inasmuch as we can easily provide clues about that), and (c) "Don't Do That!".
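As a rough illustration of points (a) through (c), a message builder might look something like the sketch below. Everything here is an assumption made for illustration: the exception class is a stand-in rather than Tahoe's actual class, and `explain_uncoordinated_write`, `target` and `clues` are hypothetical names.

```python
class UncoordinatedWriteError(Exception):
    """Hypothetical stand-in; not Tahoe's actual exception class."""


def explain_uncoordinated_write(target, clues=()):
    """Build a user-facing message for an uncoordinated-write failure.

    target: human-readable name of the directory or file being modified.
    clues:  optional iterable of (writer, version) pairs, if any are known.
    """
    lines = [
        # (a) say what happened and that it is not an internal error
        "Uncoordinated write detected while modifying %s." % (target,),
        "This is the expected result of two writers changing the same "
        "mutable file or directory at the same time; it is not an "
        "internal error in Tahoe.",
    ]
    # (b) report whatever clues we have about who wrote what
    for writer, version in clues:
        lines.append("  conflicting write: %s wrote version %s" % (writer, version))
    # (c) "Don't Do That!"
    lines.append(
        "To avoid this, make sure only one writer modifies a given "
        "directory at a time."
    )
    return "\n".join(lines)


if __name__ == "__main__":
    print(explain_uncoordinated_write("my-directory", [("peer-a", 3)]))
```

How many clues are actually available depends on what the mutable-file code can cheaply report, so the `clues` argument is optional in this sketch.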
I think this came about in a set of serial modifications to the directory, not parallel, specifically so that they would not overlap.
Claudio (at Digbang) should be able to give more information about how he did this.
The modifications were indeed parallel. I was handling concurrent adds, deletes and uploads.
Haven't seen the exception pop up when making the requests serially.
(A note clarifying this in the webapi.txt might be useful for front-end developers)
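For front-end developers, one way to keep modifications to a single directory serial is to funnel them through a per-directory lock. The sketch below assumes all writers live in one process; it does nothing for writers on other machines or in other processes. The names `serialized` and `dir_id` are hypothetical and not part of any Tahoe API.

```python
import threading
from collections import defaultdict

# One lock per directory identifier, created on first use.
_registry_lock = threading.Lock()
_dir_locks = defaultdict(threading.Lock)


def serialized(dir_id, operation):
    """Run operation() while holding the lock for dir_id.

    dir_id:    any hashable identifier for the directory being modified.
    operation: zero-argument callable that performs the add/delete/upload.
    """
    # Guard the registry so two threads never create two locks for one dir_id.
    with _registry_lock:
        lock = _dir_locks[dir_id]
    # Only one operation per directory runs at a time; others wait here.
    with lock:
        return operation()
```

A caller would wrap each add, delete or upload request for a directory in `serialized(dir_id, ...)`, so requests for the same directory queue up while requests for different directories still run in parallel.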
Claudio: good point about a note in webapi.txt. I'll do that.
Also note that the top item on the #207 megaticket is to implement the
recovery algorithm that we documented in docs/mutable.txt:
http://allmydata.org/trac/tahoe/browser/docs/mutable.txt#L436
With recovery in place, UncoordinatedWriteError would
change from meaning "you shouldn't have done that, and the file might now be
lost or at least very unhealthy", to "you shouldn't have done that, but some
version of the file is probably very healthy right now", which is better (for
certain values of "better": it might make it safe for application code to
catch and log+ignore UncoordinatedWriteError).
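The "catch and log+ignore" pattern for application code might look roughly like this. It is only a sketch, and only reasonable once recovery is in place as described above. The exception class is a hypothetical stand-in for Tahoe's, and `write_ignoring_collisions` and `do_write` are illustrative names.

```python
import logging

log = logging.getLogger("app.tahoe")


class UncoordinatedWriteError(Exception):
    """Hypothetical stand-in; not Tahoe's actual exception class."""


def write_ignoring_collisions(do_write):
    """Call do_write(); return True on success, False on a write collision.

    do_write: zero-argument callable that performs the mutable-file write.
    Any other exception is left to propagate to the caller.
    """
    try:
        do_write()
    except UncoordinatedWriteError as e:
        # Assumes recovery leaves some healthy version of the file in place.
        log.warning("uncoordinated write, ignoring: %s", e)
        return False
    return True
```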
I'm writing a unit test for UncoordinatedWriteError.
I'm going to do the following on the plane tomorrow:
partially fixed by changeset:9e2ed2df01cf427c
moving the rest to 0.7.1
I'm going to update this to reflect the new K=1 approach for mutable files.
We're pushing off the K == 1 approach. When/if we do it, then let's remember to update the user interface in case of UncoordinatedWriteError.
Closing for now, and linking to this (now closed) ticket from #332 (K=1 for mutable files).