mutable modify() may need to publish even if retry was a NOP #551
Reference: tahoe-lafs/trac-2024-07-25#551
The mutable-file modify() function takes a "modifier callback" and performs
(retrieve-modify-publish) until either the modifier callback does
not actually modify anything, or the publish succeeds without raising
UncoordinatedWriteError. The intent is to keep applying some change until it
sticks. If UCWE is reliably raised in case of overlapping writes, and if all
parties keep trying (with random backoff) until they succeed, eventually all
parties' changes should be applied.
The don't-publish-if-modifier-didn't-change-anything clause is intended to
handle the case where two parties are each performing the same change.
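The loop described above can be sketched roughly as follows. This is a simplified, synchronous illustration and not the actual Tahoe-LAFS code: modify_current, FakeMutableFile, the max_attempts limit, and the locally defined UncoordinatedWriteError stand-in are all hypothetical names chosen for the example. The fake file mimics the #546 situation, where the write lands but UCWE is raised anyway.

```python
class UncoordinatedWriteError(Exception):
    """Local stand-in for the error raised when a publish sees overlapping writes."""


class FakeMutableFile:
    """Hypothetical in-memory file used only to exercise the loop."""

    def __init__(self, contents, ucwe_on_publishes=()):
        self.contents = contents
        self.publish_calls = 0
        self._ucwe_on = set(ucwe_on_publishes)

    def retrieve(self):
        return self.contents

    def publish(self, new_contents):
        self.publish_calls += 1
        # The write lands even when UCWE is reported (the #546 situation).
        self.contents = new_contents
        if self.publish_calls in self._ucwe_on:
            raise UncoordinatedWriteError()


def modify_current(retrieve, publish, modifier, max_attempts=5):
    """Retrieve-modify-publish until the modifier is a NOP or a publish succeeds."""
    for _ in range(max_attempts):
        old = retrieve()
        new = modifier(old)
        if new == old:
            # Modifier changed nothing: stop without publishing.
            return old
        try:
            publish(new)
            return new
        except UncoordinatedWriteError:
            # A real client would back off for a random interval here.
            continue
    raise UncoordinatedWriteError("gave up after %d attempts" % max_attempts)


f = FakeMutableFile(frozenset(), ucwe_on_publishes={1})
result = modify_current(f.retrieve, f.publish, lambda s: s | {"child"})
print(f.publish_calls)
```

In this sketch the first publish raises UCWE, the second retrieve already sees the change, the modifier becomes a NOP, and the loop exits after a single publish. Nothing reinforces the new version after the UCWE.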
However, I'm starting to think that it's a mistake. We planned (but did not
implement) a mutable-file "recovery" mechanism, which was to be triggered by
any UCWE, and was to reinforce some version (not necessarily the one that we
just published), to reduce the chance that later writes and crashing clients
could break the file completely.
An uncontested retrieve+publish cycle should have about the same result as a
dedicated recovery operation. So, for the modify() function, if the first
publish caused UCWE, then I'm thinking we should always do a second publish,
even if the modifier callback doesn't wind up modifying anything. Since #546
is a place where UCWE can occur when it doesn't really need to, and because
in #546 the publish is quite successful (the UCWE is raised just because of a
few leftover old "surprise" shares; the new version has a full 10 shares
written to the top of the permuted peer list), the second retrieve will get
the new version of the file, and the second publish would normally be
skipped.
So I currently think that we should change the logic of modify() to
keep doing retrieve-modify-publish until the publish finishes without UCWE,
and remove the if-modifier-didn't-change-anything test.
changeset:ffb598514656e7b2 implements this. I made it such that if the initial call of the modifier is a NOP, the file is not published. So the rule is that a publish only happens if something changed, but if we ever see a UCWE, we'll keep trying until we get a publish that doesn't see UCWE (subject to the limitations of the back-off-agent, which gives up after four retries).
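The rule stated in this comment can be sketched as below. This is an illustrative, synchronous approximation and not the code from the changeset: modify_with_reinforcement, FakeMutableFile, and the max_retries parameter are hypothetical names, UncoordinatedWriteError is a local stand-in, and the random backoff is omitted.

```python
class UncoordinatedWriteError(Exception):
    """Local stand-in for the error raised when a publish sees overlapping writes."""


class FakeMutableFile:
    """Hypothetical in-memory file used only to exercise the loop."""

    def __init__(self, contents, ucwe_on_publishes=()):
        self.contents = contents
        self.publish_calls = 0
        self._ucwe_on = set(ucwe_on_publishes)

    def retrieve(self):
        return self.contents

    def publish(self, new_contents):
        self.publish_calls += 1
        # The write lands even when UCWE is reported (the #546 situation).
        self.contents = new_contents
        if self.publish_calls in self._ucwe_on:
            raise UncoordinatedWriteError()


def modify_with_reinforcement(retrieve, publish, modifier, max_retries=4):
    """Skip the publish only if the modifier is a NOP and no UCWE has been seen."""
    seen_ucwe = False
    retries = 0
    while True:
        old = retrieve()
        new = modifier(old)
        if new == old and not seen_ucwe:
            # Nothing changed and no earlier UCWE: no publish is needed.
            return old
        try:
            # After a UCWE, publish even if the modifier is now a NOP,
            # so that the current version gets reinforced.
            publish(new)
            return new
        except UncoordinatedWriteError:
            seen_ucwe = True
            if retries >= max_retries:
                raise  # give up after max_retries retries
            retries += 1
            # A real client would back off for a random interval here.


f = FakeMutableFile(frozenset(), ucwe_on_publishes={1})
result = modify_with_reinforcement(f.retrieve, f.publish, lambda s: s | {"child"})
print(f.publish_calls)
```

With the same fake file as in the #546 scenario, the first publish raises UCWE and the loop publishes a second time even though the modifier no longer changes anything. If the modifier is a NOP on the very first call, no publish happens at all.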