add an option for "tahoe manifest" to not skip duplicates, or a --recursive option to "tahoe ls" #662
Reference: tahoe-lafs/trac-2024-07-25#662
My current job involves tools which modify a directory tree [...], and I'd like to use "tahoe manifest" to compare the before and after trees to make sure they're the same. Unfortunately, the cycle-avoidance code in "tahoe manifest" (which simply ignores files or directories that it has seen before) is causing me trouble: an object that's referenced from multiple places in the tree will appear in the manifest output at only one of them, and that location depends upon the traversal order. (I just pushed a patch to make deep_traverse at least sort the child names before walking them, so the output should now be consistent.)
I'm thinking that it might be nice to have a flag to "tahoe manifest" that tells it not to suppress duplicates like this. The cycle-avoidance code would need to change: instead of keeping a set of nodes that have already been visited, it should just keep a list of the ancestors of the current node. A cycle should be declared if the child node we're considering entering appears on its own ancestor list.
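The ancestor-list idea above could look roughly like the following. This is only a sketch under assumptions, not the real deep_traverse code: the tree is modelled as a hypothetical `dirs` dict mapping each directory cap to its `{name: childcap}` children, any cap missing from `dirs` is treated as a file, and the function name `walk_keeping_duplicates` is made up for illustration.

```python
def walk_keeping_duplicates(dirs, cap, path=(), ancestors=()):
    """Yield (path, cap) for every reachable entry, duplicates included.

    `dirs` is a hypothetical stand-in for the grid: it maps a directory
    cap to {child name: child cap}. Caps not present in `dirs` are files.
    """
    yield path, cap
    children = dirs.get(cap)
    if children is None:
        return  # a file: nothing to descend into
    # Only the chain of directories from the root to here matters,
    # not every node seen so far.
    lineage = ancestors + (cap,)
    for name in sorted(children):  # sorted, so the output order is stable
        child = children[name]
        if child in lineage:
            continue  # the child is one of our own ancestors: a cycle
        yield from walk_keeping_duplicates(dirs, child, path + (name,), lineage)


# "D-shared" is linked from two places and also links back to the root.
example_dirs = {
    "D-root": {"a": "D-shared", "b": "D-shared"},
    "D-shared": {"f": "F-1", "loop": "D-root"},
}
entries = list(walk_keeping_duplicates(example_dirs, "D-root"))
```

In this example the shared directory shows up under both `a` and `b`, while the `loop` link back to the root is skipped instead of recursing forever.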
It might also be useful to have two sets of stats: one that includes shared objects, and one that does not.
Oh, I should mention that partly this is the result of changing goals/definitions of "tahoe manifest". Originally, it was intended purely as a set of verifycaps: the idea being that you'd compute your manifest and then hand it to a separate Verifier service, which would take responsibility for checking up on all of them. It was also the intention that a verifycap be usable as a repaircap, so the Verifier service could be a Verifier/Repairer service. In these cases, we don't care about duplicates: we just want the minimum-size set of verifycaps, and it doesn't matter what path or paths were used to store each one.
Later, "tahoe manifest" acquired path information, because that made it easier to backtrack and find a parent directory for any object which was later found to have problems. Around the same time, the definition of "manifest" started changing, and now we sort of think about it as a list of (path,cap) tuples.
So maybe we need to be more clear about our definitions, and perhaps create a separate API for each one.
Incidentally, the cycle handling code on the "list of (path,cap) tuples" API could respond to cycles by emitting a special marker: (type="cycle", cap), and maybe include otherpath= too. The program which is receiving the manifest could conceivably use this information to stitch together the cycle somehow.
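A rough sketch of what emitting such a marker might look like, again under assumptions: the `dirs` mapping (directory cap to `{name: childcap}`) is a hypothetical stand-in for the grid, the record layout (dicts with `type`, `path`, `cap`) and the "file"/"directory" type names are invented here, and only `type="cycle"`, `cap` and `otherpath` come from the suggestion above.

```python
def manifest_with_cycle_markers(dirs, cap, path=(), ancestors=None):
    """Yield one record per entry; emit a cycle marker instead of looping.

    `ancestors` maps each ancestor directory cap to the path at which it
    was entered, so a marker can report where the cycle points back to.
    """
    ancestors = {} if ancestors is None else ancestors
    if cap in ancestors:
        # The cap is one of our own ancestors: report it, do not descend.
        yield {"type": "cycle", "path": path, "cap": cap,
               "otherpath": ancestors[cap]}
        return
    children = dirs.get(cap)
    if children is None:
        yield {"type": "file", "path": path, "cap": cap}
        return
    yield {"type": "directory", "path": path, "cap": cap}
    lineage = {**ancestors, cap: path}  # copy, so siblings are unaffected
    for name in sorted(children):
        yield from manifest_with_cycle_markers(
            dirs, children[name], path + (name,), lineage)


# "D-sub" contains a link back up to the root directory.
cyclic_dirs = {
    "D-root": {"sub": "D-sub"},
    "D-sub": {"back": "D-root", "f": "F-1"},
}
records = list(manifest_with_cycle_markers(cyclic_dirs, "D-root"))
```

A consumer could use the `otherpath` field of the marker to reconnect `sub/back` to the root when rebuilding the tree.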
I tried using manifest as a sort of recursive ls, and immediately ran into this issue that it wasn't showing duplicates. Unless there's recursive ls behavior available somewhere else, it would be great to fix this.
Title changed from 'change "tahoe manifest" to not skip duplicates' to 'add an option for "tahoe manifest" to not skip duplicates, or a --recursive option to "tahoe ls"'