add streaming manifest/deep-checker/repairer #590
Reference: tahoe-lafs/trac-2024-07-25#590
Our current deep-traversal webapi operations (manifest, deep-check,
deep-repair, and to a lesser extent deep-stats) run asynchronously. The
"start-operation" POST provides an unguessable "ophandle", the node builds up
a big table of results, and every once in a while the client polls to see if
the operation has completed (by doing a GET to /ophandle/$HANDLE), retrieving
a representation of the big table when it's done.
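The polling flow described above can be sketched as a small client loop. This is only an illustrative sketch: the `/operations/<handle>?output=json` path and the `finished` field are stand-ins for the real webapi details, and the node is faked so the example is self-contained.

```python
import json
import time

def poll_operation(fetch, ophandle, interval=0.01, max_polls=100):
    """Poll the operation page until the node reports completion, then
    return the decoded results table. `fetch` is any callable that
    performs the GET and returns the response body as a string."""
    for _ in range(max_polls):
        body = json.loads(fetch("/operations/%s?output=json" % ophandle))
        if body.get("finished"):
            return body
        time.sleep(interval)
    raise RuntimeError("operation %s did not finish in time" % ophandle)

# Stand-in for the node: pretends the big results table becomes
# available on the third poll.
class FakeNode:
    def __init__(self):
        self.calls = 0
    def fetch(self, url):
        self.calls += 1
        if self.calls < 3:
            return json.dumps({"finished": False})
        return json.dumps({"finished": True,
                           "count-objects-checked": 2})

node = FakeNode()
results = poll_operation(node.fetch, "abc123")
```

Note that the whole results table arrives at once in the final poll, which is exactly the memory problem discussed below.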
We changed to this async mode from a synchronous model in which the client
starts with a GET or a POST, the node builds up a big table of results, then
the node returns the representation of that table to the initial GET/POST
request. We made this change because a deep-check (in particular deep-verify
or repair) can take a very long time, days or weeks for very large/deep
directory structures, and HTTP requests felt fragile: browsers get
interrupted, network addresses change, routes flap, etc. The idea of losing a
lot of progress on a deep-traversal operation felt unacceptable to us.
However, the big table that is generated by any traversal operations (except
for the trivial deep-stats) on a large directory structure is a problem all
by itself. For several allmydata.com accounts, we're seeing upwards of 500k
directories, and millions of files. This results in the tahoe node using
something like 700MB of ram to hold this table, which (depending upon the
host) results in swap thrash. In addition, the act of polling for the results
causes large memory usage, both in the server (to translate the table into,
say, JSON) and in the client (to translate it back).
We're currently trying to improve the efficiency and usability of
deep-repair. We'd like to try a more streaming approach: one tool traverses
the directory structure and builds up a list of filecaps/dircaps. A second
tool then walks that list, performing a check on each file, writing the
results to a second list. A third tool reads that check-results list, decides
which files need repair, and performs a repair on those files. A dispatch
tool could run these components on multiple allmydata.com accounts with
whatever degree of parallelism seems appropriate.
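A minimal sketch of that three-stage pipeline, with plain Python lists standing in for the on-disk manifest and check-results files, and stub check/repair callables. All names here are illustrative, not part of any existing tool.

```python
def traverse(node, path=()):
    """Stage 1: depth-first walk, yielding (path, cap) for every entry.
    Each entry is (cap, children-dict-or-None)."""
    for name, (cap, children) in sorted(node.items()):
        yield ("/".join(path + (name,)), cap)
        if children is not None:
            for item in traverse(children, path + (name,)):
                yield item

def check_all(manifest, check):
    """Stage 2: run a check on each cap, yielding (path, cap, healthy)."""
    for path, cap in manifest:
        yield (path, cap, check(cap))

def repair_unhealthy(results, repair):
    """Stage 3: repair only the entries the checker flagged."""
    repaired = []
    for path, cap, healthy in results:
        if not healthy:
            repair(cap)
            repaired.append(path)
    return repaired

# Toy directory structure with one unhealthy file.
tree = {
    "a":   ("URI:file1", None),
    "dir": ("URI:dir1", {"b": ("URI:file2", None)}),
}
manifest = list(traverse(tree))
results = check_all(manifest, check=lambda cap: cap != "URI:file2")
repaired = repair_unhealthy(results, repair=lambda cap: None)
```

Because each stage reads a linear list produced by the previous one, the stages can be run separately, in parallel across accounts, or restarted.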
All of these tools except the first deep-traversal tool could be interrupted
and pick up where they left off, if they kept suitable state about how far
they'd gotten through their (linear) list of work to do. The deep-traversal
tool would not have this luxury (since its state is basically embedded in the
python stack of the deep-traversal code inside a tahoe node), but hopefully
the chances of interrupting a manifest operation are relatively low.
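The resumable tools could keep that state as simply as an index into their work list, checkpointed after each item. A rough sketch (the state-file format and all names are hypothetical):

```python
import json
import os
import tempfile

def process_list(work, state_path, handle):
    """Walk a linear work list, checkpointing progress after every item
    so an interrupted run can resume where it left off."""
    start = 0
    if os.path.exists(state_path):
        with open(state_path) as f:
            start = json.load(f)["next-index"]
    for i in range(start, len(work)):
        handle(work[i])
        with open(state_path, "w") as f:
            json.dump({"next-index": i + 1}, f)

state = os.path.join(tempfile.mkdtemp(), "state.json")
work = ["a", "b", "c", "d"]
done = []
crashed = [False]

def handler(item):
    # Simulate an interruption partway through the first run.
    if item == "c" and not crashed[0]:
        crashed[0] = True
        raise RuntimeError("simulated interruption")
    done.append(item)

try:
    process_list(work, state, handler)   # dies after "a" and "b"
except RuntimeError:
    pass
process_list(work, state, handler)       # resumes at index 2
```

After the second run, every item has been handled exactly once, which is the property the second and third pipeline tools need.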
So, to accomplish this, and to reduce the memory usage of the node and CLI
tool in question, we'd like to have streaming forms of 'build-manifest' and
'deep-check'. These forms will execute in a single HTTP operation (perhaps
GET for manifest, since it has no side-effects, and POST for deep-check,
since with the repair=true option it does have side-effects). Each form
will emit incrementally-parseable units of status as part of its HTTP
response body, one unit per file visited. When the traversal is complete, a
second kind of unit may be emitted to report e.g. the deep-stats summary.
The CLI tools which use these interfaces will be able to read the response
body incrementally, parsing each unit as it arrives, and then writing a
summary to a file for further processing. At no point will either the Tahoe
node or the CLI tool be holding information about more than a single file at
a time. The Tahoe node's stack will, of course, hold state to manage the
deep-traversal operation, but the size of this state is related to the depth
of the directory tree (worst case is the size of all ancestors and
uncle-nodes of the deepest child node), which is roughly log(N) instead of N.
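A toy traversal illustrates the depth-versus-size distinction: even over a tree with thousands of entries, the live recursion state never exceeds one frame per ancestor of the current node.

```python
def traverse(node, depth=1, stats=None):
    """Depth-first visit of a nested-list tree, recording how many
    nodes were seen versus how deep the live call stack ever got."""
    if stats is None:
        stats = {"visited": 0, "max-depth": 0}
    stats["visited"] += 1
    stats["max-depth"] = max(stats["max-depth"], depth)
    for child in node:
        traverse(child, depth + 1, stats)
    return stats

# A wide, shallow tree: 10,000 leaves under a single root.
tree = [[] for _ in range(10000)]
stats = traverse(tree)
# 10,001 nodes visited, but the deepest live stack was only 2 frames.
```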
The individual units can be JSON, as long as the stream has some easily
parseable delimiters so the client can know where one unit ends and the next
one begins. If we can guarantee that the JSON does not contain a newline,
then we can use newlines as delimiters. If we can't enforce any such
restrictions on the JSON, then we'd have to use netstrings for each unit, and
parsing them is a bit harder.
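Assuming newline-free JSON units with newline delimiters, a reader can hold only one unit in memory at a time. The field names in the sample stream below are made up for illustration.

```python
import json

def parse_stream(lines):
    """Incrementally parse a newline-delimited JSON stream, yielding one
    decoded unit at a time; only a single unit is ever held in memory."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# A fake response body: one unit per file, plus a trailing stats unit.
stream = iter([
    '{"path": "a/b", "healthy": true}\n',
    '{"path": "a/c", "healthy": false}\n',
    '{"type": "stats", "count-objects-checked": 2}\n',
])
units = list(parse_stream(stream))
unhealthy = [u["path"] for u in units if u.get("healthy") is False]
```

With netstrings instead, the reader would have to accumulate bytes until the declared length is reached before decoding, which is the extra parsing complexity mentioned above.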
If the individual units have a more custom format (like
"\nSI=xxx;CAP=xxx;SIZE=xxx;HEALTHY=true\n"), then we can use tools like
'grep' on the output to e.g. filter out the files that need repair. It may be
easier to use JSON and write our tools in Python.
changeset:476a5c8fac9909d5 adds the stream-deep-check webapi command, which includes repair=true. An earlier patch (which made it into 1.3.0, unlike changeset:476a5c8fac9909d5) added stream-manifest.
The units are JSON, with no internal newlines, so there is one line of output per file/directory examined, plus one at the end with the aggregate stats. We can't use grep on the output, but the python tool that uses `[simplejson.loads(line) for line in out.splitlines()]` is easy to write.
The only piece left is to modify the `tahoe deep-check` tool to use this streaming API instead of the old polling one; `tahoe manifest` has already been updated.

I think this is now complete.
changeset:fde2289e7b1fda8a updates the CLI "tahoe deep-check" command to use the streaming form.
And changeset:fd4ceb6a8762924c + changeset:a3c1fe35d9eda0df update the CLI commands to report errors properly.