add caching to tahoe proper? #316
Reference: tahoe-lafs/trac-2024-07-25#316
We might want to add caching to the Tahoe codebase.
I'll start by saying that I'm not excited about the idea, because in my
experience, "transparent" caches are rarely all that transparent: it is
easy to get into a state where you're sure you've made some change, but
you aren't seeing it take effect, and it turns out that some stale cache
you didn't previously know about is in the way.
That said, since our various FUSE-like projects are making Tahoe vdrives
visible to applications that were not designed with this sort of filesystem
in mind, it might be a good idea to determine a common set of goals that a
vdrive cache would fulfill, and then implement them in a central location:
the tahoe codebase.
One such problem has been seen in the windows-FUSE plugin (which, despite the
name, is really a local SMB server). The application does
open/write/write/close, but the SMB client code expects the writes to take a
long time and the close() to be quick. If the plugin is delivering data to a
tahoe node, then the writes may push all that data directly to local disk,
and not start the upload until the close() is called. This is mostly a
consequence of the fact that we use immutable files for data: it is hard to
upload in a streaming fashion, because the application might do a seek() at
any moment and invalidate the data we've written so far.
The SMB client code times out when the close() takes a long time, so one
trick Mike has been forced to pull is to lie to the application: respond to
the close() quickly, and allow the real upload to continue in the background.
This has a whole host of problems, the most dangerous being that the file
isn't really uploaded yet (so if the user turns off their computer, the file
is not really stored: I'm told that this is what prompted Apple to disable
the use of network drives for their Time Machine backup application). The
most obvious problem is that many Windows programs follow the close() call
with an immediate open() and read back the data they just wrote. To deal with
this, Mike's plugin must also spoof the directory entry, and pretend that the
file is really there (with contents that come from the temp file being used
for the upload).
So, it is a real problem, and I don't yet see a good answer. But some form of
caching is likely to be thrown around in the search for an answer, and there's
a remote chance that it'd be better to do it inside Tahoe than outside.
My vague thoughts are:

* We should use separate mechanisms for immutable and mutable files, both to do
  the caching and to control the retention policy.
* Immutable files can be cached indefinitely, limited only by the disk
  space we're willing to consume. I.e. the only reason to not cache an
  immutable file is to avoid using the disk space. This cache can be
  implemented by using the URI as a filename, something like
  $BASEDIR/cache/immutable/$URI
* Mutable files (including dirnodes) are harder, because their contents can
  change, and may be changed by other people. The goals are:
  * applications frequently read the same file or directory multiple
    times within the same second, and we want this to be fast
  * changes made by other parties should become visible reasonably
    soon
* One approach is to cache the mutable file
  contents in $BASEDIR/cache/mutable/$URI, with a rule that says we ignore
  (and delete) any entry that has been there for more than 10 seconds.
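To make that layout concrete, here is a minimal sketch of such a two-tier on-disk cache keyed by capability URI, with the 10-second rule applied only to the mutable side. All names here are illustrative, not actual Tahoe code; the URI is percent-quoted to make it filesystem-safe.

```python
import os
import time
from urllib.parse import quote

CACHE_TTL_MUTABLE = 10  # seconds; the "ignore and delete after 10s" rule above


class FileCache:
    """Hypothetical on-disk cache: $BASEDIR/cache/{immutable,mutable}/$URI."""

    def __init__(self, basedir):
        self.immutable_dir = os.path.join(basedir, "cache", "immutable")
        self.mutable_dir = os.path.join(basedir, "cache", "mutable")
        os.makedirs(self.immutable_dir, exist_ok=True)
        os.makedirs(self.mutable_dir, exist_ok=True)

    def _path(self, uri, mutable):
        # Quote the URI so characters like ':' can't escape the cache dir.
        d = self.mutable_dir if mutable else self.immutable_dir
        return os.path.join(d, quote(uri, safe=""))

    def put(self, uri, data, mutable):
        with open(self._path(uri, mutable), "wb") as f:
            f.write(data)

    def get(self, uri, mutable):
        p = self._path(uri, mutable)
        try:
            st = os.stat(p)
        except FileNotFoundError:
            return None
        if mutable and time.time() - st.st_mtime > CACHE_TTL_MUTABLE:
            os.remove(p)  # ignore (and delete) stale mutable entries
            return None
        with open(p, "rb") as f:
            return f.read()
```

Immutable entries never expire here, since the only reason to drop them is disk pressure; an eviction policy would bolt on separately.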
It might also be worthwhile to allow the application (via the web API) to
influence the caching: GET /uri/$DIRURI?t=json&cache-for=180. There are also
several standard HTTP headers that control this sort of behavior, which may be
more appropriate (but possibly harder to use) than query args.
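As a sketch of how the two mechanisms could coexist, a hypothetical helper (neither the cache-for argument nor this function exists in Tahoe today) might prefer an explicit query argument and fall back to the standard Cache-Control max-age directive:

```python
def cache_lifetime(query_args, headers, default=0):
    """Return a caching hint in seconds, from a hypothetical 'cache-for'
    query argument or a standard Cache-Control max-age header."""
    if "cache-for" in query_args:
        return int(query_args["cache-for"])
    # Fall back to the standard HTTP header, e.g. "no-cache, max-age=60".
    for directive in headers.get("Cache-Control", "").split(","):
        directive = directive.strip()
        if directive.startswith("max-age="):
            return int(directive.split("=", 1)[1])
    return default
```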
This ticket is intended to gather discussion and come to an implementation
decision.
As a matter of division of labor, and layering of design, I would rather that Tahoe proper (and Brian) concentrate on improved file semantics, i.e. "Medium-Sized Mutable Distributed Files/Archives", and that other layers/other authors, e.g. Mike Booker, Nathan Wilcox, FaceItLabs, etc. add caching (if needed for particular apps) separately.
More flexible and faster semantics can reduce the need for caching. For example, either "Small Distributed Mutable Files Plus Incremental Upload" or "Medium-Sized Distributed Mutable Files" could satisfy the need that Mike reported for Windows backup applications: the ability to open(), then write();write();write();write(), then call close(), observe that the close() returned quickly, then call open() again, then read(), and get back the data just written.
Obviously SMDF+IncrementalUpload and MDMF/A don't solve all needs that all users of filesystems have. There is an infinite progression of such needs, and we hope to support the easiest ones first. (Nor, as Brian points out above, can caching layers satisfy all such needs.)
Brian wrote: "This is mostly a consequence of the fact that we use immutable files for data: it is hard to upload in a streaming fashion, because the application might do a seek() at any moment and invalidate the data we've written so far."
This is not why the current Tahoe CHK files fail to support this use case. Observe that the use case never tries to seek. A possible design point to aim at would be "CHK Files + Incremental Upload", and would support the Windows backup app in question without supporting seek(). This would be easier than a "Medium-Sized Mutable Distributed File/Archive" which supported seek(). I don't know whether it would be worth spending time to implement CHK+IncrementalUpload when that time could instead be spent supporting MDMF/A instead, though.
I had an idea that I wanted to get down before forgetting it: we could add a pubsub mechanism to the storage servers (at least the current generation, which is reached via foolscap connections), to let clients be quickly notified about changes to mutable shares for which they're holding a cached copy. They would then be allowed to hold on to their cached value until the pubsub channel sends an "invalidated" message. We'd need to limit the number of subscriptions, to bound memory usage on the servers. And it wouldn't get us closer to our goal of fewer active TCP connections. And it wouldn't work with the proposed HTTP-based storage servers.
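The server side of that idea might look something like the following sketch (class and method names are made up for illustration; this is not part of any Tahoe API). The subscription cap bounds server memory, at the cost of refusing late subscribers:

```python
class InvalidationHub:
    """Hypothetical server-side pubsub channel for mutable-share changes."""

    def __init__(self, max_subscriptions=1000):
        self.max_subscriptions = max_subscriptions
        self.subscribers = {}  # storage_index -> list of callbacks

    def subscribe(self, storage_index, callback):
        total = sum(len(cbs) for cbs in self.subscribers.values())
        if total >= self.max_subscriptions:
            return False  # bound memory usage by refusing new subscriptions
        self.subscribers.setdefault(storage_index, []).append(callback)
        return True

    def share_modified(self, storage_index):
        # Called when a mutable share is written: tell every caching client
        # to drop its copy, and drop the (now-stale) subscriptions too.
        for cb in self.subscribers.pop(storage_index, []):
            cb(storage_index)
```

A client that gets False back from subscribe() would simply fall back to the plain timeout-based policy.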
I think caching would be very valuable, even if only for immutable files. If figuring out how to handle caching for mutable files would delay implementation of immutable caching, I'd say to defer the mutable caching.
OTOH, caching of dirnodes would really be nice, if it could be done safely. I think a cache with a fast timeout, as originally suggested, would accomplish this almost as effectively and much more simply than a pubsub mechanism, especially with a small adjustment: Don't delete the "timed out" mutable caches. Instead, just make Tahoe check them by looking to see if there's a newer version. For small mutable files, that's probably not a big win over retrieving the latest, but it might help a little, and would be a win for larger mutable files.
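That "check rather than delete" policy could be sketched as follows, assuming hypothetical fetch_version/fetch_contents callables where the version check is much cheaper than re-downloading the contents (none of these names are real Tahoe APIs):

```python
import time


class VersionCheckingCache:
    """Hypothetical cache: fresh entries are served directly; timed-out
    entries are revalidated with a cheap version check instead of deleted."""

    def __init__(self, fetch_version, fetch_contents, timeout=10):
        self.fetch_version = fetch_version    # uri -> sequence number (cheap)
        self.fetch_contents = fetch_contents  # uri -> contents (expensive)
        self.timeout = timeout
        self.entries = {}  # uri -> (version, contents, stored_at)

    def get(self, uri):
        now = time.time()
        entry = self.entries.get(uri)
        if entry is not None:
            version, contents, stored_at = entry
            if now - stored_at <= self.timeout:
                return contents  # fresh enough: no network round-trip at all
            current = self.fetch_version(uri)
            if current == version:
                # Cheap check says we're still current: reuse the contents.
                self.entries[uri] = (version, contents, now)
                return contents
        else:
            current = self.fetch_version(uri)
        contents = self.fetch_contents(uri)
        self.entries[uri] = (current, contents, now)
        return contents
```

As noted, for small mutable files the version check costs nearly as much as the retrieval, so the win mostly shows up for larger mutable files.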
One other thought: you could trade off a little performance for some security and, perhaps, simplicity of implementation, by caching the file shares under the SID, rather than the reassembled and decrypted file. You could use a structure similar (identical?) to that used by storage servers to store shares, and, in fact, the cache could even be a secondary source for the storage server to get shares from, and even to deliver to other peers that request them.
With that approach, retrieval of any file involves looking for shares first in the local cache and storage directories. If there's not enough local data to reconstruct the file, retrieve enough additional shares from remote peers, keeping the downloaded shares in the cache, which could use a typical LRU policy for replacement when it gets full.
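A minimal sketch of that share-level LRU cache, keyed by (storage index, share number), might look like this (the byte budget and names are illustrative assumptions, not Tahoe's actual share-store layout):

```python
from collections import OrderedDict


class ShareCache:
    """Hypothetical LRU cache of (still-encrypted) shares, bounded by a
    total byte budget; least-recently-used shares are evicted first."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.shares = OrderedDict()  # (storage_index, shnum) -> share bytes

    def put(self, storage_index, shnum, data):
        key = (storage_index, shnum)
        if key in self.shares:
            self.used -= len(self.shares.pop(key))
        self.shares[key] = data
        self.used += len(data)
        while self.used > self.max_bytes:
            _, evicted = self.shares.popitem(last=False)  # evict LRU entry
            self.used -= len(evicted)

    def get(self, storage_index, shnum):
        key = (storage_index, shnum)
        if key not in self.shares:
            return None  # download path would fall through to remote peers
        self.shares.move_to_end(key)  # mark as most recently used
        return self.shares[key]
```

Because the cached shares are still encrypted, a cache like this leaks nothing beyond what the storage servers already see, which is the security trade-off mentioned above.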
If you like this ticket, you might also like #606 (backupdb: add directory cache), #465 (add a mutable-file cache), and #300 (macfuse: need some sort of caching).
Also, I'd like to remind everyone of ticket #280. The purpose of #280 was to provide an API call specifically for caching. I believe it can be implemented with a very small change to Tahoe, no changes to the storage format, and moderate complexity in the clients.
Otherwise, my USD 0.02 on caching design is to leave it out of Tahoe proper. If the community really wants it, we can make a standard caching component that looks like the wapi on the outside, but lives separate from the main node codebase.
I prefer implementation simplicity: Minimize feature count and number of configuration states per component with high test coverage, then hook components together.