support indefinite leases with garbage collection #1832
Reference: tahoe-lafs/trac-2024-07-25#1832
LeastAuthority.com runs a storage server and we want to offer our customers an indefinite (within the scope of their business relationship with us) lease. That is: as long as they keep paying their bills (or even longer, if we choose to keep their ciphertext until they bring their account back into good standing) we will not delete ciphertext that their LAFS storage client has marked as something to keep, even if they don't successfully get their LAFS storage client to renew leases ever again.
We need this, because the current protocol offers us only two options, neither of which is acceptable:
Or:
The latter is what we currently do, but it isn't sustainable, because:
and even more difficult:
This implies that to satisfy this use case, there must be a protocol whereby the LAFS client can tell the storage server "Okay, here are some ciphertext shares which as of now I want to keep, and any other ones that you might have I hereby cease paying for, so you'd better delete them, if they exist."
Now, it would be troublesome for the LAFS client to be required to build a complete manifest of all ciphertext shares that it wants to keep and then deliver that entire manifest to the server at once. So, a better, more incremental algorithm that would satisfy this use case is like this:
The following protocol is what the LAFS client does when it wants to stop paying for everything not-reachable from a given root.
1. The client asks the server for a magic token which is something that is meaningful only to the server. The meaning of this is "Give me a special token that when I later give it back to you, you'll know you can delete everything that I didn't touch since you created this special token." As a matter of implementation, the storage server will find it convenient to use a timestamp from its own clock as the token, but in order to deter the client from comparing it to timestamps from the client's clock, it is sent as a string. Ooh, in fact, the server may even encrypt the token with a secret key known only to the server and unknown to the client, just to prevent the client from comparing its value to a value taken from the client's own clock. This way, the protocol adds absolutely no requirement for clock sync between the client's clock and the server's clock; instead this "timestamp" is derived from the server's clock, and is only ever compared to the server's clock. If the server's clock is set to 1969 and the client's clock is set to 2099, or vice versa, that's fine.
2. The client starts traversing the files from the root, and for each one (or batch of them) it sends a message to the storage server saying "Please mark these ones as to-keep." The server replies "Okay, done." (This "mark as to-keep" can be implemented as a "100 year lease" if that helps implementation.) It might make sense for the client to send the magic token with every one of these requests, so that it means "Please mark these ones as to-keep when you do the garbage-collection sweep associated with this token." Or "Please exempt these ones from a probable future garbage-collection sweep that will, when and if it comes, be associated with this token."
Note: If the client crashes or gets stopped and restarted or loses and regains connection to the storage server during this process, it can always resume at step 2, provided that it still has the "token" from step 1 written down.
Note: If the client creates new files during this process, the newly created file comes with an equivalent mark ("100 year lease"), so that the client doesn't have to worry about race conditions between its traversal for marking keepers and its addition of new files.
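To make the token/mark/sweep idea above concrete, here is a minimal in-memory sketch in Python. It is not Tahoe-LAFS code: the class `SketchStorageServer` and its methods `put_share`, `begin_sweep`, `mark_to_keep` and `finish_sweep` are hypothetical names invented for illustration. It also makes two simplifying assumptions: there is only one client (no per-account bookkeeping), and the opaque token is a random handle that the server maps to its own timestamp, rather than an encrypted timestamp as suggested above. Either way the client never sees or compares clock values.

```python
import os
import time


class SketchStorageServer:
    """Toy stand-in for a storage server (hypothetical, not the Tahoe-LAFS API)."""

    def __init__(self, clock=time.time):
        self._clock = clock      # server-side clock; the client never sees it
        self.shares = {}         # share id -> ciphertext bytes
        self._last_touched = {}  # share id -> server time of last upload/mark
        self._sweeps = {}        # opaque token -> server time the token was issued

    def put_share(self, share_id, data):
        # A newly uploaded share counts as "touched", so uploads made while
        # the client is still traversing survive the sweep.
        self.shares[share_id] = data
        self._last_touched[share_id] = self._clock()

    def begin_sweep(self):
        # Hand out a token that is meaningful only to this server.
        token = os.urandom(16).hex()
        self._sweeps[token] = self._clock()
        return token

    def mark_to_keep(self, token, share_ids):
        # Can be called any number of times with the same token, so a client
        # that crashed can resume as long as it wrote the token down.
        if token not in self._sweeps:
            raise KeyError("unknown token")
        now = self._clock()
        for share_id in share_ids:
            if share_id in self.shares:
                self._last_touched[share_id] = now

    def finish_sweep(self, token):
        # Delete every share not touched since the token was issued.
        # Only server-clock values are ever compared with each other.
        started = self._sweeps.pop(token)
        doomed = sorted(s for s, t in self._last_touched.items() if t < started)
        for share_id in doomed:
            del self.shares[share_id]
            del self._last_touched[share_id]
        return doomed
```

A client would call `begin_sweep()` once, call `mark_to_keep(token, batch)` for each batch it reaches from the root, and finally hand the token back through `finish_sweep(token)`. A real server would additionally have to track which account each mark belongs to, which this sketch leaves out.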
(discussed on the mailing list: //pipermail/tahoe-dev/2012-October/007768.html)
The mark/sweep scheme sounds reasonable to me. In addition, I as a server operator want a way to ask (with CLI tools - no web browser!) how much storage space is in use total, and how much due to leases from various entities that store data, perhaps even by age of lease. And I want to be able to delete files that are old and belong to some entity, in various combinations. I don't mean one would often want to do this; this is the equivalent of the sysadmin deleting big files to restore the system to functioning after users are warned and don't clean up.
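As a rough illustration of the kind of report asked for here, the following Python sketch sums leased bytes per owner and per lease-age bucket. Everything in it is an assumption made for the example: `LeaseRecord`, `usage_report` and the bucket boundaries are hypothetical, and no existing Tahoe-LAFS command or data structure is implied. Each record stands for one lease on one share, so a share leased by two owners is counted once for each of them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class LeaseRecord:
    owner: str       # hypothetical account label for whoever holds the lease
    size_bytes: int  # size of the leased share
    age_days: float  # how long ago the lease was created or renewed


def bucket_label(age_days, bounds):
    # Return the label of the first age bucket that contains age_days.
    for bound in bounds:
        if age_days <= bound:
            return f"<={bound}d"
    return f">{bounds[-1]}d"


def usage_report(leases, bounds=(30, 365)):
    # Sum leased bytes overall, per owner, and per (owner, age bucket).
    total = 0
    by_owner = defaultdict(int)
    by_owner_and_age = defaultdict(int)
    for lease in leases:
        total += lease.size_bytes
        by_owner[lease.owner] += lease.size_bytes
        key = (lease.owner, bucket_label(lease.age_days, bounds))
        by_owner_and_age[key] += lease.size_bytes
    return {
        "total": total,
        "by_owner": dict(by_owner),
        "by_owner_and_age": dict(by_owner_and_age),
    }
```

A command-line tool could print this dictionary as a table, and the same grouping would identify which old shares belonging to one entity are candidates for deletion.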
Replying to gdt:
gdt: the asking-about-resource-usage part would be facilitated by #1836. Would you please open a separate ticket asking for the command-line interface to query these things? Go ahead and specify exactly how it should be spelled!
Please open a separate ticket for a command-line to delete specific shares. (I guess using their verify-cap, and optionally their shnum(s) as the arguments?)
From a private mail message about my ventures into offering rented tahoe server nodes: