add share-type argument to storage server protocol #2796
I was thinking about a few things today.
So the idea I just had was to tell the server, in the upload process, what "type" of share we're uploading. If the server has knowledge of that share type, it will apply the validation rules during upload, and a periodic server-side "scrubbing" operation can pre-emptively detect failed shares. If the server does not currently know the share type, it will accept arbitrary data, but it will remember the claimed type next to the share. Later, when the server is upgraded to a version that does know about the new format, it will start to verify that data too, and if the data doesn't match the type, the share will be deleted as corrupt.
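A rough sketch of what that could look like on the server side (the names `KNOWN_FORMATS`, `accept_share`, and the in-memory stores are invented for illustration, not the existing storage-server API):

```python
# Hypothetical sketch: validate at upload time when the claimed format is
# known, otherwise accept the share and remember the claimed format so an
# upgraded server can verify it later.

KNOWN_FORMATS = {}   # format-id -> callable(share_bytes) -> bool

class CorruptShareError(Exception):
    pass

# Toy in-memory stores standing in for on-disk shares and per-share metadata.
_shares = {}           # (storage_index, shnum) -> bytes
_claimed_formats = {}  # (storage_index, shnum) -> format-id

def accept_share(storage_index, shnum, format_id, share_bytes):
    validator = KNOWN_FORMATS.get(format_id)
    if validator is not None and not validator(share_bytes):
        # known format, failed validation: reject at upload time
        raise CorruptShareError((storage_index, shnum, format_id))
    # unknown formats are accepted on faith; the claimed format is recorded
    # next to the share so a later server version can verify (or delete) it
    _claimed_formats[(storage_index, shnum)] = format_id
    _shares[(storage_index, shnum)] = share_bytes
```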
Some of our encoding schemes use this server-side verification to prevent "roadblock attacks", where someone uploads junk data to the same storage-index as your legitimate file, in order to prevent you from uploading enough shares to publish/preserve the file. To support this in the face of not-yet-understood formats, we could have a rule that clients can trigger a verify on any SI. If the second (legitimate) uploader notices a share in place which fails their download check (or which the server doesn't remember having verified since its upgrade), they can call for verification. The roadblock share will fail verification and be deleted, and then the uploader can upload their own (valid) share in its place. (Maybe the server should return the last verification time as part of the "do-you-have-block" response.)
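A sketch of that recovery flow, assuming client-triggered verification; the method names (`have_share`, `verify_share`) and response shapes are made up here, not the current remote interface:

```python
import time

class ShareRecord:
    def __init__(self, data, claimed_format):
        self.data = data
        self.claimed_format = claimed_format
        self.last_verified = None        # None: never verified since upgrade

class VerifyingServer:
    def __init__(self, validators):
        self.validators = validators     # format-id -> validator callable
        self.shares = {}                 # (SI, shnum) -> ShareRecord

    def have_share(self, si, shnum):
        rec = self.shares.get((si, shnum))
        if rec is None:
            return {"present": False}
        # returning the last verification time lets the uploader decide
        # whether to ask for a fresh verification of a suspect share
        return {"present": True, "last_verified": rec.last_verified}

    def verify_share(self, si, shnum):
        rec = self.shares[(si, shnum)]
        validator = self.validators.get(rec.claimed_format)
        if validator is None:
            return "unknown-format"      # cannot judge yet; keep the share
        if validator(rec.data):
            rec.last_verified = time.time()
            return "valid"
        del self.shares[(si, shnum)]     # roadblock share: delete it
        return "deleted-as-corrupt"
```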
It might also work to just scope the SI to the type, or treat SIs uploaded to known types as being independent of those uploaded to unknown types (even if the server was upgraded later). But I think this would cause some shares to effectively disappear upon server upgrade, which wouldn't be good.
We could declare that the existing share-upload API methods use "format 1 mutable" or "format 1 immutable", and add new upload API methods which include the type/format identifier along with the will-you-hold-my-data request. We could advertise format understanding in the server's version dictionary (I'm not sure we'd want clients to behave differently based on that information, but maybe they could prefer to upload shares to servers that do understand the format, over ones that don't).
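For example, the version dictionary might grow a key listing the formats the server can verify, and clients could sort servers accordingly (the `understood-share-formats` key and the helper below are hypothetical):

```python
def get_version(understood_formats):
    # sketch of an extended version dict; only the new key is invented
    return {
        "http://allmydata.org/tahoe/protocols/storage/v1": {
            "maximum-immutable-share-size": 2**32,
            # hypothetical new key: formats this server knows how to verify
            "understood-share-formats": sorted(understood_formats),
        },
        "application-version": "tahoe-lafs/1.x",
    }

def prefer_understanding_servers(servers, format_id):
    """Order servers so that ones which understand format_id come first."""
    understands, blind = [], []
    for s in servers:
        v = s.get_version()["http://allmydata.org/tahoe/protocols/storage/v1"]
        (understands if format_id in v.get("understood-share-formats", ())
         else blind).append(s)
    return understands + blind
```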
On the server side, we could either store the format type in the share file itself (I think we have some magic numbers there already), or use separate top-level directories for them, or store the information in the leasedb (and regenerate it by looking for magic numbers in the share). The server is going to want to have a new database ("scrubdb"? "sharedb"?) that tracks when a share was last verified, which might be a good place for remembering their formats too.
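A minimal sketch of what such a scrubdb/sharedb could record, using SQLite; the table and column names are placeholders:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS shares (
    storage_index  TEXT NOT NULL,
    shnum          INTEGER NOT NULL,
    claimed_format TEXT NOT NULL,
    last_verified  REAL,             -- NULL: not verified since upgrade
    PRIMARY KEY (storage_index, shnum)
);
"""

def open_scrubdb(path):
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db

def record_share(db, si, shnum, claimed_format):
    db.execute("INSERT OR REPLACE INTO shares VALUES (?, ?, ?, NULL)",
               (si, shnum, claimed_format))
    db.commit()

def mark_verified(db, si, shnum, when):
    db.execute("UPDATE shares SET last_verified=? "
               "WHERE storage_index=? AND shnum=?",
               (when, si, shnum))
    db.commit()
```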
We might also generate/regenerate format identifiers by just trying to validate the share against all known formats: if the share validates against one of them, it must be of that format. That might let us use the scrubdb as the canonical source of format information, but build the DB from raw share files for the initial transition.
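That reconstruction step could be as simple as the following, assuming a registry of per-format validators like the one sketched earlier:

```python
def detect_format(share_bytes, validators):
    """Return the first format-id whose validator accepts the share, else None."""
    for format_id, validator in validators.items():
        if validator(share_bytes):
            return format_id
    return None   # unrecognized: leave it alone for a future, smarter server
```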
If we deployed the scrubdb before deploying any new encoding formats, the initial transition could just say "all present shares must be format 1 mutable/immutable", and we wouldn't need the actual initial verification step.