make the storage index be the verifier cap #654
As I discuss in this mailing list post, we could use the verifycap for the purpose of a storage index.
The big advantage of this to me is reducing the number of concepts by one. This would prevent, for example, misunderstandings such as Shawn Willden's misapprehension about overwriting shares (to which my letter is a response).
Another advantage would be that the storage server (as well as anyone else) could verify that a share is a valid share for its storage index. This would neatly settle questions about the correctness of storage indices.
This might also allow greater similarity between the immutable and mutable share storage protocols, if both of them used the verify cap as the storage index. The mutable case has much worse issues about security and consistency, of course, and my current assumption is that it, too, could be strengthened and simplified by requiring the storage server to verify the correctness of each share. (Although simpler from some perspectives, this would actually be more complicated for the storage servers because they would have to understand enough about the share layout to verify the correctness. Also it would cost quite a bit of CPU to perform the digital signature checks on the mutable shares.)
I vaguely recall that Brian pointed out some significant added problems or issues with this approach, so hopefully he'll follow up on the list or this ticket and remind me what they were.
In general, I like the idea. I'll have to think about it some more; maybe my
notes have some details of the concerns I had.
The general categories of concerns were:

* how using the verifycap as the storage index would affect the storage-server's overhead/indexing system (SIs are currently 16 bytes / 128 bits, since they're derived from a 128-bit AES key)
* whether the verifycap would replace/equal the UEB
* at what point in the upload operation the index becomes known, and how the servers would validate the shares filed under it
I'd love a scheme that allowed the servers to validate their own shares but
which didn't obligate them to do so right away.
For our DSA mutable file design, if we had intermediate keys, I think we were
able to make the storage index do exactly what we wanted: every bit of the
storage index can be used to validate the key, so the server can validate the
whole share (and check its signature) all by itself. This enables several
things: write-enabler-less publishing (the server accepts a write iff the
signature is good and the seqnum is higher than the old share's), local
background Verifier passes (to detect disk errors), and buddy-verification
(servers find other servers and check each other's shares). If we can't have
intermediate keys, I think we have a design that will still allow some
portion of the storage index to be used for this purpose, but not the whole
thing.
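Here's a minimal sketch of that write-acceptance rule, assuming a pubkey-derived storage index and a pluggable signature check; the class and function names are illustrative, not Tahoe's real API:

```python
# Sketch of write-enabler-less publishing: the server accepts a new mutable
# share iff (1) the storage index matches the share's pubkey, (2) the
# signature over the share body verifies, and (3) the seqnum is higher than
# whatever it already holds. All names here are hypothetical.
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class MutableShare:
    pubkey: bytes      # serialized verification key
    seqnum: int        # monotonically increasing version counter
    body: bytes        # signed share contents
    signature: bytes   # signature by the corresponding signing key

def storage_index_for(pubkey: bytes) -> bytes:
    # every bit of the SI is derived from the pubkey, so the server can
    # check that a share actually belongs at this SI
    return hashlib.sha256(pubkey).digest()[:16]

def maybe_accept_write(store: Dict[bytes, MutableShare],
                       si: bytes,
                       new: MutableShare,
                       verify: Callable[[bytes, bytes, bytes], bool]) -> bool:
    if storage_index_for(new.pubkey) != si:
        return False                      # share doesn't belong at this SI
    if not verify(new.pubkey, new.body, new.signature):
        return False                      # bad signature
    old: Optional[MutableShare] = store.get(si)
    if old is not None and new.seqnum <= old.seqnum:
        return False                      # stale write: keep the newer share
    store[si] = new                       # no write-enabler needed
    return True
```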
(I think we could even find a way to switch to pubkey-based-SI for our
existing RSA-based mutable files and still provide backwards compatibility:
basically have the server keep a table which maps from pubkey-hash to SI, and
add an API that looks for shares by pubkey-hash instead of by SI).
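A sketch of that compatibility shim, assuming the table is just an in-memory dict (a real server would persist it); the method names are made up:

```python
# Hypothetical backwards-compatibility shim: existing RSA-based mutable
# shares stay filed under their legacy SI, but the server also maintains a
# pubkey-hash -> SI table so clients can look shares up either way.
from typing import Dict, Optional

class ShareLookup:
    def __init__(self) -> None:
        self._by_si: Dict[bytes, bytes] = {}            # SI -> share data
        self._si_by_pubkey_hash: Dict[bytes, bytes] = {}

    def register(self, si: bytes, pubkey_hash: bytes, share: bytes) -> None:
        self._by_si[si] = share
        self._si_by_pubkey_hash[pubkey_hash] = si

    def get_by_si(self, si: bytes) -> Optional[bytes]:
        return self._by_si.get(si)

    def get_by_pubkey_hash(self, pubkey_hash: bytes) -> Optional[bytes]:
        # the new API: find shares by pubkey-hash instead of by SI
        si = self._si_by_pubkey_hash.get(pubkey_hash)
        return None if si is None else self._by_si.get(si)
```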
For immutable files, I think the prognosis was less cheerful. The storage
index (as it stands today) is used for two purposes: peer selection and share
indexing. The random distribution of the SI (since it is derived by hashing
the writekey) plus the hash-driven permutation of our peer-selection
algorithm gives us load-balancing, or as Zandr likes to put it,
"cryptographically strong load balancing". (note that this is balancing the
inlet rate: the amount of data that is given to each server per unit time..
this may not quite be what you want, since you might want your large servers
to fill at a proportionally-faster rate than your smaller servers). Once you
know which servers to talk to first, the SI is used to reference a specific
share on those servers.
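For concreteness, the hash-driven permutation amounts to something like this sketch (the tagging and hash details differ from the real implementation):

```python
# Each server's position on the ring is determined by hashing the storage
# index together with its serverid, giving a deterministic but per-file
# server ordering: the "cryptographically strong load balancing" above.
import hashlib
from typing import List

def permuted_server_list(storage_index: bytes,
                         serverids: List[bytes]) -> List[bytes]:
    return sorted(serverids,
                  key=lambda sid: hashlib.sha256(storage_index + sid).digest())
```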
The problem was that any integrity information we get out of an immutable
file won't be known to us until we've finished encoding the file. So we can't
use it for peer-selection, since (to avoid buffering the entire file locally)
we must perform peer-selection before encoding. (We're already considering
switching from CHK to random keys to avoid the streaming-unfriendly
hash-the-content-to-get-the-key-and-storage-index pass.)
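The two key-derivation options look roughly like this sketch, with an invented tag string (Tahoe's real CHK derivation is tagged and parameterized differently):

```python
# CHK needs a full pass over the content before upload can begin, while a
# random key is streaming-friendly and available immediately.
import hashlib
import os

def chk_readkey(convergence_secret: bytes, content: bytes) -> bytes:
    # content-hash-keyed: requires hashing the whole file up front
    return hashlib.sha256(b"chk:" + convergence_secret + content).digest()[:16]

def random_readkey() -> bytes:
    # random-keyed: no extra pass, but no convergence either
    return os.urandom(16)
```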
Now, there are good arguments to allow alternative peer-selection schemes (as
we've discussed in section 3 of source:docs/specifications/outline.txt), but
there are many desirable properties to the approach we're using now. Most
peer-selection schemes that will work for a large number of servers (where
"work" means the downloader usually doesn't have to ask every single server
whether they have a share or not, and hopefully can find enough shares in a
minimum number of roundtrips) require some sort of peer-selection-index to be
encoded in the filecap.
One approach we discussed was a split index, in which the filecap has two
index fields: one for peer-selection, and a second for share-on-peer. The
peer-selection part could be randomly generated: it doesn't need to be
cryptographically secure, merely long enough to give us good load-balancing
properties; 10 or 20 bits would probably be enough. (And note that it
wouldn't necessarily have to involve a permuted list: we could pick a random
10-bit starting point on the ring, then select servers in strict clockwise
nodeid order, or something). The share-on-peer part (which is what the server
thinks of as a storage index) could be determined after the encoding process,
and told to the server in the final close() message that commits the finished
share. This would involve a storage server protocol which has some sort of
temporary upload handle (so subsequent messages could refer to the
previously-uploaded partial share fragments with something other than the
final storage index), but that's not hard to build, and might give us some
useful resume-interrupted-upload properties too.
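The upload-handle protocol could have roughly this shape; none of these method names are Tahoe's real remote API, they just illustrate the message flow:

```python
# The client allocates space under a temporary upload-id, streams share data
# under that id, and only supplies the final storage index in the close()
# message that commits the finished share.
from typing import Dict, List

class UploadSession:
    def __init__(self) -> None:
        self.partial: Dict[int, bytearray] = {}   # sharenum -> bytes so far

class StorageServerSketch:
    def __init__(self) -> None:
        self._sessions: Dict[bytes, UploadSession] = {}
        self._shares: Dict[bytes, Dict[int, bytes]] = {}  # SI -> sharenum -> data

    def allocate(self, upload_id: bytes, sharenums: List[int], size: int) -> None:
        # size is accepted but not enforced in this sketch
        sess = self._sessions.setdefault(upload_id, UploadSession())
        for n in sharenums:
            sess.partial.setdefault(n, bytearray())

    def write(self, upload_id: bytes, sharenum: int, data: bytes) -> None:
        # subsequent messages refer to the partial shares by upload-id,
        # since the final storage index isn't known yet
        self._sessions[upload_id].partial[sharenum].extend(data)

    def close(self, upload_id: bytes, storage_index: bytes) -> None:
        # the storage index arrives only now, committing the shares; an
        # interrupted upload could be resumed by re-presenting upload_id
        sess = self._sessions.pop(upload_id)
        self._shares[storage_index] = {n: bytes(b)
                                       for n, b in sess.partial.items()}
```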
Hmm, here's a scheme that might work: make the peer-selection-index be a hash
of the readkey, and the share-on-peer index be the UEB hash. This would let
us perform peer selection as quickly as we do now (i.e. one pass for CHK, or
zero passes for random-key). Filecaps would remain the same length (although
they'd need a new prefix, of course). Verify caps would be just a prefix plus
the UEB hash (and k+N+size, probably).
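In code, the proposed derivation would look something like this sketch (the tag string and the "URI:VERIFY-NEW" prefix are invented placeholders):

```python
# The peer-selection-index comes from the readkey (known before encoding),
# while the share-on-peer index is the UEB hash (known only afterwards).
import hashlib

def peer_selection_index(readkey: bytes) -> bytes:
    # available up front, so peer selection needs no extra pass
    return hashlib.sha256(b"peer-selection-index:" + readkey).digest()[:16]

def storage_index(ueb_hash: bytes) -> bytes:
    # available only after encoding; doubles as integrity information
    return ueb_hash

def verify_cap(ueb_hash: bytes, k: int, n: int, size: int) -> str:
    # a verify cap would be just a prefix plus the UEB hash (and k+N+size)
    return "URI:VERIFY-NEW:%s:%d:%d:%d" % (ueb_hash.hex(), k, n, size)
```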
The earlier scheme I was thinking of (in which the filecap would need an
extra field for the peer-selection index) had some downsides, but I don't yet
see any in this scheme. The only one I can think of is that it would obligate
us to have some portion of the filecap (the readkey) which can't be used for
integrity checking (since it needs to be generated before we've encoded the
file), but we're already obligated to have that (the readkey is needed to
encode the file in the first place).
Well, and we lose a little bit of convergence: if I upload the same file
(using CHK) that someone in my convergence domain already uploaded before,
I'll wind up with the same filecap (and therefore peer-selection-index and
storage-index) as they did, but I won't be able to learn of that fact (and thus
avoid doing the duplicate upload) until the end of encoding, when the
UEBhash/storage-index is generated. I guess this means we should record a
full-cryptographic-length peer-selection-index with each share, maintain a
table that maps from peer-selection-index to UEBhash/storage-index, and have
the servers do a lookup at allocate() time. If allocate() tells us that they
already have a share for that peer-selection-index, we can ask it for the
UEBhash, then download enough data to compare the file we're thinking about
uploading against the shares that are claimed, and see if the upload can be
skipped. Hm, it might also work to use this peer-selection-index as the
"upload-id", for resuming an upload later.
So, a summary of how we could implement this for immutable files:

* create the readkey as usual (CHK or random), and use hash(readkey) to find storage servers who will accept a share of the necessary size (but don't tell the servers the final storage index yet)
* servers may respond to allocate() with news that they already have shares for that index, and will return the UEBhash/storage-index for those shares
* if so, the client downloads enough data to decide whether its upload can be skipped, i.e. whether its own UEB hash would come out equal to the UEB hash the servers report
* otherwise, upload shares to the selected server list
* at close(), tell each server the final storage-index == UEBhash
* so filecaps use hash(readkey) as peer-selection-index and UEB hash as storage-index (instead of using hash(readkey) for both)
* each server records the peer-selection-index next to the storage-index, maintains a table that maps from that to storage-index.
* the server uses the final storage-index to file the finished share
* interrupted uploads can be resumed by presenting the peer-selection-index/upload-id
* new shares can be self-validated by the server, as sketched below: the storage index (i.e. bucketdir name) should be the same as the embedded UEB hash. Older shares cannot be validated this way (the contents can be validated against the embedded UEB hash, but that hash cannot be checked against anything).
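The self-validation check this enables is essentially one comparison; here is a sketch in which embedded_ueb_hash() stands in for real share-container parsing:

```python
# With the new scheme a server can confirm, all by itself, that each bucket
# directory's name (the storage index) equals the UEB hash embedded in the
# share. This is what a local background Verifier pass would run.
import hashlib

def embedded_ueb_hash(share: bytes) -> bytes:
    # placeholder: a real implementation would parse the share container
    # and hash the URI extension block it contains
    return hashlib.sha256(share).digest()

def share_is_filed_correctly(bucketdir_name: bytes, share: bytes) -> bool:
    return bucketdir_name == embedded_ueb_hash(share)
```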
Note that the peer-selection-index table offers an attacker some games: they could upload a share for file A and pretend that it has the peer-selection-index for file B, with the goal of disrupting someone who is trying to upload file B (the uploader is incorrectly told that the server already has a share for that file, and must download the entire share to discover otherwise). I suspect that this is not a very large problem, though.
Also, we might want a new server-side share-file format, to record the peer-selection-index in the bucket label (the same place that holds the leases). This would be used to rebuild the table from the sharefiles, since we consider the sharefiles to be canonical and all other tables to be caches or performance-improving indices. The peer-selection-index would not be verified like the rest of the share (making it even more appropriate to put it on the outside of the container rather than the inside).
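A sketch of what such a label might pack, with invented field sizes (a full-length peer-selection-index stored, unverified, outside the share's integrity envelope):

```python
# Record the (unverified) peer-selection-index in the sharefile's label
# area, next to the leases, so the psi -> storage-index table can be rebuilt
# from the canonical sharefiles. Field sizes here are illustrative only.
import struct

LABEL_FORMAT = ">32s32s"   # 32-byte PSI + 32-byte storage index (invented)

def pack_label(psi: bytes, storage_index: bytes) -> bytes:
    return struct.pack(LABEL_FORMAT, psi, storage_index)

def unpack_label(blob: bytes):
    psi, si = struct.unpack(LABEL_FORMAT, blob)
    return psi, si
```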
Ooh, here's a blast from the past. I just noticed that ticket #5 was "verifierid as storage index: not the whole story". :-) It was closed as fixed on 2007-09-25.