use longer storage index / cap for collision resistance #753
Labels
No Label
0.2.0
0.3.0
0.4.0
0.5.0
0.5.1
0.6.0
0.6.1
0.7.0
0.8.0
0.9.0
1.0.0
1.1.0
1.10.0
1.10.1
1.10.2
1.10a2
1.11.0
1.12.0
1.12.1
1.13.0
1.14.0
1.15.0
1.15.1
1.2.0
1.3.0
1.4.1
1.5.0
1.6.0
1.6.1
1.7.0
1.7.1
1.7β
1.8.0
1.8.1
1.8.2
1.8.3
1.8β
1.9.0
1.9.0-s3branch
1.9.0a1
1.9.0a2
1.9.0b1
1.9.1
1.9.2
1.9.2a1
LeastAuthority.com automation
blocker
cannot reproduce
cloud-branch
code
code-dirnodes
code-encoding
code-frontend
code-frontend-cli
code-frontend-ftp-sftp
code-frontend-magic-folder
code-frontend-web
code-mutable
code-network
code-nodeadmin
code-peerselection
code-storage
contrib
critical
defect
dev-infrastructure
documentation
duplicate
enhancement
fixed
invalid
major
minor
n/a
normal
operational
packaging
somebody else's problem
supercritical
task
trivial
unknown
was already fixed
website
wontfix
worksforme
No Milestone
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Reference: tahoe-lafs/trac-2024-07-25#753
Loading…
Reference in New Issue
No description provided.
Delete Branch "%!s(<nil>)"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
As we work on new encoding schemes (like DSA-based mutable files), I'm
thinking that we want to put a lower bound on the cap/SI length to maintain
reasonable margin against collisions. 256 bits would be more than enough. 128
is ok, but a bit tight. 92 bits would make me nervous.
robk's friend Sylvan expressed concerns about Tahoe (and mnet/mojonation
before it) because, for something that is meant as a backup system, even the
slightest possibility of the CHK-based indexing scheme mapping two documents
to the same storage index was too high for him. (I believe he would be
more satisfied with a scheme that used centrally-allocated guaranteed-unique
storage indices, which we could do but would require more coordination and
longer caps, since we could no longer use a randomly-generated readkey to
derive the storage index. In exchange for a controllable but non-zero
probability of collision, we get to avoid central coordination and use
smaller caps).
The specific places where collisions could occur are:
The "birthday paradox" determines the chance of collision. If I'm doing my
math right, if you want less than '
p
' chance of getting any collisionswhen selecting items out of a namespace of size '
N
', then you can'tselect more than
C = sqrt(2*N*p)
items. This is called a "paradox"(where "surprise" would be a better term) because that square root causes C
to be surprisingly low: for birthdays (in which N=365), p=0.5 leads to C=19.
In the Tahoe context,
C
is the number of files you can add to the grid.In the current case, our 128-bit storage index (N=2^128^) means that p=0.5
gets us a nice large 2^64^ number of files, except that p=0.5 is insufficient
margin: we'd much prefer a vanishingly small chance of collision, like
p=2^-64^. Fortunately we get two bits of margin for every one bit we reduce
from C. The table looks like:
Note that our
N
is the minimum of the storage-index size and thetop-most cap value (i.e. the readkey for immutable files, or the writekey for
mutable files). So a DSA-based mutable file with a 92-bit writecap gives us
an
N
of 2^92^, even if it is expanded into a storage-index of 128 or256 bits.
Also note that the allmydata.com grid currently has something like 10M
objects in it, about C=2^23^.
So, I'm thinking that as much as a nice short 96-bit DSA mutable writecap
would be nice, it's too short to provide enough collision margin. I want to
be able to put trillions of files into a grid, and I want a the chance of
collision to be so small that I don't ever need to worry about it, and 96
bits isn't really there. 128 bits is probably good enough, but doesn't have
enough margin to be obviously and unquestionably safe (C=2^32^ is a lot of
files but you can imagine people wanting more, p=2^-64^ is a tiny probability
but you can imagine people wanting a bit better). 256 would be plenty (but of
course I want my filecaps to be shorter than that).
Thanks for this analysis. I like your comments at the end that we want a bit of "overkill" in number of files and chance of collision. People who don't want to rely on the collision-resistance of secure hash functions at all are probably not part of our market, but people who are willing to do so in principle, but would feel better with a nice fat margin of safety are definitely in our market.
Note that if you generate a new write cap (private key) and then check and it turns out that the same write cap has been generated by someone else then now you have gained the ability to write to their mutable file or directory! That's why I have been thinking that 96-bits was too few for write caps. Originally I had been thinking something like "It would probably not be worthwhile for any attacker to spend 2^96^ computer power trying to forge a write cap.". But this way of thinking discounts two important factors: chance of success and number of targets. If there are 2^40^ writecaps in use, then if an attacker does a mere 2^36^ work (excluding the cost of checking whether each writecap that they generate is already out there), then they have a 2^-20^ chance of forging someone's writecap. (Thanks to Daniel J. Bernstein's papers and mailing list postings for helping me understand this important point. http://cr.yp.to/snuffle/bruteforce-20050425.pdf )
However, a storage-index collision doesn't sound nearly as serious to me. No integrity or confidentiality is necessarily lost due to storage-index collision, right? Well, it could complicate the question of "which servers are handling which shares of this mutable file" -- an issue that is already not well managed.
Anyway, nowadays I think of storage-indexes as a separate layer built on top of the crypto layer. People can define their storage indexes as secure hashes of some pieces of the capabilities if they want, or choose random storage indices, or hierarchical ones based on the DNS names, or just not have storage indices at all and require every downloader to query every server. It shouldn't impact the security of the crypto layer, if the crypto layer includes integrity checking using the verifycap itself on the storage-server side.
I think we should write a document called something like "crypto failure modes (What could possibly go wrong?)" that explains what the consequences are of each different type of failure. (As requested by Nathan: http://allmydata.org/pipermail/tahoe-dev/2009-April/001661.html .)
The one discussed above is caused by "two people choose the same write cap (signing key) seed (possibly due to malice)". That one leads to an integrity failure, where one of the people thinks that they are the only one with a write-cap to a specific file or directory, but actually someone else also has the same write-cap.
So I think that is the worst one (because I value integrity highly). Another similar integrity failure could come about from a failure of the digital signature algorithm -- i.e. if someone were able to forge digital signatures even without the writecap. (Note that a collision in the hash function used by the digital signature algorithm could cause this. People who don't want to rely on collision-resistance of secure hash functions at all can't even rely on rsa, dsa, ssh, ssl, or gpg, although I should hasten to add that those algorithms typically include some randomness in the input to their secure hash function, to make it that much harder for attackers to cause collisions.)
After those integrity failures, there are confidentiality failures. The obvious one is someone being able to crack AES-128-CTR without having the readkey. Another one is if the content-hash-key encryption where to generate the same readkey for two different immutable files. I suppose that's another reason why using an
added convergence secret
is safer (although I hasten to add that I see no reason to distrust the collision-resistance of SHA-256d-truncated-to-128-bits at the present).Note that of course in practice the dangers from bugs and from operator error (i.e. misplacing your keys) are a lot greater than these algorithmic crypto risks. So much greater that the former are pretty much guaranteed to happen and the latter will probably never. Nonetheless, I value getting the crypto part right so that it is secure and also so that everyone who is willing to rely on crypto in principle is willing to rely on our crypto, so thanks for you help with this.
P.S. Oh, what I wrote about people choosing whatever storage index they want is possibly confused because I'm conflating two ideas into the word "storage index". Let's talk about those two uses as "server selector" and "file identifier". The former is how you decide which servers you're going to use (either when uploading a file for the first time or when using a cap to find a file that's already up there). The latter is how you tell a storage server which of the shares that it has are the shares that you are currently interested in. I currently think that the verify cap is a good thing to use for the latter role, but perhaps not for the former.
as before, I think I'd like to continue using "storage index" for what you're
calling the "file identifier", but yeah split out "server selector" or
"peer-selection index" or some similar term for the purpose of determining
which servers you're going to be talking to. One way of describing this would
be "we used to use the storage-index as the peer-selection index, but these
days we put two separate values in the filecap".
I am also starting to think of these as separate concepts, but remember that
we've yet to actually implement such a split.
Sylvan's concern was about availability: he considered a backup system to be
broken if its design has a built-in probability of file unrecoverability.
It's easier to see the problem if we set the encryption-key and hash lengths
to infinity, but restrict the storage index to say four bits. Then upload two
files, and try to download one of them.. you've got a 1/16 chance of getting
a download failure because your two files had the same storage-index, you
downloaded the wrong bits, and now they won't pass the integrity check.
Also, when we talk about this, we should be careful to distinguish between
the failure modes of mutable files versus immutable files.. they're very
distinct. And, collisions at different levels have very different
consequences: if the storage index is too small, we'll get availability
failures; if the immutable encryption key or mutable writekey is too small,
we'll get confidentiality failures. I've been assuming that we'll keep the
security parameters sufficiently large.. this ticket was specifically about
the availability concerns resulting from a too-small storage index.
If we compress the filecap by deriving the storage-index from the writekey,
then clearly we're limited by
min(len(writekey),len(storage-index))
.Mostly I ticketed this issue because it's something I want to keep in mind as
we design the next revision of the filecap format. If we don't already have a
wiki page for it, I'll add one to organize our ideas.. I think they're
currently spread across half a dozen tickets.
I updated the table in the description: I think 192-bit caps will let us have
an effectively-infinite number of files (2^64^) with an effectively-zero
chance of collision (2^-65^).
[to fix trac markup]edited
Wait, if you you have 4-bit storage indexes, and servers verify integrity of shares, then you don't get an unrecoverable file, you get a failure to upload the second one, right? I really like the idea of servers verifying the integrity of shares, because without that then even if you have a nice fat storage index, the first server that you upload to could quickly turn around and upload garbage to all the other servers, under the same storage index that you just told him.
A failure to upload (backup) is a much better problem to have than a failure to download (recover), as long as it can be detected, understood, and fixed by the operators. I.e. "Oh, we need to add servers", or "Oh, we need to start using longer storage indexes". :-)
Well, "It Depends". Current servers do no checking of shares.
(I agree, I like server-side share-verification, except for the load and
complexity it adds to the storage servers)
Immutable Files
Current code has uploaders quietly+simplisticly believing servers who say "I
already have a share by that SI", without additional checking. (they don't
search very far for existing shares, but if the server list is the same as it
was for the first file, they'll get complete overlap). This results in the
second file appearing to upload correctly but being completely unrecoverable
(all shares for that SI are for the first file, so a download of the second
file will get shares that fail the hash checks and fail).
The uploader that we haven't written yet will probably search further for
existing shares (especially once it's seen evidence of one), and will do some
amount of verification of those shares (at the very least pulling down the
UEB to make sure it's for the same file). This uploader will see evidence of
an SI collision and probably fail the upload (with an exception that
basically says "please pick a different SI, perhaps a random one"). Unless,
of course, you're unlucky and wind up talking to a different servers during
the two uploads. The subsequent downloader will either succeed but notice a
lot of "corrupted" shares (if they can talk to the second uploader's servers)
or fail due to a lot of corrupted shares (if they can't).
If we further improve the servers to verify the integrity of immutable shares
upon receipt, and we change the protocol to split server-selection-index from
storage-index, and assume that the SI is required to be derived from the
verifycap (so that servers can validate the share all the way up to the SI),
then we're still in the same boat as the previous paragraph: servers who
don't yet have a share will accept either the first upload or the second
(since we're assuming a too-short SI, and these two files are colliding), and
if the second upload succeeded due to disjoint peersets, then second-file
downloaders will succeed or fail depending upon which servers they're able to
talk to.
Mutable Files
We get the same issues here: if the storage-index is too small, then there
will be multiple RSA keys which map to the same SI. A validating server which
has accepted a share for key1 will reject an update for key2 (because it's
looking at the full pubkey inside the share), but other servers might accept
key2 (if they don't already have a share).
Tagging issues relevant to new cap protocol design.
fixed superscripts
Use trac markup for superscripts instead of Unicode; it's usually rendered much more clearly, and doesn't depend on font support.