Add expanded motivation for GBS #1011
Reference: tahoe-lafs/tahoe-lafs#1011
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3645
Coverage remained the same at 94.575% when pulling 37fe06da76 on LeastAuthority:3645.gbs-expanded-motivation into cf04f277db on tahoe-lafs:master.
Couple corrections / updates inline.
"on in" -> "only in"?
Is "speed" really the concern here? (Like, is it really slower than JSON or something? Also does that matter?) Seems like real concern is more "it's big and flexible and thus complex for developers, but Tahoe doesn't need that"
I think "state" and "transfer" should be capitalized too
@ -16,0 +67,4 @@
An HTTP-based protocol can make use of TLS in largely the same way to provide the same properties.
Provision of these properties *is* dependant on implementers following Great Black Swamp's rules for x509 certificate validation
(rather than the standard "web" rules for validation).
I think it's important to mention that this comes with a cost: the implementer has to be careful that the same security properties we rely on for Tahoe are also implemented correctly (i.e. GBS spec has to be followed carefully).
Added some more justification for this conclusion (but without concrete benchmarks I'm not going to claim too much confidence/authority here... If you're still skeptical we can drop this part).
Added some text below to try to make this clear
I'm not saying you're wrong .. but it sounds like a pretty definitive conclusion. Perhaps just saying "We believe an HTTP-based protocol ... faster"?
(e.g. I agree that Foolscap "encourages" a lot more round-trips and doesn't have promise-pipelining, but does GBS definitely have fewer round-trips than the Foolscap storage protocol?)
Also, do we even care? If our HTTP implementation turned out to be slower, would we go back to Foolscap? (I think: "no").
Just trying to be wary here of making extra claims / promises. Cap'n Proto does promise-pipelining and "zero cost" parsing; should that be in contention instead of HTTP? (I think the answer is "no": we care about simplicity and wide support).
That seems like a reasonable hedge, given the lack of evidence. :)
I think we care even if making things faster isn't the primary objective.
On the server-side, I think the performance benefits are real because a 5% reduction in CPU overhead (say, from serialization) translates directly into a 5% increase in transfer capacity from the same resources.
I mean "care" in the context of GBS: I'm not saying performance-benefits are bad .. I'm saying that I think we should prefer a simpler, easier-to-understand implementation even if it doesn't have performance benefits. That is, while "it's faster" would certainly be a cool side-effect I'd take slower too (if that meant easier to understand/implement/interop-with).
Maybe even this little thread shows that we can get side-tracked about performance in this regard ;)
Some comments inline. Hopefully they'll make sense in the morning too. :-)
Nitpicking here, but... is Tahoe the right project name to use in this context?
The legend as I know it is that "Tahoe" in "Tahoe-LAFS" stands for the Python implementation of the LAFS. I have also heard rumors to the effect that GBS has ambitions of not being limited to just Python. :-)
@ -16,0 +65,4 @@
The Foolscap-based protocol provides *some* of Tahoe-LAFS's confidentiality, integrity, and authentication properties by leveraging TLS.
An HTTP-based protocol can make use of TLS in largely the same way to provide the same properties.
Provision of these properties *is* dependant on implementers following Great Black Swamp's rules for x509 certificate validation
GBS' x509 validation rules are essentially from RFC 7469, right? I think it would be helpful to spell that out in this document.
Some more thoughts occurred to me while reading this:
I need to read the RFC in detail, but on quick reading it seems that what it proposes is a trust-on-first-use scheme. In the RFC, I could not find sections on certificate expiry and revocation. Perhaps the discussion in this proposal's footnote about validity periods and stolen keys is worth more than a footnote?
Do we know about any other real-world usage of RFC 7469? What would be the impact of adopting this proposal on usability, if any?
I remember reading this in an old post in tahoe-dev:
I'm guessing that we do not want to regress from there, and that is why we adopt a CA-less scheme. If maintaining that status is a goal, it would be helpful to spell that out too.
Maybe I need to re-read this after sleeping, but I hope it's not TOFU (trust on first use) because that's NOT what Foolscap relies on. Foolscap does use self-signed TLS certificates, but just for the "transport security" part; the tub-id is what assures you you've gotten the right host (not certificate authorities or hoping you get the right thing on your first try). I believe the only change here in GBS is that we depend on a hash of the public-key (not a hash of the entire cert as in Foolscap) so you can still use a self-signed certificate without any issues. The GBS change is that you can create an entirely new certificate whenever you like so long as you retain the private key matching the public-key in your certificate.
Maybe we need to change the line "It will perform the usual cryptographic verification..." to be more clear that this is JUST checks as to the semantic validity of the certificate, nothing to do with CA signatures etc etc. (Or, if I am wrong here, perhaps I need to read more ;) )
Well, I am still yet to read RFC 7469, much less understand it, but it says this in its introduction:
I did a bit of other reading. Per the Wikipedia article on HTTP key pinning, HPKP's usage has been in decline. Chrome and Firefox both added support for HPKP at some point, and then removed it at a later point.
This article on the Qualys blog (2016 vintage) makes the claim that HPKP may be "too difficult and too dangerous to use, and that it won’t go anywhere unless we fix it."
Another article on Hashed Out blog (2017) states that "HPKP is too complicated, risky, and quite frankly, unnecessary for the internet-at-large."
I don't know if any of that is actually true, or if any of that makes HPKP unsuitable for GBS. Regardless, I feel like handling keys would be the big complicated part, and so would be useful to discuss these considerations too.
Unless, of course, I have misunderstood the plan. Which is quite possible. :-)
My understanding is this (and @exarkun should chime in if I got this wrong, please!):
So, Alice can still operate a storage-server without using any certificate-authority (including Let's Encrypt). All she needs to do is keep her private key safe. She can even re-generate her certificate (e.g. before it expires) without doing a new Introducer announcement. If she loses control of her private key, her TubID (and hence fURL) will change and she needs to do a new announcement (effectively changing the identity of her storage server).
Ultimately, yes, keeping the private key safe is "the hard part" here. There's no getting around that. I don't think anything here prevents an implementation from supporting hardware methods of storing keys, though (e.g. a device that supports OpenPGP like a YubiKey5 or NitroKey2 could be told to sign the certificate).
This is nearly the same as what (I understand) Foolscap does, except Foolscap hashes the entire certificate into a TubID (so you can't change the certificate, ever). I don't believe Foolscap looks at the dates in the certificate, though .. so we could maybe drop that part? An operator could still effectively "opt out" of the date-checking by making it 100 years in the future or so...
So we are using TLS here for transport security. We are assured it is the correct server by the TubID-checking (instead of what TLS does, which is to use certificate authorities etc etc).
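To make the difference between the two identity schemes concrete, here is a minimal sketch, not the GBS or Foolscap implementation. `ToyCertificate`, its fields, and its fake `der()` encoding are hypothetical stand-ins (a real certificate is ASN.1 DER and would need a real X.509 parser). SHA-256 over the whole certificate stands in for the Foolscap-style hash, and the URL-safe base64 encoding of the public-key hash is an assumption for illustration.

```python
import base64
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyCertificate:
    """Hypothetical stand-in for an X.509 certificate (no real parsing)."""
    spki_der: bytes   # bytes of the SubjectPublicKeyInfo (the public key)
    not_after: str    # end of the validity period
    serial: int

    def der(self) -> bytes:
        # Toy encoding for illustration only; real certificates are ASN.1 DER.
        return b"|".join(
            [self.spki_der, self.not_after.encode(), str(self.serial).encode()]
        )


def whole_cert_id(cert: ToyCertificate) -> str:
    """Foolscap-style identity: a hash over the entire certificate."""
    return hashlib.sha256(cert.der()).hexdigest()


def spki_pin(cert: ToyCertificate) -> str:
    """GBS-style identity: a hash over the public key (SPKI) only."""
    digest = hashlib.sha256(cert.spki_der).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")


def matches_announcement(cert: ToyCertificate, announced_pin: str) -> bool:
    """Compare the presented key's pin with the announced pin in constant time."""
    return hmac.compare_digest(spki_pin(cert), announced_pin)


key = b"alice-public-key-bytes"
old = ToyCertificate(key, "2022-01-01", 1)
renewed = ToyCertificate(key, "2023-01-01", 2)  # same key, new certificate
other = ToyCertificate(b"mallory-public-key-bytes", "2023-01-01", 3)

announced = spki_pin(old)
print(matches_announcement(renewed, announced))      # same key still matches
print(matches_announcement(other, announced))        # different key does not
print(whole_cert_id(old) == whole_cert_id(renewed))  # whole-cert hash changes
```

In this toy model the renewed certificate still matches the announced pin because only the public key is hashed, whereas the whole-certificate hash changes with any change to the certificate. No certificate authority is consulted in either case.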
Consensus on IRC seemed to be that "Tahoe" is just lazy shorthand for "Tahoe-LAFS". I updated the whole doc to use "Tahoe-LAFS", though.
From this discussion, it seems like the RFC 7469 reference may be more harmful than useful. We can drop it and put the validation rules inline in the GBS doc, if that would be clearer. That seems beyond the scope of this issue though.
Yes, that would be clearer, but we don't have to drop it entirely just because I have some reading comprehension issues. ;-)
Perhaps it would be useful to move the reference to a footnote or a reference section instead? We could clarify what GBS uses from the RFC and what it does not, and maybe the reasoning.