HTTPS storage furls #1189
Fixes https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3875
Probably ready for preliminary review, even if not quite to be merged.
Open issues, feedback from reviewer welcome:
Thanks! I did a rough first pass and left some comments inline. Probably will spend some more time looking on Monday.
Comment copied from Foolscap, right? But I don't think OpenSSL has a sense of humor and it just calls this function for every certificate presented by the peer (including those in the chain).
Is this app data used by anything?
Regarding naming, I suppose we could start to try to be consistent and refer to these as NURLs instead of FURLs. See https://tahoe-lafs.readthedocs.io/en/latest/specifications/url.html#nurls
Playing a bit with hyperlink, it looks like the Twisted endpoint style URLs don't work. E.g. `DecodedURL.from_text` won't parse `pb://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@tcp:127.1:34399/jhjbc3bjbhk#v=1`. The extra `:` in the netloc makes it angry.

I'd almost say we can just drop this, but this is how Tor and I2P are supported. A NURL for a Tor server has a netloc of `tor:u33m4y7klhz3b.onion:1000` and for I2P it looks like `i2p:c2ng2pbrmxmlwpijn`.

I'm not sure what to do about this. Options that suggest themselves to me are ...
I guess url-encoding the endpoint might fix it? except I don't know if that is a sensible thing to do to the netloc part of a URL.
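For what it's worth, a minimal sketch of the failure (the key and swissnum here are placeholders, not the real values above):

```python
# Assumes hyperlink is installed; "key" and "swissnum" are placeholders.
from hyperlink import DecodedURL, URLParseError

# A plain host:port netloc parses fine.
DecodedURL.from_text("pb://key@127.0.0.1:34399/swissnum#v=1")

# An endpoint-style netloc has an extra ":" where host:port belongs,
# so hyperlink rejects it at parse time.
try:
    DecodedURL.from_text("pb://key@tcp:127.1:34399/swissnum#v=1")
except URLParseError as e:
    print("endpoint-style netloc rejected:", e)
```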
Apart from this grammar/parsing problem I really like this function and how it fits into the rest of `StorageClient`. I like how it is just an alternate constructor. I like how all of the certificate validation logic ends up bundled inside the underlying HTTP client object. I like how it is all a self-contained transformation from a structured representation of a NURL (hopefully as a `DecodedURL`) to a different structured representation as a `StorageClient`.

@ -113,0 +168,4 @@
# the other end. OpenSSL calls it multiple times, for each errno
# for each certificate.
# We do not care about certificate authorities or revocation
For your consideration:

I believe this is a public API, though `cryptography.hazmat`'s documentation:

@ -9,1 +10,4 @@
from cryptography.x509 import Certificate
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat
from werkzeug.http import parse_options_header
Ahhh hello hazmat my old friend.
The GBS spec calls for "urlsafe-base64 (without padding)" here. I think I believed that was necessary to be able to represent NURLs using standard URL-shaped tools. This hash ends up in the userinfo part of the URI and it looks like percent encoding is allowed there (https://datatracker.ietf.org/doc/html/rfc3986#section-3.2.1) so it seems like it is not necessary to use urlsafe-base64 to play nicely with other tools.
I wonder if it still might be preferable to avoid ever having a hash with percent encoding in. Maybe it would be slightly annoying to sometimes have to deal with such a hash "by hand" (ie, sometimes naive copy/paste will give you a bogus string and maybe you won't always immediately notice)? The flip side being that sometimes if you forget to handle the hash using the right base64 flavor you will have decoding problems.
So ... maybe it is a wash.
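To make the trade concrete, a small sketch (assuming the hash is a raw digest):

```python
from base64 import b64encode, urlsafe_b64encode
from hashlib import sha256
from urllib.parse import quote

digest = sha256(b"example certificate bytes").digest()

standard = b64encode(digest).rstrip(b"=").decode("ascii")
urlsafe = urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

# Standard base64 may contain "+" and "/", which need percent-encoding
# before landing in the userinfo part of a URL; the urlsafe flavor
# ("-" and "_") passes through unchanged.
print(quote(standard, safe=""))
print(urlsafe)
```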
Dang. Too bad there still is no decent Python API for constructing a TLS server endpoint. :/
There's also `allmydata.crypto.rsa.create_signing_keypair` which does the same thing.

My guess is that it's necessary because even with non-persistent connections, `Agent` still fires application-code-facing Deferreds before it finishes closing its connections. It's probably necessary for IStorageServer HTTPS tests too but they get lucky more often (or they got persistently lucky once and some other aspect of cleanup lets the reactor tick just enough for those connections to end).

I have a similar work-around in a test fixture managing a Treq instance - https://github.com/PrivateStorageio/ZKAPAuthorizer/blob/main/src/_zkapauthorizer/tests/fixtures.py#L195-L205
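(For reference, the general shape of that work-around — the linked fixture differs in detail, and the iteration count here is a guess:)

```python
from twisted.internet.task import deferLater

async def close_treq(reactor, pool):
    # Close the pooled connections, then give the reactor a couple of
    # extra ticks so Agent's transports actually finish disconnecting
    # (Agent fires its application-facing Deferreds before that point).
    await pool.closeCachedConnections()
    for _ in range(2):
        await deferLater(reactor, 0.0, lambda: None)
```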
:(
Hm. So, the way the TLS handshake goes is ... (a bunch of stuff) ... then the client sends a "ClientKeyExchange" message which has a "PreMasterSecret" in it, encrypted using the public key from the server's certificate. The server must decrypt this and do a cipher-specific operation to derive the "MasterSecret". Later the server sends its "Finished" message encrypted using "MasterSecret". The client decrypts it to extract a hash and a MAC of all of the handshake messages up until this point.
Without the private key corresponding to the certificate it presents, the server must compute the wrong "MasterSecret" and when the client decrypts the "Finished" message with the correct "MasterSecret" the hash and MAC will mismatch and the handshake will fail.
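(A toy illustration — not real TLS — of why the mismatch surfaces: the two sides derive different MasterSecrets, so the Finished MAC cannot verify.)

```python
import hashlib
import hmac

def finished_mac(master_secret: bytes, transcript: bytes) -> bytes:
    # Stand-in for the PRF the real handshake applies to the transcript.
    return hmac.new(master_secret, transcript, hashlib.sha256).digest()

transcript = b"all handshake messages so far"
client_secret = b"master secret derived with the right key"
server_secret = b"master secret derived with the wrong key"

# Different secrets produce different MACs, so verification fails.
assert not hmac.compare_digest(
    finished_mac(client_secret, transcript),
    finished_mac(server_secret, transcript),
)
```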
So ... a test for "the server is using the wrong private key" seems like it would mostly be a test for the underlying TLS implementation (which, at worst, would be able to deliver garbage data to the client instead of properly signaling a handshake failure).
However, I probably wouldn't mind seeing a test for this case if it were possible to write one since perhaps some TLS libraries offer some ways to choose ciphers so poorly that you somehow lose the above protection (I think this would have to be roughly a choice of the NULL cipher everywhere which would be a pretty bad mistake, but ...).
That said, I don't really know how we'd write this test, given OpenSSL's behavior. My only idea is ... find a different TLS implementation that lets you do something so obviously illegal?
It's probably fine to defer any such work to a follow-up ticket though.
besides `SSL4ServerEndpoint(...)`...?

(maybe this provides more inspiration? https://github.com/crossbario/crossbar/blob/master/crossbar/common/twisted/endpoint.py#L107 )
Sorry, I lost the plot here: https://twistedmatrix.com/trac/ticket/9885
Right, besides that. If that were the good Python API I imagine then I suppose it would be easier to use it than to build up a string and then parse it with `serverFromString`.

Me too. I couldn't find the mailing list post that ticket used to reference. Since `wrapClientTLS` is an API that makes TLS composable with other endpoints, maybe twisted:9885 is about a `wrapServerTLS` that works similarly?

That would be nice, certainly, and would let `listen_tcp` have a signature that accepts another endpoint instead of a hostname / port number (or would at least remove one barrier to doing this - the other being that the code still needs a way to describe whatever endpoint it is listening on, which is not possible for an arbitrary endpoint, at least if you actually adhere to the interface).

Ah, good point. From memory the gist is https://github.com/glyph/txsni/blob/master/txsni/tlsendpoint.py + write a strports plugin.
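(Roughly, the txsni-style wrapper looks like this — my own reconstruction, not txsni's exact code:)

```python
from twisted.internet.interfaces import IStreamServerEndpoint
from twisted.protocols.tls import TLSMemoryBIOFactory
from zope.interface import implementer

@implementer(IStreamServerEndpoint)
class TLSServerEndpoint:
    """Wrap any server endpoint in TLS, txsni-style."""

    def __init__(self, wrapped, context_factory):
        self._wrapped = wrapped
        self._context_factory = context_factory

    def listen(self, protocol_factory):
        # Wrap the application factory so every accepted connection
        # gets a TLS transport in front of it.
        return self._wrapped.listen(
            TLSMemoryBIOFactory(self._context_factory, False, protocol_factory)
        )
```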
I dug it up, linked and quoted on the ticket now.
It's called multiple times for the same certificate, though, once per errno?
Oh yea, that's probably true - keeping in mind that "no error" is one of the errnos (so - it doesn't skip any certificates).
OK, switched to that.
Apparently not.
Done.
The way I've seen this handled elsewhere is by changing the protocol, so e.g.:
pb+https://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1
pb+http://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1
pb+tor+https://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@127.1:34399/jhjbc3bjbhk#v=1
pb+i2p://1WUX44xKjKdpGLohmFcBNuIRN-8rlv1Iij_7rQ@sdsdfsdfds/jhjbc3bjbhk#v=1
or something like that.
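(If we went that way, pulling the transport hints back out of the scheme is straightforward — a hypothetical sketch, not anything in this PR:)

```python
from hyperlink import DecodedURL

def transport_layers(nurl_text: str) -> list:
    """Split a scheme like "pb+tor+https" into its transport layers."""
    nurl = DecodedURL.from_text(nurl_text)
    base, *layers = nurl.scheme.split("+")
    assert base == "pb"
    return layers

# Prints ['tor', 'https'] ("key"/"swiss" are placeholder values).
print(transport_layers("pb+tor+https://key@example.onion:1000/swiss#v=1"))
```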
Seems like a new ticket?
I guess I'm not using URL-safe base64 variant here, at minimum... should I do that, or do you want to switch to % encoding instead?
I'm just going to live with this for now, since it works.
OK.
Just going to leave this here, feels like test code is OK to be a bit duplicative because we might want to tweak it in various ways.
This logic, as a reminder, is client side. And the test would be of the client's behavior. So I'm not sure this test adds very much over ... auditing that we've configured the correct ciphers?
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3884
@ -1083,0 +1066,4 @@
# Listen on randomly assigned port, using self-signed cert:
private_key = generate_private_key()
certificate = generate_certificate(private_key)
_, endpoint_string = self._port_assigner.assign(reactor)
This is causing tests to fail, since the two files don't end up installed with the package. Perhaps we should make the cert-generating utility from `test_storage_https` reusable, and just generate a new cert each time.

Is there a guarantee it will be called at least once?
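(The reusable helper could be as simple as this sketch using cryptography's x509 builder — the actual helpers in `test_storage_https` may differ:)

```python
from datetime import datetime, timedelta

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

def generate_private_key():
    return rsa.generate_private_key(public_exponent=65537, key_size=2048)

def generate_certificate(private_key):
    # Self-signed, valid for one day; plenty for a test run.
    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, u"example")])
    now = datetime.utcnow()
    return (
        x509.CertificateBuilder()
        .subject_name(name)
        .issuer_name(name)
        .public_key(private_key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + timedelta(days=1))
        .sign(private_key, hashes.SHA256())
    )
```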
Thanks! Some more comments inline - nothing particularly major, though there's some stuff about determinism and reliability (esp in the test suite) that seems like it might need improvements in Twisted to fix in a really satisfying way. :(
There is also a `MANIFEST.in` ... does that need to be changed?

@ -125,0 +242,4 @@
Create a ``StorageClient`` for the given NURL.
``persistent`` indicates whether to use persistent HTTP connections.
"""
`reactor` should be another parameter to `from_nurl`, probably?

I think that if the rest of the code is using `DecodedURL` (or similar) correctly then % encoding will be applied there ... so I don't see anything that's wrong about the code now, apart from not following our spec. But I think I would prefer to see URL-safe base64 rather than letting the % encoding happen.

For consistency, probably `FilePath` instead. If `pathlib` is better I'd get behind an effort to switch to that at some point but I think a mix of `Path` and `FilePath` is probably worse than either.

As implemented, it will fail sometimes, since Linux (at least) does not make binding to port 0 reliable.
I don't think I care much - and if anything, I wonder if we should get rid of the automatic port assignment "feature" from Tahoe altogether.
Maybe this layer needs to offer the feature so that GBS fits into the Foolscap-shaped hole we're filling, in which case this is fine. As we continue with the process, if we get to a point where we get to make a choice about whether we keep offering this or not, let's carefully consider whether we should.
Ah, oops. This will fail intermittently, though. So either `listen_tls` needs retry logic for `EADDRINUSE` or this test code does (see the sketch below).

If only we could wrap TLS around an arbitrary endpoint, we could keep using `_port_assigner` here...

Separate `cert_to_file` and `key_to_file` returning `FilePath` would be slightly neater, I think.

I filed https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3887 which we could link to from here.
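(The retry mentioned above could look something like this — `new_endpoint` is a hypothetical zero-argument callable returning a fresh endpoint on a newly assigned port:)

```python
from errno import EADDRINUSE

from twisted.internet.error import CannotListenError

async def listen_with_retry(new_endpoint, factory, attempts=5):
    # Retry when a just-assigned port loses the race to another bind.
    for attempt in range(attempts):
        try:
            return await new_endpoint().listen(factory)
        except CannotListenError as e:
            # CannotListenError wraps the underlying socket error.
            underlying = getattr(e.socketError, "errno", None)
            if attempt == attempts - 1 or underlying != EADDRINUSE:
                raise
```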
@ -0,0 +84,4 @@
def not_async(*args, **kwargs):
return Deferred.fromCoroutine(f(*args, **kwargs))
return not_async
I filed https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3886, might be nice to link to it from somewhere around here.
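(For context, a self-contained version of the helper quoted above — the surrounding def and imports are my reconstruction:)

```python
from functools import wraps

from twisted.internet.defer import Deferred

def async_to_deferred(f):
    """Wrap an async function so callers get a Deferred back."""
    @wraps(f)
    def not_async(*args, **kwargs):
        return Deferred.fromCoroutine(f(*args, **kwargs))
    return not_async
```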
@ -0,0 +117,4 @@
)
root = Data(b"YOYODYNE", "text/plain")
root.isLeaf = True
listening_port = await endpoint.listen(Site(root))
This will intermittently fail with EADDRINUSE.
@ -0,0 +145,4 @@
return treq_client.get(url)
@async_to_deferred
async def test_success(self):
This failed on Windows CI with a dirty reactor error for `<TLSMemoryBIOProtocol #0 on 54010>`
@ -0,0 +178,4 @@
await self.request(url, certificate2)
@async_to_deferred
async def test_server_certificate_expired(self):
This failed on Windows CI w/ a dirty reactor error for `<TLSMemoryBIOProtocol #0 on 60408>`
@ -0,0 +211,4 @@
cert_to_file(FilePath(self.mktemp()), certificate),
) as url:
response = await self.request(url, certificate)
self.assertEqual(await response.content(), b"YOYODYNE")
I guess one other test would be for the case where the certificate is not yet valid.
`MANIFEST.in` is post-processing, applied after things like `package_data` have already been added, as per https://packaging.python.org/en/latest/guides/using-manifest-in/.

https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3888
Probably ready for another round of review.
Coverage increased (+0.03%) to 94.708% when pulling 4e0f912a10 on 3875-http-storage-furls into 0c18dca1ad on master.

Thanks. This looks very good to me. Just one comment inline, and that's not really actionable for this PR.
@ -0,0 +216,4 @@
# A potential attack to test is a private key that doesn't match the
# certificate... but OpenSSL (quite rightly) won't let you listen with that
# so I don't know how to test that! See
# https://tahoe-lafs.org/trac/tahoe-lafs/ticket/3884
One other error case that came to mind is the case of a server using a CA-issued certificate. I think this mostly overlaps with "has wrong hash" which is tested but I guess it might be nice to test separately just to be sure OpenSSL isn't somehow deciding to jump over all of our logic because it sees a signature from a well-known CA.
Of course, how to get a CA-issued certificate to use in the tests is an interesting question.
I don't consider this a blocker for this PR but it might be another case to track on 3884 or in another ticket.
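(For reference, the refusal mentioned in the quoted comment is easy to see with pyOpenSSL directly — a sketch with throwaway keys; exactly where the error surfaces may vary by OpenSSL version:)

```python
from OpenSSL import SSL, crypto

def self_signed(key):
    cert = crypto.X509()
    cert.get_subject().CN = "example"
    cert.set_issuer(cert.get_subject())
    cert.set_pubkey(key)
    cert.gmtime_adj_notBefore(0)
    cert.gmtime_adj_notAfter(3600)
    cert.sign(key, "sha256")
    return cert

right_key = crypto.PKey()
right_key.generate_key(crypto.TYPE_RSA, 2048)
wrong_key = crypto.PKey()
wrong_key.generate_key(crypto.TYPE_RSA, 2048)

ctx = SSL.Context(SSL.TLS_METHOD)
ctx.use_certificate(self_signed(right_key))
try:
    # OpenSSL notices the mismatch when the key is supplied/checked,
    # long before any listen() could happen.
    ctx.use_privatekey(wrong_key)
    ctx.check_privatekey()
except SSL.Error as e:
    print("rejected:", e)
```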