separate Tub per server connection #2759
Reference: tahoe-lafs/trac-2024-07-25#2759
Leif, dawuud, and I had an idea during today's devchat: what if we used a separate Tub for each server connection?
The context was Leif's use case, where he wants a grid in which all servers (including his own) advertise a Tor .onion address, but he wants to connect to his own servers over faster direct TCP connections (these servers are on the local network).
Through a combination of the #68 multi-introducer work, and the #517 server-override work, the plan is:
So Leif can:
But now the issue is: tahoe.cfg has an anonymous=true flag, which tells it to configure Foolscap to remove the DefaultTCP connection-hint handler, for safety: no direct-TCP hints will be honored. So how should this overridden server use an otherwise-prohibited TCP connection?
So our idea was that each YAML clause has two chunks of data: one local, one copied from the introducer announcement. The local data should include a string of some form that specifies the properties of the Tub that should be used for connections to this server. The StorageFarmBroker will spin up a new Tub for each connection, configure it according to those properties, then call getReference() (actually connectTo(), to get the reconnect-on-drop behavior).
The tahoe.cfg settings for foolscap connection-hint handlers get written into the cached introducer data. StorageFarmBroker creates Tubs that obey those rules because those rules are sitting next to the announcement that will contain the FURL.
In this world, we'll have one Tub for the server (if any), with a persistent identity (storing its key in private/node.privkey as usual). Then we'll have a separate ephemeral Tub for each storage server, which doesn't store its private key anywhere. (I think we'll also have a separate persistent Tub for the control-port / logport).
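A minimal sketch of the per-server Tub idea, using a stand-in class rather than Foolscap itself. `StubTub`, `set_hint_handlers()`, the `connection_types` property list, and the FURL strings are hypothetical placeholders; only the `connectTo()` name and the notion of an ephemeral, unsaved identity per storage server come from the discussion above.

```python
import os


class StubTub:
    """Stand-in for a Foolscap Tub (hypothetical; not the real API)."""

    def __init__(self):
        # Ephemeral identity: generated in memory, never written to disk.
        self.tub_id = os.urandom(20).hex()
        self.hint_handlers = {"tcp"}  # pretend default handler
        self.targets = []

    def set_hint_handlers(self, hint_types):
        # Replace the allowed connection-hint types for this Tub only.
        self.hint_handlers = set(hint_types)

    def connectTo(self, furl):
        # The real connectTo() also reconnects on drop; here we just record.
        self.targets.append(furl)


class StorageFarmBroker:
    def __init__(self):
        self._tubs = {}  # server_id -> StubTub, kept private to the broker

    def add_server(self, server_id, furl, local_properties):
        # One fresh Tub per storage server, configured from the local data.
        tub = StubTub()
        tub.set_hint_handlers(local_properties["connection_types"])
        tub.connectTo(furl)
        self._tubs[server_id] = tub
        return tub


broker = StorageFarmBroker()
onion = broker.add_server(
    "server1", "pb://placeholder1", {"connection_types": ["tor"]})
local = broker.add_server(
    "server2", "pb://placeholder2", {"connection_types": ["tcp"]})
```

Because each server gets its own Tub, the two connections share neither an identity nor a hint-handler policy, which is what lets one overridden server use TCP while the rest stay Tor-only.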
Potential issues:
.write(data) messages to it, we could flip it around and give the server access to a client-side ShareReader object, and the server would issue .read(length) calls to it. That would let the server set the pace more directly. And then the server could sub-contract to a different server by passing it the ShareReader object, then step out of the conversation entirely. However this would only work if our client could accept inbound connections, or if the subcontractor server already had a connection to the client (maybe the client connected to them as well).
None of those issues are serious: I think we could live with them.
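The flipped, pull-style transfer mentioned in the issues above can be sketched roughly as follows. This is an in-process illustration only: `ShareReader.read()` is modeled as a plain method call, whereas in the real design it would be a remote call, and `server_pull` is a hypothetical name.

```python
class ShareReader:
    """Client-side object the server pulls from (hypothetical sketch)."""

    def __init__(self, data):
        self._data = data
        self._offset = 0

    def read(self, length):
        # Hand out the next chunk; an empty result means end of share.
        chunk = self._data[self._offset:self._offset + length]
        self._offset += len(chunk)
        return chunk


def server_pull(reader, chunk_size):
    """Server side: request data at its own pace until the reader is empty."""
    received = bytearray()
    while True:
        chunk = reader.read(chunk_size)
        if not chunk:
            break
        received.extend(chunk)
    return bytes(received)


share = b"0123456789" * 3
stored = server_pull(ShareReader(share), chunk_size=8)
```

Since the server only holds a reference to the reader, it could in principle pass that reference to another server, which is the sub-contracting case described above.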
And one benefit is that we'd eliminate the TubID-based correlation between connections to different storage servers. This is the correlation that foils your plans when you call yourself Alice when you connect to server1, and Bob when you connect to server2.
It would leave the #666 Accounting pubkey relationship (but you'd probably turn that off if you wanted anonymity), and the timing relationship (server1 and server2 compare notes, and see that "Alice" and "Bob" connect at exactly the same time, and conclude that Alice==Bob). And of course there's the usual storage-index correlation: Alice and Bob are always asking for the same shares. But removing the TubID correlation is a good (and necessary) first step.
The StorageFarmBroker object has responsibility for creating IServer objects for each storage server, and it doesn't have to expose what Tub it's using, so things would be encapsulated pretty nicely. (In the long run, the IServer objects it provides won't be using Foolscap at all).
Here's my latest dev branch that partially implements this design:
https://github.com/david415/tahoe-lafs/tree/introless-multiintro_yaml_config.1
StorageFarmBroker makes a Tub for each storage server
the caching needs a little bit of work still; i never delete the old cache but grow it. Maybe we should delete the old cache file upon connecting to the introducer?
TODO:
like this? connections.yaml
I like this connections.yaml layout a lot!
Maybe we should have a top-level default connection_types key too, to avoid repeating ourselves in each server and introducer definition? (When it exists, the server and introducer-level connection_types dictionary should be used in place of the default dictionary, not in addition to it).
I'm a little hesitant about requiring (local) introducer nicknames because people will have to make one up and it'll probably often end up being "My Introducer" or something like that, but it will certainly make the introducer list on the welcome page easier to understand when there are several introducers. The nickname can also be used as the filename for the introducer's yaml announcement cache.
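The "in place of, not in addition to" rule for connection_types could look something like this. The shape of the parsed config (the `introducers` and `servers` sections, the entry names, and the `"enabled"`/`"disabled"` values) is an assumption made for illustration; the attached connections.yaml layout is not reproduced in this thread.

```python
DEFAULT_KEY = "connection_types"


def effective_connection_types(config, section, name):
    """Return the connection_types dict for one server or introducer.

    A per-entry dict replaces the top-level default entirely; the two
    are never merged.
    """
    entry = config.get(section, {}).get(name, {})
    if DEFAULT_KEY in entry:
        return entry[DEFAULT_KEY]
    return config.get(DEFAULT_KEY, {})


# Hypothetical parsed connections.yaml content.
config = {
    "connection_types": {"tor": "enabled", "tcp": "disabled"},
    "introducers": {"main": {"furl": "pb://placeholder"}},
    "servers": {
        "local-box": {"connection_types": {"tcp": "enabled"}},
    },
}

intro_types = effective_connection_types(config, "introducers", "main")
local_types = effective_connection_types(config, "servers", "local-box")
```

Here the introducer inherits the default dictionary, while `local-box` gets only its own entry: the default's `tor` setting does not leak into it.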
in my latest dev branch i have all the unit tests working... and i rewrote the multi-intro tests to use our new connections.yaml file; also got the static server config working although i haven't written unit tests for that yet:
https://github.com/david415/tahoe-lafs/tree/introless-multiintro_yaml_config.1
the next step is to load the connection_types sections of the yaml file.
ok! my dev branch is ready for code review. it passes ALL unit tests except two:
note: i did not make these features work for the v1 intro client.
here's a useful diff to show how my dev branch differs from my introless-multiintro branch, which is the same as Leif's introless-multiintro except that it has the latest upstream/master merged in.
https://github.com/david415/tahoe-lafs/pull/7/files
i made a new foolscap dev branch with the SOCKS5 plugin and merged upstream master into it
https://github.com/david415/foolscap/tree/tor-client-plugin.4
i've also updated the latest tahoe-lafs dev branch and i fixed some of the introducer unit tests that were failing... but i thought that i had previously gotten all or almost all of them to pass.
https://github.com/david415/tahoe-lafs/tree/introless-multiintro_yaml_config.1
i'm also a bit confused as to why the web interface is totally broken.
replying here to comment of meejah
https://tahoe-lafs.org/trac/tahoe-lafs/ticket/517#comment:73
since my foolscap changes aren't merged upstream they require you to do some extra work to get it all to build correctly. i usually pip install tahoe-lafs first and then uninstall old foolscap and install my new foolscap.
this shows a diff relative to the changes in leif's multiintro introducerless branch but it also has the upstream/master merged in:
https://github.com/david415/tahoe-lafs/pull/8
this is diffed against upstream/master
https://github.com/tahoe-lafs/tahoe-lafs/pull/260
further code review should be conducted against one of these pull requests and not an older one
here's my attempt to make the storage broker client make one tub per storage server:
https://github.com/david415/tahoe-lafs/tree/storage_broker_tub.0
so far i've been unable to make some of the unit tests pass.
last night meejah fixed the test that was failing, here:
https://github.com/meejah/tahoe-lafs/tree/storage_broker_tub.0
warner, please review. this is step 1 as you outlined above.
i'm going to begin work on step 2
warner,
step 2 --> https://github.com/david415/tahoe-lafs/tree/intro_yaml_cache.0
I'll wait for review before proceeding further with this ticket.
In f5291b9/trunk:
Landed step 1.. thanks!
Looking at step 2: here are some thoughts:
_auto_deps.py to add its PyPI name?
self.cache_filepath into self._cache_filepath
yaml.safe_load() instead of plain load()? (I don't know what exactly is unsafe about plain load, and we aren't parsing files written by others, but maybe it's good general practice to use safe_load() by default)
FilePath.setContent() .. that's cool, maybe we should replace fileutil.write_atomically() with it
allmydata.node.Node which returns the NODEDIR/private/ filename for a given basename, so the magic string "private" isn't duplicated all over the place.
test_introducer.Announcements.test_client_*, basically a yaml.load(filename) and checking that the announcement and key string are correct (including the cases when there is no key, because the sender didn't sign their announcement, or because it went through an old v1 introducer)
I like where this is going!
idnar (on IRC) pointed out that yaml.load() will, in fact, perform arbitrary code execution. So I guess safe_load() is a good idea.
OK i've made those corrections. Although I think my unit tests need a bit of work. I found that the nickname was not propagated into the announcement for some reason.
I was thinking that instead of having a cache expiry policy we can just replace the old cache file once we connect to the introducer. What do you think of this?
Oh, I like that. It sounds like the simplest thing to implement, and mostly retains the current behavior.
We need to think through how replacement announcements get made: I think announcements have sequence numbers, and highest-seqnum wins. If we write all announcements into the cache (as opposed to rewriting the cache each time with only the latest announcement for each server), then we'll have lots of old seqnums in the file, but we can filter those out when we read it.
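If all announcements are appended to the cache, the read-side filter could be as simple as the sketch below. The record fields (`server_id`, `seqnum`, `furl`) and the function name are assumptions for illustration, not the actual cache format.

```python
def latest_announcements(cached):
    """Keep only the highest-seqnum announcement for each server."""
    latest = {}
    for ann in cached:
        key = ann["server_id"]
        if key not in latest or ann["seqnum"] > latest[key]["seqnum"]:
            latest[key] = ann
    return latest


# Hypothetical cache contents, in the order they were appended.
cache = [
    {"server_id": "v0-aaa", "seqnum": 1, "furl": "pb://old"},
    {"server_id": "v0-bbb", "seqnum": 4, "furl": "pb://b"},
    {"server_id": "v0-aaa", "seqnum": 3, "furl": "pb://new"},
    {"server_id": "v0-aaa", "seqnum": 2, "furl": "pb://mid"},
]
current = latest_announcements(cache)
```

The filter does not depend on the order of records in the file, so a stale announcement written after a newer one is still discarded.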
Also there's a small window when the introducer restarts, before the servers have reconnected to it, when it won't be announcing very much. Our client will erase its cache when it reconnects, and we'll have a small window when the cache is pretty empty. However if the client is still running (it hasn't bounced), it will still remember all the old announcements in RAM, so those connections will stay up. And if it does bounce, then it's no worse than it was before the cache.
OK here's my
"step 3 would be reading from the yaml file too, but still have exactly one introducer"
https://github.com/david415/tahoe-lafs/tree/read_intro_yaml_cache.0
I am not sure exactly how to implement cache purging or announcement replacements.
The naive way I described isn't even implemented here... but to do that I could simply remove
the cache file when we successfully connect to the introducer.
Here's the latest "step 2 is probably to have the introducer start writing to the yaml file, but not have anything which reads from it yet" :
https://github.com/tahoe-lafs/tahoe-lafs/pull/278
please review
I also have dev branches available for "step 3", here:
https://github.com/david415/tahoe-lafs/tree/read_intro_yaml_cache.2
but maybe i can "regenerate" that branch after "step 2" is landed... please do let us know.
In b49b409/trunk:
"step 3" -> https://github.com/tahoe-lafs/tahoe-lafs/pull/281
please review.
09:36 < warner> step 4 is to add the override file (but still the only permissible connection type is "tcp")
"step 4" -> https://github.com/david415/tahoe-lafs/tree/2759.add_connections_yaml_config.0
Here in this minimal code change I've only added one feature:
please review.
I think we've exhausted the purview of this ticket, which is specifically about using a separate Tub for each storage-server connection. Let's move the more general "cache server information and use it later, maybe with overrides" into a separate ticket: #2788
Since f5291b9 landed the per-server Tub, I'm closing this ticket. Work on PR281 and dawuud's other branches will continue in #2788.