ugly (temporary) error message when connecting to new onion server #2850
Reference: tahoe-lafs/trac-2024-07-25#2850
While running some smoketests on 1.12, I spun up a new onion-based server (tahoe create-node --listen=tor) on an existing grid, while watching a client on that grid. I think the client hears about the new service before it's really ready to listen, because for the first 10 or 15 seconds, the client's Welcome page displays an ugly traceback squeezed into the "Connection" column (screenshots attached). The full traceback (obtained from the flogtool log) was:
After about 15 or maybe 30 seconds, the server became reachable, and the error message cleared up. I believe the client was using the SOCKS port of the local Tor daemon, so the delay probably wasn't because the client was waiting for a local Tor to launch, nor waiting for a control connection to the local Tor.
I think the fix for this will be to have the foolscap Tor handler map this txsocksx.errors.HostUnreachable to something more concise, like just "onion server unreachable".
It's possible that we could do something on the server side too. The server was using a control port on the system Tor daemon to register the onion service. It will have published the .onion address as soon as it started (just after establishing a Tor-mediated connection to the Introducer), but it tells Tor to set up the server at startup too, and that process probably didn't finish by the time the client heard about the address. The .onion address is generated during tahoe create-node, and written into tahoe.cfg, so it's available very early, before the Tor control connection is even started.
So maybe we want to change the server to wait for some sort of ACK from the tor_provider, telling us that the Tor daemon says the onion service descriptors and rendezvous points are ready to go, before allowing the IntroducerClient to publish the storage server. That might help with the first launch of the server. It won't help with later launches, when the Introducer (and clients) already have the announcement, and they're busy trying+retrying to connect, but there's nothing we can do about that.
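A rough sketch of both ideas, to make the proposal concrete. Nothing here is actual foolscap or Tahoe code: HostUnreachable is a local stand-in for txsocksx.errors.HostUnreachable so the example runs without txsocksx, and summarize_connection_error, publish_when_ready, and the onion_ready event are hypothetical names. It also uses asyncio rather than the Twisted Deferreds the real code would use, purely to keep the example self-contained.

```python
import asyncio


class HostUnreachable(Exception):
    """Local stand-in for txsocksx.errors.HostUnreachable (assumption)."""


# Hypothetical table: exception type -> short text for the "Connection" column.
CONCISE_MESSAGES = [
    (HostUnreachable, "onion server unreachable"),
]


def summarize_connection_error(exc):
    """Return a one-line status instead of a full traceback."""
    for exc_type, message in CONCISE_MESSAGES:
        if isinstance(exc, exc_type):
            return message
    # Fallback: class name plus the first line of the message, if any.
    lines = str(exc).splitlines()
    name = type(exc).__name__
    return f"{name}: {lines[0]}" if lines else name


async def publish_when_ready(onion_ready, publish, timeout=120.0):
    """Call publish() only after onion_ready is set.

    onion_ready is an asyncio.Event that a (hypothetical) tor_provider
    would set once Tor reports the onion service is reachable.
    Returns True if publish() was called, False if the wait timed out.
    """
    try:
        await asyncio.wait_for(onion_ready.wait(), timeout)
    except asyncio.TimeoutError:
        return False
    publish()
    return True


async def demo():
    """Show that nothing is published until the ready signal arrives."""
    onion_ready = asyncio.Event()
    published = []
    task = asyncio.ensure_future(
        publish_when_ready(onion_ready, lambda: published.append("storage"), 5.0)
    )
    await asyncio.sleep(0)          # let the task start waiting
    before = list(published)        # still empty: the service is not ready
    onion_ready.set()               # simulated ACK from the tor_provider
    ok = await task
    return before, ok, published
```

The timeout in publish_when_ready is a design choice of the sketch, not something from the ticket: without one, a Tor daemon that never reports readiness would keep the server unannounced forever.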
Attachment onion-connect-error.png (335967 bytes) added
error during connect
Attachment onion-connect-success.png (189680 bytes) added
successful connect
Moving open issues out of closed milestones.
Ticket retargeted after milestone closed