improve the mechanism that causes test nodes to exit even if not successfully stopped #1336
Reference: tahoe-lafs/trac-2024-07-25#1336
source:src/allmydata/test/test_runner.py includes some tests (in the RunNode class) for whether node processes can be successfully started and stopped. If stopping the node fails, we don't want the node process to be left running. (On Windows the process would hold open file handles that prevent the _trial_test directory from being deleted, interfering with subsequent test runs -- although currently these tests don't work on Windows anyway, as discussed below.)
Currently this is done by writing a file called "exit_trigger" (originally given the poorly chosen name "suicide_prevention_hotline") in the node directory. If a node sees this file at startup, it sets a 1-second periodic timer (source:src/allmydata/client.py#L161). Each time the timer fires, the node process exits if either the file's mtime is more than 120 seconds in the past or the file no longer exists (source:src/allmydata/client.py#L498).
There are several problems with this mechanism:
The name of the file is based on a very poor choice of metaphor that is both unpleasant and misleading. (The existence of the file doesn't prevent the node from exiting, as the name might imply.)
In addition, the tests of starting nodes don't work on Windows, because twistd doesn't daemonize or write the pid file on that platform. While that isn't directly due to this mechanism, it would be nice to redesign these tests in a way that does work on Windows (if we're not going to change the Windows behaviour to be more like Unix).
I would be happy to have these issues fixed. But how?
The name of the file was changed to "exit_trigger" in [647ebce6b993cc6d319ad6be0f4921909d159d73/trunk].
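For anyone unfamiliar with the check described in the ticket, here is a minimal sketch of the exit condition, using only the standard library. The function name `should_exit` and the constant `EXIT_TRIGGER_TIMEOUT` are hypothetical and chosen for illustration; this is not the code in client.py, and it leaves out the periodic timer that would call the check once per second.

```python
import os
import time

# 120 seconds, as stated in the ticket description (the constant name is hypothetical).
EXIT_TRIGGER_TIMEOUT = 120


def should_exit(trigger_path, now=None, timeout=EXIT_TRIGGER_TIMEOUT):
    """Return True if the node process should exit.

    The node exits when the trigger file no longer exists, or when
    its mtime is more than `timeout` seconds in the past.
    """
    if now is None:
        now = time.time()
    try:
        mtime = os.stat(trigger_path).st_mtime
    except OSError:
        # The file is missing (or unreadable): treat it as a signal to exit.
        return True
    return (now - mtime) > timeout
```

Under this scheme a test keeps the node alive by touching the file at least once every 120 seconds, and can force an exit by deleting it.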