make argv $0 be 'tahoe', not 'twistd' #174
Reference: tahoe-lafs/trac-2024-07-25#174
To make it easier to identify a running tahoe node with standard tools like 'ps' and 'top', it
would be awfully nice if argv[0] (aka $0) were 'tahoe'. At the moment, the way we spawn twistd
means that an 'allmydata-tahoe start' command actually results in a process with an argv of
/usr/bin/python /usr/bin/twistd -y tahoe-client.tac --logfile logs/twistd.log
which is kind of gross.
To accomplish this, we'd need to invoke twistd directly (as a regular python function call), rather than using os.system() to spawn off a new process. This is tricky. I've done it in buildbot, so we can probably steal code from there, but it has ramifications for both win32 (#27) and for tailing-the-logfile (#71).
We've changed the name of the .tac file to contain the word "tahoe" (#156), so at least 'ps ax |grep tahoe' results in something useful now, but it would be nice to complete this effort.
So, I stumbled across a feature in Application that might work well enough
for our purposes here. By adding
service.IProcess(application).processName = "tahoe"
to our .tac file, we go from an argv that looks like:
/usr/bin/python /usr/bin/twistd -y tahoe-client.tac --logfile logs/twistd.log
to one that looks like:
tahoe /usr/bin/twistd --originalname -y tahoe-client.tac --logfile logs/twistd.log
It looks kind of weird now, but 'tahoe' shows up as argv0, which is probably a good thing.
I'll do some more experimentation.
Unfortunately, 'top' still shows the process as 'python', instead of 'tahoe'.
Also, we'd like the node's basedir to show up in the argv list, so that 'ps ax |grep tahoe' would tell us which node is which.
Title changed from "make argv $0 be 'tahoe', not 'twistd'" to "make argv $0 be 'tahoe', not 'twistd', and add BASEDIR to argv"
We had to write a C extension to modify argv[0], so I'm curious if you find a cleaner way.
Hm, I was just playing with dupfilefind and I noticed that its entry in top is named "dupfilefind" instead of "python", even though its entry in "ps eax" is:
Upon reflection, I think I want the nickname in argv for nodes that have them, and the base class (cpu-watcher, stats-gatherer, et al) for those that don't.
This was inspired by the behavior of Munin, which shows the hostname of the target of each munin-update process.
Since the Milestone is "undecided" and nobody is actively working on this ticket, it doesn't seem right to have its priority set to "critical".
If it is still of critical importance to a user of Tahoe-LAFS, such as Zandr, then please say so and we'll see if someone actually wants to spend time fixing it.
This might have been mostly fixed with the recent import+call+twistd.run() change. At least argv[0] should now be 'tahoe' instead of twistd. If you passed an explicit basedir (i.e. 'tahoe start FOO'), then I think FOO will appear in your ps args.
Anyone want to check?
Replying to warner:
The cmdline as shown by 'ps ax' looks fine but the process name displayed by 'top' remains 'python'.
Hm, I guess most interpreted-language programs (those with #! lines) will show up that way. Should we change it? That is, do folks think it's useful to know which version/instance of python is being used to run your node? I suppose if it helps you reconstruct the running program, it might be, although then there may be other ambiguities involved (i.e. if you just run "tahoe start", such that it searches your $PATH for "tahoe", will "ps" show you which one it found?).
I imagine we might be able to have the process rearrange its sys.argv array to try and change what ps sees, but I'm not 100% sure that it'd be a good idea..
Here's what twistd does to change argv[0]: http://twistedmatrix.com/trac/browser/trunk/twisted/scripts/_twistd_unix.py?rev=27324#L173 (name is given by twisted.application.service.IProcess(application).processName.)
This requires execv and so is only done on Unix. (I don't think there's any way to change how the Processes tab of Task Manager lists processes on Windows.)
Reducing priority to minor, since the commandline now does give all significant information.
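The exec-based renaming mentioned above works because the exec family takes the program to run and the argv list as separate arguments, so argv[0] does not have to match the executable's path. Here is a minimal sketch of that idea, assuming a Unix system with /bin/sh available. It uses the standard library's subprocess `executable=` parameter rather than calling os.execv directly, and the helper name `argv0_seen_by_child` is made up for illustration; this is not the code twistd uses.

```python
import subprocess


def argv0_seen_by_child(display_name):
    """Start /bin/sh with argv[0] set to display_name and return the child's $0.

    Hypothetical helper for illustration only. On POSIX, `executable=` names
    the program that is actually run, while the first list element is only
    what the child receives as argv[0].
    """
    result = subprocess.run(
        [display_name, "-c", 'echo "$0"'],  # argv as the child will see it
        executable="/bin/sh",               # the real program being executed
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The shell reports the name it was given in argv[0], not its own path.
    print(argv0_seen_by_child("tahoe"))
```

The same separation is what lets a process re-exec itself under a friendlier name, at the cost of restarting the program from scratch.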
Replying to davidsarah:
Huh. That suggests that there isn't any way to affect argv from python (i.e.
sys.argv is a one-time copy of the C data structure, and changes are not
propagated back). Or at least that the Twisted devs decided it wasn't worth
trying to use such a feature.
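One way to check the "one-time copy" guess is to compare the kernel's view of the command line before and after assigning to sys.argv. The sketch below assumes Linux, where /proc/self/cmdline exposes the process's original argv; on other systems the helper returns None and the comparison is uninformative. The function names are made up for illustration.

```python
import sys
from pathlib import Path


def kernel_cmdline():
    """Return argv as the kernel reports it (Linux only), or None elsewhere."""
    proc = Path("/proc/self/cmdline")
    if not proc.exists():
        return None
    raw = proc.read_bytes()
    # Arguments are NUL-separated in /proc/<pid>/cmdline.
    return [part.decode(errors="replace") for part in raw.split(b"\0") if part]


def rename_in_python_only(new_name):
    """Overwrite sys.argv[0] temporarily and return (before, after) kernel views."""
    before = kernel_cmdline()
    saved = list(sys.argv)
    sys.argv[:] = [new_name] + saved[1:]
    try:
        after = kernel_cmdline()
    finally:
        sys.argv[:] = saved  # restore the interpreter's argv
    return before, after


if __name__ == "__main__":
    before, after = rename_in_python_only("tahoe")
    print("kernel view changed:", before != after)
```

If the two views are equal, assigning to sys.argv only changed the Python-level list, which matches the behaviour guessed at above.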
I'm not sure if that extra exec() is going to cause us problems. We're planning to set up a fairly specific environment before calling twistd.run(), and I don't think that re-execing the current contents of argv is likely to keep that same state around. It'll need some experimentation.
Attachment fix-174.darcs.patch (26398 bytes) added
bin/tahoe-script.template: On non-Windows, invoke support/bin/tahoe directly as a script (rather than via python), so that 'top' for example will show it as 'tahoe'. On Windows, simplify some code that set argv[0], which is never used. fixes #174
I agree that an extra exec isn't worth it. However I think that on Unix we can invoke the support/bin/tahoe script directly, rather than via python, which should achieve the same effect. (That script's shebang line should point to the same Python interpreter as the bin/tahoe script, since both should be the full path to the Python that ran 'setup.py build'.) Please repeat the checks in comment:15 to confirm.
I don't think we need a test for this, since the existing tests should catch any regressions. (It would be difficult to test, since sys.argv[0] is the same for a script invoked directly or via python.)
davidsarah: applying your patch on my OS-X box and then using 'tahoe start' to launch a couple of nodes gives me 'ps aux' output that looks the same as it did with current trunk:
"top" on OS-X just shows "Python" in the COMMAND column. (OS-X's top is a bit weird. Actually, all OS-X unix tools are a bit weird. sysv vs bsd, I guess.)
Is that what was supposed to happen?
(if it did the right thing, I suppose I'd r+ the patch, OTOH I get a deep sense of dread when seeing that weird .template file get changed, and it raises my "this file shouldn't even exist" hackles. I'd be much more keen to r+ a patch that deleted it altogether :-)
For the record, I'll be happy if the 'ps aux' output includes "tahoe", since "ps aux |grep tahoe" is what I always do to find out if a tahoe process is currently running. And I'm also happy if the argv array includes the target directory. Both of these conditions appear to be true right now.
Replying to warner:
No, it was supposed to change the command line to just "/Users/warner2/stuff/tahoe/t2-174/support/bin/tahoe start ../MY-TESTNET/node-3". Note that 'python setup.py build' is needed, I should have mentioned that.
The patch doesn't help on OS X. Can someone try it on Linux?
Title changed from "make argv $0 be 'tahoe', not 'twistd', and add BASEDIR to argv" to "make argv $0 be 'tahoe', not 'twistd'"
Replying to davidsarah:
This patch was tried on my Ubuntu 10.10 box. The output of the 'ps' command did not change, but the output of the 'top' command did.
In both cases.
Without the patch.
With it.
francois said on irc:
In changeset:9815852a09582776:
In changeset:1190ce614303b6fb: