the tahoe-lafs logging system is hard to discover #1936
Reference: tahoe-lafs/trac-2024-07-25#1936
docs/logging.rst (source:git/docs/logging.rst?rev=861892983369c0e96dc1e73420c1d9609724d752) explains how to get logging info out of your Tahoe-LAFS node. (Mostly for developers editing the source code, but also potentially useful for operators.) Nobody discovers this! Many people have reported a similar problem, frustration, and waste of time as Itamar did just now on IRC:
Proposal: the `tahoe --help` output points to the logging docs.

This problem is compounded by the fact that there exists a `~/.tahoe/logs` directory which looks at first glance like it has all the logs in it. This means people will assume that they've seen the logs after looking in there, and that if they don't see any error messages in there then no error messages are being logged. An example of this (probably) just happened:

Let's see, `~/.tahoe/logs` contains two things:

- `twistd.log`, which has top-level crashes that happen after daemonization (like "address already in use"), and other unhandled exceptions
- `incidents/`, which are foolscap "flog" bundles, created when something weird happens

In addition, there's stuff written to stderr during `tahoe start`, if it happens before daemonization. There's also runtime logging accessible with `flogtool tail NODEDIR/private/logport.furl`.

We have a couple of tickets about moving errors from post-daemonization to pre-daemonization, where they're immediately visible to the person running `tahoe start` or `tahoe run` (in contrast, post-daemonization errors are pretty hard to discover). I think we've managed to move most configuration errors to the pre-daemonization phase (by doing all our checking in `Client.__init__`), but address-already-in-use might be a significant exception.

What if we had the node write out a `~/.tahoe/logs/README` each time it started, with a few lines about what each logfile was for, and how to run `flogtool tail`?

Replying to warner:
I agree it would be an improvement to move more things to pre-daemonization. For that matter, what about the major strategy change of not implementing daemonization ourselves at all! We could leave that job up to daemontools, supervisord, upstart, systemd, or `tahoe run > tahoe-stdout.log.txt 2> tahoe-stderr.log.txt &`.

+1
Well, actually it should probably instruct you to read the source:docs/logging.rst file (on the principle of Don't Repeat Yourself).
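For illustration, here is a minimal sketch of what such a README writer might look like, combining the two suggestions: a few lines per entry, then a pointer to docs/logging.rst for the details. The function name `write_logs_readme` and the README wording are hypothetical; nothing here is taken from the Tahoe-LAFS source.

```python
from pathlib import Path

# Hypothetical README wording, summarizing what this thread says about the
# logs directory. The real text would belong to the Tahoe-LAFS maintainers.
README_TEXT = """\
This directory holds only some of this node's logs.

twistd.log   top-level crashes that happen after daemonization
             (for example "address already in use") and other
             unhandled exceptions
incidents/   foolscap "flog" bundles, created when something weird happens

Errors that occur before daemonization go to the stderr of "tahoe start".
For runtime logging, run:

    flogtool tail {furl_path}

See docs/logging.rst in the source tree for the full story.
"""


def write_logs_readme(nodedir):
    """Write NODEDIR/logs/README and return its path (illustrative only)."""
    nodedir = Path(nodedir)
    logs_dir = nodedir / "logs"
    logs_dir.mkdir(parents=True, exist_ok=True)
    # Spell out the logport path for this node so the command can be pasted.
    furl_path = nodedir / "private" / "logport.furl"
    readme = logs_dir / "README"
    readme.write_text(README_TEXT.format(furl_path=furl_path), encoding="utf-8")
    return readme
```

Rewriting the file on every start would keep it in sync with the node's version, at the cost of overwriting any local edits.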
Also, what do you think about the earlier proposal to have `tahoe --help` mention the existence of the source:docs/logging.rst file?

Replying to zooko (comment:4):
I dunno. Daemonizing a tahoe node is a very common thing for people to do. Twisted has excellent built-in daemonization tools. I don't want to ask users to discover and learn some external daemonization tool before they're able to do a very common operation. And my own test/deployment workflow would be a lot harder if I had to use daemontools just to launch a half-dozen local nodes. I still miss tahoe's built-in `--multiple` feature. So I'm -1 on removing this functionality from tahoe.

Incidentally, that `tahoe run >logs &` command doesn't record a PID anywhere (making it hard to kill, especially when you run multiple nodes), doesn't necessarily detach from the terminal (so logging out might kill the process), doesn't rotate the logfiles, throws any init-time errors into the stderr logfile where they won't be noticed, and has no way to report setup/config errors via an exit status code. All things that `tahoe start` provides :).

OS packagers, on the other hand, will have some preferred daemonization solution in mind already, so the non-daemonizing `tahoe run` is important to maintain for them to use.

It's not a bad idea, although I worry that `tahoe --help` is too long (61 lines at the moment). Once it grows beyond a page, it becomes hard to use. But one more line might be ok.
I don't think it's accurate to say "OS packagers will have a daemonization plan". As a packager I expect daemons to be well behaved by themselves. Having a pidfile written would be good.
It would also be nice to have an easy way to use syslog. I gather flogtool has advantages, but syslog is the standard approach.
Replying to gdt:
Would you want the complete, verbose logs sent to syslog? It is currently possible by running `flogtool tail ~/.tahoe/private/logport.furl | logger`. Is that the behavior you want?

Replying to gdt:
It currently has all of the above.
Replying to https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1936#comment:8:
What I would expect is for tahoe to use the LOG_{DEBUG,INFO,NOTICE,WARNING,ERR} hierarchy from syslog.h, and for most of the current logs to be DEBUG. INFO might be peer connections coming and going, and perhaps NOTICE for the introducer. Then people can configure syslogd appropriately for what they want. So piping flogtool to logger fails to label messages with priority level. And then there's assessing what logs make sense for routine operations when nothing is known to be wrong. (Presumably fetching files when all works ok is not noteworthy.)
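To make the labeling point concrete, here is a rough sketch of how events could be tagged with a syslog priority. The category names are made up for illustration and Tahoe-LAFS has no such mapping; the numeric severities are the standard syslog.h values, and the `<PRI>` prefix follows the usual syslog rule of facility code times 8 plus severity.

```python
# Severity values as defined in syslog.h (and RFC 5424, section 6.2.1).
LOG_ERR, LOG_WARNING, LOG_NOTICE, LOG_INFO, LOG_DEBUG = 3, 4, 5, 6, 7

# Numeric facility code for system daemons (unshifted).
DAEMON_FACILITY = 3

# Hypothetical event categories; Tahoe-LAFS does not define these names.
CATEGORY_SEVERITY = {
    "peer-connection": LOG_INFO,   # peers coming and going
    "introducer": LOG_NOTICE,      # introducer events
}


def severity_for(category):
    """Return the syslog severity for a category, defaulting to DEBUG."""
    return CATEGORY_SEVERITY.get(category, LOG_DEBUG)


def pri_prefix(category, facility=DAEMON_FACILITY):
    """Return the "<PRI>" prefix, where PRI = facility * 8 + severity."""
    return "<%d>" % (facility * 8 + severity_for(category))
```

With a scheme like this, anything not explicitly promoted stays at DEBUG, so syslogd can be configured to drop routine traffic such as successful file fetches.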
One thing I noticed about flogtool is that when starting it, the past is apparently in the past, and not accessible. That sort of makes sense, but it somewhat defeats the purpose of logging.
Replying to gdt:
Did you see the `--catch-up` option to `flogtool tail`? It only outputs entries that are still in memory, but that's often enough.

Let's open a separate ticket for integrating tahoe-lafs logging with syslog.
Related: #1974 (wui: the "report an incident" feature doesn't submit incident reports)