the tahoe-lafs logging system is hard to discover #1936

Open
opened 2013-03-25 16:25:00 +00:00 by zooko · 13 comments

[docs/logging.rst]source:git/docs/logging.rst?rev=861892983369c0e96dc1e73420c1d9609724d752 explains how to get logging info out of your Tahoe-LAFS node. (Mostly for developers editing the source code, but also potentially useful for operators.) Nobody discovers this! Many people have reported a similar problem, frustration, and waste of time as Itamar did just now on IRC:

<itamar1> can someone help me with tahoe's logging code?
<zooko> itamar1: I can!  [10:14]
<zooko> Actually, there's a doc.
<itamar1> oh
<zooko> So, let's see if that doc answers your questions
<itamar1> I guess I should've looked for docs first :)
<zooko> find docs -iname '*log*'
* itamar1 goes looking
<zooko> Well, secondly I wonder if the existence of that doc could be
		better-documented!  [10:15]
<zooko> Maybe linked from some "devel" page about tahoe-lafs...
* zooko looks at https://tahoe-lafs.org/trac/tahoe-lafs/wiki/Dev
<itamar1> I sort've expected everything to be logged by default  [10:16]
<zooko> Hm, well I'm not sure how to make the logging doc more discoverable.
<zooko> Many people have had that exact same problem, including me many times.
<zooko> I more or less approve of Brian's decision to make logging terse by
		default.
<zooko> But
<itamar1> and then spent sometime trying to find log level settings in command
		  line help
<zooko> I'd really like to help people figure it out more easily.
<zooko> Yeah, I'm sure a lot of people have lost time like that. Many have
		reported it on the IRC channel.
[docs/logging.rst]source:git/docs/logging.rst?rev=861892983369c0e96dc1e73420c1d9609724d752 explains how to get logging info out of your Tahoe-LAFS node. (Mostly for developers editing the source code, but also potentially useful for operators.) Nobody discovers this! Many people have reported a similar problem, frustration, and waste of time as Itamar did just now on IRC: ``` <itamar1> can someone help me with tahoe's logging code? <zooko> itamar1: I can! [10:14] <zooko> Actually, there's a doc. <itamar1> oh <zooko> So, let's see if that doc answers your questions <itamar1> I guess I should've looked for docs first :) <zooko> find docs -iname '*log*' * itamar1 goes looking <zooko> Well, secondly I wonder if the existence of that doc could be better-documented! [10:15] <zooko> Maybe linked from some "devel" page about tahoe-lafs... * zooko looks at https://tahoe-lafs.org/trac/tahoe-lafs/wiki/Dev <itamar1> I sort've expected everything to be logged by default [10:16] <zooko> Hm, well I'm not sure how to make the logging doc more discoverable. <zooko> Many people have had that exact same problem, including me many times. <zooko> I more or less approve of Brian's decision to make logging terse by default. <zooko> But <itamar1> and then spent sometime trying to find log level settings in command line help <zooko> I'd really like to help people figure it out more easily. <zooko> Yeah, I'm sure a lot of people have lost time like that. Many have reported it on the IRC channel. ```
zooko added the
code-nodeadmin
normal
defect
1.9.2
labels 2013-03-25 16:25:00 +00:00
zooko added this to the undecided milestone 2013-03-25 16:25:00 +00:00
Author

Proposal: the tahoe --help output points to the logging docs.

<zooko> Maybe the command-line help should point to the logging doc file.
<itamar1> seems reasonable
Proposal: the `tahoe --help` output points to the logging docs. ``` <zooko> Maybe the command-line help should point to the logging doc file. <itamar1> seems reasonable ```
Author

This problem is compounded by the fact that there exists a ~/.tahoe/logs directory which looks at first glance like it has all the logs in it. This means people will assume that they've seen the logs after looking in there, and if they don't see any error messages in there then there aren't any error message being logged. An example of this (probably) just happened:

<dash> hmm. anyone tried bzr with tahoe-lafs over sftp recently?   [10:01]
<dash> getting errors about "readv() returned 0 bytes"  [10:02]
<zooko> dash: hm. I have not.
<zooko> This is... bzr talking to the tahoe-lafs SFTP interface?  [10:28]
<zooko> dash: do you know how to loo for tahoe-lafs logs?
<dash> sure, in ~/.tahoe/logs
This problem is compounded by the fact that there exists a `~/.tahoe/logs` directory which looks at first glance like it has all the logs in it. This means people will assume that they've seen the logs after looking in there, and if they don't see any error messages in there then there aren't any error message being logged. An example of this (probably) just happened: ``` <dash> hmm. anyone tried bzr with tahoe-lafs over sftp recently? [10:01] <dash> getting errors about "readv() returned 0 bytes" [10:02] <zooko> dash: hm. I have not. <zooko> This is... bzr talking to the tahoe-lafs SFTP interface? [10:28] <zooko> dash: do you know how to loo for tahoe-lafs logs? <dash> sure, in ~/.tahoe/logs ```

Let's see, ~/.tahoe/logs contains two things:

  • twistd.log, which has top-level crashes that happen after daemonization (like "address already in use"), and other unhandled exceptions
  • incidents/, which are foolscap "flog" bundles, created when something weird happens

In addition, there's stuff written to stderr during tahoe start, if it happens before daemonization. There's also runtime logging accessible with flogtool tail NODEDIR/private/logport.furl.

We have a couple tickets about moving errors from post-daemonization to pre- , where they're immediately visible to the person running tahoe start or tahoe run (in contrast, post-daemonization errors are pretty hard to discover). I think we've managed to move most configuration errors to the pre-daemonization phase (by doing all our checking in Client.*init*), but address-already-in-use might be a significant exception.

What if we had the node write out a ~/.tahoe/logs/README each time it started, with a few lines about what each logfile was for, and how to run flogtool tail ?

Let's see, `~/.tahoe/logs` contains two things: * `twistd.log`, which has top-level crashes that happen after daemonization (like "address already in use"), and other unhandled exceptions * `incidents/`, which are foolscap "flog" bundles, created when something weird happens In addition, there's stuff written to stderr during `tahoe start`, if it happens before daemonization. There's also runtime logging accessible with `flogtool tail NODEDIR/private/logport.furl`. We have a couple tickets about moving errors from post-daemonization to pre- , where they're immediately visible to the person running `tahoe start` or `tahoe run` (in contrast, post-daemonization errors are pretty hard to discover). I think we've managed to move most configuration errors to the pre-daemonization phase (by doing all our checking in `Client.*init*`), but address-already-in-use might be a significant exception. What if we had the node write out a `~/.tahoe/logs/README` each time it started, with a few lines about what each logfile was for, and how to run `flogtool tail` ?
Author

Replying to warner:

Let's see, ~/.tahoe/logs contains two things:

  • twistd.log, which has top-level crashes that happen after daemonization (like "address already in use"), and other unhandled exceptions
  • incidents/, which are foolscap "flog" bundles, created when something weird happens

In addition, there's stuff written to stderr during tahoe start, if it happens before daemonization. There's also runtime logging accessible with flogtool tail NODEDIR/private/logport.furl.

We have a couple tickets about moving errors from post-daemonization to pre- , where they're immediately visible to the person running tahoe start or tahoe run (in contrast, post-daemonization errors are pretty hard to discover). I think we've managed to move most configuration errors to the pre-daemonization phase (by doing all our checking in Client.*init*), but address-already-in-use might be a significant exception.

I agree it would be an improvement to move more things to pre-daemonization. For that matter, what about the major strategy change of not implementing daemonization ourselves at all! We could leave that job up to daemontools, supervisord, upstart, systemd, or tahoe run > tahoe-stdout.log.txt 2> tahoe-stderr.log.txt &.

What if we had the node write out a ~/.tahoe/logs/README each time it started, with a few lines about what each logfile was for, and how to run flogtool tail ?

+1

Well, actually it should probably instruct you to read the source:docs/logging.rst file (on the principle of Don't Repeat Yourself).

Also, what do you think about the earlier proposal to have tahoe --help mention the existence of the source:docs/logging.rst file?

Replying to [warner](/tahoe-lafs/trac-2024-07-25/issues/1936#issuecomment-91264): > Let's see, `~/.tahoe/logs` contains two things: > > * `twistd.log`, which has top-level crashes that happen after daemonization (like "address already in use"), and other unhandled exceptions > * `incidents/`, which are foolscap "flog" bundles, created when something weird happens > > In addition, there's stuff written to stderr during `tahoe start`, if it happens before daemonization. There's also runtime logging accessible with `flogtool tail NODEDIR/private/logport.furl`. > > We have a couple tickets about moving errors from post-daemonization to pre- , where they're immediately visible to the person running `tahoe start` or `tahoe run` (in contrast, post-daemonization errors are pretty hard to discover). I think we've managed to move most configuration errors to the pre-daemonization phase (by doing all our checking in `Client.*init*`), but address-already-in-use might be a significant exception. I agree it would be an improvement to move more things to pre-daemonization. For that matter, what about the major strategy change of not implementing daemonization ourselves at all! We could leave that job up to daemontools, supervisord, upstart, systemd, or `tahoe run > tahoe-stdout.log.txt 2> tahoe-stderr.log.txt &`. > What if we had the node write out a `~/.tahoe/logs/README` each time it started, with a few lines about what each logfile was for, and how to run `flogtool tail` ? +1 Well, actually it should probably instruct you to read the source:docs/logging.rst file (on the principle of Don't Repeat Yourself). Also, what do you think about [the earlier proposal](/tahoe-lafs/trac-2024-07-25/issues/1936#issuecomment-91262) to have `tahoe --help` mention the existence of the source:docs/logging.rst file?
Author
<zooko> Updated #1936 to say that our system misleads people like dash into
		thinking they've seen the logs when they haven't.  [10:51]
``` <zooko> Updated #1936 to say that our system misleads people like dash into thinking they've seen the logs when they haven't. [10:51] ```

Replying to [zooko]comment:4:

For that matter, what about the major strategy change of not
implementing daemonization ourselves at all! We could leave that
job up to daemontools, supervisord, upstart, systemd, or tahoe run > tahoe-stdout.log.txt 2> tahoe-stderr.log.txt &.

I dunno. Daemonizing a tahoe node is a very common thing for people
to do. Twisted has excellent built-in daemonization tools. I don't
want to ask users to discover and learn some external daemonization
tool before they're able to do a very common operation. And my own
test/deployment workflow would be a lot harder if I had to use
daemontools just to launch a half-dozen local nodes. I still miss
tahoe's built-in --multiple feature. So I'm -1 on removing
this functionality from tahoe.

Incidentally, that tahoe run >logs & command doesn't record a
PID anywhere (making it hard to kill, especially when you run
multiple nodes), doesn't necessarily detach from the terminal (so
logging out might kill the process), doesn't rotate the logfiles,
throws any init-time errors into the stderr logfile where they
won't be noticed, and has no way to report setup/config errors via
an exit status code. All things that tahoe start provides :).

OS packagers, on the other hand, will have some preferred
daemonization solution in mind already, so the non-daemonizing
tahoe run is important to maintain for them to use.

Also, what do you think about [comment:1 the earlier proposal] to
have tahoe --help mention the existence of the
[source:docs/logging.rst] file?

It's not a bad idea, although I worry that tahoe --help is
too long (61 lines at the moment). Once it grows beyond a page, it
becomes hard to use. But one more line might be ok.

Replying to [zooko]comment:4: > For that matter, what about the major strategy change of not > implementing daemonization ourselves at all! We could leave that > job up to daemontools, supervisord, upstart, systemd, or ```tahoe > run > tahoe-stdout.log.txt 2> tahoe-stderr.log.txt &```. I dunno. Daemonizing a tahoe node is a very common thing for people to do. Twisted has excellent built-in daemonization tools. I don't want to ask users to discover and learn some external daemonization tool before they're able to do a very common operation. And my own test/deployment workflow would be a lot harder if I had to use daemontools just to launch a half-dozen local nodes. I still miss tahoe's built-in `--multiple` feature. So I'm -1 on removing this functionality from tahoe. Incidentally, that `tahoe run >logs &` command doesn't record a PID anywhere (making it hard to kill, especially when you run multiple nodes), doesn't necessarily detach from the terminal (so logging out might kill the process), doesn't rotate the logfiles, throws any init-time errors into the stderr logfile where they won't be noticed, and has no way to report setup/config errors via an exit status code. All things that `tahoe start` provides :). OS packagers, on the other hand, will have some preferred daemonization solution in mind already, so the non-daemonizing `tahoe run` is important to maintain for them to use. > Also, what do you think about [comment:1 the earlier proposal] to > have `tahoe --help` mention the existence of the > [source:docs/logging.rst] file? It's not a bad idea, although I worry that `tahoe --help` is too long (61 lines at the moment). Once it grows beyond a page, it becomes hard to use. But one more line might be ok.
Owner

I don't think it's accurate to say "OS packagers will have a daemonization plan". As a packager I expect daemons to be well behaved by themselves. Having a pidfile written would be good.

It would also be nice to have an easy way to use syslog. I gather flogtool has advantages, but syslog is the standard approach.

I don't think it's accurate to say "OS packagers will have a daemonization plan". As a packager I expect daemons to be well behaved by themselves. Having a pidfile written would be good. It would also be nice to have an easy way to use syslog. I gather flogtool has advantages, but syslog is the standard approach.
Author

Replying to gdt:

I don't think it's accurate to say "OS packagers will have a daemonization plan". As a packager I expect daemons to be well behaved by themselves. Having a pidfile written would be good.

It would also be nice to have an easy way to use syslog. I gather flogtool has advantages, but syslog is the standard approach.

Would you want the complete, verbose logs sent to syslog? It is currently possible by running flogtool tail ~/.tahoe/private/logport.furl | logger. Is that the behavior you want?

Replying to [gdt](/tahoe-lafs/trac-2024-07-25/issues/1936#issuecomment-91268): > I don't think it's accurate to say "OS packagers will have a daemonization plan". As a packager I expect daemons to be well behaved by themselves. Having a pidfile written would be good. > > It would also be nice to have an easy way to use syslog. I gather flogtool has advantages, but syslog is the standard approach. Would you want the complete, verbose logs sent to syslog? It is currently possible by running `flogtool tail ~/.tahoe/private/logport.furl | logger`. Is that the behavior you want?
Author

Replying to gdt:

I don't think it's accurate to say "OS packagers will have a daemonization plan". As a packager I expect daemons to be well behaved by themselves. Having a pidfile written would be good.

It currently has all of the above.

Replying to [gdt](/tahoe-lafs/trac-2024-07-25/issues/1936#issuecomment-91268): > I don't think it's accurate to say "OS packagers will have a daemonization plan". As a packager I expect daemons to be well behaved by themselves. Having a pidfile written would be good. It currently has all of the above.
Owner

Replying to https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1936#comment:8:

What I would expect is for tahoe to use the LOG_{DEBUG,INFO,NOTICE,WARNING,ERR} hierarchy from syslog.h, and for most of the current logs to be DEBUG. INFO might be peer connections coming and going, and perhaps NOTICE for the introducer. Then people can configure syslogd appropriately for what they want. So piping flogtool to logger fails to label messages with priority level. And then there's assessing what logs make sense for routine operations when nothing is known to be wrong. (Presumably fetching files when all works ok is not noteworthy.)

One thing I noticed about flogtool is that when starting it, the past is apparently in the past, and not accessible. That sort of makes sense, but it somewhat defeats the purpose of logging.

Replying to <https://tahoe-lafs.org/trac/tahoe-lafs/ticket/1936#comment:8>: What I would expect is for tahoe to use the LOG_{DEBUG,INFO,NOTICE,WARNING,ERR} hierarchy from syslog.h, and for most of the current logs to be DEBUG. INFO might be peer connections coming and going, and perhaps NOTICE for the introducer. Then people can configure syslogd appropriately for what they want. So piping flogtool to logger fails to label messages with priority level. And then there's assessing what logs make sense for routine operations when nothing is known to be wrong. (Presumably fetching files when all works ok is not noteworthy.) One thing I noticed about flogtool is that when starting it, the past is apparently in the past, and not accessible. That sort of makes sense, but it somewhat defeats the purpose of logging.
daira commented 2013-05-02 00:07:38 +00:00
Owner

Replying to gdt:

One thing I noticed about flogtool is that when starting it, the past is apparently in the past, and not accessible. That sort of makes sense, but it somewhat defeats the purpose of logging.

Did you see the --catch-up option to flogtool tail? It only outputs entries that are still in memory, but that's often enough.

Replying to [gdt](/tahoe-lafs/trac-2024-07-25/issues/1936#issuecomment-91271): > One thing I noticed about flogtool is that when starting it, the past is apparently in the past, and not accessible. That sort of makes sense, but it somewhat defeats the purpose of logging. Did you see the `--catch-up` option to `flogtool tail`? It only outputs entries that are still in memory, but that's often enough.
Author

Let's open a separate ticket for integrating tahoe-lafs logging with syslog.

Let's open a separate ticket for integrating tahoe-lafs logging with syslog.

Related: #1974 (wui: the "report an incident" feature doesn't submit incident reports)

Related: #1974 (wui: the "report an incident" feature doesn't submit incident reports)
Sign in to join this conversation.
No Milestone
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: tahoe-lafs/trac-2024-07-25#1936
No description provided.