add a "censor" command to filter out sensitive information from log files #562
Reference: tahoe-lafs/trac-2024-07-25#562
Per [//pipermail/tahoe-dev/2008-December/000946.html tahoe-dev/2008-December/000946.html], it would be good to omit the introducer furl from the log file.
This is part of a cluster of tickets including: #562, #563, #685, #1008, #1904, and #1989.
If you like this bug, you might also like #823.
If you like this bug, you might also like #860.
First, note that the log file that inspired this ticket is here: [//pipermail/tahoe-dev/attachments/20081222/20cc919e/attachment-0001.html]
The tahoe-lafs code itself, unless I'm missing something, doesn't ever print the introducer_furl to a log. I notice that there's one exception in there with a censored furl; perhaps that's an artifact from how things were then, or something that foolscap is doing? I'll look into that more thoroughly later.
I do notice that the storage server furls are also censored in the motivating log file. I don't mind having them there in my log files, and, as Zooko points out in that thread, censoring too much makes the log files less useful. Maybe this can be a configuration switch -- if paranoid logging is turned on, then IP addresses, storage server furls, storage indices/verify caps are censored somehow, and if not they aren't.
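A minimal sketch of what such a switch could look like, written against Python's standard `logging` module rather than foolscap's logging system (which is what Tahoe-LAFS actually uses). The class name `ParanoidFilter`, the `paranoid` flag, and the IPv4 pattern are all assumptions made up for illustration.

```python
import logging
import re

# Rough pattern for IPv4 addresses; an assumption for illustration only.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


class ParanoidFilter(logging.Filter):
    """Censor IPv4 addresses in log records when paranoid mode is on."""

    def __init__(self, paranoid):
        super().__init__()
        self.paranoid = paranoid  # hypothetical configuration switch

    def filter(self, record):
        if self.paranoid:
            # Render the message first so formatted arguments are censored too.
            record.msg = IPV4_RE.sub("<censored-ip>", record.getMessage())
            record.args = ()
        return True  # never drop the record, only rewrite it
```

With `paranoid=False` the filter passes records through untouched, so the logs stay as useful as they are today.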
..alternatively, maybe there's a way that we could add a tool to censor logs after they've been created.
For example, you can do
to post-process logs that way. So maybe you could, if you wanted a censored log snippet to post to tahoe-dev or on the Trac, do something like
and have flogtool (or whatever) obfuscate the SIs, furls, and so on. Of course, it's probably much harder to do it that way.
Censorship in a running node is relatively easy, as you can easily determine what is what as it is being logged, and censor accordingly. Censorship after the fact is much harder, because you need to be able to reliably determine whether a certain string is a furl, a storage index, an IP address, something else that should be censored, or nothing at all. It seems to be closer to what I as a user would want, though; if I want to have a useful, low-effort log to attach to a bug report, I shouldn't have to run my node such that it never produces logs with information that might help me later, nor should I have to stop, reconfigure, and restart my node, then hope that the problem reappears.
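To make the difficulty concrete, here is a rough sketch of after-the-fact censoring by pattern matching. It assumes furls have the shape pb://<tubid>@<location hints>/<swissnum> and that IP addresses are IPv4 dotted quads; the function name `censor_line` and the replacement strings are invented for this example, and storage indices are left out because they are hard to tell apart from other base32 strings.

```python
import re

# Assumed shapes, for illustration only: a furl looks like
# pb://<tubid>@<location hints>/<swissnum>; IPv4 addresses are dotted quads.
FURL_RE = re.compile(r"pb://[a-z2-7]+@[^\s/]*/[A-Za-z0-9_\-]+")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def censor_line(line):
    """Replace anything that looks like a furl or an IPv4 address."""
    # Furls go first because their location hints contain IP addresses.
    line = FURL_RE.sub("<censored-furl>", line)
    line = IPV4_RE.sub("<censored-ip>", line)
    return line
```

Note that a pattern like this cannot tell an IP address from any other dotted quad, so a four-part version number such as 0.4.2.1 would be censored as well. That is exactly the reliability problem described above.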
Kevan,
I like your idea of creating a new 'flogtool censor' command.
What about tagging potentially sensitive information at logging time? For example, let's modify this type of log line
into
It will then be pretty easy to filter out IP addresses, furls, storage indexes and so on.
That would solve the problem.
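A small sketch of the tagging idea, under assumptions: the `[[kind:value]]` marker syntax and the helpers `tag`, `censor`, and `untag` are hypothetical, not the syntax proposed in this comment and not an existing foolscap or Tahoe-LAFS feature.

```python
import re

# Hypothetical marker syntax: [[kind:value]]. Not an existing foolscap feature.
TAG_RE = re.compile(r"\[\[(\w+):(.*?)\]\]")


def tag(kind, value):
    """Wrap a sensitive value so a later censor pass can find it."""
    return "[[%s:%s]]" % (kind, value)


def censor(line):
    """Replace the value of every tagged field, keeping only its kind."""
    return TAG_RE.sub(lambda m: "<censored-%s>" % m.group(1), line)


def untag(line):
    """Strip the markers but keep the values, for uncensored reading."""
    return TAG_RE.sub(lambda m: m.group(2), line)
```

Because the code that writes the log knows what each value is, the censor pass no longer has to guess. The cost is that every call site that logs something sensitive has to be changed, and a value containing `]]` would need escaping.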
I haven't had much time to play with the censorer lately, but it's more or less functional now, with that idea. I'm hoping I can have some patches and tests for people to play with by the end of this weekend.
A correct solution to this will probably need to be implemented in foolscap, since it turns out that a lot of the compromising log entries come from there.
David-Sarah suggested that foolscap could offer callers of its logging system a way to mark certain log messages (or certain parts of certain log messages) as sensitive, so 'flogtool censor' or whatever would know to censor them. For example,
You'd basically need to do the following to solve this ticket, if you wanted to do it as above:
Between GSoC and school, I'm not going to have time to do all of that before 1.7 is due, so I'm unaccepting this ticket in case someone else wants to finish what I've started. I implemented 2, but as 'tahoe censor'. I'm attaching that, and the tests I wrote for it, to this ticket -- maybe they'll be useful somehow to whoever accepts this ticket. If I do get time, I'll re-accept it and continue working on it.
Attachment censor.darcspatch.txt (10011 bytes) added
implementation of 'tahoe censor'
Attachment tests.darcspatch.txt (20889 bytes) added
tests for 'tahoe censor'
It sounds from Kevan's comment:68662 as though he would not recommend committing these patches to Tahoe-LAFS trunk. Therefore I'm unsetting "review-needed".
Title changed from "censor introducer furl from log files" to "add a "censor" command to filter out sensitive information from log files".
Other potentially sensitive information that shows up in foolscap logs (including incident report files):
This issue is interfering with debugging #1670, because a user has reported an occurrence of #1670, but their incident report files contain information which is sensitive to them, so they don't want their flog files posted to the issue tracker.