twisted.web logs the uri on some exceptional conditions, leading to a privacy leak in logfiles #887

Closed
opened 2010-01-09 04:41:24 +00:00 by zooko · 2 comments

We have a policy of not logging filenames or caps into our logging system. This is very useful, because then users who want to report a problem can send us their log files, or let us connect a foolscap log watcher tool to their running Tahoe-LAFS node, without exposing their filenames or capabilities to us tahoe-lafs developers. However, I just noticed that twisted.web logs the URI in some error cases, which means the twistd.log file can have these privacy-sensitive strings in it. I noticed because I was looking at a twistd.log file and it said:

2009-12-17 07:59:14.525Z [HTTPChannel,162,207.7.153.173] Unhandled Error
        Traceback (most recent call last):
        Failure: exceptions.RuntimeError: Producer was not unregistered for /uri/URI:CHK:dskdfkdsfdsf:skjhfsdfhdafkjhdskfjhskjdfhskjfhdksjhfkshf:3:10:6069379?save=true&filename=02.%E5%B7%AE%E4%B8%8D%E5%A4%9A%E5%85%88%E7%94%9F.mp3

(Actually I censored the cap itself when posting this ticket.)

Here is the twisted.web line that logs the uri:

http://twistedmatrix.com/trac/browser/trunk/twisted/web/http.py?rev=27335#L591

The error that is triggering this log message is #685 (RuntimeError: Producer was not unregistered), although there may well be other exceptional conditions that we might sometimes hit that could stimulate twisted to log the URI.

We have hitherto been treating the twistd.log file as a log file, potentially a source of useful diagnostic information, and inviting users to send theirs to us if they have problems. I guess in the short term we should stop doing that, although that could make it impossible to diagnose some things. In the long term we should systematically fix privacy and confidentiality leaks like this. (Also we should get rid of the twistd.log file entirely and make all logging go through the foolscap system. That is probably orthogonal to this ticket though.)

This was with the following versions of software:

         Nevow: 0.9.26
       Twisted: 2.5.0
      argparse: 0.8.0
      foolscap: 0.4.2
      platform: Linux-Ubuntu_8.04-i686-32bit
     pyOpenSSL: 0.6
    pycryptopp: 0.5.16-r669
        python: 2.5.2
        pyutil: 1.3.20
    setuptools: 0.6c8
    simplejson: 1.7.3
  tahoe-server: 1.4.1
       twisted: 2.5.0
     z-base-32: 1.0.1
          zfec: 1.4.0-4
zope.interface: 3.3.1
zooko added the
unknown
major
defect
1.4.1
labels 2010-01-09 04:41:24 +00:00
zooko added this to the undecided milestone 2010-01-09 04:41:24 +00:00

One idea: we could have our web Request handler erase request.uri, or censor it. If this happens after request.uri has been parsed into path components and query arguments, then I don't think any control flow will be affected, and any later log message that includes the URI should emit the censored string instead of the original.

This would probably go into allmydata.webish.MyRequest.requestReceived, right after the last usage of self.uri.
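
A minimal sketch of that idea, not the actual Tahoe-LAFS code. The helper name censor_uri, the "[CENSORED]" placeholder, and the FakeRequest stand-in are all hypothetical, and the sketch treats the URI as a plain string. It keeps only the first path segment (for example "/uri") and drops everything after it, since both the rest of the path and the query string can carry a cap or a filename.

```python
def censor_uri(uri):
    """Return a form of `uri` that is safe to write to a log.

    Only the first path segment is kept. The remaining path and the whole
    query string are replaced by a placeholder, because either one may
    contain a cap or a filename.
    """
    path = uri.split("?", 1)[0]
    segments = path.split("/")
    # For an absolute path such as "/uri/URI:CHK:...", segments[0] is "".
    if len(segments) > 2 or "?" in uri:
        return "/".join(segments[:2]) + "/[CENSORED]"
    return uri


class FakeRequest(object):
    """Hypothetical stand-in for the real request class, for illustration."""

    def __init__(self, uri):
        self.uri = uri

    def requestReceived(self):
        # The real handler would parse self.uri into a path and query
        # arguments here, before the attribute is overwritten.
        # After the last use of self.uri, replace it so that any later
        # log message can only see the censored form.
        self.uri = censor_uri(self.uri)
```

With this in place, a request for /uri/URI:CHK:...?save=true&filename=... would be left with a uri attribute of "/uri/[CENSORED]" once requestReceived has run, while a short path such as "/status" would pass through unchanged.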

tahoe-lafs added
code-frontend-web
and removed
unknown
labels 2010-02-01 19:59:44 +00:00
Author

duplicate of #685

zooko added the
duplicate
label 2013-01-14 09:05:31 +00:00
zooko closed this issue 2013-01-14 09:05:31 +00:00
Reference: tahoe-lafs/trac-2024-07-25#887