replace google chart in wui with d3.js: it leaks information #1942

Closed
opened 2013-04-10 18:33:35 +00:00 by leif · 18 comments

The timing chart on the mutable file upload status page is rendered by http://chart.apis.google.com.

This reveals the IDs and latencies of storage servers to Google, as well as anyone able to observe the network between Google and the web browser.

I think this is generally undesirable, but it is particularly problematic for users of grids hosted on i2p or Tor hidden services.

It is possible (if not likely) that anonymity-desiring users are running tahoe under an LD-preload tool (such as torsocks/usewithtor) but are connecting to their WUI using a non-torified browser because they expect it to only connect to localhost. When they browse to the mutable file upload status page containing this chart, they'll inadvertently reveal themselves to be a user of the grid.

Warner suggested in email that this chart should instead be rendered locally with d3.js, which is already being used for the download timeline.

The code which constructs the google chart URL is in src/allmydata/web/status.py and might also be used on pages besides the mapupdate page where I noticed it.
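
To illustrate why a URL-based chart leaks this information, here is a minimal sketch, not the actual code in src/allmydata/web/status.py. The helper names, the sample server IDs, and the exact query parameters chosen are assumptions for illustration. The point is only that the labels and data travel inside the URL the browser fetches from the third-party host.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_chart_url(server_times):
    """Hypothetical helper: encode per-server latencies into a chart URL."""
    ids = list(server_times)
    params = {
        "cht": "bhs",       # chart type (illustrative value)
        "chs": "400x300",   # chart size (illustrative value)
        # data series: one latency per server
        "chd": "t:" + ",".join("%.3f" % server_times[i] for i in ids),
        # axis labels: the storage server IDs themselves
        "chxl": "0:|" + "|".join(ids),
    }
    return "http://chart.apis.google.com/chart?" + urlencode(params)

def visible_to_third_party(url):
    """Show what the remote host (and any network observer) can read."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    return {
        "host": parts.netloc,
        "labels": query["chxl"][0],
        "data": query["chd"][0],
    }

url = build_chart_url({"abcd1234": 0.125, "wxyz5678": 0.5})
leaked = visible_to_third_party(url)
```

In this sketch, `leaked` contains both server IDs and both latencies, recovered purely from the request URL.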

leif added the unknown, normal, defect, 1.9.2 labels 2013-04-10 18:33:35 +00:00
leif added this to the undecided milestone 2013-04-10 18:33:35 +00:00
Owner

Further, the very fact that anything is downloaded from any third-party server is a serious privacy bug.

daira commented 2013-04-10 22:12:14 +00:00
Owner

+1. It also potentially gives Google access to other pages (that are related in the browsing history) of the gateway's local origin.

We should put a note in source:docs/known_issues.rst for 1.10, and switch to using d3.js for 1.11.

There don't appear to be any uses of Google Charts other than the mutable servermap update status page.


Adding to 1.10 to remind me to update known_issues.rst. Will retarget to 1.11 after that to cover the d3.js rewrite.

BTW I was careful to only have this chart on a page whose URL has no secrets (it just has the storage index, which is also exposed to storage servers), but I agree that a JS-enabled browser or a non-Tor-ified browser would experience a privacy/access problem. Oops.

warner added code-frontend-web and removed unknown labels 2013-04-11 11:50:00 +00:00
warner modified the milestone from undecided to 1.10.0 2013-04-11 11:50:00 +00:00

Oh, wait, I remember thinking about this. No, the chart that is loaded is an IMG tag (and google generally returns a PNG). Everything Leif said is correct, but it does not give google access to the rest of the origin (if it were including JS or CSS or something active, it would, but a plain IMG tag won't load anything active). I briefly had code to generate a PNG in the tahoe client itself, but that added a dependency on the PIL library which seemed a bit big.

I think d3.js is the right way to go: it doesn't make the python-side code any bigger, the JS library is already in our tree, and I'm ok with not giving timelines to folks who have JS turned off.

daira commented 2013-04-12 04:48:52 +00:00
Owner

It used to be possible to same-origin-attack a browser using JavaScript in an SVG file loaded by an img tag (http://www.librador.com/2011/03/09/Dangers-of-SVG-and-the-img-tag/), but apparently recent browsers do not load JavaScript in that case (http://stackoverflow.com/questions/7917008/xss-when-loading-untrusted-svg-using-img-tag).

Brian Warner <warner@lothar.com> commented 2013-04-15 05:28:12 +00:00
Owner

In changeset:3a18157456951841:

known_issues: document the google-chart-API privacy leak. Refs #1942.

Ok, that patch updates known_issues.rst. We can now turn this ticket into "replace the timing chart with local d3.js".

BTW, the leak was even smaller than I thought: the referrer URL only contains the count of mapupdate operations (basically any mutable-file read or write) since last node boot. It doesn't even include the storage-index of the file.

warner added task and removed defect labels 2013-04-15 05:32:43 +00:00
warner modified the milestone from 1.10.0 to 1.11.0 2013-04-15 05:32:43 +00:00
warner changed title from google chart in wui leaks information to replace google chart in wui with d3.js: it leaks information 2013-04-15 05:32:43 +00:00
daira commented 2013-04-15 16:11:07 +00:00
Owner

Reviewed the known_issues text, +1.

Author

Replying to Brian Warner <warner@…>:

In changeset:3a18157456951841:

#CommitTicketReference repository="git" revision="3a18157456951841d268dfc00ad63f8c97a0056d"
known_issues: document the google-chart-API privacy leak. Refs #1942.

I think instead of "reveal your use of Tahoe to the outside world" it would be better to say "reveal your use of that grid to the outside world", since the chart URL contains storage server IDs.

Brian Warner <warner@lothar.com> commented 2013-04-23 23:40:25 +00:00
Owner

In changeset:02975d188735a59f:

known_issues: update chart-API text, with suggestions from Leif. refs #1942
Owner

Attachment 1942-wui-replace-google-chart-with-d3js-initial.patch (2834 bytes) added

Owner

Starting to work on it... see https://tahoe-lafs.org/trac/tahoe-lafs/attachment/ticket/1942/1942-wui-replace-google-chart-with-d3js-initial.patch

Added the needed JS libraries to source:src/allmydata/web/map-update-status.xhtml and a div id like in source:src/allmydata/web/download-status-timeline.xhtml.

The new code in source:src/allmydata/web/static/update_status_timing_chart.js could be similar to what is in source:src/allmydata/web/static/download_status_timeline.js.

The functions to change in source:src/allmydata/web/status.py seem to be MapupdateStatusPage.render_timing_chart and MapupdateStatusPage._timing_chart.

Is the code in DownloadStatusPage.child_event_json (source:src/allmydata/web/status.py) creating the data that source:src/allmydata/web/static/download_status_timeline.js is using?
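
For a locally rendered chart, the Python side would only need to hand the browser the timing data, with no third-party request involved. The following is a rough sketch under assumptions: `timing_events_json`, its arguments, and the JSON field names are hypothetical and are not taken from status.py or from the existing download timeline code.

```python
import json

def timing_events_json(started, per_server_times):
    """Hypothetical helper: serialize per-server query timings as JSON.

    started: absolute start time of the mapupdate operation (seconds).
    per_server_times: dict mapping server ID to (start, finish) in seconds.
    Times are emitted relative to `started` so the browser-side chart
    can use them directly as x-axis values.
    """
    events = [
        {
            "server": server_id,
            "start": round(t_start - started, 6),
            "finish": round(t_finish - started, 6),
        }
        for server_id, (t_start, t_finish) in sorted(per_server_times.items())
    ]
    return json.dumps({"servers": events})

payload = timing_events_json(
    100.0, {"b": (100.5, 101.0), "a": (100.25, 100.75)}
)
```

A script such as update_status_timing_chart.js could then fetch this JSON from the node itself and draw the bars with d3.js, keeping all requests on localhost.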


I'm provisionally moving this into the 1.12 milestone, in case we want to make a push for #1010 anonymous = true, which I think would depend upon making this fix.

If so, I think it'd be acceptable to change the WUI to not serve that IMG tag when we're in anonymous mode. That'd be a bit quicker of a fix than properly re-implementing the chart.

We might not treat 1.12 as the "client-side Tor enabled" release, in which case we can push this out a bit further.

Note that if you're using a non-Tor-ified browser to view files coming out of your Tahoe client, then those files could use their own image tags to leak your IP address. Not a reason to not fix this, but something to remain aware of.
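
The quicker fix described above (not serving the IMG tag in anonymous mode) could look roughly like this. This is a sketch only: the function signature and the `anonymous` flag are assumptions for illustration, and the real renderer in status.py is structured differently.

```python
from html import escape

def render_timing_chart(chart_url, anonymous):
    """Hypothetical renderer: omit the third-party image in anonymous mode."""
    if anonymous:
        # Emit nothing, so the browser never contacts the chart host.
        return ""
    # Escape the URL so '&' and quotes are safe inside the attribute.
    return '<img src="%s" />' % escape(chart_url, quote=True)

hidden = render_timing_chart("http://x/chart?a=1&b=2", anonymous=True)
shown = render_timing_chart("http://x/chart?a=1&b=2", anonymous=False)
```

With the flag set, the page contains no reference to the external host at all, which is the property that matters for Tor users.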

warner modified the milestone from soon to 1.12.0 2016-08-30 01:33:39 +00:00

Note that Google deprecated the charts API in 2012, so this feature is lucky to still be working anyways.

I'm going to delete the chart now (the time axis of the chart is pretty hard to read anyways), and then repurpose this ticket to be about adding a new chart (rendered locally with d3.js).


Attachment servermap-chart.png (64473 bytes) added

sample chart as rendered by (old) Google charts API

Brian Warner <warner@lothar.com> commented 2016-09-02 23:34:18 +00:00
Owner

In ed91398/trunk:

WUI: disable google timing chart on mapupdate page

The google image chart API has been deprecated since 2012, sending the
URL to google leaks server IDs and the client's IP address (especially
important when the client is otherwise behind Tor), and the X-axis has
no units anyways.

refs ticket:1942 , which is both about removing the URL-based chart, and
eventually replacing it with a browser-rendered d3.js-based one

Moving this out of 1.12, now that it's not a privacy threat.

warner modified the milestone from 1.12.0 to eventually 2016-09-02 23:35:10 +00:00

Since the offending functionality was removed years ago, I'm closing this ticket.

If someone wants to re-add this functionality, I think they should re-add it to a brand new Web UI that is packaged separately from the core Tahoe-LAFS software and uses only public, documented network APIs to retrieve the necessary information.

exarkun added the somebody else's problem label 2021-05-18 17:24:05 +00:00
Reference: tahoe-lafs/trac-2024-07-25#1942