Double Encoding in HTML in File Names in WUI #1143

Closed
opened 2010-08-01 05:00:32 +00:00 by chrisp · 8 comments
chrisp commented 2010-08-01 05:00:32 +00:00
Owner

My file "zumby-bumby ; mail blaggy@mailinator.com < /etc/hosts" in the pubgrid root http://pubgrid.tahoe-lafs.org/uri/URI%3ADIR2%3Actmtx2awdo4xt77x5xxaz6nyxm%3An5t546ddvd6xlv4v6se6sjympbdbvo7orwizuzl42urm73sxazqa/ is listed as "zumby-bumby ; mail blaggy@mailinator.com < /etc/hosts" in the listing.

That is, the < got converted to < and then that ampersand got converted to &. Thus, we end up with &lt;.

HTML entity-encoding is good because it can stop XSS, but be careful: it increases the size of memory you have to allocate to handle the request. Also, double-encoding is just plain incorrect. Single-encode, and place limits on how much memory you will allocate to do the encoding. One way to do this is to include input size limits as part of your input validation framework.

My file "zumby-bumby ; mail blaggy@mailinator.com < /etc/hosts" in the pubgrid root <http://pubgrid.tahoe-lafs.org/uri/URI%3ADIR2%3Actmtx2awdo4xt77x5xxaz6nyxm%3An5t546ddvd6xlv4v6se6sjympbdbvo7orwizuzl42urm73sxazqa/> is listed as "zumby-bumby ; mail blaggy@mailinator.com &lt; /etc/hosts" in the listing. That is, the < got converted to &lt; and then that ampersand got converted to &amp;. Thus, we end up with &amp;lt;. HTML entity-encoding is good because it can stop XSS, but be careful: it increases the size of memory you have to allocate to handle the request. Also, double-encoding is just plain incorrect. Single-encode, and place limits on how much memory you will allocate to do the encoding. One way to do this is to include input size limits as part of your input validation framework.
tahoe-lafs added the
unknown
major
defect
1.7.1
labels 2010-08-01 05:00:32 +00:00
tahoe-lafs added this to the undecided milestone 2010-08-01 05:00:32 +00:00

which tools did you use to add and list this file? CLI or WUI?

which tools did you use to add and list this file? CLI or WUI?
davidsarah commented 2011-08-24 02:58:12 +00:00
Author
Owner

I've just spotted the likely cause of this bug: at several places in [DirectoryAsHTML.render_row]source:src/allmydata/web/directory.py@5185#L668, we use T.a(href=...)html.escape(name)). This is wrong because nevow already escapes the argument to T.a (if it is a string).

I think it only affects the WUI.

I've just spotted the likely cause of this bug: at several places in [DirectoryAsHTML.render_row]source:src/allmydata/web/directory.py@5185#L668, we use `T.a(href=...)html.escape(name))`. This is wrong because nevow already escapes the argument to `T.a` (if it is a string). I think it only affects the WUI.
tahoe-lafs added
code-frontend-web
and removed
unknown
labels 2011-08-24 02:58:12 +00:00
tahoe-lafs modified the milestone from undecided to 1.10.0 2011-08-24 02:58:12 +00:00
tahoe-lafs modified the milestone from 1.11.0 to 1.10.0 2012-04-01 03:44:41 +00:00

I just used freedomsponsors.org to offer USD 25.00 to whoever fixes this issue: http://www.freedomsponsors.org/core/offer/24/double-encoding-in-html-in-file-names-in-wui?alert=SPONSOR&c=s

I just used freedomsponsors.org to offer USD 25.00 to whoever fixes this issue: <http://www.freedomsponsors.org/core/offer/24/double-encoding-in-html-in-file-names-in-wui?alert=SPONSOR&c=s>
mk.fg commented 2012-10-16 12:43:31 +00:00
Author
Owner

David-Sarah was right, as strings passed as stan in Nevow are automatically escaped, unless raw() marker is used.

Nevow uses it's own escapeToXML method to do that though, which leaves single/double quotes intact (unless string is used as an attribute), so it doesn't match twisted.web.html.escape() 1-to-1, but I think it should be okay, as it doesn't affect rendering.

Fixed now in 1143_double_encoding_html_filenames branch (non-official repo), github pull request 16.

David-Sarah was right, as strings passed as stan in Nevow are automatically escaped, unless raw() marker is used. Nevow uses it's own escapeToXML method to do that though, which leaves single/double quotes intact (unless string is used as an attribute), so it doesn't match twisted.web.html.escape() 1-to-1, but I think it should be okay, as it doesn't affect rendering. Fixed now in [1143_double_encoding_html_filenames branch](https://github.com/mk-fg/tahoe-lafs/tree/1143_double_encoding_html_filenames) (non-official repo), [github pull request 16](https://github.com/tahoe-lafs/tahoe-lafs/pull/16).

Since Nevow's escapeToXML method leaves single/double quotes intact, could that be used to malicious craft input which would confuse the HTML parser by having embedded quote characters?

Since Nevow's `escapeToXML` method leaves single/double quotes intact, could that be used to malicious craft input which would confuse the HTML parser by having embedded quote characters?
mk.fg commented 2012-10-21 18:15:19 +00:00
Author
Owner

Replying to zooko:

Since Nevow's escapeToXML method leaves single/double quotes intact, could that be used to malicious craft input which would confuse the HTML parser by having embedded quote characters?

I don't really see how and don't think I've heard of such things happening, maybe example of what you mean would be helpful?

I can imagine it happening only if malicious person can insert markup somewhere else, i.e. something like this:

<p>filename with <span randomattr="</p>
<p>filename_that_should_be_hidden</p>
<p>">visible_filename_ending</p>

But then again, I think if any tags can be inserted, it'll be something like

Replying to [zooko](/tahoe-lafs/trac-2024-07-25/issues/1143#issuecomment-78972): > Since Nevow's `escapeToXML` method leaves single/double quotes intact, could that be used to malicious craft input which would confuse the HTML parser by having embedded quote characters? I don't really see how and don't think I've heard of such things happening, maybe example of what you mean would be helpful? I can imagine it happening only if malicious person can insert markup somewhere else, i.e. something like this: ``` <p>filename with <span randomattr="</p> <p>filename_that_should_be_hidden</p> <p>">visible_filename_ending</p> ``` But then again, I think if any tags can be inserted, it'll be something like <script> and the game is over, no amount of quote escaping should make any difference. Interesting thing to note that the source for this very page contains unescaped quotes in user-submitted content.
davidsarah commented 2012-10-25 01:21:50 +00:00
Author
Owner

Fixed by changeset:1df7f114b7094dab.

Fixed by changeset:1df7f114b7094dab.
tahoe-lafs added the
fixed
label 2012-10-25 01:21:50 +00:00
davidsarah closed this issue 2012-10-25 01:21:50 +00:00
davidsarah commented 2012-10-25 01:34:02 +00:00
Author
Owner

Replying to zooko:

Since Nevow's escapeToXML method leaves single/double quotes intact, could that be used to malicious craft input which would confuse the HTML parser by having embedded quote characters?

No, because the input does not occur in an attribute or other quoted context.

Replying to [zooko](/tahoe-lafs/trac-2024-07-25/issues/1143#issuecomment-78972): > Since Nevow's `escapeToXML` method leaves single/double quotes intact, could that be used to malicious craft input which would confuse the HTML parser by having embedded quote characters? No, because the input does not occur in an attribute or other quoted context.
Sign in to join this conversation.
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Reference: tahoe-lafs/trac-2024-07-25#1143
No description provided.