Tahoe URIs and gateway URLs are too long and ugly #882
Reference: tahoe-lafs/trac-2024-07-25#882
Add all your complaints about the length, ugliness, spamminess, and other usability niggles of Tahoe URLs here :-)
#217 and its comments describe proposals, discussed further in NewMutableEncodingDesign, for changing the mutable file protocol. This seems to have become associated with the proposal to use ECDSA, but only some of these suggested protocols actually depend on the shorter public keys enabled by ECDSA. (There are other performance reasons to use ECDSA anyway.)
So, here are the comments from there that relate specifically to URL or cap length, starting with /tahoe-lafs/trac-2024-07-25/issues/5279#comment:-1 by Zooko:
David-Sarah: this length is incorrect. The cryptovalue part appears to be base32, i.e. 130 bits. NewMutableEncodingDesign can't achieve that; the cryptovalue would have to be twice as long as this.
Note that in the last comment, Zooko was actually talking about a different proposal, so he wasn't actually incorrect (although that length of read cap is indeed unachievable).
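The 130-bit figure follows from base32 carrying 5 bits per character, so a 26-character cryptovalue spans 130 bits. A minimal sketch of that arithmetic, applied to the two base32 fields of the read-only directory cap quoted elsewhere in this ticket; the helper name `base32_bits` is made up for illustration and is not part of Tahoe:

```python
# Each base32 character carries log2(32) = 5 bits.
BITS_PER_BASE32_CHAR = 5

def base32_bits(field: str) -> int:
    """Return the number of bits spanned by a base32 string (hypothetical helper)."""
    return len(field) * BITS_PER_BASE32_CHAR

# Cap taken from the pubgrid URL quoted elsewhere in this ticket.
cap = "URI:DIR2-RO:ixqhc4kdbjxc7o65xjnveoewym:5x6lwoxghrd5rxhwunzavft2qygfkt27oj3fbxlq4c6p45z5uneq"

# Fields 2 and 3 of the colon-separated cap are the base32 parts.
first, second = cap.split(":")[2:4]
first_bits = base32_bits(first)
second_bits = base32_bits(second)

print(len(first), "chars ->", first_bits, "bits")
print(len(second), "chars ->", second_bits, "bits")
```

Note that this counts the bits spanned by the encoding, which can be slightly more than the length of the underlying value because of padding bits in the last character.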
/tahoe-lafs/trac-2024-07-25/issues/5279#comment:-1 by Brian:
Zooko replied in /tahoe-lafs/trac-2024-07-25/issues/5279#comment:-1
This single-letter prefix seems like a good idea to me.
Zooko:
...
Dig those horizontal scrollbars. Sorry.
Zooko:
swillden:
zooko:
Replying to davidsarah:
Here's why: preimage attacks against a hash function can be performed simultaneously against multiple targets. That is, for N target files, the work factor of a brute force attack against a K-bit hash that succeeds with probability p is p/N * 2^K.
In the case of encryption, we can use a longer key for the actual cipher than for the secret value in the read cap. Provided that the cipher key is long enough and is derived using a salt that is unique for each file, this prevents multiple-target attacks. But for a hash we can't do that, because the attacker would be in control of the salt.
Therefore, if the work factor needed for attacks against confidentiality is p * 2^K (which we assume to be sufficient even for low p), then to get at least the same work factor for attacks against integrity, we need at least K + log2(N) bits, where N is the number of targets available to an attacker.
In my opinion, we should assume N to be at least 2^50. In that case, for an optimal protocol that obtains integrity from every bit of the cryptovalue, the minimum cryptovalue length of a read cap would be 178 bits for a 2^128 security level.
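The arithmetic behind the 178-bit figure can be sketched as follows. Work factors are handled as base-2 logarithms to keep the numbers readable; the function names are illustrative only, and N = 2^50 is the assumption stated in the comment above, not a measured value.

```python
import math

def multi_target_work_bits(k_bits: int, n_targets: int, p: float = 1.0) -> float:
    """log2 of the brute-force work factor p/N * 2^K against N targets."""
    return k_bits + math.log2(p) - math.log2(n_targets)

def min_integrity_bits(security_bits: int, n_targets: int) -> int:
    """Hash bits needed so the multi-target work factor stays at 2^security_bits."""
    return security_bits + math.ceil(math.log2(n_targets))

n = 2 ** 50  # assumed number of targets available to an attacker

# A 128-bit hash attacked across all N targets at once (p = 1):
print(multi_target_work_bits(128, n))

# Length needed to restore a 2^128 work factor: K + log2(N).
print(min_integrity_bits(128, n))
```

With p = 1, a 128-bit hash gives only a 2^78 work factor against 2^50 targets, which is why the extra 50 bits are needed.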
zooko:
Hmm. Are we sure that it is actually the length that is triggering the spam filter?
In any case, given the argument in the previous comment, it is quite possible that the minimum feasible read cap length would still be long enough (or whatever) to trigger the filter. If that were the case, there would be no point in worrying about something we can't fix.
Brian:
Zooko:
David-Sarah:
[That's all the comments from #217.]
From NewCapDesign:
In reply to Zooko on tahoe-dev:
highlighted
on the right-hand half of the preceding space
and not a hand cursor
fully visible in the address bar, this check requires:
click, pause, click-drag to end, scan for mangling,
click, pause, click-drag to front, scan for mangling,
click to unselect.
(Maybe this is more clicks than necessary, but it's what I
actually did. The pauses are a habit to avoid Windows'
horrible double-click behaviour in text fields.)
Phew. That should have been:
Replying to davidsarah, comment:6:
Clarification: this depends on the fact that we would only truncate the final hash for each file, not the intermediate hashes used in Merkle trees or the hash of the UEB. Section 6 of this paper explains that for a Merkle-Damgård hash (such as SHA-256), there is a second-preimage attack with work factor p/B * 2^L, where B is the total number of blocks (of 64 bytes in the case of SHA-256) that have been hashed, and L is the hash output length in bits. However, since the intermediate chaining values are not truncated, this could only be applied with L = 256, and so it shouldn't be a threat as long as SHA-256 is secure.
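To see why the untruncated chaining values matter, the work factor p/B * 2^L can be evaluated in log2 form. This is only a sketch: the block count B = 2^50 is an arbitrary illustrative value, not a figure from the paper or from Tahoe, and the function name is made up.

```python
import math

def second_preimage_work_bits(l_bits: int, blocks_hashed: int, p: float = 1.0) -> float:
    """log2 of p/B * 2^L for a Merkle-Damgard hash with L-bit chaining values."""
    return l_bits + math.log2(p) - math.log2(blocks_hashed)

B = 2 ** 50  # hypothetical total number of 64-byte blocks hashed

# Chaining values are not truncated, so L = 256 for SHA-256.
full = second_preimage_work_bits(256, B)
print(full)
```

Even with that many hashed blocks, L = 256 leaves the work factor well above 2^128.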
"You've gotta get a tiny URL." --Jacob Appelbaum
Attachment wiki.html.png (901 bytes) added
Raw CAP as a QR code
Attachment wiki.html-url.png (1135 bytes) added
CAP with full URL
I've been looking at encoding caps in QR codes as a way to publish them, get them into phones and other devices, etc. You can see from the attached images the codes are large. They could probably be further reduced, but these were generated with "small" size at http://qrcode.kaywa.com/. I don't know how well they'd reproduce if printed.
The URL is also too large to fit into a single SMS (at 141 characters).
Sizes of QR codes for various raw bit lengths:
Replying to davidsarah:
Hmm, these sizes seem larger than they should be. For example, the last link above has 1177 data pixels (37*37 - 192 for the corner registration marks). That's a lot of redundancy to encode 345 bits. I generated the URLs and bit lengths by using the "telephone number" encoding at qrcode.kaywa.com, finding the largest number of digits that could be encoded at each size, and converting to a bit length. Maybe that method is incorrect, I'll check.
Replying to davidsarah, comment:17:
Actually that was the next to last link, so it was encoding 275 bits.
The second link is the same size as the example on Wikipedia, so in that case I know that the number of additional fixed bits is 126, leaving 33*33 - 192 - 126 = 771 data pixels to encode 209 bits. I suspect that the problem is that it's actually encoding each digit with 8 bits (which for this example would be 504 bits in 771 data pixels).
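The pixel counts in the last two comments can be checked with a few lines of arithmetic. This sketch reuses the accounting given there (192 pixels for the corner registration marks, 126 additional fixed bits at the 33x33 size). Two things are assumptions on my part: the 63-digit figure is inferred from 504 bits at 8 bits per digit, and the digits-to-bits conversion shown is a guess at how the bit lengths were derived.

```python
import math

def data_pixels(side: int, registration: int = 192, fixed: int = 0) -> int:
    """Pixels left for data, using the accounting from the comments above."""
    return side * side - registration - fixed

def digits_to_bits(digits: int) -> int:
    """Whole bits of information carried by a decimal string of this length."""
    return math.floor(digits * math.log2(10))

print(data_pixels(37))             # 37*37 - 192
print(data_pixels(33, fixed=126))  # 33*33 - 192 - 126

digits = 504 // 8                  # digits implied by 504 bits at 8 bits each
print(digits, digits_to_bits(digits), digits * 8)
```

Under those assumptions, 63 digits carry 209 bits of information but occupy 504 bits when stored one byte per digit, which matches the suspicion voiced above.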
Ah, I should have been using the URL option. For some reason that produces smaller QR codes than the phone option even when both are restricted to decimal digits.
Some old notes about this are on ticket #102. Please read them!
Tantek collected some examples of tiny urls being used in the wild in print:
http://tantek.pbworks.com/ShortURLPrintExample
I have previously noticed when I post comments on people's blogs and give a tahoe-lafs url such as http://pubgrid.tahoe-lafs.org/uri/URI:DIR2-RO:ixqhc4kdbjxc7o65xjnveoewym:5x6lwoxghrd5rxhwunzavft2qygfkt27oj3fbxlq4c6p45z5uneq/blog.html as my "home page URL" that my comment gets automatically canned for being spam.
Just now I received this comment on twitter:
@zooko you must trigger some URI overflow into Google Reader's back end, it's impossible to subscribe to your blog's feed!
(http://twitter.com/seb_martini/status/9347963721617408)
Followed by:
@zooko never mind, seems to work when shortened http://bit.ly/dURKfO ;)
Where http://bit.ly/dURKfO is a tinyurl that currently elicits http://pubgrid.tahoe-lafs.org/uri/URI%3ADIR2-RO%3Aixqhc4kdbjxc7o65xjnveoewym%3A5x6lwoxghrd5rxhwunzavft2qygfkt27oj3fbxlq4c6p45z5uneq/blog.xml .
So apparently (I haven't confirmed this) the current tahoe-lafs URLs are incompatible with Google Reader.
(http://twitter.com/naesten/status/12352362991583232)
@zooko: wow this is a long URL you've got here <http://is.gd/dUlo9> -- its almost as long as a freenet URI!
Of those characters, the base URL (http://insecure.tahoe-lafs.org/uri/URI:DIR2-RO:ixqhc4kdbjxc7o65xjnveoewym:5x6lwoxghrd5rxhwunzavft2qygfkt27oj3fbxlq4c6p45z5uneq/blog.html) only has ':', and the rest come from a fragment.
On 13/05/12 08:55, Michael Rogers wrote on tahoe-dev:
Ticket retargeted after milestone closed (editing milestones)