build "sharing slots" / use mutable files as primitives for sharing messages #152
Reference: tahoe-lafs/trac-2024-07-25#152
We were talking with Peter yesterday about what sort of sharing UI he'd like
to use. In exchanging documents with a colleague, he said he'd like to take
the spreadsheet that he's editing and push a button that says "Share This
File", and immediately get a window with a string that he can IM or email to
somebody. He doesn't want to wait for a file to finish uploading or even
encoding, because he wants to be able to walk away from the process once he's
IM'ed this string to his friend.
We can do this. The caveats are that his computer must stay online until the
upload finishes, and that his friend might not be able to download the file
right away (i.e. if he uses the IM'ed string too quickly). If the download is
not yet available, the friend should get an ETA or some sort of progress
message to let them know when they should start downloading it, so that they
can plan their time ("do I go get coffee, or go out to lunch, or come back
tomorrow?").
To build this, I'm thinking we start with an SSK-based mutable slot. The
"Share This File" button creates an SSK slot, fills it with some starting
data, and displays the SSK URI to the originating user. The slot is filled
with:
The originating client will modify the SSK slot every once in a while
(perhaps once every 10 to 60 seconds?) to update the ETA, and will eventually
fill in the URI.
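
As a rough illustration of the update cycle described above, here is a minimal Python sketch of a slot record that the originating client could rewrite periodically and that a reader of the slot could turn into a status line. The field names, the JSON serialization, and the `SlotRecord` / `describe` names are all assumptions made for this example; they are not the slot layout from the original proposal.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SlotRecord:
    """Hypothetical sharing-slot contents (all field names are assumptions)."""
    filename: str
    size_bytes: int
    percent_uploaded: int            # 0..100, updated by the originating client
    eta_seconds: Optional[int]       # estimated time until the upload finishes
    file_uri: Optional[str] = None   # filled in once the upload completes

    def to_bytes(self) -> bytes:
        # Serialize the record so it can be written into the mutable slot.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    @classmethod
    def from_bytes(cls, data: bytes) -> "SlotRecord":
        # Parse bytes read back from the slot.
        return cls(**json.loads(data.decode("utf-8")))


def describe(record: SlotRecord) -> str:
    """Status text a reader of the slot might display."""
    if record.file_uri is None:
        return (f"waiting for upload to complete: "
                f"{record.percent_uploaded}%, ETA {record.eta_seconds}s")
    return f"ready to download: {record.filename}"
```

The originating client would rewrite the slot with a fresh `SlotRecord` every 10 to 60 seconds, and set `file_uri` on the final write.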
The recipient's GUI should accept an SSK URI (with some framing information
to suggest that it is filled with data in this format) and read the slot to
see whether the file is available yet or not. There should be a "Retrieve
Shared File" button to which you can paste or drag the SSK URI, and it either
produces a window with "waiting for upload to complete: NN%, ETA XX", or
"downloading: NN%, ETA XX", or a file icon ready to be dragged somewhere.
These SSK slots should expire after a while, maybe a week or a month (perhaps
the "share this file" button should have an option somewhere to specify how
long the file will be available). The CHK file needs to last at least the
same duration, so perhaps it needs an extra purely-time-based lease (still
accounted to the originator, but not cancelled if they remove the file from
their vdrive (or never added it in the first place)).
How big are big spreadsheets? I have some small spreadsheets that are about 20 KB. If a file is less than a couple hundred KB, the upload of the file itself might complete faster than Peter can cut-and-paste the string and IM it to his friend. (Back-of-envelope 1s per file plus 23 KB/s, so maybe 2 seconds for a 40 KB file.)
But for sufficiently large files, this feature sounds cool.
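
The back-of-envelope model above (a fixed cost of about 1 second per file plus a transfer rate of 23 KB/s) can be written out as a tiny Python helper. The function name and parameter names are made up for this sketch; the two constants are the ones quoted in the comment.

```python
def estimated_upload_seconds(size_kb: float,
                             per_file_overhead_s: float = 1.0,
                             rate_kb_per_s: float = 23.0) -> float:
    """Fixed per-file overhead plus transfer time at a constant rate."""
    return per_file_overhead_s + size_kb / rate_kb_per_s


# Compare a few file sizes under the same assumptions.
for size in (20, 40, 200, 2000):
    print(f"{size} KB -> {estimated_upload_seconds(size):.1f} s")
```

Under these assumptions a 40 KB file takes a little under 3 seconds, while a 2 MB file takes well over a minute, which is where the sharing slot starts to pay off.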
Hm, actually, why doesn't his friend start downloading the file before Peter's computer has finished uploading the file? So the progress meter isn't telling you how far to go until you can start downloading, it is telling you how far to go until the file is completely downloaded. Also if the file is useful when incomplete (such as a movie or audio file), then the friend can start using it as soon as Peter's computer starts uploading it.
I guess I'm assuming that Microsoft products are incapable of creating any
file smaller than a few megabytes. I'm also assuming slow consumer-grade ADSL
uplinks.
I'd think that the user should be able to wait up to, say, 15 seconds (from
the time they push the button to the time they get an IM-able string). If
it's less than 2 seconds, then it will feel like their file is being
instantly transmitted, at least from the sender's point of view. The burden
of waiting is really being transferred to their friend, but most of that
latency is hidden from both parties by their own natural sloth :-). (the
longer they procrastinate before pushing the "download this file" link, the
better we look).
If the only thing we need to do is to generate a unique string (like a
Storage Index), then we can respond in a few milliseconds. I think we should
evaluate this time in absolute terms rather than how long it takes Peter to
subsequently cut-and-paste the string, since Peter is waiting on us before
that point, and only on himself after that point. I.e., he can't blame us for
how long it takes him to manipulate his IM client.
Starting the download before the upload finishes would be really slick. It
also won't work at all for our current CHK format, unless we allow the
recipient to download unverified data and keep it quarantined somewhere until
the hashes are uploaded and downloaded and checked. The CHK format has only
one place for verification data (the UEB hash inside the URI), and we can't
generate it until the very end.
Doing download-before-upload on SSK would need some clever work too.. like
signing each segment separately. Or, we could make the validation section
contain a hash tree over just the segments that have been encoded thus far,
with a signature on the root. As we encode more segments, we keep replacing
this tree with a larger one that covers more segments. When we finish
uploading, we'll have a bunch of segments, a complete merkle tree of hashes
(covering all segments), and a single signature on the root.
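
The growing-hash-tree idea can be sketched in Python as follows. This is only an illustration under stated assumptions: SHA-256 with one-byte leaf/node tags and duplicate-last-node padding are choices made for this example, not the hashing scheme of any Tahoe format, and the signature on the root is left out entirely (the uploader would sign each new root, and the reader would check that signature before trusting it).

```python
import hashlib
from typing import List, Optional


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaf_hashes: List[bytes]) -> bytes:
    """Root of a binary hash tree; an odd node out is paired with itself."""
    if not leaf_hashes:
        raise ValueError("need at least one segment")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


class GrowingTree:
    """Uploader side: the tree covers only the segments encoded so far."""

    def __init__(self) -> None:
        self.leaves: List[bytes] = []
        self.root: Optional[bytes] = None

    def add_segment(self, segment: bytes) -> bytes:
        self.leaves.append(_h(b"\x00" + segment))
        # Replace the old tree with a larger one; this root is what gets signed.
        self.root = merkle_root(self.leaves)
        return self.root


def verify_prefix(segments: List[bytes], signed_root: bytes) -> bool:
    """Reader side: check the segments received so far against a root."""
    return merkle_root([_h(b"\x00" + s) for s in segments]) == signed_root
```

A reader holding the first k segments and the root published after segment k can validate them immediately, without waiting for the rest of the upload.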
If this is an important use case, we should keep it in mind when we design
the SSK format. We've talked in the past about designing SSKs that can handle
large amounts of data (using FEC instead of simple replication); if we also
design them to handle partial-upload (with the merkle tree and a variable
number of segments), then we can implement this very nifty feature. (and if
we do this, then the "sharing slot" might just be the SSK itself.. this would
require a place to store "expected file size" or "expected number of
segments", and then we'd probably need to put the suggested file name in the
metadata that wraps the SSK URI and gets pasted or IM'ed to the recipient).
Now we're designing SSKs, and I still think that this is a valuable use case, so I'm posting this comment to remind us to think about this while designing SSKs.
We were chatting with Ping at the hackfest last night, explaining how I was
guessing that sharing would work, specifically the idea of having a pair-wise
directory: when Alice wants to give something to Bob, she creates a new
directory, links its write-cap to "outbox/to-Bob" in her vdrive, puts the
file/files she wants to share in the dir, then mails him the directory's
read-cap. Bob links the read-cap to "inbox/from-Alice". Then Alice can
"revoke" the grant by just deleting the file from that directory, and she has
a record of what she's shared.
Ping was surprised by the idea that we'd re-use this directory. He suggested
that we treat the directory like a one-time "Purse" (from the Mint example,
either from erights.org or Tyler's IOU protocol). The specific thing that he
thought would be confusing was that Bob might come to assume that the file
would remain forever in that inbox (that he "owns" the inbox), and therefore
he would be upset if Alice removed something from his space. Likewise Bob
might be upset to think that Alice could add things to his vdrive at will.
Using the same directory for multiple files would increase the utility of
this inbox, increasing the chances that Bob would keep using things in-place
rather than copying them elsewhere, increasing the surprise/upset.
The other realization we had was that the #217 elliptic-curve
DSA-based mutable files would have smaller write-caps than read-caps: with
some tricks, we could get them down to 96 bits (plus prefix), so about 15
characters of base-62. If we use a separate mutable file per act of sharing,
then we could give the recipient the full write-cap instead of the (longer)
read-cap. Then we wouldn't need to treat the gift as a directory at all, we
could just use it as a "channel" that the two parties can use to communicate
about this gift.
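
The size estimate for such a cap can be checked with a few lines of Python. The helper name is made up for this sketch, and it counts only the characters needed for the raw bits, ignoring the prefix.

```python
def chars_needed(bits: int, alphabet_size: int = 62) -> int:
    """Smallest n such that alphabet_size**n can represent 2**bits values."""
    n = 0
    capacity = 1
    while capacity < (1 << bits):
        capacity *= alphabet_size
        n += 1
    return n


print(chars_needed(96))       # base-62 characters for a 96-bit cap
print(chars_needed(96, 32))   # same cap in base-32, for comparison
```

By this count a 96-bit value needs 17 base-62 characters, a little above the rough figure quoted, and still short enough to paste into an IM without wrapping.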
For example, we could define a human-shareable cap format (i.e. printable,
short enough to avoid wrapping, and with an http prefix) specifically for
sharing things, with a prefix character of "S" (as opposed to "D" for
directory and "F" for file). The rest of the cap would be a mutable-file
write-cap, but the "S" would indicate that we want to treat the contents
specially.
The contents would contain a message from the giver to the recipient. It
would include a list of file/directory caps (with names), the nickname of the
sender, heck it could include the public key of the sender and the rest of
the body could be signed (allowing the recipient to assign a petname to the
sender). Higher-level code would accept the gift, look up the mutable file,
read and parse the contents, then offer the user the choice of what to do
with the gift. The response channel could just be writing a timestamp and a
short note into the slot, saying "got it.. thanks". The revocation action
would be to have the writer erase the slot, replacing it with a type byte
that says "this gift was revoked" or something.
The key insight is to use mutable files as a primitive, and to use
higher-level protocols to generate and interpret their contents.
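
To make the higher-level protocol a bit more concrete, here is a minimal Python sketch of the slot contents for one gift: a message with named caps, an acknowledgement written back by the recipient, and a revocation marker. The type bytes, the JSON layout, and every function name here are assumptions for illustration only; signing the body with the sender's key is omitted.

```python
import json
from typing import Dict, Optional

GIFT = b"G"      # hypothetical type byte: slot holds a live gift
REVOKED = b"R"   # hypothetical type byte: "this gift was revoked"


def make_gift(sender: str, items: Dict[str, str]) -> bytes:
    """Build slot contents: sender nickname plus named file/directory caps."""
    body = {
        "sender": sender,
        "items": [{"name": n, "cap": c} for n, c in sorted(items.items())],
        "ack": None,  # filled in later by the recipient
    }
    return GIFT + json.dumps(body).encode("utf-8")


def read_gift(slot: bytes) -> Optional[dict]:
    """Parse slot contents; return None if the gift was revoked."""
    kind, payload = slot[:1], slot[1:]
    if kind == REVOKED:
        return None
    if kind != GIFT:
        raise ValueError("unrecognized slot type byte")
    return json.loads(payload.decode("utf-8"))


def acknowledge(slot: bytes, note: str, now: float) -> bytes:
    """Recipient's response: write a timestamp and a short note into the slot."""
    body = read_gift(slot)
    if body is None:
        raise ValueError("gift was revoked")
    body["ack"] = {"time": now, "note": note}
    return GIFT + json.dumps(body).encode("utf-8")


def revoke() -> bytes:
    """Giver's revocation: erase the slot, leaving only the type byte."""
    return REVOKED
```

Each function returns the new slot contents; writing them into the mutable file is left to whatever storage layer sits underneath.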
This doesn't have to be restricted to mutable files; the ability to generate a file cap before the file has been fully uploaded has also been discussed for immutable files in the new cap protocol. That is possible if we use public key crypto for immutable files (the integrity and confidentiality of the file would still only depend on symmetric crypto). See http://allmydata.org/pipermail/tahoe-dev/2009-October/002962.html