consider share-at-a-time uploader #1340

Open
opened 2011-01-27 19:09:11 +00:00 by warner · 1 comment

As discussed in this thread:

http://tahoe-lafs.org/pipermail/tahoe-dev/2011-January/005963.html

(in response to discussion about UIs for #467 static-server-selection),
Chris Palmer described Octavia's is-it-uploaded-yet UI. I liked it, and
I'd like to explore how we might achieve something similar in Tahoe.
This ticket is to make sure we don't forget the idea.

The basic goal is to give each upload a sort of "red/yellow/green-light"
status indicator, showing how healthy/robust the file is. When the
upload first starts, it stays in the "red" state until there are enough
shares pushed to recover the file if it were to be deleted from the
original computer, then transitions into the "yellow" state. In that
state, more shares are uploaded until we've achieved the desired
diversity, at which point it goes "green".
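
To make the state transitions concrete, here is a minimal sketch of how
that light could be computed. The function names and the `uploads` tuple
layout are purely illustrative, not anything that exists in Tahoe today:

```python
# Minimal sketch (hypothetical names): compute the red/yellow/green light.
# k    = shares needed to recover the file
# n_up = shares we want placed before declaring the upload "done"

def upload_light(shares_placed, k, n_up):
    """Return 'red', 'yellow', or 'green' for one in-progress upload."""
    if shares_placed < k:
        return "red"       # not yet recoverable without the local copy
    if shares_placed < n_up:
        return "yellow"    # recoverable, but below the desired diversity
    return "green"         # Nup shares placed: safe to close the laptop

def aggregate_light(uploads):
    """Tray icon shows the worst state among all in-flight uploads.

    `uploads` is a list of (shares_placed, k, n_up) tuples.
    """
    order = {"red": 0, "yellow": 1, "green": 2}
    lights = [upload_light(*u) for u in uploads]
    return min(lights, key=order.__getitem__, default="green")
```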

The key differences from our current uploader (the resulting loop is
sketched just after this list):

  • we set N higher than usual, maybe 25 or even 50
  • we don't upload all N shares. Instead, we define some lower threshold
    "Nup" (closer to our old N value) which indicates how many shares we
    want to upload. The idea is that we could create more shares if we
    wanted to, without changing the encoding parameters or the filecaps.
  • we store the whole file locally for longer, and do multiple passes
    over it, instead of the almost-streaming approach we have now
  • the first pass generates all shares, builds all the hash trees, but
    only uploads 'k' shares (discarding the rest)
  • subsequent passes re-use the cached hash trees, generate all shares
    again, but only upload some other subset of the shares, maybe 'k' at
    a time, or maybe more.
  • retain the whole file and the cached hash trees until we've uploaded
    Nup shares
  • if the network is partitioned before Nup is achieved, or the client
    is shut down, resume generating and uploading shares when it comes
    back
  • have a realtime status icon which shows the aggregate
    red/yellow/green state of all current uploads. This lets users know
    when it's safe to close their laptops.
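
Here's the rough shape of that loop in Python-ish pseudocode. Every
helper in it (`encode_all_shares`, `push_share`, `persist_progress`,
and so on) is a hypothetical placeholder, not an existing Tahoe API:

```python
# Rough sketch of the multi-pass, share-at-a-time upload loop. All helpers
# are hypothetical placeholders, not existing Tahoe-LAFS APIs.
# upload_light() is the status helper from the earlier sketch.

def share_at_a_time_upload(filepath, k, n, n_up, shares_per_pass):
    pushed = load_progress(filepath) or set()   # resume after shutdown/partition
    hash_trees = None                           # cached across passes

    while len(pushed) < n_up:
        # Each pass re-reads the file and regenerates all N shares.
        shares = encode_all_shares(filepath, k, n)
        if hash_trees is None:
            hash_trees = build_hash_trees(shares)    # first pass only

        # Push the next batch of not-yet-uploaded share numbers, then
        # discard the rest of this pass's shares.
        batch = [i for i in range(n) if i not in pushed][:shares_per_pass]
        for sharenum in batch:
            push_share(sharenum, shares[sharenum], hash_trees)
            pushed.add(sharenum)
            persist_progress(filepath, pushed)       # survive interruption

        # red until k shares are out, yellow until Nup, then green
        report_status(upload_light(len(pushed), k, n_up))
```

With `shares_per_pass = k`, the first pass matches the "upload k shares,
discard the rest" step above; larger batches trade more discarded
encoding work per pass for fewer passes over the file.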

The general notion is to sacrifice some efficiency to reduce the time
needed to get to a mostly-safe upload. The current uploader generates
and uploads all shares in parallel, which eliminates wasted CPU cycles
and almost doesn't need the disk (and #320 plus some new CHK format
could make it fully streaming). But if the process is interrupted before
the very last close() is sent to each share, the whole upload fails and
must be started again from scratch. This new uploader would need more
local disk storage (to handle multiple passes), and would waste some CPU
(encoding shares that are then discarded, unless they too were stored on
local disk, and we learned from Tahoe's predecessor that disks don't do
matrix transpositions well), but it would get the file to at least a
recoverable state in about the same time a normal non-encoded FTP upload
would have finished, and would keep improving after that.

(caching the hashtrees would save some of the hashing time on the second
pass, which may or may not be a win, since hashing is generally pretty
quick too)

Some assumptions that must be tested before this scheme is at all
realistic:

  • generating extra shares (and then discarding them without pushing) is
    cheap
  • generating N=25 or N=50 shares (and never uploading them, just
    keeping the encoding space in reserve for the future) is cheap
  • building the hash trees on each pass, or caching them, is cheap
  • we can tolerate the extra disk IO (multiple passes over each file)

At present, zfec doesn't have an API to create fewer than N
shares: you have to make all of them and then throw some away. It might
be possible to enhance zfec to allow this (and of course do it
faster than create-and-discard: ideally, the CPU time would be
proportional to the number of shares we retain), but I haven't looked at
the Reed-Solomon encoding scheme enough to tell. If so, then the second
and subsequent passes could avoid the encoding waste (we'd still
generate each share twice: once on the first pass to build the hash
trees, and a second time on some subsequent pass when we actually push
that share).
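
As a concrete illustration of the create-and-discard pattern (and of
what an enhanced API might look like), here is a small sketch. It
assumes zfec's `easyfec.Encoder(k, n).encode(data)` interface returns
all n shares, and the `desired_sharenums` argument shown in the comment
is purely hypothetical, i.e. the enhancement this paragraph is asking
for:

```python
from zfec import easyfec   # assumed interface: Encoder(k, n).encode(data) -> n shares

def encode_and_keep(data, k, n, wanted_sharenums):
    """Create-and-discard: ask zfec for all N shares, keep only a subset.

    CPU cost is paid for all N shares even though most are thrown away.
    """
    enc = easyfec.Encoder(k, n)
    all_shares = enc.encode(data)
    return dict((i, all_shares[i]) for i in wanted_sharenums)

# The hoped-for enhancement would look roughly like this (hypothetical
# API), with CPU time proportional to len(wanted_sharenums) rather than N:
#
#   shares = enc.encode(data, desired_sharenums=wanted_sharenums)
```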

Octavia uses pure replication (k=1), which removes the question of
encoding overhead, so it's a bit easier to use this scheme in Octavia
than in Tahoe.

The big big win of this approach is the UI. The k-of-N encoding
parameters don't matter quite so much when you can keep creating more
shares until you're happy with their distribution, so there might be
less need for users to choose k/N instead of sticking with the defaults.
And the "red/yellow/green" status light is a dead-simple UI indicator,
like the way Dropbox tells you when it's done moving data around.

The next step is probably to do some zfec performance tests, to find out
what setting N=25 (and then discarding the extra shares) would do to our
upload speed.
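
A rough sketch of what that measurement could look like, again assuming
the `easyfec.Encoder` interface mentioned above; treat it as a starting
point rather than a finished benchmark:

```python
import os
import time
from zfec import easyfec   # assumed interface, as in the sketch above

def time_encode(k, n, size=10 * 1024 * 1024, reps=3):
    """Best-of-`reps` wall-clock time to encode `size` random bytes into
    n shares (any k of which recover the data). This is the full
    create-and-discard cost: all n shares are generated even if most
    would be dropped."""
    data = os.urandom(size)
    enc = easyfec.Encoder(k, n)
    times = []
    for _ in range(reps):
        start = time.monotonic()
        enc.encode(data)
        times.append(time.monotonic() - start)
    return min(times)

if __name__ == "__main__":
    base = time_encode(3, 10)       # today's defaults: k=3, N=10
    print("k=3 N=10: %.2fs" % base)
    for n in (25, 50):
        t = time_encode(3, n)
        print("k=3 N=%d: %.2fs (%.1fx the N=10 cost)" % (n, t, t / base))
```

If the N=25 and N=50 numbers come out close to the N=10 baseline,
create-and-discard is probably affordable; if not, the zfec enhancement
discussed above becomes more important.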

warner added the code-encoding, major, enhancement, 1.8.1 labels 2011-01-27 19:09:11 +00:00
warner added this to the undecided milestone 2011-01-27 19:09:11 +00:00

I like it! See related tickets #678 (converge same file, same K, different M) and #711 (repair to different levels of M).

tahoe-lafs added the normal label and removed the major label 2012-11-02 01:03:20 +00:00
Reference: tahoe-lafs/trac-2024-07-25#1340