using removable disk as a storage backend #793

Open
opened 2009-08-22 20:20:11 +00:00 by warner · 1 comment

For years, on my linux box, I've used an encrypted USB flash drive for my GPG
keys, SSH keys, future winning lottery numbers that I've written down after a
session at my secret time portal, etc. It's a two- or three-factor scheme:
you need to possess the drive (which stays in my pocket), and get into my
computer, and sometimes into my head too.

But I'm not very confident in the linux encrypted-disk schemes, and they
generally don't provide much integrity checking. And, one of these USB drives
has started to fail. On the other hand, I am confident in Tahoe's
encryption, and integrity checking, and, hey, waitaminute..

So what I'm thinking is that some aspects of #467 (specifically the creation
of alternate backends) would enable a "local storage" configuration: shares
are stored in a locally-designated directory instead of out on the network.
Each "server" could correspond to a different removable drive. I'd probably
put k+1 shares on each drive, use two or three drives, and keep at least one
of them in a safe offline place. Having k+1 per drive might let me tolerate a
bad sector without having to go to the safe-deposit box. When a drive starts
failing, I mount a new one in the same place and hit "Repair". Mutable
updates would only change the shares on the drive in my pocket, until the
next time I fetch the safe-deposit-box drive and do a Repair (at which point
those shares will be updated to the latest versions).

The amount of data is usually quite small compared to the size of the drive.
I probably have about 10kB of data to keep safe, and a 4GB thumb drive to
store it on. So this can afford to use k=1 and a high expansion ratio.
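As a sanity check on the storage cost, a quick back-of-the-envelope sketch (the helper name and the N=6 choice — two shares on each of three drives — are just illustrative, not anything Tahoe defines):

```python
# Erasure-coding expansion: a file of `size` bytes encoded with
# parameters (k, N) produces N shares of roughly size/k bytes each,
# so the total on-disk footprint is about size * N / k.
def total_share_bytes(size, k, n):
    share_size = -(-size // k)  # ceiling division; ignores per-share overhead
    return share_size * n

# 10kB of secrets with k=1 and, say, N=6 (two shares on each of
# three drives) still costs only ~60kB -- trivial on a 4GB stick.
print(total_share_bytes(10 * 1024, 1, 6))  # 61440 bytes
```

With k=1 every share is a full copy, so any single surviving share on any drive recovers the file.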

Some wrinkles to figure out:

  • it would be a nuisance to actually mount all the drives at the same time.
    It might be useful to configure a "staging area", on a temporary ramdisk
    (since the whole exercise is to maintain the two-factor requirement:
    removable drive with the shares, plus the rootcap). Then tahoe would
    encode to the staging area, and you could copy shares to the USB drive
    later. Or maybe mount two drives at a time, tell the Repairer to only
    create certain shares (instead of also creating the shares for the
    missing drive and putting them in the wrong place), and use the Repairer
    multiple times.
  • the storage backend would store the share contents directly to disk,
    instead of wrapping them in the usual "container" format (since we don't
    need leases, or write-enablers)
  • the backend code would need to correctly interpret the lack of a readable
    path (i.e. the removable drive being removed) as the "server" being
    offline, and look for a different one
  • since many systems will mount removable media at a fixed location, it
    might be useful to define the "server id" by writing a special file to the
    removable drive (sort of like the regular disk UUID). When the tahoe
    backend code looks to see if a "server" is "online", it looks for this
    serverid file to decide which server is actually available.
  • and, of course, this would be easier to use with a good FUSE frontend.
    Most of the time I use a hand-built "keycache" program (which copies the
    data into a ramdisk and erases it after a timeout), for which a simple
    "tahoe get" would be sufficient. But sometimes I want programs to read out
    that data directly, for which I'd need FUSE.
warner added the
code-storage
major
enhancement
1.5.0
labels 2009-08-22 20:20:11 +00:00
warner added this to the undecided milestone 2009-08-22 20:20:11 +00:00
davidsarah commented 2010-07-06 19:36:35 +00:00
Owner

Possibly merge with ticket #1107.

Reference: tahoe-lafs/trac-2024-07-25#793