add share-type argument to storage server protocol #2796

Open
opened 2016-07-02 20:37:20 +00:00 by warner · 0 comments

I was thinking about a few things today:

  • hard drives fade over time, losing bits silently (not detected by the built-in error detection/correction, which is usually a CRC or basic parity check, designed for speed rather than thoroughness)
  • periodic client verification is the best defense, but it's also the most expensive (all 'N' shares must be downloaded, not just the 'k' necessary ones)
  • one [desideratum](wiki/NewCapDesign) for our [new](wiki/NewCaps)/Rainhill [share-encoding](wiki/NewMutableEncodingDesign) [schemes](wiki/NewCaps)/WhatCouldGoWrong is the property that storage servers can verify all of the bits they are holding against each other, and against the storage index: in other words, the storage index *is* the verifycap
  • one pushback against servers knowing too much about the encoding format is that it makes new formats harder to deploy: you must upgrade all servers before you can switch any client to use the new format

So the idea I just had was to tell the server, in the upload process, what "type" of share we're uploading. If the server has knowledge of that share type, it will apply the validation rules during upload, and a periodic server-side "scrubbing" operation can pre-emptively detect failed shares. If the server does not currently know the share type, it will accept arbitrary data, but it will remember the claimed type next to the share. Later, when the server is upgraded to a version that does know about the new format, it will start to verify that data too, and if the data doesn't match the type, the share will be deleted as corrupt.
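To make the idea concrete, here's a minimal sketch of what the upload path could look like. This is not the real Tahoe-LAFS storage-server code; every name in it (`SHARE_TYPES`, `accept_share`, the placeholder validator) is hypothetical. The point is just: validate against the claimed type if we recognize it, otherwise store the bytes together with the claim so a future, upgraded server can check them.

```python
# Hypothetical sketch of the proposed upload path, not real Tahoe-LAFS code.

import hashlib

def validate_immutable_v1(storage_index, data):
    # placeholder stand-in for a real per-format check, which would verify
    # the share's internal hash trees against the storage index
    return hashlib.sha256(data).digest()[:16] == storage_index

SHARE_TYPES = {
    "immutable-v1": validate_immutable_v1,
    # future formats get added here as the server learns about them
}

class CorruptShareError(Exception):
    pass

def accept_share(store, storage_index, shnum, share_type, data):
    validator = SHARE_TYPES.get(share_type)
    if validator is not None and not validator(storage_index, data):
        # we understand this format, and the data fails its checks
        raise CorruptShareError((storage_index, shnum, share_type))
    # 'store' is any mapping-like object; a real server would write to disk
    # and record the claimed type (and whether it was verified) next to it
    store[(storage_index, shnum)] = {
        "type": share_type,
        "verified": validator is not None,
        "data": data,
    }
```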

Some of our encoding schemes use this server-side verification to prevent "roadblock attacks", where someone uploads junk data to the same storage-index as your legitimate file, in order to prevent you from uploading enough shares to publish/preserve the file. To support this in the face of not-yet-understood formats, we could have a rule that clients can trigger a verify on any SI. If the second (legitimate) uploader notices a share in place which fails their download check (or which the server doesn't remember having verified since its upgrade), they can call for verification. The roadblock share will fail verification and be deleted, and the uploader can then upload their own (valid) share in its place. (Maybe the server should return the last verification time as part of the "do-you-have-block" response.)
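A similarly hedged sketch of that verify-on-demand rule, reusing the hypothetical `SHARE_TYPES` table from the sketch above: any client can ask for a storage index to be re-checked, shares that fail the check for their recorded format are deleted, and the last-verified time is updated so it could be reported back in the do-you-have-block response.

```python
# Hypothetical sketch of client-triggered verification; not real server code.

import time

def verify_storage_index(store, storage_index):
    removed = []
    for key, share in list(store.items()):
        si, shnum = key
        if si != storage_index:
            continue
        validator = SHARE_TYPES.get(share["type"])
        if validator is None:
            continue  # format still unknown to this server: leave it alone
        if validator(si, share["data"]):
            share["verified"] = True
            share["last_verified"] = time.time()
        else:
            del store[key]        # roadblock/corrupt share: make room
            removed.append(shnum)
    return removed
```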

It might also work to just scope the SI to the type, or treat SIs uploaded to known types as being independent of those uploaded to unknown types (even if the server was upgraded later). But I think this would cause some shares to effectively disappear upon server upgrade, which wouldn't be good.

We could declare that the existing share-upload API methods use "format 1 mutable" or "format 1 immutable", and add new upload API methods which include the type/format identifier along with the will-you-hold-my-data request. We could advertise format understanding in the server's version dictionary (I'm not sure we'd want clients to behave differently based on that information, but maybe they could prefer to upload shares to servers that do understand the format, over ones that don't).
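Roughly, the new upload method might look like the following. This is a hypothetical signature, not the current RIStorageServer definition, and the `supported-share-formats` version-dictionary key is an assumption.

```python
# Hypothetical new will-you-hold-my-data request that carries a share type.
def allocate_buckets_v2(storage_index, renew_secret, cancel_secret,
                        sharenums, allocated_size, share_type, canary=None):
    """Like the existing allocate-buckets request, plus an explicit
    share_type string such as "immutable-v1" or some future format id."""
    ...

# The server's version dictionary could advertise which formats it can
# verify, e.g. (the "supported-share-formats" key name is an assumption):
#
#   {"http://allmydata.org/tahoe/protocols/storage/v1":
#       {"supported-share-formats": ["immutable-v1", "mutable-v1"]}}
```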

On the server side, we could either store the format type in the share file itself (I think we have some magic numbers there already), or use separate top-level directories for them, or store the information in the leasedb (and regenerate it by looking for magic numbers in the share). The server is going to want to have a new database ("scrubdb"? "sharedb"?) that tracks when a share was last verified, which might be a good place for remembering their formats too.
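One possible shape for that database, sketched here with sqlite (the table and column names are assumptions, not an existing schema):

```python
# Hypothetical "scrubdb" schema: one row per share, remembering the claimed
# format and when the server last verified it.

import sqlite3

def create_scrubdb(path="scrub.db"):
    db = sqlite3.connect(path)
    db.execute("""
        CREATE TABLE IF NOT EXISTS shares (
            storage_index BLOB NOT NULL,
            shnum         INTEGER NOT NULL,
            share_format  TEXT NOT NULL,  -- e.g. "immutable-v1"
            last_verified REAL,           -- unix time, NULL if never checked
            PRIMARY KEY (storage_index, shnum)
        )
    """)
    db.commit()
    return db
```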

We might also generate/regenerate format identifiers by just trying to validate the share against all known formats, and if it passes that check, it must be of that format. That might let us use the scrubdb as the canonical source of format, but build the DB from raw share files for the initial transition.
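That fallback could be as simple as probing each known validator in turn, again reusing the hypothetical `SHARE_TYPES` table from the first sketch:

```python
# Hypothetical "probe all known formats" fallback for populating the scrubdb
# from raw share files on disk.

def detect_share_format(storage_index, data):
    for name, validator in SHARE_TYPES.items():
        if validator(storage_index, data):
            return name   # passed this format's checks, so assume that format
    return None           # unknown or corrupt; flag for operator attention
```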

If we deployed the scrubdb before deploying any new encoding formats, the initial transition could just say "all present shares must be format 1 mutable/immutable", and we wouldn't need the actual initial verification step.

warner added the code-storage, normal, enhancement, 1.11.0 labels 2016-07-02 20:37:20 +00:00
warner added this to the undecided milestone 2016-07-02 20:37:20 +00:00
Reference: tahoe-lafs/trac-2024-07-25#2796