update architecture.txt a little bit

Brian Warner 2007-07-22 20:30:05 -07:00
parent 9c5ab89afe
commit a45bb727d9
1 changed file with 54 additions and 41 deletions

@@ -47,18 +47,18 @@ that would cause it to consume more space than it wants to provide. When a
lease expires, the data is deleted. Peers might renew their leases. lease expires, the data is deleted. Peers might renew their leases.
This storage is used to hold "shares", which are themselves used to store This storage is used to hold "shares", which are themselves used to store
files in the grid. There are many shares for each file, typically around 100 files in the grid. There are many shares for each file, typically between 10
(the exact number depends upon the tradeoffs made between reliability, and 100 (the exact number depends upon the tradeoffs made between
overhead, and storage space consumed). The files are indexed by a piece of reliability, overhead, and storage space consumed). The files are indexed by
the URI called the "verifierid", which is derived from the contents of the a "StorageIndex", which is derived from the encryption key, which may be
file. Leases are indexed by verifierid, and a single StorageServer may hold randomly generated or it may be derived from the contents of the file. Leases
multiple shares for the corresponding file. Multiple peers can hold leases on are indexed by StorageIndex, and a single StorageServer may hold multiple
the same file, in which case the shares will be kept alive until the last shares for the corresponding file. Multiple peers can hold leases on the same
lease expires. The typical lease is expected to be for one month: enough time file, in which case the shares will be kept alive until the last lease
for interested parties to renew it, but not so long that abandoned data expires. The typical lease is expected to be for one month: enough time for
consumes unreasonable space. Peers are expected to "delete" (drop leases) on interested parties to renew it, but not so long that abandoned data consumes
data that they know they no longer want: lease expiration is meant as a unreasonable space. Peers are expected to "delete" (drop leases) on data that
safety measure. they know they no longer want: lease expiration is meant as a safety measure.
In this release, peers learn about each other through the "introducer". Each In this release, peers learn about each other through the "introducer". Each
peer connects to this central introducer at startup, and receives a list of peer connects to this central introducer at startup, and receives a list of
@@ -78,28 +78,34 @@ http://allmydata.org/trac/tahoe/ticket/22 ).
FILE ENCODING FILE ENCODING
When a file is to be added to the grid, it is first encrypted using a key When a file is to be added to the grid, it is first encrypted using a key
that is derived from the hash of the file itself. The encrypted file is then that is derived from the hash of the file itself (if convergence is desired)
broken up into segments so it can be processed in small pieces (to minimize or randomly generated (if not). The encrypted file is then broken up into
the memory footprint of both encode and decode operations, and to increase segments so it can be processed in small pieces (to minimize the memory
the so-called "alacrity": how quickly can the download operation provide footprint of both encode and decode operations, and to increase the so-called
validated data to the user, basically the lag between hitting "play" and the "alacrity": how quickly can the download operation provide validated data to
movie actually starting). Each segment is erasure coded, which creates the user, basically the lag between hitting "play" and the movie actually
encoded blocks that are larger than the input segment, such that only a starting). Each segment is erasure coded, which creates encoded blocks that
subset of the output blocks are required to reconstruct the segment. These are larger than the input segment, such that only a subset of the output
blocks are then combined into "shares", such that a subset of the shares can blocks are required to reconstruct the segment. These blocks are then
be used to reconstruct the whole file. The shares are then deposited in combined into "shares", such that a subset of the shares can be used to
StorageServers in other peers. reconstruct the whole file. The shares are then deposited in StorageServers
in other peers.
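
Editorial sketch (not part of the commit): the segment/block/share relationship
described above can be illustrated with a toy 2-of-3 code. XOR parity stands in
here for the real erasure coder (the document names zfec); the function names,
the fixed 2-of-3 parameters, and the zero padding are assumptions made for the
example only.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def encode_segment(segment: bytes) -> list:
    """Toy 2-of-3 erasure code: two data halves plus one XOR parity block."""
    if len(segment) % 2:
        segment += b"\x00"  # pad to an even length; the caller tracks the true length
    half = len(segment) // 2
    a, b = segment[:half], segment[half:]
    return [a, b, xor_bytes(a, b)]

def decode_segment(blocks: dict, length: int) -> bytes:
    """Rebuild one segment from any two blocks, keyed by block number (0, 1, 2)."""
    if len(blocks) < 2:
        raise ValueError("need at least two of the three blocks")
    if 0 in blocks and 1 in blocks:
        a, b = blocks[0], blocks[1]
    elif 0 in blocks:
        a = blocks[0]
        b = xor_bytes(a, blocks[2])
    else:
        b = blocks[1]
        a = xor_bytes(b, blocks[2])
    return (a + b)[:length]

def make_shares(data: bytes, segment_size: int) -> list:
    """Share N is the list of block N from every segment, in order."""
    shares = [[], [], []]
    for start in range(0, len(data), segment_size):
        blocks = encode_segment(data[start:start + segment_size])
        for sharenum, block in enumerate(blocks):
            shares[sharenum].append(block)
    return shares

def rebuild(shares: dict, segment_size: int, total_length: int) -> bytes:
    """Reassemble the data from any two shares, one segment at a time."""
    num_segments = len(next(iter(shares.values()))) if shares else 0
    out = b""
    for i in range(num_segments):
        seg_len = min(segment_size, total_length - i * segment_size)
        out += decode_segment({n: s[i] for n, s in shares.items()}, seg_len)
    return out
```

Because decoding proceeds segment by segment, a downloader in this sketch only
ever holds one segment's worth of blocks in memory, which is the footprint and
"alacrity" benefit the paragraph describes.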
A tagged hash of the original file is called the "fileid", while a A tagged hash of the encryption key is used to form the "storage index",
differently-tagged hash of the original file provides the encryption key. A which is used for both peer selection (described below) and to index shares
tagged hash of the *encrypted* file is called the "verifierid", and is used within the StorageServers on the selected peers.
for both peer selection (described below) and to index shares within the
StorageServers on the selected peers. A variety of hashes are computed while the shares are being produced, to
validate the plaintext, the crypttext, and the shares themselves. Merkle hash
trees are also produced to enable validation of individual segments of
plaintext or crypttext without requiring the download/decoding of the whole
file. These hashes go into the "URI Extension Block", which will be stored
with each share.
The URI contains the encryption key, the hash of the URI Extension Block, and
any encoding parameters necessary to perform the eventual decoding process.
For convenience, it also contains the size of the file being stored.
The URI contains the fileid, the verifierid, the encryption key, any encoding
parameters necessary to perform the eventual decoding process, and some
additional hashes that allow the download process to validate the data it
receives.
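
Editorial sketch (not part of the commit): the new-side text derives the
encryption key either from the file contents (convergence) or at random, and
derives the storage index from a tagged hash of that key. A minimal
illustration follows. SHA-256, the 16-byte key length, and both tag strings are
assumptions chosen for the example; the diff shown here does not specify them.

```python
import hashlib
import os

def netstring(data: bytes) -> bytes:
    """Frame bytes as a netstring: b'<length>:<data>,'."""
    return str(len(data)).encode("ascii") + b":" + data + b","

def tagged_hash(tag: bytes, data: bytes) -> bytes:
    """Hash data under a tag; netstring framing keeps the tag distinct from the data."""
    return hashlib.sha256(netstring(tag) + data).digest()

def make_encryption_key(plaintext: bytes, convergent: bool) -> bytes:
    """Derive the key from the file contents if convergence is desired, else pick it at random."""
    if convergent:
        # Hypothetical tag name; identical files yield identical keys.
        return tagged_hash(b"example-key-tag", plaintext)[:16]
    return os.urandom(16)

def make_storage_index(key: bytes) -> bytes:
    """Storage index: a differently-tagged hash of the encryption key (hypothetical tag)."""
    return tagged_hash(b"example-storage-index-tag", key)
```

With convergence, two uploads of the same bytes produce the same key and hence
the same storage index; with a random key, each upload lands under a fresh
storage index.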
On the download side, the node that wishes to turn a URI into a sequence of On the download side, the node that wishes to turn a URI into a sequence of
bytes will obtain the necessary shares from remote nodes, break them into bytes will obtain the necessary shares from remote nodes, break them into
@@ -113,8 +119,12 @@ Netstrings are used where necessary to insure these tags cannot be confused
with the data to be hashed. All encryption uses AES in CTR mode. The erasure with the data to be hashed. All encryption uses AES in CTR mode. The erasure
coding is performed with zfec (a python wrapper around Rizzo's FEC library). coding is performed with zfec (a python wrapper around Rizzo's FEC library).
A Merkle Hash Tree is used to validate the encoded blocks before they are fed A Merkle Hash Tree is used to validate the encoded blocks before they are fed
into the decode process, and a second tree is used to validate the shares into the decode process, and a transverse tree is used to validate the shares
before they are retrieved. The hash tree root is put into the URI. before they are retrieved. A third merkle tree is constructed over the
plaintext segments, and a fourth is constructed over the crypttext segments.
All necessary hash chains are stored with the shares, and the hash tree roots
are put in the URI extension block. The final hash of the extension block
goes into the URI itself.
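
Editorial sketch (not part of the commit): the idea of validating a single
segment against a Merkle root using a stored hash chain can be shown in a few
lines. This is a generic binary Merkle tree, not Tahoe's actual layout; SHA-256,
the "leaf:"/"node:" prefixes, and the duplicate-last-node padding rule are
assumptions made for the example.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def build_tree(segments: list) -> list:
    """Return the tree as a list of levels: leaf hashes first, [root] last."""
    if not segments:
        raise ValueError("need at least one segment")
    level = [h(b"leaf:" + seg) for seg in segments]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by repeating the last node
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [h(b"node:" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]

def hash_chain(levels: list, index: int) -> list:
    """Sibling hashes needed to walk from leaf `index` up to the root."""
    chain = []
    for level in levels[:-1]:
        chain.append(level[index ^ 1])
        index //= 2
    return chain

def verify(segment: bytes, index: int, chain: list, root: bytes) -> bool:
    """Check one segment against the root without seeing any other segment."""
    node = h(b"leaf:" + segment)
    for sibling in chain:
        if index % 2:
            node = h(b"node:" + sibling + node)
        else:
            node = h(b"node:" + node + sibling)
        index //= 2
    return node == root
```

In this sketch the chain for a segment would be stored alongside the share, and
only the root needs to be trusted, which mirrors the arrangement described
above where roots live in the URI extension block and its hash lives in the URI.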
Note that the number of shares created is fixed at the time the file is Note that the number of shares created is fixed at the time the file is
uploaded: it is not possible to create additional shares later. The use of a uploaded: it is not possible to create additional shares later. The use of a
@@ -126,13 +136,16 @@ calculated correctly.
URIs URIs
Each URI represents a specific set of bytes. Think of it like a hash Each URI represents a specific set of bytes. Think of it like a hash
function: you feed in a bunch of bytes, and you get out a URI. The URI is function: you feed in a bunch of bytes, and you get out a URI. If convergence
deterministically derived from the input data: changing even one bit of the is enabled, the URI is deterministically derived from the input data:
input data will result in a drastically different URI. The URI provides both changing even one bit of the input data will result in a drastically
"identification" and "location": you can use it to locate/retrieve a set of different URI. If convergence is not enabled, the encoding process will
bytes that are probably the same as the original file, and then you can use generate a different URI each time the file is uploaded.
it to validate that these potential bytes are indeed the ones that you were
looking for. The URI provides both "location" and "identification": you can use it to
locate/retrieve a set of bytes that are possibly the same as the original
file, and then you can use it to validate ("identify") that these potential
bytes are indeed the ones that you were looking for.
URIs refer to an immutable set of bytes. If you modify a file and upload the URIs refer to an immutable set of bytes. If you modify a file and upload the
new version to the grid, you will get a different URI. URIs do not represent new version to the grid, you will get a different URI. URIs do not represent