grid identifier #403

zooko · 2008-05-01T22:24:50Z

zooko commented

2008-05-01 22:24:50 +00:00

If you give someone a Tahoe URL containing a Tahoe capability, and they are using a different grid (i.e., they are using a different introducer to meet storage servers), then they will get a graceless failure suggesting that the file they asked for doesn't exist.

It might be good to include a "grid ID" in the URL.

zooko added the

labels 2008-05-01 22:24:50 +00:00

zooko added this to the eventually milestone 2008-05-01 22:24:50 +00:00

warner commented

2008-05-07 22:57:11 +00:00

indeed. First we must solve the rather thorny problem of what a "grid ID" would consist of. The obvious answer (a copy of the introducer.furl, or a copy) is unsatisfying, because we plan to move to a more distributed introduction system that does not depend upon a single introducer.

warner commented

2008-06-01 20:49:43 +00:00

I wrote up a proposal for the grid id and sent it to the tahoe-dev mailing list.

warner modified the milestone from eventually to undecided

2008-06-01 20:49:43 +00:00

warner commented

2008-06-01 22:04:51 +00:00

(http://allmydata.org/pipermail/tahoe-dev/2008-May/000586.html) is the link.

zooko commented

2009-02-26 19:03:28 +00:00

I have the vague idea that the domain name can serve as the grid-id for some use cases. For example, currently <http://webapi.allmydata.com:8123/cap/$CAP> always means the prod grid, and <http://testgrid.allmydata.org:3567/cap/$CAP> always means the test grid. It isn't clear whether we can hack DNS to provide privacy in the sense of not accidentally revealing the cap to a DNS server, but at least I would like to consider the advantages of using a separate, widely-understood tool like DNS instead of inventing our own grid ids.

zooko commented

2009-02-26 19:06:38 +00:00

Perhaps this paper is relevant:

Miguel Castro, et al.: "One Ring to Rule them All: Service Discovery and Binding in Structured Peer-to-Peer Overlay Networks"

http://research.microsoft.com/users/mcastro/publications/ring.pdf

(By the way, I think I might have inspired Miguel Castro to write this paper by talking about the "grid id" problem at the first P2P Workshop in 2002.)

davidsarah commented

2009-10-28 03:33:24 +00:00

Tagging issues relevant to new cap protocol design.

zooko modified the milestone from undecided to 2.0.0

2010-02-23 03:07:42 +00:00

davidsarah commented

2012-11-20 00:31:33 +00:00

Here are the design criteria I'd like to see satisfied:

* A *grid specification* (grid-spec) is a file specifying a set of servers, and optional parameters such as a recommended share placement policy or recommended encoding parameters.
* A *grid-cap* is a global, securely unique reference to a grid-spec. It can be an immutable or mutable reference (e.g. a hash of the grid-spec or a public key of an authority that can publish updates).
* There is an IANA-registered URI scheme (say, `lafs:`) that has a grid-cap in the authority field, and a file-cap in the path field. Like most authority-based URI schemes, the authority can be omitted and implied by the context.
* Local aliases can map to URIs that include a grid-cap.
* It is possible to use different grids in different arguments to a CLI command, say to copy ciphertext shares (preferably without decryption) between grids.
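
To make the URI criterion concrete, here is a minimal sketch of splitting a `lafs:` URI into its grid-cap (authority) and file-cap (path). Everything in it is an assumption for illustration: the `LafsRef` type, the `parse_lafs_uri` function, and the placeholder cap strings are hypothetical, and caps are treated as opaque tokens without `/` characters. It does not describe any existing Tahoe-LAFS API or cap encoding.

```python
from typing import NamedTuple, Optional


class LafsRef(NamedTuple):
    """Hypothetical parsed form of a lafs: URI."""
    grid_cap: Optional[str]  # None when the authority is omitted
    file_cap: str


def parse_lafs_uri(uri: str) -> LafsRef:
    """Split 'lafs://<grid-cap>/<file-cap>' or 'lafs:<file-cap>'.

    Assumption: caps are opaque strings that contain no '/'.
    """
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "lafs":
        raise ValueError(f"not a lafs: URI: {uri!r}")

    if rest.startswith("//"):
        # Authority form: everything up to the next '/' is the grid-cap.
        authority, slash, path = rest[2:].partition("/")
        if not slash or not path:
            raise ValueError(f"missing file-cap in {uri!r}")
        # An empty authority ('lafs:///cap') means "implied by context".
        return LafsRef(authority or None, path)

    # No authority at all: the grid is implied by the context.
    path = rest.lstrip("/")
    if not path:
        raise ValueError(f"missing file-cap in {uri!r}")
    return LafsRef(None, path)


if __name__ == "__main__":
    print(parse_lafs_uri("lafs://gridcap-abc/filecap-xyz"))
    print(parse_lafs_uri("lafs:filecap-xyz"))
```

A caller that gets `grid_cap=None` would fall back to whatever grid the context implies, for example the grid the client is already configured to use.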

Note that it may seem as though we have a bootstrapping issue in that the problem of looking up grid-specs is essentially the same as the problem of looking up files -- and so if we can solve the former, why can't we solve the latter without the former?

The answer is that there are far fewer grids than there are files, and even fewer grids that are used by a given client. So,

* It is practical to persistently cache information on every grid that a client normally uses, whereas it wouldn't be practical to persistently cache information on every file.
* The mechanism to look up grid information only needs to be available to a given client when it actually adds a new grid.
* Adding a grid is infrequent enough that the lookup could be manual, i.e. downloading a file describing the grid and adding that to the local cache.

However, it would also be convenient to allow looking up a grid-spec automatically, so that the resulting caps are truly global names. Here is a proposed implementation of that, which takes advantage of the similarity between grid-caps and file-caps by having them be the same thing:

* Each client is configured with the grid-spec of a *bootstrap grid* that can be used to look up the grid-specs of other grids.
* There is a *public bootstrap grid* which is the initial bootstrap grid for newly created clients.
* Accounting mechanisms are used to mitigate denial of service on the public bootstrap grid.
* Each client has a local cache mapping grid-caps to grid-specs.
* There is a command (and possibly a WUI admin page, if we can figure out how to do that securely) to add a grid-spec to the cache and mark it as persistent.
* A grid-cap is exactly a file-cap that is interpreted as being on the client's bootstrap grid. It may be to a mutable or immutable file containing the grid-spec.
* If a client sees a URI with an authority that is not a cached grid-cap, it looks up that grid-cap on its bootstrap grid. It does *not* cache this persistently (but *should* cache it temporarily).

Note that we wouldn't implement this in one go; stage 1 could be to leave out the automatic lookup of grid-caps on the bootstrap grid, and instead just have the persistent cache to which grid-specs have to be added manually.
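
The lookup order described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `GridResolver`, `add_grid`, `resolve`, and the `bootstrap_lookup` callable are hypothetical names, a grid-spec is modelled as a plain dict, and the bootstrap grid is stood in for by an in-memory function. Passing no `bootstrap_lookup` corresponds to the "stage 1" behaviour, where grid-specs can only be added manually.

```python
from typing import Callable, Dict, Optional

GridSpec = dict  # assumption: a grid-spec is modelled as a plain dict


class GridResolver:
    """Hypothetical client-side mapping from grid-caps to grid-specs."""

    def __init__(
        self,
        bootstrap_lookup: Optional[Callable[[str], GridSpec]] = None,
    ) -> None:
        self._persistent: Dict[str, GridSpec] = {}  # manually added grids
        self._temporary: Dict[str, GridSpec] = {}   # automatic lookups
        self._bootstrap_lookup = bootstrap_lookup   # None means stage 1

    def add_grid(self, grid_cap: str, grid_spec: GridSpec) -> None:
        """Add a grid-spec to the cache and mark it as persistent."""
        self._persistent[grid_cap] = grid_spec
        self._temporary.pop(grid_cap, None)

    def resolve(self, grid_cap: str) -> GridSpec:
        """Return the grid-spec for grid_cap, consulting caches first."""
        if grid_cap in self._persistent:
            return self._persistent[grid_cap]
        if grid_cap in self._temporary:
            return self._temporary[grid_cap]
        if self._bootstrap_lookup is None:
            raise KeyError(f"unknown grid-cap (no bootstrap grid): {grid_cap}")
        # Treat the grid-cap as a file-cap on the bootstrap grid.
        spec = self._bootstrap_lookup(grid_cap)
        # Cache temporarily only; never promote to persistent automatically.
        self._temporary[grid_cap] = spec
        return spec


if __name__ == "__main__":
    # Stand-in for reading a file from the bootstrap grid.
    fake_bootstrap = {"gridcap-abc": {"servers": ["server-1", "server-2"]}}
    resolver = GridResolver(bootstrap_lookup=fake_bootstrap.__getitem__)
    print(resolver.resolve("gridcap-abc"))
```

Keeping the two caches separate makes the "does not cache this persistently" rule explicit: only `add_grid` ever writes to the persistent mapping.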

nejucomo commented

2013-06-28 04:19:33 +00:00

This feature may be useful for implementing the "universal cap" use case: #2009