Add storage location hint to Storage Server #1838

Open
opened 2012-11-01 22:58:12 +00:00 by PRabahy · 3 comments
PRabahy commented 2012-11-01 22:58:12 +00:00
Owner

I'm new to Tahoe, so sorry if I don't use all the correct terminology.

I believe that the Storage Server could be enhanced to announce a hint about where it stores its data. This would allow a Client to choose more intelligently which Storage Servers to trust with its data (#467, #573).

On wiki/ServerSelection it looks like Brian already thought about this: "I'd love to be able to get stronger diversity among hosts, racks, or data centers, but I don't yet know how to get that and get the properties listed above, while keeping the filecaps small."

My idea is that when a Client first connects to a Storage Server, the Storage Server can respond with a special string that describes where it believes it is storing the data it receives. For example, any server that is using the S3 backend (#999) could reply "S3" while any server using the Google Drive backend (#1831) could reply "Google". A datacenter admin could set all of their servers to reply "ABC_DataCenter".

These hints would be non-authoritative because any Storage Server could easily lie about where it is storing the data, but even if all Storage Servers were lying it would be no worse than today. The rogue Storage Servers could collude to say they were all using the same storage location, but then the Client would just avoid using more than one of the rogue servers. The rogue Storage Servers could also collude to say they were all using different storage locations (even though they were in the same location), but that would simply trick the Client back into the current behavior.
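To make the argument above concrete, here is a minimal sketch of how a Client might use the hints. It is not Tahoe-LAFS code: the function name `pick_servers`, the `(server_id, hint)` tuples, and the fallback to same-hint servers when there are not enough distinct locations are all assumptions made for illustration. The candidate list is assumed to arrive already sorted in whatever order the Client uses today.

```python
def pick_servers(candidates, needed):
    """Pick up to `needed` servers, preferring distinct location hints.

    `candidates` is a list of (server_id, hint) pairs in the Client's
    existing preference order. `hint` is the self-reported string such
    as "S3" or "ABC_DataCenter", or None if the server sent no hint.
    All names here are hypothetical, not part of any Tahoe-LAFS API.
    """
    if needed <= 0:
        return []
    chosen = []
    deferred = []      # servers whose hint duplicates one already chosen
    seen_hints = set()
    for server_id, hint in candidates:
        if hint is None or hint not in seen_hints:
            # Servers without a hint are treated as their own location.
            chosen.append(server_id)
            if hint is not None:
                seen_hints.add(hint)
        else:
            deferred.append(server_id)
        if len(chosen) == needed:
            return chosen
    # Not enough distinct locations: fall back to the original order.
    chosen.extend(deferred[:needed - len(chosen)])
    return chosen


candidates = [("a", "S3"), ("b", "S3"), ("c", "Google"),
              ("d", None), ("e", "ABC_DataCenter")]
print(pick_servers(candidates, 3))
```

If every server reports a different hint, the loop never defers anything and the result is just the first `needed` servers in the existing order, which is the current behavior. If every server reports the same hint, only the first is picked in the main pass and the rest come from the fallback, again in the existing order.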

Finally, by adhering to a convention and parsing the string, a hierarchy of locations could be made. For example, a company with 2 data centers could set the hints for its servers to "ABC_DC1_RACK1", "ABC_DC1_RACK2", "ABC_DC2_RACK1", "ABC_DC2_RACK2", etc. (Related discussion [//pipermail/tahoe-dev/2009-April/001555.html])
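One possible reading of the hierarchy convention, sketched under assumptions: the underscore separator and the company/data center/rack ordering are taken from the example hints above, while the helper names `parse_hint` and `shared_depth` are hypothetical. A Client could compare two hints by how many leading components they share and prefer server pairs that share fewer.

```python
def parse_hint(hint, sep="_"):
    """Split a hint such as "ABC_DC1_RACK1" into its hierarchy levels."""
    return tuple(hint.split(sep))


def shared_depth(hint_a, hint_b):
    """Count how many leading hierarchy levels two hints have in common.

    0 means the servers claim entirely unrelated locations; a larger
    number means they claim to be closer together.
    """
    depth = 0
    for level_a, level_b in zip(parse_hint(hint_a), parse_hint(hint_b)):
        if level_a != level_b:
            break
        depth += 1
    return depth


print(shared_depth("ABC_DC1_RACK1", "ABC_DC1_RACK2"))  # same data center
print(shared_depth("ABC_DC1_RACK1", "ABC_DC2_RACK1"))  # same company only
print(shared_depth("ABC_DC1_RACK1", "S3"))             # nothing in common
```

This only works if operators actually follow the same convention; hints that do not use the separator are treated as a single top-level location.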

tahoe-lafs added the unknown, normal, enhancement, 1.9.2 labels 2012-11-01 22:58:12 +00:00
tahoe-lafs added this to the undecided milestone 2012-11-01 22:58:12 +00:00
tahoe-lafs added code-storage and removed unknown labels 2012-11-01 23:05:11 +00:00

This may be useful to implement the "universal cap" use case: #2009

Please read #2059, which had a succinct specification by markberger. Please also read #573, which had a lot of useful content.

Project Voldemort is a distributed key-value store made by LinkedIn and used at scale at LinkedIn (http://engineering.linkedin.com/voldemort/announcing-voldemort-160-open-source-release). They have a well-trodden solution for this kind of issue, described here: https://github.com/voldemort/voldemort/wiki/Topology-awareness-capability
zooko changed title from Add StorageLocation hint to Storage Server to Add storage location hint to Storage Server 2014-02-04 12:18:11 +00:00
Reference: tahoe-lafs/trac-2024-07-25#1838