Peer selection doesn't rebalance shares on overwrite of mutable file. #232

Open
opened 2007-12-12 19:31:25 +00:00 by zooko · 15 comments

When you upload a new version of a mutable file, it currently uploads the new shares to peers which already have old shares, then checks that enough shares have been uploaded, then is happy. However, this means it never "rebalances", so if there were few peers (or just yourself!) the first time, and many peers the second time, the file is still stored on only those few peers.

This is an instance of the general principle that shares are not the right units for robustness measurements -- servers are.

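To make the distinction concrete, here is a minimal sketch (not the actual Tahoe-LAFS API; `sharemap` and `needs_rebalancing` are made-up names) of counting distinct servers rather than shares:

```python
# A share map can account for every share number and still be fragile if
# the shares are concentrated on too few servers.

def needs_rebalancing(sharemap, total_shares):
    """sharemap maps share number -> set of server ids holding that share.

    Returns True if every share exists but the shares live on fewer
    distinct servers than the encoding's total-shares parameter.
    """
    servers = set()
    for holders in sharemap.values():
        servers.update(holders)
    return len(sharemap) >= total_shares and len(servers) < total_shares

# Example: 10 shares, all stored on the uploader's own node. The old
# shares-of-happiness check is satisfied; a server-based check is not.
sharemap = {n: {"my-own-node"} for n in range(10)}
assert needs_rebalancing(sharemap, total_shares=10)
```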
zooko added the
code-peerselection
major
defect
0.7.0
labels 2007-12-12 19:31:25 +00:00
zooko added this to the 0.7.0 milestone 2007-12-12 19:31:25 +00:00
warner was assigned by zooko 2007-12-12 19:31:25 +00:00
Author

Actually I'm going to bump this out of the v0.7.0 Milestone and instead document that you have to have as many servers as your "total shares" parameter if you want robust storage. As mentioned in ticket #115 (http://allmydata.org/trac/tahoe/ticket/115), the WUI should be enhanced to indicate the status of the creation of the private directory to the user.

warner was unassigned by zooko 2007-12-13 00:18:58 +00:00
zooko self-assigned this 2007-12-13 00:18:58 +00:00
zooko changed title from peer selection doesn't rebalance shares to peer selection doesn't rebalance shares on overwrite of mutable file 2007-12-13 00:18:58 +00:00
Author

This is related to ticket #213 -- "good handling of small numbers of servers, or strange choice of servers".

zooko changed title from peer selection doesn't rebalance shares on overwrite of mutable file to Peer selection doesn't rebalance shares on overwrite of mutable file. 2007-12-19 22:49:54 +00:00

> This is an instance of the general principle that shares are not the right
> units for robustness measurements -- servers are.

Oh, I think it's actually more complicated than that. When we decide to take
the plunge, our peer selection algorithm should be aware of the chassis,
rack, and colo of each storage server. It should start by putting shares in
different colos. If it is forced to put two shares in the same colo, it
should try to put them in different racks. If they must share a rack, get
them in different chassis. If they must share a chassis, put them on
different disks. Only when all other options are exhausted should two shares
be put on the same disk (and we shouldn't be happy about it).

For now, in small grids, getting the shares onto different nodes is a good
start.
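A hypothetical sketch of that preference ordering (the colo/rack/chassis/disk labels are assumed metadata that storage servers do not currently advertise, and these function names are made up):

```python
def placement_penalty(candidate, placed):
    """Score a candidate server against servers already holding shares.

    Lower is better: sharing a colo is bad, sharing a rack is worse, and
    so on down to sharing the same disk.
    """
    penalty = 0
    for other in placed:
        if candidate["colo"] == other["colo"]:
            penalty += 1
            if candidate["rack"] == other["rack"]:
                penalty += 10
                if candidate["chassis"] == other["chassis"]:
                    penalty += 100
                    if candidate["disk"] == other["disk"]:
                        penalty += 1000
    return penalty

def choose_server(candidates, placed):
    # Prefer the candidate that shares the least infrastructure with the
    # servers that already hold shares for this file.
    return min(candidates, key=lambda c: placement_penalty(c, placed))

placed = [{"colo": "A", "rack": 1, "chassis": 1, "disk": 1}]
candidates = [
    {"colo": "A", "rack": 1, "chassis": 2, "disk": 1},  # same colo and rack
    {"colo": "B", "rack": 1, "chassis": 1, "disk": 1},  # different colo
]
assert choose_server(candidates, placed) == candidates[1]
```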

When a mutable file is modified, it's fairly easy to detect an improvement
that could be made and move shares to new servers. Another desirable feature
would be for the addition of a new server to automatically kick off a wave of
rebalancing. We have to decide how we want to trigger that, though: the
most naive approach (sweep through all files and check/repair/rebalance each
one every month) has a certain bandwidth/disk-I/O cost that might be
excessive and/or starve normal traffic.

I'm moving this to the 0.8.0 milestone since it matches the 0.8.0 goals.
There are a couple of different levels of support we might provide, so once
we come up with a plan, we might want to make a couple of new tickets and
schedule them differently.

Author

One more wrinkle is that if N/(K+1) is large enough (>= 2, perhaps), then perhaps it should put K+1 shares into the same co-lo in order to enable regeneration of shares using only in-co-lo bandwidth.
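The arithmetic behind that wrinkle, as a small sketch (parameter names assumed; with K-of-N encoding, a colo holding K+1 shares can lose one share and regenerate it from the K survivors without cross-colo traffic):

```python
def colos_with_local_repair(N, K):
    """How many colos can each be given K+1 shares out of N total."""
    return N // (K + 1)

# With the default 3-of-10 encoding: 10 // 4 == 2, so two colos could each
# hold enough shares for purely in-colo regeneration.
assert colos_with_local_repair(N=10, K=3) == 2
```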

Author

Brian: did you leave this behavior unchanged in the recent mutable-file upload/download refactoring?

zooko removed their assignment 2008-05-12 19:51:16 +00:00
warner was assigned by zooko 2008-05-12 19:51:16 +00:00

Yes, this behavior is unchanged, and this ticket remains open. The publish process will seek to update the shares in-place, and will only look for new homes for shares that cannot be found.

To get automatic rebalancing, the publish process (specifically Publish.update_goal) needs to count how many shares are present on each server, and gently try to find a new home for them if there is more than one. ("gentle" in the sense that it should leave the share where it is if there are not extra empty servers to be found). In addition, we need to consider deleting the old share rather than merely creating a new copy of it.
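A standalone sketch of that counting-and-reassignment idea (not the actual Publish.update_goal code; the data structures here are simplified assumptions):

```python
def rebalance_goal(goal, empty_servers):
    """goal: set of (server_id, share_number) pairs describing where shares
    will be written. empty_servers: servers holding no shares of this file.

    Moves surplus shares off doubled-up servers, but only when an empty
    server is available -- otherwise the share stays where it is.
    """
    per_server = {}
    for server, shnum in goal:
        per_server.setdefault(server, []).append(shnum)

    new_goal = set(goal)
    spares = list(empty_servers)
    for server, shnums in per_server.items():
        for shnum in shnums[1:]:        # keep the first share in place
            if not spares:
                return new_goal         # be gentle: nowhere better to go
            new_home = spares.pop()
            new_goal.discard((server, shnum))
            new_goal.add((new_home, shnum))
    return new_goal
```

Deleting (or cancelling the lease on) the displaced copy would still need to happen separately, as noted above.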

warner modified the milestone from 1.1.0 to 1.2.0 2008-05-29 22:20:44 +00:00

One additional thing to consider when working on this: if the mutable share lives on a server which is now full, the client should have the option of removing the share from that server (so it can go to a not-yet-full one). This can get tricky.

The first thing we need is a storage-server API to cancel leases on mutable shares, then code to delete the share when the lease count goes to zero. A mutable file that has multiple leases on it will be particularly tricky to consider.
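A hypothetical sketch of that server-side bookkeeping (the class, method name, and lease representation are assumptions, not the real storage-server remote interface):

```python
import os

class MutableShare:
    def __init__(self, path, leases):
        self.path = path
        self.leases = set(leases)  # lease secrets / owner identifiers

    def cancel_lease(self, lease_id):
        """Cancel one lease; delete the share file once no leases remain."""
        self.leases.discard(lease_id)
        if not self.leases:
            # Last lease gone: reclaim the space so the share can be
            # re-homed on a server that still has room.
            if os.path.exists(self.path):
                os.remove(self.path)
            return True   # share deleted
        return False      # other lease-holders still depend on this share
```

The tricky multi-lease case mentioned above is exactly the `return False` branch: the client that wants to move the share cannot force deletion while other accounts still hold leases.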

warner modified the milestone from 1.5.0 to 1.6.0 2009-06-19 18:46:42 +00:00
Author

The following clump of tickets might be of interest to people who are interested in this ticket: #711 (repair to different levels of M), #699 (optionally rebalance during repair or upload), #543 ('rebalancing manager'), #232 (Peer selection doesn't rebalance shares on overwrite of mutable file.), #678 (converge same file, same K, different M), #610 (upload should take better advantage of existing shares), #573 (Allow client to control which storage servers receive shares).

Author

Also related: #778 ("shares of happiness" is the wrong measure; "servers of happiness" is better).

davidsarah commented 2009-10-28 07:50:39 +00:00
Owner

Sorry, not integrity, only reliability.


moving this to category=mutable, since it's more of an issue with the mutable publish code than with the general category of peer selection

warner added
code-mutable
and removed
code-peerselection
labels 2010-01-10 07:21:58 +00:00
zooko modified the milestone from 1.6.0 to eventually 2010-01-26 15:41:14 +00:00
Author

It's really bothering me that mutable file upload and download behavior is so finicky, buggy, inefficient, hard to understand, different from immutable file upload and download behavior, etc. So I'm putting a bunch of tickets into the "1.8" Milestone. I am not, however, at this time, volunteering to work on these tickets, so it might be a mistake to put them into the 1.8 Milestone, but I really hope that someone else will volunteer or that I will decide to do it myself. :-)

zooko modified the milestone from eventually to 1.8.0 2010-05-26 14:49:09 +00:00
Author

It was a mistake to put this ticket into the 1.8 Milestone. :-)

zooko modified the milestone from 1.8.0 to soon 2010-07-24 05:39:53 +00:00
davidsarah commented 2012-12-06 21:43:36 +00:00
Owner

Related to #1057 (Alter mutable files to use servers of happiness). Ideally the server selection for mutable and immutable files would use the same code, as far as possible.

tahoe-lafs modified the milestone from soon to 1.11.0 2012-12-06 21:43:36 +00:00
davidsarah commented 2012-12-06 22:36:17 +00:00
Owner

See also #1816: ideally, only the shares that are still needed for the new version should have their leases renewed.
