shares.happy is the wrong name of the measure #1092

Open
opened 2010-06-19 15:36:17 +00:00 by zooko · 11 comments

There is a configuration option named shares.happy which is how you control the servers-of-happiness value. It is mis-named! It should be named servers.happy. Of course, it belongs right next to shares.needed and shares.total, but hopefully placement and docs can make their intimate relationship clear. Also, shares.needed serves double-duty. It means both:

  1. Number of shares necessary to reconstruct the file, and
  2. Number of servers necessary to serve the file in a servers-of-happiness upload-quality metric.

Maybe that name should also be changed or at least documented even more carefully.

Assigning to Brian. The next step on this ticket is for Brian to study the new servers-of-happiness feature (#778) and let us know what he thinks about it, both in general and in regard to this specific issue.
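For readers unfamiliar with the three options under discussion, here is a minimal sketch of how they sit together in the `[client]` section of `tahoe.cfg` and how they could be read. The sample values 3/7/10 are the ones mentioned later in this thread; the helper name `read_encoding_parameters` is hypothetical and this is not Tahoe's own loading code.

```python
import configparser

# Sample config text; 3/7/10 are the values discussed in this ticket.
SAMPLE_CFG = """
[client]
shares.needed = 3
shares.happy = 7
shares.total = 10
"""


def read_encoding_parameters(cfg_text):
    """Return the three encoding-related options as integers.

    Hypothetical helper for illustration only.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    client = parser["client"]
    return {
        "needed": client.getint("shares.needed"),
        "happy": client.getint("shares.happy"),
        "total": client.getint("shares.total"),
    }


params = read_encoding_parameters(SAMPLE_CFG)
print(params)
```

Seeing the three keys side by side shows the naming problem: all three share the `shares.` prefix even though the middle one constrains servers.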
zooko added the code-nodeadmin, minor, defect, 1.7.0 labels 2010-06-19 15:36:17 +00:00
zooko added this to the eventually milestone 2010-06-19 15:36:17 +00:00
warner was assigned by zooko 2010-06-19 15:36:17 +00:00
kevan commented 2010-12-23 06:45:26 +00:00
Owner

I'm attaching a patch that changes shares.happy to servers.happy. The client now ignores shares.happy, since it doesn't make a lot of sense to use shares.happy for servers.happy, given the differences between the two robustness metrics. Should we make the startup code print a warning if it doesn't find a servers.happy but does find a shares.happy?

I've defined servers.happy with the default value of 1; this means that servers of happiness checks will be disabled for nodes without a servers.happy directive in their tahoe.cfg (including the result of tahoe create-node).
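The behaviour described above, including the warning that is only being asked about, might look roughly like the sketch below. It assumes a default of 1 as proposed in this comment; the function name `resolve_servers_happy` and the warning text are hypothetical, and this is not the code in the attached patch.

```python
import configparser
import warnings

# Default proposed in this comment; the thread has not settled on it.
DEFAULT_SERVERS_HAPPY = 1


def resolve_servers_happy(cfg_text):
    """Pick the happiness threshold from tahoe.cfg-style text.

    servers.happy wins if present. A legacy shares.happy is ignored,
    but triggers a warning so the user knows it no longer has any effect.
    """
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    client = parser["client"] if parser.has_section("client") else {}
    if "servers.happy" in client:
        return int(client["servers.happy"])
    if "shares.happy" in client:
        warnings.warn(
            "shares.happy is ignored; set servers.happy instead",
            UserWarning,
        )
    return DEFAULT_SERVERS_HAPPY


print(resolve_servers_happy("[client]\nservers.happy = 4\n"))
```

Returning the default rather than reusing the old value matches the stated intent of not treating `shares.happy` as a stand-in for `servers.happy`.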

I don't think there's a particularly convincing argument for leaving the default at 7; probably the only good it is doing is forcing people to reason about their grid when they have to go in and edit tahoe.cfg when their uploads fail because their "Hello, world!" grid isn't big enough to satisfy servers.happy=7. There are probably friendlier ways to do that :-). I'm open to being convinced for a value that isn't 1, but I think that there's something to be said for giving the user the information that they need to set the value sensibly and staying out of their way until they do that.

(I don't have a clear opinion yet on shares.needed, since I hadn't thought about that until I read the ticket this morning)

kevan commented 2010-12-23 06:45:52 +00:00
Owner

Attachment 1092.dpatch (8527 bytes) added

Owner

-1 on the servers.happy.

If we're going to change, I think it would be good to also pick a different word than happy. There's an important concept lurking under a seemingly flippant word.

What's really going on is that this single variable is a rough first cut at ensuring that there is adequate redundancy based on some policy and some knowledge of physical and administrative correlation among servers. I see the 3/7/10 values as very closely linked, and changing shares to servers makes that less clear.

I do agree that shares.happy gives the wrong impression. So I'll suggest "shares.independent", with the meaning being "the minimum number of shares that must be on independent servers". I think that's what is meant, and this keeps the parallelism of shares.* and clarifies this variable. One could have shares.independent and shares.independent-target, but I'm not sure independent-target needs to be different from total.

The current ordering gives the impression that shares.needed and shares.total are more independent than they are. So perhaps "shares.coding = (3, 10)" would be better than two variables. (I am under the impression that I can't just set shares.total to 12 and reconstruct those missing sh10, sh11 without having to recode the entire file; if I'm confused on that point this paragraph is invalid.)

3/7/10 seems reasonable, and I've been using 2/5/7. I don't think it makes sense to talk about the right value of shares.independent/shares.happy without considering the whole 3-tuple.

Owner

Thinking about kevan's comments on the default, I think there are two use cases: setting up a single node with storage to play with tahoe for the very first time, and actually wanting to store bits. 1 is definitely not a good value for actual use. So perhaps there should be "tahoe create-test-node" that has encoding parameters set up for demo use, where the node is client, server, and introducer. Then create-node can be tuned for real use.

davidsarah commented 2010-12-23 19:19:34 +00:00
Owner

Replying to kevan:

...
I've defined servers.happy with the default value of 1; this means that servers of happiness checks will be disabled for nodes without a servers.happy directive in their tahoe.cfg (including the result of tahoe create-node).

I don't think there's a particularly convincing argument for leaving the default at 7; probably the only good it is doing is forcing people to reason about their grid when they have to go in and edit tahoe.cfg when their uploads fail because their "Hello, world!" grid isn't big enough to satisfy servers.happy=7. There are probably friendlier ways to do that :-). I'm open to being convinced for a value that isn't 1, but I think that there's something to be said for giving the user the information that they need to set the value sensibly and staying out of their way until they do that.

A value of 1 means that at least one share has been placed (it is vacuously true that it is on an independent server). This isn't sufficient for the file to be retrievable.

We should probably require that at least k shares are placed in order for an upload or repair to succeed, regardless of the happiness threshold. In that case happiness thresholds less than k would make more sense.

Independently of that, I don't think that 1 is a sensible default. Even for a toy grid that is only being created for someone to see that Tahoe works, it's not unreasonable to require at least two servers. If the happiness threshold is 1, then even if there are no other servers, uploads will succeed by putting shares on the gateway, provided it has sufficient space. I don't think they should succeed (by default) in that case.
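The two conditions discussed in this comment can be sketched as follows, assuming the happiness value is the size of a maximum matching between servers and distinct shares (the "bijective mapping" reading of servers-of-happiness). The function names and the `placements` structure are hypothetical; this is an illustration, not Tahoe's upload code.

```python
def servers_of_happiness(placements):
    """Size of a maximum matching between servers and distinct shares.

    placements maps a server id to the set of share numbers it holds.
    Uses the standard augmenting-path method for bipartite matching.
    """
    match = {}  # share number -> server id currently matched to it

    def try_assign(server, seen):
        for share in placements[server]:
            if share in seen:
                continue
            seen.add(share)
            # Take a free share, or re-route the server that holds it.
            if share not in match or try_assign(match[share], seen):
                match[share] = server
                return True
        return False

    return sum(1 for server in placements if try_assign(server, set()))


def upload_is_acceptable(placements, k, happy):
    """Require at least k distinct shares placed AND the happiness threshold."""
    placed = set().union(*placements.values())
    return len(placed) >= k and servers_of_happiness(placements) >= happy


# Gateway-only grid: all ten shares land on one server.
gateway_only = {"gateway": set(range(10))}
print(upload_is_acceptable(gateway_only, k=3, happy=1))
print(upload_is_acceptable(gateway_only, k=3, happy=2))
```

With a threshold of 1 the gateway-only upload passes, and with a threshold of 2 it fails, which is the default behaviour argued for above.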


You know, I actually kinda like servers.happy=1, probably because I still
haven't internalized the whole bijective-mapping-of-servers concept yet. (I
mean, I know what's going on, yet each time that error appears, I walk away
in confusion because the text of the error message is so hard to follow, so
it leaves a general taste in my mouth that the whole idea is bad, even though
I know it's not really that bad)

Kevan's arguments in the first comment are spot on. "forcing people to reason
about their grid" needs to happen in a friendlier place than the error
message.

gdt's comment about the flippant use of "happy" is accurate too. I originally
picked that for shares-of-happiness because it was a somewhat arbitrary
threshold applied in a very narrow and probably-rare error case (you've
connected to enough servers at the start of the upload, but then some were
lost by the time you finished.. do you still declare success? are you still
happy?)

The current ordering gives the impression that shares.needed and
shares.total are more independent than they are. So perhaps
"shares.coding = (3, 10)" would be better than two variables. (I am under
the impression that I can't just set shares.total to 12 and reconstruct
those missing sh10, sh11 without having to recode the entire file; if I'm
confused on that point this paragraph is invalid.)

(you're correct: you can't go from 3-of-10 to 3-of-12 without reencoding the
whole file. raw zfec would treat them the same, but the share-hash-trees that
tahoe adds for integrity checking would be different, so we fold both k and N
into the CHK hash, so you'll get an entirely different encryption key and
share data anyways)
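A toy illustration of why folding k and N into the hash changes everything downstream. This deliberately does not reproduce Tahoe's actual key derivation; `toy_convergent_key` is a hypothetical stand-in that only shows the principle that a different (k, N) pair yields a different key for the same plaintext.

```python
import hashlib


def toy_convergent_key(plaintext, k, n):
    """Derive a 16-byte key from the plaintext AND the encoding parameters.

    Illustration only: the real derivation differs in detail.
    """
    h = hashlib.sha256()
    h.update(f"{k},{n}:".encode("ascii"))  # parameters are hashed first
    h.update(plaintext)
    return h.digest()[:16]


data = b"the same file contents"
key_3_of_10 = toy_convergent_key(data, 3, 10)
key_3_of_12 = toy_convergent_key(data, 3, 12)
print(key_3_of_10 != key_3_of_12)
```

Because the key differs, the ciphertext and every share differ too, so shares from a 3-of-10 upload cannot be extended into a 3-of-12 set.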

Yeah, combining two tahoe.cfg directives into one might be a good idea. In
fact, it should be phrased in the same way we talk about it in english:

[client]
shares.encoding = 3-of-10
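A minimal sketch of how such a combined value could be parsed. Both the `shares.encoding` option and the `parse_encoding` helper are hypothetical: the option is only being proposed in this comment.

```python
import re


def parse_encoding(value):
    """Parse a 'k-of-N' string such as '3-of-10' into a (k, N) tuple."""
    m = re.fullmatch(r"\s*(\d+)-of-(\d+)\s*", value)
    if m is None:
        raise ValueError(f"expected 'k-of-N', got {value!r}")
    k, n = int(m.group(1)), int(m.group(2))
    if not 1 <= k <= n:
        raise ValueError(f"need 1 <= k <= N, got k={k}, N={n}")
    return k, n


print(parse_encoding("3-of-10"))
```

A single directive lets the parser reject inconsistent pairs such as `10-of-3` in one place, which two separate options cannot do as naturally.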

So I'll suggest "shares.independent", with the meaning being "the minimum
number of shares that must be on independent servers"

I get the impression that this issue is more about "servers" than about
"shares", so I wonder if maybe it ought to be "servers.independent". I know
the math touches both, but I'd like to give users the ability to learn how
this works in chunks, where the first chunk is only about shares ("3-of-10, I
need 3 distinct shares, doesn't matter where they come from, ok, got it"),
and then a later chunk is about where those shares are placed ("oh, right,
what happens if there aren't enough servers?"). Maybe, if all the "shares.*"
configuration fit into the first chunk, then all the controls that involve
servers (even though they also involve shares) could be put into a different
namespace and support the user's concept of a second chunk of things to
learn. "servers.*" might support that.

I'm still undecided about what the default "use-case" ought to be. I think
it's vital that folks be able to bring up a small grid and test it out. I
also think it's important to protect "tahoe backup" users against the trivial
case where you're only putting shares on yourself. Maybe what I'm really
wishing for were better #467 explicit-server-selection code and UI. Maybe I'm
coming around to the idea that diversity trumps write-availability: if you
have some way of configuring (or at least acknowledging) who you're
supposed to connect to, then you could fail writes unless all those servers
were present. Maybe a set of checkboxes on the known-servers web page,
meaning "don't allow uploads to succeed unless this server is present". Maybe
I'm balking at simple integer success criteria because I don't see it as
being easy for a user (or me) to understand what it means, whereas a list of
required serverids is pretty straightforward.
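The "list of required serverids" idea could be checked with something as small as the sketch below. The names and the server id strings are hypothetical; no such setting is defined anywhere in this thread.

```python
def missing_required_servers(required, connected):
    """Return the required server ids that are not currently connected."""
    return sorted(set(required) - set(connected))


def upload_allowed(required, connected):
    """Allow an upload only when every required server is present."""
    return not missing_required_servers(required, connected)


required = ["server-a", "server-b"]            # hypothetical ids
connected = ["server-a", "server-c"]
print(missing_required_servers(required, connected))
print(upload_allowed(required, connected))
```

Unlike an integer threshold, a failure here can name exactly which servers are absent, which is the readability advantage described above.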

But I'm hesitant on the explicit serverlist too, because of how it'd not work
so well in very dynamic grids, and how it kind of needs constant attention
and decision making by the user.

Hm. I'll think about the checkboxes idea more, I kinda like it.

zooko modified the milestone from eventually to 1.11.0 2015-05-12 16:50:36 +00:00
daira commented 2015-05-14 15:54:07 +00:00
Owner

Replying to zooko:

Also, shares.needed serves double-duty. It means both:

  1. Number of shares necessary to reconstruct the file, and
  2. Number of servers necessary to serve the file in a servers-of-happiness upload-quality metric.

This is wrong. shares.needed only ever refers to a number of shares. Those shares can be served from any number of servers (which necessarily is between 1 and shares.needed inclusive, but that's a logical requirement rather than an additional criterion imposed by the upload/download/repair algorithms).


Milestone renamed

warner modified the milestone from 1.11.0 to 1.12.0 2016-03-22 05:02:52 +00:00

moving most tickets from 1.12 to 1.13 so we can release 1.12 with magic-folders

warner modified the milestone from 1.12.0 to 1.13.0 2016-06-28 18:20:37 +00:00

Moving open issues out of closed milestones.

exarkun modified the milestone from 1.13.0 to 1.15.0 2020-06-30 14:45:13 +00:00
Owner

Ticket retargeted after milestone closed

meejah modified the milestone from 1.15.0 to soon 2021-03-30 18:40:19 +00:00
Reference: tahoe-lafs/trac-2024-07-25#1092