update Announcement "timestamp": sequence number? #1767

Closed
opened 2012-06-14 20:09:59 +00:00 by warner · 14 comments

One proposal that came out of #1765 was to change the current
Announcement's "timestamp"-like field to be a sequence number
instead of an actual clock value. This field is used by both the
Introducer (server) and the IntroducerClient to decide when to
replace a previous announcement with the same (pubkey, servicename)
index, so it needs to be orderable and mostly
monotonically-increasing. (It's OK if a publisher briefly uses a
lower value than it did previously, as long as it's also OK for
subscribers to ignore that message, which generally means the
publisher needs to periodically re-publish its announcements.)

A timestamp (plus periodic updates) is a simple, cheap way to
achieve this property. The only rollback would be for a timequake
(when the publisher's clock has been adjusted backwards, maybe by
NTP being turned on for the first time), and that will eventually
be resolved when the new-time increases beyond the old-time of the
last update (so rolling the clock back by one hour means one hour
of stale announcements).

#1765 specifically discourages comparing this "timestamp" against
anybody else's clock (since clocks are never really synchronized).
So it really doesn't need to be a clock: it could just be a
sequence number. The advantage of a seqnum would be that it would
reveal less information about the client (which might help a
de-anonymizing attacker correlate the Tahoe node's behavior with
other externally-visible things).

The disadvantage is that we'd have to manage the counter ourselves,
and tolerate node restarts which don't maintain the saved counter
state. We want to make sure folks can back up their nodes by just
recording some static private keys, and don't need to constantly be
saving their updated counters.

The proposal is to do the following (see the sketch after this list):

  • use a separate counter for each service-name
  • store the current counter values in NODEDIR/private/announcement.counters, one line per service, like "storage: 12"
  • initialize all counters to 0 at node creation
  • increment the counter each time IntroducerClient.publish is called
  • if we receive a valid signed announcement from our own pubkey:
    • if the seqnum is higher than our current value, set our counter
      to one greater than the received value, and re-publish
    • if the seqnum is equal to our current value, but the signed
      message body is different, do the same: set the counter to one
      greater than the received value, and re-publish
    • if the seqnum is lower, or (equal and the message is
      identical), do nothing
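
A minimal sketch of that bookkeeping, assuming a hypothetical AnnouncementCounters helper. The class name, method names, and parsing details are illustrative, not the real introducer code; only the file location and the one-line-per-service format come from the list above:

```python
import os

class AnnouncementCounters:
    def __init__(self, nodedir):
        self.path = os.path.join(nodedir, "private", "announcement.counters")
        self.counters = {}  # service-name -> int; everything starts at 0
        if os.path.exists(self.path):
            with open(self.path) as f:
                for line in f:
                    name, value = line.split(":", 1)
                    self.counters[name.strip()] = int(value)

    def _save(self):
        with open(self.path, "w") as f:
            for name, value in sorted(self.counters.items()):
                f.write("%s: %d\n" % (name, value))

    def next_seqnum(self, service_name):
        # bumped on every publish call for this service
        self.counters[service_name] = self.counters.get(service_name, 0) + 1
        self._save()
        return self.counters[service_name]

    def hear_own_announcement(self, service_name, seqnum, body, current_body):
        # Called when we receive a valid announcement signed by our own
        # pubkey. Returns True if we should re-publish: the echo carries a
        # higher seqnum than ours, or the same seqnum with a different body.
        ours = self.counters.get(service_name, 0)
        if seqnum > ours or (seqnum == ours and body != current_body):
            self.counters[service_name] = seqnum + 1
            self._save()
            return True
        return False
```

The publish path would call next_seqnum() for the service it is announcing, and the subscriber path would call hear_own_announcement() whenever it sees an announcement signed by our own pubkey; a True result means "re-encode and re-publish".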

In conjunction with the gossip protocol from #1765, that ought to
converge. Nodes that are restored from backup (and thus experience
a "counterquake") will send stale announcements for a little while
(which everyone else will ignore) until they hear back their own
earlier (higher-seqnum) announcements, at which point they'll
advance their counters enough to become fresh again.

One requirement this imposes on clients is that anyone who
publishes a record for service-name=X must also subscribe to
service-name=X. Otherwise they won't know to update their counters
after a counterquake. Alternatively, we could require that anyone
else who receives a message they recognize as stale must immediately
send back the fresh version, even if the publisher wasn't
subscribed to hear about them. This would require some changes to
the APIs, as publishers and subscribers are quite distinct right
now.

It might be easier if we only had one counter for the whole node,
instead of separate counters for each service-name. Then receipt of
any message with a higher counter would trigger the updates.
(when gossip-introduction happens, all nodes will subscribe to
"grid-control", so we don't need to require specific loopback
rules). My concern is that we might announce (counter=0,
service-name=storage, data=X) and (counter=0,
service-name=grid-control, data=Y), then have a quake, then some
small thing changes about the storage server but not about
grid-control. When the node comes back, it will announce
(counter=0, storage, data=Z) but still (counter=0, grid-control,
data=Y). If we aren't subscribed to "storage", we'll see the
grid-control loopback and conclude that we've converged, and not
replace the stale storage/data=X announcement. Maybe requiring a
nonce be added to grid-control messages would avoid this.

I want to get this change into 1.10, even though the #68/#1765
gossip-introducer won't happen until later, so that old 1.10
clients can continue to correctly update themselves in a gossipy
world. Also, since the current implementation uses a clock, I'd
like to switch to smaller integers as quickly as possible, so there
are fewer nodes which have ever used a (large) time.time() and will
thus have problems updating those announcements.

warner added the code-network, normal, enhancement, 1.9.1 labels 2012-06-14 20:09:59 +00:00
warner added this to the 1.10.0 milestone 2012-06-14 20:09:59 +00:00
warner self-assigned this 2012-06-14 20:09:59 +00:00

+1
zooko added cloud-branch and removed 1.9.1 labels 2012-07-05 02:37:39 +00:00
zooko modified the milestone from 1.10.0 to 1.11.0 2012-07-05 02:37:39 +00:00
zooko added 1.9.1 and removed cloud-branch labels 2012-07-05 15:01:22 +00:00
zooko modified the milestone from 1.11.0 to 1.10.0 2012-07-05 15:01:22 +00:00
Author

The current code (in revision changeset:15c95c2e) uses time.time(), so this proposed change has not yet been implemented.

Author

fixed some typos in the description.

Two other thoughts:

  • if two nodes are somehow configured with the same private key, they'll fight over the announcements: each inbound announcement will trigger an outbound one with the higher seqnum, and they won't ever converge because they'll undoubtedly have different swissnums for the storage-server FURLs. They'll just chase each other up to infinity.
  • it's probably ok to just switch to a locally-managed persistent counter for 1.10. I think we can safely defer the feedback/quake-handling code until later.
davidsarah commented 2012-09-05 03:10:40 +00:00
Owner

The one-counter-per-node variant seems simplest.

Author

I wanted to make sure this makes it into 1.10, as it makes forward-compatibility easier.

warner added major and removed normal labels 2012-12-06 16:33:29 +00:00

Let's open a separate ticket for subscribing to announcements about your own types of services, noticing an announcement about yourself, and updating your counter to be newer than the counter in that announcement.


I didn't understand the weird issue about having a single counter per-node rather than one counter per service, and how this could lead to some subtle failure when one service has changed and another hasn't. Therefore, I concluded that having a counter per service is simpler. ☺

davidsarah commented 2013-03-07 16:34:25 +00:00
Owner

Brian wrote:

It might be easier if we only had one counter for the whole node, instead of separate counters for each service-name. Then receipt of any message with a higher counter would trigger the updates. (when gossip-introduction happens, all nodes will subscribe to "grid-control", so we don't need to require specific loopback rules). My concern is that we might announce (counter=0, service-name=storage, data=X) and (counter=0, service-name=grid-control, data=Y), then have a quake, then some small thing changes about the storage server but not about grid-control. When the node comes back, it will announce (counter=0, storage, data=Z) but still (counter=0, grid-control, data=Y).

Ah, for "one-counter-per-node", I was thinking that the counter would be incremented whenever we encode a service announcement. So we would announce (counter=0, service-name=storage, data=X) then (counter=1, service-name=grid-control, data=Y), and the above situation couldn't occur.


Okay, I'm +½ on one-counter-per-node, with davidsarah's variant (increment the counter once per service), and I'm +½ on one-counter-per-service.

Author

Hm. If we increment the counter with each announcement, we could still get into the same trap:

  • announce (0,storage,X)
  • announce (1,grid-control,Y)
  • rewind to a backup
  • restart
  • announce (0,storage,Z): ignored by others (seqnum not newer)
  • announce (1,grid-control,Y): ignored by others
  • hear back the earlier (1,grid-control,Y)
  • it looks just like our recent announcement, so it doesn't provoke a retransmit

Also I think we'd need to store (at least in RAM) a separate counter for each service, so we could recognize when announcements are newer than anything we've ever created. Ideally we want the very first message from our previous life to trigger re-encodings of everything we might publish.

I'm +1 on one-counter-per-node. I think I'm +1 on using the same counter value for all announcements (i.e. every time we update any service, we increment the counter, write it to disk, then publish a full set of announcements for all services, all using the same counter value). Then if we hear back any announcement with a higher counter value, or an equal counter but different contents, we trigger new announcements with the next higher counter. I think that will let any single message-from-the-future trigger an update, and will also correctly ignore echoes of any current message. I'm also +1 on adding a nonce to each announcement (created when we encode the message, stored in RAM only, not persistent), so we can distinguish messages from two different reboots that happen to otherwise contain the same contents.
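
A rough sketch of that one-counter-per-node plan, assuming a hypothetical NodeAnnouncer wrapper and a publish(service_name, body, seqnum=...) call. The real IntroducerClient.publish takes different arguments, and the per-announcement nonce is left out here:

```python
import os

class NodeAnnouncer:
    def __init__(self, nodedir, introducer_client):
        self.counter_path = os.path.join(nodedir, "private", "announcement_seqnum")
        self.client = introducer_client
        self.services = {}  # service-name -> announcement body
        try:
            with open(self.counter_path) as f:
                self.counter = int(f.read())
        except (IOError, ValueError):
            self.counter = 0  # fresh node, or restored from a backup

    def publish_all(self):
        # One increment covers the whole batch: every service is announced
        # with the same counter value, and the counter is persisted first.
        self.counter += 1
        with open(self.counter_path, "w") as f:
            f.write("%d\n" % self.counter)
        for name, body in self.services.items():
            self.client.publish(name, body, seqnum=self.counter)  # assumed signature

    def hear_own_announcement(self, seqnum, service_name, body):
        # An echo with a higher counter, or the same counter but different
        # contents, means our persisted counter is stale (a "counterquake"):
        # jump past the echoed value and re-announce everything.
        if seqnum > self.counter or (seqnum == self.counter and
                                     body != self.services.get(service_name)):
            self.counter = seqnum  # publish_all() bumps this to seqnum + 1
            self.publish_all()
```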

Author

I thought of a better example. Suppose we have 4 services A/B/C/D, and we've been publishing them for a while:

  • (10,service=A,data=dA1) (11,B,dB1) (12,C,dC1) (13,D,dD1)
  • now we shut down services B and C and stop announcing them
  • now the node crashes and loses its counter
  • the new node wakes up and publishes its remaining services
    • with a new counter
    • with new data for service A
  • (0,service=A,data=dA2): ignored by others
  • (1,D,dD1): ignored by others
  • node hears back the earlier (10,A,dA1), but isn't subscribed to hear about the others
  • node increments counter to 10+1=11
  • announce (11,A,dA2) (12,D,dD1)
  • node doesn't realize its service D is out-of-date, and other nodes will continue to ignore its announcements for a while

Also, I'm thinking the nonce is important, and should basically be part of the counter. So every time you increment the counter, you also make a new nonce, and the announcement's seqnum= field is a two-element list of (counter, nonce). The comparison rule says that recipients ignore announcements with counters that are lower than or equal to what they currently know about. They also ignore announcements which are identical to what they've already heard. In the future, when a recipient sees a lower-or-equal counter in a non-identical announcement, they can send back their current copy, to let the sender know that they need to update their counter. The nonce distinguishes reuses of the same counter value (just without any sense of ordering).

seqnum+nonce is the same trick I used in Foolscap during reconnections, to tell whether you're talking to the same peer that you were just connected to, or to a newly rebooted instance of that peer. I think we use similar rules over there.
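
For the receiver side, here's an illustrative sketch of that (counter, nonce) comparison rule; the function names and return values are made up for this example, and the actual announcement handling in the introducer client is structured differently:

```python
import os

def make_seqnum(counter):
    # A fresh nonce accompanies every counter bump; it lives in RAM only.
    return (counter, os.urandom(8))

def handle_incoming(current, incoming):
    """current and incoming are ((counter, nonce), body) pairs for the same
    (pubkey, service-name) index. Returns "accept", "ignore", or
    "reply-with-current"."""
    (cur_counter, _cur_nonce), _cur_body = current
    (new_counter, _new_nonce), _new_body = incoming
    if new_counter > cur_counter:
        return "accept"   # strictly newer: replace our copy
    if incoming == current:
        return "ignore"   # an echo of exactly what we already hold
    # Lower-or-equal counter but not identical: the sender is probably
    # working from a stale counter, so send our copy back to nudge it.
    return "reply-with-current"
```

Note that two reboots which happen to produce the same counter and body still differ in the nonce, so the "reply-with-current" branch fires and the stale node learns to advance its counter.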

AmmonRs commented 2013-03-19 01:57:24 +00:00
Owner

Replying to warner:

  • if two nodes are somehow configured with the same private key, they'll fight over the announcements: each inbound announcement will trigger an outbound one with the higher seqnum, and they won't ever converge because they'll undoubtedly have different swissnums for the storage-server FURLs. They'll just chase each other up to infinity.

If an attacker were able to get a node's private key, they could use the seqnum as a DoS attack, by making the node increment the counter until it looped, at which point all other nodes would forever ignore that node. Probably not something to worry about, since if the private key is leaked, there are bigger problems, but it's something to consider.
Author

AmmonRs: yeah, good point. I don't think there's much we could do about it... even without sequence numbers, they could just keep re-publishing a bad address until the server admin gave up (sort of a manual version of the same attack).

Author

I landed changeset:3e26c78e a few hours ago to close this. I've opened #1933 to cover the future-todo items listed above.

warner added the fixed label 2013-03-19 07:46:26 +00:00