don't claim to provide better semantics of timestamps than Python claims to provide #1133

Open
opened 2010-07-20 04:55:04 +00:00 by zooko · 5 comments

In webapi.txt (source:docs/frontends/webapi.txt@4508#L677) it says:

The timestamps are represented as a number of seconds since the UNIX epoch (1970-01-01 00:00:00 UTC), with leap seconds not being counted in the long term.

However we get our timestamps from time.time(), which is documented as:

Return the time as a floating point number expressed in seconds since the epoch, in UTC.

If I understand correctly these two specifications are different, because UTC includes leap seconds and our docs currently say that leap seconds are not counted "in the long term". What does that mean exactly?

However, this ticket is about documentation and is intended to make this particular argument:

argument: We should not claim to provide more precise or more correct semantics to our users than Python claims to provide to us. (Which semantics it in turn gets from gettimeofday() which gets it from some combination of linux kernel, libc, local sysadmin policy, ntp server, and international telecommunications body politics. I'm not kidding.)

As far as I understand, the question of how to handle leap seconds is in fact left up to the local system administrator. Many but not all system administrators then defer to an NTP authority. NTP authorities may or may not change their policies about that in the near future--see for example this allusion to debates within the ITU-R to change the policy: http://www.ucolick.org/~sla/leapsecs/onlinebib.html .

I think the right thing for our docs to do is to clearly state that the precise semantics of this value are unspecified.

zooko added the documentation, minor, defect, 1.7.0 labels 2010-07-20 04:55:04 +00:00
zooko added this to the undecided milestone 2010-07-20 04:55:04 +00:00
Author

Here is an interesting resource: http://www.ucolick.org/~sla/leapsecs/onlinebib.html

Search in text for "Unix system time and the POSIX standard".

Author

Oh, coincidentally this was discussed on the python list two days ago:

http://mail.python.org/pipermail/python-list/2010-July/1250623.html

Standards are good.  When it comes to leap seconds there can be no
current implementation which satisfies everyone because of this
http://www.ucolick.org/~sla/leapsecs/epochtime.html
Until the delegates to ITU-R SG7 produce a better recommendation there
is going to be chaotic disregard of the standard where folks with
different needs choose different practical implementations.
davidsarah commented 2010-07-23 21:21:06 +00:00
Owner

But no-one disagrees about whether Unix time numbers count leap seconds -- they don't. Look:

>>> import calendar, time
>>> calendar.timegm(time.strptime("2010-07-24 00:00:00 UTC",
...     "%Y-%m-%d %H:%M:%S %Z"))
1279929600
>>> 1279929600 % 86400
0

If leap seconds since 1970-01-01 were included, the results would have been 1279929624 and 24. This is working exactly as intended; there is no implementation bug (although there is a bug in the Python documentation). The Tahoe-LAFS spec shouldn't be unnecessarily vague on a point that is not in contention.
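
The arithmetic behind those two figures can be spelled out as a small sketch. The constant name `LEAP_SECONDS_AS_OF_2010_07` is made up for illustration; its value of 24 is simply the count quoted in the comment above, not something Python can report.

```python
import calendar

# Hypothetical constant: the leap-second count quoted in the comment above.
LEAP_SECONDS_AS_OF_2010_07 = 24

# Unix time for 2010-07-24 00:00:00 UTC; timegm() treats every day as 86400 s.
unix_seconds = calendar.timegm((2010, 7, 24, 0, 0, 0))

# What the number would be if each inserted leap second had been counted.
counted_seconds = unix_seconds + LEAP_SECONDS_AS_OF_2010_07

print(unix_seconds, unix_seconds % 86400)
print(counted_seconds, counted_seconds % 86400)
```

A Unix timestamp for midnight UTC is always an exact multiple of 86400, which is only possible because leap seconds are left out.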

There are differences between clock implementations in exactly how to report the current time in the period shortly before, during, and after a leap second. The Unix numeric time representation can't unambiguously represent times around a leap second, and that's an unfixable problem for that representation. However it's a problem that I find it hard to get worked up about in the context of file timestamps; it is misguided to rely on those being accurate to second resolution (and a design error in programs like make that they do so). Also, note that this doesn't affect how Unix time numbers are converted to UTC time strings.
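
A minimal sketch of that ambiguity, using the leap second inserted at the end of 2008-12-31 as the example. It relies on `calendar.timegm()` doing plain arithmetic on its fields, so a seconds value of 60 is accepted; the helper name `to_utc_string` is made up for illustration, and the round trip assumes a platform whose `time.gmtime()` follows the usual POSIX rules.

```python
import calendar
import time

# The leap second itself: 2008-12-31 23:59:60 UTC.
during_leap = calendar.timegm((2008, 12, 31, 23, 59, 60))

# The second that follows it: 2009-01-01 00:00:00 UTC.
after_leap = calendar.timegm((2009, 1, 1, 0, 0, 0))

# Both instants map to the same Unix time number.
print(during_leap == after_leap)


def to_utc_string(unix_seconds):
    """Format a Unix time number as a UTC time string (hypothetical helper)."""
    return time.strftime("%Y-%m-%d %H:%M:%S UTC", time.gmtime(unix_seconds))


# Converting back is still well defined; it just never yields a :60 second.
print(to_utc_string(during_leap))
```

Two distinct UTC instants collapse onto one number, but the number-to-string direction stays deterministic, which is the direction that matters for displaying file timestamps.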

Proposals to change how leap seconds are decided on, or whether any more will occur in future, are a red herring. For any given set of decisions about when leap seconds occur, such proposals don't change how either UTC or Unix time numbers are defined.


Please, let's spend as little time on this as possible. I don't want us (or our users) to get distracted with the list of ways in which "time" is a complex topic.

What's wrong with just saying "standard unix time (seconds since epoch)"? That sends the following messages to the following audiences:

  • non-specialists: "ah, ok, no timezones, behaves just like everything else on every other computer I've used"
  • specialists: "ugh, it's that same slightly-broken definition of 'time' that everyone else uses, if I want TAI then I have to consult a leap-second-lookup table. But hey, at least it's the same definition as everyone else uses: whatever heroics I must do to accurately compare timestamps for the files that I drunkenly upload to Tahoe during my london New Year's party (and occur during the "bewitching second") are exactly the same heroics that I have to do for everything else."

In particular I really don't want readers of webapi.txt to be distracted by words like "ITU-R SG7" or "TAI" which will need length discursive inline explanations.


make that "the usual unix time (seconds since epoch, as reported by python's time.time())", and "lengthy discursive inline explanations". I'll concede the point that the word "standard" is claiming too much and may provoke backlash from specialists who will point out that there's not much "standard" about it.

Also, thanks for marking this ticket priority=minor !
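
For readers who want to see what "as reported by python's time.time()" amounts to, here is a small sketch. It only uses the standard `time` module; the variable names are illustrative, and the epoch check assumes a platform whose epoch is the usual 1970-01-01 00:00:00 UTC.

```python
import time

# time.gmtime(0) shows which instant the platform treats as the epoch.
epoch = time.gmtime(0)
print(time.strftime("%Y-%m-%d %H:%M:%S UTC", epoch))

# time.time() returns seconds since that epoch as a float, so sub-second
# precision may be present, depending on the platform clock.
now = time.time()
print(type(now).__name__, now)
```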

Reference: tahoe-lafs/trac-2024-07-25#1133