Time is difficult, not only from the point of view of special and general
relativity but also for simple calendars. Time zones, the Gregorian vs. Julian
calendar, leap years, and leap seconds all add to the difficulty. Time and date
are complex concepts in our society, so we must capture this complexity in
software.
In general, a computer program has two easy-to-use approaches: work with
timestamps or work with ISO-8601 date-time strings.
Both are perfectly valid approaches with different advantages and disadvantages.
This article discusses issues introduced by leap seconds and Python’s approach
with the datetime module.
Leap seconds
First, let’s look at leap seconds. As a warm-up, consider their better-known relative, the leap year. The Gregorian calendar introduces an additional day (February 29) in every year divisible by four, except for years divisible by 100. However, years divisible by 400 still receive the additional day. This brings the average year to 365.2425 days, which is very close to the length of a tropical year. I recommend reading Matt Parker’s Humble Pi book for a lightweight and funny introduction to this issue.
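The leap-year rule is fully deterministic, so it fits in one line of Python. The following is a minimal sketch; the function name is_leap_year is chosen here for illustration, and the result is cross-checked against calendar.isleap from the standard library.

```python
import calendar


def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except centuries not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


# Cross-check against the standard library over one full 400-year cycle.
assert all(is_leap_year(y) == calendar.isleap(y) for y in range(2000, 2400))

# 97 leap days per 400 years give the average year length quoted above.
leap_days = sum(is_leap_year(y) for y in range(2000, 2400))
print(leap_days)                         # 97
print(f"{365 + leap_days / 400:.4f}")    # 365.2425
```

Because the rule depends only on the year number, the same function answers the question for the year 3000 (not a leap year) just as well as for 2024.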
Earth’s rotation slows down and varies over time due to many effects.
To keep our clocks (UTC) in sync with Earth’s rotation, we need to squeeze in
leap seconds from time to time. As opposed to perfectly deterministic leap years
(you can write a program to compute whether the year 3000 is a leap year), leap
seconds are based on astronomical measurements. We cannot reliably predict all
leap seconds until the year 3000. Unix time stamps are entirely agnostic of the
existence of leap seconds. The expression time() % 60 will always give you the
number of seconds since the last minute tick.
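To illustrate what "agnostic" means here, the following sketch compares two Unix time stamps on either side of the leap second that was inserted at the end of 2016 (23:59:60 UTC on December 31). It uses only the standard library; the variable names are chosen for illustration.

```python
from datetime import datetime, timezone

# One second before and directly after the leap second 2016-12-31T23:59:60 UTC.
# Python's datetime cannot represent second=60 itself, so we look at its neighbors.
before = datetime(2016, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
after = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

# Unix time counts 86400 seconds for every day, leap second or not.
elapsed_unix = after.timestamp() - before.timestamp()
print(elapsed_unix)  # 1.0, although an atomic clock counted two seconds

# Midnight UTC is therefore always a multiple of 86400 (and of 60).
print(int(after.timestamp()) % 86400)  # 0
print(int(after.timestamp()) % 60)     # 0
```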
This makes it necessary to synchronize the Unix time
stamp with UTC whenever a leap second is introduced.
By making the ticks of the Unix clock slightly longer, we can accommodate the
leap second. For example, while the Unix clock ticks 60 times, an atomic clock
records 61 ticks. Some applications stretch the time stamp over a
full day, some only over a minute, and some just double the last second.
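The three strategies differ only in how many Unix ticks share the extra second. As a back-of-the-envelope sketch (the helper tick_length is hypothetical and assumes the leap second is spread evenly over the chosen window, which real implementations may not do), the length of one stretched tick is:

```python
def tick_length(window_ticks: int) -> float:
    """SI seconds per Unix tick if one leap second is spread evenly over window_ticks ticks."""
    return (window_ticks + 1) / window_ticks


# Full day, one minute, and "double the last second" as windows of Unix ticks.
for label, ticks in [("full day", 86_400), ("one minute", 60), ("last second", 1)]:
    print(f"{label:>11}: {tick_length(ticks):.7f} s per tick")
# full day:    1.0000116 s per tick
# one minute:  1.0166667 s per tick
# last second: 2.0000000 s per tick
```

The longer the window, the smaller the distortion of each individual second, but the longer the Unix clock disagrees slightly with UTC.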
This approach is straightforward to use but can cause issues if you count external
events over time. Assume an event occurs every second. If you use Unix time
stamps to aggregate the count, the count will be off whenever a leap second is
introduced: the time frame that accommodates the leap second contains one
additional external event. Depending on the application, this could be an
issue.
On the other hand, time stamps are very easy to use. They work independently of
local time zones. Two people in different time zones calling time() at the
same time (e.g., while talking on the phone) get approximately the same value.
Practically every external system can store time stamps as a (hopefully)
64-bit integer.
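The time-zone independence can be checked with a small sketch. The two fixed offsets below (UTC+2 and UTC-4) are assumptions standing in for two callers' local summer time; the instant itself is arbitrary.

```python
from datetime import datetime, timedelta, timezone

# One instant, expressed in UTC.
instant = datetime(2022, 7, 19, 20, 19, 36, tzinfo=timezone.utc)

# Fixed offsets used for illustration only (no daylight-saving rules).
utc_plus_2 = timezone(timedelta(hours=2))
utc_minus_4 = timezone(timedelta(hours=-4))

caller_a = instant.astimezone(utc_plus_2)
caller_b = instant.astimezone(utc_minus_4)

# The wall-clock representations differ ...
print(caller_a.isoformat())  # 2022-07-19T22:19:36+02:00
print(caller_b.isoformat())  # 2022-07-19T16:19:36-04:00

# ... but the time stamp is the same for both callers.
print(caller_a.timestamp())                          # 1658261976.0
print(caller_a.timestamp() == caller_b.timestamp())  # True
```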
Datetime
An alternative approach is to work within the ISO-8601 framework and Python’s
datetime module. This is already where it becomes difficult. datetime
objects come in two flavors: time-zone aware or unaware. By default, the object
is time-zone unaware and stores local time information:
from datetime import datetime
now = datetime.now()
print(now.isoformat())
# 2022-07-19T22:19:36.925858
print(now.timestamp())
# 1658261976.925858
This might be perfectly fine for desktop applications but becomes problematic
in distributed or online applications. In these cases, you should always prefer
time-zone aware datetime objects with pytz. The time stamp printed in the
last line would also work as an alternative, as described in the previous
section.
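As a minimal sketch of the two flavors, the example below uses datetime.timezone.utc from the standard library instead of pytz (an assumption made here to keep the example free of third-party dependencies). The helper is_aware is a hypothetical name; its condition follows the definition of aware objects in the datetime documentation.

```python
from datetime import datetime, timezone

naive = datetime.now()              # local wall-clock time, no zone attached
aware = datetime.now(timezone.utc)  # same call, but the object carries its UTC offset


def is_aware(dt: datetime) -> bool:
    """True if dt knows its offset from UTC."""
    return dt.tzinfo is not None and dt.tzinfo.utcoffset(dt) is not None


print(is_aware(naive))    # False
print(is_aware(aware))    # True
print(aware.isoformat())  # ends with +00:00

# Python refuses to compare the two flavors.
try:
    naive < aware
except TypeError as error:
    print("cannot mix:", error)
```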
Interestingly, there is also datetime.utcnow(). Let’s look at its properties.
utcnow = datetime.utcnow()
print(utcnow.isoformat())
# 2022-07-19T20:23:24.130360
print(utcnow.timestamp())
# 1658255004.13036
diff = now.timestamp() - utcnow.timestamp()
print(diff)
# 6972.795498132706
The first print gives the current UTC time, which is two hours behind my local
time. Earlier, I said that time stamps work independently of time zones. However,
here we see that the time stamp of utcnow is off by approximately 7200 seconds,
or 2 hours (a little less, because the two calls were made a few minutes apart).
How can this be? The issue lies with the Python implementation. utcnow() returns
the time in the UTC time zone, but as a time-zone unaware object. There is no
difference between the object returned by utcnow() and an object returned by
now() two hours earlier.
I think this is a very subtle pitfall, and I think utcnow() should never be
used in the first place. Use time-zone aware objects, local unaware objects, or
time stamps, but do not mix!
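The pitfall can be reproduced deterministically. The sketch below builds by hand the kind of unaware object utcnow() returns, then shows one way to avoid the problem with the standard library's timezone.utc. Note that datetime.utcnow() has been deprecated since Python 3.12 in favor of aware objects.

```python
from datetime import datetime, timezone

# Wall-clock values as utcnow() would return them: UTC, but time-zone unaware.
naive_utc = datetime(2022, 7, 19, 20, 23, 24)

# timestamp() treats an unaware object as *local* time, so this value depends
# on the machine's time zone and is only correct if that zone happens to be UTC.
misleading = naive_utc.timestamp()

# Attaching the zone makes the result independent of the machine.
aware_utc = naive_utc.replace(tzinfo=timezone.utc)
print(aware_utc.timestamp())  # 1658262204.0

# For the current time, request an aware object directly.
now_utc = datetime.now(timezone.utc)
print(now_utc.tzinfo)  # UTC
```

On a machine set to UTC+2, misleading is 7200 seconds smaller than the correct value, which matches the offset observed above.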