Back in 2019, I exhorted everyone to stop using datetime.utcnow() and datetime.utcfromtimestamp() because as of Python 3, naïve datetimes stopped representing abstract calendar dates and started representing the current local time. No longer would datetime.now().timestamp() fail because no mapping exists between a naïve time and the absolute timeline! At the time, I only explained this so that I could tell you why utcnow() was dangerous, which may give the mistaken impression that this change just added a footgun and gave us nothing in exchange. However, over time I have come to the opinion that this may in fact be the most elegant way to represent system local time in Python.
Ideally, we would create a "local time" tzinfo object representing system local times (like dateutil.tz.tzlocal tries to do), but as it turns out, it is not possible to do that while maintaining datetime's guaranteed semantics when the system's time zone changes. It is, however, possible to tack "local time" semantics onto the existing naïve datetime object in a way that gives a lot of the same functionality.
A local timezone object
Early in the process of putting together PEP 615, which added support for IANA time zones to the standard library, I was hoping to solve the problem of time zones in Python more broadly. My contention was that nearly all time zone users want one of three types of time zone:
- UTC and fixed offsets
- System local time
- IANA Time Zones
At the time, we already had UTC and fixed offset zones, and I was hoping to create classes that would represent local time and IANA time zones. I knew that naïve times were in a sense local times, but things like subtraction between an aware datetime and a naïve datetime were still not supported. It seemed like there was no "first class" solution to the local time problem, and some analogue of dateutil.tz.tzlocal should be added to the standard library. However, when starting to work out the semantics of what such an object should be, I found that any such object would have very unfortunate and counter-intuitive properties, and that making naïve datetimes represent local time was actually a stroke of genius on the part of Alexander Belopolsky. The reason for this is simple: it's possible to change your system local time zone during the run of a Python program, and datetime objects are not designed to allow tzinfo objects to return different offsets at different points in their lifespan.
Invariants
It is important to note that datetimes are both immutable and hashable. This means that you can use them as, for example, the keys to a dict. Along with hashability come some constraints on the equality semantics, most notably the fact that two objects that compare equal must have the same hash; in other words, if a == b it must be the case that hash(a) == hash(b). This is where the problem comes in, because datetime equality semantics say that two aware datetimes in different zones are equal if they represent the same time in UTC, which means that equality depends on the time zone offset, which in turn means that the hash must depend on the time zone offset.
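As a quick check of this invariant, here is a minimal sketch using only fixed-offset zones from the standard library. The date and the -4 hour offset are illustrative choices, not anything special: two aware datetimes with different offsets that name the same instant compare equal, hash equal, and are interchangeable as dict keys.

```python
from datetime import datetime, timedelta, timezone

# Illustrative fixed offset standing in for New York daylight time.
edt = timezone(timedelta(hours=-4))

noon_edt = datetime(2021, 4, 1, 12, tzinfo=edt)
same_instant_utc = datetime(2021, 4, 1, 16, tzinfo=timezone.utc)

# Equality between aware datetimes is decided by the instant in UTC...
print(noon_edt == same_instant_utc)              # True
# ...so the hash has to be derived from the UTC instant as well.
print(hash(noon_edt) == hash(same_instant_utc))  # True

# Either object therefore finds the same dict entry.
events = {noon_edt: "lunch"}
print(events[same_instant_utc])                  # lunch
```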
Now bringing this back to local times — at any point during the run of a program, the system local time zone could change; if you were to use dateutil.tz.tzlocal or some equivalent, that means that the offsets of existing datetimes can change, for example:
from datetime import datetime, timedelta, timezone
from dateutil.tz import tzlocal

# Local time is America/New_York
dt = datetime(2021, 4, 1, 12, tzinfo=tzlocal())
dt_utc = dt.astimezone(timezone.utc)
print(dt.utcoffset() / timedelta(hours=1)) # -4.0
print(dt == dt_utc) # True
# Change local time to America/Los_Angeles
print(dt.utcoffset() / timedelta(hours=1)) # -7.0
print(dt == dt_utc) # False
This is a major problem! It also means that we must choose between hash immutability and keeping the hash linked to equality, because datetimes that once compared equal no longer compare equal. In the current implementation, datetime.datetime caches its hash value when first calculated to deal with precisely this kind of problem, but what that means is that otherwise-identical datetime objects will have different hashes depending on when hash was first called on them:
from datetime import datetime
from dateutil.tz import tzlocal

# Local time is America/New_York
dt1 = datetime(2021, 4, 1, 12, tzinfo=tzlocal())
dt2 = datetime(2021, 4, 1, 12, tzinfo=dt1.tzinfo)
dt3 = datetime(2021, 4, 1, 12, tzinfo=dt1.tzinfo)
my_dict = {dt1: "Before the change"}
print(my_dict[dt2]) # "Before the change"
print(dt1 == dt2) # True
# Local time is America/Los_Angeles
dt4 = datetime(2021, 4, 1, 12, tzinfo=dt1.tzinfo)
dt5 = datetime(2021, 4, 1, 12, tzinfo=dt1.tzinfo)
my_dict[dt4] = "After the change"
print(my_dict[dt2]) # "Before the change"
print(my_dict[dt3]) # "After the change"
print(my_dict[dt5]) # "After the change"
print(dt1 == dt2 == dt3 == dt4 == dt5) # True
This is slightly better than breaking all existing keys, but it's confusing and still violates some of our invariants. This drives home the fact that there is no way for an aware datetime to satisfy our invariants if its offset can change. What this means is that we cannot simply create a tzinfo object representing local times; we must create some new datetime type with semantics that can survive a change in the system local time.
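The broken invariant can be reproduced without touching the system clock. The sketch below uses a hypothetical ShiftingZone class (not part of any library) whose offset is changed by hand to simulate the system zone moving; it relies on the hash caching in datetime.datetime described above.

```python
from datetime import datetime, timedelta, timezone, tzinfo

class ShiftingZone(tzinfo):
    """Hypothetical stand-in for a local-time tzinfo whose offset can change."""

    def __init__(self, hours):
        self.hours = hours

    def utcoffset(self, dt):
        return timedelta(hours=self.hours)

    def dst(self, dt):
        return timedelta(0)

    def tzname(self, dt):
        return "LOCAL"

local = ShiftingZone(-4)              # pretend we start in New York
dt = datetime(2021, 4, 1, 12, tzinfo=local)
dt_utc = dt.astimezone(timezone.utc)

equal_before = dt == dt_utc
hash_before = hash(dt)                # first call; the result is cached

local.hours = -7                      # pretend the system zone moved to Los Angeles

equal_after = dt == dt_utc
hash_after = hash(dt)

print(equal_before, equal_after)      # True False
print(hash_before == hash_after)      # True

# A new object with identical fields compares equal to dt, but its hash
# is computed for the first time under the new offset.
fresh = datetime(2021, 4, 1, 12, tzinfo=local)
print(fresh == dt)                    # True
print(hash(fresh) == hash(dt))
```

The last line is the uncomfortable part: two objects that compare equal are not guaranteed to share a hash once the offset has moved underneath them.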
The solution
The solution to this is quite elegant: change naïve datetimes to represent a system local time for the purposes of conversion to absolute time without altering naïve datetime semantics! Since naïve datetimes originally represented "abstract" datetime objects, they have no UTC offset and both hash and equality are based only on the raw values, so there is no problem if the concrete time represented by a given "system local time" datetime changes.
The only thing missing from this is that sometimes people want their local times to be actual aware datetimes — they want to be able to compare them to other aware datetimes, or to print out the actual UTC offset. This is solved neatly with .astimezone(None) / .astimezone(), which takes an aware or naïve datetime and gives it a fixed offset in the current system time zone:
from datetime import datetime
from zoneinfo import ZoneInfo

dt = datetime(2021, 4, 1, 12)
print(dt)
# 2021-04-01 12:00:00
print(dt.astimezone())
# 2021-04-01 12:00:00-04:00
dt_la = datetime(2021, 4, 1, 12, tzinfo=ZoneInfo("America/Los_Angeles"))
print(dt_la.astimezone())
# 2021-04-01 15:00:00-04:00
The way this avoids the problems from the previous section is that it requires you to be explicit as to when you query for the offset. The result of any .astimezone() calls will always have the same offset, and naïve datetimes are always "floating" until you convert them into concrete times.
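To make the "floating until converted" behavior concrete, here is a sketch that switches the process's local zone in the middle of the run. It assumes a Unix-like system where time.tzset() is available; set_system_zone and floating_vs_pinned are hypothetical helpers, and the POSIX TZ strings are stand-ins for US Eastern and US Pacific rules.

```python
import os
import time
from datetime import datetime, timedelta

def set_system_zone(posix_tz):
    """Hypothetical helper: switch the process's local zone (Unix only)."""
    os.environ["TZ"] = posix_tz
    time.tzset()

def floating_vs_pinned():
    naive = datetime(2021, 4, 1, 12)
    original_tz = os.environ.get("TZ")
    try:
        set_system_zone("EST+05EDT,M3.2.0,M11.1.0")  # US Eastern rules
        pinned = naive.astimezone()                   # offset captured now
        ts_east = naive.timestamp()

        set_system_zone("PST+08PDT,M3.2.0,M11.1.0")  # US Pacific rules
        repinned = naive.astimezone()                 # offset queried again
        ts_west = naive.timestamp()
    finally:
        # Restore whatever zone the process started with.
        if original_tz is None:
            os.environ.pop("TZ", None)
        else:
            os.environ["TZ"] = original_tz
        time.tzset()
    return pinned, repinned, ts_west - ts_east

if hasattr(time, "tzset"):
    pinned, repinned, shift = floating_vs_pinned()
    # The earlier result keeps the offset it was given when it was created.
    print(pinned.utcoffset() == timedelta(hours=-4))    # True
    # The naïve value floated along with the system zone.
    print(repinned.utcoffset() == timedelta(hours=-7))  # True
    print(shift == 3 * 3600)                            # True
```

The naïve datetime itself never changed, so its hash and equality are unaffected; only the answer to "which instant is this?" moved, and only when it was asked again.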
Takeaways
I have spilled out many words trying to justify why I actually like the naïve-as-local paradigm and why I think the obvious solution — a tzinfo object representing system local time — is not workable, but I imagine most people don't care about why these things are the case. I appreciate you bearing with me despite the high ratio of justification to practical advice. In exchange, I will give you some simple bullet points that you can write down for safe keeping before trying to scrub your brain clean of all the useless information I buried it in:
- The local offset may change during the course of the interpreter run.
- You can use datetime.astimezone with None to convert a naïve time into an aware datetime with a fixed offset representing the current system local time.
- All arithmetic operations should be applied to naïve datetimes when working in system local civil time — only call .astimezone(None) when you need to represent an absolute time, e.g. for display or comparison with aware datetimes.
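As a small illustration of that last point, the sketch below does its arithmetic on the naïve value and converts only at the end. The due_time helper is hypothetical, and the offset you see will depend on the zone of the machine running it.

```python
from datetime import datetime, timedelta, timezone

def due_time(start_local, delay):
    """Hypothetical helper: wall-clock arithmetic first, conversion last."""
    due_local = start_local + delay   # naïve arithmetic in local civil time
    return due_local.astimezone()     # pin the offset only at the boundary

start = datetime(2021, 4, 1, 12)      # naïve: noon, wherever this machine is
due = due_time(start, timedelta(days=1))

print(due.replace(tzinfo=None))       # 2021-04-02 12:00:00
print(due.utcoffset() is not None)    # True
# Being aware, it can now be compared with other aware datetimes.
print(due > datetime(2021, 1, 1, tzinfo=timezone.utc))  # True
```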
Footnotes