A curious case of non-transitive datetime comparison
In December 2016, a user reported an interesting bug to the dateutil tracker. The bug is summarized as follows [1]:
from datetime import datetime
from dateutil import tz
LON = tz.gettz('Europe/London')
# Construct a datetime
x = datetime(2007, 3, 25, 1, 0, tzinfo=LON)
ts = x.timestamp()      # Get a timestamp representing the same datetime
# Get the same datetime from the timestamp
y = datetime.fromtimestamp(ts, LON)
# Get the same datetime from the timestamp with a fresh instance of LON
z = datetime.fromtimestamp(ts, tz.gettz.nocache('Europe/London'))
print(x == y)       # False
print(x == z)       # True
print(y == z)       # True
To summarize: x, y and z should all represent the same datetime – they all have the same time zone, and y and z are the result of converting x into a timestamp and then back into a datetime, but for some reason x != y, and, even more curiously, x == z, even though the only difference between y and z is that z uses a different tzinfo object (representing the same zone). Even stranger, the equality relationship between the three is non-transitive, because x != y even though x == z and y == z. What the hell is going on? There are two key facts you need in order to understand what's happening here.
Imaginary times
The first thing you need to know is that the datetime constructor will not prevent you from creating a datetime that does not exist, which is what happened here:
x = datetime(2007, 3, 25, 1, 0, tzinfo=LON)
print(tz.datetime_exists(x))    # False
Turns out that Daylight Saving Time started at 01:00 on 25 March 2007 in London, so wall times from 01:00:00 to 01:59:59 were skipped that day. Imaginary datetimes like this violate an assumption built into the datetime.fromtimestamp(x.timestamp()) round trip, which is that all datetimes should be able to survive a round trip to and from UTC, or, in code:
dt.astimezone(tz.UTC).astimezone(dt.tzinfo) == dt
This is true for all real datetimes, but it cannot be true for an imaginary datetime, because astimezone is guaranteed to produce a real datetime: since the imaginary datetime never existed, no time in UTC maps to it. Any trip from an erroneously constructed imaginary datetime to UTC is necessarily one-way. Looking at the actual datetimes produced, you can thus see why x == y is not obviously True:
print(x)
# 2007-03-25 01:00:00+01:00
print(y)
# 2007-03-25 00:00:00+00:00
But now the question is, if x == y is False, why is x == z True?
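The round-trip assumption described above can be turned into an existence check. What follows is only a minimal sketch using the standard library: ToyLondon is a hypothetical stand-in for Europe/London that models nothing but the 2007 spring transition, and exists() is an illustrative helper, not dateutil's implementation of tz.datetime_exists.

```python
from datetime import datetime, timedelta, timezone, tzinfo


class ToyLondon(tzinfo):
    """Hypothetical stand-in for Europe/London around 2007-03-25 only.

    +00:00 (GMT) before the spring-forward transition, +01:00 (BST) after.
    The autumn transition is deliberately not modelled.
    """
    TRANSITION = datetime(2007, 3, 25, 1, 0)  # 01:00 UTC, also 01:00 wall time

    def utcoffset(self, dt):
        # Wall times from 01:00 onward get +01:00, including the skipped hour.
        wall = dt.replace(tzinfo=None)
        return timedelta(hours=1) if wall >= self.TRANSITION else timedelta(0)

    def dst(self, dt):
        return self.utcoffset(dt)

    def tzname(self, dt):
        return "BST" if self.utcoffset(dt) else "GMT"

    def fromutc(self, dt):
        # dt carries UTC fields with tzinfo=self; shift it to local wall time.
        utc = dt.replace(tzinfo=None)
        shift = timedelta(hours=1) if utc >= self.TRANSITION else timedelta(0)
        return dt + shift


def exists(dt):
    """Return True if dt survives a round trip through UTC unchanged."""
    round_tripped = dt.astimezone(timezone.utc).astimezone(dt.tzinfo)
    # Compare wall-clock fields explicitly so the result does not depend on
    # which comparison semantics Python picks for aware datetimes.
    return round_tripped.replace(tzinfo=None) == dt.replace(tzinfo=None)


LON_TOY = ToyLondon()
skipped = datetime(2007, 3, 25, 1, 0, tzinfo=LON_TOY)
real = datetime(2007, 3, 25, 2, 0, tzinfo=LON_TOY)

print(exists(skipped))  # False
print(exists(real))     # True
```

The toy zone assigns +01:00 to the skipped hour, which matches the offset printed for x above. A different convention for imaginary times would change the UTC value they map to, but they would still fail the round trip.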
Aware datetime comparison
The next thing you need to know to unravel this mystery is how equality semantics work between timezone-aware datetimes, since this is not an unambiguous operation. Python's approach is most recently documented as part of PEP 495; datetime comparison is divided into "same zone" and "different zone" comparison. When two datetimes are in the same zone, they are equal if the "wall time" is the same:
dt1 = datetime(2017, 10, 29, 1, 30, tzinfo=LON)
# 2017-10-29 01:30:00+01:00
dt2 = datetime(2017, 10, 29, 1, 30, fold=1, tzinfo=LON)
# 2017-10-29 01:30:00+00:00
print(dt1 == dt2)                               # True
print(dt1.timestamp() == dt2.timestamp())       # False
Note that in the above ambiguous time, the wall times are the same, but they resolve to different absolute timestamps because they are two sides of a daylight saving time transition (this is only possible in Python 3.6+, unless you use the dateutil.tz.enfold backport).
For comparisons between different zones, however, two datetimes are equal if they resolve to the same absolute UTC timestamp [2]:
dt1 = datetime(2017, 10, 28, 1, 30, tzinfo=LON)
dt2 = datetime(2017, 10, 28, 0, 30, tzinfo=tz.UTC)
dt3 = datetime(2017, 10, 28, 1, 30, tzinfo=tz.UTC)
# Resolves to the same timestamp
print(dt1 == dt2)               # True
# Has the same "wall time"
print(dt1 == dt3)               # False
The way this relates to our problem above is that "same zone" and "different zone" are defined by object identity, not by object equality [3]. That is to say, dt1 == dt2 is a same-zone comparison if and only if dt1.tzinfo is dt2.tzinfo; it is not enough that dt1.tzinfo == dt2.tzinfo. This explains why y and z are treated differently:
print(x.tzinfo is y.tzinfo) # True
print(x.tzinfo is z.tzinfo) # False
print(x.tzinfo == z.tzinfo) # True
x == y is a same-zone comparison while x == z and y == z are between-zone comparisons.
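To make the identity rule concrete, here is a small sketch using only the standard library. The helper comparison_kind is hypothetical; it simply mirrors the identity test described above and is not part of datetime or dateutil.

```python
from datetime import datetime, timedelta, timezone


def comparison_kind(a, b):
    """Classify a comparison of two aware datetimes by tzinfo identity."""
    return "same-zone" if a.tzinfo is b.tzinfo else "between-zone"


# Two fixed-offset zones that are equal but are distinct objects.
tz_a = timezone(timedelta(hours=1))
tz_b = timezone(timedelta(hours=1))

a = datetime(2017, 10, 28, 1, 30, tzinfo=tz_a)
b = datetime(2017, 10, 28, 1, 30, tzinfo=tz_a)
c = datetime(2017, 10, 28, 1, 30, tzinfo=tz_b)

print(tz_a == tz_b, tz_a is tz_b)   # True False
print(comparison_kind(a, b))        # same-zone
print(comparison_kind(a, c))        # between-zone
```

With fixed offsets both kinds of comparison give the same answer, so the distinction only becomes visible for zones with transitions, such as the Europe/London case in this post.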
Why it was non-transitive
Now looking back at the dates:
print(x)
# 2007-03-25 01:00:00+01:00
print(y)
# 2007-03-25 00:00:00+00:00
print(z)
# 2007-03-25 00:00:00+00:00
For x == y, we have a same-zone comparison, so we're only comparing the wall times 2007-03-25 01:00 and 2007-03-25 00:00, which gives False. For x == z, we have a between-zones comparison, so we convert them both to UTC first, then compare:
print(x.astimezone(tz.UTC))
# 2007-03-25 00:00:00+00:00
print(z.astimezone(tz.UTC))
# 2007-03-25 00:00:00+00:00
I don't think Python's specification defines what happens when you map imaginary datetimes to UTC [4], but since the way z was constructed involved converting to UTC in order to calculate the UTC timestamp, it's no surprise that these two are equal.
Finally, y == z is true under either semantics, since both the wall clock and the offset are the same.
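The whole effect can be reproduced without dateutil. The following is a sketch under stated assumptions: ToyLondon is a hypothetical tzinfo that models only the 2007 spring transition and gives the skipped hour an offset of +01:00, and same_instant is an illustrative helper that forces between-zone semantics by converting both sides to UTC before comparing.

```python
from datetime import datetime, timedelta, timezone, tzinfo

HOUR = timedelta(hours=1)


class ToyLondon(tzinfo):
    """Hypothetical zone: +00:00 before 2007-03-25 01:00, +01:00 from then on."""
    TRANSITION = datetime(2007, 3, 25, 1, 0)

    def _offset(self, naive):
        return HOUR if naive >= self.TRANSITION else timedelta(0)

    def utcoffset(self, dt):
        return self._offset(dt.replace(tzinfo=None))       # dt holds wall time

    def dst(self, dt):
        return self.utcoffset(dt)

    def fromutc(self, dt):
        return dt + self._offset(dt.replace(tzinfo=None))  # dt holds UTC fields


def same_instant(a, b):
    """Compare as UTC instants, regardless of tzinfo identity."""
    return a.astimezone(timezone.utc) == b.astimezone(timezone.utc)


# Same rules, distinct objects (like gettz versus gettz.nocache above).
LON_A, LON_B = ToyLondon(), ToyLondon()

x = datetime(2007, 3, 25, 1, 0, tzinfo=LON_A)   # imaginary wall time
ts = x.timestamp()
y = datetime.fromtimestamp(ts, LON_A)           # shares x's tzinfo object
z = datetime.fromtimestamp(ts, LON_B)           # fresh tzinfo object

print(x == y, x == z, y == z)
# False True True
print(same_instant(x, y), same_instant(x, z), same_instant(y, z))
# True True True
```

Comparing through UTC is one way to get a consistent answer when imaginary datetimes may be involved, at the cost of giving up the wall-time semantics that same-zone comparison provides.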
Note
This post was adapted from a small portion of my 2017 PyBay talk on time zones (slides). If you're interested in this topic, I go into greater detail about time zones in Python in that talk.
Note
This post was updated on 2018-06-11 to use dateutil 2.7.3, which made this particular issue harder to stumble upon. As of version 2.7, tz.gettz('Europe/London') no longer returns a fresh instance of the Europe/London time zone on each call, so tz.gettz.nocache('Europe/London') is used instead. I then forgot to upload the fixed version for over 3 years, so the modification only went public on 2021-10-12.
[1] All the code in this post was executed with Python 3.6 and dateutil==2.7.3.
[2] One odd case is that in Python 3.6 (which introduces the fold attribute), inter-zone comparisons are always False if either object is ambiguous, apparently for backwards compatibility reasons.
[3] Over my objections.
[4] More correctly, I don't think there is any specification for what should be returned when calling utcoffset() on an imaginary datetime. The convention is to use the last valid offset and DST values, but a case could be made for returning None, returning the next offset or throwing an error.