I also need to update this website to better reflect my latest understanding of the problem, but until I get around to that I will leave all the original material here for you to read.
Computer handling of dates and times is surprisingly hard to get right and can become quite unpleasant to deal with. Chronian.com is for you if you're a programmer who wants to:
Just so we're all on the same page here: I assume that you take pride in your work and strive for code that is correct. If 99.9% is acceptable, or if understanding underlying principles isn't important, or if "good enough" is good enough for you, then you're in the wrong place. Go watch TV or play a video game instead of wasting your time reading this.
Still with me? Good. Let's get down to work.
This website proposes a conceptual model of dates and times which is different in subtle but crucial ways from the model usually used. I've found this alternate model makes time-handling much easier to understand.
Chronian.com is not about any particular programming language, API, or library. It's about a better way to think about dates and times. You don't need any specific knowledge to understand this, but it will be easier to see the value in this material if you've done battle with dates and times before and have a few scars to prove it.
Here is the basic thesis of the Chronian model:
The real world is too complex for our feeble human minds to deal with directly, so in nearly every field of human endeavor we create models of the real world. These are simply abstractions which include the aspects of reality which are important to our task, and leave out the rest to keep the model simple enough to make our problems tractable.
This is an incredibly powerful technique, but it absolutely depends on getting the model right. If you leave out something important, you'll get wrong answers. If you leave in things that are not important, you may get so buried in complexity that you work much harder than necessary to get an answer, or perhaps never get any answer at all. To make things more of a challenge, the definition of "right" depends on the context. E.g. modeling every tall green thing as a "tree" might be fine for high-level urban planning, but that wouldn't work at all for studying the workings of an ecosystem.
In most date-and-time libraries and source code that I've seen, the model used (either explicitly or implicitly) is this:
The usual implementation is to pick a single scale, something like UTC, in which to store every point on the timeline. All times are converted to UTC immediately after parsing, and converted to the desired timezone or calendar only when formatting for output. For example, Unix "time" is always represented as seconds past the epoch, where the epoch is specified in UTC.
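To make that concrete, here is roughly what the traditional approach looks like in practice. This is just a sketch in Python (using the standard-library zoneinfo module; the variable names are mine), not an endorsement of any particular library:

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Parse input and immediately convert everything to the one internal scale...
local = datetime(2024, 6, 1, 14, 30, tzinfo=ZoneInfo("US/Pacific"))
epoch_seconds = local.astimezone(ZoneInfo("UTC")).timestamp()   # the single stored value

# ...and convert back to whatever zone the user wants only when formatting.
for_display = datetime.fromtimestamp(epoch_seconds, ZoneInfo("Europe/Paris"))
print(for_display.isoformat())   # 2024-06-01T23:30:00+02:00
```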
This is a simple, logical system, which no doubt explains why it's so popular.
It is also wrong.
The problem is that this model assumes that the rules for converting between a local time (e.g. "US/Pacific") and the underlying representation (e.g. "UTC") are fixed. Oh, if only that were so, but it's not true at all. The rules change just frequently enough to make some of our answers wrong, but infrequently enough to seduce us into thinking our code works. So what changes? Here are some examples:
How does this make our answers wrong? Here is one possible scenario:
Similar scenarios can occur with the insertion of leap seconds.
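To see the shape of the failure, here is a hypothetical sketch. The fixed offsets below are stand-ins for "the tz rules as the database knows them today" versus "the rules after a government changes its mind"; in real life the change arrives silently via a tzdata update:

```python
from datetime import datetime, timedelta, timezone

# The meeting everyone cares about: 09:00 local time on a future date.
wall_clock = datetime(2030, 3, 20, 9, 0)

# Today's tz rules say that date is UTC-8, so we "normalize" and store UTC.
rules_today = timezone(timedelta(hours=-8))
stored_utc = wall_clock.replace(tzinfo=rules_today).astimezone(timezone.utc)

# Years later the jurisdiction changes its DST rules; the same wall-clock
# label now corresponds to UTC-7.
rules_later = timezone(timedelta(hours=-7))
actual_utc = wall_clock.replace(tzinfo=rules_later).astimezone(timezone.utc)

print(stored_utc)   # 2030-03-20 17:00:00+00:00  <- what sits in the database
print(actual_utc)   # 2030-03-20 16:00:00+00:00  <- when the meeting really starts
# Nobody's code changed, yet the stored answer is now an hour off.
```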
The crucial point here is that these erroneous results are not the result of inadequate coding skills or bad design. The root cause is a fundamental shortcoming of the underlying model, specifically that we left something important (the mutability of the rules) out of the model. No amount of testing or bug-fixes or refactoring is going to correct that.
Take heart, pilgrims, all is not lost. If we simply tilt our necks and look at the problem from a different angle we can still get our heads wrapped around it. (I'm not saying everything will become easy exactly; only that I believe this model is worthy of your consideration and can reduce some of the "friction" in your programming life.)
The key is to use a slightly different model. In the Chronian model, we still use the abstraction of a timeline with lots of timepoints on it, but instead of one single timeline we have a separate timeline for each system of reckoning time. So each of the following has its own timeline:
(For more details, see the Timelines page.)
Each timeline has:
First of all, the set of timepoints which are contained in a timeline may change. For example, whenever IERS declares a leap second, a whole second's worth of timepoints are added into the UTC timeline. We call such timelines unstable because they can change over time. Not every timeline is unstable (e.g. TAI is stable), but many of the most common ones are. It's very hard to incorporate this concept into the traditional model, but it's easy to describe what's happening in the Chronian model. Of course it's still annoying to deal with, but at least you can talk about it without tying your brain into a knot.
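Here is a small illustration of what instability means in practice. The leap-second dates below are a fragment of the real IERS list; everything else is made up for the example:

```python
# Dates whose final UTC minute had 61 seconds (a fragment of the real list).
LEAP_SECONDS = ["2012-06-30", "2015-06-30", "2016-12-31"]

def utc_seconds_in_year(year: int) -> int:
    """Number of timepoints (SI seconds) the UTC timeline contains in a year."""
    days = 366 if year % 4 == 0 and (year % 100 != 0 or year % 400 == 0) else 365
    extra = sum(1 for d in LEAP_SECONDS if d.startswith(str(year)))
    return days * 86400 + extra

print(utc_seconds_in_year(2016))   # 31622401 -- one more than a naive calendar says
# If IERS announces a leap second for some future year, the answer for that
# year changes: the timeline itself has gained new timepoints.
```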
Because timelines can be unstable, conversions between timelines can be unsafe, meaning that a given conversion may give a different answer tomorrow than it does today. For example, the rule for converting between TAI and UTC changes every time a leap second is added. The traditional model has no way to represent this; the Chronian model simply specifies that certain conversions are unsafe. As a small consolation, we can often say that certain conversions are unsafe for timepoints in the future but safe for timepoints in the past, and we can put bounds on how unsafe the conversion is (i.e. how much the current answer can differ from the final answer). This is an important point, because sometimes it makes sense to give a user an approximate answer, so long as we know that the error bounds are acceptable for the given application. Just make sure you are conscious of what you're doing, and don't cache or store an approximate answer in your database to be used later.
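One way to make this explicit in an API is to have every conversion report whether it is safe and how far off it could end up. This is only a sketch of the idea; the class, the function, and the announcement horizon are mine, while the 37-second TAI-UTC offset is the real value in effect since the 2017-01-01 leap second:

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple

TAI_MINUS_UTC = 37            # seconds; correct since 2017-01-01 (the most recent leap second)
ANNOUNCED_THROUGH = datetime(2025, 6, 30, tzinfo=timezone.utc)   # hypothetical horizon

class Conversion(NamedTuple):
    result: datetime
    safe: bool                # False => a future leap-second announcement could change this
    max_error_seconds: int    # crude bound on how far off the result could end up being

def utc_to_tai(t: datetime) -> Conversion:
    result = t + timedelta(seconds=TAI_MINUS_UTC)
    if t <= ANNOUNCED_THROUGH:
        return Conversion(result, safe=True, max_error_seconds=0)
    # Leap seconds are normally inserted at most twice a year (end of June or
    # December), so a very rough bound is two per year beyond the horizon.
    opportunities = 2 * (t.year - ANNOUNCED_THROUGH.year + 1)
    return Conversion(result, safe=False, max_error_seconds=opportunities)

print(utc_to_tai(datetime(2040, 1, 1, tzinfo=timezone.utc)).safe)   # False
```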
Timelines that use DST have gaps (timepoints that you'd expect to see, but which are not present in the timeline) and overlaps (two periods of time where the timepoints have the same labels). As if that weren't bad enough, the gaps and overlaps are unpredictable. In the Chronian model this all just gets rolled into the general statement that the timeline is unstable.
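If you want to see a gap and an overlap with your own eyes, Python's zoneinfo makes it easy (the zone and dates are just examples; any DST-observing timeline will do):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

pacific = ZoneInfo("US/Pacific")

# Overlap: 01:30 on 2021-11-07 happened twice.  The label alone is ambiguous,
# and Python disambiguates with the `fold` attribute.
first  = datetime(2021, 11, 7, 1, 30, tzinfo=pacific, fold=0)
second = datetime(2021, 11, 7, 1, 30, tzinfo=pacific, fold=1)
print(first.tzname(), second.tzname())      # PDT PST -- one label, two different instants

# Gap: 02:30 on 2021-03-14 never appeared on any Pacific wall clock, yet the
# library happily accepts the label and silently maps it to a nearby instant.
ghost = datetime(2021, 3, 14, 2, 30, tzinfo=pacific)
print(ghost.astimezone(ZoneInfo("UTC")))    # 2021-03-14 10:30:00+00:00
```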
Most software uses "UTC" values that don't match the official definition of UTC. Specifically, most Windows and Unix systems ignore leap seconds, meaning the timeline their libraries use doesn't include them. In the specific case of Unix, the "time" variable which claims to be the "seconds past epoch" is actually the "non-leap seconds past epoch." The traditional model tries to sweep all this under the rug; the Chronian model simply defines a different timeline for each way of counting (i.e. with and without leap seconds). Any conversion between those timelines is then explicit.
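You can watch the leap second vanish with nothing more than the standard library. The minute before the 2016-12-31 leap second really contained 61 seconds, but the POSIX-style timeline has no room for the extra one:

```python
from datetime import datetime, timezone

a = datetime(2016, 12, 31, 23, 59, 0, tzinfo=timezone.utc)
b = datetime(2017, 1, 1, 0, 0, 0, tzinfo=timezone.utc)

print((b - a).total_seconds())   # 60.0 -- but 61 SI seconds actually elapsed (23:59:60 existed)
print(b.timestamp())             # 1483228800.0 -- "non-leap seconds past epoch"
```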
Most computer clocks don't match atomic reference clocks, especially around leap seconds. Some computer clocks simply end up a second fast and then get yanked back into sync by NTP, causing jitter and possible time-moving-backwards problems. Others use techniques like UTC-SLS (smoothed leap seconds), which deliberately introduce a variable slew, leaving the clock a fraction of a second off from true UTC for a period of several minutes. The traditional model pretends that the clock always matches true UTC; the Chronian model allows an explicit specification of the differences.
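In code, "explicit specification of the differences" can be as simple as recording, for each clock, which timeline it claims to follow and how far it can stray from it. The class and the numbers below are purely illustrative, not from any existing library:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ClockModel:
    follows: str                 # which timeline this clock claims to track
    max_offset_seconds: float    # worst-case deviation from that timeline
    monotonic: bool              # can readings ever jump backwards?

# An NTP-stepped clock tracks UTC but can be off by up to a second near a leap,
# and a step correction can move it backwards.
ntp_stepped = ClockModel(follows="UTC", max_offset_seconds=1.0, monotonic=False)

# A smearing clock never steps, but deliberately drifts up to a second away
# from true UTC while the leap second is being amortized.
smearing = ClockModel(follows="UTC", max_offset_seconds=1.0, monotonic=True)
```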
In the Chronian model the internal representation of date and time is not important and should be opaque to the API user. Of course the choice of representation may be very important for performance and compatibility with legacy data, but as far as conceptual understanding and correctness are concerned the internals are of no importance. This is how it should be; implementation details are supposed to be hidden.
In contrast, libraries using the traditional model usually have to tell you what the internal representation is so that you can understand the semantics of the library. This muddying of the boundary between interface and implementation is often a symptom of poor design, but in this case I believe it's a symptom of faulty modeling.
In object-oriented languages the obvious design is to have one class per timeline, but that may not always be the best solution. For example there is nothing wrong with having a single class to represent all local times, so long as each object of the class knows what timeline it's part of.
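Here is a sketch of the "single class that knows its timeline" option; everything in it is made up for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LocalTimepoint:
    timeline: str      # e.g. "US/Pacific" or "Asia/Tokyo": which timeline this label lives on
    year: int
    month: int
    day: int
    hour: int = 0
    minute: int = 0
    second: int = 0

meeting = LocalTimepoint("US/Pacific", 2030, 3, 20, 9, 0)
# Converting to another timeline is an explicit (and possibly unsafe) operation
# between two named timelines, never a silent reinterpretation.
```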
The parsing and serialization formats supported have nothing to do with the internal representation. It is entirely reasonable to have more than one external name for a given timepoint, and providing such multiple formats has no effect on the semantics of the timepoints.
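For example, the same timepoint can be rendered under several external names without that telling you anything about how it is stored (Python standard library, purely for illustration):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

t = datetime(2024, 6, 1, 21, 30, tzinfo=ZoneInfo("UTC"))

print(t.isoformat())                           # 2024-06-01T21:30:00+00:00
print(t.strftime("%a, %d %b %Y %H:%M:%S %z"))  # Sat, 01 Jun 2024 21:30:00 +0000
print(t.timestamp())                           # 1717277400.0
```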
How does the Chronian model help? Let's look at a few examples of things which induce brain-pain in traditional models.
How do we know which timelines are stable or unstable, and which conversions are safe or unsafe? It's about as hard as it ever was to figure that out, but we can do it once and compile a table, then never have to worry about it again. If we're smart, we'll let somebody else create the table for us (and post it here!) and then all we have to do is learn to use it properly. Note that the same table can be used for any language/API/library, since these are properties of the model, not of any particular implementation.
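A fragment of such a table might look like this. The entries simply restate facts discussed above; a real table would be much longer, and somebody should check every row:

```python
# Illustrative fragment only.
TIMELINE_STABILITY = {
    "TAI":         "stable",
    "UTC":         "unstable (leap seconds)",
    "POSIX time":  "unstable (leap seconds ignored, clock steps)",
    "US/Pacific":  "unstable (DST and tz-rule changes)",
}

CONVERSION_SAFETY = {
    ("TAI", "UTC"):        "unsafe for future timepoints",
    ("UTC", "US/Pacific"): "unsafe for future timepoints",
}
```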
Since we can't safely convert local time to UTC time, does that mean we have to store the local time in our database? In a word, yes. I know of no other completely reliable way to do it.
The general principle is to use whatever timeline is "appropriate". For instance, the scheduled start time of a future meeting should almost certainly be represented as a timepoint in local time, since that's what people will use to determine when the meeting starts. On the other hand, entries in a debugging log could benefit from being timestamped in UTC, to make it easier to compare entries collected from across timezones. The Chronian model doesn't tell you what to do, but it does give you a vocabulary and tell you what you can do.
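In practice that might look something like this (the record shapes and field names are mine, purely illustrative):

```python
from datetime import datetime, timezone

# A future meeting: store the wall-clock label plus the timeline it belongs to,
# and convert to other timelines only at the moment you actually need to.
meeting_record = {
    "wall_clock": "2030-03-20T09:00:00",
    "timeline": "US/Pacific",
}

# A debugging log entry: a timestamp on the UTC timeline makes it easy to
# interleave entries collected from machines in different zones.
log_record = {
    "utc": datetime.now(timezone.utc).isoformat(),
    "message": "cache rebuilt",
}
```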
These terms are often confused, but they are three different concepts and each has a precise definition. There isn't anything special about how they apply to Chronian as opposed to any other model, but if you're interested enough to have read this far then I recommend you do a web search for those terms, or just read this article.
In both the traditional and Chronian models:
Besides that, my ego just can't tolerate the idea of knowingly writing code that sometimes gives wrong answers at random.
Also, the places where existing libraries fail are often relatively rare or peculiar edge cases, which means that unless you are specifically looking for them you can easily miss them even in an extensive test suite.
Finally, the fact that a library is "standard" or "official" doesn't mean it's good. Just ask any Java programmer for their opinion of the original date/time API supplied with the JDK.
Sounds like fun, but I have a mortgage to pay, mouths to feed, and a job that I like.
If you would like to build such a thing (open-source of course; nobody should own time) I'd be happy to consult with you; I just don't have time to do the massive amount of grunt work that would be required.
Please feel free to email me feedback about Chronian. Praise is always sweet, but constructive criticism is more useful.