A look at the internals of Joda-Time

Getting date and time algebra to work just right has always been a challenging task in software development, so I decided to take a look at the implementation of the Joda-Time library.

The DateTime class

The obvious place to start looking, if you’re at all familiar with the library, is the DateTime class.
But soon after starting to browse it you realise that it isn’t much more than a container of 2 fields:

  • a timestamp (long, milliseconds since 1970-01-01)
  • an instance of the Chronology abstract class

It also contains (a lot of) methods to create new instances of the same DateTime class, by adding or subtracting time to the current instance or by modifying selected fields, for example the day of year.
All these methods do though is just use the Chronology object to compute the new timestamp, and then create a new instance of DateTime with this timestamp and with the same Chronology.

The Chronology class(es)

Chronology

Chronology is an an abstract class containing some general purpose methods to convert dates (i.e. combination of year/month/day of month/hour of day.. values) to timestamps and vice versa. None of the methods are implemented here. The class also doesn’t declare any fields, as the timestamp is always either passed as parameter or returned by a method.

BaseChronology

Some of the methods are implemented by the BaseChronology subclass, which is also abstract. We’re going to take a look at the getDateTimeMillis() methods in particular.

Here’s the implementation of one of the signatures of this method, which converts a (year, month, day, millis of day) tuple into a timestamp.

public long getDateTimeMillis(int year, int monthOfYear, int dayOfMonth, int millisOfDay)
throws IllegalArgumentException
{
    long instant = year().set(0, year);
    instant = monthOfYear().set(instant, monthOfYear);
    instant = dayOfMonth().set(instant, dayOfMonth);
    return millisOfDay().set(instant, millisOfDay);
}

What is happening here? The methods year(), monthOfYear(), dayOfMonth(), and millisOfDay() return objects of type DateTimeField. For now, it’s enough to know that we can call long set(long instant, int value) on these objects to update a timestamp according to the value of a particular field. Since I know that didn’t make any sense, I’ll give an example. Let’s say we have a timestamp t1 representing the date “January 3rd 2015”, and an instance m of DateTimeField representing “months”, then we can do

    long t2 = m.set(t1, 2)

to get the timestamp t2 for “February 3rd 2015”.

The implementation shown above of getDateTimeMillis() makes much more sense now. What it does is update the 0 timestamp with the desired year, and then it keeps updating the resulting timestamps with all the remaining fields.

BasicChronology

Further down the inheritance tree we find the BasicChronology.
From the class javadoc:

Abstract implementation for calendar systems that use a typical day/month/year/leapYear model.

This class provides* a concrete definition for the DateTimeField objects used by the getDateTimeMillis() in BaseChronology, discussed in the previous section.

* through both AssembledChronology and its own assemble(Fields fields) method

It also defines several methods:

  • long getYearMillis(int year) – get the timestamp for Jan 1st 00:00:00 of the specified year
  • long getYearMonthMillis(int year, int month) – get milliseconds for the first day of the specified month
  • long getYearMonthDayMillis(int year, int month, int dayOfMonth) – same, with also the day specified
  • int getDayOfWeek(long instant)
  • int getMonthOfYear(long millis)
  • int getDayOfMonth(long millis)
  • int getYear(long instant)

And so on. If you want to have some fun, look at the implementation of int getMonthOfYear(long millis, int year) in BasicGJChronology, used by int getMonthOfYear(long millis) in BasicChronology.
Note that having the timestamp for the start of the year, it is relatively easy to compute the timestamp for the start of a particular month/day/hour of day.. combination. For instance, this is how getYearMonthDayMillis() is implemented:

long getYearMonthDayMillis(int year, int month, int dayOfMonth) {
    long millis = getYearMillis(year);
    millis += getTotalMillisByYearMonth(year, month);
    return millis + (dayOfMonth - 1) * (long)DateTimeConstants.MILLIS_PER_DAY;
}

We will see how getYearMillis() is implemented when we look at GregorianChronology below.

The getTotalMillisByYearMonth() used in the second line is tedious but not conceptually hard to implement.

Maybe the most interesting method here though is the getYear(). Let’s look at the implementation.

    int getYear(long instant) {
        // Get an initial estimate of the year, and the millis value that
        // represents the start of that year. Then verify estimate and fix if
        // necessary.

        // Initial estimate uses values divided by two to avoid overflow.
        long unitMillis = getAverageMillisPerYearDividedByTwo();
        long i2 = (instant >> 1) + getApproxMillisAtEpochDividedByTwo();
        if (i2 < 0) {
            i2 = i2 - unitMillis + 1;
        }
        int year = (int) (i2 / unitMillis);

        long yearStart = getYearMillis(year);
        long diff = instant - yearStart;

        if (diff < 0) {
            year--;
        } else if (diff >= DateTimeConstants.MILLIS_PER_DAY * 365L) {
            // One year may need to be added to fix estimate.
            long oneYear;
            if (isLeapYear(year)) {
                oneYear = DateTimeConstants.MILLIS_PER_DAY * 366L;
            } else {
                oneYear = DateTimeConstants.MILLIS_PER_DAY * 365L;
            }

            yearStart += oneYear;

            if (yearStart <= instant) {
                // Didn't go too far, so actually add one year.
                year++;
            }
        }

        return year;
    }

Quite a lot of stuff. Let’s break it down.

long unitMillis = getAverageMillisPerYearDividedByTwo();

Simply the number of milliseconds in a year, divided by 2. Implemented in GregorianChronology.

long i2 = (instant >> 1) + getApproxMillisAtEpochDividedByTwo();

This is basically: i2 = (timestamp / 2) + (millisAtEpoch / 2) = (timestamp + millisAtEpoch) / 2

Let’s assume that i2 is always non negative and ignore the if for now. Skipping to the next line we have:

int year = (int) (i2 / unitMillis);

Which is basically (timestamp + millisAtEpoch) / 2, divided by millisPerYear / 2. Use some pen and paper to see that this becomes (timestamp + millisAtEpoch) / millisPerYear, which is an initial estimate for the year!

Going back to the case when i2 is negative, we see that the code does this:

i2 = i2 - unitMillis + 1;

Why? If we ignore the divisions by 2 (done to prevent overflow), we understand that i2 represents the current milliseconds since the year 0. So, it is negative simply when the original timestamp represents an instant before the year 0.

Think about the following. If i2 is -1 (milliseconds), we are in December 31 23:59:59:999 of year -1. Since we are later dividing i2 by the number of milliseconds in a year, we subtract that number from i2 when it’s negative, so that when we do the division by unitMillis we get the correct year, -1, and not 0. Same thing for all the years before that.

We can now forget about those variables as they’re not used again and were only useful in obtaining the initial estimate for the year.

Now this estimate can never be off by more than 1 year, and sometimes it will be correct.

The remainder of the method checks whether we need to fix the estimate, and if so whether we need to add or subtract a year from it. We begin by getting the timestamp of the start of the estimated year using getYearMillis(), which we discussed before. We also compute the difference between the original timestamp and this start of the year:

long yearStart = getYearMillis(year);
long diff = instant - yearStart;

If instant – yearStart is negative it means that instant comes before yearStart in time, so we need to fix our estimate by decreasing our estimate by one:

if (diff < 0) {
    year--;
}

We also need to fix the estimate if the difference between our estimated start of the year and the original timestamp is more than a year, and this is what the else-if checks:

} else if (diff >= MILLIS_PER_DAY * 365L) {

Inside the else-if, we add 1 year to yearStart, by adding the appropriate amount of milliseconds depending on whether the estimate year is a leap year or not:

long oneYear;
if (isLeapYear(year)) {
    oneYear = MILLIS_PER_DAY * 366L;
} else {
    oneYear = MILLIS_PER_DAY * 365L;
}

yearStart += oneYear;

Now we only actually have to fix the estimate if the new yearStart is still before the original timestamp, even after adding 1 year to it. The opposite can happen if diff was between 365 and 366 days (remember we are dealing with milliseconds so intermediate values are possible) and we added 366 because it is a leap year. So, we have the last lines inside this else-if:

if (yearStart <= instant) {
    year++;
}

Finally, if the difference between the original timestamp and the estimated year start is positive and less than 365 days, the original estimate is correct and we don’t need to fix at all, ignoring both the if and the else-if. In any case we are done and can return the estimate, which is now the correct result:

return year;

GregorianChronology

The abstract method calculateFirstDayOfYearMillis() is used by getYearMillis() to compute the milliseconds for Jan 1st of the specified year.

Let’s look at the implementation in GregorianChronology.

    long calculateFirstDayOfYearMillis(int year) {
        // Initial value is just temporary.
        int leapYears = year / 100;
        if (year < 0) {
            // Add 3 before shifting right since /4 and >> 2 behave differently
            // on negative numbers. When the expression is written as
            // (year / 4) - (year / 100) + (year / 400),
            // it works for both positive and negative values, except this optimization
            // eliminates two divisions.
            leapYears = ((year + 3) >> 2) - leapYears + ((leapYears + 3) >> 2) - 1;
        } else {
            leapYears = (year >> 2) - leapYears + (leapYears >> 2);
            if (isLeapYear(year)) {
                leapYears--;
            }
        }

        return (year * 365L + (leapYears - DAYS_0000_TO_1970)) * DateTimeConstants.MILLIS_PER_DAY;
    }

Removing the optimisation, it becomes:

long calculateFirstDayOfYearMillis(int year) {
    int leapYears = (year / 4) - (year / 100) + (year / 400);
    return (year * 365L + (leapYears - DAYS_0000_TO_1970)) * DateTimeConstants.MILLIS_PER_DAY;
}

The first line computes the number of leap years occurring between the year 0 and the argument.

It works because a year is a leap year if it is divisible by 4, except when it is divisible by 100 – except when it is divisible by 400 -. Wikipedia is your friend if you’re looking for a more in depth explanation.

The second line takes the number of days from Jan 1st of the year 0 to the desired year (year * 365L), adds one day for every leap year, and subtracts the precomputed (719527) number of days between the year 0 and Jan 1st 1970.
This way, we have the number of days from epoch to Jan 1st of the desired year, and we can just multiply that number by the number of milliseconds in a day to obtain the timestamp.

The DateTimeField class

As briefly mentioned above, this class is used to manipulate timestamp values based on date/time fields such as
year, month, day of month and so on.

The two main methods are

  • long set(long instant, int value)
  • int get(long instant)

We described the set() method above – it updates the given timestamp with the field value.
The get() method simply “extracts” the field value from the timestamp.
For example if the timestamp corresponds to “January 3rd 2015” and the DateTimeField represents “months”, the get() will return 1, since January is the first month of the year.

As you might expect this is an abstract class with implementations for all the possible date/time fields.

Many of the implementations use methods from the Chronology class. Exploring all them is beyond the scope of this post.

Conclusions

The Joda-Time library is beautifully engineered and interesting to read, especially if you never had to deal with date and time computations directly before.

If you feel like I missed something important or made any mistakes, feel free to contact me at simone.russo89@gmail.com

Leave a comment