Universal Time Scale

Overview

There are quite a few different conventions for binary datetime, depending on the platform or protocol. Some of these have severe drawbacks. For example, people using Unix time (seconds since Jan 1, 1970, usually in a 32-bit integer) think that they are safe until near the year 2038. But cases can and do arise where arithmetic manipulations cause serious problems. Consider computing the average of two datetimes: if one calculates it with averageTime = (time1 + time2)/2, the intermediate sum overflows a signed 32-bit integer for dates as early as 2004. Moreover, even if these problems don't occur, there is the issue of converting back and forth between different systems.
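The overflow is easy to reproduce. The following is a minimal sketch in Java, assuming Unix seconds held in a 32-bit int; the class and method names are hypothetical and chosen only for illustration. It averages a timestamp from January 2004 with itself, once naively and once after widening to 64 bits.

```java
public final class AverageOverflow {
    // Naive average: the intermediate sum can exceed Integer.MAX_VALUE and wrap around.
    static int naiveAverage(int time1, int time2) {
        return (time1 + time2) / 2;
    }

    // Safer variant: widen both operands to 64 bits before adding, then narrow the result.
    static int safeAverage(int time1, int time2) {
        return (int) (((long) time1 + (long) time2) / 2);
    }

    public static void main(String[] args) {
        int t = 1_073_779_200; // 2004-01-11T00:00:00Z as Unix seconds, just above 2^30
        System.out.println("naive: " + naiveAverage(t, t)); // negative, because the sum wrapped
        System.out.println("safe:  " + safeAverage(t, t));  // the original timestamp
    }
}
```

Any timestamp above 2^30 seconds (reached on 2004-01-10) triggers the wrap when added to itself, which is why the problem shows up decades before 2038.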

Binary datetimes differ in a number of ways: the data type, the unit, and the epoch (origin). We'll refer to these as time scales. For example: (Sorted by epoch and unit, descending. In Java, int64_t=long and int32_t=int.)

| Source | Data Type | Epoch | Unit |
|---|---|---|---|
| MacOS X (CFDate/NSDate) | double (1.0=1s but fractional seconds are used as well; imprecise for 0.1s etc.) | 2001-Jan-01 | seconds (and fractions thereof) |
| Unix time_t | int32_t or int64_t (signed int32_t limited to 1970..2038) | 1970-Jan-01 | seconds |
| Java Date | int64_t | 1970-Jan-01 | milliseconds |
| Joda DateTime | int64_t | 1970-Jan-01 | milliseconds |
| ICU4C UDate | double (does not use fractional milliseconds) | 1970-Jan-01 | milliseconds |
| JavaScript Date | double (does not use fractional milliseconds; JavaScript Number stores a double) | 1970-Jan-01 | milliseconds |
| Unix struct timeval (as in gettimeofday) | struct: time_t (seconds); suseconds_t (microseconds) | 1970-Jan-01 | microseconds |
| Gnome g_get_real_time() | gint64 | 1970-Jan-01 | microseconds |
| Unix struct timespec (as in clock_gettime) | struct: time_t (seconds); long (nanoseconds) | 1970-Jan-01 | nanoseconds |
| MacOS (old) | uint32_t (1904..2040) | 1904-Jan-01 | seconds |
| Excel | ? | 1899-Dec-31 | days |
| DB2 | ? | 1899-Dec-31 | days |
| Windows FILETIME | int64_t | 1601-Jan-01 | ticks (100 nanoseconds; finest granularity in industry) |
| .NET DateTime | uint62 (only 0001-9999; only 62 bits; also 2-bit field for UTC/local) | 0001-Jan-01 | ticks (100 nanoseconds; finest granularity in industry) |
| ICU Universal Time Scale | int64_t | 0001-Jan-01 | same as .NET but allows 29000BC..29000AD |

All of the epochs start at 00:00 (midnight, the earliest possible time on the day in question), and are usually assumed to be UTC.

The ranges, in years, for different data types are given in the following table. The range for integer types includes the entire range expressible with positive and negative values of the data type. The range for double is the range that would be allowed without losing precision to the corresponding unit.

| Units | 64-bit integer | Double | 32-bit integer |
|---|---|---|---|
| 1 second | 5.84542 x 10^11 | 285,420,920.94 | 136.10 |
| 1 millisecond | 584,542,046.09 | 285,420.92 | 0.14 |
| 1 microsecond | 584,542.05 | 285.42 | 0.00 |
| 100 nanoseconds (tick) | 58,454.20 | 28.54 | 0.00 |
| 1 nanosecond | 584.5420461 | 0.2854 | 0.00 |
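The figures in this table can be reproduced with a short calculation. The sketch below assumes a 365.25-day year (an assumption, though it appears consistent with the published values) and treats a double as having 53 usable bits; the class and method names are hypothetical.

```java
public final class TimeScaleRange {
    // Seconds in a 365.25-day year (assumed; appears to match the table's figures).
    static final double SECONDS_PER_YEAR = 365.25 * 86_400.0;

    // Years spanned by 2^bits distinct values, each representing unitSeconds.
    static double rangeInYears(int bits, double unitSeconds) {
        return Math.pow(2.0, bits) * unitSeconds / SECONDS_PER_YEAR;
    }

    public static void main(String[] args) {
        double tick = 1e-7; // 100 nanoseconds
        System.out.printf("64-bit integer, ticks:   %.2f years%n", rangeInYears(64, tick));
        System.out.printf("double (53 bits), ticks: %.2f years%n", rangeInYears(53, tick));
        System.out.printf("32-bit integer, seconds: %.2f years%n", rangeInYears(32, 1.0));
    }
}
```

The middle line is the interesting one: a double counting ticks stays exact over less than three decades, which is one reason a 64-bit integer is the better carrier at this granularity.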

ICU implements a universal time scale that is similar to the .NET framework's System.DateTime. The universal time scale is a 64-bit integer that holds ticks since midnight, January 1st, 0001. Negative values are supported. This has enough range to guarantee that calculations involving dates around the present are safe.

The universal time scale always measures time according to the proleptic Gregorian calendar. That is, the Gregorian calendar's leap year rules are used for all times, even before 1582 when it was introduced. (This is different from the default ICU calendar which switches from the Julian to the Gregorian calendar in 1582. See GregorianCalendar::setGregorianChange() and ucal_setGregorianChange().)
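To make the proleptic rule concrete, the sketch below counts days from 0001-01-01 to the Unix epoch using java.time.LocalDate, which follows the proleptic ISO (Gregorian) rules, and converts that count to ticks. It uses java.time rather than ICU, and the class and method names are hypothetical.

```java
import java.time.LocalDate;
import java.time.temporal.ChronoUnit;

public final class ProlepticEpoch {
    // 86,400 seconds per day times 10,000,000 ticks (100 ns each) per second.
    static final long TICKS_PER_DAY = 86_400L * 10_000_000L;

    // Days between two dates, applying Gregorian leap year rules to every year.
    static long daysBetween(LocalDate start, LocalDate end) {
        return ChronoUnit.DAYS.between(start, end);
    }

    // Ticks from midnight 0001-01-01 to midnight 1970-01-01.
    static long unixEpochInTicks() {
        long days = daysBetween(LocalDate.of(1, 1, 1), LocalDate.of(1970, 1, 1));
        return days * TICKS_PER_DAY;
    }

    public static void main(String[] args) {
        System.out.println(unixEpochInTicks());
    }
}
```

A calendar that switched from Julian to Gregorian rules in 1582 would count a different number of days over the same interval, so the two conventions must not be mixed when computing offsets.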

ICU provides conversion functions to and from all other major time scales, allowing datetimes in any time scale to be converted to the universal time scale, safely manipulated, and converted back to any other datetime time scale.

Background

So how did we decide what to use for the universal time scale? Java time has plenty of range, but cannot represent a .NET System.DateTime value without severe loss of precision. ICU4C time addresses this by using a double that is otherwise equivalent to the Java time. However, doubles have disadvantages of their own. They degrade much more gracefully in arithmetic operations, but they have only 53 bits of precision, which means that they lose precision when converting back and forth to ticks. What would really be nice would be a long double (80 bits, with a 64-bit mantissa), but that is not supported on most systems.

The Unix extended time uses a structure with two components: time in seconds and a fractional field (microseconds). However, this is clumsy, slow, and prone to error (you always have to keep track of overflow and underflow in the fractional field). BigDecimal would allow for arbitrary precision and arbitrary range, but we did not want to use this as the normal type, because it is slow and does not have a fixed size.

Because of these issues, we concluded that the .NET System.DateTime is the best timescale to use. However, we use the full range allowed by the data type, allowing for datetimes back to 29,000 BC and up to 29,000 AD. (System.DateTime uses only 62 bits and only supports dates from 0001 AD to 9999 AD.) This time scale is very fine grained, does not lose precision, and covers a range that will meet almost all requirements. It will not handle the range that Java times do, but frankly, being able to handle dates before 29,000 BC or after 29,000 AD is of very limited interest.

Constants

ICU provides routines to convert from other timescales to the universal time scale, to convert from the universal time scale to other timescales, and to get information about a particular timescale. In all of these routines, the timescales are referenced using an integer constant, according to the following table:

| Source | ICU4C | ICU4J |
|---|---|---|
| Java | UDTS_JAVA_TIME | JAVA_TIME |
| Unix | UDTS_UNIX_TIME | UNIX_TIME |
| ICU4C | UDTS_ICU4C_TIME | ICU4C_TIME |
| Windows FILETIME | UDTS_WINDOWS_FILE_TIME | WINDOWS_FILE_TIME |
| .NET DateTime | UDTS_DOTNET_DATE_TIME | DOTNET_DATE_TIME |
| Macintosh (old) | UDTS_MAC_OLD_TIME | MAC_OLD_TIME |
| Macintosh | UDTS_MAC_TIME | MAC_TIME |
| Excel | UDTS_EXCEL_TIME | EXCEL_TIME |
| DB2 | UDTS_DB2_TIME | DB2_TIME |
| Unix with microseconds | UDTS_UNIX_MICROSECONDS_TIME | UNIX_MICROSECONDS_TIME |

The routine that gets a particular piece of information about a timescale takes an integer constant that identifies the particular piece of information, according to the following table:

| Value | ICU4C | ICU4J |
|---|---|---|
| Precision | UTSV_UNITS_VALUE | UNITS_VALUE |
| Epoch offset | UTSV_EPOCH_OFFSET_VALUE | EPOCH_OFFSET_VALUE |
| Minimum “from” value | UTSV_FROM_MIN_VALUE | FROM_MIN_VALUE |
| Maximum “from” value | UTSV_FROM_MAX_VALUE | FROM_MAX_VALUE |
| Minimum “to” value | UTSV_TO_MIN_VALUE | TO_MIN_VALUE |
| Maximum “to” value | UTSV_TO_MAX_VALUE | TO_MAX_VALUE |

Here is what the values mean:

  • Precision -- the precision of the timescale, in ticks.
  • Epoch offset -- the distance from the universal timescale's epoch to the timescale's epoch, in the timescale's precision.
  • Minimum “from” value -- the minimum timescale value that can safely be converted to the universal timescale.
  • Maximum “from” value -- the maximum timescale value that can safely be converted to the universal timescale.
  • Minimum “to” value -- the minimum universal timescale value that can safely be converted to the timescale.
  • Maximum “to” value -- the maximum universal timescale value that can safely be converted to the timescale.
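Taken together, the precision and the epoch offset suggest how a conversion works arithmetically: shift the value by the epoch offset (expressed in the timescale's own units), then scale by the precision. The following is a minimal sketch for Unix time, assuming 10,000,000 ticks per second and an offset of 62,135,596,800 seconds between the two epochs. The names are hypothetical, this is not ICU's implementation, and ICU's rounding on the way back may differ from the floor division used here.

```java
public final class TimeScaleSketch {
    // Precision of Unix time, in ticks: one second is 10,000,000 ticks of 100 ns.
    static final long TICKS_PER_SECOND = 10_000_000L;
    // Seconds from 0001-01-01 to 1970-01-01 in the proleptic Gregorian calendar.
    static final long UNIX_EPOCH_OFFSET_SECONDS = 62_135_596_800L;

    // Unix seconds -> universal ticks; throws ArithmeticException on overflow.
    static long fromUnix(long unixSeconds) {
        long shifted = Math.addExact(unixSeconds, UNIX_EPOCH_OFFSET_SECONDS);
        return Math.multiplyExact(shifted, TICKS_PER_SECOND);
    }

    // Universal ticks -> Unix seconds; sub-second ticks are discarded (floor).
    static long toUnix(long universalTicks) {
        return Math.floorDiv(universalTicks, TICKS_PER_SECOND) - UNIX_EPOCH_OFFSET_SECONDS;
    }

    public static void main(String[] args) {
        long ticks = fromUnix(0L);
        System.out.println(ticks);         // universal value of the Unix epoch
        System.out.println(toUnix(ticks)); // back to 0
    }
}
```

The exact-arithmetic helpers make the "safely be converted" wording in the list tangible: a value outside the safe range fails loudly instead of wrapping silently.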

Converting

You can convert from other timescale values to the universal timescale using the “from” methods. In ICU4C, you use utmscale_fromInt64:

UErrorCode err = U_ZERO_ERROR;
int64_t unixTime = ...;
int64_t universalTime;

universalTime = utmscale_fromInt64(unixTime, UDTS_UNIX_TIME, &err);

In ICU4J, you use UniversalTimeScale.from:

long javaTime = ...;
long universalTime;

universalTime = UniversalTimeScale.from(javaTime, UniversalTimeScale.JAVA_TIME);

You can convert values in the universal timescale to other timescales using the “to” methods. In ICU4C, you use utmscale_toInt64:

UErrorCode err = U_ZERO_ERROR;
int64_t universalTime = ...;
int64_t unixTime;

unixTime = utmscale_toInt64(universalTime, UDTS_UNIX_TIME, &err);

In ICU4J, you use UniversalTimeScale.to:

long universalTime = ...;
long javaTime;

javaTime = UniversalTimeScale.to(universalTime, UniversalTimeScale.JAVA_TIME);

That's all there is to it!

If the conversion is out of range, the ICU4C routines will set the error code to U_ILLEGAL_ARGUMENT_ERROR, and the ICU4J methods will throw IllegalArgumentException. In ICU4J, you can avoid out of range conversions by using the BigDecimal methods:

long fileTime = ...;
double icu4cTime = ...;
BigDecimal utICU4C, utFile, utUnix, unixTime, macTime;

// Convert to the universal time scale without a range limit.
utFile   = UniversalTimeScale.bigDecimalFrom(fileTime, UniversalTimeScale.WINDOWS_FILE_TIME);
utICU4C  = UniversalTimeScale.bigDecimalFrom(icu4cTime, UniversalTimeScale.ICU4C_TIME);

// Convert universal time scale values to other time scales.
unixTime = UniversalTimeScale.toBigDecimal(utFile, UniversalTimeScale.UNIX_TIME);
macTime  = UniversalTimeScale.toBigDecimal(utICU4C, UniversalTimeScale.MAC_TIME);

// Convert a BigDecimal Unix time back to the universal time scale.
utUnix   = UniversalTimeScale.bigDecimalFrom(unixTime, UniversalTimeScale.UNIX_TIME);
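To see why arbitrary-precision arithmetic sidesteps the range limit, the sketch below performs a Unix-to-universal conversion by hand with java.math.BigDecimal. It is a self-contained illustration with hypothetical names and assumed constants (10,000,000 ticks per second, 62,135,596,800 seconds between epochs), not a reproduction of the ICU4J methods.

```java
import java.math.BigDecimal;

public final class BigDecimalConversion {
    static final BigDecimal TICKS_PER_SECOND = BigDecimal.valueOf(10_000_000L);
    static final BigDecimal UNIX_EPOCH_OFFSET_SECONDS = BigDecimal.valueOf(62_135_596_800L);

    // (unixSeconds + epoch offset) * ticks per second, with no fixed upper bound.
    static BigDecimal unixToUniversal(BigDecimal unixSeconds) {
        return unixSeconds.add(UNIX_EPOCH_OFFSET_SECONDS).multiply(TICKS_PER_SECOND);
    }

    public static void main(String[] args) {
        // A Unix time far too large to convert with 64-bit integer arithmetic.
        BigDecimal huge = BigDecimal.valueOf(Long.MAX_VALUE);
        BigDecimal universal = unixToUniversal(huge);
        System.out.println(universal);
        // The result no longer fits in a long, yet it is still exact.
        System.out.println(universal.compareTo(BigDecimal.valueOf(Long.MAX_VALUE)) > 0);
    }
}
```

The cost is the one named in the Background section: BigDecimal values are slower to manipulate and have no fixed size.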

Note: Because the Universal Time Scale has a finer resolution than some other time scales, time values that can be represented exactly in the Universal Time Scale will be rounded when converted to these time scales, and resolution will be lost. If you convert these values back to the Universal Time Scale, you will not get the same time value that you started with. If the time scale to which you are converting uses a double to represent the time value, you may lose precision even though the double supports a range that is larger than the range supported by the Universal Time Scale.
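Both kinds of loss described in this note can be demonstrated with plain arithmetic. The sketch below uses hypothetical names and simple floor division rather than ICU's own rounding; it sends a tick count through whole milliseconds and through a double, then converts back.

```java
public final class RoundTripLoss {
    static final long TICKS_PER_MILLISECOND = 10_000L;

    // Ticks -> whole milliseconds -> ticks; sub-millisecond ticks are dropped.
    static long throughMilliseconds(long ticks) {
        long millis = Math.floorDiv(ticks, TICKS_PER_MILLISECOND);
        return millis * TICKS_PER_MILLISECOND;
    }

    // Ticks -> double -> ticks; a double keeps only 53 significant bits.
    static long throughDouble(long ticks) {
        return (long) (double) ticks;
    }

    public static void main(String[] args) {
        long ticks = 621_355_968_000_012_345L; // shortly after the Unix epoch
        System.out.println("lost via milliseconds: " + (ticks - throughMilliseconds(ticks)));
        System.out.println("lost via double:       " + (ticks - throughDouble(ticks)));
    }
}
```

The double case loses information even though the value is far inside a double's range, because present-day tick counts already exceed 2^53.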

Formatting and Parsing

Currently, ICU does not support direct formatting or parsing of Universal Time Scale values. If you want to format a Universal Time Scale value, you will need to convert it to an ICU time scale value first. Use UDTS_ICU4C_TIME with ICU4C, and UniversalTimeScale.JAVA_TIME with ICU4J.

When you parse a datetime string, the result will be an ICU time scale value. You can convert this value to a Universal Time Scale value using UDTS_ICU4C_TIME with ICU4C, and UniversalTimeScale.JAVA_TIME with ICU4J.

See the previous section, Converting, for details of how to do the conversion.
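The convert-then-format pattern can be sketched as follows. To keep the example self-contained, it substitutes plain arithmetic for the ICU4J conversion and java.time.Instant for an ICU formatter; both substitutions are assumptions made for illustration, and the names are hypothetical.

```java
import java.time.Instant;

public final class FormatUniversalTime {
    static final long TICKS_PER_MILLISECOND = 10_000L;
    // Milliseconds from 0001-01-01 to 1970-01-01 (proleptic Gregorian calendar).
    static final long JAVA_EPOCH_OFFSET_MILLIS = 62_135_596_800_000L;

    // Step 1: universal ticks -> Java milliseconds (sub-millisecond ticks dropped).
    static long toJavaMillis(long universalTicks) {
        return Math.floorDiv(universalTicks, TICKS_PER_MILLISECOND) - JAVA_EPOCH_OFFSET_MILLIS;
    }

    // Step 2: hand the millisecond value to a formatter (ISO-8601 via Instant here).
    static String format(long universalTicks) {
        return Instant.ofEpochMilli(toJavaMillis(universalTicks)).toString();
    }

    public static void main(String[] args) {
        System.out.println(format(621_355_968_000_000_000L)); // the Unix epoch
    }
}
```

In an ICU-based program, step 1 would be the UniversalTimeScale.to call shown in the Converting section, and step 2 would be whichever ICU date formatter the application already uses.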

Getting Timescale Information

To get information about a particular timescale in ICU4C, use utmscale_getTimeScaleValue:

UErrorCode err = U_ZERO_ERROR;
int64_t unixEpochOffset = utmscale_getTimeScaleValue(
    UDTS_UNIX_TIME,
    UTSV_EPOCH_OFFSET_VALUE,
    &err);

In ICU4J, use UniversalTimeScale.getTimeScaleValue:

long javaEpochOffset = UniversalTimeScale.getTimeScaleValue(
    UniversalTimeScale.JAVA_TIME,
    UniversalTimeScale.EPOCH_OFFSET_VALUE);

If the integer constants for selecting the timescale or the timescale value are out of range, the ICU4C routines will set the error code to U_ILLEGAL_ARGUMENT_ERROR, and the ICU4J methods will throw IllegalArgumentException.
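One practical use of the minimum and maximum “from” values is to check a value before converting it, instead of handling an error afterwards. The sketch below derives such bounds for Unix time from first principles, assuming 10,000,000 ticks per second and an epoch offset of 62,135,596,800 seconds. The names are hypothetical, and the bounds ICU actually reports should be read through getTimeScaleValue rather than taken from this simplified derivation.

```java
public final class SafeRange {
    static final long TICKS_PER_SECOND = 10_000_000L;
    static final long UNIX_EPOCH_OFFSET_SECONDS = 62_135_596_800L;

    // Largest Unix value whose tick count still fits in a signed 64-bit integer.
    static long maxFromValue() {
        return Long.MAX_VALUE / TICKS_PER_SECOND - UNIX_EPOCH_OFFSET_SECONDS;
    }

    // Smallest Unix value whose tick count still fits in a signed 64-bit integer.
    static long minFromValue() {
        return Long.MIN_VALUE / TICKS_PER_SECOND - UNIX_EPOCH_OFFSET_SECONDS;
    }

    // True when (unixSeconds + offset) * ticks-per-second cannot overflow.
    static boolean isSafeToConvert(long unixSeconds) {
        return unixSeconds >= minFromValue() && unixSeconds <= maxFromValue();
    }

    public static void main(String[] args) {
        System.out.println(minFromValue() + " .. " + maxFromValue());
        System.out.println(isSafeToConvert(0L));
        System.out.println(isSafeToConvert(Long.MAX_VALUE));
    }
}
```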