| Copyright (C) 1999-2000, International Business Machines Corporation |
| and others. All Rights Reserved. |
| |
| Readme file for ICU time zone data (source/tools/gentz) |
| |
| |
| RAW DATA |
| -------- |
| The time zone data in ICU is taken from the UNIX data files at |
| ftp://elsie.nci.nih.gov/pub/tzdata<year>. The other input to the |
| process is an alias table, described below. |
| |
| |
| BUILD PROCESS |
| ------------- |
| Two tools are used to process the data into a format suitable for ICU: |
| |
| tz.pl directory of raw data files + tz.alias -> tz.txt |
| gentz tz.txt -> tz.dat (memory mappable binary file) |
| |
| After gentz is run, standard ICU data tools are used to incorporate |
| tz.dat into the icudata module. |
| |
| In order to incorporate the raw data from that source into ICU, take |
| the following steps. |
| |
| 1. Download the archive of current zone data. This should be a file |
| named something like tzdata1999j.tar.gz. Use the URL listed above. |
| |
| 2. Unpack the archive into a directory, retaining the name of the |
| archive. For example, unpack tzdata1999j.tar.gz into tzdata1999j/. |
| Place this directory anywhere; one option is to place it within |
| source/tools/gentz. |
| |
| 3. Run the perl script tz.pl, passing it the directory location as a |
| command-line argument. On Windows system use the batch file |
| tz.bat. The output of this step is the intermediate text file |
| source/tools/gentz/tz.txt. |
| |
| As the second argument, pass in "tz.htm". This will generate an |
| html documentation file that goes into the icu/docs directory. |
| |
| 4. Run source/tools/makedata on Windows. On UNIX systems the |
| equivalent build steps are performed by 'make' and 'make install'. |
| |
| The tz.txt file is typically checked into CVS, whereas the raw data |
| files are not, since they are readily available from the URL listed |
| above. |
| |
| |
| ALIAS TABLE |
| ----------- |
| For backward compatibility, we define several three-letter IDs that |
| have been used since early ICU and correspond to IDs used in old JDKs. |
| These IDs are listed in tz.alias. The tz.pl script processes this |
| alias table and issues errors if there are problems. |
| |
| |
| IDS |
| --- |
| All *system* zone IDs must consist only of characters in the invariant |
| set. See utypes.h for an explanation of what this means. If an ID is |
| encountered that contains a non-invariant character, tz.pl complains. |
| Non-system zones may use non-invariant characters. |
| |
| |
| Etc/GMT... |
| ---------- |
| Users may be confused by the fact that various zones with names of the |
| form Etc/GMT+n appear to have an offset of the wrong sign. For |
| example, Etc/GMT+8 is 8 hours *behind* GMT; that is, it corresponds to |
| what one typically sees displayed as "GMT-8:00". The reason for this |
| inversion is explained in the UNIX zone data file "etcetera". |
| Briefly, this is done intentionally in order to comply with |
| POSIX-style signedness. In ICU we reproduce the UNIX zone behavior |
| faithfully, including this confusing aspect. |
| |
| |
| Alan Liu 1999 |