<html>
<head>
<title>ICU4J 2.2 Release Notes</title>
<meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
<!--
*******************************************************************************
* Copyright (C) 2001-2002, International Business Machines Corporation and *
* others. All Rights Reserved. *
*******************************************************************************
*
* $Source: /xsrl/Nsvn/icu/icu4j/Attic/releasenotes.html,v $
* $Date: 2002/08/16 15:51:35 $
* $Revision: 1.7 $
*
*******************************************************************************
-->
</head>
<body bgcolor="#FFFFFF">
<!--#include virtual="/icu/ssi/header.html" -->
<h2>International Components for Unicode for Java</h2>
<h3>Release Notes for ICU4J 2.2</h3>
<hr size="2" width="100%" align="center">
<p><b>Release Date</b><br>
August 15th, 2002</p>
<p>For the most recent
release, see the <a href="http://oss.software.ibm.com/icu4j/download/index.html">
ICU4J download site</a>.
</P>
<p><B>What's new in Release 2.2</B></p>
<ul>
<li><b>Unicode 3.2 support</b>
<ul>
<li>All properties and algorithms are upgraded to <a href="http://www.unicode.org/reports/tr28/">Unicode
3.2</a>.
<li>The UCA (<a href="http://www.unicode.org/reports/tr10/">Unicode
Collation Algorithm</a>) table is updated to the current version 3.1.1,
with Unicode 3.2-based canonical closure.
<li>Most Unicode properties are now <a href="http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/ucd_icu.html">available
via direct APIs, and in UnicodeSet</a>.</li>
</ul>
</li>
<li><b>Collation</b><br>
The collation engine has been completely rewritten to conform to the
latest version of the UCA, and to use the same performance enhancements
available in ICU4C. Compared to the JDK's, it is dramatically faster, produces much
smaller sort keys, offers more customization, and has better and more extensive
locale-specific data. For more details, see <a href="#collation">Collation
Enhancements</a> below.
<li><b>Normalization</b><br>
The Normalization service adds several high performance functions for:
<ul>
<li>fast concatenation, preserving NFC and NFD.
<li>fast detection of normalized text
<li>fast canonical equivalence and/or case-insensitive matching</li>
</ul>
<li><b>Transliteration</b><br>
There are now transliterators for all scripts used by ICU locales. In
addition, mixed-script text can be automatically transliterated to a target
script.
<li><b>Alternate currency formatting</b><br>
ICU permits arbitrary currencies to be formatted for display according to
arbitrary locales, such as formatting Yen amounts for the US.
<li><b>UnicodeSets</b><br>
UnicodeSets now permit strings as well as characters, so that they can
contain grapheme-clusters. UnicodeSetIterator provides a fast mechanism for
iterating through the contents (code points and strings) of a UnicodeSet.
For code points the iteration is either by individual code point or by
ranges.
<li><b>General improvements</b><br>
Performance has been enhanced in a number of services, and a number of
significant bugs have been fixed.</li>
</ul>
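<p>Two of the items above, normalization checks and alternate currency formatting, can be sketched with plain JDK classes. This is only an illustration of the concepts, not ICU4J's own API: it assumes a JDK that provides java.text.Normalizer and NumberFormat.setCurrency, it uses the brute-force approach of normalizing both strings rather than ICU4J's fast functions, and the class and method names are hypothetical.</p>

```java
import java.text.Normalizer;
import java.text.NumberFormat;
import java.util.Currency;
import java.util.Locale;

public class WhatsNewSketch {
    /** Detects whether text is already in NFC form. */
    static boolean isNfc(String text) {
        return Normalizer.isNormalized(text, Normalizer.Form.NFC);
    }

    /** Tests canonical equivalence the slow way: normalize both sides to NFD, then compare. */
    static boolean canonicallyEquivalent(String a, String b) {
        String left = Normalizer.normalize(a, Normalizer.Form.NFD);
        String right = Normalizer.normalize(b, Normalizer.Form.NFD);
        return left.equals(right);
    }

    /** Formats a yen amount according to US conventions. */
    static String formatYenForUS(long amount) {
        NumberFormat format = NumberFormat.getCurrencyInstance(Locale.US);
        Currency yen = Currency.getInstance("JPY");
        format.setCurrency(yen);
        // The JDK's setCurrency does not adjust fraction digits, so do it explicitly.
        format.setMinimumFractionDigits(yen.getDefaultFractionDigits());
        format.setMaximumFractionDigits(yen.getDefaultFractionDigits());
        return format.format(amount);
    }

    public static void main(String[] args) {
        String composed = "\u00e9";     // e with acute, precomposed
        String decomposed = "e\u0301";  // e followed by combining acute
        System.out.println(isNfc(composed));
        System.out.println(isNfc(decomposed));
        System.out.println(canonicallyEquivalent(composed, decomposed));
        // The currency symbol printed depends on the locale data of the JVM in use.
        System.out.println(formatYenForUS(1234));
    }
}
```

<p>The yen amount is grouped the US way and carries no fraction digits, because the fraction digits are taken from the currency rather than from the locale.</p>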
<p>Please also note the <a href="#repackage">packaging</a> and <a href="#ResourceData">resource data</a> changes
introduced with ICU4J 2.1, described below.</p>
<P><B>Reference Platforms</B></P>
<p>
The reference platforms for version 2.2 are:<ul>
<li> Win2000, IBM JDK 1.3
<li> Solaris 2.7, Sun JDK 1.3.1
<li> AIX 5.1, IBM JDK 1.3
</ul>
</p>
<P>
In release 2.2, two compilers outside the reference platforms are known to fail when compiling the ICU4J source code: IBM JDK 1.3.1_02 and Jikes 1.16. Use ICU4J on unsupported JVMs and non-reference platform compilers at your own risk.</P>
<p><b>Warnings</b></p>
<p>
In release 2.2, DecimalFormat has a limitation in its handling of patterns that include a percent or permille sign.
</p>
<p>
<ul>
<li> Quoted percent or permille signs, which should be treated as literals, are instead treated as if they were unquoted.
<li> Patterns involving percent or permille signs will be generated with localized pattern characters for those signs, even when a non-localized pattern is requested. If such a pattern is subsequently applied to a DecimalFormat as a non-localized pattern, the percent or permille signs will not be recognized if they differ from the non-localized signs in the locale being used.
</ul>
</p>
<p>These problems will be fixed in the next release.</p>
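<p>To make the first limitation concrete, the sketch below shows the intended difference between a quoted and an unquoted percent sign. It uses the JDK's java.text.DecimalFormat as a stand-in for the expected behavior; per the warning above, ICU4J 2.2's DecimalFormat would treat the quoted pattern like the unquoted one. The class and method names are hypothetical.</p>

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

public class PercentPatternSketch {
    /** Formats value with a non-localized pattern, using US symbols for stable output. */
    static String format(String pattern, double value) {
        DecimalFormatSymbols symbols = new DecimalFormatSymbols(Locale.US);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        // Unquoted percent: the value is multiplied by 100 before formatting.
        System.out.println(format("0%", 12));    // 1200%
        // Quoted percent: a literal suffix, so the value is left unscaled.
        System.out.println(format("0'%'", 12));  // 12%
    }
}
```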
<p><b>For More Information</b></p>
<p>For further detailed information about the ICU4J library, please refer to the
<A href="readme.html">readme.</A>
</p>
<hr size="2" width="100%" align="center">
<p><h3><a name="collation">Collation Enhancements</a></h3></p>
<p>ICU4J's collation has been upgraded and now differs significantly from the JDK's implementation (originally provided by us several years ago).</p>
<p>ICU's collation is, in general, much more efficient than the JDK's. (Generating sort keys does take longer, because the keys are compressed to be much shorter and more efficient to process.) For instance:
<ul>
<li>ICU4J can correctly process FCD format strings with normalization off. The JDK has no notion of FCD. Much user text is in FCD form (for more information about FCD, see <a href="http://www.unicode.org/notes/tn5/">Unicode Technical Note #5</a>).</li>
<li>CollationKeys generated by ICU4J are compressed, and can be up to 70% smaller than the JDK's (e.g. in the case of Latin characters).</li>
<li>String comparison in ICU is faster than the JDK's. In our tests of Latin characters, it took just 35% of the JDK's time. </li>
</ul></p>
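<p>Sort keys pay off when the same strings are compared many times, as in a sort: each key is computed once and the keys are then compared directly. The following is a minimal sketch written against the JDK's java.text classes so that it runs without ICU4J on the classpath. It assumes that ICU4J's collator offers equivalent getCollationKey and compareTo calls, which should be checked against the ICU4J API documentation; the class and method names are hypothetical.</p>

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

public class SortKeySketch {
    /** Sorts words by precomputed collation keys and returns them in collation order. */
    static String[] sortWithKeys(Collator collator, String[] words) {
        // Compute each key once, up front.
        CollationKey[] keys = new CollationKey[words.length];
        for (int i = 0; i < words.length; i++) {
            keys[i] = collator.getCollationKey(words[i]);
        }
        // CollationKey is Comparable, so the sort compares keys, not strings.
        Arrays.sort(keys);
        String[] sorted = new String[keys.length];
        for (int i = 0; i < keys.length; i++) {
            sorted[i] = keys[i].getSourceString();
        }
        return sorted;
    }

    public static void main(String[] args) {
        Collator collator = Collator.getInstance(Locale.US);
        String[] sorted = sortWithKeys(collator, new String[] {"cherry", "Banana", "apple"});
        System.out.println(Arrays.toString(sorted));  // [apple, Banana, cherry]
    }
}
```

<p>Keys are only meaningful relative to the collator that produced them, so all keys in one sort must come from the same collator instance.</p>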
<p>Although ICU4J's collation API is largely compatible with the JDK's, there are some differences. Here is a listing of the main ones:
<ul>
<li>ICU4J supports <b>quaternary</b> and <b>identical</b> strength; the JDK does not.</li>
<li> ICU4J supports extra collation options that the JDK does not:
<ul>
<li>alternate handling</li>
<li>case level sort</li>
<li>upper case first or lower case first switch</li>
</ul>
</li>
<li>ICU4J supports Unicode 3.2, while the JDK (as of version 1.4) only supports Unicode 3.0.</li>
<li>ICU4J does not allow turning off Thai reordering, while the JDK does. This is because in Unicode 3.2 Thai reordering is always required. The JDK uses '!' in the rules to turn off Thai reordering; ICU4J ignores it.</li>
<li>ICU4J supports additional rule syntax for various options, for example, setting <b>variable-top</b>, code point collation element positioning, and others. For details, see the <a href="http://oss.software.ibm.com/icu/userguide/Collate_Customization.html">user's guide</a>.
<li>ICU4J's version of CollationKey has a public constructor, so subclasses of RuleBasedCollator can create their own CollationKeys. This was overlooked in the JDK (mea culpa).</li>
<li>The FULL_DECOMPOSITION mode used in the JDK is unnecessary for the UCA (and actually incorrect in some cases). ICU4J does not define this mode; clients should use CANONICAL_DECOMPOSITION instead.
<li>ICU4J uses the standard UCA default ordering, plus fixes and additions for
different languages, so the sorting order will differ from the JDK's.
<li>The CollationKeys generated by ICU4J and the JDK are different, so they cannot be compared.</li>
</ul>
</p>
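<p>The strength and decomposition settings mentioned above can be sketched as follows. The example uses the JDK's java.text.Collator, so it covers only the settings both libraries share (primary and tertiary strength, and CANONICAL_DECOMPOSITION); the ICU4J-only options such as quaternary strength are not shown. The class and method names are hypothetical.</p>

```java
import java.text.Collator;
import java.util.Locale;

public class StrengthSketch {
    /** Compares two strings with a US collator at the given strength. */
    static int compareAt(int strength, String left, String right) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        // Canonical decomposition is the mode these notes recommend over FULL_DECOMPOSITION.
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.compare(left, right);
    }

    public static void main(String[] args) {
        // Primary strength ignores case differences.
        System.out.println(compareAt(Collator.PRIMARY, "abc", "ABC") == 0);   // true
        // Tertiary strength distinguishes case.
        System.out.println(compareAt(Collator.TERTIARY, "abc", "ABC") == 0);  // false
    }
}
```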
<hr size="2" width="100%" align="center">
<h3><a name="repackage">Package Restructuring</a></h3>
<p>Starting with enhancement release 2.1 of ICU4J, the CVS repository
and package organization have changed. This helps us organize the
classes more cleanly, and clarifies the relationships and differences
between parts of the project.</p>
<p>The new high-level structure is as follows:<br><tt><pre>
com
  .ibm
    .richtext     --- root of rich edit control
    .icu          --- root of icu
      .dev        --- classes excluded from icu4j.jar (development only)
        .data     --- data (e.g. unicode data files)
        .demo     --- demos (e.g. calendar, holiday, translit)
        .test     --- api tests grouped by functionality
        .tool     --- tools used in development
      .impl       --- root of 'internal' classes
        .data     --- shipped data (text and resources)
      .lang       --- similar to java.lang
      .math       --- similar to java.math
      .text       --- similar to java.text
      .util       --- similar to java.util
</pre></tt></p>
<p>By and large class names didn't change, only packaging, so changing
the packages in your source should be sufficient to resolve most problems.
The package change <b>will break serialization</b> for those classes that are
serializable.</p>
<p>The classes in com.ibm.icu.impl are <b>internal use only</b>.
Their javadocs are not generated, their APIs are not supported, and
they can change APIs or disappear entirely <b>at any
time</b>. Many classes in this package are public in order to
facilitate use by classes in multiple other packages, but this should not
be construed to mean that such classes will necessarily be 'promoted'
to full public classes in the future. Clients are warned not to depend
on anything in this package.</p>
<hr size="2" width="100%" align="center">
<h3><a name="ResourceData">ICU Resource Data added to ICU4J</a></h3>
<p>Starting with JDK 1.4, the resource information that used to be
available through public classes in java.text.resources is no longer
available. Sun has moved these classes to an internal package. This
has two consequences. First, both the format and contents of the
resources can now change at any time: dot releases and special bugfix
releases can differ from one another. Second, the resources are no longer
accessible without explicit permission from the Java user.
</p>
<p>
For these reasons, starting with release 2.1, ICU4J includes its own
resource information
which is completely independent of the JDK resource information. The
new ICU4J information is equivalent to the information in ICU4C and
ultimately derives from the same source. This allows ICU4J 2.1 and above
to be
built on, and run on, JDK 1.4.
</p>
<p>
There are two main consequences of this decision. The first is an
increase in size of ICU4J. The new resource information, currently
stored as class files residing in a jar file, is approximately 1.15
megabytes. The second is an increased difference between ICU's
resource information and Java's. Neither is a clear superset of the
other. For example, Java core currently has more timezone information
than ICU. ICU's model for handling currency is also different from
Java's. This will change over time as new versions of Java and ICU
are released.
</p>
<p>
In addition to the resource information that corresponds to the Java
resource information, ICU4J also includes resource information needed
to support its additional features, such as Transliteration, Calendar,
and DictionaryBasedBreakIterator. This information has existed in
some form in prior releases on ICU4J and has not greatly changed in
size.
</p>
<p>For information about modifying resource information, please see
the <a href="readme.html">readme</a>.</p>
<hr size="2" width="100%" align="center">
<h3><B>License</B></h3>
<P>
Please read and understand the <a href="./license.html">license</a>
included with this release before installing and using the ICU4J libraries.
</P>
<hr size="2" width="100%" align="center">
<p><i><font size="-1">Copyright (C) 2002 International Business Machines Corporation and others. All Rights Reserved.</font></i></p>
<!--#include virtual="/icu/ssi/footer.html" -->
</body>
</html>