
 <html>
 <head>
 <title>ICU4J 2.2 Release Notes</title>
 <meta http-equiv=Content-Type content="text/html; charset=iso-8859-1">
 <!--
 *******************************************************************************
 * Copyright (C) 2001-2002, International Business Machines Corporation and    *
 * others. All Rights Reserved.                                                *
 *******************************************************************************
 *
 * $Source: /xsrl/Nsvn/icu/icu4j/Attic/releasenotes.html,v $
 * $Date: 2002/08/16 15:51:35 $
 * $Revision: 1.7 $
 *
 *******************************************************************************
 -->
 </head>

 <body bgcolor="#FFFFFF">
 <!--#include virtual="/icu/ssi/header.html" -->
     <h2>International Components for Unicode for Java</h2>
     <h3>Release Notes for ICU4J 2.2</h3>

 <hr size="2" width="100%" align="center">
     <p><b>Release Date</b><br>
     August 15th, 2002</p>

 <p>For the most recent
       release, see the <a href="http://oss.software.ibm.com/icu4j/download/index.html">
       ICU4J download site</a>.
     </P>

     <p><B>What's new in Release 2.2</B></p>

 <ul>
   <li><b>Unicode 3.2 support</b>
     <ul>
       <li>All properties and algorithms are upgraded to <a href="http://www.unicode.org/reports/tr28/">Unicode
         3.2</a>.
       <li>The UCA (<a href="http://www.unicode.org/reports/tr10/">Unicode
         Collation Algorithm</a>) table is updated to the current version 3.1.1,
         with Unicode 3.2-based canonical closure.
       <li>Most Unicode properties are now <a href="http://oss.software.ibm.com/cvs/icu/~checkout~/icuhtml/design/ucd_icu.html">available
         via direct APIs, and in UnicodeSet</a>.</li>
     </ul>
   </li>
   <li><b>Collation</b><br>
     The collation engine has been completely rewritten to be conformant to the
     latest version of the UCA, and to use the same performance enhancements
     available in ICU4C. Compared to the JDK, it is dramatically faster, has much
     smaller sort keys, more customization, and better and additional
     locale-specific data. For more details, see <a href="#collation">Collation
     Enhancements</a> below.
   <li><b>Normalization</b><br>
     The Normalization service adds several high performance functions for:
     <ul>
       <li>fast concatenation, preserving NFC and NFD.
       <li>fast detection of normalized text
       <li>fast canonical equivalence and/or case-insensitive matching</li>
     </ul>
   <li><b>Transliteration</b><br>
     There are now transliterators for all scripts used by ICU locales. In
     addition, mixed-script text can be automatically transliterated to a target
     script.
   <li><b>Alternate currency formatting</b><br>
     ICU permits arbitrary currencies to be formatted for display according to
     arbitrary locales, such as formatting Yen amounts for the US.
   <li><b>UnicodeSets</b><br>
     UnicodeSets now permit strings as well as characters, so that they can
     contain grapheme-clusters. UnicodeSetIterator provides a fast mechanism for
     iterating through the contents (code points and strings) of a UnicodeSet.
     For code points the iteration is either by individual code point or by
     ranges.
   <li><b>General improvements</b><br>
     Performance has been enhanced in a number of services, and a number of
     significant bugs have been fixed.</li>
 </ul>
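 <p>To make the normalization operations listed above concrete, here is a minimal sketch of what they compute. It deliberately uses the JDK's <tt>java.text.Normalizer</tt> (Java 6 and later) as a stand-in rather than the ICU4J API, and the class and method names are hypothetical. It shows the straightforward approach that the new high-performance functions are meant to improve on, not how ICU4J implements them.</p>

```java
import java.text.Normalizer;

// Illustration only: JDK classes are used as a stand-in for the ICU4J service.
class NormalizationSketch {

    // Detection of normalized text: true if the string is already in NFC.
    static boolean isNfc(String s) {
        return Normalizer.isNormalized(s, Normalizer.Form.NFC);
    }

    // Canonical equivalence: two strings are canonically equivalent
    // when their NFD forms are identical.
    static boolean canonicallyEqual(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFD);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFD);
        return na.equals(nb);
    }

    // Concatenation preserving NFC: joining two NFC strings can produce a
    // string that is not NFC, so the naive approach renormalizes the result.
    static String concatNfc(String left, String right) {
        return Normalizer.normalize(left + right, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        String composed = "\u00e9";    // e with acute, one code point
        String decomposed = "e\u0301"; // e followed by combining acute

        System.out.println(isNfc(composed));                          // true
        System.out.println(isNfc(decomposed));                        // false
        System.out.println(canonicallyEqual(composed, decomposed));   // true
        System.out.println(concatNfc("e", "\u0301").equals(composed)); // true
    }
}
```

 <p>The last call shows why concatenation needs care: "e" and the combining acute accent are each in NFC on their own, but their plain concatenation is not.</p>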
   <p>Please also note the <a href="#repackage">packaging</a> and <a href="#ResourceData">resource data</a> changes
     introduced with ICU4J 2.1, described below.</p>

      <P><B>Reference Platforms</B></P>
       <p>
       The reference platforms for version 2.2 are:<ul>
 <li> Win2000, IBM JDK 1.3
 <li> Solaris 2.7, Sun JDK 1.3.1
 <li> AIX 5.1, IBM JDK 1.3
 </ul>
       </p>
       <P>
       In release 2.2, two compilers outside the reference platforms are known to fail when compiling the ICU4J source code: IBM JDK 1.3.1_02 and Jikes 1.16. Use ICU4J on unsupported JVMs and non-reference platform compilers at your own risk.</P>

       <p><b>Warnings</b></p>
       <p>
          In release 2.2, DecimalFormat has limitations in its pattern handling when a pattern includes a percent or permille sign.
       </p>
       <p>
       <ul>
           <li> Quoted percent or permille signs, which should be treated as literals, are instead treated as if they were unquoted.
           <li> When a non-localized pattern is requested, patterns involving percent or permille signs are still generated with the localized pattern characters for those signs.  If such a pattern is subsequently applied to a DecimalFormat as a non-localized pattern, the percent or permille signs will not be recognized if the locale's signs differ from the non-localized ones.
       </ul>
       </p>
       <p>These problems will be fixed in the next release.</p>
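 <p>For reference, the following sketch shows the intended difference between quoted and unquoted signs. It uses the JDK's <tt>java.text.DecimalFormat</tt> with US symbols as a stand-in, on the assumption that the ICU4J class follows the same pattern syntax; the class and method names are hypothetical.</p>

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Illustration only: shows the intended pattern semantics using the JDK class.
class PercentPatternSketch {

    // Formats a value with a non-localized pattern and fixed US symbols.
    static String format(String pattern, double value) {
        DecimalFormat df = new DecimalFormat(pattern, new DecimalFormatSymbols(Locale.US));
        return df.format(value);
    }

    public static void main(String[] args) {
        // Unquoted percent: the value is multiplied by 100.
        System.out.println(format("#0%", 0.5));      // prints 50%

        // Quoted percent: a literal character, no multiplication.
        System.out.println(format("#0'%'", 50));     // prints 50%

        // Unquoted permille: the value is multiplied by 1000.
        System.out.println(format("#0\u2030", 0.5)); // prints 500 followed by the permille sign
    }
}
```

 <p>With the limitation described above, the quoted form would be handled like the unquoted one, so the value would be scaled when it should not be.</p>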

 <p><b>For More Information</b></p>

       <p>For further detailed information about the ICU4J library, please refer to the
       <A href="readme.html">readme.</A>
       </p>

 <hr size="2" width="100%" align="center">

     <h3><a name="collation">Collation Enhancements</a></h3>

 <p>ICU4J's collation has been upgraded and now differs significantly from the JDK's implementation (originally provided by us several years ago).</p>

 <p>ICU's collation is, in general, much more efficient than the JDK's. (Sort keys do take longer to generate, but that is because they are so much shorter and more efficient to process.)  For instance:
 <ul>
   <li>ICU4J can correctly process FCD format strings with normalization off.  The JDK has no notion of FCD. Much user text is in FCD form (for more information about FCD, see <a href="http://www.unicode.org/notes/tn5/">Unicode Technical Note #5</a>).</li>
   <li>CollationKeys generated by ICU4J are compressed, and as compared with the JDK's, can be up to 70% smaller (e.g. in the case of Latin characters).</li>
   <li>String comparison in ICU is faster than the JDK's. In our tests of Latin characters, it took just 35% of the JDK's time. </li>
 </ul></p>
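 <p>As a reminder of where sort keys pay off, here is a minimal sketch of sorting with CollationKeys. It is written against the JDK's <tt>java.text</tt> classes so that it runs without ICU4J; the class and method names are hypothetical, and the JDK keys it produces are not the compressed keys described above.</p>

```java
import java.text.CollationKey;
import java.text.Collator;
import java.util.Arrays;
import java.util.Locale;

// Illustration only: JDK collation classes used as a stand-in.
class SortKeySketch {

    // Sorts strings by computing each sort key once and then comparing keys,
    // which avoids repeating the collation work in every comparison.
    static String[] sortWithKeys(String[] words) {
        Collator collator = Collator.getInstance(Locale.US);

        CollationKey[] keys = new CollationKey[words.length];
        for (int i = 0; i != words.length; i++) {
            keys[i] = collator.getCollationKey(words[i]);
        }

        // CollationKey is Comparable, so the keys can be sorted directly.
        Arrays.sort(keys);

        String[] sorted = new String[keys.length];
        for (int i = 0; i != keys.length; i++) {
            sorted[i] = keys[i].getSourceString();
        }
        return sorted;
    }

    public static void main(String[] args) {
        String[] sorted = sortWithKeys(new String[] {"pear", "apple", "fig"});
        System.out.println(Arrays.toString(sorted)); // [apple, fig, pear]
    }
}
```

 <p>Smaller keys reduce both the memory held during such a sort and the cost of each key comparison.</p>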

 <p>Although ICU4J's collation API is very compatible with the JDK's, there are some differences.  The main ones are:
 <ul>
   <li>ICU4J supports <b>quaternary</b> and <b>identical</b> strength; the JDK does not.</li>
   <li>ICU4J supports extra collation options that the JDK does not:
     <ul>
       <li>alternate handling</li>
       <li>case level sort</li>
       <li>upper case first or lower case first switch</li>
     </ul>
   </li>
   <li>ICU4J supports Unicode 3.2, while the JDK (as of version 1.4) only supports Unicode 3.0.</li>
   <li>ICU4J does not allow turning off Thai reordering, while the JDK does.  This is because in Unicode 3.2 Thai reordering is always required. The JDK uses '!' in the rules to turn off Thai reordering; ICU4J ignores it.</li>
   <li>ICU4J supports additional rule syntax for various options, for example, setting <b>variable-top</b>, code point collation element positioning, and others.  For details, see the <a href="http://oss.software.ibm.com/icu/userguide/Collate_Customization.html">user's guide</a>.
   <li>ICU4J's version of CollationKey has a public constructor, so subclasses of RuleBasedCollator can create their own CollationKeys.  This was overlooked in the JDK (mea culpa).</li>
   <li>The FULL_DECOMPOSITION mode used in the JDK is unnecessary for the UCA (and actually incorrect in some cases). ICU4J does not define this mode; clients should use CANONICAL_DECOMPOSITION instead.
   <li>ICU4J uses the standard UCA default ordering, plus fixes and additions for
     different languages, so the sorting order will differ from the JDK's.
   <li>The CollationKeys generated by ICU4J and the JDK are different, so they cannot be compared.</li>
 </ul>
 </p>
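 <p>The options that the two APIs share can be sketched as follows. The example uses the JDK's <tt>java.text.Collator</tt>, on the assumption that the ICU4J collator accepts the same <tt>setStrength</tt> and <tt>setDecomposition</tt> calls; the class and method names are hypothetical, and the ICU4J-only options listed above are not shown.</p>

```java
import java.text.Collator;
import java.util.Locale;

// Illustration only: JDK collator used as a stand-in for the shared API.
class CollatorOptionsSketch {

    // Compares two strings at the given strength, with canonical
    // decomposition enabled (the mode recommended above).
    static int compareAt(int strength, String a, String b) {
        Collator collator = Collator.getInstance(Locale.US);
        collator.setStrength(strength);
        collator.setDecomposition(Collator.CANONICAL_DECOMPOSITION);
        return collator.compare(a, b);
    }

    public static void main(String[] args) {
        // Case is a tertiary difference: ignored at PRIMARY, significant at TERTIARY.
        System.out.println(compareAt(Collator.PRIMARY, "abc", "ABC") == 0);   // true
        System.out.println(compareAt(Collator.TERTIARY, "abc", "ABC") == 0);  // false

        // Accents are a secondary difference.
        System.out.println(compareAt(Collator.PRIMARY, "role", "r\u00f4le") == 0);   // true
        System.out.println(compareAt(Collator.SECONDARY, "role", "r\u00f4le") == 0); // false

        // With canonical decomposition, composed and decomposed forms compare equal.
        System.out.println(compareAt(Collator.TERTIARY, "\u00e9", "e\u0301") == 0);  // true
    }
}
```

 <p>Code that only compares the sign of the result should carry over between the two implementations, but the actual ordering of particular strings may differ, as noted in the list above.</p>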

 <hr size="2" width="100%" align="center">
     <h3><a name="repackage">Package Restructuring</a></h3>

 <p>Starting with enhancement release 2.1 of ICU4J, the CVS repository
 and package organization have changed.  This helps us organize the
 classes more cleanly, and clarifies relationships and differences
 between parts of the project.</p>

 <p>The new high-level structure is as follows:<br><tt><pre>
 com
    .ibm
        .richtext       ---  root of rich edit control
        .icu            ---  root of icu
             .dev       ---  classes excluded from icu4j.jar (development only)
                 .data  ---  data (e.g. unicode data files)
                 .demo  ---  demos (e.g. calendar, holiday, translit)
                 .test  ---  api tests grouped by functionality
                 .tool  ---  tools used in development
             .impl      ---  root of 'internal' classes
                 .data  ---  shipped data (text and resources)
             .lang      ---  similar to java.lang
             .math      ---  similar to java.math
             .text      ---  similar to java.text
             .util      ---  similar to java.util
 </pre></tt></p>

 <p>By and large class names didn't change, only packaging, so changing
 the packages in your source should be sufficient to resolve most problems.
 The package change <b>will break serialization</b> for those classes that are
 serializable.</p>

 <p>The classes in com.ibm.icu.impl are <b>internal use only</b>.
 Their javadocs are not generated, their APIs are not supported, and
 they can change APIs or disappear entirely <b>at any
 time</b>.  Many classes in this package are public in order to
 facilitate use by classes in multiple other packages, but this should not
 be construed to mean that such classes will necessarily be 'promoted'
 to full public classes in the future. Clients are warned not to depend
 on anything in this package.</p>

 <hr size="2" width="100%" align="center">
     <h3><a name="ResourceData">ICU Resource Data added to ICU4J</a></h3>
 <p>Starting with JDK 1.4, the resource information that used to be
 available through public classes in java.text.resources is no longer
 available.  Sun has moved these classes to an internal package.  This
 has two consequences.  First, both the format and contents of the
 resources can now change at any time; even dot releases and special bugfix
 releases can differ.  Second, the resources are no longer
 accessible without explicit permission from the Java user.
 </p>
 <p>
 For these reasons, starting with release 2.1, ICU4J includes its own
 resource information, which is completely independent of the JDK resource
 information.  The new ICU4J information is equivalent to the information in
 ICU4C and ultimately derives from the same source.  This allows ICU4J 2.1
 and above to be built on, and run on, JDK 1.4.
 </p>
 <p>
 There are two main consequences of this decision.  The first is an
 increase in size of ICU4J.  The new resource information, currently
 stored as class files residing in a jar file, is approximately 1.15
 megabytes.  The second is an increased difference between ICU's
 resource information and Java's.  Neither is a clear superset of the
 other.  For example, Java core currently has more timezone information
 than ICU.  ICU's model for handling currency also differs from
 Java's.  This will change over time as new versions of Java and ICU
 are released.
 </p>
 <p>
 In addition to the resource information that corresponds to the Java
 resource information, ICU4J also includes resource information needed
 to support its additional features, such as Transliteration, Calendar,
 and DictionaryBasedBreakIterator.  This information has existed in
 some form in prior releases of ICU4J and has not greatly changed in
 size.
 </p>
 <p>For information about modifying resource information, please see
 the <a href=readme.html>readme</a>.</p>
 <hr size="2" width="100%" align="center">
     <h3><B>License</B></h3>
     <P>
     Please read and understand the <a href="./license.html">license</a>
     included with this release before installing and using the ICU4J libraries.
     </P>

 <hr size="2" width="100%" align="center">
 <p><i><font size="-1">Copyright (C) 2002 International Business Machines Corporation and others.  All Rights Reserved.</font></i></p>
 <!--#include virtual="/icu/ssi/footer.html" -->
 </body>
 </html>