| --- |
| layout: default |
| title: ICU 76 |
| nav_order: 9999 |
| has_children: false |
| --- |
| |
| <!-- |
| © 2024 and later: Unicode, Inc. and others. |
| License & terms of use: http://www.unicode.org/copyright.html |
| --> |
| |
| <!-- TODO: fix nav_order, add download/index.html, hook into docs/index.md --> |
| |
| # ICU 76 |
| |
| ICU is the [premier library for software internationalization](https://icu.unicode.org/#h.i33fakvpjb7o), |
| used by a [wide array of companies and organizations](https://icu.unicode.org/#h.f9qwubthqabj). |
| |
| ## Release Overview |
| |
| ICU 76 updates to |
| [Unicode 16](https://www.unicode.org/versions/Unicode16.0.0/) |
| ([blog](https://blog.unicode.org/2024/09/announcing-unicode-standard-version-160.html)), |
| including new characters and scripts, emoji, collation & IDNA changes, and corresponding APIs and implementations. |
| |
| It also updates to |
| [CLDR 46](https://cldr.unicode.org/downloads/cldr-46) |
| ([beta blog](https://blog.unicode.org/2024/09/unicode-cldr-46-beta-available-for.html)) |
| locale data with new locales, signficant updates to existing locales, |
| and various additions and corrections. |
| For example, the CLDR and Unicode default sort orders are now very nearly the same. |
| |
| Most of the java.time (Temporal) types can now be formatted directly |
| using the existing ICU4J date/time formatting classes. |
| |
| There are some new APIs to make ICU easier to use with modern C++ and Java patterns. |
| Most of the C/C++ APIs added for this purpose are implemented as C++ header-only APIs, |
| and usable on top of binary stable C APIs, which is a first for ICU. |
| |
| The Java and C++ technology preview implementations of the (also in [tech preview](https://github.com/unicode-org/message-format-wg?tab=readme-ov-file#messageformat-2-technical-preview)) CLDR MessageFormat 2.0 specification have been updated to match recent changes. |
| |
| For more details, including migration issues, see below. |
| |
| Please use the [icu-support mailing list](https://icu.unicode.org/contacts) and/or [find/submit error reports](https://icu.unicode.org/bugs). |
| |
| ### Version Number |
| |
| The initial release has library version number 76.1. |
| |
| * Release date: _planned for_ 2024-10-24 |
| * [List of tickets fixed in ICU 76](https://unicode-org.atlassian.net/issues/?jql=project%20%3D%20ICU%20AND%20status%20%3D%20Done%20AND%20resolution%20in%20%28Fixed%2C%20%22Fixed%20by%20Other%20Ticket%22%29%20AND%20fixVersion%20%3D%2076.1%20ORDER%20BY%20component%20ASC%2C%20created%20DESC) |
| |
| If there are maintenance releases, they will be 76.2, 76.3, etc. (During ICU 76 development, the library version number was 76.0.x.) |
| |
| Note: There may be additional commits on the [maint/maint-76](https://github.com/unicode-org/icu/tree/maint/maint-76) branch that are not included in the prepackaged download files. |
| |
| ## Common Changes |
| |
| * [Unicode 16](https://www.unicode.org/versions/Unicode16.0.0/) |
| ([blog](https://blog.unicode.org/2024/09/announcing-unicode-standard-version-160.html)): |
| * Adds five modern-use scripts: Garay, Gurung Khema, Kirat Rai, Ol Onal, Sunuwar |
| * Adds two historic scripts & almost 4000 additional Egyptian Hieroglyphs |
| * Seven new emoji characters |
| * Over 700 symbols from legacy computing environments |
| * ICU line breaking improvements have been upstreamed into |
| [UAX #14](https://www.unicode.org/reports/tr14/tr14-53.html#Modifications) |
| * ICU 76 adds support for the new UCD property Modifier_Combining_Mark for |
| [UAX #53](https://www.unicode.org/reports/tr53/) Arabic Mark Rendering |
| * ICU 76 also adds support for the UCD property Indic_Conjunct_Break |
| which was new in Unicode 15.1. ([ICU-22503](https://unicode-org.atlassian.net/browse/ICU-22503)) |
| * [IDNA](https://www.unicode.org/reports/tr46/tr46-33.html#Modifications): |
| The handling of UseSTD3ASCIIRules was simplified. |
| Some existing characters changed from disallowed (when that was only for compatibility with |
| long-obsolete IDNA2003) to valid. |
| * [CLDR 46](https://github.com/unicode-org/cldr/blob/main/docs/site/downloads/cldr-46.md) |
| ([beta blog](https://blog.unicode.org/2024/09/unicode-cldr-46-beta-available-for.html)): |
| * Significant data updates across all locales |
| * Locales which are now at modern coverage level: Nigerian Pidgin, Tigrinya |
| * Locales which are now at moderate coverage level: |
| Akan, Baluchi (Latin), Kangri, Tajik, Tatar, Wolof |
| * New measurement units "night" and "light-speed" |
| * Note: ICU 76 does not yet support `portion-per-1e9` (aka per-billion). (See [ICU-22781](https://unicode-org.atlassian.net/browse/ICU-22781)) |
| * [MessageFormat 2.0 tech preview updates](https://cldr.unicode.org/downloads/cldr-46#message-format-specification) |
| * Language matching: Dropped the fallback mapping |
| desired="uk" → supported="ru" |
| (so that Ukrainian (uk) doesn’t fall back to Russian (ru)) |
| * [Collation](https://cldr.unicode.org/downloads/cldr-46#collation-data-changes): |
| Significant changes to the CLDR root collation (CLDR default sort order) |
| * Realigned With DUCET: |
| The order of groups of characters which sort below letters is now the same. |
| In both sort orders, non-decimal-digit numeric characters now sort after decimal digits, |
| and the CLDR root collation no longer tailors any currency symbols |
| (making some of them sort like letter sequences, as in the DUCET). |
| _These changes eliminate sort order differences among almost all |
| regular characters between the CLDR root collation and the DUCET._ |
| * Improved Han Radical-Stroke Order: |
| The CLDR radical-stroke order now matches that of the Unicode Radical-Stroke Index; |
| traditional vs. simplified forms of radicals are now distinguished on a lower level than the number of residual strokes. |
| In alphabetic indexes for radical-stroke sort orders, |
| only the traditional forms of radicals are now available as index characters. |
| * Time zone data (tzdata) version 2024b (2024-sep). Note that pre-1970 data for a number of time zones has been removed, as has been the case in the upstream [tzdata](https://www.iana.org/time-zones) release since 2021b. |
| * The Asia/Almaty time zone has become an alias following IANA TZ database changes. |
| * CLDR added support for deprecated timezone codes by remapping: |
| CST6CDT → America/Chicago, EST → America/Panama, EST5EDT → America/New_York, |
| MST7MDT → America/Denver, PST8PDT → America/Los_Angeles |
| (These IANA TZ changes were motivated by CLDR, see |
| [CLDR-17111](https://unicode-org.atlassian.net/browse/CLDR-17111)) |
| |
| ## ICU4C Specific Changes |
| |
| * [API changes since ICU4C 75 (Markdown)](https://github.com/unicode-org/icu/blob/maint/maint-76/icu4c/APIChangeReport.md) / [(HTML)](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-76/icu4c/APIChangeReport.html) |
| * A UnicodeString can now be converted to & from UTF-16 standard string_view types |
| (std::u16string_view, and on Windows to/from std::wstring_view) |
| and other UTF-16 types (string literals, standard string classes). |
| Several other member functions have been widened to accept standard UTF-16 types as well. |
| ([ICU-22843](https://unicode-org.atlassian.net/browse/ICU-22843)) |
| * New APIs for colloquial iteration over the elements of a C++ UnicodeSet or a C USet. ([ICU-22876](https://unicode-org.atlassian.net/browse/ICU-22876)) |
| * For details and an example see the “C++ Header-Only APIs” section of the [Migration Issues](#migration-issues) below. |
| * New APIs for colloquial use of C++ Collator / C UCollator with |
| standard C++ algorithms (e.g, sort) & data structures (e.g., map). |
| ([ICU-22879](https://unicode-org.atlassian.net/browse/ICU-22879)) |
| (The UCollator wrappers are also C++ header-only APIs.) |
| * Note: Some APIs were changed to accept a wider range of input types than before, |
| but in the API change report they look like the old, stable signatures are removed, |
| and like the wider signatures are added as “born stable”. |
| For example, several UnicodeString constructors that take a raw pointer |
| have been replaced with a signature that accepts such raw pointers but also additional input types. |
| * Note: Similarly, the API change report appears to show removal+addition of |
| certain UnicodeString::remove() and UnicodeString::removeBetween() overloads, |
| but only the _expression_ of one of their default parameter values has changed. |
| * Many changes for more robust string and memory handling. |
| |
| ## ICU4J Specific Changes |
| |
| * [API Changes since ICU4J 75](https://htmlpreview.github.io/?https://github.com/unicode-org/icu/blob/maint/maint-76/icu4j/APIChangeReport.html) |
| * Most of the java.time (Temporal) types can now be formatted directly |
| using the existing ICU4J date/time formatting classes. ([ICU-22853](https://unicode-org.atlassian.net/browse/ICU-22853)) |
| * New APIs for colloquial iteration over the elements of a UnicodeSet. |
| In addition to the existing ranges(), strings(), and UnicodeSet-is-an-Iterable, |
| there is a new codePoints() (returns an Iterable), |
| and new methods that return Streams (e.g., codePointStream() & rangeStream()). |
| ([ICU-22845](https://unicode-org.atlassian.net/browse/ICU-22845)) |
| |
| ## Known Issues |
| |
| * None yet |
| |
| ## Migration Issues |
| |
| ### IDNA Default Option Changed to Nontransitional Processing |
| After all major browsers have switched to nontransitional processing, |
| Unicode 15.1 (a year ago) changed the [UTS #46 spec](https://www.unicode.org/reports/tr46/#Processing) |
| to declare transitional processing deprecated. |
| |
| ICU 76 changes the "DEFAULT" API constants from 0 to UIDNA_NONTRANSITIONAL_TO_ASCII | UIDNA_NONTRANSITIONAL_TO_UNICODE. |
| |
| ICU 76 does not change the behavior of using options value 0. |
| (That would change the behavior of existing binaries linking with new ICU libraries.) |
| However, when code is recompiled against a new version of ICU, |
| and when it uses the DEFAULT constant, then it will pass these option flags into the factory method. |
| |
| * In C/C++: unicode/uidna.h [UIDNA_DEFAULT](https://unicode-org.github.io/icu-docs/apidoc/dev/icu4c/uidna_8h.html#a726ca809ffd3d67ab4b8476646f26635aa1eb63014cdaf41c7ea6cf3abecf1169) |
| * In Java: IDNA.java [DEFAULT](https://unicode-org.github.io/icu-docs/apidoc/dev/icu4j/com/ibm/icu/text/IDNA.html#DEFAULT) |
| |
| See [ICU-22294](https://unicode-org.atlassian.net/browse/ICU-22294) |
| |
| ### SimpleNumber::truncateStart() Removed |
| ICU 75 renamed the still-draft SimpleNumber::truncateStart() to setMaximumIntegerDigits(). |
| ICU 76 removes the never-stable, original function. |
| Same for the C API usnum_truncateStart(). |
| ([ICU-22900](https://unicode-org.atlassian.net/browse/ICU-22900)) |
| |
| ### C++ Header-Only APIs |
| ICU 76 is the first version where we add what we call C++ header-only APIs. |
| These are especially intended for users who rely on only binary stable DLL/library exports of C APIs |
| (C++ APIs cannot be binary stable). |
| |
| _Please test these new APIs and let us know if you find problems — |
| especially if you find a platform/compiler/options combination |
| where the call site does end up calling into ICU DLL/library exports._ |
| |
| Remember that regular C++ APIs can be hidden by callers defining `U_SHOW_CPLUSPLUS_API=0`. |
| The new header-only APIs can be separately enabled via `U_SHOW_CPLUSPLUS_HEADER_API=1`. |
| |
| ([GitHub query for `U_SHOW_CPLUSPLUS_HEADER_API` in public header files](https://github.com/search?q=repo%3Aunicode-org%2Ficu+U_SHOW_CPLUSPLUS_HEADER_API+path%3Aunicode%2F*.h&type=code)) |
| |
| These are C++ definitions that are not exported by the ICU DLLs/libraries, |
| are thus inlined into the calling code, |
| and which may call ICU C APIs but not into ICU non-header-only C++ APIs. |
| |
| The header-only APIs are defined in a nested `header` namespace. |
| If entry point renaming is turned off (the main namespace is `icu` rather than `icu_76` etc.), |
| then the new `U_HEADER_ONLY_NAMESPACE` is `icu::header`. |
| |
| ([Link to the API proposal which introduced this concept](https://docs.google.com/document/d/1xERVccTYsptzjfbjcj6HDtoKVF_mEKmslPsOiQzzaFg/view#heading=h.cf4bmhjgozry)) |
| |
| For example, for iterating over the code point ranges in a `USet` (excluding the strings): |
| |
| ```c++ |
| U_NAMESPACE_USE |
| using U_HEADER_NESTED_NAMESPACE::USetRanges; |
| LocalUSetPointer uset(uset_openPattern(u"[abcçカ🚴]", -1, &errorCode)); |
| for (auto [start, end] : USetRanges(uset.getAlias())) { |
| printf("uset.range U+%04lx..U+%04lx\n", (long)start, (long)end); |
| } |
| for (auto range : USetRanges(uset.getAlias())) { |
| for (UChar32 c : range) { |
| printf("uset.range.c U+%04lx\n", (long)c); |
| } |
| } |
| ``` |
| |
| (Implementation note: On most platforms, when compiling ICU itself, |
| the `U_HEADER_ONLY_NAMESPACE` is `icu::internal`, |
| so that any such symbols that get exported differ from the ones that calling code sees. |
| On Windows, where DLL exports are explicit, |
| the namespace is always the same, but these header-only APIs are not marked for export.) |
| |
| ### Migration Issues Related to CLDR |
| * See [CLDR 46 migration issues](https://cldr.unicode.org/downloads/cldr-46#migration) |
| |
| ## ICU4C Platform Support |
| |
| ICU4C requires C++17 and has been tested with up to C++20. |
| |
| We routinely test on recent versions of Linux, macOS, and Windows. |
| |
| We accept patches for other platforms. |
| |
| For ICU 76, we have received a contribution to make ICU4C work again on z/OS, |
| using a newer (clang-based) compiler. ([ICU-22714](https://unicode-org.atlassian.net/browse/ICU-22714) [icu/pull/3008](https://github.com/unicode-org/icu/pull/3008) + [ICU-22916](https://unicode-org.atlassian.net/browse/ICU-22916) [icu/pull/3208](https://github.com/unicode-org/icu/pull/3208)) |
| |
| Windows: The minimum supported version is Windows 7. (See [How To Build And Install On Windows](../userguide/icu4c/build.html#how-to-build-and-install-on-windows) for more details.) |
| |
| ## ICU4J Platform Support |
| |
| ICU4J works on Java 8..21 (at least). |
| |
| ICU4J should work on Android API level 21 and later but may require “[library desugaring](https://developer.android.com/studio/write/java8-support#library-desugaring)”. |
| |
| ## Download |
| |
| Source and binary downloads are available on the git/GitHub tag page: https://github.com/unicode-org/icu/releases/tag/release-76-rc |
| |
| See the [Source Code Setup](../devsetup/source/) page for how to download the ICU file tree directly from GitHub. |
| |
| ICU locale data was generated from CLDR data equivalent to: |
| |
| * https://github.com/unicode-org/cldr/releases/tag/release-46-beta3 |
| * https://github.com/unicode-org/cldr-staging/releases/tag/release-46-beta3 |
| |
| [Maven dependency](https://central.sonatype.com/artifact/com.ibm.icu/icu4j): |
| TODO |
| ``` |
| <dependency> |
| <groupId>com.ibm.icu</groupId> |
| <artifactId>icu4j</artifactId> |
| <version>76.1</version> |
| </dependency> |
| ``` |