Check that there are no “poor man's RTTI” methods in new class hierarchies.
After ICU 50: New class hierarchies should not declare getDynamicClassID() at all. UObject has a dummy implementation for it.
ICU 4.6..50: New class hierarchies used UOBJECT_DEFINE_NO_RTTI_IMPLEMENTATION. See Normalizer2 for an example declaration and implementation that satisfies the virtual-function override without adding new ClassID support.
We do need to keep and add “poor man's RTTI” in old classes, and in new classes extending existing class hierarchies (where parent classes have @stable RTTI functions). (For example, a new Format subclass.)
One easy way to check for this is to search through the API change report and look for “UClassID” or similar.
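The search described above can be scripted. The following is a minimal sketch, not part of the ICU tooling: the function name and the report path are hypothetical, and the extra patterns (`getDynamicClassID`, `getStaticClassID`) are one reading of what “or similar” might cover.

```shell
#!/bin/sh
# List the lines of an API change report that mention "poor man's RTTI" symbols.
# $1: path to the API change report (hypothetical; use wherever your report lives).
find_rtti_mentions() {
    grep -n -E 'UClassID|getDynamicClassID|getStaticClassID' "$1"
}

# Example (the path is an assumption):
# find_rtti_mentions path/to/api-change-report.html
```

Each hit is printed with its line number. The function returns a non-zero status when nothing matches, which is the desired outcome for a new class hierarchy.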
Note: ICU4C and ICU4J source files are UTF-8. The ASCII check is no longer appropriate for them.
cd icu4c/source
find . \( -name "*.[ch]" -or -name "*.cpp" \) -exec grep -PHn '[^[:ascii:]]' {} \;
Verify that all source and text files in the repository have plain LF line endings.
To do this on Linux, in an up-to-date git workspace:
cd icu/icu4c/source
tools/icu-file-utf8-check.py # reports problems
The same Python script from the icu4c tools will also check icu4j:
cd icu/icu4j
../icu4c/source/tools/icu-file-utf8-check.py
To double-check the line endings, the following grep will find all text files containing \r characters. Do not run from Windows, where \r\n line endings are expected.
cd icu
grep -rPIl "\r" *
Even when run from Mac or Linux, some Windows-specific files (.bat, etc.) will be found by this check. This is OK.
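To keep the expected hits out of the output, the check can be wrapped in a small filter. This is a sketch under assumptions: the function name is hypothetical, and only `.bat` is excluded because it is the only extension named above. Extend the filter for other Windows-specific files you know to be fine.

```shell
#!/bin/sh
# List text files under directory $1 that contain a carriage return,
# skipping .bat files, where CRLF line endings are expected.
crlf_files() {
    cr=$(printf '\r')   # a literal CR, so grep does not need -P
    ( cd "$1" && grep -rIl "$cr" . | grep -v '\.bat$' | sort )
}

# Example, run from a Mac or Linux checkout:
# crlf_files icu
```

Empty output means that every remaining text file uses plain LF line endings.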
Note: As of ICU 63, the project moved from svn to GitHub. SVN file properties are no longer relevant.
Note: As of ICU 59, ICU4C source files are UTF-8 encoded, and have the svn mime-type property “text/plain;charset=utf-8”. They must not have a BOM.
This is checked by the above task, “Check svn properties, valid UTF-8 and text file line endings”.
The bomfix.py script, formerly used for this task, must not be run over the ICU4C sources.
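For a quick manual spot check that source files do not start with the UTF-8 signature, something like the following can be used. It is a sketch, not a replacement for icu-file-utf8-check.py: both function names are hypothetical, and it only looks at `.c`, `.h` and `.cpp` files.

```shell
#!/bin/sh
# Succeeds if file $1 starts with the UTF-8 signature bytes EF BB BF.
has_bom() {
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}

# Print every .c/.h/.cpp file under directory $1 that starts with a BOM.
list_bom_files() {
    find "$1" -type f \( -name '*.[ch]' -o -name '*.cpp' \) | sort |
    while IFS= read -r f; do
        if has_bom "$f"; then printf '%s\n' "$f"; fi
    done
}

# Example:
# list_bom_files icu4c/source
```

Any file name printed by `list_bom_files` points at a source file that still carries a BOM.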
Note: This task is only applicable to ICU4C. ICU4J .java source files are encoded in UTF-8, but must not have a UTF-8 BOM.
Check that the following match: Files marked as UTF-8 vs. Files beginning with the UTF-8 signature byte sequence (“BOM”).
Run:
The Eclipse IDE provides a feature that lets you organize import statements for multiple files. Right-click projects, source folders, or files and select [Source] - [Organize Imports], which resolves all wildcard imports and sorts the import statements in a consistent order. (Note: You may run out of memory (OOM) when you run this on projects or folders that contain many files. In that case, narrow the selection and run it in several iterations.)
ICU 64+ (2019+): Done automatically by a build bot, at least in one of the two modes (debug/release), ok to skip as BRS task.
We want to keep dependencies between .c/.cpp/.o files reasonable, both between and inside ICU's libraries.
On Linux, run source/test/depstest/depstest.py, for example:
~/icu/mine/src/icu4c/source/test/depstest$ ./depstest.py ~/icu/mine/dbg/icu4c
Do this twice: Once for a release build (optimized) and once for a debug build (unoptimized). They pull in slightly different sets of standard library symbols (see comments in dependencies.txt).
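Running the test against both build trees can be wrapped in a loop. The sketch below assumes that a passing run prints the tool's “OK: Specified and actual dependencies match.” line and checks for that text rather than relying on an exit status. The function name and the release build path in the example are hypothetical.

```shell
#!/bin/sh
# Run depstest.py once per build tree and report whether each run printed
# the tool's success message.
# $1: the source/test/depstest directory; remaining arguments: build directories.
check_deps() {
    depstest_dir=$1
    shift
    status=0
    for build in "$@"; do
        out=$( cd "$depstest_dir" && ./depstest.py "$build" 2>&1 )
        case $out in
            *"OK: Specified and actual dependencies match."*)
                echo "PASS $build" ;;
            *)
                echo "FAIL $build"
                status=1 ;;
        esac
    done
    return $status
}

# Example (both build paths are assumptions about a local layout):
# check_deps ~/icu/mine/src/icu4c/source/test/depstest \
#     ~/icu/mine/dbg/icu4c ~/icu/mine/rel/icu4c
```

The function returns non-zero if any build tree fails, so it can gate a longer script. For a failing tree, rerun depstest.py directly to read its full output.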
If everything is fine, the test will print “OK: Specified and actual dependencies match.” If not...
Get changes reviewed by others, including Markus; including changes in dependencies.txt, .py scripts, or ICU4C code.
At first, the test will likely complain about .o files not listed in its dependencies.txt or, if files were removed, the other way around. Try to add them to groups or create new groups as appropriate.
As a rule, smaller groups with fewer dependencies are preferable. If public API (e.g., constructing a UnicodeString via conversion) is not needed inside ICU (e.g., unistr_cnv), make its library depend on its new group.
If the test prints “Info: group icuplug does not need to depend on platform” then the plug-in code is disabled, as is the default since ICU 56. Consider enabling it for dependency checking, but make sure to revert that before you commit changes!
There are usually other “Info” messages where the tool thinks that dependencies on system symbols are not needed. These are harmless, that is, don't try to remove them: They are needed or not needed based on the compiler flags used. If you remove them, then you will likely cause an error for someone with different flags. Also, in an unoptimized build you only get half as many info messages. You get more in an optimized build because more system stuff gets inlined.
The test might complain “is this a new system symbol?” We should be careful about adding those. For example, we must not call printf() from library code, nor the global operator new.
The test might complain that some .o file “imports icu_48::UnicodeString::UnicodeString(const char *) but does not depend on unistr_cnv.o”. This probably means that someone passes a simple “string literal” or a char* into a function that takes a UnicodeString, which invokes the default-conversion constructor. We do not want that! In most cases, such code should be fixed, like in changeset 30186. Only implementations of API that require conversion should depend on it; for example, group formattable_cnv depends on group unistr_cnv, but then nothing inside ICU depends on that.
Verify the following for library code (common, i18n, layout, ustdio). The requirement is for ICU's memory management to be customizable by changing cmemory.h and the common base class.
Note: The requirement remains. The techniques to fix issues are valid. For testing, see the section “Check library dependencies” above.
Check that in uobject.h, UObject::new and delete are defined by default. Currently, this means to grep to see that U_OVERRIDE_CXX_ALLOCATION is defined to 1 (in pwin32.h for Windows).
Global new must never be imported. Global delete will be imported and used by empty-virtual destructors in interface/mixin classes. However, they are not called because implementation classes always derive from UMemory. No other functions must use global delete.
utypes.h has definitions for global new/delete, with inline implementations that will always crash. These global new/delete operators are only defined for code inside the ICU4C libraries (but must be there for all of those). See ticket #2581.
(Purify, Boundary Checker, valgrind...)
Make sure we fixed all the memory leak problems that were discovered when running these tools.
Build ICU with debug information. On Linux,
runConfigureICU --enable-debug --disable-release Linux
Run all of the standard tests under valgrind. For intltest, for example,
cd <where ever>/source/test/intltest
LD_LIBRARY_PATH=../../lib:../../stubdata:../../tools/ctestfw:$LD_LIBRARY_PATH valgrind ./intltest
You can grab the command line for running the tests from the output from “make check”, and then just insert “valgrind” before the executable.
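The “insert valgrind before the executable” step can be done mechanically. The helper below is a hypothetical sketch: it treats leading `VAR=value` words as environment assignments, puts `valgrind` in front of the first other word, and splits on whitespace only, so it does not handle quoted arguments that contain spaces.

```shell
#!/bin/sh
# Print the command line given in $1 with "valgrind" inserted before the
# executable, that is, before the first word that is not a VAR=value assignment.
insert_valgrind() (
    set -f                      # no filename expansion while splitting words
    result=
    inserted=no
    for word in $1; do          # deliberate whitespace splitting of the command line
        if [ "$inserted" = no ]; then
            case $word in
                [A-Za-z_]*=*) ;;   # leading VAR=value assignment: keep in place
                *) result="$result valgrind"; inserted=yes ;;
            esac
        fi
        result="$result $word"
    done
    printf '%s\n' "${result# }"
)

# Example:
# insert_valgrind 'LD_LIBRARY_PATH=../../lib:../../stubdata ./intltest'
```

Paste the command line copied from the “make check” output as the single argument, then run the printed line from the same directory.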
Our goal is that all releases go out to the public with 100% API test and at least 85% code coverage.
Testing external dependencies in header files:
(on Unixes) Prerequisite: Configure with --prefix (../icu4c/source/runConfigureICU Linux --prefix=/some/temp/folder) and do ‘make install’. Then set the PATH so that the installed icu-config script can be found. (export PATH=/some/temp/folder/bin:$PATH)
Then go to the ‘icu4c/test/hdrtst’ directory (note: not ‘source/test/hdrtst’) and do ‘make check’. This will attempt to compile against each header file individually to make sure there aren't any problems. Output looks like this; if no error springs up, all is in order.
If a C++ file fails to compile as a C file, add it to the ‘cxxfiles.txt’ located in the hdrtst directory.
As of ICU 65, the hdrtst is now run as part of the regular CI builds, and the C++ headers are now guarded with the macro “U_SHOW_CPLUSPLUS_API”.
There is no longer any “cxxfiles.txt” file. Instead the public C++ headers are all guarded with the macro “U_SHOW_CPLUSPLUS_API” which is set to 1 by default if __cplusplus is defined. Users of ICU can explicitly set the macro before including any ICU headers if they wish to only use the C APIs. Any new public C++ header needs to be similarly guarded with the macro, though this should be caught in the CI builds for a pull-request before it is merged.
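Before opening a pull request, a rough local check for unguarded C++ headers might look like the sketch below. It is a heuristic under a stated assumption: a header counts as “C++” if it uses `U_NAMESPACE_BEGIN`. It only tests whether the guard macro is mentioned anywhere in the file, so the CI header test remains the real check. The function name is hypothetical.

```shell
#!/bin/sh
# Print headers under directory $1 that look like C++ headers (they use
# U_NAMESPACE_BEGIN) but never mention U_SHOW_CPLUSPLUS_API.
unguarded_cxx_headers() {
    find "$1" -type f -name '*.h' | sort |
    while IFS= read -r h; do
        if grep -q 'U_NAMESPACE_BEGIN' "$h" &&
           ! grep -q 'U_SHOW_CPLUSPLUS_API' "$h"; then
            printf '%s\n' "$h"
        fi
    done
}

# Example:
# unguarded_cxx_headers icu4c/source/common/unicode
```

An empty result is the expected outcome. Any file listed deserves a closer look, although the heuristic can report false positives.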
Run this test with all the uconfig.h variations (see below).
ctest unicode/docmain.h
ctest unicode/icudataver.h
ctest unicode/icuplug.h
Run the following script straight from the source tree (from inside the “source” folder, not on the top level), no need to build nor install.
For a new release, also look for new tools and tests and add their folders to the script. You can ignore messages stating that no ‘*.h’ files were found in a particular directory.
The command line is simply
~/git.icu/icu4c/source$ test/hdrtst/testinternalheaders.sh
See https://unicode-org.atlassian.net/browse/ICU-12141 “every header file should include all other headers if it depends on definitions from them”
As of ICU 68, the internal header test is now automated as part of Travis CI.
Test ICU completely, and run the header test (above) with:
(See common/unicode/uconfig.h for more documentation.)
There is a script available which will automatically test ICU in this way on UNIXes, it lives in: tools/release/c/uconfigtest.sh. See docs at top of script for information.
When guard conditionals (e.g. #ifndef U_HIDE_INTERNAL_API) are removed because they cause header test failures, please note in the header file the reason that guard conditionals cannot be used in that location, or they will likely be re-added in the future.
Verify that ICU builds without enabling the default use of the ICU namespace. To test on Linux,
./runConfigureICU Linux CXXFLAGS="-DU_USING_ICU_NAMESPACE=0"
make check
Any problems will show up as compilation errors.
When definitions outside the ICU C++ namespace refer to ICU C++ classes, those need to be qualified with “icu::”, as in “icu::UnicodeString”. In rare cases, a C++ type is also visible in C code (e.g., ucol_imp.h has definitions that are visible to cintltst) and then we use U_NAMESPACE_QUALIFIER, which is defined to be empty when compiling for C.
The automated build system should have a machine that sets both -DU_USING_ICU_NAMESPACE=0 and -DU_CHARSET_IS_UTF8=1.
Make sure that the ICU4C common and i18n libraries build with UCONFIG_NO_CONVERSION set to 1. We cannot do this as part of “Test uconfig.h variations” because the test suites cannot be built like this, but the library code must support it.
The simplest is to take an ICU4C workspace, modify uconfig.h temporarily by changing the value of UCONFIG_NO_CONVERSION to 1, and do “make -j 6” (not “make check” or “make tests”). Verify that the stubdata, common & i18n libraries build fine; layout should build too but toolutil will fail, that's expected.
Fix any stubdata/common/i18n issues, revert the UCONFIG_NO_CONVERSION value, and verify that it still works with the normal setting.
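The temporary edit and its revert can be scripted so that the change is not committed by accident. This is a minimal sketch with hypothetical function names. It assumes that uconfig.h defines the flag on a single line of the form `#   define UCONFIG_NO_CONVERSION 0`, and it keeps a backup copy next to the file.

```shell
#!/bin/sh
# Rewrite "#define NAME <0|1>" in file $1 so that NAME ($2) gets value $3,
# keeping a backup copy as $1.orig.
set_uconfig_flag() {
    cp "$1" "$1.orig" &&
    sed -E "s/^(#[[:space:]]*define[[:space:]]+$2[[:space:]]+)[01]/\\1$3/" "$1.orig" > "$1"
}

# Put the backup made by set_uconfig_flag back in place.
restore_uconfig() {
    mv "$1.orig" "$1"
}

# Example, from icu4c/source (the header path is taken from the text above):
# set_uconfig_flag common/unicode/uconfig.h UCONFIG_NO_CONVERSION 1
# make -j 6
# restore_uconfig common/unicode/uconfig.h
```

After restoring, `git diff` on the header should show no changes before you rebuild with the normal setting.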
If this breaks, someone probably inadvertently uses the UnicodeString(const char *) constructor. See the “Check library dependencies” section and example fixes in changeset 30186.
Verify that ICU builds with default charset hardcoded to UTF-8. To test on Linux,
./runConfigureICU Linux CPPFLAGS="-DU_CHARSET_IS_UTF8=1"
make -j6 check
Any problems will show up as compilation or test errors.
Rather than setting the CPPFLAGS, you can also temporarily add #define U_CHARSET_IS_UTF8 1 in unicode/platform.h before it gets its default definition, or modify the default definition there. (In ICU 4.8 and earlier, this flag was in unicode/utypes.h.)
This works best on a machine that is set to use UTF-8 as its system charset, which is not possible on Windows.
The automated build system should have a machine that sets both -DU_USING_ICU_NAMESPACE=0 and -DU_CHARSET_IS_UTF8=1.
Verify that ICU builds with U_OVERRIDE_CXX_ALLOCATION=0 on Linux. Problems will show as build failures.
CPPFLAGS="-DU_OVERRIDE_CXX_ALLOCATION=0" ./runConfigureICU Linux
make clean
make -j12 check
Only necessary up to ICU4C 49.
The --disable-threads configure option is gone. If you want to test with ICU_USE_THREADS=0 then temporarily change this flag in intltest.h or in the intltest Makefile.
Verify that ICU builds and tests with threading disabled. To test on Linux,
./runConfigureICU Linux --disable-threads
make check
To build the ICU4C samples on Windows with Visual Studio, use the following steps:
To test the sample programs, run the “source\samples\all\samplecheck.bat” script for each configuration, and ensure that they are successful.
See https://github.com/unicode-org/icu-demos/blob/main/icu-kube/README.md
See: https://github.com/unicode-org/icu-demos/blob/main/icu4jweb/README.md
These are the demo applets, see above for the icu4jweb demos.
To test ICU4J demo applications, cd to ICU4J directory and build and run the demo.
$ cd icu4j
$ ant jarDemos
$ java -jar icu4jdemos.jar
The above command launches GUI demo applications, so it needs to connect to an X server. The easiest way is to run it in a remote desktop session on the machine where it executes, rather than in an ssh shell.
The demos include calendar, charset detection, holidays, RBNF and transliterator. Check if each application is working OK.
To check the ICU4J samples, open an Eclipse workspace and import the icu4j-samples project from the directory <icu4j_root>/samples. Make sure the sample code has no build issues. Also run each sample that has a main method and check that it runs.
For ICU4J,
$ ant exhaustiveCheck
For ICU4C, testing with an optimized build will help reduce the elapsed time required for the tests to complete.
$ make -j6 check-exhaustive
The build bots run the thread sanitizer on the most interesting multithreaded tests. These instructions run the sanitizer on the entire test suite. The clang compiler is required.
$ CPPFLAGS=-fsanitize=thread LDFLAGS=-fsanitize=thread ./runConfigureICU --enable-debug --disable-release Linux --disable-renaming
$ make clean
$ make -j6 check