| --- |
| layout: default |
| title: Healthy Code |
| parent: Release & Milestone Tasks |
| grand_parent: Contributors |
| nav_order: 50 |
| --- |
| |
| <!-- |
| © 2021 and later: Unicode, Inc. and others. |
| License & terms of use: http://www.unicode.org/copyright.html |
| --> |
| |
| # Healthy Code |
| {: .no_toc } |
| |
| ## Contents |
| {: .no_toc .text-delta } |
| |
| 1. TOC |
| {:toc} |
| |
| --- |
| |
| ## Check for ClassID in new class hierarchies |
| |
| Check that there are no "poor man's RTTI" methods in new class hierarchies. |
| |
| After ICU 50: New class hierarchies should not declare getDynamicClassID() at |
| all. UObject has a dummy implementation for it. |
| |
| ICU 4.6..50: New class hierarchies used UOBJECT_DEFINE_NO_RTTI_IMPLEMENTATION. |
| See Normalizer2 for an example declaration and implementation that satisfies the |
| virtual-function override without adding new ClassID support. |
| |
| We do need to keep and add "poor man's RTTI" in old classes, and in new classes |
| extending existing class hierarchies (where parent classes have @stable RTTI |
| functions). (For example, a new Format subclass.) |
| |
| One easy way to check for this is to search through the API change report and |
| look for "UClassID" or similar. |
| |
| * Latest trunk report: |
| [icu/icu4c/APIChangeReport.html](https://github.com/unicode-org/icu/blob/main/icu4c/APIChangeReport.html) |
| * ~~The automated build system creates a new report: |
| <https://cldr-build.unicode.org/jenkins/job/icu/job/icu-apidocs/>~~ |
| * Click "ICU4C API Change Report vs Latest" |
| * Old: was http://bugs.icu-project.org/trac/build/icu4cdocs |
| * Go into the latest successful build, the report is an attachment there. |
| * Download the style sheet next to it: |
| https://github.com/unicode-org/icu/blob/main/icu4c/icu4c.css |
| |
| --- |
| |
| ## ~~Check for non-ascii characters in ICU4C source files \[obsolete\]~~ |
| |
| ~~Note: ICU4C and ICU4J source files are UTF-8. The ASCII check is no longer |
| appropriate for them.~~ |
| |
| ```sh |
| cd icu4c/source |
| find . \( -name "*.[ch]" -or -name "*.cpp" \) -exec grep -PHn [^[:ascii:]] {} \; |
| ``` |
| |
| --- |
| |
| ## Check source files for valid UTF-8 and correct text file line endings |
| |
| Verify that all source and text files in the repository have plain LF line |
| endings. |
| |
| To do this on Linux, In an up-to-date git workspace, |
| |
| ```sh |
| cd icu/icu4c/source |
| tools/icu-file-utf8-check.py # reports problems |
| ``` |
| |
| The same python script from the icu4c tools will also check icu4j |
| |
| ```sh |
| cd icu/icu4j |
| ../icu4c/source/tools/icu-file-utf8-check.py |
| ``` |
| |
| To double-check the line endings, the following grep will find all text files |
| containing \\r characters. Do not run from Windows, where \\r\\n line endings |
| are expected. |
| |
| ```sh |
| cd icu |
| grep -rPIl "\r" * |
| ``` |
| |
| Even when run from Mac or Linux, some WIndows specific files (.bat, etc) will be |
| found by this check. This is OK. |
| |
| ## ~~Check UTF-8 file properties \[obsolete\]~~ |
| |
| *Note: As of ICU 63, the project moved from svn to GitHub. SVN file properties |
| are no longer relevant.* |
| |
| <span style="color:red">**Note: As of ICU 59, ICU4C source files are UTF-8 |
| encoded, and have the svn mime-type property "text/plain;charset=utf-8". They |
| must not have a BOM.**</span> |
| |
| This is checked by the above task, *Check svn properties, valid UTF-8 and text |
| file line endings.* |
| |
| The bomfix.py script, formerly used for this task, must *not* be run over the |
| ICU4C sources |
| |
| <span style="color:red">**~~Note: This task is only applicable to ICU4C. ICU4J |
| .java source files are encoded by UTF-8, but must be without UTF-8 |
| BOM.~~**</span> |
| |
| ~~Check that the following match: Files marked as UTF-8 vs. Files beginning with |
| the UTF-8 signature byte sequence ("BOM").~~ |
| |
| ~~Run:~~ |
| |
| <pre><code><s>cd {icu}/icu4c</s> |
| <s>python ../tools/release/c/bomfix.py</s></code></pre> |
| |
| --- |
| |
| ## Clean up import statements |
| |
| The Eclipse IDE provides a feature which allow you to organize import statements |
| for multiple files. Right click on projects/source folders/files, you can select |
| \[Source\] - \[Organize Imports\] which resolve all wildcard imports and sort |
| the import statements in a consistent order. (Note: You may experience OOM |
| problem when your run this for projects/folders which contain many files. In |
| this case, you may need to narrow a selection per iteration.) |
| |
| --- |
| |
| ## Check library dependencies |
| |
| ***ICU 64+ (2019+): Done automatically by a build bot, at least in one of the |
| two modes (debug/release), ok to skip as BRS task.*** |
| |
| We want to keep dependencies between .c/.cpp/.o files reasonable, both between |
| and inside ICU's libraries. |
| |
| On Linux, run |
| [source/test/depstest/depstest.py](https://github.com/unicode-org/icu/blob/main/icu4c/source/test/depstest/dependencies.py), |
| for example: |
| ```sh |
| ~/icu/mine/src/icu4c/source/test/depstest$ ./depstest.py ~/icu/mine/dbg/icu4c |
| ``` |
| |
| Do this twice: Once for a release build (optimized) and once for a debug build |
| (unoptimized). They pull in slightly different sets of standard library symbols |
| (see comments in dependencies.txt). |
| |
| If everything is fine, the test will print "OK: Specified and actual |
| dependencies match." If not... |
| |
| *Get changes reviewed by others, including Markus; including changes in |
| dependencies.txt, .py scripts, or ICU4C code.* |
| |
| At first, the test will likely complain about .o files not listed in its |
| dependencies.txt or, if files were removed, the other way around. Try to add |
| them to groups or create new groups as appropriate. |
| |
| As a rule, smaller groups with fewer dependencies are preferable. If public API |
| (e.g., constructing a UnicodeString via conversion) is not needed inside ICU |
| (e.g., unistr_cnv), make its library depend on its new group. |
| |
| If the test prints "Info: group icuplug does not need to depend on platform" |
| then the plug-in code is disabled, as is the default since ICU 56. Consider |
| enabling it for dependency checking, but make sure to revert that before you |
| commit changes! |
| |
| There are usually other "Info" messages where the tool thinks that dependencies |
| on system symbols are not needed. These are harmless, that is, don't try to |
| remove them: They are needed or not needed based on the compiler flags used. If |
| you remove them, then you will likely cause an error for someone with different |
| flags. Also, in an unoptimized build you only get half as many info messages. |
| You get more in an optimized build because more system stuff gets inlined. |
| |
| The test might complain "is this a new system symbol?" We should be careful |
| about adding those. For example, we must not call printf() from library code, |
| nor the global operator new. |
| |
| The test might complain that some .o file "imports |
| `icu_48::UnicodeString::UnicodeString(const char *)` but does not depend on |
| unistr_cnv.o". This probably means that someone passes a simple `"string literal"` |
| or a `char *` into a function that takes a UnicodeString, which invokes the |
| default-conversion constructor. We do not want that! In most cases, such code |
| should be fixed. Only implementations |
| of API that require conversion should depend on it; for example, group |
| formattable_cnv depends on group unistr_cnv, but then nothing inside ICU depends |
| on that. |
| |
| --- |
| |
| ## Verify proper memory allocation functions |
| |
| Verify the following for library code (common, i18n, layout, ustdio). The |
| requirement is for ICU's memory management to be customizable by changing |
| cmemory.h and the common base class. |
| |
| **Note:** The requirement remains. The techniques to fix issues are valid. |
| ***For testing**, see the section "Check library dependencies" above.* |
| |
| * No raw malloc/free/realloc but their uprv_ versions. |
| * All C++ classes must inherit the common base class UObject or UMemory |
| * But not for mixin/interface classes (pure virtual, no data members, no |
| constructors) because that would break the single-inheritance model. |
| * Also not for pure-static classes (used for name scoping) declare but |
| don't implement a private default constructor to prevent instantiation. |
| * Simple struct-like C++ classes (and structs) that do not have |
| constructors, destructors, and virtual methods, need not inherit the |
| base class but must then be allocated with uprv_malloc. |
| * All simple types (UChar, int32_t, pointers(!), ...) must be allocated with |
| uprv_malloc and released with uprv_free. |
| * For Testing that this is the case, on Linux |
| * run the Perl script ICUMemCheck.pl. Follow the instructions in the |
| script. The script is in in the ICU4C sources at |
| source/tools/memcheck/ICUMemCheck.pl. |
| * also, depstest.py will show an error message if the global operator new |
| is used (see section about checking dependencies) |
| * For testing that this is the case, on Windows: |
| * Don't bother, as of Nov, 2004. Failures appear in many dozens of files |
| from the mixin class destructors. Do the check on Linux. But, for |
| reference, here are the old instructions. |
| * Make sure that in `uobject.h` `UObject::new` and `delete` are |
| defined by default. Currently, this means to grep to see that |
| `U_OVERRIDE_CXX_ALLOCATION` is defined to 1 (in `pwin32.h` for |
| Windows). |
| * Check the \*.obj's for linkage to the global (non-UObject::) |
| operators new/delete; see uobject.cpp for details. |
| * Global `new` must never be imported. Global `delete` will be |
| imported and used by empty-virtual destructors in interface/mixin |
| classes. However, they are not called because implementation classes |
| always derive from UMemory. No other functions must use global |
| delete. |
| * There are now (2002-dec-17) definitions in `utypes.h` for global |
| new/delete, with inline implementations that will always crash. |
| These global new/delete operators are only defined for code inside |
| the ICU4C libraries (but must be there for all of those). See ticket |
| #2581. |
| * If a global new/delete is used somewhere, then change the class |
| inheritance and/or use uprv_malloc/free until no global new/delete |
| are used in the libraries (and the tests still pass...). See the |
| User Guide Coding Guidelines for details. |
| |
| --- |
| |
| ## Run static code analysis tools |
| |
| (Purify, Boundary Checker, valgrind...) |
| |
| Make sure we fixed all the memory leak problems that were discovered when |
| running these tools. |
| |
| Build ICU with debug information. On Linux, |
| |
| ```sh |
| runConfigureICU --enable-debug --disable-release Linux |
| ``` |
| |
| Run all of the standard tests under valgrind. For intltest, for example, |
| |
| ```sh |
| cd <where ever>/source/test/intltest |
| LD_LIBRARY_PATH=../../lib:../../stubdata:../../tools/ctestfw:$LD_LIBRARY_PATH valgrind ./intltest |
| ``` |
| |
| You can grab the command line for running the tests from the output from "make |
| check", and then just insert "valgrind" before the executable. |
| |
| --- |
| |
| ## Check the code coverage numbers |
| |
| Our goal is that all releases go out to the public with 100% API test and at |
| least 85% code coverage. |
| |
| --- |
| |
| ## Test ICU4C headers |
| |
| ### Test ICU4C public headers |
| |
| Testing external dependencies in header files: |
| |
| (on Unixes) Prerequisite: Configure with --prefix |
| (../icu4c/source/runConfigureICU Linux --prefix=/some/temp/folder) and do 'make |
| install'. Then set the PATH so that the installed icu-config script can be |
| found. (export PATH=/some/temp/folder/**bin**:$PATH) |
| |
| Then go to the 'icu4c/test/hdrtst' directory (note: not 'source/test/hdrtst') |
| and do 'make check'. This will attempt to compile against each header file |
| individually to make sure there aren't any problems. Output looks like this, if |
| no error springs up all is in order. |
| |
| ~~If a C++ file fails to compile as a C file, add it to the 'cxxfiles.txt' |
| located in the hdrtst directory.~~ |
| |
| <span style="color:red">**As of ICU 65, the hdrtst is now run as part of the |
| regular CI builds, and the C++ headers are now guarded with the macro |
| "U_SHOW_CPLUSPLUS_API".**</span> |
| |
| There is no longer any "cxxfiles.txt" file. Instead the public C++ headers are |
| all guarded with the macro "U_SHOW_CPLUSPLUS_API" which is set to 1 by default |
| if __cplusplus is defined. Users of ICU can explicitly set the macro before |
| including any ICU headers if they wish to only use the C APIs. Any new public |
| C++ header needs to be similarly guarded with the macro, though this should be |
| caught in the CI builds for a pull-request before it is merged. |
| |
| Run this test with all the uconfig.h variations (see below). |
| |
| ```sh |
| ctest unicode/docmain.h |
| ctest unicode/icudataver.h |
| ctest unicode/icuplug.h |
| ``` |
| |
| ### Test ICU4C internal headers |
| |
| Run the following script straight from the source tree (from inside the "source" |
| folder, not on the top level), no need to build nor install. |
| |
| For a new release, also look for new tools and tests and add their folders to |
| the script. You can ignore messages stating that no '\*.h' files were found in a |
| particular directory. |
| |
| The command line is simply |
| |
| ```sh |
| ~/git.icu/icu4c/source$ test/hdrtst/testinternalheaders.sh |
| ``` |
| |
| See https://unicode-org.atlassian.net/browse/ICU-12141 "every header file should |
| include all other headers if it depends on definitions from them" |
| |
| <span style="color:red">**As of ICU 68, the internal header test is now |
| automated as part of Travis CI.**</span> |
| |
| --- |
| |
| ## Test uconfig.h variations |
| |
| Test ICU completely, and run the header test (above) with: |
| |
| 1. **none** of the 'UCONFIG_NO_XXX' switches turned on (i.e., the normal case) |
| 2. **all** of the 'UCONFIG_NO_XXX' switches turned on (set to 1) |
| 3. For each switch, test once with **just** that one switch on. But note the |
| exception regarding [UCONFIG_NO_CONVERSION test](healthy-code.md) below. |
| |
| (See |
| [common/unicode/uconfig.h](https://github.com/unicode-org/icu/blob/main/icu4c/source/common/unicode/uconfig.h) |
| for more documentation.) |
| |
| There is a script available which will automatically test ICU in this way on |
| UNIXes, it lives in: |
| [tools/release/c/uconfigtest.sh](https://github.com/unicode-org/icu/blob/main/tools/release/c/uconfigtest.sh). |
| See docs at top of script for information. |
| |
| <span style="background-color:yellow">When guard conditionals (e.g. #ifndef |
| U_HIDE_INTERNAL_API) are removed because they cause header test failures, please |
| note in the header file the reason that guard conditionals cannot be used in |
| that location, or they will lkeiely be re-added in the future. |
| |
| --- |
| |
| ## Test C++ Namespace Use |
| |
| Verify that ICU builds without enabling the default use of the ICU namespace. To |
| test on Linux, |
| |
| ```sh |
| ./runConfigureICU Linux CXXFLAGS="-DU_USING_ICU_NAMESPACE=0" |
| make check |
| ``` |
| |
| Any problems will show up as compilation errors. |
| |
| When definitions outside the ICU C++ namespace refer to ICU C++ classes, those |
| need to be qualified with "`icu::`", as in "`icu::UnicodeString`". In rare |
| cases, a C++ type is also visible in C code (e.g., ucol_imp.h has definitions |
| that are visible to cintltst) and then we use `U_NAMESPACE_QUALIFIER` which is |
| defined to be empty when compiling for C. |
| |
| The automated build system should have a machine that sets both |
| `-DU_USING_ICU_NAMESPACE=0` and `-DU_CHARSET_IS_UTF8=1`. |
| |
| --- |
| |
| ## Test UCONFIG_NO_CONVERSION |
| |
| Make sure that the ICU4C common and i18n libraries build with |
| UCONFIG_NO_CONVERSION set to 1. We cannot do this as part of "Test uconfig.h |
| variations" because the test suites cannot be built like this, but the library |
| code must support it. |
| |
| The simplest is to take an ICU4C workspace, modify uconfig.h *==temporarily==* |
| by changing the value of UCONFIG_NO_CONVERSION to 1, and do "make -j 6" (not |
| "make check" or "make tests"). Verify that the stubdata, common & i18n libraries |
| build fine; layout should build too but toolutil will fail, that's expected. |
| |
| Fix any stubdata/common/i18n issues, revert the UCONFIG_NO_CONVERSION value, and |
| verify that it still works with the normal setting. |
| |
| If this breaks, someone probably inadvertently uses the `UnicodeString(const char *)` constructor. |
| See the "Check library dependencies" section. |
| |
| --- |
| |
| ## Test U_CHARSET_IS_UTF8 |
| |
| Verify that ICU builds with default charset hardcoded to UTF-8. To test on |
| Linux, |
| |
| ```sh |
| ./runConfigureICU Linux CPPFLAGS="-DU_CHARSET_IS_UTF8=1" |
| make -j6 check |
| ``` |
| |
| Any problems will show up as compilation or test errors. |
| |
| Rather than setting the CPPFLAGS, you can also temporarily add `#define |
| U_CHARSET_IS_UTF8 1` in unicode/platform.h before it gets its default |
| definition, or modify the default definition there. (In ICU 4.8 and earlier, |
| this flag was in unicode/utypes.h.) |
| |
| This works best on a machine that is set to use UTF-8 as its system charset, |
| which is not possible on Windows. |
| |
| The automated build system should have a machine that sets both |
| `-DU_USING_ICU_NAMESPACE=0` and `-DU_CHARSET_IS_UTF8=1`. |
| |
| --- |
| |
| ## Test U_OVERRIDE_CXX_ALLOCATION=0 |
| |
| Verify that ICU builds with U_OVERRIDE_CXX_ALLOCATION=0 on Linux. Problems will |
| show as build failures. |
| |
| ```sh |
| CPPFLAGS="-DU_OVERRIDE_CXX_ALLOCATION=0" ./runConfigureICU Linux |
| make clean |
| make -j12 check |
| ``` |
| |
| ## ~~Test ICU_USE_THREADS=0 \[Obsolete\]~~ |
| |
| ***Only necessary up to ICU4C 49.*** |
| |
| * ICU 50m1 removes ICU_USE_THREADS from the runtime code (ticket |
| [ICU-9010](https://unicode-org.atlassian.net/browse/ICU-9010)). |
| * It is still possible to build and test intltest with ICU_USE_THREADS=0 but |
| not nearly as important. |
| * In ICU 50m1, the `--disable-threads` configure option is gone. If you want |
| to test with ICU_USE_THREADS=0 then temporarily change this flag in |
| intltest.h or in the intltest Makefile. |
| |
| Verify that ICU builds and tests with threading disabled. To test on Linux, |
| |
| ```sh |
| ./runConfigureICU Linux --disable-threads |
| make check |
| ``` |
| |
| --- |
| |
| ## Test ICU4C Samples and Demos |
| |
| ### Windows build and test |
| To build the ICU4C samples on Windows with Visual Studio, use the following |
| steps: |
| |
| * Open the "allinone" solution file located under |
| "source\\allinone\\allinone.sln". |
| * Build the Debug/Release + x86/x64 configurations (all 4 configurations). |
| * Make sure the generated data file (ex: "icu4c\\bin\\icudt64.dll") is not |
| stub data, it should be ~26MB. |
| * Open the "all" Solution file located under "source\\samples\\all\\all.sln". |
| * Build both x86 and x64 using the "Batch Build" option. This is located under |
| the menu Build > Batch Build. |
| * Click the "Select All" button. |
| * Click the "Build" button. |
| * If Visual Studio returns errors using the Batch Build option, build each |
| configuration individually instead. |
| * The samples should all build cleanly with no errors. |
| |
| To test the sample programs, run the "source\\samples\\all\\samplecheck.bat" |
| script for each configuration, and ensure that they are successful. |
| |
| ### Linux /Unix build and test |
| To build and test ICU4C samples: |
| |
| * In icu4c/source, run the configuration "./runConfigure Linux" (or appropriate system) |
| * Build and install ICU4C. |
| * Set PATH to include the bin directory of the installed ICU4c. |
| * Set LD_LIBRARY_PATH to include the libs directory on the installed ICU4c. |
| |
| |
| ``` |
| cd icu4c/source |
| cd samples |
| # To clean all the test binaries |
| make clean-samples-recursive |
| # To rebuild them all |
| make all-samples-recursive |
| # To run all tests serially |
| make check-samples-recursive |
| |
| ``` |
| |
| |
| ## **Test ICU4C Demos via Docker** |
| |
| See <https://github.com/unicode-org/icu-demos/blob/main/icu-kube/README.md> |
| |
| --- |
| |
| ## **Test ICU4J Web Demos via Docker** |
| |
| See: <https://github.com/unicode-org/icu-demos/blob/main/icu4jweb/README.md> |
| |
| --- |
| |
| ## Test ICU4J Demos |
| |
| These are the demo applets, see above for the icu4jweb demos. |
| |
| To test ICU4J demo applications, cd to ICU4J directory and build and run the |
| demo. |
| |
| ```sh |
| $ cd icu4j |
| $ ant jarDemos |
| $ java -jar icu4jdemos.jar |
| ``` |
| |
| Above command invokes GUI demo applications. As such it has to connect to a |
| X-Server. The easiest way is to run via e.g. remote desktop on the machine on |
| which it is executed instead of in a ssh shell. |
| |
| The demos include calendar, charset detection, holidays, RBNF and |
| transliterator. Check if each application is working OK. |
| |
| To check ICU4J samples, open Eclipse workspace and import icu4j-samples project |
| from directory <icu4j_root>/samples. Make sure these sample code has no build |
| issues. Also run sample code with main and see if each sample code runs. |
| |
| --- |
| |
| ## Test, exhaustive mode, C & J |
| |
| For ICU4J, |
| |
| ```sh |
| $ ant exhaustiveCheck |
| ``` |
| |
| For ICU4C, testing with an optimized build will help reduce the elapsed time |
| required for the tests to complete. |
| |
| ```sh |
| $ make -j6 check-exhaustive |
| ``` |
| |
| --- |
| |
| ## Test ICU4C with the Thread Sanitizer |
| |
| The build bots run the thread sanitizer on the most interesting multithreaded |
| tests. These instructions run the sanitizer on the entire test suite. The clang |
| compiler is required. |
| |
| ```sh |
| $ CPPFLAGS=-fsanitize=thread LDFLAGS=-fsanitize=thread ./runConfigureICU --enable-debug --disable-release Linux --disable-renaming |
| $ make clean |
| $ make -j6 check |
| ``` |