ICU-20732 Adds instructions on how to develop an ICU fuzzer target and how to
reproduce fuzzer findings.

ICU-20732 Addresses review comments.

diff --git a/docs/processes/ b/docs/processes/
new file mode 100644
index 0000000..7b179bd
--- /dev/null
+++ b/docs/processes/
@@ -0,0 +1,156 @@
+© 2019 and later: Unicode, Inc. and others.
+License & terms of use:
+Developing Fuzzer Targets for ICU APIs
+This document describes how to develop a [fuzzer](
+target for an ICU API and its integration into the ICU build process.
+### Directory and naming conventions
+Fuzzer targets are exclusively in directory
+and end with `_fuzzer.cpp`. Only files with this ending are recognized and executed as fuzzer
+targets by the OSS-Fuzz system.
+### General structure of a fuzzer target
+As a minimum, a fuzzer target contains the function
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
+  ...
+}
+This function is expected and invoked by the fuzzer system. The `data` parameter contains the
+fuzzer-controlled data of size `size` bytes. Part or all of this data is then passed into the
+ICU API under test.
+Fuzzer target
+illustrates the basic elements.
+// © 2019 and later: Unicode, Inc. and others.
+// License & terms of use:
+#include <cstring>
+#include <memory>
+#include "fuzzer_utils.h"
+#include "unicode/coll.h"
+#include "unicode/localpointer.h"
+#include "unicode/locid.h"
+#include "unicode/tblcoll.h"
+IcuEnvironment* env = new IcuEnvironment();
+extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
+  UErrorCode status = U_ZERO_ERROR;
+  size_t unistr_size = size/2;
+  std::unique_ptr<char16_t[]> fuzzbuff(new char16_t[unistr_size]);
+  std::memcpy(fuzzbuff.get(), data, unistr_size * 2);
+  icu::UnicodeString fuzzstr(false, fuzzbuff.get(), unistr_size);
+  icu::LocalPointer<icu::RuleBasedCollator> col1(
+      new icu::RuleBasedCollator(fuzzstr, status));
+  return 0;
+}
+The ICU API under test is the `RuleBasedCollator(const UnicodeString &rules, UErrorCode &status)`
+constructor. The code interprets the fuzzer data as a `UnicodeString` and passes it to the constructor.
+That is all. Specific error handling or return value verification is not required because the
+fuzzer detects memory issues through address/memory sanitizer findings.
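+The byte-to-UTF-16 step of the target above can be looked at in isolation. The following is a
+minimal sketch that does not depend on ICU; `BytesToCodeUnits` is a hypothetical name and not part
+of ICU or `fuzzer_utils.h`. It shows why `size / 2` is used (an odd trailing byte is dropped) and
+why the bytes are copied rather than cast.
+```cpp
+#include <cstddef>
+#include <cstdint>
+#include <cstring>
+#include <string>
+// Hypothetical helper (not part of ICU): reinterprets fuzzer bytes as
+// UTF-16 code units in host byte order.
+std::u16string BytesToCodeUnits(const uint8_t* data, size_t size) {
+  // Two bytes per code unit; an odd trailing byte is ignored.
+  const size_t unit_count = size / 2;
+  std::u16string units(unit_count, u'\0');
+  if (unit_count > 0) {
+    // memcpy avoids the unaligned access and aliasing problems that a
+    // reinterpret_cast of `data` could cause.
+    std::memcpy(&units[0], data, unit_count * sizeof(char16_t));
+  }
+  return units;
+}
+```
+The resulting code units could then be handed to the API under test, much as the target above
+does with its own buffer.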
+### changes
+ICU fuzzer targets are built and executed by the OSS-Fuzz project. On the ICU side, they are compiled
+to ensure that the code builds and, as a sanity check, executed in the most basic
+manner, i.e. with minimal test data and without ASAN or MSAN analysis.
+Add the new fuzzer target to the list of targets in the `FUZZER_TARGETS` variable in
+The new fuzzer target will then be built and executed as part of a normal ICU4C unit test run. Note
+that each fuzzer target becomes an executable of its own. For this purpose it is linked with the code in
+`fuzzer_driver.cpp`, which contains the `main()` function.
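+To illustrate the idea of a separate driver, here is a minimal sketch of what a helper behind such
+a `main()` might look like. It is not the content of ICU's `fuzzer_driver.cpp`; the names
+`FuzzTarget` and `RunTargetOnFile` are hypothetical. It reads one input file and passes its bytes
+to the target exactly once.
+```cpp
+#include <cstddef>
+#include <cstdint>
+#include <fstream>
+#include <iterator>
+#include <string>
+#include <vector>
+// Signature shared by all fuzzer targets.
+using FuzzTarget = int (*)(const uint8_t* data, size_t size);
+// Hypothetical helper: reads the whole file at `path` and invokes `target`
+// once with its contents. Returns -1 if the file cannot be opened.
+int RunTargetOnFile(const std::string& path, FuzzTarget target) {
+  std::ifstream in(path, std::ios::binary);
+  if (!in) {
+    return -1;
+  }
+  const std::vector<char> bytes((std::istreambuf_iterator<char>(in)),
+                                std::istreambuf_iterator<char>());
+  return target(reinterpret_cast<const uint8_t*>(bytes.data()), bytes.size());
+}
+```
+Under these assumptions, a `main()` would call `RunTargetOnFile(argv[1], LLVMFuzzerTestOneInput)`
+and return the result.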
+### Fuzzer seed corpus
+Any fuzzer seed data for a fuzzer target goes into a file named `<fuzzer_target>_seed_corpus.txt`.
+In many cases the input parameter of the ICU API under test is of type `UnicodeString`, in which
+case the seed data should be in UTF-16 format. As an example, see
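+Seed data in UTF-16 can also be produced programmatically. The following is a minimal sketch,
+assuming the seed is stored as little-endian UTF-16 without a byte order mark (an assumption; this
+document does not prescribe a byte order). `EncodeUtf16LE` is a hypothetical helper, not an ICU API.
+```cpp
+#include <cstdint>
+#include <string>
+#include <vector>
+// Hypothetical helper: serializes UTF-16 code units as little-endian bytes,
+// without a byte order mark.
+std::vector<uint8_t> EncodeUtf16LE(const std::u16string& text) {
+  std::vector<uint8_t> bytes;
+  bytes.reserve(text.size() * 2);
+  for (char16_t unit : text) {
+    bytes.push_back(static_cast<uint8_t>(unit & 0xFF));         // low byte
+    bytes.push_back(static_cast<uint8_t>((unit >> 8) & 0xFF));  // high byte
+  }
+  return bytes;
+}
+```
+The returned bytes would then be written in binary mode to the seed corpus file of the target.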
+### Guidelines and tips
+*   Leave all randomness to the fuzzer. If a random selection of any kind is needed (e.g., of a
+    locale), then use bytes from the fuzzer data to make the selection
+    ([example](
+*   In many cases ICU unit tests can provide seed data or at least ideas for seed data. If the API
+    under test requires a Unicode string then make sure that the seed data is in UTF-16 encoding.
+    This can be achieved with e.g. the 'iconv' command or using an editor that saves text in UTF-16.
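+The first guideline can be sketched as follows. This is a minimal illustration, not code from an
+existing ICU target: the locale list `kLocales` and the function `PickLocale` are hypothetical. The
+selection is derived from the first fuzzer byte, so the same input always selects the same locale
+and a finding stays reproducible.
+```cpp
+#include <cstddef>
+#include <cstdint>
+// Hypothetical candidate list; a real target would choose locales relevant
+// to the API under test.
+static const char* const kLocales[] = {"en_US", "de_DE", "ja_JP", "ar_EG"};
+static const size_t kLocaleCount = sizeof(kLocales) / sizeof(kLocales[0]);
+// Selects a locale from the first fuzzer byte instead of a random source.
+const char* PickLocale(const uint8_t* data, size_t size) {
+  if (size == 0) {
+    return kLocales[0];  // no data: fall back to the first entry
+  }
+  return kLocales[data[0] % kLocaleCount];
+}
+```
+A target using this approach would treat the remaining bytes (`data + 1`, `size - 1`) as the
+payload for the API under test.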
+### How to locally reproduce fuzzer findings
+At this time, reproducing fuzzer findings requires Docker installed on the local machine and the
+OSS-Fuzz project cloned into a local git workspace.
+1.  Install Docker (Ubuntu):
+    ```
+    sudo apt install docker.io
+    ```
+2.  Clone OSS-Fuzz and change into the `oss-fuzz/` directory.
+    In a local git workspace directory, clone the fuzzer system:
+    ```
+    git clone
+    cd oss-fuzz/
+    ```
+3.  Build the Docker image for ICU.
+    In some setups root permissions may be required to connect to the Docker daemon.
+    ```
+    [sudo] python infra/ build_image icu
+    ```
+    A prompt will appear: `Pull latest base images (compiler/runtime)? (y/N)`
+    Respond with `N`. Responding with `y` is also fine; it only pulls newer base images.
+4.  Build the ICU fuzzers:
+    ```
+    [sudo] python infra/ build_fuzzers --sanitizer [address | memory | undefined] icu
+    ```
+    Check that the fuzzer targets were built successfully: `ls -l build/out/icu`
+5.   Reproduce the fuzzer finding.
+     First, get the test data the fuzzer used when finding the issue. In the fuzzer bug report, look
+     for 'Reproducer Testcase'; clicking the link downloads the test data. Then execute
+     ```
+     [sudo] python infra/ reproduce icu <icu_fuzzer> <testdata>
+     ```
+     Concrete example:
+     ```
+     sudo python infra/ reproduce icu uregex_open_fuzzer  ~/Downloads/clusterfuzz-testcase-minimized-uregex_open_fuzzer-5732067058384896
+     ```
+**Limitations:** When reproducing a fuzzer finding in the way outlined above the fuzzer environment
+will use the current ICU trunk from Thus it is not possible
+to modify the code to try out a possible fix. What can be done is to redirect Docker to download ICU
+from a forked ICU repository. Open the file oss-fuzz/projects/icu/Dockerfile and adjust the line
+with `git clone --depth 1 icu` accordingly. Then modify
+the code in the forked repository and follow the steps above, beginning with step 3 (building the
+Docker image).
+This of course is still a tedious way of reproducing and working on a fuzzer finding. Ticket
+[ICU-20734]( aims to introduce a fuzzer driver
+that can reproduce certain fuzzer findings in a local ICU workspace.