ICU-20467 get XLocaleMatcher ready for drop-in

Get XLocaleMatcher ready for replacing the LocaleMatcher code.
More simplifications beyond ICU-20330 PR #409, smaller data, some more optimizations.
New API ready to be moved over.

- less work for region partitions distance lookup:
  - encode each array of single-character partition strings as one string
  - look up each desired partition only once, not for each (desired, supported) pair
  - look up the * fallback region distance only for the first mismatch, not for each non-matching pair
- skip region distance lookup if minRegionDistance>=remainingThreshold
- locale distance table: remove subtables that contain only *-* with default script/region distance
- mark intermediate subtag matches via last-character bit 7, not also with a match value
- likely subtags data: prune trailing *-only levels, and skip *-only script levels; add a likely subtags perf test
- likely subtags: skip_script=1; LSR.indexForRegion(ill-formed)=0 not negative
- likely subtags small optimization: array lookup for first letter of language subtag
- defaultDemotionPerDesiredLocale=distance(en, en-GB)
- favor=script: still reject a script mismatch
- if an explicit default locale is given, prefer that (by LSR), not the first supported locale
- XLocaleMatcher.Builder: copy supported locales into a List not a Set to preserve input indexes; duplicates are harmless
- match by LSR only, not by exact locale match; results are consistent with the no-fastpath behavior; simpler, but sometimes a little slower
- internal getBestMatch() returns just the suppIndex
- store the best desired locale & index in an LSR iterator
- make an LSR from Locale without ULocale detour
- adjust the XLocaleMatcher API as proposed; remove unused internal methods; clean up LocalePriorityList docs
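
The region-partition items above can be illustrated with a minimal sketch. It is not the code from this change: the class `PartitionSketch`, its methods, and the nested-map distance table are hypothetical stand-ins for the real data structures. It only shows the three ideas named in the list: joining single-character partition strings into one string, fetching the data for each desired partition once rather than once per (desired, supported) pair, and resolving the `*` fallback distance at most once, on the first mismatch.

```java
import java.util.Map;
import java.util.function.IntSupplier;

// Hypothetical illustration only; names and data layout are assumptions.
final class PartitionSketch {
    /** Joins single-character partition strings, e.g. {"a", "b"} becomes "ab". */
    static String encode(String[] partitions) {
        StringBuilder sb = new StringBuilder(partitions.length);
        for (String p : partitions) {
            if (p.length() != 1) {
                throw new IllegalArgumentException("not a single-character partition: " + p);
            }
            sb.append(p);
        }
        return sb.toString();
    }

    /**
     * Returns the minimum distance over all (desired, supported) partition pairs.
     * The table maps a desired partition to a row of supported partitions.
     * The fallback supplier stands in for the "*" lookup and is assumed to
     * return a non-negative distance.
     */
    static int minDistance(String desired, String supported,
            Map<Character, Map<Character, Integer>> table, IntSupplier fallbackLookup) {
        int min = Integer.MAX_VALUE;
        int fallback = -1;  // negative means "not looked up yet"
        for (int i = 0; i < desired.length(); i++) {
            // One lookup per desired partition, hoisted out of the inner loop.
            Map<Character, Integer> row = table.get(desired.charAt(i));
            for (int j = 0; j < supported.length(); j++) {
                Integer d = (row == null) ? null : row.get(supported.charAt(j));
                if (d == null) {
                    // Resolve the fallback only on the first mismatch, then reuse it.
                    if (fallback < 0) {
                        fallback = fallbackLookup.getAsInt();
                    }
                    d = fallback;
                }
                min = Math.min(min, d);
            }
        }
        return min;
    }
}
```

With a table that contains only the pair (a, x), calling `minDistance("ab", "xy", table, lookup)` invokes `lookup` a single time even though three of the four pairs miss.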
12 files changed
tree: 3e58c3266dfad75399f058ff4ef27ab923e7d8be
  1. .ci-builds/
  2. .github/
  3. docs/
  4. icu4c/
  5. icu4j/
  6. tools/
  7. vendor/
  8. .appveyor.yml
  9. .cpyskip.txt
  10. .gitattributes
  11. .gitignore
  12. .travis.yml
  13. README.md

International Components for Unicode

This is the repository for the International Components for Unicode. The ICU project is under the stewardship of The Unicode Consortium.

Subdirectories and Information

License

Please see ./icu4c/LICENSE (C and J are under an identical license file.)

Copyright © 2016 and later Unicode, Inc. and others. All Rights Reserved. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. Terms of Use and License