commit	d6b88d49e3be7096baf3828776c2b482a8ed1780
author	Andy Heninger <andy.heninger@gmail.com>	Sat Feb 01 20:20:37 2020 -0800
committer	Steven R. Loomis <srl295@gmail.com>	Mon Feb 03 16:51:17 2020 -0800
tree	95495986e3726905d07dfe31823efccd08344311
parent	b7d08bc04a4296982fcef8b6b8a354a9e4e7afca

ICU-20939 Fix problem w regexp \b boundaries & UTF-8 text

In regular expressions, the word boundaries found with \b were incorrect when matching in Unicode mode (where an ICU word break iterator is used to find the boundaries) and the text being matched was UTF-8 encoded.

The bug stemmed from a misunderstanding of how string indexes work with UText and break iterators. The code converted indexes from UTF-8 to UTF-16 when the original UTF-8 index was what was wanted everywhere. Removing the index conversion fixes the problem.
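The index mismatch described in this commit message can be illustrated with a small model. The sketch below is not ICU code: it is written in Python, and `toy_word_boundaries_utf8`, `is_boundary_buggy`, and `is_boundary_fixed` are hypothetical names invented for the illustration. The toy boundary finder only splits on whitespace and stands in for a real word break iterator. It assumes, as the message states, that boundaries over UTF-8 text are reported as UTF-8 (byte) offsets, so converting the query position to a UTF-16 index before the lookup gives wrong answers as soon as the text contains a multi-byte character.

```python
def utf8_to_utf16_index(text: str, utf8_offset: int) -> int:
    """Convert a UTF-8 byte offset (on a character boundary) to a UTF-16 code unit index."""
    prefix = text.encode("utf-8")[:utf8_offset].decode("utf-8")
    return len(prefix.encode("utf-16-le")) // 2


def toy_word_boundaries_utf8(text: str) -> set:
    """Hypothetical stand-in for a word break iterator.

    Returns boundaries as UTF-8 byte offsets: the start, the end, and every
    position where the text switches between whitespace and non-whitespace.
    """
    boundaries = {0}
    offset = 0
    prev_is_space = None
    for ch in text:
        is_space = ch.isspace()
        if prev_is_space is not None and is_space != prev_is_space:
            boundaries.add(offset)
        offset += len(ch.encode("utf-8"))
        prev_is_space = is_space
    boundaries.add(offset)
    return boundaries


def is_boundary_buggy(text: str, utf8_pos: int) -> bool:
    """Models the bug: the position is converted to UTF-16 before the lookup."""
    return utf8_to_utf16_index(text, utf8_pos) in toy_word_boundaries_utf8(text)


def is_boundary_fixed(text: str, utf8_pos: int) -> bool:
    """Models the fix: the original UTF-8 offset is used everywhere."""
    return utf8_pos in toy_word_boundaries_utf8(text)


# "naïve café", written with escapes so the source stays ASCII.
sample = "na\u00efve caf\u00e9"
end_of_first_word = len("na\u00efve".encode("utf-8"))  # UTF-8 offset just past "naïve"

print(is_boundary_fixed(sample, end_of_first_word))
print(is_boundary_buggy(sample, end_of_first_word))
```

For pure ASCII text the two index spaces coincide, so both variants agree; the difference only appears once a character occupies more than one byte in UTF-8.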

README.md

International Components for Unicode

This is the repository for the International Components for Unicode. The ICU project is under the stewardship of The Unicode Consortium.

Source: https://github.com/unicode-org/icu
Bugs: https://unicode-org.atlassian.net/projects/ICU

Build Status (master branch)

Builds: TravisCI, Azure Pipelines, Azure Pipelines (Exhaustive Tests), AppVeyor, Fuzzing

Subdirectories and Information

License

Please see ./icu4c/LICENSE (ICU4C and ICU4J are under an identical license file.)

Copyright © 2016 and later Unicode, Inc. and others. All Rights Reserved. Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries. Terms of Use and License