| # Confusables.txt |
| # Generated: %date%, MED |
| # This is a draft list of visually confusable characters, for use in conjunction with the |
| # recommendations in http://www.unicode.org/reports/tr36/ |
| # |
| # To fold using this list, first perform NFKD (if not already performed), |
| # then map each source character to the target character(s), then perform NFKD again. |
| # |
| # The format the standard Unicode semicolon-delimited hex. |
| # <source> ; <target> ; <internal_info> # <comment> |
| # |
| # The characters may be visually distinguishable in many fonts, or at larger sizes. |
| # Some anomalies are also introduced by 'closure'. That is, there may be a sequence of |
| # characters where each is visually confusable from the next, but the start and end are |
| # visually distinguishable. But when the set is closed, these will all map to together. |
| # |
| # This is unlike normalization data. There may be no connection between characters other |
| # than visual confusability. This data should not be used except in assessing visual confusability. |
| # |
| # This list is not limited to Unicode Identifier characters (XID_Continue) although the primary |
| # application will be to such characters. It is also not limited to lowercase characters, |
| # although the recommendations are to lowercase for security. |
| # |
| # Note that a some characters have unusual characteristics, and are not yet accounted for. |
| # For example, U+302E (?) HANGUL SINGLE DOT TONE MARK and U+302F (?) HANGUL DOUBLE DOT TONE MARK |
| # appear to the left of the prevous character. So what looks like "a:b" can actually be "ab\u302F" |
| # |
| # WARNING: The data is not final; it is very draft at this point, put together from different |
| # sources that need to be reviewed for accuracy and completeness of the mappings. |
| # There are still clear errors in the data; do not use this in any implementations. |
| # Ignore the internal_info field; it will be removed. |
| # |
| # Thanks especially to Eric van der Poel for collecting information about fonts using shared glyphs. |
| # ================================= |