| <html> |
| |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <meta http-equiv="Content-Language" content="en-us"> |
| <link rel="stylesheet" href="charts.css" type="text/css"> |
| <meta name="GENERATOR" content="Microsoft FrontPage 4.0"> |
| <meta name="ProgId" content="FrontPage.Editor.Document"> |
| <title>UCA Chart Help</title> |
| <base target="main"> |
| </head> |
| |
| <body> |
| |
| <h2 align="center">UCA Chart Help</h2> |
| <p>This set of charts shows the Unicode Collation Algorithm values for Unicode |
| characters. The characters are arranged in the following groups:</p> |
| <table cellspacing="0" cellpadding="4"> |
| <tr> |
| <th align="left"><i>Null</i></th> |
| <th class="x">Completely ignorable (primary, secondary, and tertiary levels)<br> |
| These include control codes and various formatting codes.</th> |
| </tr> |
| <tr> |
| <th align="left"><i>Ignorable</i></th> |
| <th class="x">Ignorable at a primary level, but not at a secondary or |
| tertiary level.<br> |
| These include most accents and diacritics.</th> |
| </tr> |
| <tr> |
| <th align="left"><i>Variable</i></th> |
| <th class="x">Characters that may be set to ignorable by a programmatic |
| switch.<br> |
| These include spaces, punctuation marks, and most symbols.</th> |
| </tr> |
| <tr> |
| <th align="left"><i>Common</i></th> |
| <th class="x">Characters that are none of the above but are not considered |
| letters.<br> |
| These include numbers, currency symbols, etc.</th> |
| </tr> |
| <tr> |
| <th align="left"><i>Letters</i></th> |
| <th class="x">According to script</th> |
| </tr> |
| <tr> |
| <th align="left"><i>Unsupported</i></th> |
| <th class="x">Not explicitly supported in this version of UCA; uses |
| code-point order</th> |
| </tr> |
| </table> |
| <p>The characters* within each group are arranged in cells. The color of the |
| cell indicates the strength of the difference between that character and the <i>previous</i> |
| character in the chart, as follows.</p> |
| <table cellspacing="0" cellpadding="4"> |
| <tr> |
| <th colspan="2"><font size="3"><u>No Expansion</u></font></th> |
| <th rowspan="5"></th> |
| <th colspan="2"><font size="3"><u>Expansion</u></font></th> |
| </tr> |
| <tr> |
| <td class="p">a<br> |
| <tt>0061</tt></td> |
| <th class="x">Primary difference</th> |
| <td class="ep">dz<br> |
| <tt>01F3</tt></td> |
| <th class="x">Primary difference</th> |
| </tr> |
| <tr> |
| <td class="s">á<br> |
| <tt>00E1</tt></td> |
| <th class="x">Secondary difference</th> |
| <td class="es">DZ<br> |
| <tt>01F1</tt></td> |
| <th class="x">Secondary difference</th> |
| </tr> |
| <tr> |
| <td class="t">A<br> |
| <tt>0041</tt></td> |
| <th class="x">Tertiary difference</th> |
| <td class="et">Dz<br> |
| <tt>01F2</tt></td> |
| <th class="x">Tertiary difference</th> |
| </tr> |
| <tr> |
| <td class="q">Å<br> |
| <tt>212B</tt></td> |
| <th class="x">Quaternary difference<br> |
| or no difference</th> |
| <td class="eq"> </td> |
| <th class="x">Quaternary difference<br> |
| or no difference</th> |
| </tr> |
| </table> |
| <blockquote> |
| <p align="left"><b>Note: </b>If tooltips are enabled in your browser, |
| pausing the mouse over any cell shows the name of the character and |
| a representation of its sort key. In this representation, the weight |
| levels are separated by "|".</p> |
| </blockquote> |
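<p>The relationship between a sort key and the cell colors can be sketched in
code. The following is a minimal illustration, not the tool that generated
these charts. It assumes that each level of the tooltip representation is a
space-separated list of hexadecimal weights; that layout, the function names,
and the weight values in the usage note are all hypothetical.</p>

```python
from typing import List, Optional

# Names of the comparison levels, strongest first.
LEVEL_NAMES = ["primary", "secondary", "tertiary", "quaternary"]


def parse_sort_key(text: str) -> List[List[int]]:
    """Split a '|'-separated sort key into per-level lists of weights.

    Assumption: weights within a level are space-separated hex numbers.
    """
    return [[int(token, 16) for token in part.split()]
            for part in text.split("|")]


def difference_strength(prev_key: str, curr_key: str) -> Optional[str]:
    """Return the first level at which two sort keys differ, or None."""
    prev_levels = parse_sort_key(prev_key)
    curr_levels = parse_sort_key(curr_key)
    depth = max(len(prev_levels), len(curr_levels))
    for i in range(depth):
        # A missing level is treated as an empty list of weights.
        a = prev_levels[i] if i < len(prev_levels) else []
        b = curr_levels[i] if i < len(curr_levels) else []
        if a != b:
            return LEVEL_NAMES[i] if i < len(LEVEL_NAMES) else f"level {i + 1}"
    return None
```

<p>With made-up weights, <tt>difference_strength("15EF | 0020 | 0002", "15EF | 0020 | 0008")</tt>
reports a tertiary difference, which corresponds to the tertiary cell color
in the legend above.</p>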
| <table> |
| <tr> |
| <th>*</th> |
| <th class="x">In some cases, the UCA data table also includes contractions.<br> |
| They can be recognized by their multiple code point numbers, as in the |
| following:</th> |
| <td class="p">ஔ<br> |
| <tt>0B92 0BD7</tt></td> |
| </tr> |
| </table> |
| <h3><b>Notes</b></h3> |
| <ul> |
| <li>The UCA results are versioned <i>both</i> by the version of the UCA <i>and</i> |
| by the version of The Unicode Standard used to process the data.</li> |
| <li>These charts show only one of the alternatives for handling variable |
| characters (such as punctuation): the one in which these characters are <b>non-ignorable.</b></li> |
| <li>Characters from large blocks, such as CJK Ideographs, Hangul Syllables, |
| Private Use Area, etc., are represented by a sampling.</li> |
| <li>Some unassigned code points, noncharacters and other edge cases are also |
| added to the list for comparison.</li> |
| <li>For more information, see <a href="http://www.unicode.org/unicode/reports/tr10/" target="_top">UTS |
| #10: Unicode Collation Algorithm</a>.</li> |
| </ul> |
| |
| </body> |
| |
| </html> |