docs/usermanual-what-is-harfbuzz.xml - third_party/harfbuzz - Git at Google

 <?xml version="1.0"?>
 <!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
                "http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
   <!ENTITY % local.common.attrib "xmlns:xi  CDATA  #FIXED 'http://www.w3.org/2003/XInclude'">
   <!ENTITY version SYSTEM "version.xml">
 ]>
 <chapter id="what-is-harfbuzz">
   <title>What is HarfBuzz?</title>
   <para>
     HarfBuzz is a <emphasis>text-shaping engine</emphasis>. If you
     give HarfBuzz a font and a string containing a sequence of Unicode
     codepoints, HarfBuzz selects and positions the corresponding
     glyphs from the font, applying all of the necessary layout rules
     and font features. HarfBuzz then returns the string to you in the
     form that is correctly arranged for the language and writing
     system.
   </para>
   <para>
     HarfBuzz can properly shape all of the world's major writing
     systems. It runs on all major operating systems and software
     platforms and it supports the major font formats in use
     today.
   </para>
   <section id="what-is-text-shaping">
     <title>What is text shaping?</title>
     <para>
       Text shaping is the process of translating a string of character
       codes (such as Unicode codepoints) into a properly arranged
       sequence of glyphs that can be rendered onto a screen or into
       final output form for inclusion in a document.
     </para>
     <para>
       The shaping process is dependent on the input string, the active
       font, the script (or writing system) that the string is in, and
       the language that the string is in.
     </para>
     <para>
       Modern software systems generally only deal with strings in the
       Unicode encoding scheme (although legacy systems and documents may
       involve other encodings).
     </para>
     <para>
       There are several font formats that a program might
       encounter, each of which has a set of standard text-shaping
       rules.
     </para>
     <para>The dominant format is <ulink
       url="http://www.microsoft.com/typography/otspec/">OpenType</ulink>. The
     OpenType specification defines a series of <ulink url="https://github.com/n8willis/opentype-shaping-documents">shaping models</ulink> for
     various scripts from around the world. These shaping models depend on
     the font incorporating certain features as
     <emphasis>lookups</emphasis> in its <literal>GSUB</literal>
     and <literal>GPOS</literal> tables.
     </para>
     <para>
       Alternatively, OpenType fonts can include shaping features for
       the <ulink url="https://graphite.sil.org/">Graphite</ulink> shaping model.
     </para>
     <para>
       TrueType fonts can also include OpenType shaping
       features. Alternatively, TrueType fonts can also include <ulink url="https://developer.apple.com/fonts/TrueType-Reference-Manual/RM09/AppendixF.html">Apple
       Advanced Typography</ulink> (AAT) tables to implement shaping
       support. AAT fonts are generally only found on macOS and iOS systems.
     </para>
     <para>
       Text strings will usually be tagged with a script and language
       tag that provide the context needed to perform text shaping
       correctly.  The necessary <ulink
       url="https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags">script</ulink>
       and <ulink
       url="https://docs.microsoft.com/en-us/typography/opentype/spec/languagetags">language</ulink>
       tags are defined by OpenType.
     </para>
   </section>

   <section id="why-do-i-need-a-shaping-engine">
     <title>Why do I need a shaping engine?</title>
     <para>
       Text shaping is an integral part of preparing text for
       display. Before a Unicode sequence can be rendered, the
       codepoints in the sequence must be mapped to the corresponding
       glyphs provided in the font, and those glyphs must be positioned
       correctly relative to each other. For many of the scripts
       supported in Unicode, these steps involve script-specific layout
       rules, including complex joining, reordering, and positioning
       behavior. Implementing these rules is the job of the shaping engine.
     </para>
     <para>
       Text shaping is a fairly low-level operation. HarfBuzz is
       used directly by text-handling libraries like <ulink
       url="https://www.pango.org/">Pango</ulink>, as well as by the layout
       engines in Firefox, LibreOffice, and Chromium. Unless you are
       <emphasis>writing</emphasis> one of these layout engines
       yourself, you will probably not need to use HarfBuzz: normally,
       a layout engine, toolkit, or other library will turn text into
       glyphs for you.
     </para>
     <para>
       However, if you <emphasis>are</emphasis> writing a layout engine
       or graphics library yourself, then you will need to perform text
       shaping, and this is where HarfBuzz can help you.
     </para>
     <para>
       Here are some specific scenarios where a text-shaping engine
       like HarfBuzz helps you:
     </para>
     <itemizedlist>
       <listitem>
         <para>
           OpenType fonts contain a set of glyphs (that is, shapes
 	  to represent the letters, numbers, punctuation marks, and
 	  all other symbols), which are indexed by a <literal>glyph ID</literal>.
 	</para>
 	<para>
           A particular glyph ID within the font does not necessarily
 	  correlate to a predictable Unicode codepoint. For instance,
 	  some fonts have the letter &quot;a&quot; as glyph ID 1, but
 	  many others do not. In order to retrieve the right glyph
 	  from the font to display &quot;a&quot;, you need to consult
 	  the table inside the font (the <literal>cmap</literal>
 	  table) that maps Unicode codepoints to glyph IDs. In other
 	  words, <emphasis>text shaping turns codepoints into glyph
 	  IDs</emphasis>.
         </para>
       </listitem>
       <listitem>
         <para>
           Many OpenType fonts contain ligatures: combinations of
           characters that are rendered as a single unit. For instance,
 	  it is common for the &quot;f, i&quot; letter
 	  sequence to appear in print as the single ligature glyph
 	  &quot;ﬁ&quot;.
 	</para>
 	<para>
 	  Whether you should render an &quot;f, i&quot; sequence
 	  as <literal>fi</literal> or as &quot;ﬁ&quot; does not
           depend on the input text. Instead, it depends on the whether
 	  or not the font includes an &quot;ﬁ&quot; glyph and on the
 	  level of ligature application you wish to perform. The font
 	  and the amount of ligature application used are under your
 	  control. In other words, <emphasis>text shaping involves
 	  querying the font's ligature tables and determining what
 	  substitutions should be made</emphasis>.
         </para>
       </listitem>
       <listitem>
         <para>
           While ligatures like &quot;ﬁ&quot; are optional typographic
           refinements, some languages <emphasis>require</emphasis> certain
           substitutions to be made in order to display text correctly.
         </para>
 	<para>
 	  For example, in Tamil, when the letter &quot;TTA&quot; (ட)
 	  letter is followed by &quot;U&quot; (உ), the pair
 	  must be replaced by the single glyph &quot;டு&quot;. The
 	  sequence of Unicode characters &quot;டஉ&quot; needs to be
 	  substituted with a single &quot;டு&quot; glyph from the
 	  font.
 	</para>
 	<para>
 	  But &quot;டு&quot; does not have a Unicode codepoint. To
 	  find this glyph, you need to consult the table inside
 	  the font (the <literal>GSUB</literal> table) that contains
 	  substitution information. In other words, <emphasis>text shaping
 	  chooses the correct glyph for a sequence of characters
 	  provided</emphasis>.
         </para>
       </listitem>
       <listitem>
         <para>
           Similarly, each Arabic character has four different variants
 	  corresponding to the different positions it might appear in
 	  within a sequence. Inside a font, there will be separate
 	  glyphs for the initial, medial, final, and isolated forms of
 	  each letter, each at a different glyph ID.
 	</para>
 	<para>
 	  Unicode only assigns one codepoint per character, so a
 	  Unicode string will not tell you which glyph variant to use
 	  for each character. To decide, you need to analyze the whole
 	  string and determine the appropriate glyph for each character
 	  based on its position. In other words, <emphasis>text
 	  shaping chooses the correct form of the letter by its
 	  position and returns the correct glyph from the font</emphasis>.
         </para>
       </listitem>
       <listitem>
         <para>
           Other languages involve marks and accents that need to be
           rendered in specific positions relative a base character. For
           instance, the Moldovan language includes the Cyrillic letter
           &quot;zhe&quot; (ж) with a breve accent, like so: &quot;ӂ&quot;.
 	</para>
 	<para>
 	  Some fonts will provide this character as a single
 	  zhe-with-breve glyph, but other fonts will not and, instead,
 	  will expect the rendering engine to form the character by
           superimposing the separate &quot;ж&quot; and &quot;˘&quot;
 	  glyphs.
 	</para>
 	<para>
 	  But exactly where you should draw the breve depends on the
 	  height and width of the preceding zhe glyph. To find the
 	  right position, you need to consult the table inside
 	  the font (the <literal>GPOS</literal> table) that contains
 	  positioning information.
           In other words, <emphasis>text shaping tells you whether you
 	  have a precomposed glyph within your font or if you need to
 	  compose a glyph yourself out of combining marks&mdash;and,
 	  if so, where to position those marks.</emphasis>
         </para>
       </listitem>
     </itemizedlist>
     <para>
       If tasks like these are something that you need to do, then you
       need a text shaping engine. You could use Uniscribe if you are
       writing Windows software; you could use CoreText on macOS; or
       you could use HarfBuzz.
     </para>
     <note>
       <para>
 	In the rest of this manual, the text will assume that the reader
 	is that implementor of a text-layout engine.
       </para>
     </note>
   </section>


   <section>
     <title>What does HarfBuzz do?</title>
     <para>
       HarfBuzz provides text shaping through a cross-platform
       C API that accepts sequences of Unicode codepoints as input. Currently,
       the following OpenType shaping models are supported:
     </para>
     <itemizedlist>
       <listitem>
 	<para>
 	  Indic (covering Devanagari, Bengali, Gujarati,
 	  Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, and
 	  Sinhala)
 	</para>
       </listitem>
       <listitem>
 	<para>
 	  Arabic (covering Arabic, N'Ko, Syriac, and Mongolian)
 	</para>
       </listitem>
       <listitem>
 	<para>
 	  Thai and Lao
 	</para>
       </listitem>
       <listitem>
 	<para>
 	  Khmer
 	</para>
       </listitem>
       <listitem>
 	<para>
 	  Myanmar
 	</para>
       </listitem>

       <listitem>
 	<para>
 	  Tibetan
 	</para>
       </listitem>

       <listitem>
 	<para>
 	  Hangul
 	</para>
       </listitem>

       <listitem>
 	<para>
 	  Hebrew
 	</para>
       </listitem>
       <listitem>
 	<para>
 	  The Universal Shaping Engine or <emphasis>USE</emphasis>
 	  (covering complex scripts not covered by the above shaping
 	  models)
 	</para>
       </listitem>
       <listitem>
 	<para>
 	  A default shaping model for non-complex scripts
 	  (covering Latin, Cyrillic, Greek, Armenian, Georgian, Tifinagh,
 	  and many others)
 	</para>
       </listitem>
       <listitem>
 	<para>
 	  Emoji (including emoji modifier sequences, flag sequences,
 	  and ZWJ sequences)
 	</para>
       </listitem>
     </itemizedlist>

     <para>
       In addition to OpenType shaping, HarfBuzz supports the latest
       version of Graphite shaping (the "Graphite 2" model) and AAT
       shaping.
     </para>

     <para>
       HarfBuzz can read and understand TrueType fonts (.ttf), TrueType
       collections (.ttc), and OpenType fonts (.otf, including those
       fonts that contain TrueType-style outlines and those that
       contain PostScript CFF or CFF2 outlines).
     </para>

     <para>
       HarfBuzz is designed and tested to run on top of the FreeType
       font renderer. It can run on Linux, Android, Windows, macOS, and
       iOS systems.
     </para>

     <para>
       In addition to its core shaping functionality, HarfBuzz provides
       functions for accessing other font features, including optional
       GSUB and GPOS OpenType features, as well as
       all color-font formats (<literal>CBDT</literal>,
       <literal>sbix</literal>, <literal>COLR/CPAL</literal>, and
       <literal>SVG-OT</literal>) and OpenType variable fonts. HarfBuzz
       also includes a font-subsetting feature. HarfBuzz can perform
       some low-level math-shaping operations, although it does not
       currently perform full shaping for mathematical typesetting.
     </para>

     <para>
       A suite of command-line utilities is also provided in the
       source-code tree, designed to help users test and debug
       HarfBuzz's features on real-world fonts and input.
     </para>
   </section>

   <section id="what-harfbuzz-doesnt-do">
     <title>What HarfBuzz doesn't do</title>
     <para>
       HarfBuzz will take a Unicode string, shape it, and give you the
       information required to lay it out correctly on a single
       horizontal (or vertical) line using the font provided. That is the
       extent of HarfBuzz's responsibility.
     </para>
     <para>
       It is important to note that if you are implementing a complete
       text-layout engine you may have other responsibilities that
       HarfBuzz will <emphasis>not</emphasis> help you with. For example:
     </para>
     <itemizedlist>
       <listitem>
         <para>
           HarfBuzz won't help you with bidirectionality. If you want to
           lay out text that includes a mix of Hebrew and English, you
 	  will need to ensure that each buffer provided to HarfBuzz
 	  has all of its characters in the same order and that the
 	  directionality of the buffer is set correctly. This may mean
 	  segmenting the text before it is placed into HarfBuzz buffers. In
           other words, the user will hit the keys in the following
           sequence:
         </para>
         <programlisting>
 	  A B C [space] ג ב א [space] D E F
         </programlisting>
         <para>
           but will expect to see in the output:
         </para>
         <programlisting>
 	  ABC אבג DEF
         </programlisting>
         <para>
           This reordering is called <emphasis>bidi processing</emphasis>
           (&quot;bidi&quot; is short for bidirectional), and there's an
           algorithm as an annex to the Unicode Standard which tells you how
           to process a string of mixed directionality.
           Before sending your string to HarfBuzz, you may need to apply the
           bidi algorithm to it. Libraries such as <ulink
 	  url="http://icu-project.org/">ICU</ulink> and <ulink
 	  url="http://fribidi.org/">fribidi</ulink> can do this for you.
         </para>
       </listitem>
       <listitem>
         <para>
           HarfBuzz won't help you with text that contains different font
           properties. For instance, if you have the string &quot;a
           <emphasis>huge</emphasis> breakfast&quot;, and you expect
           &quot;huge&quot; to be italic, then you will need to send three
           strings to HarfBuzz: <literal>a</literal>, in your Roman font;
           <literal>huge</literal> using your italic font; and
           <literal>breakfast</literal> using your Roman font again.
 	</para>
 	<para>
           Similarly, if you change the font, font size, script,
 	  language, or direction within your string, then you will
 	  need to shape each run independently and output them
 	  independently. HarfBuzz expects to shape a run of characters
 	  that all share the same properties.
         </para>
       </listitem>
       <listitem>
         <para>
           HarfBuzz won't help you with line breaking, hyphenation, or
           justification. As mentioned above, HarfBuzz lays out the string
           along a <emphasis>single line</emphasis> of, notionally,
           infinite length. If you want to find out where the potential
           word, sentence and line break points are in your text, you
           could use the ICU library's break iterator functions.
         </para>
         <para>
           HarfBuzz can tell you how wide a shaped piece of text is, which is
           useful input to a justification algorithm, but it knows nothing
           about paragraphs, lines or line lengths. Nor will it adjust the
           space between words to fit them proportionally into a line.
         </para>
       </listitem>
     </itemizedlist>
     <para>
       As a layout-engine implementor, HarfBuzz will help you with the
       interface between your text and your font, and that's something
       that you'll need&mdash;what you then do with the glyphs that your font
       returns is up to you.
     </para>
   </section>

   <section id="why-is-it-called-harfbuzz">
     <title>Why is it called HarfBuzz?</title>
     <para>
       HarfBuzz began its life as text-shaping code within the FreeType
       project (and you will see references to the FreeType authors
       within the source code copyright declarations), but was then
       extracted out to its own project. This project is maintained by
       Behdad Esfahbod, who named it HarfBuzz. Originally, it was a
       shaping engine for OpenType fonts&mdash;&quot;HarfBuzz&quot; is
       the Persian for &quot;open type&quot;.
     </para>
   </section>
 </chapter>
	<?xml version="1.0"?>
	<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.3//EN"
	"http://www.oasis-open.org/docbook/xml/4.3/docbookx.dtd" [
	<!ENTITY % local.common.attrib "xmlns:xi CDATA #FIXED 'http://www.w3.org/2003/XInclude'">
	<!ENTITY version SYSTEM "version.xml">
	]>
	<chapter id="what-is-harfbuzz">
	<title>What is HarfBuzz?</title>
	<para>
	HarfBuzz is a <emphasis>text-shaping engine</emphasis>. If you
	give HarfBuzz a font and a string containing a sequence of Unicode
	codepoints, HarfBuzz selects and positions the corresponding
	glyphs from the font, applying all of the necessary layout rules
	and font features. HarfBuzz then returns the string to you in the
	form that is correctly arranged for the language and writing
	system.
	</para>
	<para>
	HarfBuzz can properly shape all of the world's major writing
	systems. It runs on all major operating systems and software
	platforms and it supports the major font formats in use
	today.
	</para>
	<section id="what-is-text-shaping">
	<title>What is text shaping?</title>
	<para>
	Text shaping is the process of translating a string of character
	codes (such as Unicode codepoints) into a properly arranged
	sequence of glyphs that can be rendered onto a screen or into
	final output form for inclusion in a document.
	</para>
	<para>
	The shaping process is dependent on the input string, the active
	font, the script (or writing system) that the string is in, and
	the language that the string is in.
	</para>
	<para>
	Modern software systems generally only deal with strings in the
	Unicode encoding scheme (although legacy systems and documents may
	involve other encodings).
	</para>
	<para>
	There are several font formats that a program might
	encounter, each of which has a set of standard text-shaping
	rules.
	</para>
	<para>The dominant format is <ulink
	url="http://www.microsoft.com/typography/otspec/">OpenType</ulink>. The
	OpenType specification defines a series of <ulink url="https://github.com/n8willis/opentype-shaping-documents">shaping models</ulink> for
	various scripts from around the world. These shaping models depend on
	the font incorporating certain features as
	<emphasis>lookups</emphasis> in its <literal>GSUB</literal>
	and <literal>GPOS</literal> tables.
	</para>
	<para>
	Alternatively, OpenType fonts can include shaping features for
	the <ulink url="https://graphite.sil.org/">Graphite</ulink> shaping model.
	</para>
	<para>
	TrueType fonts can also include OpenType shaping
	features. Alternatively, TrueType fonts can also include <ulink url="https://developer.apple.com/fonts/TrueType-Reference-Manual/RM09/AppendixF.html">Apple
	Advanced Typography</ulink> (AAT) tables to implement shaping
	support. AAT fonts are generally only found on macOS and iOS systems.
	</para>
	<para>
	Text strings will usually be tagged with a script and language
	tag that provide the context needed to perform text shaping
	correctly. The necessary <ulink
	url="https://docs.microsoft.com/en-us/typography/opentype/spec/scripttags">script</ulink>
	and <ulink
	url="https://docs.microsoft.com/en-us/typography/opentype/spec/languagetags">language</ulink>
	tags are defined by OpenType.
	</para>
	</section>

	<section id="why-do-i-need-a-shaping-engine">
	<title>Why do I need a shaping engine?</title>
	<para>
	Text shaping is an integral part of preparing text for
	display. Before a Unicode sequence can be rendered, the
	codepoints in the sequence must be mapped to the corresponding
	glyphs provided in the font, and those glyphs must be positioned
	correctly relative to each other. For many of the scripts
	supported in Unicode, these steps involve script-specific layout
	rules, including complex joining, reordering, and positioning
	behavior. Implementing these rules is the job of the shaping engine.
	</para>
	<para>
	Text shaping is a fairly low-level operation. HarfBuzz is
	used directly by text-handling libraries like <ulink
	url="https://www.pango.org/">Pango</ulink>, as well as by the layout
	engines in Firefox, LibreOffice, and Chromium. Unless you are
	<emphasis>writing</emphasis> one of these layout engines
	yourself, you will probably not need to use HarfBuzz: normally,
	a layout engine, toolkit, or other library will turn text into
	glyphs for you.
	</para>
	<para>
	However, if you <emphasis>are</emphasis> writing a layout engine
	or graphics library yourself, then you will need to perform text
	shaping, and this is where HarfBuzz can help you.
	</para>
	<para>
	Here are some specific scenarios where a text-shaping engine
	like HarfBuzz helps you:
	</para>
	<itemizedlist>
	<listitem>
	<para>
	OpenType fonts contain a set of glyphs (that is, shapes
	to represent the letters, numbers, punctuation marks, and
	all other symbols), which are indexed by a <literal>glyph ID</literal>.
	</para>
	<para>
	A particular glyph ID within the font does not necessarily
	correlate to a predictable Unicode codepoint. For instance,
	some fonts have the letter "a" as glyph ID 1, but
	many others do not. In order to retrieve the right glyph
	from the font to display "a", you need to consult
	the table inside the font (the <literal>cmap</literal>
	table) that maps Unicode codepoints to glyph IDs. In other
	words, <emphasis>text shaping turns codepoints into glyph
	IDs</emphasis>.
	</para>
	</listitem>
	<listitem>
	<para>
	Many OpenType fonts contain ligatures: combinations of
	characters that are rendered as a single unit. For instance,
	it is common for the "f, i" letter
	sequence to appear in print as the single ligature glyph
	"ﬁ".
	</para>
	<para>
	Whether you should render an "f, i" sequence
	as <literal>fi</literal> or as "ﬁ" does not
	depend on the input text. Instead, it depends on the whether
	or not the font includes an "ﬁ" glyph and on the
	level of ligature application you wish to perform. The font
	and the amount of ligature application used are under your
	control. In other words, <emphasis>text shaping involves
	querying the font's ligature tables and determining what
	substitutions should be made</emphasis>.
	</para>
	</listitem>
	<listitem>
	<para>
	While ligatures like "ﬁ" are optional typographic
	refinements, some languages <emphasis>require</emphasis> certain
	substitutions to be made in order to display text correctly.
	</para>
	<para>
	For example, in Tamil, when the letter "TTA" (ட)
	letter is followed by "U" (உ), the pair
	must be replaced by the single glyph "டு". The
	sequence of Unicode characters "டஉ" needs to be
	substituted with a single "டு" glyph from the
	font.
	</para>
	<para>
	But "டு" does not have a Unicode codepoint. To
	find this glyph, you need to consult the table inside
	the font (the <literal>GSUB</literal> table) that contains
	substitution information. In other words, <emphasis>text shaping
	chooses the correct glyph for a sequence of characters
	provided</emphasis>.
	</para>
	</listitem>
	<listitem>
	<para>
	Similarly, each Arabic character has four different variants
	corresponding to the different positions it might appear in
	within a sequence. Inside a font, there will be separate
	glyphs for the initial, medial, final, and isolated forms of
	each letter, each at a different glyph ID.
	</para>
	<para>
	Unicode only assigns one codepoint per character, so a
	Unicode string will not tell you which glyph variant to use
	for each character. To decide, you need to analyze the whole
	string and determine the appropriate glyph for each character
	based on its position. In other words, <emphasis>text
	shaping chooses the correct form of the letter by its
	position and returns the correct glyph from the font</emphasis>.
	</para>
	</listitem>
	<listitem>
	<para>
	Other languages involve marks and accents that need to be
	rendered in specific positions relative a base character. For
	instance, the Moldovan language includes the Cyrillic letter
	"zhe" (ж) with a breve accent, like so: "ӂ".
	</para>
	<para>
	Some fonts will provide this character as a single
	zhe-with-breve glyph, but other fonts will not and, instead,
	will expect the rendering engine to form the character by
	superimposing the separate "ж" and "˘"
	glyphs.
	</para>
	<para>
	But exactly where you should draw the breve depends on the
	height and width of the preceding zhe glyph. To find the
	right position, you need to consult the table inside
	the font (the <literal>GPOS</literal> table) that contains
	positioning information.
	In other words, <emphasis>text shaping tells you whether you
	have a precomposed glyph within your font or if you need to
	compose a glyph yourself out of combining marks—and,
	if so, where to position those marks.</emphasis>
	</para>
	</listitem>
	</itemizedlist>
	<para>
	If tasks like these are something that you need to do, then you
	need a text shaping engine. You could use Uniscribe if you are
	writing Windows software; you could use CoreText on macOS; or
	you could use HarfBuzz.
	</para>
	<note>
	<para>
	In the rest of this manual, the text will assume that the reader
	is that implementor of a text-layout engine.
	</para>
	</note>
	</section>


	<section>
	<title>What does HarfBuzz do?</title>
	<para>
	HarfBuzz provides text shaping through a cross-platform
	C API that accepts sequences of Unicode codepoints as input. Currently,
	the following OpenType shaping models are supported:
	</para>
	<itemizedlist>
	<listitem>
	<para>
	Indic (covering Devanagari, Bengali, Gujarati,
	Gurmukhi, Kannada, Malayalam, Oriya, Tamil, Telugu, and
	Sinhala)
	</para>
	</listitem>
	<listitem>
	<para>
	Arabic (covering Arabic, N'Ko, Syriac, and Mongolian)
	</para>
	</listitem>
	<listitem>
	<para>
	Thai and Lao
	</para>
	</listitem>
	<listitem>
	<para>
	Khmer
	</para>
	</listitem>
	<listitem>
	<para>
	Myanmar
	</para>
	</listitem>

	<listitem>
	<para>
	Tibetan
	</para>
	</listitem>

	<listitem>
	<para>
	Hangul
	</para>
	</listitem>

	<listitem>
	<para>
	Hebrew
	</para>
	</listitem>
	<listitem>
	<para>
	The Universal Shaping Engine or <emphasis>USE</emphasis>
	(covering complex scripts not covered by the above shaping
	models)
	</para>
	</listitem>
	<listitem>
	<para>
	A default shaping model for non-complex scripts
	(covering Latin, Cyrillic, Greek, Armenian, Georgian, Tifinagh,
	and many others)
	</para>
	</listitem>
	<listitem>
	<para>
	Emoji (including emoji modifier sequences, flag sequences,
	and ZWJ sequences)
	</para>
	</listitem>
	</itemizedlist>

	<para>
	In addition to OpenType shaping, HarfBuzz supports the latest
	version of Graphite shaping (the "Graphite 2" model) and AAT
	shaping.
	</para>

	<para>
	HarfBuzz can read and understand TrueType fonts (.ttf), TrueType
	collections (.ttc), and OpenType fonts (.otf, including those
	fonts that contain TrueType-style outlines and those that
	contain PostScript CFF or CFF2 outlines).
	</para>

	<para>
	HarfBuzz is designed and tested to run on top of the FreeType
	font renderer. It can run on Linux, Android, Windows, macOS, and
	iOS systems.
	</para>

	<para>
	In addition to its core shaping functionality, HarfBuzz provides
	functions for accessing other font features, including optional
	GSUB and GPOS OpenType features, as well as
	all color-font formats (<literal>CBDT</literal>,
	<literal>sbix</literal>, <literal>COLR/CPAL</literal>, and
	<literal>SVG-OT</literal>) and OpenType variable fonts. HarfBuzz
	also includes a font-subsetting feature. HarfBuzz can perform
	some low-level math-shaping operations, although it does not
	currently perform full shaping for mathematical typesetting.
	</para>

	<para>
	A suite of command-line utilities is also provided in the
	source-code tree, designed to help users test and debug
	HarfBuzz's features on real-world fonts and input.
	</para>
	</section>

	<section id="what-harfbuzz-doesnt-do">
	<title>What HarfBuzz doesn't do</title>
	<para>
	HarfBuzz will take a Unicode string, shape it, and give you the
	information required to lay it out correctly on a single
	horizontal (or vertical) line using the font provided. That is the
	extent of HarfBuzz's responsibility.
	</para>
	<para>
	It is important to note that if you are implementing a complete
	text-layout engine you may have other responsibilities that
	HarfBuzz will <emphasis>not</emphasis> help you with. For example:
	</para>
	<itemizedlist>
	<listitem>
	<para>
	HarfBuzz won't help you with bidirectionality. If you want to
	lay out text that includes a mix of Hebrew and English, you
	will need to ensure that each buffer provided to HarfBuzz
	has all of its characters in the same order and that the
	directionality of the buffer is set correctly. This may mean
	segmenting the text before it is placed into HarfBuzz buffers. In
	other words, the user will hit the keys in the following
	sequence:
	</para>
	<programlisting>
	A B C [space] ג ב א [space] D E F
	</programlisting>
	<para>
	but will expect to see in the output:
	</para>
	<programlisting>
	ABC אבג DEF
	</programlisting>
	<para>
	This reordering is called <emphasis>bidi processing</emphasis>
	("bidi" is short for bidirectional), and there's an
	algorithm as an annex to the Unicode Standard which tells you how
	to process a string of mixed directionality.
	Before sending your string to HarfBuzz, you may need to apply the
	bidi algorithm to it. Libraries such as <ulink
	url="http://icu-project.org/">ICU</ulink> and <ulink
	url="http://fribidi.org/">fribidi</ulink> can do this for you.
	</para>
	</listitem>
	<listitem>
	<para>
	HarfBuzz won't help you with text that contains different font
	properties. For instance, if you have the string "a
	<emphasis>huge</emphasis> breakfast", and you expect
	"huge" to be italic, then you will need to send three
	strings to HarfBuzz: <literal>a</literal>, in your Roman font;
	<literal>huge</literal> using your italic font; and
	<literal>breakfast</literal> using your Roman font again.
	</para>
	<para>
	Similarly, if you change the font, font size, script,
	language, or direction within your string, then you will
	need to shape each run independently and output them
	independently. HarfBuzz expects to shape a run of characters
	that all share the same properties.
	</para>
	</listitem>
	<listitem>
	<para>
	HarfBuzz won't help you with line breaking, hyphenation, or
	justification. As mentioned above, HarfBuzz lays out the string
	along a <emphasis>single line</emphasis> of, notionally,
	infinite length. If you want to find out where the potential
	word, sentence and line break points are in your text, you
	could use the ICU library's break iterator functions.
	</para>
	<para>
	HarfBuzz can tell you how wide a shaped piece of text is, which is
	useful input to a justification algorithm, but it knows nothing
	about paragraphs, lines or line lengths. Nor will it adjust the
	space between words to fit them proportionally into a line.
	</para>
	</listitem>
	</itemizedlist>
	<para>
	As a layout-engine implementor, HarfBuzz will help you with the
	interface between your text and your font, and that's something
	that you'll need—what you then do with the glyphs that your font
	returns is up to you.
	</para>
	</section>

	<section id="why-is-it-called-harfbuzz">
	<title>Why is it called HarfBuzz?</title>
	<para>
	HarfBuzz began its life as text-shaping code within the FreeType
	project (and you will see references to the FreeType authors
	within the source code copyright declarations), but was then
	extracted out to its own project. This project is maintained by
	Behdad Esfahbod, who named it HarfBuzz. Originally, it was a
	shaping engine for OpenType fonts—"HarfBuzz" is
	the Persian for "open type".
	</para>
	</section>
	</chapter>