blob: 5c59d981d82719e3c9ad001e97b5d674546ef73b [file] [log] [blame]
<!-- HTML header for doxygen 1.8.7-->
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=UTF-8"/>
<meta http-equiv="X-UA-Compatible" content="IE=9"/>
<meta name="generator" content="Doxygen 1.8.16"/>
<title>RapidJSON: Internals</title>
<link href="tabs.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="dynsections.js"></script>
<link href="navtree.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="resize.js"></script>
<script type="text/javascript" src="navtreedata.js"></script>
<script type="text/javascript" src="navtree.js"></script>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
$(document).ready(initResizable);
/* @license-end */</script>
<link href="search/search.css" rel="stylesheet" type="text/css"/>
<script type="text/javascript" src="search/searchdata.js"></script>
<script type="text/javascript" src="search/search.js"></script>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
$(document).ready(function() { init_search(); });
/* @license-end */
</script>
<link href="doxygen.css" rel="stylesheet" type="text/css" />
<link href="doxygenextra.css" rel="stylesheet" type="text/css"/>
</head>
<body>
<div id="top"><!-- do not remove this div, it is closed by doxygen! -->
<div id="topbanner"><a href="https://github.com/Tencent/rapidjson" title="RapidJSON GitHub"><i class="githublogo"></i></a></div>
<div id="MSearchBox" class="MSearchBoxInactive">
<span class="left">
<img id="MSearchSelect" src="search/mag_sel.png"
onmouseover="return searchBox.OnSearchSelectShow()"
onmouseout="return searchBox.OnSearchSelectHide()"
alt=""/>
<input type="text" id="MSearchField" value="Search" accesskey="S"
onfocus="searchBox.OnSearchFieldFocus(true)"
onblur="searchBox.OnSearchFieldFocus(false)"
onkeyup="searchBox.OnSearchFieldChange(event)"/>
</span><span class="right">
<a id="MSearchClose" href="javascript:searchBox.CloseResultsWindow()"><img id="MSearchCloseImg" border="0" src="search/close.png" alt=""/></a>
</span>
</div>
<!-- end header part -->
<!-- Generated by Doxygen 1.8.16 -->
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
var searchBox = new SearchBox("searchBox", "search",false,'Search');
/* @license-end */
</script>
</div><!-- top -->
<div id="side-nav" class="ui-resizable side-nav-resizable">
<div id="nav-tree">
<div id="nav-tree-contents">
<div id="nav-sync" class="sync"></div>
</div>
</div>
<div id="splitbar" style="-moz-user-select:none;"
class="ui-resizable-handle">
</div>
</div>
<script type="text/javascript">
/* @license magnet:?xt=urn:btih:cf05388f2679ee054f2beb29a391d25f4e673ac3&amp;dn=gpl-2.0.txt GPL-v2 */
$(document).ready(function(){initNavTree('md_doc_internals.html','');});
/* @license-end */
</script>
<div id="doc-content">
<!-- window showing the filter options -->
<div id="MSearchSelectWindow"
onmouseover="return searchBox.OnSearchSelectShow()"
onmouseout="return searchBox.OnSearchSelectHide()"
onkeydown="return searchBox.OnSearchSelectKey(event)">
</div>
<!-- iframe showing the search results (closed by default) -->
<div id="MSearchResultsWindow">
<iframe src="javascript:void(0)" frameborder="0"
name="MSearchResults" id="MSearchResults">
</iframe>
</div>
<div class="PageDoc"><div class="header">
<div class="headertitle">
<div class="title">Internals </div> </div>
</div><!--header-->
<div class="contents">
<div class="toc"><h3>Table of Contents</h3>
<ul><li class="level1"><a href="#Architecture">Architecture</a><ul><li class="level2"><a href="#autotoc_md64">SAX and DOM</a></li>
<li class="level2"><a href="#autotoc_md65">Utility Classes</a></li>
</ul>
</li>
<li class="level1"><a href="#Value">Value</a><ul><li class="level2"><a href="#DataLayout">Data Layout</a></li>
<li class="level2"><a href="#Flags">Flags</a></li>
<li class="level2"><a href="#ShortString">Short-String Optimization</a></li>
</ul>
</li>
<li class="level1"><a href="#InternalAllocator">Allocator</a><ul><li class="level2"><a href="#MemoryPoolAllocator">MemoryPoolAllocator</a></li>
</ul>
</li>
<li class="level1"><a href="#ParsingOptimization">Parsing Optimization</a><ul><li class="level2"><a href="#SkipwhitespaceWithSIMD">Skip Whitespaces with SIMD</a><ul><li class="level3"><a href="#autotoc_md66">Page boundary issue</a></li>
</ul>
</li>
<li class="level2"><a href="#LocalStreamCopy">Local Stream Copy</a></li>
<li class="level2"><a href="#ParsingDouble">Parsing to Double</a></li>
</ul>
</li>
<li class="level1"><a href="#GenerationOptimization">Generation Optimization</a><ul><li class="level2"><a href="#itoa">Integer-to-String conversion</a></li>
<li class="level2"><a href="#dtoa">Double-to-String conversion</a></li>
</ul>
</li>
<li class="level1"><a href="#Parser">Parser</a><ul><li class="level2"><a href="#IterativeParser">Iterative Parser</a><ul><li class="level3"><a href="#IterativeParserGrammar">Grammar</a></li>
<li class="level3"><a href="#IterativeParserParsingTable">Parsing Table</a></li>
<li class="level3"><a href="#IterativeParserImplementation">Implementation</a></li>
</ul>
</li>
</ul>
</li>
</ul>
</div>
<div class="textblock"><p>This section records some design and implementation details.</p>
<h1><a class="anchor" id="Architecture"></a>
Architecture</h1>
<h2><a class="anchor" id="autotoc_md64"></a>
SAX and DOM</h2>
<p>The basic relationships of SAX and DOM is shown in the following UML diagram.</p>
<div class="image">
<img src="architecture.png" alt=""/>
<div class="caption">
Architecture UML class diagram</div></div>
<p>The core of the relationship is the <code>Handler</code> concept. From the SAX side, <code>Reader</code> parses a JSON from a stream and publish events to a <code>Handler</code>. <code>Writer</code> implements the <code>Handler</code> concept to handle the same set of events. From the DOM side, <code>Document</code> implements the <code>Handler</code> concept to build a DOM according to the events. <code>Value</code> supports a <code>Value::Accept(Handler&amp;)</code> function, which traverses the DOM to publish events.</p>
<p>With this design, SAX is not dependent on DOM. Even <code>Reader</code> and <code>Writer</code> have no dependencies between them. This provides flexibility to chain event publisher and handlers. Besides, <code>Value</code> does not depends on SAX as well. So, in addition to stringify a DOM to JSON, user may also stringify it to a XML writer, or do anything else.</p>
<h2><a class="anchor" id="autotoc_md65"></a>
Utility Classes</h2>
<p>Both SAX and DOM APIs depends on 3 additional concepts: <code>Allocator</code>, <code>Encoding</code> and <code>Stream</code>. Their inheritance hierarchy is shown as below.</p>
<div class="image">
<img src="utilityclass.png" alt=""/>
<div class="caption">
Utility classes UML class diagram</div></div>
<h1><a class="anchor" id="Value"></a>
Value</h1>
<p><code>Value</code> (actually a typedef of <code>GenericValue&lt;UTF8&lt;&gt;&gt;</code>) is the core of DOM API. This section describes the design of it.</p>
<h2><a class="anchor" id="DataLayout"></a>
Data Layout</h2>
<p><code>Value</code> is a <a href="http://en.wikipedia.org/wiki/Variant_type">variant type</a>. In RapidJSON's context, an instance of <code>Value</code> can contain 1 of 6 JSON value types. This is possible by using <code>union</code>. Each <code>Value</code> contains two members: <code>union Data data_</code> and a<code>unsigned flags_</code>. The <code>flags_</code> indicates the JSON type, and also additional information.</p>
<p>The following tables show the data layout of each type. The 32-bit/64-bit columns indicates the size of the field in bytes.</p>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Null </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kNullType kNullFlag</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Bool </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kBoolType</code> (either <code>kTrueFlag</code> or <code>kFalseFlag</code>) </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">String </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>Ch* str</code> </td><td class="markdownTableBodyNone">Pointer to the string (may own) </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>SizeType length</code> </td><td class="markdownTableBodyNone">Length of string </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kStringType kStringFlag ...</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Object </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>Member* members</code> </td><td class="markdownTableBodyNone">Pointer to array of members (owned) </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>SizeType size</code> </td><td class="markdownTableBodyNone">Number of members </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>SizeType capacity</code> </td><td class="markdownTableBodyNone">Capacity of members </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kObjectType kObjectFlag</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Array </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>Value* values</code> </td><td class="markdownTableBodyNone">Pointer to array of values (owned) </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>SizeType size</code> </td><td class="markdownTableBodyNone">Number of values </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>SizeType capacity</code> </td><td class="markdownTableBodyNone">Capacity of values </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kArrayType kArrayFlag</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Number (Int) </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>int i</code> </td><td class="markdownTableBodyNone">32-bit signed integer </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone">(zero padding) </td><td class="markdownTableBodyNone">0 </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kNumberType kNumberFlag kIntFlag kInt64Flag ...</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Number (UInt) </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>unsigned u</code> </td><td class="markdownTableBodyNone">32-bit unsigned integer </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone">(zero padding) </td><td class="markdownTableBodyNone">0 </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kNumberType kNumberFlag kUintFlag kUint64Flag ...</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Number (Int64) </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>int64_t i64</code> </td><td class="markdownTableBodyNone">64-bit signed integer </td><td class="markdownTableBodyCenter">8 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kNumberType kNumberFlag kInt64Flag ...</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Number (Uint64) </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>uint64_t i64</code> </td><td class="markdownTableBodyNone">64-bit unsigned integer </td><td class="markdownTableBodyCenter">8 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kNumberType kNumberFlag kInt64Flag ...</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">Number (Double) </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>uint64_t i64</code> </td><td class="markdownTableBodyNone">Double precision floating-point </td><td class="markdownTableBodyCenter">8 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone">(unused) </td><td class="markdownTableBodyNone"></td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">8 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kNumberType kNumberFlag kDoubleFlag</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<p>Here are some notes:</p><ul>
<li>To reduce memory consumption for 64-bit architecture, <code>SizeType</code> is typedef as <code>unsigned</code> instead of <code>size_t</code>.</li>
<li>Zero padding for 32-bit number may be placed after or before the actual type, according to the endianness. This makes possible for interpreting a 32-bit integer as a 64-bit integer, without any conversion.</li>
<li>An <code>Int</code> is always an <code>Int64</code>, but the converse is not always true.</li>
</ul>
<h2><a class="anchor" id="Flags"></a>
Flags</h2>
<p>The 32-bit <code>flags_</code> contains both JSON type and other additional information. As shown in the above tables, each JSON type contains redundant <code>kXXXType</code> and <code>kXXXFlag</code>. This design is for optimizing the operation of testing bit-flags (<code>IsNumber()</code>) and obtaining a sequential number for each type (<code>GetType()</code>).</p>
<p>String has two optional flags. <code>kCopyFlag</code> means that the string owns a copy of the string. <code>kInlineStrFlag</code> means using <a href="#ShortString">Short-String Optimization</a>.</p>
<p>Number is a bit more complicated. For normal integer values, it can contains <code>kIntFlag</code>, <code>kUintFlag</code>, <code>kInt64Flag</code> and/or <code>kUint64Flag</code>, according to the range of the integer. For numbers with fraction, and integers larger than 64-bit range, they will be stored as <code>double</code> with <code>kDoubleFlag</code>.</p>
<h2><a class="anchor" id="ShortString"></a>
Short-String Optimization</h2>
<p><a href="https://github.com/Kosta-Github">Kosta</a> provided a very neat short-string optimization. The optimization idea is given as follow. Excluding the <code>flags_</code>, a <code>Value</code> has 12 or 16 bytes (32-bit or 64-bit) for storing actual data. Instead of storing a pointer to a string, it is possible to store short strings in these space internally. For encoding with 1-byte character type (e.g. <code>char</code>), it can store maximum 11 or 15 characters string inside the <code>Value</code> type.</p>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadNone">ShortString (Ch=char) </th><th class="markdownTableHeadNone"></th><th class="markdownTableHeadCenter">32-bit </th><th class="markdownTableHeadCenter">64-bit </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>Ch str[MaxChars]</code> </td><td class="markdownTableBodyNone">String buffer </td><td class="markdownTableBodyCenter">11 </td><td class="markdownTableBodyCenter">15 </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyNone"><code>Ch invLength</code> </td><td class="markdownTableBodyNone">MaxChars - Length </td><td class="markdownTableBodyCenter">1 </td><td class="markdownTableBodyCenter">1 </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyNone"><code>unsigned flags_</code> </td><td class="markdownTableBodyNone"><code>kStringType kStringFlag ...</code> </td><td class="markdownTableBodyCenter">4 </td><td class="markdownTableBodyCenter">4 </td></tr>
</table>
<p>A special technique is applied. Instead of storing the length of string directly, it stores (MaxChars - length). This make it possible to store 11 characters with trailing <code>\0</code>.</p>
<p>This optimization can reduce memory usage for copy-string. It can also improve cache-coherence thus improve runtime performance.</p>
<h1><a class="anchor" id="InternalAllocator"></a>
Allocator</h1>
<p><code>Allocator</code> is a concept in RapidJSON: </p><div class="fragment"><div class="line">concept <a class="code" href="classrapidjson_1_1_allocator.html">Allocator</a> {</div>
<div class="line"> <span class="keyword">static</span> <span class="keyword">const</span> <span class="keywordtype">bool</span> kNeedFree; <span class="comment">//!&lt; Whether this allocator needs to call Free().</span></div>
<div class="line"><span class="comment"></span> </div>
<div class="line"> <span class="comment">// Allocate a memory block.</span></div>
<div class="line"> <span class="comment">// \param size of the memory block in bytes.</span></div>
<div class="line"> <span class="comment">// \returns pointer to the memory block.</span></div>
<div class="line"> <span class="keywordtype">void</span>* Malloc(<span class="keywordtype">size_t</span> size);</div>
<div class="line"> </div>
<div class="line"> <span class="comment">// Resize a memory block.</span></div>
<div class="line"> <span class="comment">// \param originalPtr The pointer to current memory block. Null pointer is permitted.</span></div>
<div class="line"> <span class="comment">// \param originalSize The current size in bytes. (Design issue: since some allocator may not book-keep this, explicitly pass to it can save memory.)</span></div>
<div class="line"> <span class="comment">// \param newSize the new size in bytes.</span></div>
<div class="line"> <span class="keywordtype">void</span>* Realloc(<span class="keywordtype">void</span>* originalPtr, <span class="keywordtype">size_t</span> originalSize, <span class="keywordtype">size_t</span> newSize);</div>
<div class="line"> </div>
<div class="line"> <span class="comment">// Free a memory block.</span></div>
<div class="line"> <span class="comment">// \param pointer to the memory block. Null pointer is permitted.</span></div>
<div class="line"> <span class="keyword">static</span> <span class="keywordtype">void</span> Free(<span class="keywordtype">void</span> *ptr);</div>
<div class="line">};</div>
</div><!-- fragment --><p>Note that <code>Malloc()</code> and <code>Realloc()</code> are member functions but <code>Free()</code> is static member function.</p>
<h2><a class="anchor" id="MemoryPoolAllocator"></a>
MemoryPoolAllocator</h2>
<p><code>MemoryPoolAllocator</code> is the default allocator for DOM. It allocate but do not free memory. This is suitable for building a DOM tree.</p>
<p>Internally, it allocates chunks of memory from the base allocator (by default <code>CrtAllocator</code>) and stores the chunks as a singly linked list. When user requests an allocation, it allocates memory from the following order:</p>
<ol type="1">
<li>User supplied buffer if it is available. (See <a class="el" href="md_doc_dom.html">User Buffer section in DOM</a>)</li>
<li>If user supplied buffer is full, use the current memory chunk.</li>
<li>If the current block is full, allocate a new block of memory.</li>
</ol>
<h1><a class="anchor" id="ParsingOptimization"></a>
Parsing Optimization</h1>
<h2><a class="anchor" id="SkipwhitespaceWithSIMD"></a>
Skip Whitespaces with SIMD</h2>
<p>When parsing JSON from a stream, the parser need to skip 4 whitespace characters:</p>
<ol type="1">
<li>Space (<code>U+0020</code>)</li>
<li>Character Tabulation (<code>U+000B</code>)</li>
<li>Line Feed (<code>U+000A</code>)</li>
<li>Carriage Return (<code>U+000D</code>)</li>
</ol>
<p>A simple implementation will be simply: </p><div class="fragment"><div class="line"><span class="keywordtype">void</span> <a class="code" href="namespacerapidjson.html#a6efb0f4d2a6f81477a59718d42e9464a">SkipWhitespace</a>(InputStream&amp; s) {</div>
<div class="line"> <span class="keywordflow">while</span> (s.Peek() == <span class="charliteral">&#39; &#39;</span> || s.Peek() == <span class="charliteral">&#39;\n&#39;</span> || s.Peek() == <span class="charliteral">&#39;\r&#39;</span> || s.Peek() == <span class="charliteral">&#39;\t&#39;</span>)</div>
<div class="line"> s.Take();</div>
<div class="line">}</div>
</div><!-- fragment --><p>However, this requires 4 comparisons and a few branching for each character. This was found to be a hot spot.</p>
<p>To accelerate this process, SIMD was applied to compare 16 characters with 4 white spaces for each iteration. Currently RapidJSON supports SSE2, SSE4.2 and ARM Neon instructions for this. And it is only activated for UTF-8 memory streams, including string stream or <em>in situ</em> parsing.</p>
<p>To enable this optimization, need to define <code>RAPIDJSON_SSE2</code>, <code>RAPIDJSON_SSE42</code> or <code>RAPIDJSON_NEON</code> before including <code><a class="el" href="rapidjson_8h.html" title="common definitions and configuration">rapidjson.h</a></code>. Some compilers can detect the setting, as in <code>perftest.h</code>:</p>
<div class="fragment"><div class="line"><span class="comment">// __SSE2__ and __SSE4_2__ are recognized by gcc, clang, and the Intel compiler.</span></div>
<div class="line"><span class="comment">// We use -march=native with gmake to enable -msse2 and -msse4.2, if supported.</span></div>
<div class="line"><span class="comment">// Likewise, __ARM_NEON is used to detect Neon.</span></div>
<div class="line"><span class="preprocessor">#if defined(__SSE4_2__)</span></div>
<div class="line"><span class="preprocessor"># define RAPIDJSON_SSE42</span></div>
<div class="line"><span class="preprocessor">#elif defined(__SSE2__)</span></div>
<div class="line"><span class="preprocessor"># define RAPIDJSON_SSE2</span></div>
<div class="line"><span class="preprocessor">#elif defined(__ARM_NEON)</span></div>
<div class="line"><span class="preprocessor"># define RAPIDJSON_NEON</span></div>
<div class="line"><span class="preprocessor">#endif</span></div>
</div><!-- fragment --><p>Note that, these are compile-time settings. Running the executable on a machine without such instruction set support will make it crash.</p>
<h3><a class="anchor" id="autotoc_md66"></a>
Page boundary issue</h3>
<p>In an early version of RapidJSON, <a href="https://code.google.com/archive/p/rapidjson/issues/104">an issue</a> reported that the <code>SkipWhitespace_SIMD()</code> causes crash very rarely (around 1 in 500,000). After investigation, it is suspected that <code>_mm_loadu_si128()</code> accessed bytes after &lsquo;&rsquo;\0'`, and across a protected page boundary.</p>
<p>In <a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel® 64 and IA-32 Architectures Optimization Reference Manual</a>, section 10.2.1:</p>
<blockquote class="doxtable">
<p>To support algorithms requiring unaligned 128-bit SIMD memory accesses, memory buffer allocation by a caller function should consider adding some pad space so that a callee function can safely use the address pointer safely with unaligned 128-bit SIMD memory operations. The minimal padding size should be the width of the SIMD register that might be used in conjunction with unaligned SIMD memory access. </p>
</blockquote>
<p>This is not feasible as RapidJSON should not enforce such requirement.</p>
<p>To fix this issue, currently the routine process bytes up to the next aligned address. After tha, use aligned read to perform SIMD processing. Also see <a href="https://github.com/Tencent/rapidjson/issues/85">#85</a>.</p>
<h2><a class="anchor" id="LocalStreamCopy"></a>
Local Stream Copy</h2>
<p>During optimization, it is found that some compilers cannot localize some member data access of streams into local variables or registers. Experimental results show that for some stream types, making a copy of the stream and used it in inner-loop can improve performance. For example, the actual (non-SIMD) implementation of <code><a class="el" href="namespacerapidjson.html#a6efb0f4d2a6f81477a59718d42e9464a" title="Skip the JSON white spaces in a stream.">SkipWhitespace()</a></code> is implemented as:</p>
<div class="fragment"><div class="line"><span class="keyword">template</span>&lt;<span class="keyword">typename</span> InputStream&gt;</div>
<div class="line"><span class="keywordtype">void</span> <a class="code" href="namespacerapidjson.html#a6efb0f4d2a6f81477a59718d42e9464a">SkipWhitespace</a>(InputStream&amp; is) {</div>
<div class="line"> internal::StreamLocalCopy&lt;InputStream&gt; copy(is);</div>
<div class="line"> InputStream&amp; s(copy.s);</div>
<div class="line"> </div>
<div class="line"> <span class="keywordflow">while</span> (s.Peek() == <span class="charliteral">&#39; &#39;</span> || s.Peek() == <span class="charliteral">&#39;\n&#39;</span> || s.Peek() == <span class="charliteral">&#39;\r&#39;</span> || s.Peek() == <span class="charliteral">&#39;\t&#39;</span>)</div>
<div class="line"> s.Take();</div>
<div class="line">}</div>
</div><!-- fragment --><p>Depending on the traits of stream, <code>StreamLocalCopy</code> will make (or not make) a copy of the stream object, use it locally and copy the states of stream back to the original stream.</p>
<h2><a class="anchor" id="ParsingDouble"></a>
Parsing to Double</h2>
<p>Parsing string into <code>double</code> is difficult. The standard library function <code>strtod()</code> can do the job but it is slow. By default, the parsers use normal precision setting. This has has maximum 3 <a href="http://en.wikipedia.org/wiki/Unit_in_the_last_place">ULP</a> error and implemented in <code>internal::StrtodNormalPrecision()</code>.</p>
<p>When using <code>kParseFullPrecisionFlag</code>, the parsers calls <code>internal::StrtodFullPrecision()</code> instead, and this function actually implemented 3 versions of conversion methods.</p><ol type="1">
<li><a href="http://www.exploringbinary.com/fast-path-decimal-to-floating-point-conversion/">Fast-Path</a>.</li>
<li>Custom DIY-FP implementation as in <a href="https://github.com/floitsch/double-conversion">double-conversion</a>.</li>
<li>Big Integer Method as in (Clinger, William D.&#160;How to read floating point numbers accurately. Vol. 25. No. 6. ACM, 1990).</li>
</ol>
<p>If the first conversion methods fail, it will try the second, and so on.</p>
<h1><a class="anchor" id="GenerationOptimization"></a>
Generation Optimization</h1>
<h2><a class="anchor" id="itoa"></a>
Integer-to-String conversion</h2>
<p>The naive algorithm for integer-to-string conversion involves division per each decimal digit. We have implemented various implementations and evaluated them in <a href="https://github.com/miloyip/itoa-benchmark">itoa-benchmark</a>.</p>
<p>Although SSE2 version is the fastest but the difference is minor by comparing to the first running-up <code>branchlut</code>. And <code>branchlut</code> is pure C++ implementation so we adopt <code>branchlut</code> in RapidJSON.</p>
<h2><a class="anchor" id="dtoa"></a>
Double-to-String conversion</h2>
<p>Originally RapidJSON uses <code>snprintf(..., ..., "%g")</code> to achieve double-to-string conversion. This is not accurate as the default precision is 6. Later we also find that this is slow and there is an alternative.</p>
<p>Google's V8 <a href="https://github.com/floitsch/double-conversion">double-conversion</a> implemented a newer, fast algorithm called Grisu3 (Loitsch, Florian. "Printing floating-point numbers quickly and accurately with integers."&#160;ACM Sigplan Notices&#160;45.6 (2010): 233-243.).</p>
<p>However, since it is not header-only so that we implemented a header-only version of Grisu2. This algorithm guarantees that the result is always accurate. And in most of cases it produces the shortest (optimal) string representation.</p>
<p>The header-only conversion function has been evaluated in <a href="https://github.com/miloyip/dtoa-benchmark">dtoa-benchmark</a>.</p>
<h1><a class="anchor" id="Parser"></a>
Parser</h1>
<h2><a class="anchor" id="IterativeParser"></a>
Iterative Parser</h2>
<p>The iterative parser is a recursive descent LL(1) parser implemented in a non-recursive manner.</p>
<h3><a class="anchor" id="IterativeParserGrammar"></a>
Grammar</h3>
<p>The grammar used for this parser is based on strict JSON syntax: </p><div class="fragment"><div class="line">S -&gt; array | object</div>
<div class="line">array -&gt; [ values ]</div>
<div class="line">object -&gt; { members }</div>
<div class="line">values -&gt; non-empty-values | ε</div>
<div class="line">non-empty-values -&gt; value addition-values</div>
<div class="line">addition-values -&gt; ε | , non-empty-values</div>
<div class="line">members -&gt; non-empty-members | ε</div>
<div class="line">non-empty-members -&gt; member addition-members</div>
<div class="line">addition-members -&gt; ε | , non-empty-members</div>
<div class="line">member -&gt; STRING : value</div>
<div class="line">value -&gt; STRING | NUMBER | NULL | BOOLEAN | object | array</div>
</div><!-- fragment --><p>Note that left factoring is applied to non-terminals <code>values</code> and <code>members</code> to make the grammar be LL(1).</p>
<h3><a class="anchor" id="IterativeParserParsingTable"></a>
Parsing Table</h3>
<p>Based on the grammar, we can construct the FIRST and FOLLOW set.</p>
<p>The FIRST set of non-terminals is listed below:</p>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadCenter">NON-TERMINAL </th><th class="markdownTableHeadCenter">FIRST </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">array </td><td class="markdownTableBodyCenter">[ </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">object </td><td class="markdownTableBodyCenter">{ </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">values </td><td class="markdownTableBodyCenter">ε STRING NUMBER NULL BOOLEAN { [ </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">addition-values </td><td class="markdownTableBodyCenter">ε COMMA </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">members </td><td class="markdownTableBodyCenter">ε STRING </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">addition-members </td><td class="markdownTableBodyCenter">ε COMMA </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">member </td><td class="markdownTableBodyCenter">STRING </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">value </td><td class="markdownTableBodyCenter">STRING NUMBER NULL BOOLEAN { [ </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">S </td><td class="markdownTableBodyCenter">[ { </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">non-empty-members </td><td class="markdownTableBodyCenter">STRING </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter">STRING NUMBER NULL BOOLEAN { [ </td></tr>
</table>
<p>The FOLLOW set is listed below:</p>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadCenter">NON-TERMINAL </th><th class="markdownTableHeadCenter">FOLLOW </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">S </td><td class="markdownTableBodyCenter">$ </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">array </td><td class="markdownTableBodyCenter">, $ } ] </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">object </td><td class="markdownTableBodyCenter">, $ } ] </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">values </td><td class="markdownTableBodyCenter">] </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter">] </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">addition-values </td><td class="markdownTableBodyCenter">] </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">members </td><td class="markdownTableBodyCenter">} </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">non-empty-members </td><td class="markdownTableBodyCenter">} </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">addition-members </td><td class="markdownTableBodyCenter">} </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">member </td><td class="markdownTableBodyCenter">, } </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">value </td><td class="markdownTableBodyCenter">, } ] </td></tr>
</table>
<p>Finally the parsing table can be constructed from FIRST and FOLLOW set:</p>
<table class="markdownTable">
<tr class="markdownTableHead">
<th class="markdownTableHeadCenter">NON-TERMINAL </th><th class="markdownTableHeadCenter">[ </th><th class="markdownTableHeadCenter">{ </th><th class="markdownTableHeadCenter">, </th><th class="markdownTableHeadCenter">: </th><th class="markdownTableHeadCenter">] </th><th class="markdownTableHeadCenter">} </th><th class="markdownTableHeadCenter">STRING </th><th class="markdownTableHeadCenter">NUMBER </th><th class="markdownTableHeadCenter">NULL </th><th class="markdownTableHeadCenter">BOOLEAN </th></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">S </td><td class="markdownTableBodyCenter">array </td><td class="markdownTableBodyCenter">object </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">array </td><td class="markdownTableBodyCenter">[ values ] </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">object </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">{ members } </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">values </td><td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">ε </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter">non-empty-values </td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">non-empty-values </td><td class="markdownTableBodyCenter">value addition-values </td><td class="markdownTableBodyCenter">value addition-values </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">value addition-values </td><td class="markdownTableBodyCenter">value addition-values </td><td class="markdownTableBodyCenter">value addition-values </td><td class="markdownTableBodyCenter">value addition-values </td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">addition-values </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">, non-empty-values </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">ε </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">members </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">ε </td><td class="markdownTableBodyCenter">non-empty-members </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">non-empty-members </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">member addition-members </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">addition-members </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">, non-empty-members </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">ε </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowEven">
<td class="markdownTableBodyCenter">member </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">STRING : value </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td></tr>
<tr class="markdownTableRowOdd">
<td class="markdownTableBodyCenter">value </td><td class="markdownTableBodyCenter">array </td><td class="markdownTableBodyCenter">object </td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter"></td><td class="markdownTableBodyCenter">STRING </td><td class="markdownTableBodyCenter">NUMBER </td><td class="markdownTableBodyCenter">NULL </td><td class="markdownTableBodyCenter">BOOLEAN </td></tr>
</table>
<p>There is a great <a href="http://hackingoff.com/compilers/predict-first-follow-set">tool</a> for above grammar analysis.</p>
<h3><a class="anchor" id="IterativeParserImplementation"></a>
Implementation</h3>
<p>Based on the parsing table, a direct(or conventional) implementation that pushes the production body in reverse order while generating a production could work.</p>
<p>In RapidJSON, several modifications(or adaptations to current design) are made to a direct implementation.</p>
<p>First, the parsing table is encoded in a state machine in RapidJSON. States are constructed by the head and body of production. State transitions are constructed by production rules. Besides, extra states are added for productions involved with <code>array</code> and <code>object</code>. In this way the generation of array values or object members would be a single state transition, rather than several pop/push operations in the direct implementation. This also makes the estimation of stack size more easier.</p>
<p>The state diagram is shown as follows:</p>
<div class="image">
<img src="iterative-parser-states-diagram.png" alt=""/>
<div class="caption">
State Diagram</div></div>
<p>Second, the iterative parser also keeps track of array's value count and object's member count in its internal stack, which may be different from a conventional implementation. </p>
</div></div><!-- contents -->
</div><!-- PageDoc -->
</div><!-- doc-content -->
<div class="ttc" id="aclassrapidjson_1_1_allocator_html"><div class="ttname"><a href="classrapidjson_1_1_allocator.html">Allocator</a></div><div class="ttdoc">Concept for allocating, resizing and freeing memory block.</div></div>
<div class="ttc" id="anamespacerapidjson_html_a6efb0f4d2a6f81477a59718d42e9464a"><div class="ttname"><a href="namespacerapidjson.html#a6efb0f4d2a6f81477a59718d42e9464a">rapidjson::SkipWhitespace</a></div><div class="ttdeci">void SkipWhitespace(InputStream &amp;is)</div><div class="ttdoc">Skip the JSON white spaces in a stream.</div><div class="ttdef"><b>Definition:</b> reader.h:266</div></div>
<!-- HTML footer for doxygen 1.8.7-->
<!-- start footer part -->
<div id="nav-path" class="navpath"><!-- id is needed for treeview function! -->
<ul>
</ul>
</div>
</body>
</html>