Add "proper"-property-reader

The word- and sentence-segmentation algorithms use complicated logic to
accommodate "raw" and "skip" properties. The code is barely readable and
does not separate its abstractions cleanly. Moreover, it is likely that
certain edge cases are not handled properly.

To fix this, this commit adds a "proper"-property-reader, which handles
all the messy details in the background using well-commented and
transparent code that builds on top of the herodotus-reader instead of
handling them by hand. This ensures that we will (provably) never have
buffer overflows unless there is a mistake in the implementation
itself. The implementation, in turn, can be verified relatively easily
because each function has a limited scope.

Signed-off-by: Laslo Hunhold <dev@frign.de>
3 files changed