[smooth] Implement Bezier quadratic arc flattening with DDA
Benchmarking shows that this provides a very slight performance
boost when rendering fonts with lots of quadratic Bezier arcs,
compared to recursive arc splitting, but only when SSE2 is
available, or on 64-bit CPUs.
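
For readers unfamiliar with the technique, the following is a minimal
sketch of DDA (forward-differencing) flattening of a quadratic Bezier
arc. It is not the code from this change: it uses floating point for
clarity, takes a fixed segment count, and the names `Point` and
`flatten_quadratic_dda` are hypothetical. The idea is that a quadratic
has a constant second difference, so each new point costs only two
additions per coordinate instead of a recursive subdivision.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical point type, for illustration only. */
typedef struct { double x, y; } Point;

/*
 * Flatten the quadratic Bezier arc (p0, p1, p2) into n line segments
 * using forward differences.  Writes the n segment end points to
 * out[0..n-1]; the start point p0 is not written.
 *
 * With P(t) = p0 + B*t + A*t^2, A = p0 - 2*p1 + p2, B = 2*(p1 - p0)
 * and step h = 1/n:
 *   first difference at t = 0:  d  = B*h + A*h^2
 *   constant second difference: dd = 2*A*h^2
 */
static void
flatten_quadratic_dda(Point p0, Point p1, Point p2, size_t n, Point *out)
{
    assert(n > 0 && out != NULL);

    double h  = 1.0 / (double)n;
    double h2 = h * h;

    double ax = p0.x - 2.0 * p1.x + p2.x;
    double ay = p0.y - 2.0 * p1.y + p2.y;
    double bx = 2.0 * (p1.x - p0.x);
    double by = 2.0 * (p1.y - p0.y);

    double dx  = bx * h + ax * h2;
    double dy  = by * h + ay * h2;
    double ddx = 2.0 * ax * h2;
    double ddy = 2.0 * ay * h2;

    Point p = p0;
    for (size_t i = 0; i < n; i++) {
        p.x += dx;      /* advance to the next point on the curve */
        p.y += dy;
        dx  += ddx;     /* update the first difference */
        dy  += ddy;
        out[i] = p;
    }
}
```

A real rasterizer would typically derive n from a flatness tolerance
and work in fixed point; both are left out of this sketch.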
On a 2017 Core i5-7300U CPU on Linux/x86_64
(best of 5 runs for all numbers):

  ./ftbench -p -s10 -t5 -cb .../DroidSansFallbackFull.ttf

      Before: 4.033 us/op
      After:  3.876 us/op

  ./ftbench -p -s60 -t5 -cb .../DroidSansFallbackFull.ttf

      Before: 13.467 us/op
      After:  13.385 us/op
2 files changed