Avoid recursion for (most) stroke tessellation patches

Adds quick accepts to the switch statement in
GrStrokeHardwareTessellator::prepare() that allow us to write out most
tessellation patches immediately. This avoids calling into the
recursive methods and skips some of their checks that aren't necessary
the first time around.
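
As a rough illustration of the quick-accept pattern only (not the
actual Skia code), the sketch below assumes each segment carries a
precomputed estimate of how many tessellation segments it needs, and
that a patch can hold at most maxSegments of them. Verb, Segment,
PatchWriter and all of their members are hypothetical names made up
for this example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for illustration; none of these names are Skia's.
enum class Verb { kLine, kQuad, kCubic };

struct Segment {
    Verb verb;
    float segmentsNeeded;  // Assumed: precomputed estimate for this curve.
};

struct PatchWriter {
    float maxSegments;       // Assumed: most segments a single patch can hold.
    int patchesWritten = 0;
    int recursiveCalls = 0;

    void writePatch() { ++patchesWritten; }

    // Slow path: halve the work until each piece fits in one patch.
    void writeRecursive(float segmentsNeeded) {
        assert(maxSegments > 0);  // Otherwise the recursion never terminates.
        ++recursiveCalls;
        if (segmentsNeeded <= maxSegments) {
            writePatch();
            return;
        }
        writeRecursive(segmentsNeeded * 0.5f);
        writeRecursive(segmentsNeeded * 0.5f);
    }

    void prepare(const std::vector<Segment>& segments) {
        for (const Segment& s : segments) {
            switch (s.verb) {
                case Verb::kLine:
                    writePatch();  // Lines never need chopping in this sketch.
                    break;
                case Verb::kQuad:
                case Verb::kCubic:
                    if (s.segmentsNeeded <= maxSegments) {
                        writePatch();  // Quick accept: no recursive call.
                        break;
                    }
                    writeRecursive(s.segmentsNeeded);  // Rare slow path.
                    break;
            }
        }
    }
};
```

In this sketch, only curves that exceed the per-patch limit ever reach
writeRecursive(); everything else is written directly from the switch.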

Also adds a microbench that mimics the MotionMark "paths" benchmark
and measures our CPU-side prepare() time.

This shaves up to 30% off the microbenchmark times.

Bug: chromium:1172543
Change-Id: Idc93bebb79db9898a4ec241b1f6c8b9eb9ba7da3
