Use back-face culling to give PLS triangles negative coverage

Previously, we would check at every pixel whether its triangle was clockwise, and use this information to decide if coverage should be positive or negative.

This PR updates PLS to enable back-face culling and emit every triangle twice: once clockwise and once counterclockwise, with one copy carrying positive coverage and the other negative. The geometry is emitted in such a way that back-face culling naturally selects the triangle with the appropriately signed coverage and discards the other.
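A rough CPU-side sketch of the idea, for illustration only. It is not the renderer's code: the function names are hypothetical, and it assumes a y-up coordinate system in which clockwise triangles are front-facing and carry positive coverage.

```python
def cross(tri):
    # Twice the signed area of the triangle (y-up: negative means clockwise).
    (ax, ay), (bx, by), (cx, cy) = tri
    return (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)

def is_clockwise(tri):
    return cross(tri) < 0

def old_per_pixel_sign(tri):
    # Old approach: decide the coverage sign from the winding at shading time.
    return +1 if is_clockwise(tri) else -1

def emit_both_windings(tri):
    # New approach: emit the triangle twice. The original vertex order
    # carries +1 coverage and the reversed order carries -1.
    a, b, c = tri
    return [((a, b, c), +1), ((a, c, b), -1)]

def cull_back_faces(prims):
    # Stand-in for fixed-function back-face culling: keep clockwise only.
    # Degenerate (zero-area) triangles are dropped as well.
    return [(t, sign) for t, sign in prims if is_clockwise(t)]

def surviving_sign(tri):
    # Exactly one copy of a non-degenerate triangle survives culling.
    survivors = cull_back_faces(emit_both_windings(tri))
    return survivors[0][1]
```

For any non-degenerate triangle, `surviving_sign(tri)` matches `old_per_pixel_sign(tri)`, so in this model the winding check no longer has to happen per pixel.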

For strokes, we take care to ensure the triangles are always clockwise and only emit the triangles once. This has a nice side effect of discarding some backwards triangles that we didn't need anyway when the stroke folded over onto itself.
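A toy illustration of the stroke case, under the same assumptions (y-up coordinates, clockwise is front-facing). The quad layout and the names here are hypothetical and do not reflect the actual stroke tessellation. Each stroke segment is emitted once with a fixed vertex order that is clockwise for a well-formed segment; when one edge folds back on itself, the affected triangle flips to counterclockwise and is culled.

```python
def cross(tri):
    # Twice the signed area of the triangle (y-up: negative means clockwise).
    (ax, ay), (bx, by), (cx, cy) = tri
    return (bx - ax) * (cy - ay) - (cx - ax) * (by - ay)

def is_clockwise(tri):
    return cross(tri) < 0

def stroke_segment_triangles(left0, right0, left1, right1):
    # Emit the segment's quad once, as two triangles whose vertex order
    # is clockwise whenever the segment is not folded.
    return [(left0, left1, right0), (left1, right1, right0)]

def cull_back_faces(tris):
    # Stand-in for fixed-function back-face culling.
    return [t for t in tris if is_clockwise(t)]

# A well-formed segment travelling in +x: both triangles survive.
normal = stroke_segment_triangles((0, 1), (0, -1), (1, 1), (1, -1))

# The left edge folds backwards (left1 lies behind left0), as can happen
# on the inside of a tight turn: the flipped triangle is discarded.
folded = stroke_segment_triangles((0, 1), (0, -1), (-0.5, 1), (1, -1))

kept_normal = cull_back_faces(normal)
kept_folded = cull_back_faces(folded)
```

In this toy case `kept_normal` holds both triangles while `kept_folded` holds only one, which is the kind of backwards triangle the culling drops for free.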

This change gets some nice speedups by eliminating another branch in the fragment shader, dropping some unnecessary stroke triangles, and no longer requiring the GPU to surface winding information to the shader.

It also paves the way for some really neat fill rule optimizations by always drawing negative coverage FIRST, and then drawing the positive coverage after.

Perf improvements (before - after):

187 martys on M1: 114 - 122 fps
27 martys on Intel: 27.4 - 31.7 fps
paper on Intel: 60.6 - 86.8 fps
3 tigers on Intel: 54.9 - 69.5 fps

Diffs=
f6d19cc24 Use back-face culling to give PLS triangles negative coverage (#5637)

Co-authored-by: Chris Dalton <99840794+csmartdalton@users.noreply.github.com>
20 files changed