Use specialized quad lists in rectangle ops

This should reduce the memory footprint of GrFillRectOp and GrTextureOp.

The original rect code (GrAAFillRectOp) stored 2 SkMatrices (18 floats), 2
SkRects (8 floats), an SkPMColor4f (4 floats), and a flag (1 int), for a total
of 124 bytes per quad stored in the op.

The first pass at the rectangle consolidation switched to storing device and
local quads as GrPerspQuads (32 floats), an SkPMColor4f (4 floats), and a flag
(1 int), for a total of 148 bytes per quad. After that change landed, several
memory regressions appeared in Chrome and our perf monitor.

Several intertwined approaches are taken here. First, GrPerspQuad no longer
caches 1/w, which makes a quad 12 floats instead of 16. Second, a specialized
list type is defined that stores the x, y, and extra metadata together for
each quad, but keeps the w components separate. When the quad type isn't
perspective, w is not stored at all since it is implicitly 1 and can be
reconstituted at tessellation time. This brings the total per quad to either
84 or 116 bytes, depending on whether the op list needs perspective
information.
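
The storage split described above can be sketched roughly as follows. This is
an illustration only, not the code in this change: QuadXY, QuadEntry, QuadW,
and QuadList are hypothetical names, the sketch assumes the metadata is just
the color and flag, that device and local quads share one perspective setting,
and that float and int are 4 bytes each.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// All names here are hypothetical; they are not the Skia classes.
struct Color4f { float r, g, b, a; };        // stand-in for SkPMColor4f
struct QuadXY  { float x[4]; float y[4]; };  // 4 vertices, no w

// Always stored per quad: device and local x/y, color, and a flag.
struct QuadEntry {
    QuadXY       device;
    QuadXY       local;
    Color4f      color;
    std::int32_t flags;
};

// Stored only when the list holds perspective quads.
struct QuadW { float device[4]; float local[4]; };

static_assert(sizeof(QuadEntry) == 84, "16 + 4 floats and 1 int");
static_assert(sizeof(QuadEntry) + sizeof(QuadW) == 116, "plus 8 floats of w");

class QuadList {
public:
    explicit QuadList(bool hasPerspective) : fHasPerspective(hasPerspective) {}

    // `w` is ignored (and never stored) for non-perspective lists.
    void push(const QuadEntry& entry, const QuadW& w = QuadW{}) {
        fEntries.push_back(entry);
        if (fHasPerspective) {
            fW.push_back(w);
        }
    }

    // w is implicitly 1 when it was never stored.
    float deviceW(std::size_t quad, int vertex) const {
        return fHasPerspective ? fW[quad].device[vertex] : 1.0f;
    }

    std::size_t bytesPerQuad() const {
        return sizeof(QuadEntry) + (fHasPerspective ? sizeof(QuadW) : 0);
    }

private:
    bool                   fHasPerspective;
    std::vector<QuadEntry> fEntries;
    std::vector<QuadW>     fW;  // stays empty for non-perspective lists
};
```

Keeping w in a parallel array means a non-perspective list pays nothing for
it, while a perspective list still keeps the x/y data and metadata contiguous.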

Bug: chromium:915025
Bug: chromium:917242
Change-Id: If37ee122847b0c32604bb45dc2a1326b544f9cf6
Reviewed-on: https://skia-review.googlesource.com/c/180644
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Robert Phillips <robertphillips@google.com>
6 files changed