Reland "Simplify quickReject implementation in SkCanvas"

This is a reland of 0a0f4f5c35520cb7ca5b065d3813e30dae124a77

This change makes SkCanvas::quickReject always reject empty draw bounds,
whereas previously scale+translate CTMs allowed bounds with w or h == 0
but otherwise contained in the clip to be drawn. This uncovered some
bugs in Skia where bounds shouldn't be empty, and in Flutter where
bounds were legit empty but not expected by the test.

No code changes needed. The issues that required its revert have been
fixed with:
1. https://github.com/flutter/engine/pull/26053 so that platforms that
use an empty typeface, leading to empty draws are just skipped.
2. https://skia-review.googlesource.com/c/skia/+/406140 so that path
effects update bounds so that Android's 1D dash path effect applied to
a horizontal line is properly not rejected.

Based on the period of time where the original CL was landed, some perf
data was collected.
- There were no significant changes in most SKPs or SVGs, except for a
Flutter page flip skp, which saw a 10% net improvement (perhaps the
flip is drawn with perspective?)
- A 10-20% regression in the motionmark paths skp, but dominated by the MSVC
compiler, so I'm not too concerned about that.
- Perspective microbenchmarks for drawing rectangles are 1.5-2x faster.
- quickReject microbenchmarks are about 2x slower.

The last two microbenchmark results aren't surprising since perspective
was the largest improvement in perf for SkM44::MapRect vs.
SkMatrix::mapRect, and the scale+translate specializations in Skmatrix
were maybe 50% faster than SkM44's. That would account for some of the
slow downs, and the rest could be explained by moving away from the
SIMD rect intersection and nan test.

Since these microreductions don't seem to bleed into more complex
benchmarks, I'm inclined to keep the code simple and not bring back the
custom intrinsics.

Original change's description:
> Simplify quickReject implementation in SkCanvas
>
>  - SkCanvas no longer keeps fIsScaleTranslate bool that has to stay in
>    sync with the type of the matrix.
>  - No more fast or slow path for quickReject, the Sk4f code has been
>    completely removed.
>  - Uses SkM44::mapRect instead of SkMatrix::mapRect. This is slightly
>    slower for S+T, but much faster for other transforms. I'm hopeful we
>    won't notice the regression in the grand scheme for S+T, since the
>    code is a lot simpler now.
>  - The final isFinite() and intersects() check for quickReject uses
>    SkRect's functions instead of hand-written SSE/NEON. If we think this
>    is optimization is necessary, I'm hoping we can rewrite it in terms
>    of skvx instead of specific instructions.
>  - Consolidated how the quick-reject bounds outsetting into
>    computeDeviceClipBounds, and added an option to skip outsetting for
>    the one call site that doesn't want it.
>
> Bug: skia:10987
> Change-Id: I3cf2a73636cdeed06d12cab4548cfb94d1eb074a
> Reviewed-on: https://skia-review.googlesource.com/c/skia/+/405198
> Commit-Queue: Mike Reed <reed@google.com>
> Auto-Submit: Michael Ludwig <michaelludwig@google.com>
> Reviewed-by: Mike Reed <reed@google.com>

Bug: skia:10987
Change-Id: Id0d4b4ecebf0b83ae30f7e1a263961ab25de28dd
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/407358
Commit-Queue: Michael Ludwig <michaelludwig@google.com>
Reviewed-by: Michael Ludwig <michaelludwig@google.com>
2 files changed