plumb register aliasing hints through on arm64

There's no reason not to do this, though there are so many registers on
arm64 that I doubt we'll see any speed difference here at all.

I let dst() take a second hint, which makes most of these super easy;
double hints don't really come up on x86, where we've got all that
any() register-or-memory-address complexity to deal with instead.
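
A rough sketch of what a two-hint destination pick might look like. This
is not the actual dst() from this change; Operand, pick_dst, and the
free-register bitmask are hypothetical names made up for illustration.
The idea is just that if either hinted operand dies at this instruction,
its register can be reused for the result:

```cpp
#include <cstdint>
#include <initializer_list>

// Hypothetical model of an instruction operand: the register it lives in,
// and whether this instruction is its last use.
struct Operand {
    int  reg;
    bool dies_here;
};

// Pick a destination register (assumed allocator, not Skia's real one).
// Prefer aliasing the first hint, then the second; otherwise take the
// lowest free register from free_mask and mark it used.
// Returns -1 if nothing is available.
inline int pick_dst(uint32_t& free_mask,
                    const Operand* hint1 = nullptr,
                    const Operand* hint2 = nullptr) {
    for (const Operand* hint : {hint1, hint2}) {
        if (hint && hint->dies_here) {
            return hint->reg;   // reuse the dying operand's register
        }
    }
    for (int r = 0; r < 32; r++) {
        if (free_mask & (1u << r)) {
            free_mask &= ~(1u << r);
            return r;
        }
    }
    return -1;
}
```

With one hint, a binary op whose first operand is still live always needs
a fresh register; with two, the second operand gets a chance too.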

The most subtle bit is that it's safe to alias the index and destination
registers of the gather ops: we pull an index out of a lane, load the
value, and shove it back into that same lane, so each lane's index is
read before that lane gets overwritten.
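
To make the aliasing argument concrete, here is a scalar model of a
lane-by-lane gather done in place. It is only a sketch: Lanes and
gather_in_place are hypothetical names, and a 4-lane std::array stands in
for a vector register; it is not the JIT code itself.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 4-lane 32-bit vector register.
using Lanes = std::array<uint32_t, 4>;

// Gather with index and destination aliased: reg holds indices on entry
// and the loaded values on return.
inline void gather_in_place(Lanes& reg, const uint32_t* table) {
    for (size_t lane = 0; lane < reg.size(); lane++) {
        uint32_t ix  = reg[lane];   // pull the index out of this lane
        uint32_t val = table[ix];   // load the value
        reg[lane]    = val;         // shove it back into that same lane
    }
}
```

Each iteration touches only its own lane, and reads it before writing it,
so no later lane ever sees a clobbered index.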

Change-Id: I0f28ead95922e99e712ccb2cf824bf2610f556a6
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/340721
Commit-Queue: Herb Derby <herb@google.com>
Auto-Submit: Mike Klein <mtklein@google.com>
Reviewed-by: Herb Derby <herb@google.com>
1 file changed