convert SkVMBlitter over to floats

As we've learned, there's not much advantage to working directly in i32
ops over f32: they're the same size, speed is roughly a wash, and f32
supports every operation we want where i32 supports only a subset.  If
we really want to go fast, we need to focus on i16 operations, which are
both significantly faster and, at half the lane width, process twice as
many values per vector.
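
To make the "twice as much data" point concrete, here is a small sketch
of the lane arithmetic.  It assumes 256-bit (AVX2-sized) registers, and
lanes_per_vector and kAVX2Bytes are hypothetical names for illustration,
not anything in Skia or SkVM.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of Skia): how many lanes of type T fit
// in a SIMD register that is register_bytes wide.
template <typename T>
constexpr std::size_t lanes_per_vector(std::size_t register_bytes) {
    return register_bytes / sizeof(T);
}

// Assumption: a 256-bit register, as on AVX2.
constexpr std::size_t kAVX2Bytes = 32;

// f32 and i32 are both 4 bytes wide, so they get the same lane count.
static_assert(lanes_per_vector<float>(kAVX2Bytes) == 8, "f32 lanes");
static_assert(lanes_per_vector<std::int32_t>(kAVX2Bytes) == 8, "i32 lanes");

// i16 is 2 bytes wide, so each register holds twice as many values.
static_assert(lanes_per_vector<std::int16_t>(kAVX2Bytes) == 16, "i16 lanes");
```

This only covers the width half of the argument; any per-instruction
speed difference is separate and not modeled here.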

(This is the same split as SkRasterPipeline, highp f32 and lowp i16.)

For now, port everything to f32, with i16 to follow, perhaps much later.

There's a little here we could spin off to land first (uniformF, better
unpremul), but I think it might be easiest to land it all at once.

Cq-Include-Trybots: skia.primary:Test-Debian9-Clang-GCE-CPU-AVX2-x86_64-Debug-All-SK_USE_SKVM_BLITTER
Change-Id: I6fa0fd2031a0de18456abf529cc5b0d8137ecbe0
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/253704
Commit-Queue: Mike Klein <mtklein@google.com>
Reviewed-by: Mike Reed <reed@google.com>
14 files changed