add fast_mul(F32,F32)

This is just like mul(F32,F32), except that it folds 0*x to 0, which
is not strictly IEEE-correct when x is NaN or infinite.
Use it in SkSLVMGenerator; SkSL already applies this optimization.

PS2 has a sneaky version using % as a fast_mul() operator, and
PS3 has a sneakier version using ** instead.

We could of course write this all out using fast_mul() the long way,
but I found that quickly became difficult to read.

Change-Id: Iae35ce54411abc00e7729e178eb6a10f151a5304
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/368838
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
6 files changed