Restructure skcms ops to be more like Raster Pipeline.

The generated code is very similar but not identical: on M1, the
instructions appear in a different order, and a handful of stages
use one fewer instruction. I wasn't able to discern any performance
impact with either skcms bench or nanobench.

Change-Id: I96198b24990e55b06f1e2f9e43f9b53c54d1d9ba
Bug: b/305974160
Reviewed-on: https://skia-review.googlesource.com/c/skcms/+/771419
Reviewed-by: Brian Osman <brianosman@google.com>
Commit-Queue: John Stiles <johnstiles@google.com>
Auto-Submit: John Stiles <johnstiles@google.com>
4 files changed