Add AVX2 implementations for SkSwizzler opts

nanobench shows the following performance improvements on an Intel Core i9-9900K:

                        before      after       improvement
RGBA_to_BGRA            121  ns     66.3 ns     45%
RGBA_to_bgrA            290  ns     154  ns     47%
RGBA_to_rgbA            290  ns     157  ns     46%
RGB_to_BGR1*            138  ns     137  ns     -
RGB_to_RGB1*            137  ns     137  ns     -
grayA_to_rgbA           162  ns     96   ns     41%
grayA_to_RGBA           91   ns     68.1 ns     25%
gray_to_RGB1            106  ns     87.1 ns     18%
inverted_CMYK_to_BGR1   272  ns     146  ns     46%
inverted_CMYK_to_RGB1   273  ns     147  ns     46%

*Note: for RGB_to_BGR1 and RGB_to_RGB1, nanobench showed performance
regressions on some platforms when using skvx + AVX2 intrinsics, so
this change keeps the original SSSE3 implementations for those two.
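
For illustration only, here is a minimal sketch of the kind of AVX2
byte shuffle an RGBA_to_BGRA routine can use. It is not the code in
this change: the name rgba_to_bgra_sketch is hypothetical, the pixel
layout (bytes R,G,B,A in memory) is an assumption, and the AVX2 path
is only compiled when the compiler defines __AVX2__, with a scalar
loop handling the tail (or everything, without AVX2).

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

// Hypothetical helper, not Skia's implementation.
// Assumes each pixel is stored as bytes R,G,B,A in memory, i.e. a
// little-endian uint32_t of the form 0xAABBGGRR. Swaps R and B.
static void rgba_to_bgra_sketch(uint32_t* dst, const uint32_t* src, size_t count) {
#if defined(__AVX2__)
    // _mm256_shuffle_epi8 shuffles bytes within each 128-bit lane, so the
    // same 16-byte pattern is repeated for the low and high lanes.
    const __m256i swapRB = _mm256_setr_epi8(
        2, 1, 0, 3,  6, 5, 4, 7,  10, 9, 8, 11,  14, 13, 12, 15,
        2, 1, 0, 3,  6, 5, 4, 7,  10, 9, 8, 11,  14, 13, 12, 15);

    // Process 8 pixels (32 bytes) per iteration with unaligned loads/stores.
    while (count >= 8) {
        __m256i px = _mm256_loadu_si256(reinterpret_cast<const __m256i*>(src));
        _mm256_storeu_si256(reinterpret_cast<__m256i*>(dst),
                            _mm256_shuffle_epi8(px, swapRB));
        src += 8;
        dst += 8;
        count -= 8;
    }
#endif
    // Scalar path: handles the remaining pixels, or all of them when the
    // AVX2 path is not compiled in.
    for (size_t i = 0; i < count; i++) {
        uint32_t p = src[i];
        dst[i] = (p & 0xFF00FF00u)            // keep G and A in place
               | ((p & 0x00FF0000u) >> 16)    // move B down to byte 0
               | ((p & 0x000000FFu) << 16);   // move R up to byte 2
    }
}
```

Build with -mavx2 (GCC/Clang) to exercise the vector path; without it
the same function falls back to the scalar loop and gives the same
results.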

Change-Id: I2f7a5980dda82455932bddf168dfe836c46b1341
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/290436
Reviewed-by: Mike Klein <mtklein@google.com>
Commit-Queue: Mike Klein <mtklein@google.com>
2 files changed