Rearrange append to multiply by sizeOfT in the templated code

Multiplying by sizeOfT in the common (non-templated) code is expensive:
sizeOfT is only known there at run time, so the compiler must generate
a multiply instruction. When sizeof(T) is known at compile time, the
compiler can strength-reduce the multiply to shifts and adds. Moving
the multiply into the templated code provides a large performance
improvement.

Change the two most common append calls to use this technique.
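
A minimal sketch of the idea, not the actual Skia code. ByteStorage,
TypedArray, appendSlow and appendBytes are hypothetical names chosen
for illustration; only sizeOfT and append come from this change.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical non-templated storage shared by every element type.
class ByteStorage {
public:
    ~ByteStorage() { std::free(fData); }

    // Before: sizeOfT is a run-time value in the common code, so the
    // compiler has to emit a general multiply here.
    void* appendSlow(size_t count, size_t sizeOfT) {
        return this->appendBytes(count * sizeOfT);
    }

    // After: the common code only sees a byte count; the multiply has
    // already happened in the templated caller.
    void* appendBytes(size_t bytes) {
        if (fUsed + bytes > fCapacity) {
            size_t newCapacity = (fUsed + bytes) * 2;
            void* p = std::realloc(fData, newCapacity);
            if (!p) { std::abort(); }
            fData = static_cast<char*>(p);
            fCapacity = newCapacity;
        }
        void* result = fData + fUsed;
        fUsed += bytes;
        return result;
    }

    size_t bytesUsed() const { return fUsed; }

private:
    char*  fData = nullptr;
    size_t fUsed = 0;
    size_t fCapacity = 0;
};

// Hypothetical templated wrapper. T is assumed to be trivially copyable.
template <typename T>
class TypedArray {
public:
    // sizeof(T) is a compile-time constant here, so the compiler is
    // free to lower the multiply to shifts and adds.
    T* append(size_t count) {
        return static_cast<T*>(fStorage.appendBytes(count * sizeof(T)));
    }

    size_t size() const { return fStorage.bytesUsed() / sizeof(T); }

private:
    ByteStorage fStorage;
};
```

In this sketch, TypedArray<int>::append(3) reserves 3 * sizeof(int)
bytes and returns a pointer to the new elements. The pointer is only
valid until the next append, since the storage may be reallocated.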

Bug: b/249254511
Bug: chromium:1369069
Change-Id: I9bc7bbdb007a31357581426336b96f7cfc4eaa1b
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/587896
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Herb Derby <herb@google.com>
2 files changed