SkTDArray: add sizeof(T) as member to common code.

This allows nearly perfect inlining of all the SkTDArray routines,
greatly reducing the .eh_frame data.

In previous CLs I passed sizeof(T) as a parameter to each shared
function, but the 3 additional bytes blocked inlining, causing a
massive number of .eh_frame entries.

This CL moves the sizeof(T) into the shared class, which causes
every instance of SkTDArray to increase from 16 bytes to 24 bytes.

Embedding the shared class as the first field puts it at the same
address as SkTDArray's this pointer. This means that no register
rearrangement needs to happen to jump from SkTDArray to the shared
functionality.
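
As a rough illustration of the layout described above (not the actual
Skia code), here is a minimal sketch. The names ArrayStorage,
TypedArray, append() and storageAddress() are hypothetical, and the
sketch assumes trivially copyable element types. The non-template core
stores the element size once, and the template wrapper holds it as its
only field, so both objects start at the same address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <type_traits>

// Hypothetical non-template core. The element size is stored as a member,
// so none of the shared routines need it passed in as an argument.
class ArrayStorage {
public:
    explicit ArrayStorage(std::size_t sizeOfT) : fSizeOfT(sizeOfT) {}
    ~ArrayStorage() { std::free(fData); }
    ArrayStorage(const ArrayStorage&) = delete;
    ArrayStorage& operator=(const ArrayStorage&) = delete;

    int size() const { return fSize; }
    void* data() const { return fData; }

    // Grows the array by one element and returns a pointer to the new slot.
    void* append() {
        if (fSize == fCapacity) {
            int newCapacity = fCapacity ? fCapacity * 2 : 4;
            void* p = std::realloc(fData, newCapacity * fSizeOfT);
            assert(p);
            fData = p;
            fCapacity = newCapacity;
        }
        return static_cast<char*>(fData) + (fSize++) * fSizeOfT;
    }

private:
    void* fData = nullptr;
    int fSize = 0;
    int fCapacity = 0;
    std::size_t fSizeOfT;  // the extra member that grows every instance
};

// Hypothetical typed wrapper. Every method is a thin forwarder, which is
// what makes it cheap for the compiler to inline.
template <typename T>
class TypedArray {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable types");

public:
    TypedArray() : fStorage(sizeof(T)) {}

    int size() const { return fStorage.size(); }
    void push_back(const T& v) { std::memcpy(fStorage.append(), &v, sizeof(T)); }
    T& operator[](int i) { return static_cast<T*>(fStorage.data())[i]; }

    // Exposed only so the layout claim can be checked from outside.
    const void* storageAddress() const { return &fStorage; }

private:
    ArrayStorage fStorage;  // first and only field: same address as `this`
};

// A standard-layout class shares its address with its first member.
static_assert(std::is_standard_layout<TypedArray<int>>::value,
              "wrapper must be standard layout for the address claim");
```

Because the wrapper is standard layout and the core is its first
member, a forwarding call can pass the wrapper's this pointer through
unchanged.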

Reduction in size: 4K
    FILE SIZE
 --------------
  +100%     -16    .text
  +100%    -680    .eh_frame_hdr
   +81% -1.01Ki    [Unmapped]
  +100% -2.31Ki    .eh_frame

This is almost entirely due to over 100 functions properly inlining.

Performance:
Skia's perf shows that desk_micrographygirlsvg.skp is the worst
regression.

On an M1 Mac (an arm64 binary), comparing the CL before all the
SkTDArray changes:
https://skia-review.googlesource.com/c/skia/+/583238
git sha: ff37365c47f87dd74d301f31e4d61a2d15ebecd9

to this CL shows no difference in performance.

Bug: skia:13657
Change-Id: Ie6e1a766fd6739e72fb243766128b3bd8faf2158
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/585418
Reviewed-by: John Stiles <johnstiles@google.com>
Commit-Queue: Herb Derby <herb@google.com>
4 files changed