Add new tool for easily comparing raster pipeline benchmarks

One workflow I have for benchmarking SkRP is to build a control
version of nanobench, comment out the ML4 (AVX512) branch and
recompile to get a control version for ML3 (AVX2), then comment out
the ML3 branch to get a control version for SSE2. After making my
changes, I build experimental versions of nanobench to mirror those
three settings.

This has gotten tedious, so I wrote a script to automate it.

```
$ python3 tools/raster_pipeline/run_benchmarks.py --compile
Checking Git status of src/opts/SkRasterPipeline_opts.h...

--- Checking out Control baseline (origin/main) ---

Configuring SkOpts.cpp for Control ML4...
Compiling nanobench_control_ml4...
Saved binary to out/Release/nanobenches/nanobench_ml4_control

Configuring SkOpts.cpp for Control ML3...
Compiling nanobench_control_ml3...
Saved binary to out/Release/nanobenches/nanobench_ml3_control

Configuring SkOpts.cpp for Control SSE2...
Compiling nanobench_control_sse2...
Saved binary to out/Release/nanobenches/nanobench_sse2_control

--- Checking out HEAD (Optimized) ---

Configuring SkOpts.cpp for Optimized ML4...
Compiling nanobench_with_changes_ml4...
Saved binary to out/Release/nanobenches/nanobench_ml4_with_changes
...
$ python3 tools/raster_pipeline/run_benchmarks.py --run --match rp_div

======================================================================
 Running nanobench_ml4_control...
Timer overhead: 23.8ns
curr/maxrss	loops	min	median	mean	max	stddev	samples   	config	bench
  96/93  MB	38	303ns	304ns	304ns	305ns	0%	█▂▄▃▆▃▃█▅▁	nonrendering	skrp_div_uint_4
  96/93  MB	215	76ns	76.2ns	78.3ns	96.9ns	8%	▁▁▁▁▁▁█▁▁▁	nonrendering	skrp_div_uint_1
  96/93  MB	247	302ns	303ns	303ns	312ns	1%	▂▁▁█▁▁▁▁▁▁	nonrendering	skrp_div_int_4
  96/93  MB	593	75.9ns	75.9ns	76.7ns	81ns	2%	▁█▁▁▁▁▁▁▁▆	nonrendering	skrp_div_int_1

======================================================================
 Running nanobench_ml4_with_changes...
Timer overhead: 23.3ns
curr/maxrss	loops	min	median	mean	max	stddev	samples   	config	bench
  96/94  MB	39	303ns	304ns	309ns	358ns	6%	▁▁▁▁█▁▁▁▁▁	nonrendering	skrp_div_uint_4
  96/94  MB	256	75.9ns	76.1ns	76.1ns	76.3ns	0%	▆█▁▄▁▄▁▆▆▁	nonrendering	skrp_div_uint_1
  96/94  MB	247	302ns	303ns	304ns	319ns	2%	▁▁▁▁▁█▁▁▁▁	nonrendering	skrp_div_int_4
  96/94  MB	506	75.8ns	76.1ns	76.1ns	76.5ns	0%	▃▄▁▂▂█▂▄▇▄	nonrendering	skrp_div_int_1

======================================================================
 Running nanobench_ml3_control...
Timer overhead: 24.7ns
curr/maxrss	loops	min	median	mean	max	stddev	samples   	config	bench
  95/93  MB	33	303ns	304ns	304ns	306ns	0%	█▂▄▁▅▁▄▁▅▄	nonrendering	skrp_div_uint_4
  95/93  MB	327	162ns	162ns	162ns	164ns	0%	▂▃▃▃▂█▂▁▂▁	nonrendering	skrp_div_uint_1
  95/93  MB	350	302ns	302ns	306ns	339ns	4%	▁▁▁▁▁▁▁▁▁█	nonrendering	skrp_div_int_4
  95/93  MB	606	142ns	143ns	143ns	148ns	1%	▂▂▂▂▂▁▃▁▃█	nonrendering	skrp_div_int_1

======================================================================
...

```
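
The control and with_changes tables above still have to be compared by
eye. As a rough illustration only (this is not part of
run_benchmarks.py), the sketch below parses two captured nanobench
tables and reports the change in median per benchmark. The helper names
(to_ns, parse_medians, compare) are hypothetical, and it assumes the
tab-separated, ten-column layout shown above, with "µs", "ms", and "s"
assumed as the suffixes for slower benchmarks.

```python
import re

# "ns" appears in the sample output; the other suffixes are assumptions.
_UNITS_NS = {"ns": 1.0, "µs": 1e3, "us": 1e3, "ms": 1e6, "s": 1e9}
_TIME_RE = re.compile(r"^([0-9.]+)(ns|µs|us|ms|s)$")


def to_ns(field):
    """Convert a humanized time such as '76.2ns' to nanoseconds."""
    m = _TIME_RE.match(field.strip())
    if m is None:
        raise ValueError("unrecognized time field: %r" % field)
    return float(m.group(1)) * _UNITS_NS[m.group(2)]


def parse_medians(output):
    """Map bench name -> median (ns) from captured nanobench output."""
    medians = {}
    for line in output.splitlines():
        cols = line.split("\t")
        # Skip the header, separators, and anything that is not a result row.
        if len(cols) != 10 or cols[0].startswith("curr/maxrss"):
            continue
        medians[cols[9].strip()] = to_ns(cols[3])
    return medians


def compare(control, experiment):
    """Return (bench, control_ns, experiment_ns, percent_change) rows."""
    rows = []
    for bench in sorted(control.keys() & experiment.keys()):
        c, e = control[bench], experiment[bench]
        rows.append((bench, c, e, (e - c) / c * 100.0))
    return rows
```

A negative percent_change means the with_changes binary had a lower
median than the control for that benchmark.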

I also tweaked llvm_mca_analysis.py so one can add --reset-experiments
to remove all but the first (control) set of data. This is handy
when trying a third or later experiment and the spreadsheet has
gotten unwieldy with no-longer-useful info. It doesn't remove the
assembly files (in case we do want to go back), just the data from
the .tsv.
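
To make the idea behind --reset-experiments concrete, here is a minimal
sketch and not the code in llvm_mca_analysis.py. The .tsv layout is not
described here, so the sketch assumes a hypothetical one: column 0 is a
row label, column 1 is the control, and each later column is one
experiment. The function name and the columns_to_keep parameter are
likewise illustrative.

```python
import csv
import io


def reset_experiments(tsv_text, columns_to_keep=2):
    """Drop every column after the control column from TSV text.

    Assumed (hypothetical) layout: label, control, experiment 1, ...
    """
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for row in reader:
        # Keep only the label and control columns of each row.
        writer.writerow(row[:columns_to_keep])
    return out.getvalue()
```

Only the text of the table is rewritten; nothing else on disk is
touched, which matches the intent of leaving the assembly files alone.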

Change-Id: I85362ab6fdbfffc8cca29a6ba91ea8ccc23b231d
Reviewed-on: https://skia-review.googlesource.com/c/skia/+/1283016
Auto-Submit: Kaylee Lubick <kjlubick@google.com>
Reviewed-by: Florin Malita <fmalita@google.com>
Commit-Queue: Florin Malita <fmalita@google.com>
2 files changed