Drop unnecessary memcpy in do_bench_image_decode

This doesn't affect the std/bmp or std/gif implementations per se, only
the code that benches them. Prior to this commit, the source file was
first decoded into a pixel buffer and then always copied from the
(2-dimensional) wuffs_base__pixel_buffer to a (1-dimensional)
wuffs_base__io_buffer. That copy is useful for *tests*, which compare
that io_buffer's contents to a known good result (i.e. a separate I/O
buffer filled with the contents of a golden file), but it isn't
necessary for *benches*. Dropping that unnecessary copy reduces the
total time per iteration, and therefore increases the reported speed.
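
As a rough illustration of the idea (this is not the actual testlib.c
code), the sketch below makes the flattening copy optional: passing a
NULL destination skips it. The pixel_rows type and the
flatten_if_wanted function are hypothetical names invented for this
example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical stand-in for a 2-dimensional pixel buffer: `height` rows of
// `width` bytes each, with consecutive rows starting `stride` bytes apart.
typedef struct {
  const uint8_t* ptr;
  size_t width;
  size_t height;
  size_t stride;
} pixel_rows;

// Copies the rows of pb, back to back, into the 1-dimensional dst and returns
// the number of bytes copied. A NULL dst means "the caller does not want the
// flattened bytes", so no copy is made and 0 is returned.
size_t flatten_if_wanted(uint8_t* dst, size_t dst_len, const pixel_rows* pb) {
  if (dst == NULL) {
    return 0;  // Bench-style call: skip the copy entirely.
  }
  size_t n = 0;
  for (size_t y = 0; y < pb->height; y++) {
    if (pb->width > (dst_len - n)) {
      break;  // Out of room: stop rather than overflow dst.
    }
    memcpy(dst + n, pb->ptr + (y * pb->stride), pb->width);
    n += pb->width;
  }
  return n;
}
```

A test-style caller would pass a real destination and compare its
contents against a golden file, whereas a bench-style caller would pass
NULL so that only the decode itself is timed.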

name old speed new speed delta
wuffs_bmp_decode_40k/clang5 2.09GB/s ± 2% 2.48GB/s ± 0% +18.78% (p=0.008 n=5+5)
wuffs_bmp_decode_40k/gcc7 1.73GB/s ± 1% 1.93GB/s ± 0% +11.54% (p=0.008 n=5+5)
wuffs_gif_decode_1k_bw/clang5 291MB/s ± 0% 376MB/s ± 1% +29.14% (p=0.008 n=5+5)
wuffs_gif_decode_1k_color_full_init/clang5 95.3MB/s ± 1% 103.2MB/s ± 1% +8.37% (p=0.008 n=5+5)
wuffs_gif_decode_1k_color_part_init/clang5 117MB/s ± 0% 130MB/s ± 0% +10.68% (p=0.008 n=5+5)
wuffs_gif_decode_10k_bgra/clang5 469MB/s ± 0% 489MB/s ± 1% +4.39% (p=0.008 n=5+5)
wuffs_gif_decode_10k_indexed/clang5 126MB/s ± 1% 130MB/s ± 1% +3.50% (p=0.008 n=5+5)
wuffs_gif_decode_20k/clang5 152MB/s ± 0% 156MB/s ± 1% +2.77% (p=0.008 n=5+5)
wuffs_gif_decode_100k_artificial/clang5 318MB/s ± 0% 329MB/s ± 0% +3.53% (p=0.008 n=5+5)
wuffs_gif_decode_100k_realistic/clang5 137MB/s ± 0% 139MB/s ± 0% +1.40% (p=0.008 n=5+5)
wuffs_gif_decode_1000k_full_init/clang5 138MB/s ± 0% 140MB/s ± 1% +1.45% (p=0.008 n=5+5)
wuffs_gif_decode_1000k_part_init/clang5 139MB/s ± 0% 140MB/s ± 1% +1.03% (p=0.008 n=5+5)
wuffs_gif_decode_anim_screencap/clang5 661MB/s ± 0% 747MB/s ± 1% +13.08% (p=0.008 n=5+5)
mimic_gif_decode_1k_bw/clang5 92.0MB/s ± 0% 91.9MB/s ± 1% ~ (p=0.841 n=5+5)
mimic_gif_decode_1k_color/clang5 45.5MB/s ± 0% 45.7MB/s ± 0% +0.45% (p=0.032 n=5+4)
mimic_gif_decode_10k_indexed/clang5 53.5MB/s ± 0% 53.7MB/s ± 0% +0.31% (p=0.032 n=5+4)
mimic_gif_decode_20k/clang5 55.5MB/s ± 0% 55.1MB/s ± 2% ~ (p=0.690 n=5+5)
mimic_gif_decode_100k_artificial/clang5 88.8MB/s ± 0% 89.0MB/s ± 1% ~ (p=0.151 n=5+5)
mimic_gif_decode_100k_realistic/clang5 53.5MB/s ± 1% 53.7MB/s ± 0% +0.31% (p=0.024 n=5+5)
mimic_gif_decode_1000k/clang5 54.6MB/s ± 0% 54.6MB/s ± 1% ~ (p=0.222 n=5+5)
mimic_gif_decode_anim_screencap/clang5 106MB/s ± 0% 109MB/s ± 0% +2.59% (p=0.008 n=5+5)
wuffs_gif_decode_1k_bw/gcc7 284MB/s ± 0% 356MB/s ± 0% +25.33% (p=0.016 n=4+5)
wuffs_gif_decode_1k_color_full_init/gcc7 82.1MB/s ± 0% 95.3MB/s ± 0% +16.08% (p=0.008 n=5+5)
wuffs_gif_decode_1k_color_part_init/gcc7 97.5MB/s ± 0% 117.4MB/s ± 0% +20.39% (p=0.008 n=5+5)
wuffs_gif_decode_10k_bgra/gcc7 345MB/s ± 0% 396MB/s ± 0% +14.71% (p=0.016 n=4+5)
wuffs_gif_decode_10k_indexed/gcc7 104MB/s ± 0% 121MB/s ± 0% +16.38% (p=0.008 n=5+5)
wuffs_gif_decode_20k/gcc7 131MB/s ± 0% 149MB/s ± 0% +13.55% (p=0.008 n=5+5)
wuffs_gif_decode_100k_artificial/gcc7 293MB/s ± 0% 321MB/s ± 1% +9.77% (p=0.008 n=5+5)
wuffs_gif_decode_100k_realistic/gcc7 116MB/s ± 1% 130MB/s ± 1% +12.10% (p=0.008 n=5+5)
wuffs_gif_decode_1000k_full_init/gcc7 117MB/s ± 1% 132MB/s ± 1% +12.53% (p=0.008 n=5+5)
wuffs_gif_decode_1000k_part_init/gcc7 117MB/s ± 0% 132MB/s ± 1% +12.20% (p=0.008 n=5+5)
wuffs_gif_decode_anim_screencap/gcc7 615MB/s ± 2% 711MB/s ± 2% +15.53% (p=0.008 n=5+5)
mimic_gif_decode_1k_bw/gcc7 91.4MB/s ± 0% 91.7MB/s ± 1% ~ (p=0.095 n=5+5)
mimic_gif_decode_1k_color/gcc7 46.3MB/s ± 0% 46.2MB/s ± 2% ~ (p=0.841 n=5+5)
mimic_gif_decode_10k_indexed/gcc7 51.9MB/s ± 1% 51.7MB/s ± 1% ~ (p=0.310 n=5+5)
mimic_gif_decode_20k/gcc7 54.2MB/s ± 0% 54.2MB/s ± 1% ~ (p=0.421 n=5+5)
mimic_gif_decode_100k_artificial/gcc7 88.4MB/s ± 1% 88.7MB/s ± 1% ~ (p=0.222 n=5+5)
mimic_gif_decode_100k_realistic/gcc7 53.7MB/s ± 0% 53.7MB/s ± 0% ~ (p=0.206 n=5+5)
mimic_gif_decode_1000k/gcc7 53.8MB/s ± 2% 54.0MB/s ± 2% ~ (p=0.690 n=5+5)
mimic_gif_decode_anim_screencap/gcc7 106MB/s ± 0% 109MB/s ± 0% +2.79% (p=0.008 n=5+5)
diff --git a/doc/changelog.md b/doc/changelog.md
index eef47f0..5fe6a86 100644
--- a/doc/changelog.md
+++ b/doc/changelog.md
@@ -25,6 +25,7 @@
- Added single-quoted strings.
- Added tokens.
- Changed `gif.decoder_workbuf_len_max_incl_worst_case` from 1 to 0.
+- Changed what the `std/gif` benchmarks actually measure.
- Made `wuffs_base__pixel_format` a struct.
- Made `wuffs_base__pixel_subsampling` a struct.
- Made `wuffs_base__status` a struct.
diff --git a/test/c/testlib/testlib.c b/test/c/testlib/testlib.c
index cfef4c6..d7cd760 100644
--- a/test/c/testlib/testlib.c
+++ b/test/c/testlib/testlib.c
@@ -1017,9 +1017,6 @@
size_t src_ri,
size_t src_wi,
uint64_t iters_unscaled) {
- wuffs_base__io_buffer have = ((wuffs_base__io_buffer){
- .data = g_have_slice_u8,
- });
wuffs_base__io_buffer src = ((wuffs_base__io_buffer){
.data = g_src_slice_u8,
});
@@ -1030,10 +1027,9 @@
uint64_t i;
uint64_t iters = iters_unscaled * g_flags.iterscale;
for (i = 0; i < iters; i++) {
- have.meta.wi = 0;
src.meta.ri = src_ri;
CHECK_STRING(
- (*decode_func)(&n_bytes, &have, wuffs_initialize_flags, pixfmt, &src));
+ (*decode_func)(&n_bytes, NULL, wuffs_initialize_flags, pixfmt, &src));
}
bench_finish(iters, n_bytes);
return NULL;