tree 218647b445aa249a5cd512213b02b6a0f7fa8a12
parent 8b7f82142c90503e1b9ac6c9ae2cdcce63a8dd01
author Nigel Tao <nigeltao@golang.org> 1666223638 +1100
committer Nigel Tao <nigeltao@golang.org> 1666225089 +1100

Avoid (NULL + 0) in derived io_buffer variables

This addresses a "runtime error: applying zero offset to null pointer"
UBSAN (Undefined Behavior Sanitizer) warning using clang 15.0.1:
https://logs.chromium.org/logs/skia/5e005cc1b1981011/+/steps/dm/0/stdout

The offending line of code was "io1_a_dst = io0_a_dst + a_dst->meta.wi"
in wuffs_lzw__decoder__write_to. When io0_a_dst is NULL and meta.wi is
zero, that computes "io1_a_dst = NULL + 0", which was Undefined Behavior
even though io1_a_dst was never dereferenced.

This commit avoids initializing io1_a_dst to (NULL + 0) when io0_a_dst
is NULL, falling back to initializing it with just NULL.
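
A minimal sketch of the guard, not the generated Wuffs code itself. The
example_io_buffer type and the example_write_cursor helper are
hypothetical names used only for illustration; the point is that the
pointer addition is skipped entirely when the base pointer is NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for an io_buffer: only the fields needed here.
typedef struct {
  uint8_t* ptr;  // May be NULL for an empty buffer.
  size_t len;    // Number of bytes that ptr points to.
  size_t wi;     // Write index, with 0 <= wi <= len.
} example_io_buffer;

// Returns (ptr + wi), or NULL when ptr is NULL, so that (NULL + 0) is
// never evaluated.
static uint8_t* example_write_cursor(const example_io_buffer* b) {
  assert(b->wi <= b->len);
  return b->ptr ? (b->ptr + b->wi) : NULL;
}
```

For a non-NULL buffer the result is the same as the unguarded
expression, so callers that never dereference the NULL case should see
no change in behavior.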

As discussed in https://reviews.llvm.org/D67122, while (nullptr + 0)
is Defined Behavior according to the C++ spec, (NULL + 0) is Undefined
Behavior (by omission) in C11 6.5.6/8: "If both the pointer operand and
the result point to elements of the same array object, or one past the
last element of the array object, the evaluation shall not produce an
overflow; otherwise, the behavior is undefined."

The benchmarks (an excerpt of the full suite is below) seem quite
sensitive to this simple change, which is outside of hot loops. Some
results are much better (wuffs_lzw_decode_100k/clang11 is +10.49%) and
some are much worse (wuffs_gif_decode_1k_bw/clang11 is -17.46%). I don't
know why.

name                                               old speed       new speed      delta

wuffs_bzip2_decode_10k/clang11                    63.1MB/s ± 0%   58.4MB/s ± 0%   -7.43%  (p=0.008 n=5+5)
wuffs_bzip2_decode_100k/clang11                   49.4MB/s ± 0%   46.2MB/s ± 0%   -6.35%  (p=0.008 n=5+5)

wuffs_bzip2_decode_10k/gcc10                      60.9MB/s ± 0%   56.3MB/s ± 0%   -7.59%  (p=0.016 n=5+4)
wuffs_bzip2_decode_100k/gcc10                     49.6MB/s ± 0%   47.0MB/s ± 0%   -5.15%  (p=0.008 n=5+5)

wuffs_deflate_decode_1k_full_init/clang11          196MB/s ± 0%    195MB/s ± 1%   -0.63%  (p=0.008 n=5+5)
wuffs_deflate_decode_1k_part_init/clang11          226MB/s ± 0%    224MB/s ± 0%   -0.95%  (p=0.008 n=5+5)
wuffs_deflate_decode_10k_full_init/clang11         409MB/s ± 0%    420MB/s ± 0%   +2.84%  (p=0.008 n=5+5)
wuffs_deflate_decode_10k_part_init/clang11         418MB/s ± 0%    431MB/s ± 0%   +2.97%  (p=0.008 n=5+5)
wuffs_deflate_decode_100k_just_one_read/clang11    517MB/s ± 1%    542MB/s ± 0%   +4.78%  (p=0.008 n=5+5)
wuffs_deflate_decode_100k_many_big_reads/clang11   330MB/s ± 0%    338MB/s ± 0%   +2.45%  (p=0.008 n=5+5)

wuffs_deflate_decode_1k_full_init/gcc10            188MB/s ± 0%    177MB/s ± 0%   -5.38%  (p=0.008 n=5+5)
wuffs_deflate_decode_1k_part_init/gcc10            218MB/s ± 0%    209MB/s ± 0%   -4.11%  (p=0.016 n=4+5)
wuffs_deflate_decode_10k_full_init/gcc10           402MB/s ± 0%    407MB/s ± 1%   +1.19%  (p=0.008 n=5+5)
wuffs_deflate_decode_10k_part_init/gcc10           413MB/s ± 0%    419MB/s ± 1%   +1.64%  (p=0.008 n=5+5)
wuffs_deflate_decode_100k_just_one_read/gcc10      520MB/s ± 0%    532MB/s ± 0%   +2.25%  (p=0.008 n=5+5)
wuffs_deflate_decode_100k_many_big_reads/gcc10     330MB/s ± 0%    335MB/s ± 0%   +1.56%  (p=0.008 n=5+5)

wuffs_gif_decode_1k_bw/clang11                     781MB/s ± 0%    645MB/s ± 0%  -17.46%  (p=0.008 n=5+5)
wuffs_gif_decode_1k_color_full_init/clang11        176MB/s ± 0%    181MB/s ± 0%   +3.11%  (p=0.008 n=5+5)
wuffs_gif_decode_1k_color_part_init/clang11        215MB/s ± 0%    223MB/s ± 0%   +3.46%  (p=0.008 n=5+5)
wuffs_gif_decode_10k_bgra/clang11                  786MB/s ± 0%    801MB/s ± 0%   +1.94%  (p=0.008 n=5+5)
wuffs_gif_decode_10k_indexed/clang11               208MB/s ± 0%    211MB/s ± 0%   +1.44%  (p=0.008 n=5+5)
wuffs_gif_decode_20k/clang11                       261MB/s ± 0%    252MB/s ± 0%   -3.43%  (p=0.008 n=5+5)
wuffs_gif_decode_100k_artificial/clang11           578MB/s ± 0%    583MB/s ± 0%   +0.86%  (p=0.008 n=5+5)
wuffs_gif_decode_100k_realistic/clang11            224MB/s ± 0%    223MB/s ± 0%   -0.79%  (p=0.008 n=5+5)
wuffs_gif_decode_1000k_full_init/clang11           229MB/s ± 0%    225MB/s ± 0%   -1.73%  (p=0.008 n=5+5)
wuffs_gif_decode_1000k_part_init/clang11           229MB/s ± 0%    225MB/s ± 0%   -1.69%  (p=0.008 n=5+5)
wuffs_gif_decode_anim_screencap/clang11           1.31GB/s ± 0%   1.29GB/s ± 0%   -1.06%  (p=0.008 n=5+5)

wuffs_gif_decode_1k_bw/gcc10                       650MB/s ± 0%    631MB/s ± 0%   -3.03%  (p=0.008 n=5+5)
wuffs_gif_decode_1k_color_full_init/gcc10          169MB/s ± 0%    168MB/s ± 0%   -0.51%  (p=0.008 n=5+5)
wuffs_gif_decode_1k_color_part_init/gcc10          202MB/s ± 0%    202MB/s ± 0%   -0.18%  (p=0.008 n=5+5)
wuffs_gif_decode_10k_bgra/gcc10                    766MB/s ± 0%    790MB/s ± 0%   +3.08%  (p=0.008 n=5+5)
wuffs_gif_decode_10k_indexed/gcc10                 202MB/s ± 0%    208MB/s ± 0%   +3.16%  (p=0.008 n=5+5)
wuffs_gif_decode_20k/gcc10                         250MB/s ± 0%    258MB/s ± 0%   +3.20%  (p=0.008 n=5+5)
wuffs_gif_decode_100k_artificial/gcc10             563MB/s ± 0%    566MB/s ± 0%   +0.57%  (p=0.008 n=5+5)
wuffs_gif_decode_100k_realistic/gcc10              217MB/s ± 0%    220MB/s ± 0%   +1.49%  (p=0.008 n=5+5)
wuffs_gif_decode_1000k_full_init/gcc10             220MB/s ± 0%    224MB/s ± 0%   +1.49%  (p=0.008 n=5+5)
wuffs_gif_decode_1000k_part_init/gcc10             220MB/s ± 0%    224MB/s ± 0%   +1.43%  (p=0.008 n=5+5)
wuffs_gif_decode_anim_screencap/gcc10             1.30GB/s ± 0%   1.30GB/s ± 0%   +0.34%  (p=0.008 n=5+5)

wuffs_lzw_decode_20k/clang11                       293MB/s ± 0%    292MB/s ± 0%   -0.41%  (p=0.008 n=5+5)
wuffs_lzw_decode_100k/clang11                      486MB/s ± 0%    537MB/s ± 0%  +10.49%  (p=0.008 n=5+5)

wuffs_lzw_decode_20k/gcc10                         258MB/s ± 0%    276MB/s ± 0%   +7.19%  (p=0.016 n=4+5)
wuffs_lzw_decode_100k/gcc10                        512MB/s ± 0%    527MB/s ± 0%   +3.08%  (p=0.016 n=4+5)
