Performance mystery: delete unused string constant

This commit deletes the definition of the `const char*
wuffs_base__note__i_o_redirect` global variable. This variable is not
used anywhere (after the previous commit removed references to it).

Deleting this one line of code (two if you count the declaration, not
just the definition) can have a dramatic effect on seemingly unrelated
performance micro-benchmarks. Some numbers get better (e.g. +5%), some
numbers get worse (e.g. -10%). The same micro-benchmark can get faster
on one C compiler but slower on another.

It is a mystery why this happens. This small, self-contained commit (as
well as the previous one, which set it up) will be immediately followed
by a rollback, but it is committed anyway so that we can refer to these
numbers in the git log. To reproduce:

----
$ git clone https://github.com/google/wuffs.git
$ cd wuffs
$ gcc --version | head -n 1
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0

$ # Check out the parent commit.
$ git checkout --quiet THE_HASH_OF_THIS_COMMIT^
$ gcc -O3 -Wall -std=c99 test/c/std/json.c -DWUFFS_MIMIC
$ ./a.out -bench -focus=wuffs_json_decode_1k | grep MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1125 ns/op	     756.762 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1131 ns/op	     752.652 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1131 ns/op	     752.785 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1131 ns/op	     752.697 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1132 ns/op	     752.365 MB/s

$ # Check out this commit.
$ git checkout --quiet THE_HASH_OF_THIS_COMMIT
$ gcc -O3 -Wall -std=c99 test/c/std/json.c -DWUFFS_MIMIC
$ ./a.out -bench -focus=wuffs_json_decode_1k | grep MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1200 ns/op	     709.617 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1203 ns/op	     707.651 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1203 ns/op	     707.976 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1233 ns/op	     690.663 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1203 ns/op	     708.153 MB/s

$ # Dropping the "-std=c99" also changes the numbers.

$ # Check out the parent commit.
$ git checkout --quiet THE_HASH_OF_THIS_COMMIT^
$ gcc -O3 -Wall          test/c/std/json.c -DWUFFS_MIMIC
$ ./a.out -bench -focus=wuffs_json_decode_1k | grep MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1229 ns/op	     693.226 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1230 ns/op	     692.254 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1230 ns/op	     692.486 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1234 ns/op	     689.986 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1232 ns/op	     691.480 MB/s

$ # Check out this commit.
$ git checkout --quiet THE_HASH_OF_THIS_COMMIT
$ gcc -O3 -Wall          test/c/std/json.c -DWUFFS_MIMIC
$ ./a.out -bench -focus=wuffs_json_decode_1k | grep MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1129 ns/op	     754.103 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1131 ns/op	     752.849 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1135 ns/op	     750.622 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1150 ns/op	     740.731 MB/s
Benchmarkwuffs_json_decode_1k/gcc7	 1000000	    1135 ns/op	     750.609 MB/s

$ ldd ./a.out
	linux-vdso.so.1 (0x00007ffd9b1f2000)
	libgtk3-nocsd.so.0 => /usr/lib/x86_64-linux-gnu/libgtk3-nocsd.so.0 (0x00007f9b46f60000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f9b46b6f000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f9b4696b000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f9b4674c000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f9b5b59a000)
$ rm a.out
----

Some more benchmark numbers before/after this commit:

----
name                                                old speed      new speed      delta

wuffs_deflate_decode_1k_full_init/clang5             103MB/s ± 1%   105MB/s ± 1%   +2.25%  (p=0.000 n=9+10)
wuffs_deflate_decode_1k_part_init/clang5             121MB/s ± 0%   122MB/s ± 0%     ~     (p=1.000 n=9+9)
wuffs_deflate_decode_10k_full_init/clang5            148MB/s ± 1%   149MB/s ± 0%     ~     (p=0.059 n=9+8)
wuffs_deflate_decode_10k_part_init/clang5            151MB/s ± 0%   151MB/s ± 0%     ~     (p=0.968 n=9+10)
wuffs_deflate_decode_100k_just_one_read/clang5       172MB/s ± 0%   171MB/s ± 1%   -0.44%  (p=0.001 n=9+9)
wuffs_deflate_decode_100k_many_big_reads/clang5      149MB/s ± 0%   149MB/s ± 1%   -0.56%  (p=0.000 n=10+9)

wuffs_deflate_decode_1k_full_init/gcc7               104MB/s ± 1%   102MB/s ± 1%   -2.45%  (p=0.000 n=9+10)
wuffs_deflate_decode_1k_part_init/gcc7               121MB/s ± 0%   121MB/s ± 0%   +0.46%  (p=0.000 n=9+10)
wuffs_deflate_decode_10k_full_init/gcc7              147MB/s ± 0%   153MB/s ± 0%   +3.75%  (p=0.000 n=8+10)
wuffs_deflate_decode_10k_part_init/gcc7              150MB/s ± 1%   155MB/s ± 0%   +3.98%  (p=0.000 n=9+10)
wuffs_deflate_decode_100k_just_one_read/gcc7         184MB/s ± 0%   194MB/s ± 1%   +5.43%  (p=0.000 n=9+10)
wuffs_deflate_decode_100k_many_big_reads/gcc7        154MB/s ± 0%   158MB/s ± 0%   +2.45%  (p=0.000 n=9+10)

wuffs_gif_decode_1k_bw/clang5                        281MB/s ± 1%   278MB/s ± 1%   -0.92%  (p=0.003 n=10+9)
wuffs_gif_decode_1k_color_full_init/clang5          97.8MB/s ± 1%  93.7MB/s ± 3%   -4.26%  (p=0.000 n=10+10)
wuffs_gif_decode_1k_color_part_init/clang5           117MB/s ± 1%   115MB/s ± 1%   -1.41%  (p=0.000 n=9+10)
wuffs_gif_decode_10k_bgra/clang5                     463MB/s ± 1%   455MB/s ± 1%   -1.68%  (p=0.000 n=10+10)
wuffs_gif_decode_10k_indexed/clang5                  124MB/s ± 1%   121MB/s ± 1%   -2.31%  (p=0.000 n=9+9)
wuffs_gif_decode_20k/clang5                          149MB/s ± 1%   146MB/s ± 1%   -1.95%  (p=0.000 n=10+10)
wuffs_gif_decode_100k_artificial/clang5              313MB/s ± 1%   307MB/s ± 0%   -1.94%  (p=0.000 n=9+7)
wuffs_gif_decode_100k_realistic/clang5               135MB/s ± 1%   133MB/s ± 1%   -1.18%  (p=0.000 n=10+8)
wuffs_gif_decode_1000k_full_init/clang5              136MB/s ± 1%   132MB/s ± 4%   -2.27%  (p=0.023 n=10+10)
wuffs_gif_decode_1000k_part_init/clang5              136MB/s ± 1%   132MB/s ± 3%   -2.96%  (p=0.000 n=9+10)
wuffs_gif_decode_anim_screencap/clang5               644MB/s ± 1%   621MB/s ± 3%   -3.51%  (p=0.000 n=10+10)

wuffs_gif_decode_1k_bw/gcc7                          290MB/s ± 1%   287MB/s ± 1%   -1.18%  (p=0.000 n=10+10)
wuffs_gif_decode_1k_color_full_init/gcc7            91.7MB/s ± 1%  88.2MB/s ± 1%   -3.85%  (p=0.000 n=10+10)
wuffs_gif_decode_1k_color_part_init/gcc7             107MB/s ± 1%   103MB/s ± 1%   -4.12%  (p=0.000 n=9+10)
wuffs_gif_decode_10k_bgra/gcc7                       370MB/s ± 1%   344MB/s ± 1%   -6.98%  (p=0.000 n=10+9)
wuffs_gif_decode_10k_indexed/gcc7                    113MB/s ± 2%   104MB/s ± 1%   -8.31%  (p=0.000 n=10+8)
wuffs_gif_decode_20k/gcc7                            139MB/s ± 1%   127MB/s ± 3%   -8.49%  (p=0.000 n=10+10)
wuffs_gif_decode_100k_artificial/gcc7                299MB/s ± 0%   280MB/s ± 1%   -6.43%  (p=0.000 n=9+8)
wuffs_gif_decode_100k_realistic/gcc7                 123MB/s ± 1%   111MB/s ± 1%  -10.21%  (p=0.000 n=10+8)
wuffs_gif_decode_1000k_full_init/gcc7                124MB/s ± 1%   112MB/s ± 1%   -9.34%  (p=0.000 n=10+8)
wuffs_gif_decode_1000k_part_init/gcc7                125MB/s ± 1%   112MB/s ± 1%  -10.19%  (p=0.000 n=10+9)
wuffs_gif_decode_anim_screencap/gcc7                 622MB/s ± 1%   599MB/s ± 3%   -3.68%  (p=0.000 n=9+10)

wuffs_json_decode_1k/clang5                          628MB/s ± 1%   628MB/s ± 1%     ~     (p=0.842 n=9+10)
wuffs_json_decode_21k_formatted/clang5               331MB/s ± 1%   331MB/s ± 0%     ~     (p=0.720 n=10+9)
wuffs_json_decode_26k_compact/clang5                 521MB/s ± 1%   521MB/s ± 0%     ~     (p=0.549 n=10+9)
wuffs_json_decode_217k_stringy/clang5                432MB/s ± 1%   432MB/s ± 1%     ~     (p=0.780 n=10+9)

wuffs_json_decode_1k/gcc7                            745MB/s ± 0%   703MB/s ± 0%   -5.70%  (p=0.000 n=9+10)
wuffs_json_decode_21k_formatted/gcc7                 418MB/s ± 1%   400MB/s ± 1%   -4.44%  (p=0.000 n=10+8)
wuffs_json_decode_26k_compact/gcc7                   554MB/s ± 1%   511MB/s ± 1%   -7.72%  (p=0.000 n=9+10)
wuffs_json_decode_217k_stringy/gcc7                  452MB/s ± 1%   422MB/s ± 1%   -6.58%  (p=0.000 n=9+10)
----
2 files changed
tree: 8964d6923de11f3eba9b2a99382c3fdfe66cbdbd
README.md

Wrangling Untrusted File Formats Safely

(Formerly known as Puffs: Parsing Untrusted File Formats Safely).

Wuffs is a memory-safe programming language (and a standard library written in that language) for wrangling untrusted file formats safely. Wrangling includes parsing, decoding and encoding. Example file formats include images, audio, video, fonts and compressed archives.

It is also fast. On many of its GIF decoding benchmarks, Wuffs measures 2x faster than “giflib” (C), 3x faster than “image/gif” (Go) and 7x faster than “gif” (Rust).

Goals and Non-Goals

Wuffs' goal is to produce software libraries that are as safe as Go or Rust, roughly speaking, but as fast as C, and that can be used anywhere C libraries are used. This includes very large C/C++ projects, such as popular web browsers and operating systems (using that term to include desktop and mobile user interfaces, not just the kernel).

Wuffs the Library is available as transpiled C code. Other C/C++ projects can use that library without requiring the Wuffs the Language toolchain. Those projects can use Wuffs the Library like using any other third party C library. It's just not hand-written C.

However, unlike hand-written C, Wuffs the Language is safe with respect to buffer overflows, integer arithmetic overflows and null pointer dereferences. A key difference between Wuffs and other memory-safe languages is that all such checks are done at compile time, not at run time. If it compiles, it is safe, with respect to those three bug classes.

The trade-off in aiming for both safety and speed is that Wuffs programs take longer for a programmer to write, as they have to explicitly annotate their programs with proofs of safety. A statement like x += 1 unsurprisingly means to increment the variable x by 1. However, in Wuffs, such a statement is a compile time error unless the compiler can also prove that x is not the maximal value of x's type (e.g. x is not 255 if x is a base.u8), as the increment would otherwise overflow. Similarly, an integer arithmetic expression like x / y is a compile time error unless the compiler can also prove that y is not zero.

Wuffs is not a general purpose programming language. It is for writing libraries, not programs. The idea isn't to write your whole program in Wuffs, only the parts that are both performance-conscious and security-conscious. For example, while technically possible, it is unlikely that a Wuffs compiler would be worth writing entirely in Wuffs.

What Does Wuffs Code Look Like?

The /std/lzw/decode_lzw.wuffs file is a good example. The "Wuffs the Language" document has more information on how Wuffs differs from other languages in the C family.

What Does Compile Time Checking Look Like?

For example, making this one-line edit to the LZW codec leads to a compile time error: the "wuffs gen" command fails to generate the C code, i.e. fails to compile (transpile) the Wuffs code to C code:

diff --git a/std/lzw/decode_lzw.wuffs b/std/lzw/decode_lzw.wuffs
index f878c5e..f10dcee 100644
--- a/std/lzw/decode_lzw.wuffs
+++ b/std/lzw/decode_lzw.wuffs
@@ -98,7 +98,7 @@ pub func lzw_decoder.decode?(dst ptr buf1, src ptr buf1, src_final bool)() {
                        in.dst.write?(x:s)

                        if use_save_code {
-                               this.suffixes[save_code] = c as u8
+                               this.suffixes[save_code] = (c + 1) as u8
                                this.prefixes[save_code] = prev_code as u16
                        }
$ wuffs gen std/gif
check: expression "(c + 1) as u8" bounds [1 ..= 256] is not within bounds [0 ..= 255] at
/home/n/go/src/github.com/google/wuffs/std/lzw/decode_lzw.wuffs:101. Facts:
    n_bits < 8
    c < 256
    this.stack[s] == (c as u8)
    use_save_code

In comparison, this two-line edit will compile (but the “does it decode GIF correctly” tests then fail):

diff --git a/std/lzw/decode_lzw.wuffs b/std/lzw/decode_lzw.wuffs
index f878c5e..b43443d 100644
--- a/std/lzw/decode_lzw.wuffs
+++ b/std/lzw/decode_lzw.wuffs
@@ -97,8 +97,8 @@ pub func lzw_decoder.decode?(dst ptr buf1, src ptr buf1, src_final bool)() {
                        // type checking, bounds checking and code generation for it).
                        in.dst.write?(x:s)

-                       if use_save_code {
-                               this.suffixes[save_code] = c as u8
+                       if use_save_code and (c < 200) {
+                               this.suffixes[save_code] = (c + 1) as u8
                                this.prefixes[save_code] = prev_code as u16
                        }
$ wuffs gen std/gif
gen wrote:      /home/n/go/src/github.com/google/wuffs/gen/c/gif.c
gen unchanged:  /home/n/go/src/github.com/google/wuffs/gen/h/gif.h
$ wuffs test std/gif
gen unchanged:  /home/n/go/src/github.com/google/wuffs/gen/c/gif.c
gen unchanged:  /home/n/go/src/github.com/google/wuffs/gen/h/gif.h
test:           /home/n/go/src/github.com/google/wuffs/test/c/gif
gif/basic.c     clang   PASS (8 tests run)
gif/basic.c     gcc     PASS (8 tests run)
gif/gif.c       clang   FAIL test_lzw_decode: bufs1_equal: wi: got 19311, want 19200.
contents differ at byte 3 (in hex: 0x000003):
  000000: dcdc dc00 00d9 f5f9 f6df dc5f 393a 3a3a  ..........._9:::
  000010: 3a3b 618e c8e4 e4e4 e5e4 e600 00e4 bbbb  :;a.............
  000020: eded 8f91 9191 9090 9090 9190 9192 9192  ................
  000030: 9191 9292 9191 9293 93f0 f0f0 f1f1 f2f2  ................
excerpts of got (above) versus want (below):
  000000: dcdc dcdc dcd9 f5f9 f6df dc5f 393a 3a3a  ..........._9:::
  000010: 3a3a 618e c8e4 e4e4 e5e4 e6e4 e4e4 bbbb  ::a.............
  000020: eded 8f91 9191 9090 9090 9090 9191 9191  ................
  000030: 9191 9191 9191 9193 93f0 f0f0 f1f1 f2f2  ................

gif/gif.c       gcc     FAIL test_lzw_decode: bufs1_equal: wi: got 19311, want 19200.
contents differ at byte 3 (in hex: 0x000003):
  000000: dcdc dc00 00d9 f5f9 f6df dc5f 393a 3a3a  ..........._9:::
  000010: 3a3b 618e c8e4 e4e4 e5e4 e600 00e4 bbbb  :;a.............
  000020: eded 8f91 9191 9090 9090 9190 9192 9192  ................
  000030: 9191 9292 9191 9293 93f0 f0f0 f1f1 f2f2  ................
excerpts of got (above) versus want (below):
  000000: dcdc dcdc dcd9 f5f9 f6df dc5f 393a 3a3a  ..........._9:::
  000010: 3a3a 618e c8e4 e4e4 e5e4 e6e4 e4e4 bbbb  ::a.............
  000020: eded 8f91 9191 9090 9090 9090 9191 9191  ................
  000030: 9191 9191 9191 9193 93f0 f0f0 f1f1 f2f2  ................

wuffs-test-c: some tests failed
wuffs test: some tests failed

Directory Layout

  • lang holds the Go libraries that implement Wuffs the Language: tokenizer, AST, parser, renderer, etc. The Wuffs tools are written in Go, but as mentioned above, Wuffs transpiles to C code, and Go is not necessarily involved if all you want is to use the C edition of Wuffs.
  • lib holds other Go libraries, not specific to Wuffs the Language per se.
  • internal holds internal implementation details, as per Go's internal packages convention.
  • cmd holds Wuffs the Language's command line tools, also written in Go.
  • std holds Wuffs the Library's code.
  • release holds the releases (e.g. in their C form) of Wuffs the Library.
  • test holds the regular tests for Wuffs the Library.
  • fuzz holds the fuzz tests for Wuffs the Library.
  • script holds miscellaneous utility programs.
  • doc holds documentation.
  • example holds example programs for Wuffs the Library.
  • hello-wuffs-c holds an example program for Wuffs the Language.

Documentation

The Note directory contains various short articles.

Status

Version 0.2. The API and ABI aren't stabilized yet. The compiler undoubtedly has bugs. Assertion checking needs more rigor, especially around side effects and aliasing, and being sufficiently well specified to allow alternative implementations. Lots of detail needs work, but the broad brushstrokes are there.

Discussion

The mailing list is at https://groups.google.com/forum/#!forum/wuffs.

Contributing

The CONTRIBUTING.md file contains instructions on how to file the Contributor License Agreement before sending any pull requests (PRs). Of course, if you’re new to the project, it’s usually best to discuss any proposals and reach consensus before sending your first PR.

Source code is auto-formatted.

License

Apache 2. See the LICENSE file for details.

Disclaimer

This is not an official Google product; it is just code that happens to be owned by Google.


Updated in December 2019.