Related Work

Wuffs is an industrial project, not an academic project, and this Related Work section is not as extensive or as rigorous as would be found in an academic thesis or journal article. The underlying goal is usefulness, not novelty. For Wuffs' users, it‘d be perfectly fine to have multiple implementations of a good idea, even if the later ones wouldn’t earn a Ph. D.

A number of other projects are listed below. Wuffs is not exactly like any of them, for one reason or another. We‘re not claiming that “use a state of the art theorem prover” or “write it in Rust” are unworkable approaches. They’re just different approaches, with different trade-offs.

For example, Wuffs is primarily concerned about safety and performance. Formal correctness, such as being able to mathematically express that “when this function returns, that array's elements will be sorted in increasing order”, is less of a concern for Wuffs the Language than for other projects. This is especially as higher level concerns like “correctly implements the JPEG specification” are less amenable to formalization than foundational concepts like searching and sorting. Even with something as mathematical and conceptually simple as Quicksort, compare a two-line Haskell implementation with the essay that is an Idris implementation. Proving correctness is an admirable goal, but it is not Wuffs' goal.

For Wuffs the Library, confidence about correctness is instead based on testing, both unit testing and for e.g. media codecs, corpora of test files. A small corpus is included in the Wuffs source code repository. Other, much larger corpora (e.g. based on a sample of a web crawl), are not included for download size and copyright reasons.

Static Checkers

Wuffs is a language by itself, integrated with a compiler, not something embedded in the comments of another language's program, supported by a separate program checker, as the bounds check elimination affects the generated code.

Checked C has a similar goal of safety, but aims to be backwards compatible with C, including the idiomatic ways in which C code plays fast and loose with pointers and arrays, and implicitly inserting run time checks while maintaining existing data layout (e.g. the sizeof a struct type) for interoperability with other systems. This is a much harder problem than Wuffs' approach of a new, simple, domain-specific programming language.

Extended Static Checker for Java (ESC/Java) and its successor OpenJML, which obviously target the Java programming language, similarly has to analyze a language that is more complicated than Wuffs. Java is also not commonly used for writing e.g. low level image codecs, as the language lacks unsigned integer types, it is garbage collected and idiomatic code often allocates.

Similarly, Whiley is a programming language with extended static checking. In contrast to Wuffs, its int type uses unbounded arithmetic, instead of e.g. 32-bit twos-complement representation. That can be easier for theorem provers to reason about, but this can have a performance impact when such integers are used in inner loops. In theory, a compiler could recognize “an int restricted to the range [0, 99]” as a uint8 instead of a potentially ‘big integer’, but even so, it may improve performance to explicitly use a uint32 here even when a uint8 would do.

Spark ADA targets a subset of the Ada programming language, which again is more complicated than Wuffs. It also allows for pluggable theorem provers such as Z3, similar to Dafny. This can increase programmer productivity in that it can take less effort to prove more programs safe, but it also affects reproducibility (when some programs that are provable in one programmer‘s configuration are unprovable in another) and the size of the trusted computing base. It’s also not obvious up front what effect different theorem provers, and their different heuristics, will have on compile times or proof times.

In constrast, Wuffs' proof system is fast (and dumb) instead of smart (and slow). For example, Vale (Verified Assembly Language for Everest) is a promising approach that uses Dafny to verify implementations of cryptographic algorithms. However, “Vale: Verifying High-Performance Cryptographic Assembly Code” reports in Table 1 that verification times are measured in minutes. Wuffs aims for sub-second compilation times, including proofs of (memory) safety. Figure 15 in that paper suggests that Vale's verification times can grow exponentially in some cases, and also goes on to say, “Dafny/Z3 fails to verify the AES key inversion for 6 unrolled iterations and 9 unrolled iterations, indicating that SMT solvers like Z3 are still occasionally unpredictable”. Of course, Vale is also tackling a much harder problem than Wuffs: proving correctness, not just safety.

Once again, we're not claiming that these other approaches are unworkable, or not useful, just different, with different trade-offs.

Why a New Language?

Simpler languages are easier to prove things about. Macros, inheritance, closures, generics, operator overloading, goto's and built-in container types are all useful features, but as mentioned in the top-level README, Wuffs is not a general purpose programming language. Instead, its focus is on compile-time provable memory safety, with C-like performance, for CPU-intensive file format decoders.

Nonetheless, Wuffs is an imperative language, not a functional language, despite the long history of functional languages and provability. Inner loop performance usage matters, and it is easier to match the optimization techniques of incumbent C libraries with a C-like language than with a functional language. Compared to something like F*, Idris or Liquid Haskell, Wuffs has no garbage collector overhead, as it has no garbage collector.

Memory usage also matters, again considering per-pixel costs can be multiplied by millions of pixels. It is important to represent e.g. an RGBA pixel as four u8 values (or one u32), not as four naturally-sized-for-the-CPU integers or four ‘big integers’.

F* / KreMLin is a subset of F* that is similar in some sense to Wuffs, in that it transpiles to C and cares about both safety and performance. Unlike Wuffs, it is a functional language (with dependent types), not an imperative one (with a simpler type system), and uses a sophisticated theorem prover like Z3, with the same trade-offs as discussed above for Spark ADA and Dafny. It also tackles a more complicated problem, in attempting to prove correctness properties, not just safety properties. Further reading is in “Everest: Towards a Verified, Drop-in Replacement of HTTPS” and “Verified Low-Level Programming Embedded in F*”.

Cryptol is another project that, like Everest (including F* / KreMLin and its HACL* sub-project, and Dafny / Vale), focuses on cryptographic algorithms, rather than Wuffs' focus on file formats, and also relies on a sophisticated theorem prover like Z3.

Once again, we're not claiming that these other approaches are unworkable, or not useful, just different, with different trade-offs.

Why Not ATS?

ATS (Applied Type System) is a statically typed programming language that unifies implementation with formal specification. Its goals sound similar to Wuffs in the abstract, but they differ in their approach. That‘s not to say that one is better or worse than the other, they’re just different.

Graydon Hoare, on the topic of “Extended static checking (ESC), refinement types, general dependent-typed languages” says of existing approaches, including ATS, “So far, most exercises in this space have ended in frustration... the complexity wall here can make other areas of computering [sic] look... a bit like child's play. It gets very dense, very fast.”

In contrast, Wuffs aims to be significantly simpler. The trade-off is, of course, that Wuffs is not and does not aim to be a general purpose language. ATS might be more powerful and therefore placed higher in the Blub Paradox analogy, but the flip side of that is that, unlike Lisp, ATS is much more complicated to learn and to understand, perhaps dauntingly so. For example, its manual mentions that types, sorts, views, viewtypes, dataviews, datatypes and dataviewtypes are all similar-sounding but distinct ATS concepts. As of February 2018, the ATS home page says that “the current implementation... [consists] of more than 180K lines of code” and that “in general, one should expect to encounter many unfamiliar programming concepts and features in ATS and be prepared to spend a substantial amount of time on learning them.” In the words of one commentator, “the cognitive overhead of managing your proof-threading in ATS is much higher than managing your lifetimes/borrows in Rust, which is by far the biggest cliff of Rust's learning curve. Not to mention ATS is syntactically... challenging”.

On verification, ATS' Examples says that “A fully verified implementaion of the [Fibonacci] function in ATS can now be given... Please click here if you are interested in compiling and running this example on-line.” Modifying that example‘s “N” value from 10 to 10000 yields “fibats(10000) = Infinity”, which is clearly incorrect. Somewhere along the way, there’s been a breakdown between numbers in the ideal mathematical sense (often used by proof systems) and numbers in practice (whether fixed width integers or floating point) as actually computed on by CPUs. Regardless of where that breakdown is in this case, it doesn't build confidence that, in general, ATS programs are guaranteed free of arithmetic overflow despite compiler-verified proofs.

Once again, we're not claiming that these other approaches are unworkable, or not useful, just different, with different trade-offs.

Why Not Rust?

Rust is a general purpose programming language, which aims for C-like performance in general. Wuffs aims for C-like performance specifically for the limited problem space of bits-and-bytes level CPU-intensive computation (while doing that safely). As one data point, a GIF decoding benchmark measures Wuffs as 4x faster than two separate Rust implementations, including the one that crates.io calls “gif”, and one based on Nom (discussed below).

Wuffs also transpiles to standard C99 code with no dependencies, and should run anywhere that has a C99 compliant C compiler. An existing C/C++ project, such as the Chromium web browser, can choose to replace e.g. libpng with Wuffs PNG, without needing any additional toolchains. Sure, languages like D and Rust have great binary-level interop with C code, and Mozilla are reporting progress with parsing media formats in Rust, but it‘s still an operational hurdle to grow a project’s build process and to assess build times and binary sizes.

Nom is a parser combinator library in Rust, described in “Writing parsers like it is 2017”. Wuffs differs from nom by itself, in that Wuffs is an end to end implementation, not limited to that part of a file format that is easily expressible as a formal grammar. In particular, it also handles entropy encodings such as LZW (for GIF), ZLIB (for PNG) and Huffman (for JPEG, TODO). Wuffs differs from nom combined with other Rust code (e.g. a Rust LZW implementation) in that bounds and overflow checks are not just ubiquitous but also completely eliminated at compile time.

Kaitai Struct is in a similar space, generating safe parsers for multiple target programming languages from one declarative specification. Again, Wuffs differs in that it is a complete (and performant) end to end implementation, not just for the structured parts of a file format. Repeating a point in the previous paragraph, the difficulty in decoding the GIF format isn‘t in the regularly-expressible part of the format, it’s in the LZW compression. Kaitai's GIF parser returns the compressed LZW data as an opaque blob.

Once again, we're not claiming that these other approaches are unworkable, or not useful, just different, with different trade-offs.

Updated on February 2018.