Several executables are built at different times and places in order to compile and run users' untrusted C++ code. The following executables are part of that process:
Some of these are built and included as part of the push package:
    build_release
    +---------------------------------------------+
    |                                             |
    |  fiddle_secwrap.cpp +----> fiddle_secwrap   |
    |  fiddle_run/main.go +----> fiddle_run       |
    |                                             |
    +---------------------------------------------+
The rest are built on the server as it runs:
    skia-fiddle
    +----------------------------------------------------------+
    |                                                          |
    |  gn/ninja                                                |
    |  FIDDLE_ROOT/versions/<githash>/   +-----> libskia.a     |
    |  FIDDLE_ROOT/.../fiddle_main.cpp   +-----> fiddle        |
    |                                                          |
    |                                                          |
    |  User's code written and mounted in container to         |
    |  overwrite the default tools/fiddle/draw.cpp.            |
    |                                                          |
    |                                                          |
    |  systemd-nspawn                                          |
    |    +                                                     |
    |    |                                                     |
    |    +-> fiddle_run (stdout produces JSON)                 |
    |        +          (capture stdout/stderr of child procs) |
    |        |                                                 |
    |        |  ninja                                          |
    |        +-> draw.cpp +-----> fiddle                       |
    |        |                                                 |
    |        |                                                 |
    |        +-> fiddle_secwrap                                |
    |              +                                           |
    |              |                                           |
    |              +-> fiddle                                  |
    |                                                          |
    +----------------------------------------------------------+
By default $FIDDLE_ROOT is /mnt/pd0/fiddle, but it can be another directory when running locally without systemd-nspawn.
Skia is checked out into $FIDDLE_ROOT/versions/<githash>/ and built with gn/ninja. Good builds are recorded in $FIDDLE_ROOT/goodbuilds.txt, a plain text file listing good builds in the order they were completed; new good builds are appended to the end of the file.
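Because new good builds are appended, the most recent good build is the last entry in the file. A minimal sketch of reading it, assuming one build identifier per line (the exact line format is not specified here, and `latestGoodBuild` is a hypothetical helper name):

```go
package main

import (
	"fmt"
	"strings"
)

// latestGoodBuild returns the last non-empty line of the goodbuilds.txt
// contents, which is the most recently appended build. ok is false when the
// file holds no builds. Assumption: one build identifier per line.
func latestGoodBuild(contents string) (hash string, ok bool) {
	lines := strings.Split(contents, "\n")
	for i := len(lines) - 1; i >= 0; i-- {
		if h := strings.TrimSpace(lines[i]); h != "" {
			return h, true
		}
	}
	return "", false
}

func main() {
	// Placeholder identifiers, not real Skia revisions.
	contents := "aaa111\nbbb222\nccc333\n"
	h, ok := latestGoodBuild(contents)
	fmt.Println(h, ok)
}
```

In practice the contents would come from reading $FIDDLE_ROOT/goodbuilds.txt, for example with `os.ReadFile`.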
The rest of the work, compiling the user's code and then running it, is done in the container, i.e. run in a root jail using systemd-nspawn.
In the container, / is mounted read-only. An overlay filesystem is also created to mirror /mnt/pd0/fiddle into the container, so the full contents are available to read, but any writes are directed into the temp directory created for each run.
The source for draw.cpp is mounted read-only as a file that takes the place of the default draw.cpp. The source of the mount point for draw.cpp is $FIDDLE_ROOT/tmp/<tmpdir>/, where tmpdir is unique for each requested compile. (This is just a symbolic link when not running via nspawn.)
‘fiddle’ is compiled using Ninja and run under ptrace control by fiddle_secwrap. Source images are loaded from $FIDDLE_ROOT/images. The output from running fiddle_main is piped to stdout and contains the images as base64-encoded values in JSON. The $FIDDLE_ROOT/images directory is whitelisted for access by fiddle_secwrap so that fiddle_main can read the source image.
Summary of directories and files and how they are mounted in the container:
Directory | Perms | Container |
---|---|---|
$FIDDLE_ROOT/goodbuilds.txt | R | Y |
$FIDDLE_ROOT/ | RWish | Y |
.//tools/fiddle/draw.cpp | R | Y |
$FIDDLE_ROOT/tmp/<tmpdir>/ | R | N |
$FIDDLE_ROOT/images/ | R | Y |
$FIDDLE_ROOT/bin/fiddle_secwrap | R | Y |
Where ‘RWish’ means that the directory is mounted in a way that allows reading from any file, but writes are redirected to a temp directory.
We could continuously add new builds to /versions/, but each checkout and build is ~1.3GB, so we would fill up our 1TB disk in under a year. We therefore need to keep some older builds around, but can't keep them all. Having finer-grained history for recent builds is also important, while we can tolerate gaps in older builds. That is, we don't really need both a build from 30 days ago and one from 30 days and 1 hour ago, but we would like to have almost all of the last week's worth of commits available. So we end up with a decimation strategy that is simple but also accomplishes the above goals. For example:
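One way such a decimation could look, as a hypothetical sketch rather than the policy fiddle actually implements: keep every build from the last week, and keep at most one build per week-sized window beyond that. The `build` type, the `decimate` function, and the one-week thresholds are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// build is a hypothetical record of one checked-out version.
type build struct {
	Hash string
	When time.Time
}

// decimate returns the builds to keep: every build no older than recent,
// plus the first build seen in each bucket-sized age window among the older
// ones. builds must be sorted oldest first.
func decimate(builds []build, now time.Time, recent, bucket time.Duration) []build {
	var keep []build
	seen := false
	var lastBucket int64
	for _, b := range builds {
		age := now.Sub(b.When)
		if age <= recent {
			keep = append(keep, b)
			continue
		}
		k := int64(age / bucket) // index of the age window this build falls in
		if !seen || k != lastBucket {
			keep = append(keep, b)
			lastBucket = k
			seen = true
		}
	}
	return keep
}

// demo runs decimate on placeholder data and returns the kept hashes.
func demo() []string {
	now := time.Date(2020, 1, 31, 0, 0, 0, 0, time.UTC)
	day := 24 * time.Hour
	builds := []build{
		{"a", now.Add(-30*day - time.Hour)}, // 30 days and 1 hour old
		{"b", now.Add(-30 * day)},           // same window as "a", dropped
		{"c", now.Add(-10 * day)},
		{"d", now.Add(-2 * day)},
		{"e", now.Add(-time.Hour)},
	}
	var hashes []string
	for _, b := range decimate(builds, now, 7*day, 7*day) {
		hashes = append(hashes, b.Hash)
	}
	return hashes
}

func main() {
	fmt.Println(demo()) // prints [a c d e]
}
```

With these thresholds the two builds from roughly 30 days ago collapse to one, while everything from the last week survives.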
Named fiddles are essentially soft links from a name to the fiddleHash of a fiddle. They can only be created by logged-in users, and the id of the person who created the named shortcut is attached as metadata to the file.
The URL structure of fiddle is:
/c/cbb8dee39e9f1576cd97c2d504db8eee - Direct link to a fiddle.
Links to individual resources:
    /i/cbb8dee39e9f1576cd97c2d504db8eee_raster.png
    /i/cbb8dee39e9f1576cd97c2d504db8eee_gpu.png
    /i/cbb8dee39e9f1576cd97c2d504db8eee.pdf
    /i/cbb8dee39e9f1576cd97c2d504db8eee.skp
Links to individual resources for a given commit:
    /ai/<runid>/cbb8dee39e9f1576cd97c2d504db8eee_raster.png
    /ai/<runid>/cbb8dee39e9f1576cd97c2d504db8eee_gpu.png
    /ai/<runid>/cbb8dee39e9f1576cd97c2d504db8eee.pdf
    /ai/<runid>/cbb8dee39e9f1576cd97c2d504db8eee.skp
Where runid is the commit timestamp and git hash of a particular version of Skia.
To create a new fiddle, POST JSON to /_/run of the form:
    {
      "code": "void draw(SkCanvas...",
      "width": 256,
      "height": 256,
      "source": 0
    }
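As an illustration, the request body can be produced from a small Go struct. This is a sketch under assumptions: the `runRequest` type and `encodeRunRequest` helper are hypothetical names, only the four fields shown above are modeled, and the sample `Code` value is a placeholder.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// runRequest models the four fields of the /_/run request body.
// Source is an int64 because source ids are stored as 64-bit ints.
type runRequest struct {
	Code   string `json:"code"`
	Width  int    `json:"width"`
	Height int    `json:"height"`
	Source int64  `json:"source"`
}

// encodeRunRequest serializes the request to the JSON text to be POSTed.
func encodeRunRequest(r runRequest) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	body, err := encodeRunRequest(runRequest{
		Code:   "void draw(SkCanvas* canvas) {}", // placeholder user code
		Width:  256,
		Height: 256,
		Source: 0,
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(body)
}
```

The resulting string could then be sent with `http.Post` to the server's /_/run endpoint using the `application/json` content type; the host name depends on the deployment and is not shown here.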
Embedding fiddles in iframes is done by:
/iframe/cbb8dee39e9f1576cd97c2d504db8eee
Which should really just be a version of index.html that strips out much of the surrounding elements.
Fiddles are stored in Google Storage under gs://skia-fiddle/, unlike fiddle 1.0, where they were stored in MySQL. For each fiddle we store the user's code at:
gs://skia-fiddle/fiddle/<fiddlehash>/draw.cpp
The image width, height, and source (as a 64-bit int) values are stored as metadata on the draw.cpp file.
Note that the fiddlehash must match the hash generated by fiddle 1.0, so that hash is actually the hash of the user's code with line numbers added, along with the width and height added in a comment. We also store the rendered images as directories below each fiddlehash directory:
    gs://skia-fiddle/fiddle/<fiddlehash>/<ts-hash>-<githash>/cpu.png
    gs://skia-fiddle/fiddle/<fiddlehash>/<ts-hash>-<githash>/gpu.png
    gs://skia-fiddle/fiddle/<fiddlehash>/<ts-hash>-<githash>/skp.skp
    gs://skia-fiddle/fiddle/<fiddlehash>/<ts-hash>-<githash>/pdf.pdf
Note that the directory name is the timestamp of the git commit time in RFC3339 format, followed by a dash, and then by the githash (revision) of the Skia commit. This allows the directories to be sorted quickly by name to find the most recent version of the images, which is what will be displayed by default.
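The naming scheme and the sort-by-name lookup can be sketched as follows. The helper names `runID` and `mostRecent` are hypothetical, and normalizing the timestamp to UTC is an assumption made here so that lexical order matches chronological order; the hashes are placeholders.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// runID builds "<RFC3339 commit time>-<githash>". The time is converted to
// UTC (an assumption) so that all names share one offset and sort correctly.
func runID(commitTime time.Time, gitHash string) string {
	return commitTime.UTC().Format(time.RFC3339) + "-" + gitHash
}

// mostRecent returns the lexically greatest name, or "" for an empty list.
func mostRecent(ids []string) string {
	if len(ids) == 0 {
		return ""
	}
	sorted := append([]string(nil), ids...)
	sort.Strings(sorted)
	return sorted[len(sorted)-1]
}

// demoMostRecent builds two names from placeholder commits and picks the newest.
func demoMostRecent() string {
	pst := time.FixedZone("PST", -8*3600)
	ids := []string{
		// 01:30 at UTC-8 is 09:30 UTC on the same day.
		runID(time.Date(2016, 3, 2, 1, 30, 0, 0, pst), "bbb"),
		runID(time.Date(2016, 3, 1, 12, 0, 0, 0, time.UTC), "aaa"),
	}
	return mostRecent(ids)
}

func main() {
	fmt.Println(demoMostRecent()) // prints 2016-03-02T09:30:00Z-bbb
}
```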
The only other thing that needs to be stored are the source images, which are stored as files in the /source directory:
    gs://skia-fiddle/source/1
    gs://skia-fiddle/source/2
In addition there is a text file:
gs://skia-fiddle/source/lastid.txt
That contains in text the largest ID for a source image ever used. This should be incremented and written back to Google Storage before adding a new image. Note that writing using generations can prevent the lost update problem.
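The generation-based write can be illustrated without any storage client. The sketch below uses a hypothetical in-memory `store` that stands in for the lastid.txt object and mimics a write that only succeeds if the object's generation is the one the caller read; it is single-threaded and is not the real Google Storage API.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
)

// object is a stand-in for a stored file plus its generation number.
type object struct {
	data       string
	generation int64
}

// store is a hypothetical in-memory stand-in for the bucket entry holding
// lastid.txt. It is not safe for concurrent use.
type store struct{ obj object }

var errGenerationMismatch = errors.New("generation mismatch")

func (s *store) read() object { return s.obj }

// writeIfGeneration writes only when gen is still the current generation,
// so a writer holding stale data cannot overwrite a newer value.
func (s *store) writeIfGeneration(data string, gen int64) error {
	if s.obj.generation != gen {
		return errGenerationMismatch
	}
	s.obj = object{data: data, generation: gen + 1}
	return nil
}

// nextID reserves a new source image id, retrying if another writer won.
func nextID(s *store) (int, error) {
	for {
		cur := s.read()
		id, err := strconv.Atoi(cur.data)
		if err != nil {
			return 0, err
		}
		id++
		err = s.writeIfGeneration(strconv.Itoa(id), cur.generation)
		if err == nil {
			return id, nil
		}
		if !errors.Is(err, errGenerationMismatch) {
			return 0, err
		}
	}
}

func main() {
	s := &store{obj: object{data: "2", generation: 1}}
	stale := s.read() // a second writer reads before the first one writes
	id, err := nextID(s)
	fmt.Println(id, err) // prints 3 <nil>
	// The second writer's stale write is rejected instead of losing the update.
	fmt.Println(s.writeIfGeneration("3", stale.generation)) // prints generation mismatch
}
```

Against Google Storage the same idea would be expressed with a generation-match precondition on the write, followed by a re-read and retry on failure.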
Named fiddles are actually just like soft links from a name to the fiddleHash of a fiddle. The named fiddles are stored in:
gs://skia-fiddle/named/<fiddle name>
Where the name of the fiddle is the filename, and the contents of the file is the fiddleHash. The id of the person that created the named shortcut is attached as metadata to the file.
An attached disk will reside at /mnt/pd0 and will be populated as:
    /mnt/pd0/fiddle - $FIDDLE_ROOT
    /mnt/pd0/container
    /mnt/pd0/fiddle/depot_tools
During instance startup, git, systemd-container, and depot_tools will be installed.
The container image and all other exes will be installed via push.
We're putting a C++ compiler on the web, and promising to run the results of user submitted code, so security is a large concern. Security is handled in a layered approach, using a combination of seccomp-bpf, chroot jail and rlimits.
seccomp-bpf - Used to limit the types of system calls that the user code can make. Any attempts to make a system call that isn't allowed causes the application to terminate immediately. Seccomp-bpf and ptrace are used from fiddle_secwrap.cpp.
chroot jail - The code is run in a chroot jail via systemd-nspawn, making the rest of the operating system files unreachable from the running code. fiddle_run is launched inside the container by systemd-nspawn.
rlimits - Used to limit the resources the running code can get access to; for example, runtime is limited to 10 seconds of CPU time. The limits are set in fiddle_run.
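As a rough illustration of the rlimits layer, the sketch below caps CPU time at 10 seconds using Go's `syscall` package on Linux. The helper names `cpuLimit` and `applyCPULimit` are hypothetical, and this is not taken from fiddle_run itself; it only shows the mechanism. Resource limits are inherited by child processes, so a limit set before launching a child also applies to that child.

```go
package main

import (
	"fmt"
	"syscall"
)

// cpuLimit returns an rlimit whose soft and hard caps are both the given
// number of CPU seconds.
func cpuLimit(seconds uint64) syscall.Rlimit {
	return syscall.Rlimit{Cur: seconds, Max: seconds}
}

// applyCPULimit installs the CPU time limit on the current process.
// Child processes started afterwards inherit it.
func applyCPULimit(seconds uint64) error {
	lim := cpuLimit(seconds)
	return syscall.Setrlimit(syscall.RLIMIT_CPU, &lim)
}

func main() {
	if err := applyCPULimit(10); err != nil {
		fmt.Println("setrlimit failed:", err)
		return
	}
	var got syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_CPU, &got); err != nil {
		fmt.Println("getrlimit failed:", err)
		return
	}
	fmt.Printf("CPU limit: soft=%ds hard=%ds\n", got.Cur, got.Max)
}
```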
A backup of all the named fiddles from gs://skia-fiddle/named to gs://skia-fiddle-backup takes place on a daily basis. See:
https://console.cloud.google.com/storage/transfer?project=google.com:skia-buildbots