DESIGN

Overview

Provides an interactive dashboard for Skia performance data.

Code Locations

The code for the server along with VM instance setup scripts is kept in:

https://skia.googlesource.com/buildbot/+/master/perf/server

Architecture

This is the general flow of data for the Skia performance application. The frontend is available at http://skiaperf.com.

        +-------------+
        |   Browser   |
        +------^------+
               |
    +----------+------------------------------------+
    | GCE Instance|skia+perf+b                      |
    |          |                                    |
    |   +------+------+                             |
    |   |   Squid3    |                             |
    |   +------^------+                             |
    |          |                                    |
    |   +------+------+    +--------------------+   |
    |   |  Perf (Go)  |    | Tile Pipeline (Go) |   |
    |   +--^-------^--+    +----+----------^----+   |
    |      |       |            |          |        |
    +------+-------+------------+----------+--------+
           |       |            |          |
      +----+---+ +-+------------v-+  +-----+----+
      | MySQL  | |   Tile Repo    |  | BigQuery |
      +--------+ +----------------+  +----------+

Perf is a Go application that serves the HTML, CSS, JS and the JSON representations that the JS needs. It loads test results in the form of ‘tiles’ from the Tile Repo. It combines that data with data about commits and annotations from the MySQL database and serves the result to the UI.

The Tile Pipeline is a separate application that periodically queries for fresh data from BigQuery and then writes Tiles into the Tile Repo. Note that when ingestion moves out of prod and into the same server we can do the tile updates immediately after ingestion is done.

Tile Repo will be represented internally as an interface. The first implementation will be files on the local disk, in a directory tree that contains gzipped JSON files called tiles. Note that we may alternatively use Go native gob encoded files and just transform them into JSON when serving them to the UI.
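A minimal sketch of what such an interface and the gzipped JSON encoding could look like. The names Tile, TileRepo, EncodeTile and DecodeTile, and the fields of Tile, are assumptions made for illustration; the design does not fix them.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
)

// Tile is a hypothetical in-memory form of one tile. The field names are
// assumptions for this sketch, not a defined schema.
type Tile struct {
	Scale  int                  `json:"scale"`
	Index  int                  `json:"index"`
	Traces map[string][]float64 `json:"traces"` // trace key -> up to 16 points
}

// TileRepo hides where tiles live, so a local-disk implementation can be
// swapped for another store later.
type TileRepo interface {
	Get(dataset string, scale, index int) (*Tile, error)
	Put(dataset string, t *Tile) error
}

// EncodeTile serializes a tile as gzipped JSON, the format a file-backed
// TileRepo would write to disk.
func EncodeTile(t *Tile) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(t); err != nil {
		return nil, err
	}
	// Close flushes the remaining compressed data and the gzip footer.
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// DecodeTile reverses EncodeTile.
func DecodeTile(b []byte) (*Tile, error) {
	zr, err := gzip.NewReader(bytes.NewReader(b))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	t := &Tile{}
	if err := json.NewDecoder(zr).Decode(t); err != nil {
		return nil, err
	}
	return t, nil
}
```

Keeping the encoding separate from the interface means a switch to gob only touches EncodeTile and DecodeTile.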

Each tile contains exactly 16 points of every trace for a dataset. The one exception is the last.gz tile, which may contain fewer than 16 points; see below for an explanation. The Tile Repo directory structure is:

$TILE_REPO_ROOT/<dataset>/<scale>/<tilenumber>.gz

Where:

dataset = {skps|micro}
scale = 0..5 The exponent N of the scale factor 4^N, so tiles in the /0/ directory map 1:1 to test results, while tiles in the /1/ directory have every fourth commit with data, and tiles in /2/ have every 16th commit with data.
tilenumber = The number of the tile, at the given scale, starting at BOT (Beginning of Time).

So the data in:

/skps/0/0.gz

contains the data for the first 16 commits from the BOT that have test data.

/micro/3/6.gz

contains 16 points per trace, sampled from every 64th commit that has data, and is the 7th tile at that scale. Each point falls between commit 6 * 64 * 16 = 6144 and commit 7 * 64 * 16 = 7168, i.e. the tile is a slice of 16 commits that represents a range of 1024 commits.
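The path layout and the range arithmetic above can be sketched in Go as follows. TilePath and CommitRange are hypothetical helper names, and the half-open range convention is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// TileSize is the number of points per trace in a full tile.
const TileSize = 16

// TilePath builds $TILE_REPO_ROOT/<dataset>/<scale>/<tilenumber>.gz.
func TilePath(root, dataset string, scale, tileNumber int) string {
	return filepath.Join(root, dataset, fmt.Sprint(scale), fmt.Sprintf("%d.gz", tileNumber))
}

// CommitRange returns the half-open range [begin, end) of commits with
// data, counted from BOT, that the given tile spans.
func CommitRange(scale, tileNumber int) (begin, end int) {
	stride := 1 << (2 * uint(scale)) // 4^scale commits between adjacent points
	span := stride * TileSize        // commits covered by one tile
	return tileNumber * span, (tileNumber + 1) * span
}
```

For example, CommitRange(3, 6) yields 6144 and 7168, matching the /micro/3/6.gz case.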

A manifest file will be available that gives the commit timestamp range for each tile.

When navigating the UI users can select the tiles they are looking at (<, >) and also change the scaling factor that they are looking at (+,-).

Each /dataset/scale directory also contains a file, last.gz, that contains the most recent data, from 1 to 16 points. The Tile Pipeline process will update the last.gz file in each directory and write new tile files as new data arrives. last.gz will contain the id of the tile that appears just before it, so the UI can request /skps/0/last.gz and know how to proceed from there.

URL Structure

The URL structure for retrieving Datasets is TBD.

Navigating

To zoom out, add 1 to the scale factor and divide the tile number by four (rounding down), since each scale step makes a tile cover four times as many commits. Do the opposite to zoom in. To move forward or backward in time, add 1 to or subtract 1 from the tile number. The actual UI mechanisms for navigating around traces are TBD; this is just a description of how the tiles are arranged.
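A small sketch of that tile arithmetic. It assumes a factor of four between adjacent scales, which follows from the 4^N scale factor and from /1/0.gz covering /0/0.gz through /0/3.gz. TileID and its methods are hypothetical names, and clamping at the ends is a choice made here, not something the design specifies.

```go
package main

const (
	MinScale = 0
	MaxScale = 5
	Fanout   = 4 // tiles at scale N covered by one tile at scale N+1
)

// TileID identifies one tile within a dataset.
type TileID struct {
	Scale  int
	Number int
}

// ZoomOut returns the coarser tile that covers t.
func (t TileID) ZoomOut() TileID {
	if t.Scale >= MaxScale {
		return t
	}
	return TileID{Scale: t.Scale + 1, Number: t.Number / Fanout}
}

// ZoomIn returns the first of the four finer tiles covered by t.
func (t TileID) ZoomIn() TileID {
	if t.Scale <= MinScale {
		return t
	}
	return TileID{Scale: t.Scale - 1, Number: t.Number * Fanout}
}

// Next moves one tile forward in time at the same scale.
func (t TileID) Next() TileID {
	return TileID{Scale: t.Scale, Number: t.Number + 1}
}

// Prev moves one tile back in time, stopping at BOT.
func (t TileID) Prev() TileID {
	if t.Number == 0 {
		return t
	}
	return TileID{Scale: t.Scale, Number: t.Number - 1}
}
```

ZoomIn lands on the first child tile; a UI would likely pick whichever of the four children contains the point the user is looking at.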

Tile Pipeline Algorithm

Start at /0/last.gz and find the previous tilenumber.gz, open that and find the last githash. Query for all data newer than that githash. Group into tiles of 16 githashes. Put the remainder (or the last 16 if the remainder is 0) into /0/last.gz.
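The grouping step can be sketched as below. SplitIntoTiles is a hypothetical name, and the sketch assumes that when the commits divide evenly the final group of 16 goes to last.gz instead of a numbered tile, so that last.gz always holds 1 to 16 points.

```go
package main

// TileSize is the number of githashes in a full tile.
const TileSize = 16

// SplitIntoTiles takes githashes ordered oldest first and returns the
// complete tiles to write plus the trailing githashes for last.gz.
func SplitIntoTiles(hashes []string) (full [][]string, last []string) {
	if len(hashes) == 0 {
		return nil, nil
	}
	// The remainder goes to last.gz, or a full tile's worth when the
	// githashes divide evenly (assumption, see the paragraph above).
	n := len(hashes) % TileSize
	if n == 0 {
		n = TileSize
	}
	cut := len(hashes) - n
	for i := 0; i < cut; i += TileSize {
		full = append(full, hashes[i:i+TileSize])
	}
	return full, hashes[cut:]
}
```

With 35 new githashes this produces two complete tiles and a last.gz of 3; with 32 it produces one complete tile and a last.gz of 16.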

Now do the same for /1/, but the data comes from /0/ and not from BigQuery. I.e. /1/0.gz is just a sampling of /0/0.gz, /0/1.gz, /0/2.gz and /0/3.gz. At each level only rewrite last.gz and write out any new complete tiles as they are filled.
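A sketch of the sampling step for one trace. Downsample is a hypothetical name, and keeping the first commit of each group of four (rather than, say, the last) is an assumption; the design only says the higher level is a sampling of the lower one.

```go
package main

// Fanout is the number of lower-scale tiles sampled into one tile.
const Fanout = 4

// Downsample walks the points of consecutive lower-scale tiles in order
// and keeps every Fanout-th point, starting with the first.
func Downsample(lower [][]float64) []float64 {
	var out []float64
	i := 0 // position across all input tiles
	for _, tile := range lower {
		for _, v := range tile {
			if i%Fanout == 0 {
				out = append(out, v)
			}
			i++
		}
	}
	return out
}
```

Four full tiles of 16 points each reduce to a single tile of 16 points.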

Perf Stats Database

The data for the performance metrics are kept in the BigQuery tables stored in the google.com:chrome-skia project. Note that this is a different project from where the data is accessed, which is by VM instances running under the google.com:skia-buildbots project. For this to work the service account email of the VM needs to be added to the permissions group of the google.com:chrome-skia project. If this isn't done then the BigQuery access will fail with a 403 error.

Logs

We use glog (https://github.com/golang/glog) for logging, which puts Google style Error, Warning and Info logs in /tmp on the server under the ‘perf’ account.

Debugging Tips

The application is started via /etc/init.d/perf, which backgrounds the process itself using start-stop-daemon. As a result, if the app crashes on startup, nothing makes it to the logs. To debug the cause in that case, edit /etc/init.d/perf, remove the --background flag, and then run:

$ sudo /etc/init.d/perf start

And you should get stdout and stderr output.

Monitoring

Monitoring of the application is done via Graphite at http://skiamonitor.com. Both system and application level metrics are monitored.

Annotations Database

A Cloud SQL (a cloud version of MySQL) database is used to keep information on Skia git revisions and their corresponding annotations. The database will be updated when users add/edit/delete annotations via the dashboard UI.

All passwords for MySQL are stored in valentine (search “skia perf”).

To connect to the database from an authorized network (including the skia-perf GCE instance):

$ mysql -h 173.194.104.24 -u root -p

Initial setup of the database, the users, and the tables:

CREATE DATABASE skia;
USE skia;
CREATE USER 'readonly'@'%' IDENTIFIED BY <password in valentine>;
GRANT SELECT ON *.* TO 'readonly'@'%';
CREATE USER 'readwrite'@'%' IDENTIFIED BY <password in valentine>;
GRANT SELECT, DELETE, UPDATE, INSERT ON *.* TO 'readwrite'@'%';

-- Table for storing annotations.
CREATE TABLE notes (
  id     INT       NOT NULL AUTO_INCREMENT PRIMARY KEY,
  type   TINYINT,
  author TEXT,
  notes  TEXT      NOT NULL
);

-- Table for storing git revision information.
CREATE TABLE githash (
  githash   VARCHAR(40)   NOT NULL PRIMARY KEY,
  ts        TIMESTAMP     NOT NULL,
  gitnumber INT           NOT NULL,
  author    TEXT          NOT NULL,
  message   TEXT          NOT NULL
);

-- Table for mapping revisions and annotations. This supports many-to-many
-- mapping.
CREATE TABLE githashnotes (
  githash VARCHAR(40)  NOT NULL,
  ts      TIMESTAMP    NOT NULL,
  id      INT          NOT NULL,

  FOREIGN KEY (githash) REFERENCES githash(githash),
  FOREIGN KEY (id) REFERENCES notes(id)
);

CREATE TABLE shortcuts (
  id      INT             NOT NULL AUTO_INCREMENT PRIMARY KEY,
  traces  MEDIUMTEXT      NOT NULL
);

Common queries that the dashboard will use:

INSERT INTO notes (type, author, notes) VALUES (1, 'bsalomon', 'Alert!');

SELECT LAST_INSERT_ID();

INSERT INTO githashnotes (githash, ts, id) VALUES (<githash>, <githash_ts>, <last_insert_id>);

The above set of commands will usually be used together to add new annotations and associate them with corresponding git commits. The commands below remove an annotation and its associations with any commit.

DELETE FROM githashnotes WHERE id = <id_to_delete>;

DELETE FROM notes WHERE id = <id_to_delete>;
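As a rough sketch, the dashboard could issue these statements from Go through database/sql along the following lines. ExecFunc, FromTx, AddNote and DeleteNote are hypothetical names. The link row includes the githash column because the schema declares it NOT NULL, and passing the timestamp as Unix seconds through FROM_UNIXTIME is a choice made for this sketch.

```go
package main

import "database/sql"

// ExecFunc runs one statement and returns the AUTO_INCREMENT id it
// generated, if any. It lets the logic below run against a *sql.Tx in
// production and against a stub in tests.
type ExecFunc func(query string, args ...any) (lastInsertID int64, err error)

// FromTx adapts a transaction to ExecFunc.
func FromTx(tx *sql.Tx) ExecFunc {
	return func(query string, args ...any) (int64, error) {
		res, err := tx.Exec(query, args...)
		if err != nil {
			return 0, err
		}
		// Plays the role of SELECT LAST_INSERT_ID(); assumes the driver
		// supports it, as MySQL drivers generally do.
		return res.LastInsertId()
	}
}

// AddNote inserts an annotation, links it to one commit, and returns the
// new note id.
func AddNote(exec ExecFunc, noteType int, author, text, githash string, unixTS int64) (int64, error) {
	id, err := exec("INSERT INTO notes (type, author, notes) VALUES (?, ?, ?)", noteType, author, text)
	if err != nil {
		return 0, err
	}
	_, err = exec("INSERT INTO githashnotes (githash, ts, id) VALUES (?, FROM_UNIXTIME(?), ?)", githash, unixTS, id)
	return id, err
}

// DeleteNote removes the links first so the foreign key from
// githashnotes.id to notes.id is never violated.
func DeleteNote(exec ExecFunc, id int64) error {
	if _, err := exec("DELETE FROM githashnotes WHERE id = ?", id); err != nil {
		return err
	}
	_, err := exec("DELETE FROM notes WHERE id = ?", id)
	return err
}
```

Running both statements of AddNote inside one transaction keeps a failed link insert from leaving an orphaned note behind.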

Since the data size is relatively small, the dashboard server can keep a copy of all recent commit info (e.g., for constructing a “blamelist”), annotations, and their many-to-many relationship for use in the context.

The passwords for the database will be stored in the instance metadata. To see the current passwords stored in metadata and the fingerprint:

gcutil --project=google.com:skia-buildbots getinstance [skia-perf GCE instance]

To set the mysql password that perf is to use:

gcutil --project=google.com:skia-buildbots setinstancemetadata [skia-perf GCE instance] --metadata=readonly:[password-from-valentine] --metadata=readwrite:[password-from-valentine] --fingerprint=[the metadata fingerprint]
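One way the server could read those values at startup is through the Compute Engine metadata server, sketched below. How Perf actually obtains the passwords is not stated here, so MetadataURL and ReadMetadata are hypothetical helpers; the endpoint and the Metadata-Flavor header are the standard ones for instance attributes.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/url"
	"strings"
)

// metadataBase is the metadata-server prefix for custom instance attributes.
const metadataBase = "http://metadata.google.internal/computeMetadata/v1/instance/attributes/"

// MetadataURL returns the URL for one instance attribute.
func MetadataURL(key string) string {
	return metadataBase + url.PathEscape(key)
}

// ReadMetadata fetches an attribute such as "readonly" or "readwrite".
// It only works from inside a GCE instance.
func ReadMetadata(client *http.Client, key string) (string, error) {
	req, err := http.NewRequest("GET", MetadataURL(key), nil)
	if err != nil {
		return "", err
	}
	// The metadata server rejects requests that lack this header.
	req.Header.Set("Metadata-Flavor", "Google")
	resp, err := client.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("metadata %q: %s", key, resp.Status)
	}
	b, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(b)), nil
}
```

A call such as ReadMetadata(http.DefaultClient, "readwrite") would return the value set with --metadata=readwrite:... above.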

Startup and config

The server is started and stopped via:

sudo /etc/init.d/perf [start|stop|restart]

But sysv init only handles starting and stopping a program once, so we use Monit to monitor the application and restart it if it crashes. The config is in:

/etc/monit/conf.d/perf

Installation

See the README file.