Storing Expectations in Firestore

Gold Expectations are essentially a map of (Grouping, Digest) to Label, where Grouping is currently TestName (but could be a combination of TestName + ColorSpace or something more complex), Digest is the MD5 hash of an image's content, and Label is Positive, Negative, or Untriaged (the default).

These Triples are stored in a large map of maps, i.e. map[string]map[string]int. This is encapsulated by the type Expectations. If a given (Grouping, Digest) doesn't have a label, it is assumed to be Untriaged.
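
As a minimal sketch of the map-of-maps idea (not the actual Gold implementation), the default-to-Untriaged behavior falls out of Go's zero values if Untriaged is given the value 0. The named Label type, the numeric values of the constants, and the Set and Classification method names are assumptions made for this illustration; the text above only specifies map[string]map[string]int.

```go
package main

import "fmt"

// Label is the triage status of a digest. Making Untriaged the zero value
// (an assumption of this sketch) means a missing entry reads as Untriaged.
type Label int

const (
	Untriaged Label = iota
	Positive
	Negative
)

// Expectations maps Grouping -> Digest -> Label.
type Expectations map[string]map[string]Label

// Set records a label, creating the inner map on first use.
func (e Expectations) Set(grouping, digest string, l Label) {
	if e[grouping] == nil {
		e[grouping] = map[string]Label{}
	}
	e[grouping][digest] = l
}

// Classification returns the stored label, or Untriaged if there is none.
// Indexing a nil inner map is safe in Go and yields the zero value.
func (e Expectations) Classification(grouping, digest string) Label {
	return e[grouping][digest]
}

func main() {
	exp := Expectations{}
	exp.Set("test_alpha", "abc123", Positive)
	fmt.Println(exp.Classification("test_alpha", "abc123") == Positive)
	fmt.Println(exp.Classification("test_beta", "def456") == Untriaged)
}
```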

There is the idea of the MasterExpectations, which is the Expectations belonging to the git branch “master”. Additionally, there can be smaller CLExpectations that belong to a ChangeList (CL) and stay separate from the MasterExpectations until the CL lands. These CLExpectations are the “delta” compared to the MasterExpectations.

We'd like to be able to do the following:

  • Store and retrieve Expectations (both MasterExpectations and CLExpectations).
  • Update the Label for a (Grouping, Digest).
  • Keep an audit record of what user updated the Label for a given (Grouping, Digest).
  • Undo a previous change.
  • Support Gerrit CLs and GitHub PullRequests (PRs).

Schema

In the spreadsheet metaphor, Firestore Collections are tables and Documents are the rows, with the fields of the Documents being the columns.

Like all other projects, we will use firestore.NewClient to create a top-level “gold” Collection with a parent Document for this instance (e.g. “skia-prod”, “flutter”, etc). Underneath that parent Document, we will create three Collections: expectations, triage_records, and triage_changes.

In the expectations Collection, we will store many expectationEntry Documents with the following schema:

Grouping       string    # starting as the TestName
Digest         string
Label          int
Updated        time.Time
CRSAndCLID     string     # "" for master branch, otherwise CRS + "_" + clID

The expectationEntry will have an ID of [grouping]|[digest], allowing updates.
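
The schema and ID scheme above could be expressed in Go roughly as follows. This is a sketch under assumptions: the crsAndCLID helper is hypothetical, the example values ("gerrit", "1234") are made up, and no Firestore struct tags are shown because the stored field names are not specified here.

```go
package main

import (
	"fmt"
	"time"
)

// expectationEntry mirrors the Document schema listed above.
type expectationEntry struct {
	Grouping   string // starting as the TestName
	Digest     string
	Label      int
	Updated    time.Time
	CRSAndCLID string // "" for master branch, otherwise CRS + "_" + clID
}

// ID returns the Document ID in the [grouping]|[digest] form, so writing the
// same (Grouping, Digest) again overwrites the existing Document.
func (e expectationEntry) ID() string {
	return e.Grouping + "|" + e.Digest
}

// crsAndCLID is a hypothetical helper that builds the CRSAndCLID field.
// Passing two empty strings denotes the master branch.
func crsAndCLID(crs, clID string) string {
	if crs == "" && clID == "" {
		return ""
	}
	return crs + "_" + clID
}

func main() {
	e := expectationEntry{
		Grouping:   "test_alpha",
		Digest:     "abc123",
		Label:      1,
		Updated:    time.Now(),
		CRSAndCLID: crsAndCLID("gerrit", "1234"),
	}
	fmt.Println(e.ID(), e.CRSAndCLID)
}
```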

The triage_records Collection will have triageRecords Documents:

ID           string    # autogenerated
UserName     string
TS           time.Time
CRSAndCLID   string    # "" for master branch, otherwise CRS + "_" + clID
Committed    bool      # if writing has completed (e.g. large triage)
Changes      int       # how many records match in triage_changes Collection

The triage_changes Collection will have triageChanges Documents:

RecordID       string # From the triage_records table
Grouping       string
Digest         string
LabelBefore    int
LabelAfter     int

The vast majority of LabelAfter will be Positive, with some Negatives and a rare Untriaged (in the case of an undo).

We split the triage data into two tables to account for the fact that bulk triages can sometimes span thousands of groupings/digests, which could exceed Firestore's 1 MiB size limit per Document.

Indexing

Firestore has fairly generous indexing limits, so we should be fine with the default single-field indexes. In addition, we need the following composite indexes:

Collection ID | Fields

expstore_expectations_v2 | crs_cl_id: ASC, digest: ASC
expstore_triage_changes_v2 | record_id: ASC, grouping: ASC, digest: ASC
expstore_triage_records_v2 | committed: ASC, crs_cl_id: ASC, ts: DESC

Usage

To create the MasterExpectations map, we create a QuerySnapshotIterator on expectationEntry Documents with CRSAndCLID=="" and assemble the results of the initial load into a single map. This map is kept in RAM for returning to clients.

For performance, we shard fetching the expectations based on digest, since that data is essentially random and evenly distributed.
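
One way to shard on digest is to split the space of digest prefixes into contiguous ranges and issue one range query per shard. The sketch below only computes the ranges. It assumes digests are lowercase hex strings (as MD5 digests usually are) and that the shard count is between 1 and 65536; the digestRange type and shardRanges function are hypothetical names for this illustration.

```go
package main

import "fmt"

// digestRange is a half-open range [Start, End) of digest strings.
// An empty End means the range has no upper bound.
type digestRange struct {
	Start, End string
}

// shardRanges splits the space of 4-hex-character digest prefixes into n
// contiguous ranges. Because hash output is close to uniformly distributed,
// each range should hold roughly the same number of Documents.
// Assumes 1 <= n <= 65536 and lowercase hex digests.
func shardRanges(n int) []digestRange {
	ranges := make([]digestRange, 0, n)
	for i := 0; i < n; i++ {
		r := digestRange{Start: fmt.Sprintf("%04x", i*65536/n)}
		if i < n-1 {
			r.End = fmt.Sprintf("%04x", (i+1)*65536/n)
		}
		ranges = append(ranges, r)
	}
	return ranges
}

func main() {
	// Each range could back one query of the form
	// digest >= Start AND digest < End, run concurrently with the others.
	for _, r := range shardRanges(4) {
		fmt.Printf("[%q, %q)\n", r.Start, r.End)
	}
}
```

Each range would then become one query on the digest field, and the per-shard results can be merged into the in-memory map once all shards have finished.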

CLExpectations will have their changed Expectations (essentially their delta from the MasterExpectations) stored in the expectations Collection with nonempty CRSAndCLID fields. When the tryjob monitor notes that a CL has landed, it can add the CLExpectations to the master branch.
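
Conceptually, landing a CL means applying its delta on top of the master map. The following is an in-memory sketch of that merge only; it does not touch Firestore, and the mergeInto name and the label values (0 = Untriaged, 1 = Positive, 2 = Negative) are assumptions made for this illustration.

```go
package main

import "fmt"

// Expectations maps Grouping -> Digest -> Label, as described earlier.
type Expectations map[string]map[string]int

// Assumed label values for this sketch.
const (
	untriaged = 0
	positive  = 1
	negative  = 2
)

// mergeInto applies every entry of the CL delta on top of master, in place.
// Entries in the delta win over existing master entries.
func mergeInto(master, delta Expectations) {
	for grouping, digests := range delta {
		if master[grouping] == nil {
			master[grouping] = map[string]int{}
		}
		for digest, label := range digests {
			master[grouping][digest] = label
		}
	}
}

func main() {
	master := Expectations{"test_alpha": {"abc123": positive}}
	cl := Expectations{
		"test_alpha": {"abc123": negative, "def456": positive},
		"test_beta":  {"0a0a0a": positive},
	}
	mergeInto(master, cl)
	fmt.Println(len(master), master["test_alpha"]["abc123"] == negative)
}
```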

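Storing the data this way makes fetching the triage log trivial:

// Newest records first. TS is the timestamp field of triage_records, and
// firestore.Desc is the sort direction from cloud.google.com/go/firestore.
q := client.Collection("triage_records").OrderBy("TS", firestore.Desc).Limit(N).Offset(M)
firestore.IterDocs("", "", q, 3, 5*time.Second, ...)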

To undo, we can query the original change by ID (from the triage_records Collection), fetch its entries from the triage_changes Collection, and apply the opposite of each one, but only if the current label still matches that entry's LabelAfter. Otherwise we do nothing, because the label has either been changed again or has already been undone.
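
The undo rule can be sketched in memory as follows. This sketch reads the precondition as "the current label is still the one the original change wrote", i.e. it compares against LabelAfter and restores LabelBefore; that reading, the undo function name, and the label values are assumptions of this illustration, and the Firestore reads and writes are left out.

```go
package main

import "fmt"

// triageChange mirrors one Document of the triage_changes Collection
// (RecordID omitted for brevity).
type triageChange struct {
	Grouping    string
	Digest      string
	LabelBefore int
	LabelAfter  int
}

// Expectations maps Grouping -> Digest -> Label.
type Expectations map[string]map[string]int

// undo reverts each change whose LabelAfter is still the current label and
// returns the inverse changes it applied, which could be logged as a new
// triage record. Changes whose digest was re-triaged since are skipped.
func undo(current Expectations, changes []triageChange) []triageChange {
	var applied []triageChange
	for _, c := range changes {
		if current[c.Grouping][c.Digest] != c.LabelAfter {
			continue // changed again or already undone
		}
		if current[c.Grouping] == nil {
			current[c.Grouping] = map[string]int{}
		}
		current[c.Grouping][c.Digest] = c.LabelBefore
		applied = append(applied, triageChange{
			Grouping:    c.Grouping,
			Digest:      c.Digest,
			LabelBefore: c.LabelAfter,
			LabelAfter:  c.LabelBefore,
		})
	}
	return applied
}

func main() {
	// Assumed label values: 0 = Untriaged, 1 = Positive, 2 = Negative.
	current := Expectations{"test_alpha": {"abc123": 1, "def456": 2}}
	original := []triageChange{
		{Grouping: "test_alpha", Digest: "abc123", LabelBefore: 0, LabelAfter: 1},
		{Grouping: "test_alpha", Digest: "def456", LabelBefore: 0, LabelAfter: 1},
	}
	applied := undo(current, original)
	fmt.Println(len(applied), current["test_alpha"]["abc123"], current["test_alpha"]["def456"])
}
```

In this example only the first change is reverted; the second digest was re-triaged after the original change, so it is left alone.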

Growth Opportunities

The design should be open to future changes, for example:

  1. Specifying a maximum age for an expectation, e.g. forget about positive digests not seen for a year and negative digests not seen for 6 months.
  2. Add in the ability to say why something was marked negative.

For item #1, the schema could be augmented with a “last seen on” timestamp that is written to once per day or so in a batch write. Note: to not overly tax the indexes, the last seen timestamps should all be the same for each batch write.

The schema could be augmented for #2 with additional fields in the expectations and triage_changes Collections and some UI support.