Project Objectives: Skia Perf is a performance monitoring system designed to ingest, store, analyze, and visualize performance data for various projects, with a primary focus on Skia and related systems (e.g., Flutter, Android, Chrome). Its core objectives are to track performance metrics against code revisions, detect regressions automatically, and give developers and performance engineers tools to investigate them.
Functionality: Perf consists of several key components that work together:
- perfserver (to run the different services) and perf-tool (for administrative tasks, data inspection, and database backups/restores).
- Traces are identified by structured key-value strings such as ,arch=x86,config=8888,test=draw_a_circle,units=ms, (the test draw_a_circle run on arch=x86,config=8888); a small parsing sketch follows this list.
- tile_size (the number of commits per tile) is configurable and affects how data is sharded and queried.
- FORMAT.md documents the format that Perf expects for input data files.
- Perf follows a services-oriented architecture, where the main perfserver executable can run in different modes (frontend, ingest, cluster, maintenance). Data flows from external benchmark systems into Perf, where it's processed, stored, analyzed, and finally presented to users.
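The structured trace ID above is just a comma-delimited list of key=value pairs. A minimal, self-contained sketch (not the actual Perf parser, which lives in the Go packages described later) of turning such an ID into a parameter map:

// parsetrace.go - illustrative only.
package main

import (
	"fmt"
	"strings"
)

// parseTraceID splits a ",key=value,key=value," style trace ID into a map.
func parseTraceID(id string) map[string]string {
	params := map[string]string{}
	for _, pair := range strings.Split(strings.Trim(id, ","), ",") {
		if kv := strings.SplitN(pair, "=", 2); len(kv) == 2 {
			params[kv[0]] = kv[1]
		}
	}
	return params
}

func main() {
	fmt.Println(parseTraceID(",arch=x86,config=8888,test=draw_a_circle,units=ms,"))
	// Output: map[arch:x86 config:8888 test:draw_a_circle units:ms]
}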
Data Flow and Main Components:
External Benchmark Systems
|
V
[Data Files (JSON) in Perf Ingestion Format]
| (Uploaded to Google Cloud Storage - GCS)
V
GCS Bucket (e.g., gs://skia-perf/nano-json-v1/)
| (Pub/Sub event on new file arrival)
V
Perf Ingest Service(s) (`perfserver ingest` mode)
| - Parses JSON files (see /go/ingest/parser)
| - Validates data (see /go/ingest/format)
| - Associates data with Git commits (see /go/git)
| - Writes trace data to TraceStore (SQL, tiled) (see /go/tracestore)
| - Updates ParamSets (for UI query builders)
| - (Optionally) Emits Pub/Sub events for "Event Driven Alerting"
V
SQL Database (CockroachDB / Spanner)
| - Trace Data (values, parameters, indexed by commit/tile)
| - Commit Information (hashes, timestamps, messages)
| - Alert Configurations
| - Regression Records (details of detected regressions, triage status)
| - Shortcuts, User Favorites, etc.
|
+<--> Perf Cluster Service(s) (`perfserver cluster` or `perfserver frontend --do_clustering` mode)
| - Loads Alert configurations
| - Queries TraceStore for relevant data
| - Performs clustering (k-means) (see /go/clustering2, /go/ctrace2)
| - Fits step functions to cluster centroids (see /go/stepfit)
| - Calculates Regression statistic
| - Stores "Interesting" clusters/regressions in the database
| - Sends notifications (email, issue tracker) (see /go/notify)
|
+<--> Perf Frontend Service (`perfserver frontend` mode)
| - Serves HTML, CSS, JS (see /pages, /modules)
| - Handles API requests from the UI (see /go/frontend, /API.md)
| - Queries database for trace data, alert configs, regressions
| - Formats data for UI display (often as DataFrames)
| - Manages user authentication (via X-WEBAUTH-USER header)
|
+<--> Perf Maintenance Service (`perfserver maintenance` mode)
- Git repository synchronization
- Database schema migrations (see /migrations)
- Old data cleanup
- Cache refreshing (e.g., ParamSet cache)
Rationale for Key Architectural Choices:
- SQL database: the standard database/sql package is used, with the schema defined and managed by /go/sql and migration scripts in /migrations.
- Tiled trace storage: the TraceStore (/go/tracestore) implementation uses SQL tables but structures them to represent tiles of commits. ParamSets and Postings tables act as inverted indexes for fast lookup of traces matching specific key-value parameters.
- Single binary (perfserver) with modes: perfserver uses command-line flags and subcommands to determine its operational mode. Configuration files (/configs/*.json) further dictate behavior within each mode.
- Clustering: k-means clustering lives in /go/clustering2 and /go/kmeans; ctrace2 handles trace normalization.

This section focuses on significant modules beyond simple file/directory descriptions.
/go/config:
- Defines the instance configuration (InstanceConfig). This is the central place where all settings for a Perf deployment (database, ingestion sources, Git repo, UI features, notification settings) are specified.
- InstanceConfig is a Go struct with fields for various aspects of the system. JSON files in /configs are unmarshaled into this struct.
- The module provides functions to load and validate these configurations; a loading sketch follows.
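A minimal sketch of that loading step, assuming a hypothetical subset of fields; the real InstanceConfig in /go/config has many more fields and its loader also validates the result:

package main

import (
	"encoding/json"
	"log"
	"os"
)

// instanceConfig mirrors only a few illustrative fields of the real struct.
type instanceConfig struct {
	URL             string          `json:"URL"`
	DataStoreConfig json.RawMessage `json:"data_store_config"`
	IngestionConfig json.RawMessage `json:"ingestion_config"`
}

func main() {
	b, err := os.ReadFile("configs/demo.json")
	if err != nil {
		log.Fatal(err)
	}
	var cfg instanceConfig
	if err := json.Unmarshal(b, &cfg); err != nil {
		log.Fatal(err)
	}
	log.Printf("instance URL: %s", cfg.URL)
}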
/go/ingest:
- Reads data files in the format.Format specification, extracting performance metrics and metadata, associating them with Git commits, and writing the data to the TraceStore.
- ingest/format: defines the expected structure of input JSON files (the format.Format Go struct) and provides validation. This ensures data consistency.
- ingest/parser: contains the logic to parse the format.Format structure and extract individual trace measurements and their associated parameters.
- ingest/process: coordinates the steps: reading from a source (e.g., GCS via /go/file), parsing, resolving commit information (via /go/git), and writing to the TraceStore.
- Typical flow: a Source (e.g., GCSSource via Pub/Sub) indicates a new file; process reads the file; parser and format validate it and extract Results; for each Result, its git_hash is resolved to a CommitNumber using /go/git; the data is then written via /go/tracestore.
/go/tracestore:
- The TraceStore is designed to efficiently retrieve trace values for specific parameter combinations over ranges of commits.
- TraceValues table: stores the actual metric values, often sharded by tile.
- ParamSets table: stores the unique key-value pairs found in trace identifiers within each tile.
- Postings table: an inverted index mapping (tile, param_key, param_value) to a list of trace IDs that contain that key-value pair within that tile. This structure allows queries like "get all traces where config=8888 and arch=x86" to be resolved efficiently by intersecting posting lists (see the query sketch below).
- SQLTraceStore is the primary implementation, backed by the SQL database.
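A sketch of how such an intersection could look, using made-up table and column names rather than the actual schema in /go/sql:

// Hypothetical posting-list intersection for "config=8888 AND arch=x86"
// within a single tile; the real SQLTraceStore builds its queries against
// the schema defined in /go/sql.
package tracestoresketch

const intersectPostings = `
SELECT a.trace_id
FROM Postings AS a
JOIN Postings AS b
  ON a.tile_number = b.tile_number AND a.trace_id = b.trace_id
WHERE a.tile_number = $1
  AND a.key_value = 'config=8888'
  AND b.key_value = 'arch=x86'`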
/go/git:
- Bridges the git_hash values in ingested data and Perf's internal CommitNumber sequence.
- Works against either a local repository clone (via the git CLI) or a Gitiles service API. It maintains a Commits table in the SQL database, mapping commit hashes to CommitNumbers and storing other metadata. It periodically updates its local Git repository clone or queries Gitiles for new commits.
/go/regression:
- Combines clustering (from /go/clustering2) and step-fit analysis (from /go/stepfit) to identify "Interesting" clusters.
- Store interface (implemented by sqlregression2store): persists information about detected regressions, including the cluster summary, owning alert, commit hash, regression statistic, and triage status (New, Ignore, Bug).
- The fingerprinting approach from DESIGN.md (comparing new interesting clusters with existing ones based on trace fingerprints) is implemented here to manage the lifecycle of a regression:

Run clustering (e.g., hourly or event-driven)
  |
  V
Identify "Interesting" new clusters (high |Regression| score)
  |
  V
For each new Interesting cluster:
  Compare fingerprint (top N traces) with existing relevant clusters in DB
  |
  +-- No match?    --> New Regression: store in DB with status "New".
  |
  +-- Match found? --> Update existing Regression if the new one has a better
                       |Regression| score. Keep the triage status of the existing one.
/go/frontend:
- Uses the net/http package to define HTTP handlers for various API endpoints (e.g., fetching data for plots, listing alerts, updating triage statuses). It authenticates users based on the X-WEBAUTH-USER header. It often fetches data, converts it into DataFrame structures, and then serializes these to JSON for the frontend.
/modules (Frontend TypeScript):
- Custom elements (e.g., plot-simple-sk, alert-config-sk, query-sk) handle rendering, user interaction, and making API calls to the Go backend.
- perf-scaffold-sk: provides the main page layout (header, sidebar, content area).
- explore-simple-sk / explore-sk: core components for querying data and displaying plots.
- json/index.ts: contains TypeScript interfaces mirroring Go backend structs for type-safe API communication. This is crucial for ensuring frontend and backend data structures are compatible. It's often generated from Go source using /go/ts/ts.go.
/pages:
- Each page is a thin HTML shell containing perf-scaffold-sk and the main page-specific custom element. For example, alerts.html includes the perf-scaffold-sk and the relevant page element (e.g., <alerts-page-sk>), and an associated TypeScript file (e.g., alerts.ts) imports the necessary custom element definitions. Server-side Go templates inject initial context data (window.perf = {%.context %};) into the HTML.
DESIGN.md:
- Describes the original design, including a clusters table (though the actual schema is in /go/sql and may have evolved into the Regressions table).
FORMAT.md:
- Documents the input JSON file format: git_hash, key (for global parameters), and results (an array of measurements). Each result can have its own key (for test-specific parameters like test name and units) and either a single measurement or a more complex measurements object for statistics (min, max, median). This document is crucial for data producers who need to integrate with Perf. An illustrative sketch follows.
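A sketch of that shape using illustrative Go structs; the canonical definition is format.Format in /go/ingest/format and FORMAT.md, and the field set here is only an approximation:

package main

import (
	"encoding/json"
	"fmt"
)

type result struct {
	Key         map[string]string `json:"key"`
	Measurement float64           `json:"measurement,omitempty"`
}

type inputFile struct {
	GitHash string            `json:"git_hash"`
	Key     map[string]string `json:"key"`
	Results []result          `json:"results"`
}

func main() {
	f := inputFile{
		GitHash: "abc123",
		Key:     map[string]string{"arch": "x86", "config": "8888"},
		Results: []result{
			{Key: map[string]string{"test": "draw_a_circle", "units": "ms"}, Measurement: 12.3},
		},
	}
	b, _ := json.MarshalIndent(f, "", "  ")
	fmt.Println(string(b))
}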
BUILD.bazel (Root):
- Defines container image targets (e.g., perfserver, backendserver) that package the Go executables and necessary static resources (configs, frontend assets).
- Uses skia_app_container rules to assemble Docker images. It copies the perfserver and perf-tool binaries, configuration files from /configs, and compiled frontend assets (HTML, JS, CSS from the /pages build output) into the image. The entrypoint for the perfserver image is the perfserver executable itself.

A. New Alert Creation via UI and API:
User (in Perf UI, e.g., on /alerts page)
  |
  | Fills out Alert configuration form (<alert-config-sk> element)
  | Clicks "Save"
  V
Frontend JS (<alert-config-sk>)
  |
  | 1. If new alert, GET /_/alert/new
  |    (Server responds with a pre-populated Alert JSON with id: -1)
  |
  | 2. Modifies this Alert JSON based on form input
  |
  | 3. POST modified Alert JSON to /_/alert/update
  |    (Authorization: Bearer token if auth is enabled)
  V
Perf Backend (`/go/frontend/service.go` - UpdateAlertHandler)
  |
  | Receives Alert JSON
  | If alert.ID == -1, it's a new alert.
  | Validates Alert configuration
  | Persists Alert to SQL Database (via `alerts.Store`)
  | Responds 200 OK
  V
SQL Database (Alerts Table)
  |
  | New Alert record is created or existing one updated.
Rationale:
- The GET /_/alert/new step is a convenience. It provides the frontend with a valid Alert structure, including any instance-default values, simplifying new-alert creation logic on the client.
- Using id: -1 to signify a new alert during the POST to /_/alert/update is a common pattern to allow a single endpoint to handle both creation and updates. The backend inspects the ID to determine the correct action; a sketch of this decision follows.
- The endpoints involved are documented in API.md.
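A minimal, self-contained sketch of that create-vs-update decision, using illustrative type and function names rather than the real handler in /go/frontend:

package main

import (
	"encoding/json"
	"fmt"
)

type alertConfig struct {
	IDAsString string `json:"id"`
	Query      string `json:"query"`
}

// saveAlert decides between create and update based on the sentinel ID.
func saveAlert(body []byte) (string, error) {
	var cfg alertConfig
	if err := json.Unmarshal(body, &cfg); err != nil {
		return "", err
	}
	if cfg.IDAsString == "-1" {
		// A sentinel ID of -1 means the alert does not exist yet: INSERT.
		return "created", nil
	}
	// Otherwise update (UPSERT) the existing record.
	return "updated", nil
}

func main() {
	out, _ := saveAlert([]byte(`{"id":"-1","query":"config=8888"}`))
	fmt.Println(out) // created
}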
B. Data Ingestion and Event-Driven Regression Detection:

Benchmark System
  |
  | Produces performance_data.json (Perf Ingestion Format)
  | Uploads to GCS: gs://[bucket]/[path]/YYYY/MM/DD/HH/performance_data.json
  V
Google Cloud Storage
  |
  | File "OBJECT_FINALIZE" event
  | Publishes message to PubSub Topic (e.g., "perf-ingestion-topic")
  V
Perf Ingest Service(s) (Subscribed to "perf-ingestion-topic")
  |
  | 1. Receives PubSub message (contains GCS file path)
  | 2. Downloads performance_data.json from GCS
  | 3. Parses JSON, validates data (see /go/ingest/format, /go/ingest/parser)
  | 4. Looks up git_hash in /go/git to get CommitNumber
  | 5. Writes trace data to TraceStore (SQL tables)
  | 6. If Event Driven Alerting enabled for this instance:
  |    Constructs a list of Trace IDs updated by this file
  |    Publishes message (containing gzipped Trace IDs) to another PubSub Topic
  |    (e.g., "trace-update-topic")
  V
Perf Cluster Service(s) (Subscribed to "trace-update-topic")
  |
  | 1. Receives PubSub message (with updated Trace IDs)
  | 2. For each Alert Configuration (/go/alerts):
  |    If the Alert's query matches any of the updated Trace IDs:
  |    Run clustering & regression detection for THIS Alert,
  |    focusing on the commit range and data relevant to the updated traces.
  |    (Reduces scope compared to full continuous clustering)
  | 3. If regressions found:
  |    Store in SQL Database (Regressions table)
  |    Send notifications (email, issue tracker)
Rationale:
- As described in FORMAT.md and DESIGN.md, GCS is the standard way data enters Perf. The YYYY/MM/DD/HH path structure is a convention.
- DESIGN.md explicitly states that event-driven alerting is intended for large/sparse datasets. Sending only the updated Trace IDs significantly narrows the scope of clustering for each event, making it faster and less resource-intensive than re-clustering everything. PubSub's 10MB message limit is taken into account for the gzipped trace ID lists.

This documentation provides a comprehensive starting point for a software engineer to understand the Skia Perf project. It covers its purpose, architecture, core concepts, and the rationale behind key design and implementation choices, referencing existing documentation and source code structure where appropriate.
The /cockroachdb module provides a set of shell scripts designed to facilitate interaction with a CockroachDB instance, specifically one named perf-cockroachdb, which is presumed to be running within a Kubernetes cluster. These scripts abstract away some of the complexities of kubectl commands, offering streamlined access for common database operations.
The primary motivation behind these scripts is to simplify development and administrative workflows. Instead of requiring users to remember and type lengthy kubectl commands with specific flags and resource names, these scripts provide convenient, single-command access points.
Key Components and Responsibilities:
admin.sh: This script focuses on providing access to the CockroachDB administrative web interface.
- Why: setting up kubectl port-forward manually can be cumbersome to do repeatedly.
- How: it runs kubectl port-forward to map the local port 8080 to port 8080 of the perf-cockroachdb-0 pod. Crucially, it then immediately attempts to open this local address in Google Chrome, providing an instant user experience. This assumes Google Chrome is installed and available in the system's PATH.

User runs admin.sh
  |
  V
Script executes: kubectl port-forward perf-cockroachdb-0 8080
  |
  V
Local port 8080 now forwards to the CockroachDB pod's port 8080
  |
  V
Script executes: google-chrome http://localhost:8080
  |
  V
CockroachDB Admin UI opens in Chrome

connect.sh: This script is designed to provide a SQL shell connection to the CockroachDB instance.
- Why: remembering the exact kubectl run command with the correct image and arguments can be error-prone.
- How: it uses kubectl run to create a temporary, interactive pod named androidx-cockroachdb. This pod uses the cockroachdb/cockroach:v19.2.5 Docker image. The --rm flag ensures the pod is deleted after the session ends, and --restart=Never prevents it from being restarted. The crucial part is the command passed to the pod: sql --insecure --host=perf-cockroachdb-public. This starts the CockroachDB SQL client, connecting insecurely to the database service exposed at perf-cockroachdb-public.

User runs connect.sh
  |
  V
Script executes: kubectl run androidx-cockroachdb -it --image=... --rm --restart=Never -- sql --insecure --host=perf-cockroachdb-public
  |
  V
Temporary pod 'androidx-cockroachdb' is created
  |
  V
CockroachDB SQL client starts inside the pod, connecting to 'perf-cockroachdb-public'
  |
  V
User has an interactive SQL shell
  |
  V
User exits shell -> Pod 'androidx-cockroachdb' is deleted

skia-infra-public-port-forward.sh: This script sets up a port forward for direct database connections, typically for use with a local CockroachDB SQL client or other database tools.
- Why: while connect.sh provides an in-cluster SQL shell, sometimes a direct connection from the local machine is preferred, for instance to use graphical SQL clients or specific client libraries that are not available within the temporary pod created by connect.sh. The perf-cockroachdb instance is likely within a private network in the Kubernetes cluster (namespace perf), and this script makes it accessible locally.
- How: it calls ../../kube/attach.sh skia-infra-public (the details of which are outside this module's scope but presumably handle Kubernetes context or authentication for the skia-infra-public cluster). This helper script is then used to execute kubectl port-forward specifically for the perf-cockroachdb-0 pod within the perf namespace, mapping local port 25000 to the pod's CockroachDB port 26257. The script also helpfully prints instructions on how to connect using the cockroach sql command once the port forward is active. The set -e command ensures the script exits immediately if any command fails, and set -x enables command tracing for debugging.

User runs skia-infra-public-port-forward.sh
  |
  V
Script prints connection instructions
  |
  V
Script executes: ../../kube/attach.sh skia-infra-public kubectl port-forward -n perf perf-cockroachdb-0 25000:26257
  |
  V
Port forward is established: local:25000 -> perf-cockroachdb-0:26257 (in 'perf' namespace)
  |
  V
User can now run 'cockroach sql --insecure --host=127.0.0.1:25000' in another terminal

These scripts collectively aim to make interacting with the perf-cockroachdb instance as straightforward as possible by encapsulating the necessary kubectl commands and providing context-specific instructions or actions. They rely on the Kubernetes cluster being correctly configured and accessible, and on kubectl and potentially google-chrome being available on the user's system.
The /configs directory houses JSON configuration files for various instances of the Perf performance monitoring system. Each file defines the specific behavior and data sources for a particular Perf deployment. These configurations are crucial for tailoring Perf to different projects and environments, enabling developers and performance engineers to monitor and analyze performance data effectively.
The core idea is to provide a declarative way to set up a Perf instance. Instead of hardcoding settings, these JSON files act as blueprints. Each file serializes to and from a Go struct named config.InstanceConfig. This struct serves as the canonical schema for all instance configurations, and its Go documentation provides detailed explanations of each field. This approach ensures consistency and makes it easier to manage and evolve the configuration options.
Key Components and Responsibilities:
The primary responsibility of this module is to define and store these instance configurations. Each JSON file represents a distinct Perf instance, often corresponding to a specific project or a particular version of a project (e.g., a public vs. internal build, or a stable vs. experimental branch).
Instance-Specific Configuration Files (e.g., android2.json, chrome-public.json):
These files conform to the config.InstanceConfig Go struct. Notable sections include:
- URL: the public-facing URL of the Perf instance.
- data_store_config: defines the backend database (e.g., CockroachDB, Spanner), connection strings, and parameters like tile_size which can impact query performance and data retrieval efficiency. The choice between CockroachDB and Spanner often depends on scalability needs and existing infrastructure.
- ingestion_config: specifies how performance data is brought into Perf. This includes the source_type (e.g., gcs for Google Cloud Storage, dir for local directories), the specific sources (e.g., GCS bucket paths or local file paths), and Pub/Sub topics for real-time ingestion. This section is vital for connecting Perf to the data producers.
- git_repo_config: links Perf to the source code repository. This allows Perf to correlate performance data with specific code changes (commits). It includes the repository url, the provider (e.g., gitiles, git), and sometimes a commit_number_regex to extract meaningful commit identifiers from commit messages.
- notify_config: configures how alerts and notifications are sent when regressions are detected. This can range from none to html_email, markdown_issuetracker, or anomalygroup. It often includes templates for notification subjects and bodies, leveraging placeholders like {{ .Alert.DisplayName }} to include dynamic information.
- auth_config: defines the authentication mechanism, commonly using a header like X-WEBAUTH-USER for integration with existing authentication systems.
- query_config: customizes how users can query and view data, including which parameters are available for filtering (include_params), default selections, and URL value defaults to tailor the user experience. It can also include caching configurations (e.g., using Redis) to improve query performance by specifying cache_config with level1_cache_key and level2_cache_key.
- anomaly_config: contains settings related to anomaly detection, such as settling_time, which defines how long Perf waits before considering new data for anomaly detection, helping to avoid flagging transient issues.
- Other fields such as contact, ga_measurement_id (for Google Analytics), feedback_url, trace_sample_proportion (to control the volume of detailed trace data collected), and favorites (for pre-defined links on the Perf UI) further customize the instance.

Example workflow (android2.json):
- Data files arrive in the GCS locations listed in ingestion_config.source_config.sources (e.g., gs://android-perf-2/android2), announced on the Pub/Sub topic perf-ingestion-android2-production.
- Perf associates the data with commits from the repository in git_repo_config (e.g., https://android.googlesource.com/platform/superproject), and stores it in the CockroachDB instance defined in data_store_config.
- Regressions are detected according to anomaly_config and reported according to notify_config. For android2.json, this means an issue is filed in an issue tracker ("notifications": "markdown_issuetracker") with a subject and body formatted using the provided templates, including details like affected tests and devices.

local.json:
- Intended for local development: it points ingestion_config to a local directory (integration/data) that contains sample data. This data is often the same data used for unit tests, ensuring consistency between testing environments. The database connection will also point to a local instance.

demo.json and demo_spanner.json:
- Like local.json, demo.json uses a local directory for data ingestion ("./demo/data/") and a local CockroachDB instance. demo_spanner.json is analogous but configured to use Spanner as the backend, demonstrating flexibility in data store choices. They often include a simpler git_repo_config pointing to public demo repositories (e.g., https://github.com/skia-dev/perf-demo-repo.git). The favorites section in demo.json shows how to add curated links to the Perf UI.

/spanner subdirectory:
- Configurations here (e.g., spanner/chrome-public.json, spanner/skia-public.json) have data_store_config.datastore_type set to "spanner" and often include Spanner-specific settings or optimizations. For example, enable_follower_reads might be set to true in data_store_config to distribute read load. Many of these configurations also define redis_config within their query_config.cache_config to further enhance query performance for frequently accessed data.
- The optimize_sqltracestore flag, often set to true in Spanner configurations, indicates that specific optimizations for the SQL-based trace store are enabled, likely tailored to Spanner's characteristics.
- chrome-internal.json and chrome-public.json demonstrate sophisticated setups, including: commit_number_regex in git_repo_config to extract structured commit positions; temporal_config for integrating with Temporal workflows for tasks like regression grouping and bisection; enable_sheriff_config to integrate with sheriffing systems for managing alerts; and trace_format: "chrome", indicating that the performance data adheres to the Chrome trace event format.

The choice of fields and their values within each JSON file reflects a series of design decisions aimed at balancing flexibility, performance, and operational manageability for each specific Perf instance. For instance, the tile_size in data_store_config is adjusted based on expected data characteristics and query patterns. Similarly, trace_sample_proportion is set to manage storage costs and processing load while still capturing enough data for meaningful analysis. The notify_config templates are crafted to provide actionable information to developers when regressions occur.
The csv2days module is a command-line utility designed to process CSV files downloaded from the Perf performance monitoring system. Its primary purpose is to simplify time-series data by consolidating multiple data points from the same calendar day into a single representative value. This is particularly useful when analyzing performance trends over longer periods, where daily granularity is sufficient and finer-grained timestamps can introduce noise or unnecessary complexity.
The core problem this module solves is the overabundance of data points when Perf exports data at a high temporal resolution (e.g., multiple commits per day). For certain types of analysis, this level of detail is not required and can make it harder to discern broader trends. csv2days transforms such CSVs by keeping only the first encountered data column for each unique day and aggregating subsequent values from the same day into that single column using a “max” aggregation strategy.
The module operates as a streaming processor. It reads the input CSV file row by row, processes the header to determine which columns to modify or drop, and then transforms each subsequent data row accordingly before writing it to standard output.
Key Design Choices:
- The tool reads its input CSV via the --in flag and outputs the transformed CSV to stdout. This follows common Unix philosophies for tool interoperability.
- Streaming: csv2days processes the file line by line. This makes the tool memory-efficient.
- A regular expression (datetime) is used to match RFC3339-formatted dates in the header row. The date part (YYYY-MM-DD) of these timestamps is used for grouping.
- The csv2days tool currently implements a "max" aggregation strategy: for the set of values corresponding to a single day, the maximum numerical value is chosen. If non-numerical values are encountered, the first value in the sequence is typically used.
- Columns to be dropped (skipCols) are sorted in reverse order. This is crucial because removing an element from a slice shifts the indices of subsequent elements. Processing removals from right-to-left (largest index to smallest) ensures that the indices remain valid throughout the removal process.

Workflow:
The main workflow within transformCSV can be visualized as follows:
Read Input CSV File (--in flag)
|
v
Parse Header Row
|
+----------------------------------------------------------------------+
| Identify Timestamp Columns (using RFC3339 regex) |
| For each timestamp: |
| Extract Date (YYYY-MM-DD) |
| If new date: |
| Add Date to Output Header |
| Record current column as start of a new "run" for this day |
| Else (same date as previous timestamp): |
| Mark current column for skipping (`skipCols`) |
| Increment length of current day's "run" (`runLengths`) |
| |
| Non-timestamp columns are added to Output Header as-is |
+----------------------------------------------------------------------+
|
v
Write Transformed Header to Output
|
v
Sort `skipCols` in Reverse Order
|
v
For each Data Row in Input CSV:
|
+----------------------------------------------------------------------+
| Apply "Max" Aggregation: |
| For each "run" of columns belonging to the same day (from header): |
| Find the maximum numerical value in the corresponding cells |
| Replace the first cell of the run with this max value |
+----------------------------------------------------------------------+
|
v
Remove Skipped Columns (based on `skipCols` from header processing)
|
v
Write Transformed Data Row to Output
|
v
Flush Output Buffer
main.go: This is the heart of the module.
- main(): handles command-line flag parsing (--in for the input CSV file). It orchestrates the reading of the input file and calls transformCSV to perform the core logic. Error handling and logging are also managed here.
- transformCSV(input io.Reader, output io.Writer) error: the core function responsible for the CSV transformation.
  - Uses csv.Reader for input and csv.Writer for output.
  - A compiled regular expression (datetime = regexp.MustCompile(...)) is used to identify columns containing RFC3339 timestamps.
  - Tracks lastDate to detect when a new day starts in the header sequence.
  - skipCols (a slice of integers) stores the indices of columns that represent subsequent entries for an already-seen day and should thus be removed from the data rows.
  - runLengths (a map of int to int) stores, for each column that starts a sequence of same-day entries, how many columns belong to that day. This is used later for aggregation. For example, if columns 5, 6, and 7 are all for "2023-01-15", runLengths[5] would be 3.
  - The output header (outHeader) is constructed by keeping the date part (YYYY-MM-DD) for the first occurrence of each day and omitting subsequent columns for the same day. Non-date columns are passed through unchanged.
- applyMaxToRuns(s []string, runLengths map[int]int) []string: for each "run" of columns identified in the header as belonging to the same day, this function takes the corresponding values from the current data row and replaces the value in the first column of that run with the maximum of those values. The max(s []string) string helper function is used here to find the maximum float value, falling back to the first string if parsing fails.
- removeAllIndexesFromSlices(s []string, skipCols []int) []string: after aggregation, this function removes the data cells corresponding to the skipCols identified during header processing. It uses removeValueFromSliceAtIndex repeatedly. It's crucial that skipCols is sorted in reverse order for this to work correctly.
- removeValueFromSliceAtIndex(s []string, index int) []string: a utility to remove an element at a specific index from a string slice.
- max(s []string) string: iterates through a slice of strings, attempts to parse them as floats, and returns the string representation of the maximum float found. If no floats are found or parsing errors occur, it defaults to returning the first string in the input slice. This function underpins the aggregation logic. (A sketch of these helpers follows below.)

main_test.go: Contains unit tests for the transformCSV function. TestTransformCSV_HappyPath provides a simple input CSV string and the expected output string, calls transformCSV with these, and asserts that the actual output matches the expected output. This serves as a concrete example of the module's behavior.

BUILD.bazel: Defines how the csv2days Go binary and its associated library and tests are built using Bazel. It specifies source files, dependencies (like skerr, sklog, util), and visibility.

The design decision to use strconv.ParseFloat and handle potential errors by continuing or defaulting implies that the tool is somewhat lenient with non-numeric data in columns expected to be numeric. The "max" operation will effectively ignore non-convertible strings unless all strings in a run are non-convertible, in which case the first string is chosen.
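A minimal, self-contained sketch (not the module's actual code) of the two helpers described above: the "max" aggregation over a run of cells, and removal of columns by reverse-sorted indices.

package main

import (
	"fmt"
	"sort"
	"strconv"
)

// maxOf returns the string with the largest float value, falling back to
// the first element when nothing parses as a float (assumes non-empty input).
func maxOf(s []string) string {
	best, bestVal, found := s[0], 0.0, false
	for _, v := range s {
		f, err := strconv.ParseFloat(v, 64)
		if err != nil {
			continue
		}
		if !found || f > bestVal {
			best, bestVal, found = v, f, true
		}
	}
	return best
}

// removeIndexes drops the given column indices; they are processed in
// reverse order so earlier removals don't shift later indices.
func removeIndexes(row []string, skipCols []int) []string {
	sort.Sort(sort.Reverse(sort.IntSlice(skipCols)))
	for _, i := range skipCols {
		row = append(row[:i], row[i+1:]...)
	}
	return row
}

func main() {
	fmt.Println(maxOf([]string{"1.5", "3.2", "2.0"}))                     // 3.2
	fmt.Println(removeIndexes([]string{"a", "b", "c", "d"}, []int{1, 3})) // [a c]
}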
The demo module provides the necessary data and tools to showcase the capabilities of the Perf performance monitoring system. Its primary purpose is to offer a tangible and reproducible example of how Perf ingests and processes performance data. This allows users and developers to understand Perf's functionality without needing to set up a complex real-world data pipeline.
The core of this module revolves around a set of pre-generated data files and a Go program to create them.
Key Components:
/demo/data/ (Directory): This directory houses the actual demo data files in JSON format. Each file represents performance measurements associated with a specific commit hash.
- The files conform to the format.Format specification (defined in perf/go/ingest/format), which Perf understands. This allows for a simple and direct way to feed data into Perf for demonstration purposes.
- Each file (e.g., demo_data_commit_1.json) contains a git_hash, a key (identifying the test environment like architecture and configuration), and results. The results section includes measurements for various tests (like "encode" and "decode") across different units (like "ms" and "kb"). Some files also include links which can point to external resources relevant to the data point or the overall commit. The data in these files is designed to show some variation over commits to demonstrate Perf's ability to track changes and detect regressions/improvements. For instance, the decode and encodeMemory measurements show a deliberate shift in values starting from demo_data_commit_6.json.

generate_data.go: This Go program is responsible for creating the JSON data files located in the /demo/data/ directory.
- Having a generator makes it easy to regenerate or adapt the demo data if format.Format evolves. It ensures the demo data remains relevant and can be adapted.
- It uses commit hashes from the skia-dev/perf-demo-repo repository, establishing a direct link between the performance data and a version control history, a common scenario in real-world Perf usage.
- It generates measurements for several tests (encode, decode, encodeMemory). The generation includes some randomness (rand.Float32()) to make the data appear more realistic.
- A deliberate change in the data generation logic is introduced for commits at index 5 and onwards (e.g., multiplier = 1.2), which leads to a noticeable shift in decode and encodeMemory values in the corresponding JSON files. This is done to demonstrate how Perf can track and visualize such changes.
- It populates a format.Format struct (from go.skia.org/infra/perf/go/ingest/format) with the generated data, including the Git hash, environment keys, and the measurement results.
- The format.Format struct is then marshaled into JSON with indentation for readability.
- Finally, the JSON data is written to a file named according to the commit sequence (e.g., demo_data_commit_1.json) within the data subdirectory. The program uses the runtime.Caller(0) function to determine its own location, ensuring that the data directory is created relative to the Go file itself, making the script more portable.

Workflow for Demo Data Usage:
generate_data.go --(generates)--> /demo/data/*.json files
|
V
Perf Ingester (type 'dir', configured to read from /demo/data/)
|
V
Perf System (stores, analyzes, and visualizes the data)
The demo data is specifically designed to be used in conjunction with the perf/configs/demo.json configuration file and the https://github.com/skia-dev/perf-demo-repo.git repository. This linkage provides a complete, albeit simplified, end-to-end scenario for demonstrating Perf.
This main module, located at /go, serves as the root for all Go language components of the Perf performance monitoring system. It encompasses a wide array of functionalities, from data ingestion and storage to analysis, alerting, and user interface backend logic. The design promotes modularity, with specific responsibilities delegated to sub-modules.
The system is designed to handle large volumes of performance data, track it against code revisions, detect regressions automatically, and provide tools for developers and performance engineers to investigate and manage performance.
- Modularity: functionality is split into sub-modules (e.g., /go/alerts, /go/ingest, /go/regression, /go/frontend), each with a well-defined responsibility. This promotes separation of concerns, making the system easier to develop, test, and maintain.
- Interfaces: key components are defined as interfaces (e.g., tracestore.Store, alerts.Store, regression.Store). This allows for different implementations to be swapped in (e.g., SQL-based stores vs. in-memory mocks for testing) and promotes loose coupling.
- Configuration: the /go/config module defines a comprehensive InstanceConfig structure, which is loaded from a JSON file. This configuration dictates many aspects of an instance's behavior, including database connections, data sources, alert settings, and UI features. This allows for flexible deployment and customization of Perf instances.
- Long-running tasks: the /go/progress module provides a mechanism for tracking and reporting the status of such tasks to the UI.
- Workflows: the /go/workflows module utilizes Temporal to orchestrate complex, multi-step processes like triggering bisections and processing their results. Temporal provides resilience and fault tolerance for these critical operations.
- SQL storage: the database holds alert configurations (/go/alerts), regression details (/go/regression), commit information (/go/git), user favorites (/go/favorites), subscriptions (/go/subscription), and more. The /go/sql module manages the database schema.
- Trace storage (/go/tracestore): performance trace data is stored in a tiled fashion, with inverted indexes to allow for efficient querying. This specialized storage approach is optimized for time-series performance metrics.
- File access: the /go/file and /go/filestore modules provide abstractions for interacting with ingested files.
- Caching: used in several places (/go/git, /go/progress), with /go/tracecache for trace IDs. The /go/psrefresh module manages caching of ParamSets (used for UI query builders), potentially using Redis (/go/redis). /go/graphsshortcut offers an in-memory cache for graph shortcuts, especially for development.
- Git integration: the /go/git module interacts with Git repositories (via local CLI or Gitiles API) to fetch commit information.
- Issue tracking: /go/issuetracker and /go/culprit integrate with issue tracking systems (e.g., Buganizer) for automated bug filing.
- Chrome Perf integration: the /go/chromeperf module allows communication with the Chrome Performance Dashboard for reporting regressions or fetching anomaly data.
- Bisection: the /go/pinpoint module provides a client for the Pinpoint bisection service.
- Sheriffing: the /go/sheriffconfig module integrates with LUCI Config for managing alert configurations.
- /go/perfserver: the main executable for running different Perf services (frontend, ingestion, clustering, maintenance).
- /go/perf-tool: a CLI for various administrative and data inspection tasks.
- /go/initdemo: a tool to initialize a database for demo or development.
- /go/ts: a utility to generate TypeScript definitions from Go structs for frontend type safety.

Data Ingestion:
External Data Source (e.g., GCS event)
|
V
/go/file (Source Interface: DirSource, GCSSource) --> Raw File Data
|
V
/go/ingest/process (Orchestrator)
|
+--> /go/ingest/parser (Parses file based on /go/ingest/format) --> Extracted Traces & Metadata
|
+--> /go/git (Resolves Git hash to CommitNumber)
|
V
/go/tracestore (Writes traces, updates inverted index & ParamSets)
|
V
/go/ingestevents (Publishes event: "File Ingested")
Regression Detection (Event-Driven Example):
/go/ingestevents (Receives "File Ingested" event)
|
V
/go/regression/continuous (Controller)
|
+--> /go/alerts (Loads matching Alert configurations)
|
+--> /go/dfiter & /go/dataframe & /go/dfbuilder (Prepare DataFrames for analysis)
|
V
/go/regression/detector (Core detection logic)
|
+--> /go/clustering2 (KMeans clustering)
|
+--> /go/stepfit (Individual trace step detection)
|
V
Detected Regressions
|
+--> /go/regression (Store results using Store interface, e.g., sqlregression2store)
|
+--> /go/notify (Format & send notifications via Email, IssueTracker, Chromeperf)
|
+--> /go/workflows (MaybeTriggerBisectionWorkflow for potential bisection)
User Interaction (Frontend Request for Graph):
User in Browser (Requests graph)
|
V
/go/frontend (HTTP Handlers, e.g., graphApi)
|
+--> /go/ui/frame (ProcessFrameRequest)
| |
| +--> /go/dataframe/dfbuilder (Builds DataFrame based on query)
| | |
| | +--> /go/tracestore (Fetch trace data)
| | +--> /go/git (Fetch commit data)
| |
| +--> /go/calc (If formulas are used)
| |
| +--> /go/pivot (If pivot table requested)
| |
| +--> /go/anomalies (Fetch anomaly data to overlay)
|
V
FrameResponse (JSON data for UI) --> User in Browser
Automated Bisection via Temporal Workflow:

/go/workflows.MaybeTriggerBisectionWorkflow (Triggered by significant regression)
  |
  +--> Waits for related anomalies to group
  |
  +--> /go/anomalygroup (Loads anomaly group details)
  |
  +--> If GroupAction == BISECT:
  |      |
  |      +--> /go/gerrit (Activity: Get commit hashes from positions)
  |      |
  |      +--> Executes Pinpoint.CulpritFinderWorkflow (Child Workflow)
  |      |      (Pinpoint performs bisection)
  |      V
  |    Pinpoint calls back to /go/workflows.ProcessCulpritWorkflow
  |      |
  |      +--> /go/culprit (Activity: Persist culprit & Notify user)
  |
  +--> If GroupAction == REPORT:
         |
         +--> /go/culprit (Activity: Notify user of anomaly group)
Other notable modules (summarized):
- /go/alertfilter: alert filter constants (ALL, OWNER). Ensures consistent filter definitions.
- /go/alerts: Alert configurations, their storage (sqlalertstore), and efficient retrieval (ConfigProvider with caching). Defines how performance regressions are detected.
- /go/config: the InstanceConfig structure (loaded from JSON) that governs a Perf instance.
- /go/dataframe: the DataFrame structure for handling performance data in a tabular, commit-centric way, inspired by R's dataframes.
- /go/dfbuilder: builds DataFrame objects from the TraceStore, handling query logic and data aggregation.
- /go/dfiter: iterates over DataFrames, typically by slicing a larger fetched frame. Used in regression detection.
- /go/file: File and Source interfaces for abstracting file access from different origins (local, GCS via Pub/Sub).
- /go/filestore: implements fs.FS for local and GCS file access, providing a unified way to read files.
- /go/tracestore: the tiled, SQL-backed TraceStore.
- /go/pivot: aggregates a DataFrame based on specified grouping criteria (like pivot tables).
- /go/psrefresh: caches paramtools.ParamSet instances (used for UI query builders) to improve performance.
- /go/tracesetbuilder: builds TraceSet and ReadOnlyParamSet objects from multiple, potentially disparate chunks of trace data using a worker pool.
- /go/types: core domain types (CommitNumber, TileNumber, Trace).

This comprehensive suite of modules works together to provide the Skia Perf performance monitoring system.
This module, go/alertfilter, provides constants that define different filtering modes for alerts. These constants are used throughout the Perf application to control which alerts are displayed or processed.
The primary motivation behind this module is to centralize the definition of alert filtering options. By having these constants in a dedicated module, we avoid scattering magic strings like “ALL” or “OWNER” throughout the codebase. This improves maintainability, reduces the risk of typos, and makes it easier to understand and modify the filtering logic. If new filtering modes are needed in the future, they can be added here, providing a single source of truth.
Key Components/Files:
alertfilter.go: This is the sole file in this module. It defines the string constants used for alert filtering.
- ALL: This constant represents a filter that includes all alerts, irrespective of their owner or other properties. It is used when a user or a system process needs to view or operate on the entire set of active alerts.
- OWNER: This constant represents a filter that includes only alerts assigned to a specific owner. This is crucial for user-specific views where individuals only want to see alerts relevant to their responsibilities.

Workflow/Usage Example:
Imagine a user interface for viewing alerts. The user might have a dropdown to select how they want to filter the alerts.
User Interface:
[Alert List]
Filter: [Dropdown: "ALL", "OWNER"]
Backend Logic:
func GetAlerts(filterMode string, userID string) []Alert {
    if filterMode == alertfilter.ALL {
        // Fetch all alerts from the database.
        return database.GetAllAlerts()
    } else if filterMode == alertfilter.OWNER {
        // Fetch alerts owned by the current user.
        return database.GetAlertsByOwner(userID)
    }
    // ... other filter modes or error handling; default to no alerts.
    return nil
}
In this scenario, the backend uses the constants from the alertfilter module to determine the correct query to execute against the database. This ensures consistency and clarity in how filtering is applied.
The /go/alerts module is responsible for managing alert configurations within the Perf application. These configurations define the conditions under which users or systems should be notified about performance regressions. The module handles the definition, storage, retrieval, and caching of these alert configurations.
A core design principle is the separation of concerns between defining an alert's structure (config.go), providing access to these configurations (configprovider.go), and persisting them (store.go and its SQL implementation in sqlalertstore). This modularity allows for flexibility in how alerts are stored (e.g., potentially different database backends) and accessed.
Key Components and Responsibilities:
config.go: This file defines the Alert struct, which is the central data structure representing a single alert configuration.
The Alert struct includes fields for:
- IDAsString: a string representation of the alert's unique identifier. This is used for JSON serialization to avoid potential issues with large integer handling in JavaScript. The BadAlertID and BadAlertIDAsAsString constants represent an invalid/uninitialized ID.
- Query: a URL-encoded string that defines the criteria for selecting traces from the performance data.
- GroupBy: a comma-separated list of parameter keys. If specified, the Query is expanded into multiple sub-queries, one for each unique combination of values for the GroupBy keys found in the data. This allows for more granular alerting. The GroupCombinations and QueriesFromParamset methods handle this expansion.
- Alert: the email address for notifications.
- IssueTrackerComponent: the ID of the issue tracker component to file bugs against. A custom SerializesToString type is used for this field to handle JSON serialization of the int64 component ID as a string, with 0 serializing to "".
- DirectionAsString: specifies whether to alert on upward (UP), downward (DOWN), or both (BOTH) changes in performance. This replaces the deprecated StepUpOnly boolean.
- StateAsString: indicates if the alert is ACTIVE or DELETED. This is managed internally and affects whether an alert is processed.
- Action: defines what action to take when an anomaly is detected (e.g., types.AlertActionReport, types.AlertActionBisect).
- Interesting, Algo, Step, Radius, K, Sparse, MinimumNum, and Category control the specifics of regression detection and reporting.
- The file also defines the Direction and ConfigState types and helper functions for ID conversion and validation (Validate). The Validate function ensures consistency, for example, that GroupBy keys do not also appear in the main Query.

store.go: This file defines the Store interface, which abstracts the persistence mechanism for Alert configurations.
The Store interface specifies methods for:
- Save: saving a new or updating an existing alert. It takes a SaveRequest which includes the Alert configuration and an optional SubKey (linking the alert to a subscription).
- ReplaceAll: atomically replacing all existing alerts with a new set. This is useful for bulk updates, often tied to configuration subscriptions. It requires a pgx.Tx to ensure transactional integrity.
- Delete: marking an alert as deleted.
- List: retrieving alerts, with an option to include deleted ones. Alerts are typically sorted by DisplayName.
- ListForSubscription: retrieving all active alerts associated with a specific subscription name.

configprovider.go: This file implements a ConfigProvider that serves Alert configurations, incorporating a caching layer.
- Hitting the Store for every request would be inefficient, so results are cached.
- configProviderImpl implements the ConfigProvider interface.
- It maintains two caches (cache_active for active alerts and cache_all for all alerts including deleted ones) using the configCache struct.
- On construction (NewConfigProvider), it performs an initial refresh and starts a background goroutine that periodically calls Refresh to update the caches from the Store.
- GetAllAlertConfigs and GetAlertConfig serve data from these caches.
- A sync.RWMutex is used to protect concurrent access to the caches.
- The Refresh method explicitly fetches data from the alertStore and updates both caches.

Submodule sqlalertstore: This submodule provides a SQL-based implementation of the alerts.Store interface.
- sqlalertstore.go: the SQLAlertStore struct holds a database connection pool (pool.Pool) and a map of SQL statements.
- Alerts are stored as serialized JSON in the Alerts table (schema defined in sqlalertstore/schema/schema.go). This simplifies schema evolution of the Alert struct itself, as changes to the struct don't always require immediate SQL schema migrations, though it makes querying based on specific alert fields harder directly in SQL.
- Save: for new alerts (ID is BadAlertIDAsAsString), it performs an INSERT and retrieves the generated ID. For existing alerts, it performs an UPSERT (or an INSERT ... ON CONFLICT DO UPDATE for Spanner).
- Delete: marks an alert as deleted by setting its config_state to 1 (representing alerts.DELETED) and updating last_modified. It doesn't physically remove the row.
- ReplaceAll: within a transaction, it first marks all existing active alerts as deleted, then inserts the new set of alerts.
- List and ListForSubscription: query the Alerts table, deserialize the JSON alert column into alerts.Alert structs, and sort them by DisplayName.
- spanner.go: contains Spanner-specific SQL statements. This is necessary because CockroachDB and Spanner have slightly different SQL syntax for certain operations like UPSERTs and RETURNING clauses. The correct set of statements is chosen in sqlalertstore.New based on the dbType.
- sqlalertstore/schema/schema.go: defines the Go struct AlertSchema representing the Alerts table in the SQL database. Key fields include id, alert (TEXT, storing the JSON-serialized alerts.Alert), config_state (INT), last_modified (INT, Unix timestamp), sub_name, and sub_revision.

Key Workflows:
Creating/Updating an Alert:
1. A client or service constructs an alerts.Alert struct.
2. alerts.Store.Save() is called.
3. sqlalertstore.Save() serializes the Alert to JSON.
4. If IDAsString is BadAlertIDAsAsString, an INSERT statement is executed, and the new ID is populated back into the Alert struct.
5. Otherwise, an UPSERT or INSERT ... ON CONFLICT DO UPDATE statement is executed.
6. The ConfigProvider's cache will eventually be updated during its next refresh cycle.

[Client/Service] -- Alert Data --> [alerts.Store.Save()]
|
v
[sqlalertstore.Save()] -- Serializes Alert to JSON --> [Database]
| (If new, DB returns ID)
<---------------------------------------
| (Updates Alert struct with ID)
v
[ConfigProvider.Refresh() periodically] --> [alerts.Store.List()]
|
v
[sqlalertstore.List()] --> [Database]
| (Reads & deserializes)
v
[ConfigProvider Cache Update]
Retrieving All Active Alerts:
1. A service calls alerts.ConfigProvider.GetAllAlertConfigs(ctx, false).
2. configProviderImpl.GetAllAlertConfigs() checks its cache_active and, on a hit, returns the cached []*Alert.
3. On a miss (or during a periodic Refresh call), the cache is populated by calling alerts.Store.List(ctx, false), which calls sqlalertstore.List(ctx, false).
4. sqlalertstore queries the database for alerts where config_state = 0 (ACTIVE), deserializes them, and returns the list.

[Service] -- Request All Active Alerts --> [ConfigProvider.GetAllAlertConfigs(includeDeleted=false)]
| (Checks cache_active)
|
+-- [Cache Hit] ----> Returns cached []*Alert
|
+-- [Cache Miss/Stale (via periodic Refresh)]
|
v
[alerts.Store.List(includeDeleted=false)]
|
v
[sqlalertstore.List(includeDeleted=false)] -- SQL Query (WHERE config_state=0) --> [Database]
| (Reads & deserializes)
v
[Updates & Returns from Cache]
Expanding GroupBy Queries:
1. When an Alert with a GroupBy clause is processed (e.g., by the regression detection system), Alert.QueriesFromParamset(paramset) is called.
2. Alert.GroupCombinations(paramset) is invoked to find all unique combinations of values for the keys specified in GroupBy from the provided paramtools.ReadOnlyParamSet.
3. A query is generated for each combination by taking the base Alert.Query and appending the key-value pairs from the combination. A small sketch of this expansion follows the diagram.

[Alert Processing System] -- Has Alert with GroupBy="config,arch", Query="metric=latency" & ParamSet --> [Alert.QueriesFromParamset()]
|
v
[Alert.GroupCombinations()]
| (e.g., finds {config:A, arch:X}, {config:B, arch:X})
v
[Generates specific queries:]
- "metric=latency&config=A&arch=X"
- "metric=latency&config=B&arch=X"
|
<-- Returns []string (list of queries)
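A minimal, self-contained sketch of that expansion, using plain maps rather than the real paramtools and Alert types:

package main

import "fmt"

// expand appends every combination of GroupBy key values to a base query.
func expand(baseQuery string, paramset map[string][]string, groupBy []string) []string {
	queries := []string{baseQuery}
	for _, key := range groupBy {
		var next []string
		for _, q := range queries {
			for _, value := range paramset[key] {
				next = append(next, fmt.Sprintf("%s&%s=%s", q, key, value))
			}
		}
		queries = next
	}
	return queries
}

func main() {
	ps := map[string][]string{
		"config": {"A", "B"},
		"arch":   {"X"},
	}
	for _, q := range expand("metric=latency", ps, []string{"config", "arch"}) {
		fmt.Println(q)
	}
	// metric=latency&config=A&arch=X
	// metric=latency&config=B&arch=X
}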
The use of SerializesToString for IssueTrackerComponent highlights a common challenge when interfacing Go backend systems with JavaScript frontends: JavaScript's limitations with handling large integer IDs. Serializing them as strings is a robust workaround.
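A minimal sketch of that pattern, assuming an illustrative int64String type rather than the actual SerializesToString implementation:

package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// int64String marshals to a JSON string so JavaScript clients never lose
// precision on large IDs; zero marshals to the empty string.
type int64String int64

func (i int64String) MarshalJSON() ([]byte, error) {
	if i == 0 {
		return []byte(`""`), nil
	}
	return json.Marshal(strconv.FormatInt(int64(i), 10))
}

func main() {
	b, _ := json.Marshal(struct {
		Component int64String `json:"issue_tracker_component"`
	}{Component: 1234567890123456789})
	fmt.Println(string(b)) // {"issue_tracker_component":"1234567890123456789"}
}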
The existence of a mock subdirectory with generated mocks for Store and ConfigProvider (using stretchr/testify/mock) is standard Go practice, facilitating unit testing of components that depend on these interfaces without needing a real database or complex setup.
The /go/anomalies module is responsible for retrieving anomaly data. Anomalies represent significant deviations in performance metrics. This module acts as an intermediary between the application and the chromeperf service, which is the source of truth for anomaly data. It provides an abstraction layer, potentially including caching, to optimize anomaly retrieval.
1. anomalies.go:
Defines the Store interface. This interface dictates the contract for any component that aims to provide anomaly data. It ensures that different implementations (e.g., a cached store or a direct passthrough store) can be used interchangeably. Its methods are:
- GetAnomalies: retrieves anomalies for a list of trace names within a specific commit position range. This is useful for analyzing performance regressions or improvements tied to code changes.
- GetAnomaliesInTimeRange: fetches anomalies within a given time window. This is helpful for time-based analysis, independent of specific commit versions.
- GetAnomaliesAroundRevision: finds anomalies that occurred near a particular revision (commit). This helps pinpoint performance changes related to a specific code submission.

2. impl.go:
A non-caching implementation of the Store interface. It directly forwards requests to the chromeperf.AnomalyApiClient, acting as a thin wrapper around the chromeperf service. It can be used when caching is not desired or not yet implemented. Each method of the store struct (the implementation of Store) makes a corresponding call to the ChromePerf client; for example, GetAnomalies calls ChromePerf.GetAnomalies. Error handling is included to log failures from the chromeperf service. Trace names are sorted before being passed to chromeperf, which might be a requirement or an optimization for the chromeperf API.

3. /go/anomalies/cache/cache.go:
A caching implementation of the Store interface. This is designed to improve performance by reducing the number of direct calls to the chromeperf service, which can be network-intensive.
- Two LRU caches are used: testsCache for anomalies queried by trace names and commit ranges, and revisionCache for anomalies queried around a specific revision. LRU ensures that the least accessed items are evicted when the cache reaches its cacheSize limit.
- Entries expire after cacheItemTTL. A periodic cleanupCache goroutine removes entries older than this TTL. This ensures that stale data doesn't persist indefinitely.
- invalidationMap: this map tracks trace names for which anomalies have been modified (e.g., an alert was updated). If a trace name is in this map, any cached anomalies for that trace are considered invalid and will be re-fetched from chromeperf. The invalidationMap itself is cleared periodically (invalidationCleanupPeriod) to prevent it from growing too large. This is a trade-off: it's simpler and has lower memory overhead but can lead to inaccuracies if a trace is invalidated and then the map is cleared before the next fetch for that trace.
- A numEntriesInCache metric monitors cache utilization.

Method behavior (the store struct in cache.go):
- GetAnomalies: first checks testsCache and the invalidationMap (a trace marked invalid is treated as a cache miss); traces missing from the cache are fetched via as.ChromePerf.GetAnomalies, and testsCache is updated with the newly fetched data.

Client Request (traceNames, startCommit, endCommit)
  |
  v
[Cache Store] -- GetAnomalies()
  |
  +---------------------------------+
  | For each traceName:             |
  |  1. Check testsCache            | ----> Cache Hit? -----> Add to Result
  |     (Key: trace:start:end)      |
  |  2. Check invalidationMap       |       No (Cache Miss or Invalidated)
  +---------------------------------+
  |
  | (traceNamesMissingFromCache)
  v
[ChromePerf Client] -- GetAnomalies()
  |
  v
[Cache Store] -- Add new data to testsCache
  |
  v
Return Combined Result

- GetAnomaliesInTimeRange: this method currently bypasses the cache and directly calls as.ChromePerf.GetAnomaliesTimeBased. The decision not to cache time-based queries might be due to the potentially large and less frequently reused nature of such requests, or it might be a feature planned for later.
- GetAnomaliesAroundRevision: similar to GetAnomalies, it first checks revisionCache. If it's a miss, it fetches from as.ChromePerf.GetAnomaliesAroundRevision and updates the cache.
- InvalidateTestsCacheForTraceName: adds a traceName to the invalidationMap. This is likely called when an external event (e.g., a user updating an anomaly in Chrome Perf) indicates that the cached data for this trace is no longer accurate.

4. /go/anomalies/mock/Store.go:
A mock implementation of the Store interface, generated using the testify/mock library. It allows components that depend on anomalies.Store to be tested in isolation, without needing a real chromeperf instance or a fully functional cache. Developers can define expected calls and return values for the mock store. The mock.Mock struct from stretchr/testify is embedded, providing methods like On(), Return(), and AssertExpectations() to control and verify the mock's behavior during tests.

Design choices:
- Interface-based design (anomalies.Store): this is a common and robust pattern in Go. It allows for flexibility in how anomalies are fetched and managed. For example, a new caching strategy or a different backend data source could be implemented without affecting code that consumes anomalies, as long as the new implementation adheres to the Store interface.
- Caching strategy (cache.go):
  - invalidationMap: a pragmatic approach to handling external data modifications. While not perfectly accurate (it invalidates all anomalies for a trace even if only one changed, and is susceptible to the invalidationCleanupPeriod timing), it's simpler and less memory-intensive than more granular invalidation schemes. This suggests a balance was struck between accuracy, complexity, and resource usage.
  - Separate caches (testsCache, revisionCache): likely done because the query patterns and cache keys for these two types of requests are different. testsCache uses a composite key (traceName:startCommit:endCommit), while revisionCache uses the revision number as the key.
- Error handling: failures from chromeperf are logged, but the methods often return an empty AnomalyMap or nil slice to the caller in case of an error from the underlying service. This design choice means that callers might receive no data instead of an error, simplifying the caller's error handling logic but potentially obscuring issues if not monitored through logs.
- Trace name sorting: before calling chromeperf.GetAnomalies or chromeperf.GetAnomaliesTimeBased, the list of traceNames is sorted. This could be a requirement of the chromeperf API for deterministic behavior, or an optimization to improve chromeperf's internal processing or caching.
- Tracing (go.opencensus.io/trace): spans are added to some methods (GetAnomaliesInTimeRange, GetAnomaliesAroundRevision). This is crucial for observability, allowing developers to track the performance and flow of requests through the system, especially in a distributed environment.

Typical Anomaly Retrieval (with Cache):
Typical Anomaly Retrieval (with Cache):

A caller invokes one of the GetAnomalies* methods on an anomalies.Store instance (which is likely the cached store from cache.go). The store first checks its internal LRU cache(s) (testsCache or revisionCache) for the requested data. For GetAnomalies, it also consults the invalidationMap to see if any relevant traces have been marked as stale.

Cache hit:

    Caller -> anomalies.Store.GetAnomalies(traces, range)
      |
      v
    Cache.GetAnomalies()
      |
      +--> Check testsCache (e.g., trace1:100:200) -> Found & Valid
      |
      +--> Check testsCache (e.g., trace2:100:200) -> Not Found or Invalid
      |
    Return cached data for trace1

Cache miss: the store makes a network request to the chromeperf.AnomalyApiClient.
- The response from chromeperf is received.
- This new data is added to the LRU cache for future requests.
- The data is returned to the caller.

    Caller -> anomalies.Store.GetAnomalies(traces, range)
      |
      v
    Cache.GetAnomalies()
      |
      +--> Check testsCache (e.g., trace1:100:200) -> Found & Valid
      |
      +--> Check testsCache (e.g., trace2:100:200) -> Not Found or Invalid
      |          |
      |          | (Data for trace1)
      |          v
      +---------------------------> [ ChromePerf API ] -- GetAnomalies(trace2, range)
                 |
                 v
           Cache.Add(trace2_data)
                 |
                 v
           Combine trace1_data & trace2_data
                 |
                 v
           Return to Caller

Cache Invalidation Workflow:
1. An external event (e.g., an anomaly being updated in Chrome Perf) triggers a call to cache.store.InvalidateTestsCacheForTraceName(ctx, "affected_trace_name").
2. affected_trace_name is added to the invalidationMap in the cache.store.
3. On the next GetAnomalies call for affected_trace_name: even if testsCache contains an entry for this trace and range, the presence of affected_trace_name in invalidationMap causes a cache miss and the data is re-fetched from chromeperf.
4. The invalidationMap entry for affected_trace_name typically remains until the invalidationMap is periodically cleared.

This module effectively decouples the rest of the Perf application from the direct complexities of interacting with chromeperf for anomaly data, offering performance benefits through caching and a consistent interface for data retrieval.
The anomalygroup module is designed to group related anomalies (regressions in performance metrics) together. This grouping allows for consolidated actions like filing a single bug report for multiple related regressions or triggering a single bisection job to find the common culprit for a set of anomalies. This approach aims to reduce noise and improve the efficiency of triaging performance regressions.
The core idea is to identify anomalies that share common characteristics, such as the subscription (alert configuration), benchmark, and commit range. When a new anomaly is detected, the system attempts to find an existing group that matches these criteria. If a suitable group is found, the new anomaly is added to it. Otherwise, a new group is created.
The module defines a gRPC service for managing anomaly groups, a storage interface for persisting group data, and utilities for processing regressions and interacting with the grouping logic.
store.go: Anomaly Group Storage Interface

The store.go file defines the Store interface, which outlines the contract for persisting and retrieving anomaly group data. This abstraction allows for different storage backends (e.g., SQL databases) to be used.
Key Responsibilities:
The Store interface ensures that the core logic for anomaly grouping is decoupled from the specific implementation of data persistence.
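Based on the operations described for the SQL implementation below, the Store contract can be pictured roughly as follows. This is a sketch only: exact parameter and return types in store.go may differ.

```go
// Illustrative sketch of the anomalygroup Store contract, inferred from the
// operations described for the SQL implementation; simplified types.
package anomalygroup

import "context"

type Store interface {
	// Create persists a new group and returns its ID.
	Create(ctx context.Context, subscriptionName, subscriptionRevision, domain, benchmark string,
		startCommit, endCommit int64, action string) (string, error)
	// LoadById retrieves a single group by its ID.
	LoadById(ctx context.Context, groupID string) (*AnomalyGroup, error)
	// Mutations on an existing group.
	UpdateBisectID(ctx context.Context, groupID, bisectionID string) error
	UpdateReportedIssueID(ctx context.Context, groupID, issueID string) error
	AddAnomalyID(ctx context.Context, groupID, anomalyID string) error
	AddCulpritIDs(ctx context.Context, groupID string, culpritIDs []string) error
	// FindExistingGroup looks for groups a new anomaly may belong to.
	FindExistingGroup(ctx context.Context, subscriptionName, subscriptionRevision, domain, benchmark string,
		startCommit, endCommit int64, action string) ([]*AnomalyGroup, error)
}

// AnomalyGroup is a simplified placeholder for the stored record.
type AnomalyGroup struct {
	GroupID         string
	AnomalyIDs      []string
	CulpritIDs      []string
	ReportedIssueID string
	BisectionID     string
}
```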
sqlanomalygroupstore/sqlanomalygroupstore.go: SQL-backed Anomaly Group Store

This file provides a concrete implementation of the Store interface using a SQL database (specifically designed with CockroachDB and Spanner in mind).
Implementation Details:
- Schema: Defined in sqlanomalygroupstore/schema/schema.go. It includes fields for the group ID, creation time, list of anomaly IDs, metadata (stored as JSONB), common commit range, action type, and associated IDs for bisections, issues, and culprits.
- Create: Inserts a new row into the AnomalyGroups table. It takes parameters like subscription details, benchmark, commit range, and action, and stores them. The group metadata (subscription name, revision, domain, benchmark) is marshaled into a JSON string before insertion.
- LoadById: Selects an anomaly group from the database based on its ID. It retrieves core attributes of the group.
- UpdateBisectID, UpdateReportedIssueID, AddAnomalyID, AddCulpritIDs: These methods execute SQL UPDATE statements to modify specific fields of an existing anomaly group record. They handle array appends for lists like anomaly_ids and culprit_ids, with specific syntax considerations for different SQL databases (e.g., Spanner's COALESCE for array concatenation).
- FindExistingGroup: Constructs a SQL SELECT query with WHERE clauses to match the provided criteria (subscription, revision, domain, benchmark, commit range overlap, and action). This allows finding groups that a new anomaly might belong to.

Design Choices:
- Storing group_meta_data as JSONB provides flexibility in the metadata stored without requiring schema changes for minor additions.
- Storing anomaly_ids and culprit_ids as array types in the database is a natural way to represent lists of associated entities.
- Differences between CockroachDB and Spanner SQL syntax are handled with explicit dbType checks.

service/service.go: gRPC Service Implementation

This file implements the AnomalyGroupServiceServer interface defined by the protobuf definitions in proto/v1/anomalygroup_service.proto. It acts as the entry point for external systems to interact with the anomaly grouping functionality.
Responsibilities:
- Most RPC handlers delegate to the anomalygroup.Store interface. For example, CreateNewAnomalyGroup calls anomalygroupStore.Create.
- FindTopAnomalies Logic: This method involves more than a simple store passthrough. It fetches the group's regressions from regression.Store and ranks them by the size of the change (median_before to median_after), then converts the results into the ag.Anomaly protobuf message format, extracting relevant paramset values (bot, benchmark, story, measurement, stat).
- FindIssuesFromCulprits Logic: It queries culprit.Store to get the details of these culprits, then inspects each culprit's GroupIssueMap to find any issue IDs that are specifically associated with the given anomaly group ID. This allows correlation between a group (potentially containing multiple anomalies that led to a bisection) and the issues filed for the culprits found by that bisection.

Design Choices:
- The service takes anomalygroup.Store, culprit.Store, and regression.Store as dependencies, promoting testability and decoupling.
- A metric counter (newGroupCounter) is incremented whenever a new group is created, allowing for monitoring of the system's behavior.

proto/v1/anomalygroup_service.proto: Protocol Buffer Definitions

This file defines the gRPC service AnomalyGroupService and the message types used for requests and responses. This is the contract for how clients interact with the anomaly grouping system.
Key Messages:
- AnomalyGroup: Represents a group of anomalies, including its ID, the action to take, lists of associated anomaly and culprit IDs, reported issue ID, and metadata like subscription and benchmark names.
- Anomaly: Represents a single regression, including its start and end commit positions, a paramset (key-value pairs describing the test), improvement direction, and median values before and after the regression.
- GroupActionType: An enum defining the possible actions for a group (NOACTION, REPORT, BISECT).
- Request/response messages for each RPC (e.g., CreateNewAnomalyGroupRequest, FindExistingGroupsResponse).

Purpose:
notifier/anomalygroupnotifier.go: Anomaly Group Notifier

This component implements the notify.Notifier interface. It's invoked when a new regression is detected by the alerting system. Its primary role is to integrate the regression detection with the anomaly grouping logic.
Workflow when RegressionFound is called:
1. Extracts the paramset from the trace data.
2. Validates the paramset to ensure it contains required keys (e.g., master, bot, benchmark, test, subtest_1). This is important because the grouping and subsequent actions (like bisection) rely on these parameters.
3. Derives the testPath from the paramset. This path is used in finding or creating anomaly groups.
4. Calls grouper.ProcessRegressionInGroup (which eventually calls utils.ProcessRegression) to handle the grouping logic for this new regression.

Design Choices:
- It implements the standard notify.Notifier interface, allowing it to be plugged into the existing notification pipeline of the performance monitoring system.
- Delegation to AnomalyGrouper: It delegates the core grouping logic to an AnomalyGrouper instance (typically utils.AnomalyGrouperImpl). This keeps the notifier focused on the integration aspect.

utils/anomalygrouputils.go: Anomaly Grouping Utilities

This file contains the core logic for processing a new regression and integrating it into an anomaly group.
ProcessRegression Function - Key Steps:
1. Acquires a sync.Mutex (groupingMutex). This is a critical point: it aims to prevent race conditions when multiple regressions are processed concurrently, especially around creating new groups. However, the comment notes that with multiple containers, this mutex might not be sufficient and needs review.
2. Creates an AnomalyGroupServiceClient to communicate with the gRPC service.
3. Calls the FindExistingGroups gRPC method to see if the new anomaly fits into any current groups based on subscription, revision, action type, commit range overlap, and test path.
4. If no existing group is found:
   - Calls CreateNewAnomalyGroup to create a new group.
   - Calls UpdateAnomalyGroup to add the current anomalyID to this newly created group.
   - Triggers a Temporal Workflow: Initiates a MaybeTriggerBisection workflow. This workflow is responsible for deciding whether to start a bisection or file a bug based on the group's action type and other conditions.

    Regression Detected --> FindExistingGroups
      |
      +-- No Group Found --> CreateNewAnomalyGroup
                               --> UpdateAnomalyGroup (add anomaly)
                               --> Start Temporal Workflow (MaybeTriggerBisection)

5. If existing group(s) are found, then for each matching group:
   - Calls UpdateAnomalyGroup to add the current anomalyID to that group.
   - Calls FindIssuesToUpdate to determine if any existing bug reports (either the group's own ReportedIssueId or issues linked via culprits) should be updated with information about this new anomaly.
   - If issues are found, it uses the issuetracker to add a comment to each relevant issue.

    Regression Detected --> FindExistingGroups
      |
      +-- Group(s) Found --> For each group:
                               +-- UpdateAnomalyGroup (add anomaly)
                               +-- FindIssuesToUpdate --> If issues exist --> Add Comment to Issue(s)

FindIssuesToUpdate Function:
This helper determines which existing issue tracker IDs should be updated with information about a new anomaly being added to a group.
- If the group_action is REPORT and reported_issue_id is set on the group, that issue ID is returned.
- If the group_action is BISECT, it calls the FindIssuesFromCulprits gRPC method. This method looks up culprits associated with the group and then checks if those culprits have specific issues filed for them in the context of this particular group. This is important because a single culprit (commit) might be associated with multiple anomaly groups, and each might have its own context or bug report.

Design Choices:
The module extensively uses mocks for testing:
- mocks/Store.go: A mock implementation of the anomalygroup.Store interface, generated by testify/mock. Used in service/service_test.go.
- proto/v1/mocks/AnomalyGroupServiceServer.go: A mock for the gRPC server interface AnomalyGroupServiceServer, generated by testify/mock (with manual adjustments noted in the file). Used by clients or other services that might call this gRPC service.
- utils/mocks/AnomalyGrouper.go: A mock for the AnomalyGrouper interface, used in notifier/anomalygroupnotifier_test.go.

This approach allows for unit testing components in isolation by providing controlled behavior for their dependencies.
End-to-end flow:

1. A regression is detected and AnomalyGroupNotifier.RegressionFound is called.
2. The notifier extracts the paramset, validates it, and derives the testPath.
3. Grouping (utils.ProcessRegression):
   - Calls AnomalyGroupService.FindExistingGroups using the anomaly's properties (subscription, commit range, test path, action type).
   - If no group matches, AnomalyGroupService.CreateNewAnomalyGroup is called, the anomaly is added via AnomalyGroupService.UpdateAnomalyGroup, and a Temporal workflow (MaybeTriggerBisection) is started for this new group.
   - If matching groups exist, the anomaly is added to each via AnomalyGroupService.UpdateAnomalyGroup, and utils.FindIssuesToUpdate is called for each group.
     - If the group's action is REPORT and it has a ReportedIssueId, that issue is updated.
     - If the group's action is BISECT, AnomalyGroupService.FindIssuesFromCulprits is called. If it returns issue IDs associated with this group's culprits, those issues are updated.
4. The Temporal workflow (MaybeTriggerBisection - not detailed here but implied) acts on the group's GroupActionType:
   - BISECT: It might check conditions (e.g., number of anomalies in the group) and then trigger a bisection job (e.g., Pinpoint) using AnomalyGroupService.FindTopAnomalies to pick the most significant anomaly. The bisection ID is then saved to the group.
   - REPORT: It might check conditions and then file a bug using AnomalyGroupService.FindTopAnomalies to gather details. The issue ID is saved to the group.

This system aims to automate and streamline the handling of performance regressions by intelligently grouping them and initiating appropriate follow-up actions.
The /go/backend module implements a gRPC-based backend service for Perf. This service is designed to host API endpoints that are not directly user-facing, promoting a separation of concerns and enabling better scalability and maintainability.
Core Purpose and Design Philosophy:
The primary motivation for this backend service is to create a stable, internal API layer. This decouples user-facing components (like the frontend) from the direct implementation details of various backend tasks. For instance, if Perf needs to trigger a Pinpoint job, the frontend doesn't interact with Pinpoint or a workflow engine like Temporal directly. Instead, it makes a gRPC call to an endpoint on this backend service. The backend service then handles the interaction with the underlying system (e.g., Temporal).
This design offers several advantages:
Key Components and Responsibilities:
backend.go: This is the heart of the backend service.
- Backend struct: Encapsulates the state and configuration of the backend application, including gRPC server settings, ports, and loaded configuration.
- BackendService interface: Defines a contract for any service that wishes to be hosted by this backend. Each such service must provide its gRPC service descriptor, registration logic, and an authorization policy. This interface-based approach allows for modular addition of new functionalities (a sketch of this contract follows the list).
  - GetAuthorizationPolicy() returns a shared.AuthorizationPolicy which specifies whether unauthenticated access is allowed and which user roles are authorized to call the service or specific methods within it.
  - RegisterGrpc() is responsible for registering the specific gRPC service implementation with the main gRPC server.
  - GetServiceDescriptor() provides metadata about the gRPC service.
- initialize() function: This is a crucial setup function. It:
  - Loads the instance configuration file (e.g., demo.json).
  - Sets up the Temporal client when NotifyConfig.Notifications is set to AnomalyGrouper, as this indicates that anomaly grouping workflows managed by Temporal are in use.
  - Iterates over the registered BackendService implementations. This involves setting up authorization rules based on the policy defined by each service and then registering their gRPC handlers.
- configureServices() and registerServices(): These helper functions iterate over the list of BackendService implementations to set up authorization and register them with the main gRPC server.
- configureAuthorizationForService(): This function applies the authorization policies defined by each individual service to the gRPC server's authorization policy. It uses grpcsp.ServerPolicy to define which roles can access the service or specific methods.
- New() constructor: Creates and initializes a new Backend instance. It takes various store implementations and a notifier as arguments, allowing for dependency injection, particularly useful for testing. If these are nil, they are typically created within initialize() based on the configuration.
- ServeGRPC() and Serve(): These methods start the gRPC server and block until it's shut down.
- Cleanup(): Handles graceful shutdown of the gRPC server.
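A rough sketch of what the BackendService contract looks like, assuming the method names described above; the real interface in backend.go may differ in exact types (in particular, the authorization policy fields shown here are simplified stand-ins).

```go
package backend

import (
	"google.golang.org/grpc"
)

// AuthorizationPolicy is a simplified stand-in for shared.AuthorizationPolicy.
type AuthorizationPolicy struct {
	AllowUnauthenticated  bool
	AuthorizedRoles       []string
	MethodAuthorizedRoles map[string][]string
}

// BackendService is implemented by every service hosted on the backend.
type BackendService interface {
	// GetAuthorizationPolicy declares who may call the service or its methods.
	GetAuthorizationPolicy() AuthorizationPolicy
	// RegisterGrpc registers the concrete implementation with the shared server.
	RegisterGrpc(server *grpc.Server)
	// GetServiceDescriptor returns metadata about the gRPC service.
	GetServiceDescriptor() grpc.ServiceDesc
}
```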
pinpoint.go: This file defines a wrapper for the actual Pinpoint service implementation (which resides in pinpoint/go/service).

- pinpointService struct: Implements the BackendService interface.
- NewPinpointService(): Creates a new instance, taking a Temporal provider and a rate limiter as arguments. This indicates that Pinpoint operations might be rate-limited and potentially involve Temporal workflows.
- Its authorization policy requires roles.Editor to access Pinpoint functionalities. This is a good example of how specific services define their own access control rules.

shared/authorization.go:
- AuthorizationPolicy struct: A simple struct used by BackendService implementations to declare their authorization requirements. This includes whether unauthenticated access is permitted, a list of roles authorized for the entire service, and a map for method-specific role authorizations. This promotes a consistent way for services to define their security posture.

client/backendclientutil.go: This utility file provides helper functions for creating gRPC clients to connect to the backend service itself (or specific services hosted by it).
- getGrpcConnection(): Abstracts the logic for establishing a gRPC connection. It handles both insecure (typically for local development/testing) and secure connections. For secure connections, it uses TLS (with InsecureSkipVerify: true as it's intended for internal GKE cluster communication) and OAuth2 for authentication, obtaining tokens for the service account running the client process (a sketch of this setup follows the list).
- NewPinpointClient(), NewAnomalyGroupServiceClient(), NewCulpritServiceClient(): These are factory functions that simplify the creation of typed gRPC clients for the specific services hosted on the backend. They first check if the backend service is configured/enabled before attempting to create a connection. This pattern makes it easy for other internal services to consume the APIs provided by this backend.
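As referenced above, here is a hypothetical sketch of the secure connection setup, using standard gRPC and OAuth2 helpers; the real getGrpcConnection() may differ in details such as flags, scopes, and dial options.

```go
package client

import (
	"context"
	"crypto/tls"

	"golang.org/x/oauth2/google"
	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
	"google.golang.org/grpc/credentials/oauth"
)

// getGrpcConnection sketches the two connection modes described above.
func getGrpcConnection(ctx context.Context, backendHostURL string, insecureConn bool) (*grpc.ClientConn, error) {
	if insecureConn {
		// Local development / testing: no TLS, no auth.
		return grpc.Dial(backendHostURL, grpc.WithInsecure())
	}
	// Intended for internal GKE cluster communication, hence InsecureSkipVerify.
	tlsCreds := credentials.NewTLS(&tls.Config{InsecureSkipVerify: true})
	// Tokens for the service account running the client process.
	ts, err := google.DefaultTokenSource(ctx, "https://www.googleapis.com/auth/userinfo.email")
	if err != nil {
		return nil, err
	}
	return grpc.Dial(
		backendHostURL,
		grpc.WithTransportCredentials(tlsCreds),
		grpc.WithPerRPCCredentials(oauth.TokenSource{TokenSource: ts}),
	)
}
```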
backendserver/main.go: This is the entry point for the backend server executable.

- It uses the urfave/cli library to define a command-line interface.
- The run command initializes and starts the Backend service using the backend.New() constructor and then calls b.Serve().
- It parses command-line flags (config.BackendFlags) and passes them to the backend package. It doesn't instantiate stores or notifiers directly, relying on backend.New (and subsequently initialize) to create them based on the loaded configuration if nil is passed.

Workflow Example: Handling a gRPC Request
1. A client (e.g., the frontend, using helpers from client/backendclientutil.go) makes a call to a specific method on a service hosted by the backend (e.g., Pinpoint.ScheduleJob).
2. The request arrives at the Backend server's listener (b.lisGRPC).
3. The grpc.Server routes the request to the appropriate service implementation (e.g., pinpointService).
4. Authorization (grpcsp.ServerPolicy): Before the service method is executed, the UnaryInterceptor configured in backend.go (which uses b.serverAuthPolicy) intercepts the call.

    Incoming gRPC Request --> UnaryInterceptor (grpcsp)
                                   |
                                   V
    Check Auth Policy for Service/Method (defined by pinpointService.GetAuthorizationPolicy())
                                   |
                                   V
    Allow/Deny ----> Yes: Proceed to service method
                     No:  Return error

5. The method on pinpointService (which delegates to the actual pinpoint_service.PinpointServer implementation) is invoked.

Configuration and Initialization:
The system relies heavily on a configuration file (specified by flags.ConfigFilename, often demo.json for local development as seen in backend_test.go and testdata/demo.json). This file dictates:
- Database connection details (data_store_config).
- Notification settings (notify_config).
- The backend's own URL (backend_host_url), which it might use if it needs to call itself or if other components need to discover it.
- Temporal settings (temporal_config - though not explicitly in demo.json, it's checked in backend.go).

The initialize function in backend.go is responsible for parsing this configuration and setting up all necessary dependencies like database connections, the Temporal client, and the culprit notifier. The use of builder functions (e.g., builders.NewAnomalyGroupStoreFromConfig) allows the system to be flexible with regard to the actual implementations of these components, as long as they conform to the required interfaces.
This backend module serves as a crucial intermediary, enhancing the robustness and maintainability of the Perf system by providing a well-defined internal API layer.
The go/bug module is designed to facilitate the creation of URLs for reporting bugs or regressions identified within the Skia performance monitoring system. Its primary purpose is to dynamically generate these URLs based on a predefined template and specific details about the identified issue. This approach allows for flexible integration with various bug tracking systems, as the URL structure can be configured externally.
Core Functionality and Design:
The module centers around the concept of URI templates. Instead of hardcoding URL formats for specific bug trackers, it uses a template string that contains placeholders for relevant information. This makes the system adaptable to changes in bug tracker URL schemes or the adoption of new trackers without requiring code modifications.
The key function, Expand, takes a URI template and populates it with details about the regression. These details include:
- clusterLink: A URL pointing to the specific performance data cluster that exhibits the regression. This provides direct context for anyone investigating the bug.
- c (provider.Commit): Information about the specific commit suspected of causing the regression. This includes the commit's URL, allowing for easy navigation to the code change. The use of the provider.Commit type from perf/go/git/provider indicates an integration with a system that can furnish commit details.
- message: A user-provided message describing the regression. This allows the reporter to add specific observations or context.

The Expand function utilizes the gopkg.in/olivere/elastic.v5/uritemplates library to perform the actual substitution of placeholders in the template string with the provided values. This library handles URL encoding of the substituted values, ensuring the generated URL is valid.
Key Components/Files:
bug.go: This file contains the core logic for expanding URI templates.
- Expand(uriTemplate string, clusterLink string, c provider.Commit, message string) string: This is the primary function responsible for generating the bug reporting URL. It takes the template and the contextual information as input and returns the fully formed URL. If the template expansion fails (e.g., due to a malformed template), it logs an error using go.skia.org/infra/go/sklog and returns an empty string or a partially formed URL depending on the nature of the error.
- ExampleExpand(uriTemplate string) string: This function serves as a utility or example for demonstrating how to use the Expand function. It calls Expand with pre-defined example data for the cluster link, commit, and message. This can be useful for testing the template expansion logic or for providing a quick way to see how a given template would be populated.
- TestExpand(t *testing.T): This test function verifies that the Expand function correctly substitutes the provided values into the URI template and produces the expected URL. It uses the github.com/stretchr/testify/assert library for assertions, ensuring that the generated URL matches the anticipated output, including proper URL encoding.

Workflow:
A typical workflow involving this module would be:
Configuration: An external system (e.g., the Perf frontend) is configured with a URI template for the desired bug tracking system. This template will contain placeholders like {cluster_url}, {commit_url}, and {message}. Example Template: https://bugtracker.example.com/new?summary=Regression%20Found&description=Regression%20details:%0ACluster:%20{cluster_url}%0ACommit:%20{commit_url}%0AMessage:%20{message}
Regression Identification: A user or an automated system identifies a performance regression.
Information Gathering: The system gathers the necessary information: the cluster link, the suspect commit (as a provider.Commit), and a user-provided message.
URL Generation: The Expand function in go/bug is called with the configured URI template and the gathered information.
template := "https://bugtracker.example.com/new?summary=Regression%20Found&description=Cluster:%20{cluster_url}%0ACommit:%20{commit_url}%0AMessage:%20{message}"
clusterURL := "https://perf.skia.org/t/?some_params"
commitData := provider.Commit{URL: "https://skia.googlesource.com/skia/+show/abcdef123"}
userMessage := "Significant drop in frame rate on TestXYZ."
bugReportURL := bug.Expand(template, clusterURL, commitData, userMessage)
Redirection/Display: The generated bugReportURL is then presented to the user, who can click it to navigate to the bug tracker with the pre-filled information.
This design decouples the bug reporting logic from the specifics of any single bug tracking system, promoting flexibility and maintainability. The use of a standard URI template expansion library ensures robustness in URL generation.
The builders module is responsible for constructing various core components of the Perf system based on instance configuration. This centralized approach to object creation prevents cyclical dependencies that could arise if configuration objects were directly responsible for building the components they configure. The module acts as a factory, taking an InstanceConfig and returning fully initialized and operational objects like data stores, file sources, and caches.
The primary design goal is to decouple the configuration of Perf components from their instantiation. This allows for cleaner dependencies and makes it easier to manage the lifecycle of different parts of the system. For example, a TraceStore needs a database connection, but the InstanceConfig that defines the database connection string shouldn't also be responsible for creating the TraceStore itself. The builders module bridges this gap.
Key components and their instantiation logic:
builders.go: This is the central file containing all the builder functions.

- Database pool (NewDBPoolFromConfig): This function is crucial as many other components rely on a database connection. It establishes a connection pool to the configured database (e.g., CockroachDB, Spanner). It parses the connection string from the InstanceConfig, configures pool parameters like maximum and minimum connections, and sets up a logging adapter (pgxLogAdaptor) to integrate database logs with the application's logging system. The resulting pool is stored in singletonPool. This ensures that only one database connection pool is created per application instance, preventing resource exhaustion and ensuring consistent database interaction. A mutex (singletonPoolMutex) protects the creation of this singleton (a sketch of the singleton pattern follows this list). The pool is wrapped with timeout.New, which enforces that all database operations are performed within a context that has a timeout, preventing indefinite blocking.

    InstanceConfig --> NewDBPoolFromConfig --> pgxpool.ParseConfig
                                                   |
                                                   +-> pgxpool.ConnectConfig --> rawPool
                                                   |
                                                   +-> timeout.New(rawPool) --> singletonPool (if schema check passes)

- Git (NewPerfGitFromConfig): Constructs a perfgit.Git object, which provides an interface to Git repository data. It obtains a database pool via getDBPool (which in turn uses NewDBPoolFromConfig) and then instantiates perfgit.New with this pool and the instance configuration.
- TraceStore (NewTraceStoreFromConfig): Creates a tracestore.TraceStore for managing performance trace data. It also creates a TraceParamStore (for managing trace parameter sets) and then instantiates the appropriate sqltracestore.
- MetadataStore (NewMetadataStoreFromConfig): Creates a tracestore.MetadataStore for managing metadata associated with traces. Like the TraceStore, it obtains a database pool and then creates an sqltracestore.NewSQLMetadataStore.
- Other stores: The remaining store builders also obtain a pool via getDBPool and then instantiate their respective SQL-backed store implementations (e.g., sqlalertstore, sqlregression2store). NewRegressionStoreFromConfig has conditional logic based on instanceConfig.UseRegression2 to instantiate either sqlregression2store or sqlregressionstore, which allows migrating to a new regression store implementation controlled by configuration. NewGraphsShortcutStoreFromConfig can return a cached version (graphsshortcutstore.NewCacheGraphsShortcutStore) if localToProd is true, indicating a local development or testing environment where a simpler in-memory cache might be preferred over a database-backed store.
- File source (NewSourceFromConfig): Creates a file.Source which defines where Perf ingests data from (e.g., Google Cloud Storage, local directories). It uses a switch statement based on instanceConfig.IngestionConfig.SourceConfig.SourceType to instantiate either a gcssource or a dirsource.
- Ingested file system (NewIngestedFSFromConfig): Creates a fs.FS (file system interface) that provides access to already ingested files. Like NewSourceFromConfig, it switches on the source type to return a GCS or local file system implementation.
- Cache (GetCacheFromConfig): Returns a cache.Cache instance (either Redis-backed or local in-memory), switching on instanceConfig.QueryConfig.CacheConfig.Type to determine whether to create a redisCache (connecting to a Google Cloud Redis instance) or a localCache.

The getDBPool helper function is used internally by many builder functions. It acts as a dispatcher based on instanceConfig.DataStoreConfig.DataStoreType, calling NewDBPoolFromConfig with appropriate schema checking flags. This abstracts the direct call to NewDBPoolFromConfig and centralizes the logic for selecting the database type.
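The singleton behavior described above can be sketched as follows. This is a simplified sketch: the real NewDBPoolFromConfig takes an InstanceConfig, installs the pgxLogAdaptor, performs schema checks, and wraps the pool with timeout.New, all of which are omitted here.

```go
package builders

import (
	"context"
	"sync"

	"github.com/jackc/pgx/v4/pgxpool"
)

var (
	singletonPool      *pgxpool.Pool
	singletonPoolMutex sync.Mutex
)

// newDBPool returns a process-wide connection pool, creating it on first use.
func newDBPool(ctx context.Context, connectionString string) (*pgxpool.Pool, error) {
	singletonPoolMutex.Lock()
	defer singletonPoolMutex.Unlock()
	// Reuse the already-created pool so only one pool exists per process.
	if singletonPool != nil {
		return singletonPool, nil
	}
	cfg, err := pgxpool.ParseConfig(connectionString)
	if err != nil {
		return nil, err
	}
	cfg.MaxConns = 100 // illustrative pool sizing
	pool, err := pgxpool.ConnectConfig(ctx, cfg)
	if err != nil {
		return nil, err
	}
	singletonPool = pool
	return singletonPool, nil
}
```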
The test file (builders_test.go) ensures that these builder functions correctly instantiate objects and handle different configurations, including invalid ones. A notable aspect of the tests is the management of the singletonPool. Since NewDBPoolFromConfig creates a singleton, tests that require fresh database instances must explicitly clear this singleton (singletonPool = nil) before calling the builder to avoid reusing a connection from a previous test. This is handled in newDBConfigForTest.
The chromeperf module facilitates interaction with the Chrome Perf backend, which is the system of record for performance data for Chromium. This module allows Perf to send and receive data from Chrome Perf.
The primary responsibility of this module is to abstract the communication details with the Chrome Perf API. It provides a typed Go interface to various Chrome Perf endpoints, handling request formatting, authentication, and response parsing.
This interaction is crucial for:
- Keeping Perf's trace identifiers consistent with Chrome Perf's test paths; the sqlreversekeymapstore submodule, described below, helps manage these differences.

chromeperfClient.go

This file defines the generic ChromePerfClient interface and its implementation, chromePerfClientImpl. This is the core component responsible for making HTTP GET and POST requests to the Chrome Perf API.
Why: Abstracting the HTTP client allows for easier testing (by mocking the client) and centralizes the logic for handling authentication (using OAuth2 Google default token source) and constructing target URLs.
How:
google.DefaultTokenSource for authentication.generateTargetUrl constructs the correct API endpoint URL, differentiating between the Skia-Bridge proxy (https://skia-bridge-dot-chromeperf.appspot.com) and direct calls to the legacy Chrome Perf endpoint (https://chromeperf.appspot.com). The Skia-Bridge is generally preferred.SendGetRequest and SendPostRequest handle the actual HTTP communication, JSON marshalling/unmarshalling, and basic error handling, including checking for accepted HTTP status codes.Example workflow for a POST request:
Caller -> chromePerfClient.SendPostRequest(ctx, "anomalies", "add", requestBody, &responseObj, []int{200})
|
| (Serializes requestBody to JSON)
v
|--------------------------------------------------------------------------------------------------------|
| generateTargetUrl("https://skia-bridge-dot-chromeperf.appspot.com/anomalies/add") |
|--------------------------------------------------------------------------------------------------------|
|
v
httpClient.Post(targetUrl, "application/json", jsonBody)
|
v
(HTTP Request to Chrome Perf API)
|
v
(Receives HTTP Response)
|
v
(Checks if response status code is in acceptedStatusCodes)
|
v
(Deserializes response body into responseObj)
|
v
Caller (receives populated responseObj or error)
anomalyApi.go

This file builds upon chromeperfClient.go to provide a specialized client for interacting with the /anomalies endpoint in Chrome Perf. It defines the AnomalyApiClient interface and its implementation anomalyApiClientImpl.
Why: This client encapsulates the logic specific to anomaly-related operations, such as formatting requests for reporting regressions or fetching anomaly details, and parsing the specific JSON structures returned by these endpoints. It also handles the translation between Perf‘s trace identifiers and Chrome Perf’s test_path format.
How:
- ReportRegression: Constructs a ReportRegressionRequest and sends it to the anomalies/add endpoint. This is how Perf informs Chrome Perf about a new regression.
- GetAnomalyFromUrlSafeKey: Fetches details for a specific anomaly using its key from the anomalies/get endpoint.
- GetAnomalies: Retrieves anomalies for a list of tests within a specific commit range (min_revision, max_revision) by calling the anomalies/find endpoint. traceNameToTestPath converts Perf's comma-separated key-value trace names (e.g., ,benchmark=Blazor,bot=MacM1,...) into Chrome Perf's slash-separated test_path (e.g., ChromiumPerf/MacM1/Blazor/...). Commit hashes in the response are resolved to commit numbers via perfGit.CommitNumberFromGitHash.
- GetAnomaliesTimeBased: Similar to GetAnomalies, but fetches anomalies based on a time range (start_time, end_time) by calling the anomalies/find_time endpoint.
- GetAnomaliesAroundRevision: Fetches anomalies that occurred around a specific revision number.
- traceNameToTestPath: This function is key for interoperability (a sketch of this conversion follows the workflow diagram below). It parses a Perf trace name (which is a string of key-value pairs) and constructs the corresponding test_path string that Chrome Perf expects. It also handles an experimental feature (EnableSkiaBridgeAggregation) which can modify how test paths are generated, particularly for aggregated statistics (e.g., ensuring testName_avg is used if the stat is value). The statToSuffixMap and hasSuffixInTestValue logic addresses historical inconsistencies where test names in Perf might or might not include statistical suffixes (like _avg, _max). The goal is to derive the correct Chrome Perf test_path.

Workflow for fetching anomalies:
Perf UI/Backend -> anomalyApiClient.GetAnomalies(ctx, ["trace_A,key=val", "trace_B,key=val"], 100, 200)
|
v
(For each traceName)
traceNameToTestPath("trace_A,key=val") -> "chromeperf/test/path/A"
|
v
chromeperfClient.SendPostRequest(ctx, "anomalies", "find", {Tests: ["path/A", "path/B"], MinRevision: "100", MaxRevision: "200"}, &anomaliesResponse, ...)
|
v
(Parses anomaliesResponse, potentially resolving commit hashes to commit numbers)
|
v
Perf UI/Backend (receives AnomalyMap)
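As referenced in the anomalyApi.go notes above, here is a hypothetical sketch of the trace-name to test_path conversion. The component ordering and the set of required keys are assumptions, and the real traceNameToTestPath applies additional rules (statistic suffixes via statToSuffixMap, EnableSkiaBridgeAggregation handling) not shown here.

```go
package chromeperf

import (
	"fmt"
	"strings"
)

// parseTraceName turns ",key=val,key2=val2," into a map of key/value pairs.
func parseTraceName(traceName string) map[string]string {
	params := map[string]string{}
	for _, pair := range strings.Split(strings.Trim(traceName, ","), ",") {
		if kv := strings.SplitN(pair, "=", 2); len(kv) == 2 {
			params[kv[0]] = kv[1]
		}
	}
	return params
}

// traceNameToTestPath joins selected parameter values into a slash-separated
// test_path. The master/bot/benchmark/test/subtest ordering is an assumption.
func traceNameToTestPath(traceName string) (string, error) {
	params := parseTraceName(traceName)
	parts := []string{}
	for _, key := range []string{"master", "bot", "benchmark", "test", "subtest_1", "subtest_2"} {
		v, ok := params[key]
		if !ok {
			break // later components are optional
		}
		parts = append(parts, v)
	}
	if len(parts) < 4 {
		return "", fmt.Errorf("trace name %q is missing required keys", traceName)
	}
	return strings.Join(parts, "/"), nil
}
```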
alertGroupApi.go

This file provides a client for interacting with Chrome Perf's /alert_group API, specifically to get details about alert groups. An alert group in Chrome Perf typically corresponds to a set of related anomalies (regressions).
Why: When Perf displays information about an alert (which might have originated from Chrome Perf), it needs to fetch details about the associated alert group, such as the specific anomalies included, the commit range, and other metadata.
How:
- GetAlertGroupDetails: Takes an alert group key and calls the alert_group/details endpoint on Chrome Perf. The AlertGroupDetails struct holds the response, including a map of Anomalies (where the value is the Chrome Perf test_path) and start/end commit numbers/hashes.
- GetQueryParams and GetQueryParamsPerTrace: These methods are utilities to transform the AlertGroupDetails into query parameters that can be used to construct URLs for Perf's own explorer page. This allows users to easily navigate from a Chrome Perf alert to viewing the corresponding data in Perf.
  - GetQueryParams aggregates all test path components (masters, bots, benchmarks, etc.) from all anomalies in the group into a single set of parameters.
  - GetQueryParamsPerTrace generates a separate set of query parameters for each individual anomaly in the alert group.
  - Both rely on splitting the test_path from Chrome Perf back into individual components.

Workflow for getting alert group details:
Perf Backend (e.g., when processing an incoming alert from Chrome Perf)
|
v
alertGroupApiClient.GetAlertGroupDetails(ctx, "chrome_perf_group_key")
|
v
chromeperfClient.SendGetRequest(ctx, "alert_group", "details", {key: "chrome_perf_group_key"}, &alertGroupResponse)
|
v
(alertGroupResponse is populated)
|
v
alertGroupResponse.GetQueryParams(ctx) -> Perf Explorer URL query params
store.go and the sqlreversekeymapstore submodule

store.go defines the ReverseKeyMapStore interface. The sqlreversekeymapstore directory and its schema subdirectory provide an SQL-based implementation of this interface.
Why: Test paths in Chrome Perf can contain characters that are considered “invalid” or are handled differently by Perf‘s parameter parsing (e.g., Perf’s trace keys are comma-separated key-value pairs, and the values themselves should ideally not interfere with this). When data is ingested into Perf from Chrome Perf, or when Perf constructs test paths to query Chrome Perf, these “invalid” characters in Chrome Perf test path components (like subtest names) might be replaced (e.g., with underscores).
This creates a problem: if Perf has test/foo_bar and Chrome Perf has test/foo?bar, Perf needs a way to know that foo_bar corresponds to foo?bar when querying Chrome Perf. The ReverseKeyMapStore is designed to store these mappings.
How:
- sqlreversekeymapstore/schema/schema.go defines the SQL table schema ReverseKeyMapSchema with columns:
  - ModifiedValue: The value as it appears in Perf (e.g., foo_bar).
  - ParamKey: The parameter key this value belongs to (e.g., subtest_1).
  - OriginalValue: The original value as it was in Chrome Perf (e.g., foo?bar).
  - The primary key is the combination of ModifiedValue and ParamKey.
- sqlreversekeymapstore/sqlreversekeymapstore.go implements the ReverseKeyMapStore interface using a SQL database (configurable for CockroachDB or Spanner via different SQL statements).
  - Create: Inserts a new mapping. If a mapping for the ModifiedValue and ParamKey already exists (conflict), it does nothing. This is important because the mapping should be stable.
  - Get: Retrieves the OriginalValue given a ModifiedValue and ParamKey.
For example, if anomalyApi.go needs to construct a test_path to query Chrome Perf based on parameters from Perf:
1. Perf parameters: test=my_test, subtest_1=value_with_question_mark
2. To build the test_path segment for subtest_1:
   - Call reverseKeyMapStore.Get(ctx, "value_with_question_mark", "subtest_1").
   - If it returns an original value like "value?with?question?mark", use that for the Chrome Perf API call.
   - Otherwise, use "value_with_question_mark".

The store.go file simply defines the interface, allowing for different backend implementations of this mapping store if needed, though sqlreversekeymapstore is the provided concrete implementation.
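A hypothetical sketch of the ReverseKeyMapStore contract and the lookup-or-fallback step just described; the method signatures are inferred from the Create/Get descriptions and may not match store.go exactly.

```go
package chromeperf

import "context"

// ReverseKeyMapStore maps Perf-modified parameter values back to their
// original Chrome Perf values. Signatures are illustrative.
type ReverseKeyMapStore interface {
	// Create records that modifiedValue (as seen in Perf) for paramKey
	// corresponds to originalValue in Chrome Perf. Existing mappings are kept.
	Create(ctx context.Context, modifiedValue, paramKey, originalValue string) error
	// Get returns the original Chrome Perf value for a (modifiedValue, paramKey)
	// pair, or an empty string if no mapping exists.
	Get(ctx context.Context, modifiedValue, paramKey string) (string, error)
}

// originalOrModified resolves the value to use in a Chrome Perf test_path,
// falling back to the Perf value when no mapping is stored.
func originalOrModified(ctx context.Context, s ReverseKeyMapStore, modifiedValue, paramKey string) string {
	original, err := s.Get(ctx, modifiedValue, paramKey)
	if err != nil || original == "" {
		return modifiedValue
	}
	return original
}
```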
The clustering2 module is responsible for grouping similar performance traces together using k-means clustering. This helps in identifying patterns and regressions in performance data by analyzing the collective behavior of traces rather than individual ones. The core idea is to represent each trace as a point in a multi-dimensional space and then find k clusters of these points.
K-means is a well-understood and relatively efficient clustering algorithm suitable for the scale of performance data encountered. It partitions data into k distinct, non-overlapping clusters. Each data point belongs to the cluster with the nearest mean (cluster centroid). This approach allows for the summarization of large numbers of traces into a smaller set of representative “shapes” or behaviors.
clustering.go

This file contains the primary logic for performing k-means clustering on performance traces.
ClusterSummary: This struct represents a single cluster found by the k-means algorithm.
- Centroid: The average shape of all traces in this cluster. This is the core representation of the cluster's behavior.
- Keys: A list of identifiers for the traces belonging to this cluster. These are sorted by their distance to the Centroid, allowing users to quickly see the most representative traces. This is not serialized to JSON to keep the payload manageable, as it can be very large.
- Shortcut: An identifier for a pre-computed set of Keys, used for efficient retrieval and display in UIs.
- ParamSummaries: A breakdown of the parameter key-value pairs present in the cluster and their prevalence (see valuepercent.go). This helps in understanding what distinguishes this cluster (e.g., "all traces in this cluster are for arch=x86").
- StepFit: Contains information about how well the Centroid fits a step function. This is crucial for identifying regressions or improvements that manifest as sudden shifts in performance.
- StepPoint: The specific data point (commit/timestamp) where the step (if any) in the Centroid is detected.
- Num: The total number of traces in this cluster.
- Timestamp: Records when the cluster analysis was performed.
- NotificationID: Stores the ID of any alert or notification sent regarding a significant step change detected in this cluster.
CalculateClusterSummaries function: This is the main entry point for the clustering process.
- Input conversion: It takes a dataframe.DataFrame (which holds traces and their metadata) and converts each trace into a kmeans.Clusterable object. The ctrace2.NewFullTrace function is used here, which likely involves some form of normalization or feature extraction to make traces comparable. The stddevThreshold parameter is used during this conversion, potentially to filter out noisy or flat traces.
- Initialization (chooseK): K-means requires an initial set of k centroids. This function randomly selects k traces from the input data to serve as the initial centroids. Random selection is a common and simple initialization strategy.
- Iteration: The kmeans.Do function performs one iteration of the k-means algorithm, assigning observations to the nearest centroid and recomputing centroids; the ctrace2.CalculateCentroid function is likely responsible for computing the mean of a set of traces. Iteration continues for at most MAX_KMEANS_ITERATIONS or until the change in totalError (sum of squared distances from each point to its centroid) between iterations falls below KMEAN_EPSILON. This convergence criterion prevents unnecessary computations once the clusters stabilize. A Progress callback can be provided to monitor the clustering process, reporting the totalError at each iteration.
- Summarization (getClusterSummaries): After the k-means algorithm converges, this function takes the final centroids and the original observations to generate ClusterSummary objects for each cluster. For each cluster it computes ParamSummaries (see valuepercent.go) to describe the common characteristics of traces in that cluster, and runs step detection (stepfit.GetStepFitAtMid) on the cluster's centroid to identify significant performance shifts. The interesting parameter likely defines a threshold for what constitutes a noteworthy step change, and stepDetection specifies the algorithm or method used for step detection. ClusterSummary.Keys lists the most representative traces first; a limited number of sample keys (config.MaxSampleTracesPerCluster) are stored. The ClusterSummary objects are sorted, likely by the magnitude or significance of the detected step (StepFit.Regression), to highlight the most impactful changes first.

Constants:
- K: The default number of clusters to find. 50 is chosen as a balance between granularity and computational cost.
- MAX_KMEANS_ITERATIONS: A safeguard against non-converging k-means runs.
- KMEAN_EPSILON: A threshold to determine convergence, balancing precision with computation time.

valuepercent.go

This file defines how to summarize and present the parameter distributions within a cluster.
ValuePercent struct: Represents a specific parameter key-value pair (e.g., “config=8888”) and the percentage of traces in a cluster that have this pair. This provides a quantitative measure of how characteristic a parameter is for a given cluster.
SortValuePercentSlice function: This is crucial for making the ParamSummaries in ClusterSummary human-readable and informative. The goal is to group entries by parameter key, sort values within each key by descending percentage, and order the key groups by the percentage of their top value (an illustrative sketch follows the example below).
This complex sorting logic ensures that the most dominant and distinguishing parameters for a cluster are presented prominently. For example:
    config=8888  90%
    config=565   10%
    arch=x86     80%
    arch=arm     20%
Here, “config” is listed before “arch” because its top value (“config=8888”) has a higher percentage (90%) than the top value for “arch” (“arch=x86” at 80%).
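As referenced above, here is an illustrative re-implementation of this two-level ordering. It is not the actual SortValuePercentSlice code; the ValuePercent field names are simplified placeholders.

```go
package valuepercent

import (
	"sort"
	"strings"
)

// ValuePercent pairs a "key=value" string with how often it occurs in a cluster.
type ValuePercent struct {
	Value   string // e.g. "config=8888"
	Percent int
}

// sortValuePercent orders vp in place: values grouped by parameter key, sorted
// by descending percentage within each key, with key groups ordered by the
// percentage of their top value.
func sortValuePercent(vp []ValuePercent) {
	// Group by the parameter key (the part before '=').
	groups := map[string][]ValuePercent{}
	for _, v := range vp {
		key := strings.SplitN(v.Value, "=", 2)[0]
		groups[key] = append(groups[key], v)
	}
	// Sort each group by descending percentage.
	keys := make([]string, 0, len(groups))
	for key, g := range groups {
		sort.Slice(g, func(i, j int) bool { return g[i].Percent > g[j].Percent })
		keys = append(keys, key)
	}
	// Order key groups by their top value's percentage.
	sort.Slice(keys, func(i, j int) bool {
		return groups[keys[i]][0].Percent > groups[keys[j]][0].Percent
	})
	// Flatten back into the original slice.
	out := vp[:0]
	for _, key := range keys {
		out = append(out, groups[key]...)
	}
}
```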
Input: DataFrame (traces, headers), K, StdDevThreshold, ProgressCallback, InterestingThreshold, StepDetectionMethod
1. [clustering.go: CalculateClusterSummaries]
a. Initialize empty list of observations.
b. For each trace in DataFrame.TraceSet:
i. Create ClusterableTrace (ctrace2.NewFullTrace) using trace data and StdDevThreshold.
ii. Add to observations list.
c. If no observations, return error.
d. [clustering.go: chooseK]
i. Randomly select K observations to be initial centroids.
e. Initialize lastTotalError = 0.0
f. Loop MAX_KMEANS_ITERATIONS times OR until convergence:
i. [kmeans.Do] -> new_centroids
1. Assign each observation to its closest centroid (from previous iteration or initial).
2. Recalculate centroids (ctrace2.CalculateCentroid) based on assigned observations.
ii. [kmeans.TotalError] -> currentTotalError
iii. If ProgressCallback provided, call it with currentTotalError.
iv. If |currentTotalError - lastTotalError| < KMEAN_EPSILON, break loop.
v. lastTotalError = currentTotalError
g. [clustering.go: getClusterSummaries] -> clusterSummaries
i. [kmeans.GetClusters] -> allClusters (list of observations per centroid)
ii. For each cluster in allClusters and its corresponding centroid:
1. Create new ClusterSummary.
2. [clustering.go: getParamSummaries] (using cluster members) -> ParamSummaries
a. [clustering.go: GetParamSummariesForKeys]
i. Count occurrences of each param=value in cluster keys.
ii. Convert counts to ValuePercent structs.
iii. [valuepercent.go: SortValuePercentSlice] -> sorted ParamSummaries.
3. [stepfit.GetStepFitAtMid] (on centroid values, StdDevThreshold, InterestingThreshold, StepDetectionMethod) -> StepFit, StepPoint.
4. Set ClusterSummary.Num = number of members in cluster.
5. Sort cluster members by distance to centroid.
6. Populate ClusterSummary.Keys with top N sorted member keys.
7. Populate ClusterSummary.Centroid with centroid values.
iii. Sort all ClusterSummary objects (e.g., by StepFit.Regression).
h. Populate ClusterSummaries struct with results, K, and StdDevThreshold.
i. Return ClusterSummaries.
Output: ClusterSummaries object or error.
This process effectively transforms raw trace data into a structured summary that highlights significant patterns and changes, facilitating performance analysis and regression detection.
The /go/config module defines the configuration structure for Perf instances and provides utilities for loading, validating, and managing these configurations. It plays a crucial role in customizing the behavior of a Perf instance, from data ingestion and storage to alert notifications and UI presentation.
Core Responsibilities and Design:
The primary responsibility of this module is to define and manage the InstanceConfig struct. This struct is a comprehensive container for all settings that govern a Perf instance. The design emphasizes:
- Centralization: By defining the InstanceConfig struct (config.go), the module provides a single source of truth. This simplifies understanding the state of an instance and reduces the chances of configuration drift.
- JSON serialization: The module uses encoding/json, making it easy to create, read, and modify configurations.
- Validation: Configurations are validated (/go/config/validate/validate.go, /go/config/validate/instanceConfigSchema.json). The JSON schema (instanceConfigSchema.json) formally defines the structure and types of the InstanceConfig. This schema is automatically generated from the Go struct definition using the /go/config/generate/main.go program, ensuring the schema stays in sync with the code. The validate.InstanceConfigFromFile function uses this schema to validate a configuration file before attempting to deserialize it. This allows for early detection of malformed or incomplete configurations.
- Command-line flags: Flags are grouped into BackendFlags, FrontendFlags, IngestFlags, and MaintenanceFlags (config.go). These structs group related command-line flags and provide methods (AsCliFlags) to convert them into cli.Flag slices, compatible with the github.com/urfave/cli/v2 library. This design keeps flag definitions organized and associated with the components they configure.
- Extensibility: InstanceConfig is designed to be extensible. New configuration options can be added as new fields to the relevant sub-structs. The JSON schema generation and validation mechanisms will automatically adapt to these changes.

Key Components and Files:
- config.go: This is the heart of the module.
  - It defines the InstanceConfig struct, which aggregates various sub-configuration structs like AuthConfig, DataStoreConfig, IngestionConfig, GitRepoConfig, NotifyConfig, IssueTrackerConfig, AnomalyConfig, QueryConfig, TemporalConfig, and DataPointConfig. Each of these sub-structs groups settings related to a specific aspect of the Perf system (e.g., authentication, data storage, data ingestion).
  - It defines enum-like types (DataStoreType, SourceType, GitAuthType, GitProvider, TraceFormat) to provide clear and constrained options for certain configuration values.
  - It defines DurationAsString, a custom type for handling time.Duration serialization and deserialization as strings in JSON, which is more human-readable than nanosecond integers. It also provides a custom JSON schema for this type.
  - Constants like MaxSampleTracesPerCluster, MinStdDev, GotoRange, and QueryMaxRunTime are defined here, providing default values or limits used across the application.
- /go/config/validate/validate.go: Validates the InstanceConfig beyond what the JSON schema can enforce. This includes semantic checks, such as ensuring that required fields are present based on the values of other fields (e.g., API keys for issue tracker notifications). The InstanceConfigFromFile function is the primary entry point for loading and validating a configuration file. It first performs schema validation and then calls the Validate function for further business logic checks. It also validates the notification templates in NotifyConfig by attempting to format them with sample data. This helps catch template syntax errors early.
- /go/config/validate/instanceConfigSchema.json: The JSON schema for InstanceConfig JSON files. It is used by validate.go to perform initial validation of configuration files.
- /go/config/generate/main.go: Generates the instanceConfigSchema.json file based on the InstanceConfig struct definition in config.go. This ensures that the schema is always up-to-date with the Go code. The //go:generate directive at the top of the file allows for easy regeneration of the schema.
- config_test.go and /go/config/validate/validate_test.go: Unit tests covering the custom types (e.g., DurationAsString) and validation logic. The tests for validate.go include checks against actual configuration files used in production (//perf:configs), ensuring that the validation logic is robust and correctly handles real-world scenarios.

Workflows:
1. Loading and Validating a Configuration File:
User provides config file path (e.g., "configs/nano.json")
|
V
Application calls validate.InstanceConfigFromFile("configs/nano.json")
|
V
validate.go: Reads the JSON file content.
|
V
validate.go: Validates content against instanceConfigSchema.json (using jsonschema.Validate).
| \
| (If schema violation) \
V V
Error returned with schema violations. Deserializes JSON into config.InstanceConfig struct.
|
V
validate.go: Calls Validate(instanceConfig) for further business logic checks.
| (e.g., API key presence, template validity)
|
| (If validation error)
V
Error returned.
|
V (If all valid)
Returns the populated config.InstanceConfig struct.
|
V
Application sets config.Config = returnedInstanceConfig
|
V
Perf instance uses config.Config for its operations.
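A minimal sketch of this loading step in application code. The exact return values of validate.InstanceConfigFromFile are assumed here (config plus error); consult validate.go for the authoritative signature.

```go
package main

import (
	"log"

	"go.skia.org/infra/perf/go/config"
	"go.skia.org/infra/perf/go/config/validate"
)

func main() {
	// Schema validation and semantic checks both happen inside this call.
	instanceConfig, err := validate.InstanceConfigFromFile("configs/nano.json")
	if err != nil {
		log.Fatalf("invalid instance config: %s", err)
	}
	// The loaded config becomes the process-wide configuration (config.Config).
	config.Config = instanceConfig
}
```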
2. Generating the JSON Schema:
This is typically done during development when the InstanceConfig struct changes.
Developer modifies config.InstanceConfig struct in config.go
|
V
Developer runs `go generate` in the /go/config/generate directory (or via bazel)
|
V
/go/config/generate/main.go: Calls jsonschema.GenerateSchema("../validate/instanceConfigSchema.json", &config.InstanceConfig{})
|
V
jsonschema library: Introspects the config.InstanceConfig struct and its fields.
|
V
jsonschema library: Generates a JSON Schema definition.
|
V
/go/config/generate/main.go: Writes the generated schema to /go/config/validate/instanceConfigSchema.json.
The design prioritizes robustness through schema and semantic validation, maintainability through structured Go types and centralized configuration, and ease of use through standard JSON format and command-line flag integration. The separation of schema generation (generate subdirectory) and validation (validate subdirectory) keeps concerns distinct.
The ctrace2 module provides the functionality to adapt trace data (represented as a series of floating-point values) for use with k-means clustering algorithms. The primary goal is to transform raw trace data into a format that is suitable for distance calculations and centroid computations, which are fundamental operations in k-means. This involves normalization and handling of missing data points.
In performance analysis, traces often represent measurements over time or across different configurations. Clustering these traces helps identify groups of similar performance characteristics. However, raw trace data might have issues that hinder effective clustering:
The ctrace2 module addresses these by:
- Normalization: The vec32.Norm function from the go/vec32 module is leveraged for this. Before normalization, any missing data points (vec32.MissingDataSentinel) are filled in using vec32.Fill, which likely interpolates or uses a similar strategy to replace them. The minStdDev parameter is used during normalization. If the calculated standard deviation of a trace is below this minimum, the minStdDev value is used instead. This is a practical approach to handle traces with very little variation without excluding them from clustering.
- ClusterableTrace Structure: This structure wraps the trace data (Key and Values) and implements the kmeans.Clusterable and kmeans.Centroid interfaces from the perf/go/kmeans module. This makes ClusterableTrace instances directly usable by the k-means algorithm.

ctrace.go: This is the core file of the module.

- ClusterableTrace struct: Holds Key (a string identifier for the trace) and Values (a slice of float32 representing the normalized data points).
  - Distance(c kmeans.Clusterable) float64 method: Calculates the Euclidean distance between the current ClusterableTrace and another ClusterableTrace. This is crucial for the k-means algorithm to determine how similar two traces are. The calculation assumes that both traces have the same number of data points (a guarantee maintained by NewFullTrace). An illustrative sketch appears at the end of this section.

        For each point i in trace1 and trace2:
          diff_i = trace1.Values[i] - trace2.Values[i]
          squared_diff_i = diff_i * diff_i
        Sum all squared_diff_i
        Distance = Sqrt(Sum)

  - AsClusterable() kmeans.Clusterable method: Returns the ClusterableTrace itself, satisfying the kmeans.Centroid interface requirement.
  - Dup(newKey string) *ClusterableTrace method: Creates a deep copy of the ClusterableTrace with a new key. This is useful when you need to manipulate a trace without affecting the original.
- NewFullTrace(key string, values []float32, minStdDev float32) *ClusterableTrace function: The constructor for building ClusterableTrace instances from raw trace data.
  1. Takes a key (string identifier), raw values ([]float32), and a minStdDev.
  2. Creates a copy of the input values to avoid modifying the original slice.
  3. Calls vec32.Fill() on the copied values. This step handles missing data points by filling them, likely through interpolation or a similar imputation technique provided by the go/vec32 module.
  4. Calls vec32.Norm() on the filled values, using minStdDev. This normalizes the trace data so that its standard deviation is effectively 1.0 (or adjusted if the original standard deviation was below minStdDev).
  5. Returns a new ClusterableTrace with the provided key and the processed (filled and normalized) values.

        Input: key, raw_values, minStdDev
        ------------------------------------
        copied_values = copy(raw_values)
        filled_values = vec32.Fill(copied_values)
        normalized_values = vec32.Norm(filled_values, minStdDev)
        Output: ClusterableTrace{Key: key, Values: normalized_values}

- CalculateCentroid(members []kmeans.Clusterable) kmeans.Centroid function: Implements the kmeans.CalculateCentroid function type. Given a slice of ClusterableTrace instances (which are members of a cluster), it computes their centroid.
  1. Initializes a slice of float32 (mean) with the same length as the Values of the first member trace.
  2. It iterates through each member trace in the members slice.
  3. For each member, it iterates through its Values and adds each value to the corresponding element in the mean slice.
  4. After summing up all values component-wise, it divides each element in the mean slice by the total number of members to get the average value for each dimension.
  5. It returns a new ClusterableTrace representing the centroid.
The key for this centroid trace is set to CENTROID_KEY (“special_centroid”). Input: members (list of ClusterableTraces) ------------------------------------------ Initialize mean_values = [0.0, 0.0, ..., 0.0] (same length as members[0].Values) For each member_trace in members: For each i from 0 to len(member_trace.Values) - 1: mean_values[i] = mean_values[i] + member_trace.Values[i] For each i from 0 to len(mean_values) - 1: mean_values[i] = mean_values[i] / len(members) Output: ClusterableTrace{Key: CENTROID_KEY, Values: mean_values}CENTROID_KEY constant:The interaction with the go/vec32 module is crucial for data preprocessing (filling missing values and normalization), while the perf/go/kmeans module provides the interfaces that ctrace2 implements to be compatible with k-means clustering algorithms.
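The following Go sketch mirrors the Distance and CalculateCentroid logic described above. It is illustrative only; the real ctrace2 types implement the kmeans interfaces and operate on traces produced by NewFullTrace, and nothing here is copied from the source.

```go
package main

import (
	"fmt"
	"math"
)

// ClusterableTrace mirrors the shape described above: a key plus normalized
// float32 values. This is an illustrative sketch, not the ctrace2 source.
type ClusterableTrace struct {
	Key    string
	Values []float32
}

// Distance returns the Euclidean distance between two traces of equal length,
// following the calculation described for ctrace2's Distance method.
func (t *ClusterableTrace) Distance(other *ClusterableTrace) float64 {
	total := 0.0
	for i, v := range t.Values {
		d := float64(v - other.Values[i])
		total += d * d
	}
	return math.Sqrt(total)
}

// CalculateCentroid averages the member traces component-wise and returns the
// result under the special centroid key.
func CalculateCentroid(members []*ClusterableTrace) *ClusterableTrace {
	mean := make([]float32, len(members[0].Values))
	for _, m := range members {
		for i, v := range m.Values {
			mean[i] += v
		}
	}
	for i := range mean {
		mean[i] /= float32(len(members))
	}
	return &ClusterableTrace{Key: "special_centroid", Values: mean}
}

func main() {
	a := &ClusterableTrace{Key: ",config=8888,", Values: []float32{1, 2, 3}}
	b := &ClusterableTrace{Key: ",config=565,", Values: []float32{2, 2, 4}}
	fmt.Println(a.Distance(b))                                       // ~1.414
	fmt.Println(CalculateCentroid([]*ClusterableTrace{a, b}).Values) // [1.5 2 3.5]
}
```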
The culprit module is responsible for identifying, storing, and notifying about commits that are likely causes of performance regressions. It integrates with anomaly detection and subscription systems to automate the process of pinpointing culprits and alerting relevant parties.
store.go & sqlculpritstore/sqlculpritstore.go:

- store.go defines the Store interface, which outlines the contract for culprit data operations such as Get, Upsert, and AddIssueId.
- sqlculpritstore/sqlculpritstore.go provides a SQL-based implementation of this interface. It uses a SQL database (configured via pool.Pool) to store culprit information.
- The Upsert method is crucial. It either inserts a new culprit record or updates an existing one if a commit has already been identified as a culprit for a different anomaly group. This prevents duplicate culprit entries for the same commit. It also links the culprit to the anomaly_group_id.
- The AddIssueId method updates a culprit record to include the ID of an issue (e.g., a bug tracker ticket) that was created for it, and also maintains a map between the anomaly group and the issue ID. This is important for tracking and referencing.
- The schema (sqlculpritstore/schema/schema.go) includes fields for commit details (host, project, ref, revision), associated anomaly group IDs, and associated issue IDs. An index on (revision, host, project, ref) helps in efficiently querying for existing culprits.
- The Store interface decouples the rest of the module from the specific database implementation, allowing for easier testing and potential future changes in the storage backend.
- The Upsert logic is designed to handle cases where the same commit might be identified as a culprit for multiple regressions (different anomaly groups). Instead of creating duplicate entries, it appends the new anomaly_group_id to the existing record.
- Storing group_issue_map as JSONB allows flexible storage of the mapping between anomaly groups and the specific issue filed for each group in the context of this culprit.

formatter/formatter.go:

- Defines the Formatter interface with methods GetCulpritSubjectAndBody (for new culprit notifications) and GetReportSubjectAndBody (for new anomaly group reports).
- MarkdownFormatter is the concrete implementation. It uses Go's text/template package to render notification messages.
- Custom templates can be supplied via InstanceConfig. If not provided, default templates are used.
- TemplateContext and ReportTemplateContext provide the data that can be used within the templates (e.g., commit details, subscription information, anomaly group details).
- Helper functions buildCommitURL, buildAnomalyGroupUrl, and buildAnomalyDetails are available within the templates to construct URLs and format anomaly details.
- formatter/noop.go provides a NoopFormatter that generates empty subjects and bodies, useful for disabling notifications or for testing scenarios where actual formatting is not needed.

transport/transport.go:

- Defines the Transport interface with the SendNewNotification method.
- IssueTrackerTransport is the concrete implementation for interacting with an issue tracker (e.g., Google Issue Tracker/Buganizer).
- It uses the go.skia.org/infra/go/issuetracker/v1 client library, with the API key fetched via the secret package.
- When SendNewNotification is called, it constructs an issuetracker.Issue object based on the provided subject, body, and subscription details (like component ID, priority, CCs, hotlists).
- Metrics (SendNewNotificationSuccess, SendNewNotificationFail) are recorded to monitor the success rate of sending notifications.
- The Transport interface allows for different notification mechanisms to be plugged in (e.g., email, Slack) in the future.
- transport/noop.go provides a NoopTransport that doesn't actually send any notifications, useful for disabling notifications or for testing.

notify/notify.go:

- Combines a Formatter and a Transport.
- Defines the CulpritNotifier interface with methods NotifyCulpritFound and NotifyAnomaliesFound.
- DefaultCulpritNotifier implements this interface. It takes a formatter.Formatter and a transport.Transport as dependencies.
- The GetDefaultNotifier factory function determines which Formatter and Transport to use based on the InstanceConfig.IssueTrackerConfig.NotificationType. If NoneNotify, it uses NoopFormatter and NoopTransport. If IssueNotify, it sets up MarkdownFormatter and IssueTrackerTransport.
- NotifyCulpritFound calls GetCulpritSubjectAndBody to get the message content, then SendNewNotification to send the message.
- NotifyAnomaliesFound calls GetReportSubjectAndBody, then SendNewNotification.

service/service.go:

- Implements the gRPC service defined in culprit.proto. This is the main entry point for external systems (like a bisection service or an anomaly detection pipeline) to interact with the culprit module.
- Implements the pb.CulpritServiceServer interface.
- Its dependencies include anomalygroup.Store, culprit.Store, subscription.Store, and notify.CulpritNotifier.
- PersistCulprit RPC: calls culpritStore.Upsert to save the identified culprit commits and associate them with the anomaly_group_id, then calls anomalygroupStore.AddCulpritIDs to link the newly created/updated culprit IDs back to the anomaly group.

      [Client (e.g., Bisection Service)]
        |
        v
      [PersistCulpritRequest {Commits, AnomalyGroupID}]
        |
        v
      [culpritService.PersistCulprit]
        +-> [culpritStore.Upsert(AnomalyGroupID, Commits)] -> Returns CulpritIDs
        +-> [anomalygroupStore.AddCulpritIDs(AnomalyGroupID, CulpritIDs)]
        |
        v
      [PersistCulpritResponse {CulpritIDs}]
        |
        v
      [Client]

- GetCulprit RPC: calls culpritStore.Get to retrieve culprit details by their IDs.
- NotifyUserOfCulprit RPC: retrieves the culprits via culpritStore.Get, loads the AnomalyGroup using anomalygroupStore.LoadById, fetches the Subscription associated with the anomaly group using subscriptionStore.GetSubscription, calls notifier.NotifyCulpritFound for each culprit to send a notification (e.g., file a bug), and calls culpritStore.AddIssueId to store the generated issue ID with the culprit and the specific anomaly group.

      [Client (e.g., Bisection Service after PersistCulprit)]
        |
        v
      [NotifyUserOfCulpritRequest {CulpritIDs, AnomalyGroupID}]
        |
        v
      [culpritService.NotifyUserOfCulprit]
        |-> [culpritStore.Get(CulpritIDs)] -> Culprits
        |-> [anomalygroupStore.LoadById(AnomalyGroupID)] -> AnomalyGroup
        |     -> [subscriptionStore.GetSubscription(AnomalyGroup.SubName, AnomalyGroup.SubRev)] -> Subscription
        | (For each Culprit in Culprits)
        |     `-> [notifier.NotifyCulpritFound(Culprit, Subscription)] -> Returns IssueID
        |     `-> [culpritStore.AddIssueId(Culprit.ID, IssueID, AnomalyGroupID)]
        |
        v
      [NotifyUserOfCulpritResponse {IssueIDs}]
        |
        v
      [Client]

- NotifyUserOfAnomaly RPC: loads the AnomalyGroup and its associated Subscription, then calls notifier.NotifyAnomaliesFound to send a notification about the group of anomalies (e.g., file a summary bug).

      [Client (e.g., Anomaly Detection Service)]
        |
        v
      [NotifyUserOfAnomalyRequest {AnomalyGroupID, Anomalies[]}]
        |
        v
      [culpritService.NotifyUserOfAnomaly]
        |-> [anomalygroupStore.LoadById(AnomalyGroupID)] -> AnomalyGroup
        |     -> [subscriptionStore.GetSubscription(AnomalyGroup.SubName, AnomalyGroup.SubRev)] -> Subscription
        `-> [notifier.NotifyAnomaliesFound(AnomalyGroup, Subscription, Anomalies[])] -> Returns IssueID
        |
        v
      [NotifyUserOfAnomalyResponse {IssueID}]
        |
        v
      [Client]

- PrepareSubscription is a helper function used to potentially override or mock subscription details for testing or during transitional phases before full sheriff configuration is active. This is a temporary measure.
- The authorization policy (GetAuthorizationPolicy) is currently set to allow unauthenticated access, which might need to be revisited for production environments.

proto/v1/culprit_service.proto:

- Commit: represents a source code commit.
- Culprit: represents an identified culprit commit, including its ID, the commit details, associated anomaly group IDs, and issue IDs. It also includes group_issue_map to track which issue was filed for which anomaly group in the context of this culprit.
- Anomaly: represents a detected performance anomaly (duplicated from the anomalygroup service for potential independent evolution).
- PersistCulpritRequest/Response: for storing new culprits.
- GetCulpritRequest/Response: for retrieving existing culprits.
- NotifyUserOfAnomalyRequest/Response: for triggering notifications about a new set of anomalies (anomaly group).
- NotifyUserOfCulpritRequest/Response: for triggering notifications about newly identified culprits.
- The Anomaly message is duplicated from the anomalygroup service. This choice was made to allow the culprit service and anomalygroup service to evolve their respective Anomaly definitions independently if needed in the future, avoiding tight coupling.
- group_issue_map in the Culprit message is important for scenarios where a single culprit might be associated with multiple anomaly groups, and each of those (culprit, group) pairs might result in a distinct bug being filed.

mocks/ subdirectories: contain generated mocks for the interfaces of the culprit module (e.g., Store, Formatter, Transport, CulpritNotifier, CulpritServiceServer), produced with mockery.

Typical end-to-end flow:

1. A client (e.g., a bisection service) identifies culprit commits for an AnomalyGroup and calls the CulpritService.PersistCulprit RPC with the identified Commit(s) and the AnomalyGroupID.
2. culpritService uses culpritStore.Upsert to save these commits as Culprit records, linking them to the AnomalyGroupID, and calls anomalygroupStore.AddCulpritIDs to update the AnomalyGroup record with the IDs of these new culprits.
3. The client then calls the CulpritService.NotifyUserOfCulprit RPC with the CulpritID(s) and the AnomalyGroupID.
4. culpritService retrieves the full Culprit details and the associated Subscription.
5. DefaultCulpritNotifier is invoked: MarkdownFormatter generates the subject and body for the notification, and IssueTrackerTransport sends this formatted message to the issue tracker, creating a new bug.
6. culpritService calls culpritStore.AddIssueId to associate this bug ID with the specific Culprit and AnomalyGroupID.

This flow ensures that culprits are stored, linked to their regressions, and users are notified through the configured channels. The modular design allows for flexibility in how each step (storage, formatting, transport) is implemented.
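For orientation, here is a hedged sketch of the Store contract described above. The real interface lives in the culprit module's store.go; the exact method signatures, field names, and types below are assumptions for illustration.

```go
// Package culprit: an illustrative sketch of the Store contract, not the
// actual interface definition.
package culprit

import "context"

// Commit identifies a source code commit by repository coordinates.
type Commit struct {
	Host, Project, Ref, Revision string
}

// Culprit is a commit identified as the likely cause of one or more
// regressions. Field names follow the text; types are assumptions.
type Culprit struct {
	Id              string
	Commit          Commit
	AnomalyGroupIDs []string
	IssueIds        []string
	GroupIssueMap   map[string]string // anomaly group ID -> issue filed for it
}

// Store abstracts culprit persistence so the service layer never talks to
// SQL directly.
type Store interface {
	// Get returns the culprits with the given IDs.
	Get(ctx context.Context, ids []string) ([]*Culprit, error)
	// Upsert inserts new culprits for the anomaly group, or appends the
	// anomaly group ID to culprits that already exist for the same commit.
	Upsert(ctx context.Context, anomalyGroupID string, commits []*Commit) ([]string, error)
	// AddIssueId records the issue filed for this culprit in the context of
	// the given anomaly group.
	AddIssueId(ctx context.Context, culpritID, issueID, anomalyGroupID string) error
}
```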
The dataframe module provides the DataFrame data structure and related functionality for handling and manipulating performance trace data. It is a core component for querying, analyzing, and visualizing performance metrics within the Skia Perf system.
Key Design Principles:
- A DataFrame encapsulates a types.TraceSet, which is a map of trace keys to their corresponding performance values. It also maintains a paramtools.ReadOnlyParamSet, which describes the unique parameter key-value pairs present in the TraceSet. This allows for efficient filtering and aggregation based on trace characteristics.
- The columns of a DataFrame are defined by ColumnHeader structs, each containing a commit offset and a timestamp. This ties the performance data directly to specific points in the codebase's history.
- The DataFrameBuilder interface decouples the DataFrame creation logic from the underlying data source. This allows for different implementations to fetch data (e.g., from a database) while providing a consistent API for consumers.
- Common operations include joining DataFrames (Join), filtering traces (FilterOut), slicing data (Slice), and compressing data by removing empty columns (Compress). These operations are designed with performance considerations in mind.
dataframe.go: This is the central file defining the DataFrame struct and its associated methods.
- DataFrame struct:
  - TraceSet: stores the actual performance data, mapping trace keys (strings representing parameter combinations like ",arch=x86,config=8888,") to types.Trace (slices of float32 values).
  - Header: a slice of *ColumnHeader pointers, defining the columns of the DataFrame. Each ColumnHeader links a column to a specific commit (Offset) and its Timestamp.
  - ParamSet: a paramtools.ReadOnlyParamSet that contains all unique key-value pairs from the keys in TraceSet. This is crucial for understanding the dimensions of the data and for building UI controls for filtering. It's rebuilt by BuildParamSet().
  - Skip: an integer indicating whether any commits were skipped during data retrieval to keep the DataFrame size manageable (related to MAX_SAMPLE_SIZE).
- DataFrameBuilder interface: defines the contract for objects that can construct DataFrame instances. This allows for different data sources or retrieval strategies. Key methods include:
  - NewFromQueryAndRange: creates a DataFrame based on a query and a time range.
  - NewFromKeysAndRange: creates a DataFrame for specific trace keys over a time range.
  - NewNFromQuery / NewNFromKeys: creates a DataFrame with the N most recent data points for matching traces or specified keys.
  - NumMatches / PreflightQuery: used to estimate the size of the data that a query will return, often for UI feedback or to refine queries.
- ColumnHeader struct: represents a single column in the DataFrame, typically corresponding to a commit. It contains:
  - Offset: a types.CommitNumber identifying the commit.
  - Timestamp: the timestamp of the commit in seconds since the Unix epoch.
- NewEmpty(): creates an empty DataFrame.
- NewHeaderOnly(): creates a DataFrame with populated headers (commits within a time range) but no trace data. This can be useful for setting up the structure before fetching actual data.
- FromTimeRange(): retrieves commit information (headers and commit numbers) for a given time range from a perfgit.Git instance. This is a foundational step in populating the Header of a DataFrame.
- MergeColumnHeaders(): a utility function that takes two slices of ColumnHeader and merges them into a single sorted slice, returning mapping indices to reconstruct traces. This is essential for the Join operation.
- Join(): combines two DataFrames into a new DataFrame. It merges their headers and trace data. If traces exist in one DataFrame but not the other for a given key, missing data points (vec32.MissingDataSentinel) are inserted. The ParamSet of the resulting DataFrame is the union of the input ParamSets.

      DataFrame A (Header: [C1, C3], TraceX: [v1, v3])
        |
        V
      DataFrame B (Header: [C2, C3], TraceX: [v2', v3'])
        |
        V
      Joined DataFrame (Header: [C1, C2, C3], TraceX: [v1, v2', v3/v3'])
      (TraceY from A or B padded with missing data)

- BuildParamSet(): recalculates the ParamSet for a DataFrame based on the current keys in its TraceSet. This is called after operations like FilterOut that might change the set of traces.
- FilterOut(): removes traces from the TraceSet based on a provided TraceFilter function. It then calls BuildParamSet() to update the ParamSet.
- Slice(): returns a new DataFrame that is a view into a sub-section of the original DataFrame's columns. The underlying trace data is sliced, not copied, for efficiency.
- Compress(): creates a new DataFrame by removing any columns (and corresponding data points in traces) that contain only missing data sentinels across all traces. This helps in reducing data size and focusing on relevant data points.

dataframe_test.go: Contains unit tests for the functionality in dataframe.go. These tests cover various scenarios, including empty DataFrames, different merging and joining cases, filtering, slicing, and compression. The tests often use gittest for creating mock Git repositories to test time range queries.
/go/dataframe/mocks/DataFrameBuilder.go: This file contains a mock implementation of the DataFrameBuilder interface, generated using the testify/mock library. This mock is used in tests of other packages that depend on DataFrameBuilder, allowing them to simulate DataFrame creation without needing a real data source or Git repository.
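As a concrete illustration of the relationship between trace keys, the TraceSet, and the ParamSet, the following self-contained Go sketch derives a ParamSet from structured trace keys, which is roughly what BuildParamSet is described as doing. The simplified types are stand-ins, not the real dataframe or paramtools types.

```go
package main

import (
	"fmt"
	"strings"
)

// ColumnHeader and DataFrame are simplified stand-ins for the structures
// described above.
type ColumnHeader struct {
	Offset    int64 // commit number
	Timestamp int64 // seconds since the Unix epoch
}

type DataFrame struct {
	TraceSet map[string][]float32
	Header   []*ColumnHeader
	ParamSet map[string][]string
}

// buildParamSet derives the unique key/value pairs from the structured trace
// keys, mirroring what BuildParamSet is described as doing.
func (d *DataFrame) buildParamSet() {
	seen := map[string]map[string]bool{}
	for key := range d.TraceSet {
		for _, pair := range strings.Split(strings.Trim(key, ","), ",") {
			kv := strings.SplitN(pair, "=", 2)
			if len(kv) != 2 {
				continue
			}
			if seen[kv[0]] == nil {
				seen[kv[0]] = map[string]bool{}
			}
			seen[kv[0]][kv[1]] = true
		}
	}
	d.ParamSet = map[string][]string{}
	for k, values := range seen {
		for v := range values {
			d.ParamSet[k] = append(d.ParamSet[k], v)
		}
	}
}

func main() {
	df := &DataFrame{
		TraceSet: map[string][]float32{
			",arch=x86,config=8888,": {1, 2},
			",arch=arm,config=8888,": {3, 4},
		},
		Header: []*ColumnHeader{
			{Offset: 100, Timestamp: 1700000000},
			{Offset: 101, Timestamp: 1700000100},
		},
	}
	df.buildParamSet()
	fmt.Println(df.ParamSet) // arch -> [x86 arm] (order varies), config -> [8888]
}
```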
Workflows:
Fetching Data for Display/Analysis:
1. A DataFrameBuilder implementation (e.g., one that queries a CockroachDB instance) is invoked via NewFromQueryAndRange.
2. Commit headers for the requested time range are obtained via FromTimeRange (which calls perfgit.Git.CommitSliceFromTimeRange). This populates the Header.
3. The data source is queried for traces matching the query, which populates the TraceSet.
4. The ParamSet is populated using BuildParamSet().
5. The completed DataFrame is returned.

Client Request (Query, TimeRange)
|
V
DataFrameBuilder.NewFromQueryAndRange(ctx, begin, end, query, ...)
|
+-> FromTimeRange(ctx, git, begin, end, ...) // Get commit headers
| |
| V
| perfgit.Git.CommitSliceFromTimeRange()
| |
| V
| [ColumnHeader{Offset, Timestamp}, ...]
|
+-> DataSource.QueryTraces(query, commit_numbers) // Fetch trace data
| |
| V
| types.TraceSet
|
+-> DataFrame.BuildParamSet() // Populate ParamSet
|
V
DataFrame{Header, TraceSet, ParamSet}
Joining DataFrames (e.g., from different sources or queries):
1. Two DataFrame instances, dfA and dfB, are available.
2. Join(dfA, dfB) is called.
3. MergeColumnHeaders(dfA.Header, dfB.Header) creates a unified header and maps to align traces.
4. A new TraceSet is built. For each key:
   - If the key exists in dfA but not dfB, its trace is copied and padded with missing values for columns unique to dfB.
   - If the key exists in dfB but not dfA, its trace is copied and padded with missing values for columns unique to dfA.
   - If the key exists in both, the values are merged according to the header mapping.
5. The ParamSets of dfA and dfB are combined.
6. The joined DataFrame is returned.
1. A DataFrame df exists.
2. A TraceFilter function myFilter is defined (e.g., to remove traces with all zero values).
3. df.FilterOut(myFilter) is called.
4. FilterOut iterates over df.TraceSet. If myFilter returns true for a trace, that trace is deleted from the TraceSet.
5. df.BuildParamSet() is called to reflect the potentially reduced set of parameters.
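A self-contained sketch of that filtering idea is shown below, using a plain map as a stand-in for types.TraceSet. The sentinel value and the exact TraceFilter signature are assumptions; the real ones live in go/vec32 and the dataframe package.

```go
package main

import "fmt"

// missingDataSentinel stands in for vec32.MissingDataSentinel; the real value
// is defined in go/vec32.
const missingDataSentinel = float32(1e32)

// TraceSet is a simplified stand-in for types.TraceSet.
type TraceSet map[string][]float32

// filterOut removes every trace for which the filter returns true, mirroring
// DataFrame.FilterOut's contract as described in the text.
func filterOut(ts TraceSet, filter func([]float32) bool) {
	for key, values := range ts {
		if filter(values) {
			delete(ts, key)
		}
	}
}

func main() {
	ts := TraceSet{
		",arch=x86,test=a,": {0, 0, 0},
		",arch=x86,test=b,": {1.5, 2.0, missingDataSentinel},
	}
	// Drop traces whose every point is zero or missing.
	filterOut(ts, func(trace []float32) bool {
		for _, v := range trace {
			if v != 0 && v != missingDataSentinel {
				return false
			}
		}
		return true
	})
	fmt.Println(len(ts)) // 1
}
```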
- DEFAULT_NUM_COMMITS: the default number of commits to fetch when using methods like NewNFromQuery. Set to 50.
- MAX_SAMPLE_SIZE: a limit on the number of commits (columns) a DataFrame might contain, especially when downsampling. Set to 5000. (Note: the downsample parameter in FromTimeRange is currently ignored, meaning this limit might not be strictly enforced by that specific function directly, but it could be a target for other parts of the system or future enhancements.)

The dfbuilder module is responsible for constructing DataFrame objects. DataFrames are fundamental data structures in Perf, representing a collection of performance traces (time series data) along with their associated parameters and commit information. This module acts as an intermediary between the raw trace data stored in a TraceStore and the higher-level analysis and visualization components that consume DataFrames.
The core design revolves around efficiently fetching and organizing trace data based on various querying criteria. This involves interacting with a perfgit.Git instance to resolve commit ranges and timestamps, and a tracestore.TraceStore to retrieve the actual trace data.
Key Responsibilities and Components:
dfbuilder.go: This is the central file implementing the DataFrameBuilder interface.

- builder struct: holds the necessary dependencies, such as perfgit.Git, tracestore.TraceStore, tracecache.TraceCache, and configuration parameters (e.g., tileSize, numPreflightTiles, QueryCommitChunkSize). It also maintains metrics for various DataFrame construction operations.
- Construction (NewDataFrameBuilderFromTraceStore): initializes a builder instance. An important configuration here is filterParentTraces. If enabled, the builder will attempt to remove redundant parent traces when child traces (more specific traces) exist. For example, if traces for test=foo,subtest=bar and test=foo both exist, the latter might be filtered out if filterParentTraces is true.
- Querying by query and time range (NewFromQueryAndRange): builds a DataFrame for all traces matching a query (e.g., config=8888) within a given time period. A sketch of the per-tile concurrency used here appears at the end of this section.
  1. It uses dataframe.FromTimeRange (which internally queries perfgit.Git) to get a list of ColumnHeader (commit information) and CommitNumbers within the specified time range. It also handles downsampling if requested.
  2. It then determines the relevant tiles to query from the TraceStore based on the commit numbers (sliceOfTileNumbersFromCommits).
  3. The core data fetching happens in the new method. This method queries the TraceStore for matching traces per tile concurrently using errgroup.Group for parallelism. This is a key optimization to speed up data retrieval, especially over large time ranges spanning multiple tiles.
  4. A tracesetbuilder.TraceSetBuilder is used to efficiently aggregate the traces fetched from different tiles into a single types.TraceSet and paramtools.ParamSet.
  5. Finally, it constructs and returns a compressed DataFrame.

      NewFromQueryAndRange
        -> dataframe.FromTimeRange (get commits in time range from Git)
        -> sliceOfTileNumbersFromCommits (determine tiles to query)
        -> new (concurrently query TraceStore for each tile)
             -> TraceStore.QueryTraces (for each tile)
             -> tracesetbuilder.Add (aggregate results)
        -> tracesetbuilder.Build
        -> DataFrame.Compress

- Querying by keys and time range (NewFromKeysAndRange): similar to NewFromQueryAndRange in terms of getting commit information for the time range. However, instead of querying by a query.Query object, it directly calls TraceStore.ReadTraces for each relevant tile, providing the list of trace keys. Results are then aggregated. This is generally faster if the exact trace keys are known, as it avoids the overhead of query parsing and matching within the TraceStore.
- Fetching the N most recent points (NewNFromQuery, NewNFromKeys): builds a DataFrame by walking backward in time, tile by tile (or in chunks of QueryCommitChunkSize if configured), until N data points are collected for the matching traces.
  1. It starts from a given end time (or the latest commit if end is zero).
  2. It determines an initial beginIndex and endIndex for commit numbers. The QueryCommitChunkSize can influence this beginIndex to fetch a larger chunk of commits at once, potentially improving parallelism in the new method.
  3. In a loop:
     - It fetches commit headers and indices for the current beginIndex-endIndex range.
     - It calls the new method (for NewNFromQuery) or a similar tile-based fetching logic (for NewNFromKeys) to get a DataFrame for this smaller range.
     - It counts non-missing data points in the fetched DataFrame. If no data is found for maxEmptyTiles consecutive attempts, it stops to prevent searching indefinitely through sparse data.
     - It appends the data from the fetched DataFrame to the result DataFrame, working backward from the Nth slot.
     - It then adjusts beginIndex and endIndex to move to the previous chunk of commits/tiles.
  4. If filterParentTraces is enabled, it calls filterParentTraces to remove redundant parent traces from the final TraceSet.
  5. The resulting DataFrame might have traces of length less than N if not enough data points were found. It trims the traces if necessary.

      NewNFromQuery (or NewNFromKeys)
        -> findIndexForTime (get commit number for 'end' time)
        -> Loop (until N points are found or maxEmptyTiles reached):
             -> fromIndexRange (get commits for current chunk)
             -> new (or similar logic for keys) (fetch data for this chunk)
             -> Aggregate data into result DataFrame
             -> Update beginIndex/endIndex to previous chunk
        -> [Optional] filterParentTraces
        -> Trim traces if fewer than N points found

- Estimating query results (PreflightQuery): before fetching a full DataFrame, it's useful to get an estimate of how many traces will match and what the resulting ParamSet will look like. This allows UIs to present filter options dynamically.
  1. It gets the latest tile from the TraceStore.
  2. It queries the numPreflightTiles most recent tiles (concurrently) for trace IDs matching the query q. This uses getTraceIds, which first attempts to fetch from tracecache and falls back to TraceStore.QueryTracesIDOnly.
  3. The trace IDs (which are paramtools.Params) found are used to build up a ParamSet.
  4. The count of matching traces from the tile with the most matches is taken as the estimated count.
  5. Crucially, for parameter keys present in the input query q, it replaces the values in the computed ParamSet with all values for those keys from the referenceParamSet. This ensures that the UI can still offer all possible filter options for parameters the user has already started filtering on.
  6. The resulting ParamSet is normalized.

      PreflightQuery
        -> TraceStore.GetLatestTile
        -> Loop (for numPreflightTiles, concurrently):
             -> getTraceIds(TileN, query)  // checks tracecache first, then TraceStore.QueryTracesIDOnly
                  -> [If cache miss] TraceStore.QueryTracesIDOnly
                  -> [If cache miss & tracecache enabled] tracecache.CacheTraceIds
             -> Aggregate Params into a new ParamSet -> Update max count
        -> Update ParamSet with values from referenceParamSet for keys in the original query
        -> Normalize ParamSet

- Counting matches (NumMatches): a lighter-weight variant of PreflightQuery that only returns the estimated number of matching traces. It queries the most recent tiles via TraceStore.QueryTracesIDOnly and returns the higher of the two counts.
- Removing redundant parents (filterParentTraces function): builds a filter via tracefilter.NewTraceFilter(). For each trace key in the input TraceSet, the key is parsed into paramtools.Params and added to the traceFilter. traceFilter.GetLeafNodeTraceKeys() then returns only the keys corresponding to the most specific (leaf) traces in the hierarchical structure implied by the paths, and a new TraceSet is built containing only these leaf node traces.
- Caching trace IDs (getTraceIds, cacheTraceIdsIfNeeded): QueryTracesIDOnly can still be somewhat expensive if performed frequently on the same tiles and queries (e.g., during PreflightQuery). Caching the results (the list of matching trace IDs/params) can significantly speed this up. The getTraceIds function first attempts to retrieve trace IDs from the tracecache.TraceCache. If there's a cache miss or the cache is not configured, it queries the TraceStore. If a database query was performed and the cache is configured, cacheTraceIdsIfNeeded is called to store the results in the cache for future requests. The cache key is typically a combination of the tile number and the query string.

Design Choices and Trade-offs:

- Tile-based concurrency: the TraceStore organizes data into tiles. Most dfbuilder operations that involve fetching data across a range of commits are designed to process these tiles concurrently. This improves performance by parallelizing I/O and computation.
- tracesetbuilder: this utility is used to efficiently merge trace data coming from different tiles (which might have different sets of commits) into a coherent TraceSet and ParamSet.
- QueryCommitChunkSize: this parameter in NewNFromQuery allows fetching data in larger chunks than a single tile. This can increase parallelism in the underlying new method call, but fetching too large a chunk might lead to excessive memory usage or longer latency for the first chunk.
- maxEmptyTiles / newNMaxSearch: when searching backward for N data points, these constants prevent indefinite searching if the data is very sparse or the query matches very few traces.
- singleTileQueryTimeout: this guards against queries on individual tiles taking too long, which could happen with "bad" tiles containing excessive data or due to backend issues. This is particularly important for operations like NewNFromQuery or PreflightQuery, which might issue many such single-tile queries.
- Caching for PreflightQuery: PreflightQuery is often called by UIs to populate filter options. Caching the results of QueryTracesIDOnly (which provides the raw data for ParamSet construction in preflight) via tracecache helps make these UI interactions faster.

The dfbuilder_test.go file provides comprehensive unit tests for these functionalities, covering various scenarios including empty queries, queries matching data in single or multiple tiles, N-point queries, and preflight operations with and without caching. It uses gittest for creating a mock Git history and sqltest (for Spanner) or mock implementations for the TraceStore and TraceCache.
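The per-tile concurrency described above can be pictured with a small errgroup sketch. The queryTile function, the TileNumber type, and the naive merge are placeholders; the real code goes through tracestore.TraceStore.QueryTraces and uses tracesetbuilder to align commits across tiles.

```go
package main

import (
	"context"
	"fmt"
	"sync"

	"golang.org/x/sync/errgroup"
)

// TileNumber is a placeholder for the real tile identifier type.
type TileNumber int32

// queryTile stands in for TraceStore.QueryTraces(tile, query).
func queryTile(ctx context.Context, tile TileNumber) (map[string][]float32, error) {
	return map[string][]float32{fmt.Sprintf(",tile=%d,", tile): {1, 2, 3}}, nil
}

// queryAllTiles fans out one goroutine per tile and merges the results,
// mirroring the concurrency shape described for the builder's new method.
func queryAllTiles(ctx context.Context, tiles []TileNumber) (map[string][]float32, error) {
	var mu sync.Mutex
	merged := map[string][]float32{}
	g, ctx := errgroup.WithContext(ctx)
	for _, tile := range tiles {
		tile := tile // capture loop variable for the goroutine
		g.Go(func() error {
			traces, err := queryTile(ctx, tile)
			if err != nil {
				return err
			}
			mu.Lock()
			defer mu.Unlock()
			for k, v := range traces {
				merged[k] = append(merged[k], v...)
			}
			return nil
		})
	}
	if err := g.Wait(); err != nil {
		return nil, err
	}
	return merged, nil
}

func main() {
	traces, err := queryAllTiles(context.Background(), []TileNumber{0, 1, 2})
	fmt.Println(len(traces), err) // 3 <nil>
}
```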
dfiter Module Documentation

The dfiter module is responsible for efficiently creating and providing dataframe.DataFrame objects, which are fundamental data structures used in regression detection within the Perf application. It acts as an iterator, allowing consuming code to process DataFrames one by one. This is particularly useful for performance reasons, as constructing and holding all possible DataFrames in memory simultaneously could be resource-intensive.
The core purpose of dfiter is to abstract away the complexities of fetching and structuring data from the underlying trace store and Git history. It ensures that DataFrames are generated with the correct dimensions and data points based on user-defined queries, commit ranges, and alert configurations.
The dfiter module employs a “slicing” strategy for generating DataFrames. This means it typically fetches a larger, encompassing DataFrame from the dataframe.DataFrameBuilder and then yields smaller, overlapping sub-DataFrames.
Why this approach?
- Fetching one large DataFrame from the underlying store (via the DataFrameBuilder) is often more efficient than making numerous small queries. The slicing operation itself is a relatively cheap in-memory operation.

Key Components and Responsibilities:

- DataFrameIterator interface: the contract for iterating over DataFrames.
  - Next() bool: advances the iterator to the next DataFrame. Returns true if a next DataFrame is available, false otherwise.
  - Value(ctx context.Context) (*dataframe.DataFrame, error): returns the current DataFrame.
- dataframeSlicer struct: the concrete implementation of DataFrameIterator. It embodies the slicing strategy described above. It holds a dataframe.DataFrame (df), the desired size of the sliced DataFrames (determined by alert.Radius), and the current offset for slicing. The Next() method checks if another slice of the specified size can be made, and Value() performs the actual slicing using df.Slice().
- NewDataFrameIterator function: the factory for DataFrameIterator instances. It encapsulates the logic for determining how the initial, larger DataFrame should be fetched based on the input parameters.
  - It parses queryAsString into a query.Query object.
  - Its behavior depends on domain.Offset:
    - domain.Offset == 0 (Continuous/Sliding Window Mode): this mode is typically used for ongoing regression detection across a range of recent commits. It fetches a DataFrame of domain.N commits ending at domain.End. Settling time: if anomalyConfig.SettlingTime is configured, it adjusts domain.End to exclude very recent data points that might not have "settled" (e.g., due to data ingestion delays or pending backfills). This prevents alerts on potentially incomplete or volatile fresh data. The dataframeSlicer will then produce overlapping DataFrames of size 2*alert.Radius + 1.
    - domain.Offset != 0 (Specific Commit/Exact DataFrame Mode): this mode is used when analyzing a specific commit or a small, fixed window around it (e.g., when a user clicks on a specific point in a chart to see its details or re-runs detection for a particular regression). It aims to return a single DataFrame whose size is 2*alert.Radius + 1. To determine the End time for fetching data, it calculates the commit alert.Radius positions after domain.Offset. This ensures the commit at domain.Offset is centered within the radius. For example, if domain.Offset is commit 21 and alert.Radius is 3, it will fetch data up to commit 24 (21 + 3). The resulting DataFrame will then contain commits [18, 19, 20, 21, 22, 23, 24]. This is a specific requirement to ensure consistency with how different step detection algorithms expect their input DataFrames.
  - It uses the dataframe.DataFrameBuilder (dfBuilder) to construct the initial DataFrame (dfBuilder.NewNFromQuery). This involves querying the trace store and potentially Git history.
  - It checks that the fetched DataFrame contains enough commits (at least 2*alert.Radius + 1). If not, it returns ErrInsufficientData. This is crucial because regression detection algorithms require a minimum amount of data to operate correctly.
  - It records the volume of data processed via metrics2.GetCounter("perf_regression_detection_floats"). This helps in monitoring the data processing load.
  - It returns a dataframeSlicer instance initialized with the fetched DataFrame and the calculated slice size.
- ErrInsufficientData: the error returned when the fetched DataFrame does not contain enough commits for the configured radius.

1. Continuous Regression Detection (Sliding Window):
This typically happens when domain.Offset is 0.
[Caller] [NewDataFrameIterator] [DataFrameBuilder] | -- Request with query, domain (N, End), alert (Radius) --> | | | | -- Parse query | | | -- (If anomalyConfig.SettlingTime > 0) Adjust domain.End --> | | | -- dfBuilder.NewNFromQuery(ctx, domain.End, q, domain.N) --> | | | | -- Query TraceStore | | | -- Build large DataFrame | | | <----- DataFrame (df) | | -- Check if len(df.Header) >= 2*Radius+1 | | | -- (If insufficient) Return ErrInsufficientData ----------- | | | -- Create dataframeSlicer(df, size=2*Radius+1, offset=0) | | <----------------- DataFrameIterator (slicer) ------------- | | [Caller] [dataframeSlicer] | | | -- it.Next() ---------------------------------------------> | | | -- return offset+size <= len(df.Header) | <------------------------------ true ---------------------- | | -- it.Value() --------------------------------------------> | | | -- subDf = df.Slice(offset, size) | | -- offset++ | <-------------------------- subDf, nil -------------------- | | -- (Process subDf) | | ... (loop Next()/Value() until Next() returns false) ... |
2. Specific Commit Analysis (Exact DataFrame):
This typically happens when domain.Offset is non-zero.
[Caller] [NewDataFrameIterator] [Git] [DataFrameBuilder] | -- Request with query, domain (Offset), alert (Radius) --> | | | | | -- Parse query | | | | -- targetCommitNum = domain.Offset + alert.Radius | | | | -- perfGit.CommitFromCommitNumber(targetCommitNum) ------> | | | | | -- Lookup commit | | | <----------------------------- commitDetails, nil --------- | | | | -- dfBuilder.NewNFromQuery(ctx, commitDetails.Timestamp, | | | | q, n=2*Radius+1) ------------> | | | | | | -- Query TraceStore | | | | -- Build DataFrame (size 2*R+1) | | <-------------------------------------------------------- DataFrame (df) ----- | | | -- Check if len(df.Header) >= 2*Radius+1 | | | | -- (If insufficient) Return ErrInsufficientData --------- | | | | -- Create dataframeSlicer(df, size=2*Radius+1, offset=0) | | | <----------------------- DataFrameIterator (slicer) ------ | | | [Caller] [dataframeSlicer] | | | -- it.Next() ---------------------------------------------> | | | -- return offset+size <= len(df.Header) (true for the first call) | <------------------------------ true ---------------------- | | -- it.Value() --------------------------------------------> | | | -- subDf = df.Slice(offset, size) (returns the whole df) | | -- offset++ | <-------------------------- subDf, nil -------------------- | | -- (Process subDf) | | -- it.Next() ---------------------------------------------> | | | -- return offset+size <= len(df.Header) (false for subsequent calls) | <------------------------------ false --------------------- |
This design allows for flexible and efficient generation of DataFrames tailored to the specific needs of regression detection, whether it's scanning a wide range of recent commits or focusing on a particular point in time. The use of an iterator pattern also helps manage memory consumption by processing DataFrames sequentially.
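The iterator-plus-slicer shape is easy to see in miniature. The sketch below applies the same Next()/Value() contract to plain int slices instead of DataFrames; the real dataframeSlicer calls df.Slice(offset, size) and its Value method also takes a context and returns an error.

```go
package main

import "fmt"

// slicer illustrates the sliding-window iteration described above, over a
// plain int slice rather than a DataFrame.
type slicer struct {
	data   []int
	size   int // window size, 2*Radius + 1 in the real code
	offset int
}

// Next reports whether another full window can be produced.
func (s *slicer) Next() bool {
	return s.offset+s.size <= len(s.data)
}

// Value returns the current window and advances the offset by one.
func (s *slicer) Value() []int {
	v := s.data[s.offset : s.offset+s.size]
	s.offset++
	return v
}

func main() {
	// With Radius = 1 the window size is 2*Radius + 1 = 3.
	it := &slicer{data: []int{10, 11, 12, 13, 14}, size: 3}
	for it.Next() {
		fmt.Println(it.Value()) // [10 11 12], [11 12 13], [12 13 14]
	}
}
```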
The dryrun module provides the capability to test an alert configuration and preview the regressions it would identify without actually creating an alert or sending notifications. This is a crucial tool for developers and performance engineers to fine-tune alert parameters and ensure they accurately capture relevant performance changes.
The core idea is to simulate the regression detection process for a given alert configuration over a historical range of data. This allows users to iterate on alert definitions, observe the potential impact of those definitions, and avoid alert fatigue caused by poorly configured alerts.
The primary responsibility of the dryrun module is to handle HTTP requests for initiating and reporting the progress of these alert simulations.
dryrun.go: This is the heart of the dryrun module. It defines the Requests struct, which manages the state and dependencies required for processing dry run requests. It also contains the HTTP handler (StartHandler) that orchestrates the dry run process.
Requests struct:
Why: Encapsulates all necessary dependencies (like perfgit.Git for Git interactions, shortcut.Store for shortcut lookups, dataframe.DataFrameBuilder for data retrieval, progress.Tracker for reporting progress, and regression.ParamsetProvider for accessing parameter sets) into a single unit. This promotes modularity and makes it easier to manage and test the dry run functionality.
How: It is instantiated via the New function, which takes these dependencies as arguments. This allows for dependency injection, making the component more testable and flexible.
StartHandler function:
Why: This is the entry point for initiating a dry run. It handles the incoming HTTP request, validates the alert configuration, and kicks off the asynchronous regression detection process.
How:
- It registers a new Progress object with the progress.Tracker to allow clients to monitor the status of the long-running dry run operation.
- It defines a detectorResponseProcessor callback function. This function is invoked by the underlying regression.ProcessRegressions function whenever potential regressions are found. This separation allows the regression module to focus on detection, while the dryrun module handles the presentation and progress updates for the dry run scenario.
- The callback takes the ClusterResponse objects from the regression detection, converts them into user-friendly RegressionAtCommit structures (which include commit details and the detected regression), and updates the Progress object with these results. This enables real-time feedback to the user as regressions are identified.
- The regression.ProcessRegressions function is then called in a goroutine, passing the alert request, the callback, and other necessary dependencies. This function iterates through the relevant data, applies the alert's clustering and detection logic, and invokes the callback for each identified cluster.
- The handler immediately responds with the Progress object, allowing the client to start polling for updates.
Why: Provides a structured way to represent a regression found at a specific commit. This includes both the commit information (CID) and the details of the regression itself (Regression).
How: It's a simple struct used for marshalling the results into JSON for the client.
Client (UI/API) --HTTP POST /dryrun/start with AlertConfig--> Requests.StartHandler
|
V
[Validate AlertConfig]
|
+----------------------------------+----------------------------------+
| (Validation Fails) | (Validation Succeeds)
V V
[Update Progress with Error] [Add to Progress Tracker]
| |
V V
Respond to Client with Error Progress Launch Goroutine: regression.ProcessRegressions(...)
|
V
[Iterate through data, detect regressions]
|
V
For each potential regression cluster:
Invoke `detectorResponseProcessor` callback
|
V
Callback: [Convert ClusterResponse to RegressionAtCommit]
[Update Progress with new RegressionAtCommit]
|
V
(Client polls for Progress updates)
|
V
When ProcessRegressions completes:
[Update Progress: Finished or Error]
The StartHandler effectively acts as a controller that receives the request, performs initial setup and validation, and then delegates the heavy lifting of regression detection to the regression.ProcessRegressions function, ensuring the HTTP request can return quickly while the background processing continues. The callback mechanism allows the dryrun module to react to findings from the regression module in a way that's specific to the dry run use case (i.e., accumulating and formatting results for client display).
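The start-then-poll pattern the handler uses can be sketched generically as below. None of these types are the real progress.Tracker or regression APIs; they only illustrate launching the work in a goroutine, feeding results through a callback, and letting a client poll a shared progress object.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Progress is a toy stand-in for the progress object a client would poll.
type Progress struct {
	mu      sync.Mutex
	results []string
	done    bool
}

func (p *Progress) add(r string) { p.mu.Lock(); defer p.mu.Unlock(); p.results = append(p.results, r) }
func (p *Progress) finish()      { p.mu.Lock(); defer p.mu.Unlock(); p.done = true }

// Snapshot is what a polling client would read.
func (p *Progress) Snapshot() ([]string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	return append([]string(nil), p.results...), p.done
}

// startDryRun mimics StartHandler: kick off detection in the background and
// return the progress object immediately so the HTTP request can complete.
func startDryRun(detect func(onFound func(string))) *Progress {
	p := &Progress{}
	go func() {
		detect(p.add) // the callback plays the role of detectorResponseProcessor
		p.finish()
	}()
	return p
}

func main() {
	p := startDryRun(func(onFound func(string)) {
		for _, c := range []string{"commit 101", "commit 157"} {
			time.Sleep(10 * time.Millisecond)
			onFound("regression at " + c)
		}
	})
	time.Sleep(100 * time.Millisecond) // a real client would poll an endpoint
	results, done := p.Snapshot()
	fmt.Println(results, done)
}
```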
The favorites module provides functionality for users to save and manage “favorite” configurations or views within the Perf application. This allows users to quickly return to specific data explorations or commonly used settings.
The core design philosophy is to provide a persistent storage mechanism for user-specific preferences related to application state (represented as URLs). This is achieved through a Store interface, which abstracts the underlying data storage, and a concrete SQL-based implementation.
store.go: This file defines the central Store interface.
- The interface covers the basic CRUD operations (Get, Create, Update, Delete), a List operation to retrieve all favorites for a specific user, and a Liveness check.
- Favorite: this struct represents a single favorite item, containing fields like ID, UserId, Name, Url, Description, and LastModified. The Url is a key piece of data, as it allows the application to reconstruct the state the user wants to save.
- SaveRequest: this struct is used for creating and updating favorites, encapsulating the data needed for these operations, notably excluding the ID (which is generated or already known) and LastModified (which is handled by the store).
- The Liveness method is a bit of an outlier. It's used to check the health of the database connection. It was placed in this store somewhat arbitrarily because of its lack of essential function compared to other more critical stores, making it a relatively safe place to perform this check without impacting core performance data operations.

sqlfavoritestore/sqlfavoritestore.go: This file provides the SQL implementation of the Store interface.

- It defines the SQL statements needed to fulfill the Store interface. These statements interact with a Favorites table.
- The FavoriteStore struct holds a database connection pool (pool.Pool).
- Get, Create, Update, Delete, and List execute their corresponding SQL statements against the database.
- Timestamps (LastModified) are handled automatically during create and update operations to track when a favorite was last changed.
- Errors are wrapped with skerr.Wrapf to provide context to any database errors.

sqlfavoritestore/schema/schema.go: This file defines the SQL schema for the Favorites table.

- The FavoriteSchema struct uses struct tags (sql:"...") to define column names, types, constraints (like PRIMARY KEY, NOT NULL), and indexes. The byUserIdIndex is crucial for efficiently listing favorites for a specific user.

mocks/Store.go: This file contains a generated mock implementation of the Store interface.

- The mocks fulfill the Store interface and allow tests to simulate different store behaviors (e.g., successful operations, errors) without requiring an actual database connection.
- The mock is generated using the mockery tool. It provides a Store struct that embeds mock.Mock from the testify library. Each method of the interface has a corresponding mock function that can be configured to return specific values or errors.

1. Creating a New Favorite:
User Action (e.g., clicks "Save as Favorite" in UI)
|
V
Application Handler
|
V
[favorites.Store.Create] is called with user ID, name, URL, description
|
V
[sqlfavoritestore.FavoriteStore.Create]
|
V
Generates current timestamp for LastModified
|
V
Executes INSERT SQL statement:
INSERT INTO Favorites (user_id, name, url, description, last_modified) VALUES (...)
|
V
Database stores the new favorite record
|
V
Returns success/error to Application Handler
2. Listing User's Favorites:
User navigates to "My Favorites" page
|
V
Application Handler
|
V
[favorites.Store.List] is called with the current user's ID
|
V
[sqlfavoritestore.FavoriteStore.List]
|
V
Executes SELECT SQL statement:
SELECT id, name, url, description FROM Favorites WHERE user_id=$1
|
V
Database returns rows matching the user ID
|
V
[sqlfavoritestore.FavoriteStore.List] scans rows into []*favorites.Favorite
|
V
Returns list of favorites to Application Handler
|
V
UI displays the list
3. Retrieving a Specific Favorite (e.g., when a user clicks on a favorite to load it):
User clicks on a specific favorite in their list
|
V
Application Handler (obtains favorite ID)
|
V
[favorites.Store.Get] is called with the favorite ID
|
V
[sqlfavoritestore.FavoriteStore.Get]
|
V
Executes SELECT SQL statement:
SELECT id, user_id, name, url, description, last_modified FROM Favorites WHERE id=$1
|
V
Database returns the single matching favorite row
|
V
[sqlfavoritestore.FavoriteStore.Get] scans row into a *favorites.Favorite struct
|
V
Returns the favorite object to Application Handler
|
V
Application uses the `Url` from the favorite object to restore the application state
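To summarize the contract the workflows above rely on, here is a hedged sketch of the favorites Store interface and its data types. Field and method names follow the text, but the exact signatures (context usage, return types, the type of LastModified) are assumptions.

```go
// Package favorites: an illustrative sketch of the Store contract described
// above, not the actual interface definition.
package favorites

import (
	"context"
	"time"
)

// Favorite is a saved application state, keyed by the URL to restore it.
type Favorite struct {
	ID           string
	UserId       string
	Name         string
	Url          string
	Description  string
	LastModified time.Time // assumption: the real schema may store an integer timestamp
}

// SaveRequest carries the user-editable fields for Create and Update; the ID
// and LastModified are handled by the store itself.
type SaveRequest struct {
	UserId      string
	Name        string
	Url         string
	Description string
}

// Store abstracts favorite persistence behind an interface so the SQL
// implementation can be swapped or mocked in tests.
type Store interface {
	Get(ctx context.Context, id string) (*Favorite, error)
	Create(ctx context.Context, req *SaveRequest) error
	Update(ctx context.Context, req *SaveRequest, id string) error
	Delete(ctx context.Context, id string) error
	// List returns all favorites for the given user.
	List(ctx context.Context, userId string) ([]*Favorite, error)
	// Liveness doubles as a cheap database health check.
	Liveness(ctx context.Context) error
}
```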
The file module and its submodules are responsible for providing a unified interface for accessing files from different sources, such as local directories or Google Cloud Storage (GCS). This abstraction allows the Perf ingestion system to treat files consistently regardless of their origin.
The central idea is to define a file.Source interface that abstracts the origin of files. Implementations of this interface are then responsible for monitoring their respective sources (e.g., a GCS bucket via Pub/Sub notifications, or a local directory) and emitting file.File objects through a channel when new files become available.
The file.File struct encapsulates the essential information about a file: its name, an io.ReadCloser for its contents, its creation timestamp, and optionally, the associated pubsub.Message if the file originated from a GCS Pub/Sub notification. This optional field is crucial for acknowledging the message after successful processing, or nack'ing it if an error occurs, ensuring reliable message handling in a distributed system.
file.go: This file defines the core File struct and the Source interface.
File struct: Represents a single file.
- Name: the identifier for the file (e.g., gs://bucket/object or a local path).
- Contents: an io.ReadCloser to read the file's content. This design allows for streaming file data, which is memory-efficient, especially for large files. The consumer is responsible for closing this reader.
- Created: the timestamp when the file was created or last modified (depending on the source).
- PubSubMsg: a pointer to a pubsub.Message. This is populated if the file notification came from a Pub/Sub message (e.g., GCS object change notifications). It's used to Ack or Nack the message, indicating successful processing or a desire to retry/dead-letter.
- Start(ctx context.Context) (<-chan File, error): this method initiates the process of watching for new files. It returns a read-only channel (<-chan File) through which File objects are sent as they are discovered. The method is designed to be called only once per Source instance. This design ensures that the resource setup and monitoring logic (like starting a Pub/Sub subscription listener or initiating a directory walk) is done once per file.Source.

dirsource

The dirsource submodule provides an implementation of file.Source that reads files from a local filesystem directory.

- New(dir string): constructs a DirSource for a given directory path. It resolves the path to an absolute path.
- Start(_ context.Context): when called, it initiates a filepath.Walk over the specified directory. Each file encountered is opened and wrapped in a file.File object. The ModTime of the file is used as the Created timestamp, which is a known simplification for its intended use cases. The file.File objects are sent to an unbuffered channel.

      New(directory) -> DirSource instance
        |
        V
      DirSource.Start() --> Goroutine starts
        |
        V
      filepath.Walk(directory)
        |
        +----------------------+
        |                      |
        V                      V
      For each file:         For each directory:
        os.Open(path)          (skip)
        Create file.File{Name, Contents, ModTime}
        Send file.File to channel
        |
        V
      Caller receives file.File from channel

gcssource

The gcssource submodule implements file.Source for files stored in Google Cloud Storage, using Pub/Sub notifications for new file events.
- New(ctx context.Context, instanceConfig *config.InstanceConfig):
  - Determines the Pub/Sub Subscription name from instanceConfig or generates one based on the Topic (often adding a suffix like -prod or using a round-robin scheme for load distribution if multiple ingester instances are running).
  - Creates a sub.Subscription object to manage receiving messages from the configured Pub/Sub topic/subscription. A key configuration here is ReceiveSettings.MaxExtension = -1. This disables automatic ack deadline extension by the Pub/Sub client library. The rationale is that the gcssource itself will explicitly Ack or Nack messages. If automatic extension were enabled and the processing of a file took longer than the extension period, the message might be redelivered while still being processed, leading to duplicate processing or other issues. By disabling it, the ingester has full control over the message lifecycle.
  - Builds a filter.Filter based on the AcceptIfNameMatches and RejectIfNameMatches regular expressions provided in the instanceConfig. This allows for fine-grained control over which files are processed based on their GCS object names.
- Start(ctx context.Context):
  - Creates the channel of file.File objects.
  - Starts receiving messages via s.subscription.Receive(ctx, s.receiveSingleEventWrapper). The Receive method blocks until a message is available or the context is cancelled, and receiveSingleEventWrapper is called for each Pub/Sub message.
- Message handling (receiveSingleEventWrapper and receiveSingleEvent):
  - The message Data is expected to be a JSON payload describing a GCS object event (specifically, {"bucket": "...", "name": "..."}).
  - A gs:// URI is constructed from the bucket and name.
  - The filter.Filter (configured with regexes) is applied. If the filename is rejected, the message is acked (as there's no point retrying), and processing stops for this event.
  - The filename is then checked against the Sources list in instanceConfig.IngestionConfig.SourceConfig.Sources. These are typically gs:// prefixes. If the filename doesn't match any of these prefixes, it's considered an unexpected file, the message is acked, and processing stops. This ensures that the ingester only processes files from explicitly configured GCS locations.
  - obj.Attrs(ctx) is called to get metadata like the creation time. If this fails (e.g., the object was deleted between notification and processing, or a transient network error occurred), the message is nacked (if dead-lettering is not enabled) or handled by the dead-letter policy, as retrying might succeed.
  - obj.NewReader(ctx) is called to get an io.ReadCloser for the file's content. If this fails, the message is nacked (or dead-lettered).
  - A file.File struct is created with the GCS path, the reader, the attrs.Created time, and the original pubsub.Message. This file.File is sent to the fileChannel.
  - The receiveSingleEvent function returns true if the initial stages of processing (up to sending to the channel) were successful and the message should be acked from Pub/Sub's perspective (meaning it was valid, filtered appropriately, and the object was accessible). It returns false for transient errors where a retry might help (e.g., failing to get object attributes or a reader).
  - receiveSingleEventWrapper then uses this boolean:
    - If dead-lettering is enabled (s.deadLetterEnabled):
      - If receiveSingleEvent returned false (transient error or should retry), the message is Nack()-ed. This typically sends it to a dead-letter topic if configured, or allows Pub/Sub to redeliver it after a backoff.
      - If receiveSingleEvent returned true, the message is not explicitly Ack()-ed here. The acknowledgement is deferred to the consumer of the file.File (i.e., the ingester). This is a critical design choice: the message is only truly "done" when the file content has been fully processed by the downstream system.
    - If dead-lettering is not enabled:
      - If receiveSingleEvent returned true, the message is Ack()-ed.
      - If receiveSingleEvent returned false, the message is Nack()-ed.

Design rationale:

- Deferred acknowledgement: gcssource itself doesn't always immediately Ack messages upon successful GCS interaction. Instead, it passes the *pubsub.Message along in the file.File struct. This allows the ultimate consumer of the file's content (e.g., the Perf ingestion pipeline) to Ack the message only after it has successfully processed and stored the data. This provides end-to-end processing guarantees. If processing fails downstream, the message can be Nack-ed, leading to a retry or dead-lettering.
- Filtering (the regex-based filter.Filter and the prefix-based SourceConfig.Sources) ensures that only desired files are processed.
- The code distinguishes between conditions that warrant an Ack (e.g., a file explicitly filtered out) and those that warrant a Nack (e.g., transient GCS errors), especially when dead-letter queues are in use.
- Pub/Sub messages can be handled with a configurable degree of parallelism (maxParallelReceives), although it is currently set to 1. This can be tuned for performance.

      New(config) -> GCSSource instance (GCS/PubSub clients, filter initialized)
        |
        V
      GCSSource.Start() --> Goroutine starts PubSub subscription.Receive loop
        |
        V
      PubSub message arrives
        |
        V
      receiveSingleEventWrapper(msg)
        |
        V
      receiveSingleEvent(msg)
        |
        +-> Deserialize msg data (JSON: bucket, name)          -> Error? Ack, return.
        |
        +-> Filter filename (regex)                            -> Rejected? Ack, return.
        |
        +-> Check if filename matches config.Sources prefixes  -> No match? Ack, return.
        |
        +-> GCS: storageClient.Object(bucket, name).Attrs()    -> Error? Nack (retryable), return.
        |
        +-> GCS: object.NewReader()                            -> Error? Nack (retryable), return.
        |
        V
      Create file.File{Name, Contents, Created, PubSubMsg: msg}
      Send file.File to fileChannel
        |
        V
      Caller receives file.File from channel
      (Caller later Acks/Nacks msg via file.File.PubSubMsg)

This modular approach to file sourcing makes the Perf ingestion system flexible and easier to test and maintain. New file sources can be added by simply implementing the file.Source interface.
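The ack/nack decision in receiveSingleEventWrapper can be captured in a few lines. In this sketch, ackNacker stands in for *pubsub.Message (which provides Ack and Nack), process stands in for receiveSingleEvent, and the control flow mirrors the rules listed above, including deferring the Ack to the downstream consumer when dead-lettering is enabled.

```go
package main

import "fmt"

// ackNacker stands in for *pubsub.Message, which exposes Ack and Nack.
type ackNacker interface {
	Ack()
	Nack()
}

// receiveWrapper mirrors the described decision logic: process stands in for
// receiveSingleEvent and returns whether the event was handled successfully.
func receiveWrapper(msg ackNacker, deadLetterEnabled bool, process func() bool) {
	ok := process()
	if deadLetterEnabled {
		if !ok {
			msg.Nack() // transient failure: let Pub/Sub redeliver or dead-letter it
		}
		// On success the Ack is deferred to the downstream consumer of file.File.
		return
	}
	if ok {
		msg.Ack()
	} else {
		msg.Nack()
	}
}

type fakeMsg struct{ acked, nacked bool }

func (m *fakeMsg) Ack()  { m.acked = true }
func (m *fakeMsg) Nack() { m.nacked = true }

func main() {
	m := &fakeMsg{}
	receiveWrapper(m, true /* deadLetterEnabled */, func() bool { return true })
	fmt.Println(m.acked, m.nacked) // false false: the ack is deferred to the ingester
}
```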
The filestore module provides an abstraction layer for interacting with different file storage systems. It defines a common interface, leveraging Go's io/fs.FS, allowing the application to read files regardless of whether they are stored locally or in a cloud storage service like Google Cloud Storage (GCS). This design promotes flexibility and testability by decoupling file access logic from the specific storage implementation.
The primary goal is to enable Perf, the performance monitoring system, to seamlessly access data files from various sources. Perf often deals with large datasets and trace files, which might be stored in GCS for scalability and durability or locally during development and testing. By using this module, Perf components can be written to consume data using the standard fs.FS interface without needing to know the underlying storage details.
Key components:
local: This submodule provides an implementation of fs.FS for the local file system.
- The local.New(rootDir string) function initializes a filesystem struct. This struct stores the absolute path to rootDir and uses os.DirFS(rootPath) to create an fs.FS instance scoped to that directory. When Open(name string) is called, it calculates the path relative to rootDir and then uses the underlying os.DirFS to open the file. This ensures that file access is contained within the specified root directory.
- The local.go file contains the filesystem struct and its methods. The core logic resides in the New function for initialization and the Open method for file access. filepath.Abs and filepath.Rel are used to correctly handle and relativize paths.

gcs: This submodule implements fs.FS for Google Cloud Storage.

- The gcs.New(ctx context.Context) function initializes a filesystem struct. It authenticates with GCS using google.DefaultTokenSource to obtain an OAuth2 token source and then creates a *storage.Client. The Open(name string) method expects a GCS URI (e.g., gs://bucket-name/path/to/file). It parses this URI into a bucket name and object path using parseNameIntoBucketAndPath. Then, it uses the storage.Client to get a *storage.Reader for the specified object. This reader is wrapped in a custom file struct which implements fs.File.
- The gcs.go file defines the filesystem struct, which holds the *storage.Client, and the file struct, which wraps *storage.Reader. The New function handles GCS client initialization and authentication. The Open method is responsible for parsing GCS URIs and obtaining a reader for the object. Notably, the Stat() method for gcs.file is intentionally not implemented (returns ErrNotImplemented) because Perf's current usage patterns do not require it, simplifying the implementation. The parseNameIntoBucketAndPath helper function is crucial for translating the GCS URI format into the bucket and object path components required by the GCS client library.

Workflow: Opening a File (Conceptual)
The client code (e.g., a component within Perf) would typically decide which filestore implementation to use based on configuration or the nature of the file path.
Initialization:
- For local files: `fsImpl, err := local.New("/path/to/data/root")`
- For GCS: `fsImpl, err := gcs.New(context.Background())`

File Access:
- The client calls `file, err := fsImpl.Open("relative/path/to/file.json")` (for local) or `file, err := fsImpl.Open("gs://my-bucket/data/some_trace.json")` (for GCS).

Behind the Scenes:

- **Local**:

      local.Open("relative/path/to/file.json")
        |
        V
      Calculates absolute path based on rootDir
        |
        V
      Calls os.DirFS(rootDir).Open("relative/path/to/file.json")
        |
        V
      Returns fs.File (os.File)

- **GCS**:

      gcs.Open("gs://my-bucket/data/some_trace.json")
        |
        V
      parseNameIntoBucketAndPath("gs://my-bucket/data/some_trace.json") --> "my-bucket", "data/some_trace.json"
        |
        V
      gcsClient.Bucket("my-bucket").Object("data/some_trace.json").NewReader()
        |
        V
      Wraps storage.Reader in gcs.file
        |
        V
      Returns fs.File (gcs.file)
Reading Data:
- The client reads from the returned fs.File (e.g., file.Read(buffer)) in a standard way, irrespective of whether it's an os.File or a gcs.file wrapping a storage.Reader.

This abstraction allows Perf to be agnostic to the underlying storage mechanism when reading files, simplifying its data processing pipelines.
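The GCS URI handling described for the gcs submodule can be illustrated with a small parser. This is an assumed implementation of what parseNameIntoBucketAndPath is described as doing; the real helper may differ in details such as error handling.

```go
package main

import (
	"fmt"
	"strings"
)

// parseNameIntoBucketAndPath splits a gs:// URI into its bucket and object
// path, the two pieces the GCS client library needs.
func parseNameIntoBucketAndPath(name string) (bucket, path string, err error) {
	if !strings.HasPrefix(name, "gs://") {
		return "", "", fmt.Errorf("not a GCS URI: %q", name)
	}
	rest := strings.TrimPrefix(name, "gs://")
	parts := strings.SplitN(rest, "/", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", "", fmt.Errorf("malformed GCS URI: %q", name)
	}
	return parts[0], parts[1], nil
}

func main() {
	bucket, path, err := parseNameIntoBucketAndPath("gs://my-bucket/data/some_trace.json")
	fmt.Println(bucket, path, err) // my-bucket data/some_trace.json <nil>
}
```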
The frontend module serves as the backbone for the Perf web UI. It's responsible for handling HTTP requests, rendering HTML templates, and interacting with various backend services and data stores to provide a comprehensive performance analysis platform.
The design philosophy emphasizes a separation of concerns. The core frontend.go file initializes and wires together various components, while the api subdirectory houses specific handlers for different categories of user interactions (e.g., alerts, graphs, regressions). This modular approach simplifies development, testing, and maintenance.
Key Components and Responsibilities:
frontend.go:
New, initialize): This is the entry point. It sets up logging, metrics, reads configuration (config.Config), initializes database connections (TraceStore, AlertStore, RegressionStore, etc.), and establishes connections to external services like Git and potentially Chrome Perf.loadTemplates, templateHandler): It loads HTML templates from the dist directory (produced by the build system). These templates are Go templates, allowing for dynamic data injection. Snippets for Google Analytics (googleanalytics.html) and cookie consent (cookieconsent.html) are embedded and can be included in the rendered pages.getPageContext): This crucial function generates a JavaScript object (window.perf) that is embedded in every HTML page. This object contains configuration values and settings that the client-side JavaScript needs to function correctly, such as API URLs, display preferences, and feature flags. This avoids hardcoding such values in the JavaScript and allows for easier configuration.GetHandler, getFrontendApis): It defines the HTTP routes and associates them with their respective handler functions. This is where chi router is configured. It also instantiates and registers all the API handlers from the api sub-module.loginProvider, RoleEnforcedHandler): It integrates with an authentication system (e.g., proxylogin) to determine user identity and roles. RoleEnforcedHandler is a middleware to protect certain endpoints based on user roles.progressTracker): For operations that might take a significant amount of time (e.g., generating complex data frames for graphs, running regression detection), it uses a progress.Tracker. This allows the frontend to initiate a task, return an ID to the client, and let the client poll for status and results, preventing HTTP timeouts for long operations./_/frame/start with query details.frameStartHandler creates a progress object, adds it to progressTracker.frame.ProcessFrameRequest.frameStartHandler immediately returns the progress object's ID./_/status/{id}./_/frame/results/{id} (managed by progressTracker) once finished.gotoHandler, old URL handlers): Handles redirects for old URLs to new ones and provides a /g/ endpoint to navigate to specific views based on a Git hash.liveness): Provides a /liveness endpoint that checks the health of critical dependencies (like the database connection) for Kubernetes.api (subdirectory): This directory contains the specific HTTP handlers for various features of Perf. Each API is typically encapsulated in its own file (e.g., alertsApi.go, graphApi.go) and implements the FrontendApi interface, primarily its RegisterHandlers method. This design promotes modularity.
alertsApi.go: Manages CRUD operations for alert configurations (alerts.Alert). It interacts with alerts.ConfigProvider (for fetching configurations, potentially cached) and alerts.Store (for persistence). It also handles trying out bug filing and notification sending for alerts. Includes endpoints to list subscriptions and manage dry-run requests for alert configurations.anomaliesApi.go: Provides endpoints for fetching anomaly data. It has two modes of operation:cleanTestName) addresses potential incompatibilities in test naming conventions or characters between systems.subscription.Store, alerts.Store). This allows Perf instances to manage their own anomaly data.favoritesApi.go: Manages user-specific and instance-wide favorite links. User favorites are stored in favorites.Store, while instance-wide favorites can be defined in the main configuration file (config.Config.Favorites). It provides endpoints to list, create, delete, and update favorites.graphApi.go: Handles requests related to plotting graphs.frameStartHandler): As described above, this initiates the potentially long process of fetching trace data and constructing a dataframe.DataFrame. It uses dfbuilder.DataFrameBuilder for this.cidHandler, cidRangeHandler, shiftHandler): Provides details about specific commits or ranges of commits by interacting with perfgit.Git.detailsHandler, linksHandler): Fetches raw data or metadata for a specific trace point at a particular commit. This involves reading from tracestore.TraceStore and potentially the ingestedFS (filesystem where raw ingested data is stored) to get information like associated benchmark links from the original JSON files.pinpointApi.go: Facilitates interaction with the Pinpoint bisection service. It allows users to create bisection jobs (to identify the commit that caused a performance regression) or try jobs (to test a patch). It can proxy requests to a legacy Pinpoint service or a newer backend service.queryApi.go: Supports the query construction UI.initpageHandler, getParamSet): Provides the initial set of queryable parameters (keys and their possible values) to populate the UI. This uses psrefresh.ParamSetRefresher which periodically updates this canonical paramset based on recent data, ensuring the UI reflects available data.countHandler, nextParamListHandler): As the user builds a query in the UI, these handlers can estimate the number of matching traces or provide the next relevant parameter values based on the current partial query. This gives users immediate feedback. The nextParamListHandler is tailored for UIs where parameter selection is ordered (e.g., Chromeperf's UI).regressionsApi.go: Deals with detected regressions.regressionRangeHandler, regressionCountHandler, alertsHandler, regressionsHandler): Fetches regression data from regression.Store based on time ranges, alert configurations, or subscriptions. It can filter by user ownership or category.triageHandler): Allows users (editors) to mark regressions as triaged (e.g., “positive”, “negative”, “ignored”) and associate them with bug reports. If a regression is marked as negative, it can generate a bug report URL using a configurable template.clusterStartHandler): Allows users to initiate the regression detection process for a specific query or set of parameters. This is also a long-running operation managed by progressTracker.anomalyHandler, alertGroupQueryHandler): Provides redirect URLs to the appropriate graph view for a given anomaly ID or alert group ID from Chromeperf. 
This involves generating graph shortcuts.sheriffConfigApi.go: Handles interactions related to LUCI Config for sheriff configurations.getMetadataHandler): Provides metadata to LUCI Config, indicating which configuration files (e.g., skia-sheriff-configs.cfg) Perf owns and the URL for validating changes to these files. This is part of an automated config management system.validateConfigHandler): Receives configuration content from LUCI Config and validates it (e.g., using sheriffconfig.ValidateContent). Returns success or a structured error message.shortcutsApi.go: Manages the creation and retrieval of shortcuts.keysHandler): Allows storing a set of trace keys (queries) and getting a short ID for them. This is used, for example, by the “Share” button on the explore page.getGraphsShortcutHandler, createGraphsShortcutHandler): Manages shortcuts for more complex graph configurations, which can include multiple queries and formulas. These are used for sharing multi-graph views.triageApi.go: Provides endpoints for triaging anomalies, specifically those originating from or managed by Chromeperf. This includes filing new bugs, associating anomalies with existing bugs, and performing actions like ignoring or nudging anomalies. It interacts with chromeperf.ChromePerfClient and potentially an issuetracker.IssueTracker implementation.userIssueApi.go: Manages user-reported issues (Buganizer annotations) associated with specific data points (a trace at a commit). This allows users to link external bug reports directly to performance data points in the UI. It uses userissue.Store for persistence.The overall goal of the frontend module is to provide a responsive and informative user interface by efficiently querying and presenting performance data, while also enabling users to configure alerts, triage regressions, and collaborate on performance analysis. The interaction with various stores and services is abstracted to keep the request handling logic focused.
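Several of the handlers above (e.g., frameStartHandler, clusterStartHandler) rely on the start-then-poll pattern managed by progress.Tracker. The following stripped-down sketch illustrates the general technique with plain net/http; the handler names, URL paths, and job bookkeeping here are illustrative stand-ins, not Perf's actual API.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"sync"
	"time"
)

type job struct {
	Status  string `json:"status"`  // "Running" or "Finished".
	Results any    `json:"results"` // Populated once the work completes.
}

var (
	mu   sync.Mutex
	jobs = map[string]*job{}
)

// startHandler kicks off a long-running task and immediately returns an ID
// that the client can poll, avoiding HTTP timeouts.
func startHandler(w http.ResponseWriter, r *http.Request) {
	id := fmt.Sprintf("%d", time.Now().UnixNano())
	mu.Lock()
	jobs[id] = &job{Status: "Running"}
	mu.Unlock()

	go func() {
		time.Sleep(5 * time.Second) // Stand-in for building a DataFrame, clustering, etc.
		mu.Lock()
		jobs[id].Status = "Finished"
		jobs[id].Results = map[string]string{"msg": "done"}
		mu.Unlock()
	}()

	json.NewEncoder(w).Encode(map[string]string{"id": id})
}

// statusHandler is polled by the client until the job reports "Finished".
func statusHandler(w http.ResponseWriter, r *http.Request) {
	id := r.URL.Query().Get("id")
	mu.Lock()
	j, ok := jobs[id]
	var snapshot job
	if ok {
		snapshot = *j
	}
	mu.Unlock()
	if !ok {
		http.NotFound(w, r)
		return
	}
	json.NewEncoder(w).Encode(snapshot)
}

func main() {
	http.HandleFunc("/start", startHandler)
	http.HandleFunc("/status", statusHandler)
	http.ListenAndServe(":8080", nil)
}
```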
The go/git module provides an abstraction layer for interacting with Git repositories. It is designed to efficiently retrieve and cache commit information, which is essential for performance analysis in Skia Perf. The primary goal is to offer a consistent interface for accessing commit data, regardless of whether the underlying data source is a local Git checkout or a remote Gitiles API.
Design Decisions and Implementation Choices:
/go/git/schema/schema.go.provider.Provider interface (defined in /go/git/provider/provider.go). This allows for different implementations of how Git data is fetched. Currently, two providers are implemented:git_checkout: Interacts with a local Git repository by shelling out to git commands. This is suitable for environments where a local checkout is available and preferred.gitiles: Uses the Gitiles API to fetch commit data. This is useful when direct repository access is not feasible or when leveraging Google's infrastructure for Git operations. The choice of provider is determined by the instance configuration, as seen in /go/git/providers/builder.go.CommitNumbers to commits as they are ingested. This provides a simple, ordered way to refer to commits.instanceConfig.GitRepoConfig.CommitNumberRegex). This is useful for repositories like Chromium that embed a commit position in their messages. The repoSuppliedCommitNumber flag in impl.go controls this behavior.cache in impl.go) is used for frequently accessed commit details (CommitFromCommitNumber). This further speeds up lookups for commonly requested commits. The size of this cache is defined by commitCacheSize.StartBackgroundPolling method in impl.go initiates a goroutine that periodically calls the Update method. This ensures that the local database cache stays synchronized with the remote repository.impl.go. This helps in organizing and managing the queries. Separate statements are defined for different SQL dialects if needed (e.g., insert vs insertSpanner).BadCommit constant provides a sentinel value for functions returning provider.Commit to indicate an error or an invalid commit.Key Responsibilities and Components:
interface.go (Git Interface):Git interface, which is the public contract for this module. It specifies all the operations that can be performed to retrieve commit information.impl.go (Git Implementation):Impl struct, which is the primary implementation of the Git interface.Update method): This is a crucial method responsible for fetching new commits from the configured provider.Provider and storing them in the SQL database. It determines the last known commit and fetches all subsequent commits.repoSuppliedCommitNumber is true, it parses the commit number from the commit body using commitNumberRegex.CommitNumberFromGitHash: Retrieves the sequential CommitNumber for a given Git hash.CommitFromCommitNumber: Retrieves the full provider.Commit details for a given CommitNumber. Uses the LRU cache.CommitNumberFromTime: Finds the CommitNumber closest to (but not after) a given timestamp.CommitSliceFromTimeRange, CommitSliceFromCommitNumberRange: Fetches slices of commits based on time or commit number ranges.GitHashFromCommitNumber: Retrieves the Git hash for a given CommitNumber.PreviousGitHashFromCommitNumber, PreviousCommitNumberFromCommitNumber: Finds the Git hash or commit number of the commit immediately preceding a given commit number.CommitNumbersWhenFileChangesInCommitNumberRange: Identifies commit numbers within a range where a specific file was modified. This involves converting commit numbers to hashes and then querying the provider.Provider.urlFromParts): Constructs a URL to view a specific commit, respecting configurations like DebouceCommitURL or custom CommitURL formats.updateCalled, commitNumberFromGitHashCalled) to monitor the usage and performance of different operations.provider/provider.go (Provider Interface and Commit Struct):provider.Provider interface, which abstracts the source of Git commit data. Implementations of this interface (like git_checkout and gitiles) handle the actual fetching of data.provider.Commit struct, which is the standard representation of a commit used throughout the go/git module and its providers. It includes fields like GitHash, Timestamp, Author, Subject, and Body. The Body is particularly important when repoSuppliedCommitNumber is true, as it's parsed to extract the commit number.providers/builder.go (Provider Factory):New function, which acts as a factory for creating provider.Provider instances based on the instanceConfig.GitRepoConfig.Provider setting. This allows the system to dynamically choose between git_checkout or gitiles (or potentially other future providers).providers/git_checkout/git_checkout.go (CLI Git Provider):provider.Provider by executing git command-line operations.CommitsFromMostRecentGitHashToHead: Uses git rev-list to get commit information.GitHashesInRangeForFile: Uses git log to find changes to a specific file.parseGitRevLogStream: A helper function to parse the output of git rev-list --pretty.providers/gitiles/gitiles.go (Gitiles Provider):provider.Provider by interacting with a Gitiles API endpoint.CommitsFromMostRecentGitHashToHead: Uses gr.LogFnBatch to fetch commits in batches. It handles logic for main branches versus other branches and respects the startCommit.GitHashesInRangeForFile: Uses gr.Log with appropriate path filtering.Update is a no-op for Gitiles as the API always provides the latest data.schema/schema.go (Database Schema):Commit struct with SQL annotations, representing the structure of the Commits table in the database. 
This table stores the cached commit information.gittest/gittest.go (Test Utilities):NewForTest) for setting up test environments. This includes creating a temporary Git repository, populating it with commits, and initializing a test database. This is crucial for writing reliable unit and integration tests for the go/git module and its components.mocks/Git.go (Mock Implementation):Git interface, generated by mockery. This is used in tests of other modules that depend on go/git, allowing them to isolate their tests from actual Git operations or database interactions.Key Workflows:
Initial Population / Update:
Application -> Impl.Update()
|
'-> Provider.Update() (e.g., git pull for git_checkout)
|
'-> Impl.getMostRecentCommit() (from local DB)
|
'-> Provider.CommitsFromMostRecentGitHashToHead(mostRecentDBHash, ...)
|
'-> (For each new commit from Provider)
|
'-> [If repoSuppliedCommitNumber] Impl.getCommitNumberFromCommit(commit.Body)
|
'-> Impl.CommitNumberFromGitHash(commit.GitHash) (Check if already exists)
|
'-> DB.Exec(INSERT INTO Commits ...)
Fetching Commit Details by CommitNumber:
Application -> Impl.CommitFromCommitNumber(commitNum)
|
'-> Check LRU Cache (cache.Get(commitNum))
| |
| '-> [If found] Return cached provider.Commit
|
'-> [If not in LRU] DB.QueryRow(SELECT ... FROM Commits WHERE commit_number=$1)
|
'-> Construct provider.Commit
|
'-> Add to LRU Cache (cache.Add(commitNum, commit))
|
'-> Return provider.Commit
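A hypothetical sketch of that cache-then-database lookup is shown below; the table and column names, and the use of hashicorp/golang-lru and pgx, are assumptions for illustration rather than the actual Impl code.

```go
package gitcache

import (
	"context"

	lru "github.com/hashicorp/golang-lru"
	"github.com/jackc/pgx/v4/pgxpool"
)

// Commit mirrors the fields described for provider.Commit (types are assumed).
type Commit struct {
	CommitNumber int64
	GitHash      string
	Timestamp    int64 // Unix seconds.
	Author       string
	Subject      string
}

type Impl struct {
	db    *pgxpool.Pool
	cache *lru.Cache
}

// CommitFromCommitNumber checks the LRU cache first and falls back to the
// SQL Commits table, populating the cache on a miss.
func (i *Impl) CommitFromCommitNumber(ctx context.Context, commitNum int64) (Commit, error) {
	if v, ok := i.cache.Get(commitNum); ok {
		return v.(Commit), nil
	}
	var c Commit
	err := i.db.QueryRow(ctx,
		`SELECT commit_number, git_hash, commit_time, author, subject
		   FROM Commits WHERE commit_number=$1`, commitNum).Scan(
		&c.CommitNumber, &c.GitHash, &c.Timestamp, &c.Author, &c.Subject)
	if err != nil {
		return Commit{}, err
	}
	i.cache.Add(commitNum, c)
	return c, nil
}
```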
Finding Commits Where a File Changed:
Application -> Impl.CommitNumbersWhenFileChangesInCommitNumberRange(beginNum, endNum, file)
      |
      '-> Impl.PreviousGitHashFromCommitNumber(beginNum) -> beginHash
      |     (or Impl.GitHashFromCommitNumber if beginNum is 0 and start commit is used)
      |
      '-> Impl.GitHashFromCommitNumber(endNum) -> endHash
      |
      '-> Provider.GitHashesInRangeForFile(beginHash, endHash, file) -> changedGitHashes[]
      |
      '-> (For each changedGitHash)
      |      |
      |      '-> Impl.CommitNumberFromGitHash(changedGitHash) -> commitNum
      |      |
      |      '-> Add commitNum to result list
      |
      '-> Return result list
This structure allows Perf to efficiently query and manage Git commit information, supporting its core functionality of tracking performance data across different versions of the codebase.
The graphsshortcut module provides a mechanism for storing and retrieving shortcuts for graph configurations in Perf. Users often define complex sets of graphs for analysis. Instead of redefining these configurations each time or relying on cumbersome URL sharing, this module allows users to save a collection of graph configurations and access them via a unique, shorter identifier. This significantly improves usability and sharing of common graph views.
The core idea is to represent a set of graphs, each with its own configuration (queries, formulas, keys), as a GraphsShortcut object. This object can then be persisted and retrieved using a Store interface. A key design decision is the generation of a unique ID for each GraphsShortcut. This ID is a hash (MD5) of the content of the shortcut, ensuring that identical graph configurations will always have the same ID. This also provides a form of de-duplication. To ensure consistent ID generation, the queries and formulas within each graph configuration are sorted alphabetically before hashing. However, the order of the GraphConfig objects within a GraphsShortcut does affect the generated ID.
User defines graph configurations --> [GraphsShortcut object] -- InsertShortcut --> [Store] --> Generates ID (MD5 hash) --> Persists (ID, Shortcut)
^
|
User provides ID -------------------> [Store] -- GetShortcut --------+------> [GraphsShortcut object] --> Display Graphs
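As a rough illustration of this content-addressed ID scheme, the sketch below hashes the sorted queries and formulas of each graph config with MD5; the field names mirror the descriptions that follow, but the exact serialization is an assumption, not the real GetID implementation.

```go
package graphsshortcut

import (
	"crypto/md5"
	"fmt"
	"sort"
)

type GraphConfig struct {
	Queries  []string
	Formulas []string
	Keys     string
}

type GraphsShortcut struct {
	Graphs []GraphConfig
}

// GetID returns an MD5 hash of the shortcut's content. Queries and formulas
// are sorted first so their order doesn't change the ID; the order of the
// GraphConfig entries themselves still does.
func (g GraphsShortcut) GetID() string {
	h := md5.New()
	for _, cfg := range g.Graphs {
		queries := append([]string{}, cfg.Queries...)
		formulas := append([]string{}, cfg.Formulas...)
		sort.Strings(queries)
		sort.Strings(formulas)
		fmt.Fprintf(h, "%v|%v|%s;", queries, formulas, cfg.Keys)
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}
```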
graphsshortcut.go: This file defines the central data structures and the Store interface.
GraphConfig: Represents the configuration for a single graph. It contains:Queries: A slice of strings, where each string represents a query used to fetch data for the graph.Formulas: A slice of strings, representing any formulas applied to the data.Keys: A string, likely representing a pre-selected set of traces or keys to focus on.GraphsShortcut: This is the primary object that is stored and retrieved. It's essentially a list of GraphConfig objects.GetID(): A method on GraphsShortcut that calculates a unique MD5 hash based on its content. This method is crucial for identifying and de-duplicating shortcuts. It sorts queries and formulas within each GraphConfig before hashing to ensure that the order of these internal elements doesn't change the ID.Store: An interface defining the contract for persisting and retrieving GraphsShortcut objects. It has two methods:InsertShortcut: Takes a GraphsShortcut and stores it, returning its generated ID.GetShortcut: Takes an ID and returns the corresponding GraphsShortcut.graphsshortcutstore/: This subdirectory contains implementations of the graphsshortcut.Store interface.
graphsshortcutstore.go (GraphsShortcutStore): This provides an SQL-backed implementation of the Store.sql.Pool) to manage database connections.InsertShortcut: Marshals the GraphsShortcut object into JSON and stores it as a string in the GraphsShortcuts table along with its pre-computed ID. It uses ON CONFLICT (id) DO NOTHING to avoid errors if the same shortcut (and thus same ID) is inserted multiple times.GetShortcut: Retrieves the JSON string from the database based on the ID and unmarshals it back into a GraphsShortcut object.cachegraphsshortcutstore.go (cacheGraphsShortcutStore): This provides an in-memory cache-backed implementation of the Store.cache.Cache client.InsertShortcut: Marshals the GraphsShortcut to JSON and stores it in the cache using the shortcut's ID as the cache key.GetShortcut: Retrieves the JSON string from the cache by ID and unmarshals it.schema/schema.go: Defines the SQL table schema for GraphsShortcuts. The table primarily stores the id (TEXT, PRIMARY KEY) and the graphs (TEXT, storing the JSON representation of the GraphsShortcut).graphsshortcuttest/graphsshortcuttest.go: This file provides a suite of common tests that can be run against any implementation of the graphsshortcut.Store interface.
InsertGet: Verifies that a shortcut can be inserted and then retrieved, and that the retrieved shortcut is identical to the original (accounting for sorted queries/formulas).GetNonExistent: Ensures that attempting to retrieve a shortcut with an unknown ID results in an error.mocks/Store.go: This file contains a mock implementation of the graphsshortcut.Store interface, generated by the testify/mock library.
Store interface without needing a real database or cache. They allow for controlled testing of different scenarios, such as simulating errors from the store.In summary, the graphsshortcut module provides a flexible way to save and share complex graph views by defining a clear data structure (GraphsShortcut), a standardized way to identify them (GetID), and an interface (Store) for various persistence mechanisms, with current implementations for SQL databases and in-memory caches.
The /go/ingest module is responsible for the entire process of taking performance data files, parsing them, and storing the data into a trace store. This involves identifying the format of the input file, extracting relevant measurements and metadata, associating them with specific commits, and then writing this information to the configured data storage backend.
A key design principle is to support multiple ingestion file formats and to be resilient to errors in individual files. The system attempts to parse files in a specific order, falling back to legacy formats if the primary parsing fails. This allows for graceful upgrades of the ingestion format over time without breaking existing data producers.
The ingestion process also handles trybot data, extracting issue and patchset information, which is crucial for pre-submit performance analysis.
/go/ingest/filter/filter.goThis component provides a mechanism to selectively process or ignore input files based on their names using regular expressions.
Why: In many scenarios, not all files in a data source are relevant for performance analysis. For example, temporary files, logs, or files matching specific patterns might need to be excluded. This filter allows for fine-grained control over which files are ingested.
How:
The filter is configured with two regular expressions: accept and reject.
The accept regex, if provided, means only filenames matching this regex will be considered for processing. If empty, all files are initially accepted.
The reject regex, if provided, means any filename matching this regex will be ignored, even if it matched the accept regex. If empty, no files are rejected based on this rule.
The Reject(name string) bool method implements this logic: a file is rejected if it doesn't match the accept regex (if one is provided) OR if it does match the reject regex (if one is provided).
Workflow:
File Name -> Filter.Reject()
|
+-- accept_regex_exists? -- Yes -> name_matches_accept? -- No -> REJECT
| |
| +-------------------------- Yes --+
+----------------------------- No -----------------------------+
|
V
reject_regex_exists? -- Yes -> name_matches_reject? -- Yes -> REJECT
| |
| +-- No --+
+----------------------------- No -----+
|
V
ACCEPT
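A compact sketch of this accept/reject logic might look like the following; the constructor signature and field names are assumptions that simply mirror the rules above.

```go
package filter

import "regexp"

type Filter struct {
	accept *regexp.Regexp // nil means "accept everything".
	reject *regexp.Regexp // nil means "reject nothing".
}

// New compiles the optional accept and reject regexes.
func New(accept, reject string) (*Filter, error) {
	f := &Filter{}
	var err error
	if accept != "" {
		if f.accept, err = regexp.Compile(accept); err != nil {
			return nil, err
		}
	}
	if reject != "" {
		if f.reject, err = regexp.Compile(reject); err != nil {
			return nil, err
		}
	}
	return f, nil
}

// Reject returns true if the file should be skipped.
func (f *Filter) Reject(name string) bool {
	if f.accept != nil && !f.accept.MatchString(name) {
		return true
	}
	if f.reject != nil && f.reject.MatchString(name) {
		return true
	}
	return false
}
```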
/go/ingest/format/format.go and /go/ingest/format/legacyformat.goThese files define the structure of the data files that the ingestion system can understand. format.go defines the current standard format (Version 1), while legacyformat.go defines an older format primarily used by nanobench.
Why: A well-defined input format is essential for reliable data ingestion. Versioning allows the format to evolve while maintaining backward compatibility or clear error handling for older, unsupported versions. The current format (Format struct) is designed to be flexible, allowing for common metadata (like git hash, issue/patchset), global key-value pairs applicable to all results, and a list of individual results. Each result can have its own set of keys and either a single measurement or a map of “sub-measurements” (e.g., min, max, median for a single test). This structure allows for rich and varied performance data to be represented. The legacy format (BenchData) exists to support older systems that still produce data in that schema.
How:
format.go (Version 1):Format struct: The top-level structure. Contains Version, GitHash, optional trybot info (Issue, Patchset), a global Key map, a slice of Result structs, and global Links.Result struct: Represents one or more measurements. It has its own Key map (which gets merged with the global Key), and critically, either a single Measurement (float32) or a Measurements map.SingleMeasurement struct: Used within Measurements map. It allows associating a value (e.g., “min”, “median”) with a Measurement (float32) and optional Links. This is how multiple metrics for a single conceptual test run are represented.Parse(r io.Reader): Decodes JSON data from a reader into a Format struct. It specifically checks fileFormat.Version == FileFormatVersion.Validate(r io.Reader): Uses a JSON schema (formatSchema.json) to validate the structure of the input data. This ensures that incoming files adhere to the expected contract, preventing malformed data from causing issues downstream.GetLinksForMeasurement(traceID string): Retrieves links associated with a specific measurement, combining global links with measurement-specific ones.legacyformat.go:BenchData struct: Defines the older nanobench format. It has fields like Hash, Issue, PatchSet, Key, Options, and Results. The Results are nested maps leading to BenchResult.BenchResult: A map representing individual test results, typically map[string]interface{} where values are float64s, except for an “options” key.ParseLegacyFormat(r io.Reader): Decodes JSON data into a BenchData struct.The system will first attempt to parse an input file using format.Parse. If that fails (e.g., due to a version mismatch or JSON parsing error), it may then attempt to parse it using format.ParseLegacyFormat as a fallback.
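For orientation, the Version 1 structures described above roughly correspond to Go types like the ones sketched below; the JSON tags and exact field types here are assumptions and should not be taken as the canonical definitions in /go/ingest/format.

```go
package format

// SingleMeasurement associates a value label (e.g. "min", "median") with a
// measurement and optional links.
type SingleMeasurement struct {
	Value       string            `json:"value"`
	Measurement float32           `json:"measurement"`
	Links       map[string]string `json:"links,omitempty"`
}

// Result holds either a single Measurement or a map of sub-measurements.
type Result struct {
	Key          map[string]string              `json:"key"`
	Measurement  float32                        `json:"measurement,omitempty"`
	Measurements map[string][]SingleMeasurement `json:"measurements,omitempty"`
}

// Format is the top-level Version 1 ingestion structure.
type Format struct {
	Version  int               `json:"version"`
	GitHash  string            `json:"git_hash"`
	Issue    string            `json:"issue,omitempty"`
	Patchset string            `json:"patchset,omitempty"`
	Key      map[string]string `json:"key"`
	Results  []Result          `json:"results"`
	Links    map[string]string `json:"links,omitempty"`
}
```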
/go/ingest/format/formatSchema.jsonThis file contains the JSON schema definition for the Format struct defined in format.go.
Why: A JSON schema provides a formal, machine-readable definition of the expected data structure. This is used for validation, ensuring that ingested files conform to the specified format. This helps catch errors early and provides clear feedback on what is wrong with a non-conforming file.
How: It's a standard JSON Schema file. The format.Validate function uses this schema to check the structure and types of the fields in an incoming JSON file. The schema is embedded into the Go binary.
/go/ingest/format/generate/main.goThis is a utility program used to automatically generate formatSchema.json from the Go Format struct definition.
Why: Manually keeping a JSON schema synchronized with Go struct definitions is error-prone. This generator ensures that the schema always accurately reflects the Go types.
How: It uses the go.skia.org/infra/go/jsonschema library, which can reflect on Go structs and produce a corresponding JSON schema. The //go:generate directive in the file allows this program to be run easily (e.g., via go generate).
/go/ingest/parser/parser.goThis is the core component responsible for taking an input file (as file.File), attempting to parse it using the defined formats, and extracting the performance data into a standardized intermediate representation.
Why: This component decouples the specifics of file formats from the process of writing data to the trace store. It handles the logic of trying different parsers, extracting common information like Git hashes and trybot details, and transforming the data into lists of parameter maps (paramtools.Params) and corresponding measurement values (float32). It also enforces rules like branch name filtering and parameter key/value validation.
How:
New(...): Initializes a Parser with instance-specific configurations, such as recognized branch names and a regex for invalid characters in parameter keys/values.Parse(ctx context.Context, file file.File): This is the main entry point for processing a regular data file.extractFromVersion1File (which uses format.Parse).extractFromLegacyFile (which uses format.ParseLegacyFormat).ErrFileShouldBeSkipped.query.ForceValidWithRegex based on the invalidParamCharRegex from the instance configuration. This is crucial because trace IDs (which are derived from these parameters) often have restrictions on allowed characters.params (a slice of paramtools.Params), values (a slice of float32), the gitHash, any global links from the file, and an error.ParseTryBot(file file.File): A specialized function to extract only the Issue and Patchset information from a file, trying both V1 and legacy formats. This is likely used for systems that only need to identify the tryjob associated with a file without processing all the measurement data.ParseCommitNumberFromGitHash(gitHash string): Extracts an integer commit number from a specially formatted git hash string (e.g., “CP:12345” -> 12345). This supports systems that use such commit identifiers.getParamsAndValuesFromLegacyFormat and getParamsAndValuesFromVersion1Format do the actual work of traversing the parsed file structures (BenchData or Format) and flattening them into the params and values slices.f.Results. If a Result has a single Measurement, it combines f.Key and result.Key to form the paramtools.Params.Result has Measurements (a map of string to []SingleMeasurement), it iterates through this map. For each entry, it takes the map's key and the Value from SingleMeasurement to add more key-value pairs to the paramtools.Params.GetSamplesFromLegacyFormat(b *format.BenchData): Extracts raw sample data (if present) from the legacy format. This seems to be for specific use cases where individual sample values, rather than just aggregated metrics, are needed.Key Workflow (Simplified Parse):
Input: file.File
Output: ([]paramtools.Params, []float32, gitHash, links, error)
1. Read file contents.
2. Attempt Parse as Version 1 Format:
`f, err := format.Parse(contents)`
If success:
`params, values := getParamsAndValuesFromVersion1Format(f, p.invalidParamCharRegex)`
`gitHash = f.GitHash`
`links = f.Links`
`commonKeys = f.Key`
Else (error):
Reset reader.
Attempt Parse as Legacy Format:
`benchData, err := format.ParseLegacyFormat(contents)`
If success:
`params, values := getParamsAndValuesFromLegacyFormat(benchData)`
`gitHash = benchData.Hash`
`links = nil` (legacy format doesn't have global links in the same way)
`commonKeys = benchData.Key`
Else (error):
Return error.
3. `branch, ok := p.checkBranchName(commonKeys)`
If !ok:
Return `ErrFileShouldBeSkipped`.
4. If len(params) == 0:
Return `ErrFileShouldBeSkipped`.
5. Return `params, values, gitHash, links, nil`.
/go/ingest/process/process.goThis component orchestrates the entire ingestion pipeline. It takes files from a source (e.g., a directory, GCS bucket), uses the parser to extract data, interacts with git to resolve commit information, and then writes the processed data to a tracestore.TraceStore and tracestore.MetadataStore. It also handles sending Pub/Sub events for ingested files.
Why: This provides the high-level control flow for ingestion. It manages concurrency (multiple worker goroutines), error handling at a macro level (retries for writing to the store), and integration with external systems like Git and Pub/Sub.
How:
Start(...):file.Source (to get files), the tracestore.TraceStore and tracestore.MetadataStore (to write data), and perfgit.Git (to map git hashes to commit numbers).worker goroutines specified by numParallelIngesters.worker listens on a channel provided by the file.Source.worker(...):parser.Parser instance.file.File objects from the channel.workerInfo.processSingleFile.workerInfo.processSingleFile(f file.File): This is the heart of the per-file processing.p.Parse(ctx, f) to get params, values, gitHash, and fileLinks.Parse:parser.ErrFileShouldBeSkipped, acks the Pub/Sub message (if any) and skips.gitHash is empty, logs an error and nacks.p.ParseCommitNumberFromGitHash.g.GetCommitNumber(ctx, gitHash, commitNumberFromFile) to resolve the gitHash (or verify the supplied commit number) against the Git repository. It includes logic to update the local Git repository clone if the hash isn‘t initially found. If the commit cannot be resolved, it logs an error, acks the Pub/Sub message (as retrying won’t help for an unknown commit), and skips.paramtools.ParamSet from all the extracted params.tracestore.TraceStore using store.WriteTraces or store.WriteTraces2 (depending on instanceConfig.IngestionConfig.TraceValuesTableInlineParams). This involves retries in case of transient store errors.WriteTraces2 suggests an optimized path where some parameter data might be stored directly with trace values, potentially for performance reasons.sendPubSubEvent to publish information about the ingested file (trace IDs, paramset, filename) to a configured Pub/Sub topic. This allows other services to react to new data ingestion.fileLinks were present in the input, it calls metadataStore.InsertMetadata to store these links.sendPubSubEvent(...): If a FileIngestionTopicName is configured, this function constructs an ingestevents.IngestEvent containing the trace IDs, the overall ParamSet for the file, and the filename. It then publishes this event to the specified Pub/Sub topic.Overall Ingestion Workflow:
File Source (e.g., GCS bucket watcher)
|
v
[ file.File channel ] -> Worker Goroutine(s)
|
v
processSingleFile(file)
|
+--------------------------+--------------------------+
| | |
v v v
Parser.Parse(file) --> Git.GetCommitNumber(hash) --> TraceStore.WriteTraces(...)
| ^ | | ^
| | (if parsing fails)| | | (retries)
| +-------------------| (update repo if needed) | |
| | | |
+-----> ParamSet Creation +--------------------------+ |
| |
v |
sendPubSubEvent (if success) ------------------------------+
|
v
MetadataStore.InsertMetadata (if links exist)
This architecture allows for robust and scalable ingestion of performance data from various sources and formats, with clear separation of concerns between parsing, data transformation, Git interaction, and storage. The use of Pub/Sub facilitates downstream processing and real-time reactions to newly ingested data.
The ingestevents module is designed to facilitate the communication of ingestion completion events via PubSub. This is a critical part of the event-driven alerting system within Perf, where the completion of data ingestion for a file triggers subsequent processes like regression detection in a clusterer.
The core of this module revolves around the IngestEvent struct. This struct encapsulates the necessary information to be transmitted when a file has been successfully ingested. It includes:
TraceIDs: A slice of strings representing all the unencoded trace identifiers found within the ingested file. These IDs are fundamental for identifying the specific data points that have been processed.ParamSet: An unencoded, read-only representation of the paramtools.ParamSet that summarizes the TraceIDs. This provides a consolidated view of the parameters associated with the ingested traces.Filename: The name of the file that was ingested. This helps in tracking the source of the ingested data.To handle the transmission of IngestEvent data over PubSub, the module provides two key functions:
CreatePubSubBody: This function takes an IngestEvent struct as input and prepares it for PubSub transmission. The “how” here involves a two-step process:
The IngestEvent is first encoded into JSON, which provides a structured and widely compatible representation of the data, and the resulting bytes are then gzip-compressed for transmission.
IngestEvent (struct) ---> JSON Encoding ---> Gzip Compression ---> []byte (for PubSub)
DecodePubSubBody: This function performs the reverse operation. It takes a byte slice (presumably received from a PubSub message) and decodes it back into an IngestEvent struct. The process is:
The byte slice is gzip-decompressed and the resulting JSON is decoded back into an IngestEvent struct. Error handling is incorporated at each step to manage potential issues during decompression or JSON decoding.
[]byte (from PubSub) ---> Gzip Decompression ---> JSON Decoding ---> IngestEvent (struct)
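A minimal sketch of this JSON+gzip round trip, assuming the IngestEvent fields described above (with ParamSet simplified to a plain map), is:

```go
package ingestevents

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
)

type IngestEvent struct {
	TraceIDs []string
	ParamSet map[string][]string
	Filename string
}

// CreatePubSubBody JSON-encodes the event and gzip-compresses the result.
func CreatePubSubBody(ev *IngestEvent) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if err := json.NewEncoder(zw).Encode(ev); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil { // Close flushes the gzip trailer.
		return nil, err
	}
	return buf.Bytes(), nil
}

// DecodePubSubBody gunzips the body and decodes the JSON back into an IngestEvent.
func DecodePubSubBody(body []byte) (*IngestEvent, error) {
	zr, err := gzip.NewReader(bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	var ev IngestEvent
	if err := json.NewDecoder(zr).Decode(&ev); err != nil {
		return nil, err
	}
	return &ev, nil
}
```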
The primary responsibility of this module is therefore to provide a standardized and efficient way to serialize and deserialize ingestion event information for PubSub communication. The design choice of using JSON for structure and gzip for compression balances readability, interoperability, and an efficient use of PubSub resources.
The file ingestevents.go contains the definition of the IngestEvent struct and the implementation of the CreatePubSubBody and DecodePubSubBody functions. The corresponding test file, ingestevents_test.go, ensures that the encoding and decoding processes work correctly, verifying that an IngestEvent can be successfully round-tripped through the serialization and deserialization process.
The initdemo module provides a command-line application designed to initialize a database instance, specifically targeting CockroachDB or a Spanner emulator, for demonstration or development purposes.
Its primary purpose is to automate the creation of a named database and the application of the latest database schema. This ensures a consistent and ready-to-use database environment, removing the manual steps often required for setting up a database for applications like Skia Perf.
The core functionality revolves around connecting to a specified database URL, attempting to create the database (gracefully handling cases where it already exists), and then executing the appropriate schema definition. The choice of schema (standard SQL or Spanner-specific) is determined by a command-line flag.
Key Components and Responsibilities:
main.go: This is the entry point and sole Go source file for the application.--databasename: Specifies the name of the database to be created (defaults to “demo”). This allows users to customize the database name for different environments or purposes.--database_url: Provides the connection string for the CockroachDB instance (defaults to a local instance postgresql://root@127.0.0.1:26257/?sslmode=disable). This allows connection to different database servers or configurations.--spanner: A boolean flag that, when set, instructs the application to use the Spanner-specific schema. This is crucial for ensuring compatibility when targeting a Spanner emulator, which may have different SQL syntax or feature support compared to CockroachDB.pgxpool library, which is a PostgreSQL driver and connection pool for Go. This library was chosen for its robustness and performance in handling PostgreSQL-compatible databases like CockroachDB.CREATE DATABASE SQL statement. The implementation includes error handling to log an informational message if the database already exists, rather than failing, making the script idempotent in terms of database creation.SET DATABASE to switch the current session's context to the newly created (or existing) database. This is a CockroachDB-specific command.--spanner flag, it selects the appropriate schema definition.--spanner is false, it uses sql.Schema from the //perf/go/sql module, which contains the standard SQL schema for Perf.--spanner is true, it uses spanner.Schema from the //perf/go/sql/spanner module, which contains the schema adapted for Spanner. This separation allows maintaining distinct schema versions tailored to the nuances of each database system.Workflow:
The typical workflow of the initdemo application can be visualized as:
Parse Flags: Application Start -> Read --databasename, --database_url, --spanner
Connect to Database: Use --database_url -> pgxpool.Connect() -> Connection Pool (conn)
Create Database: conn + Use --databasename -> Execute "CREATE DATABASE <name>"
      |
      +-- Success
      |
      +-- Error (e.g., already exists) -> Log Info "Database <name> already exists."
Set Active Database (if not Spanner): Is --spanner false?
      |
      +-- Yes -> conn + Use --databasename -> Execute "SET DATABASE <name>"
      |     |
      |     +-- Error -> sklog.Fatal()
      |
      +-- No (Spanner enabled) -> Skip this step
Select Schema: Is --spanner true?
      |
      +-- Yes -> dbSchema = spanner.Schema
      |
      +-- No -> dbSchema = sql.Schema
Apply Schema: conn + dbSchema -> Execute schema DDL
      |
      +-- Error -> sklog.Fatal()
Close Connection: conn.Close() -> Application End
This process ensures that a target database is either created or confirmed to exist, and then the correct schema is applied, making it ready for use. The choice of using pgxpool for database interaction and providing separate schema definitions for standard SQL and Spanner demonstrates a design focused on supporting multiple database backends for the Perf system. The error handling, particularly for the database creation step, aims for robust and user-friendly operation.
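Condensed into code, the flow above might look roughly like this sketch, which assumes pgx v4's pgxpool API and substitutes a trivial placeholder statement for the real sql.Schema / spanner.Schema DDL:

```go
package main

import (
	"context"
	"flag"
	"fmt"
	"log"

	"github.com/jackc/pgx/v4/pgxpool"
)

func main() {
	databaseName := flag.String("databasename", "demo", "Name of the database to create.")
	databaseURL := flag.String("database_url", "postgresql://root@127.0.0.1:26257/?sslmode=disable", "Database connection string.")
	spanner := flag.Bool("spanner", false, "Use the Spanner-specific schema.")
	flag.Parse()

	ctx := context.Background()
	conn, err := pgxpool.Connect(ctx, *databaseURL)
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	// Idempotent creation: an "already exists" error is logged, not fatal.
	if _, err := conn.Exec(ctx, fmt.Sprintf("CREATE DATABASE %s", *databaseName)); err != nil {
		log.Printf("Database %s already exists: %v", *databaseName, err)
	}

	// CockroachDB-specific: point the session at the new database.
	if !*spanner {
		if _, err := conn.Exec(ctx, fmt.Sprintf("SET DATABASE = %s", *databaseName)); err != nil {
			log.Fatal(err)
		}
	}

	// Placeholder for sql.Schema or spanner.Schema.
	schema := `CREATE TABLE IF NOT EXISTS Demo (id INT PRIMARY KEY)`
	if _, err := conn.Exec(ctx, schema); err != nil {
		log.Fatal(err)
	}
}
```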
This module provides an interface and implementation for interacting with the Google Issue Tracker API, specifically tailored for Perf's needs. The primary goal is to abstract the complexities of the Issue Tracker API and provide a simpler, more focused way to retrieve issue details and add comments to existing issues. This allows other parts of the Perf system to integrate with issue tracking without needing to directly handle API authentication, request formatting, or response parsing.
The module is designed around the IssueTracker interface, which defines the core operations:
Listing Issues (ListIssues): This function allows retrieving details for a set of specified issue IDs.
- **Why**: Perf often needs to fetch information about bugs that have been filed (e.g., to display their status or link to them from alerts). Providing a bulk retrieval mechanism based on IDs is efficient.
- **How**: The implementation takes a `ListIssuesRequest` containing a slice of integer issue IDs. It constructs a query string by joining these IDs with " | " (the OR operator in the Issue Tracker query language) and wrapping the result in "id:(...)". This formatted query is then sent to the Issue Tracker API.
- **Example Workflow**:
  Perf System --- ListIssuesRequest (IDs: [123, 456]) ---> issuetracker Module
      |
      v
  Construct Query: "id:(123 | 456)"
      |
      v
  Issue Tracker API <--- GET Request --- issueTrackerImpl
      |
      v
  Perf System <--- []*issuetracker.Issue --- Response Parsing <--- API Response
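A tiny, illustrative helper for that query construction (the helper name is hypothetical) could be:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

type ListIssuesRequest struct {
	IssueIds []int
}

// buildQuery turns the request's IDs into an Issue Tracker query string,
// e.g. []int{123, 456} -> "id:(123 | 456)".
func buildQuery(req ListIssuesRequest) string {
	ids := make([]string, 0, len(req.IssueIds))
	for _, id := range req.IssueIds {
		ids = append(ids, strconv.Itoa(id))
	}
	return fmt.Sprintf("id:(%s)", strings.Join(ids, " | "))
}

func main() {
	fmt.Println(buildQuery(ListIssuesRequest{IssueIds: []int{123, 456}}))
}
```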
Creating Comments (CreateComment): This function allows adding a new comment to an existing issue.
- **Why**: Perf might need to automatically update bugs with new information, such as when a regression is fixed or when more data about an alert becomes available.
- **How**: It takes a `CreateCommentRequest` containing the `IssueId` and the `Comment` string. The implementation constructs an `issuetracker.IssueComment` object and uses the Issue Tracker client library to post this comment to the specified issue.
- **Example Workflow**:
  Perf System --- CreateCommentRequest (ID: 789, Comment: "...") ---> issuetracker Module
      |
      v
  Issue Tracker API <--- POST Request --- issueTrackerImpl
      |
      v
  Perf System <--- CreateCommentResponse <--- Response Parsing <--- API Response
issuetracker.go:
IssueTracker interface: Defines the contract for interacting with the issue tracker. This allows for decoupling the client code from the specific implementation and facilitates testing using mocks.issueTrackerImpl struct: The concrete implementation of the IssueTracker interface. It holds an instance of the issuetracker.Service client, which is the generated Go client for the Google Issue Tracker API.NewIssueTracker function: This is the factory function for creating an issueTrackerImpl instance.config.IssueTrackerConfig. It then uses google.DefaultClient with the “https://www.googleapis.com/auth/buganizer” scope to obtain an authenticated HTTP client. This client and the API key are then used to initialize the issuetracker.Service.BasePath of the issuetracker.Service is explicitly set to “https://issuetracker.googleapis.com” to ensure it points to the correct API endpoint.ListIssuesRequest, CreateCommentRequest, CreateCommentResponse): These simple structs define the data structures for requests and responses, making the interface clear and easy to use. They are designed to be minimal and specific to the needs of the Perf system.mocks/IssueTracker.go:
IssueTracker interface, generated using the testify/mock library.issuetracker module. They allow tests to simulate various responses (success, failure, specific data) from the issue tracker without making actual API calls. This makes tests faster, more reliable, and independent of external services.IssueTracker mock struct embeds mock.Mock and provides mock implementations for ListIssues and CreateComment. The NewIssueTracker function in this file is a constructor for the mock, which also sets up test cleanup to assert that all expected mock calls were made.IssueTracker) promotes loose coupling and testability. Consumers depend on the abstraction rather than the concrete implementation.skerr.Wrapf to wrap errors, providing context and making debugging easier. It also includes input validation for CreateCommentRequest to prevent invalid API calls.sklog.Debugf) are included to trace requests and responses, which can be helpful during development and troubleshooting.The module relies on the external go.skia.org/infra/go/issuetracker/v1 library, which is the auto-generated client for the Google Issue Tracker API. This design choice leverages existing, well-tested client libraries instead of reimplementing API interaction from scratch.
This module provides a generic implementation of the k-means clustering algorithm. The primary goal is to offer a flexible way to group a set of data points (observations) into a predefined number of clusters (k) based on their similarity. The “similarity” is determined by a distance metric, and the “center” of each cluster is represented by a centroid.
The module is designed with generality in mind. Instead of being tied to a specific data type or distance metric, it uses interfaces (Clusterable, Centroid) and a function type (CalculateCentroid). This approach allows users to define their own data structures and distance calculations, making the k-means algorithm applicable to a wide variety of problems.
Interfaces for Flexibility:
Clusterable: This is a marker interface. Any data type that needs to be clustered must satisfy this interface. In practice, this means you can use interface{} and then perform type assertions within your custom distance and centroid calculation functions. This design choice prioritizes ease of use for simple cases, where the same type might represent both an observation and a centroid.Centroid: This interface defines the contract for centroids.AsClusterable() Clusterable: This method is crucial for situations where a centroid itself can be treated as a data point (e.g., when calculating distances or when a centroid is part of the initial observation set). It allows the algorithm to seamlessly integrate centroids into lists of clusterable items. If a centroid cannot be meaningfully converted to a Clusterable, it returns nil.Distance(c Clusterable) float64: This method is the core of the similarity measure. It calculates the distance between the centroid and a given Clusterable data point. The user provides the specific implementation for this, enabling the use of various distance metrics (Euclidean, Manhattan, etc.).CalculateCentroid func([]Clusterable) Centroid: This function type defines how a new centroid is computed from a set of Clusterable items belonging to a cluster. This allows users to implement different strategies for centroid calculation, such as taking the mean, median, or other representative points.Lloyd's Algorithm Implementation:
The core clustering logic is implemented in the Do function, which performs a single iteration of Lloyd's algorithm. This is a common and relatively straightforward iterative approach to k-means.
The KMeans function orchestrates multiple iterations of Do. A key design consideration here is the convergence criterion. Currently, it runs for a fixed number of iterations (iters). A more sophisticated approach would be to iterate until the total error (or the change in centroid positions) falls below a certain threshold, indicating that the clusters have stabilized. This was likely deferred for simplicity in the initial implementation, but it's an important aspect for practical applications to avoid unnecessary computations or premature termination.
Why modify centroids in-place in Do?
The Do function modifies the centroids slice passed to it. The documentation explicitly advises calling it as centroids = Do(observations, centroids, f). This design choice might have been made for efficiency, avoiding the allocation of a new centroids slice in every iteration if the number of centroids remains the same. However, it also means the caller needs to be aware of this side effect. The function does return the potentially new slice of centroids, which is important because centroids can be “lost” if a cluster becomes empty.
kmeans.go: This is the sole source file and contains all the logic for the k-means algorithm.
Clusterable (interface): Defines the contract for data points that can be clustered. Its main purpose is to allow generic collections of items.Centroid (interface): Defines the contract for cluster centers, including how to calculate their distance to data points and how to treat them as data points themselves.CalculateCentroid (function type): A user-provided function that defines the logic for computing a new centroid from a group of data points. This separation of concerns is key to the module's flexibility.closestCentroid(observation Clusterable, centroids []Centroid) (int, float64): A helper function that finds the index of the centroid closest to a given observation and the distance to it. This is a fundamental step in assigning observations to clusters.Do(observations []Clusterable, centroids []Centroid, f CalculateCentroid) []Centroid:Observations --> [Find Closest Centroid for each] --> Temporary Cluster Assignments 2. For each temporary cluster, it recalculates a new centroid using the user-provided f function. Temporary Cluster Assignments --> [Group by Cluster] --> Sets of Clusterable items | V [Apply 'f'] --> New Centroids 3. If a cluster becomes empty (no observations are closest to its centroid), that centroid is effectively removed in this iteration, as f will not be called for an empty set of Clusterable items, and newCentroids will not include it.KMeans function clearer. The in-place modification (and return value) addresses the potential for the number of centroids to change.GetClusters(observations []Clusterable, centroids []Centroid) ([][]Clusterable, float64):AsClusterable() is not nil).totalError.KMeans(observations []Clusterable, centroids []Centroid, k, iters int, f CalculateCentroid) ([]Centroid, [][]Clusterable):Initial Centroids --(iter 1)--> Do() --(updates)--> Centroids' | --(iter 2)--> Do() --(updates)--> Centroids'' ... --(iter 'iters')--> Do() --(updates)--> Final Centroids | V GetClusters() --> Final Clustersiters) is a straightforward stopping condition, though, as mentioned, convergence-based stopping would be more robust. The k parameter seems redundant given that the initial number of centroids is determined by len(centroids). If k was intended to specify the desired number of clusters and the initial centroids were just starting points, the implementation would need to handle cases where len(centroids) != k. However, the current Do function naturally adjusts the number of centroids if some clusters become empty.TotalError(observations []Clusterable, centroids []Centroid) float64:GetClusters and returns the totalError computed by it.1. Single K-Means Iteration (Do function):
Input: Observations (O), Current Centroids (C_curr), CalculateCentroid function (f)
1. For each Observation o in O:
Find c_closest in C_curr such that Distance(o, c_closest) is minimized.
Assign o to the cluster associated with c_closest.
---> Result: A mapping of each Observation to a Centroid index.
2. Initialize NewCentroids (C_new) as an empty list.
3. For each unique Centroid index j (from 0 to k-1):
a. Collect all Observations (O_j) assigned to cluster j.
b. If O_j is not empty:
Calculate new_centroid_j = f(O_j).
Add new_centroid_j to C_new.
---> Potentially, some original centroids might not have any observations assigned,
so C_new might have fewer centroids than C_curr.
Output: New Centroids (C_new)
2. Full K-Means Clustering (KMeans function):
Input: Observations (O), Initial Centroids (C_init), Number of Iterations (iters), CalculateCentroid function (f)
1. Set CurrentCentroids = C_init.
2. Loop 'iters' times:
CurrentCentroids = Do(O, CurrentCentroids, f) // Perform one iteration
---> CurrentCentroids are updated.
3. FinalCentroids = CurrentCentroids.
4. Clusters, TotalError = GetClusters(O, FinalCentroids)
---> Assigns each observation to its final cluster based on FinalCentroids.
The first element of each sub-array in Clusters is the centroid itself.
Output: FinalCentroids, Clusters
The unit tests in kmeans_test.go provide excellent examples of how to implement the Clusterable, Centroid, and CalculateCentroid requirements for a simple 2D point scenario. They demonstrate the expected behavior of the Do and KMeans functions, including edge cases like empty inputs or losing centroids when clusters become empty.
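Along the lines of those tests, a hypothetical 2D-point caller might look like the sketch below; the interface and function signatures are taken from the descriptions above, while the import path and point type are assumptions.

```go
package main

import (
	"fmt"
	"math"

	"go.skia.org/infra/perf/go/kmeans" // Import path is an assumption.
)

// point acts as both an observation and a centroid.
type point struct{ x, y float64 }

// Distance implements the Centroid interface using Euclidean distance.
func (p point) Distance(c kmeans.Clusterable) float64 {
	o := c.(point)
	return math.Hypot(p.x-o.x, p.y-o.y)
}

// AsClusterable lets a centroid be treated as an ordinary observation.
func (p point) AsClusterable() kmeans.Clusterable { return p }

// meanCentroid is a CalculateCentroid that averages the members of a cluster.
func meanCentroid(members []kmeans.Clusterable) kmeans.Centroid {
	var sx, sy float64
	for _, m := range members {
		p := m.(point)
		sx += p.x
		sy += p.y
	}
	n := float64(len(members))
	return point{sx / n, sy / n}
}

func main() {
	observations := []kmeans.Clusterable{
		point{0, 0}, point{1, 0}, point{10, 10}, point{11, 10},
	}
	// Seed with two initial centroids; k matches len(centroids).
	centroids := []kmeans.Centroid{point{0, 0}, point{10, 10}}
	finalCentroids, clusters := kmeans.KMeans(observations, centroids, 2, 10, meanCentroid)
	fmt.Println(finalCentroids, clusters)
}
```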
The maintenance module in Perf is responsible for executing a set of long-running background processes that are essential for the health and operational integrity of a Perf instance. These tasks ensure that data is kept up-to-date, system configurations are current, and storage is managed efficiently. The module is designed to be started once and run continuously, performing its duties at predefined intervals.
The core design principle behind the maintenance module is to centralize various periodic tasks that would otherwise be scattered or require manual intervention. By consolidating these operations, the system becomes more robust and easier to manage.
Key design choices include:
config.MaintenanceFlags) and the instance-specific configuration (config.InstanceConfig). This provides flexibility for different Perf deployments and operational needs.builders.NewDBPoolFromConfig), Git interfaces (builders.NewPerfGitFromConfig), and caching mechanisms (builders.GetCacheFromConfig) are created and passed into the respective maintenance tasks. This promotes modularity and testability.sklog) to provide visibility into its operations and to aid in diagnosing issues. While errors in one task might be logged, the overall Start function aims to keep other independent tasks running.flags.MigrateRegressions, instanceConfig.EnableSheriffConfig). This allows for gradual rollouts and testing in production environments.The maintenance module orchestrates several distinct background processes.
1. Core Initialization and Schema Management (maintenance.go)
tracing.Init: Sets up the distributed tracing system.builders.NewDBPoolFromConfig: Establishes a connection pool to the database.expectedschema.ValidateAndMigrateNewSchema: Checks the current database schema version against the expected version defined in the codebase. If they don't match, it applies the necessary migrations to bring the schema up to date. This is a critical step to prevent data corruption or application errors due to schema mismatches.2. Git Repository Synchronization (maintenance.go)
builders.NewPerfGitFromConfig: Creates an instance of perfgit.Git, which provides an interface to the Git repository.g.StartBackgroundPolling(ctx, gitRepoUpdatePeriod): This method launches a goroutine within the perfgit component. This goroutine periodically fetches the latest changes from the remote Git repository (origin) and updates the local representation, typically also updating a Commits table in the database with new commit information. The gitRepoUpdatePeriod constant (e.g., 1 minute) defines how frequently this update occurs.3. Regression Schema Migration (maintenance.go)
flags.MigrateRegressions flag.migration.New: Creates a Migrator instance, likely configured with database connections for both the old and new regression storage mechanisms.migrator.RunPeriodicMigration(regressionMigratePeriod, regressionMigrationBatchSize): Starts a goroutine that, at intervals defined by regressionMigratePeriod, processes a regressionMigrationBatchSize number of regressions, moving them from the old storage to the new. This batching approach prevents overwhelming the database and allows the migration to proceed incrementally.4. Sheriff Configuration Import (maintenance.go)
instanceConfig.EnableSheriffConfig and a non-empty instanceConfig.InstanceName.AlertStore and SubscriptionStore for managing alert and subscription data within Perf.luciconfig.NewApiClient: Creates a client to communicate with the LUCI Config service.sheriffconfig.New: Initializes the SheriffConfig service, which encapsulates the logic for fetching, parsing, and applying Sheriff configurations.sheriffConfig.StartImportRoutine(configImportPeriod): Launches a goroutine that periodically (every configImportPeriod) polls the LUCI Config service for the specified instance. If new or updated configurations are found, they are processed and stored/updated in Perf's database (e.g., in the Alerts and Subscriptions tables).5. Query Cache Refresh (maintenance.go)
5. Query Cache Refresh (maintenance.go)
Why: To speed up common queries (e.g., retrieving the set of available trace parameters, known as ParamSets), Perf can cache this information. This component is responsible for periodically rebuilding and refreshing these caches.
How:
- Runs only when the flags.RefreshQueryCache flag is set.
- builders.NewTraceStoreFromConfig: Gets an interface to the trace data.
- dfbuilder.NewDataFrameBuilderFromTraceStore: Creates a utility for building data frames from traces, which is likely used to derive the ParamSet.
- psrefresh.NewDefaultParamSetRefresher: Initializes a component specifically designed to refresh ParamSets. It uses the DataFrameBuilder to scan trace data and determine the current set of unique parameter key-value pairs.
- psRefresher.Start(time.Hour): Starts a goroutine to refresh the primary ParamSet (perhaps stored directly in the database or an in-memory representation) hourly.
- builders.GetCacheFromConfig: If a distributed cache like Redis is configured, this obtains a client for it.
- psrefresh.NewCachedParamSetRefresher: Wraps the primary psRefresher with a caching layer.
- cacheParamSetRefresher.StartRefreshRoutine(redisCacheRefreshPeriod): Starts another goroutine that takes the ParamSet generated by psRefresher and populates the external cache (e.g., Redis) at redisCacheRefreshPeriod intervals (e.g., every 4 hours). This provides a faster lookup path for frequently accessed ParamSet data.
Workflow:
Trace Data --> DataFrameBuilder --> ParamSetRefresher (generates primary ParamSet)
|
v
CachedParamSetRefresher --> External Cache (e.g., Redis)
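The following sketch illustrates the two-tier refresh with plain time.Ticker loops; rebuildParamSet and publishToCache are hypothetical stand-ins for the DataFrameBuilder scan and the Redis client, and the real psrefresh types and constructors differ.

```go
package psrefreshsketch

import (
	"context"
	"log"
	"sync"
	"time"
)

type paramSet map[string][]string

type refresher struct {
	mu      sync.Mutex
	current paramSet

	rebuildParamSet func(context.Context) (paramSet, error) // e.g. a TraceStore/DataFrameBuilder scan
	publishToCache  func(context.Context, paramSet) error   // e.g. writing to Redis
}

// Start launches the primary hourly rebuild and the slower external-cache
// population loop, mirroring psRefresher.Start and StartRefreshRoutine.
func (r *refresher) Start(ctx context.Context, primaryPeriod, cachePeriod time.Duration) {
	go r.loop(ctx, primaryPeriod, func() {
		ps, err := r.rebuildParamSet(ctx)
		if err != nil {
			log.Printf("paramset rebuild failed: %v", err)
			return
		}
		r.mu.Lock()
		r.current = ps
		r.mu.Unlock()
	})
	go r.loop(ctx, cachePeriod, func() {
		r.mu.Lock()
		ps := r.current
		r.mu.Unlock()
		if ps == nil {
			return // nothing rebuilt yet
		}
		if err := r.publishToCache(ctx, ps); err != nil {
			log.Printf("cache publish failed: %v", err)
		}
	})
}

// loop runs step on every tick until the context is cancelled.
func (r *refresher) loop(ctx context.Context, period time.Duration, step func()) {
	ticker := time.NewTicker(period)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			step()
		}
	}
}
```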
6. Old Data Deletion (deletion/deleter.go, maintenance.go)
Why: Over time, Perf accumulates a large amount of data, including regression information and associated shortcuts (which are often links or identifiers for specific data views). To manage storage costs and maintain system performance, very old data that is unlikely to be accessed needs to be periodically deleted.
How:
- Runs only when the flags.DeleteShortcutsAndRegressions flag is set.
- deletion.New(db, ...): Initializes a Deleter object. This object encapsulates the logic for identifying and removing outdated regressions and shortcuts. It takes a database connection pool (db) and the datastore type. Internally, it creates instances of sqlregressionstore and sqlshortcutstore to interact with the respective database tables.
- deleter.RunPeriodicDeletion(deletionPeriod, deletionBatchSize): This method in maintenance.go calls the RunPeriodicDeletion method on the Deleter instance. In deleter.go, RunPeriodicDeletion starts a goroutine that fires every deletionPeriod (e.g., every 15 minutes) and calls d.DeleteOneBatch(deletionBatchSize).
- Deleter.DeleteOneBatch(shortcutBatchSize): Calls d.getBatch(ctx, shortcutBatchSize) to identify a batch of regressions and shortcuts eligible for deletion.
- Deleter.getBatch(...):
  - Scans the Regressions table for ranges of commits, starting from the oldest.
  - Examines each regression's Low and High StepPoints. If a StepPoint's timestamp is older than the defined ttl (Time-To-Live, currently -18 months), the associated shortcut and the commit number of the regression are marked for deletion.
  - Stops once the batch reaches shortcutBatchSize, then calls d.deleteBatch(ctx, commitNumbers, shortcuts) to perform the actual deletion.
- Deleter.deleteBatch(...):
  - Iterates over commitNumbers and calls d.regressionStore.DeleteByCommit() for each, removing the regression data associated with that commit.
  - Iterates over shortcuts and calls d.shortcutStore.DeleteShortcut() for each, removing the shortcut entry.
Deletion Workflow:
Timer (every deletionPeriod) --> DeleteOneBatch
|
v
getBatch (identifies old data based on TTL)
|
| Returns (commit_numbers_to_delete, shortcut_ids_to_delete)
v
deleteBatch (deletes in a transaction)
|
+--> RegressionStore.DeleteByCommit
+--> ShortcutStore.DeleteShortcut
The ttl variable in deleter.go is set to -18 months, meaning regressions and their associated shortcuts older than 1.5 years are targeted for deletion. This value was determined based on stakeholder requirements for data retention.
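A condensed sketch of this periodic, batched deletion loop is shown below; the store interfaces are simplified and getBatch is left as a placeholder for the TTL scan, so this illustrates the shape of the code rather than the real Deleter.

```go
package deletionsketch

import (
	"context"
	"log"
	"time"
)

type regressionStore interface {
	DeleteByCommit(ctx context.Context, commitNumber int64) error
}

type shortcutStore interface {
	DeleteShortcut(ctx context.Context, id string) error
}

type deleter struct {
	regressions regressionStore
	shortcuts   shortcutStore
	ttl         time.Duration // roughly 18 months in the real implementation
}

// getBatch stands in for the TTL scan over the Regressions table; it would
// return up to batchSize expired commit numbers and shortcut IDs.
func (d *deleter) getBatch(ctx context.Context, batchSize int) ([]int64, []string) {
	return nil, nil
}

// deleteOneBatch identifies one batch of expired entries and removes them.
func (d *deleter) deleteOneBatch(ctx context.Context, batchSize int) {
	commitNumbers, shortcutIDs := d.getBatch(ctx, batchSize)
	for _, c := range commitNumbers {
		if err := d.regressions.DeleteByCommit(ctx, c); err != nil {
			log.Printf("delete regression for commit %d: %v", c, err)
		}
	}
	for _, s := range shortcutIDs {
		if err := d.shortcuts.DeleteShortcut(ctx, s); err != nil {
			log.Printf("delete shortcut %q: %v", s, err)
		}
	}
}

// runPeriodicDeletion drives deleteOneBatch on a fixed period in a goroutine.
func (d *deleter) runPeriodicDeletion(ctx context.Context, period time.Duration, batchSize int) {
	go func() {
		ticker := time.NewTicker(period)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				d.deleteOneBatch(ctx, batchSize)
			}
		}
	}()
}
```

Batching keeps each pass cheap: an instance with years of history is cleaned up incrementally instead of in one large, database-straining transaction.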
The select {} at the end of the Start function in maintenance.go is a common Go idiom to make the main goroutine (the one that called Start) block indefinitely. Since all the actual work is done in background goroutines launched by Start, this prevents the Start function from returning and thus keeps the maintenance processes alive.
The notify module in Perf is responsible for handling notifications related to performance regressions. It provides a flexible framework for formatting and sending notifications through various channels like email, issue trackers, or custom endpoints like Chromeperf.
Core Concepts and Design:
The notification system is built around a few key abstractions:
Notifier Interface (notify.go): This is the central interface for sending notifications. It defines methods for:
- RegressionFound: Called when a new regression is detected.
- RegressionMissing: Called when a previously detected regression is no longer found (e.g., due to new data or fixes).
- ExampleSend: Used for sending test/dummy notifications to verify configuration.
- UpdateNotification: For updating an existing notification (e.g., adding a comment to an issue).
Formatter Interface (notify.go): This interface is responsible for constructing the content (body and subject) of a notification. Implementations exist for:
- HTMLFormatter (html.go): Generates HTML-formatted notifications, suitable for email.
- MarkdownFormatter (markdown.go): Generates Markdown-formatted notifications, suitable for issue trackers or other systems that support Markdown.
The formatters use Go's text/template package, allowing for customizable notification messages. Templates can access a TemplateContext (or AndroidBugTemplateContext for Android-specific notifications) which provides data about the regression, commit, alert, etc.
Transport Interface (notify.go): This interface defines how a formatted notification is actually sent. Implementations include:
- EmailTransport (email.go): Sends notifications via email using the emailclient module.
- IssueTrackerTransport (issuetracker.go): Interacts with an issue tracking system (configured for Google's Issue Tracker/Buganizer) to create or update issues. It uses the go/issuetracker/v1 client and requires an API key for authentication.
- NoopTransport (noop.go): A “do nothing” implementation, useful for disabling notifications or for testing.
NotificationDataProvider Interface (notification_provider.go): This interface is responsible for gathering the necessary data to populate the notification templates.
- defaultNotificationDataProvider uses a Formatter to generate the notification body and subject based on RegressionMetadata.
- androidNotificationProvider (android_notification_provider.go) is a specialized provider for Android-specific bug reporting. It uses its own AndroidBugTemplateContext which includes Android-specific details like Build ID diff URLs. It leverages the MarkdownFormatter for content generation but with Android-specific templates.
Workflow for Sending a Notification (Simplified):
1. A regression is detected (e.g., by the alerter module).
2. The Notifier's RegressionFound method is called with details about the regression (commit, alert configuration, cluster summary, etc.).
3. The Notifier (typically defaultNotifier) uses its NotificationDataProvider to get the raw notification data (body and subject).
4. The NotificationDataProvider populates a context object (e.g., TemplateContext or AndroidBugTemplateContext) and uses a Formatter (e.g., MarkdownFormatter) to execute the appropriate template with this context, producing the final body and subject.
5. The Notifier then calls its Transport's SendNewRegression method, passing the formatted body and subject.
6. The Transport implementation handles the actual sending (e.g., makes an API call to the issue tracker or sends an email).
Regression Detected --> Notifier.RegressionFound(...)
|
v
NotificationDataProvider.GetNotificationDataRegressionFound(...)
|
| (Populates Context, e.g., TemplateContext)
v
Formatter.FormatNewRegressionWithContext(...)
| (Uses Go templates)
v
Formatted Body & Subject
|
v
Transport.SendNewRegression(body, subject)
|
+------------------> EmailTransport --> Email Server
|
+------------------> IssueTrackerTransport --> Issue Tracker API
|
+------------------> NoopTransport --> (Does nothing)
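A condensed sketch of how a Notifier composes a Formatter and a Transport is shown below; the interface and method signatures are simplified and hypothetical, not the exact ones in notify.go.

```go
package notifysketch

import "context"

// regressionMetadata is a simplified stand-in for RegressionMetadata.
type regressionMetadata struct {
	CommitHash string
	AlertName  string
}

// formatter builds the notification content.
type formatter interface {
	FormatNewRegression(ctx context.Context, md regressionMetadata) (body, subject string, err error)
}

// transport delivers the formatted content.
type transport interface {
	SendNewRegression(ctx context.Context, body, subject string) (threadID string, err error)
}

// defaultNotifier composes the two, as the real defaultNotifier composes a
// NotificationDataProvider, Formatter, and Transport.
type defaultNotifier struct {
	formatter formatter
	transport transport
}

// RegressionFound formats the regression details and hands them to the transport.
func (n *defaultNotifier) RegressionFound(ctx context.Context, md regressionMetadata) (string, error) {
	body, subject, err := n.formatter.FormatNewRegression(ctx, md)
	if err != nil {
		return "", err
	}
	return n.transport.SendNewRegression(ctx, body, subject)
}
```

Because the pieces meet only at these interfaces, swapping email for an issue tracker, or a Markdown template for an HTML one, does not change the orchestration code.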
Key Files and Responsibilities:
notify.go:
Notifier, Formatter, Transport.defaultNotifier implementation, which orchestrates the notification process by composing a NotificationDataProvider, Formatter, and Transport.New() factory function that constructs the appropriate Notifier based on the NotifyConfig. This is the main entry point for creating a notifier.TemplateContext used by generic formatters.getRegressionMetadata to fetch additional information like source file links from TraceStore if the alert is for an individual trace.notification_provider.go:
NotificationDataProvider interface.defaultNotificationDataProvider which uses a generic Formatter.Notifier or Transport mechanisms.android_notification_provider.go:
NotificationDataProvider specifically for Android bug creation.AndroidBugTemplateContext to provide Android-specific data to templates, such as GetBuildIdUrlDiff for generating links to compare Android build CLs.MarkdownFormatter but configures it with Android-specific notification templates defined in the NotifyConfig. This allows Android teams to customize their bug reports.markdown.go & html.go:
Formatter interface for Markdown and HTML respectively.MarkdownFormatter can be configured with custom templates via NotifyConfig. It also provides a buildIDFromSubject template function, specifically designed for Android's commit message format, to extract build IDs.viewOnDashboard is a utility function to construct a URL to the Perf explore page for the given regression.email.go & issuetracker.go & noop.go:
Transport interface.email.go: Uses emailclient to send emails. Splits comma/space-separated recipient lists.issuetracker.go: Interacts with the Google Issue Tracker API. It requires API key secrets (configured via NotifyConfig) and uses OAuth2 for authentication. It can create new issues and update existing ones (e.g., to mark them obsolete).noop.go: A null implementation for disabling notifications.chromeperfnotifier.go:
Notifier interface directly, without using the Formatter or Transport abstractions in the same way as defaultNotifier. This is because it communicates directly with the Chrome Performance Dashboard's Anomaly API.ReportRegression).isParamSetValid, getTestPath) to ensure the data conforms to Chromeperf's requirements (e.g., specific param keys like master, bot, benchmark, test).improvement_direction parameter and the step direction.commitrange.go:
URLFromCommitRange, a utility function to generate a URL for a commit or a range of commits. If a commitRangeURLTemplate is provided (e.g., via configuration), it will be used to create a URL showing the diff between two commits. Otherwise, it defaults to the individual commit's URL. This is used by formatters to create links in notifications.common/notificationData.go:
NotificationData (simple struct for body and subject) and RegressionMetadata (a comprehensive struct holding all relevant information about a regression needed for notification generation). This promotes a common data structure for passing regression details.Configuration and Customization (NotifyConfig):
The behavior of the notify module is heavily influenced by config.NotifyConfig. This configuration allows users to:
- Select the notification type (Notifications field): None, HTMLEmail, MarkdownIssueTracker, ChromeperfAlerting, AnomalyGrouper.
- Select the NotificationDataProvider: DefaultNotificationProvider or AndroidNotificationProvider.
- Override the notification templates (Subject, Body, MissingSubject, MissingBody). This is particularly relevant for MarkdownFormatter and androidNotificationProvider.
- Configure the IssueTrackerTransport (API key secret locations).
This design allows for flexibility in how notifications are generated and delivered, catering to different needs and integrations. For instance, the Android team can have highly customized bug reports, while other users might prefer standard email notifications. The ChromeperfNotifier demonstrates a direct integration with another system, bypassing some of the general-purpose formatting/transport layers when a specific API is targeted.
The notifytypes module in Perf defines the various types of notification mechanisms that can be triggered in response to performance regressions or other significant events. It also defines types for data providers that supply the necessary information for these notifications. This module serves as a central point for enumerating and categorizing notification strategies, enabling flexible and extensible notification handling within the Perf system.
The primary goal of this module is to provide a structured and type-safe way to manage notification types.
- Extensibility: because notification kinds are expressed as a string-backed Type, new notification methods can be easily added in the future without requiring significant code changes in consuming modules. This promotes loose coupling and allows the notification system to evolve independently.
- Type safety: using named constants (HTMLEmail, MarkdownIssueTracker) instead of raw strings makes the code more self-documenting and reduces the likelihood of errors due to typos.
- Separation of concerns: NotificationDataProviderType allows for different sources or formats of data to be used for generating notifications, separating the concern of what data is needed from how the notification is delivered. This is crucial, for example, when different platforms (like Android) might require specific data formatting or additional information.
Key definitions:
- Type (string alias): Type is defined as an alias for string. This allows for string-based storage and transmission of notification types (e.g., in configuration files or database entries) while still providing a degree of type safety within Go code. Constants enumerate the valid values of Type, ensuring that only valid, predefined notification types can be used:
  - HTMLEmail: Indicates notifications sent as HTML-formatted emails. This is suitable for rich content and direct user communication.
  - MarkdownIssueTracker: Represents notifications formatted in Markdown, intended for integration with issue tracking systems. This facilitates automated ticket creation or updates.
  - ChromeperfAlerting: Specifies that regression data should be sent to the Chromeperf alerting system. This allows for integration with a specialized alerting infrastructure.
  - AnomalyGrouper: Designates that regressions should be processed by an anomaly grouping logic, which then determines the appropriate action. This enables more sophisticated handling of multiple related anomalies.
  - None: A special type indicating that no notification should be sent. This is useful for disabling notifications in certain contexts or for configurations where alerting is not desired.
- AllNotifierTypes slice: This public variable provides a convenient way for other parts of the system to iterate over or validate against all known notification types.
- NotificationDataProviderType (string alias): Similar to Type, this defines the kind of data provider to use for notifications.
  - DefaultNotificationProvider: Represents the standard or default data provider.
  - AndroidNotificationProvider: Indicates a specialized data provider tailored for Android-specific notification requirements. This might involve fetching different metrics, formatting data in a particular way, or including Android-specific metadata.
- notifytypes.go: This is the sole file in the module and contains all the definitions. It enumerates the notification types (HTMLEmail, MarkdownIssueTracker, ChromeperfAlerting, AnomalyGrouper, None), acting as a contract for other modules that implement or consume notification functionalities, and the data provider types (DefaultNotificationProvider, AndroidNotificationProvider) that can be used to source information for notifications. The AllNotifierTypes variable makes it easy for other components to get a list of all valid notification types, for example, for display in a UI or for validation purposes.
While this module itself doesn't implement workflows, it underpins them. A typical conceptual workflow where these types would be used is:
1. Regression Event --> Configuration Lookup: the alert's configuration specifies a notifytypes.Type (e.g., HTMLEmail).
2. Configuration Lookup --> Notification System: based on the notifytypes.Type from the configuration, the appropriate notifier implementation is selected.
3. Notification System --> Data Provider: based on the notifytypes.NotificationDataProviderType, the corresponding data provider is chosen (e.g., AndroidNotificationProvider).
4. Data Provider --> Notification Delivered (e.g., Email Sent).
For example, if a regression is detected for an Android benchmark and the configuration specifies HTMLEmail as the Type and AndroidNotificationProvider as the NotificationDataProviderType:
Regression Event -> Config: {Type: HTMLEmail, DataProvider: AndroidNotificationProvider} -> Select EmailNotifier -> Select AndroidDataProvider -> AndroidDataProvider fetches data -> EmailNotifier formats and sends HTML email
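A minimal sketch of the string-alias pattern is shown below; the constant string values are illustrative and may not match the actual values in notifytypes.go.

```go
package notifytypessketch

// Type is a string alias so values can be stored in JSON configs and database
// rows while still getting compile-time checking in Go code.
type Type string

const (
	HTMLEmail            Type = "html_email"
	MarkdownIssueTracker Type = "markdown_issuetracker"
	ChromeperfAlerting   Type = "chromeperf_alerting"
	AnomalyGrouper       Type = "anomaly_grouper"
	None                 Type = "none"
)

// AllNotifierTypes lets callers validate configuration values or build UI menus.
var AllNotifierTypes = []Type{HTMLEmail, MarkdownIssueTracker, ChromeperfAlerting, AnomalyGrouper, None}

// IsValid reports whether t is one of the known notification types.
func IsValid(t Type) bool {
	for _, known := range AllNotifierTypes {
		if t == known {
			return true
		}
	}
	return false
}
```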
The perf-tool module provides a command-line interface (CLI) for interacting with various aspects of the Perf performance monitoring system. It allows developers and administrators to manage configurations, inspect data, perform database maintenance tasks, and validate ingestion files.
The primary motivation behind perf-tool is to offer a centralized and scriptable way to perform common Perf operations that would otherwise require manual intervention or direct database interaction. This simplifies workflows and enables automation of routine tasks.
The core functionality is organized into subcommands, each addressing a specific area of Perf:
- config: Manages Perf instance configurations.
  - create-pubsub-topics-and-subscriptions: Sets up the necessary Google Cloud Pub/Sub topics and subscriptions required for data ingestion. This is crucial for ensuring that Perf instances can receive and process performance data.
  - validate: Checks the syntax and validity of a Perf instance configuration file. This helps prevent deployment of misconfigured instances.
- tiles: Interacts with the tiled data storage used by Perf's tracestore. Tiles are segments of time-series data.
  - last: Displays the index of the most recent tile, providing insight into the current state of data ingestion.
  - list: Shows a list of recent tiles and the number of traces they contain, useful for understanding data volume and distribution.
- traces: Allows querying and exporting trace data.
  - list: Retrieves and displays the IDs of traces that match a given query within a specific tile. This is useful for ad-hoc data exploration.
  - export: Exports trace data matching a query and commit range to a JSON file. This enables external analysis or data migration.
- ingest: Manages the data ingestion process.
  - force-reingest: Triggers the re-ingestion of data files from Google Cloud Storage (GCS) for a specified time range. This is useful for reprocessing data after configuration changes or to fix ingestion errors.
  - validate: Validates the format and content of an ingestion file against the expected schema and parsing rules. This helps ensure data quality before ingestion.
- database: Provides tools for backing up and restoring Perf database components. This is critical for disaster recovery and data migration.
  - backup:
    - alerts: Backs up alert configurations to a zip file.
    - shortcuts: Backs up saved shortcut configurations to a zip file.
    - regressions: Backs up regression data (detected performance changes) and associated shortcuts to a zip file. It backs up data up to a specified date (defaulting to four weeks ago). The process involves iterating backward through commits in batches, fetching regressions for each commit range, and storing them along with any shortcuts referenced in those regressions.
  - restore:
    - alerts: Restores alert configurations from a backup file.
    - shortcuts: Restores shortcut configurations from a backup file.
    - regressions: Restores regression data and their associated shortcuts from a backup file. Restoring regressions also attempts to re-create the associated shortcuts.
- trybot: Contains experimental functionality related to trybot (pre-submit testing) data.
  - reference: Generates a synthetic nanobench reference file. This file is constructed by loading a specified trybot results file, identifying all trace IDs within it, and then fetching historical sample data for these traces from the main Perf instance (specifically, from the last N ingested files). The aggregated historical samples are then formatted into a new nanobench JSON file. This allows for comparing trybot results against a baseline derived from recent production data using tools like nanostat.
- markdown: Generates Markdown documentation for the perf-tool CLI itself.
The main.go file sets up the CLI application using the urfave/cli library. It defines flags, commands, and subcommands, and maps them to corresponding functions in the application package. It handles flag parsing, configuration loading (from a file, with optional connection string overrides), and initialization of logging.
The application/application.go file defines the Application interface and its concrete implementation app. This interface abstracts the core logic for each command, promoting testability and separation of concerns. The app struct implements methods that interact with various Perf components like tracestore, alertStore, shortcutStore, regressionStore, and GCS.
Key design choices include:
- Abstraction via the Application interface: this allows for mocking the application logic during testing (as seen in main_test.go and application/mocks/Application.go), ensuring that the CLI command parsing and flag handling can be tested independently of the actual backend operations.
- Configuration-driven: behavior is driven by an instance configuration file (--config_filename), which defines data store connections, GCS sources, etc. This makes the tool adaptable to different Perf deployments.
- Centralized component construction: builders from perf/go/builders are used to instantiate components like TraceStore, AlertStore, etc., based on the provided instance configuration. This centralizes component creation logic.
- Simple backup format: backups are serialized with encoding/gob. This provides a simple and portable backup solution.
- Batched processing: regressions are backed up in batches (regressionBatchSize) to manage memory and avoid overwhelming the database.
- Reuse of the ingestion pipeline: the ingest force-reingest command leverages Pub/Sub by publishing messages that mimic GCS notifications, effectively triggering the standard ingestion pipeline.
The application/mocks/Application.go file contains a mock implementation of the Application interface, generated by the mockery tool. This is used in main_test.go to test the command-line argument parsing and dispatch logic without actually performing the underlying operations.
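To make the subcommand layout concrete, here is a small sketch using the urfave/cli v2 API; the real perf-tool defines many more commands and flags and wires each Action to the Application interface.

```go
package main

import (
	"fmt"
	"log"
	"os"

	"github.com/urfave/cli/v2"
)

func main() {
	app := &cli.App{
		Name:  "perf-tool-sketch",
		Usage: "illustrative subcommand layout only",
		Commands: []*cli.Command{
			{
				Name:  "config",
				Usage: "manage instance configurations",
				Subcommands: []*cli.Command{
					{
						Name:  "validate",
						Usage: "check an instance configuration file",
						Flags: []cli.Flag{
							&cli.StringFlag{Name: "config_filename", Required: true},
						},
						Action: func(c *cli.Context) error {
							// The real tool would call into the Application interface here.
							fmt.Println("validating", c.String("config_filename"))
							return nil
						},
					},
				},
			},
		},
	}
	if err := app.Run(os.Args); err != nil {
		log.Fatal(err)
	}
}
```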
The perfclient module provides an interface for sending performance data to Skia Perf's ingestion system. The primary goal of this module is to abstract the complexities of interacting with Google Cloud Storage (GCS), which is the underlying mechanism Perf uses for data ingestion. By providing a dedicated client, it simplifies the process for other applications and services that need to report performance metrics.
The core design centers around a ClientInterface and its concrete implementation, Client. This approach allows for easy mocking and testing, promoting loose coupling between the perfclient and its consumers.
Key Components and Responsibilities:
perf_client.go:
- ClientInterface: This interface defines the contract for pushing performance data. The key method is PushToPerf. The decision to use an interface here is crucial for testability and dependency injection. It allows consumers to use a real GCS-backed client in production and a mock client in tests.
- Client: This struct is the concrete implementation of ClientInterface. It holds a gcs.GCSClient instance, which is responsible for the actual communication with Google Cloud Storage, and a basePath string that specifies the root directory within the GCS bucket where performance data will be stored. The constructor New takes these as arguments, allowing users to configure the GCS bucket and the top-level folder for their data.
- PushToPerf method: This is the workhorse of the module.
  - It accepts a time.Time object (now), a folderName, a filePrefix, and a format.BenchData struct (which represents the performance metrics).
  - The format.BenchData is first marshaled into a JSON string. This is the standard format Perf expects for ingestion.
  - The JSON is then compressed with gzip. This is a performance optimization, as GCS can automatically decompress gzipped files with the correct ContentEncoding header, reducing storage costs and transfer times.
  - The destination path is built by the objectPath helper function. This path incorporates the basePath, the current timestamp (formatted as YYYY/MM/DD/HH/), the folderName, and a filename composed of the filePrefix, an MD5 hash of the JSON data, and a millisecond-precision timestamp. The inclusion of the MD5 hash helps in avoiding duplicate uploads of identical data and can be useful for debugging or data verification. The timestamp in the path and filename ensures that data from different runs or times are stored separately and can be easily queried.
  - The compressed data is uploaded via the storageClient.SetFileContents method. Crucially, it sets ContentEncoding: "gzip" and ContentType: "application/json" in the gcs.FileWriteOptions. This metadata informs GCS about the compression and data type, enabling features like automatic decompression.
- objectPath function: This helper function is responsible for constructing the unique GCS path for each performance data file. The rationale for this specific path structure (basePath/YYYY/MM/DD/HH/folderName/filePrefix_hash_timestamp.json) is to organize data chronologically and by task, making it easier to browse, query, and manage within GCS. The hash ensures uniqueness and integrity. A simplified sketch of the encoding and path logic follows below.
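Below is a simplified sketch of the encoding and path-construction steps, assuming the path layout described above; the helper names and exact separator/formatting details are illustrative rather than copied from perf_client.go.

```go
package perfclientsketch

import (
	"bytes"
	"compress/gzip"
	"crypto/md5"
	"encoding/json"
	"fmt"
	"time"
)

// objectPath builds basePath/YYYY/MM/DD/HH/folderName/filePrefix_<md5>_<millis>.json.
func objectPath(basePath string, now time.Time, folderName, filePrefix string, jsonBody []byte) string {
	hash := fmt.Sprintf("%x", md5.Sum(jsonBody))
	return fmt.Sprintf("%s/%s/%s/%s_%s_%d.json",
		basePath, now.UTC().Format("2006/01/02/15"), folderName, filePrefix, hash, now.UnixMilli())
}

// encode marshals the benchmark data to JSON and gzips it, mirroring what
// PushToPerf does before calling SetFileContents with ContentEncoding: "gzip".
func encode(data interface{}) (jsonBody, gzipped []byte, err error) {
	jsonBody, err = json.Marshal(data)
	if err != nil {
		return nil, nil, err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(jsonBody); err != nil {
		return nil, nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, nil, err
	}
	return jsonBody, buf.Bytes(), nil
}
```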
mock_perf_client.go:
- MockPerfClient: This provides a mock implementation of ClientInterface using the testify/mock library. This is essential for unit testing components that depend on perfclient without requiring actual GCS interaction. It allows developers to define expected calls to PushToPerf and verify that their code interacts with the client correctly. The NewMockPerfClient constructor returns a pointer to ensure that the methods provided by mock.Mock (like On and AssertExpectations) are accessible.
Workflow: Pushing Performance Data
The primary workflow involves a client application using perfclient to send performance data:
Client App perfclient.Client gcs.GCSClient
| | |
| -- Call PushToPerf(now, | |
| folder, prefix, data) ->| |
| | -- Marshal data to JSON |
| | -- Compress JSON (gzip) |
| | -- Construct GCS objectPath |
| | (includes time, folder, |
| | prefix, data hash) |
| | |
| | -- Call SetFileContents(path, |
| | options, compressed_data) -> |
| | | -- Upload to GCS
| | | with gzip encoding
| | | and JSON content type
| | <-------------------------------| -- Return success/error
| <--------------------------| |
| -- Receive success/error | |
The design emphasizes creating a clear separation of concerns: the perfclient handles the formatting, compression, and path generation logic specific to Perf's ingestion requirements, while the underlying gcs.GCSClient handles the raw GCS communication. This makes the perfclient a focused and reusable component for any system needing to integrate with Skia Perf.
The perfresults module is responsible for fetching, parsing, and processing performance results data generated by Telemetry-based benchmarks in the Chromium project. This data typically resides in perf_results.json files. The module provides functionalities to:
- Fetch perf_results.json files.
- Parse perf_results.json files. These files contain sets of histograms, where each histogram represents a specific benchmark measurement. The parser extracts these histograms and associated metadata.
The primary goal is to provide a reliable and efficient way to access and utilize Chromium's performance data for analysis and monitoring.
The process of loading performance results from a Buildbucket build involves several steps:
Buildbucket ID -> BuildInfo -> Swarming Task ID -> Child Swarming Task IDs -> CAS Outputs -> PerfResults
Buildbucket Interaction (buildbucket.go):
bbClient interacts with the Buildbucket PRPC API to fetch build details using a given buildID. It specifically requests fields like builder, status, infra.backend.task.id (for the Swarming task ID), output.properties (for git revision information), and input.properties (for perf_dashboard_machine_group).BuildInfo struct is populated with this information, providing a consolidated view of the build's context. The GetPosition() method on BuildInfo is crucial as it determines the commit identifier (either commit position or git hash) used for associating the performance data with a specific point in the codebase.Swarming Interaction (swarming.go):
swarmingClient uses the Swarming PRPC API.findChildTaskIds: Given a parent Swarming task ID (obtained from BuildInfo), this function lists all child tasks by querying for tasks with a matching parent_task_id tag. The query is scoped by the parent task's creation and completion timestamps to narrow down the search.findTaskCASOutputs: For each child task ID, this function retrieves the task result, specifically looking for the CasOutputRoot. This reference points to the RBE-CAS location where the task's output files (including perf_results.json) are stored.RBE-CAS Interaction (rbecas.go):
perf_results.json files are stored in RBE-CAS. RBE-CAS provides efficient and reliable storage for large build artifacts.RBEPerfLoader uses the RBE SDK to interact with CAS.fetchPerfDigests: Given a CAS reference (pointing to the root directory of a task's output), this function:Directory proto.GetDirectoryTree.perf_results.json. The path structure is expected to be benchmark_name/perf_results.json, allowing association of results with a specific benchmark.loadPerfResult: Given a digest for a perf_results.json file, this reads the blob from CAS and parses it using NewResults.LoadPerfResults: This orchestrates the loading for multiple CAS references (from multiple child Swarming tasks). It iterates through each CAS reference, fetches the digests of perf_results.json files, loads each file, and then merges results from the same benchmark. Merging is important because a single benchmark might have its results split across multiple files or tasks.Orchestration (perf_loader.go):
The loader.LoadPerfResults method coordinates the entire workflow:
- Uses bbClient to get BuildInfo.
- Uses swarmingClient to find child task IDs and then their CAS outputs.
- Checks (checkCasInstances) that all CAS outputs come from the same RBE instance, simplifying client initialization.
- Creates an RBEPerfLoader (via rbeProvider for testability) for the determined CAS instance.
- Calls RBEPerfLoader.LoadPerfResults with the list of CAS references to fetch and parse all perf_results.json files.
The rbeProvider is a good example of dependency injection, allowing tests to mock the RBE-CAS interaction.
Parsing perf_results.json (perf_results_parser.go)
perf_results.json files have a specific, somewhat complex structure. A dedicated parser is needed to extract meaningful data (histograms and their metadata).
- The PerfResults struct is the main container, holding a map of TraceKey to Histogram.
- TraceKey uniquely identifies a trace, composed of ChartName (metric name), Unit, Story (user journey/test case), Architecture, and OSName. These fields are extracted from the histogram's own properties and its associated "diagnostics", which are references to other metadata objects within the JSON file.
- Histogram stores the SampleValues (the actual measurements).
- NewResults uses json.NewDecoder to process the input io.Reader in a streaming fashion. perf_results.json files can be very large (10MB+); reading the entire file into memory before parsing would be inefficient and could lead to high memory usage. Streaming allows processing the JSON array element by element:
  - It reads the opening [ of the JSON array.
  - While decoder.More() is true, it decodes each element into a singleEntry struct. singleEntry is a union-like struct that can hold the different types of objects found in the JSON (histograms, generic sets, date ranges, related name maps); the type is determined by checking fields like Name (present for histograms) or Type.
  - If the entry is a histogram (entry.Name != ""), it's converted to TraceKey and Histogram via histogramRaw.asTraceKeyAndHistogram. This conversion involves looking up GUIDs from the histogram's Diagnostics map in a locally maintained metadata map (md).
  - Other metadata objects (GenericSet, DateRange, RelatedNameMap) are stored in the md map, keyed by their GUID, so they can be referenced by histograms later in the stream.
  - Parsed histograms are added to pr.Histograms. If a TraceKey already exists, sample values are appended.
  - It reads the closing ] of the JSON array.
- The Histogram type provides methods for common aggregations (Min, Max, Mean, Stddev, Sum, Count). AggregationMapping provides a convenient way to access these aggregation functions by string keys, which is used by downstream consumers like the ingestion module.
- UnmarshalJSON: An UnmarshalJSON method exists, which reads the entire byte slice into memory. This is less efficient and marked for deprecation in favor of NewResults.
A minimal sketch of the streaming pattern appears below.
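The sketch below shows the streaming pattern in isolation, using a much simpler entry type than the real singleEntry; it only demonstrates how json.Decoder walks a large top-level array element by element.

```go
package perfresultssketch

import (
	"encoding/json"
	"io"
)

// entry is a hypothetical, heavily simplified stand-in for singleEntry.
type entry struct {
	Name         string    `json:"name,omitempty"` // present for histograms
	GUID         string    `json:"guid,omitempty"`
	SampleValues []float64 `json:"sampleValues,omitempty"`
}

// parseStream decodes a top-level JSON array one element at a time so the
// whole file never has to fit in memory at once.
func parseStream(r io.Reader) ([]entry, error) {
	dec := json.NewDecoder(r)
	// Consume the opening '[' of the array.
	if _, err := dec.Token(); err != nil {
		return nil, err
	}
	var out []entry
	// Decode element by element while more array items remain.
	for dec.More() {
		var e entry
		if err := dec.Decode(&e); err != nil {
			return nil, err
		}
		out = append(out, e)
	}
	// Consume the closing ']'.
	if _, err := dec.Token(); err != nil && err != io.EOF {
		return nil, err
	}
	return out, nil
}
```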
Ingestion Conversion (ingest/)
This submodule focuses on transforming the parsed PerfResults into the format.Format structure required by the Perf ingestion system.
json.go (ConvertPerfResultsFormat):
PerfResults structure is not directly ingestible. It needs to be reshaped.(TraceKey, Histogram) pair in the input PerfResults.format.Result. The Key map within format.Result is populated from TraceKey fields (chart, unit, story, arch, os).Measurements map within format.Result is populated by calling toMeasurement on the Histogram.toMeasurement iterates through perfresults.AggregationMapping, applying each aggregation function to the histogram's samples. Each resulting aggregation (e.g., “max”, “mean”) becomes a format.SingleMeasurement with the aggregation type as its Value and the computed metric as its Measurement.format.Format object includes the version, commit hash (GitHash), and any provided headers and links.gcs.go:
convertPath: Constructs a GCS path like gs://<bucket>/ingest/<time_path>/<build_info_path>/<benchmark>.convertTime: Formats a time.Time into YYYY/MM/DD/HH (UTC).convertBuildInfo: Formats BuildInfo into <MachineGroup>/<BuilderName>. It defaults MachineGroup to “ChromiumPerf” and BuilderName to “BuilderNone” if they are empty.isInternal: Determines if the results are internal or public based on the BuilderName. It checks against a list of known external bot configurations (pinpoint/go/bot_configs). If not found, it defaults to internal. This determines whether PublicBucket (chrome-perf-public) or InternalBucket (chrome-perf-non-public) is used.perf_loader.go: Orchestrates the loading of performance results from Buildbucket. NewLoader().LoadPerfResults() is the main entry point.buildbucket.go: Handles interaction with the Buildbucket API to fetch build metadata. Defines BuildInfo.swarming.go: Handles interaction with the Swarming API to find child tasks and their CAS outputs.rbecas.go: Handles interaction with RBE-CAS to download and parse perf_results.json files. Defines RBEPerfLoader.perf_results_parser.go: Parses the content of perf_results.json files. Defines PerfResults, TraceKey, Histogram, and the streaming NewResults parser.ingest/json.go: Transforms parsed PerfResults into the format.Format structure for ingestion.ingest/gcs.go: Provides utilities to determine GCS paths for storing transformed results.cli/main.go: A command-line interface utility that uses the perfresults library to fetch results for a given Buildbucket ID and outputs them as JSON files in the ingestion format. This serves as a practical example and a tool for ad-hoc data retrieval.testdata/: Contains JSON files used for replaying HTTP and gRPC interactions during tests (*.json, *.rpc), and sample perf_results.json files for parser testing. replay_test.go sets up the replay mechanism.User/System --Buildbucket ID--> perf_loader.LoadPerfResults()
|
+--> buildbucket.findBuildInfo() --PRPC call--> Buildbucket API
| (Returns BuildInfo: Swarming Task ID, Git Revision, Machine Group, etc.)
|
+--> swarming.findChildTaskIds() --PRPC call--> Swarming API (using Parent Task ID)
| (Returns list of Child Swarming Task IDs)
|
+--> swarming.findTaskCASOutputs() --PRPC calls--> Swarming API (for each Child Task ID)
| (Returns list of CASReference objects)
|
(Error if CAS instances differ for CASReferences)
|
+--> rbecas.RBEPerfLoader.LoadPerfResults() (with list of CASReferences)
|
+--> For each CASReference:
| |
| +--> rbecas.fetchPerfDigests() --RBE SDK calls--> RBE-CAS
| | (Returns map of benchmark_name to digest of perf_results.json)
| |
| +--> For each (benchmark_name, digest):
| |
| +--> rbecas.loadPerfResult() --RBE SDK call (ReadBlob)--> RBE-CAS
| | |
| | +--> perf_results_parser.NewResults() (Parses JSON stream)
| | (Returns PerfResults object for this file)
| |
| +--> (Merge with existing PerfResults for the same benchmark_name)
|
(Returns map[benchmark_name]*PerfResults and BuildInfo)
CLI User --Build ID, Output Dir--> cli/main.main()
|
+--> perfresults.NewLoader().LoadPerfResults(Build ID)
| (Executes the Primary Workflow described above)
| (Returns BuildInfo, map[benchmark]*PerfResults)
|
+--> For each (benchmark, perfResult) in results:
|
+--> ingest.ConvertPerfResultsFormat(perfResult, buildInfo.GetPosition(), headers, links)
| (Transforms PerfResults to ingest.Format)
|
+--> Marshal ingest.Format to JSON
|
+--> Write JSON to output file: <outputDir>/<benchmark>_<BuildID>.json
|
+--> Print output filename to stdout
The workflows/worker/main.go file sets up a Temporal worker. Currently, it's a basic skeleton that initializes a worker and connects to a Temporal server. It doesn't register any specific activities or workflows from the perfresults module itself. Its presence suggests an intention to integrate perfresults functionalities into Temporal workflows in the future, possibly for automated ingestion or processing tasks. The worker itself is a generic Temporal worker setup.
The module employs a robust testing strategy:
_test.go file with unit tests for its specific logic. For example, perf_results_parser_test.go tests the JSON parsing, and buildbucket_test.go tests BuildInfo logic.replay_test.go, testdata/):cloud.google.com/go/httpreplay. Recorded interactions are stored as .json files in testdata/.cloud.google.com/go/rpcreplay. Recorded interactions are stored as gzipped .rpc files in testdata/.-record_path) controls whether tests run in replay mode (reading from testdata/) or record mode (writing new replay files to the specified path). This allows updating replay files when external APIs change or new test cases are needed.setupReplay() and newRBEReplay() in replay_test.go are helper functions that configure the HTTP client and RBE client for either recording or replaying.testdata/perftest/): Contains various perf_results.json files (e.g., full.json, empty.json, merged.json) to test different scenarios for the perf_results_parser.go. This ensures the parser correctly handles different valid and edge-case inputs.cli/main.go): The CLI itself serves as an integration test for the core loading and conversion logic. Its tests (perf_loader_test.go for example) often use the replay mechanism to test the end-to-end flow from Build ID to parsed PerfResults.This combination ensures both isolated unit correctness and reliable integration testing without external dependencies during typical test runs.
The perfserver module serves as the central executable for the Perf performance monitoring system. It consolidates various essential components into a single command-line tool, simplifying deployment and management. The primary goal is to provide a unified entry point for running the web UI, data ingestion processes, regression detection, and maintenance tasks. This approach avoids the complexity of managing multiple separate services and their configurations.
The module leverages the urfave/cli library to define and manage sub-commands, each corresponding to a distinct functional area of Perf. This design allows for clear separation of concerns while maintaining a single binary. Configuration for each sub-command is handled through flags, with the config package providing structured types for these flags.
Key components and their responsibilities:
main.go: This is the entry point of the perfserver executable.
Why: It orchestrates the initialization and execution of the different Perf sub-systems.
How: It defines a cli.App with several sub-commands:
frontend: This sub-command launches the main web user interface for Perf.
How: It initializes and runs the frontend component (from //perf/go/frontend). Configuration is passed via config.FrontendFlags. The frontend component itself handles serving HTTP requests and rendering the UI.
maintenance: This sub-command starts background maintenance tasks.
How: It initializes and runs the maintenance component (from //perf/go/maintenance). It first validates the instance configuration (using //perf/go/config/validate) and then starts the maintenance routines. Prometheus metrics are exposed for monitoring.
ingest: This sub-command runs the data ingestion process.
Why: To continuously import performance data from various sources (e.g., build artifacts, test results) and populate the central data store (TraceStore).
How: It initializes and runs the ingestion process logic (from //perf/go/ingest/process). Similar to maintenance, it validates the instance configuration. It supports parallel ingestion for improved throughput. Prometheus metrics are also exposed.
Data Ingestion Workflow:
Configured Sources --> [Ingest Process] --Parses/Validates--> [TraceStore]
                        Handles incoming files                Populates data
cluster: This sub-command runs the regression detection process.
Why: To automatically analyze incoming performance data against configured alerts and identify significant performance regressions.
How: Interestingly, this sub-command also utilizes the frontend.New and f.Serve() mechanism, similar to the frontend sub-command. This suggests that the regression detection logic might be tightly coupled with or exposed through the same underlying service framework as the main UI, potentially for sharing configuration or common infrastructure. It uses config.FrontendFlags but specifically for clustering-related settings (indicated by AsCliFlags(true)).
Regression Detection Workflow:
[TraceStore] --New Data--> [Cluster Process] --Applies Alert Rules--> [Alerts/Notifications]
      ^                          |
      |   Identifies Regressions |
      +--------------------------+
markdown: A utility sub-command to generate Markdown documentation for perfserver itself.
How: It uses the ToMarkdown() method provided by the urfave/cli library.
Logging: The Before hook in the cli.App configures sklog to output logs to standard output, ensuring that operational messages from any sub-command are visible.
Configuration Loading: For sub-commands like ingest and maintenance, instance configuration is loaded from a specified file (ConfigFilename flag) and validated using //perf/go/config/validate. The database connection string can be overridden via a command-line flag.
Metrics: The ingest and maintenance sub-commands initialize Prometheus metrics, allowing for monitoring of their operational health and performance.
The design emphasizes modularity by delegating the core logic of each function (UI, ingestion, clustering, maintenance) to dedicated packages (//perf/go/frontend, //perf/go/ingest/process, //perf/go/maintenance). perfserver acts as the conductor, parsing command-line arguments, loading appropriate configurations, and invoking the correct sub-system. This structure makes the overall Perf system more maintainable and easier to understand, as each component has a well-defined responsibility.
The /go/pinpoint module provides a Go client for interacting with the Pinpoint service, which is part of Chromeperf. Pinpoint is a performance testing and analysis tool used to identify performance regressions and improvements. This client enables other Go applications within the Skia infrastructure to programmatically trigger Pinpoint jobs.
Core Functionality:
The primary purpose of this module is to abstract the complexities of making HTTP requests to the Pinpoint API. It handles authentication, request formatting, and response parsing. This allows other services to easily initiate two main types of Pinpoint jobs:
- Bisect jobs, which are submitted to the pinpointURL endpoint.
- Legacy try jobs (A/B tests), which are submitted to pinpointLegacyURL.
Design Decisions and Implementation Choices:
pinpointURL) and legacy try jobs (pinpointLegacyURL). The client reflects this by having separate methods (CreateBisect and CreateTryJob) and corresponding request URL builder functions (buildBisectRequestURL and buildTryJobRequestURL). This design choice directly maps to the underlying Pinpoint API structure, making it clear which type of job is being created.buildBisectRequestURL and buildTryJobRequestURL functions are responsible for constructing these URLs by populating url.Values and then encoding them. This is a direct consequence of how the Pinpoint API is designed.google.DefaultTokenSource) with the auth.ScopeUserinfoEmail scope. This is a standard approach for service-to-service authentication within the Google Cloud ecosystem, ensuring secure communication with the Pinpoint API.go/metrics2 to track the number of times bisect and try jobs are called and the number of times these calls fail. This is crucial for monitoring the reliability and usage of the Pinpoint integration.go/skerr for wrapping errors. This provides more context to errors, making debugging easier. For example, if a Pinpoint request fails, the HTTP status code and response body are included in the error message.pinpoint/go/bot_configs: For try jobs, the target parameter is required by the Pinpoint API. This target is derived from the Configuration (bot) and Benchmark using the bot_configs.GetIsolateTarget function. This indicates a specific configuration setup for running the performance tests.test_path Parameter for Bisect Jobs: The Pinpoint API requires a test_path parameter for bisect jobs. This parameter is constructed by joining several components like “ChromiumPerf”, configuration, benchmark, chart, and story. This specific formatting is a legacy requirement of the Chromeperf API.bug_id for Bisect Jobs: The Pinpoint API mandates the bug_id parameter for bisect jobs. If not provided by the caller, the client defaults it to "null". This reflects a specific constraint of the upstream service.tags Parameter: Both job types include a tags parameter set to {"origin":"skia_perf"}. This helps in tracking and filtering jobs originating from the Skia infrastructure within the Pinpoint system.Key Components/Files:
pinpoint.go: This is the sole Go file in the module and contains all the logic.Client struct: Represents the Pinpoint client. It holds the authenticated http.Client and counters for metrics.New() function: The constructor for the Client. It initializes the HTTP client with appropriate authentication.CreateLegacyTryRequest and CreateBisectRequest structs: Define the structure of the data required to create try jobs and bisect jobs, respectively. These fields directly map to the parameters expected by the Pinpoint API.CreatePinpointResponse struct: Defines the structure of the JSON response from Pinpoint, which includes the JobID and JobURL.CreateTryJob() method:CreateLegacyTryRequest and a context.Context.buildTryJobRequestURL to construct the request URL.pinpointLegacyURL.CreatePinpointResponse.CreateBisect() method:CreateTryJob(), but takes a CreateBisectRequest.buildBisectRequestURL.pinpointURL.buildTryJobRequestURL() function:CreateLegacyTryRequest.Benchmark and Configuration.target using bot_configs.GetIsolateTarget.url.Values with all relevant parameters from the request, including hardcoded values like comparison_mode and tags.buildBisectRequestURL() function:CreateBisectRequest.url.Values with parameters from the request.bug_id if not provided.test_path parameter based on available request fields.tags parameter.Key Workflows:
Creating a Bisect Job:
Application Code go/pinpoint.Client Pinpoint API
---------------- ------------------ ------------
1. CreateBisectRequest data ---->
2. Calls client.CreateBisect() -->
3. buildBisectRequestURL()
(constructs URL with params)
4. HTTP POST to pinpointURL -------->
5. Processes request
6. Returns JSON response
<----------------------------------- 7. Receives HTTP response
8. Parses JSON into
CreatePinpointResponse
<--------------------------------- 9. Returns CreatePinpointResponse
Creating a Try Job (A/B Test):
Application Code go/pinpoint.Client Pinpoint API (Legacy)
---------------- ------------------ ---------------------
1. CreateLegacyTryRequest data ->
2. Calls client.CreateTryJob() -->
3. buildTryJobRequestURL()
(gets 'target' from bot_configs,
constructs URL with params)
4. HTTP POST to pinpointLegacyURL ----->
5. Processes request
6. Returns JSON response
<---------------------------------------- 7. Receives HTTP response
8. Parses JSON into
CreatePinpointResponse
<--------------------------------- 9. Returns CreatePinpointResponse
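A rough sketch of the kind of URL construction buildBisectRequestURL performs, using net/url with a hypothetical, much-reduced set of fields (the real function handles many more parameters):

```go
package pinpointsketch

import (
	"net/url"
	"strings"
)

// bisectRequest is a simplified, hypothetical subset of CreateBisectRequest.
type bisectRequest struct {
	Configuration string
	Benchmark     string
	Chart         string
	Story         string
	BugID         string
}

// buildBisectValues mirrors the idea of buildBisectRequestURL: populate
// url.Values, apply defaults required by the legacy Chromeperf API, and
// encode them into a query string.
func buildBisectValues(req bisectRequest) string {
	v := url.Values{}
	v.Set("configuration", req.Configuration)
	v.Set("benchmark", req.Benchmark)
	// bug_id is mandatory for bisect jobs; default it when the caller omits it.
	bugID := req.BugID
	if bugID == "" {
		bugID = "null"
	}
	v.Set("bug_id", bugID)
	// test_path is a legacy Chromeperf requirement, built by joining components.
	v.Set("test_path", strings.Join([]string{"ChromiumPerf", req.Configuration, req.Benchmark, req.Chart, req.Story}, "/"))
	// Tag jobs so they can be attributed to Skia Perf inside Pinpoint.
	v.Set("tags", `{"origin":"skia_perf"}`)
	return v.Encode()
}
```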
The pivot module provides functionality analogous to pivot tables in spreadsheets or GROUP BY operations in SQL. Its primary purpose is to aggregate and summarize trace data within a DataFrame based on specified grouping criteria and operations. This allows users to transform raw trace data into more insightful, summarized views, facilitating comparisons and analysis across different dimensions of the data. For example, one might want to compare the performance of ‘arm’ architecture machines against ‘intel’ architecture machines by summing or averaging their respective performance metrics.
The core of the pivot module revolves around the Request struct and the Pivot function.
Request Struct:
The Request struct encapsulates the parameters for a pivot operation. It defines:
- GroupBy: A slice of strings representing the parameter keys to group the traces by. This is the fundamental dimension along which the data will be aggregated. For instance, if GroupBy is ["arch"], all traces with the same ‘arch’ value will be grouped together.
- Operation: An Operation type (e.g., Sum, Avg, Geo) that specifies how the values within each group of traces should be combined. This operation is applied to each point in the traces within a group, resulting in a new, summarized trace for that group.
- Summary: An optional slice of Operation types. If provided, these operations are applied to the resulting traces from the GroupBy step. Each Summary operation generates a single value (a column in the final output if viewed as a table) for each grouped trace. If Summary is empty, the output is a DataFrame where each row is a summarized trace (suitable for plotting).
Pivot Function Workflow:
The Pivot function executes the aggregation and summarization process. Here's a breakdown of its key steps and the reasoning behind them:
Input Validation (req.Valid()):
GroupBy keys or invalid Operation or Summary values.GroupBy is non-empty and if the specified Operation and Summary operations are among the predefined valid operations (AllOperations).Initialization and Grouping Structure (groupedTraceSets):
types.TraceSet containing traces belonging to that group.groupedTraceSets by determining all possible unique combinations of values for the GroupBy keys present in the input DataFrame's ParamSet. This is done using df.ParamSet.CartesianProduct(req.GroupBy). This pre-population ensures that even groups with no matching traces are considered, although they will be filtered out later if they remain empty.DataFrame (df.TraceSet).req.GroupBy to form a groupKey using groupKeyFromTraceKey. This function ensures that only traces containing all the GroupBy keys contribute to a group. If a trace is missing a GroupBy key, it's ignored.types.TraceSet associated with its groupKey in groupedTraceSets.Input DataFrame (df.TraceSet)
|
v
For each traceID, trace in df.TraceSet:
Parse traceID into params
groupKey = groupKeyFromTraceKey(params, req.GroupBy)
If groupKey is valid:
Add trace to groupedTraceSets[groupKey]
|
v
Grouped Traces (groupedTraceSets)
Applying the GroupBy Operation:
req.Operation.groupedTraceSets.groupByOperation function corresponding to req.Operation (obtained from opMap) to the types.TraceSet of that group. The opMap is a crucial design choice, mapping Operation constants to their respective implementation functions (one for grouping traces, another for summarizing single traces). This provides a clean and extensible way to manage different aggregation functions.ret.TraceSet of the new DataFrame.ctx.Err()) is checked periodically to allow for early termination if the operation is cancelled.Grouped Traces (groupedTraceSets)
|
v
For each groupID, traces in groupedTraceSets:
If len(traces) > 0:
summarizedTrace = opMap[req.Operation].groupByOperation(traces)
ret.TraceSet[groupID] = summarizedTrace
|
v
DataFrame with GroupBy Applied (ret)
Building ParamSet for the Result:
DataFrame needs its own ParamSet reflecting the new structure where trace keys only contain the GroupBy parameters.ret.BuildParamSet() is called.Applying Summary Operations (Optional):
req.Summary is specified. This is useful for generating tabular summaries rather than plots.req.Summary is empty, the original DataFrame's Header is used for the new DataFrame, and the function returns. The result is a DataFrame of summarized traces.req.Summary is not empty:ret.TraceSet.types.Trace (called summaryValues) whose length is equal to the number of Summary operations.Operation in req.Summary, it applies the corresponding summaryOperation function (from opMap) to the current grouped trace. The result is stored in summaryValues.ret.TraceSet[groupKey] is replaced with summaryValues.Header of the ret DataFrame is rebuilt. Each column in the header now corresponds to one of the Summary operations, with offsets from 0 to len(req.Summary) - 1.DataFrame with GroupBy Applied (ret)
|
v
If len(req.Summary) > 0:
For each groupKey, trace in ret.TraceSet:
summaryValues = new Trace of length len(req.Summary)
For i, op in enumerate(req.Summary):
summaryValues[i] = opMap[op].summaryOperation(trace)
ret.TraceSet[groupKey] = summaryValues
Adjust ret.Header to match Summary operations
|
v
Final Pivoted DataFrame (ret)
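A stripped-down sketch of the group-then-aggregate idea, with plain maps and float32 slices standing in for DataFrame, TraceSet, and the opMap dispatch (the names here are illustrative, not the real pivot API):

```go
package pivotsketch

import (
	"sort"
	"strings"
)

// sumTraces is a stand-in for a groupByOperation: combine a group of traces
// point-wise into one aggregated trace.
func sumTraces(traces [][]float32) []float32 {
	if len(traces) == 0 {
		return nil
	}
	out := make([]float32, len(traces[0]))
	for _, tr := range traces {
		for i, v := range tr {
			out[i] += v
		}
	}
	return out
}

// groupKey builds a stable key from the subset of params named in groupBy,
// echoing groupKeyFromTraceKey. Traces missing a groupBy key are skipped.
func groupKey(params map[string]string, groupBy []string) (string, bool) {
	parts := make([]string, 0, len(groupBy))
	for _, k := range groupBy {
		v, ok := params[k]
		if !ok {
			return "", false
		}
		parts = append(parts, k+"="+v)
	}
	sort.Strings(parts)
	return "," + strings.Join(parts, ",") + ",", true
}

// pivot groups traces by the groupBy keys and applies the aggregation to each group.
func pivot(traces map[string][]float32, paramsOf func(traceID string) map[string]string, groupBy []string) map[string][]float32 {
	grouped := map[string][][]float32{}
	for id, tr := range traces {
		key, ok := groupKey(paramsOf(id), groupBy)
		if !ok {
			continue // trace lacks one of the GroupBy keys
		}
		grouped[key] = append(grouped[key], tr)
	}
	result := map[string][]float32{}
	for key, group := range grouped {
		result[key] = sumTraces(group)
	}
	return result
}
```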
Operations (Operation type and opMap):
The module defines a set of standard operations like Sum, Avg, Geo, Std, Count, Min, Max.
Operation is a string constant.opMap is a map where each Operation key maps to an operationFunctions struct. This struct holds two function pointers:groupByOperation: Takes a types.TraceSet (a group of traces) and returns a single aggregated types.Trace. These functions are typically sourced from the go/calc module.summaryOperation: Takes a single []float32 (a trace) and returns a single float32 summary value. These functions are typically sourced from go/vec32 or defined locally (like stdDev).opMap with the appropriate implementation functions.Error Handling:
Pivot function returns an error if req.Valid() fails or if an error occurs during grouping (e.g., a GroupBy key is not found in the ParamSet of the input DataFrame). Context cancellation is also handled, allowing long-running pivot operations to be interrupted. Errors are wrapped using skerr.Wrap to provide context.pivot.go: This is the main file containing all the logic for the pivot functionality.
Request struct: Defines the parameters for a pivot operation. Its design allows for flexible grouping and summarization.Operation type and constants: Define the set of available aggregation operations.opMap variable: A critical data structure mapping Operation types to their respective implementation functions for both grouping and summarizing. This is the heart of how different operations are dispatched.Pivot function: The primary public function that performs the pivot operation. Its step-by-step process of grouping, applying the main operation, and then optionally applying summary operations is central to its functionality.groupKeyFromTraceKey function: A helper function responsible for constructing the group identifier for each trace based on the GroupBy keys. It handles cases where a trace might not have all the required keys.Valid() method on Request: Ensures that the pivot request is well-formed before processing begins.pivot_test.go: Contains unit tests for the pivot module.
testify assertion library and defines test cases that cover different aspects of the Request validation, groupKeyFromTraceKey logic, and the Pivot function itself with various combinations of Operation and Summary settings. The dataframeForTesting() helper function provides a consistent dataset for testing.This module is designed to be a general-purpose tool for transforming and understanding large datasets of traces by allowing users to aggregate data along arbitrary dimensions and apply various statistical operations.
The /go/progress module provides a mechanism for tracking the progress of long-running tasks on the backend and exposing this information to the UI. This is crucial for user experience in applications where operations like data queries or complex computations can take a significant amount of time. Without progress tracking, users might perceive the application as unresponsive or encounter timeouts.
Many backend operations, such as those initiated by API endpoints like /frame/start or /dryrun/start, are asynchronous. The initial HTTP request might return quickly, but the actual work continues in the background. This module addresses the need to track that background work and report its status, intermediate messages, and eventual results back to the client.
The core idea is to represent the state of a long-running task as a Progress object. This object can be updated by the task as it executes. A Tracker then manages multiple Progress objects, making them accessible via HTTP polling.
Key Components:
progress.go: Defines the Progress interface and its concrete implementation progress.
- Progress Interface: This is the central abstraction for a single long-running task.
  - Message(key, value string): Allows the task to report arbitrary key-value string pairs. This is flexible enough to accommodate diverse progress information (e.g., current step, commit being processed, number of items filtered). If a key already exists, its value is updated.
  - Results(interface{}): Stores intermediate or final results of the task. The interface{} type allows any JSON-serializable data to be stored. This is useful for showing partial results or accumulating data incrementally.
  - Error(string): Marks the task as failed and stores an error message.
  - Finished(): Marks the task as successfully completed.
  - FinishedWithResults(interface{}): Atomically sets the results and marks the task as finished. This is preferred over separate Results() and Finished() calls to avoid race conditions where the UI might poll between the two calls.
  - Status() Status: Returns the current status (Running, Finished, Error).
  - URL(string): Sets the URL that the client should poll for further updates. This is typically set by the Tracker.
  - JSON(w io.Writer) error: Serializes the current progress state (status, messages, results, next URL) into JSON and writes it to the provided writer.
- progress struct (concrete implementation):
  - Uses a sync.Mutex to ensure thread-safe updates to its internal SerializedProgress state. This is critical because long-running tasks often execute in separate goroutines, and the Progress object might be accessed concurrently by the task updating its state and by the Tracker serving HTTP requests.
  - Holds a SerializedProgress struct, which is designed for easy JSON serialization.
  - A Progress object starts in the Running state. Once it transitions to Finished or Error, it becomes immutable. Any attempt to modify it (e.g., calling Message() or Results() again) will result in a panic. This design simplifies reasoning about the lifecycle of a task's progress.
- SerializedProgress struct: Defines the JSON structure sent to the client. It includes the Status, an array of Message (key-value pairs), the Results (if any), and the URL for the next poll.
- Status enum: Running, Finished, Error.
tracker.go: Defines the Tracker interface and its concrete implementation tracker.
- Tracker Interface: Manages a collection of Progress objects.
  - Add(prog Progress): Registers a new Progress object with the tracker. The tracker assigns a unique ID to this progress and sets its polling URL.
  - Handler(w http.ResponseWriter, r *http.Request): An HTTP handler function that clients use to poll for progress updates. It extracts the progress ID from the request URL, retrieves the corresponding Progress object, and sends its JSON representation.
  - Start(ctx context.Context): Starts a background goroutine for periodic cleanup of completed tasks from the cache.
- tracker struct (concrete implementation):
  - lru.Cache: Uses a Least Recently Used (LRU) cache (github.com/hashicorp/golang-lru) to store cacheEntry objects.
  - basePath: A string prefix for the polling URLs (e.g., /_/status/). Each progress object gets a unique ID appended to this base path to form its polling URL.
  - cacheEntry struct: Wraps a Progress object and a Finished timestamp. The timestamp is used by the cleanup routine to determine when a completed task can be removed from the cache.
  - The Start method launches a goroutine that periodically calls singleStep. singleStep iterates through the cache: it records the Finished timestamp in a cacheEntry when the corresponding Progress object transitions out of the Running state, and it removes entries that have been in the Finished or Error state for longer than cacheDuration (currently 5 minutes). This prevents the cache from holding onto completed tasks indefinitely.
  - Uses github.com/google/uuid to generate unique IDs for each tracked Progress. This makes the polling URLs distinct and hard to guess.
Starting and Tracking a Long-Running Task:
Backend HTTP Handler (e.g., /api/start_long_task)
|
| 1. Create a new Progress object:
| prog := progress.New()
|
| 2. Add it to the global Tracker instance:
| trackerInstance.Add(prog) // Tracker sets prog.URL() internally
|
| 3. Respond to the initial HTTP request with the Progress JSON.
| // The client now has prog.URL() to poll.
| prog.JSON(w)
|
V
Goroutine (executing the long-running task)
|
| 1. Periodically update progress:
| prog.Message("Step", "Processing item X")
| prog.Message("PercentComplete", "30%")
| prog.Results(partialData) // Optional: intermediate results
|
| 2. When finished:
| If error:
| prog.Error("Something went wrong")
| Else:
| prog.FinishedWithResults(finalData)
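The same flow as a hedged Go sketch. progress.New, Tracker.Add, and the Progress methods are taken from the descriptions above; the import path, handler wiring, and the doExpensiveWork helper are illustrative assumptions.

    package example

    import (
        "context"
        "net/http"

        "go.skia.org/infra/perf/go/progress" // assumed import path
    )

    // doExpensiveWork is a hypothetical stand-in for the real background task.
    func doExpensiveWork(ctx context.Context) (any, error) {
        return map[string]int{"count": 42}, nil
    }

    // startLongTask kicks off a background task and immediately returns the
    // Progress JSON so the client knows which URL to poll.
    func startLongTask(tracker progress.Tracker) http.HandlerFunc {
        return func(w http.ResponseWriter, r *http.Request) {
            prog := progress.New()
            tracker.Add(prog) // Tracker assigns an ID and sets prog's polling URL.

            go func() {
                prog.Message("Step", "Processing item X")
                results, err := doExpensiveWork(context.Background())
                if err != nil {
                    prog.Error(err.Error())
                    return
                }
                // Set results and mark finished atomically, so a poll cannot
                // land between Results() and Finished().
                prog.FinishedWithResults(results)
            }()

            // Initial response: status Running plus the URL to poll.
            if err := prog.JSON(w); err != nil {
                http.Error(w, "failed to serialize progress", http.StatusInternalServerError)
            }
        }
    }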
Client Polling for Updates:
Client (e.g., browser UI)
|
| 1. Receives initial response with prog.URL (e.g., /_/status/some-uuid)
|
| 2. Makes a GET request to prog.URL
V
Backend Tracker.Handler
|
| 1. Extracts "some-uuid" from the request path.
|
| 2. Looks up the Progress object in its cache using "some-uuid".
| If not found --> HTTP 404 Not Found
|
| 3. Calls prog.JSON(w) to send the current state.
V
Client
|
| 1. Receives JSON with current status, messages, results.
|
| 2. If Status is "Running", schedules another poll to prog.URL.
|
| 3. If Status is "Finished" or "Error", displays final results/error and stops polling.
Tracker Cache Management (Background Process):
Tracker.Start()
|
V
Goroutine (periodic execution, e.g., every minute)
|
| Calls tracker.singleStep()
| |
| V
| Iterate through cache entries:
| - If Progress.Status() is not "Running" AND cacheEntry.Finished is zero:
| Set cacheEntry.Finished = now()
| - If cacheEntry.Finished is not zero AND now() > cacheEntry.Finished + cacheDuration:
| Remove entry from cache
| - Update metrics (numEntriesInCache)
|
V
(Loop back to periodic execution)
This system provides a robust and flexible way to communicate the progress of backend tasks to the user interface, improving the overall user experience for operations that might otherwise seem opaque or unresponsive. The use of JSON for data interchange makes it easy for web frontends to consume the progress information.
The psrefresh module is designed to manage and provide access to paramtools.ParamSet instances, which are collections of key-value pairs representing the parameters of traces in a performance monitoring system. The primary goal is to efficiently retrieve and cache these parameter sets, especially for frequently accessed queries, to reduce database load and improve response times.
The module addresses the need for up-to-date parameter sets by periodically fetching data from a trace store (represented by the OPSProvider interface). It combines parameter sets from recent time intervals (tiles) to provide a comprehensive view of available parameters.
A key challenge is handling potentially large and complex parameter sets. To mitigate this, the module offers a caching layer (CachedParamSetRefresher). This caching mechanism is configurable and can pre-populate caches (e.g., local in-memory or Redis) with filtered parameter sets based on predefined query levels. This pre-population significantly speeds up queries that match these common filter patterns.
Key Components and Responsibilities:
psrefresh.go:
Defines the two key interfaces, OPSProvider and ParamSetRefresher, and the default implementation.
- OPSProvider: Abstractly represents a source of ordered parameter sets (e.g., a trace data store). It provides methods to get the latest "tile" (a time-based segment of data) and the parameter set for a specific tile. This abstraction allows psrefresh to be independent of the underlying data storage implementation.
- ParamSetRefresher: Defines the contract for components that can provide the full parameter set and parameter sets filtered by a query. It also includes a Start method to initiate the refresh process.
- defaultParamSetRefresher is the standard implementation of ParamSetRefresher.
  - It wraps an OPSProvider and merges parameter sets from a configurable number of recent tiles to create a comprehensive view.
  - Start launches a background goroutine (refresh) that periodically calls oneStep. The oneStep method fetches the latest tile, then iterates backward through the configured number of previous tiles, retrieving and merging their parameter sets using paramtools.ParamSet.AddParamSet. The resulting merged set is then normalized and stored.
  - A sync.Mutex is used to protect concurrent access to the ps (paramtools.ReadOnlyParamSet) field, ensuring thread safety when GetAll is called.
  - GetParamSetForQuery delegates the actual filtering and counting of traces to a dataframe.DataFrameBuilder, demonstrating a separation of concerns.
  - UpdateQueryValueWithDefaults is a helper to automatically add default parameter selections to queries if configured, simplifying common query patterns.
Defines CachedParamSetRefresher, which wraps a defaultParamSetRefresher and adds a caching layer.
- It holds a cache.Cache instance (which could be local, Redis, etc.) and a defaultParamSetRefresher.
- PopulateCache: This is a crucial method that proactively fills the cache. It uses the QueryCacheConfig (part of config.QueryConfig) to determine which levels of parameter sets to cache.
  - For each configured Level 1 key/value it calls psRefresher.PreflightQuery (via the dfBuilder) to get the filtered parameter set and the count of matching traces.
  - It then calls populateChildLevel to cache parameter sets for combinations of Level 1 and Level 2 parameters.
  - Cache keys are built by paramSetKey and countKey, ensuring a consistent naming scheme.
- GetParamSetForQuery: When a query is made, getParamSetForQueryInternal first tries to retrieve the result from the cache.
  - The cache key is derived from the query values (getParamSetKey). It only attempts to serve from the cache if the query matches the configured cache levels (1 or 2 parameters, potentially adjusted for default parameters).
  - On a hit, it reconstructs the paramtools.ParamSet from the cached string and retrieves the count.
  - On a miss (or an uncacheable query), it falls back to psRefresher.GetParamSetForQuery.
- StartRefreshRoutine: This method starts a goroutine that periodically calls PopulateCache to keep the cached data fresh.
Key Workflows:
Initialization and Periodic Refresh (Default Refresher):
NewDefaultParamSetRefresher(opsProvider, ...) -> pf
pf.Start(refreshPeriod)
-> pf.oneStep() // Initial fetch
-> opsProvider.GetLatestTile() -> latestTile
-> LOOP (numParamSets times):
-> opsProvider.GetParamSet(tile) -> individualPS
-> mergedPS.AddParamSet(individualPS)
-> tile = tile.Prev()
-> mergedPS.Normalize()
-> pf.ps = mergedPS.Freeze()
-> GO pf.refresh()
-> LOOP (every refreshPeriod):
-> pf.oneStep() // Subsequent fetches
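A hedged Go sketch of the tile-merge loop shown above. The OPSProvider method names follow the workflow, but the exact signatures, the import paths, and the error handling are simplifying assumptions.

    package example

    import (
        "context"

        "go.skia.org/infra/go/paramtools"     // assumed import path
        "go.skia.org/infra/perf/go/psrefresh" // assumed import path
    )

    // oneStepSketch merges the ParamSets of the most recent n tiles into a
    // single read-only ParamSet, mirroring what defaultParamSetRefresher.oneStep does.
    func oneStepSketch(ctx context.Context, ops psrefresh.OPSProvider, n int) (paramtools.ReadOnlyParamSet, error) {
        tile, err := ops.GetLatestTile(ctx) // exact signature is an assumption
        if err != nil {
            return nil, err
        }
        merged := paramtools.NewParamSet()
        for i := 0; i < n; i++ {
            ps, err := ops.GetParamSet(ctx, tile)
            if err != nil {
                return nil, err
            }
            merged.AddParamSet(ps)
            tile = tile.Prev() // step back one tile
        }
        merged.Normalize()
        return merged.Freeze(), nil
    }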
Cache Population (Cached Refresher):
NewCachedParamSetRefresher(defaultRefresher, cacheImpl) -> cr
cr.StartRefreshRoutine(cacheRefreshPeriod)
-> cr.PopulateCache() // Initial population
-> defaultRefresher.GetAll() -> fullPS
-> // For each configured Level 1 key/value:
-> qValues = {level1Key: [level1Value]}
-> defaultRefresher.UpdateQueryValueWithDefaults(qValues) // If applicable
-> query.New(qValues) -> lv1Query
-> defaultRefresher.dfBuilder.PreflightQuery(ctx, lv1Query, fullPS) -> count, filteredPS
-> psCacheKey = paramSetKey(qValues, [level1Key])
-> cr.addToCache(ctx, psCacheKey, filteredPS.ToString(), count)
-> // If Level 2 is configured:
-> cr.populateChildLevel(ctx, level1Key, level1Value, filteredPS, level2Key, level2Values)
-> // For each configured Level 2 value:
-> qValues = {level1Key: [level1Value], level2Key: [level2Value]}
-> ... (similar PreflightQuery and addToCache)
-> GO LOOP (every cacheRefreshPeriod):
-> cr.PopulateCache() // Subsequent cache refreshes
Querying with Cache:
cr.GetParamSetForQuery(ctx, queryObj, queryValues)
  -> cr.getParamSetForQueryInternal(ctx, queryObj, queryValues)
    -> cr.getParamSetKey(queryValues) -> cacheKey, err
    -> IF cacheKey is valid AND exists:
      -> cache.GetValue(ctx, cacheKey) -> cachedParamSetString
      -> cache.GetValue(ctx, countKey(cacheKey)) -> cachedCountString
      -> paramtools.FromString(cachedParamSetString) -> paramSet
      -> strconv.ParseInt(cachedCountString) -> count
      -> RETURN count, paramSet, nil
    -> ELSE (cache miss or invalid key for caching):
      -> defaultRefresher.GetParamSetForQuery(ctx, queryObj, queryValues) -> count, paramSet, err
      -> RETURN count, paramSet, err
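The same lookup-with-fallback pattern as a compact, generic Go sketch; the Cache interface and the "_count" key suffix here are simplified stand-ins, not the module's actual types or key scheme.

    package example

    import (
        "context"
        "strconv"
    )

    // Cache is a simplified stand-in for the module's cache.Cache dependency.
    type Cache interface {
        GetValue(ctx context.Context, key string) (string, error)
    }

    // getCountWithFallback tries the cache first and falls back to a slower
    // source of truth (e.g., a preflight query against the database) on a miss.
    func getCountWithFallback(ctx context.Context, c Cache, key string, slow func(context.Context) (int64, error)) (int64, error) {
        if cached, err := c.GetValue(ctx, key+"_count"); err == nil && cached != "" {
            if count, err := strconv.ParseInt(cached, 10, 64); err == nil {
                return count, nil // cache hit
            }
        }
        return slow(ctx) // cache miss: compute from the source of truth
    }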
The use of config.QueryConfig and config.Experiments allows for instance-specific tuning of caching behavior (which keys/values to pre-populate) and handling of default parameters. The separation between defaultParamSetRefresher and CachedParamSetRefresher promotes modularity, allowing the caching layer to be optional or replaced with different caching strategies if needed.
The redis module in Skia Perf is designed to manage interactions with Redis instances, primarily to support and optimize the query UI. It leverages Redis for caching frequently accessed data, thereby improving the responsiveness and performance of the Perf frontend.
The core idea is to periodically fetch information about available Redis instances within a Google Cloud Project and then interact with a specific, configured Redis instance to store or retrieve cached data. This cached data typically represents results of expensive computations or frequently requested data points, like recent trace data for specific queries.
Key Responsibilities and Components:
redis.go: This is the central file of the module.
- RedisWrapper interface: Defines the contract for Redis-related operations. This abstraction allows for easier testing and potential future replacements of the underlying Redis client implementation. The key methods are:
  - StartRefreshRoutine: Initiates a background process (goroutine) that periodically discovers and interacts with the configured Redis instance.
  - ListRedisInstances: Retrieves a list of all Redis instances available within a specified GCP project and location.
- RedisClient struct: This is the concrete implementation of the RedisWrapper interface.
  - It holds a gcp_redis.CloudRedisClient for interacting with the Google Cloud Redis API (e.g., listing instances).
  - It holds a tracestore.TraceStore, which is likely used to fetch the data that needs to be cached in Redis.
  - The tilesToCache field suggests that the caching strategy might involve pre-calculating and storing "tiles" of data, which is a common pattern in Perf systems for displaying graphs over time.
- NewRedisClient: The constructor for RedisClient.
- StartRefreshRoutine:
  - Why: To ensure that Perf is always aware of the correct Redis instance to use and to periodically update the cache. Network configurations or instance details might change, and this routine helps adapt to such changes.
  - How: It takes a refreshPeriod and a config.InstanceConfig (which is actually redis_client.RedisConfig in the current implementation, indicating the target project, zone, and instance name). It then starts a goroutine that, at regular intervals defined by refreshPeriod:
    - Calls ListRedisInstances to get all Redis instances in the configured project/zone.
    - Iterates through the instances to find the one matching the config.Instance name.
    - If the target instance is found, it calls RefreshCachedQueries.

    [StartRefreshRoutine]
      |
      V
    (Goroutine - Ticks every 'refreshPeriod')
      |
      V
    [ListRedisInstances] -> (GCP API Call) -> [List of Redis Instances]
      |
      V
    (Find Target Instance by Name)
      |
      V
    (If Target Found)
    [RefreshCachedQueries]

- ListRedisInstances: Uses gcpClient (an instance of cloud.google.com/go/redis/apiv1.CloudRedisClient) to make an API call to GCP to list instances under the given parent (e.g., "projects/my-project/locations/us-central1"). It iterates through the results and returns a slice of redispb.Instance objects.
- RefreshCachedQueries:
  - Connects to the target Redis instance (instance.Host and instance.Port) using github.com/redis/go-redis/v9.
  - Acquires a lock (r.mutex.Lock()) to prevent concurrent modifications to the cache or shared resources, though the current implementation only has placeholder logic.
  - GETs a key named "FullPS".
  - SETs the key "FullPS" to the current time, with an expiration of 30 seconds.
  - Per the TODO(wenbinzhang) and the tilesToCache field, this method is expected to be expanded to cache real data, likely fetched from the traceStore; the tilesToCache parameter suggests it might pre-cache a certain number of recent "tiles" of trace data.
mocks/RedisWrapper.go: This file contains a mock implementation of the RedisWrapper interface, generated by the mockery tool.
It exists to allow unit testing of components that depend on RedisWrapper. By using a mock, tests can simulate various Redis behaviors (e.g., successful connection, instance not found, errors) without needing an actual Redis instance or GCP connectivity. The mock is a RedisWrapper struct that embeds mock.Mock from the testify library. For each method in the RedisWrapper interface, there's a corresponding method in the mock that records calls and can be configured to return specific values or errors, allowing test authors to define expected interactions.
Design Decisions and Rationale:
- Interface Abstraction (RedisWrapper): Using an interface decouples the rest of the Perf system from the concrete Redis client implementation. This is good for:
  - Testability: mocks can be generated (see the mocks package) so consumers can be tested without real Redis or GCP dependencies.
  - Flexibility: the underlying implementation can change behind RedisWrapper without affecting its consumers.
- Periodic Instance Discovery: Instance details and network configuration can change over time, so rather than relying on a one-time static lookup, StartRefreshRoutine provides a more robust approach.
- Two Client Libraries: The module uses cloud.google.com/go/redis/apiv1 for GCP infrastructure management and github.com/redis/go-redis/v9 (go-redis) for standard Redis data operations. This ensures reliance on well-maintained and feature-rich libraries.
Workflow: Cache Refresh Process
The primary workflow driven by this module is the periodic refresh of cached data:
System Starts
|
V
Initialize RedisClient (NewRedisClient)
|
V
Call StartRefreshRoutine
|
V
[Background Goroutine - Loop every 'refreshPeriod']
|
|--> 1. List GCP Redis Instances (ListRedisInstances)
| - Input: GCP project, location
| - Output: List of *redispb.Instance
|
|--> 2. Identify Target Redis Instance
| - Based on configuration (e.g., instance name)
|
|--> 3. If Target Instance Found: Refresh Cache (RefreshCachedQueries)
|
|--> a. Connect to Target Redis (using go-redis)
| - Host, Port from *redispb.Instance
|
|--> b. Determine data to cache (e.g., recent trace data for popular queries)
| - Likely involves `traceStore`
|
|--> c. Write data to Redis (SET commands)
| - Use appropriate keys and expiration times
|
|--> (Current placeholder: SET "FullPS" = current_time with 30s TTL)
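A hedged Go sketch of the discovery-then-refresh loop above. The method set mirrors RedisWrapper as described, but the exact signatures, the parent string format, and the name-matching rule are assumptions.

    package example

    import (
        "context"
        "fmt"
        "strings"
        "time"

        redispb "cloud.google.com/go/redis/apiv1/redispb" // assumed package path for redispb.Instance
    )

    // redisWrapper is a pared-down version of the module's RedisWrapper
    // interface; the real method signatures may differ.
    type redisWrapper interface {
        ListRedisInstances(ctx context.Context, parent string) []*redispb.Instance
        RefreshCachedQueries(ctx context.Context, instance *redispb.Instance)
    }

    // refreshLoop periodically lists Redis instances in the project/zone and,
    // when the configured instance is found, refreshes the cached queries.
    func refreshLoop(ctx context.Context, rw redisWrapper, project, zone, instanceName string, period time.Duration) {
        parent := fmt.Sprintf("projects/%s/locations/%s", project, zone)
        ticker := time.NewTicker(period)
        defer ticker.Stop()
        for {
            select {
            case <-ctx.Done():
                return
            case <-ticker.C:
                for _, inst := range rw.ListRedisInstances(ctx, parent) {
                    // Instance.Name is a full resource path; match on its final segment.
                    if strings.HasSuffix(inst.GetName(), "/instances/"+instanceName) {
                        rw.RefreshCachedQueries(ctx, inst)
                        break
                    }
                }
            }
        }
    }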
This module provides the foundational components for integrating Redis as a caching layer in Skia Perf, aiming to improve UI performance by serving frequently requested data quickly from an in-memory store. The current implementation focuses on instance discovery and has placeholder logic for the actual caching, which is expected to be expanded based on Perf's specific caching needs.
The /go/regression module is responsible for detecting, storing, and managing performance regressions in Skia. It analyzes performance data over time, identifies significant changes (regressions or improvements), and provides mechanisms for triaging and tracking these changes.
Core Functionality & Design:
The primary goal is to automatically flag performance changes that might indicate a problem or an unexpected improvement. This involves detecting step-like changes in trace data, persisting the detected regressions together with their triage state, and notifying the relevant parties.
Key Components & Files:
detector.go: This file contains the core logic for processing regression detection requests.
- ProcessRegressions is the main entry point. It takes a RegressionDetectionRequest (which specifies the alert configuration and the time domain to analyze) and a DetectorResponseProcessor callback.
- Requests for alerts with GroupBy parameters are expanded into multiple, more specific requests using allRequestsFromBaseRequest. This allows for targeted analysis of specific trace groups.
- Data is supplied by a dfiter.DataFrameIterator, which provides dataframes for analysis.
- Traces with too much missing data are filtered out (tooMuchMissingData) to ensure the reliability of the detection.
- Clustering is applied (clustering2.CalculateClusterSummaries for K-Means, or individual step fitting via StepFit) to identify clusters of traces exhibiting similar behavior. The choice of K (number of clusters for K-Means) can be automatic or user-specified.
- The results are RegressionDetectionResponse objects containing the cluster summaries and the relevant data frame. These responses are passed to the DetectorResponseProcessor.
- Shortcuts for the trace keys in each cluster are created via shortcutFromKeys for easier referencing.

RegressionDetectionRequest -> Expand (if GroupBy) -> Multiple Requests
    |
    V
For each Request:
  DataFrameIterator -> DataFrame -> Filter Traces -> Apply Clustering (KMeans or StepFit)
                                                          |                  |
                                                          V                  V
                                           Shortcut Creation <- ClusterSummaries -> DetectorResponseProcessor

regression.go: Defines the primary data structures for representing regressions and their triage status.
- Regression: The central struct holding Low and High ClusterSummary objects (from clustering2), the FrameResponse (data context), and TriageStatus for both low and high. It also includes fields for the newer regression2 schema (like Id, CommitNumber, AlertId, MedianBefore, MedianAfter).
- TriageStatus: Represents whether a regression is Untriaged, Positive (expected/acceptable), or Negative (a bug).
- AllRegressionsForCommit: A container for all regressions found for a specific commit, keyed by the alert ID.
- Merge: A method to combine information from two Regression objects, typically used when new data provides a more significant regression for an existing alert.
types.go: Defines the Store interface, which abstracts the persistence layer for regressions.
The Store interface specifies methods for (a condensed sketch of this interface appears at the end of this module's description):
- Range: Retrieving regressions within a commit range.
- SetHigh/SetLow: Storing newly detected high/low regressions.
- TriageHigh/TriageLow: Updating the triage status of regressions.
- Write: Bulk writing of regressions.
- GetRegressionsBySubName, GetByIDs: Retrieving regressions based on subscription names or specific IDs (primarily for the regression2 schema).
- GetOldestCommit, GetRegression: Utility methods for fetching specific data.
- DeleteByCommit: Removing regressions associated with a commit.
stepfit.go: Implements an alternative regression detection strategy that analyzes each trace individually using step fitting.
- This strategy is used when GroupBy is used in an alert, or when K-Means clustering is not the desired approach. It focuses on finding significant steps in individual time series.
- The StepFit function iterates through each trace in the input DataFrame.
- For each trace it calls stepfit.GetStepFitAtMid to determine if there's a significant step (low or high) around the midpoint of the trace.
- If the step is significant (based on the stddevThreshold and interesting parameters), the trace is added to either the low or high ClusterSummary.
- The low and high summaries collect all traces that show a downward or upward step, respectively.
- Parameter summaries (ParamSummaries) are generated for the keys within these clusters.
fromsummary.go: Provides a utility function to convert a RegressionDetectionResponse into a Regression object.
- This bridges detection results and the Regression type used for storage and display.
- RegressionFromClusterResponse takes a RegressionDetectionResponse, an alerts.Alert configuration, and a perfgit.Git instance.
- It inspects the ClusterSummary objects in the response and populates the Low or High fields of the Regression object. It prioritizes the regression with the largest absolute magnitude if multiple are found.
Submodules:
continuous/ (continuous.go): Manages the continuous, background detection of regressions.
- Continuous struct: Holds dependencies like perfgit.Git, regression.Store, alerts.ConfigProvider, notify.Notifier, etc.
- Run(): The main entry point, which starts either event-driven or polling-based regression detection.
- Event-driven detection (RunEventDrivenClustering):
  - Listens for Pub/Sub messages on FileIngestionTopicName indicating new data ingestion (ingestevents.IngestEvent).
  - Matching alert configurations are found via getTraceIdConfigsForIngestEvent (which calls matchingConfigsFromTraceIDs). matchingConfigsFromTraceIDs refines alert queries if GroupBy is present to be more specific to the incoming trace.
  - Calls ProcessAlertConfig (or ProcessAlertConfigForTraces if StepFitGrouping is used) for each matching config and the specific traces.
- Polling-based detection (RunContinuousClustering):
  - Periodically (every pollingDelay), fetches all alert configurations using buildConfigAndParamsetChannel.
  - Calls ProcessAlertConfig for each configuration.
- ProcessAlertConfig():
  - Validates GroupBy alerts to ensure the query is valid and returns data.
  - Calls regression.ProcessRegressions to perform the actual detection.
  - The clusterResponseProcessor (which is reportRegressions) is called with the detection results.
- reportRegressions():
  - For each detection result (RegressionDetectionResponse), it determines the commit and previous commit details.
  - Calls updateStoreAndNotification to persist the regression and send notifications.
- updateStoreAndNotification():
  - Persists the regression via the regression.Store (store.SetLow or store.SetHigh) and sends a notification via notifier.RegressionFound. The notification ID is stored with the regression.
  - Existing notifications are updated via notifier.UpdateNotification.
- Whether the event-driven or polling path is used is controlled by the EventDrivenRegressionDetection flag.

Pub/Sub Message (New Data) -> Decode IngestEvent -> Get Matching Alert Configs
    |
    V
For each (Config, Matched Traces):
  ProcessAlertConfig -> regression.ProcessRegressions
    |
    V
  reportRegressions -> updateStoreAndNotification
                            |            |
                            V            V
                          Store       Notifier

migration/ (migrator.go): Handles the data migration from an older regressions table schema to the newer regressions2 schema.
- The new schema (Regression2Schema) aims to store regression data more granularly, typically one row per detected step (high or low), rather than combining high and low for the same commit/alert into a single JSON blob.
- RegressionMigrator: Contains instances of the legacy sqlregressionstore.SQLRegressionStore and the new sqlregression2store.SQLRegression2Store.
- RunPeriodicMigration: Sets up a ticker to periodically run RunOneMigration.
- RunOneMigration / migrateRegressions:
  - Fetches a batch of unmigrated rows from the legacy store (legacyStore.GetRegressionsToMigrate).
  - For each legacy Regression object:
    - Populates the fields required by the regression2 schema (e.g., Id, PrevCommitNumber, MedianBefore, MedianAfter, IsImprovement, ClusterType) if they are not already present from the legacy data. This is crucial as sqlregression2store.WriteRegression expects these.
    - The sqlregression2store.WriteRegression function might split a single legacy Regression object (if it has both High and Low components) into two separate entries in the Regressions2 table, one for HighClusterType and one for LowClusterType.
    - Marks the row in the legacy Regressions table as migrated using legacyStore.MarkMigrated, storing the new regression ID.
sqlregressionstore/: Implements the regression.Store interface using a generic SQL database. This is the older SQL storage mechanism.
- SQLRegressionStore: The main struct, holding a database connection pool (pool.Pool) and prepared SQL statements. It supports different SQL dialects (e.g., CockroachDB via statements, Spanner via spannerStatements).
- The schema (sqlregressionstore/schema/RegressionSchema.go) typically stores one row per (commit_number, alert_id) pair. The actual regression.Regression object (which might contain both high and low details, along with the frame) is serialized into a JSON string and stored in a regression TEXT column.
- readModifyWrite: A core helper function that encapsulates the common pattern of reading a Regression from the DB, allowing a callback to modify it, and then writing it back. This is done within a transaction to prevent lost updates. If mustExist is true, it errors if the regression isn't found; otherwise, it creates a new one.
- SetHigh/SetLow: Use readModifyWrite to update the High or Low part of the JSON-serialized Regression object. They also update the triage status to Untriaged if it was previously None.
- TriageHigh/TriageLow: Use readModifyWrite to update the HighStatus or LowStatus within the JSON-serialized Regression.
- GetRegressionsToMigrate: Fetches regressions that haven't been migrated to the regression2 schema.
- MarkMigrated: Updates a row to indicate it has been migrated, storing the new regression_id from the regression2 table.
- Storing the whole Regression object as JSON can make querying for specific aspects of the regression (e.g., only high regressions, or regressions with a specific triage status) less efficient and more complex. This is one of the motivations for the sqlregression2store.
sqlregression2store/: Implements the regression.Store interface using a newer SQL schema (Regressions2).
- It improves on sqlregressionstore by storing regression data in a more normalized and queryable way.
- SQLRegression2Store: The main struct.
- The schema (sqlregression2store/schema/Regression2Schema.go): Designed to store each regression step (high or low) as a separate row. Key columns include id (UUID, primary key), commit_number, prev_commit_number, alert_id, creation_time, median_before, median_after, is_improvement, cluster_type (e.g., "high", "low"), cluster_summary (JSONB), frame (JSONB), triage_status, and triage_message.
- writeSingleRegression: The core writing function. It takes a regression.Regression object and writes its relevant parts (either high or low, but not both in the same DB row) to the Regressions2 table.
- convertRowToRegression: Converts a database row from Regressions2 back into a regression.Regression object. Depending on the cluster_type in the row, it populates either the High or Low part of the Regression object.
- SetHigh/SetLow: Delegate to updateBasedOnAlertAlgo.
- updateBasedOnAlertAlgo: This function is crucial. It considers the Algo type of the alert (KMeansGrouping vs. StepFitGrouping).
  - For KMeansGrouping, it expects to potentially update an existing regression for the same (commit_number, alert_id) as new data might refine the cluster. It uses readModifyWriteCompat to achieve this.
  - For StepFitGrouping (individual trace analysis), it generally expects to create a new regression entry if one doesn't exist for the exact frame, avoiding updates to pre-existing ones unless it's truly a new detection.
  - The updateFunc passed to updateBasedOnAlertAlgo populates the necessary fields in the regression.Regression object (e.g., setting r.High or r.Low, and calling populateRegression2Fields).
- populateRegression2Fields: This helper populates the fields specific to the Regressions2 schema (like PrevCommitNumber, MedianBefore, MedianAfter, IsImprovement) from the ClusterSummary and FrameResponse within the Regression object.
- WriteRegression (used by migrator): If a legacy Regression object has both High and Low components, this function splits it and calls writeSingleRegression twice, creating two rows in Regressions2.
- Range: When retrieving regressions, if multiple rows from Regressions2 correspond to the same (commit_number, alert_id) (e.g., one for high, one for low), it merges them back into a single regression.Regression object for compatibility with how the rest of the system might expect the data.
Overall Workflow Example (Simplified):
1. Detection is triggered (continuous.go): Continuous identifies the relevant alerts.Alert configurations and ProcessAlertConfig is called for each.
2. Detection runs (detector.go): ProcessRegressions fetches data and builds DataFrames, clustering or step fitting (stepfit.go) is applied, and RegressionDetectionResponses are generated.
3. Results are persisted (continuous.go calls back into the regression store): reportRegressions processes these responses, and updateStoreAndNotification interacts with a regression.Store implementation by calling SetLow or SetHigh on the store.
4. The store (e.g., sqlregression2store) writes the data to the Regressions2 table, potentially creating a new row or updating an existing one based on the alert's algorithm type.
The system is designed to be modular, with interfaces like regression.Store and alerts.ConfigProvider allowing for flexibility in implementation details. The migration path from sqlregressionstore to sqlregression2store highlights the evolution towards a more structured and queryable data model for regressions.
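For orientation, here is a condensed, self-contained Go sketch of the Store interface described above. The placeholder types and method signatures are illustrative guesses, not the real regression.Store definitions.

    package example

    import "context"

    // Placeholder types standing in for the real ones (types.CommitNumber,
    // regression.Regression, regression.TriageStatus).
    type (
        CommitNumber int64
        Regression   struct{}
        TriageStatus struct{ Status, Message string }
    )

    // Store is a trimmed-down illustration of the persistence interface; the
    // real regression.Store has more methods and different signatures.
    type Store interface {
        // Range retrieves regressions within a commit range, keyed by commit
        // and then by alert ID.
        Range(ctx context.Context, begin, end CommitNumber) (map[CommitNumber]map[string]*Regression, error)
        // SetHigh / SetLow persist newly detected steps for a commit/alert pair.
        SetHigh(ctx context.Context, commit CommitNumber, alertID string, high *Regression) error
        SetLow(ctx context.Context, commit CommitNumber, alertID string, low *Regression) error
        // TriageHigh / TriageLow update the triage status of a stored regression.
        TriageHigh(ctx context.Context, commit CommitNumber, alertID string, ts TriageStatus) error
        TriageLow(ctx context.Context, commit CommitNumber, alertID string, ts TriageStatus) error
        // DeleteByCommit removes all regressions associated with a commit.
        DeleteByCommit(ctx context.Context, commit CommitNumber) error
    }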
The samplestats module is designed to perform statistical analysis on sets of performance data, specifically to identify significant changes between two sample sets, often referred to as “before” and “after” states. This is crucial for detecting regressions or improvements in performance metrics over time or across different code versions.
The core functionality revolves around comparing these two sets of samples for each trace (a unique combination of parameters identifying a specific test or metric). It calculates various statistical metrics for each set and then employs statistical tests to determine if the observed differences are statistically significant.
Key Design Choices and Implementation Details:
- The Config struct provides a centralized way to control the analysis process. This includes setting the alpha level, choosing the statistical test, enabling outlier removal, and deciding whether to include all traces in the output or only those with significant changes. This configurability makes the module adaptable to various analysis needs.
- The Result struct encapsulates the outcome of the analysis, including a list of Row structs (one per trace) and a count of skipped traces. Each Row contains the trace identifier, its parameters, the calculated metrics for both "before" and "after" samples, the percentage delta, the p-value, and any informational notes (e.g., errors during statistical test calculation). This structured output facilitates further processing or display of the results.
- Results can be sorted, by default by Delta. This allows users to quickly identify the most impactful changes. The Order type and functions like ByName, ByDelta, and Reverse provide a flexible sorting mechanism.
Responsibilities and Key Components:
analyze.go: This is the heart of the module.
- Analyze function: This is the primary entry point. It takes the Config and two maps of samples (before and after, where keys are trace IDs and values are parser.Samples). For each trace it calls calculateMetrics (from metrics.go) for both "before" and "after" samples. Depending on the Config.Test setting, it performs either the Mann-Whitney U test or the Two Sample Welch's t-test using functions from the github.com/aclements/go-moremath/stats library. If p < alpha, it calculates the percentage Delta between the means; otherwise, Delta is NaN. Each trace produces a Row struct with all the calculated information, and insignificant rows are dropped if Config.All is false. Finally, it sorts the Rows based on Config.Order (or by Delta if no order is specified) using the Sort function from sort.go, and returns a Result struct containing the list of Rows and the count of skipped traces.
- Config struct: Defines the parameters that control the analysis, such as Alpha for p-value cutoff, Order for sorting, IQRR for outlier removal, All for including all results, and Test for selecting the statistical test.
- Result struct: Encapsulates the output of the Analyze function, holding the Rows of analysis data and the Skipped count.
- Row struct: Represents the analysis results for a single trace, including its name, parameters, "before" and "after" Metrics, the percentage Delta, the P value, and any Note.
metrics.go: This file is responsible for calculating basic statistical metrics from a given set of sample values.
- calculateMetrics function: Takes a Config (primarily to check IQRR) and parser.Samples.
  - If Config.IQRR is true, it applies the Interquartile Range Rule to filter out outliers from samples.Values. The values within 1.5 * IQR from the first and third quartiles are retained.
  - It computes the Mean, StdDev (standard deviation), and Percent (coefficient of variation: StdDev / Mean * 100) of the (potentially filtered) values.
  - It returns these in a Metrics struct, along with the (potentially filtered) Values.
- Metrics struct: Holds the calculated Mean, StdDev, raw Values (after potential outlier removal), and Percent (coefficient of variation).
sort.go: This file provides utilities for sorting the results (Row slices).
- Order type: A function type func(rows []Row, i, j int) bool defining a less-than comparison for sorting Rows.
- ByName function: An Order implementation that sorts rows alphabetically by Row.Name.
- ByDelta function: An Order implementation that sorts rows by Row.Delta. It specifically places NaN delta values (insignificant changes) at the beginning.
- Reverse function: A higher-order function that takes an Order and returns a new Order that represents the reverse of the input order.
- Sort function: A convenience function that sorts a slice of Rows in place using sort.SliceStable and a given Order.
Illustrative Workflow (Simplified Analyze Process):
Input: before_samples, after_samples, config
For each trace_id in (before_samples keys + after_samples keys):
If trace_id not in before_samples OR trace_id not in after_samples:
Increment skipped_count
Continue
before_metrics = calculateMetrics(config, before_samples[trace_id])
after_metrics = calculateMetrics(config, after_samples[trace_id])
If config.Test == UTest:
p_value = MannWhitneyUTest(before_metrics.Values, after_metrics.Values)
Else (config.Test == TTest):
p_value = TwoSampleWelchTTest(before_metrics.Values, after_metrics.Values)
alpha = config.Alpha (or defaultAlpha if config.Alpha is 0)
If p_value < alpha:
delta = ((after_metrics.Mean / before_metrics.Mean) - 1) * 100
Else:
delta = NaN
If NOT config.All:
Continue // Skip if not showing all results and change is not significant
Add new Row{Name: trace_id, Delta: delta, P: p_value, ...} to results_list
Sort results_list using config.Order (or ByDelta by default)
Return Result{Rows: results_list, Skipped: skipped_count}
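The outlier-removal step lends itself to a small standalone sketch. This is not the module's calculateMetrics, just an illustration of the 1.5 * IQR rule and the derived metrics described above; the quartile indexing is deliberately simplistic.

    package example

    import (
        "math"
        "sort"
    )

    // iqrFilter keeps only values within 1.5*IQR of the first and third quartiles.
    func iqrFilter(values []float64) []float64 {
        if len(values) < 4 {
            return values
        }
        sorted := append([]float64(nil), values...)
        sort.Float64s(sorted)
        q1 := sorted[len(sorted)/4]
        q3 := sorted[(3*len(sorted))/4]
        iqr := q3 - q1
        lo, hi := q1-1.5*iqr, q3+1.5*iqr
        kept := make([]float64, 0, len(values))
        for _, v := range values {
            if v >= lo && v <= hi {
                kept = append(kept, v)
            }
        }
        return kept
    }

    // meanStdDev returns the mean, the sample standard deviation, and the
    // coefficient of variation (StdDev / Mean * 100) of the given values.
    func meanStdDev(values []float64) (mean, stddev, percent float64) {
        if len(values) < 2 {
            if len(values) == 1 {
                return values[0], 0, 0
            }
            return 0, 0, 0
        }
        for _, v := range values {
            mean += v
        }
        mean /= float64(len(values))
        for _, v := range values {
            stddev += (v - mean) * (v - mean)
        }
        stddev = math.Sqrt(stddev / float64(len(values)-1))
        return mean, stddev, stddev / mean * 100
    }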
The sheriffconfig module is responsible for managing configurations for Skia Perf's anomaly detection and alerting system. These configurations, known as “Sheriff Configs,” are defined in Protocol Buffer format and are typically stored in LUCI Config. This module handles fetching these configurations, validating them, and transforming them into a format suitable for storage and use by other Perf components, specifically the alerts and subscription modules.
The core idea is to allow users to define rules for which performance metrics they care about and how anomalies in those metrics should be detected and handled. This provides a flexible and centralized way to manage alerting for a large number of performance tests.
Key Responsibilities and Components:
Protocol Buffer Definitions (/proto/v1):
- sheriff_config.proto: Defines the main messages like SheriffConfig, Subscription, AnomalyConfig, and Rules.
  - SheriffConfig: The top-level message, containing a list of Subscriptions. This represents the entire set of alerting configurations for a Perf instance.
  - Subscription: Represents a user's or team's interest in a specific set of metrics. It includes details for creating bug reports (e.g., contact email, bug component, labels, priority, severity) and a list of AnomalyConfigs that define how to detect anomalies for the metrics covered by this subscription.
  - AnomalyConfig: Specifies the parameters for anomaly detection for a particular subset of metrics. This includes:
    - Rules: Define which metrics this AnomalyConfig applies to, using match and exclude patterns. These patterns are query strings (e.g., "master=ChromiumPerf&benchmark=Speedometer2").
    - Detection parameters such as step (algorithm for step detection), radius (commits to consider), threshold (sensitivity), minimum_num (number of interesting traces to trigger an alert), sparse (handling of missing data), k (for K-Means clustering), group_by (for breaking down clustering), direction (up, down, or both), action (no action, triage, or bisect), and algo (clustering algorithm like StepFit or KMeans).
  - Rules: Contains lists of match and exclude strings. Match strings define positive criteria for selecting metrics, while exclude strings define negative criteria. The combination allows for precise targeting of metrics.
- sheriff_config.pb.go: The Go code generated from sheriff_config.proto. This provides the Go structs and methods to work with these configurations programmatically.
- generate.go: Contains go:generate directives used to regenerate sheriff_config.pb.go whenever sheriff_config.proto changes. This ensures the Go code stays in sync with the proto definition.
Validation (/validate):
- validate.go: This is crucial for ensuring the integrity and correctness of Sheriff Configurations before they are processed or stored. It performs a series of checks:
  - The match and exclude strings in Rules are well-formed query strings. It checks for valid regex if a value starts with ~. It also enforces that exclude patterns only target a single key-value pair.
  - Each AnomalyConfig has at least one match pattern.
  - Required subscription fields like name, contact_email, bug_component, and instance are present. It also checks that each subscription has at least one AnomalyConfig.
  - The config contains at least one Subscription, and all subscription names within a config are unique.
- DeserializeProto: A helper function to convert a base64 encoded string (as typically retrieved from LUCI Config) into a SheriffConfig protobuf message.
Service (/service):
- service.go: This component orchestrates the process of fetching Sheriff Configurations from LUCI Config, processing them, and storing them in the database.
- New function: Initializes the sheriffconfigService, taking dependencies like a database connection pool (sql.Pool), subscription.Store, alerts.Store, and a luciconfig.ApiClient. If no luciconfig.ApiClient is provided, it creates one.
- ImportSheriffConfig method: This is the main entry point for importing configurations.
  - It uses the luciconfig.ApiClient to fetch configurations from a specified LUCI Config path (e.g., "skia-sheriff-configs.cfg").
  - Each fetched config is handed to processConfig.
  - It then inserts the resulting subscription_pb.Subscription objects into the subscriptionStore and all alerts.SaveRequest objects into the alertStore within a single database transaction. This ensures atomicity – either all changes are saved, or none are.
- processConfig method:
  - Unmarshals the raw content into a pb.SheriffConfig protobuf message using prototext.Unmarshal.
  - Validates the pb.SheriffConfig using validate.ValidateConfig.
  - For each pb.Subscription in the config:
    - It inspects the instance field, only processing those matching the service's configured instance (e.g., "chrome-internal"). This allows multiple Perf instances to share a config file but only import relevant subscriptions.
    - It calls makeSubscriptionEntity to convert the pb.Subscription into a subscription_pb.Subscription (the format used by the subscription module).
    - It checks whether the same subscription name and revision already exist in the subscriptionStore. If they do, it means this specific version of the subscription has already been imported, so it's skipped. This prevents redundant database writes and processing if the LUCI config file hasn't actually changed for that subscription.
    - It calls makeSaveRequests to generate alerts.SaveRequest objects for each alert defined within that subscription.
- makeSubscriptionEntity function: Transforms a pb.Subscription (from Sheriff Config proto) into a subscription_pb.Subscription (for the subscription datastore), mapping fields and applying default priorities/severities if not specified.
- makeSaveRequests function:
  - Iterates over each pb.AnomalyConfig within a pb.Subscription.
  - For each match rule within the pb.AnomalyConfig.Rules:
    - It calls buildQueryFromRules to construct the actual query string that will be used to select metrics for this alert.
    - It calls createAlert to create an alerts.Alert object, populating it with parameters from the pb.AnomalyConfig and the parent pb.Subscription.
    - It wraps the alerts.Alert in an alerts.SaveRequest along with the subscription name and revision.
- createAlert function: Populates an alerts.Alert struct. This involves:
  - Mapping the proto enums (AnomalyConfig_Step, AnomalyConfig_Direction, AnomalyConfig_Action, AnomalyConfig_Algo) to their corresponding internal types used by the alerts module (e.g., alerts.Direction, types.RegressionDetectionGrouping, types.StepDetection, types.AlertAction). This is done using maps like directionMap, clusterAlgoMap, etc.
  - Applying defaults for radius, minimum_num, sparse, k, group_by if they are not explicitly set in the AnomalyConfig.
- buildQueryFromRules function: Constructs a canonical query string from a match string and a list of exclude strings. It parses them as URL query parameters, combines them (with ! for excludes), sorts the parts alphabetically, and joins them with &.
This ensures that equivalent rules always produce the same query string (a sketch of this canonicalization appears at the end of this section).
- getPriorityFromProto and getSeverityFromProto functions: Convert the enum values for priority and severity from the proto definition to the integer values expected by the subscription module, applying defaults if the proto value is "unspecified."
- StartImportRoutine and ImportSheriffConfigOnce: Provide functionality to periodically fetch and import configurations, making the system self-updating when LUCI configs change.
Workflow: Importing a Sheriff Configuration
LUCI Config Change (e.g., new revision of skia-sheriff-configs.cfg)
|
v
Sheriffconfig Service (triggered by timer or manual call)
|
|--- 1. luciconfigApiClient.GetProjectConfigs("skia-sheriff-configs.cfg") --> Fetches raw config content + revision
|
v
For each config file content:
|
|--- 2. processConfig(configContent, revision)
| |
| |--- 2a. prototext.Unmarshal(configContent) --> pb.SheriffConfig
| |
| |--- 2b. validate.ValidateConfig(pb.SheriffConfig) --> Error or OK
| |
| v
| For each pb.Subscription in pb.SheriffConfig:
| |
| |--- 2c. If subscription.Instance != service.Instance --> Skip
| |
| |--- 2d. subscriptionStore.GetSubscription(name, revision) --> ExistingSubscription?
| |
| |--- 2e. If ExistingSubscription == nil (new or updated):
| | |
| | |--- makeSubscriptionEntity(pb.Subscription, revision) --> subscription_pb.Subscription
| | |
| | |--- makeSaveRequests(pb.Subscription, revision)
| | | |
| | | v
| | | For each pb.AnomalyConfig in pb.Subscription:
| | | |
| | | v
| | | For each matchRule in pb.AnomalyConfig.Rules:
| | | |
| | | |--- buildQueryFromRules(matchRule, excludeRules) --> queryString
| | | |
| | | |--- createAlert(queryString, pb.AnomalyConfig, pb.Subscription, revision) --> alerts.Alert
| | | |
| | | ---> Collect alerts.SaveRequest
| | |
| | ---> Collect subscription_pb.Subscription
|
v
Database Transaction (BEGIN)
|
|--- 3. subscriptionStore.InsertSubscriptions(collected_subscriptions)
|
|--- 4. alertStore.ReplaceAll(collected_save_requests)
|
Database Transaction (COMMIT or ROLLBACK)
This module acts as a critical bridge, translating human-readable (and machine-parsable via proto) alerting definitions into the concrete data structures used by Perf's backend alerting and subscription systems. The validation step is key to preventing malformed configurations from breaking the alerting pipeline. The revision checking mechanism ensures efficiency by only processing changes.
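To make the query canonicalization concrete, here is a hedged sketch of a buildQueryFromRules-style helper. It follows the description above (parse as URL query parameters, prefix excluded values with !, sort, join with &), but it is not the module's actual implementation.

    package example

    import (
        "fmt"
        "net/url"
        "sort"
        "strings"
    )

    // buildQuery canonicalizes a match rule plus exclude rules into a single
    // query string: "key=value" parts, excludes as "key=!value", sorted and
    // joined with "&" so equivalent rules always yield the same string.
    func buildQuery(match string, excludes []string) (string, error) {
        values, err := url.ParseQuery(match)
        if err != nil {
            return "", fmt.Errorf("invalid match rule %q: %w", match, err)
        }
        parts := []string{}
        for key, vals := range values {
            for _, v := range vals {
                parts = append(parts, fmt.Sprintf("%s=%s", key, v))
            }
        }
        for _, ex := range excludes {
            exValues, err := url.ParseQuery(ex)
            if err != nil {
                return "", fmt.Errorf("invalid exclude rule %q: %w", ex, err)
            }
            for key, vals := range exValues {
                for _, v := range vals {
                    parts = append(parts, fmt.Sprintf("%s=!%s", key, v))
                }
            }
        }
        sort.Strings(parts)
        return strings.Join(parts, "&"), nil
    }

Under this sketch, a match of "master=ChromiumPerf&benchmark=Speedometer2" with a hypothetical exclude rule "test=busy_loop" produces benchmark=Speedometer2&master=ChromiumPerf&test=!busy_loop.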
The shortcut module provides functionality for creating, storing, and retrieving “shortcuts”. A shortcut is essentially a named list of trace keys. These trace keys typically represent specific performance metrics or configurations. The primary purpose of shortcuts is to provide a convenient way to refer to a collection of traces with a short, memorable identifier, rather than having to repeatedly specify the full list of keys. This is particularly useful for sharing links to specific views in the Perf UI or for programmatic access to predefined sets of performance data.
The core component is the Store interface, defined in shortcut.go. This interface abstracts the underlying storage mechanism, allowing different implementations to be used (e.g., in-memory for testing, SQL database for production). The key operations defined by the Store interface are:
- Insert: Adds a new shortcut to the store. It takes an io.Reader containing the shortcut data (typically JSON) and returns a unique ID for the shortcut.
- InsertShortcut: Similar to Insert, but takes a Shortcut struct directly.
- Get: Retrieves a shortcut given its ID.
- GetAll: Returns a channel that streams all stored shortcuts. This is useful for tasks like data migration.
- DeleteShortcut: Removes a shortcut from the store.
A Shortcut itself is a simple struct containing a slice of strings, where each string is a trace key.
The generation of shortcut IDs is handled by the IDFromKeys function. This function takes a Shortcut struct, sorts its keys alphabetically (to ensure that the order of keys doesn't affect the ID), and then computes an MD5 hash of the concatenated keys. A prefix “X” is added to this hash for historical reasons, maintaining compatibility with older systems. This deterministic ID generation ensures that the same set of keys will always produce the same shortcut ID.
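A hedged sketch of that ID scheme; the real IDFromKeys may differ in details such as how the sorted keys are concatenated before hashing.

    package example

    import (
        "crypto/md5"
        "fmt"
        "sort"
        "strings"
    )

    // Shortcut is a named list of trace keys.
    type Shortcut struct {
        Keys []string `json:"keys"`
    }

    // idFromKeys sorts the keys so ordering doesn't matter, hashes them with
    // MD5, and prefixes the hex digest with "X" for compatibility with older IDs.
    func idFromKeys(s Shortcut) string {
        keys := append([]string(nil), s.Keys...)
        sort.Strings(keys)
        sum := md5.Sum([]byte(strings.Join(keys, "\n"))) // separator is an assumption
        return "X" + fmt.Sprintf("%x", sum)
    }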
Workflow for creating and retrieving a shortcut:
Creation:
Client Code ---(JSON data or Shortcut struct)---> Store.Insert or Store.InsertShortcut
Store ---(Generates ID using IDFromKeys, marshals to JSON if needed)---> Underlying Storage (e.g., SQL DB)
Underlying Storage ---> Store ---(Returns Shortcut ID)---> Client Code
Retrieval:
Client Code ---(Shortcut ID)---> Store.Get
Store ---(Queries by ID)---> Underlying Storage (e.g., SQL DB)
Underlying Storage ---(Returns stored JSON or data)---> Store
Store ---(Unmarshals to Shortcut struct, sorts keys)---> Client Code (receives Shortcut struct)
The sqlshortcutstore subdirectory provides a concrete implementation of the Store interface using an SQL database (specifically designed for CockroachDB, as indicated by test setup and migration references). The sqlshortcutstore.go file contains the logic for interacting with the database, including SQL statements for inserting, retrieving, and deleting shortcuts. Shortcut data is stored as JSON strings in the database. The schema for the Shortcuts table is implicitly defined by the SQL statements and further clarified in sqlshortcutstore/schema/schema.go, which defines a ShortcutSchema struct mirroring the table structure (though this struct is primarily for documentation or ORM-like purposes and not directly used in the raw SQL interaction in sqlshortcutstore.go).
Testing is a significant aspect of this module:
- shortcut_test.go contains unit tests for the IDFromKeys function, ensuring its correctness and deterministic behavior.
- shortcuttest provides a suite of common tests (InsertGet, GetNonExistent, GetAll, DeleteShortcut) that can be run against any implementation of the shortcut.Store interface. This promotes consistency and ensures that different store implementations behave as expected. The InsertGet test, for example, verifies that a stored shortcut can be retrieved and that the keys are sorted upon retrieval, even if they were not sorted initially.
- sqlshortcutstore_test.go utilizes the tests from shortcuttest to validate the SQLShortcutStore implementation against a test database.
- mocks/Store.go provides a mock implementation of the Store interface, generated by the mockery tool. This is useful for testing components that depend on shortcut.Store without needing a real storage backend.
The go/sql module serves as the central hub for managing the SQL database schema used by the Perf application. It defines the structure of the database tables and provides utilities for schema generation, validation, and migration. This module ensures that the application's database schema is consistent, well-defined, and can evolve smoothly over time.
Key Responsibilities and Components:
Schema Definition (schema.go, spanner/schema_spanner.go):
- These files hold the CREATE TABLE statements that define the structure of all tables used by Perf. Having the schema defined in code (generated from Go structs) provides a single source of truth and allows for easier version control and programmatic manipulation.
- schema.go: Defines the schema for CockroachDB.
- spanner/schema_spanner.go: Defines the schema for Spanner. Spanner has slightly different SQL syntax and features (e.g., TTL INTERVAL), necessitating a separate schema definition.
- Both are generated by the tosql utility (see below). This ensures that the SQL schema accurately reflects the Go struct definitions in other modules (e.g., perf/go/alerts/sqlalertstore/schema).
- In addition to the CREATE TABLE statements, these files also export slices of strings representing the column names for each table. This can be useful for constructing SQL queries programmatically.
This file defines a struct named Tables which aggregates all the individual table schema structs from various Perf sub-modules (like alerts, anomalygroup, git, etc.). The Tables struct serves as the input to the tosql schema generator. By referencing schema structs from other modules, it ensures that the generated SQL schema is consistent with how data is represented and manipulated throughout the application. The //go:generate directives at the top of this file trigger the tosql utility to regenerate the schema files when necessary.
This utility generates the schema files (schema.go and spanner/schema_spanner.go) from the Go struct definitions. It takes the sql.Tables struct (defined in tables.go) as input and uses the go/sql/exporter module to translate the Go struct tags and field types into corresponding SQL CREATE TABLE statements. It supports different SQL dialects (CockroachDB and Spanner) and can handle specific features like Spanner's TTL (Time To Live) for tables. The schemaTarget flag controls which database dialect is generated.
Why: As the application evolves, the database schema needs to change. This submodule manages schema migrations, ensuring that the live database can be updated to new versions without downtime or data loss. It also validates that the current database schema matches an expected version.
How:
embed.go: This file uses go:embed to embed JSON representations of the current (schema.json, schema_spanner.json) and previous (schema_prev.json, schema_prev_spanner.json) expected database schemas. These JSON files are generated by the exportschema utility. Load() and LoadPrev() functions provide access to these deserialized schema descriptions.
migrate.go: This is the core of the schema migration logic.
It defines SQL statements (FromLiveToNext, FromNextToLive, and their Spanner equivalents) that describe how to upgrade the database from the "previous" schema version to the "next" (current) schema version, and how to roll back that change. Crucially, schema changes must be backward and forward compatible because during a deployment, old and new versions of the application might run concurrently.
ValidateAndMigrateNewSchema is the key function. It compares the actual live schema against the expected previous (prev) and current (next) schemas:
- If actual == next, no migration is needed.
- If actual == prev and actual != next, it executes the FromLiveToNext SQL statements to upgrade the database schema.
- If actual matches neither prev nor next, it indicates an unexpected schema state and returns an error, preventing application startup. This is a critical safety check.
During a deployment the migration runs first: Maintenance task (runs ValidateAndMigrateNewSchema) -> New instances (frontend, ingesters)
Deployment Starts
|
V
Maintenance Task Runs
|
+------------------------------------+
| Calls ValidateAndMigrateNewSchema |
+------------------------------------+
|
V
Is schema == previous_expected_schema? --Yes--> Apply `FromLiveToNext` SQL
| No |
V V
Is schema == current_expected_schema? ---Yes---> Migration Successful / No Action
| No
V
Error: Schema mismatch! Halt.
|
V
New Application Instances Start (if migration was successful)
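A hedged Go sketch of that decision logic; the schema-description comparison and the statement parameters are simplified stand-ins for what expectedschema actually does.

    package example

    import (
        "context"
        "errors"
        "fmt"
    )

    // db is a minimal stand-in for the SQL connection pool used by the real code.
    type db interface {
        Exec(ctx context.Context, stmt string) error
    }

    // validateAndMigrate compares the live schema description against the
    // expected previous and next descriptions and applies the upgrade if needed.
    func validateAndMigrate(ctx context.Context, conn db, actual, prev, next, fromLiveToNext string) error {
        switch {
        case actual == next:
            return nil // Already at the current schema; nothing to do.
        case actual == prev:
            // One step behind: apply the forward migration.
            if err := conn.Exec(ctx, fromLiveToNext); err != nil {
                return fmt.Errorf("applying migration: %w", err)
            }
            return nil
        default:
            // Unknown state: refuse to start rather than risk corrupting data.
            return errors.New("live schema matches neither the previous nor the current expected schema")
        }
    }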
Test files (migrate_test.go, migrate_spanner_test.go): These files contain unit tests to verify the schema migration logic for both CockroachDB and Spanner. They test scenarios where no migration is needed, migration is required, and the schema is in an invalid state.
Schema Export Utility (exportschema/main.go):
The expectedschema submodule needs JSON representations of the "current" and "previous" database schemas to perform validation and migration. This utility generates these JSON files. It takes the sql.Tables struct (for CockroachDB) or spanner.Schema (for Spanner) and uses the go/sql/schema/exportschema module to serialize the schema description into a JSON format. The output of this utility is typically checked into version control as schema.json, schema_prev.json, etc., within the expectedschema directory. The typical workflow for a schema change involves:
1. Updating the Go schema struct in the relevant module (e.g., alerts.AlertSchema).
2. Running go generate ./... within perf/go/sql/ to regenerate schema.go and spanner/schema_spanner.go.
3. Copying the previous expectedschema/schema.json to expectedschema/schema_prev.json (and similarly for Spanner).
4. Running the exportschema binary (e.g., bazel run //perf/go/sql/exportschema -- --out perf/go/sql/expectedschema/schema.json) to generate the new expectedschema/schema.json.
5. Adding the FromLiveToNext and FromNextToLive SQL statements in expectedschema/migrate.go.
6. Updating sql_test.go (LiveSchema, DropTables) if necessary.
Testing Utilities (sqltest/sqltest.go):
- NewCockroachDBForTests: Sets up a connection to a local CockroachDB instance (managed by cockroachdb_instance.Require), creates a new temporary database for the test, applies the current sql.Schema, and registers a cleanup function to drop the database after the test.
- NewSpannerDBForTests: Similarly, sets up a connection to a local Spanner emulator (via PGAdapter, required by pgadapter.Require), applies the current spanner.Schema, and prepares it for tests.
These tests define DropTables (to clean up) and LiveSchema / LiveSchemaSpanner, where LiveSchema represents the schema before the latest change defined in expectedschema/migrate.go's FromLiveToNext. Each test then:
- Applies DropTables to ensure a clean slate.
- Applies LiveSchema to simulate the state of the database before the pending migration.
- Applies expectedschema.FromLiveToNext (or its Spanner equivalent).
- Compares the result with applying sql.Schema (or spanner.Schema) directly to a fresh database (which represents the target state). They should be identical.
This comprehensive approach to schema management ensures that Perf's database can be reliably deployed, maintained, and evolved. The separation of concerns (schema definition, generation, validation, migration, and testing) makes the system robust and easier to understand.
The stepfit module is designed to analyze time-series data, specifically performance traces, to detect significant changes or “steps.” It employs various statistical algorithms to determine if a step up (performance improvement), a step down (performance regression), or no significant change has occurred in the data. This module is crucial for automated performance monitoring, allowing for the identification of impactful changes in system behavior.
The core idea is to fit a step function to the input trace data. A step function is a simple function that is constant except for a single jump (the “step”) at a particular point (the “turning point”). The module calculates the best fit for such a function and then evaluates the characteristics of this fit to determine the nature and significance of the step.
Key Components and Logic:
The primary entity in this module is the StepFit struct. It encapsulates the results of the step detection analysis:
- LeastSquares: This field stores the Least Squares Error (LSE) of the fitted step function. A lower LSE generally indicates a better fit of the step function to the data. It's important to note that not all step detection algorithms calculate or use LSE; in such cases, this field is set to InvalidLeastSquaresError.
- TurningPoint: This integer indicates the index in the input trace where the step function changes its value. It essentially marks the location of the detected step.
- StepSize: This float represents the magnitude of the change in the step function. A negative StepSize implies a step up in the trace values (conventionally a performance regression, e.g., increased latency). Conversely, a positive StepSize indicates a step down (conventionally a performance improvement, e.g., decreased latency).
- Regression: This value is a metric used to quantify the significance or “interestingness” of the detected step. Its calculation varies depending on the chosen stepDetection algorithm.
  - For the OriginalStep algorithm, it's calculated as StepSize / LSE (or StepSize / stddevThreshold if LSE is too small). A larger absolute value of Regression implies a more significant step.
  - For AbsoluteStep, PercentStep, and CohenStep, Regression is directly related to the StepSize (or a normalized version of it).
  - For MannWhitneyU, Regression represents the p-value of the test.
- Status: This is an enumerated type (StepFitStatus) indicating the overall assessment of the step:
  - LOW: A step down was detected, often interpreted as a performance improvement.
  - HIGH: A step up was detected, often interpreted as a performance regression.
  - UNINTERESTING: No significant step was found.
The main function responsible for performing the analysis is GetStepFitAtMid. It takes the following inputs:
- trace: A slice of float32 representing the time-series data to be analyzed.
- stddevThreshold: A threshold for standard deviation. This is used in the OriginalStep algorithm for normalizing the trace and as a floor for standard deviation in other algorithms like CohenStep to prevent division by zero or near-zero values.
- interesting: A threshold value used to determine if a calculated Regression value is significant enough to be classified as HIGH or LOW. The exact interpretation of this threshold depends on the stepDetection algorithm.
- stepDetection: An enumerated type (types.StepDetection) specifying which algorithm to use for step detection.
Workflow of GetStepFitAtMid:
Initialization and Preprocessing:
- A StepFit struct is initialized with Status set to UNINTERESTING.
- If the trace is shorter than minTraceSize (currently 3), the function returns the initialized StepFit as there isn't enough data to analyze.
- If stepDetection is types.OriginalStep, the input trace is duplicated and normalized (mean centered and scaled by its standard deviation, unless the standard deviation is below stddevThreshold).
- For all other stepDetection types, if the trace has an odd length, the last element is dropped to make the trace length even. This is because these algorithms typically compare the first half of the trace with the second half.
- **`types.OriginalStep`:**
- Calculates the mean of the first half (`y0`) and the second half
(`y1`) of the (normalized) trace.
- Computes the Sum of Squared Errors (SSE) for fitting `y0` to the
first half and `y1` to the second half. The `LeastSquares` error
(`lse`) is derived from this SSE.
- `StepSize` is `y0 - y1`.
    - `Regression` is calculated as `StepSize / lse` (or `StepSize /
      stddevThreshold` if `lse` is too small). Note: The original implementation has a slight deviation from the standard definition of standard error in this calculation.
- **`types.AbsoluteStep`:**
- `StepSize` is `y0 - y1`.
- `Regression` is simply the `StepSize`.
- The step is considered interesting if the absolute value of
`StepSize` meets the `interesting` threshold.
- **`types.Const`:**
- This algorithm behaves differently. It focuses on the absolute value
of the trace point at the `TurningPoint` (`trace[i]`).
- `StepSize` is `abs(trace[i]) - interesting`.
- `Regression` is `-1 * abs(trace[i])`. This is done so that larger
deviations (regressions) result in more negative `Regression`
values, which are then flagged as `HIGH`.
- **`types.PercentStep`:**
- `StepSize` is `(y0 - y1) / y0`, representing the percentage change
relative to the mean of the first half.
- Handles potential `Inf` or `NaN` results from the division (e.g., if
`y0` is zero).
- `Regression` is the `StepSize`.
- **`types.CohenStep`:**
- Calculates Cohen's d, a measure of effect size.
- `StepSize` is `(y0 - y1) / s_pooled`, where `s_pooled` is the pooled
standard deviation of the two halves (or `stddevThreshold` if
`s_pooled` is too small or NaN).
- `Regression` is the `StepSize`.
- **`types.MannWhitneyU`:**
- Performs a Mann-Whitney U test (a non-parametric test) to determine
if the two halves of the trace come from different distributions.
- `StepSize` is `y0 - y1`.
- `Regression` is the p-value of the test.
- `LeastSquares` is set to the U-statistic from the test.
Status Determination:
- For types.MannWhitneyU:
  - If Regression (the p-value) is less than or equal to the interesting threshold (e.g., 0.05), a significant difference is detected.
  - The Status (HIGH or LOW) is then determined by the sign of StepSize. If StepSize is negative (step up), Status is HIGH. Otherwise, it's LOW.
  - The Regression value is then negated if the status is HIGH to align with the convention that more negative values are "worse."
- For the other algorithms:
  - If Regression is greater than or equal to interesting, Status is LOW.
  - If Regression is less than or equal to -interesting, Status is HIGH.
  - Otherwise, Status remains UNINTERESTING.
Return Result: The populated StepFit struct, containing LeastSquares, TurningPoint, StepSize, Regression, and Status, is returned.
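To make the split-at-the-middle idea concrete, here is a small, self-contained sketch of the AbsoluteStep-style comparison described above. It is not the module's code: the real GetStepFitAtMid supports several algorithms, normalization, LSE bookkeeping, and the StepFit result struct.

```go
package main

import "fmt"

// stepAtMid splits the trace at its midpoint, compares the means of the two
// halves, and classifies the difference against the 'interesting' threshold.
func stepAtMid(trace []float32, interesting float32) (stepSize float32, status string) {
	if len(trace) < 3 {
		return 0, "UNINTERESTING"
	}
	if len(trace)%2 == 1 {
		trace = trace[:len(trace)-1] // Drop the last point so the halves are equal.
	}
	mid := len(trace) / 2
	mean := func(v []float32) float32 {
		var sum float32
		for _, x := range v {
			sum += x
		}
		return sum / float32(len(v))
	}
	y0, y1 := mean(trace[:mid]), mean(trace[mid:])
	stepSize = y0 - y1
	switch {
	case stepSize >= interesting:
		status = "LOW" // Step down: values decreased, conventionally an improvement.
	case stepSize <= -interesting:
		status = "HIGH" // Step up: values increased, conventionally a regression.
	default:
		status = "UNINTERESTING"
	}
	return stepSize, status
}

func main() {
	trace := []float32{10, 10, 10, 10, 20, 20, 20, 20}
	size, status := stepAtMid(trace, 5)
	fmt.Println(size, status) // -10 HIGH
}
```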
Design Rationale:
- Supporting multiple algorithms (OriginalStep, AbsoluteStep, PercentStep, CohenStep, MannWhitneyU) provides flexibility. Different datasets and performance characteristics may be better suited to different statistical approaches. For instance, MannWhitneyU is non-parametric and makes fewer assumptions about the data distribution, which can be beneficial for noisy or non-Gaussian data. AbsoluteStep and PercentStep offer simpler, more direct ways to define a regression based on absolute or relative changes.
- A single GetStepFitAtMid function consolidates the logic for all supported algorithms, making it easier to manage and extend.
- StepFit structure: The StepFit struct provides a well-defined way to communicate the results of the analysis, separating the raw metrics (like StepSize, LeastSquares) from the final interpretation (Status).
- interesting threshold: The interesting parameter allows users to customize the sensitivity of the step detection. This is crucial because what constitutes a "significant" change can vary greatly depending on the context of the performance metric being monitored.
- stddevThreshold: This parameter helps in handling cases with very low variance, preventing numerical instability (like division by zero) and ensuring that normalization in OriginalStep behaves reasonably.
- The GetStepFitAtMid name implies that the step detection is focused around the middle of the trace. This is a common approach for detecting a single, prominent step. More complex scenarios with multiple steps would require different techniques.
Why specific implementation choices?
- Normalization in OriginalStep: Normalizing the trace in the OriginalStep algorithm (as described in the linked blog post) aims to make the detection less sensitive to the absolute scale of the data and more focused on the relative change.
- Even trace lengths outside OriginalStep: For algorithms other than OriginalStep, ensuring an even trace length by potentially dropping the last point simplifies the division of the trace into two equal halves for comparison.
- Handling Inf and NaN in PercentStep: Explicitly checking for and handling Inf and NaN values that can arise from division by zero (when y0 is zero) makes the PercentStep calculation more robust.
- Regression as p-value for MannWhitneyU: Using the p-value as the Regression metric for MannWhitneyU directly reflects the statistical significance of the observed difference between the two halves of the trace. The interesting threshold then acts as the significance level (alpha).
- InvalidLeastSquaresError: This constant provides a clear way to indicate when LSE is not applicable or not calculated by a particular algorithm, avoiding confusion with a calculated LSE of 0 or a negative value.
In essence, the stepfit module provides a toolkit for identifying abrupt changes in performance data, offering different lenses (algorithms) through which to view and quantify these changes. The design prioritizes flexibility in algorithm choice and user-configurable sensitivity to cater to diverse performance analysis needs.
The subscription module manages alerting configurations, known as subscriptions, for anomalies detected in performance data. It provides the means to define, store, and retrieve these configurations.
The core concept is that a “subscription” dictates how the system should react when an anomaly is found. This includes details like which bug tracker component to file an issue under, what labels to apply, who to CC on the bug, and the priority/severity of the issue. This allows for automated and consistent handling of performance regressions.
Subscriptions are versioned using an infra_internal Git hash (revision). This allows for tracking changes to subscription configurations over time and ensures that the correct configuration is used based on the state of the infrastructure code.
Key Components and Files:
store.go: Defines the Store interface. This interface is the central abstraction for interacting with subscription data. It dictates the operations that any concrete subscription storage implementation must provide. This design choice allows for flexibility in the underlying storage mechanism (e.g., SQL database, in-memory store for testing).
- GetSubscription: Retrieves a specific version of a subscription.
- GetActiveSubscription: Retrieves the currently active version of a subscription by its name. This is likely the most common retrieval method for active alerting.
- InsertSubscriptions: Allows for batch insertion of new subscriptions. This is typically done within a database transaction to ensure atomicity – either all subscriptions are inserted, or none are. This is crucial when updating configurations, as it prevents a partially updated state. The implementation in sqlsubscriptionstore deactivates all existing subscriptions before inserting the new ones as active, effectively replacing the entire active set.
- GetAllSubscriptions: Retrieves all historical versions of all subscriptions.
- GetAllActiveSubscriptions: Retrieves all currently active subscriptions. This is useful for systems that need to know all current alerting rules.
proto/v1/subscription.proto: Defines the structure of a Subscription using Protocol Buffers. This is the canonical data model for subscriptions. Key fields include name, revision, bug_labels, hotlists, bug_component, bug_priority, bug_severity, bug_cc_emails, and contact_email. Each field directly maps to a configuration aspect for bug filing and contact information.
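To make the shape of this contract concrete, the sketch below shows the Store interface alongside a struct mirroring the proto fields just listed. The field types, parameter lists, and transaction handling are illustrative assumptions; the real definitions live in store.go and the generated pb package.

```go
package subscription

import "context"

// Subscription mirrors the fields listed for proto/v1/subscription.proto; the
// real code uses the generated pb.Subscription type instead of this struct,
// and the priority/severity types are assumptions.
type Subscription struct {
	Name         string
	Revision     string
	BugLabels    []string
	Hotlists     []string
	BugComponent string
	BugPriority  int32
	BugSeverity  int32
	BugCCEmails  []string
	ContactEmail string
}

// Store sketches the interface described above; signatures are illustrative.
type Store interface {
	GetSubscription(ctx context.Context, name, revision string) (*Subscription, error)
	GetActiveSubscription(ctx context.Context, name string) (*Subscription, error)
	InsertSubscriptions(ctx context.Context, subs []*Subscription) error
	GetAllSubscriptions(ctx context.Context) ([]*Subscription, error)
	GetAllActiveSubscriptions(ctx context.Context) ([]*Subscription, error)
}
```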
sqlsubscriptionstore/sqlsubscriptionstore.go: Provides a concrete implementation of the Store interface using an SQL database (specifically designed for CockroachDB, as indicated by the use of pgx). When inserting subscriptions, it first deactivates all existing subscriptions and then inserts the new ones as active. This ensures that only the latest set of configurations is considered active. The is_active boolean column in the database schema (sqlsubscriptionstore/schema/schema.go) is key to this "active version" concept.
sqlsubscriptionstore/schema/schema.go: Defines the SQL table schema for storing subscriptions.
The primary key is the combination of name and revision. This allows multiple versions of the same named subscription to exist, identified by their revision. The is_active field differentiates the current version from historical ones.
mocks/Store.go: Contains a mock implementation of the Store interface, generated by the mockery tool.
It is used in unit tests of components that depend on the Store interface without requiring an actual database connection. This makes tests faster, more reliable, and isolates the unit under test.
Key Workflows:
Updating Subscriptions: This typically happens when configurations in infra_internal are changed.
External Process (e.g., config syncer)
|
v
Reads new subscription definitions (likely from files)
|
v
Parses definitions into []*pb.Subscription
|
v
Calls store.InsertSubscriptions(ctx, newSubscriptions, tx)
|
|--> [SQL Transaction Start]
| |
| v
| sqlsubscriptionstore: Deactivate all existing subscriptions (UPDATE Subscriptions SET is_active=false WHERE is_active=true)
| |
| v
| sqlsubscriptionstore: Insert each new subscription with is_active=true (INSERT INTO Subscriptions ...)
| |
| v
|--> [SQL Transaction Commit/Rollback]
This ensures that the update is atomic. If any part fails, the transaction is rolled back, leaving the previous set of active subscriptions intact.
Anomaly Detection Triggering Alerting:
Anomaly Detector
  |
  v
Identifies an anomaly and the relevant subscription name (e.g., based on metric patterns)
  |
  v
Calls store.GetActiveSubscription(ctx, subscriptionName)
  |
  v
sqlsubscriptionstore: Retrieves the active subscription (SELECT ... FROM Subscriptions WHERE name=$1 AND is_active=true)
  |
  v
Anomaly Detector uses the pb.Subscription details (bug component, labels, etc.) to file a bug.
This module provides a robust and versioned way to manage alerting rules, ensuring that performance regressions are handled consistently and routed appropriately. The separation of interface and implementation, along with the use of Protocol Buffers, contributes to a maintainable and extensible system.
The tracecache module provides a mechanism for caching trace identifiers (trace IDs) associated with specific tiles and queries. This caching layer significantly improves performance by reducing the need to repeatedly compute or fetch trace IDs, which can be a computationally expensive operation.
Core Functionality & Design Rationale:
The primary purpose of tracecache is to store and retrieve lists of trace IDs. Trace IDs are represented as paramtools.Params, which are essentially key-value pairs that uniquely identify a specific trace within the performance monitoring system.
The caching strategy is built around the concept of a “tile” and a “query.”
- A tile refers to a fixed range of commits (see tile_size), so cached results are scoped to a particular window of history.
- A query, represented by query.Query, defines the specific parameters used to filter traces. Different queries will yield different sets of trace IDs.
By combining the tile number and a string representation of the query, a unique cache key is generated. This ensures that cached data is specific to the exact combination of commit range and filter criteria.
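The key construction and JSON round trip are simple enough to show directly. The sketch below is self-contained and only illustrative: Params stands in for paramtools.Params, the query string stands in for query.Query.KeyValueString(), and the cache calls are represented by comments.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Params stands in for paramtools.Params (a trace ID as key=value pairs).
type Params map[string]string

// cacheKey mirrors the scheme described above: tile number and the query's
// key-value string joined by an underscore.
func cacheKey(tileNumber int, queryKeyValues string) string {
	return fmt.Sprintf("%d_%s", tileNumber, queryKeyValues)
}

func main() {
	traceIDs := []Params{
		{"arch": "x86", "config": "8888", "test": "draw_a_circle"},
		{"arch": "arm", "config": "8888", "test": "draw_a_circle"},
	}

	key := cacheKey(42, "config=8888&test=draw_a_circle") // Query string format is illustrative.

	// Serialize to JSON before handing the value to the cache backend.
	b, _ := json.Marshal(traceIDs)
	// ... cacheClient.SetValue(ctx, key, string(b)) would go here ...

	// On a cache hit, the JSON is decoded back into the same structure.
	var decoded []Params
	_ = json.Unmarshal(b, &decoded)
	fmt.Println(key, len(decoded))
}
```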
The module relies on an external caching implementation provided via the go/cache.Cache interface. This design choice promotes flexibility, allowing different caching backends (e.g., in-memory, Redis, Memcached) to be used without modifying the tracecache logic itself. This separation of concerns is crucial for adapting to various deployment environments and performance requirements.
Key Components:
traceCache.go: This is the sole file in the module and contains the implementation of the TraceCache struct and its associated methods.
- TraceCache struct: Holds a cache.Cache. This is the underlying cache client used for storing and retrieving data.
- New(cache cache.Cache) *TraceCache: Constructs a TraceCache. It takes a cache.Cache instance as an argument, which will be used for all caching operations. This dependency injection allows the caller to provide any cache implementation that conforms to the cache.Cache interface.
- CacheTraceIds(ctx context.Context, tileNumber types.TileNumber, q *query.Query, traceIds []paramtools.Params) error:
  - Generates a cacheKey using the tileNumber and the query.Query.
  - The traceIds (a slice of paramtools.Params) are then serialized into a JSON string using the toJSON helper function. This serialization is necessary because most cache backends store data as strings or byte arrays. JSON is chosen for its human-readability and widespread support.
  - Calls the cacheClient.SetValue method to store the JSON string under the generated cacheKey.
- GetTraceIds(ctx context.Context, tileNumber types.TileNumber, q *query.Query) ([]paramtools.Params, error):
  - Generates the cacheKey in the same way as CacheTraceIds.
  - Retrieves the cached value via cacheClient.GetValue.
  - If there is no cached value (cacheJson is empty), it returns nil for both the trace IDs and the error, indicating a cache miss.
  - Otherwise, the JSON string is deserialized back into paramtools.Params using json.Unmarshal.
- traceIdCacheKey(tileNumber types.TileNumber, q query.Query) string: Builds the key from the tileNumber (an integer) and a string representation of the query.Query (obtained via q.KeyValueString()) separated by an underscore. This format ensures uniqueness and provides some human-readable context within the cache keys.
- toJSON(obj interface{}) (string, error): Serializes a value to JSON; it is applied to the []paramtools.Params before caching.
Workflow for Caching Trace IDs:
1. Inputs: tileNumber, query.Query, []paramtools.Params (trace IDs to cache).
2. CacheTraceIds is called.
3. traceIdCacheKey(tileNumber, query) generates a unique key: tileNumber + "_" + query.KeyValueString() ---> cacheKey
4. toJSON(traceIds) serializes the list of trace IDs into a JSON string: []paramtools.Params --json.Marshal--> jsonString
5. t.cacheClient.SetValue(ctx, cacheKey, jsonString) stores the JSON string in the underlying cache.
Workflow for Retrieving Trace IDs:
1. Inputs: tileNumber, query.Query.
2. GetTraceIds is called.
3. traceIdCacheKey(tileNumber, query) generates the cache key (same logic as above): tileNumber + "_" + query.KeyValueString() ---> cacheKey
4. t.cacheClient.GetValue(ctx, cacheKey) attempts to retrieve the value from the cache: cacheClient --GetValue(cacheKey)--> jsonString (or empty if not found)
5. If jsonString is empty (cache miss): return nil, nil.
6. If jsonString is not empty (cache hit): json.Unmarshal([]byte(jsonString), &traceIds) deserializes the JSON string back into []paramtools.Params: jsonString --json.Unmarshal--> []paramtools.Params
7. Return the []paramtools.Params and a nil error.

The tracefilter module provides a mechanism to organize and filter trace data based on their hierarchical paths. The core idea is to represent traces within a tree structure, where each node in the tree corresponds to a segment of the trace's path. This allows for efficient filtering of traces, specifically to identify "leaf" traces – those that do not have any further sub-paths.
This approach is particularly useful in scenarios where traces have a parent-child relationship implied by their path structure. For instance, in performance analysis, a trace like /root/p1/p2/p3/t1 might represent a specific test (t1) under a series of nested configurations (p1, p2, p3). If there's another trace /root/p1/p2, it could be considered a “parent” or an aggregate trace. The tracefilter helps in identifying only the most specific, or “leaf,” traces, effectively filtering out these higher-level parent traces.
The primary component is the TraceFilter struct.
TraceFilter struct:
- traceKey: A string identifier associated with the trace path ending at this node. For the root of the tree, this is initialized to “HEAD”.
- value: The string value of the current path segment this node represents.
- children: A map where keys are the next path segments and values are pointers to child TraceFilter nodes. This map forms the branches of the tree.
Using a map for children allows for efficient lookup and addition of child nodes based on the next path segment, and keeping a traceKey at each node allows associating an identifier with a complete path as it's being built.
TraceFilter tree.TraceFilter node. The traceKey is set to “HEAD” as a sentinel value for the root, and its children map is initialized as empty, ready to have paths added to it.AddPath(path []string, traceKey string) method:
Purpose: Adds a new trace, defined by its path (a slice of strings representing path segments) and its unique traceKey, to the filter tree.
How it works:
path.path already exists as a child of the current node, it moves to that existing child.TraceFilter node is created for that segment, its value is set to the segment string, its traceKey is set to the input traceKey, and it's added to the children map of the current node.path.Why this design?
traceKey with each newly created node ensures that even intermediate nodes (which might later become leaves if no further sub-paths are added) have an associated key.Example: Adding path ["root", "p1", "p2"] with key "keyA"
Initial Tree:
(HEAD)
After AddPath(["root", "p1", "p2"], "keyA"):
(HEAD)
|
+-- ("root", key="keyA")
|
+-- ("p1", key="keyA")
|
+-- ("p2", key="keyA") <- Leaf node initially
If we then add ["root", "p1", "p2", "t1"] with key "keyB":
(HEAD)
|
+-- ("root", key="keyB") // traceKey updated if path is prefix
|
+-- ("p1", key="keyB")
|
+-- ("p2", key="keyB")
|
+-- ("t1", key="keyB") <- New leaf node
Note: The traceKey of an existing node is updated by AddPath if the new path being added shares that node as a prefix. This ensures that the traceKey stored at a node corresponds to the longest path ending at that node if it's also a prefix of other paths. However, the primary use of GetLeafNodeTraceKeys relies on the traceKey of nodes that become leaves.
GetLeafNodeTraceKeys() method:
Purpose: Retrieves the traceKeys of all traces that are considered “leaf” nodes in the tree. A leaf node is a node that has no children.
How it works:
len(tf.children) == 0), its traceKey is considered a leaf key and is added to the result list.Why this design?
Workflow for GetLeafNodeTraceKeys:
Start at (CurrentNode)
|
V
Is CurrentNode a leaf (no children)?
|
+-- YES --> Add CurrentNode.traceKey to results
|
+-- NO --> For each ChildNode in CurrentNode.children:
|
V
Recursively call GetLeafNodeTraceKeys on ChildNode
|
V
Append results from ChildNode to overall results
|
V
Return aggregated results
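The whole structure is compact enough to sketch end to end. The following is a self-contained illustration of the tree, AddPath, and the recursive leaf collection described above; it is not the tracefilter module's code and may differ from it in detail.

```go
package main

import "fmt"

// traceFilter mirrors the node shape described above.
type traceFilter struct {
	traceKey string
	value    string
	children map[string]*traceFilter
}

func newTraceFilter() *traceFilter {
	return &traceFilter{traceKey: "HEAD", children: map[string]*traceFilter{}}
}

// addPath walks the tree one path segment at a time, creating nodes as needed.
func (tf *traceFilter) addPath(path []string, traceKey string) {
	node := tf
	for _, segment := range path {
		child, ok := node.children[segment]
		if !ok {
			child = &traceFilter{value: segment, traceKey: traceKey, children: map[string]*traceFilter{}}
			node.children[segment] = child
		} else {
			child.traceKey = traceKey // An existing prefix node picks up the newer key.
		}
		node = child
	}
}

// leafKeys collects traceKeys from nodes that have no children.
func (tf *traceFilter) leafKeys() []string {
	if len(tf.children) == 0 {
		return []string{tf.traceKey}
	}
	var keys []string
	for _, child := range tf.children {
		keys = append(keys, child.leafKeys()...)
	}
	return keys
}

func main() {
	tf := newTraceFilter()
	tf.addPath([]string{"config", "test_group", "test1"}, "keyA")
	tf.addPath([]string{"config", "test_group"}, "keyB")
	tf.addPath([]string{"config", "test_group", "test2"}, "keyC")
	fmt.Println(tf.leafKeys()) // keyA and keyC (map order); keyB is not a leaf.
}
```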
Consider the following traces and their paths:
traceA: path ["config", "test_group", "test1"], key "keyA"
traceB: path ["config", "test_group"], key "keyB"
traceC: path ["config", "test_group", "test2"], key "keyC"
traceD: path ["config", "other_group", "test3"], key "keyD"
Tree Construction (AddPath calls):
tf.AddPath(["config", "test_group", "test1"], "keyA")tf.AddPath(["config", "test_group"], "keyB")"test_group" initially created by keyA will have its traceKey updated to "keyB".tf.AddPath(["config", "test_group", "test2"], "keyC")tf.AddPath(["config", "other_group", "test3"], "keyD")The tree would look something like this (simplified, showing relevant traceKeys for leaf potential):
(HEAD)
|
+-- ("config")
|
+-- ("test_group", traceKey likely updated by "keyB" during AddPath)
| |
| +-- ("test1", traceKey="keyA") <-- Leaf
| |
| +-- ("test2", traceKey="keyC") <-- Leaf
|
+-- ("other_group")
|
+-- ("test3", traceKey="keyD") <-- Leaf
Filtering (GetLeafNodeTraceKeys() call):
GetLeafNodeTraceKeys() is called on the root:
- The traversal descends into "config".
- It visits "test_group". This node has children ("test1" and "test2"), so its key ("keyB") is not added.
- It visits "test1". This is a leaf. "keyA" is added.
- It visits "test2". This is a leaf. "keyC" is added.
- It visits "other_group".
- It visits "test3". This is a leaf. "keyD" is added.
The result would be ["keyA", "keyC", "keyD"]. Notice that "keyB" is excluded because the path ["config", "test_group"] has sub-paths (.../test1 and .../test2), making it a non-leaf node in the context of trace specificity.
This module provides a clean and efficient way to identify the most granular traces in a dataset where hierarchy is defined by path structure.
The tracesetbuilder module is designed to efficiently construct a types.TraceSet and its corresponding paramtools.ReadOnlyParamSet from multiple, potentially disparate, sets of trace data. This is particularly useful when dealing with performance data that might arrive in chunks (e.g., from different “Tiles” of data) and needs to be aggregated into a coherent view across a series of commits.
The core challenge this module addresses is the concurrent and distributed nature of processing trace data. If multiple traces with the same identifier (key) were processed by different workers simultaneously without coordination, it could lead to race conditions and incorrect data. Similarly, simply locking the entire TraceSet for each update would create a bottleneck.
The tracesetbuilder solves this by employing a worker pool (mergeWorkers). The key design decision here is to distribute the work based on the trace key. Each trace key is hashed (using crc32.ChecksumIEEE), and this hash determines which mergeWorker is responsible for that specific trace. This ensures that all data points for a single trace are always processed by the same worker, thereby avoiding the need for explicit locking at the individual trace level within the worker. Each mergeWorker maintains its own types.TraceSet and paramtools.ParamSet.
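The routing rule is simple to show. The sketch below is self-contained and only illustrates the idea; the worker count is a placeholder and the real module wraps this in its channel and WaitGroup machinery.

```go
package main

import (
	"fmt"
	"hash/crc32"
)

const numWorkers = 16 // Illustrative; the real constant is defined in the module.

// workerIndex hashes the trace key and uses the result to pick a mergeWorker,
// so all data points for a given trace always land on the same worker.
func workerIndex(traceKey string) int {
	return int(crc32.ChecksumIEEE([]byte(traceKey)) % numWorkers)
}

func main() {
	key := ",arch=x86,config=8888,test=draw_a_circle,units=ms,"
	fmt.Println("worker for", key, "=", workerIndex(key))
}
```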
Key Components and Workflow:
TraceSetBuilder:
- **Responsibilities:**
- Manages a pool of `mergeWorker` instances.
- Provides the `Add` method to ingest new trace data.
- Provides the `Build` method to consolidate results from all workers
and return the final `TraceSet` and `ReadOnlyParamSet`.
- Provides the `Close` method to shut down the worker pool.
- **`New(size int)`:** Initializes the `TraceSetBuilder`. The `size`
parameter is crucial as it defines the expected length of each trace in
the final, consolidated `TraceSet`. This allows the builder to
pre-allocate trace slices of the correct length, filling in missing data
points as necessary. It creates `numWorkers` instances of `mergeWorker`.
- **`Add(commitNumberToOutputIndex map[types.CommitNumber]int32, commits
[]provider.Commit, traces types.TraceSet)`:** This is the entry point for feeding data into the builder.
- traces: A types.TraceSet representing a chunk of data (e.g., from a single tile).
- commits: A slice of provider.Commit objects corresponding to the data points in the traces.
- commitNumberToOutputIndex: A map that dictates where each data point from the input traces (identified by its types.CommitNumber) should be placed in the final output trace. This mapping is essential for correctly aligning data points that might come from different sources or represent different commit ranges.
For each trace in traces:
- The trace key is parsed into paramtools.Params.
- A request struct is built containing the key, params, the trace data itself, the commitNumberToOutputIndex map, and the commits slice.
- The key is hashed to select one of the numWorkers workers.
- The request is sent to the ch channel of the selected mergeWorker.
- The sync.WaitGroup is incremented for each trace added, ensuring Build waits for all processing to complete.
- **`Build(ctx context.Context)`:**
  - Waits for all Add operations to be processed by the workers (using t.wg.Wait()).
  - Iterates over the mergeWorkers, merging the traceSet and paramSet from each mergeWorker into a single, final types.TraceSet and paramtools.ParamSet.
  - Normalizes and freezes the merged paramSet to create a paramtools.ReadOnlyParamSet.
  - Returns the final TraceSet and ReadOnlyParamSet.
- **`Close()`:** Iterates through the mergeWorkers and closes their respective input channels (ch). This signals the worker goroutines to terminate once they have processed all pending requests.

mergeWorker:
- Consumes request objects sent to its channel.
- Maintains its own local types.TraceSet and paramtools.ParamSet.
- Updates its local TraceSet with new data points, placing them correctly according to request.commitNumberToOutputIndex, and accumulates params into its local ParamSet.
- newMergeWorker(wg *sync.WaitGroup, size int): Creates a mergeWorker and starts its goroutine. The goroutine owns a local types.TraceSet and paramtools.ParamSet and reads request objects from its ch channel. For each request it:
  - Gets or creates the trace in m.traceSet for the given req.key. If creating, it uses types.NewTrace(size) to ensure the trace has the correct final length.
  - Iterates over req.commits and uses req.commitNumberToOutputIndex to determine the correct destination index in its local trace for each data point in req.trace.
  - Adds req.params to its m.paramSet.
  - Decrements the sync.WaitGroup (m.wg.Done()) to signal completion of this piece of work.
- Process(req *request): Sends a request to the worker's channel.
- Close(): Closes the worker's input channel.

request struct:
It is the unit of work sent to a mergeWorker. It encapsulates the trace key, its parsed parameters, the actual trace data segment, the mapping of commit numbers to output indices, and the corresponding commit metadata.
Workflow Diagram:
TraceSetBuilder.New(outputTraceLength)
|
V
+-----------------------------------------------------------------------+
| TraceSetBuilder (manages WaitGroup and pool of mergeWorkers) |
+-----------------------------------------------------------------------+
| ^
| Add(commitMap1, commits1, traces1) | Build() waits for WaitGroup
| Add(commitMap2, commits2, traces2) |
V |
+-----------------------------------------------------------------------+
| For each trace in input: |
| 1. Parse key -> params |
| 2. Create 'request' struct |
| 3. Hash key -> workerIndex |
| 4. Send 'request' to mergeWorkers[workerIndex].ch |
| 5. Increment WaitGroup |
+-----------------------------------------------------------------------+
| | | ... (numWorkers times)
V V V
+--------+ +--------+ +--------+
| mergeW_0 | | mergeW_1 | | mergeW_N | (Each runs in its own goroutine)
| .ch | | .ch | | .ch |
| .traceSet| | .traceSet| | .traceSet|
| .paramSet| | .paramSet| | .paramSet|
+--------+ +--------+ +--------+
^ ^ ^
| Process request: |
| - Get/Create local trace for req.key (length: outputTraceLength) |
| - For each point in req.trace: |
| - Use req.commitNumberToOutputIndex[commitNum] to find dstIdx |
| - localTrace[dstIdx] = req.trace[srcIdx] |
| - Add req.params to local paramSet |
| - Decrement WaitGroup |
| | |
--------------------- (When TraceSetBuilder.Build() is called)
|
V
+-----------------------------------------------------------------------+
| TraceSetBuilder.Build(): |
| 1. Wait for all 'Add' operations (WaitGroup.Wait()) |
| 2. Create finalTraceSet, finalParamSet |
| 3. For each mergeWorker: |
| - Merge worker.traceSet into finalTraceSet |
| - Merge worker.paramSet into finalParamSet |
| 4. Normalize and Freeze finalParamSet |
| 5. Return finalTraceSet, finalParamSet (ReadOnly) |
+-----------------------------------------------------------------------+
|
V
+-----------------------------------------------------------------------+
| TraceSetBuilder.Close(): |
| - Close channels of all mergeWorkers (signals them to terminate) |
+-----------------------------------------------------------------------+
The use of numWorkers and channelBufferSize are constants that can be tuned for performance based on the expected workload and system resources. The CRC32 hash provides a reasonably good distribution of keys across workers, minimizing the chance of one worker becoming a bottleneck. The sync.WaitGroup is essential for ensuring that the Build method doesn't prematurely try to aggregate results before all input data has been processed by the workers.
The design allows for efficient, concurrent processing of large volumes of trace data by partitioning the work based on trace identity and then merging the results, making it suitable for building comprehensive views of performance metrics over time.
The tracestore module defines interfaces and implementations for storing and retrieving performance trace data. It's a core component of the Perf system, enabling the analysis of performance metrics over time and across different configurations.
The primary goal of tracestore is to provide an efficient and scalable way to manage large volumes of trace data. This involves:
- Efficient querying: to find traces matching specific key-value criteria, tracestore uses an inverted index. This index maps key-value pairs to the trace IDs that contain them within each tile.
- Caching: it can integrate with external caches (e.g., go/cache/memcached) for broader caching strategies, and uses tracecache for caching the results of QueryTracesIDOnly to speed up repeated queries.
- Abstraction: the module defines interfaces (TraceStore, MetadataStore, TraceParamStore) to allow for different backend implementations. This promotes flexibility and testability. The primary implementation provided is sqltracestore, which uses an SQL database.
  - TraceStore handles the core logic of reading and writing trace values and their associated parameters.
  - MetadataStore manages metadata associated with source files (e.g., links to dashboards or logs).
  - TraceParamStore specifically handles the mapping between trace IDs (MD5 hashes of trace names) and their full parameter sets. This separation helps in optimizing storage and retrieval for these distinct types of data.
The tracestore module is primarily defined by a set of interfaces and their SQL-based implementations.
tracestore.go
This file defines the main TraceStore interface. It outlines the contract for any system that wants to store and retrieve performance traces. Key responsibilities include:
- Writing traces (WriteTraces, WriteTraces2): Ingesting new performance data points. Each data point is associated with a specific commit, a set of parameters (defining the trace, e.g., config=8888,arch=x86), a value, the source file it came from, and a timestamp.
  - The WriteTraces method is designed to handle potentially large batches of data efficiently. Implementations often involve chunking data and performing parallel writes to the underlying storage.
  - WriteTraces2 is a newer variant, potentially for different storage schemas or optimizations (e.g., denormalizing common params directly into the trace values table as seen in TraceValues2Schema).
- Reading traces (ReadTraces, ReadTracesForCommitRange): Retrieving trace data for specific keys (trace names) within a given tile or commit range.
- Querying traces (QueryTraces, QueryTracesIDOnly):
  - QueryTraces allows searching for traces based on a query.Query object (which specifies parameter key-value pairs). It returns the actual trace values and associated commit information.
  - QueryTracesIDOnly is an optimization that returns only the paramtools.Params (effectively the identifying parameters) of traces matching a query. This is useful when only the list of matching traces is needed, not their values.
- Tile management (GetLatestTile, TileNumber, TileSize, CommitNumberOfTileStart): Provides methods for interacting with the tiled storage system.
- ParamSets (GetParamSet): Retrieving the paramtools.ReadOnlyParamSet for a specific tile. A ParamSet represents all unique key-value pairs present in the traces within that tile, which is crucial for UI elements like query builders.
- Source tracking (GetSource, GetLastNSources, GetTraceIDsBySource): Retrieving information about the origin of trace data, such as the ingested file name.

metadatastore.go
This file defines the MetadataStore interface. Its responsibility is to manage metadata associated with source files.
- InsertMetadata: Stores links or other metadata for a given source file name.
- GetMetadata: Retrieves the stored metadata for a source file. This can be used, for example, to link from a data point back to the original log file or a specific dashboard view related to the data ingestion.

traceparamstore.go
This file defines the TraceParamStore interface. This store is dedicated to managing the relationship between a trace's unique identifier (typically an MD5 hash of its full parameter string) and the actual paramtools.Params object.
- WriteTraceParams: Stores the mapping from trace IDs to their parameter sets. This is done to avoid repeatedly parsing or storing the full parameter string for every data point of a trace.
- ReadParams: Retrieves the paramtools.Params for a given set of trace IDs.

sqltracestore
This submodule provides the SQL-based implementation of the TraceStore, MetadataStore, and TraceParamStore interfaces.
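Before looking at the SQL implementation, the trace ID derivation described above (and visible as md5(query.MakeKey(params[i])) in the write-path diagram below) can be sketched as follows. This is an illustration only: the exact key format produced by query.MakeKey may differ from this reconstruction of the ",key=value,...," convention.

```go
package main

import (
	"crypto/md5"
	"fmt"
	"sort"
	"strings"
)

// traceID builds the structured trace key from sorted key=value pairs and
// hashes it with MD5 to form the ID stored in TraceValues/Postings, while
// TraceParams keeps the ID -> params mapping.
func traceID(params map[string]string) string {
	keys := make([]string, 0, len(params))
	for k := range params {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	var b strings.Builder
	b.WriteString(",")
	for _, k := range keys {
		fmt.Fprintf(&b, "%s=%s,", k, params[k])
	}
	return fmt.Sprintf("%x", md5.Sum([]byte(b.String())))
}

func main() {
	fmt.Println(traceID(map[string]string{"arch": "x86", "config": "8888", "test": "draw_a_circle"}))
}
```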
sqltracestore.go: Implements the TraceStore interface.
- Schema: It relies on an SQL schema (see sqltracestore/schema/schema.go) involving tables like TraceValues (for actual metric values), Postings (the inverted index), ParamSets (per-tile parameter information), and SourceFiles.
- Writing: When WriteTraces is called, it performs several actions:
  - Updates the SourceFiles table with the new source filename if it's not already present.
  - Updates the ParamSets table for the current tile with any new key-value pairs from the incoming traces. This uses a cache to avoid redundant writes.
  - Inserts the values into the TraceValues table (or TraceValues2 for WriteTraces2).
  - If the trace ID and its key-value pairs are not already in the Postings table for the current tile (checked via cache), it inserts them.
  - Stores the mapping of the trace ID to its paramtools.Params in the TraceParams table via the TraceParamStore.
  All these writes are typically batched and parallelized for efficiency.
- Querying for trace IDs (QueryTracesIDOnly):
  - Retrieves the ParamSet for the target tile.
  - Builds a query plan from the query.Query and the tile's ParamSet.
  - Optimization (restrictByCounting): It attempts to optimize the query by first running COUNT(*) queries for each part of the query plan. The part of the plan that matches the fewest traces (below a threshold) is then used to fetch its corresponding trace IDs. These IDs are then used to construct a restrictClause (e.g., AND trace_id IN (...)) that is appended to the queries for the other parts of the plan. This significantly speeds up queries where one filter is much more selective than others.
  - Queries the Postings table (using the restrictClause if applicable) to get a stream of matching traceIDForSQL.
  - The streams of traceIDForSQL from each part of the plan are then intersected (using newIntersect) to find the trace IDs that satisfy all AND conditions of the query.
  - The matching IDs are passed to the TraceParamStore to fetch their full paramtools.Params.
- Reading values (QueryTraces, ReadTraces): Once the trace IDs (and thus their full names) are known (either from QueryTracesIDOnly or directly provided), it queries the TraceValues table to fetch the actual floating-point values for those traces within the specified commit range or tile. It also fetches commit information from the Commits table.
- Follower reads: It supports an enableFollowerReads configuration, which adds AS OF SYSTEM TIME '-5s' to certain read queries, allowing them to potentially hit read replicas and reduce load on the primary, at the cost of slightly stale data.
- Spanner support: Separate SQL templates (see spanner.go) account for syntax differences or performance characteristics (e.g., UPSERT vs. ON CONFLICT).
sqltraceparamstore.go: Implements the TraceParamStore interface. It uses a TraceParams SQL table that stores trace_id (bytes) and their corresponding params (JSONB). Writes are chunked and can be parallelized.
intersect.go: Provides helper functions (newIntersect, newIntersect2) to compute the intersection of multiple sorted channels of traceIDForSQL. This is crucial for implementing the AND logic in QueryTracesIDOnly. It builds a binary tree of newIntersect2 operations for efficiency, avoiding slower reflection-based approaches.
schema/schema.go: Defines Go structs that mirror the SQL table schemas. This is used for documentation and potentially could be used with ORM-like tools if needed, though the current implementation uses direct SQL templating.
- TraceValuesSchema: Stores individual data points (value, commit, source file) keyed by trace ID.
- TraceValues2Schema: An alternative/extended schema for trace values, potentially denormalizing common parameters like benchmark, bot, test, etc., for direct querying.
- SourceFilesSchema: Maps source file names to integer IDs.
- ParamSetsSchema: Stores the unique key-value pairs present in each tile.
- PostingsSchema: The inverted index, mapping (tile, key-value) to trace IDs.
- MetadataSchema: Stores JSON metadata for source files.
- TraceParamsSchema: Maps trace IDs (MD5 hashes) to their full paramtools.Params (stored as JSON).

spanner.go: Contains SQL templates and specific configurations (like parallel pool sizes for writes) tailored for Google Cloud Spanner.
mocks
TraceStore.go: Provides a mock implementation of the TraceStore interface, generated by the mockery tool. This is essential for unit testing components that depend on TraceStore without needing a full database setup.

Writing Traces (WriteTraces)
Caller (e.g., ingester) -> TraceStore.WriteTraces(ctx, commitNumber, params[], values[], paramset, sourceFile, timestamp)
|
`-> SQLTraceStore.WriteTraces
|
| 1. Tile Calculation: tileNumber = TileNumber(commitNumber)
|
| 2. Source File ID:
| `-> updateSourceFile(ctx, sourceFile) -> sourceFileID
| (Queries SourceFiles table, inserts if not exists)
|
| 3. ParamSet Update (for the tile):
| For each key, value in paramset:
| If not in cache(tileNumber, key, value):
| Add to batch for ParamSets table insertion
| Execute batch insert into ParamSets, update cache
|
| 4. For each trace (params[i], values[i]):
| | a. Trace ID Calculation: traceID_md5_hex = md5(query.MakeKey(params[i]))
| |
| | b. Store Trace Params:
| | `-> TraceParamStore.WriteTraceParams(ctx, {traceID_md5_hex: params[i]})
| | (Inserts into TraceParams table if not exists)
| |
| | c. Add to TraceValues Batch: (traceID_md5_hex, commitNumber, values[i], sourceFileID)
| |
| | d. Postings Update (for the tile):
| | If not in cache(tileNumber, traceID_md5_hex): // Marks this whole trace as processed for postings
| | For each key, value in params[i]:
| | Add to batch for Postings table: (tileNumber, "key=value", traceID_md5_hex)
|
| 5. Execute batch insert into TraceValues (or TraceValues2)
|
| 6. Execute batch insert into Postings, update postings cache
Querying for Trace IDs (QueryTracesIDOnly)
Caller -> TraceStore.QueryTracesIDOnly(ctx, tileNumber, query)
|
`-> SQLTraceStore.QueryTracesIDOnly
|
| 1. Get ParamSet for tile:
| `-> GetParamSet(ctx, tileNumber) -> tileParamSet
| (Checks OPS cache, falls back to querying ParamSets table)
|
| 2. Generate Query Plan: plan = query.QueryPlan(tileParamSet)
| (If plan is empty or invalid for tile, return empty channel)
|
| 3. Optimization (restrictByCounting):
| | For each part of 'plan' (key, or_values[]):
| | `-> DB: COUNT(*) FROM Postings WHERE tile_number=... AND key_value IN (...) LIMIT threshold
| | Find the plan part (minKey, minValues) with the smallest count (if count < threshold).
| | If any count is 0, plan is skippable.
| | If minKey found:
| | `-> DB: SELECT trace_id FROM Postings WHERE tile_number=... AND key_value IN (minValues)
| | `-> restrictClause = "AND trace_id IN (result_ids...)"
|
| 4. Execute Query for each plan part (concurrently):
| For each key, values[] in 'plan' (excluding minKey if restrictClause is used):
| `-> DB: SELECT trace_id FROM Postings
| WHERE tile_number=tileNumber AND key_value IN ("key=value1", "key=value2"...)
| [restrictClause]
| ORDER BY trace_id
| -> channel_for_key_N (stream of traceIDForSQL)
|
| 5. Intersect Results:
| `-> newIntersect(ctx, [channel_for_key_1, channel_for_key_2,...]) -> finalTraceIDsChannel (stream of unique traceIDForSQL)
|
| 6. Fetch Full Params (concurrently, in chunks):
| For each batch of unique traceIDForSQL from finalTraceIDsChannel:
| `-> TraceParamStore.ReadParams(ctx, batch_of_ids) -> map[traceID]Params
| For each Params in map:
| Send Params to output channel
|
`-> Returns output channel of paramtools.Params
This structured approach, combining interfaces with a robust SQL implementation, allows tracestore to serve as a reliable and performant foundation for Perf's data storage needs.
High-Level Overview
The /go/tracing module is responsible for initializing and configuring tracing capabilities within the Perf application. It leverages the OpenCensus library to provide distributed tracing, allowing developers to understand the flow of requests across different services and components. This is crucial for debugging performance issues, identifying bottlenecks, and gaining insights into the application's behavior in a distributed environment.
Design Decisions and Implementation Choices
The core design principle behind this module is to centralize tracing initialization. This ensures consistency in how tracing is set up across different parts of the application.
Conditional Initialization: The Init function provides different initialization paths based on whether the application is running in a local development environment or a deployed environment.
- In local mode, loggingtracer.Initialize() is called. This likely configures a simpler, console-based tracer. The rationale is that in local development, detailed, distributed tracing might be overkill, and logging traces to the console is often sufficient for debugging.
- In deployed mode, the tracing.Initialize function from the shared go.skia.org/infra/go/tracing library is used. This enables more sophisticated tracing, likely integrating with a backend tracing system like Jaeger or Stackdriver Trace.
Automatic Project ID Detection: The autoDetectProjectID constant being an empty string suggests that the underlying tracing.Initialize function is capable of automatically determining the Google Cloud Project ID when running in a GCP environment. This simplifies configuration as the project ID doesn't need to be explicitly passed.
Metadata Enrichment: The map[string]interface{} passed to tracing.Initialize includes:
podName: This value is retrieved from the MY_POD_NAME environment variable. This is a common practice in Kubernetes environments to identify the specific pod generating the trace, which is invaluable for pinpointing issues.instance: This is derived from cfg.InstanceName. This helps differentiate traces originating from different Perf instances (e.g., “perf-prod”, “perf-staging”).Responsibilities and Key Components/Files
tracing.go: This is the sole file in this module and contains the Init function.
Init(local bool, cfg *config.InstanceConfig) error function:
1. Takes a local boolean flag and an InstanceConfig pointer as input.
2. If local is true, it calls loggingtracer.Initialize(). This indicates a preference for a simpler, possibly console-based, tracing mechanism for local development.
   local=true ----> loggingtracer.Initialize()
3. If local is false, it proceeds to initialize tracing for a deployed environment.
   - It retrieves the TraceSampleProportion from the cfg.
   - It retrieves the InstanceName from cfg to be used as an attribute.
   - It calls tracing.Initialize from the shared go.skia.org/infra/go/tracing library.
   - It passes the sampling proportion, autoDetectProjectID (an empty string, relying on automatic detection), and a map of attributes (podName from the environment and instance from the config).
   local=false
     |
     V
   Read cfg.TraceSampleProportion
   Read cfg.InstanceName
   Read os.Getenv("MY_POD_NAME")
     |
     V
   tracing.Initialize(sample_proportion, "", {podName, instance})
The module relies on the shared library (go.skia.org/infra/go/tracing) for common functionality, promoting code reuse.
//go/tracing (likely go.skia.org/infra/go/tracing): This is the core shared tracing library providing the Initialize function for robust, distributed tracing. It handles the actual setup of exporters (e.g., to Stackdriver, Jaeger) and samplers.//go/tracing/loggingtracer: This dependency provides a simpler tracer implementation, probably for logging traces to standard output, suitable for local development environments where a full-fledged tracing backend might not be available or necessary.//perf/go/config: This module provides the InstanceConfig struct, which contains application-specific configuration, including the TraceSampleProportion and InstanceName used by the tracing initialization. This decouples tracing configuration from the tracing logic itself.Key Workflows/Processes
Tracing Initialization Workflow:
Application Startup
|
V
Call perf/go/tracing.Init(isLocal, instanceConfig)
|
+---- isLocal is true? ----> Call loggingtracer.Initialize() --> Tracing active (console/simple)
| |
| V
| Application proceeds
|
+---- isLocal is false? ---> Read TraceSampleProportion from instanceConfig
Read InstanceName from instanceConfig
Read MY_POD_NAME environment variable
|
V
Call shared go.skia.org/infra/go/tracing.Initialize(...)
with sampling rate and attributes (podName, instance)
|
V
Tracing active (distributed, e.g., Stackdriver)
|
V
Application proceeds
This workflow illustrates how the Init function adapts the tracing setup based on the execution context (local vs. deployed) and external configuration. The goal is to provide appropriate tracing capabilities with minimal boilerplate in the rest of the application.
The /go/trybot module is responsible for managing performance data generated by trybots. Trybots are automated systems that run tests on code changes (patches or changelists) before they are merged into the main codebase. This module handles the ingestion, storage, and retrieval of these trybot results, allowing developers and performance engineers to analyze the performance impact of proposed code changes.
The core idea is to provide a way to compare the performance characteristics of a pending change against the baseline performance of the current codebase. This helps in identifying potential performance regressions or improvements early in the development cycle.
/go/trybot/trybot.go
This file defines the central data structure TryFile.
TryFile: This struct represents a single file containing trybot results.
- CL: The identifier of the changelist (e.g., a Gerrit change ID). This is crucial for associating results with a specific code change.
- PatchNumber: The specific patchset within the changelist. Code review systems often allow multiple iterations (patchsets) for a single changelist.
- Filename: The name of the file where the trybot results are stored, often including a scheme like gs:// indicating its location (e.g., in Google Cloud Storage).
- Timestamp: When the result file was created. This is important for tracking and ordering results.
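A sketch of that struct, with field types chosen for illustration (see /go/trybot/trybot.go for the authoritative definition):

```go
package trybot

import "time"

// TryFile as described above; field types are assumptions for illustration.
type TryFile struct {
	// CL is the changelist identifier, e.g. a Gerrit change ID.
	CL string
	// PatchNumber is the patchset within the changelist.
	PatchNumber int
	// Filename is where the results live, e.g. "gs://bucket/path/file.json".
	Filename string
	// Timestamp is when the result file was created.
	Timestamp time.Time
}
```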
/go/trybot/ingester
This submodule is responsible for taking raw result files and transforming them into the TryFile format that the rest of the system understands.
/go/trybot/ingester/ingester.go: Defines the Ingester interface.
- Ingester interface: Specifies a contract for components that can process incoming files (represented by file.File) and produce a stream of trybot.TryFile objects. The Start method initiates this processing, typically in a background goroutine. This design allows for different sources or formats of trybot results to be plugged into the system.
/go/trybot/ingester/gerrit/gerrit.go: Provides a concrete implementation of the Ingester interface, specifically for handling trybot results originating from Gerrit code reviews.
- Gerrit struct: Implements ingester.Ingester. It uses a parser.Parser (from /perf/go/ingest/parser) to understand the content of the result files.
- New function: Constructor for the Gerrit ingester.
- Start method:
  - Receives incoming file.File objects.
  - Each file is parsed with parser.ParseTryBot. This method extracts the changelist ID (issue) and patchset number.
  - A trybot.TryFile is created with the extracted CL, patch number, filename, and creation timestamp.
  - The TryFile is then sent to an output channel.
  - Metrics (parseCounter, parseFailCounter) track the success and failure rates of parsing.
  - Using channels for input (files) and output (ret) facilitates asynchronous processing, meaning the ingester can process files as they become available without blocking other operations.

/go/trybot/store
This submodule is responsible for persisting and retrieving TryFile information and the associated performance measurements.
/go/trybot/store/store.go: Defines the TryBotStore interface.
- TryBotStore interface: This interface outlines the contract for storing and retrieving trybot data. This abstraction allows different database backends (e.g., CockroachDB, in-memory stores for testing) to be used.
  - Write(ctx context.Context, tryFile trybot.TryFile) error: Persists a TryFile and its associated data.
  - List(ctx context.Context, since time.Time) ([]ListResult, error): Retrieves a list of unique changelist/patchset combinations that have been processed since a given time. ListResult contains the CL (as a string) and Patch number.
  - Get(ctx context.Context, cl types.CL, patch int) ([]GetResult, error): Fetches all performance results for a specific changelist and patch number. GetResult contains the TraceName (a unique identifier for a specific metric and parameter combination) and its measured Value.
- /go/trybot/store/mocks/TryBotStore.go: Provides a mock implementation of TryBotStore, generated by the mockery tool. This is essential for unit testing components that depend on TryBotStore without needing a real database.
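A self-contained sketch of that contract is shown below. The method names and parameters follow the description above, but the stand-in types (and the float32 value type) are assumptions, not copies of the real store and types packages.

```go
package store

import (
	"context"
	"time"
)

// Stand-ins for the referenced types; the real definitions live elsewhere.
type CL string

type TryFile struct {
	CL          string
	PatchNumber int
	Filename    string
	Timestamp   time.Time
}

type ListResult struct {
	CL    string
	Patch int
}

type GetResult struct {
	TraceName string
	Value     float32
}

// TryBotStore sketches the interface described above.
type TryBotStore interface {
	// Write persists a single TryFile and its associated data.
	Write(ctx context.Context, tryFile TryFile) error
	// List returns the unique CL/patchset combinations seen since the given time.
	List(ctx context.Context, since time.Time) ([]ListResult, error)
	// Get returns all measured values for one CL and patchset.
	Get(ctx context.Context, cl CL, patch int) ([]GetResult, error)
}
```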
/go/trybot/results
This submodule focuses on loading and preparing trybot results for analysis and presentation, often by comparing them to baseline data.
/go/trybot/results/results.go: Defines the structures for requesting and representing analyzed trybot results.
- Kind type (TryBot, Commit): Distinguishes whether the analysis request is for trybot data (pre-submit) or for data from an already landed commit (post-submit). This allows the system to handle both scenarios.
- TryBotRequest struct: Represents a request from a client (e.g., a UI) to get analyzed performance data. It includes the Kind, CL and PatchNumber (for TryBot kind), CommitNumber and Query (for Commit kind). The Query is used to filter the traces to be analyzed when looking at landed commits.
- TryBotResult struct: Contains the analysis results for a single trace.
  - Params: The key-value parameters that uniquely identify the trace.
  - Median, Lower, Upper, StdDevRatio: Statistical measures derived from the trace data. StdDevRatio is a key metric indicating how much a new value deviates from the historical distribution, helping to flag regressions or improvements.
  - Values: A slice of recent historical values for the trace, with the last value being either the trybot result or the value at the specified commit.
- TryBotResponse struct: The overall response to a TryBotRequest.
  - Header: Column headers for the data, typically representing commit information.
  - Results: A slice of TryBotResult for each analyzed trace.
  - ParamSet: A collection of all unique parameter key-value pairs present in the results, useful for filtering in a UI.
- Loader interface: Defines a contract for components that can take a TryBotRequest and produce a TryBotResponse. This involves fetching relevant data, performing statistical analysis, and formatting it.

/go/trybot/results/dfloader/dfloader.go: Implements the results.Loader interface using a dataframe.DataFrameBuilder. DataFrames are a common way to represent tabular data for analysis.
- Loader struct: Holds references to a dataframe.DataFrameBuilder (for constructing DataFrames from trace data), a store.TryBotStore (for fetching trybot-specific measurements), and perfgit.Git (for resolving commit information).
- TraceHistorySize constant: Defines how many historical data points to load for each trace for comparison.
- New function: Constructor for the Loader.
- Load method: This is the core logic for generating the TryBotResponse.
  1. Determine Timestamp: If the request is for a Commit, it fetches the commit details (including its timestamp) using perfgit.Git. Otherwise, it uses the current time.
  2. Parse Query: If the request kind is Commit, the provided Query string is parsed. An empty query for a Commit request is an error.
  3. Fetch Baseline Data (DataFrame):
     - If Kind is Commit: It uses dfb.NewNFromQuery to load a DataFrame containing the last TraceHistorySize+1 data points for traces matching the query, up to the commit's timestamp. The "+1" is to hold the value at the commit itself or to be a placeholder.
     - If Kind is TryBot: a. It first calls store.Get to retrieve the specific trybot measurements for the given CL and PatchNumber. b. It then extracts the trace names from these trybot results. c. It calls dfb.NewNFromKeys to load a DataFrame with TraceHistorySize+1 historical data points for these specific trace names. d. Crucially, it then replaces the last value in each trace within the DataFrame with the corresponding value obtained from the store.Get call. This effectively injects the trybot's measurement into the historical context for comparison. e. If a trybot result exists for a trace that has no historical data in the DataFrame, that trace is removed from the analysis, and rebuildParamSet is flagged.
  4. Prepare Response Header: The DataFrame's header (commit information) is used for the response. If it's a TryBot request, the last header entry (representing the trybot data point) has its Offset set to types.BadCommitNumber to indicate it's not a landed commit.
  5. Calculate Statistics: For each trace in the DataFrame:
     - The trace key is parsed into paramtools.Params.
     - vec32.StdDevRatio is called with the trace values (which now includes the trybot value at the end if applicable). This function calculates the median, lower/upper bounds, and the standard deviation ratio.
     - A results.TryBotResult is created.
     - If the StdDevRatio calculation fails (e.g., insufficient data), the trace is skipped, and rebuildParamSet is flagged.
  6. Sort Results: The TryBotResult slice is sorted by StdDevRatio in descending order. This prioritizes potential regressions (high positive ratio) and significant improvements (high negative ratio).
  7. Normalize ParamSet: If rebuildParamSet is true (due to missing traces or parsing errors), the ParamSet for the response is regenerated from the final set of TryBotResults.
  8. The final results.TryBotResponse is assembled and returned.

/go/trybot/samplesloader
This submodule deals with loading raw sample data from trybot result files. Sometimes, instead of just a single aggregated value, trybots might output multiple raw measurements (samples) for a metric.
/go/trybot/samplesloader/samplesloader.go: Defines the SamplesLoader interface.
SamplesLoader interface: Specifies a method Load(ctx context.Context, filename string) (parser.SamplesSet, error) that takes a filename (URL to the result file) and returns a parser.SamplesSet. A SamplesSet is a map where keys are trace identifiers and values are parser.Samples (which include parameters and a slice of raw float64 sample values).
/go/trybot/samplesloader/gcssamplesloader/gcssamplesloader.go: Implements SamplesLoader for files stored in Google Cloud Storage (GCS).
loader struct: Holds a gcs.GCSClient for interacting with GCS and a parser.Parser.
New function: Constructor for the GCS samples loader.
Load method:
  Parses the filename (which is a GCS URL like gs://bucket/path/file.json) to extract the bucket and path.
  Uses the storageClient to read the content of the file from GCS.
  Parses the file contents with format.ParseLegacyFormat (assuming a specific JSON structure for these sample files).
  Converts the parsed data into a parser.SamplesSet using parser.GetSamplesFromLegacyFormat.
A simplified workflow could look like this:
File Arrival: A new trybot result file appears (e.g., uploaded to GCS).
New File (e.g., in GCS)
Ingestion: An ingester.Ingester (like ingester.gerrit.Gerrit) detects and processes this file.
File --> [Gerrit Ingester] --parses--> trybot.TryFile{CL, PatchNum, Filename, Timestamp}
Storage: The TryFile metadata and potentially the parsed values are written to the store.TryBotStore.
trybot.TryFile --> [TryBotStore.Write] --> Database
(The actual performance values might be stored alongside the TryFile metadata or linked via the Filename if they are in a separate detailed file).
Analysis Request: A user or an automated system requests analysis for a particular CL/Patch via a UI or API, sending a results.TryBotRequest.
UI/API --sends--> results.TryBotRequest{Kind=TryBot, CL="123", PatchNumber=1}
Data Loading and Comparison: The results.dfloader.Loader handles this request.

results.TryBotRequest
  |
  v
[dfloader.Loader.Load]
  |
  +--(A)--> [TryBotStore.Get(CL, PatchNum)] --> Trybot specific values (Value_T) for traces T1, T2...
  |
  +--(B)--> [DataFrameBuilder.NewNFromKeys(traceNames=[T1,T2...])] --> Historical data for T1, T2...
  |           (e.g., [V1_hist1, V1_hist2, ..., V1_histN, _placeholder_])
  |
  +--(C)--> Combine: Replace _placeholder_ with Value_T
  |           (e.g., for T1: [V1_hist1, V1_hist2, ..., V1_histN, V1_T])
  |
  +--(D)--> Calculate StdDevRatio, Median, etc. for each trace
  |
  +--(E)--> Sort results
  |
  v
results.TryBotResponse (sent back to UI/API)
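Steps (C) through (E) are the statistical core of the comparison. The sketch below is a self-contained illustration of that idea; the local stdDevRatio helper is a simplified stand-in for vec32.StdDevRatio (assumed here to be (new value - median) / stddev of the history), and the tryBotResult type only mirrors the fields described above, not the real structs.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// tryBotResult loosely mirrors results.TryBotResult (hypothetical local type).
type tryBotResult struct {
	TraceName   string
	Values      []float32 // history, with the trybot value appended last
	Median      float32
	StdDevRatio float32
}

// stdDevRatio compares the last value against the median and standard
// deviation of the preceding history. It is only an illustration of the idea.
func stdDevRatio(values []float32) (median, ratio float32, err error) {
	if len(values) < 3 {
		return 0, 0, fmt.Errorf("insufficient data: %d points", len(values))
	}
	hist := append([]float32(nil), values[:len(values)-1]...)
	sort.Slice(hist, func(i, j int) bool { return hist[i] < hist[j] })
	median = hist[len(hist)/2]
	var sum float64
	for _, v := range hist {
		d := float64(v - median)
		sum += d * d
	}
	stddev := float32(math.Sqrt(sum / float64(len(hist))))
	if stddev == 0 {
		return median, 0, fmt.Errorf("zero variance in history")
	}
	return median, (values[len(values)-1] - median) / stddev, nil
}

func main() {
	results := []tryBotResult{
		{TraceName: ",test=a,", Values: []float32{10, 10, 11, 10, 25}}, // likely regression
		{TraceName: ",test=b,", Values: []float32{10, 10, 11, 10, 10}}, // unchanged
	}
	var out []tryBotResult
	for _, r := range results {
		m, ratio, err := stdDevRatio(r.Values)
		if err != nil {
			continue // in the real Loader, skipped traces flag rebuildParamSet
		}
		r.Median, r.StdDevRatio = m, ratio
		out = append(out, r)
	}
	// Sort by StdDevRatio descending so likely regressions come first.
	sort.Slice(out, func(i, j int) bool { return out[i].StdDevRatio > out[j].StdDevRatio })
	for _, r := range out {
		fmt.Printf("%s median=%.1f stdDevRatio=%.2f\n", r.TraceName, r.Median, r.StdDevRatio)
	}
}
```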
This module is crucial for proactive performance monitoring, enabling teams to catch performance regressions before they land in the main codebase, by systematically ingesting, storing, and analyzing the performance data generated during the pre-submit testing phase. The use of interfaces for storage (TryBotStore), ingestion (Ingester), and results loading (results.Loader) makes the system flexible and extensible.
The go/ts module serves as a utility to generate TypeScript definition files from Go structs. This is crucial for maintaining type safety and consistency between the Go backend and the TypeScript frontend, particularly when dealing with JSON data structures that are exchanged between them. The core problem this module solves is bridging the gap between Go's static typing and TypeScript's type system for data interchange, ensuring that changes in Go struct definitions are automatically reflected in the frontend's TypeScript types.
The primary component is the main.go file. Its responsibility is to:
Parse the output flag (-o) that specifies where the generated TypeScript file will be written.
Configure a go2ts.Generator: This is the core engine from the go/go2ts library responsible for the Go-to-TypeScript conversion.
  GenerateNominalTypes = true: This setting likely ensures that the generated TypeScript types are nominal (i.e., types are distinct based on their name, not just their structure), which can provide stronger type checking.
  AddIgnoreNil: This is used for specific Go types like paramtools.Params, paramtools.ParamSet, paramtools.ReadOnlyParamSet, and types.TraceSet. This suggests that nil values for these types in Go should likely be treated as optional or nullable fields in TypeScript, or perhaps excluded from the generated types if they are always expected to be non-nil when serialized.
Register structs: generator.AddMultiple is used to register a wide array of Go structs from various perf submodules (e.g., alerts, chromeperf, clustering2, frontend/api, regression). These are the structs that are serialized to JSON and consumed by the frontend. By registering them, the generator knows which Go types to convert into corresponding TypeScript interfaces or types.
Register unions: The addMultipleUnions helper function and generator.AddUnionToNamespace are used to register Go union types (often represented as a collection of constants or an interface implemented by several types). This ensures that TypeScript enums or union types are generated, reflecting the possible values or types a Go field can hold. The typeName argument in unionAndName and the namespace argument in AddUnionToNamespace control how these unions are named and organized in the generated TypeScript.
Organize namespaces: generator.AddToNamespace is used to group related types under a specific namespace in the generated TypeScript, improving organization (e.g., pivot.Request{} is added to the pivot namespace).
Render the output: generator.Render(w) writes the generated TypeScript definitions to the specified output file.

The design decision to use a dedicated program for this generation task, rather than manual synchronization or other methods, highlights the importance of automation and reducing the likelihood of human error in keeping backend and frontend types aligned. The reliance on the go/go2ts library centralizes the core conversion logic, making this module a consumer and orchestrator of that library for the specific needs of the Skia Perf application.
A key workflow is triggered by the //go:generate directive at the top of main.go: //go:generate bazelisk run --config=mayberemote //:go -- run . -o ../../modules/json/index.ts
This command, when go generate is run (typically as part of a build process), executes the compiled go/ts program.
Workflow:
A developer changes a Go struct in a perf submodule that is serialized to JSON for the UI.
The developer (or a build step) runs go generate within the go/ts module's directory (or a higher-level directory that includes it).
The go:generate directive executes the main function in go/ts/main.go.
main.go -> Uses go2ts.Generator -> Registers relevant Go structs and unions.
go2ts.Generator -> Analyzes registered Go types -> Generates corresponding TypeScript definitions.
main.go -> Writes the TypeScript definitions to ../../modules/json/index.ts.

The choice of specific structs and unions registered in main.go reflects the data contracts between the Perf backend and its frontend UI. Any Go struct that is part of an API response or request payload handled by the frontend needs to be included here.
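A condensed sketch of what such a main.go looks like is shown below. It only uses the generator features named above (New, GenerateNominalTypes, AddIgnoreNil, AddMultiple, Render); the exact go2ts signatures, the single registered struct, and the use of the standard log package are assumptions for illustration, and the real program registers many more types and unions.

```go
package main

import (
	"flag"
	"log"
	"os"

	"go.skia.org/infra/go/go2ts"
	"go.skia.org/infra/go/paramtools"
	"go.skia.org/infra/perf/go/alerts"
)

func main() {
	out := flag.String("o", "index.ts", "Output file for the generated TypeScript definitions.")
	flag.Parse()

	generator := go2ts.New()
	// Emit nominal types so TS types are distinct by name, not just by shape.
	generator.GenerateNominalTypes = true
	// nil values of these Go types should become nullable on the TS side.
	generator.AddIgnoreNil(paramtools.Params{})
	generator.AddIgnoreNil(paramtools.ParamSet{})

	// Register every struct that crosses the JSON boundary to the frontend.
	// The real main.go registers structs from alerts, clustering2, regression,
	// frontend/api, and many more submodules.
	generator.AddMultiple(alerts.Alert{})

	f, err := os.Create(*out)
	if err != nil {
		log.Fatal(err)
	}
	defer f.Close()
	if err := generator.Render(f); err != nil {
		log.Fatal(err)
	}
}
```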
This module defines core data types used throughout the Perf application. These types provide a standardized way to represent fundamental concepts related to commits, performance data (traces), and alert configurations. The design prioritizes clarity, type safety, and consistency across different parts of the system.
CommitNumber (types.go): Represents a unique, sequential identifier for a commit within a repository.
Numbering starts at CommitNumber(0). It is implemented as an int32 and includes an Add method for safe offsetting and a BadCommitNumber constant (-1) to represent invalid or non-existent commit numbers.
CommitNumberSlice (types.go): A utility type to enable sorting of CommitNumber slices, which is useful for various data processing and display tasks.
TileNumber (types.go): Represents an index for a “tile” in the TraceStore. Performance data (traces) are often stored in chunks or tiles for efficient storage and retrieval.
It is implemented as an int32. Functions like TileNumberFromCommitNumber and TileCommitRangeForTileNumber manage the mapping between commit numbers and tile numbers based on a configurable tileSize. The Prev() method allows navigation to the preceding tile, and BadTileNumber (-1) indicates an invalid tile.

Workflow: Commit to Tile Mapping
CommitNumber ----(tileSize)----> TileNumberFromCommitNumber() ----> TileNumber
|
V
TileCommitRangeForTileNumber() ----> (StartCommit, EndCommit)
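The mapping itself is simple integer arithmetic. The sketch below is a minimal, self-contained illustration under the assumption that tiles are formed by plain integer division by tileSize; the real TileNumberFromCommitNumber and TileCommitRangeForTileNumber in /go/types are the authoritative definitions.

```go
package main

import "fmt"

// Local aliases mirroring the types described above; the real definitions
// live in /go/types.
type CommitNumber int32
type TileNumber int32

const BadTileNumber TileNumber = -1

// tileNumberFromCommitNumber assumes a tile is just commit / tileSize.
func tileNumberFromCommitNumber(c CommitNumber, tileSize int32) TileNumber {
	if c < 0 || tileSize <= 0 {
		return BadTileNumber
	}
	return TileNumber(int32(c) / tileSize)
}

// tileCommitRangeForTileNumber returns the first and last CommitNumber in the
// given tile under the same assumption.
func tileCommitRangeForTileNumber(t TileNumber, tileSize int32) (CommitNumber, CommitNumber) {
	start := int32(t) * tileSize
	return CommitNumber(start), CommitNumber(start + tileSize - 1)
}

func main() {
	const tileSize = 256
	c := CommitNumber(1000)
	t := tileNumberFromCommitNumber(c, tileSize)
	begin, end := tileCommitRangeForTileNumber(t, tileSize)
	fmt.Printf("commit %d -> tile %d (commits %d..%d)\n", c, t, begin, end) // tile 3 (768..1023)
}
```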
Trace (types.go): Represents a sequence of performance measurements, typically corresponding to a specific metric over a series of commits.
It is implemented as a []float32. The NewTrace function initializes a trace of a given length with a special vec32.MISSING_DATA_SENTINEL value, which is crucial for distinguishing between actual zero values and missing data points. This leverages the go.skia.org/infra/go/vec32 package for optimized float32 vector operations.
TraceSet (types.go): A collection of Traces, keyed by a string identifier (trace ID). It is implemented as a map[string]Trace.
RegressionDetectionGrouping (types.go): An enumeration defining how traces are grouped for regression detection. Values include KMeansGrouping (cluster traces by shape) and StepFitGrouping (analyze each trace individually for steps). ToClusterAlgo provides a safe way to convert strings to this type.
StepDetection (types.go): An enumeration defining the algorithms used to detect significant steps (changes) in individual traces or cluster centroids. Values include OriginalStep, AbsoluteStep, PercentStep, CohenStep, and MannWhitneyU. ToStepDetection ensures type-safe conversion from strings.
AlertAction (types.go): An enumeration defining the actions to be taken when an anomaly (potential regression) is detected by an alert configuration. Values include NoAction, FileIssue, and Bisection.
Domain (types.go): Specifies the range of commits over which an operation (like regression detection) should be performed. It is expressed either as N (number of commits) and End (timestamp for the end of the range) or as an Offset (a specific commit number).
ProgressCallback (types.go): A function type used to provide feedback on the progress of long-running operations. It is implemented as func(message string).
CL (types.go): Represents a Change List identifier (e.g., a GitHub Pull Request number). It is implemented as a string.
AnomalyDetectionNotifyType (types.go): Defines the notification mechanism for anomalies. Values include IssueNotify (send to issue tracker) and NoneNotify (no notification).
ProjectId (types.go): Represents a project identifier. It is implemented as a string with a predefined list AllProjectIds.
AllMeasurementStats (types.go): A list of valid statistical suffixes that can be part of performance measurement keys (e.g., “avg”, “max”). It is implemented as a []string slice.

The unit tests in types_test.go focus on validating the logic of CommitNumber arithmetic and the mapping between CommitNumber and TileNumber, ensuring the core indexing mechanisms are correct.
The /go/ui module is responsible for handling frontend requests and preparing data for display in the Perf UI. Its primary purpose is to bridge the gap between user interactions on the frontend (e.g., selecting time ranges, defining queries, or applying formulas) and the backend data sources and processing logic.
This module is designed to be the central point for fetching and transforming performance data into a format that can be readily consumed by the UI. It orchestrates interactions with various other modules, such as those responsible for accessing Git history (/go/git), building dataframes (/go/dataframe), handling data shortcuts (/go/shortcut), and calculating derived metrics (/go/calc).
The key rationale behind this module's existence is to encapsulate the complexity of data retrieval and preparation, providing a clean and consistent API for the frontend. This separation of concerns allows the frontend to focus on presentation and user interaction, while the backend handles the intricacies of data access and manipulation.
The main workflow involves receiving a FrameRequest from the frontend, processing it to fetch and transform data, and then returning a FrameResponse containing the prepared data and display instructions.
/go/ui/frame/frame.go: This is the core file of the module.
It defines the data structures for frontend requests (FrameRequest) and backend responses (FrameResponse). FrameRequest captures user inputs like time ranges, queries, formulas, and pivot table configurations. FrameResponse packages the resulting data, along with display hints and any relevant messages.
It contains the logic for processing FrameRequest objects. This involves dispatching tasks to other modules based on the request parameters. For example, it uses the dataframe.DataFrameBuilder to fetch data based on queries or trace keys, the calc module to evaluate formulas, and the pivot module to restructure data for pivot tables.
Requests can specify either an explicit time range (REQUEST_TIME_RANGE) or a fixed number of recent commits (REQUEST_COMPACT).
It fetches anomalies from the anomalies.Store and associates them with the relevant traces in the response. This can be done based on time ranges or commit revision numbers.
The ProcessFrameRequest function is the main entry point for handling a request. It creates a frameRequestProcess struct to manage the state of the request processing and the resulting DataFrame.
Errors are routed through reportError to ensure consistent logging and error propagation.
Progress is reported via the progress.Progress interface, allowing the frontend to display updates during long-running requests.
The distinction between REQUEST_TIME_RANGE and REQUEST_COMPACT request types caters to different user needs: exploring specific historical periods versus viewing the latest trends.
Including anomalies in the FrameResponse aims to provide users with immediate context about significant performance changes alongside the raw data. The system supports fetching anomalies based on either time or revision ranges, offering flexibility depending on how anomalies are tracked and stored.
The ResponseFromDataFrame function acts as a final assembly step, taking a processed DataFrame and enriching it with SKP change information, display mode, and handling potential truncation.

A typical request processing flow might look like this:
Frontend Request (FrameRequest)
|
V
ProcessFrameRequest() in frame.go
|
+------------------------------+-----------------------------+--------------------------+
| | | |
V V V V
(If Queries exist) (If Formulas exist) (If Keys exist) (If Pivot requested)
doSearch() doCalc() doKeys() pivot.Pivot()
| | | |
V V V V
dfBuilder.NewFromQuery...() calc.Eval() with dfBuilder.NewFromKeys...() Restructure DataFrame
rowsFromQuery/Shortcut()
| | | |
+------------------------------+-----------------------------+--------------------------+
|
V
DataFrame construction and merging
|
V
(If anomaly search enabled)
addTimeBasedAnomaliesToResponse() OR addRevisionBasedAnomaliesToResponse()
|
V
anomalyStore.GetAnomalies...()
|
V
ResponseFromDataFrame()
|
V
getSkps() (Find significant file changes)
|
V
Truncate response if too large
|
V
Set DisplayMode
|
V
Backend Response (FrameResponse)
|
V
Frontend UI
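The branching at the top of the diagram can be summarized in a few lines of Go. The sketch below uses hypothetical local types (frameRequest, dataFrame) and stub helpers named after the functions in the diagram; it is meant to show the dispatch-and-merge shape of ProcessFrameRequest, not its real signatures.

```go
package main

import (
	"errors"
	"fmt"
)

type frameRequest struct {
	Queries  []string
	Formulas []string
	Keys     string // shortcut ID referencing a stored list of trace keys
	Pivot    bool
}

type dataFrame struct{ traces int } // stand-in for dataframe.DataFrame

func doSearch(q string) dataFrame  { return dataFrame{traces: 10} }
func doCalc(f string) dataFrame    { return dataFrame{traces: 1} }
func doKeys(keys string) dataFrame { return dataFrame{traces: 5} }

// merge combines the per-source DataFrames into one response frame.
func merge(frames []dataFrame) dataFrame {
	total := 0
	for _, f := range frames {
		total += f.traces
	}
	return dataFrame{traces: total}
}

// processFrameRequest mirrors the branching in the diagram: each source of
// traces contributes a DataFrame, and the results are merged.
func processFrameRequest(req frameRequest) (dataFrame, error) {
	var frames []dataFrame
	for _, q := range req.Queries {
		frames = append(frames, doSearch(q))
	}
	for _, f := range req.Formulas {
		frames = append(frames, doCalc(f))
	}
	if req.Keys != "" {
		frames = append(frames, doKeys(req.Keys))
	}
	if len(frames) == 0 {
		return dataFrame{}, errors.New("request contained no queries, formulas, or keys")
	}
	df := merge(frames)
	// A real implementation would now attach anomalies, apply pivoting if
	// requested, truncate oversized responses, and set the display mode.
	return df, nil
}

func main() {
	df, err := processFrameRequest(frameRequest{Queries: []string{"config=8888"}})
	fmt.Println(df, err)
}
```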
The urlprovider module is designed to generate URLs for various pages within the Perf application. This centralized approach ensures consistency in URL generation across different parts of the application and simplifies the process of linking to specific views with pre-filled parameters. The key motivation is to abstract away the complexities of URL query parameter construction and to provide a simple interface for generating links to common Perf views like “Explore”, “MultiGraph”, and “GroupReport”.
The core component of this module is the URLProvider struct. An instance of URLProvider is initialized with a perfgit.Git object. This dependency is crucial because some URL generation, particularly for time-range-based views, requires fetching commit information (specifically timestamps) from the Git repository to define the “begin” and “end” parameters of the URL.
urlprovider.go: This file contains the primary logic for the URL provider.
URLProvider struct: Holds a reference to a perfgit.Git instance. This allows it to interact with the Git repository to fetch commit details needed for constructing time-based query parameters.
New(perfgit perfgit.Git) *URLProvider: This constructor function creates and returns a new instance of URLProvider. It takes a perfgit.Git object as an argument, which is stored within the struct. This design choice makes the URLProvider stateful with respect to its Git interaction capabilities.
Explore(...) string: This method generates a URL for the “Explore” page (/e/).
  It uses getQueryParams to construct the common query parameters like begin, end, and disable_filter_parent_traces. The begin and end timestamps are derived from the provided startCommitNumber and endCommitNumber by querying the perfGit instance. The end timestamp is intentionally shifted forward by one day to ensure that anomalies at the very end of the selected range are visible on the graph.
  It encodes the parameters map (which contains key-value pairs for filtering traces) into a URL-encoded query string using GetQueryStringFromParameters. This encoded string is assigned to the queries parameter of the final URL.
  Additional queryParams (passed as url.Values) can be merged into the URL.
  The base URL is /e/?.
MultiGraph(...) string: This method generates a URL for the “MultiGraph” page (/m/).
  Like Explore, it uses getQueryParams to build the common time-range and filtering parameters.
  It adds a shortcut parameter with the provided shortcutId.
  Additional queryParams can also be merged.
  The base URL is /m/?.
GroupReport(param string, value string) string: This static function generates a URL for the “Group Report” page (/u/).
  Unlike Explore and MultiGraph, it does not inherently depend on a time range derived from commits, nor does it require complex parameter encoding.
  It validates param against a predefined list of allowed parameters (anomalyGroupID, anomalyIDs, bugID, rev, sid). This is a security and correctness measure to prevent arbitrary parameters from being injected.
  If param is valid, it constructs a simple URL with the provided param and value.
  It returns an empty string if param is invalid.
  It is a static function (not a method on URLProvider) because it doesn't need access to the perfGit instance or any other state within URLProvider. This simplifies its usage for cases where only a group report URL is needed without initializing a full URLProvider.
getQueryParams(...) url.Values: This private helper method is responsible for creating the base set of query parameters common to Explore and MultiGraph.
  It calls fillCommonParams to set the begin and end parameters based on commit numbers.
  It adds disable_filter_parent_traces=true if requested.
  It merges any additional queryParams provided by the caller.
fillCommonParams(...): This private helper populates the begin and end timestamp parameters in the provided url.Values.
  It uses the perfGit instance to look up the Commit objects corresponding to the startCommitNumber and endCommitNumber. The timestamps from these commits are then used. As mentioned earlier, the end timestamp is adjusted by adding one day. This separation of concerns keeps the main Explore and MultiGraph methods cleaner.
GetQueryStringFromParameters(parameters map[string][]string) string: This helper method converts a map of string slices (representing query parameters where a single key can have multiple values) into a URL-encoded query string.

Generating an “Explore” Page URL:
Caller provides: context, startCommitNum, endCommitNum, filterParams, disableFilterParent, otherQueryParams
|
v
URLProvider.Explore()
|
+-------------------------------------+
| |
v v
getQueryParams() GetQueryStringFromParameters(filterParams)
| |
+--> fillCommonParams() +--> Encode filterParams
| | |
| +--> perfGit.CommitFromCommitNumber() -> Get start timestamp
| | |
| +--> perfGit.CommitFromCommitNumber() -> Get end timestamp, add 1 day
| | |
| +----------------------------------------+
| |
| v
| Combine begin, end, disableFilterParent, otherQueryParams into url.Values
| |
+-------------------------------------+
|
v
Combine base URL ("/e/?"), common query params, and encoded filterParams string
|
v
Return final URL string
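A minimal sketch of the kind of assembly Explore performs is shown below, assuming begin/end are Unix-second timestamps and that the trace filter is URL-encoded into a nested queries parameter; the helper name exploreURL and the exact parameter handling are illustrative, not the real urlprovider code.

```go
package main

import (
	"fmt"
	"net/url"
	"time"
)

// exploreURL builds an "/e/?" URL from a commit-derived time range and a trace filter.
func exploreURL(begin, end time.Time, filter map[string][]string, disableFilterParentTraces bool) string {
	// Shift the end forward one day so anomalies at the edge of the range stay visible.
	end = end.Add(24 * time.Hour)

	common := url.Values{}
	common.Set("begin", fmt.Sprintf("%d", begin.Unix()))
	common.Set("end", fmt.Sprintf("%d", end.Unix()))
	if disableFilterParentTraces {
		common.Set("disable_filter_parent_traces", "true")
	}

	// The trace filter itself is URL-encoded and nested inside the "queries" parameter.
	common.Set("queries", url.Values(filter).Encode())

	return "/e/?" + common.Encode()
}

func main() {
	begin := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	end := time.Date(2024, 1, 7, 0, 0, 0, 0, time.UTC)
	fmt.Println(exploreURL(begin, end, map[string][]string{"arch": {"x86"}, "config": {"8888"}}, true))
}
```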
Generating a “MultiGraph” Page URL:
Caller provides: context, startCommitNum, endCommitNum, shortcutId, disableFilterParent, otherQueryParams
|
v
URLProvider.MultiGraph()
|
v
getQueryParams()
|
+--> fillCommonParams()
| |
| +--> perfGit.CommitFromCommitNumber() -> Get start timestamp
| |
| +--> perfGit.CommitFromCommitNumber() -> Get end timestamp, add 1 day
| |
| +----------------------------------------+
| |
| v
| Combine begin, end, disableFilterParent, otherQueryParams into url.Values
|
v
Add "shortcut=shortcutId" to url.Values
|
v
Combine base URL ("/m/?") and all query params
|
v
Return final URL string
Generating a “Group Report” Page URL:
Caller provides: paramName, paramValue
  |
  v
urlprovider.GroupReport()
  |
  v
Validate paramName against allowed list
  |
  +-- (Valid) --> Construct URL: "/u/?" + paramName + "=" + paramValue
  |                 |
  |                 v
  |               Return URL string
  |
  +-- (Invalid) --> Return "" (empty string)
The design emphasizes reusability of common parameter generation logic (getQueryParams, fillCommonParams) and clear separation of concerns for generating URLs for different Perf pages. The dependency on perfgit.Git is explicitly managed through the URLProvider struct, making it clear when Git interaction is necessary.
The userissue module is responsible for managing the association between specific data points in Perf (identified by a trace key and a commit position) and Buganizer issues. This allows users to flag specific performance regressions or anomalies and link them directly to a tracking issue.
The core of this module is the Store interface, which defines the contract for persisting and retrieving these user-issue associations. The primary implementation of this interface is sqluserissuestore, which leverages a SQL database (specifically CockroachDB in this context) to store the data.
Key Responsibilities and Components:
store.go: This file defines the central UserIssue struct and the Store interface.
UserIssue struct: Represents a single association. It contains:
  UserId: The email of the user who made the association.
  TraceKey: A string uniquely identifying a performance metric's trace (e.g., “,arch=x86,config=Release,test=MyTest,”).
  CommitPosition: An integer representing a specific point in the commit history where the data point exists.
  IssueId: The numerical ID of the Buganizer issue.
Store interface: This interface dictates the operations that any backing store for user issues must support:
  Save(ctx context.Context, req *UserIssue) error: Persists a new UserIssue association. The implementation must handle potential conflicts, such as trying to save a duplicate entry (same trace key and commit position).
  Delete(ctx context.Context, traceKey string, commitPosition int64) error: Removes an existing user-issue association based on its unique trace key and commit position. It should handle cases where the specified association doesn't exist.
  GetUserIssuesForTraceKeys(ctx context.Context, traceKeys []string, startCommitPosition int64, endCommitPosition int64) ([]UserIssue, error): Retrieves all UserIssue associations for a given set of trace keys within a specified range of commit positions. This is crucial for displaying these associations on performance graphs or reports.

sqluserissuestore/sqluserissuestore.go: This is the SQL-backed implementation of the Store interface.
It uses go.skia.org/infra/go/sql/pool for managing database connections.
Queries that take a variable number of trace keys, such as listUserIssues, use Go's text/template package to dynamically construct the IN clause for multiple traceKeys. This is a common pattern to avoid SQL injection vulnerabilities and handle variadic inputs efficiently.
Save: Inserts a new row into the UserIssues table. It includes a last_modified timestamp.
Delete: First, it attempts to retrieve the issue to ensure it exists before attempting deletion. This provides a more informative error message if the record is not found.
GetUserIssuesForTraceKeys: Constructs a SQL query using a template to select issues matching the provided trace keys and commit position range. It then iterates over the query results and populates a slice of UserIssue structs.

sqluserissuestore/schema/schema.go: This file defines the Go struct UserIssueSchema which directly maps to the SQL table schema for UserIssues.
The table is defined with the following columns and constraints:
  user_id TEXT NOT NULL
  trace_key TEXT NOT NULL
  commit_position INT NOT NULL
  issue_id INT NOT NULL
  last_modified TIMESTAMPTZ DEFAULT now()
  PRIMARY KEY(trace_key, commit_position): The combination of trace_key and commit_position uniquely identifies a user issue, preventing multiple issues from being associated with the exact same data point.

mocks/Store.go: This contains a mock implementation of the Store interface, generated using the testify/mock library.
It is used to test code that depends on userissue.Store without requiring a live database connection. It allows developers to define expected calls and return values for the store's methods.

Workflow Example: Saving a User Issue
A backend handler builds a userissue.UserIssue struct and calls the Save method on an instance of userissue.Store (likely sqluserissuestore.UserIssueStore).

User Request (UI)
  |
  v
API Endpoint
  |
  v
Backend Handler
  |
  | Creates userissue.UserIssue{UserId:"...", TraceKey:"...", CommitPosition:123, IssueId:45678}
  v
userissue.Store.Save(ctx, &issue)
  |
  v
sqluserissuestore.UserIssueStore.Save()
  |
  | Constructs SQL: INSERT INTO UserIssues (...) VALUES ($1, $2, $3, $4, $5)
  v
SQL Database (UserIssues Table) <-- Row inserted

Workflow Example: Retrieving User Issues for a Chart
When rendering a chart, the backend calls GetUserIssuesForTraceKeys on the userissue.Store.

Chart Display Request (UI)
  |
  | Provides: traceKeys=["trace1", "trace2"], startCommit=100, endCommit=200
  v
API Endpoint
  |
  v
Backend Handler
  |
  v
userissue.Store.GetUserIssuesForTraceKeys(ctx, traceKeys, startCommit, endCommit)
  |
  v
sqluserissuestore.UserIssueStore.GetUserIssuesForTraceKeys()
  |
  | Constructs SQL: SELECT ... FROM UserIssues WHERE trace_key IN ('trace1', 'trace2') AND commit_position>=100 AND commit_position<=200
  v
SQL Database (UserIssues Table)
  |
  | Returns rows matching the query
  v
Backend Handler
  |
  | Formats response
  v
API Endpoint
  |
  v
UI (displays issue markers on chart)

The design emphasizes a clear separation of concerns with the Store interface, allowing for different storage backends if necessary (though SQL is the current and likely long-term choice). The SQL implementation is straightforward, using parameterized queries for security and templates for dynamic query construction where appropriate.
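Because the Store interface is small, it is easy to sketch. The snippet below copies the UserIssue struct and Store interface from the description above and pairs them with a toy in-memory implementation (not the real sqluserissuestore) to illustrate the (trace_key, commit_position) uniqueness rule and the range filtering performed by GetUserIssuesForTraceKeys.

```go
package main

import (
	"context"
	"fmt"
)

type UserIssue struct {
	UserId         string
	TraceKey       string
	CommitPosition int64
	IssueId        int64
}

type Store interface {
	Save(ctx context.Context, req *UserIssue) error
	Delete(ctx context.Context, traceKey string, commitPosition int64) error
	GetUserIssuesForTraceKeys(ctx context.Context, traceKeys []string, startCommitPosition int64, endCommitPosition int64) ([]UserIssue, error)
}

// memStore keys entries by (trace_key, commit_position), mirroring the table's primary key.
type memStore struct {
	issues map[[2]string]UserIssue
}

func key(traceKey string, pos int64) [2]string { return [2]string{traceKey, fmt.Sprint(pos)} }

func newMemStore() *memStore { return &memStore{issues: map[[2]string]UserIssue{}} }

func (m *memStore) Save(_ context.Context, req *UserIssue) error {
	k := key(req.TraceKey, req.CommitPosition)
	if _, ok := m.issues[k]; ok {
		return fmt.Errorf("issue already exists for %s@%d", req.TraceKey, req.CommitPosition)
	}
	m.issues[k] = *req
	return nil
}

func (m *memStore) Delete(_ context.Context, traceKey string, commitPosition int64) error {
	k := key(traceKey, commitPosition)
	if _, ok := m.issues[k]; !ok {
		return fmt.Errorf("no issue found for %s@%d", traceKey, commitPosition)
	}
	delete(m.issues, k)
	return nil
}

func (m *memStore) GetUserIssuesForTraceKeys(_ context.Context, traceKeys []string, start, end int64) ([]UserIssue, error) {
	want := map[string]bool{}
	for _, t := range traceKeys {
		want[t] = true
	}
	var out []UserIssue
	for _, issue := range m.issues {
		if want[issue.TraceKey] && issue.CommitPosition >= start && issue.CommitPosition <= end {
			out = append(out, issue)
		}
	}
	return out, nil
}

var _ Store = (*memStore)(nil)

func main() {
	ctx := context.Background()
	s := newMemStore()
	_ = s.Save(ctx, &UserIssue{UserId: "a@example.com", TraceKey: ",test=MyTest,", CommitPosition: 123, IssueId: 45678})
	got, _ := s.GetUserIssuesForTraceKeys(ctx, []string{",test=MyTest,"}, 100, 200)
	fmt.Println(got)
}
```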
This module defines and implements Temporal workflows for automating tasks related to performance anomaly detection and analysis in Skia Perf. It orchestrates interactions between various services like the AnomalyGroup service, Culprit service, and Gerrit service to achieve end-to-end automation. The primary goal is to streamline the process of identifying performance regressions, finding their root causes (culprits), and notifying relevant parties.
The workflows are designed to be resilient and fault-tolerant, leveraging Temporal's capabilities for retries and state management. This ensures that even if individual steps or external services encounter transient issues, the overall process can continue and eventually complete.
The module is structured into a public API (workflows.go) and an internal implementation package (internal/).
workflows.go:
Workflow name constants (ProcessCulprit, MaybeTriggerBisection): These string constants are the canonical names used to invoke the respective workflows via the Temporal client. Using constants helps avoid typos and ensures consistency.
Parameter and result structs (ProcessCulpritParam, ProcessCulpritResult, MaybeTriggerBisectionParam, MaybeTriggerBisectionResult): These structs define the data that needs to be passed into a workflow and the data that a workflow is expected to return upon completion. They ensure type safety and clarity in communication.

internal/ package: This package contains the actual implementation of the workflows and their associated activities. Activities are the building blocks of Temporal workflows, representing individual units of work that can be executed, retried, and timed out independently.
options.go:
regularActivityOptions: Defines default options (e.g., 1-minute timeout, 10 retry attempts) for standard activities that are expected to complete quickly, like API calls to other services.
childWorkflowOptions: Defines options for child workflows (e.g., 12-hour execution timeout, 4 retry attempts). This longer timeout accommodates potentially resource-intensive tasks like bisections, which involve compilation and testing.

maybe_trigger_bisection.go:
Defines MaybeTriggerBisectionWorkflow, which is the core logic for deciding whether to automatically find the cause of a performance regression (bisection) or to simply report the anomaly.
The workflow first waits for a fixed period (_WAIT_TIME_FOR_ANOMALIES, e.g., 30 minutes). This allows time for related anomalies to be detected and grouped together, potentially providing a more comprehensive picture before taking action.
It then loads the anomaly group and branches on its GroupAction field:
    Wait for more anomalies -> Load Anomaly Group (Activity)    AnomalyGroup Service <---> Workflow
- If BISECT:
  a. Load Top Anomaly: Fetches the most significant anomaly within the group.
  b. Resolve Commit Hashes: Converts the start and end commit positions of the anomaly into Git commit hashes using an activity that interacts with a Gerrit/Crrev service.
        Get Commit Hashes (Activity)    Gerrit/Crrev Service <---> Workflow
  c. Launch Bisection (Child Workflow): Triggers a separate CulpritFinderWorkflow (defined in the pinpoint/go/workflows module) as a child workflow. This child workflow is responsible for performing the actual bisection.
     - A unique ID is generated for the Pinpoint job.
     - The child workflow is configured with ParentClosePolicy: ABANDON, meaning it will continue running even if this parent workflow terminates. This is crucial because bisections can be long-running.
     - Callback parameters are passed to the child workflow so it knows how to report its findings back (e.g., which Anomaly Group ID it's associated with, which Culprit service to use).
        Launch Pinpoint Bisection Workflow -----------------> Pinpoint.CulpritFinderWorkflow (Child)
  d. Update Anomaly Group: Records the ID of the launched bisection job back into the AnomalyGroup.
        Update Anomaly Group with Bisection ID (Activity)    AnomalyGroup Service <---> Workflow
- If REPORT:
  a. Load Top Anomalies: Fetches a list of the top N anomalies in the group.
  b. Notify User: Calls an activity that uses the Culprit service to file a bug or send a notification about these anomalies.
        Notify User of Anomalies (Activity)    Culprit Service <--------> Workflow
parseStatisticNameFromChart, benchmarkStoriesNeedUpdate, updateStoryDescriptorName: These functions handle specific data transformations needed to correctly format parameters for the Pinpoint bisection request, often due to legacy conventions or differences in how metrics are named.

process_culprit.go:
Defines ProcessCulpritWorkflow, which handles the results of a completed bisection (i.e., when one or more culprits are identified).
    Persist Culprit (Activity)           Culprit Service <--------> Workflow
    Notify User of Culprit (Activity)    Culprit Service <--------> Workflow
ParsePinpointCommit: Handles the parsing of repository URLs from the Pinpoint commit format (e.g., https://{host}/{project}.git) into separate host and project components required by the Culprit service.

anomalygroup_service_activity.go:
Encapsulates communication with the AnomalyGroup service. Its activities include:
LoadAnomalyGroupByID: Fetches an anomaly group by its ID.
FindTopAnomalies: Retrieves the most significant anomalies within a group.
UpdateAnomalyGroup: Updates an existing anomaly group (e.g., to add a bisection ID).

culprit_service_activity.go:
Similar to anomalygroup_service_activity.go, this encapsulates communication with the Culprit service. Its activities include:
PeristCulprit: Stores culprit information.
NotifyUserOfCulprit: Notifies users about a found culprit (e.g., by creating a bug).
NotifyUserOfAnomaly: Notifies users about a set of anomalies (used when the group action is REPORT).

gerrit_service_activity.go:
GetCommitRevision: Takes a commit position (as an integer) and returns its corresponding Git hash.

worker/main.go:
The main function sets up the worker, connects it to the Temporal server, and registers the workflows and activities it's capable of handling.
The worker listens on a specific task queue (e.g., localhost.dev or a production queue name). Workflows and activities are dispatched to workers listening on the correct task queue.
It registers ProcessCulpritWorkflow and MaybeTriggerBisectionWorkflow with the worker, associating them with their public names (e.g., workflows.ProcessCulprit).
It also registers the activity structs (CulpritServiceActivity, AnomalyGroupServiceActivity, GerritServiceActivity) with the worker.

1. Anomaly Group Processing and Potential Bisection (MaybeTriggerBisectionWorkflow)
External Trigger (e.g., new AnomalyGroup created)
|
v
Start MaybeTriggerBisectionWorkflow(AG_ID)
|
+----------------------------------+
| Wait (e.g., 30 mins) |
+----------------------------------+
|
v
LoadAnomalyGroupByID(AG_ID) ----> AnomalyGroup Service
|
+-----------+
| GroupAction?|
+-----------+
/ \
/ \
BISECT REPORT
| |
v v
FindTopAnomalies(AG_ID, Limit=1) FindTopAnomalies(AG_ID, Limit=10)
| |
v v
GetCommitRevision(StartCommit) --> Gerrit Anomalies --> Convert to CulpritService format
| |
v v
GetCommitRevision(EndCommit) --> Gerrit NotifyUserOfAnomaly(AG_ID, Anomalies) --> Culprit Service
|
v
Execute Pinpoint.CulpritFinderWorkflow (Child)
| (Async, ParentClosePolicy=ABANDON)
| Params: {StartHash, EndHash, Config, Benchmark, Story, ...
| CallbackParams: {AG_ID, CulpritServiceURL, GroupingTaskQueue}}
|
v
UpdateAnomalyGroup(AG_ID, BisectionID) --> AnomalyGroup Service
|
v
End Workflow
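Step c of the BISECT branch (launching the bisection as a child workflow that outlives its parent) is the most Temporal-specific part of the flow. The fragment below sketches it with the Temporal Go SDK; the child workflow name, ID format, parameter struct, and callback URL are illustrative stand-ins for the real Pinpoint definitions.

```go
package workflows

import (
	"fmt"

	enumspb "go.temporal.io/api/enums/v1"
	"go.temporal.io/sdk/workflow"
)

// culpritFinderParams is a hypothetical parameter struct for the child workflow.
type culpritFinderParams struct {
	StartHash         string
	EndHash           string
	AnomalyGroupID    string
	CulpritServiceURL string
}

// launchBisection starts the bisection child workflow and returns its ID
// without waiting for the bisection itself to finish.
func launchBisection(ctx workflow.Context, anomalyGroupID, startHash, endHash string) (string, error) {
	// Give the bisection its own ID and let it outlive this parent workflow.
	childCtx := workflow.WithChildOptions(ctx, workflow.ChildWorkflowOptions{
		WorkflowID:        fmt.Sprintf("culprit-finder-%s", anomalyGroupID),
		ParentClosePolicy: enumspb.PARENT_CLOSE_POLICY_ABANDON,
	})
	// "CulpritFinderWorkflow" stands in for the workflow registered by
	// pinpoint/go/workflows; executing by name avoids importing its code here.
	child := workflow.ExecuteChildWorkflow(childCtx, "CulpritFinderWorkflow", culpritFinderParams{
		StartHash:         startHash,
		EndHash:           endHash,
		AnomalyGroupID:    anomalyGroupID,
		CulpritServiceURL: "https://culprit-service.example.com", // hypothetical callback target
	})
	// Only the child's workflow ID (the "bisection ID") is needed to update the
	// anomaly group; the bisection result arrives later via ProcessCulpritWorkflow.
	var childWE workflow.Execution
	if err := child.GetChildWorkflowExecution().Get(ctx, &childWE); err != nil {
		return "", err
	}
	return childWE.ID, nil
}
```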
2. Processing Bisection Results (ProcessCulpritWorkflow)
This workflow is typically triggered as a callback by the Pinpoint CulpritFinderWorkflow when it successfully identifies a culprit.
Pinpoint.CulpritFinderWorkflow completes
| (Calls back to Temporal, invoking ProcessCulpritWorkflow)
v
Start ProcessCulpritWorkflow(Commits, AG_ID, CulpritServiceURL)
|
+----------------------------------+
| Convert Pinpoint Commits to |
| Culprit Service Format |
| (Parse Repository URLs) |
+----------------------------------+
|
v
PersistCulprit(Commits, AG_ID) --------> Culprit Service
| (Returns CulpritIDs)
v
NotifyUserOfCulprit(CulpritIDs, AG_ID) -> Culprit Service
| (Returns IssueIDs, e.g., bug numbers)
v
End Workflow
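For completeness, the sketch below shows the kind of registration worker/main.go performs with the Temporal Go SDK. The host, task-queue name, registered workflow name, and the toy workflow/activity bodies are illustrative stand-ins; the real worker registers the full set of workflows and activity structs described above.

```go
package main

import (
	"context"
	"log"

	"go.temporal.io/sdk/client"
	"go.temporal.io/sdk/worker"
	"go.temporal.io/sdk/workflow"
)

// Stand-in for the real MaybeTriggerBisectionWorkflow in internal/.
func MaybeTriggerBisectionWorkflow(ctx workflow.Context, anomalyGroupID string) error {
	// The real workflow waits for _WAIT_TIME_FOR_ANOMALIES, loads the group,
	// and branches on its GroupAction field.
	return nil
}

// Stand-in for the real AnomalyGroupServiceActivity in internal/.
type AnomalyGroupServiceActivity struct{}

func (a *AnomalyGroupServiceActivity) LoadAnomalyGroupByID(ctx context.Context, id string) (string, error) {
	return id, nil // The real activity calls the AnomalyGroup service.
}

func main() {
	c, err := client.Dial(client.Options{HostPort: "localhost:7233"})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Workers only receive tasks from the queue they listen on.
	w := worker.New(c, "localhost.dev", worker.Options{})

	// Register the workflow under its public name so callers can start it via
	// the constant defined in workflows.go.
	w.RegisterWorkflowWithOptions(MaybeTriggerBisectionWorkflow, workflow.RegisterOptions{
		Name: "MaybeTriggerBisection", // assumed to match the workflows.MaybeTriggerBisection constant
	})
	w.RegisterActivity(&AnomalyGroupServiceActivity{})

	if err := w.Run(worker.InterruptCh()); err != nil {
		log.Fatal(err)
	}
}
```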
The /integration module provides a dataset and tools for conducting integration tests on the Perf performance monitoring system. Its primary purpose is to offer a controlled and reproducible environment for verifying the ingestion and processing capabilities of Perf.
The core of this module is the data subdirectory. This directory houses a collection of JSON files, each representing performance data associated with specific commits from the perf-demo-repo (https://github.com/skia-dev/perf-demo-repo.git). These files are structured according to the format.Format schema defined in go.skia.org/infra/perf/go/ingest/format. This standardized format is crucial as it allows Perf's ‘dir’ type ingester to directly consume these files. The dataset is intentionally designed to include a mix of valid data points and specific error conditions:
One file (e.g., demo_data_commit_10.json) contains a git_hash that does not correspond to an actual commit in the perf-demo-repo. This allows testing how Perf handles data associated with unknown or invalid commit identifiers.
malformed.json is intentionally not a valid JSON file. This is used to test Perf's error handling capabilities when encountering incorrectly formatted input data.

The generation of these data files is handled by generate_data.go. This Go program is responsible for creating the JSON files in the data directory. It uses a predefined list of commit hashes from the perf-demo-repo and generates random but plausible performance metrics for each. The inclusion of this generator script is important because it allows developers to easily modify, expand, or regenerate the test dataset if the testing requirements change or if new scenarios need to be covered. The script uses math/rand for generating some variability in the measurement values, ensuring the data isn't entirely static while still being predictable.
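A rough sketch of what a generator in the spirit of generate_data.go might do is shown below. The jsonFile struct only approximates the ingestion format defined in go.skia.org/infra/perf/go/ingest/format, and the commit hashes, keys, and file names are placeholders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math/rand"
	"os"
)

// result and jsonFile approximate the ingestion format; field names are illustrative.
type result struct {
	Key         map[string]string `json:"key"`
	Measurement float64           `json:"measurement"`
}

type jsonFile struct {
	Version int               `json:"version"`
	GitHash string            `json:"git_hash"`
	Key     map[string]string `json:"key"`
	Results []result          `json:"results"`
}

func main() {
	// Hypothetical commit hashes standing in for the real perf-demo-repo list.
	hashes := []string{"aaaaaaaa", "bbbbbbbb", "cccccccc"}
	for i, hash := range hashes {
		f := jsonFile{
			Version: 1,
			GitHash: hash,
			Key:     map[string]string{"arch": "x86", "config": "8888"},
			Results: []result{{
				Key: map[string]string{"test": "draw_a_circle", "units": "ms"},
				// Random-but-plausible value, mirroring the math/rand usage described above.
				Measurement: 10 + rand.Float64(),
			}},
		}
		b, err := json.MarshalIndent(f, "", "  ")
		if err != nil {
			panic(err)
		}
		name := fmt.Sprintf("demo_data_commit_%d.json", i+1)
		if err := os.WriteFile(name, b, 0o644); err != nil {
			panic(err)
		}
	}
}
```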
The key workflow for utilizing this module in an integration test scenario would look something like this:
An integration test configures a Perf instance with a 'dir' type ingester pointed at the /integration/data directory.
  Perf Instance --> Ingester (type: 'dir') --> /integration/data/*.json
The test then verifies that the valid files are ingested correctly and that errors are handled gracefully for the file with the unknown git_hash and for the malformed.json file.

The BUILD.bazel file defines how the components of this module are built.
The data filegroup makes the JSON test files available to other parts of the system, specifically for use in performance testing (//perf:__subpackages__).
The integration_lib go_library encapsulates the logic from generate_data.go.
The integration go_binary provides an executable to run generate_data.go, allowing for easy regeneration of the test data.

In essence, the /integration module provides a self-contained, version-controlled set of test data and a mechanism to regenerate it. This is crucial for ensuring the stability and correctness of Perf's data ingestion pipeline by providing a consistent baseline for integration testing. The choice to include both valid and intentionally erroneous data points allows for comprehensive testing of Perf's data handling capabilities, including its robustness in the face of invalid input.
The /jupyter module provides tools and examples for interacting with Skia's performance data, specifically data from perf.skia.org. The primary goal is to enable users to programmatically query, analyze, and visualize performance metrics using the power of Python libraries like Pandas, NumPy, and Matplotlib within a Jupyter Notebook environment.
The core functionality revolves around fetching and processing performance data. This is achieved by providing Python functions that abstract the complexities of interacting with the perf.skia.org API. This allows users to focus on the data analysis itself rather than the underlying data retrieval mechanisms.
Key Components/Files:
/jupyter/Perf+Query.ipynb: This is a Jupyter Notebook that serves as both an example and a utility library.
Why: It demonstrates how to use the provided Python functions to query performance data. It also contains the definitions of these key functions, making it a self-contained environment for performance analysis. The notebook format is chosen for its interactive nature, allowing users to execute code snippets, see results immediately, and experiment with different queries and visualizations.
How:
perf_calc(formula): This function is designed to evaluate a specific formula against the performance data. It takes a string formula (e.g., 'count(filter(\"\"))') as input. The formula is sent to the perf.skia.org backend for processing. This function is useful when you need to perform calculations or aggregations on the data directly on the server side before retrieving it.
perf_query(query): This function allows for more direct querying of performance data based on key-value pairs. It takes a query string (e.g., 'source_type=skp&sub_result=min_ms') that specifies the parameters for data retrieval. This is suitable when you want to fetch raw or filtered trace data.
perf_impl(body): This is an internal helper function used by both perf_calc and perf_query. It handles the actual HTTP communication with perf.skia.org. It first determines the time range for the query (typically the last 50 commits by default) by fetching initial page data. Then, it sends the query or formula to the /_/frame/start endpoint, polls the /_/frame/status endpoint until the request is successful, and finally retrieves the results from /_/frame/results. The results are then processed into a Pandas DataFrame, which is a powerful data structure for analysis in Python. A special value 1e32 from the backend (often representing missing or invalid data) is converted to np.nan (Not a Number) for better handling in Pandas.
paramset(): This utility function fetches the available parameter set from perf.skia.org. This is useful for discovering the possible values for different dimensions like ‘model’, ‘test’, ‘cpu_or_gpu’, etc., which can then be used to construct more targeted queries.
Examples: The notebook is rich with examples showcasing how to use perf_calc and perf_query, plot the resulting DataFrames using Pandas' built-in plotting capabilities or Matplotlib directly, normalize data, calculate means, and perform more complex analyses like finding the noisiest hardware models or comparing CPU vs. GPU performance for specific tests. These examples serve as practical starting points for users.
Workflow (Simplified perf_impl):
Client (Jupyter Notebook) -- GET /_/initpage/ --> perf.skia.org (Get time bounds)
perf.skia.org -- Initial Data (JSON) --> Client
Client -- POST /_/frame/start (with query/formula & time bounds) --> perf.skia.org
perf.skia.org -- Request ID (JSON) --> Client
Client -- GET /_/frame/status/{ID} --> perf.skia.org (Loop until ‘Success’)
perf.skia.org -- Status (JSON) --> Client
Client -- GET /_/frame/results/{ID} --> perf.skia.org
perf.skia.org -- Performance Data (JSON) --> Client
Client (Python): Parse JSON -> Create Pandas DataFrame -> Return DataFrame to user.
/jupyter/README.md: This file provides instructions on setting up the necessary Python environment to run Jupyter Notebooks and the required libraries (Pandas, SciPy, Matplotlib).
Why: Using a virtual environment (virtualenv) is recommended to isolate project dependencies and avoid conflicts.
How: It walks through installing pip, python-dev, and python-virtualenv using apt-get (assuming a Debian-based Linux system). It then shows how to create a virtual environment, activate it, upgrade pip, and install jupyter, notebook, scipy, pandas, and matplotlib within that isolated environment. Finally, it explains how to run the Jupyter Notebook server and deactivate the environment when done. This ensures a reproducible and clean setup for users wanting to utilize the Perf+Query.ipynb notebook.

The design emphasizes ease of use for data analysts and developers who need to interact with Skia's performance data. By leveraging Jupyter Notebooks, it provides an interactive and visual way to explore performance trends and issues. The abstraction of API calls into simple Python functions (perf_calc, perf_query) significantly lowers the barrier to entry for accessing this rich dataset.
The /lint module is responsible for ensuring code quality and consistency within the project by integrating and configuring JSHint, a popular JavaScript linting tool.
The primary goal of this module is to provide a standardized way to identify and report potential errors, stylistic issues, and anti-patterns in the JavaScript codebase. This helps maintain code readability, reduces the likelihood of bugs, and promotes adherence to established coding conventions.
The core component of this module is the reporter.js file. This file defines a custom reporter function that JSHint will use to format and output the linting results.
The decision to implement a custom reporter stems from the need to present linting errors in a clear, concise, and actionable format. Instead of relying on JSHint's default output, which might be too verbose or not ideally suited for the project's workflow, reporter.js provides a tailored presentation.
The reporter function within reporter.js takes an array of error objects (res) as input, where each object represents a single linting issue found by JSHint. It then iterates through these error objects and constructs a formatted string for each error. The format chosen is filename:line:character message, which directly points developers to the exact location of the issue in the source code.
For example: src/myFile.js:10:5 Missing semicolon
This specific format is chosen for its commonality in development tools and its ease of integration with various editors and IDEs, allowing developers to quickly navigate to the reported errors.
After processing all errors, if any were found, the reporter function aggregates the formatted error strings and prints them to the standard output (process.stdout.write). Additionally, it appends a summary line indicating the total number of errors found, ensuring that developers have a quick overview of the linting status. The pluralization of “error” vs. “errors” is also handled for grammatical correctness.
The workflow can be visualized as:
JSHint analysis --[error objects]--> reporter.js --[formatted errors & summary]--> stdout
By controlling the output format, this module ensures that linting feedback is consistently presented and easily digestible, contributing to a more efficient development process. The design prioritizes providing actionable information to developers, enabling them to address code quality issues promptly.
This module is responsible for managing SQL database schema migrations for Perf. Perf utilizes SQL backends to store various data, including trace data, shortcuts, and alerts. As the application evolves, the database schema may need to change. This module provides the mechanism to apply these changes and to upgrade existing databases to the schema expected by the current Perf version.
The core of this system relies on the github.com/golang-migrate/migrate/v4 library. This library provides a robust framework for versioning database schemas and applying migrations in a controlled manner.
The key design principle is to have a versioned set of SQL scripts for each supported SQL dialect. This allows Perf to:
Each SQL dialect (e.g., CockroachDB) has its own subdirectory within the /migrations module. The naming convention for these directories is critical: they must match the values defined in sql.Dialect.
Inside each dialect-specific directory, migration files are organized by version.
Each migration has a numeric version prefix (e.g., 0001_, 0002_).
The .up. file (e.g., 0001_create_initial_tables.up.sql) contains SQL statements to apply the schema changes for that version.
The .down. file (e.g., 0001_create_initial_tables.down.sql) contains SQL statements to revert the schema changes introduced by the corresponding .up. file.

This paired approach ensures that migrations can be applied and rolled back smoothly.
Key Files and Responsibilities:
README.md: Provides a high-level overview of the migration system, explaining its purpose and the use of the golang-migrate/migrate library. It also details the directory structure and file naming conventions for migration scripts.
cockroachdb/: This directory contains the migration scripts specifically for the CockroachDB dialect.
cockroachdb/0001_create_initial_tables.up.sql: This is the first migration script for CockroachDB. It defines the initial schema for Perf, creating tables such as TraceValues, SourceFiles, ParamSets, Postings, Shortcuts, Alerts, Regressions, and Commits. The table definitions include primary keys, indexes, and column types tailored for efficient data storage and retrieval specific to Perf's needs (e.g., storing trace data, associating traces with source files, managing alert configurations, and tracking commit history). The schema is designed to support the various functionalities of Perf, such as querying traces by parameters, retrieving trace values over commit ranges, and linking regressions to specific alerts and commits.
cockroachdb/0001_create_initial_tables.down.sql: This file is intended to contain SQL statements to drop the tables created by its corresponding .up. script. However, as a safety precaution against accidental data loss, it is currently empty. The design acknowledges the potential danger of automated table drops in a production environment.
cdb.sql: This is a utility SQL script designed for developers to interact with and test queries against a CockroachDB instance populated with Perf data. It includes sample INSERT statements to populate tables with test data and various SELECT queries demonstrating common data retrieval patterns used by Perf. This file is not part of the automated migration process but serves as a helpful tool for development and debugging. It showcases how to query for traces based on parameters, retrieve trace values, find the most recent tile, and get source file information. It also includes examples of more complex queries involving INTERSECT and JOIN operations, reflecting the kinds of queries Perf might execute.
test.sql: Similar to cdb.sql, this script is for testing and experimentation, but it's tailored for a SQLite database. It creates a schema similar to the CockroachDB one (though potentially simplified or with slight variations due to dialect differences) and populates it with test data. It contains a series of CREATE TABLE, INSERT, and SELECT statements that developers can use to quickly set up a local test environment and verify SQL logic.
batch-delete.sh and batch-delete.sql: These files provide a mechanism for performing batch deletions of specific parameter data from the ParamSets table in a CockroachDB instance.
  batch-delete.sql: Contains the DELETE SQL statement. It is designed to be edited directly to specify the deletion criteria (e.g., tile_number, param_key, param_value ranges) and the LIMIT for the number of rows deleted in each batch. This batching approach is crucial for deleting large amounts of data without overwhelming the database or causing long-running transactions.
  batch-delete.sh: A shell script that repeatedly executes batch-delete.sql using the cockroach sql command-line tool. It runs in a loop with a short sleep interval, allowing for controlled, iterative deletion. This script assumes that a port-forward to the CockroachDB instance is already established.
This utility is likely used for data cleanup or maintenance tasks that require removing specific, potentially large, datasets.

Migration Workflow (Conceptual):
When Perf starts or when a migration command is explicitly run:
Determine Current Schema Version: The golang-migrate/migrate library connects to the database and checks the current schema version (often stored in a dedicated migrations table managed by the library itself).
Identify Target Schema Version: This is typically the highest version number found among the migration files for the configured SQL dialect.
Apply Pending Migrations:
- If the current schema version is lower than the target version, the library iteratively executes the `.up.sql` files in ascending order of their version numbers, starting from the version immediately following the current one, up to the target version.
- Each successful `.up.` migration updates the schema version in the database.

Example: Current Version = 0, Target Version = 2

DB State (v0) --> Run 0001_*.up.sql --> DB State (v1) --> Run 0002_*.up.sql --> DB State (v2)
Rollback Migrations (if needed):
- If a user needs to revert to an older schema version, the library can execute the `.down.sql` files in descending order.

Example: Current Version = 2, Target Rollback Version = 0

DB State (v2) --> Run 0002_*.down.sql --> DB State (v1) --> Run 0001_*.down.sql --> DB State (v0)
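A minimal sketch of driving this process with the golang-migrate library is shown below; the connection string and source path are placeholders, and Perf itself wires migrations up through its own configuration rather than a standalone program like this.

```go
package main

import (
	"errors"
	"log"

	"github.com/golang-migrate/migrate/v4"
	_ "github.com/golang-migrate/migrate/v4/database/cockroachdb" // registers the cockroachdb driver
	_ "github.com/golang-migrate/migrate/v4/source/file"          // registers the file:// source
)

func main() {
	m, err := migrate.New(
		"file://migrations/cockroachdb",
		"cockroachdb://root@localhost:26257/perf?sslmode=disable",
	)
	if err != nil {
		log.Fatal(err)
	}
	// Up() applies every pending .up.sql in version order; ErrNoChange simply
	// means the schema is already at the target version.
	if err := m.Up(); err != nil && !errors.Is(err, migrate.ErrNoChange) {
		log.Fatal(err)
	}
	version, dirty, err := m.Version()
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("schema at version %d (dirty=%v)", version, dirty)
}
```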
The BUILD.bazel file defines a filegroup named cockroachdb which bundles all files under the cockroachdb/ subdirectory. This is likely used by other parts of the Perf build system, perhaps to package these migration scripts or make them accessible to the Perf application when it needs to perform migrations.
The modules directory contains a collection of frontend TypeScript modules that constitute the building blocks of the Perf web application's user interface. These modules primarily define custom HTML elements (web components) and utility functions for various UI functionalities, data processing, and interaction with backend services. The architecture emphasizes modularity, reusability, and a component-based approach, largely leveraging the Lit library for creating custom elements and elements-sk for common UI widgets.
The design philosophy encourages separation of concerns:
Modules like dataframe and progress manage data fetching, processing, and state.
Utility modules like paramtools, pivotutil, cid, and trybot provide common functionalities for data manipulation, key parsing, and specific calculations.
The themes module ensures a consistent visual appearance, building upon infra-sk's theming capabilities.
The json module defines TypeScript interfaces that mirror backend Go structures, ensuring type safety in client-server communication.

This modular structure aims to create a maintainable and scalable frontend codebase. Each module typically includes its core logic, associated styles, demo pages for isolated development and testing, and unit/integration tests.
A significant portion of the modules is dedicated to creating custom HTML elements that serve as interactive UI components. These elements often encapsulate complex behavior and interactions, simplifying their use in higher-level page components.
Data Visualization and Interaction:
plot-simple-sk: A custom-built canvas-based plotting element for rendering interactive line graphs, optimized for performance with features like dual canvases, Path2D objects, and k-d trees for point proximity.
plot-google-chart-sk: An alternative plotting element that wraps the Google Charts library, offering a rich set of features and interactivity like panning, zooming, and trace visibility toggling.
plot-summary-sk: Displays a summary plot (often using Google Charts) and allows users to select a range, which is useful for overview and drill-down scenarios.
chart-tooltip-sk: Provides a detailed, interactive tooltip for data points on charts, showing commit information, anomaly details, and actions like bisection or requesting traces.
graph-title-sk: Displays a structured title for graphs, showing key-value parameter pairs associated with the plotted data.
word-cloud-sk: Visualizes key-value pairs and their frequencies as a textual list with proportional bars.

Alert and Regression Management:
alert-config-sk: A UI for creating and editing alert configurations, including query definition, detection algorithms, and notification settings.
alerts-page-sk: A page for viewing, creating, and managing all alert configurations.
cluster-summary2-sk: Displays a detailed summary of a performance cluster, including a plot, statistics, and triage controls.
anomalies-table-sk: Renders a sortable and interactive table of detected performance anomalies, allowing for grouping and bulk actions like triage and graphing.
anomaly-sk: Displays detailed information about a single performance anomaly.
triage-status-sk: A simple button-like element indicating the current triage status of a cluster and allowing users to initiate the triage process.
triage-menu-sk: Provides a menu for bulk triage actions on selected anomalies, including assigning bugs or marking them as ignored.
new-bug-dialog-sk: A dialog for filing new bugs related to anomalies, pre-filling details.
existing-bug-dialog-sk: A dialog for associating anomalies with existing bug reports.
user-issue-sk: Manages the association of user-reported Buganizer issues with specific data points.
bisect-dialog-sk: A dialog for initiating a Pinpoint bisection process to find the commit causing a regression.
pinpoint-try-job-dialog-sk: A (legacy) dialog for initiating Pinpoint A/B try jobs to request additional traces.
triage-page-sk: A page dedicated to viewing and triaging regressions based on time range and filters.
regressions-page-sk: A page for viewing regressions associated with specific “subscriptions” (e.g., sheriff configs).
subscription-table-sk: Displays details of a subscription and its associated alerts.
revision-info-sk: Displays information about anomalies detected around a specific revision.

Data Input and Selection:
query-sk: A comprehensive UI for constructing complex queries by selecting parameters and their values.
paramset-sk: Displays a set of parameters and their values, often used to summarize a query or data selection.
query-chooser-sk: Combines paramset-sk (for summary) and query-sk (in a dialog) for a compact query selection experience.
query-count-sk: Shows the number of items matching a given query, fetching this count from a backend endpoint.
commit-detail-picker-sk: Allows users to select a specific commit from a range, typically presented in a dialog with date range filtering.
commit-detail-panel-sk: Displays a list of commit details, making them selectable.
commit-detail-sk: Displays information about a single commit with action buttons.
calendar-input-sk: A date input field combined with a calendar picker dialog.
calendar-sk: A standalone interactive calendar widget.
day-range-sk: Allows selection of a “begin” and “end” date.
domain-picker-sk: Allows selection of a data domain either by date range or by a number of recent commits.
test-picker-sk: A guided, multi-step picker for selecting tests or traces by sequentially choosing parameter values.
picker-field-sk: A text input field with a filterable dropdown menu of predefined options, built using Vaadin ComboBox.
algo-select-sk: A dropdown for selecting a clustering algorithm.
split-chart-menu-sk: A menu for selecting an attribute by which to split a chart.
pivot-query-sk: A UI for configuring pivot table requests (group by, operations, summaries).
triage2-sk: A set of three buttons for selecting a triage status (positive, negative, untriaged).
tricon2-sk: An icon that visually represents one of the three triage states.

Data Display and Structure:
pivot-table-sk: Displays pivoted DataFrame data in a sortable table.
json-source-sk: A dialog for viewing the raw JSON source data for a specific trace point.
ingest-file-links-sk: Displays relevant links (e.g., to Swarming, Perfetto) associated with an ingested data point.
point-links-sk: Displays links from ingestion files and generates commit range links between data points.
commit-range-sk: Dynamically generates a URL to a commit range viewer based on begin and end commits.

Scaffolding and Application Structure:
perf-scaffold-sk: Provides the consistent layout, header, and navigation sidebar for all Perf application pages.explore-simple-sk: The core element for exploring and visualizing performance data, including querying, plotting, and anomaly interaction.explore-sk: Wraps explore-simple-sk, adding features like user authentication, default configurations, and optional integration with test-picker-sk.explore-multi-sk: Allows displaying and managing multiple explore-simple-sk graphs simultaneously, with shared controls and shortcut management.favorites-dialog-sk: A dialog for adding or editing bookmarked “favorites” (named URLs).favorites-sk: Displays and manages a user's list of favorites.Backend Interaction and Data Processing Utilities:
cid/cid.ts: Provides lookupCids to fetch detailed commit information based on commit numbers.common/plot-builder.ts & common/plot-util.ts: Utilities for transforming DataFrame and TraceSet data into formats suitable for plotting libraries (especially Google Charts) and for creating consistent chart options.common/test-util.ts: Sets up mocked API responses (fetch-mock) for various backend endpoints, facilitating isolated testing and demo page development.const/const.ts: Defines shared constants, notably MISSING_DATA_SENTINEL for representing missing data points, ensuring consistency with the backend.csv/index.ts: Converts DataFrame objects into CSV format for data export.dataframe/index.ts & dataframe/dataframe_context.ts: Core logic for managing and manipulating DataFrame objects. DataFrameRepository (a LitElement context provider) handles fetching, caching, merging, and providing DataFrame and DataTable objects to consuming components.dataframe/traceset.ts: Utilities for extracting and formatting information from trace keys within DataFrames/DataTables, such as generating chart titles and legends.errorMessage/index.ts: A wrapper around elements-sk's errorMessage to display persistent error messages by default.json/index.ts: Contains TypeScript interfaces and types that define the structure of JSON data exchanged with the backend, crucial for type safety and often auto-generated from Go structs.paramtools/index.ts: Client-side utilities for creating, parsing, and manipulating ParamSet objects and structured trace keys (e.g., makeKey, fromKey, queryFromKey).pivotutil/index.ts: Utilities for validating pivot table requests (pivot.Request) and providing descriptions for pivot operations.progress/progress.ts: Implements startRequest for initiating and polling the status of long-running server-side tasks, providing progress updates to the UI.trace-details-formatter/traceformatter.ts: Provides TraceFormatter implementations (default and Chrome-specific) for converting trace parameter sets to display strings and vice-versa for querying.trybot/calcs.ts: Calculates and aggregates stddevRatio values from Perf trybot results, grouping them by parameter to identify performance impacts.trybot-page-sk: A page for analyzing performance regressions based on commit or trybot run, using trybot/calcs for analysis.window/index.ts: Utilities related to the browser window object, including parsing build tag information from window.perf.image_tag.Core Architectural Patterns:
Custom elements extend ElementSk from infra-sk and use templating (lit-html) and reactive updates.
stateReflector (from infra-sk) is frequently used to synchronize component state with URL query parameters, enabling bookmarking and shareable views (e.g., alerts-page-sk, explore-simple-sk, triage-page-sk).
Lit contexts (@lit/context) are used for providing shared data down the component tree without prop drilling, notably in dataframe/dataframe_context.ts for DataFrame objects.
Components communicate through custom events (e.g., query-sk emits query-change, triage-status-sk emits start-triage).
The fetch API is used for backend communication. Promises and async/await are standard for handling these asynchronous operations. Spinners (spinner-sk) provide user feedback during loading.
Each module is defined as its own build target in BUILD.bazel files. This allows for better organization and easier maintenance.
Modules ship demo pages (*-demo.html, *-demo.ts) for isolated development and visual testing, Karma unit tests (*_test.ts), and Puppeteer end-to-end/screenshot tests (*_puppeteer_test.ts). fetch-mock is extensively used in demos and tests to simulate backend responses.
This comprehensive set of modules forms a rich ecosystem for building and maintaining the Perf application's frontend, with a strong emphasis on modern web development practices and reusability.
The alert module is responsible for validating the configuration of alerts within the Perf system. Its primary function is to ensure that alert definitions adhere to a set of predefined rules, guaranteeing their proper functioning and preventing errors. This module plays a crucial role in maintaining the reliability of the alerting system by catching invalid configurations before they are deployed.
The core design principle behind this module is simplicity and focused responsibility. Instead of incorporating complex validation logic directly into other parts of the system (like the UI or backend services that handle alert creation/modification), this module provides a dedicated, reusable validation function. This promotes modularity and makes the validation logic easier to maintain and update.
The choice of using a simple function (validate) that returns a string (empty for valid, error message for invalid) is intentional. This approach is straightforward to understand and integrate into various parts of the application. It avoids throwing exceptions for validation failures, which can sometimes complicate control flow, and instead provides clear, human-readable feedback.
The current validation is intentionally minimal, focusing on the essential requirement of a non-empty query. This is a pragmatic approach, starting with the most critical validation and allowing for the addition of more complex rules as the system evolves. The dependency on //perf/modules/json:index_ts_lib indicates that the structure of an Alert is defined externally, and this module consumes that definition.
index.ts: This is the central file of the module and contains the logic for validating Alert configurations. It exports the validate(alert: Alert): string function, which takes an Alert object (as defined in the ../json module) as input and inspects it. Currently, it verifies that the query property of the Alert is present and not an empty string. It returns an empty string if the Alert configuration is valid. If any check fails, it returns a string containing a descriptive error message indicating why the Alert is considered invalid. This message is intended to be user-friendly and help in correcting the configuration.
Alert Validation Workflow:
External System (e.g., UI, API) -- Passes Alert object --> [alert/index.ts: validate()]
|
V
[ Is alert.query non-empty? ]
|
+--------------------------+--------------------------+
| (Yes) | (No)
V V
[ Returns "" (empty string) ] [ Returns "An alert must have a non-empty query." ]
| |
V V
External System <-- Receives validation result -- [ Interprets result (valid/invalid) ]
This workflow illustrates how an external system would interact with the validate function. The external system provides an Alert object, and the validate function returns a string. The external system then uses this string to determine if the alert configuration is valid and can proceed accordingly (e.g., save the alert, display an error to the user).
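A minimal sketch of the validate function described above, assuming only the non-empty-query check that exists today (the Alert type comes from the shared ../json module):

```ts
import { Alert } from '../json';

// Returns '' when the Alert is valid, otherwise a human-readable error message.
// The only check currently described is that the query must be non-empty.
export function validate(alert: Alert): string {
  if (!alert.query || alert.query.trim() === '') {
    return 'An alert must have a non-empty query.';
  }
  return '';
}
```

Callers simply branch on the returned string: an empty string means the configuration can be saved, anything else is displayed to the user.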
The alert-config-sk module provides a custom HTML element, <alert-config-sk>, designed for creating and editing alert configurations within the Perf application. This element serves as a user interface for defining the conditions under which an alert should be triggered, how regressions are detected, and where notifications should be sent.
Core Functionality and Design:
The primary goal of alert-config-sk is to offer a comprehensive yet user-friendly way to manage alert settings. It encapsulates all the necessary input fields and logic for defining an Alert object, which is a central data structure in Perf for representing alert configurations.
Key design considerations include:
Instance-specific behavior is driven by global flags (window.perf.notifications, window.perf.display_group_by, window.perf.need_alert_action). This allows the same component to present different options depending on the specific Perf instance's configuration or the user's context. For example, the notification options (email vs. issue tracker) and the visibility of “Group By” settings can change.
The element keeps its working state in an internal _config object, and changes to the element's properties (like config, paramset) trigger re-renders.
It composes existing elements: query-chooser-sk for selecting traces, algo-select-sk for choosing clustering algorithms, and various elements-sk components (e.g., select-sk, multi-select-sk, checkbox-sk) for standard UI inputs. This promotes consistency and reduces redundant code.
Key Components and Files:
alert-config-sk.ts: This is the heart of the module, defining the AlertConfigSk class which extends ElementSk.config: An Alert object representing the current alert configuration being edited. This is the primary data model for the component.paramset: A ParamSet object providing the available parameters and their values for constructing queries (used by query-chooser-sk).key_order: An array of strings dictating the preferred order of keys in the query-chooser-sk.template static method): Uses lit-html to define the structure and content of the element. It dynamically renders sections based on the current configuration and global settings (e.g., window.perf.notifications).query-change from query-chooser-sk, selection-changed from select-sk) to update the _config object.thresholdDescriptors object maps step detection algorithms to their corresponding units and descriptive labels, ensuring the “Threshold” input field is always relevant.? operator in lit-html or if statements in helper functions like _groupBy) is used to show/hide UI elements based on window.perf flags.testBugTemplate(): Sends a POST request to /_/alert/bug/try to test the configured bug URI template.testAlert(): Sends a POST request to /_/alert/notify/try to test the alert notification setup.toDirection(), toConfigState(): Convert string values from UI selections to the appropriate enum types for the Alert object.indexFromStep(): Determines the correct selection index for the “Step Detection” dropdown based on the current _config.step value.alert-config-sk.scss: Contains the SASS styles for the element, ensuring a consistent look and feel within the Perf application. It imports styles from themes_sass_lib and buttons_sass_lib for theming and button styling.alert-config-sk-demo.html and alert-config-sk-demo.ts: Provide a demonstration page for the alert-config-sk element.alert-config-sk and buttons to manipulate global window.perf settings, allowing developers to test different UI states of the component.paramset and config data, and provides event listeners for the control buttons to refresh the alert-config-sk component and display its current state. This is crucial for development and testing.alert-config-sk_puppeteer_test.ts: Contains Puppeteer tests for the component. These tests verify that the component renders correctly in different states (e.g., with/without group_by, different notification options) by interacting with the demo page and taking screenshots.index.ts: A simple entry point that imports and thereby registers the alert-config-sk custom element, making it available for use in HTML.Workflow Example: Editing an Alert
Initialization:
The alert-config-sk element is added to the DOM. Its paramset property is set, providing the available trace parameters, and its config property is set with the Alert object to be edited (or a default new configuration). window.perf settings influence which UI sections are initially visible.
User Interaction:
The user edits fields such as the Query (via query-chooser-sk), Grouping (via algo-select-sk), Step Detection, Threshold, etc. For example, when the user changes a dropdown (a select-sk), it emits selection-changed; alert-config-sk listens for this event, and alert-config-sk.ts updates the corresponding property in its internal _config object (e.g., this._config.step = newStepValue):
User interacts with <select-sk id="step">
|
V
<select-sk> emits 'selection-changed' event
|
V
AlertConfigSk.stepSelectionChanged(event) is called
|
V
this._config.step is updated
|
V
this._render() is (indirectly) called by Lit
|
V
UI updates, e.g., label for "Threshold" input changes
Testing Configuration (Optional):
To test the bug URI template, AlertConfigSk.testBugTemplate() is called, which sends a POST request to /_/alert/bug/try. To test the notification setup, AlertConfigSk.testAlert() is called, which sends a POST request to /_/alert/notify/try.
Saving Changes:
The element embedding alert-config-sk is responsible for retrieving the updated config object from the alert-config-sk element (e.g., element.config) and persisting it (e.g., by sending it to a backend API). alert-config-sk itself does not handle the saving of the configuration to a persistent store.
This element aims to simplify the complex task of configuring alerts by providing a structured and reactive interface, abstracting away the direct manipulation of the underlying Alert JSON object for the end-user.
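A minimal sketch of what that embedding code might look like, reusing the /_/alert/update endpoint described for the alerts page below; the element query and error handling are illustrative only:

```ts
import { AlertConfigSk } from './alert-config-sk';

// Hypothetical host-page handler: read the edited Alert off the element and
// persist it. alert-config-sk itself never writes to the backend.
async function saveEditedAlert(): Promise<void> {
  const el = document.querySelector<AlertConfigSk>('alert-config-sk')!;
  const updated = el.config; // The Alert object as edited by the user.
  const resp = await fetch('/_/alert/update', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(updated),
  });
  if (!resp.ok) {
    throw new Error(`Failed to save alert: ${resp.statusText}`);
  }
}
```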
The alerts-page-sk module provides a user interface for managing and configuring alerts within the Perf application. Users can view, create, edit, and delete alert configurations. The page displays existing alerts in a table and provides a dialog for detailed configuration of individual alerts. It interacts with a backend API to fetch and persist alert data.
Why a dedicated page for alerts? Centralizing alert management provides a clear and focused interface for users responsible for monitoring performance metrics. This separation of concerns simplifies the overall application structure and user experience.
How are alerts displayed and managed? Alerts are displayed in a tabular format, offering a quick overview of key information like name, query, owner, and status. Icons are used for common actions like editing and deleting, enhancing usability. A modal dialog, utilizing the <dialog> HTML element and the alert-config-sk component, is employed for focused editing of individual alert configurations. This approach avoids cluttering the main page and provides a dedicated space for detailed settings.
Why use Lit for templating? Lit is used for its efficient rendering and component-based architecture. This allows for a declarative way to define the UI and manage its state, making the code more maintainable and easier to understand. The use of html tagged template literals provides a clean and JavaScript-native way to write templates.
How is user authorization handled? The page checks if the logged-in user has an “editor” role. This is determined by fetching the user's status from /_/login/status. Editing and creation functionalities are disabled if the user lacks the necessary permissions, preventing unauthorized modifications. The logged-in user's email is also pre-filled as the owner for new alerts.
Why is fetch-mock used in the demo? fetch-mock is utilized in the demo (alerts-page-sk-demo.ts) to simulate backend API responses. This allows for isolated testing and development of the frontend component without requiring a running backend. It enables developers to define expected responses for various API endpoints, facilitating a predictable environment for UI development and testing.
How are API interactions handled? The component uses the fetch API to communicate with the backend. Helper functions like jsonOrThrow and okOrThrow are used to simplify response handling and error management. Specific endpoints are used for listing (/_/alert/list/...), creating (/_/alert/new), updating (/_/alert/update), and deleting (/_/alert/delete/...) alerts.
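As an illustration of that pattern, here is a sketch of how the alert list might be fetched; the errorMessage import path and the exact response type are assumptions, while the endpoint and helpers are the ones named above:

```ts
import { jsonOrThrow } from '../../../infra-sk/modules/jsonOrThrow';
import { errorMessage } from '../errorMessage';
import { Alert } from '../json';

// Fetch the current alert configurations, parsing JSON or throwing on failure,
// and surface any error to the user via errorMessage.
function fetchAlerts(showDeleted: boolean): Promise<Alert[] | void> {
  return fetch(`/_/alert/list/${showDeleted}`)
    .then(jsonOrThrow)
    .then((alerts: Alert[]) => alerts)
    .catch(errorMessage);
}
```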
Why distinguish between “Alert” and “Component” in the UI? The UI adapts to display either an “Alert” field or an “Issue Tracker Component” field based on the window.perf.notifications global setting. This allows the application to integrate with different notification systems. If markdown_issuetracker is configured, it links directly to the relevant issue tracker component.
alerts-page-sk.ts: This is the core TypeScript file defining the AlertsPageSk custom element.
connectedCallback(): Initializes the component by fetching initial data (paramset and alert list).list(): Fetches and re-renders the list of alerts.add(): Initiates the creation of a new alert by fetching a default configuration from the server and opening the edit dialog.edit(): Opens the edit dialog for an existing alert.accept(): Handles the submission of changes from the edit dialog, sending an update request to the server.delete(): Sends a request to the server to delete an alert.openOnLoad(): Checks the URL for an alert ID on page load and, if present, opens the edit dialog for that specific alert. This allows for direct linking to an alert's configuration.alerts: An array holding the currently displayed alert configurations._cfg: The Alert object currently being edited in the dialog.isEditor: A boolean indicating if the current user has editing privileges.dialog: A reference to the HTML <dialog> element used for editing.alertconfig: A reference to the alert-config-sk element within the dialog.alerts-page-sk.scss: Contains the SASS/CSS styles for the alerts-page-sk element.
alerts-page-sk-demo.ts: Provides a demonstration and development environment for the alerts-page-sk component.
fetch-mock to simulate backend API responses for /login/status, /_/count/, /_/alert/update, /_/alert/list/..., /_/initpage/, and /_/alert/new. This allows the component to be developed and tested in isolation.window.perf properties that might affect the component's behavior (e.g., key_order, display_group_by, notifications).alerts-page-sk elements into the demo HTML page.alerts-page-sk-demo.html: The HTML structure for the demo page.
alerts-page-sk component is rendered for demonstration purposes. Includes an <error-toast-sk> for displaying error messages.alerts-page-sk_puppeteer_test.ts: Contains Puppeteer tests for the alerts-page-sk component.
index.ts: A simple entry point that imports and thereby registers the alerts-page-sk custom element.
1. Viewing Alerts:
User navigates to the alerts page
|
V
alerts-page-sk.connectedCallback()
|
+----------------------+
| |
V V
fetch('/_/initpage/') fetch('/_/alert/list/false') // Fetch paramset and initial alert list
| |
V V
Update `paramset` Update `alerts` array
| |
+----------------------+
|
V
_render() // Lit renders the table with alerts
2. Creating a New Alert:
User clicks "New" button (if isEditor === true)
|
V
alerts-page-sk.add()
|
V
fetch('/_/alert/new') // Get a template for a new alert
|
V
Update `cfg` with the new alert template (owner set to current user)
|
V
dialog.showModal() // Show the alert-config-sk dialog
|
V
User fills in alert details in alert-config-sk
|
V
User clicks "Accept"
|
V
alerts-page-sk.accept()
|
V
cfg = alertconfig.config // Get updated config from alert-config-sk
|
V
fetch('/_/alert/update', { method: 'POST', body: JSON.stringify(cfg) }) // Send new alert to backend
|
V
alerts-page-sk.list() // Refresh the alert list
3. Editing an Existing Alert:
User clicks "Edit" icon next to an alert (if isEditor === true)
|
V
alerts-page-sk.edit() with the selected alert's data
|
V
Set `origCfg` (deep copy of current `cfg`)
Set `cfg` to the selected alert's data
|
V
dialog.showModal() // Show the alert-config-sk dialog pre-filled with alert data
|
V
User modifies alert details in alert-config-sk
|
V
User clicks "Accept"
|
V
alerts-page-sk.accept()
|
V
cfg = alertconfig.config // Get updated config
|
V
IF JSON.stringify(cfg) !== JSON.stringify(origCfg) THEN
fetch('/_/alert/update', { method: 'POST', body: JSON.stringify(cfg) }) // Send updated alert
|
V
alerts-page-sk.list() // Refresh list
ENDIF
4. Deleting an Alert:
User clicks "Delete" icon next to an alert (if isEditor === true)
|
V
alerts-page-sk.delete() with the selected alert's ID
|
V
fetch('/_/alert/delete/{alert_id}', { method: 'POST' }) // Send delete request
|
V
alerts-page-sk.list() // Refresh the alert list
5. Toggling “Show Deleted Configs”:
User clicks "Show deleted configs" checkbox
    |
    V
alerts-page-sk.showChanged()
    |
    V
Update `showDeleted` property based on checkbox state
    |
    V
alerts-page-sk.list() // Fetches alerts based on the new `showDeleted` state
The algo-select-sk module provides a custom HTML element that allows users to select a clustering algorithm. This component is crucial for applications where different clustering approaches might yield better results depending on the data or the analytical goal.
The core purpose of this module is to present a user-friendly way to switch between available clustering algorithms, specifically “k-means” and “stepfit”. It encapsulates the selection logic and emits an event when the chosen algorithm changes, allowing other parts of the application to react accordingly.
The “why” behind this module is the need for a standardized and reusable UI component for algorithm selection. Instead of each part of an application implementing its own dropdown or radio buttons for algorithm choice, algo-select-sk provides a consistent look and feel.
The “how” involves leveraging the select-sk custom element from the elements-sk library to provide the actual dropdown functionality. algo-select-sk builds upon this by:
providing an algo attribute (and corresponding property) to store and reflect the currently selected algorithm, and by emitting an algo-change event with the new algorithm in the detail object. This decoupling allows other components to listen for changes without direct dependencies on algo-select-sk.
The choice to use select-sk as a base provides consistent styling and behavior aligned with other elements in the Skia infrastructure.
algo-select-sk.ts: This is the heart of the module.AlgoSelectSk class: This ElementSk subclass defines the custom element's behavior.template: Uses lit-html to render the underlying select-sk element with predefined div elements representing the algorithm options (“K-Means” and “Individual” which maps to “stepfit”). The selected attribute on these divs is dynamically updated based on the current algo property.connectedCallback and attributeChangedCallback: Ensure the element renders correctly when added to the DOM or when its algo attribute is changed programmatically._selectionChanged method: This is the event handler for the selection-changed event from the inner select-sk element. When triggered, it updates the algo property of algo-select-sk and then dispatches the algo-change custom event. This is the primary mechanism for communicating the selected algorithm to the outside world. User interacts with <select-sk> | V <select-sk> emits 'selection-changed' event | V AlgoSelectSk._selectionChanged() is called | V Updates internal 'algo' property | V Dispatches 'algo-change' event with { algo: "new_value" }algo getter/setter: Provides a programmatic way to get and set the selected algorithm. The setter ensures that only valid algorithm values (‘kmeans’ or ‘stepfit’) are set, defaulting to ‘kmeans’ for invalid inputs. This adds a layer of robustness.toClusterAlgo function: A utility function to validate and normalize the input string to one of the allowed ClusterAlgo types. This prevents invalid algorithm names from being propagated.AlgoSelectAlgoChangeEventDetail interface: Defines the structure of the detail object for the algo-change event, ensuring type safety for event consumers.algo-select-sk.scss: Provides minimal styling, primarily ensuring that the cursor is a pointer when hovering over the element, indicating interactivity. It imports shared color and theme styles.index.ts: A simple entry point that imports algo-select-sk.ts, ensuring the custom element is defined and available for use when the module is imported.algo-select-sk-demo.html and algo-select-sk-demo.ts: These files provide a demonstration page for the algo-select-sk element.algo-select-sk, including one with a pre-selected algorithm and one in dark mode, to showcase its appearance.algo-change event from one of the instances and displays the event detail in a <pre> tag. This serves as a live example of how to consume the event.algo-select-sk_puppeteer_test.ts: Contains Puppeteer tests to verify the component renders correctly and basic functionality. It checks for the presence of the elements on the demo page and takes a screenshot for visual regression testing.The component is designed to be self-contained and easy to integrate. By simply including the element in HTML and listening for the algo-change event, developers can incorporate algorithm selection functionality into their applications.
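A small usage sketch of that integration; the element registration via the module's index import and the query selector are illustrative:

```ts
import './index'; // Registers the <algo-select-sk> custom element.
import { AlgoSelectAlgoChangeEventDetail } from './algo-select-sk';

// React to the user picking a different clustering algorithm.
const selector = document.querySelector('algo-select-sk')!;
selector.addEventListener('algo-change', (e: Event) => {
  const { algo } = (e as CustomEvent<AlgoSelectAlgoChangeEventDetail>).detail;
  // algo is either 'kmeans' or 'stepfit'.
  console.log(`Selected clustering algorithm: ${algo}`);
});
```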
The anomalies-table-sk module provides a custom HTML element for displaying a sortable and interactive table of performance anomalies. Its primary purpose is to present anomaly data in a clear, actionable format, allowing users to quickly identify, group, triage, and investigate performance regressions or improvements.
Anomaly objects in a tabular format. Each row represents an anomaly and displays key information such as bug ID, revision range, test path, and metrics like delta percentage and absolute delta.triage-menu-sk to allow users to assign bug IDs, mark anomalies as invalid or ignored, or reset their triage state./u/?anomalyIDs=...)./m/...)./u/?bugID=...).groupAnomalies method iterates through the anomaly list, merging anomalies into existing groups if their revision ranges intersect, or creating new groups otherwise.sort-sk element, which observes changes to data attributes on the table rows. This avoids server roundtrips for simple sorting operations.this._render()) only when necessary, such as when data changes, groups are expanded/collapsed, or selections are updated. This improves performance.AnomalyGroup Class: A simple AnomalyGroup class is used to manage collections of related anomalies and their expanded state. This provides a clear structure for handling grouped data.showPopup boolean property.anomalies_checked when the selection state of an anomaly changes. This allows parent components or other parts of the application to react to user selections./_anomalies/group_report backend API. This API is designed to provide a consolidated view or a shared identifier (sid) for a group of anomalies, which is then used to construct the graph URL. This is preferred over constructing potentially very long URLs with many individual anomaly IDs.group_report API to provide context (one week before and after the anomaly) in the graph.ChromeTraceFormatter to correctly format trace queries for linking to the graph explorer.themes_sass_lib, buttons_sass_lib, and select_sass_lib for a consistent look and feel. Specific styles handle the appearance of regression vs. improvement, expanded rows, and the triage popup.anomalies-table-sk.ts: This is the core file containing the LitElement class definition for AnomaliesTableSk. It implements all the logic for rendering the table, handling user interactions, grouping anomalies, and interacting with backend services for triage and graphing.populateTable(anomalyList: Anomaly[]): The primary method to load data into the table. It triggers grouping and rendering.generateTable(), generateGroups(), generateRows(): Template methods responsible for constructing the HTML structure of the table using lit-html.groupAnomalies(): Implements the logic for grouping anomalies based on overlapping revision ranges.openReport(): Handles the logic for generating a URL to graph the selected anomalies, potentially calling the /_anomalies/group_report API.togglePopup(): Manages the visibility of the triage menu popup.anomalyChecked(): Handles checkbox state changes and updates the checkedAnomaliesSet.openMultiGraphUrl(): Constructs the URL for viewing an anomaly's trend in the multi-graph explorer, fetching time range context via an API call.anomalies-table-sk.scss: Contains the SCSS styles specific to the anomalies table, defining its layout, appearance, and the styling for different states (e.g., improvement, regression, expanded rows).index.ts: A simple entry point that imports and registers the anomalies-table-sk custom element.anomalies-table-sk-demo.ts and anomalies-table-sk-demo.html: Provide a demonstration page for the component, showcasing its usage with sample data and interactive buttons to populate the table and retrieve checked anomalies. The demo also sets up a global window.perf object with configuration typically provided by the Perf application environment.1. 
Displaying and Grouping Anomalies:
[User Action: Page Load with Anomaly Data]
    |
    v
AnomaliesTableSk.populateTable(anomalyList)
    |
    v
AnomaliesTableSk.groupAnomalies()
    |-> For each Anomaly in anomalyList:
    |     |-> Try to merge with existing AnomalyGroup (if revision ranges intersect)
    |     |-> Else, create new AnomalyGroup
    |
    v
AnomaliesTableSk._render()
    |
    v
[DOM Update: Table is rendered with grouped anomalies, groups initially collapsed]
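A simplified sketch of the grouping step above, assuming the Anomaly type from ../json exposes start_revision and end_revision; the merge policy is an approximation of AnomaliesTableSk.groupAnomalies, not a copy of it:

```ts
import { Anomaly } from '../json';

// Anomalies whose [start_revision, end_revision] ranges intersect are merged
// into one group; otherwise a new group is created.
class AnomalyGroup {
  anomalies: Anomaly[] = [];
  expanded: boolean = false;
}

function rangesIntersect(a: Anomaly, b: Anomaly): boolean {
  return a.start_revision <= b.end_revision && b.start_revision <= a.end_revision;
}

function groupAnomalies(anomalyList: Anomaly[]): AnomalyGroup[] {
  const groups: AnomalyGroup[] = [];
  for (const anomaly of anomalyList) {
    const existing = groups.find((g) => g.anomalies.some((other) => rangesIntersect(anomaly, other)));
    if (existing) {
      existing.anomalies.push(anomaly);
    } else {
      const group = new AnomalyGroup();
      group.anomalies.push(anomaly);
      groups.push(group);
    }
  }
  return groups;
}
```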
2. Selecting and Triaging Anomalies:
[User Action: Clicks checkbox for an anomaly or group]
    |
    v
AnomaliesTableSk.anomalyChecked() or AnomalySk.toggleChildrenCheckboxes()
    |-> Updates `checkedAnomaliesSet`
    |-> Updates header checkbox state if needed
    |-> Emits 'anomalies_checked' event
    |-> Enables/Disables "Triage" and "Graph" buttons based on selection
    |
    v
[User Action: Clicks "Triage" button (if enabled)]
    |
    v
AnomaliesTableSk.togglePopup()
    |-> Shows TriageMenuSk popup
    |-> TriageMenuSk.setAnomalies(checkedAnomalies)
    |
    v
[User interacts with TriageMenuSk (e.g., assigns bug, marks invalid)]
    |
    v
TriageMenuSk makes API request (e.g., to /_/triage)
    |
    v
[Application reloads data or updates table based on triage result]
3. Graphing Selected Anomalies:
[User Action: Selects one or more anomalies]
|
v
[User Action: Clicks "Graph" button (if enabled)]
|
v
AnomaliesTableSk.openReport()
|
|--> If single anomaly selected:
| |-> window.open(`/u/?anomalyIDs={id}`, '_blank')
|
|--> If multiple anomalies selected:
|-> Call fetchGroupReportApi(idString)
| |-> POST to /_/anomalies/group_report with anomaly IDs
| |-> Receives response with `sid` (shared ID)
|
|-> window.open(`/u/?sid={sid}`, '_blank')
4. Expanding/Collapsing an Anomaly Group:
[User Action: Clicks expand/collapse button on a group row]
    |
    v
AnomaliesTableSk.expandGroup(anomalyGroup)
    |-> Toggles `anomalyGroup.expanded` boolean
    |
    v
AnomaliesTableSk._render()
    |
    v
[DOM Update: Rows within the group are shown or hidden]
The anomaly-sk module provides a custom HTML element <anomaly-sk> and related functionalities for displaying details about performance anomalies. It's designed to present information about a specific anomaly, including its severity, the affected revision range, and a link to the associated bug report. A key utility function, getAnomalyDataMap, is also provided to process raw anomaly data into a format suitable for plotting.
Key Responsibilities and Components:
anomaly-sk.ts: This is the core file defining the <anomaly-sk> custom element.
The element extends ElementSk and uses the lit-html library for templating. It accepts an Anomaly object as a property and dynamically renders a table displaying information like the score before and after the anomaly, percentage change, revision range, improvement status, and bug ID. It uses the lookupCids function from the cid module to construct a clickable link to the commit range, and the bug_host_url property allows customization of the bug tracker URL. The formatRevisionRange method asynchronously fetches commit hashes for the start and end revisions of the anomaly to create a link to the commit range view. If window.perf.commit_range_url is not defined, it simply displays the revision numbers.
getAnomalyDataMap (function in anomaly-sk.ts):
This function transforms raw anomaly data into a format suitable for plotting with plot-simple-sk; it bridges the gap between the raw data representation and the visual representation of anomalies on a graph. Its inputs are a TraceSet (a collection of traces), a ColumnHeader[] (representing commit points on the x-axis), an AnomalyMap (mapping trace IDs and commit IDs to Anomaly objects), and a list of highlight_anomalies IDs. It iterates over each trace in the TraceSet; if a trace has anomalies listed in the AnomalyMap, it then iterates through those anomalies, matching each anomaly's commit ID (cid) with the offset in the ColumnHeader. A crucial detail is that if an exact commit ID match isn't found in the header (e.g., due to a data upload failure for that specific commit), it will associate the anomaly with the next available commit point. This ensures that anomalies are still visualized even if their precise commit data point is missing, rather than being omitted entirely. An anomaly is flagged as highlighted if its ID appears in the highlight_anomalies input. The output maps each trace ID to a list of AnomalyData objects, each containing the x, y coordinates, the Anomaly object itself, and a highlight flag.
Input:
TraceSet: { "traceA": [10, 12, 15*], ... } (*value at commit 101)
Header: [ {offset: 99}, {offset: 100}, {offset: 101} ]
AnomalyMap: { "traceA": { "101": AnomalyObjectA } }
HighlightList: []
getAnomalyDataMap
|
V
Output:
{
"traceA": [
{ x: 2, y: 15, anomaly: AnomalyObjectA, highlight: false }
],
...
}
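A small sketch of the header-lookup rule illustrated above, assuming ColumnHeader and CommitNumber from the shared ../json module; the real getAnomalyDataMap applies this per trace and also sets the highlight flag:

```ts
import { ColumnHeader, CommitNumber } from '../json';

// Find the x index whose commit offset matches the anomaly's commit id; if
// that exact commit is missing from the header, fall back to the next
// available commit point so the anomaly is still plotted.
function xIndexForCommit(header: ColumnHeader[], cid: CommitNumber): number {
  for (let i = 0; i < header.length; i++) {
    if (header[i].offset >= cid) {
      return i;
    }
  }
  return -1; // Commit is beyond the displayed range.
}
```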
anomaly-sk.scss: This file contains the SCSS styles for the <anomaly-sk> element.
themes_sass_lib).th and td elements within the anomaly-sk component.anomaly-sk-demo.html and anomaly-sk-demo.ts: These files set up a demonstration page for the <anomaly-sk> element.
anomaly-sk-demo.html includes instances of <anomaly-sk> with different IDs. anomaly-sk-demo.ts initializes these components with sample Anomaly data. It also mocks the /_/cid/ API endpoint using fetch-mock to simulate responses for commit detail lookups, which is crucial for the formatRevisionRange functionality to work in the demo. Global window.perf configurations are also set up, as the component relies on them (e.g., commit_range_url).Test Files (anomaly-sk_test.ts, anomaly-sk_puppeteer_test.ts):
anomaly-sk_test.ts: Contains unit tests for the getAnomalyDataMap function (verifying its mapping logic, especially the handling of missing commit points) and for static utility methods within AnomalySk like formatPercentage and the asynchronous formatRevisionRange. It uses fetch-mock to control API responses for CID lookups.anomaly-sk_puppeteer_test.ts: Contains browser-based integration tests using Puppeteer. It verifies that the demo page renders correctly and takes screenshots for visual regression testing.Workflow for Displaying an Anomaly:
Anomaly object is passed to the anomaly property of the <anomaly-sk> element. <anomaly-sk .anomaly=${someAnomalyObject}></anomaly-sk>set anomaly() setter in AnomalySk is triggered.this.formatRevisionRange() to asynchronously prepare the revision range display.formatRevisionRange extracts start_revision and end_revision.lookupCids([start_rev_num, end_rev_num]) which makes a POST request to /_/cid/.window.perf.commit_range_url is set, it constructs an <a> tag with the URL populated with the fetched hashes. Otherwise, it just formats the revision numbers as text.TemplateResult is stored in this._revision.this._render() is called, which re-renders the component's template.AnomalySk.template) displays the table:getPercentChange).this.revision template generated in step 3).AnomalySk.formatBug, potentially linking to this.bugHostUrl).This module effectively isolates the presentation and data transformation logic related to individual anomalies, making it a maintainable and reusable piece of the Perf frontend. The handling of potentially missing data points in getAnomalyDataMap shows a robust design choice for dealing with real-world data imperfections.
The bisect-dialog-sk module provides a user interface element for initiating a bisection process within the Perf application. This is specifically designed to help pinpoint the commit that introduced a performance regression or improvement, primarily for Chrome.
The primary responsibility of this module is to present a dialog to the user, pre-filled with relevant information extracted from a chart tooltip (e.g., when a user identifies an anomaly in a performance graph). It allows the user to confirm or modify these parameters and then submit a request to the backend to start a bisection task.
Performance analysis often involves identifying the exact change that caused a shift in metrics. A manual bisection process can be tedious and error-prone. This dialog streamlines the process by pre-filling the relevant parameters and submitting the bisect request on the user's behalf. It also relies on the alogin-sk module to fetch the logged-in user's email, which is a required parameter for the bisect request.
Initialization and Pre-filling:
The setBisectInputParams method is called with details like the testPath, startCommit, endCommit, bugId, story, and anomalyId.
User Interaction and Submission:
The open() method displays the modal dialog. When the user submits the form, the postBisect method is invoked.
Request Construction and API Call:
postBisect gathers the current values from the form fields and parses the testPath to extract components like the benchmark, chart, and statistic. The logic for deriving chart and statistic involves checking the last part of the test name against a predefined list of STATISTIC_VALUES (e.g., “avg”, “count”). A CreateBisectRequest object is constructed with all the necessary parameters, and a fetch call is made to the /_/bisect/create endpoint with the JSON payload.
Response Handling:
On success, the dialog is closed; on error, a message is displayed via errorMessage and the dialog remains open, allowing the user to correct any issues or retry.
Simplified Bisect Request Workflow:
User Clicks Bisect Trigger (e.g., on chart)
|
V
Calling Code prepares `BisectPreloadParams`
|
V
`bisect-dialog-sk.setBisectInputParams(params)`
|
V
`bisect-dialog-sk.open()`
|
V
Dialog is Displayed (pre-filled)
|
V
User reviews/modifies data & Clicks "Bisect"
|
V
`bisect-dialog-sk.postBisect()`
|
V
`testPath` is parsed (extract benchmark, chart, statistic)
|
V
`CreateBisectRequest` object is built
|
V
`fetch POST /_/bisect/create` with request data
|
V
Handle API Response:
- Success -> Close dialog, Show success notification (external)
- Error -> Show error message, Keep dialog open
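A hedged sketch of the chart/statistic derivation step in the workflow above; the split-on-underscore rule and the abbreviated STATISTIC_VALUES list are assumptions based on the description, not a copy of the element's code:

```ts
// Values named in the documentation above; the real constant may contain more.
const STATISTIC_VALUES = ['avg', 'count', 'min', 'max'];

// If the last underscore-separated part of the test name is a known statistic,
// treat it as the statistic and the rest as the chart name.
function splitChartAndStatistic(testName: string): { chart: string; statistic: string } {
  const parts = testName.split('_');
  const last = parts[parts.length - 1];
  if (parts.length > 1 && STATISTIC_VALUES.includes(last)) {
    return { chart: parts.slice(0, -1).join('_'), statistic: last };
  }
  return { chart: testName, statistic: '' };
}
```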
bisect-dialog-sk.ts: This is the core TypeScript file defining the BisectDialogSk custom element.
BisectDialogSk class: Extends ElementSk and manages the dialog's state, rendering, and interaction logic.BisectPreloadParams interface: Defines the structure of the initial data passed to the dialog.template: A lit-html template defining the dialog's HTML structure, including input fields for test path, bug ID, start/end commits, story, and an optional patch. It also includes a close icon, a spinner for loading states, and submit/close buttons.connectedCallback(): Initializes the element, sets up property upgrades, queries for DOM elements (dialog, form, spinner, button), and attaches an event listener to the form‘s submit event. It also fetches the logged-in user’s status.setBisectInputParams(): Populates the internal state and input fields with data provided externally.open(): Shows the modal dialog and ensures the submit button is enabled.closeBisectDialog(): Closes the dialog.postBisect(): This is the heart of the submission logic. It:testPath to extract various components required for the bisect request (like benchmark, chart, story, statistic). The logic for chart and statistic derivation is particularly important here.CreateBisectRequest payload.POST request to the /_/bisect/create endpoint.STATISTIC_VALUES: A constant array used to determine if the last part of a test name is a statistic (e.g., avg, min, max).bisect-dialog-sk.scss: Contains the SASS styles for the dialog, ensuring it aligns with the application's theme. It styles the dialog itself, input fields, and the footer elements.
index.ts: A simple entry point that imports and thus registers the bisect-dialog-sk custom element.
BUILD.bazel: Defines the build rules for this module, specifying its dependencies (SASS, TypeScript, other SK elements like alogin-sk, select-sk, spinner-sk, close-icon-sk) and sources. The dependencies highlight its reliance on common UI components and infrastructure modules for features like login status and error messaging.
Custom element (ElementSk): Encapsulating the dialog as a custom element promotes reusability and modularity. It can be easily integrated into different parts of the Perf application where bisection capabilities are needed.
lit-html for templating: Provides an efficient and declarative way to define the dialog's HTML structure and update it based on its state.
Dedicated backend endpoint (/_/bisect/create): The bisect API is tailored for Chrome's bisection infrastructure; the project: 'chromium' field in the request payload confirms this.
Error handling: Using jsonOrThrow and errorMessage provides a standard way to handle API errors and inform the user.
Loading feedback: The spinner-sk element gives visual feedback during the asynchronous fetch operation, improving user experience.
The calendar-input-sk module provides a user-friendly way to select dates. It combines a standard text input field for manual date entry with a button that reveals a calendar-sk element within a dialog for visual date picking. This approach offers flexibility for users who prefer typing dates directly and those who prefer a visual calendar interface.
calendar-input-sk.ts: This is the core file defining the CalendarInputSk custom element.
<input type="text"> element for direct date input. A pattern attribute ([0-9]{4}-[0-9]{1,2}-[0-9]{1,2}) and a title are used to guide the user on the expected YYYY-MM-DD format. An error indicator (✗) is shown if the input doesn't match the pattern.<button> element, styled with a date-range-icon-sk, triggers the display of the calendar.<dialog> element is used to present the calendar-sk element. This choice simplifies the implementation of modal behavior.openHandler method is responsible for showing the dialog. It uses a Promise to manage the asynchronous nature of user interaction with the dialog (either selecting a date or canceling). This makes the event handling logic cleaner and easier to follow.inputChangeHandler is triggered when the user types into the text field. It validates the input against the defined pattern. If valid, it parses the date string and updates the displayDate property.calendarChangeHandler is invoked when a date is selected from the calendar-sk component within the dialog. It resolves the aforementioned Promise with the selected date.dialogCancelHandler is called when the dialog is closed without a date selection (e.g., by pressing the “Cancel” button or the Escape key). It rejects the Promise.input custom event (of type CustomEvent<Date>) is dispatched whenever the selected date changes, whether through the text input or the calendar dialog. This allows parent components to react to date selections.displayDate property acts as the single source of truth for the currently selected date. Setting this property will update both the text input and the date displayed in the calendar-sk when it's opened.lit-html library for templating, providing a declarative way to define the element's structure and efficiently update the DOM.ElementSk, inheriting common functionalities for Skia custom elements.calendar-input-sk.scss: This file contains the styling for the calendar-input-sk element.
--error, --on-surface, --surface-1dp) for theming, allowing the component‘s appearance to adapt to different contexts (like dark mode). The .invalid class is conditionally displayed based on the input field’s validity state using the :invalid pseudo-class.index.ts: This file simply imports and thereby registers the calendar-input-sk custom element.
calendar-input-sk-demo.html / calendar-input-sk-demo.ts: These files constitute a demonstration page for the calendar-input-sk element.
<calendar-input-sk> in various configurations. The TypeScript file initializes these instances, sets initial displayDate values, and demonstrates how to listen for the input event. It also shows an example of programmatically setting an invalid value in one of the input fields.1. Selecting a Date via Text Input:
User types "2023-10-26" into text input
|
V
inputChangeHandler in calendar-input-sk.ts
|
+-- (Input is valid: matches pattern "YYYY-MM-DD") --> Parse "2023-10-26" into a Date object
| |
| V
| Update _displayDate property
| |
| V
| Render component (updates input field's .value)
| |
| V
| Dispatch "input" CustomEvent<Date>
|
+-- (Input is invalid: e.g., "2023-") --> Do nothing (CSS shows error indicator)
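A small sketch of the validation/parsing step in the flow above, reusing the pattern attribute shown earlier; the helper name is illustrative:

```ts
// Accept only strings matching the YYYY-MM-DD pattern used by the input field;
// return null for anything else so the caller can simply ignore invalid input.
function parseDateInput(value: string): Date | null {
  if (!/^[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}$/.test(value)) {
    return null;
  }
  const [year, month, day] = value.split('-').map(Number);
  return new Date(year, month - 1, day); // Months are zero-indexed in JS Dates.
}
```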
2. Selecting a Date via Calendar Dialog:
User clicks calendar button
|
V
openHandler in calendar-input-sk.ts
|
V
dialog.showModal() is called
|
V
<dialog> with <calendar-sk> is displayed
|
+-- User selects a date in <calendar-sk> --> <calendar-sk> dispatches "change" event
| |
| V
| calendarChangeHandler in calendar-input-sk.ts
| |
| V
| dialog.close()
| |
| V
| Promise resolves with the selected Date
|
+-- User clicks "Cancel" button or presses Esc --> dialog dispatches "cancel" event
|
V
dialogCancelHandler in calendar-input-sk.ts
|
V
dialog.close()
|
V
Promise rejects
If Promise resolves (date selected):
openHandler continues after await
|
V
Update _displayDate property with the resolved Date
|
V
Render component (updates input field's .value)
|
V
Dispatch "input" CustomEvent<Date>
|
V
Focus on the text input field
The design emphasizes a clear separation of concerns: the calendar-sk handles the visual calendar logic, while calendar-input-sk manages the integration of text input and the dialog presentation. The use of a Promise in openHandler simplifies the handling of the asynchronous dialog interaction, leading to more readable and maintainable code.
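The same pattern can be sketched as a standalone helper; the real element stores resolve/reject on instance fields and also re-renders and re-focuses the text input, so this version is only illustrative:

```ts
// Wrap the dialog interaction in a Promise: resolve on the calendar's 'change'
// event, reject when the <dialog> is cancelled (e.g., via the Escape key).
function pickDate(dialog: HTMLDialogElement, calendar: HTMLElement): Promise<Date> {
  return new Promise<Date>((resolve, reject) => {
    const onChange = (e: Event) => {
      cleanup();
      dialog.close();
      resolve((e as CustomEvent<Date>).detail);
    };
    const onCancel = () => {
      cleanup();
      reject(new Error('date selection cancelled'));
    };
    const cleanup = () => {
      calendar.removeEventListener('change', onChange);
      dialog.removeEventListener('cancel', onCancel);
    };
    calendar.addEventListener('change', onChange);
    dialog.addEventListener('cancel', onCancel);
    dialog.showModal();
  });
}
```

On resolution the caller would update displayDate, dispatch the input event, and focus the text field, as described in the flow above.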
The calendar-sk module provides a custom HTML element <calendar-sk> that displays an interactive monthly calendar. This element was created to address limitations with the native HTML <input type="date"> element, specifically its lack of Safari support and the inability to style the pop-up calendar. Furthermore, it aims to be more themeable and accessible than other existing web component solutions like Elix.
The core philosophy behind calendar-sk is to provide a user-friendly, accessible, and customizable date selection experience. Accessibility is a key consideration, with design choices informed by WAI-ARIA practices for date pickers. This includes keyboard navigation and appropriate ARIA attributes.
Key Responsibilities and Components:
calendar-sk.ts: This is the heart of the module, defining the CalendarSk custom element which extends ElementSk.lit-html library for templating, dynamically generating the HTML for the calendar grid. The calendar displays one month at a time.CalendarSk.template) constructs the overall table structure, including navigation buttons for changing the year and month, and headers for the year and month.CalendarSk.rowTemplate is responsible for rendering each week (row) of the calendar.CalendarSk.buttonForDateTemplate creates the individual day buttons. It handles logic for disabling buttons for dates outside the current month and highlighting the selected date and today's date._displayDate (a JavaScript Date object) which represents the currently selected or focused date.CalendarDate class is a helper to simplify comparisons of year, month, and date, as JavaScript Date objects can be tricky with timezones and direct comparisons.getNumberOfDaysInMonth and firstDayIndexOfMonth are used to correctly layout the days within the grid.navigate-before-icon-sk and navigate-next-icon-sk) for incrementing/decrementing the month and year. Methods like incYear, decYear, incMonth, and decMonth handle the logic for updating _displayDate and re-rendering. A crucial detail in month/year navigation is handling cases where the current day (e.g., 31st) doesn't exist in the target month (e.g., February). In such scenarios, the date is adjusted to the last valid day of the target month.keyboardHandler method implements navigation using arrow keys (day/week changes) and PageUp/PageDown keys (month changes). This handler is designed to be attached to a parent element (like a dialog or the document) to allow for controlled event handling, especially when multiple keyboard-interactive elements are on a page. When a key is handled, it prevents further event propagation and focuses the newly selected date button.Intl.DateTimeFormat to display month names and weekday headers according to the specified locale property or the browser's default locale. The buildWeekDayHeader method dynamically generates these headers.change custom event ( CustomEvent<Date>) whenever a new date is selected by clicking on a day. The event detail contains the selected Date object.calendar-sk.scss. It imports styles from //perf/modules/themes:themes_sass_lib and //elements-sk/modules/styles:buttons_sass_lib.calendar-sk.scss: This file contains the SASS/CSS styles for the <calendar-sk> element. It defines the visual appearance of the calendar grid, buttons, headers, and how selected or “today” dates are highlighted. It relies on CSS variables (e.g., --background, --secondary, --surface-1dp) for theming, allowing the look and feel to be customized by the consuming application.calendar-sk-demo.html and calendar-sk-demo.ts: These files set up a demonstration page for the calendar-sk element.calendar-sk-demo.html includes instances of the calendar, some in dark mode and one configured for a different locale (zh-Hans-CN), to showcase its versatility.calendar-sk-demo.ts initializes these calendar instances, sets their initial displayDate and locale, and attaches event listeners to log the change event. It also demonstrates how to hook up the keyboardHandler.index.ts: A simple entry point that imports and thus registers the calendar-sk custom element, making it available for use in HTML.Key Workflows:
Initialization and Rendering: ElementSk constructor -> connectedCallback -> buildWeekDayHeader -> _render (calls CalendarSk.template)
<calendar-sk> element is added to the DOM, its connectedCallback is invoked._displayDate.Date Selection (Click): User clicks on a date button -> dateClick method -> Updates _displayDate -> Dispatches change event with the new Date -> _render (to update UI, e.g., highlight new selection)
User clicks a date button.
[date button] --click--> dateClick(event)
    |
    +--> new Date(this._displayDate) (create copy)
    |
    +--> d.setDate(event.target.dataset.date) (update day)
    |
    +--> dispatchEvent(new CustomEvent<Date>('change', { detail: d }))
    |
    +--> this._displayDate = d
    |
    +--> this._render()
Month/Year Navigation (Click): User clicks “Previous Month” button -> decMonth method -> Calculates new year, monthIndex, and date (adjusting for days in month) -> Updates _displayDate with the new Date -> _render (to display the new month/year)
User clicks "Previous Month" button.
[Previous Month button] --click--> decMonth()
    |
    +--> Calculate new year, month, date (adjusting for month boundaries and days in month)
    |
    +--> this._displayDate = new Date(newYear, newMonthIndex, newDate)
    |
    +--> this._render()
Keyboard Navigation: User presses “ArrowRight” while calendar (or its container) has focus -> keyboardHandler(event) -> case 'ArrowRight': this.incDay(); -> incDay method updates _displayDate (e.g., from May 21 to May 22) -> this._render() -> e.stopPropagation(); e.preventDefault(); -> this.querySelector<HTMLButtonElement>('button[aria-selected="true"]')!.focus();
User presses the ArrowRight key.
keydown event (ArrowRight) ---> keyboardHandler(event)
    |
    + (matches case 'ArrowRight')
    |
    +--> this.incDay()
    |      |
    |      +--> this._displayDate = new Date(year, monthIndex, date + 1)
    |      |
    |      +--> this._render()
    |
    +--> event.stopPropagation()
    |
    +--> event.preventDefault()
    |
    +--> Focus the newly selected day button.
The use of zero-indexed months (monthIndex) internally, as is common with the JavaScript Date object, is a deliberate choice for consistency with the underlying API, though it requires careful handling to avoid off-by-one errors, especially when calculating things like the number of days in a month.
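A standalone sketch of the clamping rule described above, written as pure functions for illustration; the element itself mutates _displayDate and re-renders:

```ts
// Returns the number of days in the given zero-indexed month.
function getNumberOfDaysInMonth(year: number, monthIndex: number): number {
  // Day 0 of the following month is the last day of `monthIndex`.
  return new Date(year, monthIndex + 1, 0).getDate();
}

// Move one month back, clamping the day so that e.g. March 31 becomes
// February 28 (or 29 in a leap year) rather than rolling over.
function decMonth(displayDate: Date): Date {
  let year = displayDate.getFullYear();
  let monthIndex = displayDate.getMonth() - 1;
  if (monthIndex < 0) {
    monthIndex = 11;
    year -= 1;
  }
  const day = Math.min(displayDate.getDate(), getNumberOfDaysInMonth(year, monthIndex));
  return new Date(year, monthIndex, day);
}
```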
The chart-tooltip-sk module provides a custom HTML element, <chart-tooltip-sk>, designed to display detailed information about a specific data point on a chart. This tooltip is intended to be interactive, offering context-sensitive actions and information relevant to performance monitoring and analysis. It can be triggered by hovering over or clicking on a chart point.
The design philosophy behind this module is to centralize the presentation of complex data point information and related actions. Instead of scattering this logic across various chart implementations, chart-tooltip-sk encapsulates it, promoting reusability and maintainability. It aims to provide a rich user experience by surfacing relevant details like commit information, anomaly status, bug tracking, and actions like bisection or requesting further traces.
The primary responsibility of chart-tooltip-sk is to render a tooltip with relevant information and interactive elements based on the data point it's associated with.
Core Functionality & Design Choices:
load() method is the main entry point for populating the tooltip with data. It accepts various parameters like the trace index, test name, y-value, date, commit position, anomaly details, and bug information. This comprehensive loading mechanism allows the parent charting component (e.g., explore-simple-sk) to provide all necessary context.fetch_details() method is responsible for asynchronously retrieving commit details using the /_/cid/ endpoint. This is done to avoid loading all commit details upfront for every point on a chart, which could be performance-intensive._always_show_commit_info and _skip_commit_detail_display flags (sourced from window.perf) allow for configurable display of commit details, catering to different instance needs.anomaly-sk for consistent formatting of anomaly data.triage-menu-sk to allow users to triage new anomalies (e.g., create bugs, mark as not a bug).user-issue-sk to display and manage Buganizer issues linked to a data point (even if it's not a formal anomaly). Users can associate existing bugs or create new ones.bug_host_url (from window.perf) is used to construct links to the bug tracking system._show_pinpoint_buttons is true, typically for Chromium instances) that opens bisect-dialog-sk. This allows users to initiate a bisection to find the exact commit that caused a regression._show_pinpoint_buttons) that opens pinpoint-try-job-dialog-sk. This is used to request more detailed trace data for a specific commit.point-links-sk to show relevant links for a data point based on instance configuration (e.g., links to V8 or WebRTC specific commit ranges). This is configured via keys_for_commit_range and keys_for_useful_links in window.perf.show_json_file_display in window.perf), it provides a way to view the raw JSON data for the point via json-source-sk.moveTo() method handles the dynamic positioning of the tooltip relative to the mouse cursor or the selected chart point. It intelligently adjusts its position to stay within the viewport and avoid overlapping critical chart elements.chart-tooltip-sk.scss), including themes imported from //perf/modules/themes:themes_sass_lib.md-elevation for a Material Design-inspired shadow effect.Key Files:
chart-tooltip-sk.ts: The core TypeScript file defining the ChartTooltipSk class, its properties, methods, and HTML template (using lit-html). This is where the primary logic for data display, interaction handling, and integration with sub-components resides.chart-tooltip-sk.scss: The SASS file containing the styles for the tooltip element.index.ts: A simple entry point that imports and registers the chart-tooltip-sk custom element.chart-tooltip-sk-demo.html & chart-tooltip-sk-demo.ts: Files for demonstrating the tooltip's functionality. The demo sets up mock data and fetchMock to simulate API responses, allowing isolated testing and visualization of the component.BUILD.bazel: Defines how the element and its demo page are built, including dependencies on other Skia Elements and Perf modules like anomaly-sk, commit-range-sk, triage-menu-sk, etc.Workflow Example: Displaying Tooltip on Chart Point Click (Fixed Tooltip)
User clicks a point on a chart
|
V
Parent Chart Component (e.g., explore-simple-sk)
1. Determines data for the clicked point (coordinates, commit, trace info).
2. Optionally fetches commit details if not already available.
3. Optionally checks its anomaly map for anomaly data.
4. Calls `chartTooltipSk.load(...)` with all relevant data,
setting `tooltipFixed = true` and providing a close button action.
5. Calls `chartTooltipSk.moveTo({x, y})` to position the tooltip.
|
V
chart-tooltip-sk
1. `load()` method populates internal properties (_test_name, _y_value, _commit_info, _anomaly, etc.).
2. `_render()` is triggered (implicitly or explicitly).
3. The lit-html template in `static template` is evaluated:
- Basic info (test name, value, date) is displayed.
- If `commit_info` is present, commit details (author, message, hash) are shown.
- If `_anomaly` is present:
- Anomaly metrics are displayed.
- If `anomaly.bug_id === 0`, `triage-menu-sk` is shown.
- If `anomaly.bug_id > 0`, bug ID is shown with an unassociate button.
- Pinpoint job links are shown if available.
- If `tooltip_fixed` is true:
- "Bisect" and "Request Trace" buttons are shown (if configured).
- `user-issue-sk` is shown (if not an anomaly).
- `json-source-sk` button/link is shown (if configured).
- The close icon is visible.
4. Child components like `commit-range-sk`, `point-links-sk`, `user-issue-sk`, `triage-menu-sk`
are updated with their respective data.
5. `moveTo()` positions the rendered `div.container` on the screen.
|
V
User interacts with buttons (e.g., "Bisect", "Triage", "Close")
|
V
chart-tooltip-sk or its child components handle the interaction
- e.g., clicking "Bisect" calls `openBisectDialog()`, which shows `bisect-dialog-sk`.
- e.g., clicking "Close" executes the `_close_button_action` passed during `load()`.
This modular approach ensures that chart-tooltip-sk is a self-contained, feature-rich component for displaying detailed contextual information and actions related to data points in performance charts.
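The viewport handling in moveTo() can be pictured with a small clamping helper. This is an illustrative sketch only: clampToViewport is a hypothetical function, not part of the module, and it ignores the additional logic that avoids overlapping chart elements.

```ts
// Illustrative only: a viewport-clamping helper in the spirit of moveTo().
// The real implementation in chart-tooltip-sk.ts also avoids covering chart
// elements; this sketch only shows the clamping idea.
function clampToViewport(
  desired: { x: number; y: number },
  tooltipWidth: number,
  tooltipHeight: number
): { x: number; y: number } {
  const maxX = window.innerWidth - tooltipWidth;
  const maxY = window.innerHeight - tooltipHeight;
  return {
    x: Math.min(Math.max(desired.x, 0), Math.max(maxX, 0)),
    y: Math.min(Math.max(desired.y, 0), Math.max(maxY, 0)),
  };
}
```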
This module, /modules/cid, provides functionality for interacting with Commit IDs (CIDs), which are also referred to as CommitNumbers. The primary purpose of this module is to facilitate the retrieval of detailed commit information based on a set of commit numbers and their corresponding sources.
The core functionality revolves around the lookupCids function. This function is designed to be a simple and efficient way to fetch commit details from a backend endpoint.
Why Asynchronous Operations?
The lookup of commit information involves a network request to a backend service (/_/cid/). Network requests are inherently asynchronous. Therefore, lookupCids returns a Promise. This allows the calling code to continue execution while the commit information is being fetched and to handle the response (or any potential errors) when it becomes available. This non-blocking approach is crucial for maintaining a responsive user interface or efficient server-side processing.
Why JSON for Data Exchange?
JSON (JavaScript Object Notation) is used as the data format for both the request and the response.
- The cids argument (an array of CommitNumber objects) is serialized into a JSON string and sent in the body of the HTTP POST request. JSON is a lightweight and widely supported format, making it ideal for client-server communication.
- The response is expected to be JSON conforming to the CIDHandlerResponse type. The jsonOrThrow utility (imported from ../../../infra-sk/modules/jsonOrThrow) is used to parse this JSON response. This utility simplifies error handling by automatically throwing an error if the response is not valid JSON or if the HTTP request itself fails.

Why POST Request?
A POST request is used instead of a GET request for sending the cids. While GET requests are often used for retrieving data, they are typically limited in the amount of data that can be sent in the URL (e.g., through query parameters). Since the number of cids to look up could be large, sending them in the request body via a POST request is a more robust and scalable approach. The Content-Type: application/json header informs the server that the request body contains JSON data.
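Putting these pieces together, the fetch call in lookupCids plausibly looks like the following sketch. It is assembled from the description in this section (endpoint, headers, body, jsonOrThrow); the actual implementation in cid.ts may differ in detail.

```ts
import { jsonOrThrow } from '../../../infra-sk/modules/jsonOrThrow';
import { CommitNumber, CIDHandlerResponse } from '../json';

// Sketch based on the description above: POST the commit numbers as JSON to
// /_/cid/ and parse the response with jsonOrThrow.
export function lookupCids(cids: CommitNumber[]): Promise<CIDHandlerResponse> {
  return fetch('/_/cid/', {
    method: 'POST',
    body: JSON.stringify(cids),
    headers: { 'Content-Type': 'application/json' },
  }).then(jsonOrThrow);
}
```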
- cid.ts: This is the sole TypeScript file in the module and contains the implementation of the lookupCids function.
- lookupCids(cids: CommitNumber[]): Promise<CIDHandlerResponse>:
  - Takes an array of CommitNumber objects and asynchronously fetches detailed commit information for each from the /_/cid/ backend endpoint.
  - Sends an HTTP POST request to the /_/cid/ endpoint.
  - The cids array is converted into a JSON string and included as the request body.
  - The appropriate headers (Content-Type: application/json) are set.
  - The fetch API is used to make the network request.
  - The response is parsed with jsonOrThrow. If the request is successful and the response is valid JSON, it resolves the promise with the parsed CIDHandlerResponse. Otherwise, it rejects the promise with an error.
- Dependencies:
  - jsonOrThrow (from ../../../infra-sk/modules/jsonOrThrow): For robust JSON parsing and error handling.
  - CommitNumber, CIDHandlerResponse (from ../json): These are type definitions that define the structure of the input commit identifiers and the expected response from the backend.

The typical workflow for using this module is as follows:
Caller | /modules/cid/cid.ts (lookupCids) | Backend Server (/_/cid/)
---------------------------|----------------------------------|-------------------------
1. Has array of CommitNumber objects.
| |
2. Calls `lookupCids(cids)`| |
`---------------------->`| |
| 3. Serializes `cids` to JSON. |
| 4. Creates POST request with JSON body.
| `--------------------------->`| 5. Receives POST request.
| | 6. Processes `cids`.
| | 7. Generates `CIDHandlerResponse`.
| `<---------------------------`| 8. Sends JSON response.
| 9. Receives response. |
| 10. `jsonOrThrow` parses response.|
| (Throws error on failure) |
| |
11. Receives Promise that | |
resolves with | |
`CIDHandlerResponse` | |
(or rejects with error).
`<----------------------`| |
The cluster-lastn-page-sk module provides a user interface for testing and configuring alert configurations by running them against a recent range of commits. This allows users to “dry run” an alert to see what regressions it would detect before saving it to run periodically.
Core Functionality:
The primary purpose of this module is to facilitate the iterative process of defining effective alert configurations. Instead of deploying an alert and waiting for it to trigger (potentially with undesirable results), users can simulate its behavior on historical data. This helps in fine-tuning parameters like the detection algorithm, radius, sparsity, and interestingness threshold.
Key Components and Files:
cluster-lastn-page-sk.ts: This is the heart of the module, defining the ClusterLastNPageSk custom element.
- It holds the alert configuration (this.state), the commit range (this.domain), and the results of the dry run (this.regressions). It utilizes stateReflector to potentially persist and restore parts of this state in the URL, allowing users to share specific configurations or test setups.
- The alert configuration is edited in a dialog (alert-config-dialog, which hosts an alert-config-sk element).
- The commit range is chosen with a domain-picker-sk element.
- Dry runs are started via the run() method; alerts are saved via the writeAlert() method.
- Detected regressions can be inspected in a triage dialog (triage-cluster-dialog, which hosts a cluster-summary2-sk element).
- The initial paramset and a default new alert configuration are fetched from /_/initpage/ and /_/alert/new respectively.
- Dry runs POST to the /_/dryrun/start endpoint to initiate the clustering and regression detection process. It uses the startRequest utility from ../progress/progress to handle the asynchronous request and display progress.
- Alert configurations are sent to /_/alert/update to save or update them in the backend.
- It uses lit-html for templating and dynamically renders the UI based on the current state, including the controls, the progress of a running dry run, and a table of detected regressions. The table displays commit details (commit-detail-sk) and triage status (triage-status-sk) for each detected regression.

cluster-lastn-page-sk.html (Demo Page): A simple HTML file that includes the cluster-lastn-page-sk element and an error-toast-sk for displaying global error messages. This is primarily used for demonstration and testing purposes.
cluster-lastn-page-sk-demo.ts: Sets up mock HTTP responses using fetch-mock for the demo page. This allows the cluster-lastn-page-sk element to function in isolation without needing a live backend. It mocks endpoints like /_/initpage/, /_/alert/new, /_/count/, and /_/loginstatus/.
cluster-lastn-page-sk.scss: Provides the styling for the cluster-lastn-page-sk element and its dialogs, ensuring a consistent look and feel with the rest of the Perf application. It uses shared SASS libraries for buttons and themes.
Workflow for Testing an Alert Configuration:
Load Page: User navigates to the page.
cluster-lastn-page-sk fetches the initial paramset and a default new alert configuration.

User -> cluster-lastn-page-sk
cluster-lastn-page-sk -> GET /_/initpage/ (fetches paramset)
cluster-lastn-page-sk -> GET /_/alert/new (fetches default alert)
Configure Alert: User clicks the “Configure Alert” button.
A dialog (alert-config-dialog) opens, showing alert-config-sk. When the user accepts, state in cluster-lastn-page-sk is updated with the new configuration.

User --clicks--> "Configure Alert" button
cluster-lastn-page-sk --shows--> alert-config-dialog
User --interacts with--> alert-config-sk
User --clicks--> "Accept"
alert-config-sk --updates--> cluster-lastn-page-sk.state
(Optional) Adjust Commit Range: User interacts with domain-picker-sk to define the number of recent commits or a specific date range for the dry run.
cluster-lastn-page-sk.domain is updated.

Run Dry Run: User clicks the "Run" button.
cluster-lastn-page-sk constructs a RegressionDetectionRequest using the current alert state and domain, then POSTs it to /_/dryrun/start.

User --clicks--> "Run" button
cluster-lastn-page-sk --creates--> RegressionDetectionRequest
cluster-lastn-page-sk --POSTs to--> /_/dryrun/start (with request body)
(progress updates via startRequest callback)
Backend --processes & clusters-->
Backend --sends progress/results--> cluster-lastn-page-sk
cluster-lastn-page-sk --updates--> UI (regressions table, status messages)
Review Results: User examines the table of regressions.
The user can open the triage-cluster-dialog (showing cluster-summary2-sk) for more details.

Iterate or Save:
- If results are not satisfactory, the user goes back to step 2 to adjust the alert configuration and re-runs.
- If results are satisfactory, the user clicks "Create Alert" (or "Update Alert" if modifying an existing one).
- `cluster-lastn-page-sk` sends the current alert `state` to `/_/alert/update`.

User --clicks--> "Create Alert" / "Update Alert" button
cluster-lastn-page-sk --POSTs to--> /_/alert/update (with alert config)
Backend --saves/updates alert-->
Backend --responds with ID--> cluster-lastn-page-sk
cluster-lastn-page-sk --updates--> UI (button text might change to "Update Alert")
Design Decisions:
- The page is composed of existing custom elements (alert-config-sk, domain-picker-sk, cluster-summary2-sk). This promotes modularity, reusability, and separation of concerns.
- Long-running dry runs are handled asynchronously with progress feedback via the ../progress/progress utility, enhancing user experience.
- stateReflector allows parts of the page's state (like the alert configuration) to be encoded in the URL. This is useful for sharing specific test scenarios or bookmarking them.
- The demo page (cluster-lastn-page-sk-demo.ts) heavily relies on fetch-mock. This enables isolated development and testing of the UI component without a backend dependency, which is crucial for frontend unit/integration tests and local development.

The cluster-page-sk module provides the user interface for Perf's trace clustering functionality. This allows users to identify groups of traces that exhibit similar behavior, which is crucial for understanding performance regressions or improvements across different configurations and tests.
Core Functionality and Design:
The primary goal of this page is to allow users to define a set of traces and then apply a clustering algorithm to them. The “why” behind this is to simplify the analysis of large datasets by grouping related performance changes. Instead of manually inspecting hundreds or thousands of individual traces, users can focus on a smaller number of clusters, each representing a distinct performance pattern.
The “how” involves several key components:
Defining the Scope of Analysis:
- A commit is selected with commit-detail-picker-sk. The clustering will typically look at commits before and after this selected point. The state.offset property stores the selected commit's offset.
- The traces to analyze are chosen with a query built using query-sk and paramset-sk. The state.query holds this query. The query-count-sk element provides feedback on how many traces match the current query.
- The number of commits considered around the selected point is controlled by state.radius.

Clustering Algorithm and Parameters:
- The algorithm is chosen via algo-select-sk and stored in state.algo. The choice of algorithm impacts how clusters are formed and what "similarity" means.
- The number of clusters (K, for k-means) is stored in state.k.
- The interestingness threshold is stored in state.interesting.
- A sparse-data flag (state.sparse) allows users to indicate if the data is sparse, meaning not all traces have data points for all commits. This affects how the clustering algorithm processes missing data.

Executing the Clustering and Displaying Results:
- The start() method constructs a RegressionDetectionRequest object containing all the user-defined parameters. This request is sent to the /_/cluster/start endpoint.
- The page uses the progress utility to manage the asynchronous request. It displays a spinner (spinner-sk) and status messages (ele.status, ele.runningStatus) to keep the user informed. The requestId property tracks the active request.
- The resulting RegressionDetectionResponse contains a list of FullSummary objects. Each FullSummary represents a discovered cluster. These are rendered using multiple cluster-summary2-sk elements. This component is responsible for visualizing the details of each cluster, including its member traces and regression information.
- Results can be reordered with sort-sk.

State Management:
The cluster-page-sk component maintains its internal state in a State object. This includes user selections like the query, commit offset, algorithm, and various parameters. Crucially, this state is reflected in the URL using the stateReflector utility. This design decision ensures that the page's configuration survives reloads and that a specific analysis can be shared or bookmarked via its URL.
The stateHasChanged() method is called whenever a piece of the state is modified, triggering the stateReflector to update the URL and potentially re-render the component.
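A minimal sketch of this pattern, assuming the common infra-sk stateReflector shape (a getter for the current state plus a callback invoked when the URL supplies a new state, returning the stateHasChanged function); the import path and exact typing are assumptions, and the real page wires its full State class rather than this reduced example.

```ts
import { stateReflector } from '../../../infra-sk/modules/stateReflector'; // path assumed

// Sketch only: reflect a reduced page state into the URL.
let state = { query: '', offset: -1, algo: 'kmeans' };

const stateHasChanged = stateReflector(
  () => ({ ...state }),          // current state, serialized into the URL
  (newState: any) => {           // called when the URL provides a new state
    state = { ...state, ...newState };
  }
);

// After any user interaction that mutates `state`, e.g. a query change:
state.query = 'config=gpu';
stateHasChanged();
```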
Key Files and Their Roles:
- cluster-page-sk.ts: This is the main TypeScript file defining the ClusterPageSk custom element. It orchestrates all the sub-components, manages the application state, handles user interactions (e.g., button clicks, input changes), makes API calls for clustering, and renders the results. It defines the overall layout and logic of the clustering page.
- cluster-page-sk.html (inferred, as it's a LitElement): The HTML template is defined within cluster-page-sk.ts using lit-html. This template structures the page, embedding various custom elements for commit selection, query building, algorithm choice, and result display.
- cluster-page-sk.scss: Provides the specific styling for the cluster-page-sk element and its layout, ensuring a consistent look and feel.
- index.ts: A simple entry point that imports and registers the cluster-page-sk custom element, making it available for use in HTML.
- cluster-page-sk-demo.ts & cluster-page-sk-demo.html: These files set up a demonstration page for the cluster-page-sk element. cluster-page-sk-demo.ts uses fetch-mock to simulate API responses, allowing the component to be developed and tested in isolation without needing a live backend. This is crucial for rapid development and ensuring the UI behaves correctly under various backend scenarios.
- State class (within cluster-page-sk.ts): Defines the structure of the data that is persisted in the URL and drives the component's behavior. It encapsulates all user-configurable options for the clustering process.

Workflow Example: Performing a Cluster Analysis
User Interaction | Component/State Change | Backend Interaction
-----------------------------------------|-------------------------------|---------------------
1. User navigates to the cluster page. | `ClusterPageSk` initializes. | Fetches initial paramset (`/_/initpage/`)
| `stateReflector` initializes |
| from URL or defaults. |
| |
2. User selects a commit. | `commit-detail-picker-sk` | (Potentially fetches commit details if not cached)
| emits `commit-selected`. |
| `state.offset` updates. |
| `stateHasChanged()` called. |
| |
3. User types a query (e.g., "config=gpu").| `query-sk` emits | (Potentially `/_/count/` to update trace count)
| `query-change`. |
| `state.query` updates. |
| `stateHasChanged()` called. |
| |
4. User selects an algorithm (e.g., kmeans).| `algo-select-sk` emits |
| `algo-change`. |
| `state.algo` updates. |
| `stateHasChanged()` called. |
| |
5. User adjusts advanced parameters | Input elements update |
(K, radius, interestingness). | corresponding `state` props. |
| `stateHasChanged()` called. |
| |
6. User clicks "Run". | `start()` method is called. | POST to `/_/cluster/start` with `RegressionDetectionRequest`
| `requestId` is set. | (This is a long-running request)
| Spinner becomes active. |
| |
7. Page periodically updates status. | `progress` utility polls for | GET requests to check progress.
| updates. |
| `ele.runningStatus` updates. |
| |
8. Clustering completes. | `progress` utility resolves. | Final response from `/_/cluster/start` (or progress endpoint)
| `summaries` array is populated| containing `RegressionDetectionResponse`.
| with cluster data. |
| `requestId` is cleared. |
| Spinner stops. |
| |
9. Results are displayed. | `ClusterPageSk` re-renders, |
| showing `cluster-summary2-sk` |
| elements for each cluster. |
This workflow highlights how user inputs are translated into state changes, which then drive API requests and ultimately update the UI to present the clustering results. The separation of concerns among various sub-components (for query, commit selection, etc.) makes the main cluster-page-sk element more manageable.
The cluster-summary2-sk module provides a custom HTML element for displaying detailed information about a cluster of performance test results. This includes visualizing the trace data, showing regression statistics, and allowing users to triage the cluster.
Core Functionality and Design:
The primary purpose of this element is to present a comprehensive summary of a performance cluster. It aims to provide all necessary information for a user to understand the nature of a performance change (regression or improvement) and take appropriate action (e.g., filing a bug, marking it as expected).
Key design considerations include:
- A plot-simple-sk element is used to display the centroid trace of the cluster over time. This visual representation helps users quickly grasp the trend and identify the point of change. An "x-bar" can be displayed on the plot to highlight the specific commit where a step change is detected.
- Regression statistics are labeled and formatted according to the StepDetection algorithm used (e.g., 'absolute', 'percent', 'mannwhitneyu'). This ensures that the presented information is relevant and interpretable for the specific detection method.
- A commit-detail-panel-sk allows users to view details of the commit associated with the detected step point or any selected point on the trace plot. This is crucial for correlating performance changes with specific code modifications.
- Unless given the notriage attribute, the element includes a triage2-sk component. This allows authenticated users with "editor" privileges to set the triage status (e.g., "positive", "negative", "untriaged") and add a message. This functionality is essential for tracking the investigation and resolution of performance issues.
- A word-cloud-sk element displays a summary of the parameters that make up the traces in the cluster. This helps in understanding the common characteristics of the affected tests.
- A commit-range-sk component allows users to define a range around the detected step or a selected commit, facilitating further investigation within the Perf application.

Key Components and Their Roles:
- cluster-summary2-sk.ts: This is the main TypeScript file defining the ClusterSummary2Sk custom element.
  - ClusterSummary2Sk class: Extends ElementSk and manages the element's state, rendering, and event handling.
  - Properties (full_summary, triage, alert): These properties receive the core data for the cluster. When full_summary is set, it triggers the rendering of the plot, statistics, and commit details. The alert property determines the labels and formatting for regression statistics. The triage property reflects the current triage state.
  - Template (the static template method): Uses lit-html to define the element's structure, binding data to various sub-components and display areas.
  - Custom events:
    - open-keys: Fired when the "View on dashboard" button is clicked, providing details for opening the explorer.
    - triaged: Fired when the triage status is updated, containing the new status and the relevant commit information.
  - trace_selected handler: Handles events from plot-simple-sk when a point on the graph is clicked, triggering a lookup for the corresponding commit details.
  - statusClass(): Determines the CSS class for the regression display based on the severity (e.g., "high", "low").
  - permaLink(): Generates a URL to the triage page focused on the step point.
  - lookupCids() (static): A static method (delegating to ../cid/cid.ts) used to fetch commit details based on commit numbers.
  - labelsForStepDetection: A crucial constant object that maps different StepDetection algorithm names (e.g., 'percent', 'mannwhitneyu', 'absolute') to specific labels and number formatting functions for the regression statistics. This ensures that the displayed information is meaningful and correctly interpreted for the algorithm used to detect the cluster.
- cluster-summary2-sk.html (template, rendered by cluster-summary2-sk.ts): Defines the visual layout using HTML and embedded custom elements. It uses a CSS grid for positioning the main sections: regression summary, statistics, plot, triage status, commit details, actions, and word cloud.
- cluster-summary2-sk.scss: Provides the styling for the element. It defines how different sections are displayed, including styles for regression severity (e.g., red for "high" regressions, green for "low"), button appearances, and responsive behavior (hiding the plot on smaller screens).
- cluster-summary2-sk-demo.html and cluster-summary2-sk-demo.ts: These files set up a demonstration page for the cluster-summary2-sk element. The .ts file provides mock data for FullSummary, Alert, and TriageStatus to populate the demo instances of the element. It also demonstrates how to listen for the triaged and open-keys custom events.

Workflows:
Initialization and Data Display:
- The host application sets the full_summary (containing cluster data and trace frame), alert (details of the alert that triggered this cluster), and optionally triage (current triage status) properties on the cluster-summary2-sk element.
- set full_summary(): stores the summary and frame data, updates the displayed cluster size (data-clustersize), and populates plot-simple-sk with the centroid trace from summary.centroid and time labels from frame.dataframe.header. lookupCids is called to fetch and display details for the commit at the step point in commit-detail-panel-sk.
- set alert(): updates the labels used for displaying regression statistics based on alert.step and labelsForStepDetection.
- set triage(): updates triageStatus and re-renders the triage controls.

Host Application cluster-summary2-sk
---------------- -------------------
[Set full_summary data] --> Process data
|
+-> plot-simple-sk (Draws trace)
|
+-> commit-detail-panel-sk (Shows step commit)
|
+-> Display stats (regression, size, etc.)
[Set alert data] ---------> Update regression labels/formatters
[Set triage data] --------> Update triage2-sk state
User Triage:
- The user interacts with triage2-sk (selects status) and the message input field.
- When "Update" is clicked, the update() method is called: a ClusterSummary2SkTriagedEventDetail object is created containing the step_point (as columnHeader) and the current triageStatus, and the triaged custom event is dispatched with this detail.
- The host application listens for the triaged event to persist the triage status.

User cluster-summary2-sk Host Application
---- ------------------- ----------------
Selects status ----> [triage2-sk updates value]
Types message ----> [Input updates value]
Clicks "Update" ---> update()
|
+-> Creates TriagedEventDetail
|
+-> Dispatches "triaged" event --> Listens and handles event
(e.g., saves to backend)
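A host page would listen for the triaged event roughly like the sketch below. The detail field names follow the description above; the /_/triage/ endpoint is hypothetical and stands in for whatever persistence call the hosting page actually makes.

```ts
// Sketch: persist a triage decision when cluster-summary2-sk dispatches its
// `triaged` event. The detail carries the step point (as columnHeader) and
// the current triage status.
const summary = document.querySelector('cluster-summary2-sk')!;

summary.addEventListener('triaged', (e: Event) => {
  const detail = (e as CustomEvent).detail; // ClusterSummary2SkTriagedEventDetail
  // Hypothetical persistence call; the real endpoint depends on the page.
  fetch('/_/triage/', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(detail),
  });
});
```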
Viewing on Dashboard:
- When "View on dashboard" is clicked, the openShortcut() method is called: a ClusterSummary2SkOpenKeysEventDetail object is created with the shortcut ID, begin and end timestamps from the frame, and the step_point as xbar, and the open-keys custom event is dispatched.
- The host application listens for open-keys and navigates the user to the explorer view with the provided parameters.

User cluster-summary2-sk Host Application
---- ------------------- ----------------
Clicks "View on dash" --> openShortcut()
|
+-> Creates OpenKeysEventDetail
|
+-> Dispatches "open-keys" event --> Listens and handles event
(e.g., navigates to explorer)
The cluster-summary2-sk element plays a vital role in the Perf frontend by providing a focused and interactive view for analyzing individual performance regressions or improvements identified through clustering. Its integration with plotting, commit details, and triaging makes it a key tool for performance analysis workflows.
High-level Overview:
The commit-detail-panel-sk module provides a custom HTML element <commit-detail-panel-sk> designed to display a list of commit details. It offers functionality to make these commit entries selectable and emits an event when a commit is selected. This component is primarily used in user interfaces where users need to browse and interact with a sequence of commits.
Why and How:
The core purpose of this module is to present commit information in a structured and interactive way. Instead of simply displaying raw commit data, it leverages the commit-detail-sk element (an external dependency) to render each commit with relevant information like author, message, and a link to the commit.
The design decision to make commits selectable (via the selectable attribute) enhances user interaction. When a commit is clicked in “selectable” mode, it triggers a commit-selected custom event. This event carries detailed information about the selected commit, including its index in the list, a concise description, and the full commit object. This allows parent components or applications to react to user selections and perform actions based on the chosen commit (e.g., loading further details, navigating to a specific state).
The implementation uses the Lit library for templating and rendering. The commit data is provided via the details property, which expects an array of Commit objects (defined in perf/modules/json). The component dynamically generates table rows for each commit.
The visual appearance is controlled by commit-detail-panel-sk.scss. It defines styles for the panel, including highlighting the selected row and adjusting opacity based on the selectable state. The styling aims for a clean and readable presentation of commit information.
A hide property is also available to conditionally show or hide the entire commit list. This is useful for scenarios where the panel's visibility needs to be controlled dynamically by the parent application.
Key Components/Files:
- commit-detail-panel-sk.ts: This is the heart of the module. It defines the CommitDetailPanelSk class, which extends ElementSk.
  - Holds the list of Commit objects (_details property).
  - Renders the table of commits (template and rows static methods).
  - Handles row clicks (_click method).
  - When selectable (the selectable attribute is present), it dispatches the commit-selected custom event with relevant commit data.
  - Observes the selectable, selected, and hide attributes and their corresponding properties, re-rendering the component when these change.
  - Uses the commit-detail-sk element to display individual commit details within each row.
- commit-detail-panel-sk.scss: This file contains the SASS styles for the component. It highlights the selected row and adjusts opacity based on selectable, using CSS custom properties (--primary, --surface-1dp) from //perf/modules/themes:themes_sass_lib for consistent theming.
- commit-detail-panel-sk-demo.ts and commit-detail-panel-sk-demo.html: These files provide a demonstration page for the component. They show how to instantiate the <commit-detail-panel-sk> element in an HTML page, how to populate its details property, and how to listen for the commit-selected event.
- index.ts: A simple entry point that imports and registers the commit-detail-panel-sk custom element, making it available for use.
- BUILD.bazel: Defines how the module is built and its dependencies. For instance, it declares commit-detail-sk as a runtime dependency and Lit as a TypeScript dependency.
- commit-detail-panel-sk_puppeteer_test.ts: Contains Puppeteer tests to verify the component's rendering and basic functionality.

Key Workflows:
Initialization and Rendering:
Parent Application --> Sets 'details' property of <commit-detail-panel-sk> with Commit[]
|
V
commit-detail-panel-sk.ts --> _render() is called
|
V
Lit template generates <table>
|
V
For each Commit in 'details':
Generates <tr> containing <commit-detail-sk .cid=Commit>
Commit Selection (when selectable is true):

User --> Clicks on a <tr> in the <commit-detail-panel-sk>
      |
      V
commit-detail-panel-sk.ts --> _click(event) handler is invoked
      |
      V
Determines the clicked commit's index and data
      |
      V
Sets 'selected' attribute/property to the index of the clicked commit
      |
      V
Dispatches 'commit-selected' CustomEvent with
  { selected: index, description: string, commit: Commit }
      |
      V
Parent Application --> Listens for 'commit-selected' event and processes the event.detail
The design favors declarative attribute-based configuration (e.g., selectable, selected) and event-driven communication for user interactions, which are common patterns in web component development.
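A hedged usage sketch of that pattern follows. The details property and the commit-selected event detail shape come from the description above; the Commit field values shown are illustrative placeholders, not real data.

```ts
// Sketch: populate the panel and react to selections.
const panel = document.querySelector('commit-detail-panel-sk')! as any;
panel.setAttribute('selectable', '');

// Illustrative Commit object; the real type comes from perf/modules/json.
panel.details = [
  {
    hash: 'abc123',
    author: 'user@example.com',
    message: 'Fix layout',
    url: 'https://example.com/abc123',
    ts: 1700000000,
    offset: 101,
  },
];

panel.addEventListener('commit-selected', (e: Event) => {
  const { selected, description, commit } = (e as CustomEvent).detail;
  console.log(`Row ${selected} selected: ${description}`, commit);
});
```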
The commit-detail-picker-sk module provides a user interface element for selecting a specific commit from a range of commits. It's designed to be a reusable component that simplifies the process of commit selection within applications that need to interact with commit histories.
Core Functionality and Design:
The primary purpose of commit-detail-picker-sk is to allow users to browse and select a commit. This is achieved by presenting a button that, when clicked, opens a dialog.
[Button: "Author - Commit Message"] --- (click) ---> [Dialog Opens]

- commit-detail-panel-sk: This submodule is responsible for displaying the list of commits fetched from the backend. Users can click on a commit in this panel to select it.
- A day-range-sk component allows users to specify a time window for fetching commits. This is crucial for performance and usability, as it prevents loading an overwhelming number of commits at once. When the date range changes, the component automatically fetches the relevant commits.

  [day-range-sk] -- (date range change) --> [Fetch Commits for New Range]
        |
        V
  [commit-detail-panel-sk updates]

- A spinner-sk element provides visual feedback to the user while commits are being fetched, indicating that an operation is in progress.

Data Flow and State Management:
- Fetching commits: The picker POSTs to the /_/cidRange/ endpoint. The request body includes the begin and end timestamps of the desired range and optionally the offset of a currently selected commit (to ensure it's included in the results if it falls outside the new range). A sketch of this request appears after the key files list below.

  User Action (e.g., change date range)
        |
        V
  [commit-detail-picker-sk]
        |
        V
  (Constructs RangeRequest: {begin, end, offset})
  POST /_/cidRange/
        |
        V
  (Receives Commit[] array)
  [commit-detail-picker-sk]
        |
        V
  (Updates internal 'details' array)
  [commit-detail-panel-sk] (Re-renders with new commit list)

- Selection: When the user clicks a commit in commit-detail-panel-sk, the panel emits a commit-selected event.
  - commit-detail-picker-sk listens for this event and updates its internal selected index.
  - The dialog is then closed, and the main button's text updates to reflect the new selection.
  - Crucially, commit-detail-picker-sk itself emits a commit-selected event. This allows parent components to react to the user's choice. The detail of this event is of type CommitDetailPanelSkCommitSelectedDetails, containing information about the selected commit.

  [commit-detail-panel-sk] -- (internal click on a commit) --> Emits 'commit-selected' (internal)
        |
        V
  [commit-detail-picker-sk] -- (handles internal event) --> Updates 'selected' index
                                                            Updates button text
                                                            Closes dialog
                                                            Emits 'commit-selected' (external)

- External selection (the selection property): The component exposes a selection property (of type CommitNumber). If this property is set externally, the component will attempt to fetch commits around that CommitNumber and pre-select it in the panel.

Key Files and Responsibilities:
- commit-detail-picker-sk.ts: This is the core TypeScript file defining the CommitDetailPickerSk custom element. It orchestrates the dialog, commit-detail-panel-sk, and day-range-sk. It handles fetching commit data, managing the selection state, and emitting the final commit-selected event. Key methods include opening and closing the dialog (open(), close()), handling range changes (rangeChange()), updating the commit list (updateCommitSelections()), and processing selections from the panel (panelSelect()). The selection getter/setter allows for programmatic control of the selected commit.
- commit-detail-picker-sk.scss: Contains the SASS/CSS styles for the component. It uses theme variables (--on-background, --background) and styles the dialog element and the buttons within it, and ensures proper display and spacing of child components like day-range-sk.
- commit-detail-picker-sk-demo.html & commit-detail-picker-sk-demo.ts: These files provide a demonstration page for the component. The demo instantiates commit-detail-picker-sk, mocks the backend API call (/_/cidRange/) using fetch-mock to provide sample commit data, and sets up an event listener to display the commit-selected event details.
- commit-detail-panel-sk: Used within the dialog to list and allow selection of individual commits. commit-detail-picker-sk passes the fetched details (array of Commit objects) to this panel.
- day-range-sk: Used to allow the user to define the time window for which commits should be fetched. Its day-range-change event triggers a refetch in the picker.
- spinner-sk: Provides visual feedback during data loading.
- ElementSk: Base class from infra-sk providing common custom element functionality.
- jsonOrThrow: Utility for parsing JSON responses and throwing an error if parsing fails or the response is not OK.
- errorMessage: Utility for displaying error messages to the user.

The design focuses on encapsulation: the commit-detail-picker-sk component manages its internal state (current range, fetched commits, selected index) and exposes a clear interface for interaction (a button to open, a selection property, and a commit-selected event). This makes it easy to integrate into larger applications that require users to pick a commit from a potentially large history.
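As referenced in the data-flow notes above, the range fetch plausibly looks like the following sketch. The begin/end/offset field names come from this section's description of RangeRequest; the -1 "no selection" default and the import path (shown as in the cid module) are assumptions.

```ts
import { jsonOrThrow } from '../../../infra-sk/modules/jsonOrThrow'; // path assumed

// Sketch of the commit-range fetch described above: POST the desired time
// range (and optionally the currently selected commit's offset) to
// /_/cidRange/ and receive an array of Commit objects.
async function fetchCommitRange(begin: number, end: number, offset?: number) {
  const body = { begin, end, offset: offset ?? -1 }; // -1 as "no selection" is an assumption
  const commits = await fetch('/_/cidRange/', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  }).then(jsonOrThrow);
  return commits; // Commit[]
}
```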
The commit-detail-sk module provides a custom HTML element <commit-detail-sk> designed to display concise information about a single commit. This element is crucial for user interfaces where presenting commit details in a structured and interactive manner is necessary.
In applications dealing with version control systems, there's often a need to display details of individual commits. This could be for reviewing changes, navigating commit history, or linking to related actions like exploring code changes, viewing clustered data, or triaging issues associated with a commit. The commit-detail-sk element encapsulates this functionality, offering a reusable and consistent way to present commit information.
The core of the module is the CommitDetailSk class, which extends ElementSk. This class defines the structure and behavior of the <commit-detail-sk> element.
Key Responsibilities and Components:
commit-detail-sk.ts: This is the heart of the module.
- Defines the CommitDetailSk custom element.
- Takes a Commit object (defined in perf/modules/json) as input via the cid property. This object contains details like the commit hash, author, message, timestamp, and URL.
- The template function, using lit-html, defines the HTML structure of the element. It displays the commit hash, author, message, and a human-readable age (diffDate), plus action buttons, including a link to the commit source at cid.url.
- The openLink method handles the click events on these buttons, opening the respective links in a new browser window/tab.
- upgradeProperty is used to ensure that the cid property is correctly initialized if it's set before the element is fully connected to the DOM.
- It uses CSS custom properties (e.g., --blue, --primary), allowing the component to adapt to different visual themes (light and dark mode, as demonstrated in the demo).
- It imports //perf/modules/themes:themes_sass_lib and //elements-sk/modules:colors_sass_lib to ensure consistency with the broader application's design system.
- The demo page shows <commit-detail-sk> in both light and dark mode contexts.
- The demo script populates the element with sample Commit data. It also simulates a click on the element to potentially reveal more details or actions if such functionality were implemented (though in the current version, the "tip" div with buttons is always visible). The Date.now function is mocked to ensure consistent output for the diffDate calculation in the demo and tests.

Workflow Example: Displaying Commit Information and Actions
1. Application provides a `Commit` object.
e.g., { hash: "abc123...", author: "user@example.com", ... }
2. The `Commit` object is assigned to the `cid` property of a `<commit-detail-sk>` element.
<commit-detail-sk .cid=${commitData}></commit-detail-sk>
3. `CommitDetailSk` element renders:
[abc123...] - [user@example.com] - [2 days ago] - [Commit message]
+----------------------------------------------------------------+
| [Explore] [Cluster] [Triage] [Commit (link to commit source)] | <- Action buttons
+----------------------------------------------------------------+
4. User clicks an action button (e.g., "Explore").
5. `openLink` method is called with a generated URL (e.g., "/g/e/abc123...").
6. A new browser tab opens to the specified URL.
This design promotes reusability and separation of concerns. The element focuses solely on presenting commit information and providing relevant action links, making it easy to integrate into various parts of an application that need to display commit details. The use of lit-html for templating allows for efficient rendering and updates.
The commit-range-sk module provides a custom HTML element, <commit-range-sk>, designed to display a link representing a range of commits within a Git repository. This functionality is particularly useful in performance analysis tools where identifying the specific commits that introduced a performance regression or improvement is crucial.
Core Functionality and Design:
The primary purpose of commit-range-sk is to dynamically generate a URL that points to a commit range viewer (e.g., a Git web interface like Gerrit or GitHub). This URL is constructed based on a “begin” and an “end” commit.
Identifying the Commit Range:
- The element receives a trace (an array of numerical data points, where each point corresponds to a commit), a commitIndex (the index within the trace array that represents the "end" commit of interest), and header information (which maps trace indices to commit metadata like offset or commit number).
- The "end" commit is determined from the commitIndex and the header.
- The "begin" commit is found by searching backwards from commitIndex - 1 in the trace. It skips over any entries marked with MISSING_DATA_SENTINEL (indicating commits for which there's no data point) until it finds a valid previous commit.
- The URL template, configured via window.perf.commit_range_url, typically requires Git commit hashes (SHAs) rather than internal commit numbers or offsets.
- The commit-range-sk element uses a commitNumberToHashes function to perform this conversion.
- The default commitNumberToHashes makes an asynchronous call to a backend service (the /_/cid/ endpoint) by invoking lookupCids from the //perf/modules/cid:cid_ts_lib module. This service is expected to return the commit hashes corresponding to the provided commit numbers.
- Tests can replace commitNumberToHashes with a mock function (as seen in commit-range-sk_test.ts).
- The URL is built from the window.perf.commit_range_url template. This template usually contains placeholders like {begin} and {end} which are replaced with the actual commit hashes.
- If the two commits form a range, the displayed text is "<begin_offset + 1> - <end_offset>". Otherwise, it will just show "<end_offset>". The +1 for the begin offset in a range is to ensure the displayed range starts after the last known good commit.
- The showLinks property controls rendering:
  - When showLinks is false (default, or when the element is merely hovered over in some UIs), only the text representing the commit(s) is displayed.
  - When showLinks is true, a fully formed hyperlink (<a> tag) is rendered.
commit-range-sk.ts: This is the core file defining the CommitRangeSk custom element.
- It extends ElementSk, a base class for custom elements in the Skia infrastructure.
- Internal state includes _trace, _commitIndex, _header, _url, _text, and _commitIds.
- The recalcLink() method is central to its operation. It's triggered whenever relevant input properties (trace, commitIndex, header) change. This method orchestrates the process of finding commit IDs, converting them to hashes, and generating the URL and display text.
- setCommitIds() implements the logic for determining the start and end commit numbers based on the input trace and header, handling missing data points.
- It uses the lit/html library for templating, allowing for efficient rendering and updates to the DOM.
- commit-range-sk-demo.ts sets up a mock environment, including mocking the fetch call to /_/cid/ using fetch-mock. This is crucial for demonstrating the element's behavior without needing a live backend.
- It initializes the window.perf object with necessary configuration, such as the commit_range_url template.
- It instantiates the <commit-range-sk> element and populates its properties to showcase its functionality.
- The tests use chai for assertions and setUpElementUnderTest for easy instantiation of the element in a test environment.
- They override the commitNumberToHashes method on the element instance to provide controlled hash values and assert the correctness of the generated URL and text, especially in scenarios involving MISSING_DATA_SENTINEL.
Workflow Example: Generating a Commit Range Link
Initialization:
- The application hosting <commit-range-sk> sets the global window.perf.commit_range_url (e.g., "http://example.com/range/{begin}/{end}").
- A <commit-range-sk> element is added to the DOM.
- The application provides data to the element:
- `element.trace = [10, MISSING_DATA_SENTINEL, 12, 15];`
- `element.header = [{offset: C1}, {offset: C2}, {offset: C3}, {offset: C4}];` (where C1-C4 are commit numbers)
- `element.commitIndex = 3;` (points to the data value 15 and commit C4)
- `element.showLinks = true;`
recalcLink() Triggered:
Setting these properties triggers recalcLink().

Determine Commit IDs (setCommitIds()):
- End commit: header[commitIndex].offset => C4.
- Begin commit: search from commitIndex - 1 = 2. trace[2] is 12 (not missing). So, header[2].offset => C3.
- _commitIds becomes [C3, C4].

Check if Range (isRange()):
- Is C3 + 1 === C4? Let's assume C3 and C4 are not consecutive (e.g., C3=100, C4=102), so isRange() returns true.
- The display text becomes "${C3 + 1} - ${C4}" (e.g., "101 - 102").

Convert Commit IDs to Hashes (commitNumberToHashes):
- `commitNumberToHashes([C3, C4])` is called.
- Internally, this likely makes a POST request to `/_/cid/` with `[C3, C4]`.
- Backend returns: `{ commitSlice: [{hash: "hash_for_C3"}, {hash: "hash_for_C4"}] }`.
["hash_for_C3", "hash_for_C4"].Construct URL:
- url = window.perf.commit_range_url (e.g., "http://example.com/range/{begin}/{end}")
- url = url.replace('{begin}', "hash_for_C3")
- url = url.replace('{end}', "hash_for_C4")
- _url becomes "http://example.com/range/hash_for_C3/hash_for_C4".

Render:
- Since `showLinks` is true, the template becomes: `<a href="http://example.com/range/hash_for_C3/hash_for_C4" target="_blank">101 - 102</a>`
- The element updates its content with this HTML.
This workflow demonstrates how commit-range-sk encapsulates the logic for finding relevant commits, converting their identifiers, and presenting a user-friendly link to explore changes between them, abstracting away the complexities of interacting with commit data and URL templates.
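The begin/end search and URL templating described above can be condensed into a standalone sketch. The helper name and signature are illustrative only (they are not the element's actual private methods), the MISSING_DATA_SENTINEL import path is assumed, and the hashes are presumed to have already been resolved via /_/cid/.

```ts
import { MISSING_DATA_SENTINEL } from '../const/const'; // path assumed

// Illustrative sketch: walk backwards past missing points to find the previous
// commit with data, build the display text, and fill the {begin}/{end} URL
// template with already-resolved hashes.
function buildRangeLink(
  trace: number[],
  header: { offset: number }[],
  commitIndex: number,
  hashes: [string, string], // [beginHash, endHash]
  template: string // e.g. window.perf.commit_range_url
): { url: string; text: string } {
  let begin = commitIndex - 1;
  while (begin >= 0 && trace[begin] === MISSING_DATA_SENTINEL) {
    begin--;
  }
  const endOffset = header[commitIndex].offset;
  if (begin < 0) {
    // No earlier data point: show only the end commit, no range URL.
    return { url: '', text: `${endOffset}` };
  }
  const beginOffset = header[begin].offset;
  const text =
    beginOffset + 1 === endOffset ? `${endOffset}` : `${beginOffset + 1} - ${endOffset}`;
  const url = template.replace('{begin}', hashes[0]).replace('{end}', hashes[1]);
  return { url, text };
}
```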
The common module houses utility functions and data structures that are shared across various parts of the Perf application, particularly those related to data visualization and testing. Its primary purpose is to promote code reuse and maintain consistency in how data is processed and displayed.
The module's responsibilities can be broken down into the following areas:
Plot Data Construction and Formatting:
Why: Visualizing performance data often involves transforming raw data into formats suitable for charting libraries (like Google Charts). This process needs to be standardized to ensure plots are consistent and correctly represent the underlying information.
How:
plot-builder.ts: This file is central to preparing data for plotting.
convertFromDataframe: This function is crucial for adapting data organized in a DataFrame structure (where traces are rows) into a format suitable for Google Charts, which typically expects data in columns. It essentially transposes the TraceSet. The domain parameter allows specifying whether the x-axis should represent commit positions, dates, or both, providing flexibility in how time-series data is visualized.
Input DataFrame (TraceSet):
  TraceA: [val1, val2, val3]
  TraceB: [valA, valB, valC]
  Header: [commit1, commit2, commit3]

convertFromDataframe (domain='commit') ->

Output for Google Chart:
  ["Commit Position", "TraceA", "TraceB"]
  [commit1_offset,    val1,     valA    ]
  [commit2_offset,    val2,     valB    ]
  [commit3_offset,    val3,     valC    ]
ConvertData: This function takes a ChartData object, which is a more abstract representation of plot data (lines with x, y coordinates and labels), and transforms it into the specific array-of-arrays format required by Google Charts. This abstraction allows other parts of the application to work with ChartData without needing to know the exact details of the charting library's input format.
Input ChartData:
xLabel: "Time"
lines: {
"Line1": [{x: t1, y: v1}, {x: t2, y: v2}],
"Line2": [{x: t1, y: vA}, {x: t2, y: vB}]
}
ConvertData ->
Output for Google Chart:
["Time", "Line1", "Line2"]
[t1, v1, vA ]
[t2, v2, vB ]
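A simplified sketch of this ChartData-to-rows transformation follows. The types and the helper name are illustrative, not the module's actual ConvertData implementation, and it assumes every line shares the same x values, as in the example above.

```ts
// Illustrative types mirroring the ChartData shape described above.
interface DataPoint {
  x: number | Date;
  y: number;
}
interface ChartData {
  xLabel: string;
  lines: { [name: string]: DataPoint[] };
}

// Sketch of a ConvertData-style transform: the first row is the header
// ([xLabel, ...line names]); each following row is [x, y1, y2, ...].
function toGoogleChartRows(data: ChartData): (string | number | Date | null)[][] {
  const names = Object.keys(data.lines);
  const header: (string | number | Date)[] = [data.xLabel, ...names];
  const first = data.lines[names[0]] ?? [];
  const rows = first.map((point, i) => {
    const row: (string | number | Date | null)[] = [point.x];
    for (const name of names) {
      row.push(data.lines[name][i]?.y ?? null); // null lets the chart interpolate
    }
    return row;
  });
  return [header, ...rows];
}
```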
mainChartOptions and SummaryChartOptions: These functions provide pre-configured option objects for Google Line Charts. They encapsulate common styling and behavior (like colors, axis formatting, tooltip behavior, and null interpolation) to ensure a consistent look and feel for different types of charts (main detail charts vs. summary overview charts). This avoids repetitive configuration and makes it easier to maintain visual consistency. The options are also designed to adapt to the current theme (light/dark mode) by using CSS custom properties.
defaultColors: A predefined array of colors used for chart series, ensuring a consistent and visually distinct palette.
Plotting Utilities:
Why: Beyond basic data transformation, there are common tasks related to preparing data specifically for plotting, such as associating anomalies with data points and handling missing values.
How:
plot-util.ts: This file contains helper functions that build upon plot-builder.ts.
CreateChartDataFromTraceSet: This function serves as a higher-level constructor for ChartData. It takes a raw TraceSet (a dictionary where keys are trace identifiers and values are arrays of numbers), corresponding x-axis labels (commit numbers or dates), the desired x-axis format, and anomaly information. It then iterates through the traces, constructs DataPoint objects (which include x, y, and any associated anomaly), and organizes them into the ChartData structure. A key aspect is its handling of MISSING_DATA_SENTINEL to exclude missing points from the chart data, relying on the charting library's interpolation. It also uses findMatchingAnomaly to link anomalies to their respective data points.
Input TraceSet:
"trace_foo": [10, 12, MISSING_DATA_SENTINEL, 15]
xLabels: [c1, c2, c3, c4]
Anomalies: { "trace_foo": [{x: c2, y: 12, anomaly: {...}}] }
CreateChartDataFromTraceSet ->
Output ChartData:
lines: {
"trace_foo": [
{x: c1, y: 10, anomaly: null},
{x: c2, y: 12, anomaly: {...}},
// Point for c3 is skipped due to MISSING_DATA_SENTINEL
{x: c4, y: 15, anomaly: null}
]
}
...
findMatchingAnomaly: A utility to efficiently check if a given data point (identified by its trace key, x-coordinate, and y-coordinate) corresponds to a known anomaly. This is used by CreateChartDataFromTraceSet to enrich data points with anomaly details.
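The matching amounts to a lookup keyed by trace and coordinates. The sketch below is illustrative only: the container shape follows the example above, and the helper name deliberately differs from the real findMatchingAnomaly to avoid implying its exact signature.

```ts
// Illustrative sketch of anomaly matching: given a trace key and a point's
// (x, y), return the anomaly recorded for that exact point, if any.
interface AnomalyPoint {
  x: number;
  y: number;
  anomaly: unknown; // the real code uses the Anomaly type from perf/modules/json
}

function findAnomalyForPoint(
  anomalies: { [traceKey: string]: AnomalyPoint[] },
  traceKey: string,
  x: number,
  y: number
): unknown | null {
  const forTrace = anomalies[traceKey];
  if (!forTrace) {
    return null;
  }
  const match = forTrace.find((a) => a.x === x && a.y === y);
  return match ? match.anomaly : null;
}
```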
Test Utilities:
- test-util.ts: This file provides functions to set up a common testing and demo environment.
- setUpExploreDemoEnv: This is a comprehensive function that uses fetch-mock to intercept various API calls that are typically made by Perf frontend components (e.g., explore page, alert details). It returns predefined, static responses for endpoints like /_/login/status, /_/initpage/..., /_/count/, /_/frame/start, /_/defaults/, /_/status/..., /_/cid/, /_/details/, /_/shortcut/get, /_/nextParamList/, and /_/shortcut/update. The mocked responses include paramSet data, DataFrame structures, commit information, and default configurations. This ensures that components relying on these API calls behave predictably in a test or demo environment. The function also checks for a proxy_endpoint cookie to avoid mocking if a real backend is being proxied for development or demo purposes.

The /modules/const module serves as a centralized repository for constants utilized throughout the Perf UI. Its primary purpose is to ensure consistency and maintainability by providing a single source of truth for values that are shared across different parts of the user interface.
A key design decision behind this module is to manage values that might also be defined in the backend. This avoids potential discrepancies and ensures that frontend and backend systems operate with the same understanding of specific sentinel values or configurations.
The core responsibility of this module is to define and export these shared constants.
One of the key components is the const.ts file. This file contains the actual definitions of the constants. A notable constant defined here is MISSING_DATA_SENTINEL.
The MISSING_DATA_SENTINEL constant (value: 1e32) is critical for representing missing data points within traces. The backend uses this specific floating-point value to indicate that a sample is absent. The choice of 1e32 is deliberate. JSON, the data interchange format used, does not natively support NaN (Not a Number) or infinity values (+/- Inf). Therefore, a valid float32 that has a compact JSON representation and is unlikely to clash with actual data values was chosen. It is imperative that this frontend constant remains synchronized with the MissingDataSentinel constant defined in the backend Go package //go/vec32/vec. This synchronization ensures that both the UI and the backend correctly interpret missing data.
Any part of the Perf UI that needs to interpret or display trace data, especially when dealing with potentially incomplete datasets, will rely on this MISSING_DATA_SENTINEL. For instance, charting libraries or data table components might use this constant to visually differentiate missing points or to exclude them from calculations.
Workflow involving MISSING_DATA_SENTINEL:
Backend Data Generation --> Data contains MissingDataSentinel from //go/vec32/vec
      |
      V
Data Serialization (JSON) --> 1e32 is used for missing data
      |
      V
Frontend Data Fetching
      |
      V
Frontend UI Component (e.g., a chart)
      |
      V
UI uses MISSING_DATA_SENTINEL from /modules/const/const.ts to identify missing points
      |
      V
Appropriate rendering (e.g., gap in a line chart, specific placeholder in a table)
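A trivial sketch of the frontend side of this contract is shown below; the filter and averaging helpers are illustrative, and the import path depends on the consuming module's location.

```ts
import { MISSING_DATA_SENTINEL } from '../const/const'; // path depends on the consumer

// Keep only real samples from a trace; 1e32 marks points the backend had no
// data for, so they should be skipped rather than plotted or averaged.
function presentValues(trace: number[]): number[] {
  return trace.filter((v) => v !== MISSING_DATA_SENTINEL);
}

// Example: averaging while ignoring missing samples.
function average(trace: number[]): number | null {
  const values = presentValues(trace);
  if (values.length === 0) return null;
  return values.reduce((a, b) => a + b, 0) / values.length;
}
```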
The /modules/csv module provides functionality to convert DataFrame objects, a core data structure representing performance or experimental data, into the Comma Separated Values (CSV) format. This conversion is essential for users who wish to export data for analysis in external tools, spreadsheets, or for archival purposes.
The primary challenge in converting a DataFrame to CSV lies in representing the potentially sparse and varied parameter sets associated with each trace (data series) in a flat, tabular format. The DataFrame stores traces indexed by a “trace ID,” which is a string encoding of key-value pairs representing the parameters that uniquely identify that trace.
The conversion process addresses this challenge through a multi-step approach:
Parameter Key Consolidation:
- The parseIdsIntoParams function takes an array of trace IDs and transforms each ID string back into its constituent key-value parameter pairs. This is achieved by leveraging the fromKey function from the //perf/modules/paramtools module.
- The allParamKeysSorted function then iterates through all these parsed parameter sets to identify the complete, unique set of all parameter keys present across all traces. These keys are then sorted alphabetically. This sorted list of unique parameter keys will form the initial set of columns in the CSV, ensuring a consistent order and comprehensive representation of all parameters.
traceIDs = ["key1=valueA,key2=valueB", "key1=valueC,key3=valueD"]
parsedParams = {}
for each id in traceIDs:
parsedParams[id] = fromKey(id) // e.g., {"key1=valueA,key2=valueB": {key1:"valueA", key2:"valueB"}}
allKeys = new Set()
for each params in parsedParams.values():
for each key in params.keys():
allKeys.add(key)
sortedColumnNames = sorted(Array.from(allKeys)) // e.g., ["key1", "key2", "key3"]
Header Row Generation:
dataFrameToCSV function begins by constructing the header row of the CSV.sortedColumnNames derived in the previous step.DataFrame's header property. Each element in df.header typically represents a point in time (or a commit, build, etc.), and its timestamp field is converted into an ISO 8601 formatted date string.Pseudocode for header row generation:
csvHeader = sortedColumnNames
for each columnHeader in df.header:
csvHeader.push(new Date(columnHeader.timestamp * 1000).toISOString())
csvLines.push(csvHeader.join(','))
Data Row Generation:
- For each trace in df.traceset (excluding "special_" traces, which are likely internal or metadata traces not intended for direct CSV export):
  - The trace's parameter values for each of the sortedColumnNames are retrieved. If a trace does not have a value for a particular parameter key, an empty string is used, ensuring that each row has the same number of columns corresponding to the parameter keys.
  - MISSING_DATA_SENTINEL (defined in //perf/modules/const) is a special value indicating missing data; this is converted to an empty string in the CSV to represent a null or missing value. Other numerical values are appended directly.
for each traceId, traceData in df.traceset:
if traceId starts with "special_":
continue
traceParams = parsedParams[traceId]
rowData = []
for each columnName in sortedColumnNames:
rowData.push(traceParams[columnName] or "") // Add parameter value or empty string
for each value in traceData:
if value is MISSING_DATA_SENTINEL:
rowData.push("")
else:
rowData.push(value)
csvLines.push(rowData.join(','))
Final CSV String Assembly:
All generated lines (the header row and the data rows) are joined with newline characters (\n) to produce the complete CSV string.

The design prioritizes creating a CSV that is both human-readable and easily parsable by other tools. By dynamically determining the parameter columns based on the input DataFrame and sorting them, it ensures that all relevant trace metadata is included in a consistent manner. The explicit handling of MISSING_DATA_SENTINEL ensures that missing data is represented clearly as empty fields.
The key files in this module are:
- index.ts: This file contains the core logic for the CSV conversion. It houses the parseIdsIntoParams, allParamKeysSorted, and the main dataFrameToCSV functions. It leverages helper functions from //perf/modules/paramtools for parsing trace ID strings and relies on constants from //perf/modules/const for identifying missing data.
- index_test.ts: This file provides unit tests for the dataFrameToCSV function. It defines a sample DataFrame with various scenarios, including different parameter sets per trace and missing data points, and asserts that the generated CSV matches the expected output. This is crucial for ensuring the correctness and robustness of the CSV generation logic.

The dependencies on //perf/modules/const (for MISSING_DATA_SENTINEL) and //perf/modules/json (for DataFrame, ColumnHeader, Params types) indicate that this module is tightly integrated with the broader data representation and handling mechanisms of the Perf system. The dependency on //perf/modules/paramtools (for fromKey) highlights its role in interpreting the structured information encoded within trace IDs.
The dataframe module is designed to manage and manipulate time-series data, specifically performance testing traces, within the Perf application. It provides a centralized way to fetch, store, and process trace data, enabling functionalities like visualizing performance trends, identifying anomalies, and managing user-reported issues.
The core idea is to have a reactive data repository that components can consume. This allows for efficient data loading and updates, especially when dealing with large datasets and dynamic time ranges. Instead of each component fetching and managing its own data, they can rely on a shared DataFrameRepository to handle these tasks. This promotes consistency and reduces redundant data fetching.
dataframe_context.ts

This file defines the DataFrameRepository class, which acts as the central data store and manager. It's implemented as a LitElement (<dataframe-repository-sk>) that doesn't render any UI itself but provides data and loading states through Lit contexts.
Why a LitElement with Contexts? Using a LitElement allows easy integration into the existing component-based architecture. Lit contexts (@lit/context) provide a clean and reactive way for child components to consume the DataFrame and related information without prop drilling or complex event bus implementations.
Core Functionalities:
Data Fetching:
- resetTraces(range, paramset): Fetches an initial set of traces based on a time range and a ParamSet (a set of key-value pairs defining the traces to query). This is typically called when the user defines a new query.

  User defines query -> explore-simple-sk calls resetTraces()
        |
        V
  DataFrameRepository -> Fetches data from /_/frame/start
        |
        V
  Updates internal _header, _traceset, anomaly, userIssues
        |
        V
  Provides DataFrame, DataTable, AnomalyMap, UserIssueMap via context

- extendRange(offsetInSeconds): Fetches additional data to extend the current time range, either forwards or backwards. This is used for infinite scrolling or when the user wants to see more data. To improve performance for large range extensions, it slices the requested range into smaller chunks (chunkSize) and fetches them concurrently.

  User scrolls/requests more data -> UI calls extendRange()
        |
        V
  DataFrameRepository -> Slices range into chunks if needed
        |
        V
  Fetches data for each chunk from /_/frame/start concurrently
        |
        V
  Merges new data with existing _header, _traceset, anomaly
        |
        V
  Provides updated DataFrame, DataTable, AnomalyMap via context

- Both methods POST to the /_/frame/start endpoint, sending a FrameRequest which includes the time range, query (derived from ParamSet), and timezone.
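A hedged sketch of how a consumer might drive these two methods once it has a DataFrameRepository instance (obtained here by element query for brevity; real consumers typically get it via dataframeRepoContext). The time-range shape, the ParamSet literal, and the sign convention for extendRange are assumptions and should be checked against dataframe_context.ts.

```ts
// Sketch only: drive the repository's fetch methods described above.
const repo = document.querySelector('dataframe-repository-sk') as any;

async function loadInitialData() {
  // Assumed range shape: begin/end as Unix seconds covering the last 7 days.
  const range = { begin: Date.now() / 1000 - 7 * 24 * 3600, end: Date.now() / 1000 };
  const paramset = { config: ['8888'], test: ['draw_a_circle'] }; // illustrative query
  await repo.resetTraces(range, paramset);
}

async function loadOlderData() {
  // Assumption: a negative offset extends the range backwards by one day.
  await repo.extendRange(-24 * 3600);
}
```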
Fetched data is cached internally in _header (an array of ColumnHeader objects, representing commit points/timestamps) and _traceset (a TraceSet object mapping trace keys to their data arrays).
Traces with no value at a given commit are padded with MISSING_DATA_SENTINEL to maintain alignment with the header.
Anomaly Management:
Anomalies are fetched (as an AnomalyMap) along with the trace data.
updateAnomalies(anomalies, id): Allows merging new anomalies and removing specific anomalies (e.g., when an anomaly is nudged or re-triaged). This uses mergeAnomaly and removeAnomaly from index.ts.
User-Reported Issue Management:
getUserIssues(traceKeys, begin, end): Fetches user-reported issues (e.g., Buganizer bugs linked to specific data points) from the /_/user_issues/ endpoint for a given set of traces and commit range.
updateUserIssue(traceKey, commitPosition, bugId): Updates the local cache of user issues, typically after a new issue is filed or an existing one is modified.
Trace keys are stripped of function wrappers (e.g., norm()) before querying for user issues to ensure issues are found even if the displayed trace is a transformed version of the original.
Google DataTable Conversion:
The repository converts the DataFrame into a google.visualization.DataTable format using convertFromDataframe (from perf/modules/common:plot-builder_ts_lib). This DataTable is then provided via dataTableContext and is typically consumed by charting components like <plot-google-chart-sk>.
The conversion waits for the charting library to be available (DataFrameRepository.loadPromise).
State Management:
loading: A boolean provided via dataframeLoadingContext to indicate if a data request is in flight.
_requestComplete: A Promise that resolves when the current data fetching operation completes. This can be used to coordinate actions that depend on data being available.
Contexts Provided:
dataframeContext: Provides the current DataFrame object.
dataTableContext: Provides the google.visualization.DataTable derived from the DataFrame.
dataframeAnomalyContext: Provides the AnomalyMap for the current data.
dataframeUserIssueContext: Provides the UserIssueMap for the current data.
dataframeLoadingContext: Provides a boolean indicating if data is currently being loaded.
dataframeRepoContext: Provides the DataFrameRepository instance itself, allowing consumers to call its methods (e.g., extendRange).
index.ts: This file contains utility functions for manipulating DataFrame structures, similar to its Go counterpart (//perf/go/dataframe/dataframe.go). These functions are crucial for merging, slicing, and analyzing the data.
Key Functions:
findSubDataframe(header, range, domain): Given a DataFrame header and a time/offset range, this function finds the start and end indices within the header that correspond to the given range. This is essential for slicing data.
generateSubDataframe(dataframe, range): Creates a new DataFrame containing only the data within the specified index range of the original DataFrame.
mergeAnomaly(anomaly1, ...anomalies): Merges multiple AnomalyMap objects into a single one. If anomalies exist for the same trace and commit, the later ones in the arguments list will overwrite earlier ones. It always returns a non-null AnomalyMap.
removeAnomaly(anomalies, id): Creates a new AnomalyMap excluding any anomalies with the specified id. This is used when an anomaly is moved or re-triaged on the backend, and the old entry needs to be cleared.
findAnomalyInRange(allAnomaly, range): Filters an AnomalyMap to include only anomalies whose commit positions fall within the given commit range.
mergeColumnHeaders(a, b): Merges two arrays of ColumnHeader objects, producing a new sorted array of unique headers. It also returns mapping objects (aMap, bMap) that indicate the new index of each header from the original arrays. This is fundamental for the join operation.
join(a, b): Combines two DataFrame objects into a new one. It merges the headers with mergeColumnHeaders, then merges the traceset: for each trace in the original DataFrames, it uses the aMap and bMap to place the trace data points into the correct slots in the new, longer trace arrays, filling gaps with MISSING_DATA_SENTINEL. Finally, it merges the paramset from both DataFrames.
buildParamSet(d): Reconstructs the paramset of a DataFrame based on the keys present in its traceset. This ensures the paramset accurately reflects the data.
timestampBounds(df): Returns the earliest and latest timestamps present in the DataFrame's header.
traceset.ts: This file provides utility functions for extracting and formatting information from the trace keys within a DataFrame or DataTable. Trace keys are strings that encode various parameters (e.g., ",benchmark=Speedometer,test=MotionMark,").
Key Functions:
getAttributes(df): Extracts all unique attribute keys (e.g., "benchmark", "test") present across all trace keys in a DataFrame.
getTitle(dt): Identifies the common key-value pairs across all trace labels in a DataTable. These common pairs form the "title" of the chart, representing what all displayed traces have in common. Why DataTable input? This function is often used directly with the DataTable that feeds a chart, as column labels in the DataTable are typically the trace keys.
getLegend(dt): Identifies the key-value pairs that are not common across all trace labels in a DataTable. These differing parts form the "legend" for each trace, distinguishing them from one another. Keys that are missing for a given trace are reported as "untitled_key" for consistency in display.
titleFormatter(title): Formats the output of getTitle (an object) into a human-readable string, typically by joining values with '/'.
legendFormatter(legend): Formats the output of getLegend (an array of objects) into an array of human-readable strings.
getLegendKeysTitle(label): Takes a legend object (for a single trace) and creates a string by joining its keys, often used as a title for the legend section.
isSingleTrace(dt): Checks if a DataTable contains data for only a single trace (i.e., has 3 columns: domain, commit position/date, and one trace).
findTraceByLabel(dt, legendTraceId): Finds the column label (trace key) in a DataTable that matches the given legendTraceId.
findTracesForParam(dt, paramKey, paramValue): Finds all trace labels in a DataTable that contain a specific key-value pair.
removeSpecialFunctions(key): A helper used internally to strip function wrappers (like norm(...)) from trace keys before processing, ensuring that the underlying parameters are correctly parsed.
Design Rationale for Title/Legend Generation: When multiple traces are plotted, the title should reflect what's common among them (e.g., "benchmark=Speedometer"), and the legend should highlight what's different (e.g., "test=Run1" vs. "test=Run2"). These functions automate this process by analyzing the trace keys.
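To illustrate the title/legend split, the following self-contained sketch re-implements the idea with plain string parsing; it does not use the module's actual getTitle/getLegend helpers, and all function names here are illustrative.

// Simplified sketch of the title/legend split described above.
type Params = { [key: string]: string };

// Parse a structured trace key like ",benchmark=Speedometer,test=Run1," into params.
function parseKey(key: string): Params {
  const params: Params = {};
  key
    .split(',')
    .filter((kv) => kv.includes('='))
    .forEach((kv) => {
      const [k, v] = kv.split('=');
      params[k] = v;
    });
  return params;
}

// Title = key/value pairs shared by every trace; legend = what differs per trace.
function splitTitleAndLegend(keys: string[]): { title: Params; legends: Params[] } {
  const all = keys.map(parseKey);
  const title: Params = {};
  Object.entries(all[0] ?? {}).forEach(([k, v]) => {
    if (all.every((p) => p[k] === v)) title[k] = v;
  });
  const legends = all.map((p) =>
    Object.fromEntries(Object.entries(p).filter(([k]) => !(k in title)))
  );
  return { title, legends };
}

// Example:
//   splitTitleAndLegend([',benchmark=Speedometer,test=Run1,', ',benchmark=Speedometer,test=Run2,'])
//   => title: { benchmark: 'Speedometer' }, legends: [{ test: 'Run1' }, { test: 'Run2' }]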
1. User navigates to a page or submits a query.
|
V
2. <explore-simple-sk> (or similar component) determines initial time range and ParamSet.
|
V
3. Calls `dataframeRepository.resetTraces(initialRange, initialParamSet)`.
|
V
4. DataFrameRepository:
a. Sets `loading = true`.
b. Constructs `FrameRequest`.
c. POSTs to `/_/frame/start`.
d. Receives `FrameResponse` (containing DataFrame and AnomalyMap).
e. Updates its internal `_header`, `_traceset`, `anomaly`.
f. Calls `setDataFrame()`:
i. Updates `this.dataframe` (triggers `dataframeContext`).
ii. Converts DataFrame to `google.visualization.DataTable`.
iii. Updates `this.data` (triggers `dataTableContext`).
g. Updates `this.anomaly` (triggers `dataframeAnomalyContext`).
h. Sets `loading = false`.
|
V
5. Charting components (consuming `dataTableContext`) re-render with the new data.
|
V
6. Other UI elements (consuming `dataframeContext`, `dataframeAnomalyContext`) update.
1. User action triggers a request to load more data (e.g., scrolls near edge of chart).
|
V
2. UI component calls `dataframeRepository.extendRange(offsetInSeconds)`.
|
V
3. DataFrameRepository:
a. Sets `loading = true`.
b. Calculates the new time range (`deltaRange`).
c. Slices the new range into chunks if `offsetInSeconds` is large (`sliceRange`).
d. For each chunk:
i. Constructs `FrameRequest`.
ii. POSTs to `/_/frame/start`.
e. `Promise.all` awaits all chunk responses.
f. Filters out empty/error responses and sorts responses by timestamp.
g. Merges `header` and `traceset` from sorted responses into existing `_header` and `_traceset`.
- For traceset: pads with `MISSING_DATA_SENTINEL` if a trace is missing in a new chunk.
h. Merges `anomalymap` from sorted responses into existing `anomaly`.
i. Calls `setDataFrame()` (as in initial load).
j. Sets `loading = false`.
|
V
4. Charting components and other UI elements update.
1. Charting component (e.g., <perf-explore-sk>) has access to the `DataTable` via `dataTableContext`.
|
V
2. It calls `getTitle(dataTable)` and `getLegend(dataTable)` from `traceset.ts`.
|
V
3. It then uses `titleFormatter` and `legendFormatter` to get displayable strings.
|
V
4. Renders these strings as the chart title and legend series.
dataframe_context_test.ts: Tests the DataFrameRepository class. It uses fetch-mock to simulate API responses from /_/frame/start and /_/user_issues/. Tests cover initialization, data loading (resetTraces), range extension (extendRange) with and without chunking, anomaly merging, and user issue fetching/updating.
index_test.ts: Tests the utility functions in index.ts, such as mergeColumnHeaders, join, findSubDataframe, mergeAnomaly, etc. It uses manually constructed DataFrame objects to verify the logic of these data manipulation functions.
traceset_test.ts: Tests the functions in traceset.ts for extracting titles and legends from trace keys. It generates DataFrame objects with various key combinations, converts them to DataTable (requiring the Google Chart API to be loaded), and then asserts the output of getTitle, getLegend, etc.
test_utils.ts: Provides helper functions for tests, notably:
generateFullDataFrame: Creates mock DataFrame objects with specified structures, which is invaluable for setting up consistent test scenarios.
generateAnomalyMap: Creates mock AnomalyMap objects linked to a DataFrame.
mockFrameStart: A utility to easily mock the /_/frame/start endpoint with fetch-mock, returning parts of a provided full DataFrame based on the request's time range.
mockUserIssues: Mocks the /_/user_issues/ endpoint.
The testing strategy relies heavily on creating controlled mock data and API responses to ensure that the data processing and fetching logic behaves as expected under various conditions.
The day-range-sk module provides a custom HTML element for selecting a date range. It allows users to pick a “begin” and “end” date, which is a common requirement in applications that deal with time-series data or event logging.
The primary goal of this module is to offer a user-friendly way to define a time interval. It achieves this by composing two calendar-input-sk elements, one for the start date and one for the end date. This design choice leverages an existing, well-tested component for date selection, promoting code reuse and consistency.
Key Components and Responsibilities:
day-range-sk.ts: This is the core file defining the DayRangeSk custom element.
It extends ElementSk, a base class for custom elements, providing lifecycle callbacks and rendering capabilities.
It uses the lit-html library for templating, rendering two calendar-input-sk elements labeled "Begin" and "End".
The begin and end dates are stored as attributes (and corresponding properties) representing Unix timestamps in seconds. This is a common and unambiguous way to represent points in time.
When either calendar-input-sk element fires an input event (signifying a date change), the DayRangeSk element updates its corresponding begin or end attribute and then dispatches a custom event named day-range-change.
The day-range-change event's detail object contains the begin and end timestamps, allowing parent components to easily consume the selected range.
Default values for begin and end are set if not provided: begin defaults to 24 hours before the current time, and end defaults to the current time. This provides a sensible initial state.
connectedCallback and attributeChangedCallback are used to ensure the element renders correctly when added to the DOM or when its attributes are modified.
day-range-sk.scss: This file contains the styling for the day-range-sk element.
It imports shared theme styles (themes.scss) and defines specific styles for the labels and input fields within the day-range-sk component, ensuring they adapt to light and dark modes.
day-range-sk-demo.html and day-range-sk-demo.ts: These files provide a demonstration page for the day-range-sk element.
The demo page includes several instances of day-range-sk with different initial begin and end attributes.
The demo script listens for the day-range-change event from these instances and displays the event details in a <pre> tag, demonstrating how to retrieve the selected date range.
day-range-sk_puppeteer_test.ts: This file contains Puppeteer tests for the day-range-sk element.
It uses the loadCachedTestBed utility to set up a testing environment, navigates to the demo page, and takes screenshots for visual regression testing. It also performs a basic smoke test to confirm the element is present on the page.
Key Workflows:
Initialization:
User HTML -> day-range-sk (attributes: begin, end)
day-range-sk.connectedCallback()
IF begin/end not set: Set default begin (now - 24h), end (now)
_render()
Create two <calendar-input-sk> elements with initial dates
User Selects a New "Begin" Date:
User interacts with "Begin" <calendar-input-sk>
<calendar-input-sk> fires "input" event (with new Date)
day-range-sk._beginChanged(event)
Update this.begin (convert Date to timestamp)
this._sendEvent()
Dispatch "day-range-change" event with { begin: new_begin_timestamp, end: current_end_timestamp }
User Selects a New "End" Date:
User interacts with "End" <calendar-input-sk>
<calendar-input-sk> fires "input" event (with new Date)
day-range-sk._endChanged(event)
Update this.end (convert Date to timestamp)
this._sendEvent()
Dispatch "day-range-change" event with { begin: current_begin_timestamp, end: new_end_timestamp }
Parent Component Consumes Date Range:
Parent component listens for "day-range-change" on <day-range-sk>
On event: Access event.detail.begin and event.detail.end
Perform actions with the new date range
The conversion between Date objects (used by calendar-input-sk) and numeric timestamps (used by day-range-sk's attributes and events) is handled internally by the dateFromTimestamp utility function and by using Date.prototype.valueOf() / 1000. This design ensures that the day-range-sk element exposes a simple, numeric API for its date range while leveraging a more complex date object-based component for the UI.
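A minimal sketch of a parent consuming the element, assuming only what is described above (the day-range-change event and its { begin, end } detail in Unix seconds); the query selector and logging are illustrative.

import './day-range-sk';

const picker = document.querySelector('day-range-sk')!;
picker.addEventListener('day-range-change', (e: Event) => {
  const { begin, end } = (e as CustomEvent<{ begin: number; end: number }>).detail;
  // Convert back to Date objects for display; the element itself works in seconds.
  console.log('Selected range:', new Date(begin * 1000), '->', new Date(end * 1000));
});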
The domain-picker-sk module provides a custom HTML element <domain-picker-sk> that allows users to select a data domain. This domain can be defined in two ways: either as a specific date range or as a number of data points (commits) preceding a chosen end date. This flexibility is crucial for applications that need to visualize or analyze time-series data where users might want to focus on a specific period or view the most recent N data points.
The core design choice is to offer these two distinct modes of domain selection, catering to different user needs. The “Date Range” mode is useful when users know the specific start and end dates they are interested in. The “Dense” mode is more suitable when users want to see a fixed amount of recent data, regardless of the specific start date.
The component's state is managed internally and can also be set externally via the state property. This state object, defined by the DomainPickerState interface, holds the begin and end timestamps (in Unix seconds), the num_commits (for “Dense” mode), and the request_type which indicates the current selection mode (0 for “Date Range” - RANGE, 1 for “Dense” - DENSE).
Key Files and Their Responsibilities:
domain-picker-sk.ts: This is the heart of the module. It defines the DomainPickerSk class, which extends ElementSk.
It uses the lit-html library for templating, allowing for efficient updates to the DOM when the state changes. The template static method defines the basic structure, and the _showRadio and _requestType static methods conditionally render different parts of the UI based on the current request_type and the force_request_type attribute.
The component's state is held in the _state object. Initial default values are set in the constructor (e.g., end date is now, begin date is 24 hours ago, default num_commits is 50).
Event handlers such as typeRange, typeDense, beginChange, endChange, and numChanged update the internal _state and then call render() to reflect these changes in the UI.
The force_request_type attribute ('range' or 'dense') allows the consuming application to lock the picker into a specific mode, hiding the radio buttons that would normally allow the user to switch. This is useful when the application context dictates a specific type of domain selection. The attributeChangedCallback and the getter/setter for force_request_type handle this.
It composes radio-sk for mode selection and calendar-input-sk for date picking, promoting modularity and reuse.
domain-picker-sk.scss: This file contains the SASS styles for the component.
It imports shared styles from elements-sk/modules/styles for consistency (e.g., buttons, colors).
index.ts: A simple entry point that imports and registers the domain-picker-sk custom element.
It contains import './domain-picker-sk';, which ensures the DomainPickerSk class is defined and registered with the browser's CustomElementRegistry via the define function call within domain-picker-sk.ts.
domain-picker-sk-demo.html and domain-picker-sk-demo.ts: These files provide a demonstration page for the component.
domain-picker-sk-demo.html includes instances of <domain-picker-sk>, some with the force_request_type attribute set. domain-picker-sk-demo.ts initializes the state of these demo instances with sample data.
domain-picker-sk_puppeteer_test.ts: Contains Puppeteer tests for the component.
The tests use the puppeteer-tests/util library to load the demo page and take screenshots, verifying the visual appearance of the component in its default state.
Key Workflows/Processes:
Initialization and Rendering:
The <domain-picker-sk> element is added to the DOM and connectedCallback is invoked.
The state and force_request_type properties are upgraded (if set as attributes before the element was defined).
A default _state is established (e.g., end = now, begin = 24h ago, mode = RANGE).
render() is called:
It checks force_request_type. If set, it overrides _state.request_type.
_showRadio decides whether to show mode selection radio buttons.
_requestType renders either the "Begin" date input (for RANGE mode) or the "Points" number input (for DENSE mode).
[DOM Insertion] -> connectedCallback() -> _upgradeProperty('state')
-> _upgradeProperty('force_request_type')
-> render()
|
V
[UI Displayed]
User Changes Mode (if force_request_type is not set):
The radio button's @change event triggers typeRange() or typeDense().
_state.request_type is updated.
render() is called.
[User clicks radio] -> typeRange()/typeDense() -> _state.request_type updated
-> render()
|
V
[UI Updates]
User Changes Date/Number of Commits:
The user interacts with a <calendar-input-sk> (for Begin/End dates) or the <input type="number"> (for Points).
The @input (for calendar) or @change (for number input) event triggers beginChange(), endChange(), or numChanged().
_state (e.g., _state.begin, _state.end, _state.num_commits) is updated.
render() is called. (For date changes, <calendar-input-sk> handles its own visual update of the date display; calling render() on the parent keeps the rest of the template in sync, though it may be redundant if nothing else in this component's template depends on these values.)
[User changes input] -> beginChange()/endChange()/numChanged()
|
V
_state updated
|
V
render() // Potentially re-renders the component
|
V
[UI reflects new value]
The component emits no custom events itself but relies on the events from its child components (radio-sk, calendar-input-sk) to trigger internal state updates and re-renders. Consumers of domain-picker-sk would typically read the state property to get the user's selection.
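A hedged sketch of a consumer reading the picker's selection, based on the DomainPickerState fields described above (begin/end in Unix seconds, num_commits, and request_type with 0 = RANGE and 1 = DENSE); the typing shim and logging are illustrative.

import './domain-picker-sk';

const picker = document.querySelector('domain-picker-sk')! as HTMLElement & {
  state: { begin: number; end: number; num_commits: number; request_type: number };
};

function readSelection(): void {
  const s = picker.state;
  if (s.request_type === 0) {
    // RANGE mode: an explicit date range.
    console.log('Date range:', new Date(s.begin * 1000), '->', new Date(s.end * 1000));
  } else {
    // DENSE mode: the last N commits ending at the chosen end date.
    console.log(`Last ${s.num_commits} commits ending at`, new Date(s.end * 1000));
  }
}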
The errorMessage module provides a wrapper around the errorMessage function from the elements-sk library. Its primary purpose is to offer a more convenient way to display persistent error messages to the user.
Core Functionality and Design Rationale:
The key differentiation of this module lies in its default behavior for message display duration. While the elements-sk errorMessage function requires a duration to be specified for how long a message (often referred to as a “toast”) remains visible, this module defaults the duration to 0 seconds.
This design choice is intentional: a duration of 0 typically signifies that the error message will not automatically close. This is particularly useful in scenarios where an error is critical or requires user acknowledgment, and an auto-dismissing message might be missed. By defaulting to a persistent display, the module prioritizes ensuring the user is aware of the error.
Responsibilities and Key Components:
The module exposes a single function: errorMessage.
errorMessage(message: string | { message: string } | { resp: Response } | object, duration: number = 0): void
It accepts the same message parameter as the underlying elements-sk function. This means it can handle plain strings, objects with a message property, objects containing a Response object (from which an error message can often be extracted), or generic objects.
It accepts an optional duration parameter. If not explicitly provided by the caller, it defaults to 0. This default triggers the persistent display behavior mentioned above.
It delegates to elementsErrorMessage from the elements-sk library, passing along the provided message and the (potentially defaulted) duration.
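A minimal sketch of what this wrapper looks like, based on the description above; the exact elements-sk import path is an assumption.

// Thin wrapper: same parameters as the elements-sk helper, but duration defaults to 0.
import { errorMessage as elementsErrorMessage } from '../../../elements-sk/modules/errorMessage';

export function errorMessage(
  message: string | { message: string } | { resp: Response } | object,
  duration: number = 0
): void {
  // Delegate to the underlying implementation; duration 0 means "do not auto-dismiss".
  elementsErrorMessage(message, duration);
}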
Workflow:
The typical workflow for using this module is straightforward:
The errorMessage function is imported from this module.
When an error needs to be reported, the errorMessage function is called with the error details.
errorMessage("A critical error occurred.") -> Displays "A critical error occurred." indefinitely.
errorMessage("Something went wrong.", 5000) -> Displays "Something went wrong." for 5 seconds (overriding the default).
Essentially, this module acts as a thin convenience layer, promoting a specific error display pattern (persistent messages) by changing the default behavior of a more general utility. This reduces boilerplate for common use cases where persistent error notification is desired.
The existing-bug-dialog-sk module provides a user interface element for associating performance anomalies with existing bug reports in a bug tracking system (like Monorail). It's designed to be used within a larger performance monitoring application where users need to triage and manage alerts generated by performance regressions.
The core purpose of this module is to simplify the workflow of linking one or more detected anomalies to a pre-existing bug. Instead of manually navigating to the bug tracker and updating the bug, users can do this directly from the performance monitoring interface. This reduces context switching and streamlines the bug management process.
Key Components and Responsibilities:
existing-bug-dialog-sk.ts: This is the heart of the module, defining the custom HTML element existing-bug-dialog-sk.
It manages the dialog lifecycle (open(), closeDialog()).
It stores the anomalies to be associated (_anomalies).
On submission it POSTs to the backend (/_/triage/associate_alerts) to create the association.
On success it dispatches a custom event, anomaly-changed. This event signals other parts of the application (e.g., charts or lists displaying anomalies) that the anomaly data has been updated (specifically, the bug_id field) and they might need to re-render.
It queries /_/anomalies/group_report to get details of anomalies in the same group, including their associated bug_ids. This endpoint might return a sid (state ID) if the report generation is asynchronous, requiring a follow-up request.
It then calls /_/triage/list_issues to fetch the titles of these bugs. This provides more context to the user than just showing bug IDs.
The setAnomalies() method is crucial for initializing the dialog with the relevant anomaly data when it's about to be shown.
It uses window.perf.bug_host_url to construct links to the bug tracker.
existing-bug-dialog-sk.scss: This file contains the SASS/CSS styles for the dialog.
It uses CSS variables for theming (--on-background, --background, etc.).
index.ts: This is a simple entry point that imports and registers the existing-bug-dialog-sk custom element, making it available for use in HTML.
Workflow for Associating Anomalies with an Existing Bug:
1. The application calls setAnomalies() on an existing-bug-dialog-sk instance, passing the selected anomalies, and then calls open() on the dialog instance.
2. On open, the dialog calls fetch_associated_bugs() against the backend (/_/anomalies/group_report) to retrieve the bug IDs already associated with the anomaly group, then fetch_bug_titles() (/_/triage/list_issues) to resolve their titles, and renders the dialog with the form and the list of associated bugs.
3. When the user submits the form:
The dialog sets _spinner.active = true (UI update: show spinner).
It POSTs {bug_id, keys} to /_/triage/associate_alerts.
When the response arrives, it sets _spinner.active = false (UI update: hide spinner).
On success: closeDialog() hides the dialog, window.open(bug_url) opens the bug in a new tab, and dispatchEvent('anomaly-changed') notifies other components.
On failure: errorMessage(msg) shows an error toast.
4. Components listening for anomaly-changed update to reflect the new association.
The design prioritizes a clear and focused user experience for a common task in performance alert triaging. By integrating directly with the backend API for bug association and fetching related bug information, it aims to be an efficient tool for developers and SREs. The use of custom events allows for loose coupling with other components in the larger application.
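For illustration, a hedged sketch of how an application might drive this dialog, using only the documented setAnomalies()/open() API and the anomaly-changed event; the element lookup, typing shim, and simplified Anomaly shape are assumptions.

// Hypothetical wiring; the Anomaly type here is simplified for the example.
interface Anomaly {
  bug_id: number;
  // ...other fields elided
}

const dialog = document.querySelector('existing-bug-dialog-sk')! as HTMLElement & {
  setAnomalies: (anomalies: Anomaly[]) => void;
  open: () => void;
};

function triageAgainstExistingBug(selected: Anomaly[]): void {
  dialog.setAnomalies(selected); // hand over the selected anomalies
  dialog.open();                 // dialog fetches associated bugs/titles and shows the form
}

// Other components refresh when the association succeeds.
document.addEventListener('anomaly-changed', () => {
  // e.g., re-render charts or anomaly lists so the new bug_id is shown
});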
The explore-multi-sk module provides a user interface for displaying and interacting with multiple performance data graphs simultaneously. This is particularly useful when users need to compare different metrics, configurations, or time ranges side-by-side. The core idea is to leverage the functionality of individual explore-simple-sk elements, which represent single graphs, and manage their states and interactions within a unified multi-graph view.
State Management: A central State object within explore-multi-sk manages properties that are common across all displayed graphs. These include the time range (begin, end), display options (showZero, dots), and pagination settings (pageSize, pageOffset). This approach simplifies the overall state management and keeps the URL from becoming overly complex, as only a limited set of shared parameters need to be reflected in the URL.
Each individual graph (explore-simple-sk instance) maintains its own specific state related to the data it displays (formulas, queries, selected keys). explore-multi-sk stores an array of GraphConfig objects, where each object corresponds to an explore-simple-sk instance and holds its unique configuration.
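An illustrative sketch of the per-graph configuration described above; the real GraphConfig type is defined alongside explore-simple-sk, and the exact field names here are an assumption based on the "formulas, queries, selected keys" description.

// Assumed shape; see explore-simple-sk for the authoritative definition.
interface GraphConfig {
  formulas: string[]; // e.g. ['norm(filter("benchmark=Speedometer"))']
  queries: string[];  // e.g. ['benchmark=Speedometer&bot=linux-perf']
  keys: string;       // a shortcut ID for an explicitly selected set of trace keys
}

// explore-multi-sk keeps one entry per explore-simple-sk instance it manages.
const graphConfigs: GraphConfig[] = [];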
The stateReflector utility is used to synchronize the shared State with the URL, allowing for bookmarking and sharing of multi-graph views.
Dynamic Graph Addition and Removal: Users can dynamically add new graphs to the view. When a new graph is added, an empty explore-simple-sk instance is created and the user can then configure its data source (query or formula).
If the useTestPicker option is enabled (often determined by backend defaults), instead of a simple “Add Graph” button, a test-picker-sk element is displayed. This component provides a more structured way to select tests and parameters, and upon selection, a new graph is automatically generated and populated.
Graphs can also be removed. Event listeners are in place to handle remove-explore custom events, which are typically dispatched by the individual explore-simple-sk elements when a user closes them in a “Multiview” context (where useTestPicker is active).
Pagination: To handle potentially large numbers of graphs, pagination is implemented using the pagination-sk element. This allows users to view a subset of the total graphs at a time, improving performance and usability. The pageSize and pageOffset are part of the shared state.
Graph Manipulation (Split and Merge):
These operations primarily involve manipulating the graphConfigs array and then re-rendering the graphs.
Shortcuts: The module supports saving and loading multi-graph configurations using shortcuts. When the configuration of graphs changes (traces added/removed, graphs split/merged), updateShortcutMultiview is called. This function communicates with a backend service (/_/shortcut/get and a corresponding save endpoint invoked by updateShortcut from explore-simple-sk) to store or retrieve the graphConfigs associated with a unique shortcut ID. This ID is then reflected in the URL, allowing users to share specific multi-graph setups.
Synchronization of Interactions:
When the x-axis mode of one graph is changed, an x-axis-toggled event is dispatched. explore-multi-sk listens for this and updates the x-axis on all other visible graphs to maintain consistency.
Beyond explore-multi-sk.ts, the explore-simple-sk component likely has mechanisms for plot selection. If the plotSummary feature is active, selections on one graph might influence others; explore-multi-sk's syncChartSelection method handles this cross-graph synchronization.
Defaults and Configuration: The component fetches default configurations from a /_/defaults/ endpoint. These defaults can influence various aspects, such as:
- Whether to use test-picker-sk (useTestPicker).
- Default parameters and their order for test-picker-sk (include_params, default_param_selections).
This allows for instance-specific customization of the Perf UI.
explore-multi-sk.ts:
Defines the ExploreMultiSk custom element. It is responsible for:
Creating and managing the individual explore-simple-sk graph elements.
Using stateReflector to update the URL based on the shared state.
Showing test-picker-sk if enabled.
Handling events from explore-simple-sk.
Using pagination-sk for displaying graphs in pages.
Using test-picker-sk for adding graphs when useTestPicker is true.
Using favorites-dialog-sk to allow users to save graph configurations.
explore-multi-sk.html (inferred from the Lit html template in explore-multi-sk.ts):
Defines the structure of the explore-multi-sk element. This includes:
The test-picker-sk element (conditionally visible).
pagination-sk elements for navigating through graph pages.
A container (#graphContainer) where the individual explore-simple-sk elements are dynamically rendered.
<button> elements for user actions.
<test-picker-sk> for test selection.
<pagination-sk> for graph pagination.
<favorites-dialog-sk> for saving favorites.
A div (#graphContainer) to hold the explore-simple-sk instances.
explore-multi-sk.scss:
Provides the styling for the explore-multi-sk element and its children. It ensures that the layout is appropriate for displaying multiple graphs and their controls, including:
The #menu and #pagination areas.
The explore-simple-sk plots.
The #test-picker and #add-graph-button.
1. Initial Load and State Restoration:
User navigates to URL with explore-multi-sk
|
V
explore-multi-sk.connectedCallback()
|
V
Fetch defaults from /_/defaults/
|
V
stateReflector() is initialized
|
V
State is read from URL (or defaults if URL is empty)
|
V
IF state.shortcut is present:
Fetch graphConfigs from /_/shortcut/get using the shortcut ID
|
V
ELSE (or after fetching):
For each graphConfig (or if starting fresh, often one empty graph is implied or added):
Create/configure explore-simple-sk instance
Set its state based on graphConfig and shared state
|
V
Add graphs to the current page based on pagination settings
|
V
Render the component
2. Adding a Graph (without Test Picker):
User clicks "Add Graph" button
|
V
explore-multi-sk.addEmptyGraph() is called
|
V
A new ExploreSimpleSk instance is created
A new empty GraphConfig is added to this.graphConfigs
|
V
explore-multi-sk.updatePageForNewExplore()
|
V
IF current page is full:
Increment pageOffset (triggering pageChanged)
ELSE:
Add new graph to current page
|
V
The new explore-simple-sk element might open its query dialog for the user
3. Adding a Graph (with Test Picker):
TestPickerSk is visible (due to defaults or state)
|
V
User interacts with TestPickerSk, selects tests/parameters
|
V
User clicks "Plot" button in TestPickerSk
|
V
TestPickerSk dispatches 'plot-button-clicked' event
|
V
explore-multi-sk listens for 'plot-button-clicked'
|
V
explore-multi-sk.addEmptyGraph(unshift=true) is called (new graph at the top)
|
V
explore-multi-sk.addGraphsToCurrentPage() updates the view
|
V
TestPickerSk.createQueryFromFieldData() gets the query
|
V
The new ExploreSimpleSk instance has its query set
4. Splitting a Graph:
User has one graph with multiple traces and clicks "Split Graph"
|
V
explore-multi-sk.splitGraph()
|
V
this.getTracesets() retrieves traces from the first (and only) graph
|
V
this.clearGraphs() removes the existing graph configuration
|
V
FOR EACH trace in the retrieved traceset:
this.addEmptyGraph()
A new GraphConfig is created for this trace (e.g., config.queries = [queryFromKey(trace)])
|
V
this.updateShortcutMultiview() (new shortcut reflecting multiple graphs)
|
V
this.state.pageOffset is reset to 0
|
V
this.addGraphsToCurrentPage() renders the new set of individual graphs
5. Saving/Updating a Shortcut:
Graph configuration changes (e.g., trace added/removed, graph split/merged, new graph added)
|
V
explore-multi-sk.updateShortcutMultiview() is called
|
V
Calls exploreSimpleSk.updateShortcut(this.graphConfigs)
|
V
(Inside updateShortcut)
IF graphConfigs is not empty:
POST this.graphConfigs to backend (e.g., /_/shortcut/new or /_/shortcut/update)
Backend returns a new or existing shortcut ID
|
V
explore-multi-sk.state.shortcut is updated with the new ID
|
V
this.stateHasChanged() is called, triggering stateReflector to update the URL
The explore-simple-sk module provides a custom HTML element for exploring and visualizing performance data. It allows users to query, plot, and analyze traces, identify anomalies, and interact with commit details. This element is a core component of the Perf application's data exploration interface.
Core Functionality:
The element's primary responsibility is to provide a user interface for:
Key Design Decisions and Implementation Choices:
State management: The State class in explore-simple-sk.ts defines the structure of this state.
Data fetching: Data is fetched from the /_/frame/start endpoint. The requestFrame method handles initiating these requests and processing the responses. The FrameRequest and FrameResponse types define the communication contract with the server.
Plotting: It supports both plot-simple-sk (a custom canvas-based plotter) and plot-google-chart-sk (which wraps Google Charts). The choice of plotter can be configured.
Composition: It composes several child components (e.g., query-sk for query input, paramset-sk for displaying parameters, commit-detail-panel-sk for commit information). This promotes modularity and reusability.
Events: Child components communicate with the explore-simple-sk element through custom events. For example, when a query changes in query-sk, it emits a query-change event that explore-simple-sk listens to.
Key Files and Components:
explore-simple-sk.ts: This is the main TypeScript file that defines the ExploreSimpleSk custom element.
explore-simple-sk.html (embedded in explore-simple-sk.ts): This Lit-html template defines the structure of the element's UI. It includes placeholders for various child components and dynamic content.
explore-simple-sk.scss: This SCSS file provides the styling for the element and its components.
Key child components (used within explore-simple-sk.ts):
query-sk: For constructing and managing queries.
paramset-sk: For displaying and interacting with parameter sets.
plot-simple-sk / plot-google-chart-sk: For rendering the plots.
commit-detail-panel-sk: For displaying commit information.
anomaly-sk: For displaying and managing anomalies.
Workflow Example: Plotting a Query
1. The user interacts with the query-sk element to define a query.
2. query-sk emits a query-change event with the new query.
3. explore-simple-sk listens for this event, updates its internal state (specifically the queries array in the State object), and triggers a re-render.
4. explore-simple-sk constructs a FrameRequest based on the updated state and calls requestFrame to fetch data from the server. User Input (query-sk) -> Event (query-change) -> State Update (ExploreSimpleSk) -> Data Request (requestFrame)
5. Upon receiving the FrameResponse, explore-simple-sk processes the data, updates its internal _dataframe object, and prepares the data for plotting.
6. explore-simple-sk passes the processed data to the plot-simple-sk or plot-google-chart-sk element, which then renders the traces on the graph. Server Response (FrameResponse) -> Data Processing (ExploreSimpleSk) -> Plot Update (plot-simple-sk/plot-google-chart-sk) -> Visual Output
This workflow illustrates the reactive nature of the element, where user interactions trigger state changes, which in turn lead to data fetching and UI updates.
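A hedged sketch of the same flow from the page's side: listening for query-change and issuing the frame request. The detail shape of query-change and the FrameRequest field names (begin, end, queries, tz) are assumptions based on the descriptions above, not authoritative types.

const queryEl = document.querySelector('query-sk')!;
queryEl.addEventListener('query-change', async (e: Event) => {
  const query = (e as CustomEvent<{ q: string }>).detail.q; // detail shape assumed
  const now = Math.floor(Date.now() / 1000);
  const body = {
    // Field names assumed from "time range, query (derived from ParamSet), and timezone".
    begin: now - 24 * 60 * 60,
    end: now,
    queries: [query],
    tz: Intl.DateTimeFormat().resolvedOptions().timeZone,
  };
  const resp = await fetch('/_/frame/start', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  console.log('FrameResponse:', await resp.json());
});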
The explore-sk module serves as the primary user interface for exploring and analyzing performance data within the Perf application. It provides a comprehensive view for users to query, visualize, and interact with performance traces.
The core functionality of explore-sk is built upon the explore-simple-sk element. explore-sk acts as a wrapper, enhancing explore-simple-sk with additional features like user authentication integration, default configuration loading, and the optional test-picker-sk for more guided query construction.
Key Responsibilities and Components:
explore-sk.ts: This is the main TypeScript file defining the ExploreSk custom element.
It fetches default configurations from the backend (/_/defaults/). This ensures that the exploration view is pre-configured with sensible starting points.
It integrates with alogin-sk to determine the logged-in user's status. This information is used to enable features like "favorites" if a user is logged in.
It uses stateReflector to persist and restore the state of the underlying explore-simple-sk element in the URL. This allows users to share specific views or bookmark their current exploration state.
It optionally shows test-picker-sk. If the use_test_picker_query flag is set in the state (often via URL parameters or defaults), the test-picker-sk component is shown, providing a structured way to build queries based on available parameter keys and values.
It listens for events from test-picker-sk (e.g., plot-button-clicked, remove-all, populate-query) and translates these into actions on the explore-simple-sk element, such as adding new traces based on the selected test parameters or clearing the view.
It delegates core plotting and querying to explore-simple-sk.
explore-simple-sk (imported module): This is a fundamental building block that handles the core trace visualization, querying logic, and interaction with the graph.
explore-sk delegates most of the heavy lifting related to data exploration to this component. It passes down the initial state, default configurations, and user-specific settings.
test-picker-sk (imported module): A component that allows users to build queries by selecting from available test parameters and their values.
It produces a query that explore-sk then uses to fetch and display the corresponding traces via explore-simple-sk. It can also be populated based on a highlighted trace, allowing users to quickly refine queries based on existing data.
favorites-dialog-sk (imported module): Enables users to save and manage their favorite query configurations.
It is integrated with explore-simple-sk, and its functionality is enabled by explore-sk based on the user's login status.
State Management (stateReflector):
explore-sk uses stateReflector to listen for state changes in explore-simple-sk. When the state changes, stateReflector updates the URL. Conversely, when the page loads or the URL changes, stateReflector parses the URL and applies the state to explore-simple-sk.
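A hedged sketch of the stateReflector wiring described above; the import path and exact signature follow the common infra-sk utility but should be treated as assumptions.

import { stateReflector } from '../../../infra-sk/modules/stateReflector';

const explore = document.querySelector('explore-simple-sk')! as HTMLElement & {
  state: Record<string, unknown>;
};

// stateReflector mirrors the element's state into the URL and applies URL state back.
// It returns a callback to invoke whenever the element's state changes.
const stateHasChanged = stateReflector(
  /* getState */ () => explore.state,
  /* setState */ (newState) => {
    explore.state = newState as Record<string, unknown>;
  }
);

// Later, e.g. inside a 'state_changed' listener:
explore.addEventListener('state_changed', () => stateHasChanged());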
Workflow Example: Initial Page Load with Test Picker
1. The explore-sk element is connected to the DOM and connectedCallback is invoked:
Defaults are fetched from /_/defaults/.
stateReflector is initialized. If the URL contains state for explore-simple-sk, it's applied; in this example the state includes use_test_picker_query = true.
2. Because use_test_picker_query is true:
initializeTestPicker() is called.
The test-picker-sk element is made visible.
test-picker-sk is initialized with parameters from the defaults (e.g., include_params, default_param_selections) or from existing queries in the state.
3. The user interacts with test-picker-sk to select desired test parameters and clicks "Plot" in test-picker-sk.
4. test-picker-sk emits a plot-button-clicked event.
5. explore-sk listens for this event: it builds a query from test-picker-sk and calls exploreSimpleSk.addFromQueryOrFormula() to add the new traces to the graph.
6. explore-simple-sk fetches the data, renders the traces, and emits a state_changed event.
7. stateReflector captures this state_changed event and updates the URL to reflect the new query.
This workflow illustrates how explore-sk acts as a central coordinator, integrating various specialized components to provide a cohesive data exploration experience. The design emphasizes modularity, with explore-simple-sk handling the core plotting and test-picker-sk offering an alternative query input mechanism, all managed and presented by explore-sk.
The favorites-dialog-sk module provides a custom HTML element that displays a modal dialog for users to add or edit “favorites.” Favorites, in this context, are likely user-defined shortcuts or bookmarks to specific views or states within the application, identified by a name, description, and a URL.
Core Functionality and Design:
The primary purpose of this module is to present a user-friendly interface for managing these favorites. It's designed as a modal dialog to ensure that the user's focus is on the task of adding or editing a favorite without distractions from the underlying page content.
Key Components:
favorites-dialog-sk.ts: This is the heart of the module, defining the FavoritesDialogSk custom element.
It extends ElementSk, a base class for custom elements in the Skia infrastructure, providing a common foundation.
It uses Lit (lit/html.js) for templating, allowing for declarative and efficient rendering of the dialog's UI.
The open() method is the public API for triggering the dialog. It accepts optional parameters for pre-filling the form when editing an existing favorite. Crucially, it returns a Promise. This promise-based approach is a key design choice. It resolves when the favorite is successfully saved and rejects if the user cancels the dialog. This allows the calling code (likely a parent component managing the list of favorites) to react appropriately, for instance, by re-fetching the updated list of favorites only when a change has actually occurred.
The confirm() method handles the submission logic. It performs basic validation (checking for empty name and URL) and then makes an HTTP POST request to either /_/favorites/new or /_/favorites/edit depending on whether a new favorite is being created or an existing one is being modified.
A spinner-sk element is used to provide visual feedback to the user during the asynchronous operation of saving the favorite.
It uses errorMessage to display issues to the user, such as network errors or validation failures from the backend.
The dismiss() method handles the cancellation of the dialog, rejecting the promise returned by open().
Input handlers (filterName, filterDescription, filterUrl) update the component's internal state as the user types, and trigger re-renders via this._render().
favorites-dialog-sk.scss: This file contains the SASS styles for the dialog.
It styles the <dialog> element, input fields, labels, and buttons, ensuring a consistent look and feel within the application's theme (as indicated by @import '../themes/themes.scss';).
favorites-dialog-sk-demo.html / favorites-dialog-sk-demo.ts: These files provide a demonstration page for the favorites-dialog-sk element.
The demo page wires up controls that call the open() method of the favorites-dialog-sk element with appropriate parameters.
Workflow: Adding/Editing a Favorite
A typical workflow involving this dialog would be:
User Action: The user clicks a button (e.g., “Add Favorite” or an “Edit” icon next to an existing favorite) in the main application UI.
Dialog Invocation: The event handler for this action calls the open() method of an instance of favorites-dialog-sk.
For adding, open() might be called with minimal or no arguments, defaulting the URL to the current page.
For editing, open() is called with the favId, name, description, and url of the favorite to be edited.
User clicks "Add New" --> favoritesDialog.open('', '', '', 'current.page.url')
|
V
Dialog Appears
|
V
User fills form, clicks "Save" --> confirm() is called
|
V
POST /_/favorites/new
|
V (Success)
Dialog closes, open() Promise resolves
|
V
Calling component re-fetches favorites
-------------------------------- OR ---------------------------------
User clicks "Edit Favorite" --> favoritesDialog.open('id123', 'My Fav', 'Desc', 'fav.url.com')
|
V
Dialog Appears (pre-filled)
|
V
User modifies form, clicks "Save" --> confirm() is called
|
V
POST /_/favorites/edit (with 'id123')
|
V (Success)
Dialog closes, open() Promise resolves
|
V
Calling component re-fetches favorites
-------------------------------- OR ---------------------------------
User clicks "Cancel" or Close Icon --> dismiss() is called
|
V
Dialog closes, open() Promise rejects
|
V
Calling component does nothing (no re-fetch)
User Interaction: The user fills in or modifies the “Name,” “Description,” and “URL” fields in the dialog.
Submission/Cancellation:
If the user clicks "Save": the confirm() method is invoked, and a fetch request is made to the backend API (/_/favorites/new or /_/favorites/edit). On success, the dialog closes and the Promise returned by open() resolves (failures are reported from within confirm).
If the user clicks "Cancel" or the close icon: the dismiss() method is invoked, and the Promise returned by open() rejects.
Post-Dialog Action: The component that initiated the dialog (e.g., a favorites-sk list component) uses the resolved/rejected state of the Promise to decide whether to refresh its list of favorites. This is a key aspect of the design: it avoids unnecessary re-fetches if the user simply cancels the dialog.
The design prioritizes a clear separation of concerns, using custom elements for UI encapsulation, SASS for styling, and a promise-based API for asynchronous operations and communication with parent components. This makes the favorites-dialog-sk a reusable and well-defined piece of UI for managing user favorites.
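A small, hedged sketch of the promise-based contract from the caller's side; the element tag and open() parameter order follow the description above, while refreshFavorites() is a hypothetical caller-side helper.

const favoritesDialog = document.querySelector('favorites-dialog-sk')! as HTMLElement & {
  open: (id?: string, name?: string, description?: string, url?: string) => Promise<void>;
};

async function onEditClicked(): Promise<void> {
  try {
    await favoritesDialog.open('id123', 'My Fav', 'Desc', 'https://example.com/view');
    await refreshFavorites(); // only on successful save
  } catch {
    // User cancelled; nothing to refresh.
  }
}

async function refreshFavorites(): Promise<void> {
  // Hypothetical: re-fetch /_/favorites/ and re-render the list.
}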
The favorites-sk module provides a user interface element for displaying and managing a user's “favorites”. Favorites are essentially bookmarked URLs, categorized into sections. This module allows users to view their favorited links, edit their details (name, description, URL), and delete them.
Core Functionality & Design:
The primary responsibility of favorites-sk is to fetch favorite data from a backend endpoint (/_/favorites/) and render it in a user-friendly way. It also handles interactions for modifying these favorites, such as editing and deleting.
Data Fetching and Rendering:
When the element is added to the DOM (connectedCallback), the element attempts to fetch the favorites configuration from the backend.
The response, in the Favorites JSON format (defined in perf/modules/json), is stored in the favoritesConfig property.
The _render() method is called to update the display.
Favorite Management:
Deleting: the deleteFavoriteConfirm method is invoked. It asks for confirmation (window.confirm) to prevent accidental deletions. On confirmation, deleteFavorite sends a POST request to /_/favorites/delete with the ID of the favorite to be removed.
Editing: handled by the editFavorite method. It opens the favorites-dialog-sk element (defined in perf/modules/favorites-dialog-sk). favorites-dialog-sk is responsible for presenting a modal dialog where the user can modify the favorite's name, description, and URL.
Error Handling:
Failures in fetching or modifying favorites are surfaced with the errorMessage utility (from elements-sk/modules/errorMessage).
favorites-sk.ts: This is the heart of the module. It defines the FavoritesSk custom element, extending ElementSk. It contains the logic for fetching, rendering, deleting, and initiating the editing of favorites.
constructor(): Initializes the element with its Lit-html template.
deleteFavorite(): Handles the asynchronous request to the backend for deleting a favorite.
deleteFavoriteConfirm(): Provides a confirmation step before actual deletion.
editFavorite(): Manages the interaction with the favorites-dialog-sk for editing.
template(): The static Lit-html template function that defines the overall structure of the element.
getSectionsTemplate(): A helper function that dynamically generates the HTML for displaying sections and their links based on favoritesConfig. It specifically adds edit/delete controls for the "My Favorites" section.
fetchFavorites(): Fetches the favorites data from the backend and triggers a re-render.
connectedCallback(): A lifecycle method that ensures favorites are fetched when the element is added to the page.
favorites-sk.scss: Provides the styling for the favorites-sk element, defining its layout, padding, colors for links, and table appearance.
index.ts: A simple entry point that imports and registers the favorites-sk custom element, making it available for use in HTML.
favorites-sk-demo.html & favorites-sk-demo.ts: These files provide a demonstration page for the favorites-sk element. The HTML includes an instance of <favorites-sk> and a <pre> tag to display events. The TypeScript file simply imports the element and sets up an event listener (though no custom events are explicitly dispatched by favorites-sk in the provided code).
Workflow: Deleting a Favorite
User Clicks "Delete" Button (for a link in "My Favorites")
|
V
favorites-sk.ts: deleteFavoriteConfirm(id, name)
|
V
window.confirm("Deleting favorite: [name]. Are you sure?")
|
+-- User clicks "Cancel" --> Workflow ends
|
V User clicks "OK"
favorites-sk.ts: deleteFavorite(id)
|
V
fetch('/_/favorites/delete', { method: 'POST', body: {id: favId} })
|
+-- Network Error/Non-OK Response --> errorMessage() is called, display error
|
V Successful Deletion
favorites-sk.ts: fetchFavorites()
|
V
fetch('/_/favorites/')
|
V
Parse JSON response, update this.favoritesConfig
|
V
this._render() // Re-renders the component with the updated list
Workflow: Editing a Favorite
User Clicks "Edit" Button (for a link in "My Favorites")
|
V
favorites-sk.ts: editFavorite(id, name, desc, url)
|
V
Get reference to <favorites-dialog-sk id="fav-dialog">
|
V
favorites-dialog-sk.open(id, name, desc, url) // Opens the edit dialog
|
+-- User cancels dialog --> Promise rejects (potentially with undefined, handled)
|
V User submits changes in dialog
Promise resolves
|
V
favorites-sk.ts: fetchFavorites() // Re-fetches and re-renders the list
|
V
fetch('/_/favorites/')
|
V
Parse JSON response, update this.favoritesConfig
|
V
this._render()
The design relies on Lit for templating and rendering, which provides efficient updates to the DOM when the favoritesConfig data changes. The separation of concerns is evident: favorites-sk handles the list display and top-level actions, while favorites-dialog-sk manages the intricacies of the editing form.
The graph-title-sk module provides a custom HTML element designed to display titles for individual graphs in a structured and informative way. Its primary goal is to present key-value pairs of metadata associated with a graph in a visually clear and space-efficient manner.
The core of this module is the GraphTitleSk custom element (graph-title-sk.ts). Its main responsibilities are:
Data Reception and Storage: It receives a Map<string, string> where keys represent parameter names (e.g., “bot”, “benchmark”) and values represent their corresponding values (e.g., “linux-perf”, “Speedometer2”). This map, along with the number of traces in the graph, is provided via the set() method.
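A minimal usage sketch of the set() API described above; the element lookup and parameter values are illustrative.

import './graph-title-sk';

const title = document.querySelector('graph-title-sk')! as HTMLElement & {
  set: (entries: Map<string, string>, numTraces: number) => void;
};

title.set(
  new Map([
    ['bot', 'linux-perf'],
    ['benchmark', 'Speedometer2'],
  ]),
  3 // number of traces in the graph
);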
Dynamic Rendering: Based on the provided data, the element dynamically generates HTML to display the title. It iterates through the key-value pairs and renders them in a columnar layout. Each pair is displayed with the key (parameter name) in a smaller font above its corresponding value.
Handling Empty or Generic Titles:
If the titleEntries map is empty but numTraces is greater than zero, it displays a generic title like "Multi-trace Graph (X traces)" to indicate a graph with multiple data series without specific shared parameters.
Space Management and Truncation:
The title entries are arranged in a flexible, wrapping layout (`display: flex; flex-wrap: wrap;`) using CSS (`graph-title-sk.scss`). This allows the title to adapt to different screen widths.
If the number of title entries exceeds a maximum (MAX_PARAMS, currently 8), it initially displays only the first MAX_PARAMS entries. A "Show Full Title" button (<md-text-button class="showMore">) is then provided, allowing the user to expand the view and see all title entries. Conversely, a "Show Short Title" mechanism is implied (though not explicitly shown as a button in the current code, a showShortTitles() method exists) to revert to the truncated view.
The full value of each entry is also available via the title attribute of the div containing the value.
Custom element (ElementSk): The component is built as a custom element extending ElementSk. This aligns with the Skia infrastructure's approach to building reusable UI components and allows for easy integration into Skia applications.
Lit templating: It uses the lit library's html template literal tag. This provides a declarative and efficient way to define the component's view and update it when data changes. The _render() method, inherited from ElementSk, is called to trigger re-rendering when the internal state (_titleEntries, numTraces, showShortTitle) changes.
Separate styling: Styles live in a dedicated SASS file (graph-title-sk.scss). This separates presentation concerns from the component's logic. CSS variables (e.g., var(--primary)) are used for theming, allowing the component's appearance to be consistent with the overall application theme.
set() Method for Data Input: Instead of relying solely on HTML attributes for complex data like a map, a public set() method is provided. This is a common pattern for custom elements when dealing with non-string data or when updates need to trigger specific internal logic beyond simple attribute reflection.
The decision to truncate long titles (MAX_PARAMS) and provide a "Show Full Title" option is a user experience choice. It prioritizes a clean initial view for complex graphs while still allowing users to access all details if needed.
1. Initial Rendering with Data:
User/Application Code GraphTitleSk Element
--------------------- --------------------
calls set(titleData, numTraces) -->
stores titleData & numTraces
calls _render()
|
V
getTitleHtml() is invoked
|
V
Iterates titleData:
- Skips empty keys/values
- If entries > MAX_PARAMS & showShortTitle is true:
- Renders first MAX_PARAMS entries
- Renders "Show Full Title" button
- Else:
- Renders all entries
|
V
HTML template is updated with generated content
Browser renders the title
2. Toggling Full/Short Title Display (when applicable):
User Interaction GraphTitleSk Element
---------------- --------------------
Clicks "Show Full Title" button -->
onClick handler (showFullTitle) executes
|
V
this.showShortTitle = false
calls _render()
|
V
getTitleHtml() is invoked
|
V
Now renders ALL title entries because showShortTitle is false
|
V
HTML template is updated
Browser re-renders the title to show all entries
A similar flow occurs if a mechanism to call showShortTitles() is implemented and triggered.
The demo page (graph-title-sk-demo.html and graph-title-sk-demo.ts) showcases various states of the graph-title-sk element, including:
A case where numTraces is 0 and the map is also empty, which results in no title being displayed.
Overview:
The ingest-file-links-sk module provides a custom HTML element, <ingest-file-links-sk>, designed to display a list of relevant links associated with a specific data point in the Perf performance monitoring system. These links are retrieved from the ingest.Format data structure, which can be generated by various ingestion processes. The primary purpose is to offer users quick access to related resources, such as Swarming task runs, Perfetto traces, or bot information, directly from the Perf UI.
Why:
Performance analysis often requires context beyond the raw data. Understanding the environment in which a test ran (e.g., specific bot configuration), or having direct access to detailed trace files, can be crucial for debugging performance regressions or understanding improvements. This module centralizes these relevant links in a consistent and easily accessible manner, improving the efficiency of performance investigations.
How:
The <ingest-file-links-sk> element fetches link data asynchronously. When its load() method is called with a CommitNumber (representing a specific point in time or version) and a traceID (identifying the specific data series), it makes a POST request to the /_/details/?results=false endpoint. This endpoint is expected to return a JSON object conforming to the ingest.Format structure.
The element then parses this JSON response. It specifically looks for the links field within the ingest.Format. If links exist and the version field in the ingest.Format is present (indicating a modern format), the element dynamically renders a list of these links.
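A hedged sketch of driving the element as described above; the load() signature matches the documented API, while the element lookup, commit number, and trace ID values are illustrative.

import './ingest-file-links-sk';
import { CommitNumber } from '../json';

const links = document.querySelector('ingest-file-links-sk')! as HTMLElement & {
  load: (cid: CommitNumber, traceid: string) => void;
};

// Typically invoked when the user selects a point on a chart.
links.load(CommitNumber(64809), ',benchmark=Speedometer,bot=linux-perf,test=Run1,');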
Key design considerations and implementation details:
A spinner-sk element is displayed while data is being loaded.
The element iterates over the links object. If a value is a valid URL, it's rendered as an <a> tag. Otherwise, it's displayed as "Key: Value".
Markdown-style links ([Link Text](url)) are converted into standard HTML anchor tags. This allows ingestion processes to provide links in a more human-readable format if desired.
The element checks the version field in the response. If it's missing, it assumes a legacy data format that doesn't support these links and gracefully avoids displaying anything.
Responsibilities and Key Components:
- ingest-file-links-sk.ts: This is the core file defining the IngestFileLinksSk custom element.
  - The load(cid: CommitNumber, traceid: string) method is the public API for triggering the data fetching and rendering process.
  - The displayLinks static method is responsible for generating the TemplateResult array for rendering the list items.
  - The isUrl and removeMarkdown helper functions provide utility for link processing.
- ingest-file-links-sk.scss: This file contains the SASS styles for the custom element, defining its appearance, including list styling and spinner positioning.
- ingest-file-links-sk-demo.html and ingest-file-links-sk-demo.ts: These files provide a demonstration page for the element. The demo page uses fetch-mock to simulate the backend API response, allowing developers to see the element in action and test its functionality in isolation.
- ingest-file-links-sk_test.ts: This file contains unit tests for the IngestFileLinksSk element. It uses fetch-mock to simulate various API responses and asserts the element's behavior, such as correct link rendering, spinner state, and error handling.
- ingest-file-links-sk_puppeteer_test.ts: This file contains Puppeteer-based end-to-end tests. These tests load the demo page in a headless browser and verify the element's visual rendering and basic functionality.

Key Workflow: Loading and Displaying Links
User Action/Page Load -> Calls ingest-file-links-sk.load(commit, traceID)
|
V
ingest-file-links-sk: Show spinner-sk
|
V
Make POST request to /_/details/?results=false
(with commit and traceID in request body)
|
V
Backend API: Processes request, retrieves links for the
given commit and trace
|
V
ingest-file-links-sk: Receives JSON response (ingest.Format)
|
+----------------------+
| |
V V
Response OK? Response Error?
| |
V V
Parse links Display error message
Hide spinner Hide spinner
Render link list
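A minimal usage sketch (import paths are illustrative; load() and the endpoint are as described above):

import './index'; // registers <ingest-file-links-sk>
import { CommitNumber } from '../json';
import { IngestFileLinksSk } from './ingest-file-links-sk';

const links = document.createElement('ingest-file-links-sk') as IngestFileLinksSk;
document.body.appendChild(links);

// Fetches /_/details/?results=false for this point and renders the links
// found in the returned ingest.Format (only if the format carries a version).
links.load(CommitNumber(64809), ',arch=x86,config=8888,');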
This module defines TypeScript interfaces and types that represent the structure of JSON data used throughout the Perf application. It essentially acts as a contract between the Go backend and the TypeScript frontend, ensuring data consistency and type safety.
Why:
The primary motivation for this module is to leverage TypeScript's strong typing capabilities. By defining these interfaces, we can catch potential data inconsistencies and errors at compile time rather than runtime. This is particularly crucial for a data-intensive application like Perf, where the frontend relies heavily on JSON responses from the backend.
Furthermore, these definitions are automatically generated from Go struct definitions. This ensures that the frontend and backend data models remain synchronized. Any changes to the Go structs will trigger an update to these TypeScript interfaces, reducing the likelihood of manual errors and inconsistencies.
How:
The index.ts file contains all the interface and type definitions. These are organized into a flat structure for simplicity, with some nested namespaces (e.g., pivot, progress, ingest) where logical grouping is beneficial.
A key design choice is the use of nominal typing for certain primitive types (e.g., CommitNumber, TimestampSeconds, Trace). This is achieved by creating type aliases that are branded with a unique string literal type. For example:
export type CommitNumber = number & {
  _commitNumberBrand: 'type alias for number';
};

export function CommitNumber(v: number): CommitNumber {
  return v as CommitNumber;
}
This prevents accidental assignment of a generic number to a CommitNumber variable, even though they are structurally identical at runtime. This adds an extra layer of type safety, ensuring that, for example, a timestamp is not inadvertently used where a commit number is expected. Helper functions (e.g., CommitNumber(v: number)) are provided for convenient type assertion.
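For example, a short sketch of how the brand blocks accidental mixing (assuming a TimestampSeconds() helper analogous to CommitNumber()):

import { CommitNumber, TimestampSeconds } from './index';

const ts = TimestampSeconds(1678886400);
let commit = CommitNumber(12345);

// commit = ts;            // compile error: TimestampSeconds lacks _commitNumberBrand
commit = CommitNumber(ts); // an explicit assertion is required to convert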
Key Components/Files/Submodules:
- index.ts: This is the sole file in this module and contains all the TypeScript interface and type definitions. It serves as the single source of truth for JSON data structures used in the frontend.
- Complex object interfaces (Alert, DataFrame, FrameRequest, Regression): These define the shape of complex JSON objects. For instance, the Alert interface describes the structure of an alert configuration, including its query, owner, and various detection parameters. The DataFrame interface represents the core data structure for displaying traces, including the actual trace data (traceset), column headers (header), and associated parameter sets (paramset).
- String literal union types (ClusterAlgo, StepDetection, Status): These define specific allowed string values for certain properties, acting like enums. For example, ClusterAlgo can only be 'kmeans' or 'stepfit', ensuring that only valid clustering algorithms are specified.
- Nominally typed primitives (CommitNumber, TimestampSeconds, Trace, ParamSet): As explained above, these provide stronger type checking for primitive types that have specific semantic meaning within the application. TraceSet, for example, is a map where keys are trace identifiers (strings) and values are Trace arrays (nominally typed number[]).
- Namespaced interfaces (pivot.Request, ingest.Format): Some interfaces are grouped under namespaces to organize related data structures. For example, pivot.Request defines the structure for requesting pivot table operations, including grouping criteria and aggregation operations. The ingest.Format interface defines the structure of data being ingested into Perf, including metadata like Git hash and the actual performance results.
- Common data patterns (ReadOnlyParamSet, AnomalyMap): ReadOnlyParamSet is a map of parameter names to arrays of their possible string values, marked as read-only to reflect its typical usage. AnomalyMap is a nested map structure used to associate anomalies with specific commits and traces.

Workflow Example: Requesting and Displaying Trace Data
A common workflow involves the frontend requesting trace data from the backend and then displaying it.
Frontend (Client) prepares a FrameRequest:
Client Code --> Creates `FrameRequest` object:
{
begin: 1678886400, // Start timestamp
end: 1678972800, // End timestamp
queries: ["config=gpu&name=my_test_trace"],
// ... other properties
}
Frontend sends the FrameRequest to the Backend (Server).
Backend processes the request and generates a FrameResponse:
Server Logic --> Processes `FrameRequest`
--> Fetches data from database/cache
--> Constructs `FrameResponse` object:
{
dataframe: {
traceset: { "config=gpu&name=my_test_trace": [10.1, 10.5, 9.8, ...Trace] },
header: [ { offset: 12345, timestamp: 1678886400 }, ...ColumnHeader[] ],
paramset: { "config": ["gpu", "cpu"], "name": ["my_test_trace"] }
},
skps: [0, 5, 10], // Indices of significant points
// ... other properties like msg, display_mode, anomalymap
}
Backend sends the FrameResponse (as JSON) back to the Frontend.
Frontend receives the JSON and parses it, expecting it to conform to the FrameResponse interface:
Client Code --> Receives JSON
            --> Parses JSON into a `FrameResponse` typed object
            --> Uses `frameResponse.dataframe.traceset` to render charts
            --> Uses `frameResponse.dataframe.header` to display commit information
This typed interaction ensures that if the backend, for example, renamed traceset to trace_data in its Go struct, the automatic generation would update the DataFrame interface. The TypeScript compiler would then flag an error in the frontend code trying to access frameResponse.dataframe.traceset, preventing a runtime error and guiding the developer to update the frontend code accordingly.
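A hedged sketch of that round trip from the client side (the endpoint path and import path are placeholders; the point is only the FrameRequest/FrameResponse typing):

import { FrameRequest, FrameResponse } from '../json';

// '/_/frame/' is a hypothetical endpoint path; the real API surface is in /API.md.
async function fetchFrame(begin: number, end: number, query: string): Promise<FrameResponse> {
  const req: Partial<FrameRequest> = { begin, end, queries: [query] };
  const resp = await fetch('/_/frame/', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(req),
  });
  if (!resp.ok) throw new Error(`frame request failed: ${resp.status}`);
  return (await resp.json()) as FrameResponse;
}

// The returned object is typed: the compiler flags any access to a field
// that no longer exists in the generated FrameResponse interface.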
The json-source-sk module provides a custom HTML element, <json-source-sk>, designed to display the raw JSON data associated with a specific data point in a trace. This is particularly useful in performance analysis and debugging scenarios where understanding the exact input data ingested by the system is crucial.
The core responsibility of this module is to fetch and present JSON data in a user-friendly dialog. It aims to simplify the process of inspecting the source data for a given commit and trace identifier.
The key component is the JSONSourceSk class, defined in json-source-sk.ts. This class extends ElementSk, a base class for custom elements in the Skia infrastructure.
How it Works:
Initialization and Properties:
- cid: The Commit ID (represented as CommitNumber), which identifies a specific version or point in time.
- traceid: A string identifier for the specific trace being examined.
- If traceid is not a valid key (checked by validKey from perf/modules/paramtools), the control buttons are hidden.

User Interaction and Data Fetching:
- The two view buttons invoke the _loadSource or _loadSourceSmall methods, respectively.
- Both delegate to _loadSourceImpl. This implementation detail allows for sharing the core fetching logic while differentiating the request URL.
- _loadSourceImpl constructs a CommitDetailsRequest object containing the cid and traceid.
- The request is sent as a POST to the /_/details/ endpoint.
- For the short version (when isSmall is true), the URL includes ?results=false, indicating to the backend that a potentially truncated or summarized version of the JSON is requested.
- A spinner-sk element is activated to provide visual feedback during the fetch operation.
- The response is parsed with jsonOrThrow. If the request is successful, the JSON data is formatted with indentation and stored in the _json private property.
- On failure, errorMessage (from perf/modules/errorMessage) is used to display an error notification to the user.

Displaying the JSON:
- The JSON is shown inside a native <dialog> element (#json-dialog).
- The jsonFile() method in the template is responsible for rendering the <pre> tag containing the formatted JSON string, but only if _json is not empty.
- The dialog is opened with showModal(), providing a modal interface for viewing the JSON.
- A close button (#closeIcon with a close-icon-sk) allows the user to dismiss the dialog. Closing the dialog also clears the _json property.

Design Rationale:
- Using async/await and fetch allows for non-blocking data retrieval, ensuring the UI remains responsive while waiting for the server.
- Error handling via jsonOrThrow and errorMessage provides a better user experience by informing users about issues during data retrieval.
- The spinner-sk element clearly indicates when data is being loaded.
- A modal dialog (<dialog>) for displaying the JSON helps focus the user's attention on the data without cluttering the main interface.
- The stylesheet (json-source-sk.scss) provides basic styling and leverages existing button styles (//elements-sk/modules/styles:buttons_sass_lib). It also includes considerations for dark mode by using CSS variables like --on-background and --background.

Workflow Example: Viewing JSON Source
User Sets Properties Element Renders User Clicks Button Fetches Data Displays JSON
-------------------- --------------- ------------------ ------------ -------------
[json-source-sk -> [Buttons visible] -> ["View Json File"] -> POST /_/details/ -> <dialog>
.cid = 123 {cid, traceid} <pre>{json}</pre>
.traceid = ",foo=bar,"] </dialog>
(spinner active)
|
V
Response Received
(spinner inactive)
The demo page (json-source-sk-demo.html and json-source-sk-demo.ts) illustrates how to use the <json-source-sk> element. It sets up mock data using fetchMock to simulate the backend endpoint and programmatically clicks the button to demonstrate the JSON loading functionality.
The Puppeteer test (json-source-sk_puppeteer_test.ts) ensures the element renders correctly and performs basic visual regression testing.
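A minimal usage sketch (import paths are illustrative; the properties and endpoint are as described above):

import './index'; // registers <json-source-sk>
import { CommitNumber } from '../json';
import { JSONSourceSk } from './json-source-sk';

const source = document.createElement('json-source-sk') as JSONSourceSk;
source.cid = CommitNumber(123);  // the commit to inspect
source.traceid = ',foo=bar,';    // the trace whose source file we want to see
document.body.appendChild(source);

// Clicking "View Json File" now POSTs {cid, traceid} to /_/details/ and
// shows the formatted JSON in the element's modal dialog.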
The new-bug-dialog-sk module provides a user interface element for filing new bugs related to performance anomalies. It aims to streamline the bug reporting process by pre-filling relevant information and integrating with the Buganizer issue tracker.
Core Functionality:
The primary responsibility of this module is to display a dialog that allows users to input details for a new bug. This dialog is populated with information derived from one or more selected Anomaly objects. The user can then review and modify this information before submitting the bug.
Key Design Decisions and Implementation Choices:
- The suggested bug title, generated by getBugTitle(), mimics the behavior of the legacy Chromeperf UI to maintain familiarity for users.
- Label checkboxes and component radio buttons are produced by getLabelCheckboxes() and getComponentRadios(), driven by the provided Anomaly data. This ensures that only relevant options are presented to the user. Lit-html's templating capabilities are used for this dynamic rendering.
- The logged-in user is determined via /alogin-sk.
- The bug is filed with a POST to the /_/triage/file_bug endpoint.
- A spinner (spinner-sk) is displayed during this operation to provide visual feedback.
- On success, an anomaly-changed event is dispatched to notify other components (like explore-simple-sk or chart-tooltip-sk) that the anomalies have been updated with the new bug ID.
- On failure, an error is shown via error-toast-sk, and the dialog remains open, allowing the user to retry or correct information.
- The dialog uses the native <dialog> HTML element, which provides built-in accessibility and modal behavior.

Workflow: Filing a New Bug
1. An external component calls the setAnomalies() method on new-bug-dialog-sk, passing the relevant Anomaly objects and associated trace names, and then opens the dialog via the open() method.

   User Action (e.g., click "File Bug" button)
     |
     V
   External Component --[setAnomalies(anomalies, traceNames)]--> new-bug-dialog-sk
     |
     V
   External Component --[open()]--> new-bug-dialog-sk

2. new-bug-dialog-sk fetches the current user's login status to pre-fill the CC field.
   - The _render() method is called, which uses the Lit-html template.
   - getBugTitle() generates a suggested title.
   - getLabelCheckboxes() and getComponentRadios() create the UI for selecting labels and components based on the input anomalies.
   - The dialog (<dialog id="new-bug-dialog">) is displayed modally.

   new-bug-dialog-sk.open()
     |
     V
   [Fetch Login Status] --> Updates `_user`
     |
     V
   _render()
     |--> getBugTitle() --> Populates Title Input
     |--> getLabelCheckboxes() --> Creates Label Checkboxes
     |--> getComponentRadios() --> Creates Component Radios
     |
     V
   Dialog is displayed to the user

3. The user clicks "Submit", triggering the form submit event, which invokes new-bug-dialog-sk.fileNewBug().
   - The spinner is activated, and form buttons are disabled.
   - Form data (title, description, selected labels, selected component, assignee, CCs, anomaly keys, trace names) is collected.
   - A POST request is sent to /_/triage/file_bug with the collected data.

   fileNewBug()
     |
     V
   [Activate Spinner, Disable Buttons]
     |
     V
   [Extract Form Data]
     |
     V
   fetch('/_/triage/file_bug', {POST, body: jsonData})

4. Handling the response:
   - Success: The server responds with a bug_id. The spinner is deactivated, and buttons are re-enabled. The dialog is closed, and a new browser tab is opened to the URL of the created bug (e.g., https://issues.chromium.org/issues/BUG_ID). The bug_id is updated in the local _anomalies array, and an anomaly-changed custom event is dispatched with the updated anomalies and bug ID.
   - Failure: The server responds with an error. The spinner is deactivated, buttons are re-enabled, and an error message is displayed to the user via errorMessage(). The dialog remains open.

   fetch Response
     |
     +-- Success (HTTP 200, valid JSON with bug_id)
     |     |
     |     V
     |   [Deactivate Spinner, Enable Buttons]
     |     |
     |     V
     |   closeDialog()
     |     |
     |     V
     |   window.open(bugUrl, '_blank')
     |     |
     |     V
     |   Update local _anomalies with bug_id
     |     |
     |     V
     |   dispatchEvent('anomaly-changed', {anomalies, bugId})
     |
     +-- Failure (HTTP error or invalid JSON)
           |
           V
         [Deactivate Spinner, Enable Buttons]
           |
           V
         errorMessage(errorMsg) --> Displays error toast

Key Files:
- new-bug-dialog-sk.ts: This is the core file containing the NewBugDialogSk class definition, which extends ElementSk. It includes the Lit-html template for the dialog, the logic for populating form fields based on Anomaly data, handling form submission, interacting with the backend API to file the bug, and managing the dialog's visibility and state.
- new-bug-dialog-sk.scss: This file defines the styles for the dialog, ensuring it integrates visually with the rest of the application and themes. It styles the dialog container, input fields, buttons, and the close icon.
- new-bug-dialog-sk-demo.ts and new-bug-dialog-sk-demo.html: These files provide a demonstration page for the new-bug-dialog-sk element. The .ts file sets up mock data (Anomaly objects) and mock fetch responses to simulate the bug filing process, allowing for isolated testing and development of the dialog. The .html file includes the new-bug-dialog-sk element and a button to trigger its opening.
- index.ts: This file simply imports new-bug-dialog-sk.ts to ensure the custom element is defined and available for use.

The module relies on several other elements and libraries:
- alogin-sk: To determine the logged-in user for CC'ing.
- close-icon-sk: For the dialog's close button.
- spinner-sk: To indicate activity during bug filing.
- error-toast-sk (via the errorMessage utility): To display error messages.
- lit: For templating and component rendering.
- jsonOrThrow: A utility for parsing JSON responses and throwing errors on failure.

The paramtools module provides a TypeScript implementation of utility functions for manipulating parameter sets and structured keys. It mirrors the functionality found in the Go module /infra/go/paramtools, which is the primary source of truth for these operations. The decision to replicate this logic in TypeScript is to enable client-side applications to perform these common tasks without needing to make server requests for simple transformations or validations. This approach improves performance and reduces server load for UI-driven interactions.
The core responsibility of this module is to provide robust and consistent ways to:
- Build and parse structured trace keys.
- Manipulate ParamSet objects: ParamSets are used to represent collections of possible parameter values, often used for filtering or querying data.

Key functionalities and their "why" and "how":
makeKey(params: Params | { [key: string]: string }): string:
How: It takes a Params object (a dictionary of string key-value pairs) and produces the canonical structured key. It first checks if the params object is empty, throwing an error if it is, as a key must represent at least one parameter. Then, it sorts the keys of the params object alphabetically. Finally, it constructs the string by joining each key-value pair with = and then joining these pairs with ,, prefixing and suffixing the entire string with a comma.

Input: { "b": "2", "a": "1", "c": "3" }
  |
  V
Sort keys: [ "a", "b", "c" ]
  |
  V
Format pairs: "a=1", "b=2", "c=3"
  |
  V
Join and wrap: ",a=1,b=2,c=3,"

fromKey(structuredKey: string, attribute?: string): Params:
Why: It parses a structured key back into a Params object, making it easier to work with the individual parameters programmatically. It also handles the removal of special functions that might be embedded in the key (e.g., norm(...) for normalization).
How: It first calls removeSpecialFunctions to strip any function wrappers from the key. Then, it splits the key string by the comma delimiter. Each resulting segment (if not empty) is then split by the equals sign to separate the key and value. These key-value pairs are collected into a new Params object. An optional attribute parameter allows excluding a specific key from the resulting Params object, which can be useful in scenarios where certain attributes are metadata and not part of the core parameters.

removeSpecialFunctions(key: string): string:
Why: Keys can be wrapped in functions (e.g., norm(...), avg(...)) or contain special markers (e.g., special_zero). This function is designed to strip these away, returning the "raw" underlying key. This is important when you need to work with the base parameters without the context of the applied function or special condition.
How: It matches a pattern of the form function_name(,param1=value1,...). If a match is found, it extracts the content within the parentheses. The extracted string (or the original key if no function was found) is then processed by extractNonKeyValuePairsInKey.

extractNonKeyValuePairsInKey(key: string): string: This helper function further refines the key string. It splits the string by commas and filters out any segments that do not represent a valid key=value pair. This helps to remove extraneous parts like special_zero that might be comma-separated but aren't true parameters. The valid pairs are then re-joined and wrapped with commas.

validKey(key: string): boolean:
Why/How: It provides a quick client-side check that a key refers to a plain trace rather than a calculated trace (e.g., avg(...)) or other special trace types. This is a lightweight validation, as the server performs more comprehensive checks.

addParamsToParamSet(ps: ParamSet, p: Params): void:
Why: It adds the key-value pairs of a single parameter combination (a Params object) to an existing ParamSet. ParamSets store unique values for each parameter key. This function ensures that when new parameters are added, only new values are appended to the existing lists for each key, maintaining uniqueness.
How: It iterates over the keys of the Params object (p). For each key, it retrieves the corresponding array of values from the ParamSet (ps). If the key doesn't exist in ps, a new array is created. If the value from p is not already present in the array, it's added.

paramsToParamSet(p: Params): ParamSet:
Why: It converts a single Params object (representing one specific combination of parameters) into a ParamSet. In a ParamSet, each key maps to an array of values, even if there's only one value.
How: It creates an empty ParamSet. Then, for each key-value pair in the input Params object, it creates a new entry in the ParamSet where the key maps to an array containing just that single value.

addParamSet(p: ParamSet, ps: ParamSet | ReadOnlyParamSet): void:
Why: It merges one ParamSet (or ReadOnlyParamSet) into another. This is useful for combining sets of available parameter options, for example, when aggregating data from multiple sources.
How: It iterates over the keys of the source ParamSet (ps). If a key from ps is not present in the target ParamSet (p), the entire key and its value array (cloned) are added to p. If the key already exists in p, it iterates through the values in the source array and adds any values that are not already present in the target array for that key.

toReadOnlyParamSet(ps: ParamSet): ReadOnlyParamSet:
Why/How: It converts a mutable ParamSet to an immutable ReadOnlyParamSet. This is useful for signaling that a ParamSet should not be modified further, typically when passing it to components or functions that expect read-only data.

queryFromKey(key: string): string:
Why: It converts a structured key into a URL query string (e.g., a=1&b=2&c=3). This is specifically useful for frontend applications, like explore-simple-sk, where state or filters are often represented in the URL.
How: It first calls fromKey to parse the structured key into a Params object. Then, it leverages the URLSearchParams browser API to construct a query string from these parameters. This ensures proper URL encoding of keys and values.

Input Key: ",a=1,b=2,c=3,"
  |
  V
fromKey -> Params: { "a": "1", "b": "2", "c": "3" }
  |
  V
URLSearchParams -> Query String: "a=1&b=2&c=3"

The design choice to have these functions operate with less stringent validation than their server-side Go counterparts is deliberate. The server remains the ultimate authority on data validity. These client-side functions prioritize ease of use and performance for UI interactions, assuming that the data they operate on has either originated from or will eventually be validated by the server.
The index_test.ts file provides comprehensive unit tests for these functions, ensuring their correctness and robustness across various scenarios, including handling empty inputs, duplicate values, and special key formats. This focus on testing is crucial for maintaining the reliability of these foundational utility functions.
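A brief usage sketch of these helpers (the import path is assumed to be the module's index.ts):

import { makeKey, fromKey, queryFromKey, paramsToParamSet, addParamSet } from './index';

const key = makeKey({ b: '2', a: '1', c: '3' }); // ",a=1,b=2,c=3,"
const params = fromKey(key);                     // { a: '1', b: '2', c: '3' }
const query = queryFromKey(key);                 // "a=1&b=2&c=3"

// Merge two single-point ParamSets into one combined set of options.
const ps = paramsToParamSet({ config: 'gpu', arch: 'x86' });
addParamSet(ps, paramsToParamSet({ config: '8888', arch: 'x86' }));
// ps is now { config: ['gpu', '8888'], arch: ['x86'] }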
The perf-scaffold-sk module provides a consistent layout and navigation structure for all pages within the Perf application. It acts as a wrapper, ensuring that common elements like the title bar, navigation sidebar, and error notifications are present and behave uniformly across different sections of Perf.
Core Responsibilities:
- It provides the common header elements: login status (alogin-sk), theme chooser (theme-chooser-sk), and error/toast notifications (error-toast-sk).
- It redistributes page content into the main content area and allows for specific content (like help text) to be injected into the sidebar.

Key Components and Design Decisions:
perf-scaffold-sk.ts: This is the heart of the module, defining the PerfScaffoldSk custom element.
Why: Encapsulating the scaffold logic within a custom element promotes reusability and modularity. It allows any Perf page to adopt the standard layout simply by including this element.
How: It uses Lit for templating and rendering the structure (<app-sk>, header, aside#sidebar, main, footer).
Content Redistribution: A crucial design choice is how it handles child elements. Since it doesn't use Shadow DOM for the main content area (to allow global styles to apply easily to the page content), it programmatically moves children of <perf-scaffold-sk> into the <main> section.
Process:
- When connectedCallback is invoked, existing children of <perf-scaffold-sk> are temporarily moved out.
- Once the scaffold's own template has rendered, those children are appended into the <main> element.
- A MutationObserver is set up to watch for any new children added to <perf-scaffold-sk> and similarly move them to <main> (see the sketch after the example below).

Sidebar Content: An exception is made for elements with the specific ID SIDEBAR_HELP_ID. These are moved into the #help div within the sidebar. This allows pages to provide context-specific help information directly within the scaffold.
<perf-scaffold-sk>
<!-- This will go into <main> -->
<div>Page specific content</div>
<!-- This will go into <aside>#help -->
<div id="sidebar_help">Contextual help</div>
</perf-scaffold-sk>
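Below is a minimal, self-contained sketch of this redistribution pattern (a simplified stand-in element, not the actual PerfScaffoldSk code; it omits the Lit template and re-rendering):

const SIDEBAR_HELP_ID = 'sidebar_help';

// Moves its light-DOM children into <main>, except the sidebar help div,
// which goes into the sidebar's #help container.
class ScaffoldLikeElement extends HTMLElement {
  private main!: HTMLElement;
  private help!: HTMLElement;

  connectedCallback(): void {
    // Stash the page's children before rendering the scaffold's own markup.
    const stashed = Array.from(this.children);
    this.innerHTML = '<main></main><aside><div id="help"></div></aside>';
    this.main = this.querySelector('main')!;
    this.help = this.querySelector('#help')!;
    stashed.forEach((child) => this.redistribute(child));

    // Keep redistributing anything the page adds later.
    const observer = new MutationObserver((mutations) => {
      mutations.forEach((m) => m.addedNodes.forEach((n) => this.redistribute(n)));
    });
    observer.observe(this, { childList: true });
  }

  private redistribute(node: Node): void {
    if (!(node instanceof Element) || node === this.main || node.tagName === 'ASIDE') {
      return; // ignore text nodes and the scaffold's own containers
    }
    if (node.id === SIDEBAR_HELP_ID) {
      this.help.appendChild(node);
    } else {
      this.main.appendChild(node);
    }
  }
}

customElements.define('scaffold-like-element', ScaffoldLikeElement);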
Configuration via window.perf: The scaffold reads various configuration options from the global window.perf object. This allows instances of Perf to customize links (help, feedback, chat), behavior (e.g., show_triage_link), and display information (e.g., instance URL, build tag). This makes the scaffold adaptable to different Perf deployments.
For example, the _helpUrl and _reportBugUrl are initialized with defaults but can be overridden by window.perf.help_url_override and window.perf.feedback_url respectively.
The visibility of the “Triage” link is controlled by window.perf.show_triage_link.
Build Information: It displays the current application build tag, fetching it via getBuildTag() from //perf/modules/window:window_ts_lib and linking it to the corresponding commit in the buildbot git repository.
Instance Title: It can display the name of the Perf instance, extracted from window.perf.instance_url.
perf-scaffold-sk.scss: Defines the styles for the scaffold.
- It imports the shared themes from //perf/modules/themes:themes_sass_lib. It defines the layout, including the sidebar width and the main content area's width (using calc(99vw - var(--sidebar-width)) to avoid horizontal scrollbars caused by 100vw including the scrollbar width). It also styles the navigation links and other elements within the scaffold.

perf-scaffold-sk-demo.html & perf-scaffold-sk-demo.ts: Provide a demonstration page for the scaffold.
perf-scaffold-sk-demo.ts initializes a mock window.perf object with various settings and then injects an instance of <perf-scaffold-sk> with some placeholder content (including a div with id="sidebar_help") into the perf-scaffold-sk-demo.html page.

Workflow: Initializing and Rendering a Page with the Scaffold
1. A Perf page (e.g., the New Query page) includes <perf-scaffold-sk> as its top-level layout element:

<!-- new_query_page.html -->
<body>
  <perf-scaffold-sk>
    <!-- Content specific to the New Query page -->
    <query-composer-sk></query-composer-sk>
    <div id="sidebar_help">
      <p>Tips for creating new queries...</p>
    </div>
  </perf-scaffold-sk>
</body>

2. The PerfScaffoldSk element's connectedCallback fires. perf-scaffold-sk.ts then:
- Temporarily moves <query-composer-sk> and <div id="sidebar_help">...</div> out of perf-scaffold-sk.
- Renders its own internal template (header with title, login, theme chooser; sidebar with nav links; empty main area; footer with error toast).
- Moves <query-composer-sk> into the <main> element and <div id="sidebar_help"> into the <div id="help"> element within the sidebar.
- A MutationObserver starts listening for any further children added directly to <perf-scaffold-sk>.
The final rendered structure (simplified) would look something like:
perf-scaffold-sk
└── app-sk
├── header
│ ├── h1.name (Instance Title)
│ ├── div.spacer
│ ├── alogin-sk
│ └── theme-chooser-sk
├── aside#sidebar
│ ├── div#links
│ │ ├── a (New Query)
│ │ ├── a (Favorites)
│ │ └── ... (other nav links)
│ ├── div#help
│ │ └── div#sidebar_help (Content from original page)
│ │ └── <p>Tips for creating new queries...</p>
│ └── div#chat
├── main
│ └── query-composer-sk (Content from original page)
└── footer
└── error-toast-sk
The picker-field-sk module provides a custom HTML element that serves as a stylized text input field with an associated dropdown menu for selecting from a predefined list of options. This component is designed to offer a user-friendly way to pick a single value from potentially many choices, enhancing the user experience in forms or selection-heavy interfaces.
Core Functionality and Design:
The primary goal of picker-field-sk is to present a familiar text input that, upon interaction (focus or click), reveals a filterable list of valid options. This addresses the need for a compact and efficient way to select an item, especially when the number of options is large.
The implementation leverages the Vaadin ComboBox component (@vaadin/combo-box) for its underlying dropdown and filtering capabilities. This choice was made to utilize a well-tested and feature-rich component, avoiding the need to reimplement complex dropdown logic, keyboard navigation, and accessibility features. picker-field-sk then wraps this Vaadin component, applying custom styling and providing a simplified API tailored to its specific use case.
Key Responsibilities and Components:
picker-field-sk.ts: This is the heart of the module, defining the PickerFieldSk custom element which extends ElementSk.
Properties:
- label: A string that serves as both the visual label above the input field and the placeholder text within it when empty. This provides context to the user about the expected input.
- options: An array of strings representing the valid choices the user can select from. The component dynamically adjusts the width of the dropdown overlay to accommodate the longest option, ensuring readability.
- helperText: An optional string displayed below the input field, typically used for providing additional guidance or information to the user.

Events:
- value-changed: This custom event is dispatched whenever the selected value in the combo box changes. This includes selecting an item from the dropdown, typing a value that matches an option (due to autoselect), or clearing the input. The new value is available in event.detail.value. This event is crucial for parent components to react to user selections.

Methods:
- focus(): Programmatically sets focus to the input field.
- openOverlay(): Programmatically opens the dropdown list of options. This is useful for guiding the user or for integrating with other UI elements.
- disable(): Makes the input field read-only, preventing user interaction.
- enable(): Removes the read-only state, allowing user interaction.
- clear(): Clears the current value in the input field.
- setValue(val: string): Programmatically sets the value of the input field.
- getValue(): Retrieves the current value of the input field.

Rendering:
- The element uses lit-html for templating. The template renders a <vaadin-combo-box> element and binds its properties and events to the PickerFieldSk element's state.
- The calculateOverlayWidth() private method dynamically adjusts the --vaadin-combo-box-overlay-width CSS custom property. It iterates through the options to find the longest string and sets the overlay width to be slightly larger than this string, ensuring all options are fully visible without truncation. This is a key usability enhancement.

User provides options --> PickerFieldSk.options setter
  |
  V
calculateOverlayWidth()
  |
  V
Find max option length
  |
  V
Set --vaadin-combo-box-overlay-width CSS property

picker-field-sk.scss: Contains the SASS styles for the component.
- It targets vaadin-combo-box and its shadow parts (e.g., ::part(label), ::part(input-field), ::part(items)) to customize its appearance to match the application's theme (including dark mode support).
- CSS custom properties such as --vaadin-field-default-width, --vaadin-combo-box-overlay-width, and --lumo-text-field-size are used to control the dimensions and sizing of the Vaadin component.
- Dark-mode rules are scoped under .darkmode picker-field-sk, adjusting colors for labels, helper text, and input fields to ensure proper contrast and visual integration.

index.ts: A simple entry point that imports and thereby registers the picker-field-sk custom element, making it available for use in HTML.
picker-field-sk-demo.html & picker-field-sk-demo.ts: These files create a demonstration page for the picker-field-sk component.
- picker-field-sk-demo.html includes instances of the picker-field-sk element and buttons to trigger its various functionalities (focus, fill, open overlay, disable/enable).
- picker-field-sk-demo.ts contains JavaScript to initialize the demo elements with sample data (a large list of "speedometer" options to showcase performance with many items) and to wire up the buttons to the corresponding methods of the PickerFieldSk instances. This allows developers to visually inspect and interact with the component.

Workflow Example: User Selects an Option
1. A parent component instantiates <picker-field-sk> and sets its label and options properties.

<picker-field-sk .label="Fruit" .options=${['Apple', 'Banana', 'Cherry']}></picker-field-sk>

2. The user clicks or focuses the picker-field-sk input.

User clicks/focuses input --> vaadin-combo-box internally handles focus/click
  |
  V
vaadin-combo-box displays dropdown with options

3. As the user types, vaadin-combo-box filters the displayed options based on the typed text.

4. The user selects an option.

User selects "Banana" --> vaadin-combo-box updates its internal value
  |
  V
vaadin-combo-box emits 'value-changed' event

5. The vaadin-combo-box within picker-field-sk emits its native value-changed event.
- The onValueChanged method in PickerFieldSk catches this event.
- PickerFieldSk then dispatches its own value-changed custom event, with the selected value in event.detail.value.

picker-field-sk.onValueChanged(vaadinEvent)
  |
  V
Dispatch new CustomEvent('value-changed', { detail: { value: vaadinEvent.detail.value }})

6. The parent component, listening for the value-changed event on the <picker-field-sk> element, receives the event and can act upon the selected value.

Parent component listens for 'value-changed' --> Accesses event.detail.value
  |
  V
Update application state

This layered approach, building upon the Vaadin ComboBox, provides a robust and themeable selection component while abstracting away the complexities of the underlying library for the consumers of picker-field-sk.
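A minimal consumer sketch (import paths are illustrative):

import './index'; // registers <picker-field-sk>
import { PickerFieldSk } from './picker-field-sk';

const picker = document.createElement('picker-field-sk') as PickerFieldSk;
picker.label = 'Benchmark';
picker.options = ['Speedometer', 'JetStream', 'MotionMark'];
document.body.appendChild(picker);

// React to the element's own value-changed event.
picker.addEventListener('value-changed', (e: Event) => {
  const value = (e as CustomEvent<{ value: string }>).detail.value;
  console.log('selected:', value);
});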
The pinpoint-try-job-dialog-sk module provides a user interface element for initiating Pinpoint A/B try jobs.
Purpose:
The primary reason for this module‘s existence within the Perf application is to allow users to request additional trace data for specific benchmark runs. While Pinpoint itself supports a wider range of try job use cases, this dialog is specifically tailored for this trace generation scenario. It’s important to note that this component is considered a legacy feature, and future development should favor the newer Pinpoint frontend.
How it Works:
The dialog is designed to gather the necessary parameters from the user to construct and submit a Pinpoint A/B try job request. This process involves:
- Using alogin-sk to determine the logged-in user. The user's email is included in the try job request.
- Constructing a CreateLegacyTryRequest object. This object encapsulates all the necessary information for the Pinpoint backend.
  - The testPath (e.g., master/benchmark_name/story_name) is parsed to extract the configuration (e.g., benchmark_name) and the benchmark (e.g., story_name). The story is typically the last segment of the testPath.
  - The extra_test_args field is formatted to include the user-provided tracing arguments.
- Sending a POST request to the /_/try/ endpoint with the JSON payload.
- On success, the response contains a jobUrl for the newly created Pinpoint job. This URL is then displayed to the user, allowing them to navigate to the Pinpoint UI to monitor the job's progress.

Workflow:
User Interaction (e.g., click on chart tooltip)
|
V
Dialog Pre-populated with context (testPath, commits)
|
V
pinpoint-try-job-dialog-sk.open()
|
V
User reviews/modifies input fields (Base Commit, Exp. Commit, Trace Args)
|
V
User clicks "Send to Pinpoint"
|
V
[pinpoint-try-job-dialog-sk]
- Gathers input values
- Retrieves logged-in user via alogin-sk
- Constructs `CreateLegacyTryRequest` JSON
- Sends POST request to /_/try/
|
V
[Backend Pinpoint Service]
- Processes the request
- Creates A/B try job
- Returns jobUrl (success) or error
|
V
[pinpoint-try-job-dialog-sk]
- Displays spinner during request
- On Success:
- Displays link to the created Pinpoint job (jobUrl)
- Hides spinner
- On Error:
- Displays error message
- Hides spinner
Key Components/Files:
- pinpoint-try-job-dialog-sk.ts: This is the core TypeScript file that defines the custom element's logic.
  - PinpointTryJobDialogSk class: Extends ElementSk and manages the dialog's state, user input, and interaction with the Pinpoint API.
  - template: Defines the HTML structure of the dialog using lit-html. This includes input fields for commits and tracing arguments, a submit button, a spinner for loading states, and a link to the created Pinpoint job.
  - connectedCallback(): Initializes the dialog, sets up event listeners (e.g., for form submission, closing the dialog on outside click), and fetches the logged-in user's information.
  - setTryJobInputParams(params: TryJobPreloadParams): Allows external components to pre-fill the dialog's input fields. This is crucial for integrating the dialog with other parts of the Perf UI, like chart tooltips.
  - open(): Displays the modal dialog.
  - closeDialog(): Closes the modal dialog.
  - postTryJob(): This is the central method for handling the job submission. It reads values from the input fields, constructs the CreateLegacyTryRequest payload, and makes the fetch call to the Pinpoint API. It also handles the UI updates based on the API response (showing the job URL or an error message).
  - TryJobPreloadParams interface: Defines the structure for the parameters used to pre-populate the dialog.
- pinpoint-try-job-dialog-sk.scss: Contains the SASS/CSS styles for the dialog, ensuring it aligns with the application's visual theme. It styles the input fields, buttons, and the overall layout of the dialog.
- index.ts: A simple entry point that imports and registers the pinpoint-try-job-dialog-sk custom element.
- BUILD.bazel: Defines the build rules for the module, specifying its dependencies (e.g., elements-sk components like select-sk, spinner-sk, alogin-sk, and Material Web components) and how it should be compiled.

Design Decisions:
- Adapted from bisect-dialog-sk: The dialog's structure and initial functionality were adapted from an existing bisect dialog. This likely accelerated development by reusing common patterns for dialog interactions and API calls.
- The CreateLegacyTryRequest object is fully constructed on the client-side before being sent to the backend. This gives the frontend more control over the request parameters.
- The native <dialog> HTML element provides built-in modal behavior, simplifying the implementation of showing and hiding the dialog.
- The spinner-sk component provides visual feedback to the user while the API request is in progress.

This component serves as a bridge for users of the Perf application to leverage Pinpoint's capabilities for generating detailed trace information, even as the broader Pinpoint tooling evolves.
The pivot-query-sk module provides a custom HTML element for users to configure and interact with pivot table requests. Pivot tables are a powerful data summarization tool, and this element allows users to define how data should be grouped, what aggregate operations should be performed, and what summary statistics should be displayed.
The core of the module is the PivotQuerySk class, which extends ElementSk. This class manages the state of the pivot request and renders the UI for user interaction. It leverages other custom elements like multi-select-sk and select-sk to provide intuitive input controls.
Key Design Choices and Implementation Details:
- The element emits a custom event, pivot-changed, whenever the user modifies any part of the pivot request. This allows consuming applications to react to changes in real-time. The event detail (PivotQueryChangedEventDetail) contains the updated pivot.Request object or null if the current configuration is invalid. This decouples the UI component from the application logic that processes the pivot request.
- The PivotQuerySk element uses Lit's html templating for rendering. It maintains internal state for the _pivotRequest (the current pivot configuration) and _paramset (the available options for grouping). When these properties are set or updated, the _render() method is called to re-render the component, ensuring the UI reflects the current state.
- The createDefaultPivotRequestIfNull() method ensures that if _pivotRequest is initially null, it's initialized with a default valid structure before any user interaction attempts to modify it. This prevents errors and provides a sensible starting point.
- The "group by" options are derived from both the _paramset and the existing _pivotRequest. The allGroupByOptions() method is particularly noteworthy as it ensures that even if the _paramset changes, any currently selected group_by keys in the _pivotRequest are still displayed as options. This prevents accidental data loss during _paramset updates. It achieves this by concatenating keys from both sources, sorting, and then filtering out duplicates.
- The pivotRequest getter includes a call to validatePivotRequest (from pivotutil). This ensures that the component only returns a valid pivot.Request object. If the current configuration is invalid, it returns null. This promotes data integrity.

Responsibilities and Key Components:
pivot-query-sk.ts: This is the main file defining the PivotQuerySk custom element.
- PivotQuerySk class:
  - Manages a pivot.Request object, which defines the grouping, operation, and summary statistics for a pivot table.
  - Takes a ParamSet as input, which provides the available keys for the "group by" selection. This ParamSet likely originates from the dataset being analyzed.
  - Renders the "group by" selection with a multi-select-sk, the operation with a select element, and the summary statistics with a multi-select-sk.
  - Emits the pivot-changed custom event when the user modifies the pivot request.
- PivotQueryChangedEventDetail type: Defines the structure of the data passed in the pivot-changed event.
- PivotQueryChangedEventName constant: The string name of the custom event.
- Event handlers (groupByChanged, operationChanged, summaryChanged): These methods are triggered by user interactions with the respective UI elements. They update the internal _pivotRequest and then call emitChangeEvent.
- emitChangeEvent(): Constructs and dispatches the pivot-changed event.
- Getters/setters (pivotRequest, paramset): Provide controlled access to the element's core data, triggering re-renders when set.
pivot-query-sk-demo.html and pivot-query-sk-demo.ts: These files provide a demonstration page for the pivot-query-sk element.
- The demo page instantiates a pivot-query-sk.
- It populates it with sample pivot.Request data and a ParamSet. It also includes an event listener for pivot-changed to display the selected pivot configuration as JSON, illustrating how to consume the element's output.
Initialization:
- The pivot-query-sk element is created.
- The consuming application sets its paramset (available grouping keys) and optionally an initial pivotRequest.
- The multi-select-sk (for "group by") emits a selection-changed event.
- PivotQuerySk.groupByChanged() is called.
- createDefaultPivotRequestIfNull() ensures _pivotRequest is not null.
- _pivotRequest.group_by is updated based on the new selection.
- emitChangeEvent() is called.
- emitChangeEvent() reads the current pivotRequest (which might be null if invalid).
- It constructs a CustomEvent named pivot-changed.
- The detail of the event is the current (potentially validated) pivotRequest.
- The consuming application, listening for pivot-changed events on the pivot-query-sk element or one of its ancestors, receives the event.
- It uses event.detail (the pivot.Request) to update its data display, fetch new data, or perform other actions.

This flow can be visualized as:
User Interaction (e.g., click on multi-select)
|
v
Internal element event (e.g., @selection-changed from multi-select-sk)
|
v
PivotQuerySk Event Handler (e.g., groupByChanged)
|
v
Update internal _pivotRequest state
|
v
PivotQuerySk.emitChangeEvent()
|
v
Dispatch "pivot-changed" CustomEvent (with pivot.Request as detail)
|
v
Consuming Application's Event Listener
|
v
Application processes the new pivot.Request
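A minimal consumer sketch (import paths and the pivot namespace import are assumptions for illustration):

import './index'; // registers <pivot-query-sk>
import { PivotQueryChangedEventDetail } from './pivot-query-sk';
import { pivot } from '../json';

const pivotQuery = document.querySelector('pivot-query-sk')!;
pivotQuery.addEventListener('pivot-changed', (e: Event) => {
  const req: pivot.Request | null = (e as CustomEvent<PivotQueryChangedEventDetail>).detail;
  if (req === null) {
    // Current configuration is invalid; wait for the user to fix it.
    return;
  }
  // Use the request, e.g. re-query the backend or update a pivot-table-sk.
  console.log('group_by:', req.group_by, 'summary:', req.summary);
});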
The pivot-table-sk module provides a custom HTML element, <pivot-table-sk>, designed to display pivoted data in a tabular format. This element is specifically for DataFrames that have been pivoted and contain summary values, as opposed to summary traces (which would be displayed in a plot).
Core Functionality and Design
The primary purpose of pivot-table-sk is to present complex, multi-dimensional data in an understandable and interactive table. The “why” behind its design is to offer a user-friendly way to explore summarized data that arises from pivoting operations.
Key design considerations include:
- Data-driven: The element takes a DataFrame (from //perf/modules/json:index_ts_lib) and a pivot.Request (also from //perf/modules/json:index_ts_lib) as input. The pivot.Request is crucial as it dictates how the DataFrame was originally pivoted, including the group_by keys, the main operation, and the summary operations.
- The table's columns are derived from the group_by keys and the summary operations.
- Validation: The element verifies that the pivot.Request is suitable for display as a pivot table (using validateAsPivotTable from //perf/modules/pivotutil:index_ts_lib). This prevents rendering errors or confusing displays if the input data structure isn't appropriate.
pivot-table-sk.ts: This is the heart of the module, defining the PivotTableSk custom element.
- PivotTableSk class:
  - Extends ElementSk (from //infra-sk/modules/ElementSk:index_ts_lib).
  - Holds the DataFrame (df), pivot.Request (req), and the original query string.
- KeyValues type and keyValuesFromTraceSet function: This is a critical internal data structure. KeyValues is an object where keys are trace keys (e.g., ',arch=x86,config=8888,') and values are arrays of strings. These string arrays represent the values of the parameters specified in req.group_by, in the same order. For example, if req.group_by is ['config', 'arch'], then for the trace ',arch=arm,config=8888,', the corresponding KeyValues entry would be ['8888', 'arm']. This transformation is performed by keyValuesFromTraceSet and is essential for rendering the "key" columns of the table and for sorting by these keys.
- SortSelection class: Represents the sorting state of a single column. It stores:
  - column: The index of the column.
  - kind: Whether the column represents 'keyValues' (from group_by) or 'summaryValues' (from summary operations).
  - dir: The sort direction ('up' or 'down').
  - It provides toggleDirection, buildCompare (to create a JavaScript sort comparison function based on its state), and encode/decode for serialization.
- SortHistory class: Manages the overall sorting state of the table.
  - It maintains a list (history) of SortSelection objects.
  - When a column is clicked, its SortSelection is moved to the front of the history array, and its direction is toggled.
  - buildCompare in SortHistory creates a composite comparison function that iterates through the SortSelection objects in history. The first SortSelection determines the primary sort order. If it results in a tie, the second SortSelection is used to break the tie, and so on. This creates the effect of a stable sort across multiple user interactions without needing a true stable sort algorithm for each click.
  - It provides encode/decode methods to serialize the entire sort history (e.g., for persisting sort state in a URL).
- set() method: The primary way to provide data to the component. It initializes keyValues, sortHistory, and the main compare function. It can also accept an encodedHistory string to restore a previous sort state.
- Rendering: The element uses lit-html for templating.
  - queryDefinition(): Renders the contextual information about the query and pivot operations.
  - tableHeader(), keyColumnHeaders(), summaryColumnHeaders(): Generate the table header row, including sort icons.
  - sortArrow(): Dynamically displays the correct sort icon (up arrow, down arrow, or neutral sort icon) based on the current SortHistory.
  - tableRows(), keyRowValues(), summaryRowValues(): Generate the data rows of the table, applying the current sort order.
  - displayValue(): Formats numerical values for display, converting a special sentinel value (MISSING_DATA_SENTINEL from //perf/modules/const:const_ts_lib) to '-'.
- Events: The element emits a change event when the user sorts the table. The event detail (PivotTableSkChangeEventDetail) is the encoded SortHistory string. This allows parent components to react to sort changes and potentially persist the state.
- Dependencies:
  - paramset-sk to display the query parameters.
  - Icon elements (arrow-drop-down-icon-sk, arrow-drop-up-icon-sk, sort-icon-sk) for the sort indicators.
  - //perf/modules/json:index_ts_lib for DataFrame, TraceSet, pivot.Request types.
  - //perf/modules/pivotutil:index_ts_lib for operationDescriptions and validateAsPivotTable.
  - //perf/modules/paramtools:index_ts_lib for fromKey (to parse trace keys into parameter sets).
  - //infra-sk/modules:query_ts_lib for toParamSet (to convert a query string into a ParamSet).

pivot-table-sk.scss: Provides the styling for the pivot-table-sk element, including table borders, padding, text alignment, and cursor styles for interactive elements. It leverages themes from //perf/modules/themes:themes_sass_lib.
index.ts: A simple entry point that imports and thereby registers the pivot-table-sk custom element.
pivot-table-sk-demo.html & pivot-table-sk-demo.ts:
- These files provide a demonstration page for the pivot-table-sk element.
- pivot-table-sk-demo.ts creates sample DataFrame and pivot.Request objects and uses them to populate instances of pivot-table-sk on the demo page. This is crucial for development and visual testing. It demonstrates valid use cases, cases with invalid pivot requests, and cases with null DataFrames to ensure the component handles these scenarios gracefully.

Test Files (pivot-table-sk_test.ts, pivot-table-sk_puppeteer_test.ts):
- pivot-table-sk_test.ts (Karma test): Contains unit tests for the PivotTableSk element and its internal logic, particularly the SortSelection and SortHistory classes. It verifies:
  - Click handling (the change event is emitted with the correct encoded history).
  - The buildCompare functions in SortSelection and SortHistory produce the correct sorting results for various data types and sort directions.
  - The encode and decode methods for SortSelection and SortHistory work correctly, allowing for round-tripping of sort state.
  - The keyValuesFromTraceSet function correctly transforms TraceSet data based on the pivot.Request.
- pivot-table-sk_puppeteer_test.ts (Puppeteer test): Performs end-to-end tests by loading the demo page in a headless browser.

Workflow Example: User Sorting the Table
Initial State:
- The pivot-table-sk element is initialized with a DataFrame, a pivot.Request, and an optional initial encodedHistory string.
- pivot-table-sk creates a SortHistory object. If encodedHistory is provided, SortHistory.decode() is called. Otherwise, a default sort order is established (usually based on the order of summary columns, then key columns, all initially 'up').
- SortHistory.buildCompare() generates the initial comparison function.
- Each sortable column header initially shows the neutral sort-icon-sk.

User Clicks a Column Header (e.g., "config" key column):
- `changeSort(columnIndex, 'keyValues')` is called within
`pivot-table-sk`.
- `this.sortHistory.selectColumnToSortOn(columnIndex, 'keyValues')` is
invoked:
- The `SortSelection` for the "config" column is found in
`this.sortHistory.history`.
- It's removed from its current position.
- Its `direction` is toggled (e.g., from 'up' to 'down').
    - This updated `SortSelection` is prepended to `this.sortHistory.history`.

      Before: [SummaryCol0(up), SummaryCol1(up), KeyCol0(config, up), KeyCol1(arch, up)]
      Click on KeyCol0 (config):
      After:  [KeyCol0(config, down), SummaryCol0(up), SummaryCol1(up), KeyCol1(arch, up)]

  - `this.compare = this.sortHistory.buildCompare(...)` is called. A new composite comparison function is generated. Now, rows will primarily be sorted by "config" (descending). Ties will be broken by "SummaryCol0" (ascending), then "SummaryCol1" (ascending), and finally "KeyCol1" (ascending).
  - A `CustomEvent('change')` is dispatched. The `event.detail` contains `this.sortHistory.encode()`, which is a string representation of the new sort order (e.g., "dk0-su0-su1-ku1").
  - `this._render()` is called, re-rendering the table with the new sort order. The "config" column header now shows an arrow-drop-down-icon-sk.
  - The process repeats. The `SortSelection` for the "avg" column is moved to the front of `this.sortHistory.history` and its direction is toggled.

      Before: [KeyCol0(config, down), SummaryCol0(avg, up), SummaryCol1(sum, up), KeyCol1(arch, up)]
      Click on SummaryCol0 (avg):
      After:  [SummaryCol0(avg, down), KeyCol0(config, down), SummaryCol1(sum, up), KeyCol1(arch, up)]

  - The table is re-rendered, now primarily sorted by "avg" (descending), with ties broken by "config" (descending), then "sum" (ascending), then "arch" (ascending).
This multi-level sorting, driven by the SortHistory maintaining the sequence of user sort actions, is a key aspect of the “how” behind the pivot-table-sk's user experience. It aims to provide a powerful yet familiar way to analyze pivoted data.
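A compact sketch of the composite-comparison idea behind SortHistory.buildCompare (simplified row and selection types, not the element's actual classes):

// Each selection knows how to compare two rows for its own column.
type RowCompare = (a: number[], b: number[]) => number;

interface Selection {
  column: number;
  dir: 'up' | 'down';
}

function buildCompareForSelection(sel: Selection): RowCompare {
  return (a, b) => {
    const diff = a[sel.column] - b[sel.column];
    return sel.dir === 'up' ? diff : -diff;
  };
}

// The composite comparator walks the history: the most recently clicked
// column sorts first, earlier clicks only break ties.
function buildCompare(history: Selection[]): RowCompare {
  const compares = history.map(buildCompareForSelection);
  return (a, b) => {
    for (const cmp of compares) {
      const res = cmp(a, b);
      if (res !== 0) return res;
    }
    return 0;
  };
}

// Example: rows sorted primarily by column 1 descending, ties by column 0 ascending.
const rows = [[1, 5], [2, 5], [0, 7]];
rows.sort(buildCompare([{ column: 1, dir: 'down' }, { column: 0, dir: 'up' }]));
// -> [[0, 7], [1, 5], [2, 5]]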
The pivotutil module provides utility functions and constants for working with pivot table requests. Its primary purpose is to ensure the validity and integrity of pivot requests before they are processed, and to offer human-readable descriptions for pivot operations. This centralization of pivot-related logic helps maintain consistency and simplifies the handling of pivot table configurations across different parts of the application.
index.ts: This is the core file of the module and contains all the exported functionalities.
operationDescriptions:
Why: Pivot operations are identified by short codes (e.g., avg, std). To improve user experience and make UIs more understandable, a mapping to human-readable names is necessary.
How: It is an object whose keys are pivot.Operation enum values (imported from ../json) and whose values are their corresponding descriptive strings (e.g., "Mean", "Standard Deviation"). This allows for easy lookup and display of operation names.

validatePivotRequest(req: pivot.Request | null): string:
Why: It performs basic validation of a pivot.Request object.
How: It first checks whether the request is null. If so, it returns an error message. It then checks that the group_by property is present and is an array with at least one element. A pivot table fundamentally relies on grouping data, so this is a mandatory field.

Input: pivot.Request | null
  |
  V
Is request null? --(Yes)--> Return "Pivot request is null."
  | (No)
  V
Is req.group_by null or empty? --(Yes)--> Return "Pivot must have at least one GroupBy."
  | (No)
  V
Return "" (Valid)

validateAsPivotTable(req: pivot.Request | null): string:
How: It first calls validatePivotRequest to ensure the basic structure of the request is valid. If validatePivotRequest returns an error, that error is immediately returned. It then checks that the summary property of the request is present and is an array with at least one element. Summary operations (like sum, average, etc.) are essential for generating the aggregated values displayed in a pivot table. Without them, the request might be valid for plotting individual traces grouped by some criteria, but not for a typical pivot table with summarized data. If the summary array is missing or empty, an error message is returned. Otherwise, an empty string is returned.

Input: pivot.Request | null
  |
  V
Call validatePivotRequest(req) --> invalidMsg
  |
  V
Is invalidMsg not empty? --(Yes)--> Return invalidMsg
  | (No)
  V
Is req.summary null or empty? --(Yes)--> Return "Must have at least one Summary operation."
  | (No)
  V
Return "" (Valid for pivot table)

index_test.ts: This file contains unit tests for the functions in index.ts.
- The tests ensure the correctness and robustness of the pivotutil module.
- They use the chai assertion library to define test cases.
- For validatePivotRequest, it tests scenarios like:
  - A null request.
  - group_by being null.
  - group_by being an empty array.
- For validateAsPivotTable, it builds upon the validatePivotRequest checks and adds tests for:
  - summary being null.
  - summary being an empty array.
  - A valid request with at least one summary operation.
- Each test asserts whether the validation functions return an empty string (for valid inputs) or a non-empty error message string (for invalid inputs) as expected.

The design decision to separate validatePivotRequest and validateAsPivotTable allows for more granular validation. Some parts of an application might only need the basic validation (e.g., ensuring data can be grouped), while others specifically require summary operations for display in a tabular format. This separation provides flexibility. The use of descriptive error messages aids in debugging and user feedback.
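A sketch of what these validators amount to, following the checks and messages in the flows above (simplified; the real implementations live in index.ts):

import { pivot } from '../json';

export function validatePivotRequest(req: pivot.Request | null): string {
  if (req === null) return 'Pivot request is null.';
  if (!req.group_by || req.group_by.length === 0) {
    return 'Pivot must have at least one GroupBy.';
  }
  return '';
}

export function validateAsPivotTable(req: pivot.Request | null): string {
  const invalidMsg = validatePivotRequest(req);
  if (invalidMsg !== '') return invalidMsg;
  if (!req!.summary || req!.summary.length === 0) {
    return 'Must have at least one Summary operation.';
  }
  return '';
}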
The plot-google-chart-sk module provides a custom element for rendering interactive time-series charts using Google Charts. It is designed to display performance data, including anomalies and user-reported issues, and allows users to interact with the chart through panning, zooming, and selecting data points.
Key Responsibilities:
- Rendering data from DataTable objects, which are consumed from a Lit context (dataTableContext). This DataTable typically contains time-series data where the first column is a commit identifier (e.g., revision number or timestamp), the second is a date object, and subsequent columns represent different data traces.
- Overlaying anomalies and user-reported issues, consumed from Lit contexts (dataframeAnomalyContext and dataframeUserIssueContext).
- Providing a side panel (side-panel-sk) that displays a legend for the plotted traces. Users can toggle the visibility of individual traces using checkboxes in the side panel.

Design Decisions and Implementation Choices:
- The element builds on the @google-web-components/google-chart library for the core charting functionality. This provides a robust and feature-rich charting engine.
- It uses Lit contexts to receive the DataTable, anomaly information, and loading states from parent components or a centralized data store. This promotes a decoupled architecture.
- v-resizable-box-sk: A dedicated component for the vertical selection box used in the "deltaY" mode. It calculates and displays the difference between the start and end points of the drag.
- drag-to-zoom-box-sk: Handles the visual representation of the selection box during the drag-to-zoom interaction. It manages the display and dimensions of the box as the user drags.
- side-panel-sk: Encapsulates the legend and trace visibility controls. This separation of concerns keeps the main chart component focused on plotting.
- The element dispatches custom events (selection-changed, plot-data-mouseover, plot-data-select) to notify parent components of user interactions and chart state changes. This allows for integration with other parts of an application.
- Anomalies and user issues are drawn as md-icon elements on top of the chart. Their positions are calculated based on the chart's layout and the data point coordinates. This approach avoids modifying the Google Chart's internal rendering and allows for more flexible styling and interaction with these markers.
- It caches the chart object (this.chart) and chart layout information (this.cachedChartArea) to avoid redundant lookups.
- It maintains a removedLabelsCache to efficiently hide and show traces without reconstructing the entire DataView each time.
- The navigationMode property (pan, deltaY, dragToZoom) manages the current mouse interaction state. This simplifies event handling by directing mouse events to the appropriate logic based on the active mode.
- The determineYAxisTitle method attempts to create a meaningful Y-axis title by examining the unit and improvement_direction parameters from the trace names. It displays these only if they are consistent across all visible traces.

Key Components/Files:
- plot-google-chart-sk.ts: The core component that orchestrates the chart display and interactions. It consumes its inputs (DataTable, AnomalyMap, UserIssueMap) via Lit context and works with side-panel-sk to manage trace visibility.
- side-panel-sk.ts: Implements the side panel containing the legend and checkboxes for toggling trace visibility, derived from the same DataTable used by plot-google-chart-sk. It also displays the delta values produced by v-resizable-box-sk.
- v-resizable-box-sk.ts: A custom element for the vertical resizable selection box used during the delta calculation (Shift-click + drag).
- drag-to-zoom-box-sk.ts: A custom element for the selection box used during the drag-to-zoom interaction (Ctrl-click + drag).
- plot-google-chart-sk-demo.ts and plot-google-chart-sk-demo.html: Provide a demonstration page showcasing the plot-google-chart-sk element with sample data. This is crucial for development and testing.
- index.ts: Serves as the entry point for the module, importing and registering all the custom elements defined within.

Key Workflows:
Initial Chart Rendering: DataTable (from context) -> plot-google-chart-sk -> updateDataView() -> Creates google.visualization.DataView -> Sets columns based on domain (commit/date) and visible traces -> updateOptions() configures chart appearance (colors, axes, view window) -> plotElement.value.view = view and plotElement.value.options = options -> Google Chart renders. -> onChartReady(): -> Caches chart object. -> Calls drawAnomaly(), drawUserIssues(), drawXbar().
Panning: User mousedown (not Shift or Ctrl) -> onChartMouseDown(): navigationMode = 'pan' User mousemove -> onWindowMouseMove(): -> Calculates deltaX based on mouse movement and current domain. -> Updates this.selectedRange. -> Calls updateOptions() to update chart's horizontal view window. -> Dispatches selection-changing event. User mouseup -> onWindowMouseUp(): -> Dispatches selection-changed event. -> navigationMode = null.
Drag-to-Zoom: User Ctrl + mousedown -> onChartMouseDown(): navigationMode = 'dragToZoom' -> zoomRangeBox.value.initializeShow(): Displays the drag box. User mousemove -> onWindowMouseMove(): -> zoomRangeBox.value.handleDrag(): Updates the drag box dimensions. User mouseup -> onChartMouseUp(): -> Calculates zoom boundaries based on drag box and isHorizontalZoom. -> zoomRangeBox.value.hide(). -> showResetButton = true. -> updateBounds(): Updates chart's hAxis.viewWindow or vAxis.viewWindow. -> navigationMode = null.
Delta Calculation (Shift-Click): User Shift + mousedown -> onChartMouseDown(): navigationMode = 'deltaY' -> deltaRangeBox.value.show(): Displays the vertical resizable box. User mousemove -> onWindowMouseMove(): -> deltaRangeBox.value.updateSelection(): Updates box height and calculates delta. -> Updates sidePanel.value with delta values. User Shift + mousedown (again) or regular mousedown -> onChartMouseDown(): -> Toggles deltaRangeOn. If finishing, sidePanel.value.showDelta = true. User mouseup (after dragging) -> onChartMouseUp(): -> Updates sidePanel.value with final delta values. -> navigationMode = null.
Toggling Trace Visibility: User clicks checkbox in side-panel-sk -> side-panel-sk dispatches side-panel-selected-trace-change. plot-google-chart-sk listens (sidePanelCheckboxUpdate()): -> Updates this.removedLabelsCache. -> Calls updateDataView(): -> Recreates DataView, hiding/showing columns based on removedLabelsCache. -> Updates chart.
Anomaly/Issue Display: anomalyMap or userIssues (from context) changes -> plot-google-chart-sk.willUpdate() -> plotElement.value.redraw() (if chart already rendered). Chart redraw triggers onChartReady(): -> drawAnomaly() / drawUserIssues(): -> Iterates through anomalies/issues for visible traces. -> Calculates screen coordinates (x, y) using chart.getChartLayoutInterface().getXLocation() and getYLocation(). -> Clones template md-icon elements from slots. -> Positions the icons absolutely within anomalyDiv or userIssueDiv.
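As a consumer-side illustration of the events listed above, a parent component might wire up listeners roughly like this (a sketch only; the event names come from this documentation, but the shape of each event's detail is an assumption):

```ts
// Hypothetical consumer of plot-google-chart-sk events.
const chart = document.querySelector('plot-google-chart-sk')!;

chart.addEventListener('selection-changed', (e: Event) => {
  // Fired when panning or zooming settles on a new visible range.
  const range = (e as CustomEvent<{ begin: number; end: number }>).detail;
  console.log('visible range is now', range);
});

chart.addEventListener('plot-data-select', (e: Event) => {
  // Fired when the user selects a data point, e.g. to open a details panel.
  console.log('selected point:', (e as CustomEvent).detail);
});
```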
This detailed explanation should provide a solid understanding of the plot-google-chart-sk module's purpose, architecture, and key functionalities.
The plot-simple-sk module provides a custom HTML element for rendering 2D line graphs. It's designed to be interactive, allowing users to zoom, inspect individual data points, and highlight specific traces.
Core Functionality and Design:
The primary goal of plot-simple-sk is to display time-series data or any data that can be represented as a set of (x, y) coordinates. Key design considerations include:
Performance: To handle potentially large datasets and maintain a smooth user experience, the element employs several optimization techniques:
- Layered Canvases: The element uses two <canvas> elements stacked on top of each other. One (traces) is for drawing the static parts of the plot: the lines, axes, and dots representing data points; these are pre-rendered into Path2D objects for efficient redrawing. The other (overlay) is for dynamic elements that change frequently, such as crosshairs, zoom selection rectangles, and hover highlights. This separation prevents unnecessary redrawing of the entire plot.
- Path2D Objects: Trace lines and data point dots are converted into Path2D objects. This allows the browser to optimize their rendering, leading to faster redraws compared to repeatedly issuing drawing commands (a minimal sketch of this technique appears at the end of this module's description).
- k-d Tree: A k-d tree (kd.ts) is used for efficient searching of the closest point in 2D space, which is crucial for interactivity with potentially many data points.
- Deferred Work: Expensive operations such as rebuilding the search tree (recalcSearchTask) or redrawing after a zoom (zoomTask) are scheduled using window.setTimeout. This prevents them from blocking the main thread and ensures they only happen when necessary, improving responsiveness. requestAnimationFrame is used for mouse movement updates to synchronize with the browser's repaint cycle.

Interactivity:
- Zooming: A stack of zoom ranges (detailsZoomRangesStack) allows users to progressively zoom in and potentially (though not explicitly stated as a current feature for zooming back out) navigate back through zoom levels.
- Hovering: Moving the mouse over a data point dispatches a trace_focused event.
- Selection: Clicking a data point dispatches a trace_selected event.
- Markers: Vertical bars (xbar) or regions (bands) can be drawn on the plot to mark specific x-axis values or ranges.

Appearance and Theming:
- Sizing: The element respects the width and height attributes of the custom element and uses a ResizeObserver to redraw when its dimensions change.
- High-DPI Rendering: It accounts for window.devicePixelRatio to render crisply on high-DPI displays by drawing to a larger canvas and then scaling it down with CSS transforms.
- Theming: It imports elements-sk/themes and uses CSS variables for colors (e.g., --on-background, --success, --failure), allowing its appearance to be customized by the surrounding application's theme. It listens for theme-chooser-toggle events to redraw when the theme changes.

Key Files and Responsibilities:
plot-simple-sk.ts: This is the heart of the module, defining the PlotSimpleSk custom element.
- Manages the two canvases and their rendering contexts (ctx for traces, overlayCtx for overlays).
- Holds the plot state: lineData (traces and their pre-rendered paths), labels (x-axis tick information), the current _zoom state, detailsZoomRangesStack for detail-view zooms, hoverPt, crosshair, highlighted traces, _xbar, _bands, and _anomalyDataMap.
- Handles mouse interaction and listens for theme-chooser-toggle and ResizeObserver events.
- Exposes the public API: addLines, deleteLines, removeAll, and properties like highlight, xbar, bands, zoom, anomalyDataMap, userIssueMap, and dots to control the plot's content and appearance.
- Uses d3-scale (specifically scaleLinear) to map data coordinates (domain) to canvas pixel coordinates (range) and vice versa (see the brief sketch after this list). Functions like rectFromRange and rectFromRangeInvert handle these transformations for rectangular regions.
- PathBuilder: a helper class to construct Path2D objects for trace lines and dots based on the current scales and data.
- SearchBuilder: a helper class to prepare the data points for the KDTree by converting source coordinates to canvas coordinates.
- Defines the SummaryArea and DetailArea interfaces and manages their respective rectangles, axes, and scaling ranges.

kd.ts: Implements a k-d tree.
- Provides efficient nearest-neighbor search (O(log n) on average) to find the data point closest to a given mouse coordinate on the canvas. This is crucial for interactivity like mouse hovering and clicking to identify specific points on traces.
- The tree is constructed from points (objects with x and y properties), a distance metric function, and the dimensions to consider (['x', 'y']). The nearest() method is the primary interface used by plot-simple-sk.ts.

ticks.ts: Responsible for generating appropriate tick marks and labels for the time-based x-axis.
- Given an array of Date objects representing the x-axis values, it determines a sensible set of tick positions and their corresponding formatted string labels (e.g., "Jul", "Mon, 8 AM", "10:30 AM").
- Formatting relies on Intl.DateTimeFormat. It aims for a reasonable number of ticks (MIN_TICKS to MAX_TICKS) and uses a fixTicksLength function to thin out the ticks if too many are generated.
- The ticks() function returns an array of objects, each with an x (index in the original data) and a text (formatted label).

plot-simple-sk.scss: Contains the SASS/CSS styles for the plot-simple-sk element. It imports themes.scss and uses CSS variables (e.g., var(--on-background), var(--background)) to ensure the plot's colors match the application's theme.

index.ts: A simple entry point that imports plot-simple-sk.ts to ensure the custom element is defined and registered with the browser.
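A brief sketch of the scaleLinear mapping mentioned above (the domain and range values here are illustrative, not the module's own configuration):

```ts
import { scaleLinear } from 'd3-scale';

// Map source x values (e.g. commit offsets 0..100) onto canvas pixels 0..800.
const xScale = scaleLinear().domain([0, 100]).range([0, 800]);

xScale(50);         // 400: data coordinate -> pixel, used when building paths
xScale.invert(400); // 50:  pixel -> data coordinate, used when interpreting mouse events
```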
Demo Files (plot-simple-sk-demo.html, plot-simple-sk-demo.ts, plot-simple-sk-demo.scss):
These files provide an interactive demonstration of the plot-simple-sk element's capabilities. The TypeScript file (plot-simple-sk-demo.ts) contains the logic to interact with the plot, such as adding random trace data, highlighting traces, zooming, clearing the plot, and displaying anomaly markers. It also logs events emitted by the plot.

Key Workflows:
Initialization and Rendering: ElementSk constructor -> connectedCallback -> render render -> _render (lit-html template instantiation) -> canvas.getContext -> updateScaledMeasurements -> updateScaleRanges -> recalcDetailPaths -> recalcSummaryPaths -> drawTracesCanvas
Adding Data (addLines): addLines -> Convert MISSING_DATA_SENTINEL to NaN -> Store in this.lineData -> updateScaleDomains -> recalcSummaryPaths -> recalcDetailPaths -> drawTracesCanvas recalcDetailPaths / recalcSummaryPaths -> For each line: PathBuilder creates linePath and dotsPath. recalcDetailPaths -> recalcSearch (schedules recalcSearchImpl) recalcSearchImpl -> SearchBuilder populates points -> new KDTree
Mouse Hover and Focus: mousemove event -> this.mouseMoveRaw updated raf loop -> checks this.mouseMoveRaw -> eventToCanvasPt -> If this.pointSearch: this.pointSearch.nearest(pt) -> updates this.hoverPt -> dispatches trace_focused event -> Updates this.crosshair (based on shift key and hoverPt) -> drawOverlayCanvas
Zooming via Summary Drag: mousedown on summary -> this.inZoomDrag = 'summary' -> this.zoomBegin set mousemove (while dragging) -> raf loop: -> eventToCanvasPt -> clampToRect (summary area) -> this.summaryArea.range.x.invert(pt.x) to get source x -> this.zoom = [min_x, max_x] (triggers _zoomImpl via setter task) _zoomImpl (after timeout) -> updateScaleDomains -> recalcDetailPaths -> drawTracesCanvas mouseup / mouseleave -> dispatches zoom event -> this.inZoomDrag = 'no-zoom'
Zooming via Detail Area Drag: mousedown on detail -> this.inZoomDrag = 'details' -> this.zoomRect initialized mousemove (while dragging) -> raf loop: -> eventToCanvasPt -> clampToRect (detail area) -> Updates this.zoomRect.width/height -> drawOverlayCanvas (to show the dragging rectangle) mouseup / mouseleave -> dispatchZoomEvent -> doDetailsZoom doDetailsZoom -> If zoom box is large enough: this.detailsZoomRangesStack.push(rectFromRangeInvert(...)) -> _zoomImpl
Drawing Process:
drawTracesCanvas():
- Clears the traces canvas (this.ctx).
- Calls drawXAxis (for the detail area).
- For each line in this.lineData: draws line.detail.linePath, and line.detail.dotsPath if this.dots is true.
- Calls drawXAxis again (to draw labels outside the clipped region).
- If this.summary is present and a zoom drag is not in progress: draws the summary traces.
- Calls drawYAxis (for the detail area).
- Calls drawOverlayCanvas().

drawOverlayCanvas():
- Clears the overlay canvas (this.overlayCtx).
- If this.summary is present: calls drawXBar and drawBands for the summary area, and outlines the current zoom selection taken from detailsZoomRangesStack (if it is not empty) or from this._zoom.
- For the detail area: calls drawXBar, drawBands, drawUserIssues, and drawAnomalies, and draws this.zoomRect (dashed) while a detail zoom drag is in progress.

This structured approach allows plot-simple-sk to be both feature-rich and performant for visualizing and interacting with 2D data plots.
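To make the layered-canvas and Path2D ideas described earlier concrete, here is a minimal, self-contained sketch of the technique; the canvas selector, data points, and function names are illustrative, not the module's own code:

```ts
// Pre-render a trace into a Path2D once, then redraw cheaply on demand.
const canvas = document.querySelector<HTMLCanvasElement>('canvas')!;
const ctx = canvas.getContext('2d')!;

const points = [
  { x: 0, y: 10 },
  { x: 50, y: 40 },
  { x: 100, y: 20 },
];

// Build the path once when the data arrives...
const linePath = new Path2D();
points.forEach((p, i) => (i === 0 ? linePath.moveTo(p.x, p.y) : linePath.lineTo(p.x, p.y)));

// ...so every subsequent redraw is just a clear plus a stroke of the cached path.
function redraw(): void {
  ctx.clearRect(0, 0, canvas.width, canvas.height);
  ctx.stroke(linePath);
}
redraw();
```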
The plot-summary-sk module provides a custom HTML element, <plot-summary-sk>, designed to display a summary plot of performance data and allow users to select a range within that plot. This is particularly useful for visualizing trends over time or commit ranges and enabling interactive exploration of the data.
At its core, plot-summary-sk leverages the Google Charts library to render an area chart. It's designed to work with a DataFrame, a data structure commonly used in Perf for holding timeseries data. The element can display data based on either commit offsets or timestamps (domain attribute).
Key Responsibilities:
- Emits a summary_selected custom event when the user makes or changes a selection. This event carries details about the selected range (start, end, value, and domain). An example listener appears at the end of this module's description.
- Optionally interacts with a DataFrameRepository to fetch and append new data.

Key Components/Files:
- plot-summary-sk.ts: This is the main file defining the PlotSummarySk LitElement.
  - Consumes DataFrame data (from dataTableContext) and renders it using <google-chart>.
  - Tracks which trace to summarize via its selectedTrace property.
  - Overlays an h-resizable-box-sk element to provide the visual selection rectangle and handles the mouse events for drawing and resizing this selection.
  - Listens for google-chart-ready events to ensure operations like setting a selection programmatically happen after the chart is fully initialized.
  - Provides a controlTemplate for optional "load more data" buttons, which interact with a DataFrameRepository (consumed via dataframeRepoContext).
  - Uses a ResizeObserver to detect when the element is resized and triggers a chart redraw.
- h-resizable-box-sk.ts: Defines the HResizableBoxSk LitElement, a reusable component for creating a horizontally resizable and draggable selection box.
  - It is independent of the plot-summary-sk component, which promotes reusability and simplifies the main component's logic.
  - Renders a div (.surface) that represents the selection.
  - Listens for mousedown events on its container to initiate an action: 'draw' (if clicking outside the existing selection), 'drag' (if clicking inside the selection), 'left' (if clicking on the left edge), or 'right' (if clicking on the right edge).
  - Listens for mousemove events on the window to update the selection's position and size during an action. This ensures interaction continues even if the mouse moves outside the element's bounds.
  - Listens for mouseup events on the window to finalize the action and emits a selection-changed event with the new range.
  - Uses appropriate cursors for each action (cursor: move, cursor: ew-resize).
  - Its selectionRange property (getter and setter) allows programmatic control and retrieval of the selection, defined by begin and end pixel offsets relative to the component.
- plot-summary-sk.css.ts: Contains the CSS styles for the plot-summary-sk element, defined as a Lit css tagged template literal. It positions the selection box (h-resizable-box-sk) absolutely over the chart and styles the optional loading buttons and loading indicator.
- plot-summary-sk-demo.ts and plot-summary-sk-demo.html: Provide a demonstration page for the plot-summary-sk element, showing it with different configurations (e.g., domain, selectionType). The TypeScript file generates sample DataFrame objects, converts them to Google DataTable format, and populates the plot elements. It also listens for summary_selected events and displays their details.
- Tests (*.test.ts, *_puppeteer_test.ts): Unit tests (plot-summary-sk_test.ts, h_resizable_box_sk_test.ts) verify individual component logic, such as programmatic selection and state changes; they often mock dependencies like the Google Chart library or use test utilities to generate data. Puppeteer tests (plot-summary-sk_puppeteer_test.ts) perform end-to-end testing by interacting with the component in a real browser environment, simulating user actions like mouse drags and verifying the emitted event details and visual output (via screenshots).

Key Workflows:
Initialization and Data Display:
[DataFrame via context or property]
|
v
plot-summary-sk
|
v
[willUpdate/updateDataView] --> Converts DataFrame to Google DataTable
|
v
<google-chart> --> Renders area chart
|
v
[google-chart-ready event] --> plot-summary-sk may apply cached selection
User Selecting a Range by Drawing:
User mousedowns on <plot-summary-sk> (outside existing selection in h-resizable-box-sk)
|
v
h-resizable-box-sk (action = 'draw')
|
v
User moves mouse (mousemove on window)
|
v
h-resizable-box-sk --> Updates selection box dimensions
|
v
User mouseups (mouseup on window)
|
v
h-resizable-box-sk --> Emits 'selection-changed' (with pixel coordinates)
|
v
plot-summary-sk (onSelectionChanged)
|
v
Converts pixel coordinates to data values (commit/timestamp)
|
v
Emits 'summary_selected' (with data values)
User Resizing/Moving an Existing Selection:
User mousedowns on <h-resizable-box-sk> (on edge for resize, or middle for drag)
|
v
h-resizable-box-sk (action = 'left'/'right'/'drag')
|
v
User moves mouse (mousemove on window)
|
v
h-resizable-box-sk --> Updates selection box position/dimensions
|
v
User mouseups (mouseup on window)
|
v
h-resizable-box-sk --> Emits 'selection-changed'
|
v
plot-summary-sk (onSelectionChanged) --> Converts & Emits 'summary_selected'
Programmatic Selection:
Application calls plotSummarySkElement.Select(beginHeader, endHeader)
OR Application sets plotSummarySkElement.selectedValueRange = { begin: val1, end: val2 }
|
v
plot-summary-sk
|
v
Caches selectedValueRange (important if chart not ready)
|
v
[If chart ready] --> Converts data values to pixel coordinates
|
v
Sets selectionRange on <h-resizable-box-sk>

If the chart is not ready when selectedValueRange is set, the conversion and setting of the h-resizable-box-sk selection is deferred until the google-chart-ready event fires.
The design separates the concerns of data plotting (Google Charts), interactive range selection UI (h-resizable-box-sk), and the overall orchestration and data conversion logic (plot-summary-sk). This makes the system more modular and easier to maintain. The use of LitElement and contexts allows for a reactive programming model and clean integration with other parts of the Perf application.
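For example, a page embedding the element might consume the selection like this (a sketch; the detail field names follow the description above, but the exact typing is an assumption):

```ts
const summary = document.querySelector('plot-summary-sk')!;

summary.addEventListener('summary_selected', (e: Event) => {
  // detail carries the selected range: start, end, value, and domain.
  const { start, end, domain } = (e as CustomEvent<{
    start: number;
    end: number;
    value: unknown;
    domain: string;
  }>).detail;
  console.log(`selected ${domain} range [${start}, ${end}]`);
});
```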
The point-links-sk module is a custom HTML element designed to display links associated with specific data points in a performance analysis context. These links often originate from ingestion files and can include commit details, build logs, or other relevant resources.
The primary purpose of this module is to provide users with quick access to contextual information related to a data point. It achieves this by fetching the links recorded for the relevant commits and rendering them as labeled, clickable links.
Key Responsibilities and Components:
- point-links-sk.ts: This is the core file defining the PointLinksSk custom element, which extends ElementSk from infra-sk.
  - load() method: This is the main public method responsible for initiating the process of fetching and displaying links. It takes the current commit ID, the previous commit ID, a trace ID, and arrays of keys to identify which links should be treated as commit ranges and which are general "useful links". It handles the logic for checking the cache, fetching data from the API, processing commit ranges, and updating the display (a usage sketch appears at the end of this module's description).
  - getLinksForPoint() and invokeLinksForPointApi() methods: These private methods handle the actual API interaction to retrieve link data. getLinksForPoint attempts to fetch from /_/links/ first and falls back to /_/details/?results=false if the initial attempt fails. It also includes workarounds for specific data inconsistencies (e.g., V8 and WebRTC URLs).
  - renderPointLinks() and renderRevisionLink() methods: These methods, along with the static template, use lit-html to generate the HTML structure for displaying the links.
  - Helper functions (getCommitIdFromCommitUrl, getRepoUrlFromCommitUrl, getFormattedCommitRangeText, extractUrlFromStringForFuchsia): These provide utility functions for parsing URLs and formatting text.
  - Properties (commitPosition, displayUrls, displayTexts): These store the state of the component, such as the current commit and the links to be displayed.
- point-links-sk.scss: Provides the styling for the point-links-sk element, ensuring a consistent look and feel, including styling for Material Design icons and buttons.
- index.ts: A simple entry point that imports and thereby registers the point-links-sk custom element.
- point-links-sk-demo.html & point-links-sk-demo.ts: These files set up a demonstration page for the point-links-sk element. The point-links-sk-demo.ts file uses fetch-mock to simulate the backend API, allowing developers to test the component's behavior in isolation. It demonstrates how to instantiate and use the point-links-sk element with different configurations.

Workflow for Loading and Displaying Links:
The typical workflow when the load() method is called can be visualized as:
Caller invokes pointLinksSk.load(currentCID, prevCID, traceID, rangeKeys, usefulKeys, cachedLinks)
|
V
Check if links for (currentCID, traceID) exist in `cachedLinks`
|
+-- YES --> Use cached links
| |
| V
| Render links
|
+-- NO ---> Fetch links for `currentCID` from API (`getLinksForPoint`)
|
V
If `rangeKeys` are provided:
| Fetch links for `prevCID` from API (`getLinksForPoint`)
| For each key in `rangeKeys`:
| Extract current commit hash from `currentCID` links
| Extract previous commit hash from `prevCID` links
| If hashes are different:
| Generate "commit range" URL (e.g., .../+log/prevHash..currentHash)
| Else (hashes are same):
| Use current commit URL
| Add to `displayUrls` and `displayTexts`
|
V
If `usefulKeys` are provided:
| For each key in `usefulKeys`:
| Add corresponding link from `currentCID` links to `displayUrls`
|
V
Update cache with newly fetched/generated links for (currentCID, traceID)
|
V
Render links
This module is designed to be flexible, allowing the consuming application to specify which types of links should be processed for commit ranges and which should be displayed as direct links. The inclusion of error handling (via errorMessage) and the fallback mechanism in API calls (/_/links/ then /_/details/) make it more robust.
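As a rough usage sketch, a caller might invoke the element like this; the commit IDs, trace ID, and key names are hypothetical, and the parameter order simply follows the workflow diagram above:

```ts
const pointLinks = document.querySelector('point-links-sk') as any; // typed loosely for the sketch

(async () => {
  await pointLinks.load(
    'CP_1002',                                // current commit ID
    'CP_1001',                                // previous commit ID
    ',benchmark=Speedometer,bot=linux-perf,', // trace ID
    ['V8 Git Hash'],                          // keys rendered as commit-range links
    ['Build Log'],                            // keys rendered as direct "useful" links
    {}                                        // previously cached links, if any
  );
})();
```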
The progress module provides a mechanism for initiating and monitoring the status of long-running tasks on the server. This is crucial for user experience, as it allows the client to display progress information and avoid appearing unresponsive during lengthy operations.
The core of this module is the startRequest function. This function is designed to handle asynchronous server-side processes that might take a significant amount of time to complete.
How startRequest Works:
Initiation:
- It sends an initial POST request to the startingURL with a given body. This request typically triggers the long-running task on the server.
- If a spinner-sk element is provided, it is activated to visually indicate that a process is underway.

Polling:
- Each response is expected to be a progress.SerializedProgress object, which contains:
  - status: Indicates whether the task is "Running" or "Finished" (or potentially other states like "Error").
  - messages: An array of key-value pairs providing more detailed information about the current state of the task (e.g., current step, progress percentage).
  - url: If the status is "Running", this URL is used for the next polling request to get updated progress.
  - results: If the status is "Finished", this field contains the final output of the long-running process.
- While the status is "Running", startRequest schedules a setTimeout to make a GET request to the url provided in the response after a specified period. This creates a polling loop.

Callback and Completion:
- An optional callback function can be provided. It is invoked after each successful fetch (both the initial request and every polling update), receiving the progress.SerializedProgress object. This allows the UI to update with the latest progress information.
- Polling stops when the server reports a status that is not "Running" (e.g., "Finished"). The Promise returned by startRequest then resolves with the final progress.SerializedProgress object.
- If a spinner-sk was provided, it is deactivated.

Error Handling:
- If any fetch fails (for example, due to a network error), the Promise returned by startRequest is rejected with an error.

Workflow Diagram:
Client UI                     startRequest Function                   Server
---------                     ---------------------                   ------
    |                                   |                                |
    |---- call startRequest ----------->|                                |
    |                                   |---- POST startingURL (body) -->|
    |                                   |<--- SerializedProgress --------|
    |<--- (optional) activate spinner --|                                |
    |                                   |                                |
    |         If status is "Running":   |-- schedule setTimeout(period)  |
    |                                   |---- GET progress.url --------->|
    |                                   |<--- SerializedProgress --------|
    |<--- invoke callback (update UI) --|                                |
    |                                   |  ...loop while "Running"...    |
    |                                   |                                |
    |         If status is "Finished":  |-- resolve Promise              |
    |<--- (optional) deactivate spinner |                                |
    |<--- Promise resolves -------------|                                |
    |     (SerializedProgress)          |                                |
Key Files:
progress.ts:
- startRequest: The primary function that orchestrates the entire progress monitoring flow. It encapsulates the logic for making the initial POST request and subsequent GET requests for polling. The use of a single processFetch internal function is a design choice to reduce code duplication, as the response handling logic is identical for both the initial and polling fetches.
- messagesToErrorString: A utility function designed to extract a user-friendly error message from the messages array within SerializedProgress. It prioritizes messages with the key "Error" but falls back to concatenating all messages if no specific error message is found. This ensures that some form of feedback is available even if the server doesn't explicitly flag an error.
- messagesToPreString: Formats messages for display, typically within a <pre> tag, by putting each key-value pair on a new line. This is useful for presenting detailed progress logs.
- messageByName: Allows retrieval of a specific message's value by its key from the messages array, with a fallback if the key is not found. This is useful for extracting specific pieces of information from the progress updates (e.g., the current step number).
- Dependencies: elements-sk/modules/spinner-sk is used to visually indicate that a background task is in progress, and perf/modules/json provides the progress.SerializedProgress type definition, ensuring consistency in how progress information is structured between the client and server.

progress_test.ts:
- Contains the unit tests for the progress.ts module.
- Verifies that startRequest correctly handles different server response scenarios: immediate completion, one or more polling steps, and network errors.
- Exercises the utility functions (messagesToErrorString, messageByName) with various inputs.
- Uses fetch-mock to simulate server responses, allowing for controlled testing of the asynchronous network interactions without relying on an actual backend. This is crucial for creating reliable and fast unit tests.

The design of this module prioritizes a clear separation of concerns. startRequest focuses on the communication and polling logic, while the utility functions provide convenient ways to interpret and display the progress information received from the server. The use of Promises simplifies handling asynchronous operations, and the optional callback provides flexibility for updating the UI in real-time.
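A hedged usage sketch of startRequest; the import path, endpoint, request body, and exact parameter order shown here are assumptions for illustration, not the module's documented signature:

```ts
import { startRequest } from '../progress/progress';

async function runLongTask(): Promise<void> {
  const spinner = document.querySelector('spinner-sk');

  const finished = await startRequest(
    '/_/some/long/task/start',                   // assumed endpoint that kicks off the task
    { q: 'config=8888' },                        // request body describing the work
    300,                                         // polling period in milliseconds
    spinner as any,                              // optional spinner-sk to activate/deactivate
    (p) => console.log('progress:', p.messages)  // invoked on every update
  );

  console.log('final results:', finished.results);
}
```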
The query-chooser-sk module provides a user interface element for selecting and modifying query parameters. It's designed to offer a compact way to display the currently active query and provide a mechanism to change it through a dialog.
The primary goal of query-chooser-sk is to present a summarized view of the current query and allow users to edit it in a more detailed interface. This is achieved by:
- Displaying a summary of the current query via a paramset-sk element. This gives users a quick overview of the active filters.
- Embedding query-sk in a dialog: The dialog contains a query-sk element. This is where the user can interactively build or modify their query by selecting values for different parameters.
- Showing a live match count: Next to the query-sk element, query-count-sk is used to display how many items match the currently constructed query. This provides immediate feedback to the user as they refine their selection.
- Propagating changes: query-chooser-sk listens for query-change events from the embedded query-sk element. When a change occurs, query-chooser-sk updates its own current_query and re-renders, effectively propagating the change. It also emits its own query-change event, allowing parent components to react to query modifications.

This design separates the concerns of displaying the current state from the more complex interaction of query building. The dialog provides a focused environment for query modification without cluttering the main UI.
- query-chooser-sk.ts: This is the core TypeScript file defining the QueryChooserSk custom element. It manages the summary display (paramset-sk), the query editing interface (query-sk), and the match count display (query-count-sk). It exposes the attributes current_query, paramset, key_order, and count_url, which are essential for its operation and for configuring its child elements. The _editClick and _closeClick methods handle the opening and closing of the dialog, and the _queryChange method is crucial for reacting to changes in the embedded query-sk element and updating the current_query.
- Template (within query-chooser-sk.ts): This Lit HTML template defines the structure of the element. A div with class row displays the "Edit" button and the paramset-sk summary, while a div with id dialog acts as the container for query-sk, query-count-sk, and the "Close" button. The visibility of this dialog is controlled by adding/removing the display class.
- query-chooser-sk.scss: This file provides the styling for the element. It ensures proper layout of the button, summary, and the dialog content. It also includes theming support.
- index.ts: A simple entry point that imports and registers the query-chooser-sk custom element.
- query-chooser-sk-demo.html / query-chooser-sk-demo.ts: These files provide a demonstration page for the element, showcasing its usage with sample data and event handling. fetchMock is used in the demo to simulate the count_url endpoint.
- query-chooser-sk_puppeteer_test.ts: Contains Puppeteer tests to verify the rendering and basic functionality of the element.

The typical workflow for a user interacting with query-chooser-sk is as follows:
User sees current query summary & "Edit" button
|
| (User clicks "Edit")
v
Dialog appears, showing:
  - query-sk (for selecting parameters/values)
  - query-count-sk (displaying number of matches)
  - "Close" button
|
| (User interacts with query-sk, changing selections)
v
query-sk emits "query-change" event
|
v
query-chooser-sk (_queryChange method):
  - Updates its current_query property/attribute
  - Re-renders to reflect new current_query in summary & query-count-sk
  - Emits its own "query-change" event (for parent components)
|
| (User is satisfied with the new query)
v
User clicks "Close"
|
v
Dialog is hidden
|
v
query-chooser-sk displays the updated query summary.
The paramset attribute is crucial as it provides the available keys and values that query-sk will use to render its selection interface. The key_order attribute influences the order in which parameters are displayed within query-sk. The count_url is passed directly to query-count-sk to fetch the number of matching items for the current query.
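A hypothetical page might configure the element and listen for changes roughly as follows; the paramset values, endpoint, and the shape of the event detail are assumptions for illustration:

```ts
const chooser = document.querySelector('query-chooser-sk') as any;

chooser.paramset = { config: ['8888', '565'], arch: ['x86', 'arm'] }; // available keys/values
chooser.key_order = ['config'];                                       // display-order hint for query-sk
chooser.count_url = '/_/count/';                                      // endpoint used by query-count-sk
chooser.current_query = 'config=8888';

chooser.addEventListener('query-change', (e: Event) => {
  console.log('new query:', (e as CustomEvent<{ q: string }>).detail.q);
});
```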
The query-count-sk module provides a custom HTML element designed to display the number of results matching a given query. Its primary purpose is to offer a dynamic and responsive way to inform users about the scope of their queries in real-time, without requiring a full page reload or complex UI updates. This is particularly useful in applications where users frequently refine search criteria and need immediate feedback on the impact of those changes.
The core functionality revolves around the QueryCountSk class, which extends ElementSk. This class manages the state of the displayed count, handles asynchronous data fetching, and updates the UI accordingly.
Key Components and Design Decisions:
- query-count-sk.ts: This is the heart of the module (a short usage sketch appears at the end of this section).
  - Whenever the current_query or url attributes change, the element initiates a POST request to the specified url.
  - The request body includes the current_query and a default time window of the last 24 hours (begin and end timestamps). This design choice implies that the element is typically used for querying recent data.
  - Any in-flight request is cancelled via an AbortController before a new one is issued. This is a crucial design decision for performance and responsiveness, especially when users rapidly change query parameters.
  - The server response is expected to contain a count (number of matches) and a paramset (a read-only representation of parameters related to the query).
  - The _count property stores the fetched count as a string, and _requestInProgress is a boolean flag indicating whether a fetch operation is currently active. This flag is used to show/hide a loading spinner (spinner-sk).
  - The element uses lit-html for efficient template rendering; the template displays the _count and the spinner-sk conditionally.
  - On a successful fetch, a paramset-changed custom event is dispatched carrying the paramset received from the server. This allows other components on the page to react to changes in the available parameters based on the current query results. This decoupling is a key design aspect for building modular UIs.
  - Errors are surfaced via the errorMessage utility (likely from perf/modules/errorMessage). AbortErrors are handled gracefully by simply stopping the current operation without displaying an error, as this usually means the user initiated a new action.
- query-count-sk.scss: Provides styling for the element, ensuring the count and spinner are displayed appropriately. The display: inline-block and flexbox layout for the internal div are chosen for simple alignment of the count and spinner.
- query-count-sk-demo.html and query-count-sk-demo.ts: These files provide a demonstration and testing environment for the query-count-sk element. They use fetch-mock to simulate server responses, allowing for isolated testing of the component's behavior, and show how to set its attributes (url, current_query). The inclusion of <error-toast-sk> in the demo suggests that this is the intended mechanism for displaying errors surfaced by errorMessage.
- index.ts: A simple entry point that imports and registers the query-count-sk custom element, making it available for use in an HTML page.

Workflow for Displaying Query Count:
Initialization:
- The query-count-sk element is added to the DOM.
- Its url attribute (pointing to the backend endpoint) is set.

Page                          query-count-sk
  |                                 |
  |---------(Set url)-------------->|
Query Update:
- The current_query attribute is set or updated (e.g., by user input in another part of the application).

Page                          query-count-sk
  |                                 |
  |-----(Set current_query)-------->|
Data Fetching:
- The attributeChangedCallback (or connectedCallback on initial load) triggers the _fetch() method.
- _requestInProgress is set to true, and the spinner becomes visible.
- A POST request is made to this.url with the current_query and time range.

query-count-sk                           Server
| |
|--(Set _requestInProgress=true)------>| (Spinner shows)
| |
|----(POST / {q: current_query, ...})-->|
Response Handling:
Success:
- The server returns { count: N, paramset: {...} }.
- _count is updated with N.
- _requestInProgress is set to false (spinner hides).
- A paramset-changed event is dispatched with the paramset.

query-count-sk                           Server
| |
|<----(HTTP 200, {count, paramset})----|
| |
|--(Update _count, _requestInProgress=false)-->| (Spinner hides, count updates)
| |
|--(Dispatch 'paramset-changed')------>| (Other components may react)
Error (e.g., network issue, server error):
- _requestInProgress is set to false (spinner hides).
- An error message is surfaced (e.g., via error-toast-sk).

query-count-sk                           Server
| |
|<----(HTTP Error or Network Error)----|
| |
|--(Set _requestInProgress=false)------>| (Spinner hides)
| |
|--(Display error message)------------>|
Abort:
- The catch block for AbortError is entered, and the operation stops quietly without surfacing an error.

The design emphasizes responsiveness by aborting stale requests and provides a clear visual indication of ongoing activity (the spinner). The paramset-changed event promotes loose coupling between components, allowing other parts of the application to adapt based on the query results without direct dependencies on query-count-sk's internal implementation.
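A hypothetical usage, wiring the element up and reacting to its paramset-changed event; the attribute names follow the description above, while the endpoint and query value are illustrative:

```ts
const counter = document.createElement('query-count-sk');
counter.setAttribute('url', '/_/count/');
counter.setAttribute('current_query', 'config=8888');
document.body.appendChild(counter);

counter.addEventListener('paramset-changed', (e: Event) => {
  // The detail carries the paramset the server returned for the current query.
  console.log('paramset for current query:', (e as CustomEvent).detail);
});
```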
The regressions-page-sk module provides a user interface for viewing and managing performance regressions. It allows users to select a “subscription” (often representing a team or area of ownership, like “Sheriff Config 1”) and then displays a list of detected performance anomalies (regressions or improvements) associated with that subscription.
The core functionality revolves around fetching and displaying this data in a user-friendly way.
Key Responsibilities and Components:
regressions-page-sk.ts: This is the main TypeScript file that defines the RegressionsPageSk custom HTML element.
- State Management (State interface, stateReflector): The component maintains its UI state (selected subscription, whether to show triaged items or improvements, and a flag for using a Skia-specific backend) in the state object. The stateReflector utility is crucial here: it synchronizes this internal state with the URL query parameters, so a user can bookmark a specific view (e.g., a particular subscription with improvements shown), share it, or refresh the page and return to the same state. A small sketch of this pattern appears at the end of this module's description.
  - Why stateReflector? It provides a clean way to manage application state that needs to be persistent across page loads and shareable via URLs, without manually parsing and updating the URL.
- Data Fetching (fetchRegressions, init):
  - The init method is called during component initialization and whenever the state changes significantly (like selecting a new subscription). It fetches the list of available subscriptions (sheriff lists) from either a legacy endpoint (/_/anomalies/sheriff_list) or a Skia-specific one (/_/anomalies/sheriff_list_skia) based on the state.useSkia flag. The fetched subscriptions are then sorted alphabetically for display in a dropdown.
  - The fetchRegressions method is responsible for fetching the actual anomaly data. It constructs a query based on the current state (selected subscription, filters for triaged/improvements, and a cursor for pagination). It also chooses between legacy and Skia-specific anomaly list endpoints. The fetched anomalies are then appended to the cpAnomalies array, and if a cursor is returned, a "Show More" button is made visible.
- Rendering (template, _render): The component uses lit-html for templating. The template static method defines the HTML structure, which includes a dropdown (<select id="filter">) to choose a subscription, a <subscription-table-sk> to display details about the selected subscription and its associated alerts, an <anomalies-table-sk> to display the list of anomalies/regressions, and spinners (spinner-sk) to indicate loading states. The _render() method (implicitly called by ElementSk when properties change) re-renders the component with the latest data.
- Event Handling (filterChange, triagedChange, improvementChange): These methods handle user interactions like selecting a subscription or toggling filters. They update the component's state, trigger stateHasChanged (which in turn updates the URL and can re-fetch data), and then explicitly call fetchRegressions and _render to reflect the changes.
- Legacy Regression Display (getRegTemplate, regRowTemplate): There is also code related to displaying regressions directly in a table within this component (the regressions property and getRegTemplate). However, the primary display of anomalies is delegated to anomalies-table-sk. This older regression display logic might be for a previous version or a specific use case not currently active in the demo. The isRegressionImprovement static method determines if a given regression object represents an improvement based on direction and cluster type.
subscription-table-sk (external dependency): This component displays information about the currently selected subscription, including any configured alerts. Similar to anomalies-table-sk, it receives data from regressions-page-sk.
regressions-page-sk.scss: Provides styling for the regressions-page-sk component, including colors for positive/negative changes and styles for spinners and buttons.
regressions-page-sk-demo.html and regressions-page-sk-demo.ts: These files set up a demonstration page for the regressions-page-sk component.
- regressions-page-sk-demo.ts is particularly important for understanding how the component is intended to be used and tested. It initializes a global window.perf object with configuration settings that the main component might rely on (though direct usage isn't evident in regressions-page-sk.ts itself, it's a common pattern in Perf).
- It uses fetchMock to simulate API responses for /users/login/status, /_/subscriptions, and /_/regressions (which appears to be an older endpoint pattern compared to what regressions-page-sk.ts uses). This mocking is crucial for creating a standalone demo environment.
- Why fetchMock? It allows developers to work on and test the UI component without needing a live backend, ensuring predictable data and behavior for demos and tests.

Workflow for Displaying Regressions:
Initialization (connectedCallback, init):
- The regressions-page-sk element is added to the DOM.
- stateReflector is set up to read the initial state from the URL or use defaults.
- init() is called: it fetches the subscription list and populates the dropdown (<select id="filter">).

User Selects a Subscription (filterChange):
- filterChange("Sheriff Config 2") is triggered.
- state.selectedSubscription is updated to "Sheriff Config 2".
- cpAnomalies is cleared, anomalyCursor is reset.
- stateHasChanged() is called, updating the URL (e.g., ?selectedSubscription=Sheriff%20Config%202).
- fetchRegressions() is called.

Fetching Anomalies (fetchRegressions):
- A request is made to /_/anomalies/anomaly_list?sheriff=Sheriff%20Config%202 (or the Skia equivalent).

Displaying Anomalies:
- The fetched anomalies are appended to this.cpAnomalies.
- subscriptionTable is updated with subscription details and alerts from the response.
- anomaliesTable (the anomalies-table-sk instance) is populated with this.cpAnomalies.

User Action Component State API Interaction UI Update
----------- --------------- --------------- ---------
Page Load
|
V
regressions-page-sk.init()
| state = {selectedSubscription:''}
V
fetch('/_/anomalies/sheriff_list') -> ["Sheriff1", "Sheriff2"]
| subscriptionList = ["Sheriff1", "Sheriff2"]
V
Populate dropdown
Disable filter buttons
Selects "Sheriff1"
|
V
regressions-page-sk.filterChange("Sheriff1")
| state = {selectedSubscription:'Sheriff1', ...}
| (URL updates via stateReflector)
V
regressions-page-sk.fetchRegressions()
| anomaliesLoadingSpinner = true
V
fetch('/_/anomalies/anomaly_list?sheriff=Sheriff1') -> {anomaly_list: [...], anomaly_cursor: 'cursor123'}
| cpAnomalies = [...], anomalyCursor = 'cursor123', showMoreAnomalies = true
| anomaliesLoadingSpinner = false
V
Update anomaliesTable
Update subscriptionTable
Show "Show More" button
Enable filter buttons
Clicks "Show More"
|
V
regressions-page-sk.fetchRegressions() (called by button click)
| showMoreLoadingSpinner = true
V
fetch('/_/anomalies/anomaly_list?sheriff=Sheriff1&anomaly_cursor=cursor123') -> {anomaly_list: [more...], anomaly_cursor: null}
| cpAnomalies = [all...], anomalyCursor = null, showMoreAnomalies = false
| showMoreLoadingSpinner = false
V
Update anomaliesTable (append)
Hide "Show More" button
Toggling Filters (e.g., “Show Triaged”, triagedChange):
- triagedChange() is triggered.
- state.showTriaged is toggled.
- stateHasChanged() updates the URL (e.g., ?selectedSubscription=Sheriff%20Config%202&showTriaged=true).
- fetchRegressions() is called again, this time with triaged=true in the query.

The design separates concerns: regressions-page-sk handles overall page logic, state, and orchestration of data fetching, while specialized components like anomalies-table-sk and subscription-table-sk handle the rendering of specific data views. The use of stateReflector ensures the UI state is bookmarkable and shareable. The demo files with fetchMock are critical for isolated development and testing of the UI component.
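A minimal sketch of the stateReflector pattern described above; the State fields mirror this section, while the import paths and exact typings are assumptions:

```ts
import { stateReflector } from '../../../infra-sk/modules/stateReflector';
import { HintableObject } from '../../../infra-sk/modules/hintable';

interface State {
  selectedSubscription: string;
  showTriaged: boolean;
  showImprovements: boolean;
}

let state: State = { selectedSubscription: '', showTriaged: false, showImprovements: false };

// stateReflector serializes `state` into the URL query parameters and restores
// it on page load or when the user navigates with back/forward.
const stateHasChanged = stateReflector(
  () => state as unknown as HintableObject,
  (restored) => {
    state = restored as unknown as State;
    // re-fetch regressions for the restored state here
  }
);

// After a user interaction mutates the state, reflect it back into the URL.
state.showTriaged = true;
stateHasChanged();
```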
The report-page-sk module is designed to display a detailed report page for performance anomalies. Its primary purpose is to provide users with a comprehensive view of selected anomalies, including their associated graphs and commit information, facilitating the analysis and understanding of performance regressions or improvements.
At its core, the report-page-sk element orchestrates the display of several key pieces of information. It fetches anomaly data from a backend endpoint (/_/anomalies/group_report) based on URL parameters (like revision, anomaly IDs, bug ID, etc.). This data is then used to populate an anomalies-table-sk element, which presents a tabular view of the anomalies.
A crucial design decision is the use of an AnomalyTracker class. This class is responsible for managing the state of each anomaly, including whether it's selected (checked) by the user, its associated graph, and the relevant time range for graphing. This separation of concerns keeps the main ReportPageSk class cleaner and focuses its responsibilities on rendering and user interaction.
When an anomaly is selected in the table, report-page-sk dynamically generates and displays an explore-simple-sk graph for that anomaly. The explore-simple-sk element is configured to show data around the anomaly's occurrence, typically a week before and after, to provide context. If multiple anomalies are selected, their graphs are displayed, and their heights are adjusted to fit the available space. A key feature is the synchronized X-axis across all displayed graphs, ensuring a consistent time scale for comparison.
The page also attempts to identify and display common commits related to the selected anomalies. It fetches commit details using the lookupCids function and highlights commits that appear to be “roll” commits (e.g., “Roll repo from hash to hash”). For these roll commits, it provides a link to the underlying commit or the parent commit if the roll pattern is not directly parseable from the commit message, which can be helpful for developers to trace the source of a change.
Key Components and Responsibilities:
report-page-sk.ts: This is the main TypeScript file defining the ReportPageSk custom element.
- ReportPageSk class:
  - Fetches defaults (/_/defaults/) and then anomaly data based on URL parameters.
  - Uses an AnomalyTracker instance to manage the state of individual anomalies (selected, graphed, time range).
  - Renders the anomalies-table-sk and explore-simple-sk graphs based on user interactions and fetched data. It uses the lit-html library for templating.
  - Handles anomalies_checked events from the anomalies-table-sk to update the displayed graphs. It also handles x-axis-toggled events from explore-simple-sk to synchronize the x-axis across multiple graphs.
  - For each selected anomaly, creates an explore-simple-sk instance, configures its query based on the anomaly's test path, and sets the appropriate time range.
  - Shows a spinner (spinner-sk) during data fetching operations.
- AnomalyTracker class: Manages a collection of AnomalyDataPoint objects, each containing an Anomaly, its checked status, its associated ExploreSimpleSk graph instance (if any), and its Timerange.
- AnomalyDataPoint interface: Defines the structure for storing information about a single anomaly within the AnomalyTracker.
- report-page-sk.scss: Contains the SASS/CSS styles for the report-page-sk element, including styling for the common commits section and the dialog for displaying all commits (though the dialog itself is not fully implemented in the provided showAllCommitsTemplate).
Data Fetching Workflow:
- The ReportPageSk element is connected to the DOM.
- URL parameters (rev, anomalyIDs, bugID) are read.
- fetchAnomalies() is called: it issues a POST to /_/anomalies/group_report with the URL parameters in the body.
- The response provides anomaly_list, timerange_map, and selected_keys.
- The AnomalyTracker is loaded with this data.
- anomalies-table-sk is populated.

User Interaction Workflow (Selecting an Anomaly):
- The user checks an anomaly's checkbox in anomalies-table-sk.
- anomalies-table-sk fires an anomalies_checked custom event with the anomaly and its checked state.
- ReportPageSk listens for this event and calls updateGraphs():
  - If the anomaly is checked and has no graph yet, addGraph() is called: an explore-simple-sk instance is created and configured, and the AnomalyTracker is updated with the new graph instance.
  - If the anomaly is unchecked, the graph is removed and the AnomalyTracker is updated to remove the graph reference.
  - updateChartHeights() is called to adjust the height of all visible graphs.

The design emphasizes dynamic content loading and interactive exploration. By using separate custom elements for the table (anomalies-table-sk) and graphs (explore-simple-sk), the module maintains a good separation of concerns and leverages reusable components. The AnomalyTracker further enhances this by encapsulating the state and logic related to individual anomalies.
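To make the AnomalyTracker idea described above concrete, here is a minimal, hedged sketch; the field and method names are illustrative rather than the module's actual API:

```ts
interface AnomalyDataPoint<A, G> {
  anomaly: A;
  checked: boolean;
  graph: G | null;
  timerange: { begin: number; end: number };
}

class AnomalyTracker<A, G> {
  private byId = new Map<string, AnomalyDataPoint<A, G>>();

  load(id: string, anomaly: A, timerange: { begin: number; end: number }): void {
    this.byId.set(id, { anomaly, checked: false, graph: null, timerange });
  }

  setChecked(id: string, checked: boolean, graph: G | null = null): void {
    const entry = this.byId.get(id);
    if (!entry) return;
    entry.checked = checked;
    entry.graph = checked ? graph : null; // drop the graph reference when unchecked
  }

  checkedEntries(): AnomalyDataPoint<A, G>[] {
    return Array.from(this.byId.values()).filter((e) => e.checked);
  }
}
```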
The revision-info-sk custom HTML element is designed to display information about anomalies detected around a specific revision. This is particularly useful for understanding the impact of a code change on performance metrics.
The core functionality revolves around fetching and presenting RevisionInfo objects. A RevisionInfo object contains details like the benchmark, bot, bug ID, start and end revisions of an anomaly, the associated test, and links to explore the anomaly further.
Key Components and Workflow:
revision-info-sk.ts: This is the main TypeScript file defining the RevisionInfoSk element.
State Management: The element maintains its state in a State object, primarily storing the revisionId. It utilizes stateReflector from infra-sk/modules/statereflector to keep the URL in sync with the element‘s state. This allows users to share links that directly open to a specific revision’s information.
- URL change -> stateReflector updates State.revisionId -> getRevisionInfo() is called.
- User types a revision ID and clicks "Get Revision Information" -> State.revisionId updated -> stateReflector updates URL -> getRevisionInfo() is called.

Data Fetching (getRevisionInfo): When a revision ID is provided (either via URL or user input), this method is triggered.
- It shows a spinner (spinner-sk) to indicate loading.
- It makes a fetch request to the /_/revision/?rev=<revisionId> endpoint (a brief sketch of this call appears after the design notes below).
- The response, a list of RevisionInfo objects, is parsed using jsonOrThrow.
- The revisionInfos are stored, and the UI is re-rendered to display the information.

Rendering (template, getRevInfosTemplate, revInfoRowTemplate): Lit-html is used for templating.
- The main template includes an input field for the revision ID, a button to trigger fetching, a spinner, and a container for the revision information.
- getRevInfosTemplate generates an HTML table if revisionInfos is populated. This table includes a header row with a "select all" checkbox and columns for bug ID, revision range, master, bot, benchmark, and test.
- revInfoRowTemplate renders each individual RevisionInfo as a row in the table. Each row has a checkbox for selection, a link to the bug (if any), a link to explore the anomaly, and the other relevant details.

Multi-Graph Functionality: The element allows users to select multiple detected anomaly ranges and view them together on a multi-graph page.
- Checkboxes (checkbox-sk) are provided for each revision info row, plus a "select all" checkbox. The toggleSelectAll method handles the logic for the master checkbox.
- updateMultiGraphStatus: This method is called whenever a checkbox state changes. It checks if any revisions are selected and enables/disables the "View Selected Graph(s)" button accordingly. It also updates the selectAll state if no individual revisions are checked.
- getGraphConfigs: This helper function takes an array of selected RevisionInfo objects and transforms them into an array of GraphConfig objects. Each GraphConfig contains the query string associated with the anomaly.
- getMultiGraphUrl: This asynchronous method constructs the URL for the multi-graph view. It calls getGraphConfigs to get the configurations for the selected revisions, then calls updateShortcut (from explore-simple-sk) to generate a shortcut ID for the combined graph configurations; this typically involves a POST request to /_/shortcut/update. It computes a time range (begin and end timestamps) encompassing all selected anomalies and collects the anomaly_ids from the selected revisions to highlight them on the multi-graph page. The final URL includes the begin and end timestamps, the shortcut ID, the totalGraphs, and highlight_anomalies parameters.
- viewMultiGraph: This method is called when the "View Selected Graph(s)" button is clicked. It collects the selected RevisionInfo objects, calls getMultiGraphUrl to generate the redirect URL, and, if successful, navigates (window.open(url, '_self')) to the multi-graph page. If not, it displays an error message.

Styling (revision-info-sk.scss): Provides basic styling for the element, such as left-aligning table headers and styling the spinner.
index.ts: Simply imports and thereby registers the revision-info-sk custom element.
Demo Page (revision-info-sk-demo.html, revision-info-sk-demo.ts, revision-info-sk-demo.scss):
- These files provide a demonstration page for the revision-info-sk element.
- The revision-info-sk-demo.ts file uses fetch-mock to mock the /_/revision/ API endpoint. This is crucial for demonstrating the element's functionality without needing a live backend. When the demo page loads and the user interacts with the element (e.g., enters a revision ID '12345'), the mocked response is returned.

Design Decisions and Rationale:
- Custom Element: Encapsulating this functionality in a custom element (<revision-info-sk>) promotes reusability across different parts of the Perf application or potentially other Skia web applications.
- URL State: stateReflector enhances the user experience by allowing direct navigation to a revision's details via URL and updating the URL as the user interacts with the element. This makes sharing and bookmarking specific views straightforward.
- Async/Await: Using async/await makes the code easier to read and manage compared to traditional Promise chaining.
- Multi-Graph URL Construction: The logic for building the multi-graph URL is concentrated in getMultiGraphUrl. This separates concerns and makes the process of generating the complex URL clearer. It relies on the explore-simple-sk module's updateShortcut function, promoting reuse of existing shortcut generation logic.
- Error Handling: jsonOrThrow is used to simplify error handling for fetch requests. The viewMultiGraph method also includes basic error handling if the URL generation fails.
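A sketch of the fetch performed by getRevisionInfo(); the helper name, import path, and return typing are assumptions for illustration:

```ts
import { jsonOrThrow } from '../../../infra-sk/modules/jsonOrThrow';

async function fetchRevisionInfo(revisionId: string): Promise<unknown[]> {
  const resp = await fetch(`/_/revision/?rev=${encodeURIComponent(revisionId)}`);
  // Resolves to the list of RevisionInfo objects, or throws on a non-OK response.
  return jsonOrThrow(resp);
}
```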
User Interaction / URL Change
|
v
[revision-info-sk] stateReflector updates internal 'state.revisionId'
|
v
[revision-info-sk] getRevisionInfo() called
|
+--------------------------------+
| |
v v
[revision-info-sk] shows spinner [revision-info-sk] makes fetch request to `/_/revision/?rev=<ID>`
| |
| v
| [Backend] processes request, returns RevisionInfo[]
| |
| v
+------------------> [revision-info-sk] receives JSON response, parses with jsonOrThrow
|
v
[revision-info-sk] stores 'revisionInfos', hides spinner
|
v
[revision-info-sk] re-renders using Lit-html templates to display table
Workflow for Viewing Multi-Graph:
User selects one or more revision info rows (checkboxes)
|
v
[revision-info-sk] updateMultiGraphStatus() enables "View Selected Graph(s)" button
|
v
User clicks "View Selected Graph(s)" button
|
v
[revision-info-sk] viewMultiGraph() called
|
v
[revision-info-sk] collects selected RevisionInfo objects
|
v
[revision-info-sk] calls getMultiGraphUrl(selectedRevisions)
|
+------------------------------------------------------+
| |
v v
[getMultiGraphUrl] calls getGraphConfigs() to create GraphConfig[] [getMultiGraphUrl] calls updateShortcut(GraphConfig[])
| | (makes POST to /_/shortcut/update)
| v
| [Backend] returns shortcut ID
| |
+-------------------------------------> [getMultiGraphUrl] constructs final URL (with begin, end, shortcut, anomaly IDs)
|
v
[viewMultiGraph] receives the multi-graph URL
|
v
[Browser] navigates to the generated multi-graph URL
The split-chart-menu-sk module provides a user interface element for selecting an attribute by which to split a chart. This is particularly useful in data visualization scenarios where users need to break down aggregated data into smaller, more specific views. For example, in a performance monitoring dashboard, a user might want to see performance metrics split by benchmark, specific test case (story), or sub-component (subtest).
The core functionality revolves around presenting a list of available attributes to the user in a dropdown menu. These attributes are dynamically derived from the underlying data. When an attribute is selected, the component emits an event, allowing other parts of the application to react and update the chart display accordingly.
Key Components and Design:
split-chart-menu-sk.ts: This is the main TypeScript file that defines the SplitChartMenuSk LitElement.
- It uses Lit's context API (@consume) to access data from two sources: dataframeContext and dataTableContext.
  - dataframeContext provides the DataFrame (from //perf/modules/json:index_ts_lib and //perf/modules/dataframe:dataframe_context_ts_lib). The DataFrame is the source from which the list of available attributes for splitting is derived. This design decouples the menu from the specifics of data fetching and management, allowing it to focus solely on the UI aspect of attribute selection. The getAttributes function (from //perf/modules/dataframe:traceset_ts_lib) is used to extract these attributes.
  - dataTableContext provides DataTable (also from //perf/modules/dataframe:dataframe_context_ts_lib). While consumed, its direct usage within this specific component's rendering logic isn't immediately apparent in the provided render method, but it might be used by other parts of the application or for future enhancements.
- A Material Design button (<md-outlined-button>) labeled "Split By" serves as the trigger to open the menu.
- The menu itself is a Material Design menu (<md-menu>), populated with <md-menu-item> elements, one for each attribute retrieved from the DataFrame.
- The menuOpen state property controls the visibility of the menu. Clicking the button toggles this state. The menu also closes itself via the @closed event.
- When a menu item is clicked, the bubbleAttribute method is called. This method dispatches a custom event named split-chart-selection (an example listener appears at the end of this module's description).
  - The event detail (SplitChartSelectionEventDetails) contains the selected attribute (a string).
  - The event is configured to bubble (bubbles: true) and pass through shadow DOM boundaries (composed: true), making it easy for ancestor elements to listen and react to the selection. This event-driven approach is crucial for decoupling the menu from the chart component or any other component that needs to know about the selected split attribute.
- Styling is kept in split-chart-menu-sk.css.ts (style). This keeps the component's presentation concerns separate from its logic. The styles ensure the component is displayed as an inline block and set a default background color, also styling the Material button.

split-chart-menu-sk.css.ts: This file defines the CSS styles for the component using Lit's css tagged template literal. The primary styling focuses on the host element's positioning and background, and customizing the Material Design button's border radius.
index.ts: This file simply imports and registers the split-chart-menu-sk custom element, making it available for use in HTML.
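A parent element typically listens for this event. A minimal, hypothetical consumer might look like the following sketch; the chart-update call is a placeholder, only the event name and detail shape come from the description above:

```
// Hypothetical consumer of the split-chart-selection event emitted by
// split-chart-menu-sk.
interface SplitChartSelectionEventDetails {
  attribute: string; // e.g. "benchmark"
}

const menu = document.querySelector('split-chart-menu-sk')!;
menu.addEventListener('split-chart-selection', (e: Event) => {
  const detail = (e as CustomEvent<SplitChartSelectionEventDetails>).detail;
  // React to the selection, e.g. re-group the chart data by this attribute.
  console.log(`Split chart by: ${detail.attribute}`);
});
```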
Workflow: Selecting a Split Attribute
Initialization:
- The split-chart-menu-sk component is rendered.
- It consumes the DataFrame from the dataframeContext.
- The getAttributes() method is called (implicitly via the render method's map function) to populate the list of attributes for the menu.

User Interaction:
- The user clicks the "Split By" button; the menuClicked handler is invoked -> this.menuOpen becomes true.
- The <md-menu> component becomes visible, displaying the list of attributes.

User                      split-chart-menu-sk          DataFrame
 |                                |                        |
 |---Clicks "Split By"----------->|                        |
 |                                |---Toggles menuOpen=true |
 |                                |                        |
 |<--Displays Menu----------------|                        |
Attribute Selection:
- User clicks on an attribute in the menu (e.g., "benchmark").
- The `click` handler on `<md-menu-item>` calls
`this.bubbleAttribute("benchmark")`.
- `bubbleAttribute` creates a `CustomEvent('split-chart-selection', { detail: { attribute: "benchmark" } })`.
- The event is dispatched.
```
User split-chart-menu-sk (Parent Component)
| | |
|---Clicks "benchmark"->| |
| |---Calls bubbleAttribute("benchmark")-->|
| | |
| |---Dispatches "split-chart-selection" event--> (Listens for event)
| | | |
| | | |---Handles event, updates chart
```
Menu Closes:
- The <md-menu> component emits a closed event.
- The menuClosed handler is invoked -> this.menuOpen becomes false.

This design ensures that split-chart-menu-sk is a self-contained, reusable UI component whose sole responsibility is to provide a way to select a splitting attribute and communicate that selection to the rest of the application via a well-defined event. The use of context for data consumption and custom events for output makes it highly decoupled and easy to integrate.
The demo page (split-chart-menu-sk-demo.html and split-chart-menu-sk-demo.ts) demonstrates how to use the component and listen for the split-chart-selection event. The Puppeteer test (split-chart-menu-sk_puppeteer_test.ts) provides a basic smoke test and a visual regression test by taking a screenshot.
The subscription-table-sk module provides a custom HTML element designed to display information about a “subscription” and its associated “alerts”. This is particularly useful in contexts where users need to understand the configuration of automated monitoring or alerting systems.
The core functionality is encapsulated within the subscription-table-sk.ts file, which defines the SubscriptionTableSk custom element. This element is built using Lit, a library for creating fast, lightweight web components.
Why and How:
The primary goal is to present complex subscription and alert data in a user-friendly and interactive manner. Instead of a static display, this component allows for toggling the visibility of the detailed alert configurations. This design choice avoids overwhelming the user with too much information upfront, providing a cleaner initial view focused on the subscription summary.
The SubscriptionTableSk element takes Subscription and Alert[] objects as input. The Subscription object contains general information like name, contact email, revision, bug tracking details (component, hotlists, priority, severity, CC emails). The Alert[] array holds detailed configurations for individual alerts, including their query parameters, step algorithm, radius, and other specific settings.
Key Responsibilities and Components:
- subscription-table-sk.ts:
  - SubscriptionTableSk class: This is the heart of the module. It extends ElementSk, a base class for Skia custom elements.
  - It stores the subscription and alerts data internally.
  - Rendering (template static method): It uses Lit's html tagged template literal to define the structure and content of the element. It conditionally renders the subscription details and the alerts table based on the available data and the showAlerts state. The alerts table is only rendered when showAlerts is true; this state is toggled by a button.
  - load(subscription: Subscription, alerts: Alert[]) method: This public method is the primary way to feed data into the component. It updates the internal state and triggers a re-render.
  - toggleAlerts() method: This method flips the showAlerts boolean flag and triggers a re-render, effectively showing or hiding the alerts table.
  - formatRevision(revision: string) method: A helper function to display the revision string as a clickable link, pointing to a specific configuration file URL. This improves usability by allowing users to quickly navigate to the source of the configuration.
  - paramset-sk integration: For displaying the alert query, it utilizes the paramset-sk element. The toParamSet utility function (from infra-sk/modules/query) is used to convert the query string into a format suitable for paramset-sk, which then renders it as a structured set of key-value pairs. This enhances readability of complex query strings.
- Styling (subscription-table-sk.scss): This file defines the visual appearance of the element. It uses SCSS and imports styles from shared libraries (themes_sass_lib, buttons_sass_lib, select_sass_lib) to maintain a consistent look and feel with other Skia elements. The styles focus on clear presentation of information, with distinct sections for subscription details and the alerts table.

Workflow: Displaying Subscription and Alerts
- The subscription-table-sk element is added to the DOM: <subscription-table-sk></subscription-table-sk>
- External code calls the load() method on the element instance, passing in the Subscription object and an array of Alert objects: element.load(mySubscriptionData, myAlertsData);
- The SubscriptionTableSk element updates its internal subscription and alerts properties. showAlerts is set to false by default upon loading new data.
- The _render() method is called (implicitly by Lit or explicitly), and the template function generates the HTML: the subscription details are shown, while the alerts table is omitted because showAlerts is false.
- Clicking the toggle button fires a click event that triggers the toggleAlerts() method:
  - showAlerts becomes true.
  - _render() is called again.
  - The template function now also renders the <table id="alerts-table">.
  - The table header is displayed.
  - For each Alert object in ele.alerts:
    - A table row (<tr>) is created.
    - Cells (<td>) display various alert properties (step algorithm, radius, k, etc.).
    - The alert query is passed to a <paramset-sk> element for structured display.
  - The button label changes to "Hide Alert Configurations".

Diagram: Data Flow and Rendering
External Code ---> subscriptionTableSkElement.load(subscription, alerts)
|
V
SubscriptionTableSk Internal State:
- this.subscription = subscription
- this.alerts = alerts
- this.showAlerts = false (initially or after load)
|
V
_render() ------> Lit Template Evaluation
|
-------------------------------------
| |
V (if this.subscription is not null) V (if this.showAlerts is true)
Render Subscription Details Render Alerts Table
- Name, Email, Revision (formatted link) - Iterate through this.alerts
- Bug info, Hotlists, CCs - For each alert:
- "Show/Hide Alerts" Button - Display properties in <td>
- Use <paramset-sk> for query
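Feeding data into the element from the embedding page reduces to two calls. A minimal, type-loose sketch, assuming the element is already registered and the data variables are placeholders:

```
// Minimal usage sketch for subscription-table-sk. The concrete Subscription
// and Alert shapes live in //perf/modules/json; the data here is assumed to
// exist elsewhere in the page.
declare const mySubscriptionData: unknown;
declare const myAlertsData: unknown[];

const table = document.querySelector('subscription-table-sk') as HTMLElement & {
  load(subscription: unknown, alerts: unknown[]): void;
  toggleAlerts(): void;
};

table.load(mySubscriptionData, myAlertsData); // initial render, alerts hidden
table.toggleAlerts(); // reveals the alerts table
```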
Demo Page (subscription-table-sk-demo.html, subscription-table-sk-demo.ts)
The demo page serves as an example and testing ground.
- subscription-table-sk-demo.html: Sets up the basic HTML structure, including instances of subscription-table-sk (one for light mode, one for dark mode to test theming) and buttons to interact with them. It also includes an error-toast-sk for displaying potential errors.
- subscription-table-sk-demo.ts: Contains JavaScript to:
  - import and register the subscription-table-sk element,
  - define sample Subscription and Alert data,
  - call the load() method on the subscription-table-sk instances with the sample data,
  - and wire the demo buttons to the toggleAlerts() method on the instances.

This setup allows developers to see the component in action and verify its functionality with predefined data.
The test-picker-sk module provides a custom HTML element, <test-picker-sk>, designed to guide users in selecting a valid trace or test for plotting. It achieves this by presenting a series of dependent input fields, where the options available in each field are dynamically updated based on selections made in previous fields. This ensures that users can only construct valid combinations of parameters.
Core Functionality and Design:
The primary goal of test-picker-sk is to simplify the process of selecting a specific data series (a “trace” or “test”) from a potentially large and complex dataset. This is often necessary in performance analysis tools where data is categorized by multiple parameters (e.g., benchmark, bot, specific test, sub-test variations).
The design enforces a specific order for filling out these parameters. This hierarchical approach is crucial because the valid options for a parameter often depend on the values chosen for its preceding parameters.
Key Components and Responsibilities:
test-picker-sk.ts: This is the heart of the module, defining the TestPickerSk custom element.
- FieldInfo class: This internal class is a simple data structure used to manage the state of each individual input field within the picker. It stores a reference to the PickerFieldSk element, the parameter name (e.g., "benchmark", "bot"), and the currently selected value.
- Dynamic field creation (addChildField): When a value is selected in a field, and if there are more parameters in the hierarchy, a new PickerFieldSk input is dynamically added to the UI. The options for this new field are fetched from the backend. This progressive disclosure prevents overwhelming the user with too many options at once.
- Backend interaction (callNextParamList): The element interacts with a backend endpoint (/_/nextParamList/). This endpoint returns the valid options for the next parameter in the hierarchy, given the selections made so far, along with the count of traces matching the current query (a sketch of this request/response cycle follows the file list below).
- Internal state (_fieldData, _currentIndex): The _fieldData array holds FieldInfo objects for each parameter field. _currentIndex tracks which field is currently active or the next to be added.
- Events (value-changed, plot-button-clicked):
  - It listens for value-changed events from its child picker-field-sk elements. When a value changes, it triggers logic to update subsequent fields and the match count.
  - It emits a plot-button-clicked custom event when the user clicks the "Add Graph" button. This event includes the fully constructed query string representing the selected trace.
- Populating from a query (populateFieldDataFromQuery): This method allows the picker to be initialized with a pre-existing query string. It will populate the fields sequentially based on the query parameters. If a parameter in the hierarchy is missing or empty in the query, the population stops at that point.
- Plot gating (onPlotButtonClick, PLOT_MAXIMUM): The "Add Graph" button is enabled only when the number of matching traces is within a manageable range (greater than 0 and less than or equal to PLOT_MAXIMUM). This prevents users from attempting to plot an overwhelming number of traces.
- picker-field-sk (Dependency): While not part of this module, test-picker-sk heavily relies on the picker-field-sk element. Each parameter in the test picker is represented by an instance of picker-field-sk. This child component is responsible for displaying a label, an input field, and a dropdown menu of selectable options.
test-picker-sk.scss: Defines the visual styling for the test-picker-sk element and its internal components, ensuring a consistent look and feel. It styles the layout of the fields, the match count display, and the plot button.
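As referenced above, the exchange with /_/nextParamList/ can be sketched as follows. The request body field ("q") and response fields ("paramset", "count") are inferred from the workflow diagrams below, not taken from the handler's source:

```
// Sketch of the request/response cycle between test-picker-sk and the
// /_/nextParamList/ endpoint. Field names are assumptions based on the
// diagrams in this section.
interface NextParamListResponse {
  paramset: { [key: string]: string[] };
  count: number;
}

async function nextParamList(query: string): Promise<NextParamListResponse> {
  const resp = await fetch('/_/nextParamList/', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ q: query }),
  });
  if (!resp.ok) {
    throw new Error(`nextParamList failed: ${resp.status}`);
  }
  return resp.json();
}

// e.g. options for "bot" once benchmark=b1 has been chosen:
// const { paramset, count } = await nextParamList('benchmark=b1');
```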
Workflow: User Selecting a Test
Initialization (initializeTestPicker):
- test-picker-sk is given an ordered list of parameter names (e.g., ['benchmark', 'bot', 'test']) and optional default parameters.
- test-picker-sk -> Backend (/_/nextParamList/): Requests options for the first parameter (e.g., "benchmark") with an empty query.

User Interface / Backend:
[test-picker-sk]
|
initializeTestPicker(['benchmark', 'bot', 'test'], {})
|
---> POST /_/nextParamList/ (q="")
|
(Processes request, queries data source)
|
<--- {paramset: {benchmark: ["b1", "b2"]}, count: 100}
|
(Renders first PickerFieldSk for "benchmark" with options "b1", "b2")
[Benchmark: [select ▼]] [Matches: 100] [Add Graph (disabled)]
User Selects a Value:
- The picker-field-sk for "benchmark" emits a value-changed event.
- test-picker-sk -> Backend: Requests options for the next parameter ("bot"), now including the selection benchmark=b1 in the query.

User Interface:
[Benchmark: [b1 ▼]]
| (value-changed: {value: "b1"})
[test-picker-sk]
|
---> POST /_/nextParamList/ (q="benchmark=b1")
|
(Processes request, filters based on benchmark=b1)
|
<--- {paramset: {bot: ["botX", "botY"]}, count: 20}
|
(Renders PickerFieldSk for "bot" with options "botX", "botY")
[Benchmark: [b1 ▼]] [Bot: [select ▼]] [Matches: 20] [Add Graph (disabled)]
Process Repeats: This continues for each parameter in the hierarchy.
Final Selection and Plotting:
- Once the match count is greater than 0 and no more than PLOT_MAXIMUM, the "Add Graph" button enables.
- On click, test-picker-sk emits plot-button-clicked with the final query (e.g., benchmark=b1&bot=botX&test=testZ).

User Interface:
[Benchmark: [b1 ▼]] [Bot: [botX ▼]] [Test: [testZ ▼]] [Matches: 5] [Add Graph (enabled)]
| (User clicks "Add Graph")
[test-picker-sk]
|
emits 'plot-button-clicked' (detail: {query: "benchmark=b1&bot=botX&test=testZ"})
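A hypothetical parent page could consume the event like this; addGraphForQuery is a placeholder for whatever graphing code the page uses, while the event name and detail shape come from the description above:

```
// Hypothetical listener for the plot-button-clicked event from test-picker-sk.
const picker = document.querySelector('test-picker-sk')!;
picker.addEventListener('plot-button-clicked', (e: Event) => {
  const query = (e as CustomEvent<{ query: string }>).detail.query;
  // e.g. "benchmark=b1&bot=botX&test=testZ" -- hand it to the graphing code.
  addGraphForQuery(query);
});

// Placeholder for whatever the embedding page does with the query.
declare function addGraphForQuery(query: string): void;
```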
Why this Approach?
The test-picker-sk-demo.html and test-picker-sk-demo.ts files provide a runnable example of the component, mocking the backend /_/nextParamList/ endpoint to showcase its functionality without needing a live backend. This is essential for development and testing. The Puppeteer and Karma tests (test-picker-sk_puppeteer_test.ts, test-picker-sk_test.ts) ensure the component behaves as expected under various conditions.
The /modules/themes module is responsible for defining the visual styling and theming for the application. It builds upon the base theming provided by infra-sk and introduces application-specific overrides and additions.
Why and How:
The primary goal of this module is to establish a consistent and branded look and feel across the application. Instead of defining all styles from scratch, it leverages the infra-sk theming library as a foundation. This promotes code reuse and ensures that common UI elements have a familiar appearance.
The approach taken is to:
- The themes.scss file begins by importing the core styles from ../../../infra-sk/themes. This brings in the foundational design system, including color palettes, typography, spacing, and component styles.
- It also imports the Material Icons font stylesheet (https://fonts.googleapis.com/icon?family=Material+Icons). This makes a wide range of standard icons readily available for use within the application's UI.
- It then layers application-specific overrides and additions on top of the infra-sk theme and global changes from elements-sk components. This means that themes.scss focuses on styling aspects that are unique to this specific application or require modifications to the default infra-sk appearance.

Key Components and Files:
themes.scss: This is the central SCSS (Sassy CSS) file for the module.
- @import '../../../infra-sk/themes';: This line incorporates the foundational theme from the infra-sk library. The relative path indicates that infra-sk is expected to be a sibling or ancestor directory in the project structure.
- @import url('https://fonts.googleapis.com/icon?family=Material+Icons');: This directive pulls in the Material Icons font stylesheet, enabling the use of standard Google Material Design icons throughout the application.
- body { margin: 0; padding: 0; }: This is an example of a global override. It resets the default browser margins and padding on the <body> element, providing a cleaner baseline for layout. This is a common practice to ensure consistent spacing across different browsers. Other application-specific styles would follow this pattern, targeting specific elements or defining new CSS classes.

BUILD.bazel: This file defines how the themes.scss file is processed and made available to the rest of the application.
- It uses the sass_library rule (defined in //infra-sk:index.bzl) to compile the SCSS into CSS and declare it as a reusable library.
- load("//infra-sk:index.bzl", "sass_library"): Imports the necessary Bazel rule for handling SASS compilation.
- sass_library(name = "themes_sass_lib", ...): Defines a SASS library target named themes_sass_lib.
- srcs = ["themes.scss"]: Specifies that themes.scss is the source file for this library.
- visibility = ["//visibility:public"]: Makes this compiled CSS library accessible to any other part of the project.
- deps = ["//infra-sk:themes_sass_lib"]: Declares a dependency on the infra-sk SASS library. This is crucial because themes.scss imports styles from infra-sk. The build system needs to know about this dependency to ensure infra-sk styles are available during the compilation of themes.scss.

Workflow (Styling Application):
Browser Request --> HTML Document
|
v
Link to Compiled CSS (from themes_sass_lib)
|
v
Application of Styles:
1. Base browser styles
2. infra-sk/themes.scss styles (imported)
3. Material Icons styles (imported)
4. modules/themes/themes.scss overrides & additions (applied last, taking precedence)
|
v
Rendered Page with Application-Specific Theme
In essence, this module provides a layered approach to theming. It starts with a robust base, incorporates external resources like icon fonts, and then applies specific customizations to achieve the desired visual identity for the application. The BUILD.bazel file ensures that these SASS files are correctly processed and made available as CSS to the application during the build process.
This module provides a mechanism for formatting trace details and converting trace strings into query strings. The core idea is to offer a flexible way to represent and interpret trace information, accommodating different formatting conventions, particularly for Chrome-specific trace structures.
The “why” behind this module stems from the need to handle various trace formats. Different systems or parts of the application might represent trace identifiers (which are essentially a collection of parameters) in distinct ways. This module centralizes the logic for translating between these representations. For example, a compact string representation of a trace might be used in URLs or displays, while a more structured ParamSet is needed for querying data.
The “how” is achieved through an interface TraceFormatter and concrete implementations. This allows for different formatting strategies to be plugged in as needed. The GetTraceFormatter() function acts as a factory, returning the appropriate formatter based on the application's configuration (window.perf.trace_format).
Key Components/Files:
traceformatter.ts: This is the central file containing the core logic.
- TraceFormatter interface: Defines the contract for all trace formatters. It mandates two primary methods:
  - formatTrace(params: Params): string: Takes a Params object (a key-value map representing trace parameters) and returns a string representation of the trace. This is useful for displaying trace identifiers in a user-friendly or system-specific format.
  - formatQuery(trace: string): string: Takes a string representation of a trace and converts it into a query string (e.g., "key1=value1&key2=value2"). This is crucial for constructing API requests to fetch data related to a specific trace.
- DefaultTraceFormatter class: Provides a basic implementation of TraceFormatter.
  - Its formatTrace method generates a string like "Trace ID: ,key1=value1,key2=value2,...". This is a generic way to represent the trace parameters.
  - Its formatQuery method currently returns an empty string, indicating that this default formatter doesn't have specific logic for converting its trace string representation back into a query.
- ChromeTraceFormatter class: Implements TraceFormatter specifically for traces originating from Chrome's performance infrastructure.
  - Why ChromeTraceFormatter? Chrome's performance data often uses a hierarchical, slash-separated string to identify traces (e.g., master/bot/benchmark/test/subtest_1). This formatter handles this specific convention.
  - keys array: This private property (['master', 'bot', 'benchmark', 'test', 'subtest_1', 'subtest_2', 'subtest_3']) defines the expected order of parameters in the Chrome-style trace string. This order is significant for both formatting and parsing.
  - formatTrace(params: Params): string: It iterates through the predefined keys and constructs a slash-separated string from the corresponding values in the input params. For example:
      Input Params: { master: "m", bot: "b", benchmark: "bm", test: "t" }
      keys: [ "master", "bot", "benchmark", "test", ... ]
      Output String: "m/b/bm/t"
  - formatQuery(trace: string): string: This is the inverse operation. It takes a slash-separated trace string, splits it, and maps the parts back to the predefined keys to build a ParamSet. It then converts this ParamSet into a standard URL query string.
  - Handling statistics (ad-hoc logic for the Chromeperf/Skia bridge): A special piece of logic exists within formatQuery related to window.perf.enable_skia_bridge_aggregation. If a trace's 'test' value ends with a known statistic suffix (e.g., _avg, _count), this suffix is used to determine the stat parameter in the output query, and the suffix is removed from the 'test' parameter. If no such suffix is found, a default stat value of 'value' is added. This logic is a temporary measure to bridge formatting differences between Chromeperf and Skia systems and is intended to be removed once Chromeperf is deprecated.
      Input Trace String (enable_skia_bridge_aggregation = true): "master/bot/benchmark/test_name_max/subtest"
      Splits into: ["master", "bot", "benchmark", "test_name_max", "subtest"]
      Processed ParamSet: { master: ["master"], bot: ["bot"], benchmark: ["benchmark"], test: ["test_name"], stat: ["max"], subtest_1: ["subtest"] }
      Output Query: "master=master&bot=bot&benchmark=benchmark&test=test_name&stat=max&subtest_1=subtest"
- STATISTIC_SUFFIX_TO_VALUE_MAP: A map used by ChromeTraceFormatter to translate common statistic suffixes (like "avg", "count") found in test names to their corresponding "stat" parameter values (like "value", "count").
- traceFormatterRecords: A record (object map) that associates TraceFormat enum values (like '' for default, 'chrome' for Chrome-specific) with their corresponding TraceFormatter instances. This acts as a registry for available formatters.
- GetTraceFormatter() function: This is the public entry point for obtaining a trace formatter. It reads window.perf.trace_format (a global configuration setting) and returns the appropriate formatter instance from traceFormatterRecords. If the format is not found, it defaults to DefaultTraceFormatter.

Global Config: window.perf.trace_format = "chrome"
|
v
GetTraceFormatter()
|
v
traceFormatterRecords["chrome"]
|
v
Returns new ChromeTraceFormatter() instance
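The slash-joining behavior can be illustrated with a small standalone sketch; this mirrors the description above rather than the actual ChromeTraceFormatter source:

```
// Simplified sketch of the Chrome-style trace formatting described above.
type Params = { [key: string]: string };

const keys = ['master', 'bot', 'benchmark', 'test', 'subtest_1', 'subtest_2', 'subtest_3'];

function formatChromeTrace(params: Params): string {
  // Walk the fixed key order and keep only the values that are present.
  return keys
    .map((k) => params[k])
    .filter((v) => v !== undefined && v !== '')
    .join('/');
}

// formatChromeTrace({ master: 'm', bot: 'b', benchmark: 'bm', test: 't' })
// === 'm/b/bm/t'
```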
traceformatter_test.ts: Contains unit tests for the ChromeTraceFormatter, specifically focusing on the formatQuery method and its logic for handling statistic suffixes under different configurations of window.perf.enable_skia_bridge_aggregation.
This module depends on:
- infra-sk/modules:query_ts_lib: For the fromParamSet function, used to convert a ParamSet object into a URL query string.
- perf/modules/json:index_ts_lib: For type definitions like Params, ParamSet, and TraceFormat.
- perf/modules/paramtools:index_ts_lib: For the makeKey function, used by DefaultTraceFormatter to create a string representation of a Params object.
- perf/modules/window:window_ts_lib: To access global configuration values like window.perf.trace_format and window.perf.enable_skia_bridge_aggregation.

The triage-menu-sk module provides a user interface element for managing and triaging anomalies in bulk. It's designed to streamline the process of handling multiple performance regressions or improvements detected in data.
The core purpose of this module is to allow users to efficiently take action on a set of selected anomalies. Instead of interacting with each anomaly individually, this menu provides centralized controls for common triage operations. This is crucial for workflows where many anomalies might be identified simultaneously, requiring a quick and consistent way to categorize or address them.
Key responsibilities and components:
triage-menu-sk.ts: This is the heart of the module, defining the TriageMenuSk custom element.
- It accepts a set of Anomaly objects and associated trace_names. This allows it to operate on multiple anomalies at once.
- "New Bug": opens the new-bug-dialog-sk element, allowing the user to create a new bug report associated with the selected anomalies.
- "Existing Bug": opens the existing-bug-dialog-sk element, enabling the user to link the selected anomalies to an already existing bug.
- Nudging: the NudgeEntry class and related logic (generateNudgeButtons, nudgeAnomaly, makeNudgeRequest) allow users to adjust the perceived start and end points of an anomaly. This is a subtle but important feature for refining the automated anomaly detection. The UI presents a set of buttons (e.g., -2, -1, 0, +1, +2) that shift the anomaly's boundaries. The _allowNudge flag controls whether the nudge buttons are visible, allowing for contexts where nudging might not be appropriate (e.g., when multiple, disparate anomalies are selected).
- It stores the selected anomalies (_anomalies, _trace_names) and the nudge options (_nudgeList).
- The makeEditAnomalyRequest and makeNudgeRequest methods handle sending HTTP POST requests to the /_/triage/edit_anomalies endpoint. This endpoint is responsible for persisting the triage decisions (bug associations, ignore status, nudge adjustments) in the backend database. The editAction parameter in makeEditAnomalyRequest can take values like IGNORE, RESET (to de-associate bugs), or implicitly associate with a bug ID when called from the bug dialogs.
- After a successful edit it dispatches an anomaly-changed custom event. This event signals to parent components (likely a component displaying a list or plot of anomalies) that one or more anomalies have been modified and their representation needs to be updated. The event detail includes the affected traceNames, the editAction performed, and the updated anomalies.
- The menu embeds new-bug-dialog-sk and existing-bug-dialog-sk. When the user clicks "New Bug" or "Existing Bug", this element calls the respective open() methods on these dialog components.
- It also passes the selected anomalies to the dialogs via their setAnomalies methods, so the dialogs know which anomalies the bug report will be associated with.
- triage-menu-sk.html (implicit via the Lit template in .ts): Defines the visual structure of the menu, including the layout of the action buttons and the nudge buttons. The rendering is dynamic based on the number of selected anomalies and whether nudging is allowed.
triage-menu-sk.scss: Provides the styling for the menu, ensuring it integrates visually with the surrounding application.
Key Workflow Example (Ignoring Anomalies):
1. triage-menu-sk receives data: The parent component calls triageMenuSkElement.setAnomalies(selectedAnomalies, correspondingTraceNames, nudgeOptions). triage-menu-sk re-renders, enabling the "Ignore" button (and potentially others).

   User Action (Selects Anomalies) --> Parent Component
           |
           v
   triage-menu-sk.setAnomalies()
           |
           v
   UI Renders (Buttons enabled)

2. The user clicks "Ignore":

   User Click ("Ignore") --> triage-menu-sk.ignoreAnomaly()
           |
           v
   makeEditAnomalyRequest(anomalies, traces, "IGNORE")
           |
           v
   POST /_/triage/edit_anomalies
           |
   (Backend processes)
           v
   HTTP 200 OK
           |
           v
   Dispatch "anomaly-changed" event

3. makeEditAnomalyRequest constructs a JSON payload with the anomaly keys, trace names, and the action "IGNORE". This payload is sent to /_/triage/edit_anomalies.
4. On success, triage-menu-sk updates the local state of the anomalies (setting bug_id to -2 for ignored anomalies) and dispatches the anomaly-changed event.
5. The parent component listens for anomaly-changed and updates its display to reflect that the anomalies are now ignored (e.g., by changing their color, removing them from an active list).

The design decision to have triage-menu-sk orchestrate calls to the backend and then emit a generic anomaly-changed event decouples it from the specifics of how anomalies are displayed. Parent components only need to know that anomalies have changed and can react accordingly. The use of dedicated dialog components (new-bug-dialog-sk, existing-bug-dialog-sk) encapsulates the complexity of bug reporting, keeping the triage menu itself focused on initiating these actions.
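For reference, the "Ignore" request from the workflow above might look roughly like the following sketch. The payload field names are assumptions; makeEditAnomalyRequest in triage-menu-sk.ts defines the real shape:

```
// Hedged sketch of the edit_anomalies request; field names are illustrative.
async function ignoreAnomalies(anomalyKeys: string[], traceNames: string[]): Promise<void> {
  const resp = await fetch('/_/triage/edit_anomalies', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      keys: anomalyKeys,       // assumed name for the anomaly identifiers
      trace_names: traceNames, // traces the anomalies belong to
      action: 'IGNORE',        // the editAction described above
    }),
  });
  if (!resp.ok) {
    throw new Error(`edit_anomalies failed: ${resp.status}`);
  }
}
```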
The triage-page-sk module provides the user interface for viewing and triaging regressions in performance data. It allows users to filter regressions based on time range, commit status (all, regressions, untriaged), and alert configurations. The primary goal is to present a clear overview of regressions and facilitate the process of identifying their cause and impact.
The module is responsible for:
- Fetching regression data: It queries the backend (/_/reg/) to retrieve regression information for a specified time range and filter criteria. This data is then rendered in a tabular format, showing commits along with any associated regressions.
- Reflecting state in the URL: The stateReflector utility from infra-sk/modules/statereflector is used for this purpose, so a view can be bookmarked or shared.
- Triaging: When a user selects a regression, a dialog (<dialog>) containing the cluster-summary2-sk element is displayed. This dialog allows the user to view details of the regression and assign a triage status (e.g., "positive", "negative", "acknowledged"). The decision is sent to the backend (/_/triage/) to persist it.
- Each regression cell is rendered with a triage-status-sk element, which shows its current triage state.

Key files:
- triage-page-sk.ts: This is the core TypeScript file defining the TriagePageSk custom element.
  - It uses a State interface to manage the component's configuration (begin/end timestamps, subset filter, alert filter).
  - connectedCallback initializes the stateReflector to synchronize the component's state with the URL.
  - updateRange() is a crucial method that fetches regression data from the /_/reg/ endpoint whenever the state changes (e.g., date range or filter selection). It uses the fetch API for network requests.
  - The template function (using lit/html) defines the HTML structure of the component, including the filter controls, the main table displaying regressions, and the triage dialog.
  - Event handlers such as commitsChange, filterChange, rangeChange, triage_start, and triaged manage user input and interactions with child components.
  - The triage_start method is triggered when a user wants to triage a specific regression. It prepares the data for the cluster-summary2-sk element and displays the triage dialog.
  - The triaged method is called when the user submits a triage decision from the cluster-summary2-sk dialog. It sends a POST request to /_/triage/ with the triage information.
  - Helpers like stepUpAt, stepDownAt, alertAt, etc., are used to determine how to render cells in the regression table based on the data received.
  - calc_all_filter_options dynamically generates the list of available alert filters based on categories returned from the backend.
- triage-page-sk.scss: Contains the SASS/CSS styles for the triage-page-sk element.
- triage-page-sk-demo.html / triage-page-sk-demo.ts: Provide a demonstration page for the triage-page-sk element. The HTML page includes an instance of <triage-page-sk>; the TypeScript file simply imports the main component to register it.

1. Initial Page Load and Data Fetch:
User navigates to page / URL with state parameters
|
V
triage-page-sk.connectedCallback()
|
V
stateReflector initializes state from URL (or defaults)
|
V
triage-page-sk.updateRange()
|
V
FETCH /_/reg/ with current state (begin, end, subset, alert_filter)
|
V
Backend responds with RegressionRangeResponse (header, table, categories)
|
V
triage-page-sk.reg is updated
|
V
triage-page-sk.calc_all_filter_options() (if categories present)
|
V
triage-page-sk._render() displays the regression table
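The fetch performed by updateRange() can be sketched as follows; the request fields mirror the State interface described above, while the actual request and RegressionRangeResponse types are defined elsewhere in the module and backend:

```
// Hedged sketch of the /_/reg/ request made by updateRange().
interface TriageState {
  begin: number;        // start of the range, seconds since the epoch
  end: number;          // end of the range
  subset: string;       // "all" | "regressions" | "untriaged"
  alert_filter: string; // which alert configurations to include
}

async function fetchRegressions(state: TriageState): Promise<unknown> {
  const resp = await fetch('/_/reg/', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(state),
  });
  if (!resp.ok) {
    throw new Error(`/_/reg/ failed: ${resp.status}`);
  }
  return resp.json(); // header, table, categories
}
```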
2. User Changes Filter or Date Range:
User interacts with <select> (commits/filter) or <day-range-sk>
|
V
Event handler (e.g., commitsChange, filterChange, rangeChange) updates this.state
|
V
this.stateHasChanged() (triggers stateReflector to update URL)
|
V
triage-page-sk.updateRange()
|
V
FETCH /_/reg/ with new state
|
V
Backend responds with updated RegressionRangeResponse
|
V
triage-page-sk.reg is updated
|
V
triage-page-sk._render() re-renders the regression table with new data
3. User Initiates Triage:
User clicks on a regression in the table (within a <triage-status-sk> element)
|
V
<triage-status-sk> emits 'start-triage' event with details (alert, full_summary, cluster_type)
|
V
triage-page-sk.triage_start(event)
|
V
this.dialogState is populated with event.detail
|
V
this._render() (updates the <cluster-summary2-sk> properties within the dialog)
|
V
this.dialog.showModal() (displays the triage dialog)
4. User Submits Triage:
User interacts with <cluster-summary2-sk> in the dialog and clicks "Save" (or similar)
|
V
<cluster-summary2-sk> emits 'triaged' event with details (columnHeader, triage status)
|
V
triage-page-sk.triaged(event)
|
V
Constructs TriageRequest body (cid, triage, alert, cluster_type)
|
V
this.dialog.close()
|
V
this.triageInProgress = true; this._render() (shows spinner)
|
V
FETCH POST /_/triage/ with TriageRequest
|
V
Backend responds (e.g., with a bug link if applicable)
|
V
this.triageInProgress = false; this._render() (hides spinner)
|
V
(Optional) If json.bug exists, window.open(json.bug)
|
V
(Implicit) The <triage-status-sk> for the triaged item may update its display, or a full data refresh might be triggered if necessary to show the updated status.
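The triage submission in step 4 reduces to a single POST; a hedged sketch, with field names taken from the TriageRequest description above and placeholder types:

```
// Hedged sketch of the /_/triage/ submission; consult the module's JSON
// types for the authoritative TriageRequest definition.
async function submitTriage(body: {
  cid: unknown;         // commit id (columnHeader from the 'triaged' event)
  triage: unknown;      // the chosen triage status
  alert: unknown;       // the alert config being triaged
  cluster_type: string; // 'high' or 'low'
}): Promise<{ bug?: string }> {
  const resp = await fetch('/_/triage/', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(body),
  });
  if (!resp.ok) {
    throw new Error(`/_/triage/ failed: ${resp.status}`);
  }
  return resp.json(); // may contain a bug link to open
}
```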
- The page is composed of custom elements (triage-page-sk, commit-detail-sk, day-range-sk, triage-status-sk, cluster-summary2-sk). This promotes modularity, reusability, and separation of concerns. Each component handles a specific piece of functionality.
- Network requests are asynchronous, using fetch and Promises. Spinners (spinner-sk) are used to provide visual feedback to the user during these operations.
- A native dialog element (<dialog>) is used for the triage process. This provides a focused interface for the user to review cluster details and make a triage decision without cluttering the main regression table.

The triage-page-sk serves as the central hub for users to actively engage with and manage performance regressions, making it a critical component in the performance monitoring workflow.
The triage-status-sk module provides a custom HTML element designed to visually represent and interact with the triage status of a “cluster” within the Perf application. A cluster, in this context, likely refers to a group of related performance measurements or anomalies that require user attention and classification (triaging).
Core Functionality & Design:
The primary purpose of this element is to offer a concise and interactive way for users to understand the current triage state of a cluster and to initiate the triaging process.
Visual Indication: The element displays a button. The appearance of this button (specifically, an icon within it) changes based on the cluster's triage status: “positive,” “negative,” or “untriaged.” This provides an immediate visual cue to the user.
It uses the tricon2-sk element to display the appropriate icon based on the triage.status property. The styling for these states is defined in triage-status-sk.scss, ensuring visual consistency with the application's theme (including dark mode).

Initiating Triage: Clicking the button does not directly change the triage status within this element. Instead, it emits a custom event named start-triage.
- This keeps the triage-status-sk element focused and reusable. The actual triaging process likely involves a dialog or a more complex UI, which is beyond the scope of this simple button.
- The _start_triage method is invoked on button click. This method constructs a detail object containing all relevant information about the cluster (full_summary, current triage status, alert configuration, cluster_type, and a reference to the element itself) and dispatches the start-triage CustomEvent.

Key Components & Files:
- triage-status-sk.ts: This is the heart of the module, defining the TriageStatusSk custom element class which extends ElementSk.
  - Properties:
    - triage: An object of type TriageStatus (defined in perf/modules/json) holding the status ('positive', 'negative', 'untriaged') and a message string. This is the primary driver for the element's appearance.
    - full_summary: Potentially detailed information about the cluster, of type FullSummary.
    - alert: Information about any alert configuration associated with the cluster, of type Alert.
    - cluster_type: A string ('high' or 'low'), likely indicating the priority or type of the cluster.
  - It uses lit-html for templating (TriageStatusSk.template). The template renders a <button> containing a tricon2-sk element. The class of the button and the value of the tricon2-sk are bound to ele.triage.status, dynamically changing the appearance.
  - The _start_triage method is responsible for creating and dispatching the start-triage event.
- triage-status-sk.scss: Defines the visual styling for the triage-status-sk element. It includes specific styles for the different triage states (.positive, .negative, .untriaged) and their hover states, ensuring they integrate with the application's themes (including dark mode variables like --positive, --negative, --surface).
- index.ts: A simple entry point that imports and thereby registers the triage-status-sk custom element, making it available for use in HTML.
- triage-status-sk-demo.html & triage-status-sk-demo.ts: These files provide a demonstration page for the triage-status-sk element. The demo shows how to listen for the start-triage event and how to programmatically set the triage property of the element. This is crucial for developers to understand how to integrate and use the component.
- BUILD.bazel: Defines how the module is built and its dependencies. It specifies tricon2-sk as a UI dependency and includes necessary SASS and TypeScript libraries.
- triage-status-sk_puppeteer_test.ts: Contains Puppeteer-based tests to ensure the element renders correctly and behaves as expected in a browser environment. This is important for maintaining code quality and preventing regressions.

Workflow Example: User Initiates Triage
User sees a triage-status-sk button (e.g., showing an 'untriaged' icon)
|
V
User clicks the button
|
V
[triage-status-sk.ts] _start_triage() method is called
|
V
[triage-status-sk.ts] Creates a 'detail' object with:
- triage: { status: 'untriaged', message: '...' }
- full_summary: { ... }
- alert: { ... }
- cluster_type: 'low' | 'high'
- element: (reference to itself)
|
V
[triage-status-sk.ts] Dispatches a 'start-triage' CustomEvent with the 'detail' object
|
V
[Parent Component/Application Logic] Listens for 'start-triage' event
|
V
[Parent Component/Application Logic] Receives event.detail
|
V
[Parent Component/Application Logic] Uses the received data to:
- Open a triage dialog
- Populate the dialog with cluster details
- Allow user to select a new triage status
- (Potentially) update the original triage-status-sk element's
'triage' property after the dialog interaction is complete.
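A hypothetical parent handler for this event might look like the following sketch; openTriageDialog is a placeholder for the page's own dialog logic, while the detail fields follow the description above:

```
// Hypothetical listener for the start-triage event emitted by triage-status-sk.
document.addEventListener('start-triage', (e: Event) => {
  const detail = (e as CustomEvent).detail;
  openTriageDialog({
    fullSummary: detail.full_summary,
    triage: detail.triage,
    alert: detail.alert,
    clusterType: detail.cluster_type,
  });
  // detail.element can be used to update the button once triage completes.
});

declare function openTriageDialog(opts: unknown): void;
```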
This design allows triage-status-sk to be a focused, presentational component, while the more complex logic of handling the triage process itself is managed elsewhere in the application. This promotes separation of concerns and reusability.
The triage2-sk module provides a custom HTML element for selecting a triage status. This element is designed to be a simple, reusable UI component for indicating whether a particular item is “positive”, “negative”, or “untriaged”. Its primary purpose is to offer a standardized way to represent and interact with triage states across different parts of the Perf application.
The core of the module is the triage2-sk custom element, defined in triage2-sk.ts. This element leverages the Lit library for templating and rendering. It presents three buttons, each representing one of the triage states:
- "positive" (rendered with a <check-circle-icon-sk>).
- "negative" (rendered with a <cancel-icon-sk>).
- "untriaged" (rendered with a <help-icon-sk>).

The "why" behind this design is to provide a clear visual representation of the current triage status and an intuitive way for users to change it. By using distinct icons and styling for each state, the element aims to reduce ambiguity.
Key Implementation Details:
triage2-sk.ts: This is the main TypeScript file defining the TriageSk class, which extends ElementSk.
- The current state is exposed through the value attribute (and corresponding property). It can be one of "positive", "negative", or "untriaged". If no value is provided, it defaults to "untriaged".
- When the user selects a state, the element dispatches a custom event named change. The detail property of this event contains the new triage status as a string (e.g., "positive"). This allows parent components to react to changes in the triage status.

  User clicks "Positive" button
       |
       V
  triage2-sk sets its 'value' attribute to "positive"
       |
       V
  triage2-sk dispatches a 'change' event with detail: "positive"

- The template static method uses Lit's html tagged template literal to define the structure of the element. It dynamically sets the selected attribute on the appropriate button based on the current value.
- The element observes its value attribute. When this attribute changes (either programmatically or through user interaction), the attributeChangedCallback is triggered, which re-renders the component and dispatches the change event.
- The isStatus function ensures that the value property is always one of the allowed Status types, defaulting to "untriaged" if an invalid value is encountered. This contributes to the robustness of the component.

triage2-sk.scss: This file contains the SASS styles for the triage2-sk element.
index.ts: This file serves as the entry point for the module, exporting the TriageSk class and ensuring the custom element is defined.
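Embedding the element and reacting to its change event is straightforward; a minimal sketch based on the behavior described above:

```
// Minimal usage sketch for triage2-sk.
const triage = document.createElement('triage2-sk');
triage.setAttribute('value', 'untriaged');
document.body.appendChild(triage);

triage.addEventListener('change', (e: Event) => {
  // detail is the new status: "positive" | "negative" | "untriaged".
  const status = (e as CustomEvent<string>).detail;
  console.log(`New triage status: ${status}`);
});
```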
Demo and Testing:
- triage2-sk-demo.html and triage2-sk-demo.ts: Provide a simple demonstration page showcasing the element in various states and how to listen for the change event. This is useful for manual testing and visual inspection.
- triage2-sk_test.ts: Contains Karma unit tests that verify the event emission and value changes of the component.
- triage2-sk_puppeteer_test.ts: Includes Puppeteer-based end-to-end tests that check the rendering of the component in a browser environment and capture screenshots for visual regression testing.

The design choice of using custom elements and Lit allows for a modular and maintainable component that can be easily integrated into larger applications. The clear separation of concerns (logic in TypeScript, styling in SASS, and structure in the template) follows common best practices for web component development.
The tricon2-sk module provides a custom HTML element <tricon2-sk> designed to visually represent triage states. This component is crucial for user interfaces where quick identification of an item's status (e.g., in a bug tracker, code review system, or monitoring dashboard) is necessary.
The core idea is to offer a standardized, reusable icon that clearly communicates whether an item is “positive,” “negative,” or “untriaged.” This avoids inconsistencies and reduces cognitive load for users who frequently interact with such systems.
Key Components and Responsibilities:
tricon2-sk.ts: This is the heart of the module. It defines the TriconSk class, which extends ElementSk (a base class for custom elements in the Skia infrastructure).
- The displayed icon is driven by the value attribute.
- It uses the lit-html library for templating, allowing for efficient rendering and updates.
- The static template function determines which icon to display (check-circle-icon-sk for "positive", cancel-icon-sk for "negative", and help-icon-sk for "untriaged" or any other value). This design centralizes the icon selection logic.
- The value attribute is the primary interface for controlling the displayed icon. Changes to this attribute trigger a re-render via attributeChangedCallback and _render().
- connectedCallback ensures that the value property is properly initialized if set before the element is attached to the DOM.
- It imports the icon elements (check-circle-icon-sk, cancel-icon-sk, help-icon-sk) from the elements-sk module, promoting modularity and reuse of existing icon assets.
- It uses CSS variables (--green, --red, --brown) for the icon fill colors. This allows themes (defined in themes.scss) to override these colors easily.
- Specific rules apply within the .body-sk context and when .darkmode is applied to .body-sk. This ensures the icons maintain appropriate contrast and visibility across different UI themes. The fallback hardcoded colors (#388e3c, etc.) provide default styling if CSS variables are not defined by a theme.

index.ts: This file serves as the main entry point for the module when it's imported. Its sole responsibility is to import tricon2-sk.ts, which in turn registers the <tricon2-sk> custom element. This is a common pattern for organizing custom element definitions.
tricon2-sk-demo.html and tricon2-sk-demo.ts: These files create a demonstration page for the <tricon2-sk> element.
- The demo shows how to use the tricon2-sk element and how it appears in various theming contexts (default, with colors.css theming, and with themes.css in both light and dark modes). This is invaluable for development, testing, and documentation.
- The HTML page includes the <tricon2-sk> element with different value attributes. The accompanying TypeScript file simply imports the index.ts of the tricon2-sk module to ensure the custom element is defined before the browser tries to render it.

tricon2-sk_puppeteer_test.ts: This file contains automated UI tests for the tricon2-sk element using Puppeteer.
- The test loads the demo page (tricon2-sk-demo.html) in a headless browser, checks if the expected number of tricon2-sk elements are present (a basic smoke test), and then takes a screenshot of the page. This ensures that changes to the component's appearance are caught early.

Workflow: Displaying a Triage Icon
Usage: An application includes the <tricon2-sk> element in its HTML, setting the value attribute:
<tricon2-sk value="positive"></tricon2-sk>
Element Initialization (tricon2-sk.ts):
- The TriconSk class is instantiated.
- connectedCallback is called, ensuring the value property is synchronized with the attribute.
- _render() is called.

Template Selection (tricon2-sk.ts):
- The static template function is invoked.
- Based on this.value (e.g., "positive"), it returns the corresponding HTML template: html`<check-circle-icon-sk></check-circle-icon-sk>`.

Icon Rendering:
- The selected icon element (e.g., <check-circle-icon-sk>) renders itself.

Styling (tricon2-sk.scss):
- CSS rules are applied. For example, if the value is "positive":

      tricon2-sk {
        check-circle-icon-sk {
          fill: var(--green); // Initially attempts to use the CSS variable
        }
      }

- If themes are active (e.g., `.body-sk.darkmode`), more specific rules might override the fill color:

      .body-sk.darkmode tricon2-sk {
        check-circle-icon-sk {
          fill: #4caf50; // Specific dark mode color
        }
      }
Diagram: Attribute Change leading to Icon Update
[User/Application sets/changes 'value' attribute on <tricon2-sk>]
|
v
[<tricon2-sk> element]
|
+---------------------+
| attributeChangedCallback() is triggered |
+---------------------+
|
v
[this._render()]
|
v
[TriconSk.template(this)] <-- Reads current 'this.value'
|
+-------------+-------------+
| (value is | (value is | (value is other)
| "positive") | "negative") |
v v v
[Returns [Returns [Returns
<check-...>] <cancel-...>] <help-...>]
|
v
[lit-html updates the DOM with the new icon template]
|
v
[Browser renders the new icon with appropriate CSS styles]
The design decision to use distinct, imported icon components (check-circle-icon-sk, etc.) rather than, for example, a single SVG sprite or dynamically generating SVG paths, promotes better separation of concerns. Each icon can be managed and updated independently. The use of CSS variables for theming is a standard and flexible approach, allowing consuming applications to easily adapt the icon colors to their specific look and feel without modifying the component's core logic or styles directly.
The trybot module provides utilities for processing and analyzing results from Perf trybots. Trybots are automated systems that run performance tests on code changes before they are submitted. This module focuses on calculating and presenting metrics that help developers understand the performance impact of their changes.
The core functionality revolves around aggregating and averaging stddevRatio values across different parameter combinations. The stddevRatio is a key metric representing the change in performance relative to the standard deviation of the baseline. A positive stddevRatio generally indicates a performance regression, while a negative value suggests an improvement.
The primary goal is to help developers quickly identify which aspects of their change (represented by key-value parameters like model=GCE or test=MyBenchmark) are contributing most significantly to performance changes, both positive and negative. By grouping results by these parameters and calculating average stddevRatio, the module provides a summarized view that highlights potential problem areas or confirms expected improvements.
calcs.ts: This file contains the logic for performing calculations on trybot results.
byParams(res: TryBotResponse): AveForParam[]: This is the central function of the module.
Why: Developers need a way to understand the overall performance impact of their changes across various configurations (e.g., different devices, tests, or operating systems). Simply looking at individual trace results can be overwhelming. This function provides a summarized view by grouping results by their parameters.
How:
It takes a TryBotResponse object, which contains a list of individual test results (res.results). Each result includes a stddevRatio and a set of params (key-value pairs describing the test configuration).
It iterates through each result and then through each key-value pair within that result's params.
For each unique key=value string (e.g., “model=GCE”), it maintains a running total of stddevRatio values, the count of traces contributing to this total (n), and counts of traces with positive (high) or negative (low) stddevRatio. This aggregation happens in the runningTotals object.
Input TryBotResponse.results:
[
{ params: {arch: "arm", os: "android"}, stddevRatio: 1.5 },
{ params: {arch: "x86", os: "linux"}, stddevRatio: -0.5 },
{ params: {arch: "arm", os: "ios"}, stddevRatio: 2.0 }
]
-> runningTotals intermediate state (simplified):
"arch=arm": { totalStdDevRatio: 3.5, n: 2, high: 2, low: 0 }
"os=android": { totalStdDevRatio: 1.5, n: 1, high: 1, low: 0 }
"arch=x86": { totalStdDevRatio: -0.5, n: 1, high: 0, low: 1 }
"os=linux": { totalStdDevRatio: -0.5, n: 1, high: 0, low: 1 }
"os=ios": { totalStdDevRatio: 2.0, n: 1, high: 1, low: 0 }
After processing all results, it calculates the average stddevRatio for each key=value pair by dividing totalStdDevRatio by n.
It constructs an array of AveForParam objects. Each object represents a key=value parameter and includes its calculated average stddevRatio, the total number of traces (n) that matched this parameter, and the counts of high and low stddevRatio traces.
Finally, it sorts this array in descending order based on the aveStdDevRatio. This crucial step brings the parameters associated with the largest (potentially negative) performance regressions to the top, making them easy to identify.
AveForParam interface: Defines the structure for the output of byParams. It holds the aggregated average stddevRatio for a specific keyValue pair, along with counts of traces.
runningTotal interface: An internal helper interface used during the aggregation process within byParams to keep track of sums and counts before the final average is computed.
calcs_test.ts: This file contains unit tests for the functions in calcs.ts.
- It uses chai for assertions. Tests cover scenarios like:
  - Empty input: byParams should return an empty list.
  - Averaging of stddevRatio for multiple traces sharing common parameters. For example, if two traces have test=1, their stddevRatio values should be averaged for the test=1 entry in the output.
  - Sorting of the results by aveStdDevRatio in descending order.

Calculating Average StdDevRatio by Parameter:
TryBotResponse
|
v
byParams(response)
|
| 1. Initialize `runningTotals` (empty map)
|
| 2. For each `result` in `response.results`:
| |
| |-> For each `param` (key-value pair) in `result.params`:
| |
| |--> Generate `runningTotalsKey` (e.g., "model=GCE")
| |--> Retrieve or create `runningTotal` entry for `runningTotalsKey`
| |--> Update `totalStdDevRatio`, `n`, `high`, `low` in the entry
|
| 3. Initialize `ret` (empty array of AveForParam)
|
| 4. For each `runningTotalKey` in `runningTotals`:
| |
| |-> Calculate `aveStdDevRatio` = `runningTotal.totalStdDevRatio` / `runningTotal.n`
| |-> Create `AveForParam` object
| |-> Push to `ret`
|
| 5. Sort `ret` by `aveStdDevRatio` (descending)
|
v
Array of AveForParam
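A condensed TypeScript sketch of this aggregation, not the production byParams() from calcs.ts, with illustrative type names:

```
// Sketch of the key=value aggregation and averaging described above.
interface ResultLike {
  params: { [key: string]: string };
  stddevRatio: number;
}

interface AveForParamLike {
  keyValue: string;
  aveStdDevRatio: number;
  n: number;
  high: number;
  low: number;
}

function byParamsSketch(results: ResultLike[]): AveForParamLike[] {
  const totals = new Map<string, { total: number; n: number; high: number; low: number }>();
  for (const res of results) {
    for (const [key, value] of Object.entries(res.params)) {
      const k = `${key}=${value}`;
      const t = totals.get(k) ?? { total: 0, n: 0, high: 0, low: 0 };
      t.total += res.stddevRatio;
      t.n += 1;
      if (res.stddevRatio > 0) t.high += 1;
      if (res.stddevRatio < 0) t.low += 1;
      totals.set(k, t);
    }
  }
  const ret: AveForParamLike[] = [];
  totals.forEach((t, keyValue) =>
    ret.push({ keyValue, aveStdDevRatio: t.total / t.n, n: t.n, high: t.high, low: t.low })
  );
  // Largest average stddevRatio first.
  return ret.sort((a, b) => b.aveStdDevRatio - a.aveStdDevRatio);
}
```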
This workflow allows users to quickly pinpoint which configuration parameters (like specific device models, operating systems, or test names) are associated with the most significant average performance changes in a given trybot run. The sorting ensures that the most impactful parameters are immediately visible.
The trybot-page-sk module provides a user interface for analyzing performance regressions. It allows users to select either a specific commit from the repository or a trybot run (representing a potential code change) and then analyze performance metrics associated with that selection. The core purpose is to help developers identify and understand performance impacts before or after code submission.
Key Responsibilities and Components:
User Input and Selection:
- Commit selection: Users pick a specific commit via the commit-detail-picker-sk element. This allows them to investigate performance regressions that might have been introduced by a particular code change.
- Trybot selection: (Selecting a trybot/CL run is not fully wired up in trybot-page-sk.ts. It appears to be a planned feature or a more complex interaction than commit selection.) The underlying TryBotRequest interface includes fields like cl and patch_number, indicating the intent to support this.
- Query construction: Users build a query with query-sk. This query filters the performance traces to be considered (e.g., focusing on specific benchmarks, configurations, or architectures).
- The paramset-sk and query-count-sk elements provide feedback on the current query, showing the matching parameters and the number of traces that fit the criteria. This helps users refine their query to target the relevant data.
- When the user clicks "Run", the run method is invoked. This method constructs a TryBotRequest object based on the user's selections (commit number, query, or eventually CL/patch details).
- The request is sent to the /_/trybot/load/ backend endpoint. This endpoint is responsible for fetching the relevant performance data (trace values, headers, parameter sets) for the specified commit/trybot and query. The startRequest utility handles the asynchronous request and displays progress using a spinner-sk.
- The response (TryBotResponse) contains the performance data, including:
  - results: An array of individual trace results, each containing parameter values (params), actual metric values (values), and a stddevRatio (how many standard deviations the trace's value is from the median of its historical data).
  - paramset: The complete set of parameters found across all returned traces.
  - header: Information about the data points in each trace, likely including timestamps.
- The byParams function (from ../trybot/calcs) is used to aggregate results by parameter key-value pairs, calculating average standard deviation ratios, counts, and high/low values for each group. This helps identify which parameters are most strongly correlated with performance changes.
- Individual results table: Clicking the timeline icon (timeline-icon-sk) for a trace renders its values over time on a plot-simple-sk element. Users can CTRL-click to plot multiple traces on the same graph for comparison.
- "By Params" table: Shows the results of the byParams calculation. For each parameter key-value pair (e.g., "config=gles"), it shows the average standard deviation ratio, the number of traces (N) in that group, and the highest/lowest individual trace values.
- Clicking the timeline icon for a group plots up to maxByParamsPlot traces from the selected group (sorted by stddevRatio) on a separate plot-simple-sk.
- The focused trace ID and its parameters are shown in by-params-traceid and by-params-paramset respectively. paramset-sk is used to display the parameters, highlighting the ones belonging to the focused trace.
- The element uses stateReflector to synchronize its internal state (this.state, which is a TryBotRequest object) with the URL. This means that the selected commit, query, and analysis type ("commit" or "trybot") are reflected in the URL query parameters. This allows users to bookmark or share specific analysis views.
- State changes go through stateHasChanged(), which updates the URL via stateReflector and re-renders the component.
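A hedged sketch of that pattern, assuming the usual infra-sk stateReflector shape (a state getter plus a callback for URL-driven updates, returning a function to call when local state changes); the import path and field values are illustrative:

```
// Sketch of state/URL synchronization via stateReflector; the relative import
// path is an assumption about the project layout.
import { stateReflector } from '../../../infra-sk/modules/stateReflector';

let state = { kind: 'commit', commit_number: -1, query: '' };

const stateHasChanged = stateReflector(
  /* getState */ () => state as any,
  /* setState */ (newState) => {
    // Called when the URL changes (back/forward, shared link).
    state = newState as typeof state;
    // ...re-render with the recovered state.
  }
);

// After a user action mutates `state`:
state.query = 'config=gles';
stateHasChanged(); // pushes the new state into the URL
```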
trybot-page-sk.scss file defines the visual appearance and layout of the component, including styles for the query section, results tables, and plot areas.Workflow Example (Commit Analysis):
User Selects Tab: User ensures the "Commit" tab is selected.
    [tabs-sk] --selects index 0--> [TrybotPageSk.tabSelected] --> state.kind = "commit" --> stateHasChanged()
User Selects Commit: User interacts with commit-detail-picker-sk.
    [commit-detail-picker-sk] --commit-selected event--> [TrybotPageSk.commitSelected] --> state.commit_number = selected_commit_offset --> stateHasChanged() --> _render() (UI updates to show query section)
User Enters Query: User types into query-sk.
[query-sk] --query-change event--> [TrybotPageSk.queryChange]
--> state.query = new_query_string
--> stateHasChanged()
--> _render() (paramset-sk summary updates)
[query-sk] --query-change-delayed event--> [TrybotPageSk.queryChangeDelayed]
--> [query-count-sk].current_query = new_query_string (triggers count update)
User Clicks "Run":
    [Run Button] --click--> [TrybotPageSk.run]
    --> spinner-sk.active = true
    --> startRequest('/_/trybot/load/', state, ...)
    --> HTTP POST to backend with { kind: "commit", commit_number: X, query: "Y" }
    <-- Backend responds with TryBotResponse (trace data, paramset, header)
    --> results = TryBotResponse
    --> byParams = byParams(results)
    --> spinner-sk.active = false
    --> _render() (results tables and plot areas become visible and populated)
User Interacts with Results:
- **Plotting Individual Trace:** [Timeline Icon in Individual Table] --click--> [TrybotPageSk.plotIndividualTrace(event, index)] --> individualPlot.addLines(...) --> displayedTrace = true --> _render() (individual plot becomes visible)
- **Plotting By Params Group:** [Timeline Icon in By Params Table] --click--> [TrybotPageSk.plotByParamsTraces(event, index)] --> Filters results.results for matching key=value --> byParamsPlot.addLines(...) --> byParamsParamSet.paramsets = [ParamSet of plotted traces] --> displayedByParamsTrace = true --> _render() (by params plot and its paramset become visible)
- **Focusing Trace on By Params Plot:** [by-params-plot] --trace_focused event--> [TrybotPageSk.byParamsTraceFocused] --> byParamsTraceID.innerText = focused_trace_name --> byParamsParamSet.highlight = fromKey(focused_trace_name) --> _render() (updates highlighted params in by-params-paramset)
The design emphasizes providing both a high-level overview of potential regression areas (via “By Params”) and the ability to drill down into individual trace performance. The use of stddevRatio as a primary metric helps quantify the significance of observed changes.
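For illustration, a minimal sketch of the kind of per-parameter aggregation the byParams helper performs (the real implementation lives in ../trybot/calcs; the interfaces and helper name here are simplified assumptions, and high/low track stddevRatio for brevity):
// Sketch of a byParams-style aggregation; field names mirror the
// TryBotResponse description above, but the exact shapes are assumptions.
interface TryBotResult {
  params: { [key: string]: string };
  values: number[];
  stddevRatio: number;
}
interface ByParamsEntry {
  keyValue: string; // e.g. "config=gles"
  averageStdDevRatio: number;
  n: number; // number of traces in the group
  high: number; // highest individual stddevRatio seen in the group
  low: number; // lowest individual stddevRatio seen in the group
}
function byParamsSketch(results: TryBotResult[]): ByParamsEntry[] {
  const groups = new Map<string, number[]>();
  for (const r of results) {
    for (const [k, v] of Object.entries(r.params)) {
      const keyValue = `${k}=${v}`;
      if (!groups.has(keyValue)) groups.set(keyValue, []);
      groups.get(keyValue)!.push(r.stddevRatio);
    }
  }
  const entries: ByParamsEntry[] = [];
  for (const [keyValue, ratios] of groups) {
    entries.push({
      keyValue,
      averageStdDevRatio: ratios.reduce((a, b) => a + b, 0) / ratios.length,
      n: ratios.length,
      high: Math.max(...ratios),
      low: Math.min(...ratios),
    });
  }
  // Groups with the largest average deviation float to the top of the table.
  return entries.sort((a, b) => b.averageStdDevRatio - a.averageStdDevRatio);
}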
user-issue-sk
The user-issue-sk module provides a custom HTML element for associating and managing Buganizer issues with specific data points in the Perf application. This allows users to directly link performance regressions or anomalies to their corresponding bug reports, enhancing traceability and collaboration.
Why: Tracking issues related to performance data is crucial for effective debugging and resolution. This element centralizes the issue linking process within the Perf UI, providing a seamless experience for users to add, view, and remove bug associations.
How:
The core functionality revolves around the UserIssueSk LitElement class. This class manages the display and interaction logic for associating a Buganizer issue with a data point identified by its trace_key and commit_position.
Key Responsibilities and Components:
- Login status: The element determines the current user via alogin-sk. This is essential because only logged-in users can add or remove issue associations. If a user is not logged in, they can only view existing issue links.
- bug_id: This property determines the element's display.
  - bug_id === 0: Indicates no Buganizer issue is associated with the data point. The element will display an “Add Bug” button (if the user is logged in).
  - bug_id > 0: An existing Buganizer issue is linked. The element will display a link to the bug and, if the user is logged in, a “close” icon to remove the association.
  - bug_id === -1: This is a special state where the element renders nothing, effectively hiding itself. This might be used in scenarios where issue linking is not applicable.
- _text_input_active: A boolean flag that controls the visibility of the input field for entering a new bug ID.
- The render() method dynamically chooses between two main templates based on the bug_id and login status:
  - addIssueTemplate(): Shown when bug_id === 0 and the user is logged in. It initially displays an “Add Bug” button. Clicking this button reveals an input field for the bug ID and confirm/cancel icons.
  - showLinkTemplate(): Shown when bug_id > 0. It displays a formatted link to the Buganizer issue (using AnomalySk.formatBug). If the user is logged in, a “close” icon is also displayed to allow removal of the issue link.
- addIssue(): Triggered when a user submits a new bug ID. It makes a POST request to the /_/user_issue/save endpoint with the trace_key, commit_position, and the new issue_id.
- removeIssue(): Triggered when a logged-in user clicks the “close” icon next to an existing bug link. It makes a POST request to the /_/user_issue/delete endpoint with the trace_key and commit_position.
- On success, both methods dispatch a custom event, user-issue-changed. This event bubbles up and carries a detail object containing the trace_key, commit_position, and the new bug_id. This allows parent components or other parts of the application to react to changes in issue associations (e.g., by refreshing a list of user-reported issues).
- The element uses the errorMessage utility from perf/modules/errorMessage to display feedback to the user in case of API errors or invalid input.
Key Files:
- user-issue-sk.ts: This is the heart of the module. It defines the UserIssueSk LitElement, including its properties, styles, templates, and logic for interacting with the backend API and handling user input. The design focuses on conditional rendering based on the bug_id and user login status. The API calls are standard fetch requests.
- index.ts: A simple entry point that imports and registers the user-issue-sk custom element, making it available for use in HTML.
- BUILD.bazel: Defines the build dependencies for the element, including alogin-sk for authentication, anomaly-sk for bug link formatting, icon elements for the UI, and Lit libraries for web component development.
Workflows:
Adding a New Issue: User (logged in) sees “Add Bug” button User clicks “Add Bug” -> activateTextInput() is called -> _text_input_active becomes true -> Element re-renders to show input field, check icon, close icon User types bug ID into input field -> changeHandler() updates _input_val User clicks check icon -> addIssue() is called -> Input validation (is _input_val > 0?) -> POST request to /_/user_issue/save with trace_key, commit_position, input_val -> On success: -> bug_id is updated with _input_val -> _input_val reset to 0 -> _text_input_active set to false -> user-issue-changed event is dispatched -> Element re-renders to show the new bug link and remove icon -> On failure: -> errorMessage is displayed -> hideTextInput() is called (resets state)
Viewing an Existing Issue: Element is initialized with bug_id > 0 -> render() calls showLinkTemplate() -> A link to perf.bug_host_url + bug_id is displayed. -> If user is logged in, a “close” icon is also displayed.
Removing an Existing Issue: User (logged in) sees bug link and “close” icon User clicks “close” icon -> removeIssue() is called -> POST request to /_/user_issue/delete with trace_key, commit_position -> On success: -> bug_id is set to 0 -> _input_val reset to 0 -> _text_input_active set to false -> user-issue-changed event is dispatched -> Element re-renders to show “Add Bug” button -> On failure: -> errorMessage is displayed
The design prioritizes a clear separation of concerns: display logic is handled by LitElement's templating system, state is managed through properties, and backend interactions are encapsulated in dedicated asynchronous methods. The use of custom events allows for loose coupling with other components that might need to react to changes in issue associations.
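A minimal sketch of the addIssue()/event-dispatch flow described above. The endpoint path, payload fields, and event name come from this document; the standalone function shape and error handling are simplified assumptions rather than the element's actual methods:
// Sketch: saving a bug association and notifying listeners.
async function addIssueSketch(
  el: HTMLElement,
  traceKey: string,
  commitPosition: number,
  issueId: number,
): Promise<void> {
  const resp = await fetch('/_/user_issue/save', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      trace_key: traceKey,
      commit_position: commitPosition,
      issue_id: issueId,
    }),
  });
  if (!resp.ok) {
    throw new Error(`Failed to save issue: ${resp.statusText}`);
  }
  // Let parent components (e.g. lists of user-reported issues) react.
  el.dispatchEvent(
    new CustomEvent('user-issue-changed', {
      bubbles: true,
      detail: {
        trace_key: traceKey,
        commit_position: commitPosition,
        bug_id: issueId,
      },
    }),
  );
}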
The window module is designed to provide utility functions related to the browser's window object, specifically focusing on parsing and interpreting configuration data embedded within it. This approach centralizes the logic for accessing and processing global configurations, making it easier to manage and test.
A key responsibility of this module is to extract and process build tag information. This information is often embedded in the window.perf.image_tag global variable, which is expected to be an SkPerfConfig object (defined in //perf/modules/json:index_ts_lib). The getBuildTag function is the primary component for this task.
The getBuildTag function takes an image tag string as input (or defaults to window.perf?.image_tag). Its core purpose is to parse this string and categorize the build tag. The function employs a specific parsing logic based on the structure of the image tag:
Initial Validation:
- The tag string is split on the @ character.
- If the split yields fewer than two parts (there is no @, or @ is the first/last character), it's considered an invalid tag.
- The function then checks whether the second part (after @) starts with tag:. If not, it's also an invalid tag.
Input Tag String
|
V
Split by '@'
|
V
Check for at least 2 parts AND second part starts with "tag:"
|
+-- No --> Invalid Tag
|
V
Proceed to type determination
Tag Type Determination: Based on the prefix of the raw tag (the part after tag:):
- **Git Tag**: If the raw tag starts with `tag:git-`, it's classified as a 'git' type. The function extracts the first 7 characters of the Git hash.
- **Louhi Build Tag**: If the raw tag has a specific length (>= 38 characters) and contains `louhi` at a particular position (substring from index 25 to 30), it's classified as a 'louhi' type. The function extracts a 7-character identifier (substring from index 31 to 38), which typically represents a hash or version.
- **Regular Tag**: If neither of the above conditions is met, it's considered a generic 'tag' type. The function returns the portion of the string after `tag:`.
This structured approach ensures that different build tag formats can be reliably identified and their relevant parts extracted. The decision to differentiate between ‘git’, ‘louhi’, and generic ‘tag’ types allows downstream consumers of this information to handle them appropriately. For instance, a ‘git’ tag might be used to link to a specific commit, while a ‘louhi’ tag might indicate a specific build from an internal CI system.
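An illustrative reconstruction of the parsing rules described above (the real getBuildTag in this module may differ in details; the return shape used here, including the 'invalid' case, is an assumption):
// Sketch of the tag classification logic.
type BuildTag =
  | { type: 'git' | 'louhi' | 'tag'; tag: string }
  | { type: 'invalid' };
function getBuildTagSketch(imageTag: string): BuildTag {
  const parts = imageTag.split('@');
  if (parts.length < 2 || !parts[1].startsWith('tag:')) {
    return { type: 'invalid' };
  }
  const rawTag = parts[1];
  if (rawTag.startsWith('tag:git-')) {
    // 'tag:git-'.length === 8, so the Git hash starts at index 8;
    // keep its first 7 characters.
    return { type: 'git', tag: rawTag.slice(8, 15) };
  }
  if (rawTag.length >= 38 && rawTag.slice(25, 30) === 'louhi') {
    // 7-character identifier at indices 31..37.
    return { type: 'louhi', tag: rawTag.slice(31, 38) };
  }
  return { type: 'tag', tag: rawTag.slice('tag:'.length) };
}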
The module also extends the global Window interface to declare the perf: SkPerfConfig property. This is a TypeScript feature that provides type safety when accessing window.perf, ensuring that developers are aware of its expected structure.
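The declaration looks roughly like the following (the import path is an assumption; SkPerfConfig is defined in //perf/modules/json as noted above):
// Augment the global Window type so window.perf is type-checked.
import { SkPerfConfig } from '../json';
declare global {
  interface Window {
    // Server-injected configuration, including image_tag.
    perf: SkPerfConfig;
  }
}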
The window_test.ts file provides unit tests for the getBuildTag function, covering various scenarios including valid git tags, Louhi build tags, arbitrary tags, and different forms of invalid tags. These tests are crucial for verifying the correctness of the parsing logic and ensuring that changes to the function do not introduce regressions. The use of chai for assertions is a standard practice for testing in this environment.
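For example, a test could assert on the classification result along these lines (illustrative only, reusing the getBuildTagSketch function from the sketch above; the actual cases in window_test.ts may use different inputs and result shapes):
// Illustrative chai assertion against the sketch above.
import { assert } from 'chai';
assert.deepEqual(
  getBuildTagSketch('gcr.io/skia-public/perfserver@tag:git-1234567abcdef'),
  { type: 'git', tag: '1234567' },
);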
The word-cloud-sk module provides a custom HTML element designed to visualize key-value pairs and their relative frequencies. This is particularly useful for displaying data from clusters or other datasets where understanding the distribution of different attributes is important.
The core idea is to present this frequency information in an easily digestible format, combining textual representation with a simple bar graph for each item. This allows users to quickly grasp the prevalence of certain key-value pairs within a dataset.
Key Components and Responsibilities:
word-cloud-sk.ts: This is the heart of the module, defining the WordCloudSk custom element which extends ElementSk.
- By extending ElementSk, it leverages common functionalities provided by the infra-sk library for custom elements.
- It uses the lit-html library for templating. The items property, an array of ValuePercent objects (defined in //perf/modules/json:index_ts_lib), is the primary input. Each ValuePercent object contains a value (the key-value string) and a percent (its frequency).
- The template iterates over items and creates a table row for each. Each row displays the key-value string, its percentage as text, and a horizontal bar whose width is proportional to the percentage.
- connectedCallback ensures that if the items property is set before the element is fully connected to the DOM, it's properly upgraded and the element is rendered.
- The _render() method is called whenever the items property changes, ensuring the display is updated.
word-cloud-sk.scss: This file contains the SASS styles for the word-cloud-sk element.
- It uses CSS variables (--light-gray, --on-surface, --primary), allowing the component to adapt to different themes (like light and dark mode) defined in //perf/modules/themes:themes_sass_lib and //elements-sk/modules:colors_sass_lib.
word-cloud-sk-demo.html and word-cloud-sk-demo.ts: These files provide a demonstration page for the word-cloud-sk element.
- word-cloud-sk-demo.html includes multiple instances of the <word-cloud-sk> tag, some within sections with different theming (e.g., dark mode). word-cloud-sk-demo.ts then selects these instances and populates their items property with sample data. This demonstrates how the component can be instantiated and how data is passed to it.
index.ts: This file simply imports and thereby registers the word-cloud-sk custom element.
Workflow: Data Display
The primary workflow involves providing data to the word-cloud-sk element and its subsequent rendering:
Instantiation: An instance of <word-cloud-sk> is created in HTML.
<word-cloud-sk></word-cloud-sk>
Data Provision: The items property of the element is set with an array of ValuePercent objects.
// In JavaScript/TypeScript:
const wordCloudElement = document.querySelector('word-cloud-sk');
wordCloudElement.items = [
{ value: 'arch=x86', percent: 100 },
{ value: 'config=565', percent: 60 },
// ... more items
];
Rendering (_render() called in word-cloud-sk.ts):
- The WordCloudSk element iterates through the _items array.
- For each item, a table row (<tr>) is generated.
- item.value is displayed in the first cell (<td>).
- item.percent is displayed as text (e.g., “60%”) in the second cell.
- A <div> element is created in the third cell. Its width style is set to item.percent pixels, creating a visual bar representation of the percentage.
The overall structure rendered looks like this (simplified):
<table>
<tr> <!-- For item 1 -->
<td class="value">[item1.value]</td>
<td class="textpercent">[item1.percent]%</td>
<td class="percent">
<div style="width: [item1.percent]px"></div>
</td>
</tr>
<tr> <!-- For item 2 -->
<td class="value">[item2.value]</td>
<td class="textpercent">[item2.percent]%</td>
<td class="percent">
<div style="width: [item2.percent]px"></div>
</td>
</tr>
<!-- ... more rows -->
</table>
This process ensures that whenever the input data changes, the visual representation of the word cloud is automatically updated. The use of CSS variables for styling allows the component to seamlessly integrate into applications with different visual themes.
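A minimal lit-html template along these lines produces the table shown above (the real element extends ElementSk and renders from its own class fields; this standalone helper is only illustrative):
// Renders the word-cloud table structure into a container element.
import { html, render, TemplateResult } from 'lit-html';
interface ValuePercent {
  value: string; // e.g. 'arch=x86'
  percent: number; // 0..100
}
function renderWordCloud(items: ValuePercent[], container: HTMLElement): void {
  const rows: TemplateResult[] = items.map(
    (item) => html`
      <tr>
        <td class="value">${item.value}</td>
        <td class="textpercent">${item.percent}%</td>
        <td class="percent">
          <div style="width: ${item.percent}px"></div>
        </td>
      </tr>
    `,
  );
  render(html`<table>${rows}</table>`, container);
}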
nanostat is a command-line tool designed to compare and analyze the results of Skia's nanobench benchmark. It takes two JSON files generated by nanobench as input, representing “old” and “new” benchmark runs, and provides a statistical summary of the performance changes between them. This is particularly useful for developers to understand the performance impact of their code changes.
When making changes to a codebase, especially one as performance-sensitive as a graphics library like Skia, it's crucial to measure the impact on performance. Nanobench produces detailed raw data, but interpreting this data directly can be cumbersome. nanostat was created to distill that raw data into a concise statistical summary of the changes between two runs.
The core workflow of nanostat involves several steps:
Input: It accepts two file paths as command-line arguments, pointing to the “old” and “new” nanobench JSON output files.
nanostat [options] old.json new.json
Parsing: The loadFileByName function in main.go is responsible for opening and parsing these JSON files. It uses the perf/go/ingest/format.ParseLegacyFormat function to interpret the nanobench output structure and then perf/go/ingest/parser.GetSamplesFromLegacyFormat to extract the raw sample values for each benchmark test. Each file's data is converted into a parser.SamplesSet, which is a map where keys are test identifiers and values are slices of performance measurements (samples).
Statistical Analysis: The samplestats.Analyze function (from the perf/go/samplestats module) is the heart of the comparison. It takes the two parser.SamplesSet (before and after samples) and a samplestats.Config object as input. The configuration includes:
- Alpha: The significance level (default 0.05). A p-value below alpha indicates a significant difference.
- IQRR: A boolean indicating whether to apply the Interquartile Range Rule to remove outliers from the sample data before analysis.
- All: A boolean determining if all results (significant or not) should be displayed.
- Test: The type of statistical test to perform (Mann-Whitney U test or Welch's T-test).
- Order: The function used to sort the output rows.
For each common benchmark test found in both input files, samplestats.Analyze calculates statistics for both sets of samples (mean, percentage deviation) and then performs the chosen statistical test to compare the two distributions. This yields a p-value.
Filtering and Sorting: Based on the config, samplestats.Analyze filters out rows where the change is not statistically significant (if config.All is false). The remaining rows are then sorted according to config.Order.
Output Formatting: The formatRows function in main.go takes the analyzed and sorted samplestats.Row data and prepares it for display.
- It determines which parameter keys to include in the output (e.g., config, name, test). These are keys whose values differ across the benchmark results, helping to distinguish them.
- Rows for statistically insignificant changes are shown only when the --all flag is used.
- The formatted rows are written to stdout using text/tabwriter to create a well-aligned table.
Example output line:
old         new         delta   stats                name
2.15 ± 5%   2.00 ± 2%   -7%     (p=0.001, n=10+ 8)   tabl_digg.skp
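To make the delta and significance decision concrete, here is a small illustrative sketch (nanostat itself is written in Go on top of perf/go/samplestats; the TypeScript below and its field names are simplified assumptions):
// Computes the percentage delta for one row, or null when the change is not
// statistically significant (such rows are filtered out unless --all is used).
interface RowMetrics {
  oldMean: number;
  newMean: number;
  pValue: number; // from the Mann-Whitney U or Welch's T-test
  alpha: number; // e.g. 0.05
}
function formatDelta(m: RowMetrics): string | null {
  if (m.pValue >= m.alpha) return null;
  const deltaPercent = ((m.newMean - m.oldMean) / m.oldMean) * 100;
  return `${deltaPercent.toFixed(0)}%`;
}
// Matches the example row above: old 2.15, new 2.00, p=0.001 -> '-7%'.
console.log(formatDelta({ oldMean: 2.15, newMean: 2.0, pValue: 0.001, alpha: 0.05 }));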
main.go: This is the entry point of the application.
It:
- Parses the command-line flags (-alpha, -sort, -iqrr, -all, -test).
- Calls loadFileByName to load and parse the input JSON files.
- Builds a samplestats.Config based on the provided flags.
- Calls samplestats.Analyze to perform the statistical comparison.
- Calls formatRows to format the results for display.
- Uses text/tabwriter to print the formatted output to the console.
Notable functions:
- actualMain(stdout io.Writer): Contains the main logic, allowing stdout to be replaced for testing.
- loadFileByName(filename string) parser.SamplesSet: Reads a nanobench JSON file, parses it, and extracts the performance samples. It leverages perf/go/ingest/format and perf/go/ingest/parser.
- formatRows(config samplestats.Config, rows []samplestats.Row) []string: Takes the analysis results and formats them into a slice of strings, ready for tabular display. It intelligently includes relevant parameter keys in the output.
main_test.go: Contains unit tests for nanostat.
- The tests verify that nanostat produces the expected output for various command-line flag combinations and input files.
- They use golden files (testdata/*.golden) to compare actual output against expected output.
- TestMain_DifferentFlags_ChangeOutput(t *testing.T): The main test function that sets up different test cases.
- check(t *testing.T, name string, args ...string): A helper function that runs nanostat with specified arguments, captures its output, and compares it against a corresponding golden file.
README.md: Provides user-facing documentation on how to install and use nanostat, including examples and descriptions of command-line options.
Makefile: Contains targets for building, testing, and regenerating test data (golden files). The regenerate-testdata target is crucial for updating the golden files when the tool's output format or logic changes.
BUILD.bazel: Defines how to build and test the nanostat binary and its library using the Bazel build system. It lists dependencies on other Skia modules, such as:
- //go/paramtools: Used in formatRows to work with parameter sets from benchmark results.
- //perf/go/ingest/format: Used for parsing the legacy nanobench JSON format.
- //perf/go/ingest/parser: Used to extract sample data from the parsed format.
- //perf/go/samplestats: Provides the core statistical analysis functions (samplestats.Analyze, samplestats.Order, samplestats.Test).
Key design decisions:
- perf/go/samplestats: nanostat heavily relies on this module for the actual statistical computations. This promotes code reuse and separation of concerns, keeping nanostat focused on command-line parsing, file I/O, and output formatting.
- perf/go/ingest/format and perf/go/ingest/parser: These modules handle the complexities of interpreting the nanobench JSON structure, abstracting this detail away from nanostat's main logic.
- Command-line flags (-alpha, -iqrr, -all, -sort, -test): This flexibility allows users to tailor the analysis to their specific needs. For example, the -iqrr flag allows for more robust analysis by removing potential outlier data points that could skew results. The -test flag allows users to choose between parametric (T-test) and non-parametric (U-test) statistical tests, depending on the assumptions they are willing to make about their data's distribution.
- text/tabwriter provides a clean, aligned, and easy-to-read output format, which is essential for quickly scanning and understanding the performance changes.
- The golden-file approach in main_test.go is a good practice for testing command-line tools. It makes it easy to verify that changes to the code don't unintentionally alter the output format or the results of the analysis. The Makefile target regenerate-testdata simplifies updating these files when intended changes occur.
The /pages module is responsible for defining the HTML structure and initial JavaScript and CSS for all the user-facing pages of the Skia Performance application. Each page represents a distinct view or functionality within the application, such as viewing alerts, exploring performance data, or managing regressions.
The core design philosophy is to keep the HTML files minimal and delegate the rendering and complex logic to custom HTML elements (Skia Elements). This promotes modularity and reusability of UI components.
Key Components and Responsibilities:
HTML files (e.g., alerts.html, newindex.html):
- Each provides the basic document skeleton (<head>, <body>).
- The body contains the perf-scaffold-sk custom element. This element acts as a common layout wrapper for all pages, providing consistent navigation, header, footer, and potentially other shared UI elements.
- Inside perf-scaffold-sk, they embed the primary custom element specific to that page's functionality (e.g., <alerts-page-sk>, <explore-sk>).
- Go templating directives such as {%- template "googleanalytics" . -%} and {% .Nonce %} are used for server-side rendering of common snippets and security nonces.
- A window.perf = {%.context %}; script tag is used to pass initial data or configuration from the server (Go backend) to the client-side JavaScript. This context likely contains information needed by the page-specific custom element to initialize itself.
TypeScript files (e.g., alerts.ts, newindex.ts):
- Each entry point imports <perf-scaffold-sk> and the page-specific custom element (e.g., ../modules/alerts-page-sk).
SCSS files (e.g., alerts.scss, newindex.scss):
- Each begins with @import 'body';, which means they inherit base body styles from body.scss.
- If a page needs styles beyond what body.scss provides, those styles would be defined here.
body.scss:
- Defines base styles for the <body> element, such as removing default margins and padding. This ensures a consistent baseline across all pages.
BUILD.bazel:
- Each page is declared with the sk_page rule from //infra-sk:index.bzl, which takes:
  - html_file: The entry HTML file.
  - ts_entry_point: The entry TypeScript file.
  - scss_entry_point: The entry SCSS file.
  - sk_element_deps: A list of dependencies on other modules that provide the custom HTML elements used by the page. This is crucial for ensuring that elements like perf-scaffold-sk and page-specific elements (e.g., alerts-page-sk) are compiled and available.
  - sass_deps: Dependencies for SCSS, typically including :body_sass_lib which refers to the body.scss file.
  - Other attributes such as assets_serving_path, nonce, and production_sourcemap.
Workflow for a Page Request:
1. The user requests a page (e.g., /alerts). The Go backend routes the URL to the corresponding template (alerts.html) and processes it, injecting the {% .context %}, the {% .Nonce %}, and other templates like “googleanalytics” and “cookieconsent”. User Request ----> Go Backend ----> Template Processing (alerts.html + context) ----> HTML Response (URL Routing) (Injects window.perf data, nonce)
2. When the browser encounters <script src="alerts.js"></script> (or the equivalent generated by the build system), it fetches and executes alerts.ts. alerts.ts imports ../modules/perf-scaffold-sk and ../modules/alerts-page-sk. This registers these custom elements with the browser. Browser Receives HTML -> Parses HTML -> Encounters <script> for alerts.ts | -> Fetches and Executes alerts.ts | -> import '../modules/perf-scaffold-sk'; -> import '../modules/alerts-page-sk'; (Custom elements are now defined)
3. The browser upgrades the custom elements in the page (<perf-scaffold-sk> and <alerts-page-sk>). The JavaScript logic within these custom elements takes over, potentially fetching more data via AJAX using the initial window.perf context if needed, and populating the page content. Custom Elements Registered -> Browser renders <perf-scaffold-sk> and <alerts-page-sk> | -> JavaScript within these elements executes (e.g., reads window.perf, makes AJAX calls, builds UI)
4. The page's SCSS (e.g., alerts.scss) is also linked in the HTML (via the build system), and its styles (including those from body.scss) are applied.
This structure allows for a clean separation of concerns: the HTML defines structure, the TypeScript provides behavior via custom elements, and the SCSS handles styling.
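The page entry points themselves stay tiny; an alerts.ts-style entry point is essentially just the imports that register the needed custom elements (sketched here; the real file may also pull in additional modules):
// alerts.ts (sketch): registering the scaffold and the page element is all
// that is needed; the elements take over rendering once defined.
import '../modules/perf-scaffold-sk';
import '../modules/alerts-page-sk';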
The help.html page is slightly different as it directly embeds more static content (help text and examples) within its HTML structure using Go templating ({% range ... %}). However, it still utilizes the perf-scaffold-sk for consistent page layout and imports its JavaScript for any scaffold-related functionalities.
The newindex.html and multiexplore.html pages additionally include a div with id="sidebar_help" within the perf-scaffold-sk. This suggests that the perf-scaffold-sk might have a designated area or slot where page-specific help content can be injected, or that the page-specific JavaScript (explore-sk.ts or explore-multi-sk.ts) might dynamically populate or interact with this sidebar content.
/res
The /res module serves as a centralized repository for static assets required by the application. Its primary purpose is to provide a consistent and organized location for resources such as images, icons, and potentially other static files that are part of the user interface or overall application branding. By co-locating these assets, the module simplifies resource management, facilitates easier updates, and ensures that all parts of the application can reliably access necessary visual or static elements.
The decision to have a dedicated /res module stems from the need to separate static content from dynamic code. This separation offers several benefits:
The internal structure of /res is designed to categorize different types of assets. For instance, images are placed within a dedicated img subdirectory. This categorization aids in discoverability and allows for type-specific processing or handling if needed in the future.
/res/img (Submodule/Directory):
- Grouping images under /res/img rather than at the root of /res keeps the root of the resource module clean and allows for specific image-related build optimizations or management strategies. For example, image compression tools or sprite generation scripts could target this directory specifically.
/res/img/favicon.ico:
- The .ico format is the traditional and most widely supported format for favicons, ensuring compatibility across different browsers and platforms. Placing it directly in the img directory makes it easily discoverable by build tools and web servers, which often look for favicon.ico in standard locations. Its presence here ensures that the application has a visual identifier in browser contexts.
A typical workflow involving the /res module might look like this:
Asset Creation/Acquisition: A designer creates a new icon or a new version of the application logo.
Designer Developer | | [New Image Asset] --> [Receives Asset]
Asset Placement: The developer places the new image file (e.g., new_icon.png) into the appropriate subdirectory within /res, likely /res/img/.
Developer | [Places new_icon.png into /res/img/]
Referencing the Asset: Application code (e.g., HTML, CSS, JavaScript) that needs to display this icon will reference it using a path relative to how the assets are served.
Application Code (e.g., HTML) | <img src="/path/to/res/img/new_icon.png">
(Note: The exact /path/to/ depends on how the web server or build system exposes the /res directory.)
Build Process: During the application build, files from the /res module are typically copied to a public-facing directory in the build output.
Build System | [Reads /res/img/new_icon.png] --> [Copies to /public_output/img/new_icon.png]
Client Request: When a user accesses the application, their browser requests the asset. User's Browser Web Server | | [Requests /public_output/img/new_icon.png] ----> [Serves new_icon.png] | | [Displays new_icon.png] <------------------------+
This workflow highlights how the /res module acts as the source of truth for static assets, which are then processed and served to the end-user. The favicon.ico follows a similar, often more implicit, path as browsers automatically request it from standard locations.
The samplevariance module is a command-line tool designed to analyze the variance of benchmark samples, specifically those generated by nanobench and stored in Google Cloud Storage (GCS). Nanobench typically produces multiple samples (e.g., 10) for each benchmark execution. This tool facilitates the examination of these samples across a large corpus of historical benchmark runs.
The primary motivation for this tool is to identify benchmarks exhibiting high variance in their results. High variance can indicate instability in the benchmark itself, the underlying system, or the measurement process. By calculating statistics like the ratio of the median to the minimum value for each set of samples, samplevariance helps pinpoint traces that warrant further investigation.
The core workflow involves:
- Listing the benchmark result files in GCS and processing each one in a worker goroutine, which computes per-trace statistics into a sampleInfo struct.
- Aggregating the sampleInfo structs from the workers and sorting them in descending order based on the calculated median/min ratio. This brings the traces with the highest variance to the top.
- Writing the sorted results as CSV.
[Flags] -> initialize() -> (ctx, bucket, objectPrefix, traceFilter, outputWriter)
|
v
filenamesFromBucketAndObjectPrefix(ctx, bucket, objectPrefix) -> [filenames]
|
v
samplesFromFilenames(ctx, bucket, traceFilter, [filenames])
|
|--> [gcsFilenameChannel] -> Worker Goroutine 1 -> traceInfoFromFilename() -> [sampleInfo] --\
| |
|--> [gcsFilenameChannel] -> Worker Goroutine 2 -> traceInfoFromFilename() -> [sampleInfo] ----> [aggregatedSamples] (mutex protected)
| |
|--> ... (up to workerPoolSize) |
| |
|--> [gcsFilenameChannel] -> Worker Goroutine N -> traceInfoFromFilename() -> [sampleInfo] --/
|
v
Sort([aggregatedSamples])
|
v
writeCSV([sortedSamples], topN, outputWriter) -> CSV Output
Key components and their responsibilities:
main.go: This is the entry point of the application and orchestrates the entire process.
- main(): Drives the overall workflow: initialization, fetching filenames, processing samples, sorting, and writing the output.
- initialize(): Handles command-line argument parsing. It sets up the GCS client, determines the input GCS path (defaulting to yesterday's data if not specified), parses the trace filter query, and configures the output writer (stdout or a specified file). The choice to default to yesterday's data provides a convenient way to monitor recent benchmark stability without requiring explicit date specification.
- filenamesFromBucketAndObjectPrefix(): Interacts with GCS to list all object names (filenames) under the specified bucket and prefix. It uses GCS client library features to efficiently retrieve only the names, minimizing data transfer.
- samplesFromFilenames(): Manages the concurrent processing of benchmark files. It creates a channel (gcsFilenameChannel) to distribute filenames to a pool of worker goroutines (workerPoolSize). An errgroup is used to manage these goroutines and propagate any errors. A mutex protects the shared samples slice where results from workers are aggregated. This concurrent design is crucial for performance when dealing with a large number of benchmark files.
- traceInfoFromFilename(): This function is executed by each worker goroutine. It takes a single GCS filename, reads the corresponding object from the bucket, and parses the JSON content using format.ParseLegacyFormat (from perf/go/ingest/format) and parser.GetSamplesFromLegacyFormat (from perf/go/ingest/parser). For each trace that matches the traceFilter (a query.Query object from go/query), it sorts the sample values, calculates the median (using stats.Sample.Quantile from go-moremath/stats) and minimum, and then computes their ratio. The use of established libraries for parsing and statistical calculation ensures correctness and leverages existing, tested code.
- writeCSV(): Formats the processed sampleInfo data into CSV format and writes it to the designated output writer. It includes a header row and then iterates through the sampleInfo slice, writing each entry. It also handles the --top flag to limit the number of output rows.
- sampleInfo: A simple struct to hold the calculated statistics (trace ID, median, min, ratio) for a single benchmark trace's samples.
- sampleInfoSlice: A helper type that implements sort.Interface to allow sorting sampleInfo slices by the ratio field in descending order. This is key to presenting the most variant traces first.
main_test.go: Contains unit tests for the writeCSV function. These tests verify that the CSV output is correctly formatted under different conditions, such as when writing all samples, a limited number of top samples, or when the number of samples is less than the requested top N. This ensures the output formatting logic is robust.
The design decision to use a worker pool (workerPoolSize) for processing files in parallel significantly speeds up the analysis, especially when dealing with numerous benchmark result files often found in GCS. The use of golang.org/x/sync/errgroup simplifies error handling in concurrent operations. Filtering capabilities (via the --filter flag and go/query) allow users to narrow down the analysis to specific subsets of benchmarks, making the tool more flexible and targeted. The output as a CSV file makes it easy to import the results into spreadsheets or other data analysis tools for further examination.
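The statistic itself is simple; here is a sketch of the per-trace computation (the tool is written in Go and uses stats.Sample.Quantile for the median; this TypeScript version only illustrates the median/min ratio and the descending sort, and its type names are assumptions):
// Per-trace summary: higher median/min ratio means more variance.
interface SampleInfo {
  traceID: string;
  median: number;
  min: number;
  ratio: number;
}
function traceInfoSketch(traceID: string, samples: number[]): SampleInfo {
  const sorted = [...samples].sort((a, b) => a - b);
  const min = sorted[0];
  // Median as the midpoint of the sorted samples (even counts average the
  // two middle values); the real tool uses a quantile function instead.
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { traceID, median, min, ratio: median / min };
}
// Traces with the highest variance (largest median/min ratio) come first.
function sortByRatioDescending(infos: SampleInfo[]): SampleInfo[] {
  return [...infos].sort((a, b) => b.ratio - a.ratio);
}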
The /scripts module provides tooling to support the data ingestion pipeline for Skia Perf. The primary focus is on automating the process of transferring processed data to the designated cloud storage location for further analysis and visualization within the Skia performance monitoring system.
The key responsibility of this module is to ensure reliable and timely delivery of performance data. This is achieved by interacting with Google Cloud Storage (GCS) using the gsutil command-line tool.
The main component within this module is the upload_extracted_json_files.sh script.
upload_extracted_json_files.sh
This shell script is responsible for uploading JSON files, which are assumed to be the output of a preceding data extraction or processing phase, to a specific Google Cloud Storage bucket (gs://skia-perf/nano-json-v1/).
Design Rationale and Implementation Details:
- Why gsutil? gsutil is the standard command-line tool for interacting with Google Cloud Storage. It provides robust features for uploading, downloading, and managing data in GCS buckets.
- Why -m (parallel uploads)? The -m flag in gsutil cp enables parallel uploads. This is a crucial performance optimization, especially when dealing with a potentially large number of JSON files. By uploading multiple files concurrently, the overall time taken for the transfer is significantly reduced.
- Why cp -r (recursive copy)? The -r flag ensures that the entire directory structure under downloads/ is replicated in the destination GCS path. This is important for maintaining the organization of the data and potentially for downstream processing that might rely on the file paths.
- Why the destination path (gs://skia-perf/nano-json-v1/$(date -u --date +1hour +%Y/%m/%d/%H))?
  - gs://skia-perf/nano-json-v1/: This is the base path in the GCS bucket designated for “nano” format JSON files, version 1. This structured naming helps in organizing different types and versions of data within the bucket.
  - $(date -u --date +1hour +%Y/%m/%d/%H): This part dynamically generates a timestamped subdirectory structure.
    - date -u: Ensures the date is in UTC, providing a consistent timezone regardless of where the script is run.
    - --date +1hour: This is a deliberate choice to place the data into the next hour's ingestion slot. This likely provides a buffer, ensuring that all data generated within a given hour is reliably captured and processed for that hour, even if the script runs slightly before or after the hour boundary. It helps prevent data from being missed or attributed to the wrong time window due to minor timing discrepancies in script execution.
    - +%Y/%m/%d/%H: Formats the date and time into a hierarchical path (e.g., 2023/10/27/15). This organization is beneficial for locating and processing data by time window.
Workflow:
The script executes a simple, linear workflow:
1. It uses the downloads/ directory in the current working directory as the source of JSON files. [Local Filesystem] | ./downloads/ (contains *.json files)
2. It computes the target subdirectory YYYY/MM/DD/HH. date command ---> YYYY/MM/DD/HH (e.g., 2023/10/27/15) | Target GCS Path: gs://skia-perf/nano-json-v1/YYYY/MM/DD/HH/
3. It invokes gsutil to recursively copy all contents from downloads/ to the generated GCS path, utilizing parallel uploads for efficiency. ./downloads/* ---(gsutil -m cp -r)---> gs://skia-perf/nano-json-v1/YYYY/MM/DD/HH/
This script assumes that the downloads/ directory exists in the location where the script is executed and contains the JSON files ready for upload. It also presumes that the user running the script has the necessary gsutil tool installed and configured with appropriate permissions to write to the specified GCS bucket.
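For clarity, a sketch of the destination-path computation (the script itself uses the shell date command; the helper below only mirrors the UTC, plus-one-hour, YYYY/MM/DD/HH layout and is not part of the script):
// Mirrors `$(date -u --date +1hour +%Y/%m/%d/%H)` from the shell script.
function nextHourIngestionPath(now: Date = new Date()): string {
  const oneHourLater = new Date(now.getTime() + 60 * 60 * 1000);
  const pad = (n: number) => n.toString().padStart(2, '0');
  const y = oneHourLater.getUTCFullYear();
  const m = pad(oneHourLater.getUTCMonth() + 1);
  const d = pad(oneHourLater.getUTCDate());
  const h = pad(oneHourLater.getUTCHours());
  return `gs://skia-perf/nano-json-v1/${y}/${m}/${d}/${h}/`;
}
// e.g. "gs://skia-perf/nano-json-v1/2023/10/27/16/"
console.log(nextHourIngestionPath(new Date(Date.UTC(2023, 9, 27, 15, 30))));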
The /secrets module is responsible for managing the creation and configuration of secrets required for various Skia Perf services to operate. These secrets primarily involve Google Cloud service accounts and OAuth credentials for email sending. The scripts in this module automate the setup of these credentials, ensuring that services have the necessary permissions to interact with Google Cloud APIs and other resources.
The design philosophy emphasizes secure and automated credential management. Instead of manual creation and configuration of secrets, these scripts provide a repeatable and version-controlled way to provision them. This reduces the risk of human error and ensures that services are configured with the principle of least privilege. For instance, service accounts are granted only the specific roles they need to perform their tasks.
1. Service Account Creation Scripts:
create-flutter-perf-service-account.sh: This script provisions a Google Cloud service account specifically for the Flutter Perf instance. It leverages a common script (../../kube/secrets/add-service-account.sh) to handle the underlying gcloud commands.
It calls the add-service-account.sh script, passing in parameters like the project ID, the desired service account name (“flutter-perf-service-account”), a descriptive display name, and the necessary IAM roles (roles/pubsub.editor, roles/cloudtrace.agent).
create-perf-cockroachdb-backup-service-account.sh: This script creates a dedicated service account for the Perf CockroachDB backup cronjob.
It also uses ../../kube/secrets/add-service-account.sh. It specifies the service account name (“perf-cockroachdb-backup”) and the roles/storage.objectAdmin role, which grants permissions to manage objects in Cloud Storage buckets.
create-perf-ingest-sa.sh: This script is responsible for creating the perf-ingest service account. This account is used by the Perf ingestion service, which processes and stores performance data.
- The ingestion service needs to read from specific GCS buckets (e.g., gs://skia-perf, gs://cluster-telemetry-perf). A dedicated service account with these precise permissions is crucial for security and operational clarity. It also leverages Workload Identity, a more secure way for Kubernetes workloads to access Google Cloud services.
- The script sources shared configuration (../kube/config.sh) and utility functions (../bash/ramdisk.sh) for environment setup.
- It creates the service account (perf-ingest) using gcloud iam service-accounts create and grants it:
  - roles/pubsub.editor: To publish messages to Pub/Sub.
  - roles/cloudtrace.agent: To send trace data.
- It binds the Kubernetes service account (default/perf-ingest in the skia-public namespace) to the Google Cloud service account. This allows pods running as perf-ingest in Kubernetes to impersonate the perf-ingest Google Cloud service account without needing to mount service account key files directly. Kubernetes Pod (default/perf-ingest) ----> Impersonates ----> Google Cloud SA (perf-ingest@skia-public.iam.gserviceaccount.com) | +----> Accesses GCP Resources (Pub/Sub, Cloud Trace, GCS)
- It grants objectViewer permissions on specific GCS buckets using gsutil iam ch.
- It generates a service account key file (perf-ingest.json).
- It creates a Kubernetes secret named perf-ingest from this key file using kubectl create secret generic. This secret can then be used by deployments that might not be able to use Workload Identity directly or for other specific use cases.
- Temporary files are written to a ramdisk (/tmp/ramdisk) to avoid leaving sensitive key files on persistent storage.
create-perf-sa.sh: This script creates the primary skia-perf service account. This is a general-purpose service account for the main Perf application.
- The main Perf application needs read access to the gs://skia-perf bucket. Similar to perf-ingest, this service account uses Workload Identity for enhanced security when running within Kubernetes.
- The steps mirror create-perf-ingest-sa.sh:
  - It creates the skia-perf service account.
  - It grants roles/cloudtrace.agent and roles/pubsub.editor.
  - It binds the Kubernetes service account (default/skia-perf) to the skia-perf Google Cloud service account.
  - It grants objectViewer on the gs://skia-perf GCS bucket.
  - It creates a Kubernetes secret named skia-perf.
2. Email Secrets Creation:
create-email-secrets.sh: This script facilitates the creation of Kubernetes secrets necessary for Perf to send emails via Gmail. This typically involves an OAuth 2.0 flow.
- It operates on a specific sending address (e.g., alertserver@skia.org) and a name derived from it (e.g., alertserver-skia-org).
- The user copies the client_secret.json file (obtained from the Google Cloud Console after enabling the Gmail API and creating OAuth 2.0 client credentials) to /tmp/ramdisk.
- The user then runs the three_legged_flow Go program (which must be built and installed separately from ../go/email/three_legged_flow). This program initiates the OAuth 2.0 three-legged authentication flow. User Action: Run three_legged_flow --> Browser opens for Google Auth --> User authenticates as specified email | v three_legged_flow generates client_token.json
- Once client_token.json (containing the authorization token and refresh token) is generated in /tmp/ramdisk, the script uses kubectl create secret generic to create a Kubernetes secret named perf-${EMAIL}-secrets. This secret contains both client_secret.json and client_token.json.
- The script then removes the client_token.json file from the local filesystem because it contains a sensitive refresh token. The source of truth for this token becomes the Kubernetes secret.
- Working in /tmp/ramdisk ensures that sensitive downloaded and generated files are stored in memory and are less likely to be inadvertently persisted.
The common pattern across these scripts is the use of gcloud for Google Cloud resource management and kubectl for interacting with Kubernetes to store the secrets. The use of a ramdisk for temporary storage of sensitive files like service account keys and OAuth tokens is a security best practice. Workload Identity is preferred for service accounts running in GKE, reducing the need to manage and distribute service account key files.