# Module: /

## Skia Perf Technical Documentation

### 1. High-Level Overview

**Project Objectives:** Skia Perf is a performance monitoring system designed to
ingest, store, analyze, and visualize performance data for various projects,
with a primary focus on Skia and related systems (e.g., Flutter, Android,
Chrome). Its core objectives are:

1. **Centralized Performance Data Storage:** Provide a robust and scalable
   repository for performance metrics collected from diverse benchmark runs.
2. **Interactive Data Exploration:** Offer web-based dashboards that allow
   users to query, explore, and visualize performance trends over time, across
   different configurations and code revisions.
3. **Automated Regression Detection:** Implement algorithms to automatically
   identify statistically significant performance regressions (and
   improvements) as new data is ingested.
4. **Alerting and Notification:** Notify relevant stakeholders (developers,
   performance engineers) about detected regressions.
5. **Triage and Investigation Support:** Provide tools to help users triage
   regressions, associate them with code changes, and track their resolution.
6. **Integration with Developer Workflows:** Connect performance data with
   version control systems (Git) and issue trackers.

**Functionality:** Perf consists of several key components that work together:

- **Data Ingestion:** Processes performance data files (typically JSON)
  uploaded to Google Cloud Storage (GCS). These files are parsed, validated,
  and their metrics are stored against corresponding Git commit hashes.
- **Data Storage:** Uses SQL databases (primarily CockroachDB, with support
  for Spanner) to store metadata (commits, alert configurations, regression
  statuses) and specialized trace data.
- **Clustering & Regression Detection:** Employs k-means clustering and step
  detection algorithms on time-series trace data to identify regressions or
  significant performance shifts. This can be run continuously or triggered by
  new data events.
- **Frontend UI:** A web application (built with Go on the backend and
  TypeScript/Lit/Web Components on the frontend) that provides interactive
  dashboards for:
  - Plotting performance metrics over commit ranges.
  - Querying traces based on various parameters.
  - Configuring alerts.
  - Triaging detected regressions.
  - Viewing commit details and associated performance changes.
- **Alerting System:** Allows users to define alerts based on specific queries
  and thresholds. When regressions matching these alerts are found,
  notifications can be sent (e.g., email, issue tracker integration).
- **Command-Line Tools:** Provides `perfserver` (to run the different
  services) and `perf-tool` (for administrative tasks, data inspection, and
  database backups/restores).

### 2. Project-Specific Terminology

- **Trace:** A single time series of performance measurements for a specific
  test under a specific configuration (e.g., memory usage for `draw_a_circle`
  on `arch=x86,config=8888`). Trace IDs are structured key-value strings like
  `,arch=x86,config=8888,test=draw_a_circle,units=ms,` (see the sketch after
  this list).
- **CommitNumber:** An internal, monotonically increasing integer assigned by
  Perf to each Git commit as it's processed. This provides a linear sequence
  for ordering data.
- **Tile:** A logical grouping of commits. Trace data is stored in relation to
  these tiles. The `tile_size` (number of commits per tile) is configurable
  and affects how data is sharded and queried.
- **ParamSet:** A collection of unique parameter key-value pairs observed in
  the data within a certain commit range (or tile range). Used to populate UI
  query builders.
- **DataFrame:** A tabular data structure, similar to R's data frames or
  Pandas DataFrames, used on the backend and frontend. It holds trace values
  indexed by commit and trace ID, along with header information (commit
  details).
- **Cluster / Clustering:** The process of grouping similar traces together
  using k-means clustering. This is a core part of regression detection, as a
  significant change in the centroid of a cluster can indicate a regression.
- **Regression (Statistic):** A numerical value (StepSize / LeastSquaresError)
  calculated for a cluster's centroid after fitting a step function. It
  measures how much the centroid's behavior resembles a step change. High
  absolute values are "Interesting."
- **Alert (Configuration):** A user-defined configuration that specifies a
  query to select traces, a detection algorithm, grouping parameters, and
  notification settings for finding regressions.
- **Ingestion Format:** A specific JSON structure (documented in `FORMAT.md`)
  that Perf expects for input data files.
- **Shortcut:** A saved URL or configuration, often represented by a short,
  hashed ID, for quickly accessing a specific view or set of traces.
- **Triage:** The process of reviewing a detected regression, determining if
  it's a genuine issue or an expected change/noise, and marking it accordingly
  (e.g., "Bug," "Ignore").

### 3. Overall Architecture

Perf follows a services-oriented architecture, where the main `perfserver`
executable can run in different modes (frontend, ingest, cluster, maintenance).
Data flows from external benchmark systems into Perf, where it's processed,
stored, analyzed, and finally presented to users.

**Data Flow and Main Components:**

```
External Benchmark Systems
    |
    V
[Data Files (JSON) in Perf Ingestion Format]
    |   (Uploaded to Google Cloud Storage - GCS)
    V
GCS Bucket (e.g., gs://skia-perf/nano-json-v1/)
    |   (Pub/Sub event on new file arrival)
    V
Perf Ingest Service(s) (`perfserver ingest` mode)
    |   - Parses JSON files (see /go/ingest/parser)
    |   - Validates data (see /go/ingest/format)
    |   - Associates data with Git commits (see /go/git)
    |   - Writes trace data to TraceStore (SQL, tiled) (see /go/tracestore)
    |   - Updates ParamSets (for UI query builders)
    |   - (Optionally) Emits Pub/Sub events for "Event Driven Alerting"
    V
SQL Database (CockroachDB / Spanner)
    |   - Trace Data (values, parameters, indexed by commit/tile)
    |   - Commit Information (hashes, timestamps, messages)
    |   - Alert Configurations
    |   - Regression Records (details of detected regressions, triage status)
    |   - Shortcuts, User Favorites, etc.
    |
    +<--> Perf Cluster Service(s) (`perfserver cluster` or `perfserver frontend --do_clustering` mode)
    |     - Loads Alert configurations
    |     - Queries TraceStore for relevant data
    |     - Performs clustering (k-means) (see /go/clustering2, /go/ctrace2)
    |     - Fits step functions to cluster centroids (see /go/stepfit)
    |     - Calculates Regression statistic
    |     - Stores "Interesting" clusters/regressions in the database
    |     - Sends notifications (email, issue tracker) (see /go/notify)
    |
    +<--> Perf Frontend Service (`perfserver frontend` mode)
    |     - Serves HTML, CSS, JS (see /pages, /modules)
    |     - Handles API requests from the UI (see /go/frontend, /API.md)
    |     - Queries database for trace data, alert configs, regressions
    |     - Formats data for UI display (often as DataFrames)
    |     - Manages user authentication (via X-WEBAUTH-USER header)
    |
    +<--> Perf Maintenance Service (`perfserver maintenance` mode)
          - Git repository synchronization
          - Database schema migrations (see /migrations)
          - Old data cleanup
          - Cache refreshing (e.g., ParamSet cache)
```

**Rationale for Key Architectural Choices:**

- **Decoupled Ingestion via GCS and Pub/Sub:**
  - **Why:** This decouples data producers from Perf's internal processing.
    Producers only need to drop files in a GCS bucket. Pub/Sub provides a
    scalable and reliable way to notify ingesters about new files, allowing
    multiple ingester instances to pull work.
  - **How:** Ingesters subscribe to a Pub/Sub topic. GCS is configured to
    publish a message to this topic when a new file is finalized in the
    designated ingestion bucket/prefix.
- **SQL Database for Structured Data:**
  - **Why:** SQL databases like CockroachDB and Spanner provide transactional
    consistency, scalability, and powerful querying capabilities needed for
    metadata, alert configurations, and regression tracking. CockroachDB
    offers PostgreSQL compatibility, which is widely used. Spanner provides
    horizontal scalability for very large datasets.
  - **How:** Go's `database/sql` package is used, with schema defined and
    managed by `/go/sql` and migration scripts in `/migrations`.
- **Specialized TraceStore:**
  - **Why:** Performance trace data is time-series and can be voluminous. A
    generic relational model might not be optimal for the typical queries
    (fetching traces over commit ranges for specific parameter sets). The
    tiled approach with inverted indexes for parameters is designed for more
    efficient retrieval.
  - **How:** The `TraceStore` (/go/tracestore) implementation uses SQL tables
    but structures them to represent tiles of commits. `ParamSets` and
    `Postings` tables act as inverted indexes for fast lookup of traces
    matching specific key-value parameters (see the posting-list sketch after
    this list). `SQLTraceStore` is the primary implementation using the SQL
    database.
- **Monolithic Executable (`perfserver`) with Modes:**
  - **Why:** Simplifies deployment and reduces the number of distinct
    binaries to manage. A single executable can be configured to run as a
    frontend, an ingester, a clusterer, or a maintenance task.
  - **How:** `perfserver` uses command-line flags and subcommands to
    determine its operational mode. Configuration files (`/configs/*.json`)
    further dictate behavior within each mode.
- **K-Means Clustering for Regression Detection:**
  - **Why:** K-means is a well-understood clustering algorithm suitable for
    grouping traces with similar performance characteristics. Changes in
    these groups over time can signal regressions. Traces are normalized
    before clustering to make them comparable despite different scales.
  - **How:** Implemented in `/go/clustering2` and `/go/kmeans`. `ctrace2`
    handles trace normalization.
- **Frontend/Backend Separation:**
  - **Why:** Standard practice for web applications. Allows independent
    development and scaling of the UI and the backend logic.
  - **How:** Backend (Go) serves JSON APIs. Frontend (TypeScript/Lit)
    consumes these APIs to render interactive views.
- **Event-Driven Alerting (Optional):**
  - **Why:** For very large and sparse datasets (like Android), continuous
    clustering over all alerts can be resource-intensive and slow.
    Event-driven alerting processes only the data relevant to recently
    updated traces, reducing latency and computational load.
  - **How:** Ingesters publish Pub/Sub events containing IDs of updated
    traces. Clusterers subscribe to these events and run relevant alert
    configurations only for the affected data.
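
As promised above, here is a minimal sketch of the inverted-index idea behind
the `Postings` table. The data shapes are simplified illustrations, not the
actual `/go/tracestore` schema: each (key, value) pair maps to a sorted list
of trace IDs, and a query such as `arch=x86 config=8888` is answered by
intersecting the relevant lists.

```
package main

import "fmt"

// postings maps "key=value" to a sorted list of trace IDs within one tile.
// Simplified stand-in for the SQL Postings table.
type postings map[string][]string

// intersect returns the trace IDs present in both sorted lists.
func intersect(a, b []string) []string {
	var out []string
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out = append(out, a[i])
			i++
			j++
		case a[i] < b[j]:
			i++
		default:
			j++
		}
	}
	return out
}

func main() {
	p := postings{
		"arch=x86":    {"trace1", "trace2", "trace4"},
		"config=8888": {"trace2", "trace3", "trace4"},
	}
	// Traces matching arch=x86 AND config=8888.
	fmt.Println(intersect(p["arch=x86"], p["config=8888"])) // [trace2 trace4]
}
```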

### 4. Module Responsibilities and Key Components

This section focuses on significant modules beyond simple file/directory
descriptions.

- **`/go/config`**:

  - **Responsibility:** Defines and validates the structure for instance
    configuration files (`InstanceConfig`). This is the central place where
    all settings for a Perf deployment (database, ingestion sources, Git
    repo, UI features, notification settings) are specified.
  - **Why:** Configuration files allow Perf to be deployed for different
    projects with different data sources and requirements without code
    changes. A strongly-typed Go struct ensures that configurations are
    well-defined and can be validated.
  - **How:** `InstanceConfig` is a Go struct with fields for various aspects
    of the system. JSON files in `/configs` are unmarshaled into this struct.
    The module provides functions to load and validate these configurations.
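
To make the loading step concrete, here is a minimal, hypothetical sketch of
unmarshaling an instance configuration with `encoding/json`. The struct below
is an illustrative stand-in, not the real `InstanceConfig` schema, which is
defined and documented in `/go/config`:

```
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// instanceConfig is a hypothetical, trimmed-down stand-in for
// config.InstanceConfig; consult /go/config for the real fields.
type instanceConfig struct {
	URL             string `json:"URL"`
	Contact         string `json:"contact"`
	DataStoreConfig struct {
		DataStoreType string `json:"datastore_type"`
		TileSize      int32  `json:"tile_size"`
	} `json:"data_store_config"`
}

func loadConfig(path string) (*instanceConfig, error) {
	b, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("reading %s: %w", path, err)
	}
	var cfg instanceConfig
	if err := json.Unmarshal(b, &cfg); err != nil {
		return nil, fmt.Errorf("parsing %s: %w", path, err)
	}
	return &cfg, nil
}

func main() {
	cfg, err := loadConfig("configs/demo.json")
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.URL, cfg.DataStoreConfig.TileSize)
}
```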

- **`/go/ingest`**:

  - **Responsibility:** Orchestrates the entire data ingestion pipeline. This
    includes watching for new files, parsing them according to the
    `format.Format` specification, extracting performance metrics and
    metadata, associating them with Git commits, and writing the data to the
    `TraceStore`.
  - **Why:** This module is the entry point for all performance data into the
    Perf system. It needs to be robust, handle various data formats (though
    primarily the standard JSON format), and ensure data integrity.
  - **Key Sub-components:**
    - `ingest/format`: Defines the expected structure of input JSON files
      (`format.Format` Go struct) and provides validation. This ensures data
      consistency.
    - `ingest/parser`: Contains logic to parse the `format.Format` structure
      and extract individual trace measurements and their associated
      parameters.
    - `ingest/process`: Coordinates the steps: reading from a source (e.g.,
      GCS via `/go/file`), parsing, resolving commit information (via
      `/go/git`), and writing to the `TraceStore`.
  - **Workflow:**
    * A `Source` (e.g., `GCSSource` via PubSub) indicates a new file.
    * `process` reads the file.
    * `parser` and `format` validate and extract `Result`s.
    * For each `Result`, its `git_hash` is resolved to a `CommitNumber` using
      `/go/git`.
    * Traces are constructed and written to `/go/tracestore`.
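
The workflow above reduces to a simple pipeline. The sketch below is a
deliberately simplified, hypothetical rendering of that loop; the interface
and type names (`result`, `gitResolver`, `traceStore`) are stand-ins for the
real ones in `/go/file`, `/go/ingest`, `/go/git`, and `/go/tracestore`:

```
package ingest

import (
	"context"
	"fmt"
)

// Hypothetical stand-ins for the real types in /go/ingest, /go/git,
// and /go/tracestore.
type result struct {
	GitHash string
	Params  map[string]string
	Value   float64
}

type gitResolver interface {
	CommitNumberFromGitHash(ctx context.Context, hash string) (int64, error)
}

type traceStore interface {
	Write(ctx context.Context, commit int64, params map[string]string, value float64) error
}

// ingestFile mirrors the documented steps: for each parsed Result,
// resolve the commit, then write the trace value.
func ingestFile(ctx context.Context, results []result, git gitResolver, store traceStore) error {
	for _, r := range results {
		commit, err := git.CommitNumberFromGitHash(ctx, r.GitHash)
		if err != nil {
			return fmt.Errorf("resolving %s: %w", r.GitHash, err)
		}
		if err := store.Write(ctx, commit, r.Params, r.Value); err != nil {
			return err
		}
	}
	return nil
}
```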

- **`/go/tracestore`**:

  - **Responsibility:** Manages the storage and retrieval of performance
    trace data. This is a critical component for efficient querying.
  - **Why:** Trace data is time-series and multi-dimensional. The
    `TraceStore` is designed to efficiently retrieve trace values for
    specific parameter combinations over ranges of commits.
  - **How:** It uses a "tiled" storage approach. Commits are grouped into
    tiles.
    - `TraceValues` table: Stores the actual metric values, often sharded by
      tile.
    - `ParamSets` table: Stores unique key-value pairs found in trace
      identifiers within each tile.
    - `Postings` table: An inverted index mapping (tile, param_key,
      param_value) to a list of trace IDs that contain that key-value pair
      within that tile. This structure allows queries like "get all traces
      where `config=8888` and `arch=x86`" to be resolved efficiently by
      intersecting posting lists.
    - `SQLTraceStore` is the primary implementation using the SQL database.

- **`/go/git`**:

  - **Responsibility:** Interacts with Git repositories to fetch commit
    information (hashes, authors, timestamps, messages). It also caches this
    information in the SQL database to avoid repeated Git operations.
  - **Why:** Perf needs to correlate performance data with specific code
    changes. This module provides the link between `git_hash` values in
    ingested data and Perf's internal `CommitNumber` sequence.
  - **How:** It can use either a local Git checkout (via `git` CLI) or a
    Gitiles service API. It maintains a `Commits` table in the SQL database,
    mapping commit hashes to `CommitNumber`s and storing other metadata. It
    periodically updates its local Git repository clone or queries Gitiles
    for new commits.
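
A minimal sketch of the hash-to-CommitNumber lookup described above, using
`database/sql`. The table and column names are assumptions for illustration
only; the actual schema lives in `/go/sql`:

```
package gitinfo

import (
	"context"
	"database/sql"
	"fmt"
)

// commitNumberFromHash looks up Perf's internal commit number for a Git
// hash. The names "Commits", "commit_number", and "git_hash" are
// illustrative assumptions, not the actual schema.
func commitNumberFromHash(ctx context.Context, db *sql.DB, hash string) (int64, error) {
	const q = `SELECT commit_number FROM Commits WHERE git_hash = $1`
	var n int64
	if err := db.QueryRowContext(ctx, q, hash).Scan(&n); err != nil {
		if err == sql.ErrNoRows {
			return 0, fmt.Errorf("unknown git hash %q", hash)
		}
		return 0, err
	}
	return n, nil
}
```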

- **`/go/regression`**:

  - **Responsibility:** Handles the detection, storage, and management of
    performance regressions.
  - **Why:** This is a core function of Perf. It provides the logic to
    identify when performance has changed significantly and to track the
    status of these findings.
  - **How:**
    - It uses clustering results (from `/go/clustering2`) and step-fit
      analysis (from `/go/stepfit`) to identify "Interesting" clusters.
    - `Store` interface (implemented by `sqlregression2store`): Persists
      information about detected regressions, including the cluster summary,
      owning alert, commit hash, regression statistic, and triage status
      (`New`, `Ignore`, `Bug`).
    - The "Alerting" algorithm described in `DESIGN.md` (comparing new
      interesting clusters with existing ones based on trace fingerprints) is
      implemented here to manage the lifecycle of a regression.
  - **Key Workflow for Alerting/Regression Tracking:**

    ```
    Run clustering (e.g., hourly or event-driven)
        |
        V
    Identify "Interesting" new clusters (high |Regression| score)
        |
        V
    For each new Interesting Cluster:
        Compare fingerprint (top N traces) with existing relevant Clusters in DB
        |
        +-- No match? --> New Regression: Store in DB with status "New".
        |
        +-- Match found? --> Update existing Regression if new one has better
                             |Regression| score. Keep triage status of existing.
    ```
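
The "Regression" statistic used above (StepSize / LeastSquaresError) can be
illustrated with a small step-fit sketch. This is a simplified reading of the
idea described in `DESIGN.md`, not the actual `/go/stepfit` implementation:
fit a two-level step at the midpoint of a normalized centroid, then divide
the step height by the residual error.

```
package stepfit

import "math"

// regression fits a step function to trace at the midpoint and returns
// StepSize / LeastSquaresError, a simplified version of the statistic
// described in DESIGN.md. A large absolute result means the trace looks
// like a step change ("Interesting").
func regression(trace []float64) float64 {
	n := len(trace)
	if n < 2 {
		return 0
	}
	mid := n / 2
	mean := func(s []float64) float64 {
		sum := 0.0
		for _, v := range s {
			sum += v
		}
		return sum / float64(len(s))
	}
	m0, m1 := mean(trace[:mid]), mean(trace[mid:])
	stepSize := m1 - m0
	// Least-squares error of the fitted two-level step.
	sse := 0.0
	for i, v := range trace {
		fit := m0
		if i >= mid {
			fit = m1
		}
		sse += (v - fit) * (v - fit)
	}
	lse := math.Sqrt(sse / float64(n))
	if lse == 0 {
		lse = 1e-9 // Avoid division by zero on a perfectly clean step.
	}
	return stepSize / lse
}
```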

- **`/go/frontend`**:

  - **Responsibility:** Implements the backend for the Perf web user
    interface. It handles HTTP requests, interacts with data stores
    (TraceStore, AlertStore, RegressionStore, etc.), processes data, and
    serves JSON responses to the frontend.
  - **Why:** This module connects the user's browser interactions to Perf's
    data and analytical capabilities.
  - **How:** It uses Go's standard `net/http` package to define HTTP handlers
    for various API endpoints (e.g., fetching data for plots, listing alerts,
    updating triage statuses). It authenticates users based on the
    `X-WEBAUTH-USER` header. It often fetches data, converts it into
    `DataFrame` structures, and then serializes these to JSON for the
    frontend.
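
A minimal sketch of the handler pattern just described: read the
authenticated user from the `X-WEBAUTH-USER` header and return JSON. The
endpoint path and payload shape are invented for illustration; the real
handlers live in `/go/frontend`:

```
package main

import (
	"encoding/json"
	"net/http"
)

// alertsResponse is a made-up payload shape for illustration only.
type alertsResponse struct {
	User   string   `json:"user"`
	Alerts []string `json:"alerts"`
}

func alertsHandler(w http.ResponseWriter, r *http.Request) {
	// Perf trusts the X-WEBAUTH-USER header set by the auth proxy.
	user := r.Header.Get("X-WEBAUTH-USER")
	if user == "" {
		http.Error(w, "unauthenticated", http.StatusUnauthorized)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(alertsResponse{
		User:   user,
		Alerts: []string{}, // Would be loaded from the alerts store.
	})
}

func main() {
	http.HandleFunc("/_/alerts/list", alertsHandler) // Hypothetical endpoint.
	http.ListenAndServe(":8000", nil)
}
```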

- **`/modules` (Frontend TypeScript)**:

  - **Responsibility:** Contains the TypeScript source code for all frontend
    custom elements (web components) and UI logic. These modules are compiled
    into JavaScript and CSS that run in the user's browser.
  - **Why:** This is where the user interface is built. Modularity (one
    component per file/directory) makes the frontend codebase manageable.
    Custom elements (often using Lit) provide encapsulation and reusability.
  - **How:** Each subdirectory typically defines one or more custom elements
    (e.g., `plot-simple-sk`, `alert-config-sk`, `query-sk`). These elements
    handle rendering, user interaction, and making API calls to the Go
    backend.
    - `perf-scaffold-sk`: Provides the main page layout (header, sidebar,
      content area).
    - `explore-simple-sk` / `explore-sk`: Core components for querying data
      and displaying plots.
    - `json/index.ts`: Contains TypeScript interfaces mirroring Go backend
      structs for type-safe API communication. This is crucial for ensuring
      frontend and backend data structures are compatible. It's often
      generated from Go source using `/go/ts/ts.go`.

- **`/pages`**:

  - **Responsibility:** Defines the top-level HTML structure for each
    distinct page of the Perf application (e.g., alerts page, exploration
    page).
  - **Why:** These files serve as the entry points for specific views. They
    are kept minimal, primarily including the `perf-scaffold-sk` and the
    main page-specific custom element.
  - **How:** Each HTML file (e.g., `alerts.html`) includes the
    `perf-scaffold-sk` and the relevant page element (e.g.,
    `<alerts-page-sk>`). An associated TypeScript file (e.g., `alerts.ts`)
    imports the necessary custom element definitions. Server-side Go
    templates inject initial context data (`window.perf = {{.context}};`)
    into the HTML.

- **`DESIGN.md`**:

  - **Significance:** This document is the primary source for understanding
    the high-level architecture, design rationale, and core algorithms of
    Perf, particularly for clustering and alerting.
  - **Key Concepts Explained:**
    - **Clustering:** Details the use of k-means clustering on normalized
      traces, the Euclidean distance metric, and the calculation of the
      "Regression" statistic (StepSize / LeastSquaresError) to identify
      "Interesting" clusters.
    - **Alerting Algorithm:** Explains how Perf identifies and tracks unique
      regressions over time by fingerprinting clusters and comparing new
      interesting clusters to existing ones. It outlines the schema for the
      `clusters` table (though the actual schema is in `/go/sql` and may have
      evolved into the `Regressions` table).
    - **Event Driven Alerting:** Describes an alternative to continuous
      clustering, triggered by Pub/Sub events when new data arrives. This is
      beneficial for large, sparse datasets.

- **`FORMAT.md`**:

  - **Significance:** Defines the precise JSON structure that Perf ingesters
    expect for input data files.
  - **Key Elements:** Specifies fields like `git_hash`, `key` (for global
    parameters), and `results` (an array of measurements). Each result can
    have its own `key` (for test-specific parameters like `test` name and
    `units`) and either a single `measurement` or a more complex
    `measurements` object for statistics (min, max, median). This document
    is crucial for data producers who need to integrate with Perf.

- **`BUILD.bazel` (Root)**:

  - **Significance:** Defines how the Perf application is built using Bazel.
    It specifies container images (`perfserver`, `backendserver`) that
    package the Go executables and necessary static resources (configs,
    frontend assets).
  - **How:** Uses `skia_app_container` rules to assemble Docker images. It
    copies the `perfserver` and `perf-tool` binaries, configuration files
    from `/configs`, and compiled frontend assets (HTML, JS, CSS from
    `/pages` built output) into the image. The `entrypoint` for the
    `perfserver` image is the `perfserver` executable itself.

### 5. Key Workflows Illustrated (Pseudographic Diagrams)

**A. New Alert Creation via UI and API:**

```
User (in Perf UI, e.g., on /alerts page)
    |
    | Fills out Alert configuration form (<alert-config-sk> element)
    | Clicks "Save"
    |
    V
Frontend JS (<alert-config-sk>)
    |
    | 1. If new alert, GET /_/alert/new
    |    (Server responds with a pre-populated Alert JSON with id: -1)
    |
    | 2. Modifies this Alert JSON based on form input
    |
    | 3. POST modified Alert JSON to /_/alert/update
    |    (Authorization: Bearer token if auth is enabled)
    |
    V
Perf Backend (`/go/frontend/service.go` - UpdateAlertHandler)
    |
    | Receives Alert JSON
    | If alert.ID == -1, it's a new alert.
    | Validates Alert configuration
    | Persists Alert to SQL Database (via `alerts.Store`)
    | Responds 200 OK
    |
    V
SQL Database (Alerts Table)
    |
    | New Alert record is created or existing one updated.
```

**Rationale:**

- The `GET /_/alert/new` step is a convenience. It provides the frontend with
  a valid `Alert` structure, including any instance-default values,
  simplifying new alert creation logic on the client.
- Using `id: -1` to signify a new alert during the `POST` to `/_/alert/update`
  is a common pattern to allow a single endpoint to handle both creation and
  updates. The backend inspects the ID to determine the correct action.
- The API interactions are documented in `API.md`.
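
The create-or-update round trip above could be exercised from Go roughly as
follows. This is a hedged sketch of an HTTP client against the documented
endpoint; the `Alert` fields shown are illustrative stand-ins, and the real
structure is defined in `/go/alerts` and `API.md`:

```
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// alert is a trimmed-down, illustrative stand-in for the real Alert
// struct defined in /go/alerts.
type alert struct {
	ID          int    `json:"id"`
	DisplayName string `json:"display_name"`
	Query       string `json:"query"`
}

func main() {
	// id: -1 signals "create" to the /_/alert/update endpoint.
	a := alert{ID: -1, DisplayName: "Decode regressions", Query: "test=decode"}
	body, _ := json.Marshal(a)

	resp, err := http.Post(
		"https://perf.example.org/_/alert/update", // Hypothetical instance URL.
		"application/json",
		bytes.NewReader(body),
	)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status) // 200 OK on success.
}
```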

**B. Data Ingestion and Event-Driven Regression Detection:**

```
Benchmark System
    |
    | Produces performance_data.json (Perf Ingestion Format)
    | Uploads to GCS: gs://[bucket]/[path]/YYYY/MM/DD/HH/performance_data.json
    |
    V
Google Cloud Storage
    |
    | File "OBJECT_FINALIZE" event
    | Publishes message to PubSub Topic (e.g., "perf-ingestion-topic")
    |
    V
Perf Ingest Service(s) (Subscribed to "perf-ingestion-topic")
    |
    | 1. Receives PubSub message (contains GCS file path)
    | 2. Downloads performance_data.json from GCS
    | 3. Parses JSON, validates data (see /go/ingest/format, /go/ingest/parser)
    | 4. Looks up git_hash in /go/git to get CommitNumber
    | 5. Writes trace data to TraceStore (SQL tables)
    | 6. If Event Driven Alerting enabled for this instance:
    |      Constructs a list of Trace IDs updated by this file
    |      Publishes message (containing gzipped Trace IDs) to another
    |      PubSub Topic (e.g., "trace-update-topic")
    |
    V
Perf Cluster Service(s) (Subscribed to "trace-update-topic")
    |
    | 1. Receives PubSub message (with updated Trace IDs)
    | 2. For each Alert Configuration (/go/alerts):
    |      If Alert's query matches any of the updated Trace IDs:
    |        Run clustering & regression detection for THIS Alert,
    |        focusing on the commit range and data relevant to the updated
    |        traces. (Reduces scope compared to full continuous clustering)
    | 3. If regressions found:
    |      Store in SQL Database (Regressions table)
    |      Send notifications (email, issue tracker)
```

**Rationale:**

- **GCS as Entry Point:** As described in `FORMAT.md` and `DESIGN.md`, GCS is
  the standard way data enters Perf. The YYYY/MM/DD/HH path structure is a
  convention.
- **Pub/Sub for Decoupling and Scalability:** Ingesters don't need to poll
  GCS. Pub/Sub handles event delivery, and multiple ingesters can process
  files in parallel.
- **Event-Driven Clustering Optimization:** `DESIGN.md` explicitly states this
  is for large/sparse datasets. Sending only updated Trace IDs significantly
  narrows the scope of clustering for each event, making it faster and less
  resource-intensive than re-clustering everything. Gzipping the trace ID
  lists keeps each message under Pub/Sub's 10 MB limit.
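
A minimal sketch of the subscriber side of this flow using the Cloud Pub/Sub
Go client. The project and subscription names are assumptions for
illustration, and the payload decoding is elided; the real event format is
defined in `/go/ingestevents`:

```
package main

import (
	"context"
	"log"

	"cloud.google.com/go/pubsub"
)

func main() {
	ctx := context.Background()
	// Hypothetical project and subscription names.
	client, err := pubsub.NewClient(ctx, "my-gcp-project")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	sub := client.Subscription("trace-update-topic-sub")
	// Receive blocks, invoking the callback for each message; multiple
	// cluster service replicas can share one subscription to split work.
	err = sub.Receive(ctx, func(ctx context.Context, m *pubsub.Message) {
		// m.Data would hold the gzipped list of updated trace IDs,
		// decoded by /go/ingestevents in the real system.
		log.Printf("received %d bytes of trace IDs", len(m.Data))
		m.Ack()
	})
	if err != nil {
		log.Fatal(err)
	}
}
```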

This documentation provides a comprehensive starting point for a software
engineer to understand the Skia Perf project. It covers its purpose,
architecture, core concepts, and the rationale behind key design and
implementation choices, referencing existing documentation and source code
structure where appropriate.

# Module: /cockroachdb

The `/cockroachdb` module provides a set of shell scripts designed to
facilitate interaction with a CockroachDB instance, specifically one named
`perf-cockroachdb`, which is presumed to be running within a Kubernetes
cluster. These scripts abstract away some of the complexities of `kubectl`
commands, offering streamlined access for common database operations.

The primary motivation behind these scripts is to simplify development and
administrative workflows. Instead of requiring users to remember and type
lengthy `kubectl` commands with specific flags and resource names, these
scripts provide convenient, single-command access points.

**Key Components and Responsibilities:**

- **`admin.sh`**: This script provides access to the CockroachDB
  administrative web interface.

  - **Why**: The web UI is a crucial tool for monitoring database health and
    performance and for managing cluster settings. Direct access via
    `kubectl port-forward` can be cumbersome to set up repeatedly.
  - **How**: It executes `kubectl port-forward` to map the local port `8080`
    to port `8080` of the `perf-cockroachdb-0` pod, then immediately opens
    this local address in Google Chrome so the UI is available right away.
    This assumes Google Chrome is installed and available in the system's
    PATH.

    ```
    User runs admin.sh
        |
        V
    Script executes: kubectl port-forward perf-cockroachdb-0 8080
        |
        V
    Local port 8080 now forwards to CockroachDB pod's port 8080
        |
        V
    Script executes: google-chrome http://localhost:8080
        |
        V
    CockroachDB Admin UI opens in Chrome
    ```

- **`connect.sh`**: This script is designed to provide a SQL shell connection
  to the CockroachDB instance.

  - **Why**: Developers and administrators frequently need to execute SQL
    queries directly against the database for debugging, data manipulation,
    or schema inspection. Setting up an interactive `kubectl run` command
    with the correct image and arguments can be error-prone.
  - **How**: It uses `kubectl run` to create a temporary, interactive pod
    named `androidx-cockroachdb`. This pod uses the
    `cockroachdb/cockroach:v19.2.5` Docker image. The `--rm` flag ensures
    the pod is deleted after the session ends, and `--restart=Never`
    prevents it from being restarted. The crucial part is the command passed
    to the pod: `sql --insecure --host=perf-cockroachdb-public`. This starts
    the CockroachDB SQL client, connecting insecurely to the database
    service exposed at `perf-cockroachdb-public`.

    ```
    User runs connect.sh
        |
        V
    Script executes: kubectl run androidx-cockroachdb -it --image=... --rm
                     --restart=Never -- sql --insecure --host=perf-cockroachdb-public
        |
        V
    Temporary pod 'androidx-cockroachdb' is created
        |
        V
    CockroachDB SQL client starts inside the pod, connecting to 'perf-cockroachdb-public'
        |
        V
    User has an interactive SQL shell
        |
        V
    User exits shell -> Pod 'androidx-cockroachdb' is deleted
    ```

- **`skia-infra-public-port-forward.sh`**: This script sets up a port forward
  for direct database connections, typically for use with a local CockroachDB
  SQL client or other database tools.

  - **Why**: While `connect.sh` provides an in-cluster SQL shell, sometimes a
    direct connection from the local machine is preferred, for instance, to
    use graphical SQL clients or specific client libraries that are not
    available within the temporary pod created by `connect.sh`. The
    `perf-cockroachdb` instance is likely within a private network in the
    Kubernetes cluster (namespace `perf`), and this script makes it
    accessible locally.
  - **How**: It leverages a helper script `../../kube/attach.sh
    skia-infra-public` (the details of which are outside this module's scope
    but presumably handles Kubernetes context or authentication for the
    `skia-infra-public` cluster). This helper script is then used to execute
    `kubectl port-forward` specifically for the `perf-cockroachdb-0` pod
    within the `perf` namespace. It maps local port `25000` to the pod's
    CockroachDB port `26257`. The script also helpfully prints instructions
    for the user on how to connect using the `cockroach sql` command once the
    port forward is active. The `set -e` command ensures the script exits
    immediately if any command fails, and `set -x` enables command tracing
    for debugging.

    ```
    User runs skia-infra-public-port-forward.sh
        |
        V
    Script prints connection instructions
        |
        V
    Script executes: ../../kube/attach.sh skia-infra-public kubectl port-forward
                     -n perf perf-cockroachdb-0 25000:26257
        |
        V
    Port forward is established: local:25000 -> perf-cockroachdb-0:26257 (in 'perf' namespace)
        |
        V
    User can now run 'cockroach sql --insecure --host=127.0.0.1:25000' in another terminal
    ```

These scripts collectively aim to make interacting with the `perf-cockroachdb`
instance as straightforward as possible by encapsulating the necessary
`kubectl` commands and providing context-specific instructions or actions.
They rely on the Kubernetes cluster being correctly configured and accessible,
and on `kubectl` and potentially `google-chrome` being available on the user's
system.

# Module: /configs

The `/configs` directory houses JSON configuration files for various instances
of the Perf performance monitoring system. Each file defines the specific
behavior and data sources for a particular Perf deployment. These
configurations are crucial for tailoring Perf to different projects and
environments, enabling developers and performance engineers to monitor and
analyze performance data effectively.

The core idea is to provide a declarative way to set up a Perf instance.
Instead of hardcoding settings, these JSON files act as blueprints. Each file
serializes to and from a Go struct named `config.InstanceConfig`. This struct
serves as the canonical schema for all instance configurations, and its Go
documentation provides detailed explanations of each field. This approach
ensures consistency and makes it easier to manage and evolve the configuration
options.

**Key Components and Responsibilities:**

The primary responsibility of this module is to define and store these
instance configurations. Each JSON file represents a distinct Perf instance,
often corresponding to a specific project or a particular version of a project
(e.g., a public vs. internal build, or a stable vs. experimental branch).

- **Instance-Specific Configuration Files (e.g., `android2.json`,
  `chrome-public.json`):**

  - **Why:** Each project or system being monitored by Perf has unique
    requirements. These include where its performance data is stored (e.g.,
    GCS buckets), how it's ingested (e.g., Pub/Sub topics), which Git
    repository tracks its code changes, how users authenticate, and how
    notifications for regressions are handled.
  - **How:** These files use a JSON structure that maps directly to the
    `config.InstanceConfig` Go struct.
    - `URL`: The public-facing URL of the Perf instance.
    - `data_store_config`: Defines the backend database (e.g., CockroachDB,
      Spanner), connection strings, and parameters like `tile_size`, which
      can impact query performance and data retrieval efficiency. The choice
      between CockroachDB and Spanner often depends on scalability needs and
      existing infrastructure.
    - `ingestion_config`: Specifies how performance data is brought into
      Perf. This includes the `source_type` (e.g., `gcs` for Google Cloud
      Storage, `dir` for local directories), the specific `sources` (e.g.,
      GCS bucket paths or local file paths), and Pub/Sub topics for
      real-time ingestion. This section is vital for connecting Perf to the
      data producers.
    - `git_repo_config`: Links Perf to the source code repository. This
      allows Perf to correlate performance data with specific code changes
      (commits). It includes the repository `url`, the `provider` (e.g.,
      `gitiles`, `git`), and sometimes a `commit_number_regex` to extract
      meaningful commit identifiers from commit messages.
    - `notify_config`: Configures how alerts and notifications are sent when
      regressions are detected. This can range from `none` to `html_email`,
      `markdown_issuetracker`, or `anomalygroup`. It often includes templates
      for notification subjects and bodies, leveraging placeholders like
      `{{ .Alert.DisplayName }}` to include dynamic information.
    - `auth_config`: Defines the authentication mechanism, commonly using a
      header like `X-WEBAUTH-USER` for integration with existing
      authentication systems.
    - `query_config`: Customizes how users can query and view data, including
      which parameters are available for filtering (`include_params`),
      default selections, and URL value defaults to tailor the user
      experience. It can also include caching configurations (e.g., using
      Redis) to improve query performance by specifying `cache_config` with
      `level1_cache_key` and `level2_cache_key`.
    - `anomaly_config`: Contains settings related to anomaly detection, such
      as `settling_time`, which defines how long Perf waits before
      considering new data for anomaly detection, helping to avoid flagging
      transient issues.
    - Other fields like `contact`, `ga_measurement_id` (for Google
      Analytics), `feedback_url`, `trace_sample_proportion` (to control the
      volume of detailed trace data collected), and `favorites` (for
      pre-defined links on the Perf UI) further customize the instance.

- **Example Workflow (Data Ingestion and Alerting for `android2.json`):**

  * **Data Production:** Android benchmarks generate performance data.
  * **Data Upload:** This data is uploaded to GCS buckets specified in
    `ingestion_config.source_config.sources` (e.g.,
    `gs://android-perf-2/android2`).
  * **Pub/Sub Notification:** A message is sent to the Pub/Sub topic
    `perf-ingestion-android2-production`.
  * **Perf Ingestion Service:** The Perf ingestion service, subscribed to
    this topic, reads the new data file from GCS.
  * **Data Processing & Storage:** Perf processes the data, associates it
    with the corresponding commit from the `git_repo_config` (e.g.,
    `https://android.googlesource.com/platform/superproject`), and stores it
    in the CockroachDB instance defined in `data_store_config`.
  * **Anomaly Detection:** Perf's anomaly detection algorithms analyze the
    new data points.
  * **Regression Found:** A regression is detected based on the
    `anomaly_config` settings.
  * **Notification Sent:** A notification is generated according to
    `notify_config`. For `android2.json`, this means an issue is filed in an
    issue tracker (`"notifications": "markdown_issuetracker"`) with a subject
    and body formatted using the provided templates, including details like
    affected tests and devices.

- **`local.json`:**

  - **Why:** Provides a standardized configuration for local development and
    manual testing of Perf. It's designed to be self-contained and not rely
    on external production services.
  - **How:** It typically points the `ingestion_config` to a local directory
    (`integration/data`) that contains sample data. This data is often the
    same data used for unit tests, ensuring consistency between testing
    environments. The database connection will also point to a local
    instance.

- **`demo.json` and `demo_spanner.json`:**

  - **Why:** These configurations are likely used for demonstration purposes
    or for setting up small-scale, illustrative Perf instances. They showcase
    Perf's capabilities with sample data.
  - **How:** Similar to `local.json`, `demo.json` uses a local directory for
    data ingestion (`"./demo/data/"`) and a local CockroachDB instance.
    `demo_spanner.json` is analogous but configured to use Spanner as the
    backend, demonstrating flexibility in data store choices. They often
    include simpler `git_repo_config` pointing to public demo repositories
    (e.g., `https://github.com/skia-dev/perf-demo-repo.git`). The `favorites`
    section in `demo.json` shows how to add curated links to the Perf UI.

- **`/spanner` subdirectory:**

  - **Why:** This subdirectory groups configurations for Perf instances that
    specifically use Google Cloud Spanner as their backend data store.
    Spanner is chosen for its scalability, strong consistency, and global
    distribution capabilities, making it suitable for large-scale Perf
    deployments.
  - **How:** Files within this directory (e.g., `spanner/chrome-public.json`,
    `spanner/skia-public.json`) will have their
    `data_store_config.datastore_type` set to `"spanner"`. They often include
    Spanner-specific settings or optimizations. For example,
    `enable_follower_reads` might be set to `true` in `data_store_config` for
    Spanner instances to distribute read load. Many of these configurations
    also define `redis_config` within their `query_config.cache_config` to
    further enhance query performance for frequently accessed data.
  - The `optimize_sqltracestore` flag, often set to `true` in Spanner
    configurations, indicates that specific optimizations for the SQL-based
    trace store are enabled, likely tailored to Spanner's characteristics.
  - Configurations like `chrome-internal.json` and `chrome-public.json`
    demonstrate sophisticated setups, including:
    - `commit_number_regex` in `git_repo_config` to extract structured commit
      positions.
    - `temporal_config` for integrating with Temporal workflows for tasks
      like regression grouping and bisection.
    - `enable_sheriff_config` to integrate with sheriffing systems for
      managing alerts.
    - `trace_format: "chrome"`, indicating that the performance data adheres
      to the Chrome trace event format.

The choice of fields and their values within each JSON file reflects a series
of design decisions aimed at balancing flexibility, performance, and
operational manageability for each specific Perf instance. For instance, the
`tile_size` in `data_store_config` is adjusted based on expected data
characteristics and query patterns. Similarly, `trace_sample_proportion` is
set to manage storage costs and processing load while still capturing enough
data for meaningful analysis. The `notify_config` templates are crafted to
provide actionable information to developers when regressions occur.
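
Since `notify_config` templates use placeholders like
`{{ .Alert.DisplayName }}`, they follow Go's `text/template` rendering model.
The sketch below demonstrates that model with a made-up template and context;
the real template fields and rendering code live in Perf's notification
machinery (`/go/notify`):

```
package main

import (
	"os"
	"text/template"
)

// notifyContext is an illustrative context shape only; the real
// notification context is defined by Perf itself.
type notifyContext struct {
	Alert struct{ DisplayName string }
}

func main() {
	// A made-up subject template in the style of notify_config.
	const subject = "Regression found for {{ .Alert.DisplayName }}\n"

	tmpl := template.Must(template.New("subject").Parse(subject))
	var ctx notifyContext
	ctx.Alert.DisplayName = "android2 decode time"
	// Prints: Regression found for android2 decode time
	if err := tmpl.Execute(os.Stdout, ctx); err != nil {
		panic(err)
	}
}
```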

# Module: /csv2days

## csv2days Module Documentation

### Overview

The `csv2days` module is a command-line utility designed to process CSV files
downloaded from the Perf performance monitoring system. Its primary purpose is
to simplify time-series data by consolidating multiple data points from the
same calendar day into a single representative value. This is particularly
useful when analyzing performance trends over longer periods, where daily
granularity is sufficient and finer-grained timestamps can introduce noise or
unnecessary complexity.

The core problem this module solves is the overabundance of data points when
Perf exports data at a high temporal resolution (e.g., multiple commits per
day). For certain types of analysis, this level of detail is not required and
can make it harder to discern broader trends. `csv2days` transforms such CSVs
by keeping only the first encountered data column for each unique day and
aggregating subsequent values from the same day into that single column using
a "max" aggregation strategy.

### Design and Implementation

The module operates as a streaming processor. It reads the input CSV file row
by row, processes the header to determine which columns to modify or drop, and
then transforms each subsequent data row accordingly before writing it to
standard output.

**Key Design Choices:**

1. **Command-Line Interface:** The tool is designed as a simple command-line
   application for ease of integration into scripting workflows. It takes an
   input file path via the `--in` flag and outputs the transformed CSV to
   `stdout`. This follows common Unix philosophies for tool interoperability.
2. **Streaming Processing:** Instead of loading the entire CSV into memory,
   which could be problematic for very large files, `csv2days` processes the
   file line by line. This makes the tool memory-efficient.
3. **Date-Based Grouping:** The core logic revolves around identifying columns
   that represent timestamps. It uses a regular expression (`datetime`) to
   match RFC3339 formatted dates in the header row. The date part (YYYY-MM-DD)
   of these timestamps is used for grouping.
4. **"First Seen" Column for a Day:** For each unique calendar day encountered
   in the header, only the first column corresponding to that day is retained
   in the output header. Subsequent columns from the same day are marked for
   removal.
5. **"Max" Aggregation:** When multiple columns from the same day are
   encountered in a data row, the values from these columns are aggregated.
   The `csv2days` tool currently implements a "max" aggregation strategy: for
   the set of values corresponding to a single day, the maximum numerical
   value is chosen. If non-numerical values are encountered, the first value
   in the sequence is typically used.
6. **Reverse Sorted Index Removal:** When removing columns, the indices of
   columns to be skipped (`skipCols`) are sorted in reverse order. This is
   crucial because removing an element from a slice shifts the indices of
   subsequent elements. Processing removals from right-to-left (largest index
   to smallest) ensures that the indices remain valid throughout the removal
   process.

**Workflow:**

The main workflow within `transformCSV` can be visualized as follows:

```
Read Input CSV File (--in flag)
    |
    v
Parse Header Row
    |
+----------------------------------------------------------------------+
| Identify Timestamp Columns (using RFC3339 regex)                      |
| For each timestamp:                                                   |
|   Extract Date (YYYY-MM-DD)                                           |
|   If new date:                                                        |
|     Add Date to Output Header                                         |
|     Record current column as start of a new "run" for this day        |
|   Else (same date as previous timestamp):                             |
|     Mark current column for skipping (`skipCols`)                     |
|     Increment length of current day's "run" (`runLengths`)            |
|                                                                        |
| Non-timestamp columns are added to Output Header as-is                |
+----------------------------------------------------------------------+
    |
    v
Write Transformed Header to Output
    |
    v
Sort `skipCols` in Reverse Order
    |
    v
For each Data Row in Input CSV:
    |
+----------------------------------------------------------------------+
| Apply "Max" Aggregation:                                              |
|   For each "run" of columns belonging to the same day (from header):  |
|     Find the maximum numerical value in the corresponding cells       |
|     Replace the first cell of the run with this max value             |
+----------------------------------------------------------------------+
    |
    v
Remove Skipped Columns (based on `skipCols` from header processing)
    |
    v
Write Transformed Data Row to Output
    |
    v
Flush Output Buffer
```

### Key Components/Files

- **`main.go`**: This is the heart of the module.
  - **`main()` function**: Handles command-line flag parsing (`--in` for the
    input CSV file). It orchestrates the reading of the input file and calls
    `transformCSV` to perform the core logic. Error handling and logging are
    also managed here.
  - **`transformCSV(input io.Reader, output io.Writer) error`**: This is the
    core function responsible for the CSV transformation.
    - It initializes a `csv.Reader` for input and a `csv.Writer` for output.
    - **Header Processing**: It reads the first line (header) of the CSV and
      iterates through the header cells.
      - A regular expression (`datetime = regexp.MustCompile(...)`) is used
        to identify columns containing RFC3339 timestamps.
      - It maintains `lastDate` to detect when a new day starts in the
        header sequence.
      - `skipCols` (a slice of integers) stores the indices of columns that
        represent subsequent entries for an already seen day and should thus
        be removed from the data rows.
      - `runLengths` (a map of `int` to `int`) stores, for each column that
        starts a sequence of same-day entries, how many columns belong to
        that day. This is used later for aggregation. For example, if
        columns 5, 6, and 7 are all for "2023-01-15", `runLengths[5]` would
        be `3`.
      - The output header (`outHeader`) is constructed by keeping the date
        part (YYYY-MM-DD) for the first occurrence of each day and omitting
        subsequent columns for the same day. Non-date columns are passed
        through unchanged.
    - **Data Row Processing**: It then reads the rest of the CSV file row by
      row.
      - `applyMaxToRuns(s []string, runLengths map[int]int) []string`: For
        each "run" of columns identified in the header as belonging to the
        same day, this function takes the corresponding values from the
        current data row and replaces the value in the first column of that
        run with the maximum of those values. The `max(s []string) string`
        helper function is used here to find the maximum float value,
        falling back to the first string if parsing fails.
      - `removeAllIndexesFromSlices(s []string, skipCols []int) []string`:
        After aggregation, this function removes the data cells
        corresponding to the `skipCols` identified during header processing.
        It uses `removeValueFromSliceAtIndex` repeatedly. It's crucial that
        `skipCols` is sorted in reverse order for this to work correctly.
      - The transformed row is then written to the output CSV.
  - **Helper Functions**:
    - `removeValueFromSliceAtIndex(s []string, index int) []string`: A
      utility to remove an element at a specific index from a string slice.
    - `max(s []string) string`: Iterates through a slice of strings, attempts
      to parse them as floats, and returns the string representation of the
      maximum float found. If no floats are found or parsing errors occur, it
      defaults to returning the first string in the input slice. This
      function underpins the aggregation logic.
- **`main_test.go`**: Contains unit tests for the `transformCSV` function.
  - `TestTransformCSV_HappyPath`: Provides a simple input CSV string and the
    expected output string. It then calls `transformCSV` with these and
    asserts that the actual output matches the expected output. This serves
    as a concrete example of the module's behavior.
- **`BUILD.bazel`**: Defines how the `csv2days` Go binary and its associated
  library and tests are built using Bazel. It specifies source files,
  dependencies (like `skerr`, `sklog`, `util`), and visibility.

The design decision to use `strconv.ParseFloat` and handle potential errors by
continuing or defaulting implies that the tool is somewhat lenient with
non-numeric data in columns expected to be numeric. The "max" operation will
effectively ignore non-convertible strings unless all strings in a run are
non-convertible, in which case the first string is chosen.
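
A sketch of the two helpers described above, written to match the documented
behavior (these are reconstructions for illustration, not the actual
`main.go` source):

```
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// max returns the string whose parsed float value is largest, falling
// back to the first element when nothing parses, per the documented
// behavior.
func max(s []string) string {
	best := s[0]
	bestVal, found := 0.0, false
	for _, v := range s {
		f, err := strconv.ParseFloat(v, 64)
		if err != nil {
			continue // Lenient: skip non-numeric cells.
		}
		if !found || f > bestVal {
			bestVal, best, found = f, v, true
		}
	}
	return best
}

// removeAllIndexesFromSlices deletes cells right-to-left so earlier
// removals don't shift the indices still waiting to be removed.
func removeAllIndexesFromSlices(s []string, skipCols []int) []string {
	sort.Sort(sort.Reverse(sort.IntSlice(skipCols)))
	for _, i := range skipCols {
		s = append(s[:i], s[i+1:]...)
	}
	return s
}

func main() {
	fmt.Println(max([]string{"1.5", "n/a", "2.25"}))                           // 2.25
	fmt.Println(removeAllIndexesFromSlices([]string{"a", "b", "c"}, []int{1})) // [a c]
}
```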

# Module: /demo

The `demo` module provides the necessary data and tools to showcase the
capabilities of the Perf performance monitoring system. Its primary purpose is
to offer a tangible and reproducible example of how Perf ingests and processes
performance data. This allows users and developers to understand Perf's
functionality without needing to set up a complex real-world data pipeline.

The core of this module revolves around a set of pre-generated data files and
a Go program to create them.

**Key Components:**

- **`/demo/data/` (Directory):** This directory houses the actual demo data
  files in JSON format. Each file represents performance measurements
  associated with a specific commit hash.

  - **Why:** These static files serve as the input for a 'dir' type ingester
    in a demo Perf instance. They are structured according to the
    `format.Format` specification (defined in `perf/go/ingest/format`), which
    Perf understands. This allows for a simple and direct way to feed data
    into Perf for demonstration purposes.
  - **How:** Each JSON file (e.g., `demo_data_commit_1.json`) contains a
    `git_hash`, `key` (identifying the test environment like architecture and
    configuration), and `results`. The `results` section includes
    measurements for various tests (like "encode" and "decode") across
    different units (like "ms" and "kb"). Some files also include `links`
    which can point to external resources relevant to the data point or the
    overall commit. The data in these files is designed to show some
    variation over commits to demonstrate Perf's ability to track changes and
    detect regressions/improvements. For instance, the `decode` measurement
    and `encodeMemory` show a deliberate shift in values starting from
    `demo_data_commit_6.json`.

- **`generate_data.go`:** This Go program is responsible for creating the
  JSON data files located in the `/demo/data/` directory.

  - **Why:** While the static data files are sufficient for running the demo,
    this program provides the means to regenerate or modify the demo dataset.
    This is crucial if the demo requirements change, if new Perf features
    need to be showcased with different data patterns, or if the underlying
    `format.Format` evolves. It ensures the demo data remains relevant and
    can be adapted.
  - **How:**
    - It defines a list of Git commit hashes. These hashes are specifically
      chosen from the `skia-dev/perf-demo-repo` repository, establishing a
      direct link between the performance data and a version control history,
      a common scenario in real-world Perf usage.
    - It iterates through these hashes. For each hash:
      - It programmatically generates performance values (e.g., `encode`,
        `decode`, `encodeMemory`). The generation includes some randomness
        (`rand.Float32()`) to make the data appear more realistic.
      - A deliberate change in the data generation logic is introduced for
        commits at index 5 and onwards (e.g., `multiplier = 1.2`), which
        leads to a noticeable shift in `decode` and `encodeMemory` values in
        the corresponding JSON files. This is done to demonstrate how Perf
        can track and visualize such changes.
      - It populates a `format.Format` struct (from
        `go.skia.org/infra/perf/go/ingest/format`) with the generated data,
        including the Git hash, environment keys, and the measurement
        results.
      - The `format.Format` struct is then marshaled into JSON with
        indentation for readability.
      - Finally, the JSON data is written to a file named according to the
        commit sequence (e.g., `demo_data_commit_1.json`) within the `data`
        subdirectory. The program uses the `runtime.Caller(0)` function to
        determine its own location, ensuring that the `data` directory is
        created relative to the Go file itself, making the script more
        portable.
| |
| **Workflow for Demo Data Usage:** |
| |
| ``` |
| generate_data.go --(generates)--> /demo/data/*.json files |
| | |
| V |
| Perf Ingester (type 'dir', configured to read from /demo/data/) |
| | |
| V |
| Perf System (stores, analyzes, and visualizes the data) |
| ``` |
| |
| The demo data is specifically designed to be used in conjunction with the |
| `perf/configs/demo.json` configuration file and the |
| `https://github.com/skia-dev/perf-demo-repo.git` repository. This linkage |
| provides a complete, albeit simplified, end-to-end scenario for demonstrating |
| Perf. |
| |
| # Module: /go |
| |
| ## Module: /go |
| |
| This main module, located at `/go`, serves as the root for all Go language |
| components of the Perf performance monitoring system. It encompasses a wide |
| array of functionalities, from data ingestion and storage to analysis, alerting, |
| and user interface backend logic. The design promotes modularity, with specific |
| responsibilities delegated to sub-modules. |
| |
| The system is designed to handle large volumes of performance data, track it |
| against code revisions, detect regressions automatically, and provide tools for |
| developers and performance engineers to investigate and manage performance. |
| |
| ### Key Design Philosophies and Architectural Choices: |
| |
| 1. **Modularity:** The system is broken down into numerous sub-modules (e.g., |
| `/go/alerts`, `/go/ingest`, `/go/regression`, `/go/frontend`), each with a |
| well-defined responsibility. This promotes separation of concerns, making |
| the system easier to develop, test, and maintain. |
| 2. **Interface-Based Design:** Many modules define interfaces for their core |
| components (e.g., `tracestore.Store`, `alerts.Store`, `regression.Store`). |
| This allows for different implementations to be swapped in (e.g., SQL-based |
| stores vs. in-memory mocks for testing) and promotes loose coupling. |
| 3. **Configuration-Driven Behavior:** The `/go/config` module defines a |
| comprehensive `InstanceConfig` structure, which is loaded from a JSON file. |
| This configuration dictates many aspects of an instance's behavior, |
| including database connections, data sources, alert settings, and UI |
| features. This allows for flexible deployment and customization of Perf |
| instances. |
| 4. **Asynchronous Processing and Workflows:** For long-running tasks like data |
| ingestion, regression detection, and bisection, the system leverages |
| asynchronous processing. |
| - Go routines are widely used for concurrent operations. |
| - The `/go/progress` module provides a mechanism for tracking and |
| reporting the status of such tasks to the UI. |
| - The `/go/workflows` module utilizes Temporal to orchestrate complex, |
| multi-step processes like triggering bisections and processing their |
| results. Temporal provides resilience and fault tolerance for these |
| critical operations. |
| 5. **Data Storage and Retrieval:** |
| - **SQL Database:** A relational database (primarily targeting |
| CockroachDB, with Spanner compatibility) is the main persistence layer |
| for most structured data, including alert configurations (`/go/alerts`), |
| regression details (`/go/regression`), commit information (`/go/git`), |
| user favorites (`/go/favorites`), subscriptions (`/go/subscription`), |
| and more. The `/go/sql` module manages the database schema. |
| - **Trace Data (`/go/tracestore`):** Performance trace data is stored in a |
| tiled fashion, with inverted indexes to allow for efficient querying. |
| This specialized storage approach is optimized for time-series |
| performance metrics. |
| - **File Storage (GCS):** Raw ingested data files and potentially other |
| large artifacts are often stored in Google Cloud Storage. The `/go/file` |
| and `/go/filestore` modules provide abstractions for interacting with |
| these files. |
| 6. **Caching:** Various caching strategies are employed to improve performance: |
| - In-memory LRU caches for frequently accessed data (e.g., in `/go/git`, |
| `/go/progress`). |
| - A dedicated `/go/tracecache` for trace IDs. |
| - The `/go/psrefresh` module manages caching of `ParamSet`s (used for UI |
| query builders), potentially using Redis (`/go/redis`). |
| - `/go/graphsshortcut` offers an in-memory cache for graph shortcuts, |
| especially for development. |
| 7. **External Service Integration:** |
| - **Git:** The `/go/git` module interacts with Git repositories (via local |
| CLI or Gitiles API) to fetch commit information. |
| - **Issue Trackers:** Modules like `/go/issuetracker` and `/go/culprit` |
| integrate with issue tracking systems (e.g., Buganizer) for automated |
| bug filing. |
| - **Chrome Perf:** The `/go/chromeperf` module allows communication with |
| the Chrome Performance Dashboard for reporting regressions or fetching |
| anomaly data. |
| - **Pinpoint:** The `/go/pinpoint` module provides a client for the |
| Pinpoint bisection service. |
| - **LUCI Config:** The `/go/sheriffconfig` module integrates with LUCI |
| Config for managing alert configurations. |
| 8. **Command-Line Tools:** |
| - `/go/perfserver`: The main executable for running different Perf |
| services (frontend, ingestion, clustering, maintenance). |
| - `/go/perf-tool`: A CLI for various administrative and data inspection |
| tasks. |
| - `/go/initdemo`: A tool to initialize a database for demo or development. |
| - `/go/ts`: A utility to generate TypeScript definitions from Go structs |
| for frontend type safety. |
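|
| As a toy illustration of the interface-based design in point 2 (all names
| here are invented stand-ins, not the actual Perf interfaces):
|
| ```go
| package example
|
| import "context"
|
| // Alert is a stand-in record; the real struct lives in /go/alerts.
| type Alert struct {
| 	IDAsString string
| 	Query      string
| }
|
| // Store is an illustrative contract in the spirit of alerts.Store or
| // regression.Store: callers depend on the interface, not the backend.
| type Store interface {
| 	Save(ctx context.Context, a *Alert) error
| 	List(ctx context.Context, includeDeleted bool) ([]*Alert, error)
| }
|
| // memStore is the kind of in-memory implementation used in tests; a
| // SQL-backed store would satisfy the same interface and can be swapped
| // in without touching callers.
| type memStore struct {
| 	alerts map[string]*Alert
| }
|
| func newMemStore() *memStore {
| 	return &memStore{alerts: map[string]*Alert{}}
| }
|
| func (s *memStore) Save(ctx context.Context, a *Alert) error {
| 	s.alerts[a.IDAsString] = a
| 	return nil
| }
|
| func (s *memStore) List(ctx context.Context, includeDeleted bool) ([]*Alert, error) {
| 	out := make([]*Alert, 0, len(s.alerts))
| 	for _, a := range s.alerts {
| 		out = append(out, a)
| 	}
| 	return out, nil
| }
| ```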
| |
| ### Core Workflows (Conceptual High-Level): |
| |
| 1. **Data Ingestion:** |
| |
| ``` |
| External Data Source (e.g., GCS event) |
| | |
| V |
| /go/file (Source Interface: DirSource, GCSSource) --> Raw File Data |
| | |
| V |
| /go/ingest/process (Orchestrator) |
| | |
| +--> /go/ingest/parser (Parses file based on /go/ingest/format) --> Extracted Traces & Metadata |
| | |
| +--> /go/git (Resolves Git hash to CommitNumber) |
| | |
| V |
| /go/tracestore (Writes traces, updates inverted index & ParamSets) |
| | |
| V |
| /go/ingestevents (Publishes event: "File Ingested") |
| ``` |
| |
| 2. **Regression Detection (Event-Driven Example):** |
| |
| ``` |
| /go/ingestevents (Receives "File Ingested" event) |
| | |
| V |
| /go/regression/continuous (Controller) |
| | |
| +--> /go/alerts (Loads matching Alert configurations) |
| | |
| +--> /go/dfiter & /go/dataframe & /go/dfbuilder (Prepare DataFrames for analysis) |
| | |
| V |
| /go/regression/detector (Core detection logic) |
| | |
| +--> /go/clustering2 (KMeans clustering) |
| | |
| +--> /go/stepfit (Individual trace step detection) |
| | |
| V |
| Detected Regressions |
| | |
| +--> /go/regression (Store results using Store interface, e.g., sqlregression2store) |
| | |
| +--> /go/notify (Format & send notifications via Email, IssueTracker, Chromeperf) |
| | |
| +--> /go/workflows (MaybeTriggerBisectionWorkflow for potential bisection) |
| ``` |
| |
| 3. **User Interaction (Frontend Request for Graph):** |
| |
| ``` |
| User in Browser (Requests graph) |
| | |
| V |
| /go/frontend (HTTP Handlers, e.g., graphApi) |
| | |
| +--> /go/ui/frame (ProcessFrameRequest) |
| | | |
| | +--> /go/dataframe/dfbuilder (Builds DataFrame based on query) |
| | | | |
| | | +--> /go/tracestore (Fetch trace data) |
| | | +--> /go/git (Fetch commit data) |
| | | |
| | +--> /go/calc (If formulas are used) |
| | | |
| | +--> /go/pivot (If pivot table requested) |
| | | |
| | +--> /go/anomalies (Fetch anomaly data to overlay) |
| | |
| V |
| FrameResponse (JSON data for UI) --> User in Browser |
| ``` |
| |
| 4. **Automated Bisection via Temporal Workflow:**
|
| ```
| /go/workflows.MaybeTriggerBisectionWorkflow (Triggered by significant regression)
|   |
|   +--> Waits for related anomalies to group
|   |
|   +--> /go/anomalygroup (Loads anomaly group details)
|   |
|   +--> If GroupAction == BISECT:
|   |      |
|   |      +--> /go/gerrit (Activity: Get commit hashes from positions)
|   |      |
|   |      +--> Executes Pinpoint.CulpritFinderWorkflow (Child Workflow)
|   |      |      (Pinpoint performs bisection)
|   |      V
|   |    Pinpoint calls back to /go/workflows.ProcessCulpritWorkflow
|   |      |
|   |      +--> /go/culprit (Activity: Persist culprit & Notify user)
|   |
|   +--> If GroupAction == REPORT:
|          |
|          +--> /go/culprit (Activity: Notify user of anomaly group)
| ```
| |
| ### Sub-Module Summaries (Illustrative, not exhaustive): |
| |
| - **/go/alertfilter**: Constants for alert filtering modes (e.g., `ALL`, |
| `OWNER`). Ensures consistent filter definitions. |
| - **/go/alerts**: Manages `Alert` configurations, their storage |
| (`sqlalertstore`), and efficient retrieval (`ConfigProvider` with caching). |
| Defines how performance regressions are detected. |
| - **/go/anomalies**: Retrieves anomaly data, often by proxying to Chrome Perf, |
| with a caching layer to improve performance. |
| - **/go/anomalygroup**: Groups related anomalies to consolidate actions like |
| bug filing or bisection. Uses a gRPC service and SQL store. |
| - **/go/backend**: A gRPC backend service for internal, non-UI-facing APIs, |
| promoting stable interfaces. |
| - **/go/builders**: Centralized factory for creating core components (data |
| stores, Git client) based on instance configuration, preventing cyclic |
| dependencies. |
| - **/go/bug**: Generates URLs for reporting bugs to issue trackers using |
| configurable URI templates. |
| - **/go/calc**: Evaluates formulas over trace data (not detailed in the
| provided docs).
| - **/go/chromeperf**: Client for interacting with the Chrome Performance |
| Dashboard API (reporting regressions, fetching anomalies). |
| - **/go/clustering2**: Implements k-means clustering for grouping similar |
| performance traces. |
| - **/go/config**: Defines and validates the `InstanceConfig` structure (loaded |
| from JSON) that governs a Perf instance. |
| - **/go/ctrace2**: Adapts trace data (normalization, handling missing points) |
| for use with k-means clustering. |
| - **/go/culprit**: Manages identified culprits (commits causing regressions), |
| their storage, and notification. |
| - **/go/dataframe**: Provides the `DataFrame` structure for handling |
| performance data in a tabular, commit-centric way, inspired by R's |
| dataframes. |
| - **/go/dfbuilder**: Constructs `DataFrame` objects from `TraceStore`, |
| handling query logic and data aggregation. |
| - **/go/dfiter**: Iterates over `DataFrame`s, typically by slicing a larger |
| fetched frame. Used in regression detection. |
| - **/go/dryrun**: Allows testing alert configurations without creating actual |
| alerts, simulating regression detection. |
| - **/go/favorites**: Manages user-saved favorite configurations/views, stored |
| in SQL. |
| - **/go/file**: Defines `File` and `Source` interfaces for abstracting file |
| access from different origins (local, GCS via Pub/Sub). |
| - **/go/filestore**: Implements `fs.FS` for local and GCS file access, |
| providing a unified way to read files. |
| - **/go/frontend**: Backend for the Perf web UI, handling HTTP requests, |
| rendering templates, and interacting with data stores. |
| - **/go/git**: Abstraction for Git repository interaction, caching commit data |
| in SQL, with providers for local CLI and Gitiles. |
| - **/go/graphsshortcut**: Manages shortcuts for collections of graph |
| configurations, using hashed IDs for de-duplication. |
| - **/go/ingest**: Orchestrates the data ingestion pipeline: reading files, |
| parsing formats, and writing to `TraceStore`. |
| - **/go/ingestevents**: Defines and handles serialization/deserialization of |
| ingestion completion events for PubSub. |
| - **/go/initdemo**: CLI tool to initialize a database (CockroachDB or Spanner |
| emulator) with the Perf schema. |
| - **/go/issuetracker**: Provides an interface and implementation for |
| interacting with Google Issue Tracker (Buganizer). |
| - **/go/kmeans**: Generic k-means clustering algorithm implementation using |
| interfaces for flexibility. |
| - **/go/maintenance**: Runs background tasks like Git repo sync, regression |
| migration, query cache refresh, and old data deletion. |
| - **/go/notify**: Framework for formatting and sending notifications (email, |
| issue tracker) about regressions. |
| - **/go/notifytypes**: Defines constants for different notification mechanisms |
| and data providers. |
| - **/go/perf-tool**: CLI for Perf administration (config management, data |
| inspection, database maintenance). |
| - **/go/perfclient**: Client for pushing performance data (typically from |
| trybots) to Perf's GCS ingestion endpoint. |
| - **/go/perfresults**: Fetches, parses, and processes performance results from |
| Telemetry benchmarks (Chromium). |
| - **/go/perfserver**: Main executable for Perf, consolidating frontend, |
| ingestion, clustering, and maintenance services. |
| - **/go/pinpoint**: Client for the Pinpoint (Chromeperf) bisection service. |
| - **/go/pivot**: Aggregates and summarizes trace data within a `DataFrame` |
| based on specified grouping criteria (like pivot tables). |
| - **/go/progress**: Tracks the progress of long-running backend tasks and |
| exposes it to the UI via HTTP polling. |
| - **/go/psrefresh**: Manages and caches `paramtools.ParamSet` instances (used |
| for UI query builders) to improve performance. |
| - **/go/redis**: Manages interaction with Redis for caching, primarily to |
| support the query UI. |
| - **/go/regression**: Core module for detecting, storing, and managing |
| performance regressions. |
| - **/go/samplestats**: Performs statistical analysis on sets of performance |
| data to identify significant changes between "before" and "after" states. |
| - **/go/sheriffconfig**: Manages Sheriff Configurations (alerting rules |
| defined in Protobuf, stored in LUCI Config), importing them into Perf. |
| - **/go/shortcut**: Manages shortcuts for lists of trace keys, using hashed |
| IDs. |
| - **/go/sql**: Central module for SQL database schema management (definition, |
| generation, validation, migration). |
| - **/go/stepfit**: Analyzes time-series data to detect significant changes |
| ("steps") using various statistical algorithms. |
| - **/go/subscription**: Manages alerting subscriptions, defining how to react |
| to anomalies (e.g., bug filing details). |
| - **/go/tracecache**: Caches trace identifiers for specific tiles and queries |
| to improve performance. |
| - **/go/tracefilter**: Filters trace data based on hierarchical paths, |
| identifying "leaf" traces. |
| - **/go/tracesetbuilder**: Efficiently constructs `TraceSet` and |
| `ReadOnlyParamSet` objects from multiple, potentially disparate chunks of |
| trace data using a worker pool. |
| - **/go/tracestore**: Defines interfaces and SQL implementations for storing |
| and retrieving performance trace data, using a tiled storage approach. |
| - **/go/tracing**: Initializes and configures distributed tracing capabilities |
| using OpenCensus. |
| - **/go/trybot**: Manages performance data from trybots (pre-submit tests), |
| including ingestion, storage, and analysis. |
| - **/go/ts**: Utility to generate TypeScript definition files from Go structs |
| for frontend type safety. |
| - **/go/types**: Defines core data types used throughout Perf (e.g., |
| `CommitNumber`, `TileNumber`, `Trace`). |
| - **/go/ui**: Handles frontend requests and prepares data for display, |
| bridging UI interactions with backend data sources. |
| - **/go/urlprovider**: Generates URLs for various pages within the Perf |
| application consistently. |
| - **/go/userissue**: Manages associations between specific data points (trace |
| key + commit position) and Buganizer issues. |
| - **/go/workflows**: Defines and implements Temporal workflows for automating |
| tasks like bisection triggering and culprit processing. |
| |
| This comprehensive suite of modules works together to provide the Skia Perf |
| performance monitoring system. |
| |
| # Module: /go/alertfilter |
| |
| This module, `go/alertfilter`, provides constants that define different |
| filtering modes for alerts. These constants are used throughout the Perf |
| application to control which alerts are displayed or processed. |
| |
| The primary motivation behind this module is to centralize the definition of |
| alert filtering options. By having these constants in a dedicated module, we |
| avoid scattering magic strings like "ALL" or "OWNER" throughout the codebase. |
| This improves maintainability, reduces the risk of typos, and makes it easier to |
| understand and modify the filtering logic. If new filtering modes are needed in |
| the future, they can be added here, providing a single source of truth. |
| |
| **Key Components/Files:** |
| |
| - **`alertfilter.go`**: This is the sole file in this module. It defines the |
| string constants used for alert filtering. |
| - **`ALL`**: This constant represents a filter that includes all alerts, |
| irrespective of their owner or other properties. It is used when a user |
| or a system process needs to view or operate on the entire set of active |
| alerts. |
| - **`OWNER`**: This constant represents a filter that includes only alerts |
| assigned to a specific owner. This is crucial for user-specific views |
| where individuals only want to see alerts relevant to their |
| responsibilities. |
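|
| Since the file only declares constants, a sketch of its likely shape (the
| string values are assumed to mirror the constant names):
|
| ```go
| package alertfilter
|
| const (
| 	// ALL selects every alert regardless of owner.
| 	ALL = "ALL"
| 	// OWNER restricts the view to alerts assigned to a specific owner.
| 	OWNER = "OWNER"
| )
| ```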
| |
| **Workflow/Usage Example:** |
| |
| Imagine a user interface for viewing alerts. The user might have a dropdown to |
| select how they want to filter the alerts. |
| |
| ```
| User Interface:
|   [Alert List]
|   Filter: [Dropdown: "ALL", "OWNER"]
|
| Backend Logic (illustrative pseudocode):
|   func GetAlerts(filterMode string, userID string) []Alert {
|   	switch filterMode {
|   	case alertfilter.ALL:
|   		// Fetch all alerts from the database.
|   		return database.GetAllAlerts()
|   	case alertfilter.OWNER:
|   		// Fetch alerts owned by the current user.
|   		return database.GetAlertsByOwner(userID)
|   	default:
|   		// Unknown filter mode; return nothing rather than guessing.
|   		return nil
|   	}
|   }
| ```
| |
| In this scenario, the backend uses the constants from the `alertfilter` module |
| to determine the correct query to execute against the database. This ensures |
| consistency and clarity in how filtering is applied. |
| |
| # Module: /go/alerts |
| |
| The `/go/alerts` module is responsible for managing alert configurations within |
| the Perf application. These configurations define the conditions under which |
| users or systems should be notified about performance regressions. The module |
| handles the definition, storage, retrieval, and caching of these alert |
| configurations. |
| |
| A core design principle is the separation of concerns between defining an |
| alert's structure (`config.go`), providing access to these configurations |
| (`configprovider.go`), and persisting them (`store.go` and its SQL |
| implementation in `sqlalertstore`). This modularity allows for flexibility in |
| how alerts are stored (e.g., potentially different database backends) and |
| accessed. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`config.go`**: This file defines the `Alert` struct, which is the central |
| data structure representing a single alert configuration. |
| |
| - **Why**: It encapsulates all the parameters necessary to define an |
| alert, such as the query to select relevant performance traces, the |
| notification destination (email or issue tracker), thresholds for |
| triggering, clustering algorithms, and the desired action (e.g., report, |
| bisect). |
| - **How**: The `Alert` struct includes fields for: |
| - `IDAsString`: A string representation of the alert's unique identifier. |
| This is used for JSON serialization to avoid potential issues with large |
| integer handling in JavaScript. The `BadAlertID` and |
| `BadAlertIDAsAsString` constants represent an invalid/uninitialized ID. |
| - `Query`: A URL-encoded string that defines the criteria for selecting |
| traces from the performance data. |
| - `GroupBy`: A comma-separated list of parameter keys. If specified, the |
| `Query` is expanded into multiple sub-queries, one for each unique |
| combination of values for the `GroupBy` keys found in the data. This |
| allows for more granular alerting. The `GroupCombinations` and |
| `QueriesFromParamset` methods handle this expansion. |
| - `Alert`: The email address for notifications. |
| - `IssueTrackerComponent`: The ID of the issue tracker component to file |
| bugs against. A custom `SerializesToString` type is used for this field |
| to handle JSON serialization of the int64 component ID as a string, with |
| `0` serializing to `""`. |
| - `DirectionAsString`: Specifies whether to alert on upward (`UP`), |
| downward (`DOWN`), or both (`BOTH`) changes in performance. This |
| replaces the deprecated `StepUpOnly` boolean. |
| - `StateAsString`: Indicates if the alert is `ACTIVE` or `DELETED`. This |
| is managed internally and affects whether an alert is processed. |
| - `Action`: Defines what action to take when an anomaly is detected (e.g., |
| `types.AlertActionReport`, `types.AlertActionBisect`). |
| - Other fields like `Interesting`, `Algo`, `Step`, `Radius`, `K`, |
| `Sparse`, `MinimumNum`, `Category` control the specifics of regression |
| detection and reporting. |
| - The file also defines enums like `Direction` and `ConfigState` and |
| helper functions for ID conversion and validation (`Validate`). The |
| `Validate` function ensures consistency, for example, that `GroupBy` |
| keys do not also appear in the main `Query`. |
| |
| - **`store.go`**: This file defines the `Store` interface, which abstracts the |
| persistence mechanism for `Alert` configurations. |
| |
| - **Why**: Decoupling the alert logic from the specific storage |
| implementation (e.g., SQL, Datastore) makes the system more adaptable |
| and testable. |
| - **How**: The `Store` interface specifies methods for: |
| - `Save`: Saving a new or updating an existing alert. It takes a |
| `SaveRequest` which includes the `Alert` configuration and an optional |
| `SubKey` (linking the alert to a subscription). |
| - `ReplaceAll`: Atomically replacing all existing alerts with a new set. |
| This is useful for bulk updates, often tied to configuration |
| subscriptions. It requires a `pgx.Tx` to ensure transactional integrity. |
| - `Delete`: Marking an alert as deleted. |
| - `List`: Retrieving alerts, with an option to include deleted ones. |
| Alerts are typically sorted by `DisplayName`. |
| - `ListForSubscription`: Retrieving all active alerts associated with a |
| specific subscription name. |
| |
| - **`configprovider.go`**: This file implements a `ConfigProvider` that serves |
| `Alert` configurations, incorporating a caching layer. |
| |
| - **Why**: To provide efficient and responsive access to alert |
| configurations, especially in a high-traffic system. Repeatedly fetching |
| from the underlying `Store` for every request would be inefficient. |
| - **How**: |
| - `configProviderImpl` implements the `ConfigProvider` interface. |
| - It maintains two internal caches (`cache_active` for active alerts and |
| `cache_all` for all alerts including deleted ones) using the |
| `configCache` struct. |
| - Upon initialization (`NewConfigProvider`), it performs an initial |
| refresh and starts a background goroutine that periodically calls |
| `Refresh` to update the caches from the `Store`. |
| - `GetAllAlertConfigs` and `GetAlertConfig` serve data from these caches. |
| - A `sync.RWMutex` is used to protect concurrent access to the caches. |
| - The `Refresh` method explicitly fetches data from the `alertStore` and |
| updates both caches. |
| - The refresh interval is configurable. (A minimal sketch of this
| refresh-and-cache pattern appears after this component list.)
| |
| - **Submodule `sqlalertstore`**: This submodule provides a SQL-based |
| implementation of the `alerts.Store` interface. |
| |
| - **`sqlalertstore.go`**: |
| - **Why**: To persist alert configurations in a relational database |
| (specifically CockroachDB, with Spanner compatibility). |
| - **How**: The `SQLAlertStore` struct holds a database connection pool |
| (`pool.Pool`) and a map of SQL statements. |
| - Alerts are stored as JSON strings in the `Alerts` table (schema |
| defined in `sqlalertstore/schema/schema.go`). This simplifies schema |
| evolution of the `Alert` struct itself, as changes to the struct |
| don't always require immediate SQL schema migrations, though it |
| makes querying based on specific alert fields harder directly in |
| SQL. |
| - `Save`: For new alerts (ID is `BadAlertIDAsAsString`), it performs |
| an `INSERT` and retrieves the generated ID. For existing alerts, it |
| performs an `UPSERT` (or an `INSERT ... ON CONFLICT DO UPDATE` for |
| Spanner). |
| - `Delete`: Marks an alert as deleted by setting its `config_state` to |
| `1` (representing `alerts.DELETED`) and updates `last_modified`. It |
| doesn't physically remove the row. |
| - `ReplaceAll`: Within a transaction, it first marks all existing |
| active alerts as deleted, then inserts the new set of alerts. |
| - `List` and `ListForSubscription`: Query the `Alerts` table, |
| deserialize the JSON `alert` column into `alerts.Alert` structs, and |
| sort them by `DisplayName`. |
| - **`spanner.go`**: Contains Spanner-specific SQL statements. This is |
| necessary because CockroachDB and Spanner have slightly different SQL |
| syntax for certain operations like UPSERTs and RETURNING clauses. The |
| correct set of statements is chosen in `sqlalertstore.New` based on the |
| `dbType`. |
| - **`sqlalertstore/schema/schema.go`**: Defines the Go struct |
| `AlertSchema` representing the `Alerts` table in the SQL database. Key |
| fields include `id`, `alert` (TEXT, storing the JSON serialized |
| `alerts.Alert`), `config_state` (INT), `last_modified` (INT, Unix |
| timestamp), `sub_name`, and `sub_revision`. |
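|
| As referenced above, a minimal sketch of the refresh-and-cache pattern used
| by `configProviderImpl` (names and signatures are simplified stand-ins, not
| the real API):
|
| ```go
| package example
|
| import (
| 	"context"
| 	"sync"
| 	"time"
| )
|
| type Alert struct{ IDAsString string }
|
| // lister is the slice of the Store interface this sketch needs.
| type lister interface {
| 	List(ctx context.Context, includeDeleted bool) ([]*Alert, error)
| }
|
| type cachedProvider struct {
| 	mu     sync.RWMutex
| 	store  lister
| 	active []*Alert
| }
|
| func newCachedProvider(ctx context.Context, store lister, every time.Duration) *cachedProvider {
| 	p := &cachedProvider{store: store}
| 	p.refresh(ctx) // Initial fill, mirroring NewConfigProvider.
| 	go func() {
| 		t := time.NewTicker(every)
| 		defer t.Stop()
| 		for range t.C {
| 			p.refresh(ctx) // Periodic background refresh.
| 		}
| 	}()
| 	return p
| }
|
| func (p *cachedProvider) refresh(ctx context.Context) {
| 	alerts, err := p.store.List(ctx, false)
| 	if err != nil {
| 		return // Keep serving the last good cache on error.
| 	}
| 	p.mu.Lock()
| 	p.active = alerts
| 	p.mu.Unlock()
| }
|
| func (p *cachedProvider) GetAllAlertConfigs(ctx context.Context) []*Alert {
| 	p.mu.RLock()
| 	defer p.mu.RUnlock()
| 	return p.active
| }
| ```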
| |
| **Key Workflows:** |
| |
| 1. **Creating/Updating an Alert:** |
| |
| - User/System constructs an `alerts.Alert` struct. |
| - `alerts.Store.Save()` is called. |
| - If SQL-backed: |
| - `sqlalertstore.Save()` serializes the `Alert` to JSON. |
| - If `IDAsString` is `BadAlertIDAsAsString`, an `INSERT` statement is |
| executed, and the new ID is populated back into the `Alert` struct. |
| - Otherwise, an `UPSERT` or `INSERT ... ON CONFLICT DO UPDATE` |
| statement is executed. |
| - The `ConfigProvider`'s cache will eventually be updated during its next |
| refresh cycle. |
| |
| ``` |
| [Client/Service] -- Alert Data --> [alerts.Store.Save()] |
| | |
| v |
| [sqlalertstore.Save()] -- Serializes Alert to JSON --> [Database] |
| | (If new, DB returns ID) |
| <--------------------------------------- |
| | (Updates Alert struct with ID) |
| v |
| [ConfigProvider.Refresh() periodically] --> [alerts.Store.List()] |
| | |
| v |
| [sqlalertstore.List()] --> [Database] |
| | (Reads & deserializes) |
| v |
| [ConfigProvider Cache Update] |
| ``` |
| |
| 2. **Retrieving All Active Alerts:** |
| |
| - A service requests alert configurations via |
| `alerts.ConfigProvider.GetAllAlertConfigs(ctx, false)`. |
| - `configProviderImpl.GetAllAlertConfigs()` checks its `cache_active`. |
| - If the cache is up-to-date (within refresh interval), it returns the |
| cached `[]*Alert`. |
| - If the cache needs refresh (or it's the first call), the background |
| refresher (or an explicit `Refresh` call) would have populated it by: |
| - Calling `alerts.Store.List(ctx, false)`. |
| - Which in turn calls `sqlalertstore.List(ctx, false)`. |
| - `sqlalertstore` queries the database for alerts where |
| `config_state = 0` (ACTIVE), deserializes them, and returns the |
| list. |
| |
| ``` |
| [Service] -- Request All Active Alerts --> [ConfigProvider.GetAllAlertConfigs(includeDeleted=false)] |
| | (Checks cache_active) |
| | |
| +-- [Cache Hit] ----> Returns cached []*Alert |
| | |
| +-- [Cache Miss/Stale (via periodic Refresh)] |
| | |
| v |
| [alerts.Store.List(includeDeleted=false)] |
| | |
| v |
| [sqlalertstore.List(includeDeleted=false)] -- SQL Query (WHERE config_state=0) --> [Database] |
| | (Reads & deserializes) |
| v |
| [Updates & Returns from Cache] |
| ``` |
| |
| 3. **Expanding `GroupBy` Queries:** |
| |
| - When an alert with a `GroupBy` clause is processed (e.g., by the |
| regression detection system), `Alert.QueriesFromParamset(paramset)` is |
| called. |
| - `Alert.GroupCombinations(paramset)` is invoked to find all unique |
| combinations of values for the keys specified in `GroupBy` from the |
| provided `paramtools.ReadOnlyParamSet`. |
| - For each combination, a new query string is generated by taking the |
| original `Alert.Query` and appending the key-value pairs from the |
| combination. |
| - This results in a list of specific queries to be executed against the |
| trace data. |
| |
| ``` |
| [Alert Processing System] -- Has Alert with GroupBy="config,arch", Query="metric=latency" & ParamSet --> [Alert.QueriesFromParamset()] |
| | |
| v |
| [Alert.GroupCombinations()] |
| | (e.g., finds {config:A, arch:X}, {config:B, arch:X}) |
| v |
| [Generates specific queries:] |
| - "metric=latency&config=A&arch=X" |
| - "metric=latency&config=B&arch=X" |
| | |
| <-- Returns []string (list of queries) |
| ``` |
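|
| A self-contained sketch of this expansion (the real logic lives in
| `Alert.QueriesFromParamset` and operates on `paramtools` types; `expand`
| and its inputs here are illustrative):
|
| ```go
| package main
|
| import (
| 	"fmt"
| 	"sort"
| )
|
| // expand appends each GroupBy combination to the base query, one query
| // per combination, as described in the workflow above.
| func expand(baseQuery string, combos []map[string]string) []string {
| 	queries := make([]string, 0, len(combos))
| 	for _, combo := range combos {
| 		// Sort keys so the generated queries are deterministic.
| 		keys := make([]string, 0, len(combo))
| 		for k := range combo {
| 			keys = append(keys, k)
| 		}
| 		sort.Strings(keys)
| 		q := baseQuery
| 		for _, k := range keys {
| 			q += fmt.Sprintf("&%s=%s", k, combo[k])
| 		}
| 		queries = append(queries, q)
| 	}
| 	return queries
| }
|
| func main() {
| 	// GroupBy="config,arch" over a paramset with config={A,B}, arch={X}.
| 	combos := []map[string]string{
| 		{"config": "A", "arch": "X"},
| 		{"config": "B", "arch": "X"},
| 	}
| 	for _, q := range expand("metric=latency", combos) {
| 		fmt.Println(q) // metric=latency&arch=X&config=A, then ...&config=B
| 	}
| }
| ```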
| |
| The use of `SerializesToString` for `IssueTrackerComponent` highlights a common |
| challenge when interfacing Go backend systems with JavaScript frontends: |
| JavaScript's limitations with handling large integer IDs. Serializing them as |
| strings is a robust workaround. |
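|
| A hedged sketch of that pattern (the real `SerializesToString` lives in the
| alerts package; this standalone version only illustrates the documented
| behavior of serializing an int64 as a string, with 0 becoming ""):
|
| ```go
| package example
|
| import (
| 	"encoding/json"
| 	"strconv"
| )
|
| type SerializesToString int64
|
| func (s SerializesToString) MarshalJSON() ([]byte, error) {
| 	if s == 0 {
| 		return json.Marshal("") // 0 serializes to the empty string.
| 	}
| 	return json.Marshal(strconv.FormatInt(int64(s), 10))
| }
|
| func (s *SerializesToString) UnmarshalJSON(b []byte) error {
| 	var str string
| 	if err := json.Unmarshal(b, &str); err != nil {
| 		return err
| 	}
| 	if str == "" {
| 		*s = 0
| 		return nil
| 	}
| 	v, err := strconv.ParseInt(str, 10, 64)
| 	if err != nil {
| 		return err
| 	}
| 	*s = SerializesToString(v)
| 	return nil
| }
| ```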
| |
| The existence of a `mock` subdirectory with generated mocks for `Store` and |
| `ConfigProvider` (using `stretchr/testify/mock`) is standard Go practice, |
| facilitating unit testing of components that depend on these interfaces without |
| needing a real database or complex setup. |
| |
| # Module: /go/anomalies |
| |
| The `/go/anomalies` module is responsible for retrieving anomaly data. Anomalies |
| represent significant deviations in performance metrics. This module acts as an |
| intermediary between the application and the `chromeperf` service, which is the |
| source of truth for anomaly data. It provides an abstraction layer, potentially |
| including caching, to optimize anomaly retrieval. |
| |
| ### Key Components and Responsibilities |
| |
| **1. `anomalies.go`:** |
| |
| - **Purpose:** Defines the `Store` interface. This interface dictates the |
| contract for any component that aims to provide anomaly data. It ensures |
| that different implementations (e.g., a cached store or a direct passthrough |
| store) can be used interchangeably. |
| - **Why:** Separating the interface from the implementation promotes loose |
| coupling and testability. It allows for different strategies for fetching |
| anomalies without changing the consuming code. |
| - **Key Methods:** |
| - `GetAnomalies`: Retrieves anomalies for a list of trace names within a |
| specific commit position range. This is useful for analyzing performance |
| regressions or improvements tied to code changes. |
| - `GetAnomaliesInTimeRange`: Fetches anomalies within a given time window. |
| This is helpful for time-based analysis, independent of specific commit |
| versions. |
| - `GetAnomaliesAroundRevision`: Finds anomalies that occurred near a |
| particular revision (commit). This helps pinpoint performance changes |
| related to a specific code submission. |
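|
| An approximate sketch of this contract (signatures are illustrative; the
| real methods use `chromeperf` types and are defined in `anomalies.go`):
|
| ```go
| package example
|
| import (
| 	"context"
| 	"time"
| )
|
| // Anomaly and AnomalyMap are placeholders for the chromeperf types.
| type Anomaly struct{}
| type AnomalyMap map[string]map[int64]Anomaly
|
| type Store interface {
| 	GetAnomalies(ctx context.Context, traceNames []string,
| 		startCommitPosition, endCommitPosition int64) (AnomalyMap, error)
| 	GetAnomaliesInTimeRange(ctx context.Context,
| 		startTime, endTime time.Time) (AnomalyMap, error)
| 	GetAnomaliesAroundRevision(ctx context.Context,
| 		revision int64) ([]Anomaly, error)
| }
| ```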
| |
| **2. `impl.go`:** |
| |
| - **Purpose:** Provides a basic, non-caching implementation of the `Store` |
| interface. It directly forwards requests to the |
| `chromeperf.AnomalyApiClient`. |
| - **Why:** This serves as a foundational implementation. It's simple and |
| directly reflects the capabilities of the underlying `chromeperf` service. |
| It can be used when caching is not desired or not yet implemented. |
| - **How:** Each method in the `store` struct (the implementation of `Store`) |
| makes a corresponding call to the `ChromePerf` client. For example, |
| `GetAnomalies` calls `ChromePerf.GetAnomalies`. Error handling is included |
| to log failures from the `chromeperf` service. Trace names are sorted before
| being passed to `chromeperf`, which might be a requirement of, or an
| optimization for, the `chromeperf` API.
| |
| **3. `/go/anomalies/cache/cache.go`:** |
| |
| - **Purpose:** Implements a caching layer for the `Store` interface. This is |
| designed to improve performance by reducing the number of direct calls to |
| the `chromeperf` service, which can be network-intensive. |
| - **Why:** Repeatedly fetching the same anomaly data can be inefficient. A |
| cache stores frequently accessed or recent anomalies locally, leading to |
| faster response times and reduced load on the `chromeperf` service. |
| - **How:** |
| - **LRU Cache:** Uses two Least Recently Used (LRU) caches: `testsCache` |
| for anomalies queried by trace names and commit ranges, and |
| `revisionCache` for anomalies queried around a specific revision. LRU |
| ensures that the least accessed items are evicted when the cache reaches |
| its `cacheSize` limit. |
| - **Cache Invalidation:** |
| - **TTL (Time-To-Live):** Cache entries have a `cacheItemTTL`. A periodic |
| `cleanupCache` goroutine removes entries older than this TTL. This |
| ensures that stale data doesn't persist indefinitely. |
| - **`invalidationMap`:** This map tracks trace names for which anomalies |
| have been modified (e.g., an alert was updated). If a trace name is in |
| this map, any cached anomalies for that trace are considered invalid and |
| will be re-fetched from `chromeperf`. |
| - The `invalidationMap` itself is cleared periodically |
| (`invalidationCleanupPeriod`) to prevent it from growing too large. |
| This is a trade-off: it's simpler and has lower memory overhead but |
| can lead to inaccuracies if a trace is invalidated and then the map |
| is cleared before the next fetch for that trace. |
| - **Metrics:** Tracks the `numEntriesInCache` to monitor cache |
| utilization. |
| - **Key Methods (`store` struct in `cache.go`):** |
| - `GetAnomalies`: |
| * Attempts to retrieve anomalies from `testsCache`. |
| * Checks the `invalidationMap`. If a trace is marked invalid, it's treated |
| as a cache miss. |
| * For any cache misses or invalidated traces, it fetches the data from |
| `as.ChromePerf.GetAnomalies`. |
| * Populates the `testsCache` with newly fetched data.
|
| ```
| Client Request (traceNames, startCommit, endCommit)
|   |
|   v
| [Cache Store] -- GetAnomalies()
|   |
|   +---------------------------------+
|   | For each traceName:             |
|   |  1. Check testsCache            | ----> Cache Hit? -----> Add to Result
|   |     (Key: trace:start:end)      |          |
|   |  2. Check invalidationMap       |          No (Cache Miss or Invalidated)
|   +---------------------------------+          |
|   | (traceNamesMissingFromCache)               |
|   v                                            |
| [ChromePerf Client] -- GetAnomalies() <--------+
|   |
|   v
| [Cache Store] -- Add new data to testsCache
|   |
|   v
| Return Combined Result
| ```
| - `GetAnomaliesInTimeRange`: This method currently bypasses the cache and |
| directly calls `as.ChromePerf.GetAnomaliesTimeBased`. The decision to |
| not cache time-based queries might be due to the potentially large and |
| less frequently reused nature of such requests, or it might be a feature |
| planned for later. |
| - `GetAnomaliesAroundRevision`: Similar to `GetAnomalies`, it first checks |
| `revisionCache`. If it's a miss, it fetches from |
| `as.ChromePerf.GetAnomaliesAroundRevision` and updates the cache. |
| - `InvalidateTestsCacheForTraceName`: Adds a `traceName` to the |
| `invalidationMap`. This is likely called when an external event (e.g., |
| user updating an anomaly in Chrome Perf) indicates that the cached data |
| for this trace is no longer accurate. |
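|
| A condensed sketch of this caching behavior, assuming a generic LRU
| library (`hashicorp/golang-lru` here is an assumption; the real
| implementation's types and payloads differ):
|
| ```go
| package example
|
| import (
| 	"fmt"
| 	"sync"
| 	"time"
|
| 	lru "github.com/hashicorp/golang-lru"
| )
|
| // entry pairs cached data with its insertion time so a cleanup goroutine
| // (not shown) can enforce cacheItemTTL.
| type entry struct {
| 	added     time.Time
| 	anomalies []string // Placeholder payload.
| }
|
| type anomalyCache struct {
| 	mu           sync.Mutex
| 	tests        *lru.Cache
| 	ttl          time.Duration
| 	invalidation map[string]bool // traceName -> invalidated
| }
|
| func newAnomalyCache(size int, ttl time.Duration) (*anomalyCache, error) {
| 	c, err := lru.New(size)
| 	if err != nil {
| 		return nil, err
| 	}
| 	return &anomalyCache{tests: c, ttl: ttl, invalidation: map[string]bool{}}, nil
| }
|
| func key(traceName string, start, end int64) string {
| 	return fmt.Sprintf("%s:%d:%d", traceName, start, end)
| }
|
| func (c *anomalyCache) add(traceName string, start, end int64, anomalies []string) {
| 	c.mu.Lock()
| 	defer c.mu.Unlock()
| 	c.tests.Add(key(traceName, start, end), entry{added: time.Now(), anomalies: anomalies})
| }
|
| // get treats invalidated traces and expired entries as misses, matching
| // the GetAnomalies behavior described above.
| func (c *anomalyCache) get(traceName string, start, end int64) ([]string, bool) {
| 	c.mu.Lock()
| 	defer c.mu.Unlock()
| 	if c.invalidation[traceName] {
| 		return nil, false
| 	}
| 	v, ok := c.tests.Get(key(traceName, start, end))
| 	if !ok {
| 		return nil, false
| 	}
| 	e := v.(entry)
| 	if time.Since(e.added) > c.ttl {
| 		return nil, false
| 	}
| 	return e.anomalies, true
| }
|
| // invalidate marks a trace stale; per the trade-off described above, the
| // flag is only removed when the whole map is periodically cleared.
| func (c *anomalyCache) invalidate(traceName string) {
| 	c.mu.Lock()
| 	defer c.mu.Unlock()
| 	c.invalidation[traceName] = true
| }
| ```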
| |
| **4. `/go/anomalies/mock/Store.go`:** |
| |
| - **Purpose:** Provides a mock implementation of the `Store` interface, |
| generated using the `testify/mock` library. |
| - **Why:** Essential for unit testing. It allows other components that depend |
| on the `anomalies.Store` to be tested in isolation, without needing a real |
| `chromeperf` instance or a fully functional cache. Developers can define |
| expected calls and return values for the mock store. |
| - **How:** It's an auto-generated file. The `mock.Mock` struct from |
| `stretchr/testify` is embedded, providing methods like `On()`, `Return()`, |
| and `AssertExpectations()` to control and verify the mock's behavior during |
| tests. |
| |
| ### Design Decisions and Rationale |
| |
| - **Interface-based Design (`anomalies.Store`):** This is a common and robust |
| pattern in Go. It allows for flexibility in how anomalies are fetched and |
| managed. For example, a new caching strategy or a different backend data |
| source could be implemented without affecting code that consumes anomalies, |
| as long as the new implementation adheres to the `Store` interface. |
| - **Caching Strategy (`cache.go`):** |
| - **LRU:** A good general-purpose caching algorithm when memory is limited |
| and recent/frequently accessed items are more likely to be requested |
| again. |
| - **TTL for Cache Items:** Prevents indefinitely storing stale data. |
| - **`invalidationMap`:** A pragmatic approach to handling external data |
| modifications. While not perfectly accurate (invalidates all anomalies |
| for a trace even if only one changed, and susceptible to the |
| `invalidationCleanupPeriod` timing), it's simpler and less |
| memory-intensive than more granular invalidation schemes. This suggests |
| a balance was struck between accuracy, complexity, and resource usage. |
| - **Separate Caches (`testsCache`, `revisionCache`):** Likely done because |
| the query patterns and cache keys for these two types of requests are |
| different. `testsCache` uses a composite key |
| (`traceName:startCommit:endCommit`), while `revisionCache` uses the |
| `revision` number as the key. |
| - **Error Handling:** The implementations generally log errors from |
| `chromeperf` but often return an empty `AnomalyMap` or `nil` slice to the |
| caller in case of an error from the underlying service. This design choice |
| means that callers might receive no data instead of an error, simplifying |
| the caller's error handling logic but potentially obscuring issues if not |
| monitored through logs. |
| - **Sorting Trace Names:** Before calling `chromeperf.GetAnomalies` or |
| `chromeperf.GetAnomaliesTimeBased`, the list of `traceNames` is sorted. This |
| could be a requirement of the `chromeperf` API for deterministic behavior, |
| or an optimization to improve `chromeperf`'s internal processing or caching. |
| - **Tracing (`go.opencensus.io/trace`):** Spans are added to some methods |
| (`GetAnomaliesInTimeRange`, `GetAnomaliesAroundRevision`). This is crucial |
| for observability, allowing developers to track the performance and flow of |
| requests through the system, especially in a distributed environment. |
| |
| ### Workflows |
| |
| **Typical Anomaly Retrieval (with Cache):** |
| |
| 1. A service needs anomalies (e.g., for displaying on a dashboard). |
| 2. It calls one of the `GetAnomalies*` methods on an `anomalies.Store` instance |
| (which is likely the cached `store` from `cache.go`). |
| 3. **Cache Check:** |
| - The cached `store` first checks its internal LRU cache(s) (`testsCache` |
| or `revisionCache`) for the requested data. |
| - For `GetAnomalies`, it also consults the `invalidationMap` to see if any |
| relevant traces have been marked as stale. |
| 4. **Cache Hit:** If valid data is found in the cache, it's returned directly.
|
| ```
| Caller -> anomalies.Store.GetAnomalies(traces, range)
|   |
|   v
| Cache.GetAnomalies()
|   |
|   +--> Check testsCache (e.g., trace1:100:200) -> Found & Valid
|   |
|   +--> Check testsCache (e.g., trace2:100:200) -> Not Found or Invalid
|   |
|   Return cached data for trace1
| ```
|
| 5. **Cache Miss / Stale Data:** If data is not in the cache or is marked stale:
|
|    - The cached `store` makes a network request to the
|      `chromeperf.AnomalyApiClient`.
|    - The response from `chromeperf` is received.
|    - This new data is added to the LRU cache for future requests.
|    - The data is returned to the caller.
|
| ```
| Caller -> anomalies.Store.GetAnomalies(traces, range)
|   |
|   v
| Cache.GetAnomalies()
|   |
|   +--> Check testsCache (e.g., trace1:100:200) -> Found & Valid
|   |
|   +--> Check testsCache (e.g., trace2:100:200) -> Not Found or Invalid
|   |                                |
|   | (Data for trace1)              v
|   |              [ ChromePerf API ] -- GetAnomalies(trace2, range)
|   |                                |
|   |                                v
|   |                       Cache.Add(trace2_data)
|   |                                |
|   v                                v
| Combine trace1_data & trace2_data
|   |
|   v
| Return to Caller
| ```
| |
| **Cache Invalidation Workflow:** |
| |
| 1. An external event occurs (e.g., a user triages an anomaly in the Chrome Perf |
| UI, which modifies its state). |
| 2. A mechanism (not detailed within this module, but implied) detects this |
| change. |
| 3. This mechanism calls `cache.store.InvalidateTestsCacheForTraceName(ctx, |
| "affected_trace_name")`. |
| 4. The `affected_trace_name` is added to the `invalidationMap` in the |
| `cache.store`. |
| 5. **Next `GetAnomalies` call for `affected_trace_name`:** |
| - Even if `testsCache` contains an entry for this trace and range, the |
| presence of `affected_trace_name` in `invalidationMap` will cause a |
| cache miss. |
| - Data will be re-fetched from `chromeperf`. |
| - The `invalidationMap` entry for `affected_trace_name` typically remains |
| until the `invalidationMap` is periodically cleared. |
| |
| This module effectively decouples the rest of the Perf application from the |
| direct complexities of interacting with `chromeperf` for anomaly data, offering |
| performance benefits through caching and a consistent interface for data |
| retrieval. |
| |
| # Module: /go/anomalygroup |
| |
| The `anomalygroup` module is designed to group related anomalies (regressions in |
| performance metrics) together. This grouping allows for consolidated actions |
| like filing a single bug report for multiple related regressions or triggering a |
| single bisection job to find the common culprit for a set of anomalies. This |
| approach aims to reduce noise and improve the efficiency of triaging performance |
| regressions. |
| |
| The core idea is to identify anomalies that share common characteristics, such |
| as the subscription (alert configuration), benchmark, and commit range. When a |
| new anomaly is detected, the system attempts to find an existing group that |
| matches these criteria. If a suitable group is found, the new anomaly is added |
| to it. Otherwise, a new group is created. |
| |
| The module defines a gRPC service for managing anomaly groups, a storage |
| interface for persisting group data, and utilities for processing regressions |
| and interacting with the grouping logic. |
| |
| ### Key Components and Responsibilities |
| |
| #### `store.go`: Anomaly Group Storage Interface |
| |
| The `store.go` file defines the `Store` interface, which outlines the contract |
| for persisting and retrieving anomaly group data. This abstraction allows for |
| different storage backends (e.g., SQL databases) to be used. |
| |
| **Key Responsibilities:** |
| |
| - **Creating new anomaly groups:** When a new anomaly doesn't fit into an |
| existing group, a new group record needs to be created. This involves |
| storing metadata about the group, such as the subscription details, |
| benchmark, initial commit range, and the action to be taken (e.g., REPORT, |
| BISECT). |
| - **Loading anomaly groups:** Retrieving group information by its unique ID is |
| essential for processing and taking actions on the group. |
| - **Finding existing groups:** This is a crucial part of the grouping logic. |
| When a new anomaly is detected, the store is queried to find existing groups |
| that match criteria like subscription, revision, domain (master), benchmark, |
| commit range, and action type. |
| - **Updating anomaly groups:** Groups are dynamic. As new anomalies are added, |
| or as actions are taken (e.g., bisection started, bug filed), the group |
| record needs to be updated. This includes: |
| - Adding new anomaly IDs to the group. |
| - Adding culprit commit IDs once a bisection identifies them. |
| - Storing the ID of a bisection job associated with the group. |
| - Storing the ID of a reported issue (bug) associated with the group. |
| |
| The `Store` interface ensures that the core logic for anomaly grouping is |
| decoupled from the specific implementation of data persistence. |
| |
| #### `sqlanomalygroupstore/sqlanomalygroupstore.go`: SQL-backed Anomaly Group Store |
| |
| This file provides a concrete implementation of the `Store` interface using a |
| SQL database (specifically designed with CockroachDB and Spanner in mind). |
| |
| **Implementation Details:** |
| |
| - **Schema:** The SQL schema for anomaly groups is defined in |
| `sqlanomalygroupstore/schema/schema.go`. It includes fields for the group |
| ID, creation time, list of anomaly IDs, metadata (stored as JSONB), common |
| commit range, action type, and associated IDs for bisections, issues, and |
| culprits. |
| - **Database Operations:** |
| - `Create`: Inserts a new row into the `AnomalyGroups` table. It takes |
| parameters like subscription details, benchmark, commit range, and |
| action, and stores them. The group metadata (subscription name, |
| revision, domain, benchmark) is marshaled into a JSON string before |
| insertion. |
| - `LoadById`: Selects an anomaly group from the database based on its ID. |
| It retrieves core attributes of the group. |
| - `UpdateBisectID`, `UpdateReportedIssueID`, `AddAnomalyID`, |
| `AddCulpritIDs`: These methods execute SQL UPDATE statements to modify |
| specific fields of an existing anomaly group record. They handle array |
| appends for lists like `anomaly_ids` and `culprit_ids`, with specific |
| syntax considerations for different SQL databases (e.g., Spanner's |
| `COALESCE` for array concatenation). |
| - `FindExistingGroup`: Constructs a SQL SELECT query with WHERE clauses to |
| match the provided criteria (subscription, revision, domain, benchmark, |
| commit range overlap, and action). This allows finding groups that a new |
| anomaly might belong to. |
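|
| A hedged sketch of the kind of range-overlap query `FindExistingGroup`
| issues (table and JSONB field names follow the schema description above;
| the commit-range column names and the exact SQL are assumptions):
|
| ```go
| package example
|
| // Illustrative only; see sqlanomalygroupstore.go for the real statement.
| const findExistingGroup = `
| SELECT id, anomaly_ids, group_action
|   FROM AnomalyGroups
|  WHERE group_meta_data->>'subscription_name' = $1
|    AND group_meta_data->>'subscription_revision' = $2
|    AND group_meta_data->>'benchmark' = $3
|    AND group_action = $4
|    AND common_rev_start <= $5 -- the new anomaly's end commit
|    AND common_rev_end >= $6   -- the new anomaly's start commit (overlap)
| `
| ```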
| |
| **Design Choices:** |
| |
| - **UUIDs for IDs:** Using UUIDs for group IDs, anomaly IDs, and culprit IDs |
| ensures global uniqueness. |
| - **JSONB for Metadata:** Storing `group_meta_data` as JSONB provides |
| flexibility in the metadata stored without requiring schema changes for |
| minor additions. |
| - **Array Columns:** Storing `anomaly_ids` and `culprit_ids` as array types in |
| the database is a natural way to represent lists of associated entities. |
| - **Database Type Abstraction:** While targeting SQL, there are minor |
| conditional logic snippets (e.g., for array appending in Spanner vs. |
| CockroachDB) to handle database-specific syntax, indicated by `dbType` |
| checks. |
| |
| #### `service/service.go`: gRPC Service Implementation |
| |
| This file implements the `AnomalyGroupServiceServer` interface defined by the |
| protobuf definitions in `proto/v1/anomalygroup_service.proto`. It acts as the |
| entry point for external systems to interact with the anomaly grouping |
| functionality. |
| |
| **Responsibilities:** |
| |
| - **Exposing Store Operations via gRPC:** The service methods largely delegate |
| to the corresponding methods of the `anomalygroup.Store` interface. For |
| example, `CreateNewAnomalyGroup` calls `anomalygroupStore.Create`. |
| - **Handling gRPC Requests and Responses:** It translates incoming gRPC |
| requests into calls to the store and formats the store's output into gRPC |
| responses. |
| - **`FindTopAnomalies` Logic:** This method involves more than a simple store |
| passthrough. |
| 1. It loads the specified anomaly group. |
| 2. It retrieves all regressions (anomalies) associated with that group |
| using the `regression.Store`. |
| 3. It sorts these regressions based on the percentage change in their
| median values (from `median_before` to `median_after`); see the ranking
| sketch following this section's list.
| 4. It formats the top N regressions (or all if N is not specified or is too |
| large) into the `ag.Anomaly` protobuf message format, extracting |
| relevant paramset values (bot, benchmark, story, measurement, stat). |
| - **`FindIssuesFromCulprits` Logic:** |
| 1. Loads the specified anomaly group. |
| 2. Retrieves the culprit IDs associated with the group. |
| 3. Uses the `culprit.Store` to get the details of these culprits. |
| 4. For each culprit, it checks its `GroupIssueMap` to find any issue IDs |
| that are specifically associated with the given anomaly group ID. This |
| allows correlation between a group (potentially containing multiple |
| anomalies that led to a bisection) and the issues filed for the culprits |
| found by that bisection. |
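|
| As referenced in step 3 of `FindTopAnomalies` above, a small sketch of that
| ranking (the field names and zero-baseline handling are assumptions):
|
| ```go
| package example
|
| import (
| 	"math"
| 	"sort"
| )
|
| // Regression carries the medians used for ranking; names are illustrative.
| type Regression struct {
| 	MedianBefore, MedianAfter float64
| }
|
| // sortByPercentChange orders regressions by the magnitude of their
| // relative change, largest first.
| func sortByPercentChange(regs []Regression) {
| 	pct := func(r Regression) float64 {
| 		if r.MedianBefore == 0 {
| 			return math.Inf(1) // Assumed: a zero baseline ranks first.
| 		}
| 		return math.Abs(r.MedianAfter-r.MedianBefore) / math.Abs(r.MedianBefore)
| 	}
| 	sort.Slice(regs, func(i, j int) bool {
| 		return pct(regs[i]) > pct(regs[j])
| 	})
| }
| ```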
| |
| **Design Choices:** |
| |
| - **Dependency Injection:** The service takes instances of |
| `anomalygroup.Store`, `culprit.Store`, and `regression.Store` as |
| dependencies, promoting testability and decoupling. |
| - **Metric Collection:** It increments a counter (`newGroupCounter`) whenever |
| a new group is created, allowing for monitoring of the system's behavior. |
| |
| #### `proto/v1/anomalygroup_service.proto`: Protocol Buffer Definitions |
| |
| This file defines the gRPC service `AnomalyGroupService` and the message types |
| used for requests and responses. This is the contract for how clients interact |
| with the anomaly grouping system. |
| |
| **Key Messages:** |
| |
| - `AnomalyGroup`: Represents a group of anomalies, including its ID, the |
| action to take, lists of associated anomaly and culprit IDs, reported issue |
| ID, and metadata like subscription and benchmark names. |
| - `Anomaly`: Represents a single regression, including its start and end |
| commit positions, a `paramset` (key-value pairs describing the test), |
| improvement direction, and median values before and after the regression. |
| - `GroupActionType`: An enum defining the possible actions for a group |
| (NOACTION, REPORT, BISECT). |
| - Request/Response Messages: Specific messages for each RPC method (e.g., |
| `CreateNewAnomalyGroupRequest`, `FindExistingGroupsResponse`). |
| |
| **Purpose:** |
| |
| - Defines a clear, language-agnostic API for the service. |
| - Ensures type safety and structured data exchange. |
| |
| #### `notifier/anomalygroupnotifier.go`: Anomaly Group Notifier |
| |
| This component implements the `notify.Notifier` interface. It's invoked when a |
| new regression is detected by the alerting system. Its primary role is to |
| integrate the regression detection with the anomaly grouping logic. |
| |
| **Workflow when `RegressionFound` is called:** |
| |
| 1. Receive details of a newly detected regression (commit information, alert |
| configuration, cluster summary, trace data, regression ID). |
| 2. Extract the `paramset` from the trace data. |
| 3. Validate the `paramset` to ensure it contains required keys (e.g., master, |
| bot, benchmark, test, subtest_1). This is important because the grouping and |
| subsequent actions (like bisection) rely on these parameters. |
| 4. Determine the `testPath` from the `paramset`. This path is used in finding |
| or creating anomaly groups. |
| 5. Call `grouper.ProcessRegressionInGroup` (which eventually calls |
| `utils.ProcessRegression`) to handle the grouping logic for this new |
| regression. |
| |
| **Design Choices:** |
| |
| - **Interface Implementation:** Adheres to the `notify.Notifier` interface, |
| allowing it to be plugged into the existing notification pipeline of the |
| performance monitoring system. |
| - **Delegation to `AnomalyGrouper`:** It delegates the core grouping logic to |
| an `AnomalyGrouper` instance (typically `utils.AnomalyGrouperImpl`). This |
| keeps the notifier focused on the integration aspect. |
| - **Handling of Summary Traces:** It explicitly ignores regressions found on |
| summary-level traces (traces representing an aggregation of multiple |
| specific tests), as anomaly grouping is typically more meaningful for |
| specific test cases. |
| |
| #### `utils/anomalygrouputils.go`: Anomaly Grouping Utilities |
| |
| This file contains the core logic for processing a new regression and |
| integrating it into an anomaly group. |
| |
| **`ProcessRegression` Function - Key Steps:** |
| |
| 1. **Synchronization:** Uses a `sync.Mutex` (`groupingMutex`). This is a |
| critical point: it aims to prevent race conditions when multiple regressions |
| are processed concurrently, especially around creating new groups. _However, |
| the comment notes that with multiple containers, this mutex might not be |
| sufficient and needs review._ |
| 2. **Client Initialization:** Creates an `AnomalyGroupServiceClient` to |
| communicate with the gRPC service. |
| 3. **Find Existing Group:** Calls the `FindExistingGroups` gRPC method to see |
| if the new anomaly fits into any current groups based on subscription, |
| revision, action type, commit range overlap, and test path. |
| 4. **Group Creation or Update:**
|    - **If no existing group is found:**
|      - Calls `CreateNewAnomalyGroup` to create a new group.
|      - Calls `UpdateAnomalyGroup` to add the current `anomalyID` to this
|        newly created group.
|      - **Triggers a Temporal Workflow:** Initiates a
|        `MaybeTriggerBisection` workflow. This workflow is responsible for
|        deciding whether to start a bisection or file a bug based on the
|        group's action type and other conditions.
|
|      ```
|      Regression Detected --> FindExistingGroups
|        |
|        +-- No Group Found --> CreateNewAnomalyGroup
|                                 --> UpdateAnomalyGroup (add anomaly)
|                                 --> Start Temporal Workflow (MaybeTriggerBisection)
|      ```
|
|    - **If existing group(s) are found:**
|      - For each matching group:
|        - Calls `UpdateAnomalyGroup` to add the current `anomalyID` to
|          that group.
|        - Calls `FindIssuesToUpdate` to determine if any existing bug
|          reports (either the group's own `ReportedIssueId` or issues
|          linked via culprits) should be updated with information about
|          this new anomaly.
|        - If issues are found, it uses the `issuetracker` to add a comment
|          to each relevant issue.
|
|      ```
|      Regression Detected --> FindExistingGroups
|        |
|        +-- Group(s) Found --> For each group:
|                                 |
|                                 +-- UpdateAnomalyGroup (add anomaly)
|                                 +-- FindIssuesToUpdate --> If issues exist
|                                       --> Add Comment to Issue(s)
|      ```
| 5. **Return Group ID(s):** Returns a comma-separated string of group IDs the |
| anomaly was associated with. |
| |
| **`FindIssuesToUpdate` Function:** |
| |
| This helper determines which existing issue tracker IDs should be updated with |
| information about a new anomaly being added to a group. |
| |
| - If the `group_action` is `REPORT` and `reported_issue_id` is set on the |
| group, that issue ID is returned. |
| - If the `group_action` is `BISECT`, it calls the `FindIssuesFromCulprits` |
| gRPC method. This method looks up culprits associated with the group and |
| then checks if those culprits have specific issues filed for them in the |
| context of _this particular group_. This is important because a single |
| culprit (commit) might be associated with multiple anomaly groups, and each |
| might have its own context or bug report. |
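|
| A self-contained sketch of this decision (type and field names approximate
| the protobuf messages; `culpritIssueFinder` stands in for the
| `FindIssuesFromCulprits` RPC):
|
| ```go
| package example
|
| import "context"
|
| // Values mirror the GroupActionType enum.
| const (
| 	actionReport = "REPORT"
| 	actionBisect = "BISECT"
| )
|
| type anomalyGroup struct {
| 	ID              string
| 	GroupAction     string
| 	ReportedIssueID string
| }
|
| type culpritIssueFinder func(ctx context.Context, groupID string) []string
|
| // findIssuesToUpdate: REPORT groups update their own issue; BISECT groups
| // update issues filed for their culprits in the context of this group.
| func findIssuesToUpdate(ctx context.Context, g *anomalyGroup, fromCulprits culpritIssueFinder) []string {
| 	switch g.GroupAction {
| 	case actionReport:
| 		if g.ReportedIssueID != "" {
| 			return []string{g.ReportedIssueID}
| 		}
| 	case actionBisect:
| 		return fromCulprits(ctx, g.ID)
| 	}
| 	return nil
| }
| ```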
| |
| **Design Choices:** |
| |
| - **Centralized Grouping Logic:** This package encapsulates the |
| decision-making process of whether to create a new group or add to an |
| existing one. |
| - **Temporal Workflow Integration:** Offloads the decision and execution of |
| bisection or bug filing to a Temporal workflow. This makes the process |
| asynchronous and more resilient. |
| - **Issue Tracker Interaction:** Directly interacts with the issue tracker to |
| update existing bugs, keeping them relevant as new, related anomalies are |
| found. |
| |
| ### Mocking Strategy |
| |
| The module extensively uses mocks for testing: |
| |
| - **`mocks/Store.go`:** A mock implementation of the `anomalygroup.Store` |
| interface, generated by `testify/mock`. Used in `service/service_test.go`. |
| - **`proto/v1/mocks/AnomalyGroupServiceServer.go`:** A mock for the gRPC |
| server interface `AnomalyGroupServiceServer`, generated by `testify/mock` |
| (with manual adjustments noted in the file). Used by clients or other |
| services that might call this gRPC service. |
| - **`utils/mocks/AnomalyGrouper.go`:** A mock for the `AnomalyGrouper` |
| interface, used in `notifier/anomalygroupnotifier_test.go`. |
| |
| This approach allows for unit testing components in isolation by providing |
| controlled behavior for their dependencies. |
| |
| ### Overall Workflow Example (Simplified) |
| |
| 1. **Anomaly Detection:** Perf system detects a new regression (anomaly). |
| 2. **Notification:** `AnomalyGroupNotifier.RegressionFound` is called. |
| 3. **Preprocessing:** The notifier extracts `paramset`, validates it, and |
| derives `testPath`. |
| 4. **Grouping Logic (`utils.ProcessRegression`):** |
| - The system queries `AnomalyGroupService.FindExistingGroups` using the |
| anomaly's properties (subscription, commit range, test path, action |
| type). |
| - **Scenario A: No existing group:** |
| - `AnomalyGroupService.CreateNewAnomalyGroup` is called. |
| - The new anomaly ID is added to this group via |
| `AnomalyGroupService.UpdateAnomalyGroup`. |
| - A Temporal workflow (`MaybeTriggerBisection`) is started for this |
| new group. |
| - **Scenario B: Existing group(s) found:** |
| - The new anomaly ID is added to each matching group via |
| `AnomalyGroupService.UpdateAnomalyGroup`. |
| - `utils.FindIssuesToUpdate` is called for each group. |
| - If the group's action is `REPORT` and it has a `ReportedIssueId`, |
| that issue is updated. |
| - If the group's action is `BISECT`, |
| `AnomalyGroupService.FindIssuesFromCulprits` is called. If it |
| returns issue IDs associated with this group's culprits, those |
| issues are updated. |
| 5. **Temporal Workflow (`MaybeTriggerBisection` - not detailed here but |
| implied):** |
| - Based on the group's `GroupActionType`: |
| - If `BISECT`: It might check conditions (e.g., number of anomalies in |
| the group) and then trigger a bisection job (e.g., Pinpoint) using |
| `AnomalyGroupService.FindTopAnomalies` to pick the most significant |
| anomaly. The bisection ID is then saved to the group. |
| - If `REPORT`: It might check conditions and then file a bug using |
| `AnomalyGroupService.FindTopAnomalies` to gather details. The issue |
| ID is saved to the group. |
| |
| This system aims to automate and streamline the handling of performance |
| regressions by intelligently grouping them and initiating appropriate follow-up |
| actions. |
| |
| # Module: /go/backend |
| |
| The `/go/backend` module implements a gRPC-based backend service for Perf. This |
| service is designed to host API endpoints that are not directly user-facing, |
| promoting a separation of concerns and enabling better scalability and |
| maintainability. |
| |
| **Core Purpose and Design Philosophy:** |
| |
| The primary motivation for this backend service is to create a stable, internal |
| API layer. This decouples user-facing components (like the frontend) from the |
| direct implementation details of various backend tasks. For instance, if Perf |
| needs to trigger a Pinpoint job, the frontend doesn't interact with Pinpoint or |
| a workflow engine like Temporal directly. Instead, it makes a gRPC call to an |
| endpoint on this backend service. The backend service then handles the |
| interaction with the underlying system (e.g., Temporal). |
| |
| This design offers several advantages: |
| |
| - **Interface Stability:** If the underlying implementation for a task changes |
| (e.g., replacing Temporal with another workflow orchestrator), the gRPC |
| contract exposed by the backend service can remain the same. This minimizes |
| changes required in calling services. |
| - **Load Offloading:** Computationally intensive operations that might |
| otherwise burden the frontend can be delegated to this backend service. |
| Examples include dry-running regression detection. |
| - **Centralized Internal Logic:** It provides a dedicated place for internal, |
| non-UI-facing business logic. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`backend.go`**: This is the heart of the backend service. |
| |
| - **`Backend` struct:** Encapsulates the state and configuration of the |
| backend application, including gRPC server settings, ports, and loaded |
| configuration. |
| - **`BackendService` interface:** Defines a contract for any service that
| wishes to be hosted by this backend. Each such service must provide its
| gRPC service descriptor, registration logic, and an authorization
| policy. This interface-based approach allows new functionality to be
| added modularly; a rough Go sketch of the contract appears after this
| list.
| - The `GetAuthorizationPolicy()` method returns a |
| `shared.AuthorizationPolicy` which specifies whether unauthenticated |
| access is allowed and which user roles are authorized to call the |
| service or specific methods within it. |
| - `RegisterGrpc()` is responsible for registering the specific gRPC |
| service implementation with the main gRPC server. |
| - `GetServiceDescriptor()` provides metadata about the gRPC service. |
| - **`initialize()` function:** This is a crucial setup function. It: |
| - Initializes common application components (like Prometheus metrics). |
| - Loads and validates the application configuration (from a JSON file, |
| e.g., `demo.json`). |
| - Instantiates various data stores (for anomaly groups, culprits, |
| subscriptions, regressions) by using builder functions that typically |
| read connection details from the loaded configuration. This allows for |
| flexibility in choosing data store implementations (e.g., Spanner, |
| CockroachDB). |
| - Sets up a culprit notifier, which is responsible for sending |
| notifications about identified culprits. |
| - Initializes a Temporal client if `NotifyConfig.Notifications` is set
| to `AnomalyGrouper`, since that setting indicates anomaly grouping
| workflows managed by Temporal are in use.
| - Dynamically configures and registers all `BackendService` |
| implementations. This involves setting up authorization rules based on |
| the policy defined by each service and then registering their gRPC |
| handlers. |
| - Starts listening for gRPC connections on the configured port. |
| - **`configureServices()` and `registerServices()`:** These helper |
| functions iterate over the list of `BackendService` implementations to |
| set up authorization and register them with the main gRPC server. |
| - **`configureAuthorizationForService()`:** This function applies the |
| authorization policies defined by each individual service to the gRPC |
| server's authorization policy. It uses `grpcsp.ServerPolicy` to define |
| which roles can access the service or specific methods. |
| - **`New()` constructor:** Creates and initializes a new `Backend` |
| instance. It takes various store implementations and a notifier as |
| arguments, allowing for dependency injection, particularly useful for |
| testing. If these are `nil`, they are typically created within |
| `initialize()` based on the configuration. |
| - **`ServeGRPC()` and `Serve()`:** These methods start the gRPC server and |
| block until it's shut down. |
| - **`Cleanup()`:** Handles graceful shutdown of the gRPC server. |
| |
| - **`pinpoint.go`**: This file defines a wrapper for the actual Pinpoint |
| service implementation (which resides in `pinpoint/go/service`). |
| |
| - **`pinpointService` struct:** Implements the `BackendService` interface. |
| - **`NewPinpointService()`:** Creates a new instance, taking a Temporal |
| provider and a rate limiter as arguments. This indicates that Pinpoint |
| operations might be rate-limited and potentially involve Temporal |
| workflows. |
| - It defines an authorization policy requiring users to have at least |
| `roles.Editor` to access Pinpoint functionalities. This is a good |
| example of how specific services define their own access control rules. |
| |
| - **`shared/authorization.go`**: |
| |
| - **`AuthorizationPolicy` struct:** A simple struct used by |
| `BackendService` implementations to declare their authorization |
| requirements. This includes whether unauthenticated access is permitted, |
| a list of roles authorized for the entire service, and a map for |
| method-specific role authorizations. This promotes a consistent way for |
| services to define their security posture. |
| |
| - **`client/backendclientutil.go`**: This utility file provides helper |
| functions for creating gRPC clients to connect to the backend service itself |
| (or specific services hosted by it). |
| |
| - **`getGrpcConnection()`:** Abstracts the logic for establishing a gRPC
| connection. It handles both insecure (typically for local
| development/testing) and secure connections. For secure connections, it
| uses TLS (with `InsecureSkipVerify: true`, as it is intended for internal
| GKE cluster communication) and OAuth2 for authentication, obtaining
| tokens for the service account running the client process. A sketch of
| this dial logic appears after the workflow example below.
| - **`NewPinpointClient()`, `NewAnomalyGroupServiceClient()`, |
| `NewCulpritServiceClient()`:** These are factory functions that simplify |
| the creation of typed gRPC clients for the specific services hosted on |
| the backend. They first check if the backend service is |
| configured/enabled before attempting to create a connection. This |
| pattern makes it easy for other internal services to consume the APIs |
| provided by this backend. |
| |
| - **`backendserver/main.go`**: This is the entry point for the backend server |
| executable. |
| |
| - It uses the `urfave/cli` library to define a command-line interface. |
| - The `run` command initializes and starts the `Backend` service using the |
| `backend.New()` constructor and then calls `b.Serve()`. |
| - It primarily parses command-line flags (defined in |
| `config.BackendFlags`) and passes them to the `backend` package. It |
| doesn't instantiate stores or notifiers directly, relying on the |
| `backend.New` (and subsequently `initialize`) to create them based on |
| the loaded configuration if `nil` is passed. |
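| |
| As a rough illustration, the `BackendService` contract and the
| `AuthorizationPolicy` struct described above might look like the following
| sketch. The shapes are inferred from the descriptions; consult `backend.go`
| and `shared/authorization.go` for the actual definitions.
| |
| ```
| package backend
| |
| import (
|   "google.golang.org/grpc"
| |
|   "go.skia.org/infra/go/roles"
| )
| |
| // AuthorizationPolicy declares whether unauthenticated access is allowed,
| // which roles may call the whole service, and per-method role overrides.
| // Field names here are assumptions based on the description above.
| type AuthorizationPolicy struct {
|   AllowUnauthenticated  bool
|   AuthorizedRoles       []roles.Role
|   MethodAuthorizedRoles map[string][]roles.Role
| }
| |
| // BackendService is the contract each hosted service implements.
| type BackendService interface {
|   // GetAuthorizationPolicy returns the access rules for this service.
|   GetAuthorizationPolicy() AuthorizationPolicy
|   // RegisterGrpc registers the service's handlers on the main server.
|   RegisterGrpc(server *grpc.Server)
|   // GetServiceDescriptor provides metadata about the gRPC service.
|   GetServiceDescriptor() grpc.ServiceDesc
| }
| ```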
| |
| **Workflow Example: Handling a gRPC Request** |
| |
| 1. A client (e.g., the Perf frontend or another internal service) uses a |
| generated gRPC client stub (potentially created with helpers from |
| `client/backendclientutil.go`) to make a call to a specific method on a |
| service hosted by the backend (e.g., `Pinpoint.ScheduleJob`). |
| 2. The gRPC request arrives at the `Backend` server's listener (`b.lisGRPC`). |
| 3. The `grpc.Server` routes the request to the appropriate service |
| implementation (e.g., `pinpointService`). |
| 4. **Authentication/Authorization (via `grpcsp.ServerPolicy`):** Before the |
| service method is executed, the `UnaryInterceptor` configured in |
| `backend.go` (which uses `b.serverAuthPolicy`) intercepts the call. |
| |
| ```
| Incoming gRPC Request
|        |
|        v
| UnaryInterceptor (grpcsp)
|        |
|        v
| Check Auth Policy for Service/Method
| (defined by pinpointService.GetAuthorizationPolicy())
|        |
|        v
| Allow/Deny ----> Yes: Proceed to service method
|             \--> No:  Return error
| ```
| |
| 5. If authorized, the corresponding method on the `pinpointService` (which |
| delegates to the actual `pinpoint_service.PinpointServer` implementation) is |
| invoked. |
| 6. The service method performs its logic (e.g., interacting with Temporal to |
| schedule a Pinpoint job, querying data stores). |
| 7. A response is sent back to the client. |
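| |
| As referenced above, the dial logic in `client/backendclientutil.go` might
| look roughly like this sketch (names and flag handling are simplified
| assumptions, not the exact helper signature):
| |
| ```
| package client
| |
| import (
|   "context"
|   "crypto/tls"
| |
|   "golang.org/x/oauth2/google"
|   "google.golang.org/grpc"
|   "google.golang.org/grpc/credentials"
|   "google.golang.org/grpc/credentials/insecure"
|   "google.golang.org/grpc/credentials/oauth"
| )
| |
| // getGrpcConnection dials the backend: insecure for local development,
| // otherwise TLS (InsecureSkipVerify is tolerated because traffic stays
| // inside the GKE cluster) plus OAuth2 per-RPC tokens obtained for the
| // client's service account.
| func getGrpcConnection(ctx context.Context, host string, insecureConn bool) (*grpc.ClientConn, error) {
|   if insecureConn {
|     return grpc.Dial(host, grpc.WithTransportCredentials(insecure.NewCredentials()))
|   }
|   ts, err := google.DefaultTokenSource(ctx, "https://www.googleapis.com/auth/userinfo.email")
|   if err != nil {
|     return nil, err
|   }
|   return grpc.Dial(host,
|     grpc.WithTransportCredentials(credentials.NewTLS(&tls.Config{InsecureSkipVerify: true})),
|     grpc.WithPerRPCCredentials(oauth.TokenSource{TokenSource: ts}),
|   )
| }
| ```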
| |
| **Configuration and Initialization:** |
| |
| The system relies heavily on a configuration file (specified by |
| `flags.ConfigFilename`, often `demo.json` for local development as seen in |
| `backend_test.go` and `testdata/demo.json`). This file dictates: |
| |
| - Data store connection strings and types (`data_store_config`). |
| - Notification settings (`notify_config`). |
| - The backend service's own host URL (`backend_host_url`), which it might use |
| if it needs to call itself or if other components need to discover it. |
| - Temporal configuration (`temporal_config` - though not explicitly in |
| `demo.json`, it's checked in `backend.go`). |
| |
| The `initialize` function in `backend.go` is responsible for parsing this |
| configuration and setting up all necessary dependencies like database |
| connections, the Temporal client, and the culprit notifier. The use of builder |
| functions (e.g., `builders.NewAnomalyGroupStoreFromConfig`) allows the system to |
| be flexible with regard to the actual implementations of these components, as |
| long as they conform to the required interfaces. |
| |
| This backend module serves as a crucial intermediary, enhancing the robustness |
| and maintainability of the Perf system by providing a well-defined internal API |
| layer. |
| |
| # Module: /go/bug |
| |
| The `go/bug` module is designed to facilitate the creation of URLs for reporting |
| bugs or regressions identified within the Skia performance monitoring system. |
| Its primary purpose is to dynamically generate these URLs based on a predefined |
| template and specific details about the identified issue. This approach allows |
| for flexible integration with various bug tracking systems, as the URL structure |
| can be configured externally. |
| |
| **Core Functionality and Design:** |
| |
| The module centers around the concept of URI templates. Instead of hardcoding |
| URL formats for specific bug trackers, it uses a template string that contains |
| placeholders for relevant information. This makes the system adaptable to |
| changes in bug tracker URL schemes or the adoption of new trackers without |
| requiring code modifications. |
| |
| The key function, `Expand`, takes a URI template and populates it with details |
| about the regression. These details include: |
| |
| 1. **`clusterLink`**: A URL pointing to the specific performance data cluster |
| that exhibits the regression. This provides direct context for anyone |
| investigating the bug. |
| 2. **`c` (`provider.Commit`)**: Information about the specific commit suspected of
| causing the regression. This includes the commit's URL, allowing for easy |
| navigation to the code change. The use of the `provider.Commit` type from |
| `perf/go/git/provider` indicates an integration with a system that can |
| furnish commit details. |
| 3. **`message`**: A user-provided message describing the regression. This |
| allows the reporter to add specific observations or context. |
| |
| The `Expand` function utilizes the `gopkg.in/olivere/elastic.v5/uritemplates` |
| library to perform the actual substitution of placeholders in the template |
| string with the provided values. This library handles URL encoding of the |
| substituted values, ensuring the generated URL is valid. |
| |
| **Key Components/Files:** |
| |
| - **`bug.go`**: This file contains the core logic for expanding URI templates. |
| |
| - `Expand(uriTemplate string, clusterLink string, c provider.Commit, |
| message string) string`: This is the primary function responsible for |
| generating the bug reporting URL. It takes the template and the |
| contextual information as input and returns the fully formed URL. If the |
| template expansion fails (e.g., due to a malformed template), it logs an |
| error using `go.skia.org/infra/go/sklog` and returns an empty string or |
| a partially formed URL depending on the nature of the error. |
| - `ExampleExpand(uriTemplate string) string`: This function serves as a |
| utility or example for demonstrating how to use the `Expand` function. |
| It calls `Expand` with pre-defined example data for the cluster link, |
| commit, and message. This can be useful for testing the template |
| expansion logic or for providing a quick way to see how a given template |
| would be populated. |
| |
| - **`bug_test.go`**: This file contains unit tests for the functionality in |
| `bug.go`. |
| |
| - `TestExpand(t *testing.T)`: This test function verifies that the |
| `Expand` function correctly substitutes the provided values into the URI |
| template and produces the expected URL. It uses the |
| `github.com/stretchr/testify/assert` library for assertions, ensuring |
| that the generated URL matches the anticipated output, including proper |
| URL encoding. |
| |
| **Workflow:** |
| |
| A typical workflow involving this module would be: |
| |
| 1. **Configuration**: An external system (e.g., the Perf frontend) is |
| configured with a URI template for the desired bug tracking system. This |
| template will contain placeholders like `{cluster_url}`, `{commit_url}`, and |
| `{message}`. Example Template: |
| `https://bugtracker.example.com/new?summary=Regression%20Found&description=Regression%20details:%0ACluster:%20{cluster_url}%0ACommit:%20{commit_url}%0AMessage:%20{message}` |
| |
| 2. **Regression Identification**: A user or an automated system identifies a |
| performance regression. |
| |
| 3. **Information Gathering**: The system gathers the necessary information: |
| |
| - The URL to the Perf cluster graph showing the regression. |
| - Details of the commit suspected to have introduced the regression. |
| - An optional message from the user. |
| |
| 4. **URL Generation**: The `Expand` function in `go/bug` is called with the |
| configured URI template and the gathered information. |
| |
| ```
| template := "https://bugtracker.example.com/new?summary=Regression%20Found&description=Cluster:%20{cluster_url}%0ACommit:%20{commit_url}%0AMessage:%20{message}"
| clusterURL := "https://perf.skia.org/t/?some_params"
| commitData := provider.Commit{URL: "https://skia.googlesource.com/skia/+show/abcdef123"}
| userMessage := "Significant drop in frame rate on TestXYZ."
| |
| // Expand fills each placeholder with its URL-encoded value.
| bugReportURL := bug.Expand(template, clusterURL, commitData, userMessage)
| ```
| |
| 5. **Redirection/Display**: The generated `bugReportURL` is then presented to |
| the user, who can click it to navigate to the bug tracker with the |
| pre-filled information. |
| |
| This design decouples the bug reporting logic from the specifics of any single |
| bug tracking system, promoting flexibility and maintainability. The use of a |
| standard URI template expansion library ensures robustness in URL generation. |
| |
| # Module: /go/builders |
| |
| The `builders` module is responsible for constructing various core components of |
| the Perf system based on instance configuration. This centralized approach to |
| object creation prevents cyclical dependencies that could arise if configuration |
| objects were directly responsible for building the components they configure. |
| The module acts as a factory, taking an `InstanceConfig` and returning fully |
| initialized and operational objects like data stores, file sources, and caches. |
| |
| The primary design goal is to decouple the configuration of Perf components from |
| their instantiation. This allows for cleaner dependencies and makes it easier to |
| manage the lifecycle of different parts of the system. For example, a |
| `TraceStore` needs a database connection, but the `InstanceConfig` that defines |
| the database connection string shouldn't also be responsible for creating the |
| `TraceStore` itself. The `builders` module bridges this gap. |
| |
| Key components and their instantiation logic: |
| |
| - **`builders.go`**: This is the central file containing all the builder |
| functions. |
| - **Database Pool (`NewDBPoolFromConfig`)**: This function is crucial as |
| many other components rely on a database connection. It establishes a |
| connection pool to the configured database (e.g., CockroachDB, Spanner). |
| - **Why**: A connection pool is used to manage database connections |
| efficiently, reusing existing connections to reduce the overhead of |
| establishing new ones for each request. |
| - **How**: It parses the connection string from the `InstanceConfig`, |
| configures pool parameters like maximum and minimum connections, and |
| sets up a logging adapter (`pgxLogAdaptor`) to integrate database logs |
| with the application's logging system. |
| - **Singleton**: A key design choice here is the `singletonPool`. This
| ensures that only one database connection pool is created per
| application instance, preventing resource exhaustion and ensuring
| consistent database interaction. A mutex (`singletonPoolMutex`) protects
| the creation of this singleton; see the sketch after this list.
| - **Schema Check**: Optionally, it can verify that the connected database |
| schema matches the expected schema defined for the application. This is |
| important for ensuring data integrity and compatibility. |
| - **Timeout Wrapper**: The raw database pool is wrapped with a |
| `timeout.New` wrapper. This enforces that all database operations are |
| performed within a context that has a timeout, preventing indefinite |
| blocking.
| |
| ```
| InstanceConfig --> NewDBPoolFromConfig --> pgxpool.ParseConfig
|                        |
|                        +-> pgxpool.ConnectConfig --> rawPool
|                              |
|                              +-> timeout.New(rawPool) --> singletonPool (if schema check passes)
| ```
| |
| - **PerfGit (`NewPerfGitFromConfig`)**: Constructs a `perfgit.Git` object, |
| which provides an interface to Git repository data. |
| - **Why**: Perf needs to associate performance data with specific code |
| revisions. |
| - **How**: It first obtains a database pool using `getDBPool` (which in |
| turn uses `NewDBPoolFromConfig`) and then instantiates `perfgit.New` |
| with this pool and the instance configuration. |
| - **TraceStore (`NewTraceStoreFromConfig`)**: Creates a |
| `tracestore.TraceStore` for managing performance trace data. |
| - **Why**: This is the core component for storing and retrieving |
| time-series performance metrics. |
| - **How**: It gets a database pool and a `TraceParamStore` (for managing |
| trace parameter sets) and then instantiates the appropriate |
| `sqltracestore`. |
| - **MetadataStore (`NewMetadataStoreFromConfig`)**: Creates a |
| `tracestore.MetadataStore` for managing metadata associated with traces. |
| - **How**: Similar to `TraceStore`, it obtains a database pool and then |
| creates an `sqltracestore.NewSQLMetadataStore`. |
| - **AlertStore, RegressionStore, ShortcutStore, GraphsShortcutStore, |
| AnomalyGroupStore, CulpritStore, SubscriptionStore, FavoriteStore, |
| UserIssueStore**: These functions follow a similar pattern: they obtain |
| a database pool via `getDBPool` and then instantiate their respective |
| SQL-backed store implementations (e.g., `sqlalertstore`, |
| `sqlregression2store`). |
| - **Why**: These stores manage various aspects of Perf's functionality, |
| such as alerting configurations, regression tracking, saved shortcuts, |
| etc. Centralizing their creation based on the common database |
| configuration simplifies the system. |
| - **RegressionStore Variation**: `NewRegressionStoreFromConfig` has a |
| conditional logic based on `instanceConfig.UseRegression2` to |
| instantiate either `sqlregression2store` or `sqlregressionstore`. This |
| allows for migrating to a new regression store implementation controlled |
| by configuration. |
| - **GraphsShortcutStore Caching**: `NewGraphsShortcutStoreFromConfig` can |
| return a cached version |
| (`graphsshortcutstore.NewCacheGraphsShortcutStore`) if `localToProd` is |
| true, indicating a local development or testing environment where a |
| simpler in-memory cache might be preferred over a database-backed store. |
| - **Source (`NewSourceFromConfig`)**: Creates a `file.Source` which |
| defines where Perf ingests data from (e.g., Google Cloud Storage, local |
| directories). |
| - **Why**: Perf needs to be flexible in terms of where it reads input data |
| files. |
| - **How**: It uses a `switch` statement based on |
| `instanceConfig.IngestionConfig.SourceConfig.SourceType` to instantiate |
| either a `gcssource` or a `dirsource`. |
| - **IngestedFS (`NewIngestedFSFromConfig`)**: Creates a `fs.FS` (file |
| system interface) that provides access to already ingested files. |
| - **Why**: To provide a consistent way to access files regardless of their |
| underlying storage (GCS or local). |
| - **How**: Similar to `NewSourceFromConfig`, it switches on the source |
| type to return a GCS or local file system implementation. |
| - **Cache (`GetCacheFromConfig`)**: Returns a `cache.Cache` instance |
| (either Redis-backed or local in-memory). |
| - **Why**: Caching is used to improve the performance of frequently |
| accessed data or computationally intensive queries. |
| - **How**: It checks `instanceConfig.QueryConfig.CacheConfig.Type` to |
| determine whether to create a `redisCache` (connecting to a Google Cloud |
| Redis instance) or a `localCache`. |
| |
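| The singleton pattern described above can be sketched as follows. This is a
| simplified illustration; the real `NewDBPoolFromConfig` also configures pool
| parameters, attaches the `pgxLogAdaptor`, runs the optional schema check, and
| wraps the pool with `timeout.New`.
| |
| ```
| package builders
| |
| import (
|   "context"
|   "sync"
| |
|   "github.com/jackc/pgx/v4/pgxpool"
| )
| |
| var (
|   // Only one pool is ever created per process; the mutex guards
|   // concurrent callers during creation.
|   singletonPool      *pgxpool.Pool
|   singletonPoolMutex sync.Mutex
| )
| |
| // newDBPool returns the process-wide pool, creating it on first use.
| func newDBPool(ctx context.Context, connectionString string) (*pgxpool.Pool, error) {
|   singletonPoolMutex.Lock()
|   defer singletonPoolMutex.Unlock()
|   if singletonPool != nil {
|     return singletonPool, nil
|   }
|   cfg, err := pgxpool.ParseConfig(connectionString)
|   if err != nil {
|     return nil, err
|   }
|   pool, err := pgxpool.ConnectConfig(ctx, cfg)
|   if err != nil {
|     return nil, err
|   }
|   singletonPool = pool
|   return singletonPool, nil
| }
| ```
| |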
| The `getDBPool` helper function is used internally by many builder functions. It |
| acts as a dispatcher based on `instanceConfig.DataStoreConfig.DataStoreType`, |
| calling `NewDBPoolFromConfig` with appropriate schema checking flags. This |
| abstracts the direct call to `NewDBPoolFromConfig` and centralizes the logic for |
| selecting the database type. |
| |
| The test file (`builders_test.go`) ensures that these builder functions |
| correctly instantiate objects and handle different configurations, including |
| invalid ones. A notable aspect of the tests is the management of the |
| `singletonPool`. Since `NewDBPoolFromConfig` creates a singleton, tests that |
| require fresh database instances must explicitly clear this singleton |
| (`singletonPool = nil`) before calling the builder to avoid reusing a connection |
| from a previous test. This is handled in `newDBConfigForTest`. |
| |
| # Module: /go/chromeperf |
| |
| The `chromeperf` module facilitates interaction with the Chrome Perf backend, |
| which is the system of record for performance data for Chromium. This module |
| allows Perf to send and receive data from Chrome Perf. |
| |
| ## Key Responsibilities |
| |
| The primary responsibility of this module is to abstract the communication |
| details with the Chrome Perf API. It provides a typed Go interface to various |
| Chrome Perf endpoints, handling request formatting, authentication, and response |
| parsing. |
| |
| This interaction is crucial for: |
| |
| - **Reporting Regressions:** When Perf detects a performance regression, it |
| needs to inform Chrome Perf to create an alert and potentially file a bug. |
| - **Fetching Anomaly Data:** Perf needs to retrieve information about existing |
| anomalies and alert groups from Chrome Perf to display them in its UI or use |
| them in its analysis. This includes details about the commit range, affected |
| tests, and associated bug IDs. |
| - **Maintaining Test Path Consistency:** Chrome Perf and Perf may have |
| slightly different representations of test paths (e.g., due to character |
| restrictions). This module, in conjunction with the `sqlreversekeymapstore` |
| submodule, helps manage these differences. |
| |
| ## Key Components |
| |
| ### `chromeperfClient.go` |
| |
| This file defines the generic `ChromePerfClient` interface and its |
| implementation, `chromePerfClientImpl`. This is the core component responsible |
| for making HTTP GET and POST requests to the Chrome Perf API. |
| |
| **Why:** Abstracting the HTTP client allows for easier testing (by mocking the |
| client) and centralizes the logic for handling authentication (using OAuth2 |
| Google default token source) and constructing target URLs. |
| |
| **How:** |
| |
| - It uses `google.DefaultTokenSource` for authentication. |
| - `generateTargetUrl` constructs the correct API endpoint URL, differentiating |
| between the Skia-Bridge proxy |
| (`https://skia-bridge-dot-chromeperf.appspot.com`) and direct calls to the |
| legacy Chrome Perf endpoint (`https://chromeperf.appspot.com`). The |
| Skia-Bridge is generally preferred. |
| - `SendGetRequest` and `SendPostRequest` handle the actual HTTP communication, |
| JSON marshalling/unmarshalling, and basic error handling, including checking |
| for accepted HTTP status codes. |
| |
| Example workflow for a POST request: |
| |
| ``` |
| Caller -> chromePerfClient.SendPostRequest(ctx, "anomalies", "add", requestBody, &responseObj, []int{200}) |
| | |
| | (Serializes requestBody to JSON) |
| v |
| |--------------------------------------------------------------------------------------------------------| |
| | generateTargetUrl("https://skia-bridge-dot-chromeperf.appspot.com/anomalies/add") | |
| |--------------------------------------------------------------------------------------------------------| |
| | |
| v |
| httpClient.Post(targetUrl, "application/json", jsonBody) |
| | |
| v |
| (HTTP Request to Chrome Perf API) |
| | |
| v |
| (Receives HTTP Response) |
| | |
| v |
| (Checks if response status code is in acceptedStatusCodes) |
| | |
| v |
| (Deserializes response body into responseObj) |
| | |
| v |
| Caller (receives populated responseObj or error) |
| ``` |
| |
| ### `anomalyApi.go` |
| |
| This file builds upon `chromeperfClient.go` to provide a specialized client for |
| interacting with the `/anomalies` endpoint in Chrome Perf. It defines the |
| `AnomalyApiClient` interface and its implementation `anomalyApiClientImpl`. |
| |
| **Why:** This client encapsulates the logic specific to anomaly-related |
| operations, such as formatting requests for reporting regressions or fetching |
| anomaly details, and parsing the specific JSON structures returned by these |
| endpoints. It also handles the translation between Perf's trace identifiers and |
| Chrome Perf's `test_path` format. |
| |
| **How:** |
| |
| - **`ReportRegression`**: Constructs a `ReportRegressionRequest` and sends it |
| to the `anomalies/add` endpoint. This is how Perf informs Chrome Perf about |
| a new regression. |
| - **`GetAnomalyFromUrlSafeKey`**: Fetches details for a specific anomaly using |
| its key from the `anomalies/get` endpoint. |
| - **`GetAnomalies`**: Retrieves anomalies for a list of tests within a |
| specific commit range (`min_revision`, `max_revision`) by calling the |
| `anomalies/find` endpoint. |
| - It performs a crucial translation step: `traceNameToTestPath` converts |
| Perf's comma-separated key-value trace names (e.g., |
| `,benchmark=Blazor,bot=MacM1,...`) into Chrome Perf's slash-separated |
| `test_path` (e.g., `ChromiumPerf/MacM1/Blazor/...`). |
| - It also handles potential discrepancies in commit numbers if Chrome Perf |
| returns commit hashes. It uses `perfGit.CommitNumberFromGitHash` to |
| resolve these. |
| - **`GetAnomaliesTimeBased`**: Similar to `GetAnomalies`, but fetches |
| anomalies based on a time range (`start_time`, `end_time`) by calling the |
| `anomalies/find_time` endpoint. |
| - **`GetAnomaliesAroundRevision`**: Fetches anomalies that occurred around a |
| specific revision number. |
| - **`traceNameToTestPath`**: This function is key for interoperability. It |
| parses a Perf trace name (which is a string of key-value pairs) and |
| constructs the corresponding `test_path` string that Chrome Perf expects. It |
| also handles an experimental feature (`EnableSkiaBridgeAggregation`) which |
| can modify how test paths are generated, particularly for aggregated |
| statistics (e.g., ensuring `testName_avg` is used if the `stat` is `value`). |
| - The logic for `statToSuffixMap` and `hasSuffixInTestValue` addresses |
| historical inconsistencies where test names in Perf might or might not |
| include statistical suffixes (like `_avg`, `_max`). The goal is to |
| derive the correct Chrome Perf `test_path`. |
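| |
| A minimal sketch of this translation, ignoring the statistic-suffix handling
| and assuming a fixed set of path components:
| |
| ```
| package chromeperf
| |
| import (
|   "fmt"
|   "strings"
| )
| |
| // traceNameToTestPathSketch converts a Perf trace name such as
| // ",benchmark=Blazor,bot=MacM1,master=ChromiumPerf,test=timeToFirstPaint,"
| // into a Chrome Perf test path such as
| // "ChromiumPerf/MacM1/Blazor/timeToFirstPaint".
| func traceNameToTestPathSketch(traceName string) (string, error) {
|   params := map[string]string{}
|   for _, pair := range strings.Split(strings.Trim(traceName, ","), ",") {
|     kv := strings.SplitN(pair, "=", 2)
|     if len(kv) != 2 {
|       return "", fmt.Errorf("invalid key=value pair %q", pair)
|     }
|     params[kv[0]] = kv[1]
|   }
|   parts := []string{}
|   for _, key := range []string{"master", "bot", "benchmark", "test", "subtest_1", "subtest_2"} {
|     if v, ok := params[key]; ok {
|       parts = append(parts, v)
|     }
|   }
|   if len(parts) == 0 {
|     return "", fmt.Errorf("no test path components in %q", traceName)
|   }
|   return strings.Join(parts, "/"), nil
| }
| ```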
| |
| Workflow for fetching anomalies: |
| |
| ``` |
| Perf UI/Backend -> anomalyApiClient.GetAnomalies(ctx, ["trace_A,key=val", "trace_B,key=val"], 100, 200) |
| | |
| v |
| (For each traceName) |
| traceNameToTestPath("trace_A,key=val") -> "chromeperf/test/path/A" |
| | |
| v |
| chromeperfClient.SendPostRequest(ctx, "anomalies", "find", {Tests: ["path/A", "path/B"], MinRevision: "100", MaxRevision: "200"}, &anomaliesResponse, ...) |
| | |
| v |
| (Parses anomaliesResponse, potentially resolving commit hashes to commit numbers) |
| | |
| v |
| Perf UI/Backend (receives AnomalyMap) |
| ``` |
| |
| ### `alertGroupApi.go` |
| |
| This file provides a client for interacting with Chrome Perf's `/alert_group` |
| API, specifically to get details about alert groups. An alert group in Chrome |
| Perf typically corresponds to a set of related anomalies (regressions). |
| |
| **Why:** When Perf displays information about an alert (which might have |
| originated from Chrome Perf), it needs to fetch details about the associated |
| alert group, such as the specific anomalies included, the commit range, and |
| other metadata. |
| |
| **How:** |
| |
| - **`GetAlertGroupDetails`**: Takes an alert group key and calls the |
| `alert_group/details` endpoint on Chrome Perf. |
| - The `AlertGroupDetails` struct holds the response, including a map of |
| `Anomalies` (where the value is the Chrome Perf `test_path`) and start/end |
| commit numbers/hashes. |
| - **`GetQueryParams` and `GetQueryParamsPerTrace`**: These methods are |
| utilities to transform the `AlertGroupDetails` into query parameters that |
| can be used to construct URLs for Perf's own explorer page. This allows |
| users to easily navigate from a Chrome Perf alert to viewing the |
| corresponding data in Perf. |
| - `GetQueryParams` aggregates all test path components (masters, bots, |
| benchmarks, etc.) from all anomalies in the group into a single set of |
| parameters. |
| - `GetQueryParamsPerTrace` generates a separate set of query parameters |
| for _each_ individual anomaly in the alert group. |
| - They parse the slash-separated `test_path` from Chrome Perf back into |
| individual components. |
| |
| Workflow for getting alert group details: |
| |
| ``` |
| Perf Backend (e.g., when processing an incoming alert from Chrome Perf) |
| | |
| v |
| alertGroupApiClient.GetAlertGroupDetails(ctx, "chrome_perf_group_key") |
| | |
| v |
| chromeperfClient.SendGetRequest(ctx, "alert_group", "details", {key: "chrome_perf_group_key"}, &alertGroupResponse) |
| | |
| v |
| (alertGroupResponse is populated) |
| | |
| v |
| alertGroupResponse.GetQueryParams(ctx) -> Perf Explorer URL query params |
| ``` |
| |
| ### `store.go` and the `sqlreversekeymapstore` submodule |
| |
| `store.go` defines the `ReverseKeyMapStore` interface. The |
| `sqlreversekeymapstore` directory and its `schema` subdirectory provide an |
| SQL-based implementation of this interface. |
| |
| **Why:** Test paths in Chrome Perf can contain characters that are considered |
| "invalid" or are handled differently by Perf's parameter parsing (e.g., Perf's |
| trace keys are comma-separated key-value pairs, and the values themselves should |
| ideally not interfere with this). When data is ingested into Perf from Chrome |
| Perf, or when Perf constructs test paths to query Chrome Perf, these "invalid" |
| characters in Chrome Perf test path components (like subtest names) might be |
| replaced (e.g., with underscores). |
| |
| This creates a problem: if Perf has `test/foo_bar` and Chrome Perf has |
| `test/foo?bar`, Perf needs a way to know that `foo_bar` corresponds to `foo?bar` |
| when querying Chrome Perf. The `ReverseKeyMapStore` is designed to store these |
| mappings. |
| |
| **How:** |
| |
| - `sqlreversekeymapstore/schema/schema.go` defines the SQL table schema |
| `ReverseKeyMapSchema` with columns: |
| - `ModifiedValue`: The value as it appears in Perf (e.g., `foo_bar`). |
| - `ParamKey`: The parameter key this value belongs to (e.g., `subtest_1`). |
| - `OriginalValue`: The original value as it was in Chrome Perf (e.g., |
| `foo?bar`). |
| - The primary key is a combination of `ModifiedValue` and `ParamKey`. |
| - `sqlreversekeymapstore/sqlreversekeymapstore.go` implements the |
| `ReverseKeyMapStore` interface using a SQL database (configurable for |
| CockroachDB or Spanner via different SQL statements). |
| - `Create`: Inserts a new mapping. If a mapping for the `ModifiedValue` |
| and `ParamKey` already exists (conflict), it does nothing. This is |
| important because the mapping should be stable. |
| - `Get`: Retrieves the `OriginalValue` given a `ModifiedValue` and |
| `ParamKey`. |
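| |
| The interface defined in `store.go` presumably looks close to this sketch
| (method shapes are inferred from the description above):
| |
| ```
| package chromeperf
| |
| import "context"
| |
| // ReverseKeyMapStore records how a Chrome Perf value was rewritten during
| // ingestion so that the original can be recovered when querying Chrome Perf.
| type ReverseKeyMapStore interface {
|   // Create stores the mapping modifiedValue -> originalValue for the given
|   // param key. On conflict the existing mapping is left unchanged.
|   Create(ctx context.Context, modifiedValue, paramKey, originalValue string) error
|   // Get returns the original Chrome Perf value for a modified value and
|   // param key, if a mapping exists.
|   Get(ctx context.Context, modifiedValue, paramKey string) (string, error)
| }
| ```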
| |
| This store is likely used during the process of converting between Perf trace |
| parameters and Chrome Perf test paths, especially when generating requests _to_ |
| Chrome Perf. If a parameter value in Perf might have been modified from its |
| Chrome Perf original, this store can be queried to get the original value needed |
| for the Chrome Perf API call. The exact point of integration for creating these |
| mappings (i.e., when are `Create` calls made) is not explicitly detailed within |
| this module but would typically happen when Perf first encounters/ingests a test |
| path from Chrome Perf that requires modification. |
| |
| For example, if `anomalyApi.go` needs to construct a `test_path` to query Chrome |
| Perf based on parameters from Perf: |
| |
| 1. Perf has params: `test=my_test, subtest_1=value_with_question_mark` |
| 2. When constructing the `test_path` segment for `subtest_1`:
|    - Call `reverseKeyMapStore.Get(ctx, "value_with_question_mark", "subtest_1")`.
|    - If it returns an original value like `"value?with?question?mark"`, use
|      that for the Chrome Perf API call.
|    - Otherwise, use `"value_with_question_mark"`.
| |
| The `store.go` file simply defines the interface, allowing for different backend |
| implementations of this mapping store if needed, though `sqlreversekeymapstore` |
| is the provided concrete implementation. |
| |
| # Module: /go/clustering2 |
| |
| ## Overview |
| |
| The `clustering2` module is responsible for grouping similar performance traces |
| together using k-means clustering. This helps in identifying patterns and |
| regressions in performance data by analyzing the collective behavior of traces |
| rather than individual ones. The core idea is to represent each trace as a point |
| in a multi-dimensional space and then find `k` clusters of these points. |
| |
| ## Design and Implementation |
| |
| ### Why K-Means? |
| |
| K-means is a well-understood and relatively efficient clustering algorithm |
| suitable for the scale of performance data encountered. It partitions data into |
| `k` distinct, non-overlapping clusters. Each data point belongs to the cluster |
| with the nearest mean (cluster centroid). This approach allows for the |
| summarization of large numbers of traces into a smaller set of representative |
| "shapes" or behaviors. |
| |
| ### Key Components and Files |
| |
| #### `clustering.go` |
| |
| This file contains the primary logic for performing k-means clustering on |
| performance traces. |
| |
| - **`ClusterSummary`**: This struct represents a single cluster found by the
| k-means algorithm (a rough Go sketch of it appears after this list).
| |
| - `Centroid`: The average shape of all traces in this cluster. This is the |
| core representation of the cluster's behavior. |
| - `Keys`: A list of identifiers for the traces belonging to this cluster. |
| These are sorted by their distance to the `Centroid`, allowing users to |
| quickly see the most representative traces. This is not serialized to |
| JSON to keep the payload manageable, as it can be very large. |
| - `Shortcut`: An identifier for a pre-computed set of `Keys`, used for |
| efficient retrieval and display in UIs. |
| - `ParamSummaries`: A breakdown of the parameter key-value pairs present |
| in the cluster and their prevalence (see `valuepercent.go`). This helps |
| in understanding what distinguishes this cluster (e.g., "all traces in |
| this cluster are for `arch=x86`"). |
| - `StepFit`: Contains information about how well the `Centroid` fits a |
| step function. This is crucial for identifying regressions or |
| improvements that manifest as sudden shifts in performance. |
| - `StepPoint`: The specific data point (commit/timestamp) where the step |
| (if any) in the `Centroid` is detected. |
| - `Num`: The total number of traces in this cluster. |
| - `Timestamp`: Records when the cluster analysis was performed. |
| - `NotificationID`: Stores the ID of any alert or notification sent |
| regarding a significant step change detected in this cluster. |
| |
| - **`ClusterSummaries`**: A container for all the `ClusterSummary` objects |
| produced by a single clustering run, along with metadata like the `K` value |
| used and the `StdDevThreshold`. |
| |
| - **`CalculateClusterSummaries` function**: This is the main entry point for |
| the clustering process. |
| |
| - **Trace Conversion**: It takes a `dataframe.DataFrame` (which holds |
| traces and their metadata) and converts each trace into a |
| `kmeans.Clusterable` object. The `ctrace2.NewFullTrace` function is used |
| here, which likely involves some form of normalization or feature |
| extraction to make traces comparable. The `stddevThreshold` parameter is |
| used during this conversion, potentially to filter out noisy or flat |
| traces. |
| - **Initial Centroid Selection (`chooseK`)**: K-means requires an initial |
| set of `k` centroids. This function randomly selects `k` traces from the |
| input data to serve as the initial centroids. Random selection is a |
| common and simple initialization strategy. |
| - **K-Means Iteration**: |
| - The `kmeans.Do` function performs one iteration of the k-means |
| algorithm: |
| 1. Assign each observation (trace) to the nearest centroid. |
| 2. Recalculate the centroids based on the mean of the observations |
| assigned to them. The `ctrace2.CalculateCentroid` function is likely |
| responsible for computing the mean of a set of traces. |
| - This process is repeated for a maximum of `MAX_KMEANS_ITERATIONS` or |
| until the change in `totalError` (sum of squared distances from each |
| point to its centroid) between iterations falls below `KMEAN_EPSILON`. |
| This convergence criterion prevents unnecessary computations once the |
| clusters stabilize. |
| - A `Progress` callback can be provided to monitor the clustering process, |
| reporting the `totalError` at each iteration. |
| - **Summary Generation (`getClusterSummaries`)**: After the k-means |
| algorithm converges, this function takes the final centroids and the |
| original observations to generate `ClusterSummary` objects for each |
| cluster. |
| - For each cluster, it identifies the member traces. |
| - It calculates `ParamSummaries` (see `valuepercent.go`) to describe the |
| common characteristics of traces in that cluster. |
| - It performs step detection (`stepfit.GetStepFitAtMid`) on the cluster's |
| centroid to identify significant performance shifts. The `interesting` |
| parameter likely defines a threshold for what constitutes a noteworthy |
| step change, and `stepDetection` specifies the algorithm or method used |
| for step detection. |
| - It sorts the traces within each cluster by their distance to the |
| centroid, ensuring `ClusterSummary.Keys` lists the most representative |
| traces first. A limited number of sample keys |
| (`config.MaxSampleTracesPerCluster`) are stored. |
| - Finally, the resulting `ClusterSummary` objects are sorted, likely by |
| the magnitude or significance of the detected step |
| (`StepFit.Regression`), to highlight the most impactful changes first. |
| |
| - **Constants**: |
| |
| - `K`: The default number of clusters to find. 50 is chosen as a balance |
| between granularity and computational cost. |
| - `MAX_KMEANS_ITERATIONS`: A safeguard against non-converging k-means |
| runs. |
| - `KMEAN_EPSILON`: A threshold to determine convergence, balancing |
| precision with computation time. |
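| |
| As referenced above, `ClusterSummary` can be pictured roughly as the
| following Go sketch; field types are inferred from the descriptions and may
| differ from the actual definition in `clustering.go` (`ValuePercent` is
| defined in `valuepercent.go` of the same package):
| |
| ```
| package clustering2
| |
| import (
|   "time"
| |
|   "go.skia.org/infra/perf/go/dataframe"
|   "go.skia.org/infra/perf/go/stepfit"
| )
| |
| // ClusterSummary summarizes one k-means cluster (sketch).
| type ClusterSummary struct {
|   Centroid       []float32               // average shape of the cluster's traces
|   Keys           []string                // member trace IDs, closest first; not serialized to JSON
|   Shortcut       string                  // ID of a pre-computed set of Keys
|   ParamSummaries []ValuePercent          // param=value prevalence, see valuepercent.go
|   StepFit        *stepfit.StepFit        // how well the centroid fits a step function
|   StepPoint      *dataframe.ColumnHeader // the data point where the step occurs
|   Num            int                     // number of traces in the cluster
|   Timestamp      time.Time               // when the cluster analysis ran
|   NotificationID string                  // ID of any alert sent for this cluster
| }
| ```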
| |
| #### `valuepercent.go` |
| |
| This file defines how to summarize and present the parameter distributions |
| within a cluster. |
| |
| - **`ValuePercent` struct**: Represents a specific parameter key-value pair |
| (e.g., "config=8888") and the percentage of traces in a cluster that have |
| this pair. This provides a quantitative measure of how characteristic a |
| parameter is for a given cluster. |
| |
| - **`SortValuePercentSlice` function**: This is crucial for making the |
| `ParamSummaries` in `ClusterSummary` human-readable and informative. The |
| goal is to: |
| |
| 1. Group parameter values by their key (e.g., all "config=..." values |
| together). |
| 2. Within each key group, sort by the percentage (highest first). |
| 3. Sort the key groups themselves by the highest percentage of their top |
| value. If percentages are equal, an alphabetical sort of the value is |
| used as a tie-breaker. |
| |
| This complex sorting logic ensures that the most dominant and distinguishing |
| parameters for a cluster are presented prominently. For example: |
| |
| ``` |
| config=8888 90% |
| config=565 10% |
| arch=x86 80% |
| arch=arm 20% |
| ``` |
| |
| Here, "config" is listed before "arch" because its top value ("config=8888") |
| has a higher percentage (90%) than the top value for "arch" ("arch=x86" at |
| 80%). |
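| |
| A sketch of that sorting logic, assuming `ValuePercent` holds a "key=value"
| string in a `Value` field and an integer `Percent` (the real
| `SortValuePercentSlice` may sort in place rather than returning a new slice):
| |
| ```
| package clustering2
| |
| import (
|   "sort"
|   "strings"
| )
| |
| // sortValuePercentSlice groups rows by param key, sorts each group by
| // percent (descending, value string as tie-breaker), then orders the
| // groups by their top row's percent.
| func sortValuePercentSlice(rows []ValuePercent) []ValuePercent {
|   byKey := map[string][]ValuePercent{}
|   for _, vp := range rows {
|     key := strings.SplitN(vp.Value, "=", 2)[0]
|     byKey[key] = append(byKey[key], vp)
|   }
|   groups := make([][]ValuePercent, 0, len(byKey))
|   for _, g := range byKey {
|     sort.Slice(g, func(i, j int) bool {
|       if g[i].Percent != g[j].Percent {
|         return g[i].Percent > g[j].Percent
|       }
|       return g[i].Value < g[j].Value
|     })
|     groups = append(groups, g)
|   }
|   sort.Slice(groups, func(i, j int) bool {
|     if groups[i][0].Percent != groups[j][0].Percent {
|       return groups[i][0].Percent > groups[j][0].Percent
|     }
|     return groups[i][0].Value < groups[j][0].Value
|   })
|   out := make([]ValuePercent, 0, len(rows))
|   for _, g := range groups {
|     out = append(out, g...)
|   }
|   return out
| }
| ```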
| |
| ### Workflow: Calculating Cluster Summaries |
| |
| ``` |
| Input: DataFrame (traces, headers), K, StdDevThreshold, ProgressCallback, InterestingThreshold, StepDetectionMethod |
| |
| 1. [clustering.go: CalculateClusterSummaries] |
| a. Initialize empty list of observations. |
| b. For each trace in DataFrame.TraceSet: |
| i. Create ClusterableTrace (ctrace2.NewFullTrace) using trace data and StdDevThreshold. |
| ii. Add to observations list. |
| c. If no observations, return error. |
| d. [clustering.go: chooseK] |
| i. Randomly select K observations to be initial centroids. |
| e. Initialize lastTotalError = 0.0 |
| f. Loop MAX_KMEANS_ITERATIONS times OR until convergence: |
| i. [kmeans.Do] -> new_centroids |
| 1. Assign each observation to its closest centroid (from previous iteration or initial). |
| 2. Recalculate centroids (ctrace2.CalculateCentroid) based on assigned observations. |
| ii. [kmeans.TotalError] -> currentTotalError |
| iii. If ProgressCallback provided, call it with currentTotalError. |
| iv. If |currentTotalError - lastTotalError| < KMEAN_EPSILON, break loop. |
| v. lastTotalError = currentTotalError |
| g. [clustering.go: getClusterSummaries] -> clusterSummaries |
| i. [kmeans.GetClusters] -> allClusters (list of observations per centroid) |
| ii. For each cluster in allClusters and its corresponding centroid: |
| 1. Create new ClusterSummary. |
| 2. [clustering.go: getParamSummaries] (using cluster members) -> ParamSummaries |
| a. [clustering.go: GetParamSummariesForKeys] |
| i. Count occurrences of each param=value in cluster keys. |
| ii. Convert counts to ValuePercent structs. |
| iii. [valuepercent.go: SortValuePercentSlice] -> sorted ParamSummaries. |
| 3. [stepfit.GetStepFitAtMid] (on centroid values, StdDevThreshold, InterestingThreshold, StepDetectionMethod) -> StepFit, StepPoint. |
| 4. Set ClusterSummary.Num = number of members in cluster. |
| 5. Sort cluster members by distance to centroid. |
| 6. Populate ClusterSummary.Keys with top N sorted member keys. |
| 7. Populate ClusterSummary.Centroid with centroid values. |
| iii. Sort all ClusterSummary objects (e.g., by StepFit.Regression). |
| h. Populate ClusterSummaries struct with results, K, and StdDevThreshold. |
| i. Return ClusterSummaries. |
| |
| Output: ClusterSummaries object or error. |
| ``` |
| |
| This process effectively transforms raw trace data into a structured summary |
| that highlights significant patterns and changes, facilitating performance |
| analysis and regression detection. |
| |
| # Module: /go/config |
| |
| The `/go/config` module defines the configuration structure for Perf instances |
| and provides utilities for loading, validating, and managing these |
| configurations. It plays a crucial role in customizing the behavior of a Perf |
| instance, from data ingestion and storage to alert notifications and UI |
| presentation. |
| |
| **Core Responsibilities and Design:** |
| |
| The primary responsibility of this module is to define and manage the |
| `InstanceConfig` struct. This struct is a comprehensive container for all |
| settings that govern a Perf instance. The design emphasizes: |
| |
| 1. **Centralized Configuration:** By consolidating all instance-specific |
| settings into a single `InstanceConfig` struct (`config.go`), the module |
| provides a single source of truth. This simplifies understanding the state |
| of an instance and reduces the chances of configuration drift. |
| 2. **Typed Configuration:** Using Go structs with explicit types ensures that |
| configuration values are of the expected format, catching many potential |
| errors at compile-time or during validation. This is preferable to using |
| untyped maps or generic configuration formats. |
| 3. **JSON Serialization/Deserialization:** Configuration files are expected to |
| be in JSON format. The module uses standard Go `encoding/json` for this, |
| making it easy to create, read, and modify configurations. |
| 4. **Schema Validation:** To ensure the integrity and correctness of |
| configuration files, the module employs JSON Schema validation |
| (`/go/config/validate/validate.go`, |
| `/go/config/validate/instanceConfigSchema.json`). |
| - A JSON schema (`instanceConfigSchema.json`) formally defines the |
| structure and types of the `InstanceConfig`. This schema is |
| automatically generated from the Go struct definition using the |
| `/go/config/generate/main.go` program, ensuring the schema stays in sync |
| with the code. |
| - The `validate.InstanceConfigFromFile` function uses this schema to |
| validate a configuration file before attempting to deserialize it. This |
| allows for early detection of malformed or incomplete configurations. |
| 5. **Command-Line Flag Integration:** The module defines structs like |
| `BackendFlags`, `FrontendFlags`, `IngestFlags`, and `MaintenanceFlags` |
| (`config.go`). These structs group related command-line flags and provide |
| methods (`AsCliFlags`) to convert them into `cli.Flag` slices, compatible |
| with the `github.com/urfave/cli/v2` library. This design keeps flag |
| definitions organized and associated with the components they configure. |
| 6. **Extensibility:** The `InstanceConfig` is designed to be extensible. New |
| configuration options can be added as new fields to the relevant |
| sub-structs. The JSON schema generation and validation mechanisms will |
| automatically adapt to these changes. |
| |
| **Key Components and Files:** |
| |
| - **`config.go`:** This is the heart of the module. |
| - It defines the main `InstanceConfig` struct, which aggregates various |
| sub-configuration structs like `AuthConfig`, `DataStoreConfig`, |
| `IngestionConfig`, `GitRepoConfig`, `NotifyConfig`, |
| `IssueTrackerConfig`, `AnomalyConfig`, `QueryConfig`, `TemporalConfig`, |
| and `DataPointConfig`. Each of these sub-structs groups settings related |
| to a specific aspect of the Perf system (e.g., authentication, data |
| storage, data ingestion). |
| - It defines various enumerated types (e.g., `DataStoreType`, |
| `SourceType`, `GitAuthType`, `GitProvider`, `TraceFormat`) to provide |
| clear and constrained options for certain configuration values. |
| - It includes `DurationAsString`, a custom type for handling
| `time.Duration` serialization and deserialization as strings in JSON,
| which is more human-readable than nanosecond integers. It also provides
| a custom JSON schema for this type. A sketch of the round-tripping
| appears after this list.
| - It defines structs for command-line flags used by different Perf |
| services (backend, frontend, ingest, maintenance). This helps in |
| organizing and parsing command-line arguments. |
| - Global constants like `MaxSampleTracesPerCluster`, `MinStdDev`, |
| `GotoRange`, and `QueryMaxRunTime` are defined here, providing default |
| values or limits used across the application. |
| - **`/go/config/validate/validate.go`:** |
| - This file contains the logic for validating an `InstanceConfig` beyond |
| what the JSON schema can enforce. This includes semantic checks, such as |
| ensuring that required fields are present based on the values of other |
| fields (e.g., API keys for issue tracker notifications). |
| - The `InstanceConfigFromFile` function is the primary entry point for |
| loading and validating a configuration file. It first performs schema |
| validation and then calls the `Validate` function for further business |
| logic checks. |
| - It also validates the Go text templates used in `NotifyConfig` by |
| attempting to format them with sample data. This helps catch template |
| syntax errors early. |
| - **`/go/config/validate/instanceConfigSchema.json`:** |
| - This is an automatically generated JSON Schema file that defines the |
| expected structure and data types for `InstanceConfig` JSON files. It is |
| used by `validate.go` to perform initial validation of configuration |
| files. |
| - **`/go/config/generate/main.go`:** |
| - This is a small utility program that generates the |
| `instanceConfigSchema.json` file based on the `InstanceConfig` struct |
| definition in `config.go`. This ensures that the schema is always |
| up-to-date with the Go code. The `//go:generate` directive at the top of |
| the file allows for easy regeneration of the schema. |
| - **`config_test.go` and `/go/config/validate/validate_test.go`:** |
| - These files contain unit tests for the configuration loading, |
| serialization/deserialization (especially for custom types like |
| `DurationAsString`), and validation logic. The tests for `validate.go` |
| include checks against actual configuration files used in production |
| (`//perf:configs`), ensuring that the validation logic is robust and |
| correctly handles real-world scenarios. |
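| |
| The `DurationAsString` round-tripping mentioned above can be sketched like
| this minimal version (the real type also contributes a custom JSON schema;
| see `config.go` for the actual definition):
| |
| ```
| package config
| |
| import (
|   "encoding/json"
|   "time"
| )
| |
| // DurationAsString serializes a time.Duration as a human-readable string
| // such as "15m" or "24h" instead of a nanosecond integer.
| type DurationAsString time.Duration
| |
| func (d DurationAsString) MarshalJSON() ([]byte, error) {
|   return json.Marshal(time.Duration(d).String())
| }
| |
| func (d *DurationAsString) UnmarshalJSON(b []byte) error {
|   var s string
|   if err := json.Unmarshal(b, &s); err != nil {
|     return err
|   }
|   parsed, err := time.ParseDuration(s)
|   if err != nil {
|     return err
|   }
|   *d = DurationAsString(parsed)
|   return nil
| }
| ```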
| |
| **Workflows:** |
| |
| **1. Loading and Validating a Configuration File:** |
| |
| ``` |
| User provides config file path (e.g., "configs/nano.json") |
| | |
| V |
| Application calls validate.InstanceConfigFromFile("configs/nano.json") |
| | |
| V |
| validate.go: Reads the JSON file content. |
| | |
| V |
| validate.go: Validates content against instanceConfigSchema.json (using jsonschema.Validate). |
| | \ |
| | (If schema violation) \ |
| V V |
| Error returned with schema violations. Deserializes JSON into config.InstanceConfig struct. |
| | |
| V |
| validate.go: Calls Validate(instanceConfig) for further business logic checks. |
| | (e.g., API key presence, template validity) |
| | |
| | (If validation error) |
| V |
| Error returned. |
| | |
| V (If all valid) |
| Returns the populated config.InstanceConfig struct. |
| | |
| V |
| Application sets config.Config = returnedInstanceConfig |
| | |
| V |
| Perf instance uses config.Config for its operations. |
| ``` |
| |
| **2. Generating the JSON Schema:** |
| |
| This is typically done during development when the `InstanceConfig` struct |
| changes. |
| |
| ``` |
| Developer modifies config.InstanceConfig struct in config.go |
| | |
| V |
| Developer runs `go generate` in the /go/config/generate directory (or via bazel) |
| | |
| V |
| /go/config/generate/main.go: Calls jsonschema.GenerateSchema("../validate/instanceConfigSchema.json", &config.InstanceConfig{}) |
| | |
| V |
| jsonschema library: Introspects the config.InstanceConfig struct and its fields. |
| | |
| V |
| jsonschema library: Generates a JSON Schema definition. |
| | |
| V |
| /go/config/generate/main.go: Writes the generated schema to /go/config/validate/instanceConfigSchema.json. |
| ``` |
| |
| The design prioritizes robustness through schema and semantic validation, |
| maintainability through structured Go types and centralized configuration, and |
| ease of use through standard JSON format and command-line flag integration. The |
| separation of schema generation (`generate` subdirectory) and validation |
| (`validate` subdirectory) keeps concerns distinct. |
| |
| # Module: /go/ctrace2 |
| |
| ## ctrace2 Module Documentation |
| |
| ### Overview |
| |
| The `ctrace2` module provides the functionality to adapt trace data (represented |
| as a series of floating-point values) for use with k-means clustering |
| algorithms. The primary goal is to transform raw trace data into a format that |
| is suitable for distance calculations and centroid computations, which are |
| fundamental operations in k-means. This involves normalization and handling of |
| missing data points. |
| |
| ### Why and How |
| |
| In performance analysis, traces often represent measurements over time or across |
| different configurations. Clustering these traces helps identify groups of |
| similar performance characteristics. However, raw trace data might have issues |
| that hinder effective clustering: |
| |
| 1. **Varying Scales:** Different traces might have values in vastly different |
| ranges, leading to biased distance calculations where traces with larger |
| absolute values dominate. |
| 2. **Missing Data:** Traces can have missing data points, which need to be |
| handled appropriately during normalization and distance computation. |
| 3. **Zero Standard Deviation:** Traces with constant values (zero standard |
| deviation) can cause division by zero errors during normalization. |
| |
| The `ctrace2` module addresses these by: |
| |
| - **Normalization:** Each trace is normalized to have a standard deviation of |
| 1.0. This ensures that the scale of the values does not disproportionately |
| influence the clustering. The `vec32.Norm` function from the `go/vec32` |
| module is leveraged for this. Before normalization, any missing data points |
| (`vec32.MissingDataSentinel`) are filled in using `vec32.Fill`, which likely |
| interpolates or uses a similar strategy to replace them. |
| - **Minimum Standard Deviation:** To prevent division by zero or issues with |
| extremely small standard deviations, a `minStdDev` parameter is used during |
| normalization. If the calculated standard deviation of a trace is below this |
| minimum, the `minStdDev` value is used instead. This is a practical approach |
| to handle traces with very little variation without excluding them from |
| clustering. |
| - **`ClusterableTrace` Structure:** This structure wraps the trace data (`Key` |
| and `Values`) and implements the `kmeans.Clusterable` and `kmeans.Centroid` |
| interfaces from the `perf/go/kmeans` module. This makes `ClusterableTrace` |
| instances directly usable by the k-means algorithm. |
| |
| ### Responsibilities and Key Components |
| |
| - **`ctrace.go`:** This is the core file of the module. |
| - **`ClusterableTrace` struct:** |
| - **Purpose:** Represents a single trace that is ready for clustering. It |
| holds a `Key` (a string identifier for the trace) and `Values` (a slice |
| of `float32` representing the normalized data points). |
| - **Why:** This struct is designed to be directly consumable by the |
| k-means clustering algorithm by implementing necessary interfaces. |
| - **`Distance(c kmeans.Clusterable) float64` method:** Calculates the
| Euclidean distance between the current `ClusterableTrace` and another
| `ClusterableTrace`. This is crucial for the k-means algorithm to
| determine how similar two traces are. The calculation assumes that both
| traces have the same number of data points (a guarantee maintained by
| `NewFullTrace`); a Go sketch appears after this list.
| |
| ```
| For each point i in trace1 and trace2:
|     diff_i = trace1.Values[i] - trace2.Values[i]
|     squared_diff_i = diff_i * diff_i
| Sum all squared_diff_i
| Distance = Sqrt(Sum)
| ```
| |
| - **`AsClusterable() kmeans.Clusterable` method:** Returns the |
| `ClusterableTrace` itself, satisfying the `kmeans.Centroid` interface |
| requirement. |
| - **`Dup(newKey string) *ClusterableTrace` method:** Creates a deep copy |
| of the `ClusterableTrace` with a new key. This is useful when you need |
| to manipulate a trace without affecting the original. |
| - **`NewFullTrace(key string, values []float32, minStdDev float32)
| *ClusterableTrace` function:**
| - **Purpose:** The primary factory function for creating
| `ClusterableTrace` instances from raw trace data.
| - **How:**
| 1. It takes a `key` (string identifier), raw `values` (`[]float32`),
| and a `minStdDev`.
| 2. Creates a copy of the input `values` to avoid modifying the original
| slice.
| 3. Calls `vec32.Fill()` on the copied values. This step handles missing
| data points by filling them, likely through interpolation or a
| similar imputation technique provided by the `go/vec32` module.
| 4. Calls `vec32.Norm()` on the filled values, using `minStdDev`. This
| normalizes the trace data so that its standard deviation is
| effectively 1.0 (or adjusted if the original standard deviation was
| below `minStdDev`).
| 5. Returns a new `ClusterableTrace` with the provided `key` and the
| processed (filled and normalized) `values`.
| |
| ```
| Input: key, raw_values, minStdDev
| ------------------------------------
| copied_values = copy(raw_values)
| filled_values = vec32.Fill(copied_values)
| normalized_values = vec32.Norm(filled_values, minStdDev)
| Output: ClusterableTrace{Key: key, Values: normalized_values}
| ```
| |
| - **`CalculateCentroid(members []kmeans.Clusterable) kmeans.Centroid` |
| function:** |
| - **Purpose:** Implements the `kmeans.CalculateCentroid` function type. |
| Given a slice of `ClusterableTrace` instances (which are members of a |
| cluster), it computes their centroid. |
| - **How:** |
| 1. It initializes a new slice of `float32` (`mean`) with the same |
| length as the `Values` of the first member trace. |
| 2. It iterates through each member trace in the `members` slice. |
| 3. For each member, it iterates through its `Values` and adds each |
| value to the corresponding element in the `mean` slice. |
| 4. After summing up all values component-wise, it divides each element |
| in the `mean` slice by the total number of `members` to get the |
| average value for each dimension. |
| 5. It returns a new `ClusterableTrace` representing the centroid. The |
| key for this centroid trace is set to `CENTROID_KEY` |
| ("special_centroid"). |
| |
| ``` |
| Input: members (list of ClusterableTraces) |
| ------------------------------------------ |
| Initialize mean_values = [0.0, ..., 0.0] (same length as members[0].Values) |
| For each member_trace in members: |
|   For each i from 0 to len(member_trace.Values) - 1: |
|     mean_values[i] = mean_values[i] + member_trace.Values[i] |
| For each i from 0 to len(mean_values) - 1: |
|   mean_values[i] = mean_values[i] / len(members) |
| Output: ClusterableTrace{Key: CENTROID_KEY, Values: mean_values} |
| ``` |
| - **`CENTROID_KEY` constant:** |
| - **Purpose:** Defines a standard key ("special_centroid") to be used for |
| traces that represent the centroid of a cluster. |
| - **Why:** This provides a consistent way to identify centroid traces if |
| they are, for example, added back into a collection of traces (e.g., in |
| a DataFrame). |
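| |
| The distance and centroid logic described above is small enough to sketch |
| end to end. This self-contained Go program mirrors the documented behavior |
| with simplified types; it deliberately omits the real `kmeans` interfaces. |
| |
| ``` |
| package main |
| |
| import ( |
|   "fmt" |
|   "math" |
| ) |
| |
| type ClusterableTrace struct { |
|   Key    string |
|   Values []float32 |
| } |
| |
| // Distance returns the Euclidean distance between two traces, which must |
| // have the same length (the guarantee maintained by NewFullTrace). |
| func (t *ClusterableTrace) Distance(o *ClusterableTrace) float64 { |
|   total := 0.0 |
|   for i, v := range t.Values { |
|     d := float64(v - o.Values[i]) |
|     total += d * d |
|   } |
|   return math.Sqrt(total) |
| } |
| |
| // calculateCentroid averages the member traces component-wise. |
| func calculateCentroid(members []*ClusterableTrace) *ClusterableTrace { |
|   mean := make([]float32, len(members[0].Values)) |
|   for _, m := range members { |
|     for i, v := range m.Values { |
|       mean[i] += v |
|     } |
|   } |
|   for i := range mean { |
|     mean[i] /= float32(len(members)) |
|   } |
|   return &ClusterableTrace{Key: "special_centroid", Values: mean} |
| } |
| |
| func main() { |
|   a := &ClusterableTrace{Key: "a", Values: []float32{0, 1, 2}} |
|   b := &ClusterableTrace{Key: "b", Values: []float32{2, 1, 0}} |
|   fmt.Println(a.Distance(b)) // 2.828... |
|   fmt.Println(calculateCentroid([]*ClusterableTrace{a, b}).Values) // [1 1 1] |
| } |
| ``` |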
| |
| The interaction with the `go/vec32` module is crucial for data preprocessing |
| (filling missing values and normalization), while the `perf/go/kmeans` module |
| provides the interfaces that `ctrace2` implements to be compatible with k-means |
| clustering algorithms. |
| |
| # Module: /go/culprit |
| |
| The `culprit` module is responsible for identifying, storing, and notifying |
| about commits that are likely causes of performance regressions. It integrates |
| with anomaly detection and subscription systems to automate the process of |
| pinpointing culprits and alerting relevant parties. |
| |
| ## Key Responsibilities |
| |
| - **Culprit Identification:** While the actual bisection logic might reside |
| elsewhere, this module is responsible for receiving information about |
| potential culprit commits. |
| - **Culprit Persistence:** Storing identified culprits in a database, linking |
| them to the anomaly groups they are associated with. |
| - **Notification:** Generating and sending notifications (e.g., creating |
| issues in an issue tracker) when new culprits are found or when new anomaly |
| groups are reported. |
| - **Data Formatting:** Formatting notification messages (subjects and bodies) |
| based on configurable templates. |
| |
| ## Key Components and Files |
| |
| ### `store.go` & `sqlculpritstore/sqlculpritstore.go` |
| |
| - **Purpose:** These files define the interface and implementation for storing |
| and retrieving culprit data. The primary goal is to persist information |
| about commits identified as culprits, associating them with specific anomaly |
| groups and any filed issues. |
| - **How it Works:** |
| - `store.go` defines the `Store` interface, which outlines the contract |
| for culprit data operations like `Get`, `Upsert`, and `AddIssueId`. |
| - `sqlculpritstore/sqlculpritstore.go` provides a SQL-based implementation |
| of this interface. It uses a SQL database (configured via `pool.Pool`) |
| to store culprit information. |
| - The `Upsert` method is crucial. It either inserts a new culprit record |
| or updates an existing one if a commit has already been identified as a |
| culprit for a different anomaly group. This prevents duplicate culprit |
| entries for the same commit. It also links the culprit to the |
| `anomaly_group_id`. |
| - The `AddIssueId` method updates a culprit record to include the ID of an |
| issue (e.g., a bug tracker ticket) that was created for it, and also |
| maintains a map between the anomaly group and the issue ID. This is |
| important for tracking and referencing. |
| - The database schema (defined in `sqlculpritstore/schema/schema.go`) |
| includes fields for commit details (host, project, ref, revision), |
| associated anomaly group IDs, and associated issue IDs. An index on |
| `(revision, host, project, ref)` helps in efficiently querying for |
| existing culprits. |
| - **Design Choices:** |
| - Using an interface (`Store`) decouples the rest of the module from the |
| specific database implementation, allowing for easier testing and |
| potential future changes in the storage backend. |
| - The `Upsert` logic is designed to handle cases where the same commit |
| might be identified as a culprit for multiple regressions (different |
| anomaly groups). Instead of creating duplicate entries, it appends the |
| new `anomaly_group_id` to the existing record. |
| - Storing `group_issue_map` as JSONB allows flexible storage of the |
| mapping between anomaly groups and the specific issue filed for that |
| group in the context of this culprit. |
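| |
| A rough Go sketch of the store contract described above; the types and |
| signatures are illustrative reconstructions, not the exact declarations in |
| `store.go`. |
| |
| ``` |
| package culpritsketch |
| |
| import "context" |
| |
| // Commit and Culprit loosely mirror the proto messages described below. |
| type Commit struct { |
|   Host, Project, Ref, Revision string |
| } |
| |
| type Culprit struct { |
|   ID              string |
|   Commit          Commit |
|   AnomalyGroupIDs []string |
|   IssueIDs        []string |
|   GroupIssueMap   map[string]string // anomaly group ID -> issue ID |
| } |
| |
| // Store sketches the interface in store.go. |
| type Store interface { |
|   // Get returns the culprits with the given IDs. |
|   Get(ctx context.Context, ids []string) ([]*Culprit, error) |
|   // Upsert inserts new culprit rows, or appends anomalyGroupID to an |
|   // existing row for the same commit, and returns the culprit IDs. |
|   Upsert(ctx context.Context, anomalyGroupID string, commits []Commit) ([]string, error) |
|   // AddIssueId records the issue filed for this (culprit, group) pair. |
|   AddIssueId(ctx context.Context, culpritID, issueID, anomalyGroupID string) error |
| } |
| ``` |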
| |
| ### `formatter/formatter.go` |
| |
| - **Purpose:** This component is responsible for constructing the content |
| (subject and body) of notifications. It allows for customizable message |
| formats. |
| - **How it Works:** |
| - Defines the `Formatter` interface with methods |
| `GetCulpritSubjectAndBody` (for new culprit notifications) and |
| `GetReportSubjectAndBody` (for new anomaly group reports). |
| - `MarkdownFormatter` is the concrete implementation. It uses Go's |
| `text/template` package to render notification messages. |
| - Templates for subjects and bodies can be provided via `InstanceConfig`. |
| If not provided, default templates are used. |
| - `TemplateContext` and `ReportTemplateContext` provide the data that can |
| be used within the templates (e.g., commit details, subscription |
| information, anomaly group details). |
| - Helper functions like `buildCommitURL`, `buildAnomalyGroupUrl`, and |
| `buildAnomalyDetails` are available within the templates to construct |
| URLs and format anomaly details. |
| - **Design Choices:** |
| - The use of interfaces and templates promotes flexibility. Users can |
| define their own notification formats without modifying the core |
| notification logic. |
| - Default templates ensure that the system can function even without |
| explicit template configuration. |
| - Separating formatting from the transport mechanism (how notifications |
| are sent) adheres to the single responsibility principle. |
| - **`formatter/noop.go`**: Provides a `NoopFormatter` that generates empty |
| subjects and bodies, useful for disabling notifications or for testing |
| scenarios where actual formatting is not needed. |
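| |
| To illustrate the template-driven formatting, here is a minimal Go sketch |
| using `text/template`. The `TemplateContext` fields and the default |
| templates below are invented for illustration; the real ones in |
| `formatter.go` are richer. |
| |
| ``` |
| package main |
| |
| import ( |
|   "bytes" |
|   "fmt" |
|   "text/template" |
| ) |
| |
| // TemplateContext carries the fields a subject/body template can use. |
| type TemplateContext struct { |
|   Commit       string |
|   CommitURL    string |
|   Subscription string |
| } |
| |
| const defaultSubject = `Culprit found: {{ .Commit }}` |
| const defaultBody = `A culprit was identified for subscription {{ .Subscription }}. |
| |
| Commit: {{ .CommitURL }} |
| ` |
| |
| // render parses and executes a template against the context. |
| func render(tmpl string, ctx TemplateContext) (string, error) { |
|   t, err := template.New("msg").Parse(tmpl) |
|   if err != nil { |
|     return "", err |
|   } |
|   var b bytes.Buffer |
|   if err := t.Execute(&b, ctx); err != nil { |
|     return "", err |
|   } |
|   return b.String(), nil |
| } |
| |
| func main() { |
|   ctx := TemplateContext{ |
|     Commit:       "abc123", |
|     CommitURL:    "https://example.com/+/abc123", |
|     Subscription: "skia-perf-sheriff", |
|   } |
|   subject, _ := render(defaultSubject, ctx) |
|   body, _ := render(defaultBody, ctx) |
|   fmt.Println(subject) |
|   fmt.Println(body) |
| } |
| ``` |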
| |
| ### `transport/transport.go` |
| |
| - **Purpose:** This component handles the actual sending of notifications to |
| external systems, primarily issue trackers. |
| - **How it Works:** |
| - Defines the `Transport` interface with the `SendNewNotification` method. |
| - `IssueTrackerTransport` is the concrete implementation for interacting |
| with an issue tracker (e.g., Google Issue Tracker/Buganizer). |
| - It uses the `go.skia.org/infra/go/issuetracker/v1` client library. |
| - Authentication is handled using an API key retrieved via the `secret` |
| package. |
| - When `SendNewNotification` is called, it constructs an |
| `issuetracker.Issue` object based on the provided subject, body, and |
| subscription details (like component ID, priority, CCs, hotlists). |
| - It then calls the issue tracker API to create a new issue. |
| - Metrics (`SendNewNotificationSuccess`, `SendNewNotificationFail`) are |
| recorded to monitor the success rate of sending notifications. |
| - **Design Choices:** |
| - The `Transport` interface allows for different notification mechanisms |
| to be plugged in (e.g., email, Slack) in the future. |
| - Configuration for the issue tracker (API key, secret project/name) is |
| externalized, promoting better security and manageability. |
| - Error handling and metrics provide visibility into the notification |
| delivery process. |
| - **`transport/noop.go`**: Provides a `NoopTransport` that doesn't actually |
| send any notifications, useful for disabling notifications or for testing. |
| |
| ### `notify/notify.go` |
| |
| - **Purpose:** This component orchestrates the notification process by |
| combining a `Formatter` and a `Transport`. |
| - **How it Works:** |
| - Defines the `CulpritNotifier` interface with methods |
| `NotifyCulpritFound` and `NotifyAnomaliesFound`. |
| - `DefaultCulpritNotifier` implements this interface. It takes a |
| `formatter.Formatter` and a `transport.Transport` as dependencies. |
| - The `GetDefaultNotifier` factory function determines which `Formatter` |
| and `Transport` to use based on the |
| `InstanceConfig.IssueTrackerConfig.NotificationType`. If `NoneNotify`, |
| it uses `NoopFormatter` and `NoopTransport`. If `IssueNotify`, it sets |
| up `MarkdownFormatter` and `IssueTrackerTransport`. |
| - `NotifyCulpritFound`: |
| * Calls the formatter's `GetCulpritSubjectAndBody` to get the message |
| content. |
| * Calls the transport's `SendNewNotification` to send the message. |
| * Returns the ID of the created issue (or an empty string if no |
| notification was sent). |
| - `NotifyAnomaliesFound`: |
| * Calls the formatter's `GetReportSubjectAndBody`. |
| * Calls the transport's `SendNewNotification`. |
| * Returns the ID of the created issue. |
| - **Design Choices:** |
| - Decouples the high-level notification logic from the specifics of |
| message formatting and sending. |
| - Configuration-driven selection of formatter and transport makes the |
| notification behavior adaptable. |
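| |
| A condensed Go sketch of how the notifier composes its two dependencies; |
| the interfaces are simplified from those in `formatter.go` and |
| `transport.go`, and the signatures are assumptions. |
| |
| ``` |
| package notifysketch |
| |
| import "context" |
| |
| // Formatter and Transport are simplified versions of the real interfaces. |
| type Formatter interface { |
|   GetCulpritSubjectAndBody(ctx context.Context, commit string) (subject, body string, err error) |
| } |
| |
| type Transport interface { |
|   SendNewNotification(ctx context.Context, subject, body string) (issueID string, err error) |
| } |
| |
| // DefaultCulpritNotifier glues the two together, mirroring notify.go. |
| type DefaultCulpritNotifier struct { |
|   formatter Formatter |
|   transport Transport |
| } |
| |
| // NotifyCulpritFound formats the message and sends it, returning the ID |
| // of the created issue (empty if nothing was sent). |
| func (n *DefaultCulpritNotifier) NotifyCulpritFound(ctx context.Context, commit string) (string, error) { |
|   subject, body, err := n.formatter.GetCulpritSubjectAndBody(ctx, commit) |
|   if err != nil { |
|     return "", err |
|   } |
|   return n.transport.SendNewNotification(ctx, subject, body) |
| } |
| ``` |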
| |
| ### `service/service.go` |
| |
| - **Purpose:** Implements the gRPC service defined in `culprit.proto`. This is |
| the main entry point for external systems (like a bisection service or an |
| anomaly detection pipeline) to interact with the culprit module. |
| - **How it Works:** |
| - Implements the `pb.CulpritServiceServer` interface. |
| - It depends on `anomalygroup.Store`, `culprit.Store`, |
| `subscription.Store`, and `notify.CulpritNotifier`. |
| - **`PersistCulprit` RPC:** |
| * Calls `culpritStore.Upsert` to save the identified culprit commits and |
| associate them with the `anomaly_group_id`. |
| * Calls `anomalygroupStore.AddCulpritIDs` to link the newly |
| created/updated culprit IDs back to the anomaly group. |
| |
| ``` |
| [Client (e.g., Bisection Service)] |
|   | |
|   v |
| [PersistCulpritRequest {Commits, AnomalyGroupID}] |
|   | |
|   v |
| [culpritService.PersistCulprit] |
|   | |
|   +-> [culpritStore.Upsert(AnomalyGroupID, Commits)] -> Returns CulpritIDs |
|   | |
|   +-> [anomalygroupStore.AddCulpritIDs(AnomalyGroupID, CulpritIDs)] |
|   | |
|   v |
| [PersistCulpritResponse {CulpritIDs}] |
|   | |
|   v |
| [Client] |
| ``` |
| - **`GetCulprit` RPC:** |
| * Calls `culpritStore.Get` to retrieve culprit details by their IDs. |
| - **`NotifyUserOfCulprit` RPC:** |
| * Retrieves culprit details using `culpritStore.Get`. |
| * Loads the corresponding `AnomalyGroup` using |
| `anomalygroupStore.LoadById`. |
| * Loads the `Subscription` associated with the anomaly group using |
| `subscriptionStore.GetSubscription`. |
| * Calls `notifier.NotifyCulpritFound` for each culprit to send a |
| notification (e.g., file a bug). |
| * Calls `culpritStore.AddIssueId` to store the generated issue ID with the |
| culprit and the specific anomaly group. |
| |
| ``` |
| [Client (e.g., Bisection Service after PersistCulprit)] |
|   | |
|   v |
| [NotifyUserOfCulpritRequest {CulpritIDs, AnomalyGroupID}] |
|   | |
|   v |
| [culpritService.NotifyUserOfCulprit] |
|   |-> [culpritStore.Get(CulpritIDs)] -> Culprits |
|   |-> [anomalygroupStore.LoadById(AnomalyGroupID)] -> AnomalyGroup |
|   |-> [subscriptionStore.GetSubscription(AnomalyGroup.SubName, |
|                                          AnomalyGroup.SubRev)] -> Subscription |
|   | |
|   | (For each Culprit in Culprits) |
|   |-> [notifier.NotifyCulpritFound(Culprit, Subscription)] -> Returns IssueID |
|   |     | |
|   |     v |
|   |   [culpritStore.AddIssueId(Culprit.ID, IssueID, AnomalyGroupID)] |
|   | |
|   v |
| [NotifyUserOfCulpritResponse {IssueIDs}] |
|   | |
|   v |
| [Client] |
| ``` |
| - **`NotifyUserOfAnomaly` RPC:** |
| * Loads the `AnomalyGroup` and its associated `Subscription`. |
| * Calls `notifier.NotifyAnomaliesFound` to send a notification about the |
| group of anomalies (e.g., file a summary bug). |
| |
| ``` |
| [Client (e.g., Anomaly Detection Service)] |
|   | |
|   v |
| [NotifyUserOfAnomalyRequest {AnomalyGroupID, Anomalies[]}] |
|   | |
|   v |
| [culpritService.NotifyUserOfAnomaly] |
|   |-> [anomalygroupStore.LoadById(AnomalyGroupID)] -> AnomalyGroup |
|   |-> [subscriptionStore.GetSubscription(AnomalyGroup.SubName, |
|                                          AnomalyGroup.SubRev)] -> Subscription |
|   |-> [notifier.NotifyAnomaliesFound(AnomalyGroup, Subscription, Anomalies[])] |
|         -> Returns IssueID |
|   | |
|   v |
| [NotifyUserOfAnomalyResponse {IssueID}] |
|   | |
|   v |
| [Client] |
| ``` |
| - `PrepareSubscription` is a helper function used to potentially override |
| or mock subscription details for testing or during transitional phases |
| before full sheriff configuration is active. This is a temporary |
| measure. |
| - **Design Choices:** |
| - Clear separation of concerns: the service layer orchestrates actions by |
| calling appropriate stores and notifiers. |
| - gRPC provides a well-defined, language-agnostic interface for the |
| service. |
| - The authorization policy (`GetAuthorizationPolicy`) is currently set to |
| allow unauthenticated access, which might need to be revisited for |
| production environments. |
| |
| ### `proto/v1/culprit_service.proto` |
| |
| - **Purpose:** Defines the gRPC service contract for culprit-related |
| operations. |
| - **Key Messages and RPCs:** |
| - `Commit`: Represents a source code commit. |
| - `Culprit`: Represents an identified culprit commit, including its ID, |
| the commit details, associated anomaly group IDs, and issue IDs. It also |
| includes `group_issue_map` to track which issue was filed for which |
| anomaly group in the context of this culprit. |
| - `Anomaly`: Represents a detected performance anomaly (duplicated from |
| anomalygroup service for potential independent evolution). |
| - `PersistCulpritRequest`/`Response`: For storing new culprits. |
| - `GetCulpritRequest`/`Response`: For retrieving existing culprits. |
| - `NotifyUserOfAnomalyRequest`/`Response`: For triggering notifications |
| about a new set of anomalies (anomaly group). |
| - `NotifyUserOfCulpritRequest`/`Response`: For triggering notifications |
| about newly identified culprits. |
| - **Design Choices:** |
| - Proto definitions provide a clear, typed contract for communication. |
| - The `Anomaly` message is duplicated from the `anomalygroup` service. |
| This choice was made to allow the `culprit` service and `anomalygroup` |
| service to evolve their respective `Anomaly` definitions independently |
| if needed in the future, avoiding tight coupling. |
| - The `group_issue_map` in the `Culprit` message is important for |
| scenarios where a single culprit might be associated with multiple |
| anomaly groups, and each of those (culprit, group) pairs might result in |
| a distinct bug being filed. |
| |
| ### Mocks (`mocks/` subdirectories) |
| |
| - These directories contain generated mock implementations for the interfaces |
| defined within the `culprit` module (e.g., `Store`, `Formatter`, |
| `Transport`, `CulpritNotifier`, `CulpritServiceServer`). |
| - **Purpose:** Facilitate unit testing by allowing dependencies to be easily |
| mocked. This is standard practice for writing testable Go code. They are |
| generated using tools like `mockery`. |
| |
| ## Overall Workflow Example: Finding and Notifying a Culprit |
| |
| 1. **Anomaly Detection:** An external system detects a performance regression |
| and groups related anomalies into an `AnomalyGroup`. |
| 2. **Bisection (External):** A bisection process is triggered (potentially |
| externally) to find the commit(s) responsible for the anomalies in the |
| `AnomalyGroup`. |
| 3. **Persist Culprit:** The bisection service calls the |
| `CulpritService.PersistCulprit` RPC with the identified `Commit`(s) and the |
| `AnomalyGroupID`. |
| - `culpritService` uses `culpritStore.Upsert` to save these commits as |
| `Culprit` records, linking them to the `AnomalyGroupID`. |
| - It then calls `anomalygroupStore.AddCulpritIDs` to update the |
| `AnomalyGroup` record with the IDs of these new culprits. |
| 4. **Notify User of Culprit:** The bisection service (or another orchestrator) |
| then calls `CulpritService.NotifyUserOfCulprit` RPC with the `CulpritID`(s) |
| and the `AnomalyGroupID`. |
| - `culpritService` retrieves the full `Culprit` details and the associated |
| `Subscription`. |
| - `DefaultCulpritNotifier` is invoked: |
| - `MarkdownFormatter` generates the subject and body for the |
| notification. |
| - `IssueTrackerTransport` sends this formatted message to the issue |
| tracker, creating a new bug. |
| - The ID of the created bug is returned. |
| - `culpritService` calls `culpritStore.AddIssueId` to associate this bug |
| ID with the specific `Culprit` and `AnomalyGroupID`. |
| |
| This flow ensures that culprits are stored, linked to their regressions, and |
| users are notified through the configured channels. The modular design allows |
| for flexibility in how each step (storage, formatting, transport) is |
| implemented. |
| |
| # Module: /go/dataframe |
| |
| The `dataframe` module provides the `DataFrame` data structure and related |
| functionality for handling and manipulating performance trace data. It is a core |
| component for querying, analyzing, and visualizing performance metrics within |
| the Skia Perf system. |
| |
| **Key Design Principles:** |
| |
| - **Tabular Data Representation:** Inspired by R's DataFrame, this module |
| represents performance data as a table. Rows correspond to individual traces |
| (identified by a structured key), and columns represent distinct commit |
| points or patch levels. This structure facilitates efficient querying and |
| analysis of time-series data across different configurations. |
| - **TraceSet and ParamSet:** A `DataFrame` encapsulates a `types.TraceSet`, |
| which is a map of trace keys to their corresponding performance values. It |
| also maintains a `paramtools.ReadOnlyParamSet`, which describes the unique |
| parameter key-value pairs present in the `TraceSet`. This allows for |
| efficient filtering and aggregation based on trace characteristics. |
| - **Commit-Centric Columns:** The columns of a `DataFrame` are defined by |
| `ColumnHeader` structs, each containing a commit offset and a timestamp. |
| This ties the performance data directly to specific points in the codebase's |
| history. |
| - **Data Retrieval Abstraction:** The `DataFrameBuilder` interface decouples |
| the `DataFrame` creation logic from the underlying data source. This allows |
| for different implementations to fetch data (e.g., from a database) while |
| providing a consistent API for consumers. |
| - **Efficiency for Common Operations:** The module provides functions for |
| common data manipulation tasks like merging DataFrames (`Join`), filtering |
| traces (`FilterOut`), slicing data (`Slice`), and compressing data by |
| removing empty columns (`Compress`). These operations are designed with |
| performance considerations in mind. |
| |
| **Key Components and Files:** |
| |
| - **`dataframe.go`**: This is the central file defining the `DataFrame` struct |
| and its associated methods. |
| |
| - **`DataFrame` struct**: |
| - `TraceSet`: Stores the actual performance data, mapping trace keys |
| (strings representing parameter combinations like |
| ",arch=x86,config=8888,") to `types.Trace` (slices of float32 values). |
| - `Header`: A slice of `*ColumnHeader` pointers, defining the columns of |
| the DataFrame. Each `ColumnHeader` links a column to a specific commit |
| (`Offset`) and its `Timestamp`. |
| - `ParamSet`: A `paramtools.ReadOnlyParamSet` that contains all unique |
| key-value pairs from the keys in `TraceSet`. This is crucial for |
| understanding the dimensions of the data and for building UI controls |
| for filtering. It's rebuilt by `BuildParamSet()`. |
| - `Skip`: An integer indicating if any commits were skipped during data |
| retrieval to keep the DataFrame size manageable (related to |
| `MAX_SAMPLE_SIZE`). |
| - **`DataFrameBuilder` interface**: Defines the contract for objects that |
| can construct `DataFrame` instances. This allows for different data |
| sources or retrieval strategies. Key methods include: |
| - `NewFromQueryAndRange`: Creates a DataFrame based on a query and a time |
| range. |
| - `NewFromKeysAndRange`: Creates a DataFrame for specific trace keys over |
| a time range. |
| - `NewNFromQuery` / `NewNFromKeys`: Creates a DataFrame with the N most |
| recent data points for matching traces or specified keys. |
| - `NumMatches` / `PreflightQuery`: Used to estimate the size of the data |
| that a query will return, often for UI feedback or to refine queries. |
| - **`ColumnHeader` struct**: Represents a single column in the DataFrame, |
| typically corresponding to a commit. It contains: |
| - `Offset`: A `types.CommitNumber` identifying the commit. |
| - `Timestamp`: The timestamp of the commit in seconds since the Unix |
| epoch. |
| - **Key Functions**: |
| - `NewEmpty()`: Creates an empty DataFrame. |
| - `NewHeaderOnly()`: Creates a DataFrame with populated headers (commits |
| within a time range) but no trace data. This can be useful for setting |
| up the structure before fetching actual data. |
| - `FromTimeRange()`: Retrieves commit information (headers and commit |
| numbers) for a given time range from a `perfgit.Git` instance. This is a |
| foundational step in populating the `Header` of a DataFrame. |
| - `MergeColumnHeaders()`: A utility function that takes two slices of |
| `ColumnHeader` and merges them into a single sorted slice, returning |
| index maps that are used to realign each input's traces against the |
| merged header. This is essential for the `Join` operation (a simplified |
| sketch follows this list). |
| - `Join()`: Combines two DataFrames into a new DataFrame. It merges their |
| headers and trace data. If traces exist in one DataFrame but not the |
| other for a given key, missing data points (`vec32.MissingDataSentinel`) |
| are inserted. The `ParamSet` of the resulting DataFrame is the union of |
| the input ParamSets. |
| |
| ``` |
| DataFrame A (Header: [C1, C3],     TraceX: [v1, v3]) |
| DataFrame B (Header: [C2, C3],     TraceX: [v2', v3']) |
|         | |
|         v  Join(A, B) |
| Joined    (Header: [C1, C2, C3], TraceX: [v1, v2', v3/v3']) |
| (TraceY present in only A or B is padded with missing data) |
| ``` |
| - `BuildParamSet()`: Recalculates the `ParamSet` for a DataFrame based on |
| the current keys in its `TraceSet`. This is called after operations like |
| `FilterOut` that might change the set of traces. |
| - `FilterOut()`: Removes traces from the `TraceSet` based on a provided |
| `TraceFilter` function. It then calls `BuildParamSet()` to update the |
| `ParamSet`. |
| - `Slice()`: Returns a new DataFrame that is a view into a sub-section of |
| the original DataFrame's columns. The underlying trace data is sliced, |
| not copied, for efficiency. |
| - `Compress()`: Creates a new DataFrame by removing any columns (and |
| corresponding data points in traces) that contain only missing data |
| sentinels across all traces. This helps in reducing data size and |
| focusing on relevant data points. |
| |
| - **`dataframe_test.go`**: Contains unit tests for the functionality in |
| `dataframe.go`. These tests cover various scenarios, including empty |
| DataFrames, different merging and joining cases, filtering, slicing, and |
| compression. The tests often use `gittest` for creating mock Git |
| repositories to test time range queries. |
| |
| - **`/go/dataframe/mocks/DataFrameBuilder.go`**: This file contains a mock |
| implementation of the `DataFrameBuilder` interface, generated using the |
| `testify/mock` library. This mock is used in tests of other packages that |
| depend on `DataFrameBuilder`, allowing them to simulate DataFrame creation |
| without needing a real data source or Git repository. |
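| |
| As referenced above, here is a simplified, runnable Go version of the |
| header-merging idea behind `MergeColumnHeaders`, using plain ints in place |
| of `ColumnHeader` commit offsets; the real function's signature differs. |
| |
| ``` |
| package main |
| |
| import "fmt" |
| |
| // mergeHeaders merges two ascending slices of commit offsets and returns, |
| // for each input, a map from its original index to the index in the |
| // merged slice, so traces can be realigned against the merged header. |
| func mergeHeaders(a, b []int) (merged []int, aMap, bMap map[int]int) { |
|   aMap, bMap = map[int]int{}, map[int]int{} |
|   i, j := 0, 0 |
|   for i < len(a) || j < len(b) { |
|     switch { |
|     case j >= len(b) || (i < len(a) && a[i] < b[j]): |
|       aMap[i] = len(merged) |
|       merged = append(merged, a[i]) |
|       i++ |
|     case i >= len(a) || b[j] < a[i]: |
|       bMap[j] = len(merged) |
|       merged = append(merged, b[j]) |
|       j++ |
|     default: // equal offsets share one merged column |
|       aMap[i] = len(merged) |
|       bMap[j] = len(merged) |
|       merged = append(merged, a[i]) |
|       i++ |
|       j++ |
|     } |
|   } |
|   return merged, aMap, bMap |
| } |
| |
| func main() { |
|   merged, aMap, bMap := mergeHeaders([]int{1, 3}, []int{2, 3}) |
|   fmt.Println(merged) // [1 2 3] |
|   fmt.Println(aMap)   // map[0:0 1:2] |
|   fmt.Println(bMap)   // map[0:1 1:2] |
| } |
| ``` |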
| |
| **Workflows:** |
| |
| 1. **Fetching Data for Display/Analysis:** |
| |
| - A client (e.g., a web UI) specifies a query and a time range. |
| - An implementation of `DataFrameBuilder` (e.g., one that queries a |
| CockroachDB instance) uses `NewFromQueryAndRange`. |
| - Internally, this likely involves: |
| 1. Resolving the time range to a list of commits using `FromTimeRange` |
| (which calls `perfgit.Git.CommitSliceFromTimeRange`). This populates |
| the `Header`. |
| 2. Querying the data source for traces matching the query and falling |
| within the identified commit range. |
| 3. Populating the `TraceSet`. |
| 4. Building the `ParamSet` using `BuildParamSet()`. |
| - The resulting `DataFrame` is returned. |
| |
| ``` |
| Client Request (Query, TimeRange) |
| | |
| V |
| DataFrameBuilder.NewFromQueryAndRange(ctx, begin, end, query, ...) |
| | |
| +-> FromTimeRange(ctx, git, begin, end, ...) // Get commit headers |
| | | |
| | V |
| | perfgit.Git.CommitSliceFromTimeRange() |
| | | |
| | V |
| | [ColumnHeader{Offset, Timestamp}, ...] |
| | |
| +-> DataSource.QueryTraces(query, commit_numbers) // Fetch trace data |
| | | |
| | V |
| | types.TraceSet |
| | |
| +-> DataFrame.BuildParamSet() // Populate ParamSet |
| | |
| V |
| DataFrame{Header, TraceSet, ParamSet} |
| ``` |
| |
| 2. **Joining DataFrames (e.g., from different sources or queries):** |
| |
| - Two `DataFrame` instances, `dfA` and `dfB`, are available. |
| - `Join(dfA, dfB)` is called. |
| - `MergeColumnHeaders(dfA.Header, dfB.Header)` creates a unified header |
| and maps to align traces. |
| - A new `TraceSet` is built. For each key: |
| - If a key is in `dfA` but not `dfB`, its trace is copied, padded with |
| missing values for columns unique to `dfB`. |
| - If a key is in `dfB` but not `dfA`, its trace is copied, padded with |
| missing values for columns unique to `dfA`. |
| - If a key is in both, values are merged based on the unified header. |
| - The `ParamSet`s of `dfA` and `dfB` are combined. |
| - A new, joined `DataFrame` is returned. |
| |
| 3. **Filtering Data:** |
| |
| - A `DataFrame` `df` exists. |
| - A `TraceFilter` function `myFilter` is defined (e.g., to remove traces |
| with all zero values). |
| - `df.FilterOut(myFilter)` is called. |
| - The method iterates through `df.TraceSet`. If `myFilter` returns `true` |
| for a trace, that trace is deleted from the `TraceSet`. |
| - `df.BuildParamSet()` is called to reflect the potentially reduced set of |
| parameters. |
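| |
| The filtering workflow above can be sketched in a few lines of Go. The |
| types are minimal stand-ins for the real `DataFrame`, `TraceSet`, and |
| `TraceFilter` (the real `FilterOut` also rebuilds the `ParamSet`). |
| |
| ``` |
| package main |
| |
| import "fmt" |
| |
| type Trace []float32 |
| type TraceSet map[string]Trace |
| |
| type DataFrame struct { |
|   TraceSet TraceSet |
| } |
| |
| // TraceFilter reports whether a trace should be removed. |
| type TraceFilter func(tr Trace) bool |
| |
| // FilterOut deletes every trace the filter matches. |
| func (d *DataFrame) FilterOut(f TraceFilter) { |
|   for key, tr := range d.TraceSet { |
|     if f(tr) { |
|       delete(d.TraceSet, key) |
|     } |
|   } |
| } |
| |
| func main() { |
|   df := &DataFrame{TraceSet: TraceSet{ |
|     ",config=8888,": Trace{0, 0, 0}, |
|     ",config=gles,": Trace{1, 2, 3}, |
|   }} |
|   // Drop traces that are identically zero. |
|   df.FilterOut(func(tr Trace) bool { |
|     for _, v := range tr { |
|       if v != 0 { |
|         return false |
|       } |
|     } |
|     return true |
|   }) |
|   fmt.Println(len(df.TraceSet)) // 1 |
| } |
| ``` |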
| |
| **Constants:** |
| |
| - `DEFAULT_NUM_COMMITS`: Default number of commits to fetch when using methods |
| like `NewNFromQuery`. Set to 50. |
| - `MAX_SAMPLE_SIZE`: A limit on the number of commits (columns) a DataFrame |
| might contain, especially when downsampling. Set to 5000. (Note: The |
| `downsample` parameter in `FromTimeRange` is currently ignored, meaning this |
| might not be strictly enforced by that specific function directly but could |
| be a target for other parts of the system or future enhancements.) |
| |
| # Module: /go/dfbuilder |
| |
| The `dfbuilder` module is responsible for constructing `DataFrame` objects. |
| `DataFrames` are fundamental data structures in Perf, representing a collection |
| of performance traces (time series data) along with their associated parameters |
| and commit information. This module acts as an intermediary between the raw |
| trace data stored in a `TraceStore` and the higher-level analysis and |
| visualization components that consume `DataFrames`. |
| |
| The core design revolves around efficiently fetching and organizing trace data |
| based on various querying criteria. This involves interacting with a |
| `perfgit.Git` instance to resolve commit ranges and timestamps, and a |
| `tracestore.TraceStore` to retrieve the actual trace data. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`dfbuilder.go`**: This is the central file implementing the |
| `DataFrameBuilder` interface. |
| - **`builder` struct**: This struct holds the necessary dependencies like |
| `perfgit.Git`, `tracestore.TraceStore`, `tracecache.TraceCache`, and |
| configuration parameters (e.g., `tileSize`, `numPreflightTiles`, |
| `QueryCommitChunkSize`). It also maintains metrics for various DataFrame |
| construction operations. |
| - **Construction (`NewDataFrameBuilderFromTraceStore`)**: Initializes a |
| `builder` instance. An important configuration here is |
| `filterParentTraces`. If enabled, the builder will attempt to remove |
| redundant parent traces when child traces (more specific traces) exist. |
| For example, if traces for `test=foo,subtest=bar` and `test=foo` both |
| exist, the latter might be filtered out if `filterParentTraces` is true. |
| - **Fetching by Time Range and Query (`NewFromQueryAndRange`)**: |
| - **Why**: This is a common use case where users want to see traces |
| matching a specific query (e.g., `config=8888`) within a given time |
| period. |
| - **How**: |
| 1. It first uses `dataframe.FromTimeRange` (which internally queries |
| `perfgit.Git`) to get a list of `ColumnHeader` (commit information) |
| and `CommitNumber`s within the specified time range. It also handles |
| downsampling if requested. |
| 2. It then determines the relevant tiles to query from the `TraceStore` |
| based on the commit numbers (`sliceOfTileNumbersFromCommits`). |
| 3. The core data fetching happens in the `new` method. This method |
| queries the `TraceStore` for matching traces _per tile_ concurrently |
| using `errgroup.Group` for parallelism. This is a key optimization |
| to speed up data retrieval, especially over large time ranges |
| spanning multiple tiles. |
| 4. A `tracesetbuilder.TraceSetBuilder` is used to efficiently aggregate |
| the traces fetched from different tiles into a single |
| `types.TraceSet` and `paramtools.ParamSet`. |
| 5. Finally, it constructs and returns a compressed `DataFrame`. |
| |
| ``` |
| NewFromQueryAndRange |
|   -> dataframe.FromTimeRange (get commits in time range from Git) |
|   -> sliceOfTileNumbersFromCommits (determine tiles to query) |
|   -> new (concurrently query TraceStore for each tile) |
|        -> TraceStore.QueryTraces (for each tile) |
|        -> tracesetbuilder.Add (aggregate results) |
|   -> tracesetbuilder.Build |
|   -> DataFrame.Compress |
| ``` |
| - **Fetching by Keys and Time Range (`NewFromKeysAndRange`)**: |
| - **Why**: Used when the specific trace keys are already known, and data |
| for these keys is needed within a time range. |
| - **How**: Similar to `NewFromQueryAndRange` in terms of getting commit |
| information for the time range. However, instead of querying by a |
| `query.Query` object, it directly calls `TraceStore.ReadTraces` for each |
| relevant tile, providing the list of trace keys. Results are then |
| aggregated. This is generally faster if the exact trace keys are known |
| as it avoids the overhead of query parsing and matching within the |
| `TraceStore`. |
| - **Fetching N Most Recent Data Points (`NewNFromQuery`, |
| `NewNFromKeys`)**: |
| - **Why**: Often, users are interested in the N most recent data points |
| for a query or a set of keys, typically for displaying recent trends or |
| for alert evaluation. |
| - **How**: These methods work by iterating backward in time, tile by tile |
| (or by `QueryCommitChunkSize` if configured), until `N` data points are |
| collected for the matching traces. |
| 1. It starts from a given `end` time (or the latest commit if `end` is |
| zero). |
| 2. It determines an initial `beginIndex` and `endIndex` for commit |
| numbers. The `QueryCommitChunkSize` can influence this `beginIndex` |
| to fetch a larger chunk of commits at once, potentially improving |
| parallelism in the `new` method. |
| 3. In a loop: |
| - It fetches commit headers and indices for the current |
| `beginIndex`-`endIndex` range. |
| - It calls the `new` method (for `NewNFromQuery`) or a similar |
| tile-based fetching logic (for `NewNFromKeys`) to get a |
| `DataFrame` for this smaller range. |
| - It counts non-missing data points in the fetched `DataFrame`. If |
| no data is found for `maxEmptyTiles` consecutive attempts, it |
| stops to prevent searching indefinitely through sparse data. |
| - It appends the data from the fetched `DataFrame` to the result |
| `DataFrame`, working backward from the `N`th slot. |
| - It then adjusts `beginIndex` and `endIndex` to move to the |
| previous chunk of commits/tiles. |
| 4. If `filterParentTraces` is enabled, it calls `filterParentTraces` to |
| remove redundant parent traces from the final `TraceSet`. |
| 5. The resulting `DataFrame` might have traces of length less than `N` |
| if not enough data points were found. It trims the traces if |
| necessary. |
| |
| ``` |
| NewNFromQuery (or NewNFromKeys) |
|   -> findIndexForTime (get commit number for 'end' time) |
|   -> Loop (until N points are found or maxEmptyTiles reached): |
|        -> fromIndexRange (get commits for current chunk) |
|        -> new (or similar logic for keys) (fetch data for this chunk) |
|        -> Aggregate data into result DataFrame |
|        -> Update beginIndex/endIndex to previous chunk |
|   -> [Optional] filterParentTraces |
|   -> Trim traces if fewer than N points found |
| ``` |
| - **Preflighting Queries (`PreflightQuery`)**: |
| - **Why**: Before executing a potentially expensive query to fetch a full |
| `DataFrame`, it's useful to get an estimate of how many traces will |
| match and what the resulting `ParamSet` will look like. This allows UIs |
| to present filter options dynamically. |
| - **How**: |
| 1. It fetches the latest tile number from the `TraceStore`. |
| 2. It queries the `numPreflightTiles` most recent tiles (concurrently) |
| for trace IDs matching the query `q`. This uses `getTraceIds`, which |
| first attempts to fetch from `tracecache` and falls back to |
| `TraceStore.QueryTracesIDOnly`. |
| 3. The trace IDs (which are `paramtools.Params`) found are used to |
| build up a `ParamSet`. |
| 4. The count of matching traces from the tile with the most matches is |
| taken as the estimated count. |
| 5. Crucially, for parameter keys _present in the input query `q`_, it |
| replaces the values in the computed `ParamSet` with _all_ values for |
| those keys from the `referenceParamSet`. This ensures that the UI |
| can still offer all possible filter options for parameters the user |
| has already started filtering on. |
| 6. The resulting `ParamSet` is normalized. |
| |
| ``` |
| PreflightQuery |
|   -> TraceStore.GetLatestTile |
|   -> Loop (for numPreflightTiles, concurrently): |
|        -> getTraceIds(TileN, query)  // checks tracecache first |
|             -> [If cache miss] TraceStore.QueryTracesIDOnly |
|             -> [If cache miss & tracecache enabled] tracecache.CacheTraceIds |
|        -> Aggregate Params into a new ParamSet -> Update max count |
|   -> Update ParamSet with values from referenceParamSet for keys in the |
|      original query |
|   -> Normalize ParamSet |
| ``` |
| - **Counting Matches (`NumMatches`)**: |
| - **Why**: A simpler version of `PreflightQuery` that only returns the |
| estimated number of matching traces. |
| - **How**: It queries the two most recent tiles using |
| `TraceStore.QueryTracesIDOnly` and returns the higher of the two counts. |
| - **Parent Trace Filtering (`filterParentTraces` function)**: |
| - **Why**: To reduce data redundancy and present a cleaner set of traces |
| to the user, especially in UIs where deeply nested subtests can create |
| many similar-looking parent traces. |
| - **How**: It uses `tracefilter.NewTraceFilter()`. For each trace key in |
| the input `TraceSet`: |
| 1. The key is parsed into `paramtools.Params`. |
| 2. A "path" is constructed from the parameter values based on a |
| predefined order of keys (e.g., "master", "bot", "benchmark", |
| "test", "subtest_1", ...). |
| 3. This path and the original trace key are added to the `traceFilter`. |
| 4. After processing all keys, `traceFilter.GetLeafNodeTraceKeys()` |
| returns only the keys corresponding to the most specific (leaf) |
| traces in the hierarchical structure implied by the paths. |
| 5. A new `TraceSet` is built containing only these leaf node traces. |
| - **Caching (`getTraceIds`, `cacheTraceIdsIfNeeded`)**: |
| - **Why**: `QueryTracesIDOnly` can still be somewhat expensive if |
| performed frequently on the same tiles and queries (e.g., during |
| `PreflightQuery`). Caching the results (the list of matching trace |
| IDs/params) can significantly speed this up. |
| - **How**: The `getTraceIds` function first attempts to retrieve trace IDs |
| from the `tracecache.TraceCache`. If there's a cache miss or the cache |
| is not configured, it queries the `TraceStore`. If a database query was |
| performed and the cache is configured, `cacheTraceIdsIfNeeded` is called |
| to store the results in the cache for future requests. The cache key is |
| typically a combination of the tile number and the query string. |
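| |
| The cache-aside pattern described above can be sketched as follows; |
| `Cache` and `Store` are illustrative stand-ins for |
| `tracecache.TraceCache` and `tracestore.TraceStore`, and the key scheme |
| is an assumption. |
| |
| ``` |
| package dfsketch |
| |
| import ( |
|   "context" |
|   "fmt" |
| ) |
| |
| // Cache and Store are simplified stand-ins for the real dependencies. |
| type Cache interface { |
|   Get(key string) ([]string, bool) |
|   Set(key string, ids []string) |
| } |
| |
| type Store interface { |
|   QueryTracesIDOnly(ctx context.Context, tile int, q string) ([]string, error) |
| } |
| |
| // getTraceIDs sketches the cache-aside lookup used by getTraceIds. |
| func getTraceIDs(ctx context.Context, cache Cache, store Store, tile int, q string) ([]string, error) { |
|   key := fmt.Sprintf("%d/%s", tile, q) // tile number + query string |
|   if ids, ok := cache.Get(key); ok { |
|     return ids, nil // cache hit |
|   } |
|   ids, err := store.QueryTracesIDOnly(ctx, tile, q) // miss: hit the DB |
|   if err != nil { |
|     return nil, err |
|   } |
|   cache.Set(key, ids) // populate for future preflight queries |
|   return ids, nil |
| } |
| ``` |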
| |
| **Design Choices and Trade-offs:** |
| |
| - **Tile-Based Processing**: The `TraceStore` organizes data into tiles. Most |
| `dfbuilder` operations that involve fetching data across a range of commits |
| are designed to process these tiles concurrently. This improves performance |
| by parallelizing I/O and computation. |
| - **`tracesetbuilder`**: This utility is used to efficiently merge trace data |
| coming from different tiles (which might have different sets of commits) |
| into a coherent `TraceSet` and `ParamSet`. |
| - **`QueryCommitChunkSize`**: This parameter in `NewNFromQuery` allows |
| fetching data in larger chunks than a single tile. This can increase |
| parallelism in the underlying `new` method call, but fetching too large a |
| chunk might lead to excessive memory usage or longer latency for the first |
| chunk. |
| - **`maxEmptyTiles` / `newNMaxSearch`**: When searching backward for N data |
| points, these constants prevent indefinite searching if the data is very |
| sparse or the query matches very few traces. |
| - **`singleTileQueryTimeout`**: This guards against queries on individual |
| tiles taking too long, which could happen with "bad" tiles containing |
| excessive data or due to backend issues. This is particularly important for |
| operations like `NewNFromQuery` or `PreflightQuery` which might issue many |
| such single-tile queries. |
| - **Caching for `PreflightQuery`**: `PreflightQuery` is often called by UIs to |
| populate filter options. Caching the results of `QueryTracesIDOnly` (which |
| provides the raw data for `ParamSet` construction in preflight) via |
| `tracecache` helps make these UI interactions faster. |
| - **Parent Trace Filtering**: This is an opinionated feature that aims to |
| improve usability by default. The specific heuristic for identifying |
| "parent" vs. "child" traces is based on a predefined order of parameter |
| keys. |
| |
| The `dfbuilder_test.go` file provides comprehensive unit tests for these |
| functionalities, covering various scenarios including empty queries, queries |
| matching data in single or multiple tiles, N-point queries, and preflight |
| operations with and without caching. It uses `gittest` for creating a mock Git |
| history and `sqltest` (for Spanner) or mock implementations for the `TraceStore` |
| and `TraceCache`. |
| |
| # Module: /go/dfiter |
| |
| ## `dfiter` Module Documentation |
| |
| ### Overview |
| |
| The `dfiter` module is responsible for efficiently creating and providing |
| `dataframe.DataFrame` objects, which are fundamental data structures used in |
| regression detection within the Perf application. It acts as an iterator, |
| allowing consuming code to process DataFrames one by one. This is particularly |
| useful for performance reasons, as constructing and holding all possible |
| DataFrames in memory simultaneously could be resource-intensive. |
| |
| The core purpose of `dfiter` is to abstract away the complexities of fetching |
| and structuring data from the underlying trace store and Git history. It ensures |
| that DataFrames are generated with the correct dimensions and data points based |
| on user-defined queries, commit ranges, and alert configurations. |
| |
| ### Design and Implementation Choices |
| |
| The `dfiter` module employs a "slicing" strategy for generating DataFrames. This |
| means it typically fetches a larger, encompassing DataFrame from the |
| `dataframe.DataFrameBuilder` and then yields smaller, overlapping |
| sub-DataFrames. |
| |
| **Why this approach?** |
| |
| - **Efficiency:** Fetching a larger chunk of data once from the database (via |
| `DataFrameBuilder`) is often more efficient than making numerous small |
| queries. The slicing operation itself is a relatively cheap in-memory |
| operation. |
| - **Context for Regression Detection:** Regression detection algorithms often |
| need to look at data points before and after a specific commit (the "radius" |
| of an alert). The slicing approach naturally provides this sliding window of |
| context. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`DataFrameIterator` Interface:** |
| - **Why:** Defines a standard contract for iterating over DataFrames. This |
| promotes loose coupling, allowing different implementations of DataFrame |
| generation if needed in the future, and simplifies how other parts of |
| the system consume DataFrames. |
| - **How:** It provides two methods: |
| - `Next() bool`: Advances the iterator to the next DataFrame. Returns |
| `true` if a next DataFrame is available, `false` otherwise. |
| - `Value(ctx context.Context) (*dataframe.DataFrame, error)`: Returns the |
| current DataFrame. |
| - **`dataframeSlicer` struct:** |
| - **Why:** This is the concrete implementation of `DataFrameIterator`. It |
| embodies the slicing strategy described above. |
| - **How:** It holds a reference to a larger, source `dataframe.DataFrame` |
| (`df`), the desired `size` of the sliced DataFrames (determined by |
| `alert.Radius`), and the current `offset` for slicing. The `Next()` |
| method checks if another slice of the specified `size` can be made, and |
| `Value()` performs the actual slicing using `df.Slice()`. |
| - **`NewDataFrameIterator` Function:** |
| - **Why:** This is the factory function for creating `DataFrameIterator` |
| instances. It encapsulates the logic for determining how the initial, |
| larger DataFrame should be fetched based on the input parameters. |
| - **How:** |
| * **Parameter Parsing:** Parses the input `queryAsString` into a |
| `query.Query` object. |
| * **Mode Determination (Implicit):** The function behaves differently |
| based on `domain.Offset`: |
| - **`domain.Offset == 0` (Continuous/Sliding Window Mode):** |
| - This mode is typically used for ongoing regression detection |
| across a range of recent commits. |
| - It fetches a DataFrame of `domain.N` commits ending at |
| `domain.End`. |
| - **Settling Time:** If `anomalyConfig.SettlingTime` is |
| configured, it adjusts `domain.End` to exclude very recent data |
| points that might not have "settled" (e.g., due to data |
| ingestion delays or pending backfills). This prevents alerts on |
| potentially incomplete or volatile fresh data. |
| - The `dataframeSlicer` will then produce overlapping DataFrames |
| of size `2*alert.Radius + 1`. |
| - **`domain.Offset != 0` (Specific Commit/Exact DataFrame Mode):** |
| - This mode is used when analyzing a specific commit or a small, |
| fixed window around it (e.g., when a user clicks on a specific |
| point in a chart to see its details or re-runs detection for a |
| particular regression). |
| - It aims to return a _single_ DataFrame. |
| - The size of this DataFrame is `2*alert.Radius + 1`. |
| - To determine the `End` time for fetching data, it calculates the |
| commit `alert.Radius` positions _after_ the `domain.Offset`. |
| This ensures the commit at `domain.Offset` is centered within |
| the radius. For example, if `domain.Offset` is commit 21 and |
| `alert.Radius` is 3, it will fetch data up to commit 24 (`21 + |
| 3`). The resulting DataFrame will then contain commits `[18, 19, |
| 20, 21, 22, 23, 24]`. This is a specific requirement to ensure |
| consistency with how different step detection algorithms expect |
| their input DataFrames. |
| * **Data Fetching:** Uses the injected `dataframe.DataFrameBuilder` |
| (`dfBuilder`) to construct the initial DataFrame |
| (`dfBuilder.NewNFromQuery`). This involves querying the trace store and |
| potentially Git history. |
| * **Data Sufficiency Check:** Verifies if the fetched DataFrame contains |
| enough data points (at least `2*alert.Radius + 1` commits). If not, it |
| returns `ErrInsufficientData`. This is crucial because regression |
| detection algorithms require a minimum amount of data to operate |
| correctly. |
| * **Metrics:** Records the number of floating-point values queried from |
| the database using |
| `metrics2.GetCounter("perf_regression_detection_floats")`. This helps in |
| monitoring the data processing load. |
| * **Iterator Instantiation:** Creates and returns a `dataframeSlicer` |
| instance initialized with the fetched DataFrame and the calculated slice |
| size. |
| - **`ErrInsufficientData`:** |
| - **Why:** A specific error type to indicate that while the queries were |
| successful, the available data didn't meet the minimum requirements |
| (e.g., not enough commits within the requested range or matching the |
| query). This allows calling code to handle this scenario gracefully, |
| perhaps by informing the user or adjusting parameters. |
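| |
| The slicing iterator is compact enough to sketch directly from the |
| description above. This Go sketch uses a pared-down `DataFrame` in place |
| of the real `dataframe.DataFrame`; only the `Header` length matters for |
| the slicing logic. |
| |
| ``` |
| package dfitersketch |
| |
| import ( |
|   "context" |
|   "errors" |
| ) |
| |
| // DataFrame is a stand-in; Header holds one entry per commit column. |
| type DataFrame struct { |
|   Header []int |
| } |
| |
| // Slice returns a view of `size` columns starting at `offset`. |
| func (d *DataFrame) Slice(offset, size int) (*DataFrame, error) { |
|   if offset+size > len(d.Header) { |
|     return nil, errors.New("slice out of range") |
|   } |
|   return &DataFrame{Header: d.Header[offset : offset+size]}, nil |
| } |
| |
| // dataframeSlicer yields overlapping windows of `size` columns from one |
| // source DataFrame, mirroring the iterator described above. |
| type dataframeSlicer struct { |
|   df     *DataFrame |
|   size   int |
|   offset int |
| } |
| |
| func (s *dataframeSlicer) Next() bool { |
|   return s.offset+s.size <= len(s.df.Header) |
| } |
| |
| func (s *dataframeSlicer) Value(ctx context.Context) (*DataFrame, error) { |
|   sub, err := s.df.Slice(s.offset, s.size) |
|   if err != nil { |
|     return nil, err |
|   } |
|   s.offset++ |
|   return sub, nil |
| } |
| ``` |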
| |
| ### Key Workflows |
| |
| **1. Continuous Regression Detection (Sliding Window):** |
| |
| This typically happens when `domain.Offset` is 0. |
| |
| ``` |
| [Caller] [NewDataFrameIterator] [DataFrameBuilder] |
| | -- Request with query, domain (N, End), alert (Radius) --> | | |
| | | -- Parse query | |
| | | -- (If anomalyConfig.SettlingTime > 0) Adjust domain.End --> | |
| | | -- dfBuilder.NewNFromQuery(ctx, domain.End, q, domain.N) --> | |
| | | | -- Query TraceStore |
| | | | -- Build large DataFrame |
| | | | <----- DataFrame (df) |
| | | -- Check if len(df.Header) >= 2*Radius+1 | |
| | | -- (If insufficient) Return ErrInsufficientData ----------- | |
| | | -- Create dataframeSlicer(df, size=2*Radius+1, offset=0) | |
| | <----------------- DataFrameIterator (slicer) ------------- | | |
| |
| [Caller] [dataframeSlicer] |
| | | |
| | -- it.Next() ---------------------------------------------> | |
| | | -- return offset+size <= len(df.Header) |
| | <------------------------------ true ---------------------- | |
| | -- it.Value() --------------------------------------------> | |
| | | -- subDf = df.Slice(offset, size) |
| | | -- offset++ |
| | <-------------------------- subDf, nil -------------------- | |
| | -- (Process subDf) | |
| | ... (loop Next()/Value() until Next() returns false) ... | |
| ``` |
| |
| **2. Specific Commit Analysis (Exact DataFrame):** |
| |
| This typically happens when `domain.Offset` is non-zero. |
| |
| ``` |
| [Caller] [NewDataFrameIterator] [Git] [DataFrameBuilder] |
| | -- Request with query, domain (Offset), alert (Radius) --> | | | |
| | | -- Parse query | | |
| | | -- targetCommitNum = domain.Offset + alert.Radius | | |
| | | -- perfGit.CommitFromCommitNumber(targetCommitNum) ------> | | |
| | | | -- Lookup commit | |
| | | <----------------------------- commitDetails, nil --------- | | |
| | | -- dfBuilder.NewNFromQuery(ctx, commitDetails.Timestamp, | | |
| | | q, n=2*Radius+1) ------------> | | |
| | | | | -- Query TraceStore |
| | | | | -- Build DataFrame (size 2*R+1) |
| | | <-------------------------------------------------------- DataFrame (df) ----- | |
| | | -- Check if len(df.Header) >= 2*Radius+1 | | |
| | | -- (If insufficient) Return ErrInsufficientData --------- | | |
| | | -- Create dataframeSlicer(df, size=2*Radius+1, offset=0) | | |
| | <----------------------- DataFrameIterator (slicer) ------ | | | |
| |
| [Caller] [dataframeSlicer] |
| | | |
| | -- it.Next() ---------------------------------------------> | |
| | | -- return offset+size <= len(df.Header) (true for the first call) |
| | <------------------------------ true ---------------------- | |
| | -- it.Value() --------------------------------------------> | |
| | | -- subDf = df.Slice(offset, size) (returns the whole df) |
| | | -- offset++ |
| | <-------------------------- subDf, nil -------------------- | |
| | -- (Process subDf) | |
| | -- it.Next() ---------------------------------------------> | |
| | | -- return offset+size <= len(df.Header) (false for subsequent calls) |
| | <------------------------------ false --------------------- | |
| ``` |
| |
| This design allows for flexible and efficient generation of DataFrames tailored |
| to the specific needs of regression detection, whether it's scanning a wide |
| range of recent commits or focusing on a particular point in time. The use of an |
| iterator pattern also helps manage memory consumption by processing DataFrames |
| sequentially. |
| |
| # Module: /go/dryrun |
| |
| ## Dryrun Module Documentation |
| |
| ### Overview |
| |
| The `dryrun` module provides the capability to test an alert configuration and |
| preview the regressions it would identify without actually creating an alert or |
| sending notifications. This is a crucial tool for developers and performance |
| engineers to fine-tune alert parameters and ensure they accurately capture |
| relevant performance changes. |
| |
| The core idea is to simulate the regression detection process for a given alert |
| configuration over a historical range of data. This allows users to iterate on |
| alert definitions, observe the potential impact of those definitions, and avoid |
| alert fatigue caused by poorly configured alerts. |
| |
| ### Responsibilities and Key Components |
| |
| The primary responsibility of the `dryrun` module is to handle HTTP requests for |
| initiating and reporting the progress of these alert simulations. |
| |
| #### Key Files and Components: |
| |
| - **`dryrun.go`**: This is the heart of the `dryrun` module. It defines the |
| `Requests` struct, which manages the state and dependencies required for |
| processing dry run requests. It also contains the HTTP handler |
| (`StartHandler`) that orchestrates the dry run process. |
| |
| - **`Requests` struct**: |
| |
| - **Why**: Encapsulates all necessary dependencies (like `perfgit.Git` for |
| Git interactions, `shortcut.Store` for shortcut lookups, |
| `dataframe.DataFrameBuilder` for data retrieval, `progress.Tracker` for |
| reporting progress, and `regression.ParamsetProvider` for accessing |
| parameter sets) into a single unit. This promotes modularity and makes |
| it easier to manage and test the dry run functionality. |
| |
| - **How**: It is instantiated via the `New` function, which takes these |
| dependencies as arguments. This allows for dependency injection, making |
| the component more testable and flexible. |
| |
| - **`StartHandler` function**: |
| |
| - **Why**: This is the entry point for initiating a dry run. It handles |
| the incoming HTTP request, validates the alert configuration, and kicks |
| off the asynchronous regression detection process. |
| |
| - **How**: |
| |
| 1. It decodes the alert configuration from the HTTP request body. |
| 2. It performs initial validation on the alert query and other |
| parameters. If validation fails, an error is immediately reported to |
| the client. |
| 3. It uses a `progress.Tracker` to allow clients to monitor the status |
| of the long-running dry run operation. |
| 4. Crucially, it launches the actual regression detection in a separate |
| goroutine. This is essential because regression detection can be a |
| time-consuming process, and blocking the HTTP handler would lead to |
| timeouts and poor user experience. |
| 5. It defines a `detectorResponseProcessor` callback function. This |
| function is invoked by the underlying |
| `regression.ProcessRegressions` function whenever potential |
| regressions are found. |
| - **Why (callback)**: This design decouples the core regression |
| detection logic from the specifics of how dry run results are |
| formatted and reported. It allows the `regression` module to |
| focus on detection, while the `dryrun` module handles the |
| presentation and progress updates for the dry run scenario. |
| - **How (callback)**: The callback processes the raw |
| `ClusterResponse` objects from the regression detection, |
| converts them into user-friendly `RegressionAtCommit` structures |
| (which include commit details and the detected regression), and |
| updates the `Progress` object with these results. This enables |
| real-time feedback to the user as regressions are identified. |
| 6. The `regression.ProcessRegressions` function is then called in the |
| goroutine, passing the alert request, the callback, and other |
| necessary dependencies. This function iterates through the relevant |
| data, applies the alert's clustering and detection logic, and |
| invokes the callback for each identified cluster. |
| 7. The handler immediately responds to the client with the initial |
| `Progress` object, allowing the client to start polling for updates. |
| |
| - **`RegressionAtCommit` struct**: |
| |
| - **Why**: Provides a structured way to represent a regression found at a |
| specific commit. This includes both the commit information (`CID`) and |
| the details of the regression itself (`Regression`). |
| |
| - **How**: It's a simple struct used for marshalling the results into JSON |
| for the client. |
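| |
| A hedged Go sketch of the handler pattern described above (decode, |
| validate, launch a goroutine, respond immediately); all types, fields, |
| and endpoints are illustrative stand-ins, not the actual `dryrun.go` |
| code. |
| |
| ``` |
| package main |
| |
| import ( |
|   "encoding/json" |
|   "net/http" |
|   "sync" |
| ) |
| |
| type AlertConfig struct { |
|   Query string `json:"query"` |
| } |
| |
| type RegressionAtCommit struct { |
|   CID        string `json:"cid"` |
|   Regression string `json:"regression"` |
| } |
| |
| // Progress accumulates results found by the background goroutine and is |
| // polled by the client; access is guarded by a mutex. |
| type Progress struct { |
|   mu      sync.Mutex |
|   results []RegressionAtCommit |
|   done    bool |
| } |
| |
| func (p *Progress) Add(r RegressionAtCommit) { |
|   p.mu.Lock() |
|   defer p.mu.Unlock() |
|   p.results = append(p.results, r) |
| } |
| |
| func (p *Progress) Finish() { |
|   p.mu.Lock() |
|   defer p.mu.Unlock() |
|   p.done = true |
| } |
| |
| // Snapshot returns a copy that is safe to serialize to the client. |
| func (p *Progress) Snapshot() map[string]interface{} { |
|   p.mu.Lock() |
|   defer p.mu.Unlock() |
|   return map[string]interface{}{ |
|     "results": append([]RegressionAtCommit(nil), p.results...), |
|     "done":    p.done, |
|   } |
| } |
| |
| // startHandler: decode and validate the alert config, launch detection |
| // in a goroutine, and respond immediately so the client can poll. |
| func startHandler(w http.ResponseWriter, req *http.Request) { |
|   var cfg AlertConfig |
|   if err := json.NewDecoder(req.Body).Decode(&cfg); err != nil { |
|     http.Error(w, err.Error(), http.StatusBadRequest) |
|     return |
|   } |
|   if cfg.Query == "" { |
|     http.Error(w, "query must not be empty", http.StatusBadRequest) |
|     return |
|   } |
|   prog := &Progress{} |
|   go func() { |
|     // Stand-in for regression.ProcessRegressions invoking the |
|     // detectorResponseProcessor callback for each cluster found. |
|     prog.Add(RegressionAtCommit{CID: "abc123", Regression: "step detected"}) |
|     prog.Finish() |
|   }() |
|   w.Header().Set("Content-Type", "application/json") |
|   _ = json.NewEncoder(w).Encode(prog.Snapshot()) |
| } |
| |
| func main() { |
|   http.HandleFunc("/dryrun/start", startHandler) |
|   _ = http.ListenAndServe(":8080", nil) |
| } |
| ``` |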
| |
| ### Workflows |
| |
| #### Dry Run Initiation and Processing: |
| |
| ``` |
| Client (UI/API) --HTTP POST /dryrun/start with AlertConfig--> Requests.StartHandler |
| | |
| V |
| [Validate AlertConfig] |
| | |
| +----------------------------------+----------------------------------+ |
| | (Validation Fails) | (Validation Succeeds) |
| V V |
| [Update Progress with Error] [Add to Progress Tracker] |
| | | |
| V V |
| Respond to Client with Error Progress Launch Goroutine: regression.ProcessRegressions(...) |
| | |
| V |
| [Iterate through data, detect regressions] |
| | |
| V |
| For each potential regression cluster: |
| Invoke `detectorResponseProcessor` callback |
| | |
| V |
| Callback: [Convert ClusterResponse to RegressionAtCommit] |
| [Update Progress with new RegressionAtCommit] |
| | |
| V |
| (Client polls for Progress updates) |
| | |
| V |
| When ProcessRegressions completes: |
| [Update Progress: Finished or Error] |
| |
| ``` |
| |
| The `StartHandler` effectively acts as a controller that receives the request, |
| performs initial setup and validation, and then delegates the heavy lifting of |
| regression detection to the `regression.ProcessRegressions` function, ensuring |
| the HTTP request can return quickly while the background processing continues. |
| The callback mechanism allows the `dryrun` module to react to findings from the |
| `regression` module in a way that's specific to the dry run use case (i.e., |
| accumulating and formatting results for client display). |
| |
| # Module: /go/favorites |
| |
| ## Favorites Module |
| |
| The `favorites` module provides functionality for users to save and manage |
| "favorite" configurations or views within the Perf application. This allows |
| users to quickly return to specific data explorations or commonly used settings. |
| |
| The core design philosophy is to provide a persistent storage mechanism for |
| user-specific preferences related to application state (represented as URLs). |
| This is achieved through a `Store` interface, which abstracts the underlying |
| data storage, and a concrete SQL-based implementation. |
| |
| ### Key Components and Responsibilities |
| |
| - **`store.go`**: This file defines the central `Store` interface. |
| |
| - **Why**: The interface decouples the business logic of managing |
| favorites from the specific database implementation. This promotes |
| testability (using mocks) and allows for potential future changes to the |
| storage backend without impacting the core application logic. |
|   - **How**: It specifies the fundamental CRUD (Create, Read, Update,
|     Delete) operations for favorites, along with a `List` operation to
|     retrieve all favorites for a specific user and a `Liveness` check. A
|     sketch of the interface appears after this component list.
| - `Favorite`: This struct represents a single favorite item, containing |
| fields like `ID`, `UserId`, `Name`, `Url`, `Description`, and |
| `LastModified`. The `Url` is a key piece of data, as it allows the |
| application to reconstruct the state the user wants to save. |
| - `SaveRequest`: This struct is used for creating and updating favorites, |
| encapsulating the data needed for these operations, notably excluding |
| the `ID` (which is generated or already known) and `LastModified` (which |
| is handled by the store). |
|   - **Liveness**: The `Liveness` method is a bit of an outlier. It's used to
|     check the health of the database connection. It was placed in this store
|     somewhat "arbitrarily because of its lack of essential function": unlike
|     the more critical stores, the favorites store is a relatively safe place
|     to perform this check without impacting core performance data operations.
| |
| - **`sqlfavoritestore/sqlfavoritestore.go`**: This file provides the SQL |
| implementation of the `Store` interface. |
| |
| - **Why**: CockroachDB (or a similar SQL database) is used as the |
| persistent storage for favorites. This choice provides a robust, |
| scalable, and transactional way to manage this data. |
| - **How**: |
| - It defines SQL statements for each operation in the `Store` interface. |
| These statements interact with a `Favorites` table. |
| - The `FavoriteStore` struct holds a database connection pool |
| (`pool.Pool`). |
| - Methods like `Get`, `Create`, `Update`, `Delete`, and `List` execute |
| their corresponding SQL statements against the database. |
| - Timestamps (`LastModified`) are handled automatically during create and |
| update operations to track when a favorite was last changed. |
| - Error handling is done using `skerr.Wrapf` to provide context to any |
| database errors. |
| |
| - **`sqlfavoritestore/schema/schema.go`**: This file defines the SQL schema |
| for the `Favorites` table. |
| |
| - **Why**: It provides a structured, Go-based representation of the |
| database table. This can be useful for schema management, migrations, |
| and ORM-like interactions (though a full ORM isn't used here). |
| - **How**: The `FavoriteSchema` struct uses struct tags (`sql:"..."`) to |
| define column names, types, constraints (like `PRIMARY KEY`, `NOT |
| NULL`), and indexes. The `byUserIdIndex` is crucial for efficiently |
| listing favorites for a specific user. |
| |
| - **`mocks/Store.go`**: This file contains a generated mock implementation of |
| the `Store` interface. |
| |
| - **Why**: Mocks are essential for unit testing components that depend on |
| the `Store` interface. They allow tests to simulate different store |
| behaviors (e.g., successful operations, errors) without requiring an |
| actual database connection. |
| - **How**: This file is auto-generated by the `mockery` tool. It provides |
| a `Store` struct that embeds `mock.Mock` from the `testify` library. |
| Each method of the interface has a corresponding mock function that can |
| be configured to return specific values or errors. |
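|
| As referenced above, here is a minimal sketch of the `Store` interface and its
| data types, assembled from the descriptions in this section; the exact
| signatures and field types are assumptions:
|
| ```
| import (
|     "context"
|     "time"
| )
|
| // Favorite mirrors the fields described above; LastModified could equally
| // be an epoch int64 in the real schema.
| type Favorite struct {
|     ID           string
|     UserId       string
|     Name         string
|     Url          string
|     Description  string
|     LastModified time.Time
| }
|
| // SaveRequest carries the user-supplied fields for Create and Update;
| // ID and LastModified are handled by the store.
| type SaveRequest struct {
|     UserId      string
|     Name        string
|     Url         string
|     Description string
| }
|
| // Store abstracts the persistence of favorites.
| type Store interface {
|     Get(ctx context.Context, id string) (*Favorite, error)
|     Create(ctx context.Context, req *SaveRequest) error
|     Update(ctx context.Context, req *SaveRequest, id string) error
|     Delete(ctx context.Context, id string) error
|     List(ctx context.Context, userId string) ([]*Favorite, error)
|     Liveness(ctx context.Context) error
| }
| ```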
| |
| ### Key Workflows |
| |
| **1. Creating a New Favorite:** |
| |
| ``` |
| User Action (e.g., clicks "Save as Favorite" in UI) |
| | |
| V |
| Application Handler |
| | |
| V |
| [favorites.Store.Create] is called with user ID, name, URL, description |
| | |
| V |
| [sqlfavoritestore.FavoriteStore.Create] |
| | |
| V |
| Generates current timestamp for LastModified |
| | |
| V |
| Executes INSERT SQL statement: |
| INSERT INTO Favorites (user_id, name, url, description, last_modified) VALUES (...) |
| | |
| V |
| Database stores the new favorite record |
| | |
| V |
| Returns success/error to Application Handler |
| ``` |
| |
| **2. Listing User's Favorites:** |
| |
| ``` |
| User navigates to "My Favorites" page |
| | |
| V |
| Application Handler |
| | |
| V |
| [favorites.Store.List] is called with the current user's ID |
| | |
| V |
| [sqlfavoritestore.FavoriteStore.List] |
| | |
| V |
| Executes SELECT SQL statement: |
| SELECT id, name, url, description FROM Favorites WHERE user_id=$1 |
| | |
| V |
| Database returns rows matching the user ID |
| | |
| V |
| [sqlfavoritestore.FavoriteStore.List] scans rows into []*favorites.Favorite |
| | |
| V |
| Returns list of favorites to Application Handler |
| | |
| V |
| UI displays the list |
| ``` |
| |
| **3. Retrieving a Specific Favorite (e.g., when a user clicks on a favorite to |
| load it):** |
| |
| ``` |
| User clicks on a specific favorite in their list |
| | |
| V |
| Application Handler (obtains favorite ID) |
| | |
| V |
| [favorites.Store.Get] is called with the favorite ID |
| | |
| V |
| [sqlfavoritestore.FavoriteStore.Get] |
| | |
| V |
| Executes SELECT SQL statement: |
| SELECT id, user_id, name, url, description, last_modified FROM Favorites WHERE id=$1 |
| | |
| V |
| Database returns the single matching favorite row |
| | |
| V |
| [sqlfavoritestore.FavoriteStore.Get] scans row into a *favorites.Favorite struct |
| | |
| V |
| Returns the favorite object to Application Handler |
| | |
| V |
| Application uses the `Url` from the favorite object to restore the application state |
| ``` |
| |
| # Module: /go/file |
| |
| The `file` module and its submodules are responsible for providing a unified |
| interface for accessing files from different sources, such as local directories |
| or Google Cloud Storage (GCS). This abstraction allows the Perf ingestion system |
| to treat files consistently regardless of their origin. |
| |
| ## Core Concepts |
| |
| The central idea is to define a `file.Source` interface that abstracts the |
| origin of files. Implementations of this interface are then responsible for |
| monitoring their respective sources (e.g., a GCS bucket via Pub/Sub |
| notifications, or a local directory) and emitting `file.File` objects through a |
| channel when new files become available. |
| |
| The `file.File` struct encapsulates the essential information about a file: its |
| name, an `io.ReadCloser` for its contents, its creation timestamp, and |
| optionally, the associated `pubsub.Message` if the file originated from a GCS |
| Pub/Sub notification. This optional field is crucial for acknowledging the |
| message after successful processing, or nack'ing it if an error occurs, ensuring |
| reliable message handling in a distributed system. |
| |
| ### `file.go` |
| |
| This file defines the core `File` struct and the `Source` interface. |
| |
| - **`File` struct:** Represents a single file. |
| |
| - `Name`: The identifier for the file (e.g., `gs://bucket/object` or a |
| local path). |
| - `Contents`: An `io.ReadCloser` to read the file's content. This design |
| allows for streaming file data, which is memory-efficient, especially |
| for large files. The consumer is responsible for closing this reader. |
| - `Created`: The timestamp when the file was created or last modified |
| (depending on the source). |
| - `PubSubMsg`: A pointer to a `pubsub.Message`. This is populated if the |
| file notification came from a Pub/Sub message (e.g., GCS object change |
| notifications). It's used to `Ack` or `Nack` the message, indicating |
| successful processing or a desire to retry/dead-letter. |
| |
| - **`Source` interface:** Defines the contract for file sources. |
| |
| - `Start(ctx context.Context) (<-chan File, error)`: This method initiates |
| the process of watching for new files. It returns a read-only channel |
| (`<-chan File`) through which `File` objects are sent as they are |
| discovered. The method is designed to be called only once per `Source` |
| instance. This design ensures that the resource setup and monitoring |
| logic (like starting a Pub/Sub subscription listener or initiating a |
| directory walk) is done once. |
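|
| A minimal sketch of these two types, assembled from the descriptions above;
| the comments and exact declarations are illustrative:
|
| ```
| import (
|     "context"
|     "io"
|     "time"
|
|     "cloud.google.com/go/pubsub"
| )
|
| // File describes a single ingestable file.
| type File struct {
|     Name      string          // e.g. "gs://bucket/object" or a local path.
|     Contents  io.ReadCloser   // The consumer must Close this after reading.
|     Created   time.Time       // Creation (or modification) time.
|     PubSubMsg *pubsub.Message // nil unless the file arrived via Pub/Sub.
| }
|
| // Source emits Files as they are discovered.
| type Source interface {
|     // Start may only be called once per Source instance.
|     Start(ctx context.Context) (<-chan File, error)
| }
| ```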
| |
| ## Implementations of `file.Source` |
| |
| ### `dirsource` |
| |
| The `dirsource` submodule provides an implementation of `file.Source` that reads |
| files from a local filesystem directory. |
| |
| - **Purpose:** Primarily intended for testing and demonstration purposes. It |
| allows developers to simulate file ingestion locally without needing to set |
| up GCS or Pub/Sub. |
| - **Mechanism:** |
| - `New(dir string)`: Constructs a `DirSource` for a given directory path. |
| It resolves the path to an absolute path. |
| - `Start(_ context.Context)`: When called, it initiates a `filepath.Walk` |
| over the specified directory. |
| - For each regular file encountered, it opens the file and creates a |
| `file.File` object. |
| - The `ModTime` of the file is used as the `Created` timestamp, which is a |
| known simplification for its intended use cases. |
| - The `file.File` objects are sent to an unbuffered channel. |
| - The channel is closed after the directory walk is complete. |
| - **Limitations:** |
| - It performs a one-time walk of the directory. It does not watch for new |
| files or changes to existing files after the initial walk. |
| - It uses the file's modification time as the creation time. |
| - **Workflow:**
|
|   ```
|   New(directory) -> DirSource instance
|           |
|           V
|   DirSource.Start() --> Goroutine starts
|           |
|           V
|   filepath.Walk(directory)
|           |
|   +----------------------+
|   |                      |
|   V                      V
|   For each file:         For each directory:
|   os.Open(path)            (skip)
|   Create file.File{Name, Contents, ModTime}
|   Send file.File to channel
|           |
|           V
|   Caller receives file.File from channel
|   ```
| |
| ### `gcssource` |
| |
| The `gcssource` submodule implements `file.Source` for files stored in Google |
| Cloud Storage, using Pub/Sub notifications for new file events. |
| |
| - **Purpose:** This is the production-grade implementation for ingesting files |
| from GCS. It's designed to be robust and scalable. |
| - **Mechanism:** |
| - `New(ctx context.Context, instanceConfig *config.InstanceConfig)`: |
| - Initializes GCS and Pub/Sub clients using default application |
| credentials. |
| - Constructs a Pub/Sub subscription name. It can either use a |
| pre-configured `Subscription` name from `instanceConfig` or generate one |
| based on the `Topic` (often adding a suffix like `-prod` or using a |
| round-robin scheme for load distribution if multiple ingester instances |
| are running). |
| - Creates a `sub.Subscription` object to manage receiving messages from |
| the configured Pub/Sub topic/subscription. A key configuration here is |
| `ReceiveSettings.MaxExtension = -1`. This disables automatic ack |
| deadline extension by the Pub/Sub client library. The rationale is that |
| the `gcssource` itself will explicitly `Ack` or `Nack` messages. If |
| automatic extension were enabled and the processing of a file took |
| longer than the extension period, the message might be redelivered while |
| still being processed, leading to duplicate processing or other issues. |
| By disabling it, the ingester has full control over the message |
| lifecycle. |
| - Initializes a `filter.Filter` based on `AcceptIfNameMatches` and |
| `RejectIfNameMatches` regular expressions provided in the |
| `instanceConfig`. This allows for fine-grained control over which files |
| are processed based on their GCS object names. |
| - Determines if dead-lettering is enabled based on the instance |
| configuration. |
| - `Start(ctx context.Context)`: |
| - Creates the output channel for `file.File` objects. |
| - Launches a goroutine that continuously calls |
| `s.subscription.Receive(ctx, s.receiveSingleEventWrapper)`. |
| - The `Receive` method blocks until a message is available or the |
| context is cancelled. |
| - `receiveSingleEventWrapper` is called for each Pub/Sub message. |
| - **File Event Processing (`receiveSingleEventWrapper` and |
| `receiveSingleEvent`):** |
| 1. **Deserialize Event:** The Pub/Sub message `Data` is expected to be a |
| JSON payload describing a GCS object event (specifically, `{"bucket": |
| "...", "name": "..."}`). |
| 2. **Filename Construction:** A `gs://` URI is constructed from the bucket |
| and name. |
| 3. **Filename Filtering:** The `filter.Filter` (configured with regexes) is |
| applied. If the filename is rejected, the message is acked (as there's |
| no point retrying), and processing stops for this event. |
| 4. **Source Prefix Check:** The filename is checked against the `Sources` |
| list in `instanceConfig.IngestionConfig.SourceConfig.Sources`. These are |
| typically `gs://` prefixes. If the filename doesn't match any of these |
| prefixes, it's considered an unexpected file, the message is acked, and |
| processing stops. This ensures that the ingester only processes files |
| from explicitly configured GCS locations. |
| 5. **Fetch GCS Object Attributes:** `obj.Attrs(ctx)` is called to get |
| metadata like the creation time. If this fails (e.g., object deleted |
| between notification and processing, or transient network error), the |
| message is nacked (if dead-lettering is not enabled) or handled by the |
| dead-letter policy, as retrying might succeed. |
| 6. **Stream GCS Object Contents:** `obj.NewReader(ctx)` is called to get an |
| `io.ReadCloser` for the file's content. If this fails, the message is |
| nacked (or dead-lettered). |
| 7. **Send `file.File`:** A `file.File` struct is created with the GCS path, |
| the reader, the `attrs.Created` time, and the original `pubsub.Message`. |
| This `file.File` is sent to the `fileChannel`. |
| 8. **Message Acknowledgement:** |
| - The `receiveSingleEvent` function returns `true` if the initial |
| stages of processing (up to sending to the channel) were successful |
| and the message should be acked from Pub/Sub's perspective (meaning |
| it was valid, filtered appropriately, and the object was |
| accessible). It returns `false` for transient errors where a retry |
| might help (e.g., failing to get object attributes or reader). |
| - The `receiveSingleEventWrapper` then uses this boolean: |
| - If dead-lettering is enabled (`s.deadLetterEnabled`): |
| - If `receiveSingleEvent` returned `false` (transient error or |
| should retry), the message is `Nack()`-ed. This typically sends |
| it to a dead-letter topic if configured, or allows Pub/Sub to |
| redeliver it after a backoff. |
| - If `receiveSingleEvent` returned `true`, the message is _not_ |
| explicitly `Ack()`-ed here. The acknowledgement is deferred to |
| the consumer of the `file.File` (i.e., the ingester). This is a |
| critical design choice: the message is only truly "done" when |
| the file content has been fully processed by the downstream |
| system. |
| - If dead-lettering is _not_ enabled: |
| - If `receiveSingleEvent` returned `true`, the message is |
| `Ack()`-ed. |
| - If `receiveSingleEvent` returned `false`, the message is |
| `Nack()`-ed. |
| - **Key Design Choices:** |
| - **Decoupling from Pub/Sub Ack/Nack:** The `gcssource` itself doesn't |
| always immediately `Ack` messages upon successful GCS interaction. |
| Instead, it passes the `*pubsub.Message` along in the `file.File` |
| struct. This allows the ultimate consumer of the file's content (e.g., |
| the Perf ingestion pipeline) to `Ack` the message only after it has |
| successfully processed and stored the data. This provides end-to-end |
| processing guarantees. If processing fails downstream, the message can |
| be `Nack`-ed, leading to a retry or dead-lettering. |
| - **Filtering:** Multiple layers of filtering (regex-based `filter.Filter` |
| and prefix-based `SourceConfig.Sources`) ensure that only desired files |
| are processed. |
| - **Error Handling:** Distinguishes between errors that warrant an `Ack` |
| (e.g., file explicitly filtered out) and those that warrant a `Nack` |
| (e.g., transient GCS errors), especially when dead-letter queues are in |
| use. |
| - **Scalability:** Uses a configurable number of parallel receivers |
| (`maxParallelReceives`) for Pub/Sub messages, although currently set |
| to 1. This can be tuned for performance. |
| - **Workflow (Simplified):**
|
|   ```
|   New(config) -> GCSSource instance (GCS/PubSub clients, filter initialized)
|           |
|           V
|   GCSSource.Start() --> Goroutine starts PubSub subscription.Receive loop
|           |
|           V
|   PubSub message arrives
|           |
|           V
|   receiveSingleEventWrapper(msg)
|           |
|           V
|   receiveSingleEvent(msg)
|           |
|           +-> Deserialize msg data (JSON: bucket, name) -> Error? Ack, return.
|           |
|           +-> Filter filename (regex) -> Rejected? Ack, return.
|           |
|           +-> Check if filename matches config.Sources prefixes -> No match? Ack, return.
|           |
|           +-> GCS: storageClient.Object(bucket, name).Attrs() -> Error? Nack (retryable), return.
|           |
|           +-> GCS: object.NewReader() -> Error? Nack (retryable), return.
|           |
|           V
|   Create file.File{Name, Contents, Created, PubSubMsg: msg}
|   Send file.File to fileChannel
|           |
|           V
|   Caller receives file.File from channel
|   (Caller later Acks/Nacks msg via file.File.PubSubMsg)
|   ```
| |
| This modular approach to file sourcing makes the Perf ingestion system flexible |
| and easier to test and maintain. New file sources can be added by simply |
| implementing the `file.Source` interface. |
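|
| To illustrate the deferred-ack contract described for `gcssource`, here is a
| hypothetical consumer loop; `process` and the surrounding wiring are
| assumptions, not code from the actual ingester:
|
| ```
| // consume drains a file.Source, acking each Pub/Sub message only after
| // the file body has been fully processed downstream.
| func consume(ctx context.Context, src file.Source) error {
|     files, err := src.Start(ctx)
|     if err != nil {
|         return err
|     }
|     for f := range files {
|         procErr := process(f.Contents) // Hypothetical downstream ingestion.
|         _ = f.Contents.Close()         // The consumer owns the reader.
|         if f.PubSubMsg == nil {
|             continue // e.g. dirsource: nothing to ack.
|         }
|         if procErr != nil {
|             f.PubSubMsg.Nack() // Retry or dead-letter.
|         } else {
|             f.PubSubMsg.Ack() // Done: safe to drop from the subscription.
|         }
|     }
|     return nil
| }
| ```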
| |
| # Module: /go/filestore |
| |
| The `filestore` module provides an abstraction layer for interacting with |
| different file storage systems. It defines a common interface, leveraging Go's |
| `io/fs.FS`, allowing the application to read files regardless of whether they |
| are stored locally or in a cloud storage service like Google Cloud Storage |
| (GCS). This design promotes flexibility and testability by decoupling file |
| access logic from the specific storage implementation. |
| |
| The primary goal is to enable Perf, the performance monitoring system, to |
| seamlessly access data files from various sources. Perf often deals with large |
| datasets and trace files, which might be stored in GCS for scalability and |
| durability or locally during development and testing. By using this module, Perf |
| components can be written to consume data using the standard `fs.FS` interface |
| without needing to know the underlying storage details. |
| |
| Key components: |
| |
| - **`local`**: This submodule provides an implementation of `fs.FS` for the |
| local file system. |
| |
| - **Why**: It's essential for local development, testing, and scenarios |
| where data is directly available on the machine running Perf. |
| - **How**: The `local.New(rootDir string)` function initializes a |
| `filesystem` struct. This struct stores the absolute path to a `rootDir` |
| and uses `os.DirFS(rootPath)` to create an `fs.FS` instance scoped to |
| that directory. When `Open(name string)` is called, it calculates the |
| path relative to `rootDir` and then uses the underlying `os.DirFS` to |
| open the file. This ensures that file access is contained within the |
| specified root directory. |
| - The `local.go` file contains the `filesystem` struct and its methods. |
| The core logic resides in the `New` function for initialization and the |
| `Open` method for file access. `filepath.Abs` and `filepath.Rel` are |
| used to correctly handle and relativize paths. |
| |
| - **`gcs`**: This submodule implements `fs.FS` for Google Cloud Storage. |
| |
| - **Why**: GCS is a common choice for storing large amounts of data in a |
| scalable and accessible manner. Perf relies on GCS for storing trace |
| files and other performance artifacts. |
| - **How**: The `gcs.New(ctx context.Context)` function initializes a |
| `filesystem` struct. It authenticates with GCS using |
| `google.DefaultTokenSource` to obtain an OAuth2 token source and then |
| creates a `*storage.Client`. The `Open(name string)` method expects a |
| GCS URI (e.g., `gs://bucket-name/path/to/file`). It parses this URI into |
| a bucket name and object path using `parseNameIntoBucketAndPath`. Then, |
| it uses the `storage.Client` to get a `*storage.Reader` for the |
| specified object. This reader is wrapped in a custom `file` struct which |
| implements `fs.File`. |
| - The `gcs.go` file defines the `filesystem` struct, which holds the |
| `*storage.Client`, and the `file` struct, which wraps `*storage.Reader`. |
| The `New` function handles GCS client initialization and authentication. |
| The `Open` method is responsible for parsing GCS URIs and obtaining a |
| reader for the object. Notably, the `Stat()` method for `gcs.file` is |
| intentionally not implemented (returns `ErrNotImplemented`) because |
| Perf's current usage patterns do not require it, simplifying the |
| implementation. The `parseNameIntoBucketAndPath` helper function is |
| crucial for translating the GCS URI format into the bucket and object |
| path components required by the GCS client library. |
| |
| **Workflow: Opening a File (Conceptual)** |
| |
| The client code (e.g., a component within Perf) would typically decide which |
| filestore implementation to use based on configuration or the nature of the file |
| path. |
| |
| 1. **Initialization**: |
| |
| - For local files: `fsImpl, err := local.New("/path/to/data/root")` |
| - For GCS files: `fsImpl, err := gcs.New(context.Background())` |
| |
| 2. **File Access**: |
| |
|   - The client calls
|     `file, err := fsImpl.Open("relative/path/to/file.json")` (for local) or
|     `file, err := fsImpl.Open("gs://my-bucket/data/some_trace.json")` (for
|     GCS).
| |
| 3. **Behind the Scenes**: |
| |
|   - **Local**:
|
|     ```
|     local.Open("relative/path/to/file.json")
|             |
|             V
|     Calculates absolute path based on rootDir
|             |
|             V
|     Calls os.DirFS(rootDir).Open("relative/path/to/file.json")
|             |
|             V
|     Returns fs.File (os.File)
|     ```
|
|   - **GCS**:
|
|     ```
|     gcs.Open("gs://my-bucket/data/some_trace.json")
|             |
|             V
|     parseNameIntoBucketAndPath(...) --> "my-bucket", "data/some_trace.json"
|             |
|             V
|     gcsClient.Bucket("my-bucket").Object("data/some_trace.json").NewReader()
|             |
|             V
|     Wraps storage.Reader in gcs.file
|             |
|             V
|     Returns fs.File (gcs.file)
|     ```
| |
| 4. **Reading Data**: |
| |
| - The client can then use the returned `fs.File` (e.g., |
| `file.Read(buffer)`) in a standard way, irrespective of whether it's an |
| `os.File` or a `gcs.file` wrapping a `storage.Reader`. |
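|
| Putting the workflow together, a hypothetical caller might select an
| implementation like this; the import paths and root directory are
| assumptions:
|
| ```
| import (
|     "context"
|     "io/fs"
|     "strings"
|
|     "go.skia.org/infra/perf/go/filestore/gcs"
|     "go.skia.org/infra/perf/go/filestore/local"
| )
|
| // open picks a filestore implementation from the shape of the name.
| func open(ctx context.Context, name string) (fs.File, error) {
|     var fsImpl fs.FS
|     var err error
|     if strings.HasPrefix(name, "gs://") {
|         fsImpl, err = gcs.New(ctx) // GCS-backed fs.FS.
|     } else {
|         fsImpl, err = local.New("/path/to/data/root") // Local fs.FS.
|     }
|     if err != nil {
|         return nil, err
|     }
|     return fsImpl.Open(name)
| }
| ```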
| |
| This abstraction allows Perf to be agnostic to the underlying storage mechanism |
| when reading files, simplifying its data processing pipelines. |
| |
| # Module: /go/frontend |
| |
| The `frontend` module serves as the backbone for the Perf web UI. It's |
| responsible for handling HTTP requests, rendering HTML templates, and |
| interacting with various backend services and data stores to provide a |
| comprehensive performance analysis platform. |
| |
| The design philosophy emphasizes a separation of concerns. The core |
| `frontend.go` file initializes and wires together various components, while the |
| `api` subdirectory houses specific handlers for different categories of user |
| interactions (e.g., alerts, graphs, regressions). This modular approach |
| simplifies development, testing, and maintenance. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`frontend.go`**: |
| |
| - **Initialization (`New`, `initialize`)**: This is the entry point. It |
| sets up logging, metrics, reads configuration (`config.Config`), |
| initializes database connections (TraceStore, AlertStore, |
| RegressionStore, etc.), and establishes connections to external services |
| like Git and potentially Chrome Perf. |
| - **Template Handling (`loadTemplates`, `templateHandler`)**: It loads |
| HTML templates from the `dist` directory (produced by the build system). |
| These templates are Go templates, allowing for dynamic data injection. |
| Snippets for Google Analytics (`googleanalytics.html`) and cookie |
| consent (`cookieconsent.html`) are embedded and can be included in the |
| rendered pages. |
| - **Page Context (`getPageContext`)**: This crucial function generates a |
| JavaScript object (`window.perf`) that is embedded in every HTML page. |
| This object contains configuration values and settings that the |
| client-side JavaScript needs to function correctly, such as API URLs, |
| display preferences, and feature flags. This avoids hardcoding such |
| values in the JavaScript and allows for easier configuration. |
| - **Routing (`GetHandler`, `getFrontendApis`)**: It defines the HTTP |
| routes and associates them with their respective handler functions. This |
| is where `chi` router is configured. It also instantiates and registers |
| all the API handlers from the `api` sub-module. |
| - **Authentication and Authorization (`loginProvider`, |
| `RoleEnforcedHandler`)**: It integrates with an authentication system |
| (e.g., `proxylogin`) to determine user identity and roles. |
| `RoleEnforcedHandler` is a middleware to protect certain endpoints based |
| on user roles. |
| - **Long-Running Task Management (`progressTracker`)**: For operations |
| that might take a significant amount of time (e.g., generating complex |
| data frames for graphs, running regression detection), it uses a |
| `progress.Tracker`. This allows the frontend to initiate a task, return |
| an ID to the client, and let the client poll for status and results, |
| preventing HTTP timeouts for long operations. |
| - _Workflow Example (Frame Request):_ |
| 1. Client POSTs to `/_/frame/start` with query details. |
| 2. `frameStartHandler` creates a `progress` object, adds it to |
| `progressTracker`. |
| 3. A goroutine is launched to process the frame request using |
| `frame.ProcessFrameRequest`. |
| 4. `frameStartHandler` immediately returns the `progress` object's ID. |
| 5. Client polls `/_/status/{id}`. |
| 6. Client fetches results from `/_/frame/results/{id}` (managed by |
| `progressTracker`) once finished. |
| - **Redirections (`gotoHandler`, old URL handlers)**: Handles redirects |
| for old URLs to new ones and provides a `/g/` endpoint to navigate to |
| specific views based on a Git hash. |
| - **Liveness Probe (`liveness`)**: Provides a `/liveness` endpoint that |
| checks the health of critical dependencies (like the database |
| connection) for Kubernetes. |
| |
| - **`api` (subdirectory)**: This directory contains the specific HTTP handlers
|   for various features of Perf. Each API is typically encapsulated in its own
|   file (e.g., `alertsApi.go`, `graphApi.go`) and implements the `FrontendApi`
|   interface, primarily its `RegisterHandlers` method (see the sketch after
|   this list). This design promotes modularity.
| |
| - **`alertsApi.go`**: Manages CRUD operations for alert configurations |
| (`alerts.Alert`). It interacts with `alerts.ConfigProvider` (for |
| fetching configurations, potentially cached) and `alerts.Store` (for |
| persistence). It also handles trying out bug filing and notification |
| sending for alerts. Includes endpoints to list subscriptions and manage |
| dry-run requests for alert configurations. |
| - **`anomaliesApi.go`**: Provides endpoints for fetching anomaly data. It |
| has two modes of operation: |
| - **Legacy (Chromeperf-backed)**: Proxies requests to an external |
| Chromeperf instance for sheriff lists, anomaly lists, and group reports. |
| This was likely an initial integration or for instances that rely on |
| Chromeperf's anomaly detection. The test name cleaning logic |
| (`cleanTestName`) addresses potential incompatibilities in test naming |
| conventions or characters between systems. |
| - **Skia-internal**: Fetches sheriff (subscription) lists and associated |
| alerts directly from the instance's own database (`subscription.Store`, |
| `alerts.Store`). This allows Perf instances to manage their own anomaly |
| data. |
| - **`favoritesApi.go`**: Manages user-specific and instance-wide favorite |
| links. User favorites are stored in `favorites.Store`, while |
| instance-wide favorites can be defined in the main configuration file |
| (`config.Config.Favorites`). It provides endpoints to list, create, |
| delete, and update favorites. |
| - **`graphApi.go`**: Handles requests related to plotting graphs. |
| - **Frame Requests (`frameStartHandler`)**: As described above, this |
| initiates the potentially long process of fetching trace data and |
| constructing a `dataframe.DataFrame`. It uses |
| `dfbuilder.DataFrameBuilder` for this. |
| - **Commit Information (`cidHandler`, `cidRangeHandler`, |
| `shiftHandler`)**: Provides details about specific commits or ranges of |
| commits by interacting with `perfgit.Git`. |
| - **Trace Details (`detailsHandler`, `linksHandler`)**: Fetches raw data |
| or metadata for a specific trace point at a particular commit. This |
| involves reading from `tracestore.TraceStore` and potentially the |
| `ingestedFS` (filesystem where raw ingested data is stored) to get |
| information like associated benchmark links from the original JSON |
| files. |
| - **`pinpointApi.go`**: Facilitates interaction with the Pinpoint |
| bisection service. It allows users to create bisection jobs (to identify |
| the commit that caused a performance regression) or try jobs (to test a |
| patch). It can proxy requests to a legacy Pinpoint service or a newer |
| backend service. |
| - **`queryApi.go`**: Supports the query construction UI. |
| - **Parameter Set (`initpageHandler`, `getParamSet`)**: Provides the |
| initial set of queryable parameters (keys and their possible values) to |
| populate the UI. This uses `psrefresh.ParamSetRefresher` which |
| periodically updates this canonical paramset based on recent data, |
| ensuring the UI reflects available data. |
| - **Query Preflighting/Counting (`countHandler`, |
| `nextParamListHandler`)**: As the user builds a query in the UI, these |
| handlers can estimate the number of matching traces or provide the next |
| relevant parameter values based on the current partial query. This gives |
| users immediate feedback. The `nextParamListHandler` is tailored for UIs |
| where parameter selection is ordered (e.g., Chromeperf's UI). |
| - **`regressionsApi.go`**: Deals with detected regressions. |
| - **Listing/Counting Regressions (`regressionRangeHandler`, |
| `regressionCountHandler`, `alertsHandler`, `regressionsHandler`)**: |
| Fetches regression data from `regression.Store` based on time ranges, |
| alert configurations, or subscriptions. It can filter by user ownership |
| or category. |
| - **Triage (`triageHandler`)**: Allows users (editors) to mark regressions |
| as triaged (e.g., "positive", "negative", "ignored") and associate them |
| with bug reports. If a regression is marked as negative, it can generate |
| a bug report URL using a configurable template. |
| - **Manual Clustering (`clusterStartHandler`)**: Allows users to initiate |
| the regression detection process for a specific query or set of |
| parameters. This is also a long-running operation managed by |
| `progressTracker`. |
| - **Anomaly/Group Redirection (`anomalyHandler`, |
| `alertGroupQueryHandler`)**: Provides redirect URLs to the appropriate |
| graph view for a given anomaly ID or alert group ID from Chromeperf. |
| This involves generating graph shortcuts. |
| - **`sheriffConfigApi.go`**: Handles interactions related to LUCI Config |
| for sheriff configurations. |
| - **Metadata (`getMetadataHandler`)**: Provides metadata to LUCI Config, |
| indicating which configuration files (e.g., `skia-sheriff-configs.cfg`) |
| Perf owns and the URL for validating changes to these files. This is |
| part of an automated config management system. |
| - **Validation (`validateConfigHandler`)**: Receives configuration content |
| from LUCI Config and validates it (e.g., using |
| `sheriffconfig.ValidateContent`). Returns success or a structured error |
| message. |
| - **`shortcutsApi.go`**: Manages the creation and retrieval of shortcuts. |
| - **Key Shortcuts (`keysHandler`)**: Allows storing a set of trace keys |
| (queries) and getting a short ID for them. This is used, for example, by |
| the "Share" button on the explore page. |
| - **Graph Shortcuts (`getGraphsShortcutHandler`, |
| `createGraphsShortcutHandler`)**: Manages shortcuts for more complex |
| graph configurations, which can include multiple queries and formulas. |
| These are used for sharing multi-graph views. |
| - **`triageApi.go`**: Provides endpoints for triaging anomalies, |
| specifically those originating from or managed by Chromeperf. This |
| includes filing new bugs, associating anomalies with existing bugs, and |
| performing actions like ignoring or nudging anomalies. It interacts with |
| `chromeperf.ChromePerfClient` and potentially an |
| `issuetracker.IssueTracker` implementation. |
| - **`userIssueApi.go`**: Manages user-reported issues (Buganizer |
| annotations) associated with specific data points (a trace at a commit). |
| This allows users to link external bug reports directly to performance |
| data points in the UI. It uses `userissue.Store` for persistence. |
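|
| As referenced above, a minimal sketch of the `FrontendApi` contract; the
| method signature is an assumption based on the use of the `chi` router:
|
| ```
| import "github.com/go-chi/chi/v5"
|
| // FrontendApi is implemented by each handler group in the api directory.
| type FrontendApi interface {
|     // RegisterHandlers attaches this API's endpoints to the router.
|     RegisterHandlers(router *chi.Mux)
| }
| ```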
| |
| The overall goal of the `frontend` module is to provide a responsive and |
| informative user interface by efficiently querying and presenting performance |
| data, while also enabling users to configure alerts, triage regressions, and |
| collaborate on performance analysis. The interaction with various stores and |
| services is abstracted to keep the request handling logic focused. |
| |
| # Module: /go/git |
| |
| The `go/git` module provides an abstraction layer for interacting with Git |
| repositories. It is designed to efficiently retrieve and cache commit |
| information, which is essential for performance analysis in Skia Perf. The |
| primary goal is to offer a consistent interface for accessing commit data, |
| regardless of whether the underlying data source is a local Git checkout or a |
| remote Gitiles API. |
| |
| **Design Decisions and Implementation Choices:** |
| |
| - **Database Caching:** To avoid repeated and potentially slow Git operations, |
| commit information is cached in an SQL database. This allows for quick |
| lookups of commit details, commit numbers, and commit ranges. The schema for |
| this database is defined in `/go/git/schema/schema.go`. |
| - **Provider Abstraction:** The module utilizes a `provider.Provider` |
| interface (defined in `/go/git/provider/provider.go`). This allows for |
| different implementations of how Git data is fetched. Currently, two |
| providers are implemented: |
| - `git_checkout`: Interacts with a local Git repository by shelling out to |
| `git` commands. This is suitable for environments where a local checkout |
| is available and preferred. |
| - `gitiles`: Uses the Gitiles API to fetch commit data. This is useful |
| when direct repository access is not feasible or when leveraging |
| Google's infrastructure for Git operations. The choice of provider is |
| determined by the instance configuration, as seen in |
| `/go/git/providers/builder.go`. |
| - **Commit Numbering:** |
| - **Sequential:** By default, the system assigns sequential integer |
| `CommitNumber`s to commits as they are ingested. This provides a simple, |
| ordered way to refer to commits. |
| - **Repo-Supplied:** The system can also be configured to extract commit |
| numbers directly from commit messages using a regular expression |
| (specified in `instanceConfig.GitRepoConfig.CommitNumberRegex`). This is |
| useful for repositories like Chromium that embed a commit position in |
| their messages. The `repoSuppliedCommitNumber` flag in `impl.go` |
| controls this behavior. |
| - **LRU Cache:** In addition to the database cache, an in-memory LRU (Least
|   Recently Used) cache (`cache` in `impl.go`) is used for frequently accessed
|   commit details (`CommitFromCommitNumber`). This further speeds up lookups
|   for commonly requested commits. The size of this cache is defined by
|   `commitCacheSize`. A sketch of this lookup pattern follows this list.
| - **Background Polling:** The `StartBackgroundPolling` method in `impl.go` |
| initiates a goroutine that periodically calls the `Update` method. This |
| ensures that the local database cache stays synchronized with the remote |
| repository. |
| - **SQL Statements:** All SQL queries are predefined as constants in |
| `impl.go`. This helps in organizing and managing the queries. Separate |
| statements are defined for different SQL dialects if needed (e.g., `insert` |
| vs `insertSpanner`). |
| - **Error Handling:** The `BadCommit` constant provides a sentinel value for |
| functions returning `provider.Commit` to indicate an error or an invalid |
| commit. |
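|
| As referenced in the LRU cache item above, here is a sketch of the
| cache-then-database lookup; the struct fields, SQL text, and helper names
| are illustrative assumptions:
|
| ```
| // CommitFromCommitNumber checks the in-memory LRU first and falls back
| // to the Commits table, populating the cache on a miss.
| func (g *Impl) CommitFromCommitNumber(ctx context.Context, n types.CommitNumber) (provider.Commit, error) {
|     if c, ok := g.cache.Get(n); ok {
|         return c.(provider.Commit), nil // Fast path: in-memory hit.
|     }
|     var c provider.Commit
|     if err := g.db.QueryRow(ctx,
|         "SELECT git_hash, commit_time, author, subject FROM Commits WHERE commit_number=$1", n,
|     ).Scan(&c.GitHash, &c.Timestamp, &c.Author, &c.Subject); err != nil {
|         return BadCommit, skerr.Wrap(err)
|     }
|     g.cache.Add(n, c) // Remember it for subsequent lookups.
|     return c, nil
| }
| ```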
| |
| **Key Responsibilities and Components:** |
| |
| - **`interface.go` (Git Interface):** |
| - Defines the `Git` interface, which is the public contract for this |
| module. It specifies all the operations that can be performed to |
| retrieve commit information. |
| - This interface decouples the consumers of Git data from the specific |
| implementation details (e.g., whether data comes from a local repo or |
| Gitiles). |
| - **`impl.go` (Git Implementation):** |
| - Contains the `Impl` struct, which is the primary implementation of the |
| `Git` interface. |
| - **Data Synchronization (`Update` method):** This is a crucial method |
| responsible for fetching new commits from the configured |
| `provider.Provider` and storing them in the SQL database. It determines |
| the last known commit and fetches all subsequent commits. |
| - If `repoSuppliedCommitNumber` is true, it parses the commit number from |
| the commit body using `commitNumberRegex`. |
| - It handles potential race conditions where multiple services might try |
| to update simultaneously by checking if a commit already exists before |
| insertion. |
| - **Commit Retrieval Methods:** Implements various methods for fetching |
| commit data, such as: |
| - `CommitNumberFromGitHash`: Retrieves the sequential `CommitNumber` for a |
| given Git hash. |
| - `CommitFromCommitNumber`: Retrieves the full `provider.Commit` details |
| for a given `CommitNumber`. Uses the LRU cache. |
| - `CommitNumberFromTime`: Finds the `CommitNumber` closest to (but not |
| after) a given timestamp. |
| - `CommitSliceFromTimeRange`, `CommitSliceFromCommitNumberRange`: Fetches |
| slices of commits based on time or commit number ranges. |
| - `GitHashFromCommitNumber`: Retrieves the Git hash for a given |
| `CommitNumber`. |
| - `PreviousGitHashFromCommitNumber`, |
| `PreviousCommitNumberFromCommitNumber`: Finds the Git hash or commit |
| number of the commit immediately preceding a given commit number. |
| - `CommitNumbersWhenFileChangesInCommitNumberRange`: Identifies commit |
| numbers within a range where a specific file was modified. This involves |
| converting commit numbers to hashes and then querying the |
| `provider.Provider`. |
| - **URL Generation (`urlFromParts`):** Constructs a URL to view a specific |
| commit, respecting configurations like `DebouceCommitURL` or custom |
| `CommitURL` formats. |
| - **Metrics:** Collects various metrics (e.g., `updateCalled`, |
| `commitNumberFromGitHashCalled`) to monitor the usage and performance of |
| different operations. |
| - **`provider/provider.go` (Provider Interface and Commit Struct):** |
| - Defines the `provider.Provider` interface, which abstracts the source of |
| Git commit data. Implementations of this interface (like `git_checkout` |
| and `gitiles`) handle the actual fetching of data. |
| - Defines the `provider.Commit` struct, which is the standard |
| representation of a commit used throughout the `go/git` module and its |
| providers. It includes fields like `GitHash`, `Timestamp`, `Author`, |
| `Subject`, and `Body`. The `Body` is particularly important when |
| `repoSuppliedCommitNumber` is true, as it's parsed to extract the commit |
| number. |
| - **`providers/builder.go` (Provider Factory):** |
| - Contains the `New` function, which acts as a factory for creating |
| `provider.Provider` instances based on the |
| `instanceConfig.GitRepoConfig.Provider` setting. This allows the system |
| to dynamically choose between `git_checkout` or `gitiles` (or |
| potentially other future providers). |
| - **`providers/git_checkout/git_checkout.go` (CLI Git Provider):** |
| - Implements `provider.Provider` by executing `git` command-line |
| operations. |
| - Handles cloning the repository if it doesn't exist. |
| - Manages Git authentication (e.g., via Gerrit) if configured. |
| - `CommitsFromMostRecentGitHashToHead`: Uses `git rev-list` to get commit |
| information. |
| - `GitHashesInRangeForFile`: Uses `git log` to find changes to a specific |
| file. |
| - `parseGitRevLogStream`: A helper function to parse the output of `git |
| rev-list --pretty`. |
| - **`providers/gitiles/gitiles.go` (Gitiles Provider):** |
| - Implements `provider.Provider` by interacting with a Gitiles API |
| endpoint. |
| - `CommitsFromMostRecentGitHashToHead`: Uses `gr.LogFnBatch` to fetch |
| commits in batches. It handles logic for main branches versus other |
| branches and respects the `startCommit`. |
| - `GitHashesInRangeForFile`: Uses `gr.Log` with appropriate path |
| filtering. |
| - `Update` is a no-op for Gitiles as the API always provides the latest |
| data. |
| - **`schema/schema.go` (Database Schema):** |
| - Defines the `Commit` struct with SQL annotations, representing the |
| structure of the `Commits` table in the database. This table stores the |
| cached commit information. |
| - **`gittest/gittest.go` (Test Utilities):** |
| - Provides helper functions (`NewForTest`) for setting up test |
| environments. This includes creating a temporary Git repository, |
| populating it with commits, and initializing a test database. This is |
| crucial for writing reliable unit and integration tests for the `go/git` |
| module and its components. |
| - **`mocks/Git.go` (Mock Implementation):** |
| - Provides a mock implementation of the `Git` interface, generated by |
| `mockery`. This is used in tests of other modules that depend on |
| `go/git`, allowing them to isolate their tests from actual Git |
| operations or database interactions. |
| |
| **Key Workflows:** |
| |
| 1. **Initial Population / Update:** |
| |
| ``` |
| Application -> Impl.Update() |
| | |
| '-> Provider.Update() (e.g., git pull for git_checkout) |
| | |
| '-> Impl.getMostRecentCommit() (from local DB) |
| | |
| '-> Provider.CommitsFromMostRecentGitHashToHead(mostRecentDBHash, ...) |
| | |
| '-> (For each new commit from Provider) |
| | |
| '-> [If repoSuppliedCommitNumber] Impl.getCommitNumberFromCommit(commit.Body) |
| | |
| '-> Impl.CommitNumberFromGitHash(commit.GitHash) (Check if already exists) |
| | |
| '-> DB.Exec(INSERT INTO Commits ...) |
| ``` |
| |
| 2. **Fetching Commit Details by CommitNumber:** |
| |
| ``` |
| Application -> Impl.CommitFromCommitNumber(commitNum) |
| | |
| '-> Check LRU Cache (cache.Get(commitNum)) |
| | | |
| | '-> [If found] Return cached provider.Commit |
| | |
| '-> [If not in LRU] DB.QueryRow(SELECT ... FROM Commits WHERE commit_number=$1) |
| | |
| '-> Construct provider.Commit |
| | |
| '-> Add to LRU Cache (cache.Add(commitNum, commit)) |
| | |
| '-> Return provider.Commit |
| ``` |
| |
| 3. **Finding Commits Where a File Changed:**
|
|    ```
|    Application -> Impl.CommitNumbersWhenFileChangesInCommitNumberRange(beginNum, endNum, file)
|                    |
|                    '-> Impl.PreviousGitHashFromCommitNumber(beginNum) -> beginHash
|                    |   (or Impl.GitHashFromCommitNumber if beginNum is 0 and a start commit is used)
|                    |
|                    '-> Impl.GitHashFromCommitNumber(endNum) -> endHash
|                    |
|                    '-> Provider.GitHashesInRangeForFile(beginHash, endHash, file) -> changedGitHashes[]
|                    |
|                    '-> (For each changedGitHash)
|                    |     |
|                    |     '-> Impl.CommitNumberFromGitHash(changedGitHash) -> commitNum
|                    |     |
|                    |     '-> Add commitNum to result list
|                    |
|                    '-> Return result list
|    ```
| |
| This structure allows Perf to efficiently query and manage Git commit |
| information, supporting its core functionality of tracking performance data |
| across different versions of the codebase. |
| |
| # Module: /go/graphsshortcut |
| |
| The `graphsshortcut` module provides a mechanism for storing and retrieving |
| shortcuts for graph configurations in Perf. Users often define complex sets of |
| graphs for analysis. Instead of redefining these configurations each time or |
| relying on cumbersome URL sharing, this module allows users to save a collection |
| of graph configurations and access them via a unique, shorter identifier. This |
| significantly improves usability and sharing of common graph views. |
| |
| The core idea is to represent a set of graphs, each with its own configuration |
| (queries, formulas, keys), as a `GraphsShortcut` object. This object can then be |
| persisted and retrieved using a `Store` interface. A key design decision is the |
| generation of a unique ID for each `GraphsShortcut`. This ID is a hash (MD5) of |
| the content of the shortcut, ensuring that identical graph configurations will |
| always have the same ID. This also provides a form of de-duplication. To ensure |
| consistent ID generation, the queries and formulas within each graph |
| configuration are sorted alphabetically before hashing. However, the order of |
| the `GraphConfig` objects within a `GraphsShortcut` _does_ affect the generated |
| ID. |
| |
| ``` |
| User defines graph configurations --> [GraphsShortcut object] -- InsertShortcut --> [Store] --> Generates ID (MD5 hash) --> Persists (ID, Shortcut) |
| ^ |
| | |
| User provides ID -------------------> [Store] -- GetShortcut --------+------> [GraphsShortcut object] --> Display Graphs |
| ``` |
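|
| A minimal sketch of `GetID` under these rules; the `Graphs` field name and
| the exact serialization fed to the hash are assumptions, but the
| sort-then-hash structure follows the description above:
|
| ```
| import (
|     "crypto/md5"
|     "fmt"
|     "sort"
|     "strings"
| )
|
| // GetID hashes sorted copies of each graph's queries and formulas (so
| // their internal order never changes the ID) while preserving the order
| // of the GraphConfigs themselves, which does affect the ID.
| func (g GraphsShortcut) GetID() string {
|     h := md5.New()
|     for _, cfg := range g.Graphs {
|         queries := append([]string(nil), cfg.Queries...)
|         formulas := append([]string(nil), cfg.Formulas...)
|         sort.Strings(queries)
|         sort.Strings(formulas)
|         fmt.Fprintln(h, strings.Join(queries, "\n"), strings.Join(formulas, "\n"), cfg.Keys)
|     }
|     return fmt.Sprintf("%x", h.Sum(nil))
| }
| ```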
| |
| ### Key Components: |
| |
| - **`graphsshortcut.go`**: This file defines the central data structures and |
| the `Store` interface. |
| |
| - `GraphConfig`: Represents the configuration for a single graph. It |
| contains: |
| - `Queries`: A slice of strings, where each string represents a query used |
| to fetch data for the graph. |
| - `Formulas`: A slice of strings, representing any formulas applied to the |
| data. |
| - `Keys`: A string, likely representing a pre-selected set of traces or |
| keys to focus on. |
| - `GraphsShortcut`: This is the primary object that is stored and |
| retrieved. It's essentially a list of `GraphConfig` objects. |
| - `GetID()`: A method on `GraphsShortcut` that calculates a unique MD5 |
| hash based on its content. This method is crucial for identifying and |
| de-duplicating shortcuts. It sorts queries and formulas within each |
| `GraphConfig` before hashing to ensure that the order of these internal |
| elements doesn't change the ID. |
| - `Store`: An interface defining the contract for persisting and |
| retrieving `GraphsShortcut` objects. It has two methods: |
| - `InsertShortcut`: Takes a `GraphsShortcut` and stores it, returning its |
| generated ID. |
| - `GetShortcut`: Takes an ID and returns the corresponding |
| `GraphsShortcut`. |
| |
| - **`graphsshortcutstore/`**: This subdirectory contains implementations of |
| the `graphsshortcut.Store` interface. |
| |
| - **`graphsshortcutstore.go` (`GraphsShortcutStore`)**: This provides an |
| SQL-backed implementation of the `Store`. |
| - **Why SQL?**: SQL databases offer robust, persistent storage suitable |
| for production environments where data integrity and concurrent access |
| are important. |
| - **How it works**: |
| - It uses a connection pool (`sql.Pool`) to manage database |
| connections. |
| - `InsertShortcut`: Marshals the `GraphsShortcut` object into JSON and |
| stores it as a string in the `GraphsShortcuts` table along with its |
| pre-computed ID. It uses `ON CONFLICT (id) DO NOTHING` to avoid |
| errors if the same shortcut (and thus same ID) is inserted multiple |
| times. |
| - `GetShortcut`: Retrieves the JSON string from the database based on |
| the ID and unmarshals it back into a `GraphsShortcut` object. |
| - **`cachegraphsshortcutstore.go` (`cacheGraphsShortcutStore`)**: This |
| provides an in-memory cache-backed implementation of the `Store`. |
| - **Why a cache implementation?**: This is primarily useful for local |
| development or testing scenarios, especially when connecting to a |
| production database. It allows developers to use features that rely on |
| graph shortcuts (like multigraph) without needing write access (or |
| breakglass permissions) to the production database. The shortcuts are |
| stored locally and ephemerally. |
| - **How it works**: |
| - It utilizes a generic `cache.Cache` client. |
| - `InsertShortcut`: Marshals the `GraphsShortcut` to JSON and stores |
| it in the cache using the shortcut's ID as the cache key. |
| - `GetShortcut`: Retrieves the JSON string from the cache by ID and |
| unmarshals it. |
| - **`schema/schema.go`**: Defines the SQL table schema for |
| `GraphsShortcuts`. The table primarily stores the `id` (TEXT, PRIMARY |
| KEY) and the `graphs` (TEXT, storing the JSON representation of the |
| `GraphsShortcut`). |
| |
| - **`graphsshortcuttest/graphsshortcuttest.go`**: This file provides a suite |
| of common tests that can be run against any implementation of the |
| `graphsshortcut.Store` interface. |
| |
| - **Why shared tests?**: This promotes consistency and ensures that all |
| store implementations adhere to the same contract. It makes it easier to |
| add new store implementations and verify their correctness. |
| - **Key Tests**: |
| - `InsertGet`: Verifies that a shortcut can be inserted and then |
| retrieved, and that the retrieved shortcut is identical to the original |
| (accounting for sorted queries/formulas). |
| - `GetNonExistent`: Ensures that attempting to retrieve a shortcut with an |
| unknown ID results in an error. |
| |
| - **`mocks/Store.go`**: This file contains a mock implementation of the |
| `graphsshortcut.Store` interface, generated by the `testify/mock` library. |
| |
| - **Why mocks?**: Mocks are essential for unit testing components that |
| depend on the `Store` interface without needing a real database or |
| cache. They allow for controlled testing of different scenarios, such as |
| simulating errors from the store. |
| |
| In summary, the `graphsshortcut` module provides a flexible way to save and |
| share complex graph views by defining a clear data structure (`GraphsShortcut`), |
| a standardized way to identify them (`GetID`), and an interface (`Store`) for |
| various persistence mechanisms, with current implementations for SQL databases |
| and in-memory caches. |
| |
| # Module: /go/ingest |
| |
| The `/go/ingest` module is responsible for the entire process of taking |
| performance data files, parsing them, and storing the data into a trace store. |
| This involves identifying the format of the input file, extracting relevant |
| measurements and metadata, associating them with specific commits, and then |
| writing this information to the configured data storage backend. |
| |
| A key design principle is to support multiple ingestion file formats and to be |
| resilient to errors in individual files. The system attempts to parse files in a |
| specific order, falling back to legacy formats if the primary parsing fails. |
| This allows for graceful upgrades of the ingestion format over time without |
| breaking existing data producers. |
| |
| The ingestion process also handles trybot data, extracting issue and patchset |
| information, which is crucial for pre-submit performance analysis. |
| |
| ## Key Components and Files |
| |
| ### `/go/ingest/filter/filter.go` |
| |
| This component provides a mechanism to selectively process or ignore input files |
| based on their names using regular expressions. |
| |
| **Why:** In many scenarios, not all files in a data source are relevant for |
| performance analysis. For example, temporary files, logs, or files matching |
| specific patterns might need to be excluded. This filter allows for fine-grained |
| control over which files are ingested. |
| |
| **How:** |
| |
| - It uses two regular expressions: `accept` and `reject`. |
| - An `accept` regex, if provided, means only filenames matching this regex |
| will be considered for processing. If empty, all files are initially |
| accepted. |
| - A `reject` regex, if provided, means any filename matching this regex will |
| be ignored, even if it matched the `accept` regex. If empty, no files are |
| rejected based on this rule. |
| - The `Reject(name string) bool` method implements this logic: a file is |
| rejected if it _doesn't_ match the `accept` regex (if one is provided) OR if |
| it _does_ match the `reject` regex (if one is provided). |
| |
| **Workflow:** |
| |
| ``` |
| File Name -> Filter.Reject() |
| | |
| +-- accept_regex_exists? -- Yes -> name_matches_accept? -- No -> REJECT |
| | | |
| | +-------------------------- Yes --+ |
| +----------------------------- No -----------------------------+ |
| | |
| V |
| reject_regex_exists? -- Yes -> name_matches_reject? -- Yes -> REJECT |
| | | |
| | +-- No --+ |
| +----------------------------- No -----+ |
| | |
| V |
| ACCEPT |
| ``` |
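|
| The same logic as a minimal Go sketch (the field names are assumptions):
|
| ```
| // Reject reports whether the named file should be ignored: it fails the
| // accept regex (when one is configured) or matches the reject regex
| // (when one is configured).
| func (f *Filter) Reject(name string) bool {
|     if f.accept != nil && !f.accept.MatchString(name) {
|         return true
|     }
|     if f.reject != nil && f.reject.MatchString(name) {
|         return true
|     }
|     return false
| }
| ```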
| |
| ### `/go/ingest/format/format.go` and `/go/ingest/format/legacyformat.go` |
| |
| These files define the structure of the data files that the ingestion system can |
| understand. `format.go` defines the current standard format (Version 1), while |
| `legacyformat.go` defines an older format primarily used by nanobench. |
| |
| **Why:** A well-defined input format is essential for reliable data ingestion. |
| Versioning allows the format to evolve while maintaining backward compatibility |
| or clear error handling for older, unsupported versions. The current format |
| (`Format` struct) is designed to be flexible, allowing for common metadata (like |
| git hash, issue/patchset), global key-value pairs applicable to all results, and |
| a list of individual results. Each result can have its own set of keys and |
| either a single measurement or a map of "sub-measurements" (e.g., min, max, |
| median for a single test). This structure allows for rich and varied performance |
| data to be represented. The legacy format (`BenchData`) exists to support older |
| systems that still produce data in that schema. |
| |
| **How:** |
| |
| - **`format.go` (Version 1):** |
| - `Format` struct: The top-level structure. Contains `Version`, `GitHash`, |
| optional trybot info (`Issue`, `Patchset`), a global `Key` map, a slice |
| of `Result` structs, and global `Links`. |
| - `Result` struct: Represents one or more measurements. It has its own |
| `Key` map (which gets merged with the global `Key`), and critically, |
| either a single `Measurement` (float32) or a `Measurements` map. |
| - `SingleMeasurement` struct: Used within `Measurements` map. It allows |
| associating a `value` (e.g., "min", "median") with a `Measurement` |
| (float32) and optional `Links`. This is how multiple metrics for a |
| single conceptual test run are represented. |
| - `Parse(r io.Reader)`: Decodes JSON data from a reader into a `Format` |
| struct. It specifically checks `fileFormat.Version == |
| FileFormatVersion`. |
| - `Validate(r io.Reader)`: Uses a JSON schema (`formatSchema.json`) to |
| validate the structure of the input data. This ensures that incoming |
| files adhere to the expected contract, preventing malformed data from |
| causing issues downstream. |
| - `GetLinksForMeasurement(traceID string)`: Retrieves links associated |
| with a specific measurement, combining global links with |
| measurement-specific ones. |
| - **`legacyformat.go`:** |
| - `BenchData` struct: Defines the older nanobench format. It has fields |
| like `Hash`, `Issue`, `PatchSet`, `Key`, `Options`, and `Results`. The |
| `Results` are nested maps leading to `BenchResult`. |
| - `BenchResult`: A map representing individual test results, typically |
| `map[string]interface{}` where values are float64s, except for an |
| "options" key. |
| - `ParseLegacyFormat(r io.Reader)`: Decodes JSON data into a `BenchData` |
| struct. |
| |
| The system will first attempt to parse an input file using `format.Parse`. If |
| that fails (e.g., due to a version mismatch or JSON parsing error), it may then |
| attempt to parse it using `format.ParseLegacyFormat` as a fallback. |
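|
| A hypothetical minimal file in the Version 1 format, with property names
| inferred from the struct descriptions above; `formatSchema.json` remains the
| authoritative contract:
|
| ```
| {
|   "version": 1,
|   "git_hash": "8dcc84f7dc8523dd90501a4feb1f632808337c34",
|   "key": { "arch": "x86", "config": "8888" },
|   "results": [
|     {
|       "key": { "test": "draw_a_circle", "units": "ms" },
|       "measurements": {
|         "stat": [
|           { "value": "min", "measurement": 1.2 },
|           { "value": "median", "measurement": 1.5 }
|         ]
|       }
|     }
|   ]
| }
| ```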
| |
| ### `/go/ingest/format/formatSchema.json` |
| |
| This file contains the JSON schema definition for the `Format` struct defined in |
| `format.go`. |
| |
| **Why:** A JSON schema provides a formal, machine-readable definition of the |
| expected data structure. This is used for validation, ensuring that ingested |
| files conform to the specified format. This helps catch errors early and |
| provides clear feedback on what is wrong with a non-conforming file. |
| |
| **How:** It's a standard JSON Schema file. The `format.Validate` function uses |
| this schema to check the structure and types of the fields in an incoming JSON |
| file. The schema is embedded into the Go binary. |
| |
| ### `/go/ingest/format/generate/main.go` |
| |
| This is a utility program used to automatically generate `formatSchema.json` |
| from the Go `Format` struct definition. |
| |
| **Why:** Manually keeping a JSON schema synchronized with Go struct definitions |
| is error-prone. This generator ensures that the schema always accurately |
| reflects the Go types. |
| |
| **How:** It uses the `go.skia.org/infra/go/jsonschema` library, which can |
| reflect on Go structs and produce a corresponding JSON schema. The |
| `//go:generate` directive in the file allows this program to be run easily |
| (e.g., via `go generate`). |
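| |
| As an illustration of the reflection-based approach, here is a hedged sketch |
| using the community `github.com/invopop/jsonschema` reflector rather than the |
| `go.skia.org/infra/go/jsonschema` helper the real generator uses: |
| |
| ``` |
| package main |
| |
| //go:generate go run . |
| |
| import ( |
|     "encoding/json" |
|     "fmt" |
| |
|     "github.com/invopop/jsonschema" |
| ) |
| |
| // Format is a stand-in for the real struct defined in format.go. |
| type Format struct { |
|     Version int               `json:"version"` |
|     GitHash string            `json:"git_hash"` |
|     Key     map[string]string `json:"key,omitempty"` |
| } |
| |
| func main() { |
|     // Reflect over the Go type and emit the corresponding JSON schema. |
|     schema := jsonschema.Reflect(&Format{}) |
|     b, _ := json.MarshalIndent(schema, "", "  ") |
|     // The real generator writes this out to formatSchema.json. |
|     fmt.Println(string(b)) |
| } |
| ``` |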
| |
| ### `/go/ingest/parser/parser.go` |
| |
| This is the core component responsible for taking an input file (as |
| `file.File`), attempting to parse it using the defined formats, and extracting |
| the performance data into a standardized intermediate representation. |
| |
| **Why:** This component decouples the specifics of file formats from the process |
| of writing data to the trace store. It handles the logic of trying different |
| parsers, extracting common information like Git hashes and trybot details, and |
| transforming the data into lists of parameter maps (`paramtools.Params`) and |
| corresponding measurement values (`float32`). It also enforces rules like branch |
| name filtering and parameter key/value validation. |
| |
| **How:** |
| |
| - **`New(...)`**: Initializes a `Parser` with instance-specific |
| configurations, such as recognized branch names and a regex for invalid |
| characters in parameter keys/values. |
| - **`Parse(ctx context.Context, file file.File)`**: This is the main entry |
| point for processing a regular data file. |
| 1. It first attempts to parse the file using `extractFromVersion1File` |
| (which uses `format.Parse`). |
| 2. If that fails, it falls back to `extractFromLegacyFile` (which uses |
| `format.ParseLegacyFormat`). |
| 3. It checks if the branch name (if present in the file's common keys) is |
| in the allowed list. If not, it returns `ErrFileShouldBeSkipped`. |
| 4. It ensures that the extracted parameter keys and values are valid, |
| potentially modifying them using `query.ForceValidWithRegex` based on |
| the `invalidParamCharRegex` from the instance configuration. This is |
| crucial because trace IDs (which are derived from these parameters) |
| often have restrictions on allowed characters. |
| 5. Returns `params` (a slice of `paramtools.Params`), `values` (a slice of |
| `float32`), the `gitHash`, any global `links` from the file, and an |
| error. |
| - **`ParseTryBot(file file.File)`**: A specialized function to extract only |
| the `Issue` and `Patchset` information from a file, trying both V1 and |
| legacy formats. This is likely used for systems that only need to identify |
| the tryjob associated with a file without processing all the measurement |
| data. |
| - **`ParseCommitNumberFromGitHash(gitHash string)`**: Extracts an integer |
| commit number from a specially formatted git hash string (e.g., "CP:12345" |
| -> 12345). This supports systems that use such commit identifiers. |
| - Helper functions like `getParamsAndValuesFromLegacyFormat` and |
| `getParamsAndValuesFromVersion1Format` do the actual work of traversing the |
| parsed file structures (`BenchData` or `Format`) and flattening them into |
| the `params` and `values` slices. |
| - For the V1 format, it iterates through `f.Results`. If a `Result` has a |
| single `Measurement`, it combines `f.Key` and `result.Key` to form the |
| `paramtools.Params`. |
| - If a `Result` has `Measurements` (a map of `string` to |
| `[]SingleMeasurement`), it iterates through this map. For each entry, it |
| takes the map's key and the `Value` from `SingleMeasurement` to add more |
| key-value pairs to the `paramtools.Params`. |
| - **`GetSamplesFromLegacyFormat(b *format.BenchData)`**: Extracts raw sample |
| data (if present) from the legacy format. This seems to be for specific use |
| cases where individual sample values, rather than just aggregated metrics, |
| are needed. |
| |
| **Key Workflow (Simplified `Parse`):** |
| |
| ``` |
| Input: file.File |
| Output: ([]paramtools.Params, []float32, gitHash, links, error) |
| |
| 1. Read file contents. |
| 2. Attempt Parse as Version 1 Format: |
| `f, err := format.Parse(contents)` |
| If success: |
| `params, values := getParamsAndValuesFromVersion1Format(f, p.invalidParamCharRegex)` |
| `gitHash = f.GitHash` |
| `links = f.Links` |
| `commonKeys = f.Key` |
| Else (error): |
| Reset reader. |
| Attempt Parse as Legacy Format: |
| `benchData, err := format.ParseLegacyFormat(contents)` |
| If success: |
| `params, values := getParamsAndValuesFromLegacyFormat(benchData)` |
| `gitHash = benchData.Hash` |
| `links = nil` (legacy format doesn't have global links in the same way) |
| `commonKeys = benchData.Key` |
| Else (error): |
| Return error. |
| |
| 3. `branch, ok := p.checkBranchName(commonKeys)` |
| If !ok: |
| Return `ErrFileShouldBeSkipped`. |
| |
| 4. If len(params) == 0: |
| Return `ErrFileShouldBeSkipped`. |
| |
| 5. Return `params, values, gitHash, links, nil`. |
| ``` |
| |
| ### `/go/ingest/process/process.go` |
| |
| This component orchestrates the entire ingestion pipeline. It takes files from a |
| source (e.g., a directory, GCS bucket), uses the `parser` to extract data, |
| interacts with `git` to resolve commit information, and then writes the |
| processed data to a `tracestore.TraceStore` and `tracestore.MetadataStore`. It |
| also handles sending Pub/Sub events for ingested files. |
| |
| **Why:** This provides the high-level control flow for ingestion. It manages |
| concurrency (multiple worker goroutines), error handling at a macro level |
| (retries for writing to the store), and integration with external systems like |
| Git and Pub/Sub. |
| |
| **How:** |
| |
| - **`Start(...)`**: |
| 1. Initializes tracing, Pub/Sub client (if a topic is configured), the |
| `file.Source` (to get files), the `tracestore.TraceStore` and |
| `tracestore.MetadataStore` (to write data), and `perfgit.Git` (to map |
| git hashes to commit numbers). |
| 2. Starts a number of `worker` goroutines specified by |
| `numParallelIngesters`. |
| 3. Each `worker` listens on a channel provided by the `file.Source`. |
| - **`worker(...)`**: |
| 1. Creates a `parser.Parser` instance. |
| 2. Enters a loop, receiving `file.File` objects from the channel. |
| 3. For each file, it calls `workerInfo.processSingleFile`. |
| - **`workerInfo.processSingleFile(f file.File)`**: This is the heart of the |
| per-file processing. |
| 1. Increments metrics for files received. |
| 2. Calls `p.Parse(ctx, f)` to get `params`, `values`, `gitHash`, and |
| `fileLinks`. |
| 3. Handles errors from `Parse`: |
| - If `parser.ErrFileShouldBeSkipped`, acks the Pub/Sub message (if |
| any) and skips. |
| - For other parsing errors, increments metrics and nacks the Pub/Sub |
| message (if dead-lettering is enabled, allowing for retries or |
| manual inspection). |
| 4. If `gitHash` is empty, logs an error and nacks. |
| 5. If the Git repo supplies commit numbers directly (e.g. "CP:12345"), it |
| calls `p.ParseCommitNumberFromGitHash`. |
| 6. Calls `g.GetCommitNumber(ctx, gitHash, commitNumberFromFile)` to resolve |
| the `gitHash` (or verify the supplied commit number) against the Git |
| repository. It includes logic to update the local Git repository clone |
| if the hash isn't initially found. If the commit cannot be resolved, it |
| logs an error, acks the Pub/Sub message (as retrying won't help for an |
| unknown commit), and skips. |
| 7. Builds a `paramtools.ParamSet` from all the extracted `params`. |
| 8. Writes the data to the `tracestore.TraceStore` using `store.WriteTraces` |
| or `store.WriteTraces2` (depending on |
| `instanceConfig.IngestionConfig.TraceValuesTableInlineParams`). This |
| involves retries in case of transient store errors. |
| - `WriteTraces2` suggests an optimized path where some parameter data |
| might be stored directly with trace values, potentially for |
| performance reasons. |
| 9. If writing fails after retries, increments metrics and nacks. |
| 10. If writing succeeds, acks the Pub/Sub message and increments success |
| metrics. |
| 11. Calls `sendPubSubEvent` to publish information about the ingested file |
| (trace IDs, paramset, filename) to a configured Pub/Sub topic. This |
| allows other services to react to new data ingestion. |
| 12. If `fileLinks` were present in the input, it calls |
| `metadataStore.InsertMetadata` to store these links. |
| - **`sendPubSubEvent(...)`**: If a `FileIngestionTopicName` is configured, |
| this function constructs an `ingestevents.IngestEvent` containing the trace |
| IDs, the overall `ParamSet` for the file, and the filename. It then |
| publishes this event to the specified Pub/Sub topic. |
| |
| **Overall Ingestion Workflow:** |
| |
| ``` |
| File Source (e.g., GCS bucket watcher) |
| | |
| v |
| [ file.File channel ] -> Worker Goroutine(s) |
| | |
| v |
| processSingleFile(file) |
| | |
| +--------------------------+--------------------------+ |
| | | | |
| v v v |
| Parser.Parse(file) --> Git.GetCommitNumber(hash) --> TraceStore.WriteTraces(...) |
| | ^ | | ^ |
| | | (if parsing fails)| | | (retries) |
| | +-------------------| (update repo if needed) | | |
| | | | | |
| +-----> ParamSet Creation +--------------------------+ | |
| | | |
| v | |
| sendPubSubEvent (if success) ------------------------------+ |
| | |
| v |
| MetadataStore.InsertMetadata (if links exist) |
| ``` |
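| |
| The worker fan-out can be sketched roughly as follows; the types and function |
| names here are simplified stand-ins, not the actual `process` package API: |
| |
| ``` |
| package main |
| |
| import ( |
|     "context" |
|     "fmt" |
|     "sync" |
| ) |
| |
| // File is a simplified stand-in for file.File. |
| type File struct{ Name string } |
| |
| // processSingleFile stands in for workerInfo.processSingleFile: parse the |
| // file, resolve the commit, write traces, then ack/nack the Pub/Sub message. |
| func processSingleFile(ctx context.Context, f File) error { |
|     fmt.Println("ingesting", f.Name) |
|     return nil |
| } |
| |
| func main() { |
|     const numParallelIngesters = 4 // The config knob described above. |
|     ctx := context.Background() |
|     files := make(chan File) |
| |
|     var wg sync.WaitGroup |
|     for i := 0; i < numParallelIngesters; i++ { |
|         wg.Add(1) |
|         go func() { // One worker goroutine per ingester. |
|             defer wg.Done() |
|             for f := range files { |
|                 if err := processSingleFile(ctx, f); err != nil { |
|                     fmt.Println("ingest failed:", err) |
|                 } |
|             } |
|         }() |
|     } |
| |
|     files <- File{Name: "gs://bucket/nanobench.json"} |
|     close(files) |
|     wg.Wait() |
| } |
| ``` |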
| |
| This architecture allows for robust and scalable ingestion of performance data |
| from various sources and formats, with clear separation of concerns between |
| parsing, data transformation, Git interaction, and storage. The use of Pub/Sub |
| facilitates downstream processing and real-time reactions to newly ingested |
| data. |
| |
| # Module: /go/ingestevents |
| |
| The `ingestevents` module is designed to facilitate the communication of |
| ingestion completion events via PubSub. This is a critical part of the |
| event-driven alerting system within Perf, where the completion of data ingestion |
| for a file triggers subsequent processes like regression detection in a |
| clusterer. |
| |
| The core of this module revolves around the `IngestEvent` struct. This struct |
| encapsulates the necessary information to be transmitted when a file has been |
| successfully ingested. It includes: |
| |
| - `TraceIDs`: A slice of strings representing all the unencoded trace |
| identifiers found within the ingested file. These IDs are fundamental for |
| identifying the specific data points that have been processed. |
| - `ParamSet`: An unencoded, read-only representation of the |
| `paramtools.ParamSet` that summarizes the `TraceIDs`. This provides a |
| consolidated view of the parameters associated with the ingested traces. |
| - `Filename`: The name of the file that was ingested. This helps in tracking |
| the source of the ingested data. |
| |
| To handle the transmission of `IngestEvent` data over PubSub, the module |
| provides two key functions: |
| |
| - `CreatePubSubBody`: This function takes an `IngestEvent` struct as input and |
| prepares it for PubSub transmission. The "how" here involves a two-step |
| process: |
| |
| 1. The `IngestEvent` is first encoded into a JSON format. This provides a |
| structured and widely compatible representation of the data. |
| 2. The resulting JSON data is then compressed using gzip. The "why" for |
| this step is to ensure that the message size stays within the PubSub |
| message size limits (currently 10MB). This is particularly important |
| when dealing with files that contain a large number of traces, as the |
| raw JSON representation could exceed the limit. The function returns the |
| gzipped JSON data as a byte slice. |
| |
| ``` |
| IngestEvent (struct) ---> JSON Encoding ---> Gzip Compression ---> []byte (for PubSub) |
| ``` |
| |
| - `DecodePubSubBody`: This function performs the reverse operation. It takes a |
| byte slice (presumably received from a PubSub message) and decodes it back |
| into an `IngestEvent` struct. The process is: |
| |
| 1. The input byte slice is first decompressed using gzip. |
| 2. The decompressed data, which is expected to be in JSON format, is then |
| decoded into an `IngestEvent` struct. Error handling is incorporated at |
| each step to manage potential issues during decompression or JSON |
| decoding. |
| |
| ``` |
| []byte (from PubSub) ---> Gzip Decompression ---> JSON Decoding ---> IngestEvent (struct) |
| ``` |
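| |
| A minimal sketch of this round trip using only the standard library (the real |
| functions live in `ingestevents.go` and add error wrapping; the `ParamSet` |
| field here is a plain map stand-in for `paramtools.ReadOnlyParamSet`): |
| |
| ``` |
| package main |
| |
| import ( |
|     "bytes" |
|     "compress/gzip" |
|     "encoding/json" |
|     "fmt" |
| ) |
| |
| // IngestEvent mirrors the struct described above. |
| type IngestEvent struct { |
|     TraceIDs []string |
|     ParamSet map[string][]string |
|     Filename string |
| } |
| |
| // createPubSubBody gzips the JSON encoding of an event. |
| func createPubSubBody(ev IngestEvent) ([]byte, error) { |
|     var buf bytes.Buffer |
|     zw := gzip.NewWriter(&buf) |
|     if err := json.NewEncoder(zw).Encode(ev); err != nil { |
|         return nil, err |
|     } |
|     if err := zw.Close(); err != nil { // Flush all compressed bytes. |
|         return nil, err |
|     } |
|     return buf.Bytes(), nil |
| } |
| |
| // decodePubSubBody reverses createPubSubBody. |
| func decodePubSubBody(body []byte) (IngestEvent, error) { |
|     var ev IngestEvent |
|     zr, err := gzip.NewReader(bytes.NewReader(body)) |
|     if err != nil { |
|         return ev, err |
|     } |
|     defer zr.Close() |
|     err = json.NewDecoder(zr).Decode(&ev) |
|     return ev, err |
| } |
| |
| func main() { |
|     ev := IngestEvent{ |
|         TraceIDs: []string{",arch=x86,config=8888,test=draw_a_circle,units=ms,"}, |
|         ParamSet: map[string][]string{"arch": {"x86"}}, |
|         Filename: "gs://bucket/file.json", |
|     } |
|     body, _ := createPubSubBody(ev) |
|     back, _ := decodePubSubBody(body) |
|     fmt.Println(back.Filename) // gs://bucket/file.json |
| } |
| ``` |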
| |
| The primary responsibility of this module is therefore to provide a standardized |
| and efficient way to serialize and deserialize ingestion event information for |
| PubSub communication. The design choice of using JSON for structure and gzip for |
| compression balances readability, interoperability, and an efficient use of |
| PubSub resources. |
| |
| The file `ingestevents.go` contains the definition of the `IngestEvent` struct |
| and the implementation of the `CreatePubSubBody` and `DecodePubSubBody` |
| functions. The corresponding test file, `ingestevents_test.go`, ensures that the |
| encoding and decoding processes work correctly, verifying that an `IngestEvent` |
| can be successfully round-tripped through the serialization and deserialization |
| process. |
| |
| # Module: /go/initdemo |
| |
| The `initdemo` module provides a command-line application designed to initialize |
| a database instance, specifically targeting CockroachDB or a Spanner emulator, |
| for demonstration or development purposes. |
| |
| Its primary purpose is to automate the creation of a named database and the |
| application of the latest database schema. This ensures a consistent and |
| ready-to-use database environment, removing the manual steps often required for |
| setting up a database for applications like Skia Perf. |
| |
| The core functionality revolves around connecting to a specified database URL, |
| attempting to create the database (gracefully handling cases where it already |
| exists), and then executing the appropriate schema definition. The choice of |
| schema (standard SQL or Spanner-specific) is determined by a command-line flag. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`main.go`**: This is the entry point and sole Go source file for the |
| application. |
| - **Flag Parsing**: It defines and parses command-line flags to configure |
| the database connection and behavior. |
| - `--databasename`: Specifies the name of the database to be created |
| (defaults to "demo"). This allows users to customize the database name |
| for different environments or purposes. |
| - `--database_url`: Provides the connection string for the CockroachDB |
| instance (defaults to a local instance |
| `postgresql://root@127.0.0.1:26257/?sslmode=disable`). This allows |
| connection to different database servers or configurations. |
| - `--spanner`: A boolean flag that, when set, instructs the application to |
| use the Spanner-specific schema. This is crucial for ensuring |
| compatibility when targeting a Spanner emulator, which may have |
| different SQL syntax or feature support compared to CockroachDB. |
| - **Database Connection**: It establishes a connection to the database |
| using the `pgxpool` library, which is a PostgreSQL driver and connection |
| pool for Go. This library was chosen for its robustness and performance |
| in handling PostgreSQL-compatible databases like CockroachDB. |
| - **Database Creation**: It attempts to execute a `CREATE DATABASE` SQL |
| statement. The implementation includes error handling to log an |
| informational message if the database already exists, rather than |
| failing, making the script idempotent in terms of database creation. |
| - **Database Selection (CockroachDB specific)**: If not targeting Spanner, |
| it executes `SET DATABASE` to switch the current session's context to |
| the newly created (or existing) database. This is a CockroachDB-specific |
| command. |
| - **Schema Selection**: Based on the `--spanner` flag, it selects the |
| appropriate schema definition. |
| - If `--spanner` is false, it uses `sql.Schema` from the `//perf/go/sql` |
| module, which contains the standard SQL schema for Perf. |
| - If `--spanner` is true, it uses `spanner.Schema` from the |
| `//perf/go/sql/spanner` module, which contains the schema adapted for |
| Spanner. This separation allows maintaining distinct schema versions |
| tailored to the nuances of each database system. |
| - **Schema Application**: It executes the selected schema DDL statements |
| against the connected database. This step creates all the necessary |
| tables, indexes, and other database objects required by the Perf |
| application. |
| - **Connection Closure**: Finally, it closes the database connection pool |
| to release resources. |
| |
| **Workflow:** |
| |
| The typical workflow of the `initdemo` application can be visualized as: |
| |
| ``` |
| 1. Parse Flags: |
|    Application Start -> Read --databasename, --database_url, --spanner |
| |
| 2. Connect to Database: |
|    Use --database_url -> pgxpool.Connect() -> Connection Pool (conn) |
| |
| 3. Create Database: |
|    conn + --databasename -> Execute "CREATE DATABASE <name>" |
|      +-- Success |
|      +-- Error (e.g., already exists) -> Log Info "Database <name> already exists." |
| |
| 4. Set Active Database (if not Spanner): |
|    Is --spanner false? |
|      +-- Yes -> Execute "SET DATABASE <name>" |
|      |            +-- Error -> sklog.Fatal() |
|      +-- No (Spanner enabled) -> Skip this step |
| |
| 5. Select Schema: |
|    Is --spanner true? |
|      +-- Yes -> dbSchema = spanner.Schema |
|      +-- No  -> dbSchema = sql.Schema |
| |
| 6. Apply Schema: |
|    conn + dbSchema -> Execute schema DDL |
|      +-- Error -> sklog.Fatal() |
| |
| 7. Close Connection: |
|    conn.Close() -> Application End |
| ``` |
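| |
| A condensed sketch of the same flow in Go, using the pgx v4 `pgxpool` API |
| mentioned above; logging is simplified and the schema DDL is a placeholder |
| for `sql.Schema` / `spanner.Schema`: |
| |
| ``` |
| package main |
| |
| import ( |
|     "context" |
|     "flag" |
|     "fmt" |
|     "log" |
| |
|     "github.com/jackc/pgx/v4/pgxpool" |
| ) |
| |
| func main() { |
|     databaseName := flag.String("databasename", "demo", "Database to create.") |
|     databaseURL := flag.String("database_url", |
|         "postgresql://root@127.0.0.1:26257/?sslmode=disable", "Connection string.") |
|     spanner := flag.Bool("spanner", false, "Use the Spanner-specific schema.") |
|     flag.Parse() |
| |
|     ctx := context.Background() |
|     conn, err := pgxpool.Connect(ctx, *databaseURL) |
|     if err != nil { |
|         log.Fatal(err) |
|     } |
|     defer conn.Close() |
| |
|     // Create the database; tolerate "already exists" so the tool is idempotent. |
|     if _, err := conn.Exec(ctx, fmt.Sprintf("CREATE DATABASE %s", *databaseName)); err != nil { |
|         log.Printf("Database %s already exists: %v", *databaseName, err) |
|     } |
| |
|     if !*spanner { |
|         // CockroachDB-specific: point the session at the new database. |
|         if _, err := conn.Exec(ctx, fmt.Sprintf("SET DATABASE = %s", *databaseName)); err != nil { |
|             log.Fatal(err) |
|         } |
|     } |
| |
|     // Placeholder DDL; the real tool executes sql.Schema or spanner.Schema. |
|     if _, err := conn.Exec(ctx, "CREATE TABLE IF NOT EXISTS Demo (id INT PRIMARY KEY)"); err != nil { |
|         log.Fatal(err) |
|     } |
| } |
| ``` |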
| |
| This process ensures that a target database is either created or confirmed to |
| exist, and then the correct schema is applied, making it ready for use. The |
| choice of using `pgxpool` for database interaction and providing separate schema |
| definitions for standard SQL and Spanner demonstrates a design focused on |
| supporting multiple database backends for the Perf system. The error handling, |
| particularly for the database creation step, aims for robust and user-friendly |
| operation. |
| |
| # Module: /go/issuetracker |
| |
| ## Perf Issue Tracker Module |
| |
| This module provides an interface and implementation for interacting with the |
| Google Issue Tracker API, specifically tailored for Perf's needs. The primary |
| goal is to abstract the complexities of the Issue Tracker API and provide a |
| simpler, more focused way to retrieve issue details and add comments to existing |
| issues. This allows other parts of the Perf system to integrate with issue |
| tracking without needing to directly handle API authentication, request |
| formatting, or response parsing. |
| |
| ### Core Functionality and Design |
| |
| The module is designed around the `IssueTracker` interface, which defines the |
| core operations: |
| |
| 1. **Listing Issues (`ListIssues`)**: This function allows retrieving details |
| for a set of specified issue IDs. |
| |
| - **Why**: Perf often needs to fetch information about bugs that have been |
| filed (e.g., to display their status or link to them from alerts). |
| Providing a bulk retrieval mechanism based on IDs is efficient. |
| - **How**: The implementation takes a `ListIssuesRequest` containing a |
| slice of integer issue IDs. It constructs a query string by joining |
| these IDs with " | " (OR operator in Issue Tracker query language) and |
| prepending "id:()". This formatted query is then sent to the Issue |
| Tracker API. |
| - **Example Workflow**: |
| |
| ``` |
| Perf System |
|     | ListIssuesRequest (IDs: [123, 456]) |
|     v |
| issuetracker Module: Construct Query: "id:(123 | 456)" |
|     | GET Request (issueTrackerImpl) |
|     v |
| Issue Tracker API |
|     | API Response |
|     v |
| Response Parsing --> []*issuetracker.Issue returned to Perf System |
| ``` |
| |
| 2. **Creating Comments (`CreateComment`)**: This function allows adding a new |
| comment to an existing issue. |
| |
| - **Why**: Perf might need to automatically update bugs with new |
| information, such as when a regression is fixed or when more data about |
| an alert becomes available. |
| - **How**: It takes a `CreateCommentRequest` containing the `IssueId` and |
| the `Comment` string. The implementation constructs an |
| `issuetracker.IssueComment` object and uses the Issue Tracker client |
| library to post this comment to the specified issue. |
| - **Example Workflow**: |
| |
| ``` |
| Perf System |
|     | CreateCommentRequest (ID: 789, Comment: "...") |
|     v |
| issuetracker Module |
|     | POST Request (issueTrackerImpl) |
|     v |
| Issue Tracker API |
|     | API Response |
|     v |
| Response Parsing --> CreateCommentResponse returned to Perf System |
| ``` |
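| |
| The query construction itself is simple enough to sketch directly; this is a |
| hedged approximation of the behavior described above, not the actual |
| implementation: |
| |
| ``` |
| package main |
| |
| import ( |
|     "fmt" |
|     "strconv" |
|     "strings" |
| ) |
| |
| // buildListIssuesQuery joins issue IDs with the Issue Tracker OR operator |
| // ("|") and wraps them in an id:() filter. |
| func buildListIssuesQuery(ids []int) string { |
|     parts := make([]string, 0, len(ids)) |
|     for _, id := range ids { |
|         parts = append(parts, strconv.Itoa(id)) |
|     } |
|     return fmt.Sprintf("id:(%s)", strings.Join(parts, " | ")) |
| } |
| |
| func main() { |
|     fmt.Println(buildListIssuesQuery([]int{123, 456})) // id:(123 | 456) |
| } |
| ``` |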
| |
| ### Key Components |
| |
| - **`issuetracker.go`**: |
| |
| - **`IssueTracker` interface**: Defines the contract for interacting with |
| the issue tracker. This allows for decoupling the client code from the |
| specific implementation and facilitates testing using mocks. |
| - **`issueTrackerImpl` struct**: The concrete implementation of the |
| `IssueTracker` interface. It holds an instance of the |
| `issuetracker.Service` client, which is the generated Go client for the |
| Google Issue Tracker API. |
| - **`NewIssueTracker` function**: This is the factory function for |
| creating an `issueTrackerImpl` instance. |
| - **Authentication**: It handles the authentication by fetching an API key |
| from Google Secret Manager. The secret project and name are configurable |
| via `config.IssueTrackerConfig`. It then uses `google.DefaultClient` |
| with the "https://www.googleapis.com/auth/buganizer" scope to obtain an |
| authenticated HTTP client. This client and the API key are then used to |
| initialize the `issuetracker.Service`. |
| - **Configuration**: The `BasePath` of the `issuetracker.Service` is |
| explicitly set to "https://issuetracker.googleapis.com" to ensure it |
| points to the correct API endpoint. |
| - **Request/Response Structs (`ListIssuesRequest`, `CreateCommentRequest`, |
| `CreateCommentResponse`)**: These simple structs define the data |
| structures for requests and responses, making the interface clear and |
| easy to use. They are designed to be minimal and specific to the needs |
| of the Perf system. |
| |
| - **`mocks/IssueTracker.go`**: |
| |
| - This file contains a mock implementation of the `IssueTracker` |
| interface, generated using the `testify/mock` library. |
| - **Why**: Mocks are crucial for unit testing components that depend on |
| the `issuetracker` module. They allow tests to simulate various |
| responses (success, failure, specific data) from the issue tracker |
| without making actual API calls. This makes tests faster, more reliable, |
| and independent of external services. |
| - **How**: The `IssueTracker` mock struct embeds `mock.Mock` and provides |
| mock implementations for `ListIssues` and `CreateComment`. The |
| `NewIssueTracker` function in this file is a constructor for the mock, |
| which also sets up test cleanup to assert that all expected mock calls |
| were made. |
| |
| ### Design Decisions and Trade-offs |
| |
| - **Interface-based design**: Using an interface (`IssueTracker`) promotes |
| loose coupling and testability. Consumers depend on the abstraction rather |
| than the concrete implementation. |
| - **Simplified API**: The module exposes only the functionality currently |
| needed by Perf (listing issues by ID and creating comments). It doesn't |
| attempt to be a full-fledged Issue Tracker client, which simplifies its own |
| implementation and usage. If more advanced features are needed in the |
| future, the interface can be extended. |
| - **Secret Management for API Key**: Storing the API key in Google Secret |
| Manager is a security best practice, preventing it from being hardcoded or |
| checked into version control. |
| - **Error Handling**: The module uses `skerr.Wrapf` to wrap errors, providing |
| context and making debugging easier. It also includes input validation for |
| `CreateCommentRequest` to prevent invalid API calls. |
| - **Logging**: Debug logs (`sklog.Debugf`) are included to trace requests and |
| responses, which can be helpful during development and troubleshooting. |
| |
| The module relies on the external `go.skia.org/infra/go/issuetracker/v1` |
| library, which is the auto-generated client for the Google Issue Tracker API. |
| This design choice leverages existing, well-tested client libraries instead of |
| reimplementing API interaction from scratch. |
| |
| # Module: /go/kmeans |
| |
| ## K-Means Clustering Module |
| |
| This module provides a generic implementation of the k-means clustering |
| algorithm. The primary goal is to offer a flexible way to group a set of data |
| points (observations) into a predefined number of clusters (k) based on their |
| similarity. The "similarity" is determined by a distance metric, and the |
| "center" of each cluster is represented by a centroid. |
| |
| ### Design and Implementation Choices |
| |
| The module is designed with **generality** in mind. Instead of being tied to a |
| specific data type or distance metric, it uses interfaces (`Clusterable`, |
| `Centroid`) and a function type (`CalculateCentroid`). This approach allows |
| users to define their own data structures and distance calculations, making the |
| k-means algorithm applicable to a wide variety of problems. |
| |
| **Interfaces for Flexibility:** |
| |
| - **`Clusterable`**: This is a marker interface. Any data type that needs to |
| be clustered must satisfy this interface. In practice, this means you can |
| use `interface{}` and then perform type assertions within your custom |
| distance and centroid calculation functions. This design choice prioritizes |
| ease of use for simple cases, where the same type might represent both an |
| observation and a centroid. |
| - **`Centroid`**: This interface defines the contract for centroids. |
| - `AsClusterable() Clusterable`: This method is crucial for situations |
| where a centroid itself can be treated as a data point (e.g., when |
| calculating distances or when a centroid is part of the initial |
| observation set). It allows the algorithm to seamlessly integrate |
| centroids into lists of clusterable items. If a centroid cannot be |
| meaningfully converted to a `Clusterable`, it returns `nil`. |
| - `Distance(c Clusterable) float64`: This method is the core of the |
| similarity measure. It calculates the distance between the centroid and |
| a given `Clusterable` data point. The user provides the specific |
| implementation for this, enabling the use of various distance metrics |
| (Euclidean, Manhattan, etc.). |
| - **`CalculateCentroid func([]Clusterable) Centroid`**: This function type |
| defines how a new centroid is computed from a set of `Clusterable` items |
| belonging to a cluster. This allows users to implement different strategies |
| for centroid calculation, such as taking the mean, median, or other |
| representative points. |
| |
| **Lloyd's Algorithm Implementation:** |
| |
| The core clustering logic is implemented in the `Do` function, which performs a |
| single iteration of Lloyd's algorithm. This is a common and relatively |
| straightforward iterative approach to k-means. |
| |
| The `KMeans` function orchestrates multiple iterations of `Do`. A key design |
| consideration here is the **convergence criterion**. Currently, it runs for a |
| fixed number of iterations (`iters`). A more sophisticated approach would be |
| to iterate until the total error (or the change in centroid positions) falls |
| below a certain threshold, indicating that the |
| clusters have stabilized. This was likely deferred for simplicity in the initial |
| implementation, but it's an important aspect for practical applications to avoid |
| unnecessary computations or premature termination. |
| |
| **Why modify centroids in-place in `Do`?** |
| |
| The `Do` function modifies the `centroids` slice passed to it. The documentation |
| explicitly advises calling it as `centroids = Do(observations, centroids, f)`. |
| This design choice might have been made for efficiency, avoiding the allocation |
| of a new centroids slice in every iteration if the number of centroids remains |
| the same. However, it also means the caller needs to be aware of this side |
| effect. The function does return the potentially new slice of centroids, which |
| is important because centroids can be "lost" if a cluster becomes empty. |
| |
| ### Key Responsibilities and Components |
| |
| - **`kmeans.go`**: This is the sole source file and contains all the logic for |
| the k-means algorithm. |
| |
| - **`Clusterable` (interface)**: Defines the contract for data points that |
| can be clustered. Its main purpose is to allow generic collections of |
| items. |
| - **`Centroid` (interface)**: Defines the contract for cluster centers, |
| including how to calculate their distance to data points and how to |
| treat them as data points themselves. |
| - **`CalculateCentroid` (function type)**: A user-provided function that |
| defines the logic for computing a new centroid from a group of data |
| points. This separation of concerns is key to the module's flexibility. |
| - **`closestCentroid(observation Clusterable, centroids []Centroid) (int, |
| float64)`**: A helper function that finds the index of the centroid |
| closest to a given observation and the distance to it. This is a |
| fundamental step in assigning observations to clusters. |
| - **`Do(observations []Clusterable, centroids []Centroid, f |
| CalculateCentroid) []Centroid`**: |
| - **Responsibility**: Performs a single iteration of the k-means algorithm |
| (Lloyd's algorithm). |
| - **How it works**: |
| 1. Assigns each observation to its nearest centroid, forming temporary |
| clusters: `Observations --> [Find Closest Centroid for each] --> |
| Temporary Cluster Assignments`. |
| 2. For each temporary cluster, it recalculates a new centroid using the |
| user-provided `f` function: `Temporary Cluster Assignments --> [Group by |
| Cluster] --> Sets of Clusterable items --> [Apply 'f'] --> New Centroids`. |
| 3. If a cluster becomes empty (no observations are closest to its |
| centroid), that centroid is effectively removed in this iteration, as `f` |
| will not be called for an empty set of `Clusterable` items, and |
| `newCentroids` will not include it. |
| - **Design Rationale**: Encapsulates one core step of the iterative |
| process, making the overall `KMeans` function clearer. The in-place |
| modification (and return value) addresses the potential for the number |
| of centroids to change. |
| - **`GetClusters(observations []Clusterable, centroids []Centroid) |
| ([][]Clusterable, float64)`**: |
| - **Responsibility**: Organizes the observations into their final clusters |
| based on the provided (presumably converged) centroids and calculates |
| the total sum of squared errors (or whatever distance metric is used). |
| - **How it works**: |
| 1. Initializes a list of clusters, where each cluster initially |
| contains only its centroid (if `AsClusterable()` is not nil). |
| 2. Iterates through all observations, assigning each to its closest |
| centroid and adding it to the corresponding cluster list. |
| 3. Accumulates the distance from each observation to its assigned |
| centroid to compute the `totalError`. |
| - **Design Rationale**: Provides a way to retrieve the actual cluster |
| memberships after the algorithm has run. The inclusion of the centroid |
| as the first element in each returned cluster is a convention for easy |
| identification. |
| - **`KMeans(observations []Clusterable, centroids []Centroid, k, iters |
| int, f CalculateCentroid) ([]Centroid, [][]Clusterable)`**: |
| - **Responsibility**: The main entry point for running the k-means |
| algorithm for a specified number of iterations. |
| - **How it works**: `Initial Centroids --(iter 1)--> Do() --(updates)--> |
| Centroids' --(iter 2)--> Do() --(updates)--> Centroids'' ... --(iter |
| 'iters')--> Do() --(updates)--> Final Centroids --> GetClusters() --> |
| Final Clusters` |
| - **Design Rationale**: Provides a simple interface to run the entire |
| process. The fixed number of iterations (`iters`) is a straightforward |
| stopping condition, though, as mentioned, convergence-based stopping |
| would be more robust. The `k` parameter seems redundant given that the |
| initial number of centroids is determined by `len(centroids)`. If `k` |
| was intended to specify the _desired_ number of clusters and the initial |
| `centroids` were just starting points, the implementation would need to |
| handle cases where `len(centroids)` != `k`. However, the current `Do` |
| function naturally adjusts the number of centroids if some clusters |
| become empty. |
| - **`TotalError(observations []Clusterable, centroids []Centroid) |
| float64`**: |
| - **Responsibility**: Calculates the sum of distances from each |
| observation to its closest centroid. This is often used as a measure of |
| the "goodness" of the clustering. |
| - **How it works**: It simply calls `GetClusters` and returns the |
| `totalError` computed by it. |
| - **Design Rationale**: Provides a convenient way to evaluate the |
| clustering quality without needing to manually iterate and sum |
| distances. |
| |
| ### Key Workflows |
| |
| **1. Single K-Means Iteration (`Do` function):** |
| |
| ``` |
| Input: Observations (O), Current Centroids (C_curr), CalculateCentroid function (f) |
| |
| 1. For each Observation o in O: |
| Find c_closest in C_curr such that Distance(o, c_closest) is minimized. |
| Assign o to the cluster associated with c_closest. |
| ---> Result: A mapping of each Observation to a Centroid index. |
| |
| 2. Initialize NewCentroids (C_new) as an empty list. |
| |
| 3. For each unique Centroid index j (from 0 to k-1): |
| a. Collect all Observations (O_j) assigned to cluster j. |
| b. If O_j is not empty: |
| Calculate new_centroid_j = f(O_j). |
| Add new_centroid_j to C_new. |
| ---> Potentially, some original centroids might not have any observations assigned, |
| so C_new might have fewer centroids than C_curr. |
| |
| Output: New Centroids (C_new) |
| ``` |
| |
| **2. Full K-Means Clustering (`KMeans` function):** |
| |
| ``` |
| Input: Observations (O), Initial Centroids (C_init), Number of Iterations (iters), CalculateCentroid function (f) |
| |
| 1. Set CurrentCentroids = C_init. |
| |
| 2. Loop 'iters' times: |
| CurrentCentroids = Do(O, CurrentCentroids, f) // Perform one iteration |
| ---> CurrentCentroids are updated. |
| |
| 3. FinalCentroids = CurrentCentroids. |
| |
| 4. Clusters, TotalError = GetClusters(O, FinalCentroids) |
| ---> Assigns each observation to its final cluster based on FinalCentroids. |
| The first element of each sub-array in Clusters is the centroid itself. |
| |
| Output: FinalCentroids, Clusters |
| ``` |
| |
| The unit tests in `kmeans_test.go` provide excellent examples of how to |
| implement the `Clusterable`, `Centroid`, and `CalculateCentroid` requirements |
| for a simple 2D point scenario. They demonstrate the expected behavior of the |
| `Do` and `KMeans` functions, including edge cases like empty inputs or losing |
| centroids when clusters become empty. |
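| |
| A hedged sketch of such an implementation for 2D points follows; the import |
| path is assumed, and the real tests may differ in detail: |
| |
| ``` |
| package main |
| |
| import ( |
|     "fmt" |
|     "math" |
| |
|     "go.skia.org/infra/perf/go/kmeans" // Assumed import path. |
| ) |
| |
| // Point serves as both an observation and a centroid for 2D data. |
| type Point struct{ X, Y float64 } |
| |
| // AsClusterable lets a centroid be treated as a data point. |
| func (p Point) AsClusterable() kmeans.Clusterable { return p } |
| |
| // Distance returns the Euclidean distance to another point. |
| func (p Point) Distance(c kmeans.Clusterable) float64 { |
|     o := c.(Point) |
|     return math.Hypot(p.X-o.X, p.Y-o.Y) |
| } |
| |
| // meanCentroid implements CalculateCentroid by averaging cluster members. |
| // It is never called with an empty slice (see Do above). |
| func meanCentroid(members []kmeans.Clusterable) kmeans.Centroid { |
|     var sum Point |
|     for _, m := range members { |
|         p := m.(Point) |
|         sum.X += p.X |
|         sum.Y += p.Y |
|     } |
|     n := float64(len(members)) |
|     return Point{X: sum.X / n, Y: sum.Y / n} |
| } |
| |
| func main() { |
|     observations := []kmeans.Clusterable{ |
|         Point{0, 0}, Point{1, 0}, Point{10, 10}, Point{11, 10}, |
|     } |
|     centroids := []kmeans.Centroid{Point{0, 1}, Point{10, 9}} |
|     finalCentroids, clusters := kmeans.KMeans(observations, centroids, 2, 10, meanCentroid) |
|     fmt.Println(finalCentroids, len(clusters)) |
| } |
| ``` |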
| |
| # Module: /go/maintenance |
| |
| ## Maintenance Module Documentation |
| |
| ### High-Level Overview |
| |
| The `maintenance` module in Perf is responsible for executing a set of |
| long-running background processes that are essential for the health and |
| operational integrity of a Perf instance. These tasks ensure that data is kept |
| up-to-date, system configurations are current, and storage is managed |
| efficiently. The module is designed to be started once and run continuously, |
| performing its duties at predefined intervals. |
| |
| ### Design Rationale and Implementation Choices |
| |
| The core design principle behind the `maintenance` module is to centralize |
| various periodic tasks that would otherwise be scattered or require manual |
| intervention. By consolidating these operations, the system becomes more robust |
| and easier to manage. |
| |
| Key design choices include: |
| |
| - **Asynchronous Operations:** Most maintenance tasks are designed to run in |
| separate goroutines, triggered by timers. This allows the main application |
| thread (if any) to remain responsive and prevents one maintenance task from |
| blocking others. |
| - **Configurability via Flags and Instance Configuration:** The behavior of |
| the maintenance tasks (e.g., whether to perform regression migration, |
| refresh query cache, or delete old data) is controlled by command-line flags |
| (`config.MaintenanceFlags`) and the instance-specific configuration |
| (`config.InstanceConfig`). This provides flexibility for different Perf |
| deployments and operational needs. |
| - **Dependency Injection:** Components like database connections |
| (`builders.NewDBPoolFromConfig`), Git interfaces |
| (`builders.NewPerfGitFromConfig`), and caching mechanisms |
| (`builders.GetCacheFromConfig`) are created and passed into the respective |
| maintenance tasks. This promotes modularity and testability. |
| - **Error Handling and Logging:** Each maintenance task incorporates error |
| handling and logging (`sklog`) to provide visibility into its operations and |
| to aid in diagnosing issues. While errors in one task might be logged, the |
| overall `Start` function aims to keep other independent tasks running. |
| - **Idempotency (Implicit):** While not explicitly stated for all tasks, many |
| maintenance operations are inherently idempotent or designed to be safe to |
| run repeatedly (e.g., schema migration, data deletion based on age). |
| - **Phased Introduction of Features:** Features like regression migration or |
| Sheriff config integration are gated by flags (`flags.MigrateRegressions`, |
| `instanceConfig.EnableSheriffConfig`). This allows for gradual rollouts and |
| testing in production environments. |
| |
| ### Responsibilities and Key Components |
| |
| The `maintenance` module orchestrates several distinct background processes. |
| |
| **1. Core Initialization and Schema Management (`maintenance.go`)** |
| |
| - **Why:** Before any maintenance tasks can run, essential services like |
| tracing need to be initialized. Crucially, the database schema must be |
| validated and migrated to the expected version. This ensures that all |
| subsequent database operations are performed against a compatible and |
| up-to-date schema. |
| - **How:** |
| - `tracing.Init`: Sets up the distributed tracing system. |
| - `builders.NewDBPoolFromConfig`: Establishes a connection pool to the |
| database. |
| - `expectedschema.ValidateAndMigrateNewSchema`: Checks the current |
| database schema version against the expected version defined in the |
| codebase. If they don't match, it applies the necessary migrations to |
| bring the schema up to date. This is a critical step to prevent data |
| corruption or application errors due to schema mismatches. |
| |
| **2. Git Repository Synchronization (`maintenance.go`)** |
| |
| - **Why:** Perf relies on an up-to-date view of the monitored Git repository |
| to associate performance data with specific commits. This process ensures |
| that new commits are continuously ingested into the Perf system. |
| - **How:** |
| - `builders.NewPerfGitFromConfig`: Creates an instance of `perfgit.Git`, |
| which provides an interface to the Git repository. |
| - `g.StartBackgroundPolling(ctx, gitRepoUpdatePeriod)`: This method |
| launches a goroutine within the `perfgit` component. This goroutine |
| periodically fetches the latest changes from the remote Git repository |
| (origin) and updates the local representation, typically also updating a |
| `Commits` table in the database with new commit information. The |
| `gitRepoUpdatePeriod` constant (e.g., 1 minute) defines how frequently |
| this update occurs. |
| |
| **3. Regression Schema Migration (`maintenance.go`)** |
| |
| - **Why:** Over time, the way regression data is stored might need to be |
| changed for performance, new features, or data integrity reasons. This |
| component handles the migration of existing regression data from an older |
| schema/table to a newer one. This is often a long-running process for |
| instances with a large history of regressions. |
| - **How:** |
| - Controlled by the `flags.MigrateRegressions` flag. |
| - `migration.New`: Creates a `Migrator` instance, likely configured with |
| database connections for both the old and new regression storage |
| mechanisms. |
| - `migrator.RunPeriodicMigration(regressionMigratePeriod, |
| regressionMigrationBatchSize)`: Starts a goroutine that, at intervals |
| defined by `regressionMigratePeriod`, processes a |
| `regressionMigrationBatchSize` number of regressions, moving them from |
| the old storage to the new. This batching approach prevents overwhelming |
| the database and allows the migration to proceed incrementally. |
| |
| **4. Sheriff Configuration Import (`maintenance.go`)** |
| |
| - **Why:** Perf allows defining alert configurations (Sheriff configs) that |
| specify how and when alerts should be triggered for performance regressions. |
| These configurations can be managed externally (e.g., via LUCI Config). This |
| component ensures that Perf stays synchronized with the latest |
| configurations. |
| - **How:** |
| - Conditional on `instanceConfig.EnableSheriffConfig` and a non-empty |
| `instanceConfig.InstanceName`. |
| - It initializes `AlertStore` and `SubscriptionStore` for managing alert |
| and subscription data within Perf. |
| - `luciconfig.NewApiClient`: Creates a client to communicate with the LUCI |
| Config service. |
| - `sheriffconfig.New`: Initializes the `SheriffConfig` service, which |
| encapsulates the logic for fetching, parsing, and applying Sheriff |
| configurations. |
| - `sheriffConfig.StartImportRoutine(configImportPeriod)`: Launches a |
| goroutine that periodically (every `configImportPeriod`) polls the LUCI |
| Config service for the specified instance. If new or updated |
| configurations are found, they are processed and stored/updated in |
| Perf's database (e.g., in the `Alerts` and `Subscriptions` tables). |
| |
| **5. Query Cache Refresh (`maintenance.go`)** |
| |
| - **Why:** To speed up common queries (e.g., retrieving the set of available |
| trace parameters, known as ParamSets), Perf can cache this information. This |
| component is responsible for periodically rebuilding and refreshing these |
| caches. |
| - **How:** |
| |
| - Controlled by the `flags.RefreshQueryCache` flag. |
| - `builders.NewTraceStoreFromConfig`: Gets an interface to the trace data. |
| - `dfbuilder.NewDataFrameBuilderFromTraceStore`: Creates a utility for |
| building data frames from traces, which is likely used to derive the |
| ParamSet. |
| - `psrefresh.NewDefaultParamSetRefresher`: Initializes a component |
| specifically designed to refresh ParamSets. It uses the |
| `DataFrameBuilder` to scan trace data and determine the current set of |
| unique parameter key-value pairs. |
| - `psRefresher.Start(time.Hour)`: Starts a goroutine to refresh the |
| primary ParamSet (perhaps stored directly in the database or an |
| in-memory representation) hourly. |
| - `builders.GetCacheFromConfig`: If a distributed cache like Redis is |
| configured, this obtains a client for it. |
| - `psrefresh.NewCachedParamSetRefresher`: Wraps the primary `psRefresher` |
| with a caching layer. |
| - `cacheParamSetRefresher.StartRefreshRoutine(redisCacheRefreshPeriod)`: |
| Starts another goroutine that takes the ParamSet generated by |
| `psRefresher` and populates the external cache (e.g., Redis) at |
| `redisCacheRefreshPeriod` intervals (e.g., every 4 hours). This provides |
| a faster lookup path for frequently accessed ParamSet data. |
| |
| Workflow: |
| |
| ``` |
| Trace Data --> DataFrameBuilder --> ParamSetRefresher (generates primary ParamSet) |
| | |
| v |
| CachedParamSetRefresher --> External Cache (e.g., Redis) |
| ``` |
| |
| **6. Old Data Deletion (`deletion/deleter.go`, `maintenance.go`)** |
| |
| - **Why:** Over time, Perf accumulates a large amount of data, including |
| regression information and associated shortcuts (which are often links or |
| identifiers for specific data views). To manage storage costs and maintain |
| system performance, very old data that is unlikely to be accessed needs to |
| be periodically deleted. |
| - **How:** |
| |
| - Controlled by the `flags.DeleteShortcutsAndRegressions` flag. |
| - **`deletion.New(db, ...)`:** Initializes a `Deleter` object. This object |
| encapsulates the logic for identifying and removing outdated regressions |
| and shortcuts. It takes a database connection pool (`db`) and the |
| datastore type. Internally, it creates instances of `sqlregressionstore` |
| and `sqlshortcutstore` to interact with the respective database tables. |
| - **`deleter.RunPeriodicDeletion(deletionPeriod, deletionBatchSize)`:** |
| This method in `maintenance.go` calls the `RunPeriodicDeletion` method |
| on the `Deleter` instance. |
| - Inside `deleter.go`, `RunPeriodicDeletion` starts a goroutine. |
| - This goroutine ticks at intervals specified by `deletionPeriod` (e.g., |
| every 15 minutes). |
| - On each tick, it calls `d.DeleteOneBatch(deletionBatchSize)`. |
| - **`Deleter.DeleteOneBatch(shortcutBatchSize)`:** |
| |
| * Calls `d.getBatch(ctx, shortcutBatchSize)` to identify a batch of |
| regressions and shortcuts eligible for deletion. |
| - **`Deleter.getBatch(...)`:** |
| - Finds the oldest commit number present in the `Regressions` |
| table. |
| - Iteratively queries the `Regressions` table for ranges of |
| commits, starting from the oldest. |
| - For each regression found, it checks the timestamp of its `Low` |
| and `High` `StepPoint`s. |
| - If a `StepPoint`'s timestamp is older than the defined `ttl` |
| (Time-To-Live, currently -18 months), the associated shortcut |
| and the commit number of the regression are marked for deletion. |
| - It continues collecting these until the number of shortcuts to |
| be deleted reaches approximately `shortcutBatchSize`. |
| - Returns the list of commit numbers whose regressions will be |
| deleted and the list of shortcut IDs to be deleted. |
| * Calls `d.deleteBatch(ctx, commitNumbers, shortcuts)` to perform the |
| actual deletion. |
| - **`Deleter.deleteBatch(...)`:** |
| - Starts a database transaction. |
| - Iterates through the `commitNumbers` and calls |
| `d.regressionStore.DeleteByCommit()` for each, removing the |
| regression data associated with that commit. |
| - Iterates through the `shortcuts` and calls |
| `d.shortcutStore.DeleteShortcut()` for each, removing the |
| shortcut entry. |
| - If all deletions are successful, it commits the transaction. If |
| any error occurs, it rolls back the transaction to ensure data |
| consistency. |
| |
| Deletion Workflow: |
| |
| ``` |
| Timer (every deletionPeriod) --> DeleteOneBatch |
| | |
| v |
| getBatch (identifies old data based on TTL) |
| | |
| | Returns (commit_numbers_to_delete, shortcut_ids_to_delete) |
| v |
| deleteBatch (deletes in a transaction) |
| | |
| +--> RegressionStore.DeleteByCommit |
| +--> ShortcutStore.DeleteShortcut |
| ``` |
| |
| The `ttl` variable in `deleter.go` is set to -18 months, meaning regressions |
| and their associated shortcuts older than 1.5 years are targeted for |
| deletion. This value was determined based on stakeholder requirements for |
| data retention. |
| |
| The `select {}` at the end of the `Start` function in `maintenance.go` is a |
| common Go idiom to make the main goroutine (the one that called `Start`) block |
| indefinitely. Since all the actual work is done in background goroutines |
| launched by `Start`, this prevents the `Start` function from returning and thus |
| keeps the maintenance processes alive. |
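| |
| The overall pattern looks roughly like this simplified sketch (not the actual |
| `Start` implementation): |
| |
| ``` |
| package main |
| |
| import ( |
|     "fmt" |
|     "time" |
| ) |
| |
| // runPeriodicDeletion mimics Deleter.RunPeriodicDeletion: a ticker-driven |
| // goroutine that deletes one batch of old data per tick. |
| func runPeriodicDeletion(period time.Duration, batchSize int) { |
|     ticker := time.NewTicker(period) |
|     go func() { |
|         for range ticker.C { |
|             fmt.Printf("deleting up to ~%d old shortcuts/regressions\n", batchSize) |
|         } |
|     }() |
| } |
| |
| func main() { |
|     runPeriodicDeletion(15*time.Minute, 100) |
|     // ... start the other maintenance goroutines here ... |
| |
|     // Block forever; all real work happens in the goroutines above. |
|     select {} |
| } |
| ``` |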
| |
| # Module: /go/notify |
| |
| The `notify` module in Perf is responsible for handling notifications related to |
| performance regressions. It provides a flexible framework for formatting and |
| sending notifications through various channels like email, issue trackers, or |
| custom endpoints like Chromeperf. |
| |
| **Core Concepts and Design:** |
| |
| The notification system is built around a few key abstractions: |
| |
| 1. **`Notifier` Interface (`notify.go`):** This is the central interface for |
| sending notifications. It defines methods for: |
| |
| - `RegressionFound`: Called when a new regression is detected. |
| - `RegressionMissing`: Called when a previously detected regression is no |
| longer found (e.g., due to new data or fixes). |
| - `ExampleSend`: Used for sending test/dummy notifications to verify |
| configuration. |
| - `UpdateNotification`: For updating an existing notification (e.g., |
| adding a comment to an issue). |
| |
| 2. **`Formatter` Interface (`notify.go`):** This interface is responsible for |
| constructing the content (body and subject) of a notification. |
| Implementations exist for: |
| |
| - `HTMLFormatter` (`html.go`): Generates HTML-formatted notifications, |
| suitable for email. |
| - `MarkdownFormatter` (`markdown.go`): Generates Markdown-formatted |
| notifications, suitable for issue trackers or other systems that support |
| Markdown. The formatters use Go's `text/template` package, allowing for |
| customizable notification messages. Templates can access a |
| `TemplateContext` (or `AndroidBugTemplateContext` for Android-specific |
| notifications) which provides data about the regression, commit, alert, |
| etc. |
| |
| 3. **`Transport` Interface (`notify.go`):** This interface defines how a |
| formatted notification is actually sent. Implementations include: |
| |
| - `EmailTransport` (`email.go`): Sends notifications via email using the |
| `emailclient` module. |
| - `IssueTrackerTransport` (`issuetracker.go`): Interacts with an issue |
| tracking system (configured for Google's Issue Tracker/Buganizer) to |
| create or update issues. It uses the `go/issuetracker/v1` client and |
| requires an API key for authentication. |
| - `NoopTransport` (`noop.go`): A "do nothing" implementation, useful for |
| disabling notifications or for testing. |
| |
| 4. **`NotificationDataProvider` Interface (`notification_provider.go`):** This |
| interface is responsible for gathering the necessary data to populate the |
| notification templates. |
| |
| - The `defaultNotificationDataProvider` uses a `Formatter` to generate the |
| notification body and subject based on `RegressionMetadata`. |
| - `androidNotificationProvider` (`android_notification_provider.go`) is a |
| specialized provider for Android-specific bug reporting. It uses its own |
| `AndroidBugTemplateContext` which includes Android-specific details like |
| Build ID diff URLs. It leverages the `MarkdownFormatter` for content |
| generation but with Android-specific templates. |
| |
| **Workflow for Sending a Notification (Simplified):** |
| |
| 1. A regression is detected (e.g., by the `alerter` module). |
| 2. The `Notifier`'s `RegressionFound` method is called with details about the |
| regression (commit, alert configuration, cluster summary, etc.). |
| 3. The `Notifier` (typically `defaultNotifier`) uses its |
| `NotificationDataProvider` to get the raw notification data (body and |
| subject). |
| - The `NotificationDataProvider` populates a context object (e.g., |
| `TemplateContext` or `AndroidBugTemplateContext`). |
| - It then uses a `Formatter` (e.g., `MarkdownFormatter`) to execute the |
| appropriate template with this context, producing the final body and |
| subject. |
| 4. The `Notifier` then calls its `Transport`'s `SendNewRegression` method, |
| passing the formatted body and subject. |
| 5. The `Transport` implementation handles the actual sending (e.g., makes an |
| API call to the issue tracker or sends an email). |
| |
| ``` |
| Regression Detected --> Notifier.RegressionFound(...) |
| | |
| v |
| NotificationDataProvider.GetNotificationDataRegressionFound(...) |
| | |
| | (Populates Context, e.g., TemplateContext) |
| v |
| Formatter.FormatNewRegressionWithContext(...) |
| | (Uses Go templates) |
| v |
| Formatted Body & Subject |
| | |
| v |
| Transport.SendNewRegression(body, subject) |
| | |
| +------------------> EmailTransport --> Email Server |
| | |
| +------------------> IssueTrackerTransport --> Issue Tracker API |
| | |
| +------------------> NoopTransport --> (Does nothing) |
| ``` |
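| |
| To illustrate the pluggability of the transport layer, here is a hedged |
| sketch of a custom `Transport` that only logs; the interface shape is |
| inferred from the description above and may not match the real method |
| signatures exactly: |
| |
| ``` |
| package main |
| |
| import "fmt" |
| |
| // Transport approximates the interface in notify.go (inferred shape). |
| type Transport interface { |
|     SendNewRegression(body, subject string) error |
| } |
| |
| // logTransport "sends" notifications by printing them; like NoopTransport, |
| // but with visible output, which can be handy for local debugging. |
| type logTransport struct{} |
| |
| func (logTransport) SendNewRegression(body, subject string) error { |
|     fmt.Printf("would send %q:\n%s\n", subject, body) |
|     return nil |
| } |
| |
| func main() { |
|     var t Transport = logTransport{} |
|     _ = t.SendNewRegression("Step detected at commit 12345.", "Regression found") |
| } |
| ``` |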
| |
| **Key Files and Responsibilities:** |
| |
| - **`notify.go`**: |
| |
| - Defines the core interfaces: `Notifier`, `Formatter`, `Transport`. |
| - Provides the `defaultNotifier` implementation, which orchestrates the |
| notification process by composing a `NotificationDataProvider`, |
| `Formatter`, and `Transport`. |
| - Contains the `New()` factory function that constructs the appropriate |
| `Notifier` based on the `NotifyConfig`. This is the main entry point for |
| creating a notifier. |
| - Defines `TemplateContext` used by generic formatters. |
| - Includes logic in `getRegressionMetadata` to fetch additional |
| information like source file links from `TraceStore` if the alert is for |
| an individual trace. |
| |
| - **`notification_provider.go`**: |
| |
| - Defines the `NotificationDataProvider` interface. |
| - Provides `defaultNotificationDataProvider` which uses a generic |
| `Formatter`. |
| - The purpose is to abstract the data gathering logic for notifications, |
| allowing for different data providers (like the Android-specific one) |
| without changing the core `Notifier` or `Transport` mechanisms. |
| |
| - **`android_notification_provider.go`**: |
| |
| - Implements `NotificationDataProvider` specifically for Android bug |
| creation. |
| - Uses `AndroidBugTemplateContext` to provide Android-specific data to |
| templates, such as `GetBuildIdUrlDiff` for generating links to compare |
| Android build CLs. |
| - Relies on `MarkdownFormatter` but configures it with Android-specific |
| notification templates defined in the `NotifyConfig`. This allows |
| Android teams to customize their bug reports. |
| |
| - **`markdown.go` & `html.go`**: |
| |
| - Implement the `Formatter` interface for Markdown and HTML respectively. |
| - Define default templates for new regressions and when regressions go |
| missing. |
| - `MarkdownFormatter` can be configured with custom templates via |
| `NotifyConfig`. It also provides a `buildIDFromSubject` template |
| function, specifically designed for Android's commit message format, to |
| extract build IDs. |
| - `viewOnDashboard` is a utility function to construct a URL to the Perf |
| explore page for the given regression. |
| |
| - **`email.go` & `issuetracker.go` & `noop.go`**: |
| |
| - Implement the `Transport` interface. |
| - `email.go`: Uses `emailclient` to send emails. Splits |
| comma/space-separated recipient lists. |
| - `issuetracker.go`: Interacts with the Google Issue Tracker API. It |
| requires API key secrets (configured via `NotifyConfig`) and uses OAuth2 |
| for authentication. It can create new issues and update existing ones |
| (e.g., to mark them obsolete). |
| - `noop.go`: A null implementation for disabling notifications. |
| |
| - **`chromeperfnotifier.go`**: |
| |
| - Implements the `Notifier` interface directly, without using the |
| `Formatter` or `Transport` abstractions in the same way as |
| `defaultNotifier`. This is because it communicates directly with the |
| Chrome Performance Dashboard's Anomaly API. |
| - It translates Perf's regression data into the format expected by the |
| Chromeperf API (`ReportRegression`). |
| - Includes logic (`isParamSetValid`, `getTestPath`) to ensure the data |
| conforms to Chromeperf's requirements (e.g., specific param keys like |
| `master`, `bot`, `benchmark`, `test`). |
| - Determines if a regression is an improvement based on the |
| `improvement_direction` parameter and the step direction. |
| |
| - **`commitrange.go`**: |
| |
| - Provides `URLFromCommitRange`, a utility function to generate a URL for |
| a commit or a range of commits. If a `commitRangeURLTemplate` is |
| provided (e.g., via configuration), it will be used to create a URL |
| showing the diff between two commits. Otherwise, it defaults to the |
| individual commit's URL. This is used by formatters to create links in |
| notifications. |
| |
| - **`common/notificationData.go`**: |
| |
| - Defines `NotificationData` (simple struct for body and subject) and |
| `RegressionMetadata` (a comprehensive struct holding all relevant |
| information about a regression needed for notification generation). This |
| promotes a common data structure for passing regression details. |
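| 
| To make the `Transport` seam concrete, here is a minimal sketch of the
| abstraction; the method name, signature, and return value are illustrative
| assumptions rather than the interface's actual definition:
| 
| ```
| package notify
| 
| import "context"
| 
| // Transport delivers an already-formatted notification. This sketch is
| // illustrative; the real interface may differ in shape and naming.
| type Transport interface {
|     // SendNewRegression returns an opaque reference (e.g. an issue ID)
|     // that can be used later to update or close the notification.
|     SendNewRegression(ctx context.Context, body, subject string) (string, error)
| }
| 
| // noopTransport mirrors the role of noop.go: it satisfies the interface
| // while delivering nothing, which effectively disables notifications.
| type noopTransport struct{}
| 
| func (noopTransport) SendNewRegression(ctx context.Context, body, subject string) (string, error) {
|     return "", nil
| }
| ```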
| |
| **Configuration and Customization (`NotifyConfig`):** |
| |
| The behavior of the `notify` module is heavily influenced by |
| `config.NotifyConfig`. This configuration allows users to: |
| |
| - Choose the notification type (`Notifications` field): `None`, `HTMLEmail`, |
| `MarkdownIssueTracker`, `ChromeperfAlerting`, `AnomalyGrouper`. |
| - Specify the `NotificationDataProvider`: `DefaultNotificationProvider` or |
| `AndroidNotificationProvider`. |
| - Customize the subject and body of notifications using Go templates |
| (`Subject`, `Body`, `MissingSubject`, `MissingBody`). This is particularly |
| relevant for `MarkdownFormatter` and `androidNotificationProvider`. |
| - Provide settings for `IssueTrackerTransport` (API key secret locations). |
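| 
| As a hedged illustration, a configuration selecting Markdown issue-tracker
| notifications could look roughly like the snippet below. Only the field
| names mentioned above are taken from the actual type; the values are
| examples, and `Subject`/`Body` would normally be Go templates evaluated
| against `TemplateContext` (whose fields are not listed here):
| 
| ```
| cfg := config.NotifyConfig{
|     // One of: None, HTMLEmail, MarkdownIssueTracker, ChromeperfAlerting,
|     // AnomalyGrouper.
|     Notifications:  notifytypes.MarkdownIssueTracker,
|     Subject:        "Perf regression detected",
|     Body:           "A performance regression was detected; see the dashboard.",
|     MissingSubject: "Perf regression no longer detected",
|     MissingBody:    "The step that triggered this report has disappeared.",
| }
| ```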
| |
| This design allows for flexibility in how notifications are generated and |
| delivered, catering to different needs and integrations. For instance, the |
| Android team can have highly customized bug reports, while other users might |
| prefer standard email notifications. The `ChromeperfNotifier` demonstrates a |
| direct integration with another system, bypassing some of the general-purpose |
| formatting/transport layers when a specific API is targeted. |
| |
| # Module: /go/notifytypes |
| |
| ## Perf Notifytypes Module |
| |
| ### Overview |
| |
| The `notifytypes` module in Perf defines the various types of notification |
| mechanisms that can be triggered in response to performance regressions or other |
| significant events. It also defines types for data providers that supply the |
| necessary information for these notifications. This module serves as a central |
| point for enumerating and categorizing notification strategies, enabling |
| flexible and extensible notification handling within the Perf system. |
| |
| ### Why: Design Decisions |
| |
| The primary goal of this module is to provide a structured and type-safe way to |
| manage notification types. |
| |
| - **Extensibility:** By defining notification types as constants of a custom |
| `Type` string, new notification methods can be easily added in the future |
| without requiring significant code changes in consuming modules. This |
| promotes loose coupling and allows the notification system to evolve |
| independently. |
| - **Clarity and Readability:** Using named constants (e.g., `HTMLEmail`, |
| `MarkdownIssueTracker`) instead of raw strings makes the code more |
| self-documenting and reduces the likelihood of errors due to typos. |
| - **Centralized Definition:** Having all notification types defined in one |
| place simplifies maintenance and provides a clear overview of the available |
| notification options. |
| - **Separation of Concerns:** The `NotificationDataProviderType` allows for |
| different sources or formats of data to be used for generating |
| notifications, separating the concern of _what_ data is needed from _how_ |
| the notification is delivered. This is crucial, for example, when different |
| platforms (like Android) might require specific data formatting or |
| additional information. |
| |
| ### How: Implementation Choices |
| |
| - **`Type` (string type):** `Type` is declared as a named string type
| (`type Type string`, a defined type rather than a true alias). This allows
| for string-based storage and transmission of notification types (e.g., in
| configuration files or database entries) while still providing a degree of
| type safety within Go code.
| - **Constants for Notification Types:** Specific notification mechanisms are |
| defined as constants of type `Type`. This ensures that only valid, |
| predefined notification types can be used. |
| - `HTMLEmail`: Indicates notifications sent as HTML-formatted emails. This |
| is suitable for rich content and direct user communication. |
| - `MarkdownIssueTracker`: Represents notifications formatted in Markdown, |
| intended for integration with issue tracking systems. This facilitates |
| automated ticket creation or updates. |
| - `ChromeperfAlerting`: Specifies that regression data should be sent to |
| the Chromeperf alerting system. This allows for integration with a |
| specialized alerting infrastructure. |
| - `AnomalyGrouper`: Designates that regressions should be processed by an |
| anomaly grouping logic, which then determines the appropriate action. |
| This enables more sophisticated handling of multiple related anomalies. |
| - `None`: A special type indicating that no notification should be sent. |
| This is useful for disabling notifications in certain contexts or for |
| configurations where alerting is not desired. |
| - **`AllNotifierTypes` Slice:** This public variable provides a convenient way |
| for other parts of the system to iterate over or validate against all known |
| notification types. |
| - **`NotificationDataProviderType` (string type):** Defined in the same
| way as `Type`, this names the kind of data provider to use for
| notifications.
| - `DefaultNotificationProvider`: Represents the standard or default data |
| provider. |
| - `AndroidNotificationProvider`: Indicates a specialized data provider |
| tailored for Android-specific notification requirements. This might |
| involve fetching different metrics, formatting data in a particular way, |
| or including Android-specific metadata. |
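| 
| Putting the above together, the module's contents amount to roughly the
| following sketch. It is consistent with the descriptions above but is not a
| verbatim copy of `notifytypes.go`; in particular, the constant string
| values are assumptions:
| 
| ```
| package notifytypes
| 
| // Type identifies a notification mechanism. It is a defined string type
| // (not a true alias), which is what provides the type safety noted above.
| type Type string
| 
| const (
|     HTMLEmail            Type = "html_email"
|     MarkdownIssueTracker Type = "markdown_issuetracker"
|     ChromeperfAlerting   Type = "chromeperf"
|     AnomalyGrouper       Type = "anomaly_grouper"
|     None                 Type = "none"
| )
| 
| // AllNotifierTypes lists every valid Type, e.g. for UI display or
| // validation.
| var AllNotifierTypes = []Type{
|     HTMLEmail, MarkdownIssueTracker, ChromeperfAlerting, AnomalyGrouper, None,
| }
| 
| // NotificationDataProviderType selects the data provider used to gather
| // notification content.
| type NotificationDataProviderType string
| 
| const (
|     DefaultNotificationProvider NotificationDataProviderType = "default"
|     AndroidNotificationProvider NotificationDataProviderType = "android"
| )
| ```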
| |
| ### Responsibilities and Key Components |
| |
| - **`notifytypes.go`:** This is the sole file in the module and contains all |
| the definitions. |
| - **Defines Notification Types:** Its primary responsibility is to |
| enumerate the supported notification mechanisms (`HTMLEmail`, |
| `MarkdownIssueTracker`, `ChromeperfAlerting`, `AnomalyGrouper`, `None`). |
| This acts as a contract for other modules that implement or consume |
| notification functionalities. |
| - **Defines Data Provider Types:** It also defines the types of data |
| providers (`DefaultNotificationProvider`, `AndroidNotificationProvider`) |
| that can be used to source information for notifications. This allows |
| the notification system to adapt to different data sources or formats. |
| - **Provides an Exhaustive List:** The `AllNotifierTypes` variable makes |
| it easy for other components to get a list of all valid notification |
| types, for example, for display in a UI or for validation purposes. |
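| 
| For example, a caller validating a configured value could lean on
| `AllNotifierTypes` as in this illustrative (hypothetical) helper:
| 
| ```
| // isValidNotifierType reports whether s names a known notification type.
| func isValidNotifierType(s string) bool {
|     for _, t := range notifytypes.AllNotifierTypes {
|         if string(t) == s {
|             return true
|         }
|     }
|     return false
| }
| ```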
| |
| ### Key Workflows/Processes |
| |
| While this module itself doesn't implement workflows, it underpins them. A |
| typical conceptual workflow where these types would be used is: |
| |
| 1. **Regression Detected:** The Perf system identifies a performance
| regression.
| 2. **Configuration Checked:** The system checks the configuration associated
| with the metric/test that regressed. This configuration specifies a
| `notifytypes.Type` (e.g., `HTMLEmail`).
| 3. **Notifier Selected:** Based on the `notifytypes.Type` from the
| configuration, the appropriate notifier implementation is selected.
| 4. **Data Provider Selected (if applicable):** If the configuration also
| specifies a `notifytypes.NotificationDataProviderType`, the corresponding
| data provider is chosen.
| 5. **Notification Sent:** The selected notifier uses the data (potentially
| from the selected data provider) to construct and send the notification.
| 
| `Regression Event` -> `Configuration Lookup` -> `Notifier Selected` ->
| `Data Provider Selected` -> `Notification Delivered`
| |
| For example, if a regression is detected for an Android benchmark and the |
| configuration specifies `HTMLEmail` as the `Type` and |
| `AndroidNotificationProvider` as the `NotificationDataProviderType`: |
| |
| `Regression Event` -> `Config: {Type: HTMLEmail, DataProvider: |
| AndroidNotificationProvider}` -> `Select EmailNotifier` -> `Select |
| AndroidDataProvider` -> `AndroidDataProvider fetches data` -> `EmailNotifier |
| formats and sends HTML email` |
| |
| # Module: /go/perf-tool |
| |
| The `perf-tool` module provides a command-line interface (CLI) for interacting |
| with various aspects of the Perf performance monitoring system. It allows |
| developers and administrators to manage configurations, inspect data, perform |
| database maintenance tasks, and validate ingestion files. |
| |
| The primary motivation behind `perf-tool` is to offer a centralized and |
| scriptable way to perform common Perf operations that would otherwise require |
| manual intervention or direct database interaction. This simplifies workflows |
| and enables automation of routine tasks. |
| |
| The core functionality is organized into subcommands, each addressing a specific |
| area of Perf: |
| |
| - **`config`**: Manages Perf instance configurations. |
| - `create-pubsub-topics-and-subscriptions`: Sets up the necessary Google |
| Cloud Pub/Sub topics and subscriptions required for data ingestion. This |
| is crucial for ensuring that Perf instances can receive and process |
| performance data. |
| - `validate`: Checks the syntax and validity of a Perf instance |
| configuration file. This helps prevent deployment of misconfigured |
| instances. |
| - **`tiles`**: Interacts with the tiled data storage used by Perf's |
| `tracestore`. Tiles are segments of time-series data. |
| - `last`: Displays the index of the most recent tile, providing insight |
| into the current state of data ingestion. |
| - `list`: Shows a list of recent tiles and the number of traces they |
| contain, useful for understanding data volume and distribution. |
| - **`traces`**: Allows querying and exporting trace data. |
| - `list`: Retrieves and displays the IDs of traces that match a given |
| query within a specific tile. This is useful for ad-hoc data |
| exploration. |
| - `export`: Exports trace data matching a query and commit range to a JSON |
| file. This enables external analysis or data migration. |
| - **`ingest`**: Manages the data ingestion process. |
| - `force-reingest`: Triggers the re-ingestion of data files from Google |
| Cloud Storage (GCS) for a specified time range. This is useful for |
| reprocessing data after configuration changes or to fix ingestion |
| errors. The workflow is: |
| * Parse start and stop time parameters. |
| * Iterate through configured GCS source prefixes. |
| * For each prefix, determine hourly GCS directories within the time range. |
| * List files in each directory. |
| * For each file, create a Pub/Sub message with the GCS object attributes |
| (bucket and name). |
| * Publish these messages to the configured ingestion topic. This simulates
| the GCS notification events that trigger ingestion (a publishing sketch
| follows this command list).
| - `validate`: Validates the format and content of an ingestion file |
| against the expected schema and parsing rules. This helps ensure data |
| quality before ingestion. |
| - **`database`**: Provides tools for backing up and restoring Perf database |
| components. This is critical for disaster recovery and data migration. |
| - `backup`: |
| - `alerts`: Backs up alert configurations to a zip file. |
| - `shortcuts`: Backs up saved shortcut configurations to a zip file. |
| - `regressions`: Backs up regression data (detected performance changes) |
| and associated shortcuts to a zip file. It backs up data up to a |
| specified date (defaulting to four weeks ago). The process involves |
| iterating backward through commits in batches, fetching regressions for |
| each commit range, and storing them along with any shortcuts referenced |
| in those regressions. |
| - `restore`: |
| - `alerts`: Restores alert configurations from a backup file. |
| - `shortcuts`: Restores shortcut configurations from a backup file. |
| - `regressions`: Restores regression data and their associated shortcuts |
| from a backup file. It's important to note that restoring regressions |
| also attempts to re-create the associated shortcuts. |
| - **`trybot`**: Contains experimental functionality related to trybot |
| (pre-submit testing) data. |
| - `reference`: Generates a synthetic nanobench reference file. This file |
| is constructed by loading a specified trybot results file, identifying |
| all trace IDs within it, and then fetching historical sample data for |
| these traces from the main Perf instance (specifically, from the last N |
| ingested files). The aggregated historical samples are then formatted |
| into a new nanobench JSON file. This allows for comparing trybot results |
| against a baseline derived from recent production data using tools like |
| `nanostat`. |
| - **`markdown`**: Generates Markdown documentation for the `perf-tool` CLI |
| itself. |
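| 
| The publishing step of `force-reingest` can be pictured with the standard
| `cloud.google.com/go/pubsub` client. This is a sketch under assumptions:
| the attribute names follow the usual GCS-notification convention
| (`bucketId`/`objectId`) and may not match the tool's exact keys:
| 
| ```
| package main
| 
| import (
|     "context"
|     "log"
| 
|     "cloud.google.com/go/pubsub"
| )
| 
| // publishReingest mimics a GCS notification for one file so that the
| // normal ingestion pipeline picks it up again.
| func publishReingest(ctx context.Context, projectID, topicName, bucket, object string) error {
|     client, err := pubsub.NewClient(ctx, projectID)
|     if err != nil {
|         return err
|     }
|     defer client.Close()
| 
|     res := client.Topic(topicName).Publish(ctx, &pubsub.Message{
|         Attributes: map[string]string{
|             "bucketId": bucket, // Attribute names are assumptions.
|             "objectId": object,
|         },
|     })
|     _, err = res.Get(ctx) // Block until the server acks the message.
|     return err
| }
| 
| func main() {
|     err := publishReingest(context.Background(), "my-project",
|         "perf-ingestion-topic", "my-bucket", "ingest/2024/01/02/03/data.json")
|     if err != nil {
|         log.Fatal(err)
|     }
| }
| ```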
| |
| The `main.go` file sets up the CLI application using the `urfave/cli` library. |
| It defines flags, commands, and subcommands, and maps them to corresponding |
| functions in the `application` package. It handles flag parsing, configuration |
| loading (from a file, with optional connection string overrides), and |
| initialization of logging. |
| |
| The `application/application.go` file defines the `Application` interface and |
| its concrete implementation `app`. This interface abstracts the core logic for |
| each command, promoting testability and separation of concerns. The `app` struct |
| implements methods that interact with various Perf components like `tracestore`, |
| `alertStore`, `shortcutStore`, `regressionStore`, and GCS. |
| |
| Key design choices include: |
| |
| - **Interface-based application logic (`Application` interface):** This allows |
| for mocking the application logic during testing (as seen in `main_test.go` |
| and `application/mocks/Application.go`), ensuring that the CLI command |
| parsing and flag handling can be tested independently of the actual backend |
| operations. |
| - **Configuration-driven:** Most operations require an instance configuration |
| file (`--config_filename`), which defines data store connections, GCS |
| sources, etc. This makes the tool adaptable to different Perf deployments. |
| - **Use of helper builders:** Functions from `perf/go/builders` are used to |
| instantiate components like `TraceStore`, `AlertStore`, etc., based on the |
| provided instance configuration. This centralizes component creation logic. |
| - **Zip format for backups:** Database backups for alerts, shortcuts, and
| regressions are stored in zip files. Inside these zip files, data is
| typically serialized using `encoding/gob`. This provides a simple and
| portable backup solution (see the sketch after this list).
| - **Batching for large operations:** When backing up regressions, data is |
| fetched in batches of commits (`regressionBatchSize`) to manage memory and |
| avoid overwhelming the database. |
| - **Pub/Sub for re-ingestion:** The `ingest force-reingest` command leverages |
| Pub/Sub by publishing messages that mimic GCS notifications, effectively |
| triggering the standard ingestion pipeline. |
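| 
| A condensed sketch of that backup shape follows; the entry name and record
| type are stand-ins, not the structs `perf-tool` actually serializes:
| 
| ```
| package main
| 
| import (
|     "archive/zip"
|     "encoding/gob"
|     "log"
|     "os"
| )
| 
| // alertRecord is a stand-in for the real alert configuration struct.
| type alertRecord struct {
|     ID    int64
|     Query string
| }
| 
| // writeBackup writes gob-encoded records into a single zip entry,
| // mirroring the zip+gob layout described above.
| func writeBackup(filename string, alerts []alertRecord) error {
|     f, err := os.Create(filename)
|     if err != nil {
|         return err
|     }
|     defer f.Close()
| 
|     z := zip.NewWriter(f)
|     w, err := z.Create("alerts.gob") // Entry name is illustrative.
|     if err != nil {
|         return err
|     }
|     enc := gob.NewEncoder(w)
|     for _, a := range alerts {
|         if err := enc.Encode(a); err != nil {
|             return err
|         }
|     }
|     return z.Close() // Flushes the archive's central directory.
| }
| 
| func main() {
|     if err := writeBackup("alerts.zip", []alertRecord{{ID: 1, Query: "config=8888"}}); err != nil {
|         log.Fatal(err)
|     }
| }
| ```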
| |
| The `application/mocks/Application.go` file contains a mock implementation of |
| the `Application` interface, generated by the `mockery` tool. This is used in |
| `main_test.go` to test the command-line argument parsing and dispatch logic |
| without actually performing the underlying operations. |
| |
| # Module: /go/perfclient |
| |
| The `perfclient` module provides an interface for sending performance data to |
| Skia Perf's ingestion system. The primary goal of this module is to abstract the |
| complexities of interacting with Google Cloud Storage (GCS), which is the |
| underlying mechanism Perf uses for data ingestion. By providing a dedicated |
| client, it simplifies the process for other applications and services that need |
| to report performance metrics. |
| |
| The core design centers around a `ClientInterface` and its concrete |
| implementation, `Client`. This approach allows for easy mocking and testing, |
| promoting loose coupling between the `perfclient` and its consumers. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`perf_client.go`**: |
| |
| - **`ClientInterface`**: This interface defines the contract for pushing |
| performance data. The key method is `PushToPerf`. The decision to use an |
| interface here is crucial for testability and dependency injection. It |
| allows consumers to use a real GCS-backed client in production and a |
| mock client in tests. |
| - **`Client`**: This struct is the concrete implementation of |
| `ClientInterface`. It holds a `gcs.GCSClient` instance, which is |
| responsible for the actual communication with Google Cloud Storage, and |
| a `basePath` string that specifies the root directory within the GCS |
| bucket where performance data will be stored. The constructor `New` |
| takes these as arguments, allowing users to configure the GCS bucket and |
| the top-level folder for their data. |
| - **`PushToPerf` method**: This is the workhorse of the module. |
| |
| * It takes a `time.Time` object (`now`), a `folderName`, a `filePrefix`, |
| and a `format.BenchData` struct (which represents the performance |
| metrics). |
| * The `format.BenchData` is first marshaled into a JSON string. This is |
| the standard format Perf expects for ingestion. |
| * The JSON data is then compressed using `gzip`. This is a performance |
| optimization, as GCS can automatically decompress gzipped files with the |
| correct `ContentEncoding` header, reducing storage costs and transfer |
| times. |
| * A deterministic GCS object path is constructed using the `objectPath` |
| helper function. This path incorporates the `basePath`, the current |
| timestamp (formatted as `YYYY/MM/DD/HH/`), the `folderName`, and a |
| filename composed of the `filePrefix`, an MD5 hash of the JSON data, and |
| a millisecond-precision timestamp. The inclusion of the MD5 hash helps |
| in avoiding duplicate uploads of identical data and can be useful for |
| debugging or data verification. The timestamp in the path and filename |
| ensures that data from different runs or times are stored separately and |
| can be easily queried. |
| * Finally, the compressed data is uploaded to GCS using the |
| `storageClient.SetFileContents` method. Crucially, it sets |
| `ContentEncoding: "gzip"` and `ContentType: "application/json"` in the |
| `gcs.FileWriteOptions`. This metadata informs GCS about the compression |
| and data type, enabling features like automatic decompression. |
| |
| - **`objectPath` function**: This helper function is responsible for |
| constructing the unique GCS path for each performance data file. The |
| rationale for this specific path structure |
| (`basePath/YYYY/MM/DD/HH/folderName/filePrefix_hash_timestamp.json`) is |
| to organize data chronologically and by task, making it easier to |
| browse, query, and manage within GCS. The hash ensures uniqueness and |
| integrity. |
| |
| - **`mock_perf_client.go`**: |
| |
| - **`MockPerfClient`**: This provides a mock implementation of |
| `ClientInterface` using the `testify/mock` library. This is essential |
| for unit testing components that depend on `perfclient` without |
| requiring actual GCS interaction. It allows developers to define |
| expected calls to `PushToPerf` and verify that their code interacts with |
| the client correctly. The `NewMockPerfClient` constructor returns a |
| pointer to ensure that the methods provided by `mock.Mock` (like `On` |
| and `AssertExpectations`) are accessible. |
| |
| **Workflow: Pushing Performance Data** |
| |
| The primary workflow involves a client application using `perfclient` to send |
| performance data: |
| |
| ``` |
| Client App perfclient.Client gcs.GCSClient |
| | | | |
| | -- Call PushToPerf(now, | | |
| | folder, prefix, data) ->| | |
| | | -- Marshal data to JSON | |
| | | -- Compress JSON (gzip) | |
| | | -- Construct GCS objectPath | |
| | | (includes time, folder, | |
| | | prefix, data hash) | |
| | | | |
| | | -- Call SetFileContents(path, | |
| | | options, compressed_data) -> | |
| | | | -- Upload to GCS |
| | | | with gzip encoding |
| | | | and JSON content type |
| | | <-------------------------------| -- Return success/error |
| | <--------------------------| | |
| | -- Receive success/error | | |
| ``` |
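| 
| The same flow in condensed Go form: a sketch in which a hypothetical
| `upload` callback stands in for `gcs.GCSClient.SetFileContents`, and a
| generic payload replaces `format.BenchData`. The filename layout is an
| approximation of the scheme described above:
| 
| ```
| package main
| 
| import (
|     "bytes"
|     "compress/gzip"
|     "crypto/md5"
|     "encoding/json"
|     "fmt"
|     "time"
| )
| 
| // pushToPerf mirrors the marshal -> gzip -> path -> upload steps.
| func pushToPerf(now time.Time, basePath, folderName, filePrefix string,
|     data any, upload func(path string, contents []byte) error) error {
| 
|     b, err := json.Marshal(data) // Perf ingests JSON.
|     if err != nil {
|         return err
|     }
| 
|     var buf bytes.Buffer
|     zw := gzip.NewWriter(&buf) // gzip cuts storage and transfer costs.
|     if _, err := zw.Write(b); err != nil {
|         return err
|     }
|     if err := zw.Close(); err != nil {
|         return err
|     }
| 
|     // basePath/YYYY/MM/DD/HH/folderName/filePrefix_hash_millis.json
|     path := fmt.Sprintf("%s/%s/%s/%s_%x_%d.json",
|         basePath, now.UTC().Format("2006/01/02/15"), folderName,
|         filePrefix, md5.Sum(b), now.UnixMilli())
| 
|     // The real client also sets ContentEncoding: "gzip" and
|     // ContentType: "application/json" so GCS can auto-decompress.
|     return upload(path, buf.Bytes())
| }
| 
| func main() {
|     _ = pushToPerf(time.Now(), "perf-data", "task1", "run",
|         map[string]string{"metric": "1.2"},
|         func(path string, contents []byte) error {
|             fmt.Println("would upload", len(contents), "bytes to", path)
|             return nil
|         })
| }
| ```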
| |
| The design emphasizes creating a clear separation of concerns: the `perfclient` |
| handles the formatting, compression, and path generation logic specific to |
| Perf's ingestion requirements, while the underlying `gcs.GCSClient` handles the |
| raw GCS communication. This makes the `perfclient` a focused and reusable |
| component for any system needing to integrate with Skia Perf. |
| |
| # Module: /go/perfresults |
| |
| ## Module Overview |
| |
| The `perfresults` module is responsible for fetching, parsing, and processing |
| performance results data generated by Telemetry-based benchmarks in the Chromium |
| project. This data typically resides in `perf_results.json` files. The module |
| provides functionalities to: |
| |
| 1. **Load Performance Data**: Retrieve performance results from various |
| sources, primarily Buildbucket builds. This involves interacting with |
| Buildbucket to get build information, Swarming to identify relevant tasks |
| and their outputs, and RBE-CAS (Content Addressable Storage) to download the |
| actual `perf_results.json` files. |
| 2. **Parse Performance Data**: Interpret the structure of `perf_results.json` |
| files. These files contain sets of histograms, where each histogram |
| represents a specific benchmark measurement. The parser extracts these |
| histograms and associated metadata. |
| 3. **Process and Transform Data**: Convert the parsed performance data into a |
| format suitable for ingestion by other systems, such as the Perf ingestion |
| pipeline. This includes aggregating histogram samples (e.g., calculating |
| mean, max, min) and structuring the data according to a defined schema. |
| |
| The primary goal is to provide a reliable and efficient way to access and |
| utilize Chromium's performance data for analysis and monitoring. |
| |
| ## Design Decisions and Implementation Choices |
| |
| ### Data Loading Workflow |
| |
| The process of loading performance results from a Buildbucket build involves |
| several steps: |
| |
| `Buildbucket ID -> BuildInfo -> Swarming Task ID -> Child Swarming Task IDs -> |
| CAS Outputs -> PerfResults` |
| |
| 1. **Buildbucket Interaction (`buildbucket.go`)**: |
| |
| - **Why**: Buildbucket is the entry point for CI/CQ builds. It contains |
| information about the build, including the associated Swarming task and |
| crucial metadata like git revision and commit position. |
| - **How**: The `bbClient` interacts with the Buildbucket PRPC API to fetch |
| build details using a given `buildID`. It specifically requests fields |
| like `builder`, `status`, `infra.backend.task.id` (for the Swarming task |
| ID), `output.properties` (for git revision information), and |
| `input.properties` (for `perf_dashboard_machine_group`). |
| - The `BuildInfo` struct is populated with this information, providing a |
| consolidated view of the build's context. The `GetPosition()` method on |
| `BuildInfo` is crucial as it determines the commit identifier (either |
| commit position or git hash) used for associating the performance data |
| with a specific point in the codebase. |
| |
| 2. **Swarming Interaction (`swarming.go`)**: |
| |
| - **Why**: The main Buildbucket task often spawns multiple child Swarming |
| tasks, each running a subset of benchmarks. We need to identify all |
| these child tasks to gather all performance results. |
| - **How**: The `swarmingClient` uses the Swarming PRPC API. |
| - `findChildTaskIds`: Given a parent Swarming task ID (obtained from |
| `BuildInfo`), this function lists all child tasks by querying for |
| tasks with a matching `parent_task_id` tag. The query is scoped by |
| the parent task's creation and completion timestamps to narrow down |
| the search. |
| - `findTaskCASOutputs`: For each child task ID, this function |
| retrieves the task result, specifically looking for the |
| `CasOutputRoot`. This reference points to the RBE-CAS location where |
| the task's output files (including `perf_results.json`) are stored. |
| |
| 3. **RBE-CAS Interaction (`rbecas.go`)**: |
| |
| - **Why**: `perf_results.json` files are stored in RBE-CAS. RBE-CAS |
| provides efficient and reliable storage for large build artifacts. |
| - **How**: The `RBEPerfLoader` uses the RBE SDK to interact with CAS. |
| - `fetchPerfDigests`: Given a CAS reference (pointing to the root |
| directory of a task's output), this function: |
| * Reads the root `Directory` proto. |
| * Retrieves the entire directory tree using `GetDirectoryTree`. |
| * Flattens the tree to get a map of file paths to their digests. |
| * Filters for files named `perf_results.json`. The path structure is |
| expected to be `benchmark_name/perf_results.json`, allowing |
| association of results with a specific benchmark. |
| - `loadPerfResult`: Given a digest for a `perf_results.json` file, |
| this reads the blob from CAS and parses it using `NewResults`. |
| - `LoadPerfResults`: This orchestrates the loading for multiple CAS |
| references (from multiple child Swarming tasks). It iterates through |
| each CAS reference, fetches the digests of `perf_results.json` |
| files, loads each file, and then merges results from the same |
| benchmark. Merging is important because a single benchmark might |
| have its results split across multiple files or tasks. |
| |
| 4. **Orchestration (`perf_loader.go`)**: |
| |
| - **Why**: A central loader is needed to tie together the interactions |
| with Buildbucket, Swarming, and RBE-CAS. |
| - **How**: The `loader.LoadPerfResults` method coordinates the entire |
| workflow: |
| 1. Initializes `bbClient` to get `BuildInfo`. |
| 2. Initializes `swarmingClient` to find child task IDs and then their |
| CAS outputs. |
| 3. It performs a sanity check (`checkCasInstances`) to ensure all CAS |
| outputs come from the same RBE instance, simplifying client |
| initialization. |
| 4. Initializes `RBEPerfLoader` (via `rbeProvider` for testability) for |
| the determined CAS instance. |
| 5. Calls `RBEPerfLoader.LoadPerfResults` with the list of CAS |
| references to fetch and parse all `perf_results.json` files. |
| - The use of `rbeProvider` is a good example of dependency injection, |
| allowing tests to mock the RBE-CAS interaction. |
| |
| ### Performance Data Parsing (`perf_results_parser.go`) |
| |
| - **Why**: `perf_results.json` files have a specific, somewhat complex |
| structure. A dedicated parser is needed to extract meaningful data |
| (histograms and their metadata). |
| - **How**: |
| - The `PerfResults` struct is the main container, holding a map of |
| `TraceKey` to `Histogram`. |
| - `TraceKey` uniquely identifies a trace, composed of `ChartName` (metric |
| name), `Unit`, `Story` (user journey/test case), `Architecture`, and |
| `OSName`. These fields are extracted from the histogram's own properties |
| and its associated "diagnostics" which are references to other metadata |
| objects within the JSON file. |
| - `Histogram` stores the `SampleValues` (the actual measurements). |
| - **Streaming JSON Decoding**: `NewResults` uses `json.NewDecoder` to |
| process the input `io.Reader` in a streaming fashion. |
| - **Why Streaming?**: `perf_results.json` files can be very large (10MB+). |
| Reading the entire file into memory before parsing would be inefficient |
| and could lead to high memory usage. Streaming allows processing the |
| JSON array element by element. |
| - **Implementation**: |
| 1. It first expects and consumes the opening `[` of the JSON array. |
| 2. It then iterates while `decoder.More()` is true, decoding each |
| element into a `singleEntry` struct. |
| 3. `singleEntry` is a union-like struct that can hold different types |
| of objects found in the JSON (histograms, generic sets, date ranges, |
| related name maps). This is determined by checking fields like |
| `Name` (present for histograms) or `Type`. |
| 4. If an entry is a histogram (`entry.Name != ""`), it's converted to |
| `TraceKey` and `Histogram` via |
| `histogramRaw.asTraceKeyAndHistogram`. This conversion involves |
| looking up GUIDs from the histogram's `Diagnostics` map in a locally |
| maintained `metadata` map (`md`). |
| 5. Other entry types (`GenericSet`, `DateRange`, `RelatedNameMap`) are |
| stored in the `md` map, keyed by their `GUID`, so they can be |
| referenced by histograms later in the stream. |
| 6. Parsed histograms are merged into `pr.Histograms`. If a `TraceKey` |
| already exists, sample values are appended. |
| 7. Finally, it consumes the closing `]` of the JSON array. |
| - **Aggregation**: The `Histogram` type provides methods for common |
| aggregations (Min, Max, Mean, Stddev, Sum, Count). `AggregationMapping` |
| provides a convenient way to access these aggregation functions by |
| string keys, which is used by downstream consumers like the ingestion |
| module. |
| - **Legacy `UnmarshalJSON`**: An `UnmarshalJSON` method exists, which |
| reads the entire byte slice into memory. This is less efficient and |
| marked for deprecation in favor of `NewResults`. |
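| 
| The streaming shape of `NewResults` looks roughly like the sketch below;
| `singleEntry` is trimmed to the discriminator fields mentioned above, and
| the JSON key names are assumptions:
| 
| ```
| package main
| 
| import (
|     "encoding/json"
|     "fmt"
|     "io"
|     "strings"
| )
| 
| // singleEntry is a trimmed stand-in for the union-like struct described
| // above; only discriminator fields are shown.
| type singleEntry struct {
|     Name string `json:"name"` // Non-empty for histograms.
|     Type string `json:"type"` // Set for GenericSet, DateRange, etc.
|     GUID string `json:"guid"`
| }
| 
| // parseStream decodes a JSON array element by element, never holding the
| // whole document in memory.
| func parseStream(r io.Reader) error {
|     dec := json.NewDecoder(r)
|     if _, err := dec.Token(); err != nil { // Consume the opening '['.
|         return err
|     }
|     for dec.More() {
|         var e singleEntry
|         if err := dec.Decode(&e); err != nil {
|             return err
|         }
|         if e.Name != "" {
|             fmt.Println("histogram:", e.Name)
|         } else {
|             fmt.Println("metadata:", e.Type, e.GUID)
|         }
|     }
|     _, err := dec.Token() // Consume the closing ']'.
|     return err
| }
| 
| func main() {
|     in := `[{"name":"timeToFirstPaint"},{"type":"GenericSet","guid":"abc"}]`
|     if err := parseStream(strings.NewReader(in)); err != nil {
|         fmt.Println(err)
|     }
| }
| ```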
| |
| ### Data Ingestion Preparation (`ingest/`) |
| |
| This submodule focuses on transforming the parsed `PerfResults` into the |
| `format.Format` structure required by the Perf ingestion system. |
| |
| - **`json.go` (`ConvertPerfResultsFormat`)**: |
| |
| - **Why**: The raw `PerfResults` structure is not directly ingestible. It |
| needs to be reshaped. |
| - **How**: |
| |
| * Iterates through each `(TraceKey, Histogram)` pair in the input |
| `PerfResults`. |
| * For each pair, it creates a `format.Result`. The `Key` map within |
| `format.Result` is populated from `TraceKey` fields (chart, unit, story, |
| arch, os). |
| * The `Measurements` map within `format.Result` is populated by calling |
| `toMeasurement` on the `Histogram`. |
| * `toMeasurement` iterates through `perfresults.AggregationMapping`, |
| applying each aggregation function to the histogram's samples. Each |
| resulting aggregation (e.g., "max", "mean") becomes a |
| `format.SingleMeasurement` with the aggregation type as its `Value` and |
| the computed metric as its `Measurement`. |
| * The final `format.Format` object includes the version, commit hash |
| (`GitHash`), and any provided headers and links. |
| |
| - **`gcs.go`**: |
| |
| - **Why**: Provides utilities for determining the correct Google Cloud |
| Storage (GCS) path where the transformed JSON files should be stored. |
| This is based on conventions used by the Perf ingestion system. |
| - **How**: |
| - `convertPath`: Constructs a GCS path like |
| `gs://<bucket>/ingest/<time_path>/<build_info_path>/<benchmark>`. |
| - `convertTime`: Formats a `time.Time` into `YYYY/MM/DD/HH` (UTC). |
| - `convertBuildInfo`: Formats `BuildInfo` into |
| `<MachineGroup>/<BuilderName>`. It defaults `MachineGroup` to |
| "ChromiumPerf" and `BuilderName` to "BuilderNone" if they are empty. |
| - `isInternal`: Determines if the results are internal or public based on |
| the `BuilderName`. It checks against a list of known external bot |
| configurations (`pinpoint/go/bot_configs`). If not found, it defaults to |
| internal. This determines whether `PublicBucket` (`chrome-perf-public`) |
| or `InternalBucket` (`chrome-perf-non-public`) is used. |
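| 
| The `YYYY/MM/DD/HH` layout maps onto Go's reference-time formatting like so
| (a one-function illustration, not the module's exact code):
| 
| ```
| // hourPath formats t in UTC as YYYY/MM/DD/HH, matching the ingestion
| // path convention, e.g. 2024-03-09 07:00 UTC -> "2024/03/09/07".
| func hourPath(t time.Time) string {
|     return t.UTC().Format("2006/01/02/15")
| }
| ```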
| |
| ## Key Components and Files |
| |
| - **`perf_loader.go`**: Orchestrates the loading of performance results from |
| Buildbucket. `NewLoader().LoadPerfResults()` is the main entry point. |
| - **`buildbucket.go`**: Handles interaction with the Buildbucket API to fetch |
| build metadata. Defines `BuildInfo`. |
| - **`swarming.go`**: Handles interaction with the Swarming API to find child |
| tasks and their CAS outputs. |
| - **`rbecas.go`**: Handles interaction with RBE-CAS to download and parse |
| `perf_results.json` files. Defines `RBEPerfLoader`. |
| - **`perf_results_parser.go`**: Parses the content of `perf_results.json` |
| files. Defines `PerfResults`, `TraceKey`, `Histogram`, and the streaming |
| `NewResults` parser. |
| - **`ingest/json.go`**: Transforms parsed `PerfResults` into the |
| `format.Format` structure for ingestion. |
| - **`ingest/gcs.go`**: Provides utilities to determine GCS paths for storing |
| transformed results. |
| - **`cli/main.go`**: A command-line interface utility that uses the |
| `perfresults` library to fetch results for a given Buildbucket ID and |
| outputs them as JSON files in the ingestion format. This serves as a |
| practical example and a tool for ad-hoc data retrieval. |
| - **`testdata/`**: Contains JSON files used for replaying HTTP and gRPC |
| interactions during tests (`*.json`, `*.rpc`), and sample |
| `perf_results.json` files for parser testing. `replay_test.go` sets up the |
| replay mechanism. |
| |
| ## Workflows |
| |
| ### Primary Workflow: Loading Perf Results from Buildbucket |
| |
| ``` |
| User/System --Buildbucket ID--> perf_loader.LoadPerfResults() |
| | |
| +--> buildbucket.findBuildInfo() --PRPC call--> Buildbucket API |
| | (Returns BuildInfo: Swarming Task ID, Git Revision, Machine Group, etc.) |
| | |
| +--> swarming.findChildTaskIds() --PRPC call--> Swarming API (using Parent Task ID) |
| | (Returns list of Child Swarming Task IDs) |
| | |
| +--> swarming.findTaskCASOutputs() --PRPC calls--> Swarming API (for each Child Task ID) |
| | (Returns list of CASReference objects) |
| | |
| (Error if CAS instances differ for CASReferences) |
| | |
| +--> rbecas.RBEPerfLoader.LoadPerfResults() (with list of CASReferences) |
| | |
| +--> For each CASReference: |
| | | |
| | +--> rbecas.fetchPerfDigests() --RBE SDK calls--> RBE-CAS |
| | | (Returns map of benchmark_name to digest of perf_results.json) |
| | | |
| | +--> For each (benchmark_name, digest): |
| | | |
| | +--> rbecas.loadPerfResult() --RBE SDK call (ReadBlob)--> RBE-CAS |
| | | | |
| | | +--> perf_results_parser.NewResults() (Parses JSON stream) |
| | | (Returns PerfResults object for this file) |
| | | |
| | +--> (Merge with existing PerfResults for the same benchmark_name) |
| | |
| (Returns map[benchmark_name]*PerfResults and BuildInfo) |
| ``` |
| |
| ### CLI Workflow: Fetching and Converting Perf Results |
| |
| ``` |
| CLI User --Build ID, Output Dir--> cli/main.main() |
| | |
| +--> perfresults.NewLoader().LoadPerfResults(Build ID) |
| | (Executes the Primary Workflow described above) |
| | (Returns BuildInfo, map[benchmark]*PerfResults) |
| | |
| +--> For each (benchmark, perfResult) in results: |
| | |
| +--> ingest.ConvertPerfResultsFormat(perfResult, buildInfo.GetPosition(), headers, links) |
| | (Transforms PerfResults to ingest.Format) |
| | |
| +--> Marshal ingest.Format to JSON |
| | |
| +--> Write JSON to output file: <outputDir>/<benchmark>_<BuildID>.json |
| | |
| +--> Print output filename to stdout |
| ``` |
| |
| ### Temporal Worker (Placeholder) |
| |
| The `workflows/worker/main.go` file sets up a Temporal worker. Currently, it's a |
| basic skeleton that initializes a worker and connects to a Temporal server. It |
| doesn't register any specific activities or workflows from the `perfresults` |
| module itself. Its presence suggests an intention to integrate `perfresults` |
| functionalities into Temporal workflows in the future, possibly for automated |
| ingestion or processing tasks. The worker itself is a generic Temporal worker |
| setup. |
| |
| ## Testing Strategy |
| |
| The module employs a robust testing strategy: |
| |
| - **Unit Tests**: Each Go file generally has a corresponding `_test.go` file |
| with unit tests for its specific logic. For example, |
| `perf_results_parser_test.go` tests the JSON parsing, and |
| `buildbucket_test.go` tests `BuildInfo` logic. |
| - **Replay Testing (`replay_test.go`, `testdata/`)**: |
| - **Why**: Directly calling external services (Buildbucket, Swarming, |
| RBE-CAS) in tests makes them slow, flaky, and dependent on external |
| state. Replay testing records actual interactions once and then |
| "replays" them during subsequent test runs. |
| - **How**: |
| - HTTP interactions (with Buildbucket and Swarming PRPC servers) are |
| replayed using `cloud.google.com/go/httpreplay`. Recorded interactions |
| are stored as `.json` files in `testdata/`. |
| - gRPC interactions (with RBE-CAS) are replayed using |
| `cloud.google.com/go/rpcreplay`. Recorded interactions are stored as |
| gzipped `.rpc` files in `testdata/`. |
| - A command-line flag (`-record_path`) controls whether tests run in |
| replay mode (reading from `testdata/`) or record mode (writing new |
| replay files to the specified path). This allows updating replay files |
| when external APIs change or new test cases are needed. |
| - `setupReplay()` and `newRBEReplay()` in `replay_test.go` are helper |
| functions that configure the HTTP client and RBE client for either |
| recording or replaying. |
| - **Test Data (`testdata/perftest/`)**: Contains various `perf_results.json` |
| files (e.g., `full.json`, `empty.json`, `merged.json`) to test different |
| scenarios for the `perf_results_parser.go`. This ensures the parser |
| correctly handles different valid and edge-case inputs. |
| - **Example Usage as Test (`cli/main.go`)**: The CLI itself serves as an |
| integration test for the core loading and conversion logic. Its tests |
| (`perf_loader_test.go` for example) often use the replay mechanism to test |
| the end-to-end flow from Build ID to parsed `PerfResults`. |
| |
| This combination ensures both isolated unit correctness and reliable integration |
| testing without external dependencies during typical test runs. |
| |
| # Module: /go/perfserver |
| |
| The `perfserver` module serves as the central executable for the Perf |
| performance monitoring system. It consolidates various essential components into |
| a single command-line tool, simplifying deployment and management. The primary |
| goal is to provide a unified entry point for running the web UI, data ingestion |
| processes, regression detection, and maintenance tasks. This approach avoids the |
| complexity of managing multiple separate services and their configurations. |
| |
| The module leverages the `urfave/cli` library to define and manage sub-commands, |
| each corresponding to a distinct functional area of Perf. This design allows for |
| clear separation of concerns while maintaining a single binary. Configuration |
| for each sub-command is handled through flags, with the `config` package |
| providing structured types for these flags. |
| |
| Key components and their responsibilities: |
| |
| - **`main.go`**: This is the entry point of the `perfserver` executable. |
| |
| - **Why**: It orchestrates the initialization and execution of the |
| different Perf sub-systems. |
| - **How**: It defines a `cli.App` with several sub-commands (a Go sketch
| of this wiring follows the component list):
| |
| - **`frontend`**: This sub-command launches the main web user interface |
| for Perf. |
| |
| - **Why**: To provide users with a visual way to explore performance |
| data, configure alerts, and view regressions. |
| - **How**: It initializes and runs the `frontend` component (from |
| `//perf/go/frontend`). Configuration is passed via |
| `config.FrontendFlags`. The `frontend` component itself handles |
| serving HTTP requests and rendering the UI. |
| |
| - **`maintenance`**: This sub-command starts background maintenance tasks. |
| |
| - **Why**: Certain operations, like data cleanup, schema migrations, |
| or periodic recalculations, are necessary for the long-term health |
| and efficiency of the Perf system. These tasks often need to be run |
| as singletons to avoid conflicts. |
| - **How**: It initializes and runs the `maintenance` component (from |
| `//perf/go/maintenance`). It first validates the instance |
| configuration (using `//perf/go/config/validate`) and then starts |
| the maintenance routines. Prometheus metrics are exposed for |
| monitoring. |
| |
| - **`ingest`**: This sub-command runs the data ingestion process. |
| |
| - **Why**: To continuously import performance data from various |
| sources (e.g., build artifacts, test results) and populate the |
| central data store (TraceStore). |
| - **How**: It initializes and runs the ingestion process logic (from |
| `//perf/go/ingest/process`). Similar to `maintenance`, it validates |
| the instance configuration. It supports parallel ingestion for |
| improved throughput. Prometheus metrics are also exposed. |
| - Data Ingestion Workflow: `Configured Sources --> [Ingest Process
| (handles incoming files, parses/validates)] --> [TraceStore (data
| populated)]`
| |
| - **`cluster`**: This sub-command runs the regression detection process. |
| |
| - **Why**: To automatically analyze incoming performance data against |
| configured alerts and identify significant performance regressions. |
| - **How**: Notably, this sub-command reuses the same `frontend.New` and
| `Serve()` mechanism as the `frontend` sub-command. This suggests that the
| regression detection logic is tightly coupled with, or exposed through,
| the same underlying service framework as the main UI, potentially for
| sharing configuration or common infrastructure. It uses
| `config.FrontendFlags`, but specifically for clustering-related settings
| (indicated by `AsCliFlags(true)`).
| - Regression Detection Workflow: `[TraceStore] --New Data--> [Cluster
| Process] --Applies Alert Rules / Identifies Regressions-->
| [Alerts/Notifications]`
| |
| - **`markdown`**: A utility sub-command to generate Markdown documentation |
| for `perfserver` itself. |
| |
| - **Why**: To provide up-to-date command-line help in a portable |
| format. |
| - **How**: It uses the `ToMarkdown()` method provided by the |
| `urfave/cli` library. |
| |
| - **Logging**: The `Before` hook in the `cli.App` configures `sklog` to |
| output logs to standard output, ensuring that operational messages from |
| any sub-command are visible. |
| |
| - **Configuration Loading**: For sub-commands like `ingest` and |
| `maintenance`, instance configuration is loaded from a specified file |
| (`ConfigFilename` flag) and validated using `//perf/go/config/validate`. |
| The database connection string can be overridden via a command-line |
| flag. |
| |
| - **Metrics**: The `ingest` and `maintenance` sub-commands initialize |
| Prometheus metrics, allowing for monitoring of their operational health |
| and performance. |
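| 
| The sub-command wiring in `main.go` has roughly the following shape, shown
| here as a sketch using `urfave/cli` v2 conventions with the real actions
| and flags elided:
| 
| ```
| package main
| 
| import (
|     "fmt"
|     "os"
| 
|     "github.com/urfave/cli/v2"
| )
| 
| func main() {
|     app := &cli.App{
|         Name:  "perfserver",
|         Usage: "The single binary for all Perf services.",
|         Before: func(c *cli.Context) error {
|             // The real binary configures sklog to write to stdout here.
|             return nil
|         },
|         Commands: []*cli.Command{
|             {
|                 Name:  "frontend",
|                 Usage: "Run the web UI.",
|                 Action: func(c *cli.Context) error {
|                     fmt.Println("would start the frontend")
|                     return nil
|                 },
|             },
|             {
|                 Name:  "ingest",
|                 Usage: "Run the data ingestion process.",
|                 Action: func(c *cli.Context) error {
|                     fmt.Println("would start ingestion")
|                     return nil
|                 },
|             },
|             // maintenance, cluster, and markdown follow the same pattern.
|         },
|     }
|     if err := app.Run(os.Args); err != nil {
|         fmt.Fprintln(os.Stderr, err)
|         os.Exit(1)
|     }
| }
| ```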
| |
| The design emphasizes modularity by delegating the core logic of each function |
| (UI, ingestion, clustering, maintenance) to dedicated packages |
| (`//perf/go/frontend`, `//perf/go/ingest/process`, `//perf/go/maintenance`). |
| `perfserver` acts as the conductor, parsing command-line arguments, loading |
| appropriate configurations, and invoking the correct sub-system. This structure |
| makes the overall Perf system more maintainable and easier to understand, as |
| each component has a well-defined responsibility. |
| |
| # Module: /go/pinpoint |
| |
| The `/go/pinpoint` module provides a Go client for interacting with the Pinpoint |
| service, which is part of Chromeperf. Pinpoint is a performance testing and |
| analysis tool used to identify performance regressions and improvements. This |
| client enables other Go applications within the Skia infrastructure to |
| programmatically trigger Pinpoint jobs. |
| |
| **Core Functionality:** |
| |
| The primary purpose of this module is to abstract the complexities of making |
| HTTP requests to the Pinpoint API. It handles authentication, request |
| formatting, and response parsing. This allows other services to easily initiate |
| two main types of Pinpoint jobs: |
| |
| 1. **Bisect Jobs:** These jobs are used to identify the specific commit that |
| caused a performance regression or improvement between two given git |
| revisions. The client constructs the appropriate URL and parameters for the |
| `pinpointURL` endpoint. |
| 2. **Try Jobs (A/B Testing):** These jobs compare the performance of a base |
| commit (or patch) against an experimental commit (or patch). This is |
| particularly useful for evaluating the performance impact of a pending code |
| change. The client uses the `pinpointLegacyURL` for these types of jobs. |
| |
| **Design Decisions and Implementation Choices:** |
| |
| - **Separate Endpoints for Bisect and Try Jobs:** The Pinpoint service has |
| distinct API endpoints for creating bisect jobs (`pinpointURL`) and legacy |
| try jobs (`pinpointLegacyURL`). The client reflects this by having separate |
| methods (`CreateBisect` and `CreateTryJob`) and corresponding request URL |
| builder functions (`buildBisectRequestURL` and `buildTryJobRequestURL`). |
| This design choice directly maps to the underlying Pinpoint API structure, |
| making it clear which type of job is being created. |
| - **URL Parameter Encoding:** For both job types, all parameters are
| encoded in the URL query string rather than in a request body (the
| requests themselves are sent as HTTP POSTs, as described below). The
| `buildBisectRequestURL` and `buildTryJobRequestURL` functions are
| responsible for constructing these URLs by populating `url.Values` and
| then encoding them. This is a direct consequence of how the Pinpoint API
| is designed.
| - **Authentication:** The client utilizes Google's default token source for |
| authentication (`google.DefaultTokenSource`) with the |
| `auth.ScopeUserinfoEmail` scope. This is a standard approach for |
| service-to-service authentication within the Google Cloud ecosystem, |
| ensuring secure communication with the Pinpoint API. |
| - **Metrics Collection:** The client integrates with `go/metrics2` to track |
| the number of times bisect and try jobs are called and the number of times |
| these calls fail. This is crucial for monitoring the reliability and usage |
| of the Pinpoint integration. |
| - **Error Handling:** The module uses `go/skerr` for wrapping errors. This |
| provides more context to errors, making debugging easier. For example, if a |
| Pinpoint request fails, the HTTP status code and response body are included |
| in the error message. |
| - **Dependency on `pinpoint/go/bot_configs`:** For try jobs, the `target` |
| parameter is required by the Pinpoint API. This `target` is derived from the |
| `Configuration` (bot) and `Benchmark` using the |
| `bot_configs.GetIsolateTarget` function. This indicates a specific |
| configuration setup for running the performance tests. |
| - **`test_path` Parameter for Bisect Jobs:** The Pinpoint API requires a |
| `test_path` parameter for bisect jobs. This parameter is constructed by |
| joining several components like "ChromiumPerf", configuration, benchmark, |
| chart, and story. This specific formatting is a legacy requirement of the |
| Chromeperf API. |
| - **Mandatory `bug_id` for Bisect Jobs:** The Pinpoint API mandates the |
| `bug_id` parameter for bisect jobs. If not provided by the caller, the |
| client defaults it to `"null"`. This reflects a specific constraint of the |
| upstream service. |
| - **`tags` Parameter:** Both job types include a `tags` parameter set to |
| `{"origin":"skia_perf"}`. This helps in tracking and filtering jobs |
| originating from the Skia infrastructure within the Pinpoint system. |
| |
| **Key Components/Files:** |
| |
| - **`pinpoint.go`:** This is the sole Go file in the module and contains all |
| the logic. |
| - **`Client` struct:** Represents the Pinpoint client. It holds the |
| authenticated `http.Client` and counters for metrics. |
| - **`New()` function:** The constructor for the `Client`. It initializes |
| the HTTP client with appropriate authentication. |
| - **`CreateLegacyTryRequest` and `CreateBisectRequest` structs:** Define |
| the structure of the data required to create try jobs and bisect jobs, |
| respectively. These fields directly map to the parameters expected by |
| the Pinpoint API. |
| - **`CreatePinpointResponse` struct:** Defines the structure of the JSON |
| response from Pinpoint, which includes the `JobID` and `JobURL`. |
| - **`CreateTryJob()` method:** |
| - Takes a `CreateLegacyTryRequest` and a `context.Context`. |
| - Calls `buildTryJobRequestURL` to construct the request URL. |
| - Makes an HTTP POST request to the `pinpointLegacyURL` (the parameters
| are carried in the URL query string, but the endpoint expects POST for
| job creation).
| - Parses the JSON response into a `CreatePinpointResponse`. |
| - Handles errors and increments metrics. |
| - **`CreateBisect()` method:** |
| - Similar to `CreateTryJob()`, but takes a `CreateBisectRequest`. |
| - Calls `buildBisectRequestURL`. |
| - Makes an HTTP POST request to the `pinpointURL`. |
| - Parses the response and handles errors/metrics. |
| - **`buildTryJobRequestURL()` function:** |
| - Takes a `CreateLegacyTryRequest`. |
| - Validates required fields like `Benchmark` and `Configuration`. |
| - Retrieves the `target` using `bot_configs.GetIsolateTarget`. |
| - Populates `url.Values` with all relevant parameters from the request, |
| including hardcoded values like `comparison_mode` and `tags`. |
| - Returns the fully formed URL string. |
| - **`buildBisectRequestURL()` function:** |
| - Takes a `CreateBisectRequest`. |
| - Populates `url.Values` with parameters from the request. |
| - Sets a default value for `bug_id` if not provided. |
| - Constructs the `test_path` parameter based on available request fields. |
| - Includes the `tags` parameter. |
| - Returns the fully formed URL string. |
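| 
| A condensed sketch of the bisect URL construction follows. Only `bug_id`,
| `test_path`, and `tags` are taken from the description above; the base URL
| and the remaining parameter names are illustrative assumptions:
| 
| ```
| package main
| 
| import (
|     "fmt"
|     "net/url"
|     "strings"
| )
| 
| func buildBisectURL(base, configuration, benchmark, chart, story,
|     startGitHash, endGitHash, bugID string) string {
|     if bugID == "" {
|         bugID = "null" // The API mandates bug_id, so default it.
|     }
|     v := url.Values{}
|     v.Set("bug_id", bugID)
|     v.Set("start_commit", startGitHash) // Illustrative parameter name.
|     v.Set("end_commit", endGitHash)     // Illustrative parameter name.
|     v.Set("test_path", strings.Join(
|         []string{"ChromiumPerf", configuration, benchmark, chart, story}, "/"))
|     v.Set("tags", `{"origin":"skia_perf"}`)
|     return base + "?" + v.Encode()
| }
| 
| func main() {
|     fmt.Println(buildBisectURL("https://pinpoint.example/api/new",
|         "linux-perf", "speedometer2", "RunsPerMinute", "story1",
|         "abc123", "def456", ""))
| }
| ```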
| |
| **Key Workflows:** |
| |
| 1. **Creating a Bisect Job:** |
| |
| ``` |
| Application Code go/pinpoint.Client Pinpoint API |
| ---------------- ------------------ ------------ |
| 1. CreateBisectRequest data ----> |
| 2. Calls client.CreateBisect() --> |
| 3. buildBisectRequestURL() |
| (constructs URL with params) |
| 4. HTTP POST to pinpointURL --------> |
| 5. Processes request |
| 6. Returns JSON response |
| <----------------------------------- 7. Receives HTTP response |
| 8. Parses JSON into |
| CreatePinpointResponse |
| <--------------------------------- 9. Returns CreatePinpointResponse |
| ``` |
| |
| 2. **Creating a Try Job (A/B Test):** |
| |
| ``` |
| Application Code go/pinpoint.Client Pinpoint API (Legacy) |
| ---------------- ------------------ --------------------- |
| 1. CreateLegacyTryRequest data -> |
| 2. Calls client.CreateTryJob() --> |
| 3. buildTryJobRequestURL() |
| (gets 'target' from bot_configs, |
| constructs URL with params) |
| 4. HTTP POST to pinpointLegacyURL -----> |
| 5. Processes request |
| 6. Returns JSON response |
| <---------------------------------------- 7. Receives HTTP response |
| 8. Parses JSON into |
| CreatePinpointResponse |
| <--------------------------------- 9. Returns CreatePinpointResponse |
| ``` |
| |
| # Module: /go/pivot |
| |
| ## Pivot Module Documentation |
| |
| ### High-Level Overview |
| |
| The `pivot` module provides functionality analogous to pivot tables in |
| spreadsheets or `GROUP BY` operations in SQL. Its primary purpose is to |
| aggregate and summarize trace data within a `DataFrame` based on specified |
| grouping criteria and operations. This allows users to transform raw trace data |
| into more insightful, summarized views, facilitating comparisons and analysis |
| across different dimensions of the data. For example, one might want to compare |
| the performance of 'arm' architecture machines against 'intel' architecture |
| machines by summing or averaging their respective performance metrics. |
| |
| ### Design and Implementation |
| |
| The core of the `pivot` module revolves around the `Request` struct and the |
| `Pivot` function. |
| |
| **`Request` Struct:** |
| |
| The `Request` struct encapsulates the parameters for a pivot operation. It |
| defines: |
| |
| - **`GroupBy`**: A slice of strings representing the parameter keys to group |
| the traces by. This is the fundamental dimension along which the data will |
| be aggregated. For instance, if `GroupBy` is `["arch"]`, all traces with the |
| same 'arch' value will be grouped together. |
| - **`Operation`**: An `Operation` type (e.g., `Sum`, `Avg`, `Geo`) that |
| specifies how the values within each group of traces should be combined. |
| This operation is applied to each point in the traces within a group, |
| resulting in a new, summarized trace for that group. |
| - **`Summary`**: An optional slice of `Operation` types. If provided, these |
| operations are applied to the _resulting_ traces from the `GroupBy` step. |
| Each `Summary` operation generates a single value (a column in the final |
| output if viewed as a table) for each grouped trace. If `Summary` is empty, |
| the output is a `DataFrame` where each row is a summarized trace (suitable |
| for plotting). |
| |
| **`Pivot` Function Workflow:** |
| |
| The `Pivot` function executes the aggregation and summarization process. Here's |
| a breakdown of its key steps and the reasoning behind them: |
| |
| 1. **Input Validation (`req.Valid()`):** |
| |
| - **Why:** To ensure the request is well-formed before proceeding with |
| potentially expensive computations. This prevents errors due to missing |
| `GroupBy` keys or invalid `Operation` or `Summary` values. |
| - **How:** It checks if `GroupBy` is non-empty and if the specified |
| `Operation` and `Summary` operations are among the predefined valid |
| operations (`AllOperations`). |
| |
| 2. **Initialization and Grouping Structure (`groupedTraceSets`):** |
| |
| - **Why:** To efficiently organize traces into their respective groups. A |
| map is used where keys are the group identifiers (e.g., ",arch=arm,") |
| and values are `types.TraceSet` containing traces belonging to that |
| group. |
| - **How:** |
| - It pre-populates `groupedTraceSets` by determining all possible |
| unique combinations of values for the `GroupBy` keys present in the |
| input `DataFrame`'s `ParamSet`. This is done using |
| `df.ParamSet.CartesianProduct(req.GroupBy)`. This pre-population |
| ensures that even groups with no matching traces are considered, |
| although they will be filtered out later if they remain empty. |
| - It then iterates through each trace in the input `DataFrame` |
| (`df.TraceSet`). |
| - For each trace, it extracts the relevant parameter values specified |
| in `req.GroupBy` to form a `groupKey` using `groupKeyFromTraceKey`. |
| This function ensures that only traces containing _all_ the |
| `GroupBy` keys contribute to a group. If a trace is missing a |
| `GroupBy` key, it's ignored. |
| - The trace is then added to the `types.TraceSet` associated with its |
| `groupKey` in `groupedTraceSets`. |
| |
| ``` |
| Input DataFrame (df.TraceSet) |
| | |
| v |
| For each traceID, trace in df.TraceSet: |
| Parse traceID into params |
| groupKey = groupKeyFromTraceKey(params, req.GroupBy) |
| If groupKey is valid: |
| Add trace to groupedTraceSets[groupKey] |
| | |
| v |
| Grouped Traces (groupedTraceSets) |
| ``` |
| |
| 3. **Applying the GroupBy Operation:** |
| |
| - **Why:** To perform the primary aggregation based on the |
| `req.Operation`. |
| - **How:** |
| - It iterates through the `groupedTraceSets`. |
| - For each non-empty group, it applies the `groupByOperation` function |
| corresponding to `req.Operation` (obtained from `opMap`) to the |
| `types.TraceSet` of that group. The `opMap` is a crucial design |
| choice, mapping `Operation` constants to their respective |
| implementation functions (one for grouping traces, another for |
| summarizing single traces). This provides a clean and extensible way |
| to manage different aggregation functions. |
| - The result of this operation is a single summarized trace for that |
| group, which is stored in the `ret.TraceSet` of the new `DataFrame`. |
| - Context cancellation (`ctx.Err()`) is checked periodically to allow |
| for early termination if the operation is cancelled. |
| |
| ``` |
| Grouped Traces (groupedTraceSets) |
| | |
| v |
| For each groupID, traces in groupedTraceSets: |
| If len(traces) > 0: |
| summarizedTrace = opMap[req.Operation].groupByOperation(traces) |
| ret.TraceSet[groupID] = summarizedTrace |
| | |
| v |
| DataFrame with GroupBy Applied (ret) |
| ``` |
| |
| 4. **Building ParamSet for the Result:** |
| |
| - **Why:** The resulting `DataFrame` needs its own `ParamSet` reflecting |
| the new structure where trace keys only contain the `GroupBy` |
| parameters. |
| - **How:** `ret.BuildParamSet()` is called. |
| |
| 5. **Applying Summary Operations (Optional):** |
| |
| - **Why:** To further reduce the data into single summary values per group |
| if `req.Summary` is specified. This is useful for generating tabular |
| summaries rather than plots. |
| - **How:** |
| - If `req.Summary` is empty, the original `DataFrame`'s `Header` is |
| used for the new `DataFrame`, and the function returns. The result |
| is a `DataFrame` of summarized traces. |
| - If `req.Summary` is not empty: |
| - It iterates through each summarized trace in `ret.TraceSet`. |
| - For each trace, it creates a new `types.Trace` (called |
| `summaryValues`) whose length is equal to the number of `Summary` |
| operations. |
| - For each `Operation` in `req.Summary`, it applies the corresponding |
| `summaryOperation` function (from `opMap`) to the current grouped |
| trace. The result is stored in `summaryValues`. |
| - The original summarized trace in `ret.TraceSet[groupKey]` is |
| replaced with `summaryValues`. |
| - The `Header` of the `ret` `DataFrame` is rebuilt. Each column in the |
| header now corresponds to one of the `Summary` operations, with |
| offsets from 0 to `len(req.Summary) - 1`. |
| |
| ``` |
| DataFrame with GroupBy Applied (ret) |
| | |
| v |
| If len(req.Summary) > 0: |
| For each groupKey, trace in ret.TraceSet: |
| summaryValues = new Trace of length len(req.Summary) |
| For i, op in enumerate(req.Summary): |
| summaryValues[i] = opMap[op].summaryOperation(trace) |
| ret.TraceSet[groupKey] = summaryValues |
| Adjust ret.Header to match Summary operations |
| | |
| v |
| Final Pivoted DataFrame (ret) |
| ``` |
| |
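| For illustration, a call into this module might look like the following
| sketch. The `Request` fields and the operation constants come from the
| description above; the exact `Pivot` signature, argument order, and import
| paths are assumptions for the example:
| 
| ```
| // Sketch only: groups all traces by "config", averages each group
| // pointwise, then reduces each group to two summary columns.
| func pivotByConfig(ctx context.Context, df *dataframe.DataFrame) (*dataframe.DataFrame, error) {
|     req := pivot.Request{
|         GroupBy:   []string{"config"},
|         Operation: pivot.Avg,
|         Summary:   []pivot.Operation{pivot.Avg, pivot.Max},
|     }
|     if err := req.Valid(); err != nil {
|         return nil, err
|     }
|     // Each resulting trace key contains only the GroupBy parameters,
|     // e.g. ",config=8888,", and with Summary set each trace holds
|     // len(req.Summary) values rather than one value per commit.
|     return pivot.Pivot(ctx, req, df) // Assumed argument order.
| }
| ```
| 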
| **Operations (`Operation` type and `opMap`):** |
| |
| The module defines a set of standard operations like `Sum`, `Avg`, `Geo`, `Std`, |
| `Count`, `Min`, `Max`. |
| |
| - **Why:** To provide common aggregation methods. |
| - **How:** |
| - Each `Operation` is a string constant. |
| - The `opMap` is a map where each `Operation` key maps to an |
| `operationFunctions` struct. This struct holds two function pointers: |
| - `groupByOperation`: Takes a `types.TraceSet` (a group of traces) and |
| returns a single aggregated `types.Trace`. These functions are typically |
| sourced from the `go/calc` module. |
| - `summaryOperation`: Takes a single `[]float32` (a trace) and returns a |
| single `float32` summary value. These functions are typically sourced |
| from `go/vec32` or defined locally (like `stdDev`). |
| - This design makes it easy to add new operations by defining the constant |
| and adding corresponding entries to `opMap` with the appropriate |
| implementation functions. |
| |
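| A rough sketch of this dispatch table (the struct and map names follow the
| prose above; the averaging helpers are illustrative stand-ins that ignore
| missing-data sentinels, which the real implementations handle):
| 
| ```
| // operationFunctions pairs the two flavors of each Operation.
| type operationFunctions struct {
|     groupByOperation func(types.TraceSet) types.Trace // group -> one trace
|     summaryOperation func([]float32) float32          // trace -> one value
| }
| 
| var opMap = map[Operation]operationFunctions{
|     Avg: {groupByOperation: avgTraces, summaryOperation: mean},
|     // ... one entry per Operation constant; adding a new operation is
|     // just a new constant plus an entry here.
| }
| 
| // avgTraces averages all traces in a group pointwise (illustrative).
| func avgTraces(ts types.TraceSet) types.Trace {
|     var out types.Trace
|     n := float32(len(ts))
|     for _, tr := range ts {
|         if out == nil {
|             out = make(types.Trace, len(tr))
|         }
|         for i, v := range tr {
|             out[i] += v / n
|         }
|     }
|     return out
| }
| 
| // mean reduces one trace to a single value (illustrative).
| func mean(xs []float32) float32 {
|     var sum float32
|     for _, x := range xs {
|         sum += x
|     }
|     return sum / float32(len(xs))
| }
| ```
| 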
| **Error Handling:** |
| |
| - **Why:** To provide clear feedback on invalid inputs or issues during |
| processing. |
| - **How:** The `Pivot` function returns an error if `req.Valid()` fails or if |
| an error occurs during grouping (e.g., a `GroupBy` key is not found in the |
| `ParamSet` of the input DataFrame). Context cancellation is also handled, |
| allowing long-running pivot operations to be interrupted. Errors are wrapped |
| using `skerr.Wrap` to provide context. |
| |
| ### Key Components and Files |
| |
| - **`pivot.go`**: This is the main file containing all the logic for the pivot |
| functionality. |
| |
| - **`Request` struct**: Defines the parameters for a pivot operation. Its |
| design allows for flexible grouping and summarization. |
| - **`Operation` type and constants**: Define the set of available |
| aggregation operations. |
| - **`opMap` variable**: A critical data structure mapping `Operation` |
| types to their respective implementation functions for both grouping and |
| summarizing. This is the heart of how different operations are |
| dispatched. |
| - **`Pivot` function**: The primary public function that performs the |
| pivot operation. Its step-by-step process of grouping, applying the main |
| operation, and then optionally applying summary operations is central to |
| its functionality. |
| - **`groupKeyFromTraceKey` function**: A helper function responsible for |
| constructing the group identifier for each trace based on the `GroupBy` |
| keys. It handles cases where a trace might not have all the required |
| keys. |
| - **`Valid()` method on `Request`**: Ensures that the pivot request is |
| well-formed before processing begins. |
| |
| - **`pivot_test.go`**: Contains unit tests for the `pivot` module. |
| |
| - **Why:** To ensure the correctness and robustness of the pivot logic |
| under various scenarios, including valid inputs, invalid inputs, edge |
| cases (like empty groups or traces not matching any group), and context |
| cancellation. |
| - **How:** It uses the `testify` assertion library and defines test cases |
| that cover different aspects of the `Request` validation, |
| `groupKeyFromTraceKey` logic, and the `Pivot` function itself with |
| various combinations of `Operation` and `Summary` settings. The |
| `dataframeForTesting()` helper function provides a consistent dataset |
| for testing. |
| |
| This module is designed to be a general-purpose tool for transforming and |
| understanding large datasets of traces by allowing users to aggregate data along |
| arbitrary dimensions and apply various statistical operations. |
| |
| # Module: /go/progress |
| |
| The `/go/progress` module provides a mechanism for tracking the progress of |
| long-running tasks on the backend and exposing this information to the UI. This |
| is crucial for user experience in applications where operations like data |
| queries or complex computations can take a significant amount of time. Without |
| progress tracking, users might perceive the application as unresponsive or |
| encounter timeouts. |
| |
| ### Why: The Need for Asynchronous Task Monitoring |
| |
| Many backend operations, such as those initiated by API endpoints like |
| `/frame/start` or `/dryrun/start`, are asynchronous. The initial HTTP request |
| might return quickly, but the actual work continues in the background. This |
| module addresses the need to: |
| |
| 1. **Provide Feedback:** Inform the user that a task is ongoing and how it's |
| progressing. |
| 2. **Avoid Timeouts:** Prevent HTTP requests from timing out while waiting for |
| a long task to complete. The UI can poll for updates instead of holding a |
| connection open. |
| 3. **Communicate Complex State:** Allow tasks to report detailed, multi-stage |
| progress information, not just a simple percentage. For example, a "dry run" |
| might involve several distinct steps, each with its own status and relevant |
| data. |
| |
| ### How: Design and Implementation |
| |
| The core idea is to represent the state of a long-running task as a `Progress` |
| object. This object can be updated by the task as it executes. A `Tracker` then |
| manages multiple `Progress` objects, making them accessible via HTTP polling. |
| |
| **Key Components:** |
| |
| - **`progress.go`**: Defines the `Progress` interface and its concrete |
| implementation `progress`. |
| |
| - **`Progress` Interface**: This is the central abstraction for a single |
| long-running task. |
| - **Why an interface?** It allows for potential future extensions or |
| alternative implementations (e.g., different storage mechanisms for |
| progress data if needed, though the current implementation is |
| in-memory). |
| - **Key Methods:** |
| - `Message(key, value string)`: Allows the task to report arbitrary |
| key-value string pairs. This is flexible enough to accommodate |
| diverse progress information (e.g., current step, commit being |
| processed, number of items filtered). If a key already exists, its |
| value is updated. |
| - `Results(interface{})`: Stores intermediate or final results of the |
| task. The `interface{}` type allows any JSON-serializable data to be |
| stored. This is useful for showing partial results or accumulating |
| data incrementally. |
| - `Error(string)`: Marks the task as failed and stores an error |
| message. |
| - `Finished()`: Marks the task as successfully completed. |
| - `FinishedWithResults(interface{})`: Atomically sets the results and |
| marks the task as finished. This is preferred over separate |
| `Results()` and `Finished()` calls to avoid race conditions where |
| the UI might poll between the two calls. |
| - `Status() Status`: Returns the current status (`Running`, |
| `Finished`, `Error`). |
| - `URL(string)`: Sets the URL that the client should poll for further |
| updates. This is typically set by the `Tracker`. |
| - `JSON(w io.Writer) error`: Serializes the current progress state |
| (status, messages, results, next URL) into JSON and writes it to the |
| provided writer. |
| - **`progress` struct (concrete implementation)**: |
| - Uses a `sync.Mutex` to ensure thread-safe updates to its internal |
| `SerializedProgress` state. This is critical because long-running tasks |
| often execute in separate goroutines, and the `Progress` object might be |
| accessed concurrently by the task updating its state and by the |
| `Tracker` serving HTTP requests. |
| - Maintains its state in a `SerializedProgress` struct, which is designed |
| for easy JSON serialization. |
| - **State Transitions:** A `Progress` object starts in the `Running` |
| state. Once it transitions to `Finished` or `Error`, it becomes |
| immutable. Any attempt to modify it (e.g., calling `Message()` or |
| `Results()` again) will result in a panic. This design simplifies |
| reasoning about the lifecycle of a task's progress. |
|   - **`SerializedProgress` struct**: Defines the JSON structure sent to the
|     client. It includes the `Status`, an array of `Message` (key-value
|     pairs), the `Results` (if any), and the `URL` for the next poll (see
|     the sketch after this list).
| - **`Status` enum**: `Running`, `Finished`, `Error`. |
| |
| - **`tracker.go`**: Defines the `Tracker` interface and its concrete |
| implementation `tracker`. |
| |
| - **`Tracker` Interface**: Manages a collection of `Progress` objects. |
| - `Add(prog Progress)`: Registers a new `Progress` object with the |
| tracker. The tracker assigns a unique ID to this progress and sets its |
| polling URL. |
| - `Handler(w http.ResponseWriter, r *http.Request)`: An HTTP handler |
| function that clients use to poll for progress updates. It extracts the |
| progress ID from the request URL, retrieves the corresponding `Progress` |
| object, and sends its JSON representation. |
| - `Start(ctx context.Context)`: Starts a background goroutine for periodic |
| cleanup of completed tasks from the cache. |
| - **`tracker` struct (concrete implementation)**: |
| - **`lru.Cache`**: Uses a Least Recently Used (LRU) cache |
| (`github.com/hashicorp/golang-lru`) to store `cacheEntry` objects. |
| - **Why LRU?** To prevent unbounded memory growth if many tasks are |
| tracked. Older, completed tasks are eventually evicted. |
| - **`basePath`**: A string prefix for the polling URLs (e.g., |
| `/_/status/`). Each progress object gets a unique ID appended to this |
| base path to form its polling URL. |
| - **`cacheEntry` struct**: Wraps a `Progress` object and a `Finished` |
| timestamp. The timestamp is used by the cleanup routine to determine |
| when a completed task can be removed from the cache. |
| - **Cleanup Mechanism**: |
| - The `Start` method launches a goroutine that periodically calls |
| `singleStep`. |
| - `singleStep` iterates through the cache: |
| - It updates the `Finished` timestamp in a `cacheEntry` when the |
| corresponding `Progress` object transitions out of the `Running` |
| state. |
| - It removes entries from the cache if they have been in a `Finished` |
| or `Error` state for longer than `cacheDuration` (currently 5 |
| minutes). This prevents the cache from holding onto completed tasks |
| indefinitely. |
| - This ensures that resources are eventually freed up while still |
| allowing clients a reasonable window to fetch the final results of a |
| completed task. |
| - **UUIDs for IDs**: Uses `github.com/google/uuid` to generate unique IDs |
| for each tracked `Progress`. This makes the polling URLs distinct and |
| hard to guess. |
| |
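| An illustrative sketch of the polled JSON shape implied by
| `SerializedProgress` above (the field names and JSON tags here are
| assumptions, not the verbatim source):
| 
| ```
| // Illustrative shape of the polled progress JSON.
| type Status string // "Running", "Finished", or "Error".
| 
| type Message struct {
|     Key   string `json:"key"`
|     Value string `json:"value"`
| }
| 
| type SerializedProgress struct {
|     Status   Status      `json:"status"`
|     Messages []Message   `json:"messages"`          // Arbitrary progress notes.
|     Results  interface{} `json:"results,omitempty"` // Task output, if any.
|     URL      string      `json:"url"`               // Where to poll next.
| }
| ```
| 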
| ### Key Workflows |
| |
| 1. **Starting and Tracking a Long-Running Task:** (a Go sketch follows
|    these workflows)
| |
| ``` |
| Backend HTTP Handler (e.g., /api/start_long_task) |
| | |
| | 1. Create a new Progress object: |
| | prog := progress.New() |
| | |
| | 2. Add it to the global Tracker instance: |
| | trackerInstance.Add(prog) // Tracker sets prog.URL() internally |
| | |
| | 3. Respond to the initial HTTP request with the Progress JSON. |
| | // The client now has prog.URL() to poll. |
| | prog.JSON(w) |
| | |
| V |
| Goroutine (executing the long-running task) |
| | |
| | 1. Periodically update progress: |
| | prog.Message("Step", "Processing item X") |
| | prog.Message("PercentComplete", "30%") |
| | prog.Results(partialData) // Optional: intermediate results |
| | |
| | 2. When finished: |
| | If error: |
| | prog.Error("Something went wrong") |
| | Else: |
| | prog.FinishedWithResults(finalData) |
| ``` |
| |
| 2. **Client Polling for Updates:** |
| |
| ``` |
| Client (e.g., browser UI) |
| | |
| | 1. Receives initial response with prog.URL (e.g., /_/status/some-uuid) |
| | |
| | 2. Makes a GET request to prog.URL |
| V |
| Backend Tracker.Handler |
| | |
| | 1. Extracts "some-uuid" from the request path. |
| | |
| | 2. Looks up the Progress object in its cache using "some-uuid". |
| | If not found --> HTTP 404 Not Found |
| | |
| | 3. Calls prog.JSON(w) to send the current state. |
| V |
| Client |
| | |
| | 1. Receives JSON with current status, messages, results. |
| | |
| | 2. If Status is "Running", schedules another poll to prog.URL. |
| | |
| | 3. If Status is "Finished" or "Error", displays final results/error and stops polling. |
| ``` |
| |
| 3. **Tracker Cache Management (Background Process):** |
| |
| ``` |
| Tracker.Start() |
| | |
| V |
| Goroutine (periodic execution, e.g., every minute) |
| | |
| | Calls tracker.singleStep() |
| | | |
| | V |
| | Iterate through cache entries: |
| | - If Progress.Status() is not "Running" AND cacheEntry.Finished is zero: |
| | Set cacheEntry.Finished = now() |
| | - If cacheEntry.Finished is not zero AND now() > cacheEntry.Finished + cacheDuration: |
| | Remove entry from cache |
| | - Update metrics (numEntriesInCache) |
| | |
| V |
| (Loop back to periodic execution) |
| ``` |
| |
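| The first workflow above, sketched in Go. `progress.New`, `Tracker.Add`,
| and the `Progress` methods follow the descriptions earlier in this
| section; the handler plumbing and `doExpensiveWork` are illustrative:
| 
| ```
| // Sketch of the handler + worker pattern from workflow 1.
| func startLongTask(tracker progress.Tracker) http.HandlerFunc {
|     return func(w http.ResponseWriter, r *http.Request) {
|         prog := progress.New()
|         tracker.Add(prog) // Assigns an ID and sets prog's polling URL.
| 
|         go func() {
|             prog.Message("Step", "starting")
|             results, err := doExpensiveWork()
|             if err != nil {
|                 prog.Error(err.Error())
|                 return
|             }
|             // Atomically store results and mark Finished so a poll can
|             // never observe results without the Finished status.
|             prog.FinishedWithResults(results)
|         }()
| 
|         // Respond immediately; the client polls the URL in this JSON.
|         if err := prog.JSON(w); err != nil {
|             http.Error(w, err.Error(), http.StatusInternalServerError)
|         }
|     }
| }
| 
| func doExpensiveWork() (interface{}, error) { return 42, nil } // Stand-in.
| ```
| 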
| This system provides a robust and flexible way to communicate the progress of |
| backend tasks to the user interface, improving the overall user experience for |
| operations that might otherwise seem opaque or unresponsive. The use of JSON for |
| data interchange makes it easy for web frontends to consume the progress |
| information. |
| |
| # Module: /go/psrefresh |
| |
| The `psrefresh` module is designed to manage and provide access to |
| `paramtools.ParamSet` instances, which are collections of key-value pairs |
| representing the parameters of traces in a performance monitoring system. The |
| primary goal is to efficiently retrieve and cache these parameter sets, |
| especially for frequently accessed queries, to reduce database load and improve |
| response times. |
| |
| The module addresses the need for up-to-date parameter sets by periodically |
| fetching data from a trace store (represented by the `OPSProvider` interface). |
| It combines parameter sets from recent time intervals (tiles) to provide a |
| comprehensive view of available parameters. |
| |
| A key challenge is handling potentially large and complex parameter sets. To |
| mitigate this, the module offers a caching layer (`CachedParamSetRefresher`). |
| This caching mechanism is configurable and can pre-populate caches (e.g., local |
| in-memory or Redis) with filtered parameter sets based on predefined query |
| levels. This pre-population significantly speeds up queries that match these |
| common filter patterns. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`psrefresh.go`**: |
| |
| - Defines the core interfaces `OPSProvider` and `ParamSetRefresher`. |
| - `OPSProvider`: Abstractly represents a source of ordered parameter sets |
| (e.g., a trace data store). It provides methods to get the latest "tile" |
| (a time-based segment of data) and the parameter set for a specific |
| tile. This abstraction allows `psrefresh` to be independent of the |
| underlying data storage implementation. |
| - `ParamSetRefresher`: Defines the contract for components that can |
| provide the full parameter set and parameter sets filtered by a query. |
| It also includes a `Start` method to initiate the refresh process. |
| - Implements `defaultParamSetRefresher`, which is the standard |
| implementation of `ParamSetRefresher`. |
| - **Why**: This struct is responsible for the fundamental logic of |
| periodically fetching parameter sets from the `OPSProvider`. It merges |
| parameter sets from a configurable number of recent tiles to create a |
| comprehensive view. |
| - **How**: It uses a background goroutine (`refresh`) that periodically |
| calls `oneStep`. The `oneStep` method fetches the latest tile, then |
| iterates backward through the configured number of previous tiles, |
| retrieving and merging their parameter sets using |
| `paramtools.ParamSet.AddParamSet`. The resulting merged set is then |
| normalized and stored. |
| - A `sync.Mutex` is used to protect concurrent access to the `ps` |
| (paramtools.ReadOnlyParamSet) field, ensuring thread safety when |
| `GetAll` is called. |
| - `GetParamSetForQuery` delegates the actual filtering and counting of |
| traces to a `dataframe.DataFrameBuilder`, demonstrating a separation of |
| concerns. |
| - `UpdateQueryValueWithDefaults` is a helper to automatically add default |
| parameter selections to queries if configured, simplifying common query |
| patterns. |
| |
| - **`cachedpsrefresh.go`**: |
| |
| - Implements `CachedParamSetRefresher`, which wraps a |
| `defaultParamSetRefresher` and adds a caching layer. |
| - **Why**: To improve performance for common queries by avoiding repeated |
| database lookups or expensive filtering operations. For frequently |
| accessed subsets of data (e.g., specific benchmarks or configurations), |
| retrieving pre-computed parameter sets from a cache is much faster. |
| - **How**: It takes a `cache.Cache` instance (which could be local, Redis, |
| etc.) and a `defaultParamSetRefresher`. |
| - `PopulateCache`: This is a crucial method that proactively fills the |
| cache. It uses the `QueryCacheConfig` (part of `config.QueryConfig`) to |
| determine which levels of parameter sets to cache. |
| - It starts by getting the full parameter set from the underlying |
| `psRefresher`. |
| - It then iterates through configured "Level 1" parameter keys and |
| their specified values. For each combination, it performs a |
| `PreflightQuery` (via the `dfBuilder`) to get the filtered parameter |
| set and the count of matching traces. |
| - Both the filtered parameter set (as a string) and the count are |
| stored in the cache using distinct keys. |
| - If "Level 2" keys and values are configured, it recursively calls |
| `populateChildLevel` to cache parameter sets for combinations of |
| Level 1 and Level 2 parameters. |
| - The cache keys are generated by `paramSetKey` and `countKey`, |
| ensuring a consistent naming scheme. |
| - `GetParamSetForQuery`: When a query is made, |
| `getParamSetForQueryInternal` first tries to retrieve the result from |
| the cache. |
| - It determines the appropriate cache key based on the query |
| parameters and the configured cache levels (`getParamSetKey`). It |
| only attempts to serve from the cache if the query matches the |
| configured cache levels (1 or 2 parameters, potentially adjusted for |
| default parameters). |
| - If a cache hit occurs, it reconstructs the `paramtools.ParamSet` |
| from the cached string and retrieves the count. |
| - If there's a cache miss or an error, it falls back to the underlying |
| `psRefresher.GetParamSetForQuery`. |
| - `StartRefreshRoutine`: This method starts a goroutine that periodically |
| calls `PopulateCache` to keep the cached data fresh. |
| |
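| The merge performed by `oneStep` (workflow 1 below) reduces to roughly the
| following sketch; the method names follow the description above, while the
| exact signatures and error handling are simplified assumptions:
| 
| ```
| // Sketch of the oneStep merge loop.
| func mergedParamSet(ctx context.Context, ops OPSProvider, numParamSets int) (paramtools.ReadOnlyParamSet, error) {
|     tile, err := ops.GetLatestTile(ctx)
|     if err != nil {
|         return nil, err
|     }
|     merged := paramtools.NewParamSet()
|     for i := 0; i < numParamSets; i++ {
|         ps, err := ops.GetParamSet(ctx, tile)
|         if err != nil {
|             return nil, err
|         }
|         merged.AddParamSet(ps) // Union of keys and values.
|         tile = tile.Prev()     // Walk backward through recent tiles.
|     }
|     merged.Normalize()
|     return merged.Freeze(), nil
| }
| ```
| 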
| **Key Workflows:** |
| |
| 1. **Initialization and Periodic Refresh (Default Refresher):** |
| |
| ``` |
| NewDefaultParamSetRefresher(opsProvider, ...) -> pf |
| pf.Start(refreshPeriod) |
| -> pf.oneStep() // Initial fetch |
| -> opsProvider.GetLatestTile() -> latestTile |
| -> LOOP (numParamSets times): |
| -> opsProvider.GetParamSet(tile) -> individualPS |
| -> mergedPS.AddParamSet(individualPS) |
| -> tile = tile.Prev() |
| -> mergedPS.Normalize() |
| -> pf.ps = mergedPS.Freeze() |
| -> GO pf.refresh() |
| -> LOOP (every refreshPeriod): |
| -> pf.oneStep() // Subsequent fetches |
| ``` |
| |
| 2. **Cache Population (Cached Refresher):** |
| |
| ``` |
| NewCachedParamSetRefresher(defaultRefresher, cacheImpl) -> cr |
| cr.StartRefreshRoutine(cacheRefreshPeriod) |
| -> cr.PopulateCache() // Initial population |
| -> defaultRefresher.GetAll() -> fullPS |
| -> // For each configured Level 1 key/value: |
| -> qValues = {level1Key: [level1Value]} |
| -> defaultRefresher.UpdateQueryValueWithDefaults(qValues) // If applicable |
| -> query.New(qValues) -> lv1Query |
| -> defaultRefresher.dfBuilder.PreflightQuery(ctx, lv1Query, fullPS) -> count, filteredPS |
| -> psCacheKey = paramSetKey(qValues, [level1Key]) |
| -> cr.addToCache(ctx, psCacheKey, filteredPS.ToString(), count) |
| -> // If Level 2 is configured: |
| -> cr.populateChildLevel(ctx, level1Key, level1Value, filteredPS, level2Key, level2Values) |
| -> // For each configured Level 2 value: |
| -> qValues = {level1Key: [level1Value], level2Key: [level2Value]} |
| -> ... (similar PreflightQuery and addToCache) |
| -> GO LOOP (every cacheRefreshPeriod): |
| -> cr.PopulateCache() // Subsequent cache refreshes |
| ``` |
| |
| 3. **Querying with Cache:**
| 
|    ```
|    cr.GetParamSetForQuery(ctx, queryObj, queryValues)
|    -> cr.getParamSetForQueryInternal(ctx, queryObj, queryValues)
|       -> cr.getParamSetKey(queryValues) -> cacheKey, err
|       -> IF cacheKey is valid AND exists:
|          -> cache.GetValue(ctx, cacheKey) -> cachedParamSetString
|          -> cache.GetValue(ctx, countKey(cacheKey)) -> cachedCountString
|          -> paramtools.FromString(cachedParamSetString) -> paramSet
|          -> strconv.ParseInt(cachedCountString) -> count
|          -> RETURN count, paramSet, nil
|       -> ELSE (cache miss or invalid key for caching):
|          -> defaultRefresher.GetParamSetForQuery(ctx, queryObj, queryValues) -> count, paramSet, err
|          -> RETURN count, paramSet, err
|    ```
| |
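| The cache-hit/fall-back logic in workflow 3 reduces to a pattern like this
| sketch (the helper names follow the workflow above; the receiver fields
| and exact signatures are assumptions):
| 
| ```
| // Sketch of try-the-cache-then-fall-back from workflow 3.
| func (cr *CachedParamSetRefresher) getParamSetForQueryInternal(ctx context.Context, q *query.Query, values url.Values) (int64, paramtools.ParamSet, error) {
|     cacheKey, err := cr.getParamSetKey(values)
|     if err == nil && cacheKey != "" {
|         psStr, psErr := cr.cache.GetValue(ctx, cacheKey)
|         countStr, countErr := cr.cache.GetValue(ctx, countKey(cacheKey))
|         if psErr == nil && countErr == nil && psStr != "" {
|             count, _ := strconv.ParseInt(countStr, 10, 64)
|             return count, paramtools.FromString(psStr), nil
|         }
|     }
|     // Cache miss, error, or a query shape that is not cached.
|     return cr.psRefresher.GetParamSetForQuery(ctx, q, values)
| }
| ```
| 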
| The use of `config.QueryConfig` and `config.Experiments` allows for |
| instance-specific tuning of caching behavior (which keys/values to pre-populate) |
| and handling of default parameters. The separation between |
| `defaultParamSetRefresher` and `CachedParamSetRefresher` promotes modularity, |
| allowing the caching layer to be optional or replaced with different caching |
| strategies if needed. |
| |
| # Module: /go/redis |
| |
| The `redis` module in Skia Perf is designed to manage interactions with Redis |
| instances, primarily to support and optimize the query UI. It leverages Redis |
| for caching frequently accessed data, thereby improving the responsiveness and |
| performance of the Perf frontend. |
| |
| The core idea is to periodically fetch information about available Redis |
| instances within a Google Cloud Project and then interact with a specific, |
| configured Redis instance to store or retrieve cached data. This cached data |
| typically represents results of expensive computations or frequently requested |
| data points, like recent trace data for specific queries. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`redis.go`**: This is the central file of the module. |
| |
| - **`RedisWrapper` interface**: Defines the contract for Redis-related |
| operations. This abstraction allows for easier testing and potential |
| future replacements of the underlying Redis client implementation. The |
| key methods are: |
| - `StartRefreshRoutine`: Initiates a background process (goroutine) that |
| periodically discovers and interacts with the configured Redis instance. |
| - `ListRedisInstances`: Retrieves a list of all Redis instances available |
| within a specified GCP project and location. |
| - **`RedisClient` struct**: This is the concrete implementation of the |
| `RedisWrapper` interface. |
| - It holds a `gcp_redis.CloudRedisClient` for interacting with the Google |
| Cloud Redis API (e.g., listing instances). |
| - It also has a reference to `tracestore.TraceStore`, which is likely used |
| to fetch the data that needs to be cached in Redis. |
| - The `tilesToCache` field suggests that the caching strategy might |
| involve pre-calculating and storing "tiles" of data, which is a common |
| pattern in Perf systems for displaying graphs over time. |
| - **`NewRedisClient`**: The constructor for `RedisClient`. |
|   - **`StartRefreshRoutine`**:
|     - **Why**: To ensure that Perf is always aware of the correct Redis
|       instance to use and to periodically update the cache. Network
|       configurations or instance details might change, and this routine
|       helps adapt to such changes.
|     - **How**: It takes a `refreshPeriod` and a `config.InstanceConfig`
|       (which is actually `redis_client.RedisConfig` in the current
|       implementation, indicating the target project, zone, and instance
|       name). It then starts a goroutine that, at regular intervals defined
|       by `refreshPeriod`:
|       - Calls `ListRedisInstances` to get all Redis instances in the
|         configured project/zone.
|       - Iterates through the instances to find the one matching the
|         `config.Instance` name.
|       - If the target instance is found, it calls `RefreshCachedQueries`.
| 
|       ```
|       [StartRefreshRoutine]
|         |
|         V
|       (Goroutine - Ticks every 'refreshPeriod')
|         |
|         V
|       [ListRedisInstances] -> (GCP API Call) -> [List of Redis Instances]
|         |
|         V
|       (Find Target Instance by Name)
|         |
|         V
|       (If Target Found)
|       [RefreshCachedQueries]
|       ```
| - **`ListRedisInstances`**: |
| - **Why**: To discover available Redis instances within the specified |
| GCP project and location. This is the first step before Perf can |
| connect to and use a specific Redis instance. |
|     - **How**: It uses the `gcpClient` (an instance of
|       `cloud.google.com/go/redis/apiv1.CloudRedisClient`) to make an API
|       call to GCP to list instances under the given `parent` (e.g.,
|       "projects/my-project/locations/us-central1"). It iterates through
|       the results and returns a slice of `redispb.Instance` objects. (A
|       sketch of this iteration appears after this list.)
| - **`RefreshCachedQueries`**: |
| - **Why**: This is the heart of the caching mechanism. Its purpose is |
| to update the data stored in the target Redis instance. The specific |
| data to be cached would depend on the needs of the Perf query UI. |
| - **How**: |
|       - It establishes a connection to the specified Redis instance
|         (`instance.Host` and `instance.Port`) using
|         `github.com/redis/go-redis/v9`.
|       - It acquires a mutex (`r.mutex.Lock()`) to prevent concurrent
|         modifications to the cache or shared resources, though the current
|         implementation only has placeholder logic.
|       - The current implementation contains placeholder logic:
|         - It attempts to `GET` a key named "FullPS".
|         - It then `SET`s the key "FullPS" to the current time, with an
|           expiration of 30 seconds.
|       - **Future Work (as hinted by `TODO(wenbinzhang)` and
|         `tilesToCache`)**: This method is expected to be expanded to:
|         - Identify which queries or data segments are good candidates for
|           caching.
|         - Fetch the necessary data, potentially using the `traceStore`.
|         - Store this data in Redis, likely with appropriate keys and
|           expiration times. The `tilesToCache` parameter suggests it might
|           pre-cache a certain number of recent "tiles" of trace data.
| |
| - **`mocks/RedisWrapper.go`**: This file contains a mock implementation of the |
| `RedisWrapper` interface, generated by the `mockery` tool. |
| |
| - **Why**: To facilitate unit testing of components that depend on |
| `RedisWrapper`. By using a mock, tests can simulate various Redis |
| behaviors (e.g., successful connection, instance not found, errors) |
| without needing an actual Redis instance or GCP connectivity. |
| - **How**: It provides a `RedisWrapper` struct that embeds `mock.Mock` |
| from the `testify` library. For each method in the `RedisWrapper` |
| interface, there's a corresponding method in the mock that records calls |
| and can be configured to return specific values or errors, allowing test |
| authors to define expected interactions. |
| |
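| As referenced above, listing instances follows the standard GAPIC iterator
| pattern; a self-contained sketch (the client and request types are from
| `cloud.google.com/go/redis/apiv1`; error handling is minimal):
| 
| ```
| import (
|     "context"
| 
|     gcp_redis "cloud.google.com/go/redis/apiv1"
|     "cloud.google.com/go/redis/apiv1/redispb"
|     "google.golang.org/api/iterator"
| )
| 
| // listRedisInstances returns all instances under parent, e.g.
| // "projects/my-project/locations/us-central1".
| func listRedisInstances(ctx context.Context, client *gcp_redis.CloudRedisClient, parent string) ([]*redispb.Instance, error) {
|     it := client.ListInstances(ctx, &redispb.ListInstancesRequest{Parent: parent})
|     var instances []*redispb.Instance
|     for {
|         inst, err := it.Next()
|         if err == iterator.Done {
|             break
|         }
|         if err != nil {
|             return nil, err
|         }
|         instances = append(instances, inst)
|     }
|     return instances, nil
| }
| ```
| 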
| **Design Decisions and Rationale:** |
| |
| - **Interface-based Design (`RedisWrapper`)**: Using an interface decouples |
| the rest of the Perf system from the concrete Redis client implementation. |
| This is good for: |
| - **Testability**: As seen with the `mocks` package. |
| - **Flexibility**: If Skia decides to switch to a different Redis client |
| library or even a different caching technology in the future, the |
| changes would be localized to the implementation of `RedisWrapper` |
| without affecting its consumers. |
| - **Periodic Refresh Routine**: Instead of connecting to Redis on-demand for |
| every operation or assuming a static configuration, the |
| `StartRefreshRoutine` provides a more robust approach. |
| - It handles potential changes in the Redis instance's availability or |
| address. |
| - It centralizes the logic for keeping the cache up-to-date. |
| - **Separation of Concerns**: |
| - The module clearly separates GCP Redis instance management (listing |
| instances via GCP API) from data interaction with a specific Redis |
| instance (using a Redis client library like `go-redis`). |
| - **Use of Standard Libraries**: |
| - `cloud.google.com/go/redis/apiv1` for GCP infrastructure management. |
| - `github.com/redis/go-redis/v9` for standard Redis data operations. This |
| ensures reliance on well-maintained and feature-rich libraries. |
| |
| **Workflow: Cache Refresh Process** |
| |
| The primary workflow driven by this module is the periodic refresh of cached |
| data: |
| |
| ``` |
| System Starts |
| | |
| V |
| Initialize RedisClient (NewRedisClient) |
| | |
| V |
| Call StartRefreshRoutine |
| | |
| V |
| [Background Goroutine - Loop every 'refreshPeriod'] |
| | |
| |--> 1. List GCP Redis Instances (ListRedisInstances) |
| | - Input: GCP project, location |
| | - Output: List of *redispb.Instance |
| | |
| |--> 2. Identify Target Redis Instance |
| | - Based on configuration (e.g., instance name) |
| | |
| |--> 3. If Target Instance Found: Refresh Cache (RefreshCachedQueries) |
| | |
| |--> a. Connect to Target Redis (using go-redis) |
| | - Host, Port from *redispb.Instance |
| | |
| |--> b. Determine data to cache (e.g., recent trace data for popular queries) |
| | - Likely involves `traceStore` |
| | |
| |--> c. Write data to Redis (SET commands) |
| | - Use appropriate keys and expiration times |
| | |
| |--> (Current placeholder: SET "FullPS" = current_time with 30s TTL) |
| ``` |
| |
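| The placeholder step in the diagram corresponds to roughly the following
| go-redis v9 calls (connection options beyond `Addr`, and the logging, are
| illustrative):
| 
| ```
| // Sketch of the current placeholder GET/SET logic.
| func refreshPlaceholder(ctx context.Context, instance *redispb.Instance) error {
|     rdb := redis.NewClient(&redis.Options{
|         Addr: fmt.Sprintf("%s:%d", instance.Host, instance.Port),
|     })
|     defer rdb.Close()
| 
|     // Read the previous marker value, if present.
|     if val, err := rdb.Get(ctx, "FullPS").Result(); err == nil {
|         sklog.Infof("FullPS was: %s", val)
|     }
| 
|     // Overwrite it with the current time, expiring after 30 seconds.
|     return rdb.Set(ctx, "FullPS", time.Now().String(), 30*time.Second).Err()
| }
| ```
| 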
| This module provides the foundational components for integrating Redis as a |
| caching layer in Skia Perf, aiming to improve UI performance by serving |
| frequently requested data quickly from an in-memory store. The current |
| implementation focuses on instance discovery and has placeholder logic for the |
| actual caching, which is expected to be expanded based on Perf's specific |
| caching needs. |
| |
| # Module: /go/regression |
| |
| The `/go/regression` module is responsible for detecting, storing, and managing |
| performance regressions in Skia. It analyzes performance data over time, |
| identifies significant changes (regressions or improvements), and provides |
| mechanisms for triaging and tracking these changes. |
| |
| **Core Functionality & Design:** |
| |
| The primary goal is to automatically flag performance changes that might |
| indicate a problem or an unexpected improvement. This involves: |
| |
| 1. **Data Analysis:** Analyzing time-series performance data (traces) across |
| different commits. |
| 2. **Clustering:** Grouping similar traces together to identify patterns and |
| changes affecting multiple tests or configurations. |
| 3. **Step Detection:** Identifying abrupt changes (steps) in performance |
| metrics that signify a potential regression or improvement. |
| 4. **Alerting & Notification:** Informing relevant parties when a potential |
| regression is detected. |
| 5. **Persistence:** Storing detected regressions and their triage status. |
| |
| **Key Components & Files:** |
| |
| - **`detector.go`**: This file contains the core logic for processing |
| regression detection requests. |
| |
| - **Why:** It orchestrates the process of fetching data, applying |
| clustering algorithms, and identifying regressions. It's designed to |
| handle potentially large datasets and long-running analyses. |
| - **How:** `ProcessRegressions` is the main entry point. It takes a |
| `RegressionDetectionRequest` (which specifies the alert configuration |
| and the time domain to analyze) and a `DetectorResponseProcessor` |
| callback. |
| - It can expand a single alert configuration with `GroupBy` parameters |
| into multiple, more specific requests using |
| `allRequestsFromBaseRequest`. This allows for targeted analysis of |
| specific trace groups. |
| - It iterates over data using `dfiter.DataFrameIterator`, which provides |
| dataframes for analysis. |
| - For each dataframe, it filters out traces with too much missing data |
| (`tooMuchMissingData`) to ensure the reliability of the detection. |
| - It applies a clustering algorithm (either K-Means via |
| `clustering2.CalculateClusterSummaries` or individual StepFit via |
| `StepFit`) to identify clusters of traces exhibiting similar behavior. |
| The choice of K (number of clusters for K-Means) can be automatic or |
| user-specified. |
| - It generates `RegressionDetectionResponse` objects containing the |
| cluster summaries and the relevant data frame. These responses are |
| passed to the `DetectorResponseProcessor`. |
| - Shortcuts for identified clusters are created using `shortcutFromKeys` |
| for easier referencing. |
|   - **Workflow:**
| 
|     ```
|     RegressionDetectionRequest -> Expand (if GroupBy) -> Multiple Requests
|       |
|       V
|     For each Request:
|       DataFrameIterator -> DataFrame -> Filter Traces
|         -> Apply Clustering (KMeans or StepFit)
|              |
|              V
|       Shortcut Creation <- ClusterSummaries -> DetectorResponseProcessor
|     ```
| |
| - **`regression.go`**: Defines the primary data structures for representing |
| regressions and their triage status. |
| |
| - **Why:** Provides a standardized way to model regressions, including |
| details about the low (performance degradation) and high (performance |
| improvement) steps, the associated data frame, and triage information. |
| - **How:** |
| - `Regression`: The central struct holding `Low` and `High` |
| `ClusterSummary` objects (from `clustering2`), the `FrameResponse` (data |
| context), and `TriageStatus` for both low and high. It also includes |
| fields for the newer `regression2` schema (like `Id`, `CommitNumber`, |
| `AlertId`, `MedianBefore`, `MedianAfter`). |
| - `TriageStatus`: Represents whether a regression is `Untriaged`, |
| `Positive` (expected/acceptable), or `Negative` (a bug). |
| - `AllRegressionsForCommit`: A container for all regressions found for a |
| specific commit, keyed by the alert ID. |
| - `Merge`: A method to combine information from two `Regression` objects, |
| typically used when new data provides a more significant regression for |
| an existing alert. |
| |
| - **`types.go`**: Defines the `Store` interface, which abstracts the |
| persistence layer for regressions. |
| |
| - **Why:** Decouples the regression detection logic from the specific |
| database implementation, allowing for different storage backends. |
|   - **How:** The `Store` interface specifies methods for the following
|     (a condensed sketch appears after this list):
| - `Range`: Retrieving regressions within a commit range. |
| - `SetHigh`/`SetLow`: Storing newly detected high/low regressions. |
| - `TriageHigh`/`TriageLow`: Updating the triage status of regressions. |
| - `Write`: Bulk writing of regressions. |
| - `GetRegressionsBySubName`, `GetByIDs`: Retrieving regressions based on |
| subscription names or specific IDs (primarily for the `regression2` |
| schema). |
| - `GetOldestCommit`, `GetRegression`: Utility methods for fetching |
| specific data. |
| - `DeleteByCommit`: Removing regressions associated with a commit. |
| |
| - **`stepfit.go`**: Implements an alternative regression detection strategy |
| that analyzes each trace individually using step fitting. |
| |
| - **Why:** Useful when `GroupBy` is used in an alert, or when K-Means |
| clustering is not the desired approach. It focuses on finding |
| significant steps in individual time series. |
| - **How:** The `StepFit` function iterates through each trace in the input |
| `DataFrame`. |
| - For each trace, it calls `stepfit.GetStepFitAtMid` to determine if |
| there's a significant step (low or high) around the midpoint of the |
| trace. |
| - If an "interesting" step is found (based on `stddevThreshold` and |
| `interesting` parameters), the trace is added to either the `low` or |
| `high` `ClusterSummary`. |
| - The `low` and `high` summaries collect all traces that show a downward |
| or upward step, respectively. |
| - Parametric summaries (`ParamSummaries`) are generated for the keys |
| within these clusters. |
| |
| - **`fromsummary.go`**: Provides a utility function to convert a |
| `RegressionDetectionResponse` into a `Regression` object. |
| |
| - **Why:** Bridges the output of the detection process with the structured |
| `Regression` type used for storage and display. |
| - **How:** `RegressionFromClusterResponse` takes a |
| `RegressionDetectionResponse`, an `alerts.Alert` configuration, and a |
| `perfgit.Git` instance. |
| - It identifies the commit at the midpoint of the response's data frame. |
| - It iterates through the `ClusterSummary` objects in the response. |
| - If a cluster's step point matches the midpoint commit and meets the |
| alert's criteria (minimum number of traces, direction), it populates the |
| `Low` or `High` fields of the `Regression` object. It prioritizes the |
| regression with the largest absolute magnitude if multiple are found. |
| |
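| As noted under `types.go`, the `Store` interface can be condensed to a
| sketch like the following; the method names come from the list above,
| while the parameter and return types shown are assumptions:
| 
| ```
| // Condensed, illustrative view of the Store interface.
| type Store interface {
|     // Range retrieves regressions within a commit range.
|     Range(ctx context.Context, begin, end types.CommitNumber) (map[types.CommitNumber]*AllRegressionsForCommit, error)
| 
|     // SetHigh and SetLow store newly detected steps up or down.
|     SetHigh(ctx context.Context, commit types.CommitNumber, alertID string, frame *frame.FrameResponse, high *clustering2.ClusterSummary) (string, bool, error)
|     SetLow(ctx context.Context, commit types.CommitNumber, alertID string, frame *frame.FrameResponse, low *clustering2.ClusterSummary) (string, bool, error)
| 
|     // TriageHigh and TriageLow update triage status.
|     TriageHigh(ctx context.Context, commit types.CommitNumber, alertID string, tr TriageStatus) error
|     TriageLow(ctx context.Context, commit types.CommitNumber, alertID string, tr TriageStatus) error
| 
|     // Write bulk-writes regressions keyed by commit.
|     Write(ctx context.Context, regressions map[types.CommitNumber]*AllRegressionsForCommit) error
| 
|     // DeleteByCommit removes regressions associated with a commit.
|     DeleteByCommit(ctx context.Context, commit types.CommitNumber) error
| }
| ```
| 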
| **Submodules:** |
| |
| - **`continuous/` (`continuous.go`)**: Manages the continuous, background |
| detection of regressions. |
| |
| - **Why:** Ensures that new performance data is promptly analyzed for |
| regressions as it arrives, without requiring manual intervention. |
| - **How:** |
| - `Continuous` struct: Holds dependencies like `perfgit.Git`, |
| `regression.Store`, `alerts.ConfigProvider`, `notify.Notifier`, etc. |
| - `Run()`: The main entry point, which starts either event-driven or |
| polling-based regression detection. |
| - **Event-Driven (`RunEventDrivenClustering`)**: |
| - Listens to Pub/Sub messages from `FileIngestionTopicName` indicating |
| new data ingestion (`ingestevents.IngestEvent`). |
| - For each event, it identifies matching alert configurations using |
| `getTraceIdConfigsForIngestEvent` (which calls |
| `matchingConfigsFromTraceIDs`). |
| - `matchingConfigsFromTraceIDs` refines alert queries if `GroupBy` is |
| present to be more specific to the incoming trace. |
| - It then calls `ProcessAlertConfig` (or `ProcessAlertConfigForTraces` |
| if `StepFitGrouping` is used) for each matching config and the |
| specific traces. |
| - **Polling (`RunContinuousClustering`)**: |
| - Periodically (defined by `pollingDelay`), fetches all alert |
| configurations using `buildConfigAndParamsetChannel`. |
| - Shuffles the configs to distribute the load if multiple detectors |
| are running. |
| - Calls `ProcessAlertConfig` for each configuration. |
| - `ProcessAlertConfig()`: |
| - Sets the current config being processed. |
| - Optionally performs a "smoketest" for `GroupBy` alerts to ensure the |
| query is valid and returns data. |
| - Calls `regression.ProcessRegressions` to perform the actual |
| detection. |
| - The `clusterResponseProcessor` (which is `reportRegressions`) is |
| called with the detection results. |
| - `reportRegressions()`: |
| - For each detected regression (`RegressionDetectionResponse`), it |
| determines the commit and previous commit details. |
| - It checks if the regression meets the alert criteria (direction, |
| minimum number). |
| - It calls `updateStoreAndNotification` to persist the regression and |
| send notifications. |
| - `updateStoreAndNotification()`: |
| - Checks if the regression already exists in the `regression.Store`. |
| - If new, it stores the regression (using `store.SetLow` or |
| `store.SetHigh`) and sends a notification via |
| `notifier.RegressionFound`. The notification ID is stored with the |
| regression. |
| - If existing, but the direction of the regression has changed (e.g., |
| was low, now also high), it updates the store and the notification |
| using `notifier.UpdateNotification`. |
| - If existing and the direction is the same, it only updates the |
| store. |
| - **Key Decision:** The system supports both event-driven (preferred for |
| responsiveness) and polling-based detection (as a fallback or for |
| periodic full checks). The choice is controlled by the |
| `EventDrivenRegressionDetection` flag. |
|   - **Workflow (Event-Driven):**
| 
|     ```
|     Pub/Sub Message (New Data) -> Decode IngestEvent -> Get Matching Alert Configs
|       |
|       V
|     For each (Config, Matched Traces):
|       ProcessAlertConfig -> regression.ProcessRegressions
|       |
|       V
|     reportRegressions -> updateStoreAndNotification
|                            |        |
|                            V        V
|                          Store   Notifier
|     ```
| |
| - **`migration/` (`migrator.go`)**: Handles the data migration from an older |
| `regressions` table schema to the newer `regressions2` schema. |
| |
| - **Why:** Facilitates schema evolution without data loss. The newer |
| schema (`Regression2Schema`) aims to store regression data more |
| granularly, typically one row per detected step (high or low), rather |
| than combining high and low for the same commit/alert into a single JSON |
| blob. |
| - **How:** |
| - `RegressionMigrator`: Contains instances of the legacy |
| `sqlregressionstore.SQLRegressionStore` and the new |
| `sqlregression2store.SQLRegression2Store`. |
| - `RunPeriodicMigration`: Sets up a ticker to periodically run |
| `RunOneMigration`. |
| - `RunOneMigration` / `migrateRegressions`: |
| - Fetches a batch of unmigrated regressions from the legacy store |
| (`legacyStore.GetRegressionsToMigrate`). |
| - For each legacy `Regression` object: |
| - It begins a database transaction. |
| - It populates fields specific to the `regression2` schema (e.g., |
| `Id`, `PrevCommitNumber`, `MedianBefore`, `MedianAfter`, |
| `IsImprovement`, `ClusterType`) if they are not already present from |
| the legacy data. This is crucial as the |
| `sqlregression2store.WriteRegression` expects these. |
| - The `sqlregression2store.WriteRegression` function might split a |
| single legacy `Regression` object (if it has both `High` and `Low` |
| components) into two separate entries in the `Regressions2` table, |
| one for `HighClusterType` and one for `LowClusterType`. |
| - It then marks the corresponding row in the legacy `Regressions` |
| table as migrated using `legacyStore.MarkMigrated`, storing the new |
| regression ID. |
| - Commits the transaction. If any step fails, the transaction is |
| rolled back. |
| - **Key Decision:** Migration is performed in batches and within |
| transactions to ensure atomicity and prevent data duplication or loss |
| during the migration process. |
| |
| - **`sqlregressionstore/`**: Implements the `regression.Store` interface using |
| a generic SQL database. This is the _older_ SQL storage mechanism. |
| |
| - **Why:** Provides a persistent storage solution for regressions |
| identified by the detection system. |
| - **How:** |
| - `SQLRegressionStore`: The main struct, holding a database connection |
| pool (`pool.Pool`) and prepared SQL statements. It supports different |
| SQL dialects (e.g., CockroachDB via `statements`, Spanner via |
| `spannerStatements`). |
| - The schema (`sqlregressionstore/schema/RegressionSchema.go`) typically |
| stores one row per `(commit_number, alert_id)` pair. The actual |
| `regression.Regression` object (which might contain both high and low |
| details, along with the frame) is serialized into a JSON string and |
| stored in a `regression TEXT` column. |
|     - `readModifyWrite`: A core helper function that encapsulates the common
|       pattern of reading a `Regression` from the DB, allowing a callback to
|       modify it, and then writing it back. This is done within a transaction
|       to prevent lost updates. If `mustExist` is true, it errors if the
|       regression isn't found; otherwise, it creates a new one. (A sketch of
|       this pattern appears after the submodule list.)
| - `SetHigh`/`SetLow`: Use `readModifyWrite` to update the `High` or `Low` |
| part of the JSON-serialized `Regression` object. They also update the |
| triage status to `Untriaged` if it was previously `None`. |
| - `TriageHigh`/`TriageLow`: Use `readModifyWrite` to update the |
| `HighStatus` or `LowStatus` within the JSON-serialized `Regression`. |
| - `GetRegressionsToMigrate`: Fetches regressions that haven't been |
| migrated to the `regression2` schema. |
| - `MarkMigrated`: Updates a row to indicate it has been migrated, storing |
| the new `regression_id` from the `regression2` table. |
| - **Limitation:** Storing the entire `Regression` object as JSON can make |
| querying for specific aspects of the regression (e.g., only high |
| regressions, or regressions with a specific triage status) less |
| efficient and more complex. This is one of the motivations for the |
| `sqlregression2store`. |
| |
| - **`sqlregression2store/`**: Implements the `regression.Store` interface |
| using a newer SQL schema (`Regressions2`). |
| |
| - **Why:** Addresses limitations of the older `sqlregressionstore` by |
| storing regression data in a more normalized and queryable way. |
| - **How:** |
| - `SQLRegression2Store`: The main struct. |
| - Schema (`sqlregression2store/schema/Regression2Schema.go`): Designed to |
| store each regression step (high or low) as a separate row. Key columns |
| include `id` (UUID, primary key), `commit_number`, `prev_commit_number`, |
| `alert_id`, `creation_time`, `median_before`, `median_after`, |
| `is_improvement`, `cluster_type` (e.g., "high", "low"), |
| `cluster_summary` (JSONB), `frame` (JSONB), `triage_status`, and |
| `triage_message`. |
| - `writeSingleRegression`: The core writing function. It takes a |
| `regression.Regression` object and writes its relevant parts (either |
| high or low, but not both in the same DB row) to the `Regressions2` |
| table. |
| - `convertRowToRegression`: Converts a database row from `Regressions2` |
| back into a `regression.Regression` object. Depending on the |
| `cluster_type` in the row, it populates either the `High` or `Low` part |
| of the `Regression` object. |
| - `SetHigh`/`SetLow`: |
| - These methods now interact with `updateBasedOnAlertAlgo`. |
| - `updateBasedOnAlertAlgo`: This function is crucial. It considers the |
| `Algo` type of the alert (`KMeansGrouping` vs. `StepFitGrouping`). |
| - For `KMeansGrouping`, it expects to potentially update an existing |
| regression for the same `(commit_number, alert_id)` as new data |
| might refine the cluster. It uses `readModifyWriteCompat` to achieve |
| this. |
| - For `StepFitGrouping` (individual trace analysis), it generally |
| expects to create a _new_ regression entry if one doesn't exist for |
| the exact frame, avoiding updates to pre-existing ones unless it's |
| truly a new detection. |
| - The `updateFunc` passed to `updateBasedOnAlertAlgo` populates the |
| necessary fields in the `regression.Regression` object (e.g., |
| setting `r.High` or `r.Low`, and calling |
| `populateRegression2Fields`). |
| - `populateRegression2Fields`: This helper populates the fields specific |
| to the `Regressions2` schema (like `PrevCommitNumber`, `MedianBefore`, |
| `MedianAfter`, `IsImprovement`) from the `ClusterSummary` and |
| `FrameResponse` within the `Regression` object. |
| - `WriteRegression` (used by migrator): If a legacy `Regression` object |
| has both `High` and `Low` components, this function splits it and calls |
| `writeSingleRegression` twice, creating two rows in `Regressions2`. |
| - `Range`: When retrieving regressions, if multiple rows from |
| `Regressions2` correspond to the same `(commit_number, alert_id)` (e.g., |
| one for high, one for low), it merges them back into a single |
| `regression.Regression` object for compatibility with how the rest of |
| the system might expect the data. |
| - **Key Improvement:** Storing regression components (high/low) as |
| separate rows with dedicated columns for medians, triage status, etc., |
| allows for much more efficient and direct SQL querying compared to |
| parsing JSON in the older store. |
| |
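| The `readModifyWrite` pattern referenced under `sqlregressionstore` is, in
| outline, a transactional read-update-write; a sketch assuming a pgx-style
| pool and CockroachDB's `UPSERT` (the SQL, serialization, and `mustExist`
| handling here are illustrative):
| 
| ```
| // Sketch of the read-modify-write helper in the older store.
| func readModifyWrite(ctx context.Context, db pool.Pool, commit types.CommitNumber, alertID string, mustExist bool, cb func(r *regression.Regression)) error {
|     tx, err := db.Begin(ctx)
|     if err != nil {
|         return err
|     }
|     defer tx.Rollback(ctx) // Harmless after a successful Commit.
| 
|     r := &regression.Regression{}
|     var serialized string
|     err = tx.QueryRow(ctx,
|         `SELECT regression FROM Regressions WHERE commit_number=$1 AND alert_id=$2`,
|         commit, alertID).Scan(&serialized)
|     switch {
|     case err == pgx.ErrNoRows && !mustExist:
|         // Not found: start from an empty Regression.
|     case err != nil:
|         return err
|     default:
|         if err := json.Unmarshal([]byte(serialized), r); err != nil {
|             return err
|         }
|     }
| 
|     cb(r) // Caller updates High/Low, triage status, etc.
| 
|     b, err := json.Marshal(r)
|     if err != nil {
|         return err
|     }
|     if _, err := tx.Exec(ctx,
|         `UPSERT INTO Regressions (commit_number, alert_id, regression) VALUES ($1, $2, $3)`,
|         commit, alertID, string(b)); err != nil {
|         return err
|     }
|     return tx.Commit(ctx)
| }
| ```
| 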
| **Overall Workflow Example (Simplified):** |
| |
| 1. **Continuous Detection (`continuous.go`):** |
| - New data arrives (e.g., via Pub/Sub). |
| - `Continuous` identifies relevant `alerts.Alert` configurations. |
| - `ProcessAlertConfig` is called. |
| 2. **Regression Processing (`detector.go`):** |
| - `ProcessRegressions` fetches data, builds `DataFrame`s. |
| - Clustering (KMeans or `stepfit.go`) is applied. |
| - `RegressionDetectionResponse`s are generated. |
| 3. **Reporting & Storing (`continuous.go` calls back into `regression` |
| store):** |
| - `reportRegressions` processes these responses. |
| - `updateStoreAndNotification` interacts with a `regression.Store` |
| implementation (e.g., `sqlregression2store.go`): |
| - Checks if the regression is new or an update. |
| - Calls `SetLow` or `SetHigh` on the store. |
| - The store (`sqlregression2store`) writes the data to the |
| `Regressions2` table, potentially creating a new row or updating an |
| existing one based on the alert's algorithm type. |
| - A notification might be sent. |
| |
| The system is designed to be modular, with interfaces like `regression.Store` |
| and `alerts.ConfigProvider` allowing for flexibility in implementation details. |
| The migration path from `sqlregressionstore` to `sqlregression2store` highlights |
| the evolution towards a more structured and queryable data model for |
| regressions. |
| |
| # Module: /go/samplestats |
| |
| The `samplestats` module is designed to perform statistical analysis on sets of |
| performance data, specifically to identify significant changes between two |
| sample sets, often referred to as "before" and "after" states. This is crucial |
| for detecting regressions or improvements in performance metrics over time or |
| across different code versions. |
| |
| The core functionality revolves around comparing these two sets of samples for |
| each trace (a unique combination of parameters identifying a specific test or |
| metric). It calculates various statistical metrics for each set and then employs |
| statistical tests to determine if the observed differences are statistically |
| significant. |
| |
| **Key Design Choices and Implementation Details:** |
| |
| - **Statistical Significance:** The module uses p-values to determine |
| significance. A user-configurable alpha level (defaulting to 0.05) acts as |
| the threshold. If the calculated p-value for a trace is below this alpha, |
| the change is considered significant. |
| - **Choice of Statistical Tests:** The module offers two common statistical |
| tests: |
| - **Mann-Whitney U Test (default):** This is a non-parametric test used to |
| compare two independent samples. It's often preferred when the data |
| doesn't necessarily follow a normal distribution. |
| - **Two Sample Welch's t-test:** This parametric test is used to compare |
| the means of two independent samples, particularly when their variances |
| might be unequal. The choice allows users to select the most appropriate |
| test based on the characteristics of their data. |
| - **Outlier Removal:** An optional Interquartile Range Rule (IQRR) can be |
| applied to remove outliers from the sample data before calculating |
| statistics. This helps in reducing the influence of extreme values that |
| might skew the results. The decision to make this optional acknowledges that |
| outlier removal isn't always desired or appropriate. |
| - **Delta Calculation:** For changes deemed significant, the module calculates |
| the percentage change in the mean between the "before" and "after" samples. |
| If a change isn't significant, the delta is reported as NaN (Not a Number), |
| clearly distinguishing it from actual zero-percentage changes. |
| - **Configurability:** The `Config` struct provides a centralized way to |
| control the analysis process. This includes setting the alpha level, |
| choosing the statistical test, enabling outlier removal, and deciding |
| whether to include all traces in the output or only those with significant |
| changes. This configurability makes the module adaptable to various analysis |
| needs. |
| - **Result Structure:** The `Result` struct encapsulates the outcome of the |
| analysis, including a list of `Row` structs (one per trace) and a count of |
| skipped traces. Each `Row` contains the trace identifier, its parameters, |
| the calculated metrics for both "before" and "after" samples, the percentage |
| delta, the p-value, and any informational notes (e.g., errors during |
| statistical test calculation). This structured output facilitates further |
| processing or display of the results. |
| - **Sorting:** The results can be sorted based on different criteria, with the |
| default being by the calculated `Delta`. This allows users to quickly |
| identify the most impactful changes. The `Order` type and functions like |
| `ByName`, `ByDelta`, and `Reverse` provide a flexible sorting mechanism. |
| |
| **Responsibilities and Key Components:** |
| |
| - **`analyze.go`**: This is the heart of the module. |
| |
| - **`Analyze` function**: This is the primary entry point. It takes the |
| `Config` and two maps of samples (`before` and `after`, where keys are |
| trace IDs and values are `parser.Samples`). |
| - It iterates through all unique trace IDs present in either the "before" |
| or "after" sets. |
| - For each trace, it retrieves the corresponding samples, skipping the |
| trace if data isn't present in both sets. |
| - It calls `calculateMetrics` (from `metrics.go`) for both "before" and |
| "after" samples. |
| - Based on the `Config.Test` setting, it performs either the Mann-Whitney |
| U test or the Two Sample Welch's t-test using functions from the |
| `github.com/aclements/go-moremath/stats` library. |
| - It compares the resulting p-value with the configured alpha level. If `p |
| < alpha`, it calculates the percentage `Delta` between the means. |
| Otherwise, `Delta` is `NaN`. |
| - It constructs a `Row` struct with all the calculated information. |
| - It optionally filters out rows where no significant change was detected |
| if `Config.All` is false. |
| - Finally, it sorts the resulting `Row`s based on `Config.Order` (or by |
| `Delta` if no order is specified) using the `Sort` function from |
| `sort.go`. |
| - It returns a `Result` struct containing the list of `Row`s and the count |
| of skipped traces. |
| - **`Config` struct**: Defines the parameters that control the analysis, |
| such as `Alpha` for p-value cutoff, `Order` for sorting, `IQRR` for |
| outlier removal, `All` for including all results, and `Test` for |
| selecting the statistical test. |
| - **`Result` struct**: Encapsulates the output of the `Analyze` function, |
| holding the `Rows` of analysis data and the `Skipped` count. |
| - **`Row` struct**: Represents the analysis results for a single trace, |
| including its name, parameters, "before" and "after" `Metrics`, the |
| percentage `Delta`, the `P` value, and any `Note`. |
| |
| - **`metrics.go`**: This file is responsible for calculating basic statistical |
| metrics from a given set of sample values. |
| |
|   - **`calculateMetrics` function**: Takes a `Config` (primarily to check
|     `IQRR`) and `parser.Samples`. (A simplified sketch of this logic
|     appears after this component list.)
| - If `Config.IQRR` is true, it applies the Interquartile Range Rule to |
| filter out outliers from `samples.Values`. The values within 1.5 \* IQR |
| from the first and third quartiles are retained. |
| - It then calculates the `Mean`, `StdDev` (standard deviation), and |
| `Percent` (coefficient of variation: `StdDev / Mean * 100`) of the |
| (potentially filtered) values. |
| - It returns these calculated statistics in a `Metrics` struct, along with |
| the (potentially filtered) `Values`. |
| - **`Metrics` struct**: Holds the calculated `Mean`, `StdDev`, raw |
| `Values` (after potential outlier removal), and `Percent` (coefficient |
| of variation). |
| |
| - **`sort.go`**: This file provides utilities for sorting the results (`Row` |
| slices). |
| |
| - **`Order` type**: A function type `func(rows []Row, i, j int) bool` |
| defining a less-than comparison for sorting `Row`s. |
| - **`ByName` function**: An `Order` implementation that sorts rows |
| alphabetically by `Row.Name`. |
| - **`ByDelta` function**: An `Order` implementation that sorts rows by |
| `Row.Delta`. It specifically places `NaN` delta values (insignificant |
| changes) at the beginning. |
| - **`Reverse` function**: A higher-order function that takes an `Order` |
| and returns a new `Order` that represents the reverse of the input |
| order. |
| - **`Sort` function**: A convenience function that sorts a slice of `Row`s |
| in place using `sort.SliceStable` and a given `Order`. |
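| 
| To make the outlier rule concrete, the following is a minimal sketch of the
| Interquartile Range Rule as described for `calculateMetrics`. The
| index-based quartile computation is an illustrative simplification, not the
| module's exact code.
| 
| ```
| package main
| 
| import (
| 	"fmt"
| 	"sort"
| )
| 
| // iqrFilter sketches the Interquartile Range Rule: values more than
| // 1.5*IQR below the first quartile or above the third quartile are
| // dropped as outliers. The quartile indexing here is a simplification;
| // the real code may interpolate differently.
| func iqrFilter(values []float64) []float64 {
| 	sorted := append([]float64{}, values...)
| 	sort.Float64s(sorted)
| 	q1 := sorted[len(sorted)/4]
| 	q3 := sorted[(3*len(sorted))/4]
| 	iqr := q3 - q1
| 	lo, hi := q1-1.5*iqr, q3+1.5*iqr
| 	ret := make([]float64, 0, len(values))
| 	for _, v := range values {
| 		if v >= lo && v <= hi {
| 			ret = append(ret, v)
| 		}
| 	}
| 	return ret
| }
| 
| func main() {
| 	fmt.Println(iqrFilter([]float64{9, 10, 10, 11, 10, 100})) // 100 is dropped
| }
| ```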
| |
| **Illustrative Workflow (Simplified `Analyze` Process):** |
| |
| ``` |
| Input: before_samples, after_samples, config |
| |
| For each trace_id in (before_samples keys + after_samples keys): |
| If trace_id not in before_samples OR trace_id not in after_samples: |
| Increment skipped_count |
| Continue |
| |
| before_metrics = calculateMetrics(config, before_samples[trace_id]) |
| after_metrics = calculateMetrics(config, after_samples[trace_id]) |
| |
| If config.Test == UTest: |
| p_value = MannWhitneyUTest(before_metrics.Values, after_metrics.Values) |
| Else (config.Test == TTest): |
| p_value = TwoSampleWelchTTest(before_metrics.Values, after_metrics.Values) |
| |
| alpha = config.Alpha (or defaultAlpha if config.Alpha is 0) |
| |
| If p_value < alpha: |
| delta = ((after_metrics.Mean / before_metrics.Mean) - 1) * 100 |
| Else: |
| delta = NaN |
| If NOT config.All: |
| Continue // Skip if not showing all results and change is not significant |
| |
| Add new Row{Name: trace_id, Delta: delta, P: p_value, ...} to results_list |
| |
| Sort results_list using config.Order (or ByDelta by default) |
| |
| Return Result{Rows: results_list, Skipped: skipped_count} |
| ``` |
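| 
| For orientation, caller-side code for the flow above might look like the
| sketch below. The import paths are assumptions for illustration, and the
| exact signature of `Analyze` should be confirmed against `analyze.go`.
| 
| ```
| package report
| 
| import (
| 	"fmt"
| 
| 	// Import paths are assumed for illustration and may differ.
| 	"go.skia.org/infra/perf/go/ingest/parser"
| 	"go.skia.org/infra/perf/go/samplestats"
| )
| 
| func report(before, after map[string]parser.Samples) {
| 	cfg := samplestats.Config{
| 		Alpha: 0.05,                // p-value cutoff
| 		IQRR:  true,                // interquartile-range outlier removal
| 		All:   false,               // keep only rows with a significant change
| 		Order: samplestats.ByDelta, // sort by percentage delta
| 	}
| 	res := samplestats.Analyze(cfg, before, after)
| 	for _, row := range res.Rows {
| 		fmt.Printf("%s: delta=%.2f%% p=%.4f\n", row.Name, row.Delta, row.P)
| 	}
| 	fmt.Printf("%d traces skipped\n", res.Skipped)
| }
| ```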
| |
| # Module: /go/sheriffconfig |
| |
| The `sheriffconfig` module is responsible for managing configurations for Skia |
| Perf's anomaly detection and alerting system. These configurations, known as |
| "Sheriff Configs," are defined in Protocol Buffer format and are typically |
| stored in LUCI Config. This module handles fetching these configurations, |
| validating them, and transforming them into a format suitable for storage and |
| use by other Perf components, specifically the `alerts` and `subscription` |
| modules. |
| |
| The core idea is to allow users to define rules for which performance metrics |
| they care about and how anomalies in those metrics should be detected and |
| handled. This provides a flexible and centralized way to manage alerting for a |
| large number of performance tests. |
| |
| **Key Responsibilities and Components:** |
| |
| - **Protocol Buffer Definitions (`/proto/v1`):** |
| |
| - This directory defines the structure of Sheriff Configurations using |
| Protocol Buffers. This is the "source of truth" for what constitutes a |
| valid configuration. |
| - `sheriff_config.proto`: Defines the main messages like `SheriffConfig`, |
| `Subscription`, `AnomalyConfig`, and `Rules`. |
| - `SheriffConfig`: The top-level message, containing a list of |
| `Subscription`s. This represents the entire set of alerting |
| configurations for a Perf instance. |
| - `Subscription`: Represents a user's or team's interest in a specific set |
| of metrics. It includes details for creating bug reports (e.g., contact |
| email, bug component, labels, priority, severity) and a list of |
| `AnomalyConfig`s that define how to detect anomalies for the metrics |
| covered by this subscription. |
| - `AnomalyConfig`: Specifies the parameters for anomaly detection for a |
| particular subset of metrics. This includes: |
| - `Rules`: Define which metrics this `AnomalyConfig` applies to, using |
| `match` and `exclude` patterns. These patterns are query strings |
| (e.g., "master=ChromiumPerf&benchmark=Speedometer2"). |
| - Detection parameters: `step` (algorithm for step detection), |
| `radius` (commits to consider), `threshold` (sensitivity), |
| `minimum_num` (number of interesting traces to trigger an alert), |
| `sparse` (handling of missing data), `k` (for K-Means clustering), |
| `group_by` (for breaking down clustering), `direction` (up, down, or |
| both), `action` (no action, triage, or bisect), and `algo` |
| (clustering algorithm like StepFit or KMeans). |
| - `Rules`: Contains lists of `match` and `exclude` strings. Match strings |
| define positive criteria for selecting metrics, while exclude strings |
| define negative criteria. The combination allows for precise targeting |
| of metrics. |
| - `sheriff_config.pb.go`: The Go code generated from |
| `sheriff_config.proto`. This provides the Go structs and methods to work |
| with these configurations programmatically. |
| - `generate.go`: Contains `go:generate` directives used to regenerate |
| `sheriff_config.pb.go` whenever `sheriff_config.proto` changes. This |
| ensures the Go code stays in sync with the proto definition. |
| |
| - **Validation (`/validate`):** |
| |
| - `validate.go`: This is crucial for ensuring the integrity and |
| correctness of Sheriff Configurations before they are processed or |
| stored. It performs a series of checks: |
| - **Pattern Validation:** Ensures that `match` and `exclude` strings in |
| `Rules` are well-formed query strings. It checks for valid regex if a |
| value starts with `~`. It also enforces that exclude patterns only |
| target a single key-value pair. |
| - **AnomalyConfig Validation:** Ensures that each `AnomalyConfig` has at |
| least one `match` pattern. |
| - **Subscription Validation:** Verifies that essential fields like `name`, |
| `contact_email`, `bug_component`, and `instance` are present. It also |
| checks that each subscription has at least one `AnomalyConfig`. |
| - **SheriffConfig Validation:** Ensures there's at least one |
| `Subscription` and that all subscription names within a config are |
| unique. |
| - `DeserializeProto`: A helper function to convert a base64 encoded string |
| (as typically retrieved from LUCI Config) into a `SheriffConfig` |
| protobuf message. |
| |
| - **Service (`/service`):** |
| |
| - `service.go`: This component orchestrates the process of fetching |
| Sheriff Configurations from LUCI Config, processing them, and storing |
| them in the database. |
| - **`New` function:** Initializes the `sheriffconfigService`, taking |
| dependencies like a database connection pool (`sql.Pool`), |
| `subscription.Store`, `alerts.Store`, and a `luciconfig.ApiClient`. If |
| no `luciconfig.ApiClient` is provided, it creates one. |
| - **`ImportSheriffConfig` method:** This is the main entry point for |
| importing configurations. |
| |
| * It uses the `luciconfig.ApiClient` to fetch configurations from a |
| specified LUCI Config path (e.g., "skia-sheriff-configs.cfg"). |
| * For each fetched configuration file content: |
| - It calls `processConfig`. |
| * It then inserts all derived `subscription_pb.Subscription` objects into |
| the `subscriptionStore` and all `alerts.SaveRequest` objects into the |
| `alertStore` within a single database transaction. This ensures |
| atomicity – either all changes are saved, or none are. |
| |
| - **`processConfig` method:** |
| |
| * Deserializes the raw configuration content (string) into a |
| `pb.SheriffConfig` protobuf message using `prototext.Unmarshal`. |
| * Validates the deserialized `pb.SheriffConfig` using |
| `validate.ValidateConfig`. |
| * Iterates through each `pb.Subscription` in the config: |
| - It filters subscriptions based on the `instance` field, only |
| processing those matching the service's configured instance (e.g., |
| "chrome-internal"). This allows multiple Perf instances to share a |
| config file but only import relevant subscriptions. |
| - It calls `makeSubscriptionEntity` to convert the `pb.Subscription` |
| into a `subscription_pb.Subscription` (the format used by the |
| `subscription` module). |
| - **Revision Check:** Crucially, it checks if a subscription with the |
| same name and revision already exists in the `subscriptionStore`. If |
| it does, it means this specific version of the subscription has |
| already been imported, so it's skipped. This prevents redundant |
| database writes and processing if the LUCI config file hasn't |
| actually changed for that subscription. |
| - If the subscription is new or has a new revision, it calls |
| `makeSaveRequests` to generate `alerts.SaveRequest` objects for each |
| alert defined within that subscription. |
| |
| - **`makeSubscriptionEntity` function:** Transforms a `pb.Subscription` |
| (from Sheriff Config proto) into a `subscription_pb.Subscription` (for |
| the `subscription` datastore), mapping fields and applying default |
| priorities/severities if not specified. |
| - **`makeSaveRequests` function:** |
| |
| * Iterates through each `pb.AnomalyConfig` within a `pb.Subscription`. |
| * For each `match` rule within the `pb.AnomalyConfig.Rules`: |
| - Calls `buildQueryFromRules` to construct the actual query string |
| that will be used to select metrics for this alert. |
| - Calls `createAlert` to create an `alerts.Alert` object, populating |
| it with parameters from the `pb.AnomalyConfig` and the parent |
| `pb.Subscription`. |
| - Wraps the `alerts.Alert` in an `alerts.SaveRequest` along with the |
| subscription name and revision. |
| |
| - **`createAlert` function:** Populates an `alerts.Alert` struct. This |
| involves: |
| - Mapping enum values from the Sheriff Config proto (e.g., |
| `AnomalyConfig_Step`, `AnomalyConfig_Direction`, `AnomalyConfig_Action`, |
| `AnomalyConfig_Algo`) to their corresponding internal types used by the |
| `alerts` module (e.g., `alerts.Direction`, |
| `types.RegressionDetectionGrouping`, `types.StepDetection`, |
| `types.AlertAction`). This is done using maps like `directionMap`, |
| `clusterAlgoMap`, etc. |
| - Applying default values for parameters like `radius`, `minimum_num`, |
| `sparse`, `k`, `group_by` if they are not explicitly set in the |
| `AnomalyConfig`. |
|   - **`buildQueryFromRules` function:** Constructs a canonical query string
|     from a `match` string and a list of `exclude` strings. It parses them as
|     URL query parameters, combines them (prefixing excluded values with
|     `!`), sorts the parts alphabetically, and joins them with `&`. This
|     ensures that equivalent rules always produce the same query string (see
|     the sketch after this list).
| - **`getPriorityFromProto` and `getSeverityFromProto` functions:** Convert |
| the enum values for priority and severity from the proto definition to |
| the integer values expected by the `subscription` module, applying |
| defaults if the proto value is "unspecified." |
| - **`StartImportRoutine` and `ImportSheriffConfigOnce`:** Provide |
| functionality to periodically fetch and import configurations, making |
| the system self-updating when LUCI configs change. |
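| 
| The canonicalization performed by `buildQueryFromRules` can be approximated
| with the standard library, as sketched below. This follows the prose above
| (parse as URL query values, prefix excluded values with `!`, sort, join with
| `&`) and is an approximation, not the module's exact code.
| 
| ```
| package main
| 
| import (
| 	"fmt"
| 	"net/url"
| 	"sort"
| 	"strings"
| )
| 
| // buildQuery approximates the canonicalization described above: match and
| // exclude patterns are parsed as URL query strings, excluded values get a
| // "!" prefix, and the parts are sorted so that equivalent rules always
| // produce the same query string.
| func buildQuery(match string, excludes []string) (string, error) {
| 	parts := []string{}
| 	m, err := url.ParseQuery(match)
| 	if err != nil {
| 		return "", err
| 	}
| 	for key, values := range m {
| 		for _, v := range values {
| 			parts = append(parts, key+"="+v)
| 		}
| 	}
| 	for _, ex := range excludes {
| 		e, err := url.ParseQuery(ex)
| 		if err != nil {
| 			return "", err
| 		}
| 		for key, values := range e {
| 			for _, v := range values {
| 				parts = append(parts, key+"=!"+v)
| 			}
| 		}
| 	}
| 	sort.Strings(parts)
| 	return strings.Join(parts, "&"), nil
| }
| 
| func main() {
| 	q, _ := buildQuery("benchmark=Speedometer2&master=ChromiumPerf", []string{"bot=mac-m1"})
| 	fmt.Println(q) // benchmark=Speedometer2&bot=!mac-m1&master=ChromiumPerf
| }
| ```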
| |
| **Workflow: Importing a Sheriff Configuration** |
| |
| ``` |
| LUCI Config Change (e.g., new revision of skia-sheriff-configs.cfg) |
| | |
| v |
| Sheriffconfig Service (triggered by timer or manual call) |
| | |
| |--- 1. luciconfigApiClient.GetProjectConfigs("skia-sheriff-configs.cfg") --> Fetches raw config content + revision |
| | |
| v |
| For each config file content: |
| | |
| |--- 2. processConfig(configContent, revision) |
| | | |
| | |--- 2a. prototext.Unmarshal(configContent) --> pb.SheriffConfig |
| | | |
| | |--- 2b. validate.ValidateConfig(pb.SheriffConfig) --> Error or OK |
| | | |
| | v |
| | For each pb.Subscription in pb.SheriffConfig: |
| | | |
| | |--- 2c. If subscription.Instance != service.Instance --> Skip |
| | | |
| | |--- 2d. subscriptionStore.GetSubscription(name, revision) --> ExistingSubscription? |
| | | |
| | |--- 2e. If ExistingSubscription == nil (new or updated): |
| | | | |
| | | |--- makeSubscriptionEntity(pb.Subscription, revision) --> subscription_pb.Subscription |
| | | | |
| | | |--- makeSaveRequests(pb.Subscription, revision) |
| | | | | |
| | | | v |
| | | | For each pb.AnomalyConfig in pb.Subscription: |
| | | | | |
| | | | v |
| | | | For each matchRule in pb.AnomalyConfig.Rules: |
| | | | | |
| | | | |--- buildQueryFromRules(matchRule, excludeRules) --> queryString |
| | | | | |
| | | | |--- createAlert(queryString, pb.AnomalyConfig, pb.Subscription, revision) --> alerts.Alert |
| | | | | |
| | | | ---> Collect alerts.SaveRequest |
| | | | |
| | | ---> Collect subscription_pb.Subscription |
| | |
| v |
| Database Transaction (BEGIN) |
| | |
| |--- 3. subscriptionStore.InsertSubscriptions(collected_subscriptions) |
| | |
| |--- 4. alertStore.ReplaceAll(collected_save_requests) |
| | |
| Database Transaction (COMMIT or ROLLBACK) |
| ``` |
| |
| This module acts as a critical bridge, translating human-readable (and |
| machine-parsable via proto) alerting definitions into the concrete data |
| structures used by Perf's backend alerting and subscription systems. The |
| validation step is key to preventing malformed configurations from breaking the |
| alerting pipeline. The revision checking mechanism ensures efficiency by only |
| processing changes. |
| |
| # Module: /go/shortcut |
| |
| The `shortcut` module provides functionality for creating, storing, and |
| retrieving "shortcuts". A shortcut is essentially a named list of trace keys. |
| These trace keys typically represent specific performance metrics or |
| configurations. The primary purpose of shortcuts is to provide a convenient way |
| to refer to a collection of traces with a short, memorable identifier, rather |
| than having to repeatedly specify the full list of keys. This is particularly |
| useful for sharing links to specific views in the Perf UI or for programmatic |
| access to predefined sets of performance data. |
| |
| The core component is the `Store` interface, defined in `shortcut.go`. This |
| interface abstracts the underlying storage mechanism, allowing different |
| implementations to be used (e.g., in-memory for testing, SQL database for |
| production). The key operations defined by the `Store` interface are: |
| |
| - `Insert`: Adds a new shortcut to the store. It takes an `io.Reader` |
| containing the shortcut data (typically JSON) and returns a unique ID for |
| the shortcut. |
| - `InsertShortcut`: Similar to `Insert`, but takes a `Shortcut` struct |
| directly. |
| - `Get`: Retrieves a shortcut given its ID. |
| - `GetAll`: Returns a channel that streams all stored shortcuts. This is |
| useful for tasks like data migration. |
| - `DeleteShortcut`: Removes a shortcut from the store. |
| |
| A `Shortcut` itself is a simple struct containing a slice of strings, where each |
| string is a trace key. |
| |
| The generation of shortcut IDs is handled by the `IDFromKeys` function. This |
| function takes a `Shortcut` struct, sorts its keys alphabetically (to ensure |
| that the order of keys doesn't affect the ID), and then computes an MD5 hash of |
| the concatenated keys. A prefix "X" is added to this hash for historical |
| reasons, maintaining compatibility with older systems. This deterministic ID |
| generation ensures that the same set of keys will always produce the same |
| shortcut ID. |
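| 
| The ID scheme is small enough to sketch directly. Whether the sorted keys
| are joined with a separator before hashing is an assumption to verify
| against `shortcut.go`; the sort-then-hash-then-prefix structure follows the
| description above.
| 
| ```
| package main
| 
| import (
| 	"crypto/md5"
| 	"fmt"
| 	"sort"
| 	"strings"
| )
| 
| // idFromKeys mirrors the scheme described above: sort the keys so their
| // order is irrelevant, MD5 the concatenation, and prepend "X" for
| // backward compatibility. Plain concatenation is an assumption here.
| func idFromKeys(keys []string) string {
| 	sorted := append([]string{}, keys...)
| 	sort.Strings(sorted)
| 	sum := md5.Sum([]byte(strings.Join(sorted, "")))
| 	return "X" + fmt.Sprintf("%x", sum)
| }
| 
| func main() {
| 	a := idFromKeys([]string{",config=8888,", ",config=565,"})
| 	b := idFromKeys([]string{",config=565,", ",config=8888,"})
| 	fmt.Println(a == b) // true: key order does not affect the ID
| }
| ```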
| |
| Workflow for creating and retrieving a shortcut: |
| |
| 1. **Creation**:
| 
|    ```
|    Client Code ---(JSON data or Shortcut struct)---> Store.Insert / Store.InsertShortcut
|    Store ---(Generates ID using IDFromKeys, marshals to JSON if needed)---> Underlying Storage (e.g., SQL DB)
|    Underlying Storage ---> Store ---(Returns Shortcut ID)---> Client Code
|    ```
| 
| 2. **Retrieval**:
| 
|    ```
|    Client Code ---(Shortcut ID)---> Store.Get
|    Store ---(Queries by ID)---> Underlying Storage (e.g., SQL DB)
|    Underlying Storage ---(Returns stored JSON)---> Store
|    Store ---(Unmarshals to Shortcut struct, sorts keys)---> Client Code (receives Shortcut struct)
|    ```
| |
| The `sqlshortcutstore` subdirectory provides a concrete implementation of the |
| `Store` interface using an SQL database (specifically designed for CockroachDB, |
| as indicated by test setup and migration references). The `sqlshortcutstore.go` |
| file contains the logic for interacting with the database, including SQL |
| statements for inserting, retrieving, and deleting shortcuts. Shortcut data is |
| stored as JSON strings in the database. The schema for the `Shortcuts` table is |
| implicitly defined by the SQL statements and further clarified in |
| `sqlshortcutstore/schema/schema.go`, which defines a `ShortcutSchema` struct |
| mirroring the table structure (though this struct is primarily for documentation |
| or ORM-like purposes and not directly used in the raw SQL interaction in |
| `sqlshortcutstore.go`). |
| |
| Testing is a significant aspect of this module: |
| |
| - `shortcut_test.go` contains unit tests for the `IDFromKeys` function, |
| ensuring its correctness and deterministic behavior. |
| - `shortcuttest` provides a suite of common tests (`InsertGet`, |
| `GetNonExistent`, `GetAll`, `DeleteShortcut`) that can be run against any |
| implementation of the `shortcut.Store` interface. This promotes consistency |
| and ensures that different store implementations behave as expected. The |
| `InsertGet` test, for example, verifies that a stored shortcut can be |
| retrieved and that the keys are sorted upon retrieval, even if they were not |
| sorted initially. |
| - `sqlshortcutstore_test.go` utilizes the tests from `shortcuttest` to |
| validate the `SQLShortcutStore` implementation against a test database. |
| - `mocks/Store.go` provides a mock implementation of the `Store` interface, |
| generated by the `mockery` tool. This is useful for testing components that |
| depend on `shortcut.Store` without needing a real storage backend. |
| |
| # Module: /go/sql |
| |
| The `go/sql` module serves as the central hub for managing the SQL database |
| schema used by the Perf application. It defines the structure of the database |
| tables and provides utilities for schema generation, validation, and migration. |
| This module ensures that the application's database schema is consistent, |
| well-defined, and can evolve smoothly over time. |
| |
| **Key Responsibilities and Components:** |
| |
| - **Schema Definition (`schema.go`, `spanner/schema_spanner.go`):** |
| |
| - **Why:** These files contain the SQL `CREATE TABLE` statements that |
| define the structure of all tables used by Perf. Having the schema |
| defined in code (generated from Go structs) provides a single source of |
| truth and allows for easier version control and programmatic |
| manipulation. |
| - **How:** |
| - `schema.go`: Defines the schema for CockroachDB. |
| - `spanner/schema_spanner.go`: Defines the schema for Spanner. Spanner has |
| slightly different SQL syntax and features (e.g., `TTL INTERVAL`), |
| necessitating a separate schema definition. |
| - The schema is not written manually but is _generated_ by the `tosql` |
| utility (see below). This ensures that the SQL schema accurately |
| reflects the Go struct definitions in other modules (e.g., |
| `perf/go/alerts/sqlalertstore/schema`). |
| - Along with the `CREATE TABLE` statements, these files also export slices |
| of strings representing the column names for each table. This can be |
| useful for constructing SQL queries programmatically. |
| |
| - **Table Struct Definition (`tables.go`):** |
| |
| - **Why:** This file defines a Go struct `Tables` which aggregates all the |
| individual table schema structs from various Perf sub-modules (like |
| `alerts`, `anomalygroup`, `git`, etc.). |
| - **How:** The `Tables` struct serves as the input to the `tosql` schema |
| generator. By referencing schema structs from other modules, it ensures |
| that the generated SQL schema is consistent with how data is represented |
| and manipulated throughout the application. The `//go:generate` |
| directives at the top of this file trigger the `tosql` utility to |
| regenerate the schema files when necessary. |
| |
| - **Schema Generation Utility (`tosql/main.go`):** |
| |
| - **Why:** Manually writing and maintaining complex SQL schemas is |
| error-prone. This utility automates the generation of the SQL schema |
| files (`schema.go` and `spanner/schema_spanner.go`) from the Go struct |
| definitions. |
| - **How:** It takes the `sql.Tables` struct (defined in `tables.go`) as |
| input and uses the `go/sql/exporter` module to translate the Go struct |
| tags and field types into corresponding SQL `CREATE TABLE` statements. |
| It supports different SQL dialects (CockroachDB and Spanner) and can |
| handle specific features like Spanner's TTL (Time To Live) for tables. |
| The `schemaTarget` flag controls which database dialect is generated. |
| |
| - **Expected Schema and Migration (`expectedschema/`):** |
| |
| - **Why:** As the application evolves, the database schema needs to |
| change. This submodule manages schema migrations, ensuring that the live |
| database can be updated to new versions without downtime or data loss. |
| It also validates that the current database schema matches an expected |
| version. |
| - **How:** |
| |
| - **`embed.go`:** This file uses `go:embed` to embed JSON representations |
| of the _current_ (`schema.json`, `schema_spanner.json`) and _previous_ |
| (`schema_prev.json`, `schema_prev_spanner.json`) expected database |
| schemas. These JSON files are generated by the `exportschema` utility. |
| `Load()` and `LoadPrev()` functions provide access to these deserialized |
| schema descriptions. |
| |
| - **`migrate.go`:** This is the core of the schema migration logic. |
| |
| - It defines SQL statements (`FromLiveToNext`, `FromNextToLive`, and |
| their Spanner equivalents) that describe how to upgrade the database |
| from the "previous" schema version to the "next" (current) schema |
| version, and how to roll back that change. **Crucially, schema |
| changes must be backward and forward compatible** because during a |
| deployment, old and new versions of the application might run |
| concurrently. |
| - `ValidateAndMigrateNewSchema` is the key function. It: |
| |
| * Loads the "next" (target) and "previous" expected schemas from the |
| embedded JSON files. |
| * Gets the _actual_ schema description from the live database. |
| * Compares the actual schema with the previous and next expected |
| schemas. |
| - If `actual == next`, no migration is needed. |
| - If `actual == prev` and `actual != next`, it executes the |
| `FromLiveToNext` SQL statements to upgrade the database schema. |
| - If `actual` matches neither `prev` nor `next`, it indicates an |
| unexpected schema state and returns an error, preventing |
| application startup. This is a critical safety check. |
| |
| - The migration process is designed to be run by a maintenance task |
| during deployment. Old instances (frontend, ingesters) -> |
| Maintenance task (runs `ValidateAndMigrateNewSchema`) -> New |
| instances (frontend, ingesters) |
| |
| ``` |
| Deployment Starts |
| | |
| V |
| Maintenance Task Runs |
| | |
| +------------------------------------+ |
| | Calls ValidateAndMigrateNewSchema | |
| +------------------------------------+ |
| | |
| V |
| Is schema == previous_expected_schema? --Yes--> Apply `FromLiveToNext` SQL |
| | No | |
| V V |
| Is schema == current_expected_schema? ---Yes---> Migration Successful / No Action |
| | No |
| V |
| Error: Schema mismatch! Halt. |
| | |
| V |
| New Application Instances Start (if migration was successful) |
| ``` |
| |
| - **Test files (`migrate_test.go`, `migrate_spanner_test.go`):** These |
| files contain unit tests to verify the schema migration logic for both |
| CockroachDB and Spanner. They test scenarios where no migration is |
| needed, migration is required, and the schema is in an invalid state. |
| |
| - **Schema Export Utility (`exportschema/main.go`):** |
| |
| - **Why:** The `expectedschema` submodule needs JSON representations of |
| the "current" and "previous" database schemas to perform validation and |
| migration. This utility generates these JSON files. |
| - **How:** It takes the `sql.Tables` struct (for CockroachDB) or |
| `spanner.Schema` (for Spanner) and uses the `go/sql/schema/exportschema` |
| module to serialize the schema description into a JSON format. The |
| output of this utility is typically checked into version control as |
| `schema.json`, `schema_prev.json`, etc., within the `expectedschema` |
| directory. The typical workflow for a schema change involves: |
| |
| * Make schema changes in relevant Go structs (e.g., add a new field to |
| `alerts.AlertSchema`). |
| * Run `go generate ./...` within `perf/go/sql/` to regenerate `schema.go` |
| and `spanner/schema_spanner.go`. |
| * Copy the _old_ `expectedschema/schema.json` to |
| `expectedschema/schema_prev.json` (and similarly for Spanner). |
| * Run the `exportschema` binary (e.g., `bazel run |
| //perf/go/sql/exportschema -- --out |
| perf/go/sql/expectedschema/schema.json`) to generate the new |
| `expectedschema/schema.json`. |
| * Update the `FromLiveToNext` and `FromNextToLive` SQL statements in |
| `expectedschema/migrate.go`. |
| * Update test constants in `sql_test.go` (`LiveSchema`, `DropTables`) if |
| necessary. |
| |
| - **Testing Utilities (`sqltest/sqltest.go`):** |
| |
| - **Why:** Provides standardized ways to set up temporary CockroachDB or |
| Spanner emulator instances for testing components that interact with the |
| database. |
| - **How:** |
| - `NewCockroachDBForTests`: Sets up a connection to a local CockroachDB |
| instance (managed by `cockroachdb_instance.Require`), creates a new |
| temporary database for the test, applies the current `sql.Schema`, and |
| registers a cleanup function to drop the database after the test. |
| - `NewSpannerDBForTests`: Similarly, sets up a connection to a local |
| Spanner emulator (via PGAdapter, required by `pgadapter.Require`), |
| applies the current `spanner.Schema`, and prepares it for tests. |
|   - These functions abstract away the complexities of emulator management
|     and initial schema setup, making tests cleaner and more reliable (a
|     usage sketch appears after this list).
| |
| - **Schema Tests (`sql_test.go`):** |
| |
| - **Why:** Verifies that the schema migration scripts correctly transform |
| a database from a "live-like" previous state to the current expected |
| state. |
| - **How:** |
| - Defines constants like `DropTables` (to clean up) and `LiveSchema` / |
| `LiveSchemaSpanner`. `LiveSchema` represents the schema _before_ the |
| latest change defined in `expectedschema/migrate.go`'s `FromLiveToNext`. |
| - The tests typically: |
| 1. Create a test database. |
| 2. Apply `DropTables` to ensure a clean slate. |
| 3. Apply `LiveSchema` to simulate the state of the database _before_ |
| the pending migration. |
| 4. Execute `expectedschema.FromLiveToNext` (or its Spanner equivalent). |
| 5. Fetch the schema description from the migrated database. |
| 6. Compare this migrated schema with the schema obtained by applying |
| `sql.Schema` (or `spanner.Schema`) directly to a fresh database |
| (which represents the target state). They should be identical. |
| |
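| For orientation, a minimal test using `sqltest` might look like the sketch
| below. The signature of `NewCockroachDBForTests` and the import path are
| assumptions to confirm against `sqltest.go`.
| 
| ```
| package mystore_test
| 
| import (
| 	"context"
| 	"testing"
| 
| 	"go.skia.org/infra/perf/go/sql/sqltest" // path assumed for illustration
| )
| 
| func TestSchemaApplied(t *testing.T) {
| 	ctx := context.Background()
| 
| 	// Assumed signature: creates a temporary database, applies the current
| 	// sql.Schema, and registers cleanup to drop the database afterwards.
| 	db := sqltest.NewCockroachDBForTests(t, "mystore")
| 
| 	// The Shortcuts table comes from the generated schema; an empty count
| 	// shows the schema was applied to a fresh database.
| 	var n int
| 	if err := db.QueryRow(ctx, "SELECT count(*) FROM Shortcuts").Scan(&n); err != nil {
| 		t.Fatal(err)
| 	}
| 	if n != 0 {
| 		t.Fatalf("expected empty Shortcuts table, got %d rows", n)
| 	}
| }
| ```
| 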
| This comprehensive approach to schema management ensures that Perf's database |
| can be reliably deployed, maintained, and evolved. The separation of concerns |
| (schema definition, generation, validation, migration, and testing) makes the |
| system robust and easier to understand. |
| |
| # Module: /go/stepfit |
| |
| The `stepfit` module is designed to analyze time-series data, specifically |
| performance traces, to detect significant changes or "steps." It employs various |
| statistical algorithms to determine if a step up (performance improvement), a |
| step down (performance regression), or no significant change has occurred in the |
| data. This module is crucial for automated performance monitoring, allowing for |
| the identification of impactful changes in system behavior. |
| |
| The core idea is to fit a step function to the input trace data. A step function |
| is a simple function that is constant except for a single jump (the "step") at a |
| particular point (the "turning point"). The module calculates the best fit for |
| such a function and then evaluates the characteristics of this fit to determine |
| the nature and significance of the step. |
| |
| **Key Components and Logic:** |
| |
| The primary entity in this module is the `StepFit` struct. It encapsulates the |
| results of the step detection analysis: |
| |
| - `LeastSquares`: This field stores the Least Squares Error (LSE) of the |
| fitted step function. A lower LSE generally indicates a better fit of the |
| step function to the data. It's important to note that not all step |
| detection algorithms calculate or use LSE; in such cases, this field is set |
| to `InvalidLeastSquaresError`. |
| - `TurningPoint`: This integer indicates the index in the input trace where |
| the step function changes its value. It essentially marks the location of |
| the detected step. |
| - `StepSize`: This float represents the magnitude of the change in the step |
| function. A negative `StepSize` implies a step _up_ in the trace values |
| (conventionally a performance regression, e.g., increased latency). |
| Conversely, a positive `StepSize` indicates a step _down_ (conventionally a |
| performance improvement, e.g., decreased latency). |
| - `Regression`: This value is a metric used to quantify the significance or |
| "interestingness" of the detected step. Its calculation varies depending on |
| the chosen `stepDetection` algorithm. |
| - For the `OriginalStep` algorithm, it's calculated as `StepSize / LSE` |
| (or `StepSize / stddevThreshold` if LSE is too small). A larger absolute |
| value of `Regression` implies a more significant step. |
| - For other algorithms like `AbsoluteStep`, `PercentStep`, and |
| `CohenStep`, `Regression` is directly related to the `StepSize` (or a |
| normalized version of it). |
| - For `MannWhitneyU`, `Regression` represents the p-value of the test. |
| - `Status`: This is an enumerated type (`StepFitStatus`) indicating the |
| overall assessment of the step: |
| - `LOW`: A step down was detected, often interpreted as a performance |
| improvement. |
| - `HIGH`: A step up was detected, often interpreted as a performance |
| regression. |
| - `UNINTERESTING`: No significant step was found. |
| |
| The main function responsible for performing the analysis is `GetStepFitAtMid`. |
| It takes the following inputs: |
| |
| - `trace`: A slice of `float32` representing the time-series data to be |
| analyzed. |
| - `stddevThreshold`: A threshold for standard deviation. This is used in the |
| `OriginalStep` algorithm for normalizing the trace and as a floor for |
| standard deviation in other algorithms like `CohenStep` to prevent division |
| by zero or near-zero values. |
| - `interesting`: A threshold value used to determine if a calculated |
| `Regression` value is significant enough to be classified as `HIGH` or |
| `LOW`. The exact interpretation of this threshold depends on the |
| `stepDetection` algorithm. |
| - `stepDetection`: An enumerated type (`types.StepDetection`) specifying which |
| algorithm to use for step detection. |
| |
| **Workflow of `GetStepFitAtMid`:** |
| |
| 1. **Initialization and Preprocessing:** |
| |
| - A new `StepFit` struct is initialized with `Status` set to |
| `UNINTERESTING`. |
| - If the trace length is less than `minTraceSize` (currently 3), the |
| function returns the initialized `StepFit` as there isn't enough data to |
| analyze. |
| - **Trace Normalization/Adjustment:** |
| - If `stepDetection` is `types.OriginalStep`, the input `trace` is |
| duplicated and normalized (mean centered and scaled by its standard |
| deviation, unless the standard deviation is below |
| `stddevThreshold`). |
| - For all other `stepDetection` types, if the trace has an odd length, |
| the last element is dropped to make the trace length even. This is |
| because these algorithms typically compare the first half of the |
| trace with the second half. |
| |
| 2. **Step Detection Algorithm Execution:** The function then proceeds based on |
| the selected `stepDetection` algorithm. The core logic involves splitting |
| the (potentially modified) trace roughly in half at the `TurningPoint` |
| (which is `len(trace) / 2`) and comparing statistics of the two halves. |
| |
| - **`types.OriginalStep`:** |
| |
| - Calculates the mean of the first half (`y0`) and the second half |
| (`y1`) of the (normalized) trace. |
| - Computes the Sum of Squared Errors (SSE) for fitting `y0` to the |
| first half and `y1` to the second half. The `LeastSquares` error |
| (`lse`) is derived from this SSE. |
| - `StepSize` is `y0 - y1`. |
|     - `Regression` is calculated as `StepSize / lse` (or
|       `StepSize / stddevThreshold` if `lse` is too small). _Note: The
|       original implementation has a slight deviation from the standard
|       definition of standard error in this calculation._
| |
| - **`types.AbsoluteStep`:** |
| |
| - `StepSize` is `y0 - y1`. |
| - `Regression` is simply the `StepSize`. |
| - The step is considered interesting if the absolute value of |
| `StepSize` meets the `interesting` threshold. |
| |
| - **`types.Const`:** |
| |
| - This algorithm behaves differently. It focuses on the absolute value |
| of the trace point at the `TurningPoint` (`trace[i]`). |
| - `StepSize` is `abs(trace[i]) - interesting`. |
| - `Regression` is `-1 * abs(trace[i])`. This is done so that larger |
| deviations (regressions) result in more negative `Regression` |
| values, which are then flagged as `HIGH`. |
| |
| - **`types.PercentStep`:** |
| |
| - `StepSize` is `(y0 - y1) / y0`, representing the percentage change |
| relative to the mean of the first half. |
| - Handles potential `Inf` or `NaN` results from the division (e.g., if |
| `y0` is zero). |
| - `Regression` is the `StepSize`. |
| |
| - **`types.CohenStep`:** |
| |
| - Calculates Cohen's d, a measure of effect size. |
| - `StepSize` is `(y0 - y1) / s_pooled`, where `s_pooled` is the pooled |
| standard deviation of the two halves (or `stddevThreshold` if |
| `s_pooled` is too small or NaN). |
| - `Regression` is the `StepSize`. |
| |
| - **`types.MannWhitneyU`:** |
| |
| - Performs a Mann-Whitney U test (a non-parametric test) to determine |
| if the two halves of the trace come from different distributions. |
| - `StepSize` is `y0 - y1`. |
| - `Regression` is the p-value of the test. |
| - `LeastSquares` is set to the U-statistic from the test. |
| |
| 3. **Status Determination:** |
| |
| - For `types.MannWhitneyU`: |
| - If `Regression` (p-value) is less than or equal to the `interesting` |
| threshold (e.g., 0.05), a significant difference is detected. |
| - The `Status` (`HIGH` or `LOW`) is then determined by the sign of |
| `StepSize`. If `StepSize` is negative (step up), `Status` is `HIGH`. |
| Otherwise, it's `LOW`. |
| - The `Regression` value is then negated if the status is `HIGH` to |
| align with the convention that more negative values are "worse." |
| - For all other algorithms: |
| - If `Regression` is greater than or equal to `interesting`, `Status` |
| is `LOW`. |
| - If `Regression` is less than or equal to `-interesting`, `Status` is |
| `HIGH`. |
| - Otherwise, `Status` remains `UNINTERESTING`. |
| |
| 4. **Return Result:** The populated `StepFit` struct, containing |
| `LeastSquares`, `TurningPoint`, `StepSize`, `Regression`, and `Status`, is |
| returned. |
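| 
| A caller-side sketch of the workflow above follows. The import paths are
| assumptions for illustration; the argument order matches the inputs listed
| earlier, and the exact return type should be confirmed against the package.
| 
| ```
| package main
| 
| import (
| 	"fmt"
| 
| 	// Import paths assumed for illustration.
| 	"go.skia.org/infra/perf/go/stepfit"
| 	"go.skia.org/infra/perf/go/types"
| )
| 
| func main() {
| 	// A trace with an obvious step up at the midpoint.
| 	trace := []float32{1, 1, 1, 1, 5, 5, 5, 5}
| 
| 	// Arguments: trace, stddevThreshold, interesting, algorithm.
| 	fit := stepfit.GetStepFitAtMid(trace, 0.1, 2.0, types.OriginalStep)
| 
| 	fmt.Println(fit.TurningPoint) // 4: the midpoint of the trace
| 	fmt.Println(fit.StepSize < 0) // true: a step up yields a negative StepSize
| 	fmt.Println(fit.Status)       // expected: HIGH (a regression) at this sensitivity
| }
| ```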
| |
| **Design Rationale:** |
| |
| - **Multiple Algorithms:** The inclusion of various step detection algorithms |
| (`OriginalStep`, `AbsoluteStep`, `PercentStep`, `CohenStep`, `MannWhitneyU`) |
| provides flexibility. Different datasets and performance characteristics may |
| be better suited to different statistical approaches. For instance, |
| `MannWhitneyU` is non-parametric and makes fewer assumptions about the data |
| distribution, which can be beneficial for noisy or non-Gaussian data. |
| `AbsoluteStep` and `PercentStep` offer simpler, more direct ways to define a |
| regression based on absolute or relative changes. |
| - **Centralized Logic:** The `GetStepFitAtMid` function consolidates the logic |
| for all supported algorithms, making it easier to manage and extend. |
| - **Clear `StepFit` Structure:** The `StepFit` struct provides a well-defined |
| way to communicate the results of the analysis, separating the raw metrics |
| (like `StepSize`, `LeastSquares`) from the final interpretation (`Status`). |
| - **`interesting` Threshold:** The `interesting` parameter allows users to |
| customize the sensitivity of the step detection. This is crucial because |
| what constitutes a "significant" change can vary greatly depending on the |
| context of the performance metric being monitored. |
| - **`stddevThreshold`:** This parameter helps in handling cases with very low |
| variance, preventing numerical instability (like division by zero) and |
| ensuring that normalization in `OriginalStep` behaves reasonably. |
| - **Focus on the Middle:** The `GetStepFitAtMid` name implies that the step |
| detection is focused around the middle of the trace. This is a common |
| approach for detecting a single, prominent step. More complex scenarios with |
| multiple steps would require different techniques. |
| |
| **Why specific implementation choices?** |
| |
| - **Normalization in `OriginalStep`:** Normalizing the trace in the |
| `OriginalStep` algorithm (as described in the linked blog post) aims to make |
| the detection less sensitive to the absolute scale of the data and more |
| focused on the relative change. |
| - **Symmetric Traces for Non-`OriginalStep`:** For algorithms other than |
| `OriginalStep`, ensuring an even trace length by potentially dropping the |
| last point simplifies the division of the trace into two equal halves for |
| comparison. |
| - **Handling of `Inf` and `NaN` in `PercentStep`:** Explicitly checking for |
| and handling `Inf` and `NaN` values that can arise from division by zero |
| (when `y0` is zero) makes the `PercentStep` calculation more robust. |
| - **`Regression` as p-value for `MannWhitneyU`:** Using the p-value as the |
| `Regression` metric for `MannWhitneyU` directly reflects the statistical |
| significance of the observed difference between the two halves of the trace. |
| The `interesting` threshold then acts as the significance level (alpha). |
| - **`InvalidLeastSquaresError`:** This constant provides a clear way to |
| indicate when LSE is not applicable or not calculated by a particular |
| algorithm, avoiding confusion with a calculated LSE of 0 or a negative |
| value. |
| |
| In essence, the `stepfit` module provides a toolkit for identifying abrupt |
| changes in performance data, offering different lenses (algorithms) through |
| which to view and quantify these changes. The design prioritizes flexibility in |
| algorithm choice and user-configurable sensitivity to cater to diverse |
| performance analysis needs. |
| |
| # Module: /go/subscription |
| |
| The `subscription` module manages alerting configurations, known as |
| subscriptions, for anomalies detected in performance data. It provides the means |
| to define, store, and retrieve these configurations. |
| |
| The core concept is that a "subscription" dictates how the system should react |
| when an anomaly is found. This includes details like which bug tracker component |
| to file an issue under, what labels to apply, who to CC on the bug, and the |
| priority/severity of the issue. This allows for automated and consistent |
| handling of performance regressions. |
| |
| Subscriptions are versioned using an `infra_internal` Git hash (revision). This |
| allows for tracking changes to subscription configurations over time and ensures |
| that the correct configuration is used based on the state of the infrastructure |
| code. |
| |
| **Key Components and Files:** |
| |
| - **`store.go`**: Defines the `Store` interface. This interface is the central |
| abstraction for interacting with subscription data. It dictates the |
| operations that any concrete subscription storage implementation must |
| provide. This design choice allows for flexibility in the underlying storage |
| mechanism (e.g., SQL database, in-memory store for testing). |
| |
| - **Why an interface?** Decouples the business logic from the specific |
| storage implementation. This promotes testability (using mocks) and |
| allows for easier migration to different database technologies in the |
| future if needed. |
| - **Key methods:** |
| - `GetSubscription`: Retrieves a specific version of a subscription. |
| - `GetActiveSubscription`: Retrieves the currently active version of a |
| subscription by its name. This is likely the most common retrieval |
| method for active alerting. |
| - `InsertSubscriptions`: Allows for batch insertion of new subscriptions. |
| This is typically done within a database transaction to ensure atomicity |
| – either all subscriptions are inserted, or none are. This is crucial |
| when updating configurations, as it prevents a partially updated state. |
| The implementation in `sqlsubscriptionstore` deactivates all existing |
| subscriptions before inserting the new ones as active, effectively |
| replacing the entire active set. |
| - `GetAllSubscriptions`: Retrieves all historical versions of all |
| subscriptions. |
| - `GetAllActiveSubscriptions`: Retrieves all currently active |
| subscriptions. This is useful for systems that need to know all current |
| alerting rules. |
| |
| - **`proto/v1/subscription.proto`**: Defines the structure of a `Subscription` |
| using Protocol Buffers. This is the canonical data model for subscriptions. |
| |
| - **Why Protocol Buffers?** Provides a language-neutral, platform-neutral, |
| extensible mechanism for serializing structured data. This is beneficial |
| for potential interoperability with other services or for persisting |
| data in a well-defined format. It also generates efficient serialization |
| and deserialization code. |
| - **Key fields:** `name`, `revision`, `bug_labels`, `hotlists`, |
| `bug_component`, `bug_priority`, `bug_severity`, `bug_cc_emails`, |
| `contact_email`. Each field directly maps to a configuration aspect for |
| bug filing and contact information. |
| |
| - **`sqlsubscriptionstore/sqlsubscriptionstore.go`**: Provides a concrete |
| implementation of the `Store` interface using an SQL database (specifically |
| designed for CockroachDB, as indicated by the use of `pgx`). |
| |
| - **Why SQL?** Relational databases offer robust data integrity, |
| transaction support (ACID properties), and powerful querying |
| capabilities, which are well-suited for managing structured |
| configuration data like subscriptions. |
| - **How it works:** It defines SQL statements for each operation in the |
| `Store` interface. When inserting subscriptions, it first deactivates |
| all existing subscriptions and then inserts the new ones as active. This |
| ensures that only the latest set of configurations is considered active. |
| - The `is_active` boolean column in the database schema |
| (`sqlsubscriptionstore/schema/schema.go`) is key to this "active |
| version" concept. |
| |
| - **`sqlsubscriptionstore/schema/schema.go`**: Defines the SQL table schema |
| for storing subscriptions. |
| |
| - **Key design choice:** The primary key is a composite of `name` and |
| `revision`. This allows multiple versions of the same named subscription |
| to exist, identified by their revision. The `is_active` field |
| differentiates the current version from historical ones. |
| |
| - **`mocks/Store.go`**: Contains a mock implementation of the `Store` |
| interface, generated by the `mockery` tool. |
| |
| - **Why mocks?** Essential for unit testing components that depend on the |
| `Store` interface without requiring an actual database connection. This |
| makes tests faster, more reliable, and isolates the unit under test. |
| |
| **Key Workflows:** |
| |
| 1. **Updating Subscriptions:** This typically happens when configurations in |
| `infra_internal` are changed. |
| |
| ``` |
| External Process (e.g., config syncer) |
| | |
| v |
| Reads new subscription definitions (likely from files) |
| | |
| v |
| Parses definitions into []*pb.Subscription |
| | |
| v |
| Calls store.InsertSubscriptions(ctx, newSubscriptions, tx) |
| | |
| |--> [SQL Transaction Start] |
| | | |
| | v |
| | sqlsubscriptionstore: Deactivate all existing subscriptions (UPDATE Subscriptions SET is_active=false WHERE is_active=true) |
| | | |
| | v |
| | sqlsubscriptionstore: Insert each new subscription with is_active=true (INSERT INTO Subscriptions ...) |
| | | |
| | v |
| |--> [SQL Transaction Commit/Rollback] |
| ``` |
| |
| This ensures that the update is atomic. If any part fails, the transaction |
| is rolled back, leaving the previous set of active subscriptions intact. |
| |
| 2. **Anomaly Detection Triggering Alerting:**
| 
|    ```
|    Anomaly Detector
|      |
|      v
|    Identifies an anomaly and the relevant subscription name
|    (e.g., based on metric patterns)
|      |
|      v
|    Calls store.GetActiveSubscription(ctx, subscriptionName)
|      |
|      v
|    sqlsubscriptionstore: Retrieves the active subscription
|    (SELECT ... FROM Subscriptions WHERE name=$1 AND is_active=true)
|      |
|      v
|    Anomaly Detector uses the pb.Subscription details (bug component,
|    labels, etc.) to file a bug.
|    ```
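| 
| The retrieval side of workflow 2 can be sketched in a few lines. The import
| path and the generated field names (`BugComponent`, `BugCcEmails`,
| `BugPriority`) are assumptions derived from the proto fields listed above.
| 
| ```
| package alerting
| 
| import (
| 	"context"
| 	"fmt"
| 
| 	"go.skia.org/infra/perf/go/subscription" // path assumed for illustration
| )
| 
| // fileBugForAnomaly sketches workflow 2: look up the active subscription
| // by name and use its routing details to file a bug.
| func fileBugForAnomaly(ctx context.Context, store subscription.Store, name string) error {
| 	sub, err := store.GetActiveSubscription(ctx, name)
| 	if err != nil {
| 		return err
| 	}
| 	// Field names assumed from the proto definition (bug_component, etc.).
| 	fmt.Printf("file bug in %s, cc %v, priority %d\n",
| 		sub.BugComponent, sub.BugCcEmails, sub.BugPriority)
| 	return nil
| }
| ```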
| |
| This module provides a robust and versioned way to manage alerting rules, |
| ensuring that performance regressions are handled consistently and routed |
| appropriately. The separation of interface and implementation, along with the |
| use of Protocol Buffers, contributes to a maintainable and extensible system. |
| |
| # Module: /go/tracecache |
| |
| ## TraceCache Module Documentation |
| |
| The `tracecache` module provides a mechanism for caching trace identifiers |
| (trace IDs) associated with specific tiles and queries. This caching layer |
| significantly improves performance by reducing the need to repeatedly compute or |
| fetch trace IDs, which can be a computationally expensive operation. |
| |
| **Core Functionality & Design Rationale:** |
| |
| The primary purpose of `tracecache` is to store and retrieve lists of trace IDs. |
| Trace IDs are represented as `paramtools.Params`, which are essentially |
| key-value pairs that uniquely identify a specific trace within the performance |
| monitoring system. |
| |
| The caching strategy is built around the concept of a "tile" and a "query." |
| |
| - **Tile:** In the context of Skia Perf, a tile represents a chunk of commit |
| history. Caching trace IDs per tile allows for efficient retrieval of |
| relevant traces when analyzing a specific range of commits. |
| - **Query:** A query, represented by `query.Query`, defines the specific |
| parameters used to filter traces. Different queries will yield different |
| sets of trace IDs. |
| |
| By combining the tile number and a string representation of the query, a unique |
| cache key is generated. This ensures that cached data is specific to the exact |
| combination of commit range and filter criteria. |
| |
| The module relies on an external caching implementation provided via the |
| `go/cache.Cache` interface. This design choice promotes flexibility, allowing |
| different caching backends (e.g., in-memory, Redis, Memcached) to be used |
| without modifying the `tracecache` logic itself. This separation of concerns is |
| crucial for adapting to various deployment environments and performance |
| requirements. |
| |
| **Key Components:** |
| |
| - **`traceCache.go`**: This is the sole file in the module and contains the |
| implementation of the `TraceCache` struct and its associated methods. |
| - **`TraceCache` struct**: |
| - Holds an instance of `cache.Cache`. This is the underlying cache client |
| used for storing and retrieving data. |
| - **`New(cache cache.Cache) *TraceCache`**: |
| - The constructor for `TraceCache`. It takes a `cache.Cache` instance as |
| an argument, which will be used for all caching operations. This |
| dependency injection allows the caller to provide any cache |
| implementation that conforms to the `cache.Cache` interface. |
| - **`CacheTraceIds(ctx context.Context, tileNumber types.TileNumber, q |
| *query.Query, traceIds []paramtools.Params) error`**: |
| - This method is responsible for storing a list of trace IDs into the |
| cache. |
| - It first generates a unique `cacheKey` using the `tileNumber` and the |
| `query.Query`. |
| - The `traceIds` (a slice of `paramtools.Params`) are then serialized into |
| a JSON string using the `toJSON` helper function. This serialization is |
| necessary because most cache backends store data as strings or byte |
| arrays. JSON is chosen for its human-readability and widespread support. |
| - Finally, it uses the `cacheClient.SetValue` method to store the JSON |
| string under the generated `cacheKey`. |
| - **`GetTraceIds(ctx context.Context, tileNumber types.TileNumber, q |
| *query.Query) ([]paramtools.Params, error)`**: |
| - This method retrieves a list of trace IDs from the cache. |
| - It generates the `cacheKey` in the same way as `CacheTraceIds`. |
| - It then attempts to fetch the value associated with this key using |
| `cacheClient.GetValue`. |
| - If the value is not found in the cache (i.e., `cacheJson` is empty), it |
| returns `nil` for both the trace IDs and the error, indicating a cache |
| miss. |
| - If a value is found, it deserializes the JSON string back into a slice |
| of `paramtools.Params` using `json.Unmarshal`. |
| - **`traceIdCacheKey(tileNumber types.TileNumber, q query.Query) |
| string`**: |
| - A private helper function that constructs the cache key. It combines the |
| `tileNumber` (an integer) and a string representation of the |
| `query.Query` (obtained via `q.KeyValueString()`) separated by an |
| underscore. This format ensures uniqueness and provides some |
| human-readable context within the cache keys. |
| - **`toJSON(obj interface{}) (string, error)`**: |
| - A private generic helper function to marshal any given object into a |
| JSON string. This is used specifically for serializing the |
| `[]paramtools.Params` before caching. |
| |
| **Workflow for Caching Trace IDs:** |
| |
| 1. **Input:** `tileNumber`, `query.Query`, `[]paramtools.Params` (trace IDs to |
| cache) |
| 2. `CacheTraceIds` is called. |
| 3. `traceIdCacheKey(tileNumber, query)` generates a unique key. `tileNumber + |
| "_" + query.KeyValueString() ---> cacheKey` |
| 4. `toJSON(traceIds)` serializes the list of trace IDs into a JSON string. |
| `[]paramtools.Params --json.Marshal--> jsonString` |
| 5. `t.cacheClient.SetValue(ctx, cacheKey, jsonString)` stores the JSON string |
| in the underlying cache. |
| |
| **Workflow for Retrieving Trace IDs:** |
| |
| 1. **Input:** `tileNumber`, `query.Query` |
| 2. `GetTraceIds` is called. |
| 3. `traceIdCacheKey(tileNumber, query)` generates the cache key (same logic as |
| above). `tileNumber + "_" + query.KeyValueString() ---> cacheKey` |
| 4. `t.cacheClient.GetValue(ctx, cacheKey)` attempts to retrieve the value from |
| the cache. `cacheClient --GetValue(cacheKey)--> jsonString (or empty if not |
| found)` |
| 5. **If `jsonString` is empty (cache miss):** Return `nil`, `nil`. |
| 6. **If `jsonString` is not empty (cache hit):** |
| `json.Unmarshal([]byte(jsonString), &traceIds)` deserializes the JSON string |
| back into `[]paramtools.Params`. `jsonString --json.Unmarshal--> |
| []paramtools.Params` |
| 7. Return the deserialized `[]paramtools.Params` and `nil` error. |
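| 
| Since the method signatures are documented above, usage can be sketched
| directly; only the import paths are assumptions.
| 
| ```
| package traces
| 
| import (
| 	"context"
| 	"fmt"
| 
| 	"go.skia.org/infra/go/cache"      // the Cache interface the module depends on
| 	"go.skia.org/infra/go/paramtools" // trace IDs are paramtools.Params
| 	"go.skia.org/infra/go/query"
| 	"go.skia.org/infra/perf/go/tracecache" // path assumed for illustration
| 	"go.skia.org/infra/perf/go/types"
| )
| 
| // warmAndRead runs the two workflows above back to back.
| func warmAndRead(ctx context.Context, c cache.Cache, tile types.TileNumber, q *query.Query, ids []paramtools.Params) error {
| 	tc := tracecache.New(c)
| 
| 	// Caching: the IDs are serialized to JSON and stored under
| 	// tileNumber + "_" + q.KeyValueString().
| 	if err := tc.CacheTraceIds(ctx, tile, q, ids); err != nil {
| 		return err
| 	}
| 
| 	// Retrieval: the same key is recomputed; nil, nil signals a cache miss.
| 	got, err := tc.GetTraceIds(ctx, tile, q)
| 	if err != nil {
| 		return err
| 	}
| 	if got == nil {
| 		fmt.Println("cache miss")
| 		return nil
| 	}
| 	fmt.Printf("cache hit: %d trace IDs\n", len(got))
| 	return nil
| }
| ```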
| |
| # Module: /go/tracefilter |
| |
| ## Tracefilter Module Documentation |
| |
| The `tracefilter` module provides a mechanism to organize and filter trace data |
| based on their hierarchical paths. The core idea is to represent traces within a |
| tree structure, where each node in the tree corresponds to a segment of the |
| trace's path. This allows for efficient filtering of traces, specifically to |
| identify "leaf" traces – those that do not have any further sub-paths. |
| |
| This approach is particularly useful in scenarios where traces have a |
| parent-child relationship implied by their path structure. For instance, in |
| performance analysis, a trace like `/root/p1/p2/p3/t1` might represent a |
| specific test (`t1`) under a series of nested configurations (`p1`, `p2`, `p3`). |
| If there's another trace `/root/p1/p2`, it could be considered a "parent" or an |
| aggregate trace. The `tracefilter` helps in identifying only the most specific, |
| or "leaf," traces, effectively filtering out these higher-level parent traces. |
| |
| ### Key Components and Responsibilities |
| |
| The primary component is the `TraceFilter` struct. |
| |
| **`TraceFilter` struct:** |
| |
| - **Purpose:** Represents a node within the trace path tree. |
| - **Fields:** |
| - `traceKey`: A string identifier associated with the trace path ending at |
| this node. For the root of the tree, this is initialized to "HEAD". |
| - `value`: The string value of the current path segment this node |
| represents. |
| - `children`: A map where keys are the next path segments and values are |
| pointers to child `TraceFilter` nodes. This map forms the branches of |
| the tree. |
| - **Why this structure?** |
| - A tree is a natural way to represent hierarchical path data. |
| - Using a map for `children` allows for efficient lookup and addition of |
| child nodes based on the next path segment. |
| - Storing the `traceKey` at each node allows associating an identifier |
| with a complete path as it's being built. |
| |
| **`NewTraceFilter()` function:** |
| |
| - **Purpose:** Acts as the constructor for the `TraceFilter` tree. |
| - **How it works:** It initializes a root `TraceFilter` node. The `traceKey` |
| is set to "HEAD" as a sentinel value for the root, and its `children` map is |
| initialized as empty, ready to have paths added to it. |
| - **Why this design?** Provides a clear and simple entry point for creating a |
| new filter structure. |
| |
| **`AddPath(path []string, traceKey string)` method:** |
| |
| - **Purpose:** Adds a new trace, defined by its `path` (a slice of strings |
| representing path segments) and its unique `traceKey`, to the filter tree. |
| - **How it works:** |
| 1. It traverses the tree, creating new nodes as needed for each segment in |
| the input `path`. |
| 2. If a segment in the `path` already exists as a child of the current |
| node, it moves to that existing child. |
| 3. If a segment does not exist, a new `TraceFilter` node is created for |
| that segment, its `value` is set to the segment string, its `traceKey` |
| is set to the input `traceKey`, and it's added to the `children` map of |
| the current node. |
| 4. This process repeats recursively for the remaining segments in the |
| `path`. |
| - **Why this design?** |
| |
| - This incremental build process efficiently constructs the tree by |
| reusing existing nodes for common path prefixes. |
| - The recursive nature elegantly handles paths of arbitrary length. |
| - Associating the `traceKey` with each newly created node ensures that |
| even intermediate nodes (which might later become leaves if no further |
| sub-paths are added) have an associated key. |
| |
| ``` |
| Example: Adding path ["root", "p1", "p2"] with key "keyA" |
| |
| Initial Tree: |
| (HEAD) |
| |
| After AddPath(["root", "p1", "p2"], "keyA"): |
| |
| (HEAD) |
| | |
| +-- ("root", key="keyA") |
| | |
| +-- ("p1", key="keyA") |
| | |
| +-- ("p2", key="keyA") <- Leaf node initially |
| ``` |
| |
| If we then add `["root", "p1", "p2", "t1"]` with key `"keyB"`: |
| |
| ``` |
| (HEAD) |
| | |
| +-- ("root", key="keyB") // traceKey updated if path is prefix |
| | |
| +-- ("p1", key="keyB") |
| | |
| +-- ("p2", key="keyB") |
| | |
| +-- ("t1", key="keyB") <- New leaf node |
| ``` |
| |
| _Note: `AddPath` also reassigns the `traceKey` of existing nodes along the |
| path being added, so the key stored at an interior node reflects the most |
| recently added path that passes through it. This does not affect the |
| filtering result, because `GetLeafNodeTraceKeys` only reads the `traceKey` |
| of nodes that end up as leaves._ |
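| |
| A recursive sketch of `AddPath` consistent with the behavior described and |
| illustrated above; an illustration, not the verbatim implementation: |
| |
| ``` |
| func (tf *TraceFilter) AddPath(path []string, traceKey string) { |
|   if len(path) == 0 { |
|     return |
|   } |
|   segment := path[0] |
|   child, ok := tf.children[segment] |
|   if !ok { |
|     child = &TraceFilter{ |
|       value:    segment, |
|       children: map[string]*TraceFilter{}, |
|     } |
|     tf.children[segment] = child |
|   } |
|   // Reassign on every visit so an existing prefix node picks up the |
|   // key of the most recently added path through it. |
|   child.traceKey = traceKey |
|   child.AddPath(path[1:], traceKey) |
| } |
| ``` |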
| |
| **`GetLeafNodeTraceKeys()` method:** |
| |
| - **Purpose:** Retrieves the `traceKey`s of all traces that are considered |
| "leaf" nodes in the tree. A leaf node is a node that has no children. |
| - **How it works:** |
| 1. It performs a depth-first traversal of the tree. |
| 2. If the current node has no children (i.e., `len(tf.children) == 0`), its |
| `traceKey` is considered a leaf key and is added to the result list. |
| 3. If the current node has children, the method recursively calls itself on |
| each child node and aggregates the results. |
| - **Why this design?** |
| |
| - This is the core filtering logic. By only returning keys from nodes |
| without children, it effectively filters out traces that serve as |
| prefixes (parents) to other, more specific traces. |
| - Recursion is a natural fit for traversing tree structures. |
| |
| ``` |
| Workflow for GetLeafNodeTraceKeys: |
| |
| Start at (CurrentNode) |
| | |
| V |
| Is CurrentNode a leaf (no children)? |
| | |
| +-- YES --> Add CurrentNode.traceKey to results |
| | |
| +-- NO --> For each ChildNode in CurrentNode.children: |
| | |
| V |
| Recursively call GetLeafNodeTraceKeys on ChildNode |
| | |
| V |
| Append results from ChildNode to overall results |
| | |
| V |
| Return aggregated results |
| ``` |
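| |
| In code, the traversal is a few lines (a sketch; note that a tree with no |
| paths added would return the "HEAD" sentinel, an edge case ignored here): |
| |
| ``` |
| func (tf *TraceFilter) GetLeafNodeTraceKeys() []string { |
|   // A node with no children is a leaf; its key survives the filter. |
|   if len(tf.children) == 0 { |
|     return []string{tf.traceKey} |
|   } |
|   keys := []string{} |
|   for _, child := range tf.children { |
|     keys = append(keys, child.GetLeafNodeTraceKeys()...) |
|   } |
|   return keys |
| } |
| ``` |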
| |
| ### Example Scenario and How it Works |
| |
| Consider the following traces and their paths: |
| |
| 1. `traceA`: path `["config", "test_group", "test1"]`, key `"keyA"` |
| 2. `traceB`: path `["config", "test_group"]`, key `"keyB"` |
| 3. `traceC`: path `["config", "test_group", "test2"]`, key `"keyC"` |
| 4. `traceD`: path `["config", "other_group", "test3"]`, key `"keyD"` |
| |
| 1. **Tree Construction (`AddPath` calls):** |
| |
| - `tf.AddPath(["config", "test_group", "test1"], "keyA")` |
| - `tf.AddPath(["config", "test_group"], "keyB")` |
| - When this is added, the node for `"test_group"` initially created by |
| `keyA` will have its `traceKey` updated to `"keyB"`. |
| - `tf.AddPath(["config", "test_group", "test2"], "keyC")` |
| - `tf.AddPath(["config", "other_group", "test3"], "keyD")` |
| |
| The tree would look something like this (simplified, showing the |
| traceKeys relevant to leaf filtering): |
| |
| ``` |
| (HEAD) |
| | |
| +-- ("config") |
| | |
| +-- ("test_group", traceKey likely updated by "keyB" during AddPath) |
| | | |
| | +-- ("test1", traceKey="keyA") <-- Leaf |
| | | |
| | +-- ("test2", traceKey="keyC") <-- Leaf |
| | |
| +-- ("other_group") |
| | |
| +-- ("test3", traceKey="keyD") <-- Leaf |
| ``` |
| |
| 2. **Filtering (`GetLeafNodeTraceKeys()` call):** |
| |
| - When `GetLeafNodeTraceKeys()` is called on the root: |
| - It traverses to `"config"`. |
| - It traverses to `"test_group"`. This node has children (`"test1"` |
| and `"test2"`), so its key (`"keyB"`) is _not_ added. |
| - It traverses to `"test1"`. This is a leaf. `"keyA"` is added. |
| - It traverses to `"test2"`. This is a leaf. `"keyC"` is added. |
| - It traverses to `"other_group"`. |
| - It traverses to `"test3"`. This is a leaf. `"keyD"` is added. |
| |
| The result would be `["keyA", "keyC", "keyD"]`. Notice that `"keyB"` is |
| excluded because the path `["config", "test_group"]` has sub-paths |
| (`.../test1` and `.../test2`), making it a non-leaf node in the context of |
| trace specificity. |
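| |
| The whole scenario reduces to a few calls (a sketch; map iteration makes |
| the result order unspecified): |
| |
| ``` |
| tf := NewTraceFilter() |
| tf.AddPath([]string{"config", "test_group", "test1"}, "keyA") |
| tf.AddPath([]string{"config", "test_group"}, "keyB") |
| tf.AddPath([]string{"config", "test_group", "test2"}, "keyC") |
| tf.AddPath([]string{"config", "other_group", "test3"}, "keyD") |
| |
| keys := tf.GetLeafNodeTraceKeys() |
| // keys holds "keyA", "keyC", and "keyD" in some order; "keyB" is |
| // filtered out because its node has children. |
| ``` |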
| |
| This module provides a clean and efficient way to identify the most granular |
| traces in a dataset where hierarchy is defined by path structure. |
| |
| # Module: /go/tracesetbuilder |
| |
| The `tracesetbuilder` module is designed to efficiently construct a |
| `types.TraceSet` and its corresponding `paramtools.ReadOnlyParamSet` from |
| multiple, potentially disparate, sets of trace data. This is particularly useful |
| when dealing with performance data that might arrive in chunks (e.g., from |
| different "Tiles" of data) and needs to be aggregated into a coherent view |
| across a series of commits. |
| |
| The core challenge this module addresses is the concurrent and distributed |
| nature of processing trace data. If multiple traces with the same identifier |
| (key) were processed by different workers simultaneously without coordination, |
| it could lead to race conditions and incorrect data. Similarly, simply locking |
| the entire `TraceSet` for each update would create a bottleneck. |
| |
| The `tracesetbuilder` solves this by employing a worker pool (`mergeWorkers`). |
| The key design decision here is to distribute the work based on the trace key. |
| Each trace key is hashed (using `crc32.ChecksumIEEE`), and this hash determines |
| which `mergeWorker` is responsible for that specific trace. This ensures that |
| all data points for a single trace are always processed by the same worker, |
| thereby avoiding the need for explicit locking at the individual trace level |
| within the worker. Each `mergeWorker` maintains its own `types.TraceSet` and |
| `paramtools.ParamSet`. |
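| |
| The routing decision itself is small; a sketch of the idea, assuming a |
| pool of `numWorkers` `mergeWorker` instances (the helper name is |
| hypothetical): |
| |
| ``` |
| import "hash/crc32" |
| |
| // pickWorker returns the index of the mergeWorker responsible for a |
| // given trace key. Identical keys always map to the same worker, so no |
| // per-trace locking is required. |
| func pickWorker(traceKey string, numWorkers int) int { |
|   return int(crc32.ChecksumIEEE([]byte(traceKey)) % uint32(numWorkers)) |
| } |
| ``` |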
| |
| **Key Components and Workflow:** |
| |
| 1. **`TraceSetBuilder`:** |
| |
| - **Responsibilities:** |
| - Manages a pool of `mergeWorker` instances. |
| - Provides the `Add` method to ingest new trace data. |
| - Provides the `Build` method to consolidate results from all workers |
| and return the final `TraceSet` and `ReadOnlyParamSet`. |
| - Provides the `Close` method to shut down the worker pool. |
| - **`New(size int)`:** Initializes the `TraceSetBuilder`. The `size` |
| parameter is crucial as it defines the expected length of each trace in |
| the final, consolidated `TraceSet`. This allows the builder to |
| pre-allocate trace slices of the correct length, filling in missing data |
| points as necessary. It creates `numWorkers` instances of `mergeWorker`. |
| - **`Add(commitNumberToOutputIndex map[types.CommitNumber]int32, commits |
| []provider.Commit, traces types.TraceSet)`:** This is the entry point |
| for feeding data into the builder. |
| - `traces`: A `types.TraceSet` representing a chunk of data (e.g., |
| from a single tile). |
| - `commits`: A slice of `provider.Commit` objects corresponding to the |
| data points in the `traces`. |
| - `commitNumberToOutputIndex`: A map that dictates where each data |
| point from the input `traces` (identified by its |
| `types.CommitNumber`) should be placed in the _final_ output trace. |
| This mapping is essential for correctly aligning data points that |
| might come from different sources or represent different commit |
| ranges. |
| - For each trace in the input `traces`: |
| - It parses the trace key into `paramtools.Params`. |
| - It creates a `request` struct containing the key, params, the |
| trace data itself, the `commitNumberToOutputIndex` map, and the |
| `commits` slice. |
| - It calculates an index based on the CRC32 hash of the trace key |
| modulo `numWorkers`. |
| - It sends this `request` to the `ch` channel of the selected |
| `mergeWorker`. |
| - A `sync.WaitGroup` is incremented for each trace added, ensuring |
| `Build` waits for all processing to complete. |
| - **`Build(ctx context.Context)`:** |
| - Waits for all `Add` operations to be processed by the workers (using |
| `t.wg.Wait()`). |
| - Iterates through all `mergeWorkers`. |
| - Merges the `traceSet` and `paramSet` from each `mergeWorker` into a |
| single, final `types.TraceSet` and `paramtools.ParamSet`. |
| - Normalizes and freezes the final `paramSet` to create a |
| `paramtools.ReadOnlyParamSet`. |
| - Returns the consolidated `TraceSet` and `ReadOnlyParamSet`. |
| - **`Close()`:** Iterates through the `mergeWorkers` and closes their |
| respective input channels (`ch`). This signals the worker goroutines to |
| terminate once they have processed all pending requests. |
| |
| 2. **`mergeWorker`:** |
| |
| - **Responsibilities:** |
| - Processes `request` objects sent to its channel. |
| - Maintains its own local `types.TraceSet` and `paramtools.ParamSet`. |
| - Updates its local `TraceSet` with new data points, placing them |
| correctly according to `request.commitNumberToOutputIndex`. |
| - Adds the parameters from each processed trace to its local |
| `ParamSet`. |
| - **`newMergeWorker(wg *sync.WaitGroup, size int)`:** Creates a |
| `mergeWorker` and starts its goroutine. |
| - It initializes an empty `types.TraceSet` and `paramtools.ParamSet`. |
| - The goroutine continuously reads `request` objects from its `ch` |
| channel. |
| - For each `request`: |
| - It retrieves or creates a trace in its `m.traceSet` for the given |
| `req.key`. If creating, it uses `types.NewTrace(size)` to ensure the |
| trace has the correct final length. |
| - It iterates through the `req.commits` and uses |
| `req.commitNumberToOutputIndex` to determine the correct destination |
| index in its local trace for each data point in `req.trace`. |
| - It updates the trace value at that destination index. |
| - It adds `req.params` to its `m.paramSet`. |
| - It decrements the shared `sync.WaitGroup` (`m.wg.Done()`) to signal |
| completion of this piece of work. |
| - **`Process(req *request)`:** Sends a request to the worker's channel. |
| - **`Close()`:** Closes the worker's input channel. |
| |
| 3. **`request` struct:** |
| |
| - A simple data structure used to pass all necessary information for |
| processing a single trace segment through the pipeline to a |
| `mergeWorker`. It encapsulates the trace key, its parsed parameters, the |
| actual trace data segment, the mapping of commit numbers to output |
| indices, and the corresponding commit metadata. |
| |
| **Workflow Diagram:** |
| |
| ``` |
| TraceSetBuilder.New(outputTraceLength) |
| | |
| V |
| +-----------------------------------------------------------------------+ |
| | TraceSetBuilder (manages WaitGroup and pool of mergeWorkers) | |
| +-----------------------------------------------------------------------+ |
| | ^ |
| | Add(commitMap1, commits1, traces1) | Build() waits for WaitGroup |
| | Add(commitMap2, commits2, traces2) | |
| V | |
| +-----------------------------------------------------------------------+ |
| | For each trace in input: | |
| | 1. Parse key -> params | |
| | 2. Create 'request' struct | |
| | 3. Hash key -> workerIndex | |
| | 4. Send 'request' to mergeWorkers[workerIndex].ch | |
| | 5. Increment WaitGroup | |
| +-----------------------------------------------------------------------+ |
| | | | ... (numWorkers times) |
| V V V |
| +----------+ +----------+ +----------+ |
| | mergeW_0 | | mergeW_1 | | mergeW_N | (Each runs in its own goroutine) |
| | .ch      | | .ch      | | .ch      | |
| | .traceSet| | .traceSet| | .traceSet| |
| | .paramSet| | .paramSet| | .paramSet| |
| +----------+ +----------+ +----------+ |
| ^ ^ ^ |
| | Process request: | |
| | - Get/Create local trace for req.key (length: outputTraceLength) | |
| | - For each point in req.trace: | |
| | - Use req.commitNumberToOutputIndex[commitNum] to find dstIdx | |
| | - localTrace[dstIdx] = req.trace[srcIdx] | |
| | - Add req.params to local paramSet | |
| | - Decrement WaitGroup | |
| | | | |
| --------------------- (When TraceSetBuilder.Build() is called) |
| | |
| V |
| +-----------------------------------------------------------------------+ |
| | TraceSetBuilder.Build(): | |
| | 1. Wait for all 'Add' operations (WaitGroup.Wait()) | |
| | 2. Create finalTraceSet, finalParamSet | |
| | 3. For each mergeWorker: | |
| | - Merge worker.traceSet into finalTraceSet | |
| | - Merge worker.paramSet into finalParamSet | |
| | 4. Normalize and Freeze finalParamSet | |
| | 5. Return finalTraceSet, finalParamSet (ReadOnly) | |
| +-----------------------------------------------------------------------+ |
| | |
| V |
| +-----------------------------------------------------------------------+ |
| | TraceSetBuilder.Close(): | |
| | - Close channels of all mergeWorkers (signals them to terminate) | |
| +-----------------------------------------------------------------------+ |
| ``` |
| |
| `numWorkers` and `channelBufferSize` are constants that can be tuned |
| for performance based on the expected workload and system resources. The CRC32 |
| hash provides a reasonably good distribution of keys across workers, minimizing |
| the chance of one worker becoming a bottleneck. The `sync.WaitGroup` is |
| essential for ensuring that the `Build` method doesn't prematurely try to |
| aggregate results before all input data has been processed by the workers. |
| |
| The design allows for efficient, concurrent processing of large volumes of trace |
| data by partitioning the work based on trace identity and then merging the |
| results, making it suitable for building comprehensive views of performance |
| metrics over time. |
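| |
| A hypothetical end-to-end usage sketch (the per-tile variable names are |
| placeholders): |
| |
| ``` |
| builder := tracesetbuilder.New(outputTraceLength) |
| defer builder.Close() |
| |
| // Each Add may come from a different tile; the maps align each tile's |
| // commits with positions in the final, consolidated traces. |
| builder.Add(commitMapTile1, commitsTile1, tracesTile1) |
| builder.Add(commitMapTile2, commitsTile2, tracesTile2) |
| |
| // Build blocks until all queued traces have been processed. |
| traceSet, readOnlyParamSet := builder.Build(ctx) |
| ``` |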
| |
| # Module: /go/tracestore |
| |
| The `tracestore` module defines interfaces and implementations for storing and |
| retrieving performance trace data. It's a core component of the Perf system, |
| enabling the analysis of performance metrics over time and across different |
| configurations. |
| |
| ## Design Philosophy |
| |
| The primary goal of `tracestore` is to provide an efficient and scalable way to |
| manage large volumes of trace data. This involves: |
| |
| - **Tiled Storage:** Data is organized into "tiles," which are fixed-size |
| blocks of commits. This approach simplifies data management and allows for |
| efficient querying of data within specific time ranges. Each tile has its |
| own inverted index and ParamSet, making searches within a tile fast. |
| - **Inverted Indexing:** To quickly find traces matching specific criteria |
| (e.g., "arch=x86" and "config=8888"), `tracestore` uses an inverted index. |
| This index maps key-value pairs to the trace IDs that contain them within |
| each tile. |
| - **Caching:** Various caching mechanisms are employed to improve performance, |
| including: |
| - In-memory LRU caches for frequently accessed data like ParamSets and |
| recently written Postings/ParamSet entries. |
| - An optional external cache (like Memcached via `go/cache/memcached`) for |
| broader caching strategies. |
| - A `tracecache` for caching the results of `QueryTracesIDOnly` to speed |
| up repeated queries. |
| - **Interface-Based Design:** The module defines interfaces (`TraceStore`, |
| `MetadataStore`, `TraceParamStore`) to allow for different backend |
| implementations. This promotes flexibility and testability. The primary |
| implementation provided is `sqltracestore`, which uses an SQL database. |
| - **Concurrency:** Operations like writing traces and querying are designed to |
| be concurrent, leveraging Go routines and parallel processing to handle |
| large datasets efficiently. For instance, writing large batches of traces or |
| postings is often chunked and processed in parallel. |
| - **Separation of Concerns:** |
| - `TraceStore` handles the core logic of reading and writing trace values |
| and their associated parameters. |
| - `MetadataStore` manages metadata associated with source files (e.g., |
| links to dashboards or logs). |
| - `TraceParamStore` specifically handles the mapping between trace IDs |
| (MD5 hashes of trace names) and their full parameter sets. This |
| separation helps in optimizing storage and retrieval for these distinct |
| types of data. |
| |
| ## Key Components and Responsibilities |
| |
| The `tracestore` module is primarily defined by a set of interfaces and their |
| SQL-based implementations. |
| |
| ### `tracestore.go` |
| |
| This file defines the main `TraceStore` interface. It outlines the contract for |
| any system that wants to store and retrieve performance traces. Key |
| responsibilities include: |
| |
| - **Writing Traces (`WriteTraces`, `WriteTraces2`):** Ingesting new |
| performance data points. Each data point is associated with a specific |
| commit, a set of parameters (defining the trace, e.g., |
| `config=8888,arch=x86`), a value, the source file it came from, and a |
| timestamp. |
| - The `WriteTraces` method is designed to handle potentially large batches |
| of data efficiently. Implementations often involve chunking data and |
| performing parallel writes to the underlying storage. |
| - `WriteTraces2` is a newer variant, potentially for different storage |
| schemas or optimizations (e.g., denormalizing common params directly |
| into the trace values table as seen in `TraceValues2Schema`). |
| - **Reading Traces (`ReadTraces`, `ReadTracesForCommitRange`):** Retrieving |
| trace data for specific keys (trace names) within a given tile or commit |
| range. |
| - **Querying Traces (`QueryTraces`, `QueryTracesIDOnly`):** |
| - `QueryTraces` allows searching for traces based on a `query.Query` |
| object (which specifies parameter key-value pairs). It returns the |
| actual trace values and associated commit information. |
| - `QueryTracesIDOnly` is an optimization that returns only the |
| `paramtools.Params` (effectively the identifying parameters) of traces |
| matching a query. This is useful when only the list of matching traces |
| is needed, not their values. |
| - **Tile Management (`GetLatestTile`, `TileNumber`, `TileSize`, |
| `CommitNumberOfTileStart`):** Provides methods for interacting with the |
| tiled storage system. |
| - **ParamSet Management (`GetParamSet`):** Retrieving the |
| `paramtools.ReadOnlyParamSet` for a specific tile. A ParamSet represents all |
| unique key-value pairs present in the traces within that tile, which is |
| crucial for UI elements like query builders. |
| - **Source Information (`GetSource`, `GetLastNSources`, |
| `GetTraceIDsBySource`):** Retrieving information about the origin of trace |
| data, such as the ingested file name. |
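| |
| Condensed, the contract looks roughly like the sketch below; the parameter |
| lists are simplified and partly assumed from the descriptions and workflow |
| diagrams on this page: |
| |
| ``` |
| type TraceStore interface { |
|   // Write a batch of values, all attached to the same commit. |
|   WriteTraces(ctx context.Context, commitNumber types.CommitNumber, |
|     params []paramtools.Params, values []float32, |
|     ps paramtools.ParamSet, source string, ts time.Time) error |
| |
|   // Read values for known trace names within a tile. |
|   ReadTraces(ctx context.Context, tileNumber types.TileNumber, |
|     keys []string) (types.TraceSet, error) |
| |
|   // Stream the identifying params of traces matching a query. |
|   QueryTracesIDOnly(ctx context.Context, tileNumber types.TileNumber, |
|     q *query.Query) (<-chan paramtools.Params, error) |
| |
|   // Tile management and per-tile metadata. |
|   GetLatestTile(ctx context.Context) (types.TileNumber, error) |
|   TileNumber(commitNumber types.CommitNumber) types.TileNumber |
|   TileSize() int32 |
|   GetParamSet(ctx context.Context, |
|     tileNumber types.TileNumber) (paramtools.ReadOnlyParamSet, error) |
| } |
| ``` |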
| |
| ### `metadatastore.go` |
| |
| This file defines the `MetadataStore` interface. Its responsibility is to manage |
| metadata associated with source files. |
| |
| - **`InsertMetadata`:** Stores links or other metadata for a given source file |
| name. |
| - **`GetMetadata`:** Retrieves the stored metadata for a source file. This can |
| be used, for example, to link from a data point back to the original log |
| file or a specific dashboard view related to the data ingestion. |
| |
| ### `traceparamstore.go` |
| |
| This file defines the `TraceParamStore` interface. This store is dedicated to |
| managing the relationship between a trace's unique identifier (typically an MD5 |
| hash of its full parameter string) and the actual `paramtools.Params` object. |
| |
| - **`WriteTraceParams`:** Stores the mapping from trace IDs to their parameter |
| sets. This is done to avoid repeatedly parsing or storing the full parameter |
| string for every data point of a trace. |
| - **`ReadParams`:** Retrieves the `paramtools.Params` for a given set of trace |
| IDs. |
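| |
| A sketch of the two smaller contracts, with method shapes inferred from |
| the descriptions above (the trace-ID type is assumed to be the hex MD5 |
| digest as a string): |
| |
| ``` |
| type MetadataStore interface { |
|   InsertMetadata(ctx context.Context, sourceFileName string, |
|     links map[string]string) error |
|   GetMetadata(ctx context.Context, |
|     sourceFileName string) (map[string]string, error) |
| } |
| |
| type TraceParamStore interface { |
|   WriteTraceParams(ctx context.Context, |
|     traceParams map[string]paramtools.Params) error |
|   ReadParams(ctx context.Context, |
|     traceIDs []string) (map[string]paramtools.Params, error) |
| } |
| ``` |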
| |
| ### Submodule: `sqltracestore` |
| |
| This submodule provides the SQL-based implementation of the `TraceStore`, |
| `MetadataStore`, and `TraceParamStore` interfaces. |
| |
| - **`sqltracestore.go`:** Implements the `TraceStore` interface. |
| |
| - **Schema:** It relies on a specific SQL schema (defined conceptually in |
| the package documentation and concretely in |
| `sqltracestore/schema/schema.go`) involving tables like `TraceValues` |
| (for actual metric values), `Postings` (the inverted index), `ParamSets` |
| (per-tile parameter information), and `SourceFiles`. |
| - **Writing Data:** When `WriteTraces` is called, it performs several |
| actions: |
| |
| * Updates the `SourceFiles` table with the new source filename if it's not |
| already present. |
| * Updates the `ParamSets` table for the current tile with any new |
| key-value pairs from the incoming traces. This uses a cache to avoid |
| redundant writes. |
| * For each incoming trace: |
| - Calculates its MD5 hash (trace ID). |
| - Inserts the value into the `TraceValues` table (or `TraceValues2` |
| for `WriteTraces2`). |
| - If the trace ID and its key-value pairs are not already in the |
| `Postings` table for the current tile (checked via cache), inserts |
| them. |
| - Stores the mapping of the trace ID to its `paramtools.Params` in the |
| `TraceParams` table via the `TraceParamStore`. |
| |
| All these writes are typically batched and parallelized for efficiency. |
| |
| - **Querying Data (`QueryTracesIDOnly`):** |
| |
| * Retrieves the `ParamSet` for the target tile. |
| * Generates a query plan based on the input `query.Query` and the tile's |
| `ParamSet`. |
| * **Optimization (`restrictByCounting`):** It attempts to optimize the |
| query by first running `COUNT(*)` queries for each part of the query |
| plan. The part of the plan that matches the fewest traces (below a |
| threshold) is then used to fetch its corresponding trace IDs. These IDs |
| are then used to construct a `restrictClause` (e.g., `AND trace_id IN |
| (...)`) that is appended to the queries for the other parts of the plan. |
| This significantly speeds up queries where one filter is much more |
| selective than others. |
| * For each part of the query plan (each key and its OR'd values), it |
| executes an SQL query against the `Postings` table (using the |
| `restrictClause` if applicable) to get a stream of matching |
| `traceIDForSQL`. |
| * The streams of `traceIDForSQL` from each part of the plan are then |
| intersected (using `newIntersect`) to find the trace IDs that satisfy |
| all AND conditions of the query. |
| * These resulting trace IDs are then passed to the `TraceParamStore` to |
| fetch their full `paramtools.Params`. |
| |
| - **Reading Data (`QueryTraces`, `ReadTraces`):** Once the trace IDs (and |
| thus their full names) are known (either from `QueryTracesIDOnly` or |
| directly provided), it queries the `TraceValues` table to fetch the |
| actual floating-point values for those traces within the specified |
| commit range or tile. It also fetches commit information from the |
| `Commits` table. |
| - **Follower Reads:** Supports `enableFollowerReads` configuration, which |
| adds `AS OF SYSTEM TIME '-5s'` to certain read queries, allowing them to |
| potentially hit read replicas and reduce load on the primary, at the |
| cost of slightly stale data. |
| - **Dialect Specificity:** It has distinct SQL templates and statement |
| strings for CockroachDB (default) and Spanner (`spanner.go`) to account |
| for syntax differences or performance characteristics (e.g., `UPSERT` |
| vs. `ON CONFLICT`). |
| |
| - **`sqlmetadatastore.go`:** Implements the `MetadataStore` interface. It uses |
| a `Metadata` SQL table that links a `source_file_id` (from `SourceFiles`) |
| to a JSONB column storing the metadata map. |
| |
| - **`sqltraceparamstore.go`:** Implements the `TraceParamStore` interface. It |
| uses a `TraceParams` SQL table that stores `trace_id` (bytes) and their |
| corresponding `params` (JSONB). Writes are chunked and can be parallelized. |
| |
| - **`intersect.go`:** Provides helper functions (`newIntersect`, |
| `newIntersect2`) to compute the intersection of multiple sorted channels of |
| `traceIDForSQL`. This is crucial for implementing the AND logic in |
| `QueryTracesIDOnly`. It builds a binary tree of `newIntersect2` operations |
| for efficiency, avoiding slower reflection-based approaches. |
| |
| - **`schema/schema.go`:** Defines Go structs that mirror the SQL table |
| schemas. This is used for documentation and potentially could be used with |
| ORM-like tools if needed, though the current implementation uses direct SQL |
| templating. |
| |
| - `TraceValuesSchema`: Stores individual data points (value, commit, |
| source file) keyed by trace ID. |
| - `TraceValues2Schema`: An alternative/extended schema for trace values, |
| potentially denormalizing common parameters like `benchmark`, `bot`, |
| `test`, etc., for direct querying. |
| - `SourceFilesSchema`: Maps source file names to integer IDs. |
| - `ParamSetsSchema`: Stores the unique key-value pairs present in each |
| tile. |
| - `PostingsSchema`: The inverted index, mapping (tile, key-value) to trace |
| IDs. |
| - `MetadataSchema`: Stores JSON metadata for source files. |
| - `TraceParamsSchema`: Maps trace IDs (MD5 hashes) to their full |
| `paramtools.Params` (stored as JSON). |
| |
| - **`spanner.go`:** Contains SQL templates and specific configurations (like |
| parallel pool sizes for writes) tailored for Google Cloud Spanner. |
| |
| ### Submodule: `mocks` |
| |
| - **`TraceStore.go`:** Provides a mock implementation of the `TraceStore` |
| interface, generated by the `mockery` tool. This is essential for unit |
| testing components that depend on `TraceStore` without needing a full |
| database setup. |
| |
| ## Key Workflows |
| |
| ### Writing Traces |
| |
| ``` |
| Caller (e.g., ingester) -> TraceStore.WriteTraces(ctx, commitNumber, params[], values[], paramset, sourceFile, timestamp) |
| | |
| `-> SQLTraceStore.WriteTraces |
| | |
| | 1. Tile Calculation: tileNumber = TileNumber(commitNumber) |
| | |
| | 2. Source File ID: |
| | `-> updateSourceFile(ctx, sourceFile) -> sourceFileID |
| | (Queries SourceFiles table, inserts if not exists) |
| | |
| | 3. ParamSet Update (for the tile): |
| | For each key, value in paramset: |
| | If not in cache(tileNumber, key, value): |
| | Add to batch for ParamSets table insertion |
| | Execute batch insert into ParamSets, update cache |
| | |
| | 4. For each trace (params[i], values[i]): |
| | | a. Trace ID Calculation: traceID_md5_hex = md5(query.MakeKey(params[i])) |
| | | |
| | | b. Store Trace Params: |
| | | `-> TraceParamStore.WriteTraceParams(ctx, {traceID_md5_hex: params[i]}) |
| | | (Inserts into TraceParams table if not exists) |
| | | |
| | | c. Add to TraceValues Batch: (traceID_md5_hex, commitNumber, values[i], sourceFileID) |
| | | |
| | | d. Postings Update (for the tile): |
| | | If not in cache(tileNumber, traceID_md5_hex): // Marks this whole trace as processed for postings |
| | | For each key, value in params[i]: |
| | | Add to batch for Postings table: (tileNumber, "key=value", traceID_md5_hex) |
| | |
| | 5. Execute batch insert into TraceValues (or TraceValues2) |
| | |
| | 6. Execute batch insert into Postings, update postings cache |
| ``` |
| |
| ### Querying for Trace IDs (`QueryTracesIDOnly`) |
| |
| ``` |
| Caller -> TraceStore.QueryTracesIDOnly(ctx, tileNumber, query) |
| | |
| `-> SQLTraceStore.QueryTracesIDOnly |
| | |
| | 1. Get ParamSet for tile: |
| | `-> GetParamSet(ctx, tileNumber) -> tileParamSet |
| | (Checks OPS cache, falls back to querying ParamSets table) |
| | |
| | 2. Generate Query Plan: plan = query.QueryPlan(tileParamSet) |
| | (If plan is empty or invalid for tile, return empty channel) |
| | |
| | 3. Optimization (restrictByCounting): |
| | | For each part of 'plan' (key, or_values[]): |
| | | `-> DB: COUNT(*) FROM Postings WHERE tile_number=... AND key_value IN (...) LIMIT threshold |
| | | Find the plan part (minKey, minValues) with the smallest count (if count < threshold). |
| | | If any count is 0, plan is skippable. |
| | | If minKey found: |
| | | `-> DB: SELECT trace_id FROM Postings WHERE tile_number=... AND key_value IN (minValues) |
| | | `-> restrictClause = "AND trace_id IN (result_ids...)" |
| | |
| | 4. Execute Query for each plan part (concurrently): |
| | For each key, values[] in 'plan' (excluding minKey if restrictClause is used): |
| | `-> DB: SELECT trace_id FROM Postings |
| | WHERE tile_number=tileNumber AND key_value IN ("key=value1", "key=value2"...) |
| | [restrictClause] |
| | ORDER BY trace_id |
| | -> channel_for_key_N (stream of traceIDForSQL) |
| | |
| | 5. Intersect Results: |
| | `-> newIntersect(ctx, [channel_for_key_1, channel_for_key_2,...]) -> finalTraceIDsChannel (stream of unique traceIDForSQL) |
| | |
| | 6. Fetch Full Params (concurrently, in chunks): |
| | For each batch of unique traceIDForSQL from finalTraceIDsChannel: |
| | `-> TraceParamStore.ReadParams(ctx, batch_of_ids) -> map[traceID]Params |
| | For each Params in map: |
| | Send Params to output channel |
| | |
| `-> Returns output channel of paramtools.Params |
| ``` |
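| |
| From the caller's perspective the results arrive as a stream; a |
| hypothetical consumption sketch: |
| |
| ``` |
| ch, err := traceStore.QueryTracesIDOnly(ctx, tileNumber, q) |
| if err != nil { |
|   return err |
| } |
| for params := range ch { |
|   // Each value identifies one matching trace; collect them for a |
|   // follow-up QueryTraces or ReadTraces call. |
|   matches = append(matches, params) |
| } |
| ``` |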
| |
| This structured approach, combining interfaces with a robust SQL implementation, |
| allows `tracestore` to serve as a reliable and performant foundation for Perf's |
| data storage needs. |
| |
| # Module: /go/tracing |
| |
| ## Tracing Module Documentation |
| |
| **High-Level Overview** |
| |
| The `/go/tracing` module is responsible for initializing and configuring tracing |
| capabilities within the Perf application. It leverages the OpenCensus library to |
| provide distributed tracing, allowing developers to understand the flow of |
| requests across different services and components. This is crucial for debugging |
| performance issues, identifying bottlenecks, and gaining insights into the |
| application's behavior in a distributed environment. |
| |
| **Design Decisions and Implementation Choices** |
| |
| The core design principle behind this module is to centralize tracing |
| initialization. This ensures consistency in how tracing is set up across |
| different parts of the application. |
| |
| - **Conditional Initialization:** The `Init` function provides different |
| initialization paths based on whether the application is running in a |
| `local` development environment or a deployed environment. |
| |
| - **Local Environment:** In a local setup, `loggingtracer.Initialize()` is |
| called. This likely configures a simpler, console-based tracer. The |
| rationale is that in local development, detailed, distributed tracing |
| might be overkill, and logging traces to the console is often sufficient |
| for debugging. |
| - **Deployed Environment:** For deployed instances, the |
| `tracing.Initialize` function from the shared |
| `go.skia.org/infra/go/tracing` library is used. This enables more |
| sophisticated tracing, likely integrating with a backend tracing system |
| like Jaeger or Stackdriver Trace. |
| |
| - **Configuration-Driven Sampling:** The `TraceSampleProportion` field of |
| the `config.InstanceConfig` determines the sampling rate for traces. This |
| allows administrators to control the volume of trace data generated, |
| balancing the need for detailed information with the cost and overhead of |
| storing and processing traces. A value of `0.0` would likely disable |
| tracing, while `1.0` would trace every request. |
| |
| - **Automatic Project ID Detection:** The `autoDetectProjectID` constant being |
| an empty string suggests that the underlying `tracing.Initialize` function |
| is capable of automatically determining the Google Cloud Project ID when |
| running in a GCP environment. This simplifies configuration as the project |
| ID doesn't need to be explicitly passed. |
| |
| - **Metadata Enrichment:** The `map[string]interface{}` passed to |
| `tracing.Initialize` includes: |
| |
| - `podName`: This value is retrieved from the `MY_POD_NAME` environment |
| variable. This is a common practice in Kubernetes environments to |
| identify the specific pod generating the trace, which is invaluable for |
| pinpointing issues. |
| - `instance`: This is derived from `cfg.InstanceName`. This helps |
| differentiate traces originating from different Perf instances (e.g., |
| "perf-prod", "perf-staging"). |
| |
| **Responsibilities and Key Components/Files** |
| |
| - **`tracing.go`:** This is the sole file in this module and contains the |
| `Init` function. |
| |
| - **`Init(local bool, cfg *config.InstanceConfig) error` function:** |
| - **Responsibility:** To initialize the tracing system for the |
| application. It acts as the single entry point for tracing setup. |
| - **How it works:** |
| 1. It takes a `local` boolean flag and an `InstanceConfig` pointer as |
| input. |
| 2. If `local` is `true`, it calls `loggingtracer.Initialize()`. This |
| indicates a preference for a simpler, possibly console-based, |
| tracing mechanism for local development. |
| 3. If `local` is `false`, it initializes tracing for a deployed |
| environment: |
| - It reads the `TraceSampleProportion` and `InstanceName` from |
| `cfg`, and the `MY_POD_NAME` environment variable. |
| - It calls `tracing.Initialize` from the shared |
| `go.skia.org/infra/go/tracing` library, passing the sampling |
| proportion, `autoDetectProjectID` (an empty string, relying on |
| automatic detection), and a map of attributes (`podName` from the |
| environment and `instance` from the config). |
| - **Why this approach:** |
| - Centralizes tracing setup, making it easier to manage and modify. |
| - Provides a clear distinction between local and deployed tracing |
| configurations, catering to different needs. |
| - Leverages shared tracing libraries (`go.skia.org/infra/go/tracing`) |
| for common functionality, promoting code reuse. |
| |
| - **Dependencies:** |
| |
| - `//go/tracing` (likely `go.skia.org/infra/go/tracing`): This is the core |
| shared tracing library providing the `Initialize` function for robust, |
| distributed tracing. It handles the actual setup of exporters (e.g., to |
| Stackdriver, Jaeger) and samplers. |
| - `//go/tracing/loggingtracer`: This dependency provides a simpler tracer |
| implementation, probably for logging traces to standard output, suitable |
| for local development environments where a full-fledged tracing backend |
| might not be available or necessary. |
| - `//perf/go/config`: This module provides the `InstanceConfig` struct, |
| which contains application-specific configuration, including the |
| `TraceSampleProportion` and `InstanceName` used by the tracing |
| initialization. This decouples tracing configuration from the tracing |
| logic itself. |
| |
| **Key Workflows/Processes** |
| |
| **Tracing Initialization Workflow:** |
| |
| ``` |
| Application Startup |
| | |
| V |
| Call perf/go/tracing.Init(isLocal, instanceConfig) |
| | |
| +---- isLocal is true? ----> Call loggingtracer.Initialize() --> Tracing active (console/simple) |
| | | |
| | V |
| | Application proceeds |
| | |
| +---- isLocal is false? ---> Read TraceSampleProportion from instanceConfig |
| Read InstanceName from instanceConfig |
| Read MY_POD_NAME environment variable |
| | |
| V |
| Call shared go.skia.org/infra/go/tracing.Initialize(...) |
| with sampling rate and attributes (podName, instance) |
| | |
| V |
| Tracing active (distributed, e.g., Stackdriver) |
| | |
| V |
| Application proceeds |
| ``` |
| |
| This workflow illustrates how the `Init` function adapts the tracing setup based |
| on the execution context (local vs. deployed) and external configuration. The |
| goal is to provide appropriate tracing capabilities with minimal boilerplate in |
| the rest of the application. |
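| |
| Put together, `Init` is a small dispatch. A sketch, assuming the shared |
| library's `Initialize(sampleProportion, projectID, attributes)` shape |
| described above: |
| |
| ``` |
| func Init(local bool, cfg *config.InstanceConfig) error { |
|   if local { |
|     // Simple console/logging tracer for local development. |
|     loggingtracer.Initialize() |
|     return nil |
|   } |
|   return tracing.Initialize( |
|     cfg.TraceSampleProportion, // 0.0 disables, 1.0 traces everything. |
|     "",                        // Empty: auto-detect the GCP project ID. |
|     map[string]interface{}{ |
|       "podName":  os.Getenv("MY_POD_NAME"), |
|       "instance": cfg.InstanceName, |
|     }, |
|   ) |
| } |
| ``` |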
| |
| # Module: /go/trybot |
| |
| The `/go/trybot` module is responsible for managing performance data generated |
| by trybots. Trybots are automated systems that run tests on code changes |
| (patches or changelists) before they are merged into the main codebase. This |
| module handles the ingestion, storage, and retrieval of these trybot results, |
| allowing developers and performance engineers to analyze the performance impact |
| of proposed code changes. |
| |
| The core idea is to provide a way to compare the performance characteristics of |
| a pending change against the baseline performance of the current codebase. This |
| helps in identifying potential performance regressions or improvements early in |
| the development cycle. |
| |
| ## Key Components and Responsibilities |
| |
| ### `/go/trybot/trybot.go` |
| |
| This file defines the central data structure `TryFile`. |
| |
| - **`TryFile`**: This struct represents a single file containing trybot |
| results. |
| - `CL`: The identifier of the changelist (e.g., a Gerrit change ID). This |
| is crucial for associating results with a specific code change. |
| - `PatchNumber`: The specific patchset within the changelist. Code review |
| systems often allow multiple iterations (patchsets) for a single |
| changelist. |
| - `Filename`: The name of the file where the trybot results are stored, |
| often including a scheme like `gs://` indicating its location (e.g., in |
| Google Cloud Storage). |
| - `Timestamp`: When the result file was created. This is important for |
| tracking and ordering results. |
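| |
| A sketch of the struct, with field types assumed from the descriptions |
| above: |
| |
| ``` |
| type TryFile struct { |
|   CL          types.CL  // Changelist identifier, e.g. a Gerrit change ID. |
|   PatchNumber int       // Patchset iteration within the changelist. |
|   Filename    string    // Result file location, e.g. a gs:// URL. |
|   Timestamp   time.Time // When the result file was created. |
| } |
| ``` |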
| |
| ### `/go/trybot/ingester` |
| |
| This submodule is responsible for taking raw result files and transforming them |
| into the `TryFile` format that the rest of the system understands. |
| |
| - **`/go/trybot/ingester/ingester.go`**: Defines the `Ingester` interface. |
| |
| - **`Ingester` interface**: Specifies a contract for components that can |
| process incoming files (represented by `file.File`) and produce a stream |
| of `trybot.TryFile` objects. The `Start` method initiates this |
| processing, typically in a background goroutine. This design allows for |
| different sources or formats of trybot results to be plugged into the |
| system. |
| |
| - **`/go/trybot/ingester/gerrit/gerrit.go`**: Provides a concrete |
| implementation of the `Ingester` interface, specifically for handling trybot |
| results originating from Gerrit code reviews. |
| |
| - **`Gerrit` struct**: Implements `ingester.Ingester`. It uses a |
| `parser.Parser` (from `/perf/go/ingest/parser`) to understand the |
| content of the result files. |
| - **`New` function**: Constructor for the `Gerrit` ingester. |
| - **`Start` method**: |
| - It receives a channel of `file.File` objects. |
| - For each file, it attempts to parse it using `parser.ParseTryBot`. This |
| method extracts the changelist ID (`issue`) and patchset number. |
| - If parsing is successful, it converts the patchset string to an integer. |
| - A `trybot.TryFile` is created with the extracted CL, patch number, |
| filename, and creation timestamp. |
| - This `TryFile` is then sent to an output channel. |
| - It includes metrics (`parseCounter`, `parseFailCounter`) to track the |
| success and failure rates of parsing. |
| - The use of channels for input (`files`) and output (`ret`) facilitates |
| asynchronous processing, meaning the ingester can process files as they |
| become available without blocking other operations. |
| |
| ### `/go/trybot/store` |
| |
| This submodule is responsible for persisting and retrieving `TryFile` |
| information and the associated performance measurements. |
| |
| - **`/go/trybot/store/store.go`**: Defines the `TryBotStore` interface. |
| - **`TryBotStore` interface**: This interface outlines the contract for |
| storing and retrieving trybot data. This abstraction allows different |
| database backends (e.g., CockroachDB, in-memory stores for testing) to |
| be used. |
| - `Write(ctx context.Context, tryFile trybot.TryFile) error`: Persists a |
| `TryFile` and its associated data. |
| - `List(ctx context.Context, since time.Time) ([]ListResult, error)`: |
| Retrieves a list of unique changelist/patchset combinations that have |
| been processed since a given time. `ListResult` contains the `CL` (as a |
| string) and `Patch` number. |
| - `Get(ctx context.Context, cl types.CL, patch int) ([]GetResult, error)`: |
| Fetches all performance results for a specific changelist and patch |
| number. `GetResult` contains the `TraceName` (a unique identifier for a |
| specific metric and parameter combination) and its measured `Value`. |
| - **`/go/trybot/store/mocks/TryBotStore.go`**: Provides a mock implementation |
| of `TryBotStore`, generated by the `mockery` tool. This is essential for |
| unit testing components that depend on `TryBotStore` without needing a real |
| database. |
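| |
| Collected from the descriptions above, the contract is small (the result |
| struct field types are assumptions): |
| |
| ``` |
| type TryBotStore interface { |
|   // Write persists a TryFile and its associated data. |
|   Write(ctx context.Context, tryFile trybot.TryFile) error |
|   // List returns the unique CL/patchset pairs seen since a given time. |
|   List(ctx context.Context, since time.Time) ([]ListResult, error) |
|   // Get returns all measurements for one CL/patchset. |
|   Get(ctx context.Context, cl types.CL, patch int) ([]GetResult, error) |
| } |
| |
| type ListResult struct { |
|   CL    string |
|   Patch int |
| } |
| |
| type GetResult struct { |
|   TraceName string |
|   Value     float32 |
| } |
| ``` |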
| |
| ### `/go/trybot/results` |
| |
| This submodule focuses on loading and preparing trybot results for analysis and |
| presentation, often by comparing them to baseline data. |
| |
| - **`/go/trybot/results/results.go`**: Defines the structures for requesting |
| and representing analyzed trybot results. |
| |
| - **`Kind` type (`TryBot`, `Commit`)**: Distinguishes whether the analysis |
| request is for trybot data (pre-submit) or for data from an already |
| landed commit (post-submit). This allows the system to handle both |
| scenarios. |
| - **`TryBotRequest` struct**: Represents a request from a client (e.g., a |
| UI) to get analyzed performance data. It includes the `Kind`, `CL` and |
| `PatchNumber` (for `TryBot` kind), `CommitNumber` and `Query` (for |
| `Commit` kind). The `Query` is used to filter the traces to be analyzed |
| when looking at landed commits. |
| - **`TryBotResult` struct**: Contains the analysis results for a single |
| trace. |
| - `Params`: The key-value parameters that uniquely identify the trace. |
| - `Median`, `Lower`, `Upper`, `StdDevRatio`: Statistical measures derived |
| from the trace data. `StdDevRatio` is a key metric indicating how much a |
| new value deviates from the historical distribution, helping to flag |
| regressions or improvements. |
| - `Values`: A slice of recent historical values for the trace, with the |
| last value being either the trybot result or the value at the specified |
| commit. |
| - **`TryBotResponse` struct**: The overall response to a `TryBotRequest`. |
| - `Header`: Column headers for the data, typically representing commit |
| information. |
| - `Results`: A slice of `TryBotResult` for each analyzed trace. |
| - `ParamSet`: A collection of all unique parameter key-value pairs present |
| in the results, useful for filtering in a UI. |
| - **`Loader` interface**: Defines a contract for components that can take |
| a `TryBotRequest` and produce a `TryBotResponse`. This involves fetching |
| relevant data, performing statistical analysis, and formatting it. |
| |
| - **`/go/trybot/results/dfloader/dfloader.go`**: Implements the |
| `results.Loader` interface using a `dataframe.DataFrameBuilder`. DataFrames |
| are a common way to represent tabular data for analysis. |
| |
| - **`Loader` struct**: Holds references to a `dataframe.DataFrameBuilder` |
| (for constructing DataFrames from trace data), a `store.TryBotStore` |
| (for fetching trybot-specific measurements), and `perfgit.Git` (for |
| resolving commit information). |
| - **`TraceHistorySize` constant**: Defines how many historical data points |
| to load for each trace for comparison. |
| - **`New` function**: Constructor for the `Loader`. |
| - **`Load` method**: This is the core logic for generating the |
| `TryBotResponse`. |
| - **Workflow**: |
| 1. `Determine Timestamp`: If the request is for a `Commit`, it fetches |
| the commit details (including its timestamp) using `perfgit.Git`. |
| Otherwise, it uses the current time. |
| 2. `Parse Query`: If the request kind is `Commit`, the provided `Query` |
| string is parsed. An empty query for a `Commit` request is an error. |
| 3. `Fetch Baseline Data (DataFrame)`: |
| - If `Kind` is `Commit`: It uses `dfb.NewNFromQuery` to load a |
| DataFrame containing the last `TraceHistorySize+1` data points |
| for traces matching the query, up to the commit's timestamp. The |
| "+1" is to hold the value at the commit itself or to be a |
| placeholder. |
| - If `Kind` is `TryBot`: |
| a. It first calls `store.Get` to retrieve the specific trybot |
| measurements for the given `CL` and `PatchNumber`. |
| b. It then extracts the trace names from these trybot results. |
| c. It calls `dfb.NewNFromKeys` to load a DataFrame with |
| `TraceHistorySize+1` historical data points for these specific |
| trace names. |
| d. Crucially, it then _replaces_ the last value in each trace within |
| the DataFrame with the corresponding value obtained from the |
| `store.Get` call. This effectively injects the trybot's |
| measurement into the historical context for comparison. |
| e. If a trybot result exists for a trace that has no historical data |
| in the DataFrame, that trace is removed from the analysis, and |
| `rebuildParamSet` is flagged. |
| 4. `Prepare Response Header`: The DataFrame's header (commit |
| information) is used for the response. If it's a `TryBot` request, |
| the last header entry (representing the trybot data point) has its |
| `Offset` set to `types.BadCommitNumber` to indicate it's not a |
| landed commit. |
| 5. `Calculate Statistics`: For each trace in the DataFrame: |
| - The trace name (key) is parsed into `paramtools.Params`. |
| - `vec32.StdDevRatio` is called with the trace values (which now |
| includes the trybot value at the end if applicable). This |
| function calculates the median, lower/upper bounds, and the |
| standard deviation ratio. |
| - A `results.TryBotResult` is created. |
| - If `StdDevRatio` calculation fails (e.g., insufficient data), |
| the trace is skipped, and `rebuildParamSet` is flagged. |
| 6. `Sort Results`: The `TryBotResult` slice is sorted by `StdDevRatio` |
| in descending order. This prioritizes potential regressions (high |
| positive ratio) and significant improvements (high negative ratio). |
| 7. `Normalize ParamSet`: If `rebuildParamSet` is true (due to missing |
| traces or parsing errors), the `ParamSet` for the response is |
| regenerated from the final set of `TryBotResult`s. |
| 8. The `results.TryBotResponse` is assembled and returned. |
| - This process allows a direct comparison of a tryjob's performance |
| numbers against the recent history of the same metrics on the main |
| branch. |
| |
| ### `/go/trybot/samplesloader` |
| |
| This submodule deals with loading raw sample data from trybot result files. |
| Sometimes, instead of just a single aggregated value, trybots might output |
| multiple raw measurements (samples) for a metric. |
| |
| - **`/go/trybot/samplesloader/samplesloader.go`**: Defines the `SamplesLoader` |
| interface. |
| |
| - **`SamplesLoader` interface**: Specifies a method `Load(ctx |
| context.Context, filename string) (parser.SamplesSet, error)` that takes |
| a filename (URL to the result file) and returns a `parser.SamplesSet`. A |
| `SamplesSet` is a map where keys are trace identifiers and values are |
| `parser.Samples` (which include parameters and a slice of raw float64 |
| sample values). |
| |
| - **`/go/trybot/samplesloader/gcssamplesloader/gcssamplesloader.go`**: |
| Implements `SamplesLoader` for files stored in Google Cloud Storage (GCS). |
| |
| - **`loader` struct**: Holds a `gcs.GCSClient` for interacting with GCS |
| and a `parser.Parser`. |
| - **`New` function**: Constructor for the GCS samples loader. |
| - **`Load` method**: |
| - Parses the input `filename` (which is a GCS URL like |
| `gs://bucket/path/file.json`) to extract the bucket and path. |
| - Uses the `storageClient` to read the content of the file from GCS. |
| - Parses the file content using `format.ParseLegacyFormat` (assuming a |
| specific JSON structure for these sample files). |
| - Converts the parsed data into a `parser.SamplesSet` using |
| `parser.GetSamplesFromLegacyFormat`. |
| - This component is essential when detailed analysis of raw samples is |
| needed, rather than just aggregated metrics. |
| |
| ## Overall Workflow (Ingestion and Analysis) |
| |
| A simplified workflow could look like this: |
| |
| 1. **File Arrival**: A new trybot result file appears (e.g., uploaded to GCS). |
| |
| ``` |
| New File (e.g., in GCS) |
| ``` |
| |
| 2. **Ingestion**: An `ingester.Ingester` (like `ingester.gerrit.Gerrit`) |
| detects and processes this file. |
| |
| ``` |
| File --> [Gerrit Ingester] --parses--> trybot.TryFile{CL, PatchNum, Filename, Timestamp} |
| ``` |
| |
| 3. **Storage**: The `TryFile` metadata and potentially the parsed values are |
| written to the `store.TryBotStore`. |
| |
| ``` |
| trybot.TryFile --> [TryBotStore.Write] --> Database |
| ``` |
| |
| (The actual performance values might be stored alongside the `TryFile` |
| metadata or linked via the `Filename` if they are in a separate detailed |
| file). |
| |
| 4. **Analysis Request**: A user or an automated system requests analysis for a |
| particular CL/Patch via a UI or API, sending a `results.TryBotRequest`. |
| |
| ``` |
| UI/API --sends--> results.TryBotRequest{Kind=TryBot, CL="123", PatchNumber=1} |
| ``` |
| |
| 5. **Data Loading and Comparison**: The `results.dfloader.Loader` handles |
| this request. |
| |
| ``` |
| results.TryBotRequest |
| | |
| v |
| [dfloader.Loader.Load] |
| | |
| +--(A)--> [TryBotStore.Get(CL, PatchNum)] |
| |         --> Trybot specific values (Value_T) for traces T1, T2... |
| | |
| +--(B)--> [DataFrameBuilder.NewNFromKeys(traceNames=[T1,T2...])] |
| |         --> Historical data for T1, T2... |
| |             (e.g., [V1_hist1, V1_hist2, ..., V1_histN, _placeholder_]) |
| | |
| +--(C)--> Combine: Replace _placeholder_ with Value_T |
| |         (e.g., for T1: [V1_hist1, V1_hist2, ..., V1_histN, V1_T]) |
| | |
| +--(D)--> Calculate StdDevRatio, Median, etc. for each trace |
| | |
| +--(E)--> Sort results |
| | |
| v |
| results.TryBotResponse (sent back to UI/API) |
| ``` |
| |
| This module is crucial for proactive performance monitoring, enabling teams to |
| catch performance regressions before they land in the main codebase, by |
| systematically ingesting, storing, and analyzing the performance data generated |
| during the pre-submit testing phase. The use of interfaces for storage |
| (`TryBotStore`), ingestion (`Ingester`), and results loading (`results.Loader`) |
| makes the system flexible and extensible. |
| |
| # Module: /go/ts |
| |
| The `go/ts` module serves as a utility to generate TypeScript definition files |
| from Go structs. This is crucial for maintaining type safety and consistency |
| between the Go backend and the TypeScript frontend, particularly when dealing |
| with JSON data structures that are exchanged between them. The core problem this |
| module solves is bridging the gap between Go's static typing and TypeScript's |
| type system for data interchange, ensuring that changes in Go struct definitions |
| are automatically reflected in the frontend's TypeScript types. |
| |
| The primary component is the `main.go` file. Its responsibility is to: |
| |
| 1. **Parse command-line arguments**: It accepts an output path (`-o`) where the |
| generated TypeScript file will be written. |
| 2. **Instantiate a `go2ts.Generator`**: This is the core engine from the |
| `go/go2ts` library responsible for the Go-to-TypeScript conversion. |
| 3. **Configure the generator**: |
| - `GenerateNominalTypes = true`: This setting likely ensures that the |
| generated TypeScript types are nominal (i.e., types are distinct based |
| on their name, not just their structure), which can provide stronger |
| type checking. |
| - `AddIgnoreNil`: This is used for specific Go types like |
| `paramtools.Params`, `paramtools.ParamSet`, |
| `paramtools.ReadOnlyParamSet`, and `types.TraceSet`. This suggests that |
| `nil` values for these types in Go should likely be treated as optional |
| or nullable fields in TypeScript, or perhaps excluded from the generated |
| types if they are always expected to be non-nil when serialized. |
| 4. **Register Go structs and unions for conversion**: |
| - The code extensively uses `generator.AddMultiple` to register a wide |
| array of Go structs from various `perf` submodules (e.g., `alerts`, |
| `chromeperf`, `clustering2`, `frontend/api`, `regression`). These are |
| the structs that are serialized to JSON and consumed by the frontend. By |
| registering them, the generator knows which Go types to convert into |
| corresponding TypeScript interfaces or types. |
| - The `addMultipleUnions` helper function and |
| `generator.AddUnionToNamespace` are used to register Go union types |
| (often represented as a collection of constants or an interface |
| implemented by several types). This ensures that TypeScript enums or |
| union types are generated, reflecting the possible values or types a Go |
| field can hold. The `typeName` argument in `unionAndName` and the |
| namespace argument in `AddUnionToNamespace` control how these unions are |
| named and organized in the generated TypeScript. |
| - `generator.AddToNamespace` is used to group related types under a |
| specific namespace in the generated TypeScript, improving organization |
| (e.g., `pivot.Request{}` is added to the `pivot` namespace). |
| 5. **Render the TypeScript output**: Finally, `generator.Render(w)` writes the |
| generated TypeScript definitions to the specified output file. |
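| |
| Stripped down, `main.go` orchestrates the generator roughly as follows (a |
| sketch using only the calls named above; argument shapes and the |
| registered types are illustrative, not verbatim): |
| |
| ``` |
| generator := go2ts.New() |
| generator.GenerateNominalTypes = true |
| |
| // nil-able Go types that need special handling in TypeScript. |
| generator.AddIgnoreNil(reflect.TypeOf(paramtools.Params{})) |
| |
| // Register the structs exchanged as JSON with the frontend |
| // (illustrative subset; main.go registers many more). |
| generator.AddMultiple(alerts.Alert{}) |
| generator.AddToNamespace(pivot.Request{}, "pivot") |
| |
| // Write the definitions to the file given by the -o flag. |
| err := generator.Render(w) |
| ``` |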
| |
| The design decision to use a dedicated program for this generation task, rather |
| than manual synchronization or other methods, highlights the importance of |
| automation and reducing the likelihood of human error in keeping backend and |
| frontend types aligned. The reliance on the `go/go2ts` library centralizes the |
| core conversion logic, making this module a consumer and orchestrator of that |
| library for the specific needs of the Skia Perf application. |
| |
| A key workflow is triggered by the `//go:generate` directive at the top of |
| `main.go`: |
| |
| ``` |
| //go:generate bazelisk run --config=mayberemote //:go -- run . -o ../../modules/json/index.ts |
| ``` |
| |
| This command, when `go generate` is run (typically as part of a build process), |
| executes the compiled `go/ts` program. |
| |
| Workflow: |
| |
| 1. Developer modifies a Go struct in a `perf` submodule that is serialized to |
| JSON for the UI. |
| 2. Developer (or an automated build step) runs `go generate` within the `go/ts` |
| module's directory (or a higher-level directory that includes it). |
| 3. The `go:generate` directive executes the `main` function in `go/ts/main.go`. |
| 4. `main.go` -> Uses `go2ts.Generator` -> Registers relevant Go structs and |
| unions. |
| 5. `go2ts.Generator` -> Analyzes registered Go types -> Generates corresponding |
| TypeScript definitions. |
| 6. `main.go` -> Writes the TypeScript definitions to |
| `../../modules/json/index.ts`. |
| 7. The frontend can now import and use these up-to-date TypeScript types, |
| ensuring type safety when interacting with JSON data from the backend. |
| |
| The choice of specific structs and unions registered in `main.go` reflects the |
| data contracts between the Perf backend and its frontend UI. Any Go struct that |
| is part of an API response or request payload handled by the frontend needs to |
| be included here. |
| |
| # Module: /go/types |
| |
| ## Go Types Module |
| |
| This module defines core data types used throughout the Perf application. These |
| types provide a standardized way to represent fundamental concepts related to |
| commits, performance data (traces), and alert configurations. The design |
| prioritizes clarity, type safety, and consistency across different parts of the |
| system. |
| |
| ### Key Concepts and Components: |
| |
| #### Commit and Tile Numbering: |
| |
| - **`CommitNumber` (`types.go`)**: Represents a unique, sequential identifier |
| for a commit within a repository. |
| |
| - **Why**: To provide a simple, linear way to reference commits. It |
| assumes a straightforward, non-branching history for easier indexing and |
| retrieval of performance data associated with specific code changes. The |
| first commit in a repository is assigned `CommitNumber(0)`. |
| - **How**: Implemented as an `int32`. It includes an `Add` method for safe |
| offsetting and a `BadCommitNumber` constant (`-1`) to represent invalid |
| or non-existent commit numbers. |
| - **`CommitNumberSlice` (`types.go`)**: A utility type to enable sorting |
| of `CommitNumber` slices, which is useful for various data processing |
| and display tasks. |
| |
| - **`TileNumber` (`types.go`)**: Represents an index for a "tile" in the |
| `TraceStore`. Performance data (traces) are often stored in chunks or tiles |
| for efficient storage and retrieval. |
| |
| - **Why**: Tiling allows for optimized access to performance data, |
| especially for large datasets. Instead of loading entire traces, only |
| relevant tiles need to be accessed. |
| - **How**: Implemented as an `int32`. Functions like |
| `TileNumberFromCommitNumber` and `TileCommitRangeForTileNumber` manage |
| the mapping between commit numbers and tile numbers based on a |
| configurable `tileSize`. The `Prev()` method allows navigation to the |
| preceding tile, and `BadTileNumber` (`-1`) indicates an invalid tile. |
| |
| **Workflow: Commit to Tile Mapping** |
| |
| ``` |
| CommitNumber ----(tileSize)----> TileNumberFromCommitNumber() ----> TileNumber |
| | |
| V |
| TileCommitRangeForTileNumber() ----> (StartCommit, EndCommit) |
| ``` |
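| |
| The mapping itself is integer division by the configured tile size; a |
| sketch consistent with the description: |
| |
| ``` |
| // TileNumberFromCommitNumber maps a commit to the tile that holds it: |
| // tile 0 holds commits [0, tileSize), tile 1 holds [tileSize, |
| // 2*tileSize), and so on. |
| func TileNumberFromCommitNumber(c CommitNumber, tileSize int32) TileNumber { |
|   if c < 0 || tileSize <= 0 { |
|     return BadTileNumber |
|   } |
|   return TileNumber(int32(c) / tileSize) |
| } |
| ``` |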
| |
| #### Performance Data Representation: |
| |
| - **`Trace` (`types.go`)**: Represents a sequence of performance measurements, |
| typically corresponding to a specific metric over a series of commits. |
| |
| - **Why**: To provide a simple and efficient way to store and manipulate |
| time-series performance data. |
| - **How**: Implemented as a `[]float32`. The `NewTrace` function |
| initializes a trace of a given length with a special |
| `vec32.MISSING_DATA_SENTINEL` value, which is crucial for distinguishing |
| between actual zero values and missing data points. This leverages the |
| `go.skia.org/infra/go/vec32` package for optimized float32 vector |
| operations. |
| |
| - **`TraceSet` (`types.go`)**: A collection of `Trace`s, keyed by a string |
| identifier (trace ID). |
| |
| - **Why**: To group related traces, often corresponding to different |
| metrics measured for the same test or configuration. |
| - **How**: Implemented as a `map[string]Trace`. |
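| |
| A compact sketch of these two types follows, hard-coding the sentinel as |
| the literal `1e32` that Perf uses for missing data (the real code takes it |
| from `vec32.MISSING_DATA_SENTINEL`): |
| |
| ``` |
| // A sketch of Trace, NewTrace, and TraceSet as described above. |
| package types |
| |
| // Assumed to equal vec32.MISSING_DATA_SENTINEL. |
| const missingDataSentinel = float32(1e32) |
| |
| type Trace []float32 |
| |
| // NewTrace returns a Trace of the given length with every point marked as |
| // missing, so a real measurement of 0 stays distinguishable from "no data". |
| func NewTrace(n int) Trace { |
|     t := make(Trace, n) |
|     for i := range t { |
|         t[i] = missingDataSentinel |
|     } |
|     return t |
| } |
| |
| // TraceSet groups related traces by their structured trace ID. |
| type TraceSet map[string]Trace |
| ``` |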
| |
| #### Regression Detection and Alerting: |
| |
| - **`RegressionDetectionGrouping` (`types.go`)**: An enumeration defining how |
| traces are grouped for regression detection. |
| |
| - **Why**: Different grouping strategies can be more effective for |
| different types of performance data. This allows flexibility in the |
| regression detection process. |
| - **How**: Defined as a string type with constants like `KMeansGrouping` |
| (cluster traces by shape) and `StepFitGrouping` (analyze each trace |
| individually for steps). `ToClusterAlgo` provides a safe way to convert |
| strings to this type. |
| |
| - **`StepDetection` (`types.go`)**: An enumeration defining the algorithms |
| used to detect significant steps (changes) in individual traces or cluster |
| centroids. |
| |
| - **Why**: Various statistical methods can be employed to identify |
| meaningful performance regressions or improvements. This allows |
| selection of the most appropriate method for the data characteristics. |
| - **How**: Defined as a string type with constants representing different |
| detection methods, such as `OriginalStep`, `AbsoluteStep`, |
| `PercentStep`, `CohenStep`, and `MannWhitneyU`. `ToStepDetection` |
| ensures type-safe conversion from strings. |
| |
| - **`AlertAction` (`types.go`)**: An enumeration defining the actions to be |
| taken when an anomaly (potential regression) is detected by an alert |
| configuration. |
| |
| - **Why**: To allow configurable responses to detected anomalies, ranging |
| from no action to filing issues or triggering bisection jobs. |
| - **How**: Defined as a string type with constants like `NoAction`, |
| `FileIssue`, and `Bisection`. |
| |
| - **`Domain` (`types.go`)**: Specifies the range of commits over which an |
| operation (like regression detection) should be performed. |
| |
| - **Why**: To precisely define the scope of analysis. |
| - **How**: A struct containing either `N` (a number of commits) together |
| with `End` (a timestamp for the end of the range), or an `Offset` (a |
| specific commit number). |
| |
| - **`ProgressCallback` (`types.go`)**: A function type used to provide |
| feedback on the progress of long-running operations. |
| |
| - **Why**: To enable user interfaces or logging systems to display the |
| status of tasks like regression detection. |
| - **How**: Defined as `func(message string)`. |
| |
| - **`CL` (`types.go`)**: Represents a Change List identifier (e.g., a GitHub |
| Pull Request number). |
| |
| - **Why**: To associate performance data or alerts with specific code |
| changes under review. |
| - **How**: Defined as a `string`. |
| |
| - **`AnomalyDetectionNotifyType` (`types.go`)**: Defines the notification |
| mechanism for anomalies. |
| |
| - **Why**: Allows flexibility in how users are informed about detected |
| performance issues. |
| - **How**: String type with constants `IssueNotify` (send to issue |
| tracker) and `NoneNotify` (no notification). |
| |
| #### Miscellaneous: |
| |
| - **`ProjectId` (`types.go`)**: Represents a project identifier. |
| |
| - **Why**: Useful in multi-project environments to scope data or |
| configurations. |
| - **How**: Defined as a `string` with a predefined list `AllProjectIds`. |
| |
| - **`AllMeasurementStats` (`types.go`)**: A list of valid statistical suffixes |
| that can be part of performance measurement keys (e.g., "avg", "max"). |
| |
| - **Why**: To ensure consistency and provide a reference for valid stat |
| types when parsing or generating metric keys. |
| - **How**: A `[]string` slice. |
| |
| The unit tests in `types_test.go` focus on validating the logic of |
| `CommitNumber` arithmetic and the mapping between `CommitNumber` and |
| `TileNumber`, ensuring the core indexing mechanisms are correct. |
| |
| # Module: /go/ui |
| |
| The `/go/ui` module is responsible for handling frontend requests and preparing |
| data for display in the Perf UI. Its primary purpose is to bridge the gap |
| between user interactions on the frontend (e.g., selecting time ranges, defining |
| queries, or applying formulas) and the backend data sources and processing |
| logic. |
| |
| This module is designed to be the central point for fetching and transforming |
| performance data into a format that can be readily consumed by the UI. It |
| orchestrates interactions with various other modules, such as those responsible |
| for accessing Git history (`/go/git`), building dataframes (`/go/dataframe`), |
| handling data shortcuts (`/go/shortcut`), and calculating derived metrics |
| (`/go/calc`). |
| |
| The key rationale behind this module's existence is to encapsulate the |
| complexity of data retrieval and preparation, providing a clean and consistent |
| API for the frontend. This separation of concerns allows the frontend to focus |
| on presentation and user interaction, while the backend handles the intricacies |
| of data access and manipulation. |
| |
| The main workflow involves receiving a `FrameRequest` from the frontend, |
| processing it to fetch and transform data, and then returning a `FrameResponse` |
| containing the prepared data and display instructions. |
| |
| ### Key Components and Files: |
| |
| - **`/go/ui/frame/frame.go`**: This is the core file of the module. |
| - **Responsibilities**: |
| - Defines the structure of frontend requests (`FrameRequest`) and backend |
| responses (`FrameResponse`). `FrameRequest` captures user inputs like |
| time ranges, queries, formulas, and pivot table configurations. |
| `FrameResponse` packages the resulting data, along with display hints |
| and any relevant messages. |
| - Manages the processing of `FrameRequest` objects. This involves |
| dispatching tasks to other modules based on the request parameters. For |
| example, it uses the `dataframe.DataFrameBuilder` to fetch data based on |
| queries or trace keys, the `calc` module to evaluate formulas, and the |
| `pivot` module to restructure data for pivot tables. |
| - Handles different types of requests, such as those based on a specific |
| time range (`REQUEST_TIME_RANGE`) or a fixed number of recent commits |
| (`REQUEST_COMPACT`). |
| - Orchestrates the retrieval of anomalies from an `anomalies.Store` and |
| associates them with the relevant traces in the response. This can be |
| done based on time ranges or commit revision numbers. |
| - Includes logic to determine the appropriate display mode for the |
| frontend (e.g., plot, pivot table, or just a query input). |
| - Implements safeguards like truncating the number of traces in the |
| response if it exceeds a predefined limit, to prevent overwhelming the |
| frontend or the network. |
| - Provides functionality to identify "SKP changes" (significant file |
| changes in the Git repository, historically related to Skia Picture |
| files) within the requested commit range, which can be highlighted in |
| the UI. |
| - **Design Choices & Implementation Details**: |
| - The `ProcessFrameRequest` function is the main entry point for handling |
| a request. It creates a `frameRequestProcess` struct to manage the state |
| of the request processing. |
| - The processing is broken down into distinct steps: handling queries, |
| formulas, and keys (shortcuts). Each step typically involves fetching |
| data and then joining it into a single `DataFrame`. |
| - Error handling is centralized in `reportError` to ensure consistent |
| logging and error propagation. |
| - Progress tracking is integrated via the `progress.Progress` interface, |
| allowing the frontend to display updates during long-running requests. |
| - The decision to support both `REQUEST_TIME_RANGE` and `REQUEST_COMPACT` |
| request types caters to different user needs: exploring specific |
| historical periods versus viewing the latest trends. |
| - The inclusion of anomaly data directly in the `FrameResponse` aims to |
| provide users with immediate context about significant performance |
| changes alongside the raw data. The system supports fetching anomalies |
| based on either time or revision ranges, offering flexibility depending |
| on how anomalies are tracked and stored. |
| - The `ResponseFromDataFrame` function acts as a final assembly step, |
| taking a processed `DataFrame` and enriching it with SKP change |
| information, display mode, and handling potential truncation. |
| |
| A typical request processing flow might look like this: |
| |
| ``` |
| Frontend Request (FrameRequest) |
| | |
| V |
| ProcessFrameRequest() in frame.go |
| | |
| +------------------------------+-----------------------------+--------------------------+ |
| | | | | |
| V V V V |
| (If Queries exist) (If Formulas exist) (If Keys exist) (If Pivot requested) |
| doSearch() doCalc() doKeys() pivot.Pivot() |
| | | | | |
| V V V V |
| dfBuilder.NewFromQuery...() calc.Eval() with dfBuilder.NewFromKeys...() Restructure DataFrame |
| rowsFromQuery/Shortcut() |
| | | | | |
| +------------------------------+-----------------------------+--------------------------+ |
| | |
| V |
| DataFrame construction and merging |
| | |
| V |
| (If anomaly search enabled) |
| addTimeBasedAnomaliesToResponse() OR addRevisionBasedAnomaliesToResponse() |
| | |
| V |
| anomalyStore.GetAnomalies...() |
| | |
| V |
| ResponseFromDataFrame() |
| | |
| V |
| getSkps() (Find significant file changes) |
| | |
| V |
| Truncate response if too large |
| | |
| V |
| Set DisplayMode |
| | |
| V |
| Backend Response (FrameResponse) |
| | |
| V |
| Frontend UI |
| ``` |
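| |
| The shape of that dispatch can be sketched as follows. Everything here is |
| a stand-in for the real code in `frame.go` (types are stubbed out and the |
| helper signatures are assumptions), but it shows how the parts of a |
| request map to processing steps: |
| |
| ``` |
| package frame |
| |
| import "context" |
| |
| // Stub types standing in for the real definitions in frame.go. |
| type FrameRequest struct { |
|     Queries  []string |
|     Formulas []string |
|     Keys     string // ID of a shortcut holding explicit trace keys. |
| } |
| |
| type FrameResponse struct{} // DataFrame, display mode, messages, ... |
| |
| type frameRequestProcess struct{ request *FrameRequest } |
| |
| func (p *frameRequestProcess) doSearch(ctx context.Context, q string) error { return nil } |
| func (p *frameRequestProcess) doCalc(ctx context.Context, f string) error   { return nil } |
| func (p *frameRequestProcess) doKeys(ctx context.Context, k string) error   { return nil } |
| |
| // ProcessFrameRequest dispatches on whichever parts of the request are |
| // present, then assembles the merged DataFrame into a FrameResponse. |
| func ProcessFrameRequest(ctx context.Context, req *FrameRequest) (*FrameResponse, error) { |
|     p := &frameRequestProcess{request: req} |
|     for _, q := range req.Queries { |
|         if err := p.doSearch(ctx, q); err != nil { |
|             return nil, err |
|         } |
|     } |
|     for _, f := range req.Formulas { |
|         if err := p.doCalc(ctx, f); err != nil { |
|             return nil, err |
|         } |
|     } |
|     if req.Keys != "" { |
|         if err := p.doKeys(ctx, req.Keys); err != nil { |
|             return nil, err |
|         } |
|     } |
|     // The real code then attaches anomalies and SKP changes, truncates |
|     // oversized responses, and sets the display mode. |
|     return &FrameResponse{}, nil |
| } |
| ``` |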
| |
| # Module: /go/urlprovider |
| |
| ## URL Provider Module |
| |
| The `urlprovider` module is designed to generate URLs for various pages within |
| the Perf application. This centralized approach ensures consistency in URL |
| generation across different parts of the application and simplifies the process |
| of linking to specific views with pre-filled parameters. The key motivation is |
| to abstract away the complexities of URL query parameter construction and to |
| provide a simple interface for generating links to common Perf views like |
| "Explore", "MultiGraph", and "GroupReport". |
| |
| The core component of this module is the `URLProvider` struct. An instance of |
| `URLProvider` is initialized with a `perfgit.Git` object. This dependency is |
| crucial because some URL generation, particularly for time-range-based views, |
| requires fetching commit information (specifically timestamps) from the Git |
| repository to define the "begin" and "end" parameters of the URL. |
| |
| ### Key Responsibilities and Components: |
| |
| - **`urlprovider.go`**: This file contains the primary logic for the URL |
| provider. |
| - **`URLProvider` struct**: Holds a reference to a `perfgit.Git` instance. |
| This allows it to interact with the Git repository to fetch commit |
| details needed for constructing time-based query parameters. |
| - **`New(perfgit perfgit.Git) *URLProvider`**: This constructor function |
| creates and returns a new instance of `URLProvider`. It takes a |
| `perfgit.Git` object as an argument, which is stored within the struct. |
| This design choice makes the `URLProvider` stateful with respect to its |
| Git interaction capabilities. |
| - **`Explore(...) string`**: This method generates a URL for the "Explore" |
| page (`/e/`). |
| - **Why**: The "Explore" page is used for in-depth analysis of performance |
| data based on various parameters and a specific commit range. |
| - **How**: |
| 1. It calls `getQueryParams` to construct the common query parameters |
| like `begin`, `end`, and `disable_filter_parent_traces`. The `begin` |
| and `end` timestamps are derived from the provided |
| `startCommitNumber` and `endCommitNumber` by querying the `perfGit` |
| instance. The `end` timestamp is intentionally shifted forward by |
| one day to ensure that anomalies at the very end of the selected |
| range are visible on the graph. |
| 2. It then serializes the `parameters` map (which contains key-value |
| pairs for filtering traces) into a URL-encoded query string using |
| `GetQueryStringFromParameters`. This encoded string is assigned to |
| the `queries` parameter of the final URL. |
| 3. Additional `queryParams` (passed as `url.Values`) can be merged into |
| the URL. |
| 4. The final URL is constructed by appending the encoded query |
| parameters to the base path `/e/?`. |
| - **`MultiGraph(...) string`**: This method generates a URL for the |
| "MultiGraph" page (`/m/`). |
| - **Why**: The "MultiGraph" page allows users to view multiple graphs |
| simultaneously, often identified by a shortcut ID. |
| - **How**: |
| 1. Similar to `Explore`, it uses `getQueryParams` to build the common |
| time-range and filtering parameters. |
| 2. It specifically adds the `shortcut` parameter with the provided |
| `shortcutId`. |
| 3. Additional `queryParams` can also be merged. |
| 4. The final URL is constructed by appending the encoded query |
| parameters to the base path `/m/?`. |
| - **`GroupReport(param string, value string) string`**: This _static_ |
| function generates a URL for the "Group Report" page (`/u/`). |
| - **Why**: The "Group Report" page displays information related to groups |
| of anomalies, specific anomalies, bugs, or revisions. Unlike `Explore` |
| and `MultiGraph`, it does not inherently depend on a time range derived |
| from commits, nor does it require complex parameter encoding. |
| - **How**: |
| 1. It validates the input `param` against a predefined list of allowed |
| parameters (`anomalyGroupID`, `anomalyIDs`, `bugID`, `rev`, `sid`). |
| This is a security and correctness measure to prevent arbitrary |
| parameters from being injected. |
| 2. If the `param` is valid, it constructs a simple URL with the |
| provided `param` and `value`. |
| 3. It returns an empty string if the `param` is invalid. |
| 4. This function is static (not a method on `URLProvider`) because it |
| doesn't need access to the `perfGit` instance or any other state |
| within `URLProvider`. This simplifies its usage for cases where only |
| a group report URL is needed without initializing a full |
| `URLProvider`. |
| - **`getQueryParams(...) url.Values`**: This private helper method is |
| responsible for creating the base set of query parameters common to |
| `Explore` and `MultiGraph`. |
| - **How**: |
| 1. It calls `fillCommonParams` to set the `begin` and `end` parameters |
| based on commit numbers. |
| 2. It conditionally adds `disable_filter_parent_traces=true` if |
| requested. |
| 3. It merges any additional `queryParams` provided by the caller. |
| - **`fillCommonParams(...)`**: This private helper populates the `begin` |
| and `end` timestamp parameters in the provided `url.Values`. |
| - **How**: It uses the `perfGit` instance to look up the `Commit` objects |
| corresponding to the `startCommitNumber` and `endCommitNumber`. The |
| timestamps from these commits are then used. As mentioned earlier, the |
| `end` timestamp is adjusted by adding one day. This separation of |
| concerns keeps the main `Explore` and `MultiGraph` methods cleaner. |
| - **`GetQueryStringFromParameters(parameters map[string][]string) |
| string`**: This helper method converts a map of string slices |
| (representing query parameters where a single key can have multiple |
| values) into a URL-encoded query string. |
| |
| ### Key Workflows: |
| |
| 1. **Generating an "Explore" Page URL:** |
| |
| ``` |
| Caller provides: context, startCommitNum, endCommitNum, filterParams, disableFilterParent, otherQueryParams |
| | |
| v |
| URLProvider.Explore() |
| | |
| +-------------------------------------+ |
| | | |
| v v |
| getQueryParams() GetQueryStringFromParameters(filterParams) |
| | | |
| +--> fillCommonParams() +--> Encode filterParams |
| | | | |
| | +--> perfGit.CommitFromCommitNumber() -> Get start timestamp |
| | | | |
| | +--> perfGit.CommitFromCommitNumber() -> Get end timestamp, add 1 day |
| | | | |
| | +----------------------------------------+ |
| | | |
| | v |
| | Combine begin, end, disableFilterParent, otherQueryParams into url.Values |
| | | |
| +-------------------------------------+ |
| | |
| v |
| Combine base URL ("/e/?"), common query params, and encoded filterParams string |
| | |
| v |
| Return final URL string |
| ``` |
| |
| 2. **Generating a "MultiGraph" Page URL:** |
| |
| ``` |
| Caller provides: context, startCommitNum, endCommitNum, shortcutId, disableFilterParent, otherQueryParams |
| | |
| v |
| URLProvider.MultiGraph() |
| | |
| v |
| getQueryParams() |
| | |
| +--> fillCommonParams() |
| | | |
| | +--> perfGit.CommitFromCommitNumber() -> Get start timestamp |
| | | |
| | +--> perfGit.CommitFromCommitNumber() -> Get end timestamp, add 1 day |
| | | |
| | +----------------------------------------+ |
| | | |
| | v |
| | Combine begin, end, disableFilterParent, otherQueryParams into url.Values |
| | |
| v |
| Add "shortcut=shortcutId" to url.Values |
| | |
| v |
| Combine base URL ("/m/?") and all query params |
| | |
| v |
| Return final URL string |
| ``` |
| |
| 3. **Generating a "Group Report" Page URL:** |
| |
| ``` |
| Caller provides: paramName, paramValue |
| | |
| v |
| urlprovider.GroupReport() |
| | |
| v |
| Validate paramName against allowed list |
| | |
| +-- (Valid) --> Construct URL: "/u/?" + paramName + "=" + paramValue |
| | | |
| | v |
| | Return URL string |
| | |
| +-- (Invalid) --> Return "" (empty string) |
| ``` |
| |
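| The validation-and-encode behavior of `GroupReport` can be sketched like |
| this (a sketch matching the description above, not necessarily the exact |
| code in `urlprovider.go`): |
| |
| ``` |
| package urlprovider |
| |
| import "net/url" |
| |
| // Parameter names accepted by the /u/ page, per the description above. |
| var allowedGroupReportParams = map[string]bool{ |
|     "anomalyGroupID": true, |
|     "anomalyIDs":     true, |
|     "bugID":          true, |
|     "rev":            true, |
|     "sid":            true, |
| } |
| |
| // GroupReport builds a Group Report URL, or returns "" when the parameter |
| // name is not on the allow-list. |
| func GroupReport(param string, value string) string { |
|     if !allowedGroupReportParams[param] { |
|         return "" |
|     } |
|     v := url.Values{} |
|     v.Set(param, value) |
|     return "/u/?" + v.Encode() |
| } |
| ``` |
| |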
| The design emphasizes reusability of common parameter generation logic |
| (`getQueryParams`, `fillCommonParams`) and clear separation of concerns for |
| generating URLs for different Perf pages. The dependency on `perfgit.Git` is |
| explicitly managed through the `URLProvider` struct, making it clear when Git |
| interaction is necessary. |
| |
| # Module: /go/userissue |
| |
| The `userissue` module is responsible for managing the association between |
| specific data points in Perf (identified by a trace key and a commit position) |
| and Buganizer issues. This allows users to flag specific performance regressions |
| or anomalies and link them directly to a tracking issue. |
| |
| The core of this module is the `Store` interface, which defines the contract for |
| persisting and retrieving these user-issue associations. The primary |
| implementation of this interface is `sqluserissuestore`, which leverages a SQL |
| database (specifically CockroachDB in this context) to store the data. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`store.go`**: This file defines the central `UserIssue` struct and the |
| `Store` interface. |
| |
| - **`UserIssue` struct**: Represents a single association. It contains: |
| - `UserId`: The email of the user who made the association. |
| - `TraceKey`: A string uniquely identifying a performance metric's trace |
| (e.g., ",arch=x86,config=Release,test=MyTest,"). |
| - `CommitPosition`: An integer representing a specific point in the commit |
| history where the data point exists. |
| - `IssueId`: The numerical ID of the Buganizer issue. |
| - **`Store` interface**: This interface dictates the operations that any |
| backing store for user issues must support (see the Go sketch after this |
| list): |
| - `Save(ctx context.Context, req *UserIssue) error`: Persists a new |
| `UserIssue` association. The implementation must handle potential |
| conflicts, such as trying to save a duplicate entry (same trace key and |
| commit position). |
| - `Delete(ctx context.Context, traceKey string, commitPosition int64) |
| error`: Removes an existing user-issue association based on its unique |
| trace key and commit position. It should handle cases where the |
| specified association doesn't exist. |
| - `GetUserIssuesForTraceKeys(ctx context.Context, traceKeys []string, |
| startCommitPosition int64, endCommitPosition int64) ([]UserIssue, |
| error)`: Retrieves all `UserIssue` associations for a given set of trace |
| keys within a specified range of commit positions. This is crucial for |
| displaying these associations on performance graphs or reports. |
| |
| - **`sqluserissuestore/sqluserissuestore.go`**: This is the SQL-backed |
| implementation of the `Store` interface. |
| |
| - **Design Rationale**: Using a SQL database provides robust data |
| integrity, transactional guarantees, and the ability to perform complex |
| queries if needed in the future. CockroachDB is chosen for its |
| scalability and compatibility with PostgreSQL syntax. |
| - **Implementation Details**: |
| - It uses a `go.skia.org/infra/go/sql/pool` for managing database |
| connections. |
| - SQL statements are defined as constants and, in the case of |
| `listUserIssues`, use Go's `text/template` package to dynamically |
| construct the `IN` clause for multiple `traceKeys`. This is a common |
| pattern to avoid SQL injection vulnerabilities and handle variadic |
| inputs efficiently. |
| - `Save`: Inserts a new row into the `UserIssues` table. It includes a |
| `last_modified` timestamp. |
| - `Delete`: First, it attempts to retrieve the issue to ensure it exists |
| before attempting deletion. This provides a more informative error |
| message if the record is not found. |
| - `GetUserIssuesForTraceKeys`: Constructs a SQL query using a template to |
| select issues matching the provided trace keys and commit position |
| range. It then iterates over the query results and populates a slice of |
| `UserIssue` structs. |
| |
| - **`sqluserissuestore/schema/schema.go`**: This file defines the Go struct |
| `UserIssueSchema` which directly maps to the SQL table schema for |
| `UserIssues`. |
| |
| - **Purpose**: This provides a typed representation of the database table, |
| making it easier to reason about the data structure and to potentially |
| use with ORM-like tools or schema migration utilities. |
| - **Key Fields**: |
| - `user_id TEXT NOT NULL` |
| - `trace_key TEXT NOT NULL` |
| - `commit_position INT NOT NULL` |
| - `issue_id INT NOT NULL` |
| - `last_modified TIMESTAMPTZ DEFAULT now()` |
| - `PRIMARY KEY(trace_key, commit_position)`: The combination of |
| `trace_key` and `commit_position` uniquely identifies a user issue, |
| preventing multiple issues from being associated with the exact same |
| data point. |
| |
| - **`mocks/Store.go`**: This contains a mock implementation of the `Store` |
| interface, generated using the `testify/mock` library. |
| |
| - **Purpose**: This is essential for unit testing components that depend |
| on the `userissue.Store` without requiring a live database connection. |
| It allows developers to define expected calls and return values for the |
| store's methods. |
| |
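| For reference, the `Store` contract described above can be written out as |
| a Go sketch, assembled from the method signatures listed earlier (the |
| field types for `UserIssue` are inferred from the schema): |
| |
| ``` |
| package userissue |
| |
| import "context" |
| |
| // UserIssue is a single association between a data point and a bug. |
| type UserIssue struct { |
|     UserId         string // Email of the user who made the association. |
|     TraceKey       string // e.g. ",arch=x86,config=Release,test=MyTest," |
|     CommitPosition int64 |
|     IssueId        int64 |
| } |
| |
| // Store persists and retrieves user-issue associations. |
| type Store interface { |
|     // Save persists a new association; duplicates (same trace key and |
|     // commit position) must be rejected. |
|     Save(ctx context.Context, req *UserIssue) error |
|     // Delete removes an existing association. |
|     Delete(ctx context.Context, traceKey string, commitPosition int64) error |
|     // GetUserIssuesForTraceKeys returns all associations for the given |
|     // trace keys within the commit position range. |
|     GetUserIssuesForTraceKeys(ctx context.Context, traceKeys []string, |
|         startCommitPosition int64, endCommitPosition int64) ([]UserIssue, error) |
| } |
| ``` |
| |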
| **Workflow Example: Saving a User Issue** |
| |
| 1. **User Action**: A user on the Perf frontend identifies a data point (e.g., |
| on a graph) and associates it with a Buganizer issue ID. |
| 2. **API Request**: The frontend sends a request to a backend API endpoint. |
| 3. **Backend Handler**: The API handler receives the request, which includes |
| the user's ID, the trace key, the commit position, and the issue ID. |
| 4. **Store Interaction**: The handler creates a `userissue.UserIssue` struct |
| and calls the `Save` method on an instance of `userissue.Store` (likely |
| `sqluserissuestore.UserIssueStore`): |
| |
| ``` |
| User Request (UI) |
| | |
| v |
| API Endpoint |
| | |
| v |
| Backend Handler |
| | (creates userissue.UserIssue{UserId: "...", TraceKey: "...", |
| |  CommitPosition: 123, IssueId: 45678}) |
| v |
| userissue.Store.Save(ctx, &issue) |
| | |
| v |
| sqluserissuestore.UserIssueStore.Save() |
| | (constructs SQL: INSERT INTO UserIssues (...) VALUES ($1, $2, $3, $4, $5)) |
| v |
| SQL Database (UserIssues Table) <-- Row inserted |
| ``` |
| |
| **Workflow Example: Retrieving User Issues for a Chart** |
| |
| 1. **User Action**: A user views a performance chart displaying multiple traces |
| over a range of commits. |
| 2. **Frontend Request**: The frontend needs to know if any data points on the |
| visible traces and commit range have associated issues. It requests this |
| information from a backend API. |
| 3. **Backend Handler**: The API handler receives the list of trace keys visible |
| on the chart and the start/end commit positions. |
| 4. **Store Interaction**: The handler calls `GetUserIssuesForTraceKeys` on |
| the `userissue.Store`: |
| |
| ``` |
| Chart Display Request (UI) |
| | (provides traceKeys=["trace1", "trace2"], startCommit=100, endCommit=200) |
| v |
| API Endpoint |
| | |
| v |
| Backend Handler |
| | |
| v |
| userissue.Store.GetUserIssuesForTraceKeys(ctx, traceKeys, startCommit, endCommit) |
| | |
| v |
| sqluserissuestore.UserIssueStore.GetUserIssuesForTraceKeys() |
| | (constructs SQL: SELECT ... FROM UserIssues WHERE trace_key IN ('trace1', |
| |  'trace2') AND commit_position >= 100 AND commit_position <= 200) |
| v |
| SQL Database (UserIssues Table) |
| | (returns rows matching the query) |
| v |
| Backend Handler |
| | (formats response) |
| v |
| API Endpoint |
| | |
| v |
| UI (displays issue markers on chart) |
| ``` |
| |
| The design emphasizes a clear separation of concerns with the `Store` interface, |
| allowing for different storage backends if necessary (though SQL is the current |
| and likely long-term choice). The SQL implementation is straightforward, using |
| parameterized queries for security and templates for dynamic query construction |
| where appropriate. |
| |
| # Module: /go/workflows |
| |
| ### Overview |
| |
| This module defines and implements Temporal workflows for automating tasks |
| related to performance anomaly detection and analysis in Skia Perf. It |
| orchestrates interactions between various services like the AnomalyGroup |
| service, Culprit service, and Gerrit service to achieve end-to-end automation. |
| The primary goal is to streamline the process of identifying performance |
| regressions, finding their root causes (culprits), and notifying relevant |
| parties. |
| |
| The workflows are designed to be resilient and fault-tolerant, leveraging |
| Temporal's capabilities for retries and state management. This ensures that even |
| if individual steps or external services encounter transient issues, the overall |
| process can continue and eventually complete. |
| |
| ### Responsibilities and Key Components |
| |
| The module is structured into a public API (`workflows.go`) and an internal |
| implementation package (`internal/`). |
| |
| **`workflows.go`**: |
| |
| - **Purpose**: Defines the public interface for the workflows, including their |
| names and the data structures for their parameters and results. |
| - **Why**: This separation allows other modules (clients) to trigger these |
| workflows without needing to know the internal implementation details or |
| depend on the specific libraries used within the workflows. It acts as a |
| contract. |
| - **Key Contents**: |
| - **Workflow Name Constants (`ProcessCulprit`, `MaybeTriggerBisection`)**: |
| These string constants are the canonical names used to invoke the |
| respective workflows via the Temporal client. Using constants helps |
| avoid typos and ensures consistency. |
| - **Parameter and Result Structs (`ProcessCulpritParam`, |
| `ProcessCulpritResult`, `MaybeTriggerBisectionParam`, |
| `MaybeTriggerBisectionResult`)**: These structs define the data that |
| needs to be passed into a workflow and the data that a workflow is |
| expected to return upon completion. They ensure type safety and clarity |
| in communication. |
| |
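| For example, a client could start a workflow by its public name roughly |
| like this (a sketch using the Temporal Go SDK; the task queue name and the |
| fields of the parameter struct are illustrative): |
| |
| ``` |
| package main |
| |
| import ( |
|     "context" |
|     "log" |
| |
|     "go.temporal.io/sdk/client" |
| |
|     "go.skia.org/infra/perf/go/workflows" |
| ) |
| |
| func main() { |
|     c, err := client.Dial(client.Options{HostPort: "temporal-host:7233"}) |
|     if err != nil { |
|         log.Fatal(err) |
|     } |
|     defer c.Close() |
| |
|     // Invoke the workflow by its canonical name constant, passing the |
|     // public parameter struct from workflows.go. |
|     run, err := c.ExecuteWorkflow(context.Background(), |
|         client.StartWorkflowOptions{TaskQueue: "some-task-queue"}, |
|         workflows.MaybeTriggerBisection, |
|         &workflows.MaybeTriggerBisectionParam{ /* fields per workflows.go */ }, |
|     ) |
|     if err != nil { |
|         log.Fatal(err) |
|     } |
|     log.Printf("started workflow: %s", run.GetID()) |
| } |
| ``` |
| |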
| **`internal/` package**: This package contains the actual implementation of the |
| workflows and their associated activities. Activities are the building blocks of |
| Temporal workflows, representing individual units of work that can be executed, |
| retried, and timed out independently. |
| |
| - **`options.go`**: |
| |
| - **Purpose**: Centralizes the configuration for Temporal activities and |
| child workflows. |
| - **Why**: Provides a consistent way to define timeouts and retry |
| policies. This makes it easier to manage and adjust these settings |
| globally or for specific categories of operations. For example, |
| short-lived activities interacting with external services have different |
| reliability characteristics than long-running child workflows. |
| - **Key Components**: |
| - `regularActivityOptions`: Defines default options (e.g., 1-minute |
| timeout, 10 retry attempts) for standard activities that are expected to |
| complete quickly, like API calls to other services. |
| - `childWorkflowOptions`: Defines options for child workflows (e.g., |
| 12-hour execution timeout, 4 retry attempts). This longer timeout |
| accommodates potentially resource-intensive tasks like bisections which |
| involve compilation and testing. |
| |
| - **`maybe_trigger_bisection.go`**: |
| |
| - **Purpose**: Implements the `MaybeTriggerBisectionWorkflow`, which is |
| the core logic for deciding whether to automatically find the cause of a |
| performance regression (bisection) or to simply report the anomaly. |
| - **Why**: This workflow automates a critical decision point in the |
| performance analysis pipeline. It aims to reduce manual intervention by |
| automatically initiating bisections for significant regressions while |
| still allowing for manual reporting of less critical issues. |
| - **Key Workflow Steps**: |
| |
| * **Wait**: Pauses for a defined duration (`_WAIT_TIME_FOR_ANOMALIES`, |
| e.g., 30 minutes). This allows time for related anomalies to be detected |
| and grouped together, potentially providing a more comprehensive picture |
| before taking action. |
| * **Load Anomaly Group**: Retrieves details of the specific anomaly group |
| using an activity that calls the AnomalyGroup service. |
| * **Decision (Bisect or Report)**: Based on the `GroupAction` field of the |
| anomaly group: |
| - **If `BISECT`**: |
| a. **Load Top Anomaly**: Fetches the most significant anomaly within the |
| group. |
| b. **Resolve Commit Hashes**: Converts the start and end commit positions |
| of the anomaly into Git commit hashes using an activity that interacts |
| with a Gerrit/Crrev service. |
| c. **Launch Bisection (Child Workflow)**: Triggers a separate |
| `CulpritFinderWorkflow` (defined in the `pinpoint/go/workflows` module) as |
| a child workflow, which performs the actual bisection. |
| - A unique ID is generated for the Pinpoint job. |
| - The child workflow is configured with `ParentClosePolicy: ABANDON`, |
| meaning it will continue running even if this parent workflow terminates. |
| This is crucial because bisections can be long-running. |
| - Callback parameters are passed to the child workflow so it knows how to |
| report its findings back (e.g., which AnomalyGroup ID it is associated |
| with, which Culprit service to use). |
| d. **Update Anomaly Group**: Records the ID of the launched bisection job |
| back into the AnomalyGroup. |
| - **If `REPORT`**: |
| a. **Load Top Anomalies**: Fetches a list of the top N anomalies in the |
| group. |
| b. **Notify User**: Calls an activity that uses the Culprit service to |
| file a bug or send a notification about these anomalies. |
| |
| - **Helper Functions**: |
| - `parseStatisticNameFromChart`, `benchmarkStoriesNeedUpdate`, |
| `updateStoryDescriptorName`: These functions handle specific data |
| transformations needed to correctly format parameters for the Pinpoint |
| bisection request, often due to legacy conventions or differences in how |
| metrics are named. |
| |
| - **`process_culprit.go`**: |
| |
| - **Purpose**: Implements the `ProcessCulpritWorkflow`, which handles the |
| results of a completed bisection (i.e., when one or more culprits are |
| identified). |
| - **Why**: This workflow bridges the gap between a successful bisection |
| and making that information actionable. It ensures that found culprits |
| are stored and that users are notified appropriately. |
| - **Key Workflow Steps**: |
| |
| * **Convert Commits**: Transforms the commit data from the Pinpoint format |
| to the format expected by the Culprit service. This involves parsing |
| repository URLs. |
| * **Persist Culprit**: Calls an activity to store the identified |
| culprit(s) in a persistent datastore via the Culprit service. |
| * **Notify User of Culprit**: Calls an activity to notify users (e.g., by |
| filing or updating a bug) about the identified culprit(s) via the Culprit |
| service. |
| |
| - **Helper Function**: |
| - `ParsePinpointCommit`: Handles the parsing of repository URLs from the |
| Pinpoint commit format (e.g., `https://{host}/{project}.git`) into |
| separate host and project components required by the Culprit service. |
| |
| - **`anomalygroup_service_activity.go`**: |
| |
| - **Purpose**: Defines activities that interact with the AnomalyGroup gRPC |
| service. |
| - **Why**: Encapsulates the client-side logic for communicating with the |
| AnomalyGroup service. This makes the workflows themselves cleaner and |
| focuses them on orchestration rather than low-level RPC details. |
| - **Key Activities**: |
| - `LoadAnomalyGroupByID`: Fetches an anomaly group by its ID. |
| - `FindTopAnomalies`: Retrieves the most significant anomalies within a |
| group. |
| - `UpdateAnomalyGroup`: Updates an existing anomaly group (e.g., to add a |
| bisection ID). |
| |
| - **`culprit_service_activity.go`**: |
| |
| - **Purpose**: Defines activities that interact with the Culprit gRPC |
| service. |
| - **Why**: Similar to `anomalygroup_service_activity.go`, this |
| encapsulates communication with the Culprit service. |
| - **Key Activities**: |
| - `PersistCulprit`: Stores culprit information. |
| - `NotifyUserOfCulprit`: Notifies users about a found culprit (e.g., by |
| creating a bug). |
| - `NotifyUserOfAnomaly`: Notifies users about a set of anomalies (used |
| when the group action is `REPORT`). |
| |
| - **`gerrit_service_activity.go`**: |
| |
| - **Purpose**: Defines activities for interacting with Gerrit or a |
| Gerrit-like service (specifically Crrev in this case) to resolve commit |
| positions to commit hashes. |
| - **Why**: Bisection workflows often start with commit positions (which |
| are easier for humans or detection systems to reason about initially) |
| but need actual Git hashes to perform the bisection. This activity |
| provides that translation. |
| - **Key Activity**: |
| - `GetCommitRevision`: Takes a commit position (as an integer) and returns |
| its corresponding Git hash. |
| |
| **`worker/main.go`**: |
| |
| - **Purpose**: This is the entry point for the Temporal worker process that |
| hosts and executes the workflows and activities defined in this module. |
| - **Why**: Temporal workers are the processes that actually run the workflow |
| and activity code. This `main` function sets up the worker, connects it to |
| the Temporal server, and registers the workflows and activities it's capable |
| of handling. |
| - **Key Operations**: |
| 1. **Initialization**: Sets up logging and Prometheus metrics. |
| 2. **Temporal Client Creation**: Establishes a connection to the Temporal |
| frontend service. |
| 3. **Worker Creation**: Creates a new Temporal worker associated with a |
| specific task queue (e.g., `localhost.dev` or a production queue name). |
| Workflows and activities are dispatched to workers listening on the |
| correct task queue. |
| 4. **Workflow Registration**: Registers `ProcessCulpritWorkflow` and |
| `MaybeTriggerBisectionWorkflow` with the worker, associating them with |
| their public names (e.g., `workflows.ProcessCulprit`). |
| 5. **Activity Registration**: Registers instances of the activity structs |
| (e.g., `CulpritServiceActivity`, `AnomalyGroupServiceActivity`, |
| `GerritServiceActivity`) with the worker. |
| 6. **Worker Start**: Starts the worker, which begins polling the specified |
| task queue for tasks to execute. |
| |
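| A condensed sketch of these operations follows (using the Temporal Go SDK; |
| the actual `worker/main.go` also wires up flags, logging, and metrics, and |
| the host and task queue names here are illustrative): |
| |
| ``` |
| package main |
| |
| import ( |
|     "log" |
| |
|     "go.temporal.io/sdk/client" |
|     "go.temporal.io/sdk/worker" |
|     "go.temporal.io/sdk/workflow" |
| |
|     "go.skia.org/infra/perf/go/workflows" |
|     "go.skia.org/infra/perf/go/workflows/internal" |
| ) |
| |
| func main() { |
|     c, err := client.Dial(client.Options{HostPort: "temporal-frontend:7233"}) |
|     if err != nil { |
|         log.Fatal(err) |
|     } |
|     defer c.Close() |
| |
|     w := worker.New(c, "localhost.dev", worker.Options{}) |
| |
|     // Register workflows under their public names so clients can invoke |
|     // them via the constants in workflows.go. |
|     w.RegisterWorkflowWithOptions(internal.MaybeTriggerBisectionWorkflow, |
|         workflow.RegisterOptions{Name: workflows.MaybeTriggerBisection}) |
|     w.RegisterWorkflowWithOptions(internal.ProcessCulpritWorkflow, |
|         workflow.RegisterOptions{Name: workflows.ProcessCulprit}) |
| |
|     // Register activity structs; their methods become callable activities. |
|     w.RegisterActivity(&internal.AnomalyGroupServiceActivity{}) |
|     w.RegisterActivity(&internal.CulpritServiceActivity{}) |
|     w.RegisterActivity(&internal.GerritServiceActivity{}) |
| |
|     if err := w.Run(worker.InterruptCh()); err != nil { |
|         log.Fatal(err) |
|     } |
| } |
| ``` |
| |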
| ### Key Workflows/Processes |
| |
| **1. Anomaly Group Processing and Potential Bisection |
| (`MaybeTriggerBisectionWorkflow`)** |
| |
| ``` |
| External Trigger (e.g., new AnomalyGroup created) |
| | |
| v |
| Start MaybeTriggerBisectionWorkflow(AG_ID) |
| | |
| +----------------------------------+ |
| | Wait (e.g., 30 mins) | |
| +----------------------------------+ |
| | |
| v |
| LoadAnomalyGroupByID(AG_ID) ----> AnomalyGroup Service |
| | |
| +-----------+ |
| | GroupAction?| |
| +-----------+ |
| / \ |
| / \ |
| BISECT REPORT |
| | | |
| v v |
| FindTopAnomalies(AG_ID, Limit=1) FindTopAnomalies(AG_ID, Limit=10) |
| | | |
| v v |
| GetCommitRevision(StartCommit) --> Gerrit Anomalies --> Convert to CulpritService format |
| | | |
| v v |
| GetCommitRevision(EndCommit) --> Gerrit NotifyUserOfAnomaly(AG_ID, Anomalies) --> Culprit Service |
| | |
| v |
| Execute Pinpoint.CulpritFinderWorkflow (Child) |
| | (Async, ParentClosePolicy=ABANDON) |
| | Params: {StartHash, EndHash, Config, Benchmark, Story, ... |
| | CallbackParams: {AG_ID, CulpritServiceURL, GroupingTaskQueue}} |
| | |
| v |
| UpdateAnomalyGroup(AG_ID, BisectionID) --> AnomalyGroup Service |
| | |
| v |
| End Workflow |
| ``` |
| |
| **2. Processing Bisection Results (`ProcessCulpritWorkflow`)** |
| |
| This workflow is typically triggered as a callback by the Pinpoint |
| `CulpritFinderWorkflow` when it successfully identifies a culprit. |
| |
| ``` |
| Pinpoint.CulpritFinderWorkflow completes |
| | (Calls back to Temporal, invoking ProcessCulpritWorkflow) |
| v |
| Start ProcessCulpritWorkflow(Commits, AG_ID, CulpritServiceURL) |
| | |
| +----------------------------------+ |
| | Convert Pinpoint Commits to | |
| | Culprit Service Format | |
| | (Parse Repository URLs) | |
| +----------------------------------+ |
| | |
| v |
| PersistCulprit(Commits, AG_ID) --------> Culprit Service |
| | (Returns CulpritIDs) |
| v |
| NotifyUserOfCulprit(CulpritIDs, AG_ID) -> Culprit Service |
| | (Returns IssueIDs, e.g., bug numbers) |
| v |
| End Workflow |
| ``` |
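| |
| The URL parsing done by `ParsePinpointCommit` can be sketched as follows |
| (a hypothetical helper illustrating the host/project split described |
| above, not the real function): |
| |
| ``` |
| package internal |
| |
| import ( |
|     "fmt" |
|     "net/url" |
|     "strings" |
| ) |
| |
| // parseRepositoryURL splits a Pinpoint repository URL of the form |
| // https://{host}/{project}.git into host and project components. |
| func parseRepositoryURL(repo string) (host string, project string, err error) { |
|     u, err := url.Parse(repo) |
|     if err != nil { |
|         return "", "", err |
|     } |
|     project = strings.TrimSuffix(strings.TrimPrefix(u.Path, "/"), ".git") |
|     if u.Host == "" || project == "" { |
|         return "", "", fmt.Errorf("malformed repository URL: %q", repo) |
|     } |
|     return u.Host, project, nil |
| } |
| ``` |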
| |
| # Module: /integration |
| |
| The `/integration` module provides a dataset and tools for conducting |
| integration tests on the Perf performance monitoring system. Its primary purpose |
| is to offer a controlled and reproducible environment for verifying the |
| ingestion and processing capabilities of Perf. |
| |
| The core of this module is the `data` subdirectory. This directory houses a |
| collection of JSON files, each representing performance data associated with |
| specific commits from the `perf-demo-repo` |
| (https://github.com/skia-dev/perf-demo-repo.git). These files are structured |
| according to the `format.Format` schema defined in |
| `go.skia.org/infra/perf/go/ingest/format`. This standardized format is crucial |
| as it allows Perf's 'dir' type ingester to directly consume these files. The |
| dataset is intentionally designed to include a mix of valid data points and |
| specific error conditions: |
| |
| - **Nine "good" files:** These represent typical, valid performance data that |
| Perf should successfully ingest and process. Each file corresponds to a |
| known commit in the `perf-demo-repo`. |
| - **One file with a "bad" commit:** This file (`demo_data_commit_10.json`) |
| contains a `git_hash` that does not correspond to an actual commit in the |
| `perf-demo-repo`. This allows testing how Perf handles data associated with |
| unknown or invalid commit identifiers. |
| - **One malformed JSON file:** `malformed.json` is intentionally not a valid |
| JSON file. This is used to test Perf's error handling capabilities when |
| encountering incorrectly formatted input data. |
| |
| The generation of these data files is handled by `generate_data.go`. This Go |
| program is responsible for creating the JSON files in the `data` directory. It |
| uses a predefined list of commit hashes from the `perf-demo-repo` and generates |
| random but plausible performance metrics for each. The inclusion of this |
| generator script is important because it allows developers to easily modify, |
| expand, or regenerate the test dataset if the testing requirements change or if |
| new scenarios need to be covered. The script uses `math/rand` for generating |
| some variability in the measurement values, ensuring the data isn't entirely |
| static while still being predictable. |
| |
| The key workflow for utilizing this module in an integration test scenario would |
| look something like this: |
| |
| 1. **Setup Perf:** Configure a local instance of Perf. |
| 2. **Configure Ingester:** Point Perf's 'dir' type ingester to the |
| `/integration/data` directory. `Perf Instance --> Ingester (type: 'dir') --> |
| /integration/data/*.json` |
| 3. **Run Ingestion:** Trigger the ingestion process in Perf. |
| 4. **Verify:** |
| - Confirm that the data from the nine "good" files is correctly ingested |
| and displayed in Perf. |
| - Check that Perf appropriately handles the file with the "bad" commit |
| (e.g., logs an error, flags the data). |
| - Verify that Perf correctly identifies and reports the error with the |
| `malformed.json` file. |
| |
| The `BUILD.bazel` file defines how the components of this module are built. |
| |
| - The `data` `filegroup` makes the JSON test files available to other parts of |
| the system, specifically for use in performance testing |
| (`//perf:__subpackages__`). |
| - The `integration_lib` `go_library` encapsulates the logic from |
| `generate_data.go`. |
| - The `integration` `go_binary` provides an executable to run |
| `generate_data.go`, allowing for easy regeneration of the test data. |
| |
| In essence, the `/integration` module provides a self-contained, |
| version-controlled set of test data and a mechanism to regenerate it. This is |
| crucial for ensuring the stability and correctness of Perf's data ingestion |
| pipeline by providing a consistent baseline for integration testing. The choice |
| to include both valid and intentionally erroneous data points allows for |
| comprehensive testing of Perf's data handling capabilities, including its |
| robustness in the face of invalid input. |
| |
| # Module: /jupyter |
| |
| The `/jupyter` module provides tools and examples for interacting with Skia's |
| performance data, specifically data from `perf.skia.org`. The primary goal is to |
| enable users to programmatically query, analyze, and visualize performance |
| metrics using the power of Python libraries like Pandas, NumPy, and Matplotlib |
| within a Jupyter Notebook environment. |
| |
| The core functionality revolves around fetching and processing performance data. |
| This is achieved by providing Python functions that abstract the complexities of |
| interacting with the `perf.skia.org` API. This allows users to focus on the data |
| analysis itself rather than the underlying data retrieval mechanisms. |
| |
| **Key Components/Files:** |
| |
| - **`/jupyter/Perf+Query.ipynb`**: This is a Jupyter Notebook that serves as |
| both an example and a utility library. |
| |
| - **Why**: It demonstrates how to use the provided Python functions to |
| query performance data. It also contains the definitions of these key |
| functions, making it a self-contained environment for performance |
| analysis. The notebook format is chosen for its interactive nature, |
| allowing users to execute code snippets, see results immediately, and |
| experiment with different queries and visualizations. |
| - **How**: |
| |
| - **`perf_calc(formula)`**: This function is designed to evaluate a |
| specific formula against the performance data. It takes a string |
| `formula` (e.g., `'count(filter(""))'`) as input. The formula is sent |
| to the `perf.skia.org` backend for processing. This function is useful |
| when you need to perform calculations or aggregations on the data |
| directly on the server side before retrieving it. |
| |
| - **`perf_query(query)`**: This function allows for more direct querying |
| of performance data based on key-value pairs. It takes a query string |
| (e.g., `'source_type=skp&sub_result=min_ms'`) that specifies the |
| parameters for data retrieval. This is suitable when you want to fetch |
| raw or filtered trace data. |
| |
| - **`perf_impl(body)`**: This is an internal helper function used by both |
| `perf_calc` and `perf_query`. It handles the actual HTTP communication |
| with `perf.skia.org`. It first determines the time range for the query |
| (typically the last 50 commits by default) by fetching initial page |
| data. Then, it sends the query or formula to the `/_/frame/start` |
| endpoint, polls the `/_/frame/status` endpoint until the request is |
| successful, and finally retrieves the results from `/_/frame/results`. |
| The results are then processed into a Pandas DataFrame, which is a |
| powerful data structure for analysis in Python. A special value `1e32` |
| from the backend (often representing missing or invalid data) is |
| converted to `np.nan` (Not a Number) for better handling in Pandas. |
| |
| - **`paramset()`**: This utility function fetches the available parameter |
| set from `perf.skia.org`. This is useful for discovering the possible |
| values for different dimensions like 'model', 'test', 'cpu_or_gpu', |
| etc., which can then be used to construct more targeted queries. |
| |
| - **Examples**: The notebook is rich with examples showcasing how to use |
| `perf_calc` and `perf_query`, plot the resulting DataFrames using |
| Pandas' built-in plotting capabilities or Matplotlib directly, normalize |
| data, calculate means, and perform more complex analyses like finding |
| the noisiest hardware models or comparing CPU vs. GPU performance for |
| specific tests. These examples serve as practical starting points for |
| users. |
| |
| - **Workflow (Simplified `perf_impl`):** |
| |
| * `Client (Jupyter Notebook)` -- `GET /_/initpage/` --> `perf.skia.org` |
| (Get time bounds) |
| |
| * `perf.skia.org` -- `Initial Data (JSON)` --> `Client` |
| |
| * `Client` -- `POST /_/frame/start (with query/formula & time bounds)` --> |
| `perf.skia.org` |
| |
| * `perf.skia.org` -- `Request ID (JSON)` --> `Client` |
| |
| * `Client` -- `GET /_/frame/status/{ID}` --> `perf.skia.org` (Loop until |
| 'Success') |
| |
| * `perf.skia.org` -- `Status (JSON)` --> `Client` |
| |
| * `Client` -- `GET /_/frame/results/{ID}` --> `perf.skia.org` |
| |
| * `perf.skia.org` -- `Performance Data (JSON)` --> `Client` |
| |
| * `Client (Python)`: Parse JSON -> Create Pandas DataFrame -> Return |
| DataFrame to user. |
| |
| - **`/jupyter/README.md`**: This file provides instructions on setting up the |
| necessary Python environment to run Jupyter Notebooks and the required |
| libraries (Pandas, SciPy, Matplotlib). |
| |
| - **Why**: Python environment management can be tricky, especially with |
| system-wide installations. Using a virtual environment (`virtualenv`) is |
| recommended to isolate project dependencies and avoid conflicts. |
| - **How**: It guides the user through installing `pip`, `python-dev`, and |
| `python-virtualenv` using `apt-get` (assuming a Debian-based Linux |
| system). It then shows how to create a virtual environment, activate it, |
| upgrade `pip`, and install `jupyter`, `notebook`, `scipy`, `pandas`, and |
| `matplotlib` within that isolated environment. Finally, it explains how |
| to run the Jupyter Notebook server and deactivate the environment when |
| done. This ensures a reproducible and clean setup for users wanting to |
| utilize the `Perf+Query.ipynb` notebook. |
| |
| The design emphasizes ease of use for data analysts and developers who need to |
| interact with Skia's performance data. By leveraging Jupyter Notebooks, it |
| provides an interactive and visual way to explore performance trends and issues. |
| The abstraction of API calls into simple Python functions (`perf_calc`, |
| `perf_query`) significantly lowers the barrier to entry for accessing this rich |
| dataset. |
| |
| # Module: /lint |
| |
| The `/lint` module is responsible for ensuring code quality and consistency |
| within the project by integrating and configuring JSHint, a popular JavaScript |
| linting tool. |
| |
| The primary goal of this module is to provide a standardized way to identify and |
| report potential errors, stylistic issues, and anti-patterns in the JavaScript |
| codebase. This helps maintain code readability, reduces the likelihood of bugs, |
| and promotes adherence to established coding conventions. |
| |
| The core component of this module is the `reporter.js` file. This file defines a |
| custom reporter function that JSHint will use to format and output the linting |
| results. |
| |
| The decision to implement a custom reporter stems from the need to present |
| linting errors in a clear, concise, and actionable format. Instead of relying on |
| JSHint's default output, which might be too verbose or not ideally suited for |
| the project's workflow, `reporter.js` provides a tailored presentation. |
| |
| The `reporter` function within `reporter.js` takes an array of error objects |
| (`res`) as input, where each object represents a single linting issue found by |
| JSHint. It then iterates through these error objects and constructs a formatted |
| string for each error. The format chosen is `filename:line:character message`, |
| which directly points developers to the exact location of the issue in the |
| source code. |
| |
| For example: `src/myFile.js:10:5 Missing semicolon` |
| |
| This specific format is chosen for its commonality in development tools and its |
| ease of integration with various editors and IDEs, allowing developers to |
| quickly navigate to the reported errors. |
| |
| After processing all errors, if any were found, the `reporter` function |
| aggregates the formatted error strings and prints them to the standard output |
| (`process.stdout.write`). Additionally, it appends a summary line indicating the |
| total number of errors found, ensuring that developers have a quick overview of |
| the linting status. The pluralization of "error" vs. "errors" is also handled |
| for grammatical correctness. |
| |
| The workflow can be visualized as: |
| |
| ``` |
| JSHint analysis --[error objects]--> reporter.js --[formatted errors & summary]--> stdout |
| ``` |
| |
| By controlling the output format, this module ensures that linting feedback is |
| consistently presented and easily digestible, contributing to a more efficient |
| development process. The design prioritizes providing actionable information to |
| developers, enabling them to address code quality issues promptly. |
| |
| # Module: /migrations |
| |
| This module is responsible for managing SQL database schema migrations for Perf. |
| Perf utilizes SQL backends to store various data, including trace data, |
| shortcuts, and alerts. As the application evolves, the database schema may need |
| to change. This module provides the mechanism to apply these changes and to |
| upgrade existing databases to the schema expected by the current Perf version. |
| |
| The core of this system relies on the `github.com/golang-migrate/migrate/v4` |
| library. This library provides a robust framework for versioning database |
| schemas and applying migrations in a controlled manner. |
| |
| The key design principle is to have a versioned set of SQL scripts for each |
| supported SQL dialect. This allows Perf to: |
| |
| 1. **Initialize a new database** with the correct schema. |
| 2. **Upgrade an existing database** from an older schema version to the current |
| one. |
| 3. **Rollback schema changes** if necessary, by providing "down" migrations. |
| |
| Each SQL dialect (e.g., CockroachDB) has its own subdirectory within the |
| `/migrations` module. The naming convention for these directories is critical: |
| they must match the values defined in `sql.Dialect`. |
| |
| Inside each dialect-specific directory, migration files are organized by |
| version. |
| |
| - File names are prefixed with a 0-padded version number (e.g., `0001_`, |
| `0002_`). |
| - For each version, there are two files: |
| - An `.up.` file (e.g., `0001_create_initial_tables.up.sql`): Contains SQL |
| statements to apply the schema changes for that version. |
| - A `.down.` file (e.g., `0001_create_initial_tables.down.sql`): Contains |
| SQL statements to revert the schema changes introduced by the |
| corresponding `.up.` file. |
| |
| This paired approach ensures that migrations can be applied and rolled back |
| smoothly. |
| |
| **Key Files and Responsibilities:** |
| |
| - `README.md`: Provides a high-level overview of the migration system, |
| explaining its purpose and the use of the `golang-migrate/migrate` library. |
| It also details the directory structure and file naming conventions for |
| migration scripts. |
| - `cockroachdb/`: This directory contains the migration scripts specifically |
| for the CockroachDB dialect. |
| - `cockroachdb/0001_create_initial_tables.up.sql`: This is the first |
| migration script for CockroachDB. It defines the initial schema for |
| Perf, creating tables such as `TraceValues`, `SourceFiles`, `ParamSets`, |
| `Postings`, `Shortcuts`, `Alerts`, `Regressions`, and `Commits`. The |
| table definitions include primary keys, indexes, and column types |
| tailored for efficient data storage and retrieval specific to Perf's |
| needs (e.g., storing trace data, associating traces with source files, |
| managing alert configurations, and tracking commit history). The schema |
| is designed to support the various functionalities of Perf, such as |
| querying traces by parameters, retrieving trace values over commit |
| ranges, and linking regressions to specific alerts and commits. |
| - `cockroachdb/0001_create_initial_tables.down.sql`: This file is intended |
| to contain SQL statements to drop the tables created by its |
| corresponding `.up.` script. However, as a safety precaution against |
| accidental data loss, it is currently empty. The design acknowledges the |
| potential danger of automated table drops in a production environment. |
| - `cdb.sql`: This is a utility SQL script designed for developers to interact |
| with and test queries against a CockroachDB instance populated with Perf |
| data. It includes sample `INSERT` statements to populate tables with test |
| data and various `SELECT` queries demonstrating common data retrieval |
| patterns used by Perf. This file is not part of the automated migration |
| process but serves as a helpful tool for development and debugging. It |
| showcases how to query for traces based on parameters, retrieve trace |
| values, find the most recent tile, and get source file information. It also |
| includes examples of more complex queries involving `INTERSECT` and `JOIN` |
| operations, reflecting the kinds of queries Perf might execute. |
| - `test.sql`: Similar to `cdb.sql`, this script is for testing and |
| experimentation, but it's tailored for a SQLite database. It creates a |
| schema similar to the CockroachDB one (though potentially simplified or with |
| slight variations due to dialect differences) and populates it with test |
| data. It contains a series of `CREATE TABLE`, `INSERT`, and `SELECT` |
| statements that developers can use to quickly set up a local test |
| environment and verify SQL logic. |
| - `batch-delete.sh` and `batch-delete.sql`: These files provide a mechanism |
| for performing batch deletions of specific parameter data from the |
| `ParamSets` table in a CockroachDB instance. |
| - `batch-delete.sql`: Contains the `DELETE` SQL statement. It is designed |
| to be edited directly to specify the deletion criteria (e.g., |
| `tile_number`, `param_key`, `param_value` ranges) and the `LIMIT` for |
| the number of rows deleted in each batch. This batching approach is |
| crucial for deleting large amounts of data without overwhelming the |
| database or causing long-running transactions. |
| - `batch-delete.sh`: A shell script that repeatedly executes |
| `batch-delete.sql` using the `cockroach sql` command-line tool. It runs |
| in a loop with a short sleep interval, allowing for controlled, |
| iterative deletion. This script assumes that a port-forward to the |
| CockroachDB instance is already established. This utility is likely used |
| for data cleanup or maintenance tasks that require removing specific, |
| potentially large, datasets. |
| |
| **Migration Workflow (Conceptual):** |
| |
| When Perf starts or when a migration command is explicitly run: |
| |
| 1. **Determine Current Schema Version:** The `golang-migrate/migrate` library |
| connects to the database and checks the current schema version (often stored |
| in a dedicated migrations table managed by the library itself). |
| 2. **Identify Target Schema Version:** This is typically the highest version |
| number found among the migration files for the configured SQL dialect. |
| 3. **Apply Pending Migrations:** |
| |
| - If the current schema version is lower than the target version, the |
| library iteratively executes the `.up.sql` files in ascending order of |
| their version numbers, starting from the version immediately following |
| the current one, up to the target version. |
| - Each successful `.up.` migration updates the schema version in the |
| database. |
| |
| Example: Current Version = 0, Target Version = 2
|
| ```
| DB State (v0) --> Run 0001_*.up.sql --> DB State (v1) --> Run 0002_*.up.sql --> DB State (v2)
| ```
| |
| 4. **Rollback Migrations (if needed):** |
| |
| - If a user needs to revert to an older schema version, the library can |
| execute the `.down.sql` files in descending order. |
| |
| Example: Current Version = 2, Target Rollback Version = 0
|
| ```
| DB State (v2) --> Run 0002_*.down.sql --> DB State (v1) --> Run 0001_*.down.sql --> DB State (v0)
| ```
| |
| The `BUILD.bazel` file defines a `filegroup` named `cockroachdb` which bundles |
| all files under the `cockroachdb/` subdirectory. This is likely used by other |
| parts of the Perf build system, perhaps to package these migration scripts or |
| make them accessible to the Perf application when it needs to perform |
| migrations. |
| |
| # Module: /modules |
| |
| ## Modules Documentation |
| |
| ### Overview |
| |
| The `modules` directory contains a collection of frontend TypeScript modules |
| that constitute the building blocks of the Perf web application's user |
| interface. These modules primarily define custom HTML elements (web components) |
| and utility functions for various UI functionalities, data processing, and |
| interaction with backend services. The architecture emphasizes modularity, |
| reusability, and a component-based approach, largely leveraging the Lit library |
| for creating custom elements and `elements-sk` for common UI widgets. |
| |
| The design philosophy encourages separation of concerns: |
| |
| - **UI Components:** Dedicated custom elements encapsulate specific UI |
| features like plotting, alert configuration, data tables, dialogs, and input |
| controls. |
| - **Data Handling:** Modules like `dataframe` and `progress` manage data |
| fetching, processing, and state. |
| - **Utilities:** Modules like `paramtools`, `pivotutil`, `cid`, and `trybot` |
| provide common functionalities for data manipulation, key parsing, and |
| specific calculations. |
| - **Styling and Theming:** A centralized `themes` module ensures a consistent |
| visual appearance, building upon `infra-sk`'s theming capabilities. |
| - **JSON Contracts:** The `json` module defines TypeScript interfaces that |
| mirror backend Go structures, ensuring type safety in client-server |
| communication. |
| |
| This modular structure aims to create a maintainable and scalable frontend |
| codebase. Each module typically includes its core logic, associated styles, demo |
| pages for isolated development and testing, and unit/integration tests. |
| |
| ### Key Responsibilities and Components |
| |
| A significant portion of the modules is dedicated to creating custom HTML |
| elements that serve as interactive UI components. These elements often |
| encapsulate complex behavior and interactions, simplifying their use in |
| higher-level page components. |
| |
| **Data Visualization and Interaction:** |
| |
| - `plot-simple-sk`: A custom-built canvas-based plotting element for rendering |
| interactive line graphs, optimized for performance with features like dual |
| canvases, Path2D objects, and k-d trees for point proximity. |
| - `plot-google-chart-sk`: An alternative plotting element that wraps the |
| Google Charts library, offering a rich set of features and interactivity |
| like panning, zooming, and trace visibility toggling. |
| - `plot-summary-sk`: Displays a summary plot (often using Google Charts) and |
| allows users to select a range, which is useful for overview and drill-down |
| scenarios. |
| - `chart-tooltip-sk`: Provides a detailed, interactive tooltip for data points |
| on charts, showing commit information, anomaly details, and actions like |
| bisection or requesting traces. |
| - `graph-title-sk`: Displays a structured title for graphs, showing key-value |
| parameter pairs associated with the plotted data. |
| - `word-cloud-sk`: Visualizes key-value pairs and their frequencies as a |
| textual list with proportional bars. |
| |
| **Alert and Regression Management:** |
| |
| - `alert-config-sk`: A UI for creating and editing alert configurations, |
| including query definition, detection algorithms, and notification settings. |
| - `alerts-page-sk`: A page for viewing, creating, and managing all alert |
| configurations. |
| - `cluster-summary2-sk`: Displays a detailed summary of a performance cluster, |
| including a plot, statistics, and triage controls. |
| - `anomalies-table-sk`: Renders a sortable and interactive table of detected |
| performance anomalies, allowing for grouping and bulk actions like triage |
| and graphing. |
| - `anomaly-sk`: Displays detailed information about a single performance |
| anomaly. |
| - `triage-status-sk`: A simple button-like element indicating the current |
| triage status of a cluster and allowing users to initiate the triage |
| process. |
| - `triage-menu-sk`: Provides a menu for bulk triage actions on selected |
| anomalies, including assigning bugs or marking them as ignored. |
| - `new-bug-dialog-sk`: A dialog for filing new bugs related to anomalies, |
| pre-filling details. |
| - `existing-bug-dialog-sk`: A dialog for associating anomalies with existing |
| bug reports. |
| - `user-issue-sk`: Manages the association of user-reported Buganizer issues |
| with specific data points. |
| - `bisect-dialog-sk`: A dialog for initiating a Pinpoint bisection process to |
| find the commit causing a regression. |
| - `pinpoint-try-job-dialog-sk`: A (legacy) dialog for initiating Pinpoint A/B |
| try jobs to request additional traces. |
| - `triage-page-sk`: A page dedicated to viewing and triaging regressions based |
| on time range and filters. |
| - `regressions-page-sk`: A page for viewing regressions associated with |
| specific "subscriptions" (e.g., sheriff configs). |
| - `subscription-table-sk`: Displays details of a subscription and its |
| associated alerts. |
| - `revision-info-sk`: Displays information about anomalies detected around a |
| specific revision. |
| |
| **Data Input and Selection:** |
| |
| - `query-sk`: A comprehensive UI for constructing complex queries by selecting |
| parameters and their values. |
| - `paramset-sk`: Displays a set of parameters and their values, often used to |
| summarize a query or data selection. |
| - `query-chooser-sk`: Combines `paramset-sk` (for summary) and `query-sk` (in |
| a dialog) for a compact query selection experience. |
| - `query-count-sk`: Shows the number of items matching a given query, fetching |
| this count from a backend endpoint. |
| - `commit-detail-picker-sk`: Allows users to select a specific commit from a |
| range, typically presented in a dialog with date range filtering. |
| - `commit-detail-panel-sk`: Displays a list of commit details, making them |
| selectable. |
| - `commit-detail-sk`: Displays information about a single commit with action |
| buttons. |
| - `calendar-input-sk`: A date input field combined with a calendar picker |
| dialog. |
| - `calendar-sk`: A standalone interactive calendar widget. |
| - `day-range-sk`: Allows selection of a "begin" and "end" date. |
| - `domain-picker-sk`: Allows selection of a data domain either by date range |
| or by a number of recent commits. |
| - `test-picker-sk`: A guided, multi-step picker for selecting tests or traces |
| by sequentially choosing parameter values. |
| - `picker-field-sk`: A text input field with a filterable dropdown menu of |
| predefined options, built using Vaadin ComboBox. |
| - `algo-select-sk`: A dropdown for selecting a clustering algorithm. |
| - `split-chart-menu-sk`: A menu for selecting an attribute by which to split a |
| chart. |
| - `pivot-query-sk`: A UI for configuring pivot table requests (group by, |
| operations, summaries). |
| - `triage2-sk`: A set of three buttons for selecting a triage status |
| (positive, negative, untriaged). |
| - `tricon2-sk`: An icon that visually represents one of the three triage |
| states. |
| |
| **Data Display and Structure:** |
| |
| - `pivot-table-sk`: Displays pivoted DataFrame data in a sortable table. |
| - `json-source-sk`: A dialog for viewing the raw JSON source data for a |
| specific trace point. |
| - `ingest-file-links-sk`: Displays relevant links (e.g., to Swarming, |
| Perfetto) associated with an ingested data point. |
| - `point-links-sk`: Displays links from ingestion files and generates commit |
| range links between data points. |
| - `commit-range-sk`: Dynamically generates a URL to a commit range viewer |
| based on begin and end commits. |
| |
| **Scaffolding and Application Structure:** |
| |
| - `perf-scaffold-sk`: Provides the consistent layout, header, and navigation |
| sidebar for all Perf application pages. |
| - `explore-simple-sk`: The core element for exploring and visualizing |
| performance data, including querying, plotting, and anomaly interaction. |
| - `explore-sk`: Wraps `explore-simple-sk`, adding features like user |
| authentication, default configurations, and optional integration with |
| `test-picker-sk`. |
| - `explore-multi-sk`: Allows displaying and managing multiple |
| `explore-simple-sk` graphs simultaneously, with shared controls and shortcut |
| management. |
| - `favorites-dialog-sk`: A dialog for adding or editing bookmarked "favorites" |
| (named URLs). |
| - `favorites-sk`: Displays and manages a user's list of favorites. |
| |
| **Backend Interaction and Data Processing Utilities:** |
| |
| - `cid/cid.ts`: Provides `lookupCids` to fetch detailed commit information |
| based on commit numbers. |
| - `common/plot-builder.ts` & `common/plot-util.ts`: Utilities for transforming |
| `DataFrame` and `TraceSet` data into formats suitable for plotting libraries |
| (especially Google Charts) and for creating consistent chart options. |
| - `common/test-util.ts`: Sets up mocked API responses (`fetch-mock`) for |
| various backend endpoints, facilitating isolated testing and demo page |
| development. |
| - `const/const.ts`: Defines shared constants, notably `MISSING_DATA_SENTINEL` |
| for representing missing data points, ensuring consistency with the backend. |
| - `csv/index.ts`: Converts `DataFrame` objects into CSV format for data |
| export. |
| - `dataframe/index.ts` & `dataframe/dataframe_context.ts`: Core logic for |
| managing and manipulating `DataFrame` objects. `DataFrameRepository` (a |
| LitElement context provider) handles fetching, caching, merging, and |
| providing `DataFrame` and `DataTable` objects to consuming components. |
| - `dataframe/traceset.ts`: Utilities for extracting and formatting information |
| from trace keys within DataFrames/DataTables, such as generating chart |
| titles and legends. |
| - `errorMessage/index.ts`: A wrapper around `elements-sk`'s `errorMessage` to |
| display persistent error messages by default. |
| - `json/index.ts`: Contains TypeScript interfaces and types that define the |
| structure of JSON data exchanged with the backend, crucial for type safety |
| and often auto-generated from Go structs. |
| - `paramtools/index.ts`: Client-side utilities for creating, parsing, and
|   manipulating `ParamSet` objects and structured trace keys (e.g., `makeKey`,
|   `fromKey`, `queryFromKey`); a usage sketch follows this list.
| - `pivotutil/index.ts`: Utilities for validating pivot table requests |
| (`pivot.Request`) and providing descriptions for pivot operations. |
| - `progress/progress.ts`: Implements `startRequest` for initiating and polling |
| the status of long-running server-side tasks, providing progress updates to |
| the UI. |
| - `trace-details-formatter/traceformatter.ts`: Provides `TraceFormatter` |
| implementations (default and Chrome-specific) for converting trace parameter |
| sets to display strings and vice-versa for querying. |
| - `trybot/calcs.ts`: Calculates and aggregates `stddevRatio` values from Perf |
| trybot results, grouping them by parameter to identify performance impacts. |
| - `trybot-page-sk`: A page for analyzing performance regressions based on |
| commit or trybot run, using `trybot/calcs` for analysis. |
| - `window/index.ts`: Utilities related to the browser `window` object, |
| including parsing build tag information from `window.perf.image_tag`. |
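|
| To make the trace-key utilities concrete, here is an illustrative sketch of
| how the `paramtools` helpers might be used (the return values shown in
| comments are assumptions based on the structured trace-key format; the exact
| behavior lives in `paramtools/index.ts`):
|
| ```
| import { makeKey, fromKey, queryFromKey } from '../paramtools';
|
| // Build a structured trace key from a set of parameters.
| const key = makeKey({ arch: 'x86', config: '8888' });
| // key === ',arch=x86,config=8888,'
|
| // Parse a trace key back into its parameters.
| const params = fromKey(key);
| // params === { arch: 'x86', config: '8888' }
|
| // Turn a trace key into a URL query string for backend requests.
| const query = queryFromKey(key);
| // query === 'arch=x86&config=8888'
| ```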
| |
| **Core Architectural Patterns:** |
| |
| - **Custom Elements (Web Components):** The UI is primarily built using custom |
| elements, promoting encapsulation, reusability, and interoperability. Most |
| elements extend `ElementSk` from `infra-sk`. |
| - **Lit Library:** Widely used for defining custom elements, providing |
| efficient templating (`lit-html`) and reactive updates. |
| - **State Management:** |
| - Local component state is managed within the elements themselves. |
|   - `stateReflector` (from `infra-sk`) is frequently used to synchronize
|     component state with URL query parameters, enabling bookmarking and
|     shareable views (e.g., `alerts-page-sk`, `explore-simple-sk`,
|     `triage-page-sk`); see the sketch after this list.
| - Lit contexts (`@lit/context`) are used for providing shared data down |
| the component tree without prop drilling, notably in |
| `dataframe/dataframe_context.ts` for `DataFrame` objects. |
| - **Event-Driven Communication:** Components often communicate using custom |
| DOM events. Child components emit events, and parent components listen and |
| react to them (e.g., `query-sk` emits `query-change`, `triage-status-sk` |
| emits `start-triage`). |
| - **Asynchronous Operations:** `fetch` API is used for backend communication. |
| Promises and `async/await` are standard for handling these asynchronous |
| operations. Spinners (`spinner-sk`) provide user feedback during loading. |
| - **Modularity and Dependencies:** Modules are designed to be relatively |
| self-contained, with clear dependencies declared in `BUILD.bazel` files. |
| This allows for better organization and easier maintenance. |
| - **Testing:** Each module typically has associated demo pages (`*-demo.html`, |
| `*-demo.ts`) for isolated development and visual testing, Karma unit tests |
| (`*_test.ts`), and Puppeteer end-to-end/screenshot tests |
| (`*_puppeteer_test.ts`). `fetch-mock` is extensively used in demos and tests |
| to simulate backend responses. |
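|
| The `stateReflector` pattern noted above typically looks like the following
| sketch (the import paths and the state shape are illustrative assumptions):
|
| ```
| import { stateReflector } from '../../../infra-sk/modules/stateReflector';
| import { HintableObject } from '../../../infra-sk/modules/hintable';
|
| // State that should be mirrored into the URL so views are bookmarkable.
| const state = { query: '', showDeleted: false };
|
| const stateHasChanged = stateReflector(
|   // Serialize the current state into URL query parameters.
|   () => state as unknown as HintableObject,
|   // Apply state parsed from the URL (e.g., on page load or history
|   // navigation).
|   (newState: HintableObject) => {
|     Object.assign(state, newState);
|   }
| );
|
| // Call stateHasChanged() after any user action that mutates `state`.
| ```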
| |
| This comprehensive set of modules forms a rich ecosystem for building and |
| maintaining the Perf application's frontend, with a strong emphasis on modern |
| web development practices and reusability. |
| |
| # Module: /modules/alert |
| |
| ## Alert Module Documentation |
| |
| ### Overview |
| |
| The `alert` module is responsible for validating the configuration of alerts |
| within the Perf system. Its primary function is to ensure that alert definitions |
| adhere to a set of predefined rules, guaranteeing their proper functioning and |
| preventing errors. This module plays a crucial role in maintaining the |
| reliability of the alerting system by catching invalid configurations before |
| they are deployed. |
| |
| ### Design Decisions and Implementation Choices |
| |
| The core design principle behind this module is simplicity and focused |
| responsibility. Instead of incorporating complex validation logic directly into |
| other parts of the system (like the UI or backend services that handle alert |
| creation/modification), this module provides a dedicated, reusable validation |
| function. This promotes modularity and makes the validation logic easier to |
| maintain and update. |
| |
| The choice of using a simple function (`validate`) that returns a string (empty |
| for valid, error message for invalid) is intentional. This approach is |
| straightforward to understand and integrate into various parts of the |
| application. It avoids throwing exceptions for validation failures, which can |
| sometimes complicate control flow, and instead provides clear, human-readable |
| feedback. |
| |
| The current validation is intentionally minimal, focusing on the essential |
| requirement of a non-empty query. This is a pragmatic approach, starting with |
| the most critical validation and allowing for the addition of more complex rules |
| as the system evolves. The dependency on `//perf/modules/json:index_ts_lib` |
| indicates that the structure of an `Alert` is defined externally, and this |
| module consumes that definition. |
| |
| ### Key Components and Responsibilities |
| |
| - **`index.ts`**: This is the central file of the module. |
| - **Responsibility**: It houses the primary validation logic for `Alert` |
| configurations. |
| - **`validate(alert: Alert): string` function**: |
| - **Purpose**: This function is the public API of the module. It takes an |
| `Alert` object (as defined in the `../json` module) as input. |
| - **How it works**: It performs a series of checks on the properties of |
| the `alert` object. Currently, it verifies that the `query` property of |
| the `Alert` is present and not an empty string. |
| - **Output**: If all checks pass, it returns an empty string, signifying |
| that the `Alert` configuration is valid. If any check fails, it returns |
| a string containing a descriptive error message indicating why the |
| `Alert` is considered invalid. This message is intended to be |
| user-friendly and help in correcting the configuration. |
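|
| A minimal sketch of this contract, consistent with the workflow diagram
| below (the import path is illustrative):
|
| ```
| import { Alert } from '../json';
|
| // Returns '' when the Alert is valid, otherwise a human-readable error.
| export function validate(alert: Alert): string {
|   if (!alert.query) {
|     return 'An alert must have a non-empty query.';
|   }
|   return '';
| }
| ```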
| |
| ### Key Workflows |
| |
| **Alert Validation Workflow:** |
| |
| ``` |
| External System (e.g., UI, API) -- Passes Alert object --> [alert/index.ts: validate()] |
| | |
| V |
| [ Is alert.query non-empty? ] |
| | |
| +--------------------------+--------------------------+ |
| | (Yes) | (No) |
| V V |
| [ Returns "" (empty string) ] [ Returns "An alert must have a non-empty query." ] |
| | | |
| V V |
| External System <-- Receives validation result -- [ Interprets result (valid/invalid) ] |
| ``` |
| |
| This workflow illustrates how an external system would interact with the |
| `validate` function. The external system provides an `Alert` object, and the |
| `validate` function returns a string. The external system then uses this string |
| to determine if the alert configuration is valid and can proceed accordingly |
| (e.g., save the alert, display an error to the user). |
| |
| # Module: /modules/alert-config-sk |
| |
| The `alert-config-sk` module provides a custom HTML element, |
| `<alert-config-sk>`, designed for creating and editing alert configurations |
| within the Perf application. This element serves as a user interface for |
| defining the conditions under which an alert should be triggered, how |
| regressions are detected, and where notifications should be sent. |
| |
| **Core Functionality and Design:** |
| |
| The primary goal of `alert-config-sk` is to offer a comprehensive yet |
| user-friendly way to manage alert settings. It encapsulates all the necessary |
| input fields and logic for defining an `Alert` object, which is a central data |
| structure in Perf for representing alert configurations. |
| |
| Key design considerations include: |
| |
| - **Modularity and Reusability:** By packaging the alert configuration UI as a |
| custom element, it can be easily integrated into various parts of the Perf |
| application where alert management is needed. |
| - **Dynamic UI based on Context:** The UI adapts based on global settings |
| (e.g., `window.perf.notifications`, `window.perf.display_group_by`, |
| `window.perf.need_alert_action`). This allows the same component to present |
| different options depending on the specific Perf instance's configuration or |
| the user's context. For example, the notification options (email vs. issue |
| tracker) and the visibility of "Group By" settings can change. |
| - **Data Binding and Reactivity:** The element uses Lit library for templating |
| and reactivity. Changes in the input fields directly update the internal |
| `_config` object, and changes to the element's properties (like `config`, |
| `paramset`) trigger re-renders. |
| - **Integration with other Perf modules:** It leverages other custom elements |
| like `query-chooser-sk` for selecting traces, `algo-select-sk` for choosing |
| clustering algorithms, and various `elements-sk` components (e.g., |
| `select-sk`, `multi-select-sk`, `checkbox-sk`) for standard UI inputs. This |
| promotes consistency and reduces redundant code. |
| - **User Feedback and Validation:** The component provides immediate feedback, |
| such as displaying different threshold units based on the selected step |
| detection algorithm and validating input for fields like the Issue Tracker |
| Component ID. It also includes "Test" buttons to verify alert notification |
| and bug template configurations. |
| |
| **Key Components and Files:** |
| |
| - **`alert-config-sk.ts`:** This is the heart of the module, defining the |
| `AlertConfigSk` class which extends `ElementSk`. |
| - **Properties:** |
| - `config`: An `Alert` object representing the current alert configuration |
| being edited. This is the primary data model for the component. |
| - `paramset`: A `ParamSet` object providing the available parameters and |
| their values for constructing queries (used by `query-chooser-sk`). |
| - `key_order`: An array of strings dictating the preferred order of keys |
| in the `query-chooser-sk`. |
| - **Templating (`template` static method):** Uses `lit-html` to define the |
| structure and content of the element. It dynamically renders sections |
| based on the current configuration and global settings (e.g., |
| `window.perf.notifications`). |
| - **Event Handling:** Listens to events from child components (e.g., |
| `query-change` from `query-chooser-sk`, `selection-changed` from |
| `select-sk`) to update the `_config` object. |
| - **Logic for Dynamic UI:** |
| - The `thresholdDescriptors` object maps step detection algorithms to |
| their corresponding units and descriptive labels, ensuring the |
| "Threshold" input field is always relevant. |
| - Conditional rendering (e.g., using `?` operator in lit-html or `if` |
| statements in helper functions like `_groupBy`) is used to show/hide UI |
| elements based on `window.perf` flags. |
| - **API Interaction:** |
| - `testBugTemplate()`: Sends a `POST` request to `/_/alert/bug/try` to |
| test the configured bug URI template. |
|     - `testAlert()`: Sends a `POST` request to `/_/alert/notify/try` to test
|       the alert notification setup. (A sketch of these calls appears after
|       this list.)
| - **Helper Functions:** |
| - `toDirection()`, `toConfigState()`: Convert string values from UI |
| selections to the appropriate enum types for the `Alert` object. |
| - `indexFromStep()`: Determines the correct selection index for the "Step |
| Detection" dropdown based on the current `_config.step` value. |
| - **`alert-config-sk.scss`:** Contains the SASS styles for the element, |
| ensuring a consistent look and feel within the Perf application. It imports |
| styles from `themes_sass_lib` and `buttons_sass_lib` for theming and button |
| styling. |
| - **`alert-config-sk-demo.html` and `alert-config-sk-demo.ts`:** Provide a |
| demonstration page for the `alert-config-sk` element. |
| - The HTML sets up a basic page structure with an instance of |
| `alert-config-sk` and buttons to manipulate global `window.perf` |
| settings, allowing developers to test different UI states of the |
| component. |
| - The TypeScript file initializes the demo, sets up mock `paramset` and |
| `config` data, and provides event listeners for the control buttons to |
| refresh the `alert-config-sk` component and display its current state. |
| This is crucial for development and testing. |
| - **`alert-config-sk_puppeteer_test.ts`:** Contains Puppeteer tests for the |
| component. These tests verify that the component renders correctly in |
| different states (e.g., with/without group_by, different notification |
| options) by interacting with the demo page and taking screenshots. |
| - **`index.ts`:** A simple entry point that imports and thereby registers the |
| `alert-config-sk` custom element, making it available for use in HTML. |
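|
| The two "Test" actions described under API Interaction reduce to simple
| `POST` requests. A hedged sketch follows; the response shape, including the
| `url` field, is an assumption based on the workflow described below:
|
| ```
| import { Alert } from '../json';
|
| // Try the configured bug URI template; the server returns a URL to the
| // test bug, which is then opened in a new tab.
| async function testBugTemplate(config: Alert): Promise<void> {
|   const resp = await fetch('/_/alert/bug/try', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify(config),
|   });
|   if (!resp.ok) {
|     throw new Error(`Test failed: ${resp.statusText}`);
|   }
|   const json = await resp.json();
|   window.open(json.url, '_blank'); // The 'url' field is an assumption.
| }
| ```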
| |
| **Workflow Example: Editing an Alert** |
| |
| 1. **Initialization:** |
| |
| - An instance of `alert-config-sk` is added to the DOM. |
| - The `paramset` property is set, providing the available trace |
| parameters. |
| - The `config` property is set with the `Alert` object to be edited (or a |
| default new configuration). |
| - Global `window.perf` settings influence which UI sections are initially |
| visible. |
| |
| 2. **User Interaction:** |
| |
| - The user modifies various fields: Display Name, Category, Query (via |
| `query-chooser-sk`), Grouping (via `algo-select-sk`), Step Detection, |
| Threshold, etc. |
| - As the user changes a field (e.g., selects a new "Step Detection" |
| algorithm from the `select-sk`): |
| - An event is dispatched by the child component (e.g., |
| `selection-changed`). |
| - `alert-config-sk` listens for this event. |
| - The event handler in `alert-config-sk.ts` updates the corresponding |
| property in its internal `_config` object (e.g., |
| `this._config.step = newStepValue`). |
| - The component re-renders (managed by Lit) to reflect the change. For |
| instance, if the "Step Detection" changes, the "Threshold" label and |
| units dynamically update. |
| |
| ``` |
| User interacts with <select-sk id="step"> |
| | |
| V |
| <select-sk> emits 'selection-changed' event |
| | |
| V |
| AlertConfigSk.stepSelectionChanged(event) is called |
| | |
| V |
| this._config.step is updated |
| | |
| V |
| this._render() is (indirectly) called by Lit |
| | |
| V |
| UI updates, e.g., label for "Threshold" input changes |
| ``` |
| |
| 3. **Testing Configuration (Optional):** |
| |
| - User clicks "Test" for bug template: |
| - `AlertConfigSk.testBugTemplate()` is called. |
| - A POST request is made to `/_/alert/bug/try`. |
| - The response (a URL to the bug) is opened in a new tab, or an error |
| is shown. |
| - User clicks "Test" for alert notification: |
| - `AlertConfigSk.testAlert()` is called. |
| - A POST request is made to `/_/alert/notify/try`. |
| - A success/error message is displayed. |
| |
| 4. **Saving Changes:** |
| |
| - The parent component or application logic that hosts `alert-config-sk` |
| is responsible for retrieving the updated `config` object from the |
| `alert-config-sk` element (e.g., `element.config`) and persisting it |
| (e.g., by sending it to a backend API). `alert-config-sk` itself does |
| not handle the saving of the configuration to a persistent store. |
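|
| For example, a host page might persist the edited configuration with a
| sketch like this (the endpoint matches the one used by `alerts-page-sk`;
| the save helper itself is hypothetical):
|
| ```
| import { AlertConfigSk } from './alert-config-sk';
|
| async function saveAlert(): Promise<void> {
|   const ele = document.querySelector('alert-config-sk') as AlertConfigSk;
|   const updated = ele.config; // Retrieve the edited Alert object.
|   const resp = await fetch('/_/alert/update', {
|     method: 'POST',
|     body: JSON.stringify(updated),
|   });
|   if (!resp.ok) {
|     throw new Error('Failed to save alert.');
|   }
| }
| ```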
| |
| This element aims to simplify the complex task of configuring alerts by |
| providing a structured and reactive interface, abstracting away the direct |
| manipulation of the underlying `Alert` JSON object for the end-user. |
| |
| # Module: /modules/alerts-page-sk |
| |
| ## alerts-page-sk Module Documentation |
| |
| ### High-Level Overview |
| |
| The `alerts-page-sk` module provides a user interface for managing and |
| configuring alerts within the Perf application. Users can view, create, edit, |
| and delete alert configurations. The page displays existing alerts in a table |
| and provides a dialog for detailed configuration of individual alerts. It |
| interacts with a backend API to fetch and persist alert data. |
| |
| ### Design Decisions and Implementation Choices |
| |
| **Why a dedicated page for alerts?** Centralizing alert management provides a |
| clear and focused interface for users responsible for monitoring performance |
| metrics. This separation of concerns simplifies the overall application |
| structure and user experience. |
| |
| **How are alerts displayed and managed?** Alerts are displayed in a tabular |
| format, offering a quick overview of key information like name, query, owner, |
| and status. Icons are used for common actions like editing and deleting, |
| enhancing usability. A modal dialog, utilizing the `<dialog>` HTML element and |
| the `alert-config-sk` component, is employed for focused editing of individual |
| alert configurations. This approach avoids cluttering the main page and provides |
| a dedicated space for detailed settings. |
| |
| **Why use Lit for templating?** Lit is used for its efficient rendering and |
| component-based architecture. This allows for a declarative way to define the UI |
| and manage its state, making the code more maintainable and easier to |
| understand. The use of `html` tagged template literals provides a clean and |
| JavaScript-native way to write templates. |
| |
| **How is user authorization handled?** The page checks if the logged-in user has |
| an 'editor' role. This is determined by fetching the user's status from |
| `/_/login/status`. Editing and creation functionalities are disabled if the user |
| lacks the necessary permissions, preventing unauthorized modifications. The |
| logged-in user's email is also pre-filled as the owner for new alerts. |
| |
| **Why is `fetch-mock` used in the demo?** `fetch-mock` is utilized in the demo |
| (`alerts-page-sk-demo.ts`) to simulate backend API responses. This allows for |
| isolated testing and development of the frontend component without requiring a |
| running backend. It enables developers to define expected responses for various |
| API endpoints, facilitating a predictable environment for UI development and |
| testing. |
| |
| **How are API interactions handled?** The component uses the `fetch` API to |
| communicate with the backend. Helper functions like `jsonOrThrow` and |
| `okOrThrow` are used to simplify response handling and error management. |
| Specific endpoints are used for listing (`/_/alert/list/...`), creating |
| (`/_/alert/new`), updating (`/_/alert/update`), and deleting |
| (`/_/alert/delete/...`) alerts. |
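|
| As a concrete illustration of these endpoints, deleting an alert and then
| refreshing the list might look like the following sketch (error handling is
| simplified; the real component uses `jsonOrThrow`/`okOrThrow`):
|
| ```
| import { Alert } from '../json';
|
| async function deleteAlert(id: number, showDeleted: boolean): Promise<Alert[]> {
|   const del = await fetch(`/_/alert/delete/${id}`, { method: 'POST' });
|   if (!del.ok) throw new Error(await del.text());
|   // Re-fetch the list; the boolean selects whether deleted configs appear.
|   const list = await fetch(`/_/alert/list/${showDeleted}`);
|   if (!list.ok) throw new Error(await list.text());
|   return (await list.json()) as Alert[];
| }
| ```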
| |
| **Why distinguish between "Alert" and "Component" in the UI?** The UI adapts to |
| display either an "Alert" field or an "Issue Tracker Component" field based on |
| the `window.perf.notifications` global setting. This allows the application to |
| integrate with different notification systems. If `markdown_issuetracker` is |
| configured, it links directly to the relevant issue tracker component. |
| |
| ### Responsibilities and Key Components/Files |
| |
| - **`alerts-page-sk.ts`**: This is the core TypeScript file defining the |
| `AlertsPageSk` custom element. |
| |
| - **Responsibilities**: |
| - Fetching and displaying a list of alert configurations. |
| - Providing functionality to create new alerts. |
| - Enabling editing of existing alerts through a modal dialog. |
| - Allowing deletion of alerts. |
| - Handling user authorization for edit/create operations. |
| - Managing the state of the "show deleted alerts" checkbox. |
| - Interacting with the backend API for all alert-related operations. |
| - Rendering the UI using Lit templates. |
| - **Key Methods**: |
| - `connectedCallback()`: Initializes the component by fetching initial |
| data (paramset and alert list). |
| - `list()`: Fetches and re-renders the list of alerts. |
| - `add()`: Initiates the creation of a new alert by fetching a default |
| configuration from the server and opening the edit dialog. |
| - `edit()`: Opens the edit dialog for an existing alert. |
| - `accept()`: Handles the submission of changes from the edit dialog, |
| sending an update request to the server. |
| - `delete()`: Sends a request to the server to delete an alert. |
| - `openOnLoad()`: Checks the URL for an alert ID on page load and, if |
| present, opens the edit dialog for that specific alert. This allows for |
| direct linking to an alert's configuration. |
| - **Key Properties**: |
| - `alerts`: An array holding the currently displayed alert configurations. |
| - `_cfg`: The `Alert` object currently being edited in the dialog. |
| - `isEditor`: A boolean indicating if the current user has editing |
| privileges. |
| - `dialog`: A reference to the HTML `<dialog>` element used for editing. |
| - `alertconfig`: A reference to the `alert-config-sk` element within the |
| dialog. |
| |
| - **`alerts-page-sk.scss`**: Contains the SASS/CSS styles for the |
| `alerts-page-sk` element. |
| |
| - **Responsibilities**: Defines the visual appearance of the alerts table, |
| buttons, dialog, and other UI elements within the page. It ensures a |
| consistent look and feel, including theming (dark mode). |
| |
| - **`alerts-page-sk-demo.ts`**: Provides a demonstration and development |
| environment for the `alerts-page-sk` component. |
| |
| - **Responsibilities**: |
| - Sets up `fetch-mock` to simulate backend API responses for |
| `/login/status`, `/_/count/`, `/_/alert/update`, `/_/alert/list/...`, |
| `/_/initpage/`, and `/_/alert/new`. This allows the component to be |
| developed and tested in isolation. |
| - Initializes global `window.perf` properties that might affect the |
| component's behavior (e.g., `key_order`, `display_group_by`, |
| `notifications`). |
| - Dynamically inserts `alerts-page-sk` elements into the demo HTML page. |
| |
| - **`alerts-page-sk-demo.html`**: The HTML structure for the demo page. |
| |
| - **Responsibilities**: Provides the basic HTML layout where the |
| `alerts-page-sk` component is rendered for demonstration purposes. |
| Includes an `<error-toast-sk>` for displaying error messages. |
| |
| - **`alerts-page-sk_puppeteer_test.ts`**: Contains Puppeteer tests for the |
| `alerts-page-sk` component. |
| |
| - **Responsibilities**: Performs automated UI testing, ensuring the |
| component renders correctly and basic interactions function as expected. |
| It takes screenshots for visual regression testing. |
| |
| - **`index.ts`**: A simple entry point that imports and thereby registers the |
| `alerts-page-sk` custom element. |
| |
| ### Key Workflows |
| |
| **1. Viewing Alerts:** |
| |
| ``` |
| User navigates to the alerts page |
| | |
| V |
| alerts-page-sk.connectedCallback() |
| | |
| +----------------------+ |
| | | |
| V V |
| fetch('/_/initpage/') fetch('/_/alert/list/false') // Fetch paramset and initial alert list |
| | | |
| V V |
| Update `paramset` Update `alerts` array |
| | | |
| +----------------------+ |
| | |
| V |
| _render() // Lit renders the table with alerts |
| ``` |
| |
| **2. Creating a New Alert:** |
| |
| ``` |
| User clicks "New" button (if isEditor === true) |
| | |
| V |
| alerts-page-sk.add() |
| | |
| V |
| fetch('/_/alert/new') // Get a template for a new alert |
| | |
| V |
| Update `cfg` with the new alert template (owner set to current user) |
| | |
| V |
| dialog.showModal() // Show the alert-config-sk dialog |
| | |
| V |
| User fills in alert details in alert-config-sk |
| | |
| V |
| User clicks "Accept" |
| | |
| V |
| alerts-page-sk.accept() |
| | |
| V |
| cfg = alertconfig.config // Get updated config from alert-config-sk |
| | |
| V |
| fetch('/_/alert/update', { method: 'POST', body: JSON.stringify(cfg) }) // Send new alert to backend |
| | |
| V |
| alerts-page-sk.list() // Refresh the alert list |
| ``` |
| |
| **3. Editing an Existing Alert:** |
| |
| ``` |
| User clicks "Edit" icon next to an alert (if isEditor === true) |
| | |
| V |
| alerts-page-sk.edit() with the selected alert's data |
| | |
| V |
| Set `origCfg` (deep copy of current `cfg`) |
| Set `cfg` to the selected alert's data |
| | |
| V |
| dialog.showModal() // Show the alert-config-sk dialog pre-filled with alert data |
| | |
| V |
| User modifies alert details in alert-config-sk |
| | |
| V |
| User clicks "Accept" |
| | |
| V |
| alerts-page-sk.accept() |
| | |
| V |
| cfg = alertconfig.config // Get updated config |
| | |
| V |
| IF JSON.stringify(cfg) !== JSON.stringify(origCfg) THEN |
| fetch('/_/alert/update', { method: 'POST', body: JSON.stringify(cfg) }) // Send updated alert |
| | |
| V |
| alerts-page-sk.list() // Refresh list |
| ENDIF |
| ``` |
| |
| **4. Deleting an Alert:** |
| |
| ``` |
| User clicks "Delete" icon next to an alert (if isEditor === true) |
| | |
| V |
| alerts-page-sk.delete() with the selected alert's ID |
| | |
| V |
| fetch('/_/alert/delete/{alert_id}', { method: 'POST' }) // Send delete request |
| | |
| V |
| alerts-page-sk.list() // Refresh the alert list |
| ``` |
| |
| **5. Toggling "Show Deleted Configs":** |
| |
| ``` |
| User clicks "Show deleted configs" checkbox |
| | |
| V |
| alerts-page-sk.showChanged() |
| | |
| V |
| Update `showDeleted` property based on checkbox state |
| | |
| V |
| alerts-page-sk.list() // Fetches alerts based on the new `showDeleted` state |
| ``` |
| |
| # Module: /modules/algo-select-sk |
| |
| ## Algo Select SK Module |
| |
| The `algo-select-sk` module provides a custom HTML element that allows users to |
| select a clustering algorithm. This component is crucial for applications where |
| different clustering approaches might yield better results depending on the data |
| or the analytical goal. |
| |
| ### High-Level Overview |
| |
| The core purpose of this module is to present a user-friendly way to switch |
| between available clustering algorithms, specifically "k-means" and "stepfit". |
| It encapsulates the selection logic and emits an event when the chosen algorithm |
| changes, allowing other parts of the application to react accordingly. |
| |
| ### Design and Implementation |
| |
| The "why" behind this module is the need for a standardized and reusable UI |
| component for algorithm selection. Instead of each part of an application |
| implementing its own dropdown or radio buttons for algorithm choice, |
| `algo-select-sk` provides a consistent look and feel. |
| |
| The "how" involves leveraging the `select-sk` custom element from the |
| `elements-sk` library to provide the actual dropdown functionality. |
| `algo-select-sk` builds upon this by: |
| |
| 1. **Defining specific algorithm options:** It hardcodes "k-means" and |
| "stepfit" as the available choices, along with descriptive tooltips. |
| 2. **Managing state:** It uses an `algo` attribute (and corresponding property) |
| to store and reflect the currently selected algorithm. |
| 3. **Emitting a custom event:** When the selection changes, it dispatches an |
| `algo-change` event with the new algorithm in the `detail` object. This |
| decoupling allows other components to listen for changes without direct |
| dependencies on `algo-select-sk`. |
| |
| The choice to use `select-sk` as a base provides a consistent styling and |
| behavior aligned with other elements in the Skia infrastructure. |
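|
| Consuming the `algo-change` event is straightforward; here is a sketch based
| on the event contract described above:
|
| ```
| import './index'; // Registers the <algo-select-sk> element.
| import { AlgoSelectAlgoChangeEventDetail } from './algo-select-sk';
|
| const ele = document.querySelector('algo-select-sk')!;
| ele.addEventListener('algo-change', (e: Event) => {
|   const detail = (e as CustomEvent<AlgoSelectAlgoChangeEventDetail>).detail;
|   // detail.algo is either 'kmeans' or 'stepfit'.
|   console.log(`Selected clustering algorithm: ${detail.algo}`);
| });
| ```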
| |
| ### Responsibilities and Key Components |
| |
| - **`algo-select-sk.ts`**: This is the heart of the module. |
| - **`AlgoSelectSk` class**: This `ElementSk` subclass defines the custom |
| element's behavior. |
| - **`template`**: Uses `lit-html` to render the underlying `select-sk` |
| element with predefined `div` elements representing the algorithm |
| options ("K-Means" and "Individual" which maps to "stepfit"). The |
| `selected` attribute on these divs is dynamically updated based on the |
| current `algo` property. |
| - **`connectedCallback` and `attributeChangedCallback`**: Ensure the |
| element renders correctly when added to the DOM or when its `algo` |
| attribute is changed programmatically. |
|     - **`_selectionChanged` method**: This is the event handler for the
|       `selection-changed` event from the inner `select-sk` element. When
|       triggered, it updates the `algo` property of `algo-select-sk` and then
|       dispatches the `algo-change` custom event. This is the primary
|       mechanism for communicating the selected algorithm to the outside
|       world.
|
|       ```
|       User interacts with <select-sk>
|         |
|         V
|       <select-sk> emits 'selection-changed' event
|         |
|         V
|       AlgoSelectSk._selectionChanged() is called
|         |
|         V
|       Updates internal 'algo' property
|         |
|         V
|       Dispatches 'algo-change' event with { algo: "new_value" }
|       ```
| - **`algo` getter/setter**: Provides a programmatic way to get and set the |
| selected algorithm. The setter ensures that only valid algorithm values |
| ('kmeans' or 'stepfit') are set, defaulting to 'kmeans' for invalid |
| inputs. This adds a layer of robustness. |
| - **`toClusterAlgo` function**: A utility function to validate and |
| normalize the input string to one of the allowed `ClusterAlgo` types. |
| This prevents invalid algorithm names from being propagated. |
| - **`AlgoSelectAlgoChangeEventDetail` interface**: Defines the structure |
| of the `detail` object for the `algo-change` event, ensuring type safety |
| for event consumers. |
| - **`algo-select-sk.scss`**: Provides minimal styling, primarily ensuring that |
| the cursor is a pointer when hovering over the element, indicating |
| interactivity. It imports shared color and theme styles. |
| - **`index.ts`**: A simple entry point that imports `algo-select-sk.ts`, |
| ensuring the custom element is defined and available for use when the module |
| is imported. |
| - **`algo-select-sk-demo.html` and `algo-select-sk-demo.ts`**: These files |
| provide a demonstration page for the `algo-select-sk` element. |
| - The HTML sets up a few instances of `algo-select-sk`, including one with |
| a pre-selected algorithm and one in dark mode, to showcase its |
| appearance. |
| - The TypeScript for the demo listens to the `algo-change` event from one |
| of the instances and displays the event detail in a `<pre>` tag. This |
| serves as a live example of how to consume the event. |
| - **`algo-select-sk_puppeteer_test.ts`**: Contains Puppeteer tests to verify |
| the component renders correctly and basic functionality. It checks for the |
| presence of the elements on the demo page and takes a screenshot for visual |
| regression testing. |
| |
| The component is designed to be self-contained and easy to integrate. By simply |
| including the element in HTML and listening for the `algo-change` event, |
| developers can incorporate algorithm selection functionality into their |
| applications. |
| |
| # Module: /modules/anomalies-table-sk |
| |
| ## Anomalies Table (`anomalies-table-sk`) |
| |
| The `anomalies-table-sk` module provides a custom HTML element for displaying a |
| sortable and interactive table of performance anomalies. Its primary purpose is |
| to present anomaly data in a clear, actionable format, allowing users to quickly |
| identify, group, triage, and investigate performance regressions or |
| improvements. |
| |
| ### Key Responsibilities: |
| |
| - **Displaying Anomalies:** Renders a list of `Anomaly` objects in a tabular |
| format. Each row represents an anomaly and displays key information such as |
| bug ID, revision range, test path, and metrics like delta percentage and |
| absolute delta. |
| - **Grouping Anomalies:** Automatically groups anomalies that share |
| overlapping revision ranges. This helps users identify related issues or |
| multiple manifestations of the same underlying problem. Groups can be |
| expanded or collapsed for better readability. |
| - **User Interaction:** |
| - **Sorting:** Allows users to sort the table by various columns (e.g., |
| Bug ID, Revisions, Test, Delta %). |
| - **Selection:** Users can select individual anomalies or entire groups of |
| anomalies using checkboxes. |
| - **Bulk Actions:** Provides "Triage" and "Graph" buttons that operate on |
| the currently selected anomalies. |
| - **Triage Integration:** Integrates with `triage-menu-sk` to allow users to |
| assign bug IDs, mark anomalies as invalid or ignored, or reset their triage |
| state. |
| - **Navigation and Investigation:** |
| - Provides links to individual anomaly reports (e.g., |
| `/u/?anomalyIDs=...`). |
| - Generates links to view graphs of selected anomalies in the multi-graph |
| explorer (`/m/...`). |
| - Links bug IDs to the configured bug tracking system (e.g., |
| `/u/?bugID=...`). |
| - Allows unassociating a bug ID from an anomaly. |
| |
| ### Design and Implementation Choices: |
| |
| - **LitElement for Web Component:** The component is built using LitElement, a |
| lightweight library for creating Web Components. This promotes |
| encapsulation, reusability, and interoperability with other web |
| technologies. |
| - **Client-Side Grouping:** Anomaly grouping based on revision range |
| intersection is performed client-side. This simplifies the backend and |
| provides immediate feedback to the user as they interact with the table. The |
|   `groupAnomalies` method iterates through the anomaly list, merging anomalies
|   into existing groups if their revision ranges intersect, or creating new
|   groups otherwise; a sketch of this logic follows the list.
| - **Client-Side Sorting:** Sorting is handled by the `sort-sk` element, which |
| observes changes to data attributes on the table rows. This avoids server |
| roundtrips for simple sorting operations. |
| - **Selective Rendering:** The table is re-rendered (using `this._render()`) |
| only when necessary, such as when data changes, groups are |
| expanded/collapsed, or selections are updated. This improves performance. |
| - **`AnomalyGroup` Class:** A simple `AnomalyGroup` class is used to manage |
| collections of related anomalies and their expanded state. This provides a |
| clear structure for handling grouped data. |
| - **Popup for Triage:** The triage menu is presented in a popup to save screen |
| real estate and provide a focused interface for triage actions. The popup's |
| visibility is controlled by the `showPopup` boolean property. |
| - **Event-Driven Communication:** The component emits a custom event |
| `anomalies_checked` when the selection state of an anomaly changes. This |
| allows parent components or other parts of the application to react to user |
| selections. |
| - **API Integration for Graphing and Reporting:** |
| - When graphing multiple anomalies, it first calls the |
|     `/_/anomalies/group_report` backend API. This API is designed to provide
| a consolidated view or a shared identifier (`sid`) for a group of |
| anomalies, which is then used to construct the graph URL. This is |
| preferred over constructing potentially very long URLs with many |
| individual anomaly IDs. |
| - For single anomaly graphing, it fetches additional time range |
| information via the same `group_report` API to provide context (one week |
| before and after the anomaly) in the graph. |
| - **Trace Formatting:** Uses `ChromeTraceFormatter` to correctly format trace |
| queries for linking to the graph explorer. |
| - **Styling:** SCSS is used for styling, importing shared styles from |
| `themes_sass_lib`, `buttons_sass_lib`, and `select_sass_lib` for a |
| consistent look and feel. Specific styles handle the appearance of |
| regression vs. improvement, expanded rows, and the triage popup. |
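|
| The revision-range grouping can be sketched as follows (the field names
| follow the `Anomaly` type in the `json` module; the actual implementation is
| in `groupAnomalies()` and may differ in detail):
|
| ```
| interface Anomaly {
|   start_revision: number;
|   end_revision: number;
| }
|
| class AnomalyGroup {
|   anomalies: Anomaly[] = [];
|   expanded: boolean = false;
|
|   // Two ranges intersect if neither one ends before the other starts.
|   intersects(a: Anomaly): boolean {
|     return this.anomalies.some(
|       (b) =>
|         a.start_revision <= b.end_revision &&
|         b.start_revision <= a.end_revision
|     );
|   }
| }
|
| function groupAnomalies(anomalyList: Anomaly[]): AnomalyGroup[] {
|   const groups: AnomalyGroup[] = [];
|   for (const anomaly of anomalyList) {
|     const group = groups.find((g) => g.intersects(anomaly));
|     if (group) {
|       group.anomalies.push(anomaly);
|     } else {
|       const fresh = new AnomalyGroup();
|       fresh.anomalies.push(anomaly);
|       groups.push(fresh);
|     }
|   }
|   return groups;
| }
| ```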
| |
| ### Key Files: |
| |
| - **`anomalies-table-sk.ts`:** This is the core file containing the LitElement |
| class definition for `AnomaliesTableSk`. It implements all the logic for |
| rendering the table, handling user interactions, grouping anomalies, and |
| interacting with backend services for triage and graphing. |
| - `populateTable(anomalyList: Anomaly[])`: The primary method to load data |
| into the table. It triggers grouping and rendering. |
| - `generateTable()`, `generateGroups()`, `generateRows()`: Template |
| methods responsible for constructing the HTML structure of the table |
| using `lit-html`. |
| - `groupAnomalies()`: Implements the logic for grouping anomalies based on |
| overlapping revision ranges. |
| - `openReport()`: Handles the logic for generating a URL to graph the |
|     selected anomalies, potentially calling the `/_/anomalies/group_report`
| API. |
| - `togglePopup()`: Manages the visibility of the triage menu popup. |
| - `anomalyChecked()`: Handles checkbox state changes and updates the |
| `checkedAnomaliesSet`. |
| - `openMultiGraphUrl()`: Constructs the URL for viewing an anomaly's trend |
| in the multi-graph explorer, fetching time range context via an API |
| call. |
| - **`anomalies-table-sk.scss`:** Contains the SCSS styles specific to the |
| anomalies table, defining its layout, appearance, and the styling for |
| different states (e.g., improvement, regression, expanded rows). |
| - **`index.ts`:** A simple entry point that imports and registers the |
| `anomalies-table-sk` custom element. |
| - **`anomalies-table-sk-demo.ts` and `anomalies-table-sk-demo.html`:** Provide |
| a demonstration page for the component, showcasing its usage with sample |
| data and interactive buttons to populate the table and retrieve checked |
| anomalies. The demo also sets up a global `window.perf` object with |
| configuration typically provided by the Perf application environment. |
| |
| ### Workflows: |
| |
| **1. Displaying and Grouping Anomalies:** |
| |
| ``` |
| [User Action: Page Load with Anomaly Data] |
| | |
| v |
| AnomaliesTableSk.populateTable(anomalyList) |
| | |
| v |
| AnomaliesTableSk.groupAnomalies() |
| |-> For each Anomaly in anomalyList: |
| | |-> Try to merge with existing AnomalyGroup (if revision ranges intersect) |
| | |-> Else, create new AnomalyGroup |
| | |
| v |
| AnomaliesTableSk._render() |
| | |
| v |
| [DOM Update: Table is rendered with grouped anomalies, groups initially collapsed] |
| ``` |
| |
| **2. Selecting and Triaging Anomalies:** |
| |
| ``` |
| [User Action: Clicks checkbox for an anomaly or group] |
| | |
| v |
| AnomaliesTableSk.anomalyChecked() or AnomalySk.toggleChildrenCheckboxes() |
| |-> Updates `checkedAnomaliesSet` |
| |-> Updates header checkbox state if needed |
| |-> Emits 'anomalies_checked' event |
| |-> Enables/Disables "Triage" and "Graph" buttons based on selection |
| | |
| v |
| [User Action: Clicks "Triage" button (if enabled)] |
| | |
| v |
| AnomaliesTableSk.togglePopup() |
| |-> Shows TriageMenuSk popup |
| |-> TriageMenuSk.setAnomalies(checkedAnomalies) |
| | |
| v |
| [User interacts with TriageMenuSk (e.g., assigns bug, marks invalid)] |
| | |
| v |
| TriageMenuSk makes API request (e.g., to /_/triage) |
| | |
| v |
| [Application reloads data or updates table based on triage result] |
| ``` |
| |
| **3. Graphing Selected Anomalies:** |
| |
| ``` |
| [User Action: Selects one or more anomalies] |
| | |
| v |
| [User Action: Clicks "Graph" button (if enabled)] |
| | |
| v |
| AnomaliesTableSk.openReport() |
| | |
| |--> If single anomaly selected: |
| | |-> window.open(`/u/?anomalyIDs={id}`, '_blank') |
| | |
| |--> If multiple anomalies selected: |
| |-> Call fetchGroupReportApi(idString) |
| | |-> POST to /_/anomalies/group_report with anomaly IDs |
| | |-> Receives response with `sid` (shared ID) |
| | |
| |-> window.open(`/u/?sid={sid}`, '_blank') |
| ``` |
| |
| **4. Expanding/Collapsing an Anomaly Group:** |
| |
| ``` |
| [User Action: Clicks expand/collapse button on a group row] |
| | |
| v |
| AnomaliesTableSk.expandGroup(anomalyGroup) |
| |-> Toggles `anomalyGroup.expanded` boolean |
| | |
| v |
| AnomaliesTableSk._render() |
| | |
| v |
| [DOM Update: Rows within the group are shown or hidden] |
| ``` |
| |
| # Module: /modules/anomaly-sk |
| |
| The `anomaly-sk` module provides a custom HTML element `<anomaly-sk>` and |
| related functionalities for displaying details about performance anomalies. It's |
| designed to present information about a specific anomaly, including its |
| severity, the affected revision range, and a link to the associated bug report. |
| A key utility function, `getAnomalyDataMap`, is also provided to process raw |
| anomaly data into a format suitable for plotting. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`anomaly-sk.ts`**: This is the core file defining the `<anomaly-sk>` |
| custom element. |
| |
| - **Why**: To encapsulate the logic and presentation of individual anomaly |
| details in a reusable web component. This promotes modularity and makes |
| it easy to integrate anomaly information into various parts of the Perf |
| application. |
| - **How**: It extends `ElementSk` and uses the `lit-html` library for |
| templating. It accepts an `Anomaly` object as a property and dynamically |
| renders a table displaying information like the score before and after |
| the anomaly, percentage change, revision range, improvement status, and |
| bug ID. |
| - It fetches commit details (hashes) using the `lookupCids` function from |
| the `cid` module to construct a clickable link to the commit range. |
| - It formats numbers and percentages for better readability. |
|   - It handles different bug ID states (e.g., 0 for no bug, -1 for invalid
|     alert, -2 for ignored alert) by displaying appropriate text or a link to
|     the bug tracking system. The `bug_host_url` property allows customization
|     of the bug tracker URL. (A sketch of these display rules follows this
|     list.)
| - The `formatRevisionRange` method asynchronously fetches commit hashes |
| for the start and end revisions of the anomaly to create a link to the |
| commit range view. If `window.perf.commit_range_url` is not defined, it |
| simply displays the revision numbers. |
| |
| - **`getAnomalyDataMap` (function in `anomaly-sk.ts`)**: |
| |
| - **Why**: To transform raw trace data and anomaly information into a |
| structured format that can be easily consumed by plotting components |
| like `plot-simple-sk`. This function bridges the gap between the raw |
| data representation and the visual representation of anomalies on a |
| graph. |
| - **How**: It takes a `TraceSet` (a collection of traces), |
| `ColumnHeader[]` (representing commit points on the x-axis), an |
| `AnomalyMap` (mapping trace IDs and commit IDs to `Anomaly` objects), |
| and a list of `highlight_anomalies` IDs. |
| - It iterates through each trace in the `TraceSet`. If a trace has |
| anomalies listed in the `AnomalyMap`, it then iterates through those |
| anomalies. |
| - For each anomaly, it finds the corresponding x-coordinate by matching |
| the anomaly's commit ID (`cid`) with the `offset` in the `ColumnHeader`. |
| A crucial detail is that if an exact commit ID match isn't found in the |
| header (e.g., due to a data upload failure for that specific commit), it |
| will associate the anomaly with the _next available_ commit point. This |
| ensures that anomalies are still visualized even if their precise commit |
| data point is missing, rather than being omitted entirely. |
| - The y-coordinate is taken directly from the trace data at that |
| x-coordinate. |
| - It determines if an anomaly should be highlighted based on the |
| `highlight_anomalies` input. |
| - The output is an object where keys are trace IDs and values are arrays |
| of `AnomalyData` objects, each containing the `x`, `y` coordinates, the |
| `Anomaly` object itself, and a `highlight` flag. |
| |
| ``` |
| Input: |
| TraceSet: { "traceA": [10, 12, 15*], ... } (*value at commit 101) |
| Header: [ {offset: 99}, {offset: 100}, {offset: 101} ] |
| AnomalyMap: { "traceA": { "101": AnomalyObjectA } } |
| HighlightList: [] |
| |
| getAnomalyDataMap |
| | |
| V |
| Output: |
| { |
| "traceA": [ |
| { x: 2, y: 15, anomaly: AnomalyObjectA, highlight: false } |
| ], |
| ... |
| } |
| ``` |
| |
| - **`anomaly-sk.scss`**: This file contains the SCSS styles for the |
| `<anomaly-sk>` element. |
| |
| - **Why**: To provide a consistent visual appearance for the anomaly |
| details table, aligning with the overall theme of the application |
| (`themes_sass_lib`). |
| - **How**: It defines basic table styling, such as text alignment and |
| padding for `th` and `td` elements within the `anomaly-sk` component. |
| |
| - **`anomaly-sk-demo.html` and `anomaly-sk-demo.ts`**: These files set up a |
| demonstration page for the `<anomaly-sk>` element. |
| |
| - **Why**: To provide a sandbox environment for developers to see the |
| component in action with various inputs and to facilitate isolated |
| testing and development. |
| - **How**: `anomaly-sk-demo.html` includes instances of `<anomaly-sk>` |
| with different IDs. `anomaly-sk-demo.ts` initializes these components |
| with sample `Anomaly` data. It also mocks the `/_/cid/` API endpoint |
| using `fetch-mock` to simulate responses for commit detail lookups, |
| which is crucial for the `formatRevisionRange` functionality to work in |
| the demo. Global `window.perf` configurations are also set up, as the |
| component relies on them (e.g., `commit_range_url`). |
| |
| - **Test Files (`anomaly-sk_test.ts`, `anomaly-sk_puppeteer_test.ts`)**: |
| |
| - **Why**: To ensure the correctness and reliability of the component's |
| logic and rendering. |
| - **`anomaly-sk_test.ts`**: Contains unit tests for the |
| `getAnomalyDataMap` function (verifying its mapping logic, especially |
| the handling of missing commit points) and for static utility methods |
| within `AnomalySk` like `formatPercentage` and the asynchronous |
| `formatRevisionRange`. It uses `fetch-mock` to control API responses for |
| CID lookups. |
| - **`anomaly-sk_puppeteer_test.ts`**: Contains browser-based integration |
| tests using Puppeteer. It verifies that the demo page renders correctly |
| and takes screenshots for visual regression testing. |
| |
| **Workflow for Displaying an Anomaly:** |
| |
| 1. An `Anomaly` object is passed to the `anomaly` property of the
| `<anomaly-sk>` element, e.g.
| `<anomaly-sk .anomaly=${someAnomalyObject}></anomaly-sk>`.
| 2. The `set anomaly()` setter in `AnomalySk` is triggered. |
| 3. It calls `this.formatRevisionRange()` to asynchronously prepare the revision |
| range display. |
| - `formatRevisionRange` extracts `start_revision` and `end_revision`. |
| - It calls `lookupCids([start_rev_num, end_rev_num])` which makes a POST |
| request to `/_/cid/`. |
| - The response provides commit hashes. |
| - If `window.perf.commit_range_url` is set, it constructs an `<a>` tag |
| with the URL populated with the fetched hashes. Otherwise, it just |
| formats the revision numbers as text. |
| - The resulting `TemplateResult` is stored in `this._revision`. |
| 4. `this._render()` is called, which re-renders the component's template. |
| 5. The template (`AnomalySk.template`) displays the table: |
| - Score, Prior Score, Percent Change (calculated using |
| `getPercentChange`). |
| - Revision Range (using the `this.revision` template generated in step 3). |
| - Improvement status. |
| - Bug ID (formatted using `AnomalySk.formatBug`, potentially linking to |
| `this.bugHostUrl`). |
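| 
| A sketch of the revision-range formatting in step 3 (response field names such
| as `commitSlice`/`hash` and the `{begin}`/`{end}` placeholder convention in
| `commit_range_url` are assumptions for illustration):
| 
| ```
| import { html, TemplateResult } from 'lit-html';
| import { lookupCids } from '../cid/cid';
| 
| async function formatRevisionRangeSketch(
|   startRev: number,
|   endRev: number
| ): Promise<TemplateResult> {
|   // lookupCids POSTs the commit numbers to /_/cid/.
|   const resp: any = await lookupCids([startRev, endRev] as any);
|   const template = (window as any).perf?.commit_range_url as string;
|   if (!template) {
|     return html`${startRev} - ${endRev}`; // No URL template configured.
|   }
|   const url = template
|     .replace('{begin}', resp.commitSlice[0].hash)
|     .replace('{end}', resp.commitSlice[1].hash);
|   return html`<a href="${url}" target="_blank">${startRev} - ${endRev}</a>`;
| }
| ```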
| |
| This module effectively isolates the presentation and data transformation logic |
| related to individual anomalies, making it a maintainable and reusable piece of |
| the Perf frontend. The handling of potentially missing data points in |
| `getAnomalyDataMap` shows a robust design choice for dealing with real-world |
| data imperfections. |
| |
| # Module: /modules/bisect-dialog-sk |
| |
| ## Bisect Dialog (`bisect-dialog-sk`) |
| |
| The `bisect-dialog-sk` module provides a user interface element for initiating a |
| bisection process within the Perf application. This is specifically designed to |
| help pinpoint the commit that introduced a performance regression or |
| improvement, primarily for Chrome. |
| |
| ### Core Responsibility |
| |
| The primary responsibility of this module is to present a dialog to the user, |
| pre-filled with relevant information extracted from a chart tooltip (e.g., when |
| a user identifies an anomaly in a performance graph). It allows the user to |
| confirm or modify these parameters and then submit a request to the backend to |
| start a bisection task. |
| |
| ### Why a Dedicated Dialog? |
| |
| Performance analysis often involves identifying the exact change that caused a |
| shift in metrics. A manual bisection process can be tedious and error-prone. |
| This dialog streamlines this by: |
| |
| 1. **Pre-filling Data:** It leverages context from the chart (like the test |
| path and commit range) to pre-populate the necessary fields, reducing manual |
| data entry and potential mistakes. |
| 2. **Structured Input:** It provides a clear form for all required parameters |
| for a bisection request, ensuring that the backend receives all necessary |
| information. |
| 3. **User Authentication Awareness:** It integrates with the `alogin-sk` module |
| to fetch the logged-in user's email, which is a required parameter for the |
| bisect request. |
| 4. **Feedback Mechanism:** It provides visual feedback to the user during the |
| submission process (e.g., a spinner) and communicates success or failure via |
| toast messages. |
| |
| ### How it Works |
| |
| 1. **Initialization and Pre-filling:** |
| |
| - The dialog is typically instantiated and hidden until needed. |
| - When a user triggers a bisection (e.g., from a chart tooltip), the |
| `setBisectInputParams` method is called with details like the |
| `testPath`, `startCommit`, `endCommit`, `bugId`, `story`, and |
| `anomalyId`. |
| - These parameters are used to populate the input fields within the |
| dialog's form. |
| |
| 2. **User Interaction and Submission:** |
| |
| - The `open()` method displays the modal dialog. |
| - The user can review and, if necessary, modify the pre-filled values |
| (e.g., adjust the commit range or add a bug ID). They can also provide |
| an optional patch to be applied during the bisection. |
| - Upon clicking the "Bisect" button, the `postBisect` method is invoked. |
| |
| 3. **Request Construction and API Call:** |
| |
| - `postBisect` gathers the current values from the form fields. |
| - It parses the `testPath` to extract components like the `benchmark`, |
| `chart`, and `statistic`. The logic for deriving `chart` and `statistic` |
| involves checking the last part of the test name against a predefined |
| list of `STATISTIC_VALUES` (e.g., "avg", "count"). |
| - A `CreateBisectRequest` object is constructed with all the necessary |
| parameters. |
| - A `fetch` call is made to the `/_/bisect/create` endpoint with the JSON |
| payload. |
| |
| 4. **Response Handling:** |
| |
| - If the request is successful, a success message is typically displayed |
| (often as a toast by the calling context, as this dialog focuses on the |
| submission itself), and the dialog closes. |
| - If the request fails, an error message is displayed using |
| `errorMessage`, and the dialog remains open, allowing the user to |
| correct any issues or retry. |
| |
| **Simplified Bisect Request Workflow:** |
| |
| ``` |
| User Clicks Bisect Trigger (e.g., on chart) |
| | |
| V |
| Calling Code prepares `BisectPreloadParams` |
| | |
| V |
| `bisect-dialog-sk.setBisectInputParams(params)` |
| | |
| V |
| `bisect-dialog-sk.open()` |
| | |
| V |
| Dialog is Displayed (pre-filled) |
| | |
| V |
| User reviews/modifies data & Clicks "Bisect" |
| | |
| V |
| `bisect-dialog-sk.postBisect()` |
| | |
| V |
| `testPath` is parsed (extract benchmark, chart, statistic) |
| | |
| V |
| `CreateBisectRequest` object is built |
| | |
| V |
| `fetch POST /_/bisect/create` with request data |
| | |
| V |
| Handle API Response: |
| - Success -> Close dialog, Show success notification (external) |
| - Error -> Show error message, Keep dialog open |
| ``` |
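| 
| The test-path parsing step can be sketched as follows (the
| `master/bot/benchmark/chart/...` path layout, the underscore-suffix check, and
| the exact `STATISTIC_VALUES` entries are assumptions to be checked against
| `bisect-dialog-sk.ts`):
| 
| ```
| const STATISTIC_VALUES = ['avg', 'count', 'max', 'min', 'std', 'sum'];
| 
| // Sketch: derive benchmark, chart, and statistic from a full test path.
| function parseTestPath(testPath: string) {
|   const parts = testPath.split('/'); // e.g. master/bot/benchmark/chart/...
|   const benchmark = parts[2];
|   let chart = parts[3];
|   let statistic = '';
|   // If the chart name ends in a known statistic suffix (e.g. "load_avg"),
|   // split it into the base chart name and the statistic.
|   const last = chart.substring(chart.lastIndexOf('_') + 1);
|   if (STATISTIC_VALUES.includes(last)) {
|     statistic = last;
|     chart = chart.substring(0, chart.lastIndexOf('_'));
|   }
|   return { benchmark, chart, statistic };
| }
| ```
| 
| The derived fields are folded into the `CreateBisectRequest` payload, which,
| as noted under Design Choices below, also carries `project: 'chromium'`.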
| |
| ### Key Components and Files |
| |
| - **`bisect-dialog-sk.ts`**: This is the core TypeScript file defining the |
| `BisectDialogSk` custom element. |
| |
| - **`BisectDialogSk` class**: Extends `ElementSk` and manages the dialog's |
| state, rendering, and interaction logic. |
| - `BisectPreloadParams` interface: Defines the structure of the initial |
| data passed to the dialog. |
| - `template`: A lit-html template defining the dialog's HTML structure, |
| including input fields for test path, bug ID, start/end commits, story, |
| and an optional patch. It also includes a close icon, a spinner for |
| loading states, and submit/close buttons. |
| - `connectedCallback()`: Initializes the element, sets up property |
| upgrades, queries for DOM elements (dialog, form, spinner, button), and |
| attaches an event listener to the form's submit event. It also fetches |
| the logged-in user's status. |
| - `setBisectInputParams()`: Populates the internal state and input fields |
| with data provided externally. |
| - `open()`: Shows the modal dialog and ensures the submit button is |
| enabled. |
| - `closeBisectDialog()`: Closes the dialog. |
| - `postBisect()`: This is the heart of the submission logic. It: |
| - Activates the spinner and disables the submit button. |
| - Parses the `testPath` to extract various components required for the |
| bisect request (like `benchmark`, `chart`, `story`, `statistic`). |
| The logic for `chart` and `statistic` derivation is particularly |
| important here. |
| - Constructs the `CreateBisectRequest` payload. |
| - Makes a `POST` request to the `/_/bisect/create` endpoint. |
| - Handles the response, either closing the dialog on success or |
| displaying an error message on failure. |
| - `STATISTIC_VALUES`: A constant array used to determine if the last part |
| of a test name is a statistic (e.g., `avg`, `min`, `max`). |
| |
| - **`bisect-dialog-sk.scss`**: Contains the SASS styles for the dialog, |
| ensuring it aligns with the application's theme. It styles the dialog |
| itself, input fields, and the footer elements. |
| |
| - **`index.ts`**: A simple entry point that imports and thus registers the |
| `bisect-dialog-sk` custom element. |
| |
| - **`BUILD.bazel`**: Defines the build rules for this module, specifying its |
| dependencies (SASS, TypeScript, other SK elements like `alogin-sk`, |
| `select-sk`, `spinner-sk`, `close-icon-sk`) and sources. The dependencies |
| highlight its reliance on common UI components and infrastructure modules |
| for features like login status and error messaging. |
| |
| ### Design Choices |
| |
| - **Custom Element (`ElementSk`)**: Encapsulating the dialog as a custom |
| element promotes reusability and modularity. It can be easily integrated |
| into different parts of the Perf application where bisection capabilities |
| are needed. |
| - **`lit-html` for Templating**: Provides an efficient and declarative way to |
| define the dialog's HTML structure and update it based on its state. |
| - **Pre-computation of Request Parameters**: The dialog takes a "test path" |
| and derives several other parameters (benchmark, chart, statistic) from it. |
| This simplifies the input required from the user or the calling component, |
| as they only need to provide the full test identifier. |
| - **Specific to Chrome**: The comment "The bisect logic is only specific to |
| Chrome" indicates that the backend service this dialog interacts with |
| (`/_/bisect/create`) is tailored for Chrome's bisection infrastructure. The |
| `project: 'chromium'` in the request payload confirms this. |
| - **Error Handling**: The use of `jsonOrThrow` and `errorMessage` provides a |
| standard way to handle API errors and inform the user. |
| - **Spinner for Feedback**: The `spinner-sk` element gives visual feedback |
| during the asynchronous `fetch` operation, improving user experience. |
| |
| # Module: /modules/calendar-input-sk |
| |
| ## Calendar Input Element (`calendar-input-sk`) |
| |
| The `calendar-input-sk` module provides a user-friendly way to select dates. It |
| combines a standard text input field for manual date entry with a button that |
| reveals a `calendar-sk` element within a dialog for visual date picking. This |
| approach offers flexibility for users who prefer typing dates directly and those |
| who prefer a visual calendar interface. |
| |
| ### Responsibilities and Key Components |
| |
| - **`calendar-input-sk.ts`**: This is the core file defining the |
| `CalendarInputSk` custom element. |
| |
| - **Why**: It orchestrates the interaction between the text input, the |
| calendar button, and the pop-up calendar dialog. The goal is to provide |
| a seamless date selection experience. |
| - **How**: |
| - It uses a standard HTML `<input type="text">` element for direct date |
| input. A `pattern` attribute (`[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}`) and a |
| `title` are used to guide the user on the expected `YYYY-MM-DD` format. |
| An error indicator (`✗`) is shown if the input doesn't match the |
| pattern. |
| - A `<button>` element, styled with a `date-range-icon-sk`, triggers the |
| display of the calendar. |
| - A standard HTML `<dialog>` element is used to present the `calendar-sk` |
| element. This choice simplifies the implementation of modal behavior. |
| - The `openHandler` method is responsible for showing the dialog. It uses |
| a `Promise` to manage the asynchronous nature of user interaction with |
| the dialog (either selecting a date or canceling). This makes the event |
| handling logic cleaner and easier to follow. |
| - The `inputChangeHandler` is triggered when the user types into the text |
| field. It validates the input against the defined pattern. If valid, it |
| parses the date string and updates the `displayDate` property. |
| - The `calendarChangeHandler` is invoked when a date is selected from the |
| `calendar-sk` component within the dialog. It resolves the |
| aforementioned `Promise` with the selected date. |
| - The `dialogCancelHandler` is called when the dialog is closed without a |
| date selection (e.g., by pressing the "Cancel" button or the Escape |
| key). It rejects the `Promise`. |
| - An `input` custom event (of type `CustomEvent<Date>`) is dispatched |
| whenever the selected date changes, whether through the text input or |
| the calendar dialog. This allows parent components to react to date |
| selections. |
| - The `displayDate` property acts as the single source of truth for the |
| currently selected date. Setting this property will update both the text |
| input and the date displayed in the `calendar-sk` when it's opened. |
| - It leverages the `lit-html` library for templating, providing a |
| declarative way to define the element's structure and efficiently update |
| the DOM. |
| - The element extends `ElementSk`, inheriting common functionalities for |
| Skia custom elements. |
| |
| - **`calendar-input-sk.scss`**: This file contains the styling for the |
| `calendar-input-sk` element. |
| |
| - **Why**: To provide a consistent visual appearance that integrates well |
| with the Skia design system (themes). |
| - **How**: It uses SASS to define styles for the input field, the calendar |
| button, the error indicator, and the dialog. It leverages CSS variables |
| (e.g., `--error`, `--on-surface`, `--surface-1dp`) for theming, allowing |
| the component's appearance to adapt to different contexts (like dark |
| mode). The error indicator (the element with the `.invalid` class) is shown
| or hidden based on the input field's validity state via the `:invalid` CSS
| pseudo-class.
| |
| - **`index.ts`**: This file simply imports and thereby registers the |
| `calendar-input-sk` custom element. |
| |
| - **Why**: This is a common pattern for making custom elements available |
| for use in an application. It acts as the entry point for the component. |
| |
| - **`calendar-input-sk-demo.html` / `calendar-input-sk-demo.ts`**: These files |
| constitute a demonstration page for the `calendar-input-sk` element. |
| |
| - **Why**: To showcase the element's functionality, different states |
| (including invalid input and dark mode), and provide a simple way for |
| developers to interact with and understand the component. It also serves |
| as a testbed during development. |
| - **How**: The HTML file includes multiple instances of |
| `<calendar-input-sk>` in various configurations. The TypeScript file |
| initializes these instances, sets initial `displayDate` values, and |
| demonstrates how to listen for the `input` event. It also shows an |
| example of programmatically setting an invalid value in one of the input |
| fields. |
| |
| ### Key Workflows |
| |
| **1. Selecting a Date via Text Input:** |
| |
| ``` |
| User types "2023-10-26" into text input |
| | |
| V |
| inputChangeHandler in calendar-input-sk.ts |
| | |
| +-- (Input is valid: matches pattern "YYYY-MM-DD") --> Parse "2023-10-26" into a Date object |
| | | |
| | V |
| | Update _displayDate property |
| | | |
| | V |
| | Render component (updates input field's .value) |
| | | |
| | V |
| | Dispatch "input" CustomEvent<Date> |
| | |
| +-- (Input is invalid: e.g., "2023-") --> Do nothing (CSS shows error indicator) |
| ``` |
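| 
| The valid-input branch above might look roughly like this (a sketch; the real
| handler is a method on `CalendarInputSk` and also refocuses the input):
| 
| ```
| // Sketch: validate the typed date, then parse it and notify listeners.
| function inputChangeHandler(this: HTMLElement, e: Event): void {
|   const input = e.target as HTMLInputElement;
|   if (!input.validity.valid || input.value === '') {
|     return; // Invalid input: CSS surfaces the error indicator via :invalid.
|   }
|   const [y, m, d] = input.value.split('-').map(Number);
|   const date = new Date(y, m - 1, d); // JS Date months are zero-indexed.
|   this.dispatchEvent(new CustomEvent<Date>('input', { detail: date }));
| }
| ```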
| |
| **2. Selecting a Date via Calendar Dialog:** |
| |
| ``` |
| User clicks calendar button |
| | |
| V |
| openHandler in calendar-input-sk.ts |
| | |
| V |
| dialog.showModal() is called |
| | |
| V |
| <dialog> with <calendar-sk> is displayed |
| | |
| +-- User selects a date in <calendar-sk> --> <calendar-sk> dispatches "change" event |
| | | |
| | V |
| | calendarChangeHandler in calendar-input-sk.ts |
| | | |
| | V |
| | dialog.close() |
| | | |
| | V |
| | Promise resolves with the selected Date |
| | |
| +-- User clicks "Cancel" button or presses Esc --> dialog dispatches "cancel" event |
| | |
| V |
| dialogCancelHandler in calendar-input-sk.ts |
| | |
| V |
| dialog.close() |
| | |
| V |
| Promise rejects |
| ``` |
| |
| **If Promise resolves (date selected):** |
| |
| ``` |
| openHandler continues after await |
| | |
| V |
| Update _displayDate property with the resolved Date |
| | |
| V |
| Render component (updates input field's .value) |
| | |
| V |
| Dispatch "input" CustomEvent<Date> |
| | |
| V |
| Focus on the text input field |
| ``` |
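| 
| The `Promise`-based dialog handling can be sketched as follows (simplified;
| the real `CalendarInputSk` extends `ElementSk`, renders via `lit-html`, and
| updates `displayDate` before dispatching the event):
| 
| ```
| class CalendarInputSkSketch extends HTMLElement {
|   private dialog!: HTMLDialogElement;
| 
|   private resolve: ((d: Date) => void) | null = null;
| 
|   private reject: (() => void) | null = null;
| 
|   // Calendar button click: show the dialog and wait for the outcome.
|   async openHandler(): Promise<void> {
|     this.dialog.showModal();
|     try {
|       const date = await new Promise<Date>((resolve, reject) => {
|         this.resolve = resolve;
|         this.reject = reject;
|       });
|       // A date was picked: notify listeners.
|       this.dispatchEvent(new CustomEvent<Date>('input', { detail: date }));
|     } catch {
|       // Dialog was cancelled; nothing to do.
|     }
|   }
| 
|   // 'change' event from <calendar-sk> inside the dialog.
|   calendarChangeHandler(e: CustomEvent<Date>): void {
|     this.dialog.close();
|     this.resolve!(e.detail);
|   }
| 
|   // 'cancel' from the <dialog> (Cancel button or Escape key).
|   dialogCancelHandler(): void {
|     this.dialog.close();
|     this.reject!();
|   }
| }
| ```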
| |
| The design emphasizes a clear separation of concerns: the `calendar-sk` handles |
| the visual calendar logic, while `calendar-input-sk` manages the integration of |
| text input and the dialog presentation. The use of a `Promise` in `openHandler` |
| simplifies the handling of the asynchronous dialog interaction, leading to more |
| readable and maintainable code. |
| |
| # Module: /modules/calendar-sk |
| |
| The `calendar-sk` module provides a custom HTML element `<calendar-sk>` that |
| displays an interactive monthly calendar. This element was created to address |
| limitations with the native HTML `<input type="date">` element, specifically its |
| lack of Safari support and the inability to style the pop-up calendar. |
| Furthermore, it aims to be more themeable and accessible than other existing web |
| component solutions like Elix. |
| |
| The core philosophy behind `calendar-sk` is to provide a user-friendly, |
| accessible, and customizable date selection experience. Accessibility is a key |
| consideration, with design choices informed by WAI-ARIA practices for date |
| pickers. This includes keyboard navigation and appropriate ARIA attributes. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`calendar-sk.ts`**: This is the heart of the module, defining the |
| `CalendarSk` custom element which extends `ElementSk`. |
| - **Rendering the Calendar:** It uses the `lit-html` library for |
| templating, dynamically generating the HTML for the calendar grid. The |
| calendar displays one month at a time. |
| - The main template (`CalendarSk.template`) constructs the overall table |
| structure, including navigation buttons for changing the year and month, |
| and headers for the year and month. |
| - `CalendarSk.rowTemplate` is responsible for rendering each week (row) of |
| the calendar. |
| - `CalendarSk.buttonForDateTemplate` creates the individual day buttons. |
| It handles logic for disabling buttons for dates outside the current |
| month and highlighting the selected date and today's date. |
| - **Date Management:** |
| - It internally manages a `_displayDate` (a JavaScript `Date` object) |
| which represents the currently selected or focused date. |
| - The `CalendarDate` class is a helper to simplify comparisons of year, |
| month, and date, as JavaScript `Date` objects can be tricky with |
| timezones and direct comparisons. |
| - Helper functions like `getNumberOfDaysInMonth` and |
| `firstDayIndexOfMonth` are used to correctly layout the days within the |
| grid. |
| - **Navigation:** |
| - Provides UI buttons (using `navigate-before-icon-sk` and |
| `navigate-next-icon-sk`) for incrementing/decrementing the month and |
| year. Methods like `incYear`, `decYear`, `incMonth`, and `decMonth` |
| handle the logic for updating `_displayDate` and re-rendering. A crucial |
| detail in month/year navigation is handling cases where the current day |
| (e.g., 31st) doesn't exist in the target month (e.g., February). In such |
| scenarios, the date is adjusted to the last valid day of the target |
| month. |
| - **Keyboard Navigation:** |
| - The `keyboardHandler` method implements navigation using arrow keys |
| (day/week changes) and PageUp/PageDown keys (month changes). This |
| handler is designed to be attached to a parent element (like a dialog or |
| the document) to allow for controlled event handling, especially when |
| multiple keyboard-interactive elements are on a page. When a key is |
| handled, it prevents further event propagation and focuses the newly |
| selected date button. |
| - **Internationalization (i18n):** |
| - Leverages `Intl.DateTimeFormat` to display month names and weekday |
| headers according to the specified `locale` property or the browser's |
| default locale. The `buildWeekDayHeader` method dynamically generates |
| these headers. |
| - **Events:** |
| - Dispatches a `change` custom event ( `CustomEvent<Date>`) whenever a new |
| date is selected by clicking on a day. The event detail contains the |
| selected `Date` object. |
| - **Theming:** |
| - The component is themeable through CSS custom properties, as defined in |
| `calendar-sk.scss`. It imports styles from |
| `//perf/modules/themes:themes_sass_lib` and |
| `//elements-sk/modules/styles:buttons_sass_lib`. |
| - **`calendar-sk.scss`**: This file contains the SASS/CSS styles for the |
| `<calendar-sk>` element. It defines the visual appearance of the calendar |
| grid, buttons, headers, and how selected or "today" dates are highlighted. |
| It relies on CSS variables (e.g., `--background`, `--secondary`, |
| `--surface-1dp`) for theming, allowing the look and feel to be customized by |
| the consuming application. |
| - **`calendar-sk-demo.html` and `calendar-sk-demo.ts`**: These files set up a |
| demonstration page for the `calendar-sk` element. |
| - `calendar-sk-demo.html` includes instances of the calendar, some in dark |
| mode and one configured for a different locale (`zh-Hans-CN`), to |
| showcase its versatility. |
| - `calendar-sk-demo.ts` initializes these calendar instances, sets their |
| initial `displayDate` and `locale`, and attaches event listeners to log |
| the `change` event. It also demonstrates how to hook up the |
| `keyboardHandler`. |
| - **`index.ts`**: A simple entry point that imports and thus registers the |
| `calendar-sk` custom element, making it available for use in HTML. |
| |
| **Key Workflows:** |
| |
| 1. **Initialization and Rendering:** `ElementSk constructor` -> |
| `connectedCallback` -> `buildWeekDayHeader` -> `_render` (calls |
| `CalendarSk.template`) |
| |
| - When the `<calendar-sk>` element is added to the DOM, its |
| `connectedCallback` is invoked. |
| - This triggers the initial rendering process, including building the |
| weekday headers based on the current locale. |
| - The main template then renders the calendar grid for the month of the |
| initial `_displayDate`. |
| |
| 2. **Date Selection (Click):** User clicks on a date button -> `dateClick` |
| method -> Updates `_displayDate` -> Dispatches `change` event with the new |
| `Date` -> `_render` (to update UI, e.g., highlight new selection) |
| |
| ```
| User clicks a date button.
| 
| [date button] --click--> dateClick(event)
|   |
|   +--> new Date(this._displayDate)            (create copy)
|   +--> d.setDate(event.target.dataset.date)   (update day)
|   +--> dispatchEvent(new CustomEvent<Date>('change', { detail: d }))
|   +--> this._displayDate = d
|   +--> this._render()
| ```
| |
| 3. **Month/Year Navigation (Click):** User clicks "Previous Month" button -> |
| `decMonth` method -> Calculates new year, monthIndex, and date (adjusting |
| for days in month) -> Updates `_displayDate` with the new `Date` -> |
| `_render` (to display the new month/year) |
| |
| ```
| User clicks "Previous Month" button.
| 
| [Previous Month button] --click--> decMonth()
|   |
|   +--> Calculate new year, month, date
|   |    (adjusting for month boundaries and days in month)
|   +--> this._displayDate = new Date(newYear, newMonthIndex, newDate)
|   +--> this._render()
| ```
| |
| 4. **Keyboard Navigation:** User presses "ArrowRight" while calendar (or its |
| container) has focus -> `keyboardHandler(event)` -> `case 'ArrowRight': |
| this.incDay();` -> `incDay` method updates `_displayDate` (e.g., from May 21 |
| to May 22) -> `this._render()` -> `e.stopPropagation(); e.preventDefault();` |
| -> |
| `this.querySelector<HTMLButtonElement>('button[aria-selected="true"]')!.focus();` |
| |
| ```
| User presses ArrowRight key.
| 
| keydown event (ArrowRight) ---> keyboardHandler(event)
|   |
|   +-- (matches case 'ArrowRight')
|   +--> this.incDay()
|   |      |
|   |      +--> this._displayDate = new Date(year, monthIndex, date + 1)
|   |      +--> this._render()
|   +--> event.stopPropagation()
|   +--> event.preventDefault()
|   +--> Focus the newly selected day button.
| ```
| |
| The use of zero-indexed months (`monthIndex`) internally, as is common with the |
| JavaScript `Date` object, is a deliberate choice for consistency with the |
| underlying API, though it requires careful handling to avoid off-by-one errors, |
| especially when calculating things like the number of days in a month. |
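| 
| A sketch of the day-clamping described above (using `getNumberOfDaysInMonth`
| as named in the component; the implementation shown is illustrative):
| 
| ```
| // Day 0 of the following month is the last day of the given month.
| function getNumberOfDaysInMonth(year: number, monthIndex: number): number {
|   return new Date(year, monthIndex + 1, 0).getDate();
| }
| 
| // Sketch of decMonth: step back one month, clamping the day so that,
| // e.g., Mar 31 becomes Feb 28 (or 29 in a leap year).
| function decMonthSketch(displayDate: Date): Date {
|   let year = displayDate.getFullYear();
|   let monthIndex = displayDate.getMonth() - 1;
|   if (monthIndex < 0) {
|     monthIndex = 11;
|     year -= 1;
|   }
|   const date = Math.min(
|     displayDate.getDate(),
|     getNumberOfDaysInMonth(year, monthIndex)
|   );
|   return new Date(year, monthIndex, date);
| }
| ```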
| |
| # Module: /modules/chart-tooltip-sk |
| |
| ## chart-tooltip-sk Module Documentation |
| |
| ### Overview |
| |
| The `chart-tooltip-sk` module provides a custom HTML element, |
| `<chart-tooltip-sk>`, designed to display detailed information about a specific |
| data point on a chart. This tooltip is intended to be interactive, offering |
| context-sensitive actions and information relevant to performance monitoring and |
| analysis. It can be triggered by hovering over or clicking on a chart point. |
| |
| The design philosophy behind this module is to centralize the presentation of |
| complex data point information and related actions. Instead of scattering this |
| logic across various chart implementations, `chart-tooltip-sk` encapsulates it, |
| promoting reusability and maintainability. It aims to provide a rich user |
| experience by surfacing relevant details like commit information, anomaly |
| status, bug tracking, and actions like bisection or requesting further traces. |
| |
| ### Key Responsibilities and Components |
| |
| The primary responsibility of `chart-tooltip-sk` is to render a tooltip with |
| relevant information and interactive elements based on the data point it's |
| associated with. |
| |
| **Core Functionality & Design Choices:** |
| |
| - **Data Loading and Display:** |
| - The `load()` method is the main entry point for populating the tooltip |
| with data. It accepts various parameters like the trace index, test |
| name, y-value, date, commit position, anomaly details, and bug |
| information. This comprehensive loading mechanism allows the parent |
| charting component (e.g., `explore-simple-sk`) to provide all necessary |
| context. |
| - It displays fundamental information such as the test name, data value, |
| units, and the date of the data point. |
| - **Why:** Consolidating data loading into a single method simplifies the |
| interface for parent components. |
| - **Commit Information:** |
| - The tooltip can display details about the commit associated with the |
| data point, including the author, message, and a link to the commit in |
| the version control system. |
| - The `fetch_details()` method is responsible for asynchronously |
| retrieving commit details using the `/_/cid/` endpoint. This is done to |
| avoid loading all commit details upfront for every point on a chart, |
| which could be performance-intensive. |
| - The `_always_show_commit_info` and `_skip_commit_detail_display` flags |
| (sourced from `window.perf`) allow for configurable display of commit |
| details, catering to different instance needs. |
| - **Why:** On-demand fetching of commit details optimizes initial load |
| times. Configuration flags provide flexibility for different deployment |
| scenarios. |
| - **Anomaly Detection and Triage:** |
| - If a data point is identified as an anomaly, the tooltip will highlight |
| this and display relevant anomaly metrics (e.g., median before/after, |
| percentage change). |
| - It integrates with `anomaly-sk` for consistent formatting of anomaly |
| data. |
| - It incorporates `triage-menu-sk` to allow users to triage new anomalies |
| (e.g., create bugs, mark as not a bug). |
| - If a bug is already associated with an anomaly, it displays the bug ID |
| and provides an option to unassociate it. |
| - **Why:** Centralizing anomaly display and triage actions within the |
| tooltip provides a focused user workflow. |
| - **Bug Association:** |
| - Integrates with `user-issue-sk` to display and manage Buganizer issues |
| linked to a data point (even if it's not a formal anomaly). Users can |
| associate existing bugs or create new ones. |
| - The `bug_host_url` (from `window.perf`) is used to construct links to |
| the bug tracking system. |
| - **Why:** Direct integration with bug tracking streamlines the process of |
| linking performance data to actionable issues. |
| - **Interactive Actions:** |
| - **Bisect:** Provides a "Bisect" button (if `_show_pinpoint_buttons` is |
| true, typically for Chromium instances) that opens `bisect-dialog-sk`. |
| This allows users to initiate a bisection to find the exact commit that |
| caused a regression. |
| - **Request Trace:** Offers a "Request Trace" button (also gated by |
| `_show_pinpoint_buttons`) that opens `pinpoint-try-job-dialog-sk`. This |
| is used to request more detailed trace data for a specific commit. |
| - **Point Links:** Integrates `point-links-sk` to show relevant links for |
| a data point based on instance configuration (e.g., links to V8 or |
| WebRTC specific commit ranges). This is configured via |
| `keys_for_commit_range` and `keys_for_useful_links` in `window.perf`. |
| - **JSON Source:** If enabled (`show_json_file_display` in `window.perf`), |
| it provides a way to view the raw JSON data for the point via |
| `json-source-sk`. |
| - **Why:** Placing these actions directly in the tooltip makes them easily |
| discoverable and accessible in the context of the selected data point. |
| - **Positioning and Visibility:** |
| - The `moveTo()` method handles the dynamic positioning of the tooltip |
| relative to the mouse cursor or the selected chart point. It |
| intelligently adjusts its position to stay within the viewport and avoid |
| overlapping critical chart elements. |
| - The tooltip can be "fixed" (typically on click) or transient (on hover). |
| A fixed tooltip remains visible and offers more interactive elements. |
| - **Why:** Smart positioning ensures the tooltip is always usable and |
| doesn't obstruct the underlying chart. The fixed/transient behavior |
| balances information density with unobtrusiveness. |
| - **Styling:** |
| - Uses SCSS for styling (`chart-tooltip-sk.scss`), including themes |
| imported from `//perf/modules/themes:themes_sass_lib`. |
| - Employs `md-elevation` for a Material Design-inspired shadow effect. |
| - **Why:** SCSS allows for organized and maintainable styles. Material |
| Design elements provide a consistent look and feel. |
| |
| **Key Files:** |
| |
| - **`chart-tooltip-sk.ts`:** The core TypeScript file defining the |
| `ChartTooltipSk` class, its properties, methods, and HTML template (using |
| `lit-html`). This is where the primary logic for data display, interaction |
| handling, and integration with sub-components resides. |
| - **`chart-tooltip-sk.scss`:** The SASS file containing the styles for the |
| tooltip element. |
| - **`index.ts`:** A simple entry point that imports and registers the |
| `chart-tooltip-sk` custom element. |
| - **`chart-tooltip-sk-demo.html` & `chart-tooltip-sk-demo.ts`:** Files for |
| demonstrating the tooltip's functionality. The demo sets up mock data and |
| `fetchMock` to simulate API responses, allowing isolated testing and |
| visualization of the component. |
| - **`BUILD.bazel`:** Defines how the element and its demo page are built, |
| including dependencies on other Skia Elements and Perf modules like |
| `anomaly-sk`, `commit-range-sk`, `triage-menu-sk`, etc. |
| |
| **Workflow Example: Displaying Tooltip on Chart Point Click (Fixed Tooltip)** |
| |
| ``` |
| User clicks a point on a chart |
| | |
| V |
| Parent Chart Component (e.g., explore-simple-sk) |
| 1. Determines data for the clicked point (coordinates, commit, trace info). |
| 2. Optionally fetches commit details if not already available. |
| 3. Optionally checks its anomaly map for anomaly data. |
| 4. Calls `chartTooltipSk.load(...)` with all relevant data, |
| setting `tooltipFixed = true` and providing a close button action. |
| 5. Calls `chartTooltipSk.moveTo({x, y})` to position the tooltip. |
| | |
| V |
| chart-tooltip-sk |
| 1. `load()` method populates internal properties (_test_name, _y_value, _commit_info, _anomaly, etc.). |
| 2. `_render()` is triggered (implicitly or explicitly). |
| 3. The lit-html template in `static template` is evaluated: |
| - Basic info (test name, value, date) is displayed. |
| - If `commit_info` is present, commit details (author, message, hash) are shown. |
| - If `_anomaly` is present: |
| - Anomaly metrics are displayed. |
| - If `anomaly.bug_id === 0`, `triage-menu-sk` is shown. |
| - If `anomaly.bug_id > 0`, bug ID is shown with an unassociate button. |
| - Pinpoint job links are shown if available. |
| - If `tooltip_fixed` is true: |
| - "Bisect" and "Request Trace" buttons are shown (if configured). |
| - `user-issue-sk` is shown (if not an anomaly). |
| - `json-source-sk` button/link is shown (if configured). |
| - The close icon is visible. |
| 4. Child components like `commit-range-sk`, `point-links-sk`, `user-issue-sk`, `triage-menu-sk` |
| are updated with their respective data. |
| 5. `moveTo()` positions the rendered `div.container` on the screen. |
| | |
| V |
| User interacts with buttons (e.g., "Bisect", "Triage", "Close") |
| | |
| V |
| chart-tooltip-sk or its child components handle the interaction |
| - e.g., clicking "Bisect" calls `openBisectDialog()`, which shows `bisect-dialog-sk`. |
| - e.g., clicking "Close" executes the `_close_button_action` passed during `load()`. |
| ``` |
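| 
| A hypothetical invocation from a parent chart component (the argument names
| and order are illustrative only; consult the actual `load()` signature in
| `chart-tooltip-sk.ts`):
| 
| ```
| const tooltip = document.querySelector('chart-tooltip-sk')! as any;
| 
| // Pin the tooltip for a clicked point (arguments are hypothetical).
| tooltip.load(
|   /* index */ 5,
|   /* testName */ ',benchmark=speedometer,bot=linux-perf,',
|   /* yValue */ 123.4,
|   /* commitPosition */ 110822,
|   /* anomaly */ null,
|   /* tooltipFixed */ true,
|   /* closeAction */ () => {
|     /* hide the tooltip */
|   }
| );
| 
| // Position it near the click; moveTo() clamps to the viewport.
| tooltip.moveTo({ x: 320, y: 140 });
| ```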
| |
| This modular approach ensures that `chart-tooltip-sk` is a self-contained, |
| feature-rich component for displaying detailed contextual information and |
| actions related to data points in performance charts. |
| |
| # Module: /modules/cid |
| |
| ## CID Module Documentation |
| |
| This module, `/modules/cid`, provides functionality for interacting with Commit |
| IDs (CIDs), which are also referred to as CommitNumbers. The primary purpose of |
| this module is to facilitate the retrieval of detailed commit information based |
| on a set of commit numbers and their corresponding sources. |
| |
| ### Design and Implementation |
| |
| The core functionality revolves around the `lookupCids` function. This function |
| is designed to be a simple and efficient way to fetch commit details from a |
| backend endpoint. |
| |
| **Why Asynchronous Operations?** |
| |
| The lookup of commit information involves a network request to a backend service |
| (`/_/cid/`). Network requests are inherently asynchronous. Therefore, |
| `lookupCids` returns a `Promise`. This allows the calling code to continue |
| execution while the commit information is being fetched and to handle the |
| response (or any potential errors) when it becomes available. This non-blocking |
| approach is crucial for maintaining a responsive user interface or efficient |
| server-side processing. |
| |
| **Why JSON for Data Exchange?** |
| |
| JSON (JavaScript Object Notation) is used as the data format for both the |
| request and the response. |
| |
| - **Request:** The input `cids` (an array of `CommitNumber` objects) is |
| serialized into a JSON string and sent in the body of the HTTP POST request. |
| JSON is a lightweight and widely supported format, making it ideal for |
| client-server communication. |
| - **Response:** The backend endpoint is expected to return a JSON response |
| conforming to the `CIDHandlerResponse` type. The `jsonOrThrow` utility |
| (imported from `../../../infra-sk/modules/jsonOrThrow`) is used to parse |
| this JSON response. This utility simplifies error handling by automatically |
| throwing an error if the response is not valid JSON or if the HTTP request |
| itself fails. |
| |
| **Why POST Request?** |
| |
| A POST request is used instead of a GET request for sending the `cids`. While |
| GET requests are often used for retrieving data, they are typically limited in |
| the amount of data that can be sent in the URL (e.g., through query parameters). |
| Since the number of `cids` to look up could be large, sending them in the |
| request body via a POST request is a more robust and scalable approach. The |
| `Content-Type: application/json` header informs the server that the request body |
| contains JSON data. |
| |
| ### Key Components and Files |
| |
| - **`cid.ts`**: This is the sole TypeScript file in the module and contains |
| the implementation of the `lookupCids` function. |
| - **`lookupCids(cids: CommitNumber[]): Promise<CIDHandlerResponse>`**: |
| - **Responsibility**: Takes an array of `CommitNumber` objects and |
| asynchronously fetches detailed commit information for each from the |
| `/_/cid/` backend endpoint. |
| - **How it works**: |
| 1. It constructs an HTTP POST request to the `/_/cid/` endpoint. |
| 2. The `cids` array is converted into a JSON string and included as the |
| request body. |
| 3. Appropriate headers (`Content-Type: application/json`) are set. |
| 4. The `fetch` API is used to make the network request. |
| 5. The response from the server is then processed by `jsonOrThrow`. If |
| the request is successful and the response is valid JSON, it |
| resolves the promise with the parsed `CIDHandlerResponse`. |
| Otherwise, it rejects the promise with an error. |
| - **Dependencies**: |
| - `jsonOrThrow` (from `../../../infra-sk/modules/jsonOrThrow`): For |
| robust JSON parsing and error handling. |
| - `CommitNumber`, `CIDHandlerResponse` (from `../json`): These are |
| type definitions that define the structure of the input commit |
| identifiers and the expected response from the backend. |
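| 
| Given the steps above, the whole function is small; a close sketch (error
| handling is delegated to `jsonOrThrow`):
| 
| ```
| import { jsonOrThrow } from '../../../infra-sk/modules/jsonOrThrow';
| import { CIDHandlerResponse, CommitNumber } from '../json';
| 
| export function lookupCids(
|   cids: CommitNumber[]
| ): Promise<CIDHandlerResponse> {
|   return fetch('/_/cid/', {
|     method: 'POST',
|     body: JSON.stringify(cids), // Serialize the commit numbers.
|     headers: { 'Content-Type': 'application/json' },
|   }).then(jsonOrThrow); // Parse JSON or reject on any failure.
| }
| ```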
| |
| ### Workflow: Looking up Commit IDs |
| |
| The typical workflow for using this module is as follows: |
| |
| ``` |
| Caller | /modules/cid/cid.ts (lookupCids) | Backend Server (/_/cid/) |
| ---------------------------|----------------------------------|------------------------- |
| 1. Has array of CommitNumber objects. |
| | | |
| 2. Calls `lookupCids(cids)`| | |
| `---------------------->`| | |
| | 3. Serializes `cids` to JSON. | |
| | 4. Creates POST request with JSON body. |
| | `--------------------------->`| 5. Receives POST request. |
| | | 6. Processes `cids`. |
| | | 7. Generates `CIDHandlerResponse`. |
| | `<---------------------------`| 8. Sends JSON response. |
| | 9. Receives response. | |
| | 10. `jsonOrThrow` parses response.| |
| | (Throws error on failure) | |
| | | |
| 11. Receives Promise that | | |
| resolves with | | |
| `CIDHandlerResponse` | | |
| (or rejects with error). |
| `<----------------------`| | |
| ``` |
| |
| # Module: /modules/cluster-lastn-page-sk |
| |
| The `cluster-lastn-page-sk` module provides a user interface for testing and |
| configuring alert configurations by running them against a recent range of |
| commits. This allows users to "dry run" an alert to see what regressions it |
| would detect before saving it to run periodically. |
| |
| **Core Functionality:** |
| |
| The primary purpose of this module is to facilitate the iterative process of |
| defining effective alert configurations. Instead of deploying an alert and |
| waiting for it to trigger (potentially with undesirable results), users can |
| simulate its behavior on historical data. This helps in fine-tuning parameters |
| like the detection algorithm, radius, sparsity, and interestingness threshold. |
| |
| **Key Components and Files:** |
| |
| - **`cluster-lastn-page-sk.ts`**: This is the heart of the module, defining |
| the `ClusterLastNPageSk` custom element. |
| |
| - **State Management**: It manages the current alert configuration |
| (`this.state`), the commit range (`this.domain`), and the results of the |
| dry run (`this.regressions`). It utilizes `stateReflector` to |
| potentially persist and restore parts of this state in the URL, allowing |
| users to share specific configurations or test setups. |
| - **User Interaction**: It handles user actions such as: |
| - Editing the alert configuration via a dialog (`alert-config-dialog` |
| which hosts an `alert-config-sk` element). |
| - Modifying the commit range using a `domain-picker-sk` element. |
| - Initiating the dry run (`run()` method). |
| - Saving the configured alert (`writeAlert()` method). |
| - Viewing details of detected regressions in a dialog |
| (`triage-cluster-dialog` which hosts a `cluster-summary2-sk` element). |
| - **API Communication**: |
| - Fetches initial data (paramset for alert configuration, default new |
| alert template) from `/_/initpage/` and `/_/alert/new` respectively. |
| - Sends the alert configuration and commit range to the `/_/dryrun/start` |
| endpoint to initiate the clustering and regression detection process. It |
| uses the `startRequest` utility from `../progress/progress` to handle |
| the asynchronous request and display progress. |
| - Sends the finalized alert configuration to `/_/alert/update` to save or |
| update it in the backend. |
| - **Rendering**: It uses `lit-html` for templating and dynamically renders |
| the UI based on the current state, including the controls, the progress |
| of a running dry run, and a table of detected regressions. The table |
| displays commit details (`commit-detail-sk`) and triage status |
| (`triage-status-sk`) for each detected regression. |
| - **Error Handling**: It displays error messages if the dry run or alert |
| saving fails. |
| |
| - **`cluster-lastn-page-sk.html` (Demo Page)**: A simple HTML file that |
| includes the `cluster-lastn-page-sk` element and an `error-toast-sk` for |
| displaying global error messages. This is primarily used for demonstration |
| and testing purposes. |
| |
| - **`cluster-lastn-page-sk-demo.ts`**: Sets up mock HTTP responses using |
| `fetch-mock` for the demo page. This allows the `cluster-lastn-page-sk` |
| element to function in isolation without needing a live backend. It mocks |
| endpoints like `/_/initpage/`, `/_/alert/new`, `/_/count/`, and |
| `/_/loginstatus/`. |
| |
| - **`cluster-lastn-page-sk.scss`**: Provides the styling for the |
| `cluster-lastn-page-sk` element and its dialogs, ensuring a consistent look |
| and feel with the rest of the Perf application. It uses shared SASS |
| libraries for buttons and themes. |
| |
| **Workflow for Testing an Alert Configuration:** |
| |
| 1. **Load Page**: User navigates to the page. |
| |
| - `cluster-lastn-page-sk` fetches initial paramset and a default new alert |
| configuration. |
| |
| ``` |
| User -> cluster-lastn-page-sk |
| cluster-lastn-page-sk -> GET /_/initpage/ (fetches paramset) |
| cluster-lastn-page-sk -> GET /_/alert/new (fetches default alert) |
| ``` |
| |
| 2. **Configure Alert**: User clicks the "Configure Alert" button. |
| |
| - A dialog (`alert-config-dialog`) opens, showing `alert-config-sk`. |
| - User modifies alert parameters (algorithm, radius, query, etc.). |
| - User clicks "Accept". |
| - The `state` in `cluster-lastn-page-sk` is updated with the new |
| configuration. |
| |
| ``` |
| User --clicks--> "Configure Alert" button |
| cluster-lastn-page-sk --shows--> alert-config-dialog |
| User --interacts with--> alert-config-sk |
| User --clicks--> "Accept" |
| alert-config-sk --updates--> cluster-lastn-page-sk.state |
| ``` |
| |
| 3. **(Optional) Adjust Commit Range**: User interacts with `domain-picker-sk` |
| to define the number of recent commits or a specific date range for the dry |
| run. |
| |
| - `cluster-lastn-page-sk.domain` is updated. |
| |
| 4. **Run Dry Run**: User clicks the "Run" button. |
| |
| - `cluster-lastn-page-sk` constructs a `RegressionDetectionRequest` using |
| the current alert `state` and `domain`. |
| - It sends this request to `/_/dryrun/start`. |
| - The UI shows a spinner and progress messages. |
| - As results (regressions) become available, they are displayed in a |
| table. |
| |
| ``` |
| User --clicks--> "Run" button |
| cluster-lastn-page-sk --creates--> RegressionDetectionRequest |
| cluster-lastn-page-sk --POSTs to--> /_/dryrun/start (with request body) |
| (progress updates via startRequest callback) |
| Backend --processes & clusters--> |
| Backend --sends progress/results--> cluster-lastn-page-sk |
| cluster-lastn-page-sk --updates--> UI (regressions table, status messages) |
| ``` |
| |
| 5. **Review Results**: User examines the table of regressions. |
| |
| - Each row shows a commit and the regressions (low/high) found at that |
| commit. |
| - User can click on a regression to open a `triage-cluster-dialog` |
| (showing `cluster-summary2-sk`) for more details. |
| - From the summary dialog, user can open related traces in the explorer |
| view. |
| |
| 6. **Iterate or Save**: |
| |
| - If results are not satisfactory, user goes back to step 2 to adjust the |
| alert configuration and re-runs. |
| - If results are satisfactory, user clicks "Create Alert" (or "Update |
| Alert" if modifying an existing one). |
| - `cluster-lastn-page-sk` sends the current alert `state` to
| `/_/alert/update`.
| 
| ```
| User --clicks--> "Create Alert" / "Update Alert" button
| cluster-lastn-page-sk --POSTs to--> /_/alert/update (with alert config)
| Backend --saves/updates alert-->
| Backend --responds with ID--> cluster-lastn-page-sk
| cluster-lastn-page-sk --updates--> UI (button text might change to "Update Alert")
| ```
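| 
| A hedged sketch of the two backend calls in steps 4 and 6 (the real code uses
| the `startRequest` helper from `../progress/progress`, which also polls for
| progress; the request body shape shown here is simplified):
| 
| ```
| // Step 4: start a dry run for the current alert config and commit range.
| // The { alert, domain } shape is an assumed simplification of
| // RegressionDetectionRequest.
| async function startDryRun(alert: unknown, domain: unknown): Promise<unknown> {
|   const resp = await fetch('/_/dryrun/start', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify({ alert, domain }),
|   });
|   return resp.json();
| }
| 
| // Step 6: persist the alert configuration once the dry run looks right.
| async function writeAlert(alert: unknown): Promise<unknown> {
|   const resp = await fetch('/_/alert/update', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify(alert),
|   });
|   return resp.json(); // The backend responds with the alert ID.
| }
| ```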
| |
| **Design Decisions:** |
| |
| - **Client-Side Dry Run Initiation**: The "dry run" is initiated from the |
| client, sending the full alert configuration. This allows immediate feedback |
| and iteration without needing to first save an incomplete or experimental |
| alert to the backend. |
| - **Component-Based UI**: The UI is built using custom elements (e.g., |
| `alert-config-sk`, `domain-picker-sk`, `cluster-summary2-sk`). This promotes |
| modularity, reusability, and separation of concerns. |
| - **Asynchronous Operations with Progress**: Long-running operations like the |
| dry run are handled asynchronously with visual feedback (spinners, status |
| messages) provided by the `../progress/progress` utility, enhancing user |
| experience. |
| - **State Reflection**: Using `stateReflector` allows parts of the page's |
| state (like the alert configuration) to be encoded in the URL. This is |
| useful for sharing specific test scenarios or bookmarking them. |
| - **Dialogs for Focused Interaction**: Modal dialogs are used for alert |
| configuration and viewing regression summaries, preventing users from |
| interacting with the main page content while these tasks are in progress, |
| thus guiding their focus. |
| - **Mocking for Demo/Testing**: The demo page |
| (`cluster-lastn-page-sk-demo.ts`) heavily relies on `fetch-mock`. This |
| enables isolated development and testing of the UI component without a |
| backend dependency, which is crucial for frontend unit/integration tests and |
| local development. |
| |
| # Module: /modules/cluster-page-sk |
| |
| The `cluster-page-sk` module provides the user interface for Perf's trace |
| clustering functionality. This allows users to identify groups of traces that |
| exhibit similar behavior, which is crucial for understanding performance |
| regressions or improvements across different configurations and tests. |
| |
| **Core Functionality and Design:** |
| |
| The primary goal of this page is to allow users to define a set of traces and |
| then apply a clustering algorithm to them. The "why" behind this is to simplify |
| the analysis of large datasets by grouping related performance changes. Instead |
| of manually inspecting hundreds or thousands of individual traces, users can |
| focus on a smaller number of clusters, each representing a distinct performance |
| pattern. |
| |
| The "how" involves several key components: |
| |
| 1. **Defining the Scope of Analysis:** |
| |
| - **Commit Selection:** Users first select a central commit around which |
| the analysis will be performed. This is handled by |
| `commit-detail-picker-sk`. The clustering will typically look at commits |
| before and after this selected point. The `state.offset` property stores |
| the selected commit's offset. |
| - **Query:** Users define the set of traces to consider using a query |
| string. This is managed by `query-sk` and `paramset-sk`. The |
| `state.query` holds this query. The `query-count-sk` element provides |
| feedback on how many traces match the current query. |
| - **Time Range/Commit Radius:** Users can specify a "radius" (in terms of |
| number of commits) around the selected commit to include in the |
| analysis. This is stored in `state.radius`. |
| |
| 2. **Clustering Algorithm and Parameters:** |
| |
| - **Algorithm Selection:** Users can choose the clustering algorithm |
| (e.g., k-means). This is facilitated by `algo-select-sk` and stored in |
| `state.algo`. The choice of algorithm impacts how clusters are formed |
| and what "similarity" means. |
| - **Number of Clusters (K):** For algorithms like k-means, the user can |
| suggest the number of clusters to find. A value of 0 typically means the |
| server will try to determine an optimal K. This is stored in `state.k`. |
| - **Interestingness Threshold:** Users can define a threshold for what |
| constitutes an "interesting" cluster, often based on the magnitude of |
| regression or step size. This is `state.interesting`. |
| - **Sparse Data Handling:** An option (`state.sparse`) allows users to |
| indicate if the data is sparse, meaning not all traces have data points |
| for all commits. This affects how the clustering algorithm processes |
| missing data. |
| |
| 3. **Executing the Clustering and Displaying Results:** |
| |
| - **Initiating the Request:** The "Run" button triggers the clustering |
| process. The `start()` method constructs a `RegressionDetectionRequest` |
| object containing all the user-defined parameters. This request is sent |
| to the `/_/cluster/start` endpoint. |
| - **Background Processing and Progress:** Clustering can be a long-running |
| operation. The module uses the `progress` utility to manage the |
| asynchronous request. It displays a spinner (`spinner-sk`) and status |
| messages (`ele.status`, `ele.runningStatus`) to keep the user informed. |
| The `requestId` property tracks the active request. |
| - **Displaying Clusters:** Once the server responds, the |
| `RegressionDetectionResponse` contains a list of `FullSummary` objects. |
| Each `FullSummary` represents a discovered cluster. These are rendered |
| using multiple `cluster-summary2-sk` elements. This component is |
| responsible for visualizing the details of each cluster, including its |
| member traces and regression information. |
| - **Sorting Results:** Users can sort the resulting clusters by various |
| metrics (size, regression score, etc.) using `sort-sk`. |
| |
| **State Management:** |
| |
| The `cluster-page-sk` component maintains its internal state in a `State` |
| object. This includes user selections like the query, commit offset, algorithm, |
| and various parameters. Crucially, this state is reflected in the URL using the |
| `stateReflector` utility. This design decision ensures that: |
| |
| - The page is bookmarkable: Users can save and share URLs that directly lead |
| to a specific clustering configuration and its results. |
| - Browser history (back/forward buttons) works as expected. |
| - The application state is serializable and easily reproducible. |
| |
| The `stateHasChanged()` method is called whenever a piece of the state is |
| modified, triggering the `stateReflector` to update the URL and potentially |
| re-render the component. |
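| 
| Pulling those fields together, the URL-reflected state looks roughly like this
| (a sketch; the authoritative field list and defaults live in the `State` class
| in `cluster-page-sk.ts`):
| 
| ```
| // Sketch of the user-configurable state mirrored into the URL by
| // stateReflector. Defaults shown here are illustrative.
| class State {
|   query: string = '';       // Trace selection query, e.g. 'config=gpu'.
| 
|   offset: number = -1;      // CommitNumber offset of the central commit.
| 
|   radius: number = 7;       // Commits on each side to include.
| 
|   algo: string = 'kmeans';  // Clustering algorithm.
| 
|   k: number = 0;            // Suggested cluster count; 0 lets the server pick.
| 
|   interesting: number = 50; // Interestingness threshold.
| 
|   sparse: boolean = false;  // Whether traces may be missing data points.
| }
| ```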
| |
| **Key Files and Their Roles:** |
| |
| - **`cluster-page-sk.ts`:** This is the main TypeScript file defining the |
| `ClusterPageSk` custom element. It orchestrates all the sub-components, |
| manages the application state, handles user interactions (e.g., button |
| clicks, input changes), makes API calls for clustering, and renders the |
| results. It defines the overall layout and logic of the clustering page. |
| - **HTML template (defined in `cluster-page-sk.ts`):** There is no separate
| `.html` source file; as a LitElement-style component, the template is
| declared inline using `lit-html`. It structures the page, embedding various
| custom elements for commit selection, query building, algorithm choice, and
| result display.
| - **`cluster-page-sk.scss`:** Provides the specific styling for the |
| `cluster-page-sk` element and its layout, ensuring a consistent look and |
| feel. |
| - **`index.ts`:** A simple entry point that imports and registers the |
| `cluster-page-sk` custom element, making it available for use in HTML. |
| - **`cluster-page-sk-demo.ts` & `cluster-page-sk-demo.html`:** These files set |
| up a demonstration page for the `cluster-page-sk` element. |
| `cluster-page-sk-demo.ts` uses `fetch-mock` to simulate API responses, |
| allowing the component to be developed and tested in isolation without |
| needing a live backend. This is crucial for rapid development and ensuring |
| the UI behaves correctly under various backend scenarios. |
| - **`State` class (within `cluster-page-sk.ts`):** Defines the structure of |
| the data that is persisted in the URL and drives the component's behavior. |
| It encapsulates all user-configurable options for the clustering process. |
| |
| **Workflow Example: Performing a Cluster Analysis** |
| |
| ``` |
| User Interaction | Component/State Change | Backend Interaction |
| -----------------------------------------|-------------------------------|--------------------- |
| 1. User navigates to the cluster page. | `ClusterPageSk` initializes. | Fetches initial paramset (`/_/initpage/`) |
| | `stateReflector` initializes | |
| | from URL or defaults. | |
| | | |
| 2. User selects a commit. | `commit-detail-picker-sk` | (Potentially fetches commit details if not cached) |
| | emits `commit-selected`. | |
| | `state.offset` updates. | |
| | `stateHasChanged()` called. | |
| | | |
| 3. User types a query (e.g., "config=gpu").| `query-sk` emits | (Potentially `/_/count/` to update trace count)
| | `query-change`. | |
| | `state.query` updates. | |
| | `stateHasChanged()` called. | |
| | | |
| 4. User selects an algorithm (e.g., kmeans).| `algo-select-sk` emits | |
| | `algo-change`. | |
| | `state.algo` updates. | |
| | `stateHasChanged()` called. | |
| | | |
| 5. User adjusts advanced parameters | Input elements update | |
| (K, radius, interestingness). | corresponding `state` props. | |
| | `stateHasChanged()` called. | |
| | | |
| 6. User clicks "Run". | `start()` method is called. | POST to `/_/cluster/start` with `RegressionDetectionRequest` |
| | `requestId` is set. | (This is a long-running request) |
| | Spinner becomes active. | |
| | | |
| 7. Page periodically updates status. | `progress` utility polls for | GET requests to check progress. |
| | updates. | |
| | `ele.runningStatus` updates. | |
| | | |
| 8. Clustering completes. | `progress` utility resolves. | Final response from `/_/cluster/start` (or progress endpoint) |
| | `summaries` array is populated| containing `RegressionDetectionResponse`. |
| | with cluster data. | |
| | `requestId` is cleared. | |
| | Spinner stops. | |
| | | |
| 9. Results are displayed. | `ClusterPageSk` re-renders, | |
| | showing `cluster-summary2-sk` | |
| | elements for each cluster. | |
| ``` |
| |
| This workflow highlights how user inputs are translated into state changes, |
| which then drive API requests and ultimately update the UI to present the |
| clustering results. The separation of concerns among various sub-components (for |
| query, commit selection, etc.) makes the main `cluster-page-sk` element more |
| manageable. |
| |
| # Module: /modules/cluster-summary2-sk |
| |
| The `cluster-summary2-sk` module provides a custom HTML element for displaying |
| detailed information about a cluster of performance test results. This includes |
| visualizing the trace data, showing regression statistics, and allowing users to |
| triage the cluster. |
| |
| **Core Functionality and Design:** |
| |
| The primary purpose of this element is to present a comprehensive summary of a |
| performance cluster. It aims to provide all necessary information for a user to |
| understand the nature of a performance change (regression or improvement) and |
| take appropriate action (e.g., filing a bug, marking it as expected). |
| |
| Key design considerations include: |
| |
| - **Data Visualization:** A `plot-simple-sk` element is used to display the |
| centroid trace of the cluster over time. This visual representation helps |
| users quickly grasp the trend and identify the point of change. An "x-bar" |
| can be displayed on the plot to highlight the specific commit where a step |
| change is detected. |
| - **Statistical Summary:** The element displays key statistics about the |
| cluster, such as its size, the regression factor, step size, and least |
| squares error. The labels and formatting of these statistics dynamically |
| adapt based on the `StepDetection` algorithm used (e.g., 'absolute', |
| 'percent', 'mannwhitneyu'). This ensures that the presented information is |
| relevant and interpretable for the specific detection method. |
| - **Commit Details:** Integration with `commit-detail-panel-sk` allows users |
| to view details of the commit associated with the detected step point or any |
| selected point on the trace plot. This is crucial for correlating |
| performance changes with specific code modifications. |
| - **Triaging:** If not disabled via the `notriage` attribute, the element |
| includes a `triage2-sk` component. This allows authenticated users with |
| "editor" privileges to set the triage status (e.g., "positive", "negative", |
| "untriaged") and add a message. This functionality is essential for tracking |
| the investigation and resolution of performance issues. |
| - **Contextual Actions:** Buttons are provided to: |
| - "View on dashboard": Opens the current cluster view in a broader |
| explorer context, pre-filling relevant parameters like shortcut ID and |
| time range. |
| - "Word Cloud": Toggles the visibility of a `word-cloud-sk` element, which |
| displays a summary of the parameters that make up the traces in the |
| cluster. This helps in understanding the common characteristics of the |
| affected tests. |
| - A permalink is generated to directly link to the triage page for the |
| specific step point. |
| - **Interactive Exploration:** The `commit-range-sk` component allows users to |
| define a range around the detected step or a selected commit, facilitating |
| further investigation within the Perf application. |
| |
| **Key Components and Their Roles:** |
| |
| - **`cluster-summary2-sk.ts`**: This is the main TypeScript file defining the |
| `ClusterSummary2Sk` custom element. |
| - **`ClusterSummary2Sk` class:** Extends `ElementSk` and manages the |
| element's state, rendering, and event handling. |
| - **Data Properties (`full_summary`, `triage`, `alert`):** These |
| properties receive the core data for the cluster. When `full_summary` is |
| set, it triggers the rendering of the plot, statistics, and commit |
| details. The `alert` property determines the labels and formatting for |
| regression statistics. The `triage` property reflects the current triage |
| state. |
| - **Template (`template` static method):** Uses `lit-html` to define the |
| element's structure, binding data to various sub-components and display |
| areas. |
| - **Event Handling:** |
| - `open-keys`: Fired when the "View on dashboard" button is clicked, |
| providing details for opening the explorer. |
| - `triaged`: Fired when the triage status is updated, containing the |
| new status and the relevant commit information. |
| - `trace_selected`: Handles events from `plot-simple-sk` when a point |
| on the graph is clicked, triggering a lookup for the corresponding |
| commit details. |
| - **Helper Methods:** |
| - `statusClass()`: Determines the CSS class for the regression display |
| based on the severity (e.g., "high", "low"). |
| - `permaLink()`: Generates a URL to the triage page focused on the |
| step point. |
| - `lookupCids()` (static): A static method (delegating to |
| `../cid/cid.ts`) used to fetch commit details based on commit |
| numbers. |
| - **`labelsForStepDetection`:** A crucial constant object that maps |
| different `StepDetection` algorithm names (e.g., 'percent', |
| 'mannwhitneyu', 'absolute') to specific labels and number formatting |
| functions for the regression statistics. This ensures that the displayed |
| information is meaningful and correctly interpreted for the algorithm |
| used to detect the cluster (a sketch follows this file list). |
| - **`cluster-summary2-sk.html` (template, rendered by |
| `cluster-summary2-sk.ts`):** Defines the visual layout using HTML and |
| embedded custom elements. It uses a CSS grid for positioning the main |
| sections: regression summary, statistics, plot, triage status, commit |
| details, actions, and word cloud. |
| - **`cluster-summary2-sk.scss`**: Provides the styling for the element. It |
| defines how different sections are displayed, including styles for |
| regression severity (e.g., red for "high" regressions, green for "low"), |
| button appearances, and responsive behavior (hiding the plot on smaller |
| screens). |
| - **`cluster-summary2-sk-demo.html` and `cluster-summary2-sk-demo.ts`**: These |
| files set up a demonstration page for the `cluster-summary2-sk` element. The |
| `.ts` file provides mock data for `FullSummary`, `Alert`, and `TriageStatus` |
| to populate the demo instances of the element. It also demonstrates how to |
| listen for the `triaged` and `open-keys` custom events. |
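| |
| The `labelsForStepDetection` mapping might look roughly like the sketch |
| below; the concrete labels and formatters are illustrative assumptions, not |
| the module's verbatim table: |
| |
| ``` |
| // Hedged sketch of labelsForStepDetection; the real labels and number |
| // formatters in cluster-summary2-sk.ts may differ. |
| type StepLabels = { |
|   regression: string; |
|   stepSize: string; |
|   lse: string; |
|   format: (n: number) => string; |
| }; |
| |
| const labelsForStepDetection: { [algo: string]: StepLabels } = { |
|   absolute: { |
|     regression: 'Regression Factor', |
|     stepSize: 'Absolute Change', |
|     lse: 'Least Squares Error', |
|     format: (n) => n.toPrecision(3), |
|   }, |
|   percent: { |
|     regression: 'Regression Factor', |
|     stepSize: 'Percent Change', |
|     lse: 'Least Squares Error', |
|     format: (n) => `${(100 * n).toFixed(1)}%`, |
|   }, |
|   mannwhitneyu: { |
|     regression: 'U Statistic', |
|     stepSize: 'Step Size', |
|     lse: 'p-value', |
|     format: (n) => n.toExponential(2), |
|   }, |
| }; |
| ``` |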
| |
| **Workflows:** |
| |
| 1. **Initialization and Data Display:** |
| |
| - The host application provides `full_summary` (containing cluster data |
| and trace frame), `alert` (details of the alert that triggered this |
| cluster), and optionally `triage` (current triage status) properties to |
| the `cluster-summary2-sk` element. |
| - `set full_summary()`: |
| - Updates internal `summary` and `frame` data. |
| - Populates dataset attributes for sorting (e.g., `data-clustersize`). |
| - Clears and redraws the `plot-simple-sk` with the centroid trace from |
| `summary.centroid` and time labels from `frame.dataframe.header`. |
| - If a step point is identified and the status is not "Uninteresting", |
| an x-bar is placed on the plot at the corresponding commit. |
| - `lookupCids` is called to fetch and display details for the commit |
| at the step point in `commit-detail-panel-sk`. |
| - `set alert()`: |
| - Updates the `labels` used for displaying regression statistics based |
| on `alert.step` and `labelsForStepDetection`. |
| - `set triage()`: |
| - Updates the `triageStatus` and re-renders the triage controls. |
| - The element renders based on the provided data, displaying statistics, |
| plot, commit details, and triage controls. |
| |
| ``` |
| Host Application cluster-summary2-sk |
| ---------------- ------------------- |
| [Set full_summary data] --> Process data |
| | |
| +-> plot-simple-sk (Draws trace) |
| | |
| +-> commit-detail-panel-sk (Shows step commit) |
| | |
| +-> Display stats (regression, size, etc.) |
| |
| [Set alert data] ---------> Update regression labels/formatters |
| |
| [Set triage data] --------> Update triage2-sk state |
| ``` |
| |
| 2. **User Triage:** |
| |
| - User interacts with `triage2-sk` (selects status) and the message input |
| field. |
| - User clicks the "Update" button. |
| - `update()` method is called: |
| - A `ClusterSummary2SkTriagedEventDetail` object is created |
| containing the `step_point` (as `columnHeader`) and the current |
| `triageStatus`. |
| - A `triaged` custom event is dispatched with this detail. |
| - The host application listens for the `triaged` event to persist the |
| triage status. |
| |
| ``` |
| User cluster-summary2-sk Host Application |
| ---- ------------------- ---------------- |
| Selects status ----> [triage2-sk updates value] |
| Types message ----> [Input updates value] |
| Clicks "Update" ---> update() |
| | |
| +-> Creates TriagedEventDetail |
| | |
| +-> Dispatches "triaged" event --> Listens and handles event |
| (e.g., saves to backend) |
| ``` |
| |
| 3. **Viewing on Dashboard:** |
| |
| - User clicks the "View on dashboard" button. |
| - `openShortcut()` method is called: |
| - A `ClusterSummary2SkOpenKeysEventDetail` object is created with the |
| `shortcut` ID, `begin` and `end` timestamps from the frame, and the |
| `step_point` as `xbar`. |
| - An `open-keys` custom event is dispatched. |
| - The host application listens for `open-keys` and navigates the user to |
| the explorer view with the provided parameters. |
| |
| ``` |
| User cluster-summary2-sk Host Application |
| ---- ------------------- ---------------- |
| Clicks "View on dash" --> openShortcut() |
| | |
| +-> Creates OpenKeysEventDetail |
| | |
| +-> Dispatches "open-keys" event --> Listens and handles event |
| (e.g., navigates to explorer) |
| ``` |
| |
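| A host page might wire up these two events as in the hedged sketch below; |
| the event names and detail types come from this module, while the handler |
| bodies and the explorer URL are illustrative: |
| |
| ``` |
| // Hedged sketch of a host application consuming the element's events. |
| const summary = document.querySelector('cluster-summary2-sk')!; |
| |
| summary.addEventListener('triaged', (e: Event) => { |
|   // Detail is a ClusterSummary2SkTriagedEventDetail (columnHeader + triage). |
|   const detail = (e as CustomEvent).detail; |
|   // e.g. persist detail.triage for detail.columnHeader on the backend. |
|   console.log('triaged', detail); |
| }); |
| |
| summary.addEventListener('open-keys', (e: Event) => { |
|   // Detail is a ClusterSummary2SkOpenKeysEventDetail. |
|   const d = (e as CustomEvent).detail; |
|   // Illustrative navigation; the real explorer URL format may differ. |
|   window.location.href = `/e/?keys=${d.shortcut}&begin=${d.begin}&end=${d.end}`; |
| }); |
| ``` |
| |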
| The `cluster-summary2-sk` element plays a vital role in the Perf frontend by |
| providing a focused and interactive view for analyzing individual performance |
| regressions or improvements identified through clustering. Its integration with |
| plotting, commit details, and triaging makes it a key tool for performance |
| analysis workflows. |
| |
| # Module: /modules/commit-detail-panel-sk |
| |
| ## Commit Detail Panel SK |
| |
| **High-level Overview:** |
| |
| The `commit-detail-panel-sk` module provides a custom HTML element |
| `<commit-detail-panel-sk>` designed to display a list of commit details. It |
| offers functionality to make these commit entries selectable and emits an event |
| when a commit is selected. This component is primarily used in user interfaces |
| where users need to browse and interact with a sequence of commits. |
| |
| **Why and How:** |
| |
| The core purpose of this module is to present commit information in a structured |
| and interactive way. Instead of simply displaying raw commit data, it leverages |
| the `commit-detail-sk` element (an external dependency) to render each commit |
| with relevant information like author, message, and a link to the commit. |
| |
| The design decision to make commits selectable (via the `selectable` attribute) |
| enhances user interaction. When a commit is clicked in "selectable" mode, it |
| triggers a `commit-selected` custom event. This event carries detailed |
| information about the selected commit, including its index in the list, a |
| concise description, and the full commit object. This allows parent components |
| or applications to react to user selections and perform actions based on the |
| chosen commit (e.g., loading further details, navigating to a specific state). |
| |
| The implementation uses Lit library for templating and rendering. The commit |
| data is provided via the `details` property, which expects an array of `Commit` |
| objects (defined in `perf/modules/json`). The component dynamically generates |
| table rows for each commit. |
| |
| The visual appearance is controlled by `commit-detail-panel-sk.scss`. It defines |
| styles for the panel, including highlighting the selected row and adjusting |
| opacity based on the `selectable` state. The styling aims for a clean and |
| readable presentation of commit information. |
| |
| A `hide` property is also available to conditionally show or hide the entire |
| commit list. This is useful for scenarios where the panel's visibility needs to |
| be controlled dynamically by the parent application. |
| |
| **Key Components/Files:** |
| |
| - **`commit-detail-panel-sk.ts`**: This is the heart of the module. It defines |
| the `CommitDetailPanelSk` class, which extends `ElementSk`. |
| - **Responsibilities**: |
| - Manages the list of `Commit` objects (`_details` property). |
| - Renders the list of commits as an HTML table using Lit templates |
| (`template` and `rows` static methods). |
| - Handles user clicks on table rows (`_click` method). |
| - When a commit is selected (and the `selectable` attribute is present), |
| it dispatches the `commit-selected` custom event with relevant commit |
| data. |
| - Manages the `selectable`, `selected`, and `hide` attributes and their |
| corresponding properties, re-rendering the component when these change. |
| - Integrates the `commit-detail-sk` element to display individual commit |
| details within each row. |
| - **`commit-detail-panel-sk.scss`**: This file contains the SASS styles for |
| the component. |
| - **Responsibilities**: |
| - Defines the visual appearance of the commit panel, including link |
| colors, table cell padding, and selected row highlighting. |
| - Adjusts the opacity and cursor style based on whether the panel is |
| `selectable`. |
| - Leverages theme variables (e.g., `--primary`, `--surface-1dp`) from |
| `//perf/modules/themes:themes_sass_lib` for consistent theming. |
| - **`commit-detail-panel-sk-demo.ts` and `commit-detail-panel-sk-demo.html`**: |
| These files provide a demonstration page for the component. |
| - **Responsibilities**: |
| - Illustrate how to use the `<commit-detail-panel-sk>` element in an HTML |
| page. |
| - Show examples of the component in both selectable and non-selectable |
| states, and in light/dark themes. |
| - Demonstrate how to provide commit data to the `details` property and how |
| to listen for the `commit-selected` event. |
| - **`index.ts`**: A simple entry point that imports and registers the |
| `commit-detail-panel-sk` custom element, making it available for use. |
| - **`BUILD.bazel`**: Defines how the module is built and its dependencies. For |
| instance, it declares `commit-detail-sk` as a runtime dependency and Lit as |
| a TypeScript dependency. |
| - **`commit-detail-panel-sk_puppeteer_test.ts`**: Contains Puppeteer tests to |
| verify the component's rendering and basic functionality. |
| |
| **Key Workflows:** |
| |
| 1. **Initialization and Rendering:** |
| |
| ``` |
| Parent Application --> Sets 'details' property of <commit-detail-panel-sk> with Commit[] |
| | |
| V |
| commit-detail-panel-sk.ts --> _render() is called |
| | |
| V |
| Lit template generates <table> |
| | |
| V |
| For each Commit in 'details': |
| Generates <tr> containing <commit-detail-sk .cid=Commit> |
| ``` |
| |
| 2. **Commit Selection (when `selectable` is true):** |
| |
|    ``` |
|    User --> Clicks on a <tr> in the <commit-detail-panel-sk> |
|        | |
|        V |
|    commit-detail-panel-sk.ts --> _click(event) handler is invoked |
|        | |
|        V |
|    Determines the clicked commit's index and data |
|        | |
|        V |
|    Sets 'selected' attribute/property to the index of the clicked commit |
|        | |
|        V |
|    Dispatches 'commit-selected' CustomEvent with |
|      { selected: index, description: string, commit: Commit } |
|        | |
|        V |
|    Parent Application --> Listens for 'commit-selected' event and |
|      processes the event.detail |
|    ``` |
| |
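| Wiring this up from a parent looks roughly like the hedged sketch below; the |
| import paths and the commit literal are illustrative (the full `Commit` type |
| lives in `perf/modules/json`): |
| |
| ``` |
| // Hedged usage sketch for <commit-detail-panel-sk>. |
| import './index'; // Registers the custom element. |
| import { CommitDetailPanelSk } from './commit-detail-panel-sk'; |
| |
| const panel = document.createElement( |
|   'commit-detail-panel-sk' |
| ) as CommitDetailPanelSk; |
| panel.setAttribute('selectable', ''); |
| // Abbreviated commit objects; real Commit values carry more fields. |
| panel.details = [ |
|   { |
|     hash: 'abc123', |
|     author: 'user@example.com', |
|     message: 'Fix rendering bug', |
|     url: 'https://example.com/+/abc123', |
|   }, |
| ] as any; |
| document.body.appendChild(panel); |
| |
| panel.addEventListener('commit-selected', (e: Event) => { |
|   const { selected, description, commit } = (e as CustomEvent).detail; |
|   console.log(`Row ${selected} selected: ${description}`, commit); |
| }); |
| ``` |
| |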
| The design favors declarative attribute-based configuration (e.g., `selectable`, |
| `selected`) and event-driven communication for user interactions, which are |
| common patterns in web component development. |
| |
| # Module: /modules/commit-detail-picker-sk |
| |
| The `commit-detail-picker-sk` module provides a user interface element for |
| selecting a specific commit from a range of commits. It's designed to be a |
| reusable component that simplifies the process of commit selection within |
| applications that need to interact with commit histories. |
| |
| **Core Functionality and Design:** |
| |
| The primary purpose of `commit-detail-picker-sk` is to allow users to browse and |
| select a commit. This is achieved by presenting a button that, when clicked, |
| opens a dialog. |
| |
| - **Button as Entry Point:** The button displays a summary of the currently |
|   selected commit (author and message) or a default message like "Choose a |
|   commit." This provides immediate context to the user. Clicking this button |
|   triggers the opening of the selection dialog. |
| |
|   ``` |
|   [Button: "Author - Commit Message"] --- (click) ---> [Dialog Opens] |
|   ``` |
| |
| - **Dialog for Selection:** The dialog is the main interaction point for |
| choosing a commit. It contains: |
| - `commit-detail-panel-sk`: This submodule is responsible for displaying |
| the list of commits fetched from the backend. Users can click on a |
| commit in this panel to select it. |
|   - **Date Range Selection:** A `day-range-sk` component allows users to |
|     specify a time window for fetching commits. This is crucial for |
|     performance and usability, as it prevents loading an overwhelming |
|     number of commits at once. When the date range changes, the component |
|     automatically fetches the relevant commits. |
| |
|     ``` |
|     [day-range-sk] -- (date range change) --> [Fetch Commits for New Range] |
|         | |
|         V |
|     [commit-detail-panel-sk updates] |
|     ``` |
| |
| - **Spinner:** A `spinner-sk` element provides visual feedback to the user |
| while commits are being fetched, indicating that an operation is in |
| progress. |
| - **Close Button:** Allows the user to dismiss the dialog without making a |
| selection or after a selection is made. |
| |
| **Data Flow and State Management:** |
| |
| 1. **Initialization:** When the component is first loaded, it initializes with |
| a default date range (typically the last 24 hours). It then fetches the |
| commits within this initial range. |
| 2. **Fetching Commits:** The component makes a POST request to the |
|    `/_/cidRange/` endpoint. The request body includes the `begin` and `end` |
|    timestamps of the desired range and optionally the `offset` of a |
|    currently selected commit (to ensure it's included in the results if it |
|    falls outside the new range). |
| |
|    ``` |
|    User Action (e.g., change date range) |
|        | |
|        V |
|    [commit-detail-picker-sk] |
|        |  (Constructs RangeRequest: {begin, end, offset}) |
|        V |
|    POST /_/cidRange/ |
|        |  (Receives Commit[] array) |
|        V |
|    [commit-detail-picker-sk] |
|        |  (Updates internal 'details' array) |
|        V |
|    [commit-detail-panel-sk] (Re-renders with new commit list) |
|    ``` |
| |
| 3. **Commit Selection:** |
| |
|    - When a user selects a commit in the `commit-detail-panel-sk`, the panel |
|      emits a `commit-selected` event. |
|    - `commit-detail-picker-sk` listens for this event and updates its |
|      internal `selected` index. |
|    - The dialog is then closed, and the main button's text updates to |
|      reflect the new selection. |
|    - Crucially, `commit-detail-picker-sk` itself emits a `commit-selected` |
|      event. This allows parent components to react to the user's choice. |
|      The detail of this event is of type |
|      `CommitDetailPanelSkCommitSelectedDetails`, containing information |
|      about the selected commit. |
| |
|    ``` |
|    [commit-detail-panel-sk] -- (internal click on a commit) |
|        | |
|        V |
|    Emits 'commit-selected' (internal) |
|        | |
|        V |
|    [commit-detail-picker-sk] -- (handles internal event) |
|        | |
|        V |
|    Updates 'selected' index |
|    Updates button text |
|    Closes dialog |
|    Emits 'commit-selected' (external) |
|    ``` |
| 4. **External Selection (`selection` property):** The component exposes a |
| `selection` property (of type `CommitNumber`). If this property is set |
| externally, the component will attempt to fetch commits around that |
| `CommitNumber` and pre-select it in the panel. |
| |
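| Programmatic use from a parent might look like the hedged sketch below; the |
| property and event names come from this section, and the commit number is |
| illustrative: |
| |
| ``` |
| // Hedged usage sketch for <commit-detail-picker-sk>. |
| import './index'; // Registers the custom element. |
| |
| const picker = document.querySelector('commit-detail-picker-sk')!; |
| |
| // Pre-select a commit by its CommitNumber; the picker fetches commits |
| // around it and highlights it in the panel. |
| (picker as any).selection = 64809; |
| |
| // React to the user's final choice. |
| picker.addEventListener('commit-selected', (e: Event) => { |
|   // Detail is of type CommitDetailPanelSkCommitSelectedDetails. |
|   console.log('User picked commit:', (e as CustomEvent).detail.commit); |
| }); |
| ``` |
| |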
| **Key Files and Responsibilities:** |
| |
| - **`commit-detail-picker-sk.ts`:** This is the core TypeScript file defining |
| the `CommitDetailPickerSk` custom element. |
| - **Why:** It orchestrates the interactions between the button, dialog, |
| `commit-detail-panel-sk`, and `day-range-sk`. It handles fetching commit |
| data, managing the selection state, and emitting the final |
| `commit-selected` event. |
| - **How:** It uses the Lit library for templating and rendering. It |
| defines methods for opening/closing the dialog (`open()`, `close()`), |
| handling range changes (`rangeChange()`), updating the commit list |
| (`updateCommitSelections()`), and processing selections from the panel |
| (`panelSelect()`). The `selection` getter/setter allows for programmatic |
| control of the selected commit. |
| - **`commit-detail-picker-sk.scss`:** Contains the SASS/CSS styles for the |
| component. |
| - **Why:** To provide a consistent visual appearance and layout for the |
| button and the dialog, ensuring it integrates well with the overall |
| application theme (e.g., light and dark modes via CSS variables like |
| `--on-background`, `--background`). |
| - **How:** It styles the `dialog` element, the buttons within it, and |
| ensures proper display and spacing of child components like |
| `day-range-sk`. |
| - **`commit-detail-picker-sk-demo.html` & `commit-detail-picker-sk-demo.ts`:** |
| These files provide a demonstration page for the component. |
| - **Why:** To showcase the component's functionality in isolation, making |
| it easier to test and understand its usage. The demo also includes |
| examples for light and dark themes. |
| - **How:** The HTML sets up basic page structure and placeholders for the |
| component. The TypeScript file initializes instances of |
| `commit-detail-picker-sk`, mocks the backend API call (`/_/cidRange/`) |
| using `fetch-mock` to provide sample commit data, and sets up an event |
| listener to display the `commit-selected` event details. |
| - **Dependencies:** |
| - `commit-detail-panel-sk`: Used within the dialog to list and allow |
| selection of individual commits. `commit-detail-picker-sk` passes the |
| fetched `details` (array of `Commit` objects) to this panel. |
| - `day-range-sk`: Used to allow the user to define the time window for |
| which commits should be fetched. Its `day-range-change` event triggers a |
| refetch in the picker. |
| - `spinner-sk`: Provides visual feedback during data loading. |
| - `ElementSk`: Base class from `infra-sk` providing common custom element |
| functionality. |
| - `jsonOrThrow`: Utility for parsing JSON responses and throwing an error |
| if parsing fails or the response is not OK. |
| - `errorMessage`: Utility for displaying error messages to the user. |
| |
| The design focuses on encapsulation: the `commit-detail-picker-sk` component |
| manages its internal state (current range, fetched commits, selected index) and |
| exposes a clear interface for interaction (a button to open, a `selection` |
| property, and a `commit-selected` event). This makes it easy to integrate into |
| larger applications that require users to pick a commit from a potentially large |
| history. |
| |
| # Module: /modules/commit-detail-sk |
| |
| ## commit-detail-sk |
| |
| The `commit-detail-sk` module provides a custom HTML element |
| `<commit-detail-sk>` designed to display concise information about a single |
| commit. This element is crucial for user interfaces where presenting commit |
| details in a structured and interactive manner is necessary. |
| |
| ### Why |
| |
| In applications dealing with version control systems, there's often a need to |
| display details of individual commits. This could be for reviewing changes, |
| navigating commit history, or linking to related actions like exploring code |
| changes, viewing clustered data, or triaging issues associated with a commit. |
| The `commit-detail-sk` element encapsulates this functionality, offering a |
| reusable and consistent way to present commit information. |
| |
| ### How |
| |
| The core of the module is the `CommitDetailSk` class, which extends `ElementSk`. |
| This class defines the structure and behavior of the `<commit-detail-sk>` |
| element. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`commit-detail-sk.ts`**: This is the heart of the module. |
| |
| - It defines the `CommitDetailSk` custom element. |
| - The element takes a `Commit` object (defined in `perf/modules/json`) as |
| input via the `cid` property. This object contains details like the |
| commit hash, author, message, timestamp, and URL. |
| - The `template` function, using `lit-html`, defines the HTML structure of |
| the element. It displays: |
| - A truncated commit hash. |
| - The commit author. |
| - The time elapsed since the commit (human-readable, via `diffDate`). |
| - The commit message. |
| - It also renders a set of Material Design outlined buttons: "Explore", |
| "Cluster", "Triage", and "Commit". These buttons are intended to |
| navigate the user to different views or actions related to the specific |
| commit. The links for these buttons are dynamically generated based on |
| the commit hash and the `cid.url`. |
| - The `openLink` method handles the click events on these buttons, opening |
| the respective links in a new browser window/tab. |
| - `upgradeProperty` is used to ensure that the `cid` property is correctly |
| initialized if it's set before the element is fully connected to the |
| DOM. |
| |
| - **`commit-detail-sk.scss`**: This file contains the styling for the |
| `<commit-detail-sk>` element. |
| |
| - It defines styles for the layout, typography, and appearance of the |
| commit information and the action buttons. |
| - It utilizes CSS variables for theming (e.g., `--blue`, `--primary`), |
| allowing the component to adapt to different visual themes (light and |
| dark mode, as demonstrated in the demo). |
| - It includes styles from `//perf/modules/themes:themes_sass_lib` and |
| `//elements-sk/modules:colors_sass_lib` to ensure consistency with the |
| broader application's design system. |
| |
| - **`commit-detail-sk-demo.html` and `commit-detail-sk-demo.ts`**: These files |
| provide a demonstration page for the `<commit-detail-sk>` element. |
| |
| - The HTML sets up basic page structure and includes instances of |
| `<commit-detail-sk>` in both light and dark mode contexts. |
| - The TypeScript file initializes these demo elements with sample `Commit` |
| data. It also simulates a click on the element to potentially reveal |
| more details or actions if such functionality were implemented (though |
| in the current version, the "tip" div with buttons is always visible). |
| The `Date.now` function is mocked to ensure consistent output for the |
| `diffDate` calculation in the demo and tests. |
| |
| **Workflow Example: Displaying Commit Information and Actions** |
| |
| ``` |
| 1. Application provides a `Commit` object. |
| e.g., { hash: "abc123...", author: "user@example.com", ... } |
| |
| 2. The `Commit` object is assigned to the `cid` property of a `<commit-detail-sk>` element. |
| <commit-detail-sk .cid=${commitData}></commit-detail-sk> |
| |
| 3. `CommitDetailSk` element renders: |
| [abc123...] - [user@example.com] - [2 days ago] - [Commit message] |
| +----------------------------------------------------------------+ |
| | [Explore] [Cluster] [Triage] [Commit (link to commit source)] | <- Action buttons |
| +----------------------------------------------------------------+ |
| |
| 4. User clicks an action button (e.g., "Explore"). |
| |
| 5. `openLink` method is called with a generated URL (e.g., "/g/e/abc123..."). |
| |
| 6. A new browser tab opens to the specified URL. |
| ``` |
| |
| This design promotes reusability and separation of concerns. The element focuses |
| solely on presenting commit information and providing relevant action links, |
| making it easy to integrate into various parts of an application that need to |
| display commit details. The use of `lit-html` for templating allows for |
| efficient rendering and updates. |
| |
| # Module: /modules/commit-range-sk |
| |
| The `commit-range-sk` module provides a custom HTML element, |
| `<commit-range-sk>`, designed to display a link representing a range of commits |
| within a Git repository. This functionality is particularly useful in |
| performance analysis tools where identifying the specific commits that |
| introduced a performance regression or improvement is crucial. |
| |
| **Core Functionality and Design:** |
| |
| The primary purpose of `commit-range-sk` is to dynamically generate a URL that |
| points to a commit range viewer (e.g., a Git web interface like Gerrit or |
| GitHub). This URL is constructed based on a "begin" and an "end" commit. |
| |
| - **Identifying the Commit Range:** |
| |
| - The element takes a `trace` (an array of numerical data points, where |
| each point corresponds to a commit), a `commitIndex` (the index within |
| the `trace` array that represents the "end" commit of interest), and |
| `header` information (which maps trace indices to commit metadata like |
| `offset` or commit number). |
| - The "end" commit is directly determined by the `commitIndex` and the |
| `header`. |
| - The "begin" commit is found by iterating backward from the |
| `commitIndex - 1` in the `trace`. It skips over any entries marked with |
| `MISSING_DATA_SENTINEL` (indicating commits for which there's no data |
| point) until it finds a valid previous commit. |
| - This logic ensures that the range always spans from a commit with actual |
| data to the target commit, even if there are intermediate commits with |
| missing data (see the sketch after this list). |
| |
| - **Converting Commit Numbers to Hashes:** |
| |
| - The commit range URL template, configured globally via |
| `window.perf.commit_range_url`, typically requires Git commit hashes |
| (SHAs) rather than internal commit numbers or offsets. |
| - The `commit-range-sk` element uses a `commitNumberToHashes` function to |
| perform this conversion. |
| - The default implementation, `defaultcommitNumberToHashes`, makes an |
| asynchronous call to a backend service (likely `/_/cid/`) by invoking |
| `lookupCids` from the `//perf/modules/cid:cid_ts_lib` module. This |
| service is expected to return the commit hashes corresponding to the |
| provided commit numbers. |
| - This design allows for testability by enabling the replacement of |
| `commitNumberToHashes` with a mock function during testing (as seen in |
| `commit-range-sk_test.ts`). |
| |
| - **URL Construction and Display:** |
| |
| - Once the "begin" and "end" commit numbers are identified and their |
| corresponding hashes are retrieved, the element populates the |
| `window.perf.commit_range_url` template. This template usually contains |
| placeholders like `{begin}` and `{end}` which are replaced with the |
| actual commit hashes. |
| - The displayed text for the link is also dynamically generated. If the |
| "begin" and "end" commits are not consecutive (i.e., there's at least |
| one commit between them, or the "begin" commit had to skip missing data |
| points), the text shows a range like `<begin_offset + 1> - <end_offset>`. |
| Otherwise, it shows just `<end_offset>`. The `+1` on the begin offset |
| ensures the displayed range starts _after_ the last known good commit. |
| - The element supports two display modes controlled by the `showLinks` |
| property: |
| - If `showLinks` is `false` (default, or when the element is merely |
| hovered over in some UIs), only the text representing the commit(s) is |
| displayed. |
| - If `showLinks` is `true`, a fully formed hyperlink (`<a>` tag) is |
| rendered. |
| |
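| As referenced above, the backward search over the trace might look like this |
| hedged sketch (the real logic lives in `setCommitIds()`; the helper name and |
| import path are illustrative): |
| |
| ``` |
| // Hedged sketch of finding the "begin" commit for the range. |
| import { MISSING_DATA_SENTINEL } from '../const/const'; |
| |
| // Walk backward from commitIndex - 1, skipping commits with no data, |
| // and return the index of the last commit that has a real value. |
| function findBeginIndex(trace: number[], commitIndex: number): number { |
|   for (let i = commitIndex - 1; i >= 0; i--) { |
|     if (trace[i] !== MISSING_DATA_SENTINEL) { |
|       return i; |
|     } |
|   } |
|   return -1; // No earlier commit with data. |
| } |
| ``` |
| |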
| **Key Components/Files:** |
| |
| - **`commit-range-sk.ts`**: This is the core file defining the `CommitRangeSk` |
| custom element. |
| |
| - It extends `ElementSk`, a base class for custom elements in the Skia |
| infrastructure. |
| - It manages the state of the component through properties like `_trace`, |
| `_commitIndex`, `_header`, `_url`, `_text`, and `_commitIds`. |
| - The `recalcLink()` method is central to its operation. It's triggered |
| whenever relevant input properties (`trace`, `commitIndex`, `header`) |
| change. This method orchestrates the process of finding commit IDs, |
| converting them to hashes, and generating the URL and display text. |
| - `setCommitIds()` implements the logic for determining the start and end |
| commit numbers based on the input trace and header, handling missing |
| data points. |
| - It uses the `lit/html` library for templating, allowing for efficient |
| rendering and updates to the DOM. |
| |
| - **`commit-range-sk-demo.ts` and `commit-range-sk-demo.html`**: These files |
| provide a demonstration page for the `commit-range-sk` element. |
| |
| - `commit-range-sk-demo.ts` sets up a mock environment, including mocking |
| the `fetch` call to `/_/cid/` using `fetch-mock`. This is crucial for |
| demonstrating the element's behavior without needing a live backend. |
| - It also initializes the global `window.perf` object with necessary |
| configuration, such as the `commit_range_url` template. |
| - It then instantiates the `<commit-range-sk>` element and populates its |
| properties to showcase its functionality. |
| |
| - **`commit-range-sk_test.ts`**: This file contains unit tests for the |
| `CommitRangeSk` element. |
| |
| - It utilizes `chai` for assertions and `setUpElementUnderTest` for easy |
| instantiation of the element in a test environment. |
| - A key testing strategy involves overriding the `commitNumberToHashes` |
| method on the element instance to provide controlled hash values and |
| assert the correctness of the generated URL and text, especially in |
| scenarios involving `MISSING_DATA_SENTINEL`. |
| |
| - **`BUILD.bazel`**: Defines how the module is built, its dependencies (e.g., |
| `//infra-sk/modules/ElementSk`, `//perf/modules/json`, `lit`), and how the |
| demo page and tests are structured. |
| |
| **Workflow Example: Generating a Commit Range Link** |
| |
| 1. **Initialization:** |
| |
| - The application using `<commit-range-sk>` sets the global |
| `window.perf.commit_range_url` (e.g., |
| `"http://example.com/range/{begin}/{end}"`). |
| - The `<commit-range-sk>` element is added to the DOM. |
| |
| 2. **Property Setting:** |
| |
|    - The application provides data to the element: |
|      - `element.trace = [10, MISSING_DATA_SENTINEL, 12, 15];` |
|      - `element.header = [{offset: C1}, {offset: C2}, {offset: C3}, {offset: C4}];` |
|        (where C1-C4 are commit numbers) |
|      - `element.commitIndex = 3;` (points to the data `15` and commit `C4`) |
|      - `element.showLinks = true;` |
| |
| 3. **`recalcLink()` Triggered:** |
| |
| - Changing any of the above properties automatically calls `recalcLink()`. |
| |
| 4. **Determine Commit IDs (`setCommitIds()`):** |
| |
| - End commit: `header[commitIndex].offset` => `C4`. |
| - Previous commit search: |
| - Start at `commitIndex - 1 = 2`. `trace[2]` is `12` (not missing). |
| So, `header[2].offset` => `C3`. |
| - `_commitIds` becomes `[C3, C4]`. |
| |
| 5. **Check if Range (`isRange()`):** |
| |
| - Is `C3 + 1 === C4`? Let's assume `C3` and `C4` are not consecutive |
| (e.g., `C3=100`, `C4=102`). `isRange()` returns `true`. |
| - Text becomes: `"${C3 + 1} - ${C4}"` (e.g., `"101 - 102"`). |
| |
| 6. **Convert Commit IDs to Hashes (`commitNumberToHashes`):** |
| |
|    - `commitNumberToHashes([C3, C4])` is called. |
|    - Internally, this likely makes a POST request to `/_/cid/` with |
|      `[C3, C4]`. |
|    - Backend returns: |
|      `{ commitSlice: [{hash: "hash_for_C3"}, {hash: "hash_for_C4"}] }`. |
|    - The function resolves with `["hash_for_C3", "hash_for_C4"]`. |
| |
| 7. **Construct URL:** |
| |
| - `url = window.perf.commit_range_url` (e.g., |
| `"http://example.com/range/{begin}/{end}"`) |
| - `url = url.replace('{begin}', "hash_for_C3")` |
| - `url = url.replace('{end}', "hash_for_C4")` |
| - `_url` becomes `"http://example.com/range/hash_for_C3/hash_for_C4"`. |
| |
| 8. **Render:** |
| |
|    - Since `showLinks` is true, the template becomes: |
|      `<a href="http://example.com/range/hash_for_C3/hash_for_C4" target="_blank">101 - 102</a>` |
|    - The element updates its content with this HTML. |
| |
| This workflow demonstrates how `commit-range-sk` encapsulates the logic for |
| finding relevant commits, converting their identifiers, and presenting a |
| user-friendly link to explore changes between them, abstracting away the |
| complexities of interacting with commit data and URL templates. |
| |
| # Module: /modules/common |
| |
| ## Common Module |
| |
| The `common` module houses utility functions and data structures that are shared |
| across various parts of the Perf application, particularly those related to data |
| visualization and testing. Its primary purpose is to promote code reuse and |
| maintain consistency in how data is processed and displayed. |
| |
| ### Responsibilities and Key Components |
| |
| The module's responsibilities can be broken down into the following areas: |
| |
| 1. **Plot Data Construction and Formatting**: |
| |
| - **Why**: Visualizing performance data often involves transforming raw |
| data into formats suitable for charting libraries (like Google Charts). |
| This process needs to be standardized to ensure plots are consistent and |
| correctly represent the underlying information. |
| - **How**: |
| |
| - `plot-builder.ts`: This file is central to preparing data for |
| plotting. |
| |
| - `convertFromDataframe`: This function is crucial for adapting data |
| organized in a `DataFrame` structure (where traces are rows) into a |
| format suitable for Google Charts, which typically expects data in |
| columns. It essentially transposes the `TraceSet`. The `domain` |
| parameter allows specifying whether the x-axis should represent |
| commit positions, dates, or both, providing flexibility in how |
| time-series data is visualized. |
| |
| ``` |
| Input DataFrame (TraceSet): |
| TraceA: [val1, val2, val3] |
| TraceB: [valA, valB, valC] |
| Header: [commit1, commit2, commit3] |
| |
| convertFromDataframe (domain='commit') -> |
| |
| Output for Google Chart: |
| ["Commit Position", "TraceA", "TraceB"] |
| [commit1_offset, val1, valA ] |
| [commit2_offset, val2, valB ] |
| [commit3_offset, val3, valC ] |
| ``` |
| |
| - `ConvertData`: This function takes a `ChartData` object, which is a |
| more abstract representation of plot data (lines with x, y |
| coordinates and labels), and transforms it into the specific |
| array-of-arrays format required by Google Charts. This abstraction |
| allows other parts of the application to work with `ChartData` |
| without needing to know the exact details of the charting library's |
| input format. |
| |
| ``` |
| Input ChartData: |
| xLabel: "Time" |
| lines: { |
| "Line1": [{x: t1, y: v1}, {x: t2, y: v2}], |
| "Line2": [{x: t1, y: vA}, {x: t2, y: vB}] |
| } |
| |
| ConvertData -> |
| |
| Output for Google Chart: |
| ["Time", "Line1", "Line2"] |
| [t1, v1, vA ] |
| [t2, v2, vB ] |
| ``` |
| |
| - `mainChartOptions` and `SummaryChartOptions`: These functions |
| provide pre-configured option objects for Google Line Charts. They |
| encapsulate common styling and behavior (like colors, axis |
| formatting, tooltip behavior, and null interpolation) to ensure a |
| consistent look and feel for different types of charts (main detail |
| charts vs. summary overview charts). This avoids repetitive |
| configuration and makes it easier to maintain visual consistency. |
| The options are also designed to adapt to the current theme |
| (light/dark mode) by using CSS custom properties. |
| |
| - `defaultColors`: A predefined array of colors used for chart series, |
| ensuring a consistent and visually distinct palette. |
| |
| 2. **Plotting Utilities**: |
| |
| - **Why**: Beyond basic data transformation, there are common tasks |
| related to preparing data specifically for plotting, such as associating |
| anomalies with data points and handling missing values. |
| - **How**: |
| |
| - `plot-util.ts`: This file contains helper functions that build upon |
| `plot-builder.ts`. |
| |
| - `CreateChartDataFromTraceSet`: This function serves as a |
| higher-level constructor for `ChartData`. It takes a raw `TraceSet` |
| (a dictionary where keys are trace identifiers and values are arrays |
| of numbers), corresponding x-axis labels (commit numbers or dates), |
| the desired x-axis format, and anomaly information. It then iterates |
| through the traces, constructs `DataPoint` objects (which include x, |
| y, and any associated anomaly), and organizes them into the |
| `ChartData` structure. A key aspect is its handling of |
| `MISSING_DATA_SENTINEL` to exclude missing points from the chart |
| data, relying on the charting library's interpolation. It also uses |
| `findMatchingAnomaly` to link anomalies to their respective data |
| points. |
| |
| ``` |
| Input TraceSet: |
| "trace_foo": [10, 12, MISSING_DATA_SENTINEL, 15] |
| xLabels: [c1, c2, c3, c4] |
| Anomalies: { "trace_foo": [{x: c2, y: 12, anomaly: {...}}] } |
| |
| CreateChartDataFromTraceSet -> |
| |
| Output ChartData: |
| lines: { |
| "trace_foo": [ |
| {x: c1, y: 10, anomaly: null}, |
| {x: c2, y: 12, anomaly: {...}}, |
| // Point for c3 is skipped due to MISSING_DATA_SENTINEL |
| {x: c4, y: 15, anomaly: null} |
| ] |
| } |
| ... |
| ``` |
| |
| - `findMatchingAnomaly`: A utility to efficiently check if a given |
| data point (identified by its trace key, x-coordinate, and |
| y-coordinate) corresponds to a known anomaly. This is used by |
| `CreateChartDataFromTraceSet` to enrich data points with anomaly |
| details. |
| |
| 3. **Test Utilities**: |
| |
| - **Why**: Writing effective unit and integration tests, as well as |
| creating demo pages, often requires mock data and simulated API |
| responses. Centralizing these test utilities avoids duplication and |
| makes tests easier to write and maintain. |
| - **How**: |
| - `test-util.ts`: This file provides functions to set up a common |
| testing and demo environment. |
| - `setUpExploreDemoEnv`: This is a comprehensive function that uses |
| `fetch-mock` to intercept various API calls that are typically made |
| by Perf frontend components (e.g., explore page, alert details). It |
| returns predefined, static responses for endpoints like |
| `/_/login/status`, `/_/initpage/...`, `/_/count/`, `/_/frame/start`, |
| `/_/defaults/`, `/_/status/...`, `/_/cid/`, `/_/details/`, |
| `/_/shortcut/get`, `/_/nextParamList/`, and `/_/shortcut/update`. |
| - The purpose of mocking these endpoints is to allow frontend |
| components to be tested or demonstrated in isolation, without |
| requiring a live backend. The mocked data is designed to be |
| representative of real API responses, enabling realistic testing |
| scenarios. For example, it provides sample `paramSet` data, |
| `DataFrame` structures, commit information, and default |
| configurations. This ensures that components relying on these API |
| calls behave predictably in a test or demo environment. The function |
| also checks for a `proxy_endpoint` cookie to avoid mocking if a real |
| backend is being proxied for development or demo purposes. |
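| |
| A demo or test page might use this as in the hedged sketch below; the import |
| path and the element chosen are illustrative: |
| |
| ``` |
| // Hedged sketch: boot a demo page against the shared mocks. |
| import { setUpExploreDemoEnv } from '../common/test-util'; |
| |
| // Install fetch-mock handlers for /_/login/status, /_/frame/start, |
| // /_/defaults/, etc., so components run without a live backend. |
| setUpExploreDemoEnv(); |
| |
| // Any element that talks to those endpoints can now be instantiated. |
| document.body.appendChild(document.createElement('explore-simple-sk')); |
| ``` |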
| |
| # Module: /modules/const |
| |
| The `/modules/const` module serves as a centralized repository for constants |
| utilized throughout the Perf UI. Its primary purpose is to ensure consistency |
| and maintainability by providing a single source of truth for values that are |
| shared across different parts of the user interface. |
| |
| A key design decision behind this module is to manage values that might also be |
| defined in the backend. This avoids potential discrepancies and ensures that |
| frontend and backend systems operate with the same understanding of specific |
| sentinel values or configurations. |
| |
| The core responsibility of this module is to define and export these shared |
| constants. |
| |
| One of the key components is the `const.ts` file. This file contains the actual |
| definitions of the constants. A notable constant defined here is |
| `MISSING_DATA_SENTINEL`. |
| |
| The `MISSING_DATA_SENTINEL` constant (value: `1e32`) is critical for |
| representing missing data points within traces. The backend uses this specific |
| floating-point value to indicate that a sample is absent. The choice of `1e32` |
| is deliberate. JSON, the data interchange format used, does not natively support |
| `NaN` (Not a Number) or infinity values (`+/- Inf`). Therefore, a valid |
| `float32` that has a compact JSON representation and is unlikely to clash with |
| actual data values was chosen. It is imperative that this frontend constant |
| remains synchronized with the `MissingDataSentinel` constant defined in the |
| backend Go package `//go/vec32/vec`. This synchronization ensures that both the |
| UI and the backend correctly interpret missing data. |
| |
| Any part of the Perf UI that needs to interpret or display trace data, |
| especially when dealing with potentially incomplete datasets, will rely on this |
| `MISSING_DATA_SENTINEL`. For instance, charting libraries or data table |
| components might use this constant to visually differentiate missing points or |
| to exclude them from calculations. |
| |
| Workflow involving `MISSING_DATA_SENTINEL`: |
| |
| ``` |
| Backend Data Generation --> Data contains `MissingDataSentinel` from `//go/vec32/vec` |
|     | |
|     V |
| Data Serialization (JSON) --> `1e32` is used for missing data |
|     | |
|     V |
| Frontend Data Fetching |
|     | |
|     V |
| Frontend UI Component (e.g., a chart) |
|     | |
|     V |
| UI uses `MISSING_DATA_SENTINEL` from `/modules/const/const.ts` to identify missing points |
|     | |
|     V |
| Appropriate rendering (e.g., gap in a line chart, specific placeholder in a table) |
| ``` |
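| |
| A typical consumer-side guard looks like this hedged sketch (the constant is |
| real; the helper and import path are illustrative): |
| |
| ``` |
| // Hedged sketch: leave gaps in chart data where samples are missing. |
| import { MISSING_DATA_SENTINEL } from '../const/const'; |
| |
| // Convert a raw trace into chart points, mapping the sentinel to null. |
| function toPoints(trace: number[]): (number | null)[] { |
|   return trace.map((v) => (v === MISSING_DATA_SENTINEL ? null : v)); |
| } |
| |
| console.log(toPoints([10, 1e32, 12])); // [10, null, 12] |
| ``` |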
| |
| # Module: /modules/csv |
| |
| The `/modules/csv` module provides functionality to convert `DataFrame` objects, |
| a core data structure representing performance or experimental data, into the |
| Comma Separated Values (CSV) format. This conversion is essential for users who |
| wish to export data for analysis in external tools, spreadsheets, or for |
| archival purposes. |
| |
| The primary challenge in converting a `DataFrame` to CSV lies in representing |
| the potentially sparse and varied parameter sets associated with each trace |
| (data series) in a flat, tabular format. The `DataFrame` stores traces indexed |
| by a "trace ID," which is a string encoding of key-value pairs representing the |
| parameters that uniquely identify that trace. |
| |
| The conversion process addresses this challenge through a multi-step approach: |
| |
| 1. **Parameter Key Consolidation**: |
| |
| - The `parseIdsIntoParams` function takes an array of trace IDs and |
| transforms each ID string back into its constituent key-value parameter |
| pairs. This is achieved by leveraging the `fromKey` function from the |
| `//perf/modules/paramtools` module. |
| - The `allParamKeysSorted` function then iterates through all these parsed |
| parameter sets to identify the complete, unique set of all parameter |
| keys present across all traces. These keys are then sorted |
| alphabetically. This sorted list of unique parameter keys will form the |
| initial set of columns in the CSV, ensuring a consistent order and |
| comprehensive representation of all parameters. |
| |
| _Pseudocode for parameter key consolidation:_ |
| |
| ``` |
| traceIDs = ["key1=valueA,key2=valueB", "key1=valueC,key3=valueD"] |
| parsedParams = {} |
| for each id in traceIDs: |
| parsedParams[id] = fromKey(id) // e.g., {"key1=valueA,key2=valueB": {key1:"valueA", key2:"valueB"}} |
| |
| allKeys = new Set() |
| for each params in parsedParams.values(): |
| for each key in params.keys(): |
| allKeys.add(key) |
| |
| sortedColumnNames = sorted(Array.from(allKeys)) // e.g., ["key1", "key2", "key3"] |
| ``` |
| |
| 2. **Header Row Generation**: |
| |
| - The `dataFrameToCSV` function begins by constructing the header row of |
| the CSV. |
| - This row starts with the `sortedColumnNames` derived in the previous |
| step. |
| - It then appends column headers derived from the `DataFrame`'s `header` |
| property. Each element in `df.header` typically represents a point in |
| time (or a commit, build, etc.), and its `timestamp` field is converted |
| into an ISO 8601 formatted date string. |
| |
| _Pseudocode for header row generation:_ |
| |
| ``` |
| csvHeader = sortedColumnNames |
| for each columnHeader in df.header: |
| csvHeader.push(new Date(columnHeader.timestamp * 1000).toISOString()) |
| csvLines.push(csvHeader.join(',')) |
| ``` |
| |
| 3. **Data Row Generation**: |
| |
| - For each trace in the `df.traceset` (excluding "special\_" traces, which |
| are likely internal or metadata traces not intended for direct CSV |
| export): |
| - The corresponding parameter values for the `sortedColumnNames` are |
| retrieved. If a trace does not have a value for a particular |
| parameter key, an empty string is used, ensuring that each row has |
| the same number of columns corresponding to the parameter keys. |
| - The actual data points for the trace are then appended. The |
| `MISSING_DATA_SENTINEL` (defined in `//perf/modules/const`) is a |
| special value indicating missing data; this is converted to an empty |
| string in the CSV to represent a null or missing value. Other |
| numerical values are appended directly. |
| - Each fully constructed row is then joined by commas. |
| |
| _Pseudocode for data row generation:_ |
| |
| ``` |
| for each traceId, traceData in df.traceset: |
| if traceId starts with "special_": |
| continue |
| |
| traceParams = parsedParams[traceId] |
| rowData = [] |
| for each columnName in sortedColumnNames: |
| rowData.push(traceParams[columnName] or "") // Add parameter value or empty string |
| |
| for each value in traceData: |
| if value is MISSING_DATA_SENTINEL: |
| rowData.push("") |
| else: |
| rowData.push(value) |
| csvLines.push(rowData.join(',')) |
| ``` |
| |
| 4. **Final CSV String Assembly**: |
| |
| - Finally, all the generated lines (header and data rows) are joined |
| together with newline characters (`\n`) to produce the complete CSV |
| string. |
| |
| The design prioritizes creating a CSV that is both human-readable and easily |
| parsable by other tools. By dynamically determining the parameter columns based |
| on the input `DataFrame` and sorting them, it ensures that all relevant trace |
| metadata is included in a consistent manner. The explicit handling of |
| `MISSING_DATA_SENTINEL` ensures that missing data is represented clearly as |
| empty fields. |
| |
| The key files in this module are: |
| |
| - `index.ts`: This file contains the core logic for the CSV conversion. It |
| houses the `parseIdsIntoParams`, `allParamKeysSorted`, and the main |
| `dataFrameToCSV` functions. It leverages helper functions from |
| `//perf/modules/paramtools` for parsing trace ID strings and relies on |
| constants from `//perf/modules/const` for identifying missing data. |
| - `index_test.ts`: This file provides unit tests for the `dataFrameToCSV` |
| function. It defines a sample `DataFrame` with various scenarios, including |
| different parameter sets per trace and missing data points, and asserts that |
| the generated CSV matches the expected output. This is crucial for ensuring |
| the correctness and robustness of the CSV generation logic. |
| |
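| In practice the conversion is a single call, as in this hedged sketch (the |
| `DataFrame` literal is abbreviated and the import path is illustrative): |
| |
| ``` |
| // Hedged usage sketch for dataFrameToCSV. |
| import { dataFrameToCSV } from '../csv/index'; |
| |
| const df = { |
|   header: [{ offset: 1, timestamp: 1687000000 }], |
|   traceset: { |
|     ',arch=x86,config=8888,': [10.5], |
|   }, |
| } as any; // A real DataFrame carries more fields (see perf/modules/json). |
| |
| // Produces a header row "arch,config,<ISO date>" followed by "x86,8888,10.5". |
| console.log(dataFrameToCSV(df)); |
| ``` |
| |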
| The dependencies on `//perf/modules/const` (for `MISSING_DATA_SENTINEL`) and |
| `//perf/modules/json` (for `DataFrame`, `ColumnHeader`, `Params` types) indicate |
| that this module is tightly integrated with the broader data representation and |
| handling mechanisms of the Perf system. The dependency on |
| `//perf/modules/paramtools` (for `fromKey`) highlights its role in interpreting |
| the structured information encoded within trace IDs. |
| |
| # Module: /modules/dataframe |
| |
| The `dataframe` module is designed to manage and manipulate time-series data, |
| specifically performance testing traces, within the Perf application. It |
| provides a centralized way to fetch, store, and process trace data, enabling |
| functionalities like visualizing performance trends, identifying anomalies, and |
| managing user-reported issues. |
| |
| The core idea is to have a reactive data repository that components can consume. |
| This allows for efficient data loading and updates, especially when dealing with |
| large datasets and dynamic time ranges. Instead of each component fetching and |
| managing its own data, they can rely on a shared `DataFrameRepository` to handle |
| these tasks. This promotes consistency and reduces redundant data fetching. |
| |
| ## Key Components and Responsibilities |
| |
| ### `dataframe_context.ts` |
| |
| This file defines the `DataFrameRepository` class, which acts as the central |
| data store and manager. It's implemented as a LitElement |
| (`<dataframe-repository-sk>`) that doesn't render any UI itself but provides |
| data and loading states through Lit contexts. |
| |
| **Why a LitElement with Contexts?** Using a LitElement allows easy integration |
| into the existing component-based architecture. Lit contexts (`@lit/context`) |
| provide a clean and reactive way for child components to consume the `DataFrame` |
| and related information without prop drilling or complex event bus |
| implementations. |
| |
| **Core Functionalities:** |
| |
| - **Data Fetching:** |
| |
|   - `resetTraces(range, paramset)`: Fetches an initial set of traces based |
|     on a time range and a `ParamSet` (a set of key-value pairs defining the |
|     traces to query). This is typically called when the user defines a new |
|     query. |
| |
|     ``` |
|     User defines query -> explore-simple-sk calls resetTraces() |
|         | |
|         V |
|     DataFrameRepository -> Fetches data from /_/frame/start |
|         | |
|         V |
|     Updates internal _header, _traceset, anomaly, userIssues |
|         | |
|         V |
|     Provides DataFrame, DataTable, AnomalyMap, UserIssueMap via context |
|     ``` |
| |
|   - `extendRange(offsetInSeconds)`: Fetches additional data to extend the |
|     current time range, either forwards or backwards. This is used for |
|     infinite scrolling or when the user wants to see more data. To improve |
|     performance for large range extensions, it slices the requested range |
|     into smaller chunks (`chunkSize`) and fetches them concurrently. |
| |
|     ``` |
|     User scrolls/requests more data -> UI calls extendRange() |
|         | |
|         V |
|     DataFrameRepository -> Slices range into chunks if needed |
|         | |
|         V |
|     Fetches data for each chunk from /_/frame/start concurrently |
|         | |
|         V |
|     Merges new data with existing _header, _traceset, anomaly |
|         | |
|         V |
|     Provides updated DataFrame, DataTable, AnomalyMap via context |
|     ``` |
| - The fetching mechanism uses the `/_/frame/start` endpoint, sending a |
| `FrameRequest` which includes the time range, query (derived from |
| `ParamSet`), and timezone. |
| - It handles responses, including potential errors or "Finished" status |
| with no data (e.g., no commits in the requested range). |
| |
| - **Data Caching and Merging:** |
| |
| - Maintains an internal representation of the data: `_header` (array of |
| `ColumnHeader` objects, representing commit points/timestamps) and |
| `_traceset` (a `TraceSet` object mapping trace keys to their data |
| arrays). |
| - When new data is fetched (either initial load or extension), it's merged |
| with the existing cached data. The merging logic ensures that headers |
| are correctly ordered and trace data is appropriately prepended or |
| appended. If a trace being extended isn't present in a new data chunk, |
| it's padded with `MISSING_DATA_SENTINEL` to maintain alignment with the |
| header. |
| |
| - **Anomaly Management:** |
| |
| - Fetches anomaly data (`AnomalyMap`) along with the trace data. |
| - `updateAnomalies(anomalies, id)`: Allows merging new anomalies and |
| removing specific anomalies (e.g., when an anomaly is nudged or |
| re-triaged). This uses `mergeAnomaly` and `removeAnomaly` from |
| `index.ts`. |
| |
| - **User-Reported Issue Management:** |
| |
| - `getUserIssues(traceKeys, begin, end)`: Fetches user-reported issues |
| (e.g., Buganizer bugs linked to specific data points) from the |
| `/_/user_issues/` endpoint for a given set of traces and commit range. |
| - `updateUserIssue(traceKey, commitPosition, bugId)`: Updates the local |
| cache of user issues, typically after a new issue is filed or an |
| existing one is modified. |
| - Trace keys are normalized by removing special functions (e.g., `norm()`) |
| before querying for user issues to ensure issues are found even if the |
| displayed trace is a transformed version of the original. |
| |
| - **Google DataTable Conversion:** |
| |
| - Converts the internal `DataFrame` into a |
| `google.visualization.DataTable` format using `convertFromDataframe` |
| (from `perf/modules/common:plot-builder_ts_lib`). This `DataTable` is |
| then provided via `dataTableContext` and is typically consumed by |
| charting components like `<plot-google-chart-sk>`. |
| - The Google Chart library is loaded asynchronously |
| (`DataFrameRepository.loadPromise`). |
| |
| - **State Management:** |
| |
| - `loading`: A boolean provided via `dataframeLoadingContext` to indicate |
| if a data request is in flight. |
| - `_requestComplete`: A Promise that resolves when the current data |
| fetching operation completes. This can be used to coordinate actions |
| that depend on data being available. |
| |
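| To make the chunked `extendRange` flow concrete, here is a minimal sketch of |
| the fan-out step. The helper names and the simplified request body are |
| assumptions for illustration; the real implementation in |
| `dataframe_context.ts` sends a full `FrameRequest` (time range, query, |
| timezone) and differs in detail: |
| |
| ``` |
| // Hypothetical sketch of the chunked fan-out in extendRange(). |
| interface Range { |
|   begin: number; // Unix seconds |
|   end: number; |
| } |
| |
| // Split a large extension into chunks of at most chunkSize seconds. |
| function sliceRange(r: Range, chunkSize: number): Range[] { |
|   const out: Range[] = []; |
|   for (let b = r.begin; b < r.end; b += chunkSize) { |
|     out.push({ begin: b, end: Math.min(b + chunkSize, r.end) }); |
|   } |
|   return out; |
| } |
| |
| // One POST to /_/frame/start per chunk, awaited concurrently. |
| async function fetchChunks(delta: Range, chunkSize: number): Promise<unknown[]> { |
|   const requests = sliceRange(delta, chunkSize).map((r) => |
|     fetch('/_/frame/start', { |
|       method: 'POST', |
|       headers: { 'Content-Type': 'application/json' }, |
|       body: JSON.stringify({ begin: r.begin, end: r.end }), // plus query etc. |
|     }).then((resp) => resp.json()) |
|   ); |
|   // The caller then filters empty/error responses, sorts them by timestamp, |
|   // and merges header/traceset into the cache, padding gaps in a trace with |
|   // MISSING_DATA_SENTINEL. |
|   return Promise.all(requests); |
| } |
| ``` |
| |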
| **Contexts Provided:** |
| |
| - `dataframeContext`: Provides the current `DataFrame` object. |
| - `dataTableContext`: Provides the `google.visualization.DataTable` derived |
| from the `DataFrame`. |
| - `dataframeAnomalyContext`: Provides the `AnomalyMap` for the current data. |
| - `dataframeUserIssueContext`: Provides the `UserIssueMap` for the current |
| data. |
| - `dataframeLoadingContext`: Provides a boolean indicating if data is |
| currently being loaded. |
| - `dataframeRepoContext`: Provides the `DataFrameRepository` instance itself, |
| allowing consumers to call its methods (e.g., `extendRange`). |
| |
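| As a consumer-side illustration, a child component can pick these contexts up |
| with `@lit/context`'s `@consume` decorator. This is a minimal sketch: the |
| import paths and the `trace-count-sk` element are assumptions, and real |
| consumers such as `<plot-google-chart-sk>` are more involved: |
| |
| ``` |
| import { LitElement, html } from 'lit'; |
| import { customElement } from 'lit/decorators.js'; |
| import { consume } from '@lit/context'; |
| // Assumed import paths for the contexts and the generated DataFrame type. |
| import { dataframeContext, dataframeLoadingContext } from './dataframe_context'; |
| import { DataFrame } from '../json'; |
| |
| @customElement('trace-count-sk') // hypothetical consumer element |
| export class TraceCountSk extends LitElement { |
|   // subscribe: true re-renders this element whenever the provider updates. |
|   @consume({ context: dataframeContext, subscribe: true }) |
|   private dataframe?: DataFrame; |
| |
|   @consume({ context: dataframeLoadingContext, subscribe: true }) |
|   private loading = false; |
| |
|   render() { |
|     if (this.loading) return html`<p>Loading…</p>`; |
|     const n = Object.keys(this.dataframe?.traceset ?? {}).length; |
|     return html`<p>${n} traces loaded.</p>`; |
|   } |
| } |
| ``` |
| |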
| ### `index.ts` |
| |
| This file contains utility functions for manipulating `DataFrame` structures, |
| similar to its Go counterpart (`//perf/go/dataframe/dataframe.go`). These |
| functions are crucial for merging, slicing, and analyzing the data. |
| |
| **Key Functions:** |
| |
| - `findSubDataframe(header, range, domain)`: Given a `DataFrame` header and a |
| time/offset range, this function finds the start and end indices within the |
| header that correspond to the given range. This is essential for slicing |
| data. |
| - `generateSubDataframe(dataframe, range)`: Creates a new `DataFrame` |
| containing only the data within the specified index range of the original |
| `DataFrame`. |
| - `mergeAnomaly(anomaly1, ...anomalies)`: Merges multiple `AnomalyMap` objects |
| into a single one. If anomalies exist for the same trace and commit, the |
| later ones in the arguments list will overwrite earlier ones. It always |
| returns a non-null `AnomalyMap`. |
| - `removeAnomaly(anomalies, id)`: Creates a new `AnomalyMap` excluding any |
| anomalies with the specified `id`. This is used when an anomaly is moved or |
| re-triaged on the backend, and the old entry needs to be cleared. |
| - `findAnomalyInRange(allAnomaly, range)`: Filters an `AnomalyMap` to include |
| only anomalies whose commit positions fall within the given commit range. |
| - `mergeColumnHeaders(a, b)`: Merges two arrays of `ColumnHeader` objects, |
| producing a new sorted array of unique headers. It also returns mapping |
| objects (`aMap`, `bMap`) that indicate the new index of each header from the |
| original arrays. This is fundamental for the `join` operation. |
| - **Why map objects?** When merging traces from two DataFrames, the data |
| points need to be placed at the correct positions in the newly merged |
| header. The maps provide this correspondence. |
| - `join(a, b)`: Combines two `DataFrame` objects into a new one (see the |
| sketch after this list). |
| 1. It first merges their headers using `mergeColumnHeaders`. |
| 2. Then, it creates a new `traceset`. For each trace in the original |
| DataFrames, it uses the `aMap` and `bMap` to place the trace data points |
| into the correct slots in the new, longer trace arrays, filling gaps |
| with `MISSING_DATA_SENTINEL`. |
| 3. It also merges the `paramset` from both DataFrames. |
| 4. **Purpose:** This is useful when combining data from different sources |
| or different time periods that might not perfectly align. |
| - `buildParamSet(d)`: Reconstructs the `paramset` of a `DataFrame` based on |
| the keys present in its `traceset`. This ensures the `paramset` accurately |
| reflects the data. |
| - `timestampBounds(df)`: Returns the earliest and latest timestamps present in |
| the `DataFrame`'s header. |
| |
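| To make `join` concrete, here is a small illustrative example. Field names |
| follow Perf's `DataFrame` JSON types as described above; treat the exact |
| shapes and import paths as assumptions: |
| |
| ``` |
| import { join } from './index'; |
| import { DataFrame } from '../json'; // assumed path for the generated types |
| |
| // Two frames covering commits 1-2 and 2-3; their headers partially overlap. |
| const a = { |
|   header: [{ offset: 1, timestamp: 100 }, { offset: 2, timestamp: 200 }], |
|   traceset: { ',config=8888,': [0.1, 0.2] }, |
|   paramset: { config: ['8888'] }, |
|   skip: 0, |
| } as unknown as DataFrame; |
| const b = { |
|   header: [{ offset: 2, timestamp: 200 }, { offset: 3, timestamp: 300 }], |
|   traceset: { ',config=565,': [0.5, 0.6] }, |
|   paramset: { config: ['565'] }, |
|   skip: 0, |
| } as unknown as DataFrame; |
| |
| const merged = join(a, b); |
| // merged.header spans offsets [1, 2, 3], and each trace is padded where it |
| // has no data: |
| //   merged.traceset[',config=8888,'] -> [0.1, 0.2, MISSING_DATA_SENTINEL] |
| //   merged.traceset[',config=565,']  -> [MISSING_DATA_SENTINEL, 0.5, 0.6] |
| ``` |
| |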
| ### `traceset.ts` |
| |
| This file provides utility functions for extracting and formatting information |
| from the trace keys within a `DataFrame` or `DataTable`. Trace keys are strings |
| that encode various parameters (e.g., |
| `",benchmark=Speedometer,test=MotionMark,"`). |
| |
| **Key Functions:** |
| |
| - `getAttributes(df)`: Extracts all unique attribute keys (e.g., "benchmark", |
| "test") present across all trace keys in a `DataFrame`. |
| - `getTitle(dt)`: Identifies the common key-value pairs across all trace |
| labels in a `DataTable`. These common pairs form the "title" of the chart, |
| representing what all displayed traces have in common. |
| - **Why `DataTable` input?** This function is often used directly with the |
| `DataTable` that feeds a chart, as column labels in the `DataTable` are |
| typically the trace keys. |
| - `getLegend(dt)`: Identifies the key-value pairs that are _not_ common across |
| all trace labels in a `DataTable`. These differing parts form the "legend" |
| for each trace, distinguishing them from one another. |
| - It ensures that all legend objects have the same set of keys (sorted |
| alphabetically), filling in missing values with `"untitled_key"` for |
| consistency in display. |
| - `titleFormatter(title)`: Formats the output of `getTitle` (an object) into a |
| human-readable string, typically by joining values with '/'. |
| - `legendFormatter(legend)`: Formats the output of `getLegend` (an array of |
| objects) into an array of human-readable strings. |
| - `getLegendKeysTitle(label)`: Takes a legend object (for a single trace) and |
| creates a string by joining its keys, often used as a title for the legend |
| section. |
| - `isSingleTrace(dt)`: Checks if a `DataTable` contains data for only a single |
| trace (i.e., has exactly 3 columns: the two domain columns, commit |
| position and date, plus a single trace column). |
| - `findTraceByLabel(dt, legendTraceId)`: Finds the column label (trace key) in |
| a `DataTable` that matches the given `legendTraceId`. |
| - `findTracesForParam(dt, paramKey, paramValue)`: Finds all trace labels in a |
| `DataTable` that contain a specific key-value pair. |
| - `removeSpecialFunctions(key)`: A helper used internally to strip function |
| wrappers (like `norm(...)`) from trace keys before processing, ensuring that |
| the underlying parameters are correctly parsed. |
| |
| **Design Rationale for Title/Legend Generation:** When multiple traces are |
| plotted, the title should reflect what's common among them (e.g., |
| "benchmark=Speedometer"), and the legend should highlight what's different |
| (e.g., "test=Run1" vs. "test=Run2"). These functions automate this process by |
| analyzing the trace keys. |
| |
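| For example, given two traces that share a benchmark but differ by test, the |
| split works out as below. This is a conceptual sketch: the real functions |
| take a `google.visualization.DataTable` whose column labels are the trace |
| keys, and the exact formatted strings follow the descriptions above: |
| |
| ``` |
| // Trace keys being plotted (as DataTable column labels): |
| //   ,benchmark=Speedometer,test=Run1, |
| //   ,benchmark=Speedometer,test=Run2, |
| |
| getTitle(dt);  // -> { benchmark: 'Speedometer' }          (common pairs) |
| getLegend(dt); // -> [{ test: 'Run1' }, { test: 'Run2' }]  (differing pairs) |
| titleFormatter(getTitle(dt));   // -> 'Speedometer' |
| legendFormatter(getLegend(dt)); // -> ['Run1', 'Run2'] |
| ``` |
| |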
| ## Workflows |
| |
| ### Initial Data Load and Display |
| |
| ``` |
| 1. User navigates to a page or submits a query. |
| | |
| V |
| 2. <explore-simple-sk> (or similar component) determines initial time range and ParamSet. |
| | |
| V |
| 3. Calls `dataframeRepository.resetTraces(initialRange, initialParamSet)`. |
| | |
| V |
| 4. DataFrameRepository: |
| a. Sets `loading = true`. |
| b. Constructs `FrameRequest`. |
| c. POSTs to `/_/frame/start`. |
| d. Receives `FrameResponse` (containing DataFrame and AnomalyMap). |
| e. Updates its internal `_header`, `_traceset`, `anomaly`. |
| f. Calls `setDataFrame()`: |
| i. Updates `this.dataframe` (triggers `dataframeContext`). |
| ii. Converts DataFrame to `google.visualization.DataTable`. |
| iii. Updates `this.data` (triggers `dataTableContext`). |
| g. Updates `this.anomaly` (triggers `dataframeAnomalyContext`). |
| h. Sets `loading = false`. |
| | |
| V |
| 5. Charting components (consuming `dataTableContext`) re-render with the new data. |
| | |
| V |
| 6. Other UI elements (consuming `dataframeContext`, `dataframeAnomalyContext`) update. |
| ``` |
| |
| ### Extending Time Range (e.g., Scrolling) |
| |
| ``` |
| 1. User action triggers a request to load more data (e.g., scrolls near edge of chart). |
| | |
| V |
| 2. UI component calls `dataframeRepository.extendRange(offsetInSeconds)`. |
| | |
| V |
| 3. DataFrameRepository: |
| a. Sets `loading = true`. |
| b. Calculates the new time range (`deltaRange`). |
| c. Slices the new range into chunks if `offsetInSeconds` is large (`sliceRange`). |
| d. For each chunk: |
| i. Constructs `FrameRequest`. |
| ii. POSTs to `/_/frame/start`. |
| e. `Promise.all` awaits all chunk responses. |
| f. Filters out empty/error responses and sorts responses by timestamp. |
| g. Merges `header` and `traceset` from sorted responses into existing `_header` and `_traceset`. |
| - For traceset: pads with `MISSING_DATA_SENTINEL` if a trace is missing in a new chunk. |
| h. Merges `anomalymap` from sorted responses into existing `anomaly`. |
| i. Calls `setDataFrame()` (as in initial load). |
| j. Sets `loading = false`. |
| | |
| V |
| 4. Charting components and other UI elements update. |
| ``` |
| |
| ### Displaying Chart Title and Legend |
| |
| ``` |
| 1. Charting component (e.g., <perf-explore-sk>) has access to the `DataTable` via `dataTableContext`. |
| | |
| V |
| 2. It calls `getTitle(dataTable)` and `getLegend(dataTable)` from `traceset.ts`. |
| | |
| V |
| 3. It then uses `titleFormatter` and `legendFormatter` to get displayable strings. |
| | |
| V |
| 4. Renders these strings as the chart title and legend series. |
| ``` |
| |
| ## Testing |
| |
| - `dataframe_context_test.ts`: Tests the `DataFrameRepository` class. It uses |
| `fetch-mock` to simulate API responses from `/_/frame/start` and |
| `/_/user_issues/`. Tests cover initialization, data loading (`resetTraces`), |
| range extension (`extendRange`) with and without chunking, anomaly merging, |
| and user issue fetching/updating. |
| - `index_test.ts`: Tests the utility functions in `index.ts`, such as |
| `mergeColumnHeaders`, `join`, `findSubDataframe`, `mergeAnomaly`, etc. It |
| uses manually constructed `DataFrame` objects to verify the logic of these |
| data manipulation functions. |
| - `traceset_test.ts`: Tests the functions in `traceset.ts` for extracting |
| titles and legends from trace keys. It generates `DataFrame` objects with |
| various key combinations, converts them to `DataTable` (requiring Google |
| Chart API to be loaded), and then asserts the output of `getTitle`, |
| `getLegend`, etc. |
| - `test_utils.ts`: Provides helper functions for tests, notably: |
| - `generateFullDataFrame`: Creates mock `DataFrame` objects with specified |
| structures, which is invaluable for setting up consistent test |
| scenarios. |
| - `generateAnomalyMap`: Creates mock `AnomalyMap` objects linked to a |
| `DataFrame`. |
| - `mockFrameStart`: A utility to easily mock the `/_/frame/start` endpoint |
| with `fetch-mock`, returning parts of a provided full `DataFrame` based |
| on the request's time range. |
| - `mockUserIssues`: Mocks the `/_/user_issues/` endpoint. |
| |
| The testing strategy relies heavily on creating controlled mock data and API |
| responses to ensure that the data processing and fetching logic behaves as |
| expected under various conditions. |
| |
| # Module: /modules/day-range-sk |
| |
| The `day-range-sk` module provides a custom HTML element for selecting a date |
| range. It allows users to pick a "begin" and "end" date, which is a common |
| requirement in applications that deal with time-series data or event logging. |
| |
| The primary goal of this module is to offer a user-friendly way to define a time |
| interval. It achieves this by composing two `calendar-input-sk` elements, one |
| for the start date and one for the end date. This design choice leverages an |
| existing, well-tested component for date selection, promoting code reuse and |
| consistency. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`day-range-sk.ts`**: This is the core file defining the `DayRangeSk` |
| custom element. |
| |
| - **Why**: It encapsulates the logic for managing the begin and end dates, |
| handling user interactions, and emitting an event when the range |
| changes. |
| - **How**: |
| - It extends `ElementSk`, a base class for custom elements, providing |
| lifecycle callbacks and rendering capabilities. |
| - It uses the `lit-html` library for templating, rendering two |
| `calendar-input-sk` elements labeled "Begin" and "End". |
| - The `begin` and `end` dates are stored as attributes (and corresponding |
| properties) representing Unix timestamps in seconds. This is a common |
| and unambiguous way to represent points in time. |
| - When either `calendar-input-sk` element fires an `input` event |
| (signifying a date change), the `DayRangeSk` element updates its |
| corresponding `begin` or `end` attribute and then dispatches a custom |
| event named `day-range-change`. |
| - The `day-range-change` event's `detail` object contains the `begin` and |
| `end` timestamps, allowing parent components to easily consume the |
| selected range. |
| - Default values for `begin` and `end` are set if not provided: `begin` |
| defaults to 24 hours before the current time, and `end` defaults to the |
| current time. This provides a sensible initial state. |
| - The `connectedCallback` and `attributeChangedCallback` are used to |
| ensure the element renders correctly when added to the DOM or when its |
| attributes are modified. |
| |
| - **`day-range-sk.scss`**: This file contains the styling for the |
| `day-range-sk` element. |
| |
| - **Why**: To provide a consistent visual appearance and integrate with |
| the application's theming. |
| - **How**: It imports common theme variables (`themes.scss`) and defines |
| specific styles for the labels and input fields within the |
| `day-range-sk` component, ensuring they adapt to light and dark modes. |
| |
| - **`day-range-sk-demo.html` and `day-range-sk-demo.ts`**: These files provide |
| a demonstration page for the `day-range-sk` element. |
| |
| - **Why**: To showcase the element's functionality, allow for interactive |
| testing, and serve as an example of how to use it. |
| - **How**: |
| - The HTML file includes instances of `day-range-sk` with different |
| initial `begin` and `end` attributes. |
| - The TypeScript file listens for the `day-range-change` event from these |
| instances and displays the event details in a `<pre>` tag, demonstrating |
| how to retrieve the selected date range. |
| |
| - **`day-range-sk_puppeteer_test.ts`**: This file contains Puppeteer tests for |
| the `day-range-sk` element. |
| |
| - **Why**: To ensure the element renders correctly and behaves as expected |
| in a browser environment. |
| - **How**: It uses the `loadCachedTestBed` utility to set up a testing |
| environment, navigates to the demo page, and takes screenshots for |
| visual regression testing. It also performs a basic smoke test to |
| confirm the element is present on the page. |
| |
| **Key Workflows:** |
| |
| 1. **Initialization:** The element is added to the page, optionally with |
| `begin` and `end` attributes. In `connectedCallback()`, defaults are |
| applied if `begin`/`end` are not set (begin = now - 24h, end = now), and |
| `_render()` creates the two `<calendar-input-sk>` elements with the |
| initial dates. |
| |
| 2. **User Selects a New "Begin" Date:** The "Begin" `<calendar-input-sk>` |
| fires an `input` event carrying the new `Date`. |
| `day-range-sk._beginChanged(event)` converts the `Date` to a timestamp, |
| updates `this.begin`, and calls `this._sendEvent()`, which dispatches |
| `day-range-change` with `{ begin: new_begin_timestamp, end: |
| current_end_timestamp }`. |
| |
| 3. **User Selects a New "End" Date:** Symmetric to the above: the "End" |
| `<calendar-input-sk>` fires `input`, `day-range-sk._endChanged(event)` |
| updates `this.end`, and `this._sendEvent()` dispatches |
| `day-range-change` with `{ begin: current_begin_timestamp, end: |
| new_end_timestamp }`. |
| |
| 4. **Parent Component Consumes Date Range:** A parent component listens for |
| `day-range-change` on `<day-range-sk>` and, on each event, reads |
| `event.detail.begin` and `event.detail.end` to act on the new range. |
| |
| The conversion between `Date` objects (used by `calendar-input-sk`) and numeric |
| timestamps (used by `day-range-sk`'s attributes and events) is handled |
| internally by the `dateFromTimestamp` utility function and by using |
| `Date.prototype.valueOf() / 1000`. This design ensures that the `day-range-sk` |
| element exposes a simple, numeric API for its date range while leveraging a more |
| complex date object-based component for the UI. |
| |
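| A minimal usage sketch (the `day-range-change` detail shape is as described |
| above; timestamps are Unix seconds): |
| |
| ``` |
| // Listen for range changes from a <day-range-sk> element in the DOM. |
| const picker = document.querySelector('day-range-sk')!; |
| picker.addEventListener('day-range-change', (e: Event) => { |
|   const { begin, end } = (e as CustomEvent<{ begin: number; end: number }>) |
|     .detail; |
|   console.log(`New range: ${begin} .. ${end}`); |
| }); |
| ``` |
| |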
| # Module: /modules/domain-picker-sk |
| |
| The `domain-picker-sk` module provides a custom HTML element |
| `<domain-picker-sk>` that allows users to select a data domain. This domain can |
| be defined in two ways: either as a specific date range or as a number of data |
| points (commits) preceding a chosen end date. This flexibility is crucial for |
| applications that need to visualize or analyze time-series data where users |
| might want to focus on a specific period or view the most recent N data points. |
| |
| The core design choice is to offer these two distinct modes of domain selection, |
| catering to different user needs. The "Date Range" mode is useful when users |
| know the specific start and end dates they are interested in. The "Dense" mode |
| is more suitable when users want to see a fixed amount of recent data, |
| regardless of the specific start date. |
| |
| The component's state is managed internally and can also be set externally via |
| the `state` property. This `state` object, defined by the `DomainPickerState` |
| interface, holds the `begin` and `end` timestamps (in Unix seconds), the |
| `num_commits` (for "Dense" mode), and the `request_type` which indicates the |
| current selection mode (0 for "Date Range" - `RANGE`, 1 for "Dense" - `DENSE`). |
| |
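| Sketched as an interface (per the description above; the module's source |
| holds the authoritative definition): |
| |
| ``` |
| interface DomainPickerState { |
|   begin: number;        // Unix seconds |
|   end: number;          // Unix seconds |
|   num_commits: number;  // number of points, used in 'Dense' mode |
|   request_type: number; // 0 = RANGE ("Date Range"), 1 = DENSE ("Dense") |
| } |
| ``` |
| |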
| **Key Files and Their Responsibilities:** |
| |
| - **`domain-picker-sk.ts`**: This is the heart of the module. It defines the |
| `DomainPickerSk` class, which extends `ElementSk`. |
| |
| - **Why**: It encapsulates all the logic for rendering the UI, handling |
| user interactions, and managing the component's state. |
| - **How**: |
| - It uses the `lit-html` library for templating, allowing for efficient |
| updates to the DOM when the state changes. The `template` static method |
| defines the basic structure, and `_showRadio` and `_requestType` static |
| methods conditionally render different parts of the UI based on the |
| current `request_type` and the `force_request_type` attribute. |
| - It manages the `_state` object. Initial default values are set in the |
| constructor (e.g., end date is now, begin date is 24 hours ago, default |
| `num_commits` is 50). |
| - Event handlers like `typeRange`, `typeDense`, `beginChange`, |
| `endChange`, and `numChanged` update the internal `_state` and then call |
| `render()` to reflect these changes in the UI. |
| - The `force_request_type` attribute (`'range'` or `'dense'`) allows the |
| consuming application to lock the picker into a specific mode, hiding |
| the radio buttons that would normally allow the user to switch. This is |
| useful when the application context dictates a specific type of domain |
| selection. The `attributeChangedCallback` and the getter/setter for |
| `force_request_type` handle this. |
| - It leverages other custom elements: `radio-sk` for mode selection and |
| `calendar-input-sk` for date picking, promoting modularity and reuse. |
| |
| - **`domain-picker-sk.scss`**: This file contains the SASS styles for the |
| component. |
| |
| - **Why**: It separates the presentation from the logic, making the |
| component easier to style and maintain. |
| - **How**: It defines styles for the layout of controls (e.g., using |
| flexbox to align items), descriptive text, input fields, and the |
| calendar input. It also imports shared styles from |
| `elements-sk/modules/styles` for consistency (e.g., buttons, colors). |
| |
| - **`index.ts`**: A simple entry point that imports and registers the |
| `domain-picker-sk` custom element. |
| |
| - **Why**: This is a common pattern for web components, making it easy for |
| other parts of the application to import and use the component. |
| - **How**: It executes `import './domain-picker-sk';` which ensures the |
| `DomainPickerSk` class is defined and registered with the browser's |
| `CustomElementRegistry` via the `define` function call within |
| `domain-picker-sk.ts`. |
| |
| - **`domain-picker-sk-demo.html` and `domain-picker-sk-demo.ts`**: These files |
| provide a demonstration page for the component. |
| |
| - **Why**: They allow developers to see the component in action, test its |
| different states and attributes, and serve as a basic example of how to |
| use it. |
| - **How**: `domain-picker-sk-demo.html` includes instances of |
| `<domain-picker-sk>`, some with the `force_request_type` attribute set. |
| `domain-picker-sk-demo.ts` initializes the `state` of these demo |
| instances with sample data. |
| |
| - **`domain-picker-sk_puppeteer_test.ts`**: Contains Puppeteer tests for the |
| component. |
| |
| - **Why**: To ensure the component renders correctly and behaves as |
| expected in a browser environment. |
| - **How**: It uses the `puppeteer-tests/util` library to load the demo |
| page and take screenshots, verifying the visual appearance of the |
| component in its default state. |
| |
| **Key Workflows/Processes:** |
| |
| 1. **Initialization and Rendering:** |
| |
| - `<domain-picker-sk>` element is added to the DOM. |
| - `connectedCallback` is invoked. |
| - Properties like `state` and `force_request_type` are upgraded (if set as |
| attributes before the element was defined). |
| - Default `_state` is established (e.g., end = now, begin = 24h ago, |
| mode = RANGE). |
| - `render()` is called: |
| - It checks `force_request_type`. If set, it overrides |
| `_state.request_type`. |
| - The main template is rendered. |
| - `_showRadio` decides whether to show mode selection radio buttons. |
| - `_requestType` renders either the "Begin" date input (for RANGE |
| mode) or the "Points" number input (for DENSE mode). |
| |
| ``` |
| [DOM Insertion] -> connectedCallback() -> _upgradeProperty('state') |
| -> _upgradeProperty('force_request_type') |
| -> render() |
| | |
| V |
| [UI Displayed] |
| ``` |
| |
| 2. **User Changes Mode (if `force_request_type` is not set):** |
| |
| - User clicks on "Date Range" or "Dense" radio button. |
| - `@change` event triggers `typeRange()` or `typeDense()`. |
| - `_state.request_type` is updated. |
| - `render()` is called. |
| - The UI updates to show the relevant inputs (Begin date vs. Points). |
| |
| ``` |
| [User clicks radio] -> typeRange()/typeDense() -> _state.request_type updated |
| -> render() |
| | |
| V |
| [UI Updates] |
| ``` |
| |
| 3. **User Changes Date/Number of Commits:** |
| |
| - User interacts with `<calendar-input-sk>` (for Begin/End dates) or the |
| `<input type="number">` (for Points). |
| - `@input` (for calendar) or `@change` (for number input) event triggers |
| `beginChange()`, `endChange()`, or `numChanged()`. |
| - The corresponding part of `_state` (e.g., `_state.begin`, `_state.end`, |
| `_state.num_commits`) is updated. |
| - `render()` is called. For date changes this is largely a formality: |
| `<calendar-input-sk>` handles its own visual update of the date |
| display, so re-rendering the parent only matters if other parts of |
| _this_ component's template depend on the changed values; in the |
| current implementation it may be redundant for date changes alone. |
| |
| ``` |
| [User changes input] -> beginChange()/endChange()/numChanged() |
| | |
| V |
| _state updated |
| | |
| V |
| render() // Potentially re-renders the component |
| | |
| V |
| [UI reflects new value] |
| ``` |
| |
| The component emits no custom events itself but relies on the events from its |
| child components (`radio-sk`, `calendar-input-sk`) to trigger internal state |
| updates and re-renders. Consumers of `domain-picker-sk` would typically read the |
| `state` property to get the user's selection. |
| |
| # Module: /modules/errorMessage |
| |
| The `errorMessage` module provides a wrapper around the `errorMessage` function |
| from the `elements-sk` library. Its primary purpose is to offer a more |
| convenient way to display persistent error messages to the user. |
| |
| **Core Functionality and Design Rationale:** |
| |
| The key differentiation of this module lies in its default behavior for message |
| display duration. While the `elements-sk` `errorMessage` function requires a |
| duration to be specified for how long a message (often referred to as a "toast") |
| remains visible, this module defaults the duration to `0`. |
| |
| This design choice is intentional: a duration of `0` typically signifies that |
| the error message will _not_ automatically close. This is particularly useful in |
| scenarios where an error is critical or requires user acknowledgment, and an |
| auto-dismissing message might be missed. By defaulting to a persistent display, |
| the module prioritizes ensuring the user is aware of the error. |
| |
| **Responsibilities and Key Components:** |
| |
| The module exposes a single function: `errorMessage`. |
| |
| - **`errorMessage(message: string | { message: string } | { resp: Response } | |
| object, duration: number = 0): void`**: |
| - This function is responsible for displaying an error message to the |
| user. |
| - It accepts the same flexible `message` parameter as the underlying |
| `elements-sk` function. This means it can handle plain strings, objects |
| with a `message` property, objects containing a `Response` object (from |
| which an error message can often be extracted), or generic objects. |
| - The crucial aspect is the `duration` parameter. If not explicitly |
| provided by the caller, it defaults to `0`. This default triggers the |
| persistent display behavior mentioned above. |
| - Internally, this function simply calls `elementsErrorMessage` from the |
| `elements-sk` library, passing along the provided `message` and the |
| (potentially defaulted) `duration`. |
| |
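| The whole module is therefore only a few lines. A minimal sketch, assuming |
| the elements-sk import path: |
| |
| ``` |
| // Import path is illustrative; the wrapper re-exports the same behavior |
| // with a different default duration. |
| import { errorMessage as elementsErrorMessage } from |
|   '../../../elements-sk/modules/errorMessage'; |
| |
| export function errorMessage( |
|   message: string | { message: string } | { resp: Response } | object, |
|   duration: number = 0 // 0 = do not auto-dismiss |
| ): void { |
|   elementsErrorMessage(message, duration); |
| } |
| ``` |
| |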
| **Workflow:** |
| |
| The typical workflow for using this module is straightforward: |
| |
| 1. **Import:** The `errorMessage` function is imported from this module. |
| 2. **Invocation:** When an error condition occurs that needs to be communicated |
| to the user persistently, the `errorMessage` function is called with the |
| error details. |
| - `errorMessage("A critical error occurred.")` -> Displays "A critical |
| error occurred." indefinitely. |
| - `errorMessage("Something went wrong.", 5000)` -> Displays "Something |
| went wrong." for 5 seconds (overriding the default). |
| |
| Essentially, this module acts as a thin convenience layer, promoting a specific |
| error display pattern (persistent messages) by changing the default behavior of |
| a more general utility. This reduces boilerplate for common use cases where |
| persistent error notification is desired. |
| |
| # Module: /modules/existing-bug-dialog-sk |
| |
| The `existing-bug-dialog-sk` module provides a user interface element for |
| associating performance anomalies with existing bug reports in a bug tracking |
| system (like Monorail). It's designed to be used within a larger performance |
| monitoring application where users need to triage and manage alerts generated by |
| performance regressions. |
| |
| The core purpose of this module is to simplify the workflow of linking one or |
| more detected anomalies to a pre-existing bug. Instead of manually navigating to |
| the bug tracker and updating the bug, users can do this directly from the |
| performance monitoring interface. This reduces context switching and streamlines |
| the bug management process. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`existing-bug-dialog-sk.ts`**: This is the heart of the module, defining |
| the custom HTML element `existing-bug-dialog-sk`. |
| |
| - **Why**: It encapsulates the entire UI and logic for the dialog. This |
| includes displaying a form for entering a bug ID, a dropdown for |
| selecting the bug tracking project (though currently hardcoded to |
| 'chromium'), and a list of already associated bugs for the selected |
| anomalies. |
| - **How**: |
| - It uses Lit for templating and rendering the dialog's HTML structure. |
| - It manages the dialog's visibility (`open()`, `closeDialog()`). |
| - It handles form submission: |
| - Takes the entered bug ID and the list of selected anomalies |
| (`_anomalies`). |
| - Makes an HTTP POST request to a backend endpoint |
| (`/_/triage/associate_alerts`) to create the association. |
| - Upon success, it opens the bug page in a new tab and dispatches a |
| custom event `anomaly-changed`. This event signals other parts of |
| the application (e.g., charts or lists displaying anomalies) that |
| the anomaly data has been updated (specifically, the `bug_id` field) |
| and they might need to re-render. |
| - Handles potential errors by displaying an error message toast. |
| - It fetches and displays a list of bugs already associated with the |
| anomalies in the current group. This involves: |
| - Making a POST request to `/_/anomalies/group_report` to get details |
| of anomalies in the same group, including their associated |
| `bug_id`s. This endpoint might return a `sid` (state ID) if the |
| report generation is asynchronous, requiring a follow-up request. |
| - Once the list of associated bug IDs is retrieved, it makes another |
| POST request to `/_/triage/list_issues` to fetch the titles of these |
| bugs. This provides more context to the user than just showing bug |
| IDs. |
| - The `setAnomalies()` method is crucial for initializing the dialog with |
| the relevant anomaly data when it's about to be shown. |
| - It relies on `window.perf.bug_host_url` to construct links to the bug |
| tracker. |
| |
| - **`existing-bug-dialog-sk.scss`**: This file contains the SASS/CSS styles |
| for the dialog. |
| |
| - **Why**: It ensures the dialog has a consistent look and feel with the |
| rest of the application, using shared theme variables |
| (`--on-background`, `--background`, etc.). |
| - **How**: It defines styles for the dialog container, input fields, |
| buttons, close icon, and the list of associated bugs. It also includes |
| specific styling for the loading spinner and selected items. |
| |
| - **`index.ts`**: This is a simple entry point that imports and registers the |
| `existing-bug-dialog-sk` custom element, making it available for use in |
| HTML. |
| |
| **Workflow for Associating Anomalies with an Existing Bug:** |
| |
| 1. **User Action**: The user selects one or more anomalies in the main |
| application interface and chooses an option to associate them with an |
| existing bug. |
| 2. **Dialog Initialization**: The application calls `setAnomalies()` on an |
| `existing-bug-dialog-sk` instance, passing the selected anomalies. |
| 3. **Dialog Display**: The application calls `open()` on the dialog instance. |
| ``` |
| Application -- setAnomalies(anomalies) --> existing-bug-dialog-sk |
| Application -- open() -------------------> existing-bug-dialog-sk |
| existing-bug-dialog-sk -- fetch_associated_bugs() --> /_/anomalies/group_report |
| existing-bug-dialog-sk <-- (associated bug IDs) ------ Backend API |
| existing-bug-dialog-sk -- fetch_bug_titles() --------> /_/triage/list_issues |
| existing-bug-dialog-sk <-- (bug titles) -------------- Backend API |
| existing-bug-dialog-sk -- renders dialog with form and associated-bugs list |
| ``` |
| 4. **User Interaction**: |
| - The user sees the dialog. |
| - If there are other anomalies in the same group already linked to bugs, |
| these bugs (ID and title) are listed. |
| - The user enters a Bug ID into the input field. |
| - The user clicks the "Submit" button. |
| 5. **Form Submission and Backend Communication**: |
| ``` |
| User submits form |
| | |
| V |
| _spinner.active = true (UI: show spinner) |
| | |
| V |
| fetch('/_/triage/associate_alerts', POST, {bug_id, keys}) --> Backend API |
| | |
| V |
| _spinner.active = false (UI: hide spinner) |
| | |
| V |
| IF Success: closeDialog() (UI: hide dialog) |
| window.open(bug_url) (opens bug in new tab) |
| dispatchEvent('anomaly-changed') --> Application (notifies others) |
| IF Failure: errorMessage(msg) (UI: show error toast) |
| ``` |
| 6. **Outcome**: |
| - **Success**: The anomalies are linked to the specified bug in the |
| backend. The dialog closes, the bug page opens in a new tab, and other |
| parts of the UI (listening for `anomaly-changed`) update to reflect the |
| new association. |
| - **Failure**: An error message is shown, and the dialog remains open, |
| allowing the user to try again or correct the input. |
| |
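| From the embedding application's side, the interaction surface is small. A |
| hypothetical sketch (`setAnomalies`'s argument list is simplified relative |
| to the real method, and the type import path is an assumption): |
| |
| ``` |
| import { ExistingBugDialogSk } from './existing-bug-dialog-sk'; |
| import { Anomaly } from '../json'; // assumed path for the generated type |
| |
| declare const selectedAnomalies: Anomaly[]; // chosen in the triage UI |
| |
| const dialog = |
|   document.querySelector<ExistingBugDialogSk>('existing-bug-dialog-sk')!; |
| dialog.setAnomalies(selectedAnomalies); |
| dialog.open(); |
| |
| document.addEventListener('anomaly-changed', () => { |
|   // bug_id fields changed on some anomalies; re-render charts/lists. |
| }); |
| ``` |
| |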
| The design prioritizes a clear and focused user experience for a common task in |
| performance alert triaging. By integrating directly with the backend API for bug |
| association and fetching related bug information, it aims to be an efficient |
| tool for developers and SREs. The use of custom events allows for loose coupling |
| with other components in the larger application. |
| |
| # Module: /modules/explore-multi-sk |
| |
| ## explore-multi-sk Module |
| |
| ### Overview |
| |
| The `explore-multi-sk` module provides a user interface for displaying and |
| interacting with multiple performance data graphs simultaneously. This is |
| particularly useful when users need to compare different metrics, |
| configurations, or time ranges side-by-side. The core idea is to leverage the |
| functionality of individual `explore-simple-sk` elements, which represent single |
| graphs, and manage their states and interactions within a unified multi-graph |
| view. |
| |
| ### Key Design Decisions and Implementation Choices |
| |
| **State Management:** A central `State` object within `explore-multi-sk` manages |
| properties that are common across all displayed graphs. These include the time |
| range (`begin`, `end`), display options (`showZero`, `dots`), and pagination |
| settings (`pageSize`, `pageOffset`). This approach simplifies the overall state |
| management and keeps the URL from becoming overly complex, as only a limited set |
| of shared parameters need to be reflected in the URL. |
| |
| Each individual graph (`explore-simple-sk` instance) maintains its own specific |
| state related to the data it displays (formulas, queries, selected keys). |
| `explore-multi-sk` stores an array of `GraphConfig` objects, where each object |
| corresponds to an `explore-simple-sk` instance and holds its unique |
| configuration. |
| |
| The `stateReflector` utility is used to synchronize the shared `State` with the |
| URL, allowing for bookmarking and sharing of multi-graph views. |
| |
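| Sketched as interfaces (field names are inferred from the description above; |
| the generated types in the module are authoritative): |
| |
| ``` |
| // Shared across all graphs; reflected into the URL by stateReflector. |
| interface State { |
|   begin: number;      // shared time range, Unix seconds |
|   end: number; |
|   showZero: boolean;  // display options |
|   dots: boolean; |
|   pageSize: number;   // pagination |
|   pageOffset: number; |
|   shortcut: string;   // ID under which the graphConfigs are stored |
| } |
| |
| // Per-graph configuration, one entry per explore-simple-sk instance. |
| interface GraphConfig { |
|   formulas: string[]; // calculated traces |
|   queries: string[];  // trace-selection queries |
|   keys: string;       // shortcut ID for explicitly selected trace keys |
| } |
| ``` |
| |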
| **Dynamic Graph Addition and Removal:** Users can dynamically add new graphs to |
| the view. When a new graph is added, an empty `explore-simple-sk` instance is |
| created and the user can then configure its data source (query or formula). |
| |
| If the `useTestPicker` option is enabled (often determined by backend defaults), |
| instead of a simple "Add Graph" button, a `test-picker-sk` element is displayed. |
| This component provides a more structured way to select tests and parameters, |
| and upon selection, a new graph is automatically generated and populated. |
| |
| Graphs can also be removed. Event listeners are in place to handle |
| `remove-explore` custom events, which are typically dispatched by the individual |
| `explore-simple-sk` elements when a user closes them in a "Multiview" context |
| (where `useTestPicker` is active). |
| |
| **Pagination:** To handle potentially large numbers of graphs, pagination is |
| implemented using the `pagination-sk` element. This allows users to view a |
| subset of the total graphs at a time, improving performance and usability. The |
| `pageSize` and `pageOffset` are part of the shared state. |
| |
| **Graph Manipulation (Split and Merge):** |
| |
| - **Split Graph:** If a single graph displaying multiple traces is present, |
| the "Split Graph" functionality allows the user to create separate graphs |
| for each of those traces. This is useful for focusing on individual trends |
| that were previously combined. |
| - **Merge Graphs:** Conversely, the "Merge Graphs" functionality takes all |
| currently displayed graphs and combines their traces into a single graph. |
| This can be helpful for seeing an aggregated view. |
| |
| These operations primarily involve manipulating the `graphConfigs` array and |
| then re-rendering the graphs. |
| |
| **Shortcuts:** The module supports saving and loading multi-graph configurations |
| using shortcuts. When the configuration of graphs changes (traces added/removed, |
| graphs split/merged), `updateShortcutMultiview` is called. This function |
| communicates with a backend service (`/_/shortcut/get` and a corresponding save |
| endpoint invoked by `updateShortcut` from `explore-simple-sk`) to store or |
| retrieve the `graphConfigs` associated with a unique shortcut ID. This ID is |
| then reflected in the URL, allowing users to share specific multi-graph setups. |
| |
| **Synchronization of Interactions:** |
| |
| - **X-Axis Label:** When the x-axis label (e.g., switching between commit |
| number and date) is toggled on one graph, a custom event `x-axis-toggled` is |
| dispatched. `explore-multi-sk` listens for this and updates the x-axis on |
| all other visible graphs to maintain consistency. |
| - **Chart Selection (Plot Summary):** Plot selection itself is handled |
| within each `explore-simple-sk` instance. When the `plotSummary` feature |
| is active, a selection made on one graph can be propagated to the |
| others; `explore-multi-sk` provides a `syncChartSelection` method that |
| handles this cross-graph synchronization. |
| |
| **Defaults and Configuration:** The component fetches default configurations |
| from a `/_/defaults/` endpoint. These defaults can influence various aspects, |
| such as: |
| |
| - Whether to use `test-picker-sk` (`useTestPicker`). |
| - Default parameters and their order for `test-picker-sk` |
| (`include_params`, `default_param_selections`). |
| |
| This allows for instance-specific customization of the Perf UI. |
| |
| ### Responsibilities and Key Components |
| |
| - **`explore-multi-sk.ts`**: |
| |
| - **Responsibilities**: This is the main TypeScript file defining the |
| `ExploreMultiSk` custom element. It is responsible for: |
| - Managing the overall state of the multi-graph view (shared properties |
| like time range, pagination). |
| - Handling the addition, removal, and configuration of individual |
| `explore-simple-sk` graph elements. |
| - Interacting with the `stateReflector` to update the URL based on the |
| shared state. |
| - Implementing the "Split Graph" and "Merge Graphs" functionalities. |
| - Managing pagination for the displayed graphs. |
| - Fetching and applying default configurations. |
| - Coordinating interactions between graphs (e.g., synchronizing x-axis |
| labels). |
| - Interacting with the `test-picker-sk` if enabled. |
| - Handling user authentication status for features like "Add to |
| Favorites". |
| - Managing shortcuts for saving and loading multi-graph configurations. |
| - **Key Interactions**: |
| - Creates and manages instances of `explore-simple-sk`. |
| - Uses `pagination-sk` for displaying graphs in pages. |
| - Uses `test-picker-sk` for adding graphs when `useTestPicker` is true. |
| - Uses `favorites-dialog-sk` to allow users to save graph configurations. |
| - Communicates with backend services for shortcuts and default |
| configurations. |
| |
| - **`explore-multi-sk.html` (Inferred from the Lit `html` template in |
| `explore-multi-sk.ts`)**: |
| |
| - **Responsibilities**: Defines the structure of the `explore-multi-sk` |
| element. This includes: |
| - A menu section with buttons for "Add Graph", "Split Graph", "Merge |
| Graphs", and "Add to Favorites". |
| - The `test-picker-sk` element (conditionally visible). |
| - `pagination-sk` elements for navigating through graph pages. |
| - A container (`#graphContainer`) where the individual `explore-simple-sk` |
| elements are dynamically rendered. |
| - **Key Components**: |
| - `<button>` elements for user actions. |
| - `<test-picker-sk>` for test selection. |
| - `<pagination-sk>` for graph pagination. |
| - `<favorites-dialog-sk>` for saving favorites. |
| - A `div` (`#graphContainer`) to hold the `explore-simple-sk` instances. |
| |
| - **`explore-multi-sk.scss`**: |
| |
| - **Responsibilities**: Provides the styling for the `explore-multi-sk` |
| element and its children. It ensures that the layout is appropriate for |
| displaying multiple graphs and their controls. |
| - **Key Aspects**: |
| - Styles the `#menu` and `#pagination` areas. |
| - Defines the height of embedded `explore-simple-sk` plots. |
| - Handles the conditional visibility of elements like `#test-picker` and |
| `#add-graph-button`. |
| |
| ### Key Workflows |
| |
| **1. Initial Load and State Restoration:** |
| |
| ``` |
| User navigates to URL with explore-multi-sk |
| | |
| V |
| explore-multi-sk.connectedCallback() |
| | |
| V |
| Fetch defaults from /_/defaults/ |
| | |
| V |
| stateReflector() is initialized |
| | |
| V |
| State is read from URL (or defaults if URL is empty) |
| | |
| V |
| IF state.shortcut is present: |
| Fetch graphConfigs from /_/shortcut/get using the shortcut ID |
| | |
| V |
| ELSE (or after fetching): |
| For each graphConfig (or if starting fresh, often one empty graph is implied or added): |
| Create/configure explore-simple-sk instance |
| Set its state based on graphConfig and shared state |
| | |
| V |
| Add graphs to the current page based on pagination settings |
| | |
| V |
| Render the component |
| ``` |
| |
| **2. Adding a Graph (without Test Picker):** |
| |
| ``` |
| User clicks "Add Graph" button |
| | |
| V |
| explore-multi-sk.addEmptyGraph() is called |
| | |
| V |
| A new ExploreSimpleSk instance is created |
| A new empty GraphConfig is added to this.graphConfigs |
| | |
| V |
| explore-multi-sk.updatePageForNewExplore() |
| | |
| V |
| IF current page is full: |
| Increment pageOffset (triggering pageChanged) |
| ELSE: |
| Add new graph to current page |
| | |
| V |
| The new explore-simple-sk element might open its query dialog for the user |
| ``` |
| |
| **3. Adding a Graph (with Test Picker):** |
| |
| ``` |
| TestPickerSk is visible (due to defaults or state) |
| | |
| V |
| User interacts with TestPickerSk, selects tests/parameters |
| | |
| V |
| User clicks "Plot" button in TestPickerSk |
| | |
| V |
| TestPickerSk dispatches 'plot-button-clicked' event |
| | |
| V |
| explore-multi-sk listens for 'plot-button-clicked' |
| | |
| V |
| explore-multi-sk.addEmptyGraph(unshift=true) is called (new graph at the top) |
| | |
| V |
| explore-multi-sk.addGraphsToCurrentPage() updates the view |
| | |
| V |
| TestPickerSk.createQueryFromFieldData() gets the query |
| | |
| V |
| The new ExploreSimpleSk instance has its query set |
| ``` |
| |
| **4. Splitting a Graph:** |
| |
| ``` |
| User has one graph with multiple traces and clicks "Split Graph" |
| | |
| V |
| explore-multi-sk.splitGraph() |
| | |
| V |
| this.getTracesets() retrieves traces from the first (and only) graph |
| | |
| V |
| this.clearGraphs() removes the existing graph configuration |
| | |
| V |
| FOR EACH trace in the retrieved traceset: |
| this.addEmptyGraph() |
| A new GraphConfig is created for this trace (e.g., config.queries = [queryFromKey(trace)]) |
| | |
| V |
| this.updateShortcutMultiview() (new shortcut reflecting multiple graphs) |
| | |
| V |
| this.state.pageOffset is reset to 0 |
| | |
| V |
| this.addGraphsToCurrentPage() renders the new set of individual graphs |
| ``` |
| |
| **5. Saving/Updating a Shortcut:** |
| |
| ``` |
| Graph configuration changes (e.g., trace added/removed, graph split/merged, new graph added) |
| | |
| V |
| explore-multi-sk.updateShortcutMultiview() is called |
| | |
| V |
| Calls exploreSimpleSk.updateShortcut(this.graphConfigs) |
| | |
| V |
| (Inside updateShortcut) |
| IF graphConfigs is not empty: |
| POST this.graphConfigs to backend (e.g., /_/shortcut/new or /_/shortcut/update) |
| Backend returns a new or existing shortcut ID |
| | |
| V |
| explore-multi-sk.state.shortcut is updated with the new ID |
| | |
| V |
| this.stateHasChanged() is called, triggering stateReflector to update the URL |
| ``` |
| |
| # Module: /modules/explore-simple-sk |
| |
| The `explore-simple-sk` module provides a custom HTML element for exploring and |
| visualizing performance data. It allows users to query, plot, and analyze |
| traces, identify anomalies, and interact with commit details. This element is a |
| core component of the Perf application's data exploration interface. |
| |
| **Core Functionality:** |
| |
| The element's primary responsibility is to provide a user interface for: |
| |
| 1. **Querying Data:** Users can construct queries to select specific traces |
| based on various parameters. |
| 2. **Plotting Traces:** Selected traces are rendered on a graph, allowing for |
| visual inspection of performance trends. |
| 3. **Analyzing Data:** Users can interact with the plot to zoom, pan, and |
| select individual data points for detailed inspection. |
| 4. **Anomaly Detection:** The element integrates with anomaly detection |
| services to highlight and manage performance regressions or improvements. |
| 5. **Commit Details:** Information about the commits associated with data |
| points can be displayed, linking performance changes to specific code |
| modifications. |
| |
| **Key Design Decisions and Implementation Choices:** |
| |
| - **State Management:** The element's state (e.g., current query, time range, |
| plot settings) is managed internally and reflected in the URL. This allows |
| users to share specific views of the data and enables bookmarking. The |
| `State` class in `explore-simple-sk.ts` defines the structure of this state. |
| - **Data Fetching:** Data is fetched asynchronously from the backend using the |
| `/_/frame/start` endpoint. The `requestFrame` method handles initiating these |
| requests and processing the responses. The `FrameRequest` and |
| `FrameResponse` types define the communication contract with the server. |
| - **Plotting Library:** The module supports two plotting libraries: |
| `plot-simple-sk` (a custom canvas-based plotter) and `plot-google-chart-sk` |
| (which wraps Google Charts). The choice of plotter can be configured. |
| - **Component-Based Architecture:** The UI is built using a collection of |
| smaller, specialized custom elements (e.g., `query-sk` for query input, |
| `paramset-sk` for displaying parameters, `commit-detail-panel-sk` for commit |
| information). This promotes modularity and reusability. |
| - **Event-Driven Communication:** Components communicate with each other and |
| with the main `explore-simple-sk` element through custom events. For |
| example, when a query changes in `query-sk`, it emits a `query-change` event |
| that `explore-simple-sk` listens to (see the wiring sketch after this list). |
| - **Caching and Optimization:** To improve performance, the element employs |
| strategies like incremental data loading when panning and caching commit |
| details. |
| |
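| A sketch of that `query-change` wiring (the handler and method names are |
| illustrative, not the element's actual private API): |
| |
| ``` |
| // Minimal stand-in for the element surface used in this sketch. |
| interface ExploreLike extends HTMLElement { |
|   state: { queries: string[] }; |
|   requestFrame(): void; |
| } |
| |
| // As the wiring might appear in the element's connectedCallback(). |
| function wireQueryEvents(ele: ExploreLike) { |
|   ele.querySelector('query-sk')!.addEventListener('query-change', (e: Event) => { |
|     const q = (e as CustomEvent<{ q: string }>).detail.q; |
|     ele.state.queries = [q]; // the state change is reflected into the URL |
|     ele.requestFrame();      // POSTs a FrameRequest to /_/frame/start |
|   }); |
| } |
| ``` |
| |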
| **Key Files and Components:** |
| |
| - **`explore-simple-sk.ts`:** This is the main TypeScript file that defines |
| the `ExploreSimpleSk` custom element. It handles: |
| - State management and URL reflection. |
| - Data fetching and processing. |
| - Rendering the UI template. |
| - Event handling and coordination between child components. |
| - Interaction logic for plotting, zooming, selecting points, etc. |
| - **`explore-simple-sk.html` (embedded in `explore-simple-sk.ts`):** This |
| Lit-html template defines the structure of the element's UI. It includes |
| placeholders for various child components and dynamic content. |
| - **`explore-simple-sk.scss`:** This SCSS file provides the styling for the |
| element and its components. |
| - **Child Components (imported in `explore-simple-sk.ts`):** |
| - `query-sk`: For constructing and managing queries. |
| - `paramset-sk`: For displaying and interacting with parameter sets. |
| - `plot-simple-sk` / `plot-google-chart-sk`: For rendering the plots. |
| - `commit-detail-panel-sk`: For displaying commit information. |
| - `anomaly-sk`: For displaying and managing anomalies. |
| - Many other components for specific UI elements like dialogs, buttons, |
| and icons. |
| |
| **Workflow Example: Plotting a Query** |
| |
| 1. **User Interaction:** The user interacts with the `query-sk` element to |
| define a query. |
| 2. **Event Emission:** `query-sk` emits a `query-change` event with the new |
| query. |
| 3. **State Update:** `explore-simple-sk` listens for this event, updates its |
| internal state (specifically the `queries` array in the `State` object), and |
| triggers a re-render. |
| 4. **Data Request:** `explore-simple-sk` constructs a `FrameRequest` based on |
| the updated state and calls `requestFrame` to fetch data from the server. |
| `User Input (query-sk) -> Event (query-change) -> State Update |
| (ExploreSimpleSk) -> Data Request (requestFrame)` |
| 5. **Data Processing:** Upon receiving the `FrameResponse`, `explore-simple-sk` |
| processes the data, updates its internal `_dataframe` object, and prepares |
| the data for plotting. |
| 6. **Plot Rendering:** `explore-simple-sk` passes the processed data to the |
| `plot-simple-sk` or `plot-google-chart-sk` element, which then renders the |
| traces on the graph. `Server Response (FrameResponse) -> Data Processing |
| (ExploreSimpleSk) -> Plot Update (plot-simple-sk/plot-google-chart-sk) -> |
| Visual Output` |
| 7. **URL Update:** The state change is reflected in the URL, allowing the user |
| to bookmark or share the current view. |
| |
| This workflow illustrates the reactive nature of the element, where user |
| interactions trigger state changes, which in turn lead to data fetching and UI |
| updates. |
| |
| # Module: /modules/explore-sk |
| |
| The `explore-sk` module serves as the primary user interface for exploring and |
| analyzing performance data within the Perf application. It provides a |
| comprehensive view for users to query, visualize, and interact with performance |
| traces. |
| |
| The core functionality of `explore-sk` is built upon the `explore-simple-sk` |
| element. `explore-sk` acts as a wrapper, enhancing `explore-simple-sk` with |
| additional features like user authentication integration, default configuration |
| loading, and the optional `test-picker-sk` for more guided query construction. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`explore-sk.ts`**: This is the main TypeScript file defining the |
| `ExploreSk` custom element. |
| |
| - **Why**: It orchestrates the interaction between various sub-components |
| and manages the overall state of the exploration page. |
| - **How**: |
| - It initializes by fetching default configurations (e.g., query |
| parameters, display settings) from a backend endpoint (`/_/defaults/`). |
| This ensures that the exploration view is pre-configured with sensible |
| starting points. |
| - It integrates with `alogin-sk` to determine the logged-in user's status. |
| This information is used to enable features like "favorites" if a user |
| is logged in. |
| - It utilizes `stateReflector` to persist and restore the state of the |
| underlying `explore-simple-sk` element in the URL. This allows users to |
| share specific views or bookmark their current exploration state. |
| - It conditionally initializes and displays `test-picker-sk`. If the |
| `use_test_picker_query` flag is set in the state (often via URL |
| parameters or defaults), the `test-picker-sk` component is shown, |
| providing a structured way to build queries based on available parameter |
| keys and values. |
| - It listens for events from `test-picker-sk` (e.g., |
| `plot-button-clicked`, `remove-all`, `populate-query`) and translates |
| these into actions on the `explore-simple-sk` element, such as adding |
| new traces based on the selected test parameters or clearing the view. |
| - It provides buttons like "View in multi-graph" and "Toggle Chart Style" |
| which directly interact with methods exposed by `explore-simple-sk`. |
| |
| - **`explore-simple-sk` (imported module)**: This is a fundamental building |
| block that handles the core trace visualization, querying logic, and |
| interaction with the graph. |
| |
| - **Why**: Encapsulates the complex logic of fetching trace data, |
| rendering graphs, and handling user interactions like zooming, panning, |
| and selecting traces. |
| - **How**: `explore-sk` delegates most of the heavy lifting related to |
| data exploration to this component. It passes down the initial state, |
| default configurations, and user-specific settings. |
| |
| - **`test-picker-sk` (imported module)**: A component that allows users to |
| build queries by selecting from available test parameters and their values. |
| |
| - **Why**: Simplifies the query construction process, especially when |
| dealing with a large number of possible parameters. It provides a more |
| user-friendly alternative to manually typing complex query strings. |
| - **How**: When active, it presents a UI for selecting dimensions and |
| values. Upon user action (e.g., clicking a "plot" button), it emits an |
| event with the constructed query, which `explore-sk` then uses to fetch |
| and display the corresponding traces via `explore-simple-sk`. It can |
| also be populated based on a highlighted trace, allowing users to |
| quickly refine queries based on existing data. |
| |
| - **`favorites-dialog-sk` (imported module)**: Enables users to save and |
| manage their favorite query configurations. |
| |
| - **Why**: Provides a convenient way for users to quickly return to |
| frequently used or important exploration views. |
| - **How**: Integrated into `explore-simple-sk` and its functionality is |
| enabled by `explore-sk` based on the user's login status. |
| |
| - **State Management (`stateReflector`)**: |
| |
| - **Why**: To make the exploration state shareable and bookmarkable. |
| Changes in the exploration view (queries, zoom levels, etc.) are |
| reflected in the URL. |
| - **How**: `explore-sk` uses `stateReflector` to listen for state changes |
| in `explore-simple-sk`. When the state changes, `stateReflector` updates |
| the URL. Conversely, when the page loads or the URL changes, |
| `stateReflector` parses the URL and applies the state to |
| `explore-simple-sk`. |
| |
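| A sketch of the stateReflector wiring described above (the import paths and |
| exact callback bodies are assumptions): |
| |
| ``` |
| import { stateReflector } from '../../../infra-sk/modules/stateReflector'; |
| import { HintableObject } from '../../../infra-sk/modules/hintable'; |
| import { ExploreSimpleSk, State } from '../explore-simple-sk/explore-simple-sk'; |
| |
| declare const exploreSimpleSk: ExploreSimpleSk; // the wrapped element |
| |
| // setState applies URL state on load and on back/forward navigation; calling |
| // the returned function after a state change re-encodes the state in the URL. |
| const stateHasChanged = stateReflector( |
|   () => exploreSimpleSk.state as unknown as HintableObject, |
|   (hintable: HintableObject) => { |
|     exploreSimpleSk.state = hintable as unknown as State; |
|   } |
| ); |
| ``` |
| |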
| **Workflow Example: Initial Page Load with Test Picker** |
| |
| 1. `explore-sk` element is connected to the DOM. |
| 2. `connectedCallback` is invoked: |
| - Renders its initial template. |
| - Fetches default configurations from `/_/defaults/`. |
| - `stateReflector` is initialized. If the URL contains state for |
| `explore-simple-sk`, it's applied. |
| - The state might indicate `use_test_picker_query = true`. |
| 3. If `use_test_picker_query` is true: |
| - `initializeTestPicker()` is called. |
| - `test-picker-sk` element is made visible. |
| - `test-picker-sk` is initialized with parameters from the defaults (e.g., |
| `include_params`, `default_param_selections`) or from existing queries |
| in the state. |
| 4. User interacts with `test-picker-sk` to select desired test parameters. |
| 5. User clicks the "Plot" button within `test-picker-sk`. |
| 6. `test-picker-sk` emits a `plot-button-clicked` event. |
| 7. `explore-sk` listens for this event: |
| - It retrieves the query constructed by `test-picker-sk`. |
| - It calls `exploreSimpleSk.addFromQueryOrFormula()` to add the new traces |
| to the graph. |
| 8. `explore-simple-sk` fetches the data, renders the traces, and emits a |
| `state_changed` event. |
| 9. `stateReflector` captures this `state_changed` event and updates the URL to |
| reflect the new query. |
| |
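| The sketch below illustrates steps 6-8 of this workflow. It is a minimal |
| illustration only: the event name and `addFromQueryOrFormula()` come from |
| this documentation, while the event-detail shape and the method signature |
| are assumptions made for the example. |
| |
| ```typescript |
| // Hypothetical wiring inside explore-sk; the real class members may differ. |
| type ExploreSimpleSkLike = HTMLElement & { |
|   addFromQueryOrFormula: (query: string) => void; // Signature assumed. |
| }; |
| |
| const testPicker = document.querySelector('test-picker-sk')!; |
| const exploreSimpleSk = document.querySelector( |
|   'explore-simple-sk' |
| )! as ExploreSimpleSkLike; |
| |
| testPicker.addEventListener('plot-button-clicked', (e: Event) => { |
|   // Assumed: the query built by test-picker-sk travels in the event detail. |
|   const { query } = (e as CustomEvent<{ query: string }>).detail; |
|   exploreSimpleSk.addFromQueryOrFormula(query); |
| }); |
| ``` |
| |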
| This workflow illustrates how `explore-sk` acts as a central coordinator, |
| integrating various specialized components to provide a cohesive data |
| exploration experience. The design emphasizes modularity, with |
| `explore-simple-sk` handling the core plotting and `test-picker-sk` offering an |
| alternative query input mechanism, all managed and presented by `explore-sk`. |
| |
| # Module: /modules/favorites-dialog-sk |
| |
| The `favorites-dialog-sk` module provides a custom HTML element that displays a |
| modal dialog for users to add or edit "favorites." Favorites, in this context, |
| are likely user-defined shortcuts or bookmarks to specific views or states |
| within the application, identified by a name, description, and a URL. |
| |
| **Core Functionality and Design:** |
| |
| The primary purpose of this module is to present a user-friendly interface for |
| managing these favorites. It's designed as a modal dialog to ensure that the |
| user's focus is on the task of adding or editing a favorite without distractions |
| from the underlying page content. |
| |
| **Key Components:** |
| |
| - **`favorites-dialog-sk.ts`**: This is the heart of the module, defining the |
| `FavoritesDialogSk` custom element. |
| |
| - **Why**: It encapsulates the logic for displaying the dialog, handling |
| user input, and interacting with a backend service to persist favorite |
| data. |
| - **How**: |
| - It extends `ElementSk`, a base class for custom elements in the Skia |
| infrastructure, providing a common foundation. |
| - It uses the Lit library (`lit/html.js`) for templating, allowing for |
| declarative and efficient rendering of the dialog's UI. |
| - The `open()` method is the public API for triggering the dialog. It |
| accepts optional parameters for pre-filling the form when editing an |
| existing favorite. Crucially, it returns a `Promise`. This promise-based |
| approach is a key design choice. It resolves when the favorite is |
| successfully saved and rejects if the user cancels the dialog. This |
| allows the calling code (likely a parent component managing the list of |
| favorites) to react appropriately, for instance, by re-fetching the |
| updated list of favorites only when a change has actually occurred. |
| - Input fields for "Name," "Description," and "URL" capture the necessary |
| information. The "Name" and "URL" fields are mandatory. |
| - The `confirm()` method handles the submission logic. It performs basic |
| validation (checking for empty name and URL) and then makes an HTTP POST |
| request to either `/_/favorites/new` or `/_/favorites/edit` depending on |
| whether a new favorite is being created or an existing one is being |
| modified. |
| - A `spinner-sk` element is used to provide visual feedback to the user |
| during the asynchronous operation of saving the favorite. |
| - Error handling is implemented using `errorMessage` to display issues to |
| the user, such as network errors or validation failures from the |
| backend. |
| - The `dismiss()` method handles the cancellation of the dialog, rejecting |
| the promise returned by `open()`. |
| - Input event handlers (`filterName`, `filterDescription`, `filterUrl`) |
| update the component's internal state as the user types, and trigger |
| re-renders via `this._render()`. |
| |
| - **`favorites-dialog-sk.scss`**: This file contains the SASS styles for the |
| dialog. |
| |
| - **Why**: It separates the presentation concerns from the JavaScript |
| logic, making the component more maintainable. |
| - **How**: It defines styles for the `<dialog>` element, input fields, |
| labels, and buttons, ensuring a consistent look and feel within the |
| application's theme (as indicated by `@import |
| '../themes/themes.scss';`). |
| |
| - **`favorites-dialog-sk-demo.html` / `favorites-dialog-sk-demo.ts`**: These |
| files provide a demonstration page for the `favorites-dialog-sk` element. |
| |
| - **Why**: This allows developers to see the component in isolation, test |
| its functionality, and understand how to integrate it. |
| - **How**: The HTML sets up a basic page with buttons to trigger the |
| dialog in "new favorite" and "edit favorite" modes. The TypeScript file |
| wires up event listeners on these buttons to call the `open()` method of |
| the `favorites-dialog-sk` element with appropriate parameters. |
| |
| **Workflow: Adding/Editing a Favorite** |
| |
| A typical workflow involving this dialog would be: |
| |
| 1. **User Action**: The user clicks a button (e.g., "Add Favorite" or an "Edit" |
| icon next to an existing favorite) in the main application UI. |
| 2. **Dialog Invocation**: The event handler for this action calls the `open()` |
| method of an instance of `favorites-dialog-sk`. |
| |
| - If adding a new favorite, `open()` might be called with minimal or no |
| arguments, defaulting the URL to the current page. |
| - If editing, `open()` is called with the `favId`, `name`, `description`, |
| and `url` of the favorite to be edited. |
| |
| ``` |
| User clicks "Add New" --> favoritesDialog.open('', '', '', 'current.page.url') |
| | |
| V |
| Dialog Appears |
| | |
| V |
| User fills form, clicks "Save" --> confirm() is called |
| | |
| V |
| POST /_/favorites/new |
| | |
| V (Success) |
| Dialog closes, open() Promise resolves |
| | |
| V |
| Calling component re-fetches favorites |
| |
| -------------------------------- OR --------------------------------- |
| |
| User clicks "Edit Favorite" --> favoritesDialog.open('id123', 'My Fav', 'Desc', 'fav.url.com') |
| | |
| V |
| Dialog Appears (pre-filled) |
| | |
| V |
| User modifies form, clicks "Save" --> confirm() is called |
| | |
| V |
| POST /_/favorites/edit (with 'id123') |
| | |
| V (Success) |
| Dialog closes, open() Promise resolves |
| | |
| V |
| Calling component re-fetches favorites |
| |
| -------------------------------- OR --------------------------------- |
| |
| User clicks "Cancel" or Close Icon --> dismiss() is called |
| | |
| V |
| Dialog closes, open() Promise rejects |
| | |
| V |
| Calling component does nothing (no re-fetch) |
| ``` |
| |
| 3. **User Interaction**: The user fills in or modifies the "Name," |
| "Description," and "URL" fields in the dialog. |
| |
| 4. **Submission/Cancellation**: |
| |
| - **Save**: The user clicks the "Save" button. |
| - The `confirm()` method is invoked. |
| - Input validation (name and URL not empty) is performed. |
| - A `fetch` request is made to the backend API (`/_/favorites/new` or |
| `/_/favorites/edit`). |
| - A spinner is shown during the API call. |
| - Upon successful completion, the dialog closes, and the `Promise` |
| returned by `open()` resolves. |
| - If the API call fails, an error message is displayed, and the dialog |
| remains open (or the promise might reject depending on specific |
| error handling in `confirm`). |
| - **Cancel**: The user clicks the "Cancel" button or the close icon. |
| - The `dismiss()` method is invoked. |
| - The dialog closes. |
| - The `Promise` returned by `open()` rejects. |
| |
| 5. **Post-Dialog Action**: The component that initiated the dialog (e.g., a |
| `favorites-sk` list component) uses the resolved/rejected state of the |
| `Promise` to decide whether to refresh its list of favorites. This is a key |
| aspect of the design – it avoids unnecessary re-fetches if the user simply |
| cancels the dialog. |
| |
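| A caller-side sketch of the promise contract described in step 5, written |
| against a hypothetical parent component; the element tag and the `open()` |
| parameter order follow this documentation. |
| |
| ```typescript |
| import { FavoritesDialogSk } from './favorites-dialog-sk'; // Path assumed. |
| |
| declare function refetchFavorites(): Promise<void>; // Hypothetical helper. |
| |
| const dialog = document.querySelector( |
|   'favorites-dialog-sk' |
| ) as FavoritesDialogSk; |
| |
| async function onEditClicked( |
|   favId: string, name: string, desc: string, url: string |
| ): Promise<void> { |
|   try { |
|     // Resolves only when the favorite was actually saved. |
|     await dialog.open(favId, name, desc, url); |
|     await refetchFavorites(); // A change occurred, so refresh the list. |
|   } catch { |
|     // Rejected: the user cancelled, so there is nothing to refresh. |
|   } |
| } |
| ``` |
| |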
| The design prioritizes a clear separation of concerns, using custom elements for |
| UI encapsulation, SASS for styling, and a promise-based API for asynchronous |
| operations and communication with parent components. This makes the |
| `favorites-dialog-sk` a reusable and well-defined piece of UI for managing user |
| favorites. |
| |
| # Module: /modules/favorites-sk |
| |
| The `favorites-sk` module provides a user interface element for displaying and |
| managing a user's "favorites". Favorites are essentially bookmarked URLs, |
| categorized into sections. This module allows users to view their favorited |
| links, edit their details (name, description, URL), and delete them. |
| |
| **Core Functionality & Design:** |
| |
| The primary responsibility of `favorites-sk` is to fetch favorite data from a |
| backend endpoint (`/_/favorites/`) and render it in a user-friendly way. It also |
| handles interactions for modifying these favorites, such as editing and |
| deleting. |
| |
| - **Data Fetching and Rendering:** |
| |
| - Upon connection to the DOM (`connectedCallback`), the element attempts |
| to fetch the favorites configuration from the backend. |
| - The fetched data, expected to be in a `Favorites` JSON format (defined |
| in `perf/modules/json`), is stored in the `favoritesConfig` property. |
| - The `_render()` method is called to update the display. |
| - The rendering logic iterates through sections and then links within each |
| section, generating an HTML table for display. |
| - A key design choice is to distinguish "My Favorites" from other |
| sections. "My Favorites" are displayed with "Edit" and "Delete" buttons, |
| implying user ownership and modifiability. Other sections are presented |
| as read-only. |
| |
| - **Favorite Management:** |
| |
| - **Deletion:** |
| - When a user clicks the "Delete" button for a favorite in the "My |
| Favorites" section, the `deleteFavoriteConfirm` method is invoked. |
| - This method displays a standard browser confirmation dialog |
| (`window.confirm`) to prevent accidental deletions. |
| - If confirmed, `deleteFavorite` sends a POST request to |
| `/_/favorites/delete` with the ID of the favorite to be removed. |
| - After a successful deletion, the favorites list is re-fetched to reflect |
| the change. |
| - **Editing:** |
| - Clicking the "Edit" button calls the `editFavorite` method. |
| - This method interacts with a `favorites-dialog-sk` element (defined in |
| `perf/modules/favorites-dialog-sk`). |
| - The `favorites-dialog-sk` is responsible for presenting a modal dialog |
| where the user can modify the favorite's name, description, and URL. |
| - Upon successful editing (dialog submission), the favorites list is |
| re-fetched. |
| |
| - **Error Handling:** |
| |
| - Network errors or non-OK responses during fetch operations (fetching |
| favorites, deleting favorites) are caught. |
| - An error message is displayed to the user via the `errorMessage` utility |
| (from `elements-sk/modules/errorMessage`). |
| |
| **Key Components/Files:** |
| |
| - **`favorites-sk.ts`:** This is the heart of the module. It defines the |
| `FavoritesSk` custom element, extending `ElementSk`. It contains the logic |
| for fetching, rendering, deleting, and initiating the editing of favorites. |
| - `constructor()`: Initializes the element with its Lit-html template. |
| - `deleteFavorite()`: Handles the asynchronous request to the backend for |
| deleting a favorite. |
| - `deleteFavoriteConfirm()`: Provides a confirmation step before actual |
| deletion. |
| - `editFavorite()`: Manages the interaction with the `favorites-dialog-sk` |
| for editing. |
| - `template()`: The static Lit-html template function that defines the |
| overall structure of the element. |
| - `getSectionsTemplate()`: A helper function that dynamically generates |
| the HTML for displaying sections and their links based on |
| `favoritesConfig`. It specifically adds edit/delete controls for the "My |
| Favorites" section. |
| - `fetchFavorites()`: Fetches the favorites data from the backend and |
| triggers a re-render. |
| - `connectedCallback()`: A lifecycle method that ensures favorites are |
| fetched when the element is added to the page. |
| - **`favorites-sk.scss`:** Provides the styling for the `favorites-sk` |
| element, defining its layout, padding, colors for links, and table |
| appearance. |
| - **`index.ts`:** A simple entry point that imports and registers the |
| `favorites-sk` custom element, making it available for use in HTML. |
| - **`favorites-sk-demo.html` & `favorites-sk-demo.ts`:** These files provide a |
| demonstration page for the `favorites-sk` element. The HTML includes an |
| instance of `<favorites-sk>` and a `<pre>` tag to display events. The |
| TypeScript file simply imports the element and sets up an event listener |
| (though no custom events are explicitly dispatched by `favorites-sk` in the |
| provided code). |
| |
| **Workflow: Deleting a Favorite** |
| |
| ``` |
| User Clicks "Delete" Button (for a link in "My Favorites") |
| | |
| V |
| favorites-sk.ts: deleteFavoriteConfirm(id, name) |
| | |
| V |
| window.confirm("Deleting favorite: [name]. Are you sure?") |
| | |
| +-- User clicks "Cancel" --> Workflow ends |
| | |
| V User clicks "OK" |
| favorites-sk.ts: deleteFavorite(id) |
| | |
| V |
| fetch('/_/favorites/delete', { method: 'POST', body: {id: favId} }) |
| | |
| +-- Network Error/Non-OK Response --> errorMessage() is called, display error |
| | |
| V Successful Deletion |
| favorites-sk.ts: fetchFavorites() |
| | |
| V |
| fetch('/_/favorites/') |
| | |
| V |
| Parse JSON response, update this.favoritesConfig |
| | |
| V |
| this._render() // Re-renders the component with the updated list |
| ``` |
| |
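| Below is a minimal sketch of the POST performed in this flow; the endpoint |
| and JSON body shape come from the diagram above, while the standalone |
| function form and the helper are simplifications. |
| |
| ```typescript |
| declare function fetchFavorites(): Promise<void>; // Hypothetical refresh helper. |
| |
| async function deleteFavorite(favId: string): Promise<void> { |
|   const resp = await fetch('/_/favorites/delete', { |
|     method: 'POST', |
|     headers: { 'Content-Type': 'application/json' }, |
|     body: JSON.stringify({ id: favId }), |
|   }); |
|   if (!resp.ok) { |
|     // The real element surfaces this via errorMessage(). |
|     throw new Error(`Failed to delete favorite: ${resp.statusText}`); |
|   } |
|   await fetchFavorites(); // Re-fetch so the deleted entry disappears. |
| } |
| ``` |
| |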
| **Workflow: Editing a Favorite** |
| |
| ``` |
| User Clicks "Edit" Button (for a link in "My Favorites") |
| | |
| V |
| favorites-sk.ts: editFavorite(id, name, desc, url) |
| | |
| V |
| Get reference to <favorites-dialog-sk id="fav-dialog"> |
| | |
| V |
| favorites-dialog-sk.open(id, name, desc, url) // Opens the edit dialog |
| | |
| +-- User cancels dialog --> Promise rejects (potentially with undefined, handled) |
| | |
| V User submits changes in dialog |
| Promise resolves |
| | |
| V |
| favorites-sk.ts: fetchFavorites() // Re-fetches and re-renders the list |
| | |
| V |
| fetch('/_/favorites/') |
| | |
| V |
| Parse JSON response, update this.favoritesConfig |
| | |
| V |
| this._render() |
| ``` |
| |
| The design relies on Lit for templating and rendering, which provides efficient |
| updates to the DOM when the `favoritesConfig` data changes. The separation of |
| concerns is evident: `favorites-sk` handles the list display and top-level |
| actions, while `favorites-dialog-sk` manages the intricacies of the editing |
| form. |
| |
| # Module: /modules/graph-title-sk |
| |
| ## Graph Title (`graph-title-sk`) |
| |
| The `graph-title-sk` module provides a custom HTML element designed to display |
| titles for individual graphs in a structured and informative way. Its primary |
| goal is to present key-value pairs of metadata associated with a graph in a |
| visually clear and space-efficient manner. |
| |
| ### Responsibilities and Key Components |
| |
| The core of this module is the `GraphTitleSk` custom element |
| (`graph-title-sk.ts`). Its main responsibilities are: |
| |
| 1. **Data Reception and Storage:** It receives a `Map<string, string>` where |
| keys represent parameter names (e.g., "bot", "benchmark") and values |
| represent their corresponding values (e.g., "linux-perf", "Speedometer2"). |
| This map, along with the number of traces in the graph, is provided via the |
| `set()` method. |
| |
| 2. **Dynamic Rendering:** Based on the provided data, the element dynamically |
| generates HTML to display the title. It iterates through the key-value pairs |
| and renders them in a columnar layout. Each pair is displayed with the key |
| (parameter name) in a smaller font above its corresponding value. |
| |
| 3. **Handling Empty or Generic Titles:** |
| |
| - If a key or its corresponding value is an empty string, that particular |
| entry is omitted from the displayed title. This ensures that the title |
| remains concise and only shows relevant information. |
| - If the input `titleEntries` map is empty but `numTraces` is greater than |
| zero, it displays a generic title like "Multi-trace Graph (X traces)" to |
| indicate a graph with multiple data series without specific shared |
| parameters. |
| |
| 4. **Space Management and Truncation:** |
| |
|    - The title entries are arranged in a flexible, wrapping layout |
|      (`display: flex; flex-wrap: wrap;`) using CSS (`graph-title-sk.scss`). |
|      This allows the title to adapt to different screen widths. |
| |
| - To prevent overcrowding, especially when there are many parameters, the |
| component implements a "show more" functionality. If the number of title |
| entries exceeds a predefined limit (`MAX_PARAMS`, currently 8), it |
| initially displays only the first `MAX_PARAMS` entries. A "Show Full |
| Title" button (`<md-text-button class="showMore">`) is then provided, |
| allowing the user to expand the view and see all title entries. |
|      Conversely, a "Show Short Title" mechanism is implied (the |
|      `showShortTitles()` method exists, though no button currently invokes |
|      it) to revert to the truncated view. |
| - Individual values that are very long are visually truncated in the |
| display, but the full value is available as a tooltip when the user |
|      hovers over the text. This is achieved by setting the `title` |
|      attribute of the `div` containing the value. |
| |
| ### Design Decisions and Implementation Choices |
| |
| - **Custom Element (`ElementSk`):** The component is built as a custom element |
| extending `ElementSk`. This aligns with the Skia infrastructure's approach |
| to building reusable UI components and allows for easy integration into Skia |
| applications. |
| - **Lit Library for Templating:** The HTML structure is generated using the |
| `lit` library's `html` template literal tag. This provides a declarative and |
| efficient way to define the component's view and update it when data |
| changes. The `_render()` method, inherited from `ElementSk`, is called to |
| trigger re-rendering when the internal state (`_titleEntries`, `numTraces`, |
| `showShortTitle`) changes. |
| - **CSS for Styling:** Styling is handled through a dedicated SCSS file |
| (`graph-title-sk.scss`). This separates presentation concerns from the |
| component's logic. CSS variables (e.g., `var(--primary)`) are used for |
| theming, allowing the component's appearance to be consistent with the |
| overall application theme. |
| - **`set()` Method for Data Input:** Instead of relying solely on HTML |
| attributes for complex data like a map, a public `set()` method is provided. |
| This is a common pattern for custom elements when dealing with non-string |
| data or when updates need to trigger specific internal logic beyond simple |
| attribute reflection. |
| - **Conditional Rendering for Title Brevity:** The decision to truncate the |
| number of displayed parameters by default (when exceeding `MAX_PARAMS`) and |
| provide a "Show Full Title" option is a user experience choice. It |
| prioritizes a clean initial view for complex graphs while still allowing |
| users to access all details if needed. |
| |
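| A short usage sketch of the `set()` method described above; the import |
| paths are assumptions, and the entry data is invented for illustration. |
| |
| ```typescript |
| import './index'; // Registers the element (path assumed). |
| import { GraphTitleSk } from './graph-title-sk'; |
| |
| const title = document.querySelector('graph-title-sk') as GraphTitleSk; |
| |
| // Entries with an empty key or value ("subtest" here) are omitted from the |
| // rendered title. |
| const entries = new Map<string, string>([ |
|   ['bot', 'linux-perf'], |
|   ['benchmark', 'Speedometer2'], |
|   ['subtest', ''], |
| ]); |
| title.set(entries, 4); // 4 = number of traces shown in the graph. |
| ``` |
| |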
| ### Key Workflows |
| |
| **1. Initial Rendering with Data:** |
| |
| ``` |
| User/Application Code GraphTitleSk Element |
| --------------------- -------------------- |
| calls set(titleData, numTraces) --> |
| stores titleData & numTraces |
| calls _render() |
| | |
| V |
| getTitleHtml() is invoked |
| | |
| V |
| Iterates titleData: |
| - Skips empty keys/values |
| - If entries > MAX_PARAMS & showShortTitle is true: |
| - Renders first MAX_PARAMS entries |
| - Renders "Show Full Title" button |
| - Else: |
| - Renders all entries |
| | |
| V |
| HTML template is updated with generated content |
| Browser renders the title |
| ``` |
| |
| **2. Toggling Full/Short Title Display (when applicable):** |
| |
| ``` |
| User Interaction GraphTitleSk Element |
| ---------------- -------------------- |
| Clicks "Show Full Title" button --> |
| onClick handler (showFullTitle) executes |
| | |
| V |
| this.showShortTitle = false |
| calls _render() |
| | |
| V |
| getTitleHtml() is invoked |
| | |
| V |
| Now renders ALL title entries because showShortTitle is false |
| | |
| V |
| HTML template is updated |
| Browser re-renders the title to show all entries |
| ``` |
| |
| A similar flow occurs if a mechanism to call `showShortTitles()` is implemented |
| and triggered. |
| |
| The demo page (`graph-title-sk-demo.html` and `graph-title-sk-demo.ts`) |
| showcases various states of the `graph-title-sk` element, including: |
| |
| - A "good" example with several valid entries. |
| - A "partial" example where some entries have empty keys or values. |
| - A "generic" example where an empty map is provided, resulting in the |
| "Multi-trace Graph" title. |
| - An "empty" example (though the demo code doesn't explicitly create a state |
| where `numTraces` is 0 and the map is also empty, which would result in no |
| title being displayed). |
| |
| # Module: /modules/ingest-file-links-sk |
| |
| ## Module: ingest-file-links-sk |
| |
| **Overview:** |
| |
| The `ingest-file-links-sk` module provides a custom HTML element, |
| `<ingest-file-links-sk>`, designed to display a list of relevant links |
| associated with a specific data point in the Perf performance monitoring system. |
| These links are retrieved from the `ingest.Format` data structure, which can be |
| generated by various ingestion processes. The primary purpose is to offer users |
| quick access to related resources, such as Swarming task runs, Perfetto traces, |
| or bot information, directly from the Perf UI. |
| |
| **Why:** |
| |
| Performance analysis often requires context beyond the raw data. Understanding |
| the environment in which a test ran (e.g., specific bot configuration), or |
| having direct access to detailed trace files, can be crucial for debugging |
| performance regressions or understanding improvements. This module centralizes |
| these relevant links in a consistent and easily accessible manner, improving the |
| efficiency of performance investigations. |
| |
| **How:** |
| |
| The `<ingest-file-links-sk>` element fetches link data asynchronously. When its |
| `load()` method is called with a `CommitNumber` (representing a specific point |
| in time or version) and a `traceID` (identifying the specific data series), it |
| makes a POST request to the `/_/details/?results=false` endpoint. This endpoint |
| is expected to return a JSON object conforming to the `ingest.Format` structure. |
| |
| The element then parses this JSON response. It specifically looks for the |
| `links` field within the `ingest.Format`. If `links` exist and the `version` |
| field in the `ingest.Format` is present (indicating a modern format), the |
| element dynamically renders a list of these links. |
| |
| Key design considerations and implementation details: |
| |
| - **Asynchronous Loading:** Link fetching is an asynchronous operation to |
| avoid blocking the UI. A `spinner-sk` element is displayed while data is |
| being loaded. |
| - **URL vs. Text:** The module intelligently differentiates between actual |
| URLs and plain text values within the `links` object. If a value is a valid |
| URL, it's rendered as an `<a>` tag. Otherwise, it's displayed as "Key: |
| Value". |
| - **Markdown Link Handling:** The element includes logic to parse and convert |
| Markdown-style links (e.g., `[Link Text](url)`) into standard HTML anchor |
| tags. This allows ingestion processes to provide links in a more |
| human-readable format if desired. |
| - **Sorted Display:** Links are displayed in alphabetical order by their keys |
| for consistent presentation. |
| - **Error Handling:** If the fetch request fails or the response is not in the |
| expected format, an error message is displayed, and the spinner is hidden. |
| - **Legacy Format Compatibility:** The element checks for the `version` field |
| in the response. If it's missing, it assumes a legacy data format that |
| doesn't support these links and gracefully avoids displaying anything. |
| |
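| A usage sketch of the `load()` API described above; the import paths and |
| the example trace ID are assumptions. |
| |
| ```typescript |
| import './index'; // Registers the element (path assumed). |
| import { CommitNumber } from '../json'; |
| import { IngestFileLinksSk } from './ingest-file-links-sk'; |
| |
| const links = document.querySelector( |
|   'ingest-file-links-sk' |
| ) as IngestFileLinksSk; |
| |
| // POSTs to /_/details/?results=false and renders any links found in the |
| // returned ingest.Format, or nothing for a legacy (version-less) response. |
| links.load(CommitNumber(64809), ',benchmark=Speedometer2,bot=linux-perf,'); |
| ``` |
| |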
| **Responsibilities and Key Components:** |
| |
| - **`ingest-file-links-sk.ts`:** This is the core file defining the |
| `IngestFileLinksSk` custom element. |
| - It handles the fetching of link data from the backend API. |
| - It manages the rendering of the link list based on the fetched data. |
| - It includes the logic for differentiating between URLs and plain text, |
| and for parsing Markdown links. |
| - It manages the display of a loading spinner and error messages. |
| - The `load(cid: CommitNumber, traceid: string)` method is the public API |
| for triggering the data fetching and rendering process. |
| - The `displayLinks` static method is responsible for generating the |
| `TemplateResult` array for rendering the list items. |
| - The `isUrl` and `removeMarkdown` helper functions provide utility for |
| link processing. |
| - **`ingest-file-links-sk.scss`:** This file contains the SASS styles for the |
| custom element, defining its appearance, including list styling and spinner |
| positioning. |
| - **`ingest-file-links-sk-demo.html` and `ingest-file-links-sk-demo.ts`:** |
| These files provide a demonstration page for the element. The demo page uses |
| `fetch-mock` to simulate the backend API response, allowing developers to |
| see the element in action and test its functionality in isolation. |
| - **`ingest-file-links-sk_test.ts`:** This file contains unit tests for the |
| `IngestFileLinksSk` element. It uses `fetch-mock` to simulate various API |
| responses and asserts the element's behavior, such as correct link |
| rendering, spinner state, and error handling. |
| - **`ingest-file-links-sk_puppeteer_test.ts`:** This file contains |
| Puppeteer-based end-to-end tests. These tests load the demo page in a |
| headless browser and verify the element's visual rendering and basic |
| functionality. |
| |
| **Key Workflow: Loading and Displaying Links** |
| |
| ``` |
| User Action/Page Load -> Calls ingest-file-links-sk.load(commit, traceID) |
| | |
| V |
| ingest-file-links-sk: Show spinner-sk |
| | |
| V |
| Make POST request to /_/details/?results=false |
| (with commit and traceID in request body) |
| | |
| V |
| Backend API: Processes request, retrieves links for the |
| given commit and trace |
| | |
| V |
| ingest-file-links-sk: Receives JSON response (ingest.Format) |
| | |
| +----------------------+ |
| | | |
| V V |
| Response OK? Response Error? |
| | | |
| V V |
| Parse links Display error message |
| Hide spinner Hide spinner |
| Render link list |
| ``` |
| |
| # Module: /modules/json |
| |
| ## JSON Module Documentation |
| |
| This module defines TypeScript interfaces and types that represent the structure |
| of JSON data used throughout the Perf application. It essentially acts as a |
| contract between the Go backend and the TypeScript frontend, ensuring data |
| consistency and type safety. |
| |
| **Why:** |
| |
| The primary motivation for this module is to leverage TypeScript's strong typing |
| capabilities. By defining these interfaces, we can catch potential data |
| inconsistencies and errors at compile time rather than runtime. This is |
| particularly crucial for a data-intensive application like Perf, where the |
| frontend relies heavily on JSON responses from the backend. |
| |
| Furthermore, these definitions are **automatically generated** from Go struct |
| definitions. This ensures that the frontend and backend data models remain |
| synchronized. Any changes to the Go structs will trigger an update to these |
| TypeScript interfaces, reducing the likelihood of manual errors and |
| inconsistencies. |
| |
| **How:** |
| |
| The `index.ts` file contains all the interface and type definitions. These are |
| organized into a flat structure for simplicity, with some nested namespaces |
| (e.g., `pivot`, `progress`, `ingest`) where logical grouping is beneficial. |
| |
| A key design choice is the use of **nominal typing** for certain primitive types |
| (e.g., `CommitNumber`, `TimestampSeconds`, `Trace`). This is achieved by |
| creating type aliases that are branded with a unique string literal type. For |
| example: |
| |
| ```typescript |
| export type CommitNumber = number & { |
| _commitNumberBrand: 'type alias for number'; |
| }; |
| |
| export function CommitNumber(v: number): CommitNumber { |
| return v as CommitNumber; |
| } |
| ``` |
| |
| This prevents accidental assignment of a generic `number` to a `CommitNumber` |
| variable, even though they are structurally identical at runtime. This adds an |
| extra layer of type safety, ensuring that, for example, a timestamp is not |
| inadvertently used where a commit number is expected. Helper functions (e.g., |
| `CommitNumber(v: number)`) are provided for convenient type assertion. |
| |
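| A brief illustration of what this branding buys at compile time, assuming |
| the definitions from the snippet above are in scope: |
| |
| ```typescript |
| const n = 512; |
| const commit: CommitNumber = CommitNumber(n); // OK: explicit assertion. |
| // const bad: CommitNumber = n; // Compile error: number lacks the brand. |
| |
| // The brand exists only at compile time; at runtime commit is a plain |
| // number, so arithmetic works as usual. |
| const next: CommitNumber = CommitNumber(commit + 1); |
| ``` |
| |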
| **Key Components/Files/Submodules:** |
| |
| - **`index.ts`**: This is the sole file in this module and contains all the |
| TypeScript interface and type definitions. It serves as the single source of |
| truth for JSON data structures used in the frontend. |
| - **Interfaces (e.g., `Alert`, `DataFrame`, `FrameRequest`, |
| `Regression`)**: These define the shape of complex JSON objects. For |
| instance, the `Alert` interface describes the structure of an alert |
| configuration, including its query, owner, and various detection |
| parameters. The `DataFrame` interface represents the core data structure |
| for displaying traces, including the actual trace data (`traceset`), |
| column headers (`header`), and associated parameter sets (`paramset`). |
| - **Type Aliases (e.g., `ClusterAlgo`, `StepDetection`, `Status`)**: These |
| define specific allowed string values for certain properties, acting |
| like enums. For example, `ClusterAlgo` can only be `'kmeans'` or |
| `'stepfit'`, ensuring that only valid clustering algorithms are |
| specified. |
| - **Nominally Typed Aliases (e.g., `CommitNumber`, `TimestampSeconds`, |
| `Trace`, `ParamSet`)**: As explained above, these provide stronger type |
| checking for primitive types that have specific semantic meaning within |
| the application. `TraceSet`, for example, is a map where keys are trace |
| identifiers (strings) and values are `Trace` arrays (nominally typed |
| `number[]`). |
| - **Namespaced Interfaces (e.g., `pivot.Request`, `ingest.Format`)**: Some |
| interfaces are grouped under namespaces to organize related data |
| structures. For example, `pivot.Request` defines the structure for |
| requesting pivot table operations, including grouping criteria and |
| aggregation operations. The `ingest.Format` interface defines the |
| structure of data being ingested into Perf, including metadata like Git |
| hash and the actual performance results. |
| - **Utility/Generic Types (e.g., `ReadOnlyParamSet`, `AnomalyMap`)**: |
| These represent common data patterns. `ReadOnlyParamSet` is a map of |
| parameter names to arrays of their possible string values, marked as |
| read-only to reflect its typical usage. `AnomalyMap` is a nested map |
| structure used to associate anomalies with specific commits and traces. |
| |
| **Workflow Example: Requesting and Displaying Trace Data** |
| |
| A common workflow involves the frontend requesting trace data from the backend |
| and then displaying it. |
| |
| 1. **Frontend (Client) prepares a `FrameRequest`:** |
| |
| ``` |
| Client Code --> Creates `FrameRequest` object: |
| { |
| begin: 1678886400, // Start timestamp |
| end: 1678972800, // End timestamp |
| queries: ["config=gpu&name=my_test_trace"], |
| // ... other properties |
| } |
| ``` |
| |
| 2. **Frontend sends the `FrameRequest` to the Backend (Server).** |
| |
| 3. **Backend processes the request and generates a `FrameResponse`:** |
| |
| ``` |
| Server Logic --> Processes `FrameRequest` |
| --> Fetches data from database/cache |
| --> Constructs `FrameResponse` object: |
| { |
| dataframe: { |
| traceset: { "config=gpu&name=my_test_trace": [10.1, 10.5, 9.8, ...Trace] }, |
| header: [ { offset: 12345, timestamp: 1678886400 }, ...ColumnHeader[] ], |
| paramset: { "config": ["gpu", "cpu"], "name": ["my_test_trace"] } |
| }, |
| skps: [0, 5, 10], // Indices of significant points |
| // ... other properties like msg, display_mode, anomalymap |
| } |
| ``` |
| |
| 4. **Backend sends the `FrameResponse` (as JSON) back to the Frontend.** |
| |
| 5. **Frontend receives the JSON and parses it, expecting it to conform to the |
|    `FrameResponse` interface:** |
| |
|    ``` |
|    Client Code --> Receives JSON |
|                --> Parses JSON into a `FrameResponse` typed object |
|                --> Uses `frameResponse.dataframe.traceset` to render charts |
|                --> Uses `frameResponse.dataframe.header` to display commit information |
|    ``` |
| |
| This typed interaction ensures that if the backend, for example, renamed |
| `traceset` to `trace_data` in its Go struct, the automatic generation would |
| update the `DataFrame` interface. The TypeScript compiler would then flag an |
| error in the frontend code trying to access `frameResponse.dataframe.traceset`, |
| preventing a runtime error and guiding the developer to update the frontend code |
| accordingly. |
| |
| # Module: /modules/json-source-sk |
| |
| The `json-source-sk` module provides a custom HTML element, `<json-source-sk>`, |
| designed to display the raw JSON data associated with a specific data point in a |
| trace. This is particularly useful in performance analysis and debugging |
| scenarios where understanding the exact input data ingested by the system is |
| crucial. |
| |
| The core responsibility of this module is to fetch and present JSON data in a |
| user-friendly dialog. It aims to simplify the process of inspecting the source |
| data for a given commit and trace identifier. |
| |
| The key component is the `JSONSourceSk` class, defined in `json-source-sk.ts`. |
| This class extends `ElementSk`, a base class for custom elements in the Skia |
| infrastructure. |
| |
| **How it Works:** |
| |
| 1. **Initialization and Properties:** |
| |
| - The element requires two primary properties to be set: |
| - `cid`: The Commit ID (represented as `CommitNumber`), which |
| identifies a specific version or point in time. |
| - `traceid`: A string identifier for the specific trace being |
| examined. |
| - When these properties are set, the element renders itself. If `traceid` |
| is not a valid key (checked by `validKey` from |
| `perf/modules/paramtools`), the control buttons are hidden. |
| |
| 2. **User Interaction and Data Fetching:** |
| |
| - The element displays two buttons: "View Json File" and "View Short Json |
| File". |
| - Clicking either button triggers the `_loadSource` or `_loadSourceSmall` |
| methods, respectively. |
| - These methods internally call `_loadSourceImpl`. This implementation |
| detail allows for sharing the core fetching logic while differentiating |
| the request URL. |
| - `_loadSourceImpl` constructs a `CommitDetailsRequest` object containing |
| the `cid` and `traceid`. |
| - It then makes a POST request to the `/_/details/` endpoint. |
| - If "View Short Json File" was clicked (`isSmall` is true), the URL |
| includes `?results=false`, indicating to the backend that a |
| potentially truncated or summarized version of the JSON is |
| requested. |
| - A `spinner-sk` element is activated to provide visual feedback |
| during the fetch operation. |
| - The response from the server is parsed as JSON using `jsonOrThrow`. If |
| the request is successful, the JSON data is formatted with indentation |
| and stored in the `_json` private property. |
| - The element is then re-rendered to display the fetched JSON. |
| - If an error occurs during fetching or parsing, `errorMessage` (from |
| `perf/modules/errorMessage`) is used to display an error notification to |
| the user. |
| |
| 3. **Displaying the JSON:** |
| |
| - The fetched JSON data is displayed within a `<dialog>` element |
| (`#json-dialog`). |
| - The `jsonFile()` method in the template is responsible for rendering the |
| `<pre>` tag containing the formatted JSON string, but only if `_json` is |
| not empty. |
| - The dialog is shown using `showModal()`, providing a modal interface for |
| viewing the JSON. |
| - A close button (`#closeIcon` with a `close-icon-sk`) allows the user to |
| dismiss the dialog. Closing the dialog also clears the `_json` property. |
| |
| **Design Rationale:** |
| |
| - **Dedicated Element:** Creating a dedicated custom element encapsulates the |
| functionality of fetching and displaying JSON, making it reusable across |
| different parts of the application where such inspection is needed. |
| - **Asynchronous Fetching:** The use of `async/await` and `fetch` allows for |
| non-blocking data retrieval, ensuring the UI remains responsive while |
| waiting for the server. |
| - **Error Handling:** Incorporating error handling via `jsonOrThrow` and |
| `errorMessage` provides a better user experience by informing users about |
| issues during data retrieval. |
| - **Clear Visual Feedback:** The `spinner-sk` element clearly indicates when |
| data is being loaded. |
| - **Modal Dialog:** Using a modal dialog (`<dialog>`) for displaying the JSON |
| helps focus the user's attention on the data without cluttering the main |
| interface. |
| - **Option for Short JSON:** The "View Short Json File" option caters to |
| scenarios where the full JSON might be excessively large, providing a way to |
| quickly inspect a summary or a smaller subset of the data. This can improve |
| performance and readability for very large JSON files. |
| - **Styling and Theming:** The SCSS file (`json-source-sk.scss`) provides |
| basic styling and leverages existing button styles |
| (`//elements-sk/modules/styles:buttons_sass_lib`). It also includes |
| considerations for dark mode by using CSS variables like `--on-background` |
| and `--background`. |
| |
| **Workflow Example: Viewing JSON Source** |
| |
| ``` |
| User Sets Properties Element Renders User Clicks Button Fetches Data Displays JSON |
| -------------------- --------------- ------------------ ------------ ------------- |
| [json-source-sk -> [Buttons visible] -> ["View Json File"] -> POST /_/details/ -> <dialog> |
| .cid = 123 {cid, traceid} <pre>{json}</pre> |
| .traceid = ",foo=bar,"] </dialog> |
| (spinner active) |
| | |
| V |
| Response Received |
| (spinner inactive) |
| ``` |
| |
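| A setup sketch matching the workflow above; the property names come from |
| this documentation, while the import paths are assumptions. |
| |
| ```typescript |
| import './index'; // Registers the element (path assumed). |
| import { CommitNumber } from '../json'; |
| import { JSONSourceSk } from './json-source-sk'; |
| |
| const source = document.querySelector('json-source-sk') as JSONSourceSk; |
| source.cid = CommitNumber(123); |
| source.traceid = ',foo=bar,'; // Must be a valid key, or buttons stay hidden. |
| // Clicking "View Json File" now POSTs {cid, traceid} to /_/details/. |
| ``` |
| |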
| The demo page (`json-source-sk-demo.html` and `json-source-sk-demo.ts`) |
| illustrates how to use the `<json-source-sk>` element. It sets up mock data |
| using `fetchMock` to simulate the backend endpoint and programmatically clicks |
| the button to demonstrate the JSON loading functionality. |
| |
| The Puppeteer test (`json-source-sk_puppeteer_test.ts`) ensures the element |
| renders correctly and performs basic visual regression testing. |
| |
| # Module: /modules/new-bug-dialog-sk |
| |
| The `new-bug-dialog-sk` module provides a user interface element for filing new |
| bugs related to performance anomalies. It aims to streamline the bug reporting |
| process by pre-filling relevant information and integrating with the Buganizer |
| issue tracker. |
| |
| **Core Functionality:** |
| |
| The primary responsibility of this module is to display a dialog that allows |
| users to input details for a new bug. This dialog is populated with information |
| derived from one or more selected `Anomaly` objects. The user can then review |
| and modify this information before submitting the bug. |
| |
| **Key Design Decisions and Implementation Choices:** |
| |
| - **Pre-population of Bug Details:** To reduce manual effort and ensure |
| consistency, the dialog attempts to intelligently pre-fill fields like the |
| bug title, labels, and components. |
| - The bug title is generated based on the nature (regression/improvement), |
| magnitude (percentage change), and affected revision range of the |
| anomalies. This logic, found in `getBugTitle()`, mimics the behavior of |
| the legacy Chromeperf UI to maintain familiarity for users. |
| - Labels and components are aggregated from all selected anomalies. Unique |
| labels are presented as checkboxes (defaulting to checked), and unique |
| components are presented as radio buttons (with the first one selected |
| by default). This is handled by `getLabelCheckboxes()` and |
| `getComponentRadios()`. |
| - **Dynamic UI Generation:** The dialog's content, specifically the label |
| checkboxes and component radio buttons, is dynamically generated based on |
| the provided `Anomaly` data. This ensures that only relevant options are |
| presented to the user. Lit-html's templating capabilities are used for this |
| dynamic rendering. |
| - **User Contextualization:** The dialog attempts to automatically CC the |
| logged-in user on the new bug. This is achieved by fetching the user's login |
| status via `/alogin-sk`. |
| - **Asynchronous Bug Filing:** The actual bug filing process is asynchronous. |
| When the user submits the form, a POST request is made to the |
| `/_/triage/file_bug` endpoint. |
| - A spinner (`spinner-sk`) is displayed during this operation to provide |
| visual feedback. |
| - Upon successful bug creation, the user is redirected to the newly |
| created bug page in a new tab, and an `anomaly-changed` event is |
| dispatched to notify other components (like `explore-simple-sk` or |
| `chart-tooltip-sk`) that the anomalies have been updated with the new |
| bug ID. |
| - If an error occurs, an error message is displayed using |
| `error-toast-sk`, and the dialog remains open, allowing the user to |
| retry or correct information. |
| - **Standard HTML Dialog Element:** The core dialog functionality leverages |
| the native `<dialog>` HTML element, which provides built-in accessibility |
| and modal behavior. |
| |
| **Workflow: Filing a New Bug** |
| |
| 1. **Initialization:** An external component (e.g., a chart displaying |
| anomalies) invokes the `setAnomalies()` method on `new-bug-dialog-sk`, |
| passing the relevant `Anomaly` objects and associated trace names. |
| 2. **Opening the Dialog:** The external component calls the `open()` method. |
| |
|    ``` |
|    User Action (e.g., click "File Bug" button) |
|      | |
|      V |
|    External Component --[setAnomalies(anomalies, traceNames)]--> new-bug-dialog-sk |
|      | |
|      V |
|    External Component --[open()]--> new-bug-dialog-sk |
|    ``` |
| |
| 3. **Dialog Population:** |
| |
|    - `new-bug-dialog-sk` fetches the current user's login status to pre-fill |
|      the CC field. |
|    - The `_render()` method is called, which uses the Lit-html template. |
|    - `getBugTitle()` generates a suggested title. |
|    - `getLabelCheckboxes()` and `getComponentRadios()` create the UI for |
|      selecting labels and components based on the input anomalies. |
|    - The dialog (`<dialog id="new-bug-dialog">`) is displayed modally. |
| |
|    ``` |
|    new-bug-dialog-sk.open() |
|      | |
|      V |
|    [Fetch Login Status] --> Updates _user |
|      | |
|      V |
|    _render() |--> getBugTitle()          --> Populates Title Input |
|              |--> getLabelCheckboxes()   --> Creates Label Checkboxes |
|              |--> getComponentRadios()   --> Creates Component Radios |
|      | |
|      V |
|    Dialog is displayed to the user |
|    ``` |
| |
| 4. **User Interaction:** The user reviews and potentially modifies the |
|    pre-filled information (title, description, labels, component, assignee, |
|    CCs). |
| |
| 5. **Submission:** The user clicks the "Submit" button. |
| |
|    ``` |
|    User clicks "Submit" |
|      | |
|      V |
|    Form Submit Event |
|      | |
|      V |
|    new-bug-dialog-sk.fileNewBug() |
|    ``` |
| |
| 6. **Bug Filing Process:** |
| |
|    - The `fileNewBug()` method is invoked. |
|    - The spinner is activated, and form buttons are disabled. |
|    - Form data (title, description, selected labels, selected component, |
|      assignee, CCs, anomaly keys, trace names) is collected. |
|    - A POST request is sent to `/_/triage/file_bug` with the collected data. |
| |
|    ``` |
|    fileNewBug() |
|      | |
|      V |
|    [Activate Spinner, Disable Buttons] |
|      | |
|      V |
|    [Extract Form Data] |
|      | |
|      V |
|    fetch('/_/triage/file_bug', {POST, body: jsonData}) |
|    ``` |
| |
| 7. **Response Handling:** |
| |
|    - **Success:** |
|      - The server responds with a JSON object containing the `bug_id`. |
|      - The spinner is deactivated, and buttons are re-enabled. |
|      - The dialog is closed. |
|      - A new browser tab is opened to the URL of the created bug (e.g., |
|        `https://issues.chromium.org/issues/BUG_ID`). |
|      - The `bug_id` is updated in the local `_anomalies` array. |
|      - An `anomaly-changed` custom event is dispatched with the updated |
|        anomalies and bug ID. |
|    - **Failure:** |
|      - The server responds with an error. |
|      - The spinner is deactivated, and buttons are re-enabled. |
|      - An error message is displayed to the user via `errorMessage()`. The |
|        dialog remains open. |
| |
|    ``` |
|    fetch Response |
|      | |
|      +-- Success (HTTP 200, valid JSON with bug_id) |
|      |     | |
|      |     V |
|      |   [Deactivate Spinner, Enable Buttons] |
|      |     | |
|      |     V |
|      |   closeDialog() |
|      |     | |
|      |     V |
|      |   window.open(bugUrl, '_blank') |
|      |     | |
|      |     V |
|      |   Update local _anomalies with bug_id |
|      |     | |
|      |     V |
|      |   dispatchEvent('anomaly-changed', {anomalies, bugId}) |
|      | |
|      +-- Failure (HTTP error or invalid JSON) |
|            | |
|            V |
|          [Deactivate Spinner, Enable Buttons] |
|            | |
|            V |
|          errorMessage(errorMsg) --> Displays error toast |
|    ``` |
| |
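| Components interested in the outcome can subscribe to the `anomaly-changed` |
| event; the detail shape below follows the diagram above but is otherwise an |
| assumption. |
| |
| ```typescript |
| // Hypothetical listener, e.g. in explore-simple-sk or chart-tooltip-sk. |
| document.addEventListener('anomaly-changed', (e: Event) => { |
|   const detail = (e as CustomEvent<{ anomalies: unknown[]; bugId: number }>) |
|     .detail; |
|   console.log( |
|     `Bug ${detail.bugId} filed for ${detail.anomalies.length} anomalies.` |
|   ); |
| }); |
| ``` |
| |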
| **Key Files:** |
| |
| - **`new-bug-dialog-sk.ts`**: This is the core file containing the |
| `NewBugDialogSk` class definition, which extends `ElementSk`. It includes |
| the Lit-html template for the dialog, the logic for populating form fields |
| based on `Anomaly` data, handling form submission, interacting with the |
| backend API to file the bug, and managing the dialog's visibility and state. |
| - **`new-bug-dialog-sk.scss`**: This file defines the styles for the dialog, |
| ensuring it integrates visually with the rest of the application and themes. |
| It styles the dialog container, input fields, buttons, and the close icon. |
| - **`new-bug-dialog-sk-demo.ts` and `new-bug-dialog-sk-demo.html`**: These |
| files provide a demonstration page for the `new-bug-dialog-sk` element. The |
| `.ts` file sets up mock data (`Anomaly` objects) and mock fetch responses to |
| simulate the bug filing process, allowing for isolated testing and |
| development of the dialog. The `.html` file includes the `new-bug-dialog-sk` |
| element and a button to trigger its opening. |
| - **`index.ts`**: This file simply imports `new-bug-dialog-sk.ts` to ensure |
| the custom element is defined and available for use. |
| |
| The module relies on several other elements and libraries: |
| |
| - `alogin-sk`: To determine the logged-in user for CC'ing. |
| - `close-icon-sk`: For the dialog's close button. |
| - `spinner-sk`: To indicate activity during bug filing. |
| - `error-toast-sk` (via `errorMessage` utility): To display error messages. |
| - `lit`: For templating and component rendering. |
| - `jsonOrThrow`: A utility for parsing JSON responses and throwing errors on |
| failure. |
| |
| # Module: /modules/paramtools |
| |
| The `paramtools` module provides a TypeScript implementation of utility |
| functions for manipulating parameter sets and structured keys. It mirrors the |
| functionality found in the Go module `/infra/go/paramtools`, which is the |
| primary source of truth for these operations. The decision to replicate this |
| logic in TypeScript is to enable client-side applications to perform these |
| common tasks without needing to make server requests for simple transformations |
| or validations. This approach improves performance and reduces server load for |
| UI-driven interactions. |
| |
| The core responsibility of this module is to provide robust and consistent ways |
| to: |
| |
| 1. **Create and parse structured keys:** Structured keys are a fundamental |
| concept for identifying specific data points (e.g., traces in performance |
| data). |
| 2. **Manipulate `ParamSet` objects:** `ParamSet`s are used to represent |
| collections of possible parameter values, often used for filtering or |
| querying data. |
| |
| Key functionalities and their "why" and "how": |
| |
| - **`makeKey(params: Params | { [key: string]: string }): string`**: |
| |
| - **Why**: To create a canonical string representation of a set of |
| key-value parameters. This canonical form is essential for consistent |
| identification and comparison of data points. The keys within the |
| structured key are sorted alphabetically to ensure that the same set of |
| parameters always produces the same key, regardless of the order in |
| which they were provided. |
| - **How**: It takes a `Params` object (a dictionary of string key-value |
| pairs). It first checks if the `params` object is empty, throwing an |
| error if it is, as a key must represent at least one parameter. Then, it |
| sorts the keys of the `params` object alphabetically. Finally, it |
| constructs the string by joining each key-value pair with `=` and then |
| joining these pairs with `,`, prefixing and suffixing the entire string |
|      with a comma. |
| |
|      ``` |
|      Input: { "b": "2", "a": "1", "c": "3" } |
|        | |
|        V |
|      Sort keys: [ "a", "b", "c" ] |
|        | |
|        V |
|      Format pairs: "a=1", "b=2", "c=3" |
|        | |
|        V |
|      Join and wrap: ",a=1,b=2,c=3," |
|      ``` |
| |
| - **`fromKey(structuredKey: string, attribute?: string): Params`**: |
| |
| - **Why**: To convert a structured key string back into a `Params` object, |
| making it easier to work with the individual parameters |
| programmatically. It also handles the removal of special functions that |
| might be embedded in the key (e.g., `norm(...)` for normalization). |
| - **How**: It first calls `removeSpecialFunctions` to strip any function |
| wrappers from the key. Then, it splits the key string by the comma |
| delimiter. Each resulting segment (if not empty) is then split by the |
| equals sign to separate the key and value. These key-value pairs are |
| collected into a new `Params` object. An optional `attribute` parameter |
| allows excluding a specific key from the resulting `Params` object, |
| which can be useful in scenarios where certain attributes are metadata |
| and not part of the core parameters. |
| |
| - **`removeSpecialFunctions(key: string): string`**: |
| |
| - **Why**: Structured keys can sometimes include functional wrappers |
| (e.g., `norm(...)`, `avg(...)`) or special markers (e.g., |
| `special_zero`). This function is designed to strip these away, |
| returning the "raw" underlying key. This is important when you need to |
| work with the base parameters without the context of the applied |
| function or special condition. |
| - **How**: It uses regular expressions to detect if the key matches a |
| pattern like `function_name(,param1=value1,...)`. If a match is found, |
| it extracts the content within the parentheses. The extracted string (or |
| the original key if no function was found) is then processed by |
| `extractNonKeyValuePairsInKey`. |
| - **`extractNonKeyValuePairsInKey(key: string): string`**: This helper |
| function further refines the key string. It splits the string by commas |
| and filters out any segments that do not represent a valid `key=value` |
| pair. This helps to remove extraneous parts like `special_zero` that |
| might be comma-separated but aren't true parameters. The valid pairs are |
| then re-joined and wrapped with commas. |
| |
| - **`validKey(key: string): boolean`**: |
| |
| - **Why**: To provide a simple client-side check to determine if a string |
| is a "valid" basic structured key, meaning it's not a key representing a |
| calculation (like `avg(...)`) or other special trace types. This is a |
| lightweight validation, as the server performs more comprehensive |
| checks. |
| - **How**: It checks if the key string starts and ends with a comma. This |
| is a convention for simple, non-functional structured keys. |
| |
| - **`addParamsToParamSet(ps: ParamSet, p: Params): void`**: |
| |
| - **Why**: To add a new set of parameters (from a `Params` object) to an |
| existing `ParamSet`. `ParamSet`s store unique values for each parameter |
| key. This function ensures that when new parameters are added, only new |
| values are appended to the existing lists for each key, maintaining |
| uniqueness. |
| - **How**: It iterates through the key-value pairs of the input `Params` |
| object (`p`). For each key, it retrieves the corresponding array of |
| values from the `ParamSet` (`ps`). If the key doesn't exist in `ps`, a |
| new array is created. If the value from `p` is not already present in |
| the array, it's added. |
| |
| - **`paramsToParamSet(p: Params): ParamSet`**: |
| |
| - **Why**: To convert a single `Params` object (representing one specific |
| combination of parameters) into a `ParamSet`. In a `ParamSet`, each key |
| maps to an array of values, even if there's only one value. |
| - **How**: It creates a new, empty `ParamSet`. Then, for each key-value |
| pair in the input `Params` object, it creates a new entry in the |
| `ParamSet` where the key maps to an array containing just that single |
| value. |
| |
| - **`addParamSet(p: ParamSet, ps: ParamSet | ReadOnlyParamSet): void`**: |
| |
| - **Why**: To merge one `ParamSet` (or `ReadOnlyParamSet`) into another. |
| This is useful for combining sets of available parameter options, for |
| example, when aggregating data from multiple sources. |
| - **How**: It iterates through the keys and their associated value arrays |
| in the source `ParamSet` (`ps`). If a key from `ps` is not present in |
| the target `ParamSet` (`p`), the entire key and its value array (cloned) |
| are added to `p`. If the key already exists in `p`, it iterates through |
| the values in the source array and adds any values that are not already |
| present in the target array for that key. |
| |
| - **`toReadOnlyParamSet(ps: ParamSet): ReadOnlyParamSet`**: |
| |
| - **Why**: To provide a type assertion that casts a mutable `ParamSet` to |
| an immutable `ReadOnlyParamSet`. This is useful for signaling that a |
| `ParamSet` should not be modified further, typically when passing it to |
| components or functions that expect read-only data. |
| - **How**: It performs a type assertion. No actual data transformation |
| occurs; it's a compile-time type hint. |
| |
| - **`queryFromKey(key: string): string`**: |
| |
| - **Why**: To convert a structured key into a URL query string format |
| (e.g., `a=1&b=2&c=3`). This is specifically useful for frontend |
| applications, like `explore-simple-sk`, where state or filters are often |
| represented in the URL. |
| - **How**: It first uses `fromKey` to parse the structured key into a |
| `Params` object. Then, it leverages the `URLSearchParams` browser API to |
| construct a query string from these parameters. This ensures proper URL |
|      encoding of keys and values. |
| |
|      ``` |
|      Input Key: ",a=1,b=2,c=3," |
|        | |
|        V |
|      fromKey -> Params: { "a": "1", "b": "2", "c": "3" } |
|        | |
|        V |
|      URLSearchParams -> Query String: "a=1&b=2&c=3" |
|      ``` |
| |
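| A round-trip sketch of the functions above; the import path is an |
| assumption. |
| |
| ```typescript |
| import { makeKey, fromKey, queryFromKey, validKey } from '../paramtools'; |
| |
| const key = makeKey({ b: '2', a: '1' }); // ',a=1,b=2,' -- keys are sorted. |
| const params = fromKey(key); // { a: '1', b: '2' } |
| const query = queryFromKey(key); // 'a=1&b=2' |
| const plain = validKey(key); // true |
| const calc = validKey('avg(,a=1,b=2,)'); // false: a calculated trace. |
| ``` |
| |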
| The design choice to have these functions operate with less stringent validation |
| than their server-side Go counterparts is deliberate. The server remains the |
| ultimate authority on data validity. These client-side functions prioritize ease |
| of use and performance for UI interactions, assuming that the data they operate |
| on has either originated from or will eventually be validated by the server. |
| |
| The `index_test.ts` file provides comprehensive unit tests for these functions, |
| ensuring their correctness and robustness across various scenarios, including |
| handling empty inputs, duplicate values, and special key formats. This focus on |
| testing is crucial for maintaining the reliability of these foundational utility |
| functions. |
| |
| # Module: /modules/perf-scaffold-sk |
| |
| The `perf-scaffold-sk` module provides a consistent layout and navigation |
| structure for all pages within the Perf application. It acts as a wrapper, |
| ensuring that common elements like the title bar, navigation sidebar, and error |
| notifications are present and behave uniformly across different sections of |
| Perf. |
| |
| **Core Responsibilities:** |
| |
| - **Layout Management:** Establishes the primary visual structure, dividing |
| the page into a header, a collapsible sidebar for navigation, and a main |
| content area. |
| - **Navigation:** Provides a standardized set of navigation links in the |
| sidebar, allowing users to easily access different Perf features (e.g., New |
| Query, Favorites, Alerts). |
| - **Global Elements:** Hosts globally relevant components like the login |
| status (`alogin-sk`), theme chooser (`theme-chooser-sk`), and error/toast |
| notifications (`error-toast-sk`). |
| - **Dynamic Content Injection:** Handles the placement of page-specific |
| content into the `main` content area and allows for specific content (like |
| help text) to be injected into the sidebar. |
| |
| **Key Components and Design Decisions:** |
| |
| - **`perf-scaffold-sk.ts`:** This is the heart of the module, defining the |
| `PerfScaffoldSk` custom element. |
| |
| - **Why:** Encapsulating the scaffold logic within a custom element |
| promotes reusability and modularity. It allows any Perf page to adopt |
| the standard layout simply by including this element. |
| - **How:** It uses Lit for templating and rendering the structure |
| (`<app-sk>`, `header`, `aside#sidebar`, `main`, `footer`). |
| - **Content Redistribution:** A crucial design choice is how it handles |
| child elements. Since it doesn't use Shadow DOM for the main content |
| area (to allow global styles to apply easily to the page content), it |
| programmatically moves children of `<perf-scaffold-sk>` into the |
| `<main>` section. |
| |
| - **Process:** |
| |
| 1. When `connectedCallback` is invoked, existing children of |
| `<perf-scaffold-sk>` are temporarily moved out. |
| 2. The scaffold's own template (header, sidebar, etc.) is rendered. |
| 3. The temporarily moved children are then appended to the newly |
| rendered `<main>` element. |
| 4. A `MutationObserver` is set up to watch for any new children added |
| to `<perf-scaffold-sk>` and similarly move them to `<main>`. |
| |
| - **Sidebar Content:** An exception is made for elements with the specific |
| ID `SIDEBAR_HELP_ID`. These are moved into the `#help` div within the |
| sidebar. This allows pages to provide context-specific help information |
| directly within the scaffold. |
| |
| ``` |
| <perf-scaffold-sk> |
| <!-- This will go into <main> --> |
| <div>Page specific content</div> |
| |
| <!-- This will go into <aside>#help --> |
| <div id="sidebar_help">Contextual help</div> |
| </perf-scaffold-sk> |
| ``` |
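| |
| A rough sketch of the redistribution logic described above (the DOM lookups
| here are assumed for illustration; the real element keeps references from
| its own template):
| |
| ```
| const scaffold = document.querySelector('perf-scaffold-sk')!;
| const main = scaffold.querySelector('main')!;
| const help = scaffold.querySelector('#help')!;
| // Move any child added to <perf-scaffold-sk> into <main>, except the
| // sidebar help element, which goes into the sidebar's #help div.
| const observer = new MutationObserver((mutations) => {
|   for (const mutation of mutations) {
|     mutation.addedNodes.forEach((node) => {
|       if (!(node instanceof Element)) return;
|       if (node.id === 'sidebar_help') {
|         help.appendChild(node);
|       } else {
|         main.appendChild(node);
|       }
|     });
|   }
| });
| observer.observe(scaffold, { childList: true });
| ```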
| |
| - **Configuration via `window.perf`:** The scaffold reads various |
| configuration options from the global `window.perf` object. This allows |
| instances of Perf to customize links (help, feedback, chat), behavior |
| (e.g., `show_triage_link`), and display information (e.g., instance URL, |
| build tag). This makes the scaffold adaptable to different Perf |
| deployments. |
| |
| - For example, the `_helpUrl` and `_reportBugUrl` are initialized with |
| defaults but can be overridden by `window.perf.help_url_override` and |
| `window.perf.feedback_url` respectively. |
| |
| - The visibility of the "Triage" link is controlled by |
| `window.perf.show_triage_link`. |
| |
| - **Build Information:** It displays the current application build tag, |
| fetching it via `getBuildTag()` from |
| `//perf/modules/window:window_ts_lib` and linking it to the |
| corresponding commit in the buildbot git repository. |
| |
| - **Instance Title:** It can display the name of the Perf instance, |
| extracted from `window.perf.instance_url`. |
| |
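| A small sketch of the override pattern (field names as listed above; the
| defaults shown are placeholders, not the real values):
| |
| ```
| // Placeholder defaults; the real defaults live in perf-scaffold-sk.ts.
| const defaultHelpUrl = 'https://example.com/perf-help';
| const defaultReportBugUrl = 'https://example.com/report-bug';
| const perf = (window as any).perf; // global config object set by the server
| const helpUrl = perf.help_url_override || defaultHelpUrl;
| const reportBugUrl = perf.feedback_url || defaultReportBugUrl;
| const showTriage = !!perf.show_triage_link; // controls the Triage nav link
| ```
| |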
| - **`perf-scaffold-sk.scss`:** Defines the styles for the scaffold. |
| |
| - **Why:** Separates styling concerns from the element's logic. |
| - **How:** It uses SASS and imports common themes from |
| `//perf/modules/themes:themes_sass_lib`. It defines the layout, |
| including the sidebar width and the main content area's width (using |
| `calc(99vw - var(--sidebar-width))` to avoid horizontal scrollbars |
| caused by `100vw` including the scrollbar width). It also styles the |
| navigation links and other elements within the scaffold. |
| |
| - **`perf-scaffold-sk-demo.html` & `perf-scaffold-sk-demo.ts`:** Provide a |
| demonstration page for the scaffold. |
| |
| - **Why:** Allows developers to see the scaffold in action and test its |
| appearance and behavior in isolation. |
| - **How:** `perf-scaffold-sk-demo.ts` initializes a mock `window.perf` |
| object with various settings and then injects an instance of |
| `<perf-scaffold-sk>` with some placeholder content (including a `div` |
| with `id="sidebar_help"`) into the `perf-scaffold-sk-demo.html` page. |
| |
| **Workflow: Initializing and Rendering a Page with the Scaffold** |
| |
| 1. A Perf page (e.g., the "New Query" page) includes `<perf-scaffold-sk>` as
| its top-level layout element:
| |
| ```
| <!-- new_query_page.html -->
| <body>
|   <perf-scaffold-sk>
|     <!-- Content specific to the New Query page -->
|     <query-composer-sk></query-composer-sk>
|     <div id="sidebar_help">
|       <p>Tips for creating new queries...</p>
|     </div>
|   </perf-scaffold-sk>
| </body>
| ```
| |
| 2. The `PerfScaffoldSk` element's `connectedCallback` fires.
| 3. `perf-scaffold-sk.ts`:
| - Temporarily moves `<query-composer-sk>` and
| `<div id="sidebar_help">...</div>` out of `perf-scaffold-sk`.
| - Renders its own internal template (header with title, login, theme
| chooser; sidebar with nav links; empty main area; footer with error
| toast):
| |
| ```
| <app-sk>
|   <header>...</header>
|   <aside id=sidebar>
|     <div id=links>...</div>
|     <div id=help></div>   <-- Placeholder for sidebar help
|     ...
|   </aside>
|   <main></main>           <-- Placeholder for main content
|   <footer>...</footer>
| </app-sk>
| ```
| |
| - The `redistributeAddedNodes` function is called:
| - `<query-composer-sk>` (since it doesn't have `id="sidebar_help"`) is
| appended to the `<main>` element.
| - `<div id="sidebar_help">...</div>` is appended to the
| `<div id="help">` element within the `<aside>` sidebar.
| - A `MutationObserver` starts listening for any further children added
| directly to `<perf-scaffold-sk>`.
| |
| The final rendered structure (simplified) would look something like: |
| |
| ``` |
| perf-scaffold-sk |
| └── app-sk |
| ├── header |
| │ ├── h1.name (Instance Title) |
| │ ├── div.spacer |
| │ ├── alogin-sk |
| │ └── theme-chooser-sk |
| ├── aside#sidebar |
| │ ├── div#links |
| │ │ ├── a (New Query) |
| │ │ ├── a (Favorites) |
| │ │ └── ... (other nav links) |
| │ ├── div#help |
| │ │ └── div#sidebar_help (Content from original page) |
| │ │ └── <p>Tips for creating new queries...</p> |
| │ └── div#chat |
| ├── main |
| │ └── query-composer-sk (Content from original page) |
| └── footer |
| └── error-toast-sk |
| ``` |
| |
| # Module: /modules/picker-field-sk |
| |
| The `picker-field-sk` module provides a custom HTML element that serves as a |
| stylized text input field with an associated dropdown menu for selecting from a |
| predefined list of options. This component is designed to offer a user-friendly |
| way to pick a single value from potentially many choices, enhancing the user |
| experience in forms or selection-heavy interfaces. |
| |
| **Core Functionality and Design:** |
| |
| The primary goal of `picker-field-sk` is to present a familiar text input that, |
| upon interaction (focus or click), reveals a filterable list of valid options. |
| This addresses the need for a compact and efficient way to select an item, |
| especially when the number of options is large. |
| |
| The implementation leverages the Vaadin ComboBox component (`@vaadin/combo-box`) |
| for its underlying dropdown and filtering capabilities. This choice was made to |
| utilize a well-tested and feature-rich component, avoiding the need to |
| reimplement complex dropdown logic, keyboard navigation, and accessibility |
| features. `picker-field-sk` then wraps this Vaadin component, applying custom |
| styling and providing a simplified API tailored to its specific use case. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`picker-field-sk.ts`**: This is the heart of the module, defining the |
| `PickerFieldSk` custom element which extends `ElementSk`. |
| |
| - **Properties:** |
| - `label`: A string that serves as both the visual label above the input |
| field and the placeholder text within it when empty. This provides |
| context to the user about the expected input. |
| - `options`: An array of strings representing the valid choices the user |
| can select from. The component dynamically adjusts the width of the |
| dropdown overlay to accommodate the longest option, ensuring |
| readability. |
| - `helperText`: An optional string displayed below the input field, |
| typically used for providing additional guidance or information to the |
| user. |
| - **Events:** |
| - `value-changed`: This custom event is dispatched whenever the selected |
| value in the combo box changes. This includes selecting an item from the |
| dropdown, typing a value that matches an option (due to `autoselect`), |
| or clearing the input. The new value is available in |
| `event.detail.value`. This event is crucial for parent components to |
| react to user selections. |
| - **Methods:** |
| - `focus()`: Programmatically sets focus to the input field. |
| - `openOverlay()`: Programmatically opens the dropdown list of options. |
| This is useful for guiding the user or for integrating with other UI |
| elements. |
| - `disable()`: Makes the input field read-only, preventing user |
| interaction. |
| - `enable()`: Removes the read-only state, allowing user interaction. |
| - `clear()`: Clears the current value in the input field. |
| - `setValue(val: string)`: Programmatically sets the value of the input |
| field. |
| - `getValue()`: Retrieves the current value of the input field. |
| - **Rendering:** Uses `lit-html` for templating. The template renders a |
| `<vaadin-combo-box>` element and binds its properties and events to the |
| `PickerFieldSk` element's state. |
| - **Overlay Width Calculation:** The `calculateOverlayWidth()` private
| method dynamically adjusts the `--vaadin-combo-box-overlay-width` CSS
| custom property. It iterates through the `options` to find the longest
| string and sets the overlay width to be slightly larger than this
| string, ensuring all options are fully visible without truncation. This
| is a key usability enhancement.
| |
| ```
| User provides options --> PickerFieldSk.options setter
| |
| V
| calculateOverlayWidth()
| |
| V
| Find max option length
| |
| V
| Set --vaadin-combo-box-overlay-width CSS property
| ```
| |
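| A sketch of that heuristic (the `ch`-based sizing and function shape here
| are illustrative, not the exact formula used by the component):
| |
| ```
| // Make the dropdown overlay wide enough for the longest option string.
| function setOverlayWidth(el: HTMLElement, options: string[]): void {
|   const longest = options.reduce((max, o) => Math.max(max, o.length), 0);
|   el.style.setProperty('--vaadin-combo-box-overlay-width', `${longest + 2}ch`);
| }
| ```
| |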
| - **`picker-field-sk.scss`**: Contains the SASS styles for the component. |
| |
| - It primarily targets the underlying `vaadin-combo-box` and its shadow |
| parts (e.g., `::part(label)`, `::part(input-field)`, `::part(items)`) to |
| customize its appearance to match the application's theme (including |
| dark mode support). |
| - CSS custom properties like `--vaadin-field-default-width`, |
| `--vaadin-combo-box-overlay-width`, and `--lumo-text-field-size` are |
| used to control the dimensions and sizing of the Vaadin component. |
| - Dark mode styles are applied by targeting `.darkmode picker-field-sk`, |
| adjusting colors for labels, helper text, and input fields to ensure |
| proper contrast and visual integration. |
| |
| - **`index.ts`**: A simple entry point that imports and thereby registers the |
| `picker-field-sk` custom element, making it available for use in HTML. |
| |
| - **`picker-field-sk-demo.html` & `picker-field-sk-demo.ts`**: These files |
| create a demonstration page for the `picker-field-sk` component. |
| |
| - `picker-field-sk-demo.html` includes instances of the `picker-field-sk` |
| element and buttons to trigger its various functionalities (focus, fill, |
| open overlay, disable/enable). |
| - `picker-field-sk-demo.ts` contains JavaScript to initialize the demo |
| elements with sample data (a large list of "speedometer" options to |
| showcase performance with many items) and to wire up the buttons to the |
| corresponding methods of the `PickerFieldSk` instances. This allows |
| developers to visually inspect and interact with the component. |
| |
| **Workflow Example: User Selects an Option** |
| |
| 1. **Initialization**: A parent component instantiates `<picker-field-sk>` and
| sets its `label` and `options` properties:
| |
| ```
| <picker-field-sk .label="Fruit" .options=${['Apple', 'Banana', 'Cherry']}>
| </picker-field-sk>
| ```
| |
| 2. **User Interaction**: The user clicks on or focuses the `picker-field-sk`
| input. The `vaadin-combo-box` internally handles the focus/click and
| displays its dropdown with the available options.
| 3. **Filtering (Optional)**: The user types into the input field. The
| `vaadin-combo-box` filters the displayed options based on the typed text.
| 4. **Selection**: The user clicks an option from the dropdown or presses Enter
| when an option is highlighted. The `vaadin-combo-box` updates its internal
| value and emits its native `value-changed` event.
| 5. **Event Propagation**:
| - The `onValueChanged` method in `PickerFieldSk` catches the native
| `value-changed` event from the `vaadin-combo-box`.
| - `PickerFieldSk` then dispatches its own `value-changed` custom event,
| with the selected value in `event.detail.value`.
| 6. **Parent Component Reaction**: The parent component, listening for
| `value-changed` on the `<picker-field-sk>` element, receives the event and
| can act upon the selected value (e.g., updating application state). See the
| sketch below.
| |
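| A minimal consumer sketch (the event shape follows the description above):
| |
| ```
| // React to user selections made in the picker.
| const picker = document.querySelector('picker-field-sk')!;
| picker.addEventListener('value-changed', (e: Event) => {
|   const value = (e as CustomEvent<{ value: string }>).detail.value;
|   console.log('User picked:', value);
| });
| ```
| |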
| This layered approach, building upon the Vaadin ComboBox, provides a robust and |
| themeable selection component while abstracting away the complexities of the |
| underlying library for the consumers of `picker-field-sk`. |
| |
| # Module: /modules/pinpoint-try-job-dialog-sk |
| |
| ## Pinpoint Try Job Dialog (`pinpoint-try-job-dialog-sk`) |
| |
| The `pinpoint-try-job-dialog-sk` module provides a user interface element for |
| initiating Pinpoint A/B try jobs. |
| |
| **Purpose:** |
| |
| The primary reason for this module's existence within the Perf application is to |
| allow users to request additional trace data for specific benchmark runs. While |
| Pinpoint itself supports a wider range of try job use cases, this dialog is |
| specifically tailored for this trace generation scenario. It's important to note |
| that this component is considered a legacy feature, and future development |
| should favor the newer Pinpoint frontend. |
| |
| **How it Works:** |
| |
| The dialog is designed to gather the necessary parameters from the user to |
| construct and submit a Pinpoint A/B try job request. This process involves: |
| |
| 1. **Initialization:** The dialog can be pre-populated with initial values such |
| as the test path, base commit, and end commit. This often happens when a |
| user interacts with a chart tooltip and wants to investigate a specific data |
| point further. |
| 2. **User Input:** The user can modify the pre-filled values or enter new ones. |
| Key inputs include: |
| - **Base Commit:** The starting commit hash for the A/B comparison. |
| - **Experiment Commit:** The ending commit hash for the A/B comparison. |
| - **Tracing Arguments:** A string specifying the categories and options |
| for the trace generation. A default value is provided, and a link to |
| Chromium source documentation offers more details on available options. |
| 3. **Authentication:** The dialog uses `alogin-sk` to determine the logged-in |
| user. The user's email is included in the try job request. |
| 4. **Submission:** Upon submission, the dialog constructs a |
| `CreateLegacyTryRequest` object. This object encapsulates all the necessary |
| information for the Pinpoint backend. |
| - The `testPath` (e.g., `master/benchmark_name/story_name`) is parsed
| into its segments to populate the request, e.g., the benchmark
| (`benchmark_name`) and the story (`story_name`).
| - The `story` is typically the last segment of the `testPath`. |
| - The `extra_test_args` field is formatted to include the user-provided |
| tracing arguments. |
| 5. **API Interaction:** The dialog sends a POST request to the `/_/try/` |
| endpoint with the JSON payload. |
| 6. **Response Handling:** |
| - **Success:** If the request is successful, the Pinpoint backend responds |
| with a JSON object containing the `jobUrl` for the newly created |
| Pinpoint job. This URL is then displayed to the user, allowing them to |
| navigate to the Pinpoint UI to monitor the job's progress. |
| - **Error:** If an error occurs during the request or processing, an error |
| message is displayed to the user. |
| |
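| A sketch of the submission call described in steps 5 and 6 (the payload is
| abridged; the real `CreateLegacyTryRequest` carries more fields):
| |
| ```
| // POST the try-job request and return the URL of the created Pinpoint job.
| async function postTryJob(request: unknown): Promise<string> {
|   const resp = await fetch('/_/try/', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify(request),
|   });
|   if (!resp.ok) {
|     throw new Error(`Try job request failed: ${await resp.text()}`);
|   }
|   const json = await resp.json();
|   return json.jobUrl; // displayed to the user as a link to the job
| }
| ```
| |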
| **Workflow:** |
| |
| ``` |
| User Interaction (e.g., click on chart tooltip) |
| | |
| V |
| Dialog Pre-populated with context (testPath, commits) |
| | |
| V |
| pinpoint-try-job-dialog-sk.open() |
| | |
| V |
| User reviews/modifies input fields (Base Commit, Exp. Commit, Trace Args) |
| | |
| V |
| User clicks "Send to Pinpoint" |
| | |
| V |
| [pinpoint-try-job-dialog-sk] |
| - Gathers input values |
| - Retrieves logged-in user via alogin-sk |
| - Constructs `CreateLegacyTryRequest` JSON |
| - Sends POST request to /_/try/ |
| | |
| V |
| [Backend Pinpoint Service] |
| - Processes the request |
| - Creates A/B try job |
| - Returns jobUrl (success) or error |
| | |
| V |
| [pinpoint-try-job-dialog-sk] |
| - Displays spinner during request |
| - On Success: |
| - Displays link to the created Pinpoint job (jobUrl) |
| - Hides spinner |
| - On Error: |
| - Displays error message |
| - Hides spinner |
| ``` |
| |
| **Key Components/Files:** |
| |
| - **`pinpoint-try-job-dialog-sk.ts`:** This is the core TypeScript file that |
| defines the custom element's logic. |
| - **`PinpointTryJobDialogSk` class:** Extends `ElementSk` and manages the |
| dialog's state, user input, and interaction with the Pinpoint API. |
| - **`template`:** Defines the HTML structure of the dialog using |
| `lit-html`. This includes input fields for commits and tracing |
| arguments, a submit button, a spinner for loading states, and a link to |
| the created Pinpoint job. |
| - **`connectedCallback()`:** Initializes the dialog, sets up event |
| listeners (e.g., for form submission, closing the dialog on outside |
| click), and fetches the logged-in user's information. |
| - **`setTryJobInputParams(params: TryJobPreloadParams)`:** Allows external |
| components to pre-fill the dialog's input fields. This is crucial for |
| integrating the dialog with other parts of the Perf UI, like chart |
| tooltips. |
| - **`open()`:** Displays the modal dialog. |
| - **`closeDialog()`:** Closes the modal dialog. |
| - **`postTryJob()`:** This is the central method for handling the job |
| submission. It reads values from the input fields, constructs the |
| `CreateLegacyTryRequest` payload, and makes the `fetch` call to the |
| Pinpoint API. It also handles the UI updates based on the API response |
| (showing the job URL or an error message). |
| - **`TryJobPreloadParams` interface:** Defines the structure for the |
| parameters used to pre-populate the dialog. |
| - **`pinpoint-try-job-dialog-sk.scss`:** Contains the SASS/CSS styles for the |
| dialog, ensuring it aligns with the application's visual theme. It styles |
| the input fields, buttons, and the overall layout of the dialog. |
| - **`index.ts`:** A simple entry point that imports and registers the |
| `pinpoint-try-job-dialog-sk` custom element. |
| - **`BUILD.bazel`:** Defines the build rules for the module, specifying its |
| dependencies (e.g., `elements-sk` components like `select-sk`, `spinner-sk`, |
| `alogin-sk`, and Material Web components) and how it should be compiled. |
| |
| **Design Decisions:** |
| |
| - **Based on `bisect-dialog-sk`:** The dialog's structure and initial |
| functionality were adapted from an existing bisect dialog. This likely |
| accelerated development by reusing common patterns for dialog interactions |
| and API calls. |
| - **Legacy Component:** The explicit note to avoid building on top of this |
| dialog indicates a strategic decision to migrate towards a newer Pinpoint |
| frontend. This suggests that this component is maintained for existing |
| functionality but is not the target for future enhancements related to |
| Pinpoint interactions. |
| - **Specific Use Case:** The dialog is narrowly focused on requesting |
| additional traces. This simplifies the UI and the request payload, making it |
| easier for users to achieve this specific task. |
| - **Client-Side Request Construction:** The `CreateLegacyTryRequest` object is |
| fully constructed on the client-side before being sent to the backend. This |
| gives the frontend more control over the request parameters. |
| - **Standard HTML Dialog:** The use of the `<dialog>` HTML element provides |
| built-in modal behavior, simplifying the implementation of showing and |
| hiding the dialog. |
| - **Error Handling:** The dialog includes basic error handling by displaying |
| messages returned from the API, improving the user experience when things go |
| wrong. |
| - **Spinner for Feedback:** The `spinner-sk` component provides visual |
| feedback to the user while the API request is in progress. |
| |
| This component serves as a bridge for users of the Perf application to leverage |
| Pinpoint's capabilities for generating detailed trace information, even as the |
| broader Pinpoint tooling evolves. |
| |
| # Module: /modules/pivot-query-sk |
| |
| The `pivot-query-sk` module provides a custom HTML element for users to |
| configure and interact with pivot table requests. Pivot tables are a powerful |
| data summarization tool, and this element allows users to define how data should |
| be grouped, what aggregate operations should be performed, and what summary |
| statistics should be displayed. |
| |
| The core of the module is the `PivotQuerySk` class, which extends `ElementSk`. |
| This class manages the state of the pivot request and renders the UI for user |
| interaction. It leverages other custom elements like `multi-select-sk` and |
| `select-sk` to provide intuitive input controls. |
| |
| **Key Design Choices and Implementation Details:** |
| |
| - **Event-Driven Updates:** The element emits a custom event, `pivot-changed`, |
| whenever the user modifies any part of the pivot request. This allows |
| consuming applications to react to changes in real-time. The event detail |
| (`PivotQueryChangedEventDetail`) contains the updated `pivot.Request` object |
| or `null` if the current configuration is invalid. This decouples the UI |
| component from the application logic that processes the pivot request. |
| - **Data Binding and Rendering:** The `PivotQuerySk` element uses Lit's `html` |
| templating for rendering. It maintains internal state for the |
| `_pivotRequest` (the current pivot configuration) and `_paramset` (the |
| available options for grouping). When these properties are set or updated, |
| the `_render()` method is called to re-render the component, ensuring the UI |
| reflects the current state. |
| - **Handling Null Pivot Requests:** The `createDefaultPivotRequestIfNull()` |
| method ensures that if `_pivotRequest` is initially `null`, it's initialized |
| with a default valid structure before any user interaction attempts to |
| modify it. This prevents errors and provides a sensible starting point. |
| - **Dynamic Option Generation:** The options for "group by" and "summary" are |
| dynamically generated based on the provided `_paramset` and the existing |
| `_pivotRequest`. The `allGroupByOptions()` method is particularly noteworthy |
| as it ensures that even if the `_paramset` changes, any currently selected |
| `group_by` keys in the `_pivotRequest` are still displayed as options. This |
| prevents accidental data loss during `_paramset` updates. It achieves this |
| by concatenating keys from both sources, sorting, and then filtering out |
| duplicates. |
| - **Input Validation:** The `pivotRequest` getter includes a call to |
| `validatePivotRequest` (from `pivotutil`). This ensures that the component |
| only returns a valid `pivot.Request` object. If the current configuration is |
| invalid, it returns `null`. This promotes data integrity. |
| |
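| A sketch of the merge-and-dedupe approach behind `allGroupByOptions()` (the
| function shape is assumed; only the technique is taken from the description
| above):
| |
| ```
| // Keep currently selected group_by keys visible even if the paramset no
| // longer contains them: concatenate, sort, then drop adjacent duplicates.
| function allGroupByOptions(paramsetKeys: string[], selected: string[]): string[] {
|   return paramsetKeys
|     .concat(selected)
|     .sort()
|     .filter((key, i, arr) => i === 0 || key !== arr[i - 1]);
| }
| ```
| |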
| **Responsibilities and Key Components:** |
| |
| - **`pivot-query-sk.ts`**: This is the main file defining the `PivotQuerySk` |
| custom element. |
| |
| - **`PivotQuerySk` class**: |
| - Manages the `pivot.Request` object, which defines the grouping, |
| operation, and summary statistics for a pivot table. |
| - Takes a `ParamSet` as input, which provides the available keys for the |
| "group by" selection. This `ParamSet` likely originates from the dataset |
| being analyzed. |
| - Renders UI controls (multi-selects and a select dropdown) for users to |
| specify: |
| - **Group By Keys**: Which parameters to use for grouping data rows |
| (e.g., 'config', 'os'). This uses `multi-select-sk`. |
| - **Operation**: The primary aggregate function to apply (e.g., 'avg', |
| 'sum', 'count'). This uses a standard `select` element. |
| - **Summary Statistics**: Optional additional aggregate functions to |
| calculate for each group (e.g., 'stddev', 'percentile'). This also |
| uses `multi-select-sk`. |
| - Emits a `pivot-changed` custom event when the user modifies the pivot |
| request. |
| - **`PivotQueryChangedEventDetail` type**: Defines the structure of the |
| data passed in the `pivot-changed` event. |
| - **`PivotQueryChangedEventName` constant**: The string name of the custom |
| event. |
| - **Event Handlers (`groupByChanged`, `operationChanged`, |
| `summaryChanged`)**: These methods are triggered by user interactions |
| with the respective UI elements. They update the internal |
| `_pivotRequest` and then call `emitChangeEvent`. |
| - **`emitChangeEvent()`**: Constructs and dispatches the `pivot-changed` |
| event. |
| - **Property Getters/Setters (`pivotRequest`, `paramset`)**: Provide |
| controlled access to the element's core data, triggering re-renders when |
| set. |
| |
| - **`pivot-query-sk.scss`**: Contains the styling for the `pivot-query-sk` |
| element. It ensures a consistent look and feel, leveraging styles from |
| `themes_sass_lib` and `select_sass_lib`. The layout is primarily flex-based |
| to arrange the different selection components. |
| |
| - **`pivot-query-sk-demo.html` and `pivot-query-sk-demo.ts`**: These files |
| provide a demonstration page for the `pivot-query-sk` element. |
| |
| - The HTML sets up a basic page structure and includes an instance of |
| `pivot-query-sk`. |
| - The TypeScript initializes the demo element with sample `pivot.Request` |
| data and a `ParamSet`. It also includes an event listener for |
| `pivot-changed` to display the selected pivot configuration as JSON, |
| illustrating how to consume the element's output. |
| |
| **Workflow for User Interaction and Event Emission:** |
| |
| 1. **Initialization:** |
| |
| - The `pivot-query-sk` element is created. |
| - The consuming application sets the `paramset` (available grouping keys) |
| and optionally an initial `pivotRequest`. |
| - The element renders its initial state based on these inputs. |
| |
| 2. **User Modifies a Selection (e.g., changes a "group by" option):** |
| |
| - `multi-select-sk` (for "group by") emits a `selection-changed` event. |
| - `PivotQuerySk.groupByChanged()` is called. |
| - `createDefaultPivotRequestIfNull()` ensures `_pivotRequest` is not null. |
| - `_pivotRequest.group_by` is updated based on the new selection. |
| - `emitChangeEvent()` is called. |
| |
| 3. **Event Emission:** |
| |
| - `emitChangeEvent()`: |
| - Retrieves the current `pivotRequest` (which might be `null` if |
| invalid). |
| - Creates a new `CustomEvent` named `pivot-changed`. |
| - The `detail` of the event is the current (potentially validated) |
| `pivotRequest`. |
| - The event is dispatched, bubbling up the DOM. |
| |
| 4. **Application Responds:** |
| |
| - The consuming application, listening for `pivot-changed` events on the |
| `pivot-query-sk` element or one of its ancestors, receives the event. |
| - The application can then use the `event.detail` (the `pivot.Request`) to |
| update its data display, fetch new data, or perform other actions. |
| |
| This flow can be visualized as: |
| |
| ``` |
| User Interaction (e.g., click on multi-select) |
| | |
| v |
| Internal element event (e.g., @selection-changed from multi-select-sk) |
| | |
| v |
| PivotQuerySk Event Handler (e.g., groupByChanged) |
| | |
| v |
| Update internal _pivotRequest state |
| | |
| v |
| PivotQuerySk.emitChangeEvent() |
| | |
| v |
| Dispatch "pivot-changed" CustomEvent (with pivot.Request as detail) |
| | |
| v |
| Consuming Application's Event Listener |
| | |
| v |
| Application processes the new pivot.Request |
| ``` |
| |
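| A consumer sketch tying the flow together (property and event names from the
| description above; the `ParamSet` literal is illustrative):
| |
| ```
| // Wire up a pivot-query-sk instance and react to configuration changes.
| const pivotQuery = document.querySelector('pivot-query-sk')! as any;
| pivotQuery.paramset = { config: ['8888', '565'], arch: ['x86', 'arm'] };
| pivotQuery.addEventListener('pivot-changed', (e: Event) => {
|   const req = (e as CustomEvent).detail; // pivot.Request, or null if invalid
|   if (req !== null) {
|     // e.g., fetch and re-render data for the new grouping/operation/summary.
|   }
| });
| ```
| |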
| # Module: /modules/pivot-table-sk |
| |
| The `pivot-table-sk` module provides a custom HTML element, `<pivot-table-sk>`, |
| designed to display pivoted data in a tabular format. This element is |
| specifically for DataFrames that have been pivoted and contain summary values, |
| as opposed to summary traces (which would be displayed in a plot). |
| |
| **Core Functionality and Design** |
| |
| The primary purpose of `pivot-table-sk` is to present complex, multi-dimensional |
| data in an understandable and interactive table. The "why" behind its design is |
| to offer a user-friendly way to explore summarized data that arises from |
| pivoting operations. |
| |
| Key design considerations include: |
| |
| - **Data Input:** It takes a `DataFrame` (from |
| `//perf/modules/json:index_ts_lib`) and a `pivot.Request` (also from |
| `//perf/modules/json:index_ts_lib`) as input. The `pivot.Request` is crucial |
| as it dictates how the `DataFrame` was originally pivoted, including the |
| `group_by` keys, the main `operation`, and the `summary` operations. |
| - **Display:** The data is rendered as an HTML table. The table headers are |
| derived from the `group_by` keys and the `summary` operations. |
| - **Interactivity (Sorting):** Users can sort the table by clicking on column |
| headers. The sorting mechanism is designed to be intuitive, mimicking |
| spreadsheet behavior where subsequent sorts on different columns break ties |
| from previous sorts. |
| - **Query Context:** The element also displays the query parameters, the |
| "group by" keys, the primary operation, and the summary operations that led |
| to the current view of the data. This provides context to the user. |
| - **Validation:** It includes a mechanism to validate if the provided |
| `pivot.Request` is suitable for display as a pivot table (using |
| `validateAsPivotTable` from `//perf/modules/pivotutil:index_ts_lib`). This |
| prevents rendering errors or confusing displays if the input data structure |
| isn't appropriate. |
| |
| **Key Components and Files** |
| |
| - **`pivot-table-sk.ts`**: This is the heart of the module, defining the |
| `PivotTableSk` custom element. |
| |
| - **`PivotTableSk` class:** |
| - Extends `ElementSk` (from `//infra-sk/modules/ElementSk:index_ts_lib`). |
| - Manages the input `DataFrame` (`df`), `pivot.Request` (`req`), and the |
| original `query` string. |
| - **`KeyValues` type and `keyValuesFromTraceSet` function:** This is a |
| critical internal data structure. `KeyValues` is an object where keys |
| are trace keys (e.g., `',arch=x86,config=8888,'`) and values are arrays |
| of strings. These string arrays represent the values of the parameters |
| specified in `req.group_by`, in the same order. For example, if |
| `req.group_by` is `['config', 'arch']`, then for the trace |
| `',arch=arm,config=8888,'`, the corresponding `KeyValues` entry would be |
| `['8888', 'arm']`. This transformation is performed by |
| `keyValuesFromTraceSet` and is essential for rendering the "key" columns |
| of the table and for sorting by these keys. |
| - **`SortSelection` class:** Represents the sorting state of a single |
| column. It stores: |
| - `column`: The index of the column. |
| - `kind`: Whether the column represents 'keyValues' (from `group_by`) |
| or 'summaryValues' (from `summary` operations). |
| - `dir`: The sort direction ('up' or 'down'). |
| - It provides methods to `toggleDirection`, `buildCompare` (to create |
| a JavaScript sort comparison function based on its state), and |
| `encode`/`decode` for serialization. |
| - **`SortHistory` class:** Manages the overall sorting state of the table. |
| - It holds an array (`history`) of `SortSelection` objects. |
| - The "spreadsheet-like" multi-column sorting is achieved here. When a |
| user clicks a column to sort, that column's `SortSelection` is moved |
| to the _front_ of the `history` array, and its direction is toggled. |
| - `buildCompare` in `SortHistory` creates a composite comparison |
| function that iterates through the `SortSelection` objects in |
| `history`. The first `SortSelection` determines the primary sort |
| order. If it results in a tie, the second `SortSelection` is used to |
| break the tie, and so on. This creates the effect of a stable sort |
| across multiple user interactions without needing a true stable sort |
| algorithm for each click. |
| - It also provides `encode`/`decode` methods to serialize the entire |
| sort history (e.g., for persisting sort state in a URL). |
| - **`set()` method:** The primary way to provide data to the component. It |
| initializes `keyValues`, `sortHistory`, and the main `compare` function. |
| It can also accept an `encodedHistory` string to restore a previous sort |
| state. |
| - **Rendering Logic (Templates):** Uses `lit-html` for templating. |
| - `queryDefinition()`: Renders the contextual information about the |
| query and pivot operations. |
| - `tableHeader()`, `keyColumnHeaders()`, `summaryColumnHeaders()`: |
| Generate the table header row, including sort icons. |
| - `sortArrow()`: Dynamically displays the correct sort icon (up arrow, |
| down arrow, or neutral sort icon) based on the current |
| `SortHistory`. |
| - `tableRows()`, `keyRowValues()`, `summaryRowValues()`: Generate the |
| data rows of the table, applying the current sort order. |
| - `displayValue()`: Formats numerical values for display, converting a |
| special sentinel value (`MISSING_DATA_SENTINEL` from |
| `//perf/modules/const:const_ts_lib`) to '-'. |
| - **Event Emission:** Emits a `change` event when the user sorts the |
| table. The event detail (`PivotTableSkChangeEventDetail`) is the encoded |
| `SortHistory` string. This allows parent components to react to sort |
| changes and potentially persist the state. |
| - **Dependencies:** |
| - Relies on `paramset-sk` to display the query parameters. |
| - Uses various icon elements (`arrow-drop-down-icon-sk`, |
| `arrow-drop-up-icon-sk`, `sort-icon-sk`) for the sort indicators. |
| - `//perf/modules/json:index_ts_lib` for `DataFrame`, `TraceSet`, |
| `pivot.Request` types. |
| - `//perf/modules/pivotutil:index_ts_lib` for `operationDescriptions` and |
| `validateAsPivotTable`. |
| - `//perf/modules/paramtools:index_ts_lib` for `fromKey` (to parse trace |
| keys into parameter sets). |
| - `//infra-sk/modules:query_ts_lib` for `toParamSet` (to convert a query |
| string into a `ParamSet`). |
| |
| - **`pivot-table-sk.scss`**: Provides the styling for the `pivot-table-sk` |
| element, including table borders, padding, text alignment, and cursor styles |
| for interactive elements. It leverages themes from |
| `//perf/modules/themes:themes_sass_lib`. |
| |
| - **`index.ts`**: A simple entry point that imports and thereby registers the |
| `pivot-table-sk` custom element. |
| |
| - **`pivot-table-sk-demo.html` & `pivot-table-sk-demo.ts`**: |
| |
| - These files set up a demonstration page for the `pivot-table-sk` |
| element. |
| - `pivot-table-sk-demo.ts` creates sample `DataFrame` and `pivot.Request` |
| objects and uses them to populate instances of `pivot-table-sk` on the |
| demo page. This is crucial for development and visual testing. It |
| demonstrates valid use cases, cases with invalid pivot requests, and |
| cases with null DataFrames to ensure the component handles these |
| scenarios gracefully. |
| |
| - **Test Files (`pivot-table-sk_test.ts`, |
| `pivot-table-sk_puppeteer_test.ts`)**: |
| |
| - **`pivot-table-sk_test.ts` (Karma test):** Contains unit tests for the |
| `PivotTableSk` element and its internal logic, particularly the |
| `SortSelection` and `SortHistory` classes. It verifies: |
| - Correct initialization and rendering. |
| - The sorting behavior when column headers are clicked (e.g., sort |
| direction changes, correct sort icons appear, `change` event is emitted |
| with the correct encoded history). |
| - The `buildCompare` functions in `SortSelection` and `SortHistory` |
| produce the correct sorting results for various data types and sort |
| directions. |
| - The `encode` and `decode` methods for `SortSelection` and `SortHistory` |
| work correctly, allowing for round-tripping of sort state. |
| - The `keyValuesFromTraceSet` function correctly transforms `TraceSet` |
| data based on the `pivot.Request`. |
| - **`pivot-table-sk_puppeteer_test.ts` (Puppeteer test):** Performs |
| end-to-end tests by loading the demo page in a headless browser. |
| - It checks if the elements render correctly on the page (smoke test). |
| - It takes screenshots of the rendered component for visual regression |
| testing. |
| |
| **Workflow Example: User Sorting the Table** |
| |
| 1. **Initial State:** |
| |
| - The `pivot-table-sk` element is initialized with a `DataFrame`, a |
| `pivot.Request`, and an optional initial `encodedHistory` string. |
| - `pivot-table-sk` creates a `SortHistory` object. If `encodedHistory` is |
| provided, `SortHistory.decode()` is called. Otherwise, a default sort |
| order is established (usually based on the order of summary columns, |
| then key columns, all initially 'up'). |
| - `SortHistory.buildCompare()` generates the initial comparison function. |
| - The table is rendered, sorted according to this initial comparison |
| function. Each column header shows a default `sort-icon-sk`. |
| |
| 2. **User Clicks a Column Header (e.g., "config" key column):** |
| |
| - `changeSort(columnIndex, 'keyValues')` is called within |
| `pivot-table-sk`. |
| - `this.sortHistory.selectColumnToSortOn(columnIndex, 'keyValues')` is |
| invoked: |
| - The `SortSelection` for the "config" column is found in |
| `this.sortHistory.history`. |
| - It's removed from its current position. |
| - Its `direction` is toggled (e.g., from 'up' to 'down'). |
| - This updated `SortSelection` is prepended to
| `this.sortHistory.history`:
| |
| ```
| Before: [SummaryCol0(up), SummaryCol1(up), KeyCol0(config, up), KeyCol1(arch, up)]
| Click on KeyCol0 (config):
| After:  [KeyCol0(config, down), SummaryCol0(up), SummaryCol1(up), KeyCol1(arch, up)]
| ```
| |
| - `this.compare = this.sortHistory.buildCompare(...)` is called. A new
| composite comparison function is generated. Now, rows will primarily be
| sorted by "config" (descending). Ties will be broken by "SummaryCol0"
| (ascending), then "SummaryCol1" (ascending), and finally "KeyCol1"
| (ascending).
| - A `CustomEvent('change')` is dispatched. The `event.detail` contains
| `this.sortHistory.encode()`, which is a string representation of the new
| sort order (e.g., "dk0-su0-su1-ku1").
| - `this._render()` is called, re-rendering the table with the new sort
| order. The "config" column header now shows an
| `arrow-drop-down-icon-sk`.
| |
| 3. **User Clicks Another Column Header (e.g., "avg" summary column):** |
| |
| - The process repeats. The `SortSelection` for the "avg" column is moved
| to the front of `this.sortHistory.history` and its direction is toggled:
| |
| ```
| Before: [KeyCol0(config, down), SummaryCol0(avg, up), SummaryCol1(sum, up), KeyCol1(arch, up)]
| Click on SummaryCol0 (avg):
| After:  [SummaryCol0(avg, down), KeyCol0(config, down), SummaryCol1(sum, up), KeyCol1(arch, up)]
| ```
| |
| - The table is re-rendered, now primarily sorted by "avg" (descending),
| with ties broken by "config" (descending), then "sum" (ascending), then
| "arch" (ascending).
| |
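| A sketch of how such a composite comparator can be built (types simplified;
| the real `SortHistory.buildCompare` works over the DataFrame's columns):
| |
| ```
| // The first selection in the history determines the primary order; each
| // later selection is only consulted to break ties.
| type Dir = 'up' | 'down';
| type Row = number[]; // simplified; real rows mix key strings and summaries
| function buildCompare(history: { column: number; dir: Dir }[]) {
|   return (a: Row, b: Row): number => {
|     for (const sel of history) {
|       const diff = a[sel.column] - b[sel.column];
|       if (diff !== 0) return sel.dir === 'up' ? diff : -diff;
|     }
|     return 0; // complete tie across the whole history
|   };
| }
| ```
| |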
| This multi-level sorting, driven by the `SortHistory` maintaining the sequence |
| of user sort actions, is a key aspect of the "how" behind the `pivot-table-sk`'s |
| user experience. It aims to provide a powerful yet familiar way to analyze |
| pivoted data. |
| |
| # Module: /modules/pivotutil |
| |
| The `pivotutil` module provides utility functions and constants for working with |
| pivot table requests. Its primary purpose is to ensure the validity and |
| integrity of pivot requests before they are processed, and to offer |
| human-readable descriptions for pivot operations. This centralization of |
| pivot-related logic helps maintain consistency and simplifies the handling of |
| pivot table configurations across different parts of the application. |
| |
| ### Key Components and Responsibilities |
| |
| **`index.ts`**: This is the core file of the module and contains all the |
| exported functionalities. |
| |
| - **`operationDescriptions`**: |
| |
| - **Why**: Pivot operations are often represented by short, cryptic |
| identifiers (e.g., `avg`, `std`). To improve user experience and make |
| UIs more understandable, a mapping to human-readable names is necessary. |
| - **How**: This is a simple JavaScript object (dictionary) where keys are |
| the `pivot.Operation` enum values (imported from `../json`) and values |
| are their corresponding descriptive strings (e.g., "Mean", "Standard |
| Deviation"). This allows for easy lookup and display of operation names. |
| |
| - **`validatePivotRequest(req: pivot.Request | null): string`**: |
| |
| - **Why**: Before attempting to process a pivot request, it's crucial to |
| ensure that the request is structurally sound and contains the minimally |
| required information. This prevents runtime errors and provides early |
| feedback to the user or calling code if the request is malformed. |
| - **How**: This function performs basic validation checks on a |
| `pivot.Request` object. |
| |
| * It first checks if the request itself is `null`. If so, it returns an |
| error message. |
| * It then verifies that the `group_by` property is present and is an array |
| with at least one element. A pivot table fundamentally relies on |
| grouping data, so this is a mandatory field. |
| * If all checks pass, it returns an empty string, indicating a valid |
| request. Otherwise, it returns a string describing the specific |
| validation error. |
| |
| - **Workflow**:
| |
| ```
| Input: pivot.Request | null
| |
| V
| Is request null? --(Yes)--> Return "Pivot request is null."
| | (No)
| V
| Is req.group_by null or empty? --(Yes)--> Return "Pivot must have at least one GroupBy."
| | (No)
| V
| Return "" (Valid)
| ```
| |
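| A sketch matching the checks described above (the error strings are taken
| from the description, and the `pivot` types are assumed importable from
| `../json` as noted earlier):
| |
| ```
| import { pivot } from '../json';
| // Returns '' when valid, otherwise a description of the problem.
| export function validatePivotRequest(req: pivot.Request | null): string {
|   if (!req) return 'Pivot request is null.';
|   if (!req.group_by || req.group_by.length === 0) {
|     return 'Pivot must have at least one GroupBy.';
|   }
|   return '';
| }
| ```
| |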
| - **`validateAsPivotTable(req: pivot.Request | null): string`**: |
| |
| - **Why**: Some contexts specifically require a pivot _table_ that |
| displays summary values, not just a pivot _plot_ which might only group |
| traces without performing summary calculations. This function enforces |
| the presence of summary operations. |
| - **How**: |
| |
| * It first calls `validatePivotRequest` to ensure the basic structure of |
| the request is valid. If `validatePivotRequest` returns an error, that |
| error is immediately returned. |
| * If the basic validation passes, it then checks if the `summary` property |
| of the request is present and is an array with at least one element. |
| Summary operations (like sum, average, etc.) are essential for |
| generating the aggregated values displayed in a pivot table. Without |
| them, the request might be valid for plotting individual traces grouped |
| by some criteria, but not for a typical pivot table with summarized |
| data. |
| * If the `summary` array is missing or empty, an error message is |
| returned. Otherwise, an empty string is returned. |
| |
| - **Workflow**:
| |
| ```
| Input: pivot.Request | null
| |
| V
| Call validatePivotRequest(req) --> invalidMsg
| |
| V
| Is invalidMsg not empty? --(Yes)--> Return invalidMsg
| | (No)
| V
| Is req.summary null or empty? --(Yes)--> Return "Must have at least one Summary operation."
| | (No)
| V
| Return "" (Valid for pivot table)
| ```
| |
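| The layered check can be sketched the same way, reusing the basic validator
| from the sketch above:
| |
| ```
| export function validateAsPivotTable(req: pivot.Request | null): string {
|   const invalidMsg = validatePivotRequest(req);
|   if (invalidMsg) return invalidMsg; // basic structure must be valid first
|   if (!req!.summary || req!.summary.length === 0) {
|     return 'Must have at least one Summary operation.';
|   }
|   return '';
| }
| ```
| |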
| **`index_test.ts`**: This file contains unit tests for the functions in |
| `index.ts`. |
| |
| - **Why**: To ensure the validation logic correctly identifies valid and |
| invalid pivot requests under various conditions. This maintains the |
| reliability of the `pivotutil` module. |
| - **How**: It uses the `chai` assertion library to define test cases. |
| - For `validatePivotRequest`, it tests scenarios like: |
| - `null` request. |
| - `group_by` being `null`. |
| - `group_by` being an empty array. |
| - A completely valid request. |
| - For `validateAsPivotTable`, it builds upon the `validatePivotRequest`
| checks and adds tests for:
| - `summary` being `null`.
| - `summary` being an empty array.
| - A valid request with at least one summary operation.
| |
| Each test asserts whether the validation functions return an empty
| string (for valid inputs) or a non-empty error message string (for
| invalid inputs) as expected.
| |
| The design decision to separate `validatePivotRequest` and |
| `validateAsPivotTable` allows for more granular validation. Some parts of an |
| application might only need the basic validation (e.g., ensuring data can be |
| grouped), while others specifically require summary operations for display in a |
| tabular format. This separation provides flexibility. The use of descriptive |
| error messages aids in debugging and user feedback. |
| |
| # Module: /modules/plot-google-chart-sk |
| |
| The `plot-google-chart-sk` module provides a custom element for rendering |
| interactive time-series charts using Google Charts. It is designed to display |
| performance data, including anomalies and user-reported issues, and allows users |
| to interact with the chart through panning, zooming, and selecting data points. |
| |
| **Key Responsibilities:** |
| |
| - **Data Visualization:** Renders line charts based on `DataTable` objects, |
| which are consumed from a Lit context (`dataTableContext`). This `DataTable` |
| typically contains time-series data where the first column is a commit |
| identifier (e.g., revision number or timestamp), the second is a date |
| object, and subsequent columns represent different data traces. |
| - **Interactivity:** |
| - **Panning:** Allows users to pan the chart horizontally by clicking and |
| dragging. |
| - **Zooming:** Supports both horizontal and vertical zooming. Users can |
| Ctrl-click and drag to select a region to zoom into. A reset button |
| allows returning to the original view. |
| - **Delta Calculation:** Enables users to Shift-click and drag vertically |
| to measure the difference (both raw and percentage) between two Y-axis |
| values. |
| - **Tooltip Display:** Shows detailed information about a data point when |
| the user hovers over it. |
| - **Data Point Selection:** Allows users to click on a data point to |
| select it, which can trigger other actions in the application. |
| - **Anomaly and Issue Display:** Overlays icons on the chart to indicate |
| anomalies (regressions, improvements, untriaged, ignored) and user-reported |
| issues at specific data points. These are also consumed from Lit contexts |
| (`dataframeAnomalyContext` and `dataframeUserIssueContext`). |
| - **Legend and Trace Management:** Includes a side panel (`side-panel-sk`) |
| that displays a legend for the plotted traces. Users can toggle the |
| visibility of individual traces using checkboxes in the side panel. |
| - **Dynamic Updates:** Responds to changes in data, selected ranges, and other |
| properties by redrawing or updating the chart view. |
| |
| **Design Decisions and Implementation Choices:** |
| |
| - **Google Charts Integration:** Leverages the |
| `@google-web-components/google-chart` library for the core charting |
| functionality. This provides a robust and feature-rich charting engine. |
| - **LitElement and Context API:** Built as a LitElement custom element, making |
| it easy to integrate into modern web applications. It utilizes Lit's Context |
| API to consume shared data like the `DataTable`, anomaly information, and |
| loading states from parent components or a centralized data store. This |
| promotes a decoupled architecture. |
| - **Modular Sub-components:** |
| - `v-resizable-box-sk`: A dedicated component for the vertical selection |
| box used in the "deltaY" mode. It calculates and displays the difference |
| between the start and end points of the drag. |
| - `drag-to-zoom-box-sk`: Handles the visual representation of the |
| selection box during the drag-to-zoom interaction. It manages the |
| display and dimensions of the box as the user drags. |
| - `side-panel-sk`: Encapsulates the legend and trace visibility controls. |
| This separation of concerns keeps the main chart component focused on |
| plotting. |
| - **Event-Driven Communication:** Emits custom events (e.g., |
| `selection-changed`, `plot-data-mouseover`, `plot-data-select`) to notify |
| parent components of user interactions and chart state changes. This allows |
| for integration with other parts of an application. |
| - **Overlay for Anomalies and Issues:** Anomalies and user issues are rendered |
| as absolutely positioned `md-icon` elements on top of the chart. Their |
| positions are calculated based on the chart's layout and the data point |
| coordinates. This approach avoids modifying the Google Chart's internal |
| rendering and allows for more flexible styling and interaction with these |
| markers. |
| - **Caching and Performance:** |
| - Caches the Google Chart object (`this.chart`) and chart layout |
| information (`this.cachedChartArea`) to avoid redundant lookups. |
| - Maintains a `removedLabelsCache` to efficiently hide and show traces |
| without reconstructing the entire `DataView` each time. |
| - **Separate Interaction Modes:** The `navigationMode` property (`pan`, |
| `deltaY`, `dragToZoom`) manages the current mouse interaction state. This |
| simplifies event handling by directing mouse events to the appropriate logic |
| based on the active mode. |
| - **Dynamic Y-Axis Title:** The `determineYAxisTitle` method attempts to |
| create a meaningful Y-axis title by examining the `unit` and |
| `improvement_direction` parameters from the trace names. It displays these |
| only if they are consistent across all visible traces. |
| |
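| A sketch of the context-consumption pattern used here (the context object
| below is a stand-in; the real element consumes `dataTableContext` and the
| anomaly/user-issue contexts from Perf's dataframe modules):
| |
| ```
| import { LitElement, html } from 'lit';
| import { customElement } from 'lit/decorators.js';
| import { consume, createContext } from '@lit/context';
| // Stand-in context for illustration only.
| const dataTableContext = createContext<unknown>('data-table');
| @customElement('context-consumer-demo')
| class ContextConsumerDemo extends LitElement {
|   // subscribe: true re-renders this element whenever the provider updates.
|   @consume({ context: dataTableContext, subscribe: true })
|   private dataTable: unknown = null;
|   render() {
|     return html`<p>Have data: ${this.dataTable !== null}</p>`;
|   }
| }
| ```
| |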
| **Key Components/Files:** |
| |
| - **`plot-google-chart-sk.ts`:** The core component that orchestrates the |
| chart display and interactions. |
| - Manages the Google Chart instance. |
| - Handles mouse events for panning, zooming, delta calculation, and data |
| point interactions. |
| - Consumes data (`DataTable`, `AnomalyMap`, `UserIssueMap`) via Lit |
| context. |
| - Renders anomaly and user issue icons as overlays. |
| - Communicates with `side-panel-sk` to manage trace visibility. |
| - Dispatches custom events for user interactions. |
| - **`side-panel-sk.ts`:** Implements the side panel containing the legend and |
| checkboxes for toggling trace visibility. |
| - Generates legend entries based on the `DataTable`. |
| - Manages the checked state of traces and communicates changes to |
| `plot-google-chart-sk`. |
| - Can display the calculated delta values from the `v-resizable-box-sk`. |
| - **`v-resizable-box-sk.ts`:** A custom element for the vertical resizable |
| selection box used during the delta calculation (Shift-click + drag). |
| - Displays the selection box and calculates the raw and percentage |
| difference between the Y-values at the start and end of the drag. |
| - **`drag-to-zoom-box-sk.ts`:** A custom element for the selection box used |
| during the drag-to-zoom interaction (Ctrl-click + drag). |
| - Draws a semi-transparent rectangle indicating the area to be zoomed. |
| - **`plot-google-chart-sk-demo.ts` and `plot-google-chart-sk-demo.html`:** |
| Provide a demonstration page showcasing the `plot-google-chart-sk` element |
| with sample data. This is crucial for development and testing. |
| - **`index.ts`:** Serves as the entry point for the module, importing and |
| registering all the custom elements defined within. |
| |
| **Key Workflows:** |
| |
| 1. **Initial Chart Rendering:**
| |
| ```
| DataTable (from context) -> plot-google-chart-sk -> updateDataView()
|   -> Creates google.visualization.DataView
|   -> Sets columns based on domain (commit/date) and visible traces
| updateOptions() configures chart appearance (colors, axes, view window)
| plotElement.value.view = view and plotElement.value.options = options
|   -> Google Chart renders
| onChartReady():
|   -> Caches chart object
|   -> Calls drawAnomaly(), drawUserIssues(), drawXbar()
| ```
| |
| 2. **Panning:**
| |
| ```
| User mousedown (not Shift or Ctrl)
|   -> onChartMouseDown(): navigationMode = 'pan'
| User mousemove
|   -> onWindowMouseMove():
|      -> Calculates deltaX based on mouse movement and current domain
|      -> Updates this.selectedRange
|      -> Calls updateOptions() to update the chart's horizontal view window
|      -> Dispatches selection-changing event
| User mouseup
|   -> onWindowMouseUp():
|      -> Dispatches selection-changed event
|      -> navigationMode = null
| ```
| |
| 3. **Drag-to-Zoom:**
| |
| ```
| User Ctrl + mousedown
|   -> onChartMouseDown(): navigationMode = 'dragToZoom'
|   -> zoomRangeBox.value.initializeShow(): Displays the drag box
| User mousemove
|   -> onWindowMouseMove():
|      -> zoomRangeBox.value.handleDrag(): Updates the drag box dimensions
| User mouseup
|   -> onChartMouseUp():
|      -> Calculates zoom boundaries based on the drag box and isHorizontalZoom
|      -> zoomRangeBox.value.hide()
|      -> showResetButton = true
|      -> updateBounds(): Updates the chart's hAxis.viewWindow or vAxis.viewWindow
|      -> navigationMode = null
| ```
| |
| 4. **Delta Calculation (Shift-Click):**
| |
| ```
| User Shift + mousedown
|   -> onChartMouseDown(): navigationMode = 'deltaY'
|   -> deltaRangeBox.value.show(): Displays the vertical resizable box
| User mousemove
|   -> onWindowMouseMove():
|      -> deltaRangeBox.value.updateSelection(): Updates box height and calculates delta
|      -> Updates sidePanel.value with delta values
| User Shift + mousedown (again) or regular mousedown
|   -> onChartMouseDown(): Toggles deltaRangeOn; if finishing,
|      sidePanel.value.showDelta = true
| User mouseup (after dragging)
|   -> onChartMouseUp():
|      -> Updates sidePanel.value with final delta values
|      -> navigationMode = null
| ```
| |
| 5. **Toggling Trace Visibility:**
| |
| ```
| User clicks checkbox in side-panel-sk
|   -> side-panel-sk dispatches side-panel-selected-trace-change
| plot-google-chart-sk listens (sidePanelCheckboxUpdate()):
|   -> Updates this.removedLabelsCache
|   -> Calls updateDataView():
|      -> Recreates the DataView, hiding/showing columns based on removedLabelsCache
|      -> Updates the chart
| ```
| |
| 6. **Anomaly/Issue Display:**
| |
| ```
| anomalyMap or userIssues (from context) changes
|   -> plot-google-chart-sk.willUpdate()
|   -> plotElement.value.redraw() (if the chart is already rendered)
| Chart redraw triggers onChartReady():
|   -> drawAnomaly() / drawUserIssues():
|      -> Iterates through anomalies/issues for visible traces
|      -> Calculates screen coordinates (x, y) using
|         chart.getChartLayoutInterface().getXLocation() and getYLocation()
|      -> Clones template md-icon elements from slots
|      -> Positions the icons absolutely within anomalyDiv or userIssueDiv
| ```
| |
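| A sketch of the coordinate math used when positioning overlay icons
| (workflow 6); `getChartLayoutInterface`, `getXLocation`, and `getYLocation`
| are Google Charts core-chart APIs, while the function shape and the icon
| offset here are illustrative:
| |
| ```
| // Place an absolutely positioned icon element over a given data point.
| function placeOverlayIcon(
|   chart: any, // a ready google.visualization chart instance
|   icon: HTMLElement,
|   commitPosition: number, // x-axis value
|   value: number // y-axis value
| ): void {
|   const layout = chart.getChartLayoutInterface();
|   icon.style.position = 'absolute';
|   // Center the icon on the point (assuming a 24px icon).
|   icon.style.left = `${layout.getXLocation(commitPosition) - 12}px`;
|   icon.style.top = `${layout.getYLocation(value) - 12}px`;
| }
| ```
| |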
| This detailed explanation should provide a solid understanding of the |
| `plot-google-chart-sk` module's purpose, architecture, and key functionalities. |
| |
| # Module: /modules/plot-simple-sk |
| |
| The `plot-simple-sk` module provides a custom HTML element for rendering 2D line |
| graphs. It's designed to be interactive, allowing users to zoom, inspect |
| individual data points, and highlight specific traces. |
| |
| **Core Functionality and Design:** |
| |
| The primary goal of `plot-simple-sk` is to display time-series data or any data |
| that can be represented as a set of (x, y) coordinates. Key design |
| considerations include: |
| |
| 1. **Performance:** To handle potentially large datasets and maintain a smooth |
| user experience, the element employs several optimization techniques: |
| |
| - **Dual Canvases:** It uses two `<canvas>` elements stacked on top of |
| each other. |
| - The bottom canvas (`traces`) is for drawing the static parts of the |
| plot: the lines, axes, and dots representing data points. These are |
| pre-rendered into `Path2D` objects for efficient redrawing. |
| - The top canvas (`overlay`) is for dynamic elements that change |
| frequently, such as crosshairs, zoom selection rectangles, and hover |
| highlights. This separation prevents unnecessary redrawing of the
| entire plot (see the sketch after this list).
| - **`Path2D` Objects:** Trace lines and data point dots are converted into |
| `Path2D` objects. This allows the browser to optimize their rendering, |
| leading to faster redraws compared to repeatedly issuing drawing |
| commands. |
| - **k-d Tree for Point Proximity:** For features like displaying |
| information on mouse hover or selecting the nearest data point on click, |
| a k-d tree (`kd.ts`) is used. This data structure allows for efficient |
| searching of the closest point in a 2D space, crucial for interactivity |
| with potentially many data points. |
| - **Debounced Redraws and Calculations:** Operations like rebuilding the |
| k-d tree (`recalcSearchTask`) or redrawing after a zoom (`zoomTask`) are |
| often scheduled using `window.setTimeout`. This prevents these |
| potentially expensive operations from blocking the main thread and |
| ensures they only happen when necessary, improving responsiveness. |
| `requestAnimationFrame` is used for mouse movement updates to |
| synchronize with the browser's repaint cycle. |
| |
| 2. **Interactivity:** |
| |
| - **Zooming:** |
| - **Summary and Detail Views:** The plot can optionally display a |
| "summary" area above the main "detail" area. The summary shows an |
| overview of all data, and users can drag a region on the summary to |
| zoom the detail view to that specific x-axis range. |
| - **Detail View Zoom:** Users can also drag a rectangle directly on |
| the detail view to zoom into a specific x and y range. |
| - **Zoom Stack:** The element maintains a stack of zoom levels |
| (`detailsZoomRangesStack`), allowing users to progressively zoom in. |
| The stack also makes it possible to step back out through earlier |
| zoom levels, though zooming back out is not explicitly documented as |
| a current feature. |
| - **Hover and Selection:** |
| - Moving the mouse near a trace highlights the closest data point and |
| emits a `trace_focused` event. |
| - Clicking on a trace selects the closest data point and emits a |
| `trace_selected` event. |
| - **Crosshairs:** When the shift key is held, crosshairs are displayed, |
| indicating the mouse's current x and y position on the plot. |
| - **Highlighting Traces:** Specific traces can be programmatically |
| highlighted, making them stand out. |
| - **X-Bar and Bands:** Vertical lines (`xbar`) or regions (`bands`) can be |
| drawn on the plot to mark specific x-axis values or ranges. |
| - **Anomalies and User Issues:** The plot can display markers for |
| anomalies (regressions, improvements) and user-reported issues at |
| specific data points. |
| |
| 3. **Appearance and Theming:** |
| |
| - **Responsive Sizing:** The plot adapts to the `width` and `height` |
| attributes of the custom element and uses `ResizeObserver` to redraw |
| when its dimensions change. |
| - **Device Pixel Ratio:** It accounts for `window.devicePixelRatio` to |
| render crisply on high-DPI displays by drawing to a larger canvas and |
| then scaling it down with CSS transforms. |
| - **CSS Variables for Theming:** The element is designed to integrate with |
| `elements-sk/themes` and uses CSS variables for colors (e.g., |
| `--on-background`, `--success`, `--failure`), allowing its appearance to |
| be customized by the surrounding application's theme. It listens for |
| `theme-chooser-toggle` events to redraw when the theme changes. |
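| |
| To make the dual-canvas and `Path2D` strategy concrete, here is a minimal |
| sketch of the pattern; `buildLinePath` and `drawTraces` are illustrative |
| names, not the module's actual API: |
| |
| ``` |
| // Pre-render a trace once into a Path2D (when data or scales change). |
| function buildLinePath(pts: { x: number; y: number }[]): Path2D { |
|   const path = new Path2D(); |
|   pts.forEach((pt, i) => { |
|     if (i === 0) { |
|       path.moveTo(pt.x, pt.y); |
|     } else { |
|       path.lineTo(pt.x, pt.y); |
|     } |
|   }); |
|   return path; |
| } |
| |
| // Redrawing the static canvas is then one stroke() call per trace, while |
| // crosshairs and highlights go on the separate overlay canvas. |
| function drawTraces(ctx: CanvasRenderingContext2D, paths: Path2D[]): void { |
|   paths.forEach((p) => ctx.stroke(p)); |
| } |
| ``` |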
| |
| **Key Files and Responsibilities:** |
| |
| - **`plot-simple-sk.ts`:** This is the heart of the module, defining the |
| `PlotSimpleSk` custom element. |
| |
| - **Rendering Logic:** Contains all the drawing code for the traces, axes, |
| labels, summary view, detail view, crosshairs, zoom indicators, |
| anomalies, etc. It manages the two canvas contexts (`ctx` for traces, |
| `overlayCtx` for overlays). |
| - **State Management:** Manages the internal state, including the |
| `lineData` (traces and their pre-rendered paths), `labels` (x-axis tick |
| information), current `_zoom` state, `detailsZoomRangesStack` for detail |
| view zooms, `hoverPt`, `crosshair`, `highlighted` traces, `_xbar`, |
| `_bands`, and `_anomalyDataMap`. |
| - **Event Handling:** Sets up event listeners for mouse interactions |
| (move, down, up, leave, click) to handle zooming, hovering, and |
| selection. It also listens for `theme-chooser-toggle` and |
| `ResizeObserver` events. |
| - **API Methods:** Exposes methods like `addLines`, `deleteLines`, |
| `removeAll`, and properties like `highlight`, `xbar`, `bands`, `zoom`, |
| `anomalyDataMap`, `userIssueMap`, and `dots` to control the plot's |
| content and appearance. |
| - **Coordinate Transformations:** Uses `d3-scale` (specifically |
| `scaleLinear`) to map data coordinates (domain) to canvas pixel |
| coordinates (range) and vice-versa. Functions like `rectFromRange` and |
| `rectFromRangeInvert` handle these transformations for rectangular |
| regions. |
| - **Path and Search Builders:** |
| - `PathBuilder`: A helper class to construct `Path2D` objects for trace |
| lines and dots based on the current scales and data. |
| - `SearchBuilder`: A helper class to prepare the data points for the |
| `KDTree` by converting source coordinates to canvas coordinates. |
| - **Drawing Areas:** Defines `SummaryArea` and `DetailArea` interfaces and |
| manages their respective rectangles, axes, and scaling ranges. |
| |
| - **`kd.ts`:** Implements a k-d tree. |
| |
| - **Purpose:** Provides an efficient way (`O(log n)` on average for |
| search) to find the nearest data point to a given mouse coordinate on |
| the canvas. This is crucial for interactivity like mouse hovering and |
| clicking to identify specific points on traces. |
| - **Implementation:** It's a trimmed-down version of an existing k-d tree |
| library, specifically tailored for finding the single closest 2D point. |
| It takes an array of points (each with `x` and `y` properties), a |
| distance metric function, and the dimensions to consider (`['x', 'y']`). |
| The `nearest()` method is the primary interface used by |
| `plot-simple-sk.ts` (see the usage sketch after this list). |
| |
| - **`ticks.ts`:** Responsible for generating appropriate tick marks and labels |
| for the time-based x-axis. |
| |
| - **Purpose:** Given an array of `Date` objects representing the x-axis |
| values, it determines a sensible set of tick positions and their |
| corresponding formatted string labels (e.g., "Jul", "Mon, 8 AM", "10:30 |
| AM"). |
| - **Logic:** It considers the total duration spanned by the dates and |
| selects an appropriate time granularity (e.g., months, days, hours, |
| minutes) for the labels using `Intl.DateTimeFormat`. It aims for a |
| reasonable number of ticks (`MIN_TICKS` to `MAX_TICKS`) and uses a |
| `fixTicksLength` function to thin out the ticks if too many are |
| generated. |
| - **Output:** The `ticks()` function returns an array of objects, each |
| with an `x` (index in the original data) and a `text` (formatted label). |
| |
| - **`plot-simple-sk.scss`:** Contains the SASS/CSS styles for the |
| `plot-simple-sk` element. |
| |
| - **Layout:** Defines the positioning of the canvas elements (absolute |
| positioning for the overlay on top of the trace canvas). |
| - **Theming Integration:** Imports `themes.scss` and uses CSS variables |
| (e.g., `var(--on-background)`, `var(--background)`) to ensure the plot's |
| colors match the application's theme. |
| |
| - **`index.ts`:** A simple entry point that imports `plot-simple-sk.ts` to |
| ensure the custom element is defined and registered with the browser. |
| |
| - **Demo Files (`plot-simple-sk-demo.html`, `plot-simple-sk-demo.ts`, |
| `plot-simple-sk-demo.scss`):** |
| |
| - Provide a live demonstration of the `plot-simple-sk` element's |
| capabilities. |
| - The HTML sets up the plot elements and buttons to trigger various |
| actions. |
| - The TypeScript file (`plot-simple-sk-demo.ts`) contains the logic to |
| interact with the plot, such as adding random trace data, highlighting |
| traces, zooming, clearing the plot, and displaying anomaly markers. It |
| also logs events emitted by the plot. |
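| |
| As promised above, a small usage sketch for the k-d tree in `kd.ts`. The |
| constructor arguments follow the description (points, distance metric, |
| dimensions), but the exact `nearest()` signature and return shape should be |
| checked against the source: |
| |
| ``` |
| import { KDTree } from './kd'; |
| |
| interface Point { |
|   x: number; |
|   y: number; |
| } |
| |
| const points: Point[] = [ |
|   { x: 10, y: 20 }, |
|   { x: 15, y: 25 }, |
|   { x: 40, y: 5 }, |
| ]; |
| |
| // Squared Euclidean distance; skipping the sqrt preserves ordering, |
| // which is all a nearest-point query needs. |
| const dist = (a: Point, b: Point): number => |
|   (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y); |
| |
| const tree = new KDTree(points, dist, ['x', 'y']); |
| const closest = tree.nearest({ x: 14, y: 24 }); // expect { x: 15, y: 25 } |
| ``` |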
| |
| **Key Workflows:** |
| |
| 1. **Initialization and Rendering:** `ElementSk constructor` -> |
| `connectedCallback` -> `render`. Then `render` -> `_render` (lit-html |
| template instantiation) -> `canvas.getContext` -> |
| `updateScaledMeasurements` -> `updateScaleRanges` -> `recalcDetailPaths` -> |
| `recalcSummaryPaths` -> `drawTracesCanvas`. |
| |
| 2. **Adding Data (`addLines`):** `addLines` -> Convert `MISSING_DATA_SENTINEL` |
| to `NaN` -> Store in `this.lineData` -> `updateScaleDomains` -> |
| `recalcSummaryPaths` -> `recalcDetailPaths` -> `drawTracesCanvas`. Within |
| `recalcDetailPaths` / `recalcSummaryPaths`, for each line a `PathBuilder` |
| creates `linePath` and `dotsPath`. `recalcDetailPaths` also calls |
| `recalcSearch` (which schedules `recalcSearchImpl`), and `recalcSearchImpl` |
| -> `SearchBuilder` populates points -> `new KDTree`. |
| |
| 3. **Mouse Hover and Focus:** `mousemove` event -> `this.mouseMoveRaw` |
| updated. The `raf` loop -> checks `this.mouseMoveRaw` -> `eventToCanvasPt` |
| -> If `this.pointSearch`: `this.pointSearch.nearest(pt)` -> updates |
| `this.hoverPt` -> dispatches `trace_focused` event -> Updates |
| `this.crosshair` (based on shift key and `hoverPt`) -> `drawOverlayCanvas`. |
| |
| 4. **Zooming via Summary Drag:** `mousedown` on summary -> `this.inZoomDrag = |
| 'summary'` -> `this.zoomBegin` set. `mousemove` (while dragging) -> `raf` |
| loop: `eventToCanvasPt` -> `clampToRect` (summary area) -> |
| `this.summaryArea.range.x.invert(pt.x)` to get source x -> `this.zoom = |
| [min_x, max_x]` (triggers `_zoomImpl` via a setter task). `_zoomImpl` |
| (after a timeout) -> `updateScaleDomains` -> `recalcDetailPaths` -> |
| `drawTracesCanvas`. `mouseup` / `mouseleave` -> dispatches `zoom` event -> |
| `this.inZoomDrag = 'no-zoom'`. |
| |
| 5. **Zooming via Detail Area Drag:** `mousedown` on detail -> |
| `this.inZoomDrag = 'details'` -> `this.zoomRect` initialized. `mousemove` |
| (while dragging) -> `raf` loop: `eventToCanvasPt` -> `clampToRect` (detail |
| area) -> Updates `this.zoomRect.width/height` -> `drawOverlayCanvas` (to |
| show the dragging rectangle). `mouseup` / `mouseleave` -> |
| `dispatchZoomEvent` -> `doDetailsZoom`. `doDetailsZoom` -> If the zoom box |
| is large enough: |
| `this.detailsZoomRangesStack.push(rectFromRangeInvert(...))` -> `_zoomImpl`. |
| |
| 6. **Drawing Process:** |
| |
| - `drawTracesCanvas()`: |
| |
| 1. Clears the appropriate part of the main trace canvas (`this.ctx`). |
| 2. Draws detail area: |
| - Saves context, clips to detail rect. |
| - Calls `drawXAxis` (for detail). |
| - Iterates `this.lineData`: draws `line.detail.linePath` and |
| `line.detail.dotsPath` if `this.dots` is true. |
| - Restores context. |
| - Calls `drawXAxis` again (to draw labels outside the clipped |
| region). |
| 3. If `this.summary` and not dragging zoom: |
| - Draws summary area similarly. |
| 4. Calls `drawYAxis` (for detail). |
| 5. Calls `drawOverlayCanvas()`. |
| |
| - `drawOverlayCanvas()`: |
| |
| 1. Clears the entire overlay canvas (`this.overlayCtx`). |
| 2. If `this.summary`: |
| - Saves context, clips to summary rect. |
| - Calls `drawXBar`, `drawBands`. |
| - Draws detail zoom indicator box if `detailsZoomRangesStack` is |
| not empty. |
| - Draws summary zoom bars and shaded regions based on |
| `this._zoom`. |
| - Restores context. |
| 3. Clips to detail rect: |
| - Calls `drawXBar`, `drawBands`. |
| - Draws highlighted lines. |
| - Draws hovered line/dots. |
| - Calls `drawUserIssues`, `drawAnomalies`. |
| - If dragging zoom in detail: draws `this.zoomRect` (dashed). |
| - If not dragging: draws crosshairs and hover label. |
| - Restores context. |
| |
| This structured approach allows `plot-simple-sk` to be both feature-rich and |
| performant for visualizing and interacting with 2D data plots. |
| |
| # Module: /modules/plot-summary-sk |
| |
| The `plot-summary-sk` module provides a custom HTML element, |
| `<plot-summary-sk>`, designed to display a summary plot of performance data and |
| allow users to select a range within that plot. This is particularly useful for |
| visualizing trends over time or commit ranges and enabling interactive |
| exploration of the data. |
| |
| At its core, `plot-summary-sk` leverages the Google Charts library to render an |
| area chart. It's designed to work with a `DataFrame`, a data structure commonly |
| used in Perf for holding timeseries data. The element can display data based on |
| either commit offsets or timestamps (`domain` attribute). |
| |
| Key Responsibilities: |
| |
| - **Data Visualization**: Renders an area chart representing performance data |
| over a specified domain (commit or date). |
| - **Range Selection**: Allows users to interactively select a range on the |
| plot. This selection can be initiated by dragging on the chart or by |
| programmatically setting the selection. |
| - **Event Emission**: Emits a `summary_selected` custom event when the user |
| makes or changes a selection. This event carries details about the selected |
| range (start, end, value, and domain); a listener sketch follows this list. |
| - **Dynamic Data Loading**: Optionally, it can display controls to load more |
| data in either direction (earlier or later), integrating with a |
| `DataFrameRepository` to fetch and append new data. |
| - **Theming**: Adapts to theme changes (e.g., dark mode) by redrawing the |
| chart with appropriate styles. |
| - **Responsiveness**: The chart redraws itself when its container is resized, |
| ensuring it remains visually correct. |
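| |
| As noted above, a minimal listener sketch for the `summary_selected` event; |
| the `detail` field names follow the description above, but treat the exact |
| shape as an assumption: |
| |
| ``` |
| const plot = document.querySelector('plot-summary-sk')!; |
| plot.addEventListener('summary_selected', (e: Event) => { |
|   const { start, end, value, domain } = (e as CustomEvent).detail; |
|   console.log(`selected [${start}, ${end}] on the ${domain} domain`, value); |
| }); |
| ``` |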
| |
| Key Components/Files: |
| |
| - **`plot-summary-sk.ts`**: This is the main file defining the `PlotSummarySk` |
| LitElement. |
| - **Why**: It encapsulates the logic for chart rendering, user |
| interaction, data handling, and event emission. |
| - **How**: |
| - It consumes `DataFrame` data (from `dataTableContext`) and renders it |
| using `<google-chart>`. |
| - It manages the display of single or all traces based on the |
| `selectedTrace` property. |
| - It uses an internal `h-resizable-box-sk` element to provide the visual |
| selection rectangle and handles the mouse events for drawing and |
| resizing this selection. |
| - It translates between the visual coordinates of the selection box and |
| the data values (commit offsets or timestamps) of the underlying chart. |
| - It listens for `google-chart-ready` events to ensure operations like |
| setting a selection programmatically happen after the chart is fully |
| initialized. |
| - It provides `controlTemplate` for optional "load more data" buttons, |
| which interact with a `DataFrameRepository` (consumed via |
| `dataframeRepoContext`). |
| - It uses a `ResizeObserver` to detect when the element is resized and |
| triggers a chart redraw. |
| - It manages colors for different traces to ensure consistent |
| visualization. |
| - **`h-resizable-box-sk.ts`**: This file defines the `HResizableBoxSk` |
| LitElement, a reusable component for creating a horizontally resizable and |
| draggable selection box. |
| - **Why**: To decouple the complex UI interaction logic of drawing, |
| moving, and resizing a selection rectangle from the main |
| `plot-summary-sk` component. This promotes reusability and simplifies |
| the main component's logic. |
| - **How**: |
| - It renders a `div` (`.surface`) that represents the selection. |
| - It listens for `mousedown` events on its container to initiate an |
| action: 'draw' (if clicking outside the existing selection), 'drag' (if |
| clicking inside the selection), 'left' (if clicking on the left edge), |
| or 'right' (if clicking on the right edge). |
| - It listens for `mousemove` events on the `window` to update the |
| selection's position and size during an action. This ensures interaction |
| continues even if the mouse moves outside the element's bounds (see the |
| drag sketch after this list). |
| - It listens for `mouseup` events on the `window` to finalize the action |
| and emits a `selection-changed` event with the new range. |
| - It uses CSS to style the selection box and provide visual cues for |
| dragging and resizing (e.g., `cursor: move`, `cursor: ew-resize`). |
| - The `selectionRange` property (getter and setter) allows programmatic |
| control and retrieval of the selection, defined by `begin` and `end` |
| pixel offsets relative to the component. |
| - **`plot-summary-sk.css.ts`**: Contains the CSS styles for the |
| `plot-summary-sk` element, defined as a Lit `css` tagged template literal. |
| - **Why**: To encapsulate the visual styling, ensuring the plot and its |
| controls are laid out correctly and are visually consistent with the |
| application's theme. |
| - **How**: It uses flexbox for layout, positions the selection box |
| (`h-resizable-box-sk`) absolutely over the chart, and styles the |
| optional loading buttons and loading indicator. |
| - **`plot-summary-sk-demo.ts` and `plot-summary-sk-demo.html`**: Provide a |
| demonstration page for the `plot-summary-sk` element. |
| - **Why**: To allow developers to see the component in action, test its |
| features, and understand how to integrate it. |
| - **How**: The HTML sets up multiple instances of `plot-summary-sk` with |
| different configurations (e.g., `domain`, `selectionType`). The |
| TypeScript file generates sample `DataFrame` objects, converts them to |
| Google DataTable format, and populates the plot elements. It also |
| listens for `summary_selected` events and displays their details. |
| - **Test Files (`*.test.ts`, `*_puppeteer_test.ts`)**: |
| - **Why**: To ensure the component functions as expected and to prevent |
| regressions. |
| - **How**: |
| - Unit tests (`plot-summary-sk_test.ts`, `h_resizable_box_sk_test.ts`) |
| verify individual component logic, such as programmatic selection and |
| state changes. They often mock dependencies like the Google Chart |
| library or use test utilities to generate data. |
| - Puppeteer tests (`plot-summary-sk_puppeteer_test.ts`) perform end-to-end |
| testing by interacting with the component in a real browser environment. |
| They simulate user actions like mouse drags and verify the emitted event |
| details and visual output (via screenshots). |
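| |
| The drag sketch referenced above distills the `h-resizable-box-sk` pattern |
| of starting an action on the element but tracking it on `window`; all names |
| here are illustrative, not the component's actual API: |
| |
| ``` |
| function beginDrag( |
|   surface: HTMLElement, |
|   onDone: (begin: number, end: number) => void |
| ): void { |
|   surface.addEventListener('mousedown', (down: MouseEvent) => { |
|     const startX = down.clientX; |
|     let lastX = startX; |
|     const move = (e: MouseEvent) => { |
|       lastX = e.clientX; |
|       // Update the selection box position/size here. |
|     }; |
|     const up = () => { |
|       window.removeEventListener('mousemove', move); |
|       window.removeEventListener('mouseup', up); |
|       onDone(Math.min(startX, lastX), Math.max(startX, lastX)); |
|     }; |
|     // Listening on window keeps the drag alive even when the pointer |
|     // leaves the element's bounds. |
|     window.addEventListener('mousemove', move); |
|     window.addEventListener('mouseup', up); |
|   }); |
| } |
| ``` |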
| |
| Key Workflows: |
| |
| 1. **Initialization and Data Display**: |
| |
| ``` |
| [DataFrame via context or property] |
| | |
| v |
| plot-summary-sk |
| | |
| v |
| [willUpdate/updateDataView] --> Converts DataFrame to Google DataTable |
| | |
| v |
| <google-chart> --> Renders area chart |
| | |
| v |
| [google-chart-ready event] --> plot-summary-sk may apply cached selection |
| ``` |
| |
| 2. **User Selecting a Range by Drawing**: |
| |
| ``` |
| User mousedowns on <plot-summary-sk> (outside existing selection in h-resizable-box-sk) |
| | |
| v |
| h-resizable-box-sk (action = 'draw') |
| | |
| v |
| User moves mouse (mousemove on window) |
| | |
| v |
| h-resizable-box-sk --> Updates selection box dimensions |
| | |
| v |
| User mouseups (mouseup on window) |
| | |
| v |
| h-resizable-box-sk --> Emits 'selection-changed' (with pixel coordinates) |
| | |
| v |
| plot-summary-sk (onSelectionChanged) |
| | |
| v |
| Converts pixel coordinates to data values (commit/timestamp) |
| | |
| v |
| Emits 'summary_selected' (with data values) |
| ``` |
| |
| 3. **User Resizing/Moving an Existing Selection**: |
| |
| ``` |
| User mousedowns on <h-resizable-box-sk> (on edge for resize, or middle for drag) |
| | |
| v |
| h-resizable-box-sk (action = 'left'/'right'/'drag') |
| | |
| v |
| User moves mouse (mousemove on window) |
| | |
| v |
| h-resizable-box-sk --> Updates selection box position/dimensions |
| | |
| v |
| User mouseups (mouseup on window) |
| | |
| v |
| h-resizable-box-sk --> Emits 'selection-changed' |
| | |
| v |
| plot-summary-sk (onSelectionChanged) --> Converts & Emits 'summary_selected' |
| ``` |
| |
| 4. **Programmatic Selection**: |
| |
| ``` |
| Application calls plotSummarySkElement.Select(beginHeader, endHeader) |
| OR sets plotSummarySkElement.selectedValueRange = { begin: val1, end: val2 } |
| | |
| v |
| plot-summary-sk |
| | |
| v |
| Caches selectedValueRange (important if chart not ready) |
| | |
| v |
| [If chart ready] --> Converts data values to pixel coordinates |
| | |
| v |
| Sets selectionRange on <h-resizable-box-sk> |
| ``` |
| |
| If the chart is not ready when `selectedValueRange` is set, the conversion |
| and setting of the `h-resizable-box-sk` selection are deferred until the |
| `google-chart-ready` event fires. |
| |
| The design separates the concerns of data plotting (Google Charts), interactive |
| range selection UI (`h-resizable-box-sk`), and the overall orchestration and |
| data conversion logic (`plot-summary-sk`). This makes the system more modular |
| and easier to maintain. The use of LitElement and contexts allows for a reactive |
| programming model and clean integration with other parts of the Perf |
| application. |
| |
| # Module: /modules/point-links-sk |
| |
| The `point-links-sk` module is a custom HTML element designed to display links |
| associated with specific data points in a performance analysis context. These |
| links often originate from ingestion files and can include commit details, build |
| logs, or other relevant resources. |
| |
| The primary purpose of this module is to provide users with quick access to |
| contextual information related to a data point. It achieves this by: |
| |
| 1. **Fetching and Displaying Links:** The module fetches link data from a |
| backend API based on a commit ID and a trace ID. It then renders these links |
| as clickable anchor elements. |
| 2. **Generating Commit Range Links:** A key feature is its ability to generate |
| links representing the range of commits between two data points. This is |
| particularly useful for understanding changes that might have occurred |
| between two performance measurements. |
| - If the commit hashes for a given key (e.g., "V8 Git Hash") are different |
| between the current and previous data points, it constructs a URL that |
| shows the log of commits between those two specific commit hashes. |
| - If the commit hashes are the same, it simply links to the individual |
| commit, indicating no change in that specific dependency (see the sketch |
| after this list). |
| 3. **Caching:** To optimize performance and avoid redundant API calls, the |
| module can utilize a provided cache of previously loaded commit links. If |
| the links for a specific commit and trace ID are already in the cache, it |
| will use those instead of re-fetching. |
| 4. **User-Friendly Presentation:** Links are presented in a list format, with a |
| "copy to clipboard" button for each link, enhancing usability. |
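| |
| The commit-range rule from point 2 reduces to the following sketch; |
| `buildRangeLink` is a hypothetical helper shown with standard gitiles URL |
| forms, not the module's actual API: |
| |
| ``` |
| function buildRangeLink( |
|   repoUrl: string, |
|   prevHash: string, |
|   currHash: string |
| ): string { |
|   if (prevHash === currHash) { |
|     // No change in this dependency: link to the single commit. |
|     return `${repoUrl}/+/${currHash}`; |
|   } |
|   // Show the log of commits between the two data points. |
|   return `${repoUrl}/+log/${prevHash}..${currHash}`; |
| } |
| |
| buildRangeLink('https://chromium.googlesource.com/v8/v8', 'abc', 'def'); |
| // -> 'https://chromium.googlesource.com/v8/v8/+log/abc..def' |
| ``` |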
| |
| **Key Responsibilities and Components:** |
| |
| - **`point-links-sk.ts`**: This is the core file defining the `PointLinksSk` |
| custom element. |
| - It extends `ElementSk` from `infra-sk`. |
| - **`load()` method**: This is the main public method responsible for |
| initiating the process of fetching and displaying links. It takes the |
| current commit ID, the previous commit ID, a trace ID, and arrays of |
| keys to identify which links should be treated as commit ranges and |
| which are general "useful links". It handles the logic for checking the |
| cache, fetching data from the API, processing commit ranges, and |
| updating the display. |
| - **`getLinksForPoint()` and `invokeLinksForPointApi()` methods**: These |
| private methods handle the actual API interaction to retrieve link data. |
| `getLinksForPoint` attempts to fetch from `/_/links/` first and falls |
| back to `/_/details/?results=false` if the initial attempt fails. It |
| also includes workarounds for specific data inconsistencies (e.g., V8 |
| and WebRTC URLs). |
| - **`renderPointLinks()` and `renderRevisionLink()` methods**: These |
| methods, along with the static `template`, use `lit-html` to generate |
| the HTML structure for displaying the links. |
| - **Helper methods (`getCommitIdFromCommitUrl`, `getRepoUrlFromCommitUrl`, |
| `getFormattedCommitRangeText`, `extractUrlFromStringForFuchsia`)**: |
| These provide utility functions for parsing URLs and formatting text. |
| - **Data properties (`commitPosition`, `displayUrls`, `displayTexts`)**: |
| These store the state of the component, such as the current commit and |
| the links to be displayed. |
| - **`point-links-sk.scss`**: Provides the styling for the `point-links-sk` |
| element, ensuring a consistent look and feel, including styling for Material |
| Design icons and buttons. |
| - **`index.ts`**: A simple entry point that imports and thereby registers the |
| `point-links-sk` custom element. |
| - **`point-links-sk-demo.html` & `point-links-sk-demo.ts`**: These files set |
| up a demonstration page for the `point-links-sk` element. The |
| `point-links-sk-demo.ts` file uses `fetch-mock` to simulate the backend API, |
| allowing developers to test the component's behavior in isolation. It |
| demonstrates how to instantiate and use the `point-links-sk` element with |
| different configurations. |
| |
| **Workflow for Loading and Displaying Links:** |
| |
| The typical workflow when the `load()` method is called can be visualized as: |
| |
| ``` |
| Caller invokes pointLinksSk.load(currentCID, prevCID, traceID, rangeKeys, usefulKeys, cachedLinks) |
| | |
| V |
| Check if links for (currentCID, traceID) exist in `cachedLinks` |
| | |
| +-- YES --> Use cached links |
| | | |
| | V |
| | Render links |
| | |
| +-- NO ---> Fetch links for `currentCID` from API (`getLinksForPoint`) |
| | |
| V |
| If `rangeKeys` are provided: |
| | Fetch links for `prevCID` from API (`getLinksForPoint`) |
| | For each key in `rangeKeys`: |
| | Extract current commit hash from `currentCID` links |
| | Extract previous commit hash from `prevCID` links |
| | If hashes are different: |
| | Generate "commit range" URL (e.g., .../+log/prevHash..currentHash) |
| | Else (hashes are same): |
| | Use current commit URL |
| | Add to `displayUrls` and `displayTexts` |
| | |
| V |
| If `usefulKeys` are provided: |
| | For each key in `usefulKeys`: |
| | Add corresponding link from `currentCID` links to `displayUrls` |
| | |
| V |
| Update cache with newly fetched/generated links for (currentCID, traceID) |
| | |
| V |
| Render links |
| ``` |
| |
| This module is designed to be flexible, allowing the consuming application to |
| specify which types of links should be processed for commit ranges and which |
| should be displayed as direct links. The inclusion of error handling (via |
| `errorMessage`) and the fallback mechanism in API calls (`/_/links/` then |
| `/_/details/`) make it more robust. |
| |
| # Module: /modules/progress |
| |
| The `progress` module provides a mechanism for initiating and monitoring the |
| status of long-running tasks on the server. This is crucial for user experience, |
| as it allows the client to display progress information and avoid appearing |
| unresponsive during lengthy operations. |
| |
| The core of this module is the `startRequest` function. This function is |
| designed to handle asynchronous server-side processes that might take a |
| significant amount of time to complete. |
| |
| **How `startRequest` Works:** |
| |
| 1. **Initiation:** |
| |
| - It begins by sending an initial POST request to a specified |
| `startingURL` with a given `body`. This request typically triggers the |
| long-running task on the server. |
| - If a `spinner-sk` element is provided, it's activated to visually |
| indicate that a process is underway. |
| |
| 2. **Polling:** |
| |
| - The server's response to the initial request (and subsequent polling |
| requests) is expected to be a JSON object of type |
| `progress.SerializedProgress`. This object contains: |
| - `status`: Indicates whether the task is "Running" or "Finished" (or |
| potentially other states like "Error"). |
| - `messages`: An array of key-value pairs providing more detailed |
| information about the current state of the task (e.g., current step, |
| progress percentage). |
| - `url`: If the `status` is "Running", this URL is used for the next |
| polling request to get updated progress. |
| - `results`: If the `status` is "Finished", this field contains the |
| final output of the long-running process. |
| - If the `status` is "Running", `startRequest` will schedule a |
| `setTimeout` to make a GET request to the `url` provided in the response |
| after a specified `period`. This creates a polling loop. |
| |
| 3. **Callback and Completion:** |
| |
| - An optional `callback` function can be provided. This function is |
| invoked after each successful fetch (both the initial request and every |
| polling update), receiving the `progress.SerializedProgress` object. |
| This allows the UI to update with the latest progress information. |
| - The polling continues until the server responds with a `status` that is |
| not "Running" (e.g., "Finished"). |
| - Once the task is complete, the Promise returned by `startRequest` |
| resolves with the final `progress.SerializedProgress` object. |
| - If a `spinner-sk` was provided, it is deactivated. |
| |
| 4. **Error Handling:** |
| |
| - If any network request fails (e.g., non-2xx HTTP status), the Promise |
| returned by `startRequest` is rejected with an error. |
| - The spinner (if provided) is also deactivated in case of an error. |
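| |
| Putting the pieces together, a hedged usage sketch of `startRequest`, |
| assuming a sibling file importing from `./progress`; the parameter order is |
| paraphrased from the description above rather than copied from the actual |
| declaration, and the endpoint and body are made up: |
| |
| ``` |
| import { startRequest } from './progress'; |
| |
| const spinner = document.querySelector('spinner-sk'); |
| startRequest( |
|   '/_/some/long/task', // startingURL that kicks off the task (hypothetical) |
|   { q: 'config=8888' }, // body of the initial POST (hypothetical) |
|   300, // polling period in ms |
|   spinner as any, // optional spinner-sk, activated while running |
|   (prog) => console.log(prog.status, prog.messages) // called on each fetch |
| ) |
|   .then((finished) => console.log('results:', finished.results)) |
|   .catch((err) => console.error(err)); |
| ``` |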
| |
| **Workflow Diagram:** |
| |
| ``` |
| Client UI startRequest Function Server |
| ---------- --------------------- ------ |
| | | |
| | -- Call startRequest --> | |
| | | -- POST to startingURL (body) --> | |
| | | | |
| | | <-- Response (SerializedProgress) -- | |
| | | |
| | -- (Optional) Activate -- | |
| | Spinner | |
| | | |
| | | -- If status is "Running": --------> Schedule setTimeout(period) |
| | | | |
| | | V |
| | | -- GET to progress.url -----------> | |
| | | | |
| | | <-- Response (SerializedProgress) -- | |
| | | | |
| | | --- (Invoke callback) ---------> Client UI (Update progress) |
| | | | |
| | | --- Loop back to "If status is 'Running'" |
| | | |
| | | -- If status is "Finished": -------> Resolve Promise |
| | | | |
| | -- (Optional) Deactivate | <----------------------------------- |
| | Spinner | |
| | | |
| | <-- Promise Resolves ---- | |
| | (SerializedProgress) | |
| ``` |
| |
| **Key Files:** |
| |
| - **`progress.ts`**: |
| |
| - **Responsibilities**: Implements the core logic for initiating requests, |
| polling for status updates, handling responses, and managing callbacks |
| and promises. It also provides utility functions for formatting progress |
| messages. |
| - **Key Components**: |
| - `startRequest`: The primary function that orchestrates the entire |
| progress monitoring flow. It encapsulates the logic for making the |
| initial POST request and subsequent GET requests for polling. The use of |
| a single `processFetch` internal function is a design choice to reduce |
| code duplication, as the response handling logic is identical for both |
| the initial and polling fetches. |
| - `messagesToErrorString`: A utility function designed to extract a |
| user-friendly error message from the `messages` array within |
| `SerializedProgress`. It prioritizes messages with the key "Error" but |
| falls back to concatenating all messages if no specific error message is |
| found. This ensures that some form of feedback is available even if the |
| server doesn't explicitly flag an error (see the usage sketch after this |
| list). |
| - `messagesToPreString`: Formats messages for display, typically within a |
| `<pre>` tag, by putting each key-value pair on a new line. This is |
| useful for presenting detailed progress logs. |
| - `messageByName`: Allows retrieval of a specific message's value by its |
| key from the `messages` array, with a fallback if the key is not found. |
| This is useful for extracting specific pieces of information from the |
| progress updates (e.g., the current step number). |
| - **Dependencies**: |
| - `elements-sk/modules/spinner-sk`: Used to visually indicate that a |
| background task is in progress. |
| - `perf/modules/json`: Provides the `progress.SerializedProgress` type |
| definition, ensuring consistency in how progress information is |
| structured between the client and server. |
| |
| - **`progress_test.ts`**: |
| |
| - **Responsibilities**: Contains unit tests for the `progress.ts` module. |
| - **Key Focus**: |
| - Verifies that `startRequest` correctly handles different server response |
| scenarios: immediate completion, one or more polling steps, and network |
| errors. |
| - Ensures that the optional callback is invoked correctly during the |
| polling process. |
| - Tests the behavior of the message formatting utility functions |
| (`messagesToErrorString`, `messageByName`) with various inputs. |
| - **Methodology**: Uses `fetch-mock` to simulate server responses, |
| allowing for controlled testing of the asynchronous network interactions |
| without relying on an actual backend. This is crucial for creating |
| reliable and fast unit tests. |
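| |
| As referenced above, illustrative inputs and outputs for the message |
| helpers; the `{ key, value }` message shape and the fallback argument are |
| inferred from the descriptions, so verify them against `progress.ts`: |
| |
| ``` |
| import { messageByName, messagesToErrorString } from './progress'; |
| |
| const messages = [ |
|   { key: 'Step', value: '2 of 5' }, |
|   { key: 'Error', value: 'query timed out' }, |
| ]; |
| |
| // Prefers the message keyed "Error"; concatenates all messages when no |
| // "Error" key is present. |
| console.log(messagesToErrorString(messages)); |
| |
| // Looks up a single message by key, with a fallback for missing keys. |
| console.log(messageByName(messages, 'Step', 'unknown')); // '2 of 5' |
| ``` |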
| |
| The design of this module prioritizes a clear separation of concerns. |
| `startRequest` focuses on the communication and polling logic, while the utility |
| functions provide convenient ways to interpret and display the progress |
| information received from the server. The use of Promises simplifies handling |
| asynchronous operations, and the optional callback provides flexibility for |
| updating the UI in real-time. |
| |
| # Module: /modules/query-chooser-sk |
| |
| ## Query Chooser Element (`query-chooser-sk`) |
| |
| The `query-chooser-sk` module provides a user interface element for selecting |
| and modifying query parameters. It's designed to offer a compact way to display |
| the currently active query and provide a mechanism to change it through a |
| dialog. |
| |
| ### Core Functionality and Design |
| |
| The primary goal of `query-chooser-sk` is to present a summarized view of the |
| current query and allow users to edit it in a more detailed interface. This is |
| achieved by: |
| |
| 1. **Displaying a summary:** The current query is displayed in a concise format |
| using the `paramset-sk` element. This gives users a quick overview of the |
| active filters. |
| 2. **Providing an "Edit" button:** This button triggers the display of a |
| dialog. |
| 3. **Embedding `query-sk` in a dialog:** The dialog contains a `query-sk` |
| element. This is where the user can interactively build or modify their |
| query by selecting values for different parameters. |
| 4. **Showing query match count:** Alongside the `query-sk` element, |
| `query-count-sk` is used to display how many items match the currently |
| constructed query. This provides immediate feedback to the user as they |
| refine their selection. |
| 5. **Event propagation:** `query-chooser-sk` listens for `query-change` events |
| from the embedded `query-sk` element. When a change occurs, |
| `query-chooser-sk` updates its own `current_query` and re-renders, |
| effectively propagating the change. It also emits its own `query-change` |
| event, allowing parent components to react to query modifications. |
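| |
| A minimal sketch of a parent component reacting to the re-emitted event; |
| the `detail` shape `{ q: string }` is an assumption: |
| |
| ``` |
| const chooser = document.querySelector('query-chooser-sk')!; |
| chooser.addEventListener('query-change', (e: Event) => { |
|   const q = (e as CustomEvent<{ q: string }>).detail.q; |
|   // React to the new query, e.g. reload data or update a plot. |
|   console.log('current query:', q); |
| }); |
| ``` |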
| |
| This design separates the concerns of displaying the current state from the more |
| complex interaction of query building. The dialog provides a focused environment |
| for query modification without cluttering the main UI. |
| |
| ### Key Components and Files |
| |
| - **`query-chooser-sk.ts`**: This is the core TypeScript file defining the |
| `QueryChooserSk` custom element. |
| - It manages the visibility of the editing dialog. |
| - It orchestrates the interaction between the summary display |
| (`paramset-sk`), the query editing interface (`query-sk`), and the match |
| count display (`query-count-sk`). |
| - It defines properties like `current_query`, `paramset`, `key_order`, and |
| `count_url` which are essential for its operation and for configuring |
| its child elements. |
| - The `_editClick` and `_closeClick` methods handle the opening and |
| closing of the dialog. |
| - The `_queryChange` method is crucial for reacting to changes in the |
| embedded `query-sk` element and updating the `current_query`. |
| - **`query-chooser-sk.html` (template within `query-chooser-sk.ts`)**: This |
| Lit HTML template defines the structure of the element. |
| - It includes a `div` with class `row` to display the "Edit" button and |
| the `paramset-sk` summary. |
| - Another `div` with id `dialog` acts as the container for `query-sk`, |
| `query-count-sk`, and the "Close" button. The visibility of this dialog |
| is controlled by adding/removing the `display` class. |
| - **`query-chooser-sk.scss`**: This file provides the styling for the element. |
| It ensures proper layout of the button, summary, and the dialog content. It |
| also includes theming support. |
| - **`index.ts`**: A simple entry point that imports and registers the |
| `query-chooser-sk` custom element. |
| - **`query-chooser-sk-demo.html` / `query-chooser-sk-demo.ts`**: These files |
| provide a demonstration page for the element, showcasing its usage with |
| sample data and event handling. `fetchMock` is used in the demo to simulate |
| the `count_url` endpoint. |
| - **`query-chooser-sk_puppeteer_test.ts`**: Contains Puppeteer tests to verify |
| the rendering and basic functionality of the element. |
| |
| ### Workflow: Editing a Query |
| |
| The typical workflow for a user interacting with `query-chooser-sk` is as |
| follows: |
| |
| ``` |
| User sees current query summary & "Edit" button |
| | |
| | (User clicks "Edit") |
| V |
| Dialog appears, showing: |
| - `query-sk` (for selecting parameters/values) |
| - `query-count-sk` (displaying number of matches) |
| - "Close" button |
| | |
| | (User interacts with `query-sk`, changing selections) |
| V |
| `query-sk` emits "query-change" event |
| | |
| V |
| `query-chooser-sk` (_queryChange method): |
| - Updates its `current_query` property/attribute |
| - Re-renders to reflect new `current_query` in summary & `query-count-sk` |
| - Emits its own "query-change" event (for parent components) |
| | |
| | (User is satisfied with the new query) |
| V |
| User clicks "Close" |
| | |
| V |
| Dialog is hidden |
| | |
| V |
| `query-chooser-sk` displays the updated query summary. |
| ``` |
| |
| The `paramset` attribute is crucial as it provides the available keys and values |
| that `query-sk` will use to render its selection interface. The `key_order` |
| attribute influences the order in which parameters are displayed within |
| `query-sk`. The `count_url` is passed directly to `query-count-sk` to fetch the |
| number of matching items for the current query. |
| |
| # Module: /modules/query-count-sk |
| |
| The `query-count-sk` module provides a custom HTML element designed to display |
| the number of results matching a given query. Its primary purpose is to offer a |
| dynamic and responsive way to inform users about the scope of their queries in |
| real-time, without requiring a full page reload or complex UI updates. This is |
| particularly useful in applications where users frequently refine search |
| criteria and need immediate feedback on the impact of those changes. |
| |
| The core functionality revolves around the `QueryCountSk` class, which extends |
| `ElementSk`. This class manages the state of the displayed count, handles |
| asynchronous data fetching, and updates the UI accordingly. |
| |
| **Key Components and Design Decisions:** |
| |
| - **`query-count-sk.ts`:** This is the heart of the module. |
| - **Asynchronous Data Fetching:** When the `current_query` or `url` |
| attributes change, the element initiates a POST request to the specified |
| `url`. |
| - The request body includes the `current_query` and a default time window |
| of the last 24 hours (`begin` and `end` timestamps). This design choice |
| implies that the element is typically used for querying recent data. |
| - To prevent race conditions and unnecessary network requests, any ongoing |
| fetch operation is aborted if a new query is initiated. This is achieved |
| with an `AbortController` and is a crucial design decision for |
| performance and responsiveness, especially when users rapidly change |
| query parameters (a standalone sketch of this pattern follows this |
| list). |
| - The component expects a JSON response with a `count` (number of matches) |
| and a `paramset` (a read-only representation of parameters related to |
| the query). |
| - **State Management:** The `_count` property stores the fetched count as |
| a string, and `_requestInProgress` is a boolean flag indicating whether |
| a fetch operation is currently active. This flag is used to show/hide a |
| loading spinner (`spinner-sk`). |
| - **Rendering:** The component uses `lit-html` for efficient template |
| rendering. The template displays the `_count` and the `spinner-sk` |
| conditionally. |
| - **Event Emission:** Upon successful data retrieval, a `paramset-changed` |
| custom event is dispatched. This event carries the `paramset` received |
| from the server. This allows other components on the page to react to |
| changes in the available parameters based on the current query results. |
| This decoupling is a key design aspect for building modular UIs. |
| - **Error Handling:** Network errors or non-OK HTTP responses are caught, |
| and an error message is displayed to the user via the `errorMessage` |
| utility (likely from `perf/modules/errorMessage`). AbortErrors are |
| handled gracefully by simply stopping the current operation without |
| displaying an error, as this usually means the user initiated a new |
| action. |
| - **`query-count-sk.scss`:** Provides styling for the element, ensuring the |
| count and spinner are displayed appropriately. The `display: inline-block` |
| and flexbox layout for the internal `div` are chosen for simple alignment of |
| the count and spinner. |
| - **`query-count-sk-demo.html` and `query-count-sk-demo.ts`:** These files |
| provide a demonstration and testing environment for the `query-count-sk` |
| element. |
| - The demo sets up a `fetch-mock` to simulate server responses, allowing |
| for isolated testing of the component's behavior. |
| - It showcases how to instantiate the element and interact with its |
| attributes (`url`, `current_query`). |
| - The presence of `<error-toast-sk>` in the demo suggests that this is the |
| intended mechanism for displaying errors surfaced by `errorMessage`. |
| - **`index.ts`:** A simple entry point that imports and registers the |
| `query-count-sk` custom element, making it available for use in an HTML |
| page. |
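| |
| The abort-and-refetch pattern referenced above, as a standalone sketch; |
| `fetchCount` and its arguments are illustrative, not the element's private |
| API: |
| |
| ``` |
| let controller: AbortController | null = null; |
| |
| async function fetchCount(url: string, q: string): Promise<number> { |
|   controller?.abort(); // cancel any in-flight request first |
|   controller = new AbortController(); |
|   const now = Math.floor(Date.now() / 1000); |
|   const resp = await fetch(url, { |
|     method: 'POST', |
|     signal: controller.signal, |
|     headers: { 'Content-Type': 'application/json' }, |
|     // Default to the last 24 hours, as the element does. |
|     body: JSON.stringify({ q, begin: now - 24 * 60 * 60, end: now }), |
|   }); |
|   if (!resp.ok) throw new Error(`HTTP ${resp.status}`); |
|   return (await resp.json()).count; |
| } |
| ``` |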
| |
| **Workflow for Displaying Query Count:** |
| |
| 1. **Initialization:** |
| |
| - The `query-count-sk` element is added to the DOM. |
| - The `url` attribute (pointing to the backend endpoint) is set. |
| |
| ``` |
| Page query-count-sk |
| | | |
| |--(Set url)---->| |
| ``` |
| |
| 2. **Query Update:** |
| |
| - The `current_query` attribute is set or updated (e.g., by user input in |
| another part of the application). |
| |
| ``` |
| Page query-count-sk |
| | | |
| |--(Set current_query)-->| |
| ``` |
| |
| 3. **Data Fetching:** |
| |
| - The `attributeChangedCallback` (or `connectedCallback` on initial load) |
| triggers the `_fetch()` method. |
| - If a previous fetch is in progress, it's aborted. |
| - `_requestInProgress` is set to `true`, and the spinner becomes visible. |
| - A POST request is made to `this.url` with the `current_query` and time |
| range. |
| |
| ``` |
| query-count-sk Server |
| | | |
| |--(Set _requestInProgress=true)------>| (Spinner shows) |
| | | |
| |----(POST / {q: current_query, ...})-->| |
| ``` |
| |
| 4. **Response Handling:** |
| |
| - **Success:** |
| |
| - The server responds with JSON: `{ count: N, paramset: {...} }`. |
| - `_count` is updated with `N`. |
| - `_requestInProgress` is set to `false` (spinner hides). |
| - The component re-renders to display the new count. |
| - A `paramset-changed` event is dispatched with the `paramset`. |
| |
| ``` |
| query-count-sk Server |
| | | |
| |<----(HTTP 200, {count, paramset})----| |
| | | |
| |--(Update _count, _requestInProgress=false)-->| (Spinner hides, count updates) |
| | | |
| |--(Dispatch 'paramset-changed')------>| (Other components may react) |
| ``` |
| |
| - **Error (e.g., network issue, server error):** |
| |
| - `_requestInProgress` is set to `false` (spinner hides). |
| - An error message is displayed (e.g., via `error-toast-sk`). |
| |
| ``` |
| query-count-sk Server |
| | | |
| |<----(HTTP Error or Network Error)----| |
| | | |
| |--(Set _requestInProgress=false)------>| (Spinner hides) |
| | | |
| |--(Display error message)------------>| |
| ``` |
| |
| - **Abort:** |
| |
| - If the fetch was aborted (e.g., new query initiated before |
| completion), the `catch` block for `AbortError` is entered. |
| - No UI update for count or error display happens; the new fetch |
| operation takes precedence. |
| |
| The design emphasizes responsiveness by aborting stale requests and provides a |
| clear visual indication of ongoing activity (the spinner). The |
| `paramset-changed` event promotes loose coupling between components, allowing |
| other parts of the application to adapt based on the query results without |
| direct dependencies on `query-count-sk`'s internal implementation. |
| |
| # Module: /modules/regressions-page-sk |
| |
| The `regressions-page-sk` module provides a user interface for viewing and |
| managing performance regressions. It allows users to select a "subscription" |
| (often representing a team or area of ownership, like "Sheriff Config 1") and |
| then displays a list of detected performance anomalies (regressions or |
| improvements) associated with that subscription. |
| |
| The core functionality revolves around fetching and displaying this data in a |
| user-friendly way. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`regressions-page-sk.ts`**: This is the main TypeScript file that defines |
| the `RegressionsPageSk` custom HTML element. |
| |
| - **State Management (`State` interface, `stateReflector`)**: The |
| component maintains its UI state (selected subscription, whether to show |
| triaged items or improvements, and a flag for using a Skia-specific |
| backend) in the `state` object. The `stateReflector` utility is crucial |
| here. It synchronizes this internal state with the URL query parameters. |
| This means a user can bookmark a specific view (e.g., a particular |
| subscription with improvements shown) and share it, or refresh the page |
| and return to the same state. |
| - _Why `stateReflector`?_ It provides a clean way to manage application |
| state that needs to be persistent across page loads and shareable via |
| URLs, without manually parsing and updating the URL (a usage sketch |
| follows this list). |
| - **Data Fetching (`fetchRegressions`, `init`)**: |
| - The `init` method is called during component initialization and whenever |
| the state changes significantly (like selecting a new subscription). It |
| fetches the list of available subscriptions (sheriff lists) from either |
| a legacy endpoint (`/_/anomalies/sheriff_list`) or a Skia-specific one |
| (`/_/anomalies/sheriff_list_skia`) based on the `state.useSkia` flag. |
| The fetched subscriptions are then sorted alphabetically for display in |
| a dropdown. |
| - The `fetchRegressions` method is responsible for fetching the actual |
| anomaly data. It constructs a query based on the current `state` |
| (selected subscription, filters for triaged/improvements, and a cursor |
| for pagination). It also chooses between legacy and Skia-specific |
| anomaly list endpoints. The fetched anomalies are then appended to the |
| `cpAnomalies` array, and if a cursor is returned, a "Show More" button |
| is made visible. |
| - _Why two sets of endpoints (legacy vs. Skia)?_ This suggests a migration |
| path or different data sources/backends being supported, allowing the |
| component to adapt based on configuration. |
| - **Rendering (`template`, `_render`)**: The component uses `lit-html` for |
| templating. The `template` static method defines the HTML structure, |
| which includes: |
| - A dropdown (`<select id="filter">`) to choose a subscription. |
| - Buttons to toggle the display of triaged items and improvements. |
| - A `<subscription-table-sk>` to display details about the selected |
| subscription and its associated alerts. |
| - An `<anomalies-table-sk>` to display the list of anomalies/regressions. |
| - Spinners (`spinner-sk`) to indicate loading states. |
| - A "Show More" button for paginating through anomalies. |
| - The `_render()` method (implicitly called by `ElementSk` when properties |
| change) re-renders the component with the latest data. |
| - **Event Handling (`filterChange`, `triagedChange`, |
| `improvementChange`)**: These methods handle user interactions like |
| selecting a subscription or toggling filters. They update the |
| component's `state`, trigger `stateHasChanged` (which in turn updates |
| the URL and can re-fetch data), and then explicitly call |
| `fetchRegressions` and `_render` to reflect the changes. |
| - **Legacy Regression Display (`getRegTemplate`, `regRowTemplate`)**: |
| There's also code related to displaying `regressions` directly in a |
| table within this component (the `regressions` property and |
| `getRegTemplate`). However, the primary display of anomalies seems to be |
| delegated to `anomalies-table-sk`. This older regression display logic |
| might be for a previous version or a specific use case not currently |
| active in the demo. The `isRegressionImprovement` static method |
| determines if a given regression object represents an improvement based |
| on direction and cluster type. |
| |
| - **`anomalies-table-sk` (external dependency)**: This component is |
| responsible for rendering the detailed table of anomalies. |
| `regressions-page-sk` fetches the anomaly data and then passes it to |
| `anomalies-table-sk` for display. This promotes modularity, separating data |
| fetching/management from presentation. |
| |
| - **`subscription-table-sk` (external dependency)**: This component displays |
| information about the currently selected subscription, including any |
| configured alerts. Similar to `anomalies-table-sk`, it receives data from |
| `regressions-page-sk`. |
| |
| - **`regressions-page-sk.scss`**: Provides styling for the |
| `regressions-page-sk` component, including colors for positive/negative |
| changes and styles for spinners and buttons. |
| |
| - **`regressions-page-sk-demo.html` and `regressions-page-sk-demo.ts`**: These |
| files set up a demonstration page for the `regressions-page-sk` component. |
| |
| - `regressions-page-sk-demo.ts` is particularly important for |
| understanding how the component is intended to be used and tested. It |
| initializes a global `window.perf` object with configuration settings |
| that the main component might rely on (though direct usage isn't evident |
| in `regressions-page-sk.ts` itself, it's a common pattern in Perf). |
| - It uses `fetchMock` to simulate API responses for `/users/login/status`, |
| `/_/subscriptions`, and `/_/regressions` (which seems to be an older |
| endpoint pattern compared to what `regressions-page-sk.ts` uses). This |
| mocking is crucial for creating a standalone demo environment. |
| - _Why `fetchMock`?_ It allows developers to work on and test the UI |
| component without needing a live backend, ensuring predictable data and |
| behavior for demos and tests. |
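| |
| The `stateReflector` usage sketch referenced above, assuming the `infra-sk` |
| `stateReflector(getState, setState)` interface and typical Perf import |
| paths: |
| |
| ``` |
| import { stateReflector } from '../../../infra-sk/modules/stateReflector'; |
| import { HintableObject } from '../../../infra-sk/modules/hintable'; |
| |
| interface State { |
|   selectedSubscription: string; |
|   showTriaged: boolean; |
|   showImprovements: boolean; |
| } |
| |
| let state: State = { |
|   selectedSubscription: '', |
|   showTriaged: false, |
|   showImprovements: false, |
| }; |
| |
| // The returned function serializes state into the URL; the callback runs |
| // when the URL supplies an initial or changed state (page load, history |
| // navigation), restoring the UI. |
| const stateHasChanged = stateReflector( |
|   () => state as unknown as HintableObject, |
|   (newState) => { |
|     state = newState as unknown as State; |
|     // Re-fetch regressions and re-render here. |
|   } |
| ); |
| |
| // After a user toggles a filter: |
| state.showTriaged = !state.showTriaged; |
| stateHasChanged(); // pushes the new state into the URL query parameters |
| ``` |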
| |
| **Workflow for Displaying Regressions:** |
| |
| 1. **Initialization (`connectedCallback`, `init`)**: |
| |
| - `regressions-page-sk` element is added to the DOM. |
| - `stateReflector` is set up to read initial state from URL or use |
| defaults. |
| - `init()` is called: |
| - Fetches the list of available subscriptions (e.g., "Sheriff Config |
| 1", "Sheriff Config 2"). |
| - Populates the subscription dropdown (`<select id="filter">`). |
| |
| 2. **User Selects a Subscription (`filterChange`)**: |
| |
| - User selects "Sheriff Config 2" from the dropdown. |
| - `filterChange("Sheriff Config 2")` is triggered. |
| - `state.selectedSubscription` is updated to "Sheriff Config 2". |
| - `cpAnomalies` is cleared, `anomalyCursor` is reset. |
| - `stateHasChanged()` is called, updating the URL (e.g., |
| `?selectedSubscription=Sheriff%20Config%202`). |
| - `fetchRegressions()` is called. |
| |
| 3. **Fetching Anomalies (`fetchRegressions`)**: |
| |
| - An API request is made to |
| `/_/anomalies/anomaly_list?sheriff=Sheriff%20Config%202` (or the Skia |
| equivalent). |
| - A loading spinner is shown. |
| - The server responds with a list of anomalies and potentially a cursor |
| for pagination. |
| |
| 4. **Displaying Anomalies**: |
| |
| - The fetched anomalies are appended to `this.cpAnomalies`. |
| - The `subscriptionTable` is updated with subscription details and alerts |
| from the response. |
| - The `anomaliesTable` (the `anomalies-table-sk` instance) is populated |
| with `this.cpAnomalies`. |
| - If a cursor was returned, the "Show More" button becomes visible. |
| - Loading spinner is hidden. |
| - The component re-renders. |
| |
| ``` |
| User Action Component State API Interaction UI Update |
| ----------- --------------- --------------- --------- |
| Page Load |
| | |
| V |
| regressions-page-sk.init() |
| | state = {selectedSubscription:''} |
| V |
| fetch('/_/anomalies/sheriff_list') -> ["Sheriff1", "Sheriff2"] |
| | subscriptionList = ["Sheriff1", "Sheriff2"] |
| V |
| Populate dropdown |
| Disable filter buttons |
| |
| Selects "Sheriff1" |
| | |
| V |
| regressions-page-sk.filterChange("Sheriff1") |
| | state = {selectedSubscription:'Sheriff1', ...} |
| | (URL updates via stateReflector) |
| V |
| regressions-page-sk.fetchRegressions() |
| | anomaliesLoadingSpinner = true |
| V |
| fetch('/_/anomalies/anomaly_list?sheriff=Sheriff1') -> {anomaly_list: [...], anomaly_cursor: 'cursor123'} |
| | cpAnomalies = [...], anomalyCursor = 'cursor123', showMoreAnomalies = true |
| | anomaliesLoadingSpinner = false |
| V |
| Update anomaliesTable |
| Update subscriptionTable |
| Show "Show More" button |
| Enable filter buttons |
| Clicks "Show More" |
| | |
| V |
| regressions-page-sk.fetchRegressions() (called by button click) |
| | showMoreLoadingSpinner = true |
| V |
| fetch('/_/anomalies/anomaly_list?sheriff=Sheriff1&anomaly_cursor=cursor123') -> {anomaly_list: [more...], anomaly_cursor: null} |
| | cpAnomalies = [all...], anomalyCursor = null, showMoreAnomalies = false |
| | showMoreLoadingSpinner = false |
| V |
| Update anomaliesTable (append) |
| Hide "Show More" button |
| ``` |
| |
| 5. **Toggling Filters (e.g., "Show Triaged", `triagedChange`)**: |
| |
| - User clicks "Show Triaged". |
| - `triagedChange()` is triggered. |
| - `state.showTriaged` is toggled. |
| - Button text updates (e.g., to "Hide Triaged"). |
| - `stateHasChanged()` updates the URL (e.g., |
| `?selectedSubscription=Sheriff%20Config%202&showTriaged=true`). |
| - `fetchRegressions()` is called again, this time with `triaged=true` in |
| the query. |
| - The UI updates with the newly filtered list of anomalies. |
| |
| The design separates concerns: `regressions-page-sk` handles overall page logic, |
| state, and orchestration of data fetching, while specialized components like |
| `anomalies-table-sk` and `subscription-table-sk` handle the rendering of |
| specific data views. The use of `stateReflector` ensures the UI state is |
| bookmarkable and shareable. The demo files with `fetchMock` are critical for |
| isolated development and testing of the UI component. |
| |
| # Module: /modules/report-page-sk |
| |
| The `report-page-sk` module is designed to display a detailed report page for |
| performance anomalies. Its primary purpose is to provide users with a |
| comprehensive view of selected anomalies, including their associated graphs and |
| commit information, facilitating the analysis and understanding of performance |
| regressions or improvements. |
| |
| At its core, the `report-page-sk` element orchestrates the display of several |
| key pieces of information. It fetches anomaly data from a backend endpoint |
| (`/_/anomalies/group_report`) based on URL parameters (like revision, anomaly |
| IDs, bug ID, etc.). This data is then used to populate an `anomalies-table-sk` |
| element, which presents a tabular view of the anomalies. |
| |
| A crucial design decision is the use of an `AnomalyTracker` class. This class is |
| responsible for managing the state of each anomaly, including whether it's |
| selected (checked) by the user, its associated graph, and the relevant time |
| range for graphing. This separation of concerns keeps the main `ReportPageSk` |
| class cleaner and focuses its responsibilities on rendering and user |
| interaction. |
| |
| When an anomaly is selected in the table, `report-page-sk` dynamically generates |
| and displays an `explore-simple-sk` graph for that anomaly. The |
| `explore-simple-sk` element is configured to show data around the anomaly's |
| occurrence, typically a week before and after, to provide context. If multiple |
| anomalies are selected, their graphs are displayed, and their heights are |
| adjusted to fit the available space. A key feature is the synchronized X-axis |
| across all displayed graphs, ensuring a consistent time scale for comparison. |
| |
| The page also attempts to identify and display common commits related to the
| selected anomalies. It fetches commit details using the `lookupCids` function
| and highlights commits that appear to be "roll" commits (e.g., "Roll repo from
| hash to hash"). For roll commits it links to the underlying rolled commit,
| falling back to the parent commit when the roll pattern cannot be parsed from
| the commit message; either way, this helps developers trace the source of a
| change.
| |
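| As a rough illustration, roll-commit subjects can be recognized with a pattern
| like the one below. The exact regex used by `report-page-sk` may differ; treat
| this as a hypothetical sketch:
|
| ```
| // Matches subjects like "Roll skia from abc123 to def456".
| const ROLL_RE = /^Roll (\S+) from (\S+) to (\S+)/;
|
| function parseRollCommit(
|   subject: string
| ): { repo: string; from: string; to: string } | null {
|   const m = ROLL_RE.exec(subject);
|   return m ? { repo: m[1], from: m[2], to: m[3] } : null;
| }
|
| // parseRollCommit('Roll skia from abc123 to def456')
| // -> { repo: 'skia', from: 'abc123', to: 'def456' }
| ```
|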
| **Key Components and Responsibilities:** |
| |
| - **`report-page-sk.ts`**: This is the main TypeScript file defining the |
| `ReportPageSk` custom element. |
| |
| - **`ReportPageSk` class**: |
| - **Initialization**: Fetches default configurations (`/_/defaults/`) and |
| then anomaly data based on URL parameters. |
| - **Anomaly Management**: Uses an `AnomalyTracker` instance to manage the |
| state of individual anomalies (selected, graphed, time range). |
| - **Rendering**: Dynamically renders the `anomalies-table-sk` and |
| `explore-simple-sk` graphs based on user interactions and fetched data. |
| It uses the `lit-html` library for templating. |
| - **Event Handling**: Listens for `anomalies_checked` events from the |
| `anomalies-table-sk` to update the displayed graphs. It also handles |
| `x-axis-toggled` events from `explore-simple-sk` to synchronize the |
| x-axis across multiple graphs. |
| - **Graph Generation**: When an anomaly is selected, it creates an |
| `explore-simple-sk` instance, configures its query based on the |
| anomaly's test path, and sets the appropriate time range. |
| - **Commit Information**: Fetches commit details relevant to the anomalies |
| and displays a list of common commits, with special handling for "roll" |
| commits. |
| - **Spinner**: Shows a loading spinner (`spinner-sk`) during data fetching |
| operations. |
| - **`AnomalyTracker` class**: |
| - **State Management**: Stores `AnomalyDataPoint` objects, each containing |
| an `Anomaly`, its checked status, its associated `ExploreSimpleSk` graph |
| instance (if any), and its `Timerange`. |
| - **Loading Data**: Populates its internal tracker from a list of |
| anomalies and their corresponding time ranges. |
| - **Accessors**: Provides methods to get individual anomaly data, |
| set/unset graphs, and retrieve lists of all or selected anomalies. This |
| abstraction is key to decoupling the graph display logic from the raw |
| anomaly data. |
| - **`AnomalyDataPoint` interface**: Defines the structure for storing |
| information about a single anomaly within the `AnomalyTracker`. |
| |
| - **`report-page-sk.scss`**: Contains the SASS/CSS styles for the |
| `report-page-sk` element, including styling for the common commits section |
| and the dialog for displaying all commits (though the dialog itself is not |
| fully implemented in the provided `showAllCommitsTemplate`). |
| |
| - **Data Fetching Workflow**: |
| |
| 1. `ReportPageSk` element is connected to the DOM. |
| 2. URL parameters (e.g., `rev`, `anomalyIDs`, `bugID`) are read. |
| 3. `fetchAnomalies()` is called (see the sketch following these workflows).
| - POST request to `/_/anomalies/group_report` with URL parameters in |
| the body. |
| - Backend responds with `anomaly_list`, `timerange_map`, and |
| `selected_keys`. |
| 4. `AnomalyTracker` is loaded with this data. |
| 5. `anomalies-table-sk` is populated. |
| 6. Graphs for initially selected anomalies are rendered. |
| |
| - **User Interaction Workflow (Selecting an Anomaly)**: |
| |
| 1. User checks/unchecks an anomaly in `anomalies-table-sk`. |
| 2. `anomalies-table-sk` fires an `anomalies_checked` custom event with the |
| anomaly and its checked state. |
| 3. `ReportPageSk` listens for this event. |
| 4. `updateGraphs()` is called: |
| - If checked and no graph exists: |
| - `addGraph()` is called. |
| - A new `explore-simple-sk` instance is created and configured. |
| - The graph is added to the DOM. |
| - The `AnomalyTracker` is updated with the new graph instance. |
| - If unchecked and a graph exists: |
| - The graph is removed from the DOM. |
| - The `AnomalyTracker` is updated to remove the graph reference. |
| 5. `updateChartHeights()` is called to adjust the height of all visible |
| graphs. |
| |
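| A minimal sketch of the `fetchAnomalies()` step in the data-fetching workflow
| above, assuming a hypothetical response type built from the fields listed
| there:
|
| ```
| interface GroupReportResponse {
|   anomaly_list: unknown[];
|   timerange_map: { [key: string]: { begin: number; end: number } };
|   selected_keys: string[];
| }
|
| async function fetchAnomalies(
|   params: { [key: string]: string } // e.g. { rev: '12345' } or { bugID: '678' }
| ): Promise<GroupReportResponse> {
|   const resp = await fetch('/_/anomalies/group_report', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify(params),
|   });
|   if (!resp.ok) {
|     throw new Error(`group_report failed: ${resp.status}`);
|   }
|   return resp.json();
| }
| ```
|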
| The design emphasizes dynamic content loading and interactive exploration. By |
| using separate custom elements for the table (`anomalies-table-sk`) and graphs |
| (`explore-simple-sk`), the module maintains a good separation of concerns and |
| leverages reusable components. The `AnomalyTracker` further enhances this by |
| encapsulating the state and logic related to individual anomalies. |
| |
| # Module: /modules/revision-info-sk |
| |
| The `revision-info-sk` custom HTML element is designed to display information |
| about anomalies detected around a specific revision. This is particularly useful |
| for understanding the impact of a code change on performance metrics. |
| |
| The core functionality revolves around fetching and presenting `RevisionInfo` |
| objects. A `RevisionInfo` object contains details like the benchmark, bot, bug |
| ID, start and end revisions of an anomaly, the associated test, and links to |
| explore the anomaly further. |
| |
| **Key Components and Workflow:** |
| |
| 1. **`revision-info-sk.ts`**: This is the main TypeScript file defining the |
| `RevisionInfoSk` element. |
| |
| - **State Management**: The element maintains its state in a `State` |
| object, primarily storing the `revisionId`. It utilizes `stateReflector` |
| from `infra-sk/modules/statereflector` to keep the URL in sync with the |
| element's state. This allows users to share links that directly open to |
| a specific revision's information. |
| - `URL change` -> `stateReflector updates State.revisionId` -> |
| `getRevisionInfo() is called` |
| - `User types revision ID and clicks "Get Revision Information"` -> |
| `State.revisionId updated` -> `stateReflector updates URL` -> |
| `getRevisionInfo() is called` |
| - **Data Fetching (`getRevisionInfo`)**: When a revision ID is provided |
| (either via URL or user input), this method is triggered. |
| - It displays a spinner (`spinner-sk`) to indicate loading. |
| - It makes a `fetch` request to the `/_/revision/?rev=<revisionId>` |
| endpoint. |
| - The JSON response, an array of `RevisionInfo` objects, is parsed |
| using `jsonOrThrow`. |
| - The fetched `revisionInfos` are stored, and the UI is re-rendered to |
| display the information. |
| - **Rendering (`template`, `getRevInfosTemplate`, `revInfoRowTemplate`)**: |
| Lit-html is used for templating. |
| - The main template (`template`) includes an input field for the |
| revision ID, a button to trigger fetching, a spinner, and a |
| container for the revision information. |
| - `getRevInfosTemplate` generates an HTML table if `revisionInfos` is |
| populated. This table includes a header row with a "select all" |
| checkbox and columns for bug ID, revision range, master, bot, |
| benchmark, and test. |
| - `revInfoRowTemplate` renders each individual `RevisionInfo` as a row |
| in the table. Each row has a checkbox for selection, a link to the |
| bug (if any), a link to explore the anomaly, and the other relevant |
| details. |
| - **Multi-Graph Functionality**: The element allows users to select |
| multiple detected anomaly ranges and view them together on a multi-graph |
| page. |
| |
| - **Selection**: Checkboxes (`checkbox-sk`) are provided for each |
| revision info row and a "select all" checkbox. The `toggleSelectAll` |
| method handles the logic for the master checkbox. |
| - **`updateMultiGraphStatus`**: This method is called whenever a |
| checkbox state changes. It checks if any revisions are selected and |
| enables/disables the "View Selected Graph(s)" button accordingly. It |
| also updates the `selectAll` state if no individual revisions are |
| checked. |
| - **`getGraphConfigs`**: This helper function takes an array of |
| selected `RevisionInfo` objects and transforms them into an array of |
| `GraphConfig` objects. Each `GraphConfig` contains the query string |
| associated with the anomaly. |
| - **`getMultiGraphUrl`**: This asynchronous method constructs the URL |
| for the multi-graph view. |
| |
| * It calls `getGraphConfigs` to get the configurations for the |
| selected revisions. |
| * It calls `updateShortcut` (from `explore-simple-sk`) to generate a |
| shortcut ID for the combined graph configurations. This typically |
| involves a POST request to `/_/shortcut/update`. |
| * It determines the overall time range (`begin` and `end` timestamps) |
| encompassing all selected anomalies. |
| * It gathers all unique `anomaly_ids` from the selected revisions to |
| highlight them on the multi-graph page. |
| * It constructs the final URL, including the `begin`, `end` |
| timestamps, the `shortcut` ID, the `totalGraphs`, and |
| `highlight_anomalies` parameters. |
| |
| - **`viewMultiGraph`**: This method is called when the "View Selected |
| Graph(s)" button is clicked. |
| |
| * It gathers all checked `RevisionInfo` objects. |
| * It calls `getMultiGraphUrl` to generate the redirect URL. |
| * If a URL is successfully generated, it navigates the current window |
| (`window.open(url, '_self')`) to the multi-graph page. If not, it |
| displays an error message. |
| |
| - **Styling (`revision-info-sk.scss`)**: Provides basic styling for the |
| element, such as left-aligning table headers and styling the spinner. |
| |
| 2. **`index.ts`**: Simply imports and thereby registers the `revision-info-sk` |
| custom element. |
| |
| 3. **Demo Page (`revision-info-sk-demo.html`, `revision-info-sk-demo.ts`, |
| `revision-info-sk-demo.scss`)**: |
| |
| - Provides a simple HTML page to showcase the `revision-info-sk` element. |
| - The `revision-info-sk-demo.ts` file uses `fetch-mock` to mock the |
| `/_/revision/` API endpoint. This is crucial for demonstrating the |
| element's functionality without needing a live backend. When the demo |
| page loads and the user interacts with the element (e.g., enters a |
| revision ID '12345'), the mocked response is returned. |
| |
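| The state-to-URL wiring can be sketched as follows, assuming the
| `stateReflector(getState, setState)` signature from `infra-sk`; the module
| paths follow the references in this document and may need adjusting:
|
| ```
| import { stateReflector } from 'infra-sk/modules/statereflector';
| import { HintableObject } from 'infra-sk/modules/hintable';
|
| interface State {
|   revisionId: string;
| }
|
| let state: State = { revisionId: '' };
|
| // stateReflector returns a callback to invoke whenever the element mutates
| // `state`, so that the URL stays in sync.
| const stateChanged = stateReflector(
|   () => ({ ...state }) as unknown as HintableObject, // state -> URL
|   (newState) => {
|     // URL -> state: runs on first load and on back/forward navigation.
|     state = newState as unknown as State;
|     if (state.revisionId) {
|       getRevisionInfo(state.revisionId);
|     }
|   }
| );
|
| function onUserInput(rev: string): void {
|   state.revisionId = rev;
|   stateChanged(); // Push the new state into the URL.
|   getRevisionInfo(rev);
| }
|
| function getRevisionInfo(rev: string): void {
|   fetch(`/_/revision/?rev=${encodeURIComponent(rev)}`);
| }
| ```
|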
| **Design Decisions and Rationale:** |
| |
| - **Custom Element**: Encapsulating this functionality as a custom element |
| (`<revision-info-sk>`) promotes reusability across different parts of the |
| Perf application or potentially other Skia web applications. |
| - **State Reflection**: Using `stateReflector` enhances user experience by |
| allowing direct navigation to a revision's details via URL and updating the |
| URL as the user interacts with the element. This makes sharing and |
| bookmarking specific views straightforward. |
| - **Lit-html for Templating**: Lit-html is chosen for its efficiency and |
| declarative approach to building UIs, making the rendering logic concise and |
| maintainable. |
| - **Asynchronous Operations**: Data fetching and shortcut generation are |
| asynchronous operations. The use of `async/await` makes the code easier to |
| read and manage compared to traditional Promise chaining. |
| - **Dedicated Multi-Graph URL Generation**: The logic for constructing the |
| multi-graph URL is encapsulated in `getMultiGraphUrl`. This separates |
| concerns and makes the process of generating the complex URL clearer. It |
| relies on the `explore-simple-sk` module's `updateShortcut` function, |
| promoting reuse of existing shortcut generation logic. |
| - **Error Handling**: `jsonOrThrow` is used to simplify error handling for |
| fetch requests. The `viewMultiGraph` method also includes basic error |
| handling if the URL generation fails. |
| - **Clear Separation of Concerns**: The element focuses on displaying revision |
| information and providing navigation to related views (bug tracker, explore |
| page, multi-graph view). It doesn't concern itself with the details of how |
| anomalies are detected or how the multi-graph page itself functions. |
| |
| **Workflow for Displaying Revision Information:** |
| |
| ``` |
| User Interaction / URL Change |
| | |
| v |
| [revision-info-sk] stateReflector updates internal 'state.revisionId' |
| | |
| v |
| [revision-info-sk] getRevisionInfo() called |
| | |
| +--------------------------------+ |
| | | |
| v v |
| [revision-info-sk] shows spinner [revision-info-sk] makes fetch request to `/_/revision/?rev=<ID>` |
| | | |
| | v |
| | [Backend] processes request, returns RevisionInfo[] |
| | | |
| | v |
| +------------------> [revision-info-sk] receives JSON response, parses with jsonOrThrow |
| | |
| v |
| [revision-info-sk] stores 'revisionInfos', hides spinner |
| | |
| v |
| [revision-info-sk] re-renders using Lit-html templates to display table |
| ``` |
| |
| **Workflow for Viewing Multi-Graph:** |
| |
| ``` |
| User selects one or more revision info rows (checkboxes) |
| | |
| v |
| [revision-info-sk] updateMultiGraphStatus() enables "View Selected Graph(s)" button |
| | |
| v |
| User clicks "View Selected Graph(s)" button |
| | |
| v |
| [revision-info-sk] viewMultiGraph() called |
| | |
| v |
| [revision-info-sk] collects selected RevisionInfo objects |
| | |
| v |
| [revision-info-sk] calls getMultiGraphUrl(selectedRevisions) |
| | |
| +------------------------------------------------------+ |
| | | |
| v v |
| [getMultiGraphUrl] calls getGraphConfigs() to create GraphConfig[] [getMultiGraphUrl] calls updateShortcut(GraphConfig[]) |
| | | (makes POST to /_/shortcut/update) |
| | v |
| | [Backend] returns shortcut ID |
| | | |
| +-------------------------------------> [getMultiGraphUrl] constructs final URL (with begin, end, shortcut, anomaly IDs) |
| | |
| v |
| [viewMultiGraph] receives the multi-graph URL |
| | |
| v |
| [Browser] navigates to the generated multi-graph URL |
| ``` |
| |
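| Pulling these steps together, the URL-construction stage can be sketched like
| this; the field names follow the description above, while the page path and
| exact parameter encoding are assumptions:
|
| ```
| interface RevisionInfoLike {
|   query: string;
|   start_time: number;
|   end_time: number;
|   anomaly_ids: string[];
| }
|
| function buildMultiGraphUrl(
|   selected: RevisionInfoLike[],
|   shortcutId: string // From updateShortcut(), i.e. POST /_/shortcut/update.
| ): string {
|   const begin = Math.min(...selected.map((r) => r.start_time));
|   const end = Math.max(...selected.map((r) => r.end_time));
|   const anomalies = Array.from(new Set(selected.flatMap((r) => r.anomaly_ids)));
|   const params = new URLSearchParams({
|     begin: String(begin),
|     end: String(end),
|     shortcut: shortcutId,
|     totalGraphs: String(selected.length),
|     highlight_anomalies: anomalies.join(','),
|   });
|   return `/m/?${params.toString()}`; // '/m/' is a placeholder path.
| }
| ```
|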
| # Module: /modules/split-chart-menu-sk |
| |
| The `split-chart-menu-sk` module provides a user interface element for selecting |
| an attribute by which to split a chart. This is particularly useful in data |
| visualization scenarios where users need to break down aggregated data into |
| smaller, more specific views. For example, in a performance monitoring |
| dashboard, a user might want to see performance metrics split by benchmark, |
| specific test case (story), or sub-component (subtest). |
| |
| The core functionality revolves around presenting a list of available attributes |
| to the user in a dropdown menu. These attributes are dynamically derived from |
| the underlying data. When an attribute is selected, the component emits an |
| event, allowing other parts of the application to react and update the chart |
| display accordingly. |
| |
| **Key Components and Design:** |
| |
| - **`split-chart-menu-sk.ts`**: This is the main TypeScript file that defines |
| the `SplitChartMenuSk` LitElement. |
| |
| - **Data Consumption:** The component utilizes the Lit `context` API |
| (`@consume`) to access data from two sources: `dataframeContext` and |
| `dataTableContext`. |
| - `dataframeContext` provides the `DataFrame` (from |
| `//perf/modules/json:index_ts_lib` and |
| `//perf/modules/dataframe:dataframe_context_ts_lib`). The `DataFrame` is |
| the source from which the list of available attributes for splitting is |
| derived. This design decouples the menu from the specifics of data |
| fetching and management, allowing it to focus solely on the UI aspect of |
| attribute selection. The `getAttributes` function (from |
| `//perf/modules/dataframe:traceset_ts_lib`) is used to extract these |
| attributes. |
| - `dataTableContext` provides `DataTable` (also from
| `//perf/modules/dataframe:dataframe_context_ts_lib`). Although it is
| consumed, `DataTable` is not referenced in this component's `render`
| method; it appears to be consumed for consistency with sibling
| components or reserved for future enhancements.
| - **User Interaction:** |
| - A Material Design outlined button (`<md-outlined-button>`) labeled |
| "Split By" serves as the trigger to open the menu. |
| - The menu itself is a Material Design menu (`<md-menu>`), which is |
| populated with `<md-menu-item>` elements, one for each attribute |
| retrieved from the `DataFrame`. |
| - The `menuOpen` state property controls the visibility of the menu. |
| Clicking the button toggles this state. The menu also closes itself via |
| the `@closed` event. |
| - **Event Emission:** When a user clicks on a menu item, the |
| `bubbleAttribute` method is called. This method dispatches a custom |
| event named `split-chart-selection`. |
| - The event detail (`SplitChartSelectionEventDetails`) contains the |
| selected `attribute` (a string). |
| - The event is configured to bubble (`bubbles: true`) and pass through |
| shadow DOM boundaries (`composed: true`), making it easy for ancestor |
| elements to listen and react to the selection. This event-driven |
| approach is crucial for decoupling the menu from the chart component or |
| any other component that needs to know about the selected split |
| attribute. |
| - **Styling:** Styles are imported from `split-chart-menu-sk.css.ts` |
| (`style`). This keeps the component's presentation concerns separate |
| from its logic. The styles ensure the component is displayed as an |
| inline block and sets a default background color, also styling the |
| Material button. |
| |
| - **`split-chart-menu-sk.css.ts`**: This file defines the CSS styles for the |
| component using Lit's `css` tagged template literal. The primary styling |
| focuses on the host element's positioning and background, and customizing |
| the Material Design button's border radius. |
| |
| - **`index.ts`**: This file simply imports and registers the |
| `split-chart-menu-sk` custom element, making it available for use in HTML. |
| |
| **Workflow: Selecting a Split Attribute** |
| |
| 1. **Initialization:** |
| |
| - The `split-chart-menu-sk` component is rendered. |
| - It consumes the `DataFrame` from the `dataframeContext`. |
| - The `getAttributes()` method is called (implicitly via the render |
| method's map function) to populate the list of attributes for the menu. |
| |
| 2. **User Interaction:** |
| |
| - User clicks the "Split By" button. |
| - `menuClicked` handler is invoked -> `this.menuOpen` becomes `true`. |
| - The `<md-menu>` component becomes visible, displaying the list of |
| attributes. |
| |
| ``` |
| User split-chart-menu-sk DataFrame |
| | | | |
| |---Clicks "Split By"->| | |
| | |---Toggles menuOpen=true-->| |
| | | | |
| | |<--Displays Menu-------| |
| | | | |
| ``` |
| |
| 3. **Attribute Selection:** |
| |
| - User clicks on an attribute in the menu (e.g., "benchmark"). |
| - The `click` handler on `<md-menu-item>` calls |
| `this.bubbleAttribute("benchmark")`. |
| - `bubbleAttribute` creates a
| `CustomEvent('split-chart-selection', { detail: { attribute: "benchmark" } })`.
| - The event is dispatched and bubbles up to any listeners (see the sketch
| after this list).
| |
| ``` |
| User split-chart-menu-sk (Parent Component) |
| | | | |
| |---Clicks "benchmark"->| | |
| | |---Calls bubbleAttribute("benchmark")-->| |
| | | | |
| | |---Dispatches "split-chart-selection" event--> (Listens for event) |
| | | | | |
| | | | |---Handles event, updates chart |
| ``` |
| |
| 4. **Menu Closes:** |
| |
| - The `<md-menu>` component emits a `closed` event. |
| - The `menuClosed` handler is invoked -> `this.menuOpen` becomes `false`. |
| |
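| A sketch of the event dispatch described in step 3. The event name and detail
| shape come from the description above; the standalone function form is purely
| illustrative:
|
| ```
| export interface SplitChartSelectionEventDetails {
|   attribute: string;
| }
|
| function bubbleAttribute(host: HTMLElement, attribute: string): void {
|   host.dispatchEvent(
|     new CustomEvent<SplitChartSelectionEventDetails>('split-chart-selection', {
|       detail: { attribute },
|       bubbles: true, // Bubble up through ancestor elements...
|       composed: true, // ...and across shadow DOM boundaries.
|     })
|   );
| }
|
| // An ancestor listens and reacts, e.g. to re-split the chart:
| document.addEventListener('split-chart-selection', (e) => {
|   const detail = (e as CustomEvent<SplitChartSelectionEventDetails>).detail;
|   console.log('split by', detail.attribute);
| });
| ```
|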
| This design ensures that `split-chart-menu-sk` is a self-contained, reusable UI |
| component whose sole responsibility is to provide a way to select a splitting |
| attribute and communicate that selection to the rest of the application via a |
| well-defined event. The use of context for data consumption and custom events |
| for output makes it highly decoupled and easy to integrate. |
| |
| The demo page (`split-chart-menu-sk-demo.html` and |
| `split-chart-menu-sk-demo.ts`) demonstrates how to use the component and listen |
| for the `split-chart-selection` event. The Puppeteer test |
| (`split-chart-menu-sk_puppeteer_test.ts`) provides a basic smoke test and a |
| visual regression test by taking a screenshot. |
| |
| # Module: /modules/subscription-table-sk |
| |
| The `subscription-table-sk` module provides a custom HTML element designed to |
| display information about a "subscription" and its associated "alerts". This is |
| particularly useful in contexts where users need to understand the configuration |
| of automated monitoring or alerting systems. |
| |
| The core functionality is encapsulated within the `subscription-table-sk.ts` |
| file, which defines the `SubscriptionTableSk` custom element. This element is |
| built using Lit, a library for creating fast, lightweight web components. |
| |
| **Why and How:** |
| |
| The primary goal is to present complex subscription and alert data in a |
| user-friendly and interactive manner. Instead of a static display, this |
| component allows for toggling the visibility of the detailed alert |
| configurations. This design choice avoids overwhelming the user with too much |
| information upfront, providing a cleaner initial view focused on the |
| subscription summary. |
| |
| The `SubscriptionTableSk` element takes `Subscription` and `Alert[]` objects as |
| input. The `Subscription` object contains general information like name, contact |
| email, revision, bug tracking details (component, hotlists, priority, severity, |
| CC emails). The `Alert[]` array holds detailed configurations for individual |
| alerts, including their query parameters, step algorithm, radius, and other |
| specific settings. |
| |
| **Key Responsibilities and Components:** |
| |
| - **`subscription-table-sk.ts`**: |
| - **`SubscriptionTableSk` class**: This is the heart of the module. It |
| extends `ElementSk`, a base class for Skia custom elements. |
| - **Data Handling**: It stores the `subscription` and `alerts` data |
| internally. |
| - **Rendering Logic (`template` static method)**: It uses Lit's `html` |
| tagged template literal to define the structure and content of the |
| element. It conditionally renders the subscription details and the |
| alerts table based on the available data and the `showAlerts` state. |
| - Subscription details are always visible if a subscription is loaded. |
| - The alerts table is only rendered if `showAlerts` is true. This |
| state is toggled by a button. |
| - **`load(subscription: Subscription, alerts: Alert[])` method**: This |
| public method is the primary way to feed data into the component. It |
| updates the internal state and triggers a re-render. |
| - **`toggleAlerts()` method**: This method flips the `showAlerts` boolean |
| flag and triggers a re-render, effectively showing or hiding the alerts |
| table. |
| - **`formatRevision(revision: string)` method**: A helper function to |
| display the revision string as a clickable link, pointing to a specific |
| configuration file URL. This improves usability by allowing users to |
| quickly navigate to the source of the configuration. |
| - **`paramset-sk` integration**: For displaying the alert `query`, it |
| utilizes the `paramset-sk` element. The `toParamSet` utility function |
| (from `infra-sk/modules/query`) is used to convert the query string into |
| a format suitable for `paramset-sk`, which then renders it as a |
| structured set of key-value pairs. This enhances readability of complex |
| query strings. |
| - **Styling (`subscription-table-sk.scss`)**: This file defines the visual |
| appearance of the element. It uses SCSS and imports styles from shared |
| libraries (`themes_sass_lib`, `buttons_sass_lib`, `select_sass_lib`) to |
| maintain a consistent look and feel with other Skia elements. The styles |
| focus on clear presentation of information, with distinct sections for |
| subscription details and the alerts table. |
| |
| **Workflow: Displaying Subscription and Alerts** |
| |
| 1. **Initialization**: An instance of `subscription-table-sk` is added to the |
| DOM. `<subscription-table-sk></subscription-table-sk>` |
| 2. **Data Loading**: External code (e.g., in a demo page or a larger |
| application) calls the `load()` method on the element instance, passing in |
| the `Subscription` object and an array of `Alert` objects. |
| `element.load(mySubscriptionData, myAlertsData);` |
| 3. **Initial Render**: |
| - The `SubscriptionTableSk` element updates its internal `subscription` |
| and `alerts` properties. |
| - `showAlerts` is set to `false` by default upon loading new data. |
| - The `_render()` method is called (implicitly by Lit or explicitly). |
| - The `template` function generates the HTML: |
| - Subscription details (name, email, revision, etc.) are displayed. |
| - A button labeled "Show [N] Alert Configurations" is displayed. |
| - The alerts table is _not_ rendered yet. |
| 4. **User Interaction (Toggling Alerts)**:
|    - The user clicks the "Show [N] Alert Configurations" button.
|    - The `click` event triggers the `toggleAlerts()` method.
|    - `showAlerts` becomes `true`.
|    - `_render()` is called again.
|    - The `template` function now also renders the `<table id="alerts-table">`:
|      - The table header is displayed.
|      - For each `Alert` object in `ele.alerts`, a table row (`<tr>`) is
|        created, and cells (`<td>`) display various alert properties (step
|        algorithm, radius, k, etc.).
|      - The alert `query` is passed to a `<paramset-sk>` element for
|        structured display.
|    - The button label changes to "Hide Alert Configurations".
| 5. **Further Toggling**: Clicking the button again will hide the table, and the |
| label will revert. |
| |
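| A brief usage sketch, with deliberately minimal hypothetical types standing in
| for the real `Subscription` and `Alert` interfaces:
|
| ```
| interface Subscription {
|   name: string;
|   contact_email: string;
|   revision: string;
| }
|
| interface Alert {
|   query: string;
|   algo: string;
|   radius: number;
| }
|
| const table = document.querySelector('subscription-table-sk') as HTMLElement & {
|   load(s: Subscription, a: Alert[]): void;
|   toggleAlerts(): void;
| };
|
| table.load(
|   { name: 'Sheriff Config 1', contact_email: 'owner@example.com', revision: 'abc123' },
|   [{ query: 'benchmark=b1&bot=botX', algo: 'stepfit', radius: 7 }]
| );
| table.toggleAlerts(); // Reveal the alert configurations table.
| ```
|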
| **Diagram: Data Flow and Rendering** |
| |
| ``` |
| External Code ---> subscriptionTableSkElement.load(subscription, alerts) |
| | |
| V |
| SubscriptionTableSk Internal State: |
| - this.subscription = subscription |
| - this.alerts = alerts |
| - this.showAlerts = false (initially or after load) |
| | |
| V |
| _render() ------> Lit Template Evaluation |
| | |
| ------------------------------------- |
| | | |
| V (if this.subscription is not null) V (if this.showAlerts is true) |
| Render Subscription Details Render Alerts Table |
| - Name, Email, Revision (formatted link) - Iterate through this.alerts |
| - Bug info, Hotlists, CCs - For each alert: |
| - "Show/Hide Alerts" Button - Display properties in <td> |
| - Use <paramset-sk> for query |
| ``` |
| |
| **Demo Page (`subscription-table-sk-demo.html`, |
| `subscription-table-sk-demo.ts`)** |
| |
| The demo page serves as an example and testing ground. |
| |
| - `subscription-table-sk-demo.html`: Sets up the basic HTML structure, |
| including instances of `subscription-table-sk` (one for light mode, one for |
| dark mode to test theming) and buttons to interact with them. It also |
| includes an `error-toast-sk` for displaying potential errors. |
| - `subscription-table-sk-demo.ts`: Contains JavaScript to: |
| - Import and register the `subscription-table-sk` element. |
| - Define sample `Subscription` and `Alert` data. |
| - Add event listeners to the "Populate Tables" button, which calls the |
| `load()` method on the `subscription-table-sk` instances with the sample |
| data. |
| - Add event listeners to the "Toggle Alerts Table" button, which calls the |
| `toggleAlerts()` method on the instances. |
| |
| This setup allows developers to see the component in action and verify its |
| functionality with predefined data. |
| |
| # Module: /modules/test-picker-sk |
| |
| The `test-picker-sk` module provides a custom HTML element, `<test-picker-sk>`, |
| designed to guide users in selecting a valid trace or test for plotting. It |
| achieves this by presenting a series of dependent input fields, where the |
| options available in each field are dynamically updated based on selections made |
| in previous fields. This ensures that users can only construct valid |
| combinations of parameters. |
| |
| **Core Functionality and Design:** |
| |
| The primary goal of `test-picker-sk` is to simplify the process of selecting a |
| specific data series (a "trace" or "test") from a potentially large and complex |
| dataset. This is often necessary in performance analysis tools where data is |
| categorized by multiple parameters (e.g., benchmark, bot, specific test, |
| sub-test variations). |
| |
| The design enforces a specific order for filling out these parameters. This |
| hierarchical approach is crucial because the valid options for a parameter often |
| depend on the values chosen for its preceding parameters. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`test-picker-sk.ts`**: This is the heart of the module, defining the |
| `TestPickerSk` custom element. |
| |
| - **`FieldInfo` class**: This internal class is a simple data structure |
| used to manage the state of each individual input field within the |
| picker. It stores a reference to the `PickerFieldSk` element, the |
| parameter name (e.g., "benchmark", "bot"), and the currently selected |
| value. |
| - **Dynamic Field Generation (`addChildField`)**: When a value is selected |
| in a field, and if there are more parameters in the hierarchy, a new |
| `PickerFieldSk` input is dynamically added to the UI. The options for |
| this new field are fetched from the backend. This progressive disclosure |
| prevents overwhelming the user with too many options at once. |
| - **Backend Communication (`callNextParamList`)**: The element interacts |
| with a backend endpoint (`/_/nextParamList/`). This endpoint is |
| responsible for: |
| |
| * Providing the list of valid options for the _next_ input field based on |
| the current selections. |
| * Returning a count of how many unique traces/tests match the current |
| partial or complete selection. |
| |
| - **State Management (`_fieldData`, `_currentIndex`)**: The `_fieldData` |
| array holds `FieldInfo` objects for each parameter field. |
| `_currentIndex` tracks which field is currently active or the next to be |
| added. |
| - **Event Handling (`value-changed`, `plot-button-clicked`)**: |
| - It listens for `value-changed` events from its child `picker-field-sk` |
| elements. When a value changes, it triggers logic to update subsequent |
| fields and the match count. |
| - It emits a `plot-button-clicked` custom event when the user clicks the |
| "Add Graph" button. This event includes the fully constructed query |
| string representing the selected trace. |
| - **Query Population (`populateFieldDataFromQuery`)**: This method allows |
| the picker to be initialized with a pre-existing query string. It will |
| populate the fields sequentially based on the query parameters. If a |
| parameter in the hierarchy is missing or empty in the query, the |
| population stops at that point. |
| - **Plotting Logic (`onPlotButtonClick`, `PLOT_MAXIMUM`)**: The "Add |
| Graph" button is enabled only when the number of matching traces is |
| within a manageable range (greater than 0 and less than or equal to |
| `PLOT_MAXIMUM`). This prevents users from attempting to plot an |
| overwhelming number of traces. |
| - **Rendering and UI Updates**: The component uses the Lit library for
| templating and re-renders itself when its internal state changes (e.g., |
| new fields added, count updated, request in progress). It also manages |
| the enabled/disabled state of input fields during backend requests. |
| |
| - **`picker-field-sk` (Dependency)**: While not part of this module, |
| `test-picker-sk` heavily relies on the `picker-field-sk` element. Each |
| parameter in the test picker is represented by an instance of |
| `picker-field-sk`. This child component is responsible for displaying a |
| label, an input field, and a dropdown menu of selectable options. |
| |
| - **`test-picker-sk.scss`**: Defines the visual styling for the |
| `test-picker-sk` element and its internal components, ensuring a consistent |
| look and feel. It styles the layout of the fields, the match count display, |
| and the plot button. |
| |
| **Workflow: User Selecting a Test** |
| |
| 1. **Initialization (`initializeTestPicker`)**: |
| |
| - `test-picker-sk` is given an ordered list of parameter names (e.g., |
| `['benchmark', 'bot', 'test']`) and optional default parameters. |
| - `test-picker-sk` -> Backend (`/_/nextParamList/`): Requests options for |
| the _first_ parameter (e.g., "benchmark") with an empty query. |
| |
| ``` |
| User Interface: Backend: |
| [test-picker-sk] |
| | |
| initializeTestPicker(['benchmark', 'bot', 'test'], {}) |
| | |
| ---> POST /_/nextParamList/ (q="") |
| | |
| (Processes request, queries data source) |
| | |
| <--- {paramset: {benchmark: ["b1", "b2"]}, count: 100} |
| | |
| (Renders first PickerFieldSk for "benchmark" with options "b1", "b2") |
| [Benchmark: [select ▼]] [Matches: 100] [Add Graph (disabled)]
| ``` |
| |
| 2. **User Selects a Value**: |
| |
| - The user selects "b1" for "benchmark". |
| - The `picker-field-sk` for "benchmark" emits a `value-changed` event. |
| - `test-picker-sk` -> Backend: Requests options for the _next_ parameter |
| ("bot"), now including the selection `benchmark=b1` in the query. |
| |
| ``` |
| User Interface: |
| [Benchmark: [b1 ▼]]
| | (value-changed: {value: "b1"}) |
| [test-picker-sk] |
| | |
| ---> POST /_/nextParamList/ (q="benchmark=b1") |
| | |
| (Processes request, filters based on benchmark=b1) |
| | |
| <--- {paramset: {bot: ["botX", "botY"]}, count: 20} |
| | |
| (Renders PickerFieldSk for "bot" with options "botX", "botY") |
| [Benchmark: [b1 ▼]] [Bot: [select ▼]] [Matches: 20] [Add Graph (disabled)]
| ``` |
| |
| 3. **Process Repeats**: This continues for each parameter in the hierarchy. |
| |
| 4. **Final Selection and Plotting**: |
| |
| - Once all necessary parameters are selected (or the user chooses to |
| stop), the match count reflects the number of specific traces. |
| - If the count is within the `PLOT_MAXIMUM`, the "Add Graph" button |
| enables. |
| - User clicks "Add Graph". |
| - `test-picker-sk` emits `plot-button-clicked` with the final query (e.g., |
| `benchmark=b1&bot=botX&test=testZ`). |
| |
| ``` |
| User Interface: |
| [Benchmark: [b1 ▼]] [Bot: [botX ▼]] [Test: [testZ ▼]] [Matches: 5] [Add Graph (enabled)]
| | (User clicks "Add Graph") |
| [test-picker-sk] |
| | |
| emits 'plot-button-clicked' (detail: {query: "benchmark=b1&bot=botX&test=testZ"}) |
| ``` |
| |
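| The backend round trip and the final event can be sketched as follows; the
| request-body shape and response typing are assumptions based on the flows
| above:
|
| ```
| interface NextParamListResponse {
|   paramset: { [key: string]: string[] };
|   count: number;
| }
|
| async function callNextParamList(query: string): Promise<NextParamListResponse> {
|   const resp = await fetch('/_/nextParamList/', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify({ q: query }), // e.g. q = 'benchmark=b1'
|   });
|   if (!resp.ok) {
|     throw new Error(`nextParamList failed: ${resp.status}`);
|   }
|   return resp.json();
| }
|
| // A parent page reacts to the final selection:
| document
|   .querySelector('test-picker-sk')
|   ?.addEventListener('plot-button-clicked', (e) => {
|     const { query } = (e as CustomEvent<{ query: string }>).detail;
|     console.log('plot', query); // e.g. 'benchmark=b1&bot=botX&test=testZ'
|   });
| ```
|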
| **Why this Approach?** |
| |
| - **Guided Selection**: Prevents users from creating invalid or non-existent |
| trace combinations. |
| - **Performance**: By fetching options incrementally, the backend doesn't need |
| to return massive lists of all possible values for all parameters at once. |
| Queries to the backend are progressively filtered. |
| - **User Experience**: The interface is less cluttered as fields appear only |
| when needed. The match count provides immediate feedback on the specificity |
| of the selection. |
| |
| The `test-picker-sk-demo.html` and `test-picker-sk-demo.ts` files provide a |
| runnable example of the component, mocking the backend `/_/nextParamList/` |
| endpoint to showcase its functionality without needing a live backend. This is |
| essential for development and testing. The Puppeteer and Karma tests |
| (`test-picker-sk_puppeteer_test.ts`, `test-picker-sk_test.ts`) ensure the |
| component behaves as expected under various conditions. |
| |
| # Module: /modules/themes |
| |
| The `/modules/themes` module is responsible for defining the visual styling and |
| theming for the application. It builds upon the base theming provided by |
| `infra-sk` and introduces application-specific overrides and additions. |
| |
| **Why and How:** |
| |
| The primary goal of this module is to establish a consistent and branded look |
| and feel across the application. Instead of defining all styles from scratch, it |
| leverages the `infra-sk` theming library as a foundation. This promotes code |
| reuse and ensures that common UI elements have a familiar appearance. |
| |
| The approach taken is to: |
| |
| 1. **Import Base Styles:** The `themes.scss` file begins by importing the core |
| styles from `../../../infra-sk/themes`. This brings in the foundational |
| design system, including color palettes, typography, spacing, and component |
| styles. |
| 2. **Import External Resources:** It also imports the Material Icons font |
| library directly from Google Fonts |
| (`https://fonts.googleapis.com/icon?family=Material+Icons`). This makes a |
| wide range of standard icons readily available for use within the |
| application's UI. |
| 3. **Define Application-Specific Overrides and Additions:** The core principle |
| is to only define _deltas_ from the base `infra-sk` theme and global changes |
| from `elements-sk` components. This means that `themes.scss` focuses on |
| styling aspects that are unique to this specific application or require |
| modifications to the default `infra-sk` appearance. |
| |
| **Key Components and Files:** |
| |
| - **`themes.scss`**: This is the central SCSS (Sassy CSS) file for the module. |
| |
| - **Responsibility:** It orchestrates the application's theme by importing |
| base styles, external resources, and defining application-specific |
| styling rules. |
| - **Implementation Details:** |
| - `@import '../../../infra-sk/themes';`: This line incorporates the |
| foundational theme from the `infra-sk` library. The relative path |
| indicates that `infra-sk` is expected to be a sibling or ancestor |
| directory in the project structure. |
| - `@import |
| url('https://fonts.googleapis.com/icon?family=Material+Icons');`: This |
| directive pulls in the Material Icons font stylesheet, enabling the use |
| of standard Google Material Design icons throughout the application. |
| - `body { margin: 0; padding: 0; }`: This is an example of a global |
| override. It resets the default browser margins and padding on the |
| `<body>` element, providing a cleaner baseline for layout. This is a |
| common practice to ensure consistent spacing across different browsers. |
| Other application-specific styles would follow this pattern, targeting |
| specific elements or defining new CSS classes. |
| |
| - **`BUILD.bazel`**: This file defines how the `themes.scss` file is processed |
| and made available to the rest of the application. |
| |
| - **Responsibility:** It uses the `sass_library` rule (defined in |
| `//infra-sk:index.bzl`) to compile the SCSS into CSS and declare it as a |
| reusable library. |
| - **Implementation Details:** |
| - `load("//infra-sk:index.bzl", "sass_library")`: Imports the necessary |
| Bazel rule for handling SASS compilation. |
| - `sass_library(name = "themes_sass_lib", ...)`: Defines a SASS library |
| target named `themes_sass_lib`. |
| - `srcs = ["themes.scss"]`: Specifies that `themes.scss` is the source |
| file for this library. |
| - `visibility = ["//visibility:public"]`: Makes this compiled CSS |
| library accessible to any other part of the project. |
| - `deps = ["//infra-sk:themes_sass_lib"]`: Declares a dependency on |
| the `infra-sk` SASS library. This is crucial because `themes.scss` |
| imports styles from `infra-sk`. The build system needs to know about |
| this dependency to ensure `infra-sk` styles are available during the |
| compilation of `themes.scss`. |
| |
| **Workflow (Styling Application):** |
| |
| ``` |
| Browser Request --> HTML Document |
| | |
| v |
| Link to Compiled CSS (from themes_sass_lib) |
| | |
| v |
| Application of Styles: |
| 1. Base browser styles |
| 2. infra-sk/themes.scss styles (imported) |
| 3. Material Icons styles (imported) |
| 4. modules/themes/themes.scss overrides & additions (applied last, taking precedence) |
| | |
| v |
| Rendered Page with Application-Specific Theme |
| ``` |
| |
| In essence, this module provides a layered approach to theming. It starts with a |
| robust base, incorporates external resources like icon fonts, and then applies |
| specific customizations to achieve the desired visual identity for the |
| application. The `BUILD.bazel` file ensures that these SASS files are correctly |
| processed and made available as CSS to the application during the build process. |
| |
| # Module: /modules/trace-details-formatter |
| |
| This module provides a mechanism for formatting trace details and converting |
| trace strings into query strings. The core idea is to offer a flexible way to |
| represent and interpret trace information, accommodating different formatting |
| conventions, particularly for Chrome-specific trace structures. |
| |
| The "why" behind this module stems from the need to handle various trace |
| formats. Different systems or parts of the application might represent trace |
| identifiers (which are essentially a collection of parameters) in distinct ways. |
| This module centralizes the logic for translating between these representations. |
| For example, a compact string representation of a trace might be used in URLs or |
| displays, while a more structured `ParamSet` is needed for querying data. |
| |
| The "how" is achieved through an interface `TraceFormatter` and concrete |
| implementations. This allows for different formatting strategies to be plugged |
| in as needed. The `GetTraceFormatter()` function acts as a factory, returning |
| the appropriate formatter based on the application's configuration |
| (`window.perf.trace_format`). |
| |
| **Key Components/Files:** |
| |
| - **`traceformatter.ts`**: This is the central file containing the core logic. |
| |
| - **`TraceFormatter` interface**: Defines the contract for all trace |
| formatters. It mandates two primary methods: |
| - `formatTrace(params: Params): string`: Takes a `Params` object (a |
| key-value map representing trace parameters) and returns a string |
| representation of the trace. This is useful for displaying trace |
| identifiers in a user-friendly or system-specific format. |
| - `formatQuery(trace: string): string`: Takes a string representation of a |
| trace and converts it into a query string (e.g., |
| "key1=value1&key2=value2"). This is crucial for constructing API |
| requests to fetch data related to a specific trace. |
| - **`DefaultTraceFormatter` class**: Provides a basic implementation of |
| `TraceFormatter`. |
| - Its `formatTrace` method generates a string like "Trace ID: |
| ,key1=value1,key2=value2,...". This is a generic way to represent the |
| trace parameters. |
| - Its `formatQuery` method currently returns an empty string, indicating |
| that this default formatter doesn't have a specific logic for converting |
| its trace string representation back into a query. |
| - **`ChromeTraceFormatter` class**: Implements `TraceFormatter` |
| specifically for traces originating from Chrome's performance |
| infrastructure. |
| - **Why `ChromeTraceFormatter`?** Chrome's performance data often uses a |
| hierarchical, slash-separated string to identify traces (e.g., |
| `master/bot/benchmark/test/subtest_1`). This formatter handles this |
| specific convention. |
| - **`keys` array**: This private property (`['master', 'bot', 'benchmark', |
| 'test', 'subtest_1', 'subtest_2', 'subtest_3']`) defines the expected |
| order of parameters in the Chrome-style trace string. This order is |
| significant for both formatting and parsing. |
| - **`formatTrace(params: Params): string`**: It iterates through the
|   predefined `keys` and constructs a slash-separated string from the
|   corresponding values in the input `params`. For example, with `keys`
|   ordered `["master", "bot", "benchmark", "test", ...]`, the input
|   `{ master: "m", bot: "b", benchmark: "bm", test: "t" }` produces the
|   output string `"m/b/bm/t"`.
| - **`formatQuery(trace: string): string`**: This is the inverse operation.
|   It takes a slash-separated trace string, splits it, and maps the parts
|   back to the predefined `keys` to build a `ParamSet`, which it then
|   converts into a standard URL query string.
|   - **Handling Statistics (ad-hoc logic for the Chromeperf/Skia bridge)**:
|     A special piece of logic exists within `formatQuery` related to
|     `window.perf.enable_skia_bridge_aggregation`. If a trace's 'test'
|     value ends with a known statistic suffix (e.g., `_avg`, `_count`),
|     this suffix is used to determine the `stat` parameter in the output
|     query, and the suffix is removed from the 'test' parameter. If no
|     such suffix is found, a default `stat` value of 'value' is added.
|     This logic is a temporary measure to bridge formatting differences
|     between Chromeperf and Skia systems and is intended to be removed
|     once Chromeperf is deprecated. For example, with
|     `enable_skia_bridge_aggregation = true`, the trace string
|     `"master/bot/benchmark/test_name_max/subtest"` splits into
|     `["master", "bot", "benchmark", "test_name_max", "subtest"]`, which is
|     processed into the `ParamSet` `{ master: ["master"], bot: ["bot"],
|     benchmark: ["benchmark"], test: ["test_name"], stat: ["max"],
|     subtest_1: ["subtest"] }` and emitted as the query
|     `"master=master&bot=bot&benchmark=benchmark&test=test_name&stat=max&subtest_1=subtest"`.
| - **`STATISTIC_SUFFIX_TO_VALUE_MAP`**: A map used by |
| `ChromeTraceFormatter` to translate common statistic suffixes (like |
| "avg", "count") found in test names to their corresponding "stat" |
| parameter values (like "value", "count"). |
| - **`traceFormatterRecords`**: A record (object map) that associates |
| `TraceFormat` enum values (like `''` for default, `'chrome'` for |
| Chrome-specific) with their corresponding `TraceFormatter` instances. |
| This acts as a registry for available formatters. |
| - **`GetTraceFormatter()` function**: This is the public entry point for |
| obtaining a trace formatter. It reads `window.perf.trace_format` (a |
| global configuration setting) and returns the appropriate formatter |
| instance from `traceFormatterRecords`. If the format is not found, it |
| defaults to `DefaultTraceFormatter`. |
| |
| ``` |
| Global Config: window.perf.trace_format = "chrome" |
| | |
| v |
| GetTraceFormatter() |
| | |
| v |
| traceFormatterRecords["chrome"] |
| | |
| v |
| Returns new ChromeTraceFormatter() instance |
| ``` |
| |
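| The round trip implemented by `ChromeTraceFormatter` can be sketched in a few
| lines. This is a simplification: it omits the statistic-suffix handling and
| uses plain `URLSearchParams` in place of `fromParamSet`:
|
| ```
| const KEYS = [
|   'master',
|   'bot',
|   'benchmark',
|   'test',
|   'subtest_1',
|   'subtest_2',
|   'subtest_3',
| ];
|
| type Params = { [key: string]: string };
|
| function formatTrace(params: Params): string {
|   // Join the values in the fixed key order, skipping absent keys.
|   return KEYS.filter((k) => params[k])
|     .map((k) => params[k])
|     .join('/');
| }
|
| function formatQuery(trace: string): string {
|   // Map slash-separated parts back onto the fixed key order.
|   const qs = new URLSearchParams();
|   trace.split('/').forEach((v, i) => {
|     if (i < KEYS.length) qs.set(KEYS[i], v);
|   });
|   return qs.toString();
| }
|
| // formatTrace({ master: 'm', bot: 'b', benchmark: 'bm', test: 't' })
| //   -> 'm/b/bm/t'
| // formatQuery('m/b/bm/t') -> 'master=m&bot=b&benchmark=bm&test=t'
| ```
|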
| - **`traceformatter_test.ts`**: Contains unit tests for the |
| `ChromeTraceFormatter`, specifically focusing on the `formatQuery` method |
| and its logic for handling statistic suffixes under different configurations |
| of `window.perf.enable_skia_bridge_aggregation`. |
| |
| This module depends on: |
| |
| - `infra-sk/modules:query_ts_lib`: For the `fromParamSet` function, used to |
| convert a `ParamSet` object into a URL query string. |
| - `perf/modules/json:index_ts_lib`: For type definitions like `Params`, |
| `ParamSet`, and `TraceFormat`. |
| - `perf/modules/paramtools:index_ts_lib`: For the `makeKey` function, used by |
| `DefaultTraceFormatter` to create a string representation of a `Params` |
| object. |
| - `perf/modules/window:window_ts_lib`: To access global configuration values |
| like `window.perf.trace_format` and |
| `window.perf.enable_skia_bridge_aggregation`. |
| |
| # Module: /modules/triage-menu-sk |
| |
| The `triage-menu-sk` module provides a user interface element for managing and |
| triaging anomalies in bulk. It's designed to streamline the process of handling |
| multiple performance regressions or improvements detected in data. |
| |
| The core purpose of this module is to allow users to efficiently take action on |
| a set of selected anomalies. Instead of interacting with each anomaly |
| individually, this menu provides centralized controls for common triage |
| operations. This is crucial for workflows where many anomalies might be |
| identified simultaneously, requiring a quick and consistent way to categorize or |
| address them. |
| |
| Key responsibilities and components: |
| |
| - **`triage-menu-sk.ts`**: This is the heart of the module, defining the |
| `TriageMenuSk` custom element. |
| |
| - **Anomaly Aggregation**: It receives a list of `Anomaly` objects and |
| associated `trace_names`. This allows it to operate on multiple |
| anomalies at once. |
| - **Action Buttons**: It renders buttons for common triage actions: |
| - **"New Bug"**: Triggers the `new-bug-dialog-sk` element, allowing the |
| user to create a new bug report associated with the selected anomalies. |
| - **"Existing Bug"**: Triggers the `existing-bug-dialog-sk` element, |
| enabling the user to link the selected anomalies to an already existing |
| bug. |
| - **"Ignore"**: Marks the selected anomalies as "Ignored". This is useful |
| for anomalies that are deemed not actionable or are false positives. |
| - **Nudging Functionality**: |
| - The `NudgeEntry` class and related logic (`generateNudgeButtons`, |
| `nudgeAnomaly`, `makeNudgeRequest`) allow users to adjust the perceived |
| start and end points of an anomaly. This is a subtle but important |
| feature for refining the automated anomaly detection. The UI presents a |
| set of buttons (e.g., -2, -1, 0, +1, +2) that shift the anomaly's |
| boundaries. |
| - The `_allowNudge` flag controls whether the nudge buttons are visible, |
| allowing for contexts where nudging might not be appropriate (e.g., when |
| multiple, disparate anomalies are selected). |
| - **State Management**: It maintains the state of the selected anomalies |
| (`_anomalies`, `_trace_names`) and the nudge options (`_nudgeList`). |
| - **Communication with Backend**: The `makeEditAnomalyRequest` and |
| `makeNudgeRequest` methods handle sending HTTP POST requests to the |
| `/_/triage/edit_anomalies` endpoint. This endpoint is responsible for |
| persisting the triage decisions (bug associations, ignore status, nudge |
| adjustments) in the backend database. |
| - The `editAction` parameter in `makeEditAnomalyRequest` can take values |
| like `IGNORE`, `RESET` (to de-associate bugs), or implicitly associate |
| with a bug ID when called from the bug dialogs. |
| - **Event Emission**: It emits an `anomaly-changed` custom event. This |
| event signals to parent components (likely a component displaying a list |
| or plot of anomalies) that one or more anomalies have been modified and |
| their representation needs to be updated. The event detail includes the |
| affected `traceNames`, the `editAction` performed, and the updated |
| `anomalies`. |
| |
| - **Integration with Dialogs**: |
| |
| - It directly embeds and interacts with `new-bug-dialog-sk` and |
| `existing-bug-dialog-sk`. When the user clicks "New Bug" or "Existing |
| Bug", this element calls the respective `open()` methods on these dialog |
| components. |
| - It passes the currently selected anomalies and trace names to these |
| dialogs using their `setAnomalies` methods, so the dialogs know which |
| anomalies the bug report will be associated with. |
| |
| - **`triage-menu-sk.html` (Implicit via Lit template in `.ts`)**: Defines the |
| visual structure of the menu, including the layout of the action buttons and |
| the nudge buttons. The rendering is dynamic based on the number of selected |
| anomalies and whether nudging is allowed. |
| |
| - **`triage-menu-sk.scss`**: Provides the styling for the menu, ensuring it |
| integrates visually with the surrounding application. |
| |
| **Key Workflow Example (Ignoring Anomalies):** |
| |
| 1. **User Selects Anomalies**: In a parent component (e.g., a plot or a list), |
| the user selects one or more anomalies. |
| 2. **`triage-menu-sk` Receives Data**: The parent component calls |
| `triageMenuSkElement.setAnomalies(selectedAnomalies, |
| correspondingTraceNames, nudgeOptions)`. |
| 3. **Menu Updates**: `triage-menu-sk` re-renders, enabling the "Ignore" button
|    (and potentially others): `User selects anomalies -> Parent component ->
|    triage-menu-sk.setAnomalies() -> UI renders (buttons enabled)`
| 4. **User Clicks "Ignore"**: `User click ("Ignore") ->
|    triage-menu-sk.ignoreAnomaly() -> makeEditAnomalyRequest(anomalies, traces,
|    "IGNORE") -> POST /_/triage/edit_anomalies -> backend processes -> HTTP 200
|    OK -> dispatch "anomaly-changed" event`
| 5. **Backend Interaction**: `makeEditAnomalyRequest` is called. It constructs a |
| JSON payload with the anomaly keys, trace names, and the action "IGNORE". |
| This payload is sent to `/_/triage/edit_anomalies`. |
| 6. **Event Notification**: Upon a successful response from the backend, |
| `triage-menu-sk` updates the local state of the anomalies (setting `bug_id` |
| to -2 for ignored anomalies) and dispatches the `anomaly-changed` event. |
| 7. **Parent Component Reacts**: The parent component listens for |
| `anomaly-changed` and updates its display to reflect that the anomalies are |
| now ignored (e.g., by changing their color, removing them from an active |
| list). |
| |
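| A sketch of the ignore action end to end; the payload field names are
| assumptions based on the description above:
|
| ```
| async function ignoreAnomalies(
|   host: HTMLElement,
|   anomalyKeys: string[],
|   traceNames: string[]
| ): Promise<void> {
|   const resp = await fetch('/_/triage/edit_anomalies', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify({
|       keys: anomalyKeys,
|       trace_names: traceNames,
|       action: 'IGNORE',
|     }),
|   });
|   if (!resp.ok) {
|     throw new Error(`edit_anomalies failed: ${resp.status}`);
|   }
|   // Tell parent components to refresh their view of these anomalies.
|   host.dispatchEvent(
|     new CustomEvent('anomaly-changed', {
|       detail: { traceNames, editAction: 'IGNORE' },
|       bubbles: true,
|     })
|   );
| }
| ```
|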
| The design decision to have `triage-menu-sk` orchestrate calls to the backend |
| and then emit a generic `anomaly-changed` event decouples it from the specifics |
| of how anomalies are displayed. Parent components only need to know that |
| anomalies have changed and can react accordingly. The use of dedicated dialog |
| components (`new-bug-dialog-sk`, `existing-bug-dialog-sk`) encapsulates the |
| complexity of bug reporting, keeping the triage menu itself focused on |
| initiating these actions. |
| |
| # Module: /modules/triage-page-sk |
| |
| ## Triage Page (`triage-page-sk`) |
| |
| The `triage-page-sk` module provides the user interface for viewing and triaging |
| regressions in performance data. It allows users to filter regressions based on |
| time range, commit status (all, regressions, untriaged), and alert |
| configurations. The primary goal is to present a clear overview of regressions |
| and facilitate the process of identifying their cause and impact. |
| |
| ### Responsibilities and Key Components |
| |
| The module is responsible for: |
| |
| - **Fetching and Displaying Regression Data:** It communicates with a backend |
| endpoint (`/_/reg/`) to retrieve regression information for a specified time |
| range and filter criteria. This data is then rendered in a tabular format, |
| showing commits along with any associated regressions. |
| - **State Management:** The component's state (selected time range, filters) |
| is reflected in the URL. This allows users to bookmark specific views or |
| share links to particular triage scenarios. The `stateReflector` utility |
| from `infra-sk/modules/statereflector` is used for this purpose. |
| - **User Interaction for Filtering:** It provides UI elements (select |
| dropdowns, date range pickers) for users to define what data they want to |
| see. Changes to these filters trigger new data fetches. |
| - **Triage Workflow:** When a user initiates a triage action on a specific |
| regression, a dialog (`<dialog>`) containing the `cluster-summary2-sk` |
| element is displayed. This dialog allows the user to view details of the |
| regression and assign a triage status (e.g., "positive", "negative", |
| "acknowledged"). |
| - **Communicating Triage Decisions:** Once a triage status is submitted, the |
| module sends this information to a backend endpoint (`/_/triage/`) to |
| persist the decision. |
| - **Displaying Triage Status:** Each regression in the table is visually |
| represented by a `triage-status-sk` element, which shows its current triage |
| state. |
| |
| ### Key Files |
| |
| - **`triage-page-sk.ts`**: This is the core TypeScript file defining the |
| `TriagePageSk` custom element. |
| - **Why**: It encapsulates all the logic for data fetching, rendering, |
| state management, and handling user interactions. It leverages Lit for |
| templating and rendering the UI. |
| - **How**: |
| - It defines a `State` interface to manage the component's configuration |
| (begin/end timestamps, subset filter, alert filter). |
| - The `connectedCallback` initializes the `stateReflector` to synchronize |
| the component's state with the URL. |
| - `updateRange()` is a crucial method that fetches regression data from |
| the `/_/reg/` endpoint whenever the state changes (e.g., date range or |
| filter selection). It uses the `fetch` API for network requests. |
| - The `template` function (using `lit/html`) defines the HTML structure of |
| the component, including the filter controls, the main table displaying |
| regressions, and the triage dialog. |
| - Event handlers like `commitsChange`, `filterChange`, `rangeChange`, |
| `triage_start`, and `triaged` manage user input and interactions with |
| child components. |
| - The `triage_start` method is triggered when a user wants to triage a |
| specific regression. It prepares the data for the `cluster-summary2-sk` |
| element and displays the triage dialog. |
| - The `triaged` method is called when the user submits a triage decision |
| from the `cluster-summary2-sk` dialog. It sends a POST request to |
| `/_/triage/` with the triage information. |
| - Helper methods like `stepUpAt`, `stepDownAt`, `alertAt`, etc., are used |
| to determine how to render cells in the regression table based on the |
| data received. |
| - `calc_all_filter_options` dynamically generates the list of available |
| alert filters based on categories returned from the backend. |
| - **`triage-page-sk.scss`**: Contains the SASS/CSS styles for the |
| `triage-page-sk` element. |
| - **Why**: To ensure the component has a consistent and appropriate visual |
| appearance within the application. |
| - **How**: It defines styles for the layout of the header, filter |
| sections, the regression table, and the triage dialog. It imports shared |
| styles for buttons, selects, and theming. |
| - **`triage-page-sk-demo.html` / `triage-page-sk-demo.ts`**: Provide a |
| demonstration page for the `triage-page-sk` element. |
| - **Why**: To allow developers to see the component in action and test its |
| basic functionality in isolation. |
| - **How**: The HTML file includes an instance of `<triage-page-sk>`. The |
| TypeScript file simply imports the main component to register it. |
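| 
| The following is a minimal sketch of the fetch performed by `updateRange()`
| as described above. The `State` shape and the `/_/reg/` endpoint follow the
| descriptions in this section, but the typing and error handling are
| simplified and illustrative, not the element's actual implementation:
| 
| ```typescript
| // Illustrative subset of the component's State interface.
| interface State {
|   begin: number; // Unix timestamp for the start of the range.
|   end: number; // Unix timestamp for the end of the range.
|   subset: string; // 'all' | 'regressions' | 'untriaged'.
|   alert_filter: string; // Which alert configurations to include.
| }
| 
| // Sketch: fetch regressions for the current state from /_/reg/.
| async function updateRange(state: State): Promise<void> {
|   const resp = await fetch('/_/reg/', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify(state),
|   });
|   if (!resp.ok) {
|     throw new Error(`Request failed: ${resp.status}`);
|   }
|   // RegressionRangeResponse: header, table, and categories.
|   const reg = await resp.json();
|   // ...store `reg` on the element and call _render().
| }
| ```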
| |
| ### Key Workflows |
| |
| **1. Initial Page Load and Data Fetch:** |
| |
| ``` |
| User navigates to page / URL with state parameters |
| | |
| V |
| triage-page-sk.connectedCallback() |
| | |
| V |
| stateReflector initializes state from URL (or defaults) |
| | |
| V |
| triage-page-sk.updateRange() |
| | |
| V |
| FETCH /_/reg/ with current state (begin, end, subset, alert_filter) |
| | |
| V |
| Backend responds with RegressionRangeResponse (header, table, categories) |
| | |
| V |
| triage-page-sk.reg is updated |
| | |
| V |
| triage-page-sk.calc_all_filter_options() (if categories present) |
| | |
| V |
| triage-page-sk._render() displays the regression table |
| ``` |
| |
| **2. User Changes Filter or Date Range:** |
| |
| ``` |
| User interacts with <select> (commits/filter) or <day-range-sk> |
| | |
| V |
| Event handler (e.g., commitsChange, filterChange, rangeChange) updates this.state |
| | |
| V |
| this.stateHasChanged() (triggers stateReflector to update URL) |
| | |
| V |
| triage-page-sk.updateRange() |
| | |
| V |
| FETCH /_/reg/ with new state |
| | |
| V |
| Backend responds with updated RegressionRangeResponse |
| | |
| V |
| triage-page-sk.reg is updated |
| | |
| V |
| triage-page-sk._render() re-renders the regression table with new data |
| ``` |
| |
| **3. User Initiates Triage:** |
| |
| ``` |
| User clicks on a regression in the table (within a <triage-status-sk> element) |
| | |
| V |
| <triage-status-sk> emits 'start-triage' event with details (alert, full_summary, cluster_type) |
| | |
| V |
| triage-page-sk.triage_start(event) |
| | |
| V |
| this.dialogState is populated with event.detail |
| | |
| V |
| this._render() (updates the <cluster-summary2-sk> properties within the dialog) |
| | |
| V |
| this.dialog.showModal() (displays the triage dialog) |
| ``` |
| |
| **4. User Submits Triage:** |
| |
| ``` |
| User interacts with <cluster-summary2-sk> in the dialog and clicks "Save" (or similar) |
| | |
| V |
| <cluster-summary2-sk> emits 'triaged' event with details (columnHeader, triage status) |
| | |
| V |
| triage-page-sk.triaged(event) |
| | |
| V |
| Constructs TriageRequest body (cid, triage, alert, cluster_type) |
| | |
| V |
| this.dialog.close() |
| | |
| V |
| this.triageInProgress = true; this._render() (shows spinner) |
| | |
| V |
| FETCH POST /_/triage/ with TriageRequest |
| | |
| V |
| Backend responds (e.g., with a bug link if applicable) |
| | |
| V |
| this.triageInProgress = false; this._render() (hides spinner) |
| | |
| V |
| (Optional) If json.bug exists, window.open(json.bug) |
| | |
| V |
| (Implicit) The <triage-status-sk> for the triaged item may update its display, or a full data refresh might be triggered if necessary to show the updated status. |
| ``` |
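| 
| A hedged sketch of the submission step in this workflow: the endpoint and
| field names follow the `TriageRequest` description above, while the exact
| types are simplified stand-ins.
| 
| ```typescript
| // Simplified TriageRequest; the real type lives in perf/modules/json.
| interface TriageRequest {
|   cid: number; // Identifies the commit being triaged.
|   triage: { status: string; message: string };
|   alert: unknown; // The Alert config that produced the regression.
|   cluster_type: string; // 'high' or 'low'.
| }
| 
| // Sketch of the POST performed by triaged().
| async function submitTriage(req: TriageRequest): Promise<void> {
|   const resp = await fetch('/_/triage/', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify(req),
|   });
|   const json = await resp.json();
|   // If the backend returns a bug link, open it in a new window.
|   if (json.bug) {
|     window.open(json.bug, '_blank');
|   }
| }
| ```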
| |
| ### Design Decisions |
| |
| - **State Reflection in URL:** The decision to reflect the component's state |
| (date range, filters) in the URL is crucial for shareability and |
| bookmarking. It allows users to return to a specific view of regressions or |
| share it with colleagues. |
| - **Component-Based Architecture:** The page is built using custom elements |
| (`triage-page-sk`, `commit-detail-sk`, `day-range-sk`, `triage-status-sk`, |
| `cluster-summary2-sk`). This promotes modularity, reusability, and |
| separation of concerns. Each component handles a specific piece of |
| functionality. |
| - **Asynchronous Operations:** Data fetching and triage submissions are |
| asynchronous operations handled using `fetch` and Promises. Spinners |
| (`spinner-sk`) are used to provide visual feedback to the user during these |
| operations. |
| - **Dedicated Triage Dialog:** Instead of inline editing, a modal dialog |
| (`<dialog>`) is used for the triage process. This provides a focused |
| interface for the user to review cluster details and make a triage decision |
| without cluttering the main regression table. |
| - **Dynamic Filter Options:** The "Which alerts to display" filter options are |
| dynamically populated based on the categories returned from the backend. |
| This ensures that the filter options are relevant to the current dataset. |
| - **Use of Lit for Templating:** Lit is used for its efficient rendering and |
| declarative templating, making it easier to manage the UI structure and |
| updates. |
| |
| The `triage-page-sk` serves as the central hub for users to actively engage with |
| and manage performance regressions, making it a critical component in the |
| performance monitoring workflow. |
| |
| # Module: /modules/triage-status-sk |
| |
| The `triage-status-sk` module provides a custom HTML element designed to |
| visually represent and interact with the triage status of a "cluster" within the |
| Perf application. A cluster, in this context, likely refers to a group of |
| related performance measurements or anomalies that require user attention and |
| classification (triaging). |
| |
| **Core Functionality & Design:** |
| |
| The primary purpose of this element is to offer a concise and interactive way |
| for users to understand the current triage state of a cluster and to initiate |
| the triaging process. |
| |
| 1. **Visual Indication:** The element displays a button. The appearance of this |
| button (specifically, an icon within it) changes based on the cluster's |
| triage status: "positive," "negative," or "untriaged." This provides an |
| immediate visual cue to the user. |
| |
| - **Why:** Direct visual feedback is crucial for quickly assessing the |
| state of many items in a list or dashboard. Instead of reading text, |
| users can rely on familiar icons. |
| - **How:** It leverages the `tricon2-sk` element to display the |
| appropriate icon based on the `triage.status` property. The styling for |
| these states is defined in `triage-status-sk.scss`, ensuring visual |
| consistency with the application's theme (including dark mode). |
| |
| 2. **Initiating Triage:** Clicking the button does not directly change the |
| triage status within this element. Instead, it emits a custom event named |
| `start-triage`. |
| |
| - **Why:** This follows a common pattern in web components where |
| individual components are responsible for a specific piece of UI and |
| interaction, but delegate more complex actions or state management to |
| parent components or application-level logic. This keeps the |
| `triage-status-sk` element focused and reusable. The actual triaging |
| process likely involves a dialog or a more complex UI, which is beyond |
| the scope of this simple button. |
| - **How:** The `_start_triage` method is invoked on button click. This |
| method constructs a `detail` object containing all relevant information |
| about the cluster (`full_summary`, current `triage` status, `alert` |
| configuration, `cluster_type`, and a reference to the element itself) |
| and dispatches the `start-triage` `CustomEvent`. |
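| 
| A minimal sketch of this dispatch pattern, with simplified stand-in types
| (the real element extends `ElementSk` and uses the JSON types noted below):
| 
| ```typescript
| // Sketch: a presentational element that delegates triage to its parent.
| class TriageStatusSketch extends HTMLElement {
|   triage = { status: 'untriaged', message: '' };
| 
|   startTriage(): void {
|     this.dispatchEvent(
|       new CustomEvent('start-triage', {
|         detail: {
|           triage: this.triage,
|           element: this, // Lets the handler update this element later.
|         },
|         bubbles: true, // Allows an ancestor component to handle the event.
|       })
|     );
|   }
| }
| ```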
| |
| **Key Components & Files:** |
| |
| - **`triage-status-sk.ts`:** This is the heart of the module, defining the |
| `TriageStatusSk` custom element class which extends `ElementSk`. |
| - **Properties:** It manages several key pieces of data as properties: |
| - `triage`: An object of type `TriageStatus` (defined in |
| `perf/modules/json`) holding the `status` ('positive', 'negative', |
| 'untriaged') and a `message` string. This is the primary driver for the |
| element's appearance. |
| - `full_summary`: Potentially detailed information about the cluster, of |
| type `FullSummary`. |
| - `alert`: Information about any alert configuration associated with the |
| cluster, of type `Alert`. |
| - `cluster_type`: A string ('high' or 'low'), likely indicating the |
| priority or type of the cluster. |
| - **Rendering:** It uses `lit-html` for templating |
| (`TriageStatusSk.template`). The template renders a `<button>` |
| containing a `tricon2-sk` element. The `class` of the button and the |
| `value` of the `tricon2-sk` are bound to `ele.triage.status`, |
| dynamically changing the appearance. |
| - **Event Dispatch:** The `_start_triage` method is responsible for |
| creating and dispatching the `start-triage` event. |
| - **`triage-status-sk.scss`:** Defines the visual styling for the |
| `triage-status-sk` element. It includes specific styles for the different |
| triage states (`.positive`, `.negative`, `.untriaged`) and their hover |
| states, ensuring they integrate with the application's themes (including |
| dark mode variables like `--positive`, `--negative`, `--surface`). |
| - **`index.ts`:** A simple entry point that imports and thereby registers the |
| `triage-status-sk` custom element, making it available for use in HTML. |
| - **`triage-status-sk-demo.html` & `triage-status-sk-demo.ts`:** These files |
| provide a demonstration page for the `triage-status-sk` element. |
| - The HTML sets up instances of the element in different theme contexts |
| (default and dark mode). |
| - The TypeScript file demonstrates how to listen for the `start-triage` |
| event and how to programmatically set the `triage` property of the |
| element. This is crucial for developers to understand how to integrate |
| and use the component. |
| - **`BUILD.bazel`:** Defines how the module is built and its dependencies. It |
| specifies `tricon2-sk` as a UI dependency and includes necessary SASS and |
| TypeScript libraries. |
| - **`triage-status-sk_puppeteer_test.ts`:** Contains Puppeteer-based tests to |
| ensure the element renders correctly and behaves as expected in a browser |
| environment. This is important for maintaining code quality and preventing |
| regressions. |
| |
| **Workflow Example: User Initiates Triage** |
| |
| ``` |
| User sees a triage-status-sk button (e.g., showing an 'untriaged' icon) |
| | |
| V |
| User clicks the button |
| | |
| V |
| [triage-status-sk.ts] _start_triage() method is called |
| | |
| V |
| [triage-status-sk.ts] Creates a 'detail' object with: |
| - triage: { status: 'untriaged', message: '...' } |
| - full_summary: { ... } |
| - alert: { ... } |
| - cluster_type: 'low' | 'high' |
| - element: (reference to itself) |
| | |
| V |
| [triage-status-sk.ts] Dispatches a 'start-triage' CustomEvent with the 'detail' object |
| | |
| V |
| [Parent Component/Application Logic] Listens for 'start-triage' event |
| | |
| V |
| [Parent Component/Application Logic] Receives event.detail |
| | |
| V |
| [Parent Component/Application Logic] Uses the received data to: |
| - Open a triage dialog |
| - Populate the dialog with cluster details |
| - Allow user to select a new triage status |
| - (Potentially) update the original triage-status-sk element's |
| 'triage' property after the dialog interaction is complete. |
| ``` |
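| 
| A sketch of the consuming side of this workflow; the dialog logic is
| application-specific, so only the event plumbing is shown:
| 
| ```typescript
| // Sketch: a parent listens for 'start-triage' and drives the triage UI.
| document.addEventListener('start-triage', (e: Event) => {
|   const detail = (e as CustomEvent).detail;
|   // Open a dialog populated from detail.full_summary, detail.alert, etc.
|   // After the user decides, write the result back to the originating
|   // element so its icon updates:
|   detail.element.triage = { status: 'positive', message: 'Expected.' };
| });
| ```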
| |
| This design allows `triage-status-sk` to be a focused, presentational component, |
| while the more complex logic of handling the triage process itself is managed |
| elsewhere in the application. This promotes separation of concerns and |
| reusability. |
| |
| # Module: /modules/triage2-sk |
| |
| The `triage2-sk` module provides a custom HTML element for selecting a triage |
| status. This element is designed to be a simple, reusable UI component for |
| indicating whether a particular item is "positive", "negative", or "untriaged". |
| Its primary purpose is to offer a standardized way to represent and interact |
| with triage states across different parts of the Perf application. |
| |
| The core of the module is the `triage2-sk` custom element, defined in |
| `triage2-sk.ts`. This element leverages the Lit library for templating and |
| rendering. It presents three buttons, each representing one of the triage |
| states: |
| |
| - **Positive:** Indicated by a check circle icon (`<check-circle-icon-sk>`). |
| - **Negative:** Indicated by a cancel icon (`<cancel-icon-sk>`). |
| - **Untriaged:** Indicated by a help icon (`<help-icon-sk>`). |
| |
| The "why" behind this design is to provide a clear visual representation of the |
| current triage status and an intuitive way for users to change it. By using |
| distinct icons and styling for each state, the element aims to reduce ambiguity. |
| |
| **Key Implementation Details:** |
| |
| - **`triage2-sk.ts`:** This is the main TypeScript file defining the |
| `TriageSk` class, which extends `ElementSk`. |
| |
| - **State Management:** The current triage state is managed by the `value` |
| attribute (and corresponding property). It can be one of "positive", |
| "negative", or "untriaged". If no value is provided, it defaults to |
| "untriaged". |
| - **Event Emission:** When the user clicks a button to change the triage
| state, the element dispatches a custom event named `change`. The
| `detail` property of this event contains the new triage status as a
| string (e.g., "positive"). This allows parent components to react to
| changes in the triage status (see the usage sketch after this list):
| 
| ```
| User clicks "Positive" button
| |
| V
| triage2-sk sets its 'value' attribute to "positive"
| |
| V
| triage2-sk dispatches a 'change' event with detail: "positive"
| ```
| - **Rendering:** The `template` static method uses Lit's `html` tagged |
| template literal to define the structure of the element. It dynamically |
| sets the `selected` attribute on the appropriate button based on the |
| current `value`. |
| - **Attribute Observation:** The element observes the `value` attribute. |
| When this attribute changes (either programmatically or through user |
| interaction), the `attributeChangedCallback` is triggered, which |
| re-renders the component and dispatches the `change` event. |
| - **Type Safety:** The `isStatus` function ensures that the `value` |
| property is always one of the allowed `Status` types, defaulting to |
| "untriaged" if an invalid value is encountered. This contributes to the |
| robustness of the component. |
| |
| - **`triage2-sk.scss`:** This file contains the SASS styles for the |
| `triage2-sk` element. |
| |
| - **Theming:** It defines styles for both a legacy color scheme and a |
| theme-based color scheme (including dark mode). This ensures the |
| component integrates visually with the rest of the application, |
| regardless of the active theme. The styling differentiates the selected |
| button and provides hover effects for better user feedback. The fill |
| colors of the icons change based on the triage state (e.g., green for |
| positive, red for negative). |
| |
| - **`index.ts`:** This file serves as the entry point for the module, |
| exporting the `TriageSk` class and ensuring the custom element is defined. |
| |
| - **Demo and Testing:** |
| |
| - `triage2-sk-demo.html` and `triage2-sk-demo.ts`: Provide a simple |
| demonstration page showcasing the element in various states and how to |
| listen for the `change` event. This is useful for manual testing and |
| visual inspection. |
| - `triage2-sk_test.ts`: Contains Karma unit tests that verify the event |
| emission and value changes of the component. |
| - `triage2-sk_puppeteer_test.ts`: Includes Puppeteer-based end-to-end |
| tests that check the rendering of the component in a browser environment |
| and capture screenshots for visual regression testing. |
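| 
| A small usage sketch, along the lines of the demo page described above:
| 
| ```typescript
| // Create a triage2-sk element, set its state, and react to changes.
| const el = document.createElement('triage2-sk');
| el.setAttribute('value', 'untriaged');
| el.addEventListener('change', (e: Event) => {
|   // detail is the new status: 'positive' | 'negative' | 'untriaged'.
|   console.log('New triage status:', (e as CustomEvent<string>).detail);
| });
| document.body.appendChild(el);
| ```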
| |
| The design choice of using custom elements and Lit allows for a modular and |
| maintainable component that can be easily integrated into larger applications. |
| The clear separation of concerns (logic in TypeScript, styling in SASS, and |
| structure in the template) follows common best practices for web component |
| development. |
| |
| # Module: /modules/tricon2-sk |
| |
| The `tricon2-sk` module provides a custom HTML element `<tricon2-sk>` designed |
| to visually represent triage states. This component is crucial for user |
| interfaces where quick identification of an item's status (e.g., in a bug |
| tracker, code review system, or monitoring dashboard) is necessary. |
| |
| The core idea is to offer a standardized, reusable icon that clearly |
| communicates whether an item is "positive," "negative," or "untriaged." This |
| avoids inconsistencies and reduces cognitive load for users who frequently |
| interact with such systems. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`tricon2-sk.ts`**: This is the heart of the module. It defines the |
| `TriconSk` class, which extends `ElementSk` (a base class for custom |
| elements in the Skia infrastructure). |
| |
| - **Purpose:** To render one of three specific icons based on its `value` |
| attribute. |
| - **Implementation:** |
| - It utilizes the `lit-html` library for templating, allowing for |
| efficient rendering and updates. |
| - A `static template` function determines which icon to display |
| (`check-circle-icon-sk` for "positive", `cancel-icon-sk` for "negative", |
| and `help-icon-sk` for "untriaged" or any other value). This design |
| centralizes the icon selection logic. |
| - The `value` attribute is the primary interface for controlling the |
| displayed icon. Changes to this attribute trigger a re-render via |
| `attributeChangedCallback` and `_render()`. |
| - The `connectedCallback` ensures that the `value` property is properly |
| initialized if set before the element is attached to the DOM. |
| - **Dependencies:** It imports specific icon components |
| (`check-circle-icon-sk`, `cancel-icon-sk`, `help-icon-sk`) from the |
| `elements-sk` module, promoting modularity and reuse of existing icon |
| assets. |
| |
| - **`tricon2-sk.scss`**: This file handles the styling of the `tricon2-sk` |
| element and its internal icons. |
| |
| - **Purpose:** To define the colors of the icons based on their state and |
| to ensure they adapt correctly to different themes (e.g., light and dark |
| mode). |
| - **Implementation:** |
| - It uses SASS for more organized and maintainable styles. |
| - Crucially, it defines CSS variables (e.g., `--green`, `--red`, |
| `--brown`) for the icon fill colors. This allows themes (defined in |
| `themes.scss`) to override these colors easily. |
| - Specific styles are also provided for when the element is within a |
| `.body-sk` context and when `.darkmode` is applied to `.body-sk`. This |
| ensures the icons maintain appropriate contrast and visibility across |
| different UI themes. The fallback hardcoded colors (`#388e3c`, etc.) |
| provide a default styling if CSS variables are not defined by a theme. |
| |
| - **`index.ts`**: This file serves as the main entry point for the module when |
| it's imported. Its sole responsibility is to import `tricon2-sk.ts`, which |
| in turn registers the `<tricon2-sk>` custom element. This is a common |
| pattern for organizing custom element definitions. |
| |
| - **`tricon2-sk-demo.html` and `tricon2-sk-demo.ts`**: These files create a |
| demonstration page for the `<tricon2-sk>` element. |
| |
| - **Purpose:** To showcase the different states of the `tricon2-sk` |
| element and how it appears in various theming contexts (default, with |
| `colors.css` theming, and with `themes.css` in both light and dark |
| modes). This is invaluable for development, testing, and documentation. |
| - **How it works:** The HTML file directly uses the `<tricon2-sk>` element |
| with different `value` attributes. The accompanying TypeScript file |
| simply imports the `index.ts` of the `tricon2-sk` module to ensure the |
| custom element is defined before the browser tries to render it. |
| |
| - **`tricon2-sk_puppeteer_test.ts`**: This file contains automated UI tests |
| for the `tricon2-sk` element using Puppeteer. |
| |
| - **Purpose:** To verify that the element renders correctly in different |
| states and to capture screenshots for visual regression testing. |
| - **How it works:** It loads the demo page (`tricon2-sk-demo.html`) in a |
| headless browser, checks if the expected number of `tricon2-sk` elements |
| are present (a basic smoke test), and then takes a screenshot of the |
| page. This ensures that changes to the component's appearance are caught |
| early. |
| |
| **Workflow: Displaying a Triage Icon** |
| |
| 1. **Usage:** An application includes the `<tricon2-sk>` element in its HTML, |
| setting the `value` attribute: |
| |
| ```html |
| <tricon2-sk value="positive"></tricon2-sk> |
| ``` |
| |
| 2. **Element Initialization (`tricon2-sk.ts`):** |
| |
| - The `TriconSk` class is instantiated. |
| - `connectedCallback` is called, ensuring the `value` property is |
| synchronized with the attribute. |
| - `_render()` is called. |
| |
| 3. **Template Selection (`tricon2-sk.ts`):** |
| |
| - The `static template` function is invoked. |
| - Based on `this.value` (e.g., "positive"), it returns the corresponding
| lit-html template: ``html`<check-circle-icon-sk></check-circle-icon-sk>` ``.
| |
| 4. **Icon Rendering:** |
| |
| - The selected icon component (e.g., `<check-circle-icon-sk>`) renders |
| itself. |
| |
| 5. **Styling (`tricon2-sk.scss`):**
| 
| - CSS rules are applied. For example, if the value is "positive":
| 
| ```scss
| tricon2-sk {
|   check-circle-icon-sk {
|     fill: var(--green); // Initially attempts to use the CSS variable.
|   }
| }
| ```
| 
| - If themes are active (e.g., `.body-sk.darkmode`), more specific rules
| might override the fill color:
| 
| ```scss
| .body-sk.darkmode tricon2-sk {
|   check-circle-icon-sk {
|     fill: #4caf50; // Specific dark mode color.
|   }
| }
| ```
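| 
| A hedged sketch of the template-selection logic described in step 3; the
| real template is a static lit-html function in `tricon2-sk.ts`, reduced
| here to a standalone helper:
| 
| ```typescript
| import { html, TemplateResult } from 'lit';
| 
| // Pick the icon template that corresponds to a triage value.
| function iconFor(value: string): TemplateResult {
|   switch (value) {
|     case 'positive':
|       return html`<check-circle-icon-sk></check-circle-icon-sk>`;
|     case 'negative':
|       return html`<cancel-icon-sk></cancel-icon-sk>`;
|     default:
|       // 'untriaged' and any unrecognized value fall back to the help icon.
|       return html`<help-icon-sk></help-icon-sk>`;
|   }
| }
| ```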
| |
| **Diagram: Attribute Change leading to Icon Update** |
| |
| ``` |
| [User/Application sets/changes 'value' attribute on <tricon2-sk>] |
| | |
| v |
| [<tricon2-sk> element] |
| | |
| +---------------------+ |
| | attributeChangedCallback() is triggered | |
| +---------------------+ |
| | |
| v |
| [this._render()] |
| | |
| v |
| [TriconSk.template(this)] <-- Reads current 'this.value' |
| | |
| +-------------+-------------+ |
| | (value is | (value is | (value is other) |
| | "positive") | "negative") | |
| v v v |
| [Returns [Returns [Returns |
| <check-...>] <cancel-...>] <help-...>] |
| | |
| v |
| [lit-html updates the DOM with the new icon template] |
| | |
| v |
| [Browser renders the new icon with appropriate CSS styles] |
| ``` |
| |
| The design decision to use distinct, imported icon components |
| (`check-circle-icon-sk`, etc.) rather than, for example, a single SVG sprite or |
| dynamically generating SVG paths, promotes better separation of concerns. Each |
| icon can be managed and updated independently. The use of CSS variables for |
| theming is a standard and flexible approach, allowing consuming applications to |
| easily adapt the icon colors to their specific look and feel without modifying |
| the component's core logic or styles directly. |
| |
| # Module: /modules/trybot |
| |
| The `trybot` module provides utilities for processing and analyzing results from |
| Perf trybots. Trybots are automated systems that run performance tests on code |
| changes before they are submitted. This module focuses on calculating and |
| presenting metrics that help developers understand the performance impact of |
| their changes. |
| |
| The core functionality revolves around aggregating and averaging `stddevRatio` |
| values across different parameter combinations. The `stddevRatio` is a key |
| metric representing the change in performance relative to the standard deviation |
| of the baseline. A positive `stddevRatio` generally indicates a performance |
| regression, while a negative value suggests an improvement. |
| |
| The primary goal is to help developers quickly identify which aspects of their |
| change (represented by key-value parameters like `model=GCE` or |
| `test=MyBenchmark`) are contributing most significantly to performance changes, |
| both positive and negative. By grouping results by these parameters and |
| calculating average `stddevRatio`, the module provides a summarized view that |
| highlights potential problem areas or confirms expected improvements. |
| |
| ### Key Components and Files: |
| |
| - **`calcs.ts`**: This file contains the logic for performing calculations on |
| trybot results. |
| |
| - **`byParams(res: TryBotResponse): AveForParam[]`**: This is the central |
| function of the module. |
| |
| - **Why**: Developers need a way to understand the overall performance |
| impact of their changes across various configurations (e.g., different |
| devices, tests, or operating systems). Simply looking at individual |
| trace results can be overwhelming. This function provides a summarized |
| view by grouping results by their parameters. |
| |
| - **How**: |
| |
| 1. It takes a `TryBotResponse` object, which contains a list of |
| individual test results (`res.results`). Each result includes a |
| `stddevRatio` and a set of `params` (key-value pairs describing the |
| test configuration). |
| 2. It iterates through each result and then through each key-value pair |
| within that result's `params`. |
| 3. For each unique `key=value` string (e.g., "model=GCE"), it maintains |
| a running total of `stddevRatio` values, the count of traces |
| contributing to this total (`n`), and counts of traces with positive |
| (`high`) or negative (`low`) `stddevRatio`. This aggregation happens |
| in the `runningTotals` object. |
| |
| ``` |
| Input TryBotResponse.results: |
| [ |
| { params: {arch: "arm", os: "android"}, stddevRatio: 1.5 }, |
| { params: {arch: "x86", os: "linux"}, stddevRatio: -0.5 }, |
| { params: {arch: "arm", os: "ios"}, stddevRatio: 2.0 } |
| ] |
| |
| -> runningTotals intermediate state (simplified): |
| "arch=arm": { totalStdDevRatio: 3.5, n: 2, high: 2, low: 0 } |
| "os=android": { totalStdDevRatio: 1.5, n: 1, high: 1, low: 0 } |
| "arch=x86": { totalStdDevRatio: -0.5, n: 1, high: 0, low: 1 } |
| "os=linux": { totalStdDevRatio: -0.5, n: 1, high: 0, low: 1 } |
| "os=ios": { totalStdDevRatio: 2.0, n: 1, high: 1, low: 0 } |
| ``` |
| |
| 4. After processing all results, it calculates the average |
| `stddevRatio` for each `key=value` pair by dividing |
| `totalStdDevRatio` by `n`. |
| |
| 5. It constructs an array of `AveForParam` objects. Each object |
| represents a `key=value` parameter and includes its calculated |
| average `stddevRatio`, the total number of traces (`n`) that matched |
| this parameter, and the counts of high and low `stddevRatio` traces. |
| |
| 6. Finally, it sorts this array in descending order based on the
| `aveStdDevRatio`. This crucial step brings the parameters with the
| largest positive average `stddevRatio` (the most likely regressions)
| to the top, making them easy to identify (see the sketch below).
| |
| - **`AveForParam` interface**: Defines the structure for the output of |
| `byParams`. It holds the aggregated average `stddevRatio` for a specific |
| `keyValue` pair, along with counts of traces. |
| |
| - **`runningTotal` interface**: An internal helper interface used during |
| the aggregation process within `byParams` to keep track of sums and |
| counts before the final average is computed. |
| |
| - **`calcs_test.ts`**: This file contains unit tests for the functions in |
| `calcs.ts`. |
| |
| - **Why**: To ensure the correctness of the calculation logic, especially |
| for edge cases (e.g., empty input) and the core averaging and sorting |
| functionality. |
| - **How**: It uses `chai` for assertions. Tests cover scenarios like: |
| - Empty input to `byParams` should return an empty list. |
| - Correct calculation of average `stddevRatio` for multiple traces sharing |
| common parameters. For example, if two traces have `test=1`, their |
| `stddevRatio` values should be averaged for the `test=1` entry in the |
| output. |
| - Ensuring the output is correctly sorted by `aveStdDevRatio` in |
| descending order. |
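| 
| The sketch below condenses the aggregation described above into a
| standalone function; the real types come from `perf/modules/json` and are
| reduced here to the fields the calculation needs:
| 
| ```typescript
| interface Result {
|   params: { [key: string]: string };
|   stddevRatio: number;
| }
| 
| interface AveForParam {
|   keyValue: string; // e.g., "model=GCE".
|   aveStdDevRatio: number;
|   n: number; // Number of traces matching this key=value pair.
|   high: number; // Traces with a positive stddevRatio.
|   low: number; // Traces with a negative stddevRatio.
| }
| 
| function byParams(results: Result[]): AveForParam[] {
|   const totals = new Map<
|     string,
|     { sum: number; n: number; high: number; low: number }
|   >();
|   for (const res of results) {
|     for (const [key, value] of Object.entries(res.params)) {
|       const id = `${key}=${value}`;
|       const t = totals.get(id) || { sum: 0, n: 0, high: 0, low: 0 };
|       t.sum += res.stddevRatio;
|       t.n++;
|       if (res.stddevRatio > 0) t.high++;
|       if (res.stddevRatio < 0) t.low++;
|       totals.set(id, t);
|     }
|   }
|   const ret: AveForParam[] = [];
|   totals.forEach((t, keyValue) => {
|     ret.push({
|       keyValue,
|       aveStdDevRatio: t.sum / t.n,
|       n: t.n,
|       high: t.high,
|       low: t.low,
|     });
|   });
|   // Largest average change first, so regressions surface at the top.
|   ret.sort((a, b) => b.aveStdDevRatio - a.aveStdDevRatio);
|   return ret;
| }
| ```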
| |
| ### Key Workflows/Processes: |
| |
| **Calculating Average StdDevRatio by Parameter:** |
| |
| ``` |
| TryBotResponse |
| | |
| v |
| byParams(response) |
| | |
| | 1. Initialize `runningTotals` (empty map) |
| | |
| | 2. For each `result` in `response.results`: |
| | | |
| | |-> For each `param` (key-value pair) in `result.params`: |
| | | |
| | |--> Generate `runningTotalsKey` (e.g., "model=GCE") |
| | |--> Retrieve or create `runningTotal` entry for `runningTotalsKey` |
| | |--> Update `totalStdDevRatio`, `n`, `high`, `low` in the entry |
| | |
| | 3. Initialize `ret` (empty array of AveForParam) |
| | |
| | 4. For each `runningTotalKey` in `runningTotals`: |
| | | |
| | |-> Calculate `aveStdDevRatio` = `runningTotal.totalStdDevRatio` / `runningTotal.n` |
| | |-> Create `AveForParam` object |
| | |-> Push to `ret` |
| | |
| | 5. Sort `ret` by `aveStdDevRatio` (descending) |
| | |
| v |
| Array of AveForParam |
| ``` |
| |
| This workflow allows users to quickly pinpoint which configuration parameters |
| (like specific device models, operating systems, or test names) are associated |
| with the most significant average performance changes in a given trybot run. The |
| sorting ensures that the most impactful parameters are immediately visible. |
| |
| # Module: /modules/trybot-page-sk |
| |
| The `trybot-page-sk` module provides a user interface for analyzing performance |
| regressions. It allows users to select either a specific commit from the |
| repository or a trybot run (representing a potential code change) and then |
| analyze performance metrics associated with that selection. The core purpose is |
| to help developers identify and understand performance impacts before or after |
| code submission. |
| |
| **Key Responsibilities and Components:** |
| |
| - **User Input and Selection:** |
| |
| - The page is organized into two main tabs: "Commit" and "TryBot". This |
| separation allows users to focus on either analyzing historical |
| performance data or evaluating the impact of pending changes. |
| - **Commit Analysis:** Users can select a specific commit using the |
| `commit-detail-picker-sk` element. This allows them to investigate |
| performance regressions that might have been introduced by a particular |
| code change. |
| - **TryBot Analysis:** (The "TryBot" tab is present in the UI template but |
| its functionality for selecting trybot runs, CLs, and patch numbers is |
| not fully detailed in the provided `trybot-page-sk.ts`. It appears to be |
| a planned feature or a more complex interaction than commit selection.) |
| The underlying `TryBotRequest` interface includes fields like `cl` and |
| `patch_number`, indicating the intent to support this. |
| - Once a commit (or eventually a trybot run) is selected, users define the |
| scope of the analysis by specifying a query using `query-sk`. This query |
| filters the performance traces to be considered (e.g., focusing on |
| specific benchmarks, configurations, or architectures). |
| - The `paramset-sk` and `query-count-sk` elements provide feedback on the |
| current query, showing the matching parameters and the number of traces |
| that fit the criteria. This helps users refine their query to target the |
| relevant data. |
| |
| - **Data Fetching and Processing:** |
| |
| - When the user clicks the "Run" button, the `run` method is invoked. This |
| method constructs a `TryBotRequest` object based on the user's |
| selections (commit number, query, or eventually CL/patch details). |
| - It sends this request to the `/_/trybot/load/` backend endpoint. This |
| endpoint is responsible for fetching the relevant performance data |
| (trace values, headers, parameter sets) for the specified commit/trybot |
| and query. The `startRequest` utility handles the asynchronous request |
| and displays progress using a `spinner-sk`. |
| - The response (`TryBotResponse`) contains the performance data, |
| including: |
| - `results`: An array of individual trace results, each containing |
| parameter values (`params`), actual metric values (`values`), and a |
| `stddevRatio` (how many standard deviations the trace's value is from |
| the median of its historical data). |
| - `paramset`: The complete set of parameters found across all returned |
| traces. |
| - `header`: Information about the data points in each trace, likely |
| including timestamps. |
| - The received data is then processed. Notably, the `byParams` function |
| (from `../trybot/calcs`) is used to aggregate results by parameter |
| key-value pairs, calculating average standard deviation ratios, counts, |
| and high/low values for each group. This helps identify which parameters |
| are most strongly correlated with performance changes. |
| |
| - **Results Display and Visualization:** |
| |
| - The results are presented in two tabs: "Individual" and "By Params". |
| - **Individual Tab:** |
| - Lists individual traces that match the query, showing their |
| parameters, standard deviation ratio, and an option to plot them. |
| - To avoid overwhelming the user, only the head and tail of long lists |
| are displayed. |
| - Clicking the plot icon (`timeline-icon-sk`) for a trace renders its |
| values over time on a `plot-simple-sk` element. Users can CTRL-click |
| to plot multiple traces on the same graph for comparison. |
| - The table intelligently displays parameter values, showing "〃" if a |
| value is the same as the row above it and "∅" if a parameter doesn't |
| exist for a trace. |
| - **By Params Tab:** |
| - Displays the aggregated results from the `byParams` calculation. For |
| each parameter key-value pair (e.g., "config=gles"), it shows the |
| average standard deviation ratio, the number of traces (N) in that |
| group, and the highest/lowest individual trace values. |
| - This view helps quickly identify which specific parameter values are |
| associated with significant performance deviations. |
| - Similar to the individual tab, users can click a plot icon to |
| visualize a group of traces. Up to `maxByParamsPlot` traces from the |
| selected group (sorted by `stddevRatio`) are plotted on a separate |
| `plot-simple-sk`. |
| - When a trace is focused on the "By Params" plot (e.g., by hovering), |
| its full trace ID and its parameter set are displayed below the plot |
| using `by-params-traceid` and `by-params-paramset` respectively. |
| `paramset-sk` is used to display the parameters, highlighting the |
| ones belonging to the focused trace. |
| |
| - **State Management:** |
| |
| - The component uses `stateReflector` to synchronize its internal state |
| (`this.state`, which is a `TryBotRequest` object) with the URL. This |
| means that the selected commit, query, and analysis type ("commit" or |
| "trybot") are reflected in the URL query parameters. This allows users |
| to bookmark or share specific analysis views. |
| - Changes to the commit selection, query, or tab selection trigger |
| `stateHasChanged()`, which updates the URL via `stateReflector` and |
| re-renders the component. |
| |
| - **Styling and Structure:** |
| |
| - The `trybot-page-sk.scss` file defines the visual appearance and layout |
| of the component, including styles for the query section, results |
| tables, and plot areas. |
| - The component is built using Lit templates, enabling reactive updates to |
| the DOM when the underlying state changes. |
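| 
| A simplified sketch of the "Run" request described above; the
| `TryBotRequest` fields follow the prose, while the response handling is
| condensed for illustration:
| 
| ```typescript
| interface TryBotRequest {
|   kind: 'commit' | 'trybot';
|   commit_number?: number; // Set when kind === 'commit'.
|   cl?: string; // Set when kind === 'trybot'.
|   patch_number?: number;
|   query: string; // Filters the traces to analyze.
| }
| 
| // Sketch: POST the request and receive a TryBotResponse.
| async function run(state: TryBotRequest): Promise<void> {
|   const resp = await fetch('/_/trybot/load/', {
|     method: 'POST',
|     headers: { 'Content-Type': 'application/json' },
|     body: JSON.stringify(state),
|   });
|   if (!resp.ok) {
|     throw new Error(`trybot load failed: ${resp.status}`);
|   }
|   // TryBotResponse: results, paramset, header.
|   const results = await resp.json();
|   // Aggregate per key=value pair for the "By Params" tab, then render.
| }
| ```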
| |
| **Workflow Example (Commit Analysis):** |
| |
| 1. **User Selects Tab:** User ensures the "Commit" tab is selected.
| 
| ```
| [tabs-sk] --selects index 0--> [TrybotPageSk.tabSelected]
| --> state.kind = "commit"
| --> stateHasChanged()
| ```
| 
| 2. **User Selects Commit:** User interacts with `commit-detail-picker-sk`.
| 
| ```
| [commit-detail-picker-sk] --commit-selected event--> [TrybotPageSk.commitSelected]
| --> state.commit_number = selected_commit_offset
| --> stateHasChanged()
| --> _render() (UI updates to show query section)
| ```
| 3. **User Enters Query:** User types into `query-sk`. |
| |
| ``` |
| [query-sk] --query-change event--> [TrybotPageSk.queryChange] |
| --> state.query = new_query_string |
| --> stateHasChanged() |
| --> _render() (paramset-sk summary updates) |
| |
| [query-sk] --query-change-delayed event--> [TrybotPageSk.queryChangeDelayed] |
| --> [query-count-sk].current_query = new_query_string (triggers count update) |
| ``` |
| |
| 4. **User Clicks "Run":**
| 
| ```
| [Run Button] --click--> [TrybotPageSk.run]
| --> spinner-sk.active = true
| --> startRequest('/_/trybot/load/', state, ...)
| --> HTTP POST to backend with { kind: "commit", commit_number: X, query: "Y" }
| <-- Backend responds with TryBotResponse (trace data, paramset, header)
| --> results = TryBotResponse
| --> byParams = byParams(results)
| --> spinner-sk.active = false
| --> _render() (results tables and plot areas become visible and populated)
| ```
| 
| 5. **User Interacts with Results:**
| 
| - **Plotting Individual Trace:**
| 
| ```
| [Timeline Icon in Individual Table] --click--> [TrybotPageSk.plotIndividualTrace(event, index)]
| --> individualPlot.addLines(...)
| --> displayedTrace = true
| --> _render() (individual plot becomes visible)
| ```
| 
| - **Plotting By Params Group:**
| 
| ```
| [Timeline Icon in By Params Table] --click--> [TrybotPageSk.plotByParamsTraces(event, index)]
| --> Filters results.results for matching key=value
| --> byParamsPlot.addLines(...)
| --> byParamsParamSet.paramsets = [ParamSet of plotted traces]
| --> displayedByParamsTrace = true
| --> _render() (by params plot and its paramset become visible)
| ```
| 
| - **Focusing Trace on By Params Plot:**
| 
| ```
| [by-params-plot] --trace_focused event--> [TrybotPageSk.byParamsTraceFocused]
| --> byParamsTraceID.innerText = focused_trace_name
| --> byParamsParamSet.highlight = fromKey(focused_trace_name)
| --> _render() (updates highlighted params in by-params-paramset)
| ```
| |
| The design emphasizes providing both a high-level overview of potential |
| regression areas (via "By Params") and the ability to drill down into individual |
| trace performance. The use of `stddevRatio` as a primary metric helps quantify |
| the significance of observed changes. |
| |
| # Module: /modules/user-issue-sk |
| |
| ## User Issue Management Element (`user-issue-sk`) |
| |
| The `user-issue-sk` module provides a custom HTML element for associating and |
| managing Buganizer issues with specific data points in the Perf application. |
| This allows users to directly link performance regressions or anomalies to their |
| corresponding bug reports, enhancing traceability and collaboration. |
| |
| **Why:** Tracking issues related to performance data is crucial for effective |
| debugging and resolution. This element centralizes the issue linking process |
| within the Perf UI, providing a seamless experience for users to add, view, and |
| remove bug associations. |
| |
| **How:** |
| |
| The core functionality revolves around the `UserIssueSk` LitElement class. This |
| class manages the display and interaction logic for associating a Buganizer |
| issue with a data point identified by its `trace_key` and `commit_position`. |
| |
| **Key Responsibilities and Components:** |
| |
| - **User Authentication:** The element first checks if a user is logged in |
| using `alogin-sk`. This is essential because only logged-in users can add or |
| remove issue associations. If a user is not logged in, they can only view |
| existing issue links. |
| - **State Management:** |
| - `bug_id`: This property determines the element's display. |
| - `bug_id === 0`: Indicates no Buganizer issue is associated with the data |
| point. The element will display an "Add Bug" button (if the user is |
| logged in). |
| - `bug_id > 0`: An existing Buganizer issue is linked. The element will |
| display a link to the bug and, if the user is logged in, a "close" icon |
| to remove the association. |
| - `bug_id === -1`: This is a special state where the element renders |
| nothing, effectively hiding itself. This might be used in scenarios |
| where issue linking is not applicable. |
| - `_text_input_active`: A boolean flag that controls the visibility of the |
| input field for entering a new bug ID. |
| - **Rendering Logic:** The `render()` method dynamically chooses between two |
| main templates based on the `bug_id` and login status: |
| - `addIssueTemplate()`: Shown when `bug_id === 0` and the user is logged |
| in. It initially displays an "Add Bug" button. Clicking this button |
| reveals an input field for the bug ID and confirm/cancel icons. |
| - `showLinkTemplate()`: Shown when `bug_id > 0`. It displays a formatted |
| link to the Buganizer issue (using `AnomalySk.formatBug`). If the user |
| is logged in, a "close" icon is also displayed to allow removal of the |
| issue link. |
| - **API Interaction:** |
| - `addIssue()`: Triggered when a user submits a new bug ID. It makes a |
| POST request to the `/_/user_issue/save` endpoint with the `trace_key`, |
| `commit_position`, and the new `issue_id`. |
| - `removeIssue()`: Triggered when a logged-in user clicks the "close" icon |
| next to an existing bug link. It makes a POST request to the |
| `/_/user_issue/delete` endpoint with the `trace_key` and |
| `commit_position`. |
| - **Event Dispatching:** After successfully adding or removing an issue, the |
| element dispatches a custom event named `user-issue-changed`. This event |
| bubbles up and carries a `detail` object containing the `trace_key`, |
| `commit_position`, and the new `bug_id`. This allows parent components or |
| other parts of the application to react to changes in issue associations |
| (e.g., by refreshing a list of user-reported issues). |
| - **Error Handling:** Uses the `errorMessage` utility from |
| `perf/modules/errorMessage` to display feedback to the user in case of API |
| errors or invalid input. |
| |
| **Key Files:** |
| |
| - **`user-issue-sk.ts`**: This is the heart of the module. It defines the |
| `UserIssueSk` LitElement, including its properties, styles, templates, and |
| logic for interacting with the backend API and handling user input. The |
| design focuses on conditional rendering based on the `bug_id` and user login |
| status. The API calls are standard `fetch` requests. |
| - **`index.ts`**: A simple entry point that imports and registers the |
| `user-issue-sk` custom element, making it available for use in HTML. |
| - **`BUILD.bazel`**: Defines the build dependencies for the element, including |
| `alogin-sk` for authentication, `anomaly-sk` for bug link formatting, icon |
| elements for the UI, and Lit libraries for web component development. |
| |
| **Workflows:** |
| |
| 1. **Adding a New Issue:**
| 
| ```
| User (logged in) sees "Add Bug" button
| --> User clicks "Add Bug"; activateTextInput() is called
| --> _text_input_active becomes true
| --> Element re-renders to show input field, check icon, close icon
| --> User types bug ID into input field; changeHandler() updates _input_val
| --> User clicks check icon; addIssue() is called
| --> Input validation (is _input_val > 0?)
| --> POST request to /_/user_issue/save with trace_key, commit_position, input_val
| --> On success:
|       bug_id is updated with _input_val
|       _input_val reset to 0; _text_input_active set to false
|       'user-issue-changed' event is dispatched
|       Element re-renders to show the new bug link and remove icon
| --> On failure:
|       errorMessage is displayed
|       hideTextInput() is called (resets state)
| ```
| 
| 2. **Viewing an Existing Issue:**
| 
| ```
| Element is initialized with bug_id > 0
| --> render() calls showLinkTemplate()
| --> A link to perf.bug_host_url + bug_id is displayed
| --> If user is logged in, a "close" icon is also displayed
| ```
| 
| 3. **Removing an Existing Issue:**
| 
| ```
| User (logged in) sees bug link and "close" icon
| --> User clicks "close" icon; removeIssue() is called
| --> POST request to /_/user_issue/delete with trace_key, commit_position
| --> On success:
|       bug_id is set to 0
|       _input_val reset to 0; _text_input_active set to false
|       'user-issue-changed' event is dispatched
|       Element re-renders to show "Add Bug" button
| --> On failure:
|       errorMessage is displayed
| ```
| |
| The design prioritizes a clear separation of concerns: display logic is handled |
| by LitElement's templating system, state is managed through properties, and |
| backend interactions are encapsulated in dedicated asynchronous methods. The use |
| of custom events allows for loose coupling with other components that might need |
| to react to changes in issue associations. |
| |
| # Module: /modules/window |
| |
| The `window` module is designed to provide utility functions related to the |
| browser's `window` object, specifically focusing on parsing and interpreting |
| configuration data embedded within it. This approach centralizes the logic for |
| accessing and processing global configurations, making it easier to manage and |
| test. |
| |
| A key responsibility of this module is to extract and process build tag |
| information. This information is often embedded in the `window.perf.image_tag` |
| global variable, which is expected to be an `SkPerfConfig` object (defined in |
| `//perf/modules/json:index_ts_lib`). The `getBuildTag` function is the primary |
| component for this task. |
| |
| The `getBuildTag` function takes an image tag string as input (or defaults to |
| `window.perf?.image_tag`). Its core purpose is to parse this string and |
| categorize the build tag. The function employs a specific parsing logic based on |
| the structure of the image tag: |
| |
| 1. **Initial Validation**: |
| |
| - The function first splits the input tag string by the `@` character. |
| - If there are fewer than two parts (i.e., no `@` or `@` is the first/last |
| character), it's considered an invalid tag. |
| - It then checks if the second part (after `@`) starts with `tag:`. If |
| not, it's also an invalid tag. |
| |
| ``` |
| Input Tag String |
| | |
| V |
| Split by '@' |
| | |
| V |
| Check for at least 2 parts AND second part starts with "tag:" |
| | |
| +-- No --> Invalid Tag |
| | |
| V |
| Proceed to type determination |
| ``` |
| |
| 2. **Tag Type Determination**: Based on the prefix of the raw tag (the part
| after `tag:`):
| 
| - **Git Tag**: If the raw tag starts with `tag:git-`, it's classified as a
| 'git' type. The function extracts the first 7 characters of the Git hash.
| 
| ```
| rawTag starts with "tag:git-"
| |
| V
| Type: 'git'   Tag: first 7 chars of Git hash
| ```
| 
| - **Louhi Build Tag**: If the raw tag is long enough (>= 38 characters) and
| contains `louhi` at a particular position (substring from index 25 to 30),
| it's classified as a 'louhi' type. The function extracts a 7-character
| identifier (substring from index 31 to 38), which typically represents a
| hash or version.
| 
| ```
| rawTag length >= 38 AND rawTag[25:30] == "louhi"
| |
| V
| Type: 'louhi'   Tag: rawTag[31:38]
| ```
| 
| - **Regular Tag**: If neither of the above conditions is met, it's
| considered a generic 'tag' type. The function returns the portion of the
| string after `tag:`.
| 
| ```
| Neither Git nor Louhi
| |
| V
| Type: 'tag'   Tag: rawTag after "tag:"
| ```
| |
| This structured approach ensures that different build tag formats can be |
| reliably identified and their relevant parts extracted. The decision to |
| differentiate between 'git', 'louhi', and generic 'tag' types allows downstream |
| consumers of this information to handle them appropriately. For instance, a |
| 'git' tag might be used to link to a specific commit, while a 'louhi' tag might |
| indicate a specific build from an internal CI system. |
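| 
| The parsing rules above can be condensed into the following sketch; the
| real `getBuildTag` lives in this module, and this simplified version only
| mirrors the rules as described:
| 
| ```typescript
| type BuildTag =
|   | { type: 'git' | 'louhi' | 'tag'; tag: string }
|   | { type: 'invalid'; tag: '' };
| 
| function getBuildTag(imageTag: string): BuildTag {
|   const parts = imageTag.split('@');
|   if (parts.length < 2 || !parts[1].startsWith('tag:')) {
|     return { type: 'invalid', tag: '' };
|   }
|   const rawTag = parts[1];
|   if (rawTag.startsWith('tag:git-')) {
|     // 'tag:git-<hash>' -> first 7 characters of the Git hash.
|     const hash = rawTag.slice('tag:git-'.length);
|     return { type: 'git', tag: hash.slice(0, 7) };
|   }
|   if (rawTag.length >= 38 && rawTag.substring(25, 30) === 'louhi') {
|     return { type: 'louhi', tag: rawTag.substring(31, 38) };
|   }
|   return { type: 'tag', tag: rawTag.slice('tag:'.length) };
| }
| ```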
| |
| The module also extends the global `Window` interface to declare the `perf: |
| SkPerfConfig` property. This is a TypeScript feature that provides type safety |
| when accessing `window.perf`, ensuring that developers are aware of its expected |
| structure. |
| |
| The `window_test.ts` file provides unit tests for the `getBuildTag` function, |
| covering various scenarios including valid git tags, Louhi build tags, arbitrary |
| tags, and different forms of invalid tags. These tests are crucial for verifying |
| the correctness of the parsing logic and ensuring that changes to the function |
| do not introduce regressions. The use of `chai` for assertions is a standard |
| practice for testing in this environment. |
| |
| # Module: /modules/word-cloud-sk |
| |
| The `word-cloud-sk` module provides a custom HTML element designed to visualize |
| key-value pairs and their relative frequencies. This is particularly useful for |
| displaying data from clusters or other datasets where understanding the |
| distribution of different attributes is important. |
| |
| The core idea is to present this frequency information in an easily digestible |
| format, combining textual representation with a simple bar graph for each item. |
| This allows users to quickly grasp the prevalence of certain key-value pairs |
| within a dataset. |
| |
| **Key Components and Responsibilities:** |
| |
| - **`word-cloud-sk.ts`**: This is the heart of the module, defining the |
| `WordCloudSk` custom element which extends `ElementSk`. |
| |
| - **Why**: It encapsulates the logic for rendering the word cloud. By |
| extending `ElementSk`, it leverages common functionalities provided by |
| the `infra-sk` library for custom elements. |
| - **How**: It uses the `lit-html` library for templating. The `items` |
| property, an array of `ValuePercent` objects (defined in |
| `//perf/modules/json:index_ts_lib`), is the primary input. Each |
| `ValuePercent` object contains a `value` (the key-value string) and a |
| `percent` (its frequency). |
| - The rendering logic iterates through the `items` and creates a table row |
| for each. Each row displays the key-value string, its percentage as |
| text, and a horizontal bar whose width is proportional to the |
| percentage. |
| - The `connectedCallback` ensures that if the `items` property is set |
| before the element is fully connected to the DOM, it's properly upgraded |
| and the element is rendered. |
| - The `_render()` method is called whenever the `items` property changes, |
| ensuring the display is updated. |
| |
| - **`word-cloud-sk.scss`**: This file contains the SASS styles for the |
| `word-cloud-sk` element. |
| |
| - **Why**: It provides the visual appearance of the word cloud, ensuring |
| it's readable and visually distinct. |
| - **How**: It defines styles for the table, table cells, and the |
| percentage bar. It uses CSS variables for theming (e.g., `--light-gray`, |
| `--on-surface`, `--primary`), allowing the component to adapt to |
| different themes (like light and dark mode) defined in |
| `//perf/modules/themes:themes_sass_lib` and |
| `//elements-sk/modules:colors_sass_lib`. |
| - Specific styles are applied for font family, size, padding, borders, and |
| the background color and height of the percentage bar. |
| |
| - **`word-cloud-sk-demo.html` and `word-cloud-sk-demo.ts`**: These files |
| provide a demonstration page for the `word-cloud-sk` element. |
| |
| - **Why**: They serve as a live example of how to use the component and |
| allow for easy visual testing and development. |
| - **How**: `word-cloud-sk-demo.html` includes multiple instances of the |
| `<word-cloud-sk>` tag, some within sections with different theming |
| (e.g., dark mode). `word-cloud-sk-demo.ts` then selects these instances |
| and populates their `items` property with sample data. This demonstrates |
| how the component can be instantiated and how data is passed to it. |
| |
| - **`index.ts`**: This file simply imports and thereby registers the |
| `word-cloud-sk` custom element. |
| |
| - **Why**: It acts as the entry point for the element, ensuring it's |
| defined when the module is imported. |
| |
| **Workflow: Data Display** |
| |
| The primary workflow involves providing data to the `word-cloud-sk` element and |
| its subsequent rendering: |
| |
| 1. **Instantiation**: An instance of `<word-cloud-sk>` is created in HTML. |
| |
| ``` |
| <word-cloud-sk></word-cloud-sk> |
| ``` |
| |
| 2. **Data Provision**: The `items` property of the element is set with an array |
| of `ValuePercent` objects. |
| |
| ``` |
| // In JavaScript/TypeScript: |
| const wordCloudElement = document.querySelector('word-cloud-sk'); |
| wordCloudElement.items = [ |
| { value: 'arch=x86', percent: 100 }, |
| { value: 'config=565', percent: 60 }, |
| // ... more items |
| ]; |
| ``` |
| |
| 3. **Rendering (`_render()` called in `word-cloud-sk.ts`)**: |
| |
| - The `WordCloudSk` element iterates through the `_items` array. |
| - For each item: |
| - A table row (`<tr>`) is generated. |
| - The `item.value` is displayed in the first cell (`<td>`). |
| - The `item.percent` is displayed as text (e.g., "60%") in the second |
| cell. |
| - A `<div>` element is created in the third cell. Its `width` style is |
| set to `item.percent` pixels, creating a visual bar representation |
| of the percentage. |
| |
| The overall structure rendered looks like this (simplified): |
| |
| ``` |
| <table> |
| <tr> <!-- For item 1 --> |
| <td class="value">[item1.value]</td> |
| <td class="textpercent">[item1.percent]%</td> |
| <td class="percent"> |
| <div style="width: [item1.percent]px"></div> |
| </td> |
| </tr> |
| <tr> <!-- For item 2 --> |
| <td class="value">[item2.value]</td> |
| <td class="textpercent">[item2.percent]%</td> |
| <td class="percent"> |
| <div style="width: [item2.percent]px"></div> |
| </td> |
| </tr> |
| <!-- ... more rows --> |
| </table> |
| ``` |
| |
| This process ensures that whenever the input data changes, the visual |
| representation of the word cloud is automatically updated. The use of CSS |
| variables for styling allows the component to seamlessly integrate into |
| applications with different visual themes. |
| |
| # Module: /nanostat |
| |
| ## Nanostat |
| |
| `nanostat` is a command-line tool designed to compare and analyze the results of |
| Skia's nanobench benchmark. It takes two JSON files generated by nanobench as |
| input, representing "old" and "new" benchmark runs, and provides a statistical |
| summary of the performance changes between them. This is particularly useful for |
| developers to understand the performance impact of their code changes. |
| |
| ### Why it exists |
| |
| When making changes to a codebase, especially one as performance-sensitive as a |
| graphics library like Skia, it's crucial to measure the impact on performance. |
| Nanobench produces detailed raw data, but interpreting this data directly can be |
| cumbersome. `nanostat` was created to: |
| |
| 1. **Automate Statistical Analysis:** Apply statistical tests (Mann-Whitney U |
| test or Welch's T-test) to determine if observed differences in benchmark |
| results are statistically significant or likely due to random variation. |
| 2. **Summarize Changes:** Present a concise, human-readable summary of |
| performance changes, highlighting significant regressions or improvements. |
| 3. **Facilitate Quick Comparisons:** Enable developers to quickly compare |
| benchmark runs before and after a code change, streamlining the performance |
| analysis workflow. |
| 4. **Provide Filtering and Sorting:** Offer options to filter out insignificant |
| changes, remove outliers, and sort results based on various criteria (e.g., |
| by the magnitude of change or by test name). |
| |
| ### How it works |
| |
| The core workflow of `nanostat` involves several steps: |
| |
| 1. **Input:** It accepts two file paths as command-line arguments, pointing to |
| the "old" and "new" nanobench JSON output files. |
| |
| ``` |
| nanostat [options] old.json new.json |
| ``` |
| |
| 2. **Parsing:** The `loadFileByName` function in `main.go` is responsible for |
| opening and parsing these JSON files. It uses the |
| `perf/go/ingest/format.ParseLegacyFormat` function to interpret the |
| nanobench output structure and then |
| `perf/go/ingest/parser.GetSamplesFromLegacyFormat` to extract the raw sample |
| values for each benchmark test. Each file's data is converted into a |
| `parser.SamplesSet`, which is a map where keys are test identifiers and |
| values are slices of performance measurements (samples). |
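| |
| In code, the loading step looks roughly like the sketch below. The import |
| paths and the exact signatures of `format.ParseLegacyFormat` and |
| `parser.GetSamplesFromLegacyFormat` are assumptions based on the |
| description above, not verbatim source: |
| |
| ``` |
| package main |
| |
| import ( |
|   "log" |
|   "os" |
| |
|   "go.skia.org/infra/perf/go/ingest/format" |
|   "go.skia.org/infra/perf/go/ingest/parser" |
| ) |
| |
| func loadFileByName(filename string) parser.SamplesSet { |
|   // Open the nanobench JSON output file. |
|   file, err := os.Open(filename) |
|   if err != nil { |
|     log.Fatal(err) |
|   } |
|   defer file.Close() |
| |
|   // Interpret the legacy nanobench structure ... |
|   benchData, err := format.ParseLegacyFormat(file) |
|   if err != nil { |
|     log.Fatal(err) |
|   } |
| |
|   // ... then extract the raw sample values for each test. |
|   return parser.GetSamplesFromLegacyFormat(benchData) |
| } |
| ``` |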
| |
| 3. **Statistical Analysis:** The `samplestats.Analyze` function (from the |
| `perf/go/samplestats` module) is the heart of the comparison. It takes the |
| two `parser.SamplesSet` (before and after samples) and a |
| `samplestats.Config` object as input. The configuration includes: |
| |
| - `Alpha`: The significance level (default 0.05). A p-value below alpha |
| indicates a significant difference. |
| - `IQRR`: A boolean indicating whether to apply the Interquartile Range |
| Rule to remove outliers from the sample data before analysis. |
| - `All`: A boolean determining if all results (significant or not) should |
| be displayed. |
| - `Test`: The type of statistical test to perform (Mann-Whitney U test or |
| Welch's T-test). |
| - `Order`: The function used to sort the output rows. |
| |
| For each common benchmark test found in both input files, |
| `samplestats.Analyze` calculates statistics for both sets of samples (mean, |
| percentage deviation) and then performs the chosen statistical test to |
| compare the two distributions. This yields a p-value. |
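| |
| A hedged sketch of how this is wired up (the `Config` field names follow |
| the list above; constant names like `samplestats.MannWhitneyU` and |
| `samplestats.ByDelta` are illustrative, not verbatim): |
| |
| ``` |
| // analyze compares the two sample sets under the given settings. |
| func analyze(before, after parser.SamplesSet) []samplestats.Row { |
|   config := samplestats.Config{ |
|     Alpha: 0.05,  // significance level for the p-value |
|     IQRR:  true,  // drop outliers via the interquartile range rule |
|     All:   false, // keep only statistically significant rows |
|     Test:  samplestats.MannWhitneyU, |
|     Order: samplestats.ByDelta, |
|   } |
|   return samplestats.Analyze(config, before, after) |
| } |
| ``` |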
| |
| 4. **Filtering and Sorting:** Based on the `config`, `samplestats.Analyze` |
| filters out rows where the change is not statistically significant (if |
| `config.All` is false). The remaining rows are then sorted according to |
| `config.Order`. |
| |
| 5. **Output Formatting:** The `formatRows` function in `main.go` takes the |
| analyzed and sorted `samplestats.Row` data and prepares it for display. |
| |
| - It identifies "important keys" from the benchmark parameters (e.g., |
| `config`, `name`, `test`). These are keys whose values differ across the |
| benchmark results, helping to distinguish them. |
| - It constructs a header line for the output table. |
| - For each row of results, it formats the old and new means, standard |
| deviations, the percentage delta, the p-value, sample sizes, and the |
| important key values. |
| - If a change is not significant (p-value > alpha), the delta is shown as |
| "~" unless the `--all` flag is used. |
| - The formatted strings are then printed to `stdout` using |
| `text/tabwriter` to create a well-aligned table. |
| |
| Example output line: |
| |
| ``` |
| old new delta stats name |
| 2.15 ± 5% 2.00 ± 2% -7% (p=0.001, n=10+ 8) tabl_digg.skp |
| ``` |
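| |
| The column alignment comes from the standard library's `text/tabwriter`; a |
| minimal sketch of the printing step (the helper name `printRows` is |
| illustrative): |
| |
| ``` |
| import ( |
|   "fmt" |
|   "io" |
|   "text/tabwriter" |
| ) |
| |
| // printRows writes tab-separated rows through a tabwriter so the columns |
| // line up regardless of each cell's width. |
| func printRows(stdout io.Writer, lines []string) { |
|   w := tabwriter.NewWriter(stdout, 0, 8, 1, ' ', 0) |
|   for _, line := range lines { |
|     fmt.Fprintln(w, line) // each line contains \t-separated cells |
|   } |
|   w.Flush() // Flush emits the buffered, aligned output |
| } |
| ``` |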
| |
| ### Key Components and Files |
| |
| - **`main.go`**: This is the entry point of the application. |
| |
| - **Responsibilities**: |
| - Parses command-line arguments and flags (`-alpha`, `-sort`, `-iqrr`, |
| `-all`, `-test`). |
| - Validates user input and displays usage information if necessary. |
| - Calls `loadFileByName` to load and parse the input JSON files. |
| - Constructs the `samplestats.Config` based on the provided flags. |
| - Invokes `samplestats.Analyze` to perform the statistical comparison. |
| - Calls `formatRows` to format the results for display. |
| - Uses `text/tabwriter` to print the formatted output to the console. |
| - **Key functions**: |
| - `actualMain(stdout io.Writer)`: Contains the main logic, allowing |
| `stdout` to be replaced for testing. |
| - `loadFileByName(filename string) parser.SamplesSet`: Reads a nanobench |
| JSON file, parses it, and extracts the performance samples. It leverages |
| `perf/go/ingest/format` and `perf/go/ingest/parser`. |
| - `formatRows(config samplestats.Config, rows []samplestats.Row) |
| []string`: Takes the analysis results and formats them into a slice of |
| strings, ready for tabular display. It intelligently includes relevant |
| parameter keys in the output. |
| |
| - **`main_test.go`**: Contains unit tests for `nanostat`. |
| |
| - **Responsibilities**: |
| - Ensures that `nanostat` produces the expected output for various |
| command-line flag combinations and input files. |
| - Uses golden files (`testdata/*.golden`) to compare actual output against |
| expected output. |
| - **Key functions**: |
| - `TestMain_DifferentFlags_ChangeOutput(t *testing.T)`: The main test |
| function that sets up different test cases. |
| - `check(t *testing.T, name string, args ...string)`: A helper function |
| that runs `nanostat` with specified arguments, captures its output, and |
| compares it against a corresponding golden file. |
| |
| - **`README.md`**: Provides user-facing documentation on how to install and |
| use `nanostat`, including examples and descriptions of command-line options. |
| |
| - **`Makefile`**: Contains targets for building, testing, and regenerating |
| test data (golden files). The `regenerate-testdata` target is crucial for |
| updating the golden files when the tool's output format or logic changes. |
| |
| - **`BUILD.bazel`**: Defines how to build and test the `nanostat` binary and |
| its library using the Bazel build system. It lists dependencies on other |
| Skia modules, such as: |
| |
| - `//go/paramtools`: Used in `formatRows` to work with parameter sets from |
| benchmark results. |
| - `//perf/go/ingest/format`: Used for parsing the legacy nanobench JSON |
| format. |
| - `//perf/go/ingest/parser`: Used to extract sample data from the parsed |
| format. |
| - `//perf/go/samplestats`: Provides the core statistical analysis |
| functions (`samplestats.Analyze`, `samplestats.Order`, |
| `samplestats.Test`). |
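| |
| The golden-file helper likely looks something like this sketch (the flag |
| handling and the assertion library are assumptions; see `main_test.go` for |
| the real implementation): |
| |
| ``` |
| import ( |
|   "bytes" |
|   "os" |
|   "path/filepath" |
|   "testing" |
| |
|   "github.com/stretchr/testify/require" |
| ) |
| |
| // check runs nanostat's main logic with the given arguments, capturing |
| // stdout in a buffer and comparing it against testdata/<name>.golden. |
| func check(t *testing.T, name string, args ...string) { |
|   var stdout bytes.Buffer |
|   os.Args = append([]string{"nanostat"}, args...) // simulate the CLI |
|   actualMain(&stdout) |
| |
|   expected, err := os.ReadFile(filepath.Join("testdata", name+".golden")) |
|   require.NoError(t, err) |
|   require.Equal(t, string(expected), stdout.String()) |
| } |
| ``` |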
| |
| ### Dependencies and Design Choices |
| |
| - **`perf/go/samplestats`**: `nanostat` heavily relies on this module for the |
| actual statistical computations. This promotes code reuse and separation of |
| concerns, keeping `nanostat` focused on command-line parsing, file I/O, and |
| output formatting. |
| - **`perf/go/ingest/format` and `perf/go/ingest/parser`**: These modules |
| handle the complexities of interpreting the nanobench JSON structure, |
| abstracting this detail away from `nanostat`'s main logic. |
| - **Command-line Flags**: The tool offers a range of flags to customize its |
| behavior (`-alpha`, `-iqrr`, `-all`, `-sort`, `-test`). This flexibility |
| allows users to tailor the analysis to their specific needs. For example, |
| the `-iqrr` flag allows for more robust analysis by removing potential |
| outlier data points that could skew results. The `-test` flag allows users |
| to choose between parametric (T-test) and non-parametric (U-test) |
| statistical tests, depending on the assumptions they are willing to make |
| about their data's distribution. |
| - **Tabular Output**: Using `text/tabwriter` provides a clean, aligned, and |
| easy-to-read output format, which is essential for quickly scanning and |
| understanding the performance changes. |
| - **Golden File Testing**: The use of golden files in `main_test.go` is a good |
| practice for testing command-line tools. It makes it easy to verify that |
| changes to the code don't unintentionally alter the output format or the |
| results of the analysis. The `Makefile` target `regenerate-testdata` |
| simplifies updating these files when intended changes occur. |
| |
| # Module: /pages |
| |
| The `/pages` module is responsible for defining the HTML structure and initial |
| JavaScript and CSS for all the user-facing pages of the Skia Performance |
| application. Each page represents a distinct view or functionality within the |
| application, such as viewing alerts, exploring performance data, or managing |
| regressions. |
| |
| The core design philosophy is to keep the HTML files minimal and delegate the |
| rendering and complex logic to custom HTML elements (Skia Elements). This |
| promotes modularity and reusability of UI components. |
| |
| **Key Components and Responsibilities:** |
| |
| - **HTML Files (e.g., `alerts.html`, `newindex.html`):** |
| - These files serve as the entry point for each page. |
| - They define the basic HTML structure (`<head>`, `<body>`). |
| - Crucially, they include a `perf-scaffold-sk` custom element. This |
| element acts as a common layout wrapper for all pages, providing |
| consistent navigation, header, footer, and potentially other shared UI |
| elements. |
| - Inside the `perf-scaffold-sk`, they embed the primary custom element |
| specific to that page's functionality (e.g., `<alerts-page-sk>`, |
| `<explore-sk>`). |
| - They include Go template placeholders like `{%- template |
| "googleanalytics" . -%}` and `{% .Nonce %}` for server-side rendering of |
| common snippets and security nonces. |
| - A `window.perf = {%.context %};` script tag is used to pass initial data |
| or configuration from the server (Go backend) to the client-side |
| JavaScript. This context likely contains information needed by the |
| page-specific custom element to initialize itself. |
| - **TypeScript Files (e.g., `alerts.ts`, `newindex.ts`):** |
| - These files are the JavaScript entry points for each page. |
| - Their primary responsibility is to import the necessary custom elements. |
| This ensures that the browser knows how to render elements like |
| `<perf-scaffold-sk>` and the page-specific custom element (e.g., |
| `../modules/alerts-page-sk`). |
| - By importing these elements, their associated JavaScript logic is |
| executed, making them functional. |
| - **SCSS Files (e.g., `alerts.scss`, `newindex.scss`):** |
| - These files provide page-specific styling. |
| - Currently, they all consist primarily of `@import 'body';`, meaning they |
| inherit base body styles from `body.scss`. |
| - If a page required unique styling beyond what the custom elements or |
| `body.scss` provide, those styles would be defined here. |
| - **`body.scss`:** |
| - This file defines global, minimal styles for the `<body>` element, such |
| as removing default margins and padding. This ensures a consistent |
| baseline across all pages. |
| - **`BUILD.bazel`:** |
| - This file defines how each page is built using the `sk_page` rule from |
| `//infra-sk:index.bzl`. |
| - For each page, it specifies: |
| - `html_file`: The entry HTML file. |
| - `ts_entry_point`: The entry TypeScript file. |
| - `scss_entry_point`: The entry SCSS file. |
| - `sk_element_deps`: A list of dependencies on other modules that provide |
| the custom HTML elements used by the page. This is crucial for ensuring |
| that elements like `perf-scaffold-sk` and page-specific elements (e.g., |
| `alerts-page-sk`) are compiled and available. |
| - `sass_deps`: Dependencies for SCSS, typically including `:body_sass_lib` |
| which refers to the `body.scss` file. |
| - Other build-related configurations like `assets_serving_path`, `nonce`, |
| and `production_sourcemap`. |
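| |
| Putting these conventions together, a minimal page looks something like the |
| following hypothetical sketch (modeled on the description of `alerts.html` |
| above, not a verbatim copy): |
| |
| ``` |
| <!DOCTYPE html> |
| <html> |
|   <head> |
|     {%- template "googleanalytics" . -%} |
|   </head> |
|   <body> |
|     <script nonce="{% .Nonce %}"> |
|       window.perf = {%.context %}; |
|     </script> |
|     <perf-scaffold-sk> |
|       <alerts-page-sk></alerts-page-sk> |
|     </perf-scaffold-sk> |
|   </body> |
| </html> |
| ``` |
| |
| The matching `alerts.ts` entry point then consists of little more than |
| `import '../modules/perf-scaffold-sk';` and |
| `import '../modules/alerts-page-sk';`, so both elements are registered |
| before the browser upgrades them. |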
| |
| **Workflow for a Page Request:** |
| |
| 1. User navigates to a URL (e.g., `/alerts`). |
| 2. The server (Go backend) maps this URL to the corresponding HTML file (e.g., |
| `alerts.html`). |
| 3. The Go backend processes the HTML template, injecting the `{% .context %}` |
| data and the `{% .Nonce %}` value, and rendering shared templates such as |
| "googleanalytics" and "cookieconsent". |
| 4. The processed HTML is sent to the browser. |
| |
| ``` |
| User Request --(URL Routing)--> Go Backend |
|   --> Template Processing (alerts.html + context) |
|   --> HTML Response (injects window.perf data, nonce) |
| ``` |
| |
| 5. The browser parses the HTML. |
| 6. When the browser encounters `<script src="alerts.js"></script>` (or the |
| equivalent generated by the build system), it fetches and executes |
| `alerts.ts`. |
| 7. `alerts.ts` imports `../modules/perf-scaffold-sk` and |
| `../modules/alerts-page-sk`. This registers these custom elements with the |
| browser. |
| |
| ``` |
| Browser Receives HTML -> Parses HTML -> Encounters <script> for alerts.ts |
|   -> Fetches and Executes alerts.ts |
|   -> import '../modules/perf-scaffold-sk'; |
|   -> import '../modules/alerts-page-sk'; |
| (Custom elements are now defined) |
| ``` |
| |
| 8. The browser then renders the custom elements (`<perf-scaffold-sk>` and |
| `<alerts-page-sk>`). The JavaScript logic within these custom elements takes |
| over, potentially fetching more data via AJAX using the initial |
| `window.perf` context if needed, and populating the page content. |
| |
| ``` |
| Custom Elements Registered |
|   -> Browser renders <perf-scaffold-sk> and <alerts-page-sk> |
|   -> JavaScript within these elements executes |
|      (e.g., reads window.perf, makes AJAX calls, builds UI) |
| ``` |
| |
| 9. The SCSS file (`alerts.scss`) is also linked in the HTML (via the build |
| system), and its styles (including those from `body.scss`) are applied. |
| |
| This structure allows for a clean separation of concerns: |
| |
| - HTML provides the basic skeleton and server-side data injection points. |
| - TypeScript/JavaScript (via custom elements) handles all dynamic behavior, UI |
| rendering, and interaction logic. |
| - SCSS handles the styling. |
| |
| The `help.html` page is slightly different as it directly embeds more static |
| content (help text and examples) within its HTML structure using Go templating |
| (`{% range ... %}`). However, it still utilizes the `perf-scaffold-sk` for |
| consistent page layout and imports its JavaScript for any scaffold-related |
| functionalities. |
| |
| The `newindex.html` and `multiexplore.html` pages additionally include a `div` |
| with `id="sidebar_help"` within the `perf-scaffold-sk`. This suggests that the |
| `perf-scaffold-sk` might have a designated area or slot where page-specific help |
| content can be injected, or that the page-specific JavaScript (`explore-sk.ts` |
| or `explore-multi-sk.ts`) might dynamically populate or interact with this |
| sidebar content. |
| |
| # Module: /res |
| |
| ## Resource Module (`/res`) |
| |
| ### High-Level Overview |
| |
| The `/res` module serves as a centralized repository for static assets required |
| by the application. Its primary purpose is to provide a consistent and organized |
| location for resources such as images, icons, and potentially other static files |
| that are part of the user interface or overall application branding. By |
| co-locating these assets, the module simplifies resource management, facilitates |
| easier updates, and ensures that all parts of the application can reliably |
| access necessary visual or static elements. |
| |
| ### Design Decisions and Implementation Choices |
| |
| The decision to have a dedicated `/res` module stems from the need to separate |
| static content from dynamic code. This separation offers several benefits: |
| |
| 1. **Organization:** Grouping all static assets in one place makes the project |
| structure cleaner and easier to navigate. Developers know exactly where to |
| look for or add new resources. |
| 2. **Maintainability:** When assets need to be updated (e.g., a new logo, a |
| changed icon), modifications are localized to this module, reducing the risk |
| of inadvertently affecting other parts of the codebase. |
| 3. **Build Process Optimization:** Build tools can often be configured to |
| handle static assets differently (e.g., copying them directly to the output |
| directory, optimizing images). Having a dedicated module simplifies the |
| configuration of such processes. |
| 4. **Caching and Delivery:** Web servers and content delivery networks (CDNs) |
| can be more effectively configured to cache and serve static assets when |
| they are located in a well-defined directory. |
| |
| The internal structure of `/res` is designed to categorize different types of |
| assets. For instance, images are placed within a dedicated `img` subdirectory. |
| This categorization aids in discoverability and allows for type-specific |
| processing or handling if needed in the future. |
| |
| ### Key Components/Files/Submodules |
| |
| - **`/res/img` (Submodule/Directory):** |
| - **Responsibility:** This submodule is dedicated to storing all image |
| assets used by the application. This includes logos, icons, background |
| images, and any other visual elements that are not dynamically |
| generated. |
| - **Why:** Separating images into their own directory within `/res` keeps |
| the root of the resource module clean and allows for specific |
| image-related build optimizations or management strategies. For example, |
| image compression tools or sprite generation scripts could target this |
| directory specifically. |
| - **Key Files:** |
| - **`/res/img/favicon.ico`:** |
| - **Responsibility:** This specific file provides the "favorite icon" |
| or "favicon" for the application. Web browsers display this icon in |
| various places, such as the browser tab, bookmarks bar, and address |
| bar history. It's a small but important branding element that helps |
| users quickly identify the application among many open tabs or saved |
| links. |
| - **Why:** The `.ico` format is the traditional and most widely |
| supported format for favicons, ensuring compatibility across |
| different browsers and platforms. Placing it directly in the `img` |
| directory makes it easily discoverable by build tools and web |
| servers, which often look for `favicon.ico` in standard locations. |
| Its presence here ensures that the application has a visual |
| identifier in browser contexts. |
| |
| ### Workflows and Processes |
| |
| A typical workflow involving the `/res` module might look like this: |
| |
| 1. **Asset Creation/Acquisition:** A designer creates a new icon or a new |
| version of the application logo. |
| |
| ``` |
| Designer Developer |
| | | |
| [New Image Asset] --> [Receives Asset] |
| ``` |
| |
| 2. **Asset Placement:** The developer places the new image file (e.g., |
| `new_icon.png`) into the appropriate subdirectory within `/res`, likely |
| `/res/img/`. |
| |
| ``` |
| Developer |
| | |
| [Places new_icon.png into /res/img/] |
| ``` |
| |
| 3. **Referencing the Asset:** Application code (e.g., HTML, CSS, JavaScript) |
| that needs to display this icon will reference it using a path relative to |
| how the assets are served. |
| |
| ``` |
| Application Code (e.g., HTML) |
| | |
| <img src="/path/to/res/img/new_icon.png"> |
| ``` |
| |
| _(Note: The exact `/path/to/` depends on how the web server or build system |
| exposes the `/res` directory.)_ |
| |
| 4. **Build Process:** During the application build, files from the `/res` |
| module are typically copied to a public-facing directory in the build |
| output. |
| |
| ``` |
| Build System |
| | |
| [Reads /res/img/new_icon.png] --> [Copies to /public_output/img/new_icon.png] |
| ``` |
| |
| 5. **Client Request:** When a user accesses the application, their browser |
| requests the asset. |
| |
| ``` |
| User's Browser                                    Web Server |
|       |                                               | |
| [Requests /public_output/img/new_icon.png] ----> [Serves new_icon.png] |
|       |                                               | |
| [Displays new_icon.png] <-----------------------------+ |
| ``` |
| |
| This workflow highlights how the `/res` module acts as the source of truth for |
| static assets, which are then processed and served to the end-user. The |
| `favicon.ico` follows a similar, often more implicit, path as browsers |
| automatically request it from standard locations. |
| |
| # Module: /samplevariance |
| |
| The `samplevariance` module is a command-line tool designed to analyze the |
| variance of benchmark samples, specifically those generated by nanobench and |
| stored in Google Cloud Storage (GCS). Nanobench typically produces multiple |
| samples (e.g., 10) for each benchmark execution. This tool facilitates the |
| examination of these samples across a large corpus of historical benchmark runs. |
| |
| The primary motivation for this tool is to identify benchmarks exhibiting high |
| variance in their results. High variance can indicate instability in the |
| benchmark itself, the underlying system, or the measurement process. By |
| calculating statistics like the ratio of the median to the minimum value for |
| each set of samples, `samplevariance` helps pinpoint traces that warrant further |
| investigation. |
| |
| The core workflow involves: |
| |
| 1. **Initialization**: Parsing command-line flags to determine the GCS location |
| of benchmark data, output destination (stdout or a file), filtering criteria |
| for traces, and the number of top results to display. |
| 2. **File Discovery**: Listing all relevant JSON files from the specified GCS |
| bucket and prefix. |
| 3. **Data Processing (Concurrent)**: Distributing the discovered filenames to a |
| pool of worker goroutines. Each worker: |
| - Downloads a JSON file from GCS. |
| - Parses the legacy nanobench format to extract benchmark results. |
| - Filters traces based on the user-provided criteria. |
| - For each matching trace, calculates the median and minimum of its |
| samples. |
| - Computes the ratio of median to minimum. |
| - Stores this information as a `sampleInfo` struct. |
| 4. **Aggregation and Sorting**: Collecting all `sampleInfo` structs from the |
| workers and sorting them in descending order based on the calculated |
| median/min ratio. This brings the traces with the highest variance to the |
| top. |
| 5. **Output**: Writing the sorted results to a CSV file (or stdout), including |
| the trace identifier, minimum value, median value, and the median/min ratio. |
| |
| ``` |
| [Flags] -> initialize() -> (ctx, bucket, objectPrefix, traceFilter, outputWriter) |
| | |
| v |
| filenamesFromBucketAndObjectPrefix(ctx, bucket, objectPrefix) -> [filenames] |
| | |
| v |
| samplesFromFilenames(ctx, bucket, traceFilter, [filenames]) |
| | |
| |--> [gcsFilenameChannel] -> Worker Goroutine 1 -> traceInfoFromFilename() -> [sampleInfo] --\ |
| | | |
| |--> [gcsFilenameChannel] -> Worker Goroutine 2 -> traceInfoFromFilename() -> [sampleInfo] ----> [aggregatedSamples] (mutex protected) |
| | | |
| |--> ... (up to workerPoolSize) | |
| | | |
| |--> [gcsFilenameChannel] -> Worker Goroutine N -> traceInfoFromFilename() -> [sampleInfo] --/ |
| | |
| v |
| Sort([aggregatedSamples]) |
| | |
| v |
| writeCSV([sortedSamples], topN, outputWriter) -> CSV Output |
| ``` |
| |
| Key components and their responsibilities: |
| |
| - **`main.go`**: This is the entry point of the application and orchestrates |
| the entire process. |
| |
| - `main()`: Drives the overall workflow: initialization, fetching |
| filenames, processing samples, sorting, and writing the output. |
| - `initialize()`: Handles command-line argument parsing. It sets up the |
| GCS client, determines the input GCS path (defaulting to yesterday's |
| data if not specified), parses the trace filter query, and configures |
| the output writer (stdout or a specified file). The choice to default to |
| yesterday's data provides a convenient way to monitor recent benchmark |
| stability without requiring explicit date specification. |
| - `filenamesFromBucketAndObjectPrefix()`: Interacts with GCS to list all |
| object names (filenames) under the specified bucket and prefix. It uses |
| GCS client library features to efficiently retrieve only the names, |
| minimizing data transfer. |
| - `samplesFromFilenames()`: Manages the concurrent processing of benchmark |
| files. It creates a channel (`gcsFilenameChannel`) to distribute |
| filenames to a pool of worker goroutines (`workerPoolSize`). An |
| `errgroup` is used to manage these goroutines and propagate any errors. |
| A mutex protects the shared `samples` slice where results from workers |
| are aggregated. This concurrent design is crucial for performance when |
| dealing with a large number of benchmark files. |
| - `traceInfoFromFilename()`: This function is executed by each worker |
| goroutine. It takes a single GCS filename, reads the corresponding |
| object from the bucket, parses the JSON content using |
| `format.ParseLegacyFormat` (from `perf/go/ingest/format`) and |
| `parser.GetSamplesFromLegacyFormat` (from `perf/go/ingest/parser`). For |
| each trace that matches the `traceFilter` (a `query.Query` object from |
| `go/query`), it sorts the sample values, calculates the median (using |
| `stats.Sample.Quantile` from `go-moremath/stats`) and minimum, and then |
| computes their ratio. The use of established libraries for parsing and |
| statistical calculation ensures correctness and leverages existing, |
| tested code. |
| - `writeCSV()`: Formats the processed `sampleInfo` data into CSV format |
| and writes it to the designated output writer. It includes a header row |
| and then iterates through the `sampleInfo` slice, writing each entry. It |
| also handles the `--top` flag to limit the number of output rows. |
| - `sampleInfo`: A simple struct to hold the calculated statistics (trace |
| ID, median, min, ratio) for a single benchmark trace's samples. |
| - `sampleInfoSlice`: A helper type that implements `sort.Interface` to |
| allow sorting `sampleInfo` slices by the `ratio` field in descending |
| order. This is key to presenting the most variant traces first. |
| |
| - **`main_test.go`**: Contains unit tests for the `writeCSV` function. These |
| tests verify that the CSV output is correctly formatted under different |
| conditions, such as when writing all samples, a limited number of top |
| samples, or when the number of samples is less than the requested top N. |
| This ensures the output formatting logic is robust. |
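| |
| To make the per-trace computation concrete, here is a hedged sketch of the |
| statistics and ordering described above (field and helper names are |
| assumptions; the real definitions live in `main.go`): |
| |
| ``` |
| import ( |
|   "sort" |
| |
|   "github.com/aclements/go-moremath/stats" |
| ) |
| |
| // sampleInfo holds the calculated statistics for one trace's samples. |
| type sampleInfo struct { |
|   traceID string |
|   median  float64 |
|   min     float64 |
|   ratio   float64 // median / min; large values suggest a noisy benchmark |
| } |
| |
| // infoForTrace sorts the samples and computes median, min, and ratio. |
| func infoForTrace(traceID string, values []float64) sampleInfo { |
|   sort.Float64s(values) |
|   median := stats.Sample{Xs: values, Sorted: true}.Quantile(0.5) |
|   return sampleInfo{ |
|     traceID: traceID, |
|     median:  median, |
|     min:     values[0], |
|     ratio:   median / values[0], |
|   } |
| } |
| |
| // sampleInfoSlice implements sort.Interface, ordering by descending ratio |
| // so the most variant traces are listed first. |
| type sampleInfoSlice []sampleInfo |
| |
| func (s sampleInfoSlice) Len() int           { return len(s) } |
| func (s sampleInfoSlice) Less(i, j int) bool { return s[i].ratio > s[j].ratio } |
| func (s sampleInfoSlice) Swap(i, j int)      { s[i], s[j] = s[j], s[i] } |
| ``` |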
| |
| The design decision to use a worker pool (`workerPoolSize`) for processing files |
| in parallel significantly speeds up the analysis, especially when dealing with |
| numerous benchmark result files often found in GCS. The use of |
| `golang.org/x/sync/errgroup` simplifies error handling in concurrent operations. |
| Filtering capabilities (via the `--filter` flag and `go/query`) allow users to |
| narrow down the analysis to specific subsets of benchmarks, making the tool more |
| flexible and targeted. The output as a CSV file makes it easy to import the |
| results into spreadsheets or other data analysis tools for further examination. |
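| |
| A simplified sketch of that worker-pool shape, assuming the `sampleInfo` |
| type from the earlier sketch and stubbing out the GCS work: |
| |
| ``` |
| import ( |
|   "context" |
|   "sync" |
| |
|   "golang.org/x/sync/errgroup" |
| ) |
| |
| const workerPoolSize = 16 // illustrative; see main.go for the real value |
| |
| // traceInfoFromFilename is stubbed here; the real function downloads the |
| // file from GCS, parses it, filters traces, and computes median/min ratios. |
| func traceInfoFromFilename(ctx context.Context, filename string) ([]sampleInfo, error) { |
|   return nil, nil |
| } |
| |
| func samplesFromFilenames(ctx context.Context, filenames []string) ([]sampleInfo, error) { |
|   gcsFilenameChannel := make(chan string) |
|   var mutex sync.Mutex // protects samples, which every worker appends to |
|   var samples []sampleInfo |
| |
|   g, ctx := errgroup.WithContext(ctx) |
| |
|   // Feeder: errgroup cancels ctx on the first worker error, which |
|   // unblocks the send and stops the feed early. |
|   g.Go(func() error { |
|     defer close(gcsFilenameChannel) |
|     for _, filename := range filenames { |
|       select { |
|       case gcsFilenameChannel <- filename: |
|       case <-ctx.Done(): |
|         return ctx.Err() |
|       } |
|     } |
|     return nil |
|   }) |
| |
|   // Worker pool: each goroutine drains filenames until the channel closes. |
|   for i := 0; i < workerPoolSize; i++ { |
|     g.Go(func() error { |
|       for filename := range gcsFilenameChannel { |
|         infos, err := traceInfoFromFilename(ctx, filename) |
|         if err != nil { |
|           return err |
|         } |
|         mutex.Lock() |
|         samples = append(samples, infos...) |
|         mutex.Unlock() |
|       } |
|       return nil |
|     }) |
|   } |
| |
|   if err := g.Wait(); err != nil { |
|     return nil, err |
|   } |
|   return samples, nil |
| } |
| ``` |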
| |
| # Module: /scripts |
| |
| The `/scripts` module provides tooling to support the data ingestion pipeline |
| for Skia Perf. The primary focus is on automating the process of transferring |
| processed data to the designated cloud storage location for further analysis and |
| visualization within the Skia performance monitoring system. |
| |
| The key responsibility of this module is to ensure reliable and timely delivery |
| of performance data. This is achieved by interacting with Google Cloud Storage |
| (GCS) using the `gsutil` command-line tool. |
| |
| The main component within this module is the `upload_extracted_json_files.sh` |
| script. |
| |
| **`upload_extracted_json_files.sh`** |
| |
| This shell script is responsible for uploading JSON files, which are assumed to |
| be the output of a preceding data extraction or processing phase, to a specific |
| Google Cloud Storage bucket (`gs://skia-perf/nano-json-v1/`). |
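| |
| The heart of the script is likely a single command of this shape (a sketch |
| reconstructed from the description in this section, not a verbatim copy): |
| |
| ``` |
| gsutil -m cp -r downloads/* \ |
|   gs://skia-perf/nano-json-v1/$(date -u --date +1hour +%Y/%m/%d/%H)/ |
| ``` |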
| |
| **Design Rationale and Implementation Details:** |
| |
| - **Why a shell script?** Shell scripting is a straightforward and widely |
| available tool for automating command-line operations, making it suitable |
| for tasks like file transfers to cloud storage. It avoids the need for more |
| complex programming language environments for this specific, relatively |
| simple task. |
| - **Why `gsutil`?** `gsutil` is the standard command-line tool for interacting |
| with Google Cloud Storage. It provides robust features for uploading, |
| downloading, and managing data in GCS buckets. |
| - **Why `-m` (parallel uploads)?** The `-m` flag in `gsutil cp` enables |
| parallel uploads. This is a crucial performance optimization, especially |
| when dealing with a potentially large number of JSON files. By uploading |
| multiple files concurrently, the overall time taken for the transfer is |
| significantly reduced. |
| - **Why `cp -r` (recursive copy)?** The `-r` flag ensures that the entire |
| directory structure under `downloads/` is replicated in the destination GCS |
| path. This is important for maintaining the organization of the data and |
| potentially for downstream processing that might rely on the file paths. |
| - **Why the specific GCS path structure (`gs://skia-perf/nano-json-v1/$(date |
| -u --date +1hour +%Y/%m/%d/%H)`)?** |
| - `gs://skia-perf/nano-json-v1/`: This is the base path in the GCS bucket |
| designated for "nano" format JSON files, version 1. This structured |
| naming helps in organizing different types and versions of data within |
| the bucket. |
| - `$(date -u --date +1hour +%Y/%m/%d/%H)`: This part dynamically generates |
| a timestamped subdirectory structure. |
| - `date -u`: Ensures the date is in UTC, providing a consistent timezone |
| regardless of where the script is run. |
| - `--date +1hour`: This is a deliberate choice to place the data into the |
| _next_ hour's ingestion slot. This likely provides a buffer, ensuring |
| that all data generated within a given hour is reliably captured and |
| processed for that hour, even if the script runs slightly before or |
| after the hour boundary. It helps prevent data from being missed or |
| attributed to the wrong time window due to minor timing discrepancies in |
| script execution. |
| - `+%Y/%m/%d/%H`: Formats the date and time into a hierarchical path |
| (e.g., `2023/10/27/15`). This organization is beneficial for: |
| - **Data partitioning:** Makes it easy to query or process data for |
| specific time ranges. |
| - **Data lifecycle management:** Facilitates policies for archiving or |
| deleting older data based on these time-based folders. |
| - **Browseability:** Improves human readability and navigation within |
| the GCS bucket. |
| |
| **Workflow:** |
| |
| The script executes a simple, linear workflow: |
| |
| 1. **Source:** Identifies the `downloads/` directory in the current working |
| directory as the source of JSON files. |
| |
| ``` |
| [Local Filesystem] |
|         | |
|   ./downloads/ (contains *.json files) |
| ``` |
| |
| 2. **Destination Path Generation:** Dynamically constructs the target GCS path |
| using the current UTC time, advanced by one hour, and formatted as |
| `YYYY/MM/DD/HH`. |
| |
| ``` |
| date command ---> YYYY/MM/DD/HH (e.g., 2023/10/27/15) |
|         | |
|         v |
| Target GCS Path: gs://skia-perf/nano-json-v1/YYYY/MM/DD/HH/ |
| ``` |
| |
| 3. **Upload:** Uses `gsutil` to recursively copy all contents from `downloads/` |
| to the generated GCS path, utilizing parallel uploads for efficiency. |
| |
| ``` |
| ./downloads/* ---(gsutil -m cp -r)---> gs://skia-perf/nano-json-v1/YYYY/MM/DD/HH/ |
| ``` |
| |
| This script assumes that the `downloads/` directory exists in the location where |
| the script is executed and contains the JSON files ready for upload. It also |
| presumes that the user running the script has the necessary `gsutil` tool |
| installed and configured with appropriate permissions to write to the specified |
| GCS bucket. |
| |
| # Module: /secrets |
| |
| The `/secrets` module is responsible for managing the creation and configuration |
| of secrets required for various Skia Perf services to operate. These secrets |
| primarily involve Google Cloud service accounts and OAuth credentials for email |
| sending. The scripts in this module automate the setup of these credentials, |
| ensuring that services have the necessary permissions to interact with Google |
| Cloud APIs and other resources. |
| |
| The design philosophy emphasizes secure and automated credential management. |
| Instead of manual creation and configuration of secrets, these scripts provide a |
| repeatable and version-controlled way to provision them. This reduces the risk |
| of human error and ensures that services are configured with the principle of |
| least privilege. For instance, service accounts are granted only the specific |
| roles they need to perform their tasks. |
| |
| ### Key Components and Scripts: |
| |
| **1. Service Account Creation Scripts:** |
| |
| - **`create-flutter-perf-service-account.sh`**: This script provisions a |
| Google Cloud service account specifically for the Flutter Perf instance. It |
| leverages a common script (`../../kube/secrets/add-service-account.sh`) to |
| handle the underlying `gcloud` commands. |
| |
| - **Why**: Flutter Perf needs its own identity to interact with Google |
| Cloud services like Pub/Sub (for message queuing) and Cloud Trace (for |
| application performance monitoring). Separating this into its own |
| service account adheres to the principle of least privilege and allows |
| for more granular permission management. |
| - **How**: It calls the `add-service-account.sh` script, passing in |
| parameters like the project ID, the desired service account name |
| ("flutter-perf-service-account"), a descriptive display name, and the |
| necessary IAM roles (`roles/pubsub.editor`, `roles/cloudtrace.agent`). |
| |
| - **`create-perf-cockroachdb-backup-service-account.sh`**: This script creates |
| a dedicated service account for the Perf CockroachDB backup cronjob. |
| |
| - **Why**: The backup process requires permissions to write data to Google |
| Cloud Storage. A dedicated service account ensures that only the backup |
| job has these specific permissions, enhancing security. If the backup |
| job's credentials were compromised, the blast radius would be limited to |
| storage object administration. |
| - **How**: Similar to the Flutter Perf service account, it utilizes |
| `../../kube/secrets/add-service-account.sh`. It specifies the service |
| account name ("perf-cockroachdb-backup") and the |
| `roles/storage.objectAdmin` role, which grants permissions to manage |
| objects in Cloud Storage buckets. |
| |
| - **`create-perf-ingest-sa.sh`**: This script is responsible for creating the |
| `perf-ingest` service account. This account is used by the Perf ingestion |
| service, which processes and stores performance data. |
| |
| - **Why**: The ingestion service needs to publish messages to Pub/Sub |
| topics, send trace data to Cloud Trace, and read data from specific |
| Google Cloud Storage buckets (`gs://skia-perf`, |
| `gs://cluster-telemetry-perf`). A dedicated service account with these |
| precise permissions is crucial for security and operational clarity. It |
| also leverages Workload Identity, a more secure way for Kubernetes |
| workloads to access Google Cloud services. |
| - **How**: |
| |
| * It sources configuration (`../kube/config.sh`) and utility functions |
| (`../bash/ramdisk.sh`) for environment setup. |
| * Creates the service account (`perf-ingest`) using `gcloud iam |
| service-accounts create`. |
| * Assigns necessary IAM roles: |
| - `roles/pubsub.editor`: To publish messages to Pub/Sub. |
| - `roles/cloudtrace.agent`: To send trace data. |
| * Configures Workload Identity by binding the Kubernetes service account |
| (`default/perf-ingest` in the `skia-public` namespace) to the Google |
| Cloud service account. This allows pods running as `perf-ingest` in |
| Kubernetes to impersonate the `perf-ingest` Google Cloud service account |
| without needing to mount service account key files directly. |
| |
| ``` |
| Kubernetes Pod (default/perf-ingest) |
|   ----> Impersonates ----> Google Cloud SA (perf-ingest@skia-public.iam.gserviceaccount.com) |
|                              | |
|                              +----> Accesses GCP Resources (Pub/Sub, Cloud Trace, GCS) |
| ``` |
| |
| * Grants `objectViewer` permissions on specific GCS buckets using `gsutil |
| iam ch`. |
| * Creates a JSON key file for the service account (`perf-ingest.json`). |
| * Creates a Kubernetes secret named `perf-ingest` from this key file using |
| `kubectl create secret generic`. This secret can then be used by |
| deployments that might not be able to use Workload Identity directly or |
| for other specific use cases. |
| * Operations are performed in a temporary ramdisk (`/tmp/ramdisk`) to |
| avoid leaving sensitive key files on persistent storage. |
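| |
| For reference, the Workload Identity binding step typically corresponds to |
| a `gcloud` command of this shape (the exact invocation in the script may |
| differ): |
| |
| ``` |
| gcloud iam service-accounts add-iam-policy-binding \ |
|   --role roles/iam.workloadIdentityUser \ |
|   --member "serviceAccount:skia-public.svc.id.goog[default/perf-ingest]" \ |
|   perf-ingest@skia-public.iam.gserviceaccount.com |
| ``` |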
| |
| - **`create-perf-sa.sh`**: This script creates the primary `skia-perf` service |
| account. This is a general-purpose service account for the main Perf |
| application. |
| |
| - **Why**: The main Perf application requires permissions for Pub/Sub, |
| Cloud Trace, and reading from the `gs://skia-perf` bucket. Similar to |
| `perf-ingest`, this service account uses Workload Identity for enhanced |
| security when running within Kubernetes. |
| - **How**: The process is very similar to `create-perf-ingest-sa.sh`: |
| |
| * Sources configuration and sets up a ramdisk. |
| * Creates the `skia-perf` service account. |
| * Assigns `roles/cloudtrace.agent` and `roles/pubsub.editor`. |
| * Configures Workload Identity, binding the Kubernetes service account |
| (`default/skia-perf`) to the `skia-perf` Google Cloud service account. |
| * Grants `objectViewer` on the `gs://skia-perf` GCS bucket. |
| * Creates a JSON key and stores it as a Kubernetes secret named |
| `skia-perf`. |
| |
| **2. Email Secrets Creation:** |
| |
| - **`create-email-secrets.sh`**: This script facilitates the creation of |
| Kubernetes secrets necessary for Perf to send emails via Gmail. This |
| typically involves an OAuth 2.0 flow. |
| - **Why**: Perf needs to send email notifications (e.g., for alerts). |
| Using Gmail programmatically requires proper authentication, which is |
| achieved through OAuth 2.0. Storing these credentials as Kubernetes |
| secrets makes them securely available to the Perf application pods. |
| - **How**: This script guides the user through a semi-automated process: |
| * It takes the email address to be authenticated as an argument (e.g., |
| `alertserver@skia.org`). |
| * It converts the email address into a Kubernetes-friendly secret name |
| format (e.g., `alertserver-skia-org`). |
| * It prompts the user to download the `client_secret.json` file (obtained |
| from the Google Cloud Console after enabling the Gmail API and creating |
| OAuth 2.0 client credentials) to `/tmp/ramdisk`. |
| * It then instructs the user to run the `three_legged_flow` Go program |
| (which must be built and installed separately from |
| `../go/email/three_legged_flow`). This program initiates the OAuth 2.0 |
| three-legged authentication flow. |
| |
| ``` |
| User Action: Run three_legged_flow |
|   --> Browser opens for Google Auth |
|   --> User authenticates as specified email |
|         | |
|         v |
| three_legged_flow generates client_token.json |
| ``` |
| |
| * Once `client_token.json` (containing the authorization token and refresh |
| token) is generated in `/tmp/ramdisk`, the script uses `kubectl create |
| secret generic` to create a Kubernetes secret named |
| `perf-${EMAIL}-secrets`. This secret contains both `client_secret.json` |
| and `client_token.json`. |
| * Crucially, it then removes the `client_token.json` file from the local |
| filesystem because it contains a sensitive refresh token. The source of |
| truth for this token becomes the Kubernetes secret. |
| * The use of `/tmp/ramdisk` ensures that sensitive downloaded and |
| generated files are stored in memory and are less likely to be |
| inadvertently persisted. |
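| |
| The secret-creation step corresponds to a `kubectl` invocation roughly like |
| the following (file paths shown are illustrative): |
| |
| ``` |
| kubectl create secret generic "perf-${EMAIL}-secrets" \ |
|   --from-file=/tmp/ramdisk/client_secret.json \ |
|   --from-file=/tmp/ramdisk/client_token.json |
| ``` |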
| |
| The common pattern across these scripts is the use of `gcloud` for Google Cloud |
| resource management and `kubectl` for interacting with Kubernetes to store the |
| secrets. The use of a ramdisk for temporary storage of sensitive files like |
| service account keys and OAuth tokens is a security best practice. Workload |
| Identity is preferred for service accounts running in GKE, reducing the need to |
| manage and distribute service account key files. |