Cloud Spanner Schema Migrations & Validation

This package manages database schema migrations and validations for the Performance Dashboard (perf) Cloud Spanner PostgreSQL database.

SECTION 1: Architecture & Execution Flow

We use a Sequential Versioned SQL Migrations model to securely support delayed, batched, and zero-downtime deployments.

1. Sequence-Numbered SQL Scripts

All database schema upgrades are written as raw SQL DDL scripts inside the migrations/ directory. Each file is named with a four-digit zero-padded sequence number:

migrations/0001_init.sql: Baseline schema containing all tables

2. Migration Logging Table

A tracking table named schema_migrations logs all successfully applied versions:

CREATE TABLE IF NOT EXISTS schema_migrations (
    version INT PRIMARY KEY,
    applied_at TIMESTAMPTZ DEFAULT CURRENT_TIMESTAMP
);

3. Upgrade Execution Flow

On deployment, the background maintenance process runs ValidateAndMigrateNewSchema:

Creates the schema_migrations table if it doesn't exist.
Queries the maximum applied version from schema_migrations.
Evaluates all embedded scripts in migrations/, filtering out already applied versions.
Executes pending scripts chronologically and records them in schema_migrations.
Performs Final Schema Catalog Verification: Loads the expected schema layout from schema_spanner.json and validates it against the live catalog description using assertdeep.Diff to guarantee the schema was correctly upgraded.

4. Database Auto-Bootstrapping for Existing Instances

To ensure a seamless rollout without manual database log intervention, the runner has a built-in bootstrapping logic:

If schema_migrations is empty, the runner checks if the database was already initialized:
- If the autobisections table exists, it indicates the initial schema layout (version 1) has already been successfully set up.
- The runner automatically bootstraps the history by inserting version 1 directly into the schema_migrations table.
- If no database tables exist, the database is considered uninitialized (version 0) and all migrations are run from scratch.

5. Active Readiness Probe Gating (Zero-Downtime Deployments)

In production, instead of crash-looping or panicking when schemas are out of date, regular application containers (frontend and ingest) decouple database boot checks from startup:

Dynamic Environment Check: On startup, the server checks the KUBERNETES_SERVICE_HOST environment variable:
- In Local Dev (K8s var is empty): The server fails-fast immediately on boot, giving the developer an instant, clear error trace so they know to run migrations locally.
- In Production (K8s var is present): Connection-level validation is bypassed on boot, allowing the container to start up successfully and delegating verification entirely to the readiness probe.
Readiness Gating: The container exposes a /readiness HTTP endpoint. On every poll, the readiness handler checks expectedschema.VerifySchemaVersion(ctx, f.db).
State Gating: If the database version is lower than the expected binary version, /readiness returns 503 Service Unavailable. Kubernetes holds the new container in the NotReady state (preventing any live traffic routing) while older containers continue serving traffic safely.
Background Logging: A lightweight background thread logs a status update every 10 seconds (Database schema is not ready yet... Waiting for migration job...) until the migration job upgrades the database, at which point /readiness turns green (200 OK) and traffic seamlessly shifts.

SECTION 2: Developer Guidelines: Adding a New Schema Change

When you need to modify the database schema, follow this step-by-step workflow:

Step 1: Create the Migration File

Add a new sequential file migrations/000X_description.sql under the migrations/ folder (e.g. 0003_add_new_column.sql).
Add your DDL statements (e.g. ALTER TABLE ... ADD COLUMN ...).

Place the manual rollback DDL inside a -- Rollback SQL comment block at the top of the file:

-- Rollback SQL (For manual reference):
-- ALTER TABLE Autobisections DROP COLUMN new_column;

ALTER TABLE Autobisections ADD COLUMN new_column TEXT;

Step 2: Update the Go Struct Definition

Update the corresponding Go struct representations of your table (e.g., inside perf/go/autobisection/sqlautobisectionstore/schema/schema.go or other schemas) to match your new column or index.

Step 3: Regenerate SQL Schema and Catalog JSON

Run the regeneration scripts inside perf/ directory to automatically generate the expected SQL and JSON catalogs:

To regenerate Go schema code (spanner/schema_spanner.go), run:
```
cd perf/go/sql && go run ./tosql
```

Start your local database emulator, and regenerate schema_spanner.json by running:

cd perf && go run ./go/sql/exportschema --out ./go/sql/expectedschema/schema_spanner.json --databaseType spanner

Step 4: Add & Run Unit Tests

Verify that all migrations run and match perfectly by executing:
```
bazelisk test //perf/go/sql/expectedschema/...
```
Make sure to start the Spanner emulator container (make -C perf run-spanner-emulator) and set the printed environment variables before running unit tests.