# Infra Team Launch Checklist
## Assumptions
Your service:
- Is checked in to the `go.skia.org/infra` Git repo.
- Is an HTTP server written in Go with a front end in WebComponents (legacy apps use Polymer).
- Will run in a Docker Container.
- Will use a `*.skia.org` domain.
- Does not handle personal data (additional steps will be required).
- Is not intended to be used by the public at large (additional steps will be required).
[JSFiddle](https://github.com/google/skia-buildbot/tree/master/jsfiddle) is a recently
launched service that demonstrates the above. See `go/` for the server code and
`modules/` and `pages/` for the front end.
## Coding
Use `go.skia.org/infra/go/sklog` for logging.
Add flags to your main package like:

    port         = flag.String("port", ":8002", "HTTP service port (e.g., ':8002')")
    local        = flag.Bool("local", false, "Running locally if true. As opposed to in production.")
    promPort     = flag.String("prom_port", ":20000", "Metrics service address (e.g., ':10110')")
    resourcesDir = flag.String("resources_dir", "./dist", "The directory to find HTML, JS, and CSS files. If blank the current directory will be used.")

Call `common.InitWithMust([opt], [opt])` in your main function.
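
As a rough illustration of how the flags above fit together, the following sketch declares and parses them using only the standard library `flag` package. The `config` struct, the `parseFlags` helper, and the `my_app_name` flag set name are hypothetical and exist only for this example; a real service would still call `common.InitWithMust` as described above.

```go
package main

import (
	"flag"
	"fmt"
)

// config is a hypothetical struct holding the parsed flag values.
type config struct {
	port         string
	local        bool
	promPort     string
	resourcesDir string
}

// parseFlags parses args with the same flag names and defaults as listed
// in the checklist. It uses its own FlagSet so it can be called repeatedly.
func parseFlags(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("my_app_name", flag.ContinueOnError)
	fs.StringVar(&c.port, "port", ":8002", "HTTP service port (e.g., ':8002')")
	fs.BoolVar(&c.local, "local", false, "Running locally if true. As opposed to in production.")
	fs.StringVar(&c.promPort, "prom_port", ":20000", "Metrics service address")
	fs.StringVar(&c.resourcesDir, "resources_dir", "./dist", "The directory to find HTML, JS, and CSS files.")
	err := fs.Parse(args)
	return c, err
}

func main() {
	c, err := parseFlags([]string{"--local", "--port=:8000"})
	if err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Printf("%+v\n", c)
}
```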
Use `go.skia.org/infra/go/login` paired with `../../../infra-sk/modules/login.js`
(Legacy Polymer apps use `res/imp/login.html`) and/or
`go.skia.org/infra/go/webhook` for authentication.
When using OAuth, see the secrets section below for including client secrets.
Wrap your `http.Handler` (many services use [mux.NewRouter()](https://github.com/gorilla/mux)) with
`go.skia.org/infra/go/httputils.LoggingGzipRequestResponse` to provide monitoring and
logging of HTTP requests and responses. Then, wrap it in
`go.skia.org/infra/go/httputils.HealthzAndHTTPS` to add an unlogged `/healthz` endpoint for use
with GKE health monitoring and various HTTPS configuration.
Use `go.skia.org/infra/go/httputils.DefaultClientConfig` for HTTP clients, which
provides several features:
- ensures requests time out within a reasonable limit
- tracks how much load we place on external services
- optionally adds authentication to requests
- optionally adds automatic retries with exponential backoff
- optionally treats non-2xx responses as errors
Write your code with security in mind:
- Make sure your service is listed in the [team security scan](http://go/skia-infra-scan).
- Follow the [team security guidelines](http://go/skia-infra-sec) when developing your application.
If you add any critical TODOs while you're coding, file a blocking bug for the issue.
If your application requires Puppeteer tests, it should be explicitly opted in
by making any necessary changes to //puppeteer-tests/docker/run-tests.sh.
## Makefiles
It is customary to have the following commands in a Makefile for the service.
- `build` : Build a development version of the front end.
- `serve` : Run the demo pages of the front end in a "watch" mode. This command is for
primary development of front end pages.
- `core` : Build the server components.
- `watch` : Build a development version of the front end in watch mode. This command is
for running the server, but also making changes to the front end.
- `release` : Build a Docker container with the front end and backend parts in it (see below).
- `push` : Depends on release, and then pushes to GKE using `pushk`.
## Docker
Running apps in Docker makes deployment and local testing much easier, and it
allows integration with GKE. Some legacy apps do not yet run in Docker, but
the goal is to have everything on GKE+Docker.
Create a Dockerfile for your app in the root of the project folder
(e.g. `jsfiddle/Dockerfile`). If there are multiple services, put them in a named folder
(e.g. `fiddlek/fiddle/Dockerfile`, `fiddlek/fiddler/Dockerfile`).
When choosing a base image, consider our light wrappers, found in `kube/*`.
For example, `kube/basealpine/Dockerfile` can be used by putting
`FROM gcr.io/skia-public/basealpine:3.8` as the first line in a Dockerfile.
We have a helper script for 'installing' an app into a Docker container,
`bash/docker_build.sh`. A call to this script is customarily put in a bash
script which is called by `make release`. See `jsfiddle/build_release`
for an example. To integrate docker_build.sh into the actual container, add
a `COPY . /` to copy the executable(s) and HTML/JS/CSS from the build context
into the container. Legacy apps have a similar set-up, but for building a
Debian package instead of a container.
It is customary to include an ENTRYPOINT and CMD with sensible defaults
for the app. It's also a best practice to run the app as USER skia unless
root is absolutely needed.
Putting all the above together, a bare-bones Dockerfile would look something like:

    FROM gcr.io/skia-public/basealpine:3.8
    COPY . /
    USER skia
    ENTRYPOINT ["/usr/local/bin/my_app_name"]
    CMD ["--logtostderr", "--port=:8000", "--resources_dir=/usr/local/share/my_app_name/"]

## Secrets and Service Accounts
If your app needs access to a GCS bucket or similar resources, it is recommended
that you create a new service account for your app. See below for linking it into the
container.
Use an existing `create-sa.sh` script (e.g. `create-jsfiddle-sa.sh`) and tweak
the name, committing it into the app's root directory. Run this once to create
the service account and create the secrets in GKE.
## Using Git
Use of the Git binary itself is strongly discouraged unless it is unavoidable.
Please consider an alternative:
- go/gitiles provides an API for retrieving commit information, file contents,
git log, etc, via HTTP for repos hosted on Googlesource.
- go/gitstore provides a low-level interface for retrieving commit metadata by
time or index. This data is stored in BigTable and is ingested by the
`gitsync` app, which also sends PubSub messages for low-latency updates.
- go/vcsinfo/bt_vcs provides a similar interface for retrieving metadata but
adds caching and packages Gitiles into a common API.
- go/git/repograph provides a complete in-memory graph of a repository for fast
traversal. It loads data via go/gitstore and can be automatically updated via
PubSub.
- go/gerrit provides access to the Gerrit API, including uploading and
committing changes to repos which use Gerrit.
The following are valid reasons to use the Git binary itself:
- You need to do more complex write operations, e.g. merges.
- You need a full local checkout of some code, e.g. to compile and run tests, or
to run a script. Note that you can use go/gitiles to download a standalone
script, so a full checkout should not be necessary unless your use case
requires a large or changing set of files.
If you do need Git for your app, use the `base-cipd` Docker image, which
includes a pinned version of Git (as well as other tools). Do not install Git
via the package manager in your Docker image.
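
As an illustration of the "download a standalone script without a checkout" case, the sketch below reads one file over HTTP from a Gitiles server using only the standard library. It relies on Gitiles returning base64-encoded file contents for `?format=TEXT` requests. It is not the `go/gitiles` package API: `fileURL`, `readFile`, and `fakeGitiles` are hypothetical helpers, and `main` talks to a local fake server so that the example runs offline.

```go
package main

import (
	"encoding/base64"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// fileURL builds the Gitiles URL for the contents of path at ref.
func fileURL(repoURL, ref, path string) string {
	return fmt.Sprintf("%s/+/%s/%s?format=TEXT", repoURL, ref, path)
}

// readFile downloads a single file via Gitiles and decodes the base64 body.
func readFile(client *http.Client, repoURL, ref, path string) ([]byte, error) {
	resp, err := client.Get(fileURL(repoURL, ref, path))
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("reading %s: unexpected status %s", path, resp.Status)
	}
	return io.ReadAll(base64.NewDecoder(base64.StdEncoding, resp.Body))
}

// fakeGitiles is a stand-in server that serves the same contents for any
// path, base64-encoded, when format=TEXT is requested.
func fakeGitiles(contents string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Query().Get("format") != "TEXT" {
			http.Error(w, "unsupported format", http.StatusBadRequest)
			return
		}
		fmt.Fprint(w, base64.StdEncoding.EncodeToString([]byte(contents)))
	}))
}

func main() {
	srv := fakeGitiles("#!/bin/sh\necho hello\n")
	defer srv.Close()
	client := &http.Client{Timeout: 10 * time.Second}
	// The ref and path are made-up values for the example.
	script, err := readFile(client, srv.URL, "refs/heads/main", "tools/run.sh")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(string(script))
}
```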
## Launching
- Write/update design doc so that others understand how to use, maintain, and
improve your service. `DESIGN.md` typically has high level design structures
(e.g. where is data stored, how do the pieces of software interact, etc).
`PROD.md` has an overview of the alerts and any other notes for maintaining the
service.
- Do some back-of-the-envelope calculations to make sure your service can handle
the expected load. If your service is latency-sensitive, measure the latency
under load.
- Test on browsers that your users will be using, at least Chrome on desktop and
ideally Chrome on Android.
- Create an `app.yaml` in [k8s-config](https://skia.googlesource.com/k8s-config/+show/master/).
This controls how your app will be run in GKE. See
[these docs](https://kubernetes.io/docs/concepts/services-networking/connect-applications-service/)
for more on the schema. Commit this, then run `pushk appname` to make the configuration
active.
- Metrics are customarily made available at port 20000. To configure metrics scraping,
add the following to the `app.yaml` under spec -> template -> metadata:
```yaml
annotations:
  prometheus.io.scrape: "true"
  prometheus.io.port: "20000"
```
- Metrics will be available on [thanos-query.skia.org](https://thanos-query.skia.org/).
- The metrics will be labeled `app=<foo>` where `foo` is the first argument to
`common.InitWithMust`.
- If you have secrets (like a service account), bind it to the deployment by adding
the following to `app.yaml`:
```yaml
spec:
  automountServiceAccountToken: false
  ...
  containers:
    - name: my-container
      ...
      volumeMounts:
        - name: my-app-sa
          mountPath: /var/secrets/google
      env:
        - name: GOOGLE_APPLICATION_CREDENTIALS
          value: /var/secrets/google/key.json
  ...
  volumes:
    - name: my-app-sa
      secret:
        secretName: my-app
```
- If you use OAuth, it should be configured to use the `*.skia.org` cookie (the default).
Additionally, you will need to mount the secrets to use with your `login.Init*` code:
```yaml
spec:
  ...
  containers:
    - name: my-container
      ...
      volumeMounts:
        - name: skia-org-legacy-login-secrets
          mountPath: /etc/skia.org/
  ...
  volumes:
    - name: skia-org-legacy-login-secrets
      secret:
        secretName: skia-org-legacy-login-secrets
```
- It is possible to test your service/config without making it publicly visible:
  - Deploy your `app.yaml` with either `pushk` or `kubectl apply -f app.yaml`.
  - Identify a pod name with `kubectl get pods | grep my-app`, where `my-app` is the
    name of the new service.
  - Forward a local port (e.g. 8083) to a port on the pod (e.g. the HTTP port 8000):
    `kubectl port-forward my-app-7bf542629-jujzm 8083:8000`
  - Navigate a browser to <http://localhost:8083> to see your service.
- To make your service visible to the public when you have simple routing needs,
add a [`skia.org.domain` annotation to your Service
YAML](https://skia.googlesource.com/buildbot/+doc/refs/heads/master/skfe/README.md)
with the domain name and deploy your updated YAML with `kubectl apply`.
If your routing is more complicated, you can skip the YAML annotation and write
the routing rules directly into `infra/skfe/k8s/default.conf`.
Either way, you then push a new version of nginx-skia-org:
~~~
cd infra/skfe
make k8s_push
~~~
And watch that the new instances start running:
~~~
watch kubectl get pods -lapp=nginx-skia-org
~~~
- Add prober rules to `probers.json` in your application directory.
  - Ideally, probe all public HTML pages and all nullipotent JSON endpoints.
    You can write functions in `prober/go/prober/main.go` to check the
    response body if desired.
- Add additional stats gathering to your program using
`go.skia.org/infra/go/metrics2`, e.g. to ensure liveness/heartbeat of any
background processes.
- Add alert rules to [alerts_public](https://skia.googlesource.com/buildbot/+show/master/promk/prometheus/alerts_public.yml).
  The alerts may link to a production manual, `PROD.md`, checked into the
  application source directory. Examples:
  - All prober rules.
  - Additional stats from metrics2.

  Legacy apps have their alert rules in `prometheus/sys/alert.rules`.
- Some [general metrics](https://skia.googlesource.com/buildbot/+show/master/promk/prometheus/alerts_general.yml)
  apply to all apps and may not need to be added
  explicitly for your application, such as:
  - Too many goroutines.
  - Free disk space on the instance and any attached disks.

  This file is also for alerts that apply to the skia-public and skia-corp projects.
- Check your alert rules by running `make validate` in `promk/`
(legacy apps should run that command in `prometheus/`).
- Then, after landing your valid alerts, run `make push_config && make push_config_corp` in `promk/`
(again, legacy apps should do `make push` in `prometheus/`).
- Tell people about your new service.
- Be prepared for bug reports. :-)
## Continuous Deployment
Some apps are set up to be continuously re-built and re-deployed on every commit of Skia
or Skia Infra. To do that, see [docker_pushes_watcher/README.md](./docker_pushes_watcher/README.md).