GitSync Production Manual

General Metrics

The following dashboard is for the skia-public instances:

The following dashboard is for the skia-corp instances:

Some things to look for:

  • Do goroutines or memory increase continuously (e.g., leaks)?
  • Have any repos taken more than a few seconds to sync?
  • Is there an elevated error rate?

General Logs

Logs for GitSync instances in skia-public/skia-corp are in the usual GKE container grouping, for example:


Items below here should include target links from alerts.


This alert means we haven't successfully synced a repo in over 5 minutes. This could be due to a failure to communicate with the Gitiles server, or to a problem with GitSync itself. Check the logs for details.

Key metrics: liveness_last_successful_git_sync_s
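When triaging, it can help to see which repos are past the threshold and by how much. The following is a minimal sketch, assuming the metric reports seconds since the last successful sync and that you have already exported one value per repo into a dictionary. The function name, the dictionary shape, and the repo names are hypothetical and are not part of GitSync.

```python
# Threshold from the alert description: 5 minutes, expressed in seconds.
STALE_AFTER_S = 5 * 60


def stale_repos(liveness_by_repo, threshold_s=STALE_AFTER_S):
    """Return (repo, age_s) pairs whose age exceeds the threshold, worst first.

    liveness_by_repo is assumed to map a repo name to the value of
    liveness_last_successful_git_sync_s for that repo (hypothetical shape).
    """
    stale = [
        (repo, age_s)
        for repo, age_s in liveness_by_repo.items()
        if age_s > threshold_s
    ]
    # Sort so the repo that has gone longest without a sync is listed first.
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample values, in seconds.
    samples = {"repo-a": 12, "repo-b": 410, "repo-c": 1900}
    for repo, age_s in stale_repos(samples):
        print(f"{repo}: last successful sync {age_s}s ago")
```

If only one repo is stale, the problem is more likely specific to that repo; if all of them are, suspect connectivity to Gitiles or GitSync itself.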


The log error rate is elevated. There are a number of possible causes; check the logs and verify that things are working as expected.

Key metrics: rate(num_log_lines{level="ERROR",app=~"gitsync.*"}[30m])
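To make the number in this alert easier to interpret, here is a simplified illustration of what a per-second rate over a window means for a counter such as num_log_lines. It is a sketch under stated assumptions, not how Prometheus is implemented: the real rate() also extrapolates to the window boundaries, which this version omits. The function name and sample values are hypothetical.

```python
def simple_rate(samples):
    """Approximate the per-second increase of a counter over a window.

    samples is a list of (timestamp_s, counter_value) pairs sorted by time.
    Returns None when there are too few samples to compute a rate.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        # A counter that drops has been reset (e.g. a pod restart),
        # so the new value is itself the increase since the reset.
        increase += value if value < prev else value - prev
        prev = value
    elapsed_s = samples[-1][0] - samples[0][0]
    return increase / elapsed_s if elapsed_s > 0 else None


if __name__ == "__main__":
    # Hypothetical samples: 90 new ERROR lines over 30 minutes (1800 s).
    window = [(0, 100), (600, 130), (1200, 160), (1800, 190)]
    print(simple_rate(window))
```

A steady rate that has always been nonzero usually points to a known noisy log line, while a sudden jump is a better starting point for searching the logs.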