[golden] Add a PodDisruptionBudget for baseline servers

Keeping baseline servers highly available is important because
many client CQ tasks depend on calling APIs that are served by
this service.

We already have a podAntiAffinity rule that prevents more than one
baseline-server of the same type from being scheduled on the same
node, so if a node goes offline we have redundancy.
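
For reference, a rule of that kind usually looks something like the
fragment below. This is only a sketch, not the rule from our configs:
the label key/value is a hypothetical placeholder, and it assumes a
hard ("required") rule keyed on the node hostname.

```yaml
# Fragment of a Deployment's pod template (illustrative only).
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          # Hard rule: the scheduler will not place a pod on a node that
          # already runs a pod matching the selector below.
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: example-baseline-server  # hypothetical label
              # One pod per distinct value of this node label, i.e. per node.
              topologyKey: kubernetes.io/hostname
```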

This CL adds a PodDisruptionBudget [1] which will prevent routine
maintenance (e.g. nodes being automatically upgraded or
intentionally drained) from taking down more than one replica of
a pod at a time.
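
A budget with this effect can be sketched roughly as follows. The
metadata name and the selector label are hypothetical placeholders, and
the use of maxUnavailable (rather than minAvailable) is an assumption;
see the file in this CL for the real values.

```yaml
# Illustrative PodDisruptionBudget; names and labels are placeholders.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-baseline-server-pdb  # hypothetical name
spec:
  # Voluntary disruptions (node drains, upgrades) may take down at most
  # one matching pod at a time.
  maxUnavailable: 1
  selector:
    matchLabels:
      app: example-baseline-server  # hypothetical label; must match the pods
```

With two or more replicas, a drain that would evict a second matching
pod is refused until the first replacement is ready again.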

This does not prevent simultaneous NodeDiskPressure events from
evicting multiple pods at the same time (they ignore these budgets),
but we have hopefully solved that by increasing the disk space
and (soon) having docker-pushes-watcher clean up after itself.

One misconception I had about evictions was that when k8s
evicts a pod, it starts up another replica on a different node and
waits for that second replica to start before terminating the
evicted pod. No - k8s sends the evicted pod into shutdown mode
right away so there is a gap in time before the second replica
is live.

While I would have liked to apply this to the Gold frontends as
well, those are not designed (yet) to be run with more than
one replica (although they are probably close). We cannot
have an effective PodDisruptionBudget unless the number of replicas
is greater than 1 (see misconception in the previous paragraph
which led me to optimistically hope for otherwise).

[1] https://kubernetes.io/docs/tasks/run-application/configure-pdb/

Change-Id: I853140320ac63c60d681e42791023147bc4231cf
Reviewed-on: https://skia-review.googlesource.com/c/buildbot/+/748436
Commit-Queue: Kevin Lubick <kjlubick@google.com>
Reviewed-by: Joe Gregorio <jcgregorio@google.com>
1 file changed
tree: 157bcf466ce9869a1f16e633c788392fcbd86ed7
README.md

Skia-Buildbot Repository

This repo contains infrastructure code for Skia.

Getting the Source Code

The main source code repository is a Git repository hosted at https://skia.googlesource.com/buildbot.git. It is possible to check out this repository directly with git clone or via go get.

Using git clone allows you to work in whatever directory you want. You will still need to set GOPATH in order to build some apps (recommended to put this in a cache dir). E.g.:

$ cd ${WORKDIR}
$ git clone https://skia.googlesource.com/buildbot.git
$ export GOPATH=${HOME}/.cache/gopath/$(basename ${WORKDIR})
$ mkdir -p $GOPATH
$ cd buildbot

Install dependencies

Almost all applications are built with Bazel, and bazelisk is the recommended tool to ensure you have the right version of bazel installed:

go install github.com/bazelbuild/bazelisk@latest
go install github.com/bazelbuild/buildtools/buildifier@latest
go install github.com/kisielk/errcheck@latest
go install golang.org/x/tools/cmd/goimports@latest
go install github.com/mikefarah/yq/v4@latest
go install go.chromium.org/luci/client/cmd/...@latest

Install other dependencies:

sudo apt-get install jq

Build ~everything

bazelisk build --config=mayberemote //...

Test everything

bazelisk test --config=mayberemote //...

Generated Code

To update generated code run the following in any directory:

go generate ./...

Running unit tests

Install Cloud SDK.

Use this command to run the presubmit tests:

./run_unittests --small