Intro to Observability as Code: Managing Dashboards with GitOps
Observability dashboards are as important as the metrics and logs they visualize, but they are easy to lose track of when they are edited by hand in a running instance. Treating dashboards as code and managing them via GitOps brings repeatability, reviewability, and a clear lifecycle: a change lands in Git, CI validates it, and an automated deploy applies it to Grafana. Recent improvements across the Grafana ecosystem and supporting tools have made this pattern practical for small teams and platforms alike. (grafana.com)
Why treat dashboards as code?
- Version control: every change has an audit trail and diffable history.
- Peer review: dashboards can be improved through pull requests and code reviews.
- Reproducibility: the same dashboard definitions produce the same UI across environments.
- Declarative drift control: GitOps tools ensure the running Grafana matches what’s declared in Git.
Common GitOps patterns for dashboards
- Dashboard JSON in Git: export a dashboard as Grafana JSON and store it in a repo. A CI step can validate JSON and run schema checks before it’s applied. (grafana.github.io)
- Kubernetes CRs + Operator: on Kubernetes-first platforms, teams use the Grafana Operator and the GrafanaDashboard custom resource to store dashboard JSON (inline, gzipped, or referenced by URL) in Git and let Argo CD or Flux sync it into clusters. This pattern lets you manage dashboards with the same GitOps tooling you use for apps. (grafana.github.io)
- Client-side tooling: tools such as Grizzly let you keep dashboards as files locally (JSON, Jsonnet) and push them to Grafana via the API during CI or developer workflows, giving a non-Kubernetes path to GitOps-style management. (grafana.github.io)
- Templating/DRY: generate dashboards from templates using Jsonnet (grafonnet) or other templating tools to avoid duplicating panels when provisioning many similar dashboards (e.g., one per service or SLO). Note: the Jsonnet libraries for Grafana are still evolving, so check which ones your tooling currently supports before committing to one. (wasilzafar.com)
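The templating idea can be illustrated without Jsonnet. The sketch below is plain Python, not Grafonnet: hypothetical helpers `make_panel` and `make_service_dashboard` build one dashboard per service from shared panel definitions. The metric name `http_requests_total`, the `service` label, and the uid scheme are assumptions chosen for illustration.

```python
import json


def make_panel(panel_id: int, title: str, expr: str) -> dict:
    """Return a minimal time-series panel with a single Prometheus-style query."""
    return {
        "id": panel_id,
        "type": "timeseries",
        "title": title,
        "targets": [{"refId": "A", "expr": expr}],
    }


def make_service_dashboard(service: str) -> dict:
    """Build one dashboard per service from the same panel templates."""
    selector = f'service="{service}"'  # assumed label name
    return {
        "id": None,  # let each Grafana instance assign its own numeric id
        "uid": f"{service}-overview",  # assumed uid scheme
        "title": f"{service} Overview",
        "panels": [
            make_panel(1, "Request rate",
                       f"sum(rate(http_requests_total{{{selector}}}[5m]))"),
            make_panel(2, "Error rate",
                       f'sum(rate(http_requests_total{{{selector},code=~"5.."}}[5m]))'),
        ],
    }


def render_all(services: list[str]) -> dict[str, str]:
    """Map output file name to rendered JSON; sorted keys keep Git diffs stable."""
    return {
        f"{svc}.json": json.dumps(make_service_dashboard(svc), indent=2, sort_keys=True)
        for svc in services
    }


if __name__ == "__main__":
    for name, body in render_all(["checkout", "payments"]).items():
        print(name, len(body), "bytes")
```

Rendering with sorted keys and fixed indentation is a deliberate choice here: it keeps the generated JSON byte-stable between runs, so a pull request only shows lines that really changed.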
Tools you’ll see in the wild
- Grafana Git Sync / Git-based workflows: Grafana has been adding native Git-oriented features (e.g., Git Sync) to bridge interactive editing and Git-managed dashboards. These features are making it easier to treat Git as the source of truth for dashboard definitions. (grafana.com)
- Grafana Operator: exposes GrafanaDashboard, GrafanaDatasource and related CRDs; ideal when Grafana runs inside Kubernetes and you want operators to reconcile Git-stored manifest files into Grafana. The operator supports inline JSON, gzipped JSON, remote URLs, and more. (grafana.github.io)
- Grizzly: a CLI crafted for “observability as code” workflows — it can render templates and push dashboards, datasources, and alerts to Grafana via the API, and is useful when you prefer to manage Grafana outside Kubernetes. (grafana.github.io)
- Jsonnet / Grafonnet: libraries and templates for generating Grafana dashboard JSON programmatically. They’re useful for DRY dashboards, but check which Jsonnet variants and libraries your operator or tooling expects. (wasilzafar.com)
- Argo CD / Flux: GitOps controllers that synchronize Kubernetes resources (including Grafana CRs) from Git into a cluster. Use these to automate deployment of operator-managed dashboards. (grafana.co.za)
A compact GitOps workflow for dashboards (high level)
- Author: create or update dashboard JSON or template in a feature branch.
- CI validation: run lint/JSON schema validation, and if using templates (Jsonnet/Grizzly), render to final JSON and validate panel queries and variables.
- Pull request review: team reviews UI and queries (the PR shows the real JSON diff so reviewers can see exact changes).
- Merge → GitOps sync: Argo CD / Flux or a CI job applies manifests (GrafanaDashboard CRs or an API push via Grizzly). The operator or the CI job reconciles the Git state into Grafana.
- Observability checks: confirm the dashboard appears, panels load data, and variables resolve.
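For the API-driven variant of the merge step, a CI job ultimately sends the dashboard JSON to Grafana's HTTP API. The sketch below is a hand-rolled illustration of that push, not how Grizzly is implemented: a hypothetical `build_push_request` helper prepares a POST to the `/api/dashboards/db` endpoint using only the standard library. The base URL and token are placeholders that a real job would read from its own secrets.

```python
import json
import urllib.request


def build_push_request(base_url: str, token: str, dashboard: dict,
                       message: str) -> urllib.request.Request:
    """Build (but do not send) a POST to Grafana's dashboard create/update endpoint."""
    payload = {
        "dashboard": dashboard,
        "overwrite": True,   # the Git version replaces whatever is in Grafana
        "message": message,  # recorded in the dashboard's version history
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/dashboards/db",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values; nothing is sent until urlopen() is called on the request.
    req = build_push_request(
        "https://grafana.example.com/",
        "PLACEHOLDER_TOKEN",
        {"id": None, "uid": "example-service", "title": "Example Service Overview", "panels": []},
        "Deploy from Git",
    )
    print(req.get_method(), req.full_url)
```

In a pipeline, the request would be sent with `urllib.request.urlopen(req)` after validation has passed. Building the request separately from sending it keeps the payload easy to unit-test without a running Grafana.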
Example: GrafanaDashboard CR (simplified)
    apiVersion: grafana.integreatly.org/v1beta1
    kind: GrafanaDashboard
    metadata:
      name: example-service-dashboard
    spec:
      instanceSelector:
        matchLabels:
          dashboards: "grafana"
      json: >
        {
          "id": null,
          "title": "Example Service Overview",
          "panels": [ ... ]
        }
This is the pattern used by the Grafana Operator: put the JSON (or a URL/gzipped JSON) in a CR and let the operator create or update the dashboard in Grafana. The operator supports several input modes so you can choose the representation that best fits your repo layout. (grafana.github.io)
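If dashboards are kept as plain JSON files, the CR wrapper can be generated rather than written by hand. The following sketch assumes a hypothetical helper, `to_grafana_dashboard_cr`, that embeds a dashboard dict in a manifest with the same fields as the example above. It prints the manifest as JSON, which Kubernetes tooling accepts alongside YAML; the label values are illustrative.

```python
import json


def to_grafana_dashboard_cr(name: str, dashboard: dict, instance_labels: dict) -> dict:
    """Wrap a dashboard dict in a GrafanaDashboard manifest (inline JSON mode)."""
    return {
        "apiVersion": "grafana.integreatly.org/v1beta1",
        "kind": "GrafanaDashboard",
        "metadata": {"name": name},
        "spec": {
            "instanceSelector": {"matchLabels": instance_labels},
            # The CR stores the dashboard as a string, so serialize it here.
            "json": json.dumps(dashboard, indent=2, sort_keys=True),
        },
    }


if __name__ == "__main__":
    dashboard = {"id": None, "title": "Example Service Overview", "panels": []}
    manifest = to_grafana_dashboard_cr(
        "example-service-dashboard", dashboard, {"dashboards": "grafana"}
    )
    print(json.dumps(manifest, indent=2))
```

Generating the wrapper in CI keeps the repository's source files as plain dashboard JSON, which tends to diff more cleanly than JSON embedded in a YAML string.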
Validating dashboards in CI
- JSON schema checks: verify that the exported JSON parses and roughly matches the Grafana dashboard schema.
- Query checks: optionally run lightweight query checks (e.g., test that Prometheus queries run or return expected labels) to detect broken data-source references.
- Snapshot preview: render a JSON diff or a human-readable summary in the PR so reviewers can see what’s changing without opening Grafana.
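A first validation pass does not need the full Grafana schema. The sketch below is a hand-picked set of structural checks, not an official schema: a hypothetical `validate_dashboard` function confirms that the file parses, has a title, carries a null `id`, and uses unique panel ids. The null-`id` rule is a team convention assumed here, matching the CR example earlier in the article.

```python
import json


def validate_dashboard(raw: str) -> list[str]:
    """Return human-readable problems; an empty list means the file passed."""
    try:
        dash = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg} at line {exc.lineno}"]
    if not isinstance(dash, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    title = dash.get("title")
    if not isinstance(title, str) or not title.strip():
        problems.append("missing or empty 'title'")
    if dash.get("id") is not None:
        # Assumed convention: numeric ids are instance-specific, so keep them null in Git.
        problems.append("'id' should be null in Git-managed dashboards")

    panels = dash.get("panels")
    if not isinstance(panels, list):
        problems.append("'panels' must be a list")
        return problems

    seen_ids = set()
    for index, panel in enumerate(panels):
        panel_id = panel.get("id") if isinstance(panel, dict) else None
        if panel_id is None:
            problems.append(f"panel {index} has no 'id'")
        elif panel_id in seen_ids:
            problems.append(f"panel {index} reuses id {panel_id}")
        else:
            seen_ids.add(panel_id)
    return problems


if __name__ == "__main__":
    sample = '{"id": null, "title": "Example", "panels": [{"id": 1}, {"id": 1}]}'
    for problem in validate_dashboard(sample):
        print("FAIL:", problem)
```

A CI job could run this over every dashboard file in the repository and exit non-zero if any list is non-empty. Query checks against a real data source would be a separate, slower stage.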
Common pitfalls and how teams avoid them
- Drift from manual edits: if developers edit dashboards directly in Grafana, the Git repo can drift. Adopt a policy that changes to Git-managed dashboards must come through the repository (Grafana features like Git Sync help bridge interactive editing and Git workflows). (grafana.com)
- Secrets & data sources: dashboards reference data sources, which in turn depend on credentials. Manage data sources and secrets declaratively (GrafanaDatasource CRs, external secret operators, or API-driven provisioning) so dashboards don’t break when moved between environments. (grafana.github.io)
- Templating complexity: overly clever templates can make diffs hard to review. Keep templates readable and ensure rendered JSON is part of CI artifacts so reviewers can see the final result. (wasilzafar.com)
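Drift from manual edits can also be detected mechanically by comparing the JSON declared in Git with the JSON Grafana currently serves. The sketch below assumes both documents are already loaded as dicts; fetching the live copy (for example through Grafana's dashboard-by-uid API) is left out. The helper names, the list of volatile fields, and the choice to match panels by title (which assumes titles are unique within a dashboard) are illustrative assumptions.

```python
import copy

# Assumption: top-level fields that Grafana rewrites on save and that should
# not count as drift.
VOLATILE_KEYS = ("id", "version")


def normalize(dashboard: dict) -> dict:
    """Return a copy of the dashboard without instance-specific fields."""
    clean = copy.deepcopy(dashboard)
    for key in VOLATILE_KEYS:
        clean.pop(key, None)
    return clean


def summarize_drift(declared: dict, live: dict) -> list[str]:
    """List differences between the Git version and the running version."""
    declared, live = normalize(declared), normalize(live)
    # Panels are matched by title, which assumes titles are unique.
    declared_panels = {p.get("title", ""): p for p in declared.get("panels", [])}
    live_panels = {p.get("title", ""): p for p in live.get("panels", [])}

    lines = []
    for title in sorted(live_panels.keys() - declared_panels.keys()):
        lines.append(f"panel only in Grafana: {title}")
    for title in sorted(declared_panels.keys() - live_panels.keys()):
        lines.append(f"panel only in Git: {title}")
    for title in sorted(declared_panels.keys() & live_panels.keys()):
        if declared_panels[title] != live_panels[title]:
            lines.append(f"panel differs: {title}")

    rest_declared = {k: v for k, v in declared.items() if k != "panels"}
    rest_live = {k: v for k, v in live.items() if k != "panels"}
    if rest_declared != rest_live:
        lines.append("dashboard-level settings differ")
    return lines


if __name__ == "__main__":
    in_git = {"id": None, "title": "T", "panels": [{"id": 1, "title": "Latency"}]}
    in_grafana = {"id": 42, "version": 9, "title": "T",
                  "panels": [{"id": 1, "title": "Latency"}, {"id": 2, "title": "Scratch"}]}
    for line in summarize_drift(in_git, in_grafana):
        print(line)
```

The same summary can be posted as a pull request comment when comparing two Git revisions, which gives reviewers a readable overview before they open the raw JSON diff.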
Why this approach scales
- Single source of truth: your Git repo becomes the auditable definition of what observability looks like across environments.
- Consistent onboarding: new environments are provisioned by applying manifests; dashboards, datasources, alerts follow the same lifecycle as services.
- Safer changes: PR-based workflows and CI validation reduce the chance a dashboard change breaks alerts or panels in production.
Conclusion
Managing dashboards with GitOps turns what’s often a manual, ad-hoc process into a repeatable, reviewable, and automatable workflow. The Grafana ecosystem now supports both operator-driven (Kubernetes) and API-driven (client-side) flows: tools like the Grafana Operator, Grizzly, Grafana’s Git Sync, and templating libraries make it practical to treat dashboards as first-class code. Pick the workflow that matches your platform (Kubernetes-first or not), keep rendered JSON easy to inspect in PRs, and validate dashboards in CI so Git truly becomes the single source of truth. (grafana.com)