Managing observability dashboards with GitOps: an intro to observability-as-code
Dashboards are the backstage console of modern systems — they’re where SREs, developers, and product owners tune into how an app is performing. Yet many teams still edit dashboards by hand, paste JSON into a UI, or keep a single editable “gold” dashboard in production. Observability-as-code brings the same benefits we expect from infrastructure-as-code to dashboards: versioning, code review, repeatable deployments, and safe rollbacks.
This article walks through why GitOps is a natural fit for dashboards, the recent tooling choices that make it practical today, a small example of what a GitOps layout looks like, and common gotchas to watch for.
Why treat dashboards like code?
- Version history and review: diffs and pull requests make layout and query changes auditable and reviewable by others, avoiding “it looked fine on my laptop” edits.
- Reproducibility: the same dashboard artifacts can be deployed to staging and prod so teams see consistent telemetry.
- Automation and safety: CI can validate dashboard JSON, run linting, and prevent malformed queries from reaching production.
- Multi-environment promotion: promote a dashboard change the same way you promote app config — through branches and GitOps syncs.
Think of dashboards like a music playlist. You want every listener (environment) to have the same curated tracks in the same order, but you also want the option to test a new cut in a private playlist before promoting it to everyone.
Recent developments that matter
Grafana, still the dashboard platform of choice in many shops, has been pushing hard on “observability as code” primitives. Recent releases introduced a Git Sync workflow (where Grafana can synchronize dashboard changes with a Git repo and open PRs directly), a new resource-oriented API model, and a redesigned dashboard schema that is friendlier for diffs and composability. These changes make both in-Grafana Git workflows and external GitOps pipelines more practical. (grafana.com)
If you run Grafana on Kubernetes, the Grafana Operator exposes dashboards as Kubernetes custom resources (GrafanaDashboard). That makes it straightforward to manage dashboards with GitOps tools like Argo CD or Flux by storing the manifests in Git and letting the GitOps controller reconcile the cluster to the repo state. The Grafana docs and operator examples contain concrete CRD manifests you can drop into a repository. (grafana.com)
Finally, there are complementary approaches: Terraform and SDKs (e.g., Grafana Foundation SDK) let you generate or manage dashboards from code and wire them into CI/CD pipelines for automated provisioning. These are useful when dashboards are generated from service metadata or when you want stronger type-safety. (grafana.com)
Patterns for managing dashboards with GitOps
There are three common patterns you’ll see in the wild — choose the one that fits your team’s workflow and risk tolerance.
1) In-Grafana Git workflows (Git Sync)
- Dashboards are edited in Grafana’s UI, and the changes are saved to a configured Git repo as branches/PRs. This lowers the barrier for teams that prefer UI-first editing while still adding Git review.
- Pros: low friction for creators; preserves visual editing experience.
- Cons: the integration’s maturity varies by Grafana version and deployment, and it may still be marked experimental in yours; you still need CI/linting if you want automated checks. (grafana.com)
2) Kubernetes-native GitOps (Grafana Operator + Argo CD / Flux)
- Dashboards live as GrafanaDashboard CRs in a Git repo; Argo CD or Flux applies those manifests to your cluster; the operator reconciles them into the Grafana instance.
- Pros: cloud-native, declarative, fits standard GitOps practices; good for multi-tenant clusters or multi-environment promotion. (grafana.com)
3) CI/CD pipeline provisioning (Foundation SDK, Terraform, or GitHub Actions)
- Dashboards are authored in code (TypeScript, Jsonnet, HCL); CI converts them to dashboard JSON or API calls and applies them via the Grafana API or Terraform provider.
- Pros: powerful when dashboards are generated or templated; can run linting, automated tests, and image previews as part of PRs. (grafana.com)
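To make the third pattern concrete, here is a minimal sketch of generating dashboard JSON from service metadata. It uses plain Python dictionaries rather than the Foundation SDK or Terraform, so it illustrates the idea, not either tool's API. The function names, the metric names, the `service` label, and the `<service>-overview` UID convention are all assumptions made for illustration.

```python
import json


def build_dashboard(service: str, metrics: list[str]) -> dict:
    """Build a minimal dashboard dict for one service (small subset of the schema)."""
    panels = []
    for index, metric in enumerate(metrics):
        panels.append({
            "id": index + 1,
            "type": "timeseries",
            "title": f"{service}: {metric}",
            # Two panels per row on Grafana's 24-column grid.
            "gridPos": {"x": (index % 2) * 12, "y": (index // 2) * 8, "w": 12, "h": 8},
            # Assumes a Prometheus-style datasource and a "service" label.
            "targets": [{"refId": "A", "expr": f'{metric}{{service="{service}"}}'}],
        })
    return {
        "id": None,
        # Hypothetical naming convention: a UID derived from the service name
        # stays the same in every environment.
        "uid": f"{service}-overview",
        "title": f"{service} Overview",
        "panels": panels,
        "time": {"from": "now-1h", "to": "now"},
    }


def render(dashboard: dict) -> str:
    """Serialize with sorted keys and fixed indentation so Git diffs stay small."""
    return json.dumps(dashboard, indent=2, sort_keys=True) + "\n"


if __name__ == "__main__":
    # Hypothetical metric names, for illustration only.
    print(render(build_dashboard("app1", ["http_requests_total", "http_request_duration_seconds"])))
```

A CI job could write the rendered output into the repository or hand it to whichever apply step you use. Rendering deterministically matters more than it looks: if key order changes between runs, every pull request shows noise instead of the real change.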
A minimal GitOps layout (example)
Here’s a tiny, realistic repository layout for the Kubernetes-native approach:
- repo-root/
  - clusters/
    - prod/
      - grafana/
        - grafana-dashboard-apps.yaml
        - dashboards/
          - app1-dashboard.yaml
          - app2-dashboard.yaml
    - staging/
      - grafana/
        - grafana-dashboard-apps.yaml
        - dashboards/
          - app1-dashboard.yaml
A simplified GrafanaDashboard CR looks like this; the operator examples use apiVersion grafana.integreatly.org/v1beta1:
```yaml
apiVersion: grafana.integreatly.org/v1beta1
kind: GrafanaDashboard
metadata:
  name: app1-overview
  namespace: observability
spec:
  instanceSelector:
    matchLabels:
      dashboards: "grafana"
  json: >
    {
      "id": null,
      "title": "App1 Overview",
      "panels": [],
      "time": {"from":"now-1h","to":"now"}
    }
```
Argo CD (or Flux) will sync the cluster folder to apply the manifest; the grafana-operator reconciles the CR into a dashboard inside Grafana. The operator supports several content sources: inline JSON, gzip-compressed JSON, remote URLs, or Jsonnet projects — which gives flexibility for generated dashboards. (grafana.github.io)
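If your dashboards start life as exported or generated JSON, wrapping them by hand gets tedious. The following is a minimal sketch of a helper that builds a manifest shaped like the sample CR above, using only the fields that sample shows. The function name and its parameters are hypothetical, and the output is emitted as JSON (which kubectl accepts for manifests) to avoid depending on a YAML library.

```python
import json


def to_grafana_dashboard_cr(dashboard: dict, name: str, namespace: str, instance_label: str) -> dict:
    """Wrap dashboard JSON in a GrafanaDashboard manifest (inline JSON source only)."""
    return {
        "apiVersion": "grafana.integreatly.org/v1beta1",
        "kind": "GrafanaDashboard",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "instanceSelector": {"matchLabels": {"dashboards": instance_label}},
            # As in the sample CR, the dashboard is embedded as a JSON string,
            # not as a nested object.
            "json": json.dumps(dashboard, indent=2, sort_keys=True),
        },
    }


if __name__ == "__main__":
    dashboard = {
        "id": None,
        "title": "App1 Overview",
        "panels": [],
        "time": {"from": "now-1h", "to": "now"},
    }
    manifest = to_grafana_dashboard_cr(dashboard, "app1-overview", "observability", "grafana")
    print(json.dumps(manifest, indent=2))
```

Whether you commit the wrapped manifest or generate it in CI is a team choice. Committing it keeps the repository an exact record of what the GitOps controller applies.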
Best practices and gotchas
- Pick a single source of truth: If a dashboard is managed by GitOps (operator, Terraform, or API-driven), avoid manual edits in the Grafana UI, because the next sync can overwrite them or create a conflict. Make the repository authoritative and document it for your team. Operator docs explain how the operator maps resource UIDs to Grafana dashboards to prevent duplicate resources. (grafana.github.io)
- Beware of UIDs and folder handling: Grafana assigns a UID to every dashboard, and the operator handles dashboards whose JSON has no UID (it can propagate the Kubernetes metadata UID of the resource). If you hardcode UIDs, keep them stable across environments. Folders can be controlled by the CR or by folder references; check the operator docs for recommended patterns. (grafana.github.io)
- Validate before you sync: Use CI to run JSON linting and basic checks (missing queries, syntax errors) and, where possible, run a dry-run/preview. If you use in-Grafana Git Sync, validate PR changes because the Git workflow makes it easy to push edits quickly. (grafana.com)
- Handle secrets and tokens carefully: Grafana API tokens, cloud instance tokens, and datasource credentials should be stored in secrets and referenced securely (Kubernetes Secrets for the operator, or secret stores for CI). Avoid baking credentials into dashboard manifests.
- Track schema changes: Grafana’s newer API and v2 dashboard schema are designed to be friendlier for diffs and dynamic composition, but schema changes can affect tooling and operators. If you depend on specific dashboard JSON shapes or on a Terraform provider built against a certain API, document and test migrations before rolling out across environments. (grafana.com)
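The "validate before you sync" advice can be sketched as a small check that CI runs over raw dashboard JSON files. This is one possible set of rules, not an official linter: the function names are hypothetical, the list of panel types allowed to have no queries is an assumption, and treating a missing `uid` as an error is a deliberately stricter policy than the operator requires.

```python
import json
from pathlib import Path


def check_dashboard(text: str) -> list[str]:
    """Return a list of problems found in one dashboard JSON document."""
    try:
        dashboard = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON: {err.msg} (line {err.lineno})"]
    if not isinstance(dashboard, dict):
        return ["top-level value is not an object"]
    problems = []
    if not dashboard.get("title"):
        problems.append("missing title")
    # Team policy (assumption): require an explicit uid so identity is stable.
    if not dashboard.get("uid"):
        problems.append("missing uid")
    for panel in dashboard.get("panels") or []:
        # Assumption: rows and text panels legitimately have no queries.
        if panel.get("type") in ("row", "text"):
            continue
        if not panel.get("targets"):
            problems.append(f"panel {panel.get('title', '?')!r} has no queries")
    return problems


def main(root: str) -> int:
    """Check every *.json file under root; return 1 if any problem was found."""
    failed = False
    for path in sorted(Path(root).rglob("*.json")):
        for problem in check_dashboard(path.read_text(encoding="utf-8")):
            print(f"{path}: {problem}")
            failed = True
    return 1 if failed else 0
```

In a pipeline, a thin wrapper would pass the return value of `main("dashboards")` to `sys.exit` so that a failing check blocks the merge. Dashboards embedded in CR manifests would need to be extracted first, which this sketch does not attempt.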
When to pick which pattern
- If your team prefers visual editing and you want to lower friction: in-Grafana Git workflows can be a good bridge — but note platform maturity and whether the feature is experimental in your deployment. (grafana.com)
- If you already use Kubernetes and a GitOps controller: Grafana Operator + Argo CD/Flux fits naturally and yields a clean, declarative lifecycle. (grafana.com)
- If dashboards are generated from code or service metadata: SDKs or Terraform combined with CI make automated, templated dashboards straightforward. (grafana.com)
A pragmatic closing note
Treating dashboards as code is about bringing discipline and reproducibility to a part of observability that often lives outside the rest of your engineering governance. Newer features and APIs from popular tooling are lowering the friction to adopt GitOps for dashboards, but every team will need to balance editability, review, and safety. Use automation where it helps (linting, previews, PR checks), and keep a single source of truth so your visualizations remain reliable and consistent across environments. (grafana.com)
References (selected)
- Grafana: Git Sync feature and observability-as-code overview. (grafana.com)
- Grafana docs: Manage dashboards with GitOps using ArgoCD / Grafana Operator. (grafana.com)
- Grafana docs: Automate dashboard provisioning with CI/CD (Foundation SDK and GitHub Actions). (grafana.com)
- Grafana Operator documentation and examples (GrafanaDashboard CRD). (grafana.github.io)
Enjoy the rhythm of observability-as-code — like a well-arranged playlist, good dashboards should play nicely everywhere.