Intro to Observability as Code: Managing Dashboards with GitOps
Observability as code brings the same benefits we expect from infrastructure as code — versioning, reviewability, repeatability — to dashboards, alerting rules, and other observability configuration. Instead of clicking in a GUI to create panels, teams store dashboard definitions and related config in Git, let a GitOps controller reconcile those files into running systems, and treat dashboards as reviewable, auditable artifacts. This approach reduces drift, makes rollbacks trivial, and integrates dashboard changes into standard CI/CD workflows. (grafana.com)
Why this matters
- Dashboards describe how your team sees production. Treating them as code makes that view explicit and reviewable.
- Git history captures who changed what and why, which is invaluable when a chart or alert is adjusted during an incident.
- Automated reconciliation (a GitOps controller such as ArgoCD or Flux) keeps the live Grafana instance consistent with the repo — eliminating configuration drift and manual toil. (grafana.com)
Key building blocks
- Dashboard manifests: JSON or a higher-level representation (Jsonnet, YAML CRDs, Terraform resources).
- Data sources: declared in code so dashboards don’t break when moved between environments.
- A GitOps controller (ArgoCD or Flux): watches the repo and reconciles changes into the cluster or the Grafana API.
- A provisioning layer: either Grafana’s provisioning (files/ConfigMaps), the Grafana Operator CRDs, or API-based tooling that creates dashboards in Grafana. (grafana.github.io)
Patterns for managing dashboards with GitOps
Below are patterns you’ll see in real-world projects, and when to use each.
- File provisioning + sidecar
- Store dashboard JSON in a repo. Use a ConfigMap or a file-provisioning mechanism to mount JSON into a Grafana pod (often via the official Helm chart with a dashboard sidecar).
- Good for teams already deploying Grafana via Helm and who want simple file-based sync.
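With the official Helm chart, the sidecar discovers dashboards through a label on ConfigMaps. A minimal sketch, assuming the chart's default label name (`grafana_dashboard`); check your chart values if you've customized `sidecar.dashboards.label`:

```yaml
# ConfigMap carrying one dashboard for the Grafana Helm chart's sidecar.
# The sidecar watches for ConfigMaps with this label and mounts their
# JSON into Grafana's provisioning directory.
apiVersion: v1
kind: ConfigMap
metadata:
  name: k8s-cluster-overview
  namespace: monitoring
  labels:
    grafana_dashboard: "1"
data:
  k8s-cluster-overview.json: |
    { "title": "Kubernetes cluster overview", "panels": [] }
```

A GitOps controller applies the ConfigMap like any other manifest; no Grafana API access is needed from CI.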
- Operator CRDs (Grafana Operator)
- Use custom resources (GrafanaDashboard, GrafanaDataSource, etc.) that the Grafana Operator reconciles into Grafana.
- Clean Kubernetes-native model with RBAC and namespaces. Works well when you run multiple Grafana instances in clusters. (grafana.github.io)
- API-driven GitOps (ArgoCD/Flux + tooling)
- Use a controller to detect changes in Git and call Grafana’s API (or grafanactl/CLI) to push dashboards and datasources.
- This model decouples dashboard lifecycle from a particular Kubernetes deployment and can operate against Grafana Cloud or managed Grafana instances. (grafana.com)
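As a sketch of the push step, the snippet below wraps raw dashboard JSON in the envelope Grafana's `POST /api/dashboards/db` endpoint expects and sends it with a service-account token. The wrapper functions are illustrative, not a specific tool's API; verify the endpoint against your Grafana version's HTTP API docs:

```python
import json
import urllib.request

def build_dashboard_payload(dashboard: dict, overwrite: bool = True) -> dict:
    """Wrap raw dashboard JSON in the save envelope used by
    Grafana's POST /api/dashboards/db endpoint."""
    body = dict(dashboard)
    body.setdefault("id", None)  # let Grafana match by uid instead of id
    return {"dashboard": body, "overwrite": overwrite}

def push_dashboard(grafana_url: str, token: str, dashboard: dict) -> None:
    """Push one dashboard; intended to run from CI after merge to main."""
    req = urllib.request.Request(
        f"{grafana_url}/api/dashboards/db",
        data=json.dumps(build_dashboard_payload(dashboard)).encode(),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # service-account token
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.status)

if __name__ == "__main__":
    # Inspect the payload shape without touching a live Grafana.
    payload = build_dashboard_payload({"uid": "k8s-overview", "title": "K8s"})
    print(payload["overwrite"], payload["dashboard"]["title"])
```

Setting `"overwrite": true` makes the push idempotent, which is what a reconciling controller wants: Git wins over whatever is currently in Grafana.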
A minimal repo layout (example)
- dashboards/
  - k8s-cluster-overview.json
  - app-metrics.json
- datasources/
  - prometheus.yaml
- grafana/
  - grafanadashboard-k8s.yaml # CRD or reconciler manifest
  - kustomize/ or argocd-app.yaml
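The `datasources/prometheus.yaml` entry might look like the following Grafana datasource provisioning file (the in-cluster Prometheus URL is an assumption; adjust it per environment):

```yaml
# datasources/prometheus.yaml -- Grafana datasource provisioning file.
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: http://prometheus.monitoring.svc:9090  # assumed in-cluster address
    isDefault: true
```

Because the datasource name is fixed in code, dashboards that reference it by name keep working when the stack is redeployed in another environment.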
Sample GrafanaDashboard CRD (simplified)
apiVersion: integreatly.org/v1alpha1
kind: GrafanaDashboard
metadata:
  name: k8s-cluster-overview
  namespace: monitoring
spec:
  json: |
    {
      "title": "Kubernetes cluster overview",
      "panels": [
        {
          "type": "graph",
          "title": "CPU usage",
          "targets": [{ "expr": "sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)" }]
        }
      ]
    }
This CRD-style manifest can be applied by a GitOps controller and reconciled by the Grafana Operator into a running dashboard. The exact CRD group/version may vary by operator release. (grafana.github.io)
Choosing a GitOps controller: ArgoCD vs Flux
- ArgoCD:
- Strong UI for visualizing application sync status and diffs.
- Works well with operators (Grafana Operator) and Helm/Kustomize app patterns.
- Many Grafana docs and examples show integration with ArgoCD for dashboard workflows. (grafana.com)
- Flux:
- Lightweight, Kubernetes-native, and often chosen for simple declarative workflows.
- Can manage HelmRelease, Kustomize, and plain manifests; pairing Flux with a sidecar/configmap pattern or an operator is common. Recent community guides show teams using ConfigMaps and Flux to manage dashboards. (oneuptime.com)
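For the ArgoCD route, a minimal Application manifest pointing at the repo's `grafana/` directory might look like this (the repo URL is hypothetical; `selfHeal` is what reverts out-of-band UI edits back to Git):

```yaml
# argocd-app.yaml -- hypothetical ArgoCD Application for dashboard manifests.
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: grafana-dashboards
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/observability-config  # hypothetical
    targetRevision: main
    path: grafana
  destination:
    server: https://kubernetes.default.svc
    namespace: monitoring
  syncPolicy:
    automated:
      prune: true     # delete dashboards removed from Git
      selfHeal: true  # revert manual cluster edits back to Git state
```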
A simple GitOps workflow (high level)
- Author dashboard JSON (or Jsonnet/YAML/CRD) in a branch.
- Open a pull request describing intent and queries changed.
- Run CI checks (linting, schema validation, and rendering tests).
- Merge to main — the GitOps controller detects the change and reconciles it into the cluster or pushes it to Grafana.
- Monitor the controller’s sync status and dashboard health; if something breaks, revert the commit and the controller rolls back the running config.
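The CI step in this workflow can start very small. A hypothetical GitHub Actions job that only checks that each dashboard file parses as JSON (workflow name and paths are assumptions):

```yaml
# .github/workflows/dashboards.yaml -- hypothetical CI check for PRs.
name: validate-dashboards
on:
  pull_request:
    paths: ["dashboards/**"]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Check dashboard JSON parses
        run: |
          for f in dashboards/*.json; do
            python -m json.tool "$f" > /dev/null || exit 1
          done
```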
Testing and CI for dashboards
- Lint and validate JSON against Grafana schema (many projects use schema validators, jsonnet fmt, or custom scripts).
- Snapshot tests: render a dashboard and compare selected metric queries or panel counts to expected shapes.
- Dry-run: for API-driven pushes, put a dry-run mode in the CI job to preview what would change in Grafana.
- Add small acceptance tests that import the dashboard into a disposable Grafana instance (GitHub Actions/CI job) to ensure the dashboard loads without errors.
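As a starting point for the lint step, a small Python script can check that each dashboard file parses and meets a minimal contract. The required keys and per-panel checks below are assumptions for illustration, not Grafana's full schema:

```python
import json
import sys
from pathlib import Path

# Assumed minimal contract for our repo's dashboards, not Grafana's schema.
REQUIRED_KEYS = {"title", "panels"}

def validate(path: Path) -> list[str]:
    """Return a list of human-readable problems for one dashboard file."""
    errors = []
    try:
        dash = json.loads(path.read_text())
    except json.JSONDecodeError as exc:
        return [f"{path}: invalid JSON ({exc})"]
    if not isinstance(dash, dict):
        return [f"{path}: top level must be a JSON object"]
    for key in REQUIRED_KEYS - dash.keys():
        errors.append(f"{path}: missing top-level key '{key}'")
    for i, panel in enumerate(dash.get("panels", [])):
        if not panel.get("title"):
            errors.append(f"{path}: panel {i} has no title")
        if panel.get("type") != "row" and not panel.get("targets"):
            errors.append(f"{path}: panel {i} has no query targets")
    return errors

if __name__ == "__main__":
    problems = [e for p in Path("dashboards").glob("*.json") for e in validate(p)]
    print("\n".join(problems) or "all dashboards valid")
    if problems:
        sys.exit(1)
```

Failing the CI job on any problem keeps obviously broken dashboards from ever reaching the reconciler.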
Tooling and templating options
- JSON: standard and compatible, but verbose.
- Jsonnet / grafonnet: programmatic dashboard generation, useful when you need many similar dashboards.
- Terraform: many teams use Terraform Grafana provider for cloud-managed Grafana, treating dashboards as Terraform resources.
- grafanactl and Grafana provisioning APIs: CLI tools that help push dashboards from pipelines. Grafana’s recent evolution has also introduced changes to the dashboard schema and CLI tooling that make programmatic management easier; newer versions decouple layout from panel configuration to improve readability and reusability. (plushcap.com)
Best practices (practical, battle-tested)
- Keep dashboards small and focused: one clear purpose per dashboard reduces cognitive load and makes reviews easier.
- Version test fixtures: store sample queries and a small set of synthetic metrics to validate panels in CI.
- Separate environment configs: use different folders or branches for staging vs production, or parameterize datasources.
- DRY for repeated panels: template common panels (e.g., CPU, memory) using Jsonnet or generators rather than copying JSON blobs.
- Namespaces and RBAC: when using the operator CRDs, leverage Kubernetes namespaces and RBAC to give teams scoped control over their dashboards.
- Document panel intent in PRs: include short notes explaining why queries changed — this is the single most useful thing for on-call teams when investigating incidents.
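The DRY advice above can use plain generators instead of Jsonnet: a script emits panel dicts and assembles them into dashboard JSON. The helper below is hypothetical, but it shows the shape of the approach:

```python
def resource_panel(resource: str, expr: str, grid_y: int) -> dict:
    """Generate one timeseries panel dict; a hypothetical helper that
    plays the role a Jsonnet/grafonnet template would."""
    return {
        "type": "timeseries",
        "title": f"{resource} usage",
        "gridPos": {"h": 8, "w": 12, "x": 0, "y": grid_y},
        "targets": [{"expr": expr}],
    }

# Each repeated panel is one call instead of a copied JSON blob.
PANELS = [
    resource_panel("CPU", "sum(rate(container_cpu_usage_seconds_total[5m])) by (pod)", 0),
    resource_panel("Memory", "sum(container_memory_working_set_bytes) by (pod)", 8),
]

dashboard = {"title": "App resource overview", "panels": PANELS}
```

A query change then lands in one place and propagates to every generated dashboard on the next CI run.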
Handling drift, secrets, and multi-tenant setups
- Drift: GitOps controllers continuously reconcile, but you should still alert on repeated drift; recurring drift usually means someone keeps changing Grafana via the UI.
- Secrets: keep API keys and datasource credentials in sealed secrets or a secret store (do not commit raw API keys). Use the operator or controller patterns that support external secret references.
- Multi-tenant Grafana: map teams to folders or separate Grafana instances. When you have many teams, prefer CRD/operator or isolated Grafana instances per team to limit blast radius.
Common pitfalls and how teams mitigate them
- Dynamic/templated panels that reference external variables can be hard to import as code. Mitigation: centralize variable definitions and include them in the same repo.
- Large monolithic dashboard JSONs: break into panels and reference them via templating.
- Unreviewed GUI edits: lock down Grafana permissions and require that changes be made via Git. Use read-only for most users.
- Rate limits and quotas: if you push dashboards programmatically to a hosted Grafana (Cloud), be mindful of API rate limits when syncing many dashboards; batch your changes or use the operator pattern that reconciles incrementally.
A short checklist before you adopt
- Do you have a Git repo structure for observability config? If not, create one and add a README explaining where dashboards live.
- Can you run basic validation in CI? Add at least JSON schema validation and a lint step.
- Decide on a provisioning strategy (ConfigMap/sidecar, operator CRDs, or API pushes) that matches your deployment model (Kubernetes-native vs managed Grafana).
- Lock down the Grafana UI for write access to a small group; treat Git as the source of truth.
- Add monitoring for your GitOps controller’s sync status so you can spot broken reconciliations quickly. (grafana.com)
Closing thoughts
Managing dashboards with GitOps reframes them from ephemeral GUI state into first-class, versioned artifacts. The immediate wins — reproducible dashboards, audit trails, and simple rollbacks — are obvious. The bigger win is cultural: you change dashboard edits from “I clicked something” to “I opened a PR,” which invites review, sanity checking, and better collaboration across teams.
If you’re starting, pick one dashboard or folder, move it into a repo, add a simple CI check, and let a GitOps controller reconcile it. The pattern scales: teams that adopt observability as code find fewer surprises in production and a much clearer history of why metric views changed over time. (grafana.com)