Building a DigitalOcean Cost Explorer with Prometheus & Grafana

How I built a custom cost exporter for DigitalOcean using Prometheus and Grafana to fill a key monitoring gap and enable data-driven cloud cost savings at Avail.

2024-12-15 · 2 min read
#devops #prometheus #grafana #cost-optimization #digitalocean

The Problem

While I was at Avail, we ran a significant amount of infrastructure on DigitalOcean. Unlike AWS, which has Cost Explorer, DigitalOcean had no built-in tool for granular cost monitoring. We needed visibility into where our money was going: per droplet, per service, per team.

Without this visibility, it was impossible to:

  • Identify underutilized resources
  • Attribute costs to specific services
  • Set budgets and alerts for cost overruns
  • Make data-driven scaling decisions

The Solution

I built a custom Prometheus exporter that pulls billing data from the DigitalOcean API and exposes it as metrics. Combined with Grafana dashboards, this gave us real-time cost visibility across our entire infrastructure.
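
As a rough illustration of the "pull data from the API" step, here is a minimal sketch that turns a droplet listing into per-droplet monthly costs. It assumes the exporter reads the list price from the `size.price_monthly` field that DigitalOcean's `GET /v2/droplets` endpoint returns for each droplet. The function name `droplet_monthly_costs` and the sample payload are hypothetical, and the real exporter may derive costs differently (for example, from invoices).

```python
from typing import Any


def droplet_monthly_costs(payload: dict[str, Any]) -> dict[str, float]:
    """Map droplet name -> monthly list price in USD.

    Assumes `payload` has the shape of a DigitalOcean `GET /v2/droplets`
    response, where each droplet carries a nested `size` object.
    """
    costs: dict[str, float] = {}
    for droplet in payload.get("droplets", []):
        size = droplet.get("size") or {}
        price = size.get("price_monthly")
        if price is None:
            continue  # skip droplets that carry no pricing information
        costs[droplet["name"]] = float(price)
    return costs


# Hypothetical sample payload, trimmed to the fields used above.
sample = {
    "droplets": [
        {"name": "api-1", "size": {"slug": "s-2vcpu-4gb", "price_monthly": 24.0}},
        {"name": "worker-1", "size": {"slug": "s-4vcpu-8gb", "price_monthly": 48.0}},
    ]
}
print(droplet_monthly_costs(sample))
```

Keeping the parsing in a pure function like this makes it easy to test without network access; the HTTP call that fetches the payload can live in a thin wrapper around it.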

Architecture

DigitalOcean API → Cost Exporter (Python) → Prometheus → Grafana
                                                          ↓
                                                    Alertmanager → Slack/PagerDuty
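
To make the exporter stage of this pipeline concrete, the sketch below renders costs in the Prometheus text exposition format and serves them on `/metrics` using only the standard library. It is not the production code: the `droplet` label name, the placeholder cost data, and the port are assumptions for illustration. In practice the `prometheus_client` library is the more common way to do this; the standard library is used here only to keep the sketch dependency-free.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(droplet_costs: dict[str, float]) -> str:
    """Render per-droplet costs in the Prometheus text exposition format."""
    name = "digitalocean_droplet_cost_monthly"
    lines = [
        f"# HELP {name} Monthly list price per droplet in USD.",
        f"# TYPE {name} gauge",
    ]
    for droplet, cost in sorted(droplet_costs.items()):
        # Label values must escape backslashes, double quotes, and newlines.
        label = (
            droplet.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        lines.append(f'{name}{{droplet="{label}"}} {cost}')
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    # Placeholder data; a real exporter would refresh this from the API.
    costs = {"api-1": 24.0, "worker-1": 48.0}

    def do_GET(self) -> None:
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(self.costs).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8000) -> None:
    """Block forever, serving /metrics for Prometheus to scrape."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

Calling `serve()` starts the endpoint; Prometheus then scrapes it on its own schedule, so the exporter never pushes data anywhere.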

Key Metrics Exported

  • digitalocean_droplet_cost_monthly — per-droplet monthly cost
  • digitalocean_kubernetes_cost_monthly — per-cluster K8s costs
  • digitalocean_spaces_cost_monthly — object storage costs
  • digitalocean_bandwidth_cost — data transfer costs
  • digitalocean_total_projected_cost — projected monthly spend
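
The post does not say how the projected figure is computed. One simple possibility, sketched here as an assumption, is a linear extrapolation of month-to-date spend (for instance, a value such as the `month_to_date_usage` field of DigitalOcean's balance endpoint). The function name `project_monthly_cost` is hypothetical.

```python
import calendar
from datetime import date


def project_monthly_cost(month_to_date: float, today: date) -> float:
    """Linearly extrapolate month-to-date spend to the full month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    # Treat the current day as fully elapsed, so on day 1 a single
    # day's spend is projected across the whole month.
    daily_rate = month_to_date / today.day
    return round(daily_rate * days_in_month, 2)


print(project_monthly_cost(1500.0, date(2024, 12, 15)))  # 3100.0
```

A linear projection is noisy in the first few days of a month and ignores resources created or destroyed mid-month, so any alert threshold built on it probably needs some headroom.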

The Impact

After deploying this solution, we were able to:

  1. Identify $400K in annual savings by finding overprovisioned droplets and idle resources
  2. Set up automated alerts when projected costs exceeded thresholds
  3. Attribute costs to specific teams and services
  4. Make informed decisions about when to migrate workloads
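
The first item, finding overprovisioned droplets, can be sketched as a join between cost and utilization data. The example below is illustrative only: the `DropletUsage` type, the 10% CPU threshold, and the sample fleet are assumptions, not figures from the actual analysis.

```python
from dataclasses import dataclass


@dataclass
class DropletUsage:
    name: str
    monthly_cost: float      # USD
    avg_cpu_percent: float   # average CPU utilization over the review window


def rightsizing_candidates(
    droplets: list[DropletUsage],
    cpu_threshold: float = 10.0,
) -> list[DropletUsage]:
    """Return droplets below the CPU threshold, most expensive first."""
    idle = [d for d in droplets if d.avg_cpu_percent < cpu_threshold]
    return sorted(idle, key=lambda d: d.monthly_cost, reverse=True)


# Hypothetical fleet used only to demonstrate the function.
fleet = [
    DropletUsage("api-1", 24.0, 55.0),
    DropletUsage("batch-old", 96.0, 2.5),
    DropletUsage("staging-db", 48.0, 4.0),
]
for d in rightsizing_candidates(fleet):
    print(f"{d.name}: ${d.monthly_cost:.2f}/month at {d.avg_cpu_percent:.1f}% CPU")
```

Sorting by cost puts the largest potential savings at the top of the list, which is usually where a review should start. CPU alone is a crude signal, so memory and I/O are worth checking before resizing anything.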

Lessons Learned

  • Build for observability first — you can't optimize what you can't measure
  • Start with the metrics that matter — don't try to export everything at once
  • Automate the response — alerts are useless without runbooks

The exporter is now running in production and has become a core part of our infrastructure monitoring stack.


Check out the source code on GitHub and the detailed write-up on my blog.