
Today we're announcing support for Run:ai in the OpenMeter Collector. With this release, you can meter GPU, CPU, and memory allocation for Run:ai workloads and use that data for accurate billing and invoicing.
What is Run:ai?
Run:ai, now part of Nvidia, is a Kubernetes-based scheduler and resource manager built for GPU-intensive workloads. It lets organizations pool and allocate GPU resources across teams and environments. The platform is used to run large-scale AI training and inference across hybrid, multi-cloud, and on-prem setups.
The resource cost of AI infrastructure is significant. Organizations need a way to meter what's used, attribute it to customers or teams, and bill accurately. OpenMeter solves this.
What is the OpenMeter Collector?
The OpenMeter Collector is a standalone application that captures usage data from multiple sources in your infrastructure. It now supports Run:ai as a native integration.
It can ingest usage data from:
- Run:ai workloads and pods
- Kubernetes pods, storage, and network usage
- Prometheus metrics
- PostgreSQL, ClickHouse, and other databases
Once collected, this data is turned into billable usage events and processed through OpenMeter's platform. From there, you can price, invoice, and track revenue.
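As a rough illustration of what a billable usage event can look like, the sketch below wraps a single Run:ai GPU-allocation sample in a CloudEvents envelope, the event format OpenMeter ingests. The endpoint URL, API key, and the fields inside `data` are illustrative assumptions, not a fixed schema:

```python
import json
import uuid
from datetime import datetime, timezone
from urllib import request

# Hypothetical usage event: one GPU-allocation sample from a Run:ai
# workload, attributed to a tenant via the CloudEvents "subject" field.
event = {
    "specversion": "1.0",
    "id": str(uuid.uuid4()),
    "source": "runai-collector",
    "type": "gpu.allocation",
    "time": datetime.now(timezone.utc).isoformat(),
    "subject": "customer-42",  # tenant the usage is attributed to
    "data": {
        "gpu_type": "A100",
        "allocated_seconds": 3600,
        "project": "training-pipeline",
    },
}

def send(event, url="https://openmeter.example.com/api/v1/events", token="<API_KEY>"):
    """Post the event to an (assumed) OpenMeter ingest endpoint."""
    req = request.Request(
        url,
        data=json.dumps(event).encode(),
        headers={
            "Content-Type": "application/cloudevents+json",
            "Authorization": f"Bearer {token}",
        },
    )
    return request.urlopen(req)
```

Keeping tenant attribution in the envelope rather than the payload makes it easy to group, price, and invoice usage per customer downstream.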
What Metrics Can You Track?
OpenMeter collects detailed resource metrics from Run:ai, including:
- GPU Allocation Time: Total time GPUs are reserved for a workload, whether used or idle
- GPU Memory Usage: Memory reserved and consumed by each job or container
- Bandwidth (NVLink and PCIe): Data transfer to and from GPUs, useful for network cost recovery
- Multi-Tenant Attribution: Breakdown of usage by customer, team, or project
This helps with cost visibility, chargebacks, and metered billing.
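To make the multi-tenant attribution concrete, here is a minimal sketch that rolls raw per-pod allocation samples (seconds a GPU was reserved, whether used or idle) up into per-tenant GPU-hours. The sample records and field names are assumptions for illustration, not OpenMeter's schema:

```python
from collections import defaultdict

# Assumed raw samples: seconds of reserved GPU time per pod, tagged
# with the tenant and GPU type they should be attributed to.
samples = [
    {"tenant": "acme", "gpu_type": "A100", "allocated_seconds": 5400},
    {"tenant": "acme", "gpu_type": "A100", "allocated_seconds": 1800},
    {"tenant": "globex", "gpu_type": "L4", "allocated_seconds": 3600},
]

def gpu_hours_by_tenant(samples):
    """Sum allocated seconds per (tenant, GPU type) and convert to hours."""
    totals = defaultdict(float)
    for s in samples:
        totals[(s["tenant"], s["gpu_type"])] += s["allocated_seconds"] / 3600
    return dict(totals)

print(gpu_hours_by_tenant(samples))
# {('acme', 'A100'): 2.0, ('globex', 'L4'): 1.0}
```

Keying the rollup by GPU type as well as tenant is what lets you price an A100-hour differently from an L4-hour later in the pipeline.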
How to Price GPU Workloads
The way you charge for GPU usage depends on the workload. Here are a few common models:
- Per GPU type: create SKUs by GPU model (like A100 or L4) and price accordingly
- By allocation: bill for reserved GPU time, regardless of utilization
- By workload type: use one rate for training jobs and another for inference
- SLA-based: for inference workloads, offer a flat rate to reserve GPU capacity plus a variable rate for usage
Many vendors price by the second or minute based on allocated time. Others bill for network usage separately. OpenMeter supports both with its pricing catalog.
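Combining the first two models above, a per-GPU-type SKU billed by allocated time might look like the following sketch. The rates are made-up examples, and real pricing would live in the catalog rather than in code:

```python
# Hypothetical per-minute rates for two GPU SKUs.
RATES_PER_MINUTE = {"A100": 0.0667, "L4": 0.0117}

def allocation_charge(gpu_type, allocated_seconds):
    """Charge for reserved time, regardless of utilization."""
    minutes = allocated_seconds / 60
    return round(minutes * RATES_PER_MINUTE[gpu_type], 2)

# One hour of reserved A100 time:
print(allocation_charge("A100", 3600))  # 4.0
```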
How OpenMeter Helps
With OpenMeter and Run:ai, you can:
- Collect usage data at a fine-grained level
- Track GPU hours and other infrastructure costs
- Use a no-code catalog to price each resource
- Generate invoices automatically after usage
- Support enterprise contracts with tiered pricing or commitments
- Set tenant limits and usage thresholds
All of this runs in your infrastructure and integrates with your payment providers and CRM systems.
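The tenant-limit check mentioned above can be as simple as comparing metered usage against a contracted threshold. The limit values and function are illustrative assumptions:

```python
# Hypothetical contracted GPU-hours per billing period, per tenant.
LIMITS = {"acme": 100.0}

def over_limit(tenant, used_gpu_hours):
    """Return True if a tenant with a configured limit has exceeded it."""
    limit = LIMITS.get(tenant)
    return limit is not None and used_gpu_hours > limit

print(over_limit("acme", 120.5))  # True
```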