
Today we're announcing support for Run:ai in the OpenMeter Collector. With this release, you can meter GPU, CPU, and memory allocation for Run:ai workloads and use that data for accurate billing and invoicing.
What is Run:ai?
Run:ai, now part of Nvidia, is a Kubernetes-based scheduler and resource manager built for GPU-intensive workloads. It lets organizations pool and allocate GPU resources across teams and environments. The platform is used to run large-scale AI training and inference across hybrid, multi-cloud, and on-prem setups.
The resource cost of AI infrastructure is significant. Organizations need a way to meter what's used, attribute it to customers or teams, and bill accurately. OpenMeter solves this.
What is the OpenMeter Collector?
The OpenMeter Collector is a standalone application that captures usage data from multiple sources in your infrastructure. It now supports Run:ai as a native integration.
It can ingest usage data from:
- Run:ai workloads and pods
- Kubernetes pods, storage, and network usage
- Prometheus metrics
- PostgreSQL, ClickHouse, and other databases
Once collected, this data is turned into billable usage events and processed through OpenMeter's platform. From there, you can price, invoice, and track revenue.
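As a rough illustration of what a billable usage event can look like, the sketch below wraps a single Run:ai GPU-allocation sample in a CloudEvents envelope, the event format OpenMeter ingests. The endpoint URL, API key, and the fields inside `data` are illustrative assumptions, not a fixed schema:

```python
import json
import uuid
from datetime import datetime, timezone
from urllib import request

# Hypothetical usage event: one GPU-allocation sample from a Run:ai
# workload, attributed to a tenant via the CloudEvents "subject" field.
event = {
    "specversion": "1.0",
    "id": str(uuid.uuid4()),
    "source": "runai-collector",
    "type": "gpu.allocation",
    "time": datetime.now(timezone.utc).isoformat(),
    "subject": "customer-42",  # tenant the usage is attributed to
    "data": {
        "gpu_type": "A100",
        "allocated_seconds": 3600,
        "project": "training-pipeline",
    },
}

def send(event, url="https://openmeter.example.com/api/v1/events", token="<API_KEY>"):
    """Post the event to an (assumed) OpenMeter ingest endpoint."""
    req = request.Request(
        url,
        data=json.dumps(event).encode(),
        headers={
            "Content-Type": "application/cloudevents+json",
            "Authorization": f"Bearer {token}",
        },
    )
    return request.urlopen(req)
```

Keeping tenant attribution in the envelope rather than the payload makes it easy to group, price, and invoice usage per customer downstream.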
What Metrics Can You Track?
OpenMeter collects detailed resource metrics from Run:ai, including:
- GPU Allocation Time: Total time GPUs are reserved for a workload, whether used or idle
- GPU Memory Usage: Memory reserved and consumed by each job or container
- Bandwidth (NVLink and PCIe): Data transfer to and from GPUs, useful for network cost recovery
- Multi-Tenant Attribution: Breakdown of usage by customer, team, or project
This helps with cost visibility, chargebacks, and metered billing.
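To make the multi-tenant attribution concrete, here is a minimal sketch that rolls raw per-pod allocation samples (seconds a GPU was reserved, whether used or idle) up into per-tenant GPU-hours. The sample records and field names are assumptions for illustration, not OpenMeter's schema:

```python
from collections import defaultdict

# Assumed raw samples: seconds of reserved GPU time per pod, tagged
# with the tenant and GPU type they should be attributed to.
samples = [
    {"tenant": "acme", "gpu_type": "A100", "allocated_seconds": 5400},
    {"tenant": "acme", "gpu_type": "A100", "allocated_seconds": 1800},
    {"tenant": "globex", "gpu_type": "L4", "allocated_seconds": 3600},
]

def gpu_hours_by_tenant(samples):
    """Sum allocated seconds per (tenant, GPU type) and convert to hours."""
    totals = defaultdict(float)
    for s in samples:
        totals[(s["tenant"], s["gpu_type"])] += s["allocated_seconds"] / 3600
    return dict(totals)

print(gpu_hours_by_tenant(samples))
# {('acme', 'A100'): 2.0, ('globex', 'L4'): 1.0}
```

Keying the rollup by GPU type as well as tenant is what lets you price an A100-hour differently from an L4-hour later in the pipeline.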
How to Price GPU Workloads
The way you charge for GPU usage depends on the workload. Here are a few common models:
- Per GPU type: create SKUs by GPU model (like A100 or L4) and price accordingly
- By allocation: bill for reserved GPU time, regardless of utilization
- By workload type: use one rate for training jobs and another for inference
- SLA-based: for inference workloads, offer a flat rate to reserve GPU capacity plus a variable rate for usage
Many vendors price by the second or minute based on allocated time. Others bill for network usage separately. OpenMeter supports both with its pricing catalog.
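Combining the first two models above, a per-GPU-type SKU billed by allocated time might look like the following sketch. The rates are made-up examples, and real pricing would live in the catalog rather than in code:

```python
# Hypothetical per-minute rates for two GPU SKUs.
RATES_PER_MINUTE = {"A100": 0.0667, "L4": 0.0117}

def allocation_charge(gpu_type, allocated_seconds):
    """Charge for reserved time, regardless of utilization."""
    minutes = allocated_seconds / 60
    return round(minutes * RATES_PER_MINUTE[gpu_type], 2)

# One hour of reserved A100 time:
print(allocation_charge("A100", 3600))  # 4.0
```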
How OpenMeter Helps
With OpenMeter and Run:ai, you can:
- Collect usage data at a fine-grained level
- Track GPU hours and other infrastructure costs
- Use a no-code catalog to price each resource
- Generate invoices automatically after usage
- Support enterprise contracts with tiered pricing or commitments
- Set tenant limits and usage thresholds
All of this runs in your infrastructure and integrates with your payment providers and CRM systems.
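The tenant-limit check mentioned above can be as simple as comparing metered usage against a contracted threshold. The limit values and function are illustrative assumptions:

```python
# Hypothetical contracted GPU-hours per billing period, per tenant.
LIMITS = {"acme": 100.0}

def over_limit(tenant, used_gpu_hours):
    """Return True if a tenant with a configured limit has exceeded it."""
    limit = LIMITS.get(tenant)
    return limit is not None and used_gpu_hours > limit

print(over_limit("acme", 120.5))  # True
```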