At OpenMeter, we're privileged to work alongside exceptional teams, helping them monetize their innovative products and features. Our collaborations have painted a clear picture of what leading companies seek to meter. This article explores the most popular metering use cases we've encountered.
LLMs Run on Tokens
The launch of ChatGPT by OpenAI has set a precedent: charging based on token usage for generative AI, a model driven by both customer value and cost. Token count not only reflects the amount of information processed by foundation models but also correlates with operating costs. Given the high per-token cost across various LLMs, token-based billing provides a margin safety net when utilizing APIs from OpenAI, Anthropic, or other LLM providers.
Recommendations for Token Usage Metering
Considering the variation in charges across different prompt types and models, we suggest grouping token usage by prompt type (input, output, and system) and model version (gpt-3.5, gpt-4).
Example meter definition with OpenMeter:
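A sketch of what such a meter could look like in OpenMeter's YAML configuration. The `prompt` event type and the `$.tokens`, `$.type`, and `$.model` payload fields are illustrative assumptions, not a prescribed schema:

```yaml
# Illustrative meter: sums token counts, grouped by prompt type and model.
# Event type and payload field names are assumptions for this example.
meters:
  - slug: tokens_total
    description: AI token usage
    eventType: prompt
    aggregation: SUM
    valueProperty: $.tokens
    groupBy:
      type: $.type    # input, output, or system
      model: $.model  # e.g. gpt-3.5, gpt-4
```

With this grouping, you can price input, output, and system tokens differently per model version without defining a separate meter for each combination.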
GPUs Power AI
GPUs power AI workloads and model training everywhere, with demand ultimately contributing to the 2023 GPU shortage. OpenAI reportedly used 10,000 Nvidia GPUs to train its models. It is no surprise that GPU time is expensive and must be metered accurately. We see a growing number of OpenMeter users metering GPU time at one-second granularity for monetization and cost-control use cases.
Recommendations for Metering GPU Time
We advocate for heartbeat-style metering, where the process periodically reports its status, simplifying implementation and avoiding the complexities of start-stop log analysis, which is error-prone and especially complex for long-running processes that overlap with billing period changes. You can read more about execution time metering in our previous blog post.
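To illustrate the heartbeat approach, each periodic report could be a CloudEvent carrying the GPU seconds elapsed since the previous report. The event type and the `seconds` and `gpu_type` data fields are assumptions for this sketch:

```json
{
  "specversion": "1.0",
  "type": "gpu-time",
  "id": "c7b7f2d0-0000-0000-0000-000000000000",
  "source": "training-job",
  "subject": "customer-1",
  "time": "2024-01-01T00:01:00Z",
  "data": {
    "seconds": 60,
    "gpu_type": "a100"
  }
}
```

Because each heartbeat is an independent, additive delta, a crashed process simply stops reporting, and a billing-period boundary falls cleanly between two events rather than splitting a long start-stop interval.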
Example meter definition with OpenMeter:
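A sketch of a matching meter in OpenMeter's YAML configuration, summing the reported seconds per heartbeat. The `gpu-time` event type and the `$.seconds` and `$.gpu_type` payload fields are illustrative assumptions:

```yaml
# Illustrative meter: sums GPU seconds from heartbeat events,
# grouped by GPU type. Field names are assumptions for this example.
meters:
  - slug: gpu_seconds
    description: GPU execution time
    eventType: gpu-time
    aggregation: SUM
    valueProperty: $.seconds
    groupBy:
      gpu_type: $.gpu_type  # e.g. a100, h100
```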
Multi-Tenancy and Cloud Cost
With cloud costs significantly impacting COGS, companies prioritize metering their priciest resources. This includes attributing compute, storage, and network usage within multi-tenant setups to respective consumers and teams. Examples include metering Kubernetes pod runtime, database storage, and ingested data volume.
Recommendations to meter multi-tenant resources
Given the complexity of modern systems, we recommend focusing on the one or two consumption metrics that best align with costs, rather than trying to meter every aspect of a distributed system. For example, you might meter storage and query usage for a database instead of metering every component involved in running it. This is efficient because the costs of most components move together anyway: backup cost, for instance, correlates with storage cost.
Example meter definition with OpenMeter:
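A sketch of a database storage meter in OpenMeter's YAML configuration. The `storage` event type and the `$.gb_hours` payload field are illustrative assumptions; the tenant would typically be carried in the event's subject:

```yaml
# Illustrative meter: sums periodically reported storage consumption.
# Event type and payload field name are assumptions for this example.
meters:
  - slug: storage_gb_hours
    description: Database storage consumption
    eventType: storage
    aggregation: SUM
    valueProperty: $.gb_hours
```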
APIs Are Here to Stay
Billing based on API usage remains a popular pricing model. In serverless architectures, this extends to measuring the duration of API calls.
Recommendations to meter API calls
We recommend masking path parameters in your metering strategy to manage dataset cardinality and keep it human-readable. For example, it is much easier to search for /products/:product_id instead of having 10,000 different endpoint paths due to various product IDs.
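Path masking can be done with a simple substitution step before events are emitted. The sketch below is a hypothetical helper, assuming routes like /products/{id} and /users/{id}; real applications would usually read the route template from their router instead of maintaining regexes by hand:

```python
import re

def mask_path(path: str) -> str:
    """Collapse raw URL paths into route templates so the metering
    dataset's cardinality stays bounded and human-readable."""
    # Replace variable segments with named placeholders.
    path = re.sub(r"/products/[^/]+", "/products/:product_id", path)
    path = re.sub(r"/users/[^/]+", "/users/:user_id", path)
    return path

print(mask_path("/products/8f14e45f/reviews"))  # /products/:product_id/reviews
```

The masked path then becomes a low-cardinality groupBy dimension on an API-call meter, so 10,000 product IDs roll up into a single /products/:product_id row.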
Summary
In 2024, we see a rising need to meter AI resources such as LLMs and GPUs. Companies adopting AI features need tighter cost control, which drives the adoption of more granular usage-based pricing models. By implementing accurate metering, companies can ensure fair billing, optimize costs, and gain insight into customer behavior.