Kubernetes

OpenMeter includes a Kubernetes integration that meters pod execution time and reports it to OpenMeter. Measuring execution time is useful if you want to bill based on CPU/GPU time or calculate unit economics for your Kubernetes containers.

Installation

The simplest method for installing the collector is through Helm:

export OPENMETER_TOKEN=om_...
helm install --wait --create-namespace \
  --namespace openmeter-collector \
  --set preset=kubernetes-pod-exec-time \
  --set openmeter.token=${OPENMETER_TOKEN} \
  openmeter-collector oci://ghcr.io/openmeterio/helm-charts/benthos-openmeter

With the default settings, the collector will report on pods running in the default namespace to OpenMeter every 15 seconds.
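To verify the installation, check that the collector pod is running (the release name and namespace below are the ones used in the install command above):

kubectl get pods --namespace openmeter-collector

# Tail the collector's logs (this assumes the chart applies the standard
# app.kubernetes.io/instance label to its pods)
kubectl logs --namespace openmeter-collector \
  -l app.kubernetes.io/instance=openmeter-collector --follow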

Take a look at this example to see the collector in action.

Start Metering

To start measuring Kubernetes pod execution time, create a meter in OpenMeter:

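If you run OpenMeter yourself, the meter can be declared in OpenMeter's config.yaml (in OpenMeter Cloud, create it on the dashboard instead). Here is a minimal sketch; the slug and the groupBy keys are illustrative, while the event type and value property match what the collector reports:

meters:
  - slug: pod_exec_time                # illustrative slug
    description: Pod execution time
    eventType: kube-pod-exec-time      # matches the collector's event type
    aggregation: SUM
    valueProperty: $.duration_seconds  # reported by the collector
    groupBy:
      pod_name: $.pod_name
      pod_namespace: $.pod_namespace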

Configuration

The collector accepts several configuration values as environment variables:

  • SCRAPE_NAMESPACE (default: default): The namespace to scrape. Currently only a single namespace is supported; see this issue.
  • SCRAPE_INTERVAL (default: 15s): The scrape interval, in Go duration format; the minimum interval is 1 second.
  • BATCH_SIZE (default: 20): The minimum number of events to collect before reporting them to OpenMeter in a single batch. Set to 0 to disable batching.
  • BATCH_PERIOD (default: none): The maximum duration to wait before reporting the current batch to OpenMeter, in Go duration format.
  • DEBUG (default: false): If set to true, every reported event is logged to stdout.

These values can be set in the Helm chart's values.yaml file using env or envFrom. For more details, refer to the Helm chart reference.
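For example (a sketch; whether env accepts a key-value map, as assumed here, or a Kubernetes-style name/value list is defined by the chart, so check the reference):

# values.yaml
preset: kubernetes-pod-exec-time

env:
  SCRAPE_NAMESPACE: team-a   # illustrative namespace
  SCRAPE_INTERVAL: 30s
  BATCH_PERIOD: 1m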

Mapping

The collector maps information from each pod to CloudEvents according to the following rules:

  • The event type is set to kube-pod-exec-time.
  • The event source is set to kubernetes-api.
  • The subject name is mapped from the value of the openmeter.io/subject annotation. It falls back to the pod name if the annotation is not present.
  • Pod execution time is mapped to duration_seconds in the data section.
  • Pod name and namespace are mapped to pod_name and pod_namespace, respectively, in the data section.
  • Annotations prefixed with data.openmeter.io/ are mapped to the part after the prefix (KEY) in the data section; the built-in data attributes above take precedence over values coming from annotations.
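For example, a pod annotated as follows (the subject value and the model attribute are illustrative) produces events with subject customer-1 and an extra model attribute in the data section, alongside the built-in duration_seconds, pod_name, and pod_namespace:

apiVersion: v1
kind: Pod
metadata:
  name: inference-worker
  namespace: default
  annotations:
    # Becomes the event subject (falls back to the pod name if omitted)
    openmeter.io/subject: customer-1
    # Becomes the "model" attribute in the event's data section
    data.openmeter.io/model: llama-3-8b
spec:
  containers:
    - name: worker
      image: ghcr.io/example/inference-worker:latest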

Performance tuning

The ideal performance tuning options for this collector depend on the specific use case and the context in which it is being used. For instance, reporting on a large number of pods infrequently requires different settings than reporting on a few pods frequently. Additionally, accuracy and real-time requirements might influence the selection of appropriate options.

The primary factors influencing performance that you can adjust are SCRAPE_INTERVAL, BATCH_SIZE, and BATCH_PERIOD.

A lower SCRAPE_INTERVAL yields more accurate information about pod execution time, but it also generates more events, leading to more requests to OpenMeter. To mitigate this, you can raise BATCH_SIZE and/or set (or increase) BATCH_PERIOD. This reduces the number of requests to OpenMeter, and with it the potential for back-pressure. The trade-off is that your more accurate data reaches OpenMeter with a delay, so it is less real-time.
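For a rough sense of scale (the numbers are illustrative): scraping 100 pods every 15 seconds produces 400 events per minute; with BATCH_SIZE=20 that is at most 20 requests per minute to OpenMeter, and setting BATCH_PERIOD=1m caps the additional reporting delay at one minute.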

Managing a large number of pods typically requires a higher SCRAPE_INTERVAL to avoid overburdening the Kubernetes API and OpenMeter. A higher SCRAPE_INTERVAL, though, might lead to less accurate metering.

For additional options to fine-tune performance, please consult the Advanced configuration section.

Advanced configuration

The Kubernetes collector utilizes Benthos to gather pod information, convert it to CloudEvents, and reliably transmit it to OpenMeter.

The configuration file for the collector is available here.

To tailor the configuration to your needs, you can edit this file and mount it to the collector container. This can be done using the config or configFile options in the Helm chart. For more details, refer to the Helm chart reference.
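For example (a sketch that assumes the chart's config option accepts the Benthos configuration inline; see the Helm chart reference for the exact shape):

# values.yaml
config:
  pipeline:
    processors:
      - mapping: |
          # custom mapping goes here; see the example in Additional use cases
          root = this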

For additional tips on performance fine-tuning, consult the Benthos documentation.

Additional use cases

Capture metadata (like CPU and memory limits)

A common use case for monetizing workloads running on Kubernetes is to charge based on resource consumption. One way to do that is to include resource limits of containers in the ingested data.

Although it requires a custom configuration, it's straightforward to achieve with the Kubernetes collector:

input:
  # just use the defaults
 
pipeline:
  processors:
    - mapping: |
        root = {
          "id": uuid_v4(),
          "specversion": "1.0",
          "type": "kube-pod-exec-time",
          "source": "kubernetes-api",
          "time": meta("schedule_time"),
          "subject": this.metadata.annotations."openmeter.io/subject".or(this.metadata.name),
          "data": this.metadata.annotations.filter(item -> item.key.has_prefix("data.openmeter.io/")).map_each_key(key -> key.trim_prefix("data.openmeter.io/")).assign({
            "pod_name": this.metadata.name,
            "pod_namespace": this.metadata.namespace,
            "duration_seconds": (meta("schedule_interval").parse_duration() / 1000 / 1000 / 1000).round().int64(),
            "memory_limit": this.spec.containers.index(0).resources.limits.memory,
            "memory_requests": this.spec.containers.index(0).resources.requests.memory,
            "cpu_limit": this.spec.containers.index(0).resources.limit.cpu,
            "cpu_requests": this.spec.containers.index(0).resources.requests.cpu,
          }),
        }
    - json_schema:
        schema_path: 'file://./cloudevents.spec.json'
    - catch:
        - log:
            level: ERROR
            message: 'Schema validation failed due to: ${!error()}'
        - mapping: 'root = deleted()'
 
output:
  # just use the defaults

Here is the important section from the above configuration:

  "memory_limit": this.spec.containers.index(0).resources.limits.memory,
  "memory_requests": this.spec.containers.index(0).resources.requests.memory,
  "cpu_limit": this.spec.containers.index(0).resources.limit.cpu,
  "cpu_requests": this.spec.containers.index(0).resources.requests.cpu,

It adds the memory and CPU limits and requests to the data section of the CloudEvent. You can tweak the configuration to include other resource limits and requests or resource info for multiple containers as needed.
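For instance, here is a sketch of capturing limits for every container instead of only the first (the containers data attribute is illustrative; Bloblang's map_each iterates over the array):

  "containers": this.spec.containers.map_each(container -> {
    "name": container.name,
    "cpu_limit": container.resources.limits.cpu,
    "memory_limit": container.resources.limits.memory,
  }),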