S3

OpenMeter can ingest data from S3-compatible object storage, which makes it easier to integrate with existing data pipelines.

This guide will show you how to collect data from an S3-compatible object store and ingest it into OpenMeter.

Prerequisites

There are several strategies for collecting data from an S3-compatible object store, and the right one depends on how the data is stored. Since we cannot cover every scenario, this guide provides a solution for the most common one: batches of data indexed by timestamp.

If your data is not structured that way, check out the aws_s3 Redpanda Connect input documentation for more options.

Configuration

First, create a new YAML file for the collector configuration. You will have to use the aws_s3 Redpanda Connect input:

input:
  aws_s3:
    bucket: my-bucket
    region: us-east-1
    prefix: ${!timestamp_unix().ts_round("1h".parse_duration()).ts_unix()}/

The above section tells Redpanda Connect to read data from your S3 bucket in the specified region and to look for objects under the specified prefix (a Unix timestamp rounded to the hour). You will have to run the collector as an hourly cron job to ingest the data into OpenMeter.

Feel free to tweak the prefix to your needs.
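
To make the prefix expression more concrete, here is a minimal Python sketch of the value it evaluates to at a given instant. It assumes that ts_round rounds to the nearest multiple of the duration, with halfway values rounding up. The helper name hourly_prefix is hypothetical and not part of OpenMeter or Redpanda Connect. A producer could use something like it to choose the key prefix for the objects it uploads.

```python
import time

HOUR_SECONDS = 3600


def hourly_prefix(unix_seconds: int) -> str:
    """Return the key prefix matching the collector's prefix expression.

    Assumption: the timestamp is rounded to the nearest hour and
    halfway values round up, mirroring ts_round with a 1h duration.
    """
    rounded = (unix_seconds + HOUR_SECONDS // 2) // HOUR_SECONDS * HOUR_SECONDS
    return f"{rounded}/"


if __name__ == "__main__":
    # Print the prefix that would be read if the collector ran right now.
    print(hourly_prefix(int(time.time())))
```

Because the value is rounded rather than truncated, a run in the second half of an hour resolves to the next hour's prefix, so it is worth aligning the cron schedule with the way your objects are keyed.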

Next, configure the mapping from your schema to CloudEvents using Bloblang:

pipeline:
  processors:
    - mapping: |
        root = {
          "id": this.id,
          "specversion": "1.0",
          "type": "your-usage-event-type",
          "source": "s3",
          "time": this.time,
          "subject": this.subject_field,
          "data": {
            "data": this.data_field,
          },
        }
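
As an illustration of what this mapping produces, the following Python sketch applies the same field-by-field transformation to one record. The sample record and the function name to_cloudevent are hypothetical. The input field names (id, time, subject_field, data_field) are the placeholders from the mapping above and should be replaced with the fields of your own schema.

```python
import json


def to_cloudevent(record: dict) -> dict:
    """Mirror the Bloblang mapping: build a CloudEvents 1.0 envelope.

    dict.get returns None for a missing field, which approximates the
    null that Bloblang yields when a referenced field is absent.
    """
    return {
        "id": record.get("id"),
        "specversion": "1.0",
        "type": "your-usage-event-type",
        "source": "s3",
        "time": record.get("time"),
        "subject": record.get("subject_field"),
        "data": {
            "data": record.get("data_field"),
        },
    }


if __name__ == "__main__":
    # Hypothetical record as it might appear in an S3 object.
    sample = {
        "id": "evt-001",
        "time": "2024-01-01T00:00:00Z",
        "subject_field": "customer-1",
        "data_field": 42,
    }
    print(json.dumps(to_cloudevent(sample), indent=2))
```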

Finally, you need to configure the OpenMeter output:

# Send processed events to OpenMeter
output:
  label: 'openmeter'
  drop_on:
    error: false
    error_patterns:
      - Bad Request
    output:
      http_client:
        url: '${OPENMETER_URL:https://openmeter.cloud}/api/v1/events'
        verb: POST
        headers:
          Authorization: 'Bearer ${OPENMETER_TOKEN:}'
          Content-Type: 'application/json'
        timeout: 30s
        retry_period: 15s
        retries: 3
        max_retry_backoff: 1m
        # Maximum number of concurrent requests
        max_in_flight: 64
        batch_as_multipart: false
        drop_on:
          - 400
        # Batch settings for efficient API usage
        batching:
          # Send up to 100 events in a single request
          count: 100
          # Or send after 1 second, whichever comes first
          period: 1s
          processors:
            # Track metrics on sent events
            - metric:
                type: counter
                name: openmeter_events_sent
                value: 1
            # Convert batch to JSON array format
            - archive:
                format: json_array
        dump_request_log_level: DEBUG
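
To show roughly what the batching and archive settings amount to on the wire, here is a Python sketch that groups events into batches of up to 100 and builds (but does not send) one POST request per batch with a JSON array body. The URL, headers, and batch size are taken from the configuration above. The function names chunk and build_request are hypothetical, and the sketch leaves out the time-based flush, retries, and drop rules that Redpanda Connect handles.

```python
import json
import urllib.request


def chunk(events: list, size: int = 100) -> list:
    """Split events into consecutive batches of at most `size` items."""
    return [events[i:i + size] for i in range(0, len(events), size)]


def build_request(batch: list, base_url: str = "https://openmeter.cloud",
                  token: str = "") -> urllib.request.Request:
    """Build the POST request for one batch, serialized as a JSON array."""
    body = json.dumps(batch).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/events",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Hypothetical events; real ones would come from the mapping step.
    events = [{"id": str(i), "specversion": "1.0"} for i in range(250)]
    for batch in chunk(events):
        request = build_request(batch, token="my-token")
        print(request.get_method(), request.full_url, len(batch))
```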

Read more about configuring Redpanda Connect in the OpenMeter collector guide.

Installation

Check out the OpenMeter collector guide for installation instructions.