LangChain

LangChain, a framework for building language model applications, integrates seamlessly with OpenMeter. Tracking and attributing token usage to individual users is essential for accurate billing, real-time usage dashboards, and operational insight in language model operations.

Setup

To integrate OpenMeter with your LangChain application, start by installing the OpenMeter Node.js SDK:

npm install @openmeter/sdk
# also install the LangChain SDK
npm install @langchain/core langchain

Next, create a file for the OpenMeter handler. This implementation serves as a foundation and can be tailored to meet specific requirements:

handler.ts
import { OpenMeter, Event } from '@openmeter/sdk';
import { BaseCallbackHandler } from '@langchain/core/callbacks/base';
import { Serialized } from 'langchain/load/serializable';
import { LLMResult } from 'langchain/schema';
 
interface TokenUsage {
  completionTokens?: number;
  promptTokens?: number;
}
 
export class OpenMeterCallbackHandler extends BaseCallbackHandler {
  name = 'OpenMeterCallbackHandler';
  runMetadata = new Map<string, Record<string, unknown>>();
 
  constructor(
    public openmeter: OpenMeter,
    public params: {
      source?: string;
      type: string;
    } = {
      source: 'langchain',
      type: 'tokens',
    },
  ) {
    super();
  }
 
  // Handle the start of a language model operation.
  async handleLLMStart(
    _llm: Serialized,
    _prompts: string[],
    runId: string,
    parentRunId?: string | undefined,
    _extraParams?: Record<string, unknown> | undefined,
    _tags?: string[],
    metadata: Record<string, unknown> = {},
    _name?: string | undefined,
  ) {
    // If a parent run ID is present, merge its metadata into the current metadata.
    if (parentRunId) {
      const parentMetadata = this.runMetadata.get(parentRunId);
      if (parentMetadata) {
        // Alternatively, use remeda.js' mergeDeep function:
        // https://remedajs.com/docs#mergeDeep
        Object.assign(metadata, parentMetadata);
      }
    }
    this.runMetadata.set(runId, metadata);
  }
 
  // Handle the end of a language model operation.
  async handleLLMEnd(
    output: LLMResult,
    runId: string,
    _parentRunId?: string | undefined,
    _tags?: string[] | undefined,
  ) {
    // Extract token usage from the language model's output.
    const { promptTokens = 0, completionTokens = 0 }: TokenUsage =
      output.llmOutput?.['tokenUsage'] ??
      output.llmOutput?.['estimatedTokenUsage'] ??
      {};
    // Ensure that token usage data is present.
    if (!(promptTokens > 0 || completionTokens > 0)) {
      console.warn(
        `${this.name}: no token usage in LLM output`,
        output.llmOutput,
      );
      return;
    }
 
    // Retrieve metadata and construct the events for OpenMeter.
    const metadata = this.runMetadata.get(runId) ?? {};
    const { subject, ...data } = metadata;
    if (!subject || typeof subject !== 'string') {
      console.warn(`${this.name}: could not find 'subject' in run metadata`);
      return;
    }
 
    const inputEvent: Event = {
      id: `${runId}-input`,
      source: this.params.source,
      type: this.params.type,
      subject,
      data: {
        ...data,
        type: 'input',
        tokens: promptTokens,
      },
    };
 
    const outputEvent: Event = {
      id: `${runId}-output`,
      source: this.params.source,
      type: this.params.type,
      subject,
      data: {
        ...data,
        type: 'output',
        tokens: completionTokens,
      },
    };
 
    // Ingest the events into OpenMeter.
    try {
      console.debug(`${this.name}: ingesting event`, inputEvent);
      await this.openmeter.events.ingest(inputEvent);
      console.debug(`${this.name}: ingesting event`, outputEvent);
      await this.openmeter.events.ingest(outputEvent);
    } catch (err) {
      console.error(`${this.name}: error ingesting event`, err);
    }
 
    this.runMetadata.delete(runId);
  }
}
 

Note: Some language models do not include token usage information in the LLM output; check your model's documentation. Custom token counting can be implemented with the js-tiktoken library.
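For models that report no usage at all, you can fall back to your own counter. js-tiktoken's `encodingForModel()` yields exact counts for OpenAI models; the sketch below instead uses the rough "one token per four characters" rule of thumb so it stays dependency-free. The `estimateTokens` helper is illustrative, not part of any SDK, and its counts are only approximate:

```typescript
// Rough token estimate for models that don't report usage.
// For exact counts, prefer js-tiktoken's encodingForModel();
// the chars/4 heuristic below is a coarse stand-in.
function estimateTokens(text: string): number {
  // Rule of thumb: one token is roughly four characters of English text.
  return Math.ceil(text.length / 4);
}
```

An estimate like this could be fed into the same input/output events in place of the missing `tokenUsage` values, with the caveat that billing on approximations should be clearly communicated to customers.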

Usage

Node.js

route.ts
import { OpenMeter } from '@openmeter/sdk';
import { PromptTemplate } from '@langchain/core/prompts';
import { ChatOpenAI } from "@langchain/openai";
import { LLMChain } from "langchain/chains";
import { OpenMeterCallbackHandler } from './handler';
 
const openmeter = new OpenMeter({
  baseUrl: process.env.OPENMETER_BASE_URL ?? 'https://openmeter.cloud',
  token: process.env.OPENMETER_TOKEN,
});
 
const handler = new OpenMeterCallbackHandler(openmeter);
 
const llm = new ChatOpenAI({
  // The handler will be used for all calls made with this LLM.
  callbacks: [handler],
});
 
const chain = new LLMChain({
  llm,
  prompt: PromptTemplate.fromTemplate('Hello, world!'),
});
 
await chain.call({
  metadata: {
    // Specify the subject for each call.
    subject: '<customer_identifier>',
    // Additional fields for event data.
    model: llm.modelName,
  },
});

Next.js and Vercel's AI library

The following example demonstrates how to integrate OpenMeter with a Next.js application using Vercel's AI library.

route.ts
import { NextRequest, NextResponse } from "next/server";
import {
  Message as VercelChatMessage,
  StreamingTextResponse,
  LangChainStream,
} from "ai";
 
import { ChatOpenAI } from '@langchain/openai';
import { PromptTemplate } from '@langchain/core/prompts';
import { LLMChain } from 'langchain/chains';
import { OpenMeter } from '@openmeter/sdk';
import { OpenMeterCallbackHandler } from '@/openmeter/handler';
 
const formatMessage = (message: VercelChatMessage) => {
  return `${message.role}: ${message.content}`;
};
 
const TEMPLATE = `You are a pirate named Patchy.
All responses must be extremely verbose and in pirate dialect.
 
Current conversation:
{chat_history}
 
User: {input}
AI:`;
 
const openmeter = new OpenMeter({
  baseUrl: process.env.OPENMETER_BASE_URL ?? "https://openmeter.cloud",
  token: process.env.OPENMETER_TOKEN,
});
 
const handler = new OpenMeterCallbackHandler(openmeter);
 
export async function POST(req: NextRequest) {
  try {
    const body = await req.json();
    const messages = body.messages ?? [];
    const formattedPreviousMessages = messages.slice(0, -1).map(formatMessage);
    const currentMessageContent = messages[messages.length - 1].content;
    const prompt = PromptTemplate.fromTemplate(TEMPLATE);
 
    const llm = new ChatOpenAI({
      // The handler will be used for all calls made with this LLM.
      callbacks: [handler],
      streaming: true,
    });
 
    const chain = new LLMChain({
      llm,
      prompt,
    });
 
    const { stream, handlers } = LangChainStream();
 
    chain.call(
      {
        chat_history: formattedPreviousMessages.join("\n"),
        input: currentMessageContent,
        metadata: {
          // Specify the subject for each call.
          subject: "<customer_identifier>",
          // Additional fields for event data.
          model: llm.modelName,
        },
      },
      [handlers],
    );
 
    return new StreamingTextResponse(stream);
 
  } catch (e: any) {
    return NextResponse.json({ error: e.message }, { status: 500 });
  }
}
 

How it works

LangChain callbacks

LangChain provides a callbacks system that allows you to hook into the various stages of your LLM application. This is useful for logging, monitoring, streaming, and other tasks.

OpenMeterCallbackHandler captures the token usage of language model operations and reports it to OpenMeter as events by hooking into the start and end of each operation.

It operates at two key stages:

  1. Start of the Language Model Operation (handleLLMStart): At this point, the handler captures and stores the metadata of the current chain execution.

  2. End of the Language Model Operation (handleLLMEnd): This is where the handler extracts the number of prompt and completion tokens from the language model's output (tokenUsage, or estimatedTokenUsage for streaming requests) and sends two events (input and output) to OpenMeter.
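The two-stage flow can be sketched with a dependency-free mock. `SketchHandler` and `UsageEvent` are illustrative names, not part of the OpenMeter or LangChain SDKs; the `events` array stands in for actual ingestion:

```typescript
// Minimal sketch of the two-stage callback flow: metadata is stored at
// handleLLMStart and turned into input/output events at handleLLMEnd.
type UsageEvent = { id: string; subject: string; type: string; tokens: number };

class SketchHandler {
  private runMetadata = new Map<string, Record<string, unknown>>();
  readonly events: UsageEvent[] = []; // stand-in for OpenMeter ingestion

  handleLLMStart(runId: string, metadata: Record<string, unknown>) {
    // Stage 1: remember the metadata for this run.
    this.runMetadata.set(runId, metadata);
  }

  handleLLMEnd(
    runId: string,
    usage: { promptTokens: number; completionTokens: number },
  ) {
    // Stage 2: attribute the usage to the stored subject.
    const subject = String(this.runMetadata.get(runId)?.subject ?? '');
    if (!subject) return; // no subject, nothing to attribute
    this.events.push(
      { id: `${runId}-input`, subject, type: 'input', tokens: usage.promptTokens },
      { id: `${runId}-output`, subject, type: 'output', tokens: usage.completionTokens },
    );
    this.runMetadata.delete(runId); // avoid leaking per-run state
  }
}
```

Deleting the run's metadata after ingestion matters in long-running servers, where the map would otherwise grow with every request.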

Event Structure

The OpenMeter events will have the following structure:

{
  "id": "<unique_run_id>-<input | output>",
  "source": "langchain",
  "type": "tokens",
  "subject": "<customer_identifier>",
  "data": {
    "type": "<input | output>",
    "tokens": "<token_count>",
    // "model": "<model_name>",
    // ...
  }
}
  • type: Indicates the token type: input tokens are consumed by the prompt, while output tokens are consumed by the completion.
  • tokens: The number of tokens used.

The subject field associates the event with a specific customer or operation, aiding in billing and analysis.

Last edited on February 23, 2024