Overview

FinOps provides complete Google GenAI API compatibility through protocol adaptation. The integration handles request transformation, response normalization, and error mapping between Google's GenAI API specification and FinOps's internal processing pipeline.

This integration lets you use FinOps features such as governance, load balancing, semantic caching, and multi-provider support while keeping your existing Google GenAI SDK-based architecture.

Endpoint: /genai

Setup

Python

from google import genai
from google.genai.types import HttpOptions

# Configure client to use FinOps
client = genai.Client(
    api_key="dummy-key",  # Keys handled by FinOps
    http_options=HttpOptions(base_url="{AI_GATEWAY_URL}/genai")
)

# Make requests as usual
response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents="Hello!"
)

print(response.text)

JavaScript

import { GoogleGenerativeAI } from "@google/generative-ai";

// Configure client to use FinOps
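// The SDK constructor takes only the API key; the gateway URL is passed
// in the request options of getGenerativeModel.
const genAI = new GoogleGenerativeAI("dummy-key"); // Keys handled by FinOps

// Make requests as usual
const model = genAI.getGenerativeModel(
  { model: "gemini-1.5-flash" },
  { baseUrl: "{AI_GATEWAY_URL}/genai" }
);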
const response = await model.generateContent("Hello!");

console.log(response.response.text()); // text() is a method, not a property
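
The {AI_GATEWAY_URL} placeholder in the snippets above has to be replaced with the address of your own deployment. As a minimal sketch, assuming the address is kept in an environment variable named AI_GATEWAY_URL (a hypothetical name chosen to mirror the placeholder, not something FinOps defines), the base URL could be assembled as follows. The helper name genai_base_url is also illustrative.

```python
import os


def genai_base_url(env=None):
    """Build the base URL for the /genai endpoint.

    AI_GATEWAY_URL is an assumed variable name that mirrors the
    {AI_GATEWAY_URL} placeholder; it is not defined by FinOps itself.
    """
    env = os.environ if env is None else env
    gateway = env.get("AI_GATEWAY_URL", "").strip()
    if not gateway:
        raise ValueError("AI_GATEWAY_URL is not set")
    # Drop any trailing slash so the result never contains "//genai".
    return gateway.rstrip("/") + "/genai"
```

The returned string can be passed as base_url to HttpOptions in the Python setup shown earlier.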

Provider/Model Usage Examples

Use multiple providers through the same GenAI SDK format by prefixing model names with the provider:

Python

from google import genai
from google.genai.types import HttpOptions

client = genai.Client(
    api_key="dummy-key",
    http_options=HttpOptions(base_url="{AI_GATEWAY_URL}/genai")
)

# Google Vertex models (default)
vertex_response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents="Hello from Gemini!"
)

# OpenAI models via GenAI SDK format
openai_response = client.models.generate_content(
    model="openai/gpt-4o-mini",
    contents="Hello from OpenAI!"
)

# Anthropic models via GenAI SDK format
anthropic_response = client.models.generate_content(
    model="anthropic/claude-3-sonnet-20240229",
    contents="Hello from Claude!"
)

# Azure models
azure_response = client.models.generate_content(
    model="azure/gpt-4o",
    contents="Hello from Azure!"
)

# Local Ollama models
ollama_response = client.models.generate_content(
    model="ollama/llama3.1:8b",
    contents="Hello from Ollama!"
)

JavaScript

import { GoogleGenerativeAI } from "@google/generative-ai";

// The SDK constructor takes only the API key; the gateway URL is passed
// per model through the request options.
const genAI = new GoogleGenerativeAI("dummy-key");
const requestOptions = { baseUrl: "{AI_GATEWAY_URL}/genai" };

// Google Vertex models (default)
const geminiModel = genAI.getGenerativeModel({ model: "gemini-1.5-flash" }, requestOptions);
const vertexResponse = await geminiModel.generateContent("Hello from Gemini!");

// OpenAI models via GenAI SDK format
const openaiModel = genAI.getGenerativeModel({ model: "openai/gpt-4o-mini" }, requestOptions);
const openaiResponse = await openaiModel.generateContent("Hello from OpenAI!");

// Anthropic models via GenAI SDK format
const anthropicModel = genAI.getGenerativeModel({ model: "anthropic/claude-3-sonnet-20240229" }, requestOptions);
const anthropicResponse = await anthropicModel.generateContent("Hello from Claude!");

// Azure models
const azureModel = genAI.getGenerativeModel({ model: "azure/gpt-4o" }, requestOptions);
const azureResponse = await azureModel.generateContent("Hello from Azure!");

// Local Ollama models
const ollamaModel = genAI.getGenerativeModel({ model: "ollama/llama3.1:8b" }, requestOptions);
const ollamaResponse = await ollamaModel.generateContent("Hello from Ollama!");
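
To make the provider/model naming convention used in these examples concrete, here is a small sketch of how such a name could be split into its two parts. It illustrates the convention only; how FinOps parses model names internally is not described here. The function name split_model_name and the default provider label "vertex" (taken from the "(default)" comments above) are assumptions.

```python
def split_model_name(model, default_provider="vertex"):
    """Split a "provider/model" string into (provider, model).

    A name without a slash is assigned to default_provider. The label
    "vertex" is an assumption based on the "(default)" comments in the
    examples, not a documented FinOps identifier.
    """
    provider, sep, name = model.partition("/")
    if not sep:
        # No prefix present: the whole string is the model name.
        return default_provider, model
    return provider, name
```

Only the first slash is treated as the separator, so any later slashes stay part of the model name.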

Adding Custom Headers

Pass the custom headers that FinOps plugins such as governance and telemetry require:

Python

from google import genai
from google.genai.types import HttpOptions

# Configure client with custom headers
client = genai.Client(
    api_key="dummy-key",
    http_options=HttpOptions(
        base_url="{AI_GATEWAY_URL}/genai",
        headers={
            "x-bf-vk": "vk_12345",  # Virtual key for governance
        }
    )
)

response = client.models.generate_content(
    model="gemini-1.5-flash",
    contents="Hello with custom headers!"
)

JavaScript

import { GoogleGenerativeAI } from "@google/generative-ai";

// Configure client with custom headers
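// The SDK constructor takes only the API key; the gateway URL and headers
// are passed in the request options of getGenerativeModel.
const genAI = new GoogleGenerativeAI("dummy-key");

const model = genAI.getGenerativeModel(
  { model: "gemini-1.5-flash" },
  {
    baseUrl: "{AI_GATEWAY_URL}/genai",
    customHeaders: {
      "x-bf-vk": "vk_12345", // Virtual key for governance
    },
  }
);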
const response = await model.generateContent("Hello with custom headers!");
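
When several clients share the same gateway settings, the header wiring can be kept in one place. The following is a minimal sketch, assuming a hypothetical helper named gateway_http_options that returns keyword arguments for HttpOptions; only the x-bf-vk header and the base_url and headers fields come from the examples above.

```python
def gateway_http_options(base_url, virtual_key=None, extra_headers=None):
    """Return keyword arguments for google.genai.types.HttpOptions.

    gateway_http_options is a hypothetical helper, not part of FinOps
    or the Google GenAI SDK.
    """
    headers = dict(extra_headers or {})
    if virtual_key:
        # Virtual key header used by the governance plugin.
        headers["x-bf-vk"] = virtual_key
    options = {"base_url": base_url}
    if headers:
        options["headers"] = headers
    return options
```

A client could then be created with HttpOptions(**gateway_http_options("{AI_GATEWAY_URL}/genai", virtual_key="vk_12345")).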

Dynamic Thinking Budget

When thinkingConfig.thinkingBudget is set to -1, FinOps handles it differently per provider:

Gemini: Preserves -1 for native dynamic thinking support
Anthropic, Bedrock, Cohere: Converts to minimum reasoning budget value (1024)
OpenAI: Converts to medium reasoning effort

# Reuses the client configured in Setup
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Complex reasoning task",
    config={
        "thinking_config": {
            "include_thoughts": True,
            "thinking_budget": -1,  # Dynamic thinking
        }
    },
)
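
The per-provider rules listed above can be summarized as a lookup table. This is an illustrative sketch of the documented behavior, not FinOps's implementation: the names DYNAMIC_THINKING_RULES and resolve_dynamic_thinking, the lowercase provider keys, and the dictionary keys in the results are all assumptions chosen for readability.

```python
# What a thinking budget of -1 becomes for each provider listed above.
DYNAMIC_THINKING_RULES = {
    "gemini": {"thinking_budget": -1},        # preserved as-is
    "anthropic": {"thinking_budget": 1024},   # minimum reasoning budget
    "bedrock": {"thinking_budget": 1024},
    "cohere": {"thinking_budget": 1024},
    "openai": {"reasoning_effort": "medium"},
}


def resolve_dynamic_thinking(provider):
    """Return the setting that a -1 thinking budget maps to for provider."""
    key = provider.lower()
    if key not in DYNAMIC_THINKING_RULES:
        # Providers outside the list above have no documented rule.
        raise ValueError(f"no documented rule for provider {provider!r}")
    # Return a copy so callers cannot mutate the shared table.
    return dict(DYNAMIC_THINKING_RULES[key])
```

Providers that are not in the list raise an error here rather than guess at behavior the documentation does not state.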

Supported Features

The Google GenAI integration supports every feature that is available in both the Google GenAI SDK and FinOps core: if both sides support a feature, it works through the /genai endpoint.

Next Steps

OpenAI SDK - GPT integration patterns
Configuration - FinOps setup and configuration
Core Features - Advanced FinOps capabilities
