Overview
Use FinOps as a Bedrock-compatible gateway for the Converse and Invoke APIs, with FinOps features on top.
FinOps provides a Bedrock-compatible endpoint for the Converse and Invoke APIs via protocol adaptation. The integration handles request transformation, response normalization, and error mapping between AWS Bedrock's API specification and FinOps's internal processing pipeline.
This integration lets you use FinOps features such as governance, load balancing, semantic caching, and multi-provider support while keeping your existing Bedrock SDK-based architecture.
Endpoint: /bedrock
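In the code samples on this page, {AI_GATEWAY_URL} is a placeholder for your gateway's base URL, not a Python format field. A minimal sketch of building the endpoint_url value, assuming the base URL is kept in an environment variable (the helper name `bedrock_endpoint`, the variable name `AI_GATEWAY_URL`, and the localhost fallback are illustrative assumptions, not part of FinOps):

```python
import os


def bedrock_endpoint(gateway_url: str) -> str:
    """Return the Bedrock-compatible endpoint for a gateway base URL."""
    # Strip trailing slashes so the result never contains "//bedrock".
    return gateway_url.rstrip("/") + "/bedrock"


# Hypothetical environment variable and fallback; substitute your deployment's URL.
gateway_url = os.environ.get("AI_GATEWAY_URL", "http://localhost:8080")
endpoint_url = bedrock_endpoint(gateway_url)
```

The resulting string can be passed as endpoint_url when creating the boto3 client.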
Setup
Python
import boto3

# Configure the boto3 Bedrock client to use FinOps.
# Note: When using FinOps keys, dummy credentials are required
# because boto3 needs credentials to sign requests, even though
# FinOps will use its own configured keys.
client = boto3.client(
    service_name="bedrock-runtime",
    endpoint_url="{AI_GATEWAY_URL}/bedrock",
    region_name="us-west-2",
    aws_access_key_id="finops-dummy-key",  # Required when using FinOps keys
    aws_secret_access_key="finops-dummy-secret",  # Required when using FinOps keys
)

# Make requests as usual
response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[
        {
            "role": "user",
            "content": [{"text": "Hello!"}],
        }
    ],
)
print(response)

Provider/Model Usage Examples
Because Bedrock itself is a multi-provider platform, you can use any Bedrock-supported model ID and still route through FinOps. FinOps will handle governance, observability, and other cross-cutting concerns.
import boto3
import json

client = boto3.client(
    service_name="bedrock-runtime",
    endpoint_url="{AI_GATEWAY_URL}/bedrock",
    region_name="us-west-2",
    aws_access_key_id="finops-dummy-key",
    aws_secret_access_key="finops-dummy-secret",
)

# Anthropic via Bedrock (Converse API)
anthropic_response = client.converse(
    modelId="anthropic.claude-3-sonnet-20240229",
    messages=[{"role": "user", "content": [{"text": "Hello from Claude!"}]}],
)

# Mistral via Bedrock (Converse API)
mistral_response = client.converse(
    modelId="mistral.mistral-large-2407",
    messages=[{"role": "user", "content": [{"text": "Hello from Mistral!"}]}],
)

# Mistral via Bedrock (Invoke API)
mistral_invoke_response = client.invoke_model(
    modelId="mistral.mistral-large-2407",
    contentType="application/json",
    accept="application/json",
    body=json.dumps({
        "prompt": "Say hello from Mistral using Invoke API.",
        "max_tokens": 50,
        "temperature": 0.7,
    }),
)

Adding Custom Headers
Pass custom headers required by FinOps plugins (like governance, telemetry, etc.) using boto3's event system:
Python
import boto3


def add_bifrost_headers(request, **kwargs):
    """Add custom FinOps headers to the request before signing."""
    request.headers.add_header("x-bf-vk", "vk_12345")  # Virtual key for governance
    request.headers.add_header("x-bf-env", "production")  # Environment tag


client = boto3.client(
    service_name="bedrock-runtime",
    endpoint_url="{AI_GATEWAY_URL}/bedrock",
    region_name="us-west-2",
    aws_access_key_id="finops-dummy-key",
    aws_secret_access_key="finops-dummy-secret",
)

# Register the header injection for all Bedrock API calls
client.meta.events.register_first(
    "before-sign.bedrock-runtime.*",
    add_bifrost_headers,
)

# Now make requests with custom headers
response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Hello with custom headers!"}]}],
)

Note: Use register_first to ensure headers are added before request signing. The event name format is before-sign.<service-name>.<operation-name>. The * wildcard in the example applies the handler to every operation; to limit header injection to specific operations, register the handler once for each operation you plan to use (Converse, ConverseStream, InvokeModel, etc.).
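If you prefer to register the handler for named operations rather than a wildcard, the event names can be generated from a list. A minimal sketch, assuming a boto3 client and a handler like the one above; the helper name `register_for_operations` is hypothetical, and the operation names are the PascalCase names boto3 uses in its events:

```python
from typing import Callable, Iterable, List


def register_for_operations(
    client,
    handler: Callable,
    operations: Iterable[str],
    service: str = "bedrock-runtime",
) -> List[str]:
    """Register `handler` on before-sign for each operation; return the event names."""
    event_names = [f"before-sign.{service}.{op}" for op in operations]
    for name in event_names:
        # register_first places the handler ahead of others on the same event,
        # so the headers are present when the request is signed.
        client.meta.events.register_first(name, handler)
    return event_names
```

For example, register_for_operations(client, add_bifrost_headers, ["Converse", "ConverseStream"]) would attach the headers to chat requests only and leave InvokeModel calls untouched.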
Streaming Examples
Converse Stream
Use converse_stream for chat-based streaming with a unified interface across models.
import boto3

client = boto3.client(
    service_name="bedrock-runtime",
    endpoint_url="{AI_GATEWAY_URL}/bedrock",
    region_name="us-west-2",
    aws_access_key_id="finops-dummy-key",
    aws_secret_access_key="finops-dummy-secret",
)

response = client.converse_stream(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",
    messages=[{"role": "user", "content": [{"text": "Tell me a story about a brave knight."}]}],
    inferenceConfig={"maxTokens": 512, "temperature": 0.5},
)

print("Response:")
for chunk in response["stream"]:
    if "contentBlockDelta" in chunk:
        text = chunk["contentBlockDelta"]["delta"]["text"]
        print(text, end="", flush=True)

Invoke Stream
Use invoke_model_with_response_stream for model-specific streaming payloads.
import boto3
import json

client = boto3.client(
    service_name="bedrock-runtime",
    endpoint_url="{AI_GATEWAY_URL}/bedrock",
    region_name="us-west-2",
    aws_access_key_id="finops-dummy-key",
    aws_secret_access_key="finops-dummy-secret",
)

# Example for Claude 3 (Messages API format)
body = json.dumps({
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 1024,
    "messages": [
        {"role": "user", "content": "Write a haiku about coding."}
    ],
})

response = client.invoke_model_with_response_stream(
    modelId="anthropic.claude-3-haiku-20240307-v1:0",
    body=body,
    contentType="application/json",
    accept="application/json",
)

print("Response:")
for event in response.get("body"):
    if "chunk" in event:
        chunk = event["chunk"]
        if "bytes" in chunk:
            # The chunk bytes contain the model-specific JSON response
            result = json.loads(chunk["bytes"].decode("utf-8"))
            # Extract content based on model (e.g., Claude)
            if "delta" in result and "text" in result["delta"]:
                print(result["delta"]["text"], end="", flush=True)
            elif "completion" in result:
                print(result["completion"], end="", flush=True)

Supported Features
The Bedrock integration currently supports:
- Converse API (/bedrock/model/{modelId}/converse) for text/chat-style workloads
- Invoke API (/bedrock/model/{modelId}/invoke) for model-specific text completion workloads
- Streaming via converse_stream and invoke_model_with_response_stream
- Tools via toolConfig, toolUse, and toolResult inside Converse requests
- Image and multimodal responses where supported by the underlying Bedrock model
- All FinOps core features that apply to these flows (governance, load balancing, semantic cache, observability, etc.)
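The tool support listed above follows the Converse request and response shapes. A minimal sketch of the pieces involved, assuming a single illustrative tool; the tool name `get_weather`, its schema, and the helper names are hypothetical and not defined by FinOps:

```python
from typing import Any, Dict, List

# Hypothetical tool definition, passed to converse() as toolConfig.
WEATHER_TOOL_CONFIG: Dict[str, Any] = {
    "tools": [
        {
            "toolSpec": {
                "name": "get_weather",
                "description": "Return the current weather for a city.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    }
                },
            }
        }
    ]
}


def find_tool_uses(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect the toolUse blocks from a Converse response."""
    content = response.get("output", {}).get("message", {}).get("content", [])
    return [block["toolUse"] for block in content if "toolUse" in block]


def tool_result_message(tool_use_id: str, result: Dict[str, Any]) -> Dict[str, Any]:
    """Build the user message that returns a tool's output to the model."""
    return {
        "role": "user",
        "content": [
            {"toolResult": {"toolUseId": tool_use_id, "content": [{"json": result}]}}
        ],
    }
```

In use, you would call client.converse(..., toolConfig=WEATHER_TOOL_CONFIG); when the response's stopReason is "tool_use", append response["output"]["message"] and the message from tool_result_message to the conversation and call converse again.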
Next Steps
- Files and Batch API - S3-based file operations and batch processing
- What is an integration? - Core integration concepts
- Configuration - Bedrock provider setup and API key management
- Core Features - Governance, semantic caching, and more