Add Amazon Bedrock Guardrails integration #2

Open · wants to merge 10 commits into main
38 changes: 38 additions & 0 deletions CLAUDE.md
@@ -0,0 +1,38 @@
# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Commands
- Install dependencies: `cd src && uv pip install -r requirements.txt`
- Run Python scripts: `cd src && uv run <script_name>.py`
- Run locally: `cd src && uv run -m uvicorn api.app:app --host 0.0.0.0 --port 8000`
- Build Docker: `cd scripts && bash ./push-to-ecr.sh`
- Lint: `pipx run ruff check`
- Format: `pipx run ruff format`

## Environment Configuration
- We use `uv` instead of regular `python` or `pip` commands
- Enable debug mode: `export DEBUG=true`
- Set AWS region: `export AWS_REGION=us-east-1`
- Custom models are enabled by default

## Code Style
- Python version: 3.12
- Line length: 120 characters max
- Indentation: 4 spaces
- Quote style: Double quotes for strings
- Imports: grouped (standard library, third-party, internal) and alphabetically sorted within each group
- Use FastAPI patterns for API development
- Type annotations required for all functions and classes
- Use abstract base classes (ABC) for interfaces
- Snake case for variables/functions, PascalCase for classes
- Explicit error handling with specific exception types
- Use HTTPException for API errors
- Document public functions with docstrings
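A minimal illustration of these conventions (the class names here are invented for the example, not part of the codebase):

```python
from abc import ABC, abstractmethod


class ModelBackend(ABC):
    """Interface for model backends, defined as an abstract base class."""

    @abstractmethod
    def invoke(self, prompt: str) -> str:
        """Return the model's completion for the given prompt."""


class EchoBackend(ModelBackend):
    """Trivial concrete backend used only to illustrate the conventions."""

    def invoke(self, prompt: str) -> str:
        return f"echo: {prompt}"
```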

## Architecture
This project provides OpenAI-compatible RESTful APIs for Amazon Bedrock models, making it easy to use AWS foundation models without changing existing code that uses OpenAI APIs.

## Testing
- Test API functionality: `cd src && uv run test_api.py`
- Test custom models: `cd src && uv run test_custom_models.py`
123 changes: 123 additions & 0 deletions CUSTOM_MODELS_IMPLEMENTATION.md
@@ -0,0 +1,123 @@
# Custom Imported Models Implementation

## Overview

This document describes the implementation of custom imported model support in the Bedrock Access Gateway. Custom models are models that you have imported into Bedrock, and this feature allows you to use them through the OpenAI-compatible API interface just like foundation models.

## User-Friendly Model IDs

One of the key features of this implementation is the creation of user-friendly model IDs that include the model name. Instead of cryptic AWS IDs like `custom.a1b2c3d4`, models are presented with descriptive IDs in the format:

```
{model-name}-id:custom.{aws_id}
```

For example: `mistral-7b-instruct-id:custom.a1b2c3d4`

This makes it easier to identify models when using the Models API, while maintaining compatibility with the original AWS ID format.
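The mapping back to the AWS ID can be sketched as follows; the helper name `to_aws_model_id` is illustrative, not part of the gateway's public API:

```python
def to_aws_model_id(model_id: str) -> str:
    """Reduce a user-friendly model ID to the underlying AWS ID.

    'mistral-7b-instruct-id:custom.a1b2c3d4' -> 'custom.a1b2c3d4'
    A plain AWS ID such as 'custom.a1b2c3d4' passes through unchanged.
    """
    marker = "-id:custom."
    if marker in model_id:
        return "custom." + model_id.split(marker, 1)[1]
    return model_id
```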

## Changes Made

1. **Model Discovery**
- Extended `list_bedrock_models()` to include:
- Custom models via `bedrock_client.list_custom_models()`
- Imported models via `bedrock_client.list_imported_models()`
- Created user-friendly model IDs that include the model name
- Added type field to model metadata to distinguish between "foundation" and "custom" models
- Added region information to each model to support cross-region invocation
- Stored model ARN for custom models for invocation purposes

2. **Configuration**
- Custom model support is enabled by default
- Added retry configuration for custom model invocation
- The implementation uses the local AWS region by default

3. **Model Validation**
- Added ID transformation logic in the `validate()` method to handle both:
- Descriptive model IDs (e.g., `mistral-7b-instruct-id:custom.a1b2c3d4`)
- Original AWS IDs (e.g., `custom.a1b2c3d4`)
- Stores the original display ID to preserve it in responses

4. **Model Invocation**
- Added branching logic in `_invoke_bedrock()` to handle custom models differently
- Implemented `_invoke_custom_model()` method to handle custom model invocation via `InvokeModel`/`InvokeModelWithResponseStream`
- Added custom model response parsing to handle various model output formats
- Implemented special handling for `ModelNotReadyException`
- Added region-specific client creation for cross-region models

5. **Streaming Support**
- Added `_handle_custom_model_stream()` method to handle streaming responses
- Added support for parsing different streaming formats from custom models

6. **Message Formatting**
- Implemented `_create_prompt_from_messages()` to convert OpenAI-style chat messages to text format for custom models

7. **Documentation**
- Updated README.md to include new feature
- Updated Usage.md with custom model usage examples
- Updated FAQs to indicate custom model support
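The message-formatting step in item 6 can be sketched as follows; this is an approximation, and the gateway's actual prompt template may differ:

```python
def create_prompt_from_messages(messages: list[dict]) -> str:
    """Flatten OpenAI-style chat messages into a plain-text prompt."""
    parts = [f"{msg['role'].capitalize()}: {msg['content']}" for msg in messages]
    parts.append("Assistant:")  # cue the model to produce its reply
    return "\n\n".join(parts)
```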

## Usage

### Listing Custom Models

To list available custom models in your AWS account, use the Models API:

```bash
curl -s $OPENAI_BASE_URL/models -H "Authorization: Bearer $OPENAI_API_KEY" | jq '.data[] | select(.id | startswith("custom.") or contains("-id:custom."))'
```

### Using Custom Models

Custom models can be used with either their descriptive ID or the original AWS ID:

```bash
# Using descriptive ID
curl $OPENAI_BASE_URL/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "mistral-7b-instruct-id:custom.a1b2c3d4",
"messages": [
{ "role": "user", "content": "Hello, world!" }
]
}'

# Using original AWS ID (also supported)
curl $OPENAI_BASE_URL/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "custom.a1b2c3d4",
"messages": [
{ "role": "user", "content": "Hello, world!" }
]
}'
```

## Troubleshooting

### Model Not Found

If your custom model isn't appearing:

1. Check if the model exists in your AWS account:
```bash
aws bedrock list-imported-models --region us-east-1
```

2. Restart the gateway to refresh the model list:
```bash
# For Lambda deployments
# Go to Lambda console > Find your function > Click "Deploy new image"

# For Fargate deployments
# Go to ECS console > Find your cluster > Tasks tab > Stop running task
```

### Invocation Errors

Common errors when invoking custom models:

- `ModelNotReadyException`: The model is still being prepared; wait a few minutes and retry
- `ValidationException`: The request body is not compatible with the model; check your input format
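A client-side retry pattern for `ModelNotReadyException` can be sketched as a generic wrapper. This is illustrative, not the gateway's internal code; the caller supplies the actual invocation function:

```python
import time


def invoke_with_retry(invoke, attempts: int = 5, delay: float = 30.0):
    """Call `invoke` (a zero-argument function performing the model call),
    retrying while the service reports ModelNotReadyException."""
    for attempt in range(attempts):
        try:
            return invoke()
        except Exception as err:
            # Only retry "not ready"; surface ValidationException etc. immediately.
            if "ModelNotReadyException" not in str(err) or attempt == attempts - 1:
                raise
            time.sleep(delay)  # model is still loading; wait before retrying
```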
11 changes: 8 additions & 3 deletions README.md
@@ -26,7 +26,8 @@ If you find this GitHub repository useful, please consider giving it a free star
- [x] Support Embedding API
- [x] Support Multimodal API
- [x] Support Cross-Region Inference
- [x] Support Reasoning (**new**)
- [x] Support Reasoning
- [x] Support Custom Imported Models (**new**)

Please check [Usage Guide](./docs/Usage.md) for more details about how to use the new APIs.

@@ -230,9 +231,13 @@ Also, you can use Lambda Web Adapter + Function URL (see [example](https://githu

Currently, there is no plan to support SageMaker models. This may change provided there's a demand from customers.

### Any plan to support Bedrock custom models?
### Support for Bedrock custom models

Fine-tuned models and models with Provisioned Throughput are currently not supported. You can clone the repo and make the customization if needed.
Custom imported models are now supported! You can use them just like foundation models, either with user-friendly model IDs in the format `{model-name}-id:custom.{aws_id}` or with the original AWS format `custom.{aws_id}`. Call the Models API to list the custom models available in your account.

The user-friendly format makes it easier to identify models while maintaining backward compatibility with the original AWS IDs. For more details, see [Custom Imported Models](./docs/Usage.md#custom-imported-models) in the usage documentation.

Fine-tuned models and models with Provisioned Throughput may require additional configuration.

### How to upgrade?

6 changes: 5 additions & 1 deletion deployment/BedrockProxy.template
@@ -8,6 +8,9 @@ Parameters:
Type: String
Default: anthropic.claude-3-sonnet-20240229-v1:0
Description: The default model ID, please make sure the model ID is supported in the current region
EcrAccountId:
Type: String
Default: 366590864501
Resources:
VPCB9E5F0B4:
Type: AWS::EC2::VPC
@@ -170,7 +173,8 @@ Resources:
ImageUri:
Fn::Join:
- ""
- - 366590864501.dkr.ecr.
- - Ref: EcrAccountId
- ".dkr.ecr."
- Ref: AWS::Region
- "."
- Ref: AWS::URLSuffix
10 changes: 8 additions & 2 deletions deployment/BedrockProxyFargate.template
@@ -8,6 +8,9 @@ Parameters:
Type: String
Default: anthropic.claude-3-sonnet-20240229-v1:0
Description: The default model ID, please make sure the model ID is supported in the current region
EcrAccountId:
Type: String
Default: 366590864501
Resources:
VPCB9E5F0B4:
Type: AWS::EC2::VPC
@@ -158,7 +161,9 @@ Resources:
- ""
- - "arn:aws:ecr:"
- Ref: AWS::Region
- :366590864501:repository/bedrock-proxy-api-ecs
- ":"
- Ref: EcrAccountId
- :repository/bedrock-proxy-api-ecs
- Action: ecr:GetAuthorizationToken
Effect: Allow
Resource: "*"
@@ -226,7 +231,8 @@ Resources:
Image:
Fn::Join:
- ""
- - 366590864501.dkr.ecr.
- - Ref: EcrAccountId
- ".dkr.ecr."
- Ref: AWS::Region
- "."
- Ref: AWS::URLSuffix
78 changes: 77 additions & 1 deletion docs/Usage.md
@@ -15,6 +15,7 @@ export OPENAI_BASE_URL=<API base url>
- [Multimodal API](#multimodal-api)
- [Tool Call](#tool-call)
- [Reasoning](#reasoning)
- [Custom Imported Models](#custom-imported-models)

## Models API

@@ -441,4 +442,79 @@ for chunk in response:
reasoning_content += chunk.choices[0].delta.reasoning_content
elif chunk.choices[0].delta.content:
content += chunk.choices[0].delta.content
```

## Custom Imported Models

This feature allows you to use models that you've imported into Amazon Bedrock. Custom imported models can be used just like foundation models with minor differences in configuration.

**Important Notes:**
- Custom models are displayed in the Models API with user-friendly IDs in the format `{model-name}-id:custom.{aws_id}`
- The original AWS ID format `custom.{aws_id}` is also supported for backward compatibility
- Custom model support is enabled by default
- Custom imported models may have different response formats than foundation models, so the gateway attempts to normalize the outputs
- If a model is not ready yet, the API will return a 503 error with a detail message indicating the model is not ready

**Example Request**

First, use the Models API to get a list of available custom models:

```bash
curl -s $OPENAI_BASE_URL/models -H "Authorization: Bearer $OPENAI_API_KEY" | jq '.data[] | select(.id | contains("-id:custom.") or startswith("custom."))'
```

Then use the custom model in your chat completions (using either format):

```bash
# Using the user-friendly model ID
curl $OPENAI_BASE_URL/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "mistral-7b-instruct-id:custom.a1b2c3d4",
"messages": [
{
"role": "user",
"content": "What is the meaning of life?"
}
],
"max_tokens": 500,
"temperature": 0.7
}'

# Using the original AWS ID format (also supported)
curl $OPENAI_BASE_URL/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer $OPENAI_API_KEY" \
-d '{
"model": "custom.a1b2c3d4",
"messages": [
{
"role": "user",
"content": "What is the meaning of life?"
}
],
"max_tokens": 500,
"temperature": 0.7
}'
```

**Example Python SDK Usage**

```python
from openai import OpenAI

client = OpenAI()
completion = client.chat.completions.create(
# Can use either format:
# model="mistral-7b-instruct-id:custom.a1b2c3d4" # User-friendly format
model="custom.a1b2c3d4", # Original AWS format
messages=[{"role": "user", "content": "What is the meaning of life?"}],
max_tokens=500,
temperature=0.7
)

print(completion.choices[0].message.content)
```

For more details on the implementation, see [CUSTOM_MODELS_IMPLEMENTATION.md](../CUSTOM_MODELS_IMPLEMENTATION.md).