DSP-FD2 is an enterprise-grade, modular API gateway that serves as the intelligent front door for dynamic service discovery and routing. It provides a unified entry point for all AI/ML services, with automatic module loading, security enforcement, and comprehensive observability.
```
Client Request → Front Door (dsp-fd2) → Control Tower → Module Discovery → Module Execution → Backend Service
                          ↓                   ↓                                   ↓
                     JWT Service        Vault Service                     Metrics/Logging
```
- Front Door Service: Main gateway handling request routing
- Module Manager: Dynamic module lifecycle management
- Control Tower Integration: Manifest fetching and caching
- Security Layer: JWT validation and secret management
- Module Interface: Standardized contract for all modules
- Python 3.11+
- Redis (for caching)
- PostgreSQL (optional, for audit logs)
- Install dependencies:

```bash
python -m venv .fd_venv
source .fd_venv/bin/activate  # On Windows: .fd_venv\Scripts\activate
pip install -r requirements.txt
```
- Start services with Docker Compose:

```bash
docker-compose up -d
```
- Verify the installation:

```bash
curl http://localhost:8080/health
```
Send requests to the Front Door with project and module information:
```bash
# OpenAI-compatible chat completion
curl -X POST http://localhost:8080/my-project/inference/v1/chat/completions \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "X-Environment: production" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
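The same request can be issued from Python. A minimal sketch using the `requests` library; the host, token, and project path are placeholders:

```python
import requests

# Placeholder values: substitute your gateway host, project, and JWT.
url = "http://localhost:8080/my-project/inference/v1/chat/completions"
headers = {
    "Authorization": "Bearer YOUR_JWT_TOKEN",
    "X-Environment": "production",
}
payload = {
    "model": "gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}],
}

response = requests.post(url, headers=headers, json=payload, timeout=30)
response.raise_for_status()
print(response.json())
```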
- Path-based (Recommended): `/{project}/{module}/endpoint`
- Header-based: `X-Project-Module: project/module` (see the sketch below)
- Subdomain-based: `project-module.api.company.com`
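For example, the header-based strategy carries the same project/module information in a header instead of the URL path. A hedged sketch; the endpoint path used without the project prefix is an assumption:

```python
import requests

# Same chat-completion call as above, but routed via the
# X-Project-Module header rather than the URL path (illustrative).
response = requests.post(
    "http://localhost:8080/v1/chat/completions",
    headers={
        "Authorization": "Bearer YOUR_JWT_TOKEN",
        "X-Project-Module": "my-project/inference",
    },
    json={"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]},
    timeout=30,
)
print(response.status_code)
```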
Modules are configured via manifests stored in the Control Tower:
```json
{
  "module_type": "inference_openai",
  "runtime": {
    "type": "python:3.11",
    "implementation": "src.modules.inference_openai.InferenceOpenAIModule"
  },
  "endpoints": {
    "dev": {
      "primary": "http://dev-llm-gateway:8080"
    },
    "prod": {
      "primary": "https://prod-llm-gateway"
    }
  },
  "configuration_references": [
    {
      "name": "api_key",
      "source": "vault://secrets/openai_key",
      "required": true
    }
  ]
}
```
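To illustrate how the manifest drives routing, here is a minimal sketch of environment-based endpoint resolution. The function name and fallback behavior are illustrative, not the gateway's actual internals:

```python
# Illustrative sketch: pick the backend endpoint for the requested
# environment, falling back to "dev" when the environment is unknown.
def resolve_endpoint(manifest: dict, environment: str) -> str:
    endpoints = manifest["endpoints"]
    env = environment if environment in endpoints else "dev"
    return endpoints[env]["primary"]

manifest = {
    "endpoints": {
        "dev": {"primary": "http://dev-llm-gateway:8080"},
        "prod": {"primary": "https://prod-llm-gateway"},
    }
}
assert resolve_endpoint(manifest, "prod") == "https://prod-llm-gateway"
```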
Key configuration options (see `.env.example` for the full list):

| Variable | Description | Default |
|---|---|---|
| `CONTROL_TOWER_URL` | Control Tower API endpoint | `http://localhost:8081` |
| `VAULT_URL` | HashiCorp Vault endpoint | `http://localhost:8200` |
| `JWT_SERVICE_URL` | JWT validation service | `http://localhost:8082` |
| `CACHE_TTL_SECONDS` | Manifest cache duration (seconds) | `300` |
| `MODULE_POOL_SIZE` | Max modules in memory | `10` |
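These map directly to environment variables. A minimal sketch of reading them from Python with the defaults above:

```python
import os

# Defaults mirror the table above; override via the environment or .env.
CONTROL_TOWER_URL = os.getenv("CONTROL_TOWER_URL", "http://localhost:8081")
VAULT_URL = os.getenv("VAULT_URL", "http://localhost:8200")
JWT_SERVICE_URL = os.getenv("JWT_SERVICE_URL", "http://localhost:8082")
CACHE_TTL_SECONDS = int(os.getenv("CACHE_TTL_SECONDS", "300"))
MODULE_POOL_SIZE = int(os.getenv("MODULE_POOL_SIZE", "10"))
```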
- Implement the BaseModule interface:

```python
from typing import Any, Dict

from src.core.module_interface import BaseModule, ModuleConfig, ModuleRequest, ModuleResponse

class MyCustomModule(BaseModule):
    async def initialize(self, config: ModuleConfig) -> None:
        await super().initialize(config)
        # Your initialization logic

    async def handle_request(self, request: ModuleRequest) -> ModuleResponse:
        # Process the request
        return ModuleResponse(
            status_code=200,
            body={"result": "success"}
        )

    async def health_check(self) -> Dict[str, Any]:
        return {"status": "healthy"}

    async def shutdown(self) -> None:
        # Clean up resources
        await super().shutdown()
```
- Register in manifest:
```json
{
  "module_type": "my_custom",
  "runtime": {
    "implementation": "src.modules.my_custom.MyCustomModule"
  }
}
```
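With the module implemented and registered, you can exercise its lifecycle locally before deploying. A hedged sketch assuming `ModuleConfig` and `ModuleRequest` accept keyword arguments along these lines; their exact constructors live in `src.core.module_interface`:

```python
import asyncio

from src.core.module_interface import ModuleConfig, ModuleRequest
from src.modules.my_custom import MyCustomModule

async def main() -> None:
    module = MyCustomModule()
    # Constructor arguments below are illustrative assumptions.
    await module.initialize(ModuleConfig(module_type="my_custom"))
    response = await module.handle_request(
        ModuleRequest(method="POST", path="/run", body={"input": "test"})
    )
    print(response.status_code, response.body)
    print(await module.health_check())
    await module.shutdown()

asyncio.run(main())
```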
Prometheus metrics are available at `/metrics`:

- `fd_requests_total`: Total request count
- `fd_request_duration_seconds`: Request latency
- `fd_active_requests`: Currently active requests
- `fd_module_load_seconds`: Module loading time
- `fd_cache_hit_ratio`: Cache effectiveness
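A quick way to spot-check these series from Python; a sketch that assumes the gateway is running locally:

```python
import requests

# Fetch the Prometheus exposition text and print only the fd_* series.
metrics = requests.get("http://localhost:8080/metrics", timeout=5).text
for line in metrics.splitlines():
    if line.startswith("fd_"):
        print(line)
```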
```bash
# Basic health
curl http://localhost:8080/health

# Detailed health with dependency status (quoted so the shell passes "?" through)
curl "http://localhost:8080/health?detailed=true"
```
Structured JSON logs with correlation IDs:
```json
{
  "timestamp": "2024-01-15T10:00:00Z",
  "level": "INFO",
  "request_id": "abc-123",
  "message": "Request processed",
  "duration": 0.042,
  "module": "inference",
  "status": 200
}
```
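A minimal sketch of emitting logs in this shape with the standard library; the field names match the example above, but the formatter itself is illustrative, not the gateway's actual logging setup:

```python
import json
import logging
from datetime import datetime, timezone

class JsonFormatter(logging.Formatter):
    """Render records as single-line JSON with a correlation ID."""
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "request_id": getattr(record, "request_id", None),
            "message": record.getMessage(),
        })

handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("fd")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("Request processed", extra={"request_id": "abc-123"})
```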
- JWT Bearer Tokens: Primary authentication method
- API Keys: Alternative for service-to-service
- mTLS: Optional for enhanced security
All secrets are stored in HashiCorp Vault and injected at runtime:
```python
# Secrets are automatically available in the module config
api_key = self.config.runtime_references.get("api_key")
```
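Under the hood, a `vault://` reference like the one in the manifest resolves to a Vault read. A hedged sketch using the `hvac` client; treating `secrets` as a KV v2 mount point, and the `value` key inside the secret, are assumptions:

```python
import hvac

# Illustrative resolution of "vault://secrets/openai_key":
# "secrets" as the KV v2 mount point, "openai_key" as the secret path.
client = hvac.Client(url="http://localhost:8200", token="dev-only-token")
secret = client.secrets.kv.v2.read_secret_version(
    path="openai_key", mount_point="secrets"
)
api_key = secret["data"]["data"]["value"]  # key name inside the secret is an assumption
```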
Configurable per-client rate limits with burst support:
```bash
RATE_LIMIT_REQUESTS_PER_MINUTE=60
RATE_LIMIT_BURST_SIZE=10
```
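Conceptually this is a token bucket: the burst size is the bucket capacity and the per-minute rate is the refill rate. A minimal sketch of the idea, not the gateway's actual limiter:

```python
import time

class TokenBucket:
    """Allow up to `burst` immediate requests, refilling at `rate_per_minute`."""
    def __init__(self, rate_per_minute: float = 60, burst: int = 10):
        self.rate = rate_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst)
        self.tokens = float(burst)
        self.updated = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

bucket = TokenBucket(rate_per_minute=60, burst=10)
print([bucket.allow() for _ in range(12)])  # first 10 pass, then throttled
```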
```bash
# Unit tests
pytest tests/unit -v

# Integration tests
pytest tests/integration -v

# Load tests using k6
k6 run tests/load/scenario.js

# End-to-end tests
./scripts/e2e_test.sh
```
```bash
docker build -t dsp-fd2:latest .
docker run -p 8080:8080 --env-file .env dsp-fd2:latest
```
Comprehensive documentation is available in the `docs/` directory:
- Documentation Index - Complete documentation overview
- Langfuse Quick Start - Get started with LLM observability
- APISIX Integration - API Gateway integration guide
- Security Design - Security architecture and best practices
- Scalability & Resilience - High availability strategies
See the docs directory for the complete documentation set.
- ✅ Refactored monolithic client into modular architecture
- ✅ Added Langfuse integration for LLM observability
- ✅ Implemented composition pattern for better maintainability
- ✅ Created comprehensive plugin builders
- ✅ Full documentation suite
See Refactoring Summary for details.
- Phase 1: Basic routing and module interface
- Phase 2: Control Tower integration
- Phase 3: Security implementation
- Phase 4: Dynamic module loading
- Phase 5: OpenAI-compatible inference module
- Phase 6: APISIX Gateway integration
- Phase 7: Langfuse observability integration
- Phase 8: Production monitoring enhancements
- Phase 9: Auto-scaling and optimization
- Phase 10: Additional module types (RAG, Data Processing, etc.)
- Review the documentation
- Check existing issues and PRs
- Follow the module development guide
- Add tests for new features
- Update documentation
Built with ❤️ by the DSP Platform Team