[AI] Search API Development #1809

@ad-m-ss

Ticket Information

Context & Background

Develop REST API endpoints for semantic search functionality that allows finding case-related articles using vector similarity. This includes search request handling, result ranking, logging, and performance optimization for the AI article indexing system.

Reference Documents:

Requirements & Acceptance Criteria

  • Install Django REST Framework if not present in the project
  • Create case-related articles API endpoint (/api/knowledge/case-related/)
  • Implement semantic similarity search for case recommendations
  • Create result ranking system based on similarity scores and metadata
  • Add comprehensive search request logging and analytics
  • Implement pagination for search results with configurable page sizes
  • Add result filtering by relevance threshold and article metadata
  • Create performance monitoring and caching for frequent queries
  • Add comprehensive unit and integration tests for all API endpoints

Implementation Steps

1. Django REST Framework Setup

Add required packages to requirements.txt:

  • djangorestframework>=3.14.0
  • django-filter>=23.0

Configure Django settings:

  • Add 'rest_framework' and 'django_filters' to INSTALLED_APPS
  • Configure REST_FRAMEWORK settings with authentication, permissions, and pagination
  • Set up default filter backends and page size configuration
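
A possible settings fragment for the steps above, as a sketch only. The setting keys and class paths are standard Django REST Framework and django-filter names; the choice of authentication classes, the page size of 10, and the `rest_framework.authtoken` app (needed only if token authentication is used) are assumptions, not decisions recorded in this ticket.

```python
# settings.py (fragment). Values are illustrative assumptions.
INSTALLED_APPS = [
    # ... existing project apps go here ...
    "rest_framework",
    "rest_framework.authtoken",  # only required for TokenAuthentication
    "django_filters",
]

REST_FRAMEWORK = {
    # Token or session auth, matching the "Authentication: Required" spec line.
    "DEFAULT_AUTHENTICATION_CLASSES": [
        "rest_framework.authentication.TokenAuthentication",
        "rest_framework.authentication.SessionAuthentication",
    ],
    "DEFAULT_PERMISSION_CLASSES": [
        "rest_framework.permissions.IsAuthenticated",
    ],
    "DEFAULT_FILTER_BACKENDS": [
        "django_filters.rest_framework.DjangoFilterBackend",
    ],
    "DEFAULT_PAGINATION_CLASS": "rest_framework.pagination.PageNumberPagination",
    "PAGE_SIZE": 10,  # assumed default; the ticket only asks for it to be configurable
}
```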

2. Search API Serializers

Create knowledge/serializers.py with the following serializers:

  • ArticleSearchResultSerializer: Serialize article search results with similarity scores
    • Include fields: id, title, excerpt, url, published_at, categories, tags, similarity_score, relevance_snippet
    • Add method to generate article excerpt from content if not available
  • CaseSearchRequestSerializer: Validate case search requests
    • Fields: case_description, limit, min_similarity, categories, published_after
    • Include validation for search parameters and limits

3. Search Service Implementation

Create knowledge/services/search_service.py with ArticleSearchService class:

  • Case-Related Search: Main search method for finding articles related to legal cases
  • Embedding Generation: Generate embeddings for search queries using OpenAI client
  • Vector Search: Use ChromaDB integration to find similar article chunks
  • Result Aggregation: Aggregate chunk results by article and calculate relevance scores
  • Result Enhancement: Add relevance snippets and metadata to search results
  • Search Logging: Log all search requests for analytics and monitoring

Key methods:

  • search_case_related_articles() - Main search functionality
  • _build_search_filters() - Build ChromaDB filter conditions
  • _aggregate_and_rank_results() - Aggregate chunks by article and rank
  • _enhance_results_with_snippets() - Add relevance snippets to results
  • _log_search_request() - Log search requests for analytics
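
The aggregation step could look roughly like the sketch below. It is framework-free so the ranking logic can be read on its own. `ChunkHit`, `aggregate_and_rank`, and the `matching_chunks` field are hypothetical names, and the scoring rule (an article's score is its best chunk similarity, ties broken by number of matching chunks) is an assumption; the ticket does not prescribe a formula.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ChunkHit:
    """One chunk returned by the vector search (hypothetical shape)."""
    article_id: int
    similarity: float  # assumed to be already converted to a 0..1 similarity
    text: str


def aggregate_and_rank(hits, min_similarity=0.6, limit=10):
    """Group chunk hits by article and rank articles by their best chunk."""
    best = {}                  # article_id -> highest-scoring ChunkHit
    counts = defaultdict(int)  # article_id -> number of chunks above threshold
    for hit in hits:
        if hit.similarity < min_similarity:
            continue
        counts[hit.article_id] += 1
        current = best.get(hit.article_id)
        if current is None or hit.similarity > current.similarity:
            best[hit.article_id] = hit

    # Highest similarity first; more matching chunks and lower id break ties.
    ranked = sorted(
        best.values(),
        key=lambda h: (-h.similarity, -counts[h.article_id], h.article_id),
    )
    return [
        {
            "id": h.article_id,
            "similarity_score": round(h.similarity, 4),
            "relevance_snippet": h.text[:200],  # best chunk doubles as snippet
            "matching_chunks": counts[h.article_id],
        }
        for h in ranked[:limit]
    ]
```

Using the best chunk as the snippet keeps `_enhance_results_with_snippets()` cheap; a weighted or averaged score would be an equally valid choice if long articles turn out to be over- or under-ranked.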

4. API Views Implementation

Create knowledge/views.py with search API views:

  • CaseRelatedArticlesView: API endpoint for case-related article search
  • Request Validation: Validate search requests using serializers
  • Search Execution: Execute search using ArticleSearchService
  • Response Formatting: Format search results with proper serialization
  • Error Handling: Handle search errors and return appropriate responses
  • Performance Monitoring: Track response times and search performance
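
The request flow in the view might be sketched as below, independent of Django so it can run standalone. `handle_case_search`, `SearchError`, and the injected `search_fn` are hypothetical, and the status codes (400 for invalid input, 503 for a failed search) are assumptions, since the ticket does not define error codes yet.

```python
import time


class SearchError(Exception):
    """Hypothetical exception raised by the search service on failure."""


def handle_case_search(payload, search_fn):
    """Validate, search, time the call, and return (status_code, body)."""
    started = time.perf_counter()

    description = payload.get("case_description") if isinstance(payload, dict) else None
    if not isinstance(description, str) or not description.strip():
        return 400, {"error": "case_description is required"}

    try:
        results = search_fn(description)
    except SearchError as exc:
        # Report the failure without leaking a traceback to the client.
        return 503, {"error": f"search failed: {exc}"}

    elapsed_ms = (time.perf_counter() - started) * 1000
    return 200, {
        "results": results,
        "count": len(results),
        "search_metadata": {"response_time_ms": round(elapsed_ms, 2)},
    }
```

In the real view, `search_fn` would presumably be `ArticleSearchService.search_case_related_articles`, and the measured time could also be written to the search log to check the 2-second target.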

5. URL Configuration

Create knowledge/urls.py with API URL patterns:

  • Configure case-related articles endpoint: /api/knowledge/case-related/
  • Add URL namespace for knowledge API endpoints
  • Include proper URL routing with parameter validation

Update main urls.py to include knowledge API URLs

6. Search Analytics and Logging

Implement comprehensive search analytics:

  • Search Request Logging: Log all search requests with SearchLog model
  • Performance Metrics: Track response times and result quality
  • Usage Analytics: Monitor search patterns and popular queries
  • Error Tracking: Log search errors and failures for debugging

7. Performance Optimization

Implement performance optimizations:

  • Query Caching: Cache frequent search queries to improve response times
  • Result Caching: Cache search results for repeated queries
  • Database Optimization: Optimize database queries for search operations
  • Pagination: Implement efficient pagination for large result sets
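
One way to approach the caching items, as a sketch: derive a stable cache key from the normalized query plus filters, and store results with a time-to-live. `make_cache_key` and `TTLCache` are hypothetical names, the 300-second TTL is an assumption, and the in-memory dictionary is only a stand-in for whatever cache backend the project uses.

```python
import hashlib
import json
import time


def make_cache_key(case_description, filters):
    """Build a deterministic key so equivalent queries share a cache entry."""
    normalized = {
        # Lowercase and collapse whitespace so trivial variations still hit.
        "q": " ".join(case_description.lower().split()),
        "filters": filters,
    }
    payload = json.dumps(normalized, sort_keys=True, default=str)
    return "case-search:" + hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLCache:
    """Minimal in-memory cache with per-entry expiry (illustration only)."""

    def __init__(self, ttl_seconds=300, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # drop expired entries lazily
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self._clock() + self._ttl, value)
```

Hashing keeps keys short even for 5000-character descriptions. The TTL bounds how long results can lag behind newly indexed articles.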

8. API Documentation

Create comprehensive API documentation:

  • Endpoint Documentation: Document all API endpoints with parameters
  • Request/Response Examples: Provide example requests and responses
  • Error Codes: Document possible error codes and meanings
  • Usage Guidelines: Provide guidelines for optimal API usage

Code Changes Required

  • Install Django REST Framework and django-filter
  • Create knowledge/serializers.py with search serializers
  • Create knowledge/services/search_service.py with search logic
  • Create knowledge/views.py with API views
  • Create knowledge/urls.py with URL configuration
  • Update main urls.py to include knowledge API URLs
  • Add comprehensive unit and integration tests
  • Create API documentation

External Documentation

Deliverables

  1. Complete REST API for case-related article search
  2. Semantic similarity search functionality
  3. Result ranking and filtering system
  4. Search request logging and analytics
  5. Performance optimization and caching
  6. Comprehensive API documentation
  7. Unit and integration test suite
  8. API usage guidelines and examples

API Specification

Case-Related Articles Endpoint

  • URL: /api/knowledge/case-related/
  • Method: POST
  • Authentication: Required (Token or Session)
  • Request Format: JSON with case_description, optional filters
  • Response Format: JSON with paginated article results
  • Response Time: < 2 seconds (requirement from Phase 1 plan)

Request Parameters

  • case_description: Text description of the legal case (required, max 5000 chars)
  • limit: Maximum results to return (optional, default 10, max 50)
  • min_similarity: Minimum similarity threshold (optional, default 0.6)
  • categories: Filter by article categories (optional, array)
  • published_after: Filter by publication date (optional, date)
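
The parameter rules above could be enforced along these lines. This is a plain-Python sketch of the checks that CaseSearchRequestSerializer would need, not the serializer itself. `validate_case_search_request` is a hypothetical helper; the 0 to 1 range for min_similarity, the minimum limit of 1, and the ISO 8601 date format are assumptions the ticket does not state.

```python
from datetime import date

MAX_DESCRIPTION_CHARS = 5000
DEFAULT_LIMIT, MAX_LIMIT = 10, 50
DEFAULT_MIN_SIMILARITY = 0.6


def validate_case_search_request(data):
    """Return (cleaned, errors); errors is empty when the request is valid."""
    cleaned, errors = {}, {}

    description = data.get("case_description")
    if not isinstance(description, str) or not description.strip():
        errors["case_description"] = "This field is required."
    elif len(description) > MAX_DESCRIPTION_CHARS:
        errors["case_description"] = f"At most {MAX_DESCRIPTION_CHARS} characters."
    else:
        cleaned["case_description"] = description.strip()

    limit = data.get("limit", DEFAULT_LIMIT)
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= MAX_LIMIT:
        errors["limit"] = f"Must be an integer between 1 and {MAX_LIMIT}."
    else:
        cleaned["limit"] = limit

    sim = data.get("min_similarity", DEFAULT_MIN_SIMILARITY)
    if isinstance(sim, bool) or not isinstance(sim, (int, float)) or not 0 <= sim <= 1:
        errors["min_similarity"] = "Must be a number between 0 and 1."
    else:
        cleaned["min_similarity"] = float(sim)

    categories = data.get("categories", [])
    if not isinstance(categories, list) or not all(isinstance(c, str) for c in categories):
        errors["categories"] = "Must be an array of strings."
    else:
        cleaned["categories"] = categories

    published_after = data.get("published_after")
    if published_after is None:
        cleaned["published_after"] = None
    else:
        try:
            cleaned["published_after"] = date.fromisoformat(published_after)
        except (TypeError, ValueError):
            errors["published_after"] = "Must be a date in YYYY-MM-DD format."

    return cleaned, errors
```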

Response Format

  • results: Array of article objects with similarity scores
  • count: Total number of results found
  • pagination: Pagination metadata (next, previous, page info)
  • search_metadata: Search performance and metadata
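
A sketch of how the response envelope might be assembled from an already ranked result list. The top-level keys come from the list above; `build_response` is a hypothetical helper, and the fields inside `pagination` and `search_metadata` (page numbers instead of URLs, `response_time_ms`) are assumptions.

```python
import math


def build_response(ranked_results, page=1, page_size=10, elapsed_ms=0.0):
    """Slice one page out of the ranked results and wrap it with metadata."""
    count = len(ranked_results)
    total_pages = max(1, math.ceil(count / page_size))
    page = min(max(page, 1), total_pages)  # clamp out-of-range page numbers
    start = (page - 1) * page_size
    return {
        "results": ranked_results[start:start + page_size],
        "count": count,
        "pagination": {
            "page": page,
            "page_size": page_size,
            "total_pages": total_pages,
            "next": page + 1 if page < total_pages else None,
            "previous": page - 1 if page > 1 else None,
        },
        "search_metadata": {"response_time_ms": elapsed_ms},
    }
```

Because limit caps a search at 50 results, slicing an in-memory list is likely sufficient here; a DRF pagination class would replace this helper if results were backed by a queryset.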

Performance Requirements

  • API response time < 2 seconds for typical queries
  • Support concurrent searches from multiple users
  • Handle search queries up to 5000 characters
  • Return only results with similarity scores ≥ 0.6 by default (the min_similarity default)
  • Support pagination for large result sets

Next Steps

  • Upon completion, enable [AI] Quality Assurance Testing #1810
  • Schedule API testing with product team
  • Integrate with case management system for real-world testing
