MrGraversen/spring-data-openai-rag-starter

πŸ•΅οΈ spring-boot-starter-data-rag

Talk to your data like a human: AI writes the SQL and explains the results, powered by RAG

What is this?

A drop-in Spring Boot starter that lets you query your database in natural language.
No need to write SQL: just ask a question, and the AI will:

  1. Understand your database schema (via JDBC metadata)
  2. Generate a valid SQL query
  3. Execute it
  4. Explain the result in plain English
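
Step 1 relies on standard JDBC metadata, which is also enough to detect the database product. Here is a minimal sketch of that idea, not the starter's actual implementation: `SchemaSketch`, `readSchema` and `describe` are hypothetical names, while `DatabaseMetaData.getTables` and `getColumns` are the regular `java.sql` calls.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of the starter: reads table and column names via JDBC metadata. */
final class SchemaSketch {

    /** Collects table -> "column TYPE" entries for every table visible to the connection. */
    static Map<String, List<String>> readSchema(Connection connection) throws SQLException {
        DatabaseMetaData metaData = connection.getMetaData();
        Map<String, List<String>> schema = new LinkedHashMap<>();
        try (ResultSet tables = metaData.getTables(null, null, "%", new String[] {"TABLE"})) {
            while (tables.next()) {
                schema.put(tables.getString("TABLE_NAME"), new ArrayList<>());
            }
        }
        // One pass over all columns; schema names are ignored here for brevity.
        try (ResultSet columns = metaData.getColumns(null, null, "%", "%")) {
            while (columns.next()) {
                List<String> target = schema.get(columns.getString("TABLE_NAME"));
                if (target != null) {
                    target.add(columns.getString("COLUMN_NAME") + " " + columns.getString("TYPE_NAME"));
                }
            }
        }
        return schema;
    }

    /** Renders the schema as plain text that could be placed in a model prompt. */
    static String describe(String databaseProduct, Map<String, List<String>> schema) {
        StringBuilder text = new StringBuilder("Database: ").append(databaseProduct).append('\n');
        schema.forEach((table, columns) ->
                text.append(table).append('(').append(String.join(", ", columns)).append(")\n"));
        return text.toString();
    }
}
```

The product name passed to `describe` could come from `connection.getMetaData().getDatabaseProductName()`, which is one way to tell Postgres from MySQL without extra configuration.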

🔧 Features

  • ✅ Plug-and-play with Spring Boot 3
  • ✅ Works with any JDBC-supported database
  • ✅ Auto-detects your database type (Postgres, MySQL, etc.)
  • ✅ Runs on your schema, no JPA @Entity classes needed
  • ✅ Simple interface to ask questions and get answers

🚀 Quick Example

The following example demonstrates how to use the starter with the Pagila sample database for Postgres, which is a well-known open-source schema designed for learning and testing SQL queries. Pagila includes realistic tables such as customer, payment, and others, making it ideal for showcasing natural language to SQL capabilities powered by RAG and AI.

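import java.util.function.Consumer;

import lombok.RequiredArgsConstructor;
import lombok.extern.slf4j.Slf4j;
import org.springframework.boot.ApplicationArguments;
import org.springframework.boot.ApplicationRunner;
import org.springframework.stereotype.Component;
// Plus imports for DatabaseRagFacade and DatabaseRagResult from the starter's own packages.

@Slf4j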
@Component
@RequiredArgsConstructor
public class ExampleRunner implements ApplicationRunner {
    // Just inject DatabaseRagFacade and that's it! You are immediately connected to your database.
    private final DatabaseRagFacade databaseRagFacade;

    @Override
    public void run(ApplicationArguments args) {
        // Ask a natural language question about your database.
        final var question = "Which five customers have spent the most on rentals in the past year?";
        databaseRagFacade.askDatabaseAsync(question).thenAccept(explainResult());
    }

    private Consumer<DatabaseRagResult> explainResult() {
        return result -> {
            log.info("Question: {}", result.getQuestion());
            log.info("Generated SQL: {}", result.getExecution().getSql());
            log.info("Explanation: {}", result.getExplanation().getResult());
            log.info("Cost: Input: {} tokens / Output: {} tokens", result.getTotalInputTokens(), result.getTotalOutputTokens());
        };
    }
}

Sample AI-generated SQL

SELECT 
    c.customer_id AS "Customer ID",
    c.first_name AS "First Name",
    c.last_name AS "Last Name",
    SUM(p.amount) AS "Total Spent"
FROM 
    customer c
JOIN (
    SELECT customer_id, amount FROM payment_p2022_01
    UNION ALL
    SELECT customer_id, amount FROM payment_p2022_02
    UNION ALL
    SELECT customer_id, amount FROM payment_p2022_03
    UNION ALL
    SELECT customer_id, amount FROM payment_p2022_04
    UNION ALL
    SELECT customer_id, amount FROM payment_p2022_05
    UNION ALL
    SELECT customer_id, amount FROM payment_p2022_06
    UNION ALL
    SELECT customer_id, amount FROM payment_p2022_07
) p ON c.customer_id = p.customer_id
GROUP BY 
    c.customer_id, c.first_name, c.last_name
ORDER BY 
    "Total Spent" DESC
LIMIT 5;

Sample AI-generated explanation

The five customers who have spent the most on rentals in the past year are Karl Seal, Eleanor Hunt, Clara Shaw, Rhonda Kennedy, and Marion Snyder. Karl Seal spent the highest amount at $221.55, followed by Eleanor Hunt with $216.54. Clara Shaw spent $195.58, while both Rhonda Kennedy and Marion Snyder each spent $194.61.
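
The Quick Example attaches only a success callback, so a failed model call or a rejected SQL statement would go unnoticed. Below is a minimal sketch of routing both outcomes, assuming `askDatabaseAsync` returns a `CompletableFuture` (the `thenAccept` call suggests at least a `CompletionStage`). `AsyncHandlingSketch` and its `handle` method are hypothetical and not part of the starter.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical stand-in: a plain CompletableFuture plays the role of the facade's async result. */
final class AsyncHandlingSketch {

    /** Routes the outcome to exactly one callback: the value on success, the throwable on failure. */
    static <T> CompletableFuture<Void> handle(
            CompletableFuture<T> future, Consumer<T> onSuccess, Consumer<Throwable> onFailure) {
        return future.handle((result, error) -> {
            if (error != null) {
                onFailure.accept(error);
            } else {
                onSuccess.accept(result);
            }
            return null;
        });
    }
}
```

Under that assumption, the runner could call `AsyncHandlingSketch.handle(databaseRagFacade.askDatabaseAsync(question), explainResult(), error -> log.error("Question failed", error))`.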

🧠 Under the hood

This project uses the RAG pattern:
→ It dynamically builds a metamodel from your connected database
→ That metamodel is used to generate SQL queries based on the AI's understanding of your natural language question
→ The AI is prepared to understand your database technology, whether it's Postgres, MySQL, or any other JDBC-supported database
→ The SQL queries are executed against your database
→ The results are explained in natural language
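
The flow above can be sketched in a few lines. This is an illustration under stated assumptions, not the starter's code: `RagPipelineSketch` and `Answer` are hypothetical, and the model and the database are reduced to plain functions so the sketch runs without OpenAI or JDBC.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch of the flow described above; none of these names come from the starter. */
final class RagPipelineSketch {

    /** Outcome of one question: the generated SQL, the raw rows, and the explanation. */
    record Answer(String sql, List<Map<String, Object>> rows, String explanation) {}

    private final String schemaDescription;                               // text form of the metamodel
    private final Function<String, String> model;                         // prompt -> completion
    private final Function<String, List<Map<String, Object>>> database;   // SQL -> rows

    RagPipelineSketch(String schemaDescription,
                      Function<String, String> model,
                      Function<String, List<Map<String, Object>>> database) {
        this.schemaDescription = schemaDescription;
        this.model = model;
        this.database = database;
    }

    Answer ask(String question) {
        // Retrieval-augmented step: the schema is placed in the prompt next to the question.
        String sql = model.apply(
                "Schema:\n" + schemaDescription + "\nWrite one SQL query answering: " + question);
        // Execute the generated SQL.
        List<Map<String, Object>> rows = database.apply(sql);
        // Second model call: turn the rows into a plain-English explanation.
        String explanation = model.apply(
                "Question: " + question + "\nSQL: " + sql + "\nRows: " + rows
                        + "\nExplain the result in plain English.");
        return new Answer(sql, rows, explanation);
    }
}
```

Keeping the model and the database behind function types makes the flow easy to exercise with fakes before any real credentials are involved.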

βš™οΈ Getting Started

Add the dependency (Maven coordinates coming soon) and ensure your application.yml is configured with your database + OpenAI credentials.

ai:
  spring:
    data:
      rag:
        openai-api-key: your-openai-key-here
        openai-model-name: gpt-4.1-mini
        temperature: 0.0
        explain-results: true
        fallback-to-plain-on-explain-failure: false

  • openai-api-key: Your OpenAI API key, not mine!
  • openai-model-name: The model to use for SQL generation (default is gpt-4.1-mini)
  • temperature: Controls the model's "creative freedom" in SQL generation and explanation (0.0 = deterministic)
  • explain-results: Whether to explain the SQL results in natural language (default is true)
  • fallback-to-plain-on-explain-failure: If true, returns the raw SQL results when the explanation fails (default is false)

📚 License

MIT – do cool things with it.

💬 Why?

Because writing SELECT statements for the hundredth time is boring, and teaching AI to do it is awesome.
