StudyFetch AI Tutor is a web application that helps students understand PDF documents through an interactive split-screen interface. Users can upload PDFs and chat with an AI about the document's content; the AI can reference and highlight relevant parts of the PDF in real time.
- 🔐 User Authentication: Secure email/password signup and login with session management
- 📄 PDF Upload & Viewing: Upload, store, and navigate PDF documents
- 💬 AI Chat Interface: Interact with the AI about document content via text
- 🔍 Smart Document Search: Vector embeddings power semantic retrieval of relevant document content
- 📌 Context-Aware Responses: AI references specific page numbers and content from the PDF
- 📝 Persistent Conversations: Chat history is saved and can be resumed later
- 🔄 Multi-Document Support: Upload and manage multiple documents with separate conversation histories
- Next.js 15+ with App Router
- React 19
- TailwindCSS for styling
- React PDF for PDF rendering
- Next.js API Routes
- PostgreSQL with pgvector extension for vector similarity search
- Prisma ORM for database operations
- AWS S3 for PDF storage
- OpenAI GPT-4 for chat responses
- Custom Embeddings Service using sentence-transformers
- LangChain for document processing
Image credit: https://www.dailydoseofds.com/
The application follows a Retrieval Augmented Generation (RAG) approach:
- PDF documents are processed into chunks
- Each chunk gets a vector embedding representing its semantic meaning
- When the user asks a question, relevant chunks are retrieved via similarity search
- The AI generates a response based on the retrieved context
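The retrieval side of this pipeline can be sketched in a few lines of plain Python. This is an illustrative toy, not the project's implementation: the `embed` function below is a bag-of-characters stand-in for a real sentence-transformer model, and the chunk sizes are arbitrary.

```python
import math
from collections import Counter

def chunk(text, size=40, overlap=10):
    """Split a document into overlapping character chunks."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def embed(text):
    """Toy bag-of-characters 'embedding'; a real system would call a
    sentence-transformers model here."""
    counts = Counter(text.lower())
    return [counts.get(c, 0) for c in "abcdefghijklmnopqrstuvwxyz"]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question, chunks, k=1):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

In the real application the embeddings come from the Python embeddings service and the similarity search runs inside PostgreSQL via pgvector, but the control flow is the same: chunk, embed, rank by similarity, pass the top chunks to the LLM as context.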
- Web Application: Next.js app for frontend and API routes
- Embeddings Service: FastAPI service for document processing and embedding generation
- PostgreSQL Database: Stores user data, documents, conversations, and vector embeddings
- Node.js v18+
- Docker and Docker Compose
- OpenAI API key
- AWS S3 credentials (for production deployment)
git clone https://github.com/CruiseDevice/ai-tutor
cd ai-tutor
npm install
# start postgresql and build/run the embeddings service
docker-compose up -d
# verify the embeddings service is running
curl http://localhost:8000/health
The embeddings service runs in a Docker container and exposes the following endpoints:
- `GET /health`: Check if the service is running
- `POST /embeddings`: Generate embeddings for a single text
- `POST /batch-embeddings`: Generate embeddings for multiple texts
- `POST /process-document`: Process a PDF document into chunks with embeddings
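A client hitting these endpoints just needs to POST JSON to the service URL. The sketch below builds (but does not send) requests with the standard library; the JSON field names (`text`, `texts`) are assumptions for illustration, not the service's documented schema.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # EMBEDDINGS_SERVICE_URL from .env

def build_request(path, payload=None):
    """Build a request to the embeddings service.

    With no payload, returns a GET request (e.g. /health); with a payload,
    returns a POST with a JSON body.
    """
    if payload is None:
        return urllib.request.Request(f"{BASE_URL}{path}")
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}{path}",
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

health = build_request("/health")
single = build_request("/embeddings", {"text": "What is RAG?"})
batch = build_request("/batch-embeddings", {"texts": ["chunk one", "chunk two"]})
```

Sending any of these with `urllib.request.urlopen(req)` requires the Docker service from the previous step to be up.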
Create a `.env` file in the root directory:
# Database
DATABASE_URL="postgresql://postgres:postgres@localhost:5432/studyfetch"
# S3 Storage
AWS_REGION="us-east-1"
AWS_ACCESS_KEY_ID="your-access-key"
AWS_SECRET_ACCESS_KEY="your-secret-key"
S3_PDFBUCKET_NAME="your-bucket-name"
# Embeddings Service
EMBEDDINGS_SERVICE_URL="http://localhost:8000"
# Environment
NODE_ENV="development"
npx prisma generate
npx prisma db push
node scripts/setup-pgvector.js
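The setup script enables the pgvector extension in PostgreSQL; once enabled, similarity search is a plain SQL query ordered by pgvector's `<=>` cosine-distance operator. A hedged sketch of such a query builder follows; the table and column names are illustrative assumptions, the project's actual schema lives under `prisma/`.

```python
def similarity_search_sql(table="document_chunks", embedding_col="embedding", k=5):
    """Build a pgvector cosine-distance query with a $1 vector parameter.

    `<=>` is pgvector's cosine-distance operator, so `1 - distance` is a
    similarity score. Table/column names here are assumptions for illustration.
    """
    return (
        f"SELECT id, content, 1 - ({embedding_col} <=> $1::vector) AS score "
        f"FROM {table} "
        f"ORDER BY {embedding_col} <=> $1::vector "
        f"LIMIT {k}"
    )
```

The question's embedding is passed as the `$1` parameter at execution time, so the database, not the application, does the nearest-neighbor ranking.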
Start the development server:
npm run dev
The application will be available at http://localhost:3000.
- Register/Login: Create an account or sign in
- API Setup: Navigate to API Settings and add your OpenAI API key
- Upload a PDF: On the dashboard, click "Upload PDF" to begin
- Chat with the Document: Ask questions about the PDF content
- Document History: Access previous documents from the sidebar
/
├── embeddings/ # Embeddings service (FastAPI)
│ ├── document_processor.py # PDF processing logic
│ └── embeddings_service.py # API endpoints
├── prisma/ # Database schema and migrations
├── public/ # Static assets
├── scripts/ # Setup scripts
├── src/
│ ├── app/ # Next.js app directory
│ │ ├── api/ # API routes
│ │ ├── dashboard/ # Dashboard page
│ │ ├── login/ # Login page
│ │ └── register/ # Registration page
│ ├── components/ # React components
│ │ ├── ChatInterface.tsx # Chat UI
│ │ ├── Dashboard.tsx # Main application component
│ │ └── EnhancedPDFViewer.tsx # PDF viewer with annotation
│ └── lib/ # Utility libraries
│ ├── auth.ts # Authentication utilities
│ ├── db.ts # Database client
│ └── pgvector.ts # Vector search functions
└── docker-compose.yml # Docker services configuration
- `/api/auth/*`: Authentication endpoints (login, register, logout)
- `/api/documents`: PDF upload and processing
- `/api/conversations`: Conversation management
- `/api/chat`: AI messaging endpoint
After schema changes:
npx prisma migrate dev --name your_migration_name
To change the embeddings model, update the model name in:
- `/embeddings/document_processor.py`
- `/embeddings/embeddings_service.py`
The application can be deployed on Vercel with the following considerations:
- Set up a PostgreSQL database with pgvector extension (e.g., using Supabase or Neon)
- Deploy the embeddings service separately (e.g., on a server or containerized service)
- Configure environment variables in your hosting platform
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- OpenAI for the GPT API
- Sentence Transformers for embeddings
- LangChain for document processing utilities
- Vercel for Next.js hosting infrastructure