Commit a7e2735

Merge pull request #982 from NamanVer02/main
Automated Quiz Generator from Folder Files for Personalized Practice
2 parents c3902cf + 55aa187 commit a7e2735

3 files changed: +211 -0 lines changed


llm_quiz_generator/README.md

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
# Automated Quiz Generator from PDF Files

This Python script generates multiple-choice questions (MCQs) from the PDF files stored in a folder. It extracts the text from the PDFs, uses a large language model (LLM) to generate unique questions with four answer options each, and saves the quiz as a text file.

- **Automatic PDF Text Extraction**: Extracts text from all PDFs in a specified folder.
- **MCQ Generation**: Generates unique multiple-choice questions with four answer options and identifies the correct answer.
- **Quiz Output**: Saves the quiz in a text format for easy review.

## Setup Instructions

This guide walks you through setting up and running the Automated Quiz Generator, which converts PDF content into multiple-choice questions (MCQs). Before running the script, you need to create a Groq account to get an API key and set up your Python environment.

### Step 1: Create a Groq Account and Get an API Key (Free)

1. **Visit Groq's Console**:
   Open your web browser and go to [Groq Console](https://console.groq.com/login).

2. **Log In or Create an Account**:
   Log in with your email or GitHub account, or create a new Groq account for free.

3. **Generate an API Key**:
   - After logging in, navigate to the "API Keys" section in the Groq console.
   - Click the "Create API Key" button.
   - Enter a name for your API key (e.g., `quiz_key`).
   - **Important**: Groq displays the key **only once**, right after you create it. Copy it immediately.

4. **Save the API Key**:
   You will need this key to run the quiz generator script.

---

### Step 2: Add Your API Key to the Script

You can supply your API key in one of two ways. Option 1 is recommended: hard-coding an API key in the script exposes it to anyone who can see the code, for example when you share it or push it to GitHub. A `.env` file keeps the key separate from the code, and as long as `.env` is listed in your `.gitignore`, the key stays out of version control.

#### Option 1: Store the API Key in a `.env` File

1. Create a new file named `.env` in the same directory as your script.
2. Open the `.env` file in a text editor.
3. Add the following line to the `.env` file, replacing `your_groq_api_key_here` with your actual API key:
   ```
   GROQ_API_KEY="your_groq_api_key_here"
   ```
4. Save the `.env` file.
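
As a minimal sketch of how the script picks up this key: `main.py` calls `load_dotenv(find_dotenv())` and then reads `GROQ_API_KEY` from `os.environ`. The helper below, `read_api_key`, is a hypothetical name that is not part of the script. It performs the same lookup but reports a readable error when the variable is missing, and it assumes the `.env` file has already been loaded into the environment.

```python
import os


def read_api_key(env=os.environ, name="GROQ_API_KEY"):
    """Return the API key stored under `name`, or raise a readable error."""
    # Treat a missing variable and a blank value the same way.
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add it to your .env file or export it "
            "in your shell before running the script."
        )
    return key


if __name__ == "__main__":
    try:
        print("API key loaded, length:", len(read_api_key()))
    except RuntimeError as err:
        print(err)
```

Printing only the length avoids echoing the key itself to the terminal.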

#### Option 2: Paste the API Key Directly into the Script

1. Open the `main.py` file in your code editor.
2. Find the following line near the top of the script, just after the imports:
   ```python
   API_KEY = os.environ["GROQ_API_KEY"]
   ```
3. Replace that line with your API key, like this:
   ```python
   API_KEY = "your_groq_api_key_here"
   ```

---

### Step 3: Prepare the PDF Files

1. Create a folder called `Source` in the same directory as your Python script, if it does not already exist.
2. Place all the PDF files that you want to generate quizzes from inside the `Source` folder.
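
To confirm which files the script is likely to pick up, a quick check can help. This is a minimal sketch using only the standard library; `list_pdfs` is a hypothetical helper, not part of `main.py`, and it matches the `.pdf` extension case-insensitively, which is an assumption about what you want rather than a description of the script.

```python
from pathlib import Path


def list_pdfs(folder="Source"):
    """Return the PDF files directly inside `folder`, sorted by path."""
    root = Path(folder)
    if not root.is_dir():
        # A missing folder is reported as "no PDFs" rather than an error.
        return []
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    pdfs = list_pdfs()
    print(f"Found {len(pdfs)} PDF file(s) in 'Source'")
    for pdf in pdfs:
        print(" -", pdf.name)
```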

---

### Step 4: Install the Dependencies

1. Install all the required modules using this command:
   ```
   pip install -r requirements.txt
   ```

---

### Step 5: Run the Script

1. To generate the quiz, open a terminal or command prompt in the folder where the script is located and run the following command:
   ```
   python main.py
   ```
   This extracts the text from the PDFs, generates multiple-choice questions (MCQs) using the language model, and saves the output in a folder named `Generated_Quizes`.

---

## Output

The generated MCQ quiz is saved in the `Generated_Quizes` folder as a text file named `generated_mcq_quiz_<YYYY-MM-DD_HH-MM-SS>.txt`, where the suffix is the time the quiz was created. Each question has four options, and the correct answer is indicated.
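
The file name can be reproduced with a few lines of standard-library code. This is a minimal sketch that mirrors the naming used in `main.py`; `quiz_output_path` is a hypothetical helper, and the optional `now` argument exists only to make the result predictable in the example.

```python
import os
from datetime import datetime


def quiz_output_path(now=None, folder="Generated_Quizes"):
    """Build the timestamped quiz path in the same format main.py uses."""
    now = now or datetime.now()
    stamp = now.strftime("%Y-%m-%d_%H-%M-%S")
    return os.path.join(folder, f"generated_mcq_quiz_{stamp}.txt")


if __name__ == "__main__":
    # A fixed datetime makes the printed name reproducible.
    print(quiz_output_path(datetime(2024, 10, 5, 14, 30, 0)))
```

Because the timestamp has one-second resolution, two runs within the same second would write to the same file.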

## Author(s)

[Naman Verma](https://github.com/NamanVer02/)

## Disclaimers

The free tier of the Groq API has rate limits, so sending too many requests in a short period can cause errors. If you encounter a rate limit error, try reducing the number of PDFs in the `Source` folder or lowering the number of questions being generated. For the exact limits, refer to the [Groq API documentation](https://console.groq.com/settings/limits).
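
Another common way to cope with rate limits is to retry after a pause. The sketch below is a generic exponential-backoff wrapper, not something the script does today; `call_with_backoff` and its parameters are hypothetical, and catching a bare `Exception` is a simplification that real code should narrow to the rate-limit error raised by the client library.

```python
import time


def call_with_backoff(func, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call `func`, retrying with exponentially growing pauses on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception:
            # Give up and re-raise once the last attempt has failed.
            if attempt == max_attempts:
                raise
            # Pause base_delay, then 2x, then 4x, ... before retrying.
            sleep(base_delay * 2 ** (attempt - 1))
```

In `main.py`, the call to the language model could be wrapped as `call_with_backoff(lambda: retrieval_chain.invoke(query))`.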

llm_quiz_generator/main.py

Lines changed: 109 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,109 @@
import os
from PyPDF2 import PdfReader
from datetime import datetime
from langchain_groq import ChatGroq
from langchain.chains import RetrievalQA
from dotenv import load_dotenv, find_dotenv
from langchain_community.vectorstores import FAISS
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.text_splitter import CharacterTextSplitter

load_dotenv(find_dotenv())
API_KEY = os.environ["GROQ_API_KEY"]

# Change this if you want to set the number of MCQs
num_questions = 5


def extract_text_from_pdfs():
    """Extracts text from PDF files in the 'Source' folder."""
    print("Extracting text from PDF files in the folder: 'Source'...")
    all_text = []

    # Match the extension case-insensitively so files like "notes.PDF" are
    # not skipped, and exit early when the folder holds no PDFs at all.
    pdf_files = [f for f in os.listdir('Source') if f.lower().endswith(".pdf")]
    if not pdf_files:
        print("No PDF files found in the 'Source' folder!")
        print("Process exiting...")
        raise SystemExit(0)

    for file_name in pdf_files:
        file_path = os.path.join('Source', file_name)
        print(f"Processing file: {file_name}")
        reader = PdfReader(file_path)
        for page in reader.pages:
            # Pages without extractable text (e.g. scanned images) add "".
            all_text.append(page.extract_text() or "")
    print("Text extraction completed.")
    return " ".join(all_text)


def generate_unique_mcq(text, num_questions=5):
    """Generates unique multiple choice questions from text."""
    print("LLM processing...")
    text_splitter = CharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=0
    )
    docs = text_splitter.create_documents([text])

    embeddings = HuggingFaceEmbeddings()
    store = FAISS.from_documents(docs, embeddings)

    print(f"Connecting to LLM to generate {num_questions} unique MCQs...")
    llm = ChatGroq(
        temperature=0.2,
        model="llama-3.1-70b-versatile",
        api_key=API_KEY
    )

    retrieval_chain = RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",
        retriever=store.as_retriever()
    )

    quiz = []
    # The retriever supplies the relevant PDF chunks as context, so the query
    # carries only the instruction. Each literal ends with a space so the
    # sentences do not run together when Python concatenates them.
    query = (
        f"Generate {num_questions} unique multiple choice questions "
        "from the provided context. "
        "Provide 4 answer options for each question and also the correct "
        "answer in plaintext."
    )

    response = retrieval_chain.invoke(query)
    question_and_options = response['result']
    quiz.append(question_and_options)

    print("MCQ generation completed.")
    return quiz


def save_mcq_to_file(quiz, file_name=None):
    """Saves generated MCQs to a text file."""
    output_folder = "Generated_Quizes"

    if not os.path.exists(output_folder):
        os.makedirs(output_folder)
        print(f"Folder '{output_folder}' created.")

    # Use a timestamped name unless the caller passes one explicitly.
    if file_name is None:
        current_time = datetime.now().strftime("%Y-%m-%d_%H-%M-%S")
        file_name = f"generated_mcq_quiz_{current_time}.txt"
    file_path = os.path.join(output_folder, file_name)

    print(f"Saving the generated MCQs to file: '{file_path}'...")
    with open(file_path, "w", encoding="utf-8") as f:
        for i, question in enumerate(quiz, 1):
            f.write(f"Question {i}:\n{question}\n\n")

    print(f"MCQ Quiz saved to {file_path}")


if __name__ == "__main__":
    if not os.path.exists('Source'):
        print("Folder 'Source' not found.")
    else:
        print("Folder 'Source' found. Starting process...")
        text = extract_text_from_pdfs()
        print("Text extracted from PDFs.")

        mcq_quiz = generate_unique_mcq(text, num_questions=num_questions)
        save_mcq_to_file(mcq_quiz)
        print("Process completed successfully.")
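
The prompt passed to `retrieval_chain.invoke` is assembled from adjacent string literals, where the spacing between pieces is easy to lose. As a minimal sketch, a standalone helper makes the final string easy to inspect before it is sent; `build_quiz_query` and its `options_per_question` parameter are hypothetical names, not part of this commit.

```python
def build_quiz_query(num_questions, options_per_question=4):
    """Return the instruction sent to the retrieval chain as one string."""
    parts = [
        f"Generate {num_questions} unique multiple choice questions "
        "from the provided context.",
        f"Provide {options_per_question} answer options for each question "
        "and also the correct answer in plaintext.",
    ]
    # Joining with a single space keeps the sentences from running together.
    return " ".join(parts)


if __name__ == "__main__":
    print(build_quiz_query(5))
```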

llm_quiz_generator/requirements.txt

Lines changed: 6 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,6 @@
PyPDF2==3.0.1
langchain==0.3.3
langchain_groq==0.2.0
langchain_community==0.3.2
langchain_huggingface==0.1.0
faiss-cpu==1.9.0
