This repository contains the Image Text Recognition and Sentiment Analysis project.
Build a two-stage AI pipeline that can:
- Read Text from an Image 📷
  - Leverage a model trained on the IIIT-5K Words dataset to detect and transcribe any text in an image (e.g., a sign, screenshot, or advertisement).
- Analyze Sentiment ❤️ 😭 😐
  - Feed the extracted text into a Recurrent Neural Network (RNN) trained on the Twitter Sentiment Dataset to classify it as positive 👍, negative 👎, or neutral 😐.
The goal is to create an end-to-end system that can
- “see” an image 👁️
- “read” its contents 📝
- “understand” the tone of the message 🧠
🔗 Source & Download
- Curated by IIIT-Hyderabad (IIIT-H).
- Distributed as a .tar.gz archive with annotations in MATLAB .mat files.
- Official download: IIIT5K-Words official site.
🗂️ Size & Structure
- Total images: 5,000 single‐word crops
- Suggested split:
  - Train: 3,000 images
  - Test: 2,000 images
- Image format: JPG/PNG, each containing one isolated word
- Annotations:
  - testdata.mat → list of image paths + word labels
  - testCharBound.mat → per-character bounding-box coordinates
🔣 Content & Complexity
- Typographic variability:
  - Multiple fonts, sizes and styles (italic, bold, serif, sans-serif)
- Real-world challenges:
  - Noisy or semi-transparent backgrounds
  - Partially occluded characters
  - Compression artifacts and blur
🏷️ Labeling
- Each image shows exactly one English word.
- Ground‐truth
Although the standard Wine Quality Dataset does not include missing values, in real-world scenarios it is very likely that some physicochemical measurements may be absent when evaluating a wine. Therefore, a fundamental aspect of this application is its capacity to manage the absence of one or more input values provided by the user. 🤷♂️
- Python
- OCR dataset: available at the following link: OCR Images
- Sentiment dataset: available at the following link: Sentiment dataset
- Clone the project on your computer:
git clone https://github.com/C102002/proyecto-ia-2
Note
Python Version 3.11 🚀:
- Dependency Compatibility: Python 3.11 avoids known dependency conflicts with libraries such as Keras and TensorFlow. ⚙️
- Bug Fixes & Stability: This version includes fixes and improvements that make ML workflows run more reliably. 🐛✅
- Optimized Performance: Thanks to core runtime improvements, Python 3.11 executes faster during data processing and model training. ⚡💻
Using Python 3.11 is therefore recommended for running this project.
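A quick way to confirm that the active interpreter matches this note is a small check like the one below. It is only a sketch: the helper name `is_supported_python` and the strict match on 3.11 are assumptions made for illustration and are not part of the project.

```python
import sys

# Assumed from the note above: the project targets Python 3.11.
REQUIRED = (3, 11)


def is_supported_python(version_info=None):
    """Return True when the major.minor version equals REQUIRED."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) == REQUIRED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if is_supported_python():
        print(f"Python {current}: OK")
    else:
        print(f"Python {current}: expected {REQUIRED[0]}.{REQUIRED[1]}.x")
```

Running the file with the interpreter inside the virtual environment shows whether the environment was created with the intended version.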
- Create the Python virtual environment
# Run the following command to create a virtual environment in the project directory (Windows, using the py launcher):
py -3.11 -m venv venv
# macOS and Linux, where the py launcher is not available:
python3.11 -m venv venv
- Activate the virtual environment
# Windows (using Command Prompt):
venv\Scripts\activate
# Windows (using PowerShell):
.\venv\Scripts\activate.ps1
# macOS and Linux:
source venv/bin/activate
- Install the dependencies
# Run the following command:
pip install -r requirements.txt
- Update dependencies
# Run the following command to update the requirements file:
pip freeze > requirements.txt
NT
# Run this in PowerShell if the requirements file contains strange (garbled) values:
pip freeze | Out-File requirements.txt -Encoding utf8
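Garbled values in a requirements file are often a symptom of the file having been written as UTF-16, which is what the `>` redirect produces in Windows PowerShell 5.1. As an alternative to regenerating the file, the sketch below rewrites an existing file as plain UTF-8. The function name `normalize_requirements` is hypothetical, and the BOM-based detection is an assumption about how the file was damaged.

```python
import codecs
from pathlib import Path


def normalize_requirements(path):
    """Rewrite a requirements file as BOM-less UTF-8.

    Returns the name of the codec used to decode the original bytes.
    """
    path = Path(path)
    raw = path.read_bytes()
    if raw.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        detected = "utf-16"     # this codec reads the BOM to pick the byte order
    elif raw.startswith(codecs.BOM_UTF8):
        detected = "utf-8-sig"  # this codec strips the UTF-8 BOM
    else:
        detected = "utf-8"
    text = raw.decode(detected)
    # Use Unix line endings so the file is identical on every platform.
    text = text.replace("\r\n", "\n")
    path.write_bytes(text.encode("utf-8"))
    return detected


if __name__ == "__main__":
    print(normalize_requirements("requirements.txt"))
```

Calling it on a file that is already UTF-8 leaves the content unchanged apart from line endings, so it is safe to run more than once.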
STR model [1] trained and evaluated on the IIIT-5K Words dataset [2].
Detects and transcribes words in images with varied fonts, sizes, and noise levels, producing a clean, ordered text string.
Bidirectional LSTM RNN [3][4] trained on the Twitter Sentiment Dataset [5].
Classifies each extracted fragment as positive 😊, negative 😞, or neutral 😐, revealing the underlying intent and tone.
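To illustrate the final classification step, the sketch below turns a three-way score vector into one of the labels above. The label order in `LABELS` and the function name `label_from_scores` are assumptions; the order the trained model actually uses is not stated here and should be checked against its training code.

```python
# Assumed class order; verify against the label encoding used in training.
LABELS = ("negative", "neutral", "positive")


def label_from_scores(scores, labels=LABELS):
    """Return (label, score) for the highest-scoring class."""
    if len(scores) != len(labels):
        raise ValueError(f"expected {len(labels)} scores, got {len(scores)}")
    best = max(range(len(scores)), key=lambda i: scores[i])
    return labels[best], float(scores[best])


if __name__ == "__main__":
    # Hypothetical model output for one text fragment.
    print(label_from_scores([0.1, 0.2, 0.7]))
```

Returning the score alongside the label lets the caller treat low-confidence predictions differently, for example by reporting them as uncertain.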
The final app ties both models into a simple pipeline: upload an image, extract its text, then analyze its sentiment.
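The flow just described can be sketched as one function that chains two callables. Everything named here (`analyze_image`, `PipelineResult`, and the `recognize_text` and `classify_sentiment` parameters) is hypothetical and stands in for the project's real model wrappers, which this README does not show.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class PipelineResult:
    text: str
    sentiment: Optional[str]  # None when no text was found in the image


def analyze_image(
    image_path: str,
    recognize_text: Callable[[str], str],
    classify_sentiment: Callable[[str], str],
) -> PipelineResult:
    """Run stage 1 (text recognition), then stage 2 (sentiment) on its output."""
    text = recognize_text(image_path).strip()
    if not text:
        # Nothing to classify, so skip the sentiment model entirely.
        return PipelineResult(text="", sentiment=None)
    return PipelineResult(text=text, sentiment=classify_sentiment(text))


if __name__ == "__main__":
    # Stub callables stand in for the two trained models.
    result = analyze_image(
        "sign.png",
        recognize_text=lambda path: " great deal ",
        classify_sentiment=lambda text: "positive",
    )
    print(result)
```

Passing the two stages in as parameters keeps the sketch independent of any particular model library and makes the empty-text case easy to exercise with stubs.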
- Example of usage
# In the root of the project
python -m app.main
Then wait a moment for the main menu to appear:
? Bienvenido, ¿qué desea hacer? (Use arrow keys)
» 1. Cargar imagen
2. Probar con un ejemplo
3. Instrucciones
4. ¿Quiénes somos?
5. Informacion de los modelos
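6. Salir
(The menu is in Spanish. The prompt reads "Welcome, what would you like to do?" and the options are: 1. Load image, 2. Try an example, 3. Instructions, 4. About us, 5. Model information, 6. Exit.)
Video example of correct usage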
| Hualong Chiang 📖 | Alfredo Fung 📖 | Daniel Bortot 📖 | Juan Perdomo 📖 | Gabriela Martinez 📖 |
This project is licensed under the Apache License. See the LICENSE file for more details.

