A high-performance MNIST digit classifier achieving 99.84% accuracy (0.16% error rate) using ensemble learning with Squeeze-and-Excitation blocks.
- Ensemble Learning: Combines 20 models with different random initializations for robust predictions
- Squeeze-and-Excitation Blocks: Channel-wise feature recalibration for improved representation
- Multi-Stage Training: Progressive learning rate decay for optimal convergence
- Data Augmentation: Rotation, shifting, shearing, and zooming for better generalization (see the sketch after this list)
- Mixed Precision Training: Faster training with FP16/FP32 mixed precision (when GPU available)
- Professional Code Structure: Clean, documented, and maintainable codebase
- Type Hints: Full type annotations for better IDE support and code clarity
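As a concrete illustration of the data augmentation feature above, here is a minimal sketch using Keras' `ImageDataGenerator`. The rotation range matches `ROTATION_RANGE` in the configuration table below; the shift, shear, and zoom magnitudes are illustrative assumptions, not necessarily the repository's exact values:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Rotation matches ROTATION_RANGE=10 from the configuration table;
# the shift/shear/zoom magnitudes are assumed for illustration.
datagen = ImageDataGenerator(
    rotation_range=10,       # random rotations up to 10 degrees
    width_shift_range=0.1,   # horizontal shifts up to 10% of width
    height_shift_range=0.1,  # vertical shifts up to 10% of height
    shear_range=0.1,         # shear intensity
    zoom_range=0.1,          # random zoom in/out up to 10%
)
```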
| Metric | Value |
|---|---|
| Test Accuracy | ~99.84% |
| Error Rate | ~0.16% |
| Individual Model Accuracy | 99.69% - 99.81% |
| Training Time | ~30-45 min (GPU) / ~3-4 hours (CPU) |
The model architecture consists of:
- Three Convolutional Blocks, each containing:
  - 3x Conv2D layers (128 filters, 3x3 kernel, ReLU activation)
  - Batch Normalization
  - Squeeze-and-Excitation block (SE ratio: 32)
  - Average Pooling (after blocks 2 and 3)
- Global Pooling Layer:
  - Concatenation of Global Max Pooling and Global Average Pooling
- Output Layer:
  - Dense layer with softmax activation
  - L1 regularization (0.00025) for ensemble performance
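For reference, here is a minimal sketch of the architecture's two distinctive pieces, written against the standard Keras API. The class and function names are illustrative, not necessarily those used in `mnist_classifier.py`:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

class SqueezeExcitation(layers.Layer):
    """Channel-wise feature recalibration: squeeze to per-channel stats, then gate."""

    def __init__(self, ratio: int = 32, **kwargs):
        super().__init__(**kwargs)
        self.ratio = ratio

    def build(self, input_shape):
        channels = int(input_shape[-1])
        self.reduce = layers.Dense(max(channels // self.ratio, 1), activation="relu")
        self.expand = layers.Dense(channels, activation="sigmoid")

    def call(self, inputs):
        w = tf.reduce_mean(inputs, axis=[1, 2])  # squeeze: (batch, channels)
        w = self.expand(self.reduce(w))          # excite: per-channel gates in (0, 1)
        return inputs * w[:, None, None, :]      # rescale the feature map

def classifier_head(features: tf.Tensor) -> tf.Tensor:
    """Concatenated global pooling into an L1-regularized softmax, as listed above."""
    pooled = layers.Concatenate()([
        layers.GlobalMaxPooling2D()(features),
        layers.GlobalAveragePooling2D()(features),
    ])
    return layers.Dense(10, activation="softmax",
                        kernel_regularizer=regularizers.l1(0.00025))(pooled)
```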
- Python 3.8+
- pip package manager
- Clone the repository:

  ```bash
  git clone https://github.com/Matuzas77/MNIST-0.17.git
  cd MNIST-0.17
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

Run the complete training pipeline:

```bash
python mnist_classifier.py
```

This will:
- Load and preprocess the MNIST dataset
- Train 20 models with different initializations
- Evaluate each model individually
- Compute ensemble predictions (see the sketch below)
- Report final accuracy
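The ensemble step is, in essence, soft voting over the models' softmax outputs. Note that the aggregation rule here is an assumption; the repository only states that ensemble predictions are computed:

```python
import numpy as np

def ensemble_predict(models, x_test):
    """Average the softmax outputs of all models, then take the argmax.

    Soft voting is assumed; `models` is a list of trained Keras models.
    """
    probs = np.mean([m.predict(x_test, verbose=0) for m in models], axis=0)
    return probs.argmax(axis=1)
```

For programmatic control over the pipeline, the usage example below shows the classes exported by `mnist_classifier`: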
```python
from mnist_classifier import Config, EnsembleTrainer, DataPreprocessor

# Customize configuration
config = Config()
config.NUM_MODELS = 10             # Train fewer models
config.BATCH_SIZE = 64             # Larger batch size
config.USE_MIXED_PRECISION = True  # Enable mixed precision

# Load data
preprocessor = DataPreprocessor(config)
(x_train, y_train), (x_test, y_test) = preprocessor.load_and_preprocess_data()

# Train ensemble
trainer = EnsembleTrainer(config)
models = trainer.train_ensemble(x_train, y_train, x_test, y_test)

# Evaluate
accuracy, predictions = trainer.evaluate_ensemble(x_test, y_test)
print(f"Ensemble accuracy: {accuracy:.4f}")
```

Key parameters can be adjusted in the `Config` class:
| Parameter | Default | Description |
|---|---|---|
| `NUM_MODELS` | 20 | Number of models in the ensemble |
| `BATCH_SIZE` | 32 | Training batch size |
| `INITIAL_LEARNING_RATE` | 0.001 | Starting learning rate |
| `CONV_FILTERS` | 128 | Number of filters in Conv2D layers |
| `USE_MIXED_PRECISION` | True | Enable FP16/FP32 mixed precision |
| `ROTATION_RANGE` | 10 | Data augmentation rotation (degrees) |
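A plausible shape for this configuration object, together with how the mixed-precision flag might be applied, is sketched below. This is hypothetical, inferred from the table rather than copied from the file:

```python
from dataclasses import dataclass
from tensorflow.keras import mixed_precision

@dataclass
class Config:
    # Defaults mirror the table above; the dataclass form is assumed.
    NUM_MODELS: int = 20
    BATCH_SIZE: int = 32
    INITIAL_LEARNING_RATE: float = 0.001
    CONV_FILTERS: int = 128
    USE_MIXED_PRECISION: bool = True
    ROTATION_RANGE: int = 10

config = Config()
if config.USE_MIXED_PRECISION:
    # Standard Keras policy: float16 compute with float32 variables
    mixed_precision.set_global_policy("mixed_float16")
```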
```
MNIST-0.17/
├── mnist_classifier.py          # Main training script
├── MNIST_final_solution.ipynb   # Original Jupyter notebook
├── requirements.txt             # Python dependencies
└── README.md                    # This file
```
This refactored version includes:
- ✅ Comprehensive docstrings (Google style)
- ✅ Type hints for all functions
- ✅ Proper logging with timestamps
- ✅ Configuration management
- ✅ Error handling
- ✅ PEP 8 compliant formatting
- ✅ Mixed precision training (2x faster on compatible GPUs)
- ✅ Modern TensorFlow APIs (replaced the deprecated `fit_generator` and `lr` parameter; see the sketch after this list)
- ✅ Optimized data pipeline
- ✅ Better memory management
- ✅ Proper kernel initialization (`he_normal`)
- ✅ Fixed normalization bug in preprocessing
- ✅ SE block as custom layer for reusability
- ✅ Better BatchNorm placement
- ✅ Named layers for better debugging
- ✅ Proper regularization strategy
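As referenced in the modernization item above, a minimal sketch of the replacement calls; `datagen` follows the augmentation sketch earlier, and the epoch count is illustrative:

```python
from tensorflow.keras.optimizers import Adam

# `lr` is deprecated; modern Keras uses `learning_rate`.
optimizer = Adam(learning_rate=1e-3)

# `fit_generator` is deprecated; Model.fit accepts generators directly:
# model.fit(datagen.flow(x_train, y_train, batch_size=32), epochs=20)
```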
The training uses a multi-stage learning rate schedule (20 epochs in total):
- Stage 1 (13 epochs): LR = 0.001 - Initial rapid learning
- Stage 2 (3 epochs): LR = 0.0001 - Fine-tuning
- Stage 3 (3 epochs): LR = 0.00001 - Precision improvement
- Final (1 epoch): Original data without augmentation
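One way to express the staged schedule is with a standard Keras callback. This is a sketch assuming the three augmented stages run back-to-back in a single `fit` call; the final augmentation-free epoch would be a separate `fit` on the original data:

```python
from tensorflow.keras.callbacks import LearningRateScheduler

def staged_lr(epoch: int, lr: float) -> float:
    """Stage 1: epochs 0-12, Stage 2: epochs 13-15, Stage 3: epochs 16-18."""
    if epoch < 13:
        return 1e-3
    if epoch < 16:
        return 1e-4
    return 1e-5

lr_callback = LearningRateScheduler(staged_lr)
```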
Expected output (20 models):

```
Model 1 accuracy: 0.9973 (error: 0.27%)
Model 2 accuracy: 0.9977 (error: 0.23%)
...
Model 20 accuracy: 0.9975 (error: 0.25%)

Ensemble accuracy: 0.9984
Ensemble error rate: 0.16%
```
- tensorflow >= 2.10.0
- numpy >= 1.21.0
- scikit-learn >= 1.0.0
MIT License
Contributions are welcome! Please feel free to submit a Pull Request.
- Squeeze-and-Excitation Networks: Hu et al., 2018
- MNIST Dataset: Yann LeCun et al.
If you use this code in your research, please cite:
```bibtex
@software{mnist_ensemble_classifier,
  title={MNIST Ensemble Classifier with SE Blocks},
  author={Professional ML Pipeline},
  year={2024},
  url={https://github.com/Matuzas77/MNIST-0.17}
}
```