A high-performance MNIST digit classifier achieving 99.84% accuracy (0.16% error rate) using ensemble learning with Squeeze-and-Excitation blocks.
- Ensemble Learning: Combines 20 models with different random initializations for robust predictions
- Squeeze-and-Excitation Blocks: Channel-wise feature recalibration for improved representation
- Multi-Stage Training: Progressive learning rate decay for optimal convergence
- Data Augmentation: Rotation, shifting, shearing, and zooming for better generalization
- Mixed Precision Training: Faster training with FP16/FP32 mixed precision (when GPU available)
- Professional Code Structure: Clean, documented, and maintainable codebase
- Type Hints: Full type annotations for better IDE support and code clarity
| Metric | Value | 
|---|---|
| Test Accuracy | ~99.84% | 
| Error Rate | ~0.16% | 
| Individual Model Accuracy | 99.69% - 99.81% | 
| Training Time | ~30-45 min (GPU) / ~3-4 hours (CPU) | 
The model architecture consists of:
- Three Convolutional Blocks, each containing:
  - 3x Conv2D layers (128 filters, 3x3 kernel, ReLU activation)
  - Batch Normalization
  - Squeeze-and-Excitation block (SE ratio: 32)
  - Average Pooling (after blocks 2 and 3)
- Global Pooling Layer:
  - Concatenation of Global Max Pooling and Global Average Pooling
- Output Layer:
  - Dense layer with softmax activation
  - L1 regularization (0.00025) for ensemble performance
 
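
To make the channel recalibration in the Squeeze-and-Excitation block more concrete, here is a minimal NumPy sketch of the squeeze, excitation, and scale steps. It assumes that "SE ratio: 32" is a reduction ratio, so 128 channels are squeezed through a 4-unit bottleneck. The function name `se_block`, the 7x7 feature-map size, and the random weights are illustrative only; the repository implements the block as a custom TensorFlow layer, which may differ in detail.

```python
import numpy as np


def se_block(x, w1, b1, w2, b2):
    """Apply squeeze-and-excitation to one feature map of shape (H, W, C)."""
    # Squeeze: global average pooling reduces each channel to a single scalar.
    z = x.mean(axis=(0, 1))                    # shape (C,)
    # Excitation: bottleneck dense layer with ReLU, then a sigmoid gate.
    h = np.maximum(z @ w1 + b1, 0.0)           # shape (C // ratio,)
    s = 1.0 / (1.0 + np.exp(-(h @ w2 + b2)))   # shape (C,), values in (0, 1)
    # Scale: reweight every channel of the input by its gate value.
    return x * s


channels, ratio = 128, 32        # filter count and SE ratio from the list above
reduced = channels // ratio      # 4 bottleneck units (assumed interpretation)

rng = np.random.default_rng(0)
x = rng.standard_normal((7, 7, channels))                # hypothetical feature map
w1 = rng.standard_normal((channels, reduced)) * 0.1      # random stand-in weights
b1 = np.zeros(reduced)
w2 = rng.standard_normal((reduced, channels)) * 0.1
b2 = np.zeros(channels)

y = se_block(x, w1, b1, w2, b2)
print(y.shape)  # (7, 7, 128)
```

The output has the same shape as the input; only the relative weight of each channel changes, which is why the block can be placed after any convolutional stage.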
- Python 3.8+
- pip package manager
- Clone the repository:

      git clone https://github.com/Matuzas77/MNIST-0.17.git
      cd MNIST-0.17

- Install dependencies:

      pip install -r requirements.txt

Run the complete training pipeline:

    python mnist_classifier.py

This will:
- Load and preprocess the MNIST dataset
- Train 20 models with different initializations
- Evaluate each model individually
- Compute ensemble predictions
- Report final accuracy
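
The "compute ensemble predictions" step can be sketched as follows. This is a minimal example that assumes soft voting, meaning the per-model softmax outputs are averaged before taking the arg max; the repository may combine models differently. The function `ensemble_predict` and the toy probabilities are hypothetical and not taken from the codebase.

```python
import numpy as np


def ensemble_predict(prob_list):
    """Average per-model class probabilities and return the most likely class."""
    # prob_list holds one array of shape (n_samples, n_classes) per model.
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs.argmax(axis=1)


# Toy data: 3 hypothetical models, 2 samples, 3 classes.
probs = [
    np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]]),  # wrong on sample 1
    np.array([[0.4, 0.5, 0.1], [0.1, 0.3, 0.6]]),  # wrong on sample 0
    np.array([[0.7, 0.2, 0.1], [0.2, 0.2, 0.6]]),  # right on both
]
y_true = np.array([0, 2])

y_pred = ensemble_predict(probs)
accuracy = float((y_pred == y_true).mean())
print(y_pred, accuracy)  # [0 2] 1.0
```

In this toy case two of the three models each misclassify one sample, yet the averaged probabilities recover both labels. The same effect is why the ensemble can exceed the accuracy of its individual members.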
The pipeline can also be run from Python with a custom configuration:

    from mnist_classifier import Config, EnsembleTrainer, DataPreprocessor

    # Customize configuration
    config = Config()
    config.NUM_MODELS = 10             # Train fewer models
    config.BATCH_SIZE = 64             # Larger batch size
    config.USE_MIXED_PRECISION = True  # Enable mixed precision

    # Load data
    preprocessor = DataPreprocessor(config)
    (x_train, y_train), (x_test, y_test) = preprocessor.load_and_preprocess_data()

    # Train ensemble
    trainer = EnsembleTrainer(config)
    models = trainer.train_ensemble(x_train, y_train, x_test, y_test)

    # Evaluate
    accuracy, predictions = trainer.evaluate_ensemble(x_test, y_test)
    print(f"Ensemble accuracy: {accuracy:.4f}")

Key parameters can be adjusted in the Config class:
| Parameter | Default | Description | 
|---|---|---|
| NUM_MODELS | 20 | Number of models in the ensemble | 
| BATCH_SIZE | 32 | Training batch size | 
| INITIAL_LEARNING_RATE | 0.001 | Starting learning rate | 
| CONV_FILTERS | 128 | Number of filters in Conv2D layers | 
| USE_MIXED_PRECISION | True | Enable FP16/FP32 mixed precision | 
| ROTATION_RANGE | 10 | Data augmentation rotation (degrees) | 

    MNIST-0.17/
    ├── mnist_classifier.py         # Main training script
    ├── MNIST_final_solution.ipynb  # Original Jupyter notebook
    ├── requirements.txt            # Python dependencies
    └── README.md                   # This file

This refactored version includes:
- ✅ Comprehensive docstrings (Google style)
- ✅ Type hints for all functions
- ✅ Proper logging with timestamps
- ✅ Configuration management
- ✅ Error handling
- ✅ PEP 8 compliant formatting
- ✅ Mixed precision training (up to about 2x faster on compatible GPUs)
- ✅ Modern TensorFlow APIs (replaced the deprecated `fit_generator` method and `lr` parameter)
- ✅ Optimized data pipeline
- ✅ Better memory management
- ✅ Proper kernel initialization (he_normal)
- ✅ Fixed normalization bug in preprocessing
- ✅ SE block as custom layer for reusability
- ✅ Better BatchNorm placement
- ✅ Named layers for better debugging
- ✅ Proper regularization strategy
The training uses a multi-stage learning rate schedule:
- Stage 1 (13 epochs): LR = 0.001 - Initial rapid learning
- Stage 2 (3 epochs): LR = 0.0001 - Fine-tuning
- Stage 3 (3 epochs): LR = 0.00001 - Precision improvement
- Final (1 epoch): Original data without augmentation
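
The three augmented stages above can be expressed as a simple epoch-to-learning-rate lookup. This is a sketch only: the function `learning_rate_for_epoch` and the `STAGES` table are hypothetical, and the repository may instead run each stage as a separate training call. The final un-augmented epoch is left out because its learning rate is not stated here.

```python
# (number of epochs, learning rate) for each augmented stage listed above.
STAGES = [(13, 1e-3), (3, 1e-4), (3, 1e-5)]


def learning_rate_for_epoch(epoch):
    """Return the learning rate for a 0-based epoch index."""
    start = 0
    for num_epochs, lr in STAGES:
        if epoch < start + num_epochs:
            return lr
        start += num_epochs
    raise ValueError(f"epoch {epoch} is past the last stage")


total_epochs = sum(num_epochs for num_epochs, _ in STAGES)
schedule = [learning_rate_for_epoch(e) for e in range(total_epochs)]

print(total_epochs)  # 19
print(schedule[12], schedule[13], schedule[16])
```

A function of this shape could be passed to a per-epoch scheduler callback, so the stage boundaries at epochs 13 and 16 stay in one place rather than being spread across several training calls.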
Expected output (20 models):

    Model 1 accuracy: 0.9973 (error: 0.27%)
    Model 2 accuracy: 0.9977 (error: 0.23%)
    ...
    Model 20 accuracy: 0.9975 (error: 0.25%)
    Ensemble accuracy: 0.9984
    Ensemble error rate: 0.16%

- tensorflow >= 2.10.0
- numpy >= 1.21.0
- scikit-learn >= 1.0.0
MIT License
Contributions are welcome! Please feel free to submit a Pull Request.
- Squeeze-and-Excitation Networks: Hu et al., 2018
- MNIST Dataset: Yann LeCun et al.
If you use this code in your research, please cite:

    @software{mnist_ensemble_classifier,
      title={MNIST Ensemble Classifier with SE Blocks},
      author={Professional ML Pipeline},
      year={2024},
      url={https://github.com/Matuzas77/MNIST-0.17}
    }