CNN-based multiclass ship type classifier (13 classes, 32×32 RGB) built under Kaggle’s 30-layer constraint. Implements a compact wide CNN with BatchNorm, Dropout, AdamW, label smoothing, and progressive training across multiple augmented datasets. Includes balanced data generation, efficient TPU/CPU pipelines, and reproducible training strategy.

Ship Classification (Kaggle)

This README explains how we achieved strong performance under the Kaggle competition constraint of a maximum of 30 layers, and how to use and extend this notebook efficiently.

Competition context and constraints

  • Task: Multiclass ship type classification from 32×32 RGB images (13 classes).
  • Compute: Kaggle TPU/GPU/CPU with automatic fallback.
  • Key constraint: Maximum of 30 layers in the model.

High-level approach for best performance

  1. Compact but expressive CNN under 30 layers
    • Three convolutional blocks with BatchNorm and Dropout to stabilize and regularize training.
    • Wide filters (base 256) to increase representational capacity without adding depth.
    • Final Dense head with BatchNorm + Dropout before softmax.
    • L2 weight decay, label smoothing, gradient clipping for extra stability.
  2. Progressive training using multiple pre-generated datasets
    • Start with a lightly augmented dataset ("base"), then train on stronger augmentations ("mild", "strong").
    • Preserves useful features early, then improves robustness and generalization in later phases.
  3. Balanced data via targeted augmentation
    • For each class, upsample with on-the-fly Keras preprocessing layers to reach a target count per class (median/max strategies).
    • Scale dataset size (scale_factor) per augmentation setting to control training signal and batch diversity.
  4. Careful training control
    • Early stopping with patience and ReduceLROnPlateau to converge without overfitting.
    • Checkpoint the best model by validation accuracy.
    • Stratified validation split at each phase for fair evaluation.
  5. Efficient input pipeline
    • Batch size scaled for TPU replicas when available; AUTOTUNE prefetching.
    • Memory checks to keep multiple augmented datasets feasible.
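The sketch below shows one way to implement this hardware fallback and prefetching with standard TensorFlow APIs; the BASE_BATCH_SIZE value and the prepare helper are illustrative names, not taken from the notebook.

```python
import tensorflow as tf

BASE_BATCH_SIZE = 64  # per-replica batch size; the actual value is a notebook setting

try:
    # Kaggle TPUs are discovered via the cluster resolver; anything else falls through.
    resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
    tf.config.experimental_connect_to_cluster(resolver)
    tf.tpu.experimental.initialize_tpu_system(resolver)
    strategy = tf.distribute.TPUStrategy(resolver)
except (ValueError, tf.errors.NotFoundError):
    strategy = tf.distribute.get_strategy()  # default strategy on CPU or a single GPU

# Scale the global batch size by the replica count so every TPU core stays busy.
batch_size = BASE_BATCH_SIZE * strategy.num_replicas_in_sync

def prepare(ds: tf.data.Dataset) -> tf.data.Dataset:
    # 32×32 images are small enough to cache; prefetching overlaps input work with training.
    return ds.cache().prefetch(tf.data.AUTOTUNE)
```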

Architecture and 30-layer compliance

We use a Sequential CNN designed to stay comfortably under the 30-layer cap while maintaining capacity:

  • Block 1: [Conv, BN, Conv, BN, MaxPool, Dropout]
  • Block 2: [Conv, BN, Conv, BN, MaxPool, Dropout]
  • Block 3: [Conv, BN, Conv, BN, MaxPool, Dropout]
  • Head: [Flatten, Dense(256), BN, Dropout, Dense(13, softmax)]

Key hyperparameters

  • Base filters: 256, doubling per block (256 -> 512 -> 1024 effective conv widths across blocks).
  • Regularization: L2 2e-5; Dropout 0.2 in conv blocks (0.4 after block 3), 0.5 before final layer.
  • Loss: Categorical cross-entropy with label_smoothing=0.1.
  • Optimizer: AdamW (lr 1e-3 initially, weight_decay 2e-4, clipnorm 1.0). LRs adjusted per phase.
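The following sketch shows how the block layout and these hyperparameters fit together in Keras. Kernel sizes (3×3), ReLU activations, and same-padding are assumptions, since the exact layer arguments are not spelled out above; AdamW requires a reasonably recent TensorFlow/Keras release.

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 13
L2 = keras.regularizers.l2(2e-5)  # shared weight regularizer (regularizers are stateless)

def conv_block(filters, dropout):
    # Conv, BN, Conv, BN, MaxPool, Dropout — six layers per block.
    return [
        layers.Conv2D(filters, 3, padding="same", activation="relu", kernel_regularizer=L2),
        layers.BatchNormalization(),
        layers.Conv2D(filters, 3, padding="same", activation="relu", kernel_regularizer=L2),
        layers.BatchNormalization(),
        layers.MaxPooling2D(),
        layers.Dropout(dropout),
    ]

model = keras.Sequential(
    [keras.Input(shape=(32, 32, 3))]
    + conv_block(256, 0.2)    # Block 1
    + conv_block(512, 0.2)    # Block 2
    + conv_block(1024, 0.4)   # Block 3
    + [
        layers.Flatten(),
        layers.Dense(256, activation="relu", kernel_regularizer=L2),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Dense(NUM_CLASSES, activation="softmax"),
    ]
)  # 23 layers in total, comfortably under the 30-layer cap

model.compile(
    optimizer=keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=2e-4, clipnorm=1.0),
    loss=keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
    metrics=["accuracy"],
)
```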

Why this works under depth limits

  • For 32×32 inputs, depth beyond roughly 25 layers tends to yield diminishing returns; extra width plus BatchNorm and regularization is more impactful.
  • Label smoothing and weight decay improve calibration and robustness, especially with strong augmentations.

Progressive training strategy

Phases (example used):

  • Phase 1 — dataset: "base", epochs: 50, lr: 0.001
  • Phase 2 — dataset: "mild", epochs: 100, lr: 0.005
  • Phase 3 — dataset: "strong", epochs: 100, lr: 0.005

At each phase:

  • Stratified 80/20 train/val split for the current dataset.
  • EarlyStopping (patience 20) + ReduceLROnPlateau (factor 0.5, patience 5).
  • Checkpoint best weights by val_accuracy to avoid overfitting regressions.
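A sketch of the per-phase loop under these settings, assuming model is the compiled network from the earlier sketch; the datasets mapping and make_split helper are hypothetical placeholders, and the metric monitored by EarlyStopping is an assumption.

```python
from tensorflow import keras

PHASES = [
    {"dataset": "base",   "epochs": 50,  "lr": 0.001},
    {"dataset": "mild",   "epochs": 100, "lr": 0.005},
    {"dataset": "strong", "epochs": 100, "lr": 0.005},
]

for phase in PHASES:
    # Hypothetical helper: stratified 80/20 split of the phase's augmented dataset.
    x_train, x_val, y_train, y_val = make_split(datasets[phase["dataset"]])

    # Reset the learning rate for the new data distribution.
    model.optimizer.learning_rate.assign(phase["lr"])

    callbacks = [
        keras.callbacks.EarlyStopping(monitor="val_accuracy", patience=20,
                                      restore_best_weights=True),
        keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=5),
        keras.callbacks.ModelCheckpoint("best_model.keras", monitor="val_accuracy",
                                        save_best_only=True),
    ]
    model.fit(x_train, y_train,
              validation_data=(x_val, y_val),
              epochs=phase["epochs"],
              callbacks=callbacks)
```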

Rationale

  • Start easy (less augmentation) to learn stable features quickly.
  • Increase augmentation strength to improve invariance and generalization.
  • Reset LR per phase to re-accelerate learning on the new distribution.

Data pipeline, balancing, and augmentation

  • Source: ships32 folder extracted from the Kaggle dataset archive.
  • Loading: keras.utils.image_dataset_from_directory (shuffled, batch_size adapted to hardware).
  • Normalization: x / 255.0 when assembling normalized datasets.
  • Class balancing: For each class, target a per-class count derived from median or max of class frequencies, then synthesize missing samples with:
    • RandomFlip (horizontal/vertical), RandomRotation, RandomZoom, RandomTranslation
    • Optional: RandomContrast, RandomBrightness, GaussianNoise
  • Dataset scaling: scale_factor per augmentation config (e.g., 1.0, 1.4, 1.1) to tune total training signal.
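A sketch of the loading and per-class upsampling steps under these assumptions; DATA_DIR, the batch size, the augmentation magnitudes, and the balance_class helper are illustrative rather than the notebook's exact code.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

DATA_DIR = "ships32"  # hypothetical path to the extracted folder

# Load 32×32 RGB images with one-hot labels for categorical cross-entropy.
raw_ds = keras.utils.image_dataset_from_directory(
    DATA_DIR, image_size=(32, 32), batch_size=256,
    shuffle=True, label_mode="categorical")

# On-the-fly Keras preprocessing layers used to synthesize the missing samples.
augment = keras.Sequential([
    layers.RandomFlip("horizontal_and_vertical"),
    layers.RandomRotation(0.1),
    layers.RandomZoom(0.1),
    layers.RandomTranslation(0.1, 0.1),
])

def balance_class(images: np.ndarray, target: int) -> np.ndarray:
    """Upsample one class to `target` images by augmenting randomly chosen originals."""
    n_missing = target - len(images)
    if n_missing <= 0:
        return images.astype("float32") / 255.0
    idx = np.random.choice(len(images), size=n_missing, replace=True)
    extra = augment(images[idx], training=True).numpy()
    # Normalize to [0, 1] when assembling the balanced dataset.
    return np.concatenate([images, extra]).astype("float32") / 255.0
```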

Tips

  • Keep augmentations realistic for small 32×32 images; overly strong transforms can erase the discriminative signal.
  • Use moderate translation/rotation/zoom and enable flips if class semantics allow it.

Validation protocol and metrics

  • Stratified split (20% validation) at each phase, using label indices for stratification.
  • Monitored metrics: val_accuracy (primary), val_loss (secondary).
  • Plot training curves with clear phase boundaries to diagnose transitions.
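A sketch of the stratified split, assuming the images and one-hot labels are held in NumPy arrays x and y and that scikit-learn is available (as on Kaggle); the random_state is illustrative.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Class indices derived from the one-hot labels drive the stratification.
label_ids = np.argmax(y, axis=1)
x_train, x_val, y_train, y_val = train_test_split(
    x, y, test_size=0.2, stratify=label_ids, random_state=42)
```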

Reproducibility checklist

  • Set NumPy and TensorFlow seeds before data generation and training.
  • Log the augmentation configs and per-phase LR/epochs.
  • Pin the final best model (best_model.keras) and the exact seed producing it.
  • Keep the same stratified split strategy for comparable validation numbers.
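Seeding might look like the following sketch (the seed value itself is illustrative; record whichever seed produced the pinned model).

```python
import random
import numpy as np
import tensorflow as tf

SEED = 42
random.seed(SEED)
np.random.seed(SEED)
tf.random.set_seed(SEED)
# In recent Keras versions, keras.utils.set_random_seed(SEED) covers all three at once.
```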

Authors
