Phase 2: Background Task Migration to Celery (Umbrella Issue) #1829

@ad-m-ss

Overview

Migrate all existing cron-based management commands to Celery tasks so that the current .contrib/docker/cron/ system can later be replaced with background processing that supports retries, monitoring, and additional workers. This umbrella issue tracks the migration of every background job; the cron system itself stays in place until Phase 3.

Scope

This issue coordinates the migration of all existing management commands to Celery tasks with proper error handling, retry logic, logging, and scheduling. This issue does NOT include removal of the legacy cron system - that is handled separately in Phase 3.

Current Background Jobs to Migrate

High Priority (Daily Operations)

  1. Court Session Parser - run_court_session_parser (Daily 23:10)

    • File: poradnia/judgements/management/commands/run_court_session_parser.py
    • Schedule: "10 23 * * *" (Daily at 23:10)
    • Complexity: High - Web scraping, database operations, event creation
  2. Event Reminders - send_event_reminders (Daily 12:00)

    • File: poradnia/events/management/commands/send_event_reminders.py
    • Schedule: "0 12 * * *" (Daily at 12:00)
    • Complexity: Medium - Email sending, event processing

Medium Priority (Periodic Maintenance)

  1. Old Cases Reminder - send_old_cases_reminder (Monthly)

    • File: poradnia/cases/management/commands/send_old_cases_reminder.py
    • Schedule: "0 6 2 * *" (Monthly on 2nd at 06:00)
    • Complexity: Medium - Case analysis, notification sending
  2. Django Session Cleanup - clearsessions (Daily)

    • Schedule: "@daily" (Daily)
    • Complexity: Low - Built-in Django command

Sub-Issues

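This umbrella issue is broken down into the following sub-issues:

  • #1830 - Migrate run_court_session_parser to Celery Task
  • #1831 - Migrate send_event_reminders to Celery Task
  • #1832 - Migrate send_old_cases_reminder to Celery Task
  • #1833 - Migrate Django clearsessions to Celery Task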

Implementation Strategy

Task Implementation Requirements

Each migrated task must include:

  • Robust error handling with exponential backoff retry logic
  • Comprehensive logging with structured output for monitoring
  • Progress tracking for long-running operations
  • Memory-efficient processing for large datasets
  • Graceful failure handling with appropriate notifications
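
To make the "exponential backoff" requirement concrete, the following is a minimal sketch of how retry delays grow. It is plain Python rather than the project's task code; the function name `backoff_delay` and the values (factor of 1 second, cap of 600 seconds) are assumptions chosen for illustration. In Celery itself this behaviour is normally requested through task options such as `autoretry_for`, `retry_backoff`, `retry_backoff_max`, `retry_jitter`, and `max_retries`, not hand-written.

```python
import random


def backoff_delay(retries, factor=1, maximum=600, jitter=False):
    """Seconds to wait before retry number `retries` (0-based).

    `factor` and `maximum` are illustrative values, not project settings.
    """
    delay = min(maximum, factor * (2 ** retries))
    if jitter:
        # Full jitter: pick a random delay in [0, delay] so that many failing
        # tasks do not all retry at the same moment.
        delay = random.uniform(0, delay)
    return delay


# Delays for the first five retries, without jitter.
schedule = [backoff_delay(n) for n in range(5)]
```

The cap matters for the long-running jobs listed above: without it, a task that keeps failing would wait longer and longer and could miss its next scheduled run entirely.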

Scheduling Migration

  • Set up Celery beat periodic schedules matching current cron timing
  • Use database-backed scheduling (django-celery-beat) for easier management
  • Maintain parallel execution during transition period for safety
  • Legacy cron system will remain until Phase 3 cleanup
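
One way to keep the Celery beat schedules identical to the current cron timing is to derive them from the existing cron expressions instead of retyping them. The sketch below is an assumption about how that could look, not the project's configuration: the helper `cron_to_crontab_kwargs` is hypothetical, and the dictionary keys are task labels taken from the command names above. The field names follow the keyword arguments of `celery.schedules.crontab`.

```python
# Field order of a five-field cron expression, named after the keyword
# arguments accepted by celery.schedules.crontab.
CRON_FIELDS = ("minute", "hour", "day_of_month", "month_of_year", "day_of_week")

# Only the macro that appears in the current job list is handled here.
CRON_MACROS = {"@daily": "0 0 * * *"}


def cron_to_crontab_kwargs(expression):
    """Split a cron expression into crontab-style keyword arguments."""
    expression = CRON_MACROS.get(expression, expression)
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


# Schedules copied from the "Current Background Jobs to Migrate" list.
LEGACY_SCHEDULES = {
    "run_court_session_parser": "10 23 * * *",
    "send_event_reminders": "0 12 * * *",
    "send_old_cases_reminder": "0 6 2 * *",
    "clearsessions": "@daily",
}

BEAT_KWARGS = {
    name: cron_to_crontab_kwargs(expr) for name, expr in LEGACY_SCHEDULES.items()
}
```

Each resulting dictionary can be passed as `crontab(**kwargs)`, or used to fill the matching fields of a django-celery-beat crontab schedule. Note that Celery evaluates crontab schedules in its configured `timezone` (UTC unless changed), so preserving the execution times also depends on that setting matching the timezone used by the cron container.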

Files Structure

Tasks will be organized by Django app:

poradnia/judgements/tasks.py    # Court session parser task
poradnia/events/tasks.py        # Event reminder tasks  
poradnia/cases/tasks.py         # Case-related tasks
poradnia/core/tasks.py          # Generic/utility tasks (clearsessions)

Dependencies

All sub-issues are blocked until the Phase 1 infrastructure setup is complete.

Current System Files

Cron Configuration (To Remain During This Phase)

  • .contrib/docker/cron/set_crontab.sh - Current cron job definitions (kept for parallel execution)
  • .contrib/docker/cron/run_locked.sh - Job execution wrapper (kept as backup)

Management Commands (To Be Migrated)

  • poradnia/judgements/management/commands/run_court_session_parser.py
  • poradnia/events/management/commands/send_event_reminders.py
  • poradnia/cases/management/commands/send_old_cases_reminder.py

Files to Modify/Create

New Task Files (To Create)

  • poradnia/judgements/tasks.py - Court session parser Celery task
  • poradnia/events/tasks.py - Event reminder Celery tasks
  • poradnia/cases/tasks.py - Case management Celery tasks
  • poradnia/core/tasks.py - Utility Celery tasks (clearsessions)

Configuration Updates

  • poradnia/settings/base.py - Celery beat schedule configuration
  • CLAUDE.md - Add Celery task management commands (keep cron docs during transition)

Legacy System (NO CHANGES IN THIS PHASE)

  • .contrib/docker/cron/set_crontab.sh - NO CHANGES (kept for safety)
  • docker-compose.yml - NO CHANGES to cron services (if any)

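Related Issues

Infrastructure Dependency

  • Phase 1 infrastructure setup (blocks every sub-issue below)

Sub-Issues (Task Migrations)

  • #1830, #1831, #1832, #1833 (described under Timeline)

Follow-up Work

  • Phase 3 (#1834): legacy cron system removal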

Acceptance Criteria

  • All current background jobs run successfully as Celery tasks
  • Original scheduling timing is preserved (same execution times)
  • Error handling and retry logic improve reliability over the cron system
  • Task execution can be monitored and managed through Celery
  • Database-backed scheduling allows runtime schedule modifications
  • Parallel execution with legacy cron system works safely during transition
  • Development workflow includes Celery task management commands
  • Legacy cron system remains functional for rollback capability
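
Running cron and Celery side by side means each job may be triggered twice per period, which matters most for the reminder emails. A possible safeguard, sketched below under stated assumptions, is an add-if-absent claim on a shared key so that only the first trigger does the work. `InMemoryCache` and `claim_run` are hypothetical names; the stand-in cache mimics the `add(key, value, timeout)` call of Django's cache API, and whether that call is atomic in practice depends on the cache backend.

```python
from datetime import date


class InMemoryCache:
    """Stand-in for a shared cache; only `add` is needed for this sketch."""

    def __init__(self):
        self._data = {}

    def add(self, key, value, timeout=None):
        # Store the value only if the key is absent and report whether it
        # was stored. The timeout is ignored by this in-memory stand-in.
        if key in self._data:
            return False
        self._data[key] = value
        return True


def claim_run(cache, job_name, run_date, timeout=24 * 60 * 60):
    """Return True only for the first caller claiming job_name on run_date."""
    key = f"job-run:{job_name}:{run_date.isoformat()}"
    return cache.add(key, "claimed", timeout)


cache = InMemoryCache()
today = date(2024, 1, 2)  # illustrative date
first = claim_run(cache, "send_event_reminders", today)   # e.g. the cron run
second = claim_run(cache, "send_event_reminders", today)  # e.g. the Celery run
```

Both the management command and the Celery task would call the guard before doing any work and return early when it reports `False`. For the monthly job, the key would need a per-month value instead of a per-day one.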

Migration Benefits

  • Robust error handling with automatic retries and exponential backoff
  • Real-time task monitoring and status tracking
  • Resource efficiency through persistent worker processes
  • Horizontal scaling capability via additional worker containers
  • Comprehensive logging and debugging tools
  • Database-backed scheduling for easier schedule management
  • Graceful failure handling with admin notifications
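
As an illustration of the structured logging mentioned above, the sketch below wraps a task body in a context manager that emits one JSON line per run with the task name, outcome, and duration. The logger name `poradnia.tasks` and the helper `log_task_run` are assumptions made for the example, not existing project code.

```python
import json
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("poradnia.tasks")  # hypothetical logger name


@contextmanager
def log_task_run(task_name, log=logger):
    """Emit one JSON log line with the outcome and duration of a task run."""
    started = time.monotonic()
    record = {"task": task_name, "status": "success"}
    try:
        # The caller may add its own fields (for example, a processed count).
        yield record
    except Exception as exc:
        record["status"] = "failure"
        record["error"] = repr(exc)
        raise  # let the task runner see the failure and decide on a retry
    finally:
        record["duration_s"] = round(time.monotonic() - started, 3)
        log.info(json.dumps(record, sort_keys=True))
```

A task body would then read `with log_task_run("clearsessions") as record: ...`. The recorded duration also gives a simple basis for comparing a migrated task with its original management command.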

Testing Strategy

Each sub-issue will include:

  • Unit tests for individual task functions
  • Integration tests for Celery task execution
  • Schedule verification tests
  • Error handling and retry logic tests
  • Performance comparison with original management commands

Timeline

Sub-issues can be worked on in parallel once Phase 1 infrastructure is complete. Recommended order:

  1. #1830 - Migrate run_court_session_parser to Celery Task (most complex, highest priority)
  2. #1831 - Migrate send_event_reminders to Celery Task (daily operation)
  3. #1832 - Migrate send_old_cases_reminder to Celery Task (monthly operation)
  4. #1833 - Migrate Django clearsessions to Celery Task (simple Django utility)

After completion: Phase 3 (#1834) will handle legacy cron system removal.

Scope Clarification

This Phase 2 includes:

  • Converting management commands to Celery tasks
  • Setting up Celery beat schedules
  • Adding error handling and monitoring
  • Testing parallel execution with cron

This Phase 2 does NOT include:

  • Removal of the legacy cron system (handled in Phase 3, #1834)
