Migrate Django clearsessions to Celery Task
Overview
Migrate the Django built-in clearsessions management command to a Celery task for consistent background job management. This task removes expired sessions from the database to prevent session table bloat.
Current Implementation Analysis
Command Location
- Command: Django built-in clearsessions management command
- Current Schedule: Daily ("@daily")
- Execution: Via cron with file locking (.contrib/docker/cron/run_locked.sh)
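For context on what the migration has to preserve: file locking keeps two cleanup runs from overlapping. The contents of run_locked.sh are not shown in this issue, so the following is only a sketch of that kind of guard, assuming a Unix host. The names single_instance and AlreadyRunning are hypothetical.

```python
import fcntl
import os
from contextlib import contextmanager


class AlreadyRunning(RuntimeError):
    """Raised when another holder already owns the lock file (hypothetical name)."""


@contextmanager
def single_instance(lock_path):
    # Open (or create) the lock file; the lock is tied to this descriptor.
    fd = os.open(lock_path, os.O_CREAT | os.O_RDWR, 0o644)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError as exc:
            raise AlreadyRunning(lock_path) from exc
        yield
    finally:
        # Closing the descriptor also releases the lock.
        os.close(fd)
```

A Celery task has no such guard by default, so an equivalent mechanism may be needed if overlapping runs are a concern.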
Task Complexity
- Purpose: Remove expired Django sessions from the database
- Operations: Database cleanup, session expiry evaluation
- Dependencies: Django session framework, database backend
- Runtime: Low-Medium (depends on session table size)
- Failure Points: Database locks, large session tables, concurrent access
Implementation Tasks
1. Create Celery Task Wrapper
- Create or update poradnia/core/tasks.py (or appropriate utility app)
- Create Celery task that wraps Django's clearsessions command
- Maintain existing cleanup functionality
- Add Celery-specific error handling and monitoring
2. Error Handling & Database Safety
- Implement retry logic for database lock conflicts (max 3 retries)
- Add specific exception handling for:
  - Database connection errors
  - Table lock timeouts
  - Large transaction issues
  - Concurrent access conflicts
- Safe handling of large session deletion batches
- Transaction management for cleanup operations
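The bounded retry policy described above can be sketched in plain Python. This is an illustration under assumptions, not the project's implementation: sqlite3.OperationalError stands in for Django's django.db.OperationalError, and run_with_retries is a hypothetical helper. Inside a Celery task the same policy would normally be expressed through the task's max_retries option and self.retry(countdown=...).

```python
import sqlite3
import time


def run_with_retries(operation, max_retries=3, base_delay=1.0,
                     retryable=(sqlite3.OperationalError,), sleep=time.sleep):
    """Call operation(); retry retryable errors with exponential backoff."""
    attempt = 0
    while True:
        try:
            return operation()
        except retryable:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before trying again.
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With max_retries=3 the operation runs at most four times: the first attempt plus three retries. Errors outside the retryable tuple propagate immediately, which keeps programming errors from being retried as if they were lock conflicts.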
3. Logging & Monitoring
- Structured logging with cleanup progress tracking
- Track number of sessions removed
- Log cleanup execution time and performance
- Database table size monitoring (before/after)
- Integration with Celery result backend
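One way to collect the metrics listed above is to wrap the cleanup call in a small measuring helper. The sketch below is an assumption about how this could look: CleanupMetrics and measure_cleanup are hypothetical names, and the counting and cleanup callables are injected so the helper stays independent of Django.

```python
import json
import logging
import time
from dataclasses import asdict, dataclass

logger = logging.getLogger("clearsessions")


@dataclass
class CleanupMetrics:
    initial_count: int
    final_count: int
    duration_seconds: float

    @property
    def sessions_removed(self):
        # Clamp at zero: sessions created during the run can make
        # final_count exceed initial_count.
        return max(self.initial_count - self.final_count, 0)


def measure_cleanup(count_sessions, cleanup):
    """Run cleanup() and log before/after counts plus elapsed time."""
    initial = count_sessions()
    started = time.perf_counter()
    cleanup()
    duration = time.perf_counter() - started
    metrics = CleanupMetrics(initial, count_sessions(), round(duration, 3))
    payload = {**asdict(metrics), "sessions_removed": metrics.sessions_removed}
    # One JSON object per run keeps the log line machine-readable.
    logger.info("session_cleanup %s", json.dumps(payload, sort_keys=True))
    return payload
```

Returning a plain dict means the value can be stored by the Celery result backend as the task result without extra serialization work.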
4. Performance Optimization
- Batch deletion for large session tables
- Monitor database performance impact
- Optimize deletion queries for efficiency
- Handle potential deadlocks gracefully
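Batch deletion can be illustrated with a standalone sketch. For the database backend, Django's clearsessions removes expired rows with a single delete, so batching would mean replacing the command rather than wrapping it. The example uses sqlite3 with Django's default django_session table layout. The function name and the batch size are assumptions.

```python
import sqlite3


def delete_expired_in_batches(conn, now, batch_size=1000):
    """Delete rows with expire_date < now, one short transaction per batch."""
    total = 0
    while True:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "DELETE FROM django_session WHERE session_key IN ("
                " SELECT session_key FROM django_session"
                " WHERE expire_date < ? LIMIT ?)",
                (now, batch_size),
            )
        deleted = cur.rowcount
        total += deleted
        # A short batch means no expired rows are left.
        if deleted < batch_size:
            return total
```

Committing after each batch keeps locks short-lived, at the cost of the cleanup no longer being a single atomic operation. With the Django ORM the same loop would filter on expire_date__lt and delete by primary key.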
5. Scheduling Configuration
- Configure Celery beat periodic task (daily)
- Use database-backed scheduling (django-celery-beat)
- Allow runtime schedule modifications
- Choose optimal daily execution time (low-traffic period)
Files to Modify/Create
New/Updated Files
- poradnia/core/tasks.py - Celery task implementation (create app if needed)
- poradnia/core/__init__.py - Create core app if needed
- poradnia/core/apps.py - Django app configuration if needed
Modified Files
- poradnia/settings/base.py - Add clearsessions to the Celery beat schedule, and add the core app to INSTALLED_APPS if it is created
- docs/celery.rst - Documentation updates
Configuration
# poradnia/settings/base.py - Add to CELERY_BEAT_SCHEDULE
'clearsessions': {
    'task': 'poradnia.core.tasks.clearsessions',
    'schedule': crontab(hour=3, minute=0),  # Daily at 03:00 (low traffic)
},
Dependencies
- BLOCKED BY: #1828 - Phase 1: Celery Infrastructure Setup
- REQUIRES: Redis service, Celery worker service, database access
- PARENT: #1829 - Phase 2: Background Task Migration to Celery (Umbrella Issue)
This issue cannot begin until the Celery infrastructure from #1828 is fully operational.
Related Issues
- Infrastructure: #1828 - Phase 1: Celery Infrastructure Setup (must be completed first)
- Umbrella: #1829 - Phase 2: Background Task Migration to Celery (tracks overall progress)
- Parallel Tasks:
  - #1830 - Migrate run_court_session_parser to Celery Task
  - #1831 - Migrate send_event_reminders to Celery Task
  - #1832 - Migrate send_old_cases_reminder to Celery Task
This issue can be developed in parallel with the other task migrations once the infrastructure is ready. It is the lowest priority of the migration tasks.
Testing Requirements
Unit Tests
- Test task execution with mock session data
- Test cleanup logic with various session states
- Test error handling for database issues
- Test retry logic for lock conflicts
- Test performance with large session datasets
Integration Tests
- Test full Celery task execution
- Test daily scheduling via Celery beat
- Test database cleanup with real session data
- Test concurrent execution handling
Performance Tests
- Test cleanup performance with large session tables
- Test database impact during cleanup operations
- Test memory usage during batch deletions
- Compare performance with original management command
Implementation Example Structure
# poradnia/core/tasks.py
from celery import shared_task
from celery.utils.log import get_task_logger
from django.core.management import call_command
from django.core.management.base import CommandError
from django.db import DatabaseError, transaction

logger = get_task_logger(__name__)


# max_retries and default_retry_delay apply to the explicit self.retry() calls
# below, so every failure path gets at most 3 retries, 600 seconds apart.
@shared_task(bind=True, max_retries=3, default_retry_delay=600)
def clearsessions(self):
    """
    Clear expired Django sessions from the database.

    Wrapper around Django's built-in clearsessions command with Celery integration.
    """
    try:
        logger.info("Starting Django session cleanup task")
        # Get session count before cleanup (for reporting)
        from django.contrib.sessions.models import Session
        initial_count = Session.objects.count()
        # Execute Django's clearsessions command
        with transaction.atomic():
            call_command('clearsessions', verbosity=0)
        # Get session count after cleanup
        final_count = Session.objects.count()
        # Approximate: sessions created during the run are included in final_count
        sessions_removed = initial_count - final_count
        result = {
            "status": "completed",
            "sessions_removed": sessions_removed,
            "initial_count": initial_count,
            "final_count": final_count,
        }
        logger.info("Session cleanup completed: %s", result)
        return result
    except CommandError as cmd_exc:
        logger.error("Django clearsessions command failed: %s", cmd_exc)
        raise self.retry(exc=cmd_exc)
    except DatabaseError as db_exc:
        logger.error("Database error during session cleanup: %s", db_exc)
        raise self.retry(exc=db_exc)
    except Exception as exc:
        logger.error("Session cleanup task failed: %s", exc)
        raise self.retry(exc=exc)
Acceptance Criteria
- Celery task successfully clears expired Django sessions
- Task runs on daily schedule (optimized timing)
- All current cleanup functionality is preserved
- Error handling improves reliability over cron system
- Task execution can be monitored through Celery
- Database performance impact is minimal
- Session cleanup metrics are tracked and logged
- Task handles large session tables efficiently
Django Session Considerations
- Respect Django session backend configuration
- Handle different session backends (database, cached, file)
- Maintain session expiry logic accuracy
- Consider session security implications
- Handle session table locks gracefully
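Backend handling matters because cleanup only does work for backends that persist sessions. For the cache and signed-cookie backends, clear_expired() is a no-op, and row counts from the django_session table only make sense for the db and cached_db backends. A minimal sketch of that decision follows. cleanup_plan is a hypothetical helper, and its argument would come from settings.SESSION_ENGINE.

```python
# Built-in backends where clear_expired() does nothing.
NOOP_ENGINES = {
    "django.contrib.sessions.backends.cache",
    "django.contrib.sessions.backends.signed_cookies",
}
# Built-in backends that keep rows in the django_session table.
DB_TABLE_ENGINES = {
    "django.contrib.sessions.backends.db",
    "django.contrib.sessions.backends.cached_db",
}


def cleanup_plan(session_engine):
    """Decide whether to run cleanup and whether table counts are meaningful."""
    return {
        # Unknown or custom engines still run cleanup; the command decides.
        "run_cleanup": session_engine not in NOOP_ENGINES,
        "count_db_rows": session_engine in DB_TABLE_ENGINES,
    }
```

A task could use such a check to skip the before/after Session.objects.count() calls when the configured backend does not use the session table.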
App Structure Decision
If creating a new core app for utility tasks:
- Create poradnia/core/ directory structure
- Add to INSTALLED_APPS in Django settings
- Follow existing app patterns in the project
- Document the purpose of the core app
Rollback Plan
- Keep original cron-based clearsessions during transition
- Monitor database performance after migration
- Document rollback procedure for utility tasks
Success Metrics
- Reliability: 100% daily execution success rate
- Performance: Cleanup time comparable to or better than original
- Database Health: Session table size maintained efficiently
- Monitoring: Clear visibility into cleanup operations and results
References
- Django clearsessions documentation: https://docs.djangoproject.com/en/4.2/ref/django-admin/#clearsessions
- Current cron configuration: .contrib/docker/cron/set_crontab.sh
- Django session framework: django.contrib.sessions
- Session models: django.contrib.sessions.models.Session