A realistic simulation of an Australian health insurance company's operational database, designed as a source for data warehouse demonstrations and testing. This project generates time-series data with daily changes that can be tracked using SQL Server's Change Data Capture (CDC).
- Core Insurance Operations
  - Member management and demographics
  - Policy creation and lifecycle management
  - Coverage plan configuration
  - Claims processing and assessment
  - Premium payment tracking
  - Provider network management
- Data Generation
  - Dynamic patient data generation with realistic demographics
  - Age distributions matching population demographics
  - Life stages with address and name changes over time
  - Data variants to simulate errors and changes
- Australian-Specific Elements
  - Hospital cover tiers (Basic, Bronze, Silver, Gold)
  - Private Health Insurance (PHI) rebate tiers
  - Lifetime Health Cover (LHC) loading
  - Medicare Benefits Schedule (MBS) integration
  - Australian states and postcodes
- Technical Features
  - SQL Server Change Data Capture (CDC) for tracking changes
  - Synthea FHIR patient data integration
  - PyODBC database connectivity
  - Comprehensive test suite
  - Cross-platform support (Linux, macOS, Windows)
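To make the Lifetime Health Cover (LHC) loading item above concrete, here is a minimal sketch of the general rule: 2% is added to the hospital premium for each year a person is aged over 30 when they first take out hospital cover, capped at 70%. The function names are hypothetical and this is not necessarily how the project computes the loading; it also ignores details such as permitted days of absence and removal of the loading after ten years of continuous cover.

```python
def lhc_loading_percent(age_at_entry: int) -> int:
    """Return the LHC loading in percent for a given age at entry.

    Simplified rule: 2% for each year over 30, capped at 70%.
    """
    if age_at_entry < 0:
        raise ValueError("age_at_entry must be non-negative")
    years_over_30 = max(0, age_at_entry - 30)
    return min(70, 2 * years_over_30)


def premium_with_loading(base_premium: float, age_at_entry: int) -> float:
    """Apply the LHC loading to a base hospital premium (hypothetical helper)."""
    loading = lhc_loading_percent(age_at_entry)
    return round(base_premium * (1 + loading / 100), 2)
```

For example, someone first taking out hospital cover at age 40 would attract a 20% loading under this simplified rule.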
- Python 3.8+
- SQL Server instance
- ODBC Driver 17+ for SQL Server
- pyodbc package
- Faker library (for dynamic data generation)
# Clone the repository
git clone https://github.com/yourusername/health-insurance-au.git
cd health-insurance-au
# Install dependencies
pip install -e .
Create a configuration file with your database credentials:
cp config/db_config.env.example config/db_config.env
# Edit config/db_config.env with your database credentials
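As a rough illustration of how such a file could be turned into a pyodbc connection string, the sketch below parses KEY=VALUE lines and assembles an ODBC connection string. The key names (DB_SERVER, DB_DATABASE, DB_USERNAME, DB_PASSWORD, DB_DRIVER) are assumptions for illustration only; check config/db_config.env.example for the names the project actually uses.

```python
from typing import Dict


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop optional surrounding quotes from the value.
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def build_connection_string(settings: Dict[str, str]) -> str:
    """Build an ODBC connection string from parsed settings.

    The key names used here are hypothetical, not taken from the project.
    """
    driver = settings.get("DB_DRIVER", "ODBC Driver 17 for SQL Server")
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER={settings['DB_SERVER']};"
        f"DATABASE={settings['DB_DATABASE']};"
        f"UID={settings['DB_USERNAME']};"
        f"PWD={settings['DB_PASSWORD']}"
    )
```

The resulting string can be passed to pyodbc.connect() once the ODBC driver is installed.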
# Linux/macOS: initialize the database schema
./bin/initialize_db.sh
# Linux/macOS: add initial reference data
./bin/add_initial_data.sh
# Windows: initialize the database schema
bin\initialize_db.bat
# Windows: add initial reference data
bin\add_initial_data.bat
Run a realistic simulation with dynamic data generation:
./bin/run_realistic_simulation.sh --start-date 2023-01-01 --end-date 2023-01-31 --members-per-day 10
bin\run_realistic_simulation.bat --start-date 2023-01-01 --end-date 2023-01-31 --members-per-day 10
Option | Description |
---|---|
--start-date | Start date for the simulation (YYYY-MM-DD) |
--end-date | End date for the simulation (YYYY-MM-DD) |
--members-per-day | Base number of new members per day |
--log-level | Logging level (DEBUG, INFO, WARNING, ERROR) |
--reset-members | Reset the list of used member IDs |
--use-static-data | Use static data from JSON file instead of dynamic generation |
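For readers who want to see how these options fit together, the following argparse sketch mirrors the table above. It is not the project's actual parser: whether the date options are required, and the default values shown, are assumptions.

```python
import argparse
from datetime import date, datetime


def parse_date(value: str) -> date:
    """Convert a YYYY-MM-DD string into a date, with a clear error message."""
    try:
        return datetime.strptime(value, "%Y-%m-%d").date()
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected YYYY-MM-DD, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the options listed in the table (defaults assumed)."""
    parser = argparse.ArgumentParser(description="Run the realistic simulation")
    parser.add_argument("--start-date", type=parse_date, required=True)
    parser.add_argument("--end-date", type=parse_date, required=True)
    parser.add_argument("--members-per-day", type=int, default=10)
    parser.add_argument(
        "--log-level",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        default="INFO",
    )
    parser.add_argument("--reset-members", action="store_true")
    parser.add_argument("--use-static-data", action="store_true")
    return parser
```

argparse exposes each option as an attribute with dashes replaced by underscores, so --members-per-day becomes args.members_per_day.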
The simulation generates data with realistic patterns:
- Fewer members join on weekends
- More claims at the beginning/end of the month
- Business hours for transactions (8 AM to 5 PM)
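The patterns above could be modelled along the following lines. This is a sketch under stated assumptions, not the project's simulation logic: the weekend factor of 0.3, the 1.5 claim weight, and the five-day window at each end of the month are all hypothetical values chosen for illustration.

```python
import random
from datetime import date, datetime, time, timedelta

WEEKEND_FACTOR = 0.3  # hypothetical: weekends get 30% of the weekday volume


def new_members_for_day(day: date, base_members: int) -> int:
    """Scale the base member count down on Saturdays and Sundays."""
    if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return max(0, round(base_members * WEEKEND_FACTOR))
    return base_members


def claim_weight(day: date) -> float:
    """Weight claims more heavily in the first and last five days of a month."""
    next_month = (day.replace(day=28) + timedelta(days=4)).replace(day=1)
    days_in_month = (next_month - timedelta(days=1)).day
    near_edge = day.day <= 5 or day.day > days_in_month - 5
    return 1.5 if near_edge else 1.0


def business_hours_timestamp(day: date, rng: random.Random) -> datetime:
    """Pick a random time from 08:00:00 up to 16:59:59 on the given day."""
    seconds = rng.randrange(9 * 60 * 60)  # nine hours between 8 AM and 5 PM
    return datetime.combine(day, time(8, 0)) + timedelta(seconds=seconds)
```

Passing a seeded random.Random instance keeps generated timestamps reproducible between runs, which is useful when comparing warehouse loads.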
By default, the simulation uses dynamic data generation to create realistic patient profiles:
# Linux/macOS: use dynamic data generation (default)
./bin/run_realistic_simulation.sh --start-date 2023-01-01 --end-date 2023-01-31
# Linux/macOS: use static data from JSON file
./bin/run_realistic_simulation.sh --start-date 2023-01-01 --end-date 2023-01-31 --use-static-data
# Windows: use dynamic data generation (default)
bin\run_realistic_simulation.bat --start-date 2023-01-01 --end-date 2023-01-31
# Windows: use static data from JSON file
bin\run_realistic_simulation.bat --start-date 2023-01-01 --end-date 2023-01-31 --use-static-data
The database is organized into the following schemas:
Core operational tables for the insurance business:
- Members - Personal information, contact details, Medicare numbers
- CoveragePlans - Plan details, benefits, premiums, waiting periods
- Policies - Policy details, status, coverage type, excess amounts
- PolicyMembers - Relationship between policies and members
- Claims - Claim details, status, payment information
- Providers - Provider information, specialties, agreement status
- PremiumPayments - Payment tracking, due dates, payment status
Tables related to Australian health insurance regulations:
- PHIRebateTiers - Private Health Insurance rebate tiers and rates
- MBSItems - Medicare Benefits Schedule items and rebates
Tables for Synthea FHIR data integration:
- SyntheaPatients - Patient data from Synthea
- SyntheaEncounters - Encounter data from Synthea
- SyntheaProcedures - Procedure data from Synthea
This project uses SQL Server's Change Data Capture (CDC) feature to track changes to the data over time.
# Linux/macOS: enable CDC on the database and tables
./bin/enable_cdc.sh
# Windows: enable CDC on the database and tables
bin\enable_cdc.bat
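What the enable scripts do internally is not shown here, but enabling CDC in SQL Server generally comes down to two system stored procedures: sys.sp_cdc_enable_db for the database and sys.sp_cdc_enable_table for each tracked table. The sketch below only builds the T-SQL statements; the helper names are hypothetical, and the table list you pass in is up to you.

```python
import re
from typing import List, Tuple

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name: str) -> str:
    """Reject anything that is not a plain SQL identifier."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"invalid identifier: {name!r}")
    return name


def enable_cdc_statements(tables: List[Tuple[str, str]]) -> List[str]:
    """Return T-SQL that enables CDC on the database, then on each table."""
    statements = ["EXEC sys.sp_cdc_enable_db;"]
    for schema, table in tables:
        _check_identifier(schema)
        _check_identifier(table)
        statements.append(
            "EXEC sys.sp_cdc_enable_table "
            f"@source_schema = N'{schema}', "
            f"@source_name = N'{table}', "
            "@role_name = NULL;"
        )
    return statements
```

Each statement can be run through a pyodbc cursor with cursor.execute(). Enabling CDC on a database requires elevated permissions, and SQL Server Agent must be running for the capture job to populate the change tables.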
# Monitor changes to a specific table for the last 24 hours
./bin/monitor_cdc.sh --schema Insurance --table Members --hours 24
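A monitoring command like the one above can be approximated with SQL Server's CDC query functions. The sketch below builds a T-SQL batch that maps a time window to log sequence numbers (LSNs) and reads all changes for one table. It assumes the default capture instance name (schema and table joined by an underscore); the function name is hypothetical and this is not necessarily the query monitor_cdc.sh runs.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def cdc_changes_query(schema: str, table: str, hours: int) -> str:
    """Build T-SQL returning all CDC changes from the last `hours` hours."""
    for name in (schema, table):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"invalid identifier: {name!r}")
    if hours <= 0:
        raise ValueError("hours must be positive")
    # Default capture instance name is <schema>_<table> (assumed here).
    capture_instance = f"{schema}_{table}"
    return (
        "DECLARE @from_lsn binary(10) = sys.fn_cdc_map_time_to_lsn("
        f"'smallest greater than or equal', DATEADD(hour, -{int(hours)}, GETDATE()));\n"
        "DECLARE @to_lsn binary(10) = sys.fn_cdc_get_max_lsn();\n"
        f"SELECT * FROM cdc.fn_cdc_get_all_changes_{capture_instance}"
        "(@from_lsn, @to_lsn, N'all');"
    )
```

In the result set, the __$operation column identifies the kind of change: 1 for delete, 2 for insert, and 4 for the after-image of an update. The query can fail if the requested window falls outside the range of LSNs the capture instance holds, so a production version would want to check for that first.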
Run the test suite:
# Linux/macOS: run all tests
./bin/run_tests.sh
# Linux/macOS: run tests with coverage report
./bin/run_tests.sh coverage
# Windows: run all tests
bin\run_tests.bat
# Windows: run tests with coverage report
bin\run_tests.bat coverage
Detailed documentation is available in the docs/ directory.
health_insurance_au/                     # Main Python package
├── api/                                 # API endpoints
├── cli/                                 # Command-line interfaces
├── models/                              # Data models
├── simulation/                          # Simulation modules
│   └── simulation.py                    # Core simulation logic
├── utils/                               # Utility functions
│   ├── data_generation/                 # Dynamic patient data generation
│   │   └── generate_data.py             # Core data generation script
│   ├── data_loader.py                   # Load data from static files
│   └── dynamic_data_generator.py        # Dynamic data integration
├── integration/                         # External system integration
└── config.py                            # Configuration settings
scripts/                                 # Standalone scripts
├── db/                                  # Database scripts
└── simulation/                          # Simulation scripts
    └── realistic_simulation.py          # Realistic simulation script
bin/                                     # Scripts for running operations
├── initialize_db.sh/.bat                # Database initialization
├── add_initial_data.sh/.bat             # Add initial data
├── run_realistic_simulation.sh/.bat     # Run realistic simulation
├── enable_cdc.sh/.bat                   # Enable CDC
└── run_tests.sh/.bat                    # Run tests
config/                                  # Configuration files
docs/                                    # Documentation
data/                                    # Data files
tests/                                   # Test suite
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Project Maintainer - Mehdi Modarressi
Project Link: https://github.com/Mmodarre/AusHealthSim