
A comprehensive simulation system that generates realistic health insurance data for the Australian market

Australian Health Insurance Simulation

License: MIT | Python 3.8+ | SQL Server

A realistic simulation of an Australian health insurance company's operational database, designed as a source for data warehouse demonstrations and testing. This project generates time-series data with daily changes that can be tracked using SQL Server's Change Data Capture (CDC).

📋 Features

  • Core Insurance Operations
    • Member management and demographics
    • Policy creation and lifecycle management
    • Coverage plan configuration
    • Claims processing and assessment
    • Premium payment tracking
    • Provider network management
  • Data Generation
    • Dynamic patient data generation with realistic demographics
    • Age distributions matching population demographics
    • Life stages with address and name changes over time
    • Data variants to simulate errors and changes
  • Australian-Specific Elements
    • Hospital cover tiers (Basic, Bronze, Silver, Gold)
    • Private Health Insurance (PHI) rebate tiers
    • Lifetime Health Cover (LHC) loading
    • Medicare Benefits Schedule (MBS) integration
    • Australian states and postcodes
  • Technical Features
    • SQL Server Change Data Capture (CDC) for tracking changes
    • Synthea FHIR patient data integration
    • PyODBC database connectivity
    • Comprehensive test suite
    • Cross-platform support (Linux, macOS, Windows)
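
As background for the Lifetime Health Cover item above: under the general public LHC rule, hospital cover premiums attract a 2% loading for each year a person is over 30 when they first take out cover, capped at 70%. The sketch below illustrates that arithmetic only. The function names are hypothetical and are not taken from this project's code, which may model the loading differently.

```python
def lhc_loading_percent(age_at_entry: int) -> int:
    """Return the LHC loading in percent for the age at which cover was first taken out.

    Assumption: the general rule of 2% per year over 30, capped at 70%.
    """
    if age_at_entry <= 30:
        return 0
    return min(2 * (age_at_entry - 30), 70)


def premium_with_loading(base_premium: float, age_at_entry: int) -> float:
    """Apply the loading to a hospital cover premium (hypothetical helper)."""
    loading = lhc_loading_percent(age_at_entry)
    return round(base_premium * (1 + loading / 100), 2)


if __name__ == "__main__":
    for age in (28, 35, 45, 70):
        print(age, lhc_loading_percent(age), premium_with_loading(200.0, age))
```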

🚀 Getting Started

Prerequisites

  • Python 3.8+
  • SQL Server instance
  • ODBC Driver 17+ for SQL Server
  • pyodbc package
  • Faker library (for dynamic data generation)

Installation

# Clone the repository
git clone https://github.com/Mmodarre/AusHealthSim.git
cd AusHealthSim

# Install dependencies
pip install -e .

Database Configuration

Create a configuration file with your database credentials:

cp config/db_config.env.example config/db_config.env
# Edit config/db_config.env with your database credentials
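
A minimal sketch of how a `KEY=VALUE` file such as `config/db_config.env` could be turned into an ODBC connection string. The key names (`DB_SERVER`, `DB_NAME`, `DB_USER`, `DB_PASSWORD`) are assumptions for illustration; check `config/db_config.env.example` for the names the project actually uses.

```python
from pathlib import Path
from typing import Dict


def load_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in Path(path).read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        settings[key.strip()] = value.strip().strip('"').strip("'")
    return settings


def build_connection_string(settings: Dict[str, str]) -> str:
    """Build a SQL Server ODBC connection string (key names are hypothetical)."""
    return (
        "DRIVER={ODBC Driver 17 for SQL Server};"
        f"SERVER={settings['DB_SERVER']};"
        f"DATABASE={settings['DB_NAME']};"
        f"UID={settings['DB_USER']};"
        f"PWD={settings['DB_PASSWORD']};"
    )
```

The resulting string can be passed to `pyodbc.connect(...)`. If a newer driver is installed, adjust the `DRIVER` value to match.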

Initialize the Database

On Linux/macOS:

# Initialize the database schema
./bin/initialize_db.sh

# Add initial reference data
./bin/add_initial_data.sh

On Windows:

# Initialize the database schema
bin\initialize_db.bat

# Add initial reference data
bin\add_initial_data.bat

πŸƒβ€β™‚οΈ Running Simulations

Quick Start

Run a realistic simulation with dynamic data generation:

On Linux/macOS:

./bin/run_realistic_simulation.sh --start-date 2023-01-01 --end-date 2023-01-31 --members-per-day 10

On Windows:

bin\run_realistic_simulation.bat --start-date 2023-01-01 --end-date 2023-01-31 --members-per-day 10

Simulation Options

| Option | Description |
|---|---|
| --start-date | Start date for the simulation (YYYY-MM-DD) |
| --end-date | End date for the simulation (YYYY-MM-DD) |
| --members-per-day | Base number of new members per day |
| --log-level | Logging level (DEBUG, INFO, WARNING, ERROR) |
| --reset-members | Reset the list of used member IDs |
| --use-static-data | Use static data from JSON file instead of dynamic generation |
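
For readers who want to see how these options might map onto a command-line parser, here is a sketch using `argparse`. It mirrors the option names listed above. The defaults and which options are required are assumptions, not taken from the project's script.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="Run the realistic simulation")
    # date.fromisoformat accepts YYYY-MM-DD and rejects malformed dates
    parser.add_argument("--start-date", type=date.fromisoformat, required=True)
    parser.add_argument("--end-date", type=date.fromisoformat, required=True)
    parser.add_argument("--members-per-day", type=int, default=10)
    parser.add_argument(
        "--log-level",
        choices=["DEBUG", "INFO", "WARNING", "ERROR"],
        default="INFO",
    )
    parser.add_argument("--reset-members", action="store_true")
    parser.add_argument("--use-static-data", action="store_true")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.end_date < args.start_date:
        raise SystemExit("--end-date must not be earlier than --start-date")
    print(args)
```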

Data Generation

The simulation generates data with realistic patterns:

  • Fewer members join on weekends than on weekdays
  • More claims are lodged at the beginning and end of each month
  • Transactions are timestamped within business hours (8 AM to 5 PM)
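
The patterns listed above could be sketched roughly as follows. The weighting factors (`weekend_factor`, `edge_days`, `boost`) and the function names are illustrative assumptions; the project's actual values and logic may differ.

```python
import calendar
import random
from datetime import date, datetime, time, timedelta


def members_for_day(day: date, base: int, weekend_factor: float = 0.3) -> int:
    """Scale the base member count down on Saturdays and Sundays."""
    if day.weekday() >= 5:  # 5 = Saturday, 6 = Sunday
        return max(0, round(base * weekend_factor))
    return base


def claim_multiplier(day: date, edge_days: int = 3, boost: float = 1.5) -> float:
    """Return a higher claim multiplier in the first and last days of a month."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    if day.day <= edge_days or day.day > last_day - edge_days:
        return boost
    return 1.0


def business_hours_timestamp(day: date, rng: random.Random) -> datetime:
    """Pick a random time from 08:00:00 up to 16:59:59 on the given day."""
    offset = rng.randrange(9 * 3600)  # seconds within the 9-hour window
    return datetime.combine(day, time(8, 0)) + timedelta(seconds=offset)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded for repeatable output
    for d in (date(2023, 1, 2), date(2023, 1, 7), date(2023, 1, 31)):
        print(d, members_for_day(d, 10), claim_multiplier(d),
              business_hours_timestamp(d, rng))
```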

Dynamic vs Static Data

By default, the simulation uses dynamic data generation to create realistic patient profiles:

On Linux/macOS:

# Use dynamic data generation (default)
./bin/run_realistic_simulation.sh --start-date 2023-01-01 --end-date 2023-01-31

# Use static data from JSON file
./bin/run_realistic_simulation.sh --start-date 2023-01-01 --end-date 2023-01-31 --use-static-data

On Windows:

# Use dynamic data generation (default)
bin\run_realistic_simulation.bat --start-date 2023-01-01 --end-date 2023-01-31

# Use static data from JSON file
bin\run_realistic_simulation.bat --start-date 2023-01-01 --end-date 2023-01-31 --use-static-data

📊 Database Structure

The database is organized into the following schemas:

Insurance Schema

Core operational tables for the insurance business:

  • Members - Personal information, contact details, Medicare numbers
  • CoveragePlans - Plan details, benefits, premiums, waiting periods
  • Policies - Policy details, status, coverage type, excess amounts
  • PolicyMembers - Relationship between policies and members
  • Claims - Claim details, status, payment information
  • Providers - Provider information, specialties, agreement status
  • PremiumPayments - Payment tracking, due dates, payment status

Regulatory Schema

Tables related to Australian health insurance regulations:

  • PHIRebateTiers - Private Health Insurance rebate tiers and rates
  • MBSItems - Medicare Benefits Schedule items and rebates

Integration Schema

Tables for Synthea FHIR data integration:

  • SyntheaPatients - Patient data from Synthea
  • SyntheaEncounters - Encounter data from Synthea
  • SyntheaProcedures - Procedure data from Synthea

📈 Change Data Capture (CDC)

This project uses SQL Server's Change Data Capture (CDC) feature to track changes to the data over time.

Enable CDC

On Linux/macOS:

# Enable CDC on the database and tables
./bin/enable_cdc.sh

On Windows:

# Enable CDC on the database and tables
bin\enable_cdc.bat
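
For context, enabling CDC in SQL Server generally comes down to calling `sys.sp_cdc_enable_db` once per database and `sys.sp_cdc_enable_table` once per tracked table. The sketch below only builds those T-SQL statements. What the project's script actually runs is not shown here, and the helper name and the table list in the usage example are assumptions.

```python
from typing import Iterable, List, Tuple


def cdc_enable_statements(tables: Iterable[Tuple[str, str]]) -> List[str]:
    """Return T-SQL statements that enable CDC for the current database and tables.

    Assumes the connection is already pointed at the target database.
    """
    statements = ["EXEC sys.sp_cdc_enable_db;"]
    for schema, table in tables:
        # Guard against injection, since the names are interpolated into SQL text
        if not (schema.isidentifier() and table.isidentifier()):
            raise ValueError(f"Unexpected identifier: {schema}.{table}")
        statements.append(
            "EXEC sys.sp_cdc_enable_table "
            f"@source_schema = N'{schema}', "
            f"@source_name = N'{table}', "
            "@role_name = NULL;"  # NULL: no gating role for reading change data
        )
    return statements


if __name__ == "__main__":
    for stmt in cdc_enable_statements([("Insurance", "Members"), ("Insurance", "Claims")]):
        print(stmt)
```

Each statement can be executed through a `pyodbc` cursor. Note that enabling CDC at the database level requires sysadmin rights, and the capture jobs depend on SQL Server Agent running.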

Monitor CDC Changes

# Monitor changes to a specific table for the last 24 hours
./bin/monitor_cdc.sh --schema Insurance --table Members --hours 24
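
A rough sketch of the kind of query a monitor like this might issue, using SQL Server's documented CDC functions. It assumes the default capture instance name (`<schema>_<table>`, for example `Insurance_Members`). The helper name is hypothetical and the project's script may work differently.

```python
from datetime import datetime, timedelta


def build_cdc_query(schema: str, table: str, since: datetime) -> str:
    """Build a T-SQL batch returning all CDC changes recorded since `since`.

    Assumes the default capture instance name "<schema>_<table>".
    """
    capture_instance = f"{schema}_{table}"
    # The name is interpolated into SQL text, so restrict it to identifier characters
    if not capture_instance.isidentifier():
        raise ValueError(f"Unexpected capture instance name: {capture_instance}")
    since_literal = since.strftime("%Y-%m-%d %H:%M:%S")
    return (
        "SET NOCOUNT ON;\n"
        "DECLARE @from_lsn binary(10) = sys.fn_cdc_map_time_to_lsn("
        f"'smallest greater than or equal', '{since_literal}');\n"
        "DECLARE @to_lsn binary(10) = sys.fn_cdc_get_max_lsn();\n"
        f"SELECT * FROM cdc.fn_cdc_get_all_changes_{capture_instance}"
        "(@from_lsn, @to_lsn, N'all');"
    )


if __name__ == "__main__":
    # Changes from the last 24 hours; CDC times are in the server's local time
    since = datetime.now() - timedelta(hours=24)
    print(build_cdc_query("Insurance", "Members", since))
```

If no changes were captured in the window, `@from_lsn` can be NULL and the change function raises an error, so a real monitor would need to check for that case first.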

🧪 Testing

Run the test suite:

On Linux/macOS:

# Run all tests
./bin/run_tests.sh

# Run tests with coverage report
./bin/run_tests.sh coverage

On Windows:

# Run all tests
bin\run_tests.bat

# Run tests with coverage report
bin\run_tests.bat coverage

📚 Documentation

Detailed documentation is available in the docs/ directory.

πŸ“ Project Structure

health_insurance_au/          # Main Python package
├── api/                      # API endpoints
├── cli/                      # Command-line interfaces
├── models/                   # Data models
├── simulation/               # Simulation modules
│   └── simulation.py         # Core simulation logic
├── utils/                    # Utility functions
│   ├── data_generation/      # Dynamic patient data generation
│   │   └── generate_data.py  # Core data generation script
│   ├── data_loader.py        # Load data from static files
│   └── dynamic_data_generator.py  # Dynamic data integration
├── integration/              # External system integration
└── config.py                 # Configuration settings

scripts/                      # Standalone scripts
├── db/                       # Database scripts
└── simulation/               # Simulation scripts
    └── realistic_simulation.py  # Realistic simulation script

bin/                          # Scripts for running operations
├── initialize_db.sh/.bat     # Database initialization
├── add_initial_data.sh/.bat  # Add initial data
├── run_realistic_simulation.sh/.bat  # Run realistic simulation
├── enable_cdc.sh/.bat        # Enable CDC
└── run_tests.sh/.bat         # Run tests

config/                       # Configuration files
docs/                         # Documentation
data/                         # Data files
tests/                        # Test suite

🤝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

📧 Contact

Project Maintainer - Mehdi Modarressi

Project Link: https://github.com/Mmodarre/AusHealthSim
