An interactive web-based visualizer that demonstrates how Merkle trees (hash trees) are used in database replication to efficiently detect and sync changes between a source database and a replica, without transferring the entire dataset.
This visualizer simulates the complete database replication process using Merkle trees:
- Building Merkle Trees: Creates binary hash trees for both source and replica databases
- Efficient Comparison: Compares root hashes to detect differences
- Drill-Down Analysis: Traverses only mismatched branches to identify specific changes
- Selective Synchronization: Syncs only the changed data blocks, demonstrating bandwidth efficiency
- Interactive Data Management: Add, edit, and remove data entries in both source and replica databases
- Dynamic Tree Visualization: Real-time rendering of Merkle trees using D3.js
- Step-by-Step Comparison: Visual comparison process with detailed logging
- Efficient Sync: Only transfers changed data blocks, not entire datasets
- Performance Statistics: Shows bandwidth savings and efficiency metrics
- Side-by-Side Layout: Source and replica trees displayed simultaneously
- Zoom and Pan Controls: Interactive navigation for large trees
- Real-time Updates: Immediate visual feedback for all operations
- Comprehensive Logging: Detailed operation log with timestamps
- Error Handling: Robust error handling with user-friendly messages
- Content-Based Hashing: SHA-256 hashing of data values (not IDs)
- Tree Balancing: Handles unbalanced trees with padding nodes
- Change Detection: Identifies modifications, additions, and deletions
- Visual Highlighting: Highlights nodes during comparison and sync operations
- Modern web browser (Chrome, Firefox, Safari, Edge)
- No additional software installation required
- Clone or download the repository
- Navigate to the project directory
- Start a local HTTP server:
python3 -m http.server 8000
- Open your browser and go to
http://localhost:8000
- The application starts with empty databases
- Click "Generate Sample Data" to create identical test data in both databases
- This creates 4 sample records with different values
- Click "Create Differences" to simulate changes in the source database
- This modifies one existing record and adds one new record
- The replica database remains unchanged
- Click "Build Trees" to generate Merkle trees for both databases
- Trees are rendered side-by-side with zoom and pan controls
- Hash information is displayed in the operation log
- Click "Compare Roots" to check if the databases are in sync
- If hashes differ, the system identifies that synchronization is needed
- Root nodes are highlighted to show the comparison
- Click "Drill Down" to traverse mismatched branches
- The system identifies specific nodes where differences occur
- Mismatched nodes are highlighted in both trees
- Click "Identify Changes" to determine the specific data changes
- The system categorizes changes as:
- Modified: Same ID, different value
- Added: New record in source
- Deleted: Record removed from source
- Click "Sync Changes" to apply only the identified changes
- Changes are applied one by one with visual feedback
- The replica database is updated to match the source
- Bandwidth efficiency statistics are displayed
- Use the "+" buttons in the database tables to add new rows
- Each new row gets a unique ID and editable value
- Changes are reflected immediately in the data tables
- Click on any value in the data tables to edit it
- Changes are saved automatically
- Modified data affects the Merkle tree hashes
- Use the "×" buttons to remove rows from either database
- Removed data is immediately reflected in the tables
- Start with empty databases
- Add different data to source and replica
- Build trees and observe the differences
- Experiment with various change patterns
- Zoom In/Out: Use the zoom buttons or mouse wheel
- Pan: Click and drag to move around the tree
- Reset Zoom: Click the reset button to return to the initial view
- Green Circles: Leaf nodes (actual data)
- Blue Circles: Internal nodes (hash combinations)
- Red Circles: Root nodes (tree roots)
- Hash Labels: First 8 characters of each node's hash
- Tooltips: Hover over nodes to see full hash and data details
- Leaves: Contain actual database records
- Parents: Hash of concatenated child hashes
- Root: Top-level hash representing entire database state
- Compares the top-level hashes of both trees
- If equal: databases are in sync
- If different: changes exist and need identification
- Traverses tree levels to find mismatched branches
- Only examines branches where differences exist
- Highlights specific nodes causing the mismatch
- Collects all leaf nodes from both trees
- Compares data by ID (not position)
- Categorizes differences accurately
- Modified: Updates existing record value
- Added: Inserts new record
- Deleted: Removes existing record
- Only transfers changed data blocks
- Calculates bandwidth savings percentage
- Demonstrates real-world efficiency gains
- Step-by-step application of changes
- Real-time tree updates
- Progress logging with timestamps
- Frontend: Vanilla JavaScript with ES6+ features
- Visualization: D3.js v7 for tree rendering
- Hashing: CryptoJS SHA-256 implementation
- Styling: CSS3 with responsive design
merkle-db-sync/
├── index.html # Main HTML structure
├── styles.css # CSS styling and layout
├── app.js # JavaScript application logic
└── README.md # This documentation
- Main application class managing all functionality
- Handles data, tree building, comparison, and synchronization
- Provides error handling and logging
- Creates leaf nodes from data records
- Builds tree bottom-up by pairing nodes
- Handles unbalanced trees with padding
- Generates unique hashes for each node
- Compares trees by content, not structure
- Identifies specific data changes
- Provides detailed change categorization
- Enables efficient synchronization
- Comprehensive validation of input data
- Graceful handling of edge cases
- User-friendly error messages
- Robust recovery mechanisms
- Merkle Tree Structure: Understanding binary hash trees
- Database Replication: Efficient change detection and synchronization
- Cryptographic Hashing: Content-based hash generation
- Algorithm Efficiency: Minimizing data transfer in distributed systems
- Distributed Databases: Efficient synchronization between nodes
- Blockchain Technology: Merkle trees in blockchain structures
- Version Control Systems: Efficient diff and merge operations
- Content Delivery Networks: Change detection in distributed content
- Hash Trees: Hierarchical data structures for efficient comparison
- Change Detection: Identifying differences without full data transfer
- Selective Synchronization: Transferring only necessary changes
- Bandwidth Efficiency: Reducing network overhead in distributed systems
- Check browser console for JavaScript errors
- Ensure all files are loaded correctly
- Verify D3.js library is accessible
- Ensure trees are built before comparison
- Check that changes are properly identified
- Verify data structure integrity
- Reduce data size for better performance
- Use zoom controls for large trees
- Clear browser cache if needed
- "Invalid data format": Check data structure and required fields
- "Duplicate IDs found": Ensure unique IDs in each database
- "Container not found": Verify HTML structure integrity
- "Failed to build tree": Check data validity and structure
- Chrome: 60+ (Recommended)
- Firefox: 55+
- Safari: 12+
- Edge: 79+
- Optimal performance with up to 32 data records
- Larger datasets may require longer processing times
- Zoom and pan controls help navigate large trees
- Error handling prevents crashes with invalid data
This is an educational project demonstrating Merkle tree concepts. Feel free to:
- Experiment with different data scenarios
- Modify the visualization styles
- Add new features or improvements
- Report issues or suggest enhancements
This project is under MIT LICENSE