Multi-environment CSV data analysis orchestrator with isolated profiling engines
AutoCSV Profiler Suite is a multi-environment CSV data analysis orchestrator. Data science projects often run into dependency conflicts between profiling engines and statistical libraries. The suite avoids these conflicts by running YData Profiling, SweetViz, DataPrep, and custom statistical analysis in separate conda environments, so that incompatible package versions are never installed side by side, while still exposing a single unified interface.
- Dependency Conflict Resolution: Three specialized conda environments plus a base environment prevent library conflicts
- Multiple Profiling Engines: YData Profiling, SweetViz, DataPrep, and custom statistical analysis
- Memory Management: Chunking for large files with 1GB default limit
- Interface: Console interface with progress tracking and error handling
- Lazy Loading: Engines are loaded only when needed, to improve performance
- Graceful Degradation: Keeps working when only some engines are available
- Cross-Platform Support: Windows, Linux, macOS with conda environment isolation
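The memory-management feature above can be illustrated with a minimal sketch. It is not the suite's actual implementation: the function names `needs_chunking` and `iter_csv_chunks`, the chunk size, and the idea of comparing the on-disk file size against the limit are assumptions made for illustration. Only the 1GB default comes from the feature list.

```python
import csv
import os
from itertools import islice
from typing import Dict, Iterator, List

# 1GB default from the feature list; how the suite applies it is an assumption.
DEFAULT_LIMIT_BYTES = 1024 ** 3


def needs_chunking(path: str, limit_bytes: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True when the file on disk is larger than the limit."""
    return os.path.getsize(path) > limit_bytes


def iter_csv_chunks(
    path: str, chunk_rows: int = 10_000, delimiter: str = ","
) -> Iterator[List[Dict[str, str]]]:
    """Yield lists of at most chunk_rows rows so the whole file is never in memory."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        while True:
            # islice pulls the next chunk_rows rows from the reader without rewinding.
            chunk = list(islice(reader, chunk_rows))
            if not chunk:
                break
            yield chunk
```

A caller would check `needs_chunking(path)` first and, if it returns True, aggregate statistics chunk by chunk instead of loading the file in one pass.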
Architecture and dependency details: Architecture Guide
Demo: interactive demonstration of the complete analysis workflow, from setup to results.
- Anaconda or Miniconda
- Python 3.10 or higher
- At least 3GB free disk space (2GB for conda environments, 1GB for data/outputs)
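The prerequisites above can be checked before installation with a short script. This is a hedged sketch rather than part of the suite: the function name `check_prerequisites` is hypothetical, and "3GB" is interpreted here as 3 * 1024^3 bytes.

```python
import shutil
import sys
from typing import Dict

REQUIRED_PYTHON = (3, 10)              # from the prerequisites list
REQUIRED_FREE_BYTES = 3 * 1024 ** 3    # 3GB, interpreted as GiB (assumption)


def check_prerequisites(path: str = ".") -> Dict[str, bool]:
    """Return a mapping of prerequisite name to whether it is satisfied."""
    free_bytes = shutil.disk_usage(path).free
    return {
        # Anaconda and Miniconda both put a `conda` executable on PATH.
        "conda_on_path": shutil.which("conda") is not None,
        "python_version": sys.version_info[:2] >= REQUIRED_PYTHON,
        "disk_space": free_bytes >= REQUIRED_FREE_BYTES,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

Note that `conda` may be installed but not on PATH in a shell that has not been initialized for conda, so a False result for `conda_on_path` is a hint rather than proof.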
Complete installation instructions: Installation Guide
Quick setup:
# 1. Clone and navigate
git clone https://github.com/dhaneshbb/autocsv-profiler-suite.git
cd autocsv-profiler-suite
# 2. Install requirements and set up environments
pip install -r requirements.txt
python bin/setup_environments.py create --parallel
Run analysis:
First, explore available analysis options:
python bin/run_analysis.py --help
Then start the interactive analysis:
python bin/run_analysis.py
The interface provides file selection, delimiter detection, and engine selection.
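As an illustration of what delimiter detection can look like, the sketch below uses the standard library's `csv.Sniffer`. The source does not say how the suite detects delimiters, so the function name `detect_delimiter`, the candidate set, and the fallback default are all assumptions.

```python
import csv


def detect_delimiter(
    path: str,
    candidates: str = ",;\t|",
    sample_chars: int = 64 * 1024,
    default: str = ",",
) -> str:
    """Guess the delimiter from the start of the file; fall back to default."""
    with open(path, newline="", encoding="utf-8") as handle:
        sample = handle.read(sample_chars)
    try:
        # Restrict the guess to the candidate characters to avoid odd results.
        return csv.Sniffer().sniff(sample, delimiters=candidates).delimiter
    except csv.Error:
        # The sniffer could not decide; the caller can still override manually.
        return default
```

Because sniffing is heuristic, a manual override (as the interactive interface offers) remains useful for ambiguous files.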
Setup guide: Installation Guide | Issues: Troubleshooting Guide
graph TD
A[User Interface] --> B[Main Orchestrator]
B --> C[Base Environment<br/>Python 3.10+]
C --> D[Environment Manager]
D --> E[Main Environment<br/>Python 3.11]
D --> F[Profiling Environment<br/>Python 3.10]
D --> G[DataPrep Environment<br/>Python 3.10]
E --> H[Statistical Analysis<br/>numpy 2.2.6, scipy 1.13.1]
F --> I[YData Profiling<br/>numpy <2.2]
F --> J[SweetViz<br/>legacy pandas]
G --> K[DataPrep Engine<br/>pandas 1.5.3]
H --> L[Analysis Reports]
I --> L
J --> L
K --> L
Architecture details: Architecture Guide
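One way to picture the dispatch step in the diagram is a base-environment process that launches each engine with `conda run -n <env>`. The sketch below is an assumption about how such dispatch could work, not the suite's documented mechanism; the environment names, the engine script path, and the function names are hypothetical.

```python
import subprocess
from typing import List

# Hypothetical environment names; the real names are not given in this README.
ENGINE_ENVS = {
    "main": "csv-profiler-main",
    "profiling": "csv-profiler-profiling",
    "dataprep": "csv-profiler-dataprep",
}


def build_engine_command(env_key: str, script: str, csv_path: str) -> List[str]:
    """Build a `conda run` command that executes script inside the chosen environment."""
    env_name = ENGINE_ENVS[env_key]
    return ["conda", "run", "-n", env_name, "python", script, csv_path]


def run_engine(env_key: str, script: str, csv_path: str) -> int:
    """Run one engine in its own environment and return the process exit code."""
    completed = subprocess.run(
        build_engine_command(env_key, script, csv_path), check=False
    )
    return completed.returncode
```

Because each engine runs as a separate process in its own environment, packages with conflicting pins (for example the different pandas and numpy versions shown in the diagram) are never imported into the same interpreter.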
Summary: Multi-environment conda architecture resolves dependency conflicts between profiling engines. Requires conda, Python 3.10+, and 3GB free disk space.
python bin/run_analysis.py
Interactive mode includes:
- File selection with validation
- Delimiter detection (with manual override)
- Engine selection based on availability
- Progress monitoring with updates
- Result summary and output locations
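Engine selection based on availability can be sketched with `importlib`. This is a simplified, single-interpreter illustration: in the suite the engines live in separate conda environments, so the real availability check presumably happens per environment. The mapping `ENGINE_MODULES` and both function names are assumptions.

```python
import importlib
import importlib.util
from types import ModuleType
from typing import Dict, List

# Display name -> top-level import name of each profiling library.
ENGINE_MODULES: Dict[str, str] = {
    "YData Profiling": "ydata_profiling",
    "SweetViz": "sweetviz",
    "DataPrep": "dataprep",
}


def available_engines(modules: Dict[str, str] = ENGINE_MODULES) -> List[str]:
    """List engines whose module can be located, without importing any of them."""
    return [
        name
        for name, module in modules.items()
        if importlib.util.find_spec(module) is not None
    ]


def load_engine(name: str, modules: Dict[str, str] = ENGINE_MODULES) -> ModuleType:
    """Import an engine only when it is requested (lazy loading)."""
    return importlib.import_module(modules[name])
```

Offering only the names returned by `available_engines()` lets the interface keep working when some engines are missing, and deferring `load_engine()` until a user picks an engine avoids paying the import cost for engines that are never used.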
# Analyze specific file directly
python bin/run_analysis.py path/to/data.csv
# Debug mode with detailed output
python bin/run_analysis.py --debug
# Direct analysis with debug mode
python bin/run_analysis.py path/to/data.csv --debug
Command options are documented in the Quick Start Guide above.
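For scripted use, the commands above could be wrapped from Python with `subprocess`. This is a minimal sketch that uses only the script path and the `--debug` flag shown above; the wrapper function names are hypothetical, and it assumes the working directory is the repository root.

```python
import subprocess
import sys
from typing import List, Optional


def build_cli_command(
    csv_path: Optional[str] = None,
    debug: bool = False,
    script: str = "bin/run_analysis.py",
) -> List[str]:
    """Build the argument list for the analysis CLI."""
    command = [sys.executable, script]
    if csv_path is not None:
        command.append(csv_path)   # direct analysis of one file
    if debug:
        command.append("--debug")  # detailed output for troubleshooting
    return command


def run_analysis(csv_path: Optional[str] = None, debug: bool = False) -> int:
    """Run the CLI and return its exit code; no csv_path starts interactive mode."""
    return subprocess.run(build_cli_command(csv_path, debug), check=False).returncode
```

Passing the arguments as a list rather than a shell string avoids quoting problems with file paths that contain spaces.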
Use the command-line interface for reliable multi-environment support:
# Interactive mode (recommended) - guides through file selection
python bin/run_analysis.py
# Direct file analysis
python bin/run_analysis.py data.csv
# Debug mode for troubleshooting
python bin/run_analysis.py --debug
Python API (requires proper environment setup):
# Note: ImportError may occur in multi-environment setup
# CLI interface recommended for production workflows
try:
    from autocsv_profiler import profile_csv
    report_path = profile_csv("data.csv", "output_directory/")
except ImportError:
    print("Usage: python bin/run_analysis.py data.csv")
Getting Started:
- Installation Guide - Environment setup instructions
- Getting Started Tutorial - Step-by-step walkthrough
- User Guide - Reference for daily usage
Technical Reference:
- API Documentation - Technical API reference
- Architecture Guide - Multi-environment design
- Performance Guide - Optimization and benchmarks
Development:
- Development Guide - Environment setup and workflow
- Design Decisions - Architectural decision records
- Troubleshooting Guide - Common issues and solutions
Examples:
- Examples Directory - samples
- Engine Testing Guide - Engine testing procedures
See Contributing Guide for workflow details.
MIT License - see LICENSE file. Third-party dependencies have various licenses - see NOTICE for details.
Important: See DISCLAIMER for liability limitations and dependency responsibility.
- Repository: https://github.com/dhaneshbb/autocsv-profiler-suite
- Documentation: https://github.com/dhaneshbb/autocsv-profiler-suite/tree/main/docs
- Issues: https://github.com/dhaneshbb/autocsv-profiler-suite/issues
- Changelog: CHANGELOG.md
- License: MIT License
Version 2.0.0 | Beta | Python 3.10-3.13 | Cross-Platform
Copyright 2025 dhaneshbb | License: MIT | Homepage: https://github.com/dhaneshbb/autocsv-profiler-suite




