
04. Developer Guide

FindHao edited this page Jul 30, 2025 · 6 revisions

This guide is for developers who want to contribute to TritonParse, understand its architecture, or extend its functionality.

πŸ—οΈ Architecture Overview

High-Level Architecture

TritonParse consists of three main components:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   Python Backend       β”‚    β”‚   Processing           β”‚    β”‚   Frontend UI          β”‚
β”‚                        β”‚    β”‚                        β”‚    β”‚                        β”‚
β”‚ β€’ Structured Logging   │──▢│ β€’ Log Parsing          │──▢│ β€’ React Interface      β”‚
β”‚ β€’ Triton Hooks         β”‚    β”‚ β€’ Source Mapping       β”‚    β”‚ β€’ IR Visualization     β”‚
β”‚ β€’ Trace Generation     β”‚    β”‚ β€’ Data Compression     β”‚    β”‚ β€’ Code Comparison      β”‚
β”‚                        β”‚    β”‚ β€’ Process Launch Trace β”‚    β”‚ β€’ Launch Diff Analysis β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Component Details

1. Python Backend (tritonparse/)

  • Purpose: Capture Triton compilation events and generate structured logs. This package contains the core logic for parsing IR, processing traces, and creating source mappings.
  • Key Files:
    • structured_logging.py - Main logging infrastructure to capture events.
    • trace_processor.py - Processes raw trace files, groups events, and generates launch_diff and autotune analysis.
    • ir_parser.py - Extracts source location information from various IRs (TTIR, TTGIR, PTX, AMDGCN).
    • mapper.py - Creates bidirectional mappings between different IRs and Python source code.
    • utils.py - Main CLI entrypoint (unified_parse) and other parsing utilities.
    • extract_source_mappings.py - Legacy utility for IR stage correlation.
    • event_diff.py - Logic for comparing kernel launch events.
    • sourcemap_utils.py - Helper functions for source mapping.
    • common.py, shared_vars.py, tp_logger.py - Common utilities, shared state, and logger configuration.

2. Processing Pipeline

  • Purpose: Transform raw logs into structured, analyzable format.
  • Key Functions:
    • Parse NDJSON logs
    • Extract source mappings between IR stages
    • Process launch trace
    • Compress and package data

3. Frontend UI (website/)

  • Purpose: Interactive visualization and analysis interface
  • Key Technologies:
    • React 19 with TypeScript
    • Vite build system
    • Tailwind CSS for styling
    • Monaco Editor for code display

πŸ“ Project Structure

tritonparse/
β”œβ”€β”€ tritonparse/                 # Python package
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ structured_logging.py    # Core logging infrastructure to capture events
β”‚   β”œβ”€β”€ trace_processor.py       # Processes raw trace files, groups events, and generates diffs
β”‚   β”œβ”€β”€ ir_parser.py             # Extracts source location information from various IRs
β”‚   β”œβ”€β”€ mapper.py                # Creates bidirectional mappings between different IRs and Python source code
β”‚   β”œβ”€β”€ event_diff.py            # Logic for comparing kernel launch events
β”‚   β”œβ”€β”€ utils.py                 # Main CLI entrypoint (`unified_parse`) and other parsing utilities
β”‚   β”œβ”€β”€ source_type.py           # Source type definitions (e.g., TTIR, PTX)
β”‚   β”œβ”€β”€ sourcemap_utils.py       # Helper functions for source mapping
β”‚   β”œβ”€β”€ common.py                # Common utilities and helper functions
β”‚   β”œβ”€β”€ shared_vars.py           # Shared state and variables for the package
β”‚   └── tp_logger.py             # Logger configuration
β”œβ”€β”€ website/                     # React web application for visualization
β”‚   β”œβ”€β”€ src/
β”‚   β”‚   β”œβ”€β”€ components/          # Reusable React components
β”‚   β”‚   β”‚   β”œβ”€β”€ ArgumentViewer.tsx   # Displays kernel arguments
β”‚   β”‚   β”‚   β”œβ”€β”€ CodeViewer.tsx       # Displays IR code with syntax highlighting
β”‚   β”‚   β”‚   β”œβ”€β”€ DiffViewer.tsx       # Side-by-side diff view for text
β”‚   β”‚   β”‚   β”œβ”€β”€ CodeComparisonView.tsx # Compares two IRs with line mappings
β”‚   β”‚   β”‚   └── AutotuneAnalysis.tsx # Displays autotuning results
β”‚   β”‚   β”œβ”€β”€ pages/               # Main application pages
β”‚   β”‚   β”‚   β”œβ”€β”€ KernelOverview.tsx # Main analysis view for a kernel, combining all components
β”‚   β”‚   β”‚   └── CodeView.tsx     # A focused view for a single IR file
β”‚   β”‚   β”œβ”€β”€ utils/               # Utility functions
β”‚   β”‚   β”‚   └── dataLoader.ts    # Data loading and processing from parsed logs
β”‚   β”‚   β”œβ”€β”€ App.tsx              # Main application component routing to pages
β”‚   β”‚   └── main.tsx             # Application entry point
β”‚   β”œβ”€β”€ public/                  # Static assets (e.g., images, sample data)
β”‚   β”œβ”€β”€ package.json             # Frontend dependencies and scripts
β”‚   └── vite.config.ts           # Vite build configuration
β”œβ”€β”€ tests/                       # Test suite for the Python package
β”œβ”€β”€ docs/                        # Project documentation
β”œβ”€β”€ .github/                     # GitHub Actions workflows
β”œβ”€β”€ .ci/                         # CI scripts
β”œβ”€β”€ pyproject.toml               # Python project configuration
β”œβ”€β”€ Makefile                     # Development commands
└── README.md                    # Project overview

πŸ”§ Development Environment Setup

Prerequisites

  • Python >= 3.10
  • Node.js >= 18.0.0
  • Triton >= 3.4.0 (latest version recommended)
  • Git for version control

1. Clone and Setup

# Clone repository
git clone https://github.com/pytorch-labs/tritonparse.git
cd tritonparse

# Install Python dependencies
make install-dev

# Install website dependencies
cd website
npm install

2. Verify Setup

# Check Python setup
make format-check
make lint-check
python -m unittest tests.test_tritonparse.TestTritonparseCPU -v

# Check website setup
cd website
npm run dev

πŸ› οΈ Development Workflow

Code Style and Formatting

We use a comprehensive formatting pipeline:

| Tool  | Purpose         | Configuration  |
|-------|-----------------|----------------|
| Black | Code formatting | pyproject.toml |
| usort | Import sorting  | pyproject.toml |
| Ruff  | Linting         | Built-in rules |

Essential Commands

# Format code
make format

# Check formatting
make format-check

# Run linting
make lint-check

# Run tests
python -m unittest tests.test_tritonparse -v

# Website development
cd website && npm run dev

Development Quality Checks

Before committing, ensure:

  1. Code is formatted: make format
  2. Linting passes: make lint-check
  3. Tests pass: python -m unittest tests.test_tritonparse -v
  4. Website builds: cd website && npm run build

πŸ—οΈ Backend Development

Core Components

1. Structured Logging (structured_logging.py)

Purpose: Capture Triton compilation and launch events in a structured format

Key Functions:

  • init(log_path, enable_launch_trace=False) - Initialize logging system.
    • log_path: The directory where log files will be stored.
    • enable_launch_trace: If True, captures detailed metadata for each kernel launch. This is required for launch analysis.

Integration Points:

  • Triton compilation hooks
  • PyTorch TorchInductor integration
  • Stack trace extraction

2. Log Processing (utils.py)

Purpose: Transform raw logs into an analyzable format

Key Functions:

  • unified_parse() - Main parsing interface
  • oss_run() - OSS-specific parsing logic
  • parse_logs() - Core log processing

Processing Pipeline:

  1. Read raw NDJSON logs from input directory
  2. Parse and validate log entries
  3. Extract source mappings between IR stages
  4. Compress and save processed data
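Step 4 can be illustrated with a gzip round-trip; the actual file naming and on-disk layout are determined by unified_parse, so treat this only as a sketch of the compression step:

```python
import gzip
import json

# Compress a processed payload as step 4 does, then verify it round-trips.
processed = {"kernels": [{"name": "add_kernel", "hash": "abc123"}]}
blob = gzip.compress(json.dumps(processed).encode("utf-8"))
restored = json.loads(gzip.decompress(blob))
print(restored["kernels"][0]["name"])  # add_kernel
```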

3. Source Mapping (extract_source_mappings.py)

Purpose: Correlate lines between different IR stages

Key Functions:

  • extract_source_mappings() - Main extraction logic
  • process_kernel_logs() - Process individual kernel logs
  • map_ir_stages() - Map lines between IR formats

Adding New Features

  1. Define the new data: Determine what new information needs to be captured.
  2. Update structured_logging.py: Add logic to capture the new data within the appropriate hooks (e.g., pre-compilation, post-compilation).
  3. Modify trace_processor.py: If the new data requires special processing or aggregation (like the launch analysis), add the logic here.
  4. Update unified_parse(): Ensure the new data is handled correctly during the main parsing routine.
  5. Write tests: Add unit and integration tests to tests/ to validate the new feature.

Testing Backend Changes

# Run CPU tests (no GPU required)
python -m unittest tests.test_tritonparse.TestTritonparseCPU -v

# Run GPU tests (requires CUDA)
python -m unittest tests.test_tritonparse.TestTritonparseCUDA -v

# Run specific test
python -m unittest tests.test_tritonparse.TestTritonparseCUDA.test_whole_workflow -v

# Test with real kernel
cd tests
TORCHINDUCTOR_FX_GRAPH_CACHE=0 python test_add.py

🎨 Frontend Development

Technology Stack

  • React 22 - UI framework
  • TypeScript - Type safety
  • Vite - Build tool and dev server
  • Tailwind CSS - Styling
  • Monaco Editor - Code display

Key Components

1. Data Loading (utils/dataLoader.ts)

Purpose: Load and process trace files

Key Functions:

  • loadLogData() - Load from URL
  • loadLogDataFromFile() - Load from file
  • processKernelData() - Process raw data

2. Code Viewer (components/CodeViewer.tsx)

Purpose: Display IR code with syntax highlighting

Features:

  • Language-specific syntax highlighting
  • Line number display
  • Interactive line selection
  • Source mapping visualization

3. Code Comparison (components/CodeComparisonView.tsx)

Purpose: Side-by-side IR comparison

Features:

  • Synchronized scrolling
  • Line mapping visualization
  • Interactive highlighting
  • Dropdown IR selection

Adding New Features

  1. Update dataLoader.ts: Modify the data loading and processing functions to handle any new data fields from the backend.
  2. Create new components: In website/src/components/, create new React components to display the new information, for example a new panel in KernelOverview.tsx or a new view.
  3. Integrate components: Add the new components to the appropriate pages (e.g., KernelOverview.tsx, CodeComparisonView.tsx).
  4. Style the components: Use Tailwind CSS for styling to match the existing interface.
  5. Add tests: If applicable, add tests for the new components or functionality.

Testing Frontend Changes

cd website

# Development server
npm run dev

# Type checking
npm run build

# Linting
npm run lint

# Test with sample data
# Load ./public/f0_fc0_a0_cai-.ndjson in browser

πŸ“Š Data Flow

End-to-End Data Flow

Python Code
     β”‚
     β–Ό
Triton Compilation
(triggers Hook Events)
     β”‚
     β–Ό
Structured Logging
     β”‚
     β–Ό
Raw NDJSON Logs
     β”‚
     β–Ό
Log Processing
  - Source Mapping
  - Launch Analysis
     β”‚
     β–Ό
Compressed Data
     β”‚
     β–Ό
Web Interface
     β”‚
     β–Ό
Interactive Visualization

Data Formats

1. Raw NDJSON Format

{
  "event_type": "compilation_start",
  "timestamp": 1234567890,
  "kernel_name": "add_kernel",
  "metadata": {...}
}

2. Processed Format

{
  "kernels": [
    {
      "hash": "abc123",
      "name": "add_kernel",
      "metadata": {...},
      "irFiles": {
        "ttgir": "...",
        "ptx": "..."
      },
      "sourceMappings": {
        "ttgir": {...},
        "ptx": {...}
      }
    }
  ]
}
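A minimal reader for the processed format shown above; the field names are taken from the example, and real payloads contain more fields:

```python
import json

payload = json.loads("""
{
  "kernels": [
    {"hash": "abc123", "name": "add_kernel",
     "metadata": {},
     "irFiles": {"ttgir": "...", "ptx": "..."},
     "sourceMappings": {"ttgir": {}, "ptx": {}}}
  ]
}
""")
for kernel in payload["kernels"]:
    # List which IR stages were captured for each kernel.
    print(kernel["name"], sorted(kernel["irFiles"]))  # add_kernel ['ptx', 'ttgir']
```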

πŸ” Debugging and Development Tools

Debug Logging

# Enable debug logging
export TRITONPARSE_DEBUG=1

# Run with debug output
python your_script.py

Development Utilities

# Check log file contents
head -n 10 ./logs/*.ndjson

# Inspect compressed data
zcat ./parsed_output/*.gz | head -n 20

# Test parsing pipeline
python -c "
import tritonparse.utils
tritonparse.utils.unified_parse('./logs/', './test_output/', verbose=True)
"

Browser Developer Tools

// Enable frontend debug logging
localStorage.setItem('tritonparse-debug', 'true');

// Inspect loaded data
console.log(window.tritonparseData);

// Test data processing
import { processKernelData } from './utils/dataLoader';
console.log(processKernelData(rawData));

πŸ§ͺ Testing

Test Structure

tests/
β”œβ”€β”€ test_tritonparse.py         # Main test suite
β”œβ”€β”€ test_add.py                 # Manual test example
└── example_output/             # Sample data

Running Tests

# All tests
python -m unittest tests.test_tritonparse -v

# CPU-only tests
python -m unittest tests.test_tritonparse.TestTritonparseCPU -v

# GPU tests (requires CUDA)
python -m unittest tests.test_tritonparse.TestTritonparseCUDA -v

# Manual test
cd tests
TORCHINDUCTOR_FX_GRAPH_CACHE=0 python test_add.py

Writing Tests

To add a new end-to-end test case, you should follow the structure of existing tests in TestTritonparseCUDA. The general workflow is as follows:

  1. Define a test method: Create a new method inside TestTritonparseCUDA with a name starting with test_.
  2. Define a Triton kernel: Write a simple Triton kernel that demonstrates the feature you want to test. This can be defined directly inside the test method.
  3. Set up a temporary environment: Use tempfile.mkdtemp() to create temporary directories for logs and parsed output.
  4. Initialize tritonparse logging: Call tritonparse.structured_logging.init() to start capturing events.
  5. Run the kernel: Execute the kernel to generate compilation and launch events. Run it multiple times if you need to test launch_diff functionality.
  6. Parse the logs: Call tritonparse.utils.unified_parse() to process the raw logs.
  7. Assert the results: Check the contents of the raw log files or the parsed output to verify that the behavior is correct.
  8. Clean up: Use a try...finally block to ensure the temporary directory is always removed.

Here is a simplified example illustrating how to add a new test:

# In tests/test_tritonparse.py, inside TestTritonparseCUDA

@unittest.skipUnless(torch.cuda.is_available(), "CUDA not available")
def test_new_feature_workflow(self):
    """Test a new feature in the end-to-end workflow."""

    # 1. Define the kernel for the test
    @triton.jit
    def my_new_kernel(x_ptr, y_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
        # ... kernel implementation ...
        pid = tl.program_id(axis=0)
        offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
        x = tl.load(x_ptr + offsets, mask=offsets < n_elements)
        tl.store(y_ptr + offsets, x, mask=offsets < n_elements)

    # 2. Set up temporary directories
    temp_dir = tempfile.mkdtemp()
    log_path = os.path.join(temp_dir, "logs")
    parsed_path = os.path.join(temp_dir, "parsed")
    os.makedirs(log_path, exist_ok=True)
    os.makedirs(parsed_path, exist_ok=True)

    # 3. Initialize logging and ensure cleanup
    tritonparse.structured_logging.init(log_path, enable_launch_trace=True)
    try:
        # 4. Run the kernel to generate logs
        x = torch.randn(128, device="cuda")
        y = torch.empty_like(x)
        my_new_kernel[(1,)](x, y, 128, BLOCK_SIZE=128)
        torch.cuda.synchronize()

        # 5. Parse the generated logs
        tritonparse.utils.unified_parse(source=log_path, out=parsed_path)

        # 6. Verify the output
        parsed_files = os.listdir(parsed_path)
        self.assertGreater(len(parsed_files), 0, "Parsing did not produce output files.")

        # ... (add more specific assertions on file contents) ...

    finally:
        # 7. Clean up the temporary directory
        shutil.rmtree(temp_dir)
        tritonparse.structured_logging.clear_logging_config()

πŸ“¦ Release Process

Version Management

Versions are managed in:

  • pyproject.toml - Python package version
  • website/package.json - Frontend version

Release Steps

  1. Update version numbers
  2. Update CHANGELOG.md
  3. Run full test suite
  4. Build and test website
  5. Create GitHub release
  6. Deploy to GitHub Pages

GitHub Actions

CI/CD pipeline includes:

  • Format checking - Code style validation
  • Linting - Code quality checks
  • Testing - Python and frontend tests
  • Website deployment - Automatic GitHub Pages deployment

🀝 Contributing Guidelines

Pull Request Process

  1. Fork the repository
  2. Create feature branch: git checkout -b feature-name
  3. Make changes following coding standards
  4. Add tests for new functionality
  5. Run formatting: make format
  6. Run tests: make lint-check && python -m unittest tests.test_tritonparse -v
  7. Submit pull request

Code Review Process

  • All PRs require review by core maintainers
  • CI checks must pass before merge
  • Documentation updates required for new features
  • Tests required for new functionality

Issue Reporting

When reporting issues:

  1. Use issue templates provided
  2. Include system information
  3. Provide reproduction steps
  4. Include error messages and logs

πŸ“š Additional Resources

Documentation

Community

External Resources

πŸ”— Next Steps

For new developers:

  1. Complete the Installation Guide
  2. Read the Usage Guide to understand the tool
  3. Explore the codebase starting with simple components
  4. Run the test suite to verify your setup
  5. Join GitHub Discussions for community support

For experienced contributors:

  1. Check GitHub Issues for open tasks
  2. Review the Architecture Deep Dive for advanced topics
  3. Contribute to documentation improvements
  4. Propose new features through GitHub Discussions