Skip to content

Conversation

al-rigazzi
Copy link
Collaborator

Overview

This PR modernizes the entire SmartSim codebase to use Python 3.10+ typing syntax, improving code readability and type safety while maintaining full backward compatibility.

Changes Made

🔄 Syntax Modernization

  • Union Types: Union[X, Y]X | Y
  • Optional Types: Optional[X]X | None
  • Generic Collections:
    • List[X]list[X]
    • Dict[X, Y]dict[X, Y]
    • Tuple[X, Y]tuple[X, Y]
    • Set[X]set[X]

📦 Import Modernization

  • Updated collections.abc imports (Callable, Iterable, Sequence, etc.)
  • Removed 46 unused import typing as t statements
  • Cleaned up import organization and grouping

🔧 Type Annotation Fixes

  • Fixed dict type annotations with union syntax for mypy compatibility
  • Resolved dict[str, int, str | float]dict[str, int | str | float] patterns
  • Ensured all type hints are properly formed and validated

Quality Metrics

✅ Code Quality

  • Pylint Score: 10.00/10 (maintained perfect score)
  • MyPy Validation: Success: no issues found in 128 source files
  • Type Safety: All existing type annotations preserved and improved

📊 Impact

  • Files Modified: 108 files across core codebase
  • Core Files Updated: 93 files in smartsim/ and tests/
  • Lines Changed: 954 insertions, 1026 deletions (net code improvement)

Files Affected

Core Components

  • Settings classes (smartsim/settings/)
  • Core utilities (smartsim/_core/)
  • Entity management (smartsim/entity/)
  • Launcher infrastructure (smartsim/_core/launcher/)
  • CLI tools (smartsim/_core/_cli/)

Test Suite

  • Updated test files to match new typing patterns
  • Maintained full test compatibility

Backward Compatibility

Python Version Support: This modernization maintains compatibility with the project's Python version requirements while using modern syntax available in Python 3.10+

API Stability: No breaking changes to public APIs or interfaces

Runtime Behavior: All functionality preserved with improved type safety

Testing

  • ✅ All existing tests pass
  • ✅ MyPy type checking passes without errors
  • ✅ Pylint analysis maintains perfect 10.00/10 score
  • ✅ No runtime behavior changes

Benefits

  1. Improved Readability: Modern union syntax (X | Y) is more concise and readable
  2. Better IDE Support: Enhanced autocomplete and type inference
  3. Future-Proof: Aligned with current Python typing best practices
  4. Consistency: Uniform typing style across entire codebase
  5. Maintainability: Cleaner, more organized import structure

This modernization brings SmartSim's typing infrastructure up to current Python standards while maintaining all existing functionality and compatibility.

- Remove TelemetryConfiguration classes and related code
- Remove telemetry monitor entrypoint and utilities
- Remove telemetry collectors and sinks
- Remove telemetry-related tests
- Remove watchdog dependency
- Simplify job entities and controller logic
- Remove telemetry configuration from config.py

This removes approximately 5,838 lines of telemetry-related code
while preserving core SmartSim functionality.
- Remove telemetry_dir usage from controller.py batch job creation
- Clean up telemetry references in job.py comments and docstrings
- Remove telemetry-related properties from manifest.py
- Update serialize.py to remove telemetry directory and metadata references
- Remove telemetry_dir argument from indirect.py entrypoint and step.py launcher
- Update indirect tests to remove telemetry_dir parameter expectations
- Fix conftest.py to import JobEntity from correct location
- Clean up remaining telemetry comments and replace with generic logging

All telemetry code, configuration, tests, and documentation have now been
completely removed from the SmartSim codebase.
- Clean up remaining telemetry references in job.py comments
- Simplify step.py proxy decorator to always use direct launch
- Remove telemetry.disable() call from CLI validate.py
- Simplify dragon backend cooldown period configuration
- Remove unused get_config import from dragon backend

All telemetry code has been completely removed from SmartSim.
The codebase now works without any telemetry dependencies or references.
- Replace CONFIG.telemetry_subdir references with 'status' directory
- Remove telemetry event tracking from test_process_failure and test_complete_process
- Simplify tests to focus on actual process execution rather than telemetry events
- All indirect tests now pass without telemetry dependencies

Tests now verify core functionality without relying on removed telemetry system.
- Remove dashboard CLI plugin and all associated functionality
- Remove SmartDashboard documentation file (smartdashboard.rst)
- Update documentation index to remove SmartDashboard section
- Clean up ReadTheDocs configuration to remove dashboard dependency
- Update Docker files to remove SmartDashboard installation
- Remove dashboard-related tests and update plugin tests
- Update changelog to document SmartDashboard removal as breaking change
- Remove SmartDashboard changelog section

SmartSim now operates independently without SmartDashboard integration.
The core monitoring and logging functionality is preserved through
SmartSim's existing logging infrastructure.
- Add proper type annotation for empty plugins tuple in plugin.py
- Add explicit type annotation for plugin_items in cli.py
- All mypy checks now pass successfully
- Remove telemetry-related test functions from test_experiment.py
- Fix status_dir metadata by setting it to .smartsim subdirectory
- Fix controller test expecting removed exp_path parameter
- All tests now pass and mypy is clean
- Remove telemetry-related test functions from test_config.py and test_serialize.py
- Remove telemetry fixtures and references from test_logs.py and conftest.py
- Update manifest_json fixture to use simple path instead of telemetry_subdir
- All tests now pass without telemetry dependencies
- Updated test_output_files.py to match simplified .smartsim directory structure
- Updated test_symlinking.py to use new output file paths
- Fixed controller to use absolute paths for status directories
- Implemented historical file preservation with timestamps
- Updated batch job tests to use correct entity relationships
- Modified symlink_error test to match new auto-creating behavior

All core telemetry removal is complete with only output redirection issues remaining.
- Remove unused imports (CONFIG, subprocess, sys, pathlib, get_ts_ms, encode_cmd, UnproxyableStepError)
- Fix line length issues in indirect.py and job.py
- Remove unreachable code after return statements
- Remove unused variables (start_rc, status_dir, is_dragon)
- Fix import-outside-toplevel issue with time module in controller.py
- Add pylint disable comment for unused argument raw_experiment
- Remove unnecessary pass statement and simplify docstring

All lint checks now pass with 10.00/10 rating.
- Delete smartsim/_core/entrypoints/indirect.py
- Delete tests/test_indirect.py
- Update step.py comment to remove references to indirect launching
- Clean up cached files and mypy cache for removed modules
- Verified all tests pass and no type errors remain
- Fix KeyError for status directory in batch job steps by setting status_dir in _create_batch_job_step
- Remove test_orc_telemetry test that referenced deleted telemetry functionality
- Remove remaining telemetry environment variable settings from dragon and pals tests
- Update line formatting for better lint compliance
- All originally failing tests now pass
- Enhanced symlink_output_files to auto-create parent directories
- Fixed path handling for entities with sub-entities (Orchestrator/Ensemble)
- Ensured all tests use proper test directories instead of repo root
- Removed unused CONFIG imports
- All tests now pass without creating lingering files in repo root
- Remove MockSink class and mock_sink fixture
- Remove mock_con, mock_mem, mock_redis, and mock_entity fixtures
- Remove MockCollectorEntityFunc protocol
- Clean up unused imports (asyncio, DragonLauncher, JobEntity)
- Improves pylint score from 9.56 to 9.67
- Add CONFIG.metadata_subdir property following established pattern
- Refactor controller to use consistent .smartsim/metadata base path
- Replace timestamped run_dir with metadata_dir/run_timestamp structure
- Update all method signatures: run_dir -> metadata_dir parameters
- Preserve historical log functionality with timestamped subdirectories
- Update tests to work with new metadata directory pattern
- Add test coverage for new CONFIG.metadata_subdir property

Addresses reviewer feedback for consistent directory structure
while maintaining backward compatibility and historical logs.
al-rigazzi and others added 29 commits August 4, 2025 11:15
Implement the following improvements from PR CrayLabs#789 code review:

1. Fix import style: Move shutil import to module level in test_controller_metadata_usage.py
   - Relocate shutil import from method to top-level imports per Python best practices

2. Remove unused JobEntity code: Complete cleanup of JobEntity ecosystem
   - Remove JobEntity class and _JobKey class from job.py
   - Remove JobEntity imports and isinstance checks from jobmanager.py
   - Simplify Job type annotations to use actual SmartSim entities only
   - Eliminate telemetry-related legacy code that's no longer needed

3. Enhance CONFIG with Path objects: Improve type safety for directory paths
   - Update smartsim_base_dir, dragon_default_subdir, dragon_logs_subdir, metadata_subdir
     to return pathlib.Path objects instead of strings
   - Maintain backward compatibility with os.path.join and string operations
   - Update test expectations to validate Path object behavior

All changes tested and verified:
- Import style follows Python conventions
- JobEntity references completely removed from codebase
- Path objects provide enhanced type safety while preserving compatibility
- All existing tests pass with new Path-based CONFIG properties
Address MattToast's feedback about removing run_id which was used for
telemetry tracking but is no longer needed after telemetry removal.

Changes:
- Remove run_id field from _LaunchedManifestMetadata NamedTuple
- Remove run_id parameter from LaunchedManifestBuilder constructor
- Remove run_id from serialized manifest.json output
- Update all test files to remove run_id parameters
- Update test expectations to use timestamp for uniqueness instead

The manifest system now uses timestamp for run identification instead
of the UUID-based run_id, simplifying the codebase after telemetry removal.
- Remove LaunchedManifest, _LaunchedManifestMetadata, and LaunchedManifestBuilder classes
- Simplify serialize.py by removing orphaned telemetry functions (80% reduction)
- Update controller.py to remove LaunchedManifest dependencies and phantom method call
- Clean up all test files to remove LaunchedManifest references
- Delete tests/test_serialize.py as it only tested removed functionality
- Maintain core Manifest class functionality for entity organization
- Achieve 10.00/10 linting score across all modified files
- Restore missing _save_orchestrator() call in _launch_orchestrator_simple()
- This was accidentally removed during LaunchedManifest cleanup
- Fixes test_dbnode.py::test_hosts which requires checkpoint file for reconnection
- Maintains 10.00/10 linting score
- Restore missing _jobs.set_db_hosts(orchestrator) call in _launch_orchestrator_simple()
- This was accidentally removed during LaunchedManifest cleanup
- Fixes IndexError in db_is_active() where hosts list was empty
- Resolves backend ML model test failures (test_dbmodel.py, test_dbscript.py)
- Database addresses now properly populated for entity launches
- Maintains 10.00/10 linting score
- Add timestamp-based unique metadata directories for each launch
- Import get_ts_ms helper function from utils.helpers
- Modify ensemble and model metadata directory paths to include launch timestamp
- Ensures each experiment launch gets unique metadata directories
- Fixes test_output_files.py::test_mutated_model_output
- Prevents output file overwrites when same model is run multiple times
- Historical output files now properly preserved across multiple runs
- Maintains 10.00/10 linting score
- Move TStepLaunchMetaData type definition from serialize.py to controller_utils.py
- Remove unused smartsim/_core/utils/serialize.py file entirely
- Add pathlib.Path import to controller_utils.py for type definition
- Remove TYPE_CHECKING import that was only used for the moved type
- Complete final cleanup of telemetry-related serialization code
- All functionality preserved and tests still pass
- Replace Union[X, Y] with X | Y syntax across entire codebase
- Replace Optional[X] with X | None syntax
- Update List[X] to list[X] and Dict[X, Y] to dict[X, Y]
- Update Tuple[X, Y] to tuple[X, Y] and Set[X] to set[X]
- Modernize collections.abc imports (Callable, Iterable, etc.)
- Remove 46 unused 'import typing as t' statements
- Fix dict type annotations with union syntax for mypy compatibility
- Update 100+ files with modern type hints
- Maintain 10.00/10 pylint score
- Achieve 'Success: no issues found' mypy validation

Files affected: 93 core files across smartsim/ and tests/
Type safety: All existing type annotations preserved and improved
Compatibility: Python 3.10+ syntax with backward compatibility
- Remove incorrect quotes from os.PathLike[str] in union type
- Fixes runtime import error in builder.py
- Union should be: str | os.PathLike[str] (not str | "os.PathLike[str]")
- Maintains proper type safety and Python 3.10+ union syntax
- Remove incorrect quotes from os.PathLike[str] in union types
- Fixes 2 additional instances of the same issue as builder.py
- Function parameter: str | os.PathLike[str] (not str | "os.PathLike[str]")
- List type annotation: list[str | os.PathLike[str]]
- Ensures all SmartSim modules can import without syntax errors
Copy link

codecov bot commented Aug 14, 2025

Codecov Report

❌ Patch coverage is 95.25066% with 18 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.27%. Comparing base (d7d979e) to head (d8dbf0d).
⚠️ Report is 20 commits behind head on develop.

Files with missing lines Patch % Lines
smartsim/_core/launcher/util/launcherUtil.py 0.00% 5 Missing ⚠️
smartsim/_core/launcher/dragon/dragonLauncher.py 55.55% 4 Missing ⚠️
smartsim/_core/launcher/sge/sgeLauncher.py 50.00% 3 Missing ⚠️
smartsim/_core/control/controller.py 93.33% 2 Missing ⚠️
smartsim/_core/launcher/dragon/dragonBackend.py 88.23% 2 Missing ⚠️
smartsim/_core/launcher/launcher.py 50.00% 1 Missing ⚠️
smartsim/_core/launcher/step/sgeStep.py 50.00% 1 Missing ⚠️
Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #791      +/-   ##
===========================================
- Coverage    83.91%   80.27%   -3.64%     
===========================================
  Files           83       78       -5     
  Lines         6284     6095     -189     
===========================================
- Hits          5273     4893     -380     
- Misses        1011     1202     +191     
Files with missing lines Coverage Δ
smartsim/_core/config/config.py 98.21% <100.00%> (+1.07%) ⬆️
smartsim/_core/control/controller_utils.py 86.66% <ø> (-13.34%) ⬇️
smartsim/_core/control/job.py 93.33% <100.00%> (-2.70%) ⬇️
smartsim/_core/control/jobmanager.py 93.46% <100.00%> (-0.70%) ⬇️
smartsim/_core/control/manifest.py 87.91% <100.00%> (-8.01%) ⬇️
smartsim/_core/control/previewrenderer.py 96.00% <100.00%> (-0.08%) ⬇️
smartsim/_core/generation/generator.py 96.77% <100.00%> (-0.03%) ⬇️
smartsim/_core/generation/modelwriter.py 100.00% <100.00%> (ø)
smartsim/_core/launcher/colocated.py 94.89% <100.00%> (-2.93%) ⬇️
smartsim/_core/launcher/dragon/dragonConnector.py 67.65% <100.00%> (-0.14%) ⬇️
... and 50 more

... and 2 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant