PyTorch and OpenMP/MPI-enabled AMReX don't get along in load_state_dict #322

Open
@RTSandberg

Description

On my local machine, PyTorch's internal multithreading doesn't get along with AMReX. Unless I call torch.set_num_threads(1) or torch.set_num_threads(2), the attached script hangs when the neural network tries to set its initial parameters.

The script downloads neural network parameters from a Zenodo archive and then loads them. The load_state_dict call is the specific point of failure.
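For readers who want to try the workaround in isolation, here is a minimal sketch of it. It is not the attached reproducer: the model is a hypothetical stand-in, the parameters are round-tripped through an in-memory buffer instead of being downloaded from Zenodo, and AMReX is not initialized, so this snippet will not reproduce the hang on its own. It only shows where the thread cap goes relative to load_state_dict.

```python
import io

import torch
import torch.nn as nn

# Workaround described above: cap PyTorch's intra-op thread count
# before the parameters are loaded.
torch.set_num_threads(1)

# Hypothetical stand-in network; the real script defines its own model.
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Assumption: an in-memory buffer replaces the Zenodo download.
buffer = io.BytesIO()
torch.save(model.state_dict(), buffer)
buffer.seek(0)
state = torch.load(buffer)

# The call that hangs in the original report when threads are not capped.
result = model.load_state_dict(state)
print(torch.get_num_threads(), result)
```

In the real script, the same torch.set_num_threads call would presumably go before the model is constructed and its parameters are loaded.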

pytorch_amrex_hang_reproducer_v2.py.txt

Metadata

    Labels

    backend: openmp (Specific to OpenMP execution (CPUs))
    bug (Something isn't working)
    bug: affects latest release (Bug also exists in latest release version)
    component: MPI (Domain decomposition and communication)
    component: third party (Changes in ImpactX that reflect a change in a third-party library)
