halo.update_async(); Deadlock in Multi Node Case on feat-halo #686

Open
@pauleonix

Description

In my stencil benchmark I use the halo wrapper as in ex.02.matrix.halo.heat_equation. When I build against the feat-halo branch, my code deadlocks at halo.update_async() as soon as it runs on more than one node.
If I use the development branch, or run on a single node with more than one dash unit, the deadlock does not occur.

This also seems to be why dash-test-mpi could not finish within 8 hours on multiple nodes in issue #682. The end of the test output posted in that issue shows that the run was at mHaloTest.HaloMatrixWrapperNonCyclic2D when it was cancelled for exceeding the time limit.
