
[OpenMP] Enable simd in non-reduction composite constructs #146097

Open

mrkajetanp wants to merge 1 commit into main

Conversation

mrkajetanp
Contributor

Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs.
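
As an illustration (a hypothetical example, not taken from the patch), this is the kind of non-reduction composite construct whose simd leaf is now lowered normally rather than being discarded with a warning:

! Hypothetical 'do simd' composite without a reduction clause; with this
! change the simd information is kept and the loop carries the
! llvm.loop.vectorize.enable metadata.
subroutine scale(n, a, x)
  integer, intent(in) :: n
  real, intent(in) :: a
  real, intent(inout) :: x(n)
  integer :: i
  !$omp do simd
  do i = 1, n
    x(i) = a * x(i)
  end do
  !$omp end do simd
end subroutine scale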

Despite currently being ignored with a warning, simd as a leaf in
composite constructs behaves as expected when the construct does not
contain a reduction. Enable it for those non-reduction constructs.

Signed-off-by: Kajetan Puchalski <[email protected]>
@llvmbot
Member

llvmbot commented Jun 27, 2025

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-mlir-llvm

Author: Kajetan Puchalski (mrkajetanp)

Changes

Despite currently being ignored with a warning, simd as a leaf in composite constructs behaves as expected when the construct does not contain a reduction. Enable it for those non-reduction constructs.


Full diff: https://github.com/llvm/llvm-project/pull/146097.diff

4 Files Affected:

  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+2-5)
  • (added) mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir (+30)
  • (added) mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir (+33)
  • (modified) mlir/test/Target/LLVMIR/openmp-todo.mlir (-14)
diff --git a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
index 23140f22555a5..de51ca49649f9 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp
@@ -2847,11 +2847,8 @@ convertOmpSimd(Operation &opInst, llvm::IRBuilderBase &builder,
   llvm::OpenMPIRBuilder *ompBuilder = moduleTranslation.getOpenMPBuilder();
   auto simdOp = cast<omp::SimdOp>(opInst);
 
-  // TODO: Replace this with proper composite translation support.
-  // Currently, simd information on composite constructs is ignored, so e.g.
-  // 'do/for simd' will be treated the same as a standalone 'do/for'. This is
-  // allowed by the spec, since it's equivalent to using a SIMD length of 1.
-  if (simdOp.isComposite()) {
+  // TODO: Replace this once simd + reduction is properly supported
+  if (simdOp.isComposite() && simdOp.getReductionByref().has_value()) {
     if (failed(convertIgnoredWrapper(simdOp, moduleTranslation)))
       return failure();
 
diff --git a/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..9ea0ff590144f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-parallel-do-simd.mlir
@@ -0,0 +1,30 @@
+// RUN: mlir-translate -mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+llvm.func @test_parallel_do_simd() {
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+  %2 = llvm.mlir.constant(1000 : i32) : i32
+  %3 = llvm.mlir.constant(1 : i32) : i32
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  omp.parallel {
+    %5 = llvm.mlir.constant(1 : i64) : i64
+    %6 = llvm.alloca %5 x i32 {bindc_name = "i", pinned} : (i64) -> !llvm.ptr
+    %7 = llvm.mlir.constant(1 : i64) : i64
+    omp.wsloop {
+      omp.simd {
+        omp.loop_nest (%arg0) : i32 = (%3) to (%2) inclusive step (%3) {
+          llvm.store %arg0, %6 : i32, !llvm.ptr
+          omp.yield
+        }
+      } {omp.composite}
+    } {omp.composite}
+    omp.terminator
+  }
+  llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
new file mode 100644
index 0000000000000..108a99cf7c40f
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/openmp-teams-distribute-parallel-do-simd.mlir
@@ -0,0 +1,33 @@
+// RUN: mlir-translate --mlir-to-llvmir %s | FileCheck %s
+
+// Check that omp.simd as a leaf of a composite construct still generates
+// the appropriate loop vectorization attribute.
+
+// CHECK-LABEL: define internal void @test_teams_distribute_parallel_do_simd..omp_par
+// CHECK: ![[VAL:.*]] = !{!"llvm.loop.vectorize.enable", i1 true}
+
+omp.private {type = private} @_QFEi_private_i32 : i32
+llvm.func @test_teams_distribute_parallel_do_simd() {
+  %0 = llvm.mlir.constant(1 : i64) : i64
+  %1 = llvm.alloca %0 x i32 {bindc_name = "i"} : (i64) -> !llvm.ptr
+  %2 = llvm.mlir.constant(1000 : i32) : i32
+  %3 = llvm.mlir.constant(1 : i32) : i32
+  %4 = llvm.mlir.constant(1 : i64) : i64
+  omp.teams {
+    omp.parallel {
+      omp.distribute {
+        omp.wsloop {
+          omp.simd private(@_QFEi_private_i32 %1 -> %arg0 : !llvm.ptr) {
+            omp.loop_nest (%arg1) : i32 = (%3) to (%2) inclusive step (%3) {
+              llvm.store %arg1, %arg0 : i32, !llvm.ptr
+              omp.yield
+            }
+          } {omp.composite}
+        } {omp.composite}
+      } {omp.composite}
+      omp.terminator
+    } {omp.composite}
+    omp.terminator
+  }
+  llvm.return
+}
diff --git a/mlir/test/Target/LLVMIR/openmp-todo.mlir b/mlir/test/Target/LLVMIR/openmp-todo.mlir
index 97608ca3b4df1..c515e6e70b264 100644
--- a/mlir/test/Target/LLVMIR/openmp-todo.mlir
+++ b/mlir/test/Target/LLVMIR/openmp-todo.mlir
@@ -26,20 +26,6 @@ llvm.func @atomic_hint(%v : !llvm.ptr, %x : !llvm.ptr, %expr : i32) {
 
 // -----
 
-llvm.func @do_simd(%lb : i32, %ub : i32, %step : i32) {
-  omp.wsloop {
-    // expected-warning@below {{simd information on composite construct discarded}}
-    omp.simd {
-      omp.loop_nest (%iv) : i32 = (%lb) to (%ub) step (%step) {
-        omp.yield
-      }
-    } {omp.composite}
-  } {omp.composite}
-  llvm.return
-}
-
-// -----
-
 llvm.func @distribute_allocate(%lb : i32, %ub : i32, %step : i32, %x : !llvm.ptr) {
   // expected-error@below {{not yet implemented: Unhandled clause allocate in omp.distribute operation}}
   // expected-error@below {{LLVM Translation failed for operation: omp.distribute}}

@llvmbot
Member

llvmbot commented Jun 27, 2025

@llvm/pr-subscribers-flang-openmp


@mrkajetanp
Contributor Author

Question for reviewers who have worked on this before: the current behaviour is fairly inconsistent. Using "simd reduction" on its own results in a crash because it is not implemented, whereas using "simd reduction" as part of a composite construct simply ignores the simd. Is it desirable to keep this behaviour? For instance, the following program will crash:

integer :: sum
sum = 0
!$omp simd reduction(+:sum)
do i=1, 1000
    sum = sum + 1
end do

While this one will only emit a warning:

integer :: sum
sum = 0
!$omp do simd reduction(+:sum)
do i=1, 1000
    sum = sum + 1
end do

There are several tests that rely on this, which is why I kept the current warning for composite constructs with reduction.

Additionally, if someone knows of any scenarios I've overlooked that this change might break, please point them out. I was expecting something else to require fixing in order to do this, but I've not found anything that breaks so far.

@tblah
Contributor

tblah left a comment

Please could you elaborate on the testing you've done to be sure this is correctly supported?

@mrkajetanp
Contributor Author

I ran both the gfortran and Fujitsu test suites (alongside the LLVM one) to make sure this does not introduce regressions there. For a number of Fortran test programs using composite constructs with simd, I manually inspected the IR and the final binary to confirm that, with this change, vectorisable instructions inside these constructs become vectorised instructions even with -fno-vectorize, whereas without it they do not.
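
For reference, a sketch of the kind of manual check described above, with a hypothetical test program and an assumed compiler invocation (neither is taken from the PR):

! Hypothetical reproducer; e.g. compile with something like
!   flang -fopenmp -fno-vectorize -emit-llvm -S saxpy.f90
! (flags assumed) and check that the loop metadata still contains
! "llvm.loop.vectorize.enable" and that the binary uses vector instructions.
subroutine saxpy(n, a, x, y)
  integer, intent(in) :: n
  real, intent(in) :: a, x(n)
  real, intent(inout) :: y(n)
  integer :: i
  !$omp parallel do simd
  do i = 1, n
    y(i) = y(i) + a * x(i)
  end do
  !$omp end parallel do simd
end subroutine saxpy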
