Skip to content

[mlir][core] Add an MLIR "pattern catalog" generator #146228

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 12 commits into
base: main
Choose a base branch
from

Conversation

j2kun
Copy link
Contributor

@j2kun j2kun commented Jun 28, 2025

This PR adds a feature that, when enabled, it attaches a listener to all RewritePatterns that emits the name of the operation being modified to llvm::dbgs. When the MLIR test suite is run, these debug outputs can be filtered and combined into an index linking operations to the patterns that insert, modify, or replace them. This index is intended to be used to create a website that allows one to look up patterns from an operation name.

The feature is enabled when MLIR is combined in debug mode, though a previous version of this PR used a CMake build flag to trigger the behavior. The debug logs emitted can be viewed with --debug-only=generate-pattern-catalog, and the lit config is modified to do this when the env var MLIR_GENERATE_PATTERN_CATALOG is set.

Example usage:

mkdir build && cd build
cmake -G Ninja ../llvm \
  -DLLVM_ENABLE_PROJECTS="mlir" \
  -DLLVM_TARGETS_TO_BUILD="host" \
  -DCMAKE_BUILD_TYPE=DEBUG
ninja -j 24 check-mlir
MLIR_GENERATE_PATTERN_CATALOG=1 bin/llvm-lit -j 24 -v -a tools/mlir/test | grep 'pattern-logging-listener' | sed 's/^# | [pattern-logging-listener] //g' | sort | uniq > pattern_catalog.txt

Sample pattern catalog output (that fits in a gist): https://gist.github.com/j2kun/02d1ab8d31c10d71027724984c89905a

[edit]: removed use of RTTI, removed cmake build flag

@llvmbot llvmbot added mlir:core MLIR Core Infrastructure mlir bazel "Peripheral" support tier build system: utils/bazel labels Jun 28, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 28, 2025

@llvm/pr-subscribers-mlir-core

Author: Jeremy Kun (j2kun)

Changes

This PR adds a new cmake build option MLIR_ENABLE_CATALOG_GENERATOR. When enabled, it attaches a listener to all RewritePatterns that emits the name of the operation being modified to a special file. When the MLIR test suite is run, these files can be combined into an index linking operations to the patterns that insert, modify, or replace them. This index is intended to be used to create a website that allows one to look up patterns from an operation name.


Full diff: https://github.com/llvm/llvm-project/pull/146228.diff

5 Files Affected:

  • (modified) mlir/CMakeLists.txt (+19)
  • (modified) mlir/cmake/modules/MLIRConfig.cmake.in (+1)
  • (modified) mlir/include/mlir/IR/PatternMatch.h (+74)
  • (modified) mlir/lib/Rewrite/PatternApplicator.cpp (+47-2)
  • (modified) utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel (+1)
diff --git a/mlir/CMakeLists.txt b/mlir/CMakeLists.txt
index 44493b75b8a8c..bd30d94e1ccb4 100644
--- a/mlir/CMakeLists.txt
+++ b/mlir/CMakeLists.txt
@@ -202,6 +202,24 @@ if(MLIR_ENABLE_BINDINGS_PYTHON)
   mlir_configure_python_dev_packages()
 endif()
 
+#-------------------------------------------------------------------------------
+# MLIR Pattern Catalog Generator Configuration
+# Requires:
+#   RTTI to be enabled (set with -DLLVM_ENABLE_RTTI=ON)
+# When enabled, causes all rewriter patterns to dump their type names and the
+# names of affected operations, which can be used to build a search index
+# mapping operations to patterns.
+#-------------------------------------------------------------------------------
+
+set(MLIR_ENABLE_CATALOG_GENERATOR 0 CACHE BOOL
+       "Enables construction of a catalog of rewrite patterns.")
+
+if (MLIR_ENABLE_CATALOG_GENERATOR)
+  message(STATUS "Enabling MLIR pattern catalog generator")
+  add_definitions(-DMLIR_ENABLE_CATALOG_GENERATOR)
+  add_definitions(-DLLVM_ENABLE_RTTI)
+endif()
+
 set(CMAKE_INCLUDE_CURRENT_DIR ON)
 
 include_directories(BEFORE
@@ -322,3 +340,4 @@ endif()
 if(MLIR_STANDALONE_BUILD)
   llvm_distribution_add_targets()
 endif()
+
diff --git a/mlir/cmake/modules/MLIRConfig.cmake.in b/mlir/cmake/modules/MLIRConfig.cmake.in
index 71f3e028b1e88..f4ae70a22b3d2 100644
--- a/mlir/cmake/modules/MLIRConfig.cmake.in
+++ b/mlir/cmake/modules/MLIRConfig.cmake.in
@@ -16,6 +16,7 @@ set(MLIR_IRDL_TO_CPP_EXE "@MLIR_CONFIG_IRDL_TO_CPP_EXE@")
 set(MLIR_INSTALL_AGGREGATE_OBJECTS "@MLIR_INSTALL_AGGREGATE_OBJECTS@")
 set(MLIR_ENABLE_BINDINGS_PYTHON "@MLIR_ENABLE_BINDINGS_PYTHON@")
 set(MLIR_ENABLE_EXECUTION_ENGINE "@MLIR_ENABLE_EXECUTION_ENGINE@")
+set(MLIR_ENABLE_CATALOG_GENERATOR "@MLIR_ENABLE_CATALOG_GENERATOR@")
 
 set_property(GLOBAL PROPERTY MLIR_ALL_LIBS "@MLIR_ALL_LIBS@")
 set_property(GLOBAL PROPERTY MLIR_DIALECT_LIBS "@MLIR_DIALECT_LIBS@")
diff --git a/mlir/include/mlir/IR/PatternMatch.h b/mlir/include/mlir/IR/PatternMatch.h
index 10cfe851765dc..141b3c6806ed8 100644
--- a/mlir/include/mlir/IR/PatternMatch.h
+++ b/mlir/include/mlir/IR/PatternMatch.h
@@ -475,6 +475,80 @@ class RewriterBase : public OpBuilder {
     RewriterBase::Listener *rewriteListener;
   };
 
+  struct CatalogingListener : public RewriterBase::ForwardingListener {
+    CatalogingListener(OpBuilder::Listener *listener,
+                       const std::string &patternName, raw_ostream &os,
+                       std::mutex &writeMutex)
+        : RewriterBase::ForwardingListener(listener), patternName(patternName),
+          os(os), writeMutex(writeMutex) {}
+
+    void notifyOperationInserted(Operation *op, InsertPoint previous) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationInserted"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationInserted(op, previous);
+    }
+
+    void notifyOperationModified(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationModified"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationModified(op);
+    }
+
+    void notifyOperationReplaced(Operation *op, Operation *newOp) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with op)"
+           << " | " << op->getName() << " | " << newOp->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, newOp);
+    }
+
+    void notifyOperationReplaced(Operation *op,
+                                 ValueRange replacement) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with values)"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, replacement);
+    }
+
+    void notifyOperationErased(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationErased"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationErased(op);
+    }
+
+    void notifyPatternBegin(const Pattern &pattern, Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyPatternBegin"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyPatternBegin(pattern, op);
+    }
+
+  private:
+    const std::string &patternName;
+    raw_ostream &os;
+    std::mutex &writeMutex;
+  };
+
   /// Move the blocks that belong to "region" before the given position in
   /// another region "parent". The two regions must be different. The caller
   /// is responsible for creating or updating the operation transferring flow
diff --git a/mlir/lib/Rewrite/PatternApplicator.cpp b/mlir/lib/Rewrite/PatternApplicator.cpp
index 4a12183492fd4..c66aaf267881f 100644
--- a/mlir/lib/Rewrite/PatternApplicator.cpp
+++ b/mlir/lib/Rewrite/PatternApplicator.cpp
@@ -15,8 +15,19 @@
 #include "ByteCode.h"
 #include "llvm/Support/Debug.h"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/raw_ostream.h"
+#include <cxxabi.h>
+#include <mutex>
+#endif
+
 #define DEBUG_TYPE "pattern-application"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+static std::mutex catalogWriteMutex;
+#endif
+
 using namespace mlir;
 using namespace mlir::detail;
 
@@ -152,6 +163,16 @@ LogicalResult PatternApplicator::matchAndRewrite(
   unsigned anyIt = 0, anyE = anyOpPatterns.size();
   unsigned pdlIt = 0, pdlE = pdlMatches.size();
   LogicalResult result = failure();
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+  std::error_code ec;
+  llvm::raw_fd_ostream catalogOs("pattern_catalog.txt", ec,
+                                 llvm::sys::fs::OF_Append);
+  if (ec) {
+    op->emitError("Failed to open pattern catalog file: " + ec.message());
+    return failure();
+  }
+#endif
+
   do {
     // Find the next pattern with the highest benefit.
     const Pattern *bestPattern = nullptr;
@@ -206,14 +227,38 @@ LogicalResult PatternApplicator::matchAndRewrite(
           } else {
             LLVM_DEBUG(llvm::dbgs() << "Trying to match \""
                                     << bestPattern->getDebugName() << "\"\n");
-
             const auto *pattern =
                 static_cast<const RewritePattern *>(bestPattern);
-            result = pattern->matchAndRewrite(op, rewriter);
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            OpBuilder::Listener *oldListener = rewriter.getListener();
+            int status;
+            const char *mangledPatternName = typeid(*pattern).name();
+            char *demangled = abi::__cxa_demangle(mangledPatternName, nullptr,
+                                                  nullptr, &status);
+            std::string demangledPatternName;
+            if (status == 0 && demangled) {
+              demangledPatternName = demangled;
+              free(demangled);
+            } else {
+              // Fallback in case demangling fails.
+              demangledPatternName = mangledPatternName;
+            }
+
+            RewriterBase::CatalogingListener *catalogingListener =
+                new RewriterBase::CatalogingListener(
+                    oldListener, demangledPatternName, catalogOs,
+                    catalogWriteMutex);
+            rewriter.setListener(catalogingListener);
+#endif
+            result = pattern->matchAndRewrite(op, rewriter);
             LLVM_DEBUG(llvm::dbgs()
                        << "\"" << bestPattern->getDebugName() << "\" result "
                        << succeeded(result) << "\n");
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            rewriter.setListener(oldListener);
+            delete catalogingListener;
+#endif
           }
 
           // Process the result of the pattern application.
diff --git a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
index 23d89f41a3a45..281b2566304b0 100644
--- a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
+++ b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
@@ -44,6 +44,7 @@ expand_template(
         "@MLIR_ENABLE_SPIRV_CPU_RUNNER@": "0",
         "@MLIR_ENABLE_VULKAN_RUNNER@": "0",
         "@MLIR_ENABLE_BINDINGS_PYTHON@": "0",
+        "@MLIR_ENABLE_CATALOG_GENERATOR@": "0",
         "@MLIR_RUN_AMX_TESTS@": "0",
         "@MLIR_RUN_ARM_SVE_TESTS@": "0",
         "@MLIR_RUN_ARM_SME_TESTS@": "0",

@llvmbot
Copy link
Member

llvmbot commented Jun 28, 2025

@llvm/pr-subscribers-mlir

Author: Jeremy Kun (j2kun)

Changes

This PR adds a new cmake build option MLIR_ENABLE_CATALOG_GENERATOR. When enabled, it attaches a listener to all RewritePatterns that emits the name of the operation being modified to a special file. When the MLIR test suite is run, these files can be combined into an index linking operations to the patterns that insert, modify, or replace them. This index is intended to be used to create a website that allows one to look up patterns from an operation name.


Full diff: https://github.com/llvm/llvm-project/pull/146228.diff

5 Files Affected:

  • (modified) mlir/CMakeLists.txt (+19)
  • (modified) mlir/cmake/modules/MLIRConfig.cmake.in (+1)
  • (modified) mlir/include/mlir/IR/PatternMatch.h (+74)
  • (modified) mlir/lib/Rewrite/PatternApplicator.cpp (+47-2)
  • (modified) utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel (+1)
diff --git a/mlir/CMakeLists.txt b/mlir/CMakeLists.txt
index 44493b75b8a8c..bd30d94e1ccb4 100644
--- a/mlir/CMakeLists.txt
+++ b/mlir/CMakeLists.txt
@@ -202,6 +202,24 @@ if(MLIR_ENABLE_BINDINGS_PYTHON)
   mlir_configure_python_dev_packages()
 endif()
 
+#-------------------------------------------------------------------------------
+# MLIR Pattern Catalog Generator Configuration
+# Requires:
+#   RTTI to be enabled (set with -DLLVM_ENABLE_RTTI=ON)
+# When enabled, causes all rewriter patterns to dump their type names and the
+# names of affected operations, which can be used to build a search index
+# mapping operations to patterns.
+#-------------------------------------------------------------------------------
+
+set(MLIR_ENABLE_CATALOG_GENERATOR 0 CACHE BOOL
+       "Enables construction of a catalog of rewrite patterns.")
+
+if (MLIR_ENABLE_CATALOG_GENERATOR)
+  message(STATUS "Enabling MLIR pattern catalog generator")
+  add_definitions(-DMLIR_ENABLE_CATALOG_GENERATOR)
+  add_definitions(-DLLVM_ENABLE_RTTI)
+endif()
+
 set(CMAKE_INCLUDE_CURRENT_DIR ON)
 
 include_directories(BEFORE
@@ -322,3 +340,4 @@ endif()
 if(MLIR_STANDALONE_BUILD)
   llvm_distribution_add_targets()
 endif()
+
diff --git a/mlir/cmake/modules/MLIRConfig.cmake.in b/mlir/cmake/modules/MLIRConfig.cmake.in
index 71f3e028b1e88..f4ae70a22b3d2 100644
--- a/mlir/cmake/modules/MLIRConfig.cmake.in
+++ b/mlir/cmake/modules/MLIRConfig.cmake.in
@@ -16,6 +16,7 @@ set(MLIR_IRDL_TO_CPP_EXE "@MLIR_CONFIG_IRDL_TO_CPP_EXE@")
 set(MLIR_INSTALL_AGGREGATE_OBJECTS "@MLIR_INSTALL_AGGREGATE_OBJECTS@")
 set(MLIR_ENABLE_BINDINGS_PYTHON "@MLIR_ENABLE_BINDINGS_PYTHON@")
 set(MLIR_ENABLE_EXECUTION_ENGINE "@MLIR_ENABLE_EXECUTION_ENGINE@")
+set(MLIR_ENABLE_CATALOG_GENERATOR "@MLIR_ENABLE_CATALOG_GENERATOR@")
 
 set_property(GLOBAL PROPERTY MLIR_ALL_LIBS "@MLIR_ALL_LIBS@")
 set_property(GLOBAL PROPERTY MLIR_DIALECT_LIBS "@MLIR_DIALECT_LIBS@")
diff --git a/mlir/include/mlir/IR/PatternMatch.h b/mlir/include/mlir/IR/PatternMatch.h
index 10cfe851765dc..141b3c6806ed8 100644
--- a/mlir/include/mlir/IR/PatternMatch.h
+++ b/mlir/include/mlir/IR/PatternMatch.h
@@ -475,6 +475,80 @@ class RewriterBase : public OpBuilder {
     RewriterBase::Listener *rewriteListener;
   };
 
+  struct CatalogingListener : public RewriterBase::ForwardingListener {
+    CatalogingListener(OpBuilder::Listener *listener,
+                       const std::string &patternName, raw_ostream &os,
+                       std::mutex &writeMutex)
+        : RewriterBase::ForwardingListener(listener), patternName(patternName),
+          os(os), writeMutex(writeMutex) {}
+
+    void notifyOperationInserted(Operation *op, InsertPoint previous) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationInserted"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationInserted(op, previous);
+    }
+
+    void notifyOperationModified(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationModified"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationModified(op);
+    }
+
+    void notifyOperationReplaced(Operation *op, Operation *newOp) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with op)"
+           << " | " << op->getName() << " | " << newOp->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, newOp);
+    }
+
+    void notifyOperationReplaced(Operation *op,
+                                 ValueRange replacement) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationReplaced (with values)"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationReplaced(op, replacement);
+    }
+
+    void notifyOperationErased(Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyOperationErased"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyOperationErased(op);
+    }
+
+    void notifyPatternBegin(const Pattern &pattern, Operation *op) override {
+      {
+        std::lock_guard<std::mutex> lock(writeMutex);
+        os << patternName << " | notifyPatternBegin"
+           << " | " << op->getName() << "\n";
+        os.flush();
+      }
+      ForwardingListener::notifyPatternBegin(pattern, op);
+    }
+
+  private:
+    const std::string &patternName;
+    raw_ostream &os;
+    std::mutex &writeMutex;
+  };
+
   /// Move the blocks that belong to "region" before the given position in
   /// another region "parent". The two regions must be different. The caller
   /// is responsible for creating or updating the operation transferring flow
diff --git a/mlir/lib/Rewrite/PatternApplicator.cpp b/mlir/lib/Rewrite/PatternApplicator.cpp
index 4a12183492fd4..c66aaf267881f 100644
--- a/mlir/lib/Rewrite/PatternApplicator.cpp
+++ b/mlir/lib/Rewrite/PatternApplicator.cpp
@@ -15,8 +15,19 @@
 #include "ByteCode.h"
 #include "llvm/Support/Debug.h"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+#include "llvm/Support/FileSystem.h"
+#include "llvm/Support/raw_ostream.h"
+#include <cxxabi.h>
+#include <mutex>
+#endif
+
 #define DEBUG_TYPE "pattern-application"
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+static std::mutex catalogWriteMutex;
+#endif
+
 using namespace mlir;
 using namespace mlir::detail;
 
@@ -152,6 +163,16 @@ LogicalResult PatternApplicator::matchAndRewrite(
   unsigned anyIt = 0, anyE = anyOpPatterns.size();
   unsigned pdlIt = 0, pdlE = pdlMatches.size();
   LogicalResult result = failure();
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+  std::error_code ec;
+  llvm::raw_fd_ostream catalogOs("pattern_catalog.txt", ec,
+                                 llvm::sys::fs::OF_Append);
+  if (ec) {
+    op->emitError("Failed to open pattern catalog file: " + ec.message());
+    return failure();
+  }
+#endif
+
   do {
     // Find the next pattern with the highest benefit.
     const Pattern *bestPattern = nullptr;
@@ -206,14 +227,38 @@ LogicalResult PatternApplicator::matchAndRewrite(
           } else {
             LLVM_DEBUG(llvm::dbgs() << "Trying to match \""
                                     << bestPattern->getDebugName() << "\"\n");
-
             const auto *pattern =
                 static_cast<const RewritePattern *>(bestPattern);
-            result = pattern->matchAndRewrite(op, rewriter);
 
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            OpBuilder::Listener *oldListener = rewriter.getListener();
+            int status;
+            const char *mangledPatternName = typeid(*pattern).name();
+            char *demangled = abi::__cxa_demangle(mangledPatternName, nullptr,
+                                                  nullptr, &status);
+            std::string demangledPatternName;
+            if (status == 0 && demangled) {
+              demangledPatternName = demangled;
+              free(demangled);
+            } else {
+              // Fallback in case demangling fails.
+              demangledPatternName = mangledPatternName;
+            }
+
+            RewriterBase::CatalogingListener *catalogingListener =
+                new RewriterBase::CatalogingListener(
+                    oldListener, demangledPatternName, catalogOs,
+                    catalogWriteMutex);
+            rewriter.setListener(catalogingListener);
+#endif
+            result = pattern->matchAndRewrite(op, rewriter);
             LLVM_DEBUG(llvm::dbgs()
                        << "\"" << bestPattern->getDebugName() << "\" result "
                        << succeeded(result) << "\n");
+#ifdef MLIR_ENABLE_CATALOG_GENERATOR
+            rewriter.setListener(oldListener);
+            delete catalogingListener;
+#endif
           }
 
           // Process the result of the pattern application.
diff --git a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
index 23d89f41a3a45..281b2566304b0 100644
--- a/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
+++ b/utils/bazel/llvm-project-overlay/mlir/test/BUILD.bazel
@@ -44,6 +44,7 @@ expand_template(
         "@MLIR_ENABLE_SPIRV_CPU_RUNNER@": "0",
         "@MLIR_ENABLE_VULKAN_RUNNER@": "0",
         "@MLIR_ENABLE_BINDINGS_PYTHON@": "0",
+        "@MLIR_ENABLE_CATALOG_GENERATOR@": "0",
         "@MLIR_RUN_AMX_TESTS@": "0",
         "@MLIR_RUN_ARM_SVE_TESTS@": "0",
         "@MLIR_RUN_ARM_SME_TESTS@": "0",

Copy link

⚠️ We detected that you are using a GitHub private e-mail address to contribute to the repo.
Please turn off Keep my email addresses private setting in your account.
See LLVM Developer Policy and LLVM Discourse for more information.

@j2kun j2kun changed the title Add an MLIR "pattern catalog" generator [mlir][core] Add an MLIR "pattern catalog" generator Jun 28, 2025
@makslevental
Copy link
Contributor

I support this with every fiber of my being! Bravo.

My only comment (aside from getDebugName) is that it shouldn't mess with the filesystem and it should just dump to the screen. If you put the impl into it's own cpp file and then do #define DEBUG_TYPE "pattern-catalog", then people will be able to produce the catalog by just doing debug-only=pattern-catalog and piping (eg to a file on disk or whatever).

@j2kun
Copy link
Contributor Author

j2kun commented Jun 28, 2025

I support this with every fiber of my being! Bravo.

My only comment (aside from getDebugName) is that it shouldn't mess with the filesystem and it should just dump to the screen. If you put the impl into it's own cpp file and then do #define DEBUG_TYPE "pattern-catalog", then people will be able to produce the catalog by just doing debug-only=pattern-catalog and piping (eg to a file on disk or whatever).

I had originally tried something like this, and my obstacle was that I couldn't get the test suite to produce useful outputs because stuff was piped to FileCheck (when I tried just dumping to llvm::outs()), and I didn't have a way to modify the RUN command of all the lit tests in a systematic/pragmatic way to add a debug-only flag.

@makslevental
Copy link
Contributor

makslevental commented Jun 28, 2025

I had originally tried something like this, and my obstacle was that I couldn't get the test suite to produce useful outputs because stuff was piped to FileCheck (when I tried just dumping to llvm::outs()), and I didn't have a way to modify the RUN command of all the lit tests in a systematic/pragmatic way to add a debug-only flag.

If you make FileCheck -dump-input=always "settable" via env var (which shouldn't be too controversial?) then I think it could work (although you'd get the IR output too). Still I think that's better than relying on a filesystem and etc.

@makslevental
Copy link
Contributor

makslevental commented Jun 28, 2025

err sorry - you'd have to get debug-only to also be settable via env var. it occurs to me that this should be a cl::opt feature? like how pip works.

@j2kun
Copy link
Contributor Author

j2kun commented Jun 28, 2025

Is not relying on a filesystem all that critical? The mutex is mildly annoying, but it's scoped pretty tightly, and the alternative requires changes in a few other components (wouldn't I also need to plumb these environment variables through the lit config?) and additional cleanup of the output.

I agree setting environment variables would be the cleanest approach. I'm just not confident in my ability to push a big change to cl::opt through, and using the filesystem seems good enough.

@makslevental
Copy link
Contributor

Is not relying on a filesystem all that critical?

I personally don't care (I think it's a reasonable use) - maybe someone else can comment on whether this kind of thing is acceptable.

@j2kun
Copy link
Contributor Author

j2kun commented Jun 29, 2025

Spot checking a few examples, it looks like I also need to hook into the dialect conversion framework.

Copy link

github-actions bot commented Jun 30, 2025

✅ With the latest revision this PR passed the Python code formatter.

@j2kun
Copy link
Contributor Author

j2kun commented Jun 30, 2025

Using ToolSubst I modified the PR to no longer use file writing and/or the mutex. It's quite simple now, and the last two steps before this PR can feasibly be merged are:

  1. Add a test that ensures the catalog output is printed.
  2. Ensure the conversion rewriter patterns are working (it seems replaceOp on conversion pattern rewriter is not notifying the listener)

Copy link

github-actions bot commented Jun 30, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

"mlir-opt --debug-only=pattern-logging-listener --mlir-disable-threading",
unresolved="fatal",
),
ToolSubst("FileCheck", "FileCheck --dump-input=always", unresolved="fatal"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the FileCheck substitution relates to the catalog environment? Is this because some tests are redirecting stderr and you're using this to preserve the debug output?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This and for future proofing against such trickery. I didn't do a detailed analysis of which tests redirect stderr (i.e., if many involve the rewrite engine), but many tests do and so it seems acceptable to account for this.

Copy link
Collaborator

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That LGTM, but @jpienaar is out for a few days and we should wait since he worked on an alternative.

@j2kun
Copy link
Contributor Author

j2kun commented Jul 1, 2025

After discussing with @matthias-springer it seems the dialect conversion framework is not immediately compatible with this change (in large part because the conversion rewriter is itself implemented as a listener that overrides much of the default behavior to do stateful things). The immediate result is that the catalog only includes "insert op" actions for dialect conversion patterns.

I am looking into a different way to support this for dialect conversion, until (perhaps in the distant future) dialect conversion is refactored so that it can operate with listeners more smoothly.

If it's OK, I will split that into a different PR, and so the last thing I need here is a lit test that ensures the debug output is produced in a pass that uses the greedy driver.

@j2kun j2kun requested a review from jpienaar July 5, 2025 04:25
@j2kun j2kun force-pushed the pattern-discovery branch from d2c8b21 to ed693d8 Compare July 5, 2025 04:37
This PR adds a new cmake build option MLIR_ENABLE_CATALOG_GENERATOR.
When enabled, it attaches a listener to all RewritePatterns that emits
the name of the operation being modified to a special file. When the
MLIR test suite is run, these files can be combined into an index
linking operations to the patterns that insert, modify, or replace them.
This index is intended to be used to create a website that allows one
to look up patterns from an operation name.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bazel "Peripheral" support tier build system: utils/bazel mlir:core MLIR Core Infrastructure mlir
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants