Add dataFence plugin interface #3

Open
wants to merge 814 commits into
base: libomptarget-introduce-attach-support
from

adurang
Collaborator

@adurang adurang commented Jul 25, 2025

kazutakahirata and others added 30 commits July 21, 2025 07:24
This was left over from upstreaming.

Co-authored-by: Mekhanoshin, Stanislav <[email protected]>
…lvm#149631)

This is a partial support because some other instructions have not been upstreamed yet.
Add the syntax `-fintrinsic-modules-path=<dir>` as an alias to the
existing option `-fintrinsic-modules-path <dir>`. gfortran also supports
both alternatives.

This is particularly useful with CMake which de-duplicates command-line
options. For instance,
`-fintrinsic-modules-path /path/A -fintrinsic-modules-path /path/B`
is de-duplicated to
`-fintrinsic-modules-path /path/A /path/B`
since it considers the second `-fintrinsic-modules-path`
"redundant". This can be avoided using the syntax
`-fintrinsic-modules-path=/path/A -fintrinsic-modules-path=/path/B`.
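The effect described above can be sketched with a deliberately naive model of option de-duplication. The helper `dedupTokens` is hypothetical and is not CMake's actual algorithm; it only illustrates why a repeated standalone flag token is lost while the `=`-joined form survives.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical model: drop every token that has already been seen.
// This is an assumption for illustration, not CMake's real behavior.
std::vector<std::string> dedupTokens(const std::vector<std::string> &args) {
  std::unordered_set<std::string> seen;
  std::vector<std::string> out;
  for (const std::string &arg : args)
    if (seen.insert(arg).second) // true only on the first occurrence
      out.push_back(arg);
  return out;
}
```

With the space-separated form the second flag token is identical to the first and is dropped, leaving `/path/B` without its flag. With the `=` form each token is distinct, so both survive.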
When reviewing llvm#147156, the reviewers pointed out that we didn't need to
support the trigraph. The code never handled it right.

In the debug build, this kind of input caused the assertion in the
function `countLeadingWhitespace` to fail. The release build without
assertions outputted `?` `?` `/` separated by spaces.

```C
#define A ??/
  int i;
```

This is because the code in `countLeadingWhitespace` assumed that the
underlying lexer recognized the entire `??/` sequence as a single token.
In fact, the lexer recognized it as 3 separate tokens. The flag to make
the lexer recognize trigraphs was never enabled.

This patch enables the flag in the underlying lexer. This way, the
program now either turns the trigraph into a single `\` or removes it
altogether if the line is short enough. There are operators like the
`??=` in C#. So the flag is not enabled for all input languages. Instead
the check for the token size is moved from the assert line into the if
line.

The problem was introduced by my own patch 370bee4 from about 3
years ago. I added code to count the number of characters in the escape
sequence probably just because the block of code used to have a comment
saying someone should add the feature. Maybe I forgot to enable
assertions when I ran the code. I found the problem because reviewing
pull request 145243 made me look at the code again.
A follow-up patch from llvm#140736.
Set default true16 mode from gfx110x to all gfx11 devices.

Tests have been addressed in previous patches.
OpenMP 6.0 has changed the modifiers on the MAP clause:
- map-type-modifier has been split into individual modifiers,
- map-type "delete" has become a modifier,
- new modifiers have been added.

This patch adds parsing support for all of the OpenMP 6.0 modifiers. The
old "map-type-modifier" is retained, but is no longer created in
parsing. It will remain to take advantage of the preexisting modifier
validation for older versions: when the OpenMP version is < 6.0, the
modifiers will be rewritten back as map-type-modifiers (or map-type in
case of "delete").

In this patch the modifiers will always be rewritten in the older format
to isolate these changes to parsing as much as possible.
Now that readMemProf calls two helper functions handleAllocSite and
handleCallSite, we can simplify the control flow.  We don't need to
use "continue" anymore.
…perations (llvm#148350)

This patch generalizes the existing foldBitOpOfBitcasts optimization in the VectorCombine pass to handle additional cast operations beyond just bitcast.

  Fixes: [llvm#146037](llvm#146037)

  Summary

The optimization now supports folding bitwise operations (AND/OR/XOR)
with the following cast operations:
  - bitcast (original functionality)
  - trunc (truncate)
  - sext (sign extend)
  - zext (zero extend)

  The transformation pattern is:
  bitop(castop(x), castop(y)) -> castop(bitop(x, y))

This reduces the number of cast instructions from 2 to 1, improving
performance on targets where cast operations
are expensive or where performing bitwise operations on narrower types
is beneficial.
  
  Implementation Details

- Renamed foldBitOpOfBitcasts to foldBitOpOfCastops to reflect broader
functionality
  - Extended pattern matching to handle any CastInst operation
- Added validation for each cast type's constraints (e.g., trunc
requires source > dest)
  - Updated cost model to use the actual cast opcode
  - Preserves IR flags from original instructions
  - Handles multi-use scenarios appropriately

  Testing

- Added comprehensive tests in
test/Transforms/VectorCombine/bitop-of-castops.ll
  - Tests cover all supported cast types with all bitwise operations
  - Includes negative tests for unsupported patterns
  - All existing VectorCombine tests pass
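The pattern `bitop(castop(x), castop(y)) -> castop(bitop(x, y))` can be checked on plain scalars. The sketch below is not the VectorCombine code; the function names are made up for illustration, and scalar integer casts stand in for the vector `zext`, `sext`, and `trunc` instructions.

```cpp
#include <cassert>
#include <cstdint>

// zext: cast both operands, then AND (two casts) ...
std::uint32_t andOfZext(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint32_t>(x) & static_cast<std::uint32_t>(y);
}
// ... versus AND in the narrow type, then cast once.
std::uint32_t zextOfAnd(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint32_t>(static_cast<std::uint8_t>(x & y));
}

// sext: the same shape with XOR and sign extension.
std::int32_t xorOfSext(std::int8_t x, std::int8_t y) {
  return static_cast<std::int32_t>(x) ^ static_cast<std::int32_t>(y);
}
std::int32_t sextOfXor(std::int8_t x, std::int8_t y) {
  return static_cast<std::int32_t>(static_cast<std::int8_t>(x ^ y));
}

// trunc: the same shape with OR and truncation (source wider than dest).
std::uint8_t orOfTrunc(std::uint32_t x, std::uint32_t y) {
  return static_cast<std::uint8_t>(static_cast<std::uint8_t>(x) |
                                   static_cast<std::uint8_t>(y));
}
std::uint8_t truncOfOr(std::uint32_t x, std::uint32_t y) {
  return static_cast<std::uint8_t>(x | y);
}
```

Each pair returns the same value for every input, which is what makes it legal to replace two casts with one.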
Fixes llvm#149669; the old check compared with the end of the literal, but
we can just check that after parsing digits, we're pointing to one
character past the token start.
convert_iKxN_s is canonicalized into convert_iKxN_u when the argument is
known to have sign bit 0. This results in emitting Wasm opcodes that, on
some targets (like x86_64), are dramatically slower than signed versions
on major engines.

Similarly to X86, we now fix this up in isel when the instruction has
the nonneg flag from canonicalization or if we know the source has a
zero sign bit.

Fixes llvm#149457.
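The reason the signed and unsigned conversions are interchangeable here can be shown on scalars. This is a sketch with hypothetical helper names, not the isel code: when the sign bit of the source is zero, both conversions see the same numeric value and therefore produce the same float.

```cpp
#include <cassert>
#include <cstdint>

// Signed int-to-float conversion.
float convertSigned(std::int32_t v) { return static_cast<float>(v); }

// Unsigned conversion of the same 32 bits.
float convertUnsigned(std::int32_t v) {
  return static_cast<float>(static_cast<std::uint32_t>(v));
}

// The condition under which the two conversions agree.
bool signBitIsZero(std::int32_t v) { return v >= 0; }
```

For a negative input such as `-1` the results differ, so the rewrite is only valid when `signBitIsZero` holds.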
Mips was the only architecture having PtrDiffType = SignedInt and
IntPtrType = SignedLong.

This fixes a problem on mipsel-windows-gnu triple, where uintptr_t was
wrongly defined as unsigned long instead of unsigned int, leading to
problems in compiler-rt.

compiler-rt/lib/interception/interception_type_test.cpp:24:17: error:
static assertion failed due to requirement
'__sanitizer::is_same<unsigned long, unsigned int>::value':
24 | COMPILER_CHECK((__sanitizer::is_same<__sanitizer::uptr,
::uintptr_t>::value));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compiler-rt/lib/interception/../sanitizer_common/sanitizer_internal_defs.h:369:44:
note: expanded from macro 'COMPILER_CHECK'
      369 | #define COMPILER_CHECK(pred) static_assert(pred, "")
          |                                            ^~~~
compiler-rt/lib/interception/interception_type_test.cpp:25:17: error:
static assertion failed due to requirement '__sanitizer::is_same<long,
int>::value':
25 | COMPILER_CHECK((__sanitizer::is_same<__sanitizer::sptr,
::intptr_t>::value));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compiler-rt/lib/interception/../sanitizer_common/sanitizer_internal_defs.h:369:44:
note: expanded from macro 'COMPILER_CHECK'
      369 | #define COMPILER_CHECK(pred) static_assert(pred, "")
          |                                            ^~~~
compiler-rt/lib/interception/interception_type_test.cpp:27:17: error:
static assertion failed due to requirement '__sanitizer::is_same<long,
int>::value':
27 | COMPILER_CHECK((__sanitizer::is_same<::PTRDIFF_T,
::ptrdiff_t>::value));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

compiler-rt/lib/interception/../sanitizer_common/sanitizer_internal_defs.h:369:44:
note: expanded from macro 'COMPILER_CHECK'
      369 | #define COMPILER_CHECK(pred) static_assert(pred, "")
Performance testing shows no significant gains or losses on graphics
workloads, so this is mostly to make the behavior consistent across all
supported OSes instead of special-casing HSA.
"to make the add's lexically identical" -> "to make the adds lexically
identical"
Clang 19 has been the oldest supported version of Clang since the LLVM
20 release, but we had not cleaned up the test suite yet.
Fix a failing test for constant-folding the nvvm_round intrinsic. The
original implementation added in llvm#141233 used a native libm call to the
"round" function, but on PPC this produces +0.0 if the input is -0.0,
which caused a test failure.

This patch updates it to use APFloat functions instead of native libm
calls to ensure cross-platform consistency.
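The property the failing test checks is that rounding `-0.0` keeps the sign bit. The patch itself uses APFloat; the sketch below swaps in a different, standalone technique (`std::copysign` on top of `std::round`) purely to show that property without depending on LLVM. The name `roundKeepSign` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Round half away from zero, then force the result to carry the sign of
// the input so that -0.0 (and values such as -0.3) round to -0.0 even if
// the underlying round() returns +0.0 on some platform.
double roundKeepSign(double x) {
  return std::copysign(std::round(x), x);
}
```

`std::signbit(roundKeepSign(-0.0))` is true, which a comparison with `==` cannot detect because `-0.0 == 0.0`.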
This is one of the final remaining debug-intrinsic specific codepaths
out there, and pieces of cross-LLVM infrastructure to do with debug
intrinsics.
…r-noalias-addrspace.ll` (llvm#149826)

The callee and caller signatures don't match.
…49849)

string_utils.h uses uintptr_t, and there seems to be no tracking of this
dependency. It seems upstream builds are unaffected but downstream this
is causing a lot of flaky builds.
…vm#149178)

Cleans up debt from llvm#147849 and llvm#147860

I had originally duplicated this test since the WinEH directory wasn't
enabled for AArch64, but now that we can run AArch64 tests in that
directory, I've unified the tests.
alexrp and others added 30 commits July 23, 2025 00:03
…on (llvm#146308)

Needed to resolve this compilation error on some systems:

lib/libunwind/src/UnwindCursor.hpp:153:38: error: return type of
out-of-line definition of 'libunwind::DwarfFDECache::findFDE' differs
from that in the declaration
    typename A::pint_t DwarfFDECache<A>::findFDE(pint_t mh, pint_t pc) {
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
lib/libunwind/src/libunwind.cpp:31:10: note: in file included from
lib/libunwind/src/libunwind.cpp:31:
    #include "UnwindCursor.hpp"
             ^
lib/libunwind/src/UnwindCursor.hpp:100:17: note: previous declaration is
here
      static pint_t findFDE(pint_t mh, pint_t pc);
             ~~~~~~~^
…#150107)

Also switches immediate offset to signed for the subtarget.
Adding Loongarch64 to OpenBSD parts
## Purpose
Simplify `DEMANGLE_` macro definitions in
`llvm/Demangle/DemangleConfig.h` for clarity/maintainability.

## Overview
* Alias `DEMANGLE_DUMP_METHOD`, `DEMANGLE_FALLTHROUGH`, and
`DEMANGLE_UNREACHABLE` macros to their `LLVM_` counterparts defined in
`llvm/Support/Compiler.h`
* Remove several `DEMANGLE_`-prefixed macros that were only used within
`DemangleConfig.h`

## Background
* It should be safe for the `Demangle` component library to depend on
`Support`, so there is no need for it to maintain copies of macros
defined in `llvm/Support/Compiler.h`.
* Since the canonical copy `llvm/Demangle/ItaniumDemangle.h` lives under
`libcxxabi`, it cannot directly reference the `LLVM_`-prefixed macros so
we define `DEMANGLE_`-prefixed aliases.

## Validation
* Built llvm-project on Windows with `clang-cl` and MSVC `cl`.
* Built llvm-project on Linux with `clang` and `gcc`.
…50098)

The point of this change is simply to show that the constant check was
not required for correctness. The mixed intrinsic and shuffle tests are
added purely to exercise the code. An upcoming change will add support
for shuffle matching in getMask to support non-constant fixed vector
cases.
Fixed broken 'pr126337.ll' NVPTX related test (by llvm#149393)
…vm#149626)

There are cases where we end up removing some instructions that use stackified
registers after RegStackify. For example,

```wasm
bb.0:
  %0 = ...    ;; %0 is stackified
  br_if %bb.1, %0
bb.1:
```

In this code, br_if will be removed in CFGSort, so we should unstackify
%0 so that it can be correctly dropped in ExplicitLocals.

Rather than handling this on a case-by-case basis, this PR just
unstackifies all stackified registers with no uses at the beginning of
ExplicitLocals, so that they can be correctly dropped.

Fixes llvm#149097.
…5087)

Do some refactoring to allocation/deallocation interceptors. Expose
explicit per-alloc_type functions and stop accepting explicit AllocType.
This ensures we do not accidentally mix allocation types.

NOTE: This change rejects attempts to call `operator new(<some_size>,
static_cast<std::align_val_t>(0))`.

For llvm#144435

Signed-off-by: Justin King <[email protected]>
mbrtowc was not handling a null destination correctly.
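For context, the C standard allows the destination pointer of `mbrtowc` to be null: the function still consumes the multibyte character and returns its length, but stores nothing. The sketch below shows that contract through the host's `std::mbrtowc`; it is not the libc implementation being fixed, and `firstCharLength` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Returns the byte length of the first multibyte character in s (looking
// at no more than n bytes) without storing the decoded wide character.
std::size_t firstCharLength(const char *s, std::size_t n) {
  std::mbstate_t state{};                     // initial conversion state
  return std::mbrtowc(nullptr, s, n, &state); // null destination is allowed
}
```

In the default "C" locale, an ASCII letter yields a length of 1 and a terminating null byte yields 0.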

---------

Co-authored-by: Sriya Pratipati <[email protected]>
Follow up to 28417e6, and the whole line of work started with 4b81dc7.

This change merges the handling for VPStore - currently in
lowerInterleavedVPStore - into the existing dedicated routine used in
the shuffle lowering path. This removes the last use of the dedicated
lowerInterleavedVPStore and thus we can remove it.

This contains two changes which are functional.

First, like in 28417e6, merging support for vp.store exposes the
strided store optimization for code using vp.store.

Second, it seems the strided store case had a significant missed
optimization. We were performing the strided store at the full unit
strided store type width (i.e. LMUL) rather than reducing it to match
the input width. This became obvious when I tried to use the mask
created by the helper routine as it caused a type incompatibility.

Normally, I'd try not to include an optimization in an API rework, but
structuring the code to both be correct for vp.store and not optimize
the existing case turned out to be more involved than seemed worthwhile. I
could pull this part out as a pre-change, but it's a bit awkward on its
own as it turns out to be somewhat of a half step on the possible
optimization; the full optimization is complex with the old code
structure.

---------

Co-authored-by: Craig Topper <[email protected]>
Eliminate the `lldb_private::dwarf` namespace, in favor of using
`llvm::dwarf` directly. The latter is shorter, and this avoids ambiguity
in the ABI plugins that define a `dwarf` namespace inside an anonymous
namespace.
…150103)

The F4E2M1 truncation emulation was expanding or truncating operations
to F32 even when the pattern did not apply, causing non-convergent
rewrites when operating on doubles.

Also, fix a pair of whitespace issues that snuck in.
Addressing a suggestion from llvm#149605 consistently throughout the file.
)

Simon Pilgrim ([1]) and Anton reported that the following warning
appears when building the clang compiler:
```
In file included from .../llvm-project/llvm/lib/Target/BPF/BPFASpaceCastSimplifyPass.cpp:9: .../llvm-project/llvm/lib/Target/BPF/BPF.h:25:20: warning: ‘llvm::BPF_TRAP’ defined but not used [-Wunused-variable]
   25 | static const char *BPF_TRAP = "__bpf_trap";
      |                    ^~~~~~~~
...
In file included from .../llvm-project/llvm/lib/Target/BPF/MCTargetDesc/BPFInstPrinter.cpp:14:
.../llvm-project/llvm/lib/Target/BPF/BPF.h:25:20: warning: ‘llvm::BPF_TRAP’ defined but not used [-Wunused-variable]
   25 | static const char *BPF_TRAP = "__bpf_trap";
      |                    ^~~~~~~~
...
```
Instead of using a static const variable, use a macro to silence the warnings.

  [1] llvm#131731
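A minimal sketch of the difference, assuming the macro keeps the same name as the variable it replaces. The helper `isTrapSymbol` is hypothetical and exists only to show a use site.

```cpp
#include <cassert>
#include <cstring>

// Before (shown as a comment): a namespace-scope `static` variable in a
// header gives every including translation unit its own copy, and units
// that never read it trigger -Wunused-variable.
//   static const char *BPF_TRAP = "__bpf_trap";

// After: a macro defines no variable, so there is nothing to warn about.
#define BPF_TRAP "__bpf_trap"

// Hypothetical use site.
bool isTrapSymbol(const char *name) {
  return std::strcmp(name, BPF_TRAP) == 0;
}
```

Translation units that include the header without using `BPF_TRAP` no longer carry an unused definition.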
)

If I'm reading the spec correctly, plui.h/w encode the immediate
differently from pli.h/w. pli.h/w appear to rotate the immediate
left by 1 before encoding while plui.h/w rotate the immediate right
by 1 before encoding.

Since I was splitting the classes, I made the name closer to the
instruction names since the immediate width was ambiguous. I've
added an _i suffix to make it similar to base and Zb* class names.
…m#150100)

Fixes llvm#147395

This PR:
- Excludes lifetime intrinsics from the Int64Ops shader flags analysis
to match DXC behavior and pass DXIL validation.
- Performs legalization of `llvm.lifetime.*` intrinsics in the
EmbedDXILPass just before invoking the DXILBitcodeWriter.
- After invoking the DXILBitcodeWriter, all lifetime intrinsics and
associated bitcasts are removed from the module to keep the Module
Verifier happy. This is fine since lifetime intrinsics are not needed by
any passes after the EmbedDXILPass.
…#149775)

Addresses llvm#112164. The minimumnum and maximumnum intrinsics were
added in 5bf81e5.

The new built-ins can be used for implementing the OpenCL math functions
fmax and fmin in llvm#128506.
…lvm#150132)

LanguageType has two kinds of enumerators in it. The first is
DWARF-assigned enumerators which must be consecutive and match DW_LANG
values. The second is the vendor-assigned enumerators which must be
unique and must follow on from the DWARF-assigned values (i.e. the first
one is currently eLanguageTypeMojo + 1) even if that collides with
DWARF-assigned values that lldb is not yet aware of.

Only the DWARF-assigned enumerators may be static_cast from DW_LANG
since their values match. The vendor-assigned enumerators must be
explicitly converted since their values do not match. This needs to
handle new languages added to DWARF and not yet implemented in lldb.

This fixes a crash when encountering a DW_LANG value >=
eNumLanguageTypes and wrong behaviour when encountering DW_LANG values
that have not yet been added to LanguageType but happen to coincide with
a vendor-assigned enumerator due to the consecutive values requirement
described above.

Another way to fix the crash is to add the language to LanguageType (and
fill any preceding gaps in the number space) so that the DW_LANG being
encountered is correctly handled but this just moves the problem to a
new subset of DW_LANG values.

Also fix an unnecessary static-cast from LanguageType to LanguageType.
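The conversion rule described above can be sketched as follows. Every name and value here is a hypothetical, heavily trimmed stand-in (the real enum is much larger, and `kVendorDwarfCodeA` is an invented vendor code); the point is only the shape of the check: cast within the DWARF-assigned range, map vendor codes explicitly, and fall back to unknown otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, trimmed stand-in for the language enum.
enum LanguageType : std::uint16_t {
  eLanguageTypeUnknown = 0x0000,
  eLanguageTypeC89 = 0x0001,
  eLanguageTypeC = 0x0002,
  eLastDwarfLanguageType = eLanguageTypeC,           // last DWARF-assigned
  eLanguageTypeVendorA = eLastDwarfLanguageType + 1, // vendor-assigned
  eNumLanguageTypes
};

// Invented vendor DW_LANG code, for illustration only.
constexpr std::uint16_t kVendorDwarfCodeA = 0x8001;

LanguageType fromDwarfLanguage(std::uint16_t dwLang) {
  // Only DWARF-assigned enumerators share their value with DW_LANG.
  if (dwLang <= eLastDwarfLanguageType)
    return static_cast<LanguageType>(dwLang);
  // Vendor-assigned enumerators need an explicit mapping.
  if (dwLang == kVendorDwarfCodeA)
    return eLanguageTypeVendorA;
  // Anything else, including new DWARF values, is not known yet.
  return eLanguageTypeUnknown;
}
```

In this model a DW_LANG value of `0x0003` (a language added to DWARF later) maps to unknown instead of being mistaken for `eLanguageTypeVendorA`, and an out-of-range value never indexes past `eNumLanguageTypes`.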