Xcvmem #287

realqhc · 2024-01-02T04:01:44Z

No description provided.

…llvm#93795) The current way of lowering `llvm.clear_cache` is a bit unusual. As suggested by Matt Arsenault we are better off using an ISD node. This change introduces a new `ISD::CLEAR_CACHE`, registers a new libcall by default named `__clear_cache` and the default legalisation is a libcall. This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE` needed by RISC-V on some platforms.

…m#93772) If present, the optional second argument of the ieee_exceptions intrinsic module procedure ieee_support_flag may be either a scalar or an array. Change the signature of the routine that implements this function so that it is processed as a transformational function, not an elemental function, which accounts for this argument variant.

Better design to put semantics on the ops, and in this case the ntt/intt op can lower in multiple ways depending on the polynomial ring modulus (it can need an nth root of unity for cyclic polymul -> ntt, or a 2nth root for negacyclic polymul -> ntt) --------- Co-authored-by: Jeremy Kun <[email protected]>

…en not poison Since llvm#93182 we can now call computeKnownBits inside getValidMaximumShiftAmount to determine the bounds of the shift amount, ensuring that it wasn't poison. This means that if we did freeze the shift amount, isGuaranteedNotToBeUndefOrPoison would then fail, as we can't call computeKnownBits through FREEZE for potentially poison values. I'm still reducing a decent test case but wanted to get the buildbot fix in ASAP.

…llvm#88712) This commit adds an API (`tileAndFuseConsumerOfSlice`) to fuse a consumer into a producer within an scf.for/scf.forall loop. To support this, two new methods are added to the `TilingInterface`: `getIterationDomainTileFromOperandTile` and `getTiledImplementationFromOperandTile`. Consumer operations that implement these methods can be fused with tiled producer operands in a manner similar to (but essentially the inverse of) the fusion of an untiled producer with a tiled consumer. Note that this only does one `tiled producer` -> `consumer` fusion; it can be called repeatedly to fuse multiple consumers. The current implementation is also conservative about when it kicks in (for example, it requires a single use of the value returned by the inter-tile loops that surround the tiled producer). These conditions can be relaxed over time. Signed-off-by: Abhishek Varma <[email protected]> Co-authored-by: cxy <[email protected]>

…vm#92783) In `atomic::wait`, when we call the platform wait `ulock_wait`, we are using `UL_COMPARE_AND_WAIT`. But we should use `UL_COMPARE_AND_WAIT64` instead, as the address we are waiting on is a 64-bit integer. Fixes llvm#85107. It is rather hard to test directly because in `atomic::wait`, before calling into the platform wait, our C++ code has some poll logic that checks whether the value has changed. Thus, in this patch, the test uses the internal function.

We can convert this to a select based on the `(icmp eq X, C)`, then constant fold the addition, with the true arm being `(add C, (sext/zext 1))` and the false arm being `(add X, 0)`, e.g. `(select (icmp eq X, C), (add C, (sext/zext 1)), (add X, 0))`. This is essentially a specialization of the only case that seems to actually show up from llvm#89020. Closes llvm#93840
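
The equivalence behind this fold can be checked with a small sketch. The function names below (`addZextEq`, `selectZext`, `addSextEq`, `selectSext`, `checkFoldsAgree`) are hypothetical, and the code models 32-bit wrapping arithmetic in plain C++; it is not InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Original pattern, zext flavour: (add X, (zext (icmp eq X, C))).
// zext of a true i1 is 1, so the add contributes 1 only when X == C.
constexpr std::uint32_t addZextEq(std::uint32_t x, std::uint32_t c) {
  return x + (x == c ? std::uint32_t{1} : std::uint32_t{0});
}

// Folded pattern: (select (icmp eq X, C), (add C, 1), (add X, 0)).
constexpr std::uint32_t selectZext(std::uint32_t x, std::uint32_t c) {
  return x == c ? c + std::uint32_t{1} : x;
}

// sext flavour: sext of a true i1 is all-ones, i.e. -1 modulo 2^32.
constexpr std::uint32_t addSextEq(std::uint32_t x, std::uint32_t c) {
  return x + (x == c ? std::uint32_t{0xFFFFFFFFu} : std::uint32_t{0});
}

// Folded pattern for the sext flavour: the true arm becomes C - 1.
constexpr std::uint32_t selectSext(std::uint32_t x, std::uint32_t c) {
  return x == c ? c - std::uint32_t{1} : x;
}

// Aborts if the original and folded forms disagree for the given inputs.
inline void checkFoldsAgree(std::uint32_t x, std::uint32_t c) {
  assert(addZextEq(x, c) == selectZext(x, c));
  assert(addSextEq(x, c) == selectSext(x, c));
}
```

Calling `checkFoldsAgree(x, c)` for a handful of values, including the wrap-around case where both are `0xFFFFFFFF`, is a quick sanity check that the two forms compute the same result.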

…eship coverage (llvm#94120) Three unrelated, small improvements:

* `test_macros.h` was incorrectly saying `__has_include("<version>")` instead of `__has_include(<version>)`.
  + This caused `<ciso646>` to always be included (noticed because MSVC's STL emitted a deprecation warning).
  + I searched all of LLVM and found no other occurrences.
* `thread.condition.condvarany/wait_for_pred.pass.cpp` forgot to test anything.
  + I followed what `wait_for.pass.cpp` is testing.
* Uncomment spaceship test coverage.

The type of *Iter here is "const IndexedMemProfRecord &" as defined in RecordLookupTrait. Assigning *Iter to a variable of type "const IndexedMemProfRecord &" avoids a copy, reducing the cycle and instruction counts by 1.8% and 0.2%, respectively, with "llvm-profdata show" modified to deserialize all MemProfRecords. Note that RecordLookupTrait has an internal copy of IndexedMemProfRecord, so we don't have to worry about a dangling reference to a temporary.
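
The difference between binding to a reference and copying can be illustrated with a small sketch. `Record` and `Lookup` below are hypothetical stand-ins (not the MemProf types); `Lookup` owns the record it hands out, mirroring the point above that the reference does not dangle.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical heavyweight record that counts copy constructions.
struct Record {
  std::vector<int> payload;

  static int &copyCount() {
    static int count = 0;
    return count;
  }

  explicit Record(std::vector<int> p) : payload(std::move(p)) {}
  Record(const Record &other) : payload(other.payload) { ++copyCount(); }
  Record(Record &&) = default;
  Record &operator=(const Record &) = default;
  Record &operator=(Record &&) = default;
};

// Hypothetical lookup object: dereferencing yields a const reference to a
// record stored inside the object itself.
class Lookup {
public:
  explicit Lookup(Record r) : stored_(std::move(r)) {}
  const Record &operator*() const { return stored_; }

private:
  Record stored_;
};

// Copies the record before reading it (one copy construction per call).
inline std::size_t sizeViaCopy(const Lookup &it) {
  Record r = *it;
  return r.payload.size();
}

// Binds a const reference instead, so no copy is made.
inline std::size_t sizeViaReference(const Lookup &it) {
  const Record &r = *it;
  return r.payload.size();
}
```

Comparing `Record::copyCount()` before and after each helper shows that only `sizeViaCopy` increments the counter.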

This avoids the pitfall where we set the uwtable to none:

```
func.setUWTableKind(llvm::UWTableKind::None)
```

`Attribute::getAsString()` would see an unknown attribute and fail an assertion. In this patch, we assert that we do not see a None uwtable kind. This also skips the check of `UWTableKind::Async`. It is dominated by the check of `UWTableKind::Default`, which has the same enum value (nfc).
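
A minimal sketch of why the `Async` check is redundant, assuming the enumerator values shown here (`None = 0`, `Sync = 1`, `Async = 2`, `Default = 2`). The local enum and the `uwtableToString` helper are illustrative only and are not LLVM's `Attribute::getAsString()`.

```cpp
#include <cassert>
#include <string>

// Local stand-in for the uwtable kind; the values are an assumption made for
// this sketch. Default and Async deliberately share one value.
enum class UWTableKind { None = 0, Sync = 1, Async = 2, Default = 2 };

static_assert(UWTableKind::Async == UWTableKind::Default,
              "Async and Default are the same enumerator value");

// Hypothetical printer: asserts that None never reaches it, then needs only
// one comparison to cover both Default and Async.
inline std::string uwtableToString(UWTableKind kind) {
  assert(kind != UWTableKind::None && "uwtable kind must not be None");
  if (kind == UWTableKind::Default) // Also true for Async.
    return "uwtable";
  return "uwtable(sync)";
}
```

Because the two enumerators compare equal, a separate `kind == UWTableKind::Async` branch placed after the `Default` check could never be taken.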

Each `LoopBlock` stored in `LoopWorkList` consists of a basic block and its loop data. When iterating over `LoopWorkList`, if the estimated weight of a loop is not stored in `EstimatedLoopWeight`, `getLoopExitBlocks()` is called to get all exit blocks of the loop. The estimated weight of a loop is calculated by iterating over the edges leading from the basic block to all exit blocks of the loop. If at least one edge has an unknown estimated weight, the estimated weight of the loop is unknown and will not be stored in `EstimatedLoopWeight`. `LoopWorkList` can contain different blocks in the same loop, so work is wasted calling `getLoopExitBlocks()` for the same loop multiple times. Since computing the exit blocks of a loop is expensive and the loop structure is not mutated in Branch Probability Analysis, we can cache the result and improve compile time. With this change, the overall compile time for a file containing a very large loop drops by around 82%.
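
The caching idea can be sketched as a small memoizing wrapper. Everything here is hypothetical: loops and blocks are plain integers, and `ExitBlockCache` with its callback stands in for the expensive exit-block computation; it is not the Branch Probability Analysis code.

```cpp
#include <cassert>
#include <functional>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical identifiers: loops and blocks are plain ints in this sketch.
using LoopId = int;
using BlockId = int;

class ExitBlockCache {
public:
  using ComputeFn = std::function<std::vector<BlockId>(LoopId)>;

  explicit ExitBlockCache(ComputeFn compute) : compute_(std::move(compute)) {
    assert(compute_ && "compute callback must be callable");
  }

  // Returns the exit blocks of `loop`, running the expensive computation at
  // most once per loop. This is only valid while the loop structure is not
  // mutated, which is the assumption stated in the text above.
  const std::vector<BlockId> &get(LoopId loop) {
    auto it = cache_.find(loop);
    if (it == cache_.end()) {
      ++computeCalls_;
      it = cache_.emplace(loop, compute_(loop)).first;
    }
    return it->second;
  }

  // Number of times the underlying computation actually ran.
  int computeCalls() const { return computeCalls_; }

private:
  ComputeFn compute_;
  std::unordered_map<LoopId, std::vector<BlockId>> cache_;
  int computeCalls_ = 0;
};
```

When several worklist entries belong to the same loop, repeated `get` calls hit the map instead of recomputing, which is where the saving comes from.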

All post-increment load/store and register-register load/store instructions. Spec: https://github.com/openhwgroup/cv32e40p/blob/master/docs/source/instruction_set_extensions.rst

ChunyuLiao force-pushed the xcvmem branch 2 times, most recently from 795a445 to 26f3a72 on January 5, 2024 06:51

jerryzj and others added 28 commits May 30, 2024 19:18

[RISCV] Fix typo zamo -> zaamo (llvm#93792)

01921bd

Signed-off-by: Jerry Zhang Jian <[email protected]>

[LLVM] Remove executable permission from some non-executable files (l…

4d65887

…lvm#93803)

Revert "Add SBAddressRange and SBAddressRangeList to SB API (llvm#92014…

8b600a3

…)" This reverts commit 42944e4.

[mlir][ArmSME] Simplify permutation map handling (llvm#93515)

b49c0b8

In -convert-vector-to-arm-sme the permutation_map is explicitly checked for transpose when converting xfer ops, but for 2-D vector types the only non-identity permutation map is transpose so this can be simplified.

[libc][NFC] Tighten up guard conditions for sqrt and polyeval (llvm#9…

662b130

…3791) Found while investigating llvm#93709

AMDGPU/GlobalISel: Use correct type for intrinsic ID

08d168c

[gn] port 8c33b33 (InstrumentationTests)

0d0851b

[docs] Update security group nomination to use gh pr (llvm#93679)

806ed26

[lldb][bazel] Fix BUILD after 540a36a.

191e64f

[Frontend][OpenMP] Remove reduction from allowed clauses for `targe…

eb88e7c

…t` (llvm#90754) The "reduction" clause is not allowed on the "target" construct.

[clang] fix(93002): clang/lib/Sema/SemaOpenMP.cpp:7405: Possible & / …

7b77301

…&& mixup ? (llvm#93093) Fixes llvm#93002

[AMDGPU] Regenerate checks in inst-select-load-global.s96.mir

ed25d1a

[gn build] Port 8b600a3

c28566c

[InstCombine] Add test for miscompile in gep-of-gep fold (NFC)

2b9c158

[Offload] Temporarily disable failing test after eb88e7c

1bf1f93

The `target reduction` combination is no longer accepted. Disable the test to avoid build failures, until a better fix is ready.

[clang-tidy] Check number of arguments to size/length in readability-…

57da040

…container-size-empty (llvm#93724) Verify that size/length methods are called with no arguments. Closes llvm#88203

[MCP] Remove unused TII argument. NFC

b5db2e1

Last used in e35fbf5.

[clang-repl] Introduce common fixture class in unittests (NFC) (llvm#…

a871470

…93816) Reduce code bloat by checking test requirements in a common test fixture

[Offload] Update test to use target parallel for reduction

adc4e45

Re-enable test disabled in 1bf1f93 with a fix.

[mlir][linalg] Add folder for transpose(transpose) -> transpose (llvm…

1159e76

…#93606) Back-to-back `linalg.transpose` ops can be rewritten as a single transpose.
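
A minimal sketch of the underlying permutation algebra, assuming the convention that result dimension `i` takes input dimension `perm[i]`. The helpers `applyTranspose` and `composePermutations` are hypothetical, operate on shapes only, and are not the MLIR folder itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Dims = std::vector<std::size_t>;

// Applies a transpose to a shape: result dim i is input dim perm[i].
inline Dims applyTranspose(const Dims &shape, const Dims &perm) {
  assert(shape.size() == perm.size());
  Dims result(perm.size());
  for (std::size_t i = 0; i < perm.size(); ++i)
    result[i] = shape[perm[i]];
  return result;
}

// Builds the single permutation equivalent to applying `first`, then
// `second`: composed[i] = first[second[i]].
inline Dims composePermutations(const Dims &first, const Dims &second) {
  assert(first.size() == second.size());
  Dims composed(second.size());
  for (std::size_t i = 0; i < second.size(); ++i)
    composed[i] = first[second[i]];
  return composed;
}
```

Under that convention, `applyTranspose(applyTranspose(shape, first), second)` and `applyTranspose(shape, composePermutations(first, second))` give the same shape, which is the property a transpose-of-transpose fold relies on.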

[flang] Lower non optional inquired argument in custom intrinsic lowe…

f55622f

…ring (llvm#93592) Handle lowering of a non-optional inquired argument in custom lowering. Also fix an issue in the lowering of an associated optional argument where a box was emboxed again, which led to a weird result.

[AMDGPU] Fixed subtarget name in the lit test check-prefix string (NF…

e8de977

…C). (llvm#93780)

[clang-repl] Fix SetUp in CodeCompletionTest fixture (llvm#93816)

647d272

And sort out some unused headers

[Clang] Fix overloading for constrained variadic functions (llvm#93817)

1ee02f9

Found by llvm#93667

[lldb] Attempt to fix TestCompletion on macos

a2bcb93

macOS will automatically load dependent modules when creating a target, resulting in more modules than the test expects.

kazutakahirata and others added 29 commits June 1, 2024 10:36

Use StringRef::starts_with (NFC) (llvm#94112)

2a1ea15

[mlir][bazel] Add bazel build support for llvm@2b2ce50 (llvm#94126)

e7e6e1e

Also drop errant header include from `Linalg` dialect into `Dialect/SCF/Transforms/TileUsingInterface.cpp`

[InstCombine] Add tests for folding `(add X, (sext/zext (icmp eq X, C…

c877eb3

…)))`; NFC

[NewPM][CodeGen] Port selection dag isel to new pass manager (llvm#83567

d2cdc8a

) Port selection dag isel to new pass manager. Only `AMDGPU` and `X86` support the new pass version. `-verify-machineinstrs` in the new pass manager belongs to verify instrumentation; it is enabled by default.

[SelectionDAG] Mark SelectionDAGISel destructor virtual (llvm#94132)

f63b1d2

[BPF] Remove unused ID in SelectionDAGISel (llvm#94134)

3f9ba00

[Targets] Remove unused ID in *DAGToDAGISel (llvm#94135)

de37c06

Revert "[NewPM][CodeGen] Port selection dag isel to new pass manager" (…

8917afa

…llvm#94146) This reverts commit de37c06 to de37c06. It still breaks the EXPENSIVE_CHECKS build. Sorry.

[DAG] canCreateUndefOrPoison - fix missing argument typo

c9a86fa

We were missing the PoisonOnly argument (so Depth + 1 was being used instead, with the default Depth = 0 argument then being silently used). Fixes llvm#94145 and serves as the test case for 9e22c7a

Fix pagination issue in libc++ buildbot restarter

b6ea134

Use llvm::less_first (NFC) (llvm#94136)

197c3a3

[memprof] Use GlobalValue::GUID instead of uint64_t (NFC) (llvm#94086)

e044283

[libc++] [test] Cleanup compile-only tests (llvm#94121)

df9167b

I noticed that these tests had empty `main` functions. Dropping them and renaming the tests to `MEOW.compile.pass.cpp` will slightly improve test throughput.

[clang][NFC] Update CWG issues list

c26a993

[TableGen] Use llvm::unique (NFC) (llvm#94163)

d929351

[libc++] Don't give functions C linkage (llvm#94102)

5367b2c

There is no reason to give any of the functions C linkage. This makes all of the libc++ functions have C++ linkage, removing the need for `_LIBCPP_HIDE_FROM_ABI_C`.

[clang-format] Handle attributes before lambda return arrow (llvm#94119)

80303cb

Fixes llvm#92657.

[clang-format][NFC] Add missing parens of __attribute in unit tests

f06f016

[clang-format][doc] Clean up quotes, etc.

2fbc9f2

[clang-format] Fix documentation build error

d7d2d4f

https://github.com/llvm/llvm-project/actions/runs/9342063971/job/25709589592

[RISCV] Codegen support for XCVmem extension

34b4bcf

All post-increment load/store and register-register load/store instructions. Spec: https://github.com/openhwgroup/cv32e40p/blob/master/docs/source/instruction_set_extensions.rst

ChunyuLiao force-pushed the xcvmem branch from 26f3a72 to 34b4bcf on June 3, 2024 07:45
