[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)(Needs Torch Bump) #451

jorickert · 2025-01-14T09:57:15Z

Onnx bump [AutoBump] Merge with fixes of b8000366 (Needs LLVM bump)(Dec 19) (10) onnx-mlir#272
Run bump test script
Torch bump [AutoBump] Merge with fixes of aca33f17 (Oct 22, needs LLVM Oct 19) (92) torch-mlir#476

…vm#111575) Adds a new mlir-opt test-only pass, -test-spirv-cpu-runner-pipeline, which runs the set of MLIR passes needed for the mlir-spirv-cpu-runner, and removes them from the runner. The tests are changed to invoke mlir-opt with this flag before running the runner. The eventual goal is to move all host/device code generation steps out of the runner, like with some of the other runners.

6bac414 added this opcode with the wrong number of operands. It didn't fail on check-llvm for me or on pre-commit CI, but once committed we got buildbot failures. This patch fixes the definition of the instruction and fixes the failing test.

With the truncssat nodes these are relatively simple tablegen patterns to add. The existing intrinsics are converted to shift+truncsat to they can lower using the new patterns. Fixes llvm#112925.

…eturn value is different in pointer / lvalue ref / rvalue ref (llvm#112853) Per https://cplusplus.github.io/CWG/issues/960.html.

This patch adds assembly/disassembly for the following instructions: ldfadd{a,al,l,}, ldbfadd{a,al,l,} ldfmax{a,al,l,}, ldbfmax{a,al,l,} ldfmaxnm{a,al,l,}, ldbfmaxnm{a,al,l,} ldfmin{a,al,l,}, ldbfmin{a,al,l,} ldfminnm{a,al,l,} ldbfminnm{a,al,l,} stfadd{l,}, stbfadd{l,} stfmax{l,}, stbfmax{l,} stfmaxnm{l,}, stbfmaxnm{l,} stfmin{l,}, stbfmin{l,} stfminnm{l,}, stbfminnm{l,} According to [1] [1]https://developer.arm.com/documentation/ddi0602 Co-authored-by: Spencer Abson [[email protected]](mailto:[email protected]) Co-authored-by: Caroline Concatto [[email protected]](mailto:[email protected])

…llvm#113164) Fixes llvm#113123 Alive proof: https://alive2.llvm.org/ce/z/hnqeLC

…m#112935) Done in preparation of exploring rtsan on windows.

Add new register classes/operands and their encoder/decoder behaviour required for the new Armv9.6 instructions (see https://developer.arm.com/documentation/109697/2024_09/Feature-descriptions/The-Armv9-6-architecture-extension). This work is the basis ofthe 2024 Armv9.6 architecture update effort for SME. Co-authored-by: Caroline Concatto [email protected] Co-authored-by: Marian Lukac [email protected] Co-authored-by: Momchil Velikov [email protected]

…-opt" (llvm#113176) Reverts llvm#111575 This caused build failures: https://lab.llvm.org/buildbot/#/builders/138/builds/5244

Improve the codegen for uaddo node for i64 in 64-bit mode and i32 in 32-bit mode by custom lowering.

Test added by commit 47a6da2 fails on the AIX bot. So XFAIL for now to investigate further.

Previously we were attempting to remove the memprof-related metadata when iterating through instructions in the LTO backend. However, we missed some as there are a number of cases where we skip instructions, or even entire functions. Simplify the cleanup and ensure all is removed by doing a full sweep over all instructions after completing cloning. This is largely NFC except with -memprof-report-hinted-sizes enabled, because we were propagating and simplifying the metadata after inlining in the LTO backend, which caused some stray messages as metadata was re-converted to attributes.

movzx r11d,BYTE PTR [rdx] is four bytes long. Follow-up to llvm#111638

Remove the unused functions and register classes from the change below llvm@4679583

Root gather/buildvector node should be ignored when SLP vectorizer tries to find matching gather nodes, vectorized earlier. This node is definitely the last one in the pipeline and it does not have users. It may cause the compiler crash Fixes llvm#113143

This patch adds a LegalityResultWithReason class for describing the reason why legality decided not to vectorize the code.

…llvm#109850) Special case small int constant in the PPC custom lowering of scalar_to_vector.

This is one of the many PRs to fix errors with LLVM_ENABLE_WERROR=on. Built by GCC 11. Fix warning In destructor ‘llvm::APInt::~APInt()’, inlined from ‘llvm::APInt::~APInt()’ at llvm-project/llvm/include/llvm/ADT/APInt.h:190:3, inlined from ‘llvm::APSInt::~APSInt()’ at llvm-project/llvm/include/llvm/ADT/APSInt.h:23:21, inlined from ‘bool checkOMPArraySectionConstantForReduction(clang::ASTContext&, const clang::ArraySectionExpr*, bool&, llvm::SmallVectorImpl<llvm::APSInt>&)’ at llvm-project/clang/lib/Sema/SemaOpenMP.cpp:18357:45, inlined from ‘bool actOnOMPReductionKindClause(clang::Sema&, {anonymous}::DSAStackTy*, clang::OpenMPClauseKind, llvm::ArrayRef<clang::Expr*>, clang::SourceLocation, clang::SourceLocation, clang::SourceLocation, clang::SourceLocation, clang::CXXScopeSpec&, const clang::DeclarationNameInfo&, llvm::ArrayRef<clang::Expr*>, {anonymous}::ReductionData&)’ at llvm-project/clang/lib/Sema/SemaOpenMP.cpp:18715:68: llvm-project/llvm/include/llvm/ADT/APInt.h:192:18: error: ‘void operator delete [](void*)’ called on a pointer to an unallocated object ‘1’ [-Werror=free-nonheap-object] 192 | delete[] U.pVal; | ^~~~

This patch fixes: llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/Legality.h:85:16: error: private field 'Reason' is not used [-Werror,-Wunused-private-field]

…2765) MSG_DEALLOC_VGPRS slows down very small waveslot limited kernels. It's been identified this message is only really needed for VGPR limited kernels. A kernel becomes VGPR limited if a total number of VGPRs per SIMD / number of used VGPRs is more than a number of wave slots.

…#112990) Renames LegalizeData to LegalizeDataValues since this pass fixes up SSA values. LegalizeData suggested that it fixed data mapping. This change also adds support to fix up ssa values for data clause operations. Effectively, compute regions within a data region use the ssa values from data operations also. The ssa values within data regions but not within compute regions are not updated. This change is to support the requirement in the OpenACC spec which notes that a visible data clause is not just one on the current compute construct but on the lexically containing data construct or visible declare directive.

Based on this RFC: https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306 LLVM intrinsics do not support out params. To get around this limitation implementers will make intrinsics return structs to capture a return type and an out param. This implementation detail should not impact scalarization since these cases should be elementwise operations. ## Three changes are needed. - The CallInst visitor needs to be updated to handle Structs - A new visitor is needed for `ExtractValue` instructions - finsh needs to be update to handle structs so that insert elements are properly propogated. ## Testing changes - Add support for `llvm.frexp` - Add support for `llvm.dx.splitdouble` fixes llvm#111437

…vm#112997) The x86-fold-tables.td has been failing for me and [in CI](https://buildkite.com/llvm-project/github-pull-requests/builds/111277#0192a122-c5c9-4e4e-bc5b-7532fec99ae4) if Git happens to decide to check out the baseline file with Windows line endings. This fix for this is to add the `--strip-trailing-cr` option to diff to normalize the line endings before comparing them.

…lvm#112995) llvm#98060 introduced a warning for unterminated string constants, however it was only checking for `\n` which means that it produced strange results on Windows (always blaming column 1) including having the [associated test fail](https://buildkite.com/llvm-project/github-pull-requests/builds/111277#0192a122-c5c9-4e4e-bc5b-7532fec99ae4) if Git happened to use Windows newlines when creating the file. This fix for this is to detect both `\r` and `\n`, but don't double-warn for Windows newlines.

llvm#111531) Bot maintainers should be aware and it became too much of a burden for developers. In particular on Windows, where make.exe won't be found in Path typically.

…2867) The Intel C++ Compiler (ICX) passes linker flags through the driver unlike MSVC and clang-cl, and therefore needs them to be prefixed with `/Qoption,link` (the equivalent of -Wl, for gcc on *nix). Use the `LINKER:` prefix for the `/EXPORT:` options in clang-repl, this expands to the correct flag for ICX and nothing for MSVC / clang-cl. RFC: https://discourse.llvm.org/t/rfc-cmake-linker-flags-need-wl-equivalent-for-intel-c-icx-on-windows/82446

These two veclibs are only available for AArch64 targets, and as mentioned in https://discourse.llvm.org/t/rfc-should-fveclib-imply-fno-math-errno-for-all-targets/81384, we (Arm) think that `-fveclib` should imply `-fno-math-errno`. By setting `-fveclib` the user shows they intend to use the vector math functions, which implies they don't care about errno. However, currently, the vector mappings won't be used in many cases without setting `-fno-math-errno` separately. Making this change would also help resolve some inconsistencies in how vector mappings are applied (see llvm#108980 (comment)). Note: Both SLEEF and ArmPL state that they do not set `errno`: - https://developer.arm.com/documentation/101004/2410/General-information/Arm-Performance-Libraries-math-functions * "The vector functions in libamath which are available on Linux may not set errno nor raise exceptions" - https://sleef.org/2-references/libm/ * "These functions do not set errno nor raise an exception."

…llvm#113167) Define `OmpIteratorSpecifier` and `OmpIteratorModifier` parser classes, and add parsing for them. Those are reusable between any clauses that use iterator modifiers. Add support for iterator modifiers to the MAP clause up to lowering, where a TODO message is emitted.

Reverts llvm#112603

…llvm#113452)

…at container-inserter does (llvm#113103) This patch implements LWG4016: container-insertable checks do not match what container-inserter does.

…lvm#111236) The underlying issue with msan was fixed by llvm#113200

When compiling for an SVE target we can use INDEX to generate constant fixed-length step vectors, e.g.: ``` uint32x4_t foo() { return (uint32x4_t){0, 1, 2, 3}; } ``` Currently: ``` foo(): adrp x8, .LCPI1_0 ldr q0, [x8, :lo12:.LCPI1_0] ret ``` With INDEX: ``` foo(): index z0.s, #0, #1 ret ``` The logic for this was already in `LowerBUILD_VECTOR`, though it was hidden under a check for `!Subtarget->isNeonAvailable()`. This patch refactors this to enable the corresponding code path unconditionally for constant step vectors (as long as we can use SVE for them).

…sult type (llvm#113031) This commit simplifies the result type of materialization functions. Previously: `std::optional<Value>` Now: `Value` The previous implementation allowed 3 possible return values: - Non-null value: The materialization function produced a valid materialization. - `std::nullopt`: The materialization function failed, but another materialization can be attempted. - `Value()`: The materialization failed and so should the dialect conversion. (Previously: Dialect conversion can roll back.) This commit removes the last variant. It is not particularly useful because the dialect conversion will fail anyway if all other materialization functions produced `std::nullopt`. Furthermore, in contrast to type conversions, at least one materialization callback is expected to succeed. In case of a failing type conversion, the current dialect conversion can roll back and try a different pattern. This also used to be the case for materializations, but that functionality was removed with llvm#107109: failed materializations can no longer trigger a rollback. (They can just make the entire dialect conversion fail without rollback.) With this in mind, it is even less useful to have an additional error state for materialization functions. This commit is in preparation of merging the 1:1 and 1:N type converters. Target materializations will have to return multiple values instead of a single one. With this commit, we can keep the API simple: `SmallVector<Value>` instead of `std::optional<SmallVector<Value>>`. Note for LLVM integration: All 1:1 materializations should return `Value` instead of `std::optional<Value>`. Instead of `std::nullopt` return `Value()`.

mgehre-amd · 2025-01-23T14:11:05Z

Does a corresponding PR exist on Xilinx/onnx-mlir? If yes, could you please link them?

jorickert · 2025-01-23T14:17:18Z

Xilinx/onnx-mlir#272 I will update it today

jorickert · 2025-01-24T08:00:44Z

I update the onnx-bump, but we will need to include #455 in the same bump too

[AutoBump] Merge with a8d506b (Oct 22) (15)

[AutoBump] Merge with 8a9921f (Oct 23) (17)

[AutoBump] Merge with fixes of 519eef3 (Oct 22) (16) (Needs downstream changes)

[AutoBump] Merge with fixes of f18c3e4 (Oct 23) (18) (Needs ONNX Bump)(Needs Torch Bump)

andfau-amd and others added 30 commits October 21, 2024 06:55

[AArch64] Add some basic patterns for qshrn.

bd861d0

With the truncssat nodes these are relatively simple tablegen patterns to add. The existing intrinsics are converted to shift+truncsat to they can lower using the new patterns. Fixes llvm#112925.

[clang] Add covariance tests that make sure we return an error when r…

1dfdbf7

…eturn value is different in pointer / lvalue ref / rvalue ref (llvm#112853) Per https://cplusplus.github.io/CWG/issues/960.html.

[InstCombine] Preserve the flag from RHS only if the and is bitwise (…

a2ba438

…llvm#113164) Fixes llvm#113123 Alive proof: https://alive2.llvm.org/ce/z/hnqeLC

[rtsan][NFC] Rename *interceptors.cpp to *interceptors_posix.cpp (llv…

1e07c48

…m#112935) Done in preparation of exploring rtsan on windows.

[mlir] Fix shared build. NFC

e730231

Revert "[mlir][mlir-spirv-cpu-runner] Move MLIR pass pipeline to mlir…

17e9752

…-opt" (llvm#113176) Reverts llvm#111575 This caused build failures: https://lab.llvm.org/buildbot/#/builders/138/builds/5244

[PPC] Add custom lowering for uaddo (llvm#110137)

c5ca1b8

Improve the codegen for uaddo node for i64 in 64-bit mode and i32 in 32-bit mode by custom lowering.

[AIX][test] XFAIL constant folding log1p test

900b636

Test added by commit 47a6da2 fails on the AIX bot. So XFAIL for now to investigate further.

[win/asan] Fix instruction size for 44 0f b6 1a

8417f6a

movzx r11d,BYTE PTR [rdx] is four bytes long. Follow-up to llvm#111638

[NFC] Fix -WError for unused Encode/Decode ZK methods

42ba452

Remove the unused functions and register classes from the change below llvm@4679583

[SandboxVec][Legality] Scaffolding for Legality (llvm#112623)

54c93aa

This patch adds a LegalityResultWithReason class for describing the reason why legality decided not to vectorize the code.

[PowerPC] special case small int constant for custom scalar_to_vector (…

fc59f2c

…llvm#109850) Special case small int constant in the PPC custom lowering of scalar_to_vector.

[Vectorize] Fix a warning

d4630ae

This patch fixes: llvm/include/llvm/Transforms/Vectorize/SandboxVectorizer/Legality.h:85:16: error: private field 'Reason' is not used [-Werror,-Wunused-private-field]

[gn build] Port 54c93aa

006fb09

[AMDGPU] Avoid repeated map lookups (NFC) (llvm#112819)

766bd6f

[Transforms] Avoid repeated hash lookups (NFC) (llvm#113120)

eaa7b38

[lldb][NFC] Fix doxygen comment on top of GetMangledTypeName

30a4028

[mlir][bazel] add missing dep in OpenAccTransforms

ab07fc8

weliveindetail and others added 11 commits October 23, 2024 14:32

[lldb][CMake] If make isn't found, print a warning but don't error out (

ba19e98

llvm#111531) Bot maintainers should be aware and it became too much of a burden for developers. In particular on Windows, where make.exe won't be found in Path typically.

Revert "[PowerPC] Expand global named register support" (llvm#113457)

a19f05b

Reverts llvm#112603

[PS5][Driver] Query OPT_r/OPT_shared/OPT_static just once (NFC) (…

5560f7e

…llvm#113452)

[libc++][ranges] LWG4016: container-insertable checks do not match wh…

7c72199

…at container-inserter does (llvm#113103) This patch implements LWG4016: container-insertable checks do not match what container-inserter does.

Reapply "[InstCombine] Folding (icmp eq/ne (and X, -P2), INT_MIN)" (l…

294726d

…lvm#111236) The underlying issue with msan was fixed by llvm#113200

[AutoBump] Merge with fixes of 0a17bdf (Oct 15)

60a0124

jorickert changed the title ~~[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14)~~ [AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump) Jan 14, 2025

jorickert added 5 commits January 14, 2025 05:31

[AutoBump] Merge with a8d506b (Oct 22)

1b28c53

[AutoBump] Merge with fixes of 519eef3 (Oct 22)

dc1a1db

[AutoBump] Merge with 8a9921f (Oct 23)

3f5d12c

[AutoBump] Merge with fixes of f18c3e4 (Oct 23)

21fd016

Adjust return type to new signature of materialization callbacks

1fe0b8b

Base automatically changed from bump_to_f035d9f0 to feature/fused-ops January 23, 2025 14:10

jorickert added 3 commits January 24, 2025 09:43

Merge pull request #452 from Xilinx/bump_to_a8d506b3

d3ab502

[AutoBump] Merge with a8d506b (Oct 22) (15)

Merge pull request #454 from Xilinx/bump_to_8a9921f5

cc2a236

[AutoBump] Merge with 8a9921f (Oct 23) (17)

Merge pull request #453 from Xilinx/bump_to_519eef3b

ee69b74

[AutoBump] Merge with fixes of 519eef3 (Oct 22) (16) (Needs downstream changes)

jorickert changed the title ~~[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)~~ [AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes) Jan 24, 2025

mgehre-amd mentioned this pull request Jan 24, 2025

[AutoBump] Merge with fixes of aca33f17 (Oct 22, needs LLVM Oct 19) (92) Xilinx/torch-mlir#476

Open

1 task

Merge pull request #455 from Xilinx/bump_to_f18c3e4e

41d0253

[AutoBump] Merge with fixes of f18c3e4 (Oct 23) (18) (Needs ONNX Bump)(Needs Torch Bump)

jorickert changed the title ~~[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)~~ [AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)(Needs Torch Bump) Jan 27, 2025

mgehre-amd self-requested a review January 28, 2025 07:12

mgehre-amd approved these changes Jan 28, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)(Needs Torch Bump) #451

[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)(Needs Torch Bump) #451

jorickert commented Jan 14, 2025 •

edited by mgehre-amd

Loading

mgehre-amd commented Jan 23, 2025

jorickert commented Jan 23, 2025

jorickert commented Jan 24, 2025

[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)(Needs Torch Bump) #451

Are you sure you want to change the base?

[AutoBump] Merge with fixes of 0a17bdfc (Oct 15) (14) (Needs ONNX Bump)(Needs downstream changes)(Needs Torch Bump) #451

Conversation

jorickert commented Jan 14, 2025 • edited by mgehre-amd Loading

mgehre-amd commented Jan 23, 2025

jorickert commented Jan 23, 2025

jorickert commented Jan 24, 2025

jorickert commented Jan 14, 2025 •

edited by mgehre-amd

Loading