LLVM and SPIRV-LLVM-Translator pulldown (WW32) #10783

When compiling against recent glibc (>= 2.35) but old kernel headers (< 4.18), `SYS_rseq` is not defined and thus llvm-exegesis fails to build. So also check that `SYS_rseq` is defined before trying to use it. Fixes llvm/llvm-project#64456 Reviewed By: MaskRay, gchatelet Differential Revision: https://reviews.llvm.org/D157189
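A minimal sketch of the kind of guard described above, assuming a Linux target with glibc. `HaveRseqSyscallNumber` and `describeRseqSupport` are hypothetical names for illustration, not identifiers from llvm-exegesis.

```cpp
// Sketch only: the names below are hypothetical, not llvm-exegesis code.
#if defined(__linux__)
#include <sys/syscall.h> // With glibc, SYS_rseq exists only if the kernel headers provide it.
#endif

#if defined(SYS_rseq)
constexpr bool HaveRseqSyscallNumber = true;
#else
// Old kernel headers: code on this path must never mention SYS_rseq.
constexpr bool HaveRseqSyscallNumber = false;
#endif

// Reports which branch of the guard was taken at compile time.
const char *describeRseqSupport() {
  return HaveRseqSyscallNumber ? "SYS_rseq available" : "SYS_rseq missing";
}
```

Any code that passes `SYS_rseq` to `syscall()` would sit inside the `#if defined(SYS_rseq)` branch, so the build no longer depends on the glibc version alone.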

We currently only enable hoisting in the last SimplifyCFG run of the function simplification pipeline. In particular this happens after GVN, which means that instructions that were identical (and thus hoistable) prior to GVN might no longer be so after it ran, due to equality replacements (see the phase ordering test). The history here is that D84108 restricted hoisting to the very late (module optimization) pipeline only. Then D101468 went back on that, and also performed it at the end of function simplification. This patch goes one step further and allows it prior to GVN. Importantly, we still don't perform hoisting before LoopRotate, which was the original motivation for delaying it. Differential Revision: https://reviews.llvm.org/D156532

As pointed out in D125755 the operand of a call to getCastInstrCost had the Src and Dst the wrong way around. Differential Revision: https://reviews.llvm.org/D154841

…ATE widening With SSSE3, widen the truncation for anything other than vXi64 -> vXi8 smaller than v8i64 (where PSHUFB would be better).

Differential Revision: https://reviews.llvm.org/D157256

…ions. The modeling of send, recv, sendmsg, recvmsg, sendto, recvfrom is changed: These functions do not return 0, except if the message length is 0. (In sendmsg, recvmsg the length is not checkable but it is more likely that a message with 0 length is invalid for these functions.) Reviewed By: donat.nagy Differential Revision: https://reviews.llvm.org/D155715

Differential Revision: https://reviews.llvm.org/D157002

This has failed once in a while on our Windows on Arm bot: https://lab.llvm.org/buildbot/#/builders/219/builds/4688

    Traceback (most recent call last):
      File "C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\...
        self.assertGreaterEqual(duration_sec, 1)
    AssertionError: 0.9907491207122803 not greater than or equal to 1

We're not here to check that Python/the C++ lib/the OS implemented timers correctly, so accept anything 0.95 or greater.

…calling conventions in C++ As reported in <llvm/llvm-project#58929>, Clang's handling of empty structs in the case of small structs that may be eligible to be passed using the hard FP calling convention doesn't match g++. In general, C++ record fields are never empty unless [[no_unique_address]] is used, but the RISC-V FP ABI overrides this. After this patch, fields of structs that contain empty records will be ignored, even in C++, when considering eligibility for the FP calling convention ('flattening'). It isn't explicitly noted in the RISC-V psABI, but arrays of empty records will disqualify a struct for consideration of using the FP calling convention in g++. This patch matches that behaviour. The psABI issue <riscv-non-isa/riscv-elf-psabi-doc#358> seeks to clarify this. This patch was previously committed but reverted after a bug was found. This recommit adds additional logic to prevent that bug (adding an extra check for when a candidate from detectFPCCEligibleStructHelper may not be valid). Differential Revision: https://reviews.llvm.org/D142327

This allows using VPRecipeWithIRFlags for VPInstruction and reduces the diff for D157144 & D157194.

Currently we only consider basic blocks with a unique predecessor when estimating the size of dead code. However, we could expand this to consider blocks with a back-edge, or blocks preceded by dead blocks. Differential Revision: https://reviews.llvm.org/D156903

This ports over the test cases half-convert.ll and implements patterns or RISCVISelLowering.cpp changes for all of the most straight-forward cases (those that don't require changes outside of lib/Target/RISCV). The remaining cases and noted poor codegen for saturating conversions will be handled in follow-up patches. Differential Revision: https://reviews.llvm.org/D156943

This patch moves directive sets defined internally in Semantics to a header accessible by other stages of the compiler to enable reuse. Some sets are renamed/rearranged and others are lifted from local definitions to provide a single source of truth. Differential Revision: https://reviews.llvm.org/D157090

…nversion Extending to f32 first (as is done for f16) results in better generated code for RISC-V (and affects no other in-tree tests). Additionally, performing the FP_EXTEND first seems equally justified for bf16 as for f16. Differential Revision: https://reviews.llvm.org/D156944

Most test cases have a blank line after the end of the RUN lines for readability.

This reduces the number of places where we have to check for a list of DS_GWS_* opcodes. Differential Revision: https://reviews.llvm.org/D157099

The affected lit tests failed when they were run in a path that contained the word "call". CHECK-NOT lines that were supposed to match only the IR ended up matching the path printed in the output. Fixed this by checking for "call void" instead.

Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D152141

Split off suggested refactoring from D157144. Also adds an assert to make sure this is only used when OpType is FPMathOp.

6640df9 did not actually remove it, just its final user. cannotBeOrderedLessThanZeroImpl still has a user which needs to be updated before it can be removed. The users of SignBitMustBeZero currently have broken expectations for nan handling, so requires more work to replace.

Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D157214

Currently `isTriviallyReMaterializable` calls `isReallyTriviallyReMaterializable` and `isReallyTriviallyReMaterializableGeneric`. The two interfaces are confusing, but there are also some real issues with this. The documentation of this function (see below) suggests that `isReallyTriviallyRematerializable` allows the target to override the default behaviour.

    /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
    /// set, this hook lets the target specify whether the instruction is actually
    /// trivially rematerializable, taking into consideration its operands.

It however implements something different. The default behaviour is the analysis done in `isReallyTriviallyReMaterializableGeneric`, which is testing if it is safe to rematerialize the MachineInstr. The result of `isReallyTriviallyReMaterializable` is only considered if `isReallyTriviallyReMaterializableGeneric` returns `false`. That means there is no way to override the default behaviour if `isReallyTriviallyReMaterializableGeneric` returns true (i.e. it is safe to rematerialize, but we'd rather not). By making this a single interface, we can override the interface to do either.

Reviewed By: craig.topper, nemanjai Differential Revision: https://reviews.llvm.org/D156520
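The difference between the two shapes can be sketched with a toy model. Everything here (`Instr`, `OldTargetInfo`, `NewTargetInfo`, the "picky" targets) is a hypothetical stand-in rather than LLVM code; the sketch only mirrors the control flow described above.

```cpp
// Toy model only: none of these types exist in LLVM.
struct Instr {
  bool GenericSafe; // What the generic safety analysis would conclude.
  bool TargetWants; // Whether the target wants this rematerialized.
};

// Old shape: the target hook is consulted only if the generic analysis says no.
struct OldTargetInfo {
  virtual ~OldTargetInfo() = default;
  virtual bool targetHook(const Instr &) const { return false; }
  bool generic(const Instr &I) const { return I.GenericSafe; }
  bool isTriviallyReMaterializable(const Instr &I) const {
    return generic(I) || targetHook(I);
  }
};

// New shape: a single virtual entry point whose default is the generic analysis.
struct NewTargetInfo {
  virtual ~NewTargetInfo() = default;
  virtual bool isReallyTriviallyReMaterializable(const Instr &I) const {
    return I.GenericSafe;
  }
  bool isTriviallyReMaterializable(const Instr &I) const {
    return isReallyTriviallyReMaterializable(I);
  }
};

// A target that would like to veto some instructions that are safe.
struct OldPickyTarget : OldTargetInfo {
  bool targetHook(const Instr &I) const override { return I.TargetWants; }
};
struct NewPickyTarget : NewTargetInfo {
  bool isReallyTriviallyReMaterializable(const Instr &I) const override {
    // The override can both restrict and reuse the default behaviour.
    return I.TargetWants && NewTargetInfo::isReallyTriviallyReMaterializable(I);
  }
};
```

For an instruction that is safe but unwanted (`Instr{true, false}`), the old shape still answers `true` because the hook is never reached, while the new shape lets the target answer `false`.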

…s for PACKSS/PACKUS Begin to consolidate the similar matching code we have - all have semi-similar constraints that still need merging together to ensure we get consistent codegen depending on when the truncate is lowered.

…vector types Fuzz testing noticed that the sub-128-bit vector splitting added in ef4330f didn't correctly halt at <2 x iXX> truncations.

As requested in review for https://reviews.llvm.org/D156990 This additionally consistently uses the ilp32d/lp64d ABIs when the D extension is enabled.

redundant get() call on smart pointer.

Support IR that is generated by the vector-to-scf lowering of 2D vector transfers with a mask. Only 2D transfers that were fully unrolled are supported at the moment. Differential Revision: https://reviews.llvm.org/D156695

Use APInt to represent numeric variables and expressions, thereby removing overflow concerns. The only remaining failure mode is underflow, which occurs when the format of an expression is unsigned (incl. hex values) but the result is negative. Note that this can only happen when substituting an expression, not when capturing, since the regex used to capture an unsigned value will not include a minus sign; hence all the code removal for match propagation testing. This is what this patch implements. Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D150880

Add Matcher dependentSizedExtVectorType for DependentSizedExtVectorType. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D157237

Add Matcher convertVectorExpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D157248

…me from CF_OPTIONS This cherry-picks swiftlang/llvm-project#6431 since without it, macOS 14 SDK headers don't compile when targeting catalyst. Fixes #64438.

Similar to the other ValueTracking function, switch over the instruction opcode instead of doing a long sequence of match()es.

CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp

…f a live register This patch tweaks the fix in D20627 "Do not rename registers that do not start an independent live range" to only consider Data dependencies, not Output or Anti dependencies. An Output or Anti dependency to a superreg does not imply that that superreg is live at the current instruction. This enables breaking anti-dependencies in a few more cases as shown by the lit test updates. Differential Revision: https://reviews.llvm.org/D156879

…ister This patch reworks the fix from D20627 "Do not rename registers that do not start an independent live range". That fix depended on the scheduler dependency graph having redundant edges. Those edges are removed by D156552 "[MachineScheduler] Track physical register dependencies per-regunit" with the result that on several Hexagon lit tests, the post-RA scheduler would schedule the code in a way that fails machine verification.

Consider this code where D11 is a pair R23:R22:

    SU(0): %R2<def> = A2_add %R23, %R17<kill>
      (anti dependency on R23 here)
    SU(8): %R23<def> = S2_asr_i_r %R22, 31
      (data dependency on R23->D11 here)
    SU(10): %D0<def> = A2_tfrp %D11<kill>

The original fix would detect this situation by examining the dependency from SU(8) to SU(10) and seeing that D11 is not a subreg of R23.

A slightly more complicated example:

    SU(0): %R2<def> = A2_add %R23, %R17<kill>
      (anti dependency on R23 here)
    SU(8): %R23<def> = S2_asr_i_r %R22, 31
      (data dependency on R23 here)
    SU(9): %R23<def> = S2_asr_i_r %R23, 31
      (data dependency on R23->D11 here)
    SU(10): %D0<def> = A2_tfrp %D11<kill>

The original fix also worked on this example, but only because ScheduleDAGInstrs adds an extra data dependency edge directly from SU(8) to SU(10). This edge is redundant, since you could infer it transitively from the edges SU(8)->SU(9) and SU(9)->SU(10), and since none of the data that SU(8) writes to R23 is read by SU(10).

After D156552 the redundant edge SU(8)->SU(10) will not be present, so when we examine the successors of SU(8) we will not find any that read from a superreg of R23.

This patch removes the original fix from D20627, which examined edges in the dependency graph. Instead it extends a check that was already being done in FindSuitableFreeRegisters: instead of checking that *some* register is a superreg of all registers in the rename group, we now check that the specific register that carries the anti-dependency that we want to break is a superreg of all registers in the rename group.

Differential Revision: https://reviews.llvm.org/D156880

Change the scheduler's physical register dependency tracking from registers-and-their-aliases to regunits. This has a couple of advantages when subregisters are used:

- The dependency tracking is more accurate and creates fewer useless edges in the dependency graph. An AMDGPU example, edited for clarity:

      SU(0): $vgpr1 = V_MOV_B32 $sgpr0
      SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1
      SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 $vgpr0_vgpr1, 0, 0

  There is a data dependency on $vgpr1 from SU(0) to SU(1) and from SU(1) to SU(2). But the old dependency tracking code also added a useless edge from SU(0) to SU(2) because it thought that SU(0)'s def of $vgpr1 aliased with SU(2)'s use of $vgpr0_vgpr1.
- On targets like AMDGPU that make heavy use of subregisters, each register can have a huge number of aliases - it can be quadratic in the size of the largest defined register tuple. There is a much lower bound on the number of regunits per register, so iterating over regunits is faster than iterating over aliases.

The LLVM compile-time tracker shows a tiny overall improvement of 0.03% on X86. I expect a larger compile-time improvement on targets like AMDGPU.

Recommit after fixing AggressiveAntiDepBreaker in D156880.

Differential Revision: https://reviews.llvm.org/D156552
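A toy sketch of per-regunit tracking applied to the AMDGPU example. It assumes, for illustration only, that each 32-bit VGPR is a single regunit and that `$vgpr0_vgpr1` consists of the units `vgpr0` and `vgpr1`; `SchedUnit`, `buildDataEdges` and `amdgpuExample` are hypothetical names, not the ScheduleDAGInstrs implementation.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Simplified model: a register is just the list of regunits it covers.
using RegUnits = std::vector<std::string>;

struct SchedUnit {
  RegUnits Uses; // Regunits read by the instruction.
  RegUnits Defs; // Regunits written by the instruction.
};

// Adds a data edge (Def SU -> Use SU) from the most recent def of each regunit.
std::set<std::pair<int, int>> buildDataEdges(const std::vector<SchedUnit> &SUs) {
  std::map<std::string, int> LastDef; // Regunit -> index of the last defining SU.
  std::set<std::pair<int, int>> Edges;
  for (int I = 0; I < static_cast<int>(SUs.size()); ++I) {
    for (const std::string &Unit : SUs[I].Uses) {
      auto It = LastDef.find(Unit);
      if (It != LastDef.end())
        Edges.insert({It->second, I});
    }
    for (const std::string &Unit : SUs[I].Defs)
      LastDef[Unit] = I;
  }
  return Edges;
}

// The three instructions from the example, expressed as regunit lists.
std::vector<SchedUnit> amdgpuExample() {
  std::vector<SchedUnit> SUs;
  SUs.push_back(SchedUnit{RegUnits{"sgpr0"}, RegUnits{"vgpr1"}});
  SUs.push_back(SchedUnit{RegUnits{"vgpr1"}, RegUnits{"vgpr1"}});
  SUs.push_back(SchedUnit{RegUnits{"vgpr0", "vgpr1"}, RegUnits{"vgpr0", "vgpr1"}});
  return SUs;
}
```

In this model only SU(0)->SU(1) and SU(1)->SU(2) are produced, because the last def of unit `vgpr1` seen by SU(2) is SU(1); no SU(0)->SU(2) edge appears.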

A broadcast base pointer is the same as a scalar base pointer for GEP semantics (when there's at least one other vector operand). This is the form that SLP likes to emit, so we should handle it. Differential Revision: https://reviews.llvm.org/D157132

If the Envar is set to true (default), busy HSA queues will be actively avoided when assigning a queue to a Stream. Otherwise, we will initialize a new HSA queue for each requested Stream, then default to round robin once the set maximum has been reached. Reviewed By: jdoerfert, kevinsala Differential Revision: https://reviews.llvm.org/D156996

If we have a dominant value, we can still use a v(f)slide1down to handle the last value in the vector if that value is neither undef nor the dominant value. Note that we can extend this idea to any tail of elements, but that ends up being a near complete merge of the v(f)slide1down insert path, and requires a bit more untangling of the profitability heuristics first. Differential Revision: https://reviews.llvm.org/D157120

If the shl has either nuw or nsw flags, then we know that bits cannot be shifted out, so a power of two cannot become zero. Proofs: https://alive2.llvm.org/ce/z/4QfebE
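As a rough sanity check alongside the Alive2 proof, the claim can be brute-forced for 8-bit values. This is only a sketch: the helper names are hypothetical, and `nuw`/`nsw` are modelled as "the exact unsigned (resp. signed) result fits in 8 bits".

```cpp
#include <cstdint>

bool isPowerOfTwo(uint8_t X) { return X != 0 && (X & (X - 1)) == 0; }

// nuw holds when the exact unsigned result still fits in 8 bits.
bool shlIsNUW(uint8_t X, unsigned S) {
  return (static_cast<unsigned>(X) << S) <= 0xFFu;
}

// nsw holds when the exact signed result still fits in a signed 8-bit value.
bool shlIsNSW(uint8_t X, unsigned S) {
  int Exact = static_cast<int>(static_cast<int8_t>(X)) * (1 << S);
  return Exact >= -128 && Exact <= 127;
}

// Wrapping 8-bit shift, i.e. what shl computes when no flag is violated.
uint8_t shl8(uint8_t X, unsigned S) {
  return static_cast<uint8_t>(static_cast<unsigned>(X) << S);
}

// Counts (power of two, shift) pairs where a flag holds but the result is zero.
int countCounterexamples() {
  int Count = 0;
  for (unsigned V = 1; V <= 0xFFu; ++V) {
    uint8_t X = static_cast<uint8_t>(V);
    if (!isPowerOfTwo(X))
      continue;
    for (unsigned S = 0; S < 8; ++S)
      if ((shlIsNUW(X, S) || shlIsNSW(X, S)) && shl8(X, S) == 0)
        ++Count;
  }
  return Count;
}
```

Without either flag the property fails: `shl8(0x80, 1)` is zero, and neither `shlIsNUW(0x80, 1)` nor `shlIsNSW(0x80, 1)` holds.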

…o for vscale" Logic is incorrect. Shift can make non-zero pow2 zero. This reverts commit 9c837b7.

P2408 requires this for C++23, but implementing it in C++20 is safe because the only code impacted would be code that violated a precondition of the parallel algorithm. It was the intent of P2408 to enable implementations to backport this to C++20. Closes #63447. Reviewed By: philnik, #libc Differential Revision: https://reviews.llvm.org/D154305

This reverts commit 4097a24. Breaks tests on macOS, see https://reviews.llvm.org/rG4097a2458412#1235854

Most of the implementations are copied from linux.cpp and we will be keeping those memory functions in linux.cpp for a while until we are able to switch to use MemMap completely. The remaining part is SizeClassAllocator32 which hasn't been switched to use MemMap interface Reviewed By: cferris Differential Revision: https://reviews.llvm.org/D146453

Reviewed By: #libc, philnik Differential Revision: https://reviews.llvm.org/D157213

Use an O(N log N) instead of an O(N^2) (N <= 32) sorting approach and do not try to revectorize all possible combinations of stores, if they definitely cannot be combined because of mem/data dependencies.

Compile time (O3 + lto, skylake_avx512):

    External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test  117.15  120.11  2.5%
    External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test  203.67  207.42  1.8%
    External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test  232.43  235.01  1.1%
    External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test  205.49  207.25  0.9%
    External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test  310.46  306.23  -1.4%

Link time (O3+lto, skylake_avx512):

    External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test  1383.69  1475.94  6.7%

Other changes are too small, cannot rely on them.

size..text:

    Program  results  results0  diff
    test-suite :: SingleSource/Regression/C/Regression-C-sumarray.test  392.00  1439.00  267.1%
    test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test  394258.00  394818.00  0.1%
    test-suite :: MultiSource/Applications/JM/lencod/lencod.test  846355.00  847075.00  0.1%
    test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test  782816.00  783360.00  0.1%
    test-suite :: External/SPEC/CFP2017rate/508.namd_r/508.namd_r.test  779667.00  779923.00  0.0%
    test-suite :: MultiSource/Benchmarks/mafft/pairlocalalign.test  224398.00  224446.00  0.0%
    test-suite :: MultiSource/Applications/oggenc/oggenc.test  185019.00  185035.00  0.0%
    test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test  12487610.00  12488010.00  0.0%
    test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test  1051772.00  1051804.00  0.0%
    test-suite :: MultiSource/Applications/SPASS/SPASS.test  529586.00  529602.00  0.0%
    test-suite :: External/SPEC/CINT2006/400.perlbench/400.perlbench.test  1084684.00  1084716.00  0.0%
    test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test  1014245.00  1014261.00  0.0%
    test-suite :: MultiSource/Benchmarks/MallocBench/espresso/espresso.test  223494.00  223478.00  -0.0%
    test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test  660843.00  660795.00  -0.0%
    test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test  660843.00  660795.00  -0.0%
    test-suite :: MultiSource/Applications/ClamAV/clamscan.test  568824.00  568760.00  -0.0%

espresso - 2 more stores vectorized
x264 - small number of changes in 3-4 functions, generated a bit more vector stores (2 4x zeroinitializer stores + some other small variations).
clamscan - emitted 32xi8 store instead of several scalar stores + several 4x-8x stores.

Differential Revision: https://reviews.llvm.org/D155246

Use APInt directly instead. Depends On D150880 Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D154430

This reverts commit ec70627. Reverting due to CI failure

We had some load/store patterns split because EEW=64 needed a different predicate. Refactor where the foreach is placed and use the foreach value to pick the predicate. Reviewed By: wangpc Differential Revision: https://reviews.llvm.org/D157176

…le file Distinguish between copyin and copyin with the readonly modifier. Depends on D157121 Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D157125

Currently when a stack access is out of range of an sp-relative ldr or str then we jump straight to generating the offset with a literal pool load or mov32 pseudo-instruction. This patch improves that in two ways:

* If the offset is within range of sp-relative add plus an ldr then use that.
* When we use the mov32 pseudo-instruction, if putting part of the offset into the ldr will simplify the expansion of the mov32 then do so.

Differential Revision: https://reviews.llvm.org/D156875

ruamel.yaml had a potential security issue (which may also be a false positive in the scanner). Related to #64417 llvm/llvm-project#64417 Reviewed By: avogelsgesang Differential Revision: https://reviews.llvm.org/D157284

Prior to this diff, names in the `__llvm_prf_names` section had the format `[<filepath>:]<function-name>`, e.g., `main.cpp:foo`, `bar`. `<filepath>` is used to discriminate between possibly identical function names when linkage is local and `<function-name>` simply comes from `F.getName()`. This has two problems:

* `:` is commonly found in Objective-C functions so that names like `main.mm:-[C foo::]` and `-[C bar::]` are difficult to parse
* `<function-name>` might be different from the linkage name, so it cannot be used to pass a function order to the linker via `-symbol-ordering-file` or `-order_file` (see https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068)

Instead, this diff changes the format to `[<filepath>;]<linkage-name>`, e.g., `main.cpp;_foo`, `_bar`. The hope is that `;` won't realistically be found in either `<filepath>` or `<linkage-name>`.

To prevent invalidating all prior IRPGO profiles, we also lookup the prior name format when a record is not found (see `InstrProfSymtab::create()`, `readMemprof()`, and `getInstrProfRecord()`). It seems that Swift and Clang FE-PGO rely on the original `getPGOFuncName()`, so we cannot simply replace it.

Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D156569
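The two name formats and the fallback lookup can be sketched as follows. The function names and the `std::map` used as a profile are hypothetical; the real logic lives in the LLVM functions named above and is not reproduced here.

```cpp
#include <map>
#include <string>

// New format: [<filepath>;]<linkage-name>. The path prefix is only added for
// local linkage, where names may collide across files.
std::string makeNewName(const std::string &FilePath,
                        const std::string &LinkageName, bool IsLocal) {
  return IsLocal ? FilePath + ";" + LinkageName : LinkageName;
}

// Prior format: [<filepath>:]<function-name>.
std::string makeLegacyName(const std::string &FilePath,
                           const std::string &FunctionName, bool IsLocal) {
  return IsLocal ? FilePath + ":" + FunctionName : FunctionName;
}

// Tries the new key first and falls back to the legacy key, so that profiles
// written with the old format stay usable. Returns nullptr if neither is found.
const int *lookupRecord(const std::map<std::string, int> &Profile,
                        const std::string &NewName,
                        const std::string &LegacyName) {
  auto It = Profile.find(NewName);
  if (It == Profile.end())
    It = Profile.find(LegacyName);
  return It == Profile.end() ? nullptr : &It->second;
}
```

With a profile that only contains the old key `main.cpp:foo`, looking up `main.cpp;_foo` misses and the legacy key is then found.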

…penACC declare Depends on D156828 Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D156829

…nstructions Fix a bug introduced in a previous commit. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156380

…nstructions, and (un)merge instructions for narrow types Test legalization for (s7, s8, s16, s32, s48, s64, s96) for rv32, (s8, s15, s16, s32, s64, s72, s128, s192) for rv64. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156383

Anonymous unions should be transparent wrt `[[clang::trivial_abi]]`. Consider the test input below:

```
struct [[clang::trivial_abi]] Trivial {
  Trivial() {}
  Trivial(Trivial&& other) {}
  Trivial& operator=(Trivial&& other) { return *this; }
  ~Trivial() {}
};
static_assert(__is_trivially_relocatable(Trivial), "");

struct [[clang::trivial_abi]] S2 {
  S2(S2&& other) {}
  S2& operator=(S2&& other) { return *this; }
  ~S2() {}
  union { Trivial field; };
};
static_assert(__is_trivially_relocatable(S2), "");
```

Before the fix Clang would warn that 'trivial_abi' is disallowed on 'S2' because it has a field of a non-trivial class type (the type of the anonymous union is non-trivial, because it doesn't have the `[[clang::trivial_abi]]` attribute applied to it). Consequently, before the fix the `static_assert` about `__is_trivially_relocatable` would fail.

Note that `[[clang::trivial_abi]]` cannot be applied to the anonymous union, because Clang warns that 'trivial_abi' is disallowed on '(unnamed union at ...)' because its copy constructors and move constructors are all deleted. Also note that it is impossible to provide copy or move constructors for anonymous unions and structs.

Reviewed By: gribozavr2 Differential Revision: https://reviews.llvm.org/D155895

AggressiveInstCombine fix typo in expandStrcmp method. Differential Revision: https://reviews.llvm.org/D156556

The patch in D155211 added basic support for the `.alias` keyword in PTX. This means we should be able to permit use of this in clang. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D156014

This is very simplistic and could be more thorough by replacing an existing `LIBC_INLINE` in the wrong location or a redundant `inline` when inserting the right macro use. But as is, this suffices to automatically apply fixes for most or all of the instances in the libc tree today and get working results (despite some superfluous `inline` keywords left behind). Reviewed By: abrachet Differential Revision: https://reviews.llvm.org/D157164

This was generated using clang-tidy and clang-apply-replacements, on src/string/*.cpp for just the llvmlibc-inline-function-decl check, after applying https://reviews.llvm.org/D157164, and then some manual fixup. Reviewed By: abrachet Differential Revision: https://reviews.llvm.org/D157169

`cacheflush` is only defined with __USE_MISC, which depends on _DEFAULT_SOURCE, _GNU_SOURCE, _BSD_SOURCE, or _SVID_SOURCE. If CC is called with -std=c11, these macros won't be defined. Let's use _flush_cache, which is always defined. Reviewed By: brad, jrtc27 Differential Revision: https://reviews.llvm.org/D156072

In D128337, the spelling of CheckOptions was updated to support a more natural dictionary syntax. This patch just updates all test files to use the new syntax. Reviewed By: PiotrZSL Differential Revision: https://reviews.llvm.org/D130209

This patch just updates all test files to use the new syntax. It covers changes introduced after D130209 was created.

Summary: This test was accidentally not updated.

…abi]]`." This reverts commit bddaa35. Reverting as requested at https://reviews.llvm.org/D155895#4566945 (for breaking tests on Windows).

There are cases where the -1 doesn't become visible until lowering so the folding doesn't have a chance to run. I think in these cases there is a missed DAGCombine for truncate (undef), which I may fix separately, but RISC-V backend should protect itself. Fixes #64503. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D157314

In https://reviews.llvm.org/D156569 we changed the format of the IRPGO counter names which broke some macOS tests because the `__profc_` variable names changed. Use `{{_?}}` to allow mangled names to be prefixed with `_` to pass tests. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D157321

Set the halt_on_error runtime flag to make TSan errors fatal when running the test suite. For the API tests the environment variables are set conditionally on whether the TSan is enabled. The Shell and Unit tests don't have that logic but setting the environment variable is harmless. For consistency, I've also mirrored the ASAN option (detect_stack_use_after_return=1) for the Shell tests. Differential revision: https://reviews.llvm.org/D157152

Like for x86_64-linux-gnu, these need to be disabled for aarch64-linux-gnu. Differential Revision: https://reviews.llvm.org/D156815

getWildcardRegex() guarantees that only valid hex numbers are matched by FileCheck numeric expressions. This commit therefore only asserts the lack of parsing failure in valueFromStringRepr(). Depends On D154430 Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D154431

Needed for 019a477

This commit changes the `c++xx-or-later` definitions to also include C++23 and the upcoming C++26. Also, this commit adjusts a couple of test cases slightly:

* `container-contains.cpp` now also tests newer C++ versions. Restricting it to C++20 was an oversight of mine when originally writing this check.
* `unconventional-assign-operator.cpp`: The `return rhs` raised a "non-const lvalue reference to type 'BadReturnStatement' cannot bind to a temporary" error in C++23. The issue is circumvented by writing `return *&rhs`.
* `const-correctness-values.cpp` was also running into the same error in C++23. The troublesome test cases were moved to a separate file.

Differential Revision: https://reviews.llvm.org/D157246

This avoids narrowing after it has been expanded to shifts. The G_SEXT_INREG narrowing can use the second operand of the instruction to optimize the narrowing. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D157172

Some printf implementations perform a null check on pointers passed to %s. While that's not in the standard, this patch adds it as an option for compatibility. It also puts a similar check in %n behind the same flag. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D156923

Consider the statement:

    int x = -1;

And the following AST:

    `-VarDecl 0x55c4823a7670 <x.c:2:1, col:10> col:5 x 'int' cinit
      `-UnaryOperator 0x55c4823a7740 <col:9, col:10> 'int' prefix '-'
        `-IntegerLiteral 0x55c4823a7720 <col:10> 'int' 1

Return the evaluation of the subexpression negated.

Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D156378

Removed some of the warning suppression needed for the multi-arg macro logic by making the number of arguments the same everywhere. Also removes some verbose comments and obvious TODOs. Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D157327

… casts Consider the following statements:

    long x = 1;
    short y = 1;

With the following AST:

    |-VarDecl 0x55d289973730 <x.c:1:1, col:10> col:6 x 'long' cinit
    | `-ImplicitCastExpr 0x55d289973800 <col:10> 'long' <IntegralCast>
    |   `-IntegerLiteral 0x55d2899737e0 <col:10> 'int' 1
    `-VarDecl 0x55d289973830 <line:2:1, col:11> col:7 y 'short' cinit
      `-ImplicitCastExpr 0x55d2899738b8 <col:11> 'short' <IntegralCast>
        `-IntegerLiteral 0x55d289973898 <col:11> 'int' 1

Sign or Zero extend or truncate based on the source signedness and destination width.

Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D156466
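The extend-or-truncate rule can be sketched on raw bit patterns. `maskToWidth` and `castInteger` are hypothetical helpers operating on `uint64_t` (widths from 1 to 64), not the Clang constant-evaluation code, which works on its own integer types.

```cpp
#include <cstdint>

// Keeps only the low Width bits of V (Width is assumed to be in [1, 64]).
uint64_t maskToWidth(uint64_t V, unsigned Width) {
  return Width >= 64 ? V : V & ((uint64_t{1} << Width) - 1);
}

// Converts the SrcWidth-bit value in Bits to DstWidth bits:
//  - wider destination, signed source   -> sign extend
//  - wider destination, unsigned source -> zero extend
//  - narrower or equal destination      -> truncate
uint64_t castInteger(uint64_t Bits, unsigned SrcWidth, bool SrcSigned,
                     unsigned DstWidth) {
  uint64_t V = maskToWidth(Bits, SrcWidth);
  if (DstWidth > SrcWidth && SrcSigned) {
    bool SignBit = ((V >> (SrcWidth - 1)) & 1) != 0;
    if (SignBit)
      V |= ~uint64_t{0} << SrcWidth; // Fill the upper bits with the sign.
  }
  return maskToWidth(V, DstWidth);
}
```

For the statements above, assuming a 32-bit `int`, a 64-bit `long` and a 16-bit `short`, `castInteger(1, 32, true, 64)` and `castInteger(1, 32, true, 16)` both yield 1; a negative source such as the 8-bit pattern `0xFF` sign-extends to `0xFFFF` at 16 bits.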

This patch is large, but is almost entirely just adding casts to calls to syscall_impl. Much of the work was done programmatically, with human checking when the syntax or types got confusing. Reviewed By: mcgrathr Differential Revision: https://reviews.llvm.org/D156950

Legalize G_SHL, G_ASHR and G_LSHR for types narrower than and up to (and including) XLen: (i7, i8, i16 and i32) for rv32 and (i8, i15, i16, i32 and i64) for rv64. This requires adding some rules to handle G_ANYEXT, G_ZEXT and G_SEXT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D155772

…thods that are definitions without having a body DeclPrinter used FunctionDecl::isThisDeclarationADefinition to decide if the decl requires a semicolon at the end. However, there are several methods without body (that require a semicolon) that are definitions. Fixes llvm/llvm-project#62996 Initial commit had a failing test case on targets not supporting `__attribute__((alias))`. Added `-triple i386-linux-gnu` to the specific test case. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D156533

DynamicLoader::LoadBinaryWithUUIDAndAddress can create a Module based on the binary image in memory, which in some cases contains symbol names and can be genuinely useful. If we don't have a filename, it creates a name in the form `memory-image-0x...` with the header address. In practice, this is most useful with Darwin userland corefiles where the binary was stored in the corefile in whole, and we can't find a binary with the matching UUID. Using the binary out of the corefile memory in this case works well. But in other cases, akin to firmware debugging, we merely end up with an oddly named binary image and no symbols. Add a flag to control whether we will create these memory images and add them to the Target or not; only set it to true when working with a userland Mach-O image with the "all image infos" LC_NOTE for a userland corefile. Differential Revision: https://reviews.llvm.org/D157167

Commit c192b3d missed some targets when fixing standalone header parsing after 019a477.

This fixes overestimating code size. This was broken by 79f52af. https://reviews.llvm.org/D157103

https://reviews.llvm.org/D157108

…n-written) constructor initializers DeclPrinter::PrintConstructorInitializers did output non-written constructor initializers. In particular, implicit constructor initializers of base classes were output. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D156523

Updated status of alignment clause for allocate directive in OpenMP features table, section OpenMP 5.1 Implementation Details. Differential Revision: https://reviews.llvm.org/D157135

Legalize G_AND, G_OR, G_XOR for (s7, s48) on rv32 and (s15, s72) on rv64 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157017

Size must be multiple of Alignment. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D157247

Legalize G_ADD, G_SUB, G_(S/U)ADD(O/E). We test for (s7, s48, s64, s96) on rv32 and (s15, s72, s128, s192) on rv64. Differential Revision: https://reviews.llvm.org/D157019

Test legalization for (i7, i8, i16, i32, i48, i64) on rv32 and for (i8, i15, i16, i32, i64, i72, i128) on rv64. Legalization fails for i96 on rv32 and i192 on rv64. Note that [i192 fails for AArch64](llvm/llvm-project#64394). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157023

Introduce test cases for folding `select` of `srem` and conditional add. Differential Revision: https://reviews.llvm.org/D156862

Simplify a pattern that may show up when computing the remainder of euclidean division. Particularly, when the divisor is a power of two and never negative, the signed remainder can be folded with a bitwise and. Fixes 64305. Proofs: https://alive2.llvm.org/ce/z/9_KG6c Differential Revision: https://reviews.llvm.org/D156811
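A small brute-force sketch of the equivalence, complementing the Alive2 proof. It assumes the pattern is `r = x srem n; r < 0 ? r + n : r` with `n` a positive power of two; the function names are hypothetical and only 8-bit inputs are checked.

```cpp
// Euclidean-style remainder written as srem plus a conditional add.
// C++ '%' truncates toward zero, like LLVM's srem.
int remViaSelect(int X, int N) {
  int R = X % N;
  return R < 0 ? R + N : R;
}

// The folded form: mask with N - 1 (valid when N is a positive power of two).
int remViaAnd(int X, int N) {
  return static_cast<int>(static_cast<unsigned>(X) &
                          static_cast<unsigned>(N - 1));
}

// Counts disagreements over all signed 8-bit X and power-of-two N in [1, 64].
int countMismatches() {
  int Count = 0;
  for (int N = 1; N <= 64; N *= 2)
    for (int X = -128; X <= 127; ++X)
      if (remViaSelect(X, N) != remViaAnd(X, N))
        ++Count;
  return Count;
}
```

For example, with `x = -5` and `n = 4` the remainder is `-1`, the conditional add yields `3`, and `-5 & 3` is also `3`.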

…is less than 32 bits We have a variant of this for splats already, but hadn't handled the case where a single copy of the wider element can be inserted producing the entire required bit pattern. This shows up mostly in very small vector shuffle tests. Differential Revision: https://reviews.llvm.org/D157299

…ation. (NFC) Pre-commit test for D157184. Differential Revision: https://reviews.llvm.org/D157177

…coro-split functions in the debug info. This patch adds the linkage name update to DISubprogram's declaration after 6ce76ff. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D157184

Differential Revision: https://reviews.llvm.org/D152981

In SystemZTTIImpl::getMemoryOpCost, the call to getNumberOfParts will run type legalization, which can't handle structs. So before that, we check for an unknown value type and forward to BaseT, just like many other targets do in this situation. https://bugzilla.redhat.com/show_bug.cgi?id=2224885 Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D156379

…GCC behaviour GCC doesn't ignore non-zero-length array of empty structures in C++ while clang does. What this patch did is to match GCC's behaviour although this rule is not documented in psABI. Similar to D142327 for RISCV. Reviewed By: xry111, xen0n Differential Revision: https://reviews.llvm.org/D156116

The removal of the early return in 96832a6 was an error: it would include the 'standalone' library that's not used by linux. Instead we reproduce the library path handling in the linux/musl block. Differential Revision: https://reviews.llvm.org/D156771

Keeps track of CallsToRetrieve and how many of them are SuccessfulRetrieves from cached block allocations. Dumps this data in the MapAllocatorCache::getStats() function. Reviewed By: cferris, Chia-hungDuan Differential Revision: https://reviews.llvm.org/D157154
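
As a rough illustration of the bookkeeping described above, here is a toy cache with the two counters. `ToyBlockCache` and its members are hypothetical stand-ins, not the real `MapAllocatorCache`; only the counter names are taken from the commit message, and the output format of `getStats()` is made up for the sketch.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for a cache of freed blocks.
class ToyBlockCache {
public:
  // Put one block into the cache.
  void store() { ++Cached; }

  // Try to take a block out of the cache; every attempt is counted,
  // and hits are counted separately.
  bool retrieve() {
    ++CallsToRetrieve;
    if (Cached == 0)
      return false;
    --Cached;
    ++SuccessfulRetrieves;
    assert(SuccessfulRetrieves <= CallsToRetrieve);
    return true;
  }

  // Report both counters as text (format chosen for this sketch only).
  std::string getStats() const {
    return "CallsToRetrieve: " + std::to_string(CallsToRetrieve) +
           ", SuccessfulRetrieves: " + std::to_string(SuccessfulRetrieves);
  }

private:
  unsigned Cached = 0;
  unsigned CallsToRetrieve = 0;
  unsigned SuccessfulRetrieves = 0;
};
```

Comparing the two counters gives a hit rate, which is presumably the point of dumping them together.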

Differential Revision: https://reviews.llvm.org/D157349

TSan reports the following data race: Write of size 4 at 0x000109e0b160 by thread T2 (mutexes: write M0, write M1): #0 NativeFile::Close() File.cpp:329 #1 ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:232 #2 Communication::Disconnect(lldb_private::Status*) Communication.cpp:61 #3 process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1164 #4 Process::SetExitStatus(int, char const*) Process.cpp:1097 #5 process_gdb_remote::ProcessGDBRemote::MonitorDebugserverProcess(...) ProcessGDBRemote.cpp:3387 Previous read of size 4 at 0x000109e0b160 by main thread (mutexes: write M2): #0 NativeFile::IsValid() const File.h:393 #1 ConnectionFileDescriptor::IsConnected() const ConnectionFileDescriptorPosix.cpp:121 #2 Communication::IsConnected() const Communication.cpp:79 #3 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...) GDBRemoteCommunication.cpp:256 #4 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...l) GDBRemoteCommunication.cpp:244 #5 process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponseNoLock(llvm::StringRef, StringExtractorGDBRemote&) GDBRemoteClientBase.cpp:246 The problem is that in WaitForPacketNoLock's run loop, it checks that the connection is still connected. This races with the ConnectionFileDescriptor disconnecting. Most (but not all) access to the IOObject in ConnectionFileDescriptorPosix is already gated by the mutex. This patch just protects IsConnected in the same way. Differential revision: https://reviews.llvm.org/D157347
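
The shape of the fix can be sketched in isolation. `ToyConnection` below is a hypothetical, simplified stand-in for a connection object that owns a descriptor; it is not the LLDB class, and the use of a plain `std::mutex` is an assumption of the sketch.

```cpp
#include <cassert>
#include <mutex>

// Simplified connection: one descriptor, shared between threads.
class ToyConnection {
public:
  explicit ToyConnection(int FD) : FD(FD) { assert(FD >= 0); }

  // Writer: may run on a monitoring thread when the process exits.
  void disconnect() {
    std::lock_guard<std::mutex> Guard(Mutex);
    FD = -1;
  }

  // Reader: polled from another thread's packet loop. Taking the same
  // mutex as disconnect() removes the unsynchronized read.
  bool isConnected() const {
    std::lock_guard<std::mutex> Guard(Mutex);
    return FD >= 0;
  }

private:
  mutable std::mutex Mutex; // mutable so const readers can lock it
  int FD;
};
```

The lock only makes each individual check race-free; a caller that checks `isConnected()` and then acts on the result can still observe a disconnect in between.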

…ize()` This method should not load new dialect or affect the context itself. Differential Revision: https://reviews.llvm.org/D157198

… (NFC) It isn't mutated during the conversion already, communicate this through the API. Differential Revision: https://reviews.llvm.org/D157199

The multiple -convert-XXX-to-llvm passes are really nice testing tools for individual dialects, but the expectation is that a proper conversion should assemble the conversion patterns using the `populateXXXToLLVMConversionPatterns()` APIs. However, most customers just chain the conversion passes for convenience. This pass makes it more transparent and composable to assemble the required patterns for conversion to the LLVM dialect by using an interface. The pass scans the input and collects all the dialects present, and for those that implement the `ConvertToLLVMPatternInterface` it uses the interface to populate the conversion patterns, and possibly the conversion target. Since these conversions can involve intermediate dialects, or target dialects other than LLVM (for example AVX or NVVM), this pass can't statically declare the required `getDependentDialects()` before the pass pipeline begins. This is worked around by using an extension in the dialectRegistry that is invoked for every newly loaded dialect in the context. This allows looking up the interface ahead of time and using it to query the dependent dialects. Differential Revision: https://reviews.llvm.org/D157183

…ared library build (NFC)

This reverts commit 0bdbe7b because it broke the bots.

Add import of 'DependentSizedExtVectorType'. Reviewed By: balazske Differential Revision: https://reviews.llvm.org/D157238

Add import of ConvertVectorExpr. Reviewed By: balazske Differential Revision: https://reviews.llvm.org/D157249

This reverts commit dfdfd30. A wrong warning was reported, so this change has to be reconsidered. Differential Revision: https://reviews.llvm.org/D157352

@plt

PPC32 -fpic/-fPIC generates `bl __tls_get_addr(x@tlsgd)@PLT` or `bl __tls_get_addr(x@tlsgd)@plt+32768`. `powerpc-linux-gnu-gcc -fPIC` generates `bl __tls_get_addr+32668(x@tlsgd)@plt`. These expressions can be parsed by GNU assembler but not by the integrated assembler. Add the support. Differential Revision: https://reviews.llvm.org/D153206

This reverts commit aa4cd66. This will resolve the following failures:

```
SanitizerCommon-asan-i386-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-asan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-lsan-i386-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-lsan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-msan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
```

B extension has been removed. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157353

The number of values yielded from a LinalgOp's payload has to match the number of inits / outs operands of the LinalgOp. These two numbers got mixed up in the respective error message, this patch clarifies the message and updates the tests. Reviewed By: nicolasvasilache, mehdi_amini Differential Revision: https://reviews.llvm.org/D153124

Reviewed By: zixuan-wu Differential Revision: https://reviews.llvm.org/D154768

These tests will be optimized with IXH32/IXW32/IXD32 in the future. Reviewed By: zixuan-wu Differential Revision: https://reviews.llvm.org/D154332

Optimize "Rx * imm" for specific immediates to ([IXH32|IXW32|IXD32] (LSLI Rx, shift), Rx). Reviewed By: zixuan-wu Differential Revision: https://reviews.llvm.org/D154768

…rgerRelation This patch implements findSymbolicIntegerLexMin/Max for PresburgerRelation Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D156623

In line 544, if we go into isFalse, then the Kind could be ICTriangleFalse and isRev must be false, so we never take the true branch in line 545; it is better to remove it. Reviewed By: skan, pengfei Differential Revision: https://reviews.llvm.org/D157260

This patch fixes the reported regression caused by D146358 through adding notes about an uninitialized base class when we diagnose uninitialized constructor. This also changes the wording from the old one in order to make it clear that the uninitialized subobject is a base class and its constructor is not called. Wording changes: BEFORE: `subobject of type 'Base' is not initialized` AFTER: `constructor of base class 'Base' is not called` Fixes llvm/llvm-project#63496 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D153969

This patch adds all the language-level function keywords defined in: ARM-software/acle#188 (merged) ARM-software/acle#261 (update after D148700 landed) The keywords are used to control PSTATE.ZA and PSTATE.SM, which are respectively used for enabling the use of the ZA matrix array and Streaming mode. This information needs to be available on call sites, since the use of ZA or streaming mode may have to be enabled or disabled around the call-site (depending on the IR attributes set on the caller and the callee). For calls to functions from a function pointer, there is no IR declaration available, so the IR attributes must be added explicitly to the call-site. With the exception of '__arm_locally_streaming' and '__arm_new_za' the information is part of the function's interface, not just the function definition, and thus needs to be propagated through the FunctionProtoType::ExtProtoInfo. This patch adds the definitions of these keywords, as well as codegen and semantic analysis to ensure conversions between function pointers are valid and that no conflicting keywords are set. For example, '__arm_streaming' and '__arm_streaming_compatible' are mutually exclusive. Differential Revision: https://reviews.llvm.org/D127762

@jdoerfert

OMPD is already pushed to the LLVM repo through https://reviews.llvm.org/D100181 . Currently, it supports the OpenMP 5.0 standard for the host on Linux machines. Reviewed By: @jdoerfert Differential Revision: https://reviews.llvm.org/D156878

These tests are touched in D149757 and to reduce the number of changes in that patch, the tests are updated here. The test format is fixed according to the rules:

- `#` for actual comments;
- RUN and CHECK lines are specified without `#`;
- all comment markers should have a space between them and the rest of the line (e.g. `# This is a comment`).

In some cases lines are reordered to make CHECK commands closer to the corresponding RUN lines. No other changes are made. Differential Revision: https://reviews.llvm.org/D155943

Update documentation now that macros are laid out in a more structured way. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D143911

…uires-expression Per CWG 2672 substitution failure within the body of a lambda inside a requires-expression should be a hard error. Fixes llvm/llvm-project#64138 Reviewed By: cor3ntin Differential Revision: https://reviews.llvm.org/D156993

This test was trying to detect a system installation of CUDA and was marked as returning exit code 1 as part of D156363. Pass an explicit CUDA installation to make the test return exit code 0 regardless of whether a CUDA installation is found on the system. Also add an explicit -march to get a stable test.

Makes the constructors a bit more flexible, to be used in D157194 & D157144.

Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157293

This reuses the same strategy for fixed vectors as other ops, i.e. custom lower to a scalable *_vl SD node. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157294

We need to add new VL nodes to mirror ISD::ROTL and ISD::ROTR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157295

Just a whitespace cleanup.

…le vs truncation Pulled out of LowerTruncateVecPackWithSignBits - prefer shuffles unless we can cheaply split the vector. ComputeNumSignBits struggles with vXi64 through bitcasts, so we're usually better off with shuffles.

…ing. Ensure expectOperationValueResult performs the is_integral_v as constexpr to prevent MSVC getting confused between the mixture of integer / string constructors in the if-else. Warning introduced in D150880
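
A minimal sketch of the technique, assuming C++17: `describeValue` and `selfCheck` are hypothetical helpers, not the test utility touched by the commit. With `if constexpr`, only the branch matching the type is instantiated, so the integer path never sees the string constructor and vice versa.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Render a value that is either an integer or something string-like.
template <typename T>
std::string describeValue(const T &V) {
  if constexpr (std::is_integral_v<T>) {
    // Instantiated only for integral T.
    return "int:" + std::to_string(V);
  } else {
    // Instantiated only for non-integral T, so std::string(V) is never
    // attempted with an integer argument.
    return "str:" + std::string(V);
  }
}

// Small usage check covering both branches.
inline void selfCheck() {
  assert(describeValue(7) == "int:7");
  assert(describeValue("seven") == "str:seven");
}
```

In this sketch a plain `if` would not compile at all, because both branches would be instantiated for every `T`; in the commit the symptom was only an MSVC warning, but the remedy has the same shape.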

This helps simplify constant splats a little. Without this the code in llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L14072 always returns the existing node. Differential Revision: https://reviews.llvm.org/D157259

These make it easier to debug and improve alias analysis. Enable with --debug-only=fir-alias-analysis. Differential Revision: https://reviews.llvm.org/D157105

Differential Revision: https://reviews.llvm.org/D157106

Reverting because of buildbot failure. This reverts commit c732a45.

This patch uses similar allocator configuration to Asan, i.e. dynamic allocator start address (~(uptr)0) and 128 GB allocator size. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D152895

…linalg Unless otherwise specified this pass should not assume a level, as this rejects otherwise valid TOSA. This has caused build failures in IREE. The level (and other validation options) have now been made configurable. The pass options have been converted to enums to make them more type safe in C++. Reviewed By: Tai78641 Differential Revision: https://reviews.llvm.org/D157282

Use the printOperands for printing VPInstruction's operands to be more in line with other recipes and ensure consistent printing after D15719. Also removes some stray spaces in print output.

…structions Differential Revision: https://reviews.llvm.org/D157094

Model wrap flags directly using VPRecipeWithIRFlags and clean up the duplicated *NUW opcodes. D157144 will build on this and also model FMFs for VPInstruction. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D157194

When using `nvgpu.tma.async.load` Op to asynchronously load data into shared memory, it fails to account for provided offsets, potentially leading to incorrect memory access. Using offset is common practice especially with the dynamic shared memory. This work addresses the problem by ensuring proper consideration of offsets. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D157380

…occurs in the conditional expression of the do while statement

```
constexpr int test() {
    do {} while (a + 1 < 10);
    return 0;
}
```

Before:

```
`-FunctionDecl 0x56512a172650 <./recovery.cpp:1:1, line:4:1> line:1:15 constexpr test 'int ()' implicit-inline
  `-CompoundStmt 0x56512a172860 <col:22, line:4:1>
    `-ReturnStmt 0x56512a172850 <line:3:5, col:12>
      `-IntegerLiteral 0x56512a172830 <col:12> 'int' 0
```

Now:

```
`-FunctionDecl 0x5642c4804650 <./recovery.cpp:1:1, line:4:1> line:1:15 constexpr test 'int ()' implicit-inline
  `-CompoundStmt 0x5642c48048e0 <col:22, line:4:1>
    |-DoStmt 0x5642c4804890 <line:2:5, col:28>
    | |-CompoundStmt 0x5642c4804740 <col:8, col:9>
    | `-BinaryOperator 0x5642c4804870 <col:18, col:26> '<dependent type>' contains-errors '<'
    |   |-BinaryOperator 0x5642c4804850 <col:18, col:22> '<dependent type>' contains-errors '+'
    |   | |-RecoveryExpr 0x5642c4804830 <col:18> '<dependent type>' contains-errors lvalue
    |   | `-IntegerLiteral 0x5642c48047b0 <col:22> 'int' 1
    |   `-IntegerLiteral 0x5642c48047f0 <col:26> 'int' 10
    `-ReturnStmt 0x5642c48048d0 <line:3:5, col:12>
      `-IntegerLiteral 0x5642c48048b0 <col:12> 'int' 0
```

Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D157195

This restores the tooling library's ability to accept invocations that take a preprocessed file as the primary input. Regressed by https://reviews.llvm.org/D105695 Fixes llvm/llvm-project#63941 Differential Revision: https://reviews.llvm.org/D157011

Independent simplification suggested in D157194.

Lower to the strided/contiguous addressing mode of ld1/ldnt1 instructions depending on register allocation. Differential Revision: https://reviews.llvm.org/D156311

This revision adds support for direct lowering of a linalg.copy on buffers between global and shared memory to a tma async load + synchronization operations. This uses the recently introduced Hopper NVVM and NVGPU abstraction to connect things end to end. Differential Revision: https://reviews.llvm.org/D157087

/data/llvm-project/mlir/lib/Dialect/NVGPU/TransformOps/NVGPUTransformOps.cpp:969:16: error: unused variable 'inMemRefType' [-Werror,-Wunused-variable] MemRefType inMemRefType = inMemRef.getType(); ^ 1 error generated.

…esting This patch added checks for global entries in ReplaceWithVeclib testing using ArmPL and SLEEF vector libraries. Differential Revision: https://reviews.llvm.org/D157258

Support IR that is generated by the vector-to-scf lowering of N-D vector transfers with a mask. (Until now only 1-D and 2-D transfers were supported.) Only transfers that were fully unrolled are supported. Differential Revision: https://reviews.llvm.org/D157286

This re-introduces the workaround that had been introduced in d7ca140 and then removed in 0c0628c, since it seems like it is needed after all. Differential Revision: https://reviews.llvm.org/D157319

…r_information.py Those are not relevant anymore since we don't have tests for private headers anymore. Differential Revision: https://reviews.llvm.org/D155880

…xtending to f32 first As there is no direct bf16 libcall for these conversions, extend to f32 first. This patch includes a tiny refactoring to pull out equivalent logic in ExpandIntRes_XROUND_XRINT so it can be reused in ExpandIntRes_FP_TO_{S,U}INT. This patch also demonstrates incorrect codegen for RV32 without zfbfmin for the newly enabled tests. As it doesn't introduce that incorrect codegen (caused by the assumption that 'TypeSoftPromoteHalf' is only used for f16 types), a fix will be added in a follow-up (D157287). Differential Revision: https://reviews.llvm.org/D156990

… to binary strings. **For an explanation of these patches see D154153.** Commit message: This patch adds the utility base class `ModuleToObject`. This class provides an interface for compiling module operations into binary strings, by default this class serialize modules to LLVM bitcode. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D154100

**For an explanation of these patches see D154153.** Commit message: This patch adds the `GPUTargetAttrInterface` attribute interface, this interface is meant to be used as an opaque interface for serializing GPU modules into binary strings. Reviewed By: mehdi_amini, krzysz00 Differential Revision: https://reviews.llvm.org/D154104

Idx's type can be different from Ptr's, causing a "Binary operator types must match" assertion failure when emitting the MUL. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156972

Otherwise device libs still have issues at O0 (in OpenCL-CTS). Depends on D156972 as well. They're unrelated fixes but both are needed to fix the issue. Fixes SWDEV-402331 Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D156973

**For an explanation of these patches see D154153.** Commit message: Adds support for Target attributes in GPU modules. This change enables attaching an optional non empty array of GPU target attributes to the module. Depends on D154104 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D154113

…erations to binary strings." This reverts commit c8e0364.

Currently clang does not consider host/device preference when resolving the delete operator in the file scope, which causes the device operator delete to be selected for class member initialization. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D156795

Summary: Following D156014 we can now use aliases for NVPTX, removing this source of divergence. We require at least +ptx63 and at least sm_30 for `.alias` but this is already within what we build for with `libc` support. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D157323

…findings per symbol We received some user feedback that this is disruptive rather than useful in certain workflows, so add an option to control the output behaviour. Differential Revision: https://reviews.llvm.org/D157390

This reverts commit 1f37088 as it causes a large regression in x264, and some other regressions in downstream embedded benchmarks under LTO.

CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp

Original commit: KhronosGroup/SPIRV-LLVM-Translator@f751ba1

Original commit: KhronosGroup/SPIRV-LLVM-Translator@3448740

Link to the spec: https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/KHR/SPV_KHR_shader_clock.asciidoc Original commit: KhronosGroup/SPIRV-LLVM-Translator@e4dfa92

When NonSemantic.Shader.100 debug info is enabled. The related test cases are re-enabled. Original commit: KhronosGroup/SPIRV-LLVM-Translator@574b0c6

Original commit: KhronosGroup/SPIRV-LLVM-Translator@b045a91

Compilation unit can be translated earlier, e.g. in `transEntryPoint()`. We should save the translated LLVM value to be used further. Original commit: KhronosGroup/SPIRV-LLVM-Translator@803e528

#2074) Most of the changes are adding -emit-opaque-pointers=0 lines to test code. The code generally works in the forward translation at this point, although there is still substantial work that needs to be done to finish porting the tests. Original commit: KhronosGroup/SPIRV-LLVM-Translator@10b1354

This check was recently added to clang-tidy, but the code base doesn't quite comply to it, so disable it for now. Original commit: KhronosGroup/SPIRV-LLVM-Translator@2cea844

Count should be the 3rd parameter. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@fca2a3a

The ABI name mangling of target extension types is tweaked to be like a pointer-to-a-struct, which maintains somewhat better compatibility with the typed pointer representation for name mangling. Original commit: KhronosGroup/SPIRV-LLVM-Translator@15e0aa9

There are a couple of tests that use some magic to reuse the checks between opencl-flavored IR and SPIR-V-friendly IR; these tests are not yet ported to make diffs smaller. Original commit: KhronosGroup/SPIRV-LLVM-Translator@60ac579

This fixes most of the tests to use -emit-opaque-pointers. There are a few tests not yet converted for various reasons:

- Several tests only have check lines for opaque struct declarations. These are not emitted in opaque pointer mode, so the tests need deeper rewrites.
- A few tests check both typed and opaque pointer output.
- A few tests only make sense with typed pointer output (ForwardPtr.ll and RecursiveType.ll in particular).
- One test has a crash in reverse translation.

Outside of these cases, all tests should now be using opaque pointer for testing reverse translation. Original commit: KhronosGroup/SPIRV-LLVM-Translator@0dc80be

Followup for KhronosGroup/SPIRV-LLVM-Translator#2072 Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@8dcd16c

Original commit: KhronosGroup/SPIRV-LLVM-Translator@2bd25cf

Original commit: KhronosGroup/SPIRV-LLVM-Translator@cbac39f

Original commit: KhronosGroup/SPIRV-LLVM-Translator@5d400f8

It should be completely rewritten. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@0e3404e

getNonOpaquePointerElementType is deprecated. This patch removes 3 calls to it where removing it no longer requires functional changes. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@3b9f9a6

Most of the changes in these tests are adjusting the way test checks work to match the output in opaque pointer mode (in particular, opaque struct names aren't getting generated anymore). In the case of cl_types.ll and spirv_types.ll, the tests are deleted entirely because they would require much more invasive changes to keep checking the same thing (since the opaque struct names aren't generally available, and all of the types would have to be used more directly in an intrinsic name to be available for translation). Original commit: KhronosGroup/SPIRV-LLVM-Translator@b6207b8

There is a bug uncovered by these tests that, when exposed as target extension types, these weren't properly adding necessary capabilities. Original commit: KhronosGroup/SPIRV-LLVM-Translator@952b523

Original commit: KhronosGroup/SPIRV-LLVM-Translator@224fed8

As far as I can tell, the original translation creating two OpTypePipeStorage instructions was actually a bug; they should always have been translated using a single instruction. The SPIR-V representation of PipeStorage as a target extension type has been changed to a pointer to a struct named spirv.PipeStorage instead. At the current time, target extension types don't support constant initializers, which would be necessary for these types. Instead, they're represented as structs with integers, with necessary bitcasting to these types for use in initializer. There is still a latent bug that OpTypePipeStorage is getting bitcast instead. Original commit: KhronosGroup/SPIRV-LLVM-Translator@e661cb7

This patch adds joint_matrix reverse translation to target extension type and starts rewriting all of the tests. Some tests are being removed as outdated. Remaining tests to add after the patch: 1. tf32 test 2. element wise operations test Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@465eb3c

The -opaque-pointers flag is no longer supported. Original commit: KhronosGroup/SPIRV-LLVM-Translator@ff8c3e7

Workaround unsupported freeze insn by:

- replacing uses of freeze's result with freeze's source or a random (but compilation reproducible) constant if freeze's source is undef/poison
- deleting freeze insn.

Long term solution is to add a freeze instruction extension in SPIR-V. Issue is tracked in (#1140) Signed-off-by: Lu, John <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ed25856

Original commit: KhronosGroup/SPIRV-LLVM-Translator@28e2408

Signed-off-by: Lu, John <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@201b782

It is a temporary change to re-enable CI until we change the LLVM version. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@93b6d57

'bool llvm::Type::isOpaquePointerTy() const' has been removed. We need to use isPointerTy() instead. This PR handles all uses except one (lib/SPIRV/SPIRVWriter.cpp:279) which is handled in a different PR (#2089) Thanks Original commit: KhronosGroup/SPIRV-LLVM-Translator@c7a0b9b

It's duplicating existing information in the SPIR-V and it has complex structure. Signed-off-by: Sarnie, Nick <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@d498f48

For AttrKind attributes, we need to convert them to the AttrKind enum before checking if it exists or adding it, otherwise it gets added as a string. Signed-off-by: Sarnie, Nick <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@d24b9c6

Should be more cautious with inline namespaces Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@668a97d

…atches (#2108) Removes use of setOpaquePointer --> Can be removed without any side effect. isOpaqueOrPointeeTypeMatches --> is always true. Thanks Signed-off-by: Arvind Sudarsanam <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@42d7c59

Also fix 'unused variable' warning. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@b0142af

…2098) This commit changes the "Source Lang Literal" flag from a simple scalar value to a vector of pairs: (compile unit, source language). Original commit: KhronosGroup/SPIRV-LLVM-Translator@eb051c7

The intention is to replace the existing SPV_INTEL_joint_matrix extension with the Khronos one in the future. Spec: https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/KHR/SPV_KHR_cooperative_matrix.asciidoc Original commit: KhronosGroup/SPIRV-LLVM-Translator@e0c9de8

Original commit: KhronosGroup/SPIRV-LLVM-Translator@3b3e903

This reverts commit 93b6d57759a12875810a215ef781ccb0e928f374. Original commit: KhronosGroup/SPIRV-LLVM-Translator@f80988f

After the first loop of deleting instructions in ValuesToDelete, deleted instructions in ValuesToDelete are in an unstable state. Then in the second loop of deleting, dyn_cast to GlobalValue could return true for an instruction and double eraseFromParent causes crash. Global values in ValuesToDelete are functions. Unused functions are deleted by eraseUselessFunctions anyway. Original commit: KhronosGroup/SPIRV-LLVM-Translator@aea1ac7

`TransOperand` is never called for StrideIdx (3), because the loop ends at MinOperandCount (3). Original commit: KhronosGroup/SPIRV-LLVM-Translator@dcd3052

Change `auto` to `auto *` when the type is a pointer. This makes the code comply with the clang-tidy `llvm-qualified-auto` check. That check is already enabled, but the code base wasn't fully compliant yet. Original commit: KhronosGroup/SPIRV-LLVM-Translator@aa9226e
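
A small before/after illustration of what the check asks for. The `Node` type and the `sumValues` and `selfCheck` helpers are hypothetical and unrelated to the translator code.

```cpp
#include <cassert>
#include <vector>

struct Node {
  int Value;
};

// Sum values through a list of pointers.
inline int sumValues(const std::vector<Node *> &Nodes) {
  int Sum = 0;
  // Non-compliant spelling would be: for (auto N : Nodes)
  // The compliant spelling makes the pointer (and its constness) visible.
  for (const auto *N : Nodes)
    Sum += N->Value;
  return Sum;
}

// Small usage check.
inline void selfCheck() {
  Node A{1}, B{2};
  std::vector<Node *> Nodes{&A, &B};
  assert(sumValues(Nodes) == 3);
}
```

Both spellings deduce a pointer type; the explicit `*` changes readability rather than behaviour.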

There was a case where Intel's SYCL compiler was optimizing out a kernel argument but not updating the metadata. This meant the metadata had more operands than the number of kernel arguments. We use the number of metadata operands to iterate, so we did an out of bounds access. We can't fix this by instead using the function arguments to iterate because we still don't know which argument was removed, so just assert the metadata is valid. Signed-off-by: Sarnie, Nick <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@4f7efd3

* Don't wrap kernels that are not being called in the module. This patch is the result of reflecting on the previously merged PR KhronosGroup/SPIRV-LLVM-Translator#1149 "add an entry point wrapper around functions (llvm pass)" and is inspired by various reported translator, clang (OpenCL) and Intel GPU driver issues (see KhronosGroup/SPIRV-LLVM-Translator#2029 for reference). The SPIR-V spec states: === *OpName* --//--. This has no semantic impact and can safely be removed from a module. === Yet having an EntryPoint function and a function that shares its name via OpName might be confusing to both (old) drivers and programmers who read the SPIR-V file. This patch prevents generation of the wrapper function when it is not necessary, i.e. when a kernel function is not called by another kernel. We can do better in other cases as well; for example, I have experimented with renaming a wrapped function by adding a prefix, so it could be distinguished from the actual kernel/entry point, but for now it doesn't pass validation for E2E OpenCL tests. Signed-off-by: Sidorov, Dmitry <[email protected]> * prevent a copy Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@46285e4

Return back some bitcasts to preserve opaque pointers support, fix SYCL tests by emitting generic address space instead of global as intended by intel/llvm changes.

For now, the reverse translation is not resolved properly, so we test only forward translation here. Original commit: KhronosGroup/SPIRV-LLVM-Translator@1677289

We no longer need this flag as only opaque pointers are supported now. Original commit: KhronosGroup/SPIRV-LLVM-Translator@3eeb3bf

Add basic support for the SPV_EXT_image_raw10_raw12 extension [1] such that SPIR-V modules using the extension can be consumed. [1] https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/EXT/SPV_EXT_image_raw10_raw12.asciidoc Original commit: KhronosGroup/SPIRV-LLVM-Translator@bb2196b

This reverts commit d885138. Reason: Applied the fix for the Asan buildbot failures.

…_READY Revert "[llvm] Drop some typed pointer handling/bitcasts" This reverts commit 4ce7c4a. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/lib/Transforms/Coroutines/CoroSplit.cpp llvm/lib/Transforms/Scalar/LICM.cpp llvm/lib/Transforms/Scalar/SROA.cpp Revert "[SCEVExpander] Remove GEP add rec splitting code (NFCI)" This reverts commit b752542. Revert "[Transforms] Remove FactorOutConstant to fix -Wunneeded-internal-declaration (NFC)" This reverts commit 67f1e8d. Revert "[SCEVExpander] Remove typed pointer support (NFC)" This reverts commit 02ba405.

This reverts commit 5b5bd81.

Commits on Aug 7, 2023

Commits on Aug 8, 2023

Commits on Aug 10, 2023

Commits on Aug 11, 2023

Commits on Aug 12, 2023

Commits on Aug 13, 2023

Commits on Aug 14, 2023