LLVM and SPIRV-LLVM-Translator pulldown (WW32) #10783
Commits on Aug 7, 2023
-
[llvm-exegesis] Don't try to use SYS_rseq if it's not defined.
When compiling against recent glibc (>= 2.35) but old kernel headers (< 4.18), `SYS_rseq` is not defined and thus llvm-exegesis fails to build. So also check that `SYS_rseq` is defined before trying to use it. Fixes llvm/llvm-project#64456 Reviewed By: MaskRay, gchatelet Differential Revision: https://reviews.llvm.org/D157189
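The guard described above can be sketched as follows. This is a minimal illustration, not the actual llvm-exegesis code; the helper names `hasRseqSyscallNumber` and `rseqSyscallNumberOrNegative` are hypothetical, and the sketch assumes that `<sys/syscall.h>` is where `SYS_rseq` would come from on Linux.

```cpp
#include <cassert>
#if defined(__linux__)
#include <sys/syscall.h> // defines SYS_rseq only if the kernel headers are new enough
#endif

// Hypothetical helper: true if the headers used for this build define SYS_rseq.
constexpr bool hasRseqSyscallNumber() {
#if defined(SYS_rseq)
  return true;
#else
  return false;
#endif
}

// Hypothetical helper: the syscall number if available, otherwise -1.
// Code that needs rseq would be compiled only inside the same #if guard.
long rseqSyscallNumberOrNegative() {
#if defined(SYS_rseq)
  return SYS_rseq;
#else
  return -1;
#endif
}
```

The point of the pattern is that every use of the macro sits behind the same preprocessor check, so the build no longer depends on the kernel header version.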
Commit f70e83a
-
[JITLink][PowerPC] Enable more tests for ppc64 big-endian target. NFC.
Kai Luo committed Aug 7, 2023
Commit af07ec3
-
[Pipelines] Perform hoisting prior to GVN
We currently only enable hoisting in the last SimplifyCFG run of the function simplification pipeline. In particular this happens after GVN, which means that instructions that were identical (and thus hoistable) prior to GVN might no longer be so after it ran, due to equality replacements (see the phase ordering test). The history here is that D84108 restricted hoisting to the very late (module optimization) pipeline only. Then D101468 went back on that, and also performed it at the end of function simplification. This patch goes one step further and allows it prior to GVN. Importantly, we still don't perform hoisting before LoopRotate, which was the original motivation for delaying it. Differential Revision: https://reviews.llvm.org/D156532
Commit 1f37088
-
[AIC] Fix the sext cost operands in tryToFPToSat
As pointed out in D125755 the operand of a call to getCastInstrCost had the Src and Dst the wrong way around. Differential Revision: https://reviews.llvm.org/D154841
Commit aa97f6b
-
[X86] ReplaceNodeResults - relax the value type constraints for TRUNCATE widening
With SSSE3, widen the truncation for anything other than vXi64 -> vXi8 smaller than v8i64 (where PSHUFB would be better).
Commit 9d3b19e
-
[clangd][clang-tidy][std_symbol_map] Add missing symbol.
Differential Revision: https://reviews.llvm.org/D157256
Commit 8a5c0cc
-
[clang][analyzer] Improve StdCLibraryFunctions socket send/recv functions.
The modeling of send, recv, sendmsg, recvmsg, sendto, recvfrom is changed: These functions do not return 0, except if the message length is 0. (In sendmsg, recvmsg the length is not checkable but it is more likely that a message with 0 length is invalid for these functions.)
Reviewed By: donat.nagy
Differential Revision: https://reviews.llvm.org/D155715
Commit 52ac71f
-
[MachineCSE] Add an option to override the profitability heuristics
Differential Revision: https://reviews.llvm.org/D157002
Commit f580901
-
[lldb] Make IR interpreter timeout test more loose
This has failed once in a while on our Windows on Arm bot: https://lab.llvm.org/buildbot/#/builders/219/builds/4688

Traceback (most recent call last):
  File "C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\...
    self.assertGreaterEqual(duration_sec, 1)
AssertionError: 0.9907491207122803 not greater than or equal to 1

We're not here to check that Python/the C++ lib/the OS implemented timers correctly, so accept anything 0.95 or greater.
Commit 91a0e83
-
[clang][RISCV] Fix bug in ABI handling of empty structs with hard FP calling conventions in C++
As reported in <llvm/llvm-project#58929>, Clang's handling of empty structs in the case of small structs that may be eligible to be passed using the hard FP calling convention doesn't match g++. In general, C++ record fields are never empty unless [[no_unique_address]] is used, but the RISC-V FP ABI overrides this. After this patch, fields of structs that contain empty records will be ignored, even in C++, when considering eligibility for the FP calling convention ('flattening').
It isn't explicitly noted in the RISC-V psABI, but arrays of empty records will disqualify a struct for consideration of using the FP calling convention in g++. This patch matches that behaviour. The psABI issue <riscv-non-isa/riscv-elf-psabi-doc#358> seeks to clarify this.
This patch was previously committed but reverted after a bug was found. This recommit adds additional logic to prevent that bug (adding an extra check for when a candidate from detectFPCCEligibleStructHelper may not be valid).
Differential Revision: https://reviews.llvm.org/D142327
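To make the kind of type under discussion concrete, here is a small sketch. The type names `Empty` and `WithEmptyField` are hypothetical and do not come from the patch. It only shows the C++ language-level fact the message refers to (an empty record used as a plain field still occupies storage); whether such a struct is actually passed in FP registers depends on the target ABI and is not checked here.

```cpp
#include <cassert>
#include <type_traits>

// An empty record: no non-static data members.
struct Empty {};

// A small struct with an empty-record field followed by a float.
// Without [[no_unique_address]], E still takes up storage in C++.
struct WithEmptyField {
  Empty E;
  float F;
};

// Hypothetical helper that spells out the layout facts relied on above.
bool emptyFieldStillTakesSpace() {
  return std::is_empty<Empty>::value &&
         !std::is_empty<WithEmptyField>::value &&
         sizeof(WithEmptyField) > sizeof(float);
}
```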
Commit e3c57fd
-
[VPlan] Move up VPRecipeWithIRFlags definition. (NFC)
This allows using VPRecipeWithIRFlags for VPInstruction and reduces the diff for D157144 & D157194.
Commit 7b14c05
-
[FuncSpec] Estimate dead blocks more accurately.
Currently we only consider basic blocks with a unique predecessor when estimating the size of dead code. However, we could expand this to consider blocks with a back-edge, or blocks preceded by dead blocks.
Differential Revision: https://reviews.llvm.org/D156903
Commit c2d1900
-
[RISCV] Implement straight-forward bf16<->int conversion cases
This ports over the test cases half-convert.ll and implements patterns or RISCVISelLowering.cpp changes for all of the most straight-forward cases (those that don't require changes outside of lib/Target/RISCV). The remaining cases and noted poor codegen for saturating conversions will be handled in follow-up patches. Differential Revision: https://reviews.llvm.org/D156943
Commit 7a1b2ad
-
[Flang][Sema] Move directive sets to a shared location
This patch moves directive sets defined internally in Semantics to a header accessible by other stages of the compiler to enable reuse. Some sets are renamed/rearranged and others are lifted from local definitions to provide a single source of truth. Differential Revision: https://reviews.llvm.org/D157090
Commit ec70627
-
[TargetLowering][RISCV] Improve codegen for saturating bf16 to int conversion
Extending to f32 first (as is done for f16) results in better generated code for RISC-V (and affects no other in-tree tests). Additionally, performing the FP_EXTEND first seems equally justified for bf16 as for f16.
Differential Revision: https://reviews.llvm.org/D156944
Commit 1cffd26
-
[RISCV] Add a blank line after end of RUN lines. NFC.
Most test cases have a blank line after the end of the RUN lines for readability.
Commit f2bdc29
-
[AMDGPU] Add and use SIInstrFlags::GWS. NFC.
This reduces the number of places where we have to check for a list of DS_GWS_* opcodes. Differential Revision: https://reviews.llvm.org/D157099
Commit e61ca23
-
[NFC] strengthen some CHECK-NOT lines
The affected lit tests failed when they were run in a path that contained the word "call". CHECK-NOT lines that were supposed to match only the IR ended up matching the path printed in the output. Fixed this by checking for "call void" instead.
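The failure mode can be reproduced with a plain substring search. This is a simplified sketch: real FileCheck patterns may also contain regexes, and the sample output with its path `/home/user/callgraph-tests/input.ll` is made up for illustration.

```cpp
#include <cassert>
#include <string>

// Returns true if `Pattern` occurs anywhere in `Output`.
// This models a CHECK-NOT pattern as a literal substring only.
bool matchesAnywhere(const std::string &Output, const std::string &Pattern) {
  return Output.find(Pattern) != std::string::npos;
}

// Hypothetical tool output: the first line echoes the input path, and the
// IR itself contains no call instruction.
std::string sampleOutput() {
  return "; ModuleID = '/home/user/callgraph-tests/input.ll'\n"
         "define void @f() {\n"
         "  ret void\n"
         "}\n";
}
```

With this output, a `CHECK-NOT: call` would fire because of the directory name, while the stricter `call void` does not match the path.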
Commit f7031c4
-
[Clang] Make __arm_streaming apply only to prototyped functions.
Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D152141
Commit 4d3e917
-
[VPlan] Move VPRecipeWithIRFlags::getFastMathFlags. (NFCI)
Split off suggested refactoring from D157144. Also adds an assert to make sure this is only used when OpType is FPMathOp.
Commit 0b17e9d
-
ValueTracking: Really remove CannotBeOrderedLessThanZero
6640df9 did not actually remove it, just its final user. cannotBeOrderedLessThanZeroImpl still has a user which needs to be updated before it can be removed. The users of SignBitMustBeZero currently have broken expectations for nan handling, so replacing them requires more work.
Commit 1d9f77f
-
Commit 7f00389
-
[lldb] Fix typo in comments and in test
Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D157214
Commit aa27848
-
[Clang][OpenMP] Support for Code Generation of loop bind clause
Commit 4097a24
-
[TII] NFCI: Simplify the interface for isTriviallyReMaterializable
Currently `isTriviallyReMaterializable` calls `isReallyTriviallyReMaterializable` and `isReallyTriviallyReMaterializableGeneric`. The two interfaces are confusing, but there are also some real issues with this.

The documentation of this function (see below) suggests that `isReallyTriviallyReMaterializable` allows the target to override the default behaviour.

  /// For instructions with opcodes for which the M_REMATERIALIZABLE flag is
  /// set, this hook lets the target specify whether the instruction is actually
  /// trivially rematerializable, taking into consideration its operands.

It however implements something different. The default behaviour is the analysis done in `isReallyTriviallyReMaterializableGeneric`, which is testing if it is safe to rematerialize the MachineInstr. The result of `isReallyTriviallyReMaterializable` is only considered if `isReallyTriviallyReMaterializableGeneric` returns `false`. That means there is no way to override the default behaviour if `isReallyTriviallyReMaterializableGeneric` returns true (i.e. it is safe to rematerialize, but we'd rather not).

By making this a single interface, we can override the interface to do either.

Reviewed By: craig.topper, nemanjai
Differential Revision: https://reviews.llvm.org/D156520
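The difference between the two shapes can be illustrated with a toy model. None of the names below (`ToyInstr`, `OldInfo`, `NewInfo`, `canRemat`, and so on) are LLVM APIs; they are hypothetical stand-ins, and the old behaviour is modelled simply as "generic check OR target hook", as the description above implies.

```cpp
#include <cassert>

// Toy stand-in for a machine instruction; both fields are hypothetical.
struct ToyInstr {
  bool SafeToRemat;      // what the generic safety analysis would conclude
  bool TargetWantsRemat; // the target's own preference
};

// Old shape: two hooks. The target hook only matters when the generic
// analysis says "no", so a target can add cases but never veto one.
struct OldInfo {
  virtual ~OldInfo() = default;
  bool canRemat(const ToyInstr &MI) const {
    return genericCheck(MI) || targetHook(MI);
  }
  bool genericCheck(const ToyInstr &MI) const { return MI.SafeToRemat; }
  virtual bool targetHook(const ToyInstr &) const { return false; }
};

struct OldVetoAttempt : OldInfo {
  bool targetHook(const ToyInstr &MI) const override {
    return MI.TargetWantsRemat; // returning false here cannot veto
  }
};

// New shape: one virtual entry point whose default is the generic analysis.
struct NewInfo {
  virtual ~NewInfo() = default;
  virtual bool canRemat(const ToyInstr &MI) const { return MI.SafeToRemat; }
};

struct NewVetoTarget : NewInfo {
  bool canRemat(const ToyInstr &MI) const override {
    // The override may both restrict and extend the default answer.
    return MI.TargetWantsRemat && NewInfo::canRemat(MI);
  }
};
```

For an instruction that is safe to rematerialize but that the target would rather not rematerialize, the old shape still answers "yes", while the single overridable entry point lets the target answer "no".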
Commit bbb9589
-
[X86] Add matchTruncateWithPACK helper for matching signbits/knownbits for PACKSS/PACKUS
Begin to consolidate the similar matching code we have - all have semi-similar constraints that still need merging together to ensure we get consistent codegen depending on when the truncate is lowered.
Commit 711dff4
-
[X86] truncateVectorWithPACK - ensure we don't truncate to <1 x iXX> vector types
Fuzz testing noticed that the sub-128-bit vector splitting added in ef4330f didn't correctly halt at <2 x iXX> truncations.
Commit 0d1f853
-
[RISCV][test] Add non-zfbfmin RUN lines to bfloat-convert.ll
As requested in review for https://reviews.llvm.org/D156990 This additionally consistently uses the ilp32d/lp64d ABIs when the D extension is enabled.
Commit 380fd82
-
[mlir] Apply ClangTidy fix (NFC)
redundant get() call on smart pointer.
Commit 7d6fb14
-
[mlir][NVGPU] Support 2D masks in transform.nvgpu.create_async_groups
Support IR that is generated by the vector-to-scf lowering of 2D vector transfers with a mask. Only 2D transfers that were fully unrolled are supported at the moment. Differential Revision: https://reviews.llvm.org/D156695
Commit 39d8876
-
[FileCheck, 3/4] Allow AP value for numeric expressions
Use APInt to represent numeric variables and expressions, therefore removing overflow concerns. The only remaining concern is underflow, when the format of an expression is unsigned (incl. hex values) but the result is negative. Note that this can only happen when substituting an expression, not when capturing, since the regex used to capture an unsigned value will not include a minus sign, hence all the code removal for match propagation testing. This is what this patch implements.
Reviewed By: arichardson
Differential Revision: https://reviews.llvm.org/D150880
Commit 0726cb0
-
[clang][ASTMatcher] Add Matcher 'dependentSizedExtVectorType'
Add Matcher dependentSizedExtVectorType for DependentSizedExtVectorType. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D157237
Commit 4cce27d
-
[clang][ASTMatcher] Add Matcher 'convertVectorExpr'
Add Matcher convertVectorExpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D157248
Commit 8baf862
-
[clang/cxx-interop] Teach clang to ignore availability errors that come from CF_OPTIONS
This cherry-picks swiftlang/llvm-project#6431 since without it, macOS 14 SDK headers don't compile when targeting catalyst. Fixes #64438.
Commit bb58748
-
[NFC][SCCP] Regenerate test case
luxufan committed Aug 7, 2023
Commit 03dec91
-
[ValueTracking] Switch over opcode in isKnownToBeAPowerOfTwo() (NFC)
Similar to the other ValueTracking function, switch over the instruction opcode instead of doing a long sequence of match()es.
Commit 8aeb84c
-
Merge from 'main' to 'sycl-web' (30 commits)
CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp
Commit 2c37701
-
[AggressiveAntiDepBreaker] Tweak the fix for renaming a subregister of a live register
This patch tweaks the fix in D20627 "Do not rename registers that do not start an independent live range" to only consider Data dependencies, not Output or Anti dependencies. An Output or Anti dependency to a superreg does not imply that that superreg is live at the current instruction.
This enables breaking anti-dependencies in a few more cases as shown by the lit test updates.
Differential Revision: https://reviews.llvm.org/D156879
Commit 68a0a37
-
[AggressiveAntiDepBreaker] Refix renaming a subregister of a live register
This patch reworks the fix from D20627 "Do not rename registers that do not start an independent live range". That fix depended on the scheduler dependency graph having redundant edges. Those edges are removed by D156552 "[MachineScheduler] Track physical register dependencies per-regunit" with the result that on several Hexagon lit tests, the post-RA scheduler would schedule the code in a way that fails machine verification.

Consider this code where D11 is a pair R23:R22:

  SU(0): %R2<def> = A2_add %R23, %R17<kill>
      (anti dependency on R23 here)
  SU(8): %R23<def> = S2_asr_i_r %R22, 31
      (data dependency on R23->D11 here)
  SU(10): %D0<def> = A2_tfrp %D11<kill>

The original fix would detect this situation by examining the dependency from SU(8) to SU(10) and seeing that D11 is not a subreg of R23.

A slightly more complicated example:

  SU(0): %R2<def> = A2_add %R23, %R17<kill>
      (anti dependency on R23 here)
  SU(8): %R23<def> = S2_asr_i_r %R22, 31
      (data dependency on R23 here)
  SU(9): %R23<def> = S2_asr_i_r %R23, 31
      (data dependency on R23->D11 here)
  SU(10): %D0<def> = A2_tfrp %D11<kill>

The original fix also worked on this example, but only because ScheduleDAGInstrs adds an extra data dependency edge directly from SU(8) to SU(10). This edge is redundant, since you could infer it transitively from the edges SU(8)->SU(9) and SU(9)->SU(10), and since none of the data that SU(8) writes to R23 is read by SU(10).

After D156552 the redundant edge SU(8)->SU(10) will not be present, so when we examine the successors of SU(8) we will not find any that read from a superreg of R23.

This patch removes the original fix from D20627, which examined edges in the dependency graph. Instead it extends a check that was already being done in FindSuitableFreeRegisters: instead of checking that *some* register is a superreg of all registers in the rename group, we now check that the specific register that carries the anti-dependency that we want to break is a superreg of all registers in the rename group.

Differential Revision: https://reviews.llvm.org/D156880
Commit 97324f6
-
[MachineScheduler] Track physical register dependencies per-regunit
Change the scheduler's physical register dependency tracking from registers-and-their-aliases to regunits. This has a couple of advantages when subregisters are used:

- The dependency tracking is more accurate and creates fewer useless edges in the dependency graph. An AMDGPU example, edited for clarity:

    SU(0): $vgpr1 = V_MOV_B32 $sgpr0
    SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1
    SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 $vgpr0_vgpr1, 0, 0

  There is a data dependency on $vgpr1 from SU(0) to SU(1) and from SU(1) to SU(2). But the old dependency tracking code also added a useless edge from SU(0) to SU(2) because it thought that SU(0)'s def of $vgpr1 aliased with SU(2)'s use of $vgpr0_vgpr1.

- On targets like AMDGPU that make heavy use of subregisters, each register can have a huge number of aliases - it can be quadratic in the size of the largest defined register tuple. There is a much lower bound on the number of regunits per register, so iterating over regunits is faster than iterating over aliases.

The LLVM compile-time tracker shows a tiny overall improvement of 0.03% on X86. I expect a larger compile-time improvement on targets like AMDGPU.

Recommit after fixing AggressiveAntiDepBreaker in D156880.

Differential Revision: https://reviews.llvm.org/D156552
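A toy model of per-regunit tracking, applied to the AMDGPU example above, might look like this. It is only a sketch of the idea, not the MachineScheduler implementation: registers are modelled as lists of integer regunit ids, the regunit numbering is made up, and `ToyInstr`, `buildDataEdges` and `amdgpuExample` are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <utility>
#include <vector>

using RegUnits = std::vector<int>; // a register, modelled as its regunits
using Edge = std::pair<int, int>;  // (defining SU, using SU)

struct ToyInstr {
  RegUnits Uses;
  RegUnits Defs;
};

// Adds a data edge from the most recent def of each used regunit.
std::set<Edge> buildDataEdges(const std::vector<ToyInstr> &Instrs) {
  std::map<int, int> LastDef; // regunit -> index of its latest defining SU
  std::set<Edge> Edges;
  for (int I = 0; I < static_cast<int>(Instrs.size()); ++I) {
    for (int Unit : Instrs[I].Uses) {
      auto It = LastDef.find(Unit);
      if (It != LastDef.end())
        Edges.insert(Edge(It->second, I));
    }
    for (int Unit : Instrs[I].Defs)
      LastDef[Unit] = I; // a new def replaces the previous one for this unit
  }
  return Edges;
}

// The three instructions from the example, with an assumed unit numbering:
// $sgpr0 = {0}, $vgpr1 = {2}, $vgpr0_vgpr1 = {1, 2}.
std::vector<ToyInstr> amdgpuExample() {
  const RegUnits Sgpr0{0}, Vgpr1{2}, Vgpr0Vgpr1{1, 2};
  return {
      {Sgpr0, Vgpr1},           // SU(0): $vgpr1 = V_MOV_B32 $sgpr0
      {Vgpr1, Vgpr1},           // SU(1): $vgpr1 = V_ADDC_U32 0, $vgpr1
      {Vgpr0Vgpr1, Vgpr0Vgpr1}, // SU(2): $vgpr0_vgpr1 = FLAT_LOAD_DWORDX2 ...
  };
}
```

Because SU(1) overwrites the only regunit that SU(0) defined, the model produces the edges SU(0)->SU(1) and SU(1)->SU(2) and no edge from SU(0) to SU(2).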
Commit 56d92c1
-
[RISCVGatherScatterLowering] Support broadcast base pointer
A broadcast base pointer is the same as a scalar base pointer for GEP semantics (when there's at least one other vector operand). This is the form that SLP likes to emit, so we should handle it. Differential Revision: https://reviews.llvm.org/D157132
Commit 999ac10
-
[OpenMP][AMDGPU] Add Envar for controlling HSA busy queue tracking
If the Envar is set to true (default), busy HSA queues will be actively avoided when assigning a queue to a Stream. Otherwise, we will initialize a new HSA queue for each requested Stream, then default to round robin once the set maximum has been reached. Reviewed By: jdoerfert, kevinsala Differential Revision: https://reviews.llvm.org/D156996
Commit 7eba3e5
-
[RISCV] Use v(f)slide1down for build_vector with dominant values
If we have a dominant value, we can still use a v(f)slide1down to handle the last value in the vector if that value is neither undef nor the dominant value. Note that we can extend this idea to any tail of elements, but that ends up being a near complete merge of the v(f)slide1down insert path, and requires a bit more untangling on profitability heuristics first.
Differential Revision: https://reviews.llvm.org/D157120
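The idea can be shown on a toy vector model: splat the dominant value, then slide everything down by one and insert the last element. This is only an illustration of the data movement, not the RISC-V lowering code; `splat`, `slide1down` and `buildWithDominant` are hypothetical helpers, and `slide1down` models the element semantics assumed here (element i takes the old element i+1, the last element takes the scalar).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy splat: N copies of Value.
std::vector<int> splat(int Value, std::size_t N) {
  return std::vector<int>(N, Value);
}

// Toy slide1down: Result[i] = V[i + 1], and the last element is Scalar.
std::vector<int> slide1down(const std::vector<int> &V, int Scalar) {
  assert(!V.empty());
  std::vector<int> Result(V.begin() + 1, V.end());
  Result.push_back(Scalar);
  return Result;
}

// Builds {Dominant, ..., Dominant, Last} with one splat and one slide.
std::vector<int> buildWithDominant(int Dominant, int Last, std::size_t N) {
  return slide1down(splat(Dominant, N), Last);
}
```

For example, `buildWithDominant(7, 3, 4)` yields the elements 7, 7, 7, 3.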
Commit 47fe3b3
-
Commit 95cd6ae
-
[ValueTracking] Support non-zero pow2 for shl with nowrap flags
If the shl has either nuw or nsw flags, then we know that bits cannot be shifted out, so a power of two cannot become zero. Proofs: https://alive2.llvm.org/ce/z/4QfebE
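The claim can be checked exhaustively on a small bit width. The sketch below covers only the `nuw` case on 8-bit values and is not a replacement for the Alive2 proofs linked above; all helper names are hypothetical. It also shows that a plain shift without the flag can turn a power of two into zero.

```cpp
#include <cassert>
#include <cstdint>

bool isPow2(uint8_t X) { return X != 0 && (X & (X - 1)) == 0; }

// 8-bit shift left, discarding bits shifted past bit 7.
uint8_t shl8(uint8_t X, unsigned S) { return static_cast<uint8_t>(X << S); }

// Models the nuw condition: no set bit is shifted out.
bool shlHasNoUnsignedWrap(uint8_t X, unsigned S) {
  return S < 8 && (shl8(X, S) >> S) == X;
}

// Checks every 8-bit power of two with every in-range shift amount:
// whenever the shift does not wrap, the result is still a power of two.
bool nuwShlKeepsPow2NonZero() {
  for (unsigned S = 0; S < 8; ++S)
    for (unsigned V = 1; V < 256; V <<= 1) {
      uint8_t X = static_cast<uint8_t>(V);
      if (shlHasNoUnsignedWrap(X, S) && !isPow2(shl8(X, S)))
        return false;
    }
  return true;
}
```

Without the flag, `shl8(0x80, 1)` is 0, so the nowrap condition is what rules out the zero result.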
Commit 5de89b4
-
Revert "[ValueTracking] Improve the coverage of isKnownToBeAPowerOfTwo for vscale"
Logic is incorrect. Shift can make non-zero pow2 zero.
This reverts commit 9c837b7.
Commit f6c7264
-
[libc++][PSTL] Parallelize random_access_iterator
P2408 requires this for C++23, but implementing it in C++20 is safe because the only code impacted would be code that violated a precondition of the parallel algorithm. It was P2408's intent to enable implementations to backport this to C++20.
Closes #63447.
Reviewed By: philnik, #libc
Differential Revision: https://reviews.llvm.org/D154305
Commit 0e2de66
-
Revert "[Clang][OpenMP] Support for Code Generation of loop bind clause"
This reverts commit 4097a24. Breaks tests on macOS, see https://reviews.llvm.org/rG4097a2458412#1235854
Commit fab4972
-
[scudo] Implement and enable MemMapLinux
Most of the implementations are copied from linux.cpp, and we will keep those memory functions in linux.cpp for a while until we are able to switch to MemMap completely. The remaining part is SizeClassAllocator32, which hasn't been switched to the MemMap interface.
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D146453
Commit f5fffbe
-
Commit 9b6aaf1
-
Reviewed By: #libc, philnik Differential Revision: https://reviews.llvm.org/D157213
Commit f1fc29b
-
[SLP]Improve stores vectorization.
Use an O(N log N) instead of an O(N^2) (N <= 32) sorting approach and do not try to revectorize all possible combinations of stores, if they definitely cannot be combined because of mem/data dependencies.

Compile time (O3 + lto, skylake_avx512):
External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk.test              117.15   120.11    2.5%
External/SPEC/CINT2017speed/623.xalancbmk_s/623.xalancbmk_s.test     203.67   207.42    1.8%
External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test           232.43   235.01    1.1%
External/SPEC/CINT2017rate/523.xalancbmk_r/523.xalancbmk_r.test      205.49   207.25    0.9%
External/SPEC/CFP2017rate/510.parest_r/510.parest_r.test             310.46   306.23   -1.4%

Link time (O3+lto, skylake_avx512):
External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test          1383.69  1475.94    6.7%

Other changes are too small, cannot rely on them.

size..text
Program                                                                      results      results0     diff
test-suite :: SingleSource/Regression/C/Regression-C-sumarray.test            392.00      1439.00   267.1%
test-suite :: MultiSource/Applications/JM/ldecod/ldecod.test               394258.00    394818.00     0.1%
test-suite :: MultiSource/Applications/JM/lencod/lencod.test               846355.00    847075.00     0.1%
test-suite :: External/SPEC/CINT2006/464.h264ref/464.h264ref.test          782816.00    783360.00     0.1%
test-suite :: External/SPEC/CFP2017rate/508.namd_r/508.namd_r.test         779667.00    779923.00     0.0%
test-suite :: MultiSource/Benchmarks/mafft/pairlocalalign.test             224398.00    224446.00     0.0%
test-suite :: MultiSource/Applications/oggenc/oggenc.test                  185019.00    185035.00     0.0%
test-suite :: External/SPEC/CFP2017rate/526.blender_r/526.blender_r.test 12487610.00  12488010.00     0.0%
test-suite :: MultiSource/Benchmarks/7zip/7zip-benchmark.test             1051772.00   1051804.00     0.0%
test-suite :: MultiSource/Applications/SPASS/SPASS.test                    529586.00    529602.00     0.0%
test-suite :: External/SPEC/CINT2006/400.perlbench/400.perlbench.test     1084684.00   1084716.00     0.0%
test-suite :: MultiSource/Benchmarks/tramp3d-v4/tramp3d-v4.test           1014245.00   1014261.00     0.0%
test-suite :: MultiSource/Benchmarks/MallocBench/espresso/espresso.test    223494.00    223478.00    -0.0%
test-suite :: External/SPEC/CINT2017speed/625.x264_s/625.x264_s.test       660843.00    660795.00    -0.0%
test-suite :: External/SPEC/CINT2017rate/525.x264_r/525.x264_r.test        660843.00    660795.00    -0.0%
test-suite :: MultiSource/Applications/ClamAV/clamscan.test                568824.00    568760.00    -0.0%

espresso - 2 more stores vectorized
x264 - small number of changes in 3-4 functions, generated a bit more vector stores (2 4x zeroinitializer stores + some other small variations).
clamscan - emitted 32xi8 store instead of several scalar stores + several 4x-8x stores.

Differential Revision: https://reviews.llvm.org/D155246
Commit e894c3d
-
[FileCheck, 4/4] NFC: Stop using ExpressionValue
Use APInt directly instead. Depends On D150880 Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D154430
Commit e15e969
-
Revert "[Flang][Sema] Move directive sets to a shared location"
This reverts commit ec70627. Reverting due to CI failure
Commit f48969f
-
[RISCV] Refactor to reduce some duplication in RISCVInstrInfoV.td. NFC
We had some load/store patterns split because EEW=64 needed a different predicate. Refactor where the foreach is placed and use the foreach value to pick the predicate.
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D157176
Commit 6c45b0f
-
[flang][openacc] Support readonly modifier for declare copyin in module file
Distinguish between copyin and copyin with the readonly modifier.
Depends on D157121
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D157125
Commit a749b32
-
[ARM] Improve generation of thumb stack accesses
Currently when a stack access is out of range of an sp-relative ldr or str then we jump straight to generating the offset with a literal pool load or mov32 pseudo-instruction. This patch improves that in two ways:
* If the offset is within range of sp-relative add plus an ldr then use that.
* When we use the mov32 pseudo-instruction, if putting part of the offset into the ldr will simplify the expansion of the mov32 then do so.
Differential Revision: https://reviews.llvm.org/D156875
Commit f83ab2b
-
[flang] Bump python dependencies in flang/examples/FlangOmpReport
ruamel.yaml had a potential security issue (may also be a false positive in the scanner).
Related to llvm/llvm-project#64417
Reviewed By: avogelsgesang
Differential Revision: https://reviews.llvm.org/D157284
Commit 165f7f0
-
[InstrProf] Encode linkage names in IRPGO counter names
Prior to this diff, names in the `__llvm_prf_names` section had the format `[<filepath>:]<function-name>`, e.g., `main.cpp:foo`, `bar`. `<filepath>` is used to discriminate between possibly identical function names when linkage is local and `<function-name>` simply comes from `F.getName()`. This has two problems:
* `:` is commonly found in Objective-C functions so that names like `main.mm:-[C foo::]` and `-[C bar::]` are difficult to parse
* `<function-name>` might be different from the linkage name, so it cannot be used to pass a function order to the linker via `-symbol-ordering-file` or `-order_file` (see https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068)

Instead, this diff changes the format to `[<filepath>;]<linkage-name>`, e.g., `main.cpp;_foo`, `_bar`. The hope is that `;` won't realistically be found in either `<filepath>` or `<linkage-name>`.

To prevent invalidating all prior IRPGO profiles, we also lookup the prior name format when a record is not found (see `InstrProfSymtab::create()`, `readMemprof()`, and `getInstrProfRecord()`). It seems that Swift and Clang FE-PGO rely on the original `getPGOFuncName()`, so we cannot simply replace it.

Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D156569
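The two name formats and the fallback lookup can be sketched as below. This is an illustrative model only: `makeCounterName` and `lookupCount` are hypothetical helpers, the profile is modelled as a plain map from name to count, and none of this mirrors the actual `InstrProfSymtab` code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Builds "[<filepath><Sep>]<name>". Sep is ';' for the new format and ':'
// for the old one; the prefix is omitted when FilePath is empty.
std::string makeCounterName(const std::string &FilePath,
                            const std::string &Name, char Sep) {
  return FilePath.empty() ? Name : FilePath + Sep + Name;
}

// Looks up the new-format name first and falls back to the old format, so
// that profiles written before the change can still be matched.
// Returns nullptr when neither name is present.
const int *lookupCount(const std::map<std::string, int> &Profile,
                       const std::string &FilePath,
                       const std::string &LinkageName,
                       const std::string &FunctionName) {
  auto It = Profile.find(makeCounterName(FilePath, LinkageName, ';'));
  if (It == Profile.end())
    It = Profile.find(makeCounterName(FilePath, FunctionName, ':'));
  return It == Profile.end() ? nullptr : &It->second;
}
```

With an old profile containing `main.cpp:foo`, a lookup for linkage name `_foo` misses on `main.cpp;_foo` and then succeeds through the fallback.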
Copy full SHA for fe05193 - Browse repository at this point
-
[flang][openacc] Add lowering support for device_resident clause on O…
…penACC declare Depends on D156828 Reviewed By: razvanlupusoru Differential Revision: https://reviews.llvm.org/D156829
Copy full SHA for 5cb48f7 - Browse repository at this point
-
[RISCV][GlobalISel] Fix tests for addition, subtraction and logical i…
…nstructions Fix a bug introduced in a previous commit. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156380
Copy full SHA for 1b74459 - Browse repository at this point
-
[RISCV][GlobalISel] Legalize constants, undefined values, extension i…
…nstructions, and (un)merge instructions for narrow types Test legalization for (s7, s8, s16, s32, s48, s64, s96) for rv32, (s8, s15, s16, s32, s64, s72, s128, s192) for rv64. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D156383
Copy full SHA for b8fef7a - Browse repository at this point
-
Anonymous unions should be transparent wrt `[[clang::trivial_abi]]`.
Consider the test input below:
```
struct [[clang::trivial_abi]] Trivial {
  Trivial() {}
  Trivial(Trivial&& other) {}
  Trivial& operator=(Trivial&& other) { return *this; }
  ~Trivial() {}
};
static_assert(__is_trivially_relocatable(Trivial), "");

struct [[clang::trivial_abi]] S2 {
  S2(S2&& other) {}
  S2& operator=(S2&& other) { return *this; }
  ~S2() {}
  union { Trivial field; };
};
static_assert(__is_trivially_relocatable(S2), "");
```
Before the fix Clang would warn that 'trivial_abi' is disallowed on 'S2' because it has a field of a non-trivial class type (the type of the anonymous union is non-trivial, because it doesn't have the `[[clang::trivial_abi]]` attribute applied to it). Consequently, before the fix the `static_assert` about `__is_trivially_relocatable` would fail. Note that `[[clang::trivial_abi]]` cannot be applied to the anonymous union, because Clang warns that 'trivial_abi' is disallowed on '(unnamed union at ...)' because its copy constructors and move constructors are all deleted. Also note that it is impossible to provide copy or move constructors for anonymous unions and structs. Reviewed By: gribozavr2 Differential Revision: https://reviews.llvm.org/D155895
Copy full SHA for bddaa35 - Browse repository at this point
-
[AggressiveInstCombine][NFC] Fix typo
AggressiveInstCombine fix typo in expandStrcmp method. Differential Revision: https://reviews.llvm.org/D156556
Copy full SHA for 5dde755 - Browse repository at this point
-
Copy full SHA for f0e8bda - Browse repository at this point
-
[Clang][NVPTX] Permit use of the alias attribute for NVPTX targets
The patch in D155211 added basic support for the `.alias` keyword in PTX. This means we should be able to permit use of this in clang. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D156014
Copy full SHA for 0ba9aec - Browse repository at this point
-
[clang-tidy] Add fix-it support to `llvmlibc-inline-function-decl`
This is very simplistic and could be more thorough by replacing an existing `LIBC_INLINE` in the wrong location or a redundant `inline` when inserting the right macro use. But as is this suffices to automatically apply fixes for most or all of the instances in the libc tree today and get working results (despite some superfluous `inline` keywords left behind). Reviewed By: abrachet Differential Revision: https://reviews.llvm.org/D157164
Copy full SHA for 9d4162f - Browse repository at this point
-
[libc] Clean up required LIBC_INLINE uses in src/string
This was generated using clang-tidy and clang-apply-replacements, on src/string/*.cpp for just the llvmlibc-inline-function-decl check, after applying https://reviews.llvm.org/D157164, and then some manual fixup. Reviewed By: abrachet Differential Revision: https://reviews.llvm.org/D157169
Copy full SHA for 019a477 - Browse repository at this point
-
MIPS: clear_cache, use _flush_cache instead of cacheflush
The cacheflush is only defined with __USE_MISC, which depends on _DEFAULT_SOURCE, _GNU_SOURCE, _BSD_SOURCE, or _SVID_SOURCE. If CC is called with -std=c11, these macros won't be defined. Let's use _flush_cache, which is always defined. Reviewed By: brad, jrtc27 Differential Revision: https://reviews.llvm.org/D156072
Copy full SHA for 0f99bc2 - Browse repository at this point
-
[clang-tidy][NFC] Update tests to specify CheckOptions using new syntax
In D128337, the spelling of CheckOptions was updated to support a more natural dictionary syntax. This patch just updates all test files to use the new syntax. Reviewed By: PiotrZSL Differential Revision: https://reviews.llvm.org/D130209
Copy full SHA for e8a3dda - Browse repository at this point
-
[clang-tidy][NFC] Update tests to CheckOptions using new syntax
This patch just updates all test files to use the new syntax. Fix for changes introduced after D130209 was created.
Copy full SHA for 1af159e - Browse repository at this point
-
[NVPTX] Fix missed test after adding alias support for NVPTX
Summary: This test was accidentally not updated.
Copy full SHA for 9e99a4f - Browse repository at this point
-
Revert "Anonymous unions should be transparent wrt `[[clang::trivial_…
…abi]]`." This reverts commit bddaa35. Reverting as requested at https://reviews.llvm.org/D155895#4566945 (for breaking tests on Windows).
Copy full SHA for 0342bbf - Browse repository at this point
-
[RISCV] Add back handling of X > -1 to ISD::SETCC lowering.
There are cases where the -1 doesn't become visible until lowering so the folding doesn't have a chance to run. I think in these cases there is a missed DAGCombine for truncate (undef), which I may fix separately, but RISC-V backend should protect itself. Fixes #64503. Reviewed By: asb Differential Revision: https://reviews.llvm.org/D157314
Copy full SHA for 7cc6154 - Browse repository at this point
-
[InstrProf] Fix macOS profile tests after D156569
In https://reviews.llvm.org/D156569 we changed the format of the IRPGO counter names which broke some macOS tests because the `__profc_` variable names changed. Use `{{_?}}` to allow mangled names to be prefixed with `_` to pass tests. Reviewed By: thakis Differential Revision: https://reviews.llvm.org/D157321
Copy full SHA for b98d0b2 - Browse repository at this point
-
[lldb] Make TSan errors fatal when running the test suite
Set the halt_on_error runtime flag to make TSan errors fatal when running the test suite. For the API tests the environment variables are set conditionally on whether TSan is enabled. The Shell and Unit tests don't have that logic but setting the environment variable is harmless. For consistency, I've also mirrored the ASAN option (detect_stack_use_after_return=1) for the Shell tests. Differential revision: https://reviews.llvm.org/D157152
Copy full SHA for 17226c9 - Browse repository at this point
-
[OpenMP] Disable some offloading/api tests for AArch64
Like for x86_64-linux-gnu, these need to be disabled for aarch64-linux-gnu. Differential Revision: https://reviews.llvm.org/D156815
Copy full SHA for 4dce6d3 - Browse repository at this point
-
[FileCheck] Turn errors into assert in valueFromStringRepr()
getWildcardRegex() guarantees that only valid hex numbers are matched by FileCheck numeric expressions. This commit therefore only asserts the lack of parsing failure in valueFromStringRepr(). Depends On D154430 Reviewed By: arichardson Differential Revision: https://reviews.llvm.org/D154431
Copy full SHA for b743c19 - Browse repository at this point
-
Copy full SHA for c192b3d - Browse repository at this point
-
[clang-tidy] Update tests to include C++23 and C++26
This commit changes the `c++xx-or-later` definitions to also include C++23 and the upcoming C++26. Also, this commit adjusts a couple of test cases slightly:
* `container-contains.cpp` now also tests newer C++ versions. Restricting it to C++20 was an oversight of mine when originally writing this check.
* `unconventional-assign-operator.cpp`: The `return rhs` raised a "non-const lvalue reference to type 'BadReturnStatement' cannot bind to a temporary" error in C++23. The issue is circumvented by writing `return *&rhs`.
* `const-correctness-values.cpp` was also running into the same error in C++23. The troublesome test cases were moved to a separate file.
Differential Revision: https://reviews.llvm.org/D157246
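The `return *&rhs` workaround can be illustrated with a small sketch. The exact shape of the test case is an assumption here; only the type name `BadReturnStatement` and the two `return` spellings come from the commit message. In C++23 a returned rvalue-reference parameter is treated as an xvalue, so it can no longer bind to a non-const lvalue reference return type, whereas `*&rhs` is an ordinary lvalue in every language mode.

```cpp
#include <cassert>

// Assumed shape of the test type; only the name comes from the commit message.
struct BadReturnStatement {
  int n = 0;

  // Deliberately unconventional: returns the right-hand side, not *this.
  BadReturnStatement &operator=(BadReturnStatement &&rhs) {
    n = rhs.n;
    // `return rhs;` is rejected in C++23 because `rhs` is then an xvalue.
    // `*&rhs` names the same object but is an lvalue expression.
    return *&rhs;
  }
};
```

The snippet compiles unchanged under older standards as well, which is what lets one test file cover every `-std=` mode.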
Copy full SHA for fda7778 - Browse repository at this point
-
Copy full SHA for 729b55e - Browse repository at this point
-
[AArch64] Narrow G_SEXT_INREG to s64 before lowering.
This avoids narrowing after it has been expanded to shifts. The G_SEXT_INREG narrowing can use the second operand of the instruction to optimize the narrowing. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D157172
Copy full SHA for 07c8bcc - Browse repository at this point
-
[libc] Add nullptr check option to printf %s
Some printf implementations perform a null check on pointers passed to %s. While that's not in the standard, this patch adds it as an option for compatibility. It also puts a similar check in %n behind the same flag. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D156923
Copy full SHA for f6ba352 - Browse repository at this point
-
[clang][CGExprConstant] handle unary negation on integrals
Consider the statement:
int x = -1;
And the following AST:
`-VarDecl 0x55c4823a7670 <x.c:2:1, col:10> col:5 x 'int' cinit
  `-UnaryOperator 0x55c4823a7740 <col:9, col:10> 'int' prefix '-'
    `-IntegerLiteral 0x55c4823a7720 <col:10> 'int' 1
Return the evaluation of the subexpression negated. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D156378
Copy full SHA for 769333a - Browse repository at this point
-
[mlir][sparse] minor cleanup of merger unit test
Removed some of the warning suppression needed for the multi-arg macro logic by making the number of arguments the same everywhere. Also removes some verbose comments and obvious TODOs. Reviewed By: wrengr Differential Revision: https://reviews.llvm.org/D157327
Copy full SHA for 1e15d79 - Browse repository at this point
-
[clang][CGExprConstant] handle implicit widening/narrowing Int-to-Int…
… casts Consider the following statements:
long x = 1;
short y = 1;
With the following AST:
|-VarDecl 0x55d289973730 <x.c:1:1, col:10> col:6 x 'long' cinit
| `-ImplicitCastExpr 0x55d289973800 <col:10> 'long' <IntegralCast>
|   `-IntegerLiteral 0x55d2899737e0 <col:10> 'int' 1
`-VarDecl 0x55d289973830 <line:2:1, col:11> col:7 y 'short' cinit
  `-ImplicitCastExpr 0x55d2899738b8 <col:11> 'short' <IntegralCast>
    `-IntegerLiteral 0x55d289973898 <col:11> 'int' 1
Sign or Zero extend or truncate based on the source signedness and destination width. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D156466
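The rule "sign or zero extend or truncate based on the source signedness and destination width" can be sketched on raw bit patterns. `integralCast` is a hypothetical helper written for this illustration; it is not the code in CGExprConstant, which operates on LLVM constants rather than `uint64_t`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reinterpret the low `srcBits` of `value` as an
// integer of width `dstBits` (both assumed to be in the range 1..64).
inline std::uint64_t integralCast(std::uint64_t value, unsigned srcBits,
                                  bool srcSigned, unsigned dstBits) {
  // All-ones mask covering the low `bits` bits.
  auto mask = [](unsigned bits) -> std::uint64_t {
    return bits >= 64 ? ~std::uint64_t{0} : (std::uint64_t{1} << bits) - 1;
  };
  value &= mask(srcBits);
  const bool negative = srcSigned && ((value >> (srcBits - 1)) & 1);
  if (dstBits > srcBits && negative)
    value |= ~mask(srcBits); // widening a negative value: sign extend
  // Widening an unsigned or non-negative value leaves the high bits zero;
  // narrowing simply drops the high bits.
  return value & mask(dstBits);
}
```

For the statements in the commit message, the `int` literal 1 widened to a 64-bit `long` or narrowed to a 16-bit `short` stays 1; the interesting cases are negative sources, where the signedness of the source decides the fill bits.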
Copy full SHA for f6267d3 - Browse repository at this point
-
[libc][cleanup] Fix most conversion warnings
This patch is large, but is almost entirely just adding casts to calls to syscall_impl. Much of the work was done programmatically, with human checking when the syntax or types got confusing. Reviewed By: mcgrathr Differential Revision: https://reviews.llvm.org/D156950
Copy full SHA for f0a3954 - Browse repository at this point
-
[RISCV][GlobalISel] Legalize bitshift instructions for narrow types
Legalize G_SHL, G_ASHR and G_LSHR for types narrower than and up to (and including) XLen: (i7, i8, i16 and i32) for rv32 and (i8, i15, i16, i32 and i64) for rv64. This requires adding some rules to handle G_ANYEXT, G_ZEXT and G_SEXT. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D155772
Copy full SHA for 649e1d1 - Browse repository at this point
-
Reland [clang][DeclPrinter] Fix missing semicolon in AST print for me…
…thods that are definitions without having a body DeclPrinter used FunctionDecl::isThisDeclarationADefinition to decide if the decl requires a semicolon at the end. However, there are several methods without body (that require a semicolon) that are definitions. Fixes llvm/llvm-project#62996 Initial commit had a failing test case on targets not supporting `__attribute__((alias))`. Added `-triple i386-linux-gnu` to the specific test case. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D156533
Copy full SHA for 3e66a17 - Browse repository at this point
-
Flag for LoadBinaryWithUUIDAndAddress, to create memory image or not
DynamicLoader::LoadBinaryWithUUIDAndAddress can create a Module based on the binary image in memory, which in some cases contains symbol names and can be genuinely useful. If we don't have a filename, it creates a name in the form `memory-image-0x...` with the header address. In practice, this is most useful with Darwin userland corefiles where the binary was stored in the corefile in whole, and we can't find a binary with the matching UUID. Using the binary out of the corefile memory in this case works well. But in other cases, akin to firmware debugging, we merely end up with an oddly named binary image and no symbols. Add a flag to control whether we will create these memory images and add them to the Target or not; only set it to true when working with a userland Mach-O image with the "all image infos" LC_NOTE for a userland corefile. Differential Revision: https://reviews.llvm.org/D157167
Copy full SHA for 57cbd26 - Browse repository at this point
-
Copy full SHA for df3800f - Browse repository at this point
-
AMDGPU: Fix counting source modifiers as literal constants
This fixes overestimating code size. This was broken by 79f52af. https://reviews.llvm.org/D157103
Copy full SHA for 4b1702e - Browse repository at this point
-
Copy full SHA for 0b57c3a - Browse repository at this point
-
Copy full SHA for 300c5aa - Browse repository at this point
-
[clang][DeclPrinter] Fix AST print to suppress output of implicit (no…
…n-written) constructor initializers DeclPrinter::PrintConstructorInitializers did output non-written constructor initializers. In particular, implicit constructor initializers of base classes were output. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D156523
Copy full SHA for 291eb25 - Browse repository at this point
-
[OpenMP][Docs] Update OpenMP supported features table
Updated status of alignment clause for allocate directive in OpenMP features table, section OpenMP 5.1 Implementation Details. Differential Revision: https://reviews.llvm.org/D157135
Copy full SHA for f620472 - Browse repository at this point
-
[RISCV][GlobalISel] Legalize logical instructions for nonpow 2 types
Legalize G_AND, G_OR, G_XOR for (s7, s48) on rv32 and (s15, s72) on rv64 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157017
Copy full SHA for 3bcfd6e - Browse repository at this point
-
[test][libc] Fix aligned_alloc argument
Size must be multiple of Alignment. Reviewed By: gchatelet Differential Revision: https://reviews.llvm.org/D157247
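As a small illustration of the constraint that the size passed to `aligned_alloc` must be a multiple of the alignment, a caller can round the requested size up first. `roundUpToAlignment` is a hypothetical helper for this sketch and is not taken from the libc test.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: smallest multiple of `alignment` that is >= `size`.
// Assumes alignment > 0 and that the addition does not overflow.
constexpr std::size_t roundUpToAlignment(std::size_t size,
                                         std::size_t alignment) {
  return (size + alignment - 1) / alignment * alignment;
}
```

A request for 10 bytes at 8-byte alignment would then be issued as `aligned_alloc(8, roundUpToAlignment(10, 8))`, that is, with a size of 16.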
Copy full SHA for 9abc1e0 - Browse repository at this point
-
[RISCV][GlobalISel] Legalize add/sub for wide and non-pow2 types
Legalize G_ADD, G_SUB, G_(S/U)ADD(O/E). We test for (s7, s48, s64, s96) on rv32 and (s15, s72, s128, s192) on rv64. Differential Revision: https://reviews.llvm.org/D157019
Copy full SHA for cd61e8d - Browse repository at this point
-
[RISCV][GlobalISel] Legalize G_ICMP and G_SELECT
Test legalization for (i7, i8, i16, i32, i48, i64) on rv32 and for (i8, i15, i16, i32, i64, i72, i128) on rv64. Legalization fails for i96 on rv32 and i192 on rv64. Note that [i192 fails for AArch64](llvm/llvm-project#64394). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157023
Copy full SHA for c9fe119 - Browse repository at this point
Commits on Aug 8, 2023
-
[InstCombine] Introduce tests for D156811
Introduce test cases for folding `select` of `srem` and conditional add. Differential Revision: https://reviews.llvm.org/D156862
Copy full SHA for f5cb626 - Browse repository at this point
-
[InstCombine] Fold `select` of `srem` and conditional add
Simplify a pattern that may show up when computing the remainder of euclidean division. Particularly, when the divisor is a power of two and never negative, the signed remainder can be folded with a bitwise and. Fixes 64305. Proofs: https://alive2.llvm.org/ce/z/9_KG6c Differential Revision: https://reviews.llvm.org/D156811
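The equivalence behind this fold can be checked with a short sketch. The two functions below are illustrative only (the names are made up, and the actual transform works on LLVM IR): the first spells out the select-of-srem-plus-add pattern, the second the bitwise-and form, valid under the assumption that `n` is a positive power of two.

```cpp
#include <cassert>
#include <cstdint>

// Euclidean remainder written as the pattern being matched:
// take the signed remainder, then add the divisor back if it is negative.
inline std::int32_t euclideanRemSelect(std::int32_t x, std::int32_t n) {
  const std::int32_t r = x % n; // signed remainder, sign follows x
  return r < 0 ? r + n : r;
}

// Folded form, assuming n is a positive power of two and a
// two's-complement representation of negative values.
inline std::int32_t euclideanRemMask(std::int32_t x, std::int32_t n) {
  return x & (n - 1);
}
```

For example, with `x = -1` and `n = 8` the remainder is -1, the conditional add yields 7, and `-1 & 7` is also 7.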
Copy full SHA for 2116921 - Browse repository at this point
-
[RISCV] Use vmv.s.x for a constant build_vector when the entire size …
…is less than 32 bits We have a variant of this for splats already, but hadn't handled the case where a single copy of the wider element can be inserted producing the entire required bit pattern. This shows up mostly in very small vector shuffle tests. Differential Revision: https://reviews.llvm.org/D157299
Copy full SHA for f0a9aac - Browse repository at this point
-
[Coroutine][DebugInfo] Pre-commit test for a DISubprogram with declar…
…ation. (NFC) Pre-commit test for D157184. Differential Revision: https://reviews.llvm.org/D157177
Copy full SHA for 88a83c9 - Browse repository at this point
-
[Coroutine][DebugInfo] Update the linkage name of the declaration of …
…coro-split functions in the debug info. This patch adds the linkage name update to DISubprogram's declaration after 6ce76ff. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D157184
Copy full SHA for ca1a5b3 - Browse repository at this point
-
[mlir][sparse][gpu] add spgemm operator
Differential Revision: https://reviews.llvm.org/D152981
Kun Wu committed Aug 8, 2023
Copy full SHA for dfe2942 - Browse repository at this point
-
[SystemZ] Avoid type legalization on structs
In SystemZTTIImpl::getMemoryOpCost, the call to getNumberOfParts will run type legalization, which can't handle structs. So before that, we check for an unknown value type and forward to BaseT, just like many other targets do in this situation. https://bugzilla.redhat.com/show_bug.cgi?id=2224885 Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D156379
Copy full SHA for 85e4ee1 - Browse repository at this point
-
[Clang][LoongArch] Fix ABI handling of empty structs in C++ to match …
…GCC behaviour GCC doesn't ignore non-zero-length array of empty structures in C++ while clang does. What this patch did is to match GCC's behaviour although this rule is not documented in psABI. Similar to D142327 for RISCV. Reviewed By: xry111, xen0n Differential Revision: https://reviews.llvm.org/D156116
Copy full SHA for e7a8a7d - Browse repository at this point
-
[clang][hexagon] Handle library path arguments earlier
The removal of the early return in 96832a6 was an error: it would include the 'standalone' library that's not used by linux. Instead we reproduce the library path handling in the linux/musl block. Differential Revision: https://reviews.llvm.org/D156771
Copy full SHA for 5bc4b34 - Browse repository at this point
-
[scudo] Dump MapAllocatorCache::retrieve() data
Keeps track of CallsToRetrieve and SuccessfulRetrieves for cached block allocations, and dumps this data in the MapAllocatorCache::getStats() function. Reviewed By: cferris, Chia-hungDuan Differential Revision: https://reviews.llvm.org/D157154
Copy full SHA for 12a22ec - Browse repository at this point
-
[mlir][sparse][gpu] fix spgemm runtime compile error
Differential Revision: https://reviews.llvm.org/D157349
Kun Wu committed Aug 8, 2023
Copy full SHA for 0664db5 - Browse repository at this point
-
[lldb] Fix data race in ConnectionFileDescriptor
TSan reports the following data race:
Write of size 4 at 0x000109e0b160 by thread T2 (mutexes: write M0, write M1):
  #0 NativeFile::Close() File.cpp:329
  #1 ConnectionFileDescriptor::Disconnect(lldb_private::Status*) ConnectionFileDescriptorPosix.cpp:232
  #2 Communication::Disconnect(lldb_private::Status*) Communication.cpp:61
  #3 process_gdb_remote::ProcessGDBRemote::DidExit() ProcessGDBRemote.cpp:1164
  #4 Process::SetExitStatus(int, char const*) Process.cpp:1097
  #5 process_gdb_remote::ProcessGDBRemote::MonitorDebugserverProcess(...) ProcessGDBRemote.cpp:3387
Previous read of size 4 at 0x000109e0b160 by main thread (mutexes: write M2):
  #0 NativeFile::IsValid() const File.h:393
  #1 ConnectionFileDescriptor::IsConnected() const ConnectionFileDescriptorPosix.cpp:121
  #2 Communication::IsConnected() const Communication.cpp:79
  #3 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...) GDBRemoteCommunication.cpp:256
  #4 process_gdb_remote::GDBRemoteCommunication::WaitForPacketNoLock(...l) GDBRemoteCommunication.cpp:244
  #5 process_gdb_remote::GDBRemoteClientBase::SendPacketAndWaitForResponseNoLock(llvm::StringRef, StringExtractorGDBRemote&) GDBRemoteClientBase.cpp:246
The problem is that in WaitForPacketNoLock's run loop, it checks that the connection is still connected. This races with the ConnectionFileDescriptor disconnecting. Most (but not all) access to the IOObject in ConnectionFileDescriptorPosix is already gated by the mutex. This patch just protects IsConnected in the same way. Differential revision: https://reviews.llvm.org/D157347
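The idea of gating the connectivity check with the same mutex as the disconnect path can be sketched in isolation. The `Connection` class below is a hypothetical stand-in, not lldb's `ConnectionFileDescriptor`; the member names and the choice of a recursive mutex are assumptions for the example. Note that this list also contains a later revert of the patch, so the sketch shows the general technique rather than code that landed.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a connection that owns a file descriptor.
class Connection {
public:
  explicit Connection(int fd) : fd_(fd) {}

  // The read takes the same lock as the write below, so a concurrent
  // disconnect() can no longer race with this check.
  bool isConnected() const {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    return fd_ >= 0;
  }

  void disconnect() {
    std::lock_guard<std::recursive_mutex> guard(mutex_);
    fd_ = -1; // mark the descriptor as closed
  }

private:
  mutable std::recursive_mutex mutex_; // mutable so const readers can lock
  int fd_;
};
```

Locking in the reader removes the data race on the descriptor, but it also means the reader can block behind whatever the writer does while holding the lock, which is the usual trade-off to check before applying this pattern.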
Copy full SHA for 0bdbe7b - Browse repository at this point
-
Clarify the invariant of the MLIR pass pipeline around `Pass::initial…
…ize()` This method should not load new dialects or affect the context itself. Differential Revision: https://reviews.llvm.org/D157198
Copy full SHA for 3e2e10b - Browse repository at this point
-
[MLIR] Make the `ConversionTarget` const ref in the DialectConversion…
… (NFC) It isn't mutated during the conversion already, communicate this through the API. Differential Revision: https://reviews.llvm.org/D157199
Copy full SHA for 370a6f0 - Browse repository at this point
-
Add a generic "convert-to-llvm" pass delegating to an interface
The multiple -convert-XXX-to-llvm passes are really nice testing tools for individual dialects, but the expectation is that a proper conversion should assemble the conversion patterns using the `populateXXXToLLVMConversionPatterns()` APIs. However, most customers just chain the conversion passes for convenience. This pass makes it more transparent and composable to assemble the required patterns for conversion to the LLVM dialect by using an interface. The pass will scan the input and collect all the dialects present, and for those that implement the `ConvertToLLVMPatternInterface` it will use it to populate the conversion patterns, and possibly the conversion target. Since these conversions can involve intermediate dialects, or target dialects other than LLVM (for example AVX or NVVM), this pass can't statically declare the required `getDependentDialects()` before the pass pipeline begins. This is worked around by using an extension in the dialectRegistry that will be invoked for every newly loaded dialect in the context. This allows looking up the interface ahead of time and using it to query the dependent dialects. Differential Revision: https://reviews.llvm.org/D157183
Copy full SHA for 4529797 - Browse repository at this point
-
Copy full SHA for 0620f99 - Browse repository at this point
-
Add missing libraries dependencies to ConvertToLLVMPass to fix the sh…
…ared library build (NFC)
Copy full SHA for c8aab9b - Browse repository at this point
-
Revert "[lldb] Fix data race in ConnectionFileDescriptor"
This reverts commit 0bdbe7b because it broke the bots.
Copy full SHA for caa5167 - Browse repository at this point
-
[clang][ASTImporter] Add import of 'DependentSizedExtVectorType'
Add import of 'DependentSizedExtVectorType'. Reviewed By: balazske Differential Revision: https://reviews.llvm.org/D157238
Copy full SHA for db92fb8 - Browse repository at this point
-
[clang][ASTImporter] Add import of 'ConvertVectorExpr'
Add import of ConvertVectorExpr. Reviewed By: balazske Differential Revision: https://reviews.llvm.org/D157249
Copy full SHA for df21f9f - Browse repository at this point
-
Revert "[Clang] Fix -Wconstant-logical-operand when LHS is a constant"
This reverts commit dfdfd30. An issue was reported about an incorrect warning, so this has to be reconsidered. Differential Revision: https://reviews.llvm.org/D157352
Copy full SHA for a845252 - Browse repository at this point
-
[PPC32] Parse bl __tls_get_addr(x@tlsgd)@plt+32768
PPC32 -fpic/-fPIC generates `bl __tls_get_addr(x@tlsgd)@PLT` or `bl __tls_get_addr(x@tlsgd)@plt+32768`. `powerpc-linux-gnu-gcc -fPIC` generates `bl __tls_get_addr+32768(x@tlsgd)@plt`. These expressions can be parsed by the GNU assembler but not by the integrated assembler. Add the support. Differential Revision: https://reviews.llvm.org/D153206
Copy full SHA for 6e07e90 - Browse repository at this point
-
[Sanitizers] Fix test in pulldown (#10725)
This reverts commit aa4cd66. This will resolve the following failures:
```
SanitizerCommon-asan-i386-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-asan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-lsan-i386-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-lsan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
SanitizerCommon-msan-x86_64-Linux::sanitizer_coverage_allowlist_ignorelist.cpp
```
Copy full SHA for 32f6343 - Browse repository at this point
-
[RISCV] Remove pre-defined macro test for b extension. NFC.
B extension has been removed. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157353
Copy full SHA for 767ca3a - Browse repository at this point
-
[mlir][Linalg] Clarify error message in YieldOp verification NFC
The number of values yielded from a LinalgOp's payload has to match the number of inits / outs operands of the LinalgOp. These two numbers got mixed up in the respective error message, this patch clarifies the message and updates the tests. Reviewed By: nicolasvasilache, mehdi_amini Differential Revision: https://reviews.llvm.org/D153124
Copy full SHA for 8184737 - Browse repository at this point
-
[CSKY] Optimize conditional branch and value select with BTSTI
Reviewed By: zixuan-wu Differential Revision: https://reviews.llvm.org/D154768
Commit 30b52a3
-
[CSKY][test][NFC] Add tests of multiplication with immediates
These tests will be optimized with IXH32/IXW32/IXD32 in the future. Reviewed By: zixuan-wu Differential Revision: https://reviews.llvm.org/D154332
Commit 731bab5
-
[CSKY] Optimize multiplication with immediates
Optimize "Rx * imm" for specific immediates to ([IXH32|IXW32|IXD32] (LSLI Rx, shift), Rx). Reviewed By: zixuan-wu Differential Revision: https://reviews.llvm.org/D154768
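The arithmetic behind this rewrite can be sketched in plain C++. This is only an illustration, not the CSKY lowering code: it assumes that IXH32, IXW32 and IXD32 compute `a + (b << 1)`, `a + (b << 2)` and `a + (b << 3)` respectively, and the helper names (`ixh32`, `lsli`, `mulBy10`, and so on) are hypothetical. Under that assumption, a multiply by `2^shift + 2`, `2^shift + 4` or `2^shift + 8` becomes one shift plus one indexed add.

```cpp
#include <cstdint>

// Hypothetical models of the CSKY indexed-add instructions.
// ASSUMPTION: IXH32/IXW32/IXD32 add the second operand shifted by 1/2/3.
constexpr uint32_t ixh32(uint32_t a, uint32_t b) { return a + (b << 1); }
constexpr uint32_t ixw32(uint32_t a, uint32_t b) { return a + (b << 2); }
constexpr uint32_t ixd32(uint32_t a, uint32_t b) { return a + (b << 3); }

// Hypothetical model of LSLI (logical shift left by an immediate).
constexpr uint32_t lsli(uint32_t x, unsigned shift) { return x << shift; }

// x * 10 == (x << 3) + (x << 1), i.e. IXH32(LSLI(x, 3), x).
constexpr uint32_t mulBy10(uint32_t x) { return ixh32(lsli(x, 3), x); }

// x * 20 == (x << 4) + (x << 2), i.e. IXW32(LSLI(x, 4), x).
constexpr uint32_t mulBy20(uint32_t x) { return ixw32(lsli(x, 4), x); }

// x * 40 == (x << 5) + (x << 3), i.e. IXD32(LSLI(x, 5), x).
constexpr uint32_t mulBy40(uint32_t x) { return ixd32(lsli(x, 5), x); }
```

Which immediates the real patch accepts is not stated in the commit message, so the three constants above are examples only.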
Commit 57c6fe2
-
[MLIR][Presburger] Implement findSymbolicIntegerLexMin/Max for Presbu…
…rgerRelation This patch implements findSymbolicIntegerLexMin/Max for PresburgerRelation Reviewed By: Groverkss Differential Revision: https://reviews.llvm.org/D156623
Commit ca21398
-
[X86][NFC]Remove dead code in IfConversion.cpp
In line 544, if we go in to isFalse, then the Kind could be ICTriangleFalse and isRev must be False, so we never go into the true branch in line 545, better to remove it. Reviewed By: skan, pengfei Differential Revision: https://reviews.llvm.org/D157260
Commit f4a6038
-
[clang][ExprConstant] Fix crash on uninitialized base class subobject
This patch fixes the reported regression caused by D146358 by adding notes about an uninitialized base class when we diagnose an uninitialized constructor. It also changes the wording to make it clear that the uninitialized subobject is a base class and its constructor is not called. Wording changes:
BEFORE: `subobject of type 'Base' is not initialized`
AFTER: `constructor of base class 'Base' is not called`
Fixes llvm/llvm-project#63496 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D153969
Commit 24c91d4
-
[Clang][AArch64] Add/implement ACLE keywords for SME.
This patch adds all the language-level function keywords defined in: ARM-software/acle#188 (merged) ARM-software/acle#261 (update after D148700 landed) The keywords are used to control PSTATE.ZA and PSTATE.SM, which are respectively used for enabling the use of the ZA matrix array and Streaming mode. This information needs to be available on call sites, since the use of ZA or streaming mode may have to be enabled or disabled around the call-site (depending on the IR attributes set on the caller and the callee). For calls to functions from a function pointer, there is no IR declaration available, so the IR attributes must be added explicitly to the call-site. With the exception of '__arm_locally_streaming' and '__arm_new_za' the information is part of the function's interface, not just the function definition, and thus needs to be propagated through the FunctionProtoType::ExtProtoInfo. This patch adds the definitions of these keywords, as well as codegen and semantic analysis to ensure conversions between function pointers are valid and that no conflicting keywords are set. For example, '__arm_streaming' and '__arm_streaming_compatible' are mutually exclusive. Differential Revision: https://reviews.llvm.org/D127762
Commit 28b5f30
-
[OpenMP][OMPD][Doc] Update OMPD implementations details.
OMPD is already pushed to the LLVM repo through https://reviews.llvm.org/D100181 . Currently, it supports the OpenMP 5.0 standard for the host on Linux machines. Reviewed By: @jdoerfert Differential Revision: https://reviews.llvm.org/D156878
Commit 658490a
-
[NFC] Update formatting of some symbolizer tests
These tests are touched in D149757 and to reduce the number of changes in that patch, the tests are updated here. The test format is fixed according to the rules:
- # for actual comments;
- RUN and CHECK lines are specified without #;
- all comment markers should have a space between them and the rest of the line (e.g. # This is a comment).
In some cases lines are reordered to make CHECK commands closer to the corresponding RUN lines. No other changes are made. Differential Revision: https://reviews.llvm.org/D155943
Commit 876ccd5
-
[libc][doc] Update macros documentation
Update documentation now that macros are laid out in a more structured way. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D143911
Commit 5753103
-
[clang] Error on substitution failure within lambda body inside a req…
…uires-expression Per CWG 2672 substitution failure within the body of a lambda inside a requires-expression should be a hard error. Fixes llvm/llvm-project#64138 Reviewed By: cor3ntin Differential Revision: https://reviews.llvm.org/D156993
Commit 38cf47f
-
[clang] Pass --cuda-path to fix test/Driver/openmp-offload-jit.c
This test was trying to detect a system installation of CUDA and was marked as returning exit code 1 as part of D156363. Pass an explicit CUDA installation to make the test return exit code 0 regardless of whether a CUDA installation is found on the system. Also add an explicit -march to get a stable test.
Commit cb3136b
-
Commit 0d05992
-
[VPlan] Use IterT template arg directly for VPInstruction operands (NFC)
Makes the constructors a bit more flexible, to be used in D157194 & D157144.
Commit e2851ad
-
[RISCV] Add fixed vector tests for ct[l,t]z_zero_undef
Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157293
Commit 44383ac
-
[RISCV] Lower unary zvbb ops for fixed vectors
This reuses the same strategy for fixed vectors as other ops, i.e. custom lower to a scalable *_vl SD node. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157294
Commit 768740e
-
[RISCV] Lower vro{l,r} for fixed vectors
We need to add new VL nodes to mirror ISD::ROTL and ISD::ROTR. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157295
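For readers unfamiliar with the rotate nodes being mirrored, the per-lane semantics can be sketched in scalar C++. This is a minimal model, not the RISC-V lowering: it assumes 32-bit lanes and a rotate amount taken modulo the lane width, and the names `rotl32`, `rotr32` and `vrol` are hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Rotate a 32-bit lane left. ASSUMPTION: the amount is taken modulo 32.
constexpr uint32_t rotl32(uint32_t V, uint32_t Amt) {
  Amt &= 31;
  return Amt == 0 ? V : (V << Amt) | (V >> (32 - Amt));
}

// Rotate a 32-bit lane right, with the same modulo-32 assumption.
constexpr uint32_t rotr32(uint32_t V, uint32_t Amt) {
  Amt &= 31;
  return Amt == 0 ? V : (V >> Amt) | (V << (32 - Amt));
}

// Element-wise rotate-left over a fixed-length vector of N lanes.
template <std::size_t N>
std::array<uint32_t, N> vrol(const std::array<uint32_t, N> &Vec,
                             const std::array<uint32_t, N> &Amts) {
  std::array<uint32_t, N> Out{};
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = rotl32(Vec[I], Amts[I]);
  return Out;
}
```

The `Amt == 0` check avoids shifting by the full lane width, which is undefined behaviour in C++.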
Commit 5d510ea
-
[clang-tidy][NFC] Remove trailing whitespaces from ProTypeVarargCheck
Just a whitespace cleanup.
Commit 7899d2a
-
[X86] matchTruncateWithPACK - canonically prefer v4i64 -> v4i32 shuff…
…le vs truncation Pulled out of LowerTruncateVecPackWithSignBits - prefer shuffles unless we can cheaply split the vector. ComputeNumSignBits struggles with vXi64 through bitcasts, so we're usually better off with shuffles.
Commit 943fda5
-
[FileCheck] Fix MSVC 'argument': truncation from 'int' to 'bool' warn…
…ing. Ensure expectOperationValueResult performs the is_integral_v as constexpr to prevent MSVC getting confused between the mixture of integer / string constructors in the if-else. Warning introduced in D150880
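The general pattern can be illustrated as follows. This is a sketch, not the FileCheck unit-test code: `ExpectedValue` and `makeExpected` are hypothetical names. With a plain `if`, both branches are compiled for every `T`, so an integer argument is also checked against the string constructor; `if constexpr` discards the branch that does not apply.

```cpp
#include <string>
#include <type_traits>

// Hypothetical result holder with both an integer and a string constructor.
struct ExpectedValue {
  explicit ExpectedValue(long long V) : IsNumber(true), Number(V) {}
  explicit ExpectedValue(const std::string &S) : IsNumber(false), Text(S) {}
  bool IsNumber;
  long long Number = 0;
  std::string Text;
};

// Only the branch matching T is instantiated, so an integral T never
// reaches the string constructor (and a string never reaches the integer one).
template <typename T> ExpectedValue makeExpected(const T &Val) {
  if constexpr (std::is_integral_v<T>)
    return ExpectedValue(static_cast<long long>(Val));
  else
    return ExpectedValue(std::string(Val));
}
```

This needs C++17 or later for `if constexpr` and `std::is_integral_v`.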
Commit 5bd8f48
-
[DAG] Add constant SPLAT handling in getNodes SIGN_EXTEND_INREG
This helps simplify constant splats a little. Without this the code in llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L14072 always returns the existing node. Differential Revision: https://reviews.llvm.org/D157259
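What folding a constant splat through SIGN_EXTEND_INREG amounts to can be sketched in scalar C++. This is an illustration under stated assumptions, not the SelectionDAG code: lanes are taken to be 32 bits wide, and `signExtendInReg` and `foldSplat` are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Sign-extend the low FromBits bits of V within a 32-bit lane.
// Precondition (assumed): 1 <= FromBits <= 32.
constexpr int32_t signExtendInReg(uint32_t V, unsigned FromBits) {
  const uint32_t SignBit = uint32_t{1} << (FromBits - 1);
  const uint32_t Mask = (SignBit << 1) - 1; // wraps to all-ones for 32 bits
  const uint32_t Low = V & Mask;
  // Flipping and then subtracting the sign bit propagates it upwards.
  return static_cast<int32_t>((Low ^ SignBit) - SignBit);
}

// Fold a constant splat: every lane holds the same constant, so the
// extension is computed once and the result is splatted again.
template <std::size_t N>
std::array<int32_t, N> foldSplat(uint32_t SplatVal, unsigned FromBits) {
  std::array<int32_t, N> Out{};
  const int32_t Folded = signExtendInReg(SplatVal, FromBits);
  for (std::size_t I = 0; I < N; ++I)
    Out[I] = Folded;
  return Out;
}
```

For example, a splat of `0xFF` extended from 8 bits folds to a splat of `-1`.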
Commit de775f2
-
Commit 7f32088
-
[flang][nfc] Add debug prints to FIR alias analysis
These make it easier to debug and improve alias analysis. Enable with --debug-only=fir-alias-analysis. Differential Revision: https://reviews.llvm.org/D157105
Commit d82a158
-
[flang] support (hl)fir.declare in alias analysis
Differential Revision: https://reviews.llvm.org/D157106
Commit c732a45
-
Commit 90ecb9d
-
Revert "[flang] support (hl)fir.declare in alias analysis"
Reverting because of buildbot failure. This reverts commit c732a45.
Commit 4492ec7
-
[RISCV][Lsan] Set allocator for AP64
This patch uses similar allocator configuration to Asan, i.e. dynamic allocator start address (~(uptr)0) and 128 GB allocator size. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D152895
Commit e7191fb
-
[mlir][TOSA] Set default TOSA validation level to 'None' for TOSA -> …
…linalg Unless otherwise specified this pass should not assume a level, as this rejects otherwise valid TOSA. This has caused build failures in IREE. The level (and other validation options) have now been made configurable. The pass options have been converted to enums to make them more type safe in C++. Reviewed By: Tai78641 Differential Revision: https://reviews.llvm.org/D157282
Commit 32b7c1f
-
Commit f9a609c
-
[VPlan] Use printOperands for VPInstruction.
Use the printOperands for printing VPInstruction's operands to be more in line with other recipes and ensure consistent printing after D15719. Also removes some stray spaces in print output.
Commit 93c5bae
-
[CodeGen] Pre-commit tests showing incorrect pattern FMLA_* pseudo in…
…structions Differential Revision: https://reviews.llvm.org/D157094
Commit b560d5c
-
[VPlan] Model wrap flags directly, remove *NUW opcodes (NFC)
Model wrap flags directly using VPRecipeWithIRFlags and clean up the duplicated *NUW opcodes. D157144 will build on this and also model FMFs for VPInstruction. Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D157194
Commit af635a5
-
[MLIR][NVGPU] Handling Offset in `nvgpu.tma.async.load`
When using `nvgpu.tma.async.load` Op to asynchronously load data into shared memory, it fails to account for provided offsets, potentially leading to incorrect memory access. Using offset is common practice especially with the dynamic shared memory. This work addresses the problem by ensuring proper consideration of offsets. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D157380
Commit 50a76a7
-
[Clang] Fix the do while statement disappearing in AST when an error …
…occurs in the conditional expression of the do while statement
```
constexpr int test() {
    do {} while (a + 1 < 10);
    return 0;
}
```
Before:
```
`-FunctionDecl 0x56512a172650 <./recovery.cpp:1:1, line:4:1> line:1:15 constexpr test 'int ()' implicit-inline
  `-CompoundStmt 0x56512a172860 <col:22, line:4:1>
    `-ReturnStmt 0x56512a172850 <line:3:5, col:12>
      `-IntegerLiteral 0x56512a172830 <col:12> 'int' 0
```
Now:
```
`-FunctionDecl 0x5642c4804650 <./recovery.cpp:1:1, line:4:1> line:1:15 constexpr test 'int ()' implicit-inline
  `-CompoundStmt 0x5642c48048e0 <col:22, line:4:1>
    |-DoStmt 0x5642c4804890 <line:2:5, col:28>
    | |-CompoundStmt 0x5642c4804740 <col:8, col:9>
    | `-BinaryOperator 0x5642c4804870 <col:18, col:26> '<dependent type>' contains-errors '<'
    |   |-BinaryOperator 0x5642c4804850 <col:18, col:22> '<dependent type>' contains-errors '+'
    |   | |-RecoveryExpr 0x5642c4804830 <col:18> '<dependent type>' contains-errors lvalue
    |   | `-IntegerLiteral 0x5642c48047b0 <col:22> 'int' 1
    |   `-IntegerLiteral 0x5642c48047f0 <col:26> 'int' 10
    `-ReturnStmt 0x5642c48048d0 <line:3:5, col:12>
      `-IntegerLiteral 0x5642c48048b0 <col:12> 'int' 0
```
Reviewed By: hokein Differential Revision: https://reviews.llvm.org/D157195
Commit a2132d7
-
Commit 7542477
-
[Clang][Tooling] Accept preprocessed input files
This restores the tooling library's ability to accept invocations that take a preprocessed file as the primary input. Regressed by https://reviews.llvm.org/D105695 Fixes llvm/llvm-project#63941 Differential Revision: https://reviews.llvm.org/D157011
Commit 241cceb
-
[VPlan] Fold if into return in prepareToExecute assertion (NFC).
Independent simplification suggested in D157194.
Commit e18a547
-
[AArch64][SME2][SVE2p1] Choose strided or contiguous loads
Lower to the strided/contiguous addressing mode of ld1/ldnt1 instructions depending on register allocation. Differential Revision: https://reviews.llvm.org/D156311
Commit e8efe7f
-
Commit b6d994d
-
[mlir][nvgpu] Add a nvgpu.rewrite_copy_as_tma transform operation.
This revision adds support for direct lowering of a linalg.copy on buffers between global and shared memory to a tma async load + synchronization operations. This uses the recently introduced Hopper NVVM and NVGPU abstraction to connect things end to end. Differential Revision: https://reviews.llvm.org/D157087
Commit a3cd2ee
-
[mlir][nvgpu] Fix -Wunused-variable in NVGPUTransformOps.cpp (NFC)
/data/llvm-project/mlir/lib/Dialect/NVGPU/TransformOps/NVGPUTransformOps.cpp:969:16: error: unused variable 'inMemRefType' [-Werror,-Wunused-variable]
    MemRefType inMemRefType = inMemRef.getType();
               ^
1 error generated.
Commit 28d8c0d
-
[NFC][AArch64] Added checks for global entries in ReplaceWithVeclib t…
…esting This patch added checks for global entries in ReplaceWithVeclib testing using ArmPL and SLEEF vector libraries. Differential Revision: https://reviews.llvm.org/D157258
Commit 9329723
-
[mlir][NVGPU] Support N-D masks in transform.nvgpu.create_async_groups
Support IR that is generated by the vector-to-scf lowering of N-D vector transfers with a mask. (Until now only 1-D and 2-D transfers were supported.) Only transfers that were fully unrolled are supported. Differential Revision: https://reviews.llvm.org/D157286
Commit 15ea230
-
[libc++] Deflake the Clang Modules CI job
This re-introduces the workaround that had been introduced in d7ca140 and then removed in 0c0628c, since it seems like it is needed after all. Differential Revision: https://reviews.llvm.org/D157319
Commit d2a61db
-
[libc++] Remove variables that are not necessary anymore inside heade…
…r_information.py Those are not relevant anymore since we don't have tests for private headers anymore. Differential Revision: https://reviews.llvm.org/D155880
Commit 5e67348
-
[LegalizeTypes][RISCV] Support libcalls for fpto{s,u}i of bfloat by e…
…xtending to f32 first As there is no direct bf16 libcall for these conversions, extend to f32 first. This patch includes a tiny refactoring to pull out equivalent logic in ExpandIntRes_XROUND_XRINT so it can be reused in ExpandIntRes_FP_TO_{S,U}INT. This patch also demonstrates incorrect codegen for RV32 without zfbfmin for the newly enabled tests. As it doesn't introduce that incorrect codegen (caused by the assumption that 'TypeSoftPromoteHalf' is only used for f16 types), a fix will be added in a follow-up (D157287). Differential Revision: https://reviews.llvm.org/D156990
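The "extend to f32 first" step works because a bfloat16 value is the upper 16 bits of an IEEE-754 binary32 value, so the extension is exact. A minimal sketch, assuming the bf16 value is held as raw bits in a `uint16_t`; the helper names are hypothetical and this is not the type legalizer's code:

```cpp
#include <cstdint>
#include <cstring>

// Widen bf16 (raw bits) to f32 by placing the bits in the upper half
// of a 32-bit word; the low 16 mantissa bits become zero.
inline float bf16ToF32(uint16_t Bits) {
  const uint32_t Wide = static_cast<uint32_t>(Bits) << 16;
  float F;
  std::memcpy(&F, &Wide, sizeof(F));
  return F;
}

// fptosi-style conversion: extend to f32, then convert (truncating toward
// zero). The caller must ensure the value is in range for int32_t.
inline int32_t bf16ToI32(uint16_t Bits) {
  return static_cast<int32_t>(bf16ToF32(Bits));
}

// fptoui-style conversion with the same in-range requirement.
inline uint32_t bf16ToU32(uint16_t Bits) {
  return static_cast<uint32_t>(bf16ToF32(Bits));
}
```

In the patch the f32-to-integer step is a libcall; the plain casts above only stand in for it.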
Commit f7dbc85
-
[mlir][Target][LLVM] Adds an utility class for serializing operations…
… to binary strings. **For an explanation of these patches see D154153.** Commit message: This patch adds the utility base class `ModuleToObject`. This class provides an interface for compiling module operations into binary strings; by default it serializes modules to LLVM bitcode. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D154100
Commit c8e0364
-
[mlir][gpu] Add GPU target attribute interface.
**For an explanation of these patches see D154153.** Commit message: This patch adds the `GPUTargetAttrInterface` attribute interface, this interface is meant to be used as an opaque interface for serializing GPU modules into binary strings. Reviewed By: mehdi_amini, krzysz00 Differential Revision: https://reviews.llvm.org/D154104
Commit 86c4dfa
-
[DAG] Fix crash in replaceStoreOfInsertLoad
Idx's type can be different from Ptr's, causing a "Binary operator types must match" assertion failure when emitting the MUL. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156972
Commit 98ccc70
-
[AMDGPU] Add extended-image-insts to RemoveIncompatibleFunctions
Otherwise device libs still have issues at O0 (in OpenCL-CTS). Depends on D156972 as well. They're unrelated fixes but both are needed to fix the issue. Fixes SWDEV-402331 Reviewed By: #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D156973
Commit 96e1032
-
[mlir][gpu] Add target attribute to GPU modules.
**For an explanation of these patches see D154153.** Commit message: Adds support for Target attributes in GPU modules. This change enables attaching an optional non empty array of GPU target attributes to the module. Depends on D154104 Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D154113
Commit 9fa7b9e
-
Revert "[mlir][Target][LLVM] Adds an utility class for serializing op…
…erations to binary strings." This reverts commit c8e0364.
Commit bc9a375
-
[CUDA][HIP] Fix overloading resolution of delete operator
Currently clang does not consider host/device preference when resolving the delete operator at file scope, which causes the device operator delete to be selected for class member initialization. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D156795
Commit 247cc26
-
[libc] Allow NVPTX to use aliases
Summary: Following D156014 we can now use aliases for NVPTX, removing this source of divergence. We require at least +ptx63 and at least sm_30 for `.alias` but this is already within what we build for with `libc` support. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D157323
Commit e74281a
-
[clang-tidy][include-cleaner] Add option to control deduplication of …
…findings per symbol We received some user feedback that this is disruptive rather than useful in certain workflows, so add an option to control the output behaviour. Differential Revision: https://reviews.llvm.org/D157390
Commit 89d0a76
-
Commit 724b40a
-
Commit 8e7f032
-
Revert "[Pipelines] Perform hoisting prior to GVN"
This reverts commit 1f37088 as it causes a large regression in x264, and some other regressions in downstream embedded benchmarks under LTO.
Commit 05b4310
-
Merge from 'sycl' to 'sycl-web'
iclsrc committed Aug 8, 2023
Commit b87a906
-
Merge from 'main' to 'sycl-web' (178 commits)
CONFLICT (content): Merge conflict in llvm/lib/Passes/PassBuilderPipelines.cpp
Commit 8f54d28
-
Commit 76c624e
-
Commit 99b77f3
Commits on Aug 10, 2023
-
Merge from 'sycl' to 'sycl-web'
iclsrc committed Aug 10, 2023
Commit 556954a
-
Commit d6c09d3
Commits on Aug 11, 2023
-
Fix @llvm.annotation translation with opaque pointers enabled. (#2035)
Original commit: KhronosGroup/SPIRV-LLVM-Translator@f751ba1
Commit ad56aea
-
optimizing away pair type object copy in spv writer
Original commit: KhronosGroup/SPIRV-LLVM-Translator@3448740
Commit 3dd5ef7
-
Implement support for SPV_KHR_shader_clock (#2026)
Link to the spec: https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/KHR/SPV_KHR_shader_clock.asciidoc Original commit: KhronosGroup/SPIRV-LLVM-Translator@e4dfa92
Commit a11d65d
-
Fix translation of SPV_INTEL_debug_module extension
This applies when NonSemantic.Shader.100 debug info is enabled. The related test cases are re-enabled. Original commit: KhronosGroup/SPIRV-LLVM-Translator@574b0c6
Commit 4dd0b0a
-
Commit 2989ed7
-
Cache CU value translated outside of usual flow (#2072)
Compilation unit can be translated earlier, e.g. in `transEntryPoint()`. We should save the translated LLVM value to be used further. Original commit: KhronosGroup/SPIRV-LLVM-Translator@803e528
Commit 3a95a40
-
Switch the default of reverse translation to emitting opaque pointers. (#2074)
Most of the changes are adding -emit-opaque-pointers=0 lines to test code. The code generally works in the forward translation at this point, although there is still substantial work that needs to be done to finish porting the tests. Original commit: KhronosGroup/SPIRV-LLVM-Translator@10b1354
Commit beaaadd
-
.clang-tidy: disable misc-include-cleaner (#2077)
This check was recently added to clang-tidy, but the code base doesn't quite comply to it, so disable it for now. Original commit: KhronosGroup/SPIRV-LLVM-Translator@2cea844
Commit cf9e9d5
-
Fix DebugTypeSubrange parameters order (#2076)
Count should be the 3rd parameter. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@fca2a3a
Commit 9263e90
-
Emit target extension types for SPIR-V friendly IR.
The ABI name mangling of target extension types is tweaked to be like a pointer-to-a-struct, which maintains somewhat better compatibility with the typed pointer representation for name mangling. Original commit: KhronosGroup/SPIRV-LLVM-Translator@15e0aa9
Commit 500163c
-
Use -emit-opaque-pointers for SPIR-V-friendly IR.
There are a couple of tests that use some magic to reuse the checks between opencl-flavored IR and SPIR-V-friendly IR; these tests are not yet ported to make diffs smaller. Original commit: KhronosGroup/SPIRV-LLVM-Translator@60ac579
Commit 7803e37
-
Convert tests to use -emit-opaque-pointers (#2084)
This fixes most of the tests to use -emit-opaque-pointers. There are a few tests not yet converted for various reasons:
- Several tests only have check lines for opaque struct declarations. These are not emitted in opaque pointer mode, so the tests need deeper rewrites.
- A few tests check both typed and opaque pointer output.
- A few tests only make sense with typed pointer output (ForwardPtr.ll and RecursiveType.ll in particular).
- One test has a crash in reverse translation.
Outside of these cases, all tests should now be using opaque pointer for testing reverse translation. Original commit: KhronosGroup/SPIRV-LLVM-Translator@0dc80be
Commit 63c1b91
-
Correctly use cached DICompilationUnit (#2080)
Followup for KhronosGroup/SPIRV-LLVM-Translator#2072 Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@8dcd16c
Commit 90f3f9c
-
Remove tests that only make sense under typed pointers.
Original commit: KhronosGroup/SPIRV-LLVM-Translator@2bd25cf
Commit 1cb6b2d
-
Convert many tests to use opaque pointers.
Original commit: KhronosGroup/SPIRV-LLVM-Translator@cbac39f
Commit 5a73df6
-
Handle ndrange parameter not being a GEP.
Original commit: KhronosGroup/SPIRV-LLVM-Translator@5d400f8
Commit b1efb10
-
XFAIL annotation_dbg_info_drop.ll (#2083)
It should be completely rewritten. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@0e3404e
Commit 28359d3
-
Remove 3 calls of getNonOpaquePointerElementType (#2089)
getNonOpaquePointerElementType is deprecated. This patch removes 3 calls to it where the removal requires no functional changes. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@3b9f9a6
Commit 14e6e5c
-
Convert tests to use opaque pointers.
Most of the changes in these tests are adjusting the way test checks work to match the output in opaque pointer mode (in particular, opaque struct names aren't getting generated anymore). In the case of cl_types.ll and spirv_types.ll, the tests are deleted entirely because they would require much more invasive changes to keep checking the same thing (since the opaque struct names aren't generally available, and all of the types would have to be used more directly in an intrinsic name to be available for translation). Original commit: KhronosGroup/SPIRV-LLVM-Translator@b6207b8
Commit c186fe2
-
Fix the AVC motion estimation tests.
These tests uncovered a bug: when the types were exposed as target extension types, they weren't properly adding the necessary capabilities. Original commit: KhronosGroup/SPIRV-LLVM-Translator@952b523
Commit 85a5643
-
Fix translation of llvm.index.group in opaque pointers.
Original commit: KhronosGroup/SPIRV-LLVM-Translator@224fed8
Commit 7e1b961
-
Fix translation of PipeStorage
As far as I can tell, the original translation creating two OpTypePipeStorage instructions was actually a bug; they should always have been translated using a single instruction. The SPIR-V representation of PipeStorage as a target extension type has been changed to a pointer to a struct named spirv.PipeStorage instead. At the current time, target extension types don't support constant initializers, which would be necessary for these types. Instead, they're represented as structs with integers, with necessary bitcasting to these types for use in initializer. There is still a latent bug that OpTypePipeStorage is getting bitcast instead. Original commit: KhronosGroup/SPIRV-LLVM-Translator@e661cb7
Commit eb1f0c0
-
Rewrite joint_matrix tests (#2088)
This patch adds joint_matrix reverse translation to target extension type and starts rewriting all of the tests. Some tests are being removed as outdated.
Remaining tests to add after the patch:
1. tf32 test
2. element wise operations test
Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@465eb3c
Commit e7e7beb
-
Remove last -opaque-pointers flag from test (#2092)
The -opaque-pointers flag is no longer supported. Original commit: KhronosGroup/SPIRV-LLVM-Translator@ff8c3e7
Commit 8ff51cd
-
Workaround unsupported freeze insn (#2087)
Work around the unsupported freeze instruction by:
* replacing uses of freeze's result with freeze's source, or with a random (but compilation-reproducible) constant if freeze's source is undef/poison;
* deleting the freeze instruction.
The long-term solution is to add a freeze instruction extension to SPIR-V. The issue is tracked in #1140. Signed-off-by: Lu, John <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@ed25856
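The replacement rule described above can be pictured with a toy model. This is only a sketch, not the translator's code: `Value`, `Kind`, and `lowerFreeze` are hypothetical names, the "IR" is a flat vector of values, and a fixed-seed `std::mt19937` is an assumed stand-in for whatever reproducible source of constants the patch actually uses.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Toy IR (hypothetical): a value is a constant, an undef/poison placeholder,
// or a freeze of another value.
enum class Kind { Constant, Undef, Freeze };

struct Value {
  Kind K;
  std::uint32_t Imm; // payload when K == Constant
  int Operand;       // index of the frozen value when K == Freeze, else -1
};

// Redirects every use of a freeze to the freeze's source, or to a
// pseudo-random constant when that source is undef/poison. Each freeze gets
// exactly one replacement, so all of its uses agree. The fixed seed keeps the
// chosen constants identical from one run to the next. Uses[i] is an index
// into Vals. Assumes a freeze's source is not itself a freeze.
// Returns the number of rewritten uses; afterwards the freezes are unused
// and could be deleted.
inline int lowerFreeze(std::vector<Value> &Vals, std::vector<int> &Uses) {
  std::mt19937 Rng(0); // fixed seed: reproducible "random" constants
  const int N = static_cast<int>(Vals.size());
  std::vector<int> Replacement(N, -1);
  for (int I = 0; I < N; ++I) {
    if (Vals[I].K != Kind::Freeze)
      continue;
    int Src = Vals[I].Operand;
    if (Vals[Src].K == Kind::Undef) {
      // Materialize a constant to stand in for the undef/poison source.
      Vals.push_back(
          Value{Kind::Constant, static_cast<std::uint32_t>(Rng()), -1});
      Src = static_cast<int>(Vals.size()) - 1;
    }
    Replacement[I] = Src;
  }
  int Rewritten = 0;
  for (int &Use : Uses) {
    if (Replacement[Use] != -1) {
      Use = Replacement[Use];
      ++Rewritten;
    }
  }
  return Rewritten;
}
```

Computing one replacement per freeze before touching any use is a deliberate choice in this sketch: it guarantees that two uses of the same freeze never see different constants.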
Commit 25048b6
-
Remove use of deprecated getWithSamePointeeType() (#2096)
Original commit: KhronosGroup/SPIRV-LLVM-Translator@28e2408
Commit aed2dd2
-
Fix checks with incorrect labels (#2093)
Signed-off-by: Lu, John <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@201b782
Commit ba03f12
-
Use llvm-toolchain-focal-17/ on the main branch (#2104)
This is a temporary change to re-enable CI until we change the LLVM version. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@93b6d57
Commit 68bb243
-
Replace use of isOpaquePointerTy with isPointerTy (#2091)
'bool llvm::Type::isOpaquePointerTy() const' has been removed. We need to use isPointerTy() instead. This PR handles all uses except one (lib/SPIRV/SPIRVWriter.cpp:279) which is handled in a different PR (#2089) Thanks Original commit: KhronosGroup/SPIRV-LLVM-Translator@c7a0b9b
Commit bd0ce29
-
Don't preserve debug metadata with --spirv-preserve-auxdata (#2102)
It's duplicating existing information in the SPIR-V and it has complex structure. Signed-off-by: Sarnie, Nick <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@d498f48
Commit 3516299
-
Fix AttrKind attributes with --spirv-preserve-auxdata (#2107)
For AttrKind attributes, we need to convert them to the AttrKind enum before checking if it exists or adding it, otherwise it gets added as a string. Signed-off-by: Sarnie, Nick <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@d24b9c6
Commit d2c280e
-
Fix parent scope index for ImportedEntity (#2095)
Should be more cautious with inline namespaces Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@668a97d
Commit 201285f
-
Remove deprecated functions: setOpaquePointer, isOpaqueOrPointeeTypeMatches (#2108)
Removes uses of:
* setOpaquePointer --> can be removed without any side effect.
* isOpaqueOrPointeeTypeMatches --> is always true.
Thanks Signed-off-by: Arvind Sudarsanam <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@42d7c59
Commit 0d9ff2c
-
Remove uses of supportsTypedPointers() (#2106)
Also fix 'unused variable' warning. Signed-off-by: Sidorov, Dmitry <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@b0142af
Commit 0337240
-
Adjust "Source Lang Literal" logic to support multiple CompileUnits (#2098)
This commit changes the "Source Lang Literal" flag from a simple scalar value to a vector of pairs: (compile unit, source language). Original commit: KhronosGroup/SPIRV-LLVM-Translator@eb051c7
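As a rough illustration of the shape change, the sketch below models the flag as a vector of (compile unit, source language) pairs with a lookup helper. The alias `SourceLangLiterals`, the function `findSourceLang`, and the use of plain unsigned integers for both fields are assumptions made for the example, not the translator's actual types.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical representation: one (compile unit id, source language) pair
// per compile unit, instead of a single module-wide scalar.
using SourceLangLiterals = std::vector<std::pair<unsigned, unsigned>>;

// Looks up the source language recorded for compile unit CU.
// Returns false (and leaves Lang untouched) when no entry exists.
inline bool findSourceLang(const SourceLangLiterals &Literals, unsigned CU,
                           unsigned &Lang) {
  for (const auto &Entry : Literals) {
    if (Entry.first == CU) {
      Lang = Entry.second;
      return true;
    }
  }
  return false;
}
```

With a scalar, every compile unit in the module would share one language value; keying by compile unit lets each unit carry its own.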
Commit 27ed358
-
Initial implementation of SPV_KHR_cooperative_matrix extension (#2099)
The intention is to replace the existing SPV_INTEL_joint_matrix extension with the Khronos one in the future. Spec: https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/KHR/SPV_KHR_cooperative_matrix.asciidoc Original commit: KhronosGroup/SPIRV-LLVM-Translator@e0c9de8
Commit 436a02c
-
Commit 6080f63
-
Revert "[CI] Use llvm-toolchain-focal-17/ on the main branch (#2104)"
This reverts commit 93b6d57759a12875810a215ef781ccb0e928f374. Original commit: KhronosGroup/SPIRV-LLVM-Translator@f80988f
Commit 89287a0
-
Fix delete of functions that becomes unused (#2109)
After the first loop that deletes instructions in ValuesToDelete, the deleted instructions in ValuesToDelete are in an unstable state. Then, in the second deletion loop, dyn_cast to GlobalValue could succeed for such an instruction, and the resulting double eraseFromParent causes a crash. The global values in ValuesToDelete are functions, and unused functions are deleted by eraseUselessFunctions anyway. Original commit: KhronosGroup/SPIRV-LLVM-Translator@aea1ac7
Commit 7a4574a
-
`TransOperand` is never called for StrideIdx (3), because the loop ends at MinOperandCount (3). Original commit: KhronosGroup/SPIRV-LLVM-Translator@dcd3052
Commit cd049ef
-
Use `auto *` for pointer types (#2115)
Change `auto` to `auto *` when the type is a pointer. This makes the code comply with the clang-tidy `llvm-qualified-auto` check. That check is already enabled, but the code base wasn't fully compliant yet. Original commit: KhronosGroup/SPIRV-LLVM-Translator@aa9226e
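For readers unfamiliar with the check, a minimal sketch of the kind of change involved follows. The functions `firstEven` and `firstEvenOrZero` are made up for illustration and do not come from the translator.

```cpp
#include <cassert>
#include <vector>

// Returns a pointer to the first even element, or nullptr if there is none.
inline const int *firstEven(const std::vector<int> &V) {
  for (const int &X : V)
    if (X % 2 == 0)
      return &X;
  return nullptr;
}

inline int firstEvenOrZero(const std::vector<int> &V) {
  // Flagged style: plain auto hides the fact that P is a pointer.
  //   auto P = firstEven(V);
  // Preferred style: the `*` (and the const) are visible at the declaration.
  const auto *P = firstEven(V);
  return P ? *P : 0;
}
```

Both spellings deduce the same type; the change only affects how much the declaration tells the reader.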
Commit 1b29d1a
-
Add assert for kernel arg metadata not matching kernel signature (#2116)
There was a case where Intel's SYCL compiler was optimizing out a kernel argument but not updating the metadata. This meant the metadata had more operands than the number of kernel arguments. We use the number of metadata operands to iterate, so we did an out of bounds access. We can't fix this by instead using the function arguments to iterate because we still don't know which argument was removed, so just assert the metadata is valid. Signed-off-by: Sarnie, Nick <[email protected]> Original commit: KhronosGroup/SPIRV-LLVM-Translator@4f7efd3
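The failure mode and the guard can be sketched with a toy kernel description. Everything here is hypothetical: `Kernel`, `metadataMatchesSignature`, and `countPointerArgs` are illustrative names, and the metadata is modeled as one type-name string per argument.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a kernel: its argument count plus one metadata
// operand (here, a type name) per argument.
struct Kernel {
  std::size_t NumArgs;
  std::vector<std::string> ArgTypeMD;
};

// True when the metadata describes exactly the arguments in the signature.
inline bool metadataMatchesSignature(const Kernel &K) {
  return K.ArgTypeMD.size() == K.NumArgs;
}

// Iterates by metadata operand, as the commit message describes, but only
// after asserting that the operand count matches the signature.
inline std::size_t countPointerArgs(const Kernel &K) {
  assert(metadataMatchesSignature(K) &&
         "kernel arg metadata does not match kernel signature");
  std::size_t N = 0;
  for (const std::string &Ty : K.ArgTypeMD)
    if (!Ty.empty() && Ty.back() == '*')
      ++N;
  return N;
}
```

If an argument is optimized away without updating the metadata, `ArgTypeMD` ends up longer than `NumArgs`, and the assert fires instead of the loop indexing past the real arguments.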
Commit 02359a6
-
Don't wrap kernels that are not being called in the module (#2119)
* Don't wrap kernels that are not being called in the module
This patch is a result of a reflection about previously merged PR KhronosGroup/SPIRV-LLVM-Translator#1149 "add an entry point wrapper around functions (llvm pass)" and is inspired by various reported translator, clang (OpenCL) and Intel GPU driver issues (see KhronosGroup/SPIRV-LLVM-Translator#2029 for reference).
While the SPIR-V spec states:
===
*OpName* --//--. This has no semantic impact and can safely be removed from a module.
===
having an EntryPoint function and a function that shares its name via OpName might be confusing to both (old) drivers and programmers who read the SPIR-V file. This patch prevents generation of the wrapper function when it isn't necessary, i.e. when a kernel function is not called by another kernel.
We can do better in other cases as well; for example, I have experimented with renaming the wrapped function by adding a prefix, so that it can be distinguished from the actual kernel/entry point, but for now that doesn't pass validation for E2E OpenCL tests.
Signed-off-by: Sidorov, Dmitry <[email protected]>
* prevent a copy
Signed-off-by: Sidorov, Dmitry <[email protected]>
Original commit: KhronosGroup/SPIRV-LLVM-Translator@46285e4
Commit 30e6487
-
Commit 99cb3db
-
Commit 6effc7f
-
Fix tests after D153092 (#14944)
Restore some bitcasts to preserve opaque pointers support, and fix SYCL tests by emitting the generic address space instead of global, as intended by intel/llvm changes.
Commit 4132ba4
-
Test access and store operations for cooperative matrix (#2117)
For now, the reverse translation is not resolved properly, so we test only forward translation here. Original commit: KhronosGroup/SPIRV-LLVM-Translator@1677289
Commit ce63b84
-
Remove the -emit-opaque-pointers flag (#2121)
We no longer need this flag as only opaque pointers are supported now. Original commit: KhronosGroup/SPIRV-LLVM-Translator@3eeb3bf
Commit 7368c3e
-
Add SPV_EXT_image_raw10_raw12 reader support (#2113)
Add basic support for the SPV_EXT_image_raw10_raw12 extension [1] such that SPIR-V modules using the extension can be consumed. [1] https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/EXT/SPV_EXT_image_raw10_raw12.asciidoc Original commit: KhronosGroup/SPIRV-LLVM-Translator@bb2196b
Commit 1362158
-
Commit 4ba783b
-
Commit 6414c5f
-
Commit ca05ca9
-
Commit c02d2ff
-
Commit 2e69a40
-
Commit 32bb25b
-
Revert "Revert "[lld][Arm] Big Endian - Byte invariant support.""
This reverts commit d885138. Reason: Applied the fix for the Asan buildbot failures.
Commit d5d6d29
Commits on Aug 12, 2023
-
Commit 685edda
Commits on Aug 13, 2023
-
Guard 4 typed pointer removal commits within INTEL_SYCL_OPAQUEPOINTER_READY
Revert "[llvm] Drop some typed pointer handling/bitcasts"
This reverts commit 4ce7c4a.
Conflicts:
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
llvm/lib/Transforms/Coroutines/CoroSplit.cpp
llvm/lib/Transforms/Scalar/LICM.cpp
llvm/lib/Transforms/Scalar/SROA.cpp
Revert "[SCEVExpander] Remove GEP add rec splitting code (NFCI)"
This reverts commit b752542.
Revert "[Transforms] Remove FactorOutConstant to fix -Wunneeded-internal-declaration (NFC)"
This reverts commit 67f1e8d.
Revert "[SCEVExpander] Remove typed pointer support (NFC)"
This reverts commit 02ba405.
Commit ba3339a
-
Commit aef58ce
Commits on Aug 14, 2023
-
Revert "AMDGPU: Move placement of RemoveIncompatibleFunctions"
This reverts commit 5b5bd81.
Commit dce52b4
-
Commit a26b643