[GEN] Update GENX branch to LLVM 765206e
#14025
Merged
… (#85837) Introduced in #78041, originally reported as #79957 and fixed partially in #80050. `OpaqueValueExpr` used with `TemplateArgument::StructuralValue` has no corresponding source expression. A test case with subobject-referring NTTP added.
Use UTC. Add test coverage for AIX.
Even as the NPM has been in use by Polly for a while now, the majority of the tests continue using the LPM passes. This patch ports the tests to use the NPM passes (for example, by replacing a flag such as -polly-detect with -passes=polly-detect following the NPM syntax for specifying passes) with some exceptions for some missing features in the new passes. Relanding #90632.
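The flag rewrite described above can be sketched as a small helper that folds legacy pass flags in an `opt` command line into a single `-passes=` argument. This is only an illustration: the helper name `port_to_npm` and the set of legacy flags passed to it are assumptions, not part of the patch, and real Polly tests may need nested pipelines or other adjustments that this sketch ignores.

```python
def port_to_npm(args, legacy_pass_flags):
    """Fold legacy pass flags into one -passes= argument (hypothetical helper).

    args: command-line tokens, e.g. from a RUN line.
    legacy_pass_flags: set of legacy flags to convert, e.g. {"-polly-detect"}.
    """
    # Legacy flag "-polly-detect" becomes pass name "polly-detect".
    passes = [a[1:] for a in args if a in legacy_pass_flags]
    if not passes:
        return list(args)

    result = []
    emitted = False
    for a in args:
        if a in legacy_pass_flags:
            # Emit the combined pipeline once, where the first legacy flag stood.
            if not emitted:
                result.append("-passes=" + ",".join(passes))
                emitted = True
        else:
            result.append(a)
    return result


if __name__ == "__main__":
    print(port_to_npm(["opt", "-polly-detect", "-disable-output"],
                      {"-polly-detect"}))
```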
Remove this test since it is marked as XFAIL and has some non-deterministic behaviour which causes it to spuriously pass on out-of-tree builds. Capturing this in llvm/llvm-project#93342 to make a proper fix and a test later.
We compute BF hashes in `YAMLProfileReader::readProfile` when first matching profile functions with binary functions, and a second time in `YAMLProfileReader::parseFunctionProfile` during the profile assignment (we need to do that to account for LTO private functions with mismatching suffixes). Avoid recomputing the hash if it has already been set.
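The "compute once, reuse afterwards" idea can be sketched as follows. This is a minimal model, not BOLT's code: the `BinaryFunction` class here, its `compute_hash` method, and the use of CRC32 as the hash are all assumptions made for illustration.

```python
import zlib


class BinaryFunction:
    """Hypothetical stand-in for a binary function with a cached hash."""

    def __init__(self, name, body):
        self.name = name
        self.body = body              # bytes used as hash input (assumption)
        self._hash = None             # None means "not computed yet"
        self.hash_computations = 0    # counts real computations, for demonstration

    def compute_hash(self):
        # Only do the work if the hash has not been set already.
        if self._hash is None:
            self.hash_computations += 1
            self._hash = zlib.crc32(self.body)
        return self._hash


if __name__ == "__main__":
    bf = BinaryFunction("foo", b"\x55\x48\x89\xe5\xc3")
    bf.compute_hash()   # first pass: matching profile functions
    bf.compute_hash()   # second pass: profile assignment, reuses the cached value
    print(bf.hash_computations)
```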
These are useful for finer-grain debugging and complement the already exposed global debug flag.
…(#89950) Previously, since response (.rsp) files weren't expanded at the very beginning of clang-scan-deps, we only parsed the command line as provided in the Clang .cdb file. Unfortunately, when using Unreal Engine, arguments are always generated in a .rsp file (i.e. `/path/to/clang-cl.exe @/path/to/filename_args.rsp`). After this patch, `/Fo` can be parsed and added to the final command line. Without this option, the make targets that are emitted are derived from the input file name alone. We have some cases where the same input in the project generates several output files, so we end up with duplicate make targets in the dependency file emitted by scan-deps.
llvm/llvm-project#93106 introduced some necessary fixes to module file generation, but has also caused a regression. The module file output can include bogus attempts to USE-associate symbols local to derived type scopes, like components and bindings. Fix, and extend a test.
We only know it expands to a 2 instruction sequence, not necessarily a sign extended sequence. Happened to notice while I was looking at naming for the proposed rematerializable LUI+ADDI for addresses.
Move out common X86MemOperand checks into helper lambdas. To be reused in #91667. Test Plan: NFC
As we have debuginfod as symbol locator available in lldb now, we want to make full use of it. In case of post-mortem debugging, we don't always have the main executable available. However, the .note.gnu.build-id of the main executable (and of some other modules too) should be available in the core file, as those binaries are loaded in memory and dumped in the core file. We try to iterate through the NT_FILE entries, and read and store the GNU build ID if possible. This will be very useful, as this ID is the unique key needed for querying the debuginfod server. Test: Build and run lldb. Set a breakpoint at https://github.com/llvm/llvm-project/blob/main/lldb/source/Plugins/SymbolLocator/Debuginfod/SymbolLocatorDebuginfod.cpp#L147 and verify that, after this commit, module_uuid is the correct GNU build ID of the main executable that caused the crash (the first in the NT_FILE entry). Previous PR: llvm/llvm-project#92078 was mistakenly merged. This PR is re-opening the commit.
Simplify mutually exclusive sanity checks in analyzeIndirectBranch, where an UNKNOWN IndirectBranchType is to be returned. Reduces confusion and code duplication when adding a new IndirectBranchType (to be added in #91667). Test Plan: NFC
Inline traits used by `arith.select` only into `ArithOps.td`. Trim trailing whitespace in op description.
Move out common code extracting the address of a MCExpr. To be reused in #91667. Test Plan: NFC
Enables writing patterns where one has op creation with variadic in result pattern more easily. Signed-off-by: Jacques Pienaar <[email protected]>
…onment is MSVC (#91689) From looking at the rest of code and from my own understanding, the driver mode is supposed to be independent of MSVC compatibility when the target triple is `*-windows-msvc`. Therefore strict aliasing should be disabled by default when the target triple is `*-windows-msvc` so code assuming MSVC behaves as expected when compiled with Clang.
The transform updates all users of inductions to work based on EVL, instead of the VF directly. At the moment, widened inductions cannot be updated, so bail out if the plan contains any. This patch introduces a check before applying EVL transform. If any recipes in loop rely on RuntimeVF, the plan is discarded.
…fault with that case (#76669)"

When the default branch is the last case, we can transform that branch into a concrete branch with an unreachable default branch.

```llvm
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i64 @src(i64 %0) {
  %2 = urem i64 %0, 4
  switch i64 %2, label %5 [
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}

define i64 @tgt(i64 %0) {
  %2 = urem i64 %0, 4
  switch i64 %2, label %unreachable [
    i64 0, label %5
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

unreachable:                                      ; preds = %1
  unreachable

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}
```

Alive2: https://alive2.llvm.org/ce/z/Y-PGXv

After transform to a lookup table, I believe `tgt` is better code. The final instructions are as follows:

```asm
src:                                    # @src
        and     edi, 3
        lea     rax, [rdi - 1]
        cmp     rax, 2
        ja      .LBB0_1
        mov     rax, qword ptr [8*rdi + .Lswitch.table.src-8]
        ret
.LBB0_1:
        xor     eax, eax
        ret
tgt:                                    # @tgt
        and     edi, 3
        mov     rax, qword ptr [8*rdi + .Lswitch.table.tgt]
        ret
.Lswitch.table.src:
        .quad   1                               # 0x1
        .quad   1                               # 0x1
        .quad   2                               # 0x2
.Lswitch.table.tgt:
        .quad   0                               # 0x0
        .quad   1                               # 0x1
        .quad   1                               # 0x1
        .quad   2                               # 0x2
```

Godbolt: https://llvm.godbolt.org/z/borME8znd

Closes #73446.

(cherry picked from commit 7d81e07)
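To see why the two lowered forms agree, the lookup-table code for `src` and `tgt` can be modeled in a few lines of Python. This is a hand-written model of the assembly in the commit message, not output from any tool; the function and table names simply mirror the labels used there.

```python
MASK64 = (1 << 64) - 1

SWITCH_TABLE_SRC = [1, 1, 2]       # covers cases 1..3; case 0 takes the default path
SWITCH_TABLE_TGT = [0, 1, 1, 2]    # covers cases 0..3; no default path needed


def src(n):
    x = n & 3                      # and edi, 3  (same as urem by 4)
    index = (x - 1) & MASK64       # lea rax, [rdi - 1], with 64-bit wraparound
    if index > 2:                  # cmp rax, 2 / ja .LBB0_1 (unsigned compare)
        return 0                   # default result
    return SWITCH_TABLE_SRC[index]


def tgt(n):
    # The default is unreachable, so the range check disappears entirely.
    return SWITCH_TABLE_TGT[n & 3]


if __name__ == "__main__":
    assert all(src(n) == tgt(n) for n in range(1 << 12))
    print([tgt(n) for n in range(4)])
```

The model makes the saving visible: `tgt` is a single table load, while `src` needs a subtract, a compare, and a branch before it can index the table.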
…the Condition is Too Wide (#77831)" llvm/llvm-project#76669 taught SimplifyCFG to handle switches when `default` has only one case. When the `switch`'s condition is wider than 64 bit, the current implementation can calculate the wrong default value. This PR skips cases where the condition is too wide. (cherry picked from commit 39bb790)
This commit adds the `/Zc:__STDC__` argument from MSVC, which defines `__STDC__`. This means, alongside stronger feature parity with MSVC, that things that rely on `__STDC__`, such as autoconf, can work. Link to MSVC documentation of this flag: https://learn.microsoft.com/en-us/cpp/build/reference/zc-stdc?view=msvc-170
This reduces the effort of adding MVT strings every time.
…NFC. (#93363)
* allow configuration for the target specific compiler flags.
* allow lld linker for all linker outputs: shared, module and exe.
* allow configuration of libc++ ABI version.
* set MSVC runtime library to MultiThreadedDLL/MultiThreadedDebugDLL on Windows build hosts.
* install UCRT libraries on Windows build hosts
The Range argument is not used by createWidenInductionRecipe; induction classification applies across the whole range of VFs. Remove the argument.
Update the folder titles for targets in the monorepository that have not been taken care of for some time. These are the folders in which targets are organized in Visual Studio and XCode (`set_property(TARGET <target> PROPERTY FOLDER "<title>")`) when using the respective CMake IDE generator.
* Ensure that every target is in a folder
* Use a folder hierarchy with each LLVM subproject as a top-level folder
* Use consistent folder names between subprojects
* When using target-creating functions from AddLLVM.cmake, automatically deduce the folder. This reduces the number of `set_property`/`set_target_property` calls, but they are still necessary when `add_custom_target`, `add_executable`, `add_library`, etc. are used. A LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's root CMakeLists.txt.
…P arithmetic. (#92799) This adds VPSExtPromotedInteger and VPZExtPromotedInteger and uses them to promote many arithmetic operations. VPSExtPromotedInteger uses a shift pair because we don't have VP_SIGN_EXTEND_INREG yet.
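The "shift pair" used for sign extension can be illustrated on plain integers. This is a sketch of the general idiom (shift left, then arithmetic shift right), modeled with fixed-width arithmetic in Python; the helper name `sext_inreg` and its parameters are assumptions for illustration and do not correspond to functions in the patch.

```python
def sext_inreg(value, from_bits, width):
    """Sign-extend the low `from_bits` bits of `value` inside a `width`-bit register."""
    mask = (1 << width) - 1
    shift = width - from_bits

    # Shift left: move the narrow type's sign bit into the register's top bit.
    shifted = (value << shift) & mask

    # Arithmetic shift right: reinterpret as signed so the sign bit is replicated,
    # then wrap the result back into `width` bits.
    if shifted >> (width - 1):
        shifted -= 1 << width
    return (shifted >> shift) & mask


if __name__ == "__main__":
    print(hex(sext_inreg(0x80, 8, 32)))
```

Any garbage in the upper bits of the promoted value is discarded by the left shift, which is why the pair works regardless of how the value was promoted.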
…erPC) (#93117) The original pull request (llvm/llvm-project#92838) was reverted due to a PowerPC buildbot breakage (llvm/llvm-project@df626dd). This reland limits the scope of the change to non-PowerPC platforms. I am unaware of any PowerPC use cases that would benefit from a larger kNumStackOriginDescrs constant. Original CL description: This increases the constant size of kNumStackOriginDescrs to 4M (64GB of BSS across two arrays), which ought to be enough for anybody. This is the easier alternative suggested by eugenis@ in llvm/llvm-project#92826.
This is an experiment to see if we can prevent some of the compiler OOMs happening without unduly impacting the Windows build latency.
- Reduce disk I/O usage by adding a cache to a realpath call introduced by #81985
Swapped code blocks of parameter and variable, which have been confused (in a clang-tidy doc file)
I think test files for the legacy and the new EH (exnref) are better kept separate, and I'd like to use the current test file names for the new EH, rather than keeping the current files and naming the new ones as `-new` or something.
This patch adds Version 3 for development purposes. For now, this patch adds V3 as a copy of V2. For the most part, this patch adds "case Version3:" wherever "case Version2:" appears. One exception is writeMemProfV3, which is copied from writeMemProfV2 but updated to write out memprof::Version3 to the MemProf header. We'll incrementally modify writeMemProfV3 in subsequent patches.
I discovered while working on something else that we were using the location of the directive name as the 'beginloc' which caused some problems in a few places. This patch makes it so our beginloc is the '#' as we originally designed, and then adds a DirectiveLoc concept to a construct for use diagnosing the name.
…n with a sample profile (#93286) Currently, if a callsite is hot as determined by the sample profile, it is unconditionally inlined, barring invalid cases (such as recursion). The inline cost check should still apply, because a function's hotness and its inline cost are two different things. For example, if a function calls another very large function multiple times (on different code paths), the large function should not be inlined even if it's hot.
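The change in policy can be sketched as two predicates over a call site. Everything here is hypothetical: the `CallSite` record, the function names, and the threshold value are assumptions chosen for illustration and are not taken from the inliner's actual implementation.

```python
from dataclasses import dataclass

# Hypothetical cost budget for hot call sites; the real value is not given in the source.
HOT_CALLSITE_THRESHOLD = 3000


@dataclass
class CallSite:
    callee: str
    is_hot: bool
    is_recursive: bool
    inline_cost: int


def should_inline_before(site):
    # Old behavior: hotness alone decides, barring invalid cases such as recursion.
    return site.is_hot and not site.is_recursive


def should_inline_after(site, threshold=HOT_CALLSITE_THRESHOLD):
    # New behavior: a hot call site must also pass the inline cost check.
    return (site.is_hot
            and not site.is_recursive
            and site.inline_cost <= threshold)


if __name__ == "__main__":
    huge = CallSite("very_large_function", is_hot=True,
                    is_recursive=False, inline_cost=50000)
    print(should_inline_before(huge), should_inline_after(huge))
```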
…st JIT support (#84758)
Avoids regression in future commit which starts producing illegal instances.
…st check host JIT support (#84758) fea7399 had removed the unused function that was still there when I tested.
Add extra tests for llvm/llvm-project#93498.
For some reason I was using writeStmtRef when I meant writeStmt, so this corrects that.
…#93574) I plan to add other combines on TRUNCATE_VECTOR_VL.
This patch adds bind c names to functions and subroutines in cudadevice so they can be lowered and not hit the intrinsic procedure TODOs.
…obal addresses. (#93352) This allows register allocation to rematerialize these instead of spilling and reloading. We need to make it a single instruction due to limitations in rematerialization. This pseudo is expanded to an LUI+ADDI pair between regalloc and post-RA scheduling. This improves the dynamic instruction count on 531.deepsjeng_r from spec2017 by 3.2% for the train dataset. 500.perlbench and 502.gcc see a 1% improvement. There are a couple of regressions, but they are 0.1% or smaller. AArch64 has similar pseudo instructions, such as MOVaddr.
This patch adds hidden visibility to the variable that is used by the single byte counters mode in source-based code coverage.
Signed-off-by: Whitney Tsang <[email protected]>
sommerlukas
approved these changes
Jun 4, 2024
No description provided.