[GEN] Update GENX branch to LLVM 765206e #14025

Merged
merged 438 commits into from
Jun 4, 2024


whitneywhtsang
Contributor

No description provided.

bolshakov-a and others added 30 commits May 24, 2024 13:04
… (#85837)

Introduced in #78041, originally reported as #79957 and fixed partially
in #80050.

`OpaqueValueExpr` used with `TemplateArgument::StructuralValue` has no
corresponding source expression.

A test case with subobject-referring NTTP added.
Use UTC. Add test coverage for AIX.
Even though the NPM has been in use by Polly for a while now, the majority
of the tests continue to use the LPM passes. This patch ports the tests
to use the NPM passes (for example, by replacing a flag such as
-polly-detect with -passes=polly-detect following the NPM syntax for
specifying passes), with some exceptions for features still missing in the
new passes.
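As a rough illustration of the flag rewrite described above, here is a minimal Python sketch. `to_npm_flag` and `KNOWN_PASS_FLAGS` are hypothetical names, not part of the patch, and the table contains only the one pass named in this message.

```python
# Hypothetical table of legacy flags that name a pass. Only the flag
# mentioned in the commit message is listed; a real port would need the
# full list and must leave ordinary options untouched.
KNOWN_PASS_FLAGS = {"-polly-detect"}


def to_npm_flag(arg: str) -> str:
    """Rewrite a legacy pass flag to the NPM -passes=<name> spelling."""
    if arg in KNOWN_PASS_FLAGS:
        # Drop the leading '-' and pass the name through -passes=.
        return "-passes=" + arg[1:]
    return arg
```

The sketch rewrites a single flag only. Combining several passes into one `-passes=` list is outside its scope.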

Relanding #90632.
Remove this test since it is marked as XFAIL and has some
non-deterministic behaviour which causes it to spuriously pass on
out-of-tree builds.

Capturing this in llvm/llvm-project#93342 to
make a proper fix and a test later.
We compute BF hashes in `YAMLProfileReader::readProfile` when first
matching profile functions with binary functions, and a second time in
`YAMLProfileReader::parseFunctionProfile` during the profile assignment
(we need to do that to account for LTO private functions with
mismatching suffix).

Avoid recomputing the hash if it's been set.
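The idea of skipping the second computation can be sketched in Python as below. `BinaryFunctionModel` is a hypothetical stand-in, not the BOLT class, and `zlib.crc32` is only a placeholder for the real hash.

```python
import zlib
from typing import Optional


class BinaryFunctionModel:
    """Hypothetical stand-in for a function object that caches its hash."""

    def __init__(self, body: bytes) -> None:
        self.body = body
        self._hash: Optional[int] = None
        self.hash_computations = 0  # counts how often the hash is really computed

    def compute_hash(self) -> int:
        # Reuse the stored value if the hash has already been set.
        if self._hash is not None:
            return self._hash
        self.hash_computations += 1
        self._hash = zlib.crc32(self.body)  # placeholder hash function
        return self._hash
```

Calling `compute_hash()` twice on the same object does the work only once, which is the effect described above.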
These are useful for finer-grain debugging and complement the already
exposed global debug flag.
…(#89950)

Previously, since response (.rsp) files weren't expanded at the very
beginning of clang-scan-deps, we only parsed the command-line as
provided in the Clang .cdb file. Unfortunately, when using Unreal
Engine, arguments are always generated in a .rsp file (i.e.,
`/path/to/clang-cl.exe @/path/to/filename_args.rsp`).

After this patch, `/Fo` can be parsed and added to the final
command-line. Without this option, the make targets that are emitted are
made up from the input file name alone. We have some cases where the
same input in the project generates several output files, so we end up
with duplicate make targets in the scan-deps emitted dependency file.
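To illustrate why the expansion has to happen before the arguments are inspected, here is a minimal Python sketch. `expand_response_files` and `output_from_args` are hypothetical helpers. The tokenizer is `shlex.split` with POSIX quoting, which is an assumption and does not match Windows command-line rules, and nested response files are not handled.

```python
import shlex
from typing import Callable, List, Optional


def expand_response_files(argv: List[str],
                          read_text: Callable[[str], str]) -> List[str]:
    """Replace each @file argument with the arguments stored in that file."""
    expanded: List[str] = []
    for arg in argv:
        if arg.startswith("@"):
            expanded.extend(shlex.split(read_text(arg[1:])))
        else:
            expanded.append(arg)
    return expanded


def output_from_args(args: List[str]) -> Optional[str]:
    """Return the path given with /Fo, or None if the option is absent."""
    for arg in args:
        if arg.startswith("/Fo"):
            return arg[3:]
    return None
```

Without expansion, `output_from_args` sees only the `@...` argument and finds no `/Fo`, so two compilations of the same input cannot be told apart by their output path.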
llvm/llvm-project#93106 introduced some
necessary fixes to module file generation, but has also caused a
regression. The module file output can include bogus attempts to
USE-associate symbols local to derived type scopes, like components and
bindings. Fix, and extend a test.
We only know it expands to a 2 instruction sequence, not necessarily
a sign extended sequence.

Happened to notice while I was looking at naming for the proposed
rematerializable LUI+ADDI for addresses.
Move out common X86MemOperand checks into helper lambdas. To be reused
in #91667.

Test Plan: NFC
As we have debuginfod as symbol locator available in lldb now, we want
to make full use of it.
In the case of post-mortem debugging, we don't always have the main
executable available.
However, the .note.gnu.build-id of the main executable (and of some other
modules) should be available in the core file, as those binaries
are loaded in memory and dumped in the core file.

We try to iterate through the NT_FILE entries, read and store the gnu
build id if possible. This will be very useful as this id is the unique
key which is needed for querying the debuginfod server.
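For readers unfamiliar with the note format, the following Python sketch shows how a GNU build id can be pulled out of a buffer of ELF notes. It is not how lldb implements this: it assumes little-endian data and a buffer that already contains the raw note bytes, and `find_gnu_build_id` is a hypothetical name.

```python
import struct
from typing import Optional

NT_GNU_BUILD_ID = 3  # note type used for .note.gnu.build-id


def find_gnu_build_id(notes: bytes) -> Optional[bytes]:
    """Scan a buffer of ELF notes and return the GNU build id, if present."""
    off = 0
    while off + 12 <= len(notes):
        # Each note starts with three 32-bit words: namesz, descsz, type.
        namesz, descsz, ntype = struct.unpack_from("<III", notes, off)
        off += 12
        name = notes[off:off + namesz]
        off += (namesz + 3) & ~3  # name is padded to a 4-byte boundary
        desc = notes[off:off + descsz]
        off += (descsz + 3) & ~3  # descriptor is padded the same way
        if ntype == NT_GNU_BUILD_ID and name == b"GNU\x00":
            return desc
    return None
```

The returned bytes are the id that would be used as the lookup key.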

Test:
Build and run lldb with a breakpoint set at
https://github.com/llvm/llvm-project/blob/main/lldb/source/Plugins/SymbolLocator/Debuginfod/SymbolLocatorDebuginfod.cpp#L147
Verified that after this commit, module_uuid is the correct gnu build id of
the main executable which caused the crash (the first of the NT_FILE entries).

Previous PR: llvm/llvm-project#92078 was
mistakenly merged. This PR is re-opening the commit.
Simplify mutually exclusive sanity checks in analyzeIndirectBranch,
where an UNKNOWN IndirectBranchType is to be returned. Reduces confusion
and code duplication when adding a new IndirectBranchType (to be added
in #91667).

Test Plan: NFC
Inline traits used by `arith.select` only into `ArithOps.td`. Trim
trailing whitespace in op description.
Move out common code extracting the address of a MCExpr. To be reused in
#91667.

Test Plan: NFC
This makes it easier to write patterns in which the result pattern
creates an op with variadic operands.

Signed-off-by: Jacques Pienaar <[email protected]>
…onment is MSVC (#91689)

From looking at the rest of the code and from my own understanding, the
driver mode is supposed to be independent of MSVC compatibility when the
target triple is `*-windows-msvc`.
Therefore strict aliasing should be disabled by default when the target
triple is `*-windows-msvc` so code assuming MSVC behaves as expected
when compiled with Clang.
This reverts commit 098c6df.
This reverts commit 8c718a3.
This reverts commit 4fb02de.
The transform updates all users of inductions to work based on EVL
instead of the VF directly. At the moment, widened inductions cannot be
updated, so bail out if the plan contains any.
This patch introduces a check before applying the EVL transform. If any
recipes in the loop rely on RuntimeVF, the plan is discarded.
…fault with that case (#76669)"

When the default branch is the last case, we can transform that branch
into a concrete branch with an unreachable default branch.

```llvm
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

define i64 @src(i64 %0) {
  %2 = urem i64 %0, 4
  switch i64 %2, label %5 [
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}

define i64 @tgt(i64 %0) {
  %2 = urem i64 %0, 4
  switch i64 %2, label %unreachable [
    i64 0, label %5
    i64 1, label %3
    i64 2, label %3
    i64 3, label %4
  ]

unreachable:                              ; preds = %1
  unreachable

3:                                                ; preds = %1, %1
  br label %5

4:                                                ; preds = %1
  br label %5

5:                                                ; preds = %1, %4, %3
  %.0 = phi i64 [ 2, %4 ], [ 1, %3 ], [ 0, %1 ]
  ret i64 %.0
}
```

Alive2: https://alive2.llvm.org/ce/z/Y-PGXv

After transformation to a lookup table, I believe `tgt` is better code.

The final instructions are as follows:

```asm
src:                                    # @src
        and     edi, 3
        lea     rax, [rdi - 1]
        cmp     rax, 2
        ja      .LBB0_1
        mov     rax, qword ptr [8*rdi + .Lswitch.table.src-8]
        ret
.LBB0_1:
        xor     eax, eax
        ret
tgt:                                    # @tgt
        and     edi, 3
        mov     rax, qword ptr [8*rdi + .Lswitch.table.tgt]
        ret
.Lswitch.table.src:
        .quad   1                               # 0x1
        .quad   1                               # 0x1
        .quad   2                               # 0x2

.Lswitch.table.tgt:
        .quad   0                               # 0x0
        .quad   1                               # 0x1
        .quad   1                               # 0x1
        .quad   2                               # 0x2
```

Godbolt: https://llvm.godbolt.org/z/borME8znd
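
The two assembly sequences above can be modelled in Python to check that they agree. This is only a model of the listing: `src` and `tgt` mirror the labels, and the table contents are copied from `.Lswitch.table.src` and `.Lswitch.table.tgt`.

```python
MASK64 = (1 << 64) - 1

SWITCH_TABLE_SRC = [1, 1, 2]     # .Lswitch.table.src
SWITCH_TABLE_TGT = [0, 1, 1, 2]  # .Lswitch.table.tgt


def src(x: int) -> int:
    r = x & 3                    # and edi, 3
    idx = (r - 1) & MASK64       # lea rax, [rdi - 1], wraps for r == 0
    if idx > 2:                  # cmp rax, 2 ; ja .LBB0_1 (unsigned compare)
        return 0                 # xor eax, eax
    return SWITCH_TABLE_SRC[idx]


def tgt(x: int) -> int:
    # The default case became an explicit table entry, so no range check.
    return SWITCH_TABLE_TGT[x & 3]
```

Both functions return the same value for every input. `tgt` simply drops the bounds check and the extra branch.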

Closes #73446.

(cherry picked from commit 7d81e07)
…the Condition is Too Wide (#77831)"

llvm/llvm-project#76669 taught SimplifyCFG to
handle switches when `default` has only one case. When the `switch`'s
condition is wider than 64 bits, the current implementation can calculate
the wrong default value. This PR skips cases where the condition is too
wide.

(cherry picked from commit 39bb790)
This commit adds the /Zc:\_\_STDC\_\_ argument from MSVC, which defines
\_\_STDC\_\_.
This means, alongside stronger feature parity with MSVC, that things
that rely on \_\_STDC\_\_, such as autoconf, can work.
Link to MSVC documentation of this flag:
https://learn.microsoft.com/en-us/cpp/build/reference/zc-stdc?view=msvc-170
This reduces the effort of adding MVT strings every time.
…NFC. (#93363)

* allow configuration for the target specific compiler flags.
* allow lld linker for all linker outputs: shared, module and exe.
* allow configuration of libc++ ABI version.
* set MSVC runtime library to MultiThreadedDLL/MultiThreadedDebugDLL on
Windows build hosts.
* install UCRT libraries on Windows build hosts
The Range argument is not used by createWidenInductionRecipe; induction
classification applies across the whole range of VFs. Remove the
argument.
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders in which targets
are organized in Visual Studio and Xcode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake's IDE generator.

 * Ensure that every target is in a folder
 * Use a folder hierarchy with each LLVM subproject as a top-level folder
 * Use consistent folder names between subprojects
 * When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, but they are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. A
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
topperc and others added 21 commits May 28, 2024 12:49
…P arithmetic. (#92799)

This adds VPSExtPromotedInteger and VPZExtPromotedInteger and uses them
to promote many arithmetic operations.
    
VPSExtPromotedInteger uses a shift pair because we don't have
VP_SIGN_EXTEND_INREG yet.
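The shift pair mentioned above can be sketched on plain integers in Python. This is a scalar model only, with the hypothetical name `sext_in_reg`, and it says nothing about how the VP nodes are built.

```python
def sext_in_reg(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits of value within a to_bits-wide register."""
    mask = (1 << to_bits) - 1
    shift = to_bits - from_bits
    # Shift left so the narrow sign bit lands in the register's top bit.
    shifted = (value << shift) & mask
    # Reinterpret the register contents as a signed number.
    if shifted & (1 << (to_bits - 1)):
        shifted -= 1 << to_bits
    # Arithmetic shift right copies the sign bit back down.
    return (shifted >> shift) & mask
```

Any garbage in the upper bits of the promoted value is shifted out by the first step, so only the low `from_bits` determine the result.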
…erPC) (#93117)

The original pull request
(llvm/llvm-project#92838) was reverted due to a
PowerPC buildbot breakage
(llvm/llvm-project@df626dd).
This reland limits the scope of the change to non-PowerPC platforms. I
am unaware of any PowerPC use cases that would benefit from a larger
kNumStackOriginDescrs constant.

Original CL description: This increases the constant size of
kNumStackOriginDescrs to 4M (64GB of BSS across two arrays), which ought
to be enough for anybody.

This is the easier alternative suggested by eugenis@ in
llvm/llvm-project#92826.
This is an experiment to see if we can prevent some of the compiler OOMs
happening without unduly impacting the Windows build latency.
- Reduce disk IO usage by adding a cache to a realpath call introduced by
#81985
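The general technique of caching path resolution can be sketched in Python as follows. This is an illustration only and not the code from the patch: `cached_realpath` is a hypothetical name, and the cache is never invalidated, which is an assumption that holds only if the file system does not change during the run.

```python
import functools
import os


@functools.lru_cache(maxsize=None)
def cached_realpath(path: str) -> str:
    """Resolve a path once and remember the answer for later calls."""
    return os.path.realpath(path)
```

Repeated queries for the same path are answered from memory instead of going to disk.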
Swapped code blocks of parameter and variable, which have been confused
(in a clang-tidy doc file)
I think test files for the legacy and the new EH (exnref) are better kept
separate, and I'd like to use the current test file names for the new
EH, rather than keeping the current files and naming the new ones as
`-new` or something.
This patch adds Version 3 for development purposes.  For now, this
patch adds V3 as a copy of V2.

For the most part, this patch adds "case Version3:" wherever "case
Version2:" appears.  One exception is writeMemProfV3, which is copied
from writeMemProfV2 but updated to write out memprof::Version3 to the
MemProf header.  We'll incrementally modify writeMemProfV3 in
subsequent patches.
I discovered while working on something else that we were using the
location of the directive name as the 'beginloc' which caused some
problems in a few places.  This patch makes it so our beginloc is the
'#' as we originally designed, and then adds a DirectiveLoc concept to a
construct for use diagnosing the name.
…n with a sample profile (#93286)

Currently if a callsite is hot as determined by the sample profile, it
is unconditionally inlined barring invalid cases (such as recursion).
Inline cost check should still apply because a function's hotness and
its inline cost are two different things.
For example, if a function is calling another very large function
multiple times (on different code paths), the large function should not
be inlined even if it's hot.
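The decision described above can be sketched in Python. This is a simplified model, not the LLVM inliner: `CallSite`, `should_inline`, and both threshold parameters are hypothetical, and the numbers in the usage note are illustrative only.

```python
from dataclasses import dataclass


@dataclass
class CallSite:
    is_hot: bool        # hot according to the sample profile
    is_recursive: bool  # one of the "invalid" cases that is never inlined
    inline_cost: int    # estimated cost of inlining the callee here


def should_inline(cs: CallSite, hot_threshold: int, default_threshold: int) -> bool:
    """Hotness picks the threshold, but the cost check always applies."""
    if cs.is_recursive:
        return False
    threshold = hot_threshold if cs.is_hot else default_threshold
    return cs.inline_cost <= threshold
```

With `hot_threshold=3000`, a hot call site with `inline_cost=10000` is rejected, while a hot one with `inline_cost=100` is accepted.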
Avoids regression in future commit which starts producing
illegal instances.
…st check host JIT support (#84758)

fea7399 had removed the unused function that was still there when I tested.
For some reason I was using writeStmtRef when I meant writeStmt, so this
corrects that.
…#93574)

I plan to add other combines on TRUNCATE_VECTOR_VL.
This patch adds bind c names to functions and subroutines in cudadevice
so they can be lowered and not hit the intrinsic procedure TODOs.
…obal addresses. (#93352)

This allows register allocation to rematerialize these instead of
spilling and reloading. We need to make it a single instruction due to
limitations in rematerialization.

This pseudo is expanded to an LUI+ADDI pair between regalloc and post RA
scheduling.

This improves the dynamic instruction count on 531.deepsjeng_r from
spec2017 by 3.2% for the train dataset. 500.perlbench and 502.gcc see a
1% improvement. There are a couple of regressions, but they are 0.1% or
smaller.

AArch64 has similar pseudo instructions, such as MOVaddr.
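For context on the LUI+ADDI pair, the Python sketch below shows how a 32-bit address is split into a 20-bit upper part and a signed 12-bit lower part. It assumes 32-bit wraparound and ignores the sign extension that happens on RV64. `split_hi_lo` and `materialize` are hypothetical names.

```python
def split_hi_lo(addr: int):
    """Split a 32-bit address into the LUI immediate and the ADDI immediate."""
    addr &= 0xFFFFFFFF
    # ADDI sign-extends its 12-bit immediate, so round the upper part up
    # whenever the low 12 bits will be treated as negative.
    hi20 = ((addr + 0x800) >> 12) & 0xFFFFF
    lo12 = addr & 0xFFF
    if lo12 >= 0x800:
        lo12 -= 0x1000
    return hi20, lo12


def materialize(hi20: int, lo12: int) -> int:
    """Model LUI followed by ADDI with 32-bit wraparound."""
    return ((hi20 << 12) + lo12) & 0xFFFFFFFF
```

Because the address is fully determined by the two immediates, the pair can be recomputed at any point rather than kept in a register or spilled, which is what makes it a candidate for rematerialization.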
This patch adds hidden visibility to the variable
that is used by the single byte counters mode in
source-based code coverage.
@whitneywhtsang whitneywhtsang added the genx Pull requests or issues for genx branch label Jun 4, 2024
@whitneywhtsang whitneywhtsang requested a review from a team June 4, 2024 04:29
@whitneywhtsang whitneywhtsang self-assigned this Jun 4, 2024
@whitneywhtsang whitneywhtsang merged commit 58f1505 into intel:genx Jun 4, 2024
5 checks passed
@whitneywhtsang whitneywhtsang deleted the merge branch June 4, 2024 14:47
@whitneywhtsang whitneywhtsang restored the merge branch June 16, 2024 21:43