Releases: intel/intel-graphics-compiler
igc-1.0.7076
Fixed Issues / Improvements
- Respect per instruction contraction flag in mad pattern match.
- Add enable preemption to finalizer flags.
- Support for SPV_KHR_linkonce_odr in SPIRV Reader.
- Fix wrong assumption regarding fp64 and int64 when promoting arrays to registers.
- Enhance m_num1DAccesses lookup in CS
- Enable partial emulation for fp64 div/sqrt for OCL
- Change interface for revision id information
- Add possibility to force bindless constant buffers to be untyped.
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@069ced1
- KhronosGroup/SPIRV-LLVM-Translator@9d8d032 (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.7041
Fixed Issues / Improvements
- Keep Fast Math Flags during memory operations simplifications.
- Allow float to packed half-float move on select platforms, second try.
- Fix handling saturation patterns.
- Force private memory to global buffer when generic load/store are present
- Optionally allow for compilation without payload header.
- Fix bug with setting of global variable in kernel arg offsets.
- Fix right bound computation for send destination.
- Fix in NoMask WA for the last BB.
- Change unroll threshold for high trip count, nested loops.
- Support for SPV_INTEL_noopt in OCL adaptor.
- Fix bugs in expandMulPostSchedule pass.
- Other minor fixes and improvements.
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@6713229
- KhronosGroup/SPIRV-LLVM-Translator@9d8d032 (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6909
Fixed Issues / Improvements
- Added ARF latency for scheduler,
- Added more dep info in comment for SWSB,
- Added option to disable VC BiF, disable by default prior to LLVM 9,
- Added extra functionality to InlineLocalsResolution for identification and removal of unused global variables and all their successive recursive user nodes in the def-use tree,
- As prebuiltin lib is meant to be neutral to flags, removing llvm.module.flags,
- DebugInfo should emit several bit_pieces if a variable is larger than a register,
- Enable float accumulator for sel,
- Enable split memory fence operations,
- Fixed build break on Clang, error: unknown warning group,
- Fixed emission of debug info for VC in the presence of indirect calls,
- Fixed emit of InsertElement of uniform vector,
- Fixed indentation in CMakes,
- Fixed issue where ResolveGAS pass caused removal of specific instructions,
- Fixed legalization of stores operating on composite types,
- Fixed missing debug info links when creating Gen specific intrinsics,
- Fixed spill mem size calculation in VC,
- Implemented support for both SPV-IR forms of atomic builtins and OpControlBarrier,
- Improved readability of debug info codebase (NFC),
- Initial support of CM-CL BiF, printf resolution in VC,
- Made dump() no arg function. Add dumptofile(filename),
- Made VC PressureTracker aware of DataLayout,
- New, more accurate implementation of lgamma & tgamma,
- Normalize BE_FP and BE_SP when interpreting them as they are in oword,
- Re-enable memory fence scheduling and do not schedule it beyond branches,
- Refactoring in GenX,
- Removed a power-of-two lookup table,
- Removed OpSource language check assert,
- Renamed function attribute "IndirectlyCalled" to "referenced-indirectly" to match SPIRV FE,
- Separate CMake utilities from main IGC list,
- Speed up GenXLiveness analysis,
- Support for Nontemporal MemoryAccess in SPIRVReader,
- Support import/export SPIRV linkage for indirect calls,
- Support legacy IR without constant addrspace for printf,
- Support printf with args in GenXPrintfResolution,
- Update copyright headers,
- Use std::decay instead of custom functor in Frontend.h,
- Other minor fixes and improvements.
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@6713229
- KhronosGroup/SPIRV-LLVM-Translator@9d8d032 (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6812
Fixed Issues / Improvements
- Implement support for both SPV-IR forms of OpIsNan, OpIsInf, OpIsFinite and OpIsNormal builtins.
- Implement support for both SPV-IR forms of OpLessOrGreater, OpOrdered and OpUnordered builtins.
- Add option to schedule fence commit move.
- Fixed alignment processing in clone helper functions.
- Adding framework for error flag for catching uninitialized variables.
- Adding warning flag for uninitialized variables in Compiler project and cleaning up needed issues.
- Unify tblgen detection in VC.
- Move UnreachableHandling pass after all LowerSwitch pass runs.
- Wrap CM-CL library to support clang-9.
- fcl options string must start with "-cmc" to invoke CM frontend.
- Incrementally apply pattern match transforms.
- Dispatch along y optimization - phase one.
- Added XeHP SDV to platform enum.
- Support "%=" string format for labels in InlineAsm. Transforms this special format string into a unique label suffix for that asm block.
- Add a key: EnableL3FlushForGlobal, to control L3 flush.
- Redesign stackcalls codegen in VC.
- Fix for optimized compilation with debug info.
- Enable accumulator usage for sel instruction.
- Skip step 5 in LowerGPCallArg only when processing function with variable number of arguments.
- Reimplement workgroup reduce, scan_inclusive and scan_exclusive using subgroups.
- Added new passes to igc_opt.
- Changed the naming scheme used by VC to produce debug info dumps.
- Implement support for both SPV-IR forms for OpAny/OpAll builtins.
- Clone routine should make sure that alignment is set correctly.
- Simplifying code related to sample and texel fetch instructions.
- Remove unused included header.
- ZEBinary: Add a regkey to disable printf support.
- Do stateful transformation for non-gep ptr.
- IGA: Add new kv apis and some refactoring.
- Move cleanupBindless after LVN.
- Initial CMCL Support library and tool implementation.
- Do not promote svm gather/scatter w/ mismatched types.
- Decide emission of pre-fills for spills based on presence of corresponding pseudo kill or def count of spill.
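The "%=" item in the list above describes turning a special format string into a label suffix that is unique per asm block. A minimal sketch of that idea follows. The function name `expandUniqueLabels`, the `blockId` parameter, and the `_<id>` suffix format are hypothetical and chosen for illustration only; the release notes do not say how IGC derives or formats the suffix.

```cpp
#include <string>

// Hypothetical helper, not IGC's implementation: replaces every "%=" in an
// inline-asm string with a suffix derived from blockId, so that the same asm
// text expanded for two different blocks yields two different label names.
std::string expandUniqueLabels(const std::string& asmText, unsigned blockId) {
    const std::string token = "%=";
    const std::string suffix = "_" + std::to_string(blockId);  // assumed format
    std::string out;
    std::string::size_type pos = 0;
    while (true) {
        const std::string::size_type hit = asmText.find(token, pos);
        if (hit == std::string::npos) {
            // No more tokens: copy the remaining text and stop.
            out.append(asmText, pos, std::string::npos);
            break;
        }
        out.append(asmText, pos, hit - pos);  // text before the token
        out += suffix;                        // token replaced by the suffix
        pos = hit + token.size();
    }
    return out;
}
```

Under these assumptions, calling `expandUniqueLabels("loop%=: jmp loop%=", 7)` gives `loop_7: jmp loop_7`, while the same text with a different `blockId` gets a different label, so two copies of the asm block do not collide.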
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@6713229
- KhronosGroup/SPIRV-LLVM-Translator@9d8d032 (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6748
Fixed Issues / Improvements
- DebugInfo - changed code layout and added few asserts to line info emission
- Add option to skip memory fence commit.
- This is a minor change to allow stateful transformation for non-gep pointer.
- Make OCLInlineThreshold option available externally
- Enable code patching by default (CodePatch=2)
- Remove a bunch of outdated CMake code
- Remove unused function from BiF
- VC can now dump asm for indirectly-called functions
- Add gen11 and gen12 bindless system routines
- Added check for induction variable sext in Simd32Profitability
- Change unreachable instructions to "return undef"
- Apply the same skipping rules for step 1 and step 5 of LowerGPCallArg
- Change OpenCL builtin mad implementation to use fma instruction instead of multiply add.
- Initial implementation of cm-cl library
- Link with LLVM target if dylib is required
- Switch TPM to SVM entirely
- Moving opencl-clang discovery code to outer scope to make it available for VC
- Simplifying code related to sample and texel fetch instructions
- Generate native sqrt for fast llvm sqrt operation and match reciprocal sqrt
- Refactor Sub- and Work- group Scan and Reduce
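The switch of the OpenCL `mad` builtin from multiply-add to fma, listed above, can change results because a fused multiply-add rounds once instead of twice. The sketch below illustrates that numerical difference on the host with standard C++ only. The function names `madUnfused` and `madFused` are hypothetical, and the code says nothing about the instructions IGC actually emits.

```cpp
#include <cmath>

// Unfused form: the product is rounded to float before the addition.
// The volatile temporary keeps the compiler from contracting the two
// operations into a single fused instruction.
float madUnfused(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Fused form: std::fma computes a * b + c with a single rounding.
float madFused(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

With `a = b = 1 + 2^-12` and `c = -(1 + 2^-11)`, the exact product is `1 + 2^-11 + 2^-24`. The unfused form rounds the product to `1 + 2^-11` and returns 0, while the fused form keeps the low bit and returns `2^-24`.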
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@7ee152a
- KhronosGroup/SPIRV-LLVM-Translator@9d8d032 (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6712
Fixed Issues / Improvements
- Simplifying PrivateMemoryResolution
- Simplify SWSB fields in G4_INST
- Fix debug info link in VISA for caller save/restore code.
- Fixed stack call implicit arg mismatch between caller/callee.
- Fix the pseudo kill for RA
- Improve jump codegen by setting uniform if jump's flag is workgroup/global uniform
- Other minor fixes and improvements.
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@7ee152a
- KhronosGroup/SPIRV-LLVM-Translator@9d8d032 (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6646
Fixed Issues / Improvements
- Added a key to dump out WIA information into a unique file per each function invocation,
- Added comments for stack callee function prolog to assist debugging,
- Added indirect regioning restrictions,
- Added ld cases for texture folding,
- Added legalization checks for VxH regions for int-to-fp moves,
- Added localization of live ranges to reduce accumulator usages,
- Added shader dump for spec constants,
- Added support for a new pattern in PushAnalysis::IsStatelessCBLoad() to detect,
- Added VISA option -dumpintf to dump RA interference graph,
- Added more passes to igc_opt,
- Added reporting warnings inside IGC passes,
- Added handles to plane coefficients,
- Added reg keys for scheduled BB range in local scheduler,
- Added additional DP emulation mode,
- Allow coalescing of spill/fill in presence of stack calls,
- Allow remat even for operations using NoMaskWA,
- Applying renaming to linear scan RA spill/fill,
- Change memory semantics to relaxed for OpenCL 1.x atomics,
- Decouple VC debug options to allow emission of debug information without debuggable kernels,
- Emit warning about an unsupported debuggability if ZeBin is requested,
- Enabled Wa16012061344 for read suppression issue caused by predictor,
- Enhancements in compiler output,
- Extended FCL dumps with CMFE options and inputs,
- Favoring the llvm::BasicBlock name for vISA labels,
- Fixed build break in Fedora,
- Fixed emission of debug information for implicit variable locations,
- Fixed erroneous size calculation for DW_OP_bit_piece,
- Fixes for media height support in Cisa Builder,
- Fixes for PosDep MatchMad condition,
- Fixed logic when LLVM name is the empty string,
- Fixed missing barrier when inline ASM is used in a kernel,
- Fixed non-deterministic Function->VisaModule lookup,
- Fixed PushAnalysis to not create unaligned 64bit runtime value arguments,
- Fixed the hybrid RA with spill,
- Fixed the linear scan RA time status,
- Fixed issue for multiple thread compilation of shaders,
- Fixed GEP scalarized indexes calculation in CG_LowerGEPForPrivMem pass,
- For optnone builtins, allow IGC to determine inline/noinline and stackcall/subroutine calls,
- GenISA ibfe/ubfe constant literal offset may exceed 31,
- Implemented by value argument linearization,
- Implemented IGC_ASSERT in IGC/OCLFE,
- Improved DebugInfo robustness by implementing naive error-handling,
- Lifted 4K predicate variable restriction on vISA assembly,
- Made LinearScan default in ForceFastestSIMD,
- Misc. initial edits to the file parsing code in the global scope,
- Moving BiF parsing tools to a separate file,
- Renamed some VC options to have "-vc" prefix instead of "-genx",
- Reworked setPredicateForDiscard() to not use a temporary register for flag storage,
- Select phi input in non-overlapping region,
- Support for function pointer builtins/intrinsics,
- Support for Function pointer SIMD Variants,
- Support for uniformly typed read,
- Unify conditions for llvm::JumpThreading usage,
- Updated copyright headers,
- Updated DPEmu,
- Updated the indirect call info check in SWSB,
- VC: backend can lower lzd64,
- VC: debug info fixes for non-standalone kernels,
- VC: legacy messages legalization to vc-codegen,
- vISA: add helper function for calla check,
- vISA: add HWConformity::fixCalla for HW restriction,
- Simplifying ldrawvector to ldrawindex when we have a case where only one element is being used and we know the offset is a constant integer value,
- Other minor fixes and improvements.
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@7ee152a
- KhronosGroup/SPIRV-LLVM-Translator@ab5e12a (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6410
Fixed Issues / Improvements
- Consider a WA table entry before inserting a flush sampler instruction
- Location expressions improvements
- Do not split arithmetic instructions in IGC as vISA will handle it
- Backing out Simple push algorithm Optimization
- Fix reg number issue in translate math
- Changes for -O2. Optimizing non-user functions to save compiling time.
- Fix the SWSB when there is no send in kernel
- Add support to generate thread IDs in 2x2 blocks.
- Separate global and local variables to reduce compilation time.
- Don't replace OpDecorate with OpGroupDecorate.
- Add InferAddressSpacesPass only if needed.
- Fix crash in SIMD32 mode caused by pseudo_ret instruction's source operand right bound computation.
- Update DispatchGPGPUWalkerAlongYFirst lookup
- Changed ldms and ldmcs conversion to 16bit. Fixed ldmcs usage in users other than 16bit ldms
- Cleanup unnecessary dynamic allocations.
- Avoid warning of implicit i64->i32 by forcing explicit conversion.
- Optimization for signed remainder for constant power of 2 int32.
- Switch TPM to SVM entirely.
- Do not modify wrregion input in non-overlapping region optimization.
- Simplify usage of IGC_BUILD__VC_ENABLED cmake option. Change IGC_VC_DISABLED macro to more consistent IGC_VC_ENABLED
- Removed external dependency on llvm_patches and improved llvm setup in project
- Produce truncate instead of __builtin_spirv_OpUConvert for not rounded/saturated converts
- Fix missing barrier when inline ASM is used in a kernel
- Split uniform into thread uniform, work group uniform, and global uniform, which gives us detailed info that could be used to enable better optimization.
- Set InlineAsm usage per function group, to create correct builder for multiple FGs.
- Support for stackcalls with InlineAsm by parsing multiple functions in single text stream.
- Broadcast uniform variables if 'rw' constraint was specified (Inline ASM)
- Optimize generic pointer load for kernels not using local memory.
- Bug fix for SWSB when comparing the footprint.
- Extend GAS phi resolution to all loops, not only top level ones.
- Remove the dependence between dummy csel instructions.
- Adds custom iterator class for Function Group. Can iterate through the FunctionGroup class, which uses a 2D vector storage.
- Change OpenCL builtin mad implementation to use fma instruction instead of multiply add.
- Cast Base and Insert parameters to unsigned to avoid sign extension while shifting
- Add check for compute shaders that may need XYZ walk of thread IDs.
- ZEBinary: Fix scratch memory buffer creation.
- If unmasked regions are nested, the innermost llvm.genx.GenISA.UnmaskedRegionEnd intrinsic switched off unmasked code generation, resulting in the enclosing regions being generated as masked code.
- Extra flag has been added to WIAnalysis Runner to not mark some uniform instructions as random.
- Added a field to implicit argument structure for stack calls. Modified layout of local ids based on SIMD size.
- IGA: add disassembler option "--output-on-fail"
- Fix discovery of inlined DISubprogram nodes
- Implement support for both SPV-IR forms for BitFieldInsert builtins
- Introduction of new entry in IGC constant folder for bfrev.
- Update TracePointerSource() function to detect cases where two different resource pointer values describe the same resource.
- Vector backend does not support creation of L0 module with external functions. Insert assert in GenXCisaBuilder, explaining that.
- Take SpillMemOffset into consideration when reporting spill size.
- Split send has argument no. 4, which can be an addr register. Make sure to check dependence on src3 as well.
- Add case when propagating non-generic pointer to store.
- Disable certain transformations when compiling code for debug.
- Add -vc-promote-array-alloca-limit knob to control array promotion total size (2nd edition). Force array promotion for CMRT binary.
- Replace strcat by compound assignment operator
- Now appropriately handling shl instructions with unsupported types.
- More fixes to get local RA to honor declare even-alignment.
- Print SLMsize in compiler output file
- IGA SWSB refactoring: Unify InstType getter function
- Extract vc input handling into another function
- Fix an assertion due to unexpected RAUW with a constant
- Extend supported subtargets in VC
- Solve the memory leak issue of SWSB
- Add control to route some resources to LSC/HDC
- Fix scratch surface allocation for VC
- Remove addrspacecast only if there are no other uses.
- Set alwaysinline on invoke kernels. Don't add stack call or indirect call attributes.
- Add interface target for vc intrinsics headers
- Move stepping into Options instead of a global variable.
- Add DoNotSpill attribute for vISA variables.
- ZEBinary: Support buffer_offset implicit argument
- If all its operands are region invariant, an inst is region invariant.
- Commit base data structures for implicit argument handling for bindless offsets. Changes in StatelessToBindless promotion will come later.
- For optnone builtins, allow -O0 flag to determine if we should call them as subroutines or stackcalls.
- Allow EnableA64WA env variable in Linux release mode.
- BinaryEncodingIGA: fix math pipe instruction check
- Upgraded error messages with source file locations and names of the kernel causing the error.
- Implement support for both SPV-IR forms for conversion builtins
- Prevent redundant lowering attempt during SIMD CF Conformance
- Make sure trivial RA honors even-alignment.
- ZEBinary: add regkey to enable .bss section for zero-initialized global variables
- Add -vc-promote-array-alloca-limit knob to control array promotion total size
- Add simplify CFG pass to pass manager to simplify work of LICM
- Add an option for GenXPromoteArray threshold
- Debug location expression improvements
- Reduce memory footprint in GraphColor
- Fix binary encoding for simd2 align16 instructions
- Filter out "endif" and "else" when inserting dummy mov.
- Avoid localization of large data for oclbin, use relocation instead.
- Add option for TPM memory placement.
- Correct localization costs for global vectors.
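The signed-remainder item above concerns `x % 2^k` for int32, which a compiler can lower without a division. The sketch below shows one common way to do this with a bias and a mask. It illustrates the general technique only; the function name `remPow2` is hypothetical and the release notes do not describe the exact sequence IGC generates.

```cpp
#include <cstdint>

// Computes x % (1 << k) with C++ truncating semantics (the result takes the
// sign of x) using add/and/sub instead of a division. Assumes 1 <= k <= 30.
int32_t remPow2(int32_t x, unsigned k) {
    const uint32_t mask = (uint32_t{1} << k) - 1;
    // Negative inputs are biased by (2^k - 1) so that masking rounds toward
    // zero rather than toward negative infinity.
    const uint32_t bias = (x < 0) ? mask : 0u;
    const uint32_t ux = static_cast<uint32_t>(x);  // modular, well-defined
    const uint32_t low = (ux + bias) & mask;       // value in [0, 2^k - 1]
    return static_cast<int32_t>(low) - static_cast<int32_t>(bias);
}
```

For example, `remPow2(-7, 2)` biases -7 to -4, masks to 0, and subtracts the bias to give -3, which matches `-7 % 4` in C++.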
Dependencies revisions
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@7ee152a
- KhronosGroup/SPIRV-LLVM-Translator@ab5e12a (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6087
Fixed Issues / Improvements
- Fix a bug when comparing two source regions as type was not considered.
- Other minor fixes and improvements.
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@5032643
- KhronosGroup/SPIRV-LLVM-Translator@ab5e12a (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.6083
Fixed Issues / Improvements
- Process legalization of 64-bit moves on VC,
- Added the support for sending the textureID and dimensions value to the driver,
- Support for SPV_EXT_shader_atomic_float_min_max,
- Update the read suppression WA to use less dummy instructions,
- Move CMABI::doInitialization code in a separate helper analysis,
- Do not hash label operands, CreateLabel() should always return a new label,
- Modernize GraphColor code,
- Optimize to generate mad by promoting src2 from :b to :wv,
- Fix for creating a dump directory for Linux,
- Update configuration_flags.md,
- Fix Phi handling in SIMD CF Conformance,
- Introduction of new entry in IGC constant folder for bfi,
- Extend ValueTracker to be able to track inside user functions,
- Remove omitting zero/undef sample params for cube maps,
- Bug fixes for O0 inlining heuristic,
- Corrects a defect where the vISA asm parser erroneously used,
- Move ocl runtime info to headers,
- Added missing code for ADL_S and RKL,
- Fix builtin mangling for OpReadClockKHR,
- Make sure structurizer uses correct mask offset,
- Setting FunctionControl to force indirect call now applies to all user functions,
- Fixed logic error when there exists a reg key which is a substring of another reg key (i.e. ShaderDumpEnable and ShaderDumpEnableAll),
- Add UMD control to disable higher Simds,
- Report private memory usage in assembly dump,
- Other minor fixes and improvements.
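The reg key item above (ShaderDumpEnable versus ShaderDumpEnableAll) is an instance of a familiar lookup bug: matching a key by substring search also matches any longer key that contains it. The sketch below contrasts the two approaches. The `Name=value` line format and both function names are assumptions made for illustration; the release notes do not describe how IGC stores or parses its keys.

```cpp
#include <string>

// Buggy lookup: a substring search also accepts longer keys that merely
// contain the requested key.
bool hasKeyNaive(const std::string& line, const std::string& key) {
    return line.find(key) != std::string::npos;
}

// Exact lookup: the text before '=' (or the whole line if there is no '=')
// must equal the requested key.
bool hasKeyExact(const std::string& line, const std::string& key) {
    const std::string::size_type eq = line.find('=');
    const std::string name = line.substr(0, eq);
    return name == key;
}
```

Given the hypothetical line `ShaderDumpEnableAll=1`, the naive lookup reports a match for `ShaderDumpEnable`, while the exact lookup does not.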
Dependencies revisions
- intel/llvm-patches@9cbc7cf
- intel/opencl-clang@c8cd72e
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@5032643
- KhronosGroup/SPIRV-LLVM-Translator@ab5e12a (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.