Releases: intel/intel-graphics-compiler

igc-1.0.7076

27 Apr 14:26

Fixed Issues / Improvements

  • Respect per instruction contraction flag in mad pattern match.
  • Add enable preemption to finalizer flags.
  • Support for SPV_KHR_linkonce_odr in SPIRV Reader.
  • Fix wrong assumption regarding fp64 and int64 made when promoting arrays to registers.
  • Enhance m_num1DAccesses lookup in CS
  • Enable partial emulation for fp64 div/sqrt for OCL
  • Change interface for revision id information
  • Add possibility to force bindless constant buffers to be untyped.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.7041

19 Apr 13:29

Fixed Issues / Improvements

  • Keep Fast Math Flags during memory operations simplifications.
  • Allow float to packed half-float move on select platforms, second try.
  • Fix handling saturation patterns.
  • Force private memory to global buffer when generic load/store are present
  • Optionally allow for compilation without payload header.
  • Fix bug with setting of global variable in kernel arg offsets.
  • Fix right bound computation for send destination.
  • Fix in NoMask WA for the last BB.
  • Change unroll threshold for high trip count, nested loops.
  • Support for SPV_INTEL_noopt in OCL adaptor.
  • Fix bugs in expandMulPostSchedule pass.
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6909

12 Apr 13:43

Fixed Issues / Improvements

  • Added ARF latency for scheduler,
  • Added more dep info in comment for SWSB,
  • Added option to disable VC BiF, disable by default prior to LLVM 9,
  • Added extra functionality to InlineLocalsResolution for identification and removal of unused global variables and all their successive recursive user nodes in the def-use tree,
  • As prebuiltin lib is meant to be neutral to flags, removing llvm.module.flags,
  • DebugInfo should emit several bit_pieces if a variable is larger than a register,
  • Enable float accumulator for sel,
  • Enable split memory fence operations,
  • Fixed build break on Clang, error: unknown warning group,
  • Fixed emission of debug info for VC in the presence of indirect calls,
  • Fixed emit of InsertElement of uniform vector,
  • Fixed indentation in CMakes,
  • Fixed issue where ResolveGAS pass caused removal of specific instructions,
  • Fixed legalization of stores operating on composite types,
  • Fixed missing debug info links when creating Gen specific intrinsics,
  • Fixed spill mem size calculation in VC,
  • Implemented support for both SPV-IR forms of atomic builtins and OpControlBarrier,
  • Improved readability of debug info codebase (NFC),
  • Initial support of CM-CL BiF, printf resolution in VC,
  • Made dump() no arg function. Add dumptofile(filename),
  • Made VC PressureTracker aware of DataLayout,
  • New, more accurate implementation of lgamma & tgamma,
  • Normalize BE_FP and BE_SP when interpreting them as they are in oword,
  • Re-enable memory fence scheduling and do not schedule it beyond branches,
  • Refactoring in GenX,
  • Removed a power-of-two lookup table,
  • Removed OpSource language check assert,
  • Renamed function attribute "IndirectlyCalled" to "referenced-indirectly" to match SPIRV FE,
  • Separate CMake utilities from main IGC list,
  • Speed up GenXLiveness analysis,
  • Support for Nontemporal MemoryAccess in SPIRVReader,
  • Support import/export SPIRV linkage for indirect calls,
  • Support legacy IR without constant addrspace for printf,
  • Support printf with args in GenXPrintfResolution,
  • Update copyright headers,
  • Use std::decay instead of custom functor in Frontend.h,
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6812

30 Mar 12:40

Fixed Issues / Improvements

  • Implement support for both SPV-IR forms of OpIsNan, OpIsInf, OpIsFinite and OpIsNormal builtins.
  • Implement support for both SPV-IR forms of OpLessOrGreater, OpOrdered and OpUnordered builtins.
  • Add option to schedule fence commit move.
  • Fixed alignment processing in clone helper functions.
  • Adding framework for error flag for catching uninitialized variables.
  • Adding warning flag for uninitialized variables in Compiler project and cleaning up the resulting issues.
  • Unify tblgen detection in VC.
  • Move UnreachableHandling pass after all LowerSwitch pass runs.
  • Wrap CM-CL library to support clang-9.
  • fcl options string must start with "-cmc" to invoke CM frontend.
  • Incrementally apply pattern match transforms.
  • Dispatch along y optimization - phase one.
  • Added XeHP SDV to platform enum.
  • Support "%=" string format for labels in InlineAsm. Transforms this special format string into a unique label suffix for that asm block.
  • Add a key: EnableL3FlushForGlobal, to control L3 flush.
  • Redesign stackcalls codegen in VC.
  • Fix for optimized compilation with debug info.
  • Enable accumulator usage for sel instruction.
  • Skip step 5 in LowerGPCallArg only when processing function with variable number of arguments.
  • Reimplement workgroup reduce, scan_inclusive and scan_exclusive using subgroups.
  • Added new passes to igc_opt.
  • Changed the naming scheme used by VC to produce debug info dumps.
  • Implement support for both SPV-IR forms for OpAny/OpAll builtins.
  • Clone routine should make sure that alignment is set correctly.
  • Simplifying code related to sample and texel fetch instructions.
  • Remove unused included header.
  • ZEBinary: Add a regkey to disable printf support.
  • Do stateful transformation for non-gep ptr.
  • IGA: Add new kv apis and some refactoring.
  • Move cleanupBindless after LVN.
  • Initial CMCL Support library and tool implementation.
  • Do not promote svm gather/scatter w/ mismatched types.
  • Decide emission of pre-fills for spills based on presence of corresponding pseudo kill or def count of spill.
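
The "%=" item in the list above describes turning a special format string into a label suffix that is unique per asm block. A minimal sketch of that kind of expansion follows, assuming the suffix is simply a numeric block id. The helper name `expandUniqueLabels` and the `blockId` parameter are hypothetical and are not taken from IGC; the label syntax in the usage note is illustrative only.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not an IGC API): replaces every "%=" in an inline-asm
// string with a numeric suffix, so labels stay unique when the same asm text
// is emitted more than once.
std::string expandUniqueLabels(const std::string& asmText, unsigned blockId) {
    const std::string suffix = std::to_string(blockId);
    std::string out;
    out.reserve(asmText.size());
    for (std::size_t i = 0; i < asmText.size(); ++i) {
        const bool isMarker = asmText[i] == '%' &&
                              i + 1 < asmText.size() &&
                              asmText[i + 1] == '=';
        if (isMarker) {
            out += suffix;  // substitute the per-block suffix
            ++i;            // skip the '=' that belongs to the marker
        } else {
            out += asmText[i];
        }
    }
    return out;
}
```

With this sketch, `expandUniqueLabels("loop_%=: jmp loop_%=", 7)` yields `"loop_7: jmp loop_7"`, and a second block with a different id gets a different label.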

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6748

30 Mar 12:36

Fixed Issues / Improvements

  • DebugInfo - changed code layout and added few asserts to line info emission
  • Add option to skip memory fence commit.
  • Allow stateful transformation for non-gep pointer.
  • Make the OCLInlineThreshold option available externally
  • Enable code patching by default (CodePatch=2)
  • Remove a bunch of outdated CMake code
  • Remove unused function from BiF
  • VC can now dump asm for indirectly-called functions
  • Add gen11 and gen12 bindless system routines
  • Added check for induction variable sext in Simd32Profitability
  • Change unreachable instructions to "return undef"
  • Apply the same skipping rules for step 1 and step 5 of LowerGPCallArg
  • Change OpenCL builtin mad implementation to use fma instruction instead of multiply add.
  • Initial implementation of cm-cl library
  • Link with LLVM target if dylib is required
  • Switch TPM to SVM entirely
  • Moving opencl-clang discovery code to outer scope to make it available for VC
  • Simplifying code related to sample and texel fetch instructions
  • Generate native sqrt for fast llvm sqrt operation and match reciprocal sqrt
  • Refactor Sub- and Work- group Scan and Reduce
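
The mad item in the list above replaces a multiply followed by an add with a fused multiply-add. The two are not numerically identical: the separate form rounds the product before adding, while fma rounds only once. A small sketch of the difference, using standard `std::fma`, is below. The function names `mad_separate` and `mad_fused` are hypothetical and this is not the IGC builtin implementation.

```cpp
#include <cmath>

// Multiply, round to float, then add. The volatile store keeps the compiler
// from contracting the expression into an fma on its own.
float mad_separate(float a, float b, float c) {
    volatile float product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c computed with a single rounding at the end.
float mad_fused(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

For example, with `a = b = 1 + 2^-12` and `c = -(1 + 2^-11)`, the exact product is `1 + 2^-11 + 2^-24`. The separate form rounds the product to `1 + 2^-11` and returns 0, while the fused form keeps the low bit and returns `2^-24`.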

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6712

22 Mar 17:23

Fixed Issues / Improvements

  • Simplifying PrivateMemoryResolution
  • Simplify SWSB fields in G4_INST
  • Fix debug info link in VISA for caller save/restore code.
  • Fixed stack call implicit arg mismatch between caller/callee.
  • Fix the pseudo kill for RA
  • Improve jump codegen by setting uniform if jump's flag is workgroup/global uniform
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6646

16 Mar 14:11

Fixed Issues / Improvements

  • Added a key to dump out WIA information into a unique file per each function invocation,
  • Added comments for stack callee function prolog to assist debugging,
  • Added indirect regioning restrictions,
  • Added ld cases for texture folding,
  • Added legalization checks for VxH regions for int-to-fp moves,
  • Added localization of live ranges to reduce accumulator usages,
  • Added shader dump for spec constants,
  • Added support for a new pattern in PushAnalysis::IsStatelessCBLoad() to detect,
  • Added VISA option -dumpintf to dump RA interference graph,
  • Added more passes to igc_opt,
  • Added reporting warnings inside IGC passes,
  • Added handles to plane coefficients,
  • Added reg keys for scheduled BB range in local scheduler,
  • Added additional DP emulation mode,
  • Allow coalescing of spill/fill in presence of stack calls,
  • Allow remat even for operations using NoMaskWA,
  • Applying renaming to linear scan RA spill/fill,
  • Change memory semantics to relaxed for OpenCL 1.x atomics,
  • Decouple VC debug options to allow emission of debug information without debuggable kernels,
  • Emit warning about an unsupported debuggability if ZeBin is requested,
  • Enabled Wa16012061344 for read suppression issue caused by predictor,
  • Enhancements in compiler output,
  • Extended FCL dumps with CMFE options and inputs,
  • Favoring the llvm::BasicBlock name for vISA labels,
  • Fixed build break in Fedora,
  • Fixed emission of debug information for implicit variable locations,
  • Fixed erroneous size calculation for DW_OP_bit_piece,
  • Fixes for media height support in Cisa Builder,
  • Fixes for PosDep MatchMad condition,
  • Fixed logic when LLVM name is the empty string,
  • Fixed missing barrier when inline ASM is used in a kernel,
  • Fixed non-deterministic Function->VisaModule lookup,
  • Fixed PushAnalysis to not create unaligned 64bit runtime value arguments,
  • Fixed the hybrid RA with spill,
  • Fixed the linear scan RA time status,
  • Fixed issue for multiple thread compilation of shaders,
  • Fixed GEP scalarized indexes calculation in CG_LowerGEPForPrivMem pass,
  • For optnone builtins, allow IGC to determine inline/noinline and stackcall/subroutine calls,
  • GenISA ibfe/ubfe constant literal offset may exceed 31,
  • Implemented by value argument linearization,
  • Implemented IGC_ASSERT in IGC/OCLFE,
  • Improved DebugInfo robustness by implementing naive error-handling,
  • Lifted 4K predicate variable restriction on vISA assembly,
  • Made LinearScan default in ForceFastestSIMD,
  • Misc. initial edits to the file parsing code in the global scope,
  • Moving BiF parsing tools to a separate file,
  • Renamed some VC options to have "-vc" prefix instead of "-genx",
  • Reworked setPredicateForDiscard() to not use a temporary register for flag storage,
  • Select phi input in non-overlapping region,
  • Support for function pointer builtins/intrinsics,
  • Support for Function pointer SIMD Variants,
  • Support for uniformly typed read,
  • Unify conditions for llvm::JumpThreading usage,
  • Updated copyright headers,
  • Updated DPEmu,
  • Updated the indirect call info check in SWSB,
  • VC: backend can lower lzd64,
  • VC: debug info fixes for non-standalone kernels,
  • VC: legacy messages legalization to vc-codegen,
  • vISA: add helper function for calla check,
  • vISA: add HWConformity::fixCalla for HW restriction,
  • Simplifying ldrawvector to ldrawindex when we have a case where only one element is being used and we know the offset is a constant integer value,
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6410

02 Mar 13:36

Fixed Issues / Improvements

  • Consider a WA table entry before inserting a flush sampler instruction
  • Location expressions improvements
  • Do not split arithmetic instructions in IGC as vISA will handle it
  • Backing out Simple push algorithm Optimization
  • Fix reg number issue in translate math
  • Changes for -O2. Optimizing non-user functions to save compiling time.
  • Fix the SWSB when there is no send in kernel
  • Add support to generate thread IDs in 2x2 blocks.
  • Separate global and local variables to reduce compilation time.
  • Don't replace OpDecorate with OpGroupDecorate.
  • Add InferAddressSpacesPass only if needed.
  • Fix crash in SIMD32 mode caused by pseudo_ret instruction's source operand right bound computation.
  • Update DispatchGPGPUWalkerAlongYFirst lookup
  • Changed ldms and ldmcs conversion to 16bit. Fixed ldmcs usage in users other than 16bit ldms
  • Cleanup unnecessary dynamic allocations.
  • Avoid warning of implicit i64->i32 by forcing explicit conversion.
  • Optimization for signed remainder for constant power of 2 int32.
  • Switch TPM to SVM entirely.
  • Do not modify wrregion input in non-overlapping region optimization.
  • Simplify usage of IGC_BUILD__VC_ENABLED cmake option. Change IGC_VC_DISABLED macro to the more consistent IGC_VC_ENABLED
  • Removed external dependency on llvm_patches and improved llvm setup in project
  • Produce truncate instead of __builtin_spirv_OpUConvert for not rounded/saturated converts
  • Fix missing barrier when inline ASM is used in a kernel
  • Split uniform into thread uniform, work group uniform, and global uniform, which gives more detailed info that could be used to enable better optimization.
  • Set InlineAsm usage per function group, to create correct builder for multiple FGs.
  • Support for stackcalls with InlineAsm by parsing multiple functions in single text stream.
  • Broadcast uniform variables if 'rw' constraint was specified (Inline ASM)
  • Optimize generic pointer load for kernels not using local memory.
  • Bug fix for SWSB when comparing the footprint.
  • Extend GAS phi resolution to all loops, not only top level ones.
  • Remove the dependence between dummy csel instructions.
  • Adds custom iterator class for Function Group. Can iterate through the FunctionGroup class, which uses a 2D vector storage.
  • Change OpenCL builtin mad implementation to use fma instruction instead of multiply add.
  • Cast Base and Insert parameters to unsigned to avoid sign extension while shifting
  • Add check for compute shaders that may need XYZ walk of thread IDs.
  • ZEBinary: Fix scratch memory buffer creation.
  • If unmasked regions are nested, the innermost llvm.genx.GenISA.UnmaskedRegionEnd intrinsic switched off unmasked code generation, so the enclosing regions were generated as masked code.
  • Extra flag has been added to WIAnalysis Runner to not mark some uniform instructions as random.
  • Added a field to implicit argument structure for stack calls. Modified layout of local ids based on SIMD size.
  • IGA: add disassembler option "--output-on-fail"
  • Fix discovery of inlined DISubprogram nodes
  • Implement support for both SPV-IR forms for BitFieldInsert builtins
  • Introduction of new entry in IGC constant folder for bfrev.
  • Update TracePointerSource() function to detect cases where two different resource pointer values describe the same resource.
  • Vector backend does not support creation of L0 module with external functions. Insert assert in GenXCisaBuilder, explaining that.
  • Take SpillMemOffset into consideration when reporting spill size.
  • Split send has argument no. 4, which can be an address register; make sure the dependence on src3 is checked as well.
  • Add case when propagating non-generic pointer to store.
  • Disable certain transformations when compiling code for debug.
  • Add -vc-promote-array-alloca-limit knob to control array promotion total size (2nd edition). Force array promotion for CMRT binary.
  • Replace strcat by compound assignment operator
  • Now appropriately handling shl instructions with unsupported types.
  • More fixes to get local RA to honor declare even-alignment.
  • Print SLMsize in compiler output file
  • IGA SWSB refactoring: Unify InstType getter function
  • Extract vc input handling into another function
  • Fix an assertion due to unexpected RAUW with a constant
  • Extend supported subtargets in VC
  • Solve the memory leak issue of SWSB
  • Add control to route some resources to LSC/HDC
  • Fix scratch surface allocation for VC
  • Remove addrspacecast only if there are no other uses.
  • Set alwaysinline on invoke kernels. Don't add stack call or indirect call attributes.
  • Add interface target for vc intrinsics headers
  • Move stepping into Options instead of a global variable.
  • Add DoNotSpill attribute for vISA variables.
  • ZEBinary: Support buffer_offset implicit argument
  • If all its operands are region invariant, an inst is region invariant.
  • Commit base data structures for implicit argument handling for bindless offsets. Changes in StatelessToBindless promotion will come later.
  • For optnone builtins, allow -O0 flag to determine if we should call them as subroutines or stackcalls.
  • Allow EnableA64WA env variable in Linux release mode.
  • BinaryEncodingIGA: fix math pipe instruction check
  • Upgraded error messages with source file locations and names of the kernel causing the error.
  • Implement support for both SPV-IR forms for conversion builtins
  • Prevent redundant lowering attempt during SIMD CF Conformance
  • Make sure trivial RA honors even-alignment.
  • ZEBinary: add regkey to enable .bss section for zero-initialized global variables
  • Add -vc-promote-array-alloca-limit knob to control array promotion total size
  • Add simplify CFG pass to pass manager to simplify work of LICM
  • Add an option for GenXPromoteArray threshold
  • Debug location expression improvements
  • Reduce memory footprint in GraphColor
  • Fix binary encoding for simd2 align16 instructions
  • Filter out "endif" and "else" when inserting dummy mov.
  • Avoid localization of large data for oclbin, use relocation instead.
  • Add option for TPM memory placement.
  • Correct localization costs for global vectors.
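
One item in the list above mentions an optimization for signed int32 remainder by a constant power of two. The release note does not say how IGC lowers it, so the following is only a sketch of the well-known bias-and-mask technique for this case. The function name `sremPow2` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Computes x % (1 << k) with C/C++ truncating semantics (result takes the
// sign of x) without a division. Assumes 1 <= k <= 30.
int32_t sremPow2(int32_t x, unsigned k) {
    assert(k >= 1 && k <= 30);
    const uint32_t mask = (uint32_t{1} << k) - 1;
    // bias == mask for negative x and 0 otherwise.
    const uint32_t sign = x < 0 ? 0xFFFFFFFFu : 0u;
    const uint32_t bias = sign >> (32 - k);
    // Adding the bias before masking makes the result round toward zero.
    const uint32_t biased = static_cast<uint32_t>(x) + bias;
    return static_cast<int32_t>(biased & mask) - static_cast<int32_t>(bias);
}
```

For instance, `sremPow2(-5, 2)` follows the same rule as `-5 % 4` and returns -1, while `sremPow2(5, 2)` returns 1. A plain `x & mask` would be wrong for negative inputs, which is why the bias step is needed.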

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6087

08 Feb 14:57

Fixed Issues / Improvements

  • Fix a bug when comparing two source regions as type was not considered.
  • Other minor fixes and improvements.

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.

igc-1.0.6083

26 Jan 12:29

Fixed Issues / Improvements

  • Process legalization of 64-bit moves on VC,
  • Added the support for sending the textureID and dimensions value to the driver,
  • Support for SPV_EXT_shader_atomic_float_min_max,
  • Update the read suppression WA to use less dummy instructions,
  • Move CMABI::doInitialization code in a separate helper analysis,
  • Do not hash label operands, CreateLabel() should always return a new label,
  • Modernize GraphColor code,
  • Optimize to generate mad by promoting src2 from :b to :wv,
  • Fix for creating a dump directory for Linux,
  • Update configuration_flags.md,
  • Fix Phi handling in SIMD CF Conformance,
  • Introduction of new entry in IGC constant folder for bfi,
  • Extend ValueTracker to be able to track inside user functions,
  • Remove omitting zero/undef sample params for cube maps,
  • Bug fixes for O0 inlining heuristic,
  • Corrects a defect where the vISA asm parser erroneously used,
  • Move ocl runtime info to headers,
  • Added missing code for ADL_S and RKL,
  • Fix builtin mangling for OpReadClockKHR,
  • Make sure structurizer uses correct mask offset,
  • Setting FunctionControl to force indirect call now applies to all user functions,
  • Fixed logic error when there exists a reg key which is a substring of another reg key (i.e. ShaderDumpEnable and ShaderDumpEnableAll),
  • Add UMD control to disable higher Simds,
  • Report private memory usage in assembly dump,
  • Other minor fixes and improvements.
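
The reg key item in the list above describes a logic error when one key name is a substring of another (ShaderDumpEnable and ShaderDumpEnableAll). How IGC actually parses its keys is not stated here, so the sketch below only illustrates the general class of bug, assuming a hypothetical `Name=Value` line format. Both function names are made up for illustration.

```cpp
#include <string>

// Buggy variant: a plain prefix comparison, so a line that sets
// "ShaderDumpEnableAll" is also treated as setting "ShaderDumpEnable".
bool lineSetsKeyPrefixOnly(const std::string& line, const std::string& key) {
    return line.compare(0, key.size(), key) == 0;
}

// Stricter variant: the key must be followed immediately by '=', so a longer
// key that merely starts with the same characters does not match.
bool lineSetsKeyExact(const std::string& line, const std::string& key) {
    return line.size() > key.size() &&
           line.compare(0, key.size(), key) == 0 &&
           line[key.size()] == '=';
}
```

With the line `ShaderDumpEnableAll=1`, the prefix-only check reports a match for `ShaderDumpEnable`, while the exact check does not.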

Dependencies revisions

Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.