Releases: intel/intel-graphics-compiler
igc-1.0.5176
Fixed Issues / Improvements
- Enabling inlining for -O0 to preserve debug info.
- Replace libclc cospi and sinpi implementations with svml versions.
- Make the -cmc option work the same way as -vc-codegen.
- Add a separate constant buffer for string literals in ZeBin path.
- Several improvements to computation of ranges in debug_loc section.
- Fix for comparison of ConstantDataSequentials when comparing constants
- Enhance function pointers support in VC.
- Fixed sporadic failures due to uninitialized variable.
- Open-source svml cos/sin/sincos functions, which will be used when input's absolute value is less than 10000. Otherwise libclc implementation will be used.
- Fix capability string in SPIRV reader to be conformant with spec.
- Fix dominance corruption in SIMD CF Conformance
- Fix fmax builtin conversion
- Fix for select condition checker in SIMD CF Conformance
- Add support for GenISA_simdBlockRead and GenISA_simdMediaBlockWrite GenISA intrinsics in Emu64OpsPass. This fixes the crash in IGC when compiling kernels with intel_sub_group_block_image_long for ICL platform.
- Fix building for LLVM11
- Add helper function to get image type from KernelArg (NFC)
- Check for illegal VxH operands (<1,0> with exec size 32) in vISA verifier.
- Fix wrong expected usage in genx wrapper
- Remove split of cmp instructions in CISACodeGen as it is now handled by vISA.
- CMABI is added to the list of VC passes in the plugin
- Changes in preparation of LLVM 11 upgrade.
- Fix for missing metadata for cloned functions
- Fix building for LLVM11 (next part)
- Update configuration_flags.md
- Fix a bug where uniform vector broadcast was handled incorrectly on platforms without i64 support.
- Default stack call and indirect call to compile SIMD16
- Prepare constant loader to use data layout
- Adding platform info to FCL interface
- Add support of llvm text files for GenXWrapper
- Align privatebase to 10 bits and explicitly tell InstCombiner the alignment of perThreadOffset and bufferOffset to avoid limitation of MaxDepth==6
- simd-1 kernels shall allocate SIP surface if debug info is present
- avoid emitting several visa indices for llvm instruction
- Removed clang block type arguments from precompiled builtins in BiFModule.
- Fix the Klock reported issue in linear scan RA
- .elf with debug info can be dumped with ShaderDumpEnable
- Fix translation of SPIRV DebugValue operation when first argument is OpConstant.
- Add support for more opcodes in SPIRV to create DIExpression.
- optimize the compilation time of linear Scan RA
- Improved loadPhiConstants to handle bitcast chains
- Remove split of arithmetic instructions in CISACodeGen as it is now handled by vISA.
- Add current BB to HW conformity and make more use of replaceDst()
- Add dump function to CVariable to assist debugging (NFC).
- Swap src0 and src1 for pseudo-mad if src1 is scalar but src0 is not.
- Avoid FP64 emulation related code if kernel doesn't use FP64 at all, second try.
- Turn on writing caller's frame-pointer to callee's stack. Enable flag EnableWriteOldFPToStack to support stack-walk.
- Remove split of logic instructions in CISACodeGen as it is now handled by vISA.
- Add possibility to use ninja cmake-generator
- Improved baling of new load insts
- GetGenxDebugInfo should be a constant method
- Add support for specialization constants to VC
- Pattern match for canonical predicate of an icmp instruction with negation.
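The svml/libclc selection described in the cos/sin/sincos item above amounts to a range check on the input. The sketch below is illustrative only: `svml_cos`, `libclc_cos`, `select_cos_impl`, and `dispatch_cos` are hypothetical stand-ins (both "implementations" simply forward to `std::cos`), and only the 10000 threshold comes from the release note.

```cpp
#include <cassert>
#include <cmath>

// Which implementation a given input would be routed to.
enum class CosImpl { Svml, Libclc };

// Threshold taken from the release note; everything else is an assumption.
constexpr float kSvmlInputLimit = 10000.0f;

// Hypothetical stand-ins for the two implementations.
inline float svml_cos(float x)   { return std::cos(x); }
inline float libclc_cos(float x) { return std::cos(x); }

// |x| < 10000 selects svml. In this sketch every other input,
// including NaN (the comparison is false), goes to libclc.
inline CosImpl select_cos_impl(float x) {
    return std::fabs(x) < kSvmlInputLimit ? CosImpl::Svml : CosImpl::Libclc;
}

inline float dispatch_cos(float x) {
    return select_cos_impl(x) == CosImpl::Svml ? svml_cos(x) : libclc_cos(x);
}
```

With these definitions, `select_cos_impl(-9999.0f)` picks the svml path, while `select_cos_impl(10000.0f)` falls back to libclc because the bound is strict.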
Dependencies revisions
- intel/llvm-patches@cfc8005
- intel/opencl-clang@fdcfda3
- KhronosGroup/SPIRV-LLVM-Translator@0db501e (for opencl-clang)
- intel/vc-intrinsics@c8c52b5
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.5064
Fixed Issues / Improvements
- Multiple refactor changes in preparation of LLVM upgrade to version 11.
- Added shader dumping capabilities to VectorCompiler.
- VectorCompiler now uses DebugInfo library.
- Refactored debug information.
- Fixes for compilation for OpenCL 3.0.
- Remove alwaysinline attributes for call instructions if -cl-opt-disable is present.
- Avoid FP64 emulation related code if kernel doesn't use FP64 at all.
- Instruction splitting is now handled by vISA.
- Enable splitting of instructions with indirect addressing.
- Update GED version to 0.68.
- Improvements to the legalization of shuffle-vector.
- Removed LShr and AShr operands truncation.
- Restricting spill space compression intra-iteration only in interest of compile time and memory usage.
- Removed the EnableOCLNoInlineAttr flag, NoInline should be honored by default.
- Multiple other improvements and minor changes throughout the project.
Dependencies revisions
- intel/llvm-patches@c4a0345
- intel/opencl-clang@6a9cd2c
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@8300678
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4944
Fixed Issues / Improvements
- Multiple refactor changes in preparation of LLVM upgrade to version 11.
- Vector backend improvements:
  - clearer reports about unsupported types,
  - introduced i64 emulation pass,
  - support for -ftime-report,
  - support for genx_addc and genx_subb intrinsics.
- Added support for OpConstFunctionPointerINTEL, 1024-bit constants in SPIRVReader.
- Added patches for SPIRV-LLVM-Translator based on LLVM9.
- Added support for experimental SYCL unmasked call feature.
- Added option to force thread group size.
- Added fallback path when ZEBinary is enabled by -allow-zebin.
- Updated CMFE interface.
- Relaxed some checks to allow a subset of i64 operations for targets without native i64 support.
- Fixed translation of non 32/64-bit constants.
- Fixed processing of GEP instruction when the index is a vector.
- Fixed buildbreak with VectorCompiler switched off.
- Removed unused IGA files, updated IGA.
- Many minor optimization and code improvements throughout whole project.
Dependencies revisions
- intel/llvm-patches@c4a0345
- intel/opencl-clang@6a9cd2c
- KhronosGroup/SPIRV-LLVM-Translator@424e375 (for opencl-clang)
- intel/vc-intrinsics@55124bb
- KhronosGroup/SPIRV-LLVM-Translator@e8a52ab (for VectorCompiler)
- llvm/[email protected]
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4879
Fixed Issues / Improvements
- SWSB sync instruction optimization
- Increase large basic block size to calculate register pressure.
- Support for emitting GenISA_simdShuffleDown in missing execution size (SIMD32)
- Decode DIExpression operation in SPIRV reader.
- Support emulation of general call and return for i64 type.
- Enable accumulator use for ror/rol.
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4756
Fixed Issues / Improvements
- Update EOTRenderTarget() to use RTW instruction instead of a raw send.
- Add WA to set noMask for all sampler.
- Change the type of a for-loop iterator from unsigned int to uint64_t to prevent hangs.
- Initial support for source/line debug information in VectorCompiler backend.
- Fix a bug for A64 block store with byte source operand.
- Fix number of elements to use in G4_Declare creation in remat.
- Make IGC emit DW_AT_abstract_origin for inlined DW_TAG_lexical_block constructs.
- Add a check to avoid invalid dereference.
- Fix incorrect code generation for xor.
- Adding a regkey to override product for debugging.
- Create dummy kernel to attach symbol table and indirectly-called functions.
- Move bfrev pattern from GenSpecificPattern to CustomSafeOptPass.
- Adding attribute for selective stack calls.
- Disable rematerialization of intrinsic.split instruction.
- Assign DebugLoc to pre-defined variables. Skip sinking of allocas when -cl-opt-disable is applied to keep .debug_ranges clean for inlined functions.
- Fixes for OCL 3.0 feature macro usage.
- Fix CISA offsets before splice operation in spill insertion.
- Fix regression in promotion of dynamic buffers to registers.
- Allow enabling some features in Release mode using environment variables.
- ZEBinary: support .spv and .gtpin_info section.
- Add pass to classify move types.
- Do not copy R0 when creating the header for bindless sampler message.
- Eliminate select+phi redundancy in SIMD CF
- Enable explicit variable split.
- Fix some bugs in ZEBinary:
  - Fixed ELF flag
  - Rename .data.global_const to .data.const
  - Remove local_id info if not used
- Remove unnecessary add for the common special case of add.pair with zero high32-bit values.
- Add backend configuration pass for VC.
- Merge instructions for 64bit emulation.
- Add pass to Split Indirect EE to sel to avoid VxH mov.
- Add dependency for EstimateFunctionSize.
- Emit single copy of pre-defined variables when -cl-kernel-debug-enable is passed.
- Inlining algorithm for controlling Kernel Total Size.
- Forward pointer support in SPIRV Reader.
- Run earlyCSE after GEPLowering to optimize out some instructions introduced by GEPLowering.
- Switch from the -runtime option to -binary-format; -binary-format=ze selects ZEBinary output.
- Stitch indirectly-called functions to the binary on VC side.
- Starting from tgllp mid-thread preemption is no longer supported. EnablePreemption value should be set to false for these new platforms.
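The add.pair item above concerns 64-bit addition emulated with 32-bit halves. One plausible reading, sketched below, is that when both high halves are known to be zero, the high half of the result is just the carry out of the low add, so the separate high add can be dropped. `Pair32`, `add_pair`, and `add_pair_zero_hi` are hypothetical names chosen for illustration; this is not IGC's implementation.

```cpp
#include <cassert>
#include <cstdint>

// A 64-bit value held as two 32-bit halves.
struct Pair32 {
    uint32_t lo;
    uint32_t hi;
};

// General case: add the low halves, detect the carry from unsigned
// wraparound, then add the high halves plus that carry.
inline Pair32 add_pair(Pair32 a, Pair32 b) {
    uint32_t lo = a.lo + b.lo;
    uint32_t carry = (lo < a.lo) ? 1u : 0u;
    uint32_t hi = a.hi + b.hi + carry;
    return Pair32{lo, hi};
}

// Special case: both high halves are zero, so the high result is
// exactly the carry and no high-half add is needed.
inline Pair32 add_pair_zero_hi(uint32_t a_lo, uint32_t b_lo) {
    uint32_t lo = a_lo + b_lo;
    return Pair32{lo, (lo < a_lo) ? 1u : 0u};
}
```

For any pair of inputs whose high halves are zero, the two functions return the same result; the special case only saves the high-half addition.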
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4594
Fixed Issues / Improvements
- DWARF SIMD location expressions support cont.
- Emit debug info for lower and upper 16 channels for SIMD32
- Add an opt for dp4 with identity matrix
- GRF register info available in dump
- Minor fixes and improvements.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4560
Fixed Issues / Improvements
- Removed redundant constant folding in HW legalization checks.
- Added ABI validation for CMFE.
- Added flushing L3 for device or cross-device memory fence on global memory.
- Added missing help text for alias options in VC.
- Moved IntrinsicGenISA.gen to build Config folder and added proper dependency requirement.
- Fixed VertexShaderLowering incorrectly clearing out Vertex Header when it is actually used.
- Fixed LRA to ensure startGRFReg is less than the number of GRFs available for allocation to linear scan.
- Fixes for vISA assembly.
- Fixed vector alloca type in TransformPrivMem for function pointers cases.
- Fixes to and further implementation of IGC_ASSERT.
- Fixed args passed to register for non-uniform function calls.
- Disabled SIMD32 slicing when -cl-opt-disable is passed instead of -g.
- Disabled legacy mad to mac optimization.
- Implemented SIMD compile info for OCL shaders.
- Improved alloca uniform analysis.
- Allowing DisableAddingAlwaysAttribute flag in release mode.
- Switched to accessing GenXSubtarget through TargetPassConfig.
- Minor refactoring and deprecated code removal.
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4521
Fixed Issues / Improvements
- Fix bugs in image/sampler tracking to properly track argument when compiling with -cl-opt-disable
- Move pass to erase redundant movs after other optimizations are done.
- Skip simd32 compilation for per-pixel dispatch with x16 samples.
- Revert new-reg-per-function behavior in VC RA
- Add new relocation type R_PER_THREAD_PAYLOAD_OFFSET_32. Also refactor the vISA::RelocationEntry create API.
- Moving IntrinsicGenISA.gen to build Config folder and adding proper dependency requirement
- Switch to using LLVMTargetMachine in VC. Initialized GenX pass in BackendPlugin.
- Add check for fp64 and i64 copy move if platform does not support 64b types.
- Program the correct response length for spill of a scalar variable used as send dst.
- Enable FP64 accumulator as mul instruction source.
- Add TGL emulation functions for DP and SP
- DWARF debugger location expressions
- Emit variable location off privateBase
- Additional include guards to avoid redefinition conflict of LARGE_INTEGER type
- Uniform analysis tuning for performance.
- For stack calls do not adjust the spill size by global scratch offset
- Relocations and symbols support in L0 binary in VC
- Update register numbering for debug info.
- Try to avoid bank conflict for Gen12 when scheduling
- Fix some excessive mov instructions emitted by VectorCompiler.
- Avoid unnecessary llvm metadata regenerations to optimize compilation time
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4479
Fixed Issues / Improvements
- Fail compilation with error message instead of crash if we can't find sampler argument or inline/global sampler.
- DWARF debugger location expressions fixes.
- Remove redundant LICM pass.
- Enable timestats through regkey.
- extend the MarkReadOnlyPass to mark loads with constant address space with invariant.load.
- Adding metadata for computedDepthMode.
- Enable vISA instruction splitting pass.
- Open-sourcing CM FE related parts of driver.
- Fix system LLVM handling in VC.
- update integer splitter api in VectorCompiler.
- Improve lowering of ord/unord fcmp.
- Correct the Emask calculation by removing the write enable and using the right data types.
- Basic block where flow control partially joins should also be treated as divergent.
- Limit total thread payload size to 96 GRFs.
- Tighten up vISA assembly syntax: do not allow missing regions for general operands, do not allow two offsets for address operands.
- Fixed operations with overflow.
- Prevent tracking of images and samplers from falling into infinite loop.
- Fix lowpc/highpc for subroutines in debug info.
- Make Dst operand's subreg offset immutable.
- Get rid of unnecessary MOVS.
- Check for undefined predicate variables when parsing vISA assembly.
- Removing the dst/src overlap checking after augmentation.
- ZEBinary: add kernel symbol.
- Fix TPM's replaceGatherPrivate.
- Add ZEAutoTool.
- Report parser error for vISA inline assembly in releaseInternal build.
- ZEBinaryBuilder: Fix packed_local_ids size to 6 instead of 12.
- Avoid multiple metadata regenerating in AggregateArguments pass.
- Refactor for stack call functions. Combine code for caller/callee stack load/store.
- Fix shuffleVector lowering in legalization pass.
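For readers unfamiliar with the ord/unord fcmp predicates mentioned in the list above: in LLVM IR, `fcmp ord` is true when neither operand is NaN, and `fcmp uno` is true when at least one is. The sketch below shows those semantics using self-comparison; the function names are hypothetical, and this is not the lowering IGC emits.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Ordered: true only if neither operand is NaN.
// A NaN is the only value that compares unequal to itself.
inline bool fcmp_ord(float a, float b) {
    return a == a && b == b;
}

// Unordered: true if at least one operand is NaN.
inline bool fcmp_uno(float a, float b) {
    return a != a || b != b;
}
```

The two predicates are exact complements, so `fcmp_uno(a, b) == !fcmp_ord(a, b)` for all inputs. The self-comparison trick assumes IEEE semantics and would not survive flags such as `-ffast-math`.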
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.
igc-1.0.4427
Fixed Issues / Improvements
- Handle 16-byte alignment correctly for trivial and local RA.
- Introduce ExtraOCLOptions debug key.
- Improve URB merging.
- Initial support of L0 binary in cmc.
- Add pattern match to emit integer trunc instruction with saturation.
- Update of static bank conflict checking.
- Minor fixes and improvements.
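The saturating integer trunc item above refers to a common source pattern: clamp a wider integer to the range of a narrower type, then truncate. A minimal sketch of that pattern follows, assuming a 32-bit to 16-bit signed conversion; the function name is hypothetical and the exact pattern IGC matches is not stated in the release note.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamp to the int16_t range first, then truncate. A compiler that
// recognizes this clamp-then-trunc shape can emit a single
// truncating move with saturation instead of min, max, and trunc.
inline int16_t trunc_sat_i32_to_i16(int32_t x) {
    int32_t clamped = std::min<int32_t>(std::max<int32_t>(x, INT16_MIN), INT16_MAX);
    return static_cast<int16_t>(clamped);
}
```

Values already in range pass through unchanged, while out-of-range values stick to the nearest bound: `trunc_sat_i32_to_i16(40000)` yields 32767 and `trunc_sat_i32_to_i16(-40000)` yields -32768.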
Ubuntu 18.04 binary packages for LLVM10/Clang10 are included.