[Issue]: Build failure "error: ("Must define exactly one of __HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__")" and "error: unknown type name 'hip_bfloat16'" #1697

Open
tdavie opened this issue Feb 22, 2025 · 3 comments

Comments

@tdavie

tdavie commented Feb 22, 2025

Problem Description

The build fails when run with the provided install script.

$ ./install.sh -idc -a "gfx908:xnack+;gfx908:xnack-" --legacy_hipblas_direct
...

[  4%] Built target hipblaslt-common
cd /nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops && /usr/bin/cmake -E make_directory /nvme/hipBLASLt/build/release/Tensile/library
cd /nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops && bash /nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Source/..//Ops/gen_assembly.sh "gfx908" /nvme/hipBLASLt/build/release/Tensile/library/../build_tmp/ops /nvme/hipBLASLt/build/release/virtualenv sha1
Creating code object for arch gfx908
In file included from /nvme/hipBLASLt/library/src/amd_detail/rocblaslt/src/kernels/matrix_transform.cpp:26:
In file included from /nvme/hipBLASLt/library/src/amd_detail/rocblaslt/src/kernels/matrix_transform.h:27:
/usr/include/hip/hip_bfloat16.h:37:2: error: ("Must define exactly one of __HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__");
   37 | #error("Must define exactly one of __HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__");
      |  ^
In file included from /nvme/hipBLASLt/library/src/amd_detail/rocblaslt/src/kernels/matrix_transform.cpp:26:
/nvme/hipBLASLt/library/src/amd_detail/rocblaslt/src/kernels/matrix_transform.h:43:9: error: unknown type name 'hip_bfloat16'
   43 | typedef hip_bfloat16 DTypeBF16;
      |         ^

################################################################################
# Tensile Create Library
# HIP Version:         6.2.41133-dd7f95766
# Cxx Compiler:        /opt/rocm/bin/amdclang++ (version 18.0.0)
# C Compiler:          /opt/rocm/bin/amdclang (version 18.0.0)
# Assembler:           /opt/rocm/bin/amdclang++ (version 18.0.0)
# Offload Bundler:     /opt/rocm/lib/llvm/bin/clang-offload-bundler (version 18.0.0)
# Code Object Version: 4
# Architecture(s):     gfx908:xnack+_gfx908:xnack-
# Library Format:      msgpack
# Detected local GPU with ISA: gfx908
2 errors generated when compiling for gfx908.
make[2]: *** [library/CMakeFiles/MatrixTransformKernels.dir/build.make:73: Tensile/library/hipblasltTransform.hsaco] Error 1
make[2]: Leaving directory '/nvme/hipBLASLt/build/release'
make[1]: *** [CMakeFiles/Makefile2:383: library/CMakeFiles/MatrixTransformKernels.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
cd /nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops && /usr/bin/cmake -E copy /nvme/hipBLASLt/build/release/Tensile/library/../build_tmp/ops/hipblasltExtOpLibrary.dat /nvme/hipBLASLt/build/release/Tensile/library/../build_tmp/ops/extop_*.co /nvme/hipBLASLt/build/release/Tensile/library
make[2]: Leaving directory '/nvme/hipBLASLt/build/release'
[  4%] Built target build_ext_op_library
make[2]: Leaving directory '/nvme/hipBLASLt/build/release'
[  4%] Built target hipblaslt-test-data
               cap gfx000 gfx803 gfx900 gfx906 gfx908 gfx90a gfx942 gfx1010 gfx1011 gfx1012 gfx1030 gfx1100 gfx1101 gfx1102 gfx1200 gfx1201
   HasMFMA_bf16_1k      0      0      0      0      0      1      1       0       0       0       0       0       0       0       0       0
        HasAddLshl      0      0      1      1      1      1      1       1       1       1       1       1       1       1       1       1
      HasAtomicAdd      0      0      0      0      1      1      1       0       0       0       0       1       1       1       1       1
    HasDirectToLds      0      1      1      1      1      1      1       1       1       1       1       0       0       0       0       0
     HasExplicitCO      0      0      1      1      1      1      1       1       1       1       1       1       1       1       1       1
     HasExplicitNC      0      0      0      0      0      0      0       1       1       1       1       1       1       1       1       1
    HasGLCModifier      0      1      1      1      1      1      0       1       1       1       1       1       1       1       0       0
         HasLshlOr      0      0      1      1      1      1      1       1       1       1       1       1       1       1       1       1
           HasMFMA      0      0      0      0      1      1      1       0       0       0       0       0       0       0       0       0
     HasMUBUFConst      0      1      1      1      1      1      1       1       1       1       1       1       1       1       0       0
     HasNTModifier      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
     HasNewBarrier      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
          HasSCMPK      0      1      1      1      1      1      1       1       1       1       1       1       1       1       0       0
          HasSMFMA      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
         HasSMulHi      0      0      1      1      1      1      1       1       1       1       1       1       1       1       1       1
           HasWMMA      0      0      0      0      0      0      0       0       0       0       0       1       1       1       1       1
        MaxLgkmcnt      1      1      1      1      1      1      1       1       1       1       1       1       1       1       1       1
          MaxVmcnt      0      1      1      1      1      1      1       1       1       1       1       1       1       1       1       1
      SupportedISA      0      1      1      1      1      1      1       1       1       1       1       1       1       1       1       1
   SupportedSource      1      1      1      1      1      1      1       1       1       1       1       1       1       1       1       1
        HasWMMA_V1      0      0      0      0      0      0      0       0       0       0       0       1       1       1       0       0
        HasWMMA_V2      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
         v_mov_b64      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
        HasMFMA_b8      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
  HasMFMA_explictB      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
    v_dot2_f32_f16      0      0      0      1      1      1      1       0       1       1       1       1       1       1       1       1
   v_dot2c_f32_f16      0      0      0      0      1      1      1       0       1       1       1       1       1       1       0       0
         v_fma_f16      0      0      1      1      1      1      1       1       1       1       1       1       1       1       1       1
        v_fmac_f16      0      0      0      0      0      0      0       0       0       0       0       0       0       0       0       0
         v_mac_f16      0      1      1      1      1      1      1       0       0       0       0       0       0       0       0       0
      v_pk_fma_f16      0      0      1      1      1      1      1       1       1       1       1       1       1       1       1       1
     v_pk_fmac_f16      0      0      0      0      0      0      0       0       0       0       0       0       0       0       0       0
         v_fma_f32      0      1      1      1      1      1      1       1       1       1       1       1       1       1       1       1
     v_fma_mix_f32      0      0      0      1      1      1      1       1       1       1       1       1       1       1       1       1
        v_fmac_f32      0      0      0      1      1      1      1       1       1       1       1       1       1       1       1       1
         v_mac_f32      0      1      1      1      1      1      0       1       1       1       0       0       0       0       0       0
     v_mad_mix_f32      0      0      1      0      0      0      0       0       0       0       0       0       0       0       0       0
      v_pk_add_f32      0      0      0      0      0      1      1       0       0       0       0       0       0       0       0       0
      v_pk_mul_f32      0      0      0      0      0      1      1       0       0       0       0       0       0       0       0       0
       HasMFMA_f64      0      0      0      0      0      1      1       0       0       0       0       0       0       0       0       0
         v_fma_f64      0      1      1      1      1      1      1       1       1       1       1       1       1       1       1       1
        HasMFMA_f8      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
 VOP3v_dot4_i32_i8      0      0      0      1      1      1      1       0       1       1       1       0       0       0       0       0
     v_dot4_i32_i8      0      0      0      0      0      0      0       0       0       0       0       0       0       0       0       0
    v_dot4c_i32_i8      0      0      0      0      1      1      1       0       1       1       1       0       0       0       0       0
      HasMFMA_xf32      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
ArchAccUnifiedRegs      0      0      0      0      0      1      1       0       0       0       0       0       0       0       0       0
    CMPXWritesSGPR      1      1      1      1      1      1      1       0       0       0       0       0       0       0       0       0
     CrosslaneWait      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
DSLow16NotPreserve      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
     ForceStoreSC1      0      0      0      0      0      0      0       0       0       0       0       0       0       0       0       0
          HasAccCD      0      0      0      0      0      1      1       0       0       0       0       0       0       0       0       0
        HasEccHalf      0      0      0      1      1      1      1       0       0       0       0       0       0       0       0       0
        HasFP8_OCP      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
         HasWave32      0      0      0      0      0      0      0       1       1       1       1       1       1       1       1       1
            NoSDWA      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
          SDWAWait      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
   SeparateLGKMcnt      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
     SeparateVMcnt      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
     SeparateVscnt      0      0      0      0      0      0      0       1       1       1       1       1       1       1       0       0
       TransOpWait      0      0      0      0      0      0      1       0       0       0       0       0       0       0       0       0
       VOP3ByteSel      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
          VgprBank      0      0      0      0      0      0      0       1       1       1       1       1       1       1       1       1
  Waitcnt0Disabled      0      0      0      0      1      1      1       0       0       0       0       0       0       0       0       0
WrokGroupIdFromTTM      0      0      0      0      0      0      0       0       0       0       0       0       0       0       1       1
# Found hipcc version 6.2.41133-dd7f95766
# LogicFilter:       /nvme/hipBLASLt/library/src/amd_detail/rocblaslt/src/Tensile/Logic/asm_full/**/*.yaml
# Experimental:      False
Loading Logics...: Launching 12 threads...
Loading Logics...: Done. (0.1 secs elapsed)
Number of solutions parsed: 58
Number of unique solutions: 58
Number of kernel helper objects: 431
Number of unique kernel helper objects: 69
Number of duplicate kernels: 0
Generating assembly kernels: Launching 12 threads for 58 tasks...
Generating assembly kernels: Done. (5.0 secs elapsed)
buildSourceCodeObjectFile time (s): 124.27
# Tensile Library Writer DONE
################################################################################

Total time (s): 173.65
Total kernels processed: 58
Kernels processed per second: 0.33
[  5%] TENSILE_LIBRARY_TARGET
make[2]: Leaving directory '/nvme/hipBLASLt/build/release'
[  5%] Built target TENSILE_LIBRARY_TARGET
make[1]: Leaving directory '/nvme/hipBLASLt/build/release'
make: *** [Makefile:166: all] Error 2
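
In the log above, the two compiler diagnostics appear well before the final `make` failure, with a lot of parallel-job output in between. A minimal sketch for pulling the first clang-style `error:` lines out of a saved build log is shown here. The helper name `first_errors` and the regular expression are assumptions made for illustration and are not part of hipBLASLt or its install script.

```python
import re

# Matches clang/gcc-style diagnostics of the form: path:line:col: error: message
ERROR_RE = re.compile(
    r"^(?P<file>[^\s:]+):(?P<line>\d+):(?P<col>\d+): error: (?P<msg>.*)$"
)


def first_errors(log_text, limit=5):
    """Return up to `limit` (file, line, message) tuples, in log order."""
    found = []
    for raw in log_text.splitlines():
        m = ERROR_RE.match(raw.strip())
        if m:
            found.append((m.group("file"), int(m.group("line")), m.group("msg")))
            if len(found) >= limit:
                break
    return found


if __name__ == "__main__":
    # Sample lines copied from the log in this report.
    sample = """\
[  4%] Built target hipblaslt-common
/usr/include/hip/hip_bfloat16.h:37:2: error: ("Must define exactly one of __HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__");
/nvme/hipBLASLt/library/src/amd_detail/rocblaslt/src/kernels/matrix_transform.h:43:9: error: unknown type name 'hip_bfloat16'
make: *** [Makefile:166: all] Error 2
"""
    for path, line, msg in first_errors(sample):
        print(f"{path}:{line}: {msg}")
```

The `make: *** ... Error 2` lines are deliberately not matched, since they only report that an earlier step failed.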

Operating System

Ubuntu 24.04.1 LTS (Noble Numbat)

CPU

Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz

GPU

AMD Instinct MI100

Other

No response

ROCm Version

ROCm 6.2.0

ROCm Component

hipBLASLt

Steps to Reproduce

  1. Check out commit c8fb6ed
  2. ./install.sh -idc -a "gfx908:xnack+;gfx908:xnack-" --legacy_hipblas_direct

(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support

ROCk module version 6.8.5 is loaded
=====================
HSA System Attributes
=====================
Runtime Version:         1.14
Runtime Ext Version:     1.6
System Timestamp Freq.:  1000.000000MHz
Sig. Max Wait Duration:  18446744073709551615 (0xFFFFFFFFFFFFFFFF) (timestamp count)
Machine Model:           LARGE
System Endianness:       LITTLE
Mwaitx:                  DISABLED
DMAbuf Support:          YES

==========
HSA Agents
==========
*******
Agent 1
*******
  Name:                    Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz
  Uuid:                    CPU-XX
  Marketing Name:          Intel(R) Xeon(R) W-2135 CPU @ 3.70GHz
  Vendor Name:             CPU
  Feature:                 None specified
  Profile:                 FULL_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        0(0x0)
  Queue Min Size:          0(0x0)
  Queue Max Size:          0(0x0)
  Queue Type:              MULTI
  Node:                    0
  Device Type:             CPU
  Cache Info:
    L1:                      32768(0x8000) KB
  Chip ID:                 0(0x0)
  ASIC Revision:           0(0x0)
  Cacheline Size:          64(0x40)
  Max Clock Freq. (MHz):   4500
  BDFID:                   0
  Internal Node ID:        0
  Compute Unit:            12
  SIMDs per CU:            0
  Shader Engines:          0
  Shader Arrs. per Eng.:   0
  WatchPts on Addr. Ranges:1
  Memory Properties:
  Features:                None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: FINE GRAINED
      Size:                    65397536(0x3e5e320) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 2
      Segment:                 GLOBAL; FLAGS: KERNARG, FINE GRAINED
      Size:                    65397536(0x3e5e320) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
    Pool 3
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    65397536(0x3e5e320) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:4KB
      Alloc Alignment:         4KB
      Accessible by all:       TRUE
  ISA Info:
*******
Agent 2
*******
  Name:                    gfx908
  Uuid:                    GPU-1ba7ce67f0270ca4
  Marketing Name:          AMD Instinct MI100
  Vendor Name:             AMD
  Feature:                 KERNEL_DISPATCH
  Profile:                 BASE_PROFILE
  Float Round Mode:        NEAR
  Max Queue Number:        128(0x80)
  Queue Min Size:          64(0x40)
  Queue Max Size:          131072(0x20000)
  Queue Type:              MULTI
  Node:                    1
  Device Type:             GPU
  Cache Info:
    L1:                      16(0x10) KB
    L2:                      8192(0x2000) KB
  Chip ID:                 29580(0x738c)
  ASIC Revision:           2(0x2)
  Cacheline Size:          64(0x40)
  Max Clock Freq. (MHz):   1502
  BDFID:                   26368
  Internal Node ID:        1
  Compute Unit:            120
  SIMDs per CU:            4
  Shader Engines:          8
  Shader Arrs. per Eng.:   1
  WatchPts on Addr. Ranges:4
  Coherent Host Access:    FALSE
  Memory Properties:
  Features:                KERNEL_DISPATCH
  Fast F16 Operation:      TRUE
  Wavefront Size:          64(0x40)
  Workgroup Max Size:      1024(0x400)
  Workgroup Max Size per Dimension:
    x                        1024(0x400)
    y                        1024(0x400)
    z                        1024(0x400)
  Max Waves Per CU:        40(0x28)
  Max Work-item Per CU:    2560(0xa00)
  Grid Max Size:           4294967295(0xffffffff)
  Grid Max Size per Dimension:
    x                        4294967295(0xffffffff)
    y                        4294967295(0xffffffff)
    z                        4294967295(0xffffffff)
  Max fbarriers/Workgrp:   32
  Packet Processor uCode:: 67
  SDMA engine uCode::      18
  IOMMU Support::          None
  Pool Info:
    Pool 1
      Segment:                 GLOBAL; FLAGS: COARSE GRAINED
      Size:                    33538048(0x1ffc000) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:2048KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 2
      Segment:                 GLOBAL; FLAGS: EXTENDED FINE GRAINED
      Size:                    33538048(0x1ffc000) KB
      Allocatable:             TRUE
      Alloc Granule:           4KB
      Alloc Recommended Granule:2048KB
      Alloc Alignment:         4KB
      Accessible by all:       FALSE
    Pool 3
      Segment:                 GROUP
      Size:                    64(0x40) KB
      Allocatable:             FALSE
      Alloc Granule:           0KB
      Alloc Recommended Granule:0KB
      Alloc Alignment:         0KB
      Accessible by all:       FALSE
  ISA Info:
    ISA 1
      Name:                    amdgcn-amd-amdhsa--gfx908:sramecc+:xnack-
      Machine Models:          HSA_MACHINE_MODEL_LARGE
      Profiles:                HSA_PROFILE_BASE
      Default Rounding Mode:   NEAR
      Default Rounding Mode:   NEAR
      Fast f16:                TRUE
      Workgroup Max Size:      1024(0x400)
      Workgroup Max Size per Dimension:
        x                        1024(0x400)
        y                        1024(0x400)
        z                        1024(0x400)
      Grid Max Size:           4294967295(0xffffffff)
      Grid Max Size per Dimension:
        x                        4294967295(0xffffffff)
        y                        4294967295(0xffffffff)
        z                        4294967295(0xffffffff)
      FBarrier Max Size:       32
*** Done ***
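
The rocminfo output above reports the GPU's ISA as `gfx908`. A rough sketch of extracting the gfx targets from rocminfo text and comparing them against a list of supported targets is shown here. The function names are hypothetical, and the `supported` set in the demo is a placeholder for illustration only; the real list should be taken from the hipBLASLt README requirements.

```python
import re

# ISA names in rocminfo look like: amdgcn-amd-amdhsa--gfx908:sramecc+:xnack-
ISA_RE = re.compile(r"amdgcn-amd-amdhsa--(gfx[0-9a-f]+)")


def gpu_targets(rocminfo_text):
    """Return the sorted, de-duplicated gfx targets named in rocminfo ISA lines."""
    return sorted(set(ISA_RE.findall(rocminfo_text)))


def unsupported_targets(rocminfo_text, supported):
    """Return targets present on the machine but missing from `supported`."""
    return [t for t in gpu_targets(rocminfo_text) if t not in supported]


if __name__ == "__main__":
    # ISA line copied from the rocminfo output in this report.
    sample = "      Name:                    amdgcn-amd-amdhsa--gfx908:sramecc+:xnack-\n"
    # Placeholder set (an assumption, not the project's actual support list).
    supported = {"gfx90a", "gfx942"}
    print(unsupported_targets(sample, supported))
```

In practice the text would come from running `/opt/rocm/bin/rocminfo` and capturing its standard output.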

Additional Information

I also tried compiling on branch release/rocm-rel-6.2, to match my system ROCm version. This gave a different error.

$ ./install.sh -idc -a "gfx908:xnack+;gfx908:xnack-"

...

Building wheels for collected packages: Tensile
  Building wheel for Tensile (setup.py): started
  Building wheel for Tensile (setup.py): finished with status 'done'
  Created wheel for Tensile: filename=Tensile-4.33.0-py3-none-any.whl size=22098125 sha256=0bc2abc38c7a2935523551143c089f8059c58b1432cd5c1a5b254a5cca7c2133
  Stored in directory: /tmp/pip-ephem-wheel-cache-520lqd_y/wheels/90/2f/64/e9ed0cd3f7b329e3b35ed2f9d346e95c7287c507a28193219b
Successfully built Tensile

...

[  6%] Built target hipblaslt-common
Creating code object for arch gfx908
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./SoftmaxGenerator.py", line 34, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./AMaxGenerator.py", line 35, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./AMaxGenerator.py", line 35, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./SoftmaxGenerator.py", line 34, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./SoftmaxGenerator.py", line 34, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./SoftmaxGenerator.py", line 34, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./AMaxGenerator.py", line 35, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./LayerNormGenerator.py", line 35, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./AMaxGenerator.py", line 35, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./LayerNormGenerator.py", line 35, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
Traceback (most recent call last):
  File "/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops/./SoftmaxGenerator.py", line 34, in <module>
    from Tensile.Common import detectGlobalCurrentISA, restoreDefaultGlobalParameters, \
ImportError: cannot import name 'getGfxName' from 'Tensile.Common' (/nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Common/__init__.py)
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/L_256_4_1_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/L_256_4_0_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/S_16_16_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/S_8_32_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/S_4_64_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/S_2_128_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/S_1_256_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/A_S_S_256_4_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/A_H_H_256_4_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/A_H_S_256_4_gfx908.o'
clang++: error: no such file or directory: '/nvme/hipBLASLt/build/release/library/build_tmp/ops/A_S_H_256_4_gfx908.o'
cd /nvme/hipBLASLt/build/release/virtualenv/lib/python3.12/site-packages/Tensile/Ops && /usr/bin/cmake -E copy /nvme/hipBLASLt/build/release/library/build_tmp/ops/hipblasltExtOpLibrary.dat /nvme/hipBLASLt/build/release/library/build_tmp/ops/extop_*.co /nvme/hipBLASLt/build/release/Tensile/library
Error copying file "/nvme/hipBLASLt/build/release/library/build_tmp/ops/extop_*.co" to "/nvme/hipBLASLt/build/release/Tensile/library".
make[2]: *** [library/CMakeFiles/build_ext_op_library.dir/build.make:77: Tensile/library/hipblasltExtOpLibrary.dat] Error 1
make[2]: *** Deleting file 'Tensile/library/hipblasltExtOpLibrary.dat'
make[2]: Leaving directory '/nvme/hipBLASLt/build/release'
make[1]: *** [CMakeFiles/Makefile2:309: library/CMakeFiles/build_ext_op_library.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
usage: TensileCreateLibrary [-h] [--cxx-compiler CXXCOMPILER] [--c-compiler CCOMPILER] [--assembler ASSEMBLER]
                            [--offload-bundler OFFLOADBUNDLER] [--architecture ARCHITECTURE]
                            [--code-object-version {default,V4,V5}] [--cmake-cxx-compiler CMAKECXXCOMPILER] [--merge-files]
                            [--no-merge-files] [--num-merged-files NUMMERGEDFILES] [--short-file-names] [--no-short-file-names]
                            [--no-enumerate] [--embed-library EMBEDLIBRARY] [--embed-library-key EMBEDLIBRARYKEY]
                            [--version VERSION] [--generate-manifest-and-exit] [--generate-sources-and-exit] [--verify-manifest]
                            [--keep-build-tmp] [--library-format {yaml,msgpack}] [--jobs CPUTHREADS] [--verbose {0,1,2,3}]
                            [--separate-architectures] [--lazy-library-loading] [--build-client] [--client-config]
                            [--ignore-asm-cap-cache] [--write-master-solution-index]
                            [--global-parameters GLOBALPARAMETERS [GLOBALPARAMETERS ...]]
                            LogicPath OutputPath {HIP,HSA}
TensileCreateLibrary: error: unrecognized arguments: --no-library-print-debug --build-id=sha1
make[2]: *** [library/CMakeFiles/TENSILE_LIBRARY_TARGET.dir/build.make:74: Tensile/library/TensileManifest.txt] Error 2
make[2]: Leaving directory '/nvme/hipBLASLt/build/release'
make[1]: *** [CMakeFiles/Makefile2:283: library/CMakeFiles/TENSILE_LIBRARY_TARGET.dir/all] Error 2
make[2]: Leaving directory '/nvme/hipBLASLt/build/release'
[  6%] Built target hipblaslt-test-data
make[1]: Leaving directory '/nvme/hipBLASLt/build/release'
make: *** [Makefile:166: all] Error 2
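
One possible reading of the `ImportError` and the `unrecognized arguments` message above is that the Tensile package installed in the build virtualenv does not match the revision the generator scripts and CMake files expect. This is an assumption, not something the log confirms. A small sketch for checking whether a module exports the names a script tries to import is shown here; `missing_names` is a hypothetical helper, and it would need to be run with the virtualenv's Python interpreter.

```python
import importlib


def missing_names(module_name, names):
    """Return the entries of `names` that `module_name` does not export.

    Returns None if the module itself cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return [n for n in names if not hasattr(module, n)]


if __name__ == "__main__":
    # getGfxName is the name reported in the traceback above.
    result = missing_names("Tensile.Common", ["getGfxName"])
    if result is None:
        print("Tensile.Common is not importable in this environment")
    elif result:
        print("missing from Tensile.Common:", ", ".join(result))
    else:
        print("all names present")
```

If the name is reported missing, comparing the installed Tensile version (4.33.0 in the pip output above) with the one the checked-out branch expects may be a reasonable next step.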

Let me know if I can provide any more information to assist. Thanks!

@tdavie tdavie changed the title [Issue]: Build failure [Issue]: Build failure "error: ("Must define exactly one of __HIP_PLATFORM_AMD__ or __HIP_PLATFORM_NVIDIA__")" and "error: unknown type name 'hip_bfloat16'" Feb 22, 2025
@ppanchad-amd

Hi @tdavie. An internal ticket has been created to investigate this issue. Thanks!

@zichguan-amd

Hi @tdavie, hipBLASLt does not support the MI100 (gfx908). See https://github.com/ROCm/hipBLASLt?tab=readme-ov-file#requirements.

@tdavie

tdavie commented Feb 26, 2025

I see, thanks for the response @zichguan-amd. Is the intent not to maintain the additions merged here? I also note an (abandoned?) PR to add gfx908 to the support list.

It's kind of disappointing that hardware ostensibly under official ROCm support is not supported by key ROCm libraries and thus cannot be fully used in practice. hipBLASLt is a dependency of a few libraries I'm trying to use, e.g. bitsandbytes.
