
make chat undefined reference to `LLaVAGenerate #95

Open
cuu opened this issue Feb 27, 2024 · 1 comment
cuu commented Feb 27, 2024

On a Jetson Orin Nano 8 GB, when running make chat:

(TinyChatEngine) cpi@ubuntu:~/github/mit-han-lab/TinyChatEngine/llm$ make chat
CUDA is available!
src/Generate.cc src/GPTBigCodeGenerate.cc src/GPTBigCodeTokenizer.cc src/LLaMATokenizer.cc src/OPTGenerate.cc src/OPTTokenizer.cc src/utils.cc src/nn_modules/Fp32CLIPAttention.cc src/nn_modules/Fp32CLIPEncoder.cc src/nn_modules/Fp32CLIPEncoderLayer.cc src/nn_modules/Fp32CLIPVisionTransformer.cc src/nn_modules/Fp32GPTBigCodeAttention.cc src/nn_modules/Fp32GPTBigCodeDecoder.cc src/nn_modules/Fp32GPTBigCodeDecoderLayer.cc src/nn_modules/Fp32GPTBigCodeForCausalLM.cc src/nn_modules/Fp32llamaAttention.cc src/nn_modules/Fp32llamaDecoder.cc src/nn_modules/Fp32llamaDecoderLayer.cc src/nn_modules/Fp32llamaForCausalLM.cc src/nn_modules/Fp32OPTAttention.cc src/nn_modules/Fp32OPTDecoder.cc src/nn_modules/Fp32OPTDecoderLayer.cc src/nn_modules/Fp32OPTForCausalLM.cc src/nn_modules/Int4GPTBigCodeAttention.cc src/nn_modules/Int4GPTBigCodeDecoder.cc src/nn_modules/Int4GPTBigCodeDecoderLayer.cc src/nn_modules/Int4GPTBigCodeForCausalLM.cc src/nn_modules/Int4OPTAttention.cc src/nn_modules/Int4OPTDecoder.cc src/nn_modules/Int4OPTDecoderLayer.cc src/nn_modules/Int4OPTForCausalLM.cc src/nn_modules/Int8OPTAttention.cc src/nn_modules/Int8OPTDecoder.cc src/nn_modules/Int8OPTDecoderLayer.cc src/nn_modules/OPTForCausalLM.cc src/ops/arg_max.cc src/ops/batch_add.cc src/ops/BMM_F32T.cc src/ops/BMM_S8T_S8N_F32T.cc src/ops/BMM_S8T_S8N_S8T.cc src/ops/Conv2D.cc src/ops/embedding.cc src/ops/Gelu.cc src/ops/LayerNorm.cc src/ops/LayerNormQ.cc src/ops/linear.cc src/ops/LlamaRMSNorm.cc src/ops/RotaryPosEmb.cc src/ops/softmax.cc src/ops/W8A8B8O8Linear.cc src/ops/W8A8B8O8LinearReLU.cc src/ops/W8A8BFP32OFP32Linear.cc ../kernels/matmul_imp.cc ../kernels/matmul_int4.cc ../kernels/matmul_int8.cc ../kernels/pthread_pool.cc
../kernels/cuda/matmul_ref_fp32.cc ../kernels/cuda/matmul_ref_int8.cc
../kernels/cuda/gemv_cuda.cu ../kernels/cuda/matmul_int4.cu  src/nn_modules/cuda/Int4llamaAttention.cu src/nn_modules/cuda/Int4llamaDecoder.cu src/nn_modules/cuda/Int4llamaDecoderLayer.cu src/nn_modules/cuda/Int4llamaForCausalLM.cu src/nn_modules/cuda/LLaMAGenerate.cu src/nn_modules/cuda/utils.cu src/ops/cuda/batch_add.cu src/ops/cuda/BMM_F16T.cu src/ops/cuda/embedding.cu src/ops/cuda/linear.cu src/ops/cuda/LlamaRMSNorm.cu src/ops/cuda/RotaryPosEmb.cu src/ops/cuda/softmax.cu
/usr/local/cuda/bin/nvcc -std=c++17 -Xptxas -O3 -gencode arch=compute_87,code=sm_87 --forward-unknown-to-host-compiler -Xcompiler "-pthread" -DQM_CUDA -DENABLE_BF16 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_BFLOAT16_OPERATORS__ -U__CUDA_NO_BFLOAT16_CONVERSIONS__ -U__CUDA_NO_BFLOAT162_OPERATORS__ -U__CUDA_NO_BFLOAT162_CONVERSIONS__ --expt-relaxed-constexpr --expt-extended-lambda --use_fast_math --threads=8 -fPIC -I../kernels -I./include -I./include/nn_modules -I./json/single_include/ -I./half-2.2.0/include/ -I./include/ops/cuda -I/usr/local/cuda/include -I/usr/local/cuda/targets/aarch64-linux/include -I/usr/include/aarch64-linux-gnu -o chat application/chat.cc build/transformer/src/Generate.o build/transformer/src/GPTBigCodeGenerate.o build/transformer/src/GPTBigCodeTokenizer.o build/transformer/src/LLaMATokenizer.o build/transformer/src/OPTGenerate.o build/transformer/src/OPTTokenizer.o build/transformer/src/utils.o build/transformer/src/nn_modules/Fp32CLIPAttention.o build/transformer/src/nn_modules/Fp32CLIPEncoder.o build/transformer/src/nn_modules/Fp32CLIPEncoderLayer.o build/transformer/src/nn_modules/Fp32CLIPVisionTransformer.o build/transformer/src/nn_modules/Fp32GPTBigCodeAttention.o build/transformer/src/nn_modules/Fp32GPTBigCodeDecoder.o build/transformer/src/nn_modules/Fp32GPTBigCodeDecoderLayer.o build/transformer/src/nn_modules/Fp32GPTBigCodeForCausalLM.o build/transformer/src/nn_modules/Fp32llamaAttention.o build/transformer/src/nn_modules/Fp32llamaDecoder.o build/transformer/src/nn_modules/Fp32llamaDecoderLayer.o build/transformer/src/nn_modules/Fp32llamaForCausalLM.o build/transformer/src/nn_modules/Fp32OPTAttention.o build/transformer/src/nn_modules/Fp32OPTDecoder.o build/transformer/src/nn_modules/Fp32OPTDecoderLayer.o build/transformer/src/nn_modules/Fp32OPTForCausalLM.o build/transformer/src/nn_modules/Int4GPTBigCodeAttention.o build/transformer/src/nn_modules/Int4GPTBigCodeDecoder.o 
build/transformer/src/nn_modules/Int4GPTBigCodeDecoderLayer.o build/transformer/src/nn_modules/Int4GPTBigCodeForCausalLM.o build/transformer/src/nn_modules/Int4OPTAttention.o build/transformer/src/nn_modules/Int4OPTDecoder.o build/transformer/src/nn_modules/Int4OPTDecoderLayer.o build/transformer/src/nn_modules/Int4OPTForCausalLM.o build/transformer/src/nn_modules/Int8OPTAttention.o build/transformer/src/nn_modules/Int8OPTDecoder.o build/transformer/src/nn_modules/Int8OPTDecoderLayer.o build/transformer/src/nn_modules/OPTForCausalLM.o build/transformer/src/ops/arg_max.o build/transformer/src/ops/batch_add.o build/transformer/src/ops/BMM_F32T.o build/transformer/src/ops/BMM_S8T_S8N_F32T.o build/transformer/src/ops/BMM_S8T_S8N_S8T.o build/transformer/src/ops/Conv2D.o build/transformer/src/ops/embedding.o build/transformer/src/ops/Gelu.o build/transformer/src/ops/LayerNorm.o build/transformer/src/ops/LayerNormQ.o build/transformer/src/ops/linear.o build/transformer/src/ops/LlamaRMSNorm.o build/transformer/src/ops/RotaryPosEmb.o build/transformer/src/ops/softmax.o build/transformer/src/ops/W8A8B8O8Linear.o build/transformer/src/ops/W8A8B8O8LinearReLU.o build/transformer/src/ops/W8A8BFP32OFP32Linear.o build/transformer/../kernels/matmul_imp.o build/transformer/../kernels/matmul_int4.o build/transformer/../kernels/matmul_int8.o build/transformer/../kernels/pthread_pool.o build/transformer/../kernels/cuda/matmul_ref_fp32.o build/transformer/../kernels/cuda/matmul_ref_int8.o build/transformer/../kernels/cuda/gemv_cuda.o build/transformer/../kernels/cuda/matmul_int4.o build/transformer/src/nn_modules/cuda/Int4llamaAttention.o build/transformer/src/nn_modules/cuda/Int4llamaDecoder.o build/transformer/src/nn_modules/cuda/Int4llamaDecoderLayer.o build/transformer/src/nn_modules/cuda/Int4llamaForCausalLM.o build/transformer/src/nn_modules/cuda/LLaMAGenerate.o build/transformer/src/nn_modules/cuda/utils.o build/transformer/src/ops/cuda/batch_add.o 
build/transformer/src/ops/cuda/BMM_F16T.o build/transformer/src/ops/cuda/embedding.o build/transformer/src/ops/cuda/linear.o build/transformer/src/ops/cuda/LlamaRMSNorm.o build/transformer/src/ops/cuda/RotaryPosEmb.o build/transformer/src/ops/cuda/softmax.o  -lcublas -lculibos -lcudart -lcublasLt -lpthread -ldl -lrt -lnvrtc -lcuda -lcudnn -lcurand -lcusolver -L/usr/local/cuda/lib64 -L/usr/local/cuda/targets/aarch64-linux/lib -L/usr/lib/aarch64-linux-gnu -Xlinker -rpath=/usr/local/cuda/lib64 -Xlinker -rpath=/usr/local/cuda/targets/aarch64-linux/lib -Xlinker -rpath=/usr/lib/aarch64-linux-gnu
nvlink warning : Skipping incompatible '/usr/lib/aarch64-linux-gnu/libpthread.a' when searching for -lpthread
nvlink warning : Skipping incompatible '/usr/lib/aarch64-linux-gnu/libdl.a' when searching for -ldl
nvlink warning : Skipping incompatible '/usr/lib/aarch64-linux-gnu/librt.a' when searching for -lrt
/usr/bin/ld: /tmp/tmpxft_00002c81_00000000-5_chat.o: in function `main':
chat.cc:(.text+0x8548): undefined reference to `LLaVAGenerate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, opt_params, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, bool)'
/usr/bin/ld: chat.cc:(.text+0x8a00): undefined reference to `LLaVAGenerate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, opt_params, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, bool)'
/usr/bin/ld: chat.cc:(.text+0x9158): undefined reference to `LLaVAGenerate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, opt_params, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, bool)'
/usr/bin/ld: chat.cc:(.text+0x9614): undefined reference to `LLaVAGenerate(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, void*, int, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, opt_params, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, bool, bool, bool)'
collect2: error: ld returned 1 exit status
make: *** [Makefile:225: chat] Error 255

It seems that there is no src/nn_modules/cuda/LLaVAGenerate.cu in the repository.

Also, src/ops/Gelu.cc needs

#include <math.h>

for tanhf and expf.

The git commit hash is d0fed698b739994afda8ece0dab60cc0f22b2108.

Dudu014 commented Feb 27, 2024

Same issue here.

For Gelu.cc, I solved it by adding:
#include <cmath>

Regarding the LLaVAGenerate issue, I just commented out those call sites in chat.cc; since I am using LLaMA and not LLaVA, it should not matter.

That way I am able to run "make chat -j". However, when running "./chat", it gets stuck showing "loading model ..." and the process ends with "Killed" on the screen. I am unsure what the problem is; I assume it is the "Int4llamaForCausalLM model" declaration in "chat.cc", as the program never prints "Finished!".

Architecture:
Jetson Nano Orin Developer Kit 8GB
Model: LLaMA2_7B_chat_awq_int4 for CUDA device
