Running llama2-7B on Ascend NPU in a container #8

Open
Yikun opened this issue Aug 18, 2023 · 2 comments
Labels
Ascend, publish

Comments


Yikun commented Aug 18, 2023

Author: @Yikun

0. Prerequisites

Complete the PyTorch environment setup following #7.

(.llm-venv) # npu-smi info

(.llm-venv) # python3 -c "import torch;import torch_npu; a = torch.randn(3, 4).npu(); print(a + a);"
Warning: Device do not support double dtype now, dtype cast repalce with float.
tensor([[ 1.2800,  1.3105,  0.4513, -1.1650],
        [ 3.5199, -0.2590,  2.6664, -1.9602],
        [ 2.3262, -2.4671,  2.3252, -2.1502]], device='npu:0')

1. Install Transformers

python3 -m pip install --upgrade pip
pip install transformers accelerate xformers
# "sentencepiece" and "protobuf==3.20.0" are needed by convert_llama_weights_to_hf.py
pip install sentencepiece protobuf==3.20.0
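
A quick way to confirm the installation before moving on (a minimal sketch; the package names are the ones installed above):

# Sanity check: print the installed version of each package used below,
# or flag it if pip did not install it.
python3 - <<'EOF'
import importlib.metadata as md

for pkg in ("transformers", "accelerate", "xformers", "sentencepiece", "protobuf"):
    try:
        print(f"{pkg}: {md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: NOT INSTALLED")
EOF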

2. Prepare the llama model

Prepare the model:

# tree llama/llama-2-7b/
llama/llama-2-7b/
├── checklist.chk
├── consolidated.00.pth
└── params.json

cd llama/llama-2-7b
mkdir 7B
mv *.* 7B
cp ../tokenizer.model .

# tree -h llama/llama-2-7b/
llama/llama-2-7b/
|-- [4.0K]  7B
|   |-- [ 100]  checklist.chk
|   |-- [ 13G]  consolidated.00.pth
|   `-- [ 102]  params.json
`-- [488K]  tokenizer.model

Convert the model:

# find / -name convert_llama_weights_to_hf.py
/root/.llm-venv/lib/python3.8/site-packages/transformers/models/llama/convert_llama_weights_to_hf.py
python  /root/.llm-venv/lib/python3.8/site-packages/transformers/models/llama/convert_llama_weights_to_hf.py --input_dir llama/llama-2-7b --model_size 7B --output_dir transformer/llama-2-7b

The converted model layout is as follows:

# tree -h transformer/llama-2-7b/
transformer/llama-2-7b/
|-- [ 578]  config.json
|-- [ 132]  generation_config.json
|-- [9.3G]  pytorch_model-00001-of-00002.bin
|-- [3.3G]  pytorch_model-00002-of-00002.bin
|-- [ 26K]  pytorch_model.bin.index.json
|-- [ 411]  special_tokens_map.json
|-- [1.8M]  tokenizer.json
|-- [488K]  tokenizer.model
`-- [ 745]  tokenizer_config.json
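
Before loading the full 13 GB of weights, a lightweight sanity check (a sketch; model_path mirrors the --output_dir used in the conversion command above) confirms that the converted directory is complete enough for transformers to read:

# Load only the config and tokenizer from the converted HF directory;
# this fails fast if the conversion produced an incomplete layout.
from transformers import AutoConfig, AutoTokenizer

model_path = "transformer/llama-2-7b"   # --output_dir from the conversion step
config = AutoConfig.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

print(config.model_type, config.num_hidden_layers, config.hidden_size)
print(tokenizer("Deep learning is").input_ids)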

3. Run the model

from transformers import AutoTokenizer, LlamaForCausalLM
import torch
import torch_npu

# Avoid ReduceProd operator core dump, see more in: https://github.com/cosdt/llm/issues/4
option = {}
option["NPU_FUZZY_COMPILE_BLACKLIST"] = "ReduceProd"
torch.npu.set_option(option)

npu_id = 0
torch.npu.set_device(npu_id)

device = "npu:{}".format(npu_id)
model_path = "/opt/yikun/transformer/llama-2-7b"
model = LlamaForCausalLM.from_pretrained(model_path).to(device)

tokenizer = AutoTokenizer.from_pretrained(model_path)

prompt = "Deep learning is"
inputs = tokenizer(prompt, return_tensors="pt").to(device)
generate_ids = model.generate(inputs.input_ids, max_length=50)

tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
'Deep learning is a branch of machine learning that is based on artificial neural networks. Deep learning is a subset of machine learning that is based on artificial neural networks. Neural networks are a type of machine learning algorithm that is inspired by the structure and'
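
The run above uses greedy decoding (the default for model.generate). A sampled variant on the same NPU-resident model tends to produce less repetitive text; this is a sketch, and the generation parameters are illustrative rather than taken from the original run:

# Sampled generation, reusing the model/tokenizer/inputs objects created above.
generate_ids = model.generate(
    inputs.input_ids,
    max_length=50,
    do_sample=True,    # sample instead of greedy decoding
    temperature=0.7,   # illustrative values, not from the original run
    top_p=0.9,
)
print(tokenizer.batch_decode(generate_ids, skip_special_tokens=True)[0])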

Pitfalls encountered:

  1. torch.npu.set_device: after setting a wrong NPU ID the error persists, even after switching back to a correct ID (see the sketch after this list): https://github.com/cosdt/llm/issues/3
  2. torch ReduceProd operator issue: https://github.com/cosdt/llm/issues/4
  3. transformers must be imported before torch and torch_npu: https://github.com/cosdt/llm/issues/5
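
For pitfall 1, a defensive pattern is to validate the NPU ID before calling set_device. This is a sketch that assumes torch.npu mirrors the torch.cuda device-query API (is_available / device_count), which torch_npu generally provides:

# Guard against a wrong NPU ID before set_device, since recovering from a
# bad ID in the same process is problematic (see issue #3 above).
import torch
import torch_npu

npu_id = 0
if not torch.npu.is_available():
    raise RuntimeError("No NPU device detected")
if npu_id >= torch.npu.device_count():
    raise ValueError(f"npu_id {npu_id} out of range; {torch.npu.device_count()} device(s) found")
torch.npu.set_device(npu_id)
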
yellow0523 commented

Thanks bro, I finally got llama2 running on Ascend!


Yikun commented Oct 20, 2023

@yellow0523 Glad it helped! If you'd like, add me on WeChat: yikunkero. Happy to chat about suggestions for making the Ascend + LLM developer experience better. :)
