Runs on the CPU but very slowly; the GPU still fails #41

Open
baltam opened this issue Nov 27, 2022 · 1 comment


baltam commented Nov 27, 2022

I got the demo running on the CPU, but it fails on the GPU. Any help would be appreciated!

1. Machine configuration

Operating system: Windows 10
Hardware

GPU: 3070 Ti, 8 GB
CPU: i9-7980XE
CUDA 10.0
cuDNN 7.6.5
RAM: 64 GB
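
One combination in this list may be worth double-checking. As far as I know, the 3070 Ti is an Ampere card with compute capability 8.6, and toolkit support for 8.6 first appeared in CUDA 11.1, while CUDA 10.0 stops at 7.5 (Turing). The sketch below encodes that as a small lookup. The table values and the function name `toolkit_supports` are illustrative assumptions, not something taken from this project, and should be verified against NVIDIA's release notes.

```python
# Minimum CUDA toolkit version able to target each compute capability.
# Assumption: values recalled from NVIDIA release notes; verify before relying on them.
MIN_CUDA_FOR_CAPABILITY = {
    (7, 0): (9, 0),    # Volta
    (7, 5): (10, 0),   # Turing
    (8, 0): (11, 0),   # Ampere (A100)
    (8, 6): (11, 1),   # Ampere (RTX 30 series)
}


def toolkit_supports(cuda_version, capability):
    """Return True if the toolkit is new enough for the given compute capability."""
    return cuda_version >= MIN_CUDA_FOR_CAPABILITY[capability]


if __name__ == "__main__":
    print(toolkit_supports((10, 0), (8, 6)))  # the setup listed above
    print(toolkit_supports((11, 1), (8, 6)))
```

If the first check comes back false on a given setup, the cuBLAS failure could stem from the toolkit not having kernels for the card rather than from memory pressure.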

Software dependencies

pandas==0.24.2
regex==2019.4.14
h5py==2.9.0
numpy==1.16.2
tensorboard==1.13.1
tensorflow-gpu==1.13.1
tqdm==4.31.1
requests==2.22.0
protobuf==3.19.0
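
It may be worth confirming that the active conda environment really matches these pins: the traceback paths further down contain `tensorflow_core`, a package directory that, to my knowledge, ships with TensorFlow 1.15 and 2.0/2.1 rather than 1.13.1. A minimal sketch for comparing the pins against `pip freeze` output follows. The helper names and the sample "installed" text are hypothetical.

```python
def parse_pins(text):
    """Parse 'name==version' lines into a dict keyed by lowercase package name."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" in line and not line.startswith("#"):
            name, version = line.split("==", 1)
            pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(expected, installed):
    """Return {name: (expected, installed_or_None)} for every pin that differs."""
    return {
        name: (version, installed.get(name))
        for name, version in expected.items()
        if installed.get(name) != version
    }


if __name__ == "__main__":
    expected = parse_pins("tensorflow-gpu==1.13.1\nnumpy==1.16.2")
    # Hypothetical `pip freeze` output, for illustration only.
    installed = parse_pins("tensorflow-gpu==1.15.0\nnumpy==1.16.2")
    print(find_mismatches(expected, installed))
```

In practice the second argument would come from running `pip freeze` inside the environment that launches demo.py.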

2. Error, troubleshooting attempts, and workaround

模型加载好啦!🍺Bilibili干杯🍺 

现在将你的作文题精简为一个句子,粘贴到这里:⬇️,然后回车


**********************************************作文题目**********************************************

苦练本手,方能妙手随成


**********************************************作文题目**********************************************

正在生成第  1  of  1 篇文章

......

EssayKiller正在飞速写作中,请稍后......

2022-11-27 19:19:37.206277: E tensorflow/stream_executor/cuda/cuda_blas.cc:428] failed to run cuBLAS routine: CUBLAS_STATUS_EXECUTION_FAILED
2022-11-27 19:19:37.206746: E tensorflow/stream_executor/cuda/cuda_blas.cc:2301] Internal: failed BLAS call, see log for details
Traceback (most recent call last):
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1365, in _do_call
    return fn(*args)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1350, in _run_fn
    target_list, run_metadata)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1443, in _call_tf_sessionrun
    run_metadata)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[{{node sample_sequence/newslm/layer00/MatMul}}]]
         [[sample_sequence/while/Identity/_1594]]
  (1) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[{{node sample_sequence/newslm/layer00/MatMul}}]]
0 successful operations.
0 derived errors ignored.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "d:/ly/EssayKiller_V2-master/LanguageNetwork/GPT2/scripts/demo.py", line 220, in <module>
    p_for_topp: top_p[chunk_i]})
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 956, in run
    run_metadata_ptr)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1180, in _run
    feed_dict_tensor, options, run_metadata)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1359, in _do_run
    run_metadata)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\client\session.py", line 1384, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InternalError: 2 root error(s) found.
  (0) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[node sample_sequence/newslm/layer00/MatMul (defined at C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
         [[sample_sequence/while/Identity/_1594]]
  (1) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24
         [[node sample_sequence/newslm/layer00/MatMul (defined at C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py:1748) ]]
0 successful operations.
0 derived errors ignored.

Original stack trace for 'sample_sequence/newslm/layer00/MatMul':
  File "d:/ly/EssayKiller_V2-master/LanguageNetwork/GPT2/scripts/demo.py", line 188, in <module>
    do_topk=False)
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 768, in sample
    do_topk=do_topk)
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 740, in initialize_from_context
    batch_size=batch_size, p_for_topp=p_for_topp, cache=None, do_topk=do_topk)
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 714, in sample_step
    cache=cache,
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 499, in __init__
    cache=layer_cache,
  File "d:\ly\EssayKiller_V2-master\LanguageNetwork\GPT2\scripts\modeling.py", line 198, in attention_layer
    attention_scores = tf.matmul(query, key, transpose_b=True)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\util\dispatch.py", line 180, in wrapper
    return target(*args, **kwargs)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\ops\math_ops.py", line 2716, in matmul
    return batch_mat_mul_fn(a, b, adj_x=adjoint_a, adj_y=adjoint_b, name=name)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\ops\gen_math_ops.py", line 1712, in batch_mat_mul_v2
    "BatchMatMulV2", x=x, y=y, adj_x=adj_x, adj_y=adj_y, name=name)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\op_def_library.py", line 794, in _apply_op_helper
    op_def=op_def)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\util\deprecation.py", line 507, in new_func
    return func(*args, **kwargs)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3357, in create_op
    attrs, op_def, compute_device)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 3426, in _create_op_internal
    op_def=op_def)
  File "C:\Users\ly1995\AppData\Local\conda\conda\envs\zuowen1\lib\site-packages\tensorflow_core\python\framework\ops.py", line 1748, in __init__
    self._traceback = tf_stack.extract_stack()

2.1 Key information

(0) Internal: Blas xGEMMBatched launch failed : a.shape=[24,11,64], b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24

2.2 Analysis

Searching Bing for the error message suggested that the cause is mainly insufficient GPU memory.
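
For a sense of scale, the operand shapes in the line from 2.1 can be turned into a byte count. The sketch below parses that line and sums the sizes of `a` and `b`. It assumes float32 (4 bytes per element), which the log does not state, and the function name `operand_bytes` is made up for this example.

```python
import re

ERROR_LINE = (
    "Blas xGEMMBatched launch failed : a.shape=[24,11,64], "
    "b.shape=[24,11,64], m=11, n=11, k=64, batch_size=24"
)


def operand_bytes(line, bytes_per_element=4):
    """Sum the sizes of every operand whose shape appears in the error line.

    bytes_per_element=4 assumes float32; the log itself does not say.
    """
    total = 0
    for dims in re.findall(r"shape=\[([\d,]+)\]", line):
        count = 1
        for d in dims.split(","):
            count *= int(d)
        total += count * bytes_per_element
    return total


if __name__ == "__main__":
    print(operand_bytes(ERROR_LINE))  # 135168 bytes, i.e. 132 KiB
```

At roughly 132 KiB, this multiplication is tiny next to 8 GB, so it is unlikely to exhaust memory on its own. That does not rule out memory being claimed elsewhere, for example by the model weights or another process, but it suggests other causes deserve a look as well.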

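2.3 Idea 1

Since GPU memory seemed to be insufficient, I figured that reserving less of it up front and letting the program allocate GPU memory on demand would solve the problem, so I added the following statements: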

os.environ["CUDA_VISIBLE_DEVICES"] = "0"
tf_config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
tf_config.gpu_options.allow_growth=True
# tf_config.gpu_options.per_process_gpu_memory_fraction = 0.6

...but it still fails. Is it really because of insufficient GPU memory?
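
One thing to rule out: the statements above only build `tf_config`, and the options take effect only if that object is passed to the session that runs the model. How demo.py creates its session is not shown in this issue, so the wiring below is an assumption, and `make_session` is a hypothetical helper name.

```python
import os

# Set before TensorFlow is imported so the CUDA runtime sees it on initialization.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

try:
    import tensorflow as tf
except ImportError:  # lets this file be imported on machines without TensorFlow
    tf = None


def make_session():
    """Create a TF1-style session that allocates GPU memory on demand."""
    tf_config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
    tf_config.gpu_options.allow_growth = True
    # The options have no effect unless the config is handed to the session.
    return tf.compat.v1.Session(config=tf_config)
```

If demo.py already constructs its session with `config=tf_config`, this step changes nothing and the cause lies elsewhere.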

Workaround 1

Since the GPU would not work, I gave up on it and tried the CPU instead, changing the following statement:

os.environ["CUDA_VISIBLE_DEVICES"] = " "  # changed from "0" so that no GPU is visible

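Result: the program runs, but the CPU is of course much slower than the GPU. Generating one essay takes about 10 minutes, with CPU utilization at around 40-50%.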
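
For reference, the value more commonly used to hide every GPU is "-1" rather than a single space, and the variable is usually set before TensorFlow is imported. A small sketch follows; `hide_gpus` is a hypothetical helper, not part of this project.

```python
import os
import sys


def hide_gpus():
    """Hide all CUDA devices from this process.

    Returns False if TensorFlow was already imported, in which case the
    setting may come too late to take effect.
    """
    too_late = "tensorflow" in sys.modules
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    return not too_late


if __name__ == "__main__":
    print(hide_gpus())
    print(os.environ["CUDA_VISIBLE_DEVICES"])
```

Calling it at the very top of the script, ahead of `import tensorflow`, keeps the CPU-only run explicit and easy to revert.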


hello-2021 commented Nov 27, 2022 via email
