
CPU training can't reach full load #3

Open
ghost opened this issue Mar 21, 2017 · 6 comments


ghost commented Mar 21, 2017

A 24-core, 48-thread 2695 v2 never reaches full load during training; CPU utilization sits around 30%, and each training iteration takes about 1 second.

With the GPU (a GTX 960), each training iteration takes about 0.5 seconds.

Ubuntu 16.04, CUDA 8.0, cuDNN 5.1. TensorFlow was compiled locally.
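One setting worth ruling out on the CPU side is the session's thread pools: TF1 exposes intra-op and inter-op parallelism through `ConfigProto`. A minimal sketch, with illustrative thread counts (not values taken from this repo):

```python
import tensorflow as tf

# Illustrative values; tune for the actual machine (e.g. 48 hardware threads)
config = tf.ConfigProto(
    intra_op_parallelism_threads=48,  # threads available inside a single op
    inter_op_parallelism_threads=4,   # how many independent ops run at once
)
sess = tf.Session(config=config)
```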


ghost commented Mar 21, 2017

Could this be related to optimization? Is only one thread being used to generate the images, making that the bottleneck?

luyishisi (Owner) commented
On one hand, yes, there is a thread bottleneck. Beyond that, check whether the generated images are being saved as local files; there may also be a disk I/O bottleneck.
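To illustrate the owner's point, the generator can hand the image back in memory instead of round-tripping through the filesystem. A sketch assuming the `captcha` library; whether this repo actually uses it (or saves files at all) is an assumption:

```python
import numpy as np
from PIL import Image
from captcha.image import ImageCaptcha  # pip install captcha

gen = ImageCaptcha()

# Disk round-trip: PNG encode + write + read + decode on every sample
gen.write("1234", "sample.png")
img_slow = np.asarray(Image.open("sample.png"))

# In memory: generate_image() returns a PIL image directly, no filesystem I/O
img_fast = np.asarray(gen.generate_image("1234"))
```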


ghost commented Mar 31, 2017 via email

leng-yue (Contributor) commented Oct 1, 2018

> Could this be related to optimization? Is only one thread being used to generate the images, making that the bottleneck?

Yes, this really is an optimization issue. In fact, in normal use you shouldn't be generating images and training at the same time.
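One way to follow that advice is to pre-generate the whole training set once and only read it back during training. A minimal sketch, again assuming the `captcha` library and a 4-digit label format (both assumptions, not this repo's confirmed pipeline):

```python
import numpy as np
from captcha.image import ImageCaptcha  # pip install captcha

CHARS = "0123456789"
N_SAMPLES = 50000  # hypothetical dataset size

def pregenerate(path="train_set.npz"):
    gen = ImageCaptcha(width=160, height=60)
    images, labels = [], []
    for _ in range(N_SAMPLES):
        text = "".join(np.random.choice(list(CHARS), 4))
        images.append(np.asarray(gen.generate_image(text), dtype=np.uint8))
        labels.append(text)
    # Training then pays only a cheap disk read, never the generation cost
    np.savez_compressed(path, images=np.stack(images), labels=np.array(labels))

if __name__ == "__main__":
    pregenerate()
```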


kotori2 commented May 17, 2019

I tried it here: generating one batch took 254 ms while the GPU training step took 13 ms... The upshot is that the GPU sits completely idle while, of my six CPU cores, only one is slowly churning out training batches.
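A measurement like that can be reproduced by timing the two stages separately. A sketch in the TF1 style of this era, where `next_batch`, `x`, `y`, and `train_op` stand in for the repo's real pipeline (placeholder names, not its actual API):

```python
import time

def profile_steps(sess, train_op, next_batch, x, y, n=100):
    gen_t = train_t = 0.0
    for _ in range(n):
        t0 = time.perf_counter()
        batch_x, batch_y = next_batch()            # CPU-side data generation
        t1 = time.perf_counter()
        sess.run(train_op, feed_dict={x: batch_x, y: batch_y})  # GPU step
        t2 = time.perf_counter()
        gen_t += t1 - t0
        train_t += t2 - t1
    print(f"generation: {gen_t / n * 1e3:.0f} ms/batch, "
          f"training: {train_t / n * 1e3:.0f} ms/batch")
```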


kotori2 commented May 17, 2019

I tried optimizing this by running 12 threads generating image data in parallel; that got batch generation down to about 160 ms. Opening more threads beyond that didn't seem to make much difference.
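A thread-pool version of that idea might look like the sketch below (again assuming the `captcha` library; `ImageCaptcha` is treated as thread-safe here, otherwise give each worker its own instance). The plateau past ~12 threads is plausibly the GIL serializing the pure-Python parts of the pipeline; a `ProcessPoolExecutor` may scale further, at the cost of pickling the batches.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np
from captcha.image import ImageCaptcha  # pip install captcha

CHARS = "0123456789"
gen = ImageCaptcha(width=160, height=60)  # shared; use per-thread instances if unsafe

def one_sample(_):
    text = "".join(np.random.choice(list(CHARS), 4))
    return np.asarray(gen.generate_image(text), dtype=np.uint8), text

def parallel_batch(pool, batch_size=64):
    # Fan the per-image work out across the pool; sample order is irrelevant
    images, labels = zip(*pool.map(one_sample, range(batch_size)))
    return np.stack(images), list(labels)

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=12) as pool:
        xs, ys = parallel_batch(pool)
        print(xs.shape, ys[:3])
```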
