Replies: 2 comments 1 reply
-
With your newest improvement, my inference time is reduced from 54-58 s/step down to 44-48 s/step. Windows 10 laptop; Intel i5-8250U; 12 GB RAM @ 2600 MHz; generating a 512 x 512 image. I'm using your Windows-ready AVX2 build.
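As a quick sanity check on those numbers, taking the midpoints of the two reported ranges puts the improvement at roughly an 18% reduction in per-step time (a small sketch; the 54-58 and 44-48 figures come from the comment above):

```python
# Midpoints of the reported step-time ranges (seconds per step).
before = (54 + 58) / 2  # original build
after = (44 + 48) / 2   # newest improvement
reduction = (before - after) / before * 100
print(f"{reduction:.1f}% less time per step")
```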
-
Hey @leejet. Using BLAS it's 44-48 s/step, but without BLAS it's 34-38 s/step. I wonder how this could happen. Does BLAS itself not perform well in sd.cpp? Ubuntu Jammy 22.04 laptop; Intel i5-8250U; 12 GB RAM @ 2600 MHz; generating a 512 x 512 image.
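For scale, the midpoints of those ranges put the BLAS build at roughly 28% more time per step than the plain build (a quick calculation from the figures in the comment above; whether this comes from BLAS threading overhead on the 4-core i5-8250U or something else is not established here):

```python
# Midpoints of the reported step-time ranges (seconds per step).
with_blas = (44 + 48) / 2     # BLAS-enabled build
without_blas = (34 + 38) / 2  # plain build
overhead = (with_blas - without_blas) / without_blas * 100
print(f"BLAS build takes {overhead:.1f}% more time per step")
```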
-
Original
./bin/sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat" -v
Improvement 1
./bin/sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat" -v
Improvement 2
./bin/sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat" -v