I have tried with both LLaMA and VILA models.
Both give this when ./chat is run:
../kernels/avx/matmul_avx_int4.cc:701: void matmul::MatmulOperator::mat_mul_accelerator_int4_fast_no_offset(const matmul_params*): Assertion `params->block_size == 32' failed.
Aborted (core dumped)
When I print the block_size parameter passed to the above function, its value is 128. Does anyone know why this happens? How can I set the block size to 32?