This repository has been archived by the owner on Dec 1, 2024. It is now read-only.

question about quantization #119

Open
xinhaoc opened this issue Jun 4, 2023 · 0 comments
Comments

xinhaoc commented Jun 4, 2023

Hi FlexGen team! I have a question about your quantization algorithm. Are you using the function `run_float_quantize` for int4/int8 compression? When I run the test (`test_float_quantize`), it fails because the quantization parameters differ from the DeepSpeed version (the `ref_out_tensor` is the same). The DeepSpeed parameters can recover the float16 tensor, but `run_float_quantize` cannot. Thanks!
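For context, group-wise min-max quantization is a common scheme for this kind of int4/int8 weight compression: the key property being tested above is that the stored parameters (scale and zero point per group) suffice to reconstruct an approximation of the original float tensor. Below is a minimal hedged sketch of that round trip in NumPy; the function names and group size are illustrative, not FlexGen's or DeepSpeed's actual code.

```python
import numpy as np

def quantize_group(x, bits=4, group_size=64):
    # Group-wise asymmetric min-max quantization (illustrative sketch,
    # not the exact FlexGen or DeepSpeed implementation).
    x = x.reshape(-1, group_size)
    mn = x.min(axis=1, keepdims=True)
    mx = x.max(axis=1, keepdims=True)
    scale = (mx - mn) / (2**bits - 1)
    scale = np.where(scale == 0, 1.0, scale)  # avoid divide-by-zero
    q = np.round((x - mn) / scale).astype(np.uint8)
    return q, scale, mn

def dequantize_group(q, scale, mn, shape):
    # The stored (scale, mn) parameters must recover the float tensor
    # up to the per-group quantization step.
    return (q.astype(np.float32) * scale + mn).reshape(shape)

x = np.random.randn(4, 64).astype(np.float32)
q, scale, mn = quantize_group(x, bits=4, group_size=64)
x_hat = dequantize_group(q, scale, mn, x.shape)
# Reconstruction error is bounded by half a quantization step per group.
print(np.abs(x - x_hat).max())
```

If the parameters produced by one implementation cannot drive this dequantize step back to the original tensor while another implementation's can, the two likely differ in parameter layout (e.g. scale/zero-point packing or group ordering) rather than in the reference output itself, which would match the symptom described above.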
