fixing GPTQ #148

HDCharles · 2024-03-28T08:42:41Z

Stack from ghstack (oldest at bottom):

-> fixing GPTQ #148

Summary:

This tries to fix the kv_cache update issue by replacing fx tracing with a
tensor subclass. So far, however, the subclass is less successful than the fx
tracer. The fx tracer breaks because

    k_out[:, :, input_pos] = k_val

gets traced as

    new_var = torch.ops.aten.index_put_(k_out, [None, None, input_pos], k_val)

with new_var never being accessed afterward. new_var becomes the correct
multiInput value, but is then lost.
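
A minimal eager-mode sketch of the equivalence described above, assuming a 4-D cache laid out as (batch, heads, seq_len, head_dim). The shapes and the names k_out_eager and k_out_traced are hypothetical and not taken from the PR; only the two statements being compared come from the description.

```python
import torch

# Hypothetical cache layout: (batch, heads, seq_len, head_dim).
k_out_eager = torch.zeros(1, 2, 4, 3)
k_out_traced = torch.zeros(1, 2, 4, 3)
input_pos = torch.tensor([1, 3])
k_val = torch.ones(1, 2, 2, 3)

# Eager form, as written in the model code.
k_out_eager[:, :, input_pos] = k_val

# Traced form: an in-place op whose return value aliases its first argument.
new_var = torch.ops.aten.index_put_(k_out_traced, [None, None, input_pos], k_val)

# Both forms write the same values into the cache.
same_values = torch.equal(k_out_eager, k_out_traced)
# The returned tensor shares memory with the mutated cache.
same_storage = new_var.data_ptr() == k_out_traced.data_ptr()
print(same_values, same_storage)
```

With plain tensors, dropping new_var is harmless because the write lands in k_out itself. The failure described above appears to arise when the updated value only exists in new_var (as a multiInput) and nothing downstream reads it.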

The subclass, on the other hand, tries to use the func "<slot wrapper '__setitem__' of 'torch._C.TensorBase' objects>",
which does not seem to want to mutate k_out, so the attempt to make it a
multiTensor fails.
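
The subclass behavior can be sketched with a toy tensor subclass. LoggingTensor below is a hypothetical stand-in, not the multiTensor class from this PR, and the shapes are assumed; it only records which func reaches __torch_function__ and then defers to the default handling.

```python
import torch

class LoggingTensor(torch.Tensor):
    """Hypothetical subclass that records the funcs it intercepts."""
    seen = []

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        # Record the intercepted callable by name, then run default behavior.
        cls.seen.append(getattr(func, "__name__", repr(func)))
        return super().__torch_function__(func, types, args, kwargs or {})

k_out = torch.zeros(1, 2, 4, 3)                      # plain tensor cache
input_pos = torch.tensor([1, 3])
k_val = torch.ones(1, 2, 2, 3).as_subclass(LoggingTensor)

# The assignment dispatches through __torch_function__ because k_val
# is a subclass instance, even though k_out is a plain tensor.
k_out[:, :, input_pos] = k_val

print(LoggingTensor.seen, type(k_out).__name__)
```

In this sketch the values are written into k_out, but k_out stays a plain torch.Tensor: an in-place item assignment returns nothing and cannot change the Python type of its target, which is consistent with the failure to turn k_out into a multiTensor.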

Test Plan: sh run.sh

Reviewers:

Subscribers:

Tasks:

Tags:



fixing GPTQ

298c443

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: [ghstack-poisoned]

HDCharles added a commit that referenced this pull request Mar 28, 2024

fixing GPTQ

743261b

Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: ghstack-source-id: 9ed1621201317e5f655132ba11538a67c8aa5a69 Pull Request resolved: #148

facebook-github-bot added the CLA Signed label Mar 28, 2024

mikekgfb mentioned this pull request Apr 7, 2024

GPTQ enablement pytorch/torchchat#78

Closed
