Feature/reagent branch for ReAGent #250
Hey there, thanks a lot for the PR @casszhao @xuan25 ! I had a look at the code structure you pushed and I have some comments on the current implementation:
I will keep looking at this over the next week; if you could address some of the issues I describe above, it would be of great help!
I've had a look at the changes and we're much closer to getting it merged @xuan25, thanks! 🤗 I will have a second look to fix minor details. Just a quick Q:
Thanks for the feedback. Yeah, I have implemented the attribute target in the latest commits, but without restricting the sampled tokens to the same language (vocabulary set). However, it should at least do the job of perturbing the inputs.
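For readers following along, the perturbation described in this comment can be sketched roughly as below. This is a minimal illustration and not the code from the PR: `perturb_tokens`, the example token ids, and the vocabulary size are all hypothetical, and replacement ids are drawn uniformly from the full vocabulary, i.e. without any language or vocabulary-subset filter.

```python
import random
from typing import List, Sequence


def perturb_tokens(
    token_ids: Sequence[int],
    positions: Sequence[int],
    vocab_size: int,
    rng: random.Random,
) -> List[int]:
    """Return a copy of token_ids with the given positions resampled.

    Hypothetical helper: replacement ids are sampled uniformly from the
    whole vocabulary, so a sampled token may come from any language.
    """
    perturbed = list(token_ids)
    for pos in positions:
        perturbed[pos] = rng.randrange(vocab_size)
    return perturbed


rng = random.Random(0)
original = [101, 2023, 2003, 1037, 3231, 102]  # made-up token ids
perturbed = perturb_tokens(original, positions=[1, 3], vocab_size=30522, rng=rng)
```

Restricting the sampled tokens to one language would amount to replacing `rng.randrange(vocab_size)` with a draw from a precomputed list of allowed ids.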
* origin:
  Fix CTI on first token when whitespace
  Updated model config with Cohere and StarCoder2 (transformers v4.39)
  Fix attribute-context --help
  Strip prepended whitespace in
  Fix attribute-context with non-contrastive attribution
  Fix `IndexError` for dec-only models in `attribute-context`
  Fix prefixed generation for mismatching tokenization
  Fix URL to arXiv (inseq-team#259)
  Fix install CI
  Various fixes to `attribute-context` (inseq-team#258)
Hi @xuan25 @casszhao, sorry for the delay! I'm having a look at this and it ran with no issues on both decoder-only and encoder-decoder models (with and without target attribution), so I think we are quite close to merging now. I pushed some fixes including:
Thanks, @gsarti!
We are ready for merging! 🎉 Just noting down here some points in the current ReAGent implementation that can be improved:
Provided the current implementation is functional for both decoder-only and encoder-decoder models, I will proceed with the merge, and any further development regarding these issues should be performed in a dedicated PR. Thanks again @casszhao @xuan25 for your contribution! 😄
Hi
Thanks, will promote it later on X. Cheers ~
Best Regards
Cass Z
…On Sat, Apr 13, 2024 at 11:00 Gabriele Sarti ***@***.***> wrote:
We are ready for merging! 🎉 Just noting down here three points in the current ReAGent implementation that can be improved:
- overlap_strict_pos currently defaults to True, and the False condition is still a TODO. If it's added, the purpose of this check needs to be made more explicit in docstrings.
- The AggregateRationalizer class currently supports only a batch size of 1 because it builds a batch of various masked examples using num_probes. Ideally, we'd want batching to still be allowed here, taking inspiration from the Captum Integrated Gradients implementation <https://captum.ai/api/_modules/captum/attr/_core/integrated_gradients.html#IntegratedGradients>, where they face the same issue (the internal_batch_size there is equivalent to num_probes, and it is used to build the interpolation steps across all batch elements).
- Currently the ReAGent implementation doesn't make use of attributed_fn to specify what step function to use to estimate token importance, and always uses the logit. It would be good to use the attribution_model forward instead of extracting the underlying AutoModel, since it would automatically handle this and allow for out-of-the-box usage for e.g. contrastive feature attribution, or other user-specified step functions.
Provided the current implementation is functional for both decoder-only and encoder-decoder models, I will proceed with the merge, and any further development regarding these issues should be performed in a dedicated PR.
Thanks again @casszhao <https://github.com/casszhao> @xuan25 <https://github.com/xuan25> for your contribution! 😄
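As a rough illustration of the batching point in the list above, the sketch below expands every example in a batch into `num_probes` masked copies, scores the flattened probes in chunks of at most `internal_batch_size`, and regroups the scores per example. It is not the Inseq or Captum implementation: `MASK_ID`, `build_probes`, `score_batch` and the toy `score_fn` are hypothetical stand-ins, and a real version would call the model forward pass on padded tensors instead.

```python
import random
from typing import Callable, Iterator, List, Sequence

MASK_ID = -1  # hypothetical placeholder id for a masked/replaced token


def build_probes(
    batch: Sequence[Sequence[int]],
    num_probes: int,
    mask_ratio: float,
    rng: random.Random,
) -> List[List[int]]:
    """Expand each example into num_probes randomly masked copies (flat list)."""
    probes = []
    for seq in batch:
        for _ in range(num_probes):
            probes.append([MASK_ID if rng.random() < mask_ratio else t for t in seq])
    return probes


def chunked(items: List[List[int]], size: int) -> Iterator[List[List[int]]]:
    """Yield consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def score_batch(
    batch: Sequence[Sequence[int]],
    score_fn: Callable[[List[List[int]]], List[float]],
    num_probes: int,
    internal_batch_size: int,
    mask_ratio: float = 0.5,
    seed: int = 0,
) -> List[List[float]]:
    """Score batch_size * num_probes probes in chunks, then regroup per example."""
    rng = random.Random(seed)
    probes = build_probes(batch, num_probes, mask_ratio, rng)
    flat_scores: List[float] = []
    for chunk in chunked(probes, internal_batch_size):
        flat_scores.extend(score_fn(chunk))
    return [
        flat_scores[i * num_probes:(i + 1) * num_probes]
        for i in range(len(batch))
    ]


chunk_sizes: List[int] = []


def toy_score(chunk: List[List[int]]) -> List[float]:
    """Stand-in for a model call: fraction of tokens left unmasked."""
    chunk_sizes.append(len(chunk))
    return [sum(t != MASK_ID for t in seq) / len(seq) for seq in chunk]


scores = score_batch(
    [[1, 2, 3], [4, 5, 6, 7]], toy_score, num_probes=3, internal_batch_size=4
)
```

With two examples and three probes each, the six probes are processed in chunks of four and two, so the batch size seen by the scoring function is decoupled from both the user-facing batch size and `num_probes`.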
ReAGent: A Model-agnostic Feature Attribution Method for Generative Language Models
Paper link: https://arxiv.org/abs/2402.00794
Type of Change