# v0.5.0: Tutorial, Better contrastive attribution, 4-bit/Petals support and more
## 📄 New Tutorial and Better Documentation

- A new quickstart tutorial is available in the repository, introducing feature attribution methods and showcasing basic and more advanced Inseq use cases.
- The documentation now uses the Sphinx `furo` theme.
- A new utility function `inseq.explain` was introduced to visualize the documentation associated with the string identifiers used for attribution methods, step functions and aggregators:

```python
import inseq

inseq.explain("saliency")
# >>> Saliency attribution method.
# >>> Reference implementation: https://captum.ai/api/saliency.html
```
## 🔀 More Flexible and Intuitive Contrastive Attribution (#193, #195, #207, #228)

- Contrastive attribution functions now support original and contrastive targets of different lengths. Tokens are right-aligned by default, simplifying usage for studies that use preceding context as the contrastive option.
- Contrastive source and target inputs can be specified as strings in `model.attribute` when using a contrastive step function or attribution target, via the `contrast_sources` and `contrast_targets` arguments (see docs).
- Custom alignments can be provided for contrastive step functions to compare specific step pairs, using the `contrast_targets_alignments` argument of `model.attribute`. Passing `"auto"` uses the multilingual LaBSE encoder to create alignments with the AWESOME approach (useful for generation tasks preserving semantic equivalence, e.g. machine translation).
- The `is_attributed_fn` argument of `StepFunctionBaseArgs` can be used to customize the behavior of step functions in the attributed and regular cases.

Refer to the quickstart tutorial for examples of contrastive attribution.
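To build intuition for the default right-side alignment of targets with different lengths, here is a minimal pure-Python sketch. This is illustrative only, not Inseq's internal implementation, and the token lists are made up:

```python
def right_align_pairs(target_tokens, contrast_tokens):
    """Pair tokens from the end of both sequences, mirroring the idea of
    right-side alignment: the last generation steps are compared, while any
    extra leading context in the longer sequence is left unpaired."""
    n = min(len(target_tokens), len(contrast_tokens))
    return list(zip(target_tokens[-n:], contrast_tokens[-n:]))

# Original target generated with preceding context vs. a shorter
# contrastive target generated without it (hypothetical tokens):
orig = ["<ctx>", "The", "answer", "is", "42"]
contrast = ["answer", "is", "43"]
print(right_align_pairs(orig, contrast))
# [('answer', 'answer'), ('is', 'is'), ('42', '43')]
```

With this alignment, the step function compares `"42"` against `"43"` at the final step instead of mismatching positions from the left.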
## 🤗 Support for Distributed and 4-bit Models (#186, #205)

Towards the goal of democratizing access to interpretability methods for analyzing state-of-the-art models, Inseq now supports attribution of distributed language models from the Petals library and of 4-bit quantized LMs via the `transformers` `bitsandbytes` integration (using `load_in_4bit=True`), with the added flexibility of the Inseq API.

Example of contrastive gradient attribution of a distributed LLaMA 65B model:

```python
import inseq
from petals import AutoDistributedModelForCausalLM

model_name = "enoch/llama-65b-hf"
model = AutoDistributedModelForCausalLM.from_pretrained(model_name).cuda()
inseq_model = inseq.load_model(model, "saliency")
prompt = (
    "Option 1: Take a 50 minute bus, then a half hour train, and finally a 10 minute bike ride.\n"
    "Option 2: Take a 10 minute bus, then an hour train, and finally a 30 minute bike ride.\n"
    "Which of the options above is faster to get to work?\n"
    "Answer: Option"
)
out = inseq_model.attribute(
    prompt,
    prompt + "1",
    attributed_fn="contrast_prob_diff",
    contrast_targets=prompt + "2",
)
```

Refer to the doc guide for more details.
## 🔍 New Step Functions and Attribution Methods (#182, #222, #223)

The following step functions were added as pre-registered in this release:

- `logits`: Logits of the target token.
- `contrast_logits` / `contrast_prob`: Logits/probabilities of the target token when different contrastive inputs are provided to the model. Equivalent to `logits` / `probability` when no contrastive inputs are provided.
- `pcxmi`: Pointwise Contextual Cross-Mutual Information (P-CXMI) for the target token given original and contrastive contexts (Yin et al. 2021).
- `kl_divergence`: KL divergence of the predictive distribution given original and contrastive contexts. Can be restricted to the most likely target token options using the `top_k` and `top_p` parameters.
- `in_context_pvi`: In-context Pointwise V-usable Information (PVI) to measure the amount of contextual information used in model predictions (Lu et al. 2023).
- `top_p_size`: The number of tokens with cumulative probability greater than `top_p` in the predictive distribution of the model.
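For intuition about the contrastive quantities above, here is a toy pure-Python computation of one common formulation of P-CXMI (log-ratio of the target token's probability with vs. without the extra context) and of the KL divergence between two next-token distributions. Names and numbers are illustrative, not Inseq's implementation:

```python
import math

def pcxmi(p_contextual, p_contrastive):
    """P-CXMI for a single target token: log-ratio of its probability in the
    original (contextual) vs. contrastive setting. Positive values mean the
    context made the token more likely."""
    return math.log(p_contextual / p_contrastive)

def kl_divergence(p, q):
    """KL(p || q) between two next-token distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token probabilities with and without preceding context:
print(round(pcxmi(0.8, 0.4), 4))  # 0.6931, i.e. log 2: context doubled the probability
print(round(kl_divergence([0.8, 0.2], [0.4, 0.6]), 4))
```

In Inseq these quantities are computed per generation step from the model's actual predictive distributions, with the contrastive distribution coming from the `contrast_sources` / `contrast_targets` inputs.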
The following attribution method was also added:

- `sequential_integrated_gradients`: Sequential Integrated Gradients, a simple but effective method for explaining language models (Enguehard, 2023).
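For intuition about the integrated-gradients family that Sequential IG builds on, here is a minimal sketch of plain integrated gradients on a scalar function. It is illustrative only; Sequential IG itself iterates this idea word-by-word over a text input, which this toy example does not cover:

```python
def integrated_gradient(f_grad, x, baseline=0.0, steps=100):
    """Approximate the integrated gradient of a scalar function along the
    straight path from `baseline` to `x` using a midpoint Riemann sum."""
    total = 0.0
    for i in range(steps):
        alpha = (i + 0.5) / steps
        total += f_grad(baseline + alpha * (x - baseline))
    return (x - baseline) * total / steps

# For f(x) = x**2 (gradient 2x), the completeness axiom says the attribution
# should equal f(x) - f(baseline) = 9 - 0 = 9.
ig = integrated_gradient(lambda z: 2 * z, x=3.0)
print(ig)  # 9.0 up to numerical precision
```

The same completeness property holds for the multi-dimensional case, where the attribution is distributed across input features such as token embeddings.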
## 💥 Breaking Changes

- The `contrast_ids` and `contrast_attention_mask` parameters of `model.attribute` for contrastive step functions and attribution targets are deprecated in favor of `contrast_sources` and `contrast_targets`.
- Extraction and aggregation of attention weights from the `attention` method is now handled post-hoc via `Aggregator` classes, making it uniform with the API adopted for other attribution methods.
## All Merged PRs

### 🚀 Features

- Attributed behavior for contrastive step functions (#228) @gsarti
- Step functions fixes, add `in_context_pvi` (#223) @gsarti
- Add Sequential IG method (#222) @gsarti
- Allow contrastive attribution with shorter contrastive targets (#207) @gsarti
- Add `top_p_size` step fn, `StepFunctionArgs` class (#206) @gsarti
- Support `petals` distributed model classes (#205) @gsarti
- Custom alignment of `contrast_targets` for contrastive attribution methods (#195) @gsarti
- Tokens diff view for contrastive attribution methods (#193) @gsarti
- Handle `.to` for 4-bit quantized models (#186) @g8a9
- Aggregation functions, named aggregators, contrastive context step functions, `inseq.explain` (#182) @gsarti
- Target prefix-constrained generation (#172) @gsarti
### 🔧 Fixes & Refactoring

- Bump dependencies, update version and readme (#236) @gsarti
- Add optional jax group to enforce compatible jaxlib version (#235) @carschno
- Minor fixes (#233) @gsarti
- Migrate from torchtyping to jaxtyping (#226) @carschno
- Fix command for installing pre-commit hooks (#229) @carschno
- Remove `max_input_length` from `model.encode` (#227) @gsarti
- Migrate to `ruff format` (#225) @gsarti
- Remove `contrast_target_prefixes` from contrastive step functions (#224) @gsarti
- Fix LIME and Occlusion outputs (#220) @gsarti
- Add model config (#216) @gsarti
- Fix tokenization space cleanup (#215) @gsarti
- Support `ContiguousSpanAggregation` when `attr_pos_start != 0` (#213) @gsarti
- Fix `merge_attributions` (#210) @DanielSc4
- Fix attribution remapping for decoder-only models (#204) @gsarti
- Remove forced seed in attribution (#199) @gsarti
- Fix `get_scores_dict` for duplicate tokens (#192) @gsarti
- Fix `get_scores_dicts` for non-initial `attr_pos_start` (#187) @gsarti
- Fix batching in generate (#184) @gsarti
- Generalize forward pass management with `InputFormatter` classes (#180) @gsarti
- Replaced type definitions for `PreTrainedTokenizer` with `PreTrainedTokenizerBase` (#179) @lsickert
### 📝 Documentation and Tutorials

- Update tutorial to contrastive attribution changes (#231) @gsarti
- Improved quickstart documentation (#201) @gsarti
- Add example tutorial (#196) @gsarti
- Fix Locate GPT-2 Knowledge tutorial in docs (#174) @gsarti
- Minor fixes to links and docs (#171) @gsarti
- Add `tuned-lens` integration tutorial to docs (#169) @gsarti
- Migrate docs to `furo` (#168) @gsarti
## 👥 List of contributors

@gsarti, @DanielSc4, @carschno, @g8a9 and @lsickert