# v0.5.0: Tutorial, Better contrastive attribution, 4-bit/Petals support and more
## 📄 New Tutorial and Better Documentation

- A new quickstart tutorial is available in the repository, introducing feature attribution methods and showcasing basic and more advanced Inseq use cases.
- The documentation now uses the Sphinx `furo` theme.
- A new utility function `inseq.explain` was introduced to visualize the documentation associated with the string identifiers used for attribution methods, step functions and aggregators:

```python
import inseq

inseq.explain("saliency")
# >>> Saliency attribution method.
# >>> Reference implementation: https://captum.ai/api/saliency.html
```
## 🔀 More Flexible and Intuitive Contrastive Attribution (#193, #195, #207, #228)

- Contrastive attribution functions now support original and contrastive targets of different lengths. Tokens are right-aligned by default, simplifying usage for studies that use preceding context as the contrastive option.
- Contrastive source and target inputs can be specified as strings in `model.attribute` when using a contrastive step function or attribution target, via the `contrast_sources` and `contrast_targets` arguments (see docs).
- Custom alignments can be provided for contrastive step functions to compare specific step pairs, using the `contrast_targets_alignments` argument of `model.attribute`. Passing `"auto"` uses the multilingual LaBSE encoder to create alignments with the AWESOME approach (useful for generation tasks preserving semantic equivalence, e.g. machine translation).
- The `is_attributed_fn` argument of `StepFunctionBaseArgs` can be used to customize the behavior of step functions in the attributed and regular cases.

Refer to the quickstart tutorial for examples of contrastive attribution.
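To build intuition for the default right-side alignment of targets with different lengths, here is a minimal pure-Python sketch. This is illustrative only, not Inseq's internal implementation, and the token lists are made up:

```python
def right_align_pairs(target_tokens, contrast_tokens):
    """Pair tokens from the end of both sequences, mirroring the idea of
    right-side alignment: the last generation steps are compared, while any
    extra leading context in the longer sequence is left unpaired."""
    n = min(len(target_tokens), len(contrast_tokens))
    return list(zip(target_tokens[-n:], contrast_tokens[-n:]))

# Original target generated with preceding context vs. a shorter
# contrastive target generated without it (hypothetical tokens):
orig = ["<ctx>", "The", "answer", "is", "42"]
contrast = ["answer", "is", "43"]
print(right_align_pairs(orig, contrast))
# [('answer', 'answer'), ('is', 'is'), ('42', '43')]
```

With this alignment, the step function compares `"42"` against `"43"` at the final step instead of mismatching positions from the left.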
## 🤗 Support for Distributed and 4-bit Models (#186, #205)

Towards the goal of democratizing access to interpretability methods for analyzing state-of-the-art models, Inseq now supports attribution of distributed language models from the Petals library and of 4-bit quantized LMs via the `transformers` `bitsandbytes` integration (using `load_in_4bit=True`), with the added flexibility of the Inseq API.

Example of contrastive gradient attribution of a distributed LLaMA 65B model:

```python
import inseq
from petals import AutoDistributedModelForCausalLM

model_name = "enoch/llama-65b-hf"
model = AutoDistributedModelForCausalLM.from_pretrained(model_name).cuda()
inseq_model = inseq.load_model(model, "saliency")
prompt = (
    "Option 1: Take a 50 minute bus, then a half hour train, and finally a 10 minute bike ride.\n"
    "Option 2: Take a 10 minute bus, then an hour train, and finally a 30 minute bike ride.\n"
    "Which of the options above is faster to get to work?\n"
    "Answer: Option"
)
out = inseq_model.attribute(
    prompt,
    prompt + "1",
    attributed_fn="contrast_prob_diff",
    contrast_targets=prompt + "2",
)
```

Refer to the doc guide for more details.
## 🔍 New Step Functions and Attribution Methods (#182, #222, #223)

The following step functions were added as pre-registered in this release:

- `logits`: Logits of the target token.
- `contrast_logits` / `contrast_prob`: Logits/probabilities of the target token when different contrastive inputs are provided to the model. Equivalent to `logits` / `probability` when no contrastive inputs are provided.
- `pcxmi`: Pointwise Contextual Cross-Mutual Information (P-CXMI) for the target token given original and contrastive contexts (Yin et al. 2021).
- `kl_divergence`: KL divergence of the predictive distribution given original and contrastive contexts. Can be restricted to the most likely target token options using the `top_k` and `top_p` parameters.
- `in_context_pvi`: In-context Pointwise V-usable Information (PVI) to measure the amount of contextual information used in model predictions (Lu et al. 2023).
- `top_p_size`: The number of tokens with cumulative probability greater than `top_p` in the predictive distribution of the model.
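For intuition about the contrastive quantities above, here is a toy pure-Python computation of one common formulation of P-CXMI (log-ratio of the target token's probability with vs. without the extra context) and of the KL divergence between two next-token distributions. Names and numbers are illustrative, not Inseq's implementation:

```python
import math

def pcxmi(p_contextual, p_contrastive):
    """P-CXMI for a single target token: log-ratio of its probability in the
    original (contextual) vs. contrastive setting. Positive values mean the
    context made the token more likely."""
    return math.log(p_contextual / p_contrastive)

def kl_divergence(p, q):
    """KL(p || q) between two next-token distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token probabilities with and without preceding context:
print(round(pcxmi(0.8, 0.4), 4))  # 0.6931, i.e. log 2: context doubled the probability
print(round(kl_divergence([0.8, 0.2], [0.4, 0.6]), 4))
```

In Inseq these quantities are computed per generation step from the model's actual predictive distributions, with the contrastive distribution coming from the `contrast_sources` / `contrast_targets` inputs.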
The following attribution method was also added:

- `sequential_integrated_gradients`: Sequential Integrated Gradients, a simple but effective method for explaining language models (Enguehard, 2023).
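For intuition about the integrated-gradients family that Sequential IG builds on, here is a minimal sketch of plain integrated gradients on a scalar function. It is illustrative only; Sequential IG itself iterates this idea word-by-word over a text input, which this toy example does not cover:

```python
def integrated_gradient(f_grad, x, baseline=0.0, steps=100):
    """Approximate the integrated gradient of a scalar function along the
    straight path from `baseline` to `x` using a midpoint Riemann sum."""
    total = 0.0
    for i in range(steps):
        alpha = (i + 0.5) / steps
        total += f_grad(baseline + alpha * (x - baseline))
    return (x - baseline) * total / steps

# For f(x) = x**2 (gradient 2x), the completeness axiom says the attribution
# should equal f(x) - f(baseline) = 9 - 0 = 9.
ig = integrated_gradient(lambda z: 2 * z, x=3.0)
print(ig)  # 9.0 up to numerical precision
```

The same completeness property holds for the multi-dimensional case, where the attribution is distributed across input features such as token embeddings.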
## 💥 Breaking Changes

- The `contrast_ids` and `contrast_attention_mask` parameters of `model.attribute` for contrastive step functions and attribution targets are deprecated in favor of `contrast_sources` and `contrast_targets`.
- Extraction and aggregation of attention weights from the `attention` method is now handled post-hoc via `Aggregator` classes, making it uniform with the API adopted for other attribution methods.
## All Merged PRs

### 🚀 Features

- Attributed behavior for contrastive step functions (#228) @gsarti
- Step functions fixes, add `in_context_pvi` (#223) @gsarti
- Add Sequential IG method (#222) @gsarti
- Allow contrastive attribution with shorter contrastive targets (#207) @gsarti
- Add `top_p_size` step fn, `StepFunctionArgs` class (#206) @gsarti
- Support `petals` distributed model classes (#205) @gsarti
- Custom alignment of `contrast_targets` for contrastive attribution methods (#195) @gsarti
- Tokens diff view for contrastive attribution methods (#193) @gsarti
- Handle `.to` for 4-bit quantized models (#186) @g8a9
- Aggregation functions, named aggregators, contrastive context step functions, `inseq.explain` (#182) @gsarti
- Target prefix-constrained generation (#172) @gsarti
### 🔧 Fixes & Refactoring

- Bump dependencies, update version and readme (#236) @gsarti
- Add optional jax group to enforce compatible jaxlib version (#235) @carschno
- Minor fixes (#233) @gsarti
- Migrate from torchtyping to jaxtyping (#226) @carschno
- Fix command for installing pre-commit hooks (#229) @carschno
- Remove `max_input_length` from `model.encode` (#227) @gsarti
- Migrate to `ruff format` (#225) @gsarti
- Remove `contrast_target_prefixes` from contrastive step functions (#224) @gsarti
- Fix LIME and Occlusion outputs (#220) @gsarti
- Add model config (#216) @gsarti
- Fix tokenization space cleanup (#215) @gsarti
- Support `ContiguousSpanAggregation` when `attr_pos_start != 0` (#213) @gsarti
- Fix `merge_attributions` (#210) @DanielSc4
- Fix attribution remapping for decoder-only models (#204) @gsarti
- Remove forced seed in attribution (#199) @gsarti
- Fix `get_scores_dict` for duplicate tokens (#192) @gsarti
- Fix `get_scores_dicts` for non-initial `attr_pos_start` (#187) @gsarti
- Fix batching in generate (#184) @gsarti
- Generalize forward pass management with `InputFormatter` classes (#180) @gsarti
- Replaced type definitions for `PreTrainedTokenizer` with `PreTrainedTokenizerBase` (#179) @lsickert
### 📝 Documentation and Tutorials

- Update tutorial to contrastive attribution changes (#231) @gsarti
- Improved quickstart documentation (#201) @gsarti
- Add example tutorial (#196) @gsarti
- Fix Locate GPT-2 Knowledge tutorial in docs (#174) @gsarti
- Minor fixes to links and docs (#171) @gsarti
- Add `tuned-lens` integration tutorial to docs (#169) @gsarti
- Migrate docs to `furo` (#168) @gsarti
## 👥 List of contributors

@gsarti, @DanielSc4, @carschno, @g8a9 and @lsickert