
Slow DiscretizedIntegratedGradientAttribution method, also on GPU #161

Open
MoritzLaurer opened this issue Jan 20, 2023 · 2 comments
MoritzLaurer commented Jan 20, 2023

🐛 Bug Report

Inference on a Google Colab GPU is very slow; there is no significant difference whether the model runs on CUDA or on CPU.

🔬 How To Reproduce

The following model.attribute(...) code runs for around 33 to 47 seconds on both a Colab CPU and GPU. I tried passing the device to the model, and model.device confirms that it is running on cuda, but attributing just two sentences still takes very long. (I don't know the underlying attribution computations well enough to say whether this is expected or should be faster; if it is always this slow, analysing larger corpora seems practically infeasible.)

import inseq
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

print(inseq.list_feature_attribution_methods())
model = inseq.load_model(
    "google/flan-t5-small",
    attribution_method="discretized_integrated_gradients",
    device=device,
)

model.to(device)  # redundant, since device is already passed to load_model

out = model.attribute(
    input_texts=[
        "We were attacked by hackers. Was there a cyber attack?",
        "We were not attacked by hackers. Was there a cyber attack?",
    ],
)

model.device  # confirms the model is on cuda

Environment

  • OS: linux, google colab
  • Python version: Python 3.8.10
  • Inseq version: 0.3.3

Expected behavior

Faster attribution when running on a GPU (CUDA)

(Thanks, by the way, for the fix for returning the per-token scores in a dictionary; the new method works well :) )

@MoritzLaurer MoritzLaurer added the bug Something isn't working label Jan 20, 2023
@gsarti gsarti changed the title Attribute is very slow, also on google colab GPU Slow DiscretizedIntegratedGradientAttribution method, also on GPU Jan 20, 2023
gsarti commented Jan 20, 2023

Hi @MoritzLaurer , thanks for your comment!

The slowness you report is most likely specific to the discretized_integrated_gradients method, since the current implementation builds non-linear interpolation paths sequentially. Issue #113 already tracks a batching bug for this method, and we are in touch with the authors.

In the meantime, I suggest using the more common saliency or integrated_gradients approach, which should be considerably faster on GPU. Bastings et al. 2022 show that Gradient L2 (the default output of saliency in Inseq since v0.3.3) works well in terms of faithfulness on Transformer-based classifiers, so that could be a good starting point! Alternatively, attention attribution only requires forward passes, but it is less principled.

Hope it helps!

@MoritzLaurer
Author

Ok, thanks, I will try the other methods. (Good to know that there might be a fix at some point; in my ad-hoc tests, the discretized_integrated_gradients method seemed to produce the most interpretable attributions.)

@gsarti gsarti added enhancement New feature or request and removed bug Something isn't working labels Feb 7, 2023