fix linear_to_conv2d_map to work with other distilbert model types #4

Open: wants to merge 2 commits into base: main

anentropic

re #3 (comment)

I'd already monkeypatched this in my own project in order to use a QA model, so here's a PR.

Happy to make any tweaks required.
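
For readers unfamiliar with the mapping this PR touches: an `nn.Linear` weight has shape `(out, in)`, while the equivalent 1x1 `nn.Conv2d` weight has shape `(out, in, 1, 1)`, so checkpoint weights need two trailing singleton axes before they can be loaded into a conv-based model. The sketch below is not the code from this PR. It is a minimal, shape-driven way to decide which keys need that reshape, without relying on layer names. The function name `find_linear_to_conv2d_keys` and all key names and sizes in the example are hypothetical.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def find_linear_to_conv2d_keys(
    source_shapes: Dict[str, Shape],
    target_shapes: Dict[str, Shape],
) -> List[str]:
    """Return keys whose 2-D source weight maps onto a 1x1 Conv2d target weight."""
    keys = []
    for key, src in source_shapes.items():
        tgt = target_shapes.get(key)
        if tgt is None:
            # Key is absent from the target model; nothing to reshape.
            continue
        # nn.Linear weight: (out, in); 1x1 nn.Conv2d weight: (out, in, 1, 1).
        if len(src) == 2 and tgt == src + (1, 1):
            keys.append(key)
    return keys


# Hypothetical key names and sizes, for illustration only.
source = {
    "encoder.q_lin.weight": (768, 768),
    "encoder.q_lin.bias": (768,),
    "qa_outputs.weight": (2, 768),
}
target = {
    "encoder.q_lin.weight": (768, 768, 1, 1),
    "encoder.q_lin.bias": (768,),
    "qa_outputs.weight": (2, 768, 1, 1),
}
keys_to_reshape = find_linear_to_conv2d_keys(source, target)
print(keys_to_reshape)
```

With PyTorch, the shape dictionaries could be built as `{k: tuple(v.shape) for k, v in module.state_dict().items()}`, and each selected tensor reshaped with `tensor[:, :, None, None]` before calling `load_state_dict`. Matching on shapes rather than names is a design choice of this sketch; it sidesteps differences in head names between model types, at the cost of needing the target model's shapes up front.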

anentropic force-pushed the fix-linear-conv2-map branch from 181a16d to ee34e9c on April 24, 2023 20:24
@angusfong

Hi @anentropic, thanks for the patch! I tried it on DistilBertForMaskedLM and managed to produce an optimized model with this code:

import transformers
model_name = "distilbert-base-uncased"
baseline_model = transformers.AutoModelForMaskedLM.from_pretrained(
    model_name,
    return_dict=False,
    torchscript=True,
).eval()

from ane_transformers.huggingface import distilbert as ane_distilbert
optimized_model = ane_distilbert.DistilBertForMaskedLM(
    baseline_model.config).eval()
optimized_model.load_state_dict(baseline_model.state_dict())

tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)
tokenized = tokenizer(
    ["Sample input text to trace the model"],
    return_tensors="pt",
    max_length=128,  # token sequence length
    padding="max_length",
)

import torch
traced_optimized_model = torch.jit.trace(
    optimized_model,
    (tokenized["input_ids"], tokenized["attention_mask"])
)

import coremltools as ct
import numpy as np
ane_mlpackage_obj = ct.convert(
    traced_optimized_model,
    convert_to="mlprogram",
    inputs=[
        ct.TensorType(
            f"input_{name}",
            shape=tensor.shape,
            dtype=np.int32,
        ) for name, tensor in tokenized.items()
    ],
    compute_units=ct.ComputeUnit.ALL,
)
out_path = "optimized.mlpackage"
ane_mlpackage_obj.save(out_path)

However, the resulting mlpackage seems to produce outputs with the wrong dimensionality: comparing the baseline and optimized models in Netron shows the difference.
[Screenshot, 2023-07-11 6:42:40 PM: Netron comparison of the baseline and optimized models]

Moreover, I am unable to run a performance test on the new mlpackage in Xcode.
[Screenshot, 2023-07-11 6:35:05 PM: Xcode performance test attempt]

Do you know what may be missing?
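
One way to narrow down a dimensionality mismatch like the one described above is to compare output shapes of the two PyTorch models before converting anything to Core ML. The helper below is a minimal sketch, not part of `ane_transformers`: `shape_mismatches` is a hypothetical name, it only assumes each output exposes a `.shape` attribute (as PyTorch tensors do), and the shapes in the example are made up for illustration.

```python
from types import SimpleNamespace
from typing import Any, List, Sequence, Tuple

Mismatch = Tuple[int, Tuple[int, ...], Tuple[int, ...]]


def shape_mismatches(
    baseline_outputs: Sequence[Any],
    optimized_outputs: Sequence[Any],
) -> List[Mismatch]:
    """List (index, baseline_shape, optimized_shape) for outputs whose shapes differ."""
    mismatches = []
    # zip stops at the shorter sequence, so extra outputs on one side are ignored.
    for i, (base, opt) in enumerate(zip(baseline_outputs, optimized_outputs)):
        base_shape, opt_shape = tuple(base.shape), tuple(opt.shape)
        if base_shape != opt_shape:
            mismatches.append((i, base_shape, opt_shape))
    return mismatches


# Stand-ins for model outputs; these shapes are invented for illustration
# and are not taken from the models discussed in this thread.
baseline_out = (SimpleNamespace(shape=(1, 128, 30522)),)
optimized_out = (SimpleNamespace(shape=(1, 30522, 1, 128)),)

report = shape_mismatches(baseline_out, optimized_out)
for index, base_shape, opt_shape in report:
    print(f"output {index}: baseline {base_shape} vs optimized {opt_shape}")
```

With the models from the snippet above, both could be called on the same tokenized inputs under `torch.no_grad()` and the resulting tuples passed in directly, since `return_dict=False` makes the Hugging Face model return a tuple. If the shapes already differ at this stage, the discrepancy comes from the PyTorch model definition rather than from the Core ML conversion.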

@anentropic
Author

anentropic commented Jul 12, 2023

@angusfong I don't know off the top of my head, and unfortunately it's been a while since I looked at it.

I do remember that I ended up making some more fixes and refactoring. I didn't bother to make another PR here, since there doesn't seem to be anybody handling public contributions to this repo.

Actually, reading through my altered code now, it does look like I may have encountered the same problem. I left a comment in DistilBertForMaskedLM here: https://github.com/anentropic/hft2ane/blob/main/hft2ane/models/distilbert.py#L299
(Note: the repo is WIP and the project is not currently published anywhere except GitHub.)
