AutoModel class for image-text-to-text models #32042

Open
merveenoyan opened this issue Jul 18, 2024 · 3 comments
Labels
Feature request (Request for a new feature)

Comments

@merveenoyan
Contributor

merveenoyan commented Jul 18, 2024

Feature request

It would be nice to have a standard AutoModel class for image-text-to-text models (since @molbap is standardizing the processor).

Motivation

@NielsRogge noticed that in model repositories the automatic snippets fall back to AutoModelForPreTraining, because these models don't exist in PIPELINE_TAGS_AND_AUTO_MODELS (due to the lack of an AutoClass). More importantly, it would be nice to be able to load these models through a single class.
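To make the fallback concrete, here is a minimal sketch of the kind of lookup described above. It is not the actual snippet-generation code: the table contents, `TOY_PIPELINE_TAGS_AND_AUTO_MODELS`, `FALLBACK_AUTO_CLASS`, and `auto_class_for_snippet` are all hypothetical names made up for illustration, and the class names are handled as plain strings.

```python
# Toy stand-in for a (pipeline tag, auto class name) table.
# The entries are illustrative only and do not mirror the real table.
TOY_PIPELINE_TAGS_AND_AUTO_MODELS = [
    ("image-classification", "AutoModelForImageClassification"),
    ("text-generation", "AutoModelForCausalLM"),
]

# Assumed fallback, matching the behavior reported in this issue.
FALLBACK_AUTO_CLASS = "AutoModelForPreTraining"


def auto_class_for_snippet(pipeline_tag, table=TOY_PIPELINE_TAGS_AND_AUTO_MODELS):
    """Return the auto class name registered for a tag, or the fallback."""
    for tag, auto_class in table:
        if tag == pipeline_tag:
            return auto_class
    # No entry for this tag: the snippet ends up with the generic class.
    return FALLBACK_AUTO_CLASS


print(auto_class_for_snippet("text-generation"))
print(auto_class_for_snippet("image-text-to-text"))
```

Under these assumptions, a tag without a table entry (here `image-text-to-text`) always resolves to the generic fallback, which is the symptom described above.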

Your contribution

I haven't checked what it takes to implement an AutoClass when the model classes for the same task go by different names, but if this is decided on, I don't mind looking into it and taking a stab.
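As a rough illustration of the idea (not how the library implements it), a single entry point can dispatch on a model type string to classes that follow different naming schemes. Everything below is a toy: `ToyModelAForConditionalGeneration`, `ToyModelBForVision2Seq`, `TOY_IMAGE_TEXT_TO_TEXT_MAPPING`, and `ToyAutoModelForImageTextToText` are hypothetical names, and the config is assumed to be a plain dict with a `model_type` key.

```python
class ToyModelAForConditionalGeneration:
    """Hypothetical model class using one naming scheme."""

    def __init__(self, config):
        self.config = config


class ToyModelBForVision2Seq:
    """Hypothetical model class using a different naming scheme."""

    def __init__(self, config):
        self.config = config


# One mapping per task: model type -> concrete class, whatever its name.
TOY_IMAGE_TEXT_TO_TEXT_MAPPING = {
    "toy-a": ToyModelAForConditionalGeneration,
    "toy-b": ToyModelBForVision2Seq,
}


class ToyAutoModelForImageTextToText:
    """Single entry point that hides the differing class names."""

    @classmethod
    def from_config(cls, config):
        model_type = config["model_type"]
        try:
            model_cls = TOY_IMAGE_TEXT_TO_TEXT_MAPPING[model_type]
        except KeyError:
            raise ValueError(
                f"Unrecognized model type {model_type!r} for this toy auto class"
            ) from None
        return model_cls(config)


model = ToyAutoModelForImageTextToText.from_config({"model_type": "toy-b"})
print(type(model).__name__)
```

In this sketch the differing class names only matter inside the mapping, so callers load every model for the task through the same class.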

@merveenoyan added the Feature request label on Jul 18, 2024
@NielsRogge
Contributor

The first attempt was at #29572, but it is awaiting the standardization of processors, which is tracked at #31911.

@amyeroberts
Collaborator

cc @zucchini-nlp re VLMs

@yonigozlan
Member

Another blocker was that some models need custom processing code to be moved into their processors. I started #32059 and will get to work on checking which models need additional processing to standardize the inputs and outputs :).
