Adding mplugdocowl #31792

Open — wants to merge 98 commits into base: main

Commits (98)
b311e5e
feat: adding mplugdocowl
danaaubakirova May 27, 2024
aa0ec04
feat: added separate file for the mPLUGDocOwl language model
danaaubakirova May 27, 2024
cc7e9b3
feat: added vision encoder for mplugdocowl
danaaubakirova May 27, 2024
204daba
fix: changed the attention mechanism in clip vision, renamed to MPLUG…
danaaubakirova May 28, 2024
6e144e5
feat: added hreducer and new things in config, changed vision embeddi…
danaaubakirova May 28, 2024
9f94d2c
fix: converted hreducer module related tensors to contiguous
danaaubakirova May 29, 2024
19ffc83
feat: added shape adaptive module
danaaubakirova May 31, 2024
85dce8d
feat: added new image_processing script
danaaubakirova Jun 3, 2024
0f5fb87
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jun 4, 2024
53aca6d
fix: small fix
danaaubakirova Jun 4, 2024
cb25b05
Merge branch 'mplugdocowl' of github.com:danaaubakirova/transformers …
danaaubakirova Jun 4, 2024
1debae3
feat: added the additional keys to the output of the data
danaaubakirova Jun 4, 2024
66b849d
feat: made major modifications to image_processing script. added the …
danaaubakirova Jun 6, 2024
1716668
feat: refactored shape_adaptive_cropping function and resolved the is…
danaaubakirova Jun 10, 2024
452ebf5
feat: testing forward
danaaubakirova Jun 11, 2024
1e7f386
feat: corrected image tag
danaaubakirova Jun 12, 2024
8577f35
fix: attention mask handling is fixed, .forward works
danaaubakirova Jun 13, 2024
f546fbc
feat: updates in vision architecture
danaaubakirova Jun 18, 2024
edc358d
Update src/transformers/models/mplugdocowl/language_modeling_mplugdoc…
danaaubakirova Jun 19, 2024
9003d59
fix: renaming the model
danaaubakirova Jun 19, 2024
9f688d9
grand fix: fixed hreducer, the firstgenerated token is correct. forw…
danaaubakirova Jun 21, 2024
30c8a2b
fix: need to fix prepare_inputs_for_generation()
danaaubakirova Jun 24, 2024
5483f82
fix: fixed prepare_inputs_for_generation()
danaaubakirova Jun 24, 2024
413ddad
Merge branch 'main' into mplugdocowl
danaaubakirova Jun 25, 2024
7546063
testing phase
danaaubakirova Jun 25, 2024
e3cc222
removed copied from ..
danaaubakirova Jun 25, 2024
4f4f219
small fixes
danaaubakirova Jun 25, 2024
661bd75
removed some things from the config
danaaubakirova Jun 26, 2024
8aded38
small fixes
danaaubakirova Jun 27, 2024
19e0a35
update
danaaubakirova Jun 27, 2024
8300463
small fix
danaaubakirova Jun 27, 2024
f0c87d8
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
b75b2b9
Update src/transformers/models/mplugdocowl/modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
2aae5ca
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
105b5e1
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
7a2f434
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
205e345
Update tests/models/mplugdocowl/test_modeling_mplugdocowl.py
danaaubakirova Jun 27, 2024
0f5ba22
Update src/transformers/models/mplugdocowl/processing_mplugdocowl.py
danaaubakirova Jun 27, 2024
c0e241a
Update src/transformers/models/mplugdocowl/processing_mplugdocowl.py
danaaubakirova Jun 27, 2024
1555e04
Update src/transformers/models/mplugdocowl/processing_mplugdocowl.py
danaaubakirova Jun 27, 2024
219d866
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jun 27, 2024
4600f75
Update src/transformers/models/mplugdocowl/convert_mplugdocowl_weight…
danaaubakirova Jun 27, 2024
cb55d49
Update src/transformers/models/mplugdocowl/language_modeling_mplugdoc…
danaaubakirova Jun 27, 2024
c4c711c
model card is updated. tips to be added
danaaubakirova Jun 28, 2024
3007178
fix
danaaubakirova Jun 28, 2024
cdcf2f6
added documentation,updated rotary embedding function, added ModelTest
danaaubakirova Jun 28, 2024
cc7681f
updated
danaaubakirova Jul 1, 2024
b77e2ba
processing updates for batches
danaaubakirova Jul 4, 2024
6c42032
fixes
danaaubakirova Jul 4, 2024
c4425be
removed 'copied from' for language models
danaaubakirova Jul 4, 2024
f20ea69
check_repo fixes
danaaubakirova Jul 4, 2024
59e34b6
resolving conflicts with main
danaaubakirova Jul 4, 2024
4c63d84
fix
danaaubakirova Jul 4, 2024
1af7e52
update
danaaubakirova Jul 4, 2024
fe70171
Merge branch 'main' into adding_mplugdocowl
danaaubakirova Jul 4, 2024
3237828
resolving conflicts
danaaubakirova Jul 4, 2024
4a67ed2
added mplugdocowl to image_proc_auto
danaaubakirova Jul 4, 2024
4c53f6c
fix
danaaubakirova Jul 4, 2024
a5c28c1
updates to image_processing and tokenizer
danaaubakirova Jul 5, 2024
3e278cc
update
danaaubakirova Jul 8, 2024
1ab8c2a
new
danaaubakirova Jul 9, 2024
9d16fca
generation related changes
danaaubakirova Jul 11, 2024
0ad9b7a
changes to test
danaaubakirova Jul 11, 2024
560602a
fixes
danaaubakirova Jul 11, 2024
6285349
Update src/transformers/models/mplugdocowl/language_modeling_mplugdoc…
danaaubakirova Jul 16, 2024
012a801
Update src/transformers/models/mplugdocowl/language_modeling_mplugdoc…
danaaubakirova Jul 16, 2024
e4d29d6
Update src/transformers/models/mplugdocowl/language_modeling_mplugdoc…
danaaubakirova Jul 16, 2024
fdec794
Update src/transformers/models/mplugdocowl/modeling_mplugdocowl.py
danaaubakirova Jul 16, 2024
229fd31
Update src/transformers/models/mplugdocowl/processing_mplugdocowl.py
danaaubakirova Jul 16, 2024
6dc4776
feedback fixes 1
danaaubakirova Jul 16, 2024
a0ab134
feedback fixes 2
danaaubakirova Jul 16, 2024
e9a4b2b
fixes after testing and running make fixup
danaaubakirova Jul 16, 2024
dd465f8
fixes tests passed
danaaubakirova Jul 17, 2024
83dd273
nit
danaaubakirova Jul 17, 2024
e78c3e3
small fix
danaaubakirova Jul 17, 2024
8c27f9b
Merge branch 'main' into adding_mplugdocowl
danaaubakirova Jul 17, 2024
b10658c
small fix
danaaubakirova Jul 17, 2024
3706879
doc fix
danaaubakirova Jul 17, 2024
91113e3
fixes related to doc
danaaubakirova Jul 17, 2024
b7a61df
nit
danaaubakirova Jul 17, 2024
102f5f6
fixes
danaaubakirova Jul 17, 2024
87c40b3
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jul 17, 2024
3aa4635
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jul 17, 2024
4b87998
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jul 17, 2024
da5411d
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jul 17, 2024
cb02ee6
Update src/transformers/models/mplugdocowl/image_processing_mplugdoco…
danaaubakirova Jul 17, 2024
47f552d
fix of the accepted commits.
danaaubakirova Jul 17, 2024
c2837ae
fix
danaaubakirova Jul 17, 2024
6a48b47
update, aded kwargs and support for quantization
danaaubakirova Jul 19, 2024
49acffb
update
danaaubakirova Jul 22, 2024
f2fed0d
Merge branch 'main' into adding_mplugdocowl
danaaubakirova Jul 23, 2024
dba858e
resolving comments, small fixes
danaaubakirova Jul 31, 2024
387beb9
fixup
danaaubakirova Jul 31, 2024
96d5c6e
Merge branch 'main' into adding_mplugdocowl
danaaubakirova Jul 31, 2024
7f0a993
Merge branch 'main' into adding_mplugdocowl
danaaubakirova Aug 1, 2024
389d049
copies fix
danaaubakirova Aug 1, 2024
8b5451a
doc fix
danaaubakirova Aug 1, 2024
cddfbdf
add expansion logic in processors
zucchini-nlp Sep 3, 2024
2 changes: 2 additions & 0 deletions docs/source/en/_toctree.yml
@@ -818,6 +818,8 @@
title: MatCha
- local: model_doc/mgp-str
title: MGP-STR
- local: model_doc/mplugdocowl
title: mPLUGDocOwl
- local: model_doc/nougat
title: Nougat
- local: model_doc/oneformer
1 change: 1 addition & 0 deletions docs/source/en/index.md
@@ -214,6 +214,7 @@ Flax), PyTorch, and/or TensorFlow.
| [MobileNetV2](model_doc/mobilenet_v2) | ✅ | ❌ | ❌ |
| [MobileViT](model_doc/mobilevit) | ✅ | ✅ | ❌ |
| [MobileViTV2](model_doc/mobilevitv2) | ✅ | ❌ | ❌ |
| [mPLUGDocOwl](model_doc/mplugdocowl) | ✅ | ❌ | ❌ |
| [MPNet](model_doc/mpnet) | ✅ | ✅ | ❌ |
| [MPT](model_doc/mpt) | ✅ | ❌ | ❌ |
| [MRA](model_doc/mra) | ✅ | ❌ | ❌ |
75 changes: 75 additions & 0 deletions docs/source/en/model_doc/mplugdocowl.md
@@ -0,0 +1,75 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->

# mPLUG-DocOwl1.5

## Overview

The mPLUG-DocOwl1.5 model was proposed in [mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding](https://arxiv.org/pdf/2403.12895) by Anwen Hu, Haiyang Xu, Jiabo Ye, Ming Yan, Liang Zhang, Bo Zhang, Chen Li, Ji Zhang, Qin Jin, Fei Huang, and Jingren Zhou.

mPLUG-DocOwl1.5 is a multimodal model designed for text-rich images. It features the H-Reducer vision-to-text module, which preserves spatial relationships and efficiently processes high-resolution document images by merging horizontally adjacent visual features.
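The horizontal-merging idea behind H-Reducer can be illustrated with a small, dependency-free sketch. This is a toy stand-in for the paper's strided 1×4 convolution (approximated here by concatenating neighbouring patch features), not the PR's actual implementation:

```python
def h_reduce(patch_grid, merge=4):
    """Merge `merge` horizontally adjacent patch features in each row.

    patch_grid: list of rows, each row a list of feature vectors (lists).
    Returns a grid with the same number of rows but W // merge columns,
    where each merged cell concatenates `merge` horizontal neighbours.
    Any remainder columns that do not fill a full group are dropped in
    this toy version.
    """
    reduced = []
    for row in patch_grid:
        reduced.append([
            sum((row[i + k] for k in range(merge)), [])  # concatenate neighbours
            for i in range(0, len(row) - len(row) % merge, merge)
        ])
    return reduced

# a 2x8 grid of 3-dim patch features becomes a 2x2 grid of 12-dim features
grid = [[[float(r), float(c), 1.0] for c in range(8)] for r in range(2)]
out = h_reduce(grid)
```

The row structure is untouched while the sequence length shrinks by the merge factor, which is what lets the LLM see high-resolution layouts with far fewer visual tokens.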

The model employs Unified Structure Learning with structure-aware parsing tasks and multi-grained text localization tasks, teaching it to parse text using line feeds, spaces, and extended Markdown syntax, which enhances the model's ability to correlate text with specific positions in the image.

DocOwl 1.5 undergoes a two-stage training process: Unified Structure Learning followed by Multi-task Tuning among Downstream Tasks. The high-quality DocReason25K dataset boosts reasoning abilities, allowing DocOwl 1.5-Chat to balance concise answers and detailed explanations.

The abstract from the paper is the following:

*Structure information is critical for understanding the semantics of text-rich images, such as documents, tables, and charts. Existing Multimodal Large Language Models (MLLMs) for Visual Document Understanding are equipped with text recognition ability but lack general structure understanding abilities for text-rich document images. In this work, we emphasize the importance of structure information in Visual Document Understanding and propose the Unified Structure Learning to boost the performance of MLLMs. Our Unified Structure Learning comprises structure-aware parsing tasks and multi-grained text localization tasks across 5 domains: document, webpage, table, chart, and natural image. To better encode structure information, we design a simple and effective vision-to-text module H-Reducer, which can not only maintain the layout information but also reduce the length of visual features by merging horizontal adjacent patches through convolution, enabling the LLM to understand high-resolution images more efficiently. Furthermore, by constructing structure-aware text sequences and multi-grained pairs of texts and bounding boxes for publicly available text-rich images, we build a comprehensive training set DocStruct4M to support structure learning. Finally, we construct a small but high-quality reasoning tuning dataset DocReason25K to trigger the detailed explanation ability in the document domain. Our model DocOwl 1.5 achieves state-of-the-art performance on 10 visual document understanding benchmarks, improving the SOTA performance of MLLMs with a 7B LLM by more than 10 points in 5/10 benchmarks.*

Tips:

- DocOwl-Chat: for more accurate and stable generation, set `do_sample=False`. This checkpoint performs better than DocOwl-Omni on most samples.
- DocOwl-Omni: for optimal performance, use `do_sample=True` and `top_p=0.7`, as recommended in the original code.

This model was contributed by [danaaubakirova](https://huggingface.co/danaaubakirova).
The original code can be found [here](https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/DocOwl1.5).
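The commit history above also adds a shape-adaptive cropping module to the image processor. Its core selection step — choosing, from a fixed set of grid anchors, the crop layout whose aspect ratio best matches the input image — can be sketched roughly as follows (the anchor set and the log-ratio scoring below are illustrative assumptions, not the PR's exact code):

```python
import math

def pick_grid(width, height,
              anchors=((1, 1), (1, 2), (2, 1), (2, 2), (1, 4), (4, 1), (2, 4), (4, 2), (3, 3))):
    """Pick the (rows, cols) crop grid whose aspect ratio best matches the image.

    Each anchor is scored by the distance, in log space, between the image
    aspect ratio (width / height) and the grid aspect ratio (cols / rows);
    the anchor closest in shape to the image wins.
    """
    img_ratio = width / height
    return min(anchors, key=lambda rc: abs(math.log(img_ratio) - math.log(rc[1] / rc[0])))

# a wide, banner-like page favours a single row of crops
print(pick_grid(1600, 400))  # -> (1, 4)
```

Each selected cell is then cropped and encoded separately, which keeps a fixed per-crop resolution while covering documents of very different shapes.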


## MPLUGDocOwlConfig

[[autodoc]] MPLUGDocOwlConfig

## MPLUGDocOwlImageProcessor
[[autodoc]] MPLUGDocOwlImageProcessor

## MPLUGDocOwlProcessor
[[autodoc]] MPLUGDocOwlProcessor

## MPLUGDocOwlHReducer
[[autodoc]] MPLUGDocOwlHReducer

## MPLUGDocOwlForCausalLM
[[autodoc]] MPLUGDocOwlForCausalLM
- forward

## MPLUGDocOwlLanguageModel
[[autodoc]] MPLUGDocOwlLanguageModel

## MPLUGDocOwlPreTrainedLanguageModel
[[autodoc]] MPLUGDocOwlPreTrainedLanguageModel

## MPLUGDocOwlVisionModel
[[autodoc]] MPLUGDocOwlVisionModel

## MPLUGDocOwlVisionTransformer
[[autodoc]] MPLUGDocOwlVisionTransformer

## MPLUGDocOwlForConditionalGeneration

[[autodoc]] MPLUGDocOwlForConditionalGeneration
- forward
Binary file added examples_multi_col_60204.png
34 changes: 34 additions & 0 deletions src/transformers/__init__.py
@@ -576,6 +576,10 @@
"models.mobilenet_v2": ["MobileNetV2Config"],
"models.mobilevit": ["MobileViTConfig"],
"models.mobilevitv2": ["MobileViTV2Config"],
"models.mplugdocowl": [
"MPLUGDocOwlConfig",
"MPLUGDocOwlProcessor",
],
"models.mpnet": [
"MPNetConfig",
"MPNetTokenizer",
@@ -1170,6 +1174,7 @@
_import_structure["models.mobilenet_v1"].extend(["MobileNetV1FeatureExtractor", "MobileNetV1ImageProcessor"])
_import_structure["models.mobilenet_v2"].extend(["MobileNetV2FeatureExtractor", "MobileNetV2ImageProcessor"])
_import_structure["models.mobilevit"].extend(["MobileViTFeatureExtractor", "MobileViTImageProcessor"])
_import_structure["models.mplugdocowl"].extend(["MPLUGDocOwlImageProcessor"])
_import_structure["models.nougat"].append("NougatImageProcessor")
_import_structure["models.oneformer"].extend(["OneFormerImageProcessor"])
_import_structure["models.owlv2"].append("Owlv2ImageProcessor")
@@ -2667,6 +2672,19 @@
"MobileViTV2PreTrainedModel",
]
)
_import_structure["models.mplugdocowl"].extend(
[
"MPLUGDocOwlAttention",
"MPLUGDocOwlForCausalLM",
"MPLUGDocOwlForConditionalGeneration",
"MPLUGDocOwlHReducer",
"MPLUGDocOwlLanguageModel",
"MPLUGDocOwlPreTrainedLanguageModel",
"MPLUGDocOwlPreTrainedModel",
"MPLUGDocOwlVisionModel",
"MPLUGDocOwlVisionTransformer",
]
)
_import_structure["models.mpnet"].extend(
[
"MPNetForMaskedLM",
@@ -5266,6 +5284,10 @@
from .models.mobilevitv2 import (
MobileViTV2Config,
)
from .models.mplugdocowl import (
MPLUGDocOwlConfig,
MPLUGDocOwlProcessor,
)
from .models.mpnet import (
MPNetConfig,
MPNetTokenizer,
@@ -5895,6 +5917,7 @@
MobileNetV2ImageProcessor,
)
from .models.mobilevit import MobileViTFeatureExtractor, MobileViTImageProcessor
from .models.mplugdocowl import MPLUGDocOwlImageProcessor
from .models.nougat import NougatImageProcessor
from .models.oneformer import OneFormerImageProcessor
from .models.owlv2 import Owlv2ImageProcessor
@@ -7122,6 +7145,17 @@
MobileViTV2Model,
MobileViTV2PreTrainedModel,
)
from .models.mplugdocowl import (
MPLUGDocOwlAttention,
MPLUGDocOwlForCausalLM,
MPLUGDocOwlForConditionalGeneration,
MPLUGDocOwlHReducer,
MPLUGDocOwlLanguageModel,
MPLUGDocOwlPreTrainedLanguageModel,
MPLUGDocOwlPreTrainedModel,
MPLUGDocOwlVisionModel,
MPLUGDocOwlVisionTransformer,
)
from .models.mpnet import (
MPNetForMaskedLM,
MPNetForMultipleChoice,
1 change: 1 addition & 0 deletions src/transformers/models/__init__.py
@@ -152,6 +152,7 @@
mobilenet_v2,
mobilevit,
mobilevitv2,
mplugdocowl,
mpnet,
mpt,
mra,
2 changes: 2 additions & 0 deletions src/transformers/models/auto/configuration_auto.py
@@ -169,6 +169,7 @@
("mobilenet_v2", "MobileNetV2Config"),
("mobilevit", "MobileViTConfig"),
("mobilevitv2", "MobileViTV2Config"),
("mplugdocowl", "MPLUGDocOwlConfig"),
("mpnet", "MPNetConfig"),
("mpt", "MptConfig"),
("mra", "MraConfig"),
@@ -461,6 +462,7 @@
("mobilenet_v2", "MobileNetV2"),
("mobilevit", "MobileViT"),
("mobilevitv2", "MobileViTV2"),
("mplugdocowl", "mPLUGDocOwl"),
("mpnet", "MPNet"),
("mpt", "MPT"),
("mra", "MRA"),
1 change: 1 addition & 0 deletions src/transformers/models/auto/image_processing_auto.py
@@ -106,6 +106,7 @@
("mobilenet_v2", ("MobileNetV2ImageProcessor",)),
("mobilevit", ("MobileViTImageProcessor",)),
("mobilevitv2", ("MobileViTImageProcessor",)),
("mplugdocowl", ("MPLUGDocOwlImageProcessor",)),
("nat", ("ViTImageProcessor", "ViTImageProcessorFast")),
("nougat", ("NougatImageProcessor",)),
("oneformer", ("OneFormerImageProcessor",)),
2 changes: 2 additions & 0 deletions src/transformers/models/auto/modeling_auto.py
@@ -312,6 +312,7 @@
("mega", "MegaForMaskedLM"),
("megatron-bert", "MegatronBertForPreTraining"),
("mobilebert", "MobileBertForPreTraining"),
("mplugdocowl", "MPLUGDocOwlForConditionalGeneration"),
("mpnet", "MPNetForMaskedLM"),
("mpt", "MptForCausalLM"),
("mra", "MraForMaskedLM"),
@@ -711,6 +712,7 @@
("llava", "LlavaForConditionalGeneration"),
("llava-next-video", "LlavaNextVideoForConditionalGeneration"),
("llava_next", "LlavaNextForConditionalGeneration"),
("mplugdocowl", "MPLUGDocOwlForConditionalGeneration"),
("paligemma", "PaliGemmaForConditionalGeneration"),
("pix2struct", "Pix2StructForConditionalGeneration"),
("video_llava", "VideoLlavaForConditionalGeneration"),
1 change: 1 addition & 0 deletions src/transformers/models/auto/processing_auto.py
@@ -76,6 +76,7 @@
("markuplm", "MarkupLMProcessor"),
("mctct", "MCTCTProcessor"),
("mgp-str", "MgpstrProcessor"),
("mplugdocowl", "MPLUGDocOwlProcessor"),
("oneformer", "OneFormerProcessor"),
("owlv2", "Owlv2Processor"),
("owlvit", "OwlViTProcessor"),
107 changes: 107 additions & 0 deletions src/transformers/models/mplugdocowl/__init__.py
@@ -0,0 +1,107 @@
# Copyright 2024 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
from typing import TYPE_CHECKING

from ...utils import OptionalDependencyNotAvailable, _LazyModule, is_torch_available, is_vision_available


_import_structure = {
"configuration_mplugdocowl": ["MPLUGDocOwlConfig"],
"modeling_mplugdocowl": [
"MPLUGDocOwlAttention",
"MPLUGDocOwlForCausalLM",
"MPLUGDocOwlForConditionalGeneration",
"MPLUGDocOwlHReducer",
"MPLUGDocOwlLanguageModel",
"MPLUGDocOwlPreTrainedLanguageModel",
"MPLUGDocOwlPreTrainedModel",
"MPLUGDocOwlVisionModel",
"MPLUGDocOwlVisionTransformer",
],
"processing_mplugdocowl": ["MPLUGDocOwlProcessor"],
}

try:
if not is_vision_available():
raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
pass
else:
_import_structure["image_processing_mplugdocowl"] = ["MPLUGDocOwlImageProcessor"]

try:
if not is_torch_available():
raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
pass
else:
_import_structure["modeling_mplugdocowl"] = [
"MPLUGDocOwlAttention",
"MPLUGDocOwlForCausalLM",
"MPLUGDocOwlForConditionalGeneration",
"MPLUGDocOwlHReducer",
"MPLUGDocOwlLanguageModel",
"MPLUGDocOwlPreTrainedLanguageModel",
"MPLUGDocOwlPreTrainedModel",
"MPLUGDocOwlVisionModel",
"MPLUGDocOwlVisionTransformer",
]


if TYPE_CHECKING:
from .configuration_mplugdocowl import MPLUGDocOwlConfig
from .modeling_mplugdocowl import (
MPLUGDocOwlAttention,
MPLUGDocOwlForCausalLM,
MPLUGDocOwlForConditionalGeneration,
MPLUGDocOwlHReducer,
MPLUGDocOwlLanguageModel,
MPLUGDocOwlPreTrainedLanguageModel,
MPLUGDocOwlPreTrainedModel,
MPLUGDocOwlVisionModel,
MPLUGDocOwlVisionTransformer,
)
from .processing_mplugdocowl import MPLUGDocOwlProcessor

try:
if not is_vision_available():
raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
pass
else:
from .image_processing_mplugdocowl import MPLUGDocOwlImageProcessor

try:
if not is_torch_available():
raise OptionalDependencyNotAvailable()
except OptionalDependencyNotAvailable:
pass
else:
from .modeling_mplugdocowl import (
MPLUGDocOwlAttention,
MPLUGDocOwlForCausalLM,
MPLUGDocOwlForConditionalGeneration,
MPLUGDocOwlHReducer,
MPLUGDocOwlLanguageModel,
MPLUGDocOwlPreTrainedLanguageModel,
MPLUGDocOwlPreTrainedModel,
MPLUGDocOwlVisionModel,
MPLUGDocOwlVisionTransformer,
)


else:
import sys

sys.modules[__name__] = _LazyModule(__name__, globals()["__file__"], _import_structure)