File: `tools/model_converter/README.md`

# Model Converter Tool

A command-line utility to download PyTorch models and convert them to OpenVINO format.

## Overview

This tool reads a JSON configuration file containing model specifications, downloads PyTorch weights from URLs, loads the models, and exports them to OpenVINO Intermediate Representation (IR) format.
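
The download step with caching can be sketched in plain Python. The helper names below are illustrative, not the tool's actual API; the default cache location matches the one documented under Command-Line Options:

```python
import urllib.request
from pathlib import Path


def cached_weights_path(url: str, cache_dir: str) -> Path:
    # The cache key is simply the filename component of the URL,
    # e.g. ".../resnet50-0676ba61.pth" -> "resnet50-0676ba61.pth".
    return Path(cache_dir).expanduser() / url.rsplit("/", 1)[-1]


def download_weights(url: str, cache_dir: str = "~/.cache/torch/hub/checkpoints") -> Path:
    """Download weights from `url` unless a cached copy already exists."""
    target = cached_weights_path(url, cache_dir)
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urllib.request.urlretrieve(url, target)
    return target
```

A second call with the same URL finds the cached file and skips the network entirely.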

## Features

- **Automatic Download**: Downloads model weights from HTTP/HTTPS URLs with caching support
- **Dynamic Model Loading**: Dynamically imports and instantiates model classes from Python paths
- **Metadata Embedding**: Embeds custom metadata into OpenVINO models
- **Input/Output Naming**: Configurable input and output tensor names
- **Batch Processing**: Process multiple models from a single configuration file
- **Selective Conversion**: Convert specific models using the `--model` flag

## Installation

### Prerequisites

```bash
# Required packages
uv pip install torch torchvision openvino
```

## Usage

### Basic Usage

```bash
uv run python model_converter.py config.json -o ./output_models
```

### Command-Line Options

```text
positional arguments:
  config                Path to JSON configuration file

options:
  -h, --help            Show help message and exit
  -o OUTPUT, --output OUTPUT
                        Output directory for converted models
                        (default: ./converted_models)
  -c CACHE, --cache CACHE
                        Cache directory for downloaded weights
                        (default: ~/.cache/torch/hub/checkpoints)
  --model MODEL         Process only the specified model (by model_short_name)
  --list                List all models in the configuration file and exit
  -v, --verbose         Enable verbose logging
```
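
The interface above maps directly onto `argparse`; a minimal sketch with the flag names and defaults as documented (the function name is illustrative):

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Download PyTorch models and convert them to OpenVINO IR")
    parser.add_argument("config", help="Path to JSON configuration file")
    parser.add_argument("-o", "--output", default="./converted_models",
                        help="Output directory for converted models")
    parser.add_argument("-c", "--cache", default="~/.cache/torch/hub/checkpoints",
                        help="Cache directory for downloaded weights")
    parser.add_argument("--model",
                        help="Process only the specified model (by model_short_name)")
    parser.add_argument("--list", action="store_true",
                        help="List all models in the configuration file and exit")
    parser.add_argument("-v", "--verbose", action="store_true",
                        help="Enable verbose logging")
    return parser
```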

### Examples

**List all models in configuration:**

```bash
uv run python model_converter.py example_config.json --list
```

**Convert all models:**

```bash
uv run python model_converter.py example_config.json -o ./converted_models
```

**Convert a specific model:**

```bash
uv run python model_converter.py example_config.json -o ./converted_models --model resnet50
```

**Use custom cache directory:**

```bash
uv run python model_converter.py example_config.json -o ./output -c ./my_cache
```

**Enable verbose logging:**

```bash
uv run python model_converter.py example_config.json -o ./output -v
```

## Configuration File Format

The configuration file is a JSON file with the following structure:

```json
{
  "models": [
    {
      "model_short_name": "resnet50",
      "model_class_name": "torchvision.models.resnet.resnet50",
      "model_full_name": "ResNet-50",
      "description": "ResNet-50 image classification model",
      "weights_url": "https://download.pytorch.org/models/resnet50-0676ba61.pth",
      "input_shape": [1, 3, 224, 224],
      "input_names": ["images"],
      "output_names": ["output"],
      "model_params": null,
      "model_type": "Classification"
    }
  ]
}
```

**Important**: The `model_type` field enables automatic model detection when using [Intel's model_api](https://github.com/openvinotoolkit/model_api). When specified, this metadata is embedded in the OpenVINO IR, allowing `Model.create_model()` to automatically select the correct model wrapper class.

Common `model_type` values:

- `"Classification"` - Image classification models
- `"DetectionModel"` - Object detection models
- `"YOLOX"` - YOLOX detection models
- `"SegmentationModel"` - Segmentation models
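
In OpenVINO's Python API, such metadata lives in the model's runtime-info (`rt_info`) tree under a `model_info` section, which is where model_api looks for `model_type`. The helper below is a pure-Python sketch (its name and the exact set of forwarded keys are assumptions); the actual write into the IR would use OpenVINO's `set_rt_info`:

```python
def model_info_entries(entry: dict) -> dict:
    """Collect the metadata that would be embedded under
    rt_info["model_info"]; only fields present in the config are kept."""
    keys = ("model_type", "model_full_name", "description")
    return {k: entry[k] for k in keys if entry.get(k) is not None}

# With openvino installed, each entry would be written into the model via:
#   ov_model.set_rt_info(value, ["model_info", key])
# which is what lets model_api's Model.create_model() pick the wrapper class.
```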

### Configuration Fields

#### Required Fields

- **`model_short_name`** (string): Short identifier for the model (used for output filename)
- **`model_class_name`** (string): Full Python path to the model class (e.g., `torchvision.models.resnet.resnet50`)
- **`weights_url`** (string): URL to download the PyTorch weights (.pth file)

#### Optional Fields

- **`model_full_name`** (string): Full descriptive name of the model
- **`description`** (string): Description of the model
- **`input_shape`** (array of integers): Input tensor shape (default: `[1, 3, 224, 224]`)
- **`input_names`** (array of strings): Names for input tensors (default: `["input"]`)
- **`output_names`** (array of strings): Names for output tensors (default: auto-generated)
- **`model_params`** (object): Parameters to pass to the model constructor (default: `null`)
- **`model_type`** (string): Model type for model_api auto-detection (e.g., `"Classification"`, `"DetectionModel"`, `"YOLOX"`)

The bundled `config.json` additionally uses preprocessing-related metadata fields: `reverse_input_channels` (boolean channel-order swap), `mean_values` and `scale_values` (per-channel normalization values as space-separated strings), `labels` (label set identifier such as `"IMAGENET1K_V1"`), and `docs` (link to upstream model documentation).
---

File: `tools/model_converter/config.json`

```json
{
  "models": [
    {
      "model_short_name": "mobilenet_v3_small",
      "model_class_name": "torchvision.models.mobilenetv3.mobilenet_v3_small",
      "model_full_name": "MobileNetV3-Small",
      "description": "MobileNetV3 Small - Efficient convolutional neural network for mobile and embedded vision applications",
      "docs": "https://docs.pytorch.org/vision/main/models/generated/torchvision.models.mobilenet_v3_small.html#torchvision.models.mobilenet_v3_small",
      "weights_url": "https://download.pytorch.org/models/mobilenet_v3_small-047dcff4.pth",
      "input_shape": [1, 3, 224, 224],
      "input_names": ["image"],
      "output_names": ["output1"],
      "model_params": null,
      "model_type": "Classification",
      "reverse_input_channels": false,
      "mean_values": "123.675 116.28 103.53",
      "scale_values": "58.395 57.12 57.375",
      "labels": "IMAGENET1K_V1"
    },
    {
      "model_short_name": "efficientnet_b0",
      "model_class_name": "torchvision.models.efficientnet.efficientnet_b0",
      "model_full_name": "EfficientNet-B0",
      "description": "EfficientNet-B0 - Efficient convolutional neural network with compound scaling",
      "docs": "https://docs.pytorch.org/vision/main/models/generated/torchvision.models.efficientnet_b0.html#torchvision.models.efficientnet_b0",
      "weights_url": "https://download.pytorch.org/models/efficientnet_b0_rwightman-3dd342df.pth",
      "input_shape": [1, 3, 224, 224],
      "input_names": ["image"],
      "output_names": ["logits"],
      "model_params": null,
      "model_type": "Classification",
      "reverse_input_channels": true,
      "mean_values": "123.675 116.28 103.53",
      "scale_values": "58.395 57.12 57.375",
      "labels": "IMAGENET1K_V1"
    },
    {
      "model_short_name": "resnet18",
      "model_class_name": "torchvision.models.resnet.resnet18",
      "model_full_name": "ResNet-18",
      "description": "ResNet-18 - 18-layer residual learning network for image classification",
      "weights_url": "https://download.pytorch.org/models/resnet18-f37072fd.pth",
      "input_shape": [1, 3, 224, 224],
      "input_names": ["image"],
      "output_names": ["output"],
      "model_params": null,
      "model_type": "Classification",
      "reverse_input_channels": true,
      "mean_values": "123.675 116.28 103.53",
      "scale_values": "58.395 57.12 57.375",
      "labels": "IMAGENET1K_V1"
    },
    {
      "model_short_name": "resnet50",
      "model_class_name": "torchvision.models.resnet.resnet50",
      "model_full_name": "ResNet-50",
      "description": "ResNet-50 - 50-layer residual learning network for image classification",
      "weights_url": "https://download.pytorch.org/models/resnet50-0676ba61.pth",
      "input_shape": [1, 3, 224, 224],
      "input_names": ["image"],
      "output_names": ["output"],
      "model_params": null,
      "model_type": "Classification",
      "reverse_input_channels": true,
      "mean_values": "123.675 116.28 103.53",
      "scale_values": "58.395 57.12 57.375",
      "labels": "IMAGENET1K_V1"
    },
    {
      "model_short_name": "squeezenet1_0",
      "model_class_name": "torchvision.models.squeezenet.squeezenet1_0",
      "model_full_name": "SqueezeNet 1.0",
      "description": "SqueezeNet 1.0 - Small CNN with AlexNet-level accuracy and 50x fewer parameters",
      "weights_url": "https://download.pytorch.org/models/squeezenet1_0-b66bff10.pth",
      "input_shape": [1, 3, 224, 224],
      "input_names": ["image"],
      "output_names": ["output"],
      "model_params": null,
      "model_type": "Classification",
      "reverse_input_channels": true,
      "mean_values": "123.675 116.28 103.53",
      "scale_values": "58.395 57.12 57.375",
      "labels": "IMAGENET1K_V1"
    },
    {
      "model_short_name": "vgg16",
      "model_class_name": "torchvision.models.vgg.vgg16",
      "model_full_name": "VGG-16",
      "description": "VGG-16 - 16-layer deep convolutional network",
      "weights_url": "https://download.pytorch.org/models/vgg16-397923af.pth",
      "input_shape": [1, 3, 224, 224],
      "input_names": ["image"],
      "output_names": ["output"],
      "model_params": null,
      "model_type": "Classification",
      "reverse_input_channels": true,
      "mean_values": "123.675 116.28 103.53",
      "scale_values": "58.395 57.12 57.375",
      "labels": "IMAGENET1K_V1"
    }
  ]
}
```