Update docs/source/en/model_doc/vitpose.md
Co-authored-by: NielsRogge <[email protected]>
SangbumChoi and NielsRogge committed Sep 10, 2024
1 parent cb6d45f commit 5197549
Showing 1 changed file with 1 addition and 2 deletions.
3 changes: 1 addition & 2 deletions docs/source/en/model_doc/vitpose.md
@@ -40,8 +40,7 @@ The original code can be found [here](https://github.com/ViTAE-Transformer/ViTPo
>>> outputs = model(pixel_values, dataset_index)
```

-- The current model utilizes a 2-step inference pipeline. The first step involves placing a bounding box around the region corresponding to the person.
-After that, the second step uses VitPose to predict the keypoints.
+- ViTPose is a so-called top-down keypoint detection model. This means that one first uses an object detector, like [RT-DETR](rt-detr), to detect people (or other instances) in an image. Next, ViTPose takes the cropped images as input and predicts the keypoints.
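
The top-down flow described in the new wording (detect instances, crop each one, predict keypoints per crop) can be sketched roughly as below. This is a toy illustration only, not the library's API: `fake_detector` and `fake_pose` are hypothetical stand-ins for an object detector such as RT-DETR and for ViTPose, the image is a plain nested list, and the step that maps crop-local keypoints back to image coordinates is an assumption about how one would typically consume the output.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max) in pixels
Point = Tuple[float, float]       # (x, y)
Image = List[List[int]]           # toy grayscale image: rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Return the sub-image covered by `box` (max edges are exclusive)."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def top_down_keypoints(
    image: Image,
    detect: Callable[[Image], List[Box]],
    estimate_pose: Callable[[Image], List[Point]],
) -> List[List[Point]]:
    """Step 1: detect instances. Step 2: predict keypoints on each crop."""
    results = []
    for box in detect(image):
        patch = crop(image, box)
        local_points = estimate_pose(patch)  # coordinates relative to the crop
        x0, y0, _, _ = box
        # Shift crop-local keypoints back into full-image coordinates.
        results.append([(x + x0, y + y0) for x, y in local_points])
    return results


# Hypothetical stand-ins for the real models (assumptions for illustration).
def fake_detector(image: Image) -> List[Box]:
    return [(1, 1, 5, 5), (4, 0, 8, 4)]


def fake_pose(patch: Image) -> List[Point]:
    height, width = len(patch), len(patch[0])
    return [(width / 2, height / 2)]  # a single "keypoint" at the crop center


image = [[0] * 8 for _ in range(6)]  # 6 rows x 8 columns
keypoints = top_down_keypoints(image, fake_detector, fake_pose)
```

Passing the detector and pose estimator in as callables keeps the two stages independent, which mirrors the point of the revised sentence: any detector that yields boxes can feed the keypoint model.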

```py
>>> import torch