initial pix2seq v2 release
Support multiple vision tasks in a single neural network (with different prompts), namely, object detection, instance segmentation, keypoint / human pose detection, image captioning.
Support multiple vision tasks in a single neural network (with different prompts), namely, object detection, instance segmentation, keypoint / human pose detection, image captioning.