
Commit 6509a44: init

0 parents, commit 6509a44

File tree: 93 files changed (+6453, -0 lines)


LICENSE

Lines changed: 21 additions & 0 deletions
MIT License

Copyright (c) 2021 heshuting555

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

README.md

Lines changed: 48 additions & 0 deletions
# SOLIDER on Person Re-identification

This repo provides details on how to use the [SOLIDER](https://github.com/tinyvision/SOLIDER) pretrained representation for the **person re-identification task**.
The code is modified from [TransReID](https://github.com/damo-cv/TransReID); please refer to the original repo for more details.

## Installation and Datasets

We use Python 3.7, PyTorch 1.7.1, CUDA 10.1 and torchvision 0.8.2. More details on installation and dataset preparation can be found in [TransReID-SSL](https://github.com/damo-cv/TransReID-SSL).

## Prepare Pre-trained Models

You can download models from [SOLIDER](https://github.com/tinyvision/SOLIDER), or use [SOLIDER](https://github.com/tinyvision/SOLIDER) to train your own models.
Before training, convert the models first:

```bash
python convert_model.py path/to/SOLIDER/log/lup/swin_tiny/checkpoint.pth path/to/SOLIDER/log/lup/swin_tiny/checkpoint_tea.pth
```
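`convert_model.py` itself is not shown in this excerpt, so the sketch below is only a rough guess at what such a conversion step could look like: it re-saves just the model weights from a training checkpoint. The `'teacher'` key and `'module.'` prefix handling are assumptions suggested by the `_tea` suffix of the output file, not the repository's actual logic.

```python
# Hypothetical sketch only -- the real convert_model.py in this repo may differ.
# Assumption: the SOLIDER checkpoint stores EMA/teacher weights under a 'teacher' key.
import sys
import torch

def convert(src_path, dst_path):
    ckpt = torch.load(src_path, map_location='cpu')
    # Fall back to the raw checkpoint if there is no 'teacher' entry (assumption).
    state_dict = ckpt.get('teacher', ckpt) if isinstance(ckpt, dict) else ckpt
    # Strip the 'module.' wrapper prefix left by DistributedDataParallel, if present.
    cleaned = {(k[len('module.'):] if k.startswith('module.') else k): v
               for k, v in state_dict.items()}
    torch.save(cleaned, dst_path)

if __name__ == '__main__':
    convert(sys.argv[1], sys.argv[2])
```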

## Training

We utilize 1 GPU for training. Please modify `MODEL.PRETRAIN_PATH`, `DATASETS.ROOT_DIR` and `OUTPUT_DIR` in the config file, then run:

```bash
sh run.sh
```
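The keys named above are defined in `config/defaults.py` (shown later in this commit) and overridden by the YAML files under `configs/`. A minimal sketch of how such yacs-based configs are typically merged and overridden; the exact entry point and argument handling used by `run.sh` are not shown here, so treat this as illustrative only:

```python
# Illustrative yacs usage; not necessarily what this repo's train script does.
from config import cfg  # _C defined in config/defaults.py

cfg.merge_from_file('configs/market/swin_base.yml')   # YAML overrides the defaults
cfg.merge_from_list(['MODEL.PRETRAIN_PATH', 'path/to/checkpoint_tea.pth',
                     'DATASETS.ROOT_DIR', 'path/to/market1501/datasets',
                     'OUTPUT_DIR', './log/market1501/swin_base'])
cfg.freeze()
print(cfg.MODEL.TRANSFORMER_TYPE, cfg.SOLVER.BASE_LR)
```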

## Performance

| Method | Model | MSMT17<br>(mAP/R1) | Market1501<br>(mAP/R1) |
| ------ | :---: | :---: | :---: |
| SOLIDER | Swin Tiny | 67.4/85.9 | 91.6/96.1 |
| SOLIDER | Swin Small | 76.9/90.8 | 93.3/96.6 |
| SOLIDER | Swin Base | 77.1/90.7 | 93.9/96.9 |

- We use the pretrained models from [SOLIDER](https://github.com/tinyvision/SOLIDER).
- The semantic weight is set to 0.2 in these experiments.

## Citation

If you find this code useful for your research, please cite our paper:

```
@inproceedings{chen2023beyond,
  title={Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks},
  author={Weihua Chen and Xianzhe Xu and Jian Jia and Hao Luo and Yaohua Wang and Fan Wang and Rong Jin and Xiuyu Sun},
  booktitle={The IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2023},
}
```

config/__init__.py

Lines changed: 8 additions & 0 deletions
```python
# encoding: utf-8
"""
@author: sherlock

"""

from .defaults import _C as cfg
from .defaults import _C as cfg_test
```
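Note that `cfg` and `cfg_test` are two names bound to the same `_C` node, so mutating one is visible through the other. A minimal sketch, assuming standard yacs `clone()` behaviour, of how a caller could take an independent copy before applying test-time overrides:

```python
# Both names refer to the same CfgNode instance; clone() yields an independent copy.
from config import cfg, cfg_test

assert cfg is cfg_test
test_cfg = cfg_test.clone()          # independent copy for test-time overrides
test_cfg.TEST.RE_RANKING = True
assert cfg.TEST.RE_RANKING is False  # the shared default is untouched
```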

config/defaults.py

Lines changed: 204 additions & 0 deletions
```python
from yacs.config import CfgNode as CN

# -----------------------------------------------------------------------------
# Convention about Training / Test specific parameters
# -----------------------------------------------------------------------------
# Whenever an argument can be either used for training or for testing, the
# corresponding name will be post-fixed by a _TRAIN for a training parameter,
# or a _TEST for a test-specific parameter.

# -----------------------------------------------------------------------------
# Config definition
# -----------------------------------------------------------------------------

_C = CN()
# -----------------------------------------------------------------------------
# MODEL
# -----------------------------------------------------------------------------
_C.MODEL = CN()
# Using cuda or cpu for training
_C.MODEL.DEVICE = "cuda"
# ID number of GPU
_C.MODEL.DEVICE_ID = '0'
# Name of backbone
_C.MODEL.NAME = 'resnet50'
# Last stride of backbone
_C.MODEL.LAST_STRIDE = 1
# Path to pretrained model of backbone
_C.MODEL.PRETRAIN_PATH = ''
_C.MODEL.PRETRAIN_HW_RATIO = 1

# Use ImageNet pretrained model to initialize backbone or use self trained model to initialize the whole model
# Options: 'imagenet', 'self', 'finetune'
_C.MODEL.PRETRAIN_CHOICE = 'imagenet'

# If train with BNNeck, options: 'bnneck' or 'no'
_C.MODEL.NECK = 'bnneck'
# Whether the training loss includes center loss, options: 'yes' or 'no'. Loss with center loss has a different optimizer configuration
_C.MODEL.IF_WITH_CENTER = 'no'

_C.MODEL.ID_LOSS_TYPE = 'softmax'
_C.MODEL.ID_LOSS_WEIGHT = 1.0
_C.MODEL.TRIPLET_LOSS_WEIGHT = 1.0

_C.MODEL.METRIC_LOSS_TYPE = 'triplet'
# If train with multi-gpu ddp mode, options: 'True', 'False'
_C.MODEL.DIST_TRAIN = False
# If train with soft triplet loss, options: 'True', 'False'
_C.MODEL.NO_MARGIN = False
# If train with label smooth, options: 'on', 'off'
_C.MODEL.IF_LABELSMOOTH = 'on'
# If train with arcface loss, options: 'True', 'False'
_C.MODEL.COS_LAYER = False

_C.MODEL.DROPOUT_RATE = 0.0
# Reduce feature dim
_C.MODEL.REDUCE_FEAT_DIM = False
_C.MODEL.FEAT_DIM = 512
# Transformer settings
_C.MODEL.DROP_PATH = 0.1
_C.MODEL.DROP_OUT = 0.0
_C.MODEL.ATT_DROP_RATE = 0.0
_C.MODEL.TRANSFORMER_TYPE = 'None'
_C.MODEL.STRIDE_SIZE = [16, 16]
_C.MODEL.GEM_POOLING = False
_C.MODEL.STEM_CONV = False

# JPM Parameter
_C.MODEL.JPM = False
_C.MODEL.SHIFT_NUM = 5
_C.MODEL.SHUFFLE_GROUP = 2
_C.MODEL.DEVIDE_LENGTH = 4
_C.MODEL.RE_ARRANGE = True

# SIE Parameter
_C.MODEL.SIE_COE = 3.0
_C.MODEL.SIE_CAMERA = False
_C.MODEL.SIE_VIEW = False

# Semantic Weight
_C.MODEL.SEMANTIC_WEIGHT = 1.0

# -----------------------------------------------------------------------------
# INPUT
# -----------------------------------------------------------------------------
_C.INPUT = CN()
# Size of the image during training
_C.INPUT.SIZE_TRAIN = [384, 128]
# Size of the image during test
_C.INPUT.SIZE_TEST = [384, 128]
# Random probability for image horizontal flip
_C.INPUT.PROB = 0.5
# Random probability for random erasing
_C.INPUT.RE_PROB = 0.5
# Values to be used for image normalization
_C.INPUT.PIXEL_MEAN = [0.485, 0.456, 0.406]
# Values to be used for image normalization
_C.INPUT.PIXEL_STD = [0.229, 0.224, 0.225]
# Value of padding size
_C.INPUT.PADDING = 10

# -----------------------------------------------------------------------------
# Dataset
# -----------------------------------------------------------------------------
_C.DATASETS = CN()
# List of the dataset names for training, as present in paths_catalog.py
_C.DATASETS.NAMES = ('market1501')
# Root directory where datasets should be used (and downloaded if not found)
_C.DATASETS.ROOT_DIR = ('../data')
_C.DATASETS.ROOT_TRAIN_DIR = ('../data')
_C.DATASETS.ROOT_VAL_DIR = ('../data')


# -----------------------------------------------------------------------------
# DataLoader
# -----------------------------------------------------------------------------
_C.DATALOADER = CN()
# Number of data loading threads
_C.DATALOADER.NUM_WORKERS = 8
# Sampler for data loading
_C.DATALOADER.SAMPLER = 'softmax'
# Number of instances per identity in one batch
_C.DATALOADER.NUM_INSTANCE = 16
# Remove tail data
_C.DATALOADER.REMOVE_TAIL = 0

# ---------------------------------------------------------------------------- #
# Solver
# ---------------------------------------------------------------------------- #
_C.SOLVER = CN()
# Name of optimizer
_C.SOLVER.OPTIMIZER_NAME = "Adam"
# Maximum number of epochs
_C.SOLVER.MAX_EPOCHS = 100
# Base learning rate
_C.SOLVER.BASE_LR = 3e-4
# Whether to use a larger learning rate for the fc layer
_C.SOLVER.LARGE_FC_LR = False
# Factor of learning rate for bias
_C.SOLVER.BIAS_LR_FACTOR = 1
# Random seed
_C.SOLVER.SEED = 1234
# Momentum
_C.SOLVER.MOMENTUM = 0.9
# Margin of triplet loss
_C.SOLVER.MARGIN = 0.3
# Learning rate of SGD to learn the centers of center loss
_C.SOLVER.CENTER_LR = 0.5
# Balanced weight of center loss
_C.SOLVER.CENTER_LOSS_WEIGHT = 0.0005

# Settings of weight decay
_C.SOLVER.WEIGHT_DECAY = 0.0005
_C.SOLVER.WEIGHT_DECAY_BIAS = 0.0005

# Decay rate of learning rate
_C.SOLVER.GAMMA = 0.1
# Decay steps of learning rate
_C.SOLVER.STEPS = (40, 70)
# Warm up factor
_C.SOLVER.WARMUP_FACTOR = 0.01
# Warm up epochs
_C.SOLVER.WARMUP_EPOCHS = 5
# Method of warm up, options: 'constant', 'linear', 'cosine'
_C.SOLVER.WARMUP_METHOD = "cosine"

_C.SOLVER.COSINE_MARGIN = 0.5
_C.SOLVER.COSINE_SCALE = 30

# Epoch period for saving checkpoints
_C.SOLVER.CHECKPOINT_PERIOD = 10
# Iteration period for displaying the training log
_C.SOLVER.LOG_PERIOD = 100
# Epoch period for validation
_C.SOLVER.EVAL_PERIOD = 10
# Number of images per batch
# This is global, so if we have 8 GPUs and IMS_PER_BATCH = 128, each GPU will
# contain 16 images per batch
_C.SOLVER.IMS_PER_BATCH = 64
_C.SOLVER.TRP_L2 = False

# ---------------------------------------------------------------------------- #
# TEST
# ---------------------------------------------------------------------------- #

_C.TEST = CN()
# Number of images per batch during test
_C.TEST.IMS_PER_BATCH = 128
# If test with re-ranking, options: 'True', 'False'
_C.TEST.RE_RANKING = False
# Path to trained model
_C.TEST.WEIGHT = ""
# Which feature of BNNeck to use for test, before or after BNNeck, options: 'before' or 'after'
_C.TEST.NECK_FEAT = 'after'
# Whether the feature is normalized before test; if yes, it is equivalent to cosine distance
_C.TEST.FEAT_NORM = 'yes'

# Name for saving the distmat after testing.
_C.TEST.DIST_MAT = "dist_mat.npy"
# Whether to calculate the eval score, options: 'True', 'False'
_C.TEST.EVAL = False
# ---------------------------------------------------------------------------- #
# Misc options
# ---------------------------------------------------------------------------- #
# Path to checkpoint and saved log of trained model
_C.OUTPUT_DIR = ""
```
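The SOLVER block mixes step-decay fields (`GAMMA`, `STEPS`) with warmup and cosine fields (`WARMUP_FACTOR`, `WARMUP_EPOCHS`, `WARMUP_METHOD`). As a rough illustration only, assuming the scheduler warms the learning rate up linearly over `WARMUP_EPOCHS` and then follows a cosine decay until `MAX_EPOCHS` (an assumption about the TransReID-style scheduler, not a statement of this repo's exact code), the per-epoch learning rate could look like:

```python
# Hedged sketch of a warmup-then-cosine schedule consistent with the SOLVER defaults above;
# the actual scheduler implementation may differ in details.
import math

def lr_at_epoch(epoch, base_lr=3e-4, warmup_epochs=5, warmup_factor=0.01, max_epochs=100):
    if epoch < warmup_epochs:
        # Linear warmup from warmup_factor * base_lr up to base_lr.
        alpha = epoch / warmup_epochs
        return base_lr * (warmup_factor * (1 - alpha) + alpha)
    # Cosine decay from base_lr towards 0 over the remaining epochs.
    progress = (epoch - warmup_epochs) / (max_epochs - warmup_epochs)
    return 0.5 * base_lr * (1 + math.cos(math.pi * progress))

print([round(lr_at_epoch(e), 6) for e in (0, 5, 50, 99)])
```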

configs/market/swin_base.yml

Lines changed: 53 additions & 0 deletions
```yaml
MODEL:
  PRETRAIN_HW_RATIO: 2
  METRIC_LOSS_TYPE: 'triplet'
  IF_LABELSMOOTH: 'off'
  IF_WITH_CENTER: 'no'
  NAME: 'transformer'
  NO_MARGIN: True
  DEVICE_ID: ('0')
  TRANSFORMER_TYPE: 'swin_base_patch4_window7_224'
  STRIDE_SIZE: [16, 16]

INPUT:
  SIZE_TRAIN: [384, 128]
  SIZE_TEST: [384, 128]
  PROB: 0.5 # random horizontal flip
  RE_PROB: 0.5 # random erasing
  PADDING: 10
  PIXEL_MEAN: [0.5, 0.5, 0.5]
  PIXEL_STD: [0.5, 0.5, 0.5]

DATASETS:
  NAMES: ('market1501')
  ROOT_DIR: ('path/to/market1501/datasets')

DATALOADER:
  SAMPLER: 'softmax_triplet'
  NUM_INSTANCE: 4
  NUM_WORKERS: 8

SOLVER:
  OPTIMIZER_NAME: 'SGD'
  MAX_EPOCHS: 120
  BASE_LR: 0.0008
  WARMUP_EPOCHS: 20
  IMS_PER_BATCH: 64
  WARMUP_METHOD: 'cosine'
  LARGE_FC_LR: False
  CHECKPOINT_PERIOD: 120
  LOG_PERIOD: 20
  EVAL_PERIOD: 10
  WEIGHT_DECAY: 1e-4
  WEIGHT_DECAY_BIAS: 1e-4
  BIAS_LR_FACTOR: 2

TEST:
  EVAL: True
  IMS_PER_BATCH: 256
  RE_RANKING: False
  WEIGHT: ''
  NECK_FEAT: 'before'
  FEAT_NORM: 'yes'

OUTPUT_DIR: './log/market1501/swin_base'
```
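With `SAMPLER: 'softmax_triplet'`, `NUM_INSTANCE: 4` and `SOLVER.IMS_PER_BATCH: 64`, each batch is presumably built PK-style: K = 4 images per identity and P = 64 / 4 = 16 identities per batch (an assumption about the identity sampler inherited from TransReID; the sampler code itself is not shown in this excerpt). A small sketch of that batch composition:

```python
# Hedged sketch of PK-style batch composition implied by the config values above;
# the repo's actual sampler implementation may handle edge cases differently.
import random
from collections import defaultdict

def pk_batch(labels, ims_per_batch=64, num_instance=4):
    """Pick P identities and K images each so that P * K == ims_per_batch."""
    num_ids = ims_per_batch // num_instance            # P = 16 for the values above
    by_id = defaultdict(list)
    for idx, pid in enumerate(labels):
        by_id[pid].append(idx)
    batch = []
    for pid in random.sample(list(by_id), num_ids):
        pool = by_id[pid]
        # Sample with replacement if an identity has fewer than K images.
        picks = random.choices(pool, k=num_instance) if len(pool) < num_instance \
            else random.sample(pool, num_instance)
        batch += picks
    return batch

toy_labels = [i // 10 for i in range(750 * 10)]  # 750 identities, 10 images each
print(len(pk_batch(toy_labels)))                 # -> 64
```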

configs/market/swin_small.yml

Lines changed: 53 additions & 0 deletions
```yaml
MODEL:
  PRETRAIN_HW_RATIO: 2
  METRIC_LOSS_TYPE: 'triplet'
  IF_LABELSMOOTH: 'off'
  IF_WITH_CENTER: 'no'
  NAME: 'transformer'
  NO_MARGIN: True
  DEVICE_ID: ('0')
  TRANSFORMER_TYPE: 'swin_small_patch4_window7_224'
  STRIDE_SIZE: [16, 16]

INPUT:
  SIZE_TRAIN: [384, 128]
  SIZE_TEST: [384, 128]
  PROB: 0.5 # random horizontal flip
  RE_PROB: 0.5 # random erasing
  PADDING: 10
  PIXEL_MEAN: [0.5, 0.5, 0.5]
  PIXEL_STD: [0.5, 0.5, 0.5]

DATASETS:
  NAMES: ('market1501')
  ROOT_DIR: ('path/to/market1501/datasets')

DATALOADER:
  SAMPLER: 'softmax_triplet'
  NUM_INSTANCE: 4
  NUM_WORKERS: 8

SOLVER:
  OPTIMIZER_NAME: 'SGD'
  MAX_EPOCHS: 120
  BASE_LR: 0.0008
  WARMUP_EPOCHS: 20
  IMS_PER_BATCH: 64
  WARMUP_METHOD: 'cosine'
  LARGE_FC_LR: False
  CHECKPOINT_PERIOD: 120
  LOG_PERIOD: 20
  EVAL_PERIOD: 10
  WEIGHT_DECAY: 1e-4
  WEIGHT_DECAY_BIAS: 1e-4
  BIAS_LR_FACTOR: 2

TEST:
  EVAL: True
  IMS_PER_BATCH: 256
  RE_RANKING: False
  WEIGHT: ''
  NECK_FEAT: 'before'
  FEAT_NORM: 'yes'

OUTPUT_DIR: './log/market1501/swin_small'
```
