Commit 99a655b

feichtenhofer authored and facebook-github-bot committed
unsup
Summary: updated

Reviewed By: bxiong1202, haooooooqi, lyttonhao

Differential Revision: D30004049

fbshipit-source-id: 4c2c20249dad2f7b4a75ebe24f4105152917e4d7
1 parent 39ef35c commit 99a655b

43 files changed, with 3759 additions and 413 deletions

MODEL_ZOO.md

Lines changed: 6 additions & 6 deletions
@@ -2,7 +2,7 @@

 ## Kinetics 400 and 600

-| architecture | depth | crops x clips | frame length x sample rate | top1 | top5 | model | config | dataset |
+| architecture | size | crops x clips | frame length x sample rate | top1 | top5 | model | config | dataset |
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
 | C2D | R50 | 3 x 10 | 8 x 8 | 67.2 | 87.8 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/C2D_NOPOOL_8x8_R50.pkl) | Kinetics/c2/C2D_NOPOOL_8x8_R50 | K400 |
 | I3D | R50 | 3 x 10 | 8 x 8 | 73.5 | 90.8 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/I3D_8x8_R50.pkl) | Kinetics/c2/I3D_8x8_R50 | K400 |
@@ -26,7 +26,7 @@

 ## AVA

-| architecture | depth | Pretrain Model | frame length x sample rate | MAP | AVA version | model |
+| architecture | size | Pretrain Model | frame length x sample rate | MAP | AVA version | model |
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |------------- |
 | Slow | R50 | [Kinetics 400](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/ava/pretrain/C2D_8x8_R50.pkl) | 4 x 16 | 19.5 | 2.2 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/ava/C2D_8x8_R50.pkl) |
 | SlowFast | R101 | [Kinetics 600](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/ava/pretrain/SLOWFAST_32x2_R101_50_50_v2.1.pkl) | 8 x 8 | 28.2 | 2.1 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/ava/SLOWFAST_32x2_R101_50_50_v2.1.pkl) |
@@ -39,22 +39,22 @@
 ](https://arxiv.org/abs/1912.00998)" paper. The multigrid method trains about 3-6x faster than the original training on multiple datasets. See [projects/multigrid](projects/multigrid/README.md) for more information. The following provides models, results, and example config files.

 #### Kinetics:
-| architecture | depth | pretrain | frame length x sample rate | training | top1 | top5 | model | config |
+| architecture | size | pretrain | frame length x sample rate | training | top1 | top5 | model | config |
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
 | SlowFast | R50 | - | 8 x 8 | Standard | 76.8 | 92.7 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/pyslowfast/model_zoo/multigrid/model_zoo/Kinetics/SLOWFAST_8x8_R50_stepwise.pkl) | Kinetics/SLOWFAST_8x8_R50_stepwise |
 | SlowFast | R50 | - | 8 x 8 | Multigrid | 76.6 | 92.7 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/pyslowfast/model_zoo/multigrid/model_zoo/Kinetics/SLOWFAST_8x8_R50_stepwise_multigrid.pkl) | Kinetics/SLOWFAST_8x8_R50_stepwise_multigrid |

 (Here we use stepwise learning rate schedule.)

 #### Something-Something V2:
-| architecture | depth | pretrain | frame length x sample rate | training | top1 | top5 | model | config |
+| architecture | size | pretrain | frame length x sample rate | training | top1 | top5 | model | config |
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
 | SlowFast | R50 | [Kinetics 400](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl) | 16 x 8 | Standard | 63.0 | 88.5 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/pyslowfast/model_zoo/multigrid/model_zoo/SSv2/SLOWFAST_16x8_R50.pkl) | SSv2/SLOWFAST_16x8_R50 |
 | SlowFast | R50 | [Kinetics 400](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl) | 16 x 8 | Multigrid | 63.5 | 88.7 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/pyslowfast/model_zoo/multigrid/model_zoo/SSv2/SLOWFAST_16x8_R50_multigrid.pkl) | SSv2/SLOWFAST_16x8_R50_multigrid |


 #### Charades
-| architecture | depth | pretrain | frame length x sample rate | training | mAP | model | config |
+| architecture | size | pretrain | frame length x sample rate | training | mAP | model | config |
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
 | SlowFast | R50 | [Kinetics 400](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl) | 16 x 8 | Standard | 38.9 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/pyslowfast/model_zoo/multigrid/model_zoo/Charades/SLOWFAST_16x8_R50.pkl) | SSv2/SLOWFAST_16x8_R50 |
 | SlowFast | R50 | [Kinetics 400](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/SLOWFAST_8x8_R50.pkl) | 16 x 8 | Multigrid | 38.6 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/pyslowfast/model_zoo/multigrid/model_zoo/Charades/SLOWFAST_16x8_R50_multigrid.pkl) | SSv2/SLOWFAST_16x8_R50_multigrid |
@@ -64,7 +64,7 @@

 We also release the imagenet pretrained model if finetuning from ImageNet is preferred. The reported accuracy is obtained by center crop testing on the validation set.

-| architecture | depth | Top1 | Top5 | model | Config |
+| architecture | size | Top1 | Top5 | model | Config |
 | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
 | ResNet | R50 | 23.6 | 6.8 | [`link`](https://dl.fbaipublicfiles.com/pyslowfast/model_zoo/kinetics400/R50_IN1K.pyth) | ImageNet/RES_R50 |
 | MVIT | B-16-Conv | 17.1 | 3.7 | [`link`](https://drive.google.com/file/d/1dYYqUB-3DSgBVc9d6o-rW8ojtVsrFLgp/view?usp=sharing) | ImageNet/MVIT_B_16_CONV |

configs/ssl/BYOL_SlowR50_8x8.yaml

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+TASK: ssl
+TRAIN:
+  DATASET: kinetics
+  EVAL_PERIOD: 10
+  CHECKPOINT_PERIOD: 5
+  AUTO_RESUME: True
+MODEL:
+  NUM_CLASSES: 256
+  MODEL_NAME: ContrastiveModel
+  ARCH: slow_c2d
+  ARCH: slow
+  LOSS_FUNC: contrastive_loss
+  DROPOUT_RATE: 0.0
+  HEAD_ACT: none
+CONTRASTIVE:
+  T: 0.5 # default 0.07
+  DIM: 256 # 128 default, if changed, change nCls too
+  NUM_MLP_LAYERS: 2 # default 1
+  BN_MLP: True
+  BN_SYNC_MLP: True
+  MLP_DIM: 4096
+  SEQUENTIAL: True # def fault
+  MOMENTUM: 0.996 # default 0.5
+  MOMENTUM_ANNEALING: True # default false
+  TYPE: byol # default mem
+  PREDICTOR_DEPTHS: [2]
+DATA:
+  NUM_FRAMES: 8
+  SAMPLING_RATE: 8
+  # NUM_FRAMES: 16 # dont forget to change these parameters in linear & finetuning configs
+  # SAMPLING_RATE: 4
+  TRAIN_CROP_NUM_TEMPORAL: 2 # default 1
+  TRAIN_CROP_NUM_SPATIAL: 1 # default 1
+  TRAIN_JITTER_SCALES_RELATIVE: [0.2, 0.766]
+  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
+  SSL_MOCOV2_AUG: True
+  SSL_COLOR_JITTER: True # default false
+  COLOR_RND_GRAYSCALE: 0.2 # default 0.0
+  SSL_COLOR_HUE: 0.15
+  SSL_COLOR_BRI_CON_SAT: [0.6, 0.6, 0.6] # default [0.4, 0.4, 0.4]
+  TRAIN_JITTER_SCALES: [256, 320]
+  TRAIN_CROP_SIZE: 224
+  TEST_CROP_SIZE: 256
+  INPUT_CHANNEL_NUM: [3]
+  PATH_LABEL_SEPARATOR: " "
+BN:
+  USE_PRECISE_STATS: False
+  NUM_BATCHES_PRECISE: 200
+  WEIGHT_DECAY: 0.0
+  NUM_SYNC_DEVICES: 8
+  NORM_TYPE: "sync_batchnorm"
+  # NORM_TYPE: "sync_batchnorm_apex"
+SOLVER:
+  # BASE_LR: 1.2 # for rho=4 clips
+  BASE_LR: 0.6
+  LARS_ON: True
+  BASE_LR_SCALE_NUM_SHARDS: True
+  LR_POLICY: cosine
+  MAX_EPOCH: 200
+  MOMENTUM: 0.9
+  WEIGHT_DECAY: 1e-6
+  WARMUP_EPOCHS: 35.0
+  WARMUP_START_LR: 0.001
+  OPTIMIZING_METHOD: sgd
+TEST:
+  ENABLE: True
+  DATASET: kinetics
+  BATCH_SIZE: 64
+DATA_LOADER:
+  NUM_WORKERS: 10
+  PIN_MEMORY: True
+NUM_GPUS: 8
+RNG_SEED: 0
+OUTPUT_DIR: .
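
The BYOL config above sets `CONTRASTIVE.MOMENTUM: 0.996` together with `MOMENTUM_ANNEALING: True`. As a rough illustration of what such a setting typically means, the sketch below cosine-anneals an exponential-moving-average momentum from its base value toward 1.0, following the schedule described in the BYOL paper. Whether this repository computes the schedule in exactly this way is an assumption, and the function names `annealed_momentum` and `ema_update` are hypothetical.

```python
import math


def annealed_momentum(base_momentum: float, progress: float) -> float:
    """Cosine-anneal an EMA momentum from base_momentum (start) up to 1.0 (end).

    progress is the fraction of training completed, in [0, 1].
    This mirrors the BYOL-paper schedule; it is an assumption that the
    MOMENTUM_ANNEALING flag in the config behaves the same way.
    """
    if not 0.0 <= progress <= 1.0:
        raise ValueError("progress must lie in [0, 1]")
    return 1.0 - (1.0 - base_momentum) * (math.cos(math.pi * progress) + 1.0) / 2.0


def ema_update(target, online, momentum):
    """Blend each target weight toward the matching online weight.

    A momentum close to 1.0 means the target network changes slowly.
    """
    return [momentum * t + (1.0 - momentum) * o for t, o in zip(target, online)]
```

With `base_momentum=0.996`, the momentum starts at 0.996, passes through roughly 0.998 halfway through training, and reaches 1.0 at the end, at which point the target weights stop moving.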

configs/ssl/MoCo_SlowR50_8x8.yaml

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
+TASK: ssl
+TRAIN:
+  DATASET: kinetics
+  BATCH_SIZE: 64
+  EVAL_PERIOD: 10
+  CHECKPOINT_PERIOD: 5
+  AUTO_RESUME: True
+  MIXED_PRECISION: True
+MODEL:
+  NUM_CLASSES: 128
+  MODEL_NAME: ContrastiveModel
+  ARCH: slow_c2d
+  ARCH: slow
+  LOSS_FUNC: contrastive_loss
+  DROPOUT_RATE: 0.0
+  HEAD_ACT: none
+CONTRASTIVE:
+  T: 0.1 # default 0.07
+  NUM_MLP_LAYERS: 3 # default 1
+  SEQUENTIAL: True # def fault
+  MOCO_MULTI_VIEW_QUEUE: True # default: False
+  MOMENTUM: 0.994 # default 0.5 ours: 0.999
+  MOMENTUM_ANNEALING: True # default false
+  TYPE: moco # default mem
+DATA:
+  NUM_FRAMES: 8
+  SAMPLING_RATE: 8
+  TRAIN_CROP_NUM_TEMPORAL: 4 # default 1
+  TRAIN_CROP_NUM_SPATIAL: 1 # default 1
+  TRAIN_JITTER_SCALES_RELATIVE: [0.2, 0.766]
+  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
+  SSL_MOCOV2_AUG: True
+  COLOR_RND_GRAYSCALE: 0.2 # default 0.0
+  SSL_COLOR_JITTER: True # default false
+  SSL_COLOR_HUE: 0.15
+  SSL_COLOR_BRI_CON_SAT: [0.6, 0.6, 0.6] # default [0.4, 0.4, 0.4]
+  TRAIN_JITTER_SCALES: [256, 320]
+  TRAIN_CROP_SIZE: 224
+  TEST_CROP_SIZE: 256
+  INPUT_CHANNEL_NUM: [3]
+  PATH_LABEL_SEPARATOR: " "
+DATA_LOADER:
+  NUM_WORKERS: 8
+BN:
+  USE_PRECISE_STATS: False
+  NUM_BATCHES_PRECISE: 200
+  WEIGHT_DECAY: 0.0
+SOLVER:
+  BASE_LR_SCALE_NUM_SHARDS: True
+  BASE_LR: 0.05
+  BASE_LR: 0.1
+  LR_POLICY: cosine
+  MAX_EPOCH: 200
+  WARMUP_EPOCHS: 35.0
+  MOMENTUM: 0.9
+  WARMUP_START_LR: 0.001
+  OPTIMIZING_METHOD: sgd
+TEST:
+  ENABLE: True
+  DATASET: kinetics
+  BATCH_SIZE: 64
+NUM_GPUS: 8
+RNG_SEED: 0
+OUTPUT_DIR: .
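
The `SOLVER` block above combines `LR_POLICY: cosine` with a 35-epoch warmup that starts at `WARMUP_START_LR: 0.001`. The sketch below shows one common way these settings interact: a half-cosine decay over `MAX_EPOCH`, with a linear ramp during warmup that meets the cosine curve at the end of the warmup period. It is an illustration under assumptions rather than the repository's code: the final learning rate is assumed to be 0, the scaling implied by `BASE_LR_SCALE_NUM_SHARDS` is ignored, `base_lr` defaults to 0.1 (one of the two `BASE_LR` values listed), and the function names are hypothetical.

```python
import math


def cosine_lr(epoch: float, base_lr: float, max_epoch: float, end_lr: float = 0.0) -> float:
    """Half-cosine decay from base_lr at epoch 0 to end_lr at max_epoch."""
    return end_lr + (base_lr - end_lr) * (math.cos(math.pi * epoch / max_epoch) + 1.0) * 0.5


def lr_at_epoch(
    epoch: float,
    base_lr: float = 0.1,
    max_epoch: float = 200.0,
    warmup_epochs: float = 35.0,
    warmup_start_lr: float = 0.001,
) -> float:
    """Cosine schedule with a linear warmup (epoch may be fractional)."""
    if epoch < warmup_epochs:
        # Ramp linearly from warmup_start_lr to the cosine value at the
        # end of warmup, so the two phases join without a jump.
        lr_end = cosine_lr(warmup_epochs, base_lr, max_epoch, 0.0)
        slope = (lr_end - warmup_start_lr) / warmup_epochs
        return warmup_start_lr + epoch * slope
    return cosine_lr(epoch, base_lr, max_epoch, 0.0)
```

Under these assumptions the rate starts at 0.001, peaks at the end of warmup slightly below `base_lr`, and decays to 0 at epoch 200.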

configs/ssl/SimCLR_SlowR50_8x8.yaml

Lines changed: 69 additions & 0 deletions
@@ -0,0 +1,69 @@
+TASK: ssl
+TRAIN:
+  DATASET: kinetics
+  EVAL_PERIOD: 10
+  CHECKPOINT_PERIOD: 10
+  AUTO_RESUME: True
+MODEL:
+  NUM_CLASSES: 128
+  MODEL_NAME: ContrastiveModel
+  ARCH: slow
+  LOSS_FUNC: contrastive_loss
+  DROPOUT_RATE: 0.0
+  HEAD_ACT: none
+CONTRASTIVE:
+  TYPE: simclr # default mem
+  T: 0.1 # default 0.07
+  DIM: 128 # 128 default, if changed, change nCls too
+  NUM_CLASSES_DOWNSTREAM: 400
+  NUM_MLP_LAYERS: 3 # default 1
+  BN_SYNC_MLP: True
+  SIMCLR_DIST_ON: True # default false
+  SEQUENTIAL: True # def fault
+DATA:
+  NUM_FRAMES: 8
+  SAMPLING_RATE: 8
+  TRAIN_CROP_NUM_TEMPORAL: 4 # default 1
+  TRAIN_CROP_NUM_SPATIAL: 1 # default 1
+  SSL_MOCOV2_AUG: True
+  SSL_COLOR_JITTER: True # default false
+  TRAIN_JITTER_SCALES_RELATIVE: [0.2, 0.766]
+  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
+  COLOR_RND_GRAYSCALE: 0.2 # default 0.0
+  SSL_COLOR_HUE: 0.15
+  SSL_COLOR_BRI_CON_SAT: [0.6, 0.6, 0.6] # default [0.4, 0.4, 0.4]
+  TRAIN_JITTER_SCALES: [224, 224]
+  TRAIN_CROP_SIZE: 224
+  TEST_CROP_SIZE: 256
+  INPUT_CHANNEL_NUM: [3]
+  PATH_LABEL_SEPARATOR: " "
+BN:
+  USE_PRECISE_STATS: False
+  NUM_BATCHES_PRECISE: 200
+  WEIGHT_DECAY: 0.0
+  NUM_SYNC_DEVICES: 8
+  NORM_TYPE: "sync_batchnorm"
+  # NORM_TYPE: "sync_batchnorm_apex"
+SOLVER:
+  BASE_LR: 0.6
+  BASE_LR: 1.2
+
+  LARS_ON: True
+  BASE_LR_SCALE_NUM_SHARDS: True
+  LR_POLICY: cosine
+  MAX_EPOCH: 200
+  MOMENTUM: 0.9
+  WEIGHT_DECAY: 1e-6
+  WARMUP_EPOCHS: 35.0
+  WARMUP_START_LR: 0.001
+  OPTIMIZING_METHOD: sgd
+TEST:
+  ENABLE: True
+  DATASET: kinetics
+  BATCH_SIZE: 64
+DATA_LOADER:
+  NUM_WORKERS: 10
+  PIN_MEMORY: True
+NUM_GPUS: 8
+RNG_SEED: 0
+OUTPUT_DIR: .
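
The `CONTRASTIVE.T` value in the config above is the softmax temperature of the contrastive loss. To make its role concrete, here is a minimal pure-Python sketch of an InfoNCE-style loss for a single query: embeddings are L2-normalized, cosine similarities are divided by the temperature, and the loss is the cross-entropy of picking the positive among all candidates. This is a generic textbook formulation, not the repository's `contrastive_loss`; the function names are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def info_nce(query, positive, negatives, temperature=0.1):
    """InfoNCE loss for one query with one positive and any number of negatives."""
    q = l2_normalize(query)
    # The positive key is placed at index 0, so it is the "correct class".
    keys = [l2_normalize(positive)] + [l2_normalize(n) for n in negatives]
    logits = [sum(a * b for a, b in zip(q, k)) / temperature for k in keys]
    # Numerically stable log-sum-exp over all candidates.
    top = max(logits)
    log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_sum_exp - logits[0]
```

A smaller temperature stretches the gap between similar and dissimilar pairs, so with `temperature=0.1` a well-aligned positive yields a loss near zero, while a positive that is orthogonal to the query next to a perfectly matching negative yields a loss near 10.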

configs/ssl/SwAV_Slow_R50_8x8.yaml

Lines changed: 65 additions & 0 deletions
@@ -0,0 +1,65 @@
+TASK: ssl
+TRAIN:
+  DATASET: kinetics
+  EVAL_PERIOD: 10
+  CHECKPOINT_PERIOD: 10
+  AUTO_RESUME: True
+MODEL:
+  NUM_CLASSES: 128
+  MODEL_NAME: ContrastiveModel
+  ARCH: slow
+  LOSS_FUNC: contrastive_loss
+  DROPOUT_RATE: 0.0
+  HEAD_ACT: none
+CONTRASTIVE:
+  TYPE: swav # default mem
+  T: 0.1 # default 0.07
+  DIM: 128 # 128 default, if changed, change nCls too
+  NUM_CLASSES_DOWNSTREAM: 400
+  NUM_MLP_LAYERS: 3 # default 1
+  BN_MLP: True
+  BN_SYNC_MLP: True
+DATA:
+  NUM_FRAMES: 8
+  SAMPLING_RATE: 8
+  TRAIN_CROP_NUM_TEMPORAL: 2 # default 1
+  TRAIN_CROP_NUM_SPATIAL: 1 # default 1
+  TRAIN_JITTER_SCALES_RELATIVE: [0.2, 0.766]
+  TRAIN_JITTER_ASPECT_RELATIVE: [0.75, 1.3333]
+  SSL_MOCOV2_AUG: True
+  SSL_COLOR_JITTER: True # default false
+  COLOR_RND_GRAYSCALE: 0.2 # default 0.0
+  SSL_COLOR_HUE: 0.15
+  SSL_COLOR_BRI_CON_SAT: [0.6, 0.6, 0.6] # default [0.4, 0.4, 0.4]
+  TRAIN_JITTER_SCALES: [224, 224]
+  TRAIN_CROP_SIZE: 224
+  TEST_CROP_SIZE: 256
+  INPUT_CHANNEL_NUM: [3]
+  PATH_LABEL_SEPARATOR: " "
+BN:
+  USE_PRECISE_STATS: False
+  NUM_BATCHES_PRECISE: 200
+  WEIGHT_DECAY: 0.0
+  NUM_SYNC_DEVICES: 8
+  NORM_TYPE: "sync_batchnorm"
+SOLVER:
+  BASE_LR: 0.6
+  BASE_LR_SCALE_NUM_SHARDS: True
+  LR_POLICY: cosine
+  MAX_EPOCH: 200
+  LARS_ON: True
+  MOMENTUM: 0.9
+  WEIGHT_DECAY: 1e-6
+  WARMUP_EPOCHS: 35.0
+  WARMUP_START_LR: 0.001
+  OPTIMIZING_METHOD: sgd
+TEST:
+  ENABLE: True
+  DATASET: kinetics
+  BATCH_SIZE: 64
+DATA_LOADER:
+  NUM_WORKERS: 10
+  PIN_MEMORY: True
+NUM_GPUS: 8
+RNG_SEED: 0
+OUTPUT_DIR: .
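
The comment on `CONTRASTIVE.DIM` in the config above ("if changed, change nCls too") suggests that the embedding dimension has to be kept in step with another setting. Assuming that "nCls" refers to `MODEL.NUM_CLASSES` (both are 128 here, and both are 256 in the BYOL config), a small pre-flight check like the following could catch a mismatch before a long training run starts. The helper `check_projection_dim` is hypothetical and works on a plain nested dict rather than on the repository's config object.

```python
def check_projection_dim(cfg: dict) -> None:
    """Raise if CONTRASTIVE.DIM and MODEL.NUM_CLASSES disagree.

    Assumes the "nCls" mentioned in the config comment means
    MODEL.NUM_CLASSES; that reading is not confirmed by the source.
    """
    dim = cfg["CONTRASTIVE"]["DIM"]
    num_classes = cfg["MODEL"]["NUM_CLASSES"]
    if dim != num_classes:
        raise ValueError(
            f"CONTRASTIVE.DIM ({dim}) should match MODEL.NUM_CLASSES ({num_classes})"
        )


# Values copied from the SwAV config; only the keys relevant to the check.
swav_like_cfg = {
    "MODEL": {"NUM_CLASSES": 128},
    "CONTRASTIVE": {"TYPE": "swav", "DIM": 128},
}
check_projection_dim(swav_like_cfg)  # passes silently
```

Running the same check on a config with `DIM: 256` and `NUM_CLASSES: 128` raises a `ValueError` that names both values.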
Lines changed: 57 additions & 0 deletions
@@ -0,0 +1,57 @@
+TASK: ssl_eval_ssv2
+TRAIN:
+  ENABLE: True
+  DATASET: ssv2
+  BATCH_SIZE: 64
+  EVAL_PERIOD: 4
+  CHECKPOINT_PERIOD: 4
+  AUTO_RESUME: True
+  CHECKPOINT_TYPE: pytorch
+  CHECKPOINT_EPOCH_RESET: True
+  CHECKPOINT_CLEAR_NAME_PATTERN: ("backbone.",)
+DATA:
+  NUM_FRAMES: 8
+  SAMPLING_RATE: 8
+  DECODING_BACKEND: torchvision
+  TRAIN_JITTER_SCALES: [256, 320]
+  TRAIN_CROP_SIZE: 224
+  TEST_CROP_SIZE: 256
+  INPUT_CHANNEL_NUM: [3]
+  INV_UNIFORM_SAMPLE: True
+  RANDOM_FLIP: False
+  PATH_TO_DATA_DIR: # pls add
+  PATH_PREFIX: # pls add
+BN:
+  USE_PRECISE_STATS: False
+  NUM_BATCHES_PRECISE: 200
+  WEIGHT_DECAY: 0.0
+SOLVER:
+  BASE_LR: 0.12
+  BASE_LR_SCALE_NUM_SHARDS: True
+  LR_POLICY: steps_with_relative_lrs
+  LRS: [1, 0.1, 0.01, 0.001, 0.0001, 0.00001]
+  STEPS: [0, 14, 18]
+  MAX_EPOCH: 22
+  MOMENTUM: 0.9
+  WEIGHT_DECAY: 1e-6
+  WARMUP_EPOCHS: 0.19
+  WARMUP_START_LR: 0.0001
+  OPTIMIZING_METHOD: sgd
+MODEL:
+  NUM_CLASSES: 174
+  ARCH: slow
+  MODEL_NAME: ResNet
+  LOSS_FUNC: cross_entropy
+  DROPOUT_RATE: 0.5
+TEST:
+  ENABLE: True
+  DATASET: ssv2
+  BATCH_SIZE: 64
+  NUM_ENSEMBLE_VIEWS: 1
+  NUM_SPATIAL_CROPS: 1
+DATA_LOADER:
+  NUM_WORKERS: 2
+  PIN_MEMORY: True
+NUM_GPUS: 8
+RNG_SEED: 0
+OUTPUT_DIR: .
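
The fine-tuning config above uses `LR_POLICY: steps_with_relative_lrs` with `STEPS: [0, 14, 18]` and a list of relative factors in `LRS`. A plausible reading, sketched below, is that each entry of `STEPS` marks the epoch at which the matching entry of `LRS` takes effect, with the factor multiplied by `BASE_LR`. The sketch ignores the short warmup (`WARMUP_EPOCHS: 0.19`) and any scaling from `BASE_LR_SCALE_NUM_SHARDS`; the function name `step_lr` is hypothetical, and the exact boundary behavior of the repository's own implementation is an assumption.

```python
from bisect import bisect_right


def step_lr(
    epoch: float,
    base_lr: float = 0.12,
    lrs=(1, 0.1, 0.01, 0.001, 0.0001, 0.00001),
    steps=(0, 14, 18),
) -> float:
    """Piecewise-constant learning rate: base_lr times the active relative factor."""
    if epoch < steps[0]:
        raise ValueError("epoch precedes the first step boundary")
    # Index of the last boundary that is <= epoch.
    index = bisect_right(steps, epoch) - 1
    return base_lr * lrs[index]
```

Under this reading the rate is 0.12 for epochs 0 to 13, 0.012 for epochs 14 to 17, and 0.0012 from epoch 18 until `MAX_EPOCH: 22`. With only three boundaries, the last three entries of `LRS` are never reached in this sketch.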
