This page lists common problems that users run into and their solutions. Feel free to extend the list if you find a frequent issue and know how to solve it. If the contents here do not cover your problem, please create an issue using the provided templates and make sure you fill in all the required information in the template.
The compatible MMSegmentation, MMCV and MMEngine versions are as below. Please install the correct versions of them to avoid installation issues.
MMSegmentation version | MMCV version | MMEngine version | MMClassification (optional) version | MMDetection (optional) version |
---|---|---|---|---|
dev-1.x branch | mmcv >= 2.0.0rc4 | MMEngine >= 0.7.1 | mmcls==1.0.0rc6 | mmdet >= 3.0.0 |
main branch | mmcv >= 2.0.0rc4 | MMEngine >= 0.7.1 | mmcls==1.0.0rc6 | mmdet >= 3.0.0 |
1.0.0 | mmcv >= 2.0.0rc4 | MMEngine >= 0.7.1 | mmcls==1.0.0rc6 | mmdet >= 3.0.0 |
1.0.0rc6 | mmcv >= 2.0.0rc4 | MMEngine >= 0.5.0 | mmcls>=1.0.0rc0 | mmdet >= 3.0.0rc6 |
1.0.0rc5 | mmcv >= 2.0.0rc4 | MMEngine >= 0.2.0 | mmcls>=1.0.0rc0 | mmdet>=3.0.0rc6 |
1.0.0rc4 | mmcv == 2.0.0rc3 | MMEngine >= 0.1.0 | mmcls>=1.0.0rc0 | mmdet>=3.0.0rc4, <=3.0.0rc5 |
1.0.0rc3 | mmcv == 2.0.0rc3 | MMEngine >= 0.1.0 | mmcls>=1.0.0rc0 | mmdet>=3.0.0rc4, <=3.0.0rc5 |
1.0.0rc2 | mmcv == 2.0.0rc3 | MMEngine >= 0.1.0 | mmcls>=1.0.0rc0 | mmdet>=3.0.0rc4, <=3.0.0rc5 |
1.0.0rc1 | mmcv >= 2.0.0rc1, <=2.0.0rc3 | MMEngine >= 0.1.0 | mmcls>=1.0.0rc0 | Not required |
1.0.0rc0 | mmcv >= 2.0.0rc1, <=2.0.0rc3 | MMEngine >= 0.1.0 | mmcls>=1.0.0rc0 | Not required |
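
One way to check an environment against the table above is sketched below. It is only an illustration: the helper names `parse_version` and `find_too_old` are hypothetical, the parser handles only version strings shaped like `2.0.0rc4`, and the `installed` dictionary is made-up sample data rather than a real environment. The minimums are copied from the 1.0.0 row.

```python
import re

# Minimum versions taken from the MMSegmentation 1.0.0 row of the table.
MIN_VERSIONS = {"mmcv": "2.0.0rc4", "mmengine": "0.7.1"}


def parse_version(text):
    """Turn '2.0.0rc4' into a sortable tuple (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)(?:rc(\d+))?", text)
    if match is None:
        raise ValueError(f"unsupported version string: {text!r}")
    major, minor, patch, rc = match.groups()
    # A final release gets an infinite rc rank so that 2.0.0 > 2.0.0rc4.
    rc_rank = float("inf") if rc is None else int(rc)
    return (int(major), int(minor), int(patch), rc_rank)


def find_too_old(installed, minimums=MIN_VERSIONS):
    """Return packages that are missing or older than the required minimum."""
    problems = {}
    for name, minimum in minimums.items():
        found = installed.get(name)
        if found is None or parse_version(found) < parse_version(minimum):
            problems[name] = (found, minimum)
    return problems


# Sample data (an assumption): mmcv is one release candidate too old.
installed = {"mmcv": "2.0.0rc3", "mmengine": "0.7.1"}
print(find_too_old(installed))  # {'mmcv': ('2.0.0rc3', '2.0.0rc4')}
```

In a real environment the installed versions could be read with `importlib.metadata.version("mmcv")` from the standard library, and a full-featured parser such as `packaging.version` would likely be a better choice than the small regular expression used here.
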
Notes:
- MMClassification and MMDetection are optional for MMSegmentation. If you do not install them, `ConvNeXt` (which requires MMClassification) and MaskFormer and Mask2Former (which require MMDetection) cannot be used. We recommend installing them from source. Please refer to MMClassification and MMDetection for more details about their installation.
- To install MMSegmentation 0.x and the master branch, please refer to the FAQ 0.x document to check compatible versions of MMCV.
- If you have installed an incompatible version of mmcv, please run `pip uninstall mmcv` to uninstall it first. If you have previously installed mmcv-full (which exists in OpenMMLab 1.x), please run `pip uninstall mmcv-full` to uninstall it.
- If "No module named 'mmcv'" appears, please follow the steps below:
  - Use `pip uninstall mmcv` to uninstall the existing mmcv in the environment.
  - Install the corresponding mmcv according to the installation instructions.
There are two ways to find out how many GPUs are needed to train a model:
- Infer from the name of the config file of the model. You can refer to the `Config Name Style` part of Learn about Configs. For example, for the config file named `segformer_mit-b0_8xb1-160k_cityscapes-1024x1024.py`, `8xb1` means that training the corresponding model needs 8 GPUs, with a batch size of 1 on each GPU.
- Infer from the log file. Open the log file of the model and search for `nGPU`. The number of entries following `nGPU` is the number of GPUs needed to train the model. For instance, searching for `nGPU` in the log file yields the record `nGPU 0,1,2,3,4,5,6,7`, which indicates that eight GPUs are needed to train the model.
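
Both lookups can be automated with a short script. The sketch below is not part of MMSegmentation; the function names are hypothetical, and it assumes that the config name contains a field of the form `{gpus}xb{batch}` and that the log record looks exactly like the `nGPU 0,1,2,3,4,5,6,7` example.

```python
import re


def gpus_from_config_name(config_name):
    """Read the '{gpus}xb{batch}' field, e.g. '8xb1' -> (8, 1)."""
    match = re.search(r"(\d+)xb(\d+)", config_name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def gpus_from_log_record(record):
    """Count the device ids in a record such as 'nGPU 0,1,2,3,4,5,6,7'."""
    match = re.search(r"nGPU\s+([\d,]+)", record)
    if match is None:
        return None
    # Each comma-separated entry is one device id.
    return len([device for device in match.group(1).split(",") if device])


name = "segformer_mit-b0_8xb1-160k_cityscapes-1024x1024.py"
print(gpus_from_config_name(name))                   # (8, 1)
print(gpus_from_log_record("nGPU 0,1,2,3,4,5,6,7"))  # 8
```
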
Briefly, the auxiliary head is a deep supervision trick to improve accuracy. In the training phase, `decode_head` decodes the semantic segmentation output, while `auxiliary_head` only adds an auxiliary loss. The segmentation result it produces has no impact on your model's result; it only works during training. You may read this paper for more information.
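
As a rough sketch of what deep supervision means for the training objective, the total loss can be written as a weighted sum of the two heads' losses. The symbols are illustrative notation, not names from the codebase; the weights correspond to the `loss_weight` value in each head's `loss_decode` config, for example 1.0 for the decode head and 0.4 for the auxiliary head in the PSPNet config on this page.

```latex
\mathcal{L}_{\text{train}}
  = \lambda_{\text{decode}}\,\mathcal{L}_{\text{decode}}
  + \lambda_{\text{aux}}\,\mathcal{L}_{\text{aux}}
```

Only the first term has a counterpart at inference time, since the prediction is taken from `decode_head` alone.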
In the test script, we provide the `--out` argument to control whether the painted images are output. Users might run the following command:
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} --out ${OUTPUT_DIR}
MMSegmentation uses `num_classes` and `out_channels` to control the output of the last layer `self.conv_seg`. More details can be found here.
`num_classes` should equal the number of label types. In a binary segmentation task the dataset has only two types of labels, foreground and background, so `num_classes=2`. `out_channels` controls the number of output channels of the last layer of the model, and it usually equals `num_classes`.
In a binary segmentation task, however, there are two solutions:
- Set `out_channels=2`, use Cross Entropy Loss in training, and use `F.softmax()` and `argmax()` to get the prediction of each pixel in inference.
- Set `out_channels=1`, use Binary Cross Entropy Loss in training, and use `F.sigmoid()` and `threshold` to get the prediction of each pixel in inference. `threshold` is set to 0.3 by default.
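
The difference between the two inference paths can be illustrated on the logits of a single pixel. This is a minimal pure-Python sketch rather than MMSegmentation code: `softmax` and `sigmoid` stand in for `F.softmax()` and `F.sigmoid()`, the function names are hypothetical, and the strict `>` comparison against the threshold is an assumption.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of per-class logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def sigmoid(logit):
    return 1.0 / (1.0 + math.exp(-logit))


def predict_two_channel(logits):
    """out_channels=2: softmax over both channels, then argmax."""
    probs = softmax(logits)
    return max(range(len(probs)), key=probs.__getitem__)


def predict_one_channel(logit, threshold=0.3):
    """out_channels=1: sigmoid on the single channel, then threshold."""
    return int(sigmoid(logit) > threshold)


print(predict_two_channel([0.2, 1.5]))  # 1 (foreground has the larger logit)
print(predict_one_channel(-0.5))        # 1, since sigmoid(-0.5) is about 0.38
print(predict_one_channel(-2.0))        # 0, since sigmoid(-2.0) is about 0.12
```

Note that with a threshold of 0.3 a pixel can be labeled foreground even when its sigmoid probability is below 0.5, which is one practical difference from the two-channel argmax.
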
In summary, to implement binary segmentation, users should modify the parameters below in the `decode_head` and `auxiliary_head` configs. Here is a modification example of pspnet_unet_s5-d16.py:
- (1) `num_classes=2`, `out_channels=2` and `use_sigmoid=False` in `CrossEntropyLoss`.
decode_head=dict(
    type='PSPHead',
    in_channels=64,
    in_index=4,
    num_classes=2,
    out_channels=2,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0)),
auxiliary_head=dict(
    type='FCNHead',
    in_channels=128,
    in_index=3,
    num_classes=2,
    out_channels=2,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=False, loss_weight=0.4)),
- (2) `num_classes=2`, `out_channels=1` and `use_sigmoid=True` in `CrossEntropyLoss`.
decode_head=dict(
    type='PSPHead',
    in_channels=64,
    in_index=4,
    num_classes=2,
    out_channels=1,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
auxiliary_head=dict(
    type='FCNHead',
    in_channels=128,
    in_index=3,
    num_classes=2,
    out_channels=1,
    loss_decode=dict(
        type='CrossEntropyLoss', use_sigmoid=True, loss_weight=0.4)),
`reduce_zero_label` is a Boolean dataset parameter that defaults to False. It is used to ignore label 0 of the dataset. Specifically, label 0 is changed to 255, and 1 is subtracted from all the remaining labels. At the same time, 255 is set as the ignore index in the decode head, so these pixels do not participate in the loss calculation.
The implementation logic of `reduce_zero_label` is as follows:
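if self.reduce_zero_label:
    # avoid using underflow conversion
    # map label 0 to the ignore index 255
    gt_semantic_seg[gt_semantic_seg == 0] = 255
    # shift every label down by one
    gt_semantic_seg = gt_semantic_seg - 1
    # the ignore index became 254 after the shift; restore it to 255
    gt_semantic_seg[gt_semantic_seg == 254] = 255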
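
To see the effect of this remapping on concrete values, the same rule can be applied to a small list of labels. The sketch below is a pure-Python restatement for illustration only; the function name and the sample labels are assumptions, and the real implementation operates on arrays as shown above.

```python
def reduce_zero_label(labels, ignore_index=255):
    """Apply the remapping to a flat list of integer labels."""
    remapped = []
    for label in labels:
        if label == 0 or label == ignore_index:
            # Label 0 and already-ignored pixels both end up ignored.
            remapped.append(ignore_index)
        else:
            # Every other label moves down by one.
            remapped.append(label - 1)
    return remapped


# Sample labels: 0 is the class to drop, 1..6 are real classes, 255 is ignored.
print(reduce_zero_label([0, 1, 2, 6, 255]))  # [255, 0, 1, 5, 255]
```

After the remapping, a dataset whose masks held labels 0 to 6 is left with six trainable classes numbered 0 to 5, plus the ignore index.
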
Whether your dataset needs `reduce_zero_label` depends on which of the following two situations applies:
- On the Potsdam dataset, there are six classes: 0-Impervious surfaces, 1-Building, 2-Low vegetation, 3-Tree, 4-Car, 5-Clutter/background. However, this dataset provides two types of RGB labels, one with black pixels at the edges of the images, and the other without. For labels with black edges, dataset_converters.py converts the black edges to label 0, and the other labels become 1-Impervious surfaces, 2-Building, 3-Low vegetation, 4-Tree, 5-Car, 6-Clutter/background. Therefore, the dataset config potsdam.py sets `reduce_zero_label=True`. If you are using labels without black edges, the mask label contains only classes 0-5, and you should use `reduce_zero_label=False`. Whether to use `reduce_zero_label` depends on your actual situation.
- On a dataset with class 0 as the background class, if you ultimately need to separate the background from the rest of your classes, you do not need `reduce_zero_label`; set `reduce_zero_label=False` in the dataset config.
Note: Please confirm the number of original classes in the dataset. If there are only two classes, you should not use `reduce_zero_label`, that is, set `reduce_zero_label=False`.