README.md
+14-5
@@ -70,11 +70,20 @@ Train DSMIL on TCGA Lung Cancer dataset (precomputed features):
 ```
 
 ## Training on your own datasets
-You could modify train_tcga.py to easily let it work with your datasets. You will need to:
-1. For each bag, generate a .csv file where each row contains the feature of an instance. The .csv file should be named as "_bagID_.csv" and put into a folder named "_dataset-name_".
-2. Generate a "_dataset-name_.csv" file with two columns where the first column contains _bagID_, and the second column contains the class label.
-3. Replace the corresponding file path in the script with the file path of "_dataset_.csv" file, and change the data directory path in the dataloader to the path of the folder "_dataset-name_"
-4. Configure the number of class for creating the DSMIL model.
+You could modify train_tcga.py to easily let it work with your datasets. After you have trained your embedder, you will need to compute the features and organize them as:
+1. For each bag, generate a .csv file where each row contains the feature of an instance. The .csv file should be named as "_bagID_.csv" and put into a folder named "_dataset-name_".
+<div align="center">
+<img src="thumbnails/bag.png" width="400px" />
+</div>
+2. Generate a "_dataset-name_.csv" file with two columns where the first column contains the paths to all _bagID_.csv files, and the second column contains the bag labels.
+<div align="center">
+<img src="thumbnails/bags.png" width="400px" />
+</div>
+3. Replace the corresponding file path in the script with the file path of "_dataset_.csv".
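
The file layout described in the added steps could be produced along these lines. This is only a minimal sketch using the Python standard library, not code from the repository: the helper names (`write_bag`, `write_dataset_index`, `build_dataset`) and the dictionary inputs are hypothetical. The README text does not say whether train_tcga.py expects a header row in either file, so the sketch writes none; check the loader in the script before relying on that.

```python
import csv
import os


def write_bag(dataset_dir, bag_id, instance_features):
    """Write one bag as <dataset_dir>/<bag_id>.csv, one instance feature vector per row."""
    os.makedirs(dataset_dir, exist_ok=True)
    bag_path = os.path.join(dataset_dir, f"{bag_id}.csv")
    with open(bag_path, "w", newline="") as f:
        csv.writer(f).writerows(instance_features)
    return bag_path


def write_dataset_index(index_path, bag_paths, bag_labels):
    """Write the two-column index: path to each bag .csv, then the bag label."""
    if len(bag_paths) != len(bag_labels):
        raise ValueError("need exactly one label per bag")
    with open(index_path, "w", newline="") as f:
        csv.writer(f).writerows(zip(bag_paths, bag_labels))


def build_dataset(root, dataset_name, bags, labels):
    """Hypothetical helper: `bags` maps bagID -> list of feature rows,
    `labels` maps bagID -> bag label. Returns the path of <dataset_name>.csv."""
    dataset_dir = os.path.join(root, dataset_name)
    bag_paths = [write_bag(dataset_dir, bag_id, feats) for bag_id, feats in bags.items()]
    index_path = os.path.join(root, f"{dataset_name}.csv")
    write_dataset_index(index_path, bag_paths, [labels[bag_id] for bag_id in bags])
    return index_path
```

For example, `build_dataset(".", "my-dataset", {"bag0": [[0.1, 0.2], [0.3, 0.4]]}, {"bag0": 1})` would create `my-dataset/bag0.csv` with two rows and a `my-dataset.csv` index pointing at it; that index path is what step 3 asks you to put into the script.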