<h3 align =" center " >Support Vector Data Description (SVDD)</h3 >
- <p align =" center " >MATLAB Code for abnormal detection or fault detection using SVDD</p >
- <p align =" center " >Version 2.1.1, 22-DEC-2021 </p >
+ <p align =" center " >MATLAB Code for abnormal detection using SVDD</p >
+ <p align =" center " >Version 2.2, 13-MAY-2022 </p >
<p align =" center " >Email: [email protected] </p >
<div align =center >
<hr />
- ## Main features
-
+ ## ✨ MAIN FEATURES
- SVDD model for one-class or binary classification
- Multiple kinds of kernel functions (linear, gaussian, polynomial, sigmoid, laplacian)
- Visualization of decision boundaries for 2D or 3D data
- - Parameter Optimization using Bayesian optimization, Genetic Algorithm , and Particle Swarm Optimization
+ - Parameter optimization using Bayesian optimization, genetic algorithm, and particle swarm optimization
- Weighted SVDD model
+ - Hybrid-kernel SVDD model (K = w1×K1 + w2×K2 + ... + wn×Kn)

- ## Notices
-
- - This version of the code is not compatible with the versions lower than *** R2016b*** .
+ ## ⚠️ NOTICES
+ - This version of the code is not compatible with MATLAB versions lower than ***R2016b***.
- The label must be 1 for positive sample or -1 for negative sample.
- - Detailed applications please see the demonstrations.
+ - For detailed applications, please see the provided ***demonstrations***.
- This code is for reference only.
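
The label convention in the notices (1 for positive, -1 for negative) can be met with a one-line conversion when your own data uses 0/1 labels. The following is a minimal sketch in base MATLAB; the variable `rawLabel` and the assumption that 1 marks the positive (normal) class are hypothetical and not part of this toolbox.

```matlab
% Hypothetical 0/1 labels, where 1 is assumed to mark the positive class
rawLabel = [0; 1; 1; 0; 1];

% Map 0 -> -1 and 1 -> 1 so the labels follow the 1 / -1 convention
label = 2 * rawLabel - 1;

% Sanity check: every label is now either 1 or -1
assert(all(ismember(label, [1, -1])));
```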
- ## How to use
40
-
41
- ### 01. banana-shaped dataset
42
-
43
- A class named *** DataSet*** is defined to generate and partition the 2D or 3D banana-shaped dataset.
44
- ```
45
- [data, label] = DataSet.generate;
46
- [data, label] = DataSet.generate('dim', 2);
47
- [data, label] = DataSet.generate('dim', 2, 'num', [200, 200]);
48
- [data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
49
-
50
- % 'single' --- The training set contains only positive samples.
51
- [trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'single');
52
-
53
- % 'hybrid' --- The training set contains positive and negetive samples.
54
- [trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'hybrid');
55
- ```
56
- <p align =" center " >
57
- <img src =" http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/banana-2D_.png " >
58
- <img src =" http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/banana-3D_.png " >
59
- </p >
60
-
61
- ### 02. Kernel funcions
62
-
63
- A class named *** Kernel*** is defined to compute kernel function matrix.
64
- ```
65
- %{
66
- type -
67
-
68
- linear : k(x,y) = x'*y
69
- polynomial : k(x,y) = (γ*x'*y+c)^d
70
- gaussian : k(x,y) = exp(-γ*||x-y||^2)
71
- sigmoid : k(x,y) = tanh(γ*x'*y+c)
72
- laplacian : k(x,y) = exp(-γ*||x-y||)
73
-
74
-
75
- degree - d
76
- offset - c
77
- gamma - γ
78
- %}
79
- kernel = Kernel('type', 'gaussian', 'gamma', value);
80
- kernel = Kernel('type', 'polynomial', 'degree', value);
81
- kernel = Kernel('type', 'linear');
82
- kernel = Kernel('type', 'sigmoid', 'gamma', value);
83
- kernel = Kernel('type', 'laplacian', 'gamma', value);
84
- ```
85
- For example, compute the kernel matrix between ** X** and ** Y**
86
- ```
87
- X = rand(5, 2);
88
- Y = rand(3, 2);
89
- kernel = Kernel('type', 'gaussian', 'gamma', 2);
90
- kernelMatrix = kernel.computeMatrix(X, Y);
91
- >> kernelMatrix
92
-
93
- kernelMatrix =
38
+ ## 🔨 HOW TO USE
39
+ ### 👉 A simple SVDD model
40
+ ``` MATLAB
41
+ % generate dataset
42
+ ocdata = BinaryDataset();
43
+ ocdata.generate;
44
+ [trainData, trainLabel, testData, testLabel] = ocdata.partition;
94
45
95
- 0.5684 0.5607 0.4007
96
- 0.4651 0.8383 0.5091
97
- 0.8392 0.7116 0.9834
98
- 0.4731 0.8816 0.8052
99
- 0.5034 0.9807 0.7274
100
- ```
101
-
102
- ### 03-1. Simple SVDD model for dataset containing only positive samples
103
-
104
- ```
105
- [data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
106
- [trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'single');
107
- kernel = Kernel('type', 'gaussian', 'gamma', 0.2);
46
+ % set parameter
47
+ kernel = BaseKernel('type', 'gaussian', 'gamma', 0.04);
108
48
cost = 0.3;
109
49
svddParameter = struct('cost', cost,...
110
50
'kernelFunc', kernel);
111
- svdd = BaseSVDD(svddParameter);
112
-
113
- % train SVDD model
114
- svdd.train(trainData, trainLabel);
115
- % test SVDD model
116
- results = svdd.test(testData, testLabel);
117
- ```
118
- In this code, the input of *** svdd.train*** is also supported as:
119
- ```
120
- % train SVDD model
121
- svdd.train(trainData);
122
- ```
123
- The training and test results:
124
- ```
125
- *** SVDD model training finished ***
126
- running time = 0.0069 seconds
127
- iterations = 9
128
- number of samples = 140
129
- number of SVs = 23
130
- radio of SVs = 16.4286%
131
- accuracy = 95.0000%
132
51
133
-
134
- *** SVDD model test finished ***
135
- running time = 0.0013 seconds
136
- number of samples = 260
137
- number of alarm points = 215
138
- accuracy = 94.2308%
139
- ```
140
-
141
- ### 03-2. Simple SVDD model for dataset containing both positive and negetive samples
142
-
143
- ```
144
- [data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
145
- [trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'hybrid');
146
- kernel = Kernel('type', 'gaussian', 'gamma', 0.05);
147
- cost = 0.9;
148
- svddParameter = struct('cost', cost,...
149
- 'kernelFunc', kernel);
52
+ % create an SVDD object
150
53
svdd = BaseSVDD(svddParameter);
151
-
152
54
% train SVDD model
153
55
svdd.train(trainData, trainLabel);
154
56
% test SVDD model
155
57
results = svdd.test(testData, testLabel);
156
58
```
+ - `BinaryDataset` is only intended for validating the SVDD model. To use your own data, keep the variable names consistent, e.g. `trainData`, `trainLabel`, `testData`, and `testLabel`.
+ - If the data has no labels, change the training and test calls to `svdd.train(trainData)` and `results = svdd.test(testData)`.
157
61
158
- The training and test results:
159
- ```
160
- *** SVDD model training finished ***
161
- running time = 0.0074 seconds
162
- iterations = 9
163
- number of samples = 160
164
- number of SVs = 12
165
- radio of SVs = 7.5000%
166
- accuracy = 97.5000%
167
-
168
-
169
- *** SVDD model test finished ***
170
- running time = 0.0013 seconds
171
- number of samples = 240
172
- number of alarm points = 188
173
- accuracy = 96.6667%
+ ### 👉 Parameter Optimization for SVDD model
+ A class named `SvddOptimization` is defined to optimize the parameters. First define an optimization setting structure, then add it to the SVDD parameter structure. The parameters of the polynomial kernel function can only be optimized using Bayesian optimization.
+ Please see the demonstration `📝 demo_ParameterOptimization.m` for details.
+ ``` MATLAB
+ % optimization setting
+ optimization.method = 'bayes'; % 'bayes', 'ga', or 'pso'
+ optimization.maxIteration = 20;
+ optimization.display = 'on';
+ % SVDD parameter
+ svddParameter = struct('cost', cost,...
+                        'kernelFunc', kernel,...
+                        'optimization', optimization);
174
74
```
175
- ### 04. Visualization
176
-
177
- A class named *** SvddVisualization*** is defined to visualize the training and test results.
178
-
+ The full properties of optimization are
+ - `method`: optimization method; supported values are 'bayes', 'pso', and 'ga'.
+ - `variableName`: variables to be optimized, chosen from 'cost', 'degree', 'offset', and 'gamma'.
+ - `variableType`: variable type, either 'real' (real variable) or 'integer' (integer variable).
+ - `lowerBound`: lower bound of variables.
+ - `upperBound`: upper bound of variables.
+ - `maxIteration`: max iterations.
+ - `points`: size of group or seed.
+ - `display`: visualization, 'on' or 'off'.
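
As a sketch of how these properties fit together, the setting below fills in every field for tuning `cost` and `gamma`. The bounds and the `points` value are taken from the earlier revision of this README (the removed lines in this diff) and are illustrative only; suitable ranges depend on your data.

```matlab
% Sketch of a complete optimization setting; values are illustrative only
optimization.method = 'bayes';                  % 'bayes', 'ga', or 'pso'
optimization.variableName = {'cost', 'gamma'};  % parameters to tune
optimization.variableType = {'real', 'real'};   % 'real' or 'integer'
optimization.lowerBound = [10^-2, 2^-6];        % one lower bound per variable
optimization.upperBound = [10^0, 2^6];          % one upper bound per variable
optimization.maxIteration = 20;
optimization.points = 10;
optimization.display = 'on';
```

The bound vectors are ordered to match `variableName`, so the first entry applies to `cost` and the second to `gamma`.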
+
+ ### 👉 Visualization of SVDD model
+ A class named `SvddVisualization` is defined to visualize the training and test results.
Based on the trained SVDD model, the ROC curve of the training results (only supported for datasets containing both positive and negative samples) is
180
-
181
- ```
88
+ ``` MATLAB
182
89
% Visualization
183
90
svplot = SvddVisualization();
184
91
svplot.ROC(svdd);
@@ -188,7 +95,7 @@ svplot.ROC(svdd);
188
95
</p >
189
96
190
97
The decision boundaries (only supported for 2D/3D dataset) are
191
- ```
98
+ ``` MATLAB
192
99
% Visualization
193
100
svplot = SvddVisualization();
194
101
svplot.boundary(svdd);
@@ -202,91 +109,113 @@ svplot.boundary(svdd);
202
109
</p >
203
110
204
111
The distance between the test data and the hypersphere is
205
- ```
112
+ ``` MATLAB
206
113
svplot.distance(svdd, results);
207
114
```
208
115
<p align =" center " >
209
116
<img src =" http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/distance-3D_.png " >
210
117
</p >
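
For readers who want to see what this distance is, the sketch below evaluates the standard SVDD formula d²(z) = k(z,z) − 2·Σᵢ αᵢ·k(z,xᵢ) + Σᵢⱼ αᵢ·αⱼ·k(xᵢ,xⱼ) in base MATLAB with a Gaussian kernel. It does not call the toolbox: the uniform `alpha` is a placeholder assumption standing in for the coefficients a trained model would provide, and `gaussK` is a hypothetical helper.

```matlab
X = [0 0; 1 0; 0 1; 1 1];       % training samples (one per row)
Z = [0.5 0.5; 3 3];             % test samples: one inside, one far away
gamma = 0.5;
alpha = ones(size(X, 1), 1) / size(X, 1);  % placeholder coefficients, sum to 1

% Gaussian kernel matrix between the rows of A and the rows of B
gaussK = @(A, B) exp(-gamma * max(sum(A.^2, 2) + sum(B.^2, 2)' - 2 * (A * B'), 0));

Kxx = gaussK(X, X);
Kzx = gaussK(Z, X);

% Squared distance to the center; k(z,z) = 1 for the Gaussian kernel
d2 = 1 - 2 * (Kzx * alpha) + alpha' * Kxx * alpha;
distance = sqrt(max(d2, 0));
```

With these inputs the far-away point `[3 3]` gets a larger distance than `[0.5 0.5]`. A test sample is flagged when its distance exceeds the radius of the hypersphere.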
211
118
212
- For the test results, the test data and decision boundary (only supported for 2D/3D dataset) are
213
- ```
214
- svplot.testDataWithBoundary(svdd, results);
215
- ```
216
- <p align =" center " >
217
- <img src =" http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/boundary-tets-2D_.png " >
218
- <img src =" http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/boundary-tets-3D_.png " >
219
- </p >
220
-
221
- ### 05. Parameter Optimization
222
-
223
- A class named *** SvddOptimization*** is defined to optimized the parameters.
224
-
225
- ```
226
- % optimization setting
227
- optimization.method = 'bayes'; % bayes, ga pso
228
- optimization.variableName = { 'cost', 'gamma'};
229
- optimization.variableType = {'real', 'real'}; % 'integer' 'real'
230
- optimization.lowerBound = [10^-2, 2^-6];
231
- optimization.upperBound = [10^0, 2^6];
232
- optimization.maxIteration = 20;
233
- optimization.points = 10;
234
- optimization.display = 'on';
235
-
236
- % SVDD parameter
237
- svddParameter = struct('cost', cost,...
238
- 'kernelFunc', kernel,...
239
- 'optimization', optimization);
+ ### 👉 Binary Dataset for SVDD model
+ A class named `BinaryDataset` is defined to generate and partition the 2D or 3D binary dataset.
+ Please see the demonstration `📝demo_BinaryDataset.m` for details.
+ ``` MATLAB
+ ocdata = BinaryDataset();
+ [data, label] = ocdata.generate;
+ [trainData, trainLabel, testData, testLabel] = ocdata.partition;
+ ```
+ The method `generate` generates the dataset. Its syntax is
+ ``` MATLAB
+ ocdata.generate;
+ data = ocdata.generate;
+ [data, label] = ocdata.generate;
+ ```
+ The method `partition` splits the dataset into a training set and a test set. Its syntax is
+ ``` MATLAB
+ [trainData, trainLabel, testData, testLabel] = ocdata.partition;
+ ```
+ The full Name-Value Arguments of class `BinaryDataset` are
+ - `shape`: shape of dataset, 'banana' or 'circle'.
+ - `dimensionality`: dimensionality of dataset, 2 or 3.
+ - `number`: number of samples per class, for example: [200, 200].
+ - `display`: visualization, 'on' or 'off'.
+ - `noise`: noise added to dataset with range [0, 1]. For example: 0.2.
+ - `ratio`: ratio of the test set with range (0, 1). For example: 0.3.
+
+ ### 👉 Kernel functions
+ A class named `BaseKernel` is defined to compute the kernel function matrix.
+ Please see the demonstration `📝demo_KernelFuncion.m` for details.
+ ``` MATLAB
+ %{
+ type -
+
+ linear : k(x,y) = x'*y
+ polynomial : k(x,y) = (γ*x'*y+c)^d
+ gaussian : k(x,y) = exp(-γ*||x-y||^2)
+ sigmoid : k(x,y) = tanh(γ*x'*y+c)
+ laplacian : k(x,y) = exp(-γ*||x-y||)
+
+
+ degree - d
+ offset - c
+ gamma - γ
+ %}
+ kernel = BaseKernel('type', 'gaussian', 'gamma', value);
+ kernel = BaseKernel('type', 'polynomial', 'degree', value);
+ kernel = BaseKernel('type', 'linear');
+ kernel = BaseKernel('type', 'sigmoid', 'gamma', value);
+ kernel = BaseKernel('type', 'laplacian', 'gamma', value);
240
168
```
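
To make the formulas concrete, the sketch below computes a Gaussian kernel matrix k(x,y) = exp(-γ·||x-y||²) between two small sample sets using only base MATLAB. It is independent of `BaseKernel` and is not necessarily how the class computes the matrix internally. It relies on implicit expansion, which requires R2016b or later.

```matlab
X = [0 0; 1 0; 0 1];      % 3 samples, 2 features
Y = [0 0; 1 1];           % 2 samples, 2 features
gamma = 2;

% Squared Euclidean distances via ||x||^2 + ||y||^2 - 2*x'*y
sqDist = sum(X.^2, 2) + sum(Y.^2, 2)' - 2 * (X * Y');
sqDist = max(sqDist, 0);  % guard against tiny negative round-off

% Kernel matrix: entry (i, j) is k(X(i,:), Y(j,:))
K = exp(-gamma * sqDist);
```

The result has one row per sample in `X` and one column per sample in `Y`, so it is 3-by-2 here.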
241
-
242
- The visualization of parameter optimization is
243
- <p align =" center " >
244
- <img src =" http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/bayesopt_1_.png " >
245
- <img src =" http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/bayesopt_2_.png " >
246
- </p >
247
-
248
- ** Notice**
249
- - The optimization method can be set to 'bayes', 'ga', 'pso'.
250
- - The parameter names are limited to 'cost', 'degree', 'offset', 'gamma'
251
- - The parameter optimization of the polynomial kernel function can only be achieved by using Bayesian optimization.
252
- - The parameter type of 'degree' should be set to 'integer'.
253
-
254
-
255
- ### 06. Cross Validation
256
-
169
+ ### 👉 Cross Validation
257
170
In this code, two cross-validation methods are supported: 'K-Folds' and 'Holdout'.
258
171
For example, the cross-validation of 5-Folds is
259
- ```
172
+ ``` MATLAB
260
173
svddParameter = struct('cost', cost,...
261
174
'kernelFunc', kernel,...
262
175
'KFold', 5);
263
176
```
264
177
For example, the cross-validation of the Holdout method with a ratio of 0.3 is
265
- ```
178
+ ``` MATLAB
266
179
svddParameter = struct('cost', cost,...
267
180
'kernelFunc', kernel,...
268
181
'Holdout', 0.3);
269
182
```
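
As background on what a K-fold split does, the sketch below assigns samples to 5 folds in base MATLAB. It illustrates the general technique only; how the toolbox partitions the data internally is not shown in this README, and the names `numSamples` and `foldId` are hypothetical.

```matlab
numSamples = 10;
numFolds = 5;
rng(0);                                   % fixed seed for reproducibility

% Shuffle the sample indices, then deal them into folds round-robin
shuffled = randperm(numSamples);
foldId = zeros(numSamples, 1);
foldId(shuffled) = mod(0:numSamples-1, numFolds) + 1;

for k = 1:numFolds
    valIdx = find(foldId == k);           % held-out samples for this fold
    trainIdx = find(foldId ~= k);         % remaining samples used for training
    fprintf('fold %d: %d training, %d validation samples\n', ...
        k, numel(trainIdx), numel(valIdx));
end
```

Each sample is held out exactly once, and the validation accuracy is typically averaged over the folds.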
270
183
271
- ### 07. Dimensionality reduction using PCA
272
-
184
+ ### 👉 Dimensionality reduction using PCA
273
185
For example, reducing the data to 2 dimensions can be set as
274
- ```
186
+ ``` MATLAB
275
187
% SVDD parameter
276
188
svddParameter = struct('cost', cost,...
277
189
'kernelFunc', kernel,...
278
190
'PCA', 2);
279
191
```
280
192
** Notice:** you only need to set PCA in svddParameter, and you don't need to process training data and test data separately.
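
For context, the sketch below shows what a PCA projection to 2 dimensions involves when done by hand in base MATLAB: the projection is fitted on the training data and then reused for the test data. This is a generic illustration under assumed random data, not the toolbox's internal implementation.

```matlab
rng(0);
trainData = randn(50, 5);                 % hypothetical training set
testData = randn(20, 5);                  % hypothetical test set
numComponents = 2;

% Center with the training mean and get principal directions from the SVD
mu = mean(trainData, 1);
[~, ~, V] = svd(trainData - mu, 'econ');
W = V(:, 1:numComponents);                % 5-by-2 projection matrix

% Apply the same centering and projection to both sets
trainReduced = (trainData - mu) * W;
testReduced = (testData - mu) * W;
```

Reusing `mu` and `W` for the test set is the step that setting `'PCA'` in `svddParameter` saves you from doing manually.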
281
193
282
- ### 08. Weighted SVDD
283
-
284
- An Observation-weighted SVDD is supported in this code. For example, the weighted SVDD can be set as
285
- ```
194
+ ### 👉 Weighted SVDD
195
+ An Observation-weighted SVDD is supported in this code.
196
+ Please see the demonstration ` demo_ObservationWeight.m ` for details.
197
+ ``` MATLAB
286
198
weight = rand(size(trainData, 1), 1);
287
199
% SVDD parameter
288
200
svddParameter = struct('cost', cost,...
289
201
'kernelFunc', kernel,...
290
202
'weight', weight);
291
203
```
292
204
**Notice:** the size of 'weight' should be m×1, where m is the number of training samples.
+
+ ### 👉 Hybrid-kernel SVDD model
+ A demo for SVDD using hybrid kernel functions (K = w1×K1 + w2×K2 + ... + wn×Kn).
+ Please see the demonstration `demo_HybridKernel.m` for details.
+ ``` MATLAB
+ kernel_1 = BaseKernel('type', 'gaussian', 'gamma', 0.3);
+ kernel_2 = BaseKernel('type', 'polynomial', 'degree', 2);
+ kernel_3 = BaseKernel('type', 'sigmoid', 'gamma', 0.05);
+ kernelWeight = [0.5, 0.2, 0.3];
+ % parameter setting
+ % kernel = Kernel('type', 'gaussian', 'gamma', 0.04);
+ cost = 0.3;
+ svddParameter = struct('cost', cost,...
+                        'kernelFunc', [kernel_1, kernel_2, kernel_3],...
+                        'kernelWeight', kernelWeight);
+ ```
+ **Notice:** the size of 'weight' should be m×1, where m is the number of training samples.
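
The weighted sum K = w1×K1 + w2×K2 + ... + wn×Kn can be illustrated without the toolbox. The sketch below builds three kernel matrices in base MATLAB and combines them with the weights from the demo. The polynomial `gamma = 1` and the zero offsets are assumptions made for this illustration, since the defaults of `BaseKernel` are not stated in this README.

```matlab
X = [0 0; 1 0; 0 1];                        % 3 samples, 2 features
G = X * X';                                 % Gram matrix of inner products x'*y
sqDist = max(sum(X.^2, 2) + sum(X.^2, 2)' - 2 * G, 0);

K1 = exp(-0.3 * sqDist);                    % gaussian, gamma = 0.3
K2 = (1 * G + 0).^2;                        % polynomial, degree 2 (gamma = 1, offset = 0 assumed)
K3 = tanh(0.05 * G + 0);                    % sigmoid, gamma = 0.05 (offset = 0 assumed)

kernelWeight = [0.5, 0.2, 0.3];             % one weight per kernel

% Hybrid kernel: weighted sum of the individual kernel matrices
K = kernelWeight(1) * K1 + kernelWeight(2) * K2 + kernelWeight(3) * K3;
```

The combined matrix has the same size as each component and stays symmetric because every component is symmetric.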