Commit 632ab75

committed
updated the code
1 parent 800305a commit 632ab75

22 files changed, +1087 -825 lines changed

README.md

Lines changed: 122 additions & 193 deletions
@@ -4,8 +4,8 @@
44

55
<h3 align="center">Support Vector Data Description (SVDD)</h3>
66

7-
<p align="center">MATLAB Code for abnormal detection or fault detection using SVDD</p>
8-
<p align="center">Version 2.1.1, 22-DEC-2021</p>
7+
<p align="center">MATLAB Code for abnormal detection using SVDD</p>
8+
<p align="center">Version 2.2, 13-MAY-2022</p>
99
<p align="center">Email: [email protected]</p>
1010

1111
<div align=center>
@@ -21,164 +21,71 @@
2121

2222
<hr />
2323

24-
## Main features
25-
24+
## ✨ MAIN FEATURES
2625
- SVDD model for one-class or binary classification
2726
- Multiple kinds of kernel functions (linear, gaussian, polynomial, sigmoid, laplacian)
2827
- Visualization of decision boundaries for 2D or 3D data
29-
- Parameter Optimization using Bayesian optimization, Genetic Algorithm, and Particle Swarm Optimization
28+
- Parameter optimization using Bayesian optimization, genetic algorithm, and particle swarm optimization
3029
- Weighted SVDD model
30+
- Hybrid-kernel SVDD model (K =w1×K1+w2×K2+...+wn×Kn)
3131

32-
## Notices
33-
34-
- This version of the code is not compatible with the versions lower than ***R2016b***.
32+
## ⚠️ NOTICES
33+
- This version of the code is not compatible with MATLAB versions earlier than ***R2016b***.
3534
- The label must be 1 for positive sample or -1 for negative sample.
36-
- Detailed applications please see the demonstrations.
35+
- For detailed applications, please see the provided ***demonstrations***.
3736
- This code is for reference only.
3837

39-
## How to use
40-
41-
### 01. banana-shaped dataset
42-
43-
A class named ***DataSet*** is defined to generate and partition the 2D or 3D banana-shaped dataset.
44-
```
45-
[data, label] = DataSet.generate;
46-
[data, label] = DataSet.generate('dim', 2);
47-
[data, label] = DataSet.generate('dim', 2, 'num', [200, 200]);
48-
[data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
49-
50-
% 'single' --- The training set contains only positive samples.
51-
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'single');
52-
53-
% 'hybrid' --- The training set contains positive and negetive samples.
54-
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'hybrid');
55-
```
56-
<p align="center">
57-
<img src="http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/banana-2D_.png">
58-
<img src="http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/banana-3D_.png">
59-
</p>
60-
61-
### 02. Kernel funcions
62-
63-
A class named ***Kernel*** is defined to compute kernel function matrix.
64-
```
65-
%{
66-
type -
67-
68-
linear : k(x,y) = x'*y
69-
polynomial : k(x,y) = (γ*x'*y+c)^d
70-
gaussian : k(x,y) = exp(-γ*||x-y||^2)
71-
sigmoid : k(x,y) = tanh(γ*x'*y+c)
72-
laplacian : k(x,y) = exp(-γ*||x-y||)
73-
74-
75-
degree - d
76-
offset - c
77-
gamma - γ
78-
%}
79-
kernel = Kernel('type', 'gaussian', 'gamma', value);
80-
kernel = Kernel('type', 'polynomial', 'degree', value);
81-
kernel = Kernel('type', 'linear');
82-
kernel = Kernel('type', 'sigmoid', 'gamma', value);
83-
kernel = Kernel('type', 'laplacian', 'gamma', value);
84-
```
85-
For example, compute the kernel matrix between **X** and **Y**
86-
```
87-
X = rand(5, 2);
88-
Y = rand(3, 2);
89-
kernel = Kernel('type', 'gaussian', 'gamma', 2);
90-
kernelMatrix = kernel.computeMatrix(X, Y);
91-
>> kernelMatrix
92-
93-
kernelMatrix =
38+
## 🔨 HOW TO USE
39+
### 👉 A simple SVDD model
40+
```MATLAB
41+
% generate dataset
42+
ocdata = BinaryDataset();
43+
ocdata.generate;
44+
[trainData, trainLabel, testData, testLabel] = ocdata.partition;
9445
95-
0.5684 0.5607 0.4007
96-
0.4651 0.8383 0.5091
97-
0.8392 0.7116 0.9834
98-
0.4731 0.8816 0.8052
99-
0.5034 0.9807 0.7274
100-
```
101-
102-
### 03-1. Simple SVDD model for dataset containing only positive samples
103-
104-
```
105-
[data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
106-
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'single');
107-
kernel = Kernel('type', 'gaussian', 'gamma', 0.2);
46+
% set parameter
47+
kernel = BaseKernel('type', 'gaussian', 'gamma', 0.04);
10848
cost = 0.3;
10949
svddParameter = struct('cost', cost,...
11050
'kernelFunc', kernel);
111-
svdd = BaseSVDD(svddParameter);
112-
113-
% train SVDD model
114-
svdd.train(trainData, trainLabel);
115-
% test SVDD model
116-
results = svdd.test(testData, testLabel);
117-
```
118-
In this code, the input of ***svdd.train*** is also supported as:
119-
```
120-
% train SVDD model
121-
svdd.train(trainData);
122-
```
123-
The training and test results:
124-
```
125-
*** SVDD model training finished ***
126-
running time = 0.0069 seconds
127-
iterations = 9
128-
number of samples = 140
129-
number of SVs = 23
130-
radio of SVs = 16.4286%
131-
accuracy = 95.0000%
13251
133-
134-
*** SVDD model test finished ***
135-
running time = 0.0013 seconds
136-
number of samples = 260
137-
number of alarm points = 215
138-
accuracy = 94.2308%
139-
```
140-
141-
### 03-2. Simple SVDD model for dataset containing both positive and negetive samples
142-
143-
```
144-
[data, label] = DataSet.generate('dim', 3, 'num', [200, 200], 'display', 'on');
145-
[trainData, trainLabel, testData, testLabel] = DataSet.partition(data, label, 'type', 'hybrid');
146-
kernel = Kernel('type', 'gaussian', 'gamma', 0.05);
147-
cost = 0.9;
148-
svddParameter = struct('cost', cost,...
149-
'kernelFunc', kernel);
52+
% create an SVDD object
15053
svdd = BaseSVDD(svddParameter);
151-
15254
% train SVDD model
15355
svdd.train(trainData, trainLabel);
15456
% test SVDD model
15557
results = svdd.test(testData, testLabel);
15658
```
59+
- `BinaryDataset` is intended only for validating the SVDD model. You can use your own data; just keep the variable names consistent, e.g. `trainData`, `trainLabel`, `testData`, and `testLabel`.
60+
- Specifically, if the data does not have labels, please change the inputs for training or testing to `svdd.train(trainData)` and `results = svdd.test(testData)`.
15761

158-
The training and test results:
159-
```
160-
*** SVDD model training finished ***
161-
running time = 0.0074 seconds
162-
iterations = 9
163-
number of samples = 160
164-
number of SVs = 12
165-
radio of SVs = 7.5000%
166-
accuracy = 97.5000%
167-
168-
169-
*** SVDD model test finished ***
170-
running time = 0.0013 seconds
171-
number of samples = 240
172-
number of alarm points = 188
173-
accuracy = 96.6667%
62+
### 👉 Parameter Optimization for SVDD model
63+
A class named `SvddOptimization` is defined to optimize the parameters. First define an optimization setting structure, then add it to the SVDD parameter structure. Parameter optimization for the polynomial kernel function is only supported with Bayesian optimization.
64+
Please see the demonstration `📝 demo_ParameterOptimization.m` for details.
65+
```MATLAB
66+
% optimization setting
67+
optimization.method = 'bayes'; % 'bayes', 'ga', or 'pso'
68+
optimization.maxIteration = 20;
69+
optimization.display = 'on';
70+
% SVDD parameter
71+
svddParameter = struct('cost', cost,...
72+
'kernelFunc', kernel,...
73+
'optimization', optimization);
17474
```
175-
### 04. Visualization
176-
177-
A class named ***SvddVisualization*** is defined to visualize the training and test results.
178-
75+
The full properties of optimization are
76+
- `method`: optimization method; supported values are 'bayes', 'pso', and 'ga'.
77+
- `variableName`: variables that are to be optimized, including 'cost', 'degree', 'offset', and 'gamma'.
78+
- `variableType`: variable type, specified as 'real' (real variable), 'integer' (integer variable).
79+
- `lowerBound`: lower bound of variables.
80+
- `upperBound`: upper bound of variables.
81+
- `maxIteration`: max iterations.
82+
- `points`: size of group or seed.
83+
- `display`: visualization, 'on' or 'off'.
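As an illustration of the properties listed above, a fully populated setting structure might look like the sketch below. The field values are copied from the previous README revision (the removed lines in this diff); whether version 2.2 accepts exactly these values is an assumption, and the two `assert` lines are only local sanity checks, not part of the toolbox.

```MATLAB
% Sketch of a complete optimization setting (values from the older README).
optimization = struct();
optimization.method = 'bayes';                  % 'bayes', 'ga', or 'pso'
optimization.variableName = {'cost', 'gamma'};  % parameters to be tuned
optimization.variableType = {'real', 'real'};   % 'real' or 'integer'
optimization.lowerBound = [10^-2, 2^-6];        % one bound per variable
optimization.upperBound = [10^0, 2^6];
optimization.maxIteration = 20;
optimization.points = 10;                       % size of group or seed
optimization.display = 'on';

% Local sanity checks: one bound per variable, lower bound below upper bound.
assert(numel(optimization.variableName) == numel(optimization.lowerBound));
assert(all(optimization.lowerBound < optimization.upperBound));
```

The older README also noted that the type of 'degree' should be set to 'integer' when it is optimized.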
84+
85+
### 👉 Visualization of SVDD model
86+
A class named `SvddVisualization` is defined to visualize the training and test results.
17987
Based on the trained SVDD model, the ROC curve of the training results (only supported for datasets containing both positive and negative samples) is
180-
181-
```
88+
```MATLAB
18289
% Visualization
18390
svplot = SvddVisualization();
18491
svplot.ROC(svdd);
@@ -188,7 +95,7 @@ svplot.ROC(svdd);
18895
</p>
18996

19097
The decision boundaries (only supported for 2D/3D dataset) are
191-
```
98+
```MATLAB
19299
% Visualization
193100
svplot = SvddVisualization();
194101
svplot.boundary(svdd);
@@ -202,91 +109,113 @@ svplot.boundary(svdd);
202109
</p>
203110

204111
The distance between the test data and the hypersphere is
205-
```
112+
```MATLAB
206113
svplot.distance(svdd, results);
207114
```
208115
<p align="center">
209116
<img src="http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/distance-3D_.png">
210117
</p>
211118

212-
For the test results, the test data and decision boundary (only supported for 2D/3D dataset) are
213-
```
214-
svplot.testDataWithBoundary(svdd, results);
215-
```
216-
<p align="center">
217-
<img src="http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/boundary-tets-2D_.png">
218-
<img src="http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/boundary-tets-3D_.png">
219-
</p>
220-
221-
### 05. Parameter Optimization
222-
223-
A class named ***SvddOptimization*** is defined to optimized the parameters.
224-
225-
```
226-
% optimization setting
227-
optimization.method = 'bayes'; % bayes, ga pso
228-
optimization.variableName = { 'cost', 'gamma'};
229-
optimization.variableType = {'real', 'real'}; % 'integer' 'real'
230-
optimization.lowerBound = [10^-2, 2^-6];
231-
optimization.upperBound = [10^0, 2^6];
232-
optimization.maxIteration = 20;
233-
optimization.points = 10;
234-
optimization.display = 'on';
235-
236-
% SVDD parameter
237-
svddParameter = struct('cost', cost,...
238-
'kernelFunc', kernel,...
239-
'optimization', optimization);
119+
### 👉 Binary Dataset for SVDD model
120+
A class named `BinaryDataset` is defined to generate and partition the 2D or 3D binary dataset.
121+
Please see the demonstration `📝demo_BinaryDataset.m` for details.
122+
```MATLAB
123+
ocdata = BinaryDataset();
124+
[data, label] = ocdata.generate;
125+
[trainData, trainLabel, testData, testLabel] = ocdata.partition;
126+
```
127+
The method `generate` is designed to generate a dataset. The syntax of `generate` is
128+
```MATLAB
129+
ocdata.generate;
130+
data = ocdata.generate;
131+
[data, label] = ocdata.generate;
132+
```
133+
The method `partition` is designed to partition a dataset into a training set and a test set. The syntax of `partition` is
134+
```MATLAB
135+
[trainData, trainLabel, testData, testLabel] = ocdata.partition;
136+
```
137+
The full Name-Value Arguments of class `BinaryDataset` are
138+
- `shape`: shape of dataset, 'banana' or 'circle'.
139+
- `dimensionality`: dimensionality of dataset, 2 or 3.
140+
- `number`: number of samples per class, for example: [200, 200].
141+
- `display`: visualization, 'on' or 'off'.
142+
- `noise`: noise added to dataset with range [0, 1]. For example: 0.2.
143+
- `ratio`: ratio of the test set with range (0, 1). For example: 0.3.
144+
145+
### 👉 Kernel functions
146+
A class named `BaseKernel` is defined to compute the kernel function matrix.
147+
Please see the demonstration `📝demo_KernelFuncion.m` for details.
148+
```MATLAB
149+
%{
150+
type -
151+
152+
linear : k(x,y) = x'*y
153+
polynomial : k(x,y) = (γ*x'*y+c)^d
154+
gaussian : k(x,y) = exp(-γ*||x-y||^2)
155+
sigmoid : k(x,y) = tanh(γ*x'*y+c)
156+
laplacian : k(x,y) = exp(-γ*||x-y||)
157+
158+
159+
degree - d
160+
offset - c
161+
gamma - γ
162+
%}
163+
kernel = BaseKernel('type', 'gaussian', 'gamma', value);
164+
kernel = BaseKernel('type', 'polynomial', 'degree', value);
165+
kernel = BaseKernel('type', 'linear');
166+
kernel = BaseKernel('type', 'sigmoid', 'gamma', value);
167+
kernel = BaseKernel('type', 'laplacian', 'gamma', value);
240168
```
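To make the gaussian formula above concrete, here is a minimal sketch that evaluates k(x,y) = exp(-γ*||x-y||^2) for every pair of rows in two small matrices. It uses base MATLAB only and does not call `BaseKernel`; the variable names (`gammaValue`, `sqDist`, `K`) are illustrative, and the implicit expansion it relies on needs R2016b or later.

```MATLAB
% Gaussian kernel matrix between the rows of X (n-by-d) and Y (m-by-d).
X = [0 0; 1 0; 0 2];
Y = [0 0; 1 1];
gammaValue = 0.5;

% Squared Euclidean distances via ||x||^2 + ||y||^2 - 2*x'*y.
sqDist = sum(X.^2, 2) + sum(Y.^2, 2)' - 2 * (X * Y');
sqDist = max(sqDist, 0);              % guard against tiny negative round-off

K = exp(-gammaValue * sqDist);        % n-by-m matrix with entries in (0, 1]
```

The laplacian kernel follows the same pattern with `sqrt(sqDist)` in place of `sqDist`.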
241-
242-
The visualization of parameter optimization is
243-
<p align="center">
244-
<img src="http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/bayesopt_1_.png">
245-
<img src="http://github-files-qiu.oss-cn-beijing.aliyuncs.com/SVDD-MATLAB/bayesopt_2_.png">
246-
</p>
247-
248-
**Notice**
249-
- The optimization method can be set to 'bayes', 'ga', 'pso'.
250-
- The parameter names are limited to 'cost', 'degree', 'offset', 'gamma'
251-
- The parameter optimization of the polynomial kernel function can only be achieved by using Bayesian optimization.
252-
- The parameter type of 'degree' should be set to 'integer'.
253-
254-
255-
### 06. Cross Validation
256-
169+
### 👉 Cross Validation
257170
In this code, two cross-validation methods are supported: 'K-Folds' and 'Holdout'.
258171
For example, the cross-validation of 5-Folds is
259-
```
172+
```MATLAB
260173
svddParameter = struct('cost', cost,...
261174
'kernelFunc', kernel,...
262175
'KFold', 5);
263176
```
264177
For example, the cross-validation of the Holdout method with a ratio of 0.3 is
265-
```
178+
```MATLAB
266179
svddParameter = struct('cost', cost,...
267180
'kernelFunc', kernel,...
268181
'Holdout', 0.3);
269182
```
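For readers unfamiliar with the two schemes, the sketch below shows what a K-Folds and a Holdout split do to the sample indices. It is a generic illustration in base MATLAB, not the toolbox's internal code; the round-robin assignment and all variable names are assumptions made for clarity, and a real split would normally shuffle the samples first.

```MATLAB
numSamples = 10;
numFolds = 5;
holdoutRatio = 0.3;

% K-Folds: assign each sample to one of numFolds folds in round-robin order.
foldId = mod((0:numSamples - 1)', numFolds) + 1;
foldSizes = zeros(1, numFolds);
for k = 1:numFolds
    validIdx = (foldId == k);         % validation rows for this round
    trainIdx = ~validIdx;             % remaining rows are used for fitting
    foldSizes(k) = sum(validIdx);
end

% Holdout: reserve a fixed fraction of the samples for validation.
numHoldout = round(holdoutRatio * numSamples);
holdoutIdx = false(numSamples, 1);
holdoutIdx(end - numHoldout + 1:end) = true;
```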
270183

271-
### 07. Dimensionality reduction using PCA
272-
184+
### 👉 Dimensionality reduction using PCA
273185
For example, reducing the data to 2 dimensions can be set as
274-
```
186+
```MATLAB
275187
% SVDD parameter
276188
svddParameter = struct('cost', cost,...
277189
'kernelFunc', kernel,...
278190
'PCA', 2);
279191
```
280192
**Notice:** you only need to set PCA in svddParameter, and you don't need to process training data and test data separately.
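The point of the notice is that the projection must be learned from the training data and then reused for the test data. A generic sketch of that idea with base MATLAB's `svd` is shown below; it is not the toolbox's implementation, and the data and variable names are made up for illustration.

```MATLAB
trainData = [2 0 1; 0 1 0; 1 1 2; 3 2 1; 4 0 3];
testData  = [1 1 1; 2 2 2];
nComponents = 2;

% Fit PCA on the training data only.
mu = mean(trainData, 1);
[~, ~, V] = svd(trainData - mu, 'econ');
W = V(:, 1:nComponents);              % leading principal directions

% Apply the same centering and projection to both sets.
trainReduced = (trainData - mu) * W;
testReduced  = (testData - mu) * W;
```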
281193

282-
### 08. Weighted SVDD
283-
284-
An Observation-weighted SVDD is supported in this code. For example, the weighted SVDD can be set as
285-
```
194+
### 👉 Weighted SVDD
195+
An Observation-weighted SVDD is supported in this code.
196+
Please see the demonstration `demo_ObservationWeight.m` for details.
197+
```MATLAB
286198
weight = rand(size(trainData, 1), 1);
287199
% SVDD parameter
288200
svddParameter = struct('cost', cost,...
289201
'kernelFunc', kernel,...
290202
'weight', weight);
291203
```
292204
**Notice:** the size of 'weight' should be m×1, where m is the number of training samples.
205+
206+
### 👉 Hybrid-kernel SVDD model
207+
A demo for SVDD using Hybrid kernel functions (K =w1×K1+w2×K2+...+wn×Kn).
208+
Please see the demonstration `demo_HybridKernel.m` for details.
209+
```MATLAB
210+
kernel_1 = BaseKernel('type', 'gaussian', 'gamma', 0.3);
211+
kernel_2 = BaseKernel('type', 'polynomial', 'degree', 2);
212+
kernel_3 = BaseKernel('type', 'sigmoid', 'gamma', 0.05);
213+
kernelWeight = [0.5, 0.2, 0.3];
214+
% parameter setting
215+
% kernel = Kernel('type', 'gaussian', 'gamma', 0.04);
216+
cost = 0.3;
217+
svddParameter = struct('cost', cost,...
218+
'kernelFunc', [kernel_1, kernel_2, kernel_3],...
219+
'kernelWeight', kernelWeight);
220+
```
221+
**Notice:** the size of 'weight' should be m×1, where m is the number of training samples.
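The formula K = w1×K1+w2×K2+...+wn×Kn can be checked by hand on a tiny example. The sketch below builds the three kernel matrices directly in base MATLAB rather than through `BaseKernel`; the polynomial gamma and offset (both 1) and the sigmoid offset (0) are assumptions, since the toolbox defaults are not stated here.

```MATLAB
X = [0 0; 1 0; 0 2];
kernelWeight = [0.5, 0.2, 0.3];

G = X * X';                                   % inner products x'*y
sqNorm = diag(G);
sqDist = max(sqNorm + sqNorm' - 2 * G, 0);    % squared distances ||x-y||^2

K1 = exp(-0.3 * sqDist);                      % gaussian, gamma = 0.3
K2 = (1 * G + 1).^2;                          % polynomial, degree 2 (gamma, offset assumed 1)
K3 = tanh(0.05 * G + 0);                      % sigmoid, gamma = 0.05 (offset assumed 0)

% Weighted sum of the individual kernel matrices.
Khybrid = kernelWeight(1) * K1 + kernelWeight(2) * K2 + kernelWeight(3) * K3;
```

Choosing weights that sum to 1, as in the example above, keeps the combined values on a scale comparable to the individual kernels.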

SVDD-V2.1使用说明.pdf

-1.69 MB
Binary file not shown.
