
Port of libsvm v3.22 using emscripten, for usage in the browser or Node.js. There are two build targets: asm and WebAssembly.

What is libsvm?

libsvm is a C++ library developed by Chih-Chung Chang and Chih-Jen Lin that performs support vector machine (aka SVM) classification and regression.

Resources about libsvm:

Usage

Install

npm install libsvm-js

Load in nodejs

The main entry point loads the WebAssembly build and is asynchronous.

require('libsvm-js').then(SVM => {
    const svm = new SVM(); // ...
});

There is an alternative entry point if you want to use the asm build. This entry point is synchronous.

const SVM = require('libsvm-js/asm');
const svm = new SVM(); // ...

Load in the browser

The npm package contains a bundle for the browser that works with AMD and browser globals. There is one bundle for the asm build and another for the WebAssembly build. They are located in the dist/browser directory of the package. You can load them into your web page with a script tag. For the WebAssembly module, make sure that the libsvm.wasm file is served from the same relative path as the js file.

Basic usage

This example illustrates how to use the library to train and use an SVM classifier.

async function xor() {
    const SVM = await require('libsvm-js');
    const svm = new SVM({
        kernel: SVM.KERNEL_TYPES.RBF, // The type of kernel I want to use
        type: SVM.SVM_TYPES.C_SVC,    // The type of SVM I want to run
        gamma: 1,                     // RBF kernel gamma parameter
        cost: 1                       // C_SVC cost parameter
    });

    // This is the xor problem
    //
    //  1  0
    //  0  1
    const features = [[0, 0], [1, 1], [1, 0], [0, 1]];
    const labels = [0, 0, 1, 1];
    svm.train(features, labels);  // train the model
    const predictedLabel = svm.predictOne([0.7, 0.8]);
    console.log(predictedLabel) // 0
}

xor().then(() => console.log('done!'));

Benchmarks

You can compare the performance of the library in various environments. Run npm run benchmark to run the benchmarks with native C/C++ code and with the compiled code on your local version of Node.js. For browser performance, go to the web benchmark page.

Speed is mainly affected by the JavaScript engine that compiles the code. Since WebAssembly has been stabilized and is in an optimization phase, more recent engines are almost always faster.

Speed is also affected by the version of emscripten that generated the build or the options used in the build. I will try to keep up with any improvement that might significantly impact the performance.

Cross-validation benchmark

I report the results of the cross-validation benchmark on the iris dataset here to give a feeling for how performance compares across platforms. There are other benchmarks that can be run from the terminal in Node.js or in the browser. The performance results are given relative to how they run natively (with compiled C++ code). The benchmarks only consider runtime performance, not load and parse performance.

| Platform | Relative asm performance | Relative wasm performance |
| --- | --- | --- |
| Native | 100% | 100% |
| Node.js v8.1.2 | 34.2% | 52.6% |
| Node.js v7.10.0 | 14.4% | N/A |
| Chrome 59.0.3071.115 | 36.2% | 51.3% |
| Firefox 54.0 | 35.5% | 70.4% |

What are asm and WebAssembly?

From asmjs.org

asm is an optimizable subset of javascript.

From webassembly.org

WebAssembly or wasm is a new portable, size- and load-time-efficient format suitable for compilation to the web

Should I use asm or WebAssembly?

Both. You should try to use WebAssembly first and fall back to asm in order to support all browsers.

WebAssembly is currently supported in the latest stable versions of Chrome and Firefox, and in preview versions of Safari and Edge.
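The "WebAssembly first, asm as fallback" advice can be sketched for Node.js as follows. The `loadSVM` helper and its two loader arguments are hypothetical names introduced for illustration, not part of the library; only the two entry points shown in the usage comment come from this README.

```javascript
// Hypothetical helper (not part of libsvm-js): try the asynchronous
// WebAssembly entry point first, then fall back to the synchronous asm one.
async function loadSVM(loadWasm, loadAsm) {
  if (typeof WebAssembly === 'object') {
    try {
      return await loadWasm();
    } catch (err) {
      // The WebAssembly build failed to load; fall through to asm.
    }
  }
  return loadAsm();
}

// Usage with the entry points documented above:
// loadSVM(() => require('libsvm-js'), () => require('libsvm-js/asm'))
//   .then(SVM => {
//     const svm = new SVM(); // ...
//   });
```

Passing the loaders as functions keeps the asm build from being loaded unless it is actually needed. In the browser, the same decision is typically made by choosing which bundle to load with a script tag.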

API Documentation

SVM

Kind: global class

new SVM(options)

| Param | Type | Default | Description |
| --- | --- | --- | --- |
| options | `object` | | |
| [options.type] | `number` | `SVM_TYPES.C_SVC` | Type of SVM to perform |
| [options.kernel] | `number` | `KERNEL_TYPES.RBF` | Kernel function |
| [options.degree] | `number` | `3` | Degree of polynomial, for the polynomial kernel |
| [options.gamma] | `number` | | Gamma parameter of the RBF, Polynomial and Sigmoid kernels. Default value is 1/num_features |
| [options.coef0] | `number` | `0` | coef0 parameter for Polynomial and Sigmoid kernels |
| [options.cost] | `number` | `1` | Cost parameter, for C SVC, Epsilon SVR and NU SVR |
| [options.nu] | `number` | `0.5` | For NU SVC and NU SVR |
| [options.epsilon] | `number` | `0.1` | For epsilon SVR |
| [options.cacheSize] | `number` | `100` | Cache size in MB |
| [options.tolerance] | `number` | `0.001` | Tolerance |
| [options.shrinking] | `boolean` | `true` | Use shrinking heuristics (faster) |
| [options.probabilityEstimates] | `boolean` | `false` | Whether to train the SVC/SVR model for probability estimates |
| [options.weight] | `object` | | Set weight for each possible class |
| [options.quiet] | `boolean` | `true` | Print info during training if false |
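Since the default for gamma depends on the data (1/num_features), it can help to compute it explicitly when comparing against other gamma values. The following is a minimal sketch; `defaultGamma` is a hypothetical helper that only mirrors the documented default and does not call the library.

```javascript
// Hypothetical helper: reproduces the documented default gamma,
// 1 / num_features, from the first training sample.
function defaultGamma(samples) {
  const numFeatures = samples[0].length;
  return 1 / numFeatures;
}

const features = [[0, 0], [1, 1], [1, 0], [0, 1]];
const options = {
  gamma: defaultGamma(features), // two features per sample
  cost: 1
};
console.log(options.gamma); // 0.5

// The options object would then be passed to the constructor, e.g.
// const svm = new SVM(options);
```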

svm.train(samples, labels)

Trains the SVM model.

Kind: instance method of SVM
Throws:

  • if SVM instance was instantiated from SVM.load.

| Param | Type | Description |
| --- | --- | --- |
| samples | `Array.<Array.<number>>` | The training samples. The first level of the array holds the samples, the second level the individual features |
| labels | `Array.<number>` | The training labels. It should have the same size as the samples. If you are training a classification model, the labels should be distinct integers for each class. If you are training a regression model, each label should be the value of the predicted variable. |
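The shape requirements above can be checked before calling train. This is a minimal sketch: `checkTrainingData` is a hypothetical helper, and the requirement that every sample has the same number of features is an assumption on my part rather than something this README states.

```javascript
// Hypothetical pre-flight check for the arguments of train().
function checkTrainingData(samples, labels) {
  if (!Array.isArray(samples) || samples.length === 0) {
    throw new RangeError('samples must be a non-empty array');
  }
  if (!Array.isArray(labels) || labels.length !== samples.length) {
    throw new RangeError('labels must have the same size as samples');
  }
  // Assumption: all samples share the feature count of the first sample.
  const numFeatures = samples[0].length;
  samples.forEach((sample, i) => {
    if (sample.length !== numFeatures) {
      throw new RangeError(
        `sample ${i} has ${sample.length} features, expected ${numFeatures}`
      );
    }
  });
}

// The xor data from the basic usage example passes silently.
checkTrainingData([[0, 0], [1, 1], [1, 0], [0, 1]], [0, 0, 1, 1]);
```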

svm.crossValidation(samples, labels, kFold) ⇒ Array.<number>

Performs k-fold cross-validation (KF-CV). KF-CV separates the data set into kFold random, equally sized partitions and uses each in turn as a validation set, with all other partitions used as the training set. Observations left over when kFold does not divide the number of observations are left out of the cross-validation process. If kFold equals the number of observations, this is equivalent to leave-one-out cross-validation.

Kind: instance method of SVM
Returns: Array.<number> - The array of predicted labels produced by the cross validation. Has a size equal to the number of samples provided as input.
Throws:

  • if SVM instance was instantiated from SVM.load.

| Param | Type | Description |
| --- | --- | --- |
| samples | `Array.<Array.<number>>` | The training samples. |
| labels | `Array.<number>` | The training labels. |
| kFold | `number` | Number of datasets into which to split the training set. |
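Because crossValidation returns one predicted label per input sample, a classification accuracy can be derived by comparing that array with the true labels. The sketch below uses a hard-coded stand-in array in place of a real crossValidation result; `accuracy` is a hypothetical helper, not a library function.

```javascript
// Hypothetical helper: fraction of predicted labels that match the true ones.
function accuracy(trueLabels, predictedLabels) {
  if (trueLabels.length === 0 || trueLabels.length !== predictedLabels.length) {
    throw new RangeError('label arrays must be non-empty and of equal size');
  }
  let correct = 0;
  for (let i = 0; i < trueLabels.length; i++) {
    if (trueLabels[i] === predictedLabels[i]) correct++;
  }
  return correct / trueLabels.length;
}

const labels = [0, 0, 1, 1];
// Stand-in for: const predicted = svm.crossValidation(features, labels, kFold);
const predicted = [0, 1, 1, 1];
console.log(accuracy(labels, predicted)); // 0.75
```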

svm.free()

Frees the memory that the compiled libsvm library allocated to store the model. This model is reused every time the predict method is called. Since this memory is stored in the memory model of emscripten, it is allocated within an ArrayBuffer and WILL NOT BE GARBAGE COLLECTED; you have to free it explicitly, and not calling this method results in memory leaks. As of today in the browser, there is no way to hook the garbage collection of the SVM object to free it automatically.

Kind: instance method of SVM
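One way to make sure free is always called is a try/finally wrapper. This is a minimal sketch for synchronous use only; `withModel` is a hypothetical helper, and the only library method it relies on is free().

```javascript
// Hypothetical helper: run a synchronous callback with the model, then
// release the emscripten-side memory even if the callback throws.
function withModel(svm, use) {
  try {
    return use(svm);
  } finally {
    svm.free();
  }
}

// Usage with a trained model:
// const label = withModel(svm, model => model.predictOne([0.7, 0.8]));
```

If the callback returns a promise, free would run before the promise settles, so asynchronous work needs its own `await` inside a try/finally.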

svm.predictOne(sample) ⇒ number

Predict the label of one sample.

Kind: instance method of SVM
Returns: number - The predicted label.

| Param | Type | Description |
| --- | --- | --- |
| sample | `Array.<number>` | The sample to predict. |

svm.predict(samples) ⇒ Array.<number>

Predict the label of many samples.

Kind: instance method of SVM
Returns: Array.<number> - The predicted labels.

| Param | Type | Description |
| --- | --- | --- |
| samples | `Array.<Array.<number>>` | The samples to predict. |

svm.predictProbability(samples) ⇒ Array.<object>

Predict the label with probability estimate of many samples.

Kind: instance method of SVM
Returns: Array.<object> - An array of objects containing the prediction label and the probability estimates for each label

| Param | Type | Description |
| --- | --- | --- |
| samples | `Array.<Array.<number>>` | The samples to predict. |

svm.predictOneProbability(sample) ⇒ object

Predict the label of one sample with probability estimates.

Kind: instance method of SVM
Returns: object - An object containing the prediction label and the probability estimates for each label

| Param | Type | Description |
| --- | --- | --- |
| sample | `Array.<number>` | The sample to predict. |

svm.predictOneInterval(sample, confidence) ⇒ object

Predict a regression value with a confidence interval.

Kind: instance method of SVM
Returns: object - An object containing the prediction value and the lower and upper bounds of the confidence interval

| Param | Type | Description |
| --- | --- | --- |
| sample | `Array.<number>` | The sample to predict. |
| confidence | `number` | A value between 0 and 1. For example, a value of 0.95 will give you the 95% confidence interval of the predicted value. |

svm.predictInterval(samples, confidence) ⇒ Array.<object>

Predict regression values with confidence intervals.

Kind: instance method of SVM
Returns: Array.<object> - An array of objects, each containing the prediction value and the lower and upper bounds of the confidence interval

| Param | Type | Description |
| --- | --- | --- |
| samples | `Array.<Array.<number>>` | An array of samples. |
| confidence | `number` | A value between 0 and 1. For example, a value of 0.95 will give you the 95% confidence interval of the predicted value. |

svm.getLabels() ⇒ Array.<number>

Get the array of labels from the model. Useful when creating an SVM instance with SVM.load.

Kind: instance method of SVM
Returns: Array.<number> - The list of labels.

svm.getSVIndices() ⇒ Array.<number>

Get the indices of the support vectors from the training set passed to the train method.

Kind: instance method of SVM
Returns: Array.<number> - The list of indices from the training samples.

svm.serializeModel() ⇒ string

Uses libsvm's serialization method of the model.

Kind: instance method of SVM
Returns: string - The serialization string.

SVM.SVM_TYPES : Object

SVM classification and regression types

Kind: static property of SVM
Properties

| Name | Description |
| --- | --- |
| C_SVC | The C support vector classifier type |
| NU_SVC | The nu support vector classifier type |
| ONE_CLASS | The one-class support vector classifier type |
| EPSILON_SVR | The epsilon support vector regression type |
| NU_SVR | The nu support vector regression type |

SVM.KERNEL_TYPES : Object

SVM kernel types

Kind: static property of SVM
Properties

| Name | Description |
| --- | --- |
| LINEAR | Linear kernel |
| POLYNOMIAL | Polynomial kernel |
| RBF | Radial basis function (Gaussian) kernel |
| SIGMOID | Sigmoid kernel |

SVM.load(serializedModel) ⇒ SVM

Creates an SVM instance from a serialized model.

Kind: static method of SVM
Returns: SVM - SVM instance that contains the model.

| Param | Type | Description |
| --- | --- | --- |
| serializedModel | `string` | The serialized model. |
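Together, serializeModel and SVM.load allow a trained model to be stored and restored later. The following Node.js sketch writes the serialization string to a file and reads it back; `saveModel`, `loadModel` and the file name are hypothetical, and the only library calls assumed are the two documented above.

```javascript
const fs = require('fs');

// Hypothetical helper: persist the serialization string of a trained model.
function saveModel(svm, filePath) {
  fs.writeFileSync(filePath, svm.serializeModel(), 'utf8');
}

// Hypothetical helper: rebuild an SVM instance from a saved string.
// SVM is the class obtained from one of the entry points.
function loadModel(SVM, filePath) {
  return SVM.load(fs.readFileSync(filePath, 'utf8'));
}

// Usage with a trained model:
// saveModel(svm, 'model.txt');
// const restored = loadModel(SVM, 'model.txt');
// restored.predictOne([0.7, 0.8]);
```

As noted in the method descriptions above, an instance created with SVM.load throws on train and crossValidation, so a restored model is only usable for prediction.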

LICENSE

BSD-3-Clause