
jLibSVM

Efficient training of Support Vector Machines in Java

  • Heavily refactored Java port of the venerable LIBSVM (version 2.88).
  • Provides idiomatic Java class structure and APIs (unlike the Java version provided by LIBSVM, which is transliterated C code).
  • Easy to add new kernels, in addition to the five standard ones provided by LIBSVM.
  • On the mathematical side, jlibsvm performs exactly the same computations as LIBSVM, including shrinking and all the fancy stuff described in the LIBSVM implementation docs.
  • Optimized kernel implementations run faster, particularly when input vectors are sparse. For instance, on the mushrooms dataset, jlibsvm trained ~25% faster than LIBSVM (Java version) with an RBF kernel and ~40% faster with a linear kernel. (The C version of LIBSVM is still faster, though.)
  • Multithreaded training to take advantage of modern multi-core machines (using Conja).
  • Integrated scaling and normalization so you don't have to explicitly preprocess your data.
  • Integrated grid search for optimal kernel parameters.
  • Drop-in replacement if you use the command-line tools (e.g. svm-train, etc.), but not if you use LIBSVM programmatically.
  • Uses Java generics throughout, including for classification labels, so you can make the "label" of a class be whatever Java type you like. In an email-filtering application, for example, you could use objects of type Mailbox as the labels. That would allow you to write something like mySvmSolutionModel.predict(incomingEmail).addMessage(incomingEmail): the predict() method returns the classification label, which in this case is a Mailbox object with an addMessage() method (see the sketch after this list).
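
To make that concrete, here is a minimal sketch of the generic-label idea. The Mailbox and Email classes and the mySvmSolutionModel variable are hypothetical; only predict() and the fact that it returns the label type follow from the description above.

import java.util.ArrayList;
import java.util.List;

// Hypothetical label class: any Java type can serve as a classification label.
class Mailbox {
	private final List<Email> messages = new ArrayList<Email>();

	void addMessage(Email e) {
		messages.add(e);
	}
}

// mySvmSolutionModel is assumed to be a trained model whose label type is Mailbox
// (see the Documentation section below for how models are trained).
Mailbox target = mySvmSolutionModel.predict(incomingEmail);  // predict() returns a Mailbox
target.addMessage(incomingEmail);                            // so the email can be filed directly

// ...or, as the one-liner in the bullet above:
mySvmSolutionModel.predict(incomingEmail).addMessage(incomingEmail);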

Status

This is beta code. While LIBSVM is stable, it's possible that I broke something in the process of refactoring it. I've done ad-hoc testing primarily with the C_SVC machine and an RBF kernel, and got results that were identical to LIBSVM as far as I could tell. There are not (yet?) any unit tests. I'm running some automated verifications that jlibsvm behaves identically to LIBSVM for a number of input datasets and parameter choices; results will be available here soon. Please let me know if you find a situation in which the two packages give different results.

Documentation

Sorry, I haven't really had a chance to write any docs. Have a look at the sources for the command-line programs in the legacyexec package to see how jlibsvm gets called. Very briefly, you'll need to (a rough sketch follows the list):

  1. Instantiate the KernelFunction that you want.
  2. Set up some parameters in a new SvmParameter object.
  3. Instantiate a concrete subclass of SvmProblem (binary, multiclass, or regression) and populate it with training data.
  4. Instantiate a concrete subclass of SVM, choosing a type appropriate for your problem.
  5. Call SVM.train(problem) to yield a SolutionModel, which can be used to make predictions.
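
Below is a hedged sketch of those five steps in code. The top-level class names (KernelFunction, SvmParameter, SvmProblem, SVM, SolutionModel) come from the list above, but the concrete subclass names, constructors, and method signatures are assumptions; the legacyexec sources (e.g. the svm_train port) show the exact calls for version 0.911.

// Sketch only: concrete class names, constructors, and signatures below are
// assumptions based on the steps above, not the verified jlibsvm API.

// 1. Instantiate the kernel you want (an RBF kernel with an assumed gamma here).
KernelFunction<SparseVector> kernel = new GaussianRBFKernel(0.5f);

// 2. Set up some parameters in a new SvmParameter object (C, cache size, epsilon, ...).
SvmParameter param = new SvmParameter();
param.C = 1.0f;

// 3. Instantiate a concrete SvmProblem subclass (a binary problem with String labels
//    here) and populate it with training data; vector construction is omitted.
BinaryClassificationProblem<String, SparseVector> problem =
		new MutableBinaryClassificationProblemImpl<String, SparseVector>(String.class, 1000);
problem.addExample(trainingVector, "spam");

// 4. Instantiate a concrete SVM subclass appropriate for the problem (C-SVC here);
//    how the kernel and parameters are wired in is an assumption.
C_SVC<String, SparseVector> svm = new C_SVC<String, SparseVector>(kernel, param);

// 5. Call SVM.train(problem) to yield a SolutionModel and use it for predictions.
SolutionModel<String, SparseVector> model = svm.train(problem);
String predictedLabel = model.predict(testVector);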

Download

Maven is by far the easiest way to make use of jlibsvm. Just add these to your pom.xml:

<repositories>
	<repository>
		<id>dev.davidsoergel.com releases</id>
		<url>http://dev.davidsoergel.com/nexus/content/repositories/releases</url>
		<snapshots>
			<enabled>false</enabled>
		</snapshots>
	</repository>
	<repository>
		<id>dev.davidsoergel.com snapshots</id>
		<url>http://dev.davidsoergel.com/nexus/content/repositories/snapshots</url>
		<releases>
			<enabled>false</enabled>
		</releases>
	</repository>
</repositories>

<dependencies>
	<dependency>
		<groupId>edu.berkeley.compbio</groupId>
		<artifactId>jlibsvm</artifactId>
		<version>0.911</version>
	</dependency>
</dependencies>

If you really want just the jar, you can get the latest release from the Maven repo; or get the latest stable build from the build server.
