Ondřej Moravčík edited this page Mar 27, 2015 · 7 revisions

MLlib is Spark’s scalable machine learning library consisting of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, dimensionality reduction, as well as underlying optimization primitives.

Currently implemented methods can be found at mllib.rb.

All models are trained in Scala Spark, so first make sure that MLlib works there. Detailed documentation can be found at http://spark.apache.org/docs/latest/mllib-guide.html.

Importing

Normally all methods are accessible via

Spark::Mllib::LinearRegressionWithSGD.train(...)

but if you just want to write LinearRegressionWithSGD without the namespace, you can import all MLlib classes into a specific object (by default, Object).

Spark::Mllib.import # import classes to Object
Spark::Mllib.import(CLASS) # import classes to specific CLASS

LinearRegressionWithSGD.train(...)
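
To make the idea of importing classes into a target object more concrete, here is a minimal, self-contained sketch of how such an import could work in plain Ruby. It does not use the gem: `DemoMllib`, `Sandbox`, and the body of `import` are hypothetical stand-ins, and the real `Spark::Mllib.import` may be implemented differently.

```ruby
# Hypothetical stand-in for Spark::Mllib; this is NOT the gem's actual code.
module DemoMllib
  VERSION = "0.0".freeze # non-class constant, skipped by import

  class LinearRegressionWithSGD
    # Placeholder for a real training call; only reports the argument count.
    def self.train(*args)
      "trained with #{args.size} argument(s)"
    end
  end

  class KMeans; end

  # Copy every class defined in this module into `target` as a constant,
  # so it can be referenced there without the DemoMllib:: prefix.
  def self.import(target = Object)
    constants.each do |name|
      klass = const_get(name)
      next unless klass.is_a?(Class)             # import classes only
      next if target.const_defined?(name, false) # do not overwrite existing constants
      target.const_set(name, klass)
    end
    target
  end
end

# Import into a dedicated module instead of Object to avoid polluting the top level.
module Sandbox; end
DemoMllib.import(Sandbox)

puts Sandbox::LinearRegressionWithSGD.train(1, 2)
```

Importing into a dedicated module rather than Object keeps the short names available in one place without adding constants to the global namespace.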