A simple Self-Organizing Map (SOM) for Scala and Apache Spark.
Make sure you have an implicit SparkSession and your data RDD ready.
implicit val sparkSession: SparkSession = ???
val data: RDD[Vector] = ???
Compose your own SOM instance, with either predefined or custom implementations of decay functions, neighborhood kernels, or error metrics...
val SOM = new SelfOrganizingMap with CustomDecay with GaussianNeighborboodKernel with QuantizationErrorMetrics {
  override val shape: Shape = (24, 24)
  override val learningRate: Double = 0.3
  override val sigma: Double = 0.5
}
... or just use an off-the-shelf SOM for your convenience.
val SOM = GaussianSelfOrganizingMap(24, 24, sigma = 0.5, learningRate = 0.3)
Initialization and training:
val (som, params) = SOM.initialize(data).train(data, 20)
Classification of datapoints:
val dataPoint: DenseVector = ???
val (bmu, distance) = som.classify(dataPoint)
You can find more examples using the SOM library in the tests and complete applications in the examples directory.
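For readers new to SOMs, the following is a minimal sketch of what classification and a Gaussian neighborhood kernel do conceptually. It is plain Scala without Spark and is not this library's implementation: the object `SomSketch`, the `Vec` alias, the `Map`-based codebook, and all function names are assumptions made for illustration only.

```scala
// Illustrative sketch only: plain Scala, no Spark, not the library's code.
// All names here (SomSketch, Vec, bestMatchingUnit, ...) are hypothetical.
object SomSketch {
  type Vec = Array[Double]

  // Euclidean distance between two vectors of equal length.
  def distance(a: Vec, b: Vec): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    math.sqrt(a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum)
  }

  // Best matching unit (BMU): the grid position whose codebook vector is
  // closest to the data point, together with the distance to it.
  def bestMatchingUnit(codebook: Map[(Int, Int), Vec], point: Vec): ((Int, Int), Double) =
    codebook
      .map { case (pos, weights) => (pos, distance(weights, point)) }
      .minBy(_._2)

  // Gaussian neighborhood weight of a neuron relative to the BMU,
  // based on their squared distance on the grid.
  def gaussianNeighborhood(bmu: (Int, Int), neuron: (Int, Int), sigma: Double): Double = {
    val dx = (bmu._1 - neuron._1).toDouble
    val dy = (bmu._2 - neuron._2).toDouble
    math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
  }
}
```

In this sketch, the pair returned by `bestMatchingUnit` plays the same role as the `(bmu, distance)` pair above, and `gaussianNeighborhood` returns 1.0 at the BMU itself and decays toward 0 for neurons farther away on the grid, at a rate controlled by `sigma`.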
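To publish the library locally, run publishLocal from the sbt shell for the default project and then for the macros project:
➜ som git:(master) ✗ sbt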
...
> publishLocal
...
> project macros
...
> publishLocal
...
[success] Total time: 10 s, completed Dec 18, 2016 4:02:19 PM
>
Some parts of the implementation are inspired by the spark-som project. Credits to @jxieeducation / PragmaticLab.