robbenti/succinct

 
 

Succinct

Succinct is a data store that enables queries directly on a compressed representation of data. This repository maintains the Java implementations of Succinct's core algorithms, and applications that exploit them, such as a Spark binding for Succinct.

The master branch is at version 0.1.2.

Building Succinct

Succinct is built using Apache Maven. To build Succinct and its component modules, run:

mvn clean package

Alternatively, you can use sbt for building and development:

sbt/sbt gen-idea # the project can now be imported into IntelliJ IDEA
sbt/sbt assembly # builds uber jars
sbt/sbt "~assembly" # incremental build
sbt/sbt "testOnly edu.berkeley.cs.succinct.sql.SuccinctSQLSuite"
sbt/sbt "project spark" "runMain edu.berkeley.cs.succinct.examples.WikiSearch <dataPath>"

Succinct-Core

The Succinct-Core module contains the Java implementation of Succinct's core algorithms. See a more detailed description of the core module here.
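
As a rough illustration of what querying an index instead of scanning raw text looks like, the sketch below builds a plain, uncompressed suffix array and answers count, search, and extract queries against it. It is not Succinct's code or API: the class name SuffixArraySketch and its methods are hypothetical, and the sketch makes no attempt at compression, which is the point of the real module.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only: a plain (uncompressed) suffix array.
// None of these names come from Succinct-Core.
final class SuffixArraySketch {
    private final String text;
    // Start offsets of all suffixes of `text`, sorted lexicographically.
    private final Integer[] suffixes;

    SuffixArraySketch(String text) {
        this.text = text;
        suffixes = new Integer[text.length()];
        for (int i = 0; i < suffixes.length; i++) {
            suffixes[i] = i;
        }
        // Naive construction by comparing whole suffixes; fine for a small sketch.
        Arrays.sort(suffixes, (a, b) -> text.substring(a).compareTo(text.substring(b)));
    }

    // Compares the suffix starting at `start`, truncated to the pattern length, with the pattern.
    private int compare(int start, String pattern) {
        int end = Math.min(text.length(), start + pattern.length());
        return text.substring(start, end).compareTo(pattern);
    }

    // Binary search: first suffix-array index whose truncated suffix is
    // >= pattern (strict == false) or > pattern (strict == true).
    private int bound(String pattern, boolean strict) {
        int lo = 0;
        int hi = suffixes.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            int c = compare(suffixes[mid], pattern);
            if (c < 0 || (strict && c == 0)) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    // Number of occurrences of `pattern` in the text.
    int count(String pattern) {
        return bound(pattern, true) - bound(pattern, false);
    }

    // Offsets of all occurrences of `pattern`, in ascending order.
    List<Integer> search(String pattern) {
        List<Integer> offsets = new ArrayList<>();
        int from = bound(pattern, false);
        int to = bound(pattern, true);
        for (int i = from; i < to; i++) {
            offsets.add(suffixes[i]);
        }
        Collections.sort(offsets);
        return offsets;
    }

    // Up to `length` characters of the text starting at `offset`.
    String extract(int offset, int length) {
        return text.substring(offset, Math.min(text.length(), offset + length));
    }
}
```

For the input "banana", count("ana") returns 2, search("ana") returns the offsets 1 and 3, and extract(2, 3) returns "nan".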

Dependency Information

Apache Maven

To build your application with Succinct-Core, you can link against this library using Maven by adding the following dependency information to your pom.xml file:

<dependency>
    <groupId>edu.berkeley.cs.succinct</groupId>
    <artifactId>succinct-core</artifactId>
    <version>0.1.2</version>
</dependency>

Succinct-Spark

The Succinct-Spark module contains Spark and Spark SQL interfaces for Succinct and exposes a compressed, queryable RDD, SuccinctRDD. Succinct is also exposed as a DataSource in Spark SQL as an experimental feature. More details on the Succinct-Spark module can be found here.

Dependency Information

Apache Maven

To build your application with Succinct-Spark, you can link against this library using Maven by adding the following dependency information to your pom.xml file:

<dependency>
    <groupId>edu.berkeley.cs.succinct</groupId>
    <artifactId>succinct-spark</artifactId>
    <version>0.1.2</version>
</dependency>
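
After adding the dependency, a quick way to confirm that it resolved is to check whether a Succinct class can be loaded from your application's classpath. The sketch below is a hypothetical helper, not part of Succinct. By default it probes for edu.berkeley.cs.succinct.examples.WikiSearch, the class named in the sbt runMain command above, on the assumption that this class is packaged in the artifact you depend on; pass a different class name as the first argument if it is not.

```java
// Hypothetical helper for checking that a dependency is on the classpath.
// Not part of Succinct; the default probe class is an assumption.
final class SuccinctClasspathCheck {

    // Returns true if the named class can be located by this class's
    // class loader. The class is looked up without being initialized.
    static boolean isAvailable(String className) {
        try {
            Class.forName(className, false, SuccinctClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String probe = args.length > 0
                ? args[0]
                : "edu.berkeley.cs.succinct.examples.WikiSearch";
        String status = isAvailable(probe) ? "found" : "NOT found";
        System.out.println(probe + " " + status + " on the classpath");
    }
}
```

Run it with the same classpath your application uses, so that the result reflects what your build actually resolved.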

SBT and Spark-Packages

To use Succinct-Spark in an SBT project, add the following to build.sbt (see the Spark Packages listing for spark-submit and Maven instructions):

resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"
libraryDependencies += "edu.berkeley.cs.succinct" % "succinct-spark" % "0.1.2"

The succinct-spark jar file can also be added to a Spark shell using the --jars command line option. For example, to include it when starting the Spark shell:

$ bin/spark-shell --jars succinct-spark_2.10-0.1.2.jar
