    Repositories list

    • A Scala / Java / Python library for cleansing, transforming and preparing large datasets for ML operations on Apache Spark.
      Scala
      Apache License 2.0
      7801 Updated Oct 13, 2020
    • protectr

      Public
      A Scala / Java / Python library for anonymization, encryption and redaction operations for large datasets on Apache Spark.
      Scala
      Apache License 2.0
      0200 Updated Sep 29, 2018
    • forecast

      Public
      forecast package for R
      R
      341000 Updated Aug 9, 2017
    • superset

      Public
      Apache Superset (incubating) is a modern, enterprise-ready business intelligence web application
      Python
      Apache License 2.0
      15k000 Updated Jun 10, 2017
    • pyts

      Public
      A statistics library for Python
      Python
      MIT License
      0100 Updated Mar 7, 2017
    • A simple setup for Spark using Maven.
      Scala
      0000 Updated Dec 6, 2016
    • FiloDB

      Public
      Distributed. Columnar. Versioned. Streaming. SQL.
      Scala
      Apache License 2.0
      230000 Updated Nov 15, 2016
    • gobblin

      Public
      Universal data ingestion framework for Hadoop.
      Java
      Apache License 2.0
      750000 Updated Jul 1, 2016
    • Apache worked LICENSE and NOTICE example
      HTML
      Apache License 2.0
      8000 Updated Jun 23, 2016
    • HTML
      1000 Updated Jun 18, 2016
    • A library for time series analysis on Apache Spark
      Scala
      Apache License 2.0
      421000 Updated Jun 3, 2016