
stergiosbamp/spark-dominance-based-queries


Dominance-based queries on Apache Spark.

Skyline queries are a popular and powerful paradigm for extracting interesting objects from a multi-dimensional dataset. Given a set D of d-dimensional objects (or points), the skyline of D is the set of Pareto-optimal points in D, that is, the points that are not dominated by any other point in D.
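To make the definition concrete, here is a minimal plain-Python sketch of the dominance test and a brute-force skyline. It is illustrative only and is not taken from this repository. It assumes that smaller values are better in every dimension; the names `dominates` and `naive_skyline` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    """Return True if p dominates q.

    Assumption: smaller is better in every dimension. p dominates q when
    p is no worse than q everywhere and strictly better somewhere.
    """
    no_worse = all(a <= b for a, b in zip(p, q))
    strictly_better = any(a < b for a, b in zip(p, q))
    return no_worse and strictly_better


def naive_skyline(points: List[Point]) -> List[Point]:
    """Brute-force skyline: keep every point that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4)]
    print(naive_skyline(data))
```

In this sample, (3, 3) and (4, 4) are both dominated by (2, 2), so only the other three points remain. The brute-force version compares every pair of points, which is why sort-based algorithms are preferred on large datasets.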

Algorithms

  1. Skyline query based on the Sort Filter Skyline (SFS) algorithm.

  2. Top-k dominating query based on the Skyline-based Top-k Dominating (STD) algorithm.

  3. Top-k dominating query on the skyline.
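The three queries above can be sketched on a single machine as follows. This is a simplified, non-distributed illustration in plain Python, not the Spark implementation in this repository. It assumes smaller values are better, uses the coordinate sum as the monotone sort key for SFS, and reads the third query as "rank the skyline points by how many points they dominate". All function names are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(p: Point, q: Point) -> bool:
    # Assumption: smaller is better in every dimension.
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))


def sfs_skyline(points: List[Point]) -> List[Point]:
    """Sort Filter Skyline sketch.

    Sorting by a monotone score (here the coordinate sum) guarantees that a
    point can only be dominated by points that come before it, so an accepted
    skyline point never has to be removed later.
    """
    skyline: List[Point] = []
    for p in sorted(points, key=sum):
        if not any(dominates(s, p) for s in skyline):
            skyline.append(p)
    return skyline


def dominance_score(p: Point, points: List[Point]) -> int:
    """Number of points in the dataset that p dominates."""
    return sum(1 for q in points if dominates(p, q))


def std_top_k(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Skyline-based top-k dominating sketch.

    The best remaining point is always on the skyline of the remaining
    points, so only skyline candidates are scored in each round.
    """
    remaining = list(points)
    result: List[Tuple[Point, int]] = []
    while remaining and len(result) < k:
        candidates = sfs_skyline(remaining)
        best = max(candidates, key=lambda p: dominance_score(p, points))
        result.append((best, dominance_score(best, points)))
        remaining.remove(best)
    return result


def top_k_on_skyline(points: List[Point], k: int) -> List[Tuple[Point, int]]:
    """Rank only the skyline points by dominance score (assumed reading)."""
    scored = [(p, dominance_score(p, points)) for p in sfs_skyline(points)]
    return sorted(scored, key=lambda t: t[1], reverse=True)[:k]


if __name__ == "__main__":
    data = [(1, 4), (2, 2), (4, 1), (3, 3), (4, 4), (5, 5)]
    print(sfs_skyline(data))
    print(std_top_k(data, 2))
    print(top_k_on_skyline(data, 2))
```

The two top-k variants differ in their candidate set: `std_top_k` can return a non-skyline point once a better point has been reported, while `top_k_on_skyline` never leaves the skyline.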

Datasets

The algorithms can be run on synthetic datasets drawn from 4 distributions, with dimensionality ranging from 2 to 10.

  1. Correlated
  2. Uniform
  3. Normal
  4. Anti-correlated
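For readers who want to experiment with similar inputs, the sketch below generates small samples of the four distributions using only the Python standard library. It is an assumption-laden illustration and not the generator used to produce this repository's datasets. The function `generate`, its parameters, the noise levels, and the [0, 1] value range are all hypothetical choices.

```python
import random
from typing import List, Tuple

Point = Tuple[float, ...]


def _clamp(x: float) -> float:
    """Keep a coordinate inside the assumed [0, 1] range."""
    return min(1.0, max(0.0, x))


def generate(kind: str, n: int, d: int, seed: int = 0) -> List[Point]:
    """Generate n points with d dimensions (hypothetical helper).

    kind is one of "correlated", "uniform", "normal", "anticorrelated".
    """
    rng = random.Random(seed)
    points: List[Point] = []
    for _ in range(n):
        if kind == "uniform":
            coords = [rng.random() for _ in range(d)]
        elif kind == "normal":
            coords = [_clamp(rng.gauss(0.5, 0.15)) for _ in range(d)]
        elif kind == "correlated":
            # All coordinates stay close to one shared base value.
            base = rng.random()
            coords = [_clamp(base + rng.gauss(0.0, 0.05)) for _ in range(d)]
        elif kind == "anticorrelated":
            # Coordinates are rescaled so their sum stays near d / 2:
            # a point that is good in one dimension is bad in another.
            raw = [rng.random() for _ in range(d)]
            target = d * 0.5 + rng.gauss(0.0, 0.05)
            scale = target / max(sum(raw), 1e-12)
            coords = [_clamp(r * scale) for r in raw]
        else:
            raise ValueError(f"unknown distribution: {kind}")
        points.append(tuple(coords))
    return points


if __name__ == "__main__":
    for kind in ("correlated", "uniform", "normal", "anticorrelated"):
        print(kind, generate(kind, 3, 2))
```

Anti-correlated data typically yields the largest skylines, because few points are good in all dimensions at once, while correlated data yields the smallest.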
