Advanced Data Management project - Università degli studi di Genova

frmusso/HiddenArtsADM_Project_2018-2019

HiddenArtsADM_Project_2018-2019

Contributors

Hidden Arts has been developed by Francesco Musso (@frmusso) and Davide Ponassi (@ponassi).

The project

Domain

The domain is street art. The project is based on a mobile application that provides detailed information about street art around the world. Artworks are shared by users, who take a photo and fill in an information form that is sent to the database. The application will support Facebook and Google login and will allow only one session per device.

System specifications

Our dataset is read-intensive. We do not consider it write-intensive, since street art will not be shared very frequently. In addition, shared objects go through a moderation queue before becoming visible to users, so reads can be eventually consistent.

The dataset will also grow with time and with the number of active users, so the system should be built on technologies that provide partitioning and replication.
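To illustrate what partitioning and replication mean in practice, here is a minimal Python sketch of a token ring: each partition key is hashed to a token, the token selects an owning node, and the following nodes on the ring hold the extra replicas. This is a simplified model and not Cassandra's actual implementation. The `Ring` class, the node names, and the key `art-42` are hypothetical, and MD5 stands in for Cassandra's default Murmur3 partitioner only because it ships with the standard library.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Hash a key to an integer token (MD5 is a stand-in, see the note above)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class Ring:
    """Toy token ring: one token per node, replicas placed on the next nodes."""

    def __init__(self, nodes, replication_factor):
        if not 1 <= replication_factor <= len(nodes):
            raise ValueError("replication_factor must be between 1 and len(nodes)")
        self.replication_factor = replication_factor
        # Sort the nodes by their token so the ring can be searched with bisect.
        self._ring = sorted((token(node), node) for node in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, partition_key: str):
        """Return the nodes that would store the given partition key."""
        # The first node whose token is >= the key's token owns the key;
        # the modulo wraps around to the start of the ring.
        start = bisect.bisect_left(self._tokens, token(partition_key)) % len(self._ring)
        return [
            self._ring[(start + i) % len(self._ring)][1]
            for i in range(self.replication_factor)
        ]


if __name__ == "__main__":
    ring = Ring(["node-a", "node-b", "node-c"], replication_factor=2)
    print(ring.replicas("art-42"))
```

With a replication factor of 2, every artwork is stored on two distinct nodes, so a read can be served by either one. This fits a read-intensive workload in which eventually consistent reads are acceptable.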

We rely on Cassandra and use CQL for the workload.

Main dataset (published.csv)

The chosen dataset is a custom dataset composed of a manually generated part (225 real data entries) and a pseudo-randomly generated part (24775 entries). The randomly generated part was produced with the following procedure:

  • It generates a random latitude/longitude point on Earth (excluding Antarctica).
  • It determines whether the point is on land or on water.
  • It keeps generating random points until one is on land.
  • Once it has a land point, it generates a random art title and picks a random real author.
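The procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not the generator actually used for the dataset: the `is_on_land` predicate is a hypothetical stand-in for the land/water check, Antarctica is excluded by an assumed latitude cutoff of -60 degrees, and the title word lists and author names are placeholders.

```python
import random


def random_point(rng):
    """Return a random (latitude, longitude) pair.

    Assumption: Antarctica is excluded by keeping latitude above -60 degrees.
    """
    latitude = rng.uniform(-60.0, 90.0)
    longitude = rng.uniform(-180.0, 180.0)
    return latitude, longitude


def random_land_point(rng, is_on_land, max_tries=10_000):
    """Keep drawing random points until the predicate reports land."""
    for _ in range(max_tries):
        latitude, longitude = random_point(rng)
        if is_on_land(latitude, longitude):
            return latitude, longitude
    raise RuntimeError("no land point found within max_tries")


def random_entry(rng, is_on_land, adjectives, nouns, authors):
    """Build one pseudo-random entry: land point, random title, real author."""
    latitude, longitude = random_land_point(rng, is_on_land)
    # Hypothetical title scheme: one random adjective plus one random noun.
    title = f"{rng.choice(adjectives)} {rng.choice(nouns)}"
    return {
        "latitude": latitude,
        "longitude": longitude,
        "title": title,
        "author": rng.choice(authors),
    }


if __name__ == "__main__":
    rng = random.Random(0)

    def toy_is_on_land(latitude, longitude):
        # Toy predicate for demonstration only: treats one hemisphere as land.
        return longitude > 0

    entry = random_entry(
        rng,
        toy_is_on_land,
        adjectives=["Hidden", "Silent"],
        nouns=["Wall", "Mural"],
        authors=["Author A", "Author B"],
    )
    print(entry)
```

Passing the land check in as a function keeps the sampling loop independent of how land and water are actually distinguished, so a real check can replace the toy predicate without touching the rest.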

The users table is also initially randomly generated. This allows us to generate pseudo-real data with a total size of ~2.3 MB.

Other csv datasets
How we evaluate land and water points

You can check the details on the repository IsOnWater_CSharp.

Project file structure

  • datasets: contains the CSV data used to populate the database
  • src: contains the CQL schema and data population
  • docs: contains documentation (soon to be added)
  • workload: contains the CQL workload (soon to be added)
