Multidimensional Data Structures

General Info

This computer science project focuses on optimizing range and similarity queries on text datasets using quadtrees, range trees, and R-trees. It compares the performance of these tree-based data structures in terms of query processing time, space complexity, and accuracy.
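To make the idea of a range query on a tree concrete, here is a minimal sketch of a point quadtree in two dimensions. It is not the code from this repository: the class name `QuadTree`, its fields, and the `capacity` and `max_depth` parameters are assumptions chosen for illustration, and the project's own structures may work on different (for example, higher-dimensional) representations of the text data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class QuadTree:
    """Illustrative point quadtree over the rectangle [x0, x1] x [y0, y1]."""
    x0: float
    y0: float
    x1: float
    y1: float
    capacity: int = 4      # points a leaf holds before it splits (assumed default)
    depth: int = 0
    max_depth: int = 16    # guard against endless splitting on duplicate points
    points: List[Point] = field(default_factory=list)
    children: Optional[List["QuadTree"]] = None

    def contains(self, p: Point) -> bool:
        return self.x0 <= p[0] <= self.x1 and self.y0 <= p[1] <= self.y1

    def insert(self, p: Point) -> bool:
        if not self.contains(p):
            return False
        if self.children is None:
            if len(self.points) < self.capacity or self.depth >= self.max_depth:
                self.points.append(p)
                return True
            self._split()
        # A point on a shared edge goes to the first child that contains it.
        return any(child.insert(p) for child in self.children)

    def _split(self) -> None:
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        boxes = [(self.x0, self.y0, mx, my), (mx, self.y0, self.x1, my),
                 (self.x0, my, mx, self.y1), (mx, my, self.x1, self.y1)]
        self.children = [
            QuadTree(*b, self.capacity, self.depth + 1, self.max_depth)
            for b in boxes
        ]
        # Push the points held by this node down into the new children.
        old, self.points = self.points, []
        for q in old:
            self.insert(q)

    def range_query(self, qx0: float, qy0: float,
                    qx1: float, qy1: float) -> List[Point]:
        # Prune subtrees whose rectangle does not overlap the query window.
        if qx1 < self.x0 or qx0 > self.x1 or qy1 < self.y0 or qy0 > self.y1:
            return []
        hits = [p for p in self.points
                if qx0 <= p[0] <= qx1 and qy0 <= p[1] <= qy1]
        for child in self.children or []:
            hits.extend(child.range_query(qx0, qy0, qx1, qy1))
        return hits


tree = QuadTree(0.0, 0.0, 8.0, 8.0, capacity=2)
for p in [(1, 1), (2, 5), (3, 3), (6, 2), (7, 7), (4, 4)]:
    tree.insert(p)
result = sorted(tree.range_query(2, 2, 6, 6))
print(result)
```

The pruning test at the top of `range_query` is what separates a tree-based query from a linear scan: whole subtrees are skipped when their rectangle cannot intersect the query window.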

Technologies && Library Versions

Python

gensim              4.2.0
numpy               1.23.5
pandas              1.3.5
scikit-learn        1.2.1
Scrapy              2.7.1
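
As a quick way to confirm that an environment matches the versions listed above, the following sketch compares the installed packages against the pinned versions using only the standard library. The helper name `check_versions` and the `PINNED` dictionary are illustrative and not part of the repository.

```python
from importlib import metadata

# Versions copied from the list above.
PINNED = {
    "gensim": "4.2.0",
    "numpy": "1.23.5",
    "pandas": "1.3.5",
    "scikit-learn": "1.2.1",
    "Scrapy": "2.7.1",
}


def check_versions(pinned):
    """Return {package: (wanted, installed_or_None, matches)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        report[name] = (wanted, installed, installed == wanted)
    return report


for name, (wanted, installed, ok) in check_versions(PINNED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name:<14} wanted {wanted:<8} installed {installed}  {status}")
```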

Installation

Clone the repo

git clone https://github.com/d4g10ur0s/Multidimesnional_Data_Structures_2023.git

Create a folder called data in the main directory.

Run Scripts

First, cd to the scripts directory, then run the following commands:

./run_scrapy.sh

The first command creates a folder called scientists inside the data folder, containing multiple JSON files scraped from a Wikipedia page.
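
To inspect what the scraper produced, a sketch along these lines may help. It assumes that every file in the folder has a .json extension and holds one valid JSON document; the function name `load_scraped` is hypothetical and the actual layout of the scraped files may differ.

```python
import json
from pathlib import Path


def load_scraped(folder):
    """Read every *.json file in `folder` and return (filename, data) pairs."""
    records = []
    for path in sorted(Path(folder).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            records.append((path.name, json.load(fh)))
    return records
```

From the scripts directory, `load_scraped("../data/scientists")` would then return one entry per scraped file, assuming the data folder sits in the main directory as described above.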

Run preprocess.py

If you are running preprocess.py for the first time, uncomment the following lines once, run the script, and then comment them out again.

    logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
    corpus = api.load('text8')
    print(inspect.getsource(corpus.__class__))
    print(inspect.getfile(corpus.__class__))
    model = w2v(corpus)
    model.save('.\\readyvocab.model')

Contact

You can always contact us through email :

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

License

MIT
