GitHub - AdrianKlessa/SteamScout: Ensemble recommendation (recommender) system for finding similar games on Steam

A similarity search and recommendation system for finding games on Steam

Find similar games to the ones you like - even if the genres you like are not very popular

The system is a weighted ensemble recommender composed of:

A recommender rating the games based on the similarity of their vectorized descriptions ("About this game" section)
A recommender rating the similarity of games based on their tags
A simple check for the ratio of positive user reviews to negative

Since this ensemble does not take into account the number of reviews (only the fraction of positive ones), it tends to catch games with very few reviews if they appear to be a good match to the one the user searched for.

Both the tags and descriptions are compared based on the cosine similarity of their vectors.

The tags are binary vectors (consisting of 1s at the indices matching a particular tag the game has and 0s for tags it lacks). The description vector is generated by using doc2vec after training on the entire dataset.

The vectors are stored using vectordb for easy Approximate Nearest Neighbor search.

Build & installation (Docker)

Clone this repository
Download the Steam Games dataset from Kaggle and put games.json inside data/raw
Run docker compose up from the repo's directory.

Building the app will take a while (possibly 15+ minutes) since it's preprocessing the dataset and training doc2vec on it.

(I did not want to include a pretrained model or the data as part of this repository)

Running the app after it has been built uses the generated model and DBs so it should be much faster.

Due to heavy reliance on a local database, the app greatly benefits from using an SSD. Full Text Search and indexing on queried columns is used to reduce search delays.

By default, the frontend is available on the host machine at http://localhost:5173

Unit tests

Unit tests can be executed from the src directory by running python -m unittest discover ../tests

Utilizes the Steam Games dataset published on Kaggle by Martin Bustos Roman, originally generated with Steam Games Scraper by Martin Bustos.

Name		Name	Last commit message	Last commit date
Latest commit History 88 Commits
.github/workflows		.github/workflows
data		data
docs		docs
frontend		frontend
models		models
notebooks		notebooks
src		src
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
Dockerfile		Dockerfile
LICENSE		LICENSE
README.md		README.md
compose.yaml		compose.yaml
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

A similarity search and recommendation system for finding games on Steam

Build & installation (Docker)

Unit tests

About

Languages

License

AdrianKlessa/SteamScout

Folders and files

Latest commit

History

Repository files navigation

A similarity search and recommendation system for finding games on Steam

Build & installation (Docker)

Unit tests

About

Topics

Resources

License

Stars

Watchers

Forks

Languages