LibRecommender

Overview

LibRecommender is an easy-to-use recommender system focused on end-to-end recommendation. The main features are:

Implements a number of popular recommendation algorithms such as SVD++, DeepFM, and BPR; see the full algorithm list.
A hybrid recommender system that lets users rely on collaborative filtering, content-based features, or both.
Low memory usage: categorical and multi-value categorical features are automatically converted to a sparse representation.
Supports training on both explicit and implicit datasets; negative sampling can be used for implicit datasets.
Uses Cython or TensorFlow for high-speed model training.
Provides an end-to-end workflow: data handling / preprocessing -> model training -> evaluation -> serving.
Provides a unified and friendly API for all algorithms.
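
The negative sampling mentioned in the feature list can be illustrated in a few lines of NumPy. This is only a sketch under simple assumptions (uniform sampling, a hypothetical `sample_negatives` helper, and a made-up `interactions` dict); it is not LibRecommender's own implementation.

```python
import numpy as np


def sample_negatives(user_items, n_items, num_neg=1, seed=42):
    """Return (user, item, label) triples: each observed pair gets label 1,
    followed by `num_neg` uniformly sampled unseen items with label 0."""
    rng = np.random.default_rng(seed)
    samples = []
    for user, items in user_items.items():
        seen = set(items)
        if len(seen) >= n_items:
            # the user has interacted with every item, nothing left to sample
            continue
        for item in items:
            samples.append((user, item, 1))
            for _ in range(num_neg):
                neg = int(rng.integers(n_items))
                while neg in seen:  # resample until the item is unseen
                    neg = int(rng.integers(n_items))
                samples.append((user, neg, 0))
    return samples


# hypothetical toy data: user id -> list of consumed item ids
interactions = {0: [1, 3], 1: [0]}
samples = sample_negatives(interactions, n_items=5)
```

Each positive record ends up paired with one sampled negative, so the resulting data can be treated as a binary classification problem.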

Usage

Pure collaborative-filtering example:

import numpy as np
import pandas as pd
from libreco.data import random_split, DatasetPure
from libreco.algorithms import SVDpp  # pure data, algorithm SVD++

data = pd.read_csv("examples/sample_data/sample_movielens_rating.dat", sep="::",
                   names=["user", "item", "label", "time"],
                   engine="python")  # multi-character separators need the python engine

# split the whole data into three parts for training, evaluation and testing
train_data, eval_data, test_data = random_split(data, multi_ratios=[0.8, 0.1, 0.1])

train_data, data_info = DatasetPure.build_trainset(train_data)
eval_data = DatasetPure.build_testset(eval_data)
test_data = DatasetPure.build_testset(test_data)
print(data_info)   # n_users: 5894, n_items: 3253, data sparsity: 0.4172 %

svdpp = SVDpp(task="rating", data_info=data_info, embed_size=16, n_epochs=3, lr=0.001, 
              reg=None, batch_size=256)
# monitor metrics on eval_data during training
svdpp.fit(train_data, verbose=2, eval_data=eval_data, metrics=["rmse", "mae", "r2"])

# do final evaluation on test data
svdpp.evaluate(test_data, metrics=["rmse", "mae"])  
# predict preference of user 1 to item 2333
print("prediction: ", svdpp.predict(user=1, item=2333))
# recommend 7 items for user 1
print("recommendation(id, probability): ", svdpp.recommend_user(user=1, n_rec=7))
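
The "rmse" and "mae" metrics requested in the example above are standard regression errors. A minimal NumPy sketch of how they are usually defined follows; the helper names and the toy `labels` / `preds` values are hypothetical, and this is not the code libreco runs internally.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean squared error between true and predicted ratings."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean absolute error between true and predicted ratings."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))


# hypothetical ratings and predictions
labels = [5, 3, 4]
preds = [4, 3, 2]
print("rmse:", rmse(labels, preds))
print("mae:", mae(labels, preds))  # absolute errors are 1, 0, 2, so the mean is 1.0
```

RMSE penalizes large errors more heavily than MAE, which is why the two are often reported together.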

Example with features:

import numpy as np
import pandas as pd
from libreco.data import split_by_ratio_chrono, DatasetFeat
from libreco.algorithms import YouTubeRanking  # feat data, algorithm YouTubeRanking

data = pd.read_csv("examples/sample_data/sample_movielens_merged.csv", sep=",", header=0)
data["label"] = 1  # convert to implicit data and do negative sampling afterwards

# split into train and test data based on time
train_data, test_data = split_by_ratio_chrono(data, test_size=0.2)

# specify complete columns information
sparse_col = ["sex", "occupation", "genre1", "genre2", "genre3"]
dense_col = ["age"]
user_col = ["sex", "age", "occupation"]
item_col = ["genre1", "genre2", "genre3"]

train_data, data_info = DatasetFeat.build_trainset(
    train_data, user_col, item_col, sparse_col, dense_col
)
test_data = DatasetFeat.build_testset(test_data)
train_data.build_negative_samples(data_info)  # sample negative items for each record
test_data.build_negative_samples(data_info)
print(data_info)  # n_users: 5962, n_items: 3226, data sparsity: 0.4185 %

ytb_ranking = YouTubeRanking(task="ranking", data_info=data_info, embed_size=16, 
                             n_epochs=3, lr=1e-4, batch_size=512, use_bn=True, 
                             hidden_units="128,64,32")
ytb_ranking.fit(train_data, verbose=2, shuffle=True, eval_data=test_data,
                metrics=["loss", "roc_auc", "precision", "recall", "map", "ndcg"])

# predict preference of user 1 to item 2333
print("prediction: ", ytb_ranking.predict(user=1, item=2333))  
# recommend 7 items for user 1
print("recommendation(id, probability): ", ytb_ranking.recommend_user(user=1, n_rec=7))

For more examples and usage, see the User Guide.

Data Format

The data format is plain: each line represents one sample. One point is important: the model assumes that the user, item, and label columns are at indices 0, 1, and 2, respectively, so reorder the columns if that is not the case. Take, for example, the movielens-1m dataset:

1::1193::5::978300760
1::661::3::978302109
1::914::3::978301968
1::3408::4::978300275
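
If a raw table stores its columns in a different order, they can be rearranged with pandas before the dataset is built. A minimal sketch, assuming hypothetical source column names (`user_id`, `movie_id`, `rating`) and a small made-up frame:

```python
import pandas as pd

# hypothetical raw data whose columns are in the "wrong" order
raw = pd.DataFrame({
    "time": [978300760, 978302109],
    "rating": [5, 3],
    "movie_id": [1193, 661],
    "user_id": [1, 1],
})

# move user, item and label to positions 0, 1 and 2; keep the rest as-is
leading = ["user_id", "movie_id", "rating"]
rest = [col for col in raw.columns if col not in leading]
data = raw[leading + rest]

# rename to the column names used in the examples of this README
data = data.rename(columns={"user_id": "user", "movie_id": "item", "rating": "label"})
print(list(data.columns))  # ['user', 'item', 'label', 'time']
```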

To use additional meta features (e.g., age, sex, category), tell the model which columns are [sparse_col, dense_col, user_col, item_col]. This means all features must be in the same table. See the YouTubeRanking example above.
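
Before building a dataset it can help to sanity-check the four column lists. The sketch below follows the pattern of the YouTubeRanking example, where every feature is listed once by type (sparse or dense) and once by owner (user or item). The `check_feature_columns` helper is hypothetical, and whether the library itself enforces these rules is an assumption.

```python
def check_feature_columns(columns, sparse_col, dense_col, user_col, item_col):
    """Return a list of human-readable problems; an empty list means the spec looks consistent."""
    problems = []
    by_type = set(sparse_col) | set(dense_col)
    by_owner = set(user_col) | set(item_col)

    # every listed feature must exist in the table
    missing = sorted((by_type | by_owner) - set(columns))
    if missing:
        problems.append(f"columns not found in data: {missing}")

    # a feature cannot be both sparse and dense, or both a user and an item feature
    both_types = sorted(set(sparse_col) & set(dense_col))
    if both_types:
        problems.append(f"listed as both sparse and dense: {both_types}")
    both_owners = sorted(set(user_col) & set(item_col))
    if both_owners:
        problems.append(f"listed as both user and item feature: {both_owners}")

    # the type lists and the owner lists should cover the same features
    mismatch = sorted(by_type ^ by_owner)
    if mismatch:
        problems.append(f"not covered by both type and owner lists: {mismatch}")
    return problems


# column lists from the feature example in this README
table_columns = ["user", "item", "label", "time", "sex", "age",
                 "occupation", "genre1", "genre2", "genre3"]
sparse_col = ["sex", "occupation", "genre1", "genre2", "genre3"]
dense_col = ["age"]
user_col = ["sex", "age", "occupation"]
item_col = ["genre1", "genre2", "genre3"]
problems = check_feature_columns(table_columns, sparse_col, dense_col, user_col, item_col)
print(problems)  # []
```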

Serving

For how to serve a trained model in LibRecommender, see the Serving Guide.

Installation & Dependencies

From PyPI:

$ pip install LibRecommender==0.2.2

To build from source, you'll first need Cython and NumPy:

$ # pip install numpy cython
$ git clone https://github.com/massquantity/LibRecommender.git
$ cd LibRecommender
$ python setup.py install

Basic Dependencies in `libreco`:

Python >= 3.6
tensorflow >= 1.14
numpy >= 1.15.4
pandas >= 0.23.4
scipy >= 1.2.1
scikit-learn >= 0.20.0
gensim >= 3.6.0
tqdm >= 4.46.0
hnswlib

LibRecommender is tested under TensorFlow 1.14 and 2.3. If you encounter any problem while running it, feel free to open an issue.

Optional Serving Dependencies:

flask >= 1.0.0
requests >= 2.22.0
redis == 3.0.6
redis-py >= 3.3.5
faiss == 1.5.2
TensorFlow Serving

References

Algorithm	Category	Paper
userCF / itemCF	pure	Item-Based Collaborative Filtering Recommendation Algorithms
SVD	pure	Matrix Factorization Techniques for Recommender Systems
SVD++	pure	Factorization Meets the Neighborhood: a Multifaceted Collaborative Filtering Model
ALS	pure	1. Matrix Completion via Alternating Least Square(ALS) / 2. Collaborative Filtering for Implicit Feedback Datasets / 3. Applications of the Conjugate Gradient Method for Implicit Feedback Collaborative Filtering
NCF	pure	Neural Collaborative Filtering
BPR	pure	BPR: Bayesian Personalized Ranking from Implicit Feedback
Wide & Deep	feat	Wide & Deep Learning for Recommender Systems
FM	feat	Factorization Machines
DeepFM	feat	DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
YouTubeMatch / YouTubeRanking	feat, seq	Deep Neural Networks for YouTube Recommendations
AutoInt	feat	AutoInt: Automatic Feature Interaction Learning via Self-Attentive Neural Networks
DIN	feat, seq	Deep Interest Network for Click-Through Rate Prediction
Item2Vec	pure, seq	Item2Vec: Neural Item Embedding for Collaborative Filtering
RNN4Rec / GRU4Rec	pure, seq	Session-based Recommendations with Recurrent Neural Networks

Here, pure denotes collaborative-filtering algorithms that use only behavior data, feat means other features can be included, and seq denotes sequence or graph algorithms.

License

MIT
