- Self-supervised Learning resources: [jason718/awesome-self-supervised-learning].
- [2014 ArXiv] Neural Turing Machines, [paper], [bibtex], sources: [carpedm20/NTM-tensorflow].
- [2016 Nature] Hybrid Computing Using a Neural Network with Dynamic External Memory, [paper], [bibtex], sources: [deepmind/dnc], [claymcleod/tf-differentiable-neural-computer].
- [2015 ICLR] Memory Networks, [paper], [bibtex], sources: [facebook/MemNN].
- [2015 NIPS] End-To-End Memory Networks, [paper], [bibtex], sources: [facebook/MemNN], [seominjoon/memnn-tensorflow], [domluna/memn2n], [carpedm20/MemN2N-tensorflow].
- [2015 ICMLW] Highway Networks, [paper], [bibtex], [homepage], sources: [IsaacChanghau/AmusingPythonCodes/highway_networks], [lucko515/fully-connected-highway-network], [fomorians/highway-cnn].
- [2015 NIPS] Training Very Deep Networks, [paper], [bibtex], sources: [trangptm/HighwayNetwork].
- [2017 ICML] Recurrent Highway Networks, [paper], [bibtex], sources: [julian121266/RecurrentHighwayNetworks].
- [2016 ArXiv] N-ary Error Correcting Coding Scheme, [paper], [bibtex].
- [2018 JIIS] Experimental Validation for N-ary Error Correcting Output Codes for Ensemble Learning of Deep Neural Networks, [paper], [bibtex].
- [2020 MobiMedia] Deep N-ary Error Correcting Output Codes, [paper], [bibtex], sources: [IsaacChanghau/DeepNaryECOC].
- [2018 ACL] Multi-Task Label Embedding for Text Classification, [paper], [bibtex], [blog].
- [2018 ACL] Joint Embedding of Words and Labels for Text Classification, [paper], [bibtex], [poster], sources: [guoyinwang/LEAM].
- [2018 TACL] GILE: A Generalized Input-Label Embedding for Text Classification, [paper], [bibtex], sources: [idiap/gile].
- [2017 IJCAI] DeepFM: A Factorization-Machine based Neural Network for CTR Prediction, [paper], [bibtex], sources: [shenweichen/DeepCTR].
- [2018 ArXiv] Next Item Recommendation with Self-Attention, [paper], [bibtex].
- [2018 ICDM] Self-Attentive Sequential Recommendation, [paper], [bibtex], sources: [kang205/SASRec].
- [2018 KDD] Multi-Pointer Co-Attention Networks for Recommendation, [paper], [bibtex], sources: [vanzytay/KDD2018_MPCN].
- [2019 RecSys] Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches, [paper], [bibtex], sources: [MaurizioFD/RecSys2019_DeepLearning_Evaluation].
- [2020 ArXiv] A Critical Study on Data Leakage in Recommender System Offline Evaluation, [paper], [bibtex].
- [2020 SIGIR] A Re-visit of the Popularity Baseline in Recommender Systems, [paper], [bibtex].
- [2021 SIGIR] Causal Intervention for Leveraging Popularity Bias in Recommendation, [paper], [bibtex], sources: [zyang1580/PDA].
- [2021 ArXiv] SELFCF: A Simple Framework for Self-supervised Collaborative Filtering, [paper], [bibtex], sources: [enoche/SelfCF].
- [2019 ICLR] DARTS: Differentiable Architecture Search, [paper], [bibtex], [homepage], sources: [quark0/darts].
- [2019 CVPR] Searching for A Robust Neural Architecture in Four GPU Hours, [paper], [bibtex], sources: [D-X-Y/GDAS].
- [2019 ICML] EigenDamage: Structured Pruning in the Kronecker-Factored Eigenbasis, [paper], [bibtex], [supplementary], sources: [alecwangcq/EigenDamage-Pytorch].
- [2019 CVPR] Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration, [paper], [bibtex], sources: [he-y/filter-pruning-geometric-median].
- [2009 ICML] Curriculum Learning, [paper], [bibtex].
- [2010 AISTATS] Understanding the difficulty of training deep feedforward neural networks, [paper], [bibtex].
- [2011 ICML] On Optimization Methods for Deep Learning, [paper], [bibtex], [homepage].
- [2013 ICML] Maxout Networks, [paper], [bibtex], sources: [philipperemy/tensorflow-maxout].
- [2014 JMLR] Dropout: A Simple Way to Prevent Neural Networks from Overfitting, [paper], [bibtex].
- [2015 ICCV] Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification, [paper], [bibtex], [Kaiming He's homepage], sources: [nutszebra/prelu_net].
- [2015 ICML] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, [paper], [bibtex], sources: [tomokishii/mnist_cnn_bn.py].
- [2016 ICLR] Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs), [paper], [bibtex].
- [2016 ArXiv] An overview of gradient descent optimization algorithms, [paper], [bibtex], [slides].
- [2016 ArXiv] Layer Normalization, [paper], [bibtex], sources: [ryankiros/layer-norm], [pbhatia243/tf-layer-norm], [NickShahML/tensorflow_with_latest_papers].
- [2016 ICLR] Incorporating Nesterov Momentum into Adam, [paper], [bibtex].
- [2016 ECCV] Layer Dropout: Deep Networks with Stochastic Depth, [paper], [bibtex], [poster], sources: [yueatsprograms/Stochastic_Depth], [samjabrahams/stochastic-depth-tensorflow].
- [2017 NIPS] Self-Normalizing Neural Networks, [paper], [bibtex], sources: [shaohua0116/Activation-Visualization-Histogram], [bioinf-jku/SNNs].
- [2017 ICLR] Recurrent Batch Normalization, [paper], [bibtex], sources: [cooijmanstim/recurrent-batch-normalization], [jihunchoi/recurrent-batch-normalization-pytorch].
- [2018 AAAI] Adversarial Dropout for Supervised and Semi-Supervised Learning, [paper], [bibtex], sources: [sungraepark/Adversarial-Dropout].
- [2019 NeurIPS] Understanding and Improving Layer Normalization, [paper], [bibtex], sources: [lancopku/AdaNorm].
- [2019 NeurIPS] Positional Normalization, [paper], [bibtex], sources: [Boyiliee/PONO].
- [2020 ArXiv] On Layer Normalization in the Transformer Architecture, [paper], [bibtex].
- [2017 ICLR] Categorical Reparameterization with Gumbel-Softmax, [paper], [bibtex], sources: [ericjang/gumbel-softmax].
- [2017 ICLR] The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables, [paper], [bibtex], sources: [pytorch/relaxed_categorical].
- [2018 ICLR] Learning Latent Permutations with Gumbel-Sinkhorn Networks, [paper], [bibtex], sources: [google/gumbel_sinkhorn], [HeddaCohenIndelman/Learning-Gumbel-Sinkhorn-Permutations-w-Pytorch].
- [2018 TCSVT] Sharp Attention Network via Adaptive Sampling for Person Re-identification, [paper], [bibtex].
- [2019 CVPR] Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling, [paper], [bibtex].
- [2019 CVPR] Searching for A Robust Neural Architecture in Four GPU Hours, [paper], [bibtex], sources: [D-X-Y/GDAS].
- [2020 ACL] How Does Selective Mechanism Improve Self-Attention Networks?, [paper], [bibtex], sources: [xwgeng/SSAN].
- [2021 ICLR] Rao-Blackwellizing the Straight-Through Gumbel-Softmax Gradient Estimator, [paper], [bibtex].
- [2018 ICML] Mutual Information Neural Estimation, [paper], [bibtex], sources: [MasanoriYamada/Mine_pytorch], [mzgubic/MINE], [gtegner/mine-pytorch].
- [2018 ArXiv] Representation Learning with Contrastive Predictive Coding, [paper], [bibtex], sources: [davidtellez/contrastive-predictive-coding], [flrngel/cpc-tensorflow], [jefflai108/Contrastive-Predictive-Coding-PyTorch].
- [2019 ICLR] Deep Graph Infomax, [paper], [bibtex], sources: [PetarV-/DGI].
- [2019 ICLR] Learning Deep Representations by Mutual Information Estimation and Maximization, [paper], [bibtex], sources: [rdevon/DIM].
- [2019 NeurIPS] Learning Representations by Maximizing Mutual Information Across Views, [paper], [bibtex], sources: [Philip-Bachman/amdim-public].
- [2009 ICCV] Fast and Robust Earth Mover’s Distances, [paper], [bibtex], sources: [wmayner/pyemd], [LeeKamentsky/pyemd].
- [2015 ICML] From Word Embeddings To Document Distances, [paper], [bibtex], sources: [mkusner/wmd], [src-d/wmd-relax], [stephenhky/PyWMD].
- [2017 ICML] Wasserstein Generative Adversarial Networks, [paper], [supplementary], [arxiv], [bibtex], [homepage], [explanation 1], [explanation 2], [explanation 3], sources: [kpandey008/wasserstein-gans], [martinarjovsky/WassersteinGAN], [luslab/scRNAseq-WGAN-GP].
- [2019 ACL] Sentence Mover's Similarity: Automatic Evaluation for Multi-Sentence Texts, [paper], [bibtex], sources: [eaclark07/sms].
- [2020 CVPR] DeepEMD: Few-Shot Image Classification with Differentiable Earth Mover's Distance and Structured Classifiers, [paper], [bibtex], sources: [icoz69/DeepEMD].
- [2013 ICML] Deep Canonical Correlation Analysis, [paper], [bibtex], sources: [VahidooX/DeepCCA], [DTaoo/DCCA], [msamribeiro/deep-cca], [wangxu-scu/DeepCCA].
- [2014 EACL] CCA: Improving Vector Space Word Representations Using Multilingual Correlation, [paper], [bibtex].
- [2017 TIML] Efficient Methods and Hardware for Deep Learning, [Ph.D Thesis], [bibtex], [Song Han's homepage], [slides].
- [2017 NIPS] SVCCA: Singular Vector Canonical Correlation Analysis for Deep Learning Dynamics and Interpretability, [paper], [bibtex], sources: [google/svcca].
- [2017 ArXiv] One Model To Learn Them All, [paper], [blog], [bibtex].
- [2017 ArXiv] An Overview of Multi-Task Learning in Deep Neural Networks, [paper], [bibtex].
- [2017 PNAS] Robust Continuous Clustering, [paper], [bibtex], sources: [sohilas/robust-continuous-clustering], [yhenon/pyrcc], [shahsohil/DCC].
- [2018 ArXiv] Tunneling Neural Perception and Logic Reasoning through Abductive Learning, [paper], [bibtex].
- [2018 AAAI] Reliable Multi-View Clustering, [paper], [bibtex].
- [2018 AAAI] SC2Net: Sparse LSTMs for Sparse Coding, [paper], [bibtex], sources: [joeyzhouty/sc2net].
- [2019 ArXiv] Implicit Generation and Generalization in Energy-Based Models, [paper], [bibtex], [homepage], [blog], [ext. readings], sources: [openai/ebm_code_release], [rosinality/igebm-pytorch].
- [2019 SIGIR] Finding Camouflaged Needle in a Haystack? Pornographic Products Detection via Berrypicking Tree Model, [paper], [bibtex], [slides], sources: [GuoxiuHe/BIRD].
- [2019 ICML] COMIC: Multi-view Clustering Without Parameter Selection, [paper], [bibtex].
- [2019 NeurIPS] PyTorch: An Imperative Style, High-Performance Deep Learning Library, [paper], [bibtex].
- [2019 NeurIPS] Addressing Failure Prediction by Learning Model Confidence, [paper], [bibtex], sources: [valeoai/ConfidNet].
- [2021 ArXiv] Attention is not all you need: pure attention loses rank doubly exponentially with depth, [paper], [bibtex], sources: [twistedcubic/attention-rank-collapse].
- [2021 ICLR] Trusted Multi-View Classification, [paper], [bibtex], sources: [hanmenghan/TMC].