Building Machine Learning Infrastructure!
- Distributed Machine Learning: Algorithms, Theory, and Practice (分布式机器学习:算法、理论与实践)
- Serving Machine Learning Models: A Guide to Architecture, Stream Processing Engines, and Frameworks
- A Hitchhiker’s Guide On Distributed Training of Deep Neural Networks, Chinese version
- An overview of gradient descent optimization algorithms
- Optimization Methods for Large-Scale Machine Learning
- A Comparison of Distributed Machine Learning Platforms
- Applied Machine Learning at Facebook: A Datacenter Infrastructure Perspective
- CoCoA: A General Framework for Communication-Efficient Distributed Optimization
- Device Placement Optimization with Reinforcement Learning
- Dynamic Control Flow in Large-Scale Machine Learning
- Large Scale Distributed Deep Networks
- Large Scale Machine Learning by Ronan Collobert, 2004
- Large-Scale Machine Learning and Applications
- Machine Learning: The High-Interest Credit Card of Technical Debt
- MLbase: A Distributed Machine-learning System
- More Effective Distributed ML via a Stale Synchronous Parallel Parameter Server
- Revisiting Distributed Synchronous SGD
- Automatic Differentiation in Machine Learning: a Survey
- GPipe: Efficient Training of Giant Neural Networks using Pipeline Parallelism
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
- Today’s AI Software Infrastructure Landscape
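- Resources: Large Scale Machine Learning
- A curated collection of distributed machine learning / deep learning papers
- The Four Levels of Large-Scale Machine Learning Frameworks
- ScaledML conference talks by Jeff Dean, Yangqing Jia, and others
- Accelerating TensorFlow Distributed Training with Gradient Compression
- Linear Speedup for TensorFlow Distributed Training in Practice
- Achieving Linear Speedup for Multi-Machine Parallel TensorFlow
- Optimizers for Machine Learning Platforms: optimization part, platform part
- Zhihu discussion: build low-level AI frameworks or high-level AI applications?
- Commentary on Large Scale Distributed Deep Networks
- Easy to Build, Hard to Maintain! Hard-Won Lessons from Google's Machine Learning Systems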
- Building intelligence systems with large-scale deep learning, video
- Machine Learning at Facebook: An Infrastructure View, video, translation, commentary
- Yangqing Jia of Facebook AI: AI Is Evolving from a Big Data Problem into a High-Performance Computing Problem