% title: Engineering a full python stack for biophysical computation
% author: Kyle A. Beauchamp, Patrick Grinaway, Choderalab@MSKCC
% author: Slides here: http://tinyurl.com/n4vq9aj
% favicon: figures/membrane.png


title: Moore's Law

http://en.wikipedia.org/wiki/Moore%27s_law

title: Does Moore's law apply to medicine?

http://seniorhousingforum.net/

title: New Drug Approvals: Stagnant

http://www.forbes.com/sites/matthewherper/2011/06/27/the-decline-of-pharmaceutical-researchmeasured-in-new-drugs-and-dollars/

---
title: erooM's Law for R&D Efficiency

Scannell, 2012

title: A crisis in drug discovery
subtitle: Producing a drug costs $2B and takes 15 years, with a 95% failure rate

Paul, 2010

title: How can computers help design drugs?

Figure credit: @jchodera, Tom Cruise

title: A brief introduction to biophysics
class: segue dark nobackground


title: The cell as a bag of protein machines

http://mgl.scripps.edu/people/goodsell/illustration/public/ecoli-icon.gif

---
title: How do drugs work?
subtitle: By binding and controlling misbehaving proteins

PDB Code: 2HWO. Figure generated by @sonyahanson

title: Challenges in molecular medicine

  • Can we link nanoscale biophysics with human disease?
  • Can we rationally engineer protein-drug binding?

title: Our Toolbox: Molecular Dynamics

  • Physics-based simulations of biomolecules
  • Numerically integrate (classical) equations of motion
  • Protein, water, salts, drugs
Shan et al: J. Am. Chem. Soc. (2011).
http://deshawresearch.com/
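
The "numerically integrate" step can be sketched with velocity Verlet, a standard integrator for classical equations of motion. This is a minimal illustration on a 1D harmonic oscillator, not the integrator of any particular MD package; the function name `velocity_verlet`, the spring constant, and the timestep are assumptions chosen for the example.

```python
import math

def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate Newton's equations of motion for one particle in 1D."""
    f = force(x)
    for _ in range(n_steps):
        v += 0.5 * dt * f / mass   # half-step velocity update with the old force
        x += dt * v                # full-step position update
        f = force(x)               # force at the new position
        v += 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

# Hypothetical test system: harmonic spring with k = 1 and m = 1.
k, mass = 1.0, 1.0
x, v = velocity_verlet(1.0, 0.0, lambda pos: -k * pos, mass, dt=0.001, n_steps=1000)

# Total energy should stay close to its initial value of 0.5.
energy = 0.5 * mass * v**2 + 0.5 * k * x**2
print(x, v, energy)
```

A real simulation applies the same update to every atom in 3D, with forces from a molecular force field instead of a single spring.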

title: Software for Biophysics
class: segue dark nobackground


title: How do we reach biological timescales?
subtitle: Challenge: 10^5 atoms, 10^12 iterations.

Church, 2011.

title: OpenMM
subtitle: GPU accelerated molecular dynamics

  • Extensible C++ library with Python wrappers
  • Hardware backends for CUDA, OpenCL, CPU
  • $>100$ nanoseconds ($10^{-7}$ s) per day on a GTX Titan

openmm.org
Eastman et al, 2012.
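
To put the throughput figure in context, the arithmetic below converts 100 ns/day into integration steps and estimates the wall-clock time for the $10^{12}$ iterations quoted for biological timescales. The 2 fs timestep is an assumption (a common choice in MD), not a number from these slides.

```python
TIMESTEP_FS = 2.0        # assumed integration timestep, in femtoseconds
NS_PER_DAY = 100.0       # throughput quoted for a GTX Titan
TARGET_STEPS = 10**12    # iteration count quoted for biological timescales

steps_per_day = NS_PER_DAY * 1e6 / TIMESTEP_FS   # 1 ns = 1e6 fs
steps_per_second = steps_per_day / 86400.0       # 86400 seconds per day
days_needed = TARGET_STEPS / steps_per_day

print(f"{steps_per_day:.1e} steps/day, {steps_per_second:.0f} steps/s")
print(f"{days_needed:.0f} days on one GPU for {TARGET_STEPS:.0e} steps")
```

Under these assumptions, a single GPU would need about 20,000 days, which is why aggregate throughput across many machines matters.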

title: OpenMM Powers Folding@Home

  • Largest distributed computing project
  • 100,000 CPUs, 10,000 GPUs, 40 petaflops!
  • Hundreds of microseconds per day aggregate simulation
  • ~100 research papers on folding, misfolding, signalling

http://folding.stanford.edu/
Gromacs also powers Folding@Home: http://gromacs.org/

title: Trajectory munging with MDTraj
subtitle: Read, write, and analyze trajectories with only a few lines of Python.

  • Multitude of formats (PDB, DCD, XTC, HDF, CDF, mol2)
  • Geometric trajectory analysis (distances, angles, RMSD)
  • Numpy / SSE kernels enable Folding@Home scale analysis

mdtraj.org
McGibbon et al, 2014
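
As an illustration of the geometric quantities listed above, here is a pure-Python sketch of an interatomic distance and a simple RMSD. It is not MDTraj's implementation: the helper names are hypothetical, and this RMSD skips the optimal superposition step that a full structural RMSD normally includes.

```python
import math

def distance(a, b):
    """Euclidean distance between two 3D points."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def rmsd_no_fit(frame, reference):
    """RMSD between two frames with identical atom ordering, without alignment."""
    if len(frame) != len(reference):
        raise ValueError("frames must contain the same number of atoms")
    squared = sum(distance(a, b) ** 2 for a, b in zip(frame, reference))
    return math.sqrt(squared / len(frame))

# Hypothetical three-atom frame and a copy translated by 0.5 along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
shifted = [(x, y, z + 0.5) for x, y, z in reference]
print(distance(reference[0], reference[1]), rmsd_no_fit(shifted, reference))
```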

title: Trajectory munging with MDTraj
subtitle: Lightweight Pythonic API

import mdtraj as md

trajectory = md.load("./trajectory.h5")
indices, phi = md.compute_phi(trajectory)

mdtraj.org
McGibbon et al, 2014
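
For readers unfamiliar with backbone angles, the sketch below shows the geometry behind a dihedral such as phi: the signed angle between the planes defined by four consecutive atoms. It is a hand-written illustration with hypothetical helper names, not MDTraj's code, and the sign convention used here is an assumption.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in radians defined by four 3D points."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)   # normals of the two planes
    b1_len = math.sqrt(_dot(b1, b1))
    # atan2 keeps the sign of the angle and is stable near 0 and pi.
    return math.atan2(_dot(b1, _cross(n1, n2)), b1_len * _dot(n1, n2))

# Hypothetical coordinates: the last atom is rotated a quarter turn about the central bond.
angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(math.degrees(angle))
```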

title: MDTraj IPython Notebook

mdtraj.org
McGibbon et al, 2014

title: MSMBuilder
subtitle: Finding meaning in massive simulation datasets

msmbuilder.org
https://github.com/msmbuilder/msmbuilder

title: MSMBuilder
subtitle: Markov State Models of Conformational Dynamics

Voelz, Bowman, Beauchamp, Pande. J. Am. Chem. Soc., 2010
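
The core idea of a Markov state model can be sketched in a few lines: assign each frame to a discrete state, count transitions at a fixed lag time, and normalize each row. The example below is a plain count-and-normalize estimate on a made-up state sequence; production estimators may add constraints such as reversibility, and the function name is hypothetical.

```python
from collections import Counter

def estimate_transition_matrix(state_sequence, n_states, lag=1):
    """Row-normalized transition counts observed at the given lag time."""
    counts = Counter(zip(state_sequence[:-lag], state_sequence[lag:]))
    matrix = []
    for i in range(n_states):
        row = [counts[(i, j)] for j in range(n_states)]
        total = sum(row)
        # A state that is never a starting point gets a row of zeros.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix

# Hypothetical discrete trajectory over two conformational states.
sequence = [0, 0, 0, 1, 1, 0, 0, 1, 1, 1]
T = estimate_transition_matrix(sequence, n_states=2)
for row in T:
    print(row)
```

Each entry `T[i][j]` estimates the probability of being in state `j` one lag time after being in state `i`.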

title: MSMBuilder
subtitle: An sklearn-compatible framework for conformational dynamics

# To install, `conda install -c https://conda.binstar.org/omnia msmbuilder`
import mdtraj as md
from msmbuilder import example_datasets, cluster, markovstatemodel
from sklearn.pipeline import make_pipeline

dataset = example_datasets.alanine_dipeptide.fetch_alanine_dipeptide()  # From Figshare!
trajectories = dataset["trajectories"]  # List of MDTraj Trajectory Objects

clusterer = cluster.KCenters(n_clusters=10, metric="rmsd")
msm = markovstatemodel.MarkovStateModel()

pipeline = make_pipeline(clusterer, msm)
pipeline.fit(trajectories)

msmbuilder.org
https://github.com/msmbuilder/msmbuilder
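
The clustering step in the pipeline above can be pictured with the greedy k-centers heuristic: repeatedly promote the point farthest from all existing centers. This is a sketch on 1D toy data with a user-supplied metric; whether MSMBuilder's `KCenters` follows exactly these steps is an assumption, and the function below is not its API.

```python
def k_centers(points, n_clusters, metric):
    """Greedy k-centers: return center indices and a cluster label per point."""
    centers = [0]  # start from the first point (an arbitrary choice)
    dist_to_center = [metric(p, points[0]) for p in points]
    labels = [0] * len(points)
    while len(centers) < n_clusters:
        # The next center is the point farthest from every current center.
        new_center = max(range(len(points)), key=dist_to_center.__getitem__)
        centers.append(new_center)
        for i, p in enumerate(points):
            d = metric(p, points[new_center])
            if d < dist_to_center[i]:
                dist_to_center[i] = d
                labels[i] = len(centers) - 1
    return centers, labels

# Hypothetical 1D data with three visible groups.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 9.9, 10.0]
centers, labels = k_centers(points, n_clusters=3, metric=lambda a, b: abs(a - b))
print(centers, labels)
```

With trajectories, `points` would be conformations and `metric` a structural distance such as RMSD.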

title: Yank
subtitle: Fast, accurate alchemical ligand binding simulations

https://github.com/choderalab/yank
http://alchemistry.org/

title: Python Packaging Blues
class: segue dark nobackground


title: Building scientific software is hard!

User: I couldn't really install the mdtraj module on my computer [...]
User: I tried easy_install and other things and that didn't work for me.


title: Building scientific software is hard!
subtitle: 2008: I was compiling BLAS / Numpy / Scipy by hand...

Red means hard to install.

title: Building scientific software is hard!
subtitle: 2010: Switched to Enthought python

Red means hard to install.

title: Building scientific software is hard!
subtitle: Present: Conda

Red means hard to install.

title: Avoiding glibc Hell

-bash-4.1$ parmchk2
~/opt/bin/parmchk2_pvt: /lib64/libc.so.6: version `GLIBC_2.14' not found
  • Problem: users insist on old Linux versions
  • Solution: build all recipes on a CentOS 6.6 VM
https://github.com/omnia-md/virtual-machines/
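
The error above means the binary was linked against a newer glibc than the host provides. A quick way to check a machine from Python is sketched below using the standard library's `platform.libc_ver()`; the helper names are hypothetical, and the check only compares version numbers rather than inspecting the binary itself.

```python
import platform

def parse_version(text):
    """Turn '2.14' or 'GLIBC_2.14' into a comparable tuple such as (2, 14)."""
    return tuple(int(part) for part in text.replace("GLIBC_", "").split("."))

def glibc_is_sufficient(available, required):
    """True if the available glibc version is at least the required one."""
    return parse_version(available) >= parse_version(required)

# libc_ver() returns empty strings when the C library cannot be identified.
libc_name, libc_version = platform.libc_ver()
if libc_name == "glibc" and libc_version:
    print("GLIBC_2.14 available:", glibc_is_sufficient(libc_version, "GLIBC_2.14"))
else:
    print("Could not determine a glibc version on this platform")
```

Building on an old distribution works because binaries linked against an older glibc generally run on newer ones, but not the reverse.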

title: Facile package sharing

User: I couldn't really install the mdtraj module on my computer [...]
User: I tried easy_install and other things and that didn't work for me.

Me: Installing mdtraj should be a one line command:
Me: `conda install -c https://conda.binstar.org/omnia mdtraj`

User: Success!


title: A full stack for biophysical computation
subtitle: Simulation, Munging, Analysis, Visualization

conda install -c https://conda.binstar.org/omnia/channel/test omnia
  • OpenMM
  • MDTraj
  • MSMBuilder
  • Yank
  • EMMA$^1$
1: Senne, Noe. J. Chem. Theor. Comp. 2012

title: Automating Biophysics
class: segue dark nobackground


title: Models are made to be broken
subtitle: How can we falsify and refine computer based models?

  • Chemistry and biophysics are labor-intensive
  • Thousands of parameters = thousands of measurements
  • Reproducibility and scalability

title: Can experiments be easy as Py(thon)?

from itctools.procedures import ITCExperiment
from itctools.materials import Solvent
from itctools.labware import Labware

# [...]

water = Solvent('water', density=0.9970479 * grams / milliliter)
source_plate = Labware(RackLabel='SourcePlate', RackType='5x3 Vial Holder')
experiment = ITCExperiment()


Work by @jhprinz and @bas-rustenburg
https://github.com/choderalab/robots
https://github.com/choderalab/itctools

title: Robots!


title: Biophysical modeling should be:

  • Reproducible
  • Automatable
  • Accessible
  • Tested
  • Useful

title: People

  • John Chodera + ChoderaLab (MSKCC)
  • Robert McGibbon (Stanford)
  • Peter Eastman (Stanford)
  • Vijay Pande + PandeLab (Stanford)
  • Daniel Parton (MSKCC)
  • Yutong Zhao (Stanford, Folding@Home)
  • Joy Ku (Stanford)
  • Jason Swails (Rutgers)
  • Justin MacCallum (U. Calgary)
Jan-Hendrik Prinz, Bas Rustenburg, Sonya Hanson
Greg Bowman, Christian Schwantes, TJ Lane, Vince Voelz, Imran Haque, Matthew Harrigan, Carlos Hernandez, Bharath Ramsundar, Lee-Ping Wang
Frank Noe, Martin Scherer, Xuhui Huang, Sergio Bacallado, Mark Friedrichs

title: Questions?

conda config --add channels http://conda.binstar.org/omnia/
conda install -c http://conda.binstar.org/omnia/channel/omnia1_beta1 omnia

omnia.md

openmm.org

mdtraj.org

github.com/msmbuilder/msmbuilder

github.com/choderalab/yank/