% title: Engineering a full python stack for biophysical computation
% author: Kyle A. Beauchamp, Patrick Grinaway, Choderalab@MSKCC
% author: Slides here: http://tinyurl.com/n4vq9aj
% favicon: figures/membrane.png
title: Moore's Law
http://en.wikipedia.org/wiki/Moore%27s_law
title: Does Moore's law apply to medicine?
http://seniorhousingforum.net/
title: New Drug Approvals: Stagnant
http://www.forbes.com/sites/matthewherper/2011/06/27/the-decline-of-pharmaceutical-researchmeasured-in-new-drugs-and-dollars/
---
title: erooM's Law for R&D Efficiency
Scannell, 2012
title: A crisis in drug discovery
subtitle: Producing a drug costs $2B and 15 years, with 95% fail rate
Paul, 2010
title: How can computers help design drugs?
Figure credit: @jchodera, Tom Cruise
title: A brief introduction to biophysics
class: segue dark nobackground
title: The cell as a bag of protein machines
http://mgl.scripps.edu/people/goodsell/illustration/public/ecoli-icon.gif
---
title: How do drugs work?
subtitle: By binding and controlling misbehaving proteins
PDB Code: 2HWO. Figure generated by @sonyahanson
title: Challenges in molecular medicine
- Can we link nanoscale biophysics with human disease?
- Can we rationally engineer protein-drug binding?
title: Our Toolbox: Molecular Dynamics
- Physics-based simulations of biomolecules
- Numerically integrate (classical) equations of motion
- Protein, water, salts, drugs
http://deshawresearch.com/
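
To make "numerically integrate (classical) equations of motion" concrete, here is a minimal sketch of the velocity Verlet scheme, a common integrator in molecular dynamics. It is a toy, not how any production MD code is written: one particle in 1D on a harmonic spring, with unit mass, unit spring constant, and a step size of 0.01, all chosen for illustration.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle in 1D with the velocity Verlet scheme."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Toy force field (illustrative assumption): a spring with constant k."""
    return -k * x


def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * mass * v * v + 0.5 * k * x * x


traj = velocity_verlet(x=1.0, v=0.0, force=harmonic_force,
                       mass=1.0, dt=0.01, n_steps=1000)

# For this toy system the exact answer is x(t) = cos(t), so we can compare.
analytic_x = math.cos(10.0)
energy_drift = abs(total_energy(*traj[-1]) - total_energy(*traj[0]))
```

A real simulation swaps the spring for a molecular force field over roughly $10^5$ atoms, but the loop has the same shape: forces, half kick, drift, forces, half kick.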
title: Software for Biophysics
class: segue dark nobackground
title: How do we reach biological timescales?
subtitle: Challenge: 10^5 atoms, 10^12 iterations.
Church, 2011.
title: OpenMM
subtitle: GPU accelerated molecular dynamics
- Extensible C++ library with Python wrappers
- Hardware backends for CUDA, OpenCL, CPU
- $>100$ nanoseconds ($10^{-7}$ s) per day on a GTX Titan
Eastman et al, 2012.
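
A back-of-the-envelope sketch of why the timescale challenge is hard on a single GPU. The $10^{12}$ iterations and the 100 ns/day throughput come from the slides; the 2 fs timestep is an assumption (a typical value for atomistic MD), not something stated in the talk.

```python
TIMESTEP_S = 2e-15       # assumed integration timestep: 2 femtoseconds
N_ITERATIONS = 10**12    # iteration count from the timescale challenge
NS_PER_DAY = 100.0       # single-GPU throughput quoted for a GTX Titan

# Total simulated time reached after all iterations, in seconds.
simulated_seconds = N_ITERATIONS * TIMESTEP_S

# Wall-clock days needed at the quoted throughput.
wallclock_days = simulated_seconds / (NS_PER_DAY * 1e-9)
wallclock_years = wallclock_days / 365.25

print(f"simulated time: {simulated_seconds * 1e3:.1f} ms")
print(f"wall clock: {wallclock_days:.0f} days (~{wallclock_years:.0f} years)")
```

Under these assumptions, $10^{12}$ steps cover about 2 ms of simulated time, which would take one GPU roughly 55 years. That gap is the motivation for distributed sampling.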
title: OpenMM Powers Folding@Home
- Largest distributed computing project
- 100,000 CPUs, 10,000 GPUs, 40 petaflops!
- Hundreds of microseconds per day aggregate simulation
- ~100 research papers on folding, misfolding, signalling
Gromacs also powers Folding@Home: http://gromacs.org/
title: Trajectory munging with MDTraj
subtitle: Read, write, and analyze trajectories with only a few lines of Python.
- Multitude of formats (PDB, DCD, XTC, HDF, CDF, mol2)
- Geometric trajectory analysis (distances, angles, RMSD)
- Numpy / SSE kernels enable Folding@Home scale analysis
McGibbon et al, 2014
title: Trajectory munging with MDTraj
subtitle: Lightweight Pythonic API

    import mdtraj as md
    trajectory = md.load("./trajectory.h5")
    indices, phi = md.compute_phi(trajectory)

mdtraj.org
McGibbon et al, 2014
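
For readers new to torsions: a phi angle is the dihedral defined by four consecutive backbone atoms. The sketch below computes one dihedral from four 3D points in plain Python using the standard atan2 formula. It illustrates the geometry only and is not MDTraj's implementation; the function name and test coordinates are made up for the example.

```python
import math


def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def _dot(p, q):
    return p[0] * q[0] + p[1] * q[1] + p[2] * q[2]


def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in radians defined by four points, in (-pi, pi]."""
    a = _sub(p1, p0)   # first bond vector
    b = _sub(p2, p1)   # central bond vector (the rotation axis)
    c = _sub(p3, p2)   # last bond vector
    n1 = _cross(a, b)  # normal to the plane containing a and b
    n2 = _cross(b, c)  # normal to the plane containing b and c
    y = math.sqrt(_dot(b, b)) * _dot(a, n2)
    x = _dot(n1, n2)
    return math.atan2(y, x)


# Hypothetical coordinates: the last point is rotated a quarter turn about z.
quarter_turn = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
cis = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
```

`md.compute_phi` does this for every residue and every frame at once, returning the atom indices that define each torsion along with the angles in radians.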
title: MDTraj IPython Notebook
mdtraj.org
McGibbon et al, 2014
title: MSMBuilder
subtitle: Finding meaning in massive simulation datasets
msmbuilder.org
https://github.com/msmbuilder/msmbuilder
title: MSMBuilder
subtitle: Markov State Models of Conformational Dynamics
Voelz, Bowman, Beauchamp, Pande. J. Am. Chem. Soc., 2010
title: MSMBuilder
subtitle: MSMBuilder: An sklearn-compatible framework for conformational dynamics

    # To install, `conda install -c https://conda.binstar.org/omnia msmbuilder`
    import mdtraj as md
    from msmbuilder import example_datasets, cluster, markovstatemodel
    from sklearn.pipeline import make_pipeline

    dataset = example_datasets.alanine_dipeptide.fetch_alanine_dipeptide()  # From Figshare!
    trajectories = dataset["trajectories"]  # List of MDTraj Trajectory Objects

    # Cluster conformations by RMSD, then fit a Markov state model on the cluster labels
    clusterer = cluster.KCenters(n_clusters=10, metric="rmsd")
    msm = markovstatemodel.MarkovStateModel()
    pipeline = make_pipeline(clusterer, msm)
    pipeline.fit(trajectories)

msmbuilder.org
https://github.com/msmbuilder/msmbuilder
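
What the final pipeline stage estimates, in miniature: once clustering has turned each trajectory into a sequence of state labels, a Markov state model is built from the counts of observed transitions. The sketch below uses the simplest estimator (row-normalized counts at a fixed lag time) on made-up label sequences. It is a stand-in for illustration, not MSMBuilder's estimator, and a library implementation may add constraints such as reversibility.

```python
def count_transitions(sequences, n_states, lag=1):
    """Count i -> j transitions observed `lag` frames apart (lag >= 1)."""
    counts = [[0] * n_states for _ in range(n_states)]
    for seq in sequences:
        for i, j in zip(seq[:-lag], seq[lag:]):
            counts[i][j] += 1
    return counts


def row_normalize(counts):
    """Turn a count matrix into transition probabilities, row by row."""
    matrix = []
    for row in counts:
        total = sum(row)
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical cluster labels for two short trajectories.
labeled_trajectories = [[0, 0, 1, 1, 0], [1, 1, 1, 0, 0]]

counts = count_transitions(labeled_trajectories, n_states=2)
transition_matrix = row_normalize(counts)
```

Each row of `transition_matrix` gives the estimated probability of moving from one conformational state to every other state after one lag time.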
title: Yank
subtitle: Fast, accurate alchemical ligand binding simulations
http://alchemistry.org/
title: Python Packaging Blues
class: segue dark nobackground
title: Building scientific software is hard!
User: I couldn't really install the mdtraj module on my computer [...]
User: I tried easy_install and other things and that didn't work for me.
title: Building scientific software is hard!
subtitle: 2008: I was compiling BLAS / Numpy / Scipy by hand...
Red means hard to install.
title: Building scientific software is hard!
subtitle: 2010: Switched to Enthought python
Red means hard to install.
title: Building scientific software is hard!
subtitle: Present: Conda
Red means hard to install.
title: Avoiding glibc Hell

    -bash-4.1$ parmchk2
    ~/opt/bin/parmchk2_pvt: /lib64/libc.so.6: version `GLIBC_2.14' not found

- Problem: users insist on old Linux versions
- Solution: build all recipes on a CentOS 6.6 VM
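
The error above appears when a binary linked against a newer glibc runs on a host with an older one. Building on an old distribution sidesteps this, since CentOS 6 ships glibc 2.12. As a rough sketch, a host can be checked up front with the standard library's `platform.libc_ver()`; the helper names here are illustrative, not part of any packaging tool.

```python
import platform


def parse_version(text):
    """Parse a dotted version such as '2.14' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def glibc_is_at_least(required, found=None):
    """Return True if the glibc version `found` satisfies `required`.

    When `found` is omitted, ask the running interpreter. Hosts that do
    not report glibc (other C libraries, other platforms) yield False.
    """
    if found is None:
        lib, found = platform.libc_ver()
        if lib != "glibc" or not found:
            return False
    return parse_version(found) >= parse_version(required)


# glibc 2.12 is what CentOS 6 ships; the failing binary wants 2.14.
runs_on_centos6 = glibc_is_at_least("2.14", found="2.12")
runs_on_this_host = glibc_is_at_least("2.14")
```

Comparing integer tuples matters here: as strings, "2.9" sorts after "2.14", which would give the wrong answer.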
title: Facile package sharing
User: I couldn't really install the mdtraj module on my computer [...]
User: I tried easy_install and other things and that didn't work for me.
Me: Installing mdtraj should be a one line command:
Me: `conda install -c https://conda.binstar.org/omnia mdtraj`
User: Success!
title: A full stack for biophysical computation
subtitle: Simulation, Munging, Analysis, Visualization
conda install -c https://conda.binstar.org/omnia/channel/test omnia
- OpenMM
- MDTraj
- MSMBuilder
- Yank
- EMMA$^1$
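
After installing the metapackage, a quick sanity check is to ask Python whether each package can be found. This is a minimal sketch using only the standard library. The import names in `STACK` are assumptions (for example, OpenMM of this era is assumed to be imported through the `simtk` namespace) and may differ between releases.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of top-level module names that cannot be located."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed import names for the stack; adjust to match the installed versions.
STACK = ["simtk", "mdtraj", "msmbuilder", "yank"]

print("missing:", missing_modules(STACK))
```

`find_spec` locates a module without importing it, so the check stays fast even for packages with heavy compiled extensions.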
title: Automating Biophysics
class: segue dark nobackground
title: Models are made to be broken
subtitle: How can we falsify and refine computer based models?
- Chemistry and biophysics are labor-intensive
- Thousands of parameters = thousands of measurements
- Reproducibility and scalability
title: Can experiments be easy as Py(thon)?

    from itctools.procedures import ITCExperiment
    from itctools.materials import Solvent
    from itctools.labware import Labware
    # [...]
    water = Solvent('water', density=0.9970479 * grams / milliliter)
    source_plate = Labware(RackLabel='SourcePlate', RackType='5x3 Vial Holder')
    experiment = ITCExperiment()

Work by @jhprinz and @bas-rustenburg
https://github.com/choderalab/robots
https://github.com/choderalab/itctools
title: Robots!
title: Biophysical modeling should be:
- Reproducible
- Automatable
- Accessible
- Tested
- Useful
title: People
- John Chodera + ChoderaLab (MSKCC)
- Robert McGibbon (Stanford)
- Peter Eastman (Stanford)
- Vijay Pande + PandeLab (Stanford)
- Daniel Parton (MSKCC)
- Yutong Zhao (Stanford, Folding@Home)
- Joy Ku (Stanford)
- Jason Swails (Rutgers)
- Justin MacCallum (U. Calgary)
title: Questions?
conda config --add channels http://conda.binstar.org/omnia/
conda install -c http://conda.binstar.org/omnia/channel/omnia1_beta1 omnia
omnia.md
openmm.org
mdtraj.org
github.com/msmbuilder/msmbuilder
github.com/choderalab/yank/