Introduction to Stan for New Developers
Welcome to Stan! We're excited that you are interested in contributing to the project. Before you can contribute you will have to familiarize yourself with the particular processes for new contributions that we have incorporated to facilitate the growth of the project.
The Stan project is hosted on GitHub, so you will have to create a GitHub account if you do not already have one. Developer discussions are hosted on Discourse, so you will also need an account there in order to ask questions or participate in discussions.
You can see the overarching Stan project structure here. Each of the repos can be worked on independently, though some include others as git submodules where they depend on them. Each repo also has its own wiki! Don't forget to check that wiki's homepage and search it for information related to that subproject.
People use Stan in many contexts. When we're deciding how to add, remove, or modify Stan code, we need to understand what our goals are. This typically involves three steps: a discussion where we try to elicit concrete use cases for the feature; a GitHub issue in the appropriate repo containing something resembling a spec, which the reviewer can use to evaluate the associated pull request; and the pull request itself. These three artifacts live in different locations, so each one should begin with links to the others and a summary of the results of the previous steps in the workflow. To summarize:
- Bring up your proposed feature for discussion on our forums. If you're just looking for a place to help out, you can skip this step and pick an existing issue on the appropriate GitHub repo.
- Summarize the discussion and write something approaching a high-level spec in a GitHub issue.
- Create a pull request that attempts to address the GitHub issue.
You can read more about the developer process here.
We have adopted the GitFlow process for incorporating new contributions into Stan. If you are not yet familiar with Git, we recommend that you check out one of the many great Git tutorials freely available online. Once you are comfortable with Git itself, you can read about our particular implementation of GitFlow here and here.
All new contributions are also tested with our continuous integration framework.
Every developer has their own local development setup, but we have compiled various helpful tricks that you might find useful.
In order to ensure that we can quickly read and understand contributions, consistent style is incredibly important. We have adopted conventions for code quality and code style to which all contributions must conform. You can read more on these links, but we use an automated formatter for many of our conventions.
There is a list of supported compilers and language features here.
The robustness of Stan is only as good as our test coverage, and we require that all new contributions be adequately tested. We use the GoogleTest framework for writing tests, and GNU Make and Python for running them.
We have two main sources of documentation: Doxygen doc comments and the Stan manual. You can read more about contributing to the former here. The latter typically has a GitHub issue associated with each Stan release on the Stan repo, but we also take pull requests to the .tex files.
There are other forms of documentation listed on the website here.
Much of what you might consider to be the "core" of Stan actually exists in the Math repo. This document applies to that repo, but you can read more about how that repo is organized and any differences here.
The core code in Stan is written in heavily templated C++ to ensure high performance. There are many great C++ tutorials available online, for example cplusplus.org, and once you are familiar with the basics of the language you can tackle the subtleties of templates. We highly recommend the books by Vandevoorde and Josuttis and by Alexandrescu.
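To give a feel for why templates matter here, the following is a minimal sketch, not code from Stan itself. It assumes a toy forward-mode `Dual` type and a function `neg_half_square`, both hypothetical names invented for this illustration. The point is only that one templated definition can serve plain `double` arithmetic and a derivative-carrying scalar type alike.

```cpp
#include <cassert>

// Toy forward-mode dual number: a value plus a derivative.
// Hypothetical illustration only; this is not a Stan Math type.
struct Dual {
  double val;  // function value
  double der;  // derivative with respect to the input of interest
};

// The operators required by the templated function below.
inline Dual operator-(const Dual& a, const Dual& b) {
  return {a.val - b.val, a.der - b.der};
}
inline Dual operator*(const Dual& a, const Dual& b) {
  return {a.val * b.val, a.der * b.val + a.val * b.der};  // product rule
}
inline Dual operator*(double c, const Dual& a) {
  return {c * a.val, c * a.der};
}

// One templated definition works for every scalar type T that
// supports the operators used in the body.
template <typename T>
T neg_half_square(const T& y, const T& mu) {
  T diff = y - mu;
  return -0.5 * (diff * diff);
}

// Self-check: the same function evaluated with double and with Dual.
inline void check_neg_half_square() {
  assert(neg_half_square(3.0, 1.0) == -2.0);

  Dual y{3.0, 1.0};   // seed the derivative with respect to y
  Dual mu{1.0, 0.0};  // mu is held constant
  Dual r = neg_half_square(y, mu);
  assert(r.val == -2.0);
  assert(r.der == -2.0);  // d/dy of -0.5 * (y - mu)^2 is -(y - mu)
}
```

The compiler generates a separate, fully inlined version of `neg_half_square` for each scalar type, so the generic code carries no runtime dispatch cost.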
There are many additional resources available for learning how to optimize C++ code, including Agner Fog's manuscript and the many books of, amongst others, Scott Meyers and Herb Sutter.
Having a comprehensive set of useful densities coded in the Stan math library is a benefit to users. Densities are also a maintenance burden both for testing and for understanding the code base. As a result we are somewhat cautious about including new densities. Guidelines for including densities:
- The pdf, cdf, and rng should be available so users of the Stan language don't need to check the manual.
- There should be a computational benefit to coding the density in C++. Some densities can easily and efficiently be specified in the Stan language and the benefits of coding them in C++ are limited. It helps to provide some evidence of the computational benefits.
- The density should be applicable to a range of problems.
- If the density's C++ code re-implements or improves on functions already present in the math library, the necessary improvements should be coded separately in the math library.
- The code author should have an ongoing interest in maintaining the code.
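As a rough illustration of the first guideline, here is a minimal sketch of a density, its cdf, and its rng kept together, using the exponential distribution. The `toy_exponential_*` names and signatures are assumptions made for this example and do not match the Stan Math library; working on the log scale for the density is likewise a choice made here.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical names for illustration; the signatures in the
// Stan Math library differ.

// Log density of Exponential(beta) at y >= 0, with rate beta > 0.
inline double toy_exponential_lpdf(double y, double beta) {
  return std::log(beta) - beta * y;
}

// Cumulative distribution function: P(Y <= y).
inline double toy_exponential_cdf(double y, double beta) {
  return 1.0 - std::exp(-beta * y);
}

// Random draw by inverse-CDF sampling: solve u = cdf(y) for y.
template <typename RNG>
double toy_exponential_rng(double beta, RNG& rng) {
  std::uniform_real_distribution<double> unif(0.0, 1.0);
  return -std::log1p(-unif(rng)) / beta;
}

// Self-check tying the three functions together.
inline void check_toy_exponential() {
  assert(toy_exponential_lpdf(0.0, 1.0) == 0.0);
  assert(std::fabs(toy_exponential_cdf(std::log(2.0), 1.0) - 0.5) < 1e-12);

  std::mt19937 rng(42);
  for (int i = 0; i < 1000; ++i) {
    double draw = toy_exponential_rng(2.0, rng);
    assert(draw >= 0.0);  // draws lie in the support of the density
  }
}
```

Even for a density this simple, each of the three functions needs its own tests, which is part of the maintenance burden the guidelines weigh against the benefit to users.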
The Stan interfaces wrap the core C++ code and expose its functionality to other languages, such as R and Python. Consequently, contributions to the interfaces may require knowledge of how to couple these languages together, for example with Rcpp and Cython, or they may be built entirely in the interface language. For details on a specific interface, please consult the corresponding GitHub repository.
Once you have familiarized yourself with our process take a look at the GitHub issue trackers for the many tasks that need to be tackled! We look forward to hearing from you on Discourse and seeing your pull requests!
- Stan project structure
- Stan Users Group
- Discourse
- Developer process
- Contributing a new Stan function
- Stan C++ style guide (includes some developer environment setup)
- Autodiff paper (details the implementation and the math library generally)
- Some Bayesian Modeling Techniques in Stan
- [Vandevoorde's C++ Templates](http://www.josuttis.com/tmplbook/)
- [Alexandrescu's Modern C++ Design](http://erdani.com/index.php/books/modern-c-design/)
- Agner Fog's manuscript