-
-
Notifications
You must be signed in to change notification settings - Fork 47
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Modularize compiler around the idea of a "standard library" using functors #1182
base: master
Are you sure you want to change the base?
Conversation
It'd certainly be nice if it didn't. I don't have a good sense of the weight of the downsides here, though. |
Functors are pretty common in OCaml (e.g., they're how one would implement any of the common generic data structures like I've also been trying to use an alternative feature called "virtual libraries" which covers similar use cases while being much simpler (no module applications, etc), but I am not sure if it is possible to do it in this way due to some issues that arise with circular dependencies which functors cleverly avoid. |
Scratch that, I believe I have it working with the much lighter-weight virtual library approach. This does lose the benefit that things that use the library look structurally different, but that is the very thing that makes functors "heavy" syntactically |
I'm going to close this in favor of #1184 |
Codecov Report
@@ Coverage Diff @@
## master #1182 +/- ##
==========================================
- Coverage 88.79% 88.67% -0.12%
==========================================
Files 64 68 +4
Lines 9844 9970 +126
==========================================
+ Hits 8741 8841 +100
- Misses 1103 1129 +26
|
This is something I played around with on a slow day last week. I think it's an interesting idea, but not one I'm myself sold on. I'm opening it up to get feedback/discussion if we want to move in this direction.
This idea came from a discussion with @mandel and @gbdrt at ProbProg 2021. They had been working on an effort to translate Stan programs to Pyro and NumPyro at https://github.com/deepppl, and I asked them what about the current implementation of stanc made it easier or harder to do this kind of work. One thing that came up was the reliance on the Stan math library signatures.
Current implementation
At the moment, we have a file
src/middle/Stan_math_signatures.ml
which defines a lookup table of all the C++ functions we'd like to expose at the language level. We've essentially been treating these as built in to Stan, but I'd argue this is actually very backend specific. The fact that all these functions need to be defined for your backend to work is part of what stalled the Tensorflow backend, and it created a considerable effort for the deepppl folks, who defined a name-matching Python library: https://github.com/deepppl/stan-num-pyro/blob/main/stan-pyro/stanpyro/stanlib.py. For functions which don't exist or they don't support, they defined dummy signatures which throw a runtime exception.If we're serious about stanc being architected to support alternative backends, doing something better here is pretty key. I would argue (and I think so would the name of the file) that, as a starting point,
Stan_math_signatures
belongs in thesrc/stan_math_backend
folder, not themiddle
folder.The changes
This PR is conceptually pretty simple. It defines a basic module interface for what we consider to be a "Stan standard library". This had previously been discussed in the review of #1115. The interface looks a lot like the Stan_math_signatures file, but genericified just a touch.
Here it is in full:
Then, every module which currently depends on
Stan_math_signatures
was made into a functor over this module. This means that rather than there being a module calledTypechecker
, there is now a functor calledTypechecking
, which when you "call" it with an instance of a standard library module (the above type), it will give you back a typechecker module which works like the old one.I tried to make as few functors as possible, so some modules just got a slight refactor (e.g. SignatureMismatch was being supplied the list of signatures for everything except operators, which it was checking from the library. Now it just gets supplied them for everything, so it doesn't need to depend on the library implemenation).
At the call site, this looks basically the same as the non-functor status quo. E.g., in stanc.ml, we add the line
module Typechecker = Typechecking.Make (Stan_math_library)
and then proceed as before by callingTypechecker.check_prog
etc.If anyone is wondering how to think about this, you can conceptually imagine
Library
as a Java interface, and functors like Typechecking aspublic class Typechecker<T extends Library>
. This is not a good analogy for the complete ML module system, but for the specific use here of dependency injection it is more or less correct.The Good
But it does! It turns out the lowering of
~
statements is dependent on which exact signatures are defined! This refactor makes things like that extremely obvious, since the things that depend on the library look structurally different than those that don't (e.g. they're functors rather than normal modules).The Bad
struct ... end
syntax required to define a functor...._intf.ml
file (see .mli file usage #358 and discussion)The Unsure-How-To-Feel
Library
type..stan
file. However, there is nothing stopping a backend from defining a function which does not exist in any other backend, making them disjoint. This could be a pro or con, I think it is subjective.Submission Checklist
Release notes
Refactored internals of the compiler to be less tied to the specific functions exposed in the Stan Math C++ library.
Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the BSD 3-clause license (https://opensource.org/licenses/BSD-3-Clause)