Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Julia PPLs? #184

Open
trappmartin opened this issue Aug 4, 2020 · 12 comments
Open

Julia PPLs? #184

trappmartin opened this issue Aug 4, 2020 · 12 comments

Comments

@trappmartin
Copy link

Hi,
This looks like a really nice project. Do you aim to also include codes for Julia based PPLs? If so, I’m happy to have a look at it.

Cheers,
Martin

@MansMeg
Copy link
Collaborator

MansMeg commented Aug 21, 2020

Hi! Sorry for my late respons (vacation). I would be happy to add Julia PPL! I guess the first step would be to reproduce the 8-schools in Julia and Ill add that code to get the structure in? then we can just add additional models. We should also find a way to check that the models are identical using Travis.

Im happy to help!

@trappmartin
Copy link
Author

Super cool. I’ll open a PR, but it might take a few days as I’m currently preparing for my viva next week.

@MansMeg
Copy link
Collaborator

MansMeg commented Aug 22, 2020

Sound great! Just let me know if I can help somehow?

@sethaxen
Copy link
Contributor

sethaxen commented Nov 2, 2021

I'd like to contribute to this as I find the time. But before I get started, @MansMeg, how do you recommend proceeding? 1 model per PR, or are batches per PR fine? (How) do you ensure the models are up-to-date/accurate?

For Stan, each model is a text block, but for Python and Julia, each model is a script. How is this then packaged by posteriordb so that it can be used?

@MansMeg
Copy link
Collaborator

MansMeg commented Nov 3, 2021

Hi!
I think the first step is to get the structure right for the julia models. So I think the best would be if you could create a first model that is identical to any of the Stan models already in (say eight schools).

The accuracy should probably be tested by checking the (proportional) log density values for a set of parameter draws. I can help fixing this.

Lastly, we should try to setup a julia test suite to check that the julia models will run/work.

I dont know exactly how to best formulate a julia model, so here it would be good with suggestions from you. What do you think would be best?

@sethaxen
Copy link
Contributor

sethaxen commented Nov 7, 2021

I think for Julia PPLs it would be best for each model to be defined in a Julia script that would include

  1. any necessary package imports
  2. any necessary function definitions
  3. the model definition

There should also be a Project.toml file that specifies the version bounds for the used packages. Ideally there'd be one per model, but that would be painful to maintain, and since there will probably only be a few required packages (e.g. DifferentialEquations.jl), it's probably okay to have one global Project.toml.

Here are two example models reproduced in Turing.jl:

eight_schools_noncentered
using Turing

@model function eight_schools_noncentered(
    J, # number of schools
    y, # estimated treatment
    sigma, # std of estimated effect
)
    mu ~ Normal(0, 5) # hyper-parameter of mean
    tau ~ truncated(Cauchy(0, 5), 0, Inf) # hyper-parameter of sdv, a non-informative prior    
    theta_trans ~ filldist(Normal(0, 1), J) # transformation of theta
    theta = theta_trans .* tau .+ mu # original theta, treatment effect in school j
    y ~ MvNormal(theta, sigma)
end
hudson_lynx_hare
using DifferentialEquations, Turing

function lotka_volterra(
    du, # system rate {prey, predator}
    u, # system state {prey, predator}
    p, # parameters
    t, # time
)
    x, y = u
    alpha, beta, gamma, delta = p
    du[1] = (alpha - beta * y) * x # dx
    du[2] = (-gamma + delta * x) * y # dy
end

@model function model(
    N, # number of measurement times
    ts, # measurement times > 0
    y_init, # initial measured populations
    y, # measured populations
    tspan=extrema(ts),
    prob_base=ODEProblem(lotka_volterra, [log(10), log(10)], tspan, [1, 0.05, 1, 0.05]),
)
    theta ~ MvNormal([1, 0.05, 1, 0.05], [0.5, 0.05, 0.5, 0.05]) # { alpha, beta, gamma, delta }
    sigma ~ filldist(LogNormal(-1, 1), 2) # measurement errors
    z_init ~ filldist(LogNormal(log(10), 1), 2) # initial population

    prob = remake(prob_base; u0=z_init, p=theta)
    sol = solve(
        prob;
        saveat=ts,
        save_start=false,
        save_end=false,
        rel_tol=1e-5,
        abs_tol=1e-3,
        maxiters=500,
    )
    z = reduce(vcat, sol')

    y_init .~ LogNormal.(log.(z_init), sigma)
    y .~ LogNormal.(log.(z), sigma')
end

A potential complication is that in Julia there are multiple automatic differentiation packages that could be used for sampling, and these are independent of the PPLs. It's possible that a given model implementation will work well with one backend but not another, so it might be worth it to hardcode the AD backend choice in the model definition as well.

I understand Stan has both forward- and reverse-mode ADs. Is the best AD mode stored for stan models in the database somehow, or is the default always used?

@trappmartin, what are your thoughts? Anything I've missed?

@MansMeg
Copy link
Collaborator

MansMeg commented Nov 7, 2021

Hi!

This looks great!

I actually dont know the exakt backend used by Stan by heart, but it is the default used.

I think this looks great. I guess the next step(s) are:

  1. Compute the (proportional) log density for these two models for a set of parameters (both in Stan and Turing).
  2. Add the proportional log density values to the database
  3. Add the julia code for the model
  4. add how to run Turing with posteriord for these (two) models
  5. add a julia test suite to check added julia models
  6. add more models

I could do 1-3, but would most likely need help with 4 and 5.

@trappmartin
Copy link
Author

Hi there,
I have seen that there is now an extra repo for R and python. Would we need one for Julia then, as this repo doesn't contain any scripts to run inference on models?

If I understand correctly, a JuliaPosteriorDB repo would then handle the loading of libraries. Or it seems this is how it is done for PyMC.

@MansMeg
Copy link
Collaborator

MansMeg commented Nov 9, 2021

Yes. I could try to get such a repo up.

@sethaxen
Copy link
Contributor

@MansMeg I have created a repo PosteriorDB.jl that, similarly to posteriordb-r and posteriordb-python, provides functionality for working with posteriordb. If you haven't started on a Julia repo yet, would you like this to be it, and if so, would it make sense to move it to this org?

@MansMeg
Copy link
Collaborator

MansMeg commented Oct 25, 2022

That sounds great! Im happy to look into this.

@sethaxen
Copy link
Contributor

Great! IIRC, the process would be:

  1. I transfer ownership of the repo to a user who has repository creation rights in the stan-dev org
  2. That user then transfers ownership to the stan-dev org

Also, I would prefer I be given administrative rights on the repository after transfer so I can continue maintaining it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants