Combine regenerate and update into an update method taking an UpdateSpec object #282

Draft
wants to merge 72 commits into
base: master

@georgematheos georgematheos commented Jul 8, 2020

This PR builds upon #279 to change the signature of update to the following:

(new_tr, weight, retdiff, reverse_update_spec) = update(trace, args, argdiffs, update_spec::UpdateSpec, externally_constrained_addresses::Selection)

update_spec is a specific type of AddressTree. It can select some addresses to regenerate using the internal proposal distribution, constrain other addresses to given choices, or include address tree leaves of type CustomUpdateSpec, which specify an update that a custom generative function knows how to perform. As I currently conceive it, a custom update spec must be equivalent to some combination of selecting and constraining addresses, but at some point I think we could reformalize this in a way that relaxes the requirement, and use custom update specs to effectively let generative functions implement multiple internal proposal distributions.

externally_constrained_addresses is a selection containing every address whose value an external proposal distribution will constrain when the reverse move to this update is applied. This selection determines the weight that update calculates: the weight will include the term Q[old_tr | get_selected(reverse_update_spec, externally_constrained_addresses)]. To implement the old update weight, we set externally_constrained_addresses = AllSelection(); to implement the old regenerate weight, we set externally_constrained_addresses = EmptySelection().
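
To make the role of the selection concrete, here is a minimal, self-contained sketch of how restricting a reverse update spec by a selection might behave. Every name in it (SketchSelection, SketchAll, SketchEmpty, SketchAddrs, is_selected, sketch_get_selected) is a hypothetical stand-in and not part of this branch; the reverse spec is flattened to a plain Dict, whereas the real one is an address tree.

```julia
# Toy selections: stand-ins for AllSelection(), EmptySelection(), and an
# explicit selection of addresses. None of these are Gen's actual types.
abstract type SketchSelection end
struct SketchAll <: SketchSelection end
struct SketchEmpty <: SketchSelection end
struct SketchAddrs <: SketchSelection
    addrs::Set{Symbol}
end

# Does the selection cover a given address?
is_selected(::SketchAll, addr::Symbol) = true
is_selected(::SketchEmpty, addr::Symbol) = false
is_selected(s::SketchAddrs, addr::Symbol) = addr in s.addrs

# Toy analogue of get_selected: keep only the entries of the (flattened)
# reverse update spec whose addresses the selection covers.
function sketch_get_selected(reverse_spec::Dict{Symbol,Float64}, sel::SketchSelection)
    return filter(pair -> is_selected(sel, first(pair)), reverse_spec)
end

reverse_spec = Dict(:slope => 0.5, :intercept => -1.0)

# Old update weight: every reverse-spec address counts as externally constrained.
all_external = sketch_get_selected(reverse_spec, SketchAll())

# Old regenerate weight: no address counts as externally constrained.
none_external = sketch_get_selected(reverse_spec, SketchEmpty())

# Mixed case: only :slope is constrained by an external proposal.
some_external = sketch_get_selected(reverse_spec, SketchAddrs(Set([:slope])))
```

Under these assumptions the two legacy weights are the extremes of one parameter, and a move that constrains some addresses while regenerating others falls in between.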

I have implemented syntactic sugar so the old calls to update and regenerate still work, by being translated into the correct call to the new update method.
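
As a rough illustration of that translation, the sketch below forwards two legacy-style entry points to one unified method. It is a toy under stated assumptions, not the code in this PR: ToyUpdateSpec, ToyAll, ToyEmpty, toy_update, legacy_update and legacy_regenerate are all hypothetical names, traces are plain Dicts, and no weight is computed.

```julia
# Toy stand-ins for AllSelection() and EmptySelection(); not Gen's real types.
abstract type ToySelection end
struct ToyAll <: ToySelection end
struct ToyEmpty <: ToySelection end

# A toy update spec: values to constrain plus addresses to regenerate.
struct ToyUpdateSpec
    constraints::Dict{Symbol,Float64}
    regenerated::Set{Symbol}
end

# Unified entry point. It applies the constraints and returns what it was
# called with, so the forwarding below can be inspected. Regeneration and
# weight computation are deliberately left out of this sketch.
function toy_update(trace::Dict{Symbol,Float64}, spec::ToyUpdateSpec, external::ToySelection)
    new_trace = merge(trace, spec.constraints)
    return (new_trace, spec, external)
end

# Legacy-style update: constraints only, all addresses externally constrained.
legacy_update(trace, constraints::Dict{Symbol,Float64}) =
    toy_update(trace, ToyUpdateSpec(constraints, Set{Symbol}()), ToyAll())

# Legacy-style regenerate: selection only, no addresses externally constrained.
legacy_regenerate(trace, addrs::Set{Symbol}) =
    toy_update(trace, ToyUpdateSpec(Dict{Symbol,Float64}(), addrs), ToyEmpty())

trace = Dict(:x => 1.0, :y => 2.0)
(updated, update_spec, update_external) = legacy_update(trace, Dict(:x => 3.0))
(_, regen_spec, regen_external) = legacy_regenerate(trace, Set([:y]))
```

Here the only difference between the two legacy calls is which half of the spec is populated and which selection is passed as the last argument.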

I have changed the dynamic DSL, static DSL, CallAt, Map, and Unfold combinators to use this new update method. I have not changed Recurse yet.

This is still a WIP and I need to document these changes and do some performance engineering. I am putting this online since others may need access to this branch to use some of my open universe inference tools.

notes:

  • Recurse did not have an implementation for regenerate, and I have not yet implemented the fully general update function for it
  • Looks like there is currently a 2x slowdown for the static DSL inference benchmark; I have a few ideas where this might be coming from and will try to fix it once I get a chance.

TODOS:

  • Add documentation (for AddressTree and the new update interface)
  • Performance
  • Add testing for updates which select some addresses and constrain others
  • Add a new variant of metropolis_hastings which allows for simultaneous selection and constraints

Resolves #266, resolves #279, resolves #259, resolves #258, resolves #189, resolves #274, resolves #263

@georgematheos

Made a couple of performance improvements; benchmarks are below. The static DSL is now slightly faster than on the master branch (roughly 0.29 to 0.33 s versus 0.33 to 0.36 s), while the dynamic DSL is noticeably slower (roughly 7.3 s versus 5.1 s). Asymptotic performance does not seem to be majorly affected in these simple tests.

The slowdown for the dynamic DSL appears to have been introduced by my initial PR, which introduced the concept of a "Value Choice Map"; performance on dynamic DSL inference did not degrade (if anything, it improved slightly) when I made distributions generative functions, implemented address trees, and changed the update signature.

Benchmarks:

This PR:

Simple static DSL (including CallAt nodes) MH on regression model:
  0.307637 seconds (4.13 M allocations: 302.513 MiB, 14.84% gc time)
  0.288067 seconds (4.13 M allocations: 302.513 MiB, 13.35% gc time)

Simple dynamic DSL MH on regression model:
  7.253488 seconds (87.12 M allocations: 4.507 GiB, 11.41% gc time)
  7.367530 seconds (87.12 M allocations: 4.507 GiB, 11.71% gc time)

georgematheos@Georges-MBP-3 benchmarks % julia run_benchmarks.jl
Simple static DSL (including CallAt nodes) MH on regression model:
  0.326150 seconds (4.13 M allocations: 302.513 MiB, 16.59% gc time)
  0.317942 seconds (4.13 M allocations: 302.513 MiB, 14.69% gc time)

Simple dynamic DSL MH on regression model:
  7.357534 seconds (87.12 M allocations: 4.507 GiB, 11.98% gc time)
  7.435310 seconds (87.12 M allocations: 4.507 GiB, 12.35% gc time)

# asymptotics check:
Simple static DSL (including CallAt nodes) MH on regression model - 5x as many data points:
  1.420621 seconds (20.91 M allocations: 1.895 GiB, 11.45% gc time)
  1.415851 seconds (20.91 M allocations: 1.895 GiB, 12.54% gc time)

Simple dynamic DSL MH on regression model - 1/5 as many data points:
  0.419291 seconds (4.92 M allocations: 253.056 MiB, 12.15% gc time)
  0.421394 seconds (4.92 M allocations: 253.056 MiB, 11.92% gc time)

Master branch:

Simple static DSL (including CallAt nodes) MH on regression model:
  0.359577 seconds (4.35 M allocations: 309.954 MiB, 13.78% gc time)
  0.329181 seconds (4.35 M allocations: 309.954 MiB, 12.14% gc time)

Simple dynamic DSL MH on regression model:
  5.208065 seconds (68.65 M allocations: 4.524 GiB, 16.93% gc time)
  5.033524 seconds (68.65 M allocations: 4.524 GiB, 16.42% gc time)

georgematheos@Georges-MBP-3 benchmarks % julia run_benchmarks.jl
Simple static DSL (including CallAt nodes) MH on regression model:
  0.339774 seconds (4.35 M allocations: 309.954 MiB, 11.95% gc time)
  0.331594 seconds (4.35 M allocations: 309.954 MiB, 10.95% gc time)

Simple dynamic DSL MH on regression model:
  5.200960 seconds (68.65 M allocations: 4.524 GiB, 16.24% gc time)
  5.014927 seconds (68.65 M allocations: 4.524 GiB, 15.83% gc time)

# asymptotics check:
Simple static DSL (including CallAt nodes) MH on regression model - 5x as many data points:
  1.621701 seconds (21.96 M allocations: 1.928 GiB, 13.03% gc time)
  1.598446 seconds (21.96 M allocations: 1.928 GiB, 12.40% gc time)

Simple dynamic DSL MH on regression model - 1/5 as many data points:
  0.315384 seconds (3.91 M allocations: 256.996 MiB, 17.18% gc time)
  0.300466 seconds (3.91 M allocations: 256.996 MiB, 14.59% gc time)
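
For a quick summary of the full-size runs above, the following sketch averages the four reported wall-clock times per configuration and takes the ratio of this PR to master. The timings are copied from the output above; the variable names and the choice to average all four runs are assumptions made for illustration.

```julia
# Arithmetic mean, defined locally to keep the sketch dependency-free.
mean_time(xs) = sum(xs) / length(xs)

# Wall-clock seconds copied from the benchmark output above (two invocations, two runs each).
pr_static      = [0.307637, 0.288067, 0.326150, 0.317942]
master_static  = [0.359577, 0.329181, 0.339774, 0.331594]
pr_dynamic     = [7.253488, 7.367530, 7.357534, 7.435310]
master_dynamic = [5.208065, 5.033524, 5.200960, 5.014927]

# A ratio above 1 means this PR is slower than master; below 1 means faster.
static_ratio  = mean_time(pr_static) / mean_time(master_static)
dynamic_ratio = mean_time(pr_dynamic) / mean_time(master_dynamic)

println("static DSL, PR / master:  ", round(static_ratio, digits = 2))
println("dynamic DSL, PR / master: ", round(dynamic_ratio, digits = 2))
```

With only four samples per configuration and visible run-to-run variation, these ratios are indicative rather than precise.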

@georgematheos

TODO: I realize that the current implementation of distributions as generative functions here is incompatible with the distribution DSL, since it assumes that a distribution is fully defined by its type information. This should be a quick fix.
