Iterator APIs (via ResumableFunctions.jl) #121

InnovativeInventor · 2021-07-26T03:54:52Z

Since #94 was stagnating, I decided to break out the iterator-style API into this PR using ResumableFunctions.jl.

Caveat: The API exposed by this PR is different from the one proposed in #94. Rather than just returning scores, it also returns the Partition object as well in a tuple. This makes it more versatile.

Although this is a breaking change, this brings GerryChain Julia more in line with GerryChain Python and allows the user to save and export state more easily. This API is different from the one proposed in #94 in that it returns a Partition object and the scores.

The iterators are now exposed in `recom_chain_iter` and `flip_chain_iter` respectively to avoid any breaking changes.

src/flip.jl

InnovativeInventor · 2021-07-26T03:58:58Z

For some more context, I decided to not break backwards compatibility and made recom_chain and flip_chain dumb wrappers around recom_chain_iter and flip_chain_iter respectively.

InnovativeInventor · 2021-07-26T04:04:11Z

a678eb6#diff-ade2663ea888449e8028f941e69e33ec7f16635c54248240974af0ea18ffe682R288 is a long line, but it is necessary since I can't add it to the multi-line comment above and the workaround itself is finnicky.

pjrule

I am the original author of most of these changes (over in #94), and I've run a bunch of chains this way—it works! However, would love a review from other folks just to make sure I didn't miss any subtle ordering issues, etc.

InnovativeInventor · 2021-07-26T04:15:28Z

I am the original author of most of these changes (over in #94), and I've run a bunch of chains this way—it works! However, would love a review from other folks just to make sure I didn't miss any subtle ordering issues, etc.

Yep, this is a fairly simple, but impactful change (it really changes up the internals), that should hopefully maintain backwards compatibility. I've tested it with my scripts, etc. and it appears to work -- more testing and eyeballs would be better.

Finally, as this change is a minor but backwards-compatible change, the minor version number should be bumped up as per semantic versioning (e.g. 0.1.3 -> 0.2.0).

pjrule · 2021-07-26T04:48:59Z

Actually, I just spotted an issue here: the semantics of the iterator are different from the Python version of GerryChain, which may confuse our users. Specifically, Partition objects are basically immutable in Python; the iterator in the MarkovChain object returns distinct Partition objects. In this version, we update a single Partition struct at each step by default. This behavior can be changed internally with the copy_partition flag of update_partition!, but this is not really exposed to users. Regardless of how easy the behavior is to change, though, the defaults are most relevant here: the Python version is immutable by default, and the Julia version is mutable by default. Julia users expecting the Python behavior will, to their surprise, end up with a bunch of pointers to the same Partition struct. Some ideas, roughly in order of performance impact (least to greatest):

Ignore this and add a note in the docstring.
Introduce a copy flag to recom_chain_iter; set it to false by default. (Internally, this would be forwarded to the aforementioned copy_partition flag.)
For each chain, provide two iterator functions: a basic one that returns just scores (always immutable), and an advanced one that returns the latest Partition (copy or not) and the scores. This sounds clunky—and it kinda is—but in most cases I imagine users will only want the scores. IMO, that's the usage we should encourage.
Introduce a copy flag to recom_chain_iter; set it to true by default.

InnovativeInventor · 2021-07-26T14:26:52Z

I'm mostly in favor of adding a note about this slight divergence from GerryChain Python in the docs for a few reasons. Firstly, I'm not quite sure when/how often someone would want a copy of the partition object outside the for loop (I've certainly never had to do that) -- even then, it should be fairly easy to detect that all the copies one is storing are the same. Secondly, changing the behavior to be less speedy (by default) undercuts the only selling point of GerryChain Julia: speed. I don't think we should compromise on that, especially with the defaults. So, I'm not sure if we're sabotaging our performance out of concern for a phantom use case.

I do like the third idea (however, it doesn't really answer whether we should copy the partition by default). What do others think?

sunnymatt

LGTM!!! Glad that we have this option. I wish there were a cleaner way to name the functions besides just adding _iter but that's not, like, a fixable thing afaik. 👍 thank you @InnovativeInventor for this!

InnovativeInventor added 3 commits July 25, 2021 22:41

Add recom_chain and flip_chain functions for backwards compat.

34f6585

The iterators are now exposed in `recom_chain_iter` and `flip_chain_iter` respectively to avoid any breaking changes.

Update type signatures of the iterator APIs

5fe1128

InnovativeInventor requested a review from pjrule July 26, 2021 03:55

pjrule reviewed Jul 26, 2021

View reviewed changes

src/flip.jl Outdated Show resolved Hide resolved

Point to issue on GitHub with the bug in ResumableFunctions.jl

a678eb6

pjrule reviewed Jul 26, 2021

View reviewed changes

pjrule requested review from bsuwal, sunnymatt and gabeschoenbach July 26, 2021 04:04

Export iter API

d2004de

InnovativeInventor mentioned this pull request Jul 26, 2021

Saving assignments #49

Closed

InnovativeInventor linked an issue Jul 26, 2021 that may be closed by this pull request

Saving assignments #49

Closed

sunnymatt approved these changes Jul 27, 2021

View reviewed changes

Add note about Partition being mutable and changing in-place

3405111

InnovativeInventor removed request for bsuwal and gabeschoenbach August 3, 2021 15:00

InnovativeInventor merged commit bee809d into main Aug 3, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Iterator APIs (via ResumableFunctions.jl) #121

Iterator APIs (via ResumableFunctions.jl) #121

InnovativeInventor commented Jul 26, 2021

InnovativeInventor commented Jul 26, 2021

InnovativeInventor commented Jul 26, 2021

pjrule left a comment

InnovativeInventor commented Jul 26, 2021 •

edited

Loading

pjrule commented Jul 26, 2021

InnovativeInventor commented Jul 26, 2021

sunnymatt left a comment

Iterator APIs (via ResumableFunctions.jl) #121

Iterator APIs (via ResumableFunctions.jl) #121

Conversation

InnovativeInventor commented Jul 26, 2021

InnovativeInventor commented Jul 26, 2021

InnovativeInventor commented Jul 26, 2021

pjrule left a comment

Choose a reason for hiding this comment

InnovativeInventor commented Jul 26, 2021 • edited Loading

pjrule commented Jul 26, 2021

InnovativeInventor commented Jul 26, 2021

sunnymatt left a comment

Choose a reason for hiding this comment

InnovativeInventor commented Jul 26, 2021 •

edited

Loading