
Update Stan users guide chapter on HMMs #838

Open · charlesm93 opened this issue Nov 27, 2024 · 8 comments

charlesm93 (Contributor) commented Nov 27, 2024

Summary:

The section on HMMs (under Time-Series models: https://mc-stan.org/docs/stan-users-guide/time-series.html#hmms.section) in the Stan users guide currently doesn't use Stan's dedicated functions for HMMs (described here: https://mc-stan.org/docs/functions-reference/hidden_markov_models.html).

I'll update the section and point users towards the convenient functions we have.
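
For concreteness, here's a rough sketch of the kind of program the updated section could show (the emission model, priors, and variable names are illustrative placeholders, not the final example):

```stan
data {
  int<lower=1> N;                  // number of observations
  int<lower=1> K;                  // number of hidden states
  vector[N] y;                     // observed series
}
parameters {
  array[K] simplex[K] Gamma_rows;  // transition probabilities out of each state
  simplex[K] rho;                  // initial state distribution
  ordered[K] mu;                   // state-dependent means (ordered for identifiability)
  real<lower=0> sigma;             // shared emission scale
}
model {
  matrix[K, K] Gamma;
  matrix[K, N] log_omegas;         // log_omegas[k, n] = log p(y[n] | state k)
  for (k in 1:K) {
    Gamma[k] = Gamma_rows[k]';
    for (n in 1:N) {
      log_omegas[k, n] = normal_lpdf(y[n] | mu[k], sigma);
    }
  }
  mu ~ normal(0, 5);
  sigma ~ normal(0, 1);
  target += hmm_marginal(log_omegas, Gamma, rho);
}
```

The point for the chapter is that the single call to hmm_marginal() replaces the hand-coded forward algorithm.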

charlesm93 self-assigned this Nov 27, 2024

charlesm93 (Contributor, Author) commented:

I've taken a closer look at the user guide's section on HMMs, and I find that it treats a specific class of HMMs rather than providing general guidance.

First, the section focuses on the case where the likelihood (observational distribution) is categorical. Then, it considers:

  • the case where the hidden states are observed ("supervised parameter estimation"), including naive coding, coding with sufficient statistics, and coding the posterior analytically
  • the case where some of the hidden states are observed ("semisupervised estimation")

I'm not sure the supervised case is particularly interesting, since we can compute the posterior analytically; its main interest is in setting up the semi-supervised example.
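
For reference, the analytic result is just conjugate Dirichlet updating. With the hidden states $z$ fully observed and independent Dirichlet priors on the transition rows $\theta_k$ (notation here is illustrative),

$$
\theta_k \sim \textrm{Dirichlet}(\alpha),
\qquad
\theta_k \mid z \sim \textrm{Dirichlet}(\alpha + n_k),
$$

where $n_{k,j}$ counts the observed transitions from state $k$ to state $j$; the categorical emission parameters update the same way from emission counts.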

If I look at the HMM suite we developed, its use case seems a bit orthogonal, in that:

  • there are no restrictions on the observational model
  • none of the hidden states are observed ("unsupervised" case).

So I think it can make sense to keep the existing example and add an example that uses the HMM suite. I'll base it on the case study @bbbales2 put together after we released the functions (https://mc-stan.org/users/documentation/case-studies/hmm-example.html).
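
That case study also shows how to recover the hidden states after fitting, which the new example can pick up via the suite's companion functions. A sketch continuing the illustrative program above (in the real example, Gamma and log_omegas would be computed once in transformed parameters rather than rebuilt here):

```stan
generated quantities {
  matrix[K, N] hidden_probs;   // marginal posterior p(z[n] = k | y)
  array[N] int z_draw;         // one posterior draw of the state sequence
  {
    // Rebuild Gamma and log_omegas locally; alternatively, declare them
    // in transformed parameters so they are in scope here.
    matrix[K, K] Gamma;
    matrix[K, N] log_omegas;
    for (k in 1:K) {
      Gamma[k] = Gamma_rows[k]';
      for (n in 1:N) {
        log_omegas[k, n] = normal_lpdf(y[n] | mu[k], sigma);
      }
    }
    hidden_probs = hmm_hidden_state_prob(log_omegas, Gamma, rho);
    z_draw = hmm_latent_rng(log_omegas, Gamma, rho);
  }
}
```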

Two organizational questions:

  • does it make sense to have the general HMM example first and then the specific example with supervised and semi-supervised learning?
  • should we harmonize notation between the examples, and between the user and functions guide?

My answer to both questions is yes. But I'm leaving it up for discussion.

cc @bob-carpenter

bob-carpenter (Contributor) commented:

That all sounds good. But I think it'd also be OK to take out the direct implementation and instead just use the new HMM methods we provide. If you do leave the direct implementation in, I would start with the pre-baked method and only then go on to coding it by hand.

And by all means unify the notation if it's not too much of a bother.

I included semi-supervised because it's so common in natural language processing settings, where the hidden states might represent something like a fixed set of parts of speech and the supervision comes in the form of labeled training data. Still, I think that's an edge case. The bigger thing is when there are covariates informing the transition matrix, as sketched below.
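
(To sketch the kind of construction I mean, with illustrative names beta and x: the rows of a time-varying transition matrix can be built by multinomial logit. Note that hmm_marginal() takes a single fixed Gamma, so this case still needs a hand-coded forward pass.)

```stan
// Illustrative: covariate-dependent transitions via multinomial logit,
// assuming array[K] matrix[K, P] beta and array[N] vector[P] x are
// declared elsewhere.
array[N] matrix[K, K] Gamma_t;
for (n in 1:N) {
  for (k in 1:K) {
    Gamma_t[n, k] = softmax(beta[k] * x[n])';  // row k at time n is a simplex
  }
}
```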

Also, I think the next release will be the first with left and right stochastic matrices. So it'd make sense to use those as we don't talk about them elsewhere in the User's Guide.

Thanks for taking this on!

charlesm93 (Contributor, Author) commented:

> I think it'd also be OK to take out the direct implementation and instead just use the new HMM methods we provide.

OK, I wrote a first draft that only includes the pre-baked version. There's still time to add the direct implementation back if we decide to.

> Also, I think the next release will be the first with left and right stochastic matrices. So it'd make sense to use those as we don't talk about them elsewhere in the User's Guide.

I'm not familiar with those. Are there docs somewhere? I can then adapt the example code.

WardBrian (Member) commented:

@SteveBronder is still working on them here: #807

bob-carpenter (Contributor) commented:

Ah, so it won't be in the upcoming release. The transition matrix for an HMM is a stochastic matrix. We usually assume a right-stochastic matrix, whose rows are simplexes. Left-stochastic matrices have columns that are simplexes; for a left-stochastic matrix X and simplex Y, the product X * Y is again a simplex.
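
(A sketch of the declarations, assuming the type names from the stochastic-matrix work in #807; treat the names as provisional until those docs land:)

```stan
parameters {
  // Assumed type names from the in-progress docs (#807), not yet released:
  // rows are simplexes, matching the Gamma convention of hmm_marginal().
  row_stochastic_matrix[K, K] Gamma;
  // Columns are simplexes: X * y is again a simplex for any simplex y.
  column_stochastic_matrix[K, K] X;
}
```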

charlesm93 (Contributor, Author) commented:

In that case, I think the doc is ready to be PR-ed.

charlesm93 mentioned this issue Dec 6, 2024
WardBrian (Member) commented:

Oh, the feature is merged. The docs are what's still being worked on.

Sorry for the confusion!

bob-carpenter (Contributor) commented:

In which case, the example should just be a simple HMM that declares its transition matrix with the new stochastic matrix type. Then you can get fancy and add structural zeros in another section, along the lines of the sketch below.
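
(For instance, a sketch of structural zeros with an illustrative left-to-right topology; theta is a placeholder name:)

```stan
parameters {
  vector<lower=0, upper=1>[K - 1] theta;  // illustrative: stay probabilities
}
transformed parameters {
  // Structural zeros: a left-to-right HMM where state k either stays
  // put or advances to k + 1; the final state is absorbing.
  matrix[K, K] Gamma = rep_matrix(0, K, K);
  for (k in 1:(K - 1)) {
    Gamma[k, k] = theta[k];
    Gamma[k, k + 1] = 1 - theta[k];
  }
  Gamma[K, K] = 1;
}
```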
