Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Terminology should use the language of Deterministic Finite Automata and Semiautomata #75

Open
ariasanovsky opened this issue Jan 23, 2024 · 0 comments
Assignees

Comments

@ariasanovsky
Copy link
Owner

ariasanovsky commented Jan 23, 2024

Markov Decision Process

A Markov Decision Process is a $(\mathcal{S}, \mathcal{A}, \mathcal{P}, \mathcal{R})$ where

  • $\mathcal{S}$ consists of states,
  • $\mathcal{A}$ consists of actions,
    • more generally, $\mathcal{A}_s$ is a set of actions permitted from $s\in\mathcal{S}$
  • $\mathcal{P}_a(s, s')$ is the probability, when permitted, that visiting $(s, a)$ leads to $s'$
  • $R_a(s, s')$ is the (expected) immediate reward if visiting $(s, a)$ leads to $s'$

We don't quite need the generality of a MDP because our transition probabilities are determinstic.
I.e., $\mathcal{P}a(s, \cdot) = \delta{s'}$ for some $s'\in\mathcal{S}$.

Deterministic Finite Automaton

A Deterministic Finite Automaton consists of

  • $\mathcal{Q}$ (states)
  • $\Sigma$ (alphabet)
  • $\delta: \mathcal{Q}\times\Sigma\to\mathcal{Q}$ (transition function),
  • $q_0\in\mathcal{Q}$ (initial state)
  • $\mathcal{F}\subseteq\mathcal{Q}$ (accept (final) states)

This is closer to our formulation. Note that:

  • a DFA has no cost/reward function
    • instead of including a reward function, we are optimizing an objective over a DFA
  • a DFA has a specific initial state
    • we may consider a DFA for each MCTS
  • this formally does explicitly leave room for actions sets to be restricted in any way
    • but this can be patched by replacing a $(\mathcal{Q}, \Sigma, \delta: \mathcal{Q}\times\Sigma\to\mathcal{Q}^\ast)$
      • (here, $\mathcal{Q}^\ast$ is $\mathcal{Q}$ with dud state added)
    • with $(\mathcal{Q}^\ast, \Sigma, \delta^\ast: \mathcal{Q}^\ast\times\Sigma\to\mathcal{Q}^\ast)$ with the dud state serving as a final object in the corresponding monoidal category

Across each MCTS, the transition dynamics are shared by a common semiautomaton.

Semiautomaton

A Semiautomaton has

  • $\mathcal{Q}$ (states),
  • $T: \mathcal{Q}\times\Sigma\to\mathcal{Q}$

As before, we may accommodate partial transition functions $T:\mathcal{Q}\times \Sigma\to\mathcal{Q}$ with the canonical monadic extension.

@ariasanovsky ariasanovsky self-assigned this Jan 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant