Tiger POMDP
The canonical POMDP — a hidden tiger, noisy growls, a finite-memory policy.
A textbook partially observable Markov decision process. The tiger sits behind the left or right door; its position is a hidden taxonomy value that the agent's policy never reads. The contestant listens, drawing a growl that points to the tiger's actual door ~85% of the time, and accumulates observations in a bounded two-slot history. It opens a door only after two consecutive agreeing growls; otherwise it listens again, paying a small cost each time.
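The listen-then-open loop described above can be sketched in a few lines of Python. This is an illustrative sketch, not the generator's actual code: the function names, the `max_steps` cap, and the choice to open the door opposite the two agreeing growls are all assumptions, and the 85% accuracy is taken as exact and symmetric.

```python
import random
from collections import deque

GROWL_ACCURACY = 0.85  # from the text: a growl points to the tiger's door ~85% of the time


def other(side):
    """Return the opposite door."""
    return "right" if side == "left" else "left"


def listen(tiger, rng):
    """Draw one noisy growl: the true side with probability GROWL_ACCURACY."""
    return tiger if rng.random() < GROWL_ACCURACY else other(tiger)


def run_episode(rng, max_steps=1000):
    """Play one episode; return (survived, number_of_listens).

    The policy only ever looks at `history`, never at `tiger`.
    """
    tiger = rng.choice(["left", "right"])  # hidden state
    history = deque(maxlen=2)              # bounded two-slot observation window
    for listens in range(1, max_steps + 1):
        history.append(listen(tiger, rng))
        if len(history) == 2 and history[0] == history[1]:
            # Assumption: open the door the growls did NOT point at.
            opened = other(history[0])
            return opened != tiger, listens
    return False, max_steps  # hypothetical cap: never reached a decision


if __name__ == "__main__":
    rng = random.Random(0)
    results = [run_episode(rng) for _ in range(10_000)]
    survival = sum(ok for ok, _ in results) / len(results)
    mean_listens = sum(n for _, n in results) / len(results)
    print(f"survival rate: {survival:.3f}, mean listens: {mean_listens:.2f}")
```

Under these assumptions the contestant always listens at least twice, and the first agreeing pair is correct in roughly 95% of episodes, which is the trade the policy makes between listening cost and the risk of opening the wrong door.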
Demonstrates principled handling of hidden state: the emission probabilities are encoded as condition-masked distributions, and the belief is approximated by a finite window of past observations rather than a real probability vector. A compact, reproducible benchmark for agents that must act under uncertainty and information-gathering cost.
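For comparison, here is what the exact belief the two-slot window stands in for would look like. The sketch below applies Bayes' rule to P(tiger = left) after each growl; the uniform 0.5 prior, the symmetric 85%/15% emission model, and the name `update_belief` are assumptions made for illustration, not part of the dataset.

```python
GROWL_ACCURACY = 0.85  # assumed exact and symmetric for both doors


def update_belief(p_left, growl):
    """Bayes update of P(tiger = left) after hearing one growl ('left' or 'right')."""
    # Likelihood of this growl under each hypothesis about the tiger's door.
    like_if_left = GROWL_ACCURACY if growl == "left" else 1 - GROWL_ACCURACY
    like_if_right = 1 - like_if_left
    numerator = like_if_left * p_left
    return numerator / (numerator + like_if_right * (1 - p_left))


if __name__ == "__main__":
    belief = 0.5  # assumed uniform prior
    for growl in ["left", "left"]:
        belief = update_belief(belief, growl)
    print(round(belief, 4))  # 0.9698
```

Two agreeing growls push the belief to 0.7225 / 0.745 ≈ 0.97, while two disagreeing growls cancel back to 0.5. That is why "open after two consecutive agreeing growls" is a reasonable finite-memory stand-in for thresholding a full belief vector.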
Linked tables with guaranteed referential integrity.
Generated REST endpoints. Also exposed as MCP tools.
OSI-compatible definition, emitted with the dataset.
# tiger-pomdp.osi.yaml - emitted automatically
semantic_model:
  name: "tiger-pomdp"
  source: "duckdb://tiger-pomdp.db"
  entities:
    - name: contestant
      primary_key: id
      dimensions:
        - name: state
          type: categorical
        - name: t
          type: time
      measures:
        - name: row_count
          agg: count
        - name: active
          agg: sum
          filter: "state = 'ACTIVE'"
More worlds.
Game of Life
Conway's automaton as a perfectly observable, deterministic grid world.
London Underground
A live tube graph — eleven lines, hundreds of trains, platforms held as a mutex.
Pac-Man
A self-playing arcade game — ghosts chase a flood-filled distance field.