SC-014 Reinforcement Learning

Cliff Walking

Sutton & Barto's cliff-walking gridworld — three policies, one dangerous edge.

Schematic (SC-014): the grid with cells marked S, G and CLIFF.
Clock: 120 Hz tick
Update: Asynchronous
Grid: 4 × 12
Policies: Safe / Cautious / Risky
Observability: Per-step
OVERVIEW

A textbook reinforcement-learning gridworld. The board is a 4×12 grid of cells wired together by north/south/east/west references, with a cliff along the bottom edge and a goal in the corner. Walkers move under a chosen policy (Safe, Cautious or Risky), each encoded as a set of per-policy direction probabilities. Stepping off the cliff incurs a large negative reward and resets the walker; every ordinary step costs one.
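
The reference-based wiring described above can be sketched in Python. This is a minimal illustration, not the scenario's actual implementation: the cliff penalty of -100, the goal reward of -1, the bottom-left start cell, the bottom-right goal, and the names `GridCell`, `build_grid` and `step` are all assumptions (the cliff value and start position follow the Sutton & Barto textbook layout). The field names mirror the gridcell columns listed under SCHEMA.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

ROWS, COLS = 4, 12      # grid size from the scenario summary
STEP_REWARD = -1        # "every ordinary step costs one"
CLIFF_REWARD = -100     # ASSUMED: the source only says "large negative reward"


@dataclass
class GridCell:
    id: int
    kind: str                       # "normal", "cliff" or "goal"
    reward: int
    north_id: Optional[int] = None  # None marks a board edge (assumption)
    south_id: Optional[int] = None
    east_id: Optional[int] = None
    west_id: Optional[int] = None


def build_grid() -> Dict[int, GridCell]:
    """Create the cells, then wire each one to its neighbours by id."""
    cells: Dict[int, GridCell] = {}
    for r in range(ROWS):
        for c in range(COLS):
            cid = r * COLS + c
            bottom = r == ROWS - 1
            if bottom and 0 < c < COLS - 1:
                cells[cid] = GridCell(cid, "cliff", CLIFF_REWARD)
            elif bottom and c == COLS - 1:
                # ASSUMED: entering the goal costs an ordinary step
                cells[cid] = GridCell(cid, "goal", STEP_REWARD)
            else:
                cells[cid] = GridCell(cid, "normal", STEP_REWARD)
    for r in range(ROWS):
        for c in range(COLS):
            cell = cells[r * COLS + c]
            if r > 0:
                cell.north_id = (r - 1) * COLS + c
            if r < ROWS - 1:
                cell.south_id = (r + 1) * COLS + c
            if c < COLS - 1:
                cell.east_id = r * COLS + c + 1
            if c > 0:
                cell.west_id = r * COLS + c - 1
    return cells


START_ID = (ROWS - 1) * COLS  # ASSUMED start: bottom-left corner


def step(cells: Dict[int, GridCell], cell_id: int, direction: str) -> Tuple[int, int]:
    """Follow one neighbour reference; return (new_cell_id, reward)."""
    target = getattr(cells[cell_id], f"{direction}_id")
    if target is None:
        # ASSUMED: walking into an edge leaves the walker in place
        return cell_id, STEP_REWARD
    dest = cells[target]
    if dest.kind == "cliff":
        return START_ID, dest.reward  # the cliff resets the walker
    return target, dest.reward
```

Under these assumed rewards, walking north once, east eleven times and south once from `START_ID` reaches the goal cell with a cumulative reward of -13, while a single step east from the start falls off the cliff and returns the walker to the start.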

A compact benchmark for policy comparison: the same world run under three risk appetites produces three distinct reward profiles you can read straight from the exported walker rows. Demonstrates indirect, reference-based grid navigation and reproducible per-step reward accounting.
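
To make the policy comparison concrete, here is a hedged Python sketch of reproducible per-step reward accounting. The direction probabilities in `POLICIES` are invented for illustration (the source does not publish them), as are the -100 cliff penalty, the start and goal positions, and the function name `run_walker`. For brevity it uses row/column coordinates rather than neighbour references. The returned keys `policy`, `steps` and `total_reward` mirror the walker columns.

```python
import random

ROWS, COLS = 4, 12
START, GOAL = (3, 0), (3, 11)   # ASSUMED corners, textbook layout
MOVES = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}

# HYPOTHETICAL per-policy direction probabilities; each row sums to 1.
POLICIES = {
    "Safe":     {"north": 0.40, "south": 0.10, "east": 0.45, "west": 0.05},
    "Cautious": {"north": 0.25, "south": 0.20, "east": 0.50, "west": 0.05},
    "Risky":    {"north": 0.10, "south": 0.30, "east": 0.55, "west": 0.05},
}


def run_walker(policy: str, n_steps: int, seed: int) -> dict:
    """Run one walker for up to n_steps and record every step's reward."""
    rng = random.Random(seed)            # a private, seeded generator
    dirs = list(POLICIES[policy])
    weights = [POLICIES[policy][d] for d in dirs]
    pos, rewards = START, []
    for _ in range(n_steps):
        dr, dc = MOVES[rng.choices(dirs, weights=weights)[0]]
        # Clamp to the board so edge moves leave the walker in place.
        r = min(max(pos[0] + dr, 0), ROWS - 1)
        c = min(max(pos[1] + dc, 0), COLS - 1)
        if r == ROWS - 1 and 0 < c < COLS - 1:
            rewards.append(-100)         # ASSUMED cliff penalty; reset
            pos = START
        else:
            rewards.append(-1)           # every ordinary step costs one
            pos = (r, c)
        if pos == GOAL:
            break
    return {
        "policy": policy,
        "steps": len(rewards),
        "total_reward": sum(rewards),
        "rewards": rewards,
    }
```

Calling `run_walker` twice with the same policy, step budget and seed returns identical rows, which is the property that makes the per-policy reward profiles comparable across runs.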

TRAITS
Async: Independent, event-driven timelines
Tick: Discrete fixed-step time
Spatial: Entities have position & topology
Observations: Emits structured agent observations
FSM: Entities are finite-state machines
SCHEMA

Linked tables with guaranteed referential integrity.

TABLE | COLUMNS | DESCRIPTION
walker | ID, policy, steps, total_reward, current_cell_id, current_state | One row per walker: its policy, steps taken, cumulative reward and current cell.
gridcell | ID, kind, reward, north_id, south_id, east_id, west_id, current_state | One row per cell: kind (normal/cliff/goal), step reward and the four neighbour references.
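
As an illustration of what referential integrity means for these two tables, the following Python sketch checks exported rows for dangling references. It assumes rows arrive as dictionaries keyed by the column names above (with the key column spelled `id`, as in the semantic layer) and that edge cells store a null neighbour reference; the function name `find_dangling_refs` is hypothetical.

```python
from typing import Dict, List

NEIGHBOUR_COLUMNS = ("north_id", "south_id", "east_id", "west_id")


def find_dangling_refs(walkers: List[Dict], gridcells: List[Dict]) -> List[str]:
    """Return a description of every reference that points at no gridcell."""
    known = {cell["id"] for cell in gridcells}
    problems: List[str] = []
    for cell in gridcells:
        for col in NEIGHBOUR_COLUMNS:
            ref = cell.get(col)
            # ASSUMED: None (null) marks a board edge and is allowed
            if ref is not None and ref not in known:
                problems.append(f"gridcell {cell['id']}.{col} -> {ref}")
    for walker in walkers:
        ref = walker["current_cell_id"]
        if ref not in known:
            problems.append(f"walker {walker['id']}.current_cell_id -> {ref}")
    return problems
```

An empty result means every neighbour reference and every walker position resolves to an existing gridcell row.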
LIVE API

Generated REST endpoints. Also exposed as MCP tools.

POST | /scenarios/cliff-walking/experiments | Seed a new gridworld
POST | /scenarios/cliff-walking/experiments/{eid}/run | Advance N turns, or request a move
GET | /scenarios/cliff-walking/experiments/{eid}/entities/walker | Read walker positions and rewards
GET | /scenarios/cliff-walking/experiments/{eid}/events | Append-only event log
GET | /scenarios/cliff-walking/experiments/{eid}/dataset | Download the exported dataset
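
The endpoint paths above can be assembled for a given experiment with a small helper. This Python sketch only builds method and URL pairs; the base URL is a placeholder, the helper names `endpoint` and `experiment_urls` are hypothetical, and request or response bodies are deliberately left out because the source does not document them.

```python
from typing import Dict, Tuple
from urllib.parse import quote

SCENARIO = "cliff-walking"


def endpoint(base_url: str, *parts: str) -> str:
    """Join path segments under /scenarios/cliff-walking, URL-escaping each."""
    segments = ("scenarios", SCENARIO, *parts)
    path = "/".join(quote(str(p), safe="") for p in segments)
    return base_url.rstrip("/") + "/" + path


def experiment_urls(base_url: str, eid: str) -> Dict[str, Tuple[str, str]]:
    """Map a short label to the (HTTP method, URL) pair for one experiment."""
    return {
        "create": ("POST", endpoint(base_url, "experiments")),
        "run": ("POST", endpoint(base_url, "experiments", eid, "run")),
        "walkers": ("GET", endpoint(base_url, "experiments", eid, "entities", "walker")),
        "events": ("GET", endpoint(base_url, "experiments", eid, "events")),
        "dataset": ("GET", endpoint(base_url, "experiments", eid, "dataset")),
    }
```

A typical session would presumably use `create` first to obtain an experiment id, then `run`, then read `walkers`, `events` or `dataset`; that ordering is inferred from the endpoint descriptions rather than stated by the source.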
SEMANTIC LAYER

OSI-compatible definition, emitted with the dataset.

# cliff-walking.osi.yaml — emitted automatically
semantic_model:
  name: "cliff-walking"
  source: "duckdb://cliff-walking.db"
  entities:
    - name: walker
      primary_key: id
  dimensions:
    - name: state
      type: categorical
    - name: t
      type: time
  measures:
    - name: row_count
      agg: count
    - name: active
      agg: sum
      filter: "state = 'ACTIVE'"
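
One plausible reading of the two measures above, sketched in Python: `row_count` counts walker rows and `active` counts the rows whose state is 'ACTIVE'. This is an interpretation, not the emitted model's documented semantics. The sketch uses the standard-library `sqlite3` module as a stand-in for DuckDB, and the function name `compute_measures`, the table layout and the sample state 'DONE' are assumptions.

```python
import sqlite3
from typing import Iterable, Tuple


def compute_measures(rows: Iterable[Tuple[int, str, int]]) -> Tuple[int, int]:
    """Return (row_count, active) for rows of (id, state, t)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE walker (id INTEGER PRIMARY KEY, state TEXT, t INTEGER)"
        )
        conn.executemany("INSERT INTO walker VALUES (?, ?, ?)", list(rows))
        # ASSUMED: the filtered sum counts one per matching row
        row_count, active = conn.execute(
            "SELECT COUNT(*), "
            "COALESCE(SUM(CASE WHEN state = 'ACTIVE' THEN 1 ELSE 0 END), 0) "
            "FROM walker"
        ).fetchone()
        return row_count, active
    finally:
        conn.close()
```

For three sample rows with states 'ACTIVE', 'ACTIVE' and 'DONE', the function returns a row count of 3 and an active count of 2.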