Files
dl92 01d5532823 Add expression-evaluator: DAGs & state machines tutorial project
Educational calculator teaching FSMs (explicit transition table tokenizer)
and DAGs (recursive descent parser with AST evaluation). Includes CLI with
REPL, graphviz visualization, and 61 tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-08 18:09:42 +00:00
..

Expression Evaluator -- DAGs & State Machines Tutorial

A calculator that teaches two fundamental CS patterns by building them from scratch:

  1. Finite State Machine -- the tokenizer processes input character-by-character using an explicit transition table
  2. Directed Acyclic Graph (DAG) -- the parser builds an expression tree, evaluated bottom-up in topological order

What You'll Learn

File CS Concept What it does
tokenizer.py State Machine (Mealy machine) Converts "3 + 4 * 2" into tokens using a transition table
parser.py DAG construction Builds an expression tree with operator precedence
evaluator.py Topological evaluation Walks the tree bottom-up (leaves before parents)
visualize.py Visualization Generates graphviz diagrams of both the FSM and AST

Quick Start

# Evaluate an expression
python main.py "3 + 4 * 2"
# => 11

# Interactive REPL
python main.py

# See how the state machine tokenizes
python main.py --show-tokens "(2 + 3) * -4"

# See the expression tree (DAG)
python main.py --show-ast "(2 + 3) * 4"
# *
# +-- +
# |   +-- 2
# |   `-- 3
# `-- 4

# Watch evaluation in topological order
python main.py --trace "(2 + 3) * 4"
#     Step 1: 2 => 2
#     Step 2: 3 => 3
#   Step 3: 2 + 3 => 5
#   Step 4: 4 => 4
# Step 5: 5 * 4 => 20

# Generate graphviz diagrams
python main.py --dot "(2 + 3) * 4" | dot -Tpng -o ast.png
python main.py --dot-fsm | dot -Tpng -o fsm.png

Features

  • Arithmetic: +, -, *, /, ^ (power)
  • Parentheses: (2 + 3) * 4
  • Unary minus: -3, -(2 + 1), 2 * -3
  • Decimals: 3.14, .5
  • Standard precedence: parens > ^ > *// > +/-
  • Right-associative power: 2^3^4 = 2^(3^4)
  • Correct unary minus: -3^2 = -(3^2) = -9

Running Tests

python -m pytest tests/ -v

How the State Machine Works

The tokenizer in tokenizer.py uses an explicit transition table -- a dictionary mapping (current_state, character_class) to (next_state, action). This is the same pattern used in network protocol parsers, regex engines, and compiler lexers.

The three states are:

  • START -- between tokens, dispatching based on the next character
  • INTEGER -- accumulating digits (e.g., "12" so far)
  • DECIMAL -- accumulating digits after a decimal point (e.g., "12.3")

Use --dot-fsm to generate a visual diagram of the state machine.

How the DAG Works

The parser in parser.py builds an expression tree (AST) where:

  • Leaf nodes are numbers (no dependencies)
  • Interior nodes are operators with edges to their operands
  • Edges represent "depends on" relationships

Evaluation in evaluator.py walks this tree bottom-up -- children before parents. This is exactly a topological sort of the DAG: you can only compute a node after all its dependencies are resolved.

Use --show-ast to see the tree structure, or --dot to generate a graphviz diagram.