Educational calculator teaching FSMs (explicit transition table tokenizer) and DAGs (recursive descent parser with AST evaluation). Includes CLI with REPL, graphviz visualization, and 61 tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2.9 KiB
2.9 KiB
Expression Evaluator -- DAGs & State Machines Tutorial
A calculator that teaches two fundamental CS patterns by building them from scratch:
- Finite State Machine -- the tokenizer processes input character-by-character using an explicit transition table
- Directed Acyclic Graph (DAG) -- the parser builds an expression tree, evaluated bottom-up in topological order
What You'll Learn
| File | CS Concept | What it does |
|---|---|---|
tokenizer.py |
State Machine (Mealy machine) | Converts "3 + 4 * 2" into tokens using a transition table |
parser.py |
DAG construction | Builds an expression tree with operator precedence |
evaluator.py |
Topological evaluation | Walks the tree bottom-up (leaves before parents) |
visualize.py |
Visualization | Generates graphviz diagrams of both the FSM and AST |
Quick Start
# Evaluate an expression
python main.py "3 + 4 * 2"
# => 11
# Interactive REPL
python main.py
# See how the state machine tokenizes
python main.py --show-tokens "(2 + 3) * -4"
# See the expression tree (DAG)
python main.py --show-ast "(2 + 3) * 4"
# *
# +-- +
# | +-- 2
# | `-- 3
# `-- 4
# Watch evaluation in topological order
python main.py --trace "(2 + 3) * 4"
# Step 1: 2 => 2
# Step 2: 3 => 3
# Step 3: 2 + 3 => 5
# Step 4: 4 => 4
# Step 5: 5 * 4 => 20
# Generate graphviz diagrams
python main.py --dot "(2 + 3) * 4" | dot -Tpng -o ast.png
python main.py --dot-fsm | dot -Tpng -o fsm.png
Features
- Arithmetic:
+,-,*,/,^(power) - Parentheses:
(2 + 3) * 4 - Unary minus:
-3,-(2 + 1),2 * -3 - Decimals:
3.14,.5 - Standard precedence: parens >
^>*//>+/- - Right-associative power:
2^3^4=2^(3^4) - Correct unary minus:
-3^2=-(3^2)=-9
Running Tests
python -m pytest tests/ -v
How the State Machine Works
The tokenizer in tokenizer.py uses an explicit transition table -- a dictionary mapping (current_state, character_class) to (next_state, action). This is the same pattern used in network protocol parsers, regex engines, and compiler lexers.
The three states are:
START-- between tokens, dispatching based on the next characterINTEGER-- accumulating digits (e.g.,"12"so far)DECIMAL-- accumulating digits after a decimal point (e.g.,"12.3")
Use --dot-fsm to generate a visual diagram of the state machine.
How the DAG Works
The parser in parser.py builds an expression tree (AST) where:
- Leaf nodes are numbers (no dependencies)
- Interior nodes are operators with edges to their operands
- Edges represent "depends on" relationships
Evaluation in evaluator.py walks this tree bottom-up -- children before parents. This is exactly a topological sort of the DAG: you can only compute a node after all its dependencies are resolved.
Use --show-ast to see the tree structure, or --dot to generate a graphviz diagram.