Educational calculator teaching FSMs (explicit transition table tokenizer) and DAGs (recursive descent parser with AST evaluation). Includes CLI with REPL, graphviz visualization, and 61 tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
88 lines
2.9 KiB
Markdown
88 lines
2.9 KiB
Markdown
# Expression Evaluator -- DAGs & State Machines Tutorial
|
|
|
|
A calculator that teaches two fundamental CS patterns by building them from scratch:
|
|
|
|
1. **Finite State Machine** -- the tokenizer processes input character-by-character using an explicit transition table
|
|
2. **Directed Acyclic Graph (DAG)** -- the parser builds an expression tree, evaluated bottom-up in topological order
|
|
|
|
## What You'll Learn
|
|
|
|
| File | CS Concept | What it does |
|
|
|------|-----------|-------------|
|
|
| `tokenizer.py` | **State Machine** (Mealy machine) | Converts `"3 + 4 * 2"` into tokens using a transition table |
|
|
| `parser.py` | **DAG construction** | Builds an expression tree with operator precedence |
|
|
| `evaluator.py` | **Topological evaluation** | Walks the tree bottom-up (leaves before parents) |
|
|
| `visualize.py` | **Visualization** | Generates graphviz diagrams of both the FSM and AST |
|
|
|
|
## Quick Start
|
|
|
|
```bash
|
|
# Evaluate an expression
|
|
python main.py "3 + 4 * 2"
|
|
# => 11
|
|
|
|
# Interactive REPL
|
|
python main.py
|
|
|
|
# See how the state machine tokenizes
|
|
python main.py --show-tokens "(2 + 3) * -4"
|
|
|
|
# See the expression tree (DAG)
|
|
python main.py --show-ast "(2 + 3) * 4"
|
|
# *
|
|
# +-- +
|
|
# | +-- 2
|
|
# | `-- 3
|
|
# `-- 4
|
|
|
|
# Watch evaluation in topological order
|
|
python main.py --trace "(2 + 3) * 4"
|
|
# Step 1: 2 => 2
|
|
# Step 2: 3 => 3
|
|
# Step 3: 2 + 3 => 5
|
|
# Step 4: 4 => 4
|
|
# Step 5: 5 * 4 => 20
|
|
|
|
# Generate graphviz diagrams
|
|
python main.py --dot "(2 + 3) * 4" | dot -Tpng -o ast.png
|
|
python main.py --dot-fsm | dot -Tpng -o fsm.png
|
|
```
|
|
|
|
## Features
|
|
|
|
- Arithmetic: `+`, `-`, `*`, `/`, `^` (power)
|
|
- Parentheses: `(2 + 3) * 4`
|
|
- Unary minus: `-3`, `-(2 + 1)`, `2 * -3`
|
|
- Decimals: `3.14`, `.5`
|
|
- Standard precedence: parens > `^` > `*`/`/` > `+`/`-`
|
|
- Right-associative power: `2^3^4` = `2^(3^4)`
|
|
- Correct unary minus: `-3^2` = `-(3^2)` = `-9`
|
|
|
|
## Running Tests
|
|
|
|
```bash
|
|
python -m pytest tests/ -v
|
|
```
|
|
|
|
## How the State Machine Works
|
|
|
|
The tokenizer in `tokenizer.py` uses an **explicit transition table** -- a dictionary mapping `(current_state, character_class)` to `(next_state, action)`. This is the same pattern used in network protocol parsers, regex engines, and compiler lexers.
|
|
|
|
The three states are:
|
|
- `START` -- between tokens, dispatching based on the next character
|
|
- `INTEGER` -- accumulating digits (e.g., `"12"` so far)
|
|
- `DECIMAL` -- accumulating digits after a decimal point (e.g., `"12.3"`)
|
|
|
|
Use `--dot-fsm` to generate a visual diagram of the state machine.
|
|
|
|
## How the DAG Works
|
|
|
|
The parser in `parser.py` builds an **expression tree** (AST) where:
|
|
- **Leaf nodes** are numbers (no dependencies)
|
|
- **Interior nodes** are operators with edges to their operands
|
|
- **Edges** represent "depends on" relationships
|
|
|
|
Evaluation in `evaluator.py` walks this tree **bottom-up** -- children before parents. This is exactly a **topological sort** of the DAG: you can only compute a node after all its dependencies are resolved.
|
|
|
|
Use `--show-ast` to see the tree structure, or `--dot` to generate a graphviz diagram.
|