docs: map existing codebase

2026-03-27 00:21:00 +00:00
parent f50d2f8446
commit d963902a28
7 changed files with 1215 additions and 0 deletions
--- a/.planning/codebase/CONCERNS.md
+++ b/.planning/codebase/CONCERNS.md
@@ -0,0 +1,250 @@
+# Codebase Concerns
+
+**Analysis Date:** 2026-03-27
+
+## Silent Error Handling (Critical)
+
+**Widespread use of empty catch blocks:**
+- Files: `/.claude/hooks/gsd-prompt-guard.js`, `/.claude/hooks/gsd-context-monitor.js`, `/.claude/hooks/gsd-workflow-guard.js`, `/.claude/hooks/gsd-statusline.js`, `/.claude/hooks/gsd-check-update.js`
+- Issue: 14+ catch blocks across hooks swallow errors silently with `catch (e) {}` or `catch (e) { // comment }` without logging
+- Impact: Failures in hook execution go undetected, making debugging difficult. When files are corrupted, config fails to parse, or file system errors occur, no indication is provided to users or developers
+- Current mitigation: Comments explain intent (e.g., "Silent fail -- bridge is best-effort", "don't break statusline on parse errors"), but no logging mechanism exists
+- Recommendation: Implement optional debug logging to stderr (only when hooks are invoked in debug mode) without breaking normal operation. Use environment variable like `GSD_HOOK_DEBUG=1` to enable error logging
+
+---
+
+## Fragile Hook Timeout Architecture
+
+**Context-Monitor Hook (`.claude/hooks/gsd-context-monitor.js`)**:
+- Issue: 10-second stdin timeout (line 35) is tight for slow systems. On Windows/Git Bash, pipe delays can exceed this
+- Files: `/.claude/hooks/gsd-context-monitor.js` (line 35: `setTimeout(() => process.exit(0), 10000)`)
+- Current mitigation: Comments reference issues #775, #1162 indicating this was a known problem
+- Risk: On systems with slow I/O, the hook exits prematurely without reading input, potentially leaving metrics unprocessed
+- Safe modification: Consider detecting platform and adjusting timeout accordingly, or make timeout configurable via environment variable
+
+---
+
+## Hardcoded Buffer Percentage
+
+**Status Line Hook Context Normalization**:
+- Issue: 16.5% buffer hardcoded for Claude Code autocompact (line 29)
+- Files: `/.claude/hooks/gsd-statusline.js` (line 29: `const AUTO_COMPACT_BUFFER_PCT = 16.5`)
+- Impact: If Claude Code changes autocompact behavior or other models have different buffers, context calculations become inaccurate
+- Recommendation: Make this configurable via `.planning/config.json` with sensible defaults per model (Claude, Gemini, OpenCode)
+
+---
+
+## Stale Hook Version Tracking
+
+**Hook Version Header Validation**:
+- Issue: Version detection depends on comment headers in JavaScript files (line 78 in `gsd-check-update.js`)
+- Files: `/.claude/hooks/gsd-check-update.js` (lines 73-92)
+- Risk: If hook files are minified, reformatted, or copied incorrectly, version headers can be lost, making stale hook detection fail silently
+- Current state: Detects "unknown" version if header is missing, but this doesn't prevent the hook from running with wrong version
+- Recommendation: Add a checksum or content hash validation in addition to version headers to detect unintended hook modifications
+
+---
+
+## Process Detachment on Windows
+
+**Check-Update Hook Background Process**:
+- Issue: Uses `spawn()` with `detached: true` (line 111 in `gsd-check-update.js`)
+- Files: `/.claude/hooks/gsd-check-update.js` (lines 108-114)
+- Impact: Background npm process can hang on Windows if not properly detached. Blocks are prevented by `unref()` (line 114), but edge cases remain if process fails
+- Current state: `windowsHide: true` flag added, `stdio: 'ignore'` set
+- Recommendation: Add timeout to background spawn, return immediately after unref, and consider writing to log file instead of silently ignoring errors
+
+---
+
+## Config File Parsing Without Validation
+
+**Multiple Hooks Read `.planning/config.json`**:
+- Issue: Config files are parsed with `JSON.parse()` but no schema validation exists
+- Files: `/.claude/hooks/gsd-context-monitor.js` (line 53), `/.claude/hooks/gsd-workflow-guard.js` (line 65)
+- Risk: Malformed JSON or missing expected fields cause silent failures. If config structure changes, old configs won't error but silently ignore new fields
+- Recommendation: Implement a simple schema validation utility (even lightweight: check for required keys and types) and log warnings when config doesn't match expected shape
+
+---
+
+## Race Conditions in Metrics Bridge
+
+**Context Metrics Between Statusline and Monitor Hooks**:
+- Issue: Two separate hooks read/write to `/tmp/claude-ctx-{sessionId}.json` without file locking
+- Files: `/.claude/hooks/gsd-statusline.js` (line 40-47 writes), `/.claude/hooks/gsd-context-monitor.js` (line 63-70 reads)
+- Risk: On high-frequency tool use, write and read can race, causing monitor hook to read stale metrics or corrupted JSON
+- Current mitigation: Both have try-catch around JSON.parse(), stale metrics are detected by timestamp (line 74)
+- Recommendation: Use atomic writes (write-to-temp-then-rename pattern) for metrics file, or switch to a simpler shared state mechanism
+
+---
+
+## Missing Logging Infrastructure
+
+**No Structured Logging**:
+- Issue: Hooks only log to stdout via process.stdout.write() for hook outputs. No persistent logs for troubleshooting
+- Files: All hook files (`/.claude/hooks/*.js`)
+- Impact: When hooks fail silently, no audit trail exists to debug issues. Users cannot see what hooks detected or why they exited
+- Recommendation: Create optional debug logs in `.planning/hooks-debug.log` (only when env var `GSD_DEBUG=1`), append structured JSON records with timestamp, hook name, and outcome
+
+---
+
+## Unbounded Memory in Warning State File
+
+**Context Monitor Warning Debounce**:
+- Issue: Warning state tracked in `/tmp/claude-ctx-{sessionId}-warned.json` accumulates indefinitely
+- Files: `/.claude/hooks/gsd-context-monitor.js` (lines 87-117)
+- Risk: Temporary directory might not clean this up, causing stale state from previous sessions to affect new sessions
+- Current mitigation: State is per-session (uses sessionId), so scope is limited
+- Recommendation: Add timestamp to warning state and treat state older than 1 hour as stale (auto-reset)
+
+---
+
+## Regex Pattern Complexity in Prompt Guard
+
+**Prompt Injection Detection Patterns**:
+- Issue: 12 regex patterns for injection detection (lines 18-32 in `gsd-prompt-guard.js`) are not anchored and may be fragile
+- Files: `/.claude/hooks/gsd-prompt-guard.js` (lines 18-32)
+- Risk: Some patterns (e.g., `/you\s+are\s+now\s+/i`) could match benign documentation text about "You are now ready to..." and trigger false warnings
+- Impact: While warnings are advisory-only, frequent false positives degrade user experience
+- Recommendation: Review patterns for false positive risk, consider context (e.g., only in code blocks, not in comments), or add allowlist for common false positives
+
+---
+
+## Inconsistent Hook Version Across Environments
+
+**Multi-Environment Hook Duplication**:
+- Issue: Identical hooks duplicated across `.claude/`, `.agent/`, `.gemini/`, `.opencode/` directories
+- Files: 4x each of `gsd-prompt-guard.js`, `gsd-context-monitor.js`, `gsd-workflow-guard.js`, `gsd-statusline.js`, `gsd-check-update.js`
+- Risk: If one environment's hooks are updated, others can drift out of sync. Version check in `gsd-check-update.js` detects this (line 81-82) but doesn't auto-fix
+- Impact: Users with multiple agent environments may see inconsistent behavior
+- Recommendation: Consolidate hooks to a shared location or implement auto-sync mechanism in installer
+
+---
+
+## Resource Leaks in Timeouts
+
+**Stdin Timeout Cleanup**:
+- Issue: `setTimeout()` created for stdin timeout in all hooks, relies on `clearTimeout()` to clean up
+- Files: `/.claude/hooks/gsd-prompt-guard.js`, `/.claude/hooks/gsd-context-monitor.js`, `/.claude/hooks/gsd-workflow-guard.js`, `/.claude/hooks/gsd-statusline.js`
+- Risk: If process exits before reaching `process.stdin.on('end')`, timeout continues running until expiry (3-10 seconds)
+- Current state: Process exits on timeout with code 0, but spawned processes inherit this timeout
+- Recommendation: Add `unref()` call on timeout timers so they don't keep process alive
+
+---
+
+## Platform-Specific Path Separator
+
+**Windows Path Handling in Guards**:
+- Issue: Guards check for both `/` and `\\` in file paths (e.g., line 52 in `gsd-prompt-guard.js`)
+- Files: `/.claude/hooks/gsd-prompt-guard.js` (line 52), `/.claude/hooks/gsd-workflow-guard.js` (line 43)
+- Risk: Mixing separators when checking `.planning/` paths may not catch all variations on Windows (e.g., mixed slashes)
+- Recommendation: Normalize paths using `path.normalize()` before comparison, or use `path.sep` for platform detection
+
+---
+
+## Missing Config Directory Override Documentation
+
+**CLAUDE_CONFIG_DIR Environment Variable**:
+- Issue: Support for `CLAUDE_CONFIG_DIR` environment variable is implemented (line 18 in `gsd-check-update.js`, line 73 in `gsd-statusline.js`) but not documented
+- Files: `/.claude/hooks/gsd-check-update.js` (line 18), `/.claude/hooks/gsd-statusline.js` (line 73)
+- Impact: Users with custom config directories won't know this variable exists, potentially breaking hook functionality
+- Recommendation: Document in copilot-instructions.md or add reference in hook comments
+
+---
+
+## Temporary File Cleanup Strategy
+
+**Metric Files Never Cleaned**:
+- Issue: `/tmp/claude-ctx-{sessionId}.json` and `/tmp/claude-ctx-{sessionId}-warned.json` are never deleted
+- Files: `/.claude/hooks/gsd-statusline.js` (writes), `/.claude/hooks/gsd-context-monitor.js` (reads)
+- Risk: On long-running systems, /tmp fills up with stale session files
+- Recommendation: Implement cleanup logic to delete temp files older than 7 days, or hook into session cleanup to delete files on logout
+
+---
+
+## Package.json Missing Dependencies
+
+**Minimal Configuration**:
+- Issue: `.claude/package.json` only contains `{"type":"commonjs"}`, no dependencies listed
+- Files: `/.claude/package.json`
+- Risk: Hooks use built-in Node modules (fs, path, os, child_process), so this is correct, but appears incomplete
+- Current state: This is intentional (all hooks use only Node builtins), but unclear to maintainers
+- Recommendation: Add comment in package.json explaining that hooks intentionally use no external dependencies for portability
+
+---
+
+## Potential Memory Bloat in Large Outputs
+
+**No Stream Processing in Hooks**:
+- Issue: All hooks read entire stdin into memory before processing (`let input = ''; process.stdin.on('data', chunk => input += chunk)`)
+- Files: All hook files
+- Risk: If a tool call writes gigabytes of content (e.g., massive file read), the hook process could run out of memory
+- Current mitigation: Only tool writes/edits are inspected, and these are typically small
+- Recommendation: For hooks processing large content (especially prompt injection guard), implement streaming regex matching or line-by-line processing
+
+---
+
+## Test Coverage Gaps
+
+**No Unit Tests for Hooks**:
+- What's not tested: None of the 5 hook files have automated tests
+- Files: `/.claude/hooks/*.js` (all of them)
+- Risk: Changes to hook logic have no regression protection. Timeout tweaks, path logic, or regex patterns could break without detection
+- Impact: Hook failures are silent (by design), so broken hooks go unnoticed until users report issues
+- Priority: High — hooks are critical path for agent context management
+
+---
+
+## Missing Error Recovery in Executor
+
+**Deviation Rule Priority Conflicts**:
+- Issue: Executor auto-applies Rules 1-3 without clear priority when multiple rules could apply
+- Files: `/.github/agents/gsd-executor.agent.md` (line 150 mentions "RULE PRIORITY:")
+- Risk: When a task has multiple issues (e.g., bug AND missing critical functionality), executor might fix bug first, missing that missing-function blocking task completion
+- Recommendation: Clarify priority order: blocking issues (Rule 3) > missing critical functionality (Rule 2) > bugs (Rule 1)
+
+---
+
+## Subprocess Error Propagation in Background Checks
+
+**npm view Command Failure Silent**:
+- Issue: `execSync('npm view get-shit-done-cc version')` wrapped in try-catch, timeout 10s (line 96 in `gsd-check-update.js`)
+- Files: `/.claude/hooks/gsd-check-update.js` (lines 95-97)
+- Risk: If npm is not installed, offline, or slow, fails silently. Users won't know update check failed
+- Impact: Latest version is set to "unknown", update check becomes unreliable
+- Recommendation: Log to file when update check fails (optional debug log), or retry with exponential backoff
+
+---
+
+## Missing Instrumentation for Decision Points
+
+**No Audit Trail for Hook Decisions**:
+- Issue: When context monitor injects critical warning or workflow guard injects advisory, no record of decision
+- Files: `/.claude/hooks/gsd-context-monitor.js` (lines 125-142), `/.claude/hooks/gsd-workflow-guard.js` (lines 78-87)
+- Risk: If user behavior doesn't match warning, no way to audit what warning was shown or when
+- Recommendation: Write decision logs to `.planning/hooks-audit.log` with timestamp, hook name, decision, and reason (only in GSD active mode)
+
+---
+
+## Scalability of Hook Frequency
+
+**Hooks Run on Every Tool Use**:
+- Issue: Context monitor (PostToolUse) runs after EVERY tool call, reads file, parses JSON, checks thresholds
+- Files: `/.claude/hooks/gsd-context-monitor.js`
+- Risk: On high-frequency tool sessions (100+ tools), accumulated overhead could impact performance
+- Current mitigation: Debounce warnings (line 108), stale metric detection (line 74)
+- Recommendation: Profile hook execution time under load, consider batching metrics or reducing check frequency
+
+---
+
+## Unclear Behavior on Missing Files
+
+**Graceful Degradation Assumptions**:
+- Issue: When `.planning/STATE.md`, `.planning/config.json`, or todo files don't exist, hooks silently skip functionality
+- Files: All hooks that check for optional files
+- Risk: Unclear to users whether missing files are "normal" or indicate misconfiguration
+- Example: Statusline shows "no task" if todos not found, but user might think todos are lost
+- Recommendation: Distinguish between "feature not enabled" (e.g., no .planning/ = not a GSD project) vs. "feature broken" (e.g., config.json exists but unparseable)
+
+---
+
+*Concerns audit: 2026-03-27*