feat: v1.1 — URL ingestion, vision, delta tracking, 3 new skills, auto-commit

Skills (new):
- skills/obsidian-markdown/ — full Obsidian Flavored Markdown syntax reference
  (wikilinks, embeds, callouts, properties, math, Mermaid)
- skills/obsidian-bases/ — Obsidian Bases (.base files) with correct filters/views/
  formulas syntax (sourced from kepano/obsidian-skills authoritative spec)
- skills/defuddle/ — web page cleaner; strips ads/nav before URL ingestion,
  saves 40-60% tokens on web articles

wiki-ingest upgrades:
- URL ingestion: pass https:// directly, auto-fetches + runs defuddle if available
- Image/vision ingestion: .png/.jpg/.gif etc → Claude reads → description saved
  to .raw/ → standard ingest pipeline
- Delta tracking: .raw/.manifest.json tracks hash per source, skips unchanged files

wiki-query upgrades:
- Quick mode (query quick:) — hot.md + index only, ~1500 tokens
- Standard mode — existing behaviour, 3-5 pages
- Deep mode (query deep:) — full wiki + optional web search supplement

hooks:
- PostToolUse auto-commit: every Write/Edit to wiki/ or .raw/ triggers
  git add + commit automatically, vault always versioned

fixes:
- Removed invalid allowed-tools field from all 10 SKILL.md files
  (not a valid skill frontmatter attribute per spec; was silently ignored)
- Canvas SKILL.md now references json-canvas open standard and kepano/obsidian-skills

wiki research:
- Ecosystem research: 16+ Claude+Obsidian projects mapped and filed
- New pages: comparisons/claude-obsidian-ecosystem, concepts/cherry-picks,
  entities/ (6 new), sources/claude-obsidian-ecosystem-research
- Cherry-picks roadmap filed at wiki/concepts/cherry-picks.md

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Daniel
2026-04-08 12:25:00 +03:00
parent cd1935cf63
commit 3ad61b984d
26 changed files with 1803 additions and 42 deletions

@@ -1,17 +1,97 @@
---
name: wiki-ingest
description: >
Ingest sources into the Obsidian wiki vault. Reads a source, extracts entities
and concepts, creates or updates wiki pages, cross-references, and logs the operation.
Supports single sources and batch mode. Triggers on: "ingest", "process this source",
"add this to the wiki", "read and file this", "batch ingest", "ingest all of these".
allowed-tools: ["Read", "Write", "Edit", "Bash", "Glob", "Grep"]
description: "Ingest sources into the Obsidian wiki vault. Reads a source, extracts entities and concepts, creates or updates wiki pages, cross-references, and logs the operation. Supports files, URLs, and batch mode. Triggers on: ingest, process this source, add this to the wiki, read and file this, batch ingest, ingest all of these, ingest this url."
---
# wiki-ingest — Source Ingestion
Read the source. Write the wiki. Cross-reference everything. A single source typically touches 8-15 wiki pages.
**Syntax standard**: Write all Obsidian Markdown using proper Obsidian Flavored Markdown — wikilinks as `[[Note Name]]`, callouts as `> [!type] Title`, embeds as `![[file]]`, properties as YAML frontmatter. If kepano/obsidian-skills is installed, its `obsidian-markdown` skill is the authoritative syntax reference.
---
## Delta Tracking
Before ingesting any file, check `.raw/.manifest.json` to avoid re-processing unchanged sources.
```bash
# Check if manifest exists
[ -f .raw/.manifest.json ] && echo "exists" || echo "no manifest yet"
```
**Manifest format** (create if missing):
```json
{
"sources": {
".raw/articles/article-slug-2026-04-08.md": {
"hash": "abc123",
"ingested_at": "2026-04-08",
"pages_created": ["wiki/sources/article-slug.md", "wiki/entities/Person.md"],
"pages_updated": ["wiki/index.md"]
}
}
}
```
**Before ingesting a file:**
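1. Compute a hash: `md5sum [file] | cut -d' ' -f1` (or `sha256sum`; on macOS, where `md5sum` may be absent, use `md5 -q [file]` or `shasum -a 256 [file]`). Use the same algorithm every time so stored and fresh hashes stay comparable.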
2. Check if the path exists in `.manifest.json` with the same hash.
3. If hash matches — skip. Report: "Already ingested (unchanged). Use `force` to re-ingest."
4. If missing or hash differs — proceed with ingest.
**After ingesting a file:**
1. Record `{hash, ingested_at, pages_created, pages_updated}` in `.manifest.json`.
2. Write the updated manifest back.
Skip delta checking if the user says "force ingest" or "re-ingest".
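
The check-then-record flow above can be sketched in Python. This is a minimal illustration, not part of the skill itself: the helper names (`file_hash`, `load_manifest`, `needs_ingest`, `record_ingest`) are hypothetical, and MD5 is assumed so the digest matches what `md5sum` prints.

```python
import hashlib
import json
from datetime import date
from pathlib import Path

MANIFEST = Path(".raw/.manifest.json")


def file_hash(path):
    """Return the MD5 hex digest of a file (the value md5sum prints)."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def load_manifest(manifest_path=MANIFEST):
    """Load the manifest, or start an empty one if it does not exist yet."""
    manifest_path = Path(manifest_path)
    if manifest_path.exists():
        return json.loads(manifest_path.read_text(encoding="utf-8"))
    return {"sources": {}}


def needs_ingest(manifest, source, force=False):
    """True when the source is new, changed, or a forced re-ingest."""
    if force:
        return True
    entry = manifest["sources"].get(str(source))
    return entry is None or entry.get("hash") != file_hash(source)


def record_ingest(manifest, source, pages_created, pages_updated,
                  manifest_path=MANIFEST):
    """Store the post-ingest entry and write the manifest back to disk."""
    manifest["sources"][str(source)] = {
        "hash": file_hash(source),
        "ingested_at": date.today().isoformat(),
        "pages_created": list(pages_created),
        "pages_updated": list(pages_updated),
    }
    manifest_path = Path(manifest_path)
    manifest_path.parent.mkdir(parents=True, exist_ok=True)
    manifest_path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
```

A caller would run `needs_ingest` before the pipeline and `record_ingest` after it; passing `force=True` corresponds to the "force ingest" and "re-ingest" phrases.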
---
## URL Ingestion
Trigger: user passes a URL starting with `https://`.
Steps:
1. **Fetch** the page using WebFetch.
2. **Clean** (optional): if `defuddle` is available (`which defuddle 2>/dev/null`), run `defuddle parse [url]` (add `--markdown` for Markdown output) to strip ads, nav, and clutter; this typically saves 40-60% of tokens. Fall back to the raw WebFetch output if it is not installed.
3. **Derive slug** from the URL path (last segment, lowercased, spaces→hyphens, strip query strings).
4. **Save** to `.raw/articles/[slug]-[YYYY-MM-DD].md` with a frontmatter header:
```markdown
---
source_url: [url]
fetched: [YYYY-MM-DD]
---
```
5. Proceed with **Single Source Ingest** starting at step 2 (file is now in `.raw/`).
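
Steps 3 and 4 can be sketched as follows. The function names `slug_from_url` and `raw_article` are hypothetical, and three choices go beyond the text and are assumptions: page extensions such as `.html` are dropped, every run of non-alphanumeric characters (not only spaces) becomes a hyphen, and an empty path falls back to `index`.

```python
import re
from datetime import date
from urllib.parse import unquote, urlparse


def slug_from_url(url):
    """Last path segment, lowercased, hyphenated, query string dropped."""
    path = urlparse(url).path  # urlparse splits off ?query and #fragment
    segment = unquote(path.rstrip("/").split("/")[-1])
    # Assumption: drop common page extensions before slugging.
    segment = re.sub(r"\.(html?|php|aspx?)$", "", segment, flags=re.I)
    # Assumption: collapse any non-alphanumeric run into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", segment.lower()).strip("-")
    return slug or "index"  # assumption: fallback for an empty path


def raw_article(url, body, today=None):
    """Return (path, text) for the .raw/articles/ file with its frontmatter."""
    stamp = (today or date.today()).isoformat()
    path = f".raw/articles/{slug_from_url(url)}-{stamp}.md"
    text = f"---\nsource_url: {url}\nfetched: {stamp}\n---\n{body}"
    return path, text
```

For example, `slug_from_url("https://example.com/blog/My%20Great%20Post.html?utm=1")` yields `my-great-post`.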
---
## Image / Vision Ingestion
Trigger: user passes an image file path (`.png`, `.jpg`, `.jpeg`, `.gif`, `.webp`, `.svg`, `.avif`).
Steps:
1. **Read** the image file using the Read tool — Claude can process images natively.
2. **Describe** the image contents: extract all text (OCR), identify key concepts, entities, diagrams, and data visible in the image.
3. **Save** the description to `.raw/images/[slug]-[YYYY-MM-DD].md`:
```markdown
---
source_type: image
original_file: [original path]
fetched: [YYYY-MM-DD]
---
# Image: [slug]
[Full description of image contents, transcribed text, entities visible, etc.]
```
4. Copy the image to `_attachments/images/[slug].[ext]` if it's not already in the vault.
5. Proceed with **Single Source Ingest** on the saved description file.
Use cases: whiteboard photos, screenshots, diagrams, infographics, document scans.
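
The trigger check and the file layout from steps 3 and 4 can be sketched as below. The names `is_image` and `image_note` are hypothetical, and deriving the slug from the image file name is an assumption; the description text itself still comes from the vision step.

```python
from datetime import date
from pathlib import Path

# Extensions listed in the trigger above.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".svg", ".avif"}


def is_image(path):
    """True when the path has one of the image extensions (case-insensitive)."""
    return Path(path).suffix.lower() in IMAGE_EXTENSIONS


def image_note(original, description, today=None):
    """Return (note_path, attachment_path, note_text) for an image source."""
    original = Path(original)
    stamp = (today or date.today()).isoformat()
    # Assumption: the slug is the file name, lowercased, spaces to hyphens.
    slug = original.stem.lower().replace(" ", "-")
    note_path = f".raw/images/{slug}-{stamp}.md"
    attachment_path = f"_attachments/images/{slug}{original.suffix.lower()}"
    note = (
        "---\n"
        "source_type: image\n"
        f"original_file: {original.as_posix()}\n"
        f"fetched: {stamp}\n"
        "---\n"
        f"# Image: {slug}\n"
        f"{description}\n"
    )
    return note_path, attachment_path, note
```

The returned note path is what the Single Source Ingest step would then receive.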
---
## Single Source Ingest