✎ Note

I’d been putting off properly tagging my Obsidian vault for years: over 800 notes accumulated with no consistent structure, just a pile of clippings, quick references, and dumped chat exports. Classic case of “I’ll organize it later.”

The approach: rather than trust a single AI pass blindly, I built a small classification pipeline and validated it in stages. First, a discovery run over a sample of notes with no fixed categories — just “what is this actually about?” — to see what topics and tags naturally emerged from the data itself, instead of guessing a taxonomy up front. Once a sensible set of categories settled out, I locked it in and ran the classifier against a bigger slice of the corpus, checked the results by hand to make sure the model wasn’t hallucinating or defaulting to generic buckets, and only then let it loose on the entire vault.
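For illustration, here is a rough sketch of how those stages could fit together. It is not the actual pipeline: every function name is hypothetical, the `min_count` threshold is an assumed way of deciding when a category has "settled out", and `keyword_classifier` is a toy stand-in for the local model so the example runs on its own.

```python
import random
from collections import Counter


def keyword_classifier(text, allowed=None):
    """Toy stand-in for the local model (an assumption for this demo).

    With allowed=None it tags freely (discovery mode); otherwise it is
    restricted to the locked categories and falls back to "misc".
    """
    keywords = {"recipe": "cooking", "python": "programming", "invoice": "finance"}
    tags = [tag for word, tag in keywords.items() if word in text.lower()]
    if allowed is not None:
        tags = [t for t in tags if t in allowed]
    return tags or ["misc"]


def discover(notes, classify, sample_size, seed=0):
    """Stage 1: open-ended tagging over a random sample, counting what emerges."""
    rng = random.Random(seed)
    sample = rng.sample(sorted(notes), min(sample_size, len(notes)))
    counts = Counter()
    for name in sample:
        counts.update(classify(notes[name], None))
    return counts


def lock_taxonomy(counts, min_count=2):
    """Stage 2: keep only tags that showed up often enough (assumed threshold)."""
    return sorted(tag for tag, n in counts.items() if n >= min_count)


def classify_batch(notes, names, classify, taxonomy):
    """Stage 3: classify against the locked taxonomy and flag anything suspect.

    A note is flagged for hand review when the classifier returns a tag
    outside the taxonomy or no usable tag at all.
    """
    allowed = set(taxonomy)
    results, flagged = {}, {}
    for name in names:
        tags = classify(notes[name], taxonomy)
        good = [t for t in tags if t in allowed]
        bad = [t for t in tags if t not in allowed]
        results[name] = good
        if bad or not good:
            flagged[name] = bad
    return results, flagged


# Tiny made-up corpus, purely for demonstration.
notes = {
    "a.md": "Python tips for dataclasses",
    "b.md": "Sourdough recipe notes",
    "c.md": "More python snippets",
    "d.md": "Another recipe: dal",
    "e.md": "Invoice from March",
    "f.md": "Random thoughts",
}

counts = discover(notes, keyword_classifier, sample_size=6)
taxonomy = lock_taxonomy(counts)
results, flagged = classify_batch(notes, sorted(notes), keyword_classifier, taxonomy)

print("taxonomy:", taxonomy)
print("needs hand review:", sorted(flagged))
```

The flagged set is where the by-hand check would start: notes that fell out of the taxonomy or landed in a generic bucket are the ones most worth reading before running the classifier over everything.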

Once that pipeline was proven out, I pointed it at something meatier: a full year’s worth of markdown files generated by Claude Code sessions — specs, audits, plans, reports, incident write-ups — and ran the exact same enrichment process against it. Same local model, same classify-then-verify discipline, just a different (and much denser) corpus. Nice validation that the pipeline generalizes beyond “lazily tagged personal notes” into real working documentation.

AI: Text None
