Knowledge & Memory

Training

Fine-tuning pipeline for creating custom Ollama models trained on vault structure, routing rules, and organizational patterns from your personal knowledge base.

Python · Ollama · Qwen2.5 · JSONL · Alpaca Format

In Plain English

This is a workshop for creating custom AI models that run locally through Ollama (a tool for running AI on your own computer without cloud services). You feed it examples of how you want the AI to respond, and it fine-tunes a model (adjusts an existing AI brain to learn your specific patterns) to fit your needs. It's like training a new employee by showing them exactly how you want things done until they can do it on their own.

Problem

General-purpose models don't understand your file organization, naming conventions, or routing rules. Training produces models that actually know where shadow work journals go, how to route book notes, and which folder structure applies to each project - baking your vault rules directly into model weights.

Architecture

SOURCE DATA: Obsidian vault — folder structure, Vault Constitution, routing rules, example corrections

MINING & GENERATION: generate_training_data.py main pipeline — pattern extractor, category mapper, format converter, multi-format output

TRAINING DATA: JSONL output (line-delimited JSON), Alpaca format (instruction-following), ChatML format (conversation pairs)

FINE-TUNING: base model Qwen2.5:14b; Modelfile config (system prompts); training params (epochs, LR, batch); data validation (format checks); fine-tuning via `ollama create -f Modelfile` bakes vault rules into weights

DEPLOYMENT: custom model vault-manager (specialized routing); create-model.bat deployment script; Ollama registry (model storage); MCP server exposing the model API — vault_route(query), suggest_path(note)

INTEGRATION: Claude Code agent runtime — routing requests, response validation, performance metrics, feedback loop

TRAINING PIPELINE → PRODUCTION DEPLOYMENT

Key Features

Vault Mining

automated

Scripts scan your Obsidian vault structure, extract routing patterns, and generate training examples in JSONL format automatically.
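A minimal sketch of this mining step, assuming a vault laid out as nested folders of Markdown notes (e.g. `02_Personal/Psychology/`). The real generate_training_data.py does more (pattern extraction, category mapping); the function names and example schema here are illustrative.

```python
import json
from pathlib import Path

def mine_vault(vault_root: str) -> list[dict]:
    """Turn each note's existing location into a routing training example."""
    examples = []
    for note in Path(vault_root).rglob("*.md"):
        folder = note.parent.relative_to(vault_root).as_posix()
        examples.append({
            "instruction": "Route this note to the correct vault folder.",
            "input": note.stem,       # note title stands in for its content
            "output": folder,         # the folder it already lives in
        })
    return examples

def write_jsonl(examples: list[dict], out_path: str) -> None:
    """Emit one JSON object per line (JSONL)."""
    with open(out_path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex, ensure_ascii=False) + "\n")
```

The key idea: every file already filed in the vault is a free labeled example, so no manual annotation is needed.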

Multi-Format Export

3 formats

Training data is generated in Alpaca (instruction-following), ChatML (conversation), and JSONL (line-delimited) formats for flexibility.
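A sketch of converting one raw mined example into the Alpaca and ChatML shapes. Field names follow the common Alpaca and ChatML conventions; the pipeline's exact schema may differ.

```python
def to_alpaca(ex: dict) -> dict:
    """Alpaca: flat instruction/input/output record."""
    return {
        "instruction": ex["instruction"],
        "input": ex["input"],
        "output": ex["output"],
    }

def to_chatml(ex: dict) -> dict:
    """ChatML: the same example as a system/user/assistant conversation."""
    return {
        "messages": [
            {"role": "system", "content": ex["instruction"]},
            {"role": "user", "content": ex["input"]},
            {"role": "assistant", "content": ex["output"]},
        ]
    }
```

Either shape can then be serialized line-by-line into the JSONL output.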

Constitution Encoding

custom rules

The Vault Constitution - routing rules like "shadow work → 02_Personal/Psychology/" - is encoded directly into the model's behavior through fine-tuning.
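A hedged sketch of extracting those rules, assuming the constitution lists them as bullet lines of the form "- topic → folder/" (the actual document format is an assumption).

```python
def parse_constitution(markdown: str) -> dict:
    """Parse 'topic → folder' bullet lines from the constitution document."""
    rules = {}
    for line in markdown.splitlines():
        line = line.strip().lstrip("-").strip()
        if "→" in line:
            topic, folder = (part.strip() for part in line.split("→", 1))
            rules[topic] = folder
    return rules
```

Each parsed rule becomes one more instruction-response pair in the training set, which is how the constitution ends up encoded in the model rather than in a runtime prompt.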

How It Works

01

Mining

generate_training_data.py scans the vault structure and creates training examples from existing file paths and categories

02

Formatting

Raw examples are converted to Alpaca/ChatML/JSONL format with instruction-response pairs

03

Fine-Tuning

The base Qwen2.5:14b model is fine-tuned using the vault-specific training data and constitution rules
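An illustrative Modelfile for this step. `FROM`, `SYSTEM`, and `PARAMETER` are standard Ollama Modelfile directives; the system prompt text and parameter value are assumptions (and `FROM` could instead point at a fine-tuned GGUF file when one is produced).

```
FROM qwen2.5:14b

SYSTEM """You are vault-manager, a routing assistant for an Obsidian vault.
Given a note title or snippet, reply with the destination folder path only,
following the Vault Constitution (e.g. shadow work → 02_Personal/Psychology/)."""

PARAMETER temperature 0.2
```

Building the model from this file is then `ollama create vault-manager -f Modelfile`.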

04

Deployment

The custom model is loaded into Ollama via create-model.bat and exposed through MCP for Claude Code to use
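A sketch of how an MCP tool like vault_route(query) might call the deployed model through Ollama's local HTTP API (default port 11434, documented `/api/generate` endpoint). The payload builder is kept pure so it can be shown without a running server; everything beyond Ollama's documented request/response fields is an assumption.

```python
import json
from urllib.request import Request, urlopen

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_route_request(query: str, model: str = "vault-manager") -> dict:
    """Build the JSON body for Ollama's /api/generate endpoint."""
    return {
        "model": model,
        "prompt": f"Route this note: {query}",
        "stream": False,   # one complete JSON response instead of a stream
    }

def vault_route(query: str) -> str:
    """Ask the custom model for a destination folder.

    Requires a running Ollama instance with vault-manager loaded.
    """
    req = Request(
        OLLAMA_URL,
        data=json.dumps(build_route_request(query)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()
```

Claude Code calls this tool through MCP, and mis-routed results can be fed back into the training data as example corrections, closing the feedback loop.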

Tech Stack

Base Model

Qwen2.5:14b via Ollama

Data Generation

Python scripts

Formats

JSONL, Alpaca, ChatML

Constitution

Markdown rules document

Deployment

Ollama Modelfile + batch script

Integration

MCP server