Orchestration
AgentHub
A personal AI command center that learns how you think, routes every request to the right specialist, and rewrites its own prompts based on 40+ real conversations.
In Plain English
Imagine texting one number that automatically forwards your message to the right expert, whether you need a therapist, a sysadmin, or a shopping assistant. AgentHub does that with AI agents (programs that can make decisions and take actions on their own): you say what you need, and it picks the right model (the specific AI brain trained for that job), the right personality, and the right context without you lifting a finger.
Problem
When you work with AI every day, across dozens of different domains, the single-chat-window paradigm falls apart fast. You might ask about a dream you had, then pivot to debugging a firewall rule, then need to compare prices on cat food. Each of those tasks benefits from a different model, a different system prompt, and a different set of background knowledge. Doing that switching manually means copying context, rewriting instructions, and losing the thread of what you were actually trying to accomplish.
AgentHub grew out of that friction. The core idea is that if you have enough real conversation data, you can build a system that genuinely understands the texture of how a particular person uses AI. Not a generic chatbot router, but a system trained on the actual patterns of one user's life: their schedule, their vocabulary, their emotional rhythms, the projects they jump between. The 1,029-line prompt system at the heart of AgentHub was built by analyzing over 40 real conversation threads, extracting the patterns that matter, and encoding them into a dynamic template engine that adapts in real time.
The result is something closer to a personal AI operating system than a chatbot. It remembers that Monday mornings usually mean vault health checks. It knows that when you type in Norwegian, the response should come back in Norwegian. It understands that "deep extraction" means a multi-phase research process with verification, not a one-shot summary. And it routes all of that through a cascade of models chosen for cost efficiency: free local models for simple tasks, cheap API calls for medium complexity, and premium models only when the question genuinely demands it.
Architecture
Features
4-Tier Router Cascade
7s max latency
Every request passes through a four-stage classification pipeline designed to never hang. The first tier checks learned pattern matches for instant routing. If no pattern fires, GPT-4o-mini performs the primary intent classification. Failing that, the local brain service on port 8420 provides a free fallback, and as a final safety net, keyword matching catches anything the models miss. The entire cascade completes in under seven seconds, and each tier feeds confidence scores back to the pattern learner so routing improves with every interaction.
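As a rough sketch, the cascade amounts to trying four classifiers in order under a hard deadline and always returning something. The tier functions, the "@general" fallback agent, and the structure below are hypothetical stand-ins for illustration, not the actual router.py internals:

```python
import time
from typing import Callable, Optional

# Hypothetical stand-ins for the real tiers; each returns an agent name
# (e.g. "@writer") or None when it cannot classify the request.
def match_learned_patterns(text: str) -> Optional[str]: ...
def classify_with_gpt4o_mini(text: str) -> Optional[str]: ...
def classify_with_local_brain(text: str) -> Optional[str]: ...  # port 8420
def classify_with_keywords(text: str) -> Optional[str]: ...

TIERS: list[tuple[str, Callable[[str], Optional[str]]]] = [
    ("learned_patterns", match_learned_patterns),
    ("gpt4o_mini", classify_with_gpt4o_mini),
    ("local_brain", classify_with_local_brain),
    ("keywords", classify_with_keywords),
]

MAX_LATENCY_S = 7.0  # give up on a tier rather than hang

def route(text: str) -> tuple[str, str]:
    """Return (tier_name, agent) from the first tier that produces an answer."""
    deadline = time.monotonic() + MAX_LATENCY_S
    for name, tier in TIERS:
        if time.monotonic() > deadline:
            break
        agent = tier(text)
        if agent is not None:
            return name, agent
    return "fallback", "@general"  # never hang, never return nothing
```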
Dynamic Prompt System v2
1,029 lines
Built from analysis of 40+ real conversation threads, the prompt system is the intelligence layer that transforms raw user input into precisely tuned model instructions. It detects quality modifiers like "ultra," "verbose," and "deep" to apply boost multipliers. It recognizes constraint patterns such as "norwegian_only," "budget," and "no_guessing" to enforce behavioral guardrails. Across 10+ prompt categories (shopping research, deep extraction, personality creation, verification), it selects the right template and injects personal context. An anti-pattern filter strips AI fluff using curated FORBIDDEN_STARTS, FORBIDDEN_ENDS, and FORBIDDEN_ANYWHERE pattern lists, ensuring outputs stay grounded and useful.
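As a rough illustration of the idea (not the actual prompt_system.py code, which runs to 1,029 lines), quality modifiers and constraint patterns could be detected with simple keyword scans and folded into template selection. The boost factors and cue phrases below are invented for the example:

```python
# Illustrative only; the real modifier and constraint lists are project-specific.
QUALITY_BOOSTS = {"ultra": 2.0, "verbose": 1.5, "deep": 1.5}
CONSTRAINTS = {
    "norwegian_only": ["på norsk", "norwegian only"],
    "budget": ["cheap", "budget", "billig"],
    "no_guessing": ["no guessing", "only verified"],
}

def analyze_request(text: str) -> dict:
    """Detect quality modifiers and behavioral constraints in raw user input."""
    lowered = text.lower()
    boost = 1.0
    for word, factor in QUALITY_BOOSTS.items():
        if word in lowered:
            boost = max(boost, factor)
    active = [name for name, cues in CONSTRAINTS.items()
              if any(cue in lowered for cue in cues)]
    return {"boost": boost, "constraints": active}

print(analyze_request("Deep extraction on this vendor, på norsk, no guessing"))
# {'boost': 1.5, 'constraints': ['norwegian_only', 'no_guessing']}
```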
Smart Suggestions Engine
30+ suggestions
Rather than presenting a blank text box, AgentHub proactively suggests what you might want to do right now. Over 30 pre-built suggestions are organized into eight categories, including time-based prompts (morning mood check, evening shadow reflection), therapy and psychology (shadow trigger processing, parts work check-in, dream logging), system health (vault scans, infrastructure status), shopping (price comparisons, gift ideas), and deep work (multi-phase research, truth checking). Each suggestion carries a time-range filter so morning prompts only appear before noon, a priority score for ranking, agent routing metadata, and an optional input flag. The engine learns your daily rhythm and surfaces the right tool at the right time.
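A plausible shape for one suggestion entry and the time-of-day filtering is sketched below. The field names and the "@shopper" agent are assumptions made for the example, not the project's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Suggestion:
    label: str            # button text shown in the dashboard
    category: str         # e.g. "therapy", "system", "shopping"
    agent: str            # routing metadata, e.g. "@writer"
    priority: int         # higher ranks first
    time_range: tuple[int, int] | None = None  # (start_hour, end_hour)
    needs_input: bool = False                  # open a text field first?

SUGGESTIONS = [
    Suggestion("Morning mood check", "time", "@writer", 90, (5, 12)),
    Suggestion("Vault health scan", "system", "@guardian", 70),
    Suggestion("Compare cat food prices", "shopping", "@shopper", 40,
               needs_input=True),
]

def current_suggestions(now: datetime | None = None) -> list[Suggestion]:
    """Filter by time of day, then rank by priority."""
    now = now or datetime.now()
    visible = [s for s in SUGGESTIONS
               if s.time_range is None
               or s.time_range[0] <= now.hour < s.time_range[1]]
    return sorted(visible, key=lambda s: s.priority, reverse=True)
```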
Pattern Learner
4 pattern types
The pattern learning module runs a local LLM (hermes3) to extract behavioral patterns from real usage. It tracks four types: time patterns (recognizing that Monday mornings mean vault scans), sequence patterns (noting that RDP fixes are always followed by connection tests), emotional patterns (detecting frustration and routing to shadow work), and preference patterns (learning that you want detailed reports, not summaries). A PatternIndex enables fast lookup by time, sequence, keyword, and agent. Every ten activity entries, the system auto-extracts new patterns, and user feedback (accepted or rejected) adjusts confidence scores over time.
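A simplified sketch of what a pattern record and the PatternIndex lookups might look like; the field names here are assumptions, not the real pattern_learner.py schema:

```python
from collections import defaultdict
from dataclasses import dataclass, field

@dataclass
class Pattern:
    kind: str          # "time" | "sequence" | "emotional" | "preference"
    trigger: str       # e.g. "monday_morning" or "rdp_fix"
    action: str        # e.g. "vault_scan" or "connection_test"
    agent: str         # which agent the pattern routes to
    confidence: float  # adjusted by accepted/rejected feedback

@dataclass
class PatternIndex:
    by_kind: dict[str, list[Pattern]] = field(default_factory=lambda: defaultdict(list))
    by_agent: dict[str, list[Pattern]] = field(default_factory=lambda: defaultdict(list))
    by_keyword: dict[str, list[Pattern]] = field(default_factory=lambda: defaultdict(list))

    def add(self, p: Pattern) -> None:
        """Register a pattern in every lookup table."""
        self.by_kind[p.kind].append(p)
        self.by_agent[p.agent].append(p)
        for word in p.trigger.split("_"):
            self.by_keyword[word].append(p)

index = PatternIndex()
index.add(Pattern("time", "monday_morning", "vault_scan", "@guardian", 0.8))
index.add(Pattern("sequence", "rdp_fix", "connection_test", "@system", 0.7))
```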
How It Works
Request Arrives with Context
A message comes in through the React dashboard or one of the 50+ REST endpoints. Before anything else, the system loads the persistent USER_CONTEXT, which contains personal preferences, bilingual support settings (Norwegian and English), and a psychological framework (Jungian plus Buddhist). This context travels with every request, ensuring that even the first message of a session has full personal grounding. The smart suggestions engine simultaneously evaluates the current time, recent activity, and user patterns to present contextually relevant action buttons alongside the chat input.
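In spirit, the persistent context is a structured blob attached to every request before routing. The keys below are illustrative; the real USER_CONTEXT lives in storage and carries far more detail:

```python
# Illustrative shape only; not the actual USER_CONTEXT contents.
USER_CONTEXT = {
    "languages": ["no", "en"],                    # reply in the language used
    "psych_framework": ["jungian", "buddhist"],
    "preferences": {"reports": "detailed", "tone": "direct"},
}

def build_request(message: str) -> dict:
    # Every request, even the first of a session, ships with full context.
    return {"message": message, "context": USER_CONTEXT}
```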
4-Tier Classification Cascade
The 873-line router module kicks in with its four-tier classification pipeline. First, learned patterns are checked for an instant match. If the pattern learner has seen this type of request before with high confidence, it skips the API call entirely. Otherwise, GPT-4o-mini performs the primary classification, categorizing intent and selecting the target agent. The local brain at port 8420 serves as a free backup, and keyword matching provides the final safety net. The cascade also runs complexity detection using curated word lists of SIMPLE_SIGNALS and COMPLEX_SIGNALS to determine whether the request needs a free local model, a cheap API call, or a premium Claude session.
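Complexity detection could be as simple as counting signal words from the two curated lists and mapping the score to a model tier. The signal words, tier names, and thresholds below are made up for illustration:

```python
# Example signal words; the real SIMPLE_SIGNALS / COMPLEX_SIGNALS lists differ.
SIMPLE_SIGNALS = {"quick", "short", "what is", "remind", "list"}
COMPLEX_SIGNALS = {"deep", "verify", "compare", "multi-phase", "architecture"}

def pick_model_tier(text: str) -> str:
    """Map a request to a cost tier based on which signals dominate."""
    lowered = text.lower()
    simple = sum(sig in lowered for sig in SIMPLE_SIGNALS)
    complex_ = sum(sig in lowered for sig in COMPLEX_SIGNALS)
    if complex_ > simple:
        return "premium"     # e.g. Claude, for genuinely hard questions
    if simple > complex_:
        return "local_free"  # e.g. hermes3 via Ollama
    return "cheap_api"       # e.g. GPT-4o-mini for the middle ground

print(pick_model_tier("Quick list of open ports"))             # local_free
print(pick_model_tier("Deep extraction, verify every claim"))  # premium
```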
Prompt Enhancement and Filtering
Once the router has classified the request, the prompt system transforms it. Quality modifiers are detected and applied as boost multipliers. Constraint patterns are identified and enforced. The appropriate template from 10+ prompt categories is selected and populated with format strings. Domain-specific context is injected based on the chosen agent. Finally, the anti-pattern filter runs through FORBIDDEN_STARTS, FORBIDDEN_ENDS, and FORBIDDEN_ANYWHERE pattern lists to strip common AI filler phrases before the enhanced prompt reaches the model. The goal is that the model receives exactly the instruction it needs, with no ambiguity and no fluff.
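In essence, the anti-pattern filter strips matches from the three lists. The phrases below are examples of the kind of filler involved, not the project's actual curated lists:

```python
import re

# Example entries only; the curated lists in prompt_system.py are longer.
FORBIDDEN_STARTS = ["Certainly!", "Great question!", "As an AI"]
FORBIDDEN_ENDS = ["Let me know if you have any questions.", "I hope this helps!"]
FORBIDDEN_ANYWHERE = ["delve into", "it's important to note that"]

def strip_fluff(text: str) -> str:
    """Remove common AI filler from the start, end, and body of a string."""
    for start in FORBIDDEN_STARTS:
        if text.lstrip().startswith(start):
            text = text.lstrip()[len(start):].lstrip()
    for end in FORBIDDEN_ENDS:
        if text.rstrip().endswith(end):
            text = text.rstrip()[: -len(end)].rstrip()
    for phrase in FORBIDDEN_ANYWHERE:
        text = re.sub(re.escape(phrase), "", text, flags=re.IGNORECASE)
    return text.strip()

print(strip_fluff("Certainly! The firewall rule blocks port 3389. I hope this helps!"))
# The firewall rule blocks port 3389.
```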
Agent Execution and Streaming
The enhanced request is dispatched to one of 15+ specialized agents, each with its own system prompt, tool access, and behavioral constraints. The @writer agent handles shadow work, dreams, and psychological journaling. The @guardian agent monitors vault health and detects orphaned links. The @system agent manages RDP connections, firewalls, and Windows services. Each agent streams its response back through persistent WebSocket connections, so the React dashboard updates in real time. The dashboard shows agent cards with Spawn, View, and Config controls, an activity monitor, and tabbed views for Smart suggestions, workflow Flows, and Recent history.
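A bare-bones sketch of streaming an agent's output over a FastAPI WebSocket, with a hypothetical run_agent generator and "/ws/chat" route standing in for the real agent_manager and endpoint layout:

```python
from typing import AsyncIterator

from fastapi import FastAPI, WebSocket

app = FastAPI()

async def run_agent(agent: str, prompt: str) -> AsyncIterator[str]:
    # Hypothetical stand-in for the agent_manager / llm_gateway stream.
    for chunk in ("Scanning vault... ", "3 orphaned links found."):
        yield chunk

@app.websocket("/ws/chat")
async def chat(ws: WebSocket) -> None:
    await ws.accept()
    while True:
        request = await ws.receive_json()  # {"agent": "...", "prompt": "..."}
        async for chunk in run_agent(request["agent"], request["prompt"]):
            await ws.send_json({"type": "chunk", "text": chunk})
        await ws.send_json({"type": "done"})  # dashboard finalizes the message
```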
Pattern Learning and Feedback
After execution, the pattern learner records the interaction. Every ten activity entries, it runs hermes3 to auto-extract new behavioral patterns. If the user accepted or rejected a suggestion, that feedback adjusts confidence scores for future routing. Successful patterns get reinforced; failed ones get demoted. The PatternIndex updates its fast-lookup tables for time-based, sequence-based, keyword-based, and agent-based queries. Over time, the system builds an increasingly accurate model of the user's habits, preferences, and workflow rhythms, making each subsequent interaction faster and more precisely targeted.
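One plausible way accept/reject feedback could nudge confidence scores; the actual update rule in pattern_learner.py may differ:

```python
def update_confidence(confidence: float, accepted: bool,
                      step: float = 0.1) -> float:
    """Reinforce accepted patterns, demote rejected ones, clamp to [0, 1]."""
    confidence += step if accepted else -step
    return min(1.0, max(0.0, confidence))

# A pattern that keeps getting accepted climbs toward 1.0 and starts winning
# tier-1 routing; a rejected one sinks until it stops firing.
c = 0.5
for accepted in (True, True, False, True):
    c = update_confidence(c, accepted)
print(round(c, 2))  # 0.7
```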
Tech Stack
Runtime
FastAPI + Uvicorn on port 8421
Frontend
React dashboard with purple accent (#7c3aed), dark theme, WebSocket live updates
Models
Ollama (hermes3, local), OpenAI (GPT-4o-mini), Anthropic (Claude), Google (Gemini)
Core Modules
router.py (873 lines), prompt_system.py (1,029 lines), pattern_learner.py (490 lines), agent_manager, llm_gateway, scheduler, workflow_engine
API Surface
50+ REST endpoints across brain, browser, LLM, model, prompt, service, session, settings, task, and usage routes
Storage
SQLite for state persistence, session history, and pattern storage