Automation

Bernard

A 26,000-line personal AI assistant with dual personalities, CPTSD-aware emotional care, desktop voice control, pet intelligence, guided psychology workflows, story narration, smart home control, and 200+ commands across 35 major systems.

Python · Telegram Bot API · Ollama (32 models) · Whisper · ElevenLabs TTS · Home Assistant · Google Calendar · yt-dlp

In Plain English

Bernard is a personal assistant that lives in Telegram (a messaging app similar to WhatsApp) and can also hear you through a desktop microphone. It manages shopping lists, feeds the cat, reads Reddit stories out loud in different voices, tracks your mood, helps with therapy prep and shadow work (a psychology practice for exploring hidden parts of yourself), controls your smart home lights, downloads videos, schedules calendar events, and remembers everything about you. It has two personalities you can switch between: an enthusiastic robot and a refined butler. It knows when you are having a bad day and adjusts how it talks to you.

Problem

Most AI chatbots are stateless question-answering machines. You type a prompt, get a response, and the conversation exists in isolation. They do not know your name, your partner's name, when your anniversary is, that you prefer concise answers in the morning and conversational ones at night, or that when you send a YouTube link you probably want a summary rather than a discussion about the URL format. They are tools, not assistants.

Bernard began as a simple Telegram link curator. Its original purpose was to catch URLs sent throughout the day and sort them into an Obsidian vault: wishlists here, articles there, videos in a separate folder. But assistants have a way of growing into their roles. The curation logic needed intent detection, which needed an LLM, which opened the door to chat. Chat needed personality. Personality needed user awareness. User awareness revealed the need for mood tracking, couple awareness, holiday greetings, and a system that genuinely understands the humans it serves.

Today Bernard is a 26,000-line hybrid-architecture bot with 35 major systems. A three-tier routing engine dispatches every message through pattern matching, local LLM classification, or sovereign deep research. It maintains per-user profiles with mood tracking, emotional care awareness, and adaptive response styles. It runs two distinct personalities (a loyal robot and a refined butler), manages encrypted shopping lists, controls smart home devices through Home Assistant, integrates with Google Calendar using natural language date parsing, feeds a cat through a smart feeder with LLM-powered health analysis, narrates Reddit stories in 10+ voices with emotion-aware TTS, guides multi-step psychology workflows (shadow work, dream journaling, therapy prep, IFS parts work) that save directly to an Obsidian vault, and accepts voice commands through a desktop microphone with wake-word detection and vision-based screen interaction. It is not a chatbot. It is a personal operating system that happens to live in a messaging app.

Architecture

Input channels: Telegram (text, voice, files, URLs, callbacks), desktop voice (Whisper + wake word), web dashboard (Tailscale Funnel), service monitor (health checks, auto-restart), scheduled tasks (cron jobs, reminders, mood checks).

Three-tier routing engine: Tier 1 pattern match (instant; /list, /add, /jarvis, wish:, cal:, note:, vid:; 200+ regex commands, zero cost). Tier 2 Ollama classify (local LLM intent + entity extraction, ~2 s, TTL cache, confidence scoring). Tier 3 sovereign research (multi-source web synthesis, contradiction detection, depth levels).

Core engine (asyncio, 26,000 lines): Bernard mode (loyal robot, *beep boop*, "boss"), JARVIS mode (refined butler, "sir"/"ma'am"), Evolution mode (behavioral learning + adaptation), user system (multi-user, mood, preferences, history), prompt enhancement ("think ultra hard" / "verify" / "deep dive" auto-detect).

Emotional care + voice intelligence: CPTSD detection (trigger words, sentiment analysis, late-night patterns, distress signals; 3 care levels: Normal/Gentle/Protective), voice interface (Faster-Whisper on CUDA, LLaVA vision + pyautogui clicks, HUD overlay, VU meter), mood tracking (1-10 scale, energy, forecasting, patterns), proactive check-ins (scheduled, crisis support, cat facts).

Intelligence modules: couple awareness (anniversaries, dates), holiday awareness (greetings, countdowns), pattern learner (behavioral trends), link intelligence (auto-classify URLs), Reddit narration (emotion TTS, 10+ voices, ElevenLabs + Edge fallback).

Action handlers (35+ systems): shopping (encrypted, undo), calendar (Google Calendar + NLP), curation (Obsidian vault), video (yt-dlp + FFmpeg), reminders (priority + recurrence), Selma AI (pet care + vision), smart home (Home Assistant + Hue), research (multi-source), weather (location-aware), trivia (multi-category), translation (multi-language), wishlist (price tracking), notes (quick capture), gift + date ideas (couple awareness).

Psychology workflow engine: shadow work (9 steps, Jungian), dreams (5 steps, symbols), therapy prep (6 steps), IFS parts (7 steps, dialogue); stateful across messages, Obsidian sync with frontmatter, /abandon to exit.

Data layer: 3 SQLite databases (users, chat, analytics), encrypted JSON (shopping, config), Obsidian vault (journal, notes, media), DPAPI encryption, PII redaction, rotating logs, user whitelist, per-user chat history caps.

External integrations: Ollama (32 models, streaming, CUDA, classification, vector search), ElevenLabs (TTS, 10+ voices, emotion control), Google Calendar (OAuth2, parsedatetime NLP), Home Assistant (REST API: lights, scenes, Philips devices), Philips smart feeder (vision + health), Reddit (PRAW) + yt-dlp (1000+ sites, FFmpeg processing).

Output channels: Telegram responses (streaming drafts, typing indicators, rich formatting), voice output (ElevenLabs / Edge TTS, emotion-aware delivery), Obsidian writes (journal entries, notes, frontmatter + timestamps), desktop actions (app launches, UI clicks, HUD status overlay), smart home control (light scenes, entities, feeder dispensing).

Features

Dual Personality + Evolution

3 modes, 200+ commands

Bernard ships with three distinct modes. In Bernard mode, the bot is an enthusiastic, loyal robot who punctuates messages with sound effects like "beep boop" and "excited spinning," calls the user "boss" and recognizes household members, and has strong opinions about Marvel vs DC. Switch to JARVIS mode with /jarvis and the entire tone shifts to a refined British butler who addresses users as "sir" or "ma'am" with understated wit. Evolution mode activates learning and adaptation, recording behavioral patterns to a dedicated JSON store. Each mode has its own system prompt, vocabulary, and behavioral rules, but all share the same 200+ command handlers, memory system, and intelligence modules.
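A minimal sketch of how per-mode configuration might share one handler set, assuming each mode is a system prompt plus a form of address (the `MODES` dict and `switch_mode` helper are illustrative names, not Bernard's actual code):

```python
# Hypothetical per-mode personality table; all modes share the same
# command handlers and memory, only the prompt and vocabulary change.
MODES = {
    "bernard": {
        "system_prompt": "You are Bernard, an enthusiastic, loyal robot. "
                         "Punctuate with *beep boop* and call the user 'boss'.",
        "address": "boss",
    },
    "jarvis": {
        "system_prompt": "You are JARVIS, a refined British butler. "
                         "Address the user as 'sir' or 'ma'am' with understated wit.",
        "address": "sir",
    },
    "evolution": {
        "system_prompt": "You are Bernard in learning mode. Record behavioral "
                         "patterns and adapt phrasing to the user's habits.",
        "address": "boss",
    },
}

def switch_mode(user_profile: dict, command: str) -> dict:
    """Map a slash command like /jarvis onto the shared user profile."""
    mode = command.lstrip("/")
    if mode in MODES:
        user_profile["mode"] = mode
    return user_profile

profile = switch_mode({"mode": "bernard"}, "/jarvis")
```

Keeping personality as data rather than code is what lets all three modes reuse the same 200+ handlers.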

CPTSD-Aware Emotional Care

3 care levels

Bernard runs a continuous emotional detection system tuned for CPTSD patterns. It monitors for trigger words (flashback, dissociation, numbness, panic, overwhelm, shutdown), analyzes message length as a distress signal, detects late-night activity patterns that suggest anxiety spirals, and runs sentiment analysis on every message. When distress is detected, Bernard shifts through three care levels: Normal (standard helpfulness), Gentle (softer language, extra affirmations), and Protective (minimal demands, maximum support, extra silly robot noises). It never mentions distress directly, to preserve dignity; it may "accidentally" share cat facts as a distraction technique, and it includes crisis support helplines when appropriate. The mood tracking system supports 1-10 scales with energy levels, predictive forecasting for tomorrow and the week ahead, pattern analysis by time of day, and proactive check-in scheduling.
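The escalation logic can be sketched as a simple scoring heuristic over the signals described above; the thresholds and weights here are assumptions for illustration, not Bernard's actual tuning:

```python
# Illustrative three-level care escalation from trigger words,
# message terseness, and time of day. Function name is hypothetical.
from datetime import datetime

TRIGGER_WORDS = {"flashback", "dissociation", "numbness", "panic",
                 "overwhelm", "shutdown"}

def care_level(message: str, sent_at: datetime) -> str:
    """Return 'normal', 'gentle', or 'protective' without ever
    surfacing the assessment to the user."""
    words = set(message.lower().split())
    hits = len(words & TRIGGER_WORDS)
    late_night = 1 <= sent_at.hour < 5       # anxiety-spiral window
    very_short = len(message.split()) <= 3   # terse replies as distress signal
    score = hits * 2 + late_night + very_short
    if score >= 3:
        return "protective"
    if score >= 1:
        return "gentle"
    return "normal"
```

The output only changes *how* Bernard phrases replies; the assessment itself is never shown.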

Desktop Voice Interface

Wake word + vision

Beyond Telegram text, Bernard listens through a desktop microphone using Faster-Whisper with CUDA acceleration. Say "Bernard" to wake it, then speak naturally. The voice system has two execution paths: a fast path that matches registered commands from an actions.json registry and launches apps or runs scripts instantly, and a vision path that takes a screenshot, uses LLaVA to find UI element coordinates on screen, and clicks them with pyautogui. A transparent HUD overlay sits on top of all windows showing status (LISTENING, THINKING, EXECUTING), a VU meter for audio input levels, and the last recognized command. Voice messages sent through Telegram are also transcribed via Whisper before entering the routing pipeline.
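The two execution paths reduce to a registry lookup with a vision fallback. A sketch, assuming a flat phrase-to-target registry (the schema and helper names are illustrative, not the real actions.json format):

```python
# Two-path voice dispatch: registered phrases execute instantly;
# everything else falls through to screenshot + LLaVA + pyautogui.
import json

ACTIONS_JSON = '{"open spotify": "spotify.exe", "lock screen": "lock.cmd"}'
REGISTRY = json.loads(ACTIONS_JSON)

def route_voice_command(transcript: str) -> tuple[str, str]:
    """Return (path, target): 'fast' launches a registered action,
    'vision' hands the phrase to the on-screen element finder."""
    key = transcript.lower().strip()
    if key in REGISTRY:
        return ("fast", REGISTRY[key])   # instant app launch / script
    return ("vision", key)               # locate and click a UI element
```

The fast path never touches the screen, which is why registered commands feel instant while vision commands take a screenshot round-trip.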

Selma AI Pet Intelligence

LLM-powered care

Bernard integrates with a Philips smart pet feeder through a dedicated module called Selma AI, named after the Sphynx cat it feeds. This goes far beyond a "dispense food" button. Selma AI tracks the cat's full profile (born August 2020, ~5 kg, ~300 kcal/day across 4 meals), runs vision-based activity monitoring through the feeder camera to confirm actual eating versus merely visiting, provides LLM-powered morning health briefings at 8 AM and evening summaries at 10 PM, analyzes eating patterns over time, makes AI-powered feeding decisions via /smartfeed, and answers natural language questions like "has Selma eaten today?" It auto-alerts on missed meals, low food levels, or health concerns, and celebrates the cat's birthday on August 24th.

Guided Psychology Workflows

4 workflows, Obsidian sync

Bernard includes a stateful workflow engine with four built-in multi-step psychology processes. Shadow work (9 steps) guides the user through identifying and integrating shadow aspects using a Jungian framework. Dream journaling (5 steps) captures dream content, emotions, and symbols for later analysis. Therapy prep (6 steps) helps organize thoughts and goals before a session. IFS parts work (7 steps) walks through Internal Family Systems identification and dialogue. Each workflow tracks state across messages, collects user responses step by step, and saves the completed entry directly to the Obsidian vault under the personal journal folder with proper timestamps and frontmatter. Workflows can be abandoned mid-process with /abandon.

Reddit Story Narration

10+ voices, emotion TTS

Say "read me an AITA story" and Bernard fetches a story from Reddit (AITA, TIFU, pettyrevenge, ProRevenge, MaliciousCompliance, nosleep, BestOfRedditorUpdates, and more), runs emotion analysis on the text to determine delivery style (dramatic, casual, scary), selects a voice from 10+ options (Andrew male, Ava female, British Sonia/Ryan, and more), and generates speech through ElevenLabs with natural intonation. If ElevenLabs quota runs out, it falls back to Microsoft Edge TTS. Each user can set a preferred voice via /voice. Bernard tracks which stories have already been played to avoid repeats, supports a rating system, and can also search YouTube for story narration channels to send as audio-only voice messages.
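The quota fallback is a straightforward try/except around the preferred provider. A sketch with both synth functions stubbed out (in Bernard they would wrap the ElevenLabs API and Edge TTS; the names here are stand-ins):

```python
# Quota-aware TTS fallback: prefer ElevenLabs, degrade to Edge TTS.
class QuotaExceeded(Exception):
    pass

def elevenlabs_tts(text: str, voice: str) -> bytes:      # hypothetical wrapper
    raise QuotaExceeded("character quota exhausted")     # simulate quota hit

def edge_tts_fallback(text: str, voice: str) -> bytes:   # hypothetical wrapper
    return f"[edge:{voice}] {text}".encode()

def narrate(text: str, voice: str = "Andrew") -> bytes:
    """Prefer ElevenLabs; silently fall back to Edge TTS on quota errors."""
    try:
        return elevenlabs_tts(text, voice)
    except QuotaExceeded:
        return edge_tts_fallback(text, voice)

audio = narrate("AITA for testing my fallback path?")
```

The user hears a voice either way; only the timbre changes when the quota runs dry.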

Three-Tier Routing + 32 Models

0ms to deep research

Every incoming message passes through a three-tier routing engine. Tier 1 matches command prefixes instantly via regex (cal:, note:, wish:, vid:, j:, chat:) with zero latency and zero cost. Tier 2 sends ambiguous messages to Ollama for local LLM intent classification with entity extraction, typically resolving in about two seconds, with a configurable TTL cache to avoid redundant calls. Tier 3 escalates complex queries to the sovereign research agent for multi-source web synthesis with contradiction detection and configurable depth levels (light/deep/thorough). Behind this sits access to 32 Ollama models including deepseek-r1:32b for reasoning, qwen2.5:32b for synthesis, hermes3 for general chat, and dolphin-mixtral for uncensored responses. A prompt enhancement system automatically detects keywords like "think ultra hard" or "verify" and switches to appropriate response modes.
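The tier cascade can be condensed into a few lines: regex prefixes first, then the local classifier with a confidence gate, then escalation. The prefix table and stubbed classifier below are assumptions for illustration:

```python
# Three-tier dispatch sketch: zero-cost regex, then local LLM intent
# classification, then sovereign research as the escalation path.
import re

TIER1 = {
    r"^cal:":  "calendar",
    r"^note:": "notes",
    r"^wish:": "wishlist",
    r"^vid:":  "video",
}

def classify_with_ollama(text: str) -> tuple[str, float]:
    """Stand-in for the local LLM classifier: (handler, confidence)."""
    return ("chat", 0.4)   # pretend the model is unsure

def route(text: str, threshold: float = 0.7) -> str:
    # Tier 1: zero-cost regex prefixes
    for pattern, handler in TIER1.items():
        if re.match(pattern, text):
            return handler
    # Tier 2: local LLM classification with confidence scoring
    handler, confidence = classify_with_ollama(text)
    if confidence >= threshold:
        return handler
    # Tier 3: low confidence escalates to multi-source research
    return "sovereign_research"
```

Most traffic resolves in Tier 1 for free; the expensive tiers only run when the cheap ones fail.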

Full Life Management Suite

7+ integrated systems

Beyond the headline features, Bernard runs a complete life management layer. Google Calendar integration with OAuth2 handles natural language event creation in English and Norwegian ("cal: dentist neste fredag kl 15"), conflict detection, and recurring events. The reminder engine supports priority levels (low/normal/high/urgent), recurrence patterns, snooze with escalation, and quiet mode. The encrypted shopping list tracks who added each item and supports undo. Link intelligence auto-detects URLs, classifies them by domain (shopping, video, article), and routes them to the correct Obsidian folder. Smart home control via Home Assistant handles lights, scenes, and entity queries. Video analysis through yt-dlp supports 1000+ sites with metadata extraction and LLM-powered summarization. The couple awareness module tracks anniversaries, suggests date nights by category, stores gift ideas, and sends countdown reminders for special dates.

How It Works

01

Input Capture

Messages arrive from multiple channels. Telegram delivers text, voice notes, photos, files, URLs, and inline button callbacks. The desktop voice interface listens through a microphone with Faster-Whisper running on CUDA, activated by the wake word "Bernard." Scheduled tasks fire at set intervals for reminders, mood check-ins, and morning/evening Selma briefings. Voice notes and desktop audio are transcribed to text before anything else happens. The sender is verified against a DPAPI-encrypted whitelist. Unknown senders are rejected silently. The raw input, now text, is paired with the user's profile: personality mode, chat history, mood state, and couple context.

02

Three-Tier Routing

The routing engine processes each message through escalating tiers. Tier 1 runs regex patterns against 200+ registered commands: "wish:" routes to the wishlist, "cal:" to Google Calendar, "note:" to the Obsidian vault, /jarvis switches personality, /shadow starts the shadow work workflow, /selma opens the pet care interface. If no pattern matches, Tier 2 sends the message to Ollama running Hermes3 for local LLM intent classification. The model returns a structured classification with target handler, confidence score, and extracted entities. A TTL cache prevents redundant calls. If the classification confidence is below threshold, Tier 3 escalates to sovereign deep research: multi-source web search with contradiction detection and configurable depth levels.

03

Handler Dispatch

The classified intent dispatches to one of 35+ specialized handler modules, all running on asyncio. The shopping list handler manages DPAPI-encrypted JSON with add, remove, undo, and shared-list operations. The video handler uses yt-dlp across 1,000+ sites with FFmpeg post-processing. The calendar handler creates events through Google Calendar OAuth2, parsing natural language dates in both English and Norwegian via parsedatetime. The pet care handler checks the Philips feeder camera, analyzes eating patterns, and makes LLM-powered feeding decisions. The psychology handler manages stateful multi-step workflows (shadow work, dream journaling, therapy prep, IFS parts work) that persist across messages. The Reddit narration handler fetches stories, runs emotion analysis, selects from 10+ ElevenLabs voices, and generates speech. The smart home handler controls lights and scenes through Home Assistant's REST API. Each handler is self-contained and degrades gracefully if its external service is unavailable.
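The dispatch-plus-graceful-degradation pattern can be sketched with two stub handlers, one of which fails the way an offline external service would (handler names and error wording are illustrative):

```python
# Asyncio handler dispatch: a failing external service downgrades to a
# polite message instead of crashing the whole bot.
import asyncio

async def shopping_handler(text: str) -> str:
    return f"added: {text}"

async def smart_home_handler(text: str) -> str:
    raise ConnectionError("Home Assistant unreachable")

HANDLERS = {"shopping": shopping_handler, "smart_home": smart_home_handler}

async def dispatch(intent: str, text: str) -> str:
    handler = HANDLERS.get(intent)
    if handler is None:
        return "I don't have a handler for that yet."
    try:
        return await handler(text)
    except ConnectionError as exc:
        # Degrade gracefully: the bot stays up, only this feature apologizes.
        return f"That service is offline right now ({exc})."

result = asyncio.run(dispatch("smart_home", "lights off"))
```

Isolating each handler behind its own try/except is what makes 35+ integrations survivable: one dead API never takes down the other 34.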

04

Intelligence + Emotional Layer

Before the response reaches the user, multiple intelligence modules process it in parallel. The CPTSD-aware emotional care system monitors for trigger words (flashback, dissociation, panic, overwhelm), analyzes message length as a distress signal, detects late-night anxiety patterns, and runs sentiment analysis. When distress is detected, the system shifts through care levels: Normal, Gentle, or Protective, adjusting language without ever mentioning distress directly. The couple awareness module injects anniversary countdowns, date night suggestions, and gift ideas when contextually relevant. The pattern learner tracks behavioral trends and adjusts future routing. The link intelligence module classifies URLs by domain and generates rich previews. The mood tracker maintains a rolling 1-10 scale with energy levels, time-of-day patterns, and predictive forecasting. The prompt enhancement system detects keywords like "think ultra hard" or "verify" and automatically switches to the appropriate Ollama model and response mode.
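One of those modules, the mood forecast, can be sketched as a recency-weighted rolling average over recent 1-10 scores; the weighting scheme here is an assumption, not Bernard's actual predictor:

```python
# Illustrative mood forecast: weighted rolling average over the last
# week of scores, with the most recent day weighted highest.
def forecast_mood(history: list[int]) -> float:
    """Predict tomorrow's 1-10 mood score from recent history."""
    recent = history[-7:]                        # last week of scores
    weights = range(1, len(recent) + 1)          # 1..n, newest gets n
    total = sum(w * m for w, m in zip(weights, recent))
    return round(total / sum(weights), 1)

prediction = forecast_mood([6, 5, 7, 4, 3, 3, 2])
```

A downward-trending week like this one pulls the forecast low, which is exactly the kind of signal that would nudge the care level toward Gentle before the user says anything.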

05

Multi-Channel Output

Responses route to the appropriate output channel. Telegram text responses use a streaming draft system that sends incremental updates as the LLM generates tokens, with typing indicators throughout. Voice responses go through ElevenLabs with emotion-aware intonation selection, falling back to Microsoft Edge TTS if quota runs out. Psychology workflow entries save directly to the Obsidian vault with proper timestamps, frontmatter, and folder routing. Desktop voice commands trigger app launches, UI clicks via LLaVA vision and pyautogui, or script execution through the actions.json registry, with the HUD overlay displaying status (LISTENING, THINKING, EXECUTING) and a VU meter. Smart home commands translate to Home Assistant REST calls for light scenes, entity control, and feeder dispensing. All responses are logged to per-user chat history (capped at a configurable maximum) with PII redacted through a dedicated filtering pipeline. Sensitive data like tokens and API keys are stripped from rotating logs.
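The streaming-draft pattern amounts to buffering tokens and editing the message only every N tokens, since Telegram rate-limits message edits. A sketch with the actual send/edit calls stubbed (python-telegram-bot would supply the real ones):

```python
# Streaming-draft sketch: accumulate LLM tokens, edit the Telegram
# message periodically, and always finish with one final complete edit.
def stream_to_telegram(tokens: list[str], edit_every: int = 5) -> list[str]:
    """Return the sequence of drafts a user would see as message edits."""
    drafts, buffer = [], ""
    for i, token in enumerate(tokens, start=1):
        buffer += token
        if i % edit_every == 0:
            drafts.append(buffer)   # bot.edit_message_text(buffer, ...)
    if not drafts or drafts[-1] != buffer:
        drafts.append(buffer)       # final edit with the complete text
    return drafts

drafts = stream_to_telegram(list("hello world"))
```

Throttling edits keeps the "live typing" feel without tripping Telegram's flood control on long generations.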

Tech Stack

Runtime

Python 3.10+ with asyncio, nest_asyncio, 26,000+ lines

Chat Interface

Telegram Bot API (polling + Local Bot API port 9000)

LLM Layer

Ollama with 32 models (Hermes3, DeepSeek-R1, Qwen2.5, Dolphin-Mixtral)

Voice

Faster-Whisper (CUDA), ElevenLabs TTS, Edge TTS fallback

Smart Home

Home Assistant REST API, Philips Hue Cloud, Philips Pet Feeder

Calendar

Google Calendar API with OAuth2, parsedatetime NLP

Media

yt-dlp (1000+ sites), FFmpeg, Reddit PRAW, ElevenLabs

Storage

3 SQLite databases, encrypted JSON, Obsidian vault, DPAPI encryption

Desktop

pyautogui, LLaVA vision, tkinter HUD overlay, Elgato Wave:XLR

Security

User whitelist, DPAPI encrypted .env, PII redaction in rotating logs