DST-Ollama-Bridge
A drop-in replacement for the FAtiMA decision engine that gives Don't Starve Together AI companions genuine reasoning, persistent memory, and a speaking personality by routing game state through local large language models.
In Plain English
In the game Don't Starve Together, you can have AI companions that follow you around and help you survive. Normally they make decisions using fixed rules, but this project replaces that brain with an actual language model (the same type of AI behind ChatGPT) running on your own computer through Ollama (a tool for running AI locally). The AI can now reason about what it sees, remember what hurt it before, and even talk to you with a unique personality and computer-generated voice.
Problem
The DST-AICompanion mod described in the previous project page uses FAtiMA, a rule-based cognitive architecture, to make decisions for the AI companion. FAtiMA works by matching incoming perception data against predefined appraisal rules, generating emotions, and selecting actions from a fixed decision table. This approach produces consistent, predictable behavior, which is ideal for controlled academic experiments. But it has fundamental limitations when it comes to creating a companion that feels like a real partner.
A rule-based system cannot adapt to situations its designer did not anticipate. If the companion encounters a combination of conditions that none of the authored rules cover, it defaults to wandering aimlessly. It cannot reason about trade-offs ("I need logs for a campfire, but there are spiders near the trees, and I only have 30 health"). It has no memory of past sessions, so every game starts from a blank slate. And it certainly cannot hold a conversation or explain why it just ran away from a fight. The companion's behavior, while functional, feels robotic because it literally is: a lookup table mapping perceptions to canned responses.
The Ollama Bridge solves this by replacing FAtiMA's decision engine with local LLM inference while preserving the exact HTTP protocol that the game mod expects. From the mod's perspective, nothing changes. It still sends perception data to localhost:8080 and receives action decisions in the FAtiMA Well-Formed Name format. But behind that familiar interface, a language model is reading a natural-language description of the survival situation, reasoning about priorities, drawing on a persistent memory of past experiences, and generating both strategic decisions and in-character dialogue. The companion goes from being a state machine to being something that can genuinely surprise you.
Architecture
The bridge server sits between the DST game mod and local LLM inference, translating raw game perceptions into natural language prompts, routing them through one or two language models, parsing the responses back into the FAtiMA action protocol, and maintaining persistent memory across sessions.
Features
Dual LLM Architecture
Two models running in parallel
The bridge supports a dual-model configuration where a primary LLM handles strategic survival decisions (what to chop, when to flee, what to craft) while a secondary model, typically something creative like Dolphin Mixtral, handles personality-driven dialogue generation. Each model gets tuned parameters based on the current context: combat situations receive lower temperature and fewer tokens for quick decisive responses, while idle conversation moments get higher temperature and more creative freedom. The primary model runs on a 5-second decision timeout that ensures the game never stalls waiting for inference, with a WANDER fallback if the timeout expires. This separation means the strategic brain can be a small, fast model optimized for instruction-following while the personality engine can be a larger, more expressive model that takes a bit longer to generate interesting dialogue.
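A minimal sketch of how the timeout guard and per-context tuning can fit together. The `CONTEXT_PARAMS` names are illustrative, and `query_ollama` is the inference helper sketched under LLM Reasoning and Decision below:

```python
import asyncio

DECISION_TIMEOUT = 5.0  # the mod expects an answer within ~5 s

# Illustrative per-context generation settings (names are assumptions)
CONTEXT_PARAMS = {
    "combat": {"temperature": 0.3, "num_predict": 64},   # quick, decisive
    "idle":   {"temperature": 0.9, "num_predict": 200},  # creative dialogue
}

async def decide_with_fallback(prompt: str, context: str) -> str:
    """Query the strategic model, falling back to WANDER if inference runs long."""
    params = CONTEXT_PARAMS.get(context, CONTEXT_PARAMS["idle"])
    try:
        return await asyncio.wait_for(query_ollama(prompt, **params),
                                      timeout=DECISION_TIMEOUT)
    except asyncio.TimeoutError:
        return "ACTION: WANDER"  # never leave the game stalled on inference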
Persistent Memory System
JSON episodic and learned stores per character
Unlike the original FAtiMA system that starts from scratch every session, the Ollama Bridge maintains persistent memory across game sessions for each AI character (identified by GUID). The memory system has three tiers: short-term memory (a rolling buffer of the 50 most recent events), episodic memory (up to 100 significant events like deaths, combat encounters, and successful crafts that persist to disk), and learned knowledge (a growing dictionary of successful actions, failed actions, dangerous entities, safe entities, resource locations, crafted items, and death causes). When the prompt builder constructs the next decision prompt, it injects relevant memory context so the AI can reference past experiences. If the companion was killed by a spider last session, it knows to avoid spiders or bring a weapon this time. All memory data serializes to JSON files in a per-character directory and loads automatically when that character reconnects.
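A condensed sketch of what such a per-character store might look like. The field names mirror the three tiers described above, but the exact schema and file layout are assumptions:

```python
import json
from collections import deque
from pathlib import Path

class CharacterMemory:
    """Illustrative three-tier memory store, persisted per character GUID."""

    def __init__(self, guid: str, root: Path = Path("memory")):
        self.path = root / guid / "memory.json"
        self.short_term = deque(maxlen=50)   # rolling buffer, not persisted
        self.episodic = []                   # capped at 100 significant events
        self.learned = {
            "successful_actions": {}, "failed_actions": {},
            "dangerous_entities": [], "safe_entities": [],
            "resource_locations": {}, "crafted_items": {},
            "death_causes": [],
        }
        if self.path.exists():               # reload when the character reconnects
            data = json.loads(self.path.read_text())
            self.episodic = data.get("episodic", [])
            self.learned = data.get("learned", self.learned)

    def save(self) -> None:
        """Persist the episodic and learned tiers to disk as JSON."""
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(
            {"episodic": self.episodic[-100:], "learned": self.learned}, indent=2))
```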
FAtiMA Protocol Compatibility
Zero mod changes needed
The bridge implements every endpoint the DST-AICompanion mod expects, down to the exact JSON response format. The perceptions endpoint (POST /{guid}/perceptions) accepts the full game state payload every 0.5 seconds. The decide endpoint (GET /{guid}/decide/) returns action decisions in FAtiMA's Well-Formed Name format: Action(CHOP, -, -, -, -) for chopping a tree, Action(BUILD, -, -, -, axe) for crafting an axe, Speak(cs, ns, m, sty, 'Hello!') for speech. The events endpoint (POST /{guid}/events) receives action completion and property change notifications. The game mod genuinely cannot distinguish between the original FAtiMA C# server and this Python bridge, which means the bridge works as a true drop-in replacement without any modifications to the Lua mod code.
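A skeleton of the three routes, assuming FastAPI (which the bridge uses). The decide handler returns a fixed WANDER decision here purely to keep the example self-contained; the real bridge runs prompt construction and inference at that point:

```python
from fastapi import FastAPI, Request

app = FastAPI()
world: dict[str, dict] = {}  # latest perception snapshot per character GUID

@app.post("/{guid}/perceptions")
async def perceptions(guid: str, req: Request):
    world[guid] = await req.json()  # full game state, every 0.5 seconds
    return {"status": "ok"}

@app.get("/{guid}/decide/")
async def decide(guid: str):
    # Placeholder decision in the WFN-style JSON shape described above
    return {"Type": "Action", "Name": "WANDER",
            "WFN": "Action(WANDER, -, -, -, -)", "Action": "WANDER",
            "Target": "-", "InvObject": "-", "PosX": 0, "PosZ": 0, "Recipe": "-"}

@app.post("/{guid}/events")
async def events(guid: str, req: Request):
    event = await req.json()  # Action-End, Property-Change, Delete-Entity
    return {"status": "ok"}
```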
Voice Synthesis and Anti-Repetition
Edge TTS neural voices + similarity detection
An integrated Edge TTS service converts the companion's generated dialogue into spoken audio using Microsoft's neural voice synthesis. The companion does not just display text in the game chat; it actually speaks to you. The voice synthesis runs asynchronously so it never blocks the decision pipeline. Complementing this, an anti-repetition system tracks the last 15 responses per character and uses word-overlap similarity detection (with a configurable threshold, defaulting to 60%) to reject generated lines that are too similar to recent output. If the LLM produces a repetitive response, the system re-prompts with explicit instructions listing the recent responses to avoid, ensuring the companion's dialogue stays fresh and varied even during long play sessions.
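The exact similarity metric is not spelled out here, so the sketch below uses a plausible Jaccard-style word overlap checked against the last 15 responses:

```python
def word_overlap(a: str, b: str) -> float:
    """Word-set overlap between two responses, from 0.0 to 1.0."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def is_repetitive(candidate: str, recent: list[str], threshold: float = 0.6) -> bool:
    """True if the candidate is too close to any of the last 15 lines."""
    return any(word_overlap(candidate, r) >= threshold for r in recent[-15:])
```

When `is_repetitive` fires, the system re-prompts with the offending recent lines listed explicitly in the instructions, as described above.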
How It Works
Perception Ingestion
Every 0.5 seconds, the DST mod sends the complete game state over HTTP to the bridge's perceptions endpoint. The payload contains everything the companion can see and know: health, hunger, sanity, and temperature as numeric values; a boolean array of environmental conditions (freezing, overheating, raining); a list of up to 20 nearby entities, each tagged with over a dozen interaction properties (IsChoppable, IsPickable, IsEdible, IsAttackable, and so on); the full inventory with per-item properties; currently equipped items; position coordinates; and whether the human player is nearby.
The bridge server stores this as the current world snapshot, indexed by the character's GUID, and makes it available to the prompt builder. It also feeds the perception data into the memory system, which tracks property changes (health dropping means combat or starvation) and entity deletions (a tree disappearing means it was successfully chopped). The conversation history manager also receives the perception to maintain awareness of what the companion was doing when it last spoke, enabling contextually relevant follow-up dialogue.
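For illustration, a trimmed Pydantic model of that payload might look like the following; the field names are assumptions based on the description above, not the mod's exact schema:

```python
from pydantic import BaseModel

class Entity(BaseModel):
    """One nearby entity; a few of the dozen-plus interaction tags."""
    Prefab: str
    GUID: int
    IsChoppable: bool = False
    IsPickable: bool = False
    IsEdible: bool = False
    IsAttackable: bool = False

class Perception(BaseModel):
    """Trimmed sketch of the 0.5-second game-state payload."""
    Health: float
    Hunger: float
    Sanity: float
    Temperature: float
    IsFreezing: bool = False
    IsOverheating: bool = False
    IsRaining: bool = False
    Vision: list[Entity] = []  # up to 20 nearby entities
    PosX: float = 0.0
    PosZ: float = 0.0
```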
Prompt Construction
The prompt builder is the most critical translation layer in the system. It takes the raw JSON perception data and converts it into a compact natural language description that an LLM can reason about. Nearby entities are categorized into pickables, choppables, mineables, hostiles, containers, and food sources using the game knowledge base, which maps internal prefab names (like "evergreen") to readable names (like "Pine Tree"). The builder generates urgent survival warnings when stats drop below thresholds: "STARVING!" when hunger drops below 40, "FREEZING!" when the temperature flag is set, "LOW HEALTH!" below 30 HP.
The prompt also enumerates which crafting recipes are currently available based on the companion's inventory. The game knowledge module contains 77 recipes with ingredient lists and priority ratings, so the builder can tell the LLM "You can craft: axe (1 twig, 1 flint), torch (2 twigs, 2 grass)." Memory context is appended if available: past deaths, dangerous creatures, successful strategies. The complete prompt fits within a tight token budget because the DST mod expects a response within 5 seconds, and every unnecessary token is inference time the player feels as latency.
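A compressed sketch of the builder's core logic, using the thresholds described above (the perception field names are assumptions):

```python
def build_prompt(p: dict, recipes: list[str], memory_notes: list[str]) -> str:
    """Compact survival prompt; every token counts against the 5-second budget."""
    warnings = []
    if p["Hunger"] < 40:
        warnings.append("STARVING!")
    if p.get("IsFreezing"):
        warnings.append("FREEZING!")
    if p["Health"] < 30:
        warnings.append("LOW HEALTH!")
    lines = [
        f"Health {p['Health']}, Hunger {p['Hunger']}, Sanity {p['Sanity']}.",
        " ".join(warnings),
        "You can craft: " + ", ".join(recipes) if recipes else "",
        "Remember: " + "; ".join(memory_notes) if memory_notes else "",
        "Respond with one action, e.g. 'ACTION: BUILD axe'.",
    ]
    return "\n".join(line for line in lines if line)
```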
LLM Reasoning and Decision
The constructed prompt is sent to Ollama's local API (port 11434) for inference. The LLM reads the survival situation description and selects from the valid action set: PICK (harvest a plant), CHOP (cut a tree), BUILD (craft a recipe), ATTACK (fight an enemy), STORE (deposit items in a container), TRAVEL (explore new areas), or WANDER (idle movement). The model must respond with an action keyword and an optional parameter, like "ACTION: BUILD axe" or "ACTION: PICK" for the nearest pickable resource.
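A minimal non-streaming call against Ollama's generate endpoint might look like this; the model name and parameter defaults are illustrative:

```python
import httpx

OLLAMA_URL = "http://localhost:11434/api/generate"

async def query_ollama(prompt: str, model: str = "qwen2.5",
                       temperature: float = 0.7, num_predict: int = 128) -> str:
    """One completion against the local Ollama API."""
    async with httpx.AsyncClient(timeout=10.0) as client:
        resp = await client.post(OLLAMA_URL, json={
            "model": model,
            "prompt": prompt,
            "stream": False,
            "options": {"temperature": temperature, "num_predict": num_predict},
        })
        resp.raise_for_status()
        return resp.json()["response"]  # e.g. "ACTION: BUILD axe"
```

In dual-LLM mode, two such calls (strategy and dialogue, against different models) can run concurrently with `asyncio.gather`.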
In dual-LLM mode, the primary model handles this strategic decision while the secondary model simultaneously generates in-character dialogue. The secondary model receives a personality prompt (loaded from a text file that defines the companion's name, backstory, speech patterns, and emotional tendencies) along with the game context and the conversation history of the last 20 exchanges. Its temperature and token limits are adjusted per situation type: lower temperature during combat for terse, focused remarks; higher temperature during peaceful moments for creative, personality-rich conversation. The async chat response system handles the mod's 5-second cURL timeout by returning immediate acknowledgments and letting the companion poll for the completed response.
Response Parsing and Protocol Translation
The action parser extracts the chosen action from the LLM's free-text output using regex matching. It handles multiple response formats: "ACTION: CHOP", "CHOP", or even conversational phrasing that happens to contain the action keyword. A comprehensive synonym mapping table converts natural language verbs into the exact action names the mod supports: GATHER, HARVEST, and COLLECT all map to PICK; CRAFT, MAKE, and CREATE map to BUILD; HIT, FIGHT, and KILL map to ATTACK; EXPLORE maps to TRAVEL; and various idle verbs (WAIT, IDLE, NOTHING) map to WANDER.
After synonym resolution, the parser validates that the resulting action exists in the VALID_ACTIONS dictionary. If parsing fails entirely, the fallback is always WANDER, ensuring the companion never gets stuck. The validated action is then packaged into FAtiMA's Well-Formed Name format, constructing the exact JSON structure the mod expects: Type, Target, Name, WFN, Action, InvObject, PosX, PosZ, and Recipe fields. For BUILD actions, the recipe name is extracted and placed in the Recipe field. For speech, the Speak WFN format wraps the generated dialogue. This translation layer is what makes the bridge invisible to the mod.
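Condensing that pipeline into a sketch: keyword search with synonym resolution, WANDER as the universal fallback, and WFN packaging. The synonym table and action set come straight from the description above; the regex and parameter handling are simplified assumptions:

```python
import re

SYNONYMS = {
    "GATHER": "PICK", "HARVEST": "PICK", "COLLECT": "PICK",
    "CRAFT": "BUILD", "MAKE": "BUILD", "CREATE": "BUILD",
    "HIT": "ATTACK", "FIGHT": "ATTACK", "KILL": "ATTACK",
    "EXPLORE": "TRAVEL", "WAIT": "WANDER", "IDLE": "WANDER", "NOTHING": "WANDER",
}
VALID_ACTIONS = {"PICK", "CHOP", "BUILD", "ATTACK", "STORE", "TRAVEL", "WANDER"}

# Match any known verb anywhere in the output, with an optional trailing parameter
KEYWORDS = "|".join(sorted(VALID_ACTIONS | set(SYNONYMS)))
PATTERN = re.compile(rf"\b({KEYWORDS})\b(?:\s+(\w+))?", re.IGNORECASE)

def parse_action(text: str) -> tuple[str, str | None]:
    """Extract (action, parameter) from free-text output; WANDER if nothing parses."""
    m = PATTERN.search(text)
    if not m:
        return "WANDER", None
    action = SYNONYMS.get(m.group(1).upper(), m.group(1).upper())
    # Naive parameter grab (e.g. a recipe after BUILD); the real parser validates it
    param = m.group(2).lower() if m.group(2) else None
    return action, param

def to_wfn(action: str, recipe: str | None = None) -> dict:
    """Package a validated action as the WFN-style JSON structure the mod reads."""
    return {"Type": "Action", "Name": action, "Target": "-", "InvObject": "-",
            "WFN": f"Action({action}, -, -, -, {recipe or '-'})",
            "Action": action, "PosX": 0, "PosZ": 0, "Recipe": recipe or "-"}
```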
Execution, Learning, and Memory Persistence
The mod executes the action in-game and reports the outcome back through the events endpoint. An Action-End event indicates whether the action succeeded (the tree was chopped, the item was picked up) or failed (the target was out of range, the companion was interrupted by combat). Property-Change events notify the bridge when vital stats change. Delete-Entity events signal that an entity was consumed or destroyed. The memory system processes all of these events to update its learned knowledge.
When an action succeeds, its count increments in the successful_actions dictionary. When it fails, the failed_actions counter increments and the context is logged. If the companion takes damage from a creature, that creature's prefab is added to the dangerous_entities set. If the companion dies, the death cause and circumstances are recorded in the death_causes list. Crafting completions are tracked in crafted_items. Resource locations are stored with coordinates so the companion can navigate back to known resource areas. All of this data saves to a JSON file on disk when the session ends and loads automatically when the character reconnects, giving the companion cumulative knowledge that improves its survival performance over multiple play sessions.
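A sketch of how one event handler might fold outcomes into the learned-knowledge dictionary; the event field names (`Success`, `Delta`, `Source`) are assumptions:

```python
def process_event(learned: dict, event: dict) -> None:
    """Fold one mod event into learned knowledge."""
    kind = event.get("Type")
    if kind == "Action-End":
        bucket = "successful_actions" if event.get("Success") else "failed_actions"
        name = event.get("Action", "UNKNOWN")
        learned[bucket][name] = learned[bucket].get(name, 0) + 1
    elif kind == "Property-Change" and event.get("Property") == "Health":
        # Health dropped with a known attacker: remember the creature as dangerous
        if event.get("Delta", 0) < 0 and event.get("Source"):
            if event["Source"] not in learned["dangerous_entities"]:
                learned["dangerous_entities"].append(event["Source"])
```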
Tech Stack
Server
Python 3.10+, FastAPI, Uvicorn, async request handling with httpx
Inference
Ollama for local LLM hosting (qwen2.5, llama3, mistral, dolphin-mixtral)
Game Integration
Full FAtiMA HTTP protocol implementation (perceptions, decide, events, speak)
Voice
Edge TTS with Microsoft neural voices for async text-to-speech synthesis
Persistence
JSON file-based memory per character GUID with episodic, learned, and session stores
Data Validation
Pydantic models for request/response validation, python-dotenv for configuration