Building a Fully Local, Air-Gapped Splunk AI Assistant Using LibreChat, Ollama, and Splunk MCP
This guide shows how to build a fully local AI assistant for Splunk-adjacent workflows using LibreChat as the web UI and Ollama as the model backend, with a clear path to Splunk MCP integration later. The design keeps data on-prem, avoids cloud AI services, and is built for repeatable operations in controlled environments.
What you will build
- LibreChat running in Docker on a Raspberry Pi 5 (or similar Linux host)
- Ollama serving local models from a second machine (or the same machine)
- A validated LibreChat custom endpoint that connects to Ollama over HTTP
- A foundation for adding Splunk MCP tools later without cloud dependencies
1) Background and Motivation
Many organizations cannot send operational data to cloud-hosted LLM services. In Splunk environments, that restriction is common in regulated sectors, internal SOC networks, critical infrastructure, and any environment with strict data sovereignty rules.
Air-gapped design means no dependency on external AI APIs at runtime. You control the model host, the UI host, network paths, and logs. That gives you predictable behavior and clearer audit boundaries.
Tool-driven patterns matter here. A model should not have direct unrestricted access to production systems. Instead, it should call controlled tools that enforce identity, permissions, and scope.
MCP (Model Context Protocol) is a controlled tool interface for this pattern. The model does not talk directly to Splunk internals. It requests tool actions, and the tool layer enforces RBAC, tokens, and approved operations. This article focuses on the local UI and local LLM side needed before adding tools. It does not depend on Splunk AI Assistant.
A web UI is also practical: many analysts, responders, and platform users are not SPL experts. A UI lets them ask for help safely while you retain backend control.
2) Architecture Overview
Architecture in plain words:
- Raspberry Pi 5 runs LibreChat in Docker and provides the browser UI.
- A separate machine (Windows, Linux, or macOS) runs Ollama and serves local models over HTTP.
- LibreChat sends prompts to Ollama through a configured custom endpoint.
- Later, a Splunk MCP Server can be added as the tool provider.
LibreChat is the operator-facing interface: authentication, conversation controls, endpoint wiring, and model selection behavior.
Ollama is the local model runtime: it downloads or imports models and serves them via a local API. No cloud model endpoint is required.
Docker is used to package services with predictable dependencies and repeatable startup. You avoid host package drift and keep deployment behavior consistent.
A Pi 5 is usually sufficient for the UI tier because it is handling web/API services, not heavy model inference. Inference load stays on the Ollama host.
Internet is not required for day-to-day operation once components and models are already available in your internal environment.
3) Prerequisites
Minimum requirements:
- Raspberry Pi 5 (or equivalent Linux server) for LibreChat
- SSH access to manage the host
- Docker Engine and Docker Compose plugin installed
- A second machine (or same machine) running Ollama; Windows GPU hosts are a common choice
- Basic terminal familiarity
Required Downloads and References
- LibreChat source repository
- Ollama downloads (Windows, Linux, macOS)
- Docker Engine install docs
- Docker Compose install docs
- Splunk MCP Server app on Splunkbase
- Splunk docs: Configure the Splunk MCP Server
Install Docker and Compose on Raspberry Pi OS
On current Raspberry Pi OS builds, Docker packages are available from apt:
sudo apt update
sudo apt install -y docker.io docker-compose-plugin git curl
sudo systemctl enable --now docker
sudo usermod -aG docker $USER
Log out and back in after group changes, then verify:
docker --version
docker compose version
If docker-compose-plugin is unavailable in your repo mirror, install Docker
using your standard internal package method, then confirm docker compose
works before continuing.
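As a quick sanity check that the daemon is running and your user can reach it without sudo, you can query the server directly. The hello-world test is optional and assumes the image is available locally or through an internal registry mirror in air-gapped environments.
# Confirm the Docker daemon is reachable as your user (no sudo).
docker info --format '{{.ServerVersion}}'
# Optional: run a throwaway test container if the image is available internally.
docker run --rm hello-world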
4) Installing LibreChat
LibreChat source is hosted on GitHub. Clone the repository so you can run the supported Compose workflow and maintain clean local overrides.
git clone https://github.com/danny-avila/LibreChat.git
cd LibreChat
cp .env.example .env
The .env file is a plain text key/value file used by Docker Compose and
LibreChat startup scripts for environment-specific settings.
Do not edit core compose files directly. Keep local changes in override files so updates from upstream remain easy to merge.
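A quick way to confirm the work tree will stay clean for future upstream pulls is to list local modifications. This is a minimal check, and files already covered by the repository's .gitignore (such as .env) will not appear.
# Only your own override and config files should show up here.
git status --short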
5) Understanding LibreChat’s Docker Setup
LibreChat is a multi-service application. You typically see service definitions for the API/backend, web client, MongoDB, and supporting components. This separation makes startup and dependencies explicit.
The backend service is named api. It is the service that reads
librechat.yaml, validates endpoint configuration, and brokers model calls.
Docker Compose supports layered configuration. A base file defines defaults and an override file applies local changes. That is why override files exist.
In plain terms: an override file is a patch. It only contains the changes you need, while the base file remains untouched.
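You can render the merged result before starting anything, which is a useful habit when troubleshooting later. This sketch uses the same file names introduced in the next sections.
# Render the effective configuration after the override is applied.
docker compose -f deploy-compose.yml -f docker-compose.override.yaml config | less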
6) Creating the Docker Override File
Create docker-compose.override.yaml in the LibreChat repository root. It should contain only the local settings required for this deployment.
The target service must be api, because that is where
librechat.yaml is consumed.
services:
  api:
    volumes:
      - ./librechat.yaml:/app/librechat.yaml:ro
    extra_hosts:
      - "host.docker.internal:host-gateway"
Line by line explanation:
- volumes: mounts your host config file into the container so the API can read it.
- :ro: read-only mount. The container can read config but cannot modify it.
- extra_hosts: adds a name-to-IP mapping inside the container.
- host.docker.internal:host-gateway: lets containers reach services on the Docker host in a portable way.
Why localhost fails in containers: inside Docker, localhost
means "this container," not your host and not another machine.
7) Installing and Configuring Ollama
Ollama runs LLM models locally and exposes them through an HTTP API. LibreChat connects to that API endpoint.
The /v1/ path matters because LibreChat expects an OpenAI-compatible API
shape for custom endpoints.
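Once Ollama is installed and serving at least one model (both covered below), you can confirm that shape directly. This is a hedged sketch using the example model name from this guide; substitute whatever model you actually serve.
# List models through the OpenAI-compatible API.
curl http://127.0.0.1:11434/v1/models
# Minimal chat completion through the same compatibility layer.
curl http://127.0.0.1:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-oss:20b", "messages": [{"role": "user", "content": "Say hello"}]}'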
Pick your Ollama host OS
If your GPU is on a Windows gaming machine, use that as the Ollama host and keep LibreChat on the Pi. That split is common and works well.
Windows (recommended when your GPU is on Windows)
- Install Ollama for Windows from the official installer and start the app.
- Open PowerShell and verify the local API:
curl http://127.0.0.1:11434/api/tags
netstat -ano | findstr 11434
Expose Ollama to your LAN so LibreChat can reach it:
setx OLLAMA_HOST "0.0.0.0:11434"
Then fully restart Ollama (quit and reopen). Add a firewall rule for TCP 11434:
New-NetFirewallRule -DisplayName "Ollama 11434" -Direction Inbound -Action Allow -Protocol TCP -LocalPort 11434
Linux
curl -fsSL https://ollama.com/install.sh | sh
sudo systemctl enable --now ollama
Verify listener and API:
ss -ltnp | rg 11434
curl http://127.0.0.1:11434/api/tags
Bind to all interfaces if LibreChat is on another host:
sudo systemctl edit ollama
Add:
[Service]
Environment="OLLAMA_HOST=0.0.0.0:11434"
Then reload and restart:
sudo systemctl daemon-reload
sudo systemctl restart ollama
macOS
Install Ollama for macOS, launch it, then verify:
curl http://127.0.0.1:11434/api/tags
lsof -iTCP:11434 -sTCP:LISTEN
For remote access from LibreChat, set OLLAMA_HOST=0.0.0.0:11434 in the
Ollama runtime environment and restart Ollama.
Expected behavior on any OS: /api/tags returns JSON with a
models list. If the list is empty, pull a model first.
Also allow TCP/11434 in the host firewall between the LibreChat host and Ollama host.
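If /api/tags returns an empty list, pull or import at least one model on the Ollama host first. The model name below matches the librechat.yaml example later in this guide; in fully air-gapped environments, import the model from internal storage instead of pulling from the internet.
# On the Ollama host: make a model available, then confirm it is served.
ollama pull gpt-oss:20b
ollama list
curl http://127.0.0.1:11434/api/tags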
8) Creating librechat.yaml
LibreChat requires a YAML configuration to define endpoints and model behavior. The API validates this file at startup using a schema.
Schema validation errors often mention Zod. In plain language, that means a required field is missing or has the wrong type.
Required sections include:
- version
- endpoints
- models.default for your custom endpoint definition
Working example:
version: "1.0.0"
endpoints:
custom:
- name: "Ollama"
apiKey: "ollama"
baseURL: "http://192.168.169.173:11434/v1/"
models:
fetch: true
default:
- "gpt-oss:20b"
titleModel: "current_model"
summarize: false
modelDisplayLabel: "Ollama" Field explanation:
version: config schema version expected by LibreChat.endpoints.custom: list of non-default providers.name: provider name shown in UI.apiKey: required field for provider shape; local placeholder is fine.baseURL: Ollama endpoint, including trailing/v1/.models.fetch: query provider for available models dynamically.models.default: startup/default model list for the provider.titleModel: title behavior in conversations.summarize: whether to run summary behavior for this endpoint.modelDisplayLabel: friendly label shown in model selection UI.
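Because startup validation is strict, it helps to confirm the file is at least syntactically valid YAML before bringing the stack up. This is a minimal sketch that assumes Python 3 with PyYAML is available on the Pi; any YAML linter works equally well.
# Basic syntax check only; schema (Zod) validation still happens at API startup.
python3 -c "import yaml; yaml.safe_load(open('librechat.yaml')); print('YAML syntax OK')"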
9) Bringing LibreChat Online
Use deploy-compose.yml as the base runtime compose file, then add your
override file explicitly.
docker compose -f deploy-compose.yml -f docker-compose.override.yaml up -d
docker compose -f deploy-compose.yml -f docker-compose.override.yaml logs -f api
Why specify files explicitly: it avoids ambiguity and guarantees the same startup path each time.
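If you prefer not to retype both -f flags, Docker Compose also honors the COMPOSE_FILE environment variable; this optional convenience sketch pins the same file set for the current shell session.
# Colon-separated on Linux; the explicit -f form above remains equivalent.
export COMPOSE_FILE=deploy-compose.yml:docker-compose.override.yaml
docker compose up -d
docker compose logs -f api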
Common errors and fixes
- service has no image: you likely used the wrong compose file set. Use deploy-compose.yml with your override.
- invalid configuration: YAML syntax error or wrong indentation in librechat.yaml.
- missing fields: required keys like version, endpoints, or models.default are absent.
Logs are the source of truth for startup failures. Check API logs first whenever the UI loads but model calls fail.
10) Verifying Ollama Connectivity
Test from inside the api container, not just from the host shell. This
confirms real container-to-Ollama network reachability.
docker compose -f deploy-compose.yml -f docker-compose.override.yaml exec api curl http://OLLAMA_IP:11434/api/tags
Expected output is JSON containing a models array. If this request fails,
check IP, port, firewall, and Ollama binding.
If Ollama runs on Windows, use the Windows host LAN IP (for example from ipconfig) in both this test command and baseURL. Do not use localhost in baseURL: inside the container it points at the container itself. If Ollama runs on the same machine as LibreChat, use host.docker.internal (mapped in the override file) or the host's LAN IP instead.
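If the curl from inside the container fails, it helps to separate raw TCP reachability from application-level issues. This sketch runs from the Pi host shell and assumes netcat is installed; OLLAMA_IP is a placeholder for your Ollama host address.
# Test raw TCP reachability to the Ollama port from the LibreChat host.
nc -vz OLLAMA_IP 11434
# Success here but a failing /api/tags points at the Ollama service itself;
# failure here points at IP, routing, firewall, or OLLAMA_HOST binding.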
11) Selecting Models in LibreChat
Models are governed by backend configuration and provider state. They are not freely added from the GUI because endpoint admins control what is available.
With fetch: true, LibreChat asks Ollama for all available models and shows
them in selection lists.
Governance implication: if certain models are disallowed, remove them at the Ollama layer. That is safer than relying on UI-only restrictions.
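Removing a disallowed model at the Ollama layer is a one-line operation on the Ollama host; the model name below is a hypothetical example.
# List locally available models, then remove any that are not approved.
ollama list
ollama rm unapproved-model:latest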
12) Security and Air-Gapped Considerations
- No cloud AI dependency is required for inference.
- Data stays inside your controlled network boundary.
- Docker provides process and filesystem isolation per service.
- Future MCP integration can enforce token-scoped tool access.
- Architecture aligns with least privilege and auditability goals common in Splunk operations.
This is not "secure by default" without operations discipline, but it is a strong base for controlled deployment in restricted environments.
13) What This Enables Next
Once the local UI and model path is stable, you can extend the platform with:
- Splunk MCP Server integration for tool-based query and action workflows
- Constrained tool execution tied to Splunk roles and service accounts
- RAG patterns for SPL guidance grounded in approved internal docs
- Role-based endpoint access and model policies
- Audit logs across UI requests, tool calls, and backend actions
Related guides: Splunk Homelab Guide, Splunk FIPS Compatibility Guide, and GoSplunk Utilities.
14) Summary
You built a fully local assistant stack with LibreChat as the front-end control plane and Ollama as the on-prem model runtime. The components communicate over explicit network paths and validated configuration, with no dependency on cloud AI APIs.
This approach works because each layer has a clear responsibility: UI mediation, model serving, and future tool governance through MCP. It scales by separating UI and inference hosts and by enforcing policy at the endpoint and model layers.
For organizations that need deterministic behavior, stronger audit boundaries, and local data control, this architecture is a practical and safer foundation than sending Splunk context to external AI services.