OMEGA Core runs as a headless server on the local network. No desktop, no GUI, no persistent terminal session. When you want to know whether Ollama is up, whether Qdrant has started, whether the API container has crashed and restarted — you need a way to ask that question remotely, reliably, and without SSHing in every time.

This is a problem that sounds trivial until you start solving it.

The First Instinct: State Files

The first idea was simple. Write the Docker state to a file. Something like:

{
  "ollama": "running",
  "postgres": "running",
  "qdrant": "stopped"
}

A script runs on the server, queries Docker, writes the file. You read the file from wherever. Done.

Except it isn’t done, because immediately you have to answer a set of uncomfortable questions.

How often does the script run? Every 30 seconds? That means your state is up to 30 seconds stale. A service can crash and restart within that window and you’d never see it. Every 5 seconds? Now you have a polling loop running constantly on the host, and you have to make sure it doesn’t die silently.

Where does the file live? On the host filesystem — which means anything that wants to read it needs either SSH access or a mounted volume. Neither is convenient if you’re querying from a Mac, a script, or a dashboard.

What format? JSON is fine until you want to add more fields — uptime, image version, restart count, port bindings. Now you’re designing a schema and maintaining it across the script that writes it and whatever reads it.

And what happens when you read the file while the script is still writing it? You get a half-written JSON blob and an unhelpful parse error.
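The usual workaround for the mid-write problem is to write to a temporary file and rename it over the target, which is one more detail the script has to get right. A minimal sketch, assuming a hypothetical write_state helper and a state file path of your choosing (neither is part of OMEGA Core):

```python
import json
import os
import tempfile


def write_state(state: dict, path: str) -> None:
    """Write state as JSON so a reader never sees a half-written file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # os.replace swaps the file in one step: readers get the old or the new version.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This removes the corrupt-read case but does nothing for staleness: the file is still only as fresh as the last run of the script.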

State files are not a bad idea for very simple cases. For anything with more than two or three services, any need for real-time accuracy, or any requirement to query from more than one place — they become a maintenance burden faster than they become useful.

What You Actually Want

What you actually want is something you can ask: what is the current state of the platform, right now?

The operative word is ask. Not read a cached file. Ask — and get an answer that reflects the current moment.

That means a service. Something running on the host that can query Docker and respond to requests over the network. HTTP is the obvious interface — it’s queryable from a browser, a script, a dashboard, a monitoring system, anything.

The question is how lightweight you can make it.

Enter FastAPI

FastAPI is a Python web framework that is genuinely fast to write and genuinely fast to run. A minimal application that serves a single JSON endpoint can be written in under twenty lines and deployed in a Docker container with a standard Python base image.

The Evidence API started exactly that small:

from fastapi import FastAPI
import docker

app = FastAPI()
client = docker.from_env()

@app.get("/docker")
def docker_state():
    # all=True includes stopped and exited containers, not just running ones.
    containers = client.containers.list(all=True)
    return [
        {
            "name": c.name,
            "status": c.status,
            # Fall back to the image's short ID when the image has no tags.
            "image": c.image.tags[0] if c.image.tags else c.image.short_id,
        }
        for c in containers
    ]

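That’s it. That’s the core. The Docker Python SDK queries the socket, the API serialises the result as JSON, and FastAPI handles the HTTP layer. (When the API itself runs in a container, it needs access to the host’s Docker socket, typically by mounting /var/run/docker.sock.) No polling loop. No state file. No staleness. Every request returns the current state at the moment you asked.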

From any machine on the network:

curl http://10.10.1.94:8000/docker

You get back a JSON array of every container, its current status, and its image. Running, exited, restarting — whatever Docker knows, you know.
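The same question can be asked from a script. The sketch below uses only the standard library and assumes the response shape shown above (a list of objects with name, status and image keys). The helper names fetch_containers and not_running are illustrative, not part of the Evidence API.

```python
import json
from urllib.request import urlopen

# Address taken from the curl example; adjust for your own network.
DOCKER_URL = "http://10.10.1.94:8000/docker"


def fetch_containers(url: str = DOCKER_URL, timeout: float = 5.0) -> list[dict]:
    """GET the /docker endpoint and parse the JSON array it returns."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def not_running(containers: list[dict]) -> list[str]:
    """Names of containers whose status is anything other than 'running'."""
    return [c["name"] for c in containers if c["status"] != "running"]
```

Calling not_running(fetch_containers()) from any machine on the network gives the names of containers that need attention, with no SSH session involved.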

Growing the API

Once the pattern existed, it was cheap to extend it.

A /health endpoint that checks each service by name and returns a structured health summary. A /models endpoint that queries the Ollama API and returns the list of loaded models with their sizes and families. A /system endpoint that returns CPU, memory and disk utilisation from the host.
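As an illustration of the /health idea, here is a minimal sketch of the summary logic on its own, kept free of Docker and FastAPI so it can be run anywhere. The function name summarise_health, the list of expected services, and the response shape are all assumptions, not the Evidence API's actual schema.

```python
def summarise_health(statuses: dict[str, str], expected: list[str]) -> dict:
    """Build a health summary from container statuses.

    statuses maps container name to Docker status, e.g. {"ollama": "running"}.
    expected lists the services the platform needs (a hypothetical list).
    """
    services = {
        name: {
            # A service Docker does not know about at all is reported as "missing".
            "status": statuses.get(name, "missing"),
            "healthy": statuses.get(name) == "running",
        }
        for name in expected
    }
    return {
        # The platform is healthy only if every expected service is running.
        "healthy": all(s["healthy"] for s in services.values()),
        "services": services,
    }
```

In a route handler, the statuses mapping could be built from the same call the /docker endpoint uses, for example {c.name: c.status for c in client.containers.list(all=True)}.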

Each endpoint is the same pattern: query something, return JSON. The HTTP interface means anything can consume it — including Grafana.

Which is where the next problem appeared.

The Observability Gap

A health endpoint tells you the state now. It doesn’t tell you what happened at 3am, how long a service was down before it recovered, or whether memory utilisation has been climbing steadily for the past week.

For that you need time-series data. You need something that writes state to a database continuously, not just when you ask.

The solution was two background loops added to the FastAPI application using its lifespan context manager — a 30-second cycle that writes service health and system metrics to InfluxDB, and a 5-minute cycle that writes model inventory. The loops start when the application starts and stop cleanly when it shuts down.
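The shape of that arrangement can be sketched with asyncio alone. The two collector functions below are empty placeholders and run_every is a hypothetical helper. The real collectors and the InfluxDB writes are not shown in this article, so treat this as one plausible structure, not the actual implementation.

```python
import asyncio
from contextlib import asynccontextmanager, suppress


async def run_every(seconds: float, job) -> None:
    """Call job() forever, sleeping between runs; one failure does not end the loop."""
    while True:
        try:
            await job()
        except Exception as exc:  # log and try again on the next cycle
            print(f"background job failed: {exc!r}")
        await asyncio.sleep(seconds)


async def write_health_metrics() -> None:
    """Placeholder: collect service health and system metrics, write to InfluxDB."""


async def write_model_inventory() -> None:
    """Placeholder: collect the model inventory, write to InfluxDB."""


@asynccontextmanager
async def lifespan(app):
    # Startup: launch both loops as background tasks.
    tasks = [
        asyncio.create_task(run_every(30, write_health_metrics)),
        asyncio.create_task(run_every(300, write_model_inventory)),
    ]
    try:
        yield
    finally:
        # Shutdown: cancel the loops and wait for them to finish unwinding.
        for task in tasks:
            task.cancel()
        for task in tasks:
            with suppress(asyncio.CancelledError):
                await task

# Wiring, assuming FastAPI is installed: app = FastAPI(lifespan=lifespan)
```

If the collectors make blocking calls (the Docker SDK used earlier is synchronous), running them through asyncio.to_thread would keep the loops from stalling request handling.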

Grafana reads from InfluxDB with a read-only token. The API writes with a separate, scoped read-write token. Grafana cannot write anything, and the API’s token reaches no further than its scope allows.

The state file idea would never have got there. A single HTTP API that started with twenty lines of Python now does real-time querying, background metric collection, and feeds a live dashboard — all from the same container, with the same deployment, without any external scheduler or cron dependency.

What Landed

The Evidence API runs as a Docker container in its own Compose stack, on the omega_internal network, bound to the LAN interface at 10.10.1.94:8000. It starts with the platform, stops cleanly, and is the single source of truth for platform state.

If you want to know what’s running on OMEGA Core, you ask it.

curl http://10.10.1.94:8000/health | jq .

That question has an answer. It’s accurate. It’s immediate. And it required considerably less engineering than a state file that went stale every thirty seconds.