The control plane and host agent for a multi-host, Firecracker-microVM Minecraft hosting platform. A Gin HTTP API owns users, auth, and game-server desired state; a reconciler drives each server onto a fleet host; and a per-host agent boots it as a real Firecracker microVM. See docs/PLAN.md for the phased roadmap (P0–P4 landed; P6 networking in progress).
- Go 1.26+
- PostgreSQL 13+
- For the real VM backend (
AGENT_RUNTIME=firecracker): a Linux host with/dev/kvm, thefirecrackerbinary, avmlinuxkernel, and rootfs images. - For (re)building the eBPF dataplane: clang + libbpf + bpftool on a Linux ≥6.6 host (see docs/ebpf-nat-dataplane.md).
cp .env.example .env # set DATABASE_URL, JWT_SECRET, etc.
# Spin up Postgres (example via Docker):
docker run -d --name craftling-pg -e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=craftling -p 5432:5432 postgres:16-alpine
make run # or: go run ./cmd/serverThe server listens on :8080 by default and applies the embedded goose
migrations (internal/db/migrations) automatically on startup — clean on both
a fresh database and one already at an older revision.
| Variable | Default | Description |
|---|---|---|
PORT |
8080 |
HTTP listen port |
APP_ENV |
development |
production → release mode + JSON logs |
DATABASE_URL |
postgres://postgres:postgres@localhost:5432/craftling?sslmode=disable |
Postgres connection string |
JWT_SECRET |
dev-secret-change-me |
HMAC signing secret (set in prod) |
ACCESS_TTL |
15m |
Access-token (JWT) lifetime |
REFRESH_TTL |
720h (30d) |
Refresh-token lifetime |
ADMIN_EMAIL |
(unset) | If set with ADMIN_PASSWORD, seeds/promotes this admin on startup |
ADMIN_PASSWORD |
(unset) | Admin bootstrap password |
TEMPLATE_INDEX_URL |
(unset) | Marketplace registry index the /templates API fetches from |
| Variable | Default | Description |
|---|---|---|
CONTROL_PLANE_URL |
http://localhost:8080 |
Control plane the agent registers + heartbeats with |
ADVERTISE_ADDR |
localhost:9000 |
Agent API address the control plane calls back |
ADVERTISE_HOST |
127.0.0.1 |
Player-facing connect host VMs report |
AGENT_RUNTIME |
fake |
VM backend: fake (in-memory) or firecracker (real microVMs) |
FC_BINARY |
firecracker |
Firecracker executable (PATH lookup by default) |
FC_KERNEL |
(required) | Uncompressed vmlinux all VMs boot |
FC_IMAGE_DIR |
(required) | Directory of per-version minecraft-<version>.ext4 rootfs images |
FC_DEFAULT_IMAGE |
(unset) | Fallback rootfs filename when a version has no image |
FC_WORK_DIR |
OS temp dir | Per-VM working dirs (sockets, writable rootfs, logs) |
AGENT_ID, AGENT_HOSTNAME, ZONE, CPUS_TOTAL, MEMORY_MB_TOTAL, and
AGENT_VERSION further describe the host in its registration.
| Method | Path | Auth | Description |
|---|---|---|---|
| GET | /healthz |
— | Liveness probe |
| GET | /api/v1/ping |
— | Example endpoint |
| POST | /api/v1/auth/register |
— | Create user, returns a token pair |
| POST | /api/v1/auth/login |
— | Verify credentials, returns a pair |
| POST | /api/v1/auth/refresh |
— | Rotate refresh token, returns a pair |
| POST | /api/v1/auth/logout |
— | Revoke a refresh token (204) |
| GET | /api/v1/me |
Bearer | Current authenticated user |
| POST | /api/v1/servers |
Bearer | Create a game server |
| GET | /api/v1/servers |
Bearer | List your game servers |
| GET | /api/v1/servers/:id |
Bearer | Get one of your servers |
| PATCH | /api/v1/servers/:id |
Bearer | Rename, or set desired_state (running/stopped) |
| DELETE | /api/v1/servers/:id |
Bearer | Tear down a server (202) |
| GET | /api/v1/templates |
Bearer | List marketplace templates |
| GET | /api/v1/templates/:id |
Bearer | Get one template's manifest |
| GET | /api/v1/admin/users |
Admin | List all users (role admin only) |
| GET | /api/v1/admin/servers |
Admin | List all servers across all owners |
| GET | /api/v1/admin/hosts |
Admin | List the fleet |
| POST | /api/v1/agent/hosts/register |
Agent | Host registers itself |
| POST | /api/v1/agent/hosts/:id/heartbeat |
Agent | Host liveness + capacity heartbeat |
Auth endpoints return a token pair:
{
"access_token": "<jwt>",
"refresh_token": "<opaque>",
"token_type": "Bearer",
"expires_in": 900
}The short-lived access token is sent as Authorization: Bearer <access_token>
on protected routes. When it expires, exchange the refresh token at
/auth/refresh for a new pair. Refresh tokens are rotated on every use
(the old one is revoked); replaying a rotated token is treated as theft and
revokes all of that user's tokens.
# Register (or login) to get a token pair
PAIR=$(curl -s -X POST localhost:8080/api/v1/auth/register \
-H 'Content-Type: application/json' \
-d '{"email":"alice@example.com","password":"hunter2pass"}')
ACCESS=$(echo "$PAIR" | jq -r .access_token)
REFRESH=$(echo "$PAIR" | jq -r .refresh_token)
# Call a protected route
curl localhost:8080/api/v1/me -H "Authorization: Bearer $ACCESS"
# Later: rotate for a fresh pair
curl -s -X POST localhost:8080/api/v1/auth/refresh \
-H 'Content-Type: application/json' \
-d "{\"refresh_token\":\"$REFRESH\"}"
# Log out (revoke the refresh token)
curl -s -X POST localhost:8080/api/v1/auth/logout \
-H 'Content-Type: application/json' \
-d "{\"refresh_token\":\"$REFRESH\"}"Each user has a single role (user by default, or admin). The role is
embedded in the access-token claims, and RequireRole middleware guards
admin routes — so authorization is checked without a database round-trip.
Because the role lives in the JWT, a role change takes effect on the user's
next token refresh (within ACCESS_TTL), not instantly.
Set ADMIN_EMAIL + ADMIN_PASSWORD to bootstrap an admin on startup: a
matching user is created if absent, or promoted to admin if it already
exists (its password is left untouched).
A game server separates desired state (running / stopped / deleted,
set by the API) from observed status (pending → provisioning →
running → stopping → stopped → deleting, plus error). A background
reconciler (internal/reconciler) ticks periodically, finds servers whose
status doesn't match their desired state, and drives them one step at a time
via a Provisioner backend.
Placement (P2). Before a server can run it must be placed on a fleet host.
The internal/scheduler picks a ready host with enough allocatable cpu/memory
(least-loaded) and atomically reserves that capacity; the assignment is
recorded as game_servers.host_id. A server that fits no host right now is
marked unschedulable and retried; a spec larger than any host is rejected at
create time.
Agent split (P3). The control plane never touches KVM. The reconciler's
backend is provisioner.RemoteProvisioner, which resolves the assigned host's
address and calls that host's agent (cmd/agent / internal/agent) over
HTTP to provision/start/stop/deprovision the VM. The agent runs a Runtime;
which one is chosen by AGENT_RUNTIME (fake for the in-memory stub, or
firecracker for real microVMs) — the API and reconciler are identical either
way because both satisfy the same agent.Runtime interface. Agents register and
heartbeat with the control plane so the scheduler knows the fleet.
# Create a server (desired_state defaults to running)
curl -s -X POST localhost:8080/api/v1/servers \
-H "Authorization: Bearer $ACCESS" -H 'Content-Type: application/json' \
-d '{"name":"survival","version":"1.20.4"}'
# Poll until the reconciler reports status "running" with host/port
curl -s localhost:8080/api/v1/servers/<id> -H "Authorization: Bearer $ACCESS"
# Stop / start
curl -s -X PATCH localhost:8080/api/v1/servers/<id> \
-H "Authorization: Bearer $ACCESS" -H 'Content-Type: application/json' \
-d '{"desired_state":"stopped"}'
# Tear down (reconciler deprovisions, then soft-deletes the row)
curl -s -X DELETE localhost:8080/api/v1/servers/<id> -H "Authorization: Bearer $ACCESS"Deletes are soft: the reconciler deprovisions the VM and then sets
status = deleted with a deleted_at timestamp. The row is retained for
history/audit but hidden from every API read (and from reconciliation).
Runtime (P4). internal/agent/firecracker boots each server as a real
Firecracker microVM, driving Firecracker through the in-repo generated REST
client (internal/firecracker) over the per-VM API Unix socket and managing the
VMM process directly. Provision stages a per-VM writable copy of the version's
base rootfs, launches firecracker, configures machine/boot-source/drive, then
starts the instance; Stop sends SendCtrlAltDel (force-kill after a grace
period) but keeps the rootfs so the world survives a restart on that host;
Start re-boots from it. make test-kvm runs the gated lifecycle integration
test on a /dev/kvm host.
Image pipeline. internal/image converts an OCI/Docker image into a
read-only squashfs rootfs (internal/squashfs), injecting the Go init binary
(cmd/init) as PID 1. The init agent mounts the kernel filesystems, fetches a
run spec (internal/runspec) from Firecracker's MMDS over the link-local
address, applies per-VM networking, then execs and supervises the workload —
powering the VM off when it exits. The control plane resolves launchable
templates from a marketplace registry (internal/registry, the /templates
API).
Networking (P6, in progress). An eBPF NAT dataplane gives each VM real
connectivity with no Linux bridge and no iptables/nftables rules: TCX-attached
programs SNAT egress and DNAT a per-server host port to the in-VM service port,
reusing the kernel's nf_conntrack via bpf_ct_* kfuncs. The guest applies its
address, gateway neighbor, and default route from the run spec. Design and
current status: docs/ebpf-nat-dataplane.md.
cmd/server control-plane entry point: DB connect/migrate + HTTP API + graceful shutdown
cmd/agent host-agent entry point: VM API + register/heartbeat; selects the VM backend
cmd/init in-VM PID 1: mount, fetch run spec from MMDS, apply networking, supervise workload
internal/config environment configuration (control plane + agent + Firecracker)
internal/db pgx pool + embedded goose migrations
internal/model domain types (User, GameServer, Host)
internal/repository data access (Postgres; in-memory host inventory)
internal/auth bcrypt password hashing + JWT issue/verify
internal/handler control-plane routes (auth, servers, templates, admin, agent)
internal/middleware request ID, request logging, JWT/agent auth guards
internal/registry template registry (marketplace) client
internal/scheduler host placement + atomic capacity reservation (P2)
internal/reconciler desired-state → observed-status convergence loop
internal/provisioner Provisioner seam: Fake + RemoteProvisioner (P3)
internal/agent host agent: Runtime interface, FakeRuntime, VM API server + clients (P3)
internal/agent/firecracker real Firecracker microVM Runtime + eBPF NAT dataplane (P4/P6)
internal/firecracker generated Firecracker REST client (go-openapi, over the API Unix socket)
internal/image OCI/Docker image → squashfs rootfs converter (injects cmd/init)
internal/squashfs squashfs writer/compressor used by the converter
internal/runspec host↔guest run-spec contract delivered via MMDS
internal/reaper periodic background cleanup (stale hosts → down)
internal/seed initial-data bootstrap (admin account)
internal/logger zap logger + request-scoped context helpers
make run # run the control plane
make run-agent # run a host agent (MODE=agent; see Makefile/.env.example for env)
make build # build control plane to ./bin/server
make build-agent # build agent to ./bin/agent
make test # run unit tests
make test-e2e # run end-to-end tests (requires Docker)
make test-kvm # Firecracker lifecycle test (requires /dev/kvm + host artifacts)
make bpf-generate # regenerate eBPF bindings/objects (Linux ≥6.6 + clang/libbpf/bpftool)
make tidy # tidy go.mod
make fmt # format codeEnd-to-end tests live in test/e2e/. They are gated behind the e2e build
tag, so make test (plain go test ./...) never touches Docker. The e2e
suite uses testcontainers to start a
real Postgres, serves the actual router over HTTP, and drives the full
lifecycle (register → login → server placement → reconcile → teardown):
make test-e2e # or: go test -tags e2e ./test/e2e/...The Firecracker lifecycle test is gated behind the kvm build tag and kept out
of the default lane; run it on a KVM host with the FC_* artifacts in place:
FC_KERNEL=... FC_IMAGE_DIR=... FC_DEFAULT_IMAGE=base.ext4 make test-kvm