Three apps, one binary

pocket-llm/
└── apps/
    ├── cli/                 # the Go binary: CLI + serve host
    ├── llama/               # cgo binding to llama.cpp (statically linked)
    └── dashboard/           # web UI; built and embedded via embed.FS

Three apps, one shipped binary. No Docker, no compose file, no runtime subprocess.

`apps/cli` — the CLI + HTTP host

Module: github.com/deemwar-products/pocket-llm/apps/cli.
Cobra-based CLI (entry point: cmd/local-agents/main.go) with 8 subcommands.
Hosts the embedded HTTP server when serve is invoked.
Code lives under internal/ on purpose, so other modules cannot import it: the public API is the binary, not a library.
The function-layer rule lives here: internal/service/ doesn't import cobra, bubbletea, or HTTP-server packages.
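
The function-layer rule can be checked mechanically. The following is a minimal sketch, not the project's actual check: it parses a Go source file and reports imports that the function layer is meant to avoid. The `forbidden` list and the `forbiddenImports` helper are hypothetical, and treating all of `net/http` as off-limits is an assumption about what "HTTP-server packages" covers.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// forbidden lists import-path prefixes the function layer should not use.
// The exact list is an assumption made for this illustration.
var forbidden = []string{
	"github.com/spf13/cobra",
	"github.com/charmbracelet/bubbletea",
	"net/http",
}

// forbiddenImports parses src and returns every import whose path equals a
// forbidden prefix or sits below it. Only the import section is parsed.
func forbiddenImports(filename, src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, imp := range f.Imports {
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, p := range forbidden {
			if path == p || strings.HasPrefix(path, p+"/") {
				bad = append(bad, path)
				break
			}
		}
	}
	return bad, nil
}

func main() {
	// A made-up service file that breaks the rule by importing cobra.
	src := `package service

import (
	"context"
	"github.com/spf13/cobra"
)
`
	bad, err := forbiddenImports("example.go", src)
	if err != nil {
		panic(err)
	}
	fmt.Println(bad)
}
```

A check like this could run in a unit test over every file in internal/service/, failing the build when the returned slice is non-empty.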

Subpackage	Owns
`internal/cli/`	cobra commands (thin)
`internal/tui/`	Bubble Tea prompts (model picker today)
`internal/http/`	HTTP handlers (for `serve`)
`internal/service/`	the function layer — all business logic
`internal/manifest/`	model recommendation fetch + sha256 download + LocalAgentsHome resolver
`internal/project/`	config.yml types + scaffold + Load/Validate + templates (embed.FS)
`internal/wiki/`	(M3+) chat-time wiki tools — search_wiki, read_wiki, list_wiki
`internal/wikigen/`	(M3+) `wiki generate` orchestrator + pluggable backends
`internal/rag/`	(M5+) chunking + embedding + index build
`internal/skills/`	(M7+) SKILL.md parser + four-stage loader + runner
`internal/packzip/`	(M5+) zip pack + manifest + transport
`internal/eval/`	(M14+) LLM-as-judge backends
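
As an illustration of the "sha256 download" responsibility listed for internal/manifest/, here is a hedged sketch of hashing a stream while it is being written and comparing the digest afterwards. The function names `copyVerified` and `verifyBytes` are hypothetical; the real package may structure this differently.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
	"strings"
)

// copyVerified streams src into dst while hashing it, then compares the
// digest with wantHex. It returns the number of bytes copied.
func copyVerified(dst io.Writer, src io.Reader, wantHex string) (int64, error) {
	h := sha256.New()
	// MultiWriter feeds the same bytes to the destination and the hash,
	// so the payload is read only once.
	n, err := io.Copy(io.MultiWriter(dst, h), src)
	if err != nil {
		return n, err
	}
	got := hex.EncodeToString(h.Sum(nil))
	if !strings.EqualFold(got, wantHex) {
		return n, fmt.Errorf("sha256 mismatch: got %s, want %s", got, wantHex)
	}
	return n, nil
}

// verifyBytes is a small convenience wrapper for in-memory data.
func verifyBytes(data []byte, wantHex string) error {
	_, err := copyVerified(io.Discard, bytes.NewReader(data), wantHex)
	return err
}

func main() {
	payload := []byte("hello")
	sum := sha256.Sum256(payload)
	fmt.Println(verifyBytes(payload, hex.EncodeToString(sum[:])))
	fmt.Println(verifyBytes(payload, strings.Repeat("0", 64)) != nil)
}
```

In a real download, dst would typically be a temporary file that is renamed into place only after the digest matches, so a failed or tampered download never appears as a usable model.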

`apps/llama` — cgo bindings to llama.cpp

Module: github.com/deemwar-products/pocket-llm/apps/llama.
llama.cpp vendored at vendor/llama.cpp/ as a git submodule, pinned to tag b4000.
Built to libllama.a via cmake → statically linked into apps/cli via cgo.
Public Go API: Load, (*Model).Chat, (*Model).Embed, (*Model).Close. Today these are stubs; real implementations land in M1.
A pure-Go fallback (//go:build pocketllm_llama_stub) keeps the package building for contributors without a C toolchain.

This is the only cgo in the repo. apps/cli imports apps/llama but never adds its own C deps.
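
To make the stub idea concrete, here is a sketch of what a pure-Go fallback for the four documented entry points might look like. Only the names Load, Chat, Embed, and Close come from the text above; every signature, the `ErrStub` error, and the `Model` fields are assumptions. In the repo, such a file would carry the `pocketllm_llama_stub` build constraint as its first line; it is left out here so the example runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrStub is returned by calls that need the real llama.cpp backend.
// The name is hypothetical.
var ErrStub = errors.New("llama: built without the cgo backend (stub)")

// Model is a placeholder handle; the stub only remembers the path.
type Model struct{ path string }

// Load mirrors the documented entry point. The signature is assumed.
func Load(path string) (*Model, error) {
	if path == "" {
		return nil, errors.New("llama: empty model path")
	}
	return &Model{path: path}, nil
}

// Chat, Embed, and Close keep the package compiling without a C toolchain.
func (m *Model) Chat(prompt string) (string, error)   { return "", ErrStub }
func (m *Model) Embed(text string) ([]float32, error) { return nil, ErrStub }
func (m *Model) Close() error                         { return nil }

func main() {
	m, err := Load("model.gguf")
	if err != nil {
		panic(err)
	}
	defer m.Close()

	_, err = m.Chat("hi")
	fmt.Println(errors.Is(err, ErrStub))
}
```

Returning a sentinel error rather than panicking lets callers in apps/cli detect a stub build with errors.Is and print a helpful message.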

`apps/dashboard` — Vite + React + TS SPA

Bun + Vite + React 18 + TypeScript 5.9.
@/* path alias to src/*.
Builds to dist/, which is the embed target: apps/cli reads apps/dashboard/dist/ at compile time via embed.FS and serves it at / when serve runs.
Static SPA only. No SSR. No Next.js. No server-side code here.
M0 ships a placeholder UI. Real dashboard UI is M8.
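
Serving a static SPA from an embedded filesystem could look roughly like the sketch below. It is not the project's handler: `spaHandler`, `demoAssets`, and `get` are hypothetical names, and an in-memory `fstest.MapFS` stands in for the `embed.FS` that the real binary would populate from dist/ at compile time. The fallback to index.html for unknown paths is an assumption about how client-side routes would be handled.

```go
package main

import (
	"fmt"
	"io/fs"
	"net/http"
	"net/http/httptest"
	"path"
	"strings"
	"testing/fstest"
)

// spaHandler serves files from assets and falls back to index.html for any
// path that is not a regular file, so client-side routes still resolve.
func spaHandler(assets fs.FS) http.Handler {
	files := http.FileServer(http.FS(assets))
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		name := strings.TrimPrefix(path.Clean(r.URL.Path), "/")
		if info, err := fs.Stat(assets, name); err != nil || info.IsDir() {
			// Unknown path or directory: serve the SPA entry point by
			// rewriting a copy of the request to the root.
			r2 := r.Clone(r.Context())
			r2.URL.Path = "/"
			files.ServeHTTP(w, r2)
			return
		}
		files.ServeHTTP(w, r)
	})
}

// demoAssets stands in for the embedded dist/ directory.
func demoAssets() fs.FS {
	return fstest.MapFS{
		"index.html":    {Data: []byte("<html>app</html>")},
		"assets/app.js": {Data: []byte("console.log(1)")},
	}
}

// get issues an in-process GET and returns the status code and body.
func get(h http.Handler, target string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, target, nil))
	return rec.Code, rec.Body.String()
}

func main() {
	h := spaHandler(demoAssets())
	for _, target := range []string{"/", "/assets/app.js", "/settings/models"} {
		code, body := get(h, target)
		fmt.Println(target, code, body)
	}
}
```

Because the handler accepts any fs.FS, the same code can be exercised in tests with an in-memory filesystem and wired to the embedded assets in the shipped binary.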

How `task build` chains them

task dashboard:build    # bun run build → apps/dashboard/dist/
task llama:libllama     # cmake → apps/llama/vendor/llama.cpp/build/libllama.a
task cli:build          # go build: embeds the dashboard, links libllama.a → bin/local-agents

Or just task build from the repo root.

Once M1 lands, task build produces one self-contained binary with dashboard assets and llama.cpp inside it.

Cross-platform targets

OS	Arch	Status
linux	amd64	first-class
linux	arm64	first-class
darwin	amd64	first-class
darwin	arm64	first-class
windows	amd64	first-class
windows	arm64	v2

WSL counts as linux — the linux binary works there unchanged. CI builds natively per OS/arch (no cross-compile).
