Where we stand

Current state as of 2026-05-24.

What works end-to-end today

Surface	Milestone	Notes
Monorepo scaffold (apps/{cli,llama,dashboard}, root Taskfile, go.work)	M0 ✓	Commit `f0e2755`. Compiles cleanly.
`local-agents init`	M2 ✓	Manifest fetch + model picker (TTY) / `--yes` / `--model` (non-TTY) + project folder scaffold + GGUF + ONNX download with sha256 verify + cache hit short-circuit + next-steps hint.
`local-agents doctor`	M2 ✓	All 8 spec'd checks: config.yml load+validate, model presence, embedding presence, disk space (Statfs on unix / GetDiskFreeSpaceEx on windows), port availability, commands[].run binary resolution, skill runtime resolution, eval API key presence. Both pretty and `--json` output. Commit `38d6c46`.
Bubble Tea model picker	M2 ✓	Lives in `internal/tui/picker.go`.
`manifest` package (pure-Go foundation)	M2 ✓	17 tests. stdlib only.
`project` package (pure-Go foundation)	M2 ✓	11 tests / 24 sub-tests. `gopkg.in/yaml.v3` for marshaling. embed.FS templates for schema.md, wiki-instructions.md, README.md.
Module path on `deemwar-products` org	✓	Commit `87d98db`.
Function-layer rule	✓	Grep-verified. No cobra/bubbletea/HTTP-server imports under `internal/service/`, `internal/manifest/`, `internal/project/`.
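
The function-layer rule in the last row is currently verified by grep. A minimal sketch of the same check done with the Go parser follows. It is not the project's actual tooling: `forbiddenImports` and the `forbidden` list are hypothetical names, and the list assumes the rule targets the upstream cobra and Bubble Tea module paths. The "HTTP-server" part of the rule is left out on purpose, because `net/http` covers both client and server use and an import path alone cannot tell them apart.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// forbidden lists module paths the function layer must not import.
// Assumption: the rule covers the upstream cobra and Bubble Tea modules.
var forbidden = []string{
	"github.com/spf13/cobra",
	"github.com/charmbracelet/bubbletea",
}

// forbiddenImports parses only the import declarations of src and returns
// each import path that equals, or is a sub-path of, a forbidden module.
func forbiddenImports(filename, src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), filename, src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, imp := range f.Imports {
		// Import paths are stored as quoted string literals.
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		for _, mod := range forbidden {
			if path == mod || strings.HasPrefix(path, mod+"/") {
				bad = append(bad, path)
				break
			}
		}
	}
	return bad, nil
}

func main() {
	// Hypothetical file contents standing in for a file under internal/service/.
	src := "package service\n\nimport (\n\t\"context\"\n\t\"github.com/spf13/cobra\"\n)\n"
	bad, err := forbiddenImports("example.go", src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("forbidden imports:", bad)
}
```

Walking the three directories and calling a function like this on each `.go` file would give the same signal as the grep, with the advantage that commented-out imports and string literals are not matched by accident.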

What's stubbed (returns `not implemented`)

Surface	Milestone	What's blocking it
`local-agents wiki generate`	M3	Needs an `internal/wikigen/` orchestrator plus a `local` backend that calls `apps/llama` in-process. The wiki shape (typed articles, frontmatter contract, wiki/index.md) belongs to M4 and lands alongside this work.
`local-agents wiki list/show/edit`	M4	Trivial once the wiki/ folder is being written to.
`local-agents build`	M5	Needs RAG chunk index + manifest assembly + zip output + the demo-question smoke test.
`local-agents chat`	M6	Needs real `apps/llama` (M1) and chat-time tool calling: `search_wiki`, `read_wiki`, `list_wiki`, `search_rag`.
`local-agents serve`	M8	Needs embedded dashboard at `/` + `/api/*` handlers (M6 chat tools have to land first).
`local-agents import`	M9	Needs zip unpack + manifest validation + https/file sources. M10 adds `--self-contained` round-trip.
`local-agents eval`	M14	LLM-as-judge with pluggable backends (claude/openai/command). Dev-time only — requires internet + API key.

What's stubbed in `apps/llama`

Component	Status
Public Go API (Load/Chat/Embed/Close)	stub — returns `ErrNotImplemented`
Pure-Go fallback build tag (`pocketllm_llama_stub`)	✓ — keeps the package building without a C toolchain
Pinned llama.cpp tag in `.gitmodules`	`b4000`
Submodule actually checked out	✗ — `task llama:vendor-init` not run yet
`task llama:libllama` (cmake → libllama.a)	✗ — blocked on submodule init

M1 is the milestone that wires in real cgo inference. Until then, the `pocketllm_llama_stub` build tag is the way to build and test everything that doesn't need a working model.
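
To illustrate what callers see until M1 lands, here is a rough sketch of a stubbed API that always returns `ErrNotImplemented`. Only the four entry-point names and the sentinel's name come from the table above. The `Model` type, every parameter and return type, the error text, and the `inferenceAvailable` helper are assumptions made for illustration. In the real package the file would be selected by the `pocketllm_llama_stub` build tag; the constraint line is omitted here so the sketch compiles without extra flags.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotImplemented is the sentinel error named in the status table.
// The message text is an assumption.
var ErrNotImplemented = errors.New("llama: not implemented")

// Model is a hypothetical handle type; the real one in apps/llama may differ.
type Model struct{}

// Load, Chat, Embed and Close mirror the four public entry points.
// Their signatures are assumptions, not the project's actual API.
func Load(path string) (*Model, error) { return nil, ErrNotImplemented }

func (m *Model) Chat(prompt string) (string, error) { return "", ErrNotImplemented }

func (m *Model) Embed(text string) ([]float32, error) { return nil, ErrNotImplemented }

func (m *Model) Close() error { return ErrNotImplemented }

// inferenceAvailable reports whether a real backend is linked in,
// by checking whether Load returns the stub sentinel.
func inferenceAvailable() bool {
	_, err := Load("")
	return !errors.Is(err, ErrNotImplemented)
}

func main() {
	if !inferenceAvailable() {
		fmt.Println("stub build: skipping anything that needs a working model")
		return
	}
	fmt.Println("real backend linked")
}
```

Checking for the sentinel with `errors.Is` lets commands and tests that do not need a model degrade cleanly instead of failing on a missing C toolchain.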

What's NOT in scope for v1

OpenAI-wire-compatible /v1/* HTTP surface.
Additional wikigen backends (claude-code, codex, command) — deferred to v2. The wiki.generator config slot exists so the schema doesn't need a migration when they land.
OCR for images in data/.
The `local-agents skills new "<description>"` skill-creator command.
windows/arm64 build.

What's permanently out of scope

Dockerfile / docker-compose. We ship a single binary.
Multi-tenancy, auth, accounts, OAuth, SSO.
Cloud sync, telemetry, license validation.
Multi-model, model routing, fine-tuning.
GPU inference.
A benchmarks app inside this repo (we consume llama-cpu-benchmarks).

Tests

50+ unit tests across internal/manifest/, internal/project/, internal/service/, internal/cli/.
All green with `-race`.
`go build ./... && go vet ./...` is clean.

Next decisions

M3 (wiki generate) needs M1 (real cgo / libllama) to be meaningful end-to-end. One option is to build M3's orchestrator plus a command backend (any shell-callable LLM) before M1 lands.
M1 is unblocked but has the highest setup cost (vendor submodule init + cmake + cgo on the matrix).
M5 (build) is the gate to having a real shippable zip.