
Where we stand

Current state as of 2026-05-24.

What works end-to-end today

| Surface | Milestone | Notes |
| --- | --- | --- |
| Monorepo scaffold (apps/{cli,llama,dashboard}, root Taskfile, go.work) | M0 | Commit f0e2755. Compiles cleanly. |
| local-agents init | M2 | Manifest fetch + model picker (TTY) / --yes / --model (non-TTY) + project folder scaffold + GGUF + ONNX download with sha256 verify + cache hit short-circuit + next-steps hint. |
| local-agents doctor | M2 | All 8 spec'd checks: config.yml load+validate, model presence, embedding presence, disk space (Statfs on unix / GetDiskFreeSpaceEx on windows), port availability, commands[].run binary resolution, skill runtime resolution, eval API key presence. Both pretty and --json output. Commit 38d6c46. |
| Bubble Tea model picker | M2 | Lives in internal/tui/picker.go. |
| manifest package (pure-Go foundation) | M2 | 17 tests. stdlib only. |
| project package (pure-Go foundation) | M2 | 11 tests / 24 sub-tests. gopkg.in/yaml.v3 for marshaling. embed.FS templates for schema.md, wiki-instructions.md, README.md. |
| Module path on deemwar-products org | | Commit 87d98db. |
| Function-layer rule | | Grep-verified. No cobra/bubbletea/HTTP-server imports under internal/service/, internal/manifest/, internal/project/. |

What's stubbed (returns not implemented)

| Surface | Milestone | What's blocking it |
| --- | --- | --- |
| local-agents wiki generate | M3 | Needs internal/wikigen/ orchestrator + local backend that calls apps/llama in-process. Wiki shape (typed articles, frontmatter contract, wiki/index.md) is part of M4 alongside. |
| local-agents wiki list/show/edit | M4 | Trivial once wiki/ folder is being written to. |
| local-agents build | M5 | Needs RAG chunk index + manifest assembly + zip output + the demo-question smoke test. |
| local-agents chat | M6 | Needs real apps/llama (M1) and chat-time tool calling: search_wiki, read_wiki, list_wiki, search_rag. |
| local-agents serve | M8 | Needs embedded dashboard at / + /api/* handlers (M6 chat tools have to land first). |
| local-agents import | M9 | Needs zip unpack + manifest validation + https/file sources. M10 adds --self-contained round-trip. |
| local-agents eval | M14 | LLM-as-judge with pluggable backends (claude/openai/command). Dev-time only: requires internet + API key. |
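To make the chat-time tool calling requirement for local-agents chat more concrete, here is one possible shape for a tool dispatcher. Only the four tool names (search_wiki, read_wiki, list_wiki, search_rag) come from the table above; ToolFunc, ToolRegistry, the single string argument, and newStubRegistry are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// ToolFunc is a hypothetical handler signature: one string argument in,
// one string result out.
type ToolFunc func(arg string) (string, error)

// ToolRegistry maps a tool name to its handler.
type ToolRegistry map[string]ToolFunc

// Names returns the registered tool names in sorted order.
func (r ToolRegistry) Names() []string {
	names := make([]string, 0, len(r))
	for n := range r {
		names = append(names, n)
	}
	sort.Strings(names)
	return names
}

// Dispatch runs the named tool, or returns an error listing the known tools
// when the model asks for one that is not registered.
func (r ToolRegistry) Dispatch(name, arg string) (string, error) {
	fn, ok := r[name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q (have %v)", name, r.Names())
	}
	return fn(arg)
}

// newStubRegistry registers the four tool names with placeholder handlers
// that fail until real implementations replace them.
func newStubRegistry() ToolRegistry {
	stub := func(name string) ToolFunc {
		return func(arg string) (string, error) {
			return "", fmt.Errorf("%s: not implemented", name)
		}
	}
	r := ToolRegistry{}
	for _, n := range []string{"search_wiki", "read_wiki", "list_wiki", "search_rag"} {
		r[n] = stub(n)
	}
	return r
}
```

A registry like this would let each tool land independently: replace one map entry at a time while the others keep returning a clear "not implemented" error.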

What's stubbed in apps/llama

| Item | Status |
| --- | --- |
| Public Go API (Load/Chat/Embed/Close) | stub; returns ErrNotImplemented |
| Pure-Go fallback build tag (pocketllm_llama_stub) | ✓ (keeps the package building without a C toolchain) |
| Pinned llama.cpp tag in .gitmodules | b4000 |
| Submodule actually checked out | ✗ (task llama:vendor-init not run yet) |
| task llama:libllama (cmake → libllama.a) | ✗ (blocked on submodule init) |

M1 is the milestone that wires real cgo inference. Until then, the pocketllm_llama_stub build tag is the path forward for everything that doesn't need a working model.

What's NOT in scope for v1

  • OpenAI-wire-compatible /v1/* HTTP surface.
  • Additional wikigen backends (claude-code, codex, command) — deferred to v2. The wiki.generator config slot exists so the schema doesn't need a migration when they land.
  • OCR for images in data/.
  • local-agents skills new "<description>" skill-creator command.
  • windows/arm64 build.

What's permanently out of scope

  • Dockerfile / docker-compose. We ship the one binary.
  • Multi-tenancy, auth, accounts, OAuth, SSO.
  • Cloud sync, telemetry, license validation.
  • Multi-model, model routing, fine-tuning.
  • GPU inference.
  • A benchmarks app inside this repo (we consume llama-cpu-benchmarks).

Tests

  • 50+ unit tests across internal/manifest/, internal/project/, internal/service/, internal/cli/.
  • All green with -race.
  • go build ./... && go vet ./... clean.

Next decisions

  • M3 (wiki generate) needs M1 (real cgo / libllama) to be meaningful end-to-end. One option is to build M3's orchestrator plus a command backend (any shell-callable LLM) before M1 lands.
  • M1 is unblocked but has the highest setup cost (vendor submodule init + cmake + cgo across the build matrix).
  • M5 (build) is the gate to having a real shippable zip.

pocket llm — local-first, offline, no telemetry. MIT licensed.