OpenCode with Lore
OpenCode is a terminal-based AI coding agent. The @loreai/opencode plugin runs inside OpenCode and routes every LLM call through the Lore gateway transparently.
Install
Section titled “Install”Add the plugin to your project’s opencode.json:
{ "plugin": [ "@loreai/opencode" ]}Restart OpenCode. The plugin is installed automatically on first run.
What you get
Section titled “What you get”Once the plugin is loaded, every conversation you have with OpenCode is captured in Lore’s three-tier memory. Distillations run in the background, the recall tool is available in any session, and your project knowledge is exported to .lore.md and AGENTS.md automatically. See the architecture overview for the full picture.
Local embeddings
Section titled “Local embeddings”Recall uses @huggingface/transformers with nomic-embed-text-v1.5 (768-dim INT8 quantized, ~137 MB) for on-device vector search by default — no API key required. The model downloads on first use and is cached locally.
If local embeddings fail (for example, CUDA 13 on Linux/x64 with onnxruntime-node), set VOYAGE_API_KEY or OPENAI_API_KEY in your environment and recall transparently switches to that provider. You can also pin a hosted provider explicitly in .lore.json:
{ "search": { "embeddings": { "provider": "voyage" } } }If none of the above apply, recall falls back to FTS-only search.
Pointing OpenCode at a local LLM server
Section titled “Pointing OpenCode at a local LLM server”OpenCode users running a local LLM server (vllm, llama.cpp, ollama, LM Studio, etc.) should configure the provider’s baseURL in OpenCode to point at the running gateway URL. The @loreai/opencode plugin reads LORE_UPSTREAM_<PROVIDER> from your environment and injects it as the x-lore-upstream-url header on each request:
export LORE_UPSTREAM_VLLM=http://localhost:8000# orexport LORE_UPSTREAM_OLLAMA=http://localhost:11434The URL is the server root — do not include /v1 (the gateway appends API paths automatically). See the local inference guide for a full walkthrough.
Per-harness notes
Section titled “Per-harness notes”- Compaction is disabled. OpenCode’s built-in compaction would defeat Lore’s gradient context manager. The plugin sets
cfg.compaction = { auto: false, prune: false }in the OpenCode config. - Project identity is automatic. The plugin detects your project root (
ctx.worktree/ctx.directory), walks up to find the workspace root, and injects the path + git remote asx-lore-project/x-lore-git-remoteheaders on every request. - The gateway auto-starts. If no Lore gateway is running, the plugin starts one in-process. The gateway listens on a deterministic port (
3207by default, falling back to5673) and the plugin probes for it before assuming it needs to spawn.
Next steps
Section titled “Next steps”- Architecture — how temporal storage, distillation, and the gradient context manager fit together.
- Configuration — full reference for
.lore.json. - Local inference — running OpenCode against Ollama, vLLM, or llama.cpp.
- Custom upstreams — corporate proxies, LiteLLM, Cloudflare AI Gateway.