Quickstart

Install, configure, and smoke-test localmelo.

This page is the minimum path from checkout to a verified local agent loop. It covers direct CLI mode, gateway mode, session reuse, backend setup, and the focused test suites.

Install

Install the package in editable mode from the repository root.

# Core + dev dependencies
pip install -e ".[dev]"

# Core + dev + gateway dependencies
pip install -e ".[dev,gateway]"

Direct CLI smoke

Direct mode runs one query and exits. It is the fastest way to confirm the configured chat backend can complete a normal request.

melo "What is 6*7?"

Pass criteria: the command exits with status 0, returns an answer that mentions 42, and prints no traceback.
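
The pass criteria above can also be checked from a script. The following is a minimal sketch, not part of localmelo: the helper names and the 120-second timeout are illustrative choices, and it assumes that the `melo` executable is on PATH and that the answer is printed to stdout.

```python
import subprocess


def meets_pass_criteria(returncode: int, stdout: str, stderr: str) -> bool:
    """Return True when the run exited 0, mentions 42, and shows no traceback."""
    no_traceback = "Traceback" not in stdout and "Traceback" not in stderr
    return returncode == 0 and "42" in stdout and no_traceback


def run_direct_smoke(query: str = "What is 6*7?") -> bool:
    """Run melo once in direct mode and evaluate the pass criteria."""
    # Assumption: `melo` is on PATH and a chat backend is already configured.
    # subprocess.run raises TimeoutExpired if the command exceeds the timeout.
    proc = subprocess.run(
        ["melo", query], capture_output=True, text=True, timeout=120
    )
    return meets_pass_criteria(proc.returncode, proc.stdout, proc.stderr)
```

Calling `run_direct_smoke()` from a Python shell in the configured environment returns True when all three criteria hold.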

You can override the endpoint and model for an ad-hoc run:

melo --base-url http://localhost:11434/v1 \
     --chat-model qwen3:8b \
     "What is 6*7?"

Gateway smoke

Gateway mode starts an HTTP API around the same agent runtime. Run the server in one shell and call it from another.

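# Shell 1: start the gateway and leave it running
melo --serve

# Shell 2: check health
curl http://127.0.0.1:8401/v1/health

# Shell 2: run one query through the agent
curl -X POST http://127.0.0.1:8401/v1/agent/run \
  -H "Content-Type: application/json" \
  -d '{"query":"Say hello briefly"}'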

Pass criteria: the health endpoint returns JSON with status: ok, and the /v1/agent/run response includes both result and session_id.
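
The same two calls can be scripted with the Python standard library. This is a sketch under stated assumptions: the helper names are hypothetical, status is assumed to be a top-level JSON key with the string value "ok", and the timeout is an arbitrary choice. The gateway must already be running in another shell.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8401"  # gateway address used in the curl examples


def fetch_json(req):
    """Send a request (URL string or Request object) and decode the JSON body."""
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read().decode("utf-8"))


def gateway_passes(health: dict, run: dict) -> bool:
    """Health reports status ok; the run response has result and session_id."""
    return (
        health.get("status") == "ok"  # assumed shape: {"status": "ok", ...}
        and "result" in run
        and "session_id" in run
    )


def run_gateway_smoke(base_url: str = BASE_URL) -> bool:
    """Call /v1/health and /v1/agent/run, then apply the pass criteria."""
    health = fetch_json(f"{base_url}/v1/health")
    run_req = urllib.request.Request(
        f"{base_url}/v1/agent/run",
        data=json.dumps({"query": "Say hello briefly"}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return gateway_passes(health, fetch_json(run_req))
```

`run_gateway_smoke()` raises `urllib.error.URLError` if the gateway is not reachable, which distinguishes "server down" from "server up but criteria not met".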

Session reuse

Pass a stable session_id to reuse one gateway session across calls.

curl -X POST http://127.0.0.1:8401/v1/agent/run \
  -H "Content-Type: application/json" \
  -d '{"query":"remember this is session test","session_id":"demo123"}'

curl -X POST http://127.0.0.1:8401/v1/agent/run \
  -H "Content-Type: application/json" \
  -d '{"query":"continue","session_id":"demo123"}'

# List sessions, then delete the demo session
curl http://127.0.0.1:8401/v1/sessions
curl -X DELETE http://127.0.0.1:8401/v1/sessions/demo123
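
For scripted use, the session flow above might look like the following sketch. The function names are hypothetical and the response fields are read defensively, since only result and session_id are documented here; the endpoints and payloads mirror the curl commands.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8401"  # gateway address used in the curl examples


def run_request(query, session_id, base_url=BASE_URL):
    """Build a POST /v1/agent/run request bound to one session_id."""
    body = json.dumps({"query": query, "session_id": session_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/agent/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def delete_request(session_id, base_url=BASE_URL):
    """Build a DELETE /v1/sessions/<session_id> request."""
    return urllib.request.Request(
        f"{base_url}/v1/sessions/{session_id}", method="DELETE"
    )


def reuse_session(session_id="demo123"):
    """Send two queries on one session, then delete it (needs a running gateway)."""
    for query in ("remember this is session test", "continue"):
        with urllib.request.urlopen(run_request(query, session_id), timeout=120) as resp:
            reply = json.loads(resp.read().decode("utf-8"))
        print(reply.get("session_id"), reply.get("result"))
    with urllib.request.urlopen(delete_request(session_id), timeout=30) as resp:
        print("delete status:", resp.status)
```

Keeping request construction separate from sending makes the payloads easy to inspect without a live gateway.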

Backend setup

localmelo uses a split backend model: chat_backend and embedding_backend are configured independently through melo --reconfigure.

Mode	Chat backend	Embedding backend	Notes
Ollama local	`ollama`	`ollama` or `none`	Connects to a user-managed Ollama server.
MLC local	`mlc`	`mlc` or `none`	Connects to a user-managed OpenAI-compatible MLC endpoint.
Other local	`vllm` or `sglang`	the same backend as chat, or `none`	Runtime lifecycle remains outside localmelo.
Cloud chat	`openai`, `anthropic`, `gemini`, or `nvidia`	local backend or `none`	Set the configured API key environment variable before running.

Local backend rule: localmelo does not host, compile, or start model runtimes. Start your local server separately, then point localmelo at its URL.
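
Because the server is started separately, a quick reachability check before running melo can save a confusing failure. The sketch below is not part of localmelo. It assumes your local server exposes the OpenAI-style GET /models route under its base URL (for example http://localhost:11434/v1); the function names and the 3-second timeout are illustrative.

```python
import urllib.error
import urllib.request


def models_url(base_url: str) -> str:
    """Join the base URL and the OpenAI-style models route."""
    return base_url.rstrip("/") + "/models"


def server_is_up(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/models answers with HTTP 200."""
    try:
        with urllib.request.urlopen(models_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status (e.g. 401, 404).
        return False
```

For example, `server_is_up("http://localhost:11434/v1")` returns False while the server is stopped. A server that requires an API key on this route will also report False, so treat the result as a hint rather than a verdict.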

No-embedding mode

Set embedding_backend = "none" when you do not want to run an embedding server. The agent still supports direct answers, tool use, and history recording; only long-term embedding retrieval is disabled.

# Set the embedding backend to none
melo --reconfigure

# Confirm direct answers still work
melo "What is 6*7?"

Tests

Use the focused suites while developing, then run the full suite before merging.

# Dev-safe core suites
python -m pytest tests/agent tests/checker tests/executor \
                 tests/cli tests/memory tests/integration -q

# Gateway suite
python -m pytest tests/gateway -q

# Full suite
python -m pytest tests/ -q

# Lint
python -m ruff check .
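
The pre-merge commands above can be chained so that the first failure stops the run. This is one possible sketch of such a wrapper, not a script shipped with the repository; `PRE_MERGE` and `run_steps` are hypothetical names, and the `runner` parameter exists only so the logic can be exercised without launching pytest.

```python
import subprocess
import sys

# Commands taken from the list above; sys.executable keeps the same interpreter.
PRE_MERGE = [
    [sys.executable, "-m", "pytest", "tests/", "-q"],  # full suite
    [sys.executable, "-m", "ruff", "check", "."],      # lint
]


def run_steps(steps, runner=subprocess.call):
    """Run each command in order; return the first non-zero exit code, else 0."""
    for cmd in steps:
        code = runner(cmd)
        if code != 0:
            return code
    return 0
```

From the repository root, `sys.exit(run_steps(PRE_MERGE))` propagates the failing exit code to the calling shell or CI job.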