Getting Started with Quincy

First Run

When you run Quincy for the first time, it detects that no configuration exists and walks you through a setup wizard. There's no separate setup command — any command triggers onboarding automatically.

Behind the scenes, the Quincy server starts in setup mode. The CLI (or any other client) drives the wizard by exchanging steps with the server over its REST API. From your perspective, it looks like a normal interactive walkthrough.
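
The exchange described above can be pictured as a small loop: the client asks the server for the current step, presents it, and sends the answer back until the server reports that setup is complete. The sketch below simulates that loop entirely in memory. The `FakeSetupServer` class, the step names, and the scripted answers are all hypothetical; Quincy's actual REST endpoints and payloads are not described in this guide.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSetupServer:
    """In-memory stand-in for a server in setup mode (hypothetical)."""
    steps: list = field(default_factory=lambda: ["provider", "model", "finish"])
    answers: dict = field(default_factory=dict)
    position: int = 0

    def current_step(self):
        # Returns the next step name, or None once setup is complete.
        if self.position < len(self.steps):
            return self.steps[self.position]
        return None

    def submit(self, step, value):
        # Reject answers for any step other than the one being asked.
        if step != self.current_step():
            raise ValueError(f"unexpected step: {step}")
        self.answers[step] = value
        self.position += 1


def run_wizard(server, ask):
    """Drive the wizard: fetch a step, ask the user, send the answer back."""
    while (step := server.current_step()) is not None:
        server.submit(step, ask(step))
    return server.answers


# A scripted "user" stands in for interactive prompts.
scripted = {"provider": "local", "model": "example-model", "finish": "finish"}
result = run_wizard(FakeSetupServer(), scripted.__getitem__)
```

Because the server owns the step sequence, any client that can fetch a step and post an answer could drive the same wizard, which matches the "CLI (or any other client)" wording above.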

Walk through each step:

Provider selection: Choose between a local model (via llama.cpp), a cloud provider (such as Anthropic or Gemini), or both. Choosing "both" sets up local inference first, then walks you through cloud provider setup.
Provider-specific setup, depending on your choice:
    Local: Quincy locates your llama-server binary and configures GPU offloading, context size, and idle timeout. See Setting Up llama.cpp for details.
    Cloud: Your API key is collected and stored in the macOS Keychain (never written to disk in plain text). See Setting Up Anthropic or Setting Up Gemini.
Model selection: Pick a default model for the orchestrator agent.
Finish screen, with three choices:
    Finish: Start using Quincy immediately
    Add another provider: Loop back to add an additional LLM provider (you can add as many as you want)
    Advanced: Customize which models are used for background tasks such as chat titles and memory extraction
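
The ordering of the steps above, including the "both" case and the three finish-screen actions, can be summarized as a small lookup. This is only an illustrative sketch: every step and action name below is hypothetical and chosen for readability, not taken from Quincy's implementation.

```python
def plan_setup_steps(choice):
    """Return the ordered setup steps for a provider choice (hypothetical names)."""
    provider_steps = {
        "local": ["local_setup"],
        "cloud": ["cloud_setup"],
        # "both" configures local inference first, then the cloud provider.
        "both": ["local_setup", "cloud_setup"],
    }
    if choice not in provider_steps:
        raise ValueError(f"unknown provider choice: {choice!r}")
    return [
        "provider_selection",
        *provider_steps[choice],
        "model_selection",
        "finish_screen",
    ]


def next_after_finish(action):
    """Map a finish-screen action to the step that follows, or None when done."""
    transitions = {
        "finish": None,                                # start using Quincy
        "add_another_provider": "provider_selection",  # loop back
        "advanced": "background_model_selection",      # tune background-task models
    }
    return transitions[action]
```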

Once onboarding completes, Quincy saves your configuration. All configuration files are digitally signed — see Security & Trust for details.
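
To illustrate why signing configuration matters, the sketch below shows tamper detection with an HMAC tag from Python's standard library. This is a stand-in only: an HMAC is a keyed integrity check rather than a public-key digital signature, and this guide does not say which scheme Quincy uses (Security & Trust covers that). The key, the config fields, and the function names are all made up for the example.

```python
import hashlib
import hmac
import json


def sign_config(config, key):
    """Return a hex tag computed over a canonical JSON encoding of the config."""
    payload = json.dumps(config, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def verify_config(config, key, tag):
    """Compare tags in constant time; False means the config was altered."""
    return hmac.compare_digest(sign_config(config, key), tag)


key = b"example-key-not-a-real-secret"  # placeholder, not a real secret
config = {"default_model": "example-model", "provider": "local"}
tag = sign_config(config, key)

# Changing any field after signing invalidates the tag.
tampered = dict(config, provider="cloud")
```

Verifying `config` against `tag` succeeds, while verifying `tampered` against the same tag fails, which is the property a signed configuration file relies on.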

Your First Conversation

Here are some things to try once you're set up.

Works right away (built-in tools only):

"Summarize my unread emails and flag anything that needs a reply today"

This exercises Quincy's email tools, memory, and the orchestrator's judgment about what counts as urgent.

"Create a scheduled job that reviews my calendar every Monday morning and writes a weekly priority summary to memory"

This demonstrates scheduled jobs, memory, and calendar context — Quincy will create a background job that runs without you being present.
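
A recurring job like "every Monday morning" ultimately needs a rule for computing its next run time. The sketch below shows one way to do that with the standard library. It assumes "morning" means 09:00 local time and uses a hypothetical function name; how Quincy actually schedules jobs is not covered here.

```python
from datetime import datetime, timedelta


def next_weekly_run(now, weekday=0, hour=9):
    """Return the next datetime strictly after `now` on `weekday` at `hour`:00.

    `weekday` follows datetime.weekday(): Monday is 0, Sunday is 6.
    """
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # Today is the right weekday but the slot has passed; use next week.
        candidate += timedelta(days=7)
    return candidate
```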

Exploring MCP connections:

"Set up an MCP bridge to my Obsidian vault, then find all notes tagged #project and summarize the status of each"

This shows how Quincy can connect to external tools via MCP bridges and work with your existing note-taking setup.

"Connect to the Hacker News API and build me a daily digest agent that summarizes the top 10 stories every morning"

This demonstrates REST API tools, scheduled jobs, and agent creation — Quincy builds a custom agent that runs on a schedule.
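
For a sense of what such a digest agent does under the hood, here is a minimal sketch that reads the top stories from the public Hacker News Firebase API (`/v0/topstories.json` and `/v0/item/<id>.json`). The function names and the injectable `fetch` parameter are choices made for this example; this is not the code Quincy generates.

```python
import json
from urllib.request import urlopen

HN_BASE = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url):
    """Fetch a URL and decode its JSON body."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def top_stories(limit=10, fetch=fetch_json):
    """Return (title, url) pairs for the current top stories."""
    story_ids = fetch(f"{HN_BASE}/topstories.json")[:limit]
    stories = []
    for story_id in story_ids:
        item = fetch(f"{HN_BASE}/item/{story_id}.json")
        if item is None:
            # Deleted items come back as null; skip them.
            continue
        # Some items (for example Ask HN posts) have no external URL.
        stories.append((item.get("title", ""), item.get("url")))
    return stories
```

Passing a custom `fetch` makes the function easy to exercise without network access; calling `top_stories()` with the default performs real HTTP requests.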

Power-user combo:

"Create an agent that monitors RSS feeds from my favorite news sources, summarizes new articles each morning, and saves the summaries to memory so I can ask 'what happened this week in tech?' later"

This ties together agent creation, MCP bridges, memory, and scheduled jobs into a complete workflow.

Sessions

Quincy organizes conversations into sessions:

A main session runs continuously and persists across restarts
You can reset the main session at any time — the old session is archived (not deleted) and remains browsable
Named sessions provide focused contexts for specific tasks:

quincy chat --session tax-prep

You can also create and destroy sessions dynamically, and list all active sessions from any client

Each session maintains its own conversation history. The orchestrator agent owns the session and passes context to sub-agents as needed.
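
The session rules above (a persistent main session, reset that archives instead of deleting, and named sessions with separate histories) can be modeled with a small data structure. The class and method names below are hypothetical, and the rule that the main session cannot be destroyed is an assumption made for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    name: str
    history: list = field(default_factory=list)


class SessionStore:
    """Toy model of a main session plus named sessions (hypothetical API)."""

    def __init__(self):
        self.active = {"main": Session("main")}
        self.archived = []

    def get(self, name="main"):
        # Named sessions are created on first use.
        return self.active.setdefault(name, Session(name))

    def reset_main(self):
        # The old main session is archived, not deleted.
        self.archived.append(self.active["main"])
        self.active["main"] = Session("main")

    def destroy(self, name):
        # Assumption for this sketch: main can be reset but not destroyed.
        if name == "main":
            raise ValueError("the main session can only be reset")
        del self.active[name]

    def list_active(self):
        return sorted(self.active)
```

In this model, `store.get("tax-prep")` plays the role of `quincy chat --session tax-prep`: it returns the existing session of that name or creates a fresh one with its own history.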

After a few exchanges, Quincy automatically generates a short session title summarizing the conversation topic (e.g., "OmniFocus weekly review" or "Email agent setup"). The title updates as the conversation evolves, and all connected clients see it in real time.
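
The title behavior described above amounts to regenerating a title once enough exchanges exist and pushing changes to every connected client. The sketch below shows that pattern with plain callbacks. The class name, the `min_exchanges` threshold, and the `make_title` callable (a stand-in for a model call) are all assumptions for illustration.

```python
class TitledSession:
    """Session that regenerates its title and notifies subscribers (hypothetical)."""

    def __init__(self, make_title, min_exchanges=3):
        self.make_title = make_title        # stand-in for a model call
        self.min_exchanges = min_exchanges  # assumed threshold for "a few"
        self.messages = []
        self.title = None
        self.subscribers = []

    def subscribe(self, callback):
        # Each connected client registers a callback for title updates.
        self.subscribers.append(callback)

    def add_exchange(self, user_text, reply_text):
        self.messages.append((user_text, reply_text))
        if len(self.messages) < self.min_exchanges:
            return
        new_title = self.make_title(self.messages)
        if new_title != self.title:
            self.title = new_title
            # Push the new title to every connected client.
            for notify in self.subscribers:
                notify(new_title)
```

Notifying only when the title actually changes keeps clients from redrawing on every message while still letting the title follow the conversation.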

Next Steps

Set up Anthropic for cloud-powered accuracy
Set up Gemini for an alternative cloud provider
Set up llama.cpp for fast, private, local inference
Learn about the agent system to understand how Quincy delegates work
Save money by using local models for background tasks