Frequently Asked Questions
General
What is Quincy?
Quincy is an AI assistant that runs on your Mac. It uses language models — the same technology behind ChatGPT and Claude — to help you manage apps, services, and APIs on your computer. Unlike cloud-only AI assistants, Quincy can run entirely on your machine, and you control exactly what the AI can and can't do. See Welcome to Quincy for the full picture.
Is Quincy free?
Quincy itself is free. If you run language models locally on your Mac, there are no ongoing costs. If you connect a cloud provider like Anthropic or Google, you pay for API usage according to that provider's pricing. See Saving Money with Local LLMs for tips on keeping costs low.
What Macs does it run on?
Quincy runs on Apple Silicon Macs (M1 and later) running macOS 14 Sonoma or later. Local language models benefit from more unified memory — 16 GB is a good starting point, and 32 GB or more lets you run larger, more capable models.
Is Quincy open source?
Quincy is not currently open source.
Privacy & Security
Does Quincy send my data to the cloud?
Only if you choose to. Quincy can run language models entirely on your Mac with no internet connection. If you connect a cloud provider for faster or more accurate responses, only the messages in that conversation are sent — and only to the provider you selected. You choose what stays local and what goes to the cloud, on a per-task basis. See Security & Trust for details.
How does Quincy protect my API keys?
API keys, passwords, and other credentials are stored in the macOS Keychain, the same secure storage that Safari and other Mac apps use. The AI never sees your secrets: when a tool needs a credential, it is retrieved from the Keychain at runtime and passed directly to the external service, without ever appearing in the conversation. See Security & Trust for the full picture.
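To make the idea concrete, here is a minimal Python sketch of runtime credential injection. It is an illustration only, not Quincy's actual code: `SECRET_STORE` is a plain dictionary standing in for the macOS Keychain, and `call_tool`, the tool name, and the credential name are all hypothetical.

```python
# Stand-in for the macOS Keychain; a real app would query the Keychain instead.
SECRET_STORE = {"github_token": "s3cr3t-value"}

# Everything the model gets to see about tool calls goes here.
transcript: list[str] = []


def call_tool(tool: str, credential_name: str) -> dict:
    """Resolve the credential at call time and build the outgoing request."""
    secret = SECRET_STORE[credential_name]  # fetched only when the tool runs
    request = {"tool": tool, "headers": {"Authorization": f"Bearer {secret}"}}
    # Only the credential's name, never its value, is recorded for the model.
    transcript.append(f"called {tool} using credential '{credential_name}'")
    return request


request = call_tool("github.list_issues", "github_token")
```

The outgoing request carries the secret, while the transcript that feeds back into the conversation only mentions the credential by name.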
Can an agent go rogue?
Agents can only use the tools you've allowed, with the restrictions you've set. These aren't suggestions the AI promises to follow; they're hard limits enforced in code before any action is taken. An email agent that's only allowed to read your inbox cannot send replies or delete messages, no matter what the AI tries. You can also require manual approval for specific actions, so you stay in the loop for anything high-stakes. See How the Agent System Works for more.
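As a rough illustration of what "enforced in code" means, the sketch below checks every tool call against an allowlist before anything runs. This is a minimal Python sketch under assumed names, not Quincy's implementation: `AgentPolicy`, `authorize`, and the tool names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AgentPolicy:
    """Hypothetical policy: which tools an agent may call, and which need approval."""
    allowed_tools: set[str]
    needs_approval: set[str] = field(default_factory=set)


def authorize(policy: AgentPolicy, tool: str, approved_by_user: bool = False) -> bool:
    """Return True only if the call may proceed; runs before the tool executes."""
    if tool not in policy.allowed_tools:
        return False  # hard limit: nothing the model outputs can override this
    if tool in policy.needs_approval and not approved_by_user:
        return False  # allowed in principle, but held until the user confirms
    return True


# A read-only email agent: it can read the inbox and nothing else.
read_only = AgentPolicy(allowed_tools={"email.read_inbox"})

# An agent that may send mail, but only with manual approval.
supervised = AgentPolicy(
    allowed_tools={"email.read_inbox", "email.send"},
    needs_approval={"email.send"},
)
```

Because the check sits between the model and the tool, a request to send mail from the read-only agent is rejected regardless of how the model phrases it.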
Models
Do I need an internet connection?
No. Quincy can run language models locally on your Mac using llama.cpp. An internet connection is only needed if you want to use a cloud provider like Anthropic or Google. See Setting Up llama.cpp to get started with local models.
Which models work best?
It depends on the task. Local models are great for routine work — fast, free, and private. Cloud models like Anthropic's Claude are stronger at complex reasoning, planning, and nuanced tasks. Many users run both: local models for everyday tasks and a cloud model for anything that needs more horsepower. See Choosing Models for guidance.
Can I use both local and cloud models?
Yes. Quincy supports hybrid setups where different agents use different models. You might run your orchestrator on Claude for strong reasoning while running simpler sub-agents on a local model to save cost and keep data private. You can also set up fallback chains: try a local model first, then fall back to a cloud model if the local one isn't available. See Choosing Models for details.
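The fallback-chain idea can be sketched in a few lines of Python. This is an illustrative sketch rather than Quincy's configuration format or API: `complete_with_fallback`, `ModelUnavailable`, and the two stand-in model functions are hypothetical names.

```python
from typing import Callable, Sequence


class ModelUnavailable(Exception):
    """Raised by a (hypothetical) model client when it cannot be reached."""


def complete_with_fallback(
    prompt: str,
    chain: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, client) in order; return (name, reply) from the first that works."""
    for name, client in chain:
        try:
            return name, client(prompt)
        except ModelUnavailable:
            continue  # this model is down; try the next one in the chain
    raise ModelUnavailable("no model in the chain was available")


def local_model(prompt: str) -> str:
    # Stand-in for a local model whose server is currently not running.
    raise ModelUnavailable("local server not reachable")


def cloud_model(prompt: str) -> str:
    # Stand-in for a call to a cloud provider.
    return f"cloud reply to: {prompt}"


chain = [("local", local_model), ("cloud", cloud_model)]
```

Ordering the chain local-first means cloud costs are only incurred when the local model cannot answer.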
MCP & Integrations
What is MCP?
MCP (Model Context Protocol) is an open standard that lets AI agents connect to external tools and data sources. Think of MCP servers as plugins — each one exposes tools that any compatible agent can use. There's a growing ecosystem of MCP servers for everything from GitHub and Slack to home automation and databases. See Extending Quincy with MCP for how to connect them.
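Under the hood, MCP messages are JSON-RPC 2.0, and an agent invokes a tool by sending a `tools/call` request. The Python sketch below builds such a message; the helper name `tools_call_request`, the tool name `create_issue`, and its arguments are hypothetical, since real tool names come from the server's `tools/list` response.

```python
import json


def tools_call_request(request_id: int, tool_name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP uses."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# `create_issue` and its arguments are made up for illustration.
message = tools_call_request(1, "create_issue", {"title": "Fix login bug"})
```

You normally never write these messages by hand; the sketch only shows what travels between an agent and an MCP server.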
Can I connect Quincy to my existing tools?
Yes. Quincy can connect to any MCP server and can also wrap REST APIs that have an OpenAPI or Swagger spec. You can connect tools for task management, calendars, smart home devices, databases, and more. Ask Quincy to set up a connection and it walks you through the process. See Extending Quincy with MCP for details.
How does Quincy compare to Lasso MCP Gateway?
Both Quincy and Lasso MCP Gateway help make AI tool usage safer, but they solve different problems. Lasso is a transparent proxy that adds automated security scanning (token masking, PII detection, prompt injection filtering) to your existing MCP setup. Quincy is a full agent platform with a built-in MCP firewall that includes human-in-the-loop approval: you can require manual confirmation before specific tool calls execute, so you stay in the decision loop at runtime rather than relying solely on predefined rules. For a detailed comparison, see Quincy vs Lasso MCP Gateway.
Agents
Can I build my own agents?
Yes. You can ask Quincy to create a custom agent for any task — scan email for action items, manage your smart home, query a database, or anything else you need. You define the agent's purpose, tools, and restrictions together in conversation, and Quincy handles the rest. See How the Agent System Works for the full guide.
Can I modify built-in agents?
Yes. Every agent in Quincy — including the ones that ship with it — can be customized. If a built-in agent is too verbose, uses the wrong model, or has access to tools it doesn't need, you can change its system prompt, model, and tool permissions. There are no special agents that are off-limits. See How the Agent System Works for details.
Troubleshooting & Cost
What do I do if something isn't working?
Quincy has a built-in diagnostic tool that can check your setup, test connections to model providers, and identify common problems. Ask Quincy to run diagnostics, or see Doctor for what it checks and how to interpret the results.
Does Quincy work on iPhone?
Yes, with a server running on your Mac. Quincy's server handles the AI processing, and you connect from your iPhone over your local network. The conversation happens on your phone; the work happens on your Mac. See Server & Clients for setup instructions.
How can I reduce API costs?
Use local models for routine tasks and reserve cloud models for complex reasoning. Quincy's hybrid setup lets you mix local and cloud models across different agents, so you only pay for cloud API calls when you need the extra capability. See Saving Money with Local LLMs for a detailed guide.