
Why Thuki Is Local-First, Not Cloud-Based

Most AI products add a cloud tier. Thuki shipped without one. Here is the reasoning, the trade-offs, and what this architectural choice costs users and what it gives them in return.

Written by Logan Nguyen

Adding a cloud tier was the obvious move. Cloud tiers are how AI products make money, how they ship frontier models without depending on user hardware, and how they iterate features without waiting for a release cycle. Thuki shipped without one anyway. The reasoning is worth being explicit about, because it shapes everything else: the pricing, the privacy guarantees, the product roadmap, and what you can expect from the project a year from now.


What "Local-First" Means in Thuki's Context

Local-first means three things, in order of importance:

  1. The model runs on your machine. No prompt, no response, no context window is sent to a remote server. Inference happens on your CPU and GPU.
  2. The application stores nothing remotely. Conversations, settings, model files, and history all live on your disk. There is no Thuki account, no Thuki database, no "your data on our servers."
  3. The software keeps working if Thuki the project disappears. Because there is no backend to shut down, an installed copy of Thuki will keep running as long as macOS and Ollama keep working. No license server, no phone-home check, no subscription wall.

This is a stronger guarantee than "we promise not to look at your data." It is a structural guarantee: there is no path for data to leave, because no such path was built.
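
To make the first of those guarantees concrete, here is a minimal sketch of a fully local inference call, assuming Ollama's standard HTTP API on its default port (11434). It is illustrative Swift, not Thuki's actual source; the point is that the only address the request ever touches is localhost.

```swift
import Foundation

// Request/response shapes for Ollama's /api/generate endpoint.
// Extra fields in the response JSON are simply ignored by Codable.
struct GenerateRequest: Codable {
    let model: String
    let prompt: String
    let stream: Bool
}

struct GenerateResponse: Codable {
    let response: String
}

func askLocalModel(_ prompt: String) async throws -> String {
    // localhost:11434 is Ollama's default listen address; the request
    // never leaves the loopback interface.
    let url = URL(string: "http://localhost:11434/api/generate")!
    var request = URLRequest(url: url)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    request.httpBody = try JSONEncoder().encode(
        GenerateRequest(model: "llama3", prompt: prompt, stream: false)
    )
    let (data, _) = try await URLSession.shared.data(for: request)
    return try JSONDecoder().decode(GenerateResponse.self, from: data).response
}
```

Unplug the network cable and this call still works, which is the whole point.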


Why a Cloud Tier Was the Obvious Move

It is worth being honest about what local-first costs the project itself, not just the user. A cloud tier would have solved real problems.

Revenue. A monthly subscription is the simplest, most predictable way to fund development. Free local software has a thin sustainability model.

Access to frontier models. GPT-4-class models cannot realistically run on a laptop. A cloud tier would let users tap into models that local hardware cannot host today.

Easier feature rollout. Server-side features ship instantly to every user. Client-side features require a release, an update prompt, and the user actually installing it.

Telemetry and product feedback. A backend can collect anonymized usage data, learn what works, and improve faster. Without one, the team has to ask users directly or infer behavior from GitHub issues.

Easier onboarding. "Sign up and start chatting" is a shorter funnel than "install Ollama, pull a model, install Thuki, point it at your model."

These are not theoretical advantages. They are the reason almost every AI product ships with a cloud tier. Choosing not to add one is choosing to live with the costs above.


What You Get When AI Runs Locally

The trade-offs flip when viewed from the user's side.

Privacy by structure, not by promise. Most "private" AI products are private in the sense that the company promises not to misuse your data. Thuki is private in the sense that there is no network path for the data to travel. You can verify this by disconnecting from the internet and watching Thuki keep working. A promise can be broken or changed by a new ToS; a structural guarantee cannot.

No bill, ever. Local inference has no marginal cost. Once you have the hardware and the model, the only thing inference costs is electricity. Thuki does not charge a subscription, does not have a paid tier, and does not plan to add one. The product is sustainable because the costs the project absorbs (development, hosting the website, maintaining the GitHub repo) do not scale per query.

Offline-capable, always. Thuki works on a plane, in a hotel with bad Wi-Fi, in a secure facility, or during an internet outage. The model is on your disk; the runtime is on your disk; the UI is on your disk.

You pick the model. Cloud AI products give you one model (theirs). Thuki lets you pull any model Ollama supports and switch freely. If a new open-source model is released tomorrow, you can be running it the same day without waiting for Thuki to add support.

The software outlives the project. If Thuki development stops, your installed copy keeps working. No license expiry. No "the service has been discontinued" email. The version you have today will still launch and run in five years, as long as you can run an Ollama-compatible model on your hardware.


What Local-First Costs the User

Honesty matters here. Local-first is not free of trade-offs.

Setup is more involved. Cloud AI is "open the app, start chatting." Local AI is "install Ollama, download a model, install Thuki, point it at the model." The first run is longer.

Model quality lags the frontier. The biggest cloud models (GPT-4o, Claude Opus) are not yet runnable on a consumer Mac. A 7-8B parameter local model handles most daily tasks well, but it will not match a 400B+ parameter cloud model on long-context reasoning, complex multi-step planning, or specialized benchmarks. For people who use frontier capabilities daily, local AI is a complement to cloud AI, not a full replacement.

Performance depends on your hardware. A 16 GB M2 MacBook gives a great experience. An 8 GB M1 Air running a 7B model will feel slower. Cloud AI gives every user the same baseline performance; local AI gives every user the performance their hardware can support.

No automatic model upgrades. When OpenAI ships a new model, every ChatGPT user gets it immediately. With Thuki, getting a new model means waiting for the open-source community to release one, then pulling it via Ollama. There is a lag.

These are real costs, not marketing trade-offs. Anyone deciding between Thuki and a cloud AI tool should weigh them honestly.


Cloud Models, On Your Terms (BYOK)

Local-first is the default. It is not the only option Thuki plans to support.

Some users will reasonably want access to frontier cloud models (GPT-4o, Claude Opus, and others) for tasks where local hardware cannot match the quality. The plan is to add bring-your-own-key (BYOK) support: you plug in your own API key from OpenAI, Anthropic, or any other provider, and Thuki routes the request directly from your machine to that provider's API.

What this means in practice:

  • No Thuki backend. Requests go from your Mac directly to the provider you chose. No Thuki server sits in the loop, proxies your key, or stores your queries.
  • You hold the keys. You pay the provider directly. You can revoke the key at any time. You take on the privacy trade-off knowingly because you are the one putting the key in.
  • Local stays the default. BYOK is opt-in, configured per-model. Your existing local setup is untouched, and removing BYOK at any point returns you to the same fully-local experience.
  • Provider-agnostic. Any provider with an OpenAI-compatible API works, as the sketch below shows. You are not locked into one vendor and can switch freely.
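
A minimal sketch of what that direct connection looks like, assuming the OpenAI-compatible chat-completions format that Ollama exposes locally and most cloud providers accept. The ModelEndpoint type and chat function are hypothetical, not Thuki's API; only the endpoint paths and the request shape are standard.

```swift
import Foundation

struct ModelEndpoint {
    let baseURL: URL      // where requests go
    let apiKey: String?   // nil for local Ollama; your own key for cloud
    let model: String
}

// Local default: Ollama's OpenAI-compatible endpoint on loopback.
let local = ModelEndpoint(
    baseURL: URL(string: "http://localhost:11434/v1")!,
    apiKey: nil,
    model: "llama3"
)

// Opt-in BYOK: your Mac talks straight to the provider you chose.
let cloud = ModelEndpoint(
    baseURL: URL(string: "https://api.openai.com/v1")!,
    apiKey: ProcessInfo.processInfo.environment["OPENAI_API_KEY"],
    model: "gpt-4o"
)

func chat(_ endpoint: ModelEndpoint, message: String) async throws -> Data {
    var request = URLRequest(
        url: endpoint.baseURL.appendingPathComponent("chat/completions")
    )
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")
    if let key = endpoint.apiKey {
        // The key travels directly from your machine to the provider;
        // there is no Thuki server in between to proxy or log it.
        request.setValue("Bearer \(key)", forHTTPHeaderField: "Authorization")
    }
    let payload: [String: Any] = [
        "model": endpoint.model,
        "messages": [["role": "user", "content": message]]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: payload)
    let (data, _) = try await URLSession.shared.data(for: request)
    return data  // same response shape in both modes
}
```

Swapping `local` for `cloud` changes one value, not the architecture: the request still originates on your machine and goes to exactly one place you chose.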

The workflow stays the same in either mode. Thuki's distinguishing trait is not which model is running but how you reach for it: double-tap Control to summon the overlay from anywhere, highlighted text auto-loaded as context, /screen to attach what is on your monitor, and a dismiss-and-return flow that puts you back exactly where you were. That ambient design is what Thuki was actually built around, and it does not change when you swap a local 7B model for a frontier cloud model via BYOK. You keep the workflow you want, with the model quality the task needs. This combination is rare: most ambient AI tools either lock you to one provider, hide the model behind a chat tab, or require you to leave the app you are in. Thuki refuses all three.
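
For the curious, here is a hypothetical sketch of how a Control double-tap can be detected system-wide with AppKit's global event monitor. None of this is Thuki's actual code; the names and the 0.35-second window are invented for illustration.

```swift
import AppKit

// Hypothetical: detect two quick taps of the Control key from any app.
// Global key-event monitors typically require the Accessibility permission.
final class DoubleTapControlMonitor {
    private var monitor: Any?
    private var lastTap: TimeInterval = 0
    private let maxGap: TimeInterval = 0.35  // max seconds between taps

    init(onSummon: @escaping () -> Void) {
        // .flagsChanged fires on modifier-key presses even while another
        // app has focus, which is what makes the overlay "ambient".
        monitor = NSEvent.addGlobalMonitorForEvents(matching: .flagsChanged) { [weak self] event in
            guard let self, event.modifierFlags.contains(.control) else { return }
            if event.timestamp - self.lastTap < self.maxGap {
                self.lastTap = 0
                onSummon()  // second tap inside the window: show the overlay
            } else {
                self.lastTap = event.timestamp
            }
        }
    }

    deinit {
        if let monitor { NSEvent.removeMonitor(monitor) }
    }
}
```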

BYOK does not compromise the local-first stance because Thuki itself never becomes a cloud service. The architecture is the same in both modes: a desktop overlay that connects to whatever model endpoint you point it at, whether that endpoint is on your machine or on a third-party server. The "no cloud tier" rule is about Thuki not running its own paid cloud service; it was never a promise that the app would refuse to talk to a cloud API on your behalf.

If frontier capabilities matter to you on a specific task, BYOK gives you that escape hatch without forcing every Thuki user into a cloud subscription. If they do not matter, the default local setup is exactly what was described above.


Frequently Asked Questions

Will Thuki ever add a cloud option? Thuki itself will never run a paid cloud service or sit between you and a model provider. What is on the roadmap is bring-your-own-key (BYOK): you supply an API key from OpenAI, Anthropic, or any provider with a compatible API, and Thuki connects directly from your machine to that provider. No Thuki backend, no key proxying, no Thuki-managed subscription. Local-first stays the default; BYOK is an opt-in escape hatch for tasks where frontier cloud models matter more than the structural privacy guarantee.

What if local models are not good enough for my work? Local models are sufficient for most everyday tasks: summarizing, drafting, explaining code, answering questions, working through ideas. For frontier-only tasks (complex multi-step agents, very long context, image generation at GPT-4o quality), cloud AI is currently the better tool. Use both. Thuki is not trying to be your only AI; it is trying to be the AI that handles the 80% of tasks where privacy and speed matter more than raw capability.

How does the project sustain itself without a subscription? Thuki is open source under Apache 2.0. The project is funded by its sole maintainer's time, not by recurring revenue. This is genuinely a slower path: no marketing budget, no full-time team, no investor pressure. The trade is that the product can prioritize user interests over revenue interests indefinitely. If support tiers, sponsorship, or one-time purchases ever fund the work, they will be additive, not gating.

What happens if Thuki gets abandoned? The software you have installed keeps working. There is no license server to deactivate. Ollama is its own independent project with broad community support. Your installed Thuki will keep launching, keep connecting to your local Ollama, and keep responding to the Control double-tap, as long as macOS and Ollama remain functional on your machine. The source is on GitHub under a permissive license; if anyone wanted to fork and continue development, they could.

Why not just open source the cloud version too? A self-hosted cloud version of Thuki would shift the problem rather than solve it: now the user has to run a server, manage authentication, secure the database, and maintain uptime. That is not a Mac app workflow. The whole point of Thuki is that the desktop is already where you work; the AI should come to the desktop, not require you to set up infrastructure.


The Bigger Picture: Local-First Software

Thuki fits into a broader movement. The local-first software essay from Ink & Switch laid out the case in 2019: software that puts user data on the user's device, syncs optionally, and respects the user's ownership of their work. Apps like Obsidian, Logseq, and Anytype have built around this principle for note-taking and knowledge work. The AI space has been slower to adopt it, largely because of the model-quality gap.

That gap is closing. Open-source models have improved dramatically since 2023. The hardware to run them has gotten cheaper and faster. The case for cloud-only AI is weaker every year. Thuki is a bet that the local-first approach, applied to AI assistants, is where the space will eventually land.


The Short Version

Thuki has no cloud tier because adding one would compromise the structural privacy guarantee, split users into paid and free classes, and tie the project's survival to recurring revenue in a way that conflicts with its intent. The trade-off is real: setup is harder, models lag the frontier, and performance depends on your machine. But what you get in return is privacy by structure, no bill, offline capability, model choice, and software that outlives the project.

If those trade-offs match what you actually want from an AI assistant, Thuki is built for you.
