
Models and thinking

Every teammate has a brain you can dial up or down: a faster one for quick jobs, a deeper one for hard ones. This page shows you where to set it and how to pick well, so a teammate is as quick — or as thorough — as the task deserves.

Open Agent Builder for the teammate you want to tune, then open the Config tab. You’ll find three settings stacked together: Model, Extended Thinking, and a per-request Budget limit.

*Agent Builder → Config: the Model picker, the Extended Thinking switch, and the budget limit.*

A breakdown of what you’re looking at:

  1. Model — the brain you’re handing this teammate. Each option shows a short note and a small tag.
  2. Extended Thinking — a switch that lets the teammate reason step by step before replying, with an Effort level when it’s on.
  3. Budget limit per query — an optional cap on how much a single request can spend.
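If it helps to picture how the three settings fit together, here is a minimal sketch in Python. It is only an illustration: `TeammateConfig` and its field names are hypothetical and not part of the product, and the default Effort level shown is an assumption, since this page does not say which level is preselected.

```python
from dataclasses import dataclass
from typing import Optional

# Effort names as they appear in the Effort menu.
EFFORT_LEVELS = ("Low", "Medium", "High", "Max")


@dataclass
class TeammateConfig:
    """Hypothetical picture of one teammate's Config tab; not a real API."""
    model: str = "Latest (Opus)"                    # the recommended default
    extended_thinking: bool = True                  # thinking on, as in the default
    effort: Optional[str] = "Medium"                # assumed default; only used when thinking is on
    budget_limit_per_query: Optional[float] = None  # None means no cap

    def validate(self) -> None:
        """Check the rules this page describes for the three settings."""
        if self.extended_thinking:
            if self.effort not in EFFORT_LEVELS:
                raise ValueError(f"effort must be one of {EFFORT_LEVELS}")
        elif self.effort is not None:
            # The Effort menu only appears once Extended Thinking is on.
            raise ValueError("effort applies only when Extended Thinking is on")
        if self.budget_limit_per_query is not None and self.budget_limit_per_query <= 0:
            raise ValueError("budget limit must be a positive amount")
```

Calling `TeammateConfig().validate()` passes, while `TeammateConfig(extended_thinking=False, effort="High").validate()` raises an error, mirroring the way Effort is only available with thinking switched on.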

Open the Model picker

In Config, find the Model card. It’s a short list — tap the circle next to the one you want.

*The Model list — tap one to choose it.*

Read it in plain terms — faster vs deeper

You don’t need to know model trivia. The simple rule:

Deeper

Bigger models (the Opus options) think harder and handle complex, multi-step work best. Slower and a little pricier — worth it for the important stuff.

Faster

Lighter models (Sonnet, Haiku, the Flash option) reply quicker and cost less. Great for short, simple jobs.

The tag on each row tells you how much each request can hold (for example, 1M ctx means a context window of about one million tokens, so the model can read a lot at once).

Keep the recommended default if you’re unsure

Latest (Opus) is marked recommended for most users and stays current automatically. If you’re not sure, leave it — it’s a strong all-rounder.

Some jobs need a teammate to reason carefully before answering — planning a multi-step task, weighing options, working through a tricky calculation. That’s what thinking is for.

A teammate with thinking on takes a short, private moment to reason things through before it replies. It tends to give steadier, better-considered answers on hard problems — and it isn’t needed for simple ones.

Flip the Extended Thinking switch

On the Extended Thinking card, turn the switch on. A short note explains that it adjusts how hard the teammate reasons.

*Extended Thinking on — the teammate reasons before it replies.*

Choose an Effort level

With thinking on, an Effort menu appears — Max, High, Medium, or Low. Higher effort means more careful reasoning (better on hard work, a little slower and pricier); lower effort is quick and light.

*Effort: Max, High, Medium, or Low — more effort for harder work.*

Speed, quality, and cost — the trade-off


These three settings pull against each other in a simple way:

Quick and cheap

A faster model with thinking off (or on Low). Best for short, routine jobs — quick answers, simple drafts, status checks.

Deep and thorough

A bigger model with thinking on at higher effort. Best for complex, important work where being right matters more than being fast. Costs a bit more per request.

You don’t have to get this perfect. The recommended default — Latest (Opus) with thinking on — is a sensible balance for most teammates, and you can adjust any time.
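The rule of thumb above can be written down as a tiny decision helper. This is a sketch under stated assumptions, not how the product chooses anything: `suggest_settings` and `Suggestion` are hypothetical names, "High" stands in for "higher effort", and the "Medium" in the fallback is an assumed value because this page does not name a default Effort level.

```python
from typing import NamedTuple, Optional


class Suggestion(NamedTuple):
    model: str
    extended_thinking: bool
    effort: Optional[str]  # None when thinking is off


def suggest_settings(complex_work: bool, speed_first: bool) -> Suggestion:
    """Map the two questions from this page to a starting point (illustrative only)."""
    if complex_work and not speed_first:
        # Deep and thorough: a bigger model with thinking on at higher effort.
        return Suggestion("an Opus option", True, "High")
    if speed_first and not complex_work:
        # Quick and cheap: a faster model with thinking off.
        return Suggestion("Sonnet, Haiku, or the Flash option", False, None)
    # Unsure or mixed: fall back to the recommended default.
    # "Medium" is an assumption; the page does not state a default effort.
    return Suggestion("Latest (Opus)", True, "Medium")
```

For example, `suggest_settings(complex_work=False, speed_first=True)` suggests a faster model with thinking off, which suits a teammate that mostly handles status checks.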

Bigger models and more thinking do more work on each request, so each request costs a little more. The model list shows a small cost tag on each option so the pricier-vs-cheaper trade-off is visible before you choose. To put a hard ceiling on any single request, turn on Budget limit per query and set an amount; the teammate won’t spend more than that on one job.

*Budget limit per query — an optional cap on what a single request can cost.*
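To see why a tight cap can make a big job stop early, here is a small simulation. It is a sketch only: this page does not describe how the cap is enforced, so the step-by-step model, the function name `run_with_cap`, and the amounts are all hypothetical.

```python
from typing import List, Optional, Tuple


def run_with_cap(step_costs: List[float], budget_limit: Optional[float] = None) -> Tuple[int, float]:
    """Simulate one request made of several steps.

    Returns (steps completed, total spent). The request stops before any
    step that would push spending past the cap. Illustrative only.
    """
    spent = 0.0
    completed = 0
    for cost in step_costs:
        if budget_limit is not None and spent + cost > budget_limit:
            break  # the next step would exceed the cap, so stop here
        spent += cost
        completed += 1
    return completed, spent


# A job with three steps costing 0.25, 0.25, and 0.5 (made-up units).
print(run_with_cap([0.25, 0.25, 0.5], budget_limit=0.5))  # (2, 0.5)
print(run_with_cap([0.25, 0.25, 0.5]))                    # (3, 1.0)
```

With the cap at 0.5 the job stops after two steps; with no cap it runs to the end. If a teammate keeps stopping early on large jobs, raising the amount has the same effect here as it would in Config.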

For how charges add up across your whole company and what’s included in your plan, see Billing and plans.

If a model or thinking change doesn’t seem to take effect, check these first:

  1. Wrong teammate — make sure you’re in Agent Builder for the teammate you meant to change, not a different one.
  2. Change not saved — the setting saves with the rest of the teammate’s setup; finish saving in Agent Builder before testing.
  3. Effort menu missing — the Effort level only appears after you turn Extended Thinking on.
  4. Replies feel slow — that’s expected on a deeper model or higher effort; switch to a faster model or lower effort if speed matters more for that teammate.
  5. A request stopped early — if you set a tight Budget limit, a big job may hit the cap; raise the amount or turn the limit off for that teammate.

Still stuck? See Getting help.

Which model should I pick if I don’t know?

Leave it on Latest (Opus) — it’s the recommended default, deep enough for almost any job, and it stays current automatically. Switch to a faster model only for a teammate that mostly does quick, simple work.

What does “thinking” actually do?

It lets a teammate reason step by step, privately, before it replies — like thinking before speaking. It helps on hard, multi-step problems and isn’t needed for simple questions. Turn it on and set the Effort level to control how much.

Will a faster model give worse answers?

Not for simple jobs — a faster model is plenty for quick questions and routine drafts. For complex, important work, a deeper model handles the hard parts more reliably. Match the model to the job.

Does a bigger model or more thinking cost more?

A little, per request. Bigger models and higher thinking effort do more work on each request, so each one costs slightly more. The model list shows a cost tag on each option, and you can cap any single request with Budget limit per query. See Billing and plans for the full picture.

Do I set this for each teammate separately?

Yes. Model and thinking are per-teammate, in each one’s Agent Builder → Config. So you can give your hardest-working teammate a deeper brain and keep a quick-question teammate fast and cheap.