How AI Agents Hand Over: Agent-to-Agent Transfer, Human Escalation, and Live Supervision

Anuprash Gupta

Product & Platform, Zoice

June 8, 2026 · 7 min read

The hardest moment in any customer conversation is the handover. Human call centers learned this decades ago: a cold transfer where the caller repeats everything to a new person is where satisfaction goes to die. AI-run conversations have the same failure mode — plus a new one, where a single overloaded agent tries to be the billing desk, the scheduler, and the complaints team all at once, and does each job worse than a specialist would.

This post covers how handover actually works on the Zoice voice platform: warm AI-to-AI transfers, escalation to human phone numbers, the trigger scenarios that decide when each happens, and the live supervision tools that let a human operator step into a conversation without ending it.

Why One Mega-Prompt Agent Fails at Scale

The first version of most deployments is a single agent with a long prompt: greet the caller, answer product questions, handle billing disputes, book appointments, take complaints. It demos well. Then real traffic arrives and three problems compound:

Instruction dilution. The more jobs in one prompt, the weaker each instruction's pull on the model. Edge-case billing rules buried under scheduling logic get ignored exactly when they matter.
Untestable changes. Fixing the refund flow means editing a prompt that also runs scheduling — every change risks regressions in unrelated conversations, so teams stop iterating.
No clean ownership. When the billing team and the support team share one prompt, nobody owns its behavior, and prompt edits turn into merge conflicts between departments.

The fix mirrors how good call centers are organized: specialists plus a disciplined transfer protocol. Each agent gets a short, focused prompt for one domain, and built-in transfer tools move the conversation to whoever should own it next — another AI agent or a human.

Key Insight

One mega-prompt agent that tries to do everything degrades at scale — specialist agents connected by transfer tools do not. Two built-in tools, transfer_agent and transfer_number, route conversations between AI specialists and human staff based on natural-language trigger scenarios, while live supervision (Whisper, text take-over, voice takeover) lets operators step in mid-conversation without ending the AI session.

The Two Transfer Tools

Zoice agents ship with two built-in tools that the LLM can invoke mid-conversation, the same way it would call any other function.

transfer_agent: Warm AI-to-AI Transfer

This tool hands the conversation to another agent picked from your organization. The transfer is warm — the conversation continues rather than restarting — so a caller who drifts from a delivery question into a billing dispute lands with the billing agent without re-explaining anything. Specialist agents stay small and testable because anything outside their domain is one tool call away.

transfer_number: Escalation to a Human

Some conversations need a person: an angry customer, a request with legal weight, a judgment call the agent is not authorized to make. transfer_number sends the call to a human phone number, with an optional pre-transfer TTS message — something like "Connecting you to our support team, please hold" — so the caller hears a clear bridge instead of dead air.
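To make "invoked like any other function" concrete, here is a minimal sketch of how the two tools might be described in the JSON-schema style that many function-calling LLM APIs accept. The parameter names (target_agent_id, phone_number, pre_transfer_message) and the dispatch helper are assumptions for illustration, not Zoice's actual schema.

```python
import json

# Hypothetical tool definitions in the JSON-schema style used by many
# function-calling LLM APIs. All field names are illustrative assumptions.
TRANSFER_TOOLS = [
    {
        "name": "transfer_agent",
        "description": "Warm-transfer the conversation to another AI agent.",
        "parameters": {
            "type": "object",
            "properties": {"target_agent_id": {"type": "string"}},
            "required": ["target_agent_id"],
        },
    },
    {
        "name": "transfer_number",
        "description": "Escalate the call to a human phone number.",
        "parameters": {
            "type": "object",
            "properties": {
                "phone_number": {"type": "string"},
                "pre_transfer_message": {"type": "string"},  # optional bridge
            },
            "required": ["phone_number"],
        },
    },
]


def dispatch(tool_call: dict) -> str:
    """Turn a model-issued tool call into a human-readable action."""
    args = json.loads(tool_call["arguments"])
    if tool_call["name"] == "transfer_agent":
        return f"handover to agent {args['target_agent_id']} with history kept"
    if tool_call["name"] == "transfer_number":
        bridge = args.get("pre_transfer_message")
        spoken = f"say '{bridge}', then " if bridge else ""
        return f"{spoken}dial {args['phone_number']}"
    raise ValueError(f"unknown tool: {tool_call['name']}")
```

The optional bridge message is modeled as a non-required property, which mirrors the article's description of the TTS message as optional.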

Trigger Scenarios: Teaching the Agent When to Transfer

Both tools are configured with multiple natural-language trigger scenarios. You write plain sentences — "user asks about billing", "caller wants to speak to a manager", "user mentions a refund" — and each scenario maps to a destination. This is the part teams underinvest in: the quality of your handoffs is the quality of your scenario list. Write scenarios the way callers actually talk, including the indirect phrasings ("there's a problem with my bill" as well as "billing question").
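One way to picture a scenario list is as plain data: each sentence paired with a destination, then rendered into text the LLM reads when deciding whether to transfer. The sketch below assumes a hypothetical Scenario record and made-up agent ids; the real configuration format may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    trigger: str      # plain-language sentence describing when to transfer
    destination: str  # hypothetical agent id the scenario maps to


SCENARIOS = [
    Scenario("user asks about billing", "billing_agent"),
    Scenario("user mentions a refund", "billing_agent"),
    Scenario("caller says there's a problem with their bill", "billing_agent"),
    Scenario("caller wants to speak to a manager", "escalations_agent"),
]


def render_tool_description(scenarios: list[Scenario]) -> str:
    """Group triggers by destination into text an LLM can read."""
    by_destination: dict[str, list[str]] = {}
    for s in scenarios:
        by_destination.setdefault(s.destination, []).append(s.trigger)
    lines = []
    for destination, triggers in by_destination.items():
        lines.append(f"Transfer to {destination} when: " + "; ".join(triggers))
    return "\n".join(lines)
```

Keeping direct and indirect phrasings side by side in the same list makes gaps easy to spot in review: if a destination has only one trigger, callers who phrase the request differently are likely to miss it.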

One guardrail worth knowing: transfers are function calls under the hood, so the agent editor warns you when the selected LLM lacks function-calling support. Catch that at design time, not on a live call where the transfer silently never fires.
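The same design-time check is easy to reproduce in your own deployment scripts. This is a rough sketch with a hypothetical capability table and made-up model names; in practice the capability information would come from your model provider's documentation.

```python
# Hypothetical capability table with made-up model names.
SUPPORTS_FUNCTION_CALLING = {
    "model-with-tools": True,
    "model-text-only": False,
}

TRANSFER_TOOL_NAMES = ("transfer_agent", "transfer_number")


def validate_agent_config(model: str, tools: list[str]) -> list[str]:
    """Return design-time warnings instead of failing on a live call."""
    warnings = []
    transfer_tools = [t for t in tools if t in TRANSFER_TOOL_NAMES]
    # Unknown models are treated as unsupported, the cautious default.
    if transfer_tools and not SUPPORTS_FUNCTION_CALLING.get(model, False):
        warnings.append(
            f"{model} lacks function calling; "
            f"{', '.join(transfer_tools)} will never fire"
        )
    return warnings
```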

These pieces — the two tools, their scenarios, and the routing and supervision around them — break down as follows.

Core Components

1. transfer_agent Tool

A built-in tool for warm AI-to-AI transfer — the conversation moves to another agent picked from your organization, so a billing specialist takes over from the front-desk agent mid-conversation.

2. transfer_number Tool

Escalation to a human phone number, with an optional pre-transfer TTS message such as 'Connecting you to our support team' so the caller is never dropped into silence.

3. Trigger Scenarios

Each transfer tool carries multiple natural-language scenarios — 'user asks about billing', 'caller wants to cancel' — that tell the LLM exactly when to invoke the transfer.

4. Live Supervision Trio

Whisper folds a private operator instruction into the agent's next turn; text take-over pauses the bot while the operator replies in-thread; voice takeover rings the operator's phone into the live call, then releases back to the AI.

5. Inbound Routing Rules

First-match-wins rules on caller prefix, language, business hours, keywords, and channel decide which agent answers in the first place — transfers handle what routing cannot predict.

Live Supervision: Three Levels of Human Intervention

Transfers handle the planned handoffs. Supervision handles the unplanned ones — a high-value caller, a conversation drifting off course, a new agent on its first week of traffic. Operators watching a live conversation have three escalating controls:

Control	What happens	Caller notices?
Whisper	Operator sends a private instruction that is folded into the agent's next turn; the AI keeps talking, but steered	No: the caller hears only the agent's adjusted reply
Text take-over	Bot pauses; the operator replies as a human in the same thread. The transcript shows amber OPERATOR labels and handoff timeline markers	Yes — a human is now answering
Voice takeover	The operator's phone rings and joins the live call; when the situation is handled, the operator releases the call back to the AI	Yes — a human voice joins the call

Whisper is the workhorse: a supervisor who spots the agent missing context can inject "offer the goodwill credit, this is a ten-year customer" without the caller ever knowing. Take-over modes are for moments that need human judgment now — and crucially, voice takeover is reversible. The human resolves the hard part, releases the call, and the AI finishes the routine wrap-up.
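The three controls can be read as a small state machine around the agent's turn loop. The sketch below is an illustration under assumptions, not the platform's implementation: the class and method names are hypothetical, and it assumes the AI stays silent during either take-over mode and that release() applies to both.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Mode(Enum):
    AI = "ai"                 # agent answers on its own
    TEXT_TAKEOVER = "text"    # bot paused, operator replies in-thread
    VOICE_TAKEOVER = "voice"  # operator has joined the live call


@dataclass
class SupervisedSession:
    mode: Mode = Mode.AI
    pending_whispers: list[str] = field(default_factory=list)

    def whisper(self, instruction: str) -> None:
        """Queue a private instruction for the agent's next turn."""
        self.pending_whispers.append(instruction)

    def take_over(self, mode: Mode) -> None:
        if mode is Mode.AI:
            raise ValueError("use release() to hand back to the AI")
        self.mode = mode

    def release(self) -> None:
        """Hand the conversation back to the AI agent."""
        self.mode = Mode.AI

    def next_agent_prompt(self, caller_text: str) -> Optional[str]:
        """Build the agent's next input, or None while a human is answering."""
        if self.mode is not Mode.AI:
            return None
        notes = "".join(f"[operator note: {w}]\n" for w in self.pending_whispers)
        self.pending_whispers.clear()  # a whisper steers one turn only
        return f"{notes}caller: {caller_text}"
```

A whisper never changes the mode, which is why the caller does not notice it; only the take-over calls move the session out of AI mode.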

Every one of these events — transfers, whispers, take-overs, releases — is logged in the transcript, so QA reviews and analytics can answer the operational questions: which scenarios fire most, which agents escalate most, where humans had to step in.
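If you export those logged events, the operational questions reduce to a few counts. The record shape below (type, agent, scenario) is a hypothetical example, not the actual transcript export format.

```python
from collections import Counter

# Hypothetical event records; a real transcript export will differ.
events = [
    {"type": "transfer", "agent": "front_desk", "scenario": "user asks about billing"},
    {"type": "transfer", "agent": "front_desk", "scenario": "user mentions a refund"},
    {"type": "whisper", "agent": "billing"},
    {"type": "transfer", "agent": "scheduling", "scenario": "user asks about billing"},
    {"type": "voice_takeover", "agent": "billing"},
    {"type": "release", "agent": "billing"},
]

HUMAN_EVENTS = {"whisper", "text_takeover", "voice_takeover"}

# Which scenarios fire most?
scenario_counts = Counter(e["scenario"] for e in events if e["type"] == "transfer")
# Which agents escalate most?
transfers_by_agent = Counter(e["agent"] for e in events if e["type"] == "transfer")
# Where did humans have to step in?
human_steps_by_agent = Counter(e["agent"] for e in events if e["type"] in HUMAN_EVENTS)
```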

Routing Decides Who Answers First

Transfers fix mid-conversation misroutes, but the cheapest handoff is the one you avoid. Inbound routing rules match on caller prefix, language, business hours, keywords, and channel — first match wins — so a Hindi-speaking caller dialing after hours about a claim lands on the right specialist immediately. Get routing right and transfer tools become the safety net, not the front door. The roadmap below sequences the whole setup.
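First-match-wins means rule order is the priority order, so the most specific rules belong at the top. Here is a minimal sketch of that evaluation; the Rule and Call shapes, the agent names, and the 09:00 to 18:00 business hours are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional


@dataclass(frozen=True)
class Call:
    caller: str
    language: str
    at: time
    channel: str
    text: str = ""


@dataclass(frozen=True)
class Rule:
    agent: str
    prefix: Optional[str] = None       # None means "do not check this field"
    language: Optional[str] = None
    after_hours: Optional[bool] = None
    keyword: Optional[str] = None
    channel: Optional[str] = None

    def matches(self, call: Call) -> bool:
        open_now = time(9, 0) <= call.at < time(18, 0)  # assumed business hours
        return all([
            self.prefix is None or call.caller.startswith(self.prefix),
            self.language is None or call.language == self.language,
            self.after_hours is None or self.after_hours == (not open_now),
            self.keyword is None or self.keyword in call.text.lower(),
            self.channel is None or call.channel == self.channel,
        ])


def route(call: Call, rules: list[Rule], default: str) -> str:
    """Return the agent of the first matching rule; list order is priority."""
    for rule in rules:
        if rule.matches(call):
            return rule.agent
    return default


RULES = [
    Rule(agent="claims_hi_night", language="hi", after_hours=True, keyword="claim"),
    Rule(agent="hindi_front_desk", language="hi"),
    Rule(agent="whatsapp_support", channel="whatsapp"),
]
```

With this ordering, a Hindi-speaking WhatsApp user during business hours lands on hindi_front_desk rather than whatsapp_support, because the language rule sits higher in the list. Reordering the rules changes the outcome, which is the main thing to review when routing looks wrong.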

Implementation Roadmap

1. Split your mega-prompt into specialist agents — one per domain (billing, scheduling, support) — each with a short, focused prompt
2. Attach transfer_agent to each specialist with natural-language trigger scenarios covering every out-of-domain request you expect
3. Add transfer_number for the cases that genuinely need a human, and write the pre-transfer TTS message so callers know what is happening
4. Configure inbound routing rules (caller prefix, language, business hours, keywords, channel) so most conversations start with the right agent
5. Train supervisors on the Whisper, text take-over, and voice takeover controls, and review handoff events in transcripts weekly

Designing the Handoff Layer: Practical Guidance

Start from your call taxonomy, not your org chart. Specialist agents should map to caller intents (billing, scheduling, claims) — the things callers ask for — even if one human team handles several of them today.
Make every agent a dead-end-free zone. Each specialist needs trigger scenarios covering everything outside its domain. The failure mode to design against is an agent gamely improvising answers it was never given.
Reserve humans for judgment, not volume. transfer_number should fire on emotion, authority, and ambiguity, not on routine questions a better-scoped AI agent could answer. Every unnecessary human transfer brings back the cost you deployed AI to remove.
Script the bridge. The pre-transfer TTS message is two seconds of audio that determines whether the caller experiences a handoff or a hang-up. Always set it.
Supervise new agents like new hires. Put fresh agents under live observation for their first weeks, lean on Whisper to correct course in real time, and graduate them to lighter oversight as transcript reviews come back clean.
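The "dead-end-free" rule above lends itself to a simple automated check: for each specialist, list the domains it has no transfer path to. The agent map below is a made-up example, and the check assumes one agent per domain.

```python
# Hypothetical agent map: each agent name doubles as its domain, and the
# value is the set of agents its transfer scenarios can reach.
AGENTS = {
    "billing": {"scheduling", "support"},
    "scheduling": {"billing"},
    "support": {"billing", "scheduling"},
}


def find_dead_ends(agents: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each agent to the domains it cannot hand off to."""
    gaps = {}
    for name, destinations in agents.items():
        missing = set(agents) - {name} - destinations
        if missing:
            gaps[name] = missing
    return gaps
```

In this example the scheduling agent has no route to support, so a complaint raised during a booking call would leave it improvising. Running a check like this whenever an agent is added keeps the transfer graph complete.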

The pattern that emerges is the same one told in the transfer story on our home page: AI handles the routine majority, specialists handle their domains, and humans enter precisely when judgment is needed, with the full conversation context already in place, in any of the 10+ Indian languages your callers prefer.

FAQ

Is a transfer_agent handoff visible to the caller?

The caller experiences a continuing conversation rather than a restart. It is a warm transfer: the new agent picks up where the previous one left off, so the caller is not asked to repeat themselves.

How does the agent know when to transfer?

Through the natural-language trigger scenarios you attach to each tool. Multiple scenarios per tool are supported, so "user asks about billing", "caller disputes a charge", and "user mentions a refund" can all route to the same billing agent.

What if my chosen LLM does not support function calling?

Transfers depend on function calling, and the agent editor warns you at configuration time if the selected model lacks it — switch models before going live.

Can an operator intervene without the caller knowing?

Yes — Whisper folds a private operator instruction into the agent's next turn. The caller only perceives a change if the operator escalates to text take-over or voice takeover, both of which are labeled in the transcript with amber OPERATOR markers and handoff timeline events.

What happens after a voice takeover ends?

The operator releases the call back to the AI agent, which resumes the conversation — humans handle the exception, the AI finishes the routine.

Want to see a warm transfer and a live takeover end to end? Book a walkthrough with our team.

Written by

Anuprash Gupta works on the Zoice platform across telephony, WhatsApp, and the agent tooling that powers real customer conversations. He writes about how teams put AI voice and chat agents into production — integrations, onboarding, analytics, and the practical decisions behind shipping conversational AI for Indian businesses.
