Unifying the Agent + Tool Control Plane

There’s a layer of the agent stack that I don’t think MCP, A2A, and friends have fully crystallized: the control contract that governs how two chat-attached processes relate to each other & to the user.

Let’s call them Alice (the Agent) and Tom (the Tool). But Tom could also be a sub-Agent.

I believe the contract between Alice and Tom needs to account for five main topics:

Calling Convention

What input should Alice provide Tom?
What output should Tom return to Alice?

Context Sharing

What context does Alice provide Tom?
What context does Tom return to Alice?

Interactivity

Can Tom interact with the user?
Can the user interact with Tom?

Exclusivity

Is Tom's access to context & the user exclusive while he is active?

Control

Can Tom pass control to Jerry (a third tool or agent)?
Can Tom return control to Alice at Tom’s discretion?
Can Alice seize control from Tom at Alice’s discretion?

Tools and Agents look like the same thing through the lens of this rubric:

The Agent<>Tool relationship is (1) a well-specified calling contract, with (2) undefined context sharing, (3) no interactivity, (4) undefined exclusivity, and (5) Tom fully blocks Alice until Tom returns.
The Agent<>Sub-Agent relationship is just the Agent<>Tool relationship with added interactivity.
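To make the comparison concrete, here is a minimal sketch of the rubric as a data structure. The `Contract` class, its field names, and the `Sharing` enum are my own hypothetical naming, not part of MCP or A2A; the two instances just encode the two descriptions above.

```python
from dataclasses import dataclass, fields
from enum import Enum
from typing import Optional


class Sharing(Enum):
    """How much context flows between Alice and Tom (hypothetical levels)."""
    UNDEFINED = "undefined"
    NONE = "none"
    FULL = "full"


@dataclass(frozen=True)
class Contract:
    """One row of the rubric: how Alice relates to Tom."""
    calling_convention: str    # e.g. a typed input/output schema
    context_sharing: Sharing   # what context flows in each direction
    interactive: bool          # can Tom and the user talk directly?
    exclusive: Optional[bool]  # None means the protocol leaves it undefined
    blocks_caller: bool        # does Alice wait until Tom returns?


# Assumption: both relationships share the same calling contract.
TOOL = Contract("typed schema", Sharing.UNDEFINED, False, None, True)
SUB_AGENT = Contract("typed schema", Sharing.UNDEFINED, True, None, True)


def differing_fields(a: Contract, b: Contract) -> list:
    """Names of the rubric entries on which two contracts disagree."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


print(differing_fields(TOOL, SUB_AGENT))  # ['interactive']
```

Under this encoding, the only entry separating a Tool from a Sub-Agent is `interactive`.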

Having a rubric like this makes it easy to identify some of the issues that chat platforms are no doubt having with third-party chat app designs:

If Claude yields control to an UberEatsBot, is that UberEatsBot allowed to retain control and keep trying to upsell you, even if you want to leave and go back to Claude?
If the UberEatsBot doesn’t want Claude to have access to the portion of the chat that took place during its control, can it specify that?
If Siri passes control to UberEatsBot, which then wants to pass control to UberEatsSupportBot, is that hand-off supported?
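One way to picture these control questions is as operations on a stack of whoever currently holds the conversation. This is only a sketch under that assumption: the `ControlStack` class and its `hand_off`, `yield_back`, and `seize` methods are hypothetical names, and the `allowed` flag stands in for whatever the contract between the two parties actually says.

```python
class ControlStack:
    """Tracks which chat-attached process currently holds the conversation."""

    def __init__(self, root: str):
        self._stack = [root]

    @property
    def active(self) -> str:
        return self._stack[-1]

    def hand_off(self, target: str, *, allowed: bool = True) -> bool:
        """The active process passes control to target, if the contract allows."""
        if not allowed:
            return False
        self._stack.append(target)
        return True

    def yield_back(self) -> str:
        """The active process returns control to whoever called it."""
        if len(self._stack) > 1:
            self._stack.pop()
        return self.active

    def seize(self, by: str) -> str:
        """A caller lower in the stack reclaims control, unwinding everyone above it."""
        if by not in self._stack:
            raise ValueError(f"{by} is not in the current chain of control")
        while self.active != by:
            self._stack.pop()
        return self.active


chat = ControlStack("Siri")
chat.hand_off("UberEatsBot")
chat.hand_off("UberEatsSupportBot")
print(chat.active)         # UberEatsSupportBot
print(chat.seize("Siri"))  # Siri
```

Whether `seize` should exist at all, and who is allowed to call it, is exactly the kind of thing the contract would need to pin down.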

As chat becomes an operating system of many agents, we’ll have to figure out what the shape of the navigable space is.

If the Agent Card / Tool Card of each agent were to list preferences & hard requirements across this rubric, then your "agent browser" (the chat client app itself) could reason about what it means to defer to them within the broader context of your navigation and personal data.
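As a rough illustration of that reasoning, the sketch below has a card declare hard requirements and soft preferences, and the client check them against what it is willing to grant. Everything here is hypothetical: the `AgentCard` class, the `evaluate` function, and the policy keys are made up for the example and are not fields of any existing Agent Card format.

```python
from dataclasses import dataclass, field


@dataclass
class AgentCard:
    name: str
    requires: dict = field(default_factory=dict)  # hard requirements
    prefers: dict = field(default_factory=dict)   # soft preferences


def evaluate(card: AgentCard, client_policy: dict) -> tuple:
    """Return (can_defer, unmet_preferences) for handing control to this agent."""
    # Any unmet hard requirement rules out the hand-off entirely.
    for key, wanted in card.requires.items():
        if client_policy.get(key) != wanted:
            return False, []
    # Unmet preferences do not block the hand-off but are reported.
    unmet = [k for k, v in card.prefers.items() if client_policy.get(k) != v]
    return True, unmet


uber_eats = AgentCard(
    name="UberEatsBot",
    requires={"hide_transcript_from_caller": True},
    prefers={"exclusive_user_access": True},
)
policy = {
    "hide_transcript_from_caller": True,
    "exclusive_user_access": False,
    "user_can_always_exit": True,
}
print(evaluate(uber_eats, policy))  # (True, ['exclusive_user_access'])
```

Here the client can defer to the bot because its one hard requirement is met, while noting that the bot's preference for exclusive access to the user will not be honored.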