MCP Needs a FAQ Endpoint

Here's a question. How can Siri navigate the following instruction?

File expenses from my trip.

Let's assume Siri is an agent on my phone, with access to an iOS equivalent of MCP that brokers capabilities offered by each app.

Here are some of the tools the agent will discover:

add_receipt(vendor, description, category, amount, date, attachment) from Expensify
get_photos(start_date, end_date) from the Photos App
get_calendar_items(start_date, end_date) from the Calendar App
get_email(start_date, end_date) from the Gmail App
get_contacts(query) from the Contacts App

...and so on.

There are a lot of ways our agent could fail, but I want to focus on one particular way to fail that I think MCP can easily combat: What if the agent doesn't realize receipts are stored in photos and emails?

Or, a different way to put it: What if the tool's docstring isn't enough?

What if this were possible..

At some point during the planning process, the agent will identify add_receipt as a critical goal. But now it needs to realize that it should look in photos and emails to find receipts to add.

At this exact moment, imagine there were an MCP equivalent of a Frequently Asked Questions endpoint, hosted by the provider of a particular tool, that the agent could call.

-> faq("Expensify:add_receipt", "Where do I normally find receipts?", {context: "iOS"})
<- "Receipts are usually found as photos or email attachments."
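
Here's a minimal Python sketch of what the broker side of that call might look like. Everything in it is hypothetical: MCP defines no `faq` call today, and the registry, the `register_faq` helper, and the canned Expensify handler are names I made up for illustration. The handler stands in for whatever the tool's host would actually run (most likely an LLM).

```python
from typing import Callable, Dict, Optional

# Hypothetical: a handler takes (question, context) and returns a short answer.
FaqHandler = Callable[[str, dict], str]

# Hypothetical broker-side registry, keyed by "App:tool_name".
FAQ_HANDLERS: Dict[str, FaqHandler] = {}


def register_faq(tool_id: str, handler: FaqHandler) -> None:
    """Let a tool provider register the handler that answers its FAQs."""
    FAQ_HANDLERS[tool_id] = handler


def faq(tool_id: str, question: str, context: Optional[dict] = None) -> str:
    """Route a question to the host of a tool; fall back if none is registered."""
    handler = FAQ_HANDLERS.get(tool_id)
    if handler is None:
        return "I don't know."
    return handler(question, context or {})


def expensify_add_receipt_faq(question: str, context: dict) -> str:
    # Stand-in for an LLM call: a canned answer keyed on the question text.
    if "find receipts" in question.lower():
        return "Receipts are usually found as photos or email attachments."
    return "I don't know."


register_faq("Expensify:add_receipt", expensify_add_receipt_faq)

print(faq("Expensify:add_receipt", "Where do I normally find receipts?", {"context": "iOS"}))
```

The agent only needs the tool identifier it already discovered; the provider decides how the answer gets produced.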

Is this over-design?

You might push back with the following observations:

The LLM should be expected to figure this out on its own.
The add_receipt tool definition can just include this information in its docstring.
The add_receipt function signature should make this information obvious.

I'm not convinced. At the scale of an operating-system-level agent, there are too many tools and user contexts for this to be tractable.

As human developers, we don't rely on sparse function docstrings to do our work. We rely on a rich ecosystem of tutorials, FAQs, videos, and how-tos to supplement those docstrings.

Adding a FAQ endpoint for each tool doesn't solve the problem outright, but it gives developers a way to participate in helping the agent be its best.

Seriously? Each app hosts an LLM endpoint?

Why not?

A world with agents that call third-party tools is already a world of AIs talking to other AIs. We're artificially limiting ourselves if we don't fully leverage the assumption of LLM availability.

It could be something as simple as a web endpoint, or mobile app OS hook, that just passes through the request with a helpful prompt:

mcp/faqs/add_receipt.txt

You are an LLM answering frequently asked questions about how to use a receipt logging API.
Answer the question concisely based on the information below. If the information is not below, say: "I don't know."
The tool you are answering questions about is called:

`add_receipt(vendor, description, category, amount, date, attachment)` from Expensify

Here are some common pieces of information agents may find helpful:

- On iOS and Android, receipts are usually stored in photos or in emails (as attachments).
- To determine if a photo or attachment is a receipt, use a multimodal LLM to ask if it is a receipt.
- To determine when the user was on a business trip, you can look for flights on their calendar or ask them.
- Receipts for lodging, food, and travel are almost always valid business expenses, but others may not be, so you should ask.
- To turn a receipt photo into data for a receipt, you can simply ask the LLM if it has image understanding. Otherwise you can attempt to use an OCR tool to turn the image to text if available.

User Question: {question}
User Context: {context}
Your Concise Answer:
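
The pass-through itself could be just a few lines. The Python sketch below fills an abbreviated copy of the prompt above and hands it to an LLM. The `llm` argument is an assumed callable (prompt in, text out), not any real client library, and `fake_llm` is a stand-in so the example runs offline.

```python
from typing import Callable

# Abbreviated copy of the prompt file above. The {question} and {context}
# placeholders match the original; the list of facts is trimmed to one line.
FAQ_PROMPT = """You are an LLM answering frequently asked questions about how to use a receipt logging API.
Answer the question concisely based on the information below. If the information is not below, say: "I don't know."

- On iOS and Android, receipts are usually stored in photos or emails (as attachments).

User Question: {question}
User Context: {context}
Your Concise Answer:"""


def build_faq_prompt(question: str, context: str) -> str:
    """Fill the template with the agent's question and context."""
    return FAQ_PROMPT.format(question=question, context=context)


def answer_faq(question: str, context: str, llm: Callable[[str], str]) -> str:
    """Pass the filled prompt through to whatever LLM the host has available."""
    return llm(build_faq_prompt(question, context)).strip()


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model so the sketch runs offline.
    return " Receipts are usually stored as photos or email attachments. "


print(answer_faq("How do I find receipts?", "iOS", fake_llm))
```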

The thought chain for our agent.

And here's the pseudocode (pseudothought?) rollout of the thought chain I would hope unfolds.

Agent: How do I find receipts?
Tool FAQ: Receipts are usually stored as photos or email attachments.
Agent: How do I determine if a photo or email attachment is a receipt?
Tool FAQ: Use a multimodal LLM to ask if it is a receipt.

Agent: Invoke get_photos("3/1/2025", "3/3/2025")
Tool: {"photos": [url, url, url]}

(agent begins iterating)

Agent->LLM: Is <url> a receipt?
LLM: Yes
Agent: Please get the data {vendor, description, category, amount, date} from this receipt <url>
LLM: {"vendor": "Bills", "description": "Lunch", "category": "Dining", "amount": 15.99, "date": "3/1/2025"}
Agent: Invoke add_receipt("Bills", "Lunch", "Dining", 15.99, "3/1/2025", url)
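
The iteration at the end of that rollout can be sketched as a plain loop. This is a rough Python sketch under stated assumptions: the tools and the two LLM calls are passed in as callables, and the `stub_*` functions are invented stand-ins that return fixed data, not real Photos, Expensify, or model APIs.

```python
from typing import Callable, Dict, List, Tuple


def file_receipts_from_photos(
    start_date: str,
    end_date: str,
    get_photos: Callable[[str, str], Dict[str, List[str]]],
    is_receipt: Callable[[str], bool],
    extract_fields: Callable[[str], dict],
    add_receipt: Callable[..., None],
) -> int:
    """Fetch photos, keep the ones that are receipts, and file each one."""
    filed = 0
    for url in get_photos(start_date, end_date)["photos"]:
        if not is_receipt(url):  # Agent->LLM: "Is <url> a receipt?"
            continue
        f = extract_fields(url)  # Agent->LLM: "Please get the data ..."
        add_receipt(f["vendor"], f["description"], f["category"],
                    f["amount"], f["date"], url)
        filed += 1
    return filed


# --- Invented stubs so the sketch runs without any real tools or models ---
filed_calls: List[Tuple] = []


def stub_get_photos(start_date: str, end_date: str) -> Dict[str, List[str]]:
    return {"photos": ["photo-1", "photo-2", "photo-3"]}


def stub_is_receipt(url: str) -> bool:
    return url != "photo-2"  # pretend the second photo is not a receipt


def stub_extract_fields(url: str) -> dict:
    return {"vendor": "Bills", "description": "Lunch", "category": "Dining",
            "amount": 15.99, "date": "3/1/2025"}


def stub_add_receipt(vendor, description, category, amount, date, attachment) -> None:
    filed_calls.append((vendor, description, category, amount, date, attachment))


count = file_receipts_from_photos("3/1/2025", "3/3/2025", stub_get_photos,
                                  stub_is_receipt, stub_extract_fields,
                                  stub_add_receipt)
print(count, "receipts filed")
```

A fuller version would repeat the same loop over email attachments, since the FAQ answer pointed at both sources.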

Developers lending agents a helping hand

OS-level agents are going to be really hard to get right. The space of possibilities is enormous, and that means the opportunities for failure are enormous too.

Adding a "FAQ" endpoint for tool providers feels like a simple, LLM-native step that can help developers offer a helping hand to agents as they navigate workflow planning.