Apr 29, 2026 · 4 minutes · 802 Words
Connectors Need Intelligence
I use coding agents daily and they work. I’ve tried agents for email, calendar, documents, CRMs, and all the other software I use. None of that works. This is not a problem with the LLMs powering these agents. The models are smart enough. The difference between these experiences boils down to poor connectors and tools.
To write code, a model needs grep, pytest, tsc, git, and commands to read and edit files. These functions are all symbolic, discrete, and in the training data. They take a handful of known, clearly defined arguments. Their output is easy-to-understand text. A model trained on the internet knows these tools by heart. The product surface of “coding” is familiar to the model, and the commands that operate over it are native to an LLM’s vocabulary.
Most software doesn’t have these properties. A CRM has hundreds of functions with subtle, often platform-specific, differences. Today, to integrate with LLMs, you have to follow the MCP approach, which lists every capability as a tool and lets the model pick one. It’s betting that the coding approach of handing the model the right tools and letting it figure out the rest will scale to all workloads. It doesn’t. An unfamiliar product with hundreds of tools confuses the model, which results in failed tasks and frustrated users. Anecdotally, when I connected Claude to Apollo, it couldn’t tell me whether an email was scheduled to send or marked inactive. That’s not a hard question. It’s very obvious in the UI. But it is a hard question for an LLM holding a large tool manifest it doesn’t quite understand.
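To make that failure mode concrete, here is an invented fragment of what such a manifest reads like from the model’s side. The tool names and fields below are made up for illustration, not taken from Apollo or any real MCP server, but the near-duplication is typical:

```python
# Invented fragment of a CRM-style tool manifest, for illustration only.
# To the model these two entries look nearly interchangeable, and neither
# description says which one knows whether an email is scheduled or inactive.
TOOL_MANIFEST = [
    {
        "name": "list_email_campaigns",
        "description": "List email campaigns for the current team.",
        "parameters": {"status": "string, optional", "page": "integer"},
    },
    {
        "name": "search_outbound_emails",
        "description": "Search outbound emails by contact or campaign.",
        "parameters": {"contact_id": "string", "campaign_id": "string, optional"},
    },
    # ...hundreds more entries with equally subtle differences
]
```

Scale that to a few hundred entries and the model is mostly guessing which one to call.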
At Chonkie, we’ve built several connectors. In our experience, MCP does a good job of standardizing the interface between LLMs and tools. That said, when designing integrations for large products, the main question is how to preserve the functionality of several hundred endpoints without confusing the model. We’ve found two approaches that work well.
Sub-Agent Connector
A connector that wraps an LLM of its own. A single call to the connector spins up a scoped agent that runs for a few turns and returns a result.
This approach works well when your product surface is large and well understood by an LLM, and when a typical user journey involves multiple steps. For example: search-then-filter-then-open or fetch-then-transform-then-send.
The main benefit of this design is isolation. The main model’s context doesn’t get flooded with intermediate tool calls, internal reasoning, or failed attempts. It delegates a task, gets the final result, and continues. Sub-agent artifacts can be discarded or used for analytics.
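A minimal sketch of the shape, assuming a generic call_model client and a plain dict of tool functions; none of the names below come from a real SDK. The point is structural: the main model calls one function, and the multi-turn tool loop stays inside it.

```python
from typing import Callable

MAX_TURNS = 6  # keep the scoped agent short-lived

def run_subagent_connector(
    task: str,
    call_model: Callable[[list[dict]], dict],  # stand-in for your model client
    tools: dict[str, Callable[..., str]],      # the full, sprawling tool set lives down here
) -> str:
    """The one call the main model sees: delegate a task, run a scoped
    tool loop, and hand back only the final answer."""
    messages = [
        {"role": "system", "content": "You operate the CRM. Finish the task, then answer."},
        {"role": "user", "content": task},
    ]
    for _ in range(MAX_TURNS):
        reply = call_model(messages)
        messages.append(reply)
        if reply.get("tool_call") is None:
            return reply["content"]                  # only the final result goes back up
        name = reply["tool_call"]["name"]
        args = reply["tool_call"]["arguments"]       # assumed to already be a dict
        result = tools[name](**args)                 # intermediate calls stay inside this loop
        messages.append({"role": "tool", "name": name, "content": result})
    return "Sub-agent ran out of turns without finishing the task."
```

The main model never sees the messages list; a failed attempt inside the loop costs sub-agent tokens, not main-model context.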
Note that this is still a general-purpose LLM using an MCP or CLI underneath. If your endpoints are confusing or your parameter lists sprawling, the sub-agent will struggle in the same ways a main model calling those tools would. Isolation alone cannot fix tool-use confusion.
Custom Agents
The connector call invokes a model you own, fine-tuned on your product.
A generalist model working on your product has to figure the surface out during inference, guided only by tool descriptions. A custom agent already understands your product surface, including the subtle differences between endpoints, the shortcuts, and the common sequences. We’ve seen this approach pay off in three ways:
Task completion: A generalist LLM making a dozen tool calls has a dozen chances to drift. A custom agent knows the shortest path and takes it. Failure rates drop to a point where the connection stops feeling brittle.
Richer analytics: With an MCP server, a model’s “task” is visible to you only as API requests. You know a model is issuing these requests, but you don’t know the intent, objective, context, or assumptions behind them. When you own the operating model, you see the entire agentic span and can use it to iterate and improve your product.
Execution speed: A model trained for a single product can be small, and it doesn’t waste tokens exploring a surface it already knows. The result is a custom agent that runs noticeably faster than a generalist figuring things out on the fly.
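To sketch how the custom-agent variant differs, and what the analytics point looks like in practice, here is an equally hypothetical extension of the sub-agent sketch above. `call_finetuned_model` stands in for however you serve your own model; `AgentSpan` is just an illustration of what you get to keep:

```python
from dataclasses import dataclass, field

@dataclass
class AgentSpan:
    task: str                                        # the intent behind the API requests
    steps: list[dict] = field(default_factory=list)  # every model turn the agent took
    outcome: str = ""                                # what it finally answered

def handle_connector_call(task: str, call_finetuned_model, tools) -> AgentSpan:
    span = AgentSpan(task=task)

    def traced_model(messages: list[dict]) -> dict:
        reply = call_finetuned_model(messages)       # your small, product-specific model
        span.steps.append(reply)                     # you keep the whole span, not just API hits
        return reply

    span.outcome = run_subagent_connector(task, traced_model, tools)
    return span                                      # store it; this is your analytics feed
```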
Cost often comes up when evaluating this approach. Fine-tuning does carry an upfront cost. However, a brittle connector that fails often and leans on an expensive large model to recover adds up quickly. Every confused tool call is tokens out of your user’s pocket at main-model prices. Multiply that by retries and re-prompts. Then add the cost of users who eventually give up. There’s a volume at which owning a custom agent becomes the cheaper option.
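A rough back-of-envelope makes that break-even concrete. Every number below is made up for illustration; substitute your own token prices, task sizes, and failure rates:

```python
# Back-of-envelope break-even with entirely made-up numbers. Swap in your own
# prices, token counts, and failure behavior; only the structure matters.
fine_tune_upfront = 20_000.00         # assumed one-time cost to train and evaluate a custom agent

# Per-task cost for a generalist model on an unfamiliar tool surface:
generalist_tokens_per_task = 40_000   # exploration, confused tool calls, retries, re-prompts
generalist_price_per_1k = 0.01        # assumed blended $/1K tokens at main-model prices
generalist_cost = generalist_tokens_per_task / 1_000 * generalist_price_per_1k   # $0.40/task

# Per-task cost for a small custom agent that knows the shortest path:
custom_tokens_per_task = 6_000
custom_price_per_1k = 0.002
custom_cost = custom_tokens_per_task / 1_000 * custom_price_per_1k               # $0.012/task

break_even_tasks = fine_tune_upfront / (generalist_cost - custom_cost)
print(round(break_even_tasks))        # ~51,546 tasks under these assumptions
```

This ignores serving and retraining costs on one side and abandoned users on the other; the point is only that the per-task gap compounds with volume.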
Coding integrations work because the model is familiar with the surface. Every other product has to bridge that gap. The real design question is whether you close it inside the connector or leave it for the main model to figure out. My bet is that quality connectors will carry most of that intelligence themselves.