What Agents Are Good At (and What They Are Not)
The fastest way to waste money on AI agents is to deploy them where they do not belong. The second fastest is to ignore them where they could transform your operations. This chapter helps you tell the difference.
Where Agents Excel
Research and synthesis. Agents can search multiple sources, extract relevant information, cross-reference findings, and produce structured summaries. A human researcher might spend hours on what an agent does in minutes — not because the agent is smarter, but because it does not get tired, distracted, or bored.
Repetitive multi-step workflows. Any process where a human follows the same series of steps repeatedly is a candidate for an agent. Processing invoices, triaging support tickets, updating CRM records, generating reports from templates.
Code generation and debugging. Coding agents are among the most mature agent applications. They can write code, run tests, interpret errors, fix bugs, and iterate — often completing in minutes what would take a developer an hour.
Data transformation and analysis. Moving data between formats, cleaning datasets, running calculations, generating visualizations. Agents handle the tedium while humans focus on interpreting results.
Customer interactions at scale. Support agents that resolve routine issues, onboarding assistants that guide new users, sales agents that qualify leads. The key is "routine" — agents handle the predictable cases while humans handle the exceptions.
Where Agents Struggle
Novel reasoning. Despite appearances, agents do not reason from first principles. They pattern-match against training data. When they encounter genuinely novel problems — situations with no precedent in their training — they fail in ways that look confident but are fundamentally wrong.
Emotional intelligence. Agents can simulate empathy convincingly, but they do not feel it. For interactions where genuine human connection matters — crisis support, sensitive negotiations, grief counseling — an agent is not a substitute.
Physical world interaction. Today's agents operate in digital environments. They cannot inspect a product, visit a factory, or read body language in a meeting. Robotics is advancing, but the bridge between digital agents and physical action remains narrow.
Long-horizon planning. Agents work well over tens of steps. Over hundreds or thousands of steps — planning a six-month project, managing a complex supply chain — they lose coherence. The accumulation of small errors becomes catastrophic.
Accountability. When an agent makes a mistake, who is responsible? The developer? The company that deployed it? The user who trusted it? This is not a technical limitation — it is a structural one that affects every deployment decision.
The Augmentation Sweet Spot
The most successful agent deployments do not replace humans. They augment them. The agent handles the first 80% of a task — the research, the drafting, the data gathering — and a human handles the final 20% that requires judgment, creativity, or accountability.
This is not a compromise. It is the optimal architecture for most real-world applications today. The agent is fast and tireless. The human is wise and accountable. Together, they outperform either one alone.
A Simple Evaluation Framework
When considering an agent for a task, ask:
1. Is the task well-defined? Agents need clear success criteria.
2. Is the cost of failure low? If mistakes are expensive, add human oversight.
3. Is the task repetitive? Agents shine on volume.
4. Does the task require judgment? Keep humans in the loop for subjective decisions.
5. Can you measure success? If you cannot tell whether the agent is performing well, you cannot deploy it safely.
If you answer yes to questions 1, 3, and 5, the task is likely a strong candidate for an agent. A yes to question 4, or a no to question 2, signals that a human should stay in the loop.
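The framework above can be sketched as a small decision function. This is an illustrative sketch, not a prescribed API: the `TaskProfile` fields, the `evaluate` function, and the three-way outcome labels are all names invented here to make the logic concrete.

```python
from dataclasses import dataclass

@dataclass
class TaskProfile:
    well_defined: bool       # 1. Clear success criteria?
    low_failure_cost: bool   # 2. Are mistakes cheap to catch and fix?
    repetitive: bool         # 3. High volume of similar cases?
    needs_judgment: bool     # 4. Subjective decisions involved?
    measurable: bool         # 5. Can you tell whether it is working?

def evaluate(task: TaskProfile) -> str:
    # Questions 1, 3, and 5 gate whether an agent fits at all.
    if task.well_defined and task.repetitive and task.measurable:
        # Questions 2 and 4 decide how much human oversight to keep.
        if task.needs_judgment or not task.low_failure_cost:
            return "agent with human oversight"
        return "strong candidate for an autonomous agent"
    return "keep with humans for now"

# Example: routine invoice processing is well-defined, cheap to retry,
# repetitive, objective, and measurable.
invoices = TaskProfile(True, True, True, False, True)
print(evaluate(invoices))  # strong candidate for an autonomous agent
```

The point of encoding it this way is that the first three questions act as a hard gate, while questions 2 and 4 only shift the task between full autonomy and human review, mirroring the augmentation sweet spot described above.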