
AI Agents: We Need Less Hype and More Reliability

Adrian Krebs, Co-Founder & CEO of Kadoa

2025 is supposed to be the year of agents, according to the big tech players. Better and cheaper models, more powerful tools (MCP, memory, RAG, etc.), and 10x faster inference are making agents more capable and more affordable. But what most customers struggle with isn't capabilities; it's reliability.

Less Hype, More Reliability

Most customers don't need complex AI systems. They need simple and reliable automation workflows with clear ROI. The "book a flight" agent demos are very far away from this reality. Reliability, transparency, and compliance are top criteria when firms are evaluating AI solutions.

Here are a few "non-fancy" AI agent use cases from our customers that automate tasks and execute them in a highly accurate and reliable way:

  1. Web monitoring: A leading market maker built their own in-house web monitoring tool, but realized they didn't have the expertise to operate it at scale.
  2. Web scraping: A hedge fund with hundreds of web scrapers was struggling to keep up with maintenance and couldn't scale. Their data engineers were overwhelmed by a long backlog of PM requests.
  3. Company filings: A large quant fund relied on content experts to manually extract commodity data from company filings with complex tables, charts, etc.

These are all relatively unexciting use cases that I automated with AI agents, and it's exactly this kind of unglamorous work where AI adds the most value.

Agents won't eliminate our jobs, but they will automate tedious, repetitive work such as web scraping, form filling, and data entry.

Buy vs Make

Many of our customers tried to build their own AI agents, but often struggled to reach the desired reliability. The top reasons these in-house initiatives fail:

  • Building the agent is only 30% of the battle. Deployment, maintenance, and data quality/reliability are the hardest parts.
  • The problem shifts from "can we pull the text from this document?" to "how do we teach an LLM to extract the data, validate the output, and deploy it into production with confidence?"
  • Getting > 95% accuracy in complex real-world use cases requires state-of-the-art LLMs, but also:
    • Orchestration (parsing, classification, extraction, and splitting)
    • Tooling that lets non-technical domain experts quickly iterate, review results, and improve accuracy
    • Comprehensive automated data quality checks (e.g. with regex and LLM-as-a-judge; a minimal sketch follows below)
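
To make that last point concrete, here is a minimal Python sketch of such an automated quality check: cheap, deterministic regex validation first, then an LLM-as-a-judge pass for records that already look structurally sound. The field names, regex patterns, prompt, and model name are illustrative assumptions, not Kadoa's actual pipeline; the only external dependency is the standard OpenAI Python client.

# Two-stage data quality check for extracted records:
#   1) deterministic regex checks for well-defined fields
#   2) an LLM-as-a-judge pass for fuzzier criteria (faithfulness to the source)
# All field names, patterns, and prompts below are illustrative assumptions.
import json
import re

from openai import OpenAI  # assumes OPENAI_API_KEY is set in the environment

client = OpenAI()

# Stage 1: cheap, deterministic checks catch obvious extraction errors.
FIELD_PATTERNS = {
    "filing_date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),       # ISO date
    "commodity":   re.compile(r"^[A-Za-z][A-Za-z &-]{1,50}$"),
    "volume_mt":   re.compile(r"^\d+(\.\d+)?$"),              # metric tons as a number
}

def regex_check(record: dict) -> list[str]:
    """Return a list of field-level format violations for one extracted record."""
    errors = []
    for field, pattern in FIELD_PATTERNS.items():
        value = str(record.get(field, ""))
        if not pattern.fullmatch(value):
            errors.append(f"{field}: '{value}' does not match the expected format")
    return errors

# Stage 2: LLM-as-a-judge for checks that are hard to express as rules,
# e.g. "does this record faithfully reflect the source excerpt?"
def llm_judge(record: dict, source_snippet: str) -> dict:
    prompt = (
        "You are reviewing data extracted from a company filing.\n"
        f"Source excerpt:\n{source_snippet}\n\n"
        f"Extracted record:\n{json.dumps(record)}\n\n"
        "Does the record faithfully reflect the excerpt? "
        'Reply with JSON: {"verdict": "pass" or "fail", "reason": "..."}'
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model choice
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

def validate(record: dict, source_snippet: str) -> bool:
    """Run both stages; only records passing the regex stage reach the LLM judge."""
    errors = regex_check(record)
    if errors:
        print("regex failures:", errors)
        return False
    verdict = llm_judge(record, source_snippet)
    if verdict.get("verdict") != "pass":
        print("judge failure:", verdict.get("reason"))
        return False
    return True

Running the deterministic stage first keeps the LLM judge, the expensive part, reserved for records that are already well-formed, which is one way to keep review costs predictable as volume grows.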

Outlook

Data is the competitive edge of many financial services firms, and putting it to work has traditionally been limited by the capacity of their data scientists. This is changing: data and research teams can now do a lot more with a lot less by using AI agents across the entire data stack. Automating well-constrained tasks with highly reliable agents is where we are today.

But we should not narrowly see AI agents as replacing work that already gets done. Most AI agents will be used to automate tasks and research that humans or rule-based systems never got around to doing before because it was too expensive or time-consuming.