Wednesday, July 17, 2024

AI is not going to implement itself, but governments can help

The AI hype has passed, and the overexcited futurists' voices are mercifully fading away. We're now entering a practical era where AI is leveraged to boost productivity in businesses, non-profits, and public organizations. This shift brings a sobering realization: AI integration requires a meticulous, pragmatic approach to build reliable and trustworthy systems. It's a lot of work and requires some strategy.

When a single person manages a well-defined workflow, integrating AI is relatively straightforward. It's easy to incorporate AI tools like ChatGPT or Claude to assist with ad copy, reports, or applications. The beauty of these scenarios lies in their simplicity - the user acts as both operator and quality controller, immediately judging the output's effectiveness.

However, the story changes dramatically when we shift to multi-user workflows or more complex processes, where both inputs and outputs are more of a collective responsibility. I recently spoke with an Accounts Payable team that posed a challenging question: "Yes, we can see that AI can help review travel claims, but can you guarantee it's going to be 100% accurate?" I couldn't provide that guarantee; I don't have time to conduct a hundred tests, and I don't even have access to a hundred travel reports. They emphasized their need for completely audit-proof outcomes. This conversation highlighted the trust issues that arise when AI moves from the hands of enthusiasts to those of skeptics in larger organizations. And organizations should have a healthy group of skeptics to remain viable.

I've also recently been a fly on the wall during discussions between healthcare executives and a U.S. lawmaker. The executives explained that each AI-assisted medical procedure needs validation, which is expensive and often duplicated across multiple hospital systems. This challenge extends beyond healthcare. For instance, when any organization uses AI to crunch data, it needs to understand how reliably the AI analyzes large datasets, cleans them, and handles outliers.

The problem is that no private institution can conduct the kind of comprehensive testing and validation needed to establish trust in AI systems across various industries. We cannot seriously trust the claims of startups that are trying to sell a specialized product to an industry or a government organization. It's also not clear how a hypothetical private validation service would monetize such an endeavor.

This is where I believe government involvement becomes crucial. Instead of obsessing over deepfakes and ethics, governments should be doing this kind of validation work. They can collaborate with industry experts to develop standardized benchmarks for AI reliability and performance. They could establish certification programs that act as quality marks, assuring users that AI systems have undergone rigorous testing. Moreover, government funding could support businesses, NGOs, and government agencies in conducting extensive AI testing, especially benefiting smaller organizations lacking the necessary resources.

In my view, public-private partnerships are key to navigating these challenges. By leveraging expertise from both sectors, we can develop robust testing frameworks and create dependable AI systems. This approach would pave the way for more efficient and innovative workflows across industries, ensuring that the benefits of AI are realized while maintaining trust and reliability. 
