AI as a Tool or AI as a Product?

The gap between $20 ChatGPT and six-figure AI vendors lies in integration, operational complexity, and repeatability: problems personal tools can't address.


Someone on your team just demoed how ChatGPT or Copilot can extract data from a medical report in seconds. Now leadership wants to know why you're paying a vendor six figures for document processing when the same AI is available for $20 a month.

It's a reasonable question. Personal AI tools and operational AI systems solve fundamentally different problems, even when the underlying technology looks identical.

When an adjuster pastes a claim's medical records into ChatGPT to prep for a call, that's a one-time task. They provide context, review outputs, fix what needs fixing. The AI just makes them faster at work they were already doing.

When AI processes hundreds of documents a day as part of an operational workflow, a person isn't shepherding each one through. The AI becomes one component inside a larger system that needs to work reliably at scale.

The problem is these two use cases get treated as interchangeable. Organizations sometimes spend months building infrastructure for tasks an employee could handle with ChatGPT. Other times, someone proposes using ChatGPT for a process that actually requires serious engineering. Both mistakes come from not recognizing when a task requires a product. The telltale signs are integration, operational complexity, and repeatability.

Integration

ChatGPT works with whatever you paste into it. It can't reach into your claims system, query your policy database, or update a file on its own. In production, the AI sits in the middle of a pipeline: data has to get in, and data has to get out.

On the input side, documents arrive through fax servers, email integrations, and carrier portals. On the output side, extracted data has to be written to the system of record and validated against what's already in the claim file, and exceptions need to be flagged and routed. Without the integration work on either side of the AI, the system can't function in production.
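To make that concrete, here's a minimal sketch of what sits around the model call. Every system and function name below is hypothetical; the point is the shape of the pipeline, not any particular vendor's API.

```python
from dataclasses import dataclass

@dataclass
class ExtractedField:
    name: str   # e.g., "date_of_injury"
    value: str

def normalize(doc_bytes: bytes) -> str:
    """Input side: OCR a fax, strip an email wrapper, convert a PDF to text."""
    raise NotImplementedError  # stands in for real ingestion work

def extract_fields(text: str) -> list[ExtractedField]:
    """The AI itself: a single call in the middle of the pipeline."""
    raise NotImplementedError  # stands in for the model call

def process_document(doc_bytes: bytes, claim_id: str, claims, exceptions) -> None:
    text = normalize(doc_bytes)
    fields = extract_fields(text)

    # Output side: validate against what's already in the claim file,
    # write what agrees, and route what doesn't to a human.
    claim = claims.get(claim_id)
    conflicts = [f for f in fields if claim.get(f.name) not in (None, f.value)]
    if conflicts:
        exceptions.flag(claim_id, conflicts)
    else:
        claims.update(claim_id, fields)
```

Notice how little of the sketch is the model call itself. The input and output sides are exactly the integration work that has to exist whether or not the AI is any good.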

Operational Complexity

When someone uses ChatGPT to extract data from a document, they're the orchestration layer. They decide what to paste in, what to ask for, what order to work through it. If something looks wrong, they adjust and try again. That works when you're processing one document at a time.

At scale, software has to do that job instead. Documents need to be normalized into a usable format. Illegible or corrupted files need to be handled. Outputs need to be validated, exceptions routed, and results connected to downstream systems. When something fails halfway through, the system needs to know where it stopped, what succeeded, and how to recover.
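Here's a sketch of what that orchestration layer can look like, assuming a simple status store that records progress per document. All names are illustrative, not a real framework.

```python
STAGES = ["normalize", "extract", "validate", "write_back"]

def run_batch(doc_ids, stages, store):
    """Process a batch so that a crash halfway through is recoverable."""
    for doc_id in doc_ids:
        completed = store.completed_stages(doc_id)  # resume point after a failure
        for name in STAGES:
            if name in completed:
                continue                 # already succeeded; don't redo it on retry
            try:
                stages[name](doc_id)     # each stage is a plain function
                store.mark_done(doc_id, name)
            except Exception as err:
                # Record where it stopped, route the document for review,
                # and keep going with the rest of the batch.
                store.mark_failed(doc_id, name, str(err))
                break
```

The person pasting into ChatGPT does all of this implicitly. At hundreds of documents a day, the status store and the retry logic are the product.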

There's also the question of proof. When a claim gets litigated or an auditor asks how a decision was made, "the AI said so" isn't an answer. You need to show exactly where in the document a value came from and why it was interpreted that way. Personal AI tools are black boxes. Enterprise systems build in traceability because insurance requires it.
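One common way to build that traceability in is to make every extracted value carry its own evidence. The shape below is an illustration, not a standard:

```python
from dataclasses import dataclass

@dataclass
class TracedValue:
    field: str           # e.g., "diagnosis_code"
    value: str           # e.g., "S83.511A"
    page: int            # where in the document the value was found
    source_text: str     # the exact passage it was read from
    prompt_version: str  # which prompt produced it
    model_version: str   # which model produced it
```

When an auditor asks how a value was determined, the answer is a page number and a passage, not "the AI said so."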

Finally, there's file size. A single claim file can run 10,000 pages. You can't paste that into ChatGPT. Personal AI tools have input limits that make documents like these impossible to process in a single pass. At that point, you're not using a tool. You need a product.
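What a product does instead, roughly, is chunk the file and stitch the results back together. A sketch, with an invented page-based limit standing in for real token limits:

```python
CHUNK_PAGES = 200  # illustrative; real context limits are measured in tokens

def process_claim_file(pages, extract_from_chunk, merge):
    """Run a file too large for one pass through the model chunk by chunk."""
    results = []
    for start in range(0, len(pages), CHUNK_PAGES):
        chunk = pages[start:start + CHUNK_PAGES]
        # page_offset keeps any citations pointing at pages of the
        # original file, not pages of the chunk
        results.extend(extract_from_chunk(chunk, page_offset=start))
    return merge(results)  # reconciling overlaps and duplicates is product work
```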

Repeatability

Personal AI tools are inherently variable. Ask ChatGPT the same question twice and you'll get different answers. When drafting a strategy document, this can actually help. Running the same prompt multiple times gives you different angles to choose from.

At operational scale, variability becomes a liability. A diagnosis code extracted one way in the morning might come out differently in the afternoon. Tags get applied inconsistently. Provider names normalize differently across batches. A claim that gets flagged high-priority on Monday might score as routine on Tuesday. These inconsistencies create problems throughout downstream processes.

When outputs are unreliable, users lose trust. They start checking everything manually, which defeats the purpose. Enterprise implementations address this through standardization: controlled prompts, validated extraction logic, versioned models, systematic testing. When something changes, you know what changed and why. When something breaks, you can trace it back. This infrastructure is what makes production systems require real investment, but it's also what makes them suitable for production.
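In practice, that standardization often means pinning everything that can vary. A sketch, assuming a generic client whose parameter names mirror common LLM APIs but which is itself hypothetical:

```python
PROMPT_VERSION = "extract-diagnosis-v12"  # prompts are versioned artifacts
MODEL_VERSION = "model-2024-06-01"        # pinned, never "latest"

def extract(text, client, load_prompt):
    return client.complete(
        model=MODEL_VERSION,
        prompt=load_prompt(PROMPT_VERSION, text),
        temperature=0,  # minimize run-to-run variation
        seed=42,        # where the provider supports best-effort determinism
    )

# Systematic testing: a fixed "golden set" of documents runs against every
# prompt or model change, so you know what changed and why.
def check_golden_set(golden_cases, client, load_prompt):
    for doc_text, expected in golden_cases:
        assert extract(doc_text, client, load_prompt) == expected
```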

When You Don't Need a Product

But the opposite mistake is also common. Not everything needs a product. If the task is infrequent and doesn't need to connect to anything, a person with ChatGPT can be the right answer.

A couple of times a year, a supervisor prepares for a mediation on a complex claim. They need to review the medical records, understand the treatment history, and build their argument. Someone sees that and thinks: we should build a tool for this. But ChatGPT can help them work through it directly, surfacing key details and summarizing sections. That's a tool making someone better at their job, not a workflow that needs to be automated.

The same applies outside of claims. Quarterly management presentations. Strategy preparation for a renewal. Evaluating a vendor. One-off policy research. These happen a few times a year, the output goes into a document or slide deck, and nobody needs the data anywhere else. Building automation around them solves a problem that doesn't exist.

The person doing the work already has what they need. They have the data, they understand the context, they'll review and edit whatever the AI produces. The value of personal AI tools is that they require no infrastructure. Let people use the tools directly, get useful output, and move on. Trying to systematize that just adds overhead without adding value.

Different Problems, Different Approaches

Personal productivity works because a human handles everything around the AI. They provide context, review outputs, catch errors, make decisions. For these use cases, give people access to AI tools and get out of their way.

Operational automation requires software to do what the human does in the personal productivity scenario. Integration with existing systems. Repeatable outputs. An application layer that makes the AI's work usable by others. That's a product.

The underlying AI might be identical in both cases. The difference is what surrounds it. If the task requires integration with other systems, that's a product problem. If it requires orchestration across large or complex inputs, that's a product problem. If it requires consistent, auditable outputs, that's a product problem. The more of these that apply, the further you are from something a person with ChatGPT can handle. If none of them apply, you probably don't need a product at all. Give someone the tool and let them work.

Document ingestion is one example, but the same pattern holds for triage, fraud detection, subrogation: anywhere AI moves from assisting one person to running inside a workflow. The question isn't whether AI can do the task. It's whether you need a tool or a product. Get that wrong and you'll spend six months discovering why the vendor charges what they do, or build a system for something that just needed a person with ChatGPT.


Tycho Speekenbrink

Tycho Speekenbrink is head of AI at Gain Life. His career, spanning Europe, Asia and the U.S., has encompassed roles at both insurance carriers and solution providers. He is a licensed actuary.
