The data extraction dividend: Making sense of complex data in insurance

Yashodeep Sengupta

Insurance business processes have a lot going on in the background, and most of them involve complex, unstructured documents. Nearly 80% of enterprise business data is unstructured. For insurers, this unstructured data is a treasure trove of information and insights that can help them make profitable and timely decisions and surge ahead of the competition.

However, complex documents are, well, complex in nature: they can contain multiple tables or images, they can be handwritten, and they might not be template-friendly. Think handwritten invoices, complicated Excel workbooks with multiple sheets and correlations, unstructured policies and claim forms, and other documents with mixed formatting, varied fonts or missing items. Such documents fall outside the scope of technologies like optical character recognition (OCR), which are adept at handling simple, structured documents. As a result, the complexity of the data makes manual intervention a necessity, however undesirable. In short, complex data acts as a barrier to seamless automation and can hurt operational efficiency, costs, accuracy and turnaround times (TATs).

If the documents that need processing are complex, the technology handling them can’t be too simple under the hood. So, welcome AI. Artificial intelligence, when combined with other powerful technologies, can be a game-changer across industries. In insurance, intelligent automation technologies such as AI-backed OCR, AI-driven robotic process automation (RPA), intelligent data capture, intelligent data processing and natural language processing (NLP) bring a modern and far more efficient approach to processing and extracting data from unstructured and complex documents.

Let’s take a look at how data extraction from mostly unstructured and complex documents can help in critical areas like claims and underwriting in insurance.

The extraction edge in claims processing

Ideally, claims adjusters look at the images clients provide (of property or auto damage, for example) to determine an optimal payout. They base their payout decisions on many factors, one of the most important being historical records. Since most insurers have only a small percentage of their records digitized, an adjuster might have difficulty accessing records held at distant locations, or might have to spend a lot of time sifting through papers and pictures.

So the first step in any data processing initiative is digitizing paper documents, pictures, etc. for posterity. This is especially important in a long-established industry like insurance. Technologies like OCR and machine vision can ‘read’ papers and handwritten letters and transcribe them into digital text. Next, intelligent data capture identifies and extracts the important sections of each document. NLP then kicks in: it can search through a plethora of digital documents, structured or unstructured, and surface (or extract) a suitable result based on the adjuster’s requirements.
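To make the capture-then-search sequence concrete, here is a minimal sketch in Python. It assumes the OCR step has already produced plain text, and it stands in simple regular expressions and keyword matching for the trained capture and NLP models a real deployment would use. The field labels, the patterns and the names ClaimRecord, capture_fields and search are all hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ClaimRecord:
    claim_id: Optional[str]
    claim_date: Optional[str]
    amount: Optional[float]
    text: str  # full transcribed text, kept for later searching


# Hypothetical label patterns; real claim forms will differ.
CLAIM_ID = re.compile(r"Claim\s*(?:No\.?|Number|ID)\s*[:#]?\s*([A-Z0-9-]+)", re.IGNORECASE)
CLAIM_DATE = re.compile(r"Date\s*(?:of\s*loss)?\s*:\s*(\d{4}-\d{2}-\d{2})", re.IGNORECASE)
AMOUNT = re.compile(r"Amount\s*(?:claimed)?\s*:\s*\$?\s*([\d,]+(?:\.\d{1,2})?)", re.IGNORECASE)


def capture_fields(ocr_text: str) -> ClaimRecord:
    """Pull key fields out of transcribed text; missing fields become None."""
    def first(pattern: "re.Pattern[str]") -> Optional[str]:
        match = pattern.search(ocr_text)
        return match.group(1) if match else None

    raw_amount = first(AMOUNT)
    return ClaimRecord(
        claim_id=first(CLAIM_ID),
        claim_date=first(CLAIM_DATE),
        amount=float(raw_amount.replace(",", "")) if raw_amount else None,
        text=ocr_text,
    )


def search(records: List[ClaimRecord], query: str) -> List[ClaimRecord]:
    """Naive keyword search: keep records whose text contains every query term."""
    terms = query.lower().split()
    return [r for r in records if all(t in r.text.lower() for t in terms)]


# Made-up sample of what an OCR step might return for one claim form.
sample = (
    "Claim No: AUTO-2019-0042\n"
    "Date of loss: 2019-03-14\n"
    "Amount claimed: $4,250.00\n"
    "Rear bumper damage after a low-speed collision."
)
record = capture_fields(sample)
print(record.claim_id, record.claim_date, record.amount)
print(len(search([record], "bumper collision")))
```

Fields that cannot be found come back as None rather than a guess, which is one simple way to flag a document for the manual review that complex forms may still need.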

Digitizing insurance papers and converting the unstructured or semi-structured data into structured data substantially reduces, or even eliminates, the time a claims resource spends sifting through millions of documents to find the right one. For example, historical claim forms for auto damage of a particular amount can be pulled out easily if records are digitized. Similarly, historical images of accidents of a similar nature can be presented, sorted by damage severity or payout, which in turn can serve as a factor in determining the optimal payout for a particular auto insurance claim.
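Once records are structured, the lookup described above becomes a simple filter and sort. The sketch below is illustrative only: the HistoricalClaim fields, the 1-to-5 severity scale and the similar_claims helper are assumptions, not a description of any particular claims system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class HistoricalClaim:
    claim_id: str
    line: str        # line of business, e.g. "auto" or "property"
    severity: int    # assumed scale: 1 (minor) to 5 (total loss)
    payout: float


def similar_claims(claims: List[HistoricalClaim], line: str,
                   min_payout: float, max_payout: float) -> List[HistoricalClaim]:
    """Return claims for one line of business within a payout band,
    most severe first, with higher payouts first among equal severities."""
    matches = [
        c for c in claims
        if c.line == line and min_payout <= c.payout <= max_payout
    ]
    return sorted(matches, key=lambda c: (c.severity, c.payout), reverse=True)


# Made-up records for illustration.
history = [
    HistoricalClaim("A1", "auto", 2, 3000.0),
    HistoricalClaim("A2", "auto", 4, 4800.0),
    HistoricalClaim("P1", "property", 5, 4500.0),
    HistoricalClaim("A3", "auto", 4, 4100.0),
    HistoricalClaim("A4", "auto", 1, 900.0),
]
for claim in similar_claims(history, "auto", 2500, 5000):
    print(claim.claim_id, claim.severity, claim.payout)
```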

Job made easy for underwriters

Underwriters need a variety of information to assess risk and determine the policyholder’s coverage, monthly premium, etc. Apart from non-digitized insurance records, other barriers to seamless access to such information include storage across different clouds, servers or external business units. Data extraction can help underwriters by scanning through records held in various branches, clouds or file-sharing programs. These records can include complex documents like health records, handwritten prescriptions or deeds, tabular data and so on. Cutting-edge tech like AI-based OCR, NLP and unsupervised machine learning (ML) models can cut through the clutter and help underwriters find patterns, derive insights and decide on onboarding, premium plans, etc. much more quickly than they could manually.
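As a rough illustration of pulling scattered records into one view, the sketch below groups already-extracted records from several storage locations by applicant. It assumes each record is a dictionary carrying an applicant_id key; the source names and the consolidate helper are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def consolidate(sources: Dict[str, List[dict]]) -> Dict[str, List[dict]]:
    """Group records from every source by applicant, tagging each record
    with the source it came from so the underwriter can trace it back."""
    merged: Dict[str, List[dict]] = defaultdict(list)
    for source_name, records in sources.items():
        for record in records:
            merged[record["applicant_id"]].append({**record, "source": source_name})
    return dict(merged)


# Made-up records from two hypothetical storage locations.
sources = {
    "branch_archive": [
        {"applicant_id": "P-100", "doc_type": "health_record"},
    ],
    "cloud_store": [
        {"applicant_id": "P-100", "doc_type": "prescription"},
        {"applicant_id": "P-200", "doc_type": "deed"},
    ],
}
by_applicant = consolidate(sources)
for applicant_id, records in by_applicant.items():
    print(applicant_id, [(r["doc_type"], r["source"]) for r in records])
```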

One crucial point worth considering here is the number of full-time equivalents (FTEs) that data extraction ends up saving for an insurance business. As a crude example, where manual underwriting might have involved 8-10 resources for the job, automated data extraction can free up 7-9 of them to take on other critical areas of the business, such as business process improvement initiatives.
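The crude figures above can be turned into a share of the team with a line of arithmetic. The sketch below assumes the low and high ends of each range pair up (7 of 8 and 9 of 10), which the example does not state explicitly; freed_share is a hypothetical helper.

```python
def freed_share(team_size: int, freed: int) -> float:
    """Fraction of a team freed up for other work."""
    if team_size <= 0 or not 0 <= freed <= team_size:
        raise ValueError("freed must be between 0 and a positive team_size")
    return freed / team_size


# Assumed pairing of the article's ranges: 7 of 8 and 9 of 10.
for team_size, freed in [(8, 7), (10, 9)]:
    print(f"{freed} of {team_size} freed: {freed_share(team_size, freed):.1%}")
```

Under that pairing, roughly 88% to 90% of the team's capacity would be released, leaving about one resource on the underwriting job itself.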

Insuring a better future

All this translates into the following overarching benefits for an insurance business:

  • Higher savings due to greater operational efficiency
  • Stronger customer loyalty due to reduced TATs
  • Greater productivity due to optimum utilization of FTEs
  • Access to new markets and a more diversified portfolio through modern offerings like on-demand insurance

Thanks to quick and accurate AI-driven document extraction in underwriting and claims processing, the search process becomes less difficult and more organized. This gives rise to better premiums and fewer claims-related losses for insurers. In short, digitization and data extraction are precursors to making sense of complex data in insurance.

