10/11/2007

Mining for Information Gold in Unstructured Claims Data

By Steve Holcomb

For decades, insurers have been amassing enormous databases of claim information, and the amount of data available to them today is overwhelming. Millions of untapped records, reports and documents lie in insurance data centers: enough information to predict likely claim outcomes. The big question for insurers is, "What exactly do we do with this data?" More important still is, "How do we turn this data into knowledge that can be acted upon?"

The ability to effectively mine and analyze this data can result in faster claims resolution, better data modeling and forecasting and considerable loss cost savings for insurers. When you consider that a small number of claims cases constitutes a significant portion of a carrier's total claims payout, it follows that small improvements in the claims payout process can offer significant financial benefits to insurance companies.

Claims professionals now have the ability to mine actionable information from both structured and unstructured data sources through the use of semantic technologies and text data mining, a first step in the effective use of predictive analytics. In fact, the primary objective of text mining is to create new information that has predictive value.

Characteristics of Claims Data

Carriers have accumulated raw claim or loss data that amounts to millions of text records. Examples of text data include claim description fields in claim files, the content of e-mails, underwriters' written evaluations of prospective policyholders contained in underwriting files and responses to open-ended questions on customer satisfaction surveys. Until recently, this information was available, yet extracting knowledge from it was not financially feasible.

Industry research suggests that 80 percent of any company's data is unstructured and that 90 percent of that information is unmanaged. In the insurance industry, claims data is actually 97 percent unstructured, with the richest information residing in the unique set of words, acronyms and abbreviations that make up the adjuster's notes. So, if insurers are not effectively using text mining, it follows that important business decisions are being made based upon only about three percent of the information available. When you consider the size of the insurance industry, that percentage is a staggering revelation.

While the richest information may reside in the adjuster notes, those notes are just a part of a claim department's unstructured data inventory. There also are e-mails with attachments, imaged documents, case manager notes, Web-based information, recorded statement transcriptions (or digital audio files) and digital photos. All of this valuable content currently exists within insurance companies' data inventories: the information gold that the industry has yet to mine effectively.

The Science of Text Mining

While you might be unfamiliar with the term "text mining," most of us encounter and use this technology in our daily lives through search engines. A user types in a word or phrase, which might include misspellings, and the search engine searches through a vast repository of documents to find the most relevant ones and list the results, all within seconds.
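The misspelling-tolerant lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample documents, the `build_index` and `search` helpers and the 0.8 similarity cutoff are all hypothetical, and real search engines use far more sophisticated indexing and ranking.

```python
import difflib

# Hypothetical mini-repository of documents (invented for illustration).
DOCUMENTS = {
    "doc1": "subrogation recovery pursued against the at-fault carrier",
    "doc2": "windshield replacement approved after hail storm",
    "doc3": "claimant attorney requested subrogation file",
}

def build_index(documents):
    """Map each lowercase word to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def search(query, index, cutoff=0.8):
    """Return (matched_term, sorted document ids), tolerating small misspellings.

    The cutoff is an assumed similarity threshold, not a recommended value.
    """
    matches = difflib.get_close_matches(query.lower(), list(index), n=1, cutoff=cutoff)
    if not matches:
        return None, []
    term = matches[0]
    return term, sorted(index[term])

INDEX = build_index(DOCUMENTS)
term, hits = search("subrogaton", INDEX)  # query deliberately misspelled
print(term, hits)
```

Here the misspelled query is matched to the closest indexed word before the documents containing that word are returned.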

Other popular text mining applications include spam identification programs associated with e-mail accounts, call center routing, analysis of open-ended survey questions and global monitoring for early warning of public health threats such as SARS.

Although software for analyzing text was commercially available in the 1990s, text mining is a relatively new science. Early data-mining applications often yielded disappointing results and tended to be unnecessarily complex, and a general lack of data-mining knowledge fed skepticism about their overall effectiveness. However, continued technology innovation in recent years has expanded text mining's capabilities and raised its stature as a vital business tool for the insurance industry. Current data-mining applications feature expanded analytics, user-friendly interfaces and powerful algorithms that allow researchers to analyze structured and unstructured data.

Structured vs. Unstructured Data

Structured data is standardized, easily entered and handled information, such as numbers or company-designated codes for financial information, line-of-business abbreviations, causes of losses and the like.

In contrast, unstructured data, which makes up the majority of current claims data, is non-standardized, freeform, explanatory information such as adjuster notes, imaged documents, telephone call transcripts and e-mail messages. This type of data is a gold mine of rich, but hidden, information. Unlocking it and extracting its actionable business value has been a daunting but necessary task.

Historically, companies have tried to gain access to this data in manually intensive ways, extracting hard copy files from departments and having teams of individuals review them for specific information. This was an extremely time-consuming and expensive task, and given the effort required and the difficulty of interpreting the unstructured data, it rarely produced financial results that warranted the investment.

Automate and Expedite

The automatic extraction of information from unstructured data is a significant achievement because it allows corporations to extract knowledge that can directly affect profitability. Text mining enables machines to do what scientists, researchers, lawyers, librarians and ordinary readers have been doing without conscious reflection for as long as text has existed: finding patterns and creating intelligence from text data. Text mining magnifies the human ability to identify complex patterns that are typically indiscernible without the application of statistical techniques.

In the claims arena, text mining applications can extract information related to a wide range of claims issues, from identifying all claims involving a particular body shop, attorney or treating physician to determining whether an auto accident involved DUI/DWI or road rage. With text mining, insurers can quickly work through this large volume of data and discern patterns that group claimants together or identify them as unique. Data can also be gathered to examine what is distinctive about these claims, what patterns exist within them and what new business opportunities may be opened up by tailoring new products to fit these groups of customers.
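As a rough illustration of this kind of extraction, the Python sketch below scans adjuster notes for mentions of DUI/DWI, road rage and a named body shop. Everything in it is an assumption made for the example: the claim numbers, the note text, the "Acme Body Shop" name and the patterns themselves. A commercial text mining product would rely on much richer vocabularies and statistical techniques than simple pattern matching.

```python
import re
from collections import defaultdict

# Hypothetical adjuster notes keyed by claim number (invented for illustration).
NOTES = {
    "CLM-001": "Insd rear-ended at light. Other driver cited for DUI. Veh towed to Acme Body Shop.",
    "CLM-002": "Clmt states other driver was tailgating and yelling, poss road rage. Atty J. Smith retained.",
    "CLM-003": "Hail damage to roof and hood. Estimate pending from Acme Body Shop.",
}

# Assumed patterns; a production system would need a far broader vocabulary.
PATTERNS = {
    "dui_dwi": re.compile(r"\b(dui|dwi|intoxicated|drunk)\b", re.IGNORECASE),
    "road_rage": re.compile(r"\broad\s+rage\b", re.IGNORECASE),
    "acme_body_shop": re.compile(r"\bacme\s+body\s+shop\b", re.IGNORECASE),
}

def flag_claims(notes):
    """Return a mapping from flag name to the sorted claim numbers whose notes match."""
    flagged = defaultdict(list)
    for claim_id, text in notes.items():
        for flag, pattern in PATTERNS.items():
            if pattern.search(text):
                flagged[flag].append(claim_id)
    return {flag: sorted(ids) for flag, ids in flagged.items()}

flags = flag_claims(NOTES)
for flag, claim_ids in sorted(flags.items()):
    print(flag, claim_ids)
```

Grouping claims by flag in this way gives an analyst a short list of files to review rather than the entire inventory.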

The use of text mining to collect, review and process unstructured data automates claim analysis by quickly doing the job of multiple workers scouring through online adjusters' notes and incident photos on a file-by-file, inquiry-by-inquiry basis. Yet, text mining is unlikely to replace human researchers or supplant traditional research methods. Rather, it will augment existing capabilities. Many organizations have found a hybrid approach compelling by exploiting the speed and capacity provided by text mining and automating processing to handle the initial intake, filtering and processing steps while leaving final high-value analyses to human analytical experts.

Benefits for Claims Managers/Adjusters

Text mining enables carriers to gather data and analyze field reports, notes and e-mails in order to respond swiftly to customer claims concerns and ensure ongoing customer satisfaction. This analysis allows adjusters and managers to quickly identify and understand unique characteristics in the claims by identifying patterns that may indicate special conditions or fraud. Identification of fraudulent claims or claims that are candidates for subrogation recovery enables carriers to mobilize data quickly, enhance workflow and ultimately streamline the claims management process.

When information can be gathered faster, claims processing time is reduced and carriers save money. For example, in the automotive industry, warranty claims cost roughly $14 billion per year in the United States. It was found that effective text mining can shorten the processing of a warranty claim, from problem identification to resolution, by up to 10 days and reduce the overall number of claims by 5 percent.

Keep in mind that while text mining allows data to be transformed into insights for decision-making in the claim process, these insights must be delivered to claims handlers, investigators or recovery specialists in a useful form and early enough in the claims process to add significant value.

As new types of claims or new patterns of claiming behavior begin to emerge, databases containing freeform claim description fields or narratives describing accidents could contain information that is not found in standard claims coding. Using text mining, insurers will have the ability to identify suspicious claims that would never be found in standard database mining.
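One simple way to picture how emerging patterns might surface from freeform narratives is to compare the words in recent claim descriptions against an older baseline. The Python sketch below is a hedged illustration only: the narratives, the `emerging_terms` helper and the `min_docs` threshold are hypothetical, and it stands in for the more capable statistical methods a real text mining tool would apply.

```python
import re
from collections import Counter

# Hypothetical claim narratives (invented for illustration).
BASELINE = [
    "rear end collision at intersection",
    "hail damage to roof",
    "collision in parking lot",
]
RECENT = [
    "collision while driver was texting",
    "driver texting at intersection",
    "hail damage to hood",
    "texting driver ran red light",
]

def term_counts(narratives):
    """Count, for each word, the number of narratives in which it appears."""
    counts = Counter()
    for text in narratives:
        # A set ensures each word is counted once per narrative.
        counts.update(set(re.findall(r"[a-z]+", text.lower())))
    return counts

def emerging_terms(baseline, recent, min_docs=2):
    """Return words found in at least min_docs recent narratives but in no baseline narrative."""
    base = term_counts(baseline)
    new = term_counts(recent)
    return sorted(term for term, n in new.items() if n >= min_docs and base[term] == 0)

result = emerging_terms(BASELINE, RECENT)
print(result)
```

In this toy data set, words tied to texting while driving appear only in the recent narratives, which is the kind of signal that standard claims coding would not capture.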

Improving Profitability

The field of text mining is undergoing rapid development, and the use of predictive analytics is helping to drive that growth. As data mining becomes more sophisticated, the knowledge it extracts from insurance information will become more valuable to insurers and a key to sustaining a competitive advantage. Technology innovation is feeding the industry's appetite as vendors offer the technology expertise and industry experience that help insurers leverage their data resources.

Carriers have a real opportunity to harvest the knowledge from claims data and transform that knowledge into improved claims processes, a refocused effort on exception handling and the automation of routine claims. Carriers now have the ability to effectively streamline the claims process, improve profitability and retain their competitive advantage in the global marketplace.

Stephen Holcomb is the founder, president and CEO of Full Capture Solutions. Mr. Holcomb can be reached at steve@fullcapture.com.

