4/3/2008

Predictive Analytics

It's everywhere

By Donna J. Popow

When it comes to effectively forecasting future trends and outcomes, claims professionals are beginning to put aside their crystal balls in favor of predictive analytics, a tool being heralded by vendors and upper management as a powerful new best practice.

With a tremendous amount of raw data on the characteristics and preferences of their customers already collected and warehoused, insurers believe that predictive analytics, which uses this warehoused data to predict future claims trends, will contain claims costs, improve claims efficiency and help detect fraud.

Factors such as the quality of the data used, cost of the effort to obtain quality data, and implementation of predictive analytics as a resource for adjusters, however, will affect the success of this tool in reshaping claims management.

What Is Predictive Analytics?
Predictive analytics is a broad term describing a variety of statistical and analytical techniques used to develop models that predict future events or behaviors.

Data mining, a component of predictive analytics, is the process of analyzing all available data to determine if there are any relationships between the variables. This information can then be used to develop a predictive model.

Predictive Models: Credit Score

The most prevalent examples of predictive models are those used by credit reporting agencies (CRAs), commonly known as credit bureaus. Experian, Equifax, and TransUnion are three of the largest. Each credit bureau uses a variety of information about an individual (income, credit history, outstanding loan balances, and so forth) to develop a credit score that predicts the likelihood that he or she will repay current and future debts. The higher the credit score, the more likely the individual is to repay his or her debt.

In looking at a large number of workers’ compensation claims, for example, data mining may determine that the frequency is higher when the economy takes a downturn. This relationship can then be used to predict trends in workers’ compensation claims when economic indicators predict an economic downturn.

Data mining has no predefined relationships. The objective is to sift through the data to uncover any trends or relationships that may be present and identify them as predictors. When data mining is joined with other predictive analytic techniques, organizations are better able to discover trends and relationships that may not be readily apparent but may forecast future events or behaviors.
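As a rough illustration of sifting data for candidate predictors, the sketch below ranks a few variables by how strongly each moves with workers' compensation claim frequency. The quarterly figures, the variable names, and the choice of Pearson correlation are all assumptions made for this example; real data mining tools apply many other techniques.

```python
from math import sqrt

# Invented quarterly figures, for illustration only (not real insurer data).
claims_per_1000_workers = [11.2, 11.0, 11.9, 12.8, 13.5, 14.1, 13.8, 12.9]
candidate_variables = {
    "unemployment_rate": [4.5, 4.6, 4.9, 5.3, 5.8, 6.1, 5.9, 5.4],
    "branch_office_count": [12, 12, 13, 12, 13, 12, 13, 12],
}

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Rank every candidate by the strength of its relationship with claim frequency.
ranked = sorted(
    ((name, pearson(values, claims_per_1000_workers))
     for name, values in candidate_variables.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for name, r in ranked:
    print(f"{name}: r = {r:.2f}")
```

A strong correlation only flags a variable as a candidate predictor; whether it belongs in a model is a separate judgment.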

The form taken by predictive models varies, and depends on the behavior or event being predicted. Most predictive models generate a score (a credit score, for example) with a higher score indicating a higher likelihood of the given behavior or event occurring.

The Predictive Analytic Process
When using predictive analytics, an insurer, for example, starts by aggregating and “cleansing” its data for use in analytics software. Cleansing refers to examining the data and correcting errors, completing or removing incomplete records, and ensuring all data is in a readable format.
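A minimal sketch of what cleansing might look like in practice follows. The record layout, field names, and accepted date formats are hypothetical; an insurer's actual rules for correcting or removing records would be more involved.

```python
from datetime import datetime

# Hypothetical raw claim records; field names and values are invented.
RAW_CLAIMS = [
    {"claim_id": "C-001", "loss_date": "2007-03-14", "paid": "1,250.00"},
    {"claim_id": "C-002", "loss_date": "03/29/2007", "paid": "980"},
    {"claim_id": "C-003", "loss_date": "", "paid": "400.00"},         # incomplete: no loss date
    {"claim_id": "C-004", "loss_date": "2007-05-02", "paid": "n/a"},  # unreadable amount
]

# Date formats assumed to appear in the source systems.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def parse_date(text):
    """Return a date if the text matches a known format, otherwise None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

def parse_amount(text):
    """Return the amount as a float, otherwise None."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None

def cleanse(records):
    """Put usable records into one consistent format and set aside the rest."""
    clean, rejected = [], []
    for rec in records:
        loss_date = parse_date(rec["loss_date"])
        paid = parse_amount(rec["paid"])
        if loss_date is None or paid is None:
            rejected.append(rec["claim_id"])
            continue
        clean.append({"claim_id": rec["claim_id"], "loss_date": loss_date, "paid": paid})
    return clean, rejected

clean_claims, rejected_ids = cleanse(RAW_CLAIMS)
print(len(clean_claims), "usable records; rejected:", rejected_ids)
```

Here incomplete records are simply set aside; completing them from other sources, as the text mentions, would be an alternative.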

Next, data mining is conducted to determine if any underlying trends, patterns or relationships can be found in the data. This is a necessary early step in predictive analytics because the data this mining process identifies as relevant can then be used to develop the predictive model. Think of data mining as gathering knowledge about relationships, and of the resulting predictive model as applying that knowledge.

To ensure that a predictive model is as accurate as possible, it must be validated through out-of-sample testing, that is, tested against data that was not used to build it. For example, suppose an insurer has twenty-four months’ worth of data on the frequency of homeowners’ claims. The modeler may choose to develop the model using only the first eighteen months’ worth of data. Once the model has been developed, data from the final six months can then be used to validate it.
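The eighteen-month and six-month split can be sketched as follows. The monthly claim counts are made up, and the straight-line trend is only a stand-in for whatever model an insurer would actually build; the point is that the error is measured on months the model never saw.

```python
# Made-up monthly homeowners' claim counts for 24 months (illustration only).
monthly_claims = [112, 108, 120, 131, 125, 140, 152, 149, 138, 127, 119, 115,
                  118, 113, 126, 137, 133, 146, 158, 155, 143, 132, 124, 121]

TRAIN_MONTHS = 18
train = monthly_claims[:TRAIN_MONTHS]    # months 1-18: used to build the model
holdout = monthly_claims[TRAIN_MONTHS:]  # months 19-24: used only for validation

def fit_linear_trend(values):
    """Least-squares fit of value = intercept + slope * month_index."""
    n = len(values)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# The model is fitted on the training months only.
intercept, slope = fit_linear_trend(train)

def predict(month_index):
    return intercept + slope * month_index

def mean_abs_error(actual, start_index):
    """Average absolute gap between predicted and actual counts."""
    errors = [abs(predict(start_index + i) - y) for i, y in enumerate(actual)]
    return sum(errors) / len(errors)

in_sample_mae = mean_abs_error(train, 0)
out_of_sample_mae = mean_abs_error(holdout, TRAIN_MONTHS)
print(f"in-sample MAE: {in_sample_mae:.1f}, out-of-sample MAE: {out_of_sample_mae:.1f}")
```

An out-of-sample error much larger than the in-sample error would suggest the model fits the past better than it predicts the future.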

Why Is Predictive Analytics the Hot Topic Now?
With advances in technology, the use of predictive analytics has become more widespread. The statistical techniques used in predictive analytics are computationally intensive; some require performing thousands or millions of calculations. Advances in computer hardware and software design have yielded software packages that perform such calculations quickly, allowing insurers to efficiently analyze the data used to produce and validate their predictive models.

Highly developed technology also permits capturing additional data. For years, adjusters have complained that they are required to enter more and more data into the claims processing system but get nothing in return for their efforts. Today, that data can be used to help adjust a claim.

The validity of any predictive model depends on the quality and quantity of data available to develop it. While most insurers today have a sufficient amount of data (quantity) to develop their predictive models, many store archived claim and policyholder information on legacy systems that may not be compatible with systems running predictive analytics software.

Converting data on these legacy systems to a usable format can be time consuming and costly. Factor in the cost of an adjuster’s time to enter all the data, and it becomes reasonable to question the cost effectiveness of this effort.

Insurers’ Use of Predictive Analytics
In the insurance industry, predictive analytics is largely used in the three core insurer functions: marketing, underwriting and claims. Marketing and underwriting have successfully used predictive analytics for some time; in claims, however, it has been used to a lesser, but growing, extent.

Property-casualty insurers can use predictive analytics in a number of ways, from analyzing the purchasing patterns of insurance customers (marketing) to filtering out applicants who do not meet a pre-determined model score in the risk-selection process (underwriting).

In claims handling, predictive analytics is a more revolutionary concept. Insurers primarily have been using predictive analytics to help identify (and prevent) potentially fraudulent claims. Now, some insurers are using predictive analytics to score claims based on the likely size of the settlement, enabling an insurer to more efficiently allocate resources to larger claims.

The Proverbial Needle in the Haystack
Failing to identify fraudulent claims results in higher claims costs and, therefore, higher premiums for all insureds. Property-casualty insurers traditionally have had difficulty detecting the relatively small number of fraudulent claims (the needle) among the millions of claims filed every year (the haystack). Predictive analytics helps insurers first with early identification of potentially fraudulent claims and, second, with classification of those claims in need of a detailed review.

One of the issues associated with fraud detection is that insurers have not been able to capture data on fraud that escaped being identified, which has always skewed the data. But data mining does not rely on already identified fraud. Data mining looks for relationships in claims files that individuals may overlook.

As an example: John Smith, an adjuster, might easily realize that the same attorney is representing all the claimants from a single accident, but John could not possibly know that this attorney has been referring hundreds of clients to the same doctor over a period of years. Any tools that can help adjusters and other claims professionals recognize patterns such as this will aid in the accurate identification of fraudulent claims and improve the claims process.
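The attorney and doctor pattern in this example can be approximated with a simple count across claim files. The sketch below is a hypothetical illustration: the records, the names, and the threshold are invented, and a flagged pair is only a prompt for review, not a finding of fraud.

```python
from collections import Counter

# Invented claim records spanning several accidents; all names are hypothetical.
claims = [
    {"claim_id": 1, "attorney": "Attorney A", "doctor": "Doctor X"},
    {"claim_id": 2, "attorney": "Attorney A", "doctor": "Doctor X"},
    {"claim_id": 3, "attorney": "Attorney B", "doctor": "Doctor Y"},
    {"claim_id": 4, "attorney": "Attorney A", "doctor": "Doctor X"},
    {"claim_id": 5, "attorney": "Attorney C", "doctor": "Doctor X"},
    {"claim_id": 6, "attorney": "Attorney A", "doctor": "Doctor X"},
    {"claim_id": 7, "attorney": "Attorney B", "doctor": "Doctor Z"},
]

# Assumed cutoff for this toy data set; a real one would be set from experience.
REFERRAL_THRESHOLD = 3

def frequent_pairs(records, threshold):
    """Return attorney/doctor pairs that appear together on many claims."""
    counts = Counter((rec["attorney"], rec["doctor"]) for rec in records)
    return {pair: n for pair, n in counts.items() if n >= threshold}

flagged = frequent_pairs(claims, REFERRAL_THRESHOLD)
for (attorney, doctor), n in flagged.items():
    print(f"{attorney} and {doctor} appear together on {n} claims")
```

No single adjuster sees all seven files, which is why a count across the whole claims system can surface what an individual would miss.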

What data mining and predictive analytics cannot do, even after analyzing the totality of circumstances, is identify whether there are legitimate explanations for one or more fraud indicators in a particular claim or group of claims. Consequently, an insurer’s identifying legitimate claims as fraudulent may anger policyholders and result in litigation or accusations of bad faith in claims practices.

A second use of predictive analytics in the claims process is prioritization of claims for handling. By looking for relationships between the present claim and past claims (for example, type of medical treatment recommended or law firm involved in litigation), predictive analytics can help identify at an early stage claims that are likely to be settled for higher values.

Predictive Analytic Techniques—The Advantages for Insurers
  • Helps marketing more precisely identify potential policy sales through analysis of customer purchasing patterns
  • Reduces the employee hours underwriters may spend researching and analyzing an applicant who ultimately is not a desired insured
  • Provides predictive modeling scores for applicants that can be used as a rating mechanism for determining a variety of policy price/product points
  • Helps identify potentially fraudulent claims
  • Scores claims based on the likely size of the settlement, enabling an insurer to more efficiently allocate resources to higher priority claims

These higher-value claims can therefore be classified as claims requiring a specific level of expertise in order to be adjusted effectively. Accurately identifying these claims helps the claims department operate more efficiently and improve customer service. Other possible uses of predictive analytics include scoring claims based on the probability of successful subrogation, and more accurate case reserving.

The Advantages of Using Predictive Analytics in Insurance
If knowledge is power, then the advantages of predictive analytics are clear. Predictive analytic techniques allow insurers to better understand their data and how to use it to predict future events. Proper implementation of predictive analytic techniques can improve an insurer’s consistency and efficiency in marketing, underwriting, and claims services by helping to define target markets, increasing the number of policy price points, and detecting and reducing claims fraud.

The Disadvantages of Predictive Analytics in Insurance
While nearly all insurers find that the benefits of predictive analytics outweigh its costs, the techniques of modeling result in inherent disadvantages. A model’s potential for inaccuracy is an important consideration for insurers relying on predictive modeling. Just like a credit score, a model indicates what is likely to occur, not certain to occur. Just because a predictive model indicates that a claim may be fraudulent does not mean it is fraudulent; it just means that the claim displayed some of the characteristics of similar claims in the past that have proven to be fraudulent.

Predictive Analytic Techniques—The Disadvantages for Insurers
  • Inherent inaccuracy of the predictive model
  • Cost of implementing predictive analytic techniques, including an investment in new software and hardware
  • Resistance to change within the organization
  • Need for clean, accurate data

In addition to a predictive model’s possible inaccuracy, an insurer’s use of predictive analytics may result in additional disadvantages, many of which are associated with the implementation of the operational changes that using predictive analytic techniques requires. An insurer may find that implementing predictive analytic techniques, including an investment in the hardware and software necessary to facilitate predictive modeling, is too costly an investment. Also, poor record keeping and multiple legacy systems often indicate that the insurer does not have the clean, accurate data necessary to support a successful predictive modeling platform, which creates the need for further financial outlay.

Finally, as with any substantial change in operations, an insurer may encounter resistance from within to the incorporation of predictive analytic techniques that streamline operations and reduce the demand for human resources, particularly from employees who may feel their jobs are being marginalized.

Social or Regulatory Implications of Predictive Analytics
One of the advantages of predictive modeling is that it may detect relationships among the data, or predictors/indicators, of potential losses/claims that may not be readily apparent to insurers or that may not be readily explainable. However, an insurer must be able to justify charging differential premiums to customers based on a predictive model output.

Some consumer organizations and regulators, for example, have resisted insurers’ use of an insurance score, or credit score, as a pricing factor for policies. Insurers initially could not explain why the relationship between credit scores and loss ratios existed, thus making it difficult to justify using the relationship to price policies.

While such use of an insurance score is becoming more widely accepted, this type of resistance may become more likely if the predictive model factors used to justify pricing are not intuitive. What if a predictor of losses is not just the education level a potential insured attained but the high school he or she attended, or the hospital where he or she was born? How insurers justify the factors a predictive model uses may be just as important as discovering the relationships.

Where Do Insurers Go From Here?
According to Alan Kay, a renowned computer scientist who worked for companies such as Xerox, Apple, and Disney, “The best way to predict the future is to invent it.” This is the opportunity predictive analytics can provide claims—draw on the past to better forecast and create the future. Insurers should take advantage of every tool that can make a claims handler more efficient and effective. Keeping the disadvantages in mind, when used properly, predictive modeling can be a legitimate claims tool to increase efficiency, effectiveness, competitiveness, and profitability.
Donna J. Popow, JD, CPCU, AIC and Charles M. Nyce, PhD, CPCU, ARM serve as senior directors of Knowledge Resources at the American Institute for CPCU and Insurance Institute of America (the Institutes) in Malvern, Pennsylvania. Popow has responsibility for all aspects of claims education including the Associate in Claims designation program and the Introduction to Claims certificate program. Nyce is the primary author of CPCU 510 – Foundations of Risk Management and Insurance and a coordinating author on ARM 54 – Risk Assessment.



Donna J. Popow, JD, CPCU, AIC, is president of Donna J. Popow LLC, and has more than 25 years of experience in the property and casualty insurance industry. She has been a CLM Fellow since 2007 and can be reached at (215) 630-0829.
