Data science now plays a vital role in modern insurance companies. The need for security in models and systems driven by cutting-edge data science is following an evolutionary track similar to that of internet security needs. RGA’s Mr Jeffrey Heaton discusses the science, the issues and how insurers can benefit from a greater understanding.
One of the original intents of the internet was to allow reliable communication among computer systems. The designers’ primary fear at the time was disruption from unreliable communication links. Security against hackers was barely an initial concern.
As the internet matured, data security concerns moved to the forefront. And with the rise of technologies such as multi-factor authentication (MFA), blockchain, public-private key encryption, and others, the continuing arms race that is internet security is filled with more risk and more potential tripwires than ever.
Predictive models and machine learning
Today, predictive models are using machine learning (ML) to automate certain important tasks in underwriting – tasks once the exclusive domain of human underwriters, actuaries and financial professionals. These predictive models, when placed behind application programming interfaces (APIs), have even become products in their own right. An API is a software intermediary that allows two applications to talk to one another, and insurance companies with APIs can leverage one another’s predictive models by sending policy applications over the internet to obtain risk scores.
Elements such as authentication, encryption, and how best to deal with and protect personally identifiable information (PII) remain primary IT concerns for predictive modelling, but with predictive models becoming increasingly commercialised, new security concerns are emerging – concerns unique to ML.
Adversarial attacks are a form of hacking specific to ML algorithms. In these attacks, an adversarial neural network is developed and used specifically to attack the original neural network application. Landmark research in this area, published in 2017 by data scientists at Kyushu University, Japan, showed that deep learning neural networks trained for computer vision could be fooled into misclassifying an image by the manipulation of as little as a single pixel.
Deep learning neural networks are typically very robust to noise in an image. This means that changing a single random pixel, or for that matter a large number of random pixels, will usually have minimal effect on the network’s ability to identify an image accurately. However, the researchers found an Achilles’ heel: an adversarial deep learning neural network can be trained to find weaknesses in an original machine learning application, and can then be used to find ways to compromise the original application’s predictive capabilities.
Cat or dog?
Here’s how it works: A low-resolution image of a dog can easily be identified and then properly classified as a dog by a well-trained deep learning neural network. However, an adversarial deep learning neural network can strategically introduce a single pixel at a specific location and even a specific colour (in this case, green) in the image, thus convincing the neural network it is looking at a cat. How? The adversarial network can calculate which pixel to change and what colour the changed pixel should be, in order to compromise the target network.
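The idea of calculating which pixel to change, and to what value, can be sketched with a toy example. The sketch below is an illustration under stated assumptions, not the method used in the research described above: the 4x4 greyscale image, the hand-picked linear scoring weights, and the names `score`, `classify`, `apply_change` and `one_pixel_attack` are all hypothetical, and the search simply tries every pixel with a handful of candidate values rather than training an adversarial network.

```python
# Toy illustration only: a hypothetical linear "dog vs cat" scorer on a
# 4x4 greyscale image, attacked by trying every single-pixel change.

SIZE = 4

# Hypothetical weights. A positive score means "dog", otherwise "cat".
WEIGHTS = [
    [0.2, 0.1, 0.1, 0.2],
    [0.1, -0.9, 0.3, 0.1],
    [0.1, 0.3, 0.3, 0.1],
    [0.2, 0.1, 0.1, 0.2],
]
BIAS = -1.0


def score(image):
    """Weighted sum of pixel intensities plus a bias term."""
    total = BIAS
    for row in range(SIZE):
        for col in range(SIZE):
            total += WEIGHTS[row][col] * image[row][col]
    return total


def classify(image):
    return "dog" if score(image) > 0 else "cat"


def apply_change(image, row, col, value):
    """Return a copy of the image with one pixel replaced."""
    changed = [list(pixels) for pixels in image]
    changed[row][col] = value
    return changed


def one_pixel_attack(image, candidate_values=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Find the single-pixel change that flips the label most decisively.

    Returns (row, col, value), or None if no single change flips the label.
    """
    original_label = classify(image)
    best_change, best_margin = None, -1.0
    for row in range(SIZE):
        for col in range(SIZE):
            for value in candidate_values:
                trial = apply_change(image, row, col, value)
                if classify(trial) == original_label:
                    continue  # this change does not fool the classifier
                margin = abs(score(trial))  # distance past the decision boundary
                if margin > best_margin:
                    best_change, best_margin = (row, col, value), margin
    return best_change


# A mid-grey image with one dark pixel, which the toy scorer labels "dog".
dog_image = [[0.5] * SIZE for _ in range(SIZE)]
dog_image[1][1] = 0.0

change = one_pixel_attack(dog_image)
attacked_image = apply_change(dog_image, *change)
print(classify(dog_image), "->", classify(attacked_image), "via", change)
```

Each trial in the search is one query to the classifier. A real attack would target a far larger model and would need many more queries.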
Research continues in this area: in 2018, a team of researchers from four US universities – University of Michigan, Ann Arbor; University of Washington; University of California, Berkeley; and Stony Brook University – collaborated to test whether a similar technique could disrupt the ability of computer vision algorithms to recognise standard traffic signs. This could have serious ramifications in applications such as self-driving cars, which need to identify traffic signs while on the road. Though most self-driving cars can navigate entirely from maps and global positioning systems, they must also be able to sense and process the world around them. To that end, signage is particularly important.
A stop sign, for example, could be modified via the placement of black and white bars upon the image, at points predetermined by an adversarial neural network. A human driver would recognise such alterations as simple graffiti and would ignore them, but the deep neural network would have difficulty interpreting the sign properly.
Figure: Confusing a deep learning neural network
Life insurance implications
With the rise of ML applications in insurance, the industry is becoming increasingly vulnerable to adversarial attacks both in the realms of computer vision and in traditional predictive modelling.
Computer vision is finding applicability in life and health insurance. Life insurers are using computer vision not only to extract data from attending physician statements, but also to analyse facial images to assess an individual’s health and predict important measures such as body mass index (BMI), tobacco usage, and age. Unfortunately, adversarial attacks have made it possible to fool computer vision systems into giving BMIs more favourable to the applicant by varying lighting, facial expressions, and for men, the amount of facial hair. An expert at such adversarial attacks could even optimise each of these controllable variables in order to receive more favourable values for BMIs and age.
Adversarial attacks are not limited to computer vision applications. Researchers from Harvard and MIT published a paper in 2018 that identified electronic health records (EHRs) as a possible target for deep neural network adversarial attacks.
Countering malicious activity
Detecting and defending against adversarial attacks is a new and growing area of computer security research. Since 2017, ‘white hat’ hackers have been researching how ML systems can be breached in order to defend them against such breaches. However, even at this relatively early point in the evolution of this security risk, some general recommendations can be made for defence.
Mainstream adversarial attacks require repeated access to the ML model being exploited. The defence against this is twofold. The first line of defence, for insurers, is to ensure that the API serving the model is well protected. Regular IT best practices for APIs should be followed, such as restricting access to a company’s API to known clients only. Even so, an adversarial attack might originate with a successful hack of a client’s systems that have access to your APIs. From there, hackers can submit many requests to your API, using the responses to begin to build their attack strategy.
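The first line of defence can be sketched in a few lines. The example below is a minimal illustration, assuming a hypothetical `ApiGatekeeper` class that sits in front of the model: it rejects callers that are not on a list of known clients and caps how many requests each known client may make within a sliding time window. The client name, the limit of three requests and the 60-second window are made-up values, not recommendations.

```python
from collections import defaultdict, deque


class ApiGatekeeper:
    """Hypothetical front door for a model API: allowlist plus rate limit."""

    def __init__(self, known_clients, max_requests, window_seconds):
        self.known_clients = set(known_clients)
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        # Timestamps of recently accepted requests, per client.
        self._recent = defaultdict(deque)

    def allow(self, client_id, now):
        """Return True if the request at time `now` (seconds) may proceed."""
        if client_id not in self.known_clients:
            return False  # unknown callers never reach the model

        recent = self._recent[client_id]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()

        if len(recent) >= self.max_requests:
            return False  # a known client is sending too many requests

        recent.append(now)
        return True


gate = ApiGatekeeper({"insurer-a"}, max_requests=3, window_seconds=60.0)
decisions = [gate.allow("insurer-a", t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
stranger_allowed = gate.allow("unknown-caller", 5.0)
print(decisions, stranger_allowed)
```

Capping requests per client also limits the damage when a legitimate client’s credentials are stolen, since the attacker inherits the same cap.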
The second line of defence is monitoring the APIs and their connected models. As adversarial networks and modes of attack are developed by hackers, unusual and even unorthodox inputs will likely be presented to your models. A significant and unexplained spike in requests from one client to your API could indicate an adversarial attack. Similarly, a large number of transactions in which the individual model inputs are considerably outside the expected distributions for those inputs could indicate the probing examples that such an attack will generate.
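Both monitoring signals can be sketched with simple statistics. The example below is a rough illustration rather than a production monitor: the function names, the `age` and `bmi` inputs, the baseline values, the three-standard-deviation cut-off and the five-fold spike multiplier are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def out_of_range_fraction(batch, baseline, z_threshold=3.0):
    """Fraction of records with at least one input far from its baseline.

    `baseline` maps each input name to historical values; a record is
    flagged when any input lies more than `z_threshold` sample standard
    deviations from that input's historical mean.
    """
    stats = {name: (mean(vals), stdev(vals)) for name, vals in baseline.items()}
    flagged = 0
    for record in batch:
        for name, (mu, sigma) in stats.items():
            if sigma > 0 and abs(record[name] - mu) / sigma > z_threshold:
                flagged += 1
                break  # count each record at most once
    return flagged / len(batch) if batch else 0.0


def is_request_spike(historical_counts, current_count, multiplier=5.0):
    """True when a client's request count far exceeds its historical mean."""
    return current_count > multiplier * mean(historical_counts)


# Hypothetical history of model inputs seen in normal operation.
baseline = {
    "age": [30, 35, 40, 45, 50],
    "bmi": [22.0, 24.0, 26.0, 28.0, 30.0],
}

# Hypothetical batch from one client; two records look like probing inputs.
batch = [
    {"age": 41, "bmi": 25.0},
    {"age": 39, "bmi": 60.0},
    {"age": 120, "bmi": 27.0},
    {"age": 44, "bmi": 23.0},
]

suspicious_share = out_of_range_fraction(batch, baseline)
spike = is_request_spike([100, 120, 110, 90], current_count=2000)
print(suspicious_share, spike)
```

Neither signal proves an attack on its own. In practice, a flagged client would more likely be referred for review than blocked automatically.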
Future landscape of ML security
The unique aspects of predictive model security could soon become as great a concern as internet security has become in recent years. As this area of information security grows, new techniques will be developed to help assure the security of predictive models. However, internet security, much like fraud, is a constant race between security professionals and malicious hackers. It is important to remain well-informed of new attack techniques and the measures that can be employed to combat them. Security must be a prime consideration for predictive model development and commercialisation.
Mr Jeffrey Heaton is vice president and data scientist at RGA.