09/05/2022
In this blog we document how LexisNexis® ThreatMetrix® works as a live deployment platform for advanced machine learning models, delivering easy implementation and high performance.
What makes our job challenging is also what makes it so interesting. Fraud is constantly evolving, so we too must keep evolving and adapting to the ever more complex and inventive techniques fraudsters dream up to scam unfortunate individuals out of their hard-earned cash. Strong signals from session profiling tools, combined with a traditional scoring engine, already do a good job of detecting third-party fraud, so defences against that particular attack type have improved dramatically in recent years across all industries.
But in the fraud world, no sooner is one weakness addressed than fraudsters find a new modus operandi, and you’re back to square one. What is more, each evolution tends to be progressively harder to tackle from a data perspective.
Scams are a great example. Most of the time, the fraudulent transactions are conducted by the victims themselves, from their own devices, which makes them much harder to spot. After all, how can you tell if the individual is acting of their own free will, or under duress? The fraudulent signals are often far more subtle than for traditional third-party fraud, hence the challenge posed for fraud prevention specialists such as our Professional Services Team.
Linear scorecards, such as the ThreatMetrix default scoring policy, have a strong track record of detecting fraud effectively while also being easy to use and 100% ‘clearbox’, meaning decisions are transparent and explainable. However, they come with some limitations. For one, they are based on binary features (also referred to as “rules”), which need to be manually designed from the continuous variables, i.e. things that can be used to predict potential fraudulent activity, such as how many times in an hour a user logs in, or whether a new account beneficiary has been created in the past few minutes. Although rules can be combined, most of them must be independently checked, weighted, and summed in sequence (hence the term “linear”), which is not always appropriate for complex fraud scenarios.
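To make the idea of a linear scorecard concrete, here is a minimal sketch in Python. The rule names, thresholds and weights are hypothetical and chosen purely for illustration; this is not the ThreatMetrix policy engine, only an example of binary rules being derived from continuous variables, checked independently, weighted and summed.

```python
# Hypothetical rules and weights, for illustration only.
# Each rule turns a continuous variable into a binary feature.
RULES = [
    # (rule name, predicate over the event's variables, weight)
    ("many_logins_last_hour", lambda e: e["logins_last_hour"] >= 5, 30),
    ("new_beneficiary_recent", lambda e: e["minutes_since_new_beneficiary"] <= 10, 45),
    ("large_payment", lambda e: e["payment_amount"] >= 5000, 25),
]


def scorecard_score(event):
    """Check each rule independently and sum the weights of the rules that fire."""
    fired = [(name, weight) for name, predicate, weight in RULES if predicate(event)]
    total = sum(weight for _, weight in fired)
    reasons = [name for name, _ in fired]
    return total, reasons


event = {
    "logins_last_hour": 7,
    "minutes_since_new_beneficiary": 3,
    "payment_amount": 120.0,
}
score, reasons = scorecard_score(event)
print(score, reasons)
```

Because the score is simply the sum of the weights of the rules that fired, the list of fired rules is itself the explanation, which is what makes this kind of model clearbox. The price is that each threshold and weight has to be designed by hand.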
Machine learning, on the other hand, is a method of data analysis that automates the model building process, thereby doing away with the need for a linear set of rules. In contrast to scorecards, it automatically consumes all the available continuous variables simultaneously and can capture subtle interactions between these fraud predictors. A classic example of a machine learning algorithm is a decision tree. By feeding continuous variables and fraud data into the tree, the algorithm determines the best combination of predictors to most effectively authenticate the user and screen for potential fraudulent behaviour, without adding delays to the process.
Although standard scorecards can match these more advanced techniques for some traditional fraud types, machine learning models are, for the most part, far more effective at detecting subtle fraud risk scenarios, such as scams where the genuine user is being manipulated into carrying out the transaction themselves.
ThreatMetrix Dynamic Decision Platform (DDP) is designed for flexibility and supports most types of machine learning algorithm. Armed with this powerful tool, our data scientists have been able to build high-performing models to prevent scams for our banking and financial services customers.
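As a rough illustration of how a decision tree picks its predictors, the sketch below finds the single best split for a toy data set by minimising weighted Gini impurity. The data, the variable names and the choice of Gini impurity are assumptions made for this example; a real tree learner applies the same step recursively to each resulting branch, and nothing here describes the ThreatMetrix implementation.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0.0 means the group is pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (feature, threshold, impurity) for the split with the lowest weighted Gini."""
    best = None
    n = len(rows)
    for feature in rows[0]:
        for threshold in sorted({r[feature] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
            right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
            if not left or not right:
                continue  # a split must send events to both sides
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best[2]:
                best = (feature, threshold, impurity)
    return best


# Toy, made-up events: 1 = confirmed fraud, 0 = genuine.
rows = [
    {"logins_last_hour": 1, "minutes_since_new_beneficiary": 500},
    {"logins_last_hour": 2, "minutes_since_new_beneficiary": 300},
    {"logins_last_hour": 6, "minutes_since_new_beneficiary": 400},
    {"logins_last_hour": 1, "minutes_since_new_beneficiary": 4},
    {"logins_last_hour": 7, "minutes_since_new_beneficiary": 2},
    {"logins_last_hour": 3, "minutes_since_new_beneficiary": 8},
]
labels = [0, 0, 0, 1, 1, 1]

best = best_split(rows, labels)
print(best)
```

Unlike the hand-designed scorecard rule, the threshold here is learned from the labelled data: the algorithm scans every variable and every candidate cut point and keeps the one that separates fraud from genuine events most cleanly.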
In a recent example, we developed a scam detection model to help a tier-one UK bank tackle authorised push payment (APP) fraud, a major challenge for them since 2021. We elected to use a forest of boosted trees, an ensemble of decision trees built using machine learning. The expertise of our data scientists, along with strong domain knowledge, a long-term customer relationship and an excellent understanding of the bank’s challenges, paved the way for a collaborative model-building process and a high-performing model tailored to the business’s needs. Working with the client, we focussed on:
Using these best practices, the model was built and deployed in under one month and achieved top performance in production for this tier-one UK bank. ThreatMetrix alerted on around 50% of scams, at a cost of one alert for every 3,500 online banking payments, representing roughly 0.03% of the bank’s daily traffic. In comparison, a traditional scorecard would capture only around a third as many scams for a similar number of alerts.
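The arithmetic behind these figures can be checked in a few lines of Python. The capture rate and alert ratio are taken from the paragraph above; the daily payment volume is a hypothetical round number used only to make the alert count concrete.

```python
# Figures quoted in the text.
payments_per_alert = 3500
model_capture_rate = 0.50  # around 50% of scams alerted on

# One alert per 3,500 payments, expressed as a percentage of traffic.
alert_rate_pct = 100 / payments_per_alert

# "Three times fewer" scams captured by a scorecard at a similar alert volume.
scorecard_capture_rate = model_capture_rate / 3

# Hypothetical daily volume, an assumption for illustration only.
daily_payments = 1_000_000
alerts_per_day = daily_payments / payments_per_alert

print(f"alert rate: {alert_rate_pct:.3f}% of payments")
print(f"alerts per day at {daily_payments:,} payments: {alerts_per_day:.0f}")
print(f"scorecard capture at similar alert volume: {scorecard_capture_rate:.0%}")
```

One alert in 3,500 payments works out at just under 0.03% of traffic, which matches the rounded figure quoted above.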
Following the building process, our data scientists were able to deploy the model live in ThreatMetrix DDP with just a few clicks. Thanks to an extremely lightweight computer language and a well-designed back end, the model is scoring live events without introducing any latency into the customer journey – that’s around 3000 complex calculations in less than 2 milliseconds, all deployable in a matter of hours.
Scores are then used for live decisioning downstream in the customer journey. The decision engine provides the client with a wide range of configurable options for setting the decision threshold, enabling them to calibrate it easily to support their operational objectives. In addition to deploying a model, ThreatMetrix’s transparent policy engine can also be used to create business rules around the model, or to combine its score with those of other models (in an ensemble of models).
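The sketch below shows, in generic Python, what thresholding a score, wrapping it in a business rule and combining several model scores might look like. The function names, thresholds, score scale and the "trusted beneficiary" rule are all hypothetical; the actual configuration options of the ThreatMetrix decision engine are not described in this post.

```python
def ensemble_score(scores, weights=None):
    """Weighted average of several model scores, each assumed to lie in [0, 1]."""
    weights = weights or [1.0] * len(scores)
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)


def decide(score, review_threshold=0.5, reject_threshold=0.9, trusted_beneficiary=False):
    """Map a score to an action using two configurable thresholds and one business rule."""
    if score >= reject_threshold:
        # Hypothetical business rule: payments to a trusted beneficiary
        # are sent for review instead of being rejected outright.
        return "review" if trusted_beneficiary else "reject"
    if score >= review_threshold:
        return "review"
    return "pass"


print(decide(0.95))
print(decide(0.95, trusted_beneficiary=True))
print(decide(ensemble_score([0.8, 0.4])))
```

Raising or lowering the two thresholds trades the number of alerts against the share of scams caught, which is the calibration step the client controls.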
Recognising that achieving better performance with machine learning often comes at the cost of interpretability, we also designed a graphical interpretation of the model outputs based on a technique called SHAP (shown below) to help users explain how the model reaches a particular decision, in even the most complex scenarios.
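SHAP is built on Shapley values, which split the difference between a model's output for a given event and its output for a baseline event across the input features. As a minimal sketch of the idea in Python, the code below computes exact Shapley values by brute force for a tiny toy model. The model, the feature names and the baseline are hypothetical, and this is neither the SHAP library nor the graphical interpretation described above; real SHAP implementations use much faster algorithms.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take their baseline value."""
    features = list(x)
    n = len(features)

    def value(coalition):
        mixed = {f: (x[f] if f in coalition else baseline[f]) for f in features}
        return model(mixed)

    contributions = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {f}) - value(coalition))
        contributions[f] = total
    return contributions


def toy_model(e):
    """Hypothetical scam score with an interaction between two features."""
    score = 0.1
    if e["new_beneficiary"] and e["on_call"]:
        score += 0.6  # risk arises only when both signals are present
    if e["amount"] > 1000:
        score += 0.2
    return score


x = {"new_beneficiary": 1, "on_call": 1, "amount": 5000}
baseline = {"new_beneficiary": 0, "on_call": 0, "amount": 100}

phi = shapley_values(toy_model, x, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")
```

The contributions add up to the gap between the score for this event and the score for the baseline, and the two interacting features share the credit for their joint effect equally. That is what lets a per-feature chart explain an individual decision even when the model itself is not linear.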
ThreatMetrix DDP is a highly customisable platform. In addition to supporting scorecards, it is well suited to deploying machine learning models. These models provide far superior performance on the more complex fraud scenarios that currently pose a significant challenge to the financial services sector. They rely on subtle fraud indicators that passively authenticate the user in the blink of an eye, using live data without introducing latency into the customer experience. With a focus on model interpretability, we also stay true to our clear box principles.
Thanks to data science, machine learning and the unshakeable curiosity and ingenuity of our people, we’re better than ever at catching fraudsters’ latest dirty tricks. Now, we just need to wait for them to make their next move!