5/5 - (7 votes)

Introduction

As more and more businesses are facing credit card fraud and identity theft, the popularity of “fraud detection” is rising in Google Trends:

Companies are looking for credit card fraud detection software that will help to eliminate this problem or at least reduce the possible dangers. Before looking at the SPD Group credit card fraud detection project, let’s answer the most common questions:

What is a Fraud Detection System?

It is a set of activities undertaken to prevent money or property from being obtained through false pretenses.

What are Fraud Detection Predictive Models?

Models make predictions based on information about a transaction and some context (historical) information. To make the model more robust, we used only the most important features which were selected based on χ² (a chi-square is a test that measures how expectations compare to actual observed data) and recursive feature elimination techniques.

How can Neural Networks be Used to Detect Fraud?

Neural networks are highly effective when the data scientist has access to a large dataset (say 100,000 or more data samples). They are able to seek patterns and smartly detect new behavior that seems too distinct from the normal flow. However, in our case, we decided to rely on other Machine Learning models such as Classification trees because the RNN performance did not show the accuracy we expected, most likely because of a dataset that was not large enough.

What Are the Most Prominent Fraud Detection Use Cases?

The list of the possible implementations of machine learning to fight against criminal activities is constantly growing, and it is not feasible to cover all deserving examples here. However, we shall attempt to mention the most interesting to provide you a perspective. A publicly available E-commerce fraud case study by DataVisor states that their solutions help businesses detect over 30% fraudulent attempts with a 90% accuracy and 1.3% false-positive rate. Another Fraud Detection analytics case study by Feedzai claims that their OpenML Engine enables banks to build their own Machine Learning software that has dramatically reduced fraud-related losses. There are also multiple proven Big Data fraud detection case studies that leverage automated analytics to justify the large volumes of raw data and offer solutions that can predict and prevent the activities of criminals.

The Fraud Detection market, in general, is tremendous now, and it is expected to expand even further. According to Fortune Business Insights, it will expand up to $106 billion by 2027. North America is a leading region for this market, but Europe and Asia are also considered to be significant players. The growing number of transactions during the COVID-19 lockdowns may boost this number even higher than predicted. Hence, the volume of the transactions globally is highly related to the growth of this market. As of 2021, the biggest shares in the fraud detection and fraud prevention market belong to insurance claims, money laundering, and electronic payments. In this case study, we will focus on how we dealt with challenges in making e-payments more secure for our partners. If you are interested in reading about money-laundering prevention or protection measures in the insurance industries, feel free to leave your comments. We may cover these topics too if you are interested in reading about them!

Anomaly Detection Solution for E-Commerce Credit Card Transactions from SPD Group

Monitoring Fradulent Activity
Development time – 3 months
Team size – 6 experts
Platform – Web

Overview of the Credit Card Fraud Detection Project

The SPD Group was contacted by an E-commerce and financial service company that offered products and services that can be paid for using mobile money or a bank card (e.g., Visa and MasterCard) to make their platform a safer place for online transactions for their customers. Along with the increase in the number of customers who faced issues with their money suddenly disappearing or being transferred to another unknown account, our client thought of implementing a modern fraud prevention method for their platform. Therefore, they contacted us and decided to depend on what Machine Learning can do here.

We had experience in working with innovative technologies such as Artificial Intelligence and Machine Learning. Recently, the SPD Group helped a leading financial data provider that covers the global venture capital, private equity, and public markets with two Machine Learning-powered solutions. The first one was a trend detection service, which is a microservice with a visual interface that helps clients find trends on the market worthy to invest in. The second was a market map solution, aimed to collect and categorize the information about companies in different industries with a visual interface, flexible functionality, and additional features. Moreover, we used Machine Learning algorithms and natural language processing techniques in a web crawler tool to search and analyze financial news. Therefore, working on a new machine learning-based project was an exciting opportunity to expand our experience in the field and help our client deal with their existing problems.

Challenges

To dive into the challenges and obstacles of this project, we got a quote from a Machine Learning Engineer from our development team:

“The most complicated part of the solution was to achieve good metrics for users who have made only a few transactions. We could apply the regular model, which is good for users with a rich transaction history, but it would give worse scores if there is a lack of historical data (for example, a new user). Another obvious solution is to treat such users as empty accounts that have only identity information without any transaction history. In this case, we lose the advantage of having at least some data about the users, but the results that such a model provides are quite stable (underfitting). After making a weekly stand up on the matter, we decided to look into ‘few-shot learning’ techniques, which could help us improve our metrics. We have prepared a PoC, but it didn’t give us the drastic improvement we had expected. Nevertheless, we proceeded with experimenting and diving into our client business domain; it allowed us to develop features which have made a huge impact on our model that is based on ‘few-shot learning’ techniques. Because of the domain features, our main score improved by more than 15% and it became the production solution.”

Solution

Our R&D team worked on the project for 3 months, using Classification rather than classical Anomaly Detection methods. After an intense feature generation phase (about 700 features in total) they went to feature selection to choose only the most relevant ones. Finally, it was a blend of Classification methods such as XGBoost, Catboost, and LightGBM that got us close to the desired score.

Credit Card Fraud Detection Dataset

The platform is an e-commerce and financial service app serving 12,000+ customers daily. This dataset included a sample of approximately 140,000 transactions that occurred between October 2018 and April 2019. One of the fraud detection challenges is that the data is highly imbalanced. There were around 130,000 normal transactions and only 6% of them were fraudulent. We addressed the problem of an imbalanced dataset with various techniques such as data oversampling (augmenting the existing data samples) and data sample generation.

Credit Card Fraud Detection Algorithm

Once the Machine Learning-driven fraud protection module was integrated into the e-commerce platform, it started tracking the transactions. Whenever a user requests a transaction, it is being processed for some time. Depending on the level of predicted fraud probability, there are 3 kinds of possible output:

  • If the probability is less than 10%, the transaction is allowed
  • If the probability is between 10% and 80%, an additional authentication factor (e.g. a one-time SMS code, a fingerprint, or a secret question) should be applied.
  • If the probability is more than 80%, the transaction is frozen, so it should be processed manually.

The model estimates the probability of a fraudulent transaction based on the following transaction information: Date and Time, Product Category, Amount, Provider (Seller), Client Information, Agent Information, Location, and Client’s Behavioral Patterns. Contextual and aggregated data is produced by a Machine Learning engineer based on the previously mentioned data.

Technology for Credit Card Fraud Detection Software

Classification models:

  • Decision Trees
  • Isolation Trees
  • Random Forest
  • Bootstrap

Anomaly detection:

  • PCA
  • Mahalanobis Distance
  • Local Outlier Factor

Results

After this solution was implemented, the entire e-commerce platform received tangible benefits. In only 6 months after production, we can highlight the following areas:

  • Cost Reduction for solving problems with fraud: communication with Customer Support, filling in forms for refund, refund verification by a manager, etc.
  • Customer Satisfaction because of simplified authentication for a transaction with a low probability of fraudulence (usually, most transactions have low fraudulence probability).

What we achieved with the solution: More than 140,000 transactions were analyzed with 6% of fraudulent data points a year. Fewer customers claimed to have fraudulent transactions. Our client’s online card transaction platform became a safer service and gained more loyalty from their customers. We continued to support the project after release because it is very important to continuously train the Fraud Detection model whenever new data arrives, so new fraud schemas/patterns can be learned and detected as early as possible.

Fraud Detection Machine Learning: What to Look for in 2021?

This Fraud Detection Machine Learning case study is merely one of many examples of how the technology can benefit an organization. While looking into the future, it is safe to say that Artificial Intelligence and Machine Learning will dominate the Fraud Detection field if companies adjust their culture and embrace the power of innovation. Keep in mind that Artificial Intelligence/Machine Learning can always be combined with powerful technologies such as Cloud Computing and Big Data, which will provide better efficiency.

As criminals always find new inventive ways to steal money, Machine Learning-based solutions with supervised and unsupervised techniques will help find hidden patterns and provide even more complex insights. Deep Learning technology is already altering this field by making conclusions built on user behavioral, historical, and time-series data and most likely will become even more effective in 2021. Probably, the biggest advantage of Artificial Intelligence/Machine Learning solutions is that you can combine different kinds to secure maximum efficiency for your organization. According to Fortune Business Insights, the global Machine Learning market will exceed $117 billion in the next six years. However, seamless adoption presents some significant challenges:

Machine Learning Challenges

You can deal with them if you partner up with a trusted software development company and obtain a consultation to find out what you can do to implement Machine Learning into your processes. Become a part of the Machine Learning revolution and embrace the latest technology!

ARE YOU INTERESTED IN DEVELOPING A Credit Card Fraud Detection SOLUTION?

Contact our experts to get a free consultation and time&budget estimate for your project.

Contact Us
Roman Chuprina Technical journalist at SPD Group, covering AI/ML, IoT, and Blockchain topics with articles and interviews. May 6, 2021
Olena Kovalenko Project Manager @ SPD-Group, leading AI/ML team and writing articles for our blog. May 6, 2021