News Blog Paper China
Explainable Machine Learning for Fraud Detection2021-05-13   ${\displaystyle \cong }$
The application of machine learning to support the processing of large datasets holds promise in many industries, including financial services. However, practical issues for the full adoption of machine learning remain with the focus being on understanding and being able to explain the decisions and predictions made by complex models. In this paper, we explore explainability methods in the domain of real-time fraud detection by investigating the selection of appropriate background datasets and runtime trade-offs on both supervised and unsupervised models.
Towards Explainable Deep Learning for Credit Lending: A Case Study2018-11-30   ${\displaystyle \cong }$
Deep learning adoption in the financial services industry has been limited due to a lack of model interpretability. However, several techniques have been proposed to explain predictions made by a neural network. We provide an initial investigation into these techniques for the assessment of credit risk with neural networks.
Deep Q-Network-based Adaptive Alert Threshold Selection Policy for Payment Fraud Systems in Retail Banking2020-10-21   ${\displaystyle \cong }$
Machine learning models have widely been used in fraud detection systems. Most of the research and development efforts have been concentrated on improving the performance of the fraud scoring models. Yet, the downstream fraud alert systems still have limited to no model adoption and rely on manual steps. Alert systems are pervasively used across all payment channels in retail banking and play an important role in the overall fraud detection process. Current fraud detection systems end up with large numbers of dropped alerts due to their inability to account for the alert processing capacity. Ideally, alert threshold selection enables the system to maximize the fraud detection while balancing the upstream fraud scores and the available bandwidth of the alert processing teams. However, in practice, fixed thresholds that are used for their simplicity do not have this ability. In this paper, we propose an enhanced threshold selection policy for fraud alert systems. The proposed approach formulates the threshold selection as a sequential decision making problem and uses Deep Q-Network based reinforcement learning. Experimental results show that this adaptive approach outperforms the current static solutions by reducing the fraud losses as well as improving the operational efficiency of the alert system.
A Semi-supervised Graph Attentive Network for Financial Fraud Detection2020-02-28   ${\displaystyle \cong }$
With the rapid growth of financial services, fraud detection has been a very important problem to guarantee a healthy environment for both users and providers. Conventional solutions for fraud detection mainly use some rule-based methods or distract some features manually to perform prediction. However, in financial services, users have rich interactions and they themselves always show multifaceted information. These data form a large multiview network, which is not fully exploited by conventional methods. Additionally, among the network, only very few of the users are labelled, which also poses a great challenge for only utilizing labeled data to achieve a satisfied performance on fraud detection. To address the problem, we expand the labeled data through their social relations to get the unlabeled data and propose a semi-supervised attentive graph neural network, namedSemiGNN to utilize the multi-view labeled and unlabeled data for fraud detection. Moreover, we propose a hierarchical attention mechanism to better correlate different neighbors and different views. Simultaneously, the attention mechanism can make the model interpretable and tell what are the important factors for the fraud and why the users are predicted as fraud. Experimentally, we conduct the prediction task on the users of Alipay, one of the largest third-party online and offline cashless payment platform serving more than 4 hundreds of million users in China. By utilizing the social relations and the user attributes, our method can achieve a better accuracy compared with the state-of-the-art methods on two tasks. Moreover, the interpretable results also give interesting intuitions regarding the tasks.
Instance-Level Explanations for Fraud Detection: A Case Study2018-06-19   ${\displaystyle \cong }$
Fraud detection is a difficult problem that can benefit from predictive modeling. However, the verification of a prediction is challenging; for a single insurance policy, the model only provides a prediction score. We present a case study where we reflect on different instance-level model explanation techniques to aid a fraud detection team in their work. To this end, we designed two novel dashboards combining various state-of-the-art explanation techniques. These enable the domain expert to analyze and understand predictions, dramatically speeding up the process of filtering potential fraud cases. Finally, we discuss the lessons learned and outline open research issues.
Computer-Assisted Fraud Detection, From Active Learning to Reward Maximization2018-11-20   ${\displaystyle \cong }$
The automatic detection of frauds in banking transactions has been recently studied as a way to help the analysts finding fraudulent operations. Due to the availability of a human feedback, this task has been studied in the framework of active learning: the fraud predictor is allowed to sequentially call on an oracle. This human intervention is used to label new examples and improve the classification accuracy of the latter. Such a setting is not adapted in the case of fraud detection with financial data in European countries. Actually, as a human verification is mandatory to consider a fraud as really detected, it is not necessary to focus on improving the classifier. We introduce the setting of 'Computer-assisted fraud detection' where the goal is to minimize the number of non fraudulent operations submitted to an oracle. The existing methods are applied to this task and we show that a simple meta-algorithm provides competitive results in this scenario on benchmark datasets.
FraudJudger: Real-World Data Oriented Fraud Detection on Digital Payment Platforms2019-09-05   ${\displaystyle \cong }$
Automated fraud behaviors detection on electronic payment platforms is a tough problem. Fraud users often exploit the vulnerability of payment platforms and the carelessness of users to defraud money, steal passwords, do money laundering, etc, which causes enormous losses to digital payment platforms and users. There are many challenges for fraud detection in practice. Traditional fraud detection methods require a large-scale manually labeled dataset, which is hard to obtain in reality. Manually labeled data cost tremendous human efforts. Besides, the continuous and rapid evolution of fraud users makes it hard to find new fraud patterns based on existing detection rules. In our work, we propose a real-world data oriented detection paradigm which can detect fraud users and upgrade its detection ability automatically. Based on the new paradigm, we design a novel fraud detection model, FraudJudger, to analyze users behaviors on digital payment platforms and detect fraud users with fewer labeled data in training. FraudJudger can learn the latent representations of users from unlabeled data with the help of Adversarial Autoencoder (AAE). Furthermore, FraudJudger can find new fraud patterns from unknown users by cluster analysis. Our experiment is based on a real-world electronic payment dataset. Comparing with other well-known fraud detection methods, FraudJudger can achieve better detection performance with only 10% labeled data.
Deep Learning Methods for Credit Card Fraud Detection2020-12-07   ${\displaystyle \cong }$
Credit card frauds are at an ever-increasing rate and have become a major problem in the financial sector. Because of these frauds, card users are hesitant in making purchases and both the merchants and financial institutions bear heavy losses. Some major challenges in credit card frauds involve the availability of public data, high class imbalance in data, changing nature of frauds and the high number of false alarms. Machine learning techniques have been used to detect credit card frauds but no fraud detection systems have been able to offer great efficiency to date. Recent development of deep learning has been applied to solve complex problems in various areas. This paper presents a thorough study of deep learning methods for the credit card fraud detection problem and compare their performance with various machine learning algorithms on three different financial datasets. Experimental results show great performance of the proposed deep learning methods against traditional machine learning models and imply that the proposed approaches can be implemented effectively for real-world credit card fraud detection systems.
Applying support vector data description for fraud detection2020-05-31   ${\displaystyle \cong }$
Fraud detection is an important topic that applies to various enterprises such as banking and financial sectors, insurance, government agencies, law enforcement, and more. Fraud attempts have been risen remarkably in current years, shaping fraud detection an essential topic for research. One of the main challenges in fraud detection is acquiring fraud samples which is a complex and challenging task. In order to deal with this challenge, we apply one-class classification methods such as SVDD which does not need the fraud samples for training. Also, we present our algorithm REDBSCAN which is an extension of DBSCAN to reduce the number of samples and select those that keep the shape of data. The results obtained by the implementation of the proposed method indicated that the fraud detection process was improved in both performance and speed.
ServeNet: A Deep Neural Network for Web Service Classification2019-05-14   ${\displaystyle \cong }$
Automated service classification plays a crucial role in service discovery, selection, and composition. Machine learning has been used for service classification in recent years. However, the performance of conventional machine learning methods highly depends on the quality of manual feature engineering. In this paper, we present a deep neural network to automatically abstract low-level representation of service description to high-level features without feature engineering and then predict service classification on 50 service categories. To demonstrate the effectiveness of our approach, we conduct a comprehensive experimental study by comparing 10 machine learning methods on 10,000 real-world web services. The result shows that the proposed deep neural network can achieve higher accuracy than other machine learning methods.
Transparent Interpretation with Knockouts2020-11-01   ${\displaystyle \cong }$
How can we find a subset of training samples that are most responsible for a complicated black-box machine learning model prediction? More generally, how can we explain the model decision to end-users in a transparent way? We propose a new model-agnostic algorithm to identify a minimum number of training samples that are indispensable for a given model decision at a particular test point, as the model decision would otherwise change upon the removal of these training samples. In line with the counterfactual explanation, our algorithm identifies such a set of indispensable samples iteratively by solving a constrained optimization problem. Further, we efficiently speed up the algorithm through approximation. To demonstrate the effectiveness of the algorithm, we apply it to a variety of tasks including data poisoning detection, training set debugging, and understanding loan decisions. Results show that our algorithm is an effective and easy to comprehend tool to help better understand local model behaviors and therefore facilitate the application of machine learning in domains where such understanding is a requisite and where end-users do not have a machine learning background.
Cyclic Boosting -- an explainable supervised machine learning algorithm2020-05-07   ${\displaystyle \cong }$
Supervised machine learning algorithms have seen spectacular advances and surpassed human level performance in a wide range of specific applications. However, using complex ensemble or deep learning algorithms typically results in black box models, where the path leading to individual predictions cannot be followed in detail. In order to address this issue, we propose the novel "Cyclic Boosting" machine learning algorithm, which allows to efficiently perform accurate regression and classification tasks while at the same time allowing a detailed understanding of how each individual prediction was made.
Interpretable Automated Machine Learning in Maana(TM) Knowledge Platform2019-05-06   ${\displaystyle \cong }$
Machine learning is becoming an essential part of developing solutions for many industrial applications, but the lack of interpretability hinders wide industry adoption to rapidly build, test, deploy and validate machine learning models, in the sense that the insight of developing machine learning solutions are not structurally encoded, justified and transferred. In this paper we describe Maana Meta-learning Service, an interpretable and interactive automated machine learning service residing in Maana Knowledge Platform that performs machine-guided, user assisted pipeline search and hyper-parameter tuning and generates structured knowledge about decisions for pipeline profiling and selection. The service is shipped with Maana Knowledge Platform and is validated using benchmark dataset. Furthermore, its capability of deriving knowledge from pipeline search facilitates various inference tasks and transferring to similar data science projects.
Indexing Cost Sensitive Prediction2014-08-15   ${\displaystyle \cong }$
Predictive models are often used for real-time decision making. However, typical machine learning techniques ignore feature evaluation cost, and focus solely on the accuracy of the machine learning models obtained utilizing all the features available. We develop algorithms and indexes to support cost-sensitive prediction, i.e., making decisions using machine learning models taking feature evaluation cost into account. Given an item and a online computation cost (i.e., time) budget, we present two approaches to return an appropriately chosen machine learning model that will run within the specified time on the given item. The first approach returns the optimal machine learning model, i.e., one with the highest accuracy, that runs within the specified time, but requires significant up-front precomputation time. The second approach returns a possibly sub- optimal machine learning model, but requires little up-front precomputation time. We study these two algorithms in detail and characterize the scenarios (using real and synthetic data) in which each performs well. Unlike prior work that focuses on a narrow domain or a specific algorithm, our techniques are very general: they apply to any cost-sensitive prediction scenario on any machine learning algorithm.
Explaining predictive models with mixed features using Shapley values and conditional inference trees2020-07-02   ${\displaystyle \cong }$
It is becoming increasingly important to explain complex, black-box machine learning models. Although there is an expanding literature on this topic, Shapley values stand out as a sound method to explain predictions from any type of machine learning model. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. This methodology was then extended to explain dependent features with an underlying continuous distribution. In this paper, we propose a method to explain mixed (i.e. continuous, discrete, ordinal, and categorical) dependent features by modeling the dependence structure of the features using conditional inference trees. We demonstrate our proposed method against the current industry standards in various simulation studies and find that our method often outperforms the other approaches. Finally, we apply our method to a real financial data set used in the 2018 FICO Explainable Machine Learning Challenge and show how our explanations compare to the FICO challenge Recognition Award winning team.
Fraud Detection using Data-Driven approach2020-09-08   ${\displaystyle \cong }$
The extensive use of the internet is continuously drifting businesses to incorporate their services in the online environment. One of the first spectrums to embrace this evolution was the banking sector. In fact, the first known online banking service came in 1980. It was deployed from a community bank located in Knoxville, called the United American Bank. Since then, internet banking has been offering ease and efficiency to costumers in completing their daily banking tasks. The ever increasing use of internet banking and a large number of online transactions increased fraudulent behavior also. As if fraud increase was not enough, the massive number of online transactions further increased the data complexity. Modern data sources are not only complex but generated at high speed and in real-time as well. This presents a serious problem and a definite reason why more advanced solutions are desired to protect financial service companies and credit cardholders. Therefore, this research paper aims to construct an efficient fraud detection model which is adaptive to customer behavior changes and tends to decrease fraud manipulation, by detecting and filtering fraud in real-time. In order to achieve this aim, a review of various methods is conducted, adding above a personal experience working in the Banking sector, specifically in the Fraud Detection office. Unlike the majority of reviewed methods, the proposed model in this research paper is able to detect fraud in the moment of occurrence using an incremental classifier. The evaluation of synthetic data, based on fraud scenarios selected in collaboration with domain experts that replicate typical, real-world attacks, shows that this approach correctly ranks complex frauds. In particular, our proposal detects fraudulent behavior and anomalies with up to 97\% detection rate while maintaining a satisfyingly low cost.
Making Contextual Decisions with Low Technical Debt2017-05-09   ${\displaystyle \cong }$
Applications and systems are constantly faced with decisions that require picking from a set of actions based on contextual information. Reinforcement-based learning algorithms such as contextual bandits can be very effective in these settings, but applying them in practice is fraught with technical debt, and no general system exists that supports them completely. We address this and create the first general system for contextual learning, called the Decision Service. Existing systems often suffer from technical debt that arises from issues like incorrect data collection and weak debuggability, issues we systematically address through our ML methodology and system abstractions. The Decision Service enables all aspects of contextual bandit learning using four system abstractions which connect together in a loop: explore (the decision space), log, learn, and deploy. Notably, our new explore and log abstractions ensure the system produces correct, unbiased data, which our learner uses for online learning and to enable real-time safeguards, all in a fully reproducible manner. The Decision Service has a simple user interface and works with a variety of applications: we present two live production deployments for content recommendation that achieved click-through improvements of 25-30%, another with 18% revenue lift in the landing page, and ongoing applications in tech support and machine failure handling. The service makes real-time decisions and learns continuously and scalably, while significantly lowering technical debt.
Cognitive Model Priors for Predicting Human Decisions2019-05-22   ${\displaystyle \cong }$
Human decision-making underlies all economic behavior. For the past four decades, human decision-making under uncertainty has continued to be explained by theoretical models based on prospect theory, a framework that was awarded the Nobel Prize in Economic Sciences. However, theoretical models of this kind have developed slowly, and robust, high-precision predictive models of human decisions remain a challenge. While machine learning is a natural candidate for solving these problems, it is currently unclear to what extent it can improve predictions obtained by current theories. We argue that this is mainly due to data scarcity, since noisy human behavior requires massive sample sizes to be accurately captured by off-the-shelf machine learning methods. To solve this problem, what is needed are machine learning models with appropriate inductive biases for capturing human behavior, and larger datasets. We offer two contributions towards this end: first, we construct "cognitive model priors" by pretraining neural networks with synthetic data generated by cognitive models (i.e., theoretical models developed by cognitive psychologists). We find that fine-tuning these networks on small datasets of real human decisions results in unprecedented state-of-the-art improvements on two benchmark datasets. Second, we present the first large-scale dataset for human decision-making, containing over 240,000 human judgments across over 13,000 decision problems. This dataset reveals the circumstances where cognitive model priors are useful, and provides a new standard for benchmarking prediction of human decisions under uncertainty.
Uncheatable Machine Learning Inference2019-08-08   ${\displaystyle \cong }$
Classification-as-a-Service (CaaS) is widely deployed today in machine intelligence stacks for a vastly diverse set of applications including anything from medical prognosis to computer vision tasks to natural language processing to identity fraud detection. The computing power required for training complex models on large datasets to perform inference to solve these problems can be very resource-intensive. A CaaS provider may cheat a customer by fraudulently bypassing expensive training procedures in favor of weaker, less computationally-intensive algorithms which yield results of reduced quality. Given a classification service supplier $S$, intermediary CaaS provider $P$ claiming to use $S$ as a classification backend, and customer $C$, our work addresses the following questions: (i) how can $P$'s claim to be using $S$ be verified by $C$? (ii) how might $S$ make performance guarantees that may be verified by $C$? and (iii) how might one design a decentralized system that incentivizes service proofing and accountability? To this end, we propose a variety of methods for $C$ to evaluate the service claims made by $P$ using probabilistic performance metrics, instance seeding, and steganography. We also propose a method of measuring the robustness of a model using a blackbox adversarial procedure, which may then be used as a benchmark or comparison to a claim made by $S$. Finally, we propose the design of a smart contract-based decentralized system that incentivizes service accountability to serve as a trusted Quality of Service (QoS) auditor.
Machine learning on Crays to optimise petrophysical workflows in oil and gas exploration2020-10-01   ${\displaystyle \cong }$
The oil and gas industry is awash with sub-surface data, which is used to characterize the rock and fluid properties beneath the seabed. This in turn drives commercial decision making and exploration, but the industry currently relies upon highly manual workflows when processing data. A key question is whether this can be improved using machine learning to complement the activities of petrophysicists searching for hydrocarbons. In this paper we present work done, in collaboration with Rock Solid Images (RSI), using supervised machine learning on a Cray XC30 to train models that streamline the manual data interpretation process. With a general aim of decreasing the petrophysical interpretation time down from over 7 days to 7 minutes, in this paper we describe the use of mathematical models that have been trained using raw well log data, for completing each of the four stages of a petrophysical interpretation workflow, along with initial data cleaning. We explore how the predictions from these models compare against the interpretations of human petrophysicists, along with numerous options and techniques that were used to optimise the prediction of our models. The power provided by modern supercomputers such as Cray machines is crucial here, but some popular machine learning framework are unable to take full advantage of modern HPC machines. As such we will also explore the suitability of the machine learning tools we have used, and describe steps we took to work round their limitations. The result of this work is the ability, for the first time, to use machine learning for the entire petrophysical workflow. Whilst there are numerous challenges, limitations and caveats, we demonstrate that machine learning has an important role to play in the processing of sub-surface data.