News Blog Paper China
Indexing Cost Sensitive Prediction2014-08-15   ${\displaystyle \cong }$
Predictive models are often used for real-time decision making. However, typical machine learning techniques ignore feature evaluation cost, and focus solely on the accuracy of the machine learning models obtained utilizing all the features available. We develop algorithms and indexes to support cost-sensitive prediction, i.e., making decisions using machine learning models taking feature evaluation cost into account. Given an item and a online computation cost (i.e., time) budget, we present two approaches to return an appropriately chosen machine learning model that will run within the specified time on the given item. The first approach returns the optimal machine learning model, i.e., one with the highest accuracy, that runs within the specified time, but requires significant up-front precomputation time. The second approach returns a possibly sub- optimal machine learning model, but requires little up-front precomputation time. We study these two algorithms in detail and characterize the scenarios (using real and synthetic data) in which each performs well. Unlike prior work that focuses on a narrow domain or a specific algorithm, our techniques are very general: they apply to any cost-sensitive prediction scenario on any machine learning algorithm.
Risk Aware and Multi-Objective Decision Making with Distributional Monte Carlo Tree Search2021-02-01   ${\displaystyle \cong }$
In many risk-aware and multi-objective reinforcement learning settings, the utility of the user is derived from the single execution of a policy. In these settings, making decisions based on the average future returns is not suitable. For example, in a medical setting a patient may only have one opportunity to treat their illness. When making a decision, just the expected return -- known in reinforcement learning as the value -- cannot account for the potential range of adverse or positive outcomes a decision may have. Our key insight is that we should use the distribution over expected future returns differently to represent the critical information that the agent requires at decision time. In this paper, we propose Distributional Monte Carlo Tree Search, an algorithm that learns a posterior distribution over the utility of the different possible returns attainable from individual policy executions, resulting in good policies for both risk-aware and multi-objective settings. Moreover, our algorithm outperforms the state-of-the-art in multi-objective reinforcement learning for the expected utility of the returns.
Learning with Analytical Models2019-02-25   ${\displaystyle \cong }$
To understand and predict the performance of scientific applications, several analytical and machine learning approaches have been proposed, each having its advantages and disadvantages. In this paper, we propose and validate a hybrid approach for performance modeling and prediction, which combines analytical and machine learning models. The proposed hybrid model aims to minimize prediction cost while providing reasonable prediction accuracy. Our validation results show that the hybrid model is able to learn and correct the analytical models to better match the actual performance. Furthermore, the proposed hybrid model improves the prediction accuracy in comparison to pure machine learning techniques while using small training datasets, thus making it suitable for hardware and workload changes.
Application of Machine Learning Techniques in Aquaculture2014-05-03   ${\displaystyle \cong }$
In this paper we present applications of different machine learning algorithms in aquaculture. Machine learning algorithms learn models from historical data. In aquaculture historical data are obtained from farm practices, yields, and environmental data sources. Associations between these different variables can be obtained by applying machine learning algorithms to historical data. In this paper we present applications of different machine learning algorithms in aquaculture applications.
Learning Representations by Humans, for Humans2020-06-28   ${\displaystyle \cong }$
The task of optimizing machines to support human decision-making is often conflated with that of optimizing machines for accuracy even though they are materially different. Whereas it is typical for learning systems to prescribe actions through prediction, here we propose an approach in which the role of machines is to reframe problems in order to directly support human decisions. Inspired by the success of representation learning in promoting machine performance, we frame the problem as one of learning representations that are conducive to good human performance. This "Man Composed with Machine'' framework incorporates a human decision-making model directly into the representation learning paradigm with optimization achieved through a novel human-in-the-loop training procedure. We empirically demonstrate on various tasks and representational forms that the framework is capable of learning representations that better coincide with human decision-making processes and can lead to good decisions.
Risk Assessment for Machine Learning Models2020-11-09   ${\displaystyle \cong }$
In this paper we propose a framework for assessing the risk associated with deploying a machine learning model in a specified environment. For that we carry over the risk definition from decision theory to machine learning. We develop and implement a method that allows to define deployment scenarios, test the machine learning model under the conditions specified in each scenario, and estimate the damage associated with the output of the machine learning model under test. Using the likelihood of each scenario together with the estimated damage we define \emph{key risk indicators} of a machine learning model. The definition of scenarios and weighting by their likelihood allows for standardized risk assessment in machine learning throughout multiple domains of application. In particular, in our framework, the robustness of a machine learning model to random input corruptions, distributional shifts caused by a changing environment, and adversarial perturbations can be assessed.
On Hyperparameter Optimization of Machine Learning Algorithms: Theory and Practice2020-07-30   ${\displaystyle \cong }$
Machine learning algorithms have been used widely in various applications and areas. To fit a machine learning model into different problems, its hyper-parameters must be tuned. Selecting the best hyper-parameter configuration for machine learning models has a direct impact on the model's performance. It often requires deep knowledge of machine learning algorithms and appropriate hyper-parameter optimization techniques. Although several automatic optimization techniques exist, they have different strengths and drawbacks when applied to different types of problems. In this paper, optimizing the hyper-parameters of common machine learning models is studied. We introduce several state-of-the-art optimization techniques and discuss how to apply them to machine learning algorithms. Many available libraries and frameworks developed for hyper-parameter optimization problems are provided, and some open challenges of hyper-parameter optimization research are also discussed in this paper. Moreover, experiments are conducted on benchmark datasets to compare the performance of different optimization methods and provide practical examples of hyper-parameter optimization. This survey paper will help industrial users, data analysts, and researchers to better develop machine learning models by identifying the proper hyper-parameter configurations effectively.
Cost-Sensitive Feature-Value Acquisition Using Feature Relevance2019-12-18   ${\displaystyle \cong }$
In many real-world machine learning problems, feature values are not readily available. To make predictions, some of the missing features have to be acquired, which can incur a cost in money, computational time, or human time, depending on the problem domain. This leads us to the problem of choosing which features to use at the prediction time. The chosen features should increase the prediction accuracy for a low cost, but determining which features will do that is challenging. The choice should take into account the previously acquired feature values as well as the feature costs. This paper proposes a novel approach to address this problem. The proposed approach chooses the most useful features adaptively based on how relevant they are for the prediction task as well as what the corresponding feature costs are. Our approach uses a generic neural network architecture, which is suitable for a wide range of problems. We evaluate our approach on three cost-sensitive datasets, including Yahoo! Learning to Rank Competition dataset as well as two health datasets. We show that our approach achieves high accuracy with a lower cost than the current state-of-the-art approaches.
Optimization Models for Machine Learning: A Survey2020-07-27   ${\displaystyle \cong }$
This paper surveys the machine learning literature and presents in an optimization framework several commonly used machine learning approaches. Particularly, mathematical optimization models are presented for regression, classification, clustering, deep learning, and adversarial learning, as well as new emerging applications in machine teaching, empirical model learning, and Bayesian network structure learning. Such models can benefit from the advancement of numerical optimization techniques which have already played a distinctive role in several machine learning settings. The strengths and the shortcomings of these models are discussed and potential research directions and open problems are highlighted.
Predicting Abnormal Returns From News Using Text Classification2009-06-24   ${\displaystyle \cong }$
We show how text from news articles can be used to predict intraday price movements of financial assets using support vector machines. Multiple kernel learning is used to combine equity returns with text as predictive features to increase classification performance and we develop an analytic center cutting plane method to solve the kernel learning problem efficiently. We observe that while the direction of returns is not predictable using either text or returns, their size is, with text features producing significantly better performance than historical returns alone.
Techniques for Interpretable Machine Learning2019-05-19   ${\displaystyle \cong }$
Interpretable machine learning tackles the important problem that humans cannot understand the behaviors of complex machine learning models and how these models arrive at a particular decision. Although many approaches have been proposed, a comprehensive understanding of the achievements and challenges is still lacking. We provide a survey covering existing techniques to increase the interpretability of machine learning models. We also discuss crucial issues that the community should consider in future work such as designing user-friendly explanations and developing comprehensive evaluation metrics to further push forward the area of interpretable machine learning.
Actionable Interpretation of Machine Learning Models for Sequential Data: Dementia-related Agitation Use Case2020-09-10   ${\displaystyle \cong }$
Machine learning has shown successes for complex learning problems in which data/parameters can be multidimensional and too complex for a first-principles based analysis. Some applications that utilize machine learning require human interpretability, not just to understand a particular result (classification, detection, etc.) but also for humans to take action based on that result. Black-box machine learning model interpretation has been studied, but recent work has focused on validation and improving model performance. In this work, an actionable interpretation of black-box machine learning models is presented. The proposed technique focuses on the extraction of actionable measures to help users make a decision or take an action. Actionable interpretation can be implemented in most traditional black-box machine learning models. It uses the already trained model, used training data, and data processing techniques to extract actionable items from the model outcome and its time-series inputs. An implementation of the actionable interpretation is shown with a use case: dementia-related agitation prediction and the ambient environment. It is shown that actionable items can be extracted, such as the decreasing of in-home light level, which is triggering an agitation episode. This use case of actionable interpretation can help dementia caregivers take action to intervene and prevent agitation.
A Hierarchy of Limitations in Machine Learning2020-02-29   ${\displaystyle \cong }$
"All models are wrong, but some are useful", wrote George E. P. Box (1979). Machine learning has focused on the usefulness of probability models for prediction in social systems, but is only now coming to grips with the ways in which these models are wrong---and the consequences of those shortcomings. This paper attempts a comprehensive, structured overview of the specific conceptual, procedural, and statistical limitations of models in machine learning when applied to society. Machine learning modelers themselves can use the described hierarchy to identify possible failure points and think through how to address them, and consumers of machine learning models can know what to question when confronted with the decision about if, where, and how to apply machine learning. The limitations go from commitments inherent in quantification itself, through to showing how unmodeled dependencies can lead to cross-validation being overly optimistic as a way of assessing model performance.
Variance Penalized On-Policy and Off-Policy Actor-Critic2021-02-03   ${\displaystyle \cong }$
Reinforcement learning algorithms are typically geared towards optimizing the expected return of an agent. However, in many practical applications, low variance in the return is desired to ensure the reliability of an algorithm. In this paper, we propose on-policy and off-policy actor-critic algorithms that optimize a performance criterion involving both mean and variance in the return. Previous work uses the second moment of return to estimate the variance indirectly. Instead, we use a much simpler recently proposed direct variance estimator which updates the estimates incrementally using temporal difference methods. Using the variance-penalized criterion, we guarantee the convergence of our algorithm to locally optimal policies for finite state action Markov decision processes. We demonstrate the utility of our algorithm in tabular and continuous MuJoCo domains. Our approach not only performs on par with actor-critic and prior variance-penalization baselines in terms of expected return, but also generates trajectories which have lower variance in the return.
Identifying Semantically Duplicate Questions Using Data Science Approach: A Quora Case Study2020-04-18   ${\displaystyle \cong }$
Identifying semantically identical questions on, Question and Answering social media platforms like Quora is exceptionally significant to ensure that the quality and the quantity of content are presented to users, based on the intent of the question and thus enriching overall user experience. Detecting duplicate questions is a challenging problem because natural language is very expressive, and a unique intent can be conveyed using different words, phrases, and sentence structuring. Machine learning and deep learning methods are known to have accomplished superior results over traditional natural language processing techniques in identifying similar texts. In this paper, taking Quora for our case study, we explored and applied different machine learning and deep learning techniques on the task of identifying duplicate questions on Quora's dataset. By using feature engineering, feature importance techniques, and experimenting with seven selected machine learning classifiers, we demonstrated that our models outperformed previous studies on this task. Xgboost model with character level term frequency and inverse term frequency is our best machine learning model that has also outperformed a few of the Deep learning baseline models. We applied deep learning techniques to model four different deep neural networks of multiple layers consisting of Glove embeddings, Long Short Term Memory, Convolution, Max pooling, Dense, Batch Normalization, Activation functions, and model merge. Our deep learning models achieved better accuracy than machine learning models. Three out of four proposed architectures outperformed the accuracy from previous machine learning and deep learning research work, two out of four models outperformed accuracy from previous deep learning study on Quora's question pair dataset, and our best model achieved accuracy of 85.82% which is close to Quora state of the art accuracy.
Machine learning on Crays to optimise petrophysical workflows in oil and gas exploration2020-10-01   ${\displaystyle \cong }$
The oil and gas industry is awash with sub-surface data, which is used to characterize the rock and fluid properties beneath the seabed. This in turn drives commercial decision making and exploration, but the industry currently relies upon highly manual workflows when processing data. A key question is whether this can be improved using machine learning to complement the activities of petrophysicists searching for hydrocarbons. In this paper we present work done, in collaboration with Rock Solid Images (RSI), using supervised machine learning on a Cray XC30 to train models that streamline the manual data interpretation process. With a general aim of decreasing the petrophysical interpretation time down from over 7 days to 7 minutes, in this paper we describe the use of mathematical models that have been trained using raw well log data, for completing each of the four stages of a petrophysical interpretation workflow, along with initial data cleaning. We explore how the predictions from these models compare against the interpretations of human petrophysicists, along with numerous options and techniques that were used to optimise the prediction of our models. The power provided by modern supercomputers such as Cray machines is crucial here, but some popular machine learning framework are unable to take full advantage of modern HPC machines. As such we will also explore the suitability of the machine learning tools we have used, and describe steps we took to work round their limitations. The result of this work is the ability, for the first time, to use machine learning for the entire petrophysical workflow. Whilst there are numerous challenges, limitations and caveats, we demonstrate that machine learning has an important role to play in the processing of sub-surface data.
Dynamic Control Flow in Large-Scale Machine Learning2018-05-04   ${\displaystyle \cong }$
Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditional execution, and other features that call for dynamic control flow. These applications benefit from the ability to make rapid control-flow decisions across a set of computing devices in a distributed system. For performance, scalability, and expressiveness, a machine learning system must support dynamic control flow in distributed and heterogeneous environments. This paper presents a programming model for distributed machine learning that supports dynamic control flow. We describe the design of the programming model, and its implementation in TensorFlow, a distributed machine learning system. Our approach extends the use of dataflow graphs to represent machine learning models, offering several distinctive features. First, the branches of conditionals and bodies of loops can be partitioned across many machines to run on a set of heterogeneous devices, including CPUs, GPUs, and custom ASICs. Second, programs written in our model support automatic differentiation and distributed gradient computations, which are necessary for training machine learning models that use control flow. Third, our choice of non-strict semantics enables multiple loop iterations to execute in parallel across machines, and to overlap compute and I/O operations. We have done our work in the context of TensorFlow, and it has been used extensively in research and production. We evaluate it using several real-world applications, and demonstrate its performance and scalability.
Cost-aware Pre-training for Multiclass Cost-sensitive Deep Learning2016-05-24   ${\displaystyle \cong }$
Deep learning has been one of the most prominent machine learning techniques nowadays, being the state-of-the-art on a broad range of applications where automatic feature extraction is needed. Many such applications also demand varying costs for different types of mis-classification errors, but it is not clear whether or how such cost information can be incorporated into deep learning to improve performance. In this work, we propose a novel cost-aware algorithm that takes into account the cost information into not only the training stage but also the pre-training stage of deep learning. The approach allows deep learning to conduct automatic feature extraction with the cost information effectively. Extensive experimental results demonstrate that the proposed approach outperforms other deep learning models that do not digest the cost information in the pre-training stage.
RuleMatrix: Visualizing and Understanding Classifiers with Rules2018-07-17   ${\displaystyle \cong }$
With the growing adoption of machine learning techniques, there is a surge of research interest towards making machine learning systems more transparent and interpretable. Various visualizations have been developed to help model developers understand, diagnose, and refine machine learning models. However, a large number of potential but neglected users are the domain experts with little knowledge of machine learning but are expected to work with machine learning systems. In this paper, we present an interactive visualization technique to help users with little expertise in machine learning to understand, explore and validate predictive models. By viewing the model as a black box, we extract a standardized rule-based knowledge representation from its input-output behavior. We design RuleMatrix, a matrix-based visualization of rules to help users navigate and verify the rules and the black-box model. We evaluate the effectiveness of RuleMatrix via two use cases and a usability study.
Machine Learning Techniques for Biomedical Image Segmentation: An Overview of Technical Aspects and Introduction to State-of-Art Applications2019-11-06   ${\displaystyle \cong }$
In recent years, significant progress has been made in developing more accurate and efficient machine learning algorithms for segmentation of medical and natural images. In this review article, we highlight the imperative role of machine learning algorithms in enabling efficient and accurate segmentation in the field of medical imaging. We specifically focus on several key studies pertaining to the application of machine learning methods to biomedical image segmentation. We review classical machine learning algorithms such as Markov random fields, k-means clustering, random forest, etc. Although such classical learning models are often less accurate compared to the deep learning techniques, they are often more sample efficient and have a less complex structure. We also review different deep learning architectures, such as the artificial neural networks (ANNs), the convolutional neural networks (CNNs), and the recurrent neural networks (RNNs), and present the segmentation results attained by those learning models that were published in the past three years. We highlight the successes and limitations of each machine learning paradigm. In addition, we discuss several challenges related to the training of different machine learning models, and we present some heuristics to address those challenges.