News Blog Paper China
Probabilistic Neighbourhood Component Analysis: Sample Efficient Uncertainty Estimation in Deep Learning2020-07-18   ${\displaystyle \cong }$
While Deep Neural Networks (DNNs) achieve state-of-the-art accuracy in various applications, they often fall short in accurately estimating their predictive uncertainty and, in turn, fail to recognize when these predictions may be wrong. Several uncertainty-aware models, such as Bayesian Neural Network (BNNs) and Deep Ensembles have been proposed in the literature for quantifying predictive uncertainty. However, research in this area has been largely confined to the big data regime. In this work, we show that the uncertainty estimation capability of state-of-the-art BNNs and Deep Ensemble models degrades significantly when the amount of training data is small. To address the issue of accurate uncertainty estimation in the small-data regime, we propose a probabilistic generalization of the popular sample-efficient non-parametric kNN approach. Our approach enables deep kNN classifier to accurately quantify underlying uncertainties in its prediction. We demonstrate the usefulness of the proposed approach by achieving superior uncertainty quantification as compared to state-of-the-art on a real-world application of COVID-19 diagnosis from chest X-Rays. Our code is available at https://github.com/ankurmallick/sample-efficient-uq
STUaNet: Understanding uncertainty in spatiotemporal collective human mobility2021-02-08   ${\displaystyle \cong }$
The high dynamics and heterogeneous interactions in the complicated urban systems have raised the issue of uncertainty quantification in spatiotemporal human mobility, to support critical decision-makings in risk-aware web applications such as urban event prediction where fluctuations are of significant interests. Given the fact that uncertainty quantifies the potential variations around prediction results, traditional learning schemes always lack uncertainty labels, and conventional uncertainty quantification approaches mostly rely upon statistical estimations with Bayesian Neural Networks or ensemble methods. However, they have never involved any spatiotemporal evolution of uncertainties under various contexts, and also have kept suffering from the poor efficiency of statistical uncertainty estimation while training models with multiple times. To provide high-quality uncertainty quantification for spatiotemporal forecasting, we propose an uncertainty learning mechanism to simultaneously estimate internal data quality and quantify external uncertainty regarding various contextual interactions. To address the issue of lacking labels of uncertainty, we propose a hierarchical data turbulence scheme where we can actively inject controllable uncertainty for guidance, and hence provide insights to both uncertainty quantification and weak supervised learning. Finally, we re-calibrate and boost the prediction performance by devising a gated-based bridge to adaptively leverage the learned uncertainty into predictions. Extensive experiments on three real-world spatiotemporal mobility sets have corroborated the superiority of our proposed model in terms of both forecasting and uncertainty quantification.
Interpreting Uncertainty in Model Predictions For COVID-19 Diagnosis2020-10-25   ${\displaystyle \cong }$
COVID-19, due to its accelerated spread has brought in the need to use assistive tools for faster diagnosis in addition to typical lab swab testing. Chest X-Rays for COVID cases tend to show changes in the lungs such as ground glass opacities and peripheral consolidations which can be detected by deep neural networks. However, traditional convolutional networks use point estimate for predictions, lacking in capture of uncertainty, which makes them less reliable for adoption. There have been several works so far in predicting COVID positive cases with chest X-Rays. However, not much has been explored on quantifying the uncertainty of these predictions, interpreting uncertainty, and decomposing this to model or data uncertainty. To address these needs, we develop a visualization framework to address interpretability of uncertainty and its components, with uncertainty in predictions computed with a Bayesian Convolutional Neural Network. This framework aims to understand the contribution of individual features in the Chest-X-Ray images to predictive uncertainty. Providing this as an assistive tool can help the radiologist understand why the model came up with a prediction and whether the regions of interest captured by the model for the specific prediction are of significance in diagnosis. We demonstrate the usefulness of the tool in chest x-ray interpretation through several test cases from a benchmark dataset.
Objective Evaluation of Deep Uncertainty Predictions for COVID-19 Detection2020-12-22   ${\displaystyle \cong }$
Deep neural networks (DNNs) have been widely applied for detecting COVID-19 in medical images. Existing studies mainly apply transfer learning and other data representation strategies to generate accurate point estimates. The generalization power of these networks is always questionable due to being developed using small datasets and failing to report their predictive confidence. Quantifying uncertainties associated with DNN predictions is a prerequisite for their trusted deployment in medical settings. Here we apply and evaluate three uncertainty quantification techniques for COVID-19 detection using chest X-Ray (CXR) images. The novel concept of uncertainty confusion matrix is proposed and new performance metrics for the objective evaluation of uncertainty estimates are introduced. Through comprehensive experiments, it is shown that networks pertained on CXR images outperform networks pretrained on natural image datasets such as ImageNet. Qualitatively and quantitatively evaluations also reveal that the predictive uncertainty estimates are statistically higher for erroneous predictions than correct predictions. Accordingly, uncertainty quantification methods are capable of flagging risky predictions with high uncertainty estimates. We also observe that ensemble methods more reliably capture uncertainties during the inference.
Exploring Uncertainty in Deep Learning for Construction of Prediction Intervals2021-04-26   ${\displaystyle \cong }$
Deep learning has achieved impressive performance on many tasks in recent years. However, it has been found that it is still not enough for deep neural networks to provide only point estimates. For high-risk tasks, we need to assess the reliability of the model predictions. This requires us to quantify the uncertainty of model prediction and construct prediction intervals. In this paper, We explore the uncertainty in deep learning to construct the prediction intervals. In general, We comprehensively consider two categories of uncertainties: aleatory uncertainty and epistemic uncertainty. We design a special loss function, which enables us to learn uncertainty without uncertainty label. We only need to supervise the learning of regression task. We learn the aleatory uncertainty implicitly from the loss function. And that epistemic uncertainty is accounted for in ensembled form. Our method correlates the construction of prediction intervals with the uncertainty estimation. Impressive results on some publicly available datasets show that the performance of our method is competitive with other state-of-the-art methods.
Combining Model and Parameter Uncertainty in Bayesian Neural Networks2019-05-25   ${\displaystyle \cong }$
Bayesian neural networks (BNNs) have recently regained a significant amount of attention in the deep learning community due to the development of scalable approximate Bayesian inference techniques. There are several advantages of using Bayesian approach: Parameter and prediction uncertainty become easily available, facilitating rigid statistical analysis. Furthermore, prior knowledge can be incorporated. However so far there have been no scalable techniques capable of combining both model (structural) and parameter uncertainty. In this paper we introduce the concept of model uncertainty in BNNs and hence make inference in the joint space of models and parameters. Moreover, we suggest an adaptation of a scalable variational inference approach with reparametrization of marginal inclusion probabilities to incorporate the model space constraints. Finally, we show that incorporating model uncertainty via Bayesian model averaging and Bayesian model selection allows to drastically sparsify the structure of BNNs.
Interval Deep Learning for Uncertainty Quantification in Safety Applications2021-05-13   ${\displaystyle \cong }$
Deep neural networks (DNNs) are becoming more prevalent in important safety-critical applications, where reliability in the prediction is paramount. Despite their exceptional prediction capabilities, current DNNs do not have an implicit mechanism to quantify and propagate significant input data uncertainty -- which is common in safety-critical applications. In many cases, this uncertainty is epistemic and can arise from multiple sources, such as lack of knowledge about the data generating process, imprecision, ignorance, and poor understanding of physics phenomena. Recent approaches have focused on quantifying parameter uncertainty, but approaches to end-to-end training of DNNs with epistemic input data uncertainty are more limited and largely problem-specific. In this work, we present a DNN optimized with gradient-based methods capable to quantify input and parameter uncertainty by means of interval analysis, which we call Deep Interval Neural Network (DINN). We perform experiments on an air pollution dataset with sensor uncertainty and show that the DINN can produce accurate bounded estimates from uncertain input data.
DEUP: Direct Epistemic Uncertainty Prediction2021-02-16   ${\displaystyle \cong }$
Epistemic uncertainty is the part of out-of-sample prediction error due to the lack of knowledge of the learner. Whereas previous work was focusing on model variance, we propose a principled approach for directly estimating epistemic uncertainty by learning to predict generalization error and subtracting an estimate of aleatoric uncertainty, i.e., intrinsic unpredictability. This estimator of epistemic uncertainty includes the effect of model bias and can be applied in non-stationary learning environments arising in active learning or reinforcement learning. In addition to demonstrating these properties of Direct Epistemic Uncertainty Prediction (DEUP), we illustrate its advantage against existing methods for uncertainty estimation on downstream tasks including sequential model optimization and reinforcement learning. We also evaluate the quality of uncertainty estimates from DEUP for probabilistic classification of images and for estimating uncertainty about synergistic drug combinations.
Uncertainty Modelling in Deep Neural Networks for Image Data2020-11-17   ${\displaystyle \cong }$
Quantifying uncertainty in a model's predictions is important as it enables, for example, the safety of an AI system to be increased by acting on the model's output in an informed manner. We cannot expect a system to be 100% accurate or perfect at its task, however, we can equip the system with some tools to inform us if it is not certain about a prediction. This way, a second check can be performed, or the task can be passed to a human specialist. This is crucial for applications where the cost of an error is high, such as in autonomous vehicle control, medical image analysis, financial estimations or legal fields. Deep Neural Networks are powerful black box predictors that have recently achieved impressive performance on a wide spectrum of tasks. Quantifying predictive uncertainty in DNNs is a challenging and yet on-going problem. Although there have been many efforts to equip NNs with tools to estimate uncertainty, such as Monte Carlo Dropout, most of the previous methods only focus on one of the three types of model, data or distributional uncertainty. In this paper we propose a complete framework to capture and quantify all of these three types of uncertainties in DNNs for image classification. This framework includes an ensemble of CNNs for model uncertainty, a supervised reconstruction auto-encoder to capture distributional uncertainty and using the output of activation functions in the last layer of the network, to capture data uncertainty. Finally we demonstrate the efficiency of our method on popular image datasets for classification.
On the Effects of Quantisation on Model Uncertainty in Bayesian Neural Networks2021-02-22   ${\displaystyle \cong }$
Bayesian neural networks (BNNs) are making significant progress in many research areas where decision making needs to be accompanied by uncertainty estimation. Being able to quantify uncertainty while making decisions is essential for understanding when the model is over-/under-confident, and hence BNNs are attracting interest in safety-critical applications, such as autonomous driving, healthcare and robotics. Nevertheless, BNNs have not been as widely used in industrial practice, mainly because of their increased memory and compute costs. In this work, we investigate quantisation of BNNs by compressing 32-bit floating-point weights and activations to their integer counterparts, that has already been successful in reducing the compute demand in standard pointwise neural networks. We study three types of quantised BNNs, we evaluate them under a wide range of different settings, and we empirically demonstrate that an uniform quantisation scheme applied to BNNs does not substantially decrease their quality of uncertainty estimation.
Towards calibrated and scalable uncertainty representations for neural networks2019-12-03   ${\displaystyle \cong }$
For many applications it is critical to know the uncertainty of a neural network's predictions. While a variety of neural network parameter estimation methods have been proposed for uncertainty estimation, they have not been rigorously compared across uncertainty measures. We assess four of these parameter estimation methods to calibrate uncertainty estimation using four different uncertainty measures: entropy, mutual information, aleatoric uncertainty and epistemic uncertainty. We evaluate the calibration of these parameter estimation methods using expected calibration error. Additionally, we propose a novel method of neural network parameter estimation called RECAST, which combines cosine annealing with warm restarts with Stochastic Gradient Langevin Dynamics, capturing more diverse parameter distributions. When benchmarked against mutilated image data, we show that RECAST is well-calibrated and when combined with predictive entropy and epistemic uncertainty it offers the best calibrated measure of uncertainty when compared to recent methods.
Simple and Principled Uncertainty Estimation with Deterministic Deep Learning via Distance Awareness2020-06-17   ${\displaystyle \cong }$
Bayesian neural networks (BNN) and deep ensembles are principled approaches to estimate the predictive uncertainty of a deep learning model. However their practicality in real-time, industrial-scale applications are limited due to their heavy memory and inference cost. This motivates us to study principled approaches to high-quality uncertainty estimation that require only a single deep neural network (DNN). By formalizing the uncertainty quantification as a minimax learning problem, we first identify input distance awareness, i.e., the model's ability to quantify the distance of a testing example from the training data in the input space, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs, by adding a weight normalization step during training and replacing the output layer with a Gaussian process. On a suite of vision and language understanding tasks and on modern architectures (Wide-ResNet and BERT), SNGP is competitive with deep ensembles in prediction, calibration and out-of-domain detection, and outperforms the other single-model approaches.
Real-time Uncertainty Decomposition for Online Learning Control2020-10-06   ${\displaystyle \cong }$
Safety-critical decisions based on machine learning models require a clear understanding of the involved uncertainties to avoid hazardous or risky situations. While aleatoric uncertainty can be explicitly modeled given a parametric description, epistemic uncertainty rather describes the presence or absence of training data. This paper proposes a novel generic method for modeling epistemic uncertainty and shows its advantages over existing approaches for neural networks on various data sets. It can be directly combined with aleatoric uncertainty estimates and allows for prediction in real-time as the inference is sample-free. We exploit this property in a model-based quadcopter control setting and demonstrate how the controller benefits from a differentiation between aleatoric and epistemic uncertainty in online learning of thermal disturbances.
Uncertainty Quantification in Deep Learning for Safer Neuroimage Enhancement2019-07-31   ${\displaystyle \cong }$
Deep learning (DL) has shown great potential in medical image enhancement problems, such as super-resolution or image synthesis. However, to date, little consideration has been given to uncertainty quantification over the output image. Here we introduce methods to characterise different components of uncertainty in such problems and demonstrate the ideas using diffusion MRI super-resolution. Specifically, we propose to account for $intrinsic$ uncertainty through a heteroscedastic noise model and for $parameter$ uncertainty through approximate Bayesian inference, and integrate the two to quantify $predictive$ uncertainty over the output image. Moreover, we introduce a method to propagate the predictive uncertainty on a multi-channelled image to derived scalar parameters, and separately quantify the effects of intrinsic and parameter uncertainty therein. The methods are evaluated for super-resolution of two different signal representations of diffusion MR images---DTIs and Mean Apparent Propagator MRI---and their derived quantities such as MD and FA, on multiple datasets of both healthy and pathological human brains. Results highlight three key benefits of uncertainty modelling for improving the safety of DL-based image enhancement systems. Firstly, incorporating uncertainty improves the predictive performance even when test data departs from training data. Secondly, the predictive uncertainty highly correlates with errors, and is therefore capable of detecting predictive "failures". Results demonstrate that such an uncertainty measure enables subject-specific and voxel-wise risk assessment of the output images. Thirdly, we show that the method for decomposing predictive uncertainty into its independent sources provides high-level "explanations" for the performance by quantifying how much uncertainty arises from the inherent difficulty of the task or the limited training examples.
Evaluating Scalable Bayesian Deep Learning Methods for Robust Computer Vision2020-04-07   ${\displaystyle \cong }$
While deep neural networks have become the go-to approach in computer vision, the vast majority of these models fail to properly capture the uncertainty inherent in their predictions. Estimating this predictive uncertainty can be crucial, for example in automotive applications. In Bayesian deep learning, predictive uncertainty is commonly decomposed into the distinct types of aleatoric and epistemic uncertainty. The former can be estimated by letting a neural network output the parameters of a certain probability distribution. Epistemic uncertainty estimation is a more challenging problem, and while different scalable methods recently have emerged, no extensive comparison has been performed in a real-world setting. We therefore accept this task and propose a comprehensive evaluation framework for scalable epistemic uncertainty estimation methods in deep learning. Our proposed framework is specifically designed to test the robustness required in real-world computer vision applications. We also apply this framework to provide the first properly extensive and conclusive comparison of the two current state-of-the-art scalable methods: ensembling and MC-dropout. Our comparison demonstrates that ensembling consistently provides more reliable and practically useful uncertainty estimates. Code is available at https://github.com/fregu856/evaluating_bdl.
BayesAdapter: Being Bayesian, Inexpensively and Robustly, via Bayeisan Fine-tuning2020-10-05   ${\displaystyle \cong }$
Despite their theoretical appealingness, Bayesian neural networks (BNNs) are falling far behind in terms of adoption in real-world applications compared with normal NNs, mainly due to their limited scalability in training, and low fidelity in their uncertainty estimates. In this work, we develop a new framework, named BayesAdapter, to address these issues and bring Bayesian deep learning to the masses. The core notion of BayesAdapter is to adapt pre-trained deterministic NNs to be BNNs via Bayesian fine-tuning. We implement Bayesian fine-tuning with a plug-and-play instantiation of stochastic variational inference, and propose exemplar reparameterization to reduce gradient variance and stabilize the fine-tuning. Together, they enable training BNNs as if one were training deterministic NNs with minimal added overheads. During Bayesian fine-tuning, we further propose an uncertainty regularization to supervise and calibrate the uncertainty quantification of learned BNNs at low cost. To empirically evaluate BayesAdapter, we conduct extensive experiments on a diverse set of challenging benchmarks, and observe significantly higher training efficiency, better predictive performance, and more calibrated and faithful uncertainty estimates than existing BNNs.
Improving model calibration with accuracy versus uncertainty optimization2020-12-14   ${\displaystyle \cong }$
Obtaining reliable and accurate quantification of uncertainty estimates from deep neural networks is important in safety-critical applications. A well-calibrated model should be accurate when it is certain about its prediction and indicate high uncertainty when it is likely to be inaccurate. Uncertainty calibration is a challenging problem as there is no ground truth available for uncertainty estimates. We propose an optimization method that leverages the relationship between accuracy and uncertainty as an anchor for uncertainty calibration. We introduce a differentiable accuracy versus uncertainty calibration (AvUC) loss function that allows a model to learn to provide well-calibrated uncertainties, in addition to improved accuracy. We also demonstrate the same methodology can be extended to post-hoc uncertainty calibration on pretrained models. We illustrate our approach with mean-field stochastic variational inference and compare with state-of-the-art methods. Extensive experiments demonstrate our approach yields better model calibration than existing methods on large-scale image classification tasks under distributional shift.
Reducing Risk and Uncertainty of Deep Neural Networks on Diagnosing COVID-19 Infection2021-04-28   ${\displaystyle \cong }$
Effective and reliable screening of patients via Computer-Aided Diagnosis can play a crucial part in the battle against COVID-19. Most of the existing works focus on developing sophisticated methods yielding high detection performance, yet not addressing the issue of predictive uncertainty. In this work, we introduce uncertainty estimation to detect confusing cases for expert referral to address the unreliability of state-of-the-art (SOTA) DNNs on COVID-19 detection. To the best of our knowledge, we are the first to address this issue on the COVID-19 detection problem. In this work, we investigate a number of SOTA uncertainty estimation methods on publicly available COVID dataset and present our experimental findings. In collaboration with medical professionals, we further validate the results to ensure the viability of the best performing method in clinical practice.
Regression with Uncertainty Quantification in Large Scale Complex Data2019-12-04   ${\displaystyle \cong }$
While several methods for predicting uncertainty on deep networks have been recently proposed, they do not readily translate to large and complex datasets. In this paper we utilize a simplified form of the Mixture Density Networks (MDNs) to produce a one-shot approach to quantify uncertainty in regression problems. We show that our uncertainty bounds are on-par or better than other reported existing methods. When applied to standard regression benchmark datasets, we show an improvement in predictive log-likelihood and root-mean-square-error when compared to existing state-of-the-art methods. We also demonstrate this method's efficacy on stochastic, highly volatile time-series data where stock prices are predicted for the next time interval. The resulting uncertainty graph summarizes significant anomalies in the stock price chart. Furthermore, we apply this method to the task of age estimation from the challenging IMDb-Wiki dataset of half a million face images. We successfully predict the uncertainties associated with the prediction and empirically analyze the underlying causes of the uncertainties. This uncertainty quantification can be used to pre-process low quality datasets and further enable learning.
Learnable Uncertainty under Laplace Approximations2020-10-06   ${\displaystyle \cong }$
Laplace approximations are classic, computationally lightweight means for constructing Bayesian neural networks (BNNs). As in other approximate BNNs, one cannot necessarily expect the induced predictive uncertainty to be calibrated. Here we develop a formalism to explicitly "train" the uncertainty in a decoupled way to the prediction itself. To this end we introduce uncertainty units for Laplace-approximated networks: Hidden units with zero weights that can be added to any pre-trained, point-estimated network. Since these units are inactive, they do not affect the predictions. But their presence changes the geometry (in particular the Hessian) of the loss landscape around the point estimate, thereby affecting the network's uncertainty estimates under a Laplace approximation. We show that such units can be trained via an uncertainty-aware objective, making the Laplace approximation competitive with more expensive alternative uncertainty-quantification frameworks.