10,16,2021

 Challenges and Pitfalls of Machine Learning Evaluation and Benchmarking2019-06-25   ${\displaystyle \cong }$ An increasingly complex and diverse collection of Machine Learning (ML) models as well as hardware/software stacks, collectively referred to as "ML artifacts", are being proposed - leading to a diverse landscape of ML. These ML innovations proposed have outpaced researchers' ability to analyze, study and adapt them. This is exacerbated by the complicated and sometimes non-reproducible procedures for ML evaluation. A common practice of sharing ML artifacts is through repositories where artifact authors post ad-hoc code and some documentation, but often fail to reveal critical information for others to reproduce their results. This results in users' inability to compare with artifact authors' claims or adapt the model to his/her own use. This paper discusses common challenges and pitfalls of ML evaluation and benchmarking, which can be used as a guideline for ML model authors when sharing ML artifacts, and for system developers when benchmarking or designing ML systems. Strategies and Principles of Distributed Machine Learning on Big Data2015-12-31   ${\displaystyle \cong }$ The rise of Big Data has led to new demands for Machine Learning (ML) systems to learn complex models with millions to billions of parameters, that promise adequate capacity to digest massive datasets and offer powerful predictive analytics thereupon. In order to run ML algorithms at such scales, on a distributed cluster with 10s to 1000s of machines, it is often the case that significant engineering efforts are required --- and one might fairly ask if such engineering truly falls within the domain of ML research or not. Taking the view that Big ML systems can benefit greatly from ML-rooted statistical and algorithmic insights --- and that ML researchers should therefore not shy away from such systems design --- we discuss a series of principles and strategies distilled from our recent efforts on industrial-scale ML solutions. These principles and strategies span a continuum from application, to engineering, and to theoretical research and development of Big ML systems and architectures, with the goal of understanding how to make them efficient, generally-applicable, and supported with convergence and scaling guarantees. They concern four key questions which traditionally receive little attention in ML research: How to distribute an ML program over a cluster? How to bridge ML computation with inter-machine communication? How to perform such communication? What should be communicated between machines? By exposing underlying statistical and algorithmic characteristics unique to ML programs but not typically seen in traditional computer programs, and by dissecting successful cases to reveal how we have harnessed these principles to design and develop both high-performance distributed ML software as well as general-purpose ML frameworks, we present opportunities for ML researchers and practitioners to further shape and grow the area that lies between ML and systems. Learning by Design: Structuring and Documenting the Human Choices in Machine Learning Development2021-05-03   ${\displaystyle \cong }$ The influence of machine learning (ML) is quickly spreading, and a number of recent technological innovations have applied ML as a central technology. However, ML development still requires a substantial amount of human expertise to be successful. The deliberation and expert judgment applied during ML development cannot be revisited or scrutinized if not properly documented, and this hinders the further adoption of ML technologies--especially in safety critical situations. In this paper, we present a method consisting of eight design questions, that outline the deliberation and normative choices going into creating a ML model. Our method affords several benefits, such as supporting critical assessment through methodological transparency, aiding in model debugging, and anchoring model explanations by committing to a pre hoc expectation of the model's behavior. We believe that our method can help ML practitioners structure and justify their choices and assumptions when developing ML models, and that it can help bridge a gap between those inside and outside the ML field in understanding how and why ML models are designed and developed the way they are. MLPerf Inference Benchmark2020-05-09   ${\displaystyle \cong }$ Machine-learning (ML) hardware and software system demand is burgeoning. Driven by ML applications, the number of different ML inference systems has exploded. Over 100 organizations are building ML inference chips, and the systems that incorporate existing models span at least three orders of magnitude in power consumption and five orders of magnitude in performance; they range from embedded devices to data-center solutions. Fueling the hardware are a dozen or more software frameworks and libraries. The myriad combinations of ML hardware and ML software make assessing ML-system performance in an architecture-neutral, representative, and reproducible manner challenging. There is a clear need for industry-wide standard ML benchmarking and evaluation criteria. MLPerf Inference answers that call. In this paper, we present our benchmarking method for evaluating ML inference systems. Driven by more than 30 organizations as well as more than 200 ML engineers and practitioners, MLPerf prescribes a set of rules and best practices to ensure comparability across systems with wildly differing architectures. The first call for submissions garnered more than 600 reproducible inference-performance measurements from 14 organizations, representing over 30 systems that showcase a wide range of capabilities. The submissions attest to the benchmark's flexibility and adaptability. Declarative Machine Learning - A Classification of Basic Properties and Types2016-05-19   ${\displaystyle \cong }$ Declarative machine learning (ML) aims at the high-level specification of ML tasks or algorithms, and automatic generation of optimized execution plans from these specifications. The fundamental goal is to simplify the usage and/or development of ML algorithms, which is especially important in the context of large-scale computations. However, ML systems at different abstraction levels have emerged over time and accordingly there has been a controversy about the meaning of this general definition of declarative ML. Specification alternatives range from ML algorithms expressed in domain-specific languages (DSLs) with optimization for performance, to ML task (learning problem) specifications with optimization for performance and accuracy. We argue that these different types of declarative ML complement each other as they address different users (data scientists and end users). This paper makes an attempt to create a taxonomy for declarative ML, including a definition of essential basic properties and types of declarative ML. Along the way, we provide insights into implications of these properties. We also use this taxonomy to classify existing systems. Finally, we draw conclusions on defining appropriate benchmarks and specification languages for declarative ML. Machine Learning for Intelligent Optical Networks: A Comprehensive Survey2020-03-11   ${\displaystyle \cong }$ With the rapid development of Internet and communication systems, both in services and technologies, communication networks have been suffering increasing complexity. It is imperative to improve intelligence in communication network, and several aspects have been incorporating with Artificial Intelligence (AI) and Machine Learning (ML). Optical network, which plays an important role both in core and access network in communication networks, also faces great challenges of system complexity and the requirement of manual operations. To overcome the current limitations and address the issues of future optical networks, it is essential to deploy more intelligence capability to enable autonomous and exible network operations. ML techniques are proved to have superiority on solving complex problems; and thus recently, ML techniques have been used for many optical network applications. In this paper, a detailed survey of existing applications of ML for intelligent optical networks is presented. The applications of ML are classified in terms of their use cases, which are categorized into optical network control and resource management, and optical networks monitoring and survivability. The use cases are analyzed and compared according to the used ML techniques. Besides, a tutorial for ML applications is provided from the aspects of the introduction of common ML algorithms, paradigms of ML, and motivations of applying ML. Lastly, challenges and possible solutions of ML application in optical networks are also discussed, which intends to inspire future innovations in leveraging ML to build intelligent optical networks. MLSys: The New Frontier of Machine Learning Systems2019-12-01   ${\displaystyle \cong }$ Machine learning (ML) techniques are enjoying rapidly increasing adoption. However, designing and implementing the systems that support ML models in real-world deployments remains a significant obstacle, in large part due to the radically different development and deployment profile of modern ML methods, and the range of practical concerns that come with broader adoption. We propose to foster a new systems machine learning research community at the intersection of the traditional systems and ML communities, focused on topics such as hardware systems for ML, software systems for ML, and ML optimized for metrics beyond predictive accuracy. To do this, we describe a new conference, MLSys, that explicitly targets research at the intersection of systems and machine learning with a program committee split evenly between experts in systems and ML, and an explicit focus on topics at the intersection of the two. Insights into Performance Fitness and Error Metrics for Machine Learning2020-05-17   ${\displaystyle \cong }$ Machine learning (ML) is the field of training machines to achieve high level of cognition and perform human-like analysis. Since ML is a data-driven approach, it seemingly fits into our daily lives and operations as well as complex and interdisciplinary fields. With the rise of commercial, open-source and user-catered ML tools, a key question often arises whenever ML is applied to explore a phenomenon or a scenario: what constitutes a good ML model? Keeping in mind that a proper answer to this question depends on a variety of factors, this work presumes that a good ML model is one that optimally performs and best describes the phenomenon on hand. From this perspective, identifying proper assessment metrics to evaluate performance of ML models is not only necessary but is also warranted. As such, this paper examines a number of the most commonly-used performance fitness and error metrics for regression and classification algorithms, with emphasis on engineering applications. The Adversarial Machine Learning Conundrum: Can The Insecurity of ML Become The Achilles' Heel of Cognitive Networks?2019-06-03   ${\displaystyle \cong }$ The holy grail of networking is to create \textit{cognitive networks} that organize, manage, and drive themselves. Such a vision now seems attainable thanks in large part to the progress in the field of machine learning (ML), which has now already disrupted a number of industries and revolutionized practically all fields of research. But are the ML models foolproof and robust to security attacks to be in charge of managing the network? Unfortunately, many modern ML models are easily misled by simple and easily-crafted adversarial perturbations, which does not bode well for the future of ML-based cognitive networks unless ML vulnerabilities for the cognitive networking environment are identified, addressed, and fixed. The purpose of this article is to highlight the problem of insecure ML and to sensitize the readers to the danger of adversarial ML by showing how an easily-crafted adversarial ML example can compromise the operations of the cognitive self-driving network. In this paper, we demonstrate adversarial attacks on two simple yet representative cognitive networking applications (namely, intrusion detection and network traffic classification). We also provide some guidelines to design secure ML models for cognitive networks that are robust to adversarial attacks on the ML pipeline of cognitive networks. A Rigorous Machine Learning Analysis Pipeline for Biomedical Binary Classification: Application in Pancreatic Cancer Nested Case-control Studies with Implications for Bias Assessments2020-09-08   ${\displaystyle \cong }$ Machine learning (ML) offers a collection of powerful approaches for detecting and modeling associations, often applied to data having a large number of features and/or complex associations. Currently, there are many tools to facilitate implementing custom ML analyses (e.g. scikit-learn). Interest is also increasing in automated ML packages, which can make it easier for non-experts to apply ML and have the potential to improve model performance. ML permeates most subfields of biomedical research with varying levels of rigor and correct usage. Tremendous opportunities offered by ML are frequently offset by the challenge of assembling comprehensive analysis pipelines, and the ease of ML misuse. In this work we have laid out and assembled a complete, rigorous ML analysis pipeline focused on binary classification (i.e. case/control prediction), and applied this pipeline to both simulated and real world data. At a high level, this 'automated' but customizable pipeline includes a) exploratory analysis, b) data cleaning and transformation, c) feature selection, d) model training with 9 established ML algorithms, each with hyperparameter optimization, and e) thorough evaluation, including appropriate metrics, statistical analyses, and novel visualizations. This pipeline organizes the many subtle complexities of ML pipeline assembly to illustrate best practices to avoid bias and ensure reproducibility. Additionally, this pipeline is the first to compare established ML algorithms to 'ExSTraCS', a rule-based ML algorithm with the unique capability of interpretably modeling heterogeneous patterns of association. While designed to be widely applicable we apply this pipeline to an epidemiological investigation of established and newly identified risk factors for pancreatic cancer to evaluate how different sources of bias might be handled by ML algorithms. Visual Machine Learning: Insight through Eigenvectors, Chladni patterns and community detection in 2D particulate structures2020-01-02   ${\displaystyle \cong }$ Machine learning (ML) is quickly emerging as a powerful tool with diverse applications across an extremely broad spectrum of disciplines and commercial endeavors. Typically, ML is used as a black box that provides little illuminating rationalization of its output. In the current work, we aim to better understand the generic intuition underlying unsupervised ML with a focus on physical systems. The systems that are studied here as test cases comprise of six different 2-dimensional (2-D) particulate systems of different complexities. It is noted that the findings of this study are generic to any unsupervised ML problem and are not restricted to materials systems alone. Three rudimentary unsupervised ML techniques are employed on the adjacency (connectivity) matrix of the six studied systems: (i) using principal eigenvalue and eigenvectors of the adjacency matrix, (ii) spectral decomposition, and (iii) a Potts model based community detection technique in which a modularity function is maximized. We demonstrate that, while solving a completely classical problem, ML technique produces features that are distinctly connected to quantum mechanical solutions. Dissecting these features help us to understand the deep connection between the classical non-linear world and the quantum mechanical linear world through the kaleidoscope of ML technique, which might have far reaching consequences both in the arena of physical sciences and ML. Application of Machine Learning in Wireless Networks: Key Techniques and Open Issues2019-02-28   ${\displaystyle \cong }$ As a key technique for enabling artificial intelligence, machine learning (ML) is capable of solving complex problems without explicit programming. Motivated by its successful applications to many practical tasks like image recognition, both industry and the research community have advocated the applications of ML in wireless communication. This paper comprehensively surveys the recent advances of the applications of ML in wireless communication, which are classified as: resource management in the MAC layer, networking and mobility management in the network layer, and localization in the application layer. The applications in resource management further include power control, spectrum management, backhaul management, cache management, beamformer design and computation resource management, while ML based networking focuses on the applications in clustering, base station switching control, user association and routing. Moreover, literatures in each aspect is organized according to the adopted ML techniques. In addition, several conditions for applying ML to wireless communication are identified to help readers decide whether to use ML and which kind of ML techniques to use, and traditional approaches are also summarized together with their performance comparison with ML based approaches, based on which the motivations of surveyed literatures to adopt ML are clarified. Given the extensiveness of the research area, challenges and unresolved issues are presented to facilitate future studies, where ML based network slicing, infrastructure update to support ML based paradigms, open data sets and platforms for researchers, theoretical guidance for ML implementation and so on are discussed. Proposed Guidelines for the Responsible Use of Explainable Machine Learning2019-11-29   ${\displaystyle \cong }$ Explainable machine learning (ML) enables human learning from ML, human appeal of automated model decisions, regulatory compliance, and security audits of ML models. Explainable ML (i.e. explainable artificial intelligence or XAI) has been implemented in numerous open source and commercial packages and explainable ML is also an important, mandatory, or embedded aspect of commercial predictive modeling in industries like financial services. However, like many technologies, explainable ML can be misused, particularly as a faulty safeguard for harmful black-boxes, e.g. fairwashing or scaffolding, and for other malevolent purposes like stealing models and sensitive training data. To promote best-practice discussions for this already in-flight technology, this short text presents internal definitions and a few examples before covering the proposed guidelines. This text concludes with a seemingly natural argument for the use of interpretable models and explanatory, debugging, and disparate impact testing methods in life- or mission-critical ML systems. Towards ML Engineering: A Brief History Of TensorFlow Extended (TFX)2020-09-28   ${\displaystyle \cong }$ Software Engineering, as a discipline, has matured over the past 5+ decades. The modern world heavily depends on it, so the increased maturity of Software Engineering was an eventuality. Practices like testing and reliable technologies help make Software Engineering reliable enough to build industries upon. Meanwhile, Machine Learning (ML) has also grown over the past 2+ decades. ML is used more and more for research, experimentation and production workloads. ML now commonly powers widely-used products integral to our lives. But ML Engineering, as a discipline, has not widely matured as much as its Software Engineering ancestor. Can we take what we have learned and help the nascent field of applied ML evolve into ML Engineering the way Programming evolved into Software Engineering [1]? In this article we will give a whirlwind tour of Sibyl [2] and TensorFlow Extended (TFX) [3], two successive end-to-end (E2E) ML platforms at Alphabet. We will share the lessons learned from over a decade of applied ML built on these platforms, explain both their similarities and their differences, and expand on the shifts (both mental and technical) that helped us on our journey. In addition, we will highlight some of the capabilities of TFX that help realize several aspects of ML Engineering. We argue that in order to unlock the gains ML can bring, organizations should advance the maturity of their ML teams by investing in robust ML infrastructure and promoting ML Engineering education. We also recommend that before focusing on cutting-edge ML modeling techniques, product leaders should invest more time in adopting interoperable ML platforms for their organizations. In closing, we will also share a glimpse into the future of TFX. Examining Machine Learning for 5G and Beyond through an Adversarial Lens2020-09-05   ${\displaystyle \cong }$ Spurred by the recent advances in deep learning to harness rich information hidden in large volumes of data and to tackle problems that are hard to model/solve (e.g., resource allocation problems), there is currently tremendous excitement in the mobile networks domain around the transformative potential of data-driven AI/ML based network automation, control and analytics for 5G and beyond. In this article, we present a cautionary perspective on the use of AI/ML in the 5G context by highlighting the adversarial dimension spanning multiple types of ML (supervised/unsupervised/RL) and support this through three case studies. We also discuss approaches to mitigate this adversarial ML risk, offer guidelines for evaluating the robustness of ML models, and call attention to issues surrounding ML oriented research in 5G more generally. Towards a Robust and Trustworthy Machine Learning System Development2021-01-08   ${\displaystyle \cong }$ Machine Learning (ML) technologies have been widely adopted in many mission critical fields, such as cyber security, autonomous vehicle control, healthcare, etc. to support intelligent decision-making. While ML has demonstrated impressive performance over conventional methods in these applications, concerns arose with respect to system resilience against ML-specific security attacks and privacy breaches as well as the trust that users have in these systems. In this article, firstly we present our recent systematic and comprehensive survey on the state-of-the-art ML robustness and trustworthiness technologies from a security engineering perspective, which covers all aspects of secure ML system development including threat modeling, common offensive and defensive technologies, privacy-preserving machine learning, user trust in the context of machine learning, and empirical evaluation for ML model robustness. Secondly, we then push our studies forward above and beyond a survey by describing a metamodel we created that represents the body of knowledge in a standard and visualized way for ML practitioners. We further illustrate how to leverage the metamodel to guide a systematic threat analysis and security design process in a context of generic ML system development, which extends and scales up the classic process. Thirdly, we propose future research directions motivated by our findings to advance the development of robust and trustworthy ML systems. Our work differs from existing surveys in this area in that, to the best of our knowledge, it is the first of its kind of engineering effort to (i) explore the fundamental principles and best practices to support robust and trustworthy ML system development; and (ii) study the interplay of robustness and user trust in the context of ML systems. Local Interpretability of Calibrated Prediction Models: A Case of Type 2 Diabetes Mellitus Screening Test2020-06-02   ${\displaystyle \cong }$ Machine Learning (ML) models are often complex and difficult to interpret due to their 'black-box' characteristics. Interpretability of a ML model is usually defined as the degree to which a human can understand the cause of decisions reached by a ML model. Interpretability is of extremely high importance in many fields of healthcare due to high levels of risk related to decisions based on ML models. Calibration of the ML model outputs is another issue often overlooked in the application of ML models in practice. This paper represents an early work in examination of prediction model calibration impact on the interpretability of the results. We present a use case of a patient in diabetes screening prediction scenario and visualize results using three different techniques to demonstrate the differences between calibrated and uncalibrated regularized regression model. Coupling Machine Learning and Crop Modeling Improves Crop Yield Prediction in the US Corn Belt2020-07-28   ${\displaystyle \cong }$ This study investigates whether coupling crop modeling and machine learning (ML) improves corn yield predictions in the US Corn Belt. The main objectives are to explore whether a hybrid approach (crop modeling + ML) would result in better predictions, investigate which combinations of hybrid models provide the most accurate predictions and determine the features from the crop modeling that are most effective to be integrated with ML for corn yield prediction. Five ML models and six ensemble models have been designed to address the research question. The results suggest that adding simulation crop model variables (APSIM) as input features to ML models can make a significant difference in the performance of ML models, and it can boost ML performance by up to 29%. Furthermore, we investigated partial inclusion of APSIM features in the ML prediction models, and we found that soil and weather-related APSIM variables are most influential on the ML predictions followed by crop-related and phenology-related variables. Finally, based on feature importance measure, it has been observed that simulated APSIM average drought stress and average water table depth during the growing season are the most important APSIM inputs to ML. This result indicates that weather information alone is not sufficient, and ML models need more hydrological inputs to make improved yield predictions. Stress Testing of Meta-learning Approaches for Few-shot Learning2021-01-21   ${\displaystyle \cong }$ Meta-learning (ML) has emerged as a promising learning method under resource constraints such as few-shot learning. ML approaches typically propose a methodology to learn generalizable models. In this work-in-progress paper, we put the recent ML approaches to a stress test to discover their limitations. Precisely, we measure the performance of ML approaches for few-shot learning against increasing task complexity. Our results show a quick degradation in the performance of initialization strategies for ML (MAML, TAML, and MetaSGD), while surprisingly, approaches that use an optimization strategy (MetaLSTM) perform significantly better. We further demonstrate the effectiveness of an optimization strategy for ML (MetaLSTM++) trained in a MAML manner over a pure optimization strategy. Our experiments also show that the optimization strategies for ML achieve higher transferability from simple to complex tasks. MLPerf Training Benchmark2020-03-02   ${\displaystyle \cong }$ Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML. But ML training presents three unique benchmarking challenges absent from other domains: optimizations that improve training throughput can increase the time to solution, training is stochastic and time to solution exhibits high variance, and software and hardware systems are so diverse that fair benchmarking with the same binary, code, and even hyperparameters is difficult. We therefore present MLPerf, an ML benchmark that overcomes these challenges. Our analysis quantitatively evaluates MLPerf's efficacy at driving performance and scalability improvements across two rounds of results from multiple vendors.