Oracle-Based Robust Optimization via Online Learning2014-02-25 ${\displaystyle \cong }$ |

Robust optimization is a common framework in optimization under uncertainty when the problem parameters are not known, but it is rather known that the parameters belong to some given uncertainty set. In the robust optimization framework the problem solved is a min-max problem where a solution is judged according to its performance on the worst possible realization of the parameters. In many cases, a straightforward solution of the robust optimization problem of a certain type requires solving an optimization problem of a more complicated type, and in some cases even NP-hard. For example, solving a robust conic quadratic program, such as those arising in robust SVM, ellipsoidal uncertainty leads in general to a semidefinite program. In this paper we develop a method for approximately solving a robust optimization problem using tools from online convex optimization, where in every stage a standard (non-robust) optimization program is solved. Our algorithms find an approximate robust solution using a number of calls to an oracle that solves the original (non-robust) problem that is inversely proportional to the square of the target accuracy. |

Automatically Learning Compact Quality-aware Surrogates for Optimization Problems2020-06-18 ${\displaystyle \cong }$ |

Solving optimization problems with unknown parameters often requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values. Recent work has shown that including the optimization problem as a layer in the model training pipeline results in predictions of the unobserved parameters that lead to higher decision quality. Unfortunately, this process comes at a large computational cost because the optimization problem must be solved and differentiated through in each training iteration; furthermore, it may also sometimes fail to improve solution quality due to non-smoothness issues that arise when training through a complex optimization layer. To address these shortcomings, we learn a low-dimensional surrogate model of a large optimization problem by representing the feasible space in terms of meta-variables, each of which is a linear combination of the original variables. By training a low-dimensional surrogate model end-to-end, and jointly with the predictive model, we achieve: i) a large reduction in training and inference time; and ii) improved performance by focusing attention on the more important variables in the optimization and learning in a smoother space. Empirically, we demonstrate these improvements on a non-convex adversary modeling task, a submodular recommendation task and a convex portfolio optimization task. |

Robust Optimization for Non-Convex Objectives2017-07-04 ${\displaystyle \cong }$ |

We consider robust optimization problems, where the goal is to optimize in the worst case over a class of objective functions. We develop a reduction from robust improper optimization to Bayesian optimization: given an oracle that returns $?$-approximate solutions for distributions over objectives, we compute a distribution over solutions that is $?$-approximate in the worst case. We show that de-randomizing this solution is NP-hard in general, but can be done for a broad class of statistical learning tasks. We apply our results to robust neural network training and submodular optimization. We evaluate our approach experimentally on corrupted character classification, and robust influence maximization in networks. |

Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era2017-12-29 ${\displaystyle \cong }$ |

A novel data-driven stochastic robust optimization (DDSRO) framework is proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data. Uncertainty data in large datasets are often collected from various conditions, which are encoded by class labels. Machine learning methods including Dirichlet process mixture model and maximum likelihood estimation are employed for uncertainty modeling. A DDSRO framework is further proposed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem follows a two-stage stochastic programming approach to optimize the expected objective across different data classes; adaptive robust optimization is nested as the inner problem to ensure the robustness of the solution while maintaining computational tractability. A decomposition-based algorithm is further developed to solve the resulting multi-level optimization problem efficiently. Case studies on process network design and planning are presented to demonstrate the applicability of the proposed framework and algorithm. |

Learning Convex Optimization Models2020-06-18 ${\displaystyle \cong }$ |

A convex optimization model predicts an output from an input by solving a convex optimization problem. The class of convex optimization models is large, and includes as special cases many well-known models like linear and logistic regression. We propose a heuristic for learning the parameters in a convex optimization model given a dataset of input-output pairs, using recently developed methods for differentiating the solution of a convex optimization problem with respect to its parameters. We describe three general classes of convex optimization models, maximum a posteriori (MAP) models, utility maximization models, and agent models, and present a numerical experiment for each. |

Learning to Optimize Under Constraints with Unsupervised Deep Neural Networks2021-01-03 ${\displaystyle \cong }$ |

In this paper, we propose a machine learning (ML) method to learn how to solve a generic constrained continuous optimization problem. To the best of our knowledge, the generic methods that learn to optimize, focus on unconstrained optimization problems and those dealing with constrained problems are not easy-to-generalize. This approach is quite useful in optimization tasks where the problem's parameters constantly change and require resolving the optimization task per parameter update. In such problems, the computational complexity of optimization algorithms such as gradient descent or interior point method preclude near-optimal designs in real-time applications. In this paper, we propose an unsupervised deep learning (DL) solution for solving constrained optimization problems in real-time by relegating the main computation load to offline training phase. This paper's main contribution is proposing a method for enforcing the equality and inequality constraints to the DL-generated solutions for generic optimization tasks. |

Structured Convex Optimization under Submodular Constraints2013-09-26 ${\displaystyle \cong }$ |

A number of discrete and continuous optimization problems in machine learning are related to convex minimization problems under submodular constraints. In this paper, we deal with a submodular function with a directed graph structure, and we show that a wide range of convex optimization problems under submodular constraints can be solved much more efficiently than general submodular optimization methods by a reduction to a maximum flow problem. Furthermore, we give some applications, including sparse optimization methods, in which the proposed methods are effective. Additionally, we evaluate the performance of the proposed method through computational experiments. |

Differentiable Convex Optimization Layers2019-10-28 ${\displaystyle \cong }$ |

Recent work has shown how to embed differentiable optimization problems (that is, problems whose solutions can be backpropagated through) as layers within deep learning architectures. This method provides a useful inductive bias for certain problems, but existing software for differentiable optimization layers is rigid and difficult to apply to new settings. In this paper, we propose an approach to differentiating through disciplined convex programs, a subclass of convex optimization problems used by domain-specific languages (DSLs) for convex optimization. We introduce disciplined parametrized programming, a subset of disciplined convex programming, and we show that every disciplined parametrized program can be represented as the composition of an affine map from parameters to problem data, a solver, and an affine map from the solver's solution to a solution of the original problem (a new form we refer to as affine-solver-affine form). We then demonstrate how to efficiently differentiate through each of these components, allowing for end-to-end analytical differentiation through the entire convex program. We implement our methodology in version 1.1 of CVXPY, a popular Python-embedded DSL for convex optimization, and additionally implement differentiable layers for disciplined convex programs in PyTorch and TensorFlow 2.0. Our implementation significantly lowers the barrier to using convex optimization problems in differentiable programs. We present applications in linear machine learning models and in stochastic control, and we show that our layer is competitive (in execution time) compared to specialized differentiable solvers from past work. |

Learning the Solution Manifold in Optimization and Its Application in Motion Planning2020-07-24 ${\displaystyle \cong }$ |

Optimization is an essential component for solving problems in wide-ranging fields. Ideally, the objective function should be designed such that the solution is unique and the optimization problem can be solved stably. However, the objective function used in a practical application is usually non-convex, and sometimes it even has an infinite set of solutions. To address this issue, we propose to learn the solution manifold in optimization. We train a model conditioned on the latent variable such that the model represents an infinite set of solutions. In our framework, we reduce this problem to density estimation by using importance sampling, and the latent representation of the solutions is learned by maximizing the variational lower bound. We apply the proposed algorithm to motion-planning problems, which involve the optimization of high-dimensional parameters. The experimental results indicate that the solution manifold can be learned with the proposed algorithm, and the trained model represents an infinite set of homotopic solutions for motion-planning problems. |

Optimizing Wireless Systems Using Unsupervised and Reinforced-Unsupervised Deep Learning2020-01-03 ${\displaystyle \cong }$ |

Resource allocation and transceivers in wireless networks are usually designed by solving optimization problems subject to specific constraints, which can be formulated as variable or functional optimization. If the objective and constraint functions of a variable optimization problem can be derived, standard numerical algorithms can be applied for finding the optimal solution, which however incur high computational cost when the dimension of the variable is high. To reduce the on-line computational complexity, learning the optimal solution as a function of the environment's status by deep neural networks (DNNs) is an effective approach. DNNs can be trained under the supervision of optimal solutions, which however, is not applicable to the scenarios without models or for functional optimization where the optimal solutions are hard to obtain. If the objective and constraint functions are unavailable, reinforcement learning can be applied to find the solution of a functional optimization problem, which is however not tailored to optimization problems in wireless networks. In this article, we introduce unsupervised and reinforced-unsupervised learning frameworks for solving both variable and functional optimization problems without the supervision of the optimal solutions. When the mathematical model of the environment is completely known and the distribution of environment's status is known or unknown, we can invoke unsupervised learning algorithm. When the mathematical model of the environment is incomplete, we introduce reinforced-unsupervised learning algorithms that learn the model by interacting with the environment. Our simulation results confirm the applicability of these learning frameworks by taking a user association problem as an example. |

Warm Starting Bayesian Optimization2016-08-11 ${\displaystyle \cong }$ |

We develop a framework for warm-starting Bayesian optimization, that reduces the solution time required to solve an optimization problem that is one in a sequence of related problems. This is useful when optimizing the output of a stochastic simulator that fails to provide derivative information, for which Bayesian optimization methods are well-suited. Solving sequences of related optimization problems arises when making several business decisions using one optimization model and input data collected over different time periods or markets. While many gradient-based methods can be warm started by initiating optimization at the solution to the previous problem, this warm start approach does not apply to Bayesian optimization methods, which carry a full metamodel of the objective function from iteration to iteration. Our approach builds a joint statistical model of the entire collection of related objective functions, and uses a value of information calculation to recommend points to evaluate. |

Convex Optimization for Linear Query Processing under Approximate Differential Privacy2016-05-16 ${\displaystyle \cong }$ |

Differential privacy enables organizations to collect accurate aggregates over sensitive data with strong, rigorous guarantees on individuals' privacy. Previous work has found that under differential privacy, computing multiple correlated aggregates as a batch, using an appropriate \emph{strategy}, may yield higher accuracy than computing each of them independently. However, finding the best strategy that maximizes result accuracy is non-trivial, as it involves solving a complex constrained optimization program that appears to be non-linear and non-convex. Hence, in the past much effort has been devoted in solving this non-convex optimization program. Existing approaches include various sophisticated heuristics and expensive numerical solutions. None of them, however, guarantees to find the optimal solution of this optimization problem. This paper points out that under ($?$, $?$)-differential privacy, the optimal solution of the above constrained optimization problem in search of a suitable strategy can be found, rather surprisingly, by solving a simple and elegant convex optimization program. Then, we propose an efficient algorithm based on Newton's method, which we prove to always converge to the optimal solution with linear global convergence rate and quadratic local convergence rate. Empirical evaluations demonstrate the accuracy and efficiency of the proposed solution. |

Efficient Computation of Probabilistic Dominance in Robust Multi-Objective Optimization2019-10-18 ${\displaystyle \cong }$ |

Real-world problems typically require the simultaneous optimization of several, often conflicting objectives. Many of these multi-objective optimization problems are characterized by wide ranges of uncertainties in their decision variables or objective functions, which further increases the complexity of optimization. To cope with such uncertainties, robust optimization is widely studied aiming to distinguish candidate solutions with uncertain objectives specified by confidence intervals, probability distributions or sampled data. However, existing techniques mostly either fail to consider the actual distributions or assume uncertainty as instances of uniform or Gaussian distributions. This paper introduces an empirical approach that enables an efficient comparison of candidate solutions with uncertain objectives that can follow arbitrary distributions. Given two candidate solutions under comparison, this operator calculates the probability that one solution dominates the other in terms of each uncertain objective. It can substitute for the standard comparison operator of existing optimization techniques such as evolutionary algorithms to enable discovering robust solutions to problems with multiple uncertain objectives. This paper also proposes to incorporate various uncertainties in well-known multi-objective problems to provide a benchmark for evaluating uncertainty-aware optimization techniques. The proposed comparison operator and benchmark suite are integrated into an existing optimization tool that features a selection of multi-objective optimization problems and algorithms. Experiments show that in comparison with existing techniques, the proposed approach achieves higher optimization quality at lower overheads. |

Differentially Private Convex Optimization with Feasibility Guarantees2020-06-22 ${\displaystyle \cong }$ |

This paper develops a novel differentially private framework to solve convex optimization problems with sensitive optimization data and complex physical or operational constraints. Unlike standard noise-additive algorithms, that act primarily on the problem data, objective or solution, and disregard the problem constraints, this framework requires the optimization variables to be a function of the noise and exploits a chance-constrained problem reformulation with formal feasibility guarantees. The noise is calibrated to provide differential privacy for identity and linear queries on the optimization solution. For many applications, including resource allocation problems, the proposed framework provides a trade-off between the expected optimality loss and the variance of optimization results. |

Scalarizing Functions in Bayesian Multiobjective Optimization2019-04-11 ${\displaystyle \cong }$ |

Scalarizing functions have been widely used to convert a multiobjective optimization problem into a single objective optimization problem. However, their use in solving (computationally) expensive multi- and many-objective optimization problems in Bayesian multiobjective optimization is scarce. Scalarizing functions can play a crucial role on the quality and number of evaluations required when doing the optimization. In this article, we study and review 15 different scalarizing functions in the framework of Bayesian multiobjective optimization and build Gaussian process models (as surrogates, metamodels or emulators) on them. We use expected improvement as infill criterion (or acquisition function) to update the models. In particular, we compare different scalarizing functions and analyze their performance on several benchmark problems with different number of objectives to be optimized. The review and experiments on different functions provide useful insights when using and selecting a scalarizing function when using a Bayesian multiobjective optimization method. |

Fast Large-Scale Discrete Optimization Based on Principal Coordinate Descent2019-09-16 ${\displaystyle \cong }$ |

Binary optimization, a representative subclass of discrete optimization, plays an important role in mathematical optimization and has various applications in computer vision and machine learning. Usually, binary optimization problems are NP-hard and difficult to solve due to the binary constraints, especially when the number of variables is very large. Existing methods often suffer from high computational costs or large accumulated quantization errors, or are only designed for specific tasks. In this paper, we propose a fast algorithm to find effective approximate solutions for general binary optimization problems. The proposed algorithm iteratively solves minimization problems related to the linear surrogates of loss functions, which leads to the updating of some binary variables most impacting the value of loss functions in each step. Our method supports a wide class of empirical objective functions with/without restrictions on the numbers of $1$s and $-1$s in the binary variables. Furthermore, the theoretical convergence of our algorithm is proven, and the explicit convergence rates are derived, for objective functions with Lipschitz continuous gradients, which are commonly adopted in practice. Extensive experiments on several binary optimization tasks and large-scale datasets demonstrate the superiority of the proposed algorithm over several state-of-the-art methods in terms of both effectiveness and efficiency. |

Black Box Algorithm Selection by Convolutional Neural Network2019-12-22 ${\displaystyle \cong }$ |

Although a large number of optimization algorithms have been proposed for black box optimization problems, the no free lunch theorems inform us that no algorithm can beat others on all types of problems. Different types of optimization problems need different optimization algorithms. To deal with this issue, researchers propose algorithm selection to suggest the best optimization algorithm from the algorithm set for a given unknown optimization problem. Usually, algorithm selection is treated as a classification or regression task. Deep learning, which has been shown to perform well on various classification and regression tasks, is applied to the algorithm selection problem in this paper. Our deep learning architecture is based on convolutional neural network and follows the main architecture of visual geometry group. This architecture has been applied to many different types of 2-D data. Moreover, we also propose a novel method to extract landscape information from the optimization problems and save the information as 2-D images. In the experimental section, we conduct three experiments to investigate the classification and optimization capability of our approach on the BBOB functions. The results indicate that our new approach can effectively solve the algorithm selection problem. |

Non-convex Optimization for Machine Learning2017-12-21 ${\displaystyle \cong }$ |

A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low rank are frequently imposed or else the objective itself is designed to be a non-convex function. This is especially true of algorithms that operate in high-dimensional spaces or that train non-linear models such as tensor models and deep networks. The freedom to express the learning problem as a non-convex optimization problem gives immense modeling power to the algorithm designer, but often such problems are NP-hard to solve. A popular workaround to this has been to relax non-convex problems to convex ones and use traditional methods to solve the (convex) relaxed optimization problems. However this approach may be lossy and nevertheless presents significant challenges for large scale optimization. On the other hand, direct approaches to non-convex optimization have met with resounding success in several domains and remain the methods of choice for the practitioner, as they frequently outperform relaxation-based techniques - popular heuristics include projected gradient descent and alternating minimization. However, these are often poorly understood in terms of their convergence and other properties. This monograph presents a selection of recent advances that bridge a long-standing gap in our understanding of these heuristics. The monograph will lead the reader through several widely used non-convex optimization techniques, as well as applications thereof. The goal of this monograph is to both, introduce the rich literature in this area, as well as equip the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems. |

Inexact Derivative-Free Optimization for Bilevel Learning2020-06-22 ${\displaystyle \cong }$ |

Variational regularization techniques are dominant in the field of mathematical imaging. A drawback of these techniques is that they are dependent on a number of parameters which have to be set by the user. A by now common strategy to resolve this issue is to learn these parameters from data. While mathematically appealing this strategy leads to a nested optimization problem (known as bilevel optimization) which is computationally very difficult to handle. A key ingredient in solving the upper-level problem is the exact solution of the lower-level problem which is practically infeasible. In this work we propose to solve these problems using inexact derivative-free optimization algorithms which never require to solve the lower-level problem exactly. We provide global convergence and worst-case complexity analysis of our approach, and test our proposed framework on ROF-denoising and learning MRI sampling patterns. Dynamically adjusting the lower-level accuracy yields learned parameters with similar reconstruction quality as high-accuracy evaluations but with dramatic reductions in computational work (up to 100 times faster in some cases). |

Deep Inverse Optimization2018-12-03 ${\displaystyle \cong }$ |

Given a set of observations generated by an optimization process, the goal of inverse optimization is to determine likely parameters of that process. We cast inverse optimization as a form of deep learning. Our method, called deep inverse optimization, is to unroll an iterative optimization process and then use backpropagation to learn parameters that generate the observations. We demonstrate that by backpropagating through the interior point algorithm we can learn the coefficients determining the cost vector and the constraints, independently or jointly, for both non-parametric and parametric linear programs, starting from one or multiple observations. With this approach, inverse optimization can leverage concepts and algorithms from deep learning. |