Logarithmic regret algorithms for online convex optimization (Elad Hazan et al.). Being logarithmically convex is a strictly stronger property than being convex: a logarithmically convex function f is convex, since it is the composition of the increasing convex function exp with the convex function log f. The major hurdle in extending these algorithms to integration and optimization of log-concave functions is the lack of a provably rapidly mixing random walk with a similarly mild logarithmic dependence on the starting point. The methods described here are general enough to capture most existing algorithms within a single, simple, and generic analysis, and they lie at the heart of several recent advances in prediction theory. Leveraging results from geometry and convex analysis, this line of work furthers our understanding of the role of curvature in optimization. In the online setting, after each point is chosen, it encounters a sequence of possibly unrelated convex cost functions. Related threads include adaptive algorithms for online convex optimization and "Logarithmic Regret Algorithms for Strongly Convex Repeated Games" by Shai Shalev-Shwartz and Yoram Singer.
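The composition argument can be made precise in two lines; the following derivation is a standard fact, included only to complete the reasoning above:

```latex
% Log-convexity implies convexity: write f = e^g with g = log f convex.
% For t in [0,1]:
\[
f(tx + (1-t)y) = e^{g(tx + (1-t)y)}
  \le e^{t\,g(x) + (1-t)\,g(y)}
  \le t\, e^{g(x)} + (1-t)\, e^{g(y)}
  = t f(x) + (1-t) f(y),
\]
% using first that exp is increasing and g is convex,
% and then that exp is itself convex.
```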
Many classes of convex optimization problems admit polynomial-time algorithms, whereas mathematical optimization is NP-hard in general. We show that the curvature of the decision maker's decision set has a major impact on the growth rate of the minimax regret with respect to the time horizon. It is known that the curvature of the feasible set in convex optimization allows for algorithms with better convergence rates, and there has been renewed interest in this topic for offline as well as online problems. This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth (EuroCOLT 1999) and universal portfolios by Cover (Mathematical Finance, 1991). Related work gives algorithms with logarithmic or sublinear regret for constrained contextual bandits.
Previous approaches for online convex optimization are based on first-order gradient descent methods. We will write C for the set of possible gradient vectors c_t. At the time of each decision, the outcomes associated with the choices are unknown to the player; after committing to a decision, the decision maker suffers a loss. On the role of geometry, see "Curvature of Feasible Sets in Offline and Online Optimization"; for general background, see Understanding Machine Learning by Shai Shalev-Shwartz and Shai Ben-David. The core reference throughout is Elad Hazan, Adam Kalai, Satyen Kale, and Amit Agarwal, "Logarithmic Regret Algorithms for Online Convex Optimization".
Our algorithms below are stated in terms of linear loss functions. In this paper, we give algorithms that achieve regret O(log T) for an arbitrary sequence of strictly convex functions with bounded first and second derivatives. Zinkevich (ICML 2003) introduced this framework, which models many natural repeated decision-making tasks. We propose several algorithms achieving logarithmic regret, which besides being more general are also much more efficient to implement. The online convex optimization problem becomes more challenging when the player only receives partial feedback on the choices of the adversary.
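As a concrete sketch of how logarithmic regret arises for curved losses, here is a minimal online gradient descent loop with the 1/(H·t) step schedule appropriate for H-strongly convex losses. This is an illustrative implementation, not the authors' pseudocode; the unit-ball domain and the quadratic losses in the usage example are assumptions made for the sketch.

```python
import numpy as np

def project_to_ball(x, radius=1.0):
    """Euclidean projection onto the feasible set {x : ||x||_2 <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def ogd_strongly_convex(rounds, dim, H):
    """Online gradient descent with step size 1/(H*t).

    For H-strongly convex losses with bounded gradients, this schedule
    yields O(log T) regret, versus the O(sqrt(T)) of a 1/sqrt(t) schedule.
    """
    x = np.zeros(dim)
    total = 0.0
    for t, (loss_fn, grad_fn) in enumerate(rounds, start=1):
        total += loss_fn(x)                   # suffer the loss at the current point
        g = grad_fn(x)                        # then observe gradient feedback
        x = project_to_ball(x - g / (H * t))  # step and project back to the domain
    return x, total

# Illustrative use: quadratic losses f_t(x) = ||x - z_t||^2, so H = 2.
rng = np.random.default_rng(0)
targets = [0.1 * rng.normal(size=3) for _ in range(1000)]
rounds = [(lambda x, z=z: float(np.sum((x - z) ** 2)),
           lambda x, z=z: 2.0 * (x - z)) for z in targets]
x_T, cumulative_loss = ogd_strongly_convex(rounds, dim=3, H=2.0)
```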
External regret with a static comparison class compares to the single best fixed strategy in a class H, where H is fixed beforehand. The logarithm is used for the logarithmic derivative trick, which turns products into sums (see the identity displayed below). See also "Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback". A convex optimization problem has the standard form: minimize f_0(x) subject to f_i(x) <= b_i, i = 1, ..., m. Second, we study the minimax achievable regret in the online convex optimization framework when the loss function is piecewise linear. Convex optimization is a subfield of mathematical optimization that studies the problem of minimizing convex functions over convex sets. This algorithm requires the problem domain to be a polytope, instead of an arbitrary convex set. A consequence of this is that we can guarantee at most constant regret with respect to the origin, x = 0. The regret achieved by these algorithms is proportional to the square root of the number of iterations.
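For completeness, the logarithmic derivative trick in symbols (a standard identity for positive factors, stated here only to make the remark above concrete):

```latex
% Differentiating a product of positive functions via the log
% turns the product into a sum:
\[
\frac{d}{dx}\,\log \prod_{i=1}^{n} f_i(x)
  = \sum_{i=1}^{n} \frac{f_i'(x)}{f_i(x)},
\qquad\text{hence}\qquad
\Big(\prod_{i} f_i\Big)' = \Big(\prod_{i} f_i\Big)\sum_{i} \frac{f_i'}{f_i}.
\]
% Extrema of a positive product (often involving exponentials) can
% therefore be located by setting the sum of logarithmic derivatives
% to zero.
```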
A geometric program, or GP, is a type of optimization problem whose objective and constraints are built from posynomial functions; although not convex in its natural form, a GP can be transformed into a convex problem, allowing one to solve a useful class of nonlinear programming problems. A guarantee against a single fixed strategy sometimes cannot capture simple patterns in the data, and the regret grows with the time horizon; this motivates stronger notions such as adaptive regret. Related reading: Exact Convex Confidence-Weighted Learning (2008); Online Passive-Aggressive Algorithms (2006); Logarithmic Regret Algorithms for Online Convex Optimization (2007); A Second-Order Perceptron Algorithm (2005); Online Learning with Kernels (2004); Solving Large Scale Linear Prediction Problems Using Stochastic Gradient Descent Algorithms (2004). This problem of linear prediction with a convex loss function has been well studied (see, e.g., Cesa-Bianchi and Lugosi, 2006, and references therein). The simplest convex program is the unconstrained problem with a strongly convex objective. The subgradient method is a simple algorithm for minimizing a non-differentiable convex function and, more generally, for solving convex optimization problems (a short sketch follows this paragraph). In bandit linear optimization, only the loss actually incurred is revealed, not the complete loss function. On adaptivity, see Adaptive Subgradient Methods for Online Learning and Stochastic Optimization. Critically, our algorithms provide this guarantee simultaneously for all x ∈ R^n, without any need to know R in advance; see "No-Regret Algorithms for Unconstrained Online Convex Optimization".
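The subgradient method admits a very short implementation. The sketch below is a generic version with the common 1/sqrt(t) diminishing step size; the absolute-value loss in the example is an assumption chosen for illustration, and the best iterate is tracked because the method is not a descent method:

```python
import numpy as np

def subgradient_method(subgrad, f, x0, steps=1000):
    """Minimize a (possibly non-differentiable) convex function f.

    Uses the diminishing step size 1/sqrt(t) and keeps the best iterate
    seen so far, since individual subgradient steps need not decrease f.
    """
    x = np.asarray(x0, dtype=float)
    best_x, best_f = x.copy(), f(x)
    for t in range(1, steps + 1):
        g = subgrad(x)
        x = x - g / np.sqrt(t)        # diminishing step size
        fx = f(x)
        if fx < best_f:               # subgradient steps can overshoot
            best_x, best_f = x.copy(), fx
    return best_x, best_f

# Example: f(x) = ||x - z||_1, non-differentiable at the coordinates of z.
z = np.array([1.0, -2.0, 0.5])
f = lambda x: float(np.abs(x - z).sum())
subgrad = lambda x: np.sign(x - z)    # a valid subgradient everywhere
x_star, f_star = subgradient_method(subgrad, f, x0=np.zeros(3))
```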
The online convex optimization framework permits this example, because the function f_t(x) = l(q_t · x, p_t) is a convex function of x. Logarithmic transformation is a method used to change geometric programs into their convex forms (illustrated below); this is useful because the functions to optimize are often products, frequently involving exponentials, and locating the extrema requires differentiation. More broadly, continuous optimization methods have played a major role in the development of fast algorithms for problems arising in areas such as theoretical computer science, discrete optimization, data science, statistics, and machine learning. In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space from a fixed feasible set. In addition, our algorithms and results can be viewed as generalizations of existing results for the problem of predicting from expert advice, where it is known that one can achieve logarithmic regret for suitable loss functions (Cesa-Bianchi and Lugosi, 2006, and references therein). The main paper appeared in the Machine Learning journal, Volume 69, Issue 2-3. A further related title is "On Iteratively Reweighted Algorithms for Nonsmooth Nonconvex Optimization".
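To see the transformation in the smallest case, consider a single monomial; the example below is generic, not drawn from any of the papers cited:

```latex
% A monomial objective f(x) = c * x_1^{a_1} x_2^{a_2} with c > 0, x > 0
% becomes affine after the change of variables y_i = log x_i:
\[
\log f(e^{y_1}, e^{y_2}) = \log c + a_1 y_1 + a_2 y_2 .
\]
% A posynomial (a sum of monomials) likewise becomes a log-sum-exp of
% affine functions, which is convex, so the entire geometric program
% becomes a convex program in the variables y.
```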
The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field. Further, we design algorithms and obtain similar regret bounds for more general systems with unknown context distribution and heterogeneous costs. This technique can be applied to a wide range of online convex optimization problems. Obviously, adaptive regret is a strict generalization of standard regret. Related titles include "Optimal Stochastic Strongly Convex Optimization with a ..." and "The Convex Optimization Approach to Regret Minimization". The subgradient method's complexity in terms of problem size is very good, since each iteration is cheap, but in terms of accuracy it is very poor: the algorithm typically requires thousands or millions of iterations.
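A minimal sketch in the spirit of that Newton-based idea (an Online Newton Step style update). The exp-concavity constant gamma, the log-loss gradients in the example, and the plain Euclidean projection, which stands in for the generalized projection of the full algorithm, are all simplifying assumptions here, not the paper's exact pseudocode:

```python
import numpy as np

def online_newton_step(rounds, dim, gamma=0.5, eps=1.0, radius=1.0):
    """Online Newton Step (sketch).

    Maintains A_t = eps*I + sum of outer products of observed gradients
    and updates x via a Newton-like step A_t^{-1} g_t. For exp-concave
    losses this style of update yields O(d log T) regret; the Euclidean
    projection below simplifies the generalized (A_t-norm) projection.
    """
    x = np.zeros(dim)
    A = eps * np.eye(dim)
    for grad_fn in rounds:
        g = grad_fn(x)
        A += np.outer(g, g)                     # accumulate second-order info
        x = x - np.linalg.solve(A, g) / gamma   # Newton-like step
        norm = np.linalg.norm(x)
        if norm > radius:                       # simplified projection step
            x *= radius / norm
    return x

# Illustrative use with gradients of f_t(x) = -log(r_t . x + 1.1),
# a small exp-concave loss of portfolio type (hypothetical data).
rng = np.random.default_rng(1)
rs = [rng.uniform(-0.05, 0.05, size=4) for _ in range(500)]
rounds = [(lambda x, r=r: -r / (r @ x + 1.1)) for r in rs]
x_final = online_newton_step(rounds, dim=4)
```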