Publications

Conference Papers


Faster Accelerated First-order Methods for Convex Optimization with Strongly Convex Function Constraints

Published in 2024

In this paper, we introduce faster accelerated primal-dual algorithms for minimizing a convex function subject to strongly convex function constraints. Prior to our work, the best complexity bound was O(ε⁻¹), regardless of the strong convexity of the constraint functions, and it was unclear whether the strong convexity assumption could enable better convergence results. To address this question, we develop novel techniques to progressively estimate the strong convexity of the Lagrangian function. Our approach, for the first time, effectively leverages the strong convexity of the constraints, obtaining an improved complexity of O(ε⁻¹/²). This rate matches the complexity lower bound for strongly-convex-concave saddle point optimization and is therefore order-optimal. We demonstrate the superior performance of our methods on sparsity-inducing constrained optimization, notably Google’s personalized PageRank problem. Furthermore, we show that a restarted version of the proposed methods can effectively identify the sparsity pattern of the optimal solution within a finite number of steps, a result that appears to be of independent interest.
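
As a minimal sketch of the problem class only (not the paper's accelerated algorithm), the snippet below runs a plain primal-dual gradient iteration on the Lagrangian of a toy convex objective with a strongly convex constraint. The functions, step sizes, and iteration counts are illustrative assumptions.

```python
import numpy as np

# Toy instance: minimize a convex quadratic f(x) subject to a strongly convex
# constraint g(x) <= 0.  This is a plain (non-accelerated) primal-dual gradient
# sketch for intuition only, not the faster method proposed in the paper.

def f(x):              # convex objective
    return 0.5 * np.sum((x - 2.0) ** 2)

def grad_f(x):
    return x - 2.0

def g(x):              # strongly convex constraint: 0.5*||x||^2 - 1 <= 0
    return 0.5 * np.dot(x, x) - 1.0

def grad_g(x):
    return x

def primal_dual(x0, lam0=0.0, eta_x=0.1, eta_lam=0.1, iters=2000):
    x, lam = x0.copy(), lam0
    for _ in range(iters):
        # descent step on the Lagrangian L(x, lam) = f(x) + lam * g(x)
        x = x - eta_x * (grad_f(x) + lam * grad_g(x))
        # projected ascent step on the multiplier (lam >= 0)
        lam = max(0.0, lam + eta_lam * g(x))
    return x, lam

x_star, lam_star = primal_dual(np.zeros(5))
print(x_star, g(x_star), lam_star)
```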

Download Paper

A Single-Loop Robust Policy Gradient Method for Robust Markov Decision Processes

Published in 2024

Robust Markov Decision Processes (RMDPs) have recently been recognized as a valuable and promising approach to discovering a policy with reliable performance, particularly in the presence of a dynamic environment and estimation errors in the transition matrix due to limited data. Despite extensive exploration of dynamic programming algorithms for solving RMDPs, there has been a notable upswing of interest in developing efficient algorithms based on the policy gradient method. In this paper, we propose the first single-loop robust policy gradient (SRPG) method with a global optimality guarantee for solving RMDPs through their minimax formulation. Moreover, we complement the convergence analysis of the nonconvex-nonconcave min-max optimization problem with the objective function’s gradient dominance property, which has not been explored in the prior literature. Numerical experiments validate the efficacy of SRPG, demonstrating its faster and more robust convergence behavior compared to its nested-loop counterpart.
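
To make the "single-loop" structure concrete, the sketch below contrasts one simultaneous descent/ascent step per iteration with a nested loop that first (approximately) solves the inner maximization. The toy min-max objective, function names, and step sizes are assumptions for illustration; this is not the paper's SRPG update for RMDPs.

```python
import numpy as np

# Schematic single-loop vs. nested-loop updates for min_theta max_p phi(theta, p),
# with toy objective phi(theta, p) = sin(theta)*p + 0.1*theta^2 - 0.5*p^2.

def grad_theta(theta, p):
    return np.cos(theta) * p + 0.2 * theta      # d(phi)/d(theta)

def grad_p(theta, p):
    return np.sin(theta) - p                    # d(phi)/d(p)

def single_loop(theta, p, eta=0.05, iters=2000):
    for _ in range(iters):
        # one descent step on theta and one ascent step on p per iteration
        theta, p = theta - eta * grad_theta(theta, p), p + eta * grad_p(theta, p)
    return theta, p

def nested_loop(theta, p, eta=0.05, outer=200, inner=50):
    for _ in range(outer):
        for _ in range(inner):                  # approximately solve the inner max first
            p = p + eta * grad_p(theta, p)
        theta = theta - eta * grad_theta(theta, p)
    return theta, p

print("single-loop:", single_loop(1.0, 0.0))
print("nested-loop:", nested_loop(1.0, 0.0))
```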

Download Paper

Decentralized Gradient-Free Methods for Stochastic Non-smooth Non-convex Optimization

Published in 2024

We consider decentralized gradient-free optimization for minimizing Lipschitz continuous functions that are neither smooth nor convex. We propose two novel gradient-free algorithms, the Decentralized Gradient-Free Method (DGFM) and its variant, the Decentralized Gradient-Free Method+ (DGFM+). Based on the techniques of randomized smoothing and gradient tracking, DGFM requires only a single-sample zeroth-order oracle computation in each iteration, making it less demanding in terms of computational resources for individual computing nodes. Theoretically, DGFM achieves a complexity of O(d³/²δ⁻¹ε⁻⁴) for obtaining a (δ,ε)-Goldstein stationary point. DGFM+, an advanced version of DGFM, incorporates variance reduction to further improve the convergence behavior. It samples a mini-batch at each iteration and periodically draws a larger batch of data, which improves the complexity to O(d³/²δ⁻¹ε⁻³). Moreover, experimental results underscore the empirical advantages of our proposed algorithms when applied to real-world datasets.
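
The randomized-smoothing zeroth-order oracle at the core of such methods can be sketched in a few lines. The two-point spherical estimator below is the standard generic construction, shown for intuition; the smoothing radius and sampling details are assumptions, not necessarily those used by DGFM/DGFM+.

```python
import numpy as np

def zo_gradient(f, x, delta=1e-2, rng=np.random.default_rng(0)):
    """Two-point randomized-smoothing gradient estimate of f at x.

    f only needs to be evaluable; no smoothness or convexity is required."""
    u = rng.standard_normal(x.shape)
    u /= np.linalg.norm(u)                      # random direction on the unit sphere
    d = x.size
    return (d / (2.0 * delta)) * (f(x + delta * u) - f(x - delta * u)) * u

# Example on a non-smooth function:
f = lambda x: np.abs(x).sum()
print(zo_gradient(f, np.array([1.0, -2.0, 0.5])))
```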

Download Paper

Preprints


PDCS: A Primal-Dual Large-Scale Conic Programming Solver with GPU Enhancements

Published in 2025

In this paper, we introduce the Primal-Dual Conic Programming Solver (PDCS), a large-scale conic programming solver with GPU enhancements. Problems that PDCS currently supports include linear programs, second-order cone programs, convex quadratic programs, and exponential cone programs. PDCS achieves scalability to large-scale problems by leveraging sparse matrix-vector multiplication as its core computational operation, which is both memory-efficient and well-suited for GPU acceleration. The solver is based on the restarted primal-dual hybrid gradient method but further incorporates several enhancements, including adaptive reflected Halpern restarts, adaptive step-size selection, adaptive weight adjustment, and diagonal rescaling. Additionally, PDCS employs a bijection-based method to compute projections onto rescaled cones. cuPDCS, the GPU implementation of PDCS, implements customized computational schemes that utilize different levels of the GPU architecture to handle cones of different types and sizes. Numerical experiments demonstrate that cuPDCS is generally more efficient than state-of-the-art commercial solvers and other first-order methods on large-scale conic program applications, including Fisher market equilibrium problems, Lasso regression, and multi-period portfolio optimization. Furthermore, cuPDCS also exhibits better scalability, efficiency, and robustness than other first-order methods on the conic program benchmark dataset CBLIB. These advantages are more pronounced in large-scale, lower-accuracy settings.
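
For orientation, here is a bare-bones (restart-free) primal-dual hybrid gradient iteration for a linear program in the form min cᵀx s.t. Ax = b, x ≥ 0, with sparse matrix-vector products as the only heavy operation. This is a textbook PDHG sketch under assumed step-size choices and made-up function names; PDCS/cuPDCS layer restarts, adaptive step sizes, rescaling, cone projections, and GPU kernels on top of this skeleton.

```python
import numpy as np
import scipy.sparse as sp

def op_norm_est(A, iters=50, seed=0):
    """Estimate ||A||_2 by power iteration on A^T A (sparse mat-vecs only)."""
    v = np.random.default_rng(seed).standard_normal(A.shape[1])
    for _ in range(iters):
        v = A.T @ (A @ v)
        v /= np.linalg.norm(v)
    return np.sqrt(np.linalg.norm(A.T @ (A @ v)))

def pdhg_lp(A, b, c, iters=5000):
    """Vanilla PDHG for  min c^T x  s.t.  Ax = b, x >= 0  (no restarts)."""
    m, n = A.shape
    step = 0.9 / op_norm_est(A)
    x, y = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        x_new = np.maximum(x - step * (c - A.T @ y), 0.0)   # projected primal step
        y = y + step * (b - A @ (2 * x_new - x))            # extrapolated dual step
        x = x_new
    return x, y

# Tiny example:  min x1 + 2*x2  s.t.  x1 + x2 = 1,  x >= 0   (optimum x = (1, 0))
A = sp.csr_matrix(np.array([[1.0, 1.0]]))
x, y = pdhg_lp(A, np.array([1.0]), np.array([1.0, 2.0]))
print(np.round(x, 3), np.round(y, 3))
```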

Download Paper

Uniformly Optimal and Parameter-free First-order Methods for Convex and Function-constrained Optimization

Published in 2024

This paper presents new first-order methods for achieving optimal oracle complexities in convex optimization with convex functional constraints, where oracle complexity is measured by the number of function and gradient evaluations. To achieve this, we enable first-order methods to utilize computational oracles for solving diagonal quadratic programs in subproblems. For problems where the optimal value f* is known, such as those arising in overparameterized models and feasibility problems, we propose an accelerated first-order method that incorporates a modified Polyak step size and Nesterov’s momentum. Notably, our method does not require knowledge of smoothness levels, the Hölder continuity parameter of the gradient, or additional line search, yet achieves the optimal oracle complexity bound of O(ε⁻²/⁽¹⁺ᵖ⁾) under Hölder smoothness conditions. When f* is unknown, we reformulate the problem as finding the root of the optimal value function and develop an inexact fixed-point iteration and a secant method to compute f*. These root-finding subproblems are solved inexactly using first-order methods to a specified relative accuracy. We employ the accelerated prox-level (APL) method, which is proven to be uniformly optimal for convex optimization with simple constraints. Our analysis demonstrates that APL-based level-set methods also achieve the optimal oracle complexity of O(ε⁻²/⁽¹⁺ᵖ⁾) for convex function-constrained optimization, without requiring knowledge of any problem-specific structures. Through experiments on various tasks, we demonstrate the advantages of our methods over existing approaches in function-constrained optimization.
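
The Polyak step-size ingredient for the known-f* regime is easy to sketch: it needs only f(x), a (sub)gradient, and f*, with no smoothness constants or line search. The plain gradient version below, without Nesterov's momentum or the functional constraints, is an illustration under assumed problem data and made-up names, not the paper's accelerated method.

```python
import numpy as np

def polyak_gradient(f, grad_f, x0, f_star=0.0, iters=200):
    """Gradient descent with the Polyak step (f(x) - f*) / ||grad f(x)||^2."""
    x = x0.copy()
    for _ in range(iters):
        g = grad_f(x)
        gap = f(x) - f_star
        if gap <= 0 or not np.any(g):
            break
        x = x - (gap / np.dot(g, g)) * g
    return x

# Example with known optimal value f* = 0 (consistent least squares):
A = np.array([[2.0, 0.0], [0.0, 1.0]])
b = np.array([2.0, 3.0])
f = lambda x: 0.5 * np.sum((A @ x - b) ** 2)
grad_f = lambda x: A.T @ (A @ x - b)
print(polyak_gradient(f, grad_f, np.zeros(2)))   # approaches [1, 3]
```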

Download Paper

A Low-Rank ADMM Splitting Approach for Semidefinite Programming

Published in 2024

We introduce a new first-order method for solving general semidefinite programming problems, based on the alternating direction method of multipliers (ADMM) and a matrix-splitting technique. Our algorithm has an advantage over the Burer-Monteiro approach as it only involves much easier quadratically regularized subproblems in each iteration. For a linear objective, the subproblems are well-conditioned quadratic programs that can be efficiently solved by the standard conjugate gradient method. We show that the ADMM algorithm achieves sublinear or linear convergence rates to the KKT solutions under different conditions. Building on this theoretical development, we present LoRADS, a new solver for linear SDP based on the Low-Rank ADMM Splitting approach. LoRADS incorporates several strategies that significantly increase its efficiency. First, it initiates with a warm-start phase that uses the Burer-Monteiro approach. Moreover, motivated by the low-rank theory for SDP [So et al., 2008], LoRADS chooses an initial rank of logarithmic order and then employs a dynamic approach to increase the rank. Numerical experiments indicate that LoRADS exhibits promising performance on various SDP problems. A noteworthy achievement of LoRADS is its successful solution of a matrix completion problem with 15,694,167 constraints and a matrix variable of size 40,000 × 40,000 in 351 seconds.
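
The Burer-Monteiro factorization used in the warm-start phase, together with a logarithmic initial rank, can be illustrated on the max-cut SDP relaxation min ⟨C, X⟩ s.t. diag(X) = 1, X ⪰ 0, by optimizing over X = RRᵀ with unit-norm rows. The projected-gradient toy below is a generic sketch under assumed data and names; LoRADS itself proceeds via an ADMM splitting with quadratically regularized subproblems and dynamic rank updates, which this snippet does not reproduce.

```python
import numpy as np

def burer_monteiro_maxcut(C, rank=None, iters=500, eta=0.01, seed=0):
    """Projected gradient on R with X = R R^T and diag(X) = 1 (unit-norm rows)."""
    n = C.shape[0]
    r = rank or max(2, int(np.ceil(np.log2(n))))        # logarithmic initial rank
    R = np.random.default_rng(seed).standard_normal((n, r))
    R /= np.linalg.norm(R, axis=1, keepdims=True)
    for _ in range(iters):
        G = 2.0 * (C @ R)                               # gradient of <C, R R^T> in R
        R = R - eta * G
        R /= np.linalg.norm(R, axis=1, keepdims=True)   # re-project rows to the sphere
    return R @ R.T

# 4-node cycle: minimizing <W, X> drives adjacent entries of X toward -1
W = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], dtype=float)
print(np.round(burer_monteiro_maxcut(W), 2))
```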

Download Paper