Equivalent Reformulations

In this lecture, we discuss equivalent reformulations, i.e. reformulations of an initial problem of the form

(1)   \begin{equation*}\begin{aligned}& \underset{x}{\text{minimize}}& & f_0(x) \\& \text{subject to}& & f_i(x) \leq b_i, \; i = 1, \ldots, m \\& & & g_i(x) = 0, \; i = 1, \ldots, p.\end{aligned}\end{equation*}

A very interesting question is the following. Assume we have two equivalent problems \mathcal{P}_1 and \mathcal{P}_2. Are their duals (hereby denoted by \mathcal{D}_1 and \mathcal{D}_2, respectively) the same? This lecture gives several examples showing that the dual problems can turn out very different. We will discuss three main types of equivalent reformulations:

  • Introducing new variables to the problem
  • Transforming the cost, i.e. replacing it by an increasing function of the cost
  • Absorbing constraints into the domain of the cost, a.k.a. implicit constraints.

Introducing New Variables

We start our lecture by introducing auxiliary variables into the main problem. Consider the following unconstrained minimization problem

(2)   \begin{equation*}(\mathcal{P}_1): \underset{x}{\text{minimize }} f_0(Ax+b)\end{equation*}

An easily obtained equivalent problem replaces Ax + b by y and imposes y = Ax+b as a new constraint, namely

(3)   \begin{equation*}(\mathcal{P}_2):\begin{aligned}& \underset{x,y}{\text{minimize}}& & f_0(y) \\& \text{subject to}& & y = Ax+b\end{aligned}\end{equation*}

To get the dual problems, we first need the Lagrangian dual functions. Since (\mathcal{P}_1) has no constraints, its dual is trivially

(4)   \begin{equation*}(\mathcal{D}_1):   \text{maximize } p^* \end{equation*}

where p^* is the optimal value of (\mathcal{P}_1). However, the dual problem of (\mathcal{P}_2) is obtained by first computing the Lagrangian function

(5)   \begin{equation*} \mathcal{L}(x,y,\nu) = f_0(y) + \nu^T ( Ax + b - y) \end{equation*}

Next, we compute the Lagrangian dual function

(6)   \begin{equation*} \begin{split} g(\nu) &= \inf_{x,y} \lbrace \mathcal{L}(x,y,\nu) \rbrace \\&= \inf_{x,y} \lbrace f_0(y) + \nu^T ( Ax + b - y) \rbrace \\&= \inf_{x,y} \lbrace f_0(y) + \nu^T ( b - y) + \nu^T Ax \rbrace \\&= \inf_{y} \Big\lbrace \inf_{x} \lbrace f_0(y) + \nu^T ( b - y) + \nu^T Ax \rbrace \Big\rbrace \\&= \inf_{y} \Big\lbrace f_0(y) + \nu^T ( b - y) + \inf_{x} \lbrace\nu^T Ax \rbrace \Big\rbrace \\\end{split}\end{equation*}

The term \inf_{x} \lbrace\nu^T Ax \rbrace is -\infty except in the more interesting case A^T \nu = 0, where the infimum becomes 0, so the above reads

(7)   \begin{equation*}\begin{split}g(\nu) &= \inf_{y} \Big\lbrace f_0(y) + \nu^T ( b - y) \Big\rbrace \\&=b^T \nu + \inf_{y} \Big\lbrace f_0(y) - \nu^T y \Big\rbrace \\&=b^T \nu - \sup_{y} \Big\lbrace \nu^T y - f_0(y) \Big\rbrace \\&= b^T \nu - f_0^*(\nu) \end{split} \end{equation*}

where f_0^*(\nu) is the conjugate function of f_0(x). Finally the dual becomes
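To make the conjugate function concrete, here is a minimal numerical sketch (a scalar toy example, not from the lecture): for f_0(y) = y^2, the conjugate f_0^*(\nu) = \sup_y \lbrace \nu y - y^2 \rbrace equals \nu^2/4, attained at y = \nu/2, which we can approximate with a grid search.

```python
import numpy as np

# Hypothetical scalar example: f0(y) = y^2, whose conjugate is
# f0*(nu) = sup_y { nu*y - y^2 } = nu^2 / 4, attained at y = nu/2.
nu = 3.0
y = np.linspace(-10, 10, 200001)          # fine grid containing y = 1.5
conj_numeric = np.max(nu * y - y**2)      # grid approximation of the sup
conj_closed = nu**2 / 4                   # closed-form value

print(conj_numeric, conj_closed)          # both 2.25
```

The same grid idea works in higher dimensions, but closed forms (as used in the derivations below) are preferable whenever available.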

(8)   \begin{equation*}(\mathcal{D}_2):\begin{aligned}& \underset{\nu}{\text{maximize}}& & b^T \nu - f_0^*(\nu) \\& \text{subject to}& & A^T \nu = 0\end{aligned}\end{equation*}

Indeed, (\mathcal{D}_1) and (\mathcal{D}_2) are very different. The lecture treats two applications of the above concept by considering two different forms of f_0(x): f_0(x) = \log(\sum\limits_{i=1}^m \exp(a_i^T x + b_i)) and f_0(x) = \Vert Ax - b \Vert.
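For the norm example, the conjugate of f_0 = \Vert \cdot \Vert is the indicator of the dual-norm unit ball, so (\mathcal{D}_2) specializes (up to a sign convention on \nu) to maximizing b^T \nu subject to A^T \nu = 0 and \Vert \nu \Vert_* \leq 1. A minimal numerical check for the Euclidean norm, with random data assumed: the least-squares residual furnishes a dual point with zero duality gap.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))    # overdetermined, so the residual is nonzero
b = rng.standard_normal(8)

# Primal: minimize ||Ax - b||_2; the least-squares solution minimizes the norm too.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x_star                 # optimal residual
p_star = np.linalg.norm(r)

# Dual: maximize b^T nu  s.t.  A^T nu = 0, ||nu||_2 <= 1 (Euclidean norm is self-dual).
# The normal equations give A^T r = 0, so nu = r/||r|| is dual feasible,
# and b^T nu = r^T r / ||r|| = ||r|| = p_star: zero duality gap.
nu = r / p_star
print(np.linalg.norm(A.T @ nu))    # ~0: dual feasibility
print(b @ nu, p_star)              # equal: strong duality
```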

Cost Transformation

The second type of reformulation simply transforms the cost: instead of minimizing g(x), you minimize g^2(x) (valid when g is nonnegative), or more generally f(g(x)) where f is an increasing function. As an example, consider the following unconstrained minimization problem

(9)   \begin{equation*}(\mathcal{P}_1): \underset{x}{\text{minimize }} \Vert Ax - b \Vert\end{equation*}

As before, the dual of this unconstrained problem is trivially

(10)   \begin{equation*}(\mathcal{D}_1): \text{maximize } p^*\end{equation*}

where p^* is the optimal value of (\mathcal{P}_1). Now consider the problem with the squared cost and an auxiliary variable

(11)   \begin{equation*}(\mathcal{P}_3):\begin{aligned}& \underset{x,y}{\text{minimize}}& & \Vert y \Vert^2 \\& \text{subject to}& & y = Ax-b\end{aligned}\end{equation*}

The Lagrangian of the above problem is

(12)   \begin{equation*}\mathcal{L}(x,y,\nu) = \Vert y \Vert^2 + \nu^T(y - Ax + b)\end{equation*}

Now let us compute the Lagrange dual function as

(13)   \begin{equation*}\begin{split}g(\nu) &= \inf_{x,y} \mathcal{L}(x,y,\nu) \\&= \inf_{x,y} \lbrace \Vert y \Vert^2 + \nu^T(y - Ax + b) \rbrace \\&= \inf_{x,y} \lbrace \Vert y \Vert^2 + \nu^T(y + b) - \nu^T A x \rbrace \\&= \inf_{y} \Big\lbrace \Vert y \Vert^2 + \nu^T(y + b) + \inf_{x} \lbrace -\nu^T A x \rbrace \Big\rbrace \end{split}\end{equation*}

As previously, we have

(14)   \begin{equation*}\inf_{x} \lbrace -\nu^T A x \rbrace=\begin{cases}0 & \text{ if } A^T \nu = 0 \\-\infty & \text{ else}\end{cases}\end{equation*}

where we will be interested in the case A^T \nu = 0, since otherwise g(\nu) = -\infty. So,

(15)   \begin{equation*}\begin{split}g(\nu) &= \inf_{y}\lbrace \Vert y \Vert^2 + \nu^T(y + b) \rbrace \\&= b^T \nu + \inf_{y}\lbrace \Vert y \Vert^2 + \nu^T y \rbrace \\&= b^T \nu - \sup_{y}\lbrace (-\nu)^T y - \Vert y \Vert^2 \rbrace \\&= b^T \nu - \frac{1}{4}\Vert -\nu \Vert_*^2 \\&= b^T \nu - \frac{1}{4}\Vert \nu \Vert_*^2\end{split}\end{equation*}

where \Vert \cdot \Vert_* is the dual norm of \Vert \cdot \Vert, and we used the fact that the conjugate of \Vert y \Vert^2 is \frac{1}{4}\Vert \nu \Vert_*^2. Finally, the dual problem is

(16)   \begin{equation*}(\mathcal{D}_3): \begin{aligned} & \underset{\nu}{\text{maximize}} & & b^T \nu - \frac{1}{4}\Vert \nu \Vert_*^2 \\& \text{subject to}& & A^T \nu = 0\end{aligned}\end{equation*}

which is very different from (\mathcal{D}_1).
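Since the conjugate of \Vert y \Vert^2 is \frac{1}{4}\Vert \nu \Vert_*^2 and the Euclidean norm is its own dual, the dual objective specializes to b^T \nu - \frac{1}{4}\Vert \nu \Vert_2^2 in the Euclidean case. A minimal numerical sanity check, with random data assumed: twice the least-squares residual is an optimal dual point.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 3))    # overdetermined => nonzero residual
b = rng.standard_normal(8)

# Primal: minimize ||Ax - b||_2^2, solved by least squares.
x_star, *_ = np.linalg.lstsq(A, b, rcond=None)
r = b - A @ x_star
p_star = r @ r

# Dual (D_3), Euclidean case: maximize b^T nu - (1/4)||nu||_2^2  s.t.  A^T nu = 0.
# nu = 2r is feasible (A^T r = 0 by the normal equations), and
# b^T(2r) - (1/4)||2r||^2 = 2||r||^2 - ||r||^2 = ||r||^2 = p_star.
nu = 2 * r
d_star = b @ nu - 0.25 * (nu @ nu)
print(p_star, d_star)              # equal: zero duality gap
```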

Constraint Absorption

The last type of reformulation we discuss absorbs constraints into the cost or, more formally speaking, turns them into implicit constraints. Let us see an example

(17)   \begin{equation*}(\mathcal{P}_1):\begin{aligned}& \underset{x}{\text{minimize}}& & c^T x \\& \text{subject to}& & Ax = b \\& & & l \leq x \leq u\end{aligned}\end{equation*}

which is the same as

(18)   \begin{equation*}(\mathcal{P}_1):\begin{aligned}& \underset{x}{\text{minimize}}& & c^T x \\& \text{subject to}& & Ax = b \\& & & x - u \leq 0 \; ; \; l - x \leq 0\end{aligned}\end{equation*}

The Lagrangian of the above problem is

(19)   \begin{equation*}\mathcal{L}(x,\lambda,\nu)=c^T x + \nu^T(Ax - b) + \lambda_1^T (x - u) + \lambda_2^T(-x + l)\end{equation*}


where we stack the multipliers as

(20)   \begin{equation*}\lambda=\begin{bmatrix}\lambda_1 \\ \lambda_2\end{bmatrix}\end{equation*}

Now we compute the Lagrangian dual function as

(21)   \begin{equation*}\begin{split}g(\lambda,\nu) &= \inf_{x} \lbrace c^T x + \nu^T(Ax - b) + \lambda_1^T (x - u) + \lambda_2^T(-x + l) \rbrace \\&= \inf_{x} \lbrace (c + A^T \nu + \lambda_1 - \lambda_2)^T x \rbrace- \nu^T b - \lambda_1^T u + \lambda_2^T l \end{split}\end{equation*}


(22)   \begin{equation*}\inf_{x} \lbrace (c + A^T \nu + \lambda_1 - \lambda_2)^T x \rbrace=\begin{cases}0 & \text{ if } c + A^T \nu + \lambda_1 - \lambda_2 = 0 \\-\infty & \text{ else}\end{cases}\end{equation*}

As previously, we will be interested in the case where c + A^T \nu + \lambda_1 - \lambda_2 = 0, hence

(23)   \begin{equation*}\begin{split}g(\lambda,\nu) =- \nu^T b - \lambda_1^T u + \lambda_2^T l\end{split}\end{equation*}

Finally the dual problem here is

(24)   \begin{equation*}(\mathcal{D}_1):\begin{aligned}& \underset{\nu,\lambda_1,\lambda_2}{\text{maximize}}& & - \nu^T b - \lambda_1^T u + \lambda_2^T l \\& \text{subject to}& & c + A^T \nu + \lambda_1 - \lambda_2 = 0 \\& & & \lambda_1 \geq 0 , \lambda_2 \geq 0\end{aligned}\end{equation*}
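Both (\mathcal{P}_1) and (\mathcal{D}_1) are linear programs, so we can check strong duality numerically. A minimal sketch using scipy.optimize.linprog (random data assumed; since linprog minimizes, the dual is passed with its objective negated):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
m, n = 2, 4
A = rng.standard_normal((m, n))
c = rng.standard_normal(n)
l, u = -np.ones(n), np.ones(n)
b = A @ rng.uniform(-0.5, 0.5, n)        # b = A x0 with x0 in the box => feasible

# Primal (P_1): minimize c^T x  s.t.  A x = b,  l <= x <= u
primal = linprog(c, A_eq=A, b_eq=b, bounds=list(zip(l, u)))

# Dual (D_1): maximize -b^T nu - u^T lam1 + l^T lam2
#             s.t. c + A^T nu + lam1 - lam2 = 0,  lam1, lam2 >= 0.
# Variables z = (nu, lam1, lam2); negate the objective since linprog minimizes.
obj = np.concatenate([b, u, -l])
A_dual = np.hstack([A.T, np.eye(n), -np.eye(n)])
bounds = [(None, None)] * m + [(0, None)] * (2 * n)
dual = linprog(obj, A_eq=A_dual, b_eq=-c, bounds=bounds)

print(primal.fun, -dual.fun)             # equal: LP strong duality
```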

Now let us absorb the “box” constraints into the cost of \mathcal{P}_1 as

(25)   \begin{equation*}(\mathcal{P}_2):\begin{aligned}& \underset{x}{\text{minimize}}& & f_0(x) \\& \text{subject to}& & Ax = b\end{aligned}\end{equation*}


where

(26)   \begin{equation*}f_0(x)=\begin{cases}c^T x & \text{ if } l \leq x \leq u \\ \infty & \text{ else}\end{cases}\end{equation*}

The Lagrange dual function of the above problem is

(27)   \begin{equation*}\begin{split}g(\nu) &= \inf_x \lbrace f_0(x) + \nu^T (Ax - b) \rbrace \\&= \inf_{l \leq x \leq u} \lbrace c^T x + \nu^T (Ax - b) \rbrace \\&= -b^T \nu + \inf_{l \leq x \leq u} \lbrace (c+A^T \nu)^T x \rbrace\end{split}\end{equation*}

where, as shown in the lecture, the infimum of the linear function over the box is attained coordinate-wise (pick l_i when the i-th entry of c + A^T \nu is positive, and u_i when it is negative), so that

(28)   \begin{equation*}\inf_{l \leq x \leq u} \lbrace (c+A^T \nu)^T x \rbrace =\Big(\max \lbrace (c + A^T \nu),0 \rbrace\Big)^T l+\Big(\min \lbrace (c + A^T \nu),0 \rbrace\Big)^T u\end{equation*}

and hence the dual problem becomes the following unconstrained maximization problem

(29)   \begin{equation*}(\mathcal{D}_2):\underset{\nu}{\text{maximize }} -b^T \nu +\Big(\max \lbrace (c + A^T \nu),0 \rbrace\Big)^T l+\Big(\min \lbrace (c + A^T \nu),0 \rbrace\Big)^T u\end{equation*}

which is very different from (\mathcal{D}_1).
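The unconstrained dual (\mathcal{D}_2) can also be checked numerically: evaluate g(\nu) = -b^T \nu + \max(c + A^T \nu, 0)^T l + \min(c + A^T \nu, 0)^T u at random points to confirm weak duality, then maximize it and compare with the primal optimum. A sketch reusing scipy (random data assumed; Nelder-Mead is used because g is piecewise linear, hence nonsmooth):

```python
import numpy as np
from scipy.optimize import linprog, minimize

rng = np.random.default_rng(3)
m, n = 2, 4
A = rng.standard_normal((m, n))
c = rng.standard_normal(n)
l, u = -np.ones(n), np.ones(n)
b = A @ rng.uniform(-0.5, 0.5, n)   # b chosen so the primal is feasible

# Primal optimal value p* via linprog.
p_star = linprog(c, A_eq=A, b_eq=b, bounds=list(zip(l, u))).fun

def g(nu):
    # Dual objective of (D_2); finite for every nu, since there are no constraints.
    w = c + A.T @ nu
    return -b @ nu + np.maximum(w, 0) @ l + np.minimum(w, 0) @ u

# Weak duality: g(nu) <= p* for every nu.
assert all(g(rng.standard_normal(m)) <= p_star + 1e-9 for _ in range(100))

# Maximizing the concave piecewise-linear g should recover p*.
d_star = -minimize(lambda nu: -g(nu), np.zeros(m), method="Nelder-Mead").fun
print(p_star, d_star)               # close, up to solver tolerance
```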

Ahmad Bazzi

Ahmad Bazzi is an electrical engineer and YouTuber. He works as a signal processing engineer at CEVA DSP and holds a PhD in Electrical Engineering from EURECOM, of which he is an alumnus. He obtained his Master's degree (Summa Cum Laude) in Electrical Engineering at SUPELEC, Paris. He has many publications in well-known IEEE conferences, including an award nomination, and is a co-inventor on several patents. He dedicates time to publishing high-level lectures on mathematics (including convex optimization) and programming, with a focus on machine learning, convex optimization, linear algebra, Python, SymPy, NumPy, Pandas, CVXOPT, MATLAB, and C++ for firmware systems.
