Equivalent Reformulations


In this lecture, we talk about equivalent reformulations, that are reformulations done on the initial problem of the form

(1)   \begin{equation*}\begin{aligned}& \underset{x}{\text{minimize}}& & f_0(x) \\& \text{subject to}& & f_i(x) \leq b_i, \; i = 1, \ldots, m \\& & & g_i(x) = 0, \; i = 1, \ldots, p.\end{aligned}\end{equation*}

A very interesting question is the following. Assume we got two problems \mathcal{P}_1 and \mathcal{P}_2 that are equivalent problems. Are their duals, (hereby denoted by \mathcal{D}_1 and \mathcal{D}_2, respectively) the same ? The given lecture gives many examples to show that in many cases, it happens that the dual problems could be very different. We will discuss three main types of equivalent reformulations that are

  • Introducing new variables to the problem
  • Transforming the cost into another proportional cost (i.e. an increasing function of the cost)
  • Absorbing constraints into the domain of the cost, a.k.a. implicit constraints.

Introducing New Variables

We start our lecture by talking about introducing auxiliary variables to the main problem. Assume the following unconstrained minimization problem

(2)   \begin{equation*}(\mathcal{P}_1): \underset{x}{\text{minimize }} f_0(Ax+b)\end{equation*}

A very easy achieved equivalent problem is to replace Ax + b by y and impose y = Ax+b as a new constraint to the problem, namely

(3)   \begin{equation*}(\mathcal{P}_2):\begin{aligned}& \underset{x,y}{\text{minimize}}& & f_0(y) \\& \text{subject to}& & y = Ax+b\end{aligned}\end{equation*}

To get the dual problems, we first need the Lagrangian dual functions. For problem (1), the dual problem is clearly

(4)   \begin{equation*}(\mathcal{D}_1):   \underset{\nu}{\text{maximize }} p^* \end{equation*}

where p^* is the optimal value of (\mathcal{P}_1). However, the dual problem of (\mathcal{P}_2) is achieved by first computing the Lagrangian function that is

(5)   \begin{equation*} \mathcal{L}(x,y,\nu) = f_0(y) + \nu^T ( Ax + b - y) \end{equation*}

Next, we have to pass through the Lagrangian dual function that is

(6)   \begin{equation*} \begin{split} g(\nu) &= \inf_{x,y} \lbrace \mathcal{L}(x,y,\nu) \rbrace \\&= \inf_{x,y} \lbrace f_0(y) + \nu^T ( Ax + b - y) \rbrace \\&= \inf_{x,y} \lbrace f_0(y) + \nu^T ( b - y) + \nu^T Ax \rbrace \\&= \inf_{y} \Big\lbrace \inf_{x} \lbrace f_0(y) + \nu^T ( b - y) + \nu^T Ax \rbrace \Big\rbrace \\&= \inf_{y} \Big\lbrace f_0(y) + \nu^T ( b - y) + \inf_{x} \lbrace\nu^T Ax \rbrace \Big\rbrace \\\end{split}\end{equation*}

The term \inf_{x} \lbrace\nu^T Ax \rbrace is clearly -\infty in the general case except for the more interesting case when \nu^T A = 0, in which case the infimum becomes 0, so the above reads

(7)   \begin{equation*}\begin{split}g(\nu) &= \inf_{y} \Big\lbrace f_0(y) + \nu^T ( b - y) \Big\rbrace \\&=b^T \nu + \inf_{y} \Big\lbrace f_0(y) - \nu^T y \Big\rbrace \\&=b^T \nu - \sup_{y} \Big\lbrace \nu^T y - f_0(y) \Big\rbrace \\&= b^T \nu - f_0^*(\nu) \end{split} \end{equation*}

where f_0^*(\nu) is the conjugate function of f_0(x). Finally the dual becomes

(8)   \begin{equation*}(\mathcal{D}_2):\begin{aligned}& \underset{\nu}{\text{maximize}}& & b^T \nu - f_0^*(\nu) \\& \text{subject to}& & A^T \nu = 0\end{aligned}\end{equation*}

Indeed, (\mathcal{D}_1) and (\mathcal{D}_2) are very different. The lecture deals with two applications of the above concept by considering two different forms of f_0(x), that are :f_0(x) = \log(\sum\limits_{i=1}^m \exp(a_i^T x + b_i)) and f_0(x) = \Vert Ax - b \Vert.

Cost Transformation

The second type of reformulation is by simply transforming the cost, i.e. instead of maximizing g(x), you maximize g^2(x), or more generally speaking, you can maximize f(g(x)) where f(x) is an increasing function of x. Let me show you an example, where we have the following unconstrained minimization problem

(9)   \begin{equation*}(\mathcal{P}_1): \underset{x}{\text{minimize }} \Vert Ax - b \Vert\end{equation*}

The dual problem of the above is

(10)   \begin{equation*}(\mathcal{D}_1): \underset{\nu}{\text{maximize }} p^*\end{equation*}

where p^* is the optimal value of (\mathcal{P}_1). Now consider the transformed cost problem as

(11)   \begin{equation*}(\mathcal{P}_3):\begin{aligned}& \underset{x,y}{\text{minimize}}& & \Vert y \Vert^2 \\& \text{subject to}& & y = Ax-b\end{aligned}\end{equation*}

The Lagrangian of the above problem is

(12)   \begin{equation*}\mathcal{L}(x,y,\nu) = \Vert y \Vert^2 + \nu^T(y - Ax-b)\end{equation*}

Now let’s compute the Lagrange dual problem as

(13)   \begin{equation*}\begin{split}g(\nu) &= \inf_{x,y} \mathcal{L}(x,y,\nu) \\&= \inf_{x,y} \lbrace \Vert y \Vert^2 + \nu^T(y - Ax-b) \rbrace \\&= \inf_{x,y} \lbrace \Vert y \Vert^2 + \nu^T(y -b) - \nu^T A x \rbrace \\&= \inf_{y}\inf_{x} \lbrace \Vert y \Vert^2 + \nu^T(y -b) - \nu^T A x \rbrace \\&= \inf_{y}\lbrace \Vert y \Vert^2 + \nu^T(y -b) - \inf_{x} \nu^T A x \rbrace \\\end{split}\end{equation*}

As previously we can say that

(14)   \begin{equation*}\inf_{x} \nu^T A x=\begin{cases}0 & \text{ if } A^T \nu = 0 \\-\infty & \text{ else}\end{cases}\end{equation*}

where we will be interested in the case where A^T \nu = 0, otherwise there’s nothing to optimize. So,

(15)   \begin{equation*}\begin{split}g(\nu) &= \inf_{y}\lbrace \Vert y \Vert^2 + \nu^T(y -b) \rbrace \\&= b^T \nu + \inf_{y}\lbrace \Vert y \Vert^2 + \nu^Ty \rbrace \\&= b^T \nu + \inf_{y}\lbrace \Vert -y \Vert^2 - \nu^T(-y) \rbrace \\&= b^T \nu - \Vert \nu \Vert_*^2\end{split}\end{equation*}

where \Vert \Vert_*^2 is the dual norm of \Vert \Vert^2. Finally, the dual problem is

(16)   \begin{equation*}(\mathcal{D}_3): \begin{aligned} & \underset{\nu}{\text{maximize}} & & b^T \nu - \Vert \nu \Vert*^2 \\& \text{subject to}& & A^T \nu = 0\end{aligned}\end{equation*}

which is very different from (\mathcal{D}_1).

Constraint Absorption

The last type of reformulation we discuss is by absorbing constraints to the cost, or more formally speaking, thru implicit constraints. Let’s see an example

(17)   \begin{equation*}(\mathcal{P}_1):\begin{aligned}& \underset{x}{\text{minimize}}& & c^T x \\& \text{subject to}& & Ax = b \\& & & l \leq x \leq u\end{aligned}\end{equation*}

which is the same as

(18)   \begin{equation*}(\mathcal{P}_1):\begin{aligned}& \underset{x}{\text{minimize}}& & c^T x \\& \text{subject to}& & Ax = b \\& & & l \leq x ; - x \leq - l\end{aligned}\end{equation*}

The Lagrangian of the above problem is

(19)   \begin{equation*}\mathcal{L}(x,\lambda,\nu)=c^T x + \nu^T(Ax - b) + \lambda_1^T (x - u) + \lambda_2^T(-x + l)\end{equation*}


(20)   \begin{equation*}\lambda=\begin{bmatrix}\lambda_1 \\\lambda_2\end{bmatrix}\end{equation*}

Now computing the Lagrangian dual function as

(21)   \begin{equation*}\begin{split}g(\lambda,\nu) &= \inf_{x} \lbrace c^T x + \nu^T(Ax - b) + \lambda_1^T (x - u) + \lambda_2^T(-x + l) \rbrace \\&= \inf_{x} \lbrace (c + A^T \nu + \lambda_1 - \lambda_2)^T x \rbrace- \nu^T b - \lambda_1^T u + \lambda_2^T l \end{split}\end{equation*}


(22)   \begin{equation*}\inf_{x} \lbrace (c + A^T \nu + \lambda_1 - \lambda_2)^T x \rbrace=\begin{cases}0 & \text{ if } c + A^T \nu + \lambda_1 - \lambda_2 = 0 \\-\infty & \text{ else}\end{cases}\end{equation*}

As previously, we will be interested in the case where c + A^T \nu + \lambda_1 - \lambda_2 = 0, hence

(23)   \begin{equation*}\begin{split}g(\lambda,\nu) =- \nu^T b - \lambda_1^T u + \lambda_2^T l\end{split}\end{equation*}

Finally the dual problem here is

(24)   \begin{equation*}(\mathcal{D}_1):\begin{aligned}& \underset{\nu,\lambda_1,\lambda_2}{\text{minimize}}& & - \nu^T b - \lambda_1^T u + \lambda_2^T l \\& \text{subject to}& & c + A^T \nu + \lambda_1 - \lambda_2 = 0 \\& & & \lambda_1 \geq 0 , \lambda_2 \geq 0\end{aligned}\end{equation*}

Now let us absorb the “box” constraints in the cost of \mathcal{P}_1 as

(25)   \begin{equation*}(\mathcal{P}_2):\begin{aligned}& \underset{x}{\text{minimize}}& & f_0(x) \\& \text{subject to}& & Ax = b\end{aligned}\end{equation*}


(26)   \begin{equation*}f_0(x)=\begin{cases}c^T x & \text{ if } l \leq x \leq u \\0 & \text{ else}\end{cases}\end{equation*}

The Lagrangian function above is

(27)   \begin{equation*}\begin{split}g(\nu) &= \inf_x \lbrace f_0(x) + \nu^T (Ax - b) \rbrace \\&= \inf_{l \leq x \leq u} \lbrace f_0(x) + \nu^T (Ax - b) \rbrace \\&= b^T \nu + \inf_{l \leq x \leq u} \lbrace (c+A^T \nu)^T x \rbrace\end{split}\end{equation*}

where we show in the lecture why

    \begin{equation*}\inf_{l \leq x \leq u} \lbrace (c+A^T \nu)^T x \rbrace =\Big(\max \lbrace (c + A^T \nu),0 \rbrace\Big)^T l+\Big(\min \lbrace (c + A^T \nu),0 \rbrace\Big)^T u\end{equation*}

and hence the dual problem becomes the following unconstrained maximization problem

    \begin{equation*}(\mathcal{D}_2):\underset{\nu}{\text{maximize }} -b^T \nu +\Big(\max \lbrace (c + A^T \nu),0 \rbrace\Big)^T l+\Big(\min \lbrace (c + A^T \nu),0 \rbrace\Big)^T u\end{equation*}

which is very different from (\mathcal{D}_1)