# Rosenbrock Function Minimization

0

## Introduction

In this article, we present different approaches on minimizing the Rosenbrock function, namely thru Newton, Damped Newton and Steepest Descent.

## The Rosenbrock Function

Rosenbrock’s function is given as

(1)

(2)

Since all the methods we will use are gradient-aware and/or Hessian-aware, then we will spend some time in computing a closed form formula of the gradient and Hessian. Let us now start with the gradient of , that is

(3)

and let

(4)

Also let’s compute the Hessian matrix

(5)

Therefore, we can compute the four double derivatives gives

(6)

## Minimization with Numerical Methods

#### Steepest Descent

Steepest descent works as follows

(7)

where is a given parameter, called \textit{step-size}. So, the iteration works as follows

where are the values of at iteration .

#### Newton Method

Newton is a modified steepest descent working on Hessian’s instead as follows

(8)

#### Damped Newton Method

In contrast to Newton, Damped Newton method adds a damping factor as

(9)

## MATLAB implementation

We will implement the three methods above as

function[x] =  steepdescent(f,df,x0,alpha,Niter)
x(:,1) = x0;

for n = 2:Niter
x1 = x(1,n-1);
x2 = x(2,n-1);
x(:,n) = x(:,n-1) -  alpha*df(x1,x2);
end
function[x] =  newtonmethod(f,df,H,x0,Niter)
x(:,1) = x0;

for n = 2:Niter
x1 = x(1,n-1);
x2 = x(2,n-1);
x(:,n) = x(:,n-1) -  inv(H(x1,x2))*df(x1,x2);
end
function[x] =  dampednewtonmethod(f,df,H,x0,alpha,Niter)
x(:,1) = x0;

for n = 2:Niter
x1 = x(1,n-1);
x2 = x(2,n-1);
x(:,n) = x(:,n-1) -  alpha*inv(H(x1,x2))*df(x1,x2);
end

Furthermore, we implement our main function with three different initializers to test their sensitivity to the final solution as

Niter = 3000;
f = @(x1,x2)100*(x2 - x1^2)^2 + (1-x1)^2;
df = @(x1,x2) [-400*x1*(x2 - x1^2) - 2*(1-x1); 200*(x2 - x1^2)];
H = @(x1,x2) [-400*(x2-x1^2) + 800*x1^2 + 2, -400*x1; -400*x1, 200];

% Initials
x0_1 = [-1;1];
x0_2 = [0;1];
x0_3 = [2;1];
%Steepest Descent
[x_steepest_1] =  steepdescent(f,df,x0_1,1e-3,Niter);
[x_steepest_2] =  steepdescent(f,df,x0_2,1e-3,Niter);
[x_steepest_3] =  steepdescent(f,df,x0_3,1e-3,Niter);
% Newton
[x_newton_1] =  newtonmethod(f,df,H,x0_1,Niter);
[x_newton_2] =  newtonmethod(f,df,H,x0_2,Niter);
[x_newton_3] =  newtonmethod(f,df,H,x0_3,Niter);
%Damped Newton
[x_dampednewton_1] =  dampednewtonmethod(f,df,H,x0_1,1e-2,Niter);
[x_dampednewton_2] =  dampednewtonmethod(f,df,H,x0_2,1e-2,Niter);
[x_dampednewton_3] =  dampednewtonmethod(f,df,H,x0_3,1e-2,Niter);

%plot
figure
subplot(3,1,1)
plot(x_steepest_1(1,:),'r','Linewidth',2)
hold on
plot(x_steepest_2(1,:),'g','Linewidth',2)
plot(x_steepest_3(1,:),'b','Linewidth',2)

plot(x_steepest_1(2,:),'--r','Linewidth',2)
plot(x_steepest_2(2,:),'--g','Linewidth',2)
plot(x_steepest_3(2,:),'--b','Linewidth',2)

xlabel('Iteration number n')
ylabel('x')
legend('x1 path initializing x0 = [-1 1]','x1 path initializing x0 = [0 1]','x1 path initializing x0 = [2 1]','x2 path initializing x0 = [-1 1]','x2 path initializing x0 = [0 1]','x2 path initializing x0 = [2 1]')
title('Steepest descent with step size 10^{-3}')
grid on
grid minor

subplot(3,1,2)
plot(x_newton_1(1,:),'r','Linewidth',2)
hold on
plot(x_newton_2(1,:),'g','Linewidth',2)
plot(x_newton_3(1,:),'b','Linewidth',2)

plot(x_newton_1(2,:),'--r','Linewidth',2)
plot(x_newton_2(2,:),'--g','Linewidth',2)
plot(x_newton_3(2,:),'--b','Linewidth',2)

xlabel('Iteration number n')
ylabel('x')
legend('x1 path initializing x0 = [-1 1]','x1 path initializing x0 = [0 1]','x1 path initializing x0 = [2 1]','x2 path initializing x0 = [-1 1]','x2 path initializing x0 = [0 1]','x2 path initializing x0 = [2 1]')
title('Newton method')
grid on
grid minor

subplot(3,1,3)
plot(x_dampednewton_1(1,:),'r','Linewidth',2)
hold on
plot(x_dampednewton_2(1,:),'g','Linewidth',2)
plot(x_dampednewton_3(1,:),'b','Linewidth',2)

plot(x_dampednewton_1(2,:),'--r','Linewidth',2)
plot(x_dampednewton_2(2,:),'--g','Linewidth',2)
plot(x_dampednewton_3(2,:),'--b','Linewidth',2)

xlabel('Iteration number n')
ylabel('x')
legend('x1 path initializing x0 = [-1 1]','x1 path initializing x0 = [0 1]','x1 path initializing x0 = [2 1]','x2 path initializing x0 = [-1 1]','x2 path initializing x0 = [0 1]','x2 path initializing x0 = [2 1]')
title('Newton method with damping factor 10^{-2} ')
grid on
grid minor

which gives us the plot below

Furthermore, we see in the above figure that the convergence of all methods using different initializations. We see Newton converges the fastest, the second faster is damped-newton then steepest descent.

## Summary

In summary, we have seen how the three methods: Steepest Descent, Newton and damped Newton minimize the Rosenbrock function and their convergence behavior as a function of different initializers.
Finally, Subscribe to my channel to support us ! We are very close to 100K subscribers <3 w/ love. The algorithmic chef – Ahmad Bazzi. Click here to subscribe. Check my other articles here.