The following lecture talks about the Markowitz portfolio optimization problem in convex optimization. Indeed, many variants of this problem exist, but the classical one looks like this

$\begin{array}{ll}\text{minimize} & x^T \Sigma x \\ \text{subject to} & \bar{p}^T x \ge r_{\min} \\ & \mathbf{1}^T x = 1, \quad x \succeq 0\end{array}$

where $x$ is an $n$-sized vector containing the amounts of assets to invest in. The vector $\bar{p}$ is the mean of the relative asset price changes and the matrix $\Sigma$ is the variance-covariance matrix of the assets. The parameter $r_{\min}$ is the minimum accepted return.

Learn more about this problem and its application to the stock market by watching the lecture above.

About

This lecture focuses on the theoretical as well as the practical aspects of Support Vector Machines (SVMs). The SVM is a supervised learning model associated with learning algorithms that analyze data for classification and regression analysis. Developed at AT&T Bell Laboratories by Vapnik and colleagues (Boser et al., 1992; Guyon et al., 1993; Vapnik et al., 1997), it is one of the most robust prediction methods, based on the statistical learning framework or VC theory proposed by Vapnik and Chervonenkis (1974) and Vapnik (1982, 1995).

Outline

00:00:00 Introduction

00:01:11 Support Vector Machines

00:03:55 Supporting Vectors and Hyperplanes

00:07:05 SVM Mathematical Modelling

00:08:58 Hard Margin SVM

00:47:21 Outlier Sensitivity & Linear Separability

00:49:11 Hard Margin SVM on Python

01:13:15 Soft Margin SVM

01:27:09 Soft Margin SVM on Python

01:31:47 Outro

A mathematical optimization problem takes the following form

(1) $\begin{array}{ll}\text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m\end{array}$

where $x$ is a vector containing all the variables of the problem

(2) $x = (x_1, x_2, \dots, x_n)$

The function $f_0$ is referred to as the cost or the objective function. Moreover, the functions $f_1, \dots, f_m$ are referred to as constraint functions. In most cases, our goal is to find a (or the) point $x^\star$ which is feasible (i.e. satisfies the constraints) and minimizes $f_0$. Rigorously stated, $x^\star$ is optimal if

(3) $f_0(x^\star) \le f_0(z) \quad \text{for all feasible } z$

Well, we can state many applications. In finance and stock analysis, a well-known one is Markowitz portfolio optimization. This problem takes the form

(4) $\begin{array}{ll}\text{minimize} & x^T \Sigma x \\ \text{subject to} & \bar{p}^T x \ge r_{\min} \\ & \mathbf{1}^T x = 1 \\ & x \succeq 0\end{array}$

Here $n$ will reflect the number of assets (or stocks) held over a period of time. For example, let’s say you decide to buy stocks in the period of time between today and 6 months from now. You are interested in the following stocks: CEVA, GOOGL, LVMH and NIO. This means you have decided on 4 assets and hence $n = 4$. Furthermore, let’s denote by $x_i$ the amount of asset $i$ held throughout the period of investment. A long position in asset $i$ would indicate $x_i > 0$, and a short position in asset $i$ would mean $x_i < 0$. Moreover, $p_i$ is the change in price of asset $i$ divided by its initial price (i.e. today’s price). Your return will simply be

(5) $r = p^T x = \sum_{i=1}^{n} p_i x_i$

Anyone investing (short or long term) would simply want to maximize $r$. However, having no constraints at all would simply mean that the solution is a vector $x$ of all $+\infty$, which is unrealistic. Keeping our feet on the ground, we should understand that a vector of all $+\infty$ is unachievable, but we can accept a minimum return as

(6) $\bar{p}^T x \ge r_{\min}$

where $r_{\min}$ is the minimum return you seek from the investment over your investing period. The above equation will then suit one of our constraints. Note that the above is a way of saying “I want at least this much return”. To embed risk somewhere, volatility has to be included. A suitable measure of volatility is the variance of the asset prices, which is captured in the covariance matrix $\Sigma$. The variance of the return would then be the term $x^T \Sigma x$. Markowitz introduced the problem of minimizing risk subject to a minimum acceptable return

(7) $\begin{array}{ll}\text{minimize} & x^T \Sigma x \\ \text{subject to} & \bar{p}^T x \ge r_{\min} \\ & \mathbf{1}^T x = 1 \\ & x \succeq 0\end{array}$

Note that the constraint $\mathbf{1}^T x = 1$ along with $x \succeq 0$ imposes a probability constraint on the vector $x$. In other words, we are interested in vectors $x$ that contain probabilities (or proportions). Markowitz portfolio optimization lies under the category of convex optimization problems of type QP (Quadratic Programming). In Electrical Engineering, convex optimization finds application in many communication and electronic manufacturing problems, such as water-filling and electronic micro-scale design.
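To make this concrete, here is a minimal numerical sketch of the Markowitz QP, solved with SciPy’s general-purpose SLSQP routine rather than a dedicated QP solver. The mean vector, covariance matrix and return floor below are made-up illustrative numbers, not real market data.

```python
# Minimal Markowitz sketch: minimize risk x^T Sigma x subject to a
# minimum return, full investment and long-only positions.
# All numbers below are hypothetical, for illustration only.
import numpy as np
from scipy.optimize import minimize

p_bar = np.array([0.10, 0.07, 0.05, 0.12])   # hypothetical mean relative price changes
Sigma = np.array([[0.10, 0.02, 0.01, 0.03],
                  [0.02, 0.08, 0.01, 0.02],
                  [0.01, 0.01, 0.05, 0.01],
                  [0.03, 0.02, 0.01, 0.12]])  # hypothetical covariance matrix
r_min = 0.08                                  # minimum acceptable return

res = minimize(
    lambda x: x @ Sigma @ x,                  # risk: x^T Sigma x
    x0=np.full(4, 0.25),                      # start from the uniform portfolio
    constraints=[
        {"type": "ineq", "fun": lambda x: p_bar @ x - r_min},  # p_bar^T x >= r_min
        {"type": "eq",   "fun": lambda x: np.sum(x) - 1.0},    # 1^T x = 1
    ],
    bounds=[(0.0, None)] * 4,                 # x >= 0 (long positions only)
    method="SLSQP",
)
x_star = res.x
print("optimal allocation:", np.round(x_star, 4))
print("expected return   :", p_bar @ x_star)
print("risk (variance)   :", x_star @ Sigma @ x_star)
```

A dedicated QP solver (e.g. CVXOPT’s `solvers.qp`) would solve the same problem more reliably at scale; SLSQP is used here only to keep the sketch dependency-light.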

The above lecture is brought to you by Skillshare. In a previous post of mine, we introduced weak alternatives. As a small reminder, consider the following two sets

(1) $S = \{ x : f_i(x) \le 0, \ i = 1, \dots, m, \quad h_i(x) = 0, \ i = 1, \dots, p \}$

and

(2) $T = \{ (\lambda, \nu) : \lambda \succeq 0, \ g(\lambda, \nu) > 0 \}$

where

(3) $g(\lambda, \nu) = \inf_{x \in \mathcal{D}} \left( \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right)$

is the dual function and $\mathcal{D}$ is the domain of the problem. Since we did not impose any convexity assumption on the $f_i$’s, nor did we assume that the $h_i$’s are affine, all we can say about $S$ and $T$ is that they form weak alternatives. In other words,

- If $S$ is feasible, then $T$ is infeasible.
- If $T$ is feasible, then $S$ is infeasible.

In this lecture, we assume the following

- the $f_i$’s are convex
- the $h_i$’s are affine, i.e. $h_i(x) = a_i^T x - b_i$ (compactly, $h(x) = Ax - b$)
- there exists $\tilde{x} \in \operatorname{relint} \mathcal{D}$ such that $A\tilde{x} = b$

In that case, we write $S$ as

(4) $S = \{ x : f_i(x) \le 0, \ i = 1, \dots, m, \quad Ax = b \}$

Thanks to the three conditions above, we can strengthen the weak alternatives so that they form strong alternatives. That is to say

- $S$ is feasible $\iff$ $T$ is infeasible.
- $T$ is feasible $\iff$ $S$ is infeasible.

Indeed, strong alternatives are stronger since (unlike with weak alternatives) if we know that one of the sets $S$ or $T$ is infeasible, then the other has to be feasible.

In my YouTube lecture, I give two applications relating to linear inequalities and intersection of ellipsoids.
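For the linear-inequality application, the alternatives can be checked numerically. The sketch below (my own illustration, not code from the lecture) uses SciPy’s `linprog` on a deliberately infeasible system $Ax \preceq b$, and then recovers a certificate from the alternative system $\{ y \succeq 0,\ A^T y = 0,\ b^T y < 0 \}$, which is what the general set $T$ specializes to for linear inequalities.

```python
# Strong alternatives for linear inequalities: exactly one of
#   S = {x : Ax <= b}   and   T = {y : y >= 0, A^T y = 0, b^T y < 0}
# is feasible. The system below encodes x <= 0 and x >= 1, which is
# clearly infeasible, so a certificate y in T must exist.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0], [-1.0]])
b = np.array([0.0, -1.0])        # rows: x <= 0 and -x <= -1

# Is S feasible? Minimize the constant 0 subject to Ax <= b.
primal = linprog(c=[0.0], A_ub=A, b_ub=b, bounds=[(None, None)])
print("S feasible:", primal.status == 0)   # status 2 means infeasible

# Look for a certificate in T: minimize b^T y with y >= 0, A^T y = 0.
# The box 0 <= y <= 1 only normalizes the certificate (T is a cone).
cert = linprog(c=b, A_eq=A.T, b_eq=[0.0], bounds=[(0.0, 1.0)] * 2)
print("certificate y:", cert.x, "with b^T y =", cert.fun)
```

Here the solver returns $y = (1, 1)$: it satisfies $A^T y = 0$ and $b^T y = -1 < 0$, certifying that $S$ is infeasible without ever exhibiting a point of $S$.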

This tutorial is brought to you by DataCamp. The tutorial goes to great lengths to rigorously explain the little bits and pieces of the wonderful Matplotlib. Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. Matplotlib makes easy things easy and hard things easier.

Contents of this lecture are partitioned as follows:

00:00:00 Introduction

00:00:58 What is MATPLOTLIB ?

00:01:56 Installing MATPLOTLIB

00:02:22 Pyplot

00:04:35 Plot Formatting

00:06:05 Multiple Plotting

00:08:43 Legend

00:11:43 Keyword String Plotting

00:16:34 Categorical Data Plotting

00:17:24 Bar Plot

00:17:52 Scatter Plot

00:18:38 Subplotting

00:20:27 Figure Size Adjustment

00:21:27 Control Line Properties

00:22:43 Multiple Figures & Axes

00:25:43 Text Manipulation

00:28:38 Gridding

00:28:56 Plot Limit

00:30:16 Text Annotation

00:35:38 Logarithmic & Nonlinear Scales

00:37:49 Log Scale

00:38:38 Symmetric Log scale

00:39:25 Logistic Scale

00:40:05 imread & imshow

00:42:10 Image Cropping

00:44:03 Barcodes

00:48:23 Layer Images

00:51:42 Alpha Blending

00:53:35 Fill Curves

00:53:59 Koch Snowflake

01:00:59 Rendering Equations with LaTeX

01:05:48 Polar Curves

01:09:59 Summary
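As a quick taste of a few of the listed topics (multiple plots, legends, subplots, grids and log scales), here is a minimal self-contained sketch; it uses the non-interactive Agg backend so it also runs headlessly, and the output file name `demo.png` is just an arbitrary choice.

```python
# Minimal Matplotlib sketch: two subplots, format strings, a legend,
# a grid, a LaTeX-rendered label and a logarithmic y-scale.
import matplotlib
matplotlib.use("Agg")            # non-interactive backend, safe without a display
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0.1, 10, 200)
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

ax1.plot(x, np.sin(x), "b-", label=r"$\sin(x)$")   # format string + LaTeX label
ax1.plot(x, np.cos(x), "r--", label=r"$\cos(x)$")
ax1.legend()
ax1.grid(True)
ax1.set_title("Multiple plots")

ax2.plot(x, np.exp(x))
ax2.set_yscale("log")                              # logarithmic scale
ax2.set_title("Log scale")

fig.savefig("demo.png", dpi=100)
```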

Let us say we are interested in checking whether the system $S$, hereby defined as

(1) $S = \{ x : f_i(x) \le 0, \ i = 1, \dots, m, \quad h_i(x) = 0, \ i = 1, \dots, p \}$

is feasible or not. In other words, could we find a vector $x$ that satisfies the set $S$? In many cases, it may turn out to be hard to answer the question by exhaustively searching all possible candidates $x$. To crank it up a notch, we formulate an optimization problem that serves us well. To this extent, consider

(2) $\begin{array}{ll}\text{minimize} & 0 \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m \\ & h_i(x) = 0, \quad i = 1, \dots, p\end{array}$

Yes, that is right. We minimize 0 subject to $x$ being in $S$. Well, nothing fancy has been done here. As a matter of fact, one can look closely at the optimal value, that is

(3) $p^\star = \begin{cases} 0 & S \text{ is feasible} \\ +\infty & S \text{ is infeasible} \end{cases}$

One could realize that the optimal value acts as an indicator function, i.e. it returns $0$ when $S$ is feasible, else $+\infty$. That’s awesome. You just wrote down an optimization problem that tells you whether $S$ is feasible or not. In other words, you wrote down an optimization problem answering your main question. However, nothing fancy has been done here. All we’re doing is re-writing the problem, and hence nothing could be learned from the optimization problem in equation (2). On the other hand, things become a whole lot more interesting when taking a look at the dual problem. But for that, we need to pass by the Lagrangian function, that is

(4) $L(x, \lambda, \nu) = \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x)$

The dual function is the infimum of $L$ over $\mathcal{D}$ (the domain of the problem), that is

(5) $g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu)$

Finally, the dual problem would be to

(6) $\begin{array}{ll}\text{maximize} & g(\lambda, \nu) \\ \text{subject to} & \lambda \succeq 0\end{array}$

As in the primal problem, the optimal value of the dual problem is also an indicator function of another set of inequalities,

(7) $d^\star = \begin{cases} +\infty & T \text{ is feasible} \\ 0 & T \text{ is infeasible} \end{cases}$

where

(8) $T = \{ (\lambda, \nu) : \lambda \succeq 0, \ g(\lambda, \nu) > 0 \}$

So now the question is: “How does $d^\star$ relate to $p^\star$?”. Applying weak duality, that is

(9) $d^\star \le p^\star$

we can infer two cases.

- Case 1: If $d^\star = +\infty$, then $p^\star$ has to be $+\infty$.
- Case 2: If $p^\star = 0$, then $d^\star$ has to be $0$.

Using equations (7) and (3) along with the two cases, we get the following:

- If $T$ is feasible, then $S$ is infeasible.
- If $S$ is feasible, then $T$ is infeasible.

This is what weak alternatives are: at most one of the two systems $S$ and $T$ is feasible.
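As a toy illustration (my own, not from the lecture), take the single inequality $f_1(x) = x^2 + 1 \le 0$, which is clearly infeasible. The dual function is $g(\lambda) = \inf_x \lambda (x^2 + 1) = \lambda$ for $\lambda \ge 0$, so $\lambda = 1$ gives $g(1) = 1 > 0$: the set $T$ is feasible, certifying that $S$ must be infeasible.

```python
# Toy weak-alternatives check for the system {x : x^2 + 1 <= 0}.
# We approximate the infimum over x with a fine grid.
import numpy as np

xs = np.linspace(-10, 10, 100001)          # grid approximation of inf over x
f1 = xs**2 + 1

lam = 1.0
g_lam = np.min(lam * f1)                   # numerical g(lambda) on the grid
print("g(1) ≈", g_lam)                     # ≈ 1 > 0, so T is feasible

S_feasible = bool(np.any(f1 <= 0))
print("S feasible:", S_feasible)           # False, as certified by T
```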

In this lecture, we talk about Perturbation and Sensitivity Analysis. But what does that mean? Well, consider our good old looking optimization problem

(1) $\begin{array}{ll}\text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m \\ & h_i(x) = 0, \quad i = 1, \dots, p\end{array}$

One way to tell how the above problem reacts to perturbation is to actually perturb it and check how its optimal value behaves with the perturbation parameters. To this end, consider the following perturbed problem

(2) $\begin{array}{ll}\text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le u_i, \quad i = 1, \dots, m \\ & h_i(x) = v_i, \quad i = 1, \dots, p\end{array}$

What did we just do? Well, instead of keeping the original equalities and inequalities, we “perturb” the zero boundary. If $u_i > 0$, we say that the $i$-th inequality constraint is relaxed by an amount of $u_i$; likewise, a nonzero $v_i$ shifts the $i$-th equality constraint by an amount of $v_i$. Also, we can see that problem (2) “boils down” to problem (1) when the perturbation parameters $u_i$’s and $v_i$’s are set to zero, which makes sense, right? Now if $u_i < 0$, you can see that we “tighten” the inequality constraint. Going back to our main concern, that is “check how the optimal value behaves with the perturbation parameters”, we have to quantify the effect of that perturbation on $p^\star$, the optimal value of problem (1). To this extent, let us define the optimal value of problem (2) as

(3) $p^\star(u, v) = \inf \{ f_0(x) : f_i(x) \le u_i, \ i = 1, \dots, m, \quad h_i(x) = v_i, \ i = 1, \dots, p \}$

The function $p^\star(u, v)$ tells us the optimal value of problem (2) as a function of the perturbation parameters

(4) $u = (u_1, \dots, u_m), \qquad v = (v_1, \dots, v_p)$

Note that for the particular case $u = 0$ and $v = 0$, we have $p^\star(0, 0) = p^\star$. In the lecture, we prove an inequality that shows us how far $p^\star(u, v)$ is from $p^\star$, which is the following

(5) $p^\star(u, v) \ge p^\star - {\lambda^\star}^T u - {\nu^\star}^T v$

where $(\lambda^\star, \nu^\star)$ are the optimal dual Lagrangian multipliers. The above also provides a global view on how far we are from the optimal unperturbed problem in terms of the optimal dual Lagrangian multipliers. Now, if we impose extra properties on $p^\star(u, v)$, i.e. differentiability at $(u, v) = (0, 0)$ and strong duality, we can get a feel for what happens locally around $(0, 0)$. In the above lecture, we also prove that given the previous two conditions (strong duality and differentiability at $(0, 0)$), we have that

(6) $\lambda_i^\star = -\dfrac{\partial p^\star(0, 0)}{\partial u_i}, \qquad \nu_i^\star = -\dfrac{\partial p^\star(0, 0)}{\partial v_i}$

The above allows us to quantify how active a constraint is at the optimal point $x^\star$.
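Equation (6) can be sanity-checked numerically on a toy problem of my own choosing: minimize $x^2$ subject to $1 - x \le u$. At $u = 0$ the optimum is $x^\star = 1$ with multiplier $\lambda^\star = 2$ (from the KKT condition $2x^\star = \lambda^\star$), so the sensitivity result predicts a slope of $-\lambda^\star = -2$ for $p^\star(u)$ at $u = 0$.

```python
# Sanity check of the sensitivity result on: minimize x^2 s.t. 1 - x <= u.
# Analytically p*(u) = (1 - u)^2 near u = 0, so dp*/du|_0 = -2 = -lambda*.
from scipy.optimize import minimize

def p_star(u):
    # optimal value of the perturbed problem for a given perturbation u
    res = minimize(lambda z: z[0] ** 2, x0=[2.0],
                   constraints=[{"type": "ineq", "fun": lambda z: z[0] - (1 - u)}])
    return res.fun

eps = 1e-4
slope = (p_star(eps) - p_star(-eps)) / (2 * eps)   # central finite difference
print("numerical d p*/du at 0:", slope)            # should be close to -2
```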

In this lecture, we talk about equivalent reformulations, that is, reformulations done on the initial problem of the form

(1) $\begin{array}{ll}\text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m \\ & h_i(x) = 0, \quad i = 1, \dots, p\end{array}$

A very interesting question is the following. Assume we have two equivalent problems $P_1$ and $P_2$. Are their duals (hereby denoted by $D_1$ and $D_2$, respectively) the same? The given lecture gives many examples showing that, in many cases, the dual problems can be very different. We will discuss three main types of equivalent reformulations, that are

- Introducing new variables to the problem
- Transforming the cost into another proportional cost (i.e. an increasing function of the cost)
- Absorbing constraints into the domain of the cost, a.k.a. implicit constraints.

We start our lecture by talking about introducing auxiliary variables to the main problem. Assume the following unconstrained minimization problem

(2) $\text{minimize} \quad f(Ax + b)$

A very easily achieved equivalent problem is to replace $Ax + b$ by a new variable $y$ and impose $y = Ax + b$ as a new constraint to the problem, namely

(3) $\begin{array}{ll}\text{minimize} & f(y) \\ \text{subject to} & Ax + b = y\end{array}$

To get the dual problems, we first need the Lagrangian dual functions. For the unconstrained problem (2), the dual problem is clearly

(4) $\text{maximize} \quad p^\star$

where $p^\star$ is the optimal value of (2). However, the dual problem of (3) is achieved by first computing the Lagrangian function, that is

(5) $L(x, y, \nu) = f(y) + \nu^T (Ax + b - y)$

Taking the infimum over $x$ and $y$ gives the dual function

(6) $g(\nu) = b^T \nu + \inf_x \nu^T A x + \inf_y \left( f(y) - \nu^T y \right)$

The term $\inf_x \nu^T A x$ is clearly $-\infty$ in the general case, except for the more interesting case when $A^T \nu = 0$, in which case the infimum becomes 0, so the above reads

(7) $g(\nu) = b^T \nu - f^*(\nu), \qquad A^T \nu = 0$

where $f^*$ is the conjugate function of $f$. Finally, the dual becomes

(8) $\begin{array}{ll}\text{maximize} & b^T \nu - f^*(\nu) \\ \text{subject to} & A^T \nu = 0\end{array}$

Indeed, (4) and (8) are very different. The lecture deals with two applications of the above concept by considering two different forms of $f$.

The second type of reformulation is simply transforming the cost, i.e. instead of minimizing $f_0(x)$, you minimize $\phi(f_0(x))$ or, more generally speaking, any cost built from an increasing function $\phi$ of $f_0$. Let me show you an example, where we have the following unconstrained minimization problem

(9) $\text{minimize} \quad \| Ax - b \|$

The dual problem of the above is

(10) $\text{maximize} \quad p^\star$

where $p^\star$ is the optimal value of (9). Now consider the transformed cost problem

(11) $\begin{array}{ll}\text{minimize} & \frac{1}{2} \| y \|^2 \\ \text{subject to} & Ax - b = y\end{array}$

The Lagrangian of the above problem is

(12) $L(x, y, \nu) = \frac{1}{2} \| y \|^2 + \nu^T (Ax - b - y)$

Now let’s compute the Lagrange dual function as

(13) $g(\nu) = -b^T \nu + \inf_x \nu^T A x + \inf_y \left( \frac{1}{2} \| y \|^2 - \nu^T y \right)$

As previously, we can say that

(14) $\inf_x \nu^T A x = \begin{cases} 0 & A^T \nu = 0 \\ -\infty & \text{otherwise} \end{cases}$

where we will be interested in the case $A^T \nu = 0$; otherwise there’s nothing to optimize. So,

(15) $g(\nu) = -b^T \nu - \frac{1}{2} \| \nu \|_*^2, \qquad A^T \nu = 0$

where $\| \cdot \|_*$ is the dual norm of $\| \cdot \|$. Finally, the dual problem is

(16) $\begin{array}{ll}\text{maximize} & -b^T \nu - \frac{1}{2} \| \nu \|_*^2 \\ \text{subject to} & A^T \nu = 0\end{array}$

which is very different from (10).
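For the Euclidean norm (which is its own dual norm), the dual (16) can be verified numerically with plain NumPy: the dual optimum is attained at $\nu^\star = -(\text{projection of } b \text{ onto } \mathcal{N}(A^T))$, and its value matches the primal optimal value of $\tfrac{1}{2}\|Ax - b\|_2^2$. This is my own sanity check, not code from the lecture.

```python
# Numerical check of strong duality for the transformed-cost problem,
# specialized to the Euclidean norm (self-dual).
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3))
b = rng.standard_normal(8)

# primal optimal value of (1/2)||Ax - b||^2 via least squares
x_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
primal_opt = 0.5 * np.linalg.norm(A @ x_hat - b) ** 2

# dual: nu must lie in null(A^T) = range(A)^perp; the maximizer of
# -b^T nu - (1/2)||nu||^2 over that subspace is minus the projection of b
P_range = A @ np.linalg.pinv(A)            # projector onto range(A)
nu_star = -(b - P_range @ b)               # satisfies A^T nu* = 0
dual_opt = -b @ nu_star - 0.5 * np.linalg.norm(nu_star) ** 2

print(primal_opt, dual_opt)                # the two values coincide
```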

The last type of reformulation we discuss is absorbing constraints into the cost or, more formally speaking, through implicit constraints. Let’s see an example

(17) $\begin{array}{ll}\text{minimize} & c^T x \\ \text{subject to} & Ax = b \\ & -\mathbf{1} \preceq x \preceq \mathbf{1}\end{array}$

which is the same as

(18) $\begin{array}{ll}\text{minimize} & c^T x \\ \text{subject to} & Ax = b \\ & x - \mathbf{1} \preceq 0 \\ & -x - \mathbf{1} \preceq 0\end{array}$

The Lagrangian of the above problem is

(19) $L(x, \lambda_1, \lambda_2, \nu) = c^T x + \nu^T (Ax - b) + \lambda_1^T (x - \mathbf{1}) + \lambda_2^T (-x - \mathbf{1})$

where the terms can be regrouped as

(20) $L(x, \lambda_1, \lambda_2, \nu) = (c + A^T \nu + \lambda_1 - \lambda_2)^T x - b^T \nu - \mathbf{1}^T (\lambda_1 + \lambda_2)$

Now, computing the Lagrangian dual function as

(21) $g(\lambda_1, \lambda_2, \nu) = -b^T \nu - \mathbf{1}^T (\lambda_1 + \lambda_2) + \inf_x (c + A^T \nu + \lambda_1 - \lambda_2)^T x$

where

(22) $\inf_x (c + A^T \nu + \lambda_1 - \lambda_2)^T x = \begin{cases} 0 & c + A^T \nu + \lambda_1 - \lambda_2 = 0 \\ -\infty & \text{otherwise} \end{cases}$

As previously, we will be interested in the case where $c + A^T \nu + \lambda_1 - \lambda_2 = 0$, hence

(23) $g(\lambda_1, \lambda_2, \nu) = -b^T \nu - \mathbf{1}^T (\lambda_1 + \lambda_2)$

Finally, the dual problem here is

(24) $\begin{array}{ll}\text{maximize} & -b^T \nu - \mathbf{1}^T (\lambda_1 + \lambda_2) \\ \text{subject to} & c + A^T \nu + \lambda_1 - \lambda_2 = 0 \\ & \lambda_1 \succeq 0, \ \lambda_2 \succeq 0\end{array}$

Now let us absorb the “box” constraints of (17) into the cost as

(25) $\begin{array}{ll}\text{minimize} & f_0(x) \\ \text{subject to} & Ax = b\end{array}$

where

(26) $f_0(x) = \begin{cases} c^T x & -\mathbf{1} \preceq x \preceq \mathbf{1} \\ +\infty & \text{otherwise} \end{cases}$

The Lagrangian function above is

(27) $L(x, \nu) = c^T x + \nu^T (Ax - b), \qquad -\mathbf{1} \preceq x \preceq \mathbf{1}$

where we show in the lecture why

$g(\nu) = \inf_{-\mathbf{1} \preceq x \preceq \mathbf{1}} \left( c^T x + \nu^T (Ax - b) \right) = -b^T \nu - \| A^T \nu + c \|_1$

and hence the dual problem becomes the following unconstrained maximization problem

$\text{maximize} \quad -b^T \nu - \| A^T \nu + c \|_1$

which is very different from (24).
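The two formulations can be compared numerically. The sketch below (my own check, not code from the lecture) solves the box-constrained LP with SciPy’s `linprog` and then maximizes the unconstrained implicit-constraints dual $g(\nu) = -b^T\nu - \|A^T\nu + c\|_1$; since this is an LP, strong duality makes the two optimal values coincide. The random data is constructed so the primal is feasible.

```python
# Primal: minimize c^T x s.t. Ax = b, -1 <= x <= 1.
# Dual (implicit box): maximize -b^T nu - ||A^T nu + c||_1, unconstrained.
import numpy as np
from scipy.optimize import linprog, minimize

rng = np.random.default_rng(1)
A = rng.standard_normal((2, 5))
x0 = rng.uniform(-0.5, 0.5, 5)
b = A @ x0                                  # guarantees primal feasibility
c = rng.standard_normal(5)

# primal via linprog (box constraints handled through bounds)
primal = linprog(c, A_eq=A, b_eq=b, bounds=[(-1, 1)] * 5)

# dual: minimize -g(nu) = b^T nu + ||A^T nu + c||_1 (nondifferentiable,
# so a derivative-free method is used)
dual = minimize(lambda nu: b @ nu + np.abs(A.T @ nu + c).sum(),
                x0=np.zeros(2), method="Nelder-Mead")
d_star = -dual.fun

print("primal p* =", primal.fun, " dual d* =", d_star)
```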

In a previous post of mine, I talked about CVXOPT programming. In this one, we’ll introduce cone programming. So first things first. A **second-order cone program** (**SOCP**) is a convex optimization problem of the following form

(1) $\begin{array}{ll}\text{minimize} & f^T x \\ \text{subject to} & \| A_i x + b_i \|_2 \le c_i^T x + d_i, \quad i = 1, \dots, m \\ & Fx = g\end{array}$

where the matrices $A_i$ fall in $\mathbb{R}^{n_i \times n}$ and the vectors $b_i \in \mathbb{R}^{n_i}$. On the other hand, the vectors $c_i \in \mathbb{R}^n$ and the scalars $d_i \in \mathbb{R}$. Also, $F \in \mathbb{R}^{p \times n}$ and $g \in \mathbb{R}^p$. Finally, $f \in \mathbb{R}^n$ and $x \in \mathbb{R}^n$.

SOCPs are easily solved on CVXOPT using the socp solver found in the solvers module of CVXOPT. Please refer to the video for a step-by-step tutorial.

On the other hand, Quadratically Constrained Quadratic Programs (a.k.a. QCQPs) are a bit tricky. QCQPs are optimization problems in which both the objective function and the constraints are quadratic functions, i.e.

(2) $\begin{array}{ll}\text{minimize} & \frac{1}{2} x^T P_0 x + q_0^T x + r_0 \\ \text{subject to} & \frac{1}{2} x^T P_i x + q_i^T x + r_i \le 0, \quad i = 1, \dots, m\end{array}$

where $P_0, \dots, P_m$ are given matrices. The QCQP problem is convex if the matrices $P_0, \dots, P_m$ are positive semi-definite. Note that a QCQP boils down to a QP (Quadratic Program) when the matrices $P_1, \dots, P_m$ are zero matrices. If in addition to the aforementioned statements we also have $P_0 = 0$, then the problem is an LP (Linear Program).

That’s enough background to get you going with QCQPs. It turns out that CVXOPT does not have a “QCQP” solver. On the bright side, we can cast a QCQP as an SOCP, enabling us to solve QCQPs via the SOCP solver of CVXOPT. How do we do that? Well, minimizing the cost of the QCQP, i.e. minimizing $f_0(x) = \frac{1}{2} x^T P_0 x + q_0^T x + r_0$, is the same as minimizing an upper bound $t$ of $f_0(x)$ as

(3) $\begin{array}{ll}\text{minimize} & t \\ \text{subject to} & \frac{1}{2} x^T P_0 x + q_0^T x + r_0 \le t \\ & \frac{1}{2} x^T P_i x + q_i^T x + r_i \le 0, \quad i = 1, \dots, m\end{array}$

We will now make use of an important theorem of positive semi-definite matrices, that is

**Theorem 1**: *Let $P$ be a positive semidefinite matrix in $\mathbb{R}^{n \times n}$. Then there is exactly one positive semidefinite matrix $P^{1/2}$ such that $P = P^{1/2} P^{1/2}$.*

Actually, the above theorem is very intuitive. It’s a generalization of something you learned back in high school: it’s a matrix way of saying that every non-negative number $a$ admits a square root $\sqrt{a}$. So, making use of the matrix square root decomposition, we will write down the square root decompositions of all the matrices $P_i$ as

(4) $P_i = P_i^{1/2} P_i^{1/2}, \quad i = 0, 1, \dots, m$

We get

(5) $\begin{array}{ll}\text{minimize} & t \\ \text{subject to} & \frac{1}{2} \| P_0^{1/2} x \|_2^2 + q_0^T x + r_0 \le t \\ & \frac{1}{2} \| P_i^{1/2} x \|_2^2 + q_i^T x + r_i \le 0, \quad i = 1, \dots, m\end{array}$

We shall also absorb the affine terms into slack quantities $s_i$ as

(6) $s_0 = 2(t - q_0^T x - r_0), \qquad s_i = -2(q_i^T x + r_i), \quad i = 1, \dots, m$

So we get

(7) $\begin{array}{ll}\text{minimize} & t \\ \text{subject to} & \| P_i^{1/2} x \|_2^2 \le s_i, \quad i = 0, 1, \dots, m\end{array}$

Now let us include the term $\left( \frac{1 - s_0}{2} \right)^2$ in the first constraint as

(8) $\left\| \begin{bmatrix} P_0^{1/2} x \\ \frac{1 - s_0}{2} \end{bmatrix} \right\|_2^2 = \| P_0^{1/2} x \|_2^2 + \left( \frac{1 - s_0}{2} \right)^2$

Similarly, we include the terms $\left( \frac{1 - s_i}{2} \right)^2$ in the remaining constraints as

(9) $\left\| \begin{bmatrix} P_i^{1/2} x \\ \frac{1 - s_i}{2} \end{bmatrix} \right\|_2^2 = \| P_i^{1/2} x \|_2^2 + \left( \frac{1 - s_i}{2} \right)^2, \quad i = 1, \dots, m$

Why did we do that? Well, notice that

(10) $\left( \frac{1 + s_i}{2} \right)^2 - \left( \frac{1 - s_i}{2} \right)^2 = s_i$

This means that each constraint $\| P_i^{1/2} x \|_2^2 \le s_i$ can be written as the second-order cone constraint

(11) $\left\| \begin{bmatrix} P_i^{1/2} x \\ \frac{1 - s_i}{2} \end{bmatrix} \right\|_2 \le \frac{1 + s_i}{2}, \quad i = 0, 1, \dots, m$

The problem in equation (11) is really close to that of (1). The only thing that is a bit “annoying” is the extra variable $t$. To bypass this problem, we bound the cost by a user-defined parameter $t_{\max}$, i.e. we set $t = t_{\max}$ and check feasibility as

(12) $\begin{array}{ll}\text{find} & x \\ \text{subject to} & \left\| \begin{bmatrix} P_i^{1/2} x \\ \frac{1 - s_i}{2} \end{bmatrix} \right\|_2 \le \frac{1 + s_i}{2}, \quad i = 0, 1, \dots, m, \qquad s_0 = 2(t_{\max} - q_0^T x - r_0)\end{array}$

The above problem is an SOCP of the form (1), written as

(13) $\begin{array}{ll}\text{minimize} & 0 \\ \text{subject to} & \| A_i x + b_i \|_2 \le c_i^T x + d_i, \quad i = 0, 1, \dots, m\end{array}$

where

(14) $A_i = \begin{bmatrix} P_i^{1/2} \\ q_i^T \end{bmatrix}, \qquad b_i = \begin{bmatrix} 0 \\ r_i + \frac{1}{2} \end{bmatrix}, \qquad c_i = -q_i, \qquad d_i = \frac{1}{2} - r_i$

with $r_0$ replaced by $r_0 - t_{\max}$ in the $i = 0$ constraint.

So, in short, by choosing the above variables, we can go from a QCQP to an SOCP and hence solve the former on CVXOPT. The lecture uses a Jupyter notebook in my favorite browser, Google Chrome.
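Before coding the full cast, the two key ingredients can be sanity-checked numerically (my own check, not code from the lecture): the PSD matrix square root of Theorem 1, via `scipy.linalg.sqrtm`, and the quadratic-to-second-order-cone identity $\|u\|^2 \le s \iff \left\| [u; \tfrac{1-s}{2}] \right\|_2 \le \tfrac{1+s}{2}$.

```python
# Sanity checks for the QCQP-to-SOCP cast.
import numpy as np
from scipy.linalg import sqrtm

rng = np.random.default_rng(2)

# Theorem 1: a PSD matrix has a PSD square root with P^{1/2} P^{1/2} = P
M = rng.standard_normal((4, 4))
P = M @ M.T                                  # random positive semidefinite matrix
P_half = np.real(sqrtm(P))
assert np.allclose(P_half @ P_half, P)

# the quadratic-to-SOC identity, tested on random (u, s) pairs
ok = True
for _ in range(1000):
    u = rng.standard_normal(3)
    s = rng.uniform(-5, 5)
    quad = u @ u <= s
    soc = np.linalg.norm(np.append(u, (1 - s) / 2)) <= (1 + s) / 2
    ok &= (quad == soc)
print("identity holds on all samples:", bool(ok))
```

The identity follows by squaring both sides: the left norm squared is $\|u\|^2 + \left(\tfrac{1-s}{2}\right)^2$, and $\left(\tfrac{1+s}{2}\right)^2 - \left(\tfrac{1-s}{2}\right)^2 = s$, which is exactly equation (10).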

SciPy is a free and open-source Python library used for technical and scientific computing. SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering. In this lecture, we introduce the SciPy functions one will most probably pass through, demonstrated in a Jupyter Notebook on Google Chrome, which is one of my favorites. Finally, the lecture is outlined as follows:

00:00 Intro

00:21 What is SciPy ?

02:06 SciPy subpackages

05:15 Installing SciPy

08:34 Concatenate of NumPy

09:03 Concatenating by rows

09:51 Concatenating by columns

10:34 Slicing Matrices

11:01 Mesh Grids

12:17 Polynomials

13:02 Polynomial Multiplication

13:32 Integrating Polynomials

14:17 Polynomial Derivatives

15:02 Polynomial as Array

15:31 Vectorizing Functions

18:52 Special SciPy Functions

19:15 Airy Functions

22:35 Exponentially Scaled Airy Functions

24:43 Bessel Functions

25:44 Thin Drumhead Example

31:51 Logit Function

33:18 Gamma Function

35:56 Error Functions

37:04 Entropy Function

38:01 Huber Function
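As a quick taste of the special-functions part of the outline, here is a minimal sketch of a few `scipy.special` calls (Airy, Bessel, logit, gamma and error functions):

```python
# A few of the special functions covered in the lecture, via scipy.special.
from scipy.special import airy, jv, logit, gamma, erf

print(gamma(5))            # Gamma(5) = 4! = 24
print(jv(0, 0.0))          # Bessel function of the first kind: J_0(0) = 1
print(logit(0.5))          # log(0.5 / (1 - 0.5)) = 0
print(erf(0.0))            # error function at 0 is 0

# airy(x) returns the tuple (Ai, Ai', Bi, Bi')
Ai, Aip, Bi, Bip = airy(0.0)
print(Ai)                  # Ai(0) ≈ 0.3550
```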