Invented by George Dantzig in 1947, Linear Programming is one of the most fundamental tools in combinatorial optimization. There are two views: the geometric view and the algebraic view, with beautiful connections between them.

This is what a linear program looks like: minimize a linear objective function subject to a set of linear inequality constraints.

`min c`_{1} x_{1} + ... + c_{n} x_{n}
such that
a_{11} x_{1} + ... + a_{1n} x_{n} ≤ b_{1}
...
a_{m1} x_{1} + ... + a_{mn} x_{n} ≤ b_{m}
x_{i} ≥ 0 (1 ≤ i ≤ n)

You can of course do maximization as well, as long as you negate the entire objective function:

`min c`_{1} x_{1} + ... + c_{n} x_{n}
⟹ max -(c_{1} x_{1} + ... + c_{n} x_{n})

If `x`_{i} must be able to take negative values, you can replace it with two other variables `x`^{+}_{i} and `x`^{-}_{i}, both of which are required to be non-negative:

`x`_{i} ⟹ x^{+}_{i} - x^{-}_{i}

An equality constraint can be replaced with two inequality constraints. However, since we are talking about linear programming, the variables `x`_{i} cannot be required to be integers, and the constraints must be linear.

## Geometrical View

`λ`_{1} v_{1} + ... + λ_{n} v_{n} is a convex combination of points `v`_{1}, ..., v_{n} if

`λ`_{1} + ... + λ_{n} = 1
λ_{i} ≥ 0 (1 ≤ i ≤ n)
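As a quick illustration (not from the course), a tiny Python helper that forms a convex combination, checking that the weights are valid:

```python
# A point l_1 v_1 + ... + l_n v_n is a convex combination when the weights
# are nonnegative and sum to 1. Illustrative helper with made-up points.
def convex_combination(weights, points):
    assert abs(sum(weights) - 1.0) < 1e-9, "weights must sum to 1"
    assert all(w >= 0 for w in weights), "weights must be nonnegative"
    dim = len(points[0])
    return tuple(sum(w * p[d] for w, p in zip(weights, points))
                 for d in range(dim))

# The midpoint of (0,0) and (2,4) uses weights 1/2, 1/2.
print(convex_combination([0.5, 0.5], [(0, 0), (2, 4)]))  # (1.0, 2.0)
```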

A set `S` in R^{n} is convex if it contains all the convex combinations of the points in `S`. The intersection of convex sets is a convex set. This is very useful, because a linear constraint `a`_{11} x_{1} + ... + a_{1n} x_{n} ≤ b_{1} represents a half space, which is also a convex set. The intersection of a set of half spaces (a bunch of constraints) is convex as well, and is called a polyhedron. If it is bounded, it is called a **polytope**. Every point in a polytope is a convex combination of its vertices.

We are obsessed with vertices because of a key theorem: **at least one** of the points where the objective value is minimal is a vertex. Since the optimal solution is attained at some vertex, we can solve a linear program "geometrically" by enumerating all the vertices and selecting the one with the smallest objective value.
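The enumeration idea can be sketched for two variables: every vertex lies on the intersection of two constraint boundaries, so we can try all pairs and keep the feasible intersections. A toy illustration with made-up data, not an efficient method:

```python
from itertools import combinations

# Brute-force "geometric" solver for a 2-variable LP: min c.x s.t. A x <= b
# (nonnegativity is encoded as extra rows). Every vertex of the polytope is
# the intersection of two constraint boundaries, so enumerate all pairs.
def solve_lp_by_vertices(c, A, b):
    best = None
    for (a1, b1), (a2, b2) in combinations(zip(A, b), 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:            # parallel boundaries: no vertex
            continue
        x = (b1 * a2[1] - b2 * a1[1]) / det
        y = (a1[0] * b2 - a2[0] * b1) / det
        # keep the intersection only if it satisfies every constraint
        if all(ai[0] * x + ai[1] * y <= bi + 1e-9 for ai, bi in zip(A, b)):
            val = c[0] * x + c[1] * y
            if best is None or val < best[0]:
                best = (val, (x, y))
    return best

# min -x - y  s.t.  x + y <= 4, x <= 3, x >= 0, y >= 0
A = [[1, 1], [1, 0], [-1, 0], [0, -1]]
b = [4, 3, 0, 0]
print(solve_lp_by_vertices([-1, -1], A, b))  # best value -4.0 at (3.0, 1.0)
```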

## Algebraic View

Exploring the vertices of a polytope to find the optimal solution seems hard, even impossible. A more intelligent way of solving the problem is to connect the geometric view to the algebraic view, using the famous Simplex algorithm.

### Finding Basic Feasible Solution

First let’s back up and look at how to find solutions to linear systems:

`a`_{11} x_{1} + ... + a_{1n} x_{n} = b_{1}
...
a_{m1} x_{1} + ... + a_{mn} x_{n} = b_{m}
x_{i} ≥ 0 (1 ≤ i ≤ n)

Usually high school teaches how to solve it with Gaussian elimination. You basically express some of the variables `x`_{1}, ..., x_{m} (basic variables) in terms of the other ones `x`_{m+1}, ..., x_{n} (non-basic variables):

`x`_{1} = b_{1} + ∑^{n}_{i=m+1} a_{1i} x_{i}
...
x_{m} = b_{m} + ∑^{n}_{i=m+1} a_{mi} x_{i}

As long as `b`_{1}, ..., b_{m} ≥ 0, you will have a basic feasible solution (BFS), where all non-basic variables `x`_{m+1}, ..., x_{n} are set to zero.

BUT, recall that the constraints in a linear program are all inequalities, for example `a`_{11} x_{1} + ... + a_{1n} x_{n} ≤ b_{1}. This won't be a big issue, because we can transform each inequality into an equality by adding one additional variable `s`_{i}, called a slack variable:

`a`_{11} x_{1} + ... + a_{1n} x_{n} + s_{1} = b_{1}
...
a_{m1} x_{1} + ... + a_{mn} x_{n} + s_{m} = b_{m}
s_{1}, ..., s_{m} ≥ 0

So, to summarize how to find a basic feasible solution:

- Re-express the constraints as equations by adding slack variables.
- Select m variables to be the basic variables (m is the number of constraints).
- Re-express them in terms of the non-basic variables only, using Gaussian elimination.
- If `b`_{1}, ..., b_{m} ≥ 0, then we have a basic feasible solution.

You may immediately come up with the idea of generating **all** basic feasible solutions and selecting the one with the best objective value. However, this is usually not possible in practice, because there can be a huge number of them: `n! / (m! (n-m)!)` choices of basis. The Simplex algorithm offers a better way.
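The brute force the text warns about can still be sketched on a toy system: pick every set of m columns as a candidate basis, solve for the basic variables, and keep the nonnegative solutions. A pure-Python sketch with hypothetical data, using exact arithmetic:

```python
from fractions import Fraction
from itertools import combinations
from math import comb

def solve_square(M, rhs):
    """Gaussian elimination on an m-by-m system; returns None if singular."""
    n = len(M)
    M = [row[:] + [r] for row, r in zip(M, rhs)]   # augmented matrix
    for col in range(n):
        piv = next((r for r in range(col, n) if M[r][col] != 0), None)
        if piv is None:
            return None
        M[col], M[piv] = M[piv], M[col]
        d = M[col][col]
        M[col] = [v / d for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * p for a, p in zip(M[r], M[col])]
    return [row[-1] for row in M]

def basic_feasible_solutions(A, b):
    """Try every choice of m basic columns; keep the nonnegative solutions."""
    m, n = len(A), len(A[0])
    out = []
    for basis in combinations(range(n), m):
        cols = [[Fraction(A[i][j]) for j in basis] for i in range(m)]
        xB = solve_square(cols, [Fraction(v) for v in b])
        if xB is not None and all(v >= 0 for v in xB):
            x = [Fraction(0)] * n
            for j, v in zip(basis, xB):
                x[j] = v
            out.append(x)
    return out

# Slack form of x1 + x2 <= 4, x1 <= 3:  x1 + x2 + s1 = 4, x1 + s2 = 3
A = [[1, 1, 1, 0], [1, 0, 0, 1]]
b = [4, 3]
sols = basic_feasible_solutions(A, b)
print(len(sols), "BFS out of", comb(4, 2), "bases")  # 4 BFS out of 6 bases
```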

## The Simplex Algorithm

The Simplex algorithm is essentially a local search algorithm: it moves from one basic feasible solution to another. The beautiful thing is that it is guaranteed to find the global optimum, because of convexity. The key is how to make a move:

- Select a non-basic variable with a negative coefficient in the objective function (called the entering variable `x`_{e}).
- Introduce the entering variable into the basis by removing an existing basic variable (called the leaving variable `x`_{l}).
- Perform Gaussian elimination. It is possible to get negative `b`_{i} after doing the elimination.

To avoid negative `b`_{i}, we have to choose the **leaving variable** carefully. We maintain feasibility by finding the smallest ratio between `b`_{i} and the negated coefficient of the **entering variable**:

`l = argmin`_{i:a_ie<0} b_{i}/(-a_{ie})

Then, when you do the Gaussian elimination, this keeps all `b`_{i} non-negative, which gives you another basic feasible solution. This entire set of operations is called **pivoting** in linear programming.
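The ratio test can be written out directly; this sketch assumes the dictionary form x_{i} = b_{i} + ∑ a_{ij} x_{j} used in the text, so rows where the entering column's coefficient a_{ie} is negative are the candidates:

```python
# Ratio test: among rows with a_ie < 0 (dictionary form, so increasing the
# entering variable drives x_i down), pick the smallest b_i / (-a_ie).
# Illustrative data, not tied to a specific LP.
def leaving_row(b, col):
    candidates = [(bi / -aie, i)
                  for i, (bi, aie) in enumerate(zip(b, col)) if aie < 0]
    if not candidates:
        return None        # no candidate: the entering variable is unbounded
    return min(candidates)[1]

# b = (6, 4), entering column coefficients (-2, -1):
# ratios are 6/2 = 3 and 4/1 = 4, so row 0 leaves.
print(leaving_row([6, 4], [-2, -1]))  # 0
```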

We are now able to move from one basic feasible solution to another, but when do we stop?

- Take the objective function.
- Re-express it in terms of the non-basic variables only.

A basic feasible solution is optimal if its objective function, after having eliminated all basic variables, is of the form:

`c`_{0} + c_{1} x_{1} + ... + c_{n} x_{n}
with c_{i} ≥ 0 (1 ≤ i ≤ n)

Overall, the Simplex algorithm can be expressed in just 4 lines of code:

`while ∃ 1 ≤ i ≤ n: c`_{i} < 0 do
choose e such that c_{e} < 0;
l = argmin_{i:a_ie<0} b_{i} / (-a_{ie});
pivot(e, l);
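The 4-line loop can be fleshed out into a small, self-contained tableau simplex. This is a teaching sketch for min c·x subject to A x ≤ b with b ≥ 0 (so the slacks give a starting BFS and no Phase I is needed), not a production solver:

```python
# Tableau simplex for: min c.x  s.t.  A x <= b, x >= 0, assuming b >= 0.
# Rows are [A | I | b]; the objective row holds reduced costs and -objective.
def simplex(c, A, b):
    m, n = len(A), len(c)
    T = [A[i][:] + [1.0 if j == i else 0.0 for j in range(m)] + [b[i]]
         for i in range(m)]
    z = c[:] + [0.0] * (m + 1)          # reduced costs | -objective value
    basis = list(range(n, n + m))       # slack variables start basic
    while True:
        e = next((j for j in range(n + m) if z[j] < -1e-9), None)
        if e is None:                   # all reduced costs >= 0: optimal
            x = [0.0] * n
            for i, bv in enumerate(basis):
                if bv < n:
                    x[bv] = T[i][-1]
            return -z[-1], x
        ratios = [(T[i][-1] / T[i][e], i) for i in range(m) if T[i][e] > 1e-9]
        if not ratios:
            raise ValueError("LP is unbounded below")
        l = min(ratios)[1]              # ratio test picks the leaving row
        piv = T[l][e]                   # pivot: x_e enters, basis[l] leaves
        T[l] = [v / piv for v in T[l]]
        for i in range(m):
            if i != l:
                f = T[i][e]
                T[i] = [a - f * p for a, p in zip(T[i], T[l])]
        f = z[e]
        z = [a - f * p for a, p in zip(z, T[l])]
        basis[l] = e

# min -x1 - x2  s.t.  x1 + x2 <= 4, x1 <= 3
print(simplex([-1.0, -1.0], [[1.0, 1.0], [1.0, 0.0]], [4.0, 3.0]))
```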

### When the Problem Is Unbounded Below

There is a nasty situation where `c`_{e} < 0 (the coefficient of the selected entering variable in the objective function) but all `a`_{ie} ≥ 0 (all the coefficients of the selected entering variable in the linear system). It means you are not able to select a leaving variable, since you need a negative `a`_{ie}.

When you cannot select a leaving variable for an entering variable, the reason behind the scenes is that the entering variable can take an arbitrarily large positive value, which makes the objective function arbitrarily low. It basically means the problem is not bounded below; there is probably a mistake in the modeling.

### When `b`_{i} Becomes Zero

When a `b`_{i} is zero, the corresponding variable `x`_{i} is always selected as the leaving variable. When you do the pivoting, you stay at the same value of the objective function, without improvement. We need another way to guarantee termination. There are a few useful techniques:

| Technique | Description |
|---|---|
| Bland's rule | Always select the first entering variable with a negative coefficient in the objective function. |
| Lexicographic pivoting rule | Break ties when selecting the leaving variable by using a lexicographic order. |
| Perturbation methods | Perturb the basis, and then undo the perturbation later on. |

### The First Basic Feasible Solution

We essentially use the Simplex algorithm itself to find the first basic feasible solution. We transform the original linear program by:

- Adding artificial variables `y`_{1}, ..., y_{m}, one for each constraint.
- Changing the objective function to `y`_{1} + ... + y_{m}.

`min y`_{1} + ... + y_{m}
such that
a_{11} x_{1} + ... + a_{1n} x_{n} + y_{1} = b_{1}
...
a_{m1} x_{1} + ... + a_{mn} x_{n} + y_{m} = b_{m}
x_{i} ≥ 0 (1 ≤ i ≤ n), y_{j} ≥ 0 (1 ≤ j ≤ m)
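Constructing this Phase-I program is mechanical. A small sketch with made-up data (rows with negative b are flipped first, so the artificial variables can start as a feasible basis):

```python
# Building the Phase-I ("artificial variable") program for A x = b:
# minimize y_1 + ... + y_m subject to A x + I y = b. Illustrative data.
def phase_one_data(A, b):
    m, n = len(A), len(A[0])
    # flip rows with negative b so the artificial basis starts feasible
    A = [row if bi >= 0 else [-v for v in row] for row, bi in zip(A, b)]
    b = [abs(bi) for bi in b]
    A_aux = [row + [1 if j == i else 0 for j in range(m)]
             for i, row in enumerate(A)]
    c_aux = [0] * n + [1] * m     # objective: sum of artificial variables
    return c_aux, A_aux, b

c_aux, A_aux, b_aux = phase_one_data([[1, 1], [1, -1]], [4, -2])
print(c_aux)   # [0, 0, 1, 1]
print(A_aux)   # [[1, 1, 1, 0], [-1, 1, 0, 1]]
```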

The `y`_{i} variables in this new linear program just give us an initial basis. What we do is minimize the sum of these `y`_{i} variables; then we can optimize the original objective function.

- If the optimal objective value is greater than zero, we know the original program has no feasible solution.
- If the optimal objective value is zero, we have a basic feasible solution (in terms of the variables other than `y`_{i}), and we can proceed to optimize the original linear program.

### Matrix Notations

The linear program expressed in matrix form is:

```
min  c x
s.t. A x = b
     x ≥ 0
```

The Simplex algorithm can also be expressed using matrices:

- `A`_{B}: the matrix of coefficients of the basic variables.
- `x`_{B}: column vector of basic variables.
- `A`_{N}: the matrix of coefficients of the non-basic variables.
- `x`_{N}: column vector of non-basic variables.
- `b`: column vector of right-hand-side values.

Putting them together, we have:

`A`_{B} x_{B} + A_{N} x_{N} = b
โน A_{B} x_{B} = b - A_{N} x_{N}
โน x_{B} = A^{-1}_{B} b - A^{-1}_{B} A_{N} x_{N}
โน x_{B} = b' - A'_{N} x_{N}

The solution is feasible if `b' ≥ 0`.

The objective function can be re-expressed as:

`c x = c`_{B} x_{B} + c_{N} x_{N}
= c_{B} (A^{-1}_{B} b - A^{-1}_{B} A_{N} x_{N}) + c_{N} x_{N}
= c_{B} A^{-1}_{B} b + (c_{N} - c_{B} A^{-1}_{B} A_{N}) x_{N}
= c_{B} A^{-1}_{B} b + (c_{N} - c_{B} A^{-1}_{B} A_{N}) x_{N} + (c_{B} - c_{B} A^{-1}_{B} A_{B}) x_{B}
= c_{B} A^{-1}_{B} b + (c - c_{B} A^{-1}_{B} A) x

We can define `c`_{B} A^{-1}_{B} as `π`, which is called the simplex multipliers. The objective function `c x` now becomes:

`c x = π b + (c - π A) x`

When we have `c - π A ≥ 0`, we have an optimal solution.
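On a tiny example this optimality test can be checked by hand. The sketch below uses the slack-form system min -x1 - x2 s.t. x1 + x2 + s1 = 4, x1 + s2 = 3 with basis {x1, x2}, an explicit 2×2 inverse, and exact arithmetic (illustrative data, not general code):

```python
from fractions import Fraction as F

def inv2(M):
    """Explicit inverse of a 2x2 matrix."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A  = [[F(1), F(1), F(1), F(0)], [F(1), F(0), F(0), F(1)]]
c  = [F(-1), F(-1), F(0), F(0)]
B  = [0, 1]                                     # basic columns: x1, x2
AB = [[A[i][j] for j in B] for i in range(2)]
cB = [c[j] for j in B]
ABinv = inv2(AB)
# simplex multipliers pi = c_B A_B^{-1}
pi = [sum(cB[i] * ABinv[i][j] for i in range(2)) for j in range(2)]
# reduced costs c - pi A, column by column
reduced = [c[j] - sum(pi[i] * A[i][j] for i in range(2)) for j in range(4)]
print([int(v) for v in pi])       # [-1, 0]
print([int(v) for v in reduced])  # [0, 0, 1, 0] -> all >= 0: basis is optimal
```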

Linear programming is often presented with a tableau, which is easier for pivoting.

### Duality

Duality theory is basically looking at the same linear program in two different ways:

| Primal | Dual |
|---|---|
| `min c x` s.t. `A x ≥ b`, `x ≥ 0` | `max y b` s.t. `y A ≤ c`, `y ≥ 0` |

If the primal has an optimal solution, then the dual has an optimal solution with the same objective value. The simplex multipliers `π = c`_{B} A^{-1}_{B} are a feasible solution to the dual. The dual of the dual is the primal.
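Weak duality is easy to verify numerically: any primal-feasible x and dual-feasible y satisfy y·b ≤ c·x, and equality certifies that both are optimal. A made-up 2×2 example:

```python
# min 3x1 + 2x2  s.t.  x1 + x2 >= 4, 2x1 + x2 >= 3, x >= 0
c = [3, 2]; b = [4, 3]
A = [[1, 1], [2, 1]]
x = [0, 4]                  # primal feasible: 0+4 >= 4 and 0+4 >= 3
y = [2, 0]                  # dual feasible:   2*1 <= 3 and 2*1 <= 2, y >= 0
primal = sum(ci * xi for ci, xi in zip(c, x))       # c.x = 8
dual   = sum(yi * bi for yi, bi in zip(y, b))       # y.b = 8
print(dual, "<=", primal)   # equal here, so both x and y are optimal
```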

For more on **Discrete Optimization: Linear Programming**, please refer to the wonderful course here: https://www.coursera.org/learn/discrete-optimization


*I am Kesler Zhu, thank you for visiting my website. Check out more course reviews at https://KZHU.ai*

Don't forget to sign up for the newsletter, so you don't miss any chance to learn.
