Invented by George Dantzig in 1947, Linear Programming is one of the most fundamental tools in combinatorial optimization. You have two views: the geometrical view and the algebraic view. There are beautiful connection between them.
This is what a linear program looks like, which is minimizing a linear objective function and is subject to a set of inequality constraints.
min c1 x1 + ... + cn xn such that a11 x1 + ... + a1n xn ≤ b1 ... am1 x1 + ... + amn xn ≤ bm xi ≥ 0 (1 ≤ i ≤ n)
You of course can do maximization, as long as you negate the entire objective function:
min c1 x1 + ... + cn xn ⟹ max -(c1 x1 + ... + cn xn)
xi must be able to take negative values, you can replace it with 2 other variables
x-i, but both of them meet the requirement that they shall be non-negative:
xi ⟹ x+i - x-i
A equality constraint can be replace with two inequality constraints. However, since we are talking about linear programming, variable
xi can not be integers, and constraints must be linear.
λ1 v1 + ... + λn vn is a convex combination of points
v1, ..., vn if
λ1 + ... + λn = 1 λi ≥ 0 (1 ≤ i ≤ n)
S in Rn is convex if it contains all the convex combinations of the points in
S. The intersection of convex sets is a convex set. This is very useful, because a linear constraint
a11 x1 + ... + a1n xn ≤ b1 represents a half space, which is also a convex set. The intersection of a set of half spaces (a bunch of constraints) is also convex, called polyhedron. If it is finite, it is called polytope. Every point in a polytope is a convex combination of its vertices.
We are obsessed with vertices, because the theorem: at least one of the points where the objective value is minimal is a vertex. We know the optimal solution is going to be at least one of the vertices. So we can solve an linear program “geometrically” by enumerating all the vertices, and select the one with the smallest objective value.
Explore the vertices in ploytope to find the optimal solution seems hard and even impossible, a more intelligent way of solving the problem is to connect the geometric view to the algebraic view, using the famous Simplex algorithm.
Finding Basic Feasible Solution
First let’s back up and look at how to find solutions to linear systems:
a11 x1 + ... + a1n xn = b1 ... am1 x1 + ... + amn xn = bm xi ≥ 0 (1 ≤ i ≤ n)
Usually high school will teach how to use Gaussian elimination to solve it. You basically express some of the variables
x1, ..., xm (basic variables) in terms of the other ones
xm+1, ..., xn (non-basic variables).
x1 = b1 + ∑ni=m+1 a1i xi ... xm = bm + ∑ni=m+1 ami xi
As long as
b1, ..., bm ≥ 0, you will have a basic feasible solution (BFS), where all non-basic variables
xm+1, ..., xn can be zero.
BUT, recall that constraints in a linear program are all inequality for example
a11 x1 + ... + a1n xn ≤ b1. This won’t be a big issue, because we can transform the inequality to equality, if we add one additional
si, called slack variables:
a11 x1 + ... + a1n xn + s1 = b1 ... am1 x1 + ... + amn xn + sm = bm s1, ..., sm ≥ 0
So, summarize on how to find the basic feasible solution:
- Re-express the constraints as equations, by adding slack variables
- Select m variables which will be the basic variables (m is the number of constraints)
- Re-express them in terms of the non-basic variable only using Gaussian elimination
b1, ..., bm ≥ 0, then we have a basic feasible solution.
You may immediately come up with an idea that is trying to generate all basic feasible solutions, and select the one with the best objective function value. However this is usually not possible in practice, because there are a huge number of solutions
n! / (m! (n-m)!). The Simplex Algorithm offers a better way to find it.
The Simplex Algorithm
The Simplex algorithm is essentially a local search algorithm, it moves from one basic feasible solution to another basic feasible solution. The beautiful thing is it guaranteed to find the global optimum, because of convexity. The key is how to make move?
- Select a non-basic variable with a negative coefficient (called entering variable
- Introduce the entering variable in the basis by removing an existing basic variable (called leaving variable
- Perform Gaussian elimination. It is possible to get negative
biafter doing the elimination.
To avoid negative
bi, we have to choose the leaving variable carefully, we must maintain feasibility by finding the smallest ratio between the
bi and the minus of the coefficient of the entering variable.
l = argmini:a_ie<0 bi/(-aie)
Then when you do the Gaussian elimination, this will help keep all
bi positive, which will give you another basic feasible solution. This entire set of operation is called pivoting in linear programming.
We now are able to move from basic feasible solution, but when to stop?
- Take the objective function
- Replace all the basic variables with the non-basic variables
A basic feasible solution is optimal if its objective function, after having eliminated all basic variables, is of the form:
c0 + c1 x1 + ... + cn xn with ci ≥ 0 (1 ≤ i ≤ n)
Overall, the Simplex algorithm can be expressed in just 4 lines of code:
while ∃ 1 ≤ i ≤ n: ci < 0 do choose e such that ce < 0; l = arg-mini:a_ie<0 bi / (-aie); pivot(e, l);
When Algorithm Unbounded by Below
There is a nasty situation that
ce < 0 (the coefficient of the selected entering variable in objective function) but all
aie > 0 (all the coefficients of the selected entering variable in the linear systems). It means you are not able to select leaving variable, since you need a negative
For an entering variable, you can not select a leaving variable for it. The reason behind the scene is that the entering variable can be arbitrarily large positive value, which will make objective function arbitrarily low. It basically means that the algorithm here is not bounded by below, there is a mistake in the modeling.
bi Becomes Zero
bi is zero, it will cause the corresponding variable
xi is always selected as leaving variable. When you do the pivoting, you are going to stay at the same value of the objective function, without improvement. We have to find essentially another way to guarantee termination. There are a few useful ways:
|Bland rule||Always select the first entering variable with negative coefficient, in the objective function.|
|Pivoting rule||Breaking ties when selecting the leaving variable by using a lexicographic rule.|
|Perturbation methods||Perturb the basis, and then go back later on.|
The First Basic Feasible Solution
We use essentially the Simplex algorithm to find the first basic feasible solution. We transform the original linear program by:
- Add artificial variables
y1, ..., ymfor each constraint
- Change the objective function to
y1 + ... + ym
min y1 + ... + ym such that a11 x1 + ... + a1n xn + y1 = b1 ... am1 x1 + ... + amn xn + ym = bm xi ≥ 0 (1 ≤ i ≤ n)
yi variables in this new linear program just give us a new basis. What we are going to do is basically minimize the sum of these
yi variables, then we can optimize the objective function.
- If the objective function is greater than zero, we know we don’t have a feasible solution.
- If the objective function is zero, we know we have a basic feasible solution (in terms of variables other than
yi), then we can do the optimization of the original linear program.
The linear program expressed in matrix is:
min c x s.t. A x = b
The Simplex algorithm can also be expressed using matrices:
AB: The matrix for coefficients of basic variables.
xB: Column vector of basic variables.
AN: The matrix for coefficients for non-basic variables.
xN: Column vector of non-basic variables.
b: Column vector of right hand side values.
Putting them together, we have:
AB xB + AN xN = b ⟹ AB xB = b - AN xN ⟹ xB = A-1B b - A-1B AN xN ⟹ xB = b' - A'N xN
The solution is feasible if
b' ≥ 0.
The objective function can be re-expressed as:
c x = cB xB + cN xN = cB (A-1B b - A-1B AN xN) + cN xN = cB A-1B b + (cN - cB A-1B AN) xN = cB A-1B b + (cN - cB A-1B AN) xN + (cB - cB A-1B AB) xB = cB A-1B b + (c - cB A-1B A) x
We can define
cB A-1B as
Π, which is called Simplex multiplier. Now objective function
c x becomes:
c x = Π b + (c - Π A)x
When we have
c - Π A ≥ 0, we have optimal solution.
Linear programming is often presented with a tableau, which is easier for pivoting.
Duality theory is basically looking at linear programming in two different ways:
If the primal has an optimal solution, then the dual has an optimal solution with the same objective function value. The simplex multiplier
Π = cB A-1B are a feasible solution to the dual. The dual of the dual is the primal.
For more on Discrete Optimization: Linear Programming, please refer to the wonderful course here https://www.coursera.org/learn/discrete-optimization
Related Quick Recap
I am Kesler Zhu, thank you for visiting my website. Check out more course reviews at https://KZHU.ai
All of your support will be used for maintenance of this site and more great content. I am humbled and grateful for your generosity. Thank you!
Don't forget to sign up newsletter, don't miss any chance to learn.
Or share what you've learned with friends!Tweet