Matrices & Systems of Linear Equations

A matrix is a rectangular array of numbers. In an m-by-n matrix, m is the number of rows and n is the number of columns. When m or n equals 1, we have a row vector or a column vector, respectively.

Addition and Multiplication

Adding two matrices (of the same size) just means adding the corresponding elements. Multiplying a matrix by a scalar k just means multiplying each element by k. Multiplying two matrices is a little special: we go across a row of the first matrix and down a column of the second matrix, multiplying the paired elements and adding up the products. So the second matrix must have the same number of rows as the first matrix has columns.

(m * n)(n * p) -> (m * p)

Each element of the resulting matrix is computed from the corresponding row of the first matrix and the corresponding column of the second. It is important to remember that matrices do not, in general, commute under multiplication.

C = A B
c_ij = Σ_k (a_ik × b_kj)

A (B C) = (A B) C
A B = A C does not imply B = C
A B ≠ B A (in general)
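The element formula and the two warnings above can be checked numerically. The following is a minimal sketch in plain Python; the helper name `matmul` and the example matrices are chosen here for illustration and are not from the course.

```python
def matmul(A, B):
    """Multiply an m-by-n matrix A by an n-by-p matrix B (lists of rows)."""
    n = len(B)
    if any(len(row) != n for row in A):
        raise ValueError("columns of A must equal rows of B")
    p = len(B[0])
    # c_ij = sum over k of a_ik * b_kj
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 0]]
AB = matmul(A, B)   # [[2, 1], [4, 3]]
BA = matmul(B, A)   # [[3, 4], [1, 2]], so A B != B A here

# A S = A T does not imply S = T when the left factor is not invertible.
N = [[1, 0], [0, 0]]
S = [[1, 2], [3, 4]]
T = [[1, 2], [5, 6]]
NS = matmul(N, S)   # [[1, 2], [0, 0]]
NT = matmul(N, T)   # [[1, 2], [0, 0]], equal to NS although S != T
```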

There are a few special matrices, which are very useful:

Zero matrix: 0. Every element is zero. You will often need to solve equations of the form A x = 0.
Identity matrix: I. It is always a square matrix, and it commutes with any square matrix A of the same size: A I = I A = A.
Diagonal matrix: all elements off the diagonal are zero.
Banded matrix: the nonzero elements are confined to a band around the diagonal, for example a tri-diagonal matrix.
Upper / Lower triangular matrix: all elements below the diagonal (upper triangular) or above the diagonal (lower triangular) are zero.

Transpose

Taking the transpose of a matrix turns its rows into columns and its columns into rows: a^T_ij = a_ji. Here are some algebraic properties of the transpose:

(A^T)^T = A
(A + B)^T = A^T + B^T
(A B)^T = B^T A^T

A matrix A is symmetric if A^T = A, and skew-symmetric if A^T = -A. Requiring a matrix to be symmetric reduces its number of degrees of freedom, because the elements below the diagonal are determined by those above it. Any square matrix can be written as the sum of a symmetric and a skew-symmetric matrix: A = (A + A^T)/2 + (A - A^T)/2. For any matrix A, the product A^T A is symmetric.
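The split of a square matrix into a symmetric part (A + A^T)/2 and a skew-symmetric part (A - A^T)/2 is easy to verify on a small example. This sketch assumes integer entries and uses exact fractions; the function names are hypothetical.

```python
from fractions import Fraction

def symmetric_part(A):
    """(A + A^T) / 2 for a square matrix with integer entries."""
    n = len(A)
    return [[Fraction(A[i][j] + A[j][i], 2) for j in range(n)] for i in range(n)]

def skew_part(A):
    """(A - A^T) / 2 for a square matrix with integer entries."""
    n = len(A)
    return [[Fraction(A[i][j] - A[j][i], 2) for j in range(n)] for i in range(n)]

A = [[1, 2], [4, 3]]
S = symmetric_part(A)   # [[1, 3], [3, 3]], equal to its own transpose
K = skew_part(A)        # [[0, -1], [1, 0]], equal to minus its transpose

# Adding the two parts element by element recovers A.
total = [[S[i][j] + K[i][j] for j in range(2)] for i in range(2)]
```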

Inner product / Dot product (of two vectors)

Given two column vectors u and v of the same size, their inner product is u^T v, which is a scalar. If u^T v = 0, then u and v are orthogonal (perpendicular) to each other. The norm of a vector u, written ||u||, is the length of the vector. The vector u is normalized if ||u|| = 1.

||u|| = (u^T u)^(1/2) = (u₁² + u₂² + ... + u_n²)^(1/2)

If two vectors are orthogonal to each other and each of them is normalized, then the two vectors are orthonormal.
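As a small illustration of these definitions, the sketch below computes an inner product and a norm and then normalizes two orthogonal vectors to get an orthonormal pair. The helper names `dot`, `norm` and `normalize` are chosen here for illustration.

```python
import math

def dot(u, v):
    """Inner product u^T v of two vectors of the same size."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Length of u: the square root of u^T u."""
    return math.sqrt(dot(u, u))

def normalize(u):
    """Scale u so that its norm becomes 1 (u must be nonzero)."""
    length = norm(u)
    return [a / length for a in u]

u = [3.0, 4.0]
v = [-4.0, 3.0]
uv = dot(u, v)          # 0.0, so u and v are orthogonal
u_len = norm(u)         # 5.0
u_hat = normalize(u)    # approximately [0.6, 0.8]
v_hat = normalize(v)    # approximately [-0.8, 0.6]
# u_hat and v_hat are orthogonal and have norm 1, so they are orthonormal.
```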

Outer product

The outer product of two column vectors u and v is u v^T, which is a matrix rather than a scalar.
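To make the shape of the result concrete, here is a minimal sketch (the helper name `outer` is hypothetical): a vector with 3 elements and a vector with 2 elements give a 3-by-2 matrix.

```python
def outer(u, v):
    """Outer product u v^T: element (i, j) is u_i * v_j."""
    return [[a * b for b in v] for a in u]

u = [1, 2, 3]
v = [4, 5]
M = outer(u, v)   # [[4, 5], [8, 10], [12, 15]]
```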

Inverse

Not all matrices are invertible. If the determinant of a square matrix is zero, then 1 / determinant does not exist, so the inverse of the matrix does not exist. If a square matrix A is invertible, then A A^-1 = I = A^-1 A, and its inverse is unique.

(A B)^-1 = B^-1 A^-1
(A^T)^-1 = (A^-1)^T

Orthogonal matrices

The inverse of an orthogonal matrix is equal to its transpose: Q^-1 = Q^T, i.e. Q Q^T = Q^T Q = I. Each element of Q Q^T is the inner product of one row of Q with another row of Q. If the two rows are the same, you get 1; if they are different, you get 0. This is exactly the definition of orthonormality, so the rows of Q are orthonormal (and, from Q^T Q = I, so are its columns).

(Q x)^T (Q x) = ||Q x||² = x^T Q^T Q x = x^T I x = ||x||²

The equation above tells us that an orthogonal matrix preserves norms, i.e. lengths. Rotating a vector through an angle can be done with an orthogonal matrix. The rotation matrix R_θ is an orthogonal matrix because the length of a vector x does not change when you rotate it.

R_θ x = x'

Solving this equation gives the matrix R_θ shown below. In addition, R(-θ) = R(θ)^-1: rotating by -θ undoes a rotation by θ.

cosθ -sinθ
sinθ  cosθ
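The two claims about the rotation matrix, that it preserves length and that R(-θ) undoes R(θ), can be checked numerically. This is a minimal sketch with hypothetical helper names; because it uses floating-point numbers, the results are only approximately equal to the exact values.

```python
import math

def rotation(theta):
    """2-by-2 matrix that rotates a vector counterclockwise by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matvec(M, x):
    """Matrix-vector product M x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

R = rotation(math.pi / 2)    # quarter turn
x = [1.0, 0.0]
x_rot = matvec(R, x)         # approximately [0.0, 1.0]

# Rotating back with R(-theta) recovers x, up to rounding.
x_back = matvec(rotation(-math.pi / 2), x_rot)
```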

A permutation matrix is an n-by-n matrix that permutes the rows of another matrix when multiplied against it. The identity matrix is itself a permutation matrix, the one that keeps the original order. Below is an example of a 2-by-2 permutation matrix:

0 1
1 0

When multiplied on the left, it permutes the rows of a matrix; when multiplied on the right, it permutes the columns of a matrix. A permutation matrix is just the identity matrix with its rows permuted.
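A quick check with the 2-by-2 permutation matrix above shows the difference between multiplying on the left and on the right. The sketch also confirms that this permutation matrix is orthogonal. The helper names are hypothetical.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

P = [[0, 1], [1, 0]]
A = [[1, 2], [3, 4]]

PA = matmul(P, A)               # [[3, 4], [1, 2]]: rows of A swapped
AP = matmul(A, P)               # [[2, 1], [4, 3]]: columns of A swapped
PPt = matmul(P, transpose(P))   # [[1, 0], [0, 1]]: P P^T = I
```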

Gaussian Elimination

One of the main uses of matrices is to represent systems of linear equations. The first step is to form the "augmented matrix", which is the coefficient matrix with the right-hand side attached as an extra column. Next we apply operations to the augmented matrix that make the system easier to solve without changing its solution. We can change the order of the equations (i.e. the order of the rows of the matrix). We can multiply an equation (a row of the matrix) by a nonzero constant. We can also multiply an equation by a constant and add it to another equation. With these operations we can bring the matrix to upper triangular form, which is the goal of Gaussian elimination. After that, back-substitution gives the values of all the variables.
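The procedure described above can be sketched as follows. This is an illustrative implementation and not the course's code: it assumes a square, invertible coefficient matrix, uses exact fractions to avoid rounding, and only swaps rows when it meets a zero pivot. The function name `gaussian_solve` is hypothetical.

```python
from fractions import Fraction

def gaussian_solve(A, b):
    """Solve A x = b for a square, invertible A using exact fractions."""
    n = len(A)
    # Build the augmented matrix [A | b].
    M = [[Fraction(a) for a in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for col in range(n):
        # Row swap: find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Subtract a multiple of the pivot row from each row below it.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    # Back-substitution on the upper triangular system, last row first.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - known) / M[i][i]
    return x

A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = gaussian_solve(A, b)   # x equals [2, 3, -1]
```

Substituting x back into the three equations reproduces b, which is a convenient way to check any solution.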

Reduced Row Echelon Form

Reduced row echelon form goes even further than upper triangular form: all the way to the identity matrix, when that is possible. The idea is to use each pivot to eliminate not just the elements below it, but also the elements above it. Columns that contain a pivot are called pivot columns; the other columns are called non-pivot columns. In each pivot column, the pivot is 1 and all other elements are 0.
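A minimal sketch of the reduction, under the assumption that exact fractions are acceptable for the entries. The function name `rref` and the example matrix are chosen here for illustration; the example has one non-pivot column, so the result is not the identity.

```python
from fractions import Fraction

def rref(A):
    """Return (reduced matrix, list of pivot column indices)."""
    M = [[Fraction(a) for a in row] for row in A]
    rows, cols = len(M), len(M[0])
    pivot_cols = []
    r = 0  # row where the next pivot will be placed
    for c in range(cols):
        if r == rows:
            break
        pivot = next((i for i in range(r, rows) if M[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot available: c is a non-pivot column
        M[r], M[pivot] = M[pivot], M[r]
        p = M[r][c]
        M[r] = [x / p for x in M[r]]          # scale so the pivot is 1
        for i in range(rows):                 # eliminate above and below
            if i != r and M[i][c] != 0:
                f = M[i][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        pivot_cols.append(c)
        r += 1
    return M, pivot_cols

A = [[1, 2, 3], [2, 4, 7], [1, 2, 4]]
reduced, pivots = rref(A)
# reduced equals [[1, 2, 0], [0, 0, 1], [0, 0, 0]]
# pivots equals [0, 2]: the middle column is a non-pivot column
```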

Computing Inverse

If a matrix A has an inverse, then A A^-1 = I. Column by column, this says that A multiplied by the i-th column of A^-1 gives the i-th column of I. So to find A^-1, we just need to solve n equations of this form:

A a_i^-1 = e_i
where a_i^-1 is the i-th column of A^-1 and e_i is the i-th column of I

The augmented matrix is A with I attached to the right, i.e. [A | I]. When we bring the A part to reduced row echelon form, the I part of the augmented matrix becomes the inverse of A.

[A | I] --> bring A to reduced row echelon form --> [I | A^-1]
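The same idea in code: attach the identity to the right of A, reduce the left half to the identity, and read the inverse off the right half. This is a sketch under the assumption that exact fractions are acceptable; the function name `inverse` is hypothetical, and it raises an error when no pivot can be found, which is the case of a non-invertible matrix.

```python
from fractions import Fraction

def inverse(A):
    """Invert a square matrix by row-reducing the augmented matrix."""
    n = len(A)
    # Augmented matrix: A on the left, the identity on the right.
    M = [[Fraction(a) for a in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[c], M[pivot] = M[pivot], M[c]
        p = M[c][c]
        M[c] = [x / p for x in M[c]]          # make the pivot 1
        for r in range(n):                    # clear the rest of the column
            if r != c:
                f = M[r][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    # The left half is now I, so the right half is the inverse.
    return [row[n:] for row in M]

A = [[2, 1], [5, 3]]
A_inv = inverse(A)   # equals [[3, -1], [-5, 2]]
```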

LU decomposition

It turns out that the Gaussian elimination procedure gives us a matrix decomposition. We can write A = L U, where L is a lower triangular matrix and U is an upper triangular matrix. This is called the LU decomposition of A.

An elementary matrix is an identity matrix with one of its zeros replaced by a number. Each step of Gaussian elimination on the matrix A is equivalent to multiplying A on the left by an elementary matrix.

Gaussian elimination: U = M_n ... M₃ M₂ M₁ A
where the M_i are elementary matrices and U is an upper triangular matrix

Since all of the M_i matrices are invertible, we get

M₁^-1 ... M_n-2^-1 M_n-1^-1 M_n^-1 U = A
where each M_i^-1 is easy to compute: multiply the off-diagonal element of M_i by -1

M₁^-1 M₂^-1 M₃^-1 … M_n^-1 is a lower triangular matrix, called L. So finally we get A = L U, which is the LU decomposition. The value of the LU decomposition shows when you need to solve A x = b for many different b's: once you have found A = L U, each system L U x = b can be solved very quickly, by forward substitution followed by back-substitution.
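To illustrate why the decomposition pays off, the sketch below factors A once and then reuses L and U for two different right-hand sides. It is a simplified version that assumes no row swaps are needed (every pivot is nonzero) and uses exact fractions. The function names `lu_decompose` and `lu_solve` are hypothetical.

```python
from fractions import Fraction

def lu_decompose(A):
    """Return (L, U) with A = L U, assuming no row swaps are needed."""
    n = len(A)
    U = [[Fraction(a) for a in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in range(n):
        if U[c][c] == 0:
            raise ValueError("zero pivot: this sketch does not swap rows")
        for r in range(c + 1, n):
            f = U[r][c] / U[c][c]
            U[r] = [x - f * y for x, y in zip(U[r], U[c])]
            # The elimination step has -f off the diagonal; its inverse has +f.
            L[r][c] = f
    return L, U

def lu_solve(L, U, b):
    """Solve L U x = b: forward substitution, then back-substitution."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):                      # L y = b (L has 1s on its diagonal)
        y[i] = Fraction(b[i]) - sum(L[i][j] * y[j] for j in range(i))
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):          # U x = y
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - known) / U[i][i]
    return x

A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
L, U = lu_decompose(A)                 # factor once
x1 = lu_solve(L, U, [8, -11, -3])      # equals [2, 3, -1]
x2 = lu_solve(L, U, [2, -3, -2])       # equals [1, 0, 0]
```

The factorization does the expensive elimination work only once; each additional right-hand side needs just the two substitution passes.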

My Certificate

For more on Matrices & Systems of Linear Equations, please refer to the wonderful course here https://www.coursera.org/learn/matrix-algebra-engineers

My #72 course certificate from Coursera

I am Kesler Zhu, thank you for visiting my website. Check out more course reviews at https://KZHU.ai