Review of linear algebra

Terminology

An ordinary number in $\mathbb{R}$ or $\mathbb{C}$ may be called a scalar. An $m\times n$ matrix $\mathbf{A}$ is a rectangular $m$-by-$n$ array of numbers called elements or entries. The numbers $m$ and $n$ are called the row dimension and the column dimension, respectively; collectively they describe the size or shape of $\mathbf{A}$. We say $\mathbf{A}$ belongs to the set $\mathbb{R}^{m\times n}$ if its entries are real, or $\mathbb{C}^{m\times n}$ if they are complex-valued. A square matrix has equal row and column dimensions. A row vector has dimension $1\times n$, while a column vector has dimension $m\times 1$.

In this text, all vectors are column vectors, and we use $\mathbb{R}^n$ or $\mathbb{C}^n$ to denote spaces of these vectors. When a row vector is needed, it is given an explicit transpose symbol (see below).

We use capital letters in bold to refer to matrices, and lowercase bold letters for vectors. The bold symbol $\boldsymbol{0}$ may refer to a vector of all zeros or to a zero matrix, depending on context; we use $0$ as the scalar zero only.

To refer to a specific element of a matrix, we use the uppercase name of the matrix without boldface. For instance, $A_{24}$ refers to the $(2,4)$ element of $\mathbf{A}$. To refer to an element of a vector, we use just one subscript, as in $x_3$. A boldface character with one or more subscripts, on the other hand, is a matrix (uppercase) or vector (lowercase) that belongs to a numbered collection.

We will frequently need to refer to the individual columns of a matrix as vectors. We use a lowercase bold version of the matrix name with a subscript to represent the column number. For example, $\mathbf{a}_1,\mathbf{a}_2,\ldots,\mathbf{a}_n$ are the columns of the $m\times n$ matrix $\mathbf{A}$. Conversely, whenever we define a sequence of vectors $\mathbf{v}_1,\ldots,\mathbf{v}_p$, we can implicitly consider them to be columns of a matrix $\mathbf{V}$. Sometimes we might write

$$\mathbf{V} = \bigl[ \mathbf{v}_j \bigr]$$

to emphasize the connection between a matrix and its columns.

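The diagonal (more specifically, main diagonal) of an $n\times n$ matrix $\mathbf{A}$ refers to the entries $A_{ii}$, $i=1,\ldots,n$. The entries $A_{ij}$ where $j-i=k$ are on a superdiagonal if $k>0$ and a subdiagonal if $k<0$. The diagonals are numbered as indicated here:

$$\begin{bmatrix}
0 & 1 & 2 & \cdots & n-1 \\
-1 & 0 & 1 & \cdots & n-2 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
-n+2 & \cdots & -1 & 0 & 1 \\
-n+1 & \cdots & -2 & -1 & 0
\end{bmatrix}.$$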

A diagonal matrix is one whose entries are all zero off the main diagonal. An upper triangular matrix $\mathbf{U}$ has entries $U_{ij}$ with $U_{ij}=0$ if $i>j$, and a lower triangular matrix $\mathbf{L}$ has $L_{ij}=0$ if $i<j$.
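The diagonal numbering and the triangular conditions can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: matrices are stored as lists of rows, indices are 0-based (so the text's $A_{ij}$ is `A[i-1][j-1]`), and the helper names `diagonal`, `is_upper_triangular`, and `is_lower_triangular` are made up for this example.

```python
def diagonal(A, k=0):
    """Entries A[i][j] of a square matrix with j - i == k (0-based indices)."""
    n = len(A)
    return [A[i][i + k] for i in range(n) if 0 <= i + k < n]


def is_upper_triangular(A):
    """True when every entry with i > j (below the main diagonal) is zero."""
    return all(A[i][j] == 0 for i in range(len(A)) for j in range(i))


def is_lower_triangular(A):
    """True when every entry with i < j (above the main diagonal) is zero."""
    n = len(A)
    return all(A[i][j] == 0 for i in range(n) for j in range(i + 1, n))


A = [[1, 2, 3],
     [0, 4, 5],
     [0, 0, 6]]

main_diag = diagonal(A)        # k = 0, the main diagonal
first_super = diagonal(A, 1)   # k = 1, first superdiagonal
first_sub = diagonal(A, -1)    # k = -1, first subdiagonal
print(main_diag, first_super, first_sub)
print(is_upper_triangular(A), is_lower_triangular(A))
```

A matrix that passes both triangular checks is diagonal, which matches the definitions above.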

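The transpose of $\mathbf{A}\in\mathbb{C}^{m\times n}$ is the matrix $\mathbf{A}^T\in\mathbb{C}^{n\times m}$ given by

$$\mathbf{A}^T =
\begin{bmatrix}
A_{11} & A_{21} & \cdots & A_{m1} \\
\vdots & \vdots & & \vdots \\
A_{1n} & A_{2n} & \cdots & A_{mn}
\end{bmatrix}.$$

The adjoint or hermitian of a matrix $\mathbf{A}$ is given by $\mathbf{A}^* = \overline{\mathbf{A}^T}$, where the bar denotes taking a complex conjugate elementwise.[1] If $\mathbf{A}$ is real, then $\mathbf{A}^* = \mathbf{A}^T$. A square matrix is symmetric if $\mathbf{A}^T = \mathbf{A}$ and hermitian if $\mathbf{A}^* = \mathbf{A}$.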
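As a small illustration of these definitions, the following sketch computes a transpose and an adjoint in plain Python. The list-of-rows storage and the function names (`transpose`, `adjoint`, `is_symmetric`, `is_hermitian`) are assumptions made for this example only.

```python
def transpose(A):
    """Swap rows and columns: entry (i, j) of the result is A[j][i]."""
    return [list(col) for col in zip(*A)]


def adjoint(A):
    """Conjugate transpose: entry (i, j) of the result is conj(A[j][i])."""
    return [[a.conjugate() for a in col] for col in zip(*A)]


def is_symmetric(A):
    return transpose(A) == A


def is_hermitian(A):
    return adjoint(A) == A


A = [[1 + 2j, 3],
     [0, 4 - 1j],
     [5j, 6]]              # a 3-by-2 complex matrix

At = transpose(A)          # 2-by-3
Astar = adjoint(A)         # 2-by-3, with every entry conjugated

H = [[2, 1 - 1j],
     [1 + 1j, 3]]          # equal to its own adjoint, but not to its transpose
print(At)
print(Astar)
print(is_symmetric(H), is_hermitian(H))
```

For a real matrix the two functions return the same entries, since conjugation leaves real numbers unchanged.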

Algebra

Matrices and vectors of the same size may be added elementwise. Multiplication by a scalar is also defined elementwise. These operations obey the familiar laws of commutativity, associativity, and distributivity. The multiplication of two matrices, on the other hand, is less straightforward.

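There are two ways for vectors to be multiplied together. If $\mathbf{v}$ and $\mathbf{w}$ are in $\mathbb{C}^n$, their inner product is

$$\mathbf{v}^* \mathbf{w} = \sum_{k=1}^n \overline{v_k}\, w_k.$$

Trivially, one finds that $\mathbf{w}^* \mathbf{v} = \overline{(\mathbf{v}^*\mathbf{w})}$.

Additionally, any two vectors $\mathbf{v}\in\mathbb{C}^m$ and $\mathbf{w}\in\mathbb{C}^n$ (with $m\neq n$ allowed) have an outer product, which is an $m\times n$ matrix:

$$\mathbf{v}\mathbf{w}^* = \bigl[ v_i \overline{w_j} \bigr]_{\,i=1,\ldots,m,\; j=1,\ldots,n}
= \begin{bmatrix}
v_1\overline{w_1} & v_1\overline{w_2} & \cdots & v_1\overline{w_n} \\
v_2\overline{w_1} & v_2\overline{w_2} & \cdots & v_2\overline{w_n} \\
\vdots & \vdots & & \vdots \\
v_m\overline{w_1} & v_m\overline{w_2} & \cdots & v_m\overline{w_n}
\end{bmatrix}. \tag{1}$$

For real vectors, the complex conjugates above have no effect and ${}^*$ becomes ${}^T$.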
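The two vector products can be sketched directly from their definitions. This is a minimal illustration, assuming vectors are stored as Python lists; the names `inner` and `outer` are chosen for this example.

```python
def inner(v, w):
    """Inner product v* w = sum over k of conj(v_k) * w_k."""
    if len(v) != len(w):
        raise ValueError("inner product needs vectors of equal length")
    return sum(vk.conjugate() * wk for vk, wk in zip(v, w))


def outer(v, w):
    """Outer product v w*: an m-by-n matrix with entries v_i * conj(w_j)."""
    return [[vi * wj.conjugate() for wj in w] for vi in v]


v = [1 + 1j, 2]
w = [3, 1j]
z = [1, 1j, 2]             # a different length is fine for the outer product

vw = inner(v, w)
wv = inner(w, v)           # the complex conjugate of vw
P = outer(v, z)            # 2-by-3 matrix
print(vw, wv)
print(P)
```

Swapping the arguments of `inner` conjugates the result, which is the identity stated above.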

In order for matrices $\mathbf{A}$ and $\mathbf{B}$ to be multiplied, it is necessary that their inner dimensions match. Thus, if $\mathbf{A}$ is $m\times p$, then $\mathbf{B}$ must be $p\times n$. In terms of scalar components, the $(i,j)$ entry of $\mathbf{C}=\mathbf{A}\mathbf{B}$ is given by

$$C_{ij} = \sum_{k=1}^p A_{ik} B_{kj}. \tag{2}$$

Note that even if $\mathbf{A}\mathbf{B}$ is defined, $\mathbf{B}\mathbf{A}$ may not be. Moreover, even when both products are defined, they may not equal each other.

Observation 1

Matrix multiplication is not commutative, i.e., the order of terms in a product matters to the result.
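The entrywise formula (2) translates directly into code. The sketch below is an illustration only: it assumes list-of-rows storage, the function name `matmul` is chosen for this example, and the small matrices are arbitrary.

```python
def matmul(A, B):
    """Product C = AB with C[i][j] = sum over k of A[i][k] * B[k][j], as in (2)."""
    m, p, n = len(A), len(B), len(B[0])
    if any(len(row) != p for row in A):
        raise ValueError("inner dimensions must match")
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]


A = [[1, 2],
     [3, 4]]
B = [[0, 1],
     [1, 0]]
C = [[1, 0, 2],
     [0, 1, 3]]            # 2-by-3

AB = matmul(A, B)          # swaps the columns of A
BA = matmul(B, A)          # swaps the rows of A
AC = matmul(A, C)          # defined: (2-by-2)(2-by-3)

try:
    matmul(C, A)           # not defined: (2-by-3)(2-by-2)
    CA_defined = True
except ValueError:
    CA_defined = False

print(AB, BA, AB == BA, CA_defined)
```

Here both `AB` and `BA` exist and have the same size, yet they differ, while `AC` exists and the reversed product does not.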

Matrix multiplication is associative, however:

$$\mathbf{A}\mathbf{B}\mathbf{C} = (\mathbf{A}\mathbf{B})\mathbf{C} = \mathbf{A}(\mathbf{B}\mathbf{C}).$$

Hence while we cannot change the ordering of the terms, we can change the order of the operations. This is a property that we will use repeatedly. We also note here the important identity

$$(\mathbf{A}\mathbf{B})^T = \mathbf{B}^T\mathbf{A}^T. \tag{3}$$

Specifically, if either product is defined, then they both are defined and equal each other.
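Both identities can be spot-checked numerically. The following sketch uses exact integer arithmetic on arbitrary small matrices; the helper names `matmul` and `transpose` are assumptions for this example, and a check on one set of matrices is of course an illustration rather than a proof.

```python
def matmul(A, B):
    """Entrywise definition (2) of the matrix product."""
    p = len(B)
    if any(len(row) != p for row in A):
        raise ValueError("inner dimensions must match")
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(col) for col in zip(*A)]


A = [[1, 2, 0],
     [-1, 3, 1]]           # 2-by-3
B = [[2, 1],
     [0, -1],
     [4, 2]]               # 3-by-2
C = [[1, -2],
     [3, 0]]               # 2-by-2

left_first = matmul(matmul(A, B), C)     # (AB)C
right_first = matmul(A, matmul(B, C))    # A(BC)

lhs = transpose(matmul(A, B))                # (AB)^T
rhs = matmul(transpose(B), transpose(A))     # B^T A^T

print(left_first == right_first, lhs == rhs)
```

Note that `transpose(A)` is 3-by-2 and `transpose(B)` is 2-by-3, so the product in the original order $\mathbf{A}^T\mathbf{B}^T$ would be a 3-by-3 matrix and could not equal the 2-by-2 matrix $(\mathbf{A}\mathbf{B})^T$.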

Linear combinations

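It is worth reinterpreting (2) at a vector level. If $\mathbf{A}$ has dimensions $m\times n$, it can be multiplied on the right by an $n\times 1$ column vector $\mathbf{v}$ to produce an $m\times 1$ column vector $\mathbf{A}\mathbf{v}$, which satisfies

$$\mathbf{A}\mathbf{v} =
\begin{bmatrix}
\sum_k A_{1k} v_k \\ \sum_k A_{2k} v_k \\ \vdots \\ \sum_k A_{mk} v_k
\end{bmatrix}
= v_1 \begin{bmatrix} A_{11} \\ A_{21} \\ \vdots \\ A_{m1} \end{bmatrix}
+ v_2 \begin{bmatrix} A_{12} \\ A_{22} \\ \vdots \\ A_{m2} \end{bmatrix}
+ \cdots
+ v_n \begin{bmatrix} A_{1n} \\ A_{2n} \\ \vdots \\ A_{mn} \end{bmatrix}
= v_1 \mathbf{a}_1 + \cdots + v_n \mathbf{a}_n. \tag{4}$$

We say that $\mathbf{A}\mathbf{v}$ is a linear combination of the columns of $\mathbf{A}$.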

Observation 2

Multiplying a matrix on the right by a column vector produces a linear combination of the columns of the matrix.
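Observation 2 can be illustrated by computing a matrix-vector product in two ways: entry by entry, and as a weighted sum of columns. The function names and the sample data below are assumptions for this sketch.

```python
def matvec(A, v):
    """Entrywise product: (Av)_i = sum over k of A[i][k] * v[k]."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def column(A, j):
    """The j-th column of A (0-based) as a list."""
    return [row[j] for row in A]


def column_combination(A, v):
    """Accumulate v_1 a_1 + ... + v_n a_n, one column at a time."""
    result = [0] * len(A)
    for j, vj in enumerate(v):
        result = [r + vj * a for r, a in zip(result, column(A, j))]
    return result


A = [[2, 0, 1],
     [1, 3, -1]]           # 2-by-3
v = [1, -2, 4]

Av = matvec(A, v)
combo = column_combination(A, v)
print(Av, combo)
```

The two results agree because each sum in (4) is just regrouped by column.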

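There is a similar interpretation of multiplying $\mathbf{A}$ on the left by a row vector. Keeping to our convention that boldface letters represent column vectors, we write, for $\mathbf{v}\in\mathbb{R}^m$,

$$\mathbf{v}^T \mathbf{A} =
\begin{bmatrix}
\sum_k v_k A_{k1} & \sum_k v_k A_{k2} & \cdots & \sum_k v_k A_{kn}
\end{bmatrix}
= v_1 \begin{bmatrix} A_{11} & \cdots & A_{1n} \end{bmatrix}
+ v_2 \begin{bmatrix} A_{21} & \cdots & A_{2n} \end{bmatrix}
+ \cdots
+ v_m \begin{bmatrix} A_{m1} & \cdots & A_{mn} \end{bmatrix}. \tag{5}$$

Observation 3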

Multiplying a matrix on the left by a row vector produces a linear combination of the rows of the matrix.
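The row version can be sketched the same way. Here the row vector $\mathbf{v}^T$ is stored as a plain list, and `vecmat` and `row_combination` are names invented for this illustration.

```python
def vecmat(v, A):
    """Entrywise product: (v^T A)_j = sum over k of v[k] * A[k][j]."""
    return [sum(v[k] * A[k][j] for k in range(len(v))) for j in range(len(A[0]))]


def row_combination(v, A):
    """Accumulate v_1 (row 1) + ... + v_m (row m), one row at a time."""
    result = [0] * len(A[0])
    for vk, row in zip(v, A):
        result = [r + vk * a for r, a in zip(result, row)]
    return result


A = [[2, 0, 1],
     [1, 3, -1]]           # 2-by-3
v = [3, -1]                # entries of the row vector v^T

vTA = vecmat(v, A)
combo = row_combination(v, A)
print(vTA, combo)
```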

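These two observations extend to more general matrix-matrix multiplications. One can show that (assuming that $\mathbf{A}$ is $m\times p$ and $\mathbf{B}$ is $p\times n$)

$$\mathbf{A}\mathbf{B} = \mathbf{A}
\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix}
= \begin{bmatrix} \mathbf{A}\mathbf{b}_1 & \mathbf{A}\mathbf{b}_2 & \cdots & \mathbf{A}\mathbf{b}_n \end{bmatrix}. \tag{6}$$

Equivalently, if we write $\mathbf{A}$ in terms of rows, then

$$\mathbf{A} =
\begin{bmatrix} \mathbf{w}_1^T \\ \mathbf{w}_2^T \\ \vdots \\ \mathbf{w}_m^T \end{bmatrix}
\qquad \Longrightarrow \qquad
\mathbf{A}\mathbf{B} =
\begin{bmatrix} \mathbf{w}_1^T \mathbf{B} \\ \mathbf{w}_2^T \mathbf{B} \\ \vdots \\ \mathbf{w}_m^T \mathbf{B} \end{bmatrix}. \tag{7}$$

Observation 4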

A matrix-matrix product is a horizontal concatenation of matrix-vector products involving the columns of the right-hand matrix. Equivalently, a matrix-matrix product is also a vertical concatenation of vector-matrix products involving the rows of the left-hand matrix.

The representations of matrix multiplication are interchangeable; whichever one is most convenient at any moment can be used.

Example 5

Let

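$$\mathbf{A} = \begin{bmatrix} 1 & -1 \\ 0 & 2 \\ -3 & 1 \end{bmatrix}, \qquad
\mathbf{B} = \begin{bmatrix} 2 & -1 & 0 & 4 \\ 1 & 1 & 3 & 2 \end{bmatrix}.$$

Then, going by (2), we get

$$\begin{aligned}
\mathbf{A}\mathbf{B} &= \begin{bmatrix}
(1)(2)+(-1)(1) & (1)(-1)+(-1)(1) & (1)(0)+(-1)(3) & (1)(4)+(-1)(2) \\
(0)(2)+(2)(1) & (0)(-1)+(2)(1) & (0)(0)+(2)(3) & (0)(4)+(2)(2) \\
(-3)(2)+(1)(1) & (-3)(-1)+(1)(1) & (-3)(0)+(1)(3) & (-3)(4)+(1)(2)
\end{bmatrix} \\
&= \begin{bmatrix}
1 & -2 & -3 & 2 \\
2 & 2 & 6 & 4 \\
-5 & 4 & 3 & -10
\end{bmatrix}.
\end{aligned}$$

But note also, for instance, that

$$\mathbf{A} \begin{bmatrix} 2 \\ 1 \end{bmatrix}
= 2 \begin{bmatrix} 1 \\ 0 \\ -3 \end{bmatrix}
+ 1 \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 1 \\ 2 \\ -5 \end{bmatrix},$$

and so on, in accordance with (6).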
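The three views of the product in Example 5 can be compared in code. The sketch below is illustrative: it uses list-of-rows storage, helper names of our own choosing, and rebuilds $\mathbf{A}\mathbf{B}$ by the entrywise formula (2), column by column as in (6), and row by row as in (7).

```python
def matmul(A, B):
    """Entrywise formula (2)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(A, v):
    """Matrix times column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def vecmat(v, B):
    """Row vector times matrix."""
    return [sum(v[k] * B[k][j] for k in range(len(v))) for j in range(len(B[0]))]


A = [[1, -1],
     [0, 2],
     [-3, 1]]
B = [[2, -1, 0, 4],
     [1, 1, 3, 2]]

AB = matmul(A, B)

# (6): each column of AB is A times the matching column of B.
cols = [matvec(A, [row[j] for row in B]) for j in range(len(B[0]))]
AB_by_columns = [[col[i] for col in cols] for i in range(len(A))]

# (7): each row of AB is the matching row of A times B.
AB_by_rows = [vecmat(w, B) for w in A]

print(AB)
print(AB_by_columns == AB, AB_by_rows == AB)
```

The first entry of `cols` is the product $\mathbf{A}[2,\,1]^T$ worked out by hand in the example.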

There is also an interpretation, presented in Section 2.4, of matrix products in terms of vector outer products.

Identity and inverse

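The identity matrix of size $n$, called $\mathbf{I}$ (or sometimes $\mathbf{I}_n$), is a diagonal $n\times n$ matrix with every diagonal entry equal to one. As can be seen from (6) and (7), it satisfies $\mathbf{A}\mathbf{I}=\mathbf{A}$ for $\mathbf{A}\in\mathbb{C}^{m\times n}$ and $\mathbf{I}\mathbf{B}=\mathbf{B}$ for $\mathbf{B}\in\mathbb{C}^{n\times p}$. It is therefore the matrix analog of the number 1, the multiplicative identity.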

Example 6

Let

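$$\mathbf{B} = \begin{bmatrix}
2 & 1 & 7 & 4 \\
6 & 0 & -1 & 0 \\
-4 & -4 & 0 & 1
\end{bmatrix}.$$

Suppose we want to create a zero in the $(2,1)$ entry by adding $-3$ times the first row to the second row, leaving the other rows unchanged.

We can express this operation as a product $\mathbf{A}\mathbf{B}$ as follows. From dimensional considerations alone, $\mathbf{A}$ will need to be $3\times 3$. According to (5), we get “$-3$ times row 1 plus row 2” from left-multiplying $\mathbf{B}$ by the row vector $\begin{bmatrix} -3 & 1 & 0 \end{bmatrix}$. Equation (7) tells us that this must be the second row of $\mathbf{A}$.

Since the first and third rows of $\mathbf{A}\mathbf{B}$ are the same as those of $\mathbf{B}$, similar logic tells us that the first and third rows of $\mathbf{A}$ are the same as those of the identity matrix:

$$\begin{bmatrix}
1 & 0 & 0 \\
-3 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix} \mathbf{B}
= \begin{bmatrix}
2 & 1 & 7 & 4 \\
0 & -3 & -22 & -12 \\
-4 & -4 & 0 & 1
\end{bmatrix}.$$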

This can be verified using (2).
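The verification can be sketched in code. The helper `row_addition_matrix` below is a hypothetical name introduced for this example: it starts from the identity and places the multiplier in one off-diagonal position, using 0-based row indices.

```python
def identity(n):
    """The n-by-n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Entrywise formula (2)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def row_addition_matrix(n, target, source, factor):
    """Matrix whose left-multiplication adds factor * (row source) to row target."""
    E = identity(n)
    E[target][source] = factor
    return E


B = [[2, 1, 7, 4],
     [6, 0, -1, 0],
     [-4, -4, 0, 1]]

# Add -3 times the first row (index 0) to the second row (index 1).
A = row_addition_matrix(3, target=1, source=0, factor=-3)
AB = matmul(A, B)
print(A)
print(AB)
```

Only the second row of the result differs from $\mathbf{B}$, and its first entry is now zero.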

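Note that a square matrix $\mathbf{A}$ can always be multiplied by itself to get a matrix of the same size. Hence we can define the integer powers $\mathbf{A}^2=(\mathbf{A})(\mathbf{A})$, $\mathbf{A}^3=(\mathbf{A}^2)\mathbf{A}=(\mathbf{A})\mathbf{A}^2$ (by associativity), and so on. By definition, $\mathbf{A}^0=\mathbf{I}$.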
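A nonnegative integer power can be sketched as repeated multiplication starting from the identity, which also makes the convention $\mathbf{A}^0=\mathbf{I}$ automatic. The function name `matrix_power` and the sample matrix are assumptions for this illustration.

```python
def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]


def matmul(A, B):
    """Entrywise formula (2)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matrix_power(A, k):
    """A multiplied by itself k times, for an integer k >= 0."""
    if k < 0:
        raise ValueError("only nonnegative powers are handled in this sketch")
    result = identity(len(A))      # A^0 = I
    for _ in range(k):
        result = matmul(result, A)
    return result


A = [[1, 1],
     [0, 1]]

A0 = matrix_power(A, 0)
A2 = matrix_power(A, 2)
A3 = matrix_power(A, 3)
print(A0, A2, A3)
print(matmul(A2, A) == matmul(A, A2))   # both groupings give A^3
```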

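If $\mathbf{A}$ is an $n\times n$ matrix, then there is at most one matrix $\mathbf{Z}$ of the same size such that

$$\mathbf{Z}\mathbf{A} = \mathbf{A}\mathbf{Z} = \mathbf{I}.$$

If $\mathbf{Z}$ exists, it is called the inverse of $\mathbf{A}$ and is written as $\mathbf{A}^{-1}$. In this situation we say that $\mathbf{A}$ is invertible.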

The zero matrix has no inverse. For $n>1$ there are also nonzero matrices that have no inverse. Such matrices are called singular. The properties “invertible” and “singular” are exclusive opposites; thus, nonsingular means invertible and noninvertible means singular.
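The defining property of the inverse is easy to check for a given candidate. In the sketch below, the matrices `A` and `Z` are an example chosen for illustration, and the check confirms both products required by the definition.

```python
def matmul(A, B):
    """Entrywise formula (2)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def is_inverse_pair(A, Z):
    """True when ZA = AZ = I for square matrices of the same size."""
    n = len(A)
    I = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(Z, A) == I and matmul(A, Z) == I


A = [[2, 1],
     [5, 3]]
Z = [[3, -1],
     [-5, 2]]              # candidate inverse of A

zero = [[0, 0],
        [0, 0]]

print(is_inverse_pair(A, Z))       # Z is the inverse of A
print(is_inverse_pair(zero, Z))    # any product with the zero matrix is zero, never I
```

The check only confirms a proposed inverse; it does not compute one.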

Linear systems

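Given a square, $n\times n$ matrix $\mathbf{A}$ and $n$-vectors $\mathbf{x}$ and $\mathbf{b}$, the equation $\mathbf{A}\mathbf{x}=\mathbf{b}$ is equivalent to

$$\begin{aligned}
A_{11}x_1 + A_{12}x_2 + \cdots + A_{1n}x_n &= b_1 \\
A_{21}x_1 + A_{22}x_2 + \cdots + A_{2n}x_n &= b_2 \\
&\;\;\vdots \\
A_{n1}x_1 + A_{n2}x_2 + \cdots + A_{nn}x_n &= b_n.
\end{aligned}$$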

The following facts are usually proved in any elementary text on linear algebra.

Theorem 7 :  Linear algebra equivalence

The following statements are equivalent:

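  1. $\mathbf{A}$ is nonsingular.

  2. $(\mathbf{A}^{-1})^{-1} = \mathbf{A}$.

  3. $\mathbf{A}\mathbf{x}=\boldsymbol{0}$ implies that $\mathbf{x}=\boldsymbol{0}$.

  4. $\mathbf{A}\mathbf{x}=\mathbf{b}$ has a unique solution, $\mathbf{x}=\mathbf{A}^{-1}\mathbf{b}$, for any $n$-vector $\mathbf{b}$.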
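Statements 3 and 4 can be illustrated on small examples. In this sketch the matrices, the known inverse `A_inv`, and the vector names are all chosen for illustration; the inverse is supplied by hand rather than computed.

```python
def matvec(A, v):
    """Matrix times column vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


# A nonsingular matrix together with its inverse (supplied, not computed).
A = [[2, 1],
     [5, 3]]
A_inv = [[3, -1],
         [-5, 2]]
b = [1, 2]

x = matvec(A_inv, b)       # statement 4: x = A^{-1} b
residual = [bi - yi for bi, yi in zip(b, matvec(A, x))]

# A singular matrix: its second row is twice its first row.
S = [[1, 2],
     [2, 4]]
z = [2, -1]                # a nonzero vector
Sz = matvec(S, z)          # equals the zero vector, so statement 3 fails for S

print(x, residual)
print(Sz)
```

Because `S` sends a nonzero vector to zero, statement 3 fails for it, and by the theorem `S` is singular.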

Exercises

  1. ✍ In racquetball, the winner of a rally serves the next rally. Generally, the server has an advantage. Suppose that when Ashley and Barbara are playing racquetball, Ashley wins 60% of the rallies she serves and Barbara wins 70% of the rallies she serves. If $\mathbf{x}\in\mathbb{R}^2$ is such that $x_1$ is the probability that Ashley serves first and $x_2=1-x_1$ is the probability that Barbara serves first, define a matrix $\mathbf{A}$ such that $\mathbf{A}\mathbf{x}$ is a vector of the probabilities that Ashley and Barbara each serve the second rally. What is the meaning of $\mathbf{A}^{10}\mathbf{x}$?

  2. ✍ Suppose we have lists of $n$ terms and $m$ documents. We can define an $m\times n$ matrix $\mathbf{A}$ such that $A_{ij}=1$ if term $j$ appears in document $i$, and $A_{ij}=0$ otherwise. Now suppose that the term list is

    "numerical", "analysis", "more", "cool", "accounting"
    

    and that $\mathbf{x}=\begin{bmatrix} 1 & 1 & 0 & 1 & 0 \end{bmatrix}^T$. Give an interpretation of the product $\mathbf{A}\mathbf{x}$.

  3. ✍ Let

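    $$\mathbf{A} = \begin{bmatrix}
    0 & 1 & 0 & 0 \\
    0 & 0 & 0 & 1 \\
    0 & 0 & 0 & 0 \\
    0 & 0 & 1 & 0
    \end{bmatrix}.$$

    Show that $\mathbf{A}^n=\boldsymbol{0}$ when $n\ge 4$.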

  4. ✍ Find two matrices $\mathbf{A}$ and $\mathbf{B}$, neither of which is the zero matrix, such that $\mathbf{A}\mathbf{B}=\boldsymbol{0}$.

  5. ✍ Prove that when $\mathbf{A}\mathbf{B}$ is defined, $\mathbf{B}^T\mathbf{A}^T$ is defined too, and use Equation (2) to show that $(\mathbf{A}\mathbf{B})^T=\mathbf{B}^T\mathbf{A}^T$.

  6. ✍ Show that if $\mathbf{A}$ is invertible, then $(\mathbf{A}^T)^{-1}=(\mathbf{A}^{-1})^T$. (This matrix is often just written as $\mathbf{A}^{-T}$.)

  7. ✍ Prove true, or give a counterexample: The product of upper triangular square matrices is upper triangular.


1

The conjugate of a complex number is found by replacing all references to the imaginary unit $i$ by $-i$. We do not use complex numbers until the second half of the book.