7. The matrix exponential

As we well know by now, the solution of the scalar linear IVP \(x'=ax\), \(x(0)=x_0\) is

\[ x(t) = e^{at} x_0. \]

Wouldn’t it be interesting if in the vector case \(\mathbf{x}'=\mathbf{A} \mathbf{x}\), \(\bfx(0)=\bfx_0\), we could write

\[ \bfx(t) = e^{\bfA t} \bfx_0 ? \]

Funny you should ask.

7.1. Definition

We know the Taylor series

\[e^{at} = 1 + at + \frac{1}{2!}(at)^2 + \frac{1}{3!} (at)^3 + \cdots.\]

This series can be generalized directly to a square matrix \(\mathbf{A}\), since all of its nonnegative integer powers are defined:

(7.5)\[e^{t\mathbf{A}} = \mathbf{I} + t\mathbf{A} + \frac{1}{2!}t^2 \mathbf{A}^2 + \frac{1}{3!} t^3 \mathbf{A}^3 + \cdots.\]

Let’s not worry too much about whether this converges (it does). What are its properties?
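
As a quick sanity check on definition (7.5), here is a minimal numerical sketch (the test matrix, the value of \(t\), and the number of terms are chosen arbitrarily for illustration) that sums partial terms of the series and compares the result to SciPy's built-in matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

# An arbitrary 2x2 test matrix and time, chosen only for illustration.
A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
t = 0.5

# Partial sums of the series (7.5): I + tA + (tA)^2/2! + (tA)^3/3! + ...
S = np.eye(2)
term = np.eye(2)
for k in range(1, 25):
    term = term @ (t * A) / k   # next term: (tA)^k / k!
    S = S + term

print(np.allclose(S, expm(t * A)))   # True: the partial sums match expm
```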

Theorem 7.3 (Matrix exponential)

Let \(\mathbf{A},\mathbf{B}\) be \(n\times n\) matrices. Then

  1. \(e^{t\mathbf{A}}=\mathbf{I}\) if \(t=0\),

  2. \(\displaystyle \dd{}{t}e^{t\mathbf{A}} = \mathbf{A} e^{t\mathbf{A}} = e^{t\mathbf{A}}\mathbf{A}\),

  3. \([e^{t\mathbf{A}}]^{-1} = e^{-t\mathbf{A}}\),

  4. \(e^{(s+t)\mathbf{A}} = e^{s\bfA} \cdot e^{t\bfA} = e^{t\bfA} \cdot e^{s\bfA}\),

  5. If \(\mathbf{A}\mathbf{B}=\mathbf{B}\mathbf{A}\), then \(e^{t(\mathbf{A}+\mathbf{B})} = e^{t\mathbf{A}}\cdot e^{t\mathbf{B}} = e^{t\mathbf{B}}\cdot e^{t\mathbf{A}}\).

These conclusions follow fairly easily from the series definition (7.5). Together they give us everything we normally expect from an exponential function, although the last property holds only under the restriction that \(\mathbf{A}\) and \(\mathbf{B}\) commute.
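
Properties 3–5 are easy to check numerically. The following sketch is only an illustration: the matrix and times are arbitrary, and for property 5 it uses \(\mathbf{B}=\mathbf{A}^2\), which certainly commutes with \(\mathbf{A}\).

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, -1.0],
              [5.0, -3.0]])
s, t = 0.3, 0.8

# Property 3: exp(tA) is inverted by exp(-tA).
print(np.allclose(expm(t * A) @ expm(-t * A), np.eye(2)))

# Property 4: exp((s+t)A) = exp(sA) exp(tA).
print(np.allclose(expm((s + t) * A), expm(s * A) @ expm(t * A)))

# Property 5 requires AB = BA; B = A^2 commutes with A.
B = A @ A
print(np.allclose(expm(t * (A + B)), expm(t * A) @ expm(t * B)))
```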

From these properties we can make the connection to the IVP.

Theorem 7.4

If \(\bfA\) is a constant square matrix, then

\[ \bfx(t) = e^{t \bfA} \bfx_0 \]

solves the initial-value problem \(\mathbf{x}'=\mathbf{A} \mathbf{x}\), with \(\bfx(0)=\bfx_0\).

Proof

With \(\bfx\) as defined in the theorem statement, we calculate

\[ \dd{}{t} \bfx(t) = \Bigl( \dd{}{t}e^{t \bfA} \Bigr) \bfx_0 = \mathbf{A} e^{t\mathbf{A}} \bfx_0 = \bfA \bfx(t), \]

using property 2 above. Furthermore,

\[ \bfx(0) = e^{0 \bfA} \bfx_0 = \meye \bfx_0 = \bfx_0, \]

using property 1 above.
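
Theorem 7.4 can also be verified numerically: integrating \(\bfx'=\bfA\bfx\) with a general-purpose ODE solver should reproduce \(e^{t\bfA}\bfx_0\). A minimal sketch, with an arbitrarily chosen matrix and initial condition:

```python
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
x0 = np.array([2.0, 1.0])   # arbitrary initial condition for illustration

# Integrate x' = Ax from t = 0 to t = 1 with tight tolerances.
sol = solve_ivp(lambda t, x: A @ x, (0.0, 1.0), x0, rtol=1e-10, atol=1e-12)

# Compare against x(1) = exp(1*A) x0 from Theorem 7.4.
print(np.allclose(sol.y[:, -1], expm(1.0 * A) @ x0, rtol=1e-6))   # True
```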

7.2. Connection to fundamental matrices

We already had a solution procedure for \(\mathbf{x}'=\mathbf{A} \mathbf{x}\) with \(\bfx(0)=\bfx_0\). We use a fundamental matrix to write the general solution \(\bfx = \mathbf{X}(t) \mathbf{c}\), then apply the initial condition to get

\[ \mathbf{x}_0 = \mathbf{x}(0) = \mathbf{X}(0) \mathbf{c}. \]

This is a linear system that can be solved for \(\mathbf{c}\) using a matrix inverse, leading to

\[ \bfx(t) = \mathbf{X}(t) \mathbf{X}(0)^{-1} \bfx_0. \]

Since the solution of an IVP is unique (and the matrices here are invertible), we get the following useful result.

Formula 7.5 (Matrix exponential)

If \(\mathbf{X}(t)\) is any fundamental matrix for \(\bfx'=\bfA\bfx\), then

\[ e^{t\bfA} = \mathbf{X}(t) \mathbf{X}(0)^{-1}. \]

This formula is one way to avoid the rather daunting prospect of having to sum an infinite series of matrices.
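
Here is a minimal numerical sketch of Formula 7.5, assuming \(\bfA\) is diagonalizable with real eigenvalues: the columns of \(\mathbf{X}(t)\) are \(e^{\lambda_j t}\mathbf{v}_j\), so \(\mathbf{X}(0)\) is simply the matrix of eigenvectors.

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 1.0],
              [4.0, 1.0]])
lam, V = np.linalg.eig(A)    # eigenvalues and eigenvector columns
t = 0.7

# Fundamental matrix X(t): column j is e^{lambda_j t} v_j, so X(0) = V.
X_t = V * np.exp(lam * t)    # scales column j by e^{lambda_j t}
X_0 = V

# Formula 7.5: exp(tA) = X(t) X(0)^{-1}
print(np.allclose(X_t @ np.linalg.inv(X_0), expm(t * A)))   # True
```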

Example

Given

\[\mathbf{A} = \twomat{1}{1}{4}{1}\]

and the eigenpairs \(\lambda_1=3\), \(\mathbf{v}_1 = \twovec{1}{2}\) and \(\lambda_2=-1\), \(\mathbf{v}_2=\twovec{1}{-2}\), find \(e^{t\mathbf{A}}\).
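
A sketch of one route, using Formula 7.5: the eigenpairs give the fundamental matrix columns \(e^{3t}\mathbf{v}_1\) and \(e^{-t}\mathbf{v}_2\), so

\[ \mathbf{X}(t) = \twomat{e^{3t}}{e^{-t}}{2e^{3t}}{-2e^{-t}}, \qquad \mathbf{X}(0)^{-1} = \twomat{1}{1}{2}{-2}^{-1} = \frac{1}{4}\twomat{2}{1}{2}{-1}, \]

and therefore

\[ e^{t\mathbf{A}} = \mathbf{X}(t)\,\mathbf{X}(0)^{-1} = \frac{1}{4} \twomat{2e^{3t}+2e^{-t}}{e^{3t}-e^{-t}}{4e^{3t}-4e^{-t}}{2e^{3t}+2e^{-t}}, \]

which reduces to \(\mathbf{I}\) at \(t=0\), as it must.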

Example

Given that

\[\mathbf{A} = \twomat{1}{-1}{5}{-3}\]

has the eigenpairs

\[\lambda = -1\pm i, \; \mathbf{v} = \twovec{1}{2\mp i},\]

find \(e^{t\bfA}\) and the solution of the IVP \(\bfx'=\mathbf{A}\bfx\), \(\bfx(0)=\twovec{2}{1}\).
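
A sketch of one route: taking real and imaginary parts of the complex solution \(e^{(-1+i)t}\twovec{1}{2-i}\) gives the real fundamental matrix

\[ \mathbf{X}(t) = e^{-t} \twomat{\cos t}{\sin t}{2\cos t + \sin t}{2\sin t - \cos t}, \qquad \mathbf{X}(0)^{-1} = \twomat{1}{0}{2}{-1}^{-1} = \twomat{1}{0}{2}{-1}, \]

so Formula 7.5 and the initial condition give

\[ e^{t\bfA} = e^{-t} \twomat{\cos t + 2\sin t}{-\sin t}{5\sin t}{\cos t - 2\sin t}, \qquad \bfx(t) = e^{t\bfA} \twovec{2}{1} = e^{-t} \twovec{2\cos t + 3\sin t}{\cos t + 8\sin t}. \]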

7.3. Defective matrix case

There is one constant-coefficient situation in \(\bfx'=\bfA \bfx\) for which we have not yet produced a fundamental matrix: when \(\bfA\) is defective. It’s complicated to spell out what happens in full generality, but the \(2\times 2\) case is easily managed.

Theorem 14.7 states that for a \(2\times 2\) matrix with a repeated eigenvalue, the matrix is either a multiple of the identity, or defective. In the latter case, something useful happens, which we state without proof.

Theorem 7.6

If \(\mathbf{A}\) is a defective \(2\times 2\) matrix with double eigenvalue \(\lambda\), then \((\mathbf{A}-\lambda \mathbf{I})^2= \boldsymbol{0}\).

Note that \(\mathbf{A} = \lambda\mathbf{I} + (\mathbf{A}-\lambda \mathbf{I})\), and \(\lambda\mathbf{I}\) commutes with every matrix, so part 5 of the matrix exponential properties theorem above gives

\[ e^{t\mathbf{A}} = e^{t\lambda\mathbf{I}} \, e^{t(\mathbf{A}-\lambda \mathbf{I})}. \]

For the first right-side term we get

\[ e^{t\lambda\mathbf{I}} = \mathbf{I} + t\lambda \mathbf{I} + \frac{1}{2!}(t\lambda)^2 \mathbf{I}^2 + \cdots = \bigl(1 + t\lambda + \frac{1}{2!}(t\lambda)^2 + \cdots\bigr)\, \mathbf{I} = e^{t\lambda} \mathbf{I}, \]

and for the second term, Theorem 7.6 makes every power of \(\mathbf{A}-\lambda \mathbf{I}\) beyond the first equal to the zero matrix, so the series truncates after two terms:

\[ e^{t(\mathbf{A}-\lambda \mathbf{I})} = \mathbf{I} + t(\mathbf{A}-\lambda \mathbf{I}) + \frac{1}{2!}t^2 (\mathbf{A}-\lambda \mathbf{I})^2 + \cdots = \mathbf{I} + t(\mathbf{A}-\lambda \mathbf{I}). \]

Thus we arrive at the following.

Formula 7.7 (Matrix exponential for \(2\times 2\) defective)
(7.6)\[e^{t\mathbf{A}} = \bigl( e^{t\lambda} \meye \bigr) \, \bigl( \mathbf{I} + t(\mathbf{A}-\lambda \mathbf{I}) \bigr) = e^{t\lambda} \bigl( \mathbf{I} + t(\mathbf{A}-\lambda \mathbf{I}) \bigr).\]

Example

Find \(e^{t\mathbf{A}}\) if \(\bfA=\twomat{4}{1}{0}{4}\).
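
A sketch of the computation: \(\bfA\) is triangular, so the double eigenvalue is \(\lambda=4\), and \(\bfA\) is clearly not a multiple of \(\meye\). Since

\[ \bfA - 4\meye = \twomat{0}{1}{0}{0}, \]

Formula (7.6) gives

\[ e^{t\bfA} = e^{4t}\bigl( \meye + t(\bfA-4\meye) \bigr) = e^{4t} \twomat{1}{t}{0}{1}. \]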

Example

A critically damped free oscillator with natural frequency \(\omega_0\) is equivalent to a system with the matrix

\[ \bfA = \twomat{0}{1}{-\omega_0^2}{-2\omega_0}. \]

The characteristic polynomial is \(\lambda^2 + 2\omega_0\lambda + \omega_0^2\), which has the double root \(\lambda=-\omega_0\). Clearly this matrix is not a multiple of \(\meye\), so it is defective. We compute

\[ \bfA - \lambda\meye = \twomat{\omega_0}{1}{-\omega_0^2}{-\omega_0}. \]

This gives the exponential

\[ e^{t\mathbf{A}} = e^{-t\omega_0} \twomat{1+t\omega_0}{t}{-t\omega_0^2}{1-t\omega_0}, \]

which is a fundamental matrix. The oscillator position is the first component of the general solution,

\[ e^{-t\omega_0}[ c_1 (1+t\omega_0) + c_2 t], \]

which is equivalent to the general solution we saw before for this problem, \(e^{-t\omega_0}(c_1+c_2t)\), after renaming the arbitrary constants.
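
If you want to double-check the algebra, here is a minimal numerical sketch (the value of \(\omega_0\) and the time are arbitrary) comparing the closed form above with SciPy's matrix exponential:

```python
import numpy as np
from scipy.linalg import expm

w0 = 2.0                       # sample natural frequency, chosen arbitrarily
t = 0.7
A = np.array([[0.0, 1.0],
              [-w0**2, -2*w0]])

# Closed form derived from Formula (7.6) for this defective matrix.
closed = np.exp(-t * w0) * np.array([[1 + t*w0, t],
                                     [-t*w0**2, 1 - t*w0]])

print(np.allclose(closed, expm(t * A)))   # True
```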

7.4. Propagators

Obviously the formulas lead to some intense algebra for particular examples, even in the \(2\times 2\) case. A computer can handle it, but the elementwise expressions get tediously long in all but special cases.

The theoretical implications are more significant. In the ODE context, \(e^{t\bfA}\) transforms a vector initial condition into the solution vector at time \(t\). This transformation is a linear operator, reflecting the fact that the solution of a linear equation depends linearly on its initial data.

Property 3 above implies that to invert that transformation, you can use \(e^{-t\bfA}\), which is the same as running the ODE in “negative time.” More formally, it’s a statement about time reversibility.

Property 4 above implies that the evolution at time \(t+s\) is equivalent to evolving by time \(t\), then by time \(s\) (or vice versa). This is a statement about time invariance. A linear equation with a non-constant coefficient matrix also has a propagator matrix, but it’s not a matrix exponential, and the time invariance is broken.