2. Conversion of higher-order ODEs

We begin with a vital observation that applies to all ODEs, not just linear ones: order can be exchanged for dimension.

Example

The forced, damped harmonic oscillator

\[ x'' + 2 Z \omega_0 x' + \omega_0^2 x = f(t) \]

can be converted to an equivalent first-order system using the definitions

\[ u_1 = x, \quad u_2 = x'. \]

We can easily derive an ODE for the vector \(\bfu\), whose components are \(u_1\) and \(u_2\), without reference to \(x\). First, by definition, \(u_1'=u_2\). Next, \(u_2' = x''\), and we can isolate \(x''\) in the original equation to get

\[ u_2' = f - 2 Z \omega_0 x' - \omega_0^2 x = f - 2 Z \omega_0 u_2 - \omega_0^2 u_1. \]

Hence

\[\begin{split} u_1' &= u_2,\\ u_2' &= f - 2 Z \omega_0 u_2 - \omega_0^2 u_1. \end{split}\]
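As a numerical sanity check, the system above can be coded directly. What follows is a minimal Python sketch, not part of the original text. The names `oscillator_rhs` and `euler_solve` are hypothetical, and forward Euler is used only because it is the shortest method to write down, not because it is a good choice in practice.

```python
import math

def oscillator_rhs(t, u, Z, omega0, forcing):
    """Right-hand side of the first-order system for
    x'' + 2*Z*omega0*x' + omega0**2*x = f(t), with u = [x, x']."""
    u1, u2 = u
    return [u2, forcing(t) - 2 * Z * omega0 * u2 - omega0**2 * u1]

def euler_solve(rhs, u0, t0, t1, n):
    """Advance u' = rhs(t, u) from t0 to t1 using n forward Euler steps.
    Illustration only; Euler is only first-order accurate."""
    h = (t1 - t0) / n
    t, u = t0, list(u0)
    for _ in range(n):
        du = rhs(t, u)
        u = [ui + h * dui for ui, dui in zip(u, du)]
        t += h
    return u

# Assumed test case: no damping, no forcing, omega0 = 1, x(0) = 1, x'(0) = 0,
# for which the exact solution is x(t) = cos(t).
rhs = lambda t, u: oscillator_rhs(t, u, Z=0.0, omega0=1.0, forcing=lambda s: 0.0)
u_end = euler_solve(rhs, [1.0, 0.0], 0.0, 1.0, 10000)
```

In this test case the first component of `u_end` should be close to \(\cos(1)\) and the second close to \(-\sin(1)\), which is one way to confirm that the first-order system carries the same information as the original second-order equation.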

Example

A frictionless pendulum is governed by the nonlinear equation \(\theta''+ \frac{g}{L}\sin(\theta)=0\), where \(\theta(t)\) is the angle made by the pendulum from the straight-down position. Define

\[ u_1=\theta, \quad u_2=\theta'. \]

Then \(u_1'=u_2\). Since \(u_2'=\theta''\), we can use the original ODE to express it in terms of \(\theta\) and \(\theta'\), and hence in terms of \(u_1\) and \(u_2\). Altogether the system is, in vector notation,

\[\twovec{u_1'}{u_2'} = \twovec{u_2}{- \frac{g}{L}\sin(u_1)}.\]
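The pendulum system translates to code in the same way. The sketch below is illustrative only: the function names and the default values `g=9.81` and `L=1.0` are assumptions, not taken from the text. As a consistency check it also evaluates the rate of change of the scaled energy \(E = \tfrac12 u_2^2 - \tfrac{g}{L}\cos(u_1)\), which the chain rule says should vanish along solutions of the frictionless pendulum.

```python
import math

def pendulum_rhs(t, u, g=9.81, L=1.0):
    """Right-hand side of the first-order pendulum system,
    with u = [theta, theta']. Default g and L are assumed values."""
    u1, u2 = u
    return [u2, -(g / L) * math.sin(u1)]

def energy_rate(t, u, g=9.81, L=1.0):
    """dE/dt for E = u2**2/2 - (g/L)*cos(u1), computed by the chain rule
    as u2*u2' + (g/L)*sin(u1)*u1' with derivatives taken from pendulum_rhs."""
    u1, u2 = u
    du1, du2 = pendulum_rhs(t, u, g, L)
    return u2 * du2 + (g / L) * math.sin(u1) * du1
```

For any state `u`, `energy_rate(t, u)` should be zero up to rounding error, since the two terms cancel exactly in the frictionless case.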

2.1. General procedure

The technique illustrated above generalizes to equations of any order, and to systems whose equations have any mix of orders. The formal procedure is a bit complicated to describe but entirely mechanical; there are no decisions to make.

We will use a template example to keep the notation under control. Suppose you are given a system in three scalar variables:

\[\begin{split} x'' &= \alpha(t,x,x',y,z,z',z''),\\ y' &= \beta(t,x,x',y,z,z',z''),\\ z''' &= \gamma(t,x,x',y,z,z',z''). \end{split}\]

In any given example, not every function need depend on every listed variable, and that is fine. Note that the functions \(\alpha,\beta,\gamma\) may depend on derivatives of each variable only up to one order less than the order at which that variable appears on the left-hand side. If, say, \(\alpha\) depended on \(x''\) as well, the system would be implicitly defined, and while this too can be handled, we will not encounter such systems.

Our definitions for conversion to a first-order system are

\[\begin{split} u_1 &= x, \, u_2 = x', \\ u_3 &= y,\\ u_4 &= z, \, u_5 = z', \, u_6 = z''. \end{split}\]

The key is to include a component for each of the lower-order derivatives of each variable, whether or not those terms appear explicitly in the original problem. Everything else now flows from these definitions:

\[\begin{split} u_1' &= x' = u_2,\\ u_2' &= x'' = \alpha(t,u_1,u_2,u_3,u_4,u_5,u_6), \\ u_3' &= y' = \beta(t,u_1,u_2,u_3,u_4,u_5,u_6),\\ u_4' &= z' = u_5,\\ u_5' &= z'' = u_6,\\ u_6' &= z''' = \gamma(t,u_1,u_2,u_3,u_4,u_5,u_6). \end{split}\]

The final expression of the new system is \(\bfu' = \bff(t,\bfu)\), free of all references to the original variables \(x\), \(y\), and \(z\).
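Because the procedure is mechanical, it can be written once as code for the template system. The following Python sketch is an illustration under stated assumptions: `make_first_order` is a hypothetical helper, and `alpha`, `beta`, `gamma` are assumed to be callables taking the arguments `(t, x, x', y, z, z', z'')` in that order.

```python
def make_first_order(alpha, beta, gamma):
    """Build f(t, u) for the template system
        x'' = alpha(...), y' = beta(...), z''' = gamma(...),
    using u = [x, x', y, z, z', z'']."""
    def f(t, u):
        u1, u2, u3, u4, u5, u6 = u
        return [
            u2,                # u1' = x'   = u2
            alpha(t, *u),      # u2' = x''  = alpha(t, u1, ..., u6)
            beta(t, *u),       # u3' = y'   = beta(t, u1, ..., u6)
            u5,                # u4' = z'   = u5
            u6,                # u5' = z''  = u6
            gamma(t, *u),      # u6' = z''' = gamma(t, u1, ..., u6)
        ]
    return f

# Made-up right-hand sides, chosen only to exercise the helper.
f = make_first_order(
    alpha=lambda t, x, xp, y, z, zp, zpp: -x,
    beta=lambda t, x, xp, y, z, zp, zpp: t,
    gamma=lambda t, x, xp, y, z, zp, zpp: z * zpp,
)
```

Note that three of the six components of `f` simply shift an entry of `u`. Only the components corresponding to the highest derivative of each original variable use the given functions.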

Example

Two pendulums hanging from a bar and swinging in parallel planes are coupled through torsion on the bar, obeying

\[\begin{align*} \theta_1''+ b\theta_1' + \frac{g}{L}\sin(\theta_1) + \kappa(\theta_1-\theta_2) & = 0, \\ \theta_2''+ b\theta_2' + \frac{g}{L}\sin(\theta_2) + \kappa(\theta_2-\theta_1) & = 0. \end{align*}\]

Express this as a first-order system.
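One possible solution follows the general procedure. The ordering of the components below is a choice made for this sketch and is not dictated by the problem. Define

\[ u_1 = \theta_1, \quad u_2 = \theta_1', \quad u_3 = \theta_2, \quad u_4 = \theta_2'. \]

Solving each original equation for its second derivative and substituting the definitions gives

\[\begin{split} u_1' &= u_2,\\ u_2' &= -b u_2 - \frac{g}{L}\sin(u_1) - \kappa(u_1-u_3),\\ u_3' &= u_4,\\ u_4' &= -b u_4 - \frac{g}{L}\sin(u_3) - \kappa(u_3-u_1). \end{split}\]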

If initial conditions are supplied for the original problem, they can be translated as well. In the above template, the initial values would need to be given for \(x,x',y,z,z',z''\), which are just the components of the new dependent variable \(\bfu\).

Attention

While every high-order problem can be converted to a first-order system, the converse is not true. That is, there are first-order systems that are not equivalent to any higher-order problem. They inhabit a strictly larger universe.

2.2. Linear problems

When the original high-order ODE or system is linear, then its first-order equivalent is linear as well. We saw that in the first example of this section, where

\[ x'' + 2 Z \omega_0 x' + \omega_0^2 x = f(t) \]

can be converted to

\[\begin{split} u_1' &= u_2,\\ u_2' &= f - 2 Z \omega_0 u_2 - \omega_0^2 u_1. \end{split} \tag{2.7}\]

Because the problem is linear, we can express it using a matrix-vector multiplication:

\[ \bfu' = \twomat{0}{1}{-\omega_0^2}{-2Z\omega_0} \bfu + \twovec{0}{f}, \]

which is of the form \(\bfu'=\bfA(t)\bfu + \bff(t)\). The \(2\times 2\) matrix \(\bfA\) here is called the coefficient matrix of the first-order system.
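To see that the matrix form and the componentwise form agree, both can be evaluated at the same state. The sketch below uses plain Python lists to stay self-contained. The helper names are hypothetical, and the numerical values in the comparison are arbitrary choices made for illustration.

```python
def coefficient_matrix(Z, omega0):
    """The 2x2 coefficient matrix A of the oscillator system."""
    return [[0.0, 1.0],
            [-omega0**2, -2.0 * Z * omega0]]

def matvec(A, u):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def matrix_form_rhs(t, u, Z, omega0, forcing):
    """Evaluate A*u + [0, f(t)], the matrix form of the system."""
    Au = matvec(coefficient_matrix(Z, omega0), u)
    return [Au[0], Au[1] + forcing(t)]

def component_form_rhs(t, u, Z, omega0, forcing):
    """Evaluate the same system written out component by component."""
    u1, u2 = u
    return [u2, forcing(t) - 2 * Z * omega0 * u2 - omega0**2 * u1]

# Arbitrary sample values: Z = 0.5, omega0 = 2, f(t) = t, state u = [1, 3].
lhs = matrix_form_rhs(1.0, [1.0, 3.0], 0.5, 2.0, lambda s: s)
rhs_val = component_form_rhs(1.0, [1.0, 3.0], 0.5, 2.0, lambda s: s)
```

The two results should agree up to rounding error. In this example \(\bfA\) does not depend on \(t\), so the general form \(\bfA(t)\) reduces to a constant matrix.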