Chapter 2: Initial Value Problems

Overview

This chapter covers numerical methods for solving initial value problems (IVPs) in ordinary differential equations.

Learning objectives

  • Understand the structure of initial value problems
  • Implement explicit and implicit time integration schemes
  • Analyze stability and accuracy of numerical methods

2.1 Introduction to IVPs

2.1.1 Definition

Initial value problems involve finding a function that satisfies a differential equation and meets specified initial conditions. The general form of an IVP is given by: \[ \dot{\underline q} \left ( t \right ) = f \left ( t, \underline q \left ( t \right )\right ), \quad t > t _ 0, \tag{1}\] where,

  • \(t\) is the independent variable (often time),

  • \(\underline q\) is the dependent variable (state vector, e.g., position, velocity, etc.),

    \(\forall t \ge t _ 0\), \(\underline q \left ( t \right ) \in \mathbb{R} ^ n\),

  • \(f\) is a given function defining the system dynamics (model),

    \(\forall t \ge t _ 0\), \(\underline q \in \mathbb R ^ n\), \(f \left ( t, \underline q \right ) \in \mathbb{R} ^ n\).

The definition of \(q\) is completed by specifying the initial condition at time \(t _ 0\): \[ \underline q \left ( t _ 0 \right ) = \underline{q _ 0}. \tag{2}\]

NoteGlobal Cauchy–Lipschitz theorem

Let \(I = \left [ t _ a, t _ b \right ] \subset \mathbb R\) be a closed interval with \(t _ 0 \in I\). Let \(f : I \times \mathbb R ^ n \to \mathbb R ^ n\) satisfy:

  1. Continuity in time and state: \(f\) is continuous on \(I \times \mathbb R ^ n\).
  2. Global Lipschitz in the state (uniform in time): there exists \(L > 0\) such that for all \(t \in I\) and all \(\underline{q _ 1}\), \(\underline{q _ 2} \in \mathbb R ^ n\), \[ \left \Vert f \left ( t, \underline{q _ 1} \right ) - f \left ( t, \underline{q _ 2} \right ) \right \Vert \le L \left \Vert \underline{q _ 1} - \underline{q _ 2} \right \Vert. \]

Then the initial value problem (equations 1 and 2) admits a unique solution \(\underline q \in \mathcal C ^ 1 \left (I; \mathbb R ^ n \right )\)1 on the whole interval \(I\).

2.1.2 Examples of IVPs

Radioactive decay

A simple example of an IVP is the radioactive decay of a substance, described by the equation: \[ \dot{N} \left ( t \right ) = -\lambda N \left ( t \right ), \quad N \left ( 0 \right ) = N _ 0, \tag{3}\] where \(N \left ( t \right )\) is the quantity of the substance at time \(t\), \(\lambda\) is the decay constant, and \(N _ 0\) is the initial quantity.

This is one example of a scalar (\(n = 1\)) IVP. The analytical solution is given by: \[ N \left ( t \right ) = \exp \left ( -\lambda t \right ) N _ 0. \]

The decay constant may be related to the half-life \(t _ {1 / 2}\) (the time required for half of the substance to decay) of the substance by the formula: \[ t _ {1 / 2} = \frac{\ln \left ( 2 \right )}{\lambda}. \]

Note

Equation 3 is a first-order, linear, scalar, homogeneous and autonomous IVP. It is a very valuable model that will be used throughout this course to illustrate the properties of numerical methods for IVPs.

Simple pendulum

Another example of an IVP is the motion of a simple pendulum, which consists of a mass attached to a string swinging under the influence of gravity. If the friction is neglected, the total energy of the system is conserved, so one can derive a ordinary differential equation for the angle of displacement \(\theta \left ( t \right )\) from the vertical position.

The kinetic and potential energies of the pendulum are given by, \[ E _ k = \frac{1}{2} m R ^ 2 \dot \theta ^ 2, \quad \mathrm{and} \quad E _ p = -m g R \cos \theta, \] respectively. Differentiating the total energy \(E = E _ k + E _ p\) with respect to time and setting the derivative to zero (since energy is conserved) leads to, \[ m R ^ 2 \dot \theta \left ( \ddot \theta + \frac{g}{R} \sin \theta \right ) = 0. \] The first solution (\(\dot \theta = 0\)) corresponds to the trivial case of a stationary pendulum, while the second solution gives the equation of motion, \[ \ddot \theta + \omega ^ 2 \sin \theta = 0 \tag{4}\] where \(\omega = \sqrt{g / R}\).

Simple pendulum (source: Wikipedia).
Note

Equation 4 is a second-order ordinary differential equation. In order to recover the canonical form (Equation 1), we introduce the state vector \[ \underline q = \begin{bmatrix} \theta \\ \dot \theta \end{bmatrix} \] leading to the system, \[ \frac{\mathrm d}{\mathrm d t} \begin{bmatrix} q _ 1 \\ q _ 2 \end{bmatrix} = \begin{bmatrix} q _ 2 \\ - \omega ^ 2 \sin q _ 1 \end{bmatrix}. \]

Predator–prey model (Lotka–Volterra)

The Lotka–Volterra equations are a pair of first-order, nonlinear, ordinary differential equations that describe the dynamics of biological systems in which two species interact, one as a predator and the other as prey. The equations are given by: \[ \begin{aligned} \dot x & = \alpha x - \beta x y, \\ \dot y & = \delta x y - \gamma y, \end{aligned} \] where \(x \left ( t \right )\) and \(y \left ( t \right )\) represent the population sizes of the prey and predator species at time \(t\), respectively. The parameters \(\alpha\), \(\beta\), \(\delta\), and \(\gamma\) are positive constants that describe the interaction between the two species.

The first equation models the growth of the prey population, which increases at a rate proportional to its current size (\(\alpha x\)) but decreases due to predation at a rate proportional to both the prey and predator populations (\(\beta x y\)). The second equation models the growth of the predator population, which increases at a rate proportional to the number of prey consumed (\(\delta x y\)) but decreases due to natural death at a rate proportional to its current size (\(\gamma y\)).

Population dynamics for rabbit and fox problem mentioned aside (source: Wikipedia).

2.2 Numerical methods for IVPs

An important question in the numerical analysis of IVPs is whether a given numerical method will produce an approximation that converges to the true solution as the time step size \(\tau\) approaches zero.

For linear initial value problems, convergence of a numerical scheme follows from consistency and stability. The Lax–Richtmyer equivalence theorem provides a rigorous proof of this statement in the finite-difference setting; analogous results hold for Runge–Kutta and other time-integration methods.

This section first introduces a simple model (referred to as a “toy model”), the form of which is first justified. Simple numerical schemes are then introduced. They are then applied to the numerical integration of the toy model. The consistency and stability of the resulting solution are then studied.

2.2.1 A toy model

Let us therefore consider the class of linear IVPs, \[ \dot q = A \, q, \tag{5}\] where \(A \in \mathcal M _ n \left ( \mathbb R \right )\) is a constant matrix. This is a special case of the more general nonlinear IVPs given by Equation 1, where the function \(f\) is linear in \(q\) and does not depend on time.

Let us also assume that the matrix \(A\) is diagonalizable. Then it admits a spectral decomposition of the form, \[ A = R \Lambda L, \tag{6}\] where the columns of \(R\) are the right eigenvectors of \(A\), the rows of \(L\) are the left eigenvectors of \(A\), and \(\Lambda\) is a diagonal matrix containing the eigenvalues of \(A\). The left and right eigenvectors are chosen such that they satisfy the biorthogonality condition, \[ L R = I, \] where \(I\) is the identity matrix.

NoteDiagonalizability

A sufficient condition for a matrix to be diagonalizable is that it has \(n\) distinct eigenvalues. However, this condition is not necessary, as there are matrices with repeated eigenvalues that are still diagonalizable. In general, a matrix is diagonalizable if the algebraic multiplicity of each eigenvalue equals its geometric multiplicity (the number of linearly independent eigenvectors associated with that eigenvalue).

Substituting the spectral decomposition (Equation 6) into the linear IVP (Equation 5) and left-multiplying by \(L\) gives, \[ \frac{\mathrm d}{\mathrm d t} \left ( L q \right ) = L A q = L R \Lambda L q = \Lambda \left ( L q \right ). \] By introducing the characteristic variable \(w = L q\), we can rewrite the above equation as, \[ \dot w = \Lambda w, \] or equivalently, \[ \dot w _ i = \lambda _ i w _ i, \quad i = 1, \ldots, n. \]

In other words, the original system of \(n\) coupled ordinary differential equations has been decoupled into \(n\) independent scalar ordinary differential equations, each of which can be solved separately.

This motivates the study of the scalar IVP, \[ \dot q = \lambda q, \tag{7}\] already encountered in the context of the radioactive decay example. A small modification however is that \(\lambda\) is now allowed to be a complex number, since the eigenvalues of a real matrix can be complex.

The solution of this initial value problem is known, \[ q \left ( t \right ) = \exp \left ( \lambda t \right ) q _ 0, \] which has the remarkable property, \[ q \left ( t + \tau \right ) = \exp \left ( \lambda \tau \right ) q \left ( t \right ). \tag{8}\] This will be used in the following section.

2.2.2 Basic time integration schemes

In order to approximate the solution of an IVP numerically, we discretize the time domain into a series of time steps. Let \(t _ 0\) be the initial time and \(\tau\) be the time step size (assumed constant for simplicity, but can be variable in general). We define a sequence of discrete time points, \[ t _ n = n \tau, \quad n \in \mathbb N. \] This is an arithmetic sequence where each time point is obtained by adding the time step size \(\tau\) to the previous time point, \[ t _ {n + 1} = t _ n + \tau. \]

The underlying idea is to approximate the solution \(q \left ( t \right )\) at these discrete time points, denoted as \(q _ n \approx q \left ( t _ n \right )\). Of course. This sequence is initialized using the initial condition, \(q _ 0\) (Equation 2). In fact, the property of the exact solution of the toy problem (Equation 8) can be written for \(t = t _ n\), \[ q \left ( t _ {n + 1} \right ) = \exp \left ( \lambda \tau \right ) q \left ( t _ n \right ), \] which shows that the exact solution sampled at constant rate \(\tau\) forms a geometric sequence with common ratio, \[ \sigma \left ( z \right ) \equiv \exp \left ( z \right ). \]

The goal of a numerical method for IVPs is to produce a recurrence relation, akin to Equation 8, but for arbitrary models \(f\) (i.e. not just the toy model). The starting point for this is the integration of the IVP (Equation 1) over a single time step \(\left [ t _ n, t _ {n + 1} \right ]\), \[ \int _ {t _ n} ^ {t _ {n + 1}} \dot q \left ( t \right ) \mathrm d t = \int _ {t _ n} ^ {t _ {n + 1}} f \left ( t, q \left ( t \right ) \right ) \mathrm d t, \] which yields the exact recurrence relation, \[ q \left ( t _ {n + 1} \right ) = q \left ( t _ n \right ) + \int _ {t _ n} ^ {t _ {n + 1}} f \left ( t, q \left ( t \right ) \right ) \mathrm d t. \]

This relation is of little practical use, because the integral depends on the unknown solution \(q \left ( t \right )\) for \(t \in \left [ t _ n, t _ {n + 1} \right ]\). The idea is therefore to approximate the integral using some quadrature rule.

, and the goal is to devise a recurrence relation that relates the \(n + 1\)th term of the sequence (\(\underline{q _ n}\)) to some combination of a finite number of previous terms (\(\underline{q _ n}\), \(\underline{q _ {n - 1}}\), …, \(\underline{q _ {n + 1 - k}}\)).

In the context of IVPs, the number of previous terms used to compute the next term is referred to as the number of steps. It is traditional to distinguish between:

  • Single-step methods (\(k = 1\)): these methods compute the next value \(\underline{q _ {n + 1}}\) using only the current value \(\underline{q _ n}\). Examples include the Euler method and Runge-Kutta methods.
  • Multi-step methods (\(k > 1\)): these methods compute the next value \(\underline{q _ {n + 1}}\) using multiple previous values, such as \(\underline{q _ n}\), \(\underline{q _ {n - 1}}\), etc. Examples include Adams-Bashforth and Adams-Moulton methods.
NoteFibonacci sequence

A famous example of a linear multi-step (\(k = 2\)) sequence with constant coefficients is the Fibonacci sequence, defined by the recurrence relation: \[ F _ {n + 1} = F _ n + F _ {n - 1}, \] and the initial conditions, \[ F _ 0 = 0 \quad \mathrm{and} \quad F _ 1 = 1. \]

2.2 Euler Methods

2.3 Runge-Kutta Methods

2.4 Multi-step Methods

2.5 Stability Analysis

Footnotes

  1. Space of all functions once continuously differentiable on \(I\).↩︎