5.1. The interpolation problem#
Given \(n+1\) distinct points \((t_0,y_0)\), \((t_1,y_1),\ldots,(t_n,y_n)\), with \(t_0<t_1<\ldots <t_n\) called nodes, the interpolation problem is to find a function \(p(x)\), called the interpolant, such that \(p(t_k)=y_k\) for \(k=0,\dots,n\).
In this chapter, we use \(t_k\) for the nodes and \(x\) to denote the continuous independent variable.
Attention
The interpolation nodes are numbered from 0 to \(n\). This is convenient for our mathematical statements, but less so in a language such as Julia in which vector indices start with 1. Our approach is that indices in a computer code have the same meaning as those identically named in the mathematical formulas, and therefore must be incremented by one whenever used in an indexing context.
Polynomials#
Polynomials are the obvious first candidate to serve as interpolating functions. They are easy to work with, and in Section 2.1 we saw that a linear system of equations can be used to determine the coefficients of a polynomial that passes through every member of a set of given points in the plane. However, it’s not hard to find examples for which polynomial interpolation leads to unusable results.
Here are some points that we could consider to be observations of an unknown function on \([-1,1]\).
n = 5
t = range(-1,1,length=n+1)
y = @. t^2 + t + 0.05*sin(20*t)
scatter(t,y,label="data",leg=:top)
The polynomial interpolant, as computed using fit
, looks very sensible. It’s the kind of function you’d take home to meet your parents.
p = Polynomials.fit(t,y,n) # interpolating polynomial
plot!(p,-1,1,label="interpolant")
But now consider a different set of points generated in almost exactly the same way.
n = 18
t = range(-1,1,length=n+1)
y = @. t^2 + t + 0.05*sin(20*t)
scatter(t,y,label="data",leg=:top)
The points themselves are unremarkable. But take a look at what happens to the polynomial interpolant.
p = Polynomials.fit(t,y,n)
x = range(-1,1,length=1000) # use a lot of points
plot!(x,p.(x),label="interpolant")
Surely there must be functions that are more intuitively representative of those points!
Interpolation by a polynomial at equally spaced nodes is ill-conditioned as the degree of the polynomial grows.
In Chapter 9 we explore the large oscillations in the last figure of Demo 5.1.2; it turns out that one must abandon either equally spaced nodes or \(n\to\infty\) for polynomials. In the rest of this chapter we will keep \(n\) fairly small and let the nodes be unrestricted.
Piecewise polynomials#
In order to keep polynomial degrees small while interpolating large data sets, we will choose interpolants from the piecewise polynomials. Specifically, the interpolant \(p\) must be a polynomial on each subinterval \([t_{k-1},t_k]\) for \(k=1,\ldots,n\).
Some examples of piecewise polynomials for the nodes \(t_0=-2\), \(t_1=0\), \(t_2=1\), and \(t_3=4\) are \(p_1(x)=x+1\), \(p_2(x)=\operatorname{sign}(x)\), \(p_3(x)=|x-1|^{3}\), and \(p_4(x)=(\max\{0,x\})^{4}\). Note that \(p_{1}\), \(p_{2}\), and \(p_4\) would also be piecewise polynomial on the node set \(\{t_0,t_1,t_3\}\), but \(p_3\) would not.
Usually we designate in advance a maximum degree \(d\) for each polynomial piece of \(p(x)\). An important property of the piecewise polynomials of degree \(d\) is that they form a vector space: that is, any linear combination of piecewise polynomials of degree \(d\) is another piecewise polynomial of degree \(d\). If \(p\) and \(q\) share the same node set, then the combination is piecewise polynomial on that node set.
Let us recall the data from Demo 5.1.2.
Show code cell source
n = 12
t = range(-1,1,length=n+1)
y = @. t^2 + t + 0.5*sin(20*t)
scatter(t,y,label="data",leg=:top)
Here is an interpolant that is linear between each consecutive pair of nodes, using plinterp
from Section 5.2.
p = FNC.plinterp(t,y)
plot!(p,-1,1,label="piecewise linear")
We may prefer a smoother interpolant that is piecewise cubic, generated using Spline1D
from the Dierckx
package.
p = Spline1D(t,y)
plot!(x->p(x),-1,1,label="piecewise cubic")
We will consider piecewise linear interpolation in more detail in Section 5.2, and we look at piecewise cubic interpolation in Section 5.3.
Conditioning of interpolation#
In the interpolation problem we are given the values \((t_k,y_k)\) for \(k=0,\ldots,n\). Let us consider the nodes \(t_k\) of the problem to be fixed, and let \(a=t_0\), \(b=t_n\). Then the data for the interpolation problem consists of a vector \(\mathbf{y}\), and the result of the problem is a function on \([a,b]\).
Let \(\mathcal{I}\) be a prescription for producing the interpolant from a data vector. That is, \(\mathcal{I}(\mathbf{y})=p\), where \(p(t_k)=y_k\) for all \(k\). The interpolation methods we will consider are all linear, in the sense that
for all vectors \(\mathbf{y},\mathbf{z}\) and scalars \(\alpha,\beta\).
Linearity greatly simplifies the analysis of interpolation. To begin with, for any data vector \(\mathbf{y}\) we have the standard expression \(\mathbf{y}=\sum y_k \mathbf{e}_k\), where as always \(\mathbf{e}_k\) is a column of an identity matrix.1 Hence by linearity,
The functions appearing within the sum above have particular significance.
A cardinal function \(\phi_k\) for a node set \(t_0,\ldots,t_n\) is the function that interpolates the value \((t_k,1)\) and \((t_j,0)\) for all \(j\neq k\).
For any set of \(n+1\) nodes, there are \(n+1\) cardinal functions \(\phi_0,\ldots,\phi_n\), each singling out a different interpolation node in the set. We finish (5.1.2) by writing
In the following result we use the function infinity-norm or max-norm defined by
Suppose that \(\cI\) is a linear interpolation method on nodes \(t_0,\ldots,t_n\). Then with respect to the infinity norm, the absolute condition number of \(\cI\) satisfies
where the \(\phi_k\) are cardinal interpolating functions.
Suppose the data vector is perturbed from \(\mathbf{y}\) to \(\mathbf{y}+ \mathbf{d}\). Then
Hence
The absolute condition number maximizes this quantity over all \(\mathbf{d}\). Suppose \(j\) is such that \(\|\phi_j\|_\infty\) is maximal. Then let \(\mathbf{d}=\mathbf{e}_j\) and the first inequality in (5.1.5) follows. The other inequality follows from the triangle inequality:
Since \(|d_k|\le \|\mathbf{d}\|_\infty\) for all \(k\), this finishes (5.1.5).
In Demo 5.1.2 and Demo 5.1.5 we saw a big difference between polynomial interpolation and piecewise polynomial interpolation of some arbitrarily chosen data. The same effects can be seen clearly in the cardinal functions, which are closely tied to the condition numbers.
n = 18
t = range(-1,stop=1,length=n+1)
y = [zeros(9);1;zeros(n-9)]; # data for 10th cardinal function
scatter(t,y,label="data")
ϕ = Spline1D(t,y)
plot!(x->ϕ(x),-1,1,label="spline",
xlabel=L"x",ylabel=L"\phi(x)",
title="Piecewise cubic cardinal function")
The piecewise cubic cardinal function is nowhere greater than one in absolute value. This happens to be true for all the cardinal functions, ensuring a good condition number for any interpolation with these functions. But the story for global polynomials is very different.
scatter(t,y,label="data")
ϕ = Polynomials.fit(t,y,n)
plot!(x->ϕ(x),-1,1,label="polynomial",
xlabel=L"x",ylabel=L"\phi(x)",legend=:top,
title="Polynomial cardinal function")
From the figure we can see that the condition number for polynomial interpolation on these nodes is at least 500.
Exercises#
⌨ Create data by entering
t = -2:4; y = tanh.(t);
(a) Use
fit
to construct and plot the polynomial interpolant of the data, superimposed on a scatter plot of the data.(b) Use
Spline1D
to construct and plot a piecewise cubic interpolant of the data, superimposed on a scatter plot of the data.⌨ The following table gives the life expectancy in the U.S. by year of birth.
1980
1985
1990
1995
2000
2005
2010
73.7
74.7
75.4
75.8
77.0
77.8
78.7
(a) Defining “year since 1980” as the independent variable, use
fit
to construct and plot the polynomial interpolant of the data.(b) Use
Spline1D
to construct and plot a piecewise cubic interpolant of the data.(c) Use both methods to estimate the life expectancy for a person born in 2007. Which value is more believable?
⌨ The following two vectors define a flying saucer shape.
x = [ 0,0.51,0.96,1.06,1.29,1.55,1.73,2.13,2.61, 2.19,1.76,1.56,1.25,1.04,0.58,0 ] y = [ 0,0.16,0.16,0.43,0.62,0.48,0.19,0.18,0, -0.12,-0.12,-0.29,-0.30,-0.15,-0.16,0 ]
We can regard both \(x\) and \(y\) as functions of a parameter \(s\), with the points being values given at \(s=0,1,\ldots,15\).
(a) Use
Spline1D
once on each coordinate as functions of \(s\), and make a picture of the flying saucer.(b) One drawback of the result in part (a) is the noticeable corner at the left side, which corresponds to \(s=0\) from above and \(s=15\) from below. There is a periodic variation on cubic spline interpolation that you can invoke by adding the keyword
periodic=true
to theSpline1D
call. Use this to re-plot the flying saucer.✍ Define
\[q(s) = a\frac{s(s-1)}{2} - b (s-1)(s+1) + c \frac{s(s+1)}{2}.\](a) Show that \(q\) is a polynomial interpolant of the points \((-1,a)\), \((0,b)\), \((1,c)\).
(b) Find a change of variable \(s=Ax+B\) so that the values \(s=-1,0,1\) correspond to \(x=x_0-h,x_0,x_0+h\).
(c) Find a quadratic polynomial interpolant \(\tilde{q}(x)\) for the points \((x_0-h,a)\), \((x_0,b)\), \((x_0+h,c)\).
✍ (continuation) Use the result of the previous exercise and Theorem 5.1.7 to derive bounds on the condition number of quadratic polynomial interpolation at the nodes \(x_0-h\), \(x_0\), \(x_0+h\).
- 1
To be precise, we are using \(\mathbf{e}_k\) to mean column number \(k+1\) from an \((n+1)\times (n+1)\) identity matrix, since in linear algebra we start indexing at 1.