7.2. Radial basis functions#
Solving on a tensor-product grid is limiting in two major ways:
Geometry: not very flexible, and especially challenging in 3D
Dimension: number of points grows exponentially with the dimension
Methods that work without any grid are known as meshless or meshfree methods. A common category of these is kernel methods, in which the interpolating or approximating function has the form

\[ u(x) = \sum_{n=1}^{N} c_n\, K(x - x_n), \]

where the \(x_n\) are points in \(\mathbb{R}^d\) called centers, the \(c_n\) are constants, and \(K\) is a kernel function. A particularly interesting choice is

\[ K(x) = \varphi\bigl(\|x\|\bigr), \]

where \(\varphi\) is known as the basic function and \(\|\cdot\|\) is a norm, most typically the 2-norm. The resulting \(K(x-x_n)\) is known as a radial basis function (RBF), since it is radially symmetric about its center \(x_n\).
Basic functions#
Here are some of the most common choices for the basic function.
Polyharmonic spline: \(\varphi(r) = r^k\) for odd \(k\), or \(r^k\log(r)\) for even values of \(k\). These functions have \(k-1\) continuous derivatives, with higher-order discontinuities at the center points.
Wendland function: \(\varphi\) is a piecewise polynomial with compact support. The number of continuous derivatives is selectable by construction.
Gaussian: \(\varphi(r) = \exp(-r^2)\) (smooth)
Multiquadric: \(\varphi(r) = \sqrt{1+r^2}\) (smooth)
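For concreteness, here is one way these basic functions might be written as Julia anonymous functions. The Wendland function shown is just one low-order member of that family (the C² function supported on \(r\le 1\)); the other definitions follow the formulas above, with illustrative names only.

φ_linear = r -> r                              # polyharmonic spline, k = 1
φ_tps = r -> r==0 ? 0.0 : r^2*log(r)           # thin-plate spline, k = 2
φ_wendland = r -> r<1 ? (1-r)^4*(4r+1) : 0.0   # Wendland C² function, compactly supported
φ_gauss = r -> exp(-r^2)                       # Gaussian
φ_mq = r -> sqrt(1+r^2)                        # multiquadric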
Theory (again based on Fourier analysis) states that the convergence rate of the RBF interpolant depends on the smoothness of \(\varphi\) in a familiar way: algebraic if the smoothness is finite, and spectral if it is infinite. However, realizing these rates in practice, particularly in the spectral case, is possible only under severely limited circumstances.
In practice one often adds a shape parameter \(\epsilon > 0\) by rescaling the argument:

\[ K(x) = \varphi\bigl(\epsilon \|x\|\bigr). \]
It’s also possible, and sometimes desirable, to let \(\epsilon\) vary from center to center. The shape parameter has no effect in the polyharmonic case; it controls the support radius in the Wendland case, and it acts as a local scaling in the smooth cases.
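As a small illustration (with ad hoc names and values, not anything used later), introducing the shape parameter is only a rescaling of the kernel's argument, and a per-center parameter just means indexing \(\epsilon\) by the center:

using LinearAlgebra
gauss = r -> exp(-r^2)                    # Gaussian basic function
ε = 4.0
K = (x,xn) -> gauss(ε*norm(x-xn))         # one global shape parameter
εn = [2.0, 4.0, 8.0]                      # hypothetical per-center values
Kn = (n,x,xc) -> gauss(εn[n]*norm(x-xc[n]))   # shape parameter varying from center to center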
Interpolation conditions#
Given the expansion

\[ u(x) = \sum_{n=1}^{N} c_n\, \varphi\bigl(\|x - x_n\|\bigr), \]

we need \(N\) conditions to determine the coefficients. The standard choice is to interpolate values at the centers:

\[ u(x_i) = f(x_i), \qquad i = 1,\ldots,N, \]

which yields the linear system \(\bfA \mathbf{c} = \bff\), where

\[ A_{ij} = \varphi\bigl(\|x_i - x_j\|\bigr), \qquad f_i = f(x_i). \]
Clearly, \(\bfA\) is symmetric. For some basic functions, such as the Gaussian, it is provably positive definite as well.
It’s common to augment the RBFs with low-degree polynomial terms:

\[ u(x) = \sum_{n=1}^{N} c_n\, \varphi\bigl(\|x - x_n\|\bigr) + \sum_{m=1}^{M} b_m\, \psi_m(x), \]

where \(\psi_1,\ldots,\psi_M\) is a basis for polynomials in \(\mathbb{R}^d\) of a fixed, low degree. In order to determine the additional degrees of freedom, we add the constraints

\[ \sum_{n=1}^{N} c_n\, \psi_m(x_n) = 0, \qquad m = 1,\ldots,M. \]

This leads to the linear system

\[ \begin{bmatrix} \bfA & \mathbf{P} \\ \mathbf{P}^T & \boldsymbol{0} \end{bmatrix} \begin{bmatrix} \mathbf{c} \\ \mathbf{b} \end{bmatrix} = \begin{bmatrix} \bff \\ \boldsymbol{0} \end{bmatrix}, \qquad P_{nm} = \psi_m(x_n), \]

which remains symmetric. This formulation allows exact reproduction of low-degree polynomials, which is both practically and theoretically useful. In the multiquadric case, including the constant term guarantees that the augmented system is nonsingular, while the polyharmonic case usually includes at least the constant and linear polynomials.
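Below is a minimal one-dimensional sketch of assembling and solving this augmented system with constant and linear terms. The centers, function, and names here are illustrative only and are not reused later.

using Distances
x = collect(range(0,1,9))                       # illustrative centers
φ = r -> r^3                                    # cubic polyharmonic spline
f = t -> exp(sin(2t))
N = length(x)
A = φ.(pairwise(Euclidean(),x))                 # RBF block
P = [ones(N) x]                                 # ψ₁ = 1, ψ₂ = x evaluated at the centers
cb = [A P; P' zeros(2,2)] \ [f.(x); zeros(2)]   # symmetric augmented system
c, b = cb[1:N], cb[N+1:end]
u = t -> sum(c[j]*φ(abs(t-x[j])) for j in 1:N) + b[1] + b[2]*t   # augmented interpolant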
Convergence#
One may define the fill distance for domain \(\Omega\) and set of centers \(x_1,\ldots,x_N\) as

\[ h = \sup_{x \in \Omega}\; \min_{1\le n\le N} \|x - x_n\|. \]
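The supremum is easy to estimate numerically by sampling \(\Omega\) densely. Here is a small sketch for \(\Omega=[0,1]\), using the same centers that appear in the interpolation demo below:

centers = [0,0.1,0.31,0.48,0.66,0.87,1]
samples = range(0,1,2001)                                  # dense sample of Ω
h = maximum( minimum(abs(s-c) for c in centers) for s in samples )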
The fill distance is an analog of the grid spacing or step size. At one extreme is nonstationary approximation, in which \(\epsilon\) is fixed as \(h\to 0\) (or \(\epsilon\to 0\) on a fixed center set). While one can get spectral convergence (in an unfamiliar norm, over a disappointingly small space of functions), the condition number of the linear system also grows exponentially, severely limiting the realizable accuracy.
At the other extreme, a stationary approximation holds \(\epsilon h\) fixed as \(h\to 0\). Naively, this case exhibits neither growth in the condition number nor convergence in a function space. However, it is not useless, as it allows one to tune the tradeoff between these competing effects. One can also choose to allow \(\epsilon h\) to go to zero more slowly than \(h\), which leads to a compromise between conditioning and approximation accuracy.
A landmark result by Platte, Trefethen, and Kuijlaars is worth mentioning here.
Theorem 7.2.1 (Impossibility Theorem)
Suppose \(E\) is a compact set containing \([-1,1]\), and let \(B(E)\) be the Banach space of functions that are continuous on \(E\) and analytic in its interior. If \(\{\phi_n\}\) is an approximation process based on sampling a function \(f\) at \(n\) equispaced points such that, for some positive \(M\) and \(\sigma >1\),

\[ \| f - \phi_n(f) \| \le M \sigma^{-n} \]

for all \(f\in B(E)\), then the condition number of \(\phi_n\) satisfies

\[ \kappa(\phi_n) \ge C^{\,n} \]

for some \(C>1\) and all sufficiently large \(n\).
Beyond its literal statement, the theorem shows that unless one uses a special set of interpolation nodes, there is no escaping exponential ill-conditioning whenever the convergence rate is spectral, even in one dimension.
Simple cases#
For the polyharmonic spline \(\varphi(r)=r\) in one dimension, the approximation \(u\) is simply a piecewise linear interpolant. For \(\varphi(r)=r^3\) in one dimension, one gets a cubic spline interpolant, although not with the classical choices of end conditions.
For smooth basic functions in one dimension, as \(\epsilon \to 0\) (the “flat limit”) one obtains the polynomial interpolant. The same is true in higher dimensions, under some minor restrictions (polynomial interpolation in more than one dimension is no joke). This result suggests that while the basic RBF method is an unstable way to compute that interpolant, there is a sensible limiting value that one can try to reach by other means.
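The flat limit is easy to witness numerically, up to the ill-conditioning just mentioned. In this sketch, with a hypothetical set of four points, the Gaussian RBF interpolant at a modest \(\epsilon\) is compared against the interpolating polynomial computed from a Vandermonde system; the discrepancy shrinks as \(\epsilon\) is reduced, until conditioning takes over.

using Distances, LinearAlgebra
x = [0, 0.3, 0.7, 1.0]                          # hypothetical nodes
f = t -> exp(sin(2t))
φ = r -> exp(-r^2)                              # Gaussian basic function
ε = 0.05
A = φ.(ε*pairwise(Euclidean(),x))
c = A \ f.(x)                                   # RBF coefficients (increasingly ill-conditioned as ε→0)
u = t -> sum(c[j]*φ(ε*abs(t-x[j])) for j in eachindex(x))
V = [xi^k for xi in x, k in 0:length(x)-1]      # Vandermonde matrix
p = V \ f.(x)                                   # coefficients of the interpolating polynomial
maximum(abs(u(t) - evalpoly(t,p)) for t in range(0,1,11))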
Interpolation demo#
using Distances
x = [0,0.1,0.31,0.48,0.66,0.87,1]     # interpolation centers
R = pairwise(Euclidean(),x)           # matrix of pairwise distances between centers
7×7 Matrix{Float64}:
0.0 0.1 0.31 0.48 0.66 0.87 1.0
0.1 0.0 0.21 0.38 0.56 0.77 0.9
0.31 0.21 0.0 0.17 0.35 0.56 0.69
0.48 0.38 0.17 0.0 0.18 0.39 0.52
0.66 0.56 0.35 0.18 0.0 0.21 0.34
0.87 0.77 0.56 0.39 0.21 0.0 0.13
1.0 0.9 0.69 0.52 0.34 0.13 0.0
φ = r -> sqrt(1+r^2)      # multiquadric basic function
A = φ.(R)                 # interpolation matrix
7×7 Matrix{Float64}:
1.0 1.00499 1.04695 1.10923 1.19817 1.32548 1.41421
1.00499 1.0 1.02181 1.06977 1.14612 1.2621 1.34536
1.04695 1.02181 1.0 1.01435 1.05948 1.14612 1.21495
1.10923 1.06977 1.01435 1.0 1.01607 1.07336 1.12712
1.19817 1.14612 1.05948 1.01607 1.0 1.02181 1.05622
1.32548 1.2621 1.14612 1.07336 1.02181 1.0 1.00841
1.41421 1.34536 1.21495 1.12712 1.05622 1.00841 1.0
f = x -> exp(sin(2x))     # function to interpolate
c = A \ f.(x)             # solve for the RBF coefficients
7-element Vector{Float64}:
-76.43069853435156
154.5857646949567
-237.24855655958984
386.0074113809028
-309.3302319961014
37.133559705609755
45.040716957143744
using LinearAlgebra
cond(A)
9.4242553538069e6
using Plots
plotly()
default(label="",linewidth=2)
plot(f,0,1)                # target function
scatter!(x,f.(x))          # interpolation data
u = t -> sum(c[j]*φ(norm(t-x[j])) for j in eachindex(x))   # evaluate the RBF interpolant
plot!(u,0,1)
# Sweep the shape parameter ε on 21 fixed, equispaced centers, recording the
# condition number of A and the maximum interpolation error.
x = range(-1,1,21)
R = pairwise(Euclidean(),x)
logε = range(-2,3,100)
xx = range(-1,1,1800)          # fine grid for measuring the error
κ = []
err = []
for logε in logε
    ε = 10^logε
    A = φ.(ε*R)
    push!(κ,cond(A))
    c = A\f.(x)
    u = t -> sum(c[j]*φ(ε*norm(t-x[j])) for j in eachindex(x))
    push!(err,norm(f.(xx)-u.(xx),Inf))
end
plot(10 .^logε,κ,xaxis=("ε",:log10),yaxis=(:log10),label="cond")
plot!(10 .^logε,err,label="err",title="Shape parameter")
# Nonstationary mode: fix ε = 10 and refine the centers, recording
# conditioning and error as N grows.
xx = range(-1,1,1800)
n = 10:10:320
κ = []
err = []
for n in n
    x = range(-1,1,n+1)
    R = pairwise(Euclidean(),x)
    ε = 10
    A = φ.(ε*R)
    push!(κ,cond(A))
    c = A\f.(x)
    u = t -> sum(c[j]*φ(ε*norm(t-x[j])) for j in eachindex(x))
    push!(err,norm(f.(xx)-u.(xx),Inf))
end
plot(n,1.0./κ,xaxis=("N"),yaxis=(:log10),label="1/cond")
plot!(n,err,label="err",title="Nonstationary convergence")
# Stationary mode: scale ε with the grid so that εh is held fixed.
xx = range(-1,1,1800)
n = 10:10:320
κ = []
err = []
for n in n
    x = range(-1,1,n+1)
    R = pairwise(Euclidean(),x)
    ε = 0.5/(2/n)          # h = 2/n, so εh = 0.5
    A = φ.(ε*R)
    push!(κ,cond(A))
    c = A\f.(x)
    u = t -> sum(c[j]*φ(ε*norm(t-x[j])) for j in eachindex(x))
    push!(err,norm(f.(xx)-u.(xx),Inf))
end
plot(n,1.0./κ,xaxis=("N"),yaxis=(:log10),label="1/cond")
plot!(n,err,label="err",title="Stationary nonconvergence")
PDE collocation#
In the context of PDEs, there are many ways to deploy RBFs. The natural starting point is collocation. To solve the linear problem

\[ \mathcal{L} u = f \text{ in } \Omega, \qquad u = g \text{ on } \partial\Omega, \]

we can replace interpolation conditions by collocation conditions. Suppose we have centers \(x_1,\ldots,x_N\) in the interior of \(\Omega\) and \(x_{N+1},\ldots,x_{N+\nu}\) on the boundary \(\partial \Omega\). Writing \(u(x) = \sum_{n=1}^{N+\nu} c_n\, \varphi(\|x-x_n\|)\), we require

\[ \sum_{n=1}^{N+\nu} c_n\, \mathcal{L}\bigl[\varphi(\|x-x_n\|)\bigr]\Big|_{x=x_i} = f(x_i), \qquad i = 1,\ldots,N, \]

and

\[ \sum_{n=1}^{N+\nu} c_n\, \varphi\bigl(\|x_i-x_n\|\bigr) = g(x_i), \qquad i = N+1,\ldots,N+\nu. \]

This is still a linear system to solve for the coefficients.
t = range(-1.1,1.1,64)
isinside(x,y) = x^2 + 3y^2 < 0.97        # interior of an ellipse
XI = [ [x,y] for x in t, y in t if isinside(x,y)]    # interior centers
scatter([x[1] for x in XI],[x[2] for x in XI],m=2,msw=0)
XB = [ [cos(θ),sin(θ)/sqrt(3)] for θ in range(0,2π,150) ]    # boundary centers
scatter!([x[1] for x in XB],[x[2] for x in XB],m=2,msw=0)
φ = r -> r^5                 # polyharmonic basic function
Lφ = r -> 25r^3              # its 2D Laplacian, φ''(r) + φ'(r)/r
x = [XI;XB]                  # all centers
R = pairwise(Euclidean(),x)
Ni,Nb = length(XI),length(XB)
N = Ni+Nb
A = Lφ.(R[1:Ni,:])           # collocation rows for the PDE at interior points
B = φ.(R[Ni+1:N,:])          # collocation rows for the boundary condition
f = [-ones(Ni);[x[1]^2+x[2] for x in XB]]    # right-hand sides: -1 inside, x₁²+x₂ on the boundary
c = [A;B] \ f
1598-element Vector{Float64}:
-6.095227617968963
-1.5515284190210141
0.3830492331272455
-0.30116575694989534
-0.0009633194412975579
-0.011953477329130836
0.018705179191211972
0.03469533558241129
0.03857549644622416
0.039817383628852655
0.034642324937579384
0.016299337473628245
0.008207954970407874
⋮
4.572263824028745
-2.0581471864292666
17.825889525762353
40.94307514382058
-27.859683956569334
21.399565210730717
49.39074006808319
-11.091588213730953
3.052552674913801
1.5926956814207112
1.7054304825068067
-681.2992619782173
u = t -> isinside(t...) ? sum(c[j]*φ(norm(t-x[j])) for j in eachindex(x)) : NaN    # NaN outside Ω, for plotting
xx = range(-1,1,130)
surface(xx,xx,[u([x,y]) for x in xx, y in xx])
Again, but with constant and linear terms added:
P = [ones(Ni+Nb) [x[1] for x in x] [x[2] for x in x]]    # constant and linear terms at every center
LP = [zeros(Ni,3); P[Ni+1:N,:]]    # the Laplacian annihilates these terms, so interior rows are zero
c = [ [A;B] LP; P' zeros(3,3) ] \ [f; zeros(3)];
u = function(t)
if isinside(t...)
u1 = sum(c[j]*φ(norm(t-x[j])) for j in 1:N)
u2 = c[N+1] + c[N+2]*t[1] + c[N+3]*t[2]
return u1 + u2
else
return NaN
end
end
xx = range(-1,1,130)
surface(xx,xx,[u([x,y]) for x in xx, y in xx])