7.5. Dimension reduction#
The SVD has another important property that proves very useful in a variety of applications. Let
where
Each outer product
where
The rank of a sum of matrices is always less than or equal to the sum of the ranks, so
Suppose
, andIf the rank of
is or less, then .
Compression#
If the singular values of
We make an image from some text, then reload it as a matrix.
Show code cell source
plot(annotations=(0.5,0.5,text("Hello world",44,:center,:center)),
grid=:none,frame=:none,size=(400,150))
savefig("hello.png")
img = load("hello.png")
A = @. Float64(Gray(img))
Gray.(A)
Next we show that the singular values decrease until they reach zero (more precisely, until they are about
U,σ,V = svd(A)
scatter(σ,xaxis=(L"i"), yaxis=(:log10,L"\sigma_i"),
title="Singular values")
The rapid decrease suggests that we can get fairly good low-rank approximations.
plt = plot(layout=(2,2),frame=:none,aspect_ratio=1,titlefontsize=10)
for i in 1:4
k = 3i
Ak = U[:,1:k]*diagm(σ[1:k])*V[:,1:k]'
plot!(Gray.(Ak),subplot=i,title="rank = $k")
end
plt
Consider how little data is needed to reconstruct these images. For rank-9, for instance, we have 9 left and right singular vectors plus 9 singular values, for a compression ratio of better than 12:1.
m,n = size(A)
compression = m*n / (9*(m+n+1))
12.099213551119178
Capturing major trends#
The use of dimension reduction offered by low-rank SVD approximation goes well beyond simply reducing computation time. By isolating the most important contributions to the matrix, dimension reduction can uncover deep connections and trends that are otherwise obscured by weaker effects and noise.
One useful way to quantify the decay in the singular values is to compute
Clearly
This matrix describes the votes on bills in the 111th session of the United States Senate. (The data set was obtained from [https://voteview.com].) Each row is one senator, and each column is a vote item.
@load "voting.jld2" A;
If we visualize the votes (yellow is “yea,” blue is “nay”), we can see great similarity between many rows, reflecting party unity.
heatmap(A,color=:viridis,
title="Votes in 111th U.S. Senate",xlabel="bill",ylabel="senator")
We use (7.5.3) to quantify the decay rate of the values.
U,σ,V = svd(A)
τ = cumsum(σ.^2) / sum(σ.^2)
scatter(τ[1:16], xaxis=("k"), yaxis=(L"\tau_k"),
title="Fraction of singular value energy")
The first and second singular triples contain about 58% and 17%, respectively, of the energy of the matrix. All others have far less effect, suggesting that the information is primarily two-dimensional. The first left and right singular vectors also contain interesting structure.
Show code cell source
scatter( U[:,1],label="",layout=(1,2),
xlabel="senator" ,title="left singular vector")
scatter!( V[:,1],label="",subplot=2,
xlabel="bill",title="right singular vector")
Both vectors have values greatly clustered near
Show code cell source
x1 = A*V[:,1]; x2 = A*V[:,2];
@load "voting.jld2" Rep Dem Ind
Rep = vec(Rep); Dem = vec(Dem); Ind = vec(Ind);
scatter(x1[Dem],x2[Dem],color=:blue,label="D",
xaxis=("partisanship"),yaxis=("bipartisanship"),title="111th US Senate by voting record" )
scatter!(x1[Rep],x2[Rep],color=:red,label="R")
scatter!(x1[Ind],x2[Ind],color=:yellow,label="I")
Not all data sets can be reduced effectively to a small number of dimensions, but as Demo 7.5.3 illustrates, in some cases reduction reveals information that corresponds to real-world understanding.
Exercises#
✍ Suppose that
is an matrix. Explain why is the distance (in 2-norm) from to the set of all singular matrices.✍ Suppose
is a matrix and the eigenvalues of are 3, 4, 7, and 10. How close is in the 2-norm to (a) a rank-3 matrix? (b) a rank-2 matrix?(a) ⌨ Find the rank-1 matrix closest to
as measured in the 2-norm.
(b) ⌨ Repeat part (a) for
✍ Find the rank-1 matrix closest to
as measured in the 2-norm, where
.⌨ Following Demo 7.5.2 as a guide, load the “mandrill” test image and convert it to a matrix of floating-point pixel grayscale intensities. Using the SVD, display as images the best approximations of rank 5, 10, 15, and 20.
- 1
In statistics this quantity may be interpreted as the fraction of explained variance.