7.5 Dimension Reduction

Key theory:

Ak=ki=1σiuivTi=UkSkVTk

  • defines a rank k approximation with the first k important contributions
  • compression of matrices
  • interpretation of the fraction of variance captured as sum of square singular values

Polling Demo

datafile = download("https://tobydriscoll.net/fnc-julia/_static/resources/voting.jld2")

@load datafile A;

heatmap(A,color=:viridis,
    title="Votes in 111th U.S. Senate",xlabel="bill",ylabel="senator")
U,σ,V = svd(A)
τ = cumsum.^2) / sum.^2)
scatter(τ[1:16], xaxis=("k"), yaxis=(L"\tau_k"), 
    title="Fraction of singular value variance")
scatter( U[:,1],label="",layout=(1,2),
    xlabel="senator" ,title="left singular vector")
scatter!( V[:,1],label="",subplot=2,
    xlabel="bill",title="right singular vector")

Exercise 7.5.3

find the rank-1 matrix closest to A

A = [1. 5.; 5. 1.]

U, sigma, V = svd(A);
A1 = U[:,1]*sigma[1,1]*V[:,1]'
sigma.^2 ./ sum(sigma.^2)
A = [1. 5.; 0. 1.]

U, sigma, V = svd(A);
A1 = U[:,1]*sigma[1,1]*V[:,1]'
sigma.^2 ./ sum(sigma.^2)

Exercise 7.5.5

Following Demo 7.5.2 as a guide, load the “mandrill” test image and convert it to a matrix of floating-point pixel grayscale intensities. Using the SVD, display as images the best approximations of rank 5, 10, 15, and 20.

img = testimage("mandrill");
A = @. Float64(Gray(img))
plot(Gray.(A), frame =:none)
U, S, V = svd(A);

plt = plot(layout=(2,2),frame=:none,aspect_ratio=1,titlefontsize=10);
for i in 1:4
    k = 5i
    Ak = U[:,1:k]*diagm(S[1:k])*V[:,1:k]'
    plot!(Gray.(Ak),subplot=i,title="rank = $k")
end
plt
cumsum(S.^2) / sum(S.^2)