Creation of a Simple Toy Model

In [1]:
using Plots
gr(size=(600,400))
plotly()   # switch to the Plotly backend for interactive plots
Out[1]:
Plots.PlotlyBackend()
In [2]:
include("geompack.jl")
Out[2]:
ProjOverHyperPlaneND (generic function with 1 method)
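
The file geompack.jl is not reproduced in this notebook. For readers who want to run the cells below, here is a minimal sketch of the types and functions it is assumed to provide, reverse-engineered from how they are called here; the actual implementations may differ, and the notebook itself uses Julia 0.6-era syntax (linspace, cat(2, ...)).

# Assumed geompack.jl primitives (a sketch, not the original file).
struct VecND
    dim::Int
    values::Vector{Float64}
end

SomaND(a::VecND, b::VecND) = VecND(a.dim, a.values .+ b.values)    # vector sum ("soma" = sum)
ProdByScalarND(s::Real, v::VecND) = VecND(v.dim, s .* v.values)    # scalar multiple
DotND(a::VecND, b::VecND) = sum(a.values .* b.values)              # dot product
VersorND(v::VecND) = ProdByScalarND(1/sqrt(DotND(v,v)), v)         # unit vector

# Cross product of the two 3-D vectors in the list.
function VectorProdND(vs::Vector{VecND})
    a, b = vs[1].values, vs[2].values
    VecND(3, [a[2]*b[3]-a[3]*b[2], a[3]*b[1]-a[1]*b[3], a[1]*b[2]-a[2]*b[1]])
end

struct HyperPlaneND
    point::VecND
    normal::VecND
end

# Plane through pt spanned by the vectors in varray; store the unit normal.
HyperPlaneNDInit(pt::VecND, varray::Vector{VecND}) = HyperPlaneND(pt, VersorND(VectorProdND(varray)))

# Orthogonal projection of x onto the plane: x - <x - p, n> n.
function ProjOverHyperPlaneND(x::VecND, pl::HyperPlaneND)
    d = DotND(SomaND(x, ProdByScalarND(-1.0, pl.point)), pl.normal)
    SomaND(x, ProdByScalarND(-d, pl.normal))
end

# Stack a vector of VecNDs into an npts × dim matrix.
VecNDToMatrix(vs::Vector{VecND}) = vcat([v.values' for v in vs]...)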

Generation of Data

In [3]:
# origin of the reference frame
origin = VecND(3,[1,2,3]);
# direction along which the pendulum moves
direction = VecND(3,[1,1,2]);

npts = 2000
lower = 0 
upper = 10

println(origin)
println(direction)
VecND(3, [1.0, 2.0, 3.0])
VecND(3, [1.0, 1.0, 2.0])
In [4]:
# scalar signal values: a cosine sampled at npts points in [lower, upper]
multp = [cos(x) for x in linspace(lower, upper ,npts)];
In [5]:
x = linspace(lower,upper,npts)
plot(x,multp)
Out[5]:
In [6]:
# add Gaussian noise with standard deviation stdv to the signal
stdv = 0.1

bnoise = stdv * randn(npts);
observed = multp + bnoise;
plot(x,observed,label = "Observed Signal")
plot!(x,multp,w=3,label="Original Signal")
Out[6]:
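
In formulas, the scalar signal is a noisy cosine,

\begin{equation} y_i = \cos(t_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0,\,0.1^2), \quad t_i \in [0,10]. \end{equation}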
In [7]:
# embed the observations in 3-D space: each scalar y is mapped to origin + y*direction
observation_pts = map(x -> SomaND(origin,ProdByScalarND(x,direction)),observed);

ox = [observation_pts[i].values[1] for i in 1:npts];
oy = [observation_pts[i].values[2] for i in 1:npts];
oz = [observation_pts[i].values[3] for i in 1:npts];
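
Each noisy scalar $y_i$ is placed on a line in 3-D space,

\begin{equation} p_i = \mathrm{origin} + y_i\,\mathrm{direction}, \qquad i = 1,\dots,\mathrm{npts}, \end{equation}

so the whole signal lives on a one-dimensional subset of $\mathbb{R}^3$.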
In [8]:
# add independent Gaussian noise to each of the three coordinates
xnoise = 0.3 * randn(npts);
ynoise = 0.5 * randn(npts);
znoise = 0.2 * randn(npts);

noise_vectors = [SomaND(observation_pts[i],VecND(3,[xnoise[i],ynoise[i],znoise[i]])) for i in 1:npts];
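
Each 3-D observation point is then perturbed componentwise,

\begin{equation} q_i = p_i + (\eta_i^x,\ \eta_i^y,\ \eta_i^z), \qquad \eta_i^x \sim \mathcal{N}(0,\,0.3^2),\quad \eta_i^y \sim \mathcal{N}(0,\,0.5^2),\quad \eta_i^z \sim \mathcal{N}(0,\,0.2^2). \end{equation}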
In [9]:
ax = [noise_vectors[i].values[1] for i in 1:npts];
ay = [noise_vectors[i].values[2] for i in 1:npts];
az = [noise_vectors[i].values[3] for i in 1:npts];
In [10]:
gr(size=(900,600))
plotly()
Out[10]:
Plots.PlotlyBackend()
In [11]:
scatter3d(ax,ay,az,markersize=.6,label = "Observations")
scatter3d!(ox,oy,oz,markersize = .6 , label = "Original Data")
Out[11]:

Generation of Data Views

In [12]:
pt1 = VecND(3,[2,-10,5])
vec11 = VecND(3,[-1,2,1])
vec12 = VecND(3,[1,-10,-3])
pl1varray = [vec11,vec12]
plane1 = HyperPlaneNDInit(pt1,pl1varray)

pt2 = VecND(3,[-5.,10,-3])
vec21 = VecND(3,[-0.042486352066674525, -0.7748369397560202, 0.32968505000961074])
vec22 = VecND(3,[1,-10,-3])
pl2varray = [vec21,vec22]
plane2 = HyperPlaneNDInit(pt2,pl2varray)


pt3 = VecND(3,[-2.403287986454733, -4.1295977249862, 0.68818196014408])
vec31 = VecND(3,[-10,2,-2])
vec32 = VecND(3,[0.4750280825480982, -0.0368312710814023, -0.592734463002089])
pl3varray = [vec31,vec32]
plane3 = HyperPlaneNDInit(pt3,pl3varray);
In [13]:
dataview1 = map(x-> ProjOverHyperPlaneND(x,plane1), noise_vectors);
dataview2 = map(x-> ProjOverHyperPlaneND(x,plane2), noise_vectors);
dataview3 = map(x-> ProjOverHyperPlaneND(x,plane3), noise_vectors);
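
Assuming ProjOverHyperPlaneND implements the standard orthogonal projection (as in the sketch above), each noisy point $q$ is mapped onto the plane with point $p$ and unit normal $n$ by

\begin{equation} \operatorname{proj}(q) = q - \langle q - p,\ n\rangle\, n. \end{equation}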
In [14]:
ax1 = [dataview1[i].values[1] for i in 1:npts];
ay1 = [dataview1[i].values[2] for i in 1:npts];
az1 = [dataview1[i].values[3] for i in 1:npts];

ax2 = [dataview2[i].values[1] for i in 1:npts];
ay2 = [dataview2[i].values[2] for i in 1:npts];
az2 = [dataview2[i].values[3] for i in 1:npts];

ax3 = [dataview3[i].values[1] for i in 1:npts];
ay3 = [dataview3[i].values[2] for i in 1:npts];
az3 = [dataview3[i].values[3] for i in 1:npts];
In [15]:
scatter3d(ax,ay,az,markersize=.6,label = "Observations")
scatter3d!(ax1 , ay1 , az1 , markersize = .6 , label = "Data View 1")
scatter3d!(ax2 , ay2 , az2 , markersize = .6 , label = "Data View 2")
scatter3d!(ax3 , ay3 , az3 , markersize = .6 , label = "Data View 3")
Out[15]:

Generation of the Views on Each Plane

In [16]:
# plane 1
# build an orthonormal basis of the plane: e1 = unit(vec11), e2 = e1 × normal
versor11 = VersorND(vec11)
versor12 = VectorProdND([versor11,plane1.normal])

proj2Dplane1 = map(x-> VecND(2,
        [
        DotND(SomaND(x,ProdByScalarND(-1.0,plane1.point)),versor11),
        DotND(SomaND(x,ProdByScalarND(-1.0,plane1.point)),versor12)
            ]
    ),dataview1
);

pl1x = [proj2Dplane1[i].values[1] for i in 1:npts];
pl1y = [proj2Dplane1[i].values[2] for i in 1:npts];

plot1 = scatter(pl1x,pl1y,markersize = .8,label="Data view in plane 1");
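
The two coordinates computed above are the components of each projected point in an orthonormal basis of the plane: $e_1$ is the unit vector along vec11 and $e_2 = e_1 \times n$ lies in the plane and is orthogonal to $e_1$. In coordinates,

\begin{equation} (c_1, c_2) = \big(\langle x - p,\ e_1\rangle,\ \langle x - p,\ e_2\rangle\big). \end{equation}

The same construction is repeated for planes 2 and 3 in the next two cells.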
In [17]:
# plane 2
# build an orthonormal basis of the plane: e1 = unit(vec21), e2 = e1 × normal
versor21 = VersorND(vec21)
versor22 = VectorProdND([versor21,plane2.normal])

proj2Dplane2 = map(x-> VecND(2,
        [
        DotND(SomaND(x,ProdByScalarND(-1.0,plane2.point)),versor21),
        DotND(SomaND(x,ProdByScalarND(-1.0,plane2.point)),versor22)
            ]
    ),dataview2
);

pl2x = [proj2Dplane2[i].values[1] for i in 1:npts];
pl2y = [proj2Dplane2[i].values[2] for i in 1:npts];

plot2 = scatter(pl2x,pl2y,markersize = .8,label="Data view in plane 2");
In [18]:
# plane 3
# build an orthonormal basis of the plane: e1 = unit(vec31), e2 = e1 × normal
versor31 = VersorND(vec31)
versor32 = VectorProdND([versor31,plane3.normal])

proj2Dplane3 = map(x-> VecND(2,
        [
        DotND(SomaND(x,ProdByScalarND(-1.0,plane3.point)),versor31),
        DotND(SomaND(x,ProdByScalarND(-1.0,plane3.point)),versor32)
            ]
    ),dataview3
);

pl3x = [proj2Dplane3[i].values[1] for i in 1:npts];
pl3y = [proj2Dplane3[i].values[2] for i in 1:npts];

plot3 = scatter(pl3x,pl3y,markersize = .8,label="Data view in plane 3");
In [19]:
plot(plot1,plot2,plot3,layout=(3,1),legend = false,aspect_ratio = 1 ,size = (400,1200))
Out[19]:

PCA Comes into Action

There are many good references on Principal Component Analysis (see (Hyvarinen, Karhunen et al., 2001), for example). The core idea of the algorithm is based on the Singular Value Decomposition (see (Strang, 2016)) and can be described as follows. We have $n$ explanatory variables $x_1,\dots,x_n$ and $m$ synchronous observations of each variable, with $m \gg n$. The observations of each variable are stored in the columns of a matrix $X\in\mathbb{R}^{m\times n}$. We also assume that $\operatorname{rank}(X)=n$; this is natural, because otherwise some explanatory variables would be linear combinations of the others and therefore redundant.

Both $X^T X$ and $XX^T$ are symmetric, so all their eigenvalues are real and eigenvectors associated with different eigenvalues are orthogonal. Moreover, because the columns of $X$ are linearly independent, for every $u\in\mathbb{R}^n$,

\begin{equation} u^T X^T X u = (Xu)^T(Xu) = \|Xu\|^2 \geq 0, \qquad \text{with } \|Xu\|^2 = 0 \Leftrightarrow Xu = 0 \Leftrightarrow u = 0. \end{equation}

So $X^T X$ is positive definite! This means that $X^T X$ has $n$ positive eigenvalues $\sigma_1^2,\dots,\sigma_n^2$. The good news does not stop here. Suppose that $u_i\in\mathbb{R}^n$ and $v_i\in\mathbb{R}^m$ are such that

\begin{equation} X^TXu_i = \sigma_i^2 u_i \quad \text{and} \quad Xu_i = \sigma_i v_i. \end{equation}

Then $\sigma_i X^Tv_i = X^TXu_i = \sigma_i^2u_i$, so $X^Tv_i=\sigma_iu_i$, and therefore $XX^Tv_i=\sigma_i Xu_i=\sigma_i^2v_i$: the $v_i$ are eigenvectors of $XX^T$ with the same eigenvalues. The PCA algorithm is then just the Singular Value Decomposition of the standardized data matrix (equivalently, the eigendecomposition of its correlation matrix), followed by an orthogonal projection onto the eigenspaces corresponding to the largest eigenvalues, typically those with $\sigma_i^2>1$.
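
In matrix form: if $X = U\Sigma V^T$ with singular values $\sigma_1 \geq \dots \geq \sigma_n$, the rank-$k$ approximation kept by PCA is

\begin{equation} X_k = U_k \Sigma_k V_k^T = U_k U_k^T X = U_k (X^T U_k)^T, \end{equation}

where $U_k$ holds the first $k$ columns of $U$. The cells below implement exactly this last expression: Pmat is $X^T U_k$ and Xredux is $U_k (X^T U_k)^T$.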

In [20]:
matrix1 = VecNDToMatrix(proj2Dplane1);
matrix2 = VecNDToMatrix(proj2Dplane2);
matrix3 = VecNDToMatrix(proj2Dplane3);


matrix = cat(2,matrix1 , matrix2 , matrix3);
In [21]:
# per-column means and standard deviations (medias/desvio is Portuguese for means/deviation)
medias = []
desvio = []

for i in 1:size(matrix)[2]
    push!(medias,mean(matrix[:,i]))
    push!(desvio,std(matrix[:,i]))
end

println(medias)
println(desvio)
Any[9.33577, 6.10291, 9.39873, 0.9062, -2.47059, 1.19937]
Any[0.989642, 0.675141, 0.481188, 1.46865, 0.891597, 1.26439]
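
Each column is then standardized to zero mean and unit variance,

\begin{equation} z_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j}, \end{equation}

so that the matrix computed in the next cell is the correlation matrix of the six view coordinates.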
In [22]:
center_matrix = copy(matrix)   # copy, so the raw data matrix is not overwritten

for i in 1:size(center_matrix)[1]
    for j in 1:size(center_matrix)[2]
        center_matrix[i,j] = (matrix[i,j] - medias[j])/(desvio[j])
    end
end

# correlation matrix: covariance of the standardized columns
cov_center_matrix = (1.0/(npts - 1.0)) * center_matrix' * center_matrix
Out[22]:
6×6 Array{Float64,2}:
  1.0        0.870195  -0.611889   0.95314   -0.758082   0.897869
  0.870195   1.0       -0.612788   0.859894  -0.862307   0.800309
 -0.611889  -0.612788   1.0       -0.3641     0.139173  -0.216242
  0.95314    0.859894  -0.3641     1.0       -0.897886   0.988057
 -0.758082  -0.862307   0.139173  -0.897886   1.0       -0.918489
  0.897869   0.800309  -0.216242   0.988057  -0.918489   1.0     
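
The strong off-diagonal correlations are expected: all six coordinates are linear functions of the same underlying 3-D points, so the three views are highly redundant, which is exactly what PCA exploits.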
In [23]:
U,S,V = svd(center_matrix);
In [24]:
size(U)
Out[24]:
(2000, 6)
In [25]:
S
Out[25]:
6-element Array{Float64,1}:
 97.225      
 45.5093     
 21.6842     
  1.90001e-13
  3.12655e-14
  1.46236e-14
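
Only the first three singular values are significant; the last three are at the level of floating-point round-off. This is because all six standardized coordinates are linear functions of the same 3-D point cloud, so the centered matrix has rank 3. A quick sanity check (explained is a hypothetical helper, not part of geompack.jl):

# fraction of the total variance captured by the first k principal components
explained(S, k) = sum(S[1:k].^2) / sum(S.^2)

explained(S, 3)   # ≈ 1.0 here, since only three singular values are nonzero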
In [26]:
Ureduce = U[:,1:3];   # keep the three dominant left singular vectors

Pmat = center_matrix' * Ureduce;   # X' * U_k, a 6×3 matrix of loadings
print(size(Pmat))

Xredux = Ureduce * Pmat';   # rank-3 reconstruction U_k * U_k' * X

# undo the per-column standardization
for i in 1:size(Xredux)[1]
    for j in 1:size(Xredux)[2]
        Xredux[i,j] = desvio[j] * Xredux[i,j] + medias[j]
    end
end

size(Xredux)
(6, 3)
Out[26]:
(2000, 6)
In [27]:
# take one reconstructed coordinate from each of the three plane views
aux = Xredux[:,2]
auy = Xredux[:,4]
auz = Xredux[:,6]

scatter3d(aux,auy,auz,markersize = 1 , color = :blue ,label = "Reconstruction Data")
scatter3d!(ax,ay,az,markersize = 1 ,label = "Observations")
Out[27]:

References

(Hyvarinen, Karhunen et al., 2001) Aapo Hyvarinen, Juha Karhunen and Erkki Oja, ``Independent Component Analysis'', 2001.

(Strang, 2016) Gilbert Strang, ``Introduction to Linear Algebra'', 2016.