Supervised Dimension Reduction
Higher dimensionality generally makes learning tasks more difficult. Here we introduce a supervised dimension reduction method built on the same linear dimension reduction model as in Unsupervised Dimension Reduction, which can be written as:
$$z = Tx, \quad x \in \mathbb{R}^d,\ z \in \mathbb{R}^m,\ m < d$$
Of course, centering the samples first is necessary:
$$x_i \leftarrow x_i - \frac{1}{n}\sum_{i'=1}^{n} x_{i'}$$
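As a quick illustration (not from the original text; the sizes, variable names, and the random matrix T below are chosen only for this sketch), centering and projecting samples in MATLAB looks like:

% minimal sketch: rows of x are samples, T is an arbitrary m-by-d matrix
n=5; d=3; m=2;
x=randn(n,d);                  % toy samples, one per row
x=x-repmat(mean(x),[n,1]);     % centering: subtract the sample mean
T=randn(m,d);                  % placeholder projection matrix
z=x*T';                        % each row of z is z_i = T*x_i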
Fisher discriminant analysis (FDA) is one of the most basic supervised linear dimension reduction methods. We seek a $T$ that keeps samples of the same label as close as possible after projection, and samples of different labels as far apart as possible. To begin with, define the within-class scatter matrix $S^{(w)}$ and the between-class scatter matrix $S^{(b)}$ as:
$$S^{(w)} = \sum_{y=1}^{c} \sum_{i: y_i = y} (x_i - \mu_y)(x_i - \mu_y)^T \in \mathbb{R}^{d \times d}, \qquad S^{(b)} = \sum_{y=1}^{c} n_y \mu_y \mu_y^T \in \mathbb{R}^{d \times d}$$
where
$$\mu_y = \frac{1}{n_y} \sum_{i: y_i = y} x_i$$
Here $\sum_{i: y_i = y}$ stands for the sum over all indices $i$ satisfying $y_i = y$, and $n_y$ is the number of samples belonging to class $y$.
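To make these definitions concrete, here is a short MATLAB sketch (assuming centered data x with one sample per row and labels y in {1,...,c}; the variable names Sw and Sb are only for this example) that accumulates the two scatter matrices:

[n,d]=size(x); c=max(y);
Sw=zeros(d,d); Sb=zeros(d,d);
for k=1:c
  xk=x(y==k,:); nk=size(xk,1);        % samples of class k
  mk=mean(xk,1);                      % class mean mu_k (1-by-d)
  Sw=Sw+(xk-repmat(mk,[nk,1]))'*(xk-repmat(mk,[nk,1]));   % within-class scatter
  Sb=Sb+nk*(mk')*mk;                  % n_k * mu_k * mu_k^T
end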
The projection matrix $T$ is then determined by solving the following optimization problem:
$$\max_{T \in \mathbb{R}^{m \times d}} \operatorname{tr}\!\left( (T S^{(w)} T^T)^{-1}\, T S^{(b)} T^T \right)$$
Intuitively, this objective makes the between-class scatter $T S^{(b)} T^T$ of the projected samples large while keeping their within-class scatter $T S^{(w)} T^T$ small.
This optimization problem can be solved with an approach similar to the one used in Unsupervised Dimension Reduction, namely the generalized eigenvalue problem
$$S^{(b)} \xi = \lambda S^{(w)} \xi$$
where $\lambda_1 \ge \cdots \ge \lambda_d \ge 0$ are the generalized eigenvalues and $\xi_1, \dots, \xi_d$ the corresponding (normalized) eigenvectors. Taking the eigenvectors associated with the largest $m$ eigenvalues gives the solution:
$$\hat{T} = (\xi_1, \dots, \xi_m)^T$$
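The MATLAB demo below puts these pieces together on a toy two-class dataset in two dimensions: it centers the data, forms the two scatter matrices from the class means and within-class deviations, extracts the leading generalized eigenvector with eigs, and plots the resulting projection direction.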
n=100; x=randn(n,2);                     % n samples in 2 dimensions
x(1:n/2,1)=x(1:n/2,1)-4;                 % shift class 1 to the left
x(n/2+1:end,1)=x(n/2+1:end,1)+4;         % shift class 2 to the right
x=x-repmat(mean(x),[n,1]);               % center the whole dataset
y=[ones(n/2,1);2*ones(n/2,1)];           % class labels

m1=mean(x(y==1,:));                      % mean of class 1
x1=x(y==1,:)-repmat(m1,[n/2,1]);         % within-class deviations, class 1
m2=mean(x(y==2,:));                      % mean of class 2
x2=x(y==2,:)-repmat(m2,[n/2,1]);         % within-class deviations, class 2

% leading generalized eigenvector of S^(b) xi = lambda S^(w) xi,
% with S^(b)=n/2*m1'*m1+n/2*m2'*m2 and S^(w)=x1'*x1+x2'*x2
[t,v]=eigs(n/2*(m1')*m1+n/2*(m2')*m2,x1'*x1+x2'*x2,1);

figure(1); clf; hold on; axis([-8 8 -6 6]);
plot(x(y==1,1),x(y==1,2),'bo');          % class 1 samples
plot(x(y==2,1),x(y==2,2),'rx');          % class 2 samples
plot(99*[-t(1) t(1)],99*[-t(2) t(2)],'k-');  % estimated FDA direction
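With the estimated direction t, the one-dimensional embedding of the toy samples can then be obtained (following the notation of the demo above) as:

z=x*t;    % n-by-1 projected samples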
Note that when the samples of a class form several clusters (i.e. the within-class distribution is multimodal), the output may not be ideal. Local Fisher Discriminant Analysis may work better in that case.