Nonlinear Dimensionality Reduction Methods for Pattern Recognition
The aim of dimensionality reduction is to find a lower-dimensional,
simpler representation of the data while preserving its important
information. For high-dimensional data, dimensionality reduction is
essential in order to extract relevant features and filter out
irrelevant ones. This yields simpler models and more useful
knowledge from the data.
In this thesis, we discuss and compare several unsupervised
nonlinear dimensionality reduction methods, namely Isomap,
Locally Linear Embedding (LLE), Curvilinear Component Analysis
(CCA), Curvilinear Distance Analysis (CDA), and Stochastic Neighbor
Embedding (SNE), by evaluating their accuracy on standard benchmark
data sets. We propose a hybrid modification (SNE-Iso Hybrid) and
introduce the implicit learning of mapping functions to address the
out-of-sample problem of mapping previously unseen data points.
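The out-of-sample problem above can be illustrated with a minimal sketch, not the thesis's own method: scikit-learn's Isomap learns an approximate mapping during fitting, so previously unseen points can be embedded without recomputing the whole embedding. The data set, split sizes, and neighbor count here are illustrative assumptions.

```python
# Sketch of out-of-sample embedding (illustrative; not the SNE-Iso Hybrid).
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# A standard nonlinear benchmark manifold (assumption: Swiss roll).
X, _ = make_swiss_roll(n_samples=1200, random_state=0)
X_train, X_new = X[:1000], X[1000:]

# Fit the embedding on training data only.
iso = Isomap(n_neighbors=10, n_components=2).fit(X_train)

# Map previously unseen points with the learned mapping, without refitting.
Y_new = iso.transform(X_new)
```

The same fit/transform pattern applies to any method that learns an explicit or implicit mapping function rather than only per-point embedding coordinates.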
We observe that distance metrics inherent in the data
distribution allow better modelling than the Euclidean distance
and improve model accuracy on nonlinear data.
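This observation can be sketched with a small experiment, again using scikit-learn rather than the thesis's implementations: on a nonlinear manifold, geodesic-distance-based Isomap preserves local neighborhoods better than Euclidean-distance-based PCA. The data set, neighbor count, and the trustworthiness score as the accuracy proxy are assumptions for illustration.

```python
# Sketch: intrinsic (geodesic) vs. Euclidean distances on nonlinear data.
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap, trustworthiness

X, _ = make_swiss_roll(n_samples=1000, random_state=0)

# PCA relies on Euclidean structure; Isomap on graph-approximated geodesics.
Y_pca = PCA(n_components=2).fit_transform(X)
Y_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

# Trustworthiness in [0, 1]: how well local neighborhoods are preserved.
t_pca = trustworthiness(X, Y_pca, n_neighbors=10)
t_iso = trustworthiness(X, Y_iso, n_neighbors=10)
print(f"PCA trustworthiness:    {t_pca:.3f}")
print(f"Isomap trustworthiness: {t_iso:.3f}")
```

On such data the geodesic-based embedding typically scores markedly higher, consistent with the observation above.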