Nonlinear Dimensionality Reduction: the book

   
 
 
Order it
 
 
 
The book may be ordered:
 
Summary
 
 

Methods of dimensionality reduction are innovative and important tools in data analysis, data mining and machine learning. They provide a way to understand and visualize the structure of complex data sets. Traditional methods like principal component analysis and classical metric multidimensional scaling are limited by their reliance on linear models. Until recently, very few methods were able to reduce the data dimensionality in a nonlinear way. Since the late nineties, however, many new methods have been developed, and nonlinear dimensionality reduction, also called manifold learning, has become a hot topic. Advances that account for this rapid growth include the use of graphs to represent the manifold topology and the use of new metrics such as the geodesic distance. In addition, new optimization schemes, based on kernel techniques and spectral decomposition, have led to spectral embedding, which encompasses many of the recently developed methods.
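To make the idea of a graph-based geodesic distance concrete, here is a minimal Python sketch. It is not taken from the book or its software: the function names `knn_graph` and `graph_distances`, the choice of a k-nearest-neighbour graph, and the half-circle toy data are all assumptions made for illustration. The geodesic distance along the manifold is approximated by the length of the shortest path in the graph.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbour graph weighted by Euclidean distance."""
    n = len(points)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        # Sort the other points by distance to point i and keep the k closest.
        dists = sorted((math.dist(points[i], points[j]), j) for j in range(n) if j != i)
        for d, j in dists[:k]:
            graph[i][j] = d
            graph[j][i] = d  # symmetrise so that the graph is undirected
    return graph


def graph_distances(graph, source):
    """Dijkstra's algorithm: shortest-path length from source to each reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale queue entry, a shorter path was already found
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


# Toy data (assumed): 21 points on a half circle of radius 1,
# i.e. a curved one-dimensional manifold embedded in the plane.
points = [(math.cos(math.pi * t / 20), math.sin(math.pi * t / 20)) for t in range(21)]
graph = knn_graph(points, k=2)
geodesic = graph_distances(graph, 0)[20]
euclidean = math.dist(points[0], points[20])
print(f"Euclidean: {euclidean:.3f}, graph distance: {geodesic:.3f}")
```

For the two end points of the half circle, the Euclidean distance is the diameter (2), whereas the graph distance follows the curve and comes out slightly below the arc length π. This difference is what lets a distance-preserving method unfold a curved manifold instead of flattening it.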

This book describes existing and advanced methods to reduce the dimensionality of numerical databases. For each method, the description starts from intuitive ideas, develops the necessary mathematical details and ends by outlining the algorithmic implementation. Methods are compared with each other with the help of different illustrative examples.

The purpose of the book is to summarize clear facts and ideas about well-known methods as well as recent developments in the topic of nonlinear dimensionality reduction. With this goal in mind, methods are all described from a unifying point of view, in order to highlight their respective strengths and shortcomings.

The book is primarily intended for statisticians, computer scientists and data analysts. Other practitioners with a basic background in statistics and/or computational learning, such as psychologists (in psychometrics) and economists, may also find it interesting. The book may also be of interest to R&D departments of companies active in computer science, engineering or consultancy.

 
Software
 
  Simulations described in the book were performed using C++ software designed by John A. Lee. The implemented techniques/methods are listed below.
  • Vector quantization
    • K-means
    • Competitive Learning
    • Neural Gas
    • Self-Organizing Maps
    • ...
  • Intrinsic dimensionality estimation
    • Local PCA
  • Metrics
    • Euclidean norm
    • L_p norm (p = 1, 2, infinity)
    • Graph distances (shortest paths and commute time)
  • Nonlinear dimensionality reduction
    • Classical metric multidimensional scaling
      (depending on the metric: cmMDS, Isomap, Laplacian eigenmaps)
    • Sammon's nonlinear mapping
      (depending on the metric: Euclidean NLM, Geodesic NLM)
    • Curvilinear component analysis
      (depending on the metric: Euclidean CCA, Geodesic CCA = CDA)
    • Local multidimensional scaling
      (depending on the metric: Euclidean LMDS, Geodesic LMDS)
    • Self-Organizing Maps
    • Isotop
  • Other
    • Shepard diagrams
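
As an illustration of two items in the list above, the following Python sketch computes L_p distances and the point pairs plotted in a Shepard diagram, which compares each pairwise distance in the data space with the corresponding distance in the embedding. It does not reproduce the interface of the C++ software: the names `lp_distance` and `shepard_pairs`, the toy data and the one-dimensional embedding are hypothetical.

```python
import itertools
import math


def lp_distance(x, y, p):
    """L_p distance between two vectors; p is a positive number or math.inf."""
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def shepard_pairs(high, low, p=2):
    """Return (data-space distance, embedding-space distance) for every pair of points."""
    pairs = []
    for i, j in itertools.combinations(range(len(high)), 2):
        pairs.append((lp_distance(high[i], high[j], p), lp_distance(low[i], low[j], p)))
    return pairs


# Toy data (assumed): three collinear 3-D points and a hypothetical 1-D embedding.
high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]
low = [(0.0,), (5.0,), (10.0,)]

for d_high, d_low in shepard_pairs(high, low):
    print(f"{d_high:.2f} -> {d_low:.2f}")
```

Plotting the second value of each pair against the first gives the Shepard diagram. Points on the diagonal correspond to perfectly preserved distances, which is the case here because the toy points lie on a straight line.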

The algorithms are embedded in a command interpreter, allowing the user to program tasks either online or offline. Commands are easy to learn and help is available at the prompt. Current version: 20060502

Please report any bugs by e-mail to John Lee.
 
Links
 
  Other software implementing NLDR methods can be found on the internet:


Webmasters: Arnaud de Decker - arnaud.dedecker@uclouvain.be; Thibault Helleputte - thibault.helleputte@uclouvain.be