ESANN2007

15th European Symposium on Artificial Neural Networks
Bruges, Belgium, April 25-26-27


Content of the proceedings




Dynamic and complex systems


ES2007-40

Synchronization and acceleration: complementary mechanisms of temporal coding

Thomas Burwick

Abstract
Temporal coding is studied with an oscillatory network model that is a complex-valued generalization of the Cohen-Grossberg-Hopfield system. The model is considered with synchronization and acceleration, where acceleration refers to a mechanism that causes the units of the network to oscillate with higher phase velocity in case of stronger and/or more coherent input. Applying Hebbian memory, we demonstrate that acceleration introduces the desynchronization that is needed to segment two overlapping patterns without using inhibitory couplings.

Manuscript from author [PDF]

ES2007-64

Pattern Recognition using Chaotic Transients

Wee Jin Goh, Nigel Crook

Abstract
This paper proposes a novel nonlinear transient computation device described as the LTCM that uses the chaotic attractor provided by the Lorenz system of equations to perform pattern recognition. Previous work on nonlinear transient computation has demonstrated that such devices can process time-varying input signals. This paper investigates the ability of the LTCM to correctly classify static, linearly inseparable data sets commonly used as benchmarks in the pattern recognition research community. The results from the LTCM are compared with those from support vector machines and multi-layer perceptrons on the same data sets.

Manuscript from author [PDF]

ES2007-102

Order in Complex Systems of Nonlinear Oscillators: Phase Locked Subspaces

Jan-Hendrik Schleimer, Ricardo Vigário

Abstract
Any order parameter quantifying the degree of organisation in a physical system can be studied in connection to source extraction algorithms. Independent component analysis (ICA) by minimising the mutual information of the sources falls into that line of thought, since it can be interpreted as searching for components with low complexity. Complexity pursuit, a modification minimising Kolmogorov complexity, is a further example. In this article a specific case of order in complex networks of self-sustained oscillators is discussed, with the objective of recovering the original synchronisation pattern between them. The approach is put in relation with ICA.

Manuscript from author [PDF]



Prototype-based learning


ES2007-57

"Kernelized" Self-Organizing Maps for Structured Data

Fabio Aiolli, Giovanni Da San Martino, Alessandro Sperduti, Markus Hagenbuchner

Abstract
The suitability of the well known kernels for trees, and the lesser known Self-Organizing Map for Structures for categorization tasks on structured data is investigated in this paper. It is shown that a suitable combination of the two approaches, by defining new kernels on the activation map of a Self-Organizing Map for Structures, can result in a system that is significantly more accurate for categorization tasks on structured data. The effectiveness of the proposed approach is demonstrated experimentally on a relatively large corpus of XML formatted data.

Manuscript from author [PDF]

ES2007-138

Model collisions in the dissimilarity SOM

Fabrice Rossi

Abstract
We investigate in this paper the problem of model collisions in the Dissimilarity Self Organizing Map (SOM). This extension of the SOM to dissimilarity data suffers from constraints imposed on the model representation, that lead to some strong map folding: several units share a common prototype. We propose in this paper an efficient way to address this problem via a branch and bound approach.

Manuscript from author [PDF]

ES2007-78

Clustering a medieval social network by SOM using a kernel based distance measure

Nathalie Villa, Romain Boulet

Abstract
In order to explore the social organization of a medieval peasant community before the Hundred Years' War, we propose the use of an adaptation of the well-known Kohonen Self Organizing Map to dissimilarity data. In this paper, the algorithm is used with a distance based on a kernel which allows the choice of a smoothing parameter to control the importance of local or global proximities.

Manuscript from author [PDF]

ES2007-81

Relevance matrices in LVQ

Petra Schneider, Michael Biehl, Barbara Hammer

Abstract
We propose a new matrix learning scheme to extend Generalized Relevance Learning Vector Quantization (GRLVQ). By introducing a full matrix of relevance factors in the distance measure, correlations between different features and their importance for the classification scheme can be taken into account. In comparison to the weighted Euclidean metric used for GRLVQ, this metric is more powerful in representing the internal structure of the data appropriately while maintaining its excellent generalization ability as a large margin optimizer. The algorithm is tested and compared to alternative LVQ schemes using an artificial dataset and the image segmentation data from the UCI repository.

Manuscript from author [PDF]

ES2007-89

Tracking fast changing non-stationary distributions with a topologically adaptive neural network: application to video tracking

Georges Adrian Drumea, Hervé Frezza-Buet

Abstract
In this paper, an original method named GNG-T, extended from the GNG-U algorithm by Fritzke, is presented. The method continuously performs vector quantization over a distribution that changes over time. It deals with both sudden changes and continuous ones, and is thus suited to a video tracking framework, where continuous tracking is required as well as fast adaptation to incoming and outgoing people. The central mechanism relies on the management of quantization resolution, which copes with the stopping condition problems of usual Growing Neural Gas inspired methods. Application to video tracking is briefly presented.

Manuscript from author [PDF]

ES2007-110

Systematicity in sentence processing with a recursive self-organizing neural network

Igor Farkas, Matthew W. Crocker

Abstract
As potential candidates for human cognition, connectionist models of sentence processing must learn to behave systematically by generalizing from a small training set. It was recently shown that Elman networks and, to a greater extent, echo state networks (ESN) possess limited ability to generalize in artificial language learning tasks. We study this capacity for the recently introduced recursive self-organizing neural network model and show that its performance is comparable with ESNs.

Manuscript from author [PDF]



Model selection and regularization


ES2007-22

Agglomerative Independent Variable Group Analysis

Antti Honkela, Jeremias Seppä, Esa Alhoniemi

Abstract
Independent Variable Group Analysis (IVGA) is a principle for grouping dependent variables together while keeping mutually independent or weakly dependent variables in separate groups. In this paper an agglomerative method for learning a hierarchy of IVGA groupings is presented. The method resembles hierarchical clustering, but the distance measure is based on an approximation of mutual information between groups of variables. The approach also allows determining optimal cutoff points for the hierarchy. The method is demonstrated to find sensible groupings of variables that ease construction of a predictive model.

Manuscript from author [PDF]

ES2007-75

Classifying n-back EEG data using entropy and mutual information features

Liang Wu, Predrag Neskovic, Etienne Reyes, Elena Festa, William Heindel

Abstract
In this work we show that entropy (H) and mutual information (MI) can be used as methods for extracting spatially localized features for classification purposes. In order to increase accuracy of entropy estimation, we use a Bayesian approach with a Dirichlet prior to derive estimation equations. We calculate the H and MI features for each electrode (H) and pair of electrodes (MI) in three frequency bands and use them to train the Naive Bayes classifier. We test the H and MI features on one/five trial long segments of n-back memory EEG signals and show that they outperform power spectrum and linear correlation features respectively.

Manuscript from author [PDF]

ES2007-62

Nearest Neighbor Distributions and Noise Variance Estimation

Elia Liitiäinen, Francesco Corona, Amaury Lendasse

Abstract
In this paper, we address the problem of deriving bounds for the moments of nearest neighbor distributions. The bounds are formulated for the general case and specifically applied to the problem of noise variance estimation with the Delta and the Gamma test. For this problem, we focus on the rate of convergence and the bias of the estimators and validate the theoretical achievement with experimental results.

Manuscript from author [PDF]

ES2007-79

Complexity bounds of radial basis functions and multi-objective learning

Illya Kokshenev, Antônio Braga

Abstract
In this paper, the problem of multi-objective (MOBJ) learning is discussed. The problem of obtaining an apparent (effective) complexity measure, which is one of the objectives, is considered. For the specific case of RBFN, bounds on the smoothness-based complexity measure are proposed. As shown in the experimental part, the bounds can be used for Pareto set approximation.

Manuscript from author [PDF]



Fuzzy and Probabilistic Methods in Neural Networks and Machine Learning


ES2007-7

How to process uncertainty in machine learning?

Barbara Hammer, Thomas Villmann

Abstract

Manuscript from author [PDF]

ES2007-23

An Estimation of Response Certainty using Features of Eye-movements

Minoru Nakayama, Yosiyuki Takahasi

Abstract
To examine the feasibility of estimating the degree of "strength of belief" (SOB) of viewers' responses using support vector machines (SVM) trained with features of gazes, the gazing features were analyzed while subjects reviewed their own responses to multiple choice tasks. Subjects freely reported the certainty of their chosen answers, and these responses were then classified as high and low SOBs. All gazing points of eye-movements were classified into visual areas, or cells, which corresponded with the positions of answers so that training data, consisting of the features and SOB, was produced. A discrimination model for SOB was trained with several combinations of features to see whether performance of a significant level could be obtained. As a result, a trained model with 3 features, consisting of interval time, vertical difference and length between gazes, can provide significant discrimination performance for SOB.

Manuscript from author [PDF]

ES2007-115

Visualisation of tree-structured data through generative probabilistic modelling

Nikolaos Gianniotis, Peter Tino

Abstract
We present a generative probabilistic model for the topographic mapping of tree-structured data. The model is formulated as a constrained mixture of hidden Markov tree models. A natural measure of likelihood arises as a cost function that guides the model fitting. We compare our approach with an existing neural-based methodology for constructing topographic maps of directed acyclic graphs. We argue that the probabilistic nature of our model brings several advantages, such as principled interpretation of the visualisation plots.

Manuscript from author [PDF]

ES2007-145

Visualization of Fuzzy Information in Fuzzy-Classification for Image Segmentation using MDS

Thomas Villmann, Marc Strickert, Cornelia Brüß, Frank-Michael Schleif, Udo Seiffert

Abstract

Manuscript from author [PDF]



Learning I


ES2007-99

SOM for intensity inhomogeneity correction in MRI

Maite García-Sebastián, Manuel Graña

Abstract
Given an appropriate imaging resolution, a common Magnetic Resonance Imaging (MRI) model assumes that the object under study is composed of piecewise constant materials, so that MRI produces piecewise constant images. The intensity inhomogeneity (IIH) is modeled by a multiplicative inhomogeneity field. It is due to the spatial inhomogeneity in the excitatory Radio Frequency (RF) signal and other effects. It has been acknowledged as a greater source of error for automatic segmentation algorithms than additive noise. We propose a new nonparametric IIH correction algorithm where the Self Organizing Map (SOM) is used to estimate the IIH field.

Manuscript from author [PDF]

ES2007-83

SOM+EOF for finding missing values

Antti Sorjamaa, Paul Merlin, Bertrand Maillet, Amaury Lendasse

Abstract
In this paper, a new method for the determination of missing values in temporal databases is presented. This new method is based on two projection methods: a nonlinear one (Self-Organized Maps) and a linear one (Empirical Orthogonal Functions). The global methodology that is presented combines the advantages of both methods to get accurate candidates for missing values. An application to the determination of missing values in a fund return database is presented.

Manuscript from author [PDF]

ES2007-124

Self-organized chains for clustering

Hassan Ghaziri

Abstract
This paper presents a new algorithm for clustering. It is a generalisation of the K-means algorithm. Each cluster is represented by a chain of prototypes instead of by a single prototype as in K-means. The chains compete to represent clusters and evolve according to the Kohonen map adaptation rule. It is well known that K-means performs very well with hyper-spherical data and has difficulties in dealing with irregular data. We show on special artificial data that the new algorithm performs very well for different types of data sets. In addition, it shows robustness regarding initial conditions.

Manuscript from author [PDF]

ES2007-59

On the dynamics of Vector Quantization and Neural Gas

Aree Witoelar, Michael Biehl, Anarta Ghosh, Barbara Hammer

Abstract
A large variety of machine learning models which aim at vector quantization have been proposed. However, only very preliminary rigorous mathematical analysis exists concerning their learning behavior, such as convergence speed, robustness with respect to initialization, etc. In this paper, we use the theory of on-line learning for an exact mathematical description of the training dynamics of Vector Quantization mechanisms in model situations. We study update rules including the basic Winner-Takes-All mechanism and the Rank-Based update of the popular Neural Gas network. We investigate a model with three competing prototypes trained from a mixture of Gaussian clusters and compare performances in terms of dynamics, sensitivity to initial conditions and asymptotic results. We demonstrate that rank-based Neural Gas achieves both robustness to initial conditions and best asymptotic quantization error.

Manuscript from author [PDF]

ES2007-33

Three-dimensional self-organizing dynamical systems for discrete structures memorizing and retrieval

Alexander Yudashkin

Abstract
The synthesis concept for a dynamical system with a memory of multiple states, defined using quaternion algebra, is considered. The system memorizes numerous configurations consisting of separate nodes and retrieves any of them from a non-stationary distorted state. Each stored configuration corresponds to a particular attractor of the dynamical system, defined by a set of nonlinear ordinary differential equations in the hypercomplex domain. The model demonstrates intelligence in assembling sample structures based on the initial desire, as shown numerically in the paper. Such models can be used in robotics, complex information systems and pattern recognition tasks.

Manuscript from author [PDF]

ES2007-76

Clustering using genetic algorithm combining validation criteria

Murilo Naldi, André Carvalho

Abstract
Clustering techniques have been a valuable tool for several data analysis applications. However, one of the main difficulties associated with clustering is the validation of the results obtained. Both clustering algorithms and validation criteria present an inductive bias, which can favor datasets with particular characteristics. Besides, different runs of the same algorithm using the same dataset may produce different clusters. In this work, traditional clustering and validation techniques are combined with Genetic Algorithms (GAs) to build clusters that better approximate the real distribution of the dataset. The GA employs a fitness function that combines two validation criteria. Such a combination allows the GA to improve the evaluation of the candidate solutions. Furthermore, this combined approach avoids the individual weaknesses of each criterion. A set of experiments is run to compare the proposed model with other clustering algorithms, with promising results.

Manuscript from author [PDF]

ES2007-112

Toward a robust 2D spatio-temporal self-organization

Thomas Girod, Laurent Bougrain, Frédéric Alexandre

Abstract
Several models have been proposed for spatio-temporal self-organization, among which the TOM model by Wiemer is particularly promising. In this paper, we propose to adapt and extend this model to 2D maps to make it more generic and biologically plausible and more adapted to realistic applications, illustrated here by an application to speech analysis.

Manuscript from author [PDF]

ES2007-58

Adaptive Weight Change Mechanism for Kohonen's Neural Network Implemented in CMOS 0.18 µm Technology

Tomasz Talaska, Rafal Dlugosz, Witold Pedrycz

Abstract
In this paper, we present a block of adaptive weight change (AWC) mechanism for analog current-mode Kohonen's Neural Network (KNN) implemented in CMOS 0.18 µm technology. As the other essential building blocks of KNNs dealing with the calculations of Euclidean distance, formation of a conscience mechanism and determination of the winner-takes-all (WTA) circuits have been already developed, the AWC forms another essential step towards the realization of the network. We show that the proposed network works with small values of analog signals, thus resulting in low power dissipation and chip area when compared with digital realizations of KNNs. Each neuron occupies a chip area of about 1000 µm² and dissipates 20 µW of power for a 20 MHz input data rate.

Manuscript from author [PDF]

ES2007-146

Feature clustering and mutual information for the selection of variables in spectral data

Catherine Krier, Damien Francois, Fabrice Rossi, Michel Verleysen

Abstract
Spectral data often have a large number of highly-correlated features, making feature selection both necessary and uneasy. A methodology combining hierarchical constrained clustering of spectral variables and selection of clusters by mutual information is proposed. The clustering allows reducing the number of features to be selected by grouping similar and consecutive spectral variables together, allowing an easy interpretation. The approach is applied to two datasets related to spectroscopy data from the food industry.

Manuscript from author [PDF]

ES2007-67

Prediction of post-synaptic activity in proteins using recursive feature elimination

Bernardo Carvalho, Ricardo Ribeiro, Talles Medeiros

Abstract
This work presents a new approach to predict post-synaptic activities in proteins. It uses a feature selection technique, called Recursive Feature Elimination, in order to select only the relevant features from the complete database. Once the reduced subset is found, Least Squares Support Vector Machine, an SVM-based classifier, is used to predict its classes. The experiments were performed on a database that was harvested from Swiss Prot/Uniprot, a public domain database with a rich source of information for a very large number of proteins. The obtained results show that the proposed approach led to a reduced representation of the database, using only 6% of the original information, and yielded an improvement in classification when compared to two other prediction techniques applied to the complete database, Decision Tree and Least Squares Support Vector Machine.

Manuscript from author [PDF]

ES2007-18

A new feature selection scheme using data distribution factor for transactional data

Piyang Wang, Tommy W. S. Chow

Abstract
A new efficient unsupervised feature selection method is proposed to handle transactional data. The proposed feature selection method introduces a new Data Distribution Factor (DDF) to select appropriate clusters. This method combines compactness and separation together with a newly introduced concept of singleton item. This new feature selection method is computationally inexpensive and is able to deliver very promising results. Four datasets from the UCI machine learning repository are used in this study. The obtained results show that the proposed method is very efficient and able to deliver very reliable results.

Manuscript from author [PDF]

ES2007-55

Informational cost in correlation-based neuronal networks

Gaetano Liborio Aiello, Carlo Casarino

Abstract
The cost of maintaining a given level of activity in a neuronal network depends on its size and degree of connectivity. Should a neural function require large-size fully-connected networks, the cost can easily exceed metabolic resources, especially for high level neural functions. We show that, even in this case, the cost can still match the energetic resources provided the function is broken down into a set of subfunctions, each assigned to a highly-connected, small-size module, all together forming a correlation-based type network. Cell assemblies are the best examples of this type of network.

Manuscript from author [PDF]

ES2007-13

Controlling complexity of RBF networks by similarity

Ulrich Rückert, Ralf Eickhoff

Abstract
Using radial basis function networks for function approximation tasks suffers from unavailable knowledge about an adequate network size. In this work, a measuring technique is proposed which can control the model complexity and is based on the correlation coefficient between two basis functions. Simulation results show good performance and, therefore, this technique can be integrated in the RBF training procedure.

Manuscript from author [PDF]

ES2007-37

Adaptive Global Metamodeling with Neural Networks

Dirk Gorissen, Wouter Hendrickx, Tom Dhaene

Abstract
Due to the scale and computational complexity of current simulation codes, metamodels (or surrogate models) have become indispensable tools for exploring and understanding the design space. Consequently, there is great interest in techniques that facilitate the construction and evaluation of such approximation models while minimizing the computational cost and maximizing metamodel accuracy. This paper presents a novel, adaptive, integrated approach to global metamodeling with neural networks based on the Multivariate Metamodeling Toolbox. An adaptive, evolutionary inspired, modeling algorithm is presented and its performance compared with rational metamodeling on a number of test problems.

Manuscript from author [PDF]



Convex Optimization for the Design of Learning Machines


ES2007-5

Convex optimization for the design of learning machines

Kristiaan Pelckmans, Johan Suykens, Bart De Moor

Abstract

Manuscript from author [PDF]

ES2007-72

Deploying SDP for machine learning

Tijl De Bie

Abstract
We discuss the use in machine learning of a general type of convex optimisation problem known as semi-definite programming (SDP). We intend to argue that SDPs arise quite naturally in a variety of situations, accounting for their omnipresence in modern machine learning approaches, and we provide examples in support.

Manuscript from author [PDF]

ES2007-73

A metamorphosis of Canonical Correlation Analysis into multivariate maximum margin learning

Sandor Szedmak, Tijl De Bie, David R. Hardoon

Abstract
Canonical Correlation Analysis (CCA) is a useful tool to discover relationships between different sources of information represented by vectors. The solution of the underlying optimization problem involves a generalized eigenproblem and is nonconvex. We will show a sequence of transformations which turn CCA into a convex maximum margin problem. The new formulation can be applied to the same class of problems at a significantly lower computational cost and with better numerical stability.

Manuscript from author [PDF]

ES2007-38

Model Selection for Kernel Probit Regression

Gavin Cawley

Abstract
The convex optimisation problem involved in fitting a kernel probit regression (KPR) model can be solved efficiently via an iteratively re-weighted least-squares (IRWLS) approach. The use of successive quadratic approximations of the true objective function suggests an efficient approximate form of leave-one-out cross-validation for KPR, based on an existing exact algorithm for the weighted least-squares support vector machine. This forms the basis for an efficient gradient descent model selection procedure used to tune the values of the regularisation and kernel parameters. Experimental results are given demonstrating the utility of this approach.

Manuscript from author [PDF]

ES2007-29

Interval discriminant analysis using support vector machines

Cecilio Angulo, Davide Anguita, Luis González

Abstract
Imprecision, incompleteness, prior knowledge or improved learning speed can motivate interval-represented data. Most approaches for SVM learning of interval data use local kernels based on interval distances. We present here a novel approach, suitable for linear SVMs, which makes it possible to deal with interval data without resorting to interval distances. The experimental results confirm the validity of our proposal.

Manuscript from author [PDF]



Generative models and maximum likelihood approaches


ES2007-126

Mixtures of robust probabilistic principal component analyzers

Cédric Archambeau, Nicolas Delannay, Michel Verleysen

Abstract
Discovering low-dimensional (nonlinear) manifolds is an important open problem in Machine Learning. In many applications, the data are living in a high dimensional space. This can lead to serious problems in practice due to the curse of dimensionality. Fortunately, the core of the data lies often on one or several low-dimensional manifolds. A way to handle these is to pre-process the data by nonlinear data projection techniques (see for example Tenenbaum, et al., 2000). An alternative approach is to combine local linear models. In particular, mixtures of probabilistic principal component analyzers (Tipping and Bishop, 1999) are very attractive as each component is specifically designed to extract the local principal orientations in the data. However, an important issue is the model sensitivity to data lying off the manifold, possibly leading to mismatches between successive local models. The mixtures of robust probabilistic principal component analyzers introduced in this paper heal this problem as each component is able to cope with atypical data while identifying the local principal directions. Interestingly, the standard mixture of Gaussians is a particular instance of this more general model.

Manuscript from author [PDF]

ES2007-53

Learning topology of a labeled data set with the supervised generative gaussian graph

Pierre Gaillard, Michaël Aupetit, Gérard Govaert

Abstract
Discovering the topology of a set of labeled data in a Euclidean space can help to design better decision systems. In this work, we propose a supervised generative model based on the Delaunay Graph of some prototypes representing the labeled data in order to extract the topology of the classes.

Manuscript from author [PDF]

ES2007-91

Markovian blind separation of non-stationary temporally correlated sources

Rima Guidara, Shahram Hosseini, Yannick Deville

Abstract
In a previous work, we developed a quasi-efficient maximum likelihood approach for blindly separating stationary, temporally correlated sources modeled by Markov processes. In this paper, we propose to extend this idea to separate mixtures of non-stationary sources. To handle non-stationarity, two methods based respectively on blocking and kernel smoothing are used to find parametric estimates of the score functions of the sources, required for implementing the maximum likelihood approach. Then, the proposed methods exploit simultaneously non-Gaussianity, non-stationarity and time correlation in a quasi-efficient manner. Experimental results using artificial and real data show clearly the better performance of the proposed methods with respect to classical source separation methods.

Manuscript from author [PDF]

ES2007-111

Collaborative Filtering with interlaced Generalized Linear Models

Nicolas Delannay, Michel Verleysen

Abstract
Collaborative Filtering (CF) aims at finding patterns in a sparse matrix of contingency. It can be used for example to mine the ratings given by users on a set of items. In this paper, we introduce a new model for CF based on the Generalized Linear Models formalism. Interestingly, it shares specificities of the model-based and the factorization approaches. The model is simple, and yet it performs very well on the popular MovieLens and Jester datasets.

Manuscript from author [PDF]



Kernel methods and Support Vector Machines


ES2007-105

Computing and stopping the solution paths for ν-SVR

Gilles Gasso, Karina Zapien, Stéphane Canu

Abstract
The paper describes the computation of the full paths of the well-known ν-SVR. In the classical method, the user provides two parameters: the regularization parameter λ and ν, which sets the width of the tube of the ε-insensitive cost optimized by SVR. The paper proposes an efficient way to get all the solutions by varying ν and λ. It also analyses the stopping of the algorithm using the leave-one-out criterion.

Manuscript from author [PDF]

ES2007-30

Optimizing kernel parameters by second-order methods

Shigeo Abe

Abstract
Radial basis function network (RBF) kernels are widely used for support vector machines (SVMs). But for model selection of an SVM, we need to optimize the kernel parameter and the margin parameter by time-consuming cross-validation. In this paper we propose determining parameters for RBF and Mahalanobis kernels by maximizing the class separability by second-order optimization. For multi-class problems, we determine the kernel parameters for all the two-class problems and set all the kernel parameters to the average of these parameter values. Then we determine the margin parameter by cross-validation. By computer experiments on multi-class problems we show that the proposed method works to select optimal or near optimal parameters.

Manuscript from author [PDF]

ES2007-32

A novel kernel-based method for local pattern extraction in random process signals

Majid Beigi, Andreas Zell

Abstract
We consider a class of random process signals which contain randomly positioned local similarities representing the texture of an object. Those repetitive parts may occur in speech, musical pieces and sonar signals. We suggest a warped time-resolved spectrum kernel for extracting the subsequence similarity in time series in general, and as an example in biosonar signals. Having a set of those kernels for similarity extraction in subsequences of different sizes, we propose a new method to find an optimal linear combination and selection of those kernels. We formulate the optimal kernel selection via maximizing the Kernel Fisher Discriminant (KFD) criterion and use the Mesh Adaptive Direct Search (MADS) method to solve the optimization problem. Our method is used for biosonar landmark classification with promising results.

Manuscript from author [PDF]

ES2007-113

One-class SVM regularization path and comparison with alpha seeding

Alain Rakotomamonjy, Manuel DAVY

Abstract
One-class support vector machines (1-SVMs) estimate a level set of the density underlying the observed data. Aside from the kernel selection issue, one difficulty concerns the choice of the 'level' parameter. In this paper, following the work by Hastie et al. (2004), we derive the entire regularization path for $\nu$-1-SVMs. Since this regularization path is efficient for building different level set estimates, we have empirically compared this approach with a state-of-the-art approach based on alpha seeding, and we show that the regularization path is far more efficient.

Manuscript from author [PDF]



Reinforcement Learning


ES2007-4

Reinforcement learning in a nutshell

Verena Heidrich-Meisner, Martin Lauer, Christian Igel, Martin Riedmiller

Abstract

Manuscript from author [PDF]

ES2007-93

A unified view of TD algorithms, introducing Full-gradient TD and Equi-gradient descent TD

Manuel Loth, Philippe Preux, Manuel DAVY

Abstract
This paper addresses policy evaluation in MDPs. It provides a unified view of algorithms such as TD(lambda), LSTD(lambda), iLSTD, and residual-gradient TD. We assert that they all consist of minimizing a gradient function and differ in the form of this function and their means of minimizing it. Building on this unified view, two new schemes are introduced: Full-gradient TD, which uses a generalization of the principle introduced in iLSTD, and EGD TD, which reduces the gradient by successive equi-gradient descents (a generalisation of the LARS algorithm). These three algorithms share the worthy property of using the samples much more efficiently than TD, while keeping the good properties of gradient descent schemes.

Manuscript from author [PDF]

ES2007-125

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

Jan Peters, Stefan Schaal

Abstract
In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates, which are achieved using natural stochastic policy gradients, while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the "building blocks of movement generation", called motor primitives. Motor primitives are parameterized control policies such as splines or nonlinear differential equations with desired attractor properties. We show that our most modern algorithm, the Episodic Natural Actor-Critic, outperforms previous algorithms by at least an order of magnitude. We demonstrate the efficiency of this reinforcement learning method in the application of learning to hit a baseball with an anthropomorphic robot arm.

Manuscript from author [PDF]

ES2007-24

Neural Rewards Regression for near-optimal policy identification in Markovian and partial observable environments

Daniel Schneegass, Steffen Udluft, Thomas Martinetz

Abstract
Neural Rewards Regression (NRR) is a generalisation of Temporal Difference Learning (TD-Learning) and Approximate Q-Iteration with Neural Networks. The method allows trading off between these two techniques as well as between approaching the fixed point of the Bellman iteration and minimising the Bellman residual. NRR explicitly finds the optimal Q-function without an algorithmic framework except Back Propagation for Neural Networks. We further extend the approach by a recurrent substructure to Recurrent Neural Rewards Regression for partially observable environments or higher-order Markov Decision Processes. It allows past information to be transported to the present and the future in order to reconstruct the Markov property.

Manuscript from author [PDF]

ES2007-35

Immediate Reward Reinforcement Learning for Projective Kernel Methods

Colin Fyfe, Pei Ling Lai

Abstract
We extend a reinforcement learning algorithm which has previously been shown to cluster data. We have previously applied the method to unsupervised projection methods, principal component analysis, exploratory projection pursuit and canonical correlation analysis. We now show how the same methods can be used in feature spaces to perform kernel principal component analysis and kernel canonical correlation analysis.

Manuscript from author [PDF]

ES2007-49

Replacing eligibility trace for action-value learning with function approximation

Kary Främling

Abstract
The eligibility trace is one of the most used mechanisms to speed up reinforcement learning. Replacing eligibility traces generally seem to perform better than accumulating eligibility traces. However, replacing traces are currently not applicable when using function approximation methods where states are not represented uniquely by binary values. This paper proposes two modifications to replacing traces that overcome this limitation. Experimental results from the Mountain-Car task indicate that the new replacing traces outperform both the accumulating and the 'ordinary' replacing traces.
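As background for the distinction drawn above, the following is a minimal sketch of the two standard trace updates in the tabular case, where each state has its own trace. It does not implement the two modifications proposed in the paper, which the abstract does not specify. The function name `update_traces` and the parameter `decay` (standing for the product of discount factor and trace-decay rate) are illustrative assumptions.

```python
def update_traces(traces, visited_state, decay, mode="replacing"):
    """Decay every trace, then update the trace of the visited state.

    Tabular illustration only; `decay` is assumed to be gamma * lambda.
    """
    new_traces = [decay * e for e in traces]
    if mode == "accumulating":
        # Accumulating trace: add 1 on every visit, so it can exceed 1.
        new_traces[visited_state] += 1.0
    elif mode == "replacing":
        # Replacing trace: reset to 1 on every visit, so it never exceeds 1.
        new_traces[visited_state] = 1.0
    else:
        raise ValueError("mode must be 'accumulating' or 'replacing'")
    return new_traces


traces_acc = [0.0, 0.0, 0.0]
traces_rep = [0.0, 0.0, 0.0]
for state in [0, 0, 1]:  # state 0 is visited twice in a row, then state 1
    traces_acc = update_traces(traces_acc, state, decay=0.5, mode="accumulating")
    traces_rep = update_traces(traces_rep, state, decay=0.5, mode="replacing")

print(traces_acc)  # [0.75, 1.0, 0.0]
print(traces_rep)  # [0.5, 1.0, 0.0]
```

The reset in the replacing rule relies on a state being identified by a single table entry, which is why it does not carry over directly to function approximators with non-binary features.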

Manuscript from author [PDF]

ES2007-54

The Recurrent Control Neural Network

Anton Maximilian Schaefer, Steffen Udluft, Hans-Georg Zimmermann

Abstract
This paper presents our Recurrent Control Neural Network (RCNN), which is a model-based approach for data-efficient modelling and control of reinforcement learning problems in discrete time. Its architecture is based on a recurrent neural network (RNN), which is extended by an additional control network. The latter has the particular task of learning the optimal policy. This method has the advantage that by using neural networks we can easily deal with high-dimensional or continuous state and action spaces. Furthermore, we can profit from their high system-identification and approximation quality. We show that our RCNN is able to learn a potentially optimal policy by testing it on two different settings of the mountain car problem.

Manuscript from author [PDF]



Learning II


ES2007-3

The Intrinsic Recurrent Support Vector Machine

Daniel Schneegass, Anton Maximilian Schaefer, Thomas Martinetz

Abstract
In this work, we present a new model for a Recurrent Support Vector Machine. We call it intrinsic because the complete recurrence is directly incorporated within the considered optimisation problem. This approach offers the advantage that the model straightforwardly develops an algorithmic solution. We test the algorithm on several simple time series. The results are promising and can be seen as a starting point for further research. By inventing better and more efficient methods and algorithms, we expect that Recurrent Support Vector Machines could become an alternative to handle and simulate dynamical systems.

Manuscript from author [PDF]

ES2007-21

A-LSSVM: an Adaline based iterative sparse LS-SVM classifier

Bernardo Carvalho, Antônio Braga

Abstract
LS-SVM aims at solving the learning problem with a system of linear equations. Although this solution is simpler, there is a loss of sparseness in the feature vectors. We present in this work a new method, A-LSSVM, which uses the neural model Adaline to solve the LS-SVM's linear system while automatically finding the support vectors. The proposed approach is compared with other methods in the literature for imposing sparseness in LS-SVM: Pruning, LS2-SVM, Ada-Pinv and IP-LSSVM. The experiments, performed on three important benchmark databases in Machine Learning, show that all sparse LS-SVMs have an accuracy close to that of SVM, but differ in training time and in the support vectors found.

Manuscript from author [PDF]

ES2007-25

Explicit Kernel Rewards Regression for data-efficient near-optimal policy identification

Daniel Schneegass, Steffen Udluft, Thomas Martinetz

Abstract
We present the Explicit Kernel Rewards Regression (EKRR) approach, as an extension of Kernel Rewards Regression (KRR), for Optimal Policy Identification in Reinforcement Learning. The method uses the Structural Risk Minimisation paradigm to achieve a high generalisation capability. This explicit version of KRR offers at least two important advantages. On the one hand, finding the optimal policy is done by a quadratic program, hence no Policy Iteration techniques are necessary. And on the other hand, the approach allows for the usage of further constraints and certain regularisation techniques as e.g. in Ridge Regression and Support Vector Machines.

Manuscript from author [PDF]

ES2007-69

Kernel-based online machine learning and support vector reduction

Sumeet Agarwal, Saradhi Vedula, Harish Karnick

Abstract
We apply kernel-based machine learning methods to online learning situations, and the related requirement of reducing the complexity of the learnt classifier. Online methods are particularly useful in situations which involve streaming data, such as medical or financial applications. We show that the concept of span of support vectors can be used to build a classifier that performs reasonably well while satisfying given space and time constraints, thus making it potentially suitable for such online situations.

Manuscript from author [PDF]

ES2007-97

Kernel PCA based clustering for inducing features in text categorization

Zsolt Minier, Lehel Csato

Abstract
We study dimensionality reduction or feature selection in the text document categorization problem. We focus on the first step in building text categorization systems, that is, the choice of an efficient numerical representation of the natural language text. This numerical representation is going to be used by machine learning algorithms. We propose a representation based on word clusters. We build a kernel matrix from the word distribution over the different categories and apply kernel PCA to extract a low-dimensional representation of words. On this low-dimensional representation we use K-means clustering to group words into clusters and use these clusters subsequently in the document categorization task. We show that kernel PCA based clustering gives performance better than or comparable to several advanced clustering methods when applied to the standard Reuters corpus.

Manuscript from author [PDF]

ES2007-104

Kernel on Bag of Paths For Measuring Similarity of Shapes

Frederic Suard, Alain Rakotomamonjy, Abdelaziz Benrshrair

Abstract
A common approach for classifying shock graphs is to use a dissimilarity measure on graphs and a distance-based classifier. In this paper, we propose the use of kernel functions for data mining problems on shock graphs. The first contribution of the paper is to extend the class of graph kernels by proposing kernels based on bags of paths. Then, we propose a methodology for using these kernels for shock graph retrieval. Our experimental results show that our approach is very competitive compared to graph matching approaches and is rather robust.

Manuscript from author [PDF]

ES2007-42

Electroencephalogram signal classification for brain computer interfaces using wavelets and support vector machines

Francesc Benimeli, Ken Sharman

Abstract
An electroencephalogram (EEG) signal classification procedure for use in real-time synchronous brain computer interfaces (BCI) is proposed. The features used to perform the classification consist of the coefficients of a discrete wavelet transform (DWT) computed for each trial. A support vector machine (SVM) algorithm has been applied to classify the resultant feature vectors. Some experimental results obtained from applying the proposed procedure to the classification of two mental states are presented.
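To illustrate what a vector of DWT coefficients for one trial can look like, here is a minimal sketch using the Haar wavelet. The abstract does not state which wavelet family or decomposition depth the authors use, so the Haar choice, the full-depth decomposition, and the function name `haar_dwt` are assumptions made only for this example.

```python
import math


def haar_dwt(signal):
    """Full-depth orthonormal Haar DWT of a signal whose length is a power of two.

    Returns [final approximation, coarsest details, ..., finest details].
    Hypothetical helper for illustration; not the authors' feature extractor.
    """
    if len(signal) == 0 or len(signal) & (len(signal) - 1):
        raise ValueError("signal length must be a power of two")
    details = []
    approx = list(signal)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail coefficients capture local differences at this scale.
        level_detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        # Approximation coefficients capture local averages at this scale.
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details = level_detail + details  # coarser levels go in front
    return approx + details


trial = [4.0, 2.0, 6.0, 8.0]    # toy stand-in for one EEG trial
features = haar_dwt(trial)      # feature vector that a classifier could consume
```

Because the transform is orthonormal, the feature vector has the same length and the same energy (sum of squares) as the input trial.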

Manuscript from author [PDF]

ES2007-119

Bat echolocation modelling using spike kernels with Support Vector Regression.

Bertrand Fontaine, Herbert Peremans, Benjamin Schrauwen

Abstract
From the echoes of their vocalisations bats extract information about the positions of reflectors. To gain an understanding of how target position is translated into neural features, we model the bat's peripheral auditory system up until the auditory nerve. This model assumes multiple threshold detecting neurons for each frequency channel where the inter-spike times are linked to the location of the reflector. To show that this coding process can be reversed we compute the kernel product of the spike trains using a non-binned spike kernel function. This approach allows doing regression on azimuth and elevation using Support Vector Machines.

Manuscript from author [PDF]

ES2007-20

Ensemble neural classifier design for face recognition

Terry Windeatt

Abstract
A method for tuning MLP learning parameters in an ensemble classifier framework is presented. No validation set or cross-validation technique is required to optimize parameters for generalisability. In this paper, the technique is applied to face recognition using Error-Correcting Output Coding strategy to solve multi-class problems.

Manuscript from author [PDF]

ES2007-44

Data reduction using classifier ensembles

J.S. Sánchez, L.I. Kuncheva

Abstract
We propose a data reduction approach for finding a reference set for the nearest neighbour classifier. The approach is based on classifier ensembles. Each ensemble member is given a subset of the training data. Using Wilson's editing method, the ensemble member produces a reduced reference set. We explored several routes to make use of these reference sets. The results with 10 real and artificial data sets indicated that merging the reference sets and subsequent editing of the merged set provides the best trade-off between the error and the size of the resultant reference set. This approach can also handle large data sets because only small fractions of the data are edited at a time.
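The editing step mentioned above can be sketched as follows. This is a minimal illustration of Wilson's editing in its usual form (discard every point whose label disagrees with the majority of its k nearest neighbours, found leave-one-out); it does not reproduce the ensemble construction or the merging strategies studied in the paper. The function name `wilson_edit`, the choice k = 3, and the toy data are assumptions.

```python
from collections import Counter


def wilson_edit(points, labels, k=3):
    """Return indices of points kept by Wilson's editing.

    A point is kept when its label matches the majority label of its
    k nearest neighbours among all other points (squared Euclidean distance).
    """
    kept = []
    for i, p in enumerate(points):
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )
        votes = Counter(labels[j] for _, j in neighbours[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            kept.append(i)
    return kept


# Toy 1-D data: the point at 0.15 carries label 1 but sits inside the class-0 cluster.
points = [(0.0,), (0.1,), (0.2,), (0.15,), (1.0,), (1.1,), (1.2,)]
labels = [0, 0, 0, 1, 1, 1, 1]
kept = wilson_edit(points, labels, k=3)
print(kept)  # [0, 1, 2, 4, 5, 6]
```

In an ensemble setting along the lines of the abstract, each member would call such a routine on its own subset of the training data, so only a small fraction of the data is edited at a time.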

Manuscript from author [PDF]

ES2007-108

ICA-based High Frequency VaR for Risk Management

Patrick Kouontchou, Bertrand Maillet

Abstract
Independent Component Analysis (ICA, see Comon, 1994 and Hyvärinen et al., 2001) is more appropriate when non-linearity and non-normality are at stake, as mentioned by Back and Weigend (1997) in a financial context. Using high-frequency data on the French Stock Market, we evaluate this technique when generating scenarii for accurate Value-at-Risk computations, thereby reducing the effective dimensionality of the scenario specification problem in several cases, without leaving aside some main characteristics of the dataset. Various methods for specifying stress scenarii are discussed and compared to other published ones (see Giot and Laurent, 2004), and classical tests of rejection are presented (Kupiec, 1997, Christoffersen and Pelletier, 2003).

Manuscript from author [PDF]

ES2007-80

Algebraic inversion of an artificial neural network classifier

Travis Wiens, Rich Burton, Greg Schoenau

Abstract
Artificial neural networks are, by their definition, non-linear functions. Typically, this means that it is impossible to find a closed-form solution for the inverse function of a neural network. This paper presents a special form of neural network classifier that allows for its algebraic inversion in order to find the boundary between classes. The control of the fuel-air ratio in a spark ignition engine is given as an example.

Manuscript from author [PDF]

ES2007-123

Estimation of tangent planes for neighborhood graph correction

Karina Zapien, Gilles Gasso, Stéphane Canu

Abstract
Local algorithms for non-linear dimensionality reduction and semi-supervised learning algorithms based on spectral decomposition have become quite popular. One drawback of these lies in the fact that a nearest-neighborhood graph has to be built in order to decide which two points are to be kept close. In the presence of shortcuts (union of two points whose distance measured along the submanifold is actually large), the resulting embedding will be unsatisfactory. This paper proposes an algorithm to detect wrong graph connections based on the tangent plane of the manifold at each point; this leads to the estimation of the proper number of neighbors for each point in the dataset. Experiments show that the construction of the graph can be improved with this method.

Manuscript from author [PDF]

ES2007-128

Estimating the Number of Components in a Mixture of Multilayer Perceptrons

Madalina Olteanu, Joseph Rynkiewicz

Abstract
In this paper we are interested in estimating the number of components in a mixture of multilayer perceptrons. The penalized marginal-likelihood criterion for mixture models and hidden Markov models introduced by Keribin (2000) and Gassiat (2002) is extended to mixtures of multilayer perceptrons. We prove the consistency of the BIC criterion under some hypotheses which essentially involve the bracketing entropy of the class of generalized score functions, and we check the assumptions of the main result.

Manuscript from author [PDF]



Biologically motivated learning


ES2007-88

Derivation of nonlinear amplitude equations for the normal modes of a self-organizing system

Junmei Zhu, Christoph von der Malsburg

Abstract
We point out a basically well-known pathway to the analysis of self-organizing systems that is now well within reach of numerical methods. Systems of coupled nonlinear differential equations are decomposed into normal modes and are reduced, by adiabatic elimination of stable modes, to a much smaller system of unstable modes and their nonlinear interaction. In the past, this treatment was accessible only for highly idealized model systems. Guided by an application to retinotopic map formation, we discuss the extension to more realistic cases.

Manuscript from author [PDF]

ES2007-27

A neural model of cross-modal association in insects

Jan Wessnitzer, Barbara Webb

Abstract
We developed a computational model of learning in the Mushroom Body, a region of multimodal integration in the insect brain. Using realistic neural dynamics and a biologically-based learning rule (spike timing dependent plasticity), the model is tested as part of an insect brain inspired architecture within a closed loop behavioural task. Replicating in simulation an experiment carried out on bushcrickets, we show the system can successfully associate visual to auditory cues, so as to maintain a steady heading towards an intermittent sound source.

Manuscript from author [PDF]

ES2007-106

Transition from initialization to working stage in biologically realistic networks

Andreas Herzog, Kube Karsten, Michaelis Bernd, deLima Ana D., Voigt Thomas

Abstract
In biology, during the early development of cortical neurons into a mature functional network, a complex set of development steps is necessary. One of the key elements here is the transition of the network dynamics, which start from a slow synchronous activity in an early differentiation phase and develop into mature firing with complex high-order patterns of spikes and bursts. In this modeling study we investigate the properties the network requires to initialize this transition by the switching of the reversal potential of the GABAergic synapses. The simulated networks are generated by a statistical first-order description of parameters for the neuron model and the network architecture.

Manuscript from author [PDF]

ES2007-95

A supervised learning approach based on STDP and polychronization in spiking neuron networks

Hélène Paugam-Moisy, Régis Martinez, Samy Bengio

Abstract
We propose a network model of spiking neurons, without preimposed topology and driven by STDP (Spike-Time-Dependent Plasticity), a temporal Hebbian unsupervised learning mode, biologically observed. The model is further driven by a supervised learning algorithm, based on a margin criterion, that has effect on the synaptic delays linking the network to the output neurons, with classification as a goal task. The network processing and the resulting performance are completely explainable by the concept of polychronization, proposed by Izhikevich (2006). The model emphasizes the computational capabilities of this concept.

Manuscript from author [PDF]



Learning causality


ES2007-6

Computational Intelligence approaches to causality detection

Katerina Hlavackova-Schindler, Pablo F. Verdes

Abstract

Manuscript from author [PDF]

ES2007-149

Distinguishing between cause and effect via kernel-based complexity measures for conditional distributions

Xiaohai Sun, Dominik Janzing, Bernhard Schoelkopf

Abstract
We propose a method to evaluate the complexity of probability measures from data that is based on a reproducing kernel Hilbert space seminorm of the logarithm of conditional probability densities. The motivation is to provide a tool for a causal inference method which assumes that conditional probabilities for effects given their causes are typically simpler and smoother than vice-versa. We present experiments with toy data where the quantitative results are consistent with our intuitive understanding of complexity and smoothness. Also in some examples with real-world data the probability measure corresponding to the true causal direction turned out to be less complex than those of the reversed order.

Manuscript from author [PDF]

ES2007-147

Causality analysis of LFPs in micro-electrode arrays based on mutual information

Nikolay Manyakov, Marc Van Hulle

Abstract
Since perceptual and motor processes in the brain are the result of interactions between neurons, layers and brain areas, a lot of attention has been directed towards the development of techniques to unveil these interactions both in terms of connectivity and direction of interaction. Several techniques are derived from the Granger causality principle and are based on multivariate autoregressive modeling, so that they can only account for the linear aspect of these interactions. We propose here a technique based on conditional mutual information which enables us to describe not only the directions of nonlinear connections, but also their time delays. We compare our technique with others using ground-truth data, i.e., data for which the connectivity is known. As an application, we consider local field potentials (LFPs) recorded with the 96 micro-electrode UTAH array implanted in area V4 of the macaque monkey's visual cortex.

Manuscript from author [PDF]

ES2007-43

Learning causality by identifying common effects with kernel-based dependence measures

Xiaohai Sun, Dominik Janzing

Abstract
We describe a method for causal inference that measures the strength of statistical dependence by the Hilbert-Schmidt norm of kernel-based conditional cross-covariance operators. We consider the increase of the dependence of two variables X and Y by conditioning on a third variable Z as a hint for Z being a common effect of X and Y. Based on this assumption, we collect "votes" for hypothetical causal directions and orient the edges according to the majority vote. For most of our experiments with artificial and real-world data our method has outperformed the conventional constraint-based inductive causation (IC) algorithm.

Manuscript from author [PDF]

ES2007-26

Causality and communities in neural networks

Leonardo Angelini, Daniele Marinazzo, Mario Pellicoro, Sebastiano Stramaglia

Abstract
A recently proposed nonlinear extension of Granger causality is used to map the dynamics of a neural population onto a graph, whose community structure characterizes the collective behavior of the system. Both the number of communities and the modularity depend on transmission delays and on the learning capacity of the system.

Manuscript from author [PDF]

ES2007-148

Exploring the causal order of binary variables via exponential hierarchies of Markov kernels

Xiaohai Sun, Dominik Janzing

Abstract
We propose a new algorithm for estimating the causal structure that underlies the observed dependence among n (n>=4) binary variables X_1,...,X_n. Our inference principle states that the factorization of the joint probability into conditional probabilities for X_j given X_1,...,X_{j-1} often leads to simpler terms if the order of variables is compatible with the directed acyclic graph representing the causal structure. We study joint measures of OR/AND gates and show that the complexity of the conditional probabilities (the so-called Markov kernels), defined by a hierarchy of exponential models, depends on the order of the variables. Some toy and real-data experiments support our inference rule.

Manuscript from author [PDF]



Reservoir Computing


ES2007-8

An overview of reservoir computing: theory, applications and implementations

Benjamin Schrauwen, David Verstraeten, Jan Van Campenhout

Abstract

Manuscript from author [PDF]

ES2007-39

Spiral Recurrent Neural Network for Online Learning

Huaien Gao, Rudolf Sollacher, Hans-Peter Kriegel

Abstract
Autonomous, self* sensor networks require sensor nodes with a certain degree of "intelligence". An elementary component of such "intelligence" is the ability to learn online to predict sensor values. We consider recurrent neural network (RNN) models trained with an extended Kalman filter algorithm based on real time recurrent learning (RTRL) with teacher forcing. We compared the performance of conventional neural network architectures with that of spiral recurrent neural networks (Spiral RNN), a novel RNN architecture combining a trainable hidden recurrent layer with the "echo state" property of echo state neural networks (ESNN). We found that this novel RNN architecture shows more stable performance and faster convergence.

Manuscript from author [PDF]

ES2007-74

Several ways to solve the MSO problem

Jochen Jakob Steil

Abstract
The so-called MSO problem (a simple superposition of two or more sinusoidal waves) has recently been discussed as a benchmark problem for reservoir computing and was shown not to be learnable by standard echo state regression. However, we show that there are at least three simple ways to learn the MSO signal: by introducing a time window on the input, by changing the network time step to match the sampling rate of the signal, and by reservoir adaptation. The latter approach is based on a universal principle to implement a sparsity constraint on the activity patterns of the network neurons, which improves spatio-temporal encoding in the network.
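One way to see why a time window on the input can help is that a sum of two sinusoids obeys an exact linear recurrence of order four, so the next sample is a fixed linear function of the previous four. The sketch below checks this numerically. It is an illustration of that mathematical fact rather than the reservoir setup of the paper; the angular frequencies 0.2 and 0.311 and the function names are assumptions chosen for the example.

```python
import math

FREQS = (0.2, 0.311)  # illustrative angular frequencies (assumed, not from the abstract)


def mso(n, freqs=FREQS):
    """Superposition of sinusoids sampled at integer time step n."""
    return sum(math.sin(w * n) for w in freqs)


def predict_from_window(window, freqs=FREQS):
    """Predict x[n] from window = (x[n-4], x[n-3], x[n-2], x[n-1]).

    Each sinusoid satisfies x[n] = 2*cos(w)*x[n-1] - x[n-2]; multiplying the two
    characteristic polynomials gives the order-4 recurrence used here.
    """
    c1, c2 = (math.cos(w) for w in freqs)
    s = 2.0 * (c1 + c2)
    p = 2.0 + 4.0 * c1 * c2
    x4, x3, x2, x1 = window
    return s * x1 - p * x2 + s * x3 - x4


signal = [mso(n) for n in range(200)]
max_error = max(
    abs(predict_from_window(signal[n - 4:n]) - signal[n]) for n in range(4, 200)
)
print(max_error)  # close to floating-point precision
```

In a learning setting the four coefficients would be fitted by linear regression on windowed inputs instead of being computed from known frequencies.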

Manuscript from author [PDF]

ES2007-114

Adapting reservoir states to get Gaussian distributions

David Verstraeten, Benjamin Schrauwen, Dirk Stroobandt

Abstract
We present an online adaptation rule for reservoirs that is inspired by Intrinsic Plasticity (IP). The IP rule maximizes the information content of the reservoir state by adapting it so that the distribution approximates a given target. Here we fix the variance of the target distribution, which results in a Gaussian distribution. We apply the rule to two tasks with quite different temporal and computational characteristics.

Manuscript from author [PDF]

ES2007-134

Structured reservoir computing with spatiotemporal chaotic attractors

Carlos Lourenço

Abstract
We approach the themes "computing with chaos" and "reservoir computing" in a unified setting. Different neural architectures are mentioned which display chaotic as well as reservoir properties. The architectures share a common topology of close-neighbor connections which supports different types of spatiotemporal dynamics in continuous time. We bring up the role of spatiotemporal structure and associated symmetries in reservoir-mediated pattern processing. This type of computing is somewhat different from most other examples of reservoir computing.

Manuscript from author [PDF]

ES2007-68

A first attempt of reservoir pruning for classification problems

Xavier Dutoit, Hendrik Van Brussel, Marnix Nuttin

Abstract
Reservoir Computing is a new paradigm for using artificial neural networks. Despite its promising performance, it still has some drawbacks: as the reservoir is created randomly, it needs to be large enough to capture all the features of the data. Here we propose a method to start with a large reservoir and then reduce its size by pruning out neurons. We then apply this method to a prototypical and a real problem. Both applications show that it improves the performance for a given number of neurons.

Manuscript from author [PDF]

ES2007-98

Intrinsic plasticity for reservoir learning algorithms

Marion Wardermann, Jochen Jakob Steil

Abstract
Recently, a new class of learning algorithms has been proposed: reservoir algorithms ([1]). Their learning ability relies heavily on the properties of the reservoir, which is held fixed during learning. In 2005, a fast, biologically plausible learning algorithm, intrinsic plasticity (IP, [2]), was proposed, which steers an analog neuron’s output distribution. We show in this article in what way IP alters the properties of the reservoir and enhances the learning behaviour of reservoir learning algorithms, especially that of Backpropagation–Decorrelation Recurrent Learning (BPDC, [3]).

Manuscript from author [PDF]



Learning III


ES2007-19

Bifurcation analysis for a discrete-time Hopfield neural network of two neurons with two delays

Eva Kaslik, Stefan Balint

Abstract
In this paper, a bifurcation analysis is undertaken for a discrete-time Hopfield neural network of two neurons with two different delays and self-connections. Conditions ensuring the asymptotic stability of the null solution are found, with respect to two characteristic parameters of the system. It is shown that for certain values of these parameters, fold or Neimark-Sacker bifurcations occur, but codimension 2 (fold-Neimark-Sacker, double Neimark-Sacker and resonance 1:1) bifurcations may also be present. The direction and the stability of the Neimark-Sacker bifurcations are investigated by applying the center manifold theorem and the normal form theory.

Manuscript from author [PDF]

ES2007-45

Spicules-based competitive neural network

Jose Antonio Gomez-Ruiz, Jose Muñoz-Perez, M. Angeles Garcia-Bernal, Ezequiel Lopez-Rubio

Abstract
We present a new model of unsupervised competitive neural network, based on spicules. This model is capable of detecting topological information of an input space, determining its orientation and, in most cases, its skeleton.

Manuscript from author [PDF]

ES2007-47

Sparsely-connected associative memory models with displaced connectivity

Lee Calcraft, Rod Adams, Neil Davey

Abstract
Our work is concerned with finding optimum connection strategies in high-performance associative memory models. Taking inspiration from axonal branching in biological neurons, we impose a displacement of the point of efferent arborisation, so that the output from each node travels a certain distance before branching to connect to other units. This technique is applied to networks constructed with a connectivity profile based on Gaussian distributions, and the results compared to those obtained with a network containing purely local connections, displaced in the same manner. It is found that displacement of the point of arborisation has a very beneficial effect on the performance of both network types, with the displaced locally-connected network performing the best.

Manuscript from author [PDF]

ES2007-94

RNN-based Learning of Compact Maps for Efficient Robot Localization

Alexander Förster, Alex Graves, Jürgen Schmidhuber

Abstract
We describe a new algorithm for robot localization, efficient both in terms of memory and processing time. It transforms a stream of laser range sensor data into a probabilistic calculation of the robot's position, using a bidirectional Long Short-Term Memory (LSTM) recurrent neural network (RNN) to learn the structure of the environment and to answer queries such as: in which room is the robot? To achieve this, the RNN builds an implicit map of the environment.

Manuscript from author [PDF]

ES2007-92

Human motion recognition using Nonlinear Transient Computation

Nigel Crook, Wee Jin Goh

Abstract
A novel approach to human motion recognition is proposed that is based on a variation of the Nonlinear Transient Computation Machine (NTCM). The motion data used to train the NTCM comes from point-light display video sequences of a human walking. The NTCM is trained to distinguish sequences of video frames that depict coordinated walking motion from those that depict uncoordinated (random) motion.

Manuscript from author [PDF]

ES2007-14

Automatically searching near-optimal artificial neural networks

Leandro Almeida, Teresa Ludermir

Abstract
The idea of automatically searching neural networks that learn faster and generalize better is becoming increasingly widespread. In this paper, we present a new method for searching near-optimal artificial neural networks that include initial weights, transfer functions, architectures and learning rules that are specially tailored to a given problem. Experimental results have shown that the method is able to produce compact, efficient networks with satisfactory generalization power and shorter training times.

Manuscript from author [PDF]

ES2007-129

A new decision strategy in multi-objective training of the artificial neural networks

Talles Medeiros, Ricardo Takahashi, Antônio Braga

Abstract
This work presents a new proposal for selecting a model in the multi-objective training method of the Artificial Neural Network (NN). To do this, information from the residue of the Pareto optimal solution is used. Deciding in favour of minimum autocorrelation of the data is a criterion that guarantees the extraction of the current information in the noisy data. The experiments show the performance of the proposed DM for variations of the supervised learning problems.

Manuscript from author [PDF]

ES2007-86

Functional elements and networks in fMRI

Jarkko Ylipaavalniemi, Eerika Savia, Ricardo Vigário, Samuel Kaski

Abstract
We propose a two-step approach for the analysis of functional magnetic resonance images, in the context of natural stimuli. In the first step, elements of functional brain activity emerge, based on spatial independence assumptions. The second step exploits temporal covariation between the elements and given features of the natural stimuli to identify functional networks. The networks can have complex activation patterns related to common task goals.

Manuscript from author [PDF]

ES2007-77

Feature extraction for EEG classification: representing electrode outputs as a Markov stochastic process

Liang Wu, Predrag Neskovic

Abstract
In this work we introduce a new model for representing EEG signals and extracting discriminative features. We treat the outputs of each electrode as a stochastic process and assume that the sequence of variables forming a process is stationary and Markov. To capture temporal dependences within an electrode we use conditional entropy, and to capture dependences between different electrodes we use conditional mutual information features of increasing complexity. We show that even when using a small number of sampling points for their estimation (e.g. a single trial) these features carry discriminative information. We test the usefulness of these features by classifying the EEG data from n-back memory tasks.
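
As a rough illustration of the conditional entropy feature mentioned in this abstract, the sketch below computes a plug-in estimate of H(X_t | X_{t-1}) for one discretized channel under a first-order stationary Markov assumption. The abstract does not state the authors' estimator or how the signal is discretized, so the function names, the binary thresholding and the choice of bits as unit are all assumptions made for this example only.

```python
import math
from collections import Counter


def discretize(signal, threshold=0.0):
    """Map a real-valued signal to binary symbols (assumed scheme, not from the paper)."""
    return [1 if x > threshold else 0 for x in signal]


def conditional_entropy(symbols):
    """Plug-in estimate of H(X_t | X_{t-1}) in bits for a discrete sequence.

    Assumes a first-order, stationary Markov process, so all consecutive
    pairs in the sequence are pooled to estimate the probabilities.
    """
    pairs = list(zip(symbols[:-1], symbols[1:]))
    if not pairs:
        raise ValueError("need at least two symbols")
    pair_counts = Counter(pairs)
    prev_counts = Counter(prev for prev, _ in pairs)
    n = len(pairs)
    h = 0.0
    for (prev, _cur), count in pair_counts.items():
        p_joint = count / n                 # estimate of p(x_{t-1}, x_t)
        p_cond = count / prev_counts[prev]  # estimate of p(x_t | x_{t-1})
        h -= p_joint * math.log2(p_cond)
    return h


# Hypothetical usage on a short, made-up signal
symbols = discretize([0.3, -0.1, 0.4, -0.2, 0.5, -0.7])
feature = conditional_entropy(symbols)
```

A low value indicates that the next symbol is largely predictable from the previous one; with only a single trial of samples, as the abstract discusses, such plug-in estimates will be noisy.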

Manuscript from author [PDF]

ES2007-107

A hierarchical model for syllable recognition

Xavier Domont, Martin Heckmann, Heiko Wersing, Frank Joublin, Christian Goerick

Abstract
Inspired by recent findings on the similarities between the primary auditory and visual cortex, we propose a neural network for speech recognition based on a hierarchical feedforward architecture for visual object recognition. When using a Gammatone filterbank for the spectral analysis, the resulting spectrograms of syllables can be interpreted as images. After a preprocessing step that enhances the formants in the speech signal and a length normalization, the images can then be fed into the visual hierarchy. We demonstrate the validity of our approach on the recognition of 25 different monosyllabic words and compare the results to the Sphinx-4 speech recognition system. Especially for noisy speech our hierarchical model achieves a clear improvement.

Manuscript from author [PDF]

ES2007-50

Classification of computer intrusions using functional networks. A comparative study

Amparo Alonso-Betanzos, Noelia Sánchez-Maroño, Félix M. Carballal-Fortes, Juan A. Suárez-Romero, Beatriz Pérez-Sánchez

Abstract
Intrusion detection is a problem that has attracted a great deal of attention from computer scientists lately, due to the exponential increase in computer attacks in recent years. DARPA KDD Cup 99 is a standard dataset for classifying computer attacks to which several machine learning techniques have been applied. In this paper, we describe the results obtained using functional networks - a paradigm that extends feedforward neural networks - and compare these to the results obtained by other techniques applied to the same dataset. Of particular interest is the capacity for generalization of the approach used.

Manuscript from author [PDF]

ES2007-36

Identification of churn routes in the Brazilian telecommunications market

David L. García, Alfredo Vellido, Angela Nebot

Abstract
The globalization and deregulation of business environments are rapidly shifting the competitive challenges that telecommunications service providers face. As a result, many of these companies are focusing on the preservation of existing customers and the limitation of customer attrition damages. In this brief paper, we investigate the existence of abandonment routes in the Brazilian telecommunications market, according to the customers’ service consumption pattern. A non-linear latent variable model of the manifold learning family is used to segment and visualize the data, as well as to identify typical churn routes.

Manuscript from author [PDF]
