ESANN1999
7th European Symposium on Artificial Neural Networks
Bruges, Belgium, April 21-22-23



ESANN 1999
Content of the proceedings

WARNING: you need Adobe Acrobat Reader 4.0 or later to view the PDF files below.



Dynamical systems


ES1999-1
Synchronizing chaotic neuromodules
F. Pasemann

Abstract
We discuss the time-discrete parametrized synchronous dynamics of two coupled chaotic neuromodules. The symmetrical coupling of identical 2-neuron modules results in periodic, quasiperiodic as well as chaotic dynamics constrained to a synchronization manifold M. Stability of the synchronized dynamics is calculated by transversal Lyapunov exponents. In addition to synchronized attractors, asynchronous periodic, quasiperiodic or even chaotic attractors often co-exist. Simulation results for selected sets of parameters are presented.

Manuscript from author [PDF]

ES1999-14
Mean-field equations reveal synchronization in a 2-populations neural network model
E. Daucé, O. Moynot, O. Pinaud, M. Samuelides, B. Doyon

Abstract
We study a 2-population model of an analog recurrent neural network. This model takes into account the influence of inhibitory and excitatory neurons. It is dedicated to the study of collective dynamical properties of large, fully connected recurrent networks. The evolution of neuron activation states is given in the thermodynamic limit by a set of mean-field equations, and the network satisfies a "propagation of chaos" property. All these results are supported by rigorous proofs using large deviations techniques. Moreover, we observe that the bifurcation diagram of these mean-field equations, as well as finite-size simulations, reveals a parametric domain where the expectation and variance of the limit law of the activation potentials describe periodic oscillations. Fluctuations of individual neurons around this average may occur, showing the existence of a stochastic non-stationary regime for long times. This can be directly related to recent biological discoveries about the role of inhibition in the synchronization of excitatory neurons.

Manuscript from author [PDF]



Self-organization


ES1999-13
A hierarchical self-organizing feature map for analysis of not well separable clusters of different feature density
S. Schünemann, B. Michaelis

Abstract
This paper introduces a hierarchical Self-Organizing Feature Map (SOFM). The partial maps consist of individual numbers of neurons, which makes a cluster analysis with different degrees of resolution possible. A definition of a special Mahalanobis space of the data set during the learning improves the properties concerning clusters with low density.

Manuscript from author [PDF]

ES1999-47
Using the Kohonen algorithm for quick initialization of Simple Competitive Learning algorithm
E. de Bodt, M. Cottrell, M. Verleysen

Abstract
In a previous paper ([1], ESANN’97), we compared the Kohonen algorithm (SOM) to the Simple Competitive Learning algorithm (SCL) when the goal is to reconstruct an unknown density. We showed that for that purpose, the SOM algorithm quickly provides an excellent approximation of the initial density, when the frequencies of each class are taken into account to weight the quantifiers of the classes. Another important property of the SOM is the well-known topology conservation, which implies that neighbor data are classified into the same class (as usual) or into neighbor classes. In this paper, we study another interesting property of the SOM algorithm, which holds for any fixed number of quantifiers. We show that even if we use those approaches only for quantization, the SOM algorithm can be successfully used to greatly accelerate the convergence of the classical Simple Competitive Learning algorithm (SCL).
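
As a rough illustration of the two update rules being compared, the sketch below uses a one-dimensional chain of scalar code vectors. The function names `update` and `train`, the learning rate and the neighbourhood radius are assumptions made for this example and are not taken from the paper. With `radius=0` only the winning unit moves (SCL); with `radius>0` the winner's neighbours on the chain move as well (a SOM-style step). Running SOM-style epochs first and plain SCL epochs afterwards mirrors the initialization idea in the abstract.

```python
def update(codebook, x, lr, radius):
    """Return a new codebook after one competitive-learning step on sample x.

    radius = 0 moves only the winning unit (Simple Competitive Learning);
    radius > 0 also moves units within that index distance of the winner
    (a Kohonen-style step on a 1-D chain).
    """
    # Winner: index of the code vector closest to the sample.
    winner = min(range(len(codebook)), key=lambda j: abs(codebook[j] - x))
    new_codebook = list(codebook)
    for j in range(len(codebook)):
        if abs(j - winner) <= radius:
            new_codebook[j] += lr * (x - codebook[j])
    return new_codebook


def train(codebook, data, lr, som_epochs, scl_epochs, radius=1):
    """Run SOM-style epochs first, then plain SCL epochs (hypothetical schedule)."""
    for _ in range(som_epochs):
        for x in data:
            codebook = update(codebook, x, lr, radius)
    for _ in range(scl_epochs):
        for x in data:
            codebook = update(codebook, x, lr, 0)
    return codebook
```

A practical schedule would also decay the learning rate and the radius over time; both are held fixed here to keep the sketch short.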

Manuscript from author [PDF]



Special session: Adaptive computation of data structures


ES1999-307
Learning in structured domains
M. Gori

Abstract
By and large, learning from examples in the machine learning literature refers to static data types. That main stream of interest, however, has had significant bifurcations (see e.g. the learning issues connected with syntactic and structured pattern recognition) arisen from the need to exploit the structure inherently attached to the data of some learning tasks. In this paper, I review briefly the research carried out in the last few years in the area of connectionist models in the attempt to extend the corresponding learning approaches to the case of structured domain. I give a unified picture of the adaptive computation which can be carried out on graphical objects and show that, under certain restrictions on the kind of graph to be processed, the classic learning algorithm for feedforward networks can be straightforwardly extended.

Manuscript from author [PDF]

ES1999-301
Approximation capabilities of folding networks
B. Hammer

Abstract
In this paper we show several approximation results for folding networks - a generalization of partial recurrent neural networks such that not only time sequences but arbitrary trees can serve as input: Any measurable function can be approximated in probability. Any continuous function can be approximated in the maximum norm on inputs with restricted height, but the resources necessarily increase at least exponentially in the input height. In general, approximation on arbitrary inputs is not possible in the maximum norm.

Manuscript from author [PDF]

ES1999-304
Tree-recursive computation of gradient information for structures
A. Kuechler

Abstract
Recently, the so-called Backpropagation Through Structure (BPTS) gradient calculation algorithm has been developed to capture learning scenarios where data is adequately represented by hybrid continuous-discrete structures (e.g. labeled ordered trees, nodes augmented by continuous information). BPTS can be viewed as an extension of the well-known Backpropagation Through Time (BPTT) algorithm for discrete-time dynamical systems and sequence processing. The well-known (functionally equivalent) Real-time Recurrent Learning (RTRL) algorithm has to be favored over BPTT if long sequences are processed. This paper investigates whether and how RTRL can be generalized - while conserving its appealing algorithmic properties - to calculate the gradient information for models operating on the domain of rooted labeled ordered trees. The answer is partly negative. It turns out that a postorder traversal of the tree has to be obeyed in order to keep the space consumption independent of the size of the input structures. By processing vertices in an inverse topological ordering, the algorithm can also be applied to labeled directed ordered acyclic graphs. However, we show that on this graph domain the memory consumption grows (in the worst case) linearly with the size of the input structure.

Manuscript from author [PDF]

ES1999-305
Learning search-control heuristics for automated deduction systems with folding architecture networks
C. Goller

Abstract
During the last years, folding architecture networks and the closely related concept of recursive neural networks have been developed for solving supervised learning tasks on data structures. In this paper, these networks are applied to the problem of learning search-control heuristics for automated deduction systems. Experimental results with the automated deduction system Setheo in an algebraic domain show a considerable performance improvement. Controlled by heuristics which had been learned from simple problems in this domain the system is able to solve several problems from the same domain which had been out of reach for the original system.

Manuscript from author [PDF]

ES1999-306
A topological transformation for hidden recursive models
F. Costa, P. Frasconi, G. Soda

Abstract
Discriminant hidden Markov models can be generalized from strings to labeled acyclic structures and, in particular, ordered trees [6, 7]. Inference and parameter estimation algorithms for this class of models can be derived in a straightforward way as special instances of inference and learning algorithms for Bayesian networks. However, if we are interested in building a discriminant model, in which arrows are directed towards the root of the tree, the model turns out to be intractable since the number of parameters grows exponentially with the number of neighbors of each node. In this paper we describe a topological transformation that maps ordered trees into binary trees, thus making the total number of parameters independent of the number of neighbors, as for the case of generative models. Besides reducing complexity, it also permits to deal with general ordered trees without imposing a priori a limit on the maximum outdegree. We show that the topological transformation maps regular sets of trees into regular sets of binary trees and, as a result, it does not affect the possibility of classifying trees with a finite state device. Finally, experimental results from a logo classification task are shown.

Manuscript from author [PDF]

ES1999-302
Neural learning of approximate simple regular languages
M. Forcada, A. Corbi, M. Gori, M. Maggini

Abstract
Discrete-time recurrent neural networks (DTRNN) have been used to infer DFA from sets of examples and counterexamples; however, discrete algorithmic methods are much better at this task and clearly outperform DTRNN in space and time complexity. We show, however, how DTRNN may be used to learn not the exact language that explains the whole learning set but an approximate and much simpler language that explains a great majority of the examples by using simple rules. This is accomplished by gradually varying the error function in such a way that the net is eventually allowed to classify clearly but incorrectly those strings that are difficult to learn, which are treated as exceptions. The results show that in this way, the DTRNN usually learns a simplified approximate language.

Manuscript from author [PDF]

ES1999-355
A benchmark for testing adaptive systems on structured data
M. Hagenbuchner, A.C. Tsoi

Abstract
A number of adaptive methods capable of coping with structured data have emerged recently. Until recently, it was difficult to compare the performance of these methods, as there were no universally accepted benchmark problems. As a result, we have developed a methodology to generate a benchmark problem sufficiently flexible to permit the simulation of a wide range of structured data learning problems, sufficiently fast to generate a set of patterns in a reasonable time, and sufficiently small to allow easy access to needed data. The benchmark described in this paper is an artificial learning task consisting of images that feature objects built through rules expressed by an attributed plex grammar. There are a number of advantages in utilizing this methodology. First, it can be well defined by using an attributed plex grammar. There is no need for the provision of a huge dataset of images, as sets for training and testing can quickly be produced through a given grammar. But most importantly, this benchmark encapsulates some of the typical problems encountered in data processing of structured information. This paper illustrates this methodology by means of a traffic policeman problem. The patterns are used to generate data-trees as inputs for a typical adaptive learning algorithm. Preliminary tests show that some of these newly emerged adaptive learning algorithms perform very well compared to conventional methods.

Manuscript from author [PDF]



Methodology


ES1999-5
The application of neural networks to the paper-making industry
P.J. Edwards, A. F. Murray, G. Papadopoulos, A.R. Wallace, J. Barnard

Abstract
This paper describes the application of neural network techniques to the paper-making industry, particularly for the prediction of paper “curl”. Paper curl is a common problem and can only be measured reliably off-line, after manufacture. Model development is carried out using imperfect data, typical of that collected in many manufacturing environments, and addresses issues pertinent to real-world use. Predictions then are presented in terms that are relevant to the machine operator, as a measure of paper acceptability, a direct prediction of the quality measure, and always with a measure of prediction confidence. Therefore, the techniques described in this paper are widely applicable to industry.

Manuscript from author [PDF]

ES1999-7
Marble slabs quality classification system using texture recognition and neural networks methodology
J. Martinez-Cabeza de Vaca Alajarin, L.-M. Tomas Balibrea

Abstract
This article describes the use of an LVQ neural network for the clustering and classification of marble slabs according to their texture. The method used for the recognition of textures is based on the Sum and Difference Histograms, a faster version of the Co-occurrence Matrices. The input of the network is a vector of statistical parameters which characterize the pattern shown to the net, and the desired output is the class to which the pattern belongs (supervised learning). The samples chosen for testing the algorithms have been marble slabs of type "Crema Marfil Sierra de la Puerta". The neural network has been implemented using MATLAB.
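
To make the texture descriptor concrete, here is a minimal sketch of sum and difference histograms for a single pixel displacement, written in Python for illustration (the paper itself reports a MATLAB implementation). The function names, the choice of displacement `(dx, dy)` and the three statistics returned are assumptions for this example; the abstract does not say which statistical parameters were fed to the LVQ network.

```python
from collections import Counter


def sum_diff_histograms(image, dx=1, dy=0):
    """Normalized sum and difference histograms for displacement (dx, dy).

    image is a list of rows of integer grey levels.
    """
    rows, cols = len(image), len(image[0])
    sums, diffs = Counter(), Counter()
    n = 0
    for y in range(rows):
        for x in range(cols):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < rows and 0 <= x2 < cols:
                a, b = image[y][x], image[y2][x2]
                sums[a + b] += 1
                diffs[a - b] += 1
                n += 1
    if n == 0:
        raise ValueError("displacement leaves no valid pixel pairs")
    p_sum = {k: v / n for k, v in sums.items()}
    p_diff = {k: v / n for k, v in diffs.items()}
    return p_sum, p_diff


def texture_features(p_sum, p_diff):
    """A few statistics commonly derived from the two histograms."""
    return {
        "mean": 0.5 * sum(i * p for i, p in p_sum.items()),
        "contrast": sum(j * j * p for j, p in p_diff.items()),
        "homogeneity": sum(p / (1 + j * j) for j, p in p_diff.items()),
    }
```

Two one-dimensional histograms stand in for the full two-dimensional co-occurrence matrix, which is where the speed advantage mentioned in the abstract comes from.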

Manuscript from author [PDF]

ES1999-8
Visual-based posture recognition using hybrid neural networks
A. Corradini, H.-J. Boehme, H.-M. Gross

Abstract
This paper describes the preliminary results of the research work currently ongoing at our department and carried out as part of a project funded by the Commission of the European Union. In this paper a novel approach to human posture analysis and recognition using standard image processing techniques as well as hybrid neural information processing is presented. We first develop a reliable and robust person localization module via a combination of oriented filters and three-dimensional dynamic neural fields. Then we focus on the view-based recognition of the user's static gestural instructions from a predefined vocabulary based on both a skin color model and statistical normalized moment invariants. The segmentation of the postures occurs by means of the skin color model based on the Mahalanobis metric. From the resulting binary image containing only regions which have been classified as skin candidates we extract translation and scale invariant moments. They are used as input for two different neural classifiers whose results are then compared. To train and test the neural classifiers we gathered the data from five people performing 18 repetitions of each of five postures (our vocabulary): stop, go left, go right, hello left and hello right. The system is currently under development with constant updates and new developments. It uses input from a color video camera and is user-independent. The aim is to build a real-time system able to deal with dynamic gestures.
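
The skin-colour segmentation step can be pictured with the following sketch. It assumes a two-component colour vector per pixel (for instance a chromaticity pair), a skin model given by a mean and a 2x2 covariance matrix, and a fixed threshold on the squared Mahalanobis distance. The colour space, the model parameters and the function names are all hypothetical, since the abstract gives none of them.

```python
def mahalanobis_sq(pixel, mean, cov):
    """Squared Mahalanobis distance of a 2-D colour vector from the skin model."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if det == 0:
        raise ValueError("covariance matrix is singular")
    dx, dy = pixel[0] - mean[0], pixel[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] / det.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def segment(image, mean, cov, threshold):
    """Binary mask: 1 where a pixel is a skin candidate, 0 elsewhere."""
    return [
        [1 if mahalanobis_sq(px, mean, cov) <= threshold else 0 for px in row]
        for row in image
    ]
```

The resulting mask plays the role of the binary image from which the moment invariants are then extracted.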

Manuscript from author [PDF]

ES1999-36
Model clustering by deterministic annealing
B. Bakker, T. Heskes

Abstract
Although training an ensemble of neural network solutions increases the amount of information obtained from a system, large ensembles may be hard to analyze. Since data clustering is a good method to summarize large bodies of data, we will show in this paper how to use clustering on instances of neural networks. We will describe an algorithm based on deterministic annealing, which is able to cluster various types of data. As an example, we will apply the algorithm to instances of three different types of MLPs, trained to predict the time of death of ovarian cancer patients.

Manuscript from author [PDF]



Special session: Remote sensing spectral image analysis


ES1999-356
The challenges in spectral image analysis: an introduction, and review of ANN approaches
E. Merenyi

Abstract
Utilization of remote sensing multi- and hyperspectral imagery has been rapidly increasing in numerous areas of economic and scientific significance. Hyperspectral sensors, in particular, provide the detailed information that is known from laboratory measurements to characterize and identify minerals, soils, rocks, plants, water bodies, and other surface materials. This opens up tremendous possibilities for resource exploration and management, environmental monitoring, natural hazard prediction, and more. However, exploitation of the wealth of information in spectral images has yet to match up to the sensors' capabilities, as conventional methods often prove inadequate. ANNs hold the promise to revolutionize this area by overcoming many of the mathematical obstacles that traditional techniques fail at. By providing high speed when implemented in parallel hardware, (near-)real time processing of extremely high data volumes, typical in remote sensing spectral imaging, will also be possible.

Manuscript from author [PDF]

ES1999-354
A simple associative neural network for producing spatially homogenous spectral abundance interpretations of hyperspectral imagery
N. Pendock

Abstract
A hyperspectral remotely sensed image may be modeled as a linear mixture of the spectral responses of unknown spectral endmembers. Using the a-priori information that the unknown spectral abundance images should be spatially homogenous, a simple associative neural network may be trained using Hebbian learning to extract spectral endmembers and corresponding abundance images from a hyperspectral image. The technique is applied to an AVIRIS image of Cuprite, Nevada and is compared to an interactive technique for approximating the spectral convex hull of a hyperspectral image that requires a-priori geological knowledge to identify spectral endmembers.

Manuscript from author [PDF]

ES1999-352
Estimating the intrinsic dimensionality of hyperspectral images
J. Bruske, E. Merenyi

Abstract
Estimating the intrinsic dimensionality (ID) of an intrinsically low (d-) dimensional data set embedded in a high (n-) dimensional input space by conventional Principal Component Analysis (PCA) is computationally hard because PCA scales cubically (O(n^3)) with the input dimension [11]. Besides this computational drawback, global PCA will overestimate the ID if the data manifold is curved. In this paper we apply ID_OTPM [1], a new algorithm for ID estimation based on Optimally Topology Preserving Maps [7], to image sequences. In particular, we utilize ID_OTPM for ID estimation of an AVIRIS data set, a hyperspectral satellite image sequence with input dimension n = 257880. Most interestingly, our experiments suggest that the inter-band dimension, db, of the AVIRIS data set is between one and two, whereas the spectral dimension, ds, is about four. These results provide important clues for compression, visualization and classification of the AVIRIS data set.
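
The remark that global PCA overestimates the ID of a curved manifold can be checked on a toy case. The sketch below is not ID_OTPM; it is plain global PCA restricted to two dimensions so that the eigenvalues have a closed form. The function name and the 5% variance threshold are assumptions for this example. A circle is a one-dimensional manifold, yet its variance is spread evenly over both axes, so PCA reports two dimensions, while a straight line is correctly reported as one.

```python
import math


def pca_dimension_2d(points, min_fraction=0.05):
    """Count covariance eigenvalues carrying at least min_fraction of the variance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    a = sum((p[0] - mx) ** 2 for p in points) / n
    c = sum((p[1] - my) ** 2 for p in points) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    total = a + c
    if total == 0:
        return 0  # all points coincide
    # Closed-form eigenvalues of the symmetric matrix [[a, b], [b, c]].
    half_trace = total / 2
    root = math.sqrt(((a - c) / 2) ** 2 + b * b)
    eigenvalues = (half_trace + root, half_trace - root)
    return sum(1 for ev in eigenvalues if ev >= min_fraction * total)


# A curved 1-D manifold (circle) and a flat 1-D manifold (line segment).
circle = [(math.cos(2 * math.pi * k / 200), math.sin(2 * math.pi * k / 200))
          for k in range(200)]
line = [(k / 200, 2 * k / 200) for k in range(200)]

print(pca_dimension_2d(circle), pca_dimension_2d(line))
```

Local methods avoid this by estimating the dimension on small neighbourhoods, where a smooth manifold looks approximately flat.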

Manuscript from author [PDF]

ES1999-351
Benefits and limits of the self-organizing map and its variants in the area of satellite remote sensoring processing
T. Villmann

Abstract
not yet available

Manuscript from author [PDF]

ES1999-353
Comparison of Kohonen, scale-invariant and GTM self-organising maps for interpretation of spectral data
D. MacDonald, S. McGlinchey, J. Kawala, C. Fyfe

Abstract
We investigate the use of artificial neural networks in classifying hyperspectral data. Such data, when collected from remote sensors, provide extremely detailed coverage of e.g. the mineralogical composition of planetary surfaces; however, the volume of data supplied often overwhelms traditional classifiers. When we wish to investigate such data sets in an open-ended manner, the use of unsupervised learning is a prerequisite. A set of remotely sensed spectral images is used to train several different topology-preserving neural networks. In each method, the data is projected onto a two-dimensional grid designed to visualise the data set in a low-dimensional space. Such mappings allow graceful degradation of the classifications given by the mappings, since nearby data points are mapped to the same or similar classifications.

Manuscript from author [PDF]



ANN models and learning I


ES1999-12
AdaBoost and neural networks
T. Windeatt, R. Ghaderi

Abstract
AdaBoost, a recent version of Boosting, is known to improve the performance of decision trees in many classification problems, but in some cases it does not do as well as expected. There are also a few reports of its application to more complex classifiers such as neural networks. In this paper we decompose and modify this algorithm for use with RBF NNs, our methodology being based on the technique of combining multiple classifiers.
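
For readers unfamiliar with the algorithm being modified, the sketch below shows one reweighting round of standard discrete AdaBoost for labels in {-1, +1}. It is the textbook update, not the authors' RBF-specific variant, and the function name is an assumption for this example.

```python
import math


def adaboost_round(weights, labels, predictions):
    """One round of discrete AdaBoost reweighting.

    weights must sum to 1; labels and predictions take values in {-1, +1}.
    Returns the classifier weight alpha and the renormalized sample weights.
    """
    # Weighted error of the current weak classifier.
    err = sum(w for w, y, h in zip(weights, labels, predictions) if y != h)
    if not 0.0 < err < 0.5:
        raise ValueError("weak learner must have weighted error in (0, 0.5)")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified samples (y * h = -1) are up-weighted, the rest down-weighted.
    unnormalized = [w * math.exp(-alpha * y * h)
                    for w, y, h in zip(weights, labels, predictions)]
    z = sum(unnormalized)
    return alpha, [w / z for w in unnormalized]
```

After the update, the misclassified samples together carry half of the total weight, which forces the next weak classifier to concentrate on them.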

Manuscript from author [PDF]

ES1999-22
Modeling face recognition learning in early infant development
F. Acerra, Y. Burnod, S. de Schonen

Abstract
Face recognition development has been studied in experimental psychology, in the first month of life. These studies show that already at the age of 4 months the right hemisphere processes configural information, while the left hemisphere processes what is classically called local information. We have developed a neural model to understand how face recognition learning develops in early infancy. We propose a Bayesian network based on local cellular properties of visual areas and on lateral-feedforward interactions in the cortex. The model reproduces the experimental data of the right hemisphere infant behavior, when tested with faces. We suggest that Bayesian neural networks and the biological properties of cortical areas may be a more general and useful instrument to understand human development.

Manuscript from author [PDF]

ES1999-3
The NeuralBAG algorithm: optimizing generalization performance in bagged neural networks
J. Carney, P. Cunningham

Abstract
In this paper we propose an algorithm we call "NeuralBAG" that estimates the set of weights and number of hidden units each network in a bagged ensemble should have so that the generalization performance of the ensemble is optimized. Experiments performed on noisy synthetic data demonstrate the potential of the algorithm. On average, ensembles trained using NeuralBAG out-perform bagged networks trained using cross-validation by 53% and individual networks trained using "cheating" by 32%.

Manuscript from author [PDF]

ES1999-38
Neuro-wavelet parametric characterization of hardness profiles
V. Colla, L. Reyneri, M. Sgarbi

Abstract
This work compares a few attempts based on Neural and Wavelet networks, for extracting the Jominy hardness profile of steels directly from the chemical composition. In particular, the paper proposes a multi-networks architecture, where a first network is used as a parametric modeler of the Jominy profile itself, while a second one is used as a parameter estimator from the steel chemical composition.

Manuscript from author [PDF]

ES1999-11
Heterogeneity enhanced order in a chaotic neural network
S. Mizutani, K. Shimohara

Abstract
Order of the mean field by heterogeneity is studied in the turbulent phase of a chaotic neural network. Heterogeneity means the distributed randomness of the input in each neuron or the weight in the network. The average power spectrum of the mean field is used to observe the order and to focus on its peak sharpness. The sharpness of the power peak grows remarkably in the turbulent phase, except around the phase, due to the input disorder. One can find the maximum of the power sharpness as the weight disorder increases in the turbulent phase. We suppose that this ordering effect is important for processing information for actual neural networks because of the general existence of such heterogeneity.

Manuscript from author [PDF]

ES1999-20
Tackling the stability/plasticity dilemma with double loop dynamic systems
C. Lecerf

Abstract
Open and organized systems such as living organisms regulate their exchanges in order to maintain adaptation to their environment. When one reduces a biological organism to its central nervous system (CNS), adaptation appears as an exchange of information flow between the CNS and its environment. However, the main mechanism used so far to explain learning is derived from Hebb's hypothesis and relies on structural modifications of the network through changing weights on connections. The double loop concept proposed here is the core of a structural and dynamic model tackling incremental learning in large neural networks. A computer simulation of this concept is briefly described, followed by an equivalent mathematical dynamic system that is related to Thomas' biological feedback theory. Due to the double loop architecture, the observed dynamics show that the model gives a built-in functional answer to the stability/plasticity dilemma.

Manuscript from author [PDF]



Biological models and inspiration


ES1999-6
Regularization in oculomotor adaptation
J. Bullinaria, P. Riddell, S. Rushton

Abstract
The oculomotor system remains plastic so that it can maintain clear single binocular vision during development and also in novel visual conditions (such as wearing new spectacles). It is important to understand this adaptation process so that we can predict in advance potential problems that might arise with new optical devices such as virtual reality head mounted displays. In this paper we present neural network models of adaptation to vertical disparities at different points in the visual field and argue that regularization (weight decay) provides a more realistic account of the empirical data than other approaches.

Manuscript from author [PDF]

ES1999-26
Recurrent V1-V2 interaction for early visual information processing
H. Neumann, W. Sepp

Abstract
A majority of cortical areas are connected via feedforward and feedback fiber projections. The computational role of the descending feedback pathways at different processing stages remains largely unknown. We suggest a new computational model in which normalized activities of orientation selective contrast cells are fed forward to the next higher processing stage. The arrangement of input activation is matched against local patterns of curvature shape to generate activities which are subsequently fed back to the previous stage. Initial measurements that are consistent with the top-down generated context-dependent responses are locally enhanced. In all, we present a computational theory for recurrent processing in visual cortex in which the significance of measurements is evaluated on the basis of priors that are represented as contour code patterns. The model handles a variety of perceptual phenomena, such as bar texture stimuli, illusory contours, and grouping of fragmented shape outlines.

Manuscript from author [PDF]

ES1999-33
Neural field description of state-dependent receptive field changes in the visual cortex
K. Suder, F. Wörgötter, T. Wennekers

Abstract
Receptive fields in V1 have been shown to be wider during synchronized than during non-synchronized EEG states, where, in addition, they can shrink over time in response to flashed stimuli. In the present paper we employ a neural field approach to describe the activity patterns in V1 analytically. Expressions for spatio-temporal receptive fields are derived and fitted to experimental data. The model supports the idea that the observed RF-restructuring is mainly driven by state-dependent LGN firing patterns (burst vs. tonic mode).

Manuscript from author [PDF]

 

 



Special session: Support Vector Machines


ES1999-451
Integrating the evidence framework and the support vector machine
J. Kwok

Abstract
In this paper, we show that training of the support vector machine (SVM) can be interpreted as performing the level 1 inference of MacKay's evidence framework. We further show that levels 2 and 3 can also be applied to SVM. This allows automatic adjustment of the regularization parameter and the kernel parameter. More importantly, it opens up a wealth of Bayesian tools for use with SVM. Performance is evaluated on both synthetic and real-world data sets.

Manuscript from author [PDF]

ES1999-452
Support vector classifier with asymmetric kernel function
K. Tsuda

Abstract
not yet available

Manuscript from author [PDF]

ES1999-453
A multiplicative updating algorithm for training support vector machine
N. Cristianini, C. Campbell, J. Shawe-Taylor

Abstract
Support Vector Machines find maximal margin hyperplanes in a high dimensional feature space, represented as a sparse linear combination of training points. Theoretical results exist which guarantee a high generalization performance when the margin is large or when the representation is very sparse. Multiplicative-Updating algorithms are a new tool for perceptron learning which are guaranteed to converge rapidly when the target concept is sparse. In this paper we present a Multiplicative-Updating algorithm for training Support Vector Machines which combines the generalization power provided by VC theory with the convergence properties of multiplicative algorithms.

Manuscript from author [PDF]

ES1999-455
Face identification using support vector machines
R. Fernandez, E. Viennet

Abstract
The Support Vector Machine (SVM) is a statistical learning technique proposed by Vapnik and his research group [8]. In this paper, we benchmark SVMs on a face identification problem and propose two approaches incorporating SV classifiers. The first approach maps the images into a low-dimensional feature vector via a local Principal Component Analysis (PCA); the feature vectors are then used as the inputs of an SVM. The second algorithm is a direct SV classifier with invariances. Both approaches are tested on the freely available ORL database. The SV classifier with invariances achieves an error of 1.5%, which is the best result known on the ORL database.

Manuscript from author [PDF]

ES1999-456
Statistical mechanics of support vector machine
A. Buhot, M. Gordon

Abstract
We present a theoretical study of the properties of a class of Support Vector Machines within the framework of Statistical Mechanics. We determine their capacity, the margin, the number of support vectors and the distribution of distances of the patterns to the separating hyperplane in feature-space.

Manuscript from author [PDF]

ES1999-459
An efficient formulation of sparsity controlled support vector regression
P. Drezet, R. Harrison

Abstract
Support Vector Regression (SVR) is a kernel-based regression method capable of implementing a variety of regularization techniques. Implementation of SVR usually follows a dual optimization technique which includes Vapnik's ε-insensitive zone. The number of terms in the resulting SVR approximation function is dependent on the size of this zone, but improving sparsity by increasing the size of this zone adversely affects precision. We describe an efficient method of formulating SVR without an ε-insensitive zone, which selects a minimum support set for the terms of the approximator. Sparsity can then be traded for increased training error and/or decreased SV regularisation.
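
The trade-off described here comes from the shape of the ε-insensitive loss used in standard SVR, sketched below. This illustrates the conventional loss only, not the authors' zone-free formulation. The helper `count_outside` is a hypothetical name, and the count it returns is only a rough proxy for the number of support vectors.

```python
def eps_insensitive_loss(target, prediction, eps):
    """Vapnik's loss: zero inside the tube of half-width eps, linear outside."""
    return max(0.0, abs(target - prediction) - eps)


def count_outside(targets, predictions, eps):
    """Number of samples whose residual falls strictly outside the tube."""
    return sum(1 for t, p in zip(targets, predictions) if abs(t - p) > eps)
```

Widening the tube leaves fewer points outside it, which gives a sparser expansion, but every residual smaller than `eps` is then ignored, which is the loss of precision the abstract refers to.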

Manuscript from author [PDF]

ES1999-460
Generalized support vector machines
D. Mattera, F. Palmieri, S. Haykin

Abstract
Most Support Vector (SV) methods proposed in the recent literature can be viewed in a unified framework with great flexibility in terms of the choice of the basis functions. We show that all these problems can be solved within a unique approach if we are equipped with a robust method for finding a sparse solution of a linear system. Moreover, for such a purpose, we propose an iterative algorithm that can be simply implemented. This allows us to generalize the classical SV method to a generic choice of the basis functions.

Manuscript from author [PDF]

ES1999-461
Support vector machines for multi-class pattern recognition
J. Weston, C. Watkins

Abstract
The solution of binary classification problems using support vector machines (SVMs) is well developed, but multi-class problems with more than two classes have typically been solved by combining independently produced binary classifiers. We propose a formulation of the SVM that enables a multi-class pattern recognition problem to be solved in a single optimisation. We also propose a similar generalization of linear programming machines. We report experiments using benchmark datasets in which these two methods achieve a reduction in the number of support vectors and kernel calculations needed.

Manuscript from author [PDF]

ES1999-462
From regression to classification in support vector machines
M. Pontil, R. Rifkin, T. Evgeniou

Abstract
We study the relation between support vector machines (SVMs) for regression (SVMR) and SVMs for classification (SVMC). We show that for a given SVMC solution there exists an SVMR solution which is equivalent for a certain choice of the parameters. In particular, our result is that for epsilon sufficiently close to one, the optimal hyperplane and threshold for the SVMC problem with regularization parameter Cc are equal to 1/(1-epsilon) times the optimal hyperplane and threshold for SVMR with regularization parameter Cr = (1-epsilon) Cc. A direct consequence of this result is that SVMC can be seen as a special case of SVMR.

Manuscript from author [PDF]

 

ES1999-464
From first order logic to N^d: a data driven reformulation
M. Sebag

Abstract
First order logic (FOL) offers a natural way of modeling domains such as chemistry: a molecule is most adequately described as a graph of atoms linked by simple or double bonds. To overcome the specific difficulties of dealing with FOL, this paper presents an automatic mapping from the initial problem domain onto the set of integer vectors N^d, where d is a user-supplied integer. This mapping onto a metric space induces a (semi)-distance on the problem domain. Within supervised learning, the quality of the reformulation can thus be estimated from the predictive accuracy of a k-nearest neighbor classifier based on this distance. The approach is validated on a real-world problem pertaining to organic chemistry: toxicology prediction.

Manuscript from author [PDF]

ES1999-454
Dimensionality reduction by local processing
C. Wöhler, U. Kressel, J. Schürmann, J. Anlauf

Abstract
In this paper we describe a novel approach towards dimensionality reduction of patterns to be classified. It consists of local processing of the patterns as an alternative to the well-known global principal component analysis (PCA) algorithm. We use a feed-forward neural network architecture with spatial or spatio-temporal receptive field connections between the first two layers that yields a transformed feature vector of significantly reduced dimension. We suggest two techniques to adapt the weights of the receptive fields: a local PCA algorithm and training by online gradient descent. Our dimensionality reduction algorithm requires computational costs that are several times smaller compared to the classical PCA approach without losing performance in the subsequent classification process. We apply the algorithm to the problem of handwritten digit recognition as well as to the recognition of pedestrians in image sequences.

Manuscript from author [PDF]

ES1999-457
A kernel based adaline
T. Friess, R. Harrison

Abstract
This new algorithm combines the conceptual simplicity of a least-mean-square algorithm for linear regression with the power of a universal non-linear function approximator. The method is based on a generalisation of the Widrow-Hoff LMS rule using Mercer kernels. Simple examples in curve fitting and non-linear systems identification are solved by the method.
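
As an illustration only (not taken from the paper), the idea of an LMS rule expressed through a Mercer kernel can be sketched as follows. The sketch assumes a Gaussian kernel, a dual model f(x) = sum_i alpha_i k(x_i, x), and a per-sample error-correction step on the coefficient of the presented sample; the function names, kernel width, learning rate and sine-curve data are all hypothetical choices made for this example.

```python
import numpy as np

def gaussian_kernel(a, b, width=0.5):
    """Gaussian (RBF) Mercer kernel between two 1-D sample arrays."""
    diff = a[:, None] - b[None, :]
    return np.exp(-diff**2 / (2.0 * width**2))

def train_kernel_adaline(x, y, eta=0.5, epochs=200, width=0.5):
    """Per-sample LMS-style updates on the dual coefficients alpha.

    Assumed model: f(x) = sum_i alpha_i * k(x_i, x).
    """
    K = gaussian_kernel(x, x, width)
    alpha = np.zeros(len(x))
    for _ in range(epochs):
        for i in range(len(x)):
            error = y[i] - K[i] @ alpha   # Widrow-Hoff style error on sample i
            alpha[i] += eta * error       # correct only that sample's coefficient
    return alpha

def predict(alpha, x_train, x_new, width=0.5):
    """Evaluate the kernel expansion at new inputs."""
    return gaussian_kernel(x_new, x_train, width) @ alpha

# Toy curve-fitting problem: one period of a sine wave.
x = np.linspace(0.0, 2.0 * np.pi, 30)
y = np.sin(x)
alpha = train_kernel_adaline(x, y)
mse = float(np.mean((predict(alpha, x, x) - y) ** 2))
```

The update never forms an explicit feature map; all non-linearity enters through the kernel matrix, which is what lets a linear learning rule act as a non-linear approximator.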

Manuscript from author [PDF]

 

ES1999-458
Data domain description using support vectors
D. Tax, R. Duin

Abstract
This paper introduces a new method for data domain description, inspired by the Support Vector Machine by V. Vapnik, called the Support Vector Domain Description (SVDD). This method computes a sphere-shaped decision boundary with minimal volume around a set of objects. This data description can be used for novelty or outlier detection. It contains support vectors describing the sphere boundary and it has the possibility of obtaining higher order boundary descriptions without much extra computational cost. By using different kernels, this SVDD can obtain more flexible and more accurate data descriptions. The error of the first kind, the fraction of the training objects which will be rejected, can be estimated immediately from the description.

Manuscript from author [PDF]

ES1999-463
Support vector machines vs multi-layer perceptrons in particle identification
N. Barabino, M. Pallavicini, A. Petrolini, M. Pontil, A. Verri

Abstract
In this paper we evaluate the performance of Support Vector Machines (SVMs) and Multi-Layer Perceptrons (MLPs) on two different problems of Particle Identification in High Energy Physics experiments. The obtained results indicate that SVMs and MLPs tend to perform very similarly.

Manuscript from author [PDF]



ANN models and learning II


ES1999-34
Specialization with cortical models: An application to causality learning
H. Frezza-Buet, F. Alexandre

Abstract
In this paper we present the principle of learning by specialization within a cortically-inspired framework. Specialization of neurons in the cortex has been observed, and many models are using such "cortical-like" learning mechanisms, adapted for computational efficiency. Adaptations will be discussed, in light of experiments with our cortical model addressing causality learning from perceptive sequences.

Manuscript from author [PDF]

ES1999-23
Generalisation capabilities of a distributed neural classifier
A. Ribert, A. Ennaji, Y. Lecourtier

Abstract
This article describes a new approach to the automated construction of a distributed neural classifier. The methodology is based upon supervised hierarchical clustering which enables one to determine reliable regions in the representation space. The proposed methodology proceeds by associating each of these regions with a Multi-Layer Perceptron (MLP). Each MLP has to recognise elements inside its region, while rejecting all others. Experimental results for a real problem (handwritten digit recognition) reveal an interesting generalisation behaviour of the distributed classifier in comparison to the k-nearest neighbour algorithm as well as a single MLP.

Manuscript from author [PDF]

ES1999-42
A comparison of three PCA neural techniques
S. Fiori, F. Piazza

Abstract
We present a comparison of three neural PCA techniques: the GHA by Sanger, the APEX by Kung and Diamantaras, and the psi-APEX first proposed by the present authors. Through numerical simulations and computational complexity evaluations we show that the psi-APEX algorithms exhibit superior capability and interesting features.
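
For readers unfamiliar with neural PCA, the first of the three techniques, Sanger's Generalized Hebbian Algorithm (GHA), can be sketched in a few lines. This is a minimal illustration rather than the authors' experimental setup: the learning rate, the synthetic 2-D data and the function name `gha_fit` are assumptions made for the example, and APEX and psi-APEX are not shown.

```python
import numpy as np

def gha_fit(X, n_components, eta=0.005, seed=0):
    """One pass of Sanger's rule over the rows of X (assumed zero-mean)."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((n_components, X.shape[1]))
    for x in X:
        y = W @ x
        # Hebbian term minus a lower-triangular deflation term, so that
        # row k learns the k-th principal direction.
        W += eta * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W

# Synthetic data: standard deviation 2.0 along one axis, 0.5 along the
# other, then rotated by 30 degrees.
rng = np.random.default_rng(1)
latent = rng.standard_normal((20000, 2)) * np.array([2.0, 0.5])
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
X = latent @ R.T

W = gha_fit(X, n_components=2)
top_direction = R[:, 0]  # true first principal direction of the data
alignment = abs(W[0] @ top_direction) / np.linalg.norm(W[0])
```

Here `alignment` is the absolute cosine between the first learned weight vector and the true leading eigenvector, so values near 1 indicate that the first neuron has found the principal direction.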

Manuscript from author [PDF]

 

ES1999-40
Neural networks which identify composite factors
D. MacDonald, D. Charles, C. Fyfe

Abstract
We investigate the use of an artificial neural network to form a sparse distributed representation of the underlying factors in data sets. We extend the previously proposed [1] network so that it may identify composite causes in data sets by creating a hierarchical network. We use the network as a means of identifying individual faces when the network is trained on a mixture of faces and show both analytically and through experiments how noise allows us to find precisely the factors without prior assumptions of the number of factors.

Manuscript from author [PDF]

ES1999-2
Supervised ART-II: a new neural network architecture, with quicker learning algorithm, for learning and classifying multivalued input patterns
K. R. Al-Rawi, C. Gonzalo, A. Arquero

Abstract
A new artificial neural network (ANN) architecture for learning and classifying multivalued input patterns has been introduced, called Supervised ART-II. It represents a new supervision approach for ART modules. It is quicker in learning than Supervised ART-I when the number of category nodes is large, and it requires less memory. The architecture, learning, and testing of the newly developed ANN have been discussed.

Manuscript from author [PDF]

 



Classification


ES1999-4
Feature binding and relaxation labeling with the competitive layer model
H. Wersing, H. Ritter

Abstract
We discuss the relation of the Competitive Layer Model (CLM) to Relaxation Labeling (RL) with regard to feature binding and labeling problems. The CLM uses cooperative and competitive interactions to partition a set of input features into groups by energy minimization. As we show, the stable attractors of the CLM provide consistent and unambiguous labelings in the sense of RL and we give an efficient stochastic simulation procedure for their identification. In addition to binding, the CLM exhibits contextual activity modulation to represent stimulus salience. We incorporate deterministic annealing for avoidance of local minima and show how figure-ground segmentation and grouping can be combined for the CLM application of contour grouping on a real image.

Manuscript from author [PDF]

ES1999-16
Segmentation-free detection of overtaking vehicles with a two-stage time-delay neural network classifier
C. Wöhler, J. Schürmann, J. Anlauf

Abstract
We propose an algorithm based on a time delay neural network (TDNN) with spatio-temporal receptive fields for segmentation-free detection of overtaking vehicles on motorways. Our algorithm transforms the detection problem into a classification problem of strongly downscaled image sequences which serve as an input to the TDNN without a preliminary segmentation step. The TDNN classifier is followed by an additional classification stage to evaluate the TDNN output over time, which achieves a significant enhancement of the detection performance especially under difficult visibility conditions.

Manuscript from author [PDF]

ES1999-18
An integer recurrent artificial neural network for classifying feature vectors
R. K. Brouwer

Abstract
The main contribution of this report is the development of an integer recurrent artificial neural network (IRANN) for classification of feature vectors. The network consists both of threshold units or perceptrons and of counters, which are non-threshold units with binary input and integer output. Input and output of the network consist of vectors of natural numbers. For classification, representatives of sets are stored by calculating a connection matrix such that all the elements in a training set are attracted to members of the same training set. The class of its attractor then classifies an arbitrary element if the attractor is a member of one of the original training sets. The network is successfully applied to the classification of sugar diabetes data and credit application data.

Manuscript from author [PDF]

 

ES1999-24
Feature selection for ANNs using genetic algorithms in condition monitoring
L. Jack, A. Nandi

Abstract
Artificial Neural Networks (ANNs) can be used successfully to detect faults in rotating machinery, using statistical estimates of the vibration signal as input features. One of the main problems facing the use of ANNs is the selection of the best inputs to the ANN, allowing the creation of compact, highly accurate networks that require comparatively little preprocessing. This paper examines the use of a Genetic Algorithm (GA) to select the most significant input features from a large set of possible features in machine condition monitoring contexts. Using a large set of 156 different features, the GA is able to select a set of 6 features that give 100% recognition accuracy.
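
The general mechanism of GA-based feature selection can be sketched as below. This is a generic illustration, not the authors' implementation: in the paper the fitness would come from the accuracy of a trained ANN, whereas here a hypothetical `toy_fitness` stands in for it, and the population size, mutation rate, one-point crossover and elitist selection scheme are all assumptions.

```python
import random

def ga_select(fitness, n_features, pop_size=20, generations=30,
              p_mut=0.05, seed=0):
    """Evolve boolean feature masks; returns the best mask and a fitness history."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))       # best fitness this generation
        elite = pop[: pop_size // 2]          # elitism: keep the better half
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # flip each bit with probability p_mut
            child = [(not g) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = elite + children
    return max(pop, key=fitness), history

# Hypothetical stand-in for "accuracy of an ANN trained on these features":
# reward three informative features, penalise large subsets.
INFORMATIVE = {2, 5, 7}

def toy_fitness(mask):
    chosen = {i for i, keep in enumerate(mask) if keep}
    return len(chosen & INFORMATIVE) - 0.1 * len(chosen)

best_mask, history = ga_select(toy_fitness, n_features=12)
```

Because the better half of each generation is carried over unchanged, the best fitness recorded in `history` can never decrease from one generation to the next.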

Manuscript from author [PDF]

 



Special session: Information extraction using unsupervised neural networks


ES1999-208
Trends in Unsupervised Learning
C. Fyfe

Abstract
We review the trends in unsupervised learning towards the search for (in)dependence rather than (de)correlation, towards the use of global objective functions, towards a balancing of cooperation and competition and towards probabilistic, particularly Bayesian methods.

Manuscript from author [PDF]

ES1999-202
Detection of two Gaussian clusters
A. Buhot, M. Gordon

Abstract
We discuss the detection of two Gaussian clusters given a cloud of points. The optimal learning curve for this unsupervised learning scenario is determined with a replica calculation. A comparison with principal component analysis and supervised learning allows us to understand the three different learning phases observed.

Manuscript from author [PDF]

ES1999-205
Independent component analysis for mixture densities
F. Palmieri, A. Budillon, D. Mattera

Abstract
Independent component analysis (ICA), formulated as a density estimation problem, is extended to a mixture density model. A number of ICA blocks, associated with implicit equivalent classes, are updated in turn on the basis of the estimated density they represent. The approach is equivalent to the EM algorithm and allows an easy non-linear extension of all the current ICA algorithms. We also show a preliminary test on bi-dimensional synthetic data drawn from a mixture model.

Manuscript from author [PDF]

ES1999-206
Extraction of intrinsic dimension using CCA - Application to blind sources separation
N. Donckers, A. Lendasse, V. Wertz, M. Verleysen

Abstract
A general-purpose useful parameter in data analysis is the intrinsic dimension of a data set, corresponding to the minimum number of variables necessary to describe the data without significant loss of information. The knowledge of this dimension also facilitates most non-linear projection methods. We will show that the intrinsic dimension of a data set can be efficiently estimated using Curvilinear Component Analysis; we will also show that the method can be applied to the Blind Source Separation problem to estimate the number of sources in a mixing.

Manuscript from author [PDF]

ES1999-207
Noise to extract independent causes
D. Charles, C. Fyfe

Abstract
Noisy threshold activation functions are used to force sparse responses on the output neurons of an unsupervised neural network enabling the network to identify the underlying independent factors of visual data. The addition of noise into the network enables us to control the response of the network to the data so we can force only as many outputs to respond to the data as there are significant factors in the data. Noise is also used to modularise the response of the network so that factors with temporal correlation may be coded in the same module of the output space.

Manuscript from author [PDF]

ES1999-201
Information retrieval systems using an associative conceptual space
J. van den Berg, M. Schuemie

Abstract
After a review of 'intelligent' information retrieval systems, we propose an AI-based retrieval system inspired by the WEBSOM algorithm. Contrary to that approach, however, we introduce a system using only the index of every document. The knowledge extraction process results in a so-called Associative Conceptual Space, where the 'concepts', as found in the documents, are clustered using a Hebbian type of (un)learning. Then, each document is characterised by comparing the concepts found in it to those present in the concept space. Applying the characterisations, all documents can be clustered such that semantically similar documents lie close together on a Self-Organising Map.

Manuscript from author [PDF]

ES1999-203
Taking inspiration from the Hippocampus can help solving robotics problems
A. Revel, P. Gaussier, J.P. Banquet

Abstract
not yet available

Manuscript from author [PDF]

 



ANN models and learning III


 

ES1999-37
Orthogonal least square algorithm applied to the initialization of multi-layer perceptrons
V. Colla, L. Reyneri, M. Sgarbi

Abstract
An efficient procedure is proposed for initializing two-layer perceptrons and for determining the optimal number of hidden neurons. This is based on the Orthogonal Least Squares method, which is typical of RBF as well as Wavelet networks. Some experiments are discussed, in which the proposed method is coupled with standard backpropagation training and compared with random initialization.

Manuscript from author [PDF]

ES1999-17
Maximisation of stability ranges for recurrent neural networks subject to on-line adaptation
J. Steil, H. Ritter

Abstract
We present conditions for absolute stability of recurrent neural networks with time-varying weights based on the Popov theorem from non-linear feedback system theory. We show how to maximise the stability bounds by deriving a convex optimisation problem subject to linear matrix inequality constraints, which can efficiently be solved by interior point methods with standard software.

Manuscript from author [PDF]

ES1999-303
Encoding of sequential translators in discrete-time recurrent neural nets
R.P. Neco, M.L. Forcada, R.C. Carrasco, M.A. Valdez-Munoz

Abstract
In recent years, there has been a lot of interest in the use of discrete-time recurrent neural nets (DTRNN) to learn finite-state tasks, and in the computational power of DTRNN, particularly in connection with finite-state computation. This paper describes a simple strategy to devise stable encodings of sequential finite-state translators (SFST) in a second-order DTRNN with units having bounded, strictly growing, continuous sigmoid activation functions. The strategy relies on bounding criteria based on a study of the conditions under which the DTRNN is actually behaving as a SFST.

Manuscript from author [PDF]

ES1999-29
On the invertibility of the RBF model in a predictive control strategy
A. Fache, O. Dubois, A. Billat

Abstract
This paper describes the importance of the RBF model quality in a model-based predictive control scheme. We show that a good neuronal approximator does not necessarily correctly model the intrinsic behaviour of the identified system. We have used a simulated example to show the harmful effects of a particular type of incorrect behaviour, the non-invertibility of the model relative to the control input. Lastly, we propose a derived RBF model that is slightly more complex, but which is systematically invertible.

Manuscript from author [PDF]

ES1999-46
Nonlinear factorization in sparsely encoded Hopfield-like neural networks
A.M. Sirota, A.A. Frolov, D. Husek

Abstract
The problem of binary factorization of complex patterns in recurrent Hopfield-like neural networks was studied both theoretically and by means of computer simulation. The number and sparseness of factors mixed in patterns crucially determine the ability of an autoassociator to perform a factorization. Based on experimental data on memory and learning, one may suggest that there exists a neural system of intermediate storage of information, which fulfills the function of binary factorization of the incoming polysensory information for its further effective storage in the form of elementary associatively bound factors. We suppose that field CA3 of the hippocampus, possessing all properties of the autoassociative memory, performs such a function. This functional idea could be fruitfully applied to various memory related tasks (e.g. spatial navigation) and lead to some critical experiments.

Manuscript from author [PDF]

ES1999-21
Storage capacity and dynamics of nonmonotonic networks
B. Crespi, I. Lazzizzera

Abstract
This work investigates the retrieval capacities of different types of nonmonotonic neurons. Storage capacity is maximized when the neuron response is a function with well defined geometrical characteristics. Numerical experiments demonstrate that storage capacity is directly related to the dynamical property of the iterative map that describes the network evolution. Maximum capacity is reached when the neuron dynamics are subdivided into two non-overlapping "erratic bands" around points xi = +/- 1.

Manuscript from author [PDF]

ES1999-30
A general approach to construct RBF net-based classifier
F. Belloir, A. Fache, A. Billat

Abstract
This paper describes a global approach to the construction of a Radial Basis Function (RBF) neural net classifier. We used a new simple algorithm to completely define the structure of the RBF classifier. This algorithm has the major advantage of requiring only the training set (no step learning, threshold or other parameters as in other methods). Tests on several benchmark datasets showed that, despite its simplicity, this algorithm provides a robust and efficient classifier. The results of this built RBF classifier are compared to those obtained with three other classifiers: a classic one and two neural ones. The robustness and efficiency of this kind of RBF classifier make the proposed algorithm very attractive.

Manuscript from author [PDF]

ES1999-32
Hidden Markov gating for prediction of change points in switching dynamical systems
S. Liehr, K. Pawelzik, J. Kohlmorgen, S. Lemm, K.-R. Müller

Abstract
The prediction of switching dynamical systems requires an identification of each individual dynamics and an early detection of mode changes. Here we present a unified framework of a mixture-of-experts architecture and a generalized hidden Markov model (HMM) with a state space dependent transition matrix. The specialization of the experts in the dynamical regimes and the adaptation of the switching probabilities are performed simultaneously during the training procedure. We show that our method allows for a fast online detection of mode changes in cases where the most recent input data together with the last dynamical mode contain sufficient information to indicate a dynamical change.

Manuscript from author [PDF]

ES1999-31
Critical and non-critical avalanche behavior in networks of integrate-and-fire neurons
C. Eurich, T. Conradi, H. Schwegler

Abstract
We study avalanches of spike activity in fully connected networks of integrate-and-fire neurons which receive purely random input. In contrast to the self-organized critical avalanche behavior in sandpile models, critical and non-critical behavior is found depending on the interaction strength. Avalanche behavior can be readily understood by using combinatorial arguments in phase space.
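
A toy simulation in the spirit of this setup may help make the notion of an avalanche concrete. It is only a sketch under stated assumptions, not the model analysed in the paper: the threshold of 1, the fixed random input increment `drive`, the per-spike kick of `coupling / n`, the subtract-threshold reset and the rule that a neuron fires at most once per avalanche are all choices made for this example.

```python
import random

def simulate_avalanches(n=50, coupling=0.5, drive=0.02, steps=20000, seed=0):
    """Return the sizes of spike avalanches in a fully connected toy network."""
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]   # membrane potentials in [0, 1)
    sizes = []
    for _ in range(steps):
        i = rng.randrange(n)               # purely random external input
        u[i] += drive
        if u[i] < 1.0:
            continue
        fired = set()
        frontier = [i]                     # neurons firing in this wave
        while frontier:
            for j in frontier:
                fired.add(j)
                u[j] -= 1.0                # reset by subtracting the threshold
            kick = coupling * len(frontier) / n
            frontier = []
            for k in range(n):
                if k not in fired:         # fired neurons are refractory
                    u[k] += kick
                    if u[k] >= 1.0:
                        frontier.append(k)
        sizes.append(len(fired))           # avalanche size = neurons that fired
    return sizes

sizes = simulate_avalanches()
```

Histogramming `sizes` for several values of `coupling` is one way to explore how the avalanche size distribution changes with interaction strength in this toy setting.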

Manuscript from author [PDF]



Special session: Spiking neurons


ES1999-252
Fast analog computation in networks of spiking neurons using unreliable synapses
T. Natschläger, W. Maass

Abstract
We investigate through theoretical analysis and computer simulations the consequences of unreliable synapses for fast analog computations in networks of spiking neurons, with analog variables encoded by the firing activities of pools of spiking neurons. Our results suggest that the known unreliability of synaptic transmission may be viewed as a useful tool for analog computing, rather than as a "bug" in neuronal hardware. We also investigate computations on analog time series encoded by the firing activities of pools of spiking neurons.

Manuscript from author [PDF]

ES1999-253
Learning a temporal code
P. Häfliger

Abstract
The paper proposes a concrete information encoding for networks of spiking neurons. A temporal code is presented in which neurons respond to simultaneous spike releases of a particular group of neurons. The paper puts a spike-based learning rule in the context of that coding and shows how a network adapts to events experienced while observing an environment. Furthermore, correlations between events distant in time can be learnt. To demonstrate this, a net is simulated, the neurons of which become selective to moving bar stimuli after repeated presentations of samples.

Manuscript from author [PDF]

ES1999-251
VC dimension bounds for networks of spiking neurons
M. Schmitt

Abstract
We calculate bounds on the VC dimension and pseudo dimension for networks of spiking neurons. The connections between network nodes are parameterized by transmission delays and synaptic weights. We provide bounds in terms of network depth and number of connections that are almost linear. For networks with few layers this yields better bounds than previously established results for networks of unrestricted depth.

Manuscript from author [PDF]

ES1999-254
What does a neuron talk about ?
S. Wilke, C. Eurich

Abstract
We study the coding accuracy of a population of stochastically spiking neurons that respond to different features of a stimulus. By using Fisher information as a measure of the encoding error, it can be shown that narrow tuning functions in one of the encoded dimensions increase the coding accuracy for this dimension as long as the active sub-population is large enough. This can be achieved by neurons that are broadly tuned in the other dimensions. If one or more stimulus features encoded by the neural population are unknown, the relative widths of the tuning curves in the remaining dimensions are a measure of the corresponding relative accuracies. This feature allows a quantitative description of the kind of information conveyed by the neural population.
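
The role of tuning width can be illustrated with a one-dimensional sketch. It assumes independent Poisson spike counts and Gaussian tuning curves f_i(x), for which the Fisher information is the standard sum over neurons of f_i'(x)^2 / f_i(x) times the counting window; the paper's multi-dimensional setting is not reproduced, and the peak rate, window and neuron positions are hypothetical.

```python
import numpy as np

def fisher_information(x, centers, sigma, peak_rate=10.0, window=1.0):
    """Fisher information about stimulus x for independent Poisson neurons
    with Gaussian tuning curves centred at `centers` (assumed model)."""
    f = peak_rate * np.exp(-(x - centers) ** 2 / (2.0 * sigma**2))
    # For Gaussian tuning, f'(x) = f * (c - x) / sigma^2, so
    # f'^2 / f = f * ((c - x) / sigma^2)^2, which avoids dividing by tiny rates.
    return float(window * np.sum(f * ((centers - x) / sigma**2) ** 2))

# Dense population: many neurons cover the stimulus range.
dense = np.linspace(-3.0, 3.0, 121)
dense_narrow = fisher_information(0.0, dense, sigma=0.2)
dense_broad = fisher_information(0.0, dense, sigma=0.5)

# Sparse population: only two neurons, both far from the stimulus.
sparse = np.array([-1.0, 1.0])
sparse_narrow = fisher_information(0.0, sparse, sigma=0.2)
sparse_broad = fisher_information(0.0, sparse, sigma=0.5)
```

In the dense case the narrowly tuned population carries more Fisher information, while in the sparse case narrow tuning leaves almost no neuron active near the stimulus and the broad population does better, which mirrors the abstract's condition that the active sub-population be large enough.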

Manuscript from author [PDF]



Temporal series


 

ES1999-401
Development of a French speech recognizer using a hybrid HMM/MLP system
J.-M. Boite, C. Ris

Abstract
In this paper we describe the development of a French speech recognizer, and the experiments we carried out on our hybrid HMM/ANN system which combines Artificial Neural Networks (ANN) and Hidden Markov Models (HMMs). A phone recognition experiment with our baseline system achieved a phone accuracy of about 75% which is very similar to the best results reported in the literature [1]. Preliminary experiments on continuous speech recognition have set a baseline performance for our hybrid HMM/ANN system on BREF using lexicons of different sizes. All the experiments were carried out with the STRUT (Speech Training and Recognition Unified Toolkit) software [2] and the NOWAY large vocabulary decoder [3].

Manuscript from author [PDF]

ES1999-402
A hybrid system for fraud detection in mobile communications
Y. Moreau, E. Lerouge, H. Verrelst, J. Vandewalle, C. Störmann, P. Burge

Abstract
During the course of the European project "Advanced Security for Personal Communication Technologies" (ASPeCT), we have developed some rule-based and neural network architectures as a number of different fraud detection tools for GSM networks. We have now integrated these different techniques into a hybrid detection tool. We optimized the performance of the hybrid system in terms of the number of subscribers raising alarms. More precisely, we optimized performance curves showing the trade-off between the percentage of correctly identified fraudsters versus the percentage of new subscribers raising alarms. We report here on a common suite of experiments we performed on these different systems.

Manuscript from author [PDF]

ES1999-35
Hybrid HMM/MLP models for time series prediction
J. Rynkiewicz

Abstract
We present a hybrid model consisting of a hidden Markov chain and MLPs to model piecewise stationary series. We compare our results with the model of gating networks (A.S. Weigend et al. [6]) and we show that, at least on the classical laser time series, our model is more parsimonious and gives a better segmentation of the series.

Manuscript from author [PDF]
