ICPRAM 2014 Abstracts


Area 1 - Theory and Methods

Full Papers
Paper Nr: 16
Title:

Multiple Segmentation of Image Stacks

Authors:

Jonathan Smets and Manfred Jaeger

Abstract: We propose a method for the simultaneous construction of multiple image segmentations by combining a recently proposed “convolution of mixtures of Gaussians” model with a multi-layer hidden Markov random field structure. The resulting method constructs several alternative segmentations of a single image that capture different structural elements of the image. We also apply the method to collections of images with identical pixel dimensions, which we call image stacks. Here it turns out that the method is able both to identify groups of similar images in the stack and to provide segmentations that represent the main structures in each group.

Paper Nr: 35
Title:

Measuring Cluster Similarity by the Travel Time between Data Points

Authors:

Yonggang Lu, Xiaoli Hou and Xurong Chen

Abstract: A new similarity measure for hierarchical clustering is proposed. The idea is to treat all the data points as mass points in a hypothetical gravitational force field and to derive the hierarchical clustering results by estimating the travel time between data points. The shorter the time needed to travel from one point to another, the more similar the two data points are. In order to avoid the complexity of a molecular dynamics simulation, the potential field produced by all the data points is computed; the travel time between a pair of data points is then estimated from this potential field. In our method, the travel time is used to construct a new similarity measure, and an edge-weighted tree of all the data points is built to improve the efficiency of the hierarchical clustering. The proposed method, called Travel-Time based Hierarchical Clustering (TTHC), is evaluated by comparison with four other hierarchical clustering methods. Two real datasets and two synthetic dataset families composed of 200 randomly generated datasets are used in our experiments. The results show that TTHC produces very competitive results, and that using the estimated travel time instead of the distance between data points improves both the robustness and the quality of the clustering.
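The travel-time idea can be sketched loosely as follows. This is not the paper's estimator: the Gaussian potential wells and the way the potential inflates the distance are our own simplifications, chosen only to illustrate how a potential field can turn distances into travel-time-like dissimilarities.

```python
import numpy as np

def potential(points, sigma=1.0):
    # Hypothetical potential field: each data point contributes a
    # Gaussian well, so dense regions sit at lower (deeper) potential.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    K = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(K, 0.0)
    return -K.sum(axis=1)

def travel_time_proxy(points, sigma=1.0):
    # Crude proxy for a travel time: Euclidean distance inflated where
    # the mean potential of the pair is shallow, i.e. where the pair is
    # not embedded in a dense region.
    phi = potential(points, sigma)
    d = np.sqrt(((points[:, None, :] - points[None, :, :]) ** 2).sum(-1))
    barrier = (phi[:, None] + phi[None, :]) / 2.0
    return d * np.exp(barrier - barrier.min())
```

The resulting symmetric matrix can be fed as a dissimilarity into any agglomerative clustering routine in place of plain Euclidean distance.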

Paper Nr: 43
Title:

Affine Invariant Shape Matching using Histogram of Radon Transform and Angle Correlation Matrix

Authors:

Makoto Hasegawa and Salvatore Tabbone

Abstract: An affine invariant shape matching descriptor using the histogram of Radon transform (HRT) and the dynamic time warping (DTW) distance is proposed. Our descriptor, based on the Radon transform, is robust to shape rotation, uniform scaling, and translation. Under non-uniform scaling and shearing, the descriptor suffers a non-linear sparse-and-dense distortion along the angle coordinate; we therefore apply DTW on a cost matrix to be robust to these transformations. This cost matrix is defined as an angle correlation matrix computed as the product of only two matrices. Moreover, we use the beam search algorithm to reduce the time complexity of our method. Experimental results show that our approach is fast to compute and competitive with well-known descriptors.
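The DTW distance at the heart of the matcher is standard dynamic programming. A minimal scalar-sequence sketch is below; the paper applies the same recurrence to an angle correlation cost matrix and additionally prunes the table with beam search, neither of which is reproduced here.

```python
import math

def dtw(a, b):
    # Classic dynamic-time-warping distance between two sequences.
    # D[i][j] holds the cost of the best alignment of a[:i] with b[:j].
    n, m = len(a), len(b)
    D = [[math.inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # A step may repeat an element of either sequence (warping).
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

Because the warping path may dwell on an element, locally stretched sequences still match at zero cost, which is exactly the robustness to sparse/dense angle distortion the abstract appeals to.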

Paper Nr: 45
Title:

Unsupervised Consensus Functions Applied to Ensemble Biclustering

Authors:

Blaise Hanczar and Mohamed Nadif

Abstract: Ensemble methods are very popular and can significantly improve the performance of classification and clustering algorithms. Their principle is to generate a set of different models and then aggregate them into a single one. Recent work has shown that this approach can also be useful for biclustering problems. The crucial step of this approach is the consensus function that computes the aggregation of the biclusters. We identify the main consensus functions commonly used in clustering ensembles and show how to extend them to the biclustering context. We evaluate and analyze the performance of these consensus functions in several experiments based on both artificial and real data.

Paper Nr: 47
Title:

Modified Fuzzy C-Means as a Stereo Segmentation Method

Authors:

Michal Krumnikl, Eduard Sojka and Jan Gaura

Abstract: This paper presents an extension of the popular fuzzy c-means clustering method that introduces an additional disparity cue. The creation of the clusters is driven by the degree of stereo match, so the method is able to separate objects based on both their colour and their spatial depth. In contrast to other popular approaches, the clustering is not performed on the individual input images but on the stereo pair, taking the matching properties into account. The algorithm produces segmentations as well as disparity maps. The results show that the proposed method can improve the segmentation, provided a stereo image pair of the segmented scene is available.
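A plain fuzzy c-means loop shows where a disparity cue could enter: one simple (and much weaker) stand-in for the paper's stereo-driven formulation is to append disparity as an extra channel of each pixel's feature vector. The deterministic initialization below is our own choice, not the paper's.

```python
import numpy as np

def fuzzy_cmeans(features, c=2, m=2.0, iters=50):
    # Standard fuzzy c-means. `features` is (n_pixels, n_channels);
    # colour channels and a disparity channel can be stacked per pixel.
    n = features.shape[0]
    # Deterministic init: spread initial centres across the data order.
    centers = features[np.linspace(0, n - 1, c).astype(int)].astype(float)
    U = np.zeros((c, n))
    for _ in range(iters):
        d = np.linalg.norm(features[None, :, :] - centers[:, None, :],
                           axis=2) + 1e-12
        # Membership update: inverse-distance weighting with fuzzifier m.
        U = 1.0 / d ** (2.0 / (m - 1.0))
        U /= U.sum(axis=0)
        # Centre update: membership-weighted means.
        Um = U ** m
        centers = (Um @ features) / Um.sum(axis=1, keepdims=True)
    return centers, U
```

Hard labels are obtained by `U.argmax(axis=0)`; the relative scaling of the disparity channel against the colour channels would act as the knob balancing depth against appearance.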

Paper Nr: 54
Title:

On Selecting Helpful Unlabeled Data for Improving Semi-Supervised Support Vector Machines

Authors:

Thanh-Binh Le and Sang-Woon Kim

Abstract: Recent studies have demonstrated that Semi-Supervised Learning (SSL) approaches that use both labeled and unlabeled data are more effective and robust than those that use only labeled data. However, it is also well known that using unlabeled data is not always helpful in SSL algorithms. Thus, in order to select a small amount of helpful unlabeled samples, various selection criteria have been proposed in the literature. One criterion is based on the prediction of an ensemble classifier and the similarity between pairwise training samples. However, because this criterion is only concerned with the distance information among the samples, it sometimes does not work appropriately, particularly when the unlabeled samples are near the boundary. In order to address this concern, a method of training semi-supervised support vector machines (S3VMs) using a selection criterion is investigated; this method is a modified version of that used in SemiBoost. In addition to the quantities of the original criterion, the confidence values of the unlabeled data are first computed using the estimated conditional class probability. Then, unlabeled samples with higher confidence are selected and, together with the labeled data, used for retraining the ensemble classifier. The experimental results, obtained using artificial and real-life benchmark datasets, demonstrate that the proposed mechanism can compensate for the shortcomings of traditional S3VMs and, compared with previous approaches, achieves further improved classification accuracy.

Paper Nr: 57
Title:

SCHOG Feature for Pedestrian Detection

Authors:

Ryuichi Ozaki and Kazunori Onoguchi

Abstract: Co-occurrence Histograms of Oriented Gradients (CoHOG) has succeeded in describing the detailed shape of an object by using the co-occurrence of features. However, unlike HOG, it does not consider the difference in gradient magnitude between the foreground and the background. In addition, the dimension of the CoHOG feature is very large. In this paper, we propose Similarity Co-occurrence Histograms of Oriented Gradients (SCHOG), which considers both the similarity and the co-occurrence of features. Unlike CoHOG, which quantizes the edge gradient direction into eight directions, SCHOG quantizes it into four. The feature dimension for the co-occurrence between edge gradient directions therefore decreases greatly. In addition to the co-occurrence between edge gradient directions, a binary code representing the similarity between features is introduced. In this paper, we use the pixel intensity, the edge gradient magnitude and the edge gradient direction as similarity measures. In spite of reducing the resolution of the edge gradient direction, SCHOG achieves higher performance and lower dimension than CoHOG by adding this similarity. We focus on pedestrian detection in this paper; however, the method is also applicable to other object recognition tasks by introducing other kinds of similarity. In experiments on the INRIA Person Dataset, SCHOG is evaluated in comparison with the conventional CoHOG.
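The core bookkeeping behind such descriptors, quantizing gradient directions and counting direction pairs at a fixed pixel offset, can be sketched as below. Four direction bins mirror SCHOG's choice; the similarity binary code and the multi-offset, multi-block layout of the real descriptor are omitted, and the non-negative-offset restriction is our simplification.

```python
import numpy as np

def quantize_directions(img, bins=4):
    # Gradient direction per pixel, folded to [0, pi) and quantized.
    gy, gx = np.gradient(img.astype(float))
    ang = np.arctan2(gy, gx) % np.pi
    return (ang / (np.pi / bins)).astype(int) % bins

def cooccurrence(dirs, offset=(0, 1), bins=4):
    # Count (direction at p, direction at p + offset) pairs over the
    # overlap region. Offsets are assumed non-negative in this sketch.
    dy, dx = offset
    a = dirs[:dirs.shape[0] - dy, :dirs.shape[1] - dx]
    b = dirs[dy:, dx:]
    h = np.zeros((bins, bins), dtype=int)
    np.add.at(h, (a.ravel(), b.ravel()), 1)
    return h
```

With 4 bins one offset costs a 4×4 histogram instead of CoHOG's 8×8, which is the dimension reduction the abstract refers to.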

Paper Nr: 59
Title:

Discriminative Prior Bias Learning for Pattern Classification

Authors:

Takumi Kobayashi and Kenji Nishida

Abstract: Prior information has been effectively exploited mainly using probabilistic models. In this paper, by focusing on the bias embedded in the classifier, we propose a novel method to discriminatively learn the prior bias based on the extra prior information assigned to the samples other than the class category, e.g., the 2-D position where the local image feature is extracted. The proposed method is formulated in the maximum margin framework to adaptively optimize the biases, improving the classification performance. We also present a computationally efficient optimization approach that makes the method even faster than a standard SVM of the same size. The experimental results on patch labeling in on-board camera images demonstrate the favorable performance of the proposed method in terms of both classification accuracy and computation time.

Paper Nr: 64
Title:

Minutiae Persistence among Multiple Samples of the Same Person’s Fingerprint in a Cooperative User Scenario

Authors:

Vedrana Krivokuća, Waleed Abdulla and Akshya Swain

Abstract: A significant challenge in the development of automated fingerprint recognition algorithms is dealing with missing minutiae. While it is generally assumed that some minutiae will always be missing between multiple samples of the same fingerprint, this assumption has never been empirically evaluated. An important factor influencing minutiae persistence in civilian fingerprint recognition applications is the consistency with which a user places their finger on the fingerprint scanner during fingerprint image acquisition. This paper investigates the probability of a reference minutia repeating in another sample of the same person’s fingerprint, when that probability depends on user consistency alone. The investigation targets cooperative users in a civilian fingerprint recognition application. To simulate this scenario, a database of 800 fingerprint samples from 100 participants was collected. Analysis of the database showed that the median probability of a reference minutia repeating in another sample of the same fingerprint is 0.95 with an interquartile range of 0.04. Combining multiple samples of the same fingerprint to filter out only the most reliable reference minutiae was shown to improve this probability. A complementary study demonstrated that automatic feature extractors and matchers may lower minutiae repeatability, but that user consistency is nevertheless the most influential factor.

Paper Nr: 82
Title:

Dirichlet-tree Distribution Enhanced Random Forests for Head Pose Estimation

Authors:

Yuanyuan Liu, Jingying Chen, Leyuan Liu, Yujiao Gong and Nan Luo

Abstract: Head pose estimation is important in human-machine interfaces. However, illumination variation, occlusion and low image resolution make the estimation task difficult. Hence, a Dirichlet-tree distribution enhanced Random Forests approach (D-RF) is proposed in this paper to estimate head pose efficiently and robustly under various conditions. First, Gabor features of the positive facial patches are extracted to eliminate the influence of occlusion and noise. Then, the D-RF is used to estimate the head pose in a coarse-to-fine way. In order to improve the discrimination capability of the approach, an adaptive Gaussian mixture model is introduced in the tree distribution. The proposed method has been evaluated on different data sets spanning from -90º to 90º in the vertical and horizontal directions under various conditions. The experimental results demonstrate the approach’s robustness and efficiency.

Paper Nr: 99
Title:

A Descriptor based on Intensity Binning for Image Matching

Authors:

B. Balasanjeevi and C. Chandra Sekhar

Abstract: This paper proposes a method for extracting image descriptors using intensity binning. It is based on the fact that, when the intensities of the interest regions are quantized, the pixels retain their bin labels under common image deformations, up to a certain degree of perturbation. Consequently, the spatial configuration and the shape of the connected regions of pixels belonging to each bin become resilient to noise, and as a whole they capture the topography of the intensity map of that region. We examine the effect of classical image deformations on this representation and seek a compact yet robust representation which remains unperturbed in the presence of noise and image deformations. We use the Oxford dataset in our experiments, and the results show that the proposed descriptor outperforms existing methods for matching two images under common image deformations.

Paper Nr: 102
Title:

A First Algorithm to Calculate Force Histograms in the Case of 3D Vector Objects

Authors:

Jameson Reed, Mohammad Naeem and Pascal Matsakis

Abstract: In daily conversation, people use spatial prepositions to denote spatial relationships and describe relative positions. Various quantitative relative position descriptors can be found in the literature. However, they all have been designed with 2D objects in mind, most of them cannot be extended to handle 3D objects in vector form, and there is currently no implementation able to process such objects. In this paper, we build on a descriptor called the histogram of forces, and we present the first algorithm for quantitative relative position descriptor calculation in the case of 3D vector objects. Experiments validate the approach.

Paper Nr: 112
Title:

Kernel Completion for Learning Consensus Support Vector Machines in Bandwidth-limited Sensor Networks

Authors:

Sangkyun Lee and Christian Pölitz

Abstract: Recent developments in sensor technology allow for capturing dynamic patterns in vehicle movements, temperature changes, and sea-level fluctuations, to name a few. A usual way of decision making on sensor networks, such as detecting exceptional surface-level changes across the Pacific Ocean, involves collecting measurement data from all sensors to build a predictor in a central processing station. However, data collection becomes challenging when communication bandwidth is limited, due to communication distance or low-energy requirements. Such settings also introduce unfavorable latency when making predictions on unseen events. In this paper, we propose an alternative strategy for such scenarios, aiming to build a consensus support vector machine (SVM) in each sensor station by exchanging a small amount of sampled information from local kernel matrices amongst peers. Our method is based on decomposing a “global” kernel defined with all features into “local” kernels defined only with the attributes stored in each sensor station, sampling a few entries of the decomposed kernel matrices that belong to other stations, and filling in the unsampled entries by matrix completion. Experiments on benchmark data sets illustrate that a consensus SVM can be built in each station using limited communication, and that it is competitive in prediction performance with an SVM that accesses all features.

Paper Nr: 119
Title:

Removing Motion Blur using Natural Image Statistics

Authors:

Johannes Herwig, Timm Linder and Josef Pauli

Abstract: We tackle deconvolution of motion blur in hand-held consumer photography with a Bayesian framework combining sparse gradient and color priors for regularization. We develop a closed-form optimization utilizing iteratively re-weighted least squares (IRLS) with a Gaussian approximation of the regularization priors. The model parameters of the priors can be learned from a set of natural images that resemble common image statistics. We thoroughly evaluate and discuss the effect of different regularization factors and make suggestions for reasonable values. Both the gradient and color priors are current state-of-the-art: in natural images the magnitude of gradients follows a kurtotic hyper-Laplacian distribution, and the two-color model exploits the observation that locally any color is a linear blend of a primary and a secondary color. Our contribution is to integrate both priors into a single optimization framework and to provide a more detailed derivation of their optimization functions. Our re-implementation reveals different model parameters than previously published, and the effectiveness of the color priors alone is explicitly examined. Finally, we propose a context-adaptive parameterization of the regularization factors in order to avoid over-smoothing the deconvolution result within highly textured areas.
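The IRLS core, replacing the hyper-Laplacian gradient prior by a reweighted quadratic at each iteration, looks roughly like this in one dimension. The paper works on 2-D images with a blur kernel and color priors; here the blur is dropped, so this reduces to sparse-gradient denoising, and constant factors are absorbed into `lam`.

```python
import numpy as np

def irls_denoise_1d(b, lam=1.0, p=0.8, iters=30, eps=1e-6):
    # Approximately minimizes ||x - b||^2 + lam * sum_i |x_{i+1} - x_i|^p
    # with p < 1 (hyper-Laplacian prior on gradients). Each IRLS step
    # solves a weighted least-squares problem whose weights come from
    # the previous iterate's gradients.
    b = np.asarray(b, dtype=float)
    n = len(b)
    D = np.diff(np.eye(n), axis=0)      # forward-difference operator
    x = b.copy()
    for _ in range(iters):
        g = D @ x
        w = np.maximum(np.abs(g), eps) ** (p - 2.0)   # IRLS weights
        A = np.eye(n) + lam * (D.T * w) @ D           # I + lam * D^T W D
        x = np.linalg.solve(A, b)
    return x
```

Because p < 1 the weights blow up on small gradients, pushing them toward exactly zero while leaving large edges comparatively untouched, which is the qualitative behaviour the sparse gradient prior is chosen for.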

Short Papers
Paper Nr: 18
Title:

A Low Illumination Environment Motion Detection Method based on Dictionary Learning

Authors:

Huaxin Xiao, Yu Liu, Bin Wang, Shuren Tan and Maojun Zhang

Abstract: This paper proposes a dictionary-based motion detection method for video images captured under low light with severe noise. The proposed approach trains a dictionary from background images containing no foreground. It then reconstructs each test image according to the theory of sparse coding, and introduces the Structural Similarity Index Measure (SSIM) as the detection criterion to identify detections caused by brightness and contrast changes. Experimental results show that, compared to the mixture-of-Gaussians model and the ViBe method, the proposed method achieves better results under extremely low illumination.
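SSIM itself is easy to state. A single-window (global) version over two grayscale frames is sketched below; the usual formulation, and presumably the paper's, applies it in local sliding windows and averages the map.

```python
import numpy as np

def ssim(x, y, L=255.0, k1=0.01, k2=0.03):
    # Global SSIM between two equal-sized grayscale images:
    # the product of luminance, contrast and structure comparisons.
    c1, c2 = (k1 * L) ** 2, (k2 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Used as a detection criterion, a reconstructed background patch scoring high SSIM against the observed patch is explained by illumination or contrast change rather than by a moving object.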

Paper Nr: 46
Title:

Learning with Kernel Random Field and Linear SVM

Authors:

Haruhisa Takahashi

Abstract: Deep learning methods, which include feature extraction in the training process, are achieving success in pattern recognition and machine learning, but they require extensive parameter tuning and a choice among many variants. By contrast, Support Vector Machines (SVMs) have been popular in these fields owing to their simple algorithm and solid grounding in learning theory. However, it is difficult to improve the recognition performance of SVMs beyond a certain level, in that a higher-dimensional feature space can only assure linear separability of the data, as opposed to separating the data manifolds themselves. We propose a new kernel machine framework that generates essentially linearly separable kernel features. Our method uses a pretraining process based on a kernel generative model and the mean field Fisher score with a higher-order autocorrelation kernel. The derived features are then separated by a linear SVM, which exhibits far better generalization performance than kernel-based SVMs. We present experiments on face detection using the appearance-based approach, showing that our method attains results comparable with state-of-the-art face detection methods based on AdaBoost, SURF, and cascades, despite a smaller data size and no preprocessing.

Paper Nr: 88
Title:

Integrating Local Information-based Link Prediction Algorithms with OWA Operator

Authors:

James N. K. Liu, Yu-Lin He, Yan-Xing Hu, Xi-Zhao Wang and Simon C. K. Shiu

Abstract: The objective of link prediction for a social network is to estimate the likelihood that a link exists between two nodes x and y. Several well-known local information-based link prediction algorithms (LILPAs) have been proposed to handle this essential and crucial problem in social network analysis. However, they do not adequately consider all the so-called local information: the degrees of x and y, the number of common neighbors of x and y, and the degrees of those common neighbors. In other words, no single LILPA takes all the local information into account simultaneously. This limits the performance of LILPAs to a certain degree and leads to their high variability. Thus, in order to make full use of all the local information and obtain a LILPA with high predictive capability, an ordered weighted averaging (OWA) operator based link prediction ensemble algorithm (LPEOWA) is proposed, integrating nine different LILPAs with aggregation weights determined by the maximum entropy method. Experimental results on benchmark social network datasets show that LPEOWA obtains higher prediction accuracy, measured by the area under the receiver operating characteristic curve (AUC), than the nine individual LILPAs.
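An OWA operator weights positions in the sorted score vector rather than particular predictors. The minimal sketch below shows that aggregation step plus one of the local scores (common neighbours) as an example input; the paper's maximum-entropy weight selection is not reproduced here.

```python
import numpy as np

def owa(scores, weights):
    # OWA: sort the predictor scores in descending order, then take a
    # weighted sum with position-dependent (not predictor-dependent)
    # weights. weights=[1,0,...] gives max; uniform weights give mean.
    s = np.sort(np.asarray(scores, dtype=float))[::-1]
    w = np.asarray(weights, dtype=float)
    return float(s @ w / w.sum())

def common_neighbors(adj, x, y):
    # One local link-prediction score: the number of common neighbours
    # of x and y in a boolean adjacency matrix.
    return int((adj[x] & adj[y]).sum())
```

In an ensemble setting, each candidate pair (x, y) would get a score from every LILPA, and `owa` would fuse that score vector into the final link likelihood.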

Paper Nr: 90
Title:

Initialization Framework for Latent Variable Models

Authors:

Heydar Maboudi Afkham, Carl Henrik Ek and Stefan Carlsson

Abstract: In this paper, we discuss the properties of a class of latent variable models that assumes each labeled sample is associated with a set of different features, with no prior knowledge of which feature is the most relevant. Deformable Part Models (DPM) are a good example of such models. While the Latent SVM framework (LSVM) has proven to be an efficient tool for solving these models, we argue that the solution found by this tool is very sensitive to the initialization. To decrease this dependency, we propose a novel clustering procedure for these problems that finds cluster centers shared by several sample sets while ignoring the remaining cluster centers. As we show, these cluster centers provide a robust initialization for the LSVM framework.

Paper Nr: 91
Title:

Gradual Improvement of Image Descriptor Quality

Authors:

Heydar Maboudi Afkham, Carl Henrik Ek and Stefan Carlsson

Abstract: In this paper, we propose a framework for gradually improving the quality of an already existing image descriptor. The descriptor used in this paper (Afkham et al., 2013) uses the responses of a series of discriminative components to summarize each image. As we show, this descriptor has an ideal form in which all categories become linearly separable. While reaching this form is not feasible, we argue that by replacing a small fraction of these components it is possible to obtain a descriptor which is, on average, closer to this ideal form. To do so, we first identify which components do not contribute to the quality of the descriptor and replace them with more robust ones, using a joint feature selection method to find the improved components. As our experiments show, this change is directly reflected in the capability of the resulting descriptor to discriminate between different categories.

Paper Nr: 109
Title:

Parallel Classification System based on an Ensemble of Mixture of Experts

Authors:

Benjamín Moreno-Montiel and René MacKinney-Romero

Abstract: The classification of large amounts of data is a challenging problem that only a small number of classification algorithms can handle. In this paper we propose a Parallel Classification System based on an Ensemble of Mixture of Experts (PCEM). The system uses a MIMD (Multiple Instruction, Multiple Data stream) architecture with a set of processes that communicate via messages. PCEM is implemented using parallel schemes of traditional classifiers for the mixture of experts, and a parallel version of a genetic algorithm to implement a weighted voting criterion. PCEM is a novel algorithm in that it classifies large amounts of data with low execution times and high performance measures, which makes it an excellent tool for the classification of large amounts of data. A series of tests was performed with well-known databases, allowing us to measure how PCEM performs on many datasets and how well it does compared with other available systems.

Paper Nr: 124
Title:

The Integer Approximation of Undirected Graphical Models

Authors:

Nico Piatkowski, Sangkyun Lee and Katharina Morik

Abstract: Machine learning on resource-constrained ubiquitous devices suffers from high energy consumption and slow execution. In this paper, it is investigated how to modify machine learning algorithms in order to reduce the number of consumed clock cycles, not by reducing the asymptotic complexity, but by assuming a weaker execution platform. In particular, an integer approximation to the class of undirected graphical models is proposed. Algorithms for inference, maximum-a-posteriori prediction and parameter estimation are presented, and the approximation error is discussed. In numerical evaluations on synthetic data, the response of the model to several influential properties of the data is investigated. The results on the synthetic data are confirmed with a natural language processing task on an open data set. In addition, the runtime on low-end hardware is examined. The overall speedup of the new algorithms is at least 2× while the overall loss in accuracy is rather small. This allows probabilistic methods to run on very small devices, even ones whose processors cannot execute floating point arithmetic at all.
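The flavour of such integer approximations can be illustrated by quantizing real-valued parameters into a small signed-integer range and computing the linear score (the inner product that drives inference) entirely in integers, rescaling once at the end. This is a toy illustration of the idea, not the paper's actual scheme.

```python
import numpy as np

def quantize(theta, bits=8):
    # Map real parameters to signed (bits)-bit integers plus one scale.
    scale = (2 ** (bits - 1) - 1) / np.abs(theta).max()
    return np.round(theta * scale).astype(np.int32), scale

def int_score(q_theta, q_phi, scale_theta, scale_phi):
    # Integer-only inner product; a single float division at the end
    # undoes both quantization scales.
    return int(q_theta @ q_phi) / (scale_theta * scale_phi)
```

On fixed-point hardware only `int_score`'s integer dot product runs per prediction; the scales are compile-time constants, which is what makes floating-point-free execution possible.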

Paper Nr: 171
Title:

Improving Kernel Grower Methods using Ellipsoidal Support Vector Data Description

Authors:

Sabra Hechmi, Alya Slimene and Ezzeddine Zagrouba

Abstract: In recent years, kernel methods have gained considerable interest in many areas of machine learning. This work investigates the ability of kernel clustering methods to deal with a meaningful problem of computer vision, namely the image segmentation task. In this context, we propose a novel kernel method based on an Ellipsoidal Support Vector Data Description (ESVDD). Experiments conducted on selected synthetic data sets and on the Berkeley image segmentation benchmark show that our approach significantly outperforms state-of-the-art kernel methods.

Paper Nr: 177
Title:

Fast Arabic Glyph Recognizer based on Haar Cascade Classifiers

Authors:

Ashraf AbdelRaouf, Colin A. Higgins, Tony Pridmore and Mahmoud I. Khalil

Abstract: Optical Character Recognition (OCR) is an important technology. The Arabic language lacks both the variety of OCR systems and the depth of research available for Roman scripts. A machine learning, Haar-Cascade classifier (HCC) approach was introduced by Viola and Jones (2001) to achieve rapid object detection based on a boosted cascade of Haar-like features. Here, that approach is modified for the first time to suit Arabic glyph recognition. The HCC approach eliminates problematic steps in the pre-processing and recognition phases and, most importantly, the character segmentation stage. A recognizer was produced for each of the 61 Arabic glyphs that exist after the removal of diacritical marks. These recognizers were trained and tested on some 2,000 images each. The system was tested with real text images and produces a recognition rate for Arabic glyphs of 87%. The proposed method is fast, with an average document recognition time of 14.7 seconds compared with 15.8 seconds for commercial software.

Paper Nr: 180
Title:

Speeding up Support Vector Machines - Probabilistic versus Nearest Neighbour Methods for Condensing Training Data

Authors:

Moïri Gamboni, Abhijai Garg, Oleg Grishin, Seung Man Oh, Francis Sowani, Anthony Spalvieri-Kruse, Godfried T. Toussaint and Lingliang Zhang

Abstract: Several methods for reducing the running time of support vector machines (SVMs) are compared in terms of speed-up factor and classification accuracy, using seven large real-world datasets obtained from the UCI Machine Learning Repository. All the methods tested are based on reducing the size of the training data that is then fed to the SVM. Two probabilistic methods that run in linear time with respect to the size of the training data are investigated: blind random sampling and a new method for guided random sampling (Gaussian Condensing). These methods are compared with k-Nearest Neighbour methods for reducing the size of the training set and for smoothing the decision boundary. For all the datasets tested, blind random sampling gave the best results for speeding up SVMs without significantly sacrificing classification accuracy.
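Blind random sampling is the simplest of the compared reducers: draw a uniform subsample of the training set and train the SVM on that. The function name and fraction parameter below are our own; the subsample would be handed to any SVM trainer afterwards.

```python
import numpy as np

def blind_random_sample(X, y, frac, seed=0):
    # Keep a uniform random fraction of the training pairs (X, y);
    # linear time in the size of the training data.
    rng = np.random.default_rng(seed)
    n = len(y)
    idx = rng.choice(n, size=max(1, int(frac * n)), replace=False)
    return X[idx], y[idx]
```

For example, `Xs, ys = blind_random_sample(X, y, 0.1)` followed by fitting an SVM on `(Xs, ys)` trades a controlled amount of accuracy for a training-time reduction governed by the SVM's super-linear cost in the number of samples.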

Posters
Paper Nr: 15
Title:

An Inhomogeneous Bayesian Texture Model for Spatially Varying Parameter Estimation

Authors:

Chathurika Dharmagunawardhana, Sasan Mahmoodi, Michael Bennett and Mahesan Niranjan

Abstract: In statistical model based texture feature extraction, features based on spatially varying parameters achieve higher discriminative performance than spatially constant parameters. In this paper we formulate a novel Bayesian framework that achieves texture characterization via spatially varying parameters based on Gaussian Markov random fields. The parameter estimation is carried out with the Metropolis-Hastings algorithm. The distributions of the estimated spatially varying parameters are then used as discriminant texture features in classification and segmentation. Results show that the novel features outperform traditional Gaussian Markov random field texture features, which use spatially constant parameters. These features capture both pixel spatial dependencies and structural properties of a texture, giving improved texture features for effective texture classification and segmentation.

Paper Nr: 30
Title:

Enhanced Kernel Uncorrelated Discriminant Nearest Feature Line Analysis for Radar Target Recognition

Authors:

Chunyu Wan, Xuelian Yu, Yun Zhou and Xuegang Wang

Abstract: In this paper, a new subspace learning algorithm, called enhanced kernel uncorrelated discriminant nearest feature line analysis (EKUDNFLA), is presented. The aim of EKUDNFLA is to seek a feature subspace in which the within-class feature line (FL) distances are minimized and the between-class FL distances are maximized simultaneously. At the same time, an uncorrelated constraint is imposed to obtain statistically uncorrelated features, which contain minimum redundancy and ensure independence, a property highly desirable in many practical applications. By optimizing an objective function in a kernel feature space, nonlinear features are extracted. In addition, a weighting coefficient is introduced to adjust the proportion between within-class and between-class information for an optimal effect. Experimental results on radar target recognition with measured data demonstrate the effectiveness of the proposed method.

Paper Nr: 31
Title:

Comparison of Performances of Plug-in Spatial Classification Rules based on Bayesian and ML Estimators

Authors:

Kestutis Ducinskas, Egle Zikariene and Lina Dreiziene

Abstract: The problem of classifying a scalar Gaussian random field observation into one of two populations specified by different parametric drifts and a common covariance model is considered. The unknown drift and scale parameters are estimated from a given spatial training sample. This paper concerns classification procedures associated with a parametric plug-in Bayes rule obtained by substituting the unknown parameters in the Bayes rule with their estimators. Bayesian estimators are used for particular prior distributions of the unknown parameters. A closed-form expression is derived for the actual risk associated with the aforementioned classification rule. An estimator of the expected risk based on the derived actual risk is used as a performance measure for the classifier incurred by the plug-in Bayes rule. A stationary Gaussian random field with an exponential covariance function sampled on a regular 2-dimensional lattice is used for the simulation experiment. A critical performance comparison between the plug-in Bayes rule defined above and one based on ML estimators is performed.

Paper Nr: 56
Title:

Heuristic Ensemble of Filters for Reliable Feature Selection

Authors:

Ghadah Aldehim, Beatriz de la Iglesia and Wenjia Wang

Abstract: Feature selection has become ever more important in data mining in recent years due to the rapid increase in the dimensionality of data. Filters are preferable in practical applications as they are much faster than wrapper-based approaches, but their reliability and consistency vary considerably across different data, and yet no rule exists to indicate which one should be used for a particular dataset. In this paper, we propose a heuristic ensemble approach that combines multiple filters with heuristic rules to improve the overall performance. It consists of two types of filters, subset filters and ranking filters, and a heuristic consensus algorithm. The experimental results demonstrate that our ensemble algorithm is more reliable and effective than individual filters, as the features selected by the ensemble consistently achieve better accuracy for typical classifiers on various datasets.
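One simple consensus rule for the ranking filters, averaging each feature's rank across filters and keeping the best k, can be sketched as follows. The paper's heuristic algorithm also merges subset filters with rules; that part, and the function name here, are not from the paper.

```python
def consensus_rank(rankings, k):
    # rankings: list of feature lists, each ordered best-first by one
    # ranking filter. Score each feature by its mean position across
    # the filters; lower mean position is better.
    totals = {}
    for ranking in rankings:
        for pos, feature in enumerate(ranking):
            totals[feature] = totals.get(feature, 0) + pos
    return sorted(totals, key=lambda f: totals[f] / len(rankings))[:k]
```

Averaging ranks rather than raw filter scores sidesteps the problem that different filters (chi-squared, information gain, ReliefF, ...) score on incomparable scales.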

Paper Nr: 60
Title:

Curve Reconstruction from Noisy and Unordered Samples

Authors:

Marek W. Rupniewski

Abstract: An algorithm for the reconstruction of closed and open curves from clouds of their noisy and unordered samples is presented. Each curve is reconstructed as a polygonal path represented by its vertices, which are determined in an iterative process comprising evolutionary and decimation stages. The quality of the reconstruction is studied with respect to the local density of the samples and the standard deviation of the noise perturbing the samples. The algorithm is verified to work for arbitrary dimensions of ambient space.

Paper Nr: 62
Title:

A Comparison of Approaches for Person Re-identification

Authors:

Maria De Marsico, Riccardo Distasi, Stefano Ricciardi and Daniel Riccio

Abstract: Advanced surveillance applications often require re-identifying an individual. In the typical context of a camera network, this means recognizing a subject acquired at one location among a feasible set of candidates acquired at different locations and/or times. This task is especially challenging in applications targeted at crowded environments. Face and gait are contactless biometrics which are particularly suited to re-identification, but even “soft” biometrics have been considered for this aim. We present a review of approaches to re-identification, with some characteristic examples from the literature. The goal is to provide an estimate of the state of the art as well as of the potential for further improving such techniques and extending the applicability of re-identification systems.

Paper Nr: 76
Title:

A Fast Computation Method for IQA Metrics Based on their Typical Set

Authors:

Vittoria Bruni and Domenico Vitulano

Abstract: This paper deals with the typical set of an image quality assessment (IQA) measure. In particular, it focuses on the well-known and widely used Structural SIMilarity index (SSIM). In agreement with information theory, the visual distortion typical set is composed of the least amount of information necessary to estimate the quality of the distorted image. General criteria for an effective and fruitful computation of the set are given. As will be shown, the typical set makes it possible to increase IQA efficiency by considerably speeding up computation, thanks to the reduced number of image blocks used in the evaluation of the considered IQA metric.
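SSIM itself is computed blockwise from local means, variances, and covariance. A minimal sketch of the standard per-block formula (with the usual constants for 8-bit images; the paper's typical-set block selection is not reproduced here):

```python
import numpy as np

def ssim_block(x, y, c1=(0.01 * 255) ** 2, c2=(0.03 * 255) ** 2):
    """Structural similarity between two image blocks (standard SSIM formula)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

A typical-set approach would evaluate this formula only on the subset of blocks that carries most of the distortion information, rather than averaging over every block of the image.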

Paper Nr: 85
Title:

Optimal Conjugate Gradient Algorithm for Generalization of Linear Discriminant Analysis Based on L1 Norm

Authors:

Kanishka Tyagi, Nojun Kwak and Michael Manry

Abstract: This paper analyzes a linear discriminant subspace technique from an L1 point of view. We propose an efficient and optimal algorithm that addresses several major issues with prior work, concerning not only the L1-based LDA algorithm but also its L2 counterpart. These include algorithm implementation, the effect of outliers, and the optimality of the parameters used. The key idea is to use conjugate gradient to optimize the L1 cost function and to find a learning factor during the update of the weight vector in the subspace. Experimental results on UCI datasets reveal that the present method is a significant improvement over previous work. The mathematical treatment of the proposed algorithm and the calculation of the learning factor are the main subjects of this paper.

Paper Nr: 89
Title:

A Method of Pixel Unmixing by Classes based on the Possibilistic Similarity

Authors:

B. Alsahwa, S. Almouahed, D. Guériot and B. Solaiman

Abstract: In this paper, an approach for pixel unmixing based on possibilistic similarity is proposed. This approach uses possibility distributions to express both the expert’s semantic knowledge (a priori knowledge) and the contextual information. Dubois-Prade’s probability-possibility transformation is used to construct these possibility distributions starting from statistical information (learning areas delimited by an expert for each thematic class in the analyzed scene), which first serves to estimate the probability density functions using kernel density estimation. Pixel unmixing is then performed based on the possibilistic similarity between a local possibility distribution estimated around the considered pixel and the possibility distributions representing the predefined thematic classes. The obtained similarity values are used to obtain the abundances of the different classes in the considered pixel. Accuracy analysis of the pixel unmixing demonstrates that the proposed approach is an efficient estimator of the abundances of the predefined thematic classes and, in turn, that higher classification accuracy is achieved. Synthetic images are used to evaluate the performance of the proposed approach.
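The Dubois-Prade transformation mentioned above converts a probability distribution into a possibility distribution by cumulating probabilities from the tail of the ranking. A minimal sketch (assuming distinct probability values; ties admit several variants not handled here):

```python
def prob_to_poss(p):
    """Dubois-Prade probability-to-possibility transformation.
    With probabilities ranked p1 >= ... >= pn, the possibility of the
    i-th most probable event is the tail sum p_i + ... + p_n."""
    idx = sorted(range(len(p)), key=lambda i: -p[i])  # rank by probability
    poss = [0.0] * len(p)
    tail = sum(p)
    for i in idx:
        poss[i] = tail
        tail -= p[i]
    return poss
```

The most probable event thus receives possibility 1 (for a normalized input), and rarer events receive progressively smaller possibility degrees.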

Paper Nr: 92
Title:

Fuzzy Set Theoretical Analysis of Human Membership Values on the Color Triangle - Mapping from the Color Triangle (Antecedent) via the Color Triangle (Consequent) to the Tone Triangle

Authors:

Shun Kato, Itsuki Shinomiya, Fumihiko Mori and Naotoshi Sugano

Abstract: The present study considers a fuzzy color system in which three membership functions are constructed on a color triangle. This system can process a fuzzy input to a color triangle system and output the center of gravity of three weights associated with the respective grades. Three fuzzy sets (red, green, and blue) are applied to the color triangle relationship. By treating the attributes of redness, greenness, and blueness on the color triangle, a target color can easily be obtained as the center of gravity of the output fuzzy set. In the present paper, the 0% triangle consists of the lines of 0% redness, 0% greenness, and 0% blueness of the attributes. The colors on the 0% triangle map to the right corner of the tone triangle (on or near C). When the inference results for fuzzy inputs are compared with those for crisp inputs, the former move toward W (the white region). The input-output relationship is shown in terms of redness and chromaticness. The inference outputs for crisp inputs and for fuzzy inputs are clearly different: those for crisp inputs are vertically linear.

Paper Nr: 94
Title:

Region Segregation by Linking Keypoints Tuned to Colour

Authors:

M. Farrajota, J. M. F. Rodrigues and J. M. H. du Buf

Abstract: Coloured regions can be segregated from each other by using colour-opponent mechanisms, colour contrast, saturation and luminance. Here we address segmentation by using end-stopped cells tuned to colour instead of to colour contrast. Colour information is coded in separate channels. By using multi-scale cortical end-stopped cells tuned to colour, keypoint information in all channels is coded and mapped by multi-scale peaks. Unsupervised segmentation is achieved by analysing the branches of these peaks, which yields the best-fitting image regions.

Paper Nr: 103
Title:

Kernel Hierarchical Agglomerative Clustering - Comparison of Different Gap Statistics to Estimate the Number of Clusters

Authors:

Na Li, Nicolas Lefebvre and Régis Lengellé

Abstract: Clustering algorithms, as unsupervised analysis tools, are useful for exploring data structure and have achieved great success in many disciplines. For most clustering algorithms, such as k-means, determining the number of clusters is a crucial step and one of the most difficult problems. Hierarchical Agglomerative Clustering (HAC) has the advantage of giving a data representation, the dendrogram, that allows clustering by cutting the dendrogram at some optimal level. In past years, and within the context of HAC, efficient statistics have been proposed to estimate the number of clusters, and the Gap Statistic by Tibshirani has shown interesting performance. In this paper, we propose some new Gap Statistics to further improve the determination of the number of clusters. Our work focuses on the kernelized version of the widely used hierarchical clustering algorithm.
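For reference, Tibshirani's original Gap Statistic compares the log within-cluster dispersion of the data against its expectation under a uniform reference distribution, choosing the k where the gap is largest. The sketch below uses a plain k-means partitioner rather than the kernel HAC studied in the paper, so it only illustrates the statistic itself:

```python
import numpy as np

def log_dispersion(X, labels, k):
    """log(W_k): pooled within-cluster sum of squared distances to centroids."""
    W = sum(((X[labels == c] - X[labels == c].mean(axis=0)) ** 2).sum()
            for c in range(k) if (labels == c).any())
    return np.log(W)

def kmeans(X, k, iters=50, seed=0):
    """Lloyd's algorithm with farthest-point initialization."""
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]
    while len(centers) < k:
        d = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])  # farthest point from current centers
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None]) ** 2).sum(axis=-1)
        labels = d.argmin(axis=1)
        for c in range(k):
            if (labels == c).any():
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def gap_statistic(X, k, B=10, seed=0):
    """Gap(k) = mean_b log(W_k(reference_b)) - log(W_k(data))."""
    rng = np.random.default_rng(seed)
    logW = log_dispersion(X, kmeans(X, k), k)
    lo, hi = X.min(axis=0), X.max(axis=0)
    refs = [log_dispersion(R := rng.uniform(lo, hi, size=X.shape),
                           kmeans(R, k, seed=b), k) for b in range(B)]
    return float(np.mean(refs) - logW)
```

On data with genuine cluster structure, the gap at the true number of clusters exceeds the gap at smaller k, which is the selection criterion the paper's variants refine.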

Paper Nr: 105
Title:

An Unsupervised Nonparametric and Cooperative Approach for Classification of Multicomponent Image Contents

Authors:

Akar Taher, Kacem Chehdi and Claude Cariou

Abstract: In this paper, an unsupervised, nonparametric, cooperative and adaptive approach for multicomponent image partitioning is presented. In this approach the images are partitioned component by component, and intermediate classification results are evaluated and fused to obtain the final partitioning result. Two unsupervised classification methods are used in parallel cooperation to partition each component of the image. The originality of the approach relies i) on its local adaptation to the type of regions in an image (textured, non-textured), ii) on the automatic estimation of the number of classes, and iii) on the introduction of several levels of evaluation and validation of intermediate partitioning results before obtaining the final classification result. For the management of similar or conflicting results issued from the two classification methods, we gradually introduce various assessment steps that exploit the information of each component and its adjacent components, and finally the information of all the components. In our approach, the detected region types are treated separately from the feature extraction step through to the final classification results. The efficiency of our approach is shown on two real applications: a hyperspectral image for the identification of invasive and non-invasive vegetation, and a multispectral image for pine tree detection.

Paper Nr: 117
Title:

Overlapping Clustering with Outliers Detection

Authors:

Amira Rezgui, Chiheb-Eddine Ben N'Cir and Nadia Essoussi

Abstract: Detecting overlapping groups is an important challenge in clustering, offering relevant solutions for many application domains. Recently, the Parametrized R-OKM method was defined as an extension of OKM to control the overlapping boundaries between clusters. However, the performance of both OKM and Parametrized R-OKM is considerably reduced when the data contain outliers. The presence of outliers affects the resulting clusters and yields clusters which do not fit the true structure of the data. In order to improve the existing methods, we propose a robust method able to detect relevant overlapping clusters with outlier identification. Experiments performed on artificial and real multi-labeled data sets showed the effectiveness of the proposed method in producing relevant non-disjoint groups.

Paper Nr: 118
Title:

Segmentation Ensemble - A Knowledge Reuse for Model Order Selection using Case-based Reasoning

Authors:

Pakaket Wattuya and Ekkawut Rojsattarat

Abstract: Cluster ensembles have emerged as a powerful technique for improving the robustness, stability, and accuracy of clustering solutions; however, automatically estimating the appropriate number of clusters in the final combined result remains unsolved. In this paper we present a new approach based on case-based reasoning to handle this difficult task. The key to the success of our approach is a novel use of the cluster ensemble in a role different from the past: each ensemble component is viewed as an expert domain for building a case base. Benefiting from the information extracted from the cluster ensemble, case-based reasoning is able to efficiently determine the appropriate number of clusters underlying a clustering ensemble. Our approach is simple, fast and effective. Three simulations with different state-of-the-art segmentation algorithms are presented to illustrate the efficacy of the proposed approach. We extensively evaluate our approach on a large dataset in comparison with recent approaches for determining the number of regions in a segmentation combination framework. Experiments demonstrate that our approach can substantially reduce the computational time required by existing methods, more importantly, without loss of segmentation combination accuracy. This contribution makes the segmentation ensemble combination concept more feasible in real-world applications.

Paper Nr: 134
Title:

SYNC-SOM - Double-layer Oscillatory Network for Cluster Analysis

Authors:

A. V. Novikov and E. N. Benderskaya

Abstract: Although partial synchronization in oscillatory networks based on the Kuramoto model can be used for cluster analysis, the convergence rate of the synchronization process depends on the number of oscillators and the number of links between them. Moreover, the clustering result depends on the radius of connectivity, which should be chosen in line with the input data. We propose a double-layer oscillatory network that addresses these two problems. Our network is relevant in situations where a fast solution is required and the input data should be clustered without expert estimations. In this paper, we present results of experiments that confirm better quality than traditional algorithms.
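The Kuramoto dynamics underlying such oscillatory networks update each oscillator's phase from the phases of its neighbours; linked oscillators (e.g., data points within the connectivity radius) synchronize, and synchronized groups are read off as clusters. A minimal Euler-step sketch (not the authors' double-layer architecture):

```python
import math

def kuramoto_step(theta, adj, omega, K=1.0, dt=0.1):
    """One Euler integration step of the Kuramoto model on an adjacency list.
    theta: current phases; adj: neighbour indices per oscillator;
    omega: natural frequencies; K: coupling strength; dt: step size."""
    new = []
    for i in range(len(theta)):
        coupling = sum(math.sin(theta[j] - theta[i]) for j in adj[i])
        deg = max(len(adj[i]), 1)  # normalize by the local degree
        new.append(theta[i] + dt * (omega[i] + (K / deg) * coupling))
    return new
```

Running many steps on internally connected but mutually disconnected groups drives each group to a common phase while the groups stay apart — the partial synchronization the abstract refers to.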

Paper Nr: 149
Title:

Unsupervised Relevance Analysis for Feature Extraction and Selection - A Distance-based Approach for Feature Relevance

Authors:

Diego H. Peluffo, John A. Lee, Michel Verleysen, José L. Rodríguez and Germán Castellanos-Domínguez

Abstract: The aim of this paper is to propose a new generalized formulation for feature extraction based on distances, from a feature relevance point of view. This is done within an unsupervised framework. To do so, the formal concept of feature relevance is first outlined. Then, a novel feature extraction approach is introduced. This approach employs the M-norm as a distance measure. It is demonstrated that, under some conditions, this method can readily explain methods from the literature. As another contribution of this paper, we propose an elegant feature-ranking approach for feature selection, derived from the spectral analysis of the data variability. We also provide a weighted PCA scheme revealing the relationship between feature extraction and feature selection. To assess the behavior of the studied methods within a pattern recognition system, a clustering stage is carried out. Normalized mutual information is used to quantify the quality of the resultant clusters. The proposed methods reach results comparable with those from the literature.

Paper Nr: 152
Title:

Dimensionality Reduction of Features using Multi Resolution Representation of Decomposed Images

Authors:

Avi Bleiweiss

Abstract: A common objective in multi-class image analysis is to reduce the dimensionality of the input data and capture the most discriminant features in the projected space. In this work, we investigate a system that first finds clusters of similar points in feature space using a nearest-neighbor, graph-based decomposition algorithm. This process transforms the original image data onto a subspace of identical dimensionality but with a much flatter color gamut. The intermediate representation of the segmented image then feeds an effective local descriptor operator that yields a markedly more compact feature vector than the one obtained by applying the descriptor directly to the native image. For evaluation, we study a generalized multi-resolution representation of decomposed images, parameterized by a broad range of a decreasing number of clusters. We conduct experiments on both uncorrelated and correlated image sets, expressed in raw feature vectors of one million elements each, and demonstrate robust accuracy in applying our features to a linear SVM classifier. Compared to state-of-the-art systems with identical goals, our method shows increased dimensionality reduction at consistent feature matching performance.

Paper Nr: 163
Title:

Stability of Ensemble Feature Selection on High-Dimension and Low-Sample Size Data - Influence of the Aggregation Method

Authors:

David Dernoncourt, Blaise Hanczar and Jean-Daniel Zucker

Abstract: Feature selection is an important step when building a classifier. However, feature selection tends to be unstable on high-dimension, small-sample-size data. This instability reduces the usefulness of selected features for knowledge discovery: if the selected feature subset is not robust, domain experts can have little trust that it is relevant. A growing number of studies deal with feature selection stability. Based on the idea that ensemble methods are commonly used to improve classifier accuracy and stability, some works have focused on the stability of ensemble feature selection methods. So far, they have obtained mixed results, and as far as we know no study has extensively examined how the choice of the aggregation method influences the stability of ensemble feature selection. This is what we study in this preliminary work. We first present some aggregation methods, then we study the stability of ensemble feature selection based on them, on both artificial and real data, as well as the resulting classification performance.

Paper Nr: 167
Title:

GPU Solver with Chi-square Kernels for SVM Classification of Big Sparse Problems

Authors:

Krzysztof Sopyla and Pawel Drozda

Abstract: This paper presents ongoing research on GPU SVM solutions for the classification of big sparse datasets. In particular, after the successful implementation of the RBF kernel for sparse matrix formats in previous work, we decided to evaluate the Chi2 and Exponential Chi2 kernels. Moreover, the details of the GPU solver are presented. The experimental session summarizes the results of GPU SVM classification for different sparse data formats and different SVM kernels, and demonstrates that the Exponential Chi2 solution achieves significant acceleration in GPU SVM processing, while the results for the Chi2 kernel remain far from satisfactory.

Paper Nr: 168
Title:

Iterative Robust Registration Approach based on Feature Descriptors Correspondence - Application to 3D Faces Description

Authors:

Wieme Gadacha and Faouzi Ghorbel

Abstract: In this paper, we introduce a fast surface registration process which is independent of the original parameterization of the surface and invariant under 3D rigid transformations. It is based on feature descriptor correspondence. The feature descriptors are extracted from the superposition of two families of surface curves: geodesic level curves and radial curves from local neighborhoods defined around reference points already picked on the surface. A study of the optimal number of these curves, based on a generalized version of the Shannon theorem, is developed. The obtained discretized parameterization (ordered descriptors) thus forms the basis of the matching phase, which becomes straightforward and more robust compared to the classic ICP algorithm. Experiments are conducted on facial surfaces from the Bosphorus database to test the registration of both rigid and non-rigid shapes (neutral faces vs. faces with expressions). The Hausdorff distance in shape space is used as an evaluation metric to test robustness to tessellation. The discriminative power in face description is also estimated.

Paper Nr: 179
Title:

Bilingual Software Requirements Tracing using Vector Space Model

Authors:

Olcay Taner Yıldız, Ahmet Okutan and Ercan Solak

Abstract: In the software engineering world, creating and maintaining relationships between the byproducts generated during the software lifecycle is crucial. A typical relation is the one that exists between an item in the requirements document and a block in the subsequent system design, e.g., a class in the source code. In many software engineering projects, the requirements documentation is prepared in the developers' native language, whereas developers prefer to use English in the software development process. In this paper, we use the vector space model to extract traceability links between requirements written in one language (Turkish) and the implementations of classes in another language (English). The experiments show that, by using a generic translator such as Google Translate, we can obtain promising results, which can be further improved by using comment information in the source code.
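The vector space model scores a (translated) requirement against each class by the cosine similarity of tf-idf vectors. A toy sketch, with whitespace tokenization standing in for the paper's actual preprocessing and translation step:

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """tf-idf vectors for a list of documents given as token lists."""
    n = len(docs)
    df = Counter()
    for d in docs:
        df.update(set(d))  # document frequency per term
    return [{t: tf * math.log(n / df[t]) for t, tf in Counter(d).items()}
            for d in docs]

def cosine(u, v):
    """Cosine similarity between two sparse tf-idf vectors (dicts)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

A traceability link is then proposed between a requirement and the class whose vector it is most similar to.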

Area 2 - Applications

Full Papers
Paper Nr: 26
Title:

An a-contrario Approach for Face Matching

Authors:

Luis D. Di Martino, Javier Preciozzi, Federico Lecumberry and Alicia Fernández

Abstract: In this work we focus on the matching stage of a face recognition system. Such systems are used to identify an unknown person or to validate a claimed identity. In the face recognition field, it is very common to innovate on the features extracted from a face and use a simple threshold on the distance between samples in order to validate a claimed identity. In this work we present a novel strategy based on the a-contrario framework to improve the matching stage. This approach results in a validation threshold that automatically adapts to the data and makes it possible to predict the performance of the system in advance. We perform several experiments to validate this novel strategy on different databases and show its advantages over a simple threshold on the distances.

Paper Nr: 67
Title:

Motion Binary Patterns for Action Recognition

Authors:

Florian Baumann, Jie Lao, Arne Ehlers and Bodo Rosenhahn

Abstract: In this paper, we propose a novel feature type to recognize human actions from video data. By combining the benefits of Volume Local Binary Patterns and optical flow, a simple and efficient descriptor is constructed. Motion Binary Patterns (MBP) are computed in the spatio-temporal domain, gathering static object appearance as well as motion information. Histograms are used to learn a Random Forest classifier, which is applied to the task of human action recognition. The proposed framework is evaluated on the well-known, publicly available KTH and Weizmann datasets, and on the IXMAS dataset for multi-view action recognition. The results demonstrate state-of-the-art accuracies in comparison to other methods.

Paper Nr: 69
Title:

Text Line Aggregation

Authors:

Christopher Beck, Alan Broun, Majid Mirmehdi, Tony Pipe and Chris Melhuish

Abstract: We present a new approach to text line aggregation that can work as both a line formation stage for a myriad of text segmentation methods (over all orientations) and as an extra level of filtering to remove false text candidates. The proposed method is centred on the processing of candidate text components based on local and global measures. We use orientation histograms to build an understanding of paragraphs, and filter noise and construct lines based on the discovery of prominent orientations. Paragraphs are then reduced to seed components and lines are reconstructed around these components. We demonstrate results for text aggregation on the ICDAR 2003 Robust Reading Competition data, and also present results on our own more complex data set.

Paper Nr: 72
Title:

Shape-based Segmentation of Tomatoes for Agriculture Monitoring

Authors:

Ujjwal Verma, Florence Rossant, Isabelle Bloch, Julien Orensanz and Denis Boisgontier

Abstract: In this paper, we present a segmentation procedure based on a parametric active contour with a shape constraint, in order to follow the growth of tomatoes from images acquired in the field. This is a challenging task because of the poor contrast in the images and the occlusions by the vegetation. In our sequential approach, considering one image per day, we assume that a segmentation of the tomatoes is available for the image acquired the previous day. An initial curve for the active contour model is computed by combining gradient information and region information. Then, an active contour with a shape constraint is applied to provide an elliptic approximation of the tomato boundary. We performed a quantitative evaluation of our approach by comparing the results with manual segmentations. Given the varying degree of occlusion in the images, the image data set was divided into three categories based on the occlusion degree of the tomato in the processed image. For the cases with low occlusion, good results were obtained, with an average relative distance between the manual and automatic segmentations of 2.73% (expressed as a percentage of the size of the tomato). For the images with a significant amount of occlusion, a good segmentation was obtained on 44% of the images, where the average error was less than 10%.

Paper Nr: 78
Title:

Real-time Pedestrian Detection in a Truck’s Blind Spot Camera

Authors:

Kristof Van Beeck and Toon Goedemé

Abstract: In this paper we present a multi-pedestrian detection and tracking framework targeting a specific application: detecting vulnerable road users in a truck’s blind spot zone. Research indicates that existing non-vision based safety solutions are not able to handle this problem completely. Therefore we aim to develop an active safety system which warns the truck driver if pedestrians are present in the truck’s blind spot zone. Our system solely uses the vision input from the truck’s blind spot camera to detect pedestrians. This is not a trivial task, since the application inherently requires real-time operation while at the same time attaining very high accuracy. Furthermore we need to cope with the large lens distortion and the extreme viewpoints introduced by the blind spot camera. To achieve this, we propose a fast and efficient pedestrian detection and tracking framework based on our novel perspective warping window approach. To evaluate our algorithm we recorded several realistically simulated blind spot scenarios with a genuine blind spot camera mounted on a real truck. We show that our algorithm achieves excellent accuracy results at real-time performance, using a single core CPU implementation only.

Paper Nr: 79
Title:

Statistical Shape Model for Simulation of Realistic Endometrial Tissue

Authors:

Sebastian Kurtek, Chafik Samir and Lemlih Ouchchane

Abstract: We propose a new framework for developing statistical shape models of endometrial tissues from real clinical data. Endometrial tissues naturally form cylindrical surfaces, and thus, we adopt, with modification, a recent Riemannian framework for statistical shape analysis of parameterized surfaces. This methodology is based on a representation of surfaces termed square-root normal fields (SRNFs), which enables invariance to all shape preserving transformations including translation, scale, rotation, and re-parameterization. We extend this framework by computing parametrization-invariant statistical summaries of endometrial tissue shapes, and random sampling from learned generative models. Such models are very useful for medical practitioners during different tasks such as diagnosing or monitoring endometriosis. Furthermore, real data in medical applications in general (and in particular in this application) is often scarce, and thus the generated random samples are a key step for evaluating segmentation and registration approaches. Moreover, this study allows us to efficiently construct a large set of realistic samples that can open new avenues for diagnosing and monitoring complex diseases when using automatic techniques from computer vision, machine learning, etc.

Paper Nr: 86
Title:

LISF: An Invariant Local Shape Features Descriptor Robust to Occlusion

Authors:

Leonardo Chang, Miguel Arias-Estrada, L. Enrique Sucar and José Hernández-Palancar

Abstract: In this work, an invariant shape feature extraction, description and matching method (LISF) for binary images is proposed. In order to balance discriminative power and robustness to noise and occlusion in the contour, local features are extracted from the contour to describe shape and are later matched globally. The proposed extraction, description and matching methods are invariant to rotation, translation, and scale, and present a certain robustness to partial occlusion. This invariance and robustness are validated by experiments on shape retrieval and classification tasks. Experiments were carried out on the Shape99, Shape216, and MPEG-7 datasets, where different artifacts were artificially added to obtain partial occlusion as high as 60%. For the highest occlusion levels, the proposed method outperformed other popular shape description methods, with about 20% higher bull’s eye score and 25% higher classification accuracy.

Paper Nr: 95
Title:

Rule Management for Information Extraction from Title Pages of Academic Papers

Authors:

Atsuhiro Takasu and Manabu Ohta

Abstract: This paper discusses the problem of managing rules for page layout analysis and information extraction. We have been developing a system to extract information from academic papers that exploits both page layout and textual information. For this purpose, a conditional random field (CRF) analyzer is designed according to the layout of the object pages. Because various layouts are used in academic papers, we must prepare a set of rules for each type of layout to achieve high extraction accuracy. As the number of papers in a system grows, rule management becomes a big problem. For example, when should we make a new set of rules, and how can we acquire them efficiently while receiving new articles? This paper examines two scores to measure the fitness of rules and the applicability of rules learned for another type of layout. We evaluate the scores for bibliographic information extraction from title pages of academic papers and show that they are effective for measuring the fitness. We also examine the sampling of training data when learning a new set of rules.

Paper Nr: 98
Title:

Particle Video for Crowd Flow Tracking - Entry-Exit Area and Dynamic Occlusion Detection

Authors:

Antoine Fagette, Patrick Jamet, Daniel Racoceanu and Jean-Yves Dufour

Abstract: In this paper we address the problem of flow tracking for dense crowds. For this purpose, we use a cloud of particles spread over the image according to the estimated crowd density and driven by the optical flow. This cloud of particles is considered statistically representative of the crowd. Therefore, each particle has physical properties that enable us to assess the validity of its behavior against that expected from a pedestrian, and to optimize its motion dictated by the optical flow. This leads to three applications described in this paper: the detection of the entry and exit areas of the crowd in the image, the detection of dynamic occlusions, and the possibility of linking entry areas with exit areas according to the flow of the pedestrians. We report the results of our experiments on synthetic data and show promising results.

Paper Nr: 128
Title:

Image Decolorization by Maximally Preserving Color Contrast

Authors:

Alex Yong-Sang Chia, Keita Yaegashi and Soh Masuko

Abstract: We propose a method to convert a color image to its gray representation with the objective that color contrast in the color image is maximally preserved as gray contrast in the gray image. Given a color image, we first extract the unique colors of the image through robust clustering of its color values. Based on the color contrast between these unique colors, we tailor a non-linear decolorization function that maximally preserves contrast in the gray image. A novelty here is the proposal of a color-gray feature that tightly couples color contrast with gray contrast information. We compute the optimal color-gray feature, and drive the search for a decolorization function that generates a color-gray feature most similar to the optimal one. This function is then used to convert the color image to its gray representation. Our experiments and user study demonstrate the greater effectiveness of this method in comparison to previous techniques.
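The idea of choosing a mapping that preserves pairwise color contrast as gray contrast can be illustrated with a deliberately simplified linear variant: search a small grid of RGB weight triples for the one whose gray differences best match the color differences. The paper's actual decolorization function is non-linear and feature-driven; this sketch only conveys the contrast-matching objective:

```python
import itertools
import numpy as np

def decolorize_weights(colors, steps=11):
    """Pick linear RGB-to-gray weights (summing to 1) that best preserve
    the pairwise Euclidean color contrast of the representative colors."""
    grid = np.linspace(0.0, 1.0, steps)
    pairs = list(itertools.combinations(range(len(colors)), 2))
    target = np.array([np.linalg.norm(colors[i] - colors[j]) for i, j in pairs])
    best, best_err = None, np.inf
    for r, g in itertools.product(grid, grid):
        if r + g > 1.0:
            continue
        w = np.array([r, g, 1.0 - r - g])
        gray = colors @ w
        # sqrt(3) rescales 1-D gray differences to the range of 3-D distances
        diff = np.array([abs(gray[i] - gray[j]) for i, j in pairs]) * np.sqrt(3)
        err = ((diff - target) ** 2).sum()
        if err < best_err:
            best, best_err = w, err
    return best
```

The resulting weights define a per-pixel decolorization g(c) = w·c chosen for the image's own palette, rather than the fixed luminance weights of standard grayscale conversion.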

Short Papers
Paper Nr: 13
Title:

Accurate X-corner Fiducial Marker Localization in Image Guided Surgery (IGS)

Authors:

Thomas Kerstein, Hubert Roth and Jürgen Wahrburg

Abstract: In this paper a novel approach for reliable detection and accurate localization of X-corner fiducial markers is presented, which is particularly designed for Image Guided Surgery (IGS). The key idea is to combine two meaningful basic topological characteristics to one boosted filter providing adequate detection reliability and localization accuracy. Additionally and in contrast to conventional, retroreflective planar or spherical markers, X-corner fiducials facilitate not only position measurements with high precision but provide additional orientation information for improving distinction of multiple fiducials arranged within a geometrical reference structure. Experiments reveal robustness to considerable perspective distortion as well as invariance to illumination changes. Furthermore the presented approach offers high computational efficiency and a high level of flexibility for application-specific system design.

Paper Nr: 14
Title:

Combining Text Semantics and Image Geometry to Improve Scene Interpretation

Authors:

Dennis Medved, Fangyuan Jiang, Peter Exner, Magnus Oskarsson, Pierre Nugues and Kalle Aström

Abstract: In this paper, we describe a novel system that identifies relations between the objects extracted from an image. We started from the idea that, in addition to the geometric and visual properties of the image objects, we could exploit lexical and semantic information from the text accompanying the image. As an experimental setup, we gathered a corpus of images from Wikipedia together with their associated articles. We extracted two types of objects, human beings and horses, and we considered three relations that could hold between them: Ride, Lead, or None. We used geometric features as a baseline to identify the relations between the entities, and we describe the improvements brought by the addition of bag-of-words features and predicate–argument structures derived from the text. The best semantic model resulted in a relative error reduction of more than 18% over the baseline.

Paper Nr: 29
Title:

Snow Side Wall Detection using a Single Camera

Authors:

Kazunori Onoguchi and Takahito Sato

Abstract: In areas where it snows heavily, snow removal on roads often cannot keep up with snowfall. In particular, community roads become too narrow for vehicles to pass each other, since their snow removal is insufficient compared with main streets. To obtain this information, this paper presents a novel method for measuring the distance between a vehicle and the snow wall at the road shoulder with a single camera. Our method creates an inverse perspective mapping (IPM) image by projecting the input image onto a virtual plane that is parallel to the moving direction of the vehicle and perpendicular to the road surface. The distance to the side wall is then calculated from a histogram whose bins are the lengths of the optical flows detected in the IPM image. The optical flow in the IPM image is detected by block matching, and the motion of the side wall is obtained from the peak of the histogram. Narrow roads are detected from measurements taken by several vehicles equipped with a backup camera. Our method is robust to changes in the appearance of the texture on the side wall that occur when a vehicle moves along a road.
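The histogram-peak step can be sketched compactly: given optical-flow lengths measured in the IPM image, the dominant (wall) motion is read off the most populated bin, while scattered outlier flows land in sparse bins. This is a minimal illustration with an assumed bin width, not the authors' implementation.

```python
from collections import Counter

def dominant_flow_length(flow_lengths, bin_width=1.0):
    """Histogram optical-flow lengths and return the center of the peak bin.

    flow_lengths: iterable of non-negative flow magnitudes (pixels).
    bin_width: assumed histogram bin width.
    """
    bins = Counter(int(length / bin_width) for length in flow_lengths)
    peak_bin, _ = max(bins.items(), key=lambda kv: kv[1])
    return (peak_bin + 0.5) * bin_width
```

Four coherent flows near 5.2 px and two outliers would yield a peak-bin estimate of 5.5 px, from which the wall distance would then be derived via the IPM geometry.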

Paper Nr: 40
Title:

Comparative Study of Two Segmentation Methods of Handwritten Arabic Text - MM-OIC and HT-MM

Authors:

Fethi Ghazouani, Samia Snoussi Maddouri and Fadoua Bouafif Samoud

Abstract: We present in this paper a comparative study of two segmentation methods for handwritten Arabic text. The first method, named MM-OIC, combines Mathematical Morphology (MM) with the algorithm for constructing the Outer Isothetic Cover (OIC) of a digital object. The second method, called HT-MM, uses the Hough Transform (HT) and MM to segment handwritten Arabic script. These methods are applied at two levels of segmentation: text lines and Pieces of Arabic Words (PAWs). The two methods are evaluated and compared on a set of documents selected from three databases: the IFN/ENIT database (17 documents) and the BSB (16 documents) and KSU (30 documents) online databases. The average line segmentation rate is 75% for MM-OIC and 45% for HT-MM. The average PAW segmentation rate reaches 89% for MM-OIC and 70% for HT-MM. The efficiency of the MM-OIC method is explained by the fact that it can extract the approximate form of the writing, and it can sometimes overcome problems specific to Arabic script such as overlapping lines and diacritical symbols.

Paper Nr: 44
Title:

Structure from Motion - ToF-aided 3D Reconstruction of Isometric Surfaces

Authors:

S. Jafar Hosseini and Helder Araújo

Abstract: This paper deals with structure-from-motion (SfM) for non-rigid surfaces that undergo isometric motion. Our SfM framework aims at the joint estimation of the 3D surface and the camera motion by combining a ToF range sensor and a monocular RGB camera through a template-based approach. Our goal is to use the 2D low-resolution depth estimates provided by the ToF camera to facilitate the estimation of non-rigid structure from the high-resolution images obtained by an RGB camera. In this paper, we model isometric surfaces with a triangular mesh. The ToF sensor is used to obtain the depth of a sparse set of 3D feature points, from which the depth of the mesh vertices can be recovered using a multivariate linear system. Subsequently, we form a non-linear constraint based on the projected length of each edge. A second non-linear constraint is then used for minimizing re-projection errors. These constraints are finally incorporated into an optimization scheme to solve for structure and motion. Experimental results show that the proposed approach has good performance even if only a low-resolution depth image is used.

Paper Nr: 52
Title:

Feature Extraction in PET Images for the Diagnosis of Alzheimer’s Disease

Authors:

João Duarte, Helena Aidos and Ana Fred

Abstract: Alzheimer’s disease accounts for an estimated 60% to 80% of cases of dementia, and its victims are mainly elderly people. Recently, several computer-aided diagnosis systems have been developed based on extracting information from FDG-PET scans. Three-dimensional FDG-PET images, under a voxel-as-feature approach, lead to high-dimensional feature spaces, which results in system performance problems. In order to reduce the dimensionality of these images, multi-scale methods may be used for feature extraction. We propose a multiscale approach for feature extraction from 3-dimensional images to improve the performance of a diagnosis system using clustering techniques. To evaluate the performance of our approach, we applied it to a database obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and compared it with the Gaussian pyramid technique. Experimental results show that the proposed approach is a good option for image feature reduction, outperforming the Gaussian pyramid technique.

Paper Nr: 55
Title:

Bio-inspired Metaheuristic based Visual Tracking and Ego-motion Estimation

Authors:

J. R. Siddiqui and S. Khatibi

Abstract: The problem of robust extraction of ego-motion from a sequence of images for an eye-in-hand camera configuration is addressed. A novel approach to planar template based tracking is proposed, which performs a non-linear image alignment and a planar similarity optimization to recover camera transformations from planar regions of a scene. The planar region tracking problem is solved as a motion optimization problem by maximizing the similarity among the planar regions of a scene. The optimization process employs an evolutionary metaheuristic approach in order to address the problem within a large non-linear search space. The proposed method is validated on image sequences with real as well as synthetic image datasets and found to be successful in recovering the ego-motion. A comparative analysis of the proposed method with various other state-of-the-art methods reveals that the algorithm succeeds in tracking the planar regions robustly and is comparable to the state-of-the-art methods. Such an application of evolutionary metaheuristics to complex visual navigation problems can provide a different perspective and could help improve existing methods.

Paper Nr: 74
Title:

Confidence-based Rank-level Fusion for Audio-visual Person Identification System

Authors:

Mohammad Rafiqul Alam, Mohammed Bennamoun, Roberto Togneri and Ferdous Sohel

Abstract: A multibiometric identification system establishes the identity of a person based on the biometric data presented to its sub-systems. Each sub-system compares the features extracted from the input against the templates of all identities stored in its gallery. In rank-level fusion, ranked lists from different sub-systems are combined to reach the final decision about an identity. However, state-of-the-art rank-level fusion methods assume that all sub-systems perform equally well under all conditions. In practice, the probe data may be affected by different degradations (e.g., illumination and pose variation in the face image, environmental noise, etc.), which affects the overall recognition accuracy. In this paper, robust confidence-based rank-level fusion methods are proposed that use confidence measures for all participating sub-systems. Experimental results show that the confidence-based approach to rank-level fusion achieves higher recognition rates than the state-of-the-art.
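A minimal way to picture confidence-based rank-level fusion is a Borda count whose per-sub-system scores are scaled by a confidence weight. The sketch below assumes this simple weighting; the paper's actual confidence measures and fusion rules are more elaborate.

```python
def confidence_weighted_borda(rank_lists, confidences):
    """Fuse ranked identity lists from several biometric sub-systems.

    rank_lists: one ranked list of identities per sub-system (best first).
    confidences: per-sub-system confidence weights in [0, 1].
    Returns identities sorted by fused score (best first).
    """
    scores = {}
    for ranking, conf in zip(rank_lists, confidences):
        n = len(ranking)
        for rank, identity in enumerate(ranking):
            # Borda score: higher for better ranks, scaled by confidence.
            scores[identity] = scores.get(identity, 0.0) + conf * (n - rank)
    return sorted(scores, key=scores.get, reverse=True)
```

With a confident face sub-system ranking A first and a noisy audio sub-system ranking A last, the fused list still places A on top, which is the intended behavior when one modality is degraded.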

Paper Nr: 80
Title:

Non-technical Losses Detection - Expert Labels vs. Inspection Labels in the Learning Stage

Authors:

Fernanda Rodríguez, Federico Lecumberry and Alicia Fernández

Abstract: Non-technical losses detection is a complex task with high economic impact. The diversity and large number of consumption records make it important to find an efficient automatic method that detects the largest number of frauds with the fewest expert hours spent on preprocessing and inspections. This article compares the performance of a strategy based on learning from expert labels (suspect/no-suspect) with one using inspection labels (fraud/no-fraud). Results show that the proposed framework, suitable for imbalanced problems, improves performance in terms of the F-measure when using inspection labels, avoiding hours of expert labeling.

Paper Nr: 84
Title:

A Multi-fonts Kanji Character Recognition Method for Early-modern Japanese Printed Books with Ruby Characters

Authors:

Taeka Awazu, Manami Fukuo, Masami Takata and Kazuki Joe

Abstract: The web site of the National Diet Library in Japan provides a large number of early-modern (AD 1868-1945) Japanese printed books to the public, but full-text search is essentially impossible. In order to perform advanced search over historical literature, automatic textualization of the images is required. However, the ruby system, which is peculiar to Japanese books, poses a serious obstacle to textualization. When existing OCRs are applied to early-modern Japanese printed books, the recognition rate is extremely low. To solve this problem, we have already proposed a multi-font Kanji character recognition method using the PDC feature and an SVM. In this paper, we propose a ruby character removal method for early-modern Japanese printed books using genetic programming, and evaluate our multi-font Kanji character recognition method on 1,000 types of early-modern Japanese printed Kanji characters.

Paper Nr: 93
Title:

Human Action Recognition for Real-time Applications

Authors:

Ivo Reznicek and Pavel Zemcik

Abstract: Action recognition in video is an important part of many applications. While the performance of action recognition has been intensively investigated, little research so far has addressed how long a sequence of video frames is needed to correctly recognize certain actions. This paper presents a new method for measuring the length of the video sequence necessary to recognize actions based on space-time feature points. This length is the key information needed to successfully recognize actions in real-time or performance-critical applications. The action recognition used in the presented approach is the state-of-the-art pipeline: vocabulary construction, bag-of-words representation, and SVM classification. The proposed method is experimentally evaluated on a human action recognition dataset.

Paper Nr: 106
Title:

Fusion of Audio-visual Features using Hierarchical Classifier Systems for the Recognition of Affective States and the State of Depression

Authors:

Markus Kächele, Michael Glodek, Dimitrij Zharkov, Sascha Meudt and Friedhelm Schwenker

Abstract: Reliable prediction of affective states in real world scenarios is very challenging and a significant amount of ongoing research is targeted towards improvement of existing systems. Major problems include the unreliability of labels, variations of the same affective states amongst different persons and in different modalities as well as the presence of sensor noise in the signals. This work presents a framework for adaptive fusion of input modalities incorporating variable degrees of certainty on different levels. Using a strategy that starts with ensembles of weak learners, gradually, level by level, the discriminative power of the system is improved by adaptively weighting favorable decisions, while concurrently dismissing unfavorable ones. For the final decision fusion the proposed system leverages a trained Kalman filter. Besides its ability to deal with missing and uncertain values, in its nature, the Kalman filter is a time series predictor and thus a suitable choice to match input signals to a reference time series in the form of ground truth labels. In the case of affect recognition, the proposed system exhibits superior performance in comparison to competing systems on the analysed dataset.

Paper Nr: 113
Title:

Hierarchical Energy-transfer Features

Authors:

Radovan Fusek, Eduard Sojka, Karel Mozdřeň and Milan Šurkala

Abstract: In this paper, we propose novel and efficient object descriptors designed to describe the appearance of objects, called Hierarchical Energy-Transfer Features (HETF). The main idea behind HETF is that the shape of objects can be described by a function of energy distribution. In the image, the transfer of energy is computed using physical laws. The function of the energy distribution is obtained by sampling after the energy transfer process; the image is divided into cells of variable sizes and the values of the function are investigated inside each cell. The proposed descriptors achieve very good detection results compared with state-of-the-art methods (e.g., Haar, HOG, and LBP features). We show the robustness of the descriptors on the face detection problem.

Paper Nr: 126
Title:

Region-based Abnormal Motion Detection in Video Surveillance

Authors:

Jorge Henrique Busatto Casagrande and Marcelo Ricardo Stemmer

Abstract: This article proposes a method to detect abnormal motion based on the subdivision of regions of interest in the scene. The method reduces the large amount of data generated in a tracking-based approach, as well as the corresponding computational cost in the training phase. The regions are spatially identified and contain the transition vectors resulting from the centroid tracking of multiple moving objects. On these data, we apply one-class supervised training with a set of normal tracks on Gaussian mixtures to find relevant clusters that discriminate the trajectories of objects. The lowest transition-vector probability is used as the threshold to detect abnormal motions. ROC (Receiver Operating Characteristic) curves are used for this task and also to determine the efficiency of the model for each size increment of the region grid. The results show that there is a range of grid sizes that ensures the best margin of correct abnormal motion detection for each type of scenario, even with a significant reduction in data samples.
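The thresholding idea can be illustrated with a deliberately simplified novelty detector: a single diagonal Gaussian fitted by moments stands in for the paper's Gaussian mixtures, and a transition vector is flagged abnormal when its log-density falls below a training-derived threshold. All names and the single-Gaussian simplification are illustrative assumptions.

```python
import math

def fit_gaussian(vectors):
    """Moment-fit a diagonal 2D Gaussian to training transition vectors."""
    n = len(vectors)
    mean = [sum(v[d] for v in vectors) / n for d in (0, 1)]
    # Small floor on the variance avoids division by zero for flat dimensions.
    var = [sum((v[d] - mean[d]) ** 2 for v in vectors) / n + 1e-9
           for d in (0, 1)]
    return mean, var

def log_density(v, mean, var):
    """Log-density of a 2D point under the diagonal Gaussian."""
    return sum(-0.5 * math.log(2 * math.pi * var[d])
               - (v[d] - mean[d]) ** 2 / (2 * var[d]) for d in (0, 1))

def is_abnormal(v, mean, var, threshold):
    # Mirrors the paper's rule: the lowest probability seen on normal
    # training tracks serves as the detection threshold.
    return log_density(v, mean, var) < threshold
```

A transition vector far from the normal cluster is flagged, while any vector from the training set scores at or above the threshold by construction.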

Paper Nr: 132
Title:

External Vision based Robust Pose Estimation System for a Quadrotor in Outdoor Environments

Authors:

Wei Zheng, Fan Zhou and Zengfu Wang

Abstract: In this paper, an external vision based robust pose estimation system for a quadrotor in outdoor environments is proposed. This system can provide approximate ground truth for the pose of a quadrotor outdoors, whereas most external vision based systems operate indoors. We do not modify the architecture of the quadrotor or attach colored blobs or LEDs to it. Using only the quadrotor's own features, we present a novel robust pose estimation algorithm to obtain the accurate pose of the quadrotor. With good observations, we detect all four rotors and calculate the pose. When fewer than four rotors are observed, however, existing external vision based systems do not address this case and cannot recover the correct pose. In this paper, we solve this problem and obtain accurate pose estimates with the help of IMU (inertial measurement unit) data. We demonstrate in real experiments that the vision-based pose estimation system for outdoor environments performs accurately and robustly in real time.

Paper Nr: 147
Title:

Towards Automated Video Analysis of Sensorimotor Assessment Data

Authors:

Ana B. Graciano Fouquier, Séverine Dubuisson, Isabelle Bloch and Anja Klöeckner

Abstract: Sensorimotor assessment aims at evaluating sensorial and motor capabilities of children who are likely to present a pervasive developmental disorder, such as autism. It relies on playful activities proposed by a psychomotrician expert to the child, with the intent of observing how the latter responds to various physical and cognitive stimuli. Each session is recorded so that the psychomotrician can use the video as a support for reviewing in-session impressions and drawing final conclusions. These recordings carry a wealth of information that could be exploited for research purposes and contribute to a better understanding of autism spectrum disorders. However, the systematic inspection of these data by clinical professionals would be time-consuming and impracticable. In order to make these analyses feasible, we discuss a computer vision approach to extract behavior information from the visual data acquired throughout assessment sessions.

Paper Nr: 164
Title:

Automatic ATM Fraud Detection as a Sequence-based Anomaly Detection Problem

Authors:

Maik Anderka, Timo Klerx, Steffen Priesterjahn and Hans Kleine Büning

Abstract: Because of the direct access to cash and customer data, automated teller machines (ATMs) are the target of manifold attacks and fraud. To counter this problem, modern ATMs utilize specialized hardware security systems that are designed to detect particular types of attacks and manipulation. However, such systems do not provide any protection against future attacks that are unknown at design time. In this paper, we propose an approach that is able to detect known as well as unknown attacks on ATMs and that does not require additional security hardware. The idea is to utilize automatic model generation techniques to learn patterns of normal behavior from the status information of standard devices comprised in an ATM; a significant deviation from the learned behavior is an indicator of a fraud attempt. We cast the identification of ATM fraud as a sequence-based anomaly detection problem, and we describe three specific methods that implement our approach. An empirical evaluation using a real-world data set that has been recorded on a public ATM within a time period of nine weeks shows promising results and underlines the practical applicability of the proposed approach.

Paper Nr: 176
Title:

On the Bin Number Choice of Joint Histogram Estimation Applied to Mutual Information based Face Recognition

Authors:

Abdenour Hacine-Gharbi and Philippe Ravier

Abstract: In this paper, we investigate the binning problem of joint histogram estimation applied to mutual information based face recognition. Classical approaches to histogram estimation tend to fix the bin numbers empirically. In this work, we evaluate several state-of-the-art rules for automatically choosing the bin numbers. The face recognition problem is studied for both local and holistic methods. The performance of each choice is evaluated on the AT&T database with a single sample in the training set. The results show that better recognition accuracy can be achieved with data-driven bin number choices rather than fixed bin numbers. For the local method, the results show a higher robustness of the automatic vs. fixed bin number choice when the regions become smaller.
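Typical data-driven bin-number rules of the kind evaluated in such studies include Sturges' rule, Scott's rule and the Freedman-Diaconis rule. The sketch below (with crude quartile estimates) illustrates these standard rules; it is not claimed to be the exact rule set used in the paper.

```python
import math

def sturges_bins(n):
    """Sturges' rule: k = ceil(log2 n) + 1."""
    return math.ceil(math.log2(n)) + 1

def scott_bins(data):
    """Scott's rule: bin width h = 3.49 * sigma * n^(-1/3)."""
    n = len(data)
    mean = sum(data) / n
    sigma = math.sqrt(sum((x - mean) ** 2 for x in data) / n)
    h = 3.49 * sigma * n ** (-1 / 3)
    return max(1, math.ceil((max(data) - min(data)) / h))

def freedman_diaconis_bins(data):
    """Freedman-Diaconis rule: bin width h = 2 * IQR * n^(-1/3)."""
    n = len(data)
    s = sorted(data)
    q1, q3 = s[n // 4], s[(3 * n) // 4]   # crude quartile estimates
    h = 2 * (q3 - q1) * n ** (-1 / 3)
    return max(1, math.ceil((max(data) - min(data)) / h))
```

Because the data-driven rules adapt to the spread of the sample, two datasets of the same size can end up with very different bin counts, which is exactly the effect the paper measures on recognition accuracy.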

Paper Nr: 181
Title:

Live Stream Oriented Age and Gender Estimation using Boosted LBP Histograms Comparisons

Authors:

Lionel Prevost, Philippe Phothisane and Erwan Bigorgne

Abstract: Research has recently focused on human age and gender estimation because they are useful cues in many applications such as human-machine interaction, soft biometrics or demographic statistics for marketing. Even though human perception of other people’s age is often biased, attaining this kind of precision with an automatic estimator is still a difficult challenge. In this paper, we propose a real time face tracking framework that includes a sequential estimation of people’s gender, then age. A single gender estimator and several gender-specific age estimators are trained using a boosting scheme, and their decisions are combined to output a gender and an age in years. We choose to train all these estimators using local binary pattern histograms extracted from still facial images. The whole process is thoroughly tested on state-of-the-art databases and video sets. Results on the popular FG-NET database are comparable to human perception (overall 70% correct responses within 5 years tolerance and almost 90% within 10 years tolerance). The age and gender estimators can output decisions at 21 frames per second. Combined with the face tracker, they provide real-time estimations of age and gender.

Paper Nr: 182
Title:

Classification of the Liver Tumors using Multiresolution, Superior Order EOCM Textural Features

Authors:

Delia Mitrea, Sergiu Nedevschi and Radu Badea

Abstract: The non-invasive diagnosis of tumors is a major issue in current research. Our purpose is to elaborate computerized methods in order to perform an accurate characterization and automatic diagnosis of these tumors, using ultrasound image information. We defined the textural model of these tumors, consisting of the textural features that were the most relevant for their characterization, together with the specific values of these textural features. We also defined the superior order generalized cooccurrence matrices and the associated Haralick features, in order to improve the performance of the automatic recognition. In this work, we computed the Edge Orientation Cooccurrence Matrix (EOCM) of order three and the associated Haralick features at multiple resolutions, after applying the Wavelet transform recursively. We analyzed the relevance of the newly defined textural features in comparison with the former textural features, and then assessed the improvement in classification performance due to the final set of relevant textural features. For the experiments, we considered some of the most frequent liver tumors: the hepatocellular carcinoma (HCC), the most frequent malignant liver tumor, and the hemangioma, an important benign liver tumor. We also considered, for comparison, the cirrhotic parenchyma on which the HCC tumor had evolved.

Posters
Paper Nr: 6
Title:

KOSHIK - A Large-scale Distributed Computing Framework for NLP

Authors:

Peter Exner and Pierre Nugues

Abstract: In this paper, we describe KOSHIK, an end-to-end framework to process the unstructured natural language content of multilingual documents. We used the Hadoop distributed computing infrastructure to build this framework as it enables KOSHIK to easily scale by adding inexpensive commodity hardware. We designed an annotation model that allows the processing algorithms to incrementally add layers of annotation without modifying the original document. We used the Avro binary format to serialize the documents. Avro is designed for Hadoop and allows other data warehousing tools to directly query the documents. This paper reports the implementation choices and details of the framework, the annotation model, the options for querying processed data, and the parsing results on the English and Swedish editions of Wikipedia.

Paper Nr: 23
Title:

Human Activity Recognition Framework in Monitored Environments

Authors:

O. León, M. P. Cuellar, M. Delgado, Y. Le Borgne and G. Bontempi

Abstract: This work addresses the problem of the recognition of human activities in Ambient Assisted Living (AAL) scenarios. The ultimate goal of a good AAL system is to learn and recognise behaviours or routines of the person or people living at home, in order to help them if something unusual happens. In this paper, we explore the advances in unobtrusive depth camera-based technologies to detect human activities involving motion. We explore the benefits of a framework for gesture recognition in this field, in contrast to raw signal processing techniques. For the framework validation, Hidden Markov Models and Dynamic Time Warping have been implemented for the action learning and recognition modules as a baseline, due to their well known results in the field. The results obtained after the experimentation suggest that the depth sensors are accurate enough and useful in this field, and also that the preprocessing framework studied may result in a suitable methodology.

Paper Nr: 24
Title:

Automatic Polyp Detection using DSC Edge Detector and HOG Features

Authors:

Himanshu Agrahari, Yuji Iwahori, M. K. Bhuyan, Somnath Ghorai, Himanshu Kohli, Robert J. Woodham and Kunio Kasugai

Abstract: Endoscopy is a very powerful technology for examining the intestinal tract and detecting the presence of possible abnormalities such as polyps, the main cause of cancer. This paper presents an edge based method for polyp detection in endoscopic video images. It utilizes the discrete singular convolution (DSC) algorithm for the edge detection/segmentation scheme; then, using conic fitting techniques (ellipse and hyperbola), potential candidates are determined. These candidates are first rotated so that the major axis lies along the x-axis direction, and then classified as polyp or non-polyp by an SVM classifier trained separately for ellipses and hyperbolas with HOG features.

Paper Nr: 37
Title:

Detection of Prostate Abnormality within the Peripheral Zone using Local Peak Information

Authors:

Andrik Rampun, Paul Malcolm and Reyer Zwiggelaar

Abstract: In this paper, a fully automatic method is proposed for the detection of prostate cancer within the peripheral zone. The method starts by filtering noise in the original image, followed by feature extraction and smoothing based on the Discrete Cosine Transform. Next, we identify the peripheral zone area using a quadratic equation and divide it into left and right regions. Subsequently, peak detection is performed on both regions. Finally, we calculate the percentage similarity and Ochiai coefficients to decide whether an abnormality is present. The initial evaluation of the proposed method is based on 90 prostate MRI images from 25 patients; 82.2% (sensitivity/specificity: 0.81/0.84) of the slices were classified correctly, with 8.9% false negative and false positive results.
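The Ochiai coefficient used in the final decision step is the set-based analogue of cosine similarity. A minimal sketch, assuming the peaks detected in the left and right regions are represented as sets of quantized peak locations:

```python
import math

def ochiai(a, b):
    """Ochiai coefficient between two sets: |A ∩ B| / sqrt(|A| * |B|).

    Returns 1.0 for identical sets and 0.0 for disjoint (or empty) sets.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))
```

A low coefficient between the two regions' peak sets would indicate left/right asymmetry, which in the paper's setting is a cue for abnormality.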

Paper Nr: 39
Title:

Applying Machine Learning Techniques to Baseball Pitch Prediction

Authors:

Michael Hamilton, Phuong Hoang, Lori Layne, Joseph Murray, David Padget, Corey Stafford and Hien Tran

Abstract: Major League Baseball, a professional baseball league in the US and Canada, is one of the most popular sports leagues in North America. Partially because of its popularity and the wide availability of data from games, baseball has become the subject of significant statistical and mathematical analysis. Pitch analysis is especially useful for helping a team better understand the pitch behavior it may face during a game, allowing the team to develop a corresponding batting strategy to combat the predicted pitch behavior. We apply several common machine learning classification methods to PITCH f/x data to classify pitches by type. We then extend the classification task to prediction by utilizing features only known before a pitch is thrown. By performing significant feature analysis and introducing a novel approach for feature selection, moderate improvement over former results is achieved.

Paper Nr: 42
Title:

A Dynamic Hybrid Local-spatial Interest Point Matching Algorithm for Articulated Human Body Tracking

Authors:

Alireza Dehghani and Alistair Sutherland

Abstract: Current interest point (IP) matching algorithms are either local-based or spatial-based. We propose a hybrid local-spatial IP matching algorithm for articulated human body tracking. The first stage is local-based and finds matched pairs of IPs from two lists of reference and target IPs through a local-feature-descriptors-based matching method. The second stage of the algorithm is spatial-based. It starts with the confidently matched pairs of the previous stage, and recovers more matched pairs from the remaining unmatched IPs through graph matching and cyclic string matching. To compensate for the problem of Reference List Leakage (RLL), which decreases the number of reference IPs throughout the frame sequence and causes tracking failure, an IP List Scoring and Refinement (LSR) strategy is proposed to maintain the number of reference IPs around a specific level. Experimental results show that the proposed algorithm not only increases the precision rate from 61.53% to 97.81%, but also improves the recall rate from 52.33% to 96.40%.

Paper Nr: 51
Title:

A Pixel Labeling Framework for Comparing Texture Features - Application to Digitized Ancient Books

Authors:

Maroua Mehri, Petra Gomez-Krämer, Pierre Héroux, Alain Boucher and Rémy Mullot

Abstract: In this article, a complete framework for the comparative analysis of texture features is presented and evaluated for the segmentation and characterization of ancient book pages. Firstly, the content of an entire book is characterized by extracting the texture attributes of each page. The extraction of the texture features is based on a multiresolution analysis. Secondly, a clustering approach is performed in order to automatically classify the homogeneous regions of book pages. Namely, two approaches based on two different statistical categories of texture features, autocorrelation and co-occurrence, are compared in order to segment the content of ancient book pages and find homogeneous regions with little a priori knowledge. By computing several clustering and classification accuracy measures, the results of the comparison show the effectiveness of the proposed framework. Tests on different book contents (text vs. graphics, manuscript vs. printed) show that these texture features are more suitable for distinguishing textual regions from graphical ones than for distinguishing text fonts.

Paper Nr: 58
Title:

Tracking by Shape with Deforming Prediction for Non-rigid Objects

Authors:

Kenji Nishida, Takumi Kobayashi and Jun Fujiki

Abstract: A novel algorithm for tracking by shape with deforming prediction is proposed. The algorithm is based on the similarity between the predicted and actual object shapes. A second order approximation of feature point movement by Taylor expansion is adopted for shape prediction, and the similarity is measured using chamfer matching between the predicted and actual shapes. Chamfer matching is also used to detect feature point movements in order to predict the object deformation. The proposed algorithm is applied to the tracking of a skier and shows good tracking and shape prediction performance.
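Chamfer matching scores how well one edge map fits another by averaging, over the points of one shape, the distance to the nearest point of the other shape. A brute-force sketch for small point sets follows; practical implementations precompute a distance transform of the target edge image instead of this nested search.

```python
def chamfer_distance(edge_a, edge_b):
    """One-directional chamfer distance between two 2D point sets.

    edge_a, edge_b: lists of (x, y) edge-pixel coordinates.
    Returns the mean distance from each point of edge_a to its nearest
    point in edge_b (0.0 when edge_a is contained in edge_b).
    """
    def nearest(p, pts):
        return min(((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
                   for q in pts)
    return sum(nearest(p, edge_b) for p in edge_a) / len(edge_a)
```

In a tracker like the one described, the predicted shape would be swept over candidate placements and the placement minimizing this score taken as the match.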

Paper Nr: 66
Title:

Time-segmentation- and Position-free Recognition from Video of Air-drawn Gestures and Characters

Authors:

Yuki Nitsuma, Syunpei Torii, Yuichi Yaguchi and Ryuichi Oka

Abstract: We report on the recognition, from video streams, of isolated alphabetic characters and connected cursive characters, such as alphabetic, hiragana and kanji characters, drawn in the air. This topic involves a number of difficult problems in computer vision, such as the segmentation and recognition of complex motion from video. We utilize an algorithm called time-space continuous dynamic programming (TSCDP) that realizes both time- and location-free (spotting) recognition. Spotting means that prior segmentation of the input video is not required. Each of the reference (model) characters used is represented by a single stroke composed of pixels. We conducted two experiments involving the recognition of 26 isolated alphabetic characters and 23 Japanese hiragana and kanji air-drawn characters. Moreover, we conducted gesture recognition experiments based on TSCDP and showed that TSCDP is free from many restrictions imposed on conventional methods.

Paper Nr: 73
Title:

3D Shape Retrieval using Uncertain Semantic Query - A Preliminary Study

Authors:

Hattoibe Aboubacar, Vincent Barra and Gaëlle Loosli

Abstract: Recent technological progress has contributed to a huge increase in the number of 3D models available in digital form. Numerous applications have been developed to deal with this amount of information, especially for 3D shape retrieval. One of the main issues is to bridge the semantic gap between the shapes desired by users and the shapes returned by retrieval methods. In this paper, we propose an algorithm to address this issue. First, the user gives a semantic request. Second, a fuzzy 3D-shape generator sketches out suitable 3D shapes. Those shapes are filtered by the user or a learning machine to select the ones that match the semantic query. Then, we use a state-of-the-art retrieval method to return real-world 3D shapes that match this semantic query. This algorithm is used to retrieve objects in the SHREC’07 database. The results are promising.

Paper Nr: 77
Title:

Benchmarking Binarisation Techniques for 2D Fiducial Marker Tracking

Authors:

Yves Rangoni and Eric Ras

Abstract: This paper proposes a comparative study of different binarisation techniques for 2D fiducial marker tracking. The application domain is the recognition of objects for Tangible User Interfaces (TUI) using a tabletop solution. In this case, the common technique is to use markers, attached to the objects, which can be identified using camera-based pattern recognition techniques. Among the different operations that lead to a good recognition of these markers, the binarisation of the greyscale image is the most critical step. We propose to investigate how this important step can be improved not only in terms of quality but also in terms of computational efficiency. State-of-the-art thresholding techniques are benchmarked on this challenging task. A real-world tabletop TUI is used to perform an objective and goal-oriented evaluation through the ReacTIVision framework. A computationally efficient implementation of one of the best window-based thresholders is proposed in order to satisfy the real-time processing of a video stream. The experimental results reveal that an improvement of up to 10 points of the fiducial tracking recognition rate can be reached when selecting the right thresholder over the embedded method, while being more robust and still remaining time-efficient.
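As an illustration of the window-based thresholding family benchmarked in such studies, Sauvola's local threshold can be sketched as follows. This is a naive per-pixel version with assumed parameter values for k and R (efficient implementations use integral images), not the implementation evaluated in the paper:

```python
import statistics

def sauvola_threshold(image, x, y, win=3, k=0.2, R=128):
    """Sauvola local threshold T = m * (1 + k * (s/R - 1)), where m and s
    are the mean and standard deviation of the window around (x, y)."""
    h, w = len(image), len(image[0])
    half = win // 2
    patch = [image[j][i]
             for j in range(max(0, y - half), min(h, y + half + 1))
             for i in range(max(0, x - half), min(w, x + half + 1))]
    m = statistics.fmean(patch)
    s = statistics.pstdev(patch)
    return m * (1 + k * (s / R - 1))

def binarise(image, **kw):
    """Pixel becomes foreground (0) if below its local threshold, else 1."""
    return [[0 if image[y][x] < sauvola_threshold(image, x, y, **kw) else 1
             for x in range(len(image[0]))] for y in range(len(image))]
```

Unlike a single global threshold, the window statistics let a dark marker stroke survive uneven tabletop illumination.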

Paper Nr: 83
Title:

Human Action Description Based on Temporal Pyramid Histograms

Authors:

Yingying Liu and Arcot Sowmya

Abstract: In this paper, we present an approach to action description based on temporal pyramid histograms. Bag of features is a widely used action recognition framework based on local features, for example spatio-temporal feature points. Although it outperforms other approaches on several public datasets, sequencing information is ignored. Instead of only calculating the occurrence of code words, we also encode their temporal layout in this work. The proposed temporal pyramid histograms descriptor is a set of histogram atoms generated from the original video clip and its subsequences. To classify actions based on the temporal pyramid histograms descriptor, we design a function to calculate the weights of the histogram atoms according to the corresponding sequence lengths. We test the descriptor using nearest neighbour for classification. Experimental results show that, in comparison to the state-of-the-art, our description approach improves action recognition accuracy.
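The idea of encoding temporal layout on top of a bag-of-features histogram can be sketched as follows. This is a toy version with an assumed two-level pyramid and length-proportional weights; the paper's exact weighting function is only described qualitatively in the abstract:

```python
from collections import Counter

def temporal_pyramid(codewords, levels=2, vocab=4):
    """Concatenate normalised codeword histograms of the full clip and
    its halves, each histogram atom weighted by its sequence length."""
    atoms = []
    n = len(codewords)
    for level in range(levels):
        parts = 2 ** level
        for p in range(parts):
            seg = codewords[p * n // parts:(p + 1) * n // parts]
            weight = len(seg) / n          # longer segments count more
            hist = Counter(seg)
            atoms.extend(weight * hist[w] / max(len(seg), 1)
                         for w in range(vocab))
    return atoms

a = temporal_pyramid([0, 0, 1, 1], vocab=2)
b = temporal_pyramid([1, 1, 0, 0], vocab=2)
# The level-0 histograms agree, but the pyramid separates the orderings.
print(a[:2] == b[:2], a != b)  # True True
```

This is exactly the sequencing information a plain bag-of-features representation throws away: two clips with the same codewords in reverse order get identical level-0 histograms but different pyramids.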

Paper Nr: 97
Title:

Use of Multiple Low Level Features to Find Interesting Regions

Authors:

Michael Borck, Geoff West and Tele Tan

Abstract: Vehicle-based mobile mapping systems capture co-registered imagery and 3D point cloud information over hundreds of kilometres of transport corridor. Methods for extracting information from these large datasets are labour intensive, and automatic methods are desired. In addition, such methods need to be easily configured by non-expert users to detect and measure many classes of objects. This paper describes a workflow that takes a large number of image and depth features and uses machine learning to generate an object detection system that is fast to configure and run. The output is a high detection rate for the objects of interest, but with an acceptable number of false alarms. This is desirable because the output is fed into a more complex, and hence more computationally expensive, analysis system that rejects the false alarms and measures the remaining objects. Image and depth features from bounding boxes around objects of interest and random background are used for training with some popular learning algorithms. The interface allows a non-expert user to observe the performance and make modifications to improve it.

Paper Nr: 101
Title:

Video Object Recognition and Modeling by SIFT Matching Optimization

Authors:

Alessandro Bruno, Luca Greco and Marco La Cascia

Abstract: In this paper we present a novel technique for object modeling and object recognition in video. Given a set of videos containing 360 degree views of objects, we compute a model for each object; we then analyze short videos to determine if the object depicted in the video is one of the modeled objects. The object model is built from a video spanning a 360 degree view of the object taken against a uniform background. In order to create the object model, the proposed technique selects a few representative frames from each video and extracts local features from those frames. The object recognition is performed by selecting a few frames from the query video, extracting local features from each frame, and looking for matches in all the representative frames constituting the models of all the objects. If the number of matches exceeds a fixed threshold, the corresponding object is considered the recognized object. To evaluate our approach we acquired a dataset of 25 videos representing 25 different objects and used these videos to build the object models. Then we took 25 test videos containing only one of the known objects and 5 videos containing only unknown objects. Experiments showed that, despite a significant compression in the model, recognition results are satisfactory.
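The match-count decision rule described above can be sketched with generic descriptors. Toy 2-D vectors stand in for 128-D SIFT descriptors, and the ratio test and threshold values are illustrative assumptions, not the paper's settings:

```python
import math

def match_count(query_desc, model_desc, ratio=0.8):
    """Count query descriptors whose nearest model descriptor passes
    Lowe's ratio test (nearest clearly closer than second nearest)."""
    count = 0
    for q in query_desc:
        d = sorted(math.dist(q, m) for m in model_desc)
        if len(d) >= 2 and d[0] < ratio * d[1]:
            count += 1
    return count

def recognise(query_desc, models, threshold=2):
    """Return the model name with the most matches if the count reaches
    the threshold; otherwise report the object as unknown (None)."""
    best = max(models, key=lambda name: match_count(query_desc, models[name]))
    return best if match_count(query_desc, models[best]) >= threshold else None

models = {"cup": [(0, 0), (1, 0), (5, 5)], "pen": [(9, 9), (8, 8)]}
query = [(0, 0.1), (1, 0.1), (5, 5.1)]
print(recognise(query, models))  # cup
```

The threshold is what lets the system reject the "unknown object" test videos: if no model accumulates enough matches, nothing is recognized.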

Paper Nr: 107
Title:

Semi-Automated Identification of Leopard Frogs

Authors:

Dijana Petrovska-Delacrétaz, Aaron Edwards, John Chiasson, Gérard Chollet and David S. Pilliod

Abstract: Principal component analysis is used to implement a semi-automatic recognition system to identify recaptured northern leopard frogs (Lithobates pipiens). Results of both open set and closed set experiments are given. The presented algorithm is shown to provide accurate identification of 209 individual leopard frogs from a total set of 1386 images.

Paper Nr: 111
Title:

Robust Object Tracking using Log-Gabor Filters and Color Histogram

Authors:

Oumaima Sliti, Chekib Gmati, Fouzi Benzarti and Hamid Amiri

Abstract: The performance of a tracking algorithm relies heavily on the accuracy of the target's structural information. In this paper, we propose a robust object tracking method based on log-Gabor texture and a color histogram. Our hypothesis is that adding log-Gabor filter responses to the color features, and embedding them in the mean shift framework, will notably enhance tracking performance. Compared with several state-of-the-art mean shift trackers, our approach extracts the target information efficiently. Experimental results on various challenging videos show that the proposed method improves the tracking with fewer mean shift iterations.

Paper Nr: 121
Title:

SuperResolution-aided Recognition of Cytoskeletons in Scanning Probe Microscopy Images

Authors:

Sara Colantonio, Mario D'Acunto, Marco Righi and Ovidio Salvetti

Abstract: In this paper, we discuss the possibility of adopting SuperResolution (SR) methods as an important preparatory step for Pattern Recognition, so as to improve the accuracy of image content recognition and identification. SR mainly deals with the task of deriving a high-resolution image from one or multiple low-resolution images of the same scene. The high-resolved image is a more precise image whose content is enriched with information hidden among the pixels of the original low-resolution image(s), and corresponds to a more faithful representation of the imaged scene. Such enriched content represents a better sample of the scene, which can be profitably used by Pattern Recognition algorithms. A real application scenario is discussed, dealing with the recognition of cell skeletons in Scanning Probe Microscopy (SPM) images using single-image SR. Results show that SR allows us to detect and recognize important information barely visible in the original low-resolution image.

Paper Nr: 133
Title:

Face Templates Creation for Surveillance Face Recognition System

Authors:

Tobias Malach and Jiri Prinosil

Abstract: This paper addresses the problem of face template creation for a facial recognition system. The application of a face recognition system in real-world conditions requires compact and representative face templates in order to maintain a low error rate and low classification time. Contemporary face template creation methods are not suitable for face recognition systems with a large number of users, as they produce many templates per person. These templates are often redundant, and their high number requires long classification times. The paper presents four approaches to face template creation that produce one to three face templates per person. The influence of the different face template creation approaches was assessed on the PubFig and IFaViD databases. The achieved results show that appropriate face template creation methods have a significant influence on face recognition system performance.

Paper Nr: 146
Title:

Automatic Identification of Motor Patterns Leading to Freezing of Gait in Parkinson’s Disease - An Exploratory Study

Authors:

Luca Palmerini, Laura Rocchi, Jeffrey M. Hausdorff and Lorenzo Chiari

Abstract: Freezing of gait (FOG) is a common and disabling gait disturbance among patients with advanced Parkinson’s Disease (PD). FOG episodes are often overcome using attention or cues from the environment. Hence, identification of events prior to FOG may be very effective for improving mobility in PD patients. Previous work has suggested that there are changes in the gait pattern just prior to freezing. Nonetheless, little work has been done to explore the possibility of identifying motor patterns that are characteristic of the pre-FOG phase (a few seconds before the FOG). We analysed the acceleration signals from sensors worn on the ankle, thigh, and trunk of eight patients with PD who experienced freezing. We translated windows of the raw signals into symbols by using Symbolic Aggregate approXimation. The aim was to discriminate the patterns of symbols characterizing pre-FOG from the ones characterizing normal activity (standing and walking with no FOG). Sensitivity over 50% and specificity over 70% were obtained by using a classifier on the symbolic data, with different combinations of sensor position, sampling rate, and window duration. These preliminary findings demonstrate that it is possible to automatically identify (some of) the motor patterns that eventually lead to FOG events before they occur by using wearable sensors.
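The Symbolic Aggregate approXimation step used to turn acceleration windows into symbol strings can be sketched as follows. This is a standard SAX sketch with an assumed 4-symbol alphabet, not the paper's exact configuration:

```python
import statistics

# Breakpoints dividing a standard normal distribution into 4 equiprobable
# regions, one per symbol of a 4-letter alphabet.
BREAKPOINTS = [-0.6745, 0.0, 0.6745]

def sax(signal, n_segments=4, alphabet="abcd"):
    """Symbolic Aggregate approXimation: z-normalise the window, average
    over equal-width segments (PAA), then map each mean to a symbol."""
    mu = statistics.fmean(signal)
    sigma = statistics.pstdev(signal) or 1.0   # guard against flat signals
    z = [(v - mu) / sigma for v in signal]
    n = len(z)
    word = []
    for s in range(n_segments):
        seg = z[s * n // n_segments:(s + 1) * n // n_segments]
        m = statistics.fmean(seg)
        idx = sum(m > b for b in BREAKPOINTS)  # which region the mean falls in
        word.append(alphabet[idx])
    return "".join(word)

print(sax([0, 1, 2, 3, 4, 5, 6, 7]))  # abcd
```

A classifier can then compare these short words, so that pre-FOG windows and normal-activity windows are discriminated by their symbol patterns rather than by the raw accelerations.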

Paper Nr: 154
Title:

A Neural Network Approach for Human Gesture Recognition with a Kinect Sensor

Authors:

T. D’Orazio, C. Attolico, G. Cicirelli and C. Guaragnella

Abstract: Service robots are expected to be used in many households in the near future, provided that proper interfaces are developed for human-robot interaction. Gesture recognition has been recognized as a natural way of communication, especially for elderly or impaired people. With the development of new technologies and the wide availability of inexpensive depth sensors, real-time gesture recognition has been addressed using depth information, avoiding the limitations due to complex backgrounds and lighting conditions. In this paper the Kinect depth camera and the OpenNI framework have been used to obtain real-time tracking of the human skeleton. Then, robust and significant features have been selected to get rid of unrelated features and decrease the computational cost. These features are fed to a set of neural network classifiers that recognize ten different gestures. Several experiments demonstrate that the proposed method works effectively. Real-time tests prove the robustness of the method for the realization of human-robot interfaces.

Paper Nr: 155
Title:

Pattern-based Classification of Rhythms

Authors:

Johannes Fliege, Frank Seifert and André Richter

Abstract: We present a pattern-based approach for the rhythm classification task that combines an Auto-correlation function (ACF) and Discrete Fourier transform (DFT). Rhythm hypotheses are first extracted from symbolic input data, e.g. MIDI, by the ACF. These hypotheses are analysed by the use of DFT to remove duplicates before the classification process. The classification of rhythms is performed using ACF in combination with rhythm patterns contained in a knowledge base. We evaluate this method using pre-labelled input data and discuss our results. We show that a knowledge-based approach is reasonable to address the problem of rhythm classification for symbolic data.
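The first stage, extracting rhythm (period) hypotheses from symbolic onsets with the ACF, can be sketched as follows. This is a minimal sketch on a binary onset grid; the DFT de-duplication and the knowledge-base matching stages described above are omitted:

```python
def autocorrelation(onsets, lag):
    """Unnormalised ACF of a binary onset sequence at one lag: the number
    of onset pairs separated by exactly `lag` ticks."""
    return sum(a * b for a, b in zip(onsets, onsets[lag:]))

def period_hypotheses(onsets, max_lag=None):
    """Rank candidate rhythmic periods by their ACF value, best first."""
    max_lag = max_lag or len(onsets) // 2
    lags = range(1, max_lag + 1)
    return sorted(lags, key=lambda l: -autocorrelation(onsets, l))

# Onsets every 4 ticks: the strongest period hypothesis should be 4.
grid = [1 if t % 4 == 0 else 0 for t in range(32)]
print(period_hypotheses(grid)[0])  # 4
```

Multiples of the true period (8, 12, ...) also score well, which is why a second analysis step is needed to remove such duplicate hypotheses before classification.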

Paper Nr: 161
Title:

Adaptive Noise Variance Identification in Vision-aided Motion Estimation for UAVs

Authors:

Fan Zhou, Wei Zheng and Zengfu Wang

Abstract: Vision location methods have been widely used in the motion estimation of unmanned aerial vehicles (UAVs). The noise of the vision location result is usually modeled as white Gaussian noise, so that the result can be used as the observation vector in a Kalman filter to estimate the motion of the vehicle. Since the noise of the vision location result is affected by the external environment, the variance of the noise is uncertain. However, in previous research the variance is usually set to a fixed empirical value, which lowers the accuracy of the motion estimation. In this paper, a novel adaptive noise variance identification (ANVI) method is proposed, which exploits a special kinematic property of the UAV for frequency analysis and adaptively identifies the variance of the noise. The adaptively identified variance is then used in the Kalman filter for accurate motion estimation. The performance of the proposed method is assessed by simulations and field experiments on a quadrotor system. The results illustrate the effectiveness of the method.
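The role of the observation-noise variance, the quantity ANVI identifies adaptively, can be illustrated with a scalar Kalman filter. The state model and the values of Q and R below are illustrative assumptions, not the paper's full filter:

```python
def kalman_update(x, P, z, R, Q=0.01):
    """One predict/update step of a scalar Kalman filter: state estimate x
    with variance P, observation z with noise variance R."""
    P = P + Q                    # predict: process noise inflates variance
    K = P / (P + R)              # Kalman gain: small when R is large
    x = x + K * (z - x)          # update: move toward the observation
    P = (1 - K) * P
    return x, P

def run(observations, R):
    """Filter a sequence of vision fixes with a fixed noise variance R."""
    x, P = 0.0, 1.0
    for z in observations:
        x, P = kalman_update(x, P, z, R)
    return x

# The same noisy vision fixes filtered with a small vs. a large R:
# a large R makes the filter trust the observations much less.
obs = [1.0, 1.2, 0.8, 1.1]
print(run(obs, R=0.1), run(obs, R=10.0))
```

A mis-set fixed R therefore either over-trusts noisy fixes or lags behind the true motion, which is exactly why identifying R from the data improves estimation accuracy.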

Paper Nr: 165
Title:

Development of an Interhemispheric Symmetry Measurement in the Neonatal Brain

Authors:

Ninah Koolen, Anneleen Dereymaeker, Katrien Jansen, Jan Vervisch, Vladimir Matic, Maarten De Vos, Gunnar Naulaers and Sabine Van Huffel

Abstract: The automated analysis of the EEG pattern of the preterm newborn would be a valuable tool in neonatal intensive care units for the prognosis of neurological development. The analysis of the (a)symmetry between the two hemispheres can provide useful information about neuronal dysfunction at early stages. Consecutive and subgroup analyses of different brain regions will allow detecting physiological versus pathological asymmetry. This can improve the assessment of the long-term neurodevelopmental outcome. We show that pathological asymmetry can be measured and detected using the channel symmetry index, which comprises the difference in power spectral density of contralateral EEG signals. To distinguish pathological from physiologically normal EEG patterns, we make use of one-class SVM classifiers.
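A symmetry index of this kind can be sketched from first principles on synthetic sines. The plain-DFT periodogram, the band edges, the sampling rate, and the exact normalisation below are assumptions for illustration, not the authors' definition:

```python
import cmath
import math

def band_power(signal, lo, hi, fs):
    """Power in a frequency band from a plain DFT periodogram."""
    n = len(signal)
    power = 0.0
    for k in range(1, n // 2):
        f = k * fs / n
        if lo <= f <= hi:
            X = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
            power += abs(X) ** 2 / n
    return power

def symmetry_index(left, right, lo=1.0, hi=12.0, fs=64):
    """Normalised left/right band-power difference: 0 for perfect symmetry,
    approaching +/-1 when one hemisphere dominates the band."""
    pl, pr = band_power(left, lo, hi, fs), band_power(right, lo, hi, fs)
    return (pr - pl) / (pr + pl)

# Same 5 Hz oscillation, but twice the amplitude on the right channel:
# power ratio 4, so the index is (4 - 1) / (4 + 1) = 0.6.
left = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
right = [2 * v for v in left]
print(round(symmetry_index(left, right), 3))  # 0.6
```

Values far from zero for the same contralateral channel pair, persisting across windows, are the kind of evidence a one-class classifier could flag as pathological asymmetry.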

Paper Nr: 173
Title:

Enhanced Routing Algorithm for Opportunistic Networking - On the Improvement of the Basic Opportunistic Networking Routing Algorithm by the Application of Machine Learning

Authors:

Ladislava Smítková Janků and Kateřina Hyniová

Abstract: Opportunistic communication networks are special communication networks in which no assumption is made on the existence of a complete path between two nodes wishing to communicate; the source and destination nodes need not be connected to the same network at the same time. This assumption makes routing in these networks extremely difficult. We propose a novel opportunistic networking routing algorithm, which improves the basic opportunistic networking routing algorithm by the application of machine learning. The HMM Autonomous Robot Mobility Models and the Node Reachability Model are constructed from the observed data and used in the proposed routing scheme to compute the combined probability of message delivery to the destination node. In the proposed routing scheme, messages are copied between two nodes only if the combined probability of message delivery to the destination node is higher than a preliminarily defined limit value. The routing scheme was developed for networks of autonomous mobile robots. An improvement of about 70% in network load is reported.

Paper Nr: 178
Title:

Enhanced Image Processing Pipeline and Parallel Generation of Multiscale Tiles for Web-based 3D Rendering of Whole Mouse Brain Vascular Networks

Authors:

Jaerock Kwon

Abstract: Mapping out the complex vascular network in the brain is critical for understanding the transport of oxygen, nutrition, and signaling molecules. The vascular network can also provide us with clues to the relationship between neural activity and blood oxygen-related signals. Advanced high-throughput 3D imaging instruments such as the Knife-Edge Scanning Microscope (KESM) are enabling the imaging of the full vascular network in small animal brains (e.g., the mouse) at sub-micrometer resolution. The amount of data per brain (for KESM) is on the order of 2TB, thus it is a major challenge just to visualize it at full resolution. In this paper, we present an enhanced image processing pipeline for KESM mouse vascular network data set, and a parallel multi-scale tile generation system for web-based pseudo-3D rendering. The system allows full navigation of the data set at all resolution scales. We expect our approach to help in broader dissemination of large-scale, high-resolution 3D microscopy data.