ICPRAM 2015 Abstracts


Area 1 - Theory and Methods

Full Papers
Paper Nr: 3
Title:

Spotting Differences Among Observations

Authors:

Marko Rak, Tim König and Klaus-Dietz Tönnies

Abstract: Identifying differences among the sample distributions of different observations is an important issue in many fields, ranging from medicine through biology and chemistry to physics. We address this issue, providing a general framework to detect difference spots of interest in feature space. Such spots occur not only at various locations; they may also come in various shapes and multiple sizes, even at the same location. We deal with these challenges in a scale-space detection framework based on the density function difference of the observations. Our framework is intended for semi-automatic processing, providing human-interpretable interest spots for further investigation, e.g., for generating hypotheses about the observations. Such interest spots carry valuable information, which we demonstrate on a number of classification scenarios from the UCI Machine Learning Repository, namely classification of benign/malignant breast cancer, genuine/forged money and normal/spondylolisthetic/disc-herniated vertebral columns. To this end, we establish a simple decision rule on top of our framework, based on the detected spots. Results indicate state-of-the-art classification performance, which underpins the importance of the information carried by these interest spots.

Paper Nr: 4
Title:

Naive Bayes Classifier with Mixtures of Polynomials

Authors:

J. Luengo and Rafael Rumi

Abstract: We present in this paper a methodology for including continuous features in the Naive Bayes classifier by estimating the density function of the continuous variables through the Mixtures of Polynomials model. Three new issues are considered for this model: i) a classification-oriented parameter estimation procedure; ii) a feature selection procedure; and iii) the definition of a new kind of variable, to deal with variables that are in theory continuous but whose behaviour makes the estimation difficult. These methods are tested against classical discrete and Gaussian Naive Bayes, as well as classification trees.

Paper Nr: 18
Title:

Stream-based Active Learning in the Presence of Label Noise

Authors:

Mohamed-Rafik Bouguelia, Yolande Belaïd and Abdel Belaïd

Abstract: Mislabelling is a critical problem for stream-based active learning methods because it not only impacts the classification accuracy but also deviates the active learner from querying informative data. Most existing active learning methods omit dealing with label noise. We address this issue and propose an efficient method to identify and mitigate mislabelling errors for active learning in the streaming setting. We first propose a mislabelling likelihood measure to characterize potentially mislabelled instances. This measure is based on the degree of disagreement between the predicted and the queried class label (given by the labeller). Then, we derive a measure of informativeness that expresses how much the label of an instance needs to be corrected by an expert labeller. Specifically, an instance is worth relabelling if it shows highly conflicting information between the predicted and the queried labels. We show that filtering instances with a high mislabelling likelihood and correcting only those filtered instances with highly conflicting information greatly improves the performance of the active learner. Experiments on several real-world datasets demonstrate the effectiveness of the proposed method in terms of filtering efficiency and classification accuracy of the stream-based active learner.
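As an illustration of the filtering step described above, the sketch below scores how strongly a queried label conflicts with a classifier's predicted class probabilities. The function names, the threshold, and the one-minus-probability form of the measure are assumptions made for illustration, not the authors' exact formulation:

```python
# Illustrative sketch (not the authors' exact measure): score how much a
# queried label disagrees with a classifier's predicted class probabilities.

def mislabelling_likelihood(probs, queried_label):
    """Disagreement between the predicted distribution and the queried label.

    probs: dict mapping class label -> predicted probability.
    Returns a value in [0, 1]; high values flag likely mislabelled instances.
    """
    return 1.0 - probs.get(queried_label, 0.0)

def filter_suspects(stream, threshold=0.7):
    """Keep only instances whose queried label looks suspicious."""
    return [(probs, label) for probs, label in stream
            if mislabelling_likelihood(probs, label) > threshold]

# A tiny simulated stream: (predicted probabilities, label given by labeller).
stream = [
    ({"a": 0.9, "b": 0.1}, "a"),   # agreement -> low likelihood
    ({"a": 0.1, "b": 0.9}, "a"),   # conflict  -> high likelihood
]
suspects = filter_suspects(stream)
print(len(suspects))  # 1: only the conflicting instance is flagged
```

Only the flagged instances would then be passed to an expert for relabelling.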

Paper Nr: 33
Title:

Representation Optimization with Feature Selection and Manifold Learning in a Holistic Classification Framework

Authors:

Fabian Bürger and Josef Pauli

Abstract: Many complex and high-dimensional real-world classification problems require a carefully chosen set of features, algorithms and hyperparameters to achieve the desired generalization performance. The choice of a suitable feature representation has a great effect on the prediction performance. Manifold learning techniques, such as PCA, Isomap, Local Linear Embedding (LLE) or Autoencoders, are able to learn a more suitable representation automatically. However, the performance of a manifold learner heavily depends on the dataset. This paper presents a novel automatic optimization framework that incorporates multiple manifold learning algorithms in a holistic classification pipeline, together with feature selection and multiple classifiers with arbitrary hyperparameters. The highly combinatorial optimization problem is solved efficiently using evolutionary algorithms. Additionally, a multi-pipeline classifier based on the optimization trajectory is presented. The evaluation on several datasets shows that the proposed framework outperforms the Auto-WEKA framework in terms of generalization and optimization speed in many cases.

Paper Nr: 35
Title:

Adaptive Classification for Person Re-identification Driven by Change Detection

Authors:

C. Pagano, E. Granger, R. Sabourin, G. L. Marcialis and F. Roli

Abstract: Person re-identification from facial captures remains a challenging problem in video surveillance, in large part due to variations in capture conditions over time. The facial model of a target individual is typically designed during an enrolment phase, using a limited number of reference samples, and may be adapted as new reference videos become available. However, incremental learning of classifiers under changing capture conditions may lead to knowledge corruption. This paper presents an active framework for an adaptive multi-classifier system for video-to-video face recognition in changing surveillance environments. To estimate a facial model during the enrolment of an individual, facial captures extracted from a reference video are employed to train an individual-specific incremental classifier. To sustain a high level of performance over time, a facial model is adapted in response to new reference videos according to the type of concept change. If the system detects that the facial captures of an individual incorporate a gradual pattern of change, the corresponding classifier(s) are adapted through incremental learning. In contrast, to avoid knowledge corruption, if an abrupt pattern of change is detected, a new classifier is trained on the new video data and combined with the individual’s previously-trained classifiers. For validation, a specific implementation is proposed, with ARTMAP classifiers updated using an incremental learning strategy based on Particle Swarm Optimization, and the Hellinger Drift Detection Method used for change detection. Simulation results produced with Faces in Action video data indicate that the proposed system allows for scalable architectures that maintain a significantly higher level of accuracy over time than a reference passive system and an adaptive Transduction Confidence Machine-kNN classifier, while controlling computational complexity.

Paper Nr: 40
Title:

Structural 3D Point Pattern Matching using Iteratively Reweighted Feature Compatibilities

Authors:

Thomas Kerstein, Hubert Roth and Jürgen Wahrburg

Abstract: In this paper, a novel approach for structural 3D point pattern matching is presented, based on iteratively reweighted Euclidean distance compatibilities between the points of a scene point set and a model point set. One key advantage is the use of histogram-like descriptors, called distance signatures, which notably simplify the matching. Since distance similarities alone do not necessarily ensure geometric consistency of the matching result, morphological compatibilities are additionally incorporated. For this purpose, distance similarity is combined with a mutual closest point criterion using an Expectation Maximization (EM) variant of the Iterative Closest Point (ICP) algorithm. The algorithm copes well with patterns of significantly differing point set sizes as well as almost uniform point distributions. It is robust to considerable position jitter and offers adequate processing performance for continuous matching of patterns of moderate size, such as those that typically arise in applications related to point-based object recognition.

Paper Nr: 48
Title:

oAdaBoost - An AdaBoost Variant for Ordinal Classification

Authors:

João Costa and Jaime S. Cardoso

Abstract: Ordinal data classification (ODC) has a wide range of applications in areas where human evaluation plays an important role, ranging from psychology and medicine to information retrieval. In ODC the output variable has a natural order; however, there is no precise notion of the distance between classes. The Data Replication Method was proposed as a tool for solving the ODC problem using a single binary classifier. Due to its characteristics, the Data Replication Method maps straightforwardly onto methods that optimize the decision function globally. However, the mapping is not applicable when methods construct the decision function locally and iteratively, like decision trees and AdaBoost (with decision stumps). In this paper we adapt the Data Replication Method for AdaBoost by softening the constraints resulting from the data replication process. Experimental comparison with state-of-the-art AdaBoost variants on synthetic and real data shows the advantages of our proposal.
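The data replication process the abstract builds on can be sketched as follows: a K-class ordinal problem is turned into a single binary problem by copying each sample K-1 times and appending a threshold indicator. The variable names and one-hot indicator encoding are illustrative assumptions, and the published method adds constraints that this sketch omits:

```python
def replicate(X, y, num_classes):
    """Data Replication sketch: turn a K-class ordinal problem into one
    binary problem. Each sample is copied K-1 times; copy q gets a one-hot
    threshold indicator appended and the binary label 'is y > q'."""
    Xr, yr = [], []
    for x, label in zip(X, y):
        for q in range(num_classes - 1):
            indicator = [1.0 if j == q else 0.0 for j in range(num_classes - 1)]
            Xr.append(list(x) + indicator)
            yr.append(1 if label > q else 0)
    return Xr, yr

X = [[0.2], [0.8]]
y = [0, 2]            # ordinal labels from {0, 1, 2}
Xr, yr = replicate(X, y, num_classes=3)
print(len(Xr), yr)    # 4 [0, 0, 1, 1]
```

A single binary classifier trained on the replicated data then answers all K-1 threshold questions at once.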

Paper Nr: 65
Title:

Video Analysis in Indoor Soccer using a Quadcopter

Authors:

Filipe Trocado Ferreira, Jaime S. Cardoso and Hélder P. Oliveira

Abstract: Automatic vision systems are widely used in sports competitions to analyze individual and collective performance during matches. However, the complex implementation based on multiple fixed cameras and the human intervention required in the process make such systems expensive and unsuitable for the vast majority of teams. In this paper we propose a low-cost, portable and flexible solution based on Unmanned Air Vehicles to capture images of indoor soccer games. Since these vehicles suffer from vibrations and disturbances, the acquired video is very unstable, presenting a set of problems unusual in this type of application. We propose a complete video-processing framework, including video stabilization, camera calibration, player detection, and team performance analysis. The results showed that camera calibration was able to automatically correct the image-to-world homography; player detection precision and recall were around 75%; and the high-level data interpretation showed a strong similarity to ground-truth-derived results.

Paper Nr: 75
Title:

Introducing the Φ-Descriptor - A Most Versatile Relative Position Descriptor

Authors:

Pascal Matsakis, Mohammad Naeem and Farhad Rahbarnia

Abstract: Spatial prepositions, like above, inside, near, denote spatial relationships. A relative position descriptor is a basis from which quantitative models of spatial relationships can be derived. It is an image descriptor, like colour, texture, and shape descriptors. Various relative position descriptors can be found in the literature. In this paper, we introduce a new relative position descriptor, the Φ-descriptor, which has virtually all the strengths of each of its competitors and none of the weaknesses. Our approach is based on the concept of the F-histogram and on an original categorization of pairs of consecutive boundary points on a line.

Paper Nr: 82
Title:

Discriminative Kernel Feature Extraction and Learning for Object Recognition and Detection

Authors:

Hong Pan, Søren Olsen and Yaping Zhu

Abstract: Feature extraction and learning is critical for object recognition and detection. By embedding context cue of image attributes into the kernel descriptors, we propose a set of novel kernel descriptors called context kernel descriptors (CKD). The motivation of CKD is to use the spatial consistency of image attributes or features defined within a neighboring region to improve the robustness of descriptor matching in kernel space. For feature learning, we develop a novel codebook learning method, based on the Cauchy-Schwarz Quadratic Mutual Information (CSQMI) measure, to learn a compact and discriminative CKD codebook from a rich and redundant CKD dictionary. Projecting the original full-dimensional CKD onto the codebook, we reduce the dimensionality of CKD without losing its discriminability. CSQMI derived from Rényi quadratic entropy can be efficiently estimated using a Parzen window estimator even in high-dimensional space. In addition, the latent connection between Rényi quadratic entropy and the mapping data in kernel feature space further enables us to capture the geometric structure as well as the information about the underlying labels of the CKD using CSQMI. Thus the resulting codebook and reduced CKD are discriminative. We report superior performance of our algorithm for object recognition on benchmark datasets like Caltech-101 and CIFAR-10, as well as for detection on a challenging chicken feet dataset.

Paper Nr: 84
Title:

Detection of Ruptures in Spatial Relationships in Video Sequences

Authors:

Abdalbassir Abou-Elailah, Valerie Gouet-Brunet and Isabelle Bloch

Abstract: The purpose of this work is to detect strong changes in spatial relationships between objects in video sequences, with limited knowledge of the objects. First, a fuzzy representation of the objects is proposed based on low-level generic primitives. Angle and distance histograms are then used as examples to model the spatial relationships between two objects. We then estimate the distances between different angle or distance histograms over time. By analyzing the evolution of the spatial relationships over time, ruptures in this evolution are detected. Experimental results show that the proposed method can efficiently detect ruptures in the spatial relationships, exploiting only low-level primitives. This constitutes a promising step towards event detection in videos, with few a priori models of the objects.

Paper Nr: 94
Title:

Normalised Diffusion Cosine Similarity and Its Use for Image Segmentation

Authors:

Jan Gaura and Eduard Sojka

Abstract: In many image-segmentation algorithms, measuring distances is a key problem, since the distance is often used to decide whether two image points belong to a single image segment or to two different ones. The usual Euclidean distance need not be the best choice. Measuring distances along the surface defined by the image function seems more relevant in more complicated images. Geodesic distance, i.e. the shortest path in the corresponding graph, or the k shortest paths, can be regarded as the simplest methods. It might seem that the diffusion distance should provide better properties, since all paths (not only a limited number of them) are taken into account. In this paper, we first show that the diffusion distance has properties that make it difficult to use in image segmentation, which extends recent observations of other authors. We then propose a new measure called normalised diffusion cosine similarity that is more suitable. We present the corresponding theory as well as experimental results.

Paper Nr: 116
Title:

Detection and Recognition of Painted Road Surface Markings

Authors:

Jack Greenhalgh and Majid Mirmehdi

Abstract: A method for the automatic detection and recognition of text and symbols painted on the road surface is presented. Candidate regions are detected as maximally stable extremal regions (MSER) in a frame which has been transformed into an inverse perspective mapping (IPM) image, showing the road surface with the effects of perspective distortion removed. Detected candidates are then sorted into words and symbols, before they are interpreted using separate recognition stages. Symbol-based road markings are recognised using histogram of oriented gradient (HOG) features and support vector machines (SVM). Text-based road signs are recognised using a third-party optical character recognition (OCR) package, after application of a perspective correction stage. Matching of regions between frames and temporal fusion of results are used to improve performance. The proposed method is validated using a dataset of videos, and achieves F-measures of 0.85 for text characters and 0.91 for symbols.

Short Papers
Paper Nr: 9
Title:

On Selecting Useful Unlabeled Data Using Multi-view Learning Techniques

Authors:

Thanh-Binh Le and Sang-Woon Kim

Abstract: In a semi-supervised learning approach, a selection strategy first selects strongly discriminative examples from unlabeled data, which are then, together with labeled data, used to train a (supervised) classifier. This paper investigates a new selection strategy for the case when the data are composed of multiple views: first, multiple views of the data are derived independently; second, each view is used to measure a corresponding confidence with which candidate examples are evaluated; third, the confidence levels measured from the multiple views are combined in a weighted average to derive a target confidence; this selecting-and-training is repeated for a predefined number of iterations. The experimental results, obtained using synthetic and real-life benchmark data, demonstrate that the proposed mechanism can compensate for the shortcomings of traditional strategies. In particular, the results demonstrate that when the data is appropriately decomposed into multiple views, the strategy can achieve further improved results in terms of classification accuracy.
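The weighted-average confidence step described above might look like this in outline; the function names, the per-view weights and the top-k selection rule are assumptions made for illustration:

```python
def combined_confidence(view_confidences, view_weights):
    """Weighted average of per-view confidences for one unlabeled example."""
    total = sum(view_weights)
    return sum(c * w for c, w in zip(view_confidences, view_weights)) / total

def select_examples(candidates, view_weights, k=1):
    """Pick the k unlabeled examples with the highest combined confidence.

    candidates: list of (example_id, [confidence per view]).
    """
    scored = [(combined_confidence(confs, view_weights), ex)
              for ex, confs in candidates]
    scored.sort(reverse=True)
    return [ex for _, ex in scored[:k]]

# Two views; the second view is trusted more heavily.
candidates = [("x1", [0.9, 0.2]), ("x2", [0.6, 0.8])]
print(select_examples(candidates, view_weights=[1.0, 2.0]))  # ['x2']
```

The selected examples would then be added to the labeled pool before the next training iteration.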

Paper Nr: 13
Title:

Entity Matching in OCRed Documents with Redundant Databases

Authors:

Nihel Kooli and Abdel Belaïd

Abstract: This paper presents an entity recognition approach on documents recognized by OCR (Optical Character Recognition). The recognition is formulated as a task of matching entities in a database with their representations in a document. A pre-processing step of entity resolution is performed on the database to provide a better representation of the entities. For this, a statistical model based on record linkage and record merge phases is used. Furthermore, documents recognized by OCR can contain noisy data and altered structure. An adapted method is proposed to retrieve the entities from their structures by tolerating possible OCR errors. A modified version of EROCS is applied to this problem by adapting the notion of segments to blocks provided by the OCR. It handles document segments to match the document to its corresponding entities. For efficiency, a process of data labeling in the document is applied in order to filter the compared entities and segments. The evaluation on business documents shows a significant improvement of matching rates compared to those of EROCS.

Paper Nr: 16
Title:

BLSTM-CTC Combination Strategies for Off-line Handwriting Recognition

Authors:

Luc Mioulet, G. Bideault, C. Chatelain, T. Paquet and S. Brunessaux

Abstract: In this paper we present several combination strategies using multiple BLSTM-CTC systems. Given several feature sets our aim is to determine which strategies are the most relevant to improve on an isolated word recognition task (the WR2 task of the ICDAR 2009 competition), using a BLSTM-CTC architecture. We explore different combination levels: early integration (feature combination), mid level combination and late fusion (output combinations). Our results show that several combinations outperform single feature BLSTM-CTCs.

Paper Nr: 38
Title:

HyperSAX: Fast Approximate Search of Multidimensional Data

Authors:

Jens Emil Gydesen, Henrik Haxholm, Niels Sonnich Poulsen, Sebastian Wahl and Bo Thiesson

Abstract: The increasing amount and size of data make indexing and searching more difficult. This is especially challenging for multidimensional data such as images, videos, etc. In this paper we introduce a new indexable symbolic data representation that allows us to efficiently index and retrieve from a large amount of data that may appear in multiple dimensions. We use an approximate lower-bounding distance measure to compute the distance between multidimensional arrays, which allows us to perform fast similarity searches. We present two search methods, exact and approximate, which can quickly retrieve data using our representation. Our approach is very general and works for many types of multidimensional data, including different types of image representations. Even for millions of multidimensional arrays, the approximate search will find a result in a few milliseconds, and will in many cases return a result similar to the best match.
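A one-dimensional sketch of the kind of symbolic representation and lower-bounding distance the abstract describes, in the spirit of SAX; the breakpoints, the alphabet size of 4 and the helper names are assumptions, and HyperSAX itself generalizes the idea to multidimensional arrays:

```python
# Breakpoints split the value range into four roughly equiprobable regions
# for standard-normal data (alphabet {0, 1, 2, 3}; values assumed here).
BREAKPOINTS = [-0.67, 0.0, 0.67]

def symbolize(values):
    """Map each value to the index of its breakpoint region."""
    return [sum(v > b for b in BREAKPOINTS) for v in values]

def lower_bound_dist(sym_a, sym_b):
    """A distance on symbols that never exceeds the distance of the
    originals: adjacent symbols contribute 0; farther pairs contribute
    the gap between the breakpoints that separate them."""
    total = 0.0
    for a, b in zip(sym_a, sym_b):
        if abs(a - b) > 1:
            total += (BREAKPOINTS[max(a, b) - 1] - BREAKPOINTS[min(a, b)]) ** 2
    return total ** 0.5

s1 = symbolize([-1.0, 0.1, 0.9])
s2 = symbolize([-0.9, 0.2, 1.1])
print(s1, s2, lower_bound_dist(s1, s2))  # [0, 2, 3] [0, 2, 3] 0.0
```

Because the symbolic distance lower-bounds the true distance, candidates whose symbolic distance already exceeds the current best can be discarded without touching the raw data.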

Paper Nr: 39
Title:

Learning Dictionary Via Wavelet Sparse Principal Component Analysis

Authors:

Shengkun Xie, Anna Lawniczak and Sridhar Krishnan

Abstract: This work focuses on learning a double-sparse dictionary directly from specific data. The proposed method combines the wavelet transform with sparse principal component analysis (SPCA), namely wavelet sparse PCA (WSPCA). The double-sparsity of the learned dictionary is achieved by obtaining both sparse signal components and sparse signal loadings. This work also examines the application of the proposed method to the epileptic EEG signal classification problem. Nearly perfect classification accuracy (99.7%) is obtained for a set of 8-hour-long EEGs, and the results are compared to other PCA-based sparse methods. Our experiments illustrate that the WSPCA approach lowers the data variability of the extracted features, and it outperforms both sparse PCA and wavelet PCA in classifying the EEG signals that we consider.

Paper Nr: 41
Title:

Shape-based Object Retrieval and Classification with Supervised Optimisation

Authors:

Cong Yang, Oliver Tiebe, Pit Pietsch, Christian Feinen, Udo Kelter and Marcin Grzegorzek

Abstract: In order to enhance the performance of shape retrieval and classification, we propose in this paper a novel shape descriptor with low computational complexity that can easily be fused with other meaningful descriptors like shape context. This leads to a significant increase in the descriptive power of the original descriptors without adding too much computational complexity. To make the proposed shape descriptor more practical and general, a supervised optimisation strategy is introduced. The most significant scientific contributions of this paper include the introduction of a new and simple feature descriptor with a supervised optimisation strategy, leading to an impressive improvement of accuracy in object classification and retrieval scenarios.

Paper Nr: 45
Title:

Deterministic Method for Automatic Visual Grading of Seed Food Products

Authors:

Pierre Dubosclard, Stanislas Larnier, Hubert Konik, Ariane Herbulot and Michel Devy

Abstract: This paper presents a deterministic method for automatic visual grading, designed to solve the industrial problem of evaluating seed lots. The sample is thrown in bulk onto a tray placed in a chamber for acquiring a color image. An image processing method has been developed to separate and characterize each seed. Shape learning is performed on isolated seeds, and the collected information is used for the segmentation. A first step is based on simple criteria such as regions, edges and normals to the boundary. Then, an active contour with a shape prior is applied to improve the results.

Paper Nr: 55
Title:

3-Dimensional Motion Recognition by 4-Dimensional Higher-order Local Auto-correlation

Authors:

Hiroki Mori, Takaomi Kanda, Dai Hirose and Minoru Asada

Abstract: In this paper, we propose a 4-Dimensional Higher-order Local Auto-Correlation (4D HLAC). The method aims to extract the features of a 3D time series, which is regarded as a 4D static pattern. This is an orthodox extension of the original HLAC, which represents correlations among local values in 2D images and can effectively summarize motion in 3D space. To recognize motion in the real world, a recognition system should exploit motion information from the real-world structure. The 4D HLAC feature vector is expected to capture representations for general 3D motion recognition, because the original HLAC performed very well in image recognition tasks. Based on experimental results showing high recognition performance and low computational cost, we conclude that our method has a strong advantage for 3D time series recognition, even in practical situations.

Paper Nr: 56
Title:

Metric Learning in Dimensionality Reduction

Authors:

Alexander Schulz and Barbara Hammer

Abstract: The emerging big dimensionality in digital domains causes the need for powerful non-linear dimensionality reduction techniques for rapid and intuitive visual data access. While a number of powerful non-linear dimensionality reduction tools have been proposed in recent years, their applicability is limited in practice: since a non-linear projection is no longer characterised by semantically meaningful data dimensions, the visual display provides only very limited interpretability beyond mere neighbourhood relationships and, hence, interactive data analysis and further expert insight are hindered. In this contribution, we propose to enhance non-linear dimensionality reduction techniques with a metric learning framework. This allows us to quantify the relevance of single data dimensions and their correlation with respect to the given visual display; on the one hand, this explains its most relevant factors; on the other hand, it opens the way towards interactive data analysis by changing the data representation based on the metric learned from the visual display.

Paper Nr: 58
Title:

Unconstrained Speech Segmentation using Deep Neural Networks

Authors:

Van Zyl van Vuuren, Louis ten Bosch and Thomas Niesler

Abstract: We propose a method for improving the unconstrained segmentation of speech into phoneme-like units using deep neural networks. The proposed approach is not dependent on acoustic models or forced alignment, but operates using the acoustic features directly. Previous solutions of this type were plagued by the tendency to hypothesise additional incorrect phoneme boundaries near the phoneme transitions. We show that the application of deep neural networks is able to reduce this over-segmentation substantially, and achieve improved segmentation accuracies. Furthermore, we find that generative pre-training offers an additional benefit.

Paper Nr: 63
Title:

A Fast 4D Facial Expression Recognition Method for Low-resolution Videos

Authors:

Jie Shao and Nan Dong

Abstract: With the popularity of 3D cameras, 4D (3D+time) facial expression recognition has become one of the fresh topics of recent years. In order to meet the requirements of practical applications using non-professional 3D cameras (like the Microsoft Kinect), a fast facial expression recognition method for low-resolution RGB-D videos is introduced in this paper. In the proposed solution, faces are first automatically detected and aligned in each RGB-D image sequence. The faces of each image sequence are then represented by their local 4D texture features. These features are used to train classifiers based on the Conditional Random Field (CRF) model: CRFs, HCRFs and LDCRFs, respectively. The classifiers are compared extensively in terms of time and effectiveness. Our final results demonstrate the effectiveness and practicability of our approach in applications related to 4D facial expression recognition.

Paper Nr: 69
Title:

Alignment of Cyclically Ordered Trees

Authors:

Takuya Yoshino and Kouichi Hirata

Abstract: In this paper, as unordered trees preserving the adjacency among siblings, we introduce three kinds of cyclically ordered trees: a biordered tree, which allows both a left-to-right and a right-to-left order among siblings; a cyclic-ordered tree, which allows a cyclic order among siblings in the left-to-right direction; and a cyclic-biordered tree, which allows a cyclic order among siblings in both directions. We then design algorithms to compute the alignment distance and the segmental alignment distance between biordered trees in O(n²D²) time, and between cyclic-ordered trees and cyclic-biordered trees in O(n²D⁴) time, where n is the maximum number of nodes and D is the maximum degree of the two given trees.

Paper Nr: 71
Title:

An Exact Graph Edit Distance Algorithm for Solving Pattern Recognition Problems

Authors:

Zeina Abu-Aisheh, Romain Raveaux, Jean-Yves Ramel and Patrick Martineau

Abstract: Graph edit distance is an error-tolerant matching technique that has emerged as a powerful and flexible graph matching paradigm, applicable to different tasks in pattern recognition, machine learning and data mining; it represents the minimum-cost sequence of basic edit operations transforming one graph into another by means of insertion, deletion and substitution of vertices and/or edges. A widely used method for exact graph edit distance computation is based on the A* algorithm. To overcome its high memory load, caused by storing pending solutions to be explored while traversing the search tree, we propose a depth-first graph edit distance algorithm which requires less memory and searching time. All possible solutions are evaluated without explicitly enumerating them all; candidates are discarded using an upper- and lower-bound strategy. A solid experimental study is proposed; experiments on a publicly available database empirically demonstrate that our approach outperforms A* graph edit distance computation in terms of speed, accuracy and classification rate.
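The depth-first pruning idea can be illustrated on a toy node-assignment problem. This is a generic branch-and-bound sketch, not the authors' graph edit distance algorithm; the cost matrix and the row-minima lower bound are assumptions made for illustration:

```python
# Assign each source node to a distinct target node (a stand-in for edit
# operations), discarding partial solutions whose optimistic lower bound
# already reaches the best complete cost found so far. Costs are illustrative.
COST = [
    [1, 9, 9],
    [9, 1, 9],
    [9, 9, 1],
]

def best_assignment(cost):
    n = len(cost)
    best = [float("inf")]

    def lower_bound(row, used, acc):
        # Optimistic completion: cheapest still-available entry per open row.
        return acc + sum(min(cost[r][c] for c in range(n) if c not in used)
                         for r in range(row, n))

    def dfs(row, used, acc):
        if row == n:
            best[0] = min(best[0], acc)
            return
        if lower_bound(row, used, acc) >= best[0]:
            return                      # prune this whole subtree
        for c in range(n):
            if c not in used:
                dfs(row + 1, used | {c}, acc + cost[row][c])

    dfs(0, frozenset(), 0)
    return best[0]

print(best_assignment(COST))  # 3 (the diagonal assignment)
```

Depth-first search keeps only the current branch in memory, which is the advantage over A*'s frontier of pending solutions that the abstract highlights.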

Paper Nr: 73
Title:

Two-way Multimodal Online Matrix Factorization for Multi-label Annotation

Authors:

Jorge A. Vanegas, Viviana Beltran and Fabio A. González

Abstract: This paper presents a matrix factorization algorithm for multi-label annotation. The multi-label annotation problem arises in situations such as object recognition in images, where we want to automatically find the objects present in a given image. The solution consists in learning a classification model able to assign one or many labels to a particular sample. The method presented in this paper learns a mapping between the features of the input sample and the labels, which is later used to predict labels for unannotated instances. The mapping between the feature representation and the labels is found by learning a common semantic representation using matrix factorization. An important characteristic of the proposed algorithm is its online formulation based on stochastic gradient descent, which can scale to deal with large datasets. According to the experimental evaluation, which compares the method with state-of-the-art space embedding algorithms, the proposed method presents competitive performance, improving in some cases on previously reported results.

Paper Nr: 78
Title:

Relative Position Descriptors - A Review

Authors:

Mohammad Naeem and Pascal Matsakis

Abstract: A relative position descriptor is a quantitative representation of the relative position of two spatial objects. It is a low-level image descriptor, like colour, texture, and shape descriptors. A considerable amount of work has been carried out on relative position description. Application areas include content-based image retrieval, remote sensing, medical imaging, robot navigation, and geographic information systems. This paper reviews the existing work. It identifies the approaches that have been used as well as the properties that can be expected from relative position descriptors. It provides a brief overview and comparison of various descriptors, including their main properties, strengths and limitations, and it suggests areas for future work.

Paper Nr: 87
Title:

Distance Based Active Learning for Domain Adaptation

Authors:

Christian Pölitz

Abstract: We investigate methods that apply Domain Adaptation coupled with Active Learning to reduce the number of labels needed to train a classifier. We assume a classification task on a given unlabelled set of documents and access to labels from documents of other sets. The documents from the other sets come from different distributions. Our approach uses Domain Adaptation together with Active Learning to find a minimum number of labelled documents from the different sets to train a high quality classifier. We assume that documents from different sets that are close in a latent topic space can be used for a classification task on a given different set of documents.

Paper Nr: 93
Title:

A Non-parametric Spectral Model for Graph Classification

Authors:

Andrea Gasparetto, Giorgia Minello and Andrea Torsello

Abstract: Graph-based representations have been used with considerable success in computer vision in the abstraction and recognition of object shape and scene structure. Despite this, the methodology available for learning structural representations from sets of training examples is relatively limited. In this paper we take a simple yet effective spectral approach to graph learning. In particular, we define a novel model of structural representation based on the spectral decomposition of the graph Laplacians of a set of graphs, which does away with the need for the one-to-one node correspondences at the base of several previous approaches, and directly handles a set of other invariants of the representation that are often neglected. An experimental evaluation shows that the approach significantly improves over the state of the art.

Paper Nr: 95
Title:

Crater Detection using CGC - A New Circle Detection Method

Authors:

Vinciane Lacroix and Sabine Vanhuysse

Abstract: “Constrained Gradient for Circle” (CGC) is a new circle detection algorithm based on the gradient of the intensity image. The method relies on two conditions. The “gradient angle compatibility condition” requires that the gradients of a given percentage of the pixels belonging to digital circles, with radii in the range to be detected, point towards the centre of the circle or in the opposite direction. The “curvature compatibility condition” constrains the variation of the gradient angle of the same pixels to a range depending on the radius of the circle. These two conditions are sufficient to detect the core of circular shapes. The best-fitting circle is then identified. The method is applied to artificial and reference images and compared to state-of-the-art methods. It is also applied to the detection of water-filled craters in Cambodia: these craters, which may indicate the presence of Unexploded Ordnance (UXO) dating from the US bombing, produce dark circles on satellite panchromatic images.

Paper Nr: 96
Title:

Image Quality Assessment for Photo-consistency Evaluation on Planar Classification in Urban Scenes

Authors:

Marie-Anne Bauda, Sylvie Chambon, Pierre Gurdjos and Vincent Charvillat

Abstract: In the context of semantic segmentation of urban scenes, calibrated multi-views and the flatness assumption are commonly used to estimate a warped image based on homography estimation. In order to classify planar and non-planar areas, we propose an evaluation protocol that compares several Image Quality Assessment (IQA) measures between a reference zone and its warped zone. We show that cosine angle distance-based measures are more efficient than Euclidean distance-based measures for planar/non-planar classification, and that the Universal Quality Image (UQI) measure outperforms the other evaluated measures.

Paper Nr: 103
Title:

MultiResolution Complexity Analysis - A Novel Method for Partitioning Datasets into Regions of Different Classification Complexity

Authors:

G. Armano and E. Tamponi

Abstract: Systems for complexity estimation typically aim to quantify the overall complexity of a domain, with the goal of comparing the hardness of different datasets or to associate a classification task to an algorithm that is deemed best suited for it. In this work we describe MultiResolution Complexity Analysis, a novel method for partitioning a dataset into regions of different classification complexity, with the aim of highlighting sources of complexity or noise inside the dataset. Initial experiments have been carried out on relevant datasets, proving the effectiveness of the proposed method.

Paper Nr: 120
Title:

On the Application of Bio-Inspired Optimization Algorithms to Fuzzy C-Means Clustering of Time Series

Authors:

Muhammad Marwan Muhammad Fuad

Abstract: Fuzzy c-means clustering (FCM) is a clustering method based on the partial membership concept. As with other clustering methods, FCM applies a distance to cluster the data. While the Euclidean distance is widely used to perform the clustering task, other distances have been suggested in the literature. In this paper we study the use of a weighted combination of metrics in FCM clustering of time series, where the weights in the combination are the outcome of an optimization process using differential evolution, genetic algorithms, and particle swarm optimization as optimizers. We show how the overfitting phenomenon interferes with the optimization process, in that the optimal results obtained during the training stage degrade during the testing stage as a result of overfitting.

Paper Nr: 127
Title:

High Dimensional Similarity Search with Bundled Query Processing on Hilbert R-Tree

Authors:

Yohei Nasu, Naoki Kishikawa, Kei Tashima, Shin Kodama, Yasunobu Imamura, Takeshi Shinohara, Koichi Hirata and Tetsuji Kuboyama

Abstract: The Hilbert R-tree is an R-tree, which is a B-tree-like multiway balanced tree, in which data objects with high dimensions are sorted along the Hilbert curve. In this paper, we first point out that the compact Hilbert R-tree, which is a Hilbert R-tree that does not preserve Hilbert values, achieves the same performance as the standard Hilbert R-tree by using Hilbert sort and Hilbert merge. Then, to improve search time for high dimensional objects in the compact Hilbert R-tree, we propose bundled query processing. Furthermore, we introduce two methods: pre-processing by Hilbert merge, and control of the order in which nodes are visited. From experimental results, we observe that, in the similarity search of sound and image data, bundled query processing is about 30% faster than individual query processing.

Paper Nr: 133
Title:

The Critical Feature Dimension and Critical Sampling Problems

Authors:

Bernardete M. Ribeiro, Andrew Sung, Divya Suryakumar and Ram Basnet

Abstract: Efficacious data mining methods are critical for knowledge discovery in various applications in the era of big data. Two issues of immediate concern in big data analytic tasks are how to select a critical subset of features and how to select a critical subset of data points for sampling. This position paper presents ongoing research by the authors that suggests: 1. the critical feature dimension problem is theoretically intractable, but simple heuristic methods may well be sufficient for practical purposes; 2. there are big data analytic problems where the success of data mining depends more on the critical feature dimension than on the specific features selected, so a random selection of features based on the dataset’s critical feature dimension will prove sufficient; and 3. the problem of critical sampling has the same intractable complexity as critical feature dimension, but again simple heuristic methods may well be practicable in most applications.

Paper Nr: 144
Title:

A Dissimilarity Measure for Comparing Origami Crease Patterns

Authors:

Seung Man Oh, Godfried T. Toussaint, Erik D. Demaine and Martin L. Demaine

Abstract: A measure of dissimilarity (distance) is proposed for comparing origami crease patterns represented as geometric graphs. The distance measure is determined by minimum-weight matchings calculated between the edges as well as the vertices of the graphs being compared. The distances between pairs of edges and pairs of vertices of the graph are weighted linear combinations of six parameters that constitute geometric features of the edges and vertices. The results of a preliminary study, performed with a collection of 45 crease patterns obtained from Mitani’s ORIPA web page, revealed which of these features appear to be most salient for obtaining a clustering of the crease patterns that agrees with human intuition.

Posters
Paper Nr: 1
Title:

Using Nonlinear Models to Enhance Prediction Performance with Incomplete Data

Authors:

Faraj A. A. Bashir and Hua-Liang Wei

Abstract: A great deal of recent methodological research on missing data analysis has focused on model parameter estimation using modern statistical methods such as maximum likelihood and multiple imputation. These approaches are better than traditional methods (for example, listwise deletion and mean imputation). These modern techniques can lead to unbiased parametric estimation in many particular application cases. However, they do not work well in some cases, especially for systems that have highly nonlinear behaviour. This paper explains linear parametric estimation in the presence of missing data, including an overview of biased and unbiased linear parametric estimation with missing data, and provides accessible descriptions of the expectation maximization (EM) algorithm and the Gauss-Newton method. In particular, this paper proposes a Gauss-Newton iteration method for nonlinear parametric estimation in the case of missing data. Since the Gauss-Newton method needs initial values that are hard to obtain in the presence of missing data, the EM algorithm is used to estimate these initial values. In addition, we present two analysis examples to illustrate the performance of the proposed methods.

Paper Nr: 5
Title:

Adaptive Variational Model and Learning-based SVM for Medical Image Segmentation

Authors:

Sami Bourouis

Abstract: Precise medical image segmentation is an important step for several clinical applications such as anatomy and pathology study. However, the achievement of this task has proven problematic due to many factors, such as the non-homogeneous intensities in several modalities (e.g. MRI and CT scans). In this paper we investigate this line of research by studying some relevant works in this field and proposing a hybrid method to improve the detection of a tumor area in medical imaging. The purpose of this work is to exploit the potential of a variational-based approach and of prior knowledge via a learning algorithm. Segmenting with two different algorithms can combine the advantages of both, reduce their drawbacks and achieve high performance. Experimental results applied to "brain tumor segmentation in MRI" demonstrate the effectiveness of the developed framework.

Paper Nr: 32
Title:

User-driven Nearest Neighbour Exploration of Image Archives

Authors:

Luca Piras, Deiv Furcas and Giorgio Giacinto

Abstract: Learning what a specific user is exactly looking for, during a session of image search and retrieval, is a problem that has been mainly approached with “classification” or “exploration” techniques. Classification techniques follow the assumption that the images in the archive are statically subdivided into classes. Exploration approaches, on the other hand, are more focused on following the varying needs of the user. It turns out that image retrieval techniques based on classification approaches, though often showing good performance, do not readily adapt to different users’ goals. In this paper we propose a relevance feedback mechanism that drives the search into promising regions of the feature space according to the Nearest Neighbor paradigm. In particular, each image labelled as relevant by the user is used as a “seed” for an exploration of the space based on the Nearest Neighbor paradigm. Reported results show that this technique attains higher recall and average precision than other state-of-the-art relevance feedback approaches.

Paper Nr: 50
Title:

Fuzzy Logic and Multi-biometric Fusion - An Overview

Authors:

Fabian Maul and Naser Damer

Abstract: Fuzzy logic has been proposed to improve various aspects of multi-biometric applications including enhancements to the decision making of the application and the robustness to noisy data. This paper discusses recent work that utilized fuzzy logic techniques within the multi-biometric fusion problem. This discussion is presented under two categories, the type of authentication scenario and the nature of the fused data. The paper also presents an introduction to fuzzy logic and multi-biometric fusion. Based on the discussed works, this paper aims to establish current trends and research possibilities in this field.

Paper Nr: 57
Title:

An Interactive Model for Structural Pattern Recognition based on the Bayes Classifier

Authors:

Xavier Cortés, Francesc Serratosa and Carlos Francisco Moreno-García

Abstract: This paper presents an interactive model for structural pattern recognition based on a naïve Bayes classifier. In some applications, the automatically computed correlation between local parts of two images is not good enough. Moreover, humans are very good at locating and mapping local parts of images, even when global transformations have been applied to these images. In our model, the user interacts with the automatically obtained correlation (or correspondences between local parts) and helps the system find the best correspondence while the global transformation parameters are automatically recomputed. The model is based on a Bayes classifier in which the human interaction is properly modelled and embedded. We show that with little human interaction, the quality of the returned correspondences and global transformation parameters drastically increases.

Paper Nr: 90
Title:

Object Attention Patches for Text Detection and Recognition in Scene Images using SIFT

Authors:

Bowornrat Sriman and Lambert Schomaker

Abstract: Natural urban scene images present many problems for character recognition, such as luminance noise, varying font styles and cluttered backgrounds. Detecting and recognizing text in a natural scene is a difficult problem. Several techniques have been proposed to overcome these problems. These are, however, usually based on a bottom-up scheme, which produces many false positives and false negatives and requires intensive computation. Therefore, an alternative, efficient, character-based expectancy-driven method is needed. This paper presents a modeling approach that is usable for expectancy-driven techniques based on the well-known SIFT algorithm. The produced models (Object Attention Patches) are evaluated in terms of their individual provisory character recognition performance. Subsequently, the trained patch models are used in preliminary experiments on text detection in scene images. The results show that our proposed model-based approach can be applied for a coherent SIFT-based text detection and recognition process.

Paper Nr: 106
Title:

Classifying Nucleotide Sequences and their Positions of Influenza A Viruses through Several Kernels

Authors:

Issei Hamada, Takaharu Shimada, Daiki Nakata, Kouichi Hirata and Tetsuji Kuboyama

Abstract: In this paper, we classify nucleotide sequences and their positions for influenza A viruses by using both nucleotide sequence kernels and phylogenetic tree kernels. In the nucleotide sequence kernels, we regard a nucleotide sequence as a vector, a multiset and a string. In the phylogenetic tree kernels, we use a relabeled phylogenetic tree, obtained by replacing the leaf labels (indices of nucleotide sequences in the phylogenetic tree reconstructed from a set of nucleotide sequences) with the nucleotides at a fixed position, and trimmed phylogenetic trees, obtained by trimming, as far as possible, the branches of the relabeled phylogenetic tree with the same leaf labels. Then, we observe which kernels are effective for the classification of nucleotide sequences (analyzing pandemic occurrences and regions) and for the classification of positions in nucleotide sequences (analyzing positions in packaging signals).

Paper Nr: 136
Title:

Paradigms for the Construction and Annotation of Emotional Corpora for Real-world Human-Computer-Interaction

Authors:

Markus Kächele, Stefanie Rukavina, Günther Palm, Friedhelm Schwenker and Martin Schels

Abstract: A major building block for the construction of reliable statistical classifiers in the context of affective human-computer interaction is the collection of training samples that appropriately reflect the complex nature of the desired patterns. This is a non-trivial issue, especially in this application, since even though it is easily agreed that emotional patterns should be incorporated in future computer operation, it is by far not clear how this should be realized. There are still open questions, such as which types of emotional patterns to consider, together with their degree of helpfulness for computer interaction, and the more fundamental question of which emotions actually occur in this context. In this paper we start by reviewing existing corpora and the respective techniques for the generation of emotional content, and further try to motivate and establish approaches that enable the gathering, identification and categorization of patterns of human-computer interaction.

Paper Nr: 137
Title:

Transfer Learning for Bibliographic Information Extraction

Authors:

Quang-Hong Vuong and Takasu Atsuhiro

Abstract: This paper discusses the problems of analyzing title page layouts and extracting bibliographic information from academic papers. Information extraction is an important task for easily using digital libraries. Sequence analyzers are usually used to extract information from pages. Because we often receive new layouts and existing layouts also change, a mechanism for self-training a new analyzer is necessary to achieve good extraction accuracy; this also makes management easier. For example, when a new layout is input, the problem is how to learn automatically and efficiently to create a new analyzer. This paper focuses on learning a new sequence analyzer automatically by using a transfer learning approach. We evaluated its efficiency by testing on three academic journals. The results show that the proposed method is effective for self-training a new sequence analyzer.

Paper Nr: 142
Title:

Hyperspectral Data Classification based on Local Polynomial Approximation

Authors:

Bolanle Tolulope Abe and Jaco Jordaan

Abstract: Processing and classification of hyperspectral data into various class labels pose several challenges due to the entrenchment of image objects in a single pixel and the large data size, which demands serious computation and huge memory. This work presents a local polynomial approximation (LPA) method to process the Washington DC Mall hyperspectral data set for image classification. The data generated by LPA are then classified using a neural network (NN), a support vector machine (SVM) and a random forest (RF). The classification procedures are implemented in the Waikato Environment for Knowledge Analysis (WEKA). To evaluate the results, the performance of the classifiers per class label is presented as metrics, and the overall classification results are presented in tabular form. The Friedman statistical test is carried out on the classification results to establish the performance of each classifier on the processed data. The LPA method is evaluated on the datasets using the different classifiers to demonstrate its efficacy.

Area 2 - Applications

Full Papers
Paper Nr: 10
Title:

A Hybrid BLSTM-HMM for Spotting Regular Expressions

Authors:

Gautier Bideault, Luc Mioulet, Clement Chatelain and Thierry Paquet

Abstract: This article concerns the spotting of regular expressions (REGEX) in handwritten documents using a hybrid model. Spotting REGEX in a document image allows further extraction tasks to be considered, such as document categorization or named entity extraction. Our model combines a state-of-the-art BLSTM recurrent neural network for character recognition and segmentation with an HMM able to spot the desired sequences. Our experiments on a public handwritten database show interesting results.

Paper Nr: 14
Title:

A New Robust Color Descriptor for Face Detection

Authors:

Eyal Braunstain and Isak Gath

Abstract: Most state-of-the-art approaches to object and face detection rely on intensity information and ignore color information, as it usually exhibits variations due to illumination changes and shadows, and due to the lower spatial resolution in color channels than in the intensity image. We propose a new color descriptor, derived from a variant of Local Binary Patterns, designed to achieve invariance to monotonic changes in chroma. The descriptor is produced by histograms of encoded color texture similarity measures of small radially-distributed patches. As it is based on similarities of local patches, we expect the descriptor to exhibit a high degree of invariance to local appearance and pose changes. We demonstrate empirically by simulation the invariance of the descriptor to photometric variations, i.e. illumination changes and image noise, geometric variations, i.e. face pose and camera viewpoint, and discriminative power in a face detection setting. Lastly, we show that the contribution of the presented descriptor to face detection performance is significant and superior to several other color descriptors, which are in use for object detection. This color descriptor can be applied in color-based object detection and recognition tasks.

Paper Nr: 17
Title:

Mobility Assessment of Demented People Using Pose Estimation and Movement Detection - An Experimental Study in the Field of Ambient Assisted Living

Authors:

Julia Richter, Christian Wiede and Gangolf Hirtz

Abstract: The European population will steadily be growing older in the following decades. At the same time, the risk of getting dementia increases with higher age. Both these factors are apt to cause serious problems for society, especially with regard to the caring sector, which also suffers from a lack of qualified personnel. As technical support systems can be of assistance to medical staff and patients, a mobility assessment system for demented people is presented in this paper. The grade of mobility is measured by means of the person’s pose and movements in a monitored area. For this purpose, pose estimation and movement detection algorithms have been developed. These process 3-D data provided by an optical stereo sensor installed in a living environment. In order to train and test a discriminative classifier, a variety of labelled training and test data was recorded. Moreover, we designed a discriminative and universal feature vector for pose estimation. The experiments demonstrated that the algorithms work robustly. In connection with a human-machine interface, the system facilitates mobilisation as well as a more valid assessment of the patient’s medical condition than is presently the case.

Paper Nr: 19
Title:

Increased Fall Detection Accuracy in an Accelerometer-based Algorithm Considering Residual Movement

Authors:

Panagiotis Kostopoulos, Tiago Nunes, Kevin Salvi, Michel Deriaz and Julien Torrent

Abstract: Every year over 11 million falls are registered. Falls play a critical role in the deterioration of the health of the elderly and the subsequent need of care. This paper presents a fall detection system running on a smartwatch (F2D). Data from the accelerometer is collected, passing through an adaptive threshold-based algorithm which detects patterns corresponding to a fall. A decision module takes into account the residual movement of the user, matching a detected fall pattern to an actual fall. Unlike traditional systems which require a base station and an alarm central, F2D works completely independently. To the best of our knowledge, this is the first fall detection system which works on a smartwatch, being less stigmatizing for the end user. The fall detection algorithm has been tested by Fondation Suisse pour les Téléthèses (FST), the project partner for the commercialization of our system. Taking advantage of their experience with the end users, we are confident that F2D meets the demands of a reliable and easily extensible system. This paper highlights the innovative algorithm which takes into account residual movement to increase the fall detection accuracy and summarizes the architecture and the implementation of the fall detection system.

Paper Nr: 21
Title:

Towards Pose-free Tracking of Non-rigid Face using Synthetic Data

Authors:

Ngoc-Trung Tran, Fakhreddine Ababsa and Maurice Charbit

Abstract: Non-rigid face tracking has achieved many advances in recent years, but most empirical experiments are restricted to near-frontal faces. This report introduces a robust framework for pose-free tracking of non-rigid faces. Our method consists of two phases: training and tracking. In the training phase, a large offline synthesized database is built to train landmark appearance models using a linear Support Vector Machine (SVM). In the tracking phase, a two-step approach is proposed: the first step, namely initialization, uses 2D SIFT matching between the current frame and a set of adaptive keyframes to estimate the rigid parameters. The second step obtains the whole set of parameters (rigid and non-rigid) using a heuristic method via pose-wise SVMs. The combination of these aspects makes our method work robustly up to 90° of vertical axial rotation. Moreover, our method appears to be robust even in the presence of fast movements and tracking losses. Compared to other published algorithms, our method offers a very good compromise between rigid and non-rigid parameter accuracies. This study gives a promising perspective because of the good results in terms of pose estimation (average error less than 4° on the BUFT dataset) and landmark tracking precision (5.8 pixel error, compared to 6.8 for one state-of-the-art method on the Talking Face video). These results highlight the potential of using synthetic data to track non-rigid faces in unconstrained poses.

Paper Nr: 24
Title:

3-D Shape Matching for Face Analysis and Recognition

Authors:

Wei Quan, Bogdan Matuszewski and Lik-Kwan Shark

Abstract: The aims of this paper are to introduce a 3-D shape matching scheme for automatic face recognition and to demonstrate its invariance to pose and facial expressions. The core of this scheme lies in the combination of non-rigid deformation registration and statistical shape modelling. While the former matches 3-D faces regardless of facial expression variations, the latter provides a low-dimensional feature vector that describes the deformation after the shape matching process, thereby enabling robust identification of 3-D faces. In order to assist the establishment of accurate dense point correspondences, an isometric embedding shape representation is introduced, which is able to transform 3-D faces to a canonical form that retains the intrinsic geometric structure and achieves shape alignment of 3-D faces independent of the individual’s facial expression. The feasibility and effectiveness of the proposed method were investigated using the standard publicly available Gavab and BU-3DFE databases, which contain facial expression and pose variations. The performance of the system was compared with existing benchmark approaches, demonstrating that the proposed scheme provides a competitive solution for the face recognition task with real-world practicality.

Paper Nr: 30
Title:

Fast Discovery of Discriminative Mid-level Patches

Authors:

Angran Lin, Xuhui Jia and Kowk Ping Chan

Abstract: Learning discriminative mid-level patches has gained popularity in recent years, since they can be applied to various computer vision topics and achieve better performance. However, state-of-the-art learning methods require a lot of training time, especially when the problem scale becomes much larger. In this paper we propose a simple but fast and effective way, Fast Exemplar Clustering (FEC), to mine discriminative mid-level patches with only class labels provided. We verified our results on the task of scene classification, and it took only one day to train the model on the MIT Indoor 67 dataset using a Core i5 quad-core computer with Matlab. The results of our experiments revealed that the mid-level patches discovered by our method are semantically meaningful and achieve competitive accuracy compared to state-of-the-art techniques. In addition, we created a new scene classification dataset named Outdoor Sight 20, which contains outdoor views of 20 famous tourist attractions, to test our model.

Paper Nr: 67
Title:

Improvement of Recovering Shape from Endoscope Images Using RBF Neural Network

Authors:

Yuji Iwahori, Seiya Tsuda, Robert J. Woodham, M. K. Bhuyan and Kunio Kasugai

Abstract: The VBW (Vogel-Breuß-Weickert) model is proposed as a method to recover 3-D shape under point light source illumination and perspective projection. However, the VBW model recovers relative, not absolute, shape. Here, shape modification is introduced to recover the exact shape. Modification is applied to the output of the VBW model. First, a local brightest point is used to estimate the reflectance parameter from two images obtained with movement of the endoscope camera in depth. After the reflectance parameter is estimated, a sphere image is generated and used for Radial Basis Function Neural Network (RBF-NN) learning. The NN implements the shape modification. NN input is the gradient parameters produced by the VBW model for the generated sphere. NN output is the true gradient parameters for the true values of the generated sphere. Depth can then be recovered using the modified gradient parameters. Performance of the proposed approach is confirmed via computer simulation and real experiment.

Paper Nr: 77
Title:

3D Registration of Multi-modal Data Using Surface Fitting

Authors:

Amine Mahiddine, Rabah Iguernaissi, Djamal Merad, Pierre Drap and Jean-marc Boï

Abstract: The registration of two 3D point clouds is an essential step in many applications. The objective of our work is to estimate the best geometric transformation to merge two point clouds obtained from different sensors. In this paper, we present a new approach for feature extraction which is distinguished by the nature of the signature extracted for each point. The descriptor we propose is invariant to rotation and overcomes the problem of multiresolution. To validate our approach, we tested it on synthetic data and applied it to heterogeneous real data.

Paper Nr: 83
Title:

Dismantling Composite Visualizations in the Scientific Literature

Authors:

Po-Shen Lee and Bill Howe

Abstract: We are analyzing the visualizations in the scientific literature to enhance search services, detect plagiarism, and study bibliometrics. An immediate problem is the ubiquitous use of multi-part figures: single images with multiple embedded sub-visualizations. Such figures account for approximately 35% of the figures in the scientific literature. Conventional image segmentation techniques and other existing approaches have been shown to be ineffective for parsing visualizations. We propose an algorithm to automatically segment multi-chart visualizations into a set of single-chart visualizations, thereby enabling downstream analysis. Our approach first splits an image into fragments based on background color and layout patterns. An SVM-based binary classifier then distinguishes complete charts from auxiliary fragments such as labels, ticks, and legends, achieving an average 98.1% accuracy. Next, we recursively merge fragments to reconstruct complete visualizations, choosing between alternative merge trees using a novel scoring function. To evaluate our approach, we used 261 scientific multi-chart figures randomly selected from the Pubmed database. Our algorithm achieves 80% recall and 85% precision of perfect extractions for the common case of eight or fewer sub-figures per figure. Further, even imperfect extractions are shown to be sufficient for most chart classification and reasoning tasks associated with bibliometrics and academic search applications.

Paper Nr: 86
Title:

Automatic Image Annotation Using Convex Deep Learning Models

Authors:

Niharjyoti Sarangi and C. Chandra Sekhar

Abstract: Automatically assigning semantically relevant tags to an image is an important task in machine learning. Many algorithms have been proposed to annotate images based on features such as color, texture, and shape. Success of these algorithms is dependent on carefully handcrafted features. Deep learning models are widely used to learn abstract, high level representations from raw data. Deep belief networks, the most commonly used deep learning models, are formed by pre-training individual Restricted Boltzmann Machines in a layer-wise fashion, stacking them together, and then training the whole network using error back-propagation. In deep convolutional networks, the convolution operation is used to extract features from different sub-regions of the images to learn better representations. To reduce the time taken for training, models that use convex optimization and the kernel trick have been proposed. In this paper we explore two such models, the Tensor Deep Stacking Network and the Kernel Deep Convex Network, for the task of automatic image annotation. We use a deep convolutional network to extract high level features from raw images, and then use them as inputs to the convex deep learning models. Performance of the proposed approach is evaluated on benchmark image datasets.

Paper Nr: 117
Title:

Multi-Object Segmentation for Assisted Image reConstruction

Authors:

Sonia Caggiano, Maria De Marsico, Riccardo Distasi and Daniel Riccio

Abstract: MOSAIC is a tool for jigsaw puzzle solving. It is designed to assist cultural heritage operators in reconstructing broken pictorial artifacts from their fragments. These undergo feature extraction and feature based indexing, so that any fragment can be the key to queries about color distribution, shape and texture. Query results are listed in order of similarity, which helps the user to locate fragments likely to be near the key fragment in the original picture. A complete working protocol is provided to bring the user from the raw materials to a working database. System performance has been assessed with both computer simulations and a real case study involving the reconstruction of a XV century fresco.

Short Papers
Paper Nr: 20
Title:

Combining Fisher Vectors in Image Retrieval Using Different Sampling Techniques

Authors:

Tomás Mardones, Héctor Allende and Claudio Moraga

Abstract: This paper addresses the problem of content-based image retrieval in a large-scale setting. Most works in the area sample image patches using an affine invariant detector or in a dense fashion, but we show that both sampling methods are complementary. By using Fisher Vectors we show how several sampling methods can be combined in a simple fashion, incurring only a small fixed computational cost while significantly increasing the precision of the image retrieval system. As a second contribution, we show that Fisher Vectors using their variance component, normally ignored in image retrieval applications, outperform their mean component under certain relevant settings. Experiments with up to 1 million images indicate that the proposed method remains valid in large-scale image search.
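The combination idea above (concatenating Fisher Vectors from different sampling strategies, keeping both mean and variance components) can be sketched as follows. This is a single-Gaussian simplification for illustration only; real Fisher Vectors use a full GMM with posterior weighting, and the variable names are ours.

```python
import numpy as np

def fisher_vector(descriptors, mu, sigma):
    """Simplified single-Gaussian Fisher vector: mean and variance
    gradient components, L2-normalized (sketch of the idea only)."""
    d = (descriptors - mu) / sigma
    fv_mu = d.mean(axis=0)                                # mean component
    fv_var = ((d ** 2 - 1.0) / np.sqrt(2)).mean(axis=0)   # variance component
    fv = np.concatenate([fv_mu, fv_var])
    return fv / (np.linalg.norm(fv) + 1e-12)

rng = np.random.default_rng(0)
mu, sigma = np.zeros(8), np.ones(8)
dense = rng.normal(0.5, 1.0, size=(200, 8))    # dense-sampled patch descriptors
affine = rng.normal(-0.5, 1.0, size=(50, 8))   # detector-sampled descriptors
# combining the two samplings is a plain concatenation of their FVs
combined = np.concatenate([fisher_vector(dense, mu, sigma),
                           fisher_vector(affine, mu, sigma)])
```

The fixed extra cost of the combination is just the second encoding pass plus the longer vector, which matches the "small fixed computational cost" claim.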

Paper Nr: 36
Title:

Plane Fitting and Depth Variance Based Upsampling for Noisy Depth Map from 3D-ToF Cameras in Real-time

Authors:

Kazuki Matsumoto, Francois de Sorbier and Hideo Saito

Abstract: Recent advances of ToF depth sensor devices enable us to easily retrieve scene depth data with high frame rates. However, the resolution of the depth map captured from these devices is much lower than that of color images, and the depth data suffers from optical noise effects. In this paper, we propose an efficient algorithm that upsamples the depth map captured by ToF depth cameras and reduces noise. The upsampling is carried out by applying plane based interpolation to the groups of points similar to planar structures and depth variance based joint bilateral upsampling to curved or bumpy surface points. For dividing the depth map into piecewise planar areas, we apply superpixel segmentation and graph component labeling. In order to distinguish planar areas and curved areas, we evaluate the reliability of detected plane structures. Compared with other state-of-the-art algorithms, our method is observed to produce an upsampled depth map that is smoothed and closer to the ground truth depth map both visually and numerically. Since the algorithm is parallelizable, it can work in real-time by utilizing the highly parallel processing capabilities of modern commodity GPUs.
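The plane-based interpolation step rests on fitting a plane to a group of depth points; a minimal least-squares version is sketched below. This illustrates only the fitting primitive, not the authors' full segmentation-and-upsampling pipeline.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane z = a*x + b*y + c through a point group,
    the primitive behind plane-based depth interpolation (sketch)."""
    A = np.c_[points[:, 0], points[:, 1], np.ones(len(points))]
    coef, *_ = np.linalg.lstsq(A, points[:, 2], rcond=None)
    return coef  # (a, b, c)

# synthetic planar depth segment
rng = np.random.default_rng(1)
xy = rng.uniform(0, 10, size=(100, 2))
z = 0.3 * xy[:, 0] - 0.7 * xy[:, 1] + 2.0
pts = np.c_[xy, z]
a, b, c = fit_plane(pts)
```

Once the plane parameters are known, missing high-resolution depth values inside the segment can be filled in by evaluating `a*x + b*y + c`; the residual of the fit can also serve as a reliability score for deciding planar vs. curved, as the abstract mentions.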

Paper Nr: 42
Title:

Using Phase Congruency Model for Microaneurysms Detection in Fundus Image

Authors:

Zhitao Xiao, Fang Zhang, Lei Geng, Jun Wu, Xinpeng Zhang, Long Su and Chunyan Shan

Abstract: This paper addresses an automatic detection method for microaneurysms in color fundus images, which plays a key role in computer assisted diagnosis of diabetic retinopathy, a serious and frequent eye disease. The main focus of this paper is the detection of microaneurysms with phase congruency. The first step consists in image normalization and green channel extraction. The second step aims at obtaining microaneurysm candidate regions, which is achieved using phase congruency. Then the irrelevant information, such as vessel fragments, is removed by constructing directional cross-section profiles. Through testing on 50 fundus images provided by the ROC website, the experimental results show that this method can accurately detect microaneurysms in color fundus images.

Paper Nr: 44
Title:

Automated Respiration Detection from Neonatal Video Data

Authors:

Ninah Koolen, Olivier Decroupet, Anneleen Dereymaeker, Katrien Jansen, Jan Vervisch, Vladimir Matic, Bart Vanrumste, Gunnar Naulaers, Sabine Van Huffel and Maarten De Vos

Abstract: In the interest of neonatal comfort, the need for noncontact respiration monitoring increases. Moreover, home respiration monitoring would be beneficial. Therefore, the goal is to extract the respiration rate from video data included in a polysomnography. The presented method first uses Eulerian video magnification to amplify the respiration movements. A respiration signal is obtained through the optical flow algorithm. Independent component analysis and principal component analysis are applied to improve the signal quality, yielding only minor enhancement. The respiratory rate is extracted as the dominant frequency in the spectrograms obtained using the short-time Fourier transform. Respiratory rate detection is successful (94.12%) for most patients during quiet sleep stages. Real-time monitoring could possibly be achieved by lowering the spatial and temporal resolutions of the input video data. The outline for successful video-aided detection of the respiration pattern is shown, thereby paving the way for improvement of the overall assessment in the NICU and application in a home-friendly environment.

Paper Nr: 49
Title:

Speaker Identification with Short Sequences of Speech Frames

Authors:

Giorgio Biagetti, Paolo Crippa, Alessandro Curzi, Simone Orcioni and Claudio Turchetti

Abstract: In biometric person identification systems, speaker identification plays a crucial role as the voice is the most natural signal to produce and the simplest to acquire. Mel frequency cepstral coefficients (MFCCs) have been widely adopted for decades in speech processing to capture the speech-specific characteristics with a reduced dimensionality. However, although their ability to de-correlate the vocal source and the vocal tract filter makes them suitable for speech recognition, they exhibit some drawbacks in speaker recognition. This paper presents an experimental evaluation showing that reducing the dimension of features by using the discrete Karhunen-Loève transform (DKLT) guarantees better performance than conventional MFCC features. In particular, with short sequences of speech frames, that is, with utterance durations of less than 1 s, the performance of the truncated DKLT representation is always better than that of the MFCC.
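The truncated DKLT reduction described here amounts to projecting speech frames onto the leading eigenvectors of the data covariance. A minimal sketch (essentially PCA, which is what the DKLT reduces to for sample data; all names and sizes are illustrative):

```python
import numpy as np

def truncated_dklt(frames, k):
    """Discrete Karhunen-Loeve transform, truncated: project frames onto
    the k eigenvectors of the sample covariance with largest eigenvalues."""
    centered = frames - frames.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(eigval)[::-1][:k]          # keep the top-k
    return centered @ eigvec[:, order]

rng = np.random.default_rng(2)
frames = rng.normal(size=(40, 13))                # e.g. 40 frames x 13 coeffs
reduced = truncated_dklt(frames, 5)
```

Unlike the fixed DCT basis inside MFCC extraction, the DKLT basis is learned from the data, which is the property the evaluation above exploits.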

Paper Nr: 64
Title:

An Online Vector Error Correction Model for Exchange Rates Forecasting

Authors:

Paola Arce, Jonathan Antognini, Werner Kristjanpoller and Luis Salinas

Abstract: Financial time series are known for their non-stationary behaviour. However, sometimes they exhibit some stationary linear combinations. When this happens, it is said that those time series are cointegrated. The Vector Error Correction Model (VECM) is an econometric model which characterizes the joint dynamic behaviour of a set of cointegrated variables in terms of forces pulling towards equilibrium. In this study, we propose an Online VEC model (OVECM) which optimizes how model parameters are obtained using a sliding window of the most recent data. Our proposal also takes advantage of the long-run relationship between the time series in order to obtain improved execution times. Our proposed method is tested using four foreign exchange rates at a 1-minute frequency, all related to the USD currency base. OVECM is compared with VECM and ARIMA models in terms of forecasting accuracy and execution times. We show that OVECM outperforms ARIMA forecasting and enables execution time to be reduced considerably while maintaining good accuracy levels compared with VECM.
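The sliding-window re-estimation idea behind OVECM can be illustrated with a much simpler model: re-fit on only the most recent window, then forecast one step ahead. The sketch below uses an AR(1) stand-in rather than a VECM, purely to show the windowing mechanics; it is not the paper's model.

```python
import numpy as np

def sliding_window_forecast(series, window):
    """Re-fit a least-squares AR(1) model y[t+1] = a*y[t] + b on the
    most recent `window` points and forecast one step ahead."""
    y = series[-window:]
    X = np.c_[y[:-1], np.ones(window - 1)]        # regressors: lag + intercept
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef[0] * y[-1] + coef[1]

series = np.array([1.0, 1.1, 1.2, 1.3, 1.4, 1.5])
pred = sliding_window_forecast(series, window=5)
```

Restricting estimation to the window keeps each update cheap, which is where the execution-time gains reported above come from.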

Paper Nr: 68
Title:

Cooperative Gesture Recognition - Learning Characteristics of Classifiers and Navigating the User to an Ideal Situation

Authors:

Hiromasa Yoshimoto and Yuichi Nakamura

Abstract: This paper introduces a novel gesture-interface scheme that guides the user toward better performance and usability. The accuracy of gesture recognition is heavily affected by how the user makes postures and moves, as well as by environmental conditions such as lighting. The usability of the gesture interface can potentially be improved by notifying the user of when and how better accuracy is obtained. For this purpose, we propose a method for estimating the performance of gesture recognition in its current condition, and a method for suggesting possible ways to improve performance to the user. In performance estimation, accuracy in the current condition is estimated based on supervised learning with a large number of samples and corresponding ground truths. If the estimated accuracy is insufficient, the module searches for better conditions that can be reached with the user’s cooperation. If a good improvement is possible, the way to achieve it is communicated to the user through visual feedback, which shows how to avoid or how to recover from the undesirable condition. In this way, users gain better accuracy and usability by cooperating with the gesture interface.

Paper Nr: 88
Title:

Gunshot Classification from Single-channel Audio Recordings using a Divide and Conquer Approach

Authors:

Héctor A. Sánchez-Hevia, David Ayllón, Roberto Gil-Pita and Manuel Rosa-zurera

Abstract: Gunshot acoustic analysis is a field with many practical applications, but due to the multitude of factors involved in the generation of the acoustic signature of firearms, it is not a trivial task, especially since the recorded waveforms show a strong dependence on the shooter’s position and orientation, even when firing the same weapon. In this paper we address acoustic weapon classification using pattern recognition techniques with single channel recordings while taking into account the spatial aspect of the problem, so departing from the typical approach. We are working with three broad categories: rifles, handguns and shotguns. Our approach is based on two proposals: a Divide and Conquer classification strategy and the inclusion of some novel features based on the physical model of gunshot acoustics. The Divide and Conquer strategy is aimed at improving the rate of success of the classification stage by using previously retrieved spatial information to select between a set of specialized weapon classifiers. The minimum relative error reduction achieved when both proposals are used, compared with a single-stage classifier employing traditional features is 38.7%.
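The Divide and Conquer strategy described above, i.e. using retrieved spatial information to select among specialized classifiers, has the following overall shape. The classifiers here are toy threshold rules and every name is illustrative; the paper's actual classifiers and features are not reproduced.

```python
import numpy as np

def divide_and_conquer_predict(x, position_clf, specialists):
    """Two-stage scheme: a first classifier estimates a coarse spatial
    condition, which selects a specialized weapon classifier."""
    zone = position_clf(x)          # stage 1: spatial information
    return specialists[zone](x)     # stage 2: specialized classifier

# toy stand-ins: zone from feature 0, weapon class from feature 1
position_clf = lambda x: "near" if x[0] < 0.5 else "far"
specialists = {
    "near": lambda x: "handgun" if x[1] < 0 else "rifle",
    "far":  lambda x: "shotgun" if x[1] < 0 else "rifle",
}
label = divide_and_conquer_predict(np.array([0.2, -1.0]),
                                   position_clf, specialists)
```

Each specialist only has to separate classes within one spatial regime, which is what drives the reported error reduction over a single-stage classifier.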

Paper Nr: 92
Title:

Voice Verification System for Mobile Devices based on ALIZE/LIA_RAL

Authors:

Hussein Sharafeddin, Mageda Sharafeddin and Haitham Akkary

Abstract: The main contribution of this paper is an architecture that lets mobile users authenticate their identity through short text phrases, using the robust open-source voice recognition library ALIZE and the speaker recognition tool LIA_RAL. Our architecture consists of a server connected to a group of subscribed mobile devices. The server is mainly needed for training the world model, while user training and verification run on the individual mobile devices. The server uses a number of public random speaker text-independent voice files to generate the data, including the world model, used in training and calculating scores. The server data are shipped with the initial install of our package and with every subsequent package update to all subscribed mobile devices. For security purposes, the training data, consisting of raw voice and processed files of each user, reside on the user’s device only. Verification is based on short text-independent as well as text-dependent phrases, for ease of use and enhanced performance, which are processed and scored against the user’s trained model. While we implemented our voice verification in Android, the system will perform as efficiently in iOS; it will in fact be easier to implement since the base libraries are all written in C/C++. We show that the verification success rate of our system is 82%. Our system provides a free, robust alternative to commercial voice identification and verification tools, and it is extensible to the more advanced mathematical models available in ALIZE that have been shown to improve voice recognition.

Paper Nr: 105
Title:

Assessment of Dendritic Cell Therapy Effectiveness Based on the Feature Extraction from Scientific Publications

Authors:

Alexey Yu. Lupatov, Alexander I. Panov, Roman E. Suvorov, Alexander V. Shvets, Konstantin N. Yarygin and Galina D. Volkova

Abstract: Dendritic cell (DC) vaccination is a promising way to combat cancer metastases, especially in the case of immunogenic tumors. Unfortunately, it is only rarely possible to achieve a satisfactory clinical outcome in the majority of patients treated with a particular DC vaccine. Apparently, DC vaccination can be successful with certain combinations of features of the tumor and the patient’s immune system that are not yet fully revealed. Difficulty in predicting the results of the therapy and the high price of preparing individual vaccines prevent wider use of DC vaccines in medical practice. Here we propose an approach aimed at uncovering correlations between the effectiveness of specific DC vaccine types and personal characteristics of patients, in order to increase the efficiency of cancer treatment and reduce costs. To accomplish this, we suggest a two-step analysis of published clinical trial results for DC vaccines: first, the information extraction subsystem is trained, and, second, the extracted data is analyzed using the JSM and AQ methodologies.

Paper Nr: 107
Title:

A Fourth Order Tensor Statistical Model for Diffusion Weighted MRI - Application to Population Comparison

Authors:

Theodosios Gkamas, Félix Renard, Christian Heinrich and Stéphane Kremer

Abstract: In this communication, we propose an original statistical model for diffusion-weighted magnetic resonance imaging, in order to determine new biomarkers. Second order tensor (T2) modeling of Orientation Distribution Functions (ODFs) is popular and has benefited from specific statistical models, incorporating appropriate metrics. Nevertheless, the shortcomings of T2s, for example for the modeling of crossing fibers, are well identified. We consider here fourth order tensor (T4) models for ODFs, thus alleviating the T2 shortcomings. We propose an original metric in the T4 parameter space. This metric is incorporated in a nonlinear dimension reduction procedure. In the resulting reduced space, we represent the probability density of the two populations, normal and abnormal, by kernel density estimation with a Gaussian kernel, and propose a permutation test for the comparison of the two populations. The proposed model is applied to synthetic and real data, and the relevance of the approach is shown.

Paper Nr: 115
Title:

Adaptive Traffic Signal Control of Bottleneck Subzone based on Grey Qualitative Reinforcement Learning Algorithm

Authors:

Junping Xiang and Zonghai Chen

Abstract: A Grey Qualitative Reinforcement Learning algorithm is presented in this paper to realize adaptive signal control of a bottleneck subzone, which is described as a nonlinear optimization problem. In order to handle the uncertainties in the traffic flow system, a grey theory model and a qualitative method are used to express the sensor data. In order to avoid deducing the functional relationship between the traffic flow and the timing plan, a grey reinforcement learning algorithm, the main innovation of this paper, is proposed to seek the solution. In order to enhance the generalization capability of the system, avoid the "curse of dimensionality" and improve the convergence speed, a BP neural network is used to approximate the Q-function. We perform three simulation experiments (calibrated with real data) using four evaluation indicators for comparison and analysis. Simulation results show that the proposed method can significantly improve the traffic situation of the bottleneck subzone, and that the algorithm has good robustness and low noise sensitivity.
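The core mechanism here, Q-learning with a function approximator instead of a table, can be sketched with a linear approximator in place of the paper's BP neural network (the update rule is the same shape; only the model class differs). All names are illustrative.

```python
import numpy as np

def q_update(w, phi_s, a, r, phi_next, alpha=0.1, gamma=0.9):
    """One Q-learning step with a linear approximator Q(s,a) = w[a] @ phi(s),
    standing in for the BP-network Q-function described above (sketch)."""
    target = r + gamma * max(w[b] @ phi_next for b in range(len(w)))
    td_error = target - w[a] @ phi_s
    w[a] = w[a] + alpha * td_error * phi_s        # gradient step on action a
    return w

w = np.zeros((2, 3))                              # 2 actions, 3 state features
phi_s = np.array([1.0, 0.0, 0.0])                 # current state features
phi_next = np.array([0.0, 1.0, 0.0])              # next state features
w = q_update(w, phi_s, a=0, r=1.0, phi_next=phi_next)
```

Replacing the table with an approximator is exactly what sidesteps the "curse of dimensionality" the abstract mentions: the number of parameters grows with the feature dimension, not with the number of states.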

Paper Nr: 132
Title:

Learning Dynamic Systems from Time-series Data - An Application to Gene Regulatory Networks

Authors:

Ivo J. P. M. Timoteo and Sean B. Holden

Abstract: We propose a local search approach for learning dynamic systems from time-series data, using networks of differential equations as the underlying model. We evaluate the performance of our approach for two scenarios: first, by comparing with an l1-regularization approach under the assumption of a uniformly weighted network for identifying systems of masses and springs; and then on the task of learning gene regulatory networks, where we compare it with the best performers in the DREAM4 challenge using the original dataset for that challenge. Our method consistently improves on the performance of the other methods considered in both scenarios.

Paper Nr: 140
Title:

Estimation of Postoperative Knee Flexion at Initial Contact of Cerebral Palsy Children using Neural Networks

Authors:

Omar A. Galarraga C., Vincent Vigneron, Bernadette Dorizzi, Néjib Khouri and Eric Desailly

Abstract: Cerebral Palsy affects walking and often produces excessive knee flexion at initial contact (KFIC). Hamstring lengthening surgery (HL) is applied to decrease KFIC. The objective of this work is to design a simulator of the effect of HL on KFIC that could be used as a decision-making tool. The postoperative KFIC is estimated given the preoperative gait, physical examination and the type of surgery. Nonlinear data fitting is performed by feedforward neural networks. The mean regression error on test is 9.25 degrees and 63.21% of subjects are estimated within an error range of 10 degrees. The simulator is able to give good estimations independently of the preoperative gait parameters and the type of surgery. This system predicts the outcomes of orthopaedic surgery on CP children with real gait parameters, and not with qualitative characteristics.

Paper Nr: 141
Title:

Assessment of the Extent of the Necessary Clinical Testing of New Biotechnological Products Based on the Analysis of Scientific Publications and Clinical Trials Reports

Authors:

Roman Suvorov, Ivan Smirnov, Konstantin Popov, Nikolay Yarygin and Konstantin Yarygin

Abstract: To estimate patients' risks and make clinical decisions, evidence-based medicine (EBM) relies upon the results of reproducible trials and experiments supported by accurate mathematical methods. Experimental and clinical evidence is crucial, but laboratory testing and especially clinical trials are expensive and time-consuming. On the other hand, a new medical product to be evaluated may be similar to one or many already tested. Results of studies hitherto performed with similar products may be a useful tool to determine the extent of further pre-clinical and clinical testing. This paper suggests a workflow design aimed at supporting such an approach, including methods for information collection, assessment of research reliability, extraction of structured information about trials, and meta-analysis. Additionally, the paper contains a discussion of the issues emerging during development of an integrated software system that implements the proposed workflow.

Paper Nr: 145
Title:

Prosody based Automatic Classification of the Uses of French ‘Oui’ as Convinced or Unconvinced Uses

Authors:

Abdenour Hacine-Gharbi, Mélanie Petit, Philippe Ravier and François Némo

Abstract: When working with oral speech, the issue of natural meaning processing can be improved using easily available prosodic information. Only recently have semanticists started to consider that prosodic features could play a key role in the interpretation and classification of a word's different uses. In this work, we propose a prosody-based automatic system that classifies the French word ‘oui’ into one of the classes ‘conviction’ or ‘lack of conviction’. To that aim, a questionnaire inspired by opinion polls was created, yielding 118 occurrences for both classes of ‘oui’. Combined with a feature selection procedure, the best classification rate decreases from 85.45% (speaker-dependent mode) to 79.25% (speaker-independent mode, which is closer to an application). Interestingly, we also introduce the ‘shuttle’ principle, which seeks to validate the semantic interpretation through prosodic analysis.

Paper Nr: 146
Title:

Predicting Alzheimer’s Disease - A Neuroimaging Study with 3D Convolutional Neural Networks

Authors:

Adrien Payan and Giovanni Montana

Abstract: Pattern recognition methods using neuroimaging data for the diagnosis of Alzheimer’s disease have been the subject of extensive research in recent years. In this paper, we use deep learning methods, and in particular sparse autoencoders and 3D convolutional neural networks, to build an algorithm that can predict the disease status of a patient, based on an MRI scan of the brain. We report on experiments using the ADNI data set involving 2,265 historical scans. We demonstrate that 3D convolutional neural networks outperform several other classifiers reported in the literature and produce state-of-the-art results.

Posters
Paper Nr: 6
Title:

Speckled Images Segmentation and Algorithm Comparison

Authors:

Luigi Cinque, Rossella Cossu and Rosa Maria Spitaleri

Abstract: An image segmentation process based on the level set method consists in the time evolution of an initial curve until it reaches the boundary of the objects to be extracted. Classically, the evolution of the initial curve is determined by a speed function. In this paper, the speed in the level set procedure is characterized by the combination of two different speed functions, and the resulting algorithm is applied to speckled images, such as SAR (Synthetic Aperture Radar) images. In order to assess improvements of the segmentation performance, the computational process is tested on synthetic images and then applied to real images. Performances are evaluated on synthetic images by using the Hausdorff distance. The real SAR images were acquired during the ERS2 mission.

Paper Nr: 11
Title:

FOREST - A Flexible Object Recognition System

Authors:

Julia Moehrmann and Gunther Heidemann

Abstract: Despite the growing importance of image data, image recognition has taken a permanent role in everyday life only in specific areas. The reason is the complexity of currently available software and the difficulty of developing image recognition systems. Currently available software frameworks expect users to have a comparatively high level of programming and computer vision skills. FOREST – a flexible object recognition framework – strives to overcome this drawback. It was developed for non-expert users with little-to-no knowledge of computer vision and programming. While other image recognition systems focus solely on the recognition functionality, FOREST covers all steps of the development process, including selection of training data, ground truth annotation, and investigation of classification results and of possible skews in the training data. The software is highly flexible and performs the computer vision functionality autonomously by applying several feature detection and extraction operators in order to capture important image properties. Despite the use of weakly supervised learning, applications developed with FOREST achieve recognition rates between 86% and 99% and are comparable to state-of-the-art recognition systems.

Paper Nr: 22
Title:

Automatic Tooth Identification in Dental Panoramic Images with Atlas-based Models

Authors:

Selma Guzel, Ayse Betul Oktay and Kadir Tufan

Abstract: After catastrophes and mass disasters, accurate and efficient identification of decedents requires an automatic system which depends upon strong biometrics. In this paper, we present an automatic tooth detection and labeling system based on panoramic dental radiographs. Although our ultimate objective is to identify decedents by comparing the postmortem and antemortem dental radiographs, this paper only involves the tooth detection and the tooth labeling stages. In the system, the tooth regions are first determined and the detection module runs for each region individually. By employing the sliding window technique, the Haar features are extracted from each window and the SVM classifies the windows as tooth or not. The labeling module labels the candidate tooth positions determined by the SVM with an atlas-based model and the final tooth positions are inferred. The novelty of our system is combining the atlas-based model with the SVM under the same framework. We tested our system on 35 panoramic images and the results are promising.
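The sliding-window detection stage described above (scan windows, classify each as tooth or not) has the following generic shape. The classifier below is a toy threshold rule, not the Haar+SVM of the paper, and every name is illustrative.

```python
import numpy as np

def sliding_window_detect(image, win, step, classify):
    """Scan square windows over the image and return top-left corners
    the classifier accepts (generic sliding-window stage, sketch only)."""
    h, w = image.shape
    hits = []
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            if classify(image[y:y + win, x:x + win]):
                hits.append((y, x))
    return hits

# synthetic image with one bright blob standing in for a tooth
img = np.zeros((20, 20))
img[8:14, 8:14] = 1.0
hits = sliding_window_detect(img, win=6, step=2,
                             classify=lambda patch: patch.mean() > 0.9)
```

In the paper, the candidate positions produced by this stage are then disambiguated by the atlas-based labeling model rather than taken as final.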

Paper Nr: 34
Title:

Joint Under and Over Water Calibration of a Swimmer Tracking System

Authors:

Sebastian Haner, Linus Svärm, Erik Ask and Anders Heyden

Abstract: This paper describes a multi-camera system designed for capture and tracking of swimmers both above and below the surface of a pool. To be able to measure the swimmer’s position, the cameras need to be accurately calibrated. Images captured below the surface provide a number of challenges, mainly due to refraction and reflection effects at optical media boundaries. We present practical methods for intrinsic and extrinsic calibration of two sets of cameras, optically separated by the water surface, and for stitching panoramas allowing synthetic panning shots of the swimmer.

Paper Nr: 47
Title:

Keyword based Keyframe Extraction in Online Video Collections

Authors:

Edoardo Ardizzone, Marco La Cascia and Giuseppe Mazzola

Abstract: Keyframe extraction methods aim to find in a video sequence the most significant frames, according to specific criteria. In this paper we propose a new method to search, in a video database, for frames that are related to a given keyword, and to extract the best ones, according to a proposed quality factor. We first exploit a speech to text algorithm to extract automatic captions from all the video in a specific domain database. Then we select only those sequences (clips), whose captions include a given keyword, thus discarding a lot of information that is useless for our purposes. Each retrieved clip is then divided into shots, using a video segmentation method, that is based on the SURF descriptors and keypoints. The sentence of the caption is projected onto the segmented clip, and we select the shot that includes the input keyword. The selected shot is further inspected to find good quality and stable parts, and the frame which maximizes a quality metric is selected as the best and the most significant frame. We compare the proposed algorithm with another keyframe extraction method based on local features, in terms of Significance and Quality.
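The final step above, picking the frame of a shot that maximizes a quality metric, can be sketched as an argmax over per-frame scores. The gradient-energy sharpness proxy below is our own illustrative choice; the paper's actual quality factor differs.

```python
import numpy as np

def best_frame(frames):
    """Pick the index of the frame maximizing a simple quality score
    (gradient energy as a sharpness proxy; illustrative only)."""
    def score(f):
        gy, gx = np.gradient(f.astype(float))
        return (gx ** 2 + gy ** 2).mean()
    return int(np.argmax([score(f) for f in frames]))

blurry = np.full((16, 16), 0.5)          # flat frame, zero gradient energy
sharp = np.zeros((16, 16))
sharp[:, 8:] = 1.0                       # strong vertical edge
idx = best_frame([blurry, sharp, blurry])
```

Any per-frame quality metric can be dropped into `score` without changing the selection logic.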

Paper Nr: 54
Title:

Privacy Aware Person-specific Assisting System for Home Environment

Authors:

Ahmad Rabie and Uwe Handmann

Abstract: As smart homes become more and more popular, the need for assisting systems that interface between users and home environments is growing. Furthermore, for people living in such homes, elderly and disabled people in particular, it is very important to develop devices that can support and aid them in their ordinary daily life. In this work we focus on preserving the privacy of the user during real interaction with the surrounding home environment. A smart, person-specific assistant system for services in the home environment is proposed. The role of this system is to assist persons by controlling home activities and guiding the adaptation of the Smart-Home-Human interface towards the needs of the considered person, while at the same time preserving the privacy of its interaction partner. As a special case of medical assistance, the system is implemented so that it provides person-specific medical assistance for elderly or disabled people. The system has the ability to identify its interaction partner using biometric features. According to the recognized ID, the system first adapts to the needs of the recognized person; second, it presents a person-specific list of medicines either visually or audibly; and third, it gives an alarm if a medicament is taken later or earlier than its normal time.

Paper Nr: 66
Title:

Learning to Predict Video Saliency using Temporal Superpixels

Authors:

Anurag Singh, Chee-Hung Henry Chu and Michael A. Pratt

Abstract: Visual Saliency of a video sequence can be computed by combining spatial and temporal features that attract a user’s attention to a group of pixels. We present a method that computes video saliency by integrating these features: color dissimilarity, objectness measure, motion difference, and boundary score. We use temporal clusters of pixels, or temporal superpixels, to simulate attention associated with a group of moving pixels in a video sequence. The features are combined using weights learned by a linear support vector machine in an online fashion. The temporal linkage for superpixels is then used to find the saliency flow across the image frames. We experimentally demonstrate the efficacy of the proposed method and that the method has better performance when compared to state-of-the-art methods.
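The fusion step described above, combining per-superpixel feature scores with weights such as those a linear SVM would learn, reduces to a weighted sum of feature maps. The sketch below shows only that fusion and normalization; the feature maps and weights are synthetic stand-ins, not the paper's learned values.

```python
import numpy as np

def saliency_map(feature_maps, weights):
    """Combine feature score maps (e.g. color dissimilarity, objectness,
    motion difference, boundary score) with linear weights, then
    normalize the result to [0, 1] (fusion step only, sketch)."""
    combined = sum(w * f for w, f in zip(weights, feature_maps))
    lo, hi = combined.min(), combined.max()
    return (combined - lo) / (hi - lo + 1e-12)

rng = np.random.default_rng(4)
maps = [rng.random((6, 8)) for _ in range(4)]     # four toy feature maps
sal = saliency_map(maps, weights=[0.4, 0.3, 0.2, 0.1])
```

In the paper the weights are learned online by a linear SVM and the scores live on temporal superpixels rather than a pixel grid, but the combination itself is this linear form.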

Paper Nr: 76
Title:

Collaborative Tracking and Distributed Control for an IP-PTZ Camera Network

Authors:

Pierrick Paillet, Romaric Audigier, Frédéric Lerasle and Quoc-Cuong Pham

Abstract: Networked Pan-Tilt-Zoom cameras (IP-PTZ) are increasingly used in video-surveillance areas thanks to their active perception capabilities. However, they often exhibit large execution latency and motion delays, making them ill-suited for autonomous tracking tasks. Targets may be lost due to inaccuracies in motion control, and delayed reactions drive cameras to expired target positions. Furthermore, zoom increases this risk as the field of view (FoV) is reduced. However, PTZ cameras in a network may collaborate to increase their accuracy, compare their estimations to detect potential failures, and react accordingly. In this article, we present an original approach to collaboration between IP-PTZ cameras having a joint FoV, in order to achieve robust tracking. First, an asynchronous fusion filter detects tracking failure while efficiently handling inherent IP-PTZ network delays. Then, a distributed control strategy anticipates and optimizes motion orders to increase the system reactivity.

Paper Nr: 81
Title:

Application of Face Verification in Automated Passenger Clearance System

Authors:

Wentao Shen, Yicong Liang, Xiaoqing Ding and Changsong Liu

Abstract: It is important to improve the performance of the verification algorithm in an automated passenger clearance system. The aim is to guarantee a low manual intervention rate under a given false acceptance rate (FAR). We analysed both the program flow of face verification and the hardware system integration. Specifically, a multi-pose local binary pattern (MS-LBP) feature is used to find faces and locate eyes in blurred face images with pose variation under complex illumination conditions. A new histogram of oriented gradients feature tailored for faces (FHOG) is introduced to describe faces. Finally, a heterogeneous feature projection algorithm is provided to improve the feature discriminability between probe and gallery images in the verification system. Experiments showed that the upgraded system performs 20% better than the original version, which demonstrates the effectiveness of the algorithm.

Paper Nr: 91
Title:

Fast Classification of Dust Particles from Shadows

Authors:

Elio D. Di Claudio, Giovanni Jacovitti, Gianni Orlandi and Andrea Proietti

Abstract: A fast and versatile method for classifying dust particles dispersed in the air is presented. The method uses images captured by a simple imaging system composed of a photographic sensor array and an illuminating source. The device is exposed to free particulate deposition from the environment, and the accumulation is measured by observing the shadows that the airborne particles cast onto the photographic sensor. Particles are detected and classified in order to measure their density and to analyse their composition. To this purpose, the contour paths of the particle shadows are traced. Then, distinctive features of single particles, such as dimension and morphology, are extracted by looking at corresponding features of the sequence of local orientation changes along the contours. Discrimination between dust and fibre particles is efficiently done using the varimax norm of these orientation changes. Field examples show that the technique is well suited for quantitative and qualitative dust analysis in real environments.
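Assuming the common definition of the varimax norm of a sequence as the ratio of its fourth moment to its squared second moment (a kurtosis-like sparsity measure; this definition is our assumption, not stated in the abstract), the dust/fibre discriminant can be sketched as:

```python
def varimax_norm(changes):
    # Ratio of the 4th moment to the squared 2nd moment: close to 1/n for
    # orientation changes spread evenly along a contour (compact grains),
    # close to 1 when they concentrate in a few points (elongated fibres).
    s2 = sum(c * c for c in changes)
    s4 = sum(c ** 4 for c in changes)
    return s4 / (s2 * s2)

# Toy contours: a round grain turns a little at every step; a fibre turns
# sharply only at its two tips.
grain = [0.17] * 36
fibre = [1.5, 1.5] + [0.01] * 34
```

Under this definition, a simple threshold on the varimax norm would separate the two classes.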

Paper Nr: 98
Title:

The Symmetry of Oligonucleotide Distance Distributions in the Human Genome

Authors:

Ana Helena Tavares, Vera Afreixo, João M. O. S. Rodrigues and Carlos A. C. Bastos

Abstract: The inter-oligonucleotide distance is defined as the distance to the next occurrence of the same oligonucleotide. In this work, using the inter-oligonucleotide distance concept, we develop new methods to evaluate the lack of homogeneity in symmetric word pairs (pairs of reverse-complement oligonucleotides) within equivalent composition groups. We apply the developed methods to the human genome and conclude that a strong similarity exists between the distance distributions of symmetric oligonucleotides. We also conclude that exceptional distance symmetry is present in several equivalent composition groups, that is, there is a strong lack of homogeneity in the group and a strong homogeneity in the included symmetric word pairs. This suggests a parity rule stronger than Chargaff’s: in the human genome, symmetric oligonucleotides have equivalent occurrence frequencies and, additionally, similar distance distributions.
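The inter-oligonucleotide distance concept is straightforward to compute. A minimal Python sketch (the word length, example sequence, and function names are ours, for illustration only):

```python
from collections import defaultdict

def revcomp(word):
    # Reverse complement, used to form symmetric word pairs.
    comp = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(comp[b] for b in reversed(word))

def distance_distributions(seq, k):
    # Distance of each k-mer occurrence to the next occurrence of the
    # same k-mer, measured between start positions.
    last, dists = {}, defaultdict(list)
    for i in range(len(seq) - k + 1):
        w = seq[i:i + k]
        if w in last:
            dists[w].append(i - last[w])
        last[w] = i
    return dists

d = distance_distributions("ACGTACGTTACG", 3)
# "ACG" occurs at positions 0, 4 and 9, giving distances [4, 5]; the paper
# compares d[w] with d[revcomp(w)] over the whole genome.
```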

Paper Nr: 102
Title:

Fast Regularized Least Squares and k-means Clustering Method for Intrusion Detection Systems

Authors:

Parisa Movahedi, Paavo Nevalainen, Markus Viljanen and Tapio Pahikkala

Abstract: Intrusion detection systems (IDS) are intended for reliable, accurate and efficient detection of attacks in a large networked system. Machine learning methods have shown promising results in terms of accuracy, but one disadvantage they share when applied to intrusion detection is the high computational cost of the training and prediction phases. Recently, some methods have been introduced to improve this efficiency. Kernel-based methods are among the most popular in the literature, and extending them with the approximation techniques described in this paper greatly reduces the computational time of the intrusion detection system. This paper proposes using optimized Regularized Least Squares (RLS) classification combined with k-means clustering. Standard techniques are used to choose the optimal RLS predictor parameters. The optimization leads to fewer basis vectors, which improves the prediction speed of the IDS. Evaluated on the KDD99 benchmark IDS dataset, our algorithm demonstrates considerable improvements in the training and prediction times of intrusion detection while maintaining accuracy.
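The core idea, using k-means centers as a reduced basis for a kernel RLS predictor, can be sketched as follows. This is a subset-of-regressors formulation under our own assumptions, not the authors' optimized implementation; all names and the toy data are illustrative:

```python
import math

def kmeans(points, k, iters=20):
    # Plain Lloyd's algorithm; the centers serve as RLS basis vectors.
    centers = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
            groups[j].append(p)
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

def rbf(x, z, gamma=1.0):
    # Gaussian kernel between two feature vectors.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting for small systems.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def train_rls(X, y, basis, lam=0.1, gamma=1.0):
    # Subset-of-regressors RLS: coefficients live only on the k basis
    # vectors, so training solves a k x k system instead of an n x n one.
    k = len(basis)
    Kb = [[rbf(u, v, gamma) for v in basis] for u in basis]
    Kx = [[rbf(x, b, gamma) for b in basis] for x in X]
    A = [[sum(Kx[i][r] * Kx[i][c] for i in range(len(X))) + lam * Kb[r][c]
          for c in range(k)] for r in range(k)]
    rhs = [sum(Kx[i][r] * y[i] for i in range(len(X))) for r in range(k)]
    a = solve(A, rhs)
    return lambda x: sum(ai * rbf(x, b, gamma) for ai, b in zip(a, basis))

# Toy connection records: label -1 for normal traffic, +1 for attacks.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
y = [-1, -1, -1, 1, 1, 1]
f = train_rls(X, y, kmeans(X, 2))
```

The sign of `f(x)` classifies a record; prediction cost scales with the number of basis vectors rather than with the training set size, which is the speed-up the abstract refers to.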

Paper Nr: 109
Title:

Single-frame Image Denoising and Inpainting Using Gaussian Mixtures

Authors:

Afonso Teodoro, Mariana Almeida and Mario Figueiredo

Abstract: This paper proposes a patch-based method to address two of the core problems in image processing: denoising and inpainting. The approach is based on a Gaussian mixture model estimated exclusively from the observed image via the expectation-maximization algorithm, from which the minimum mean squared error estimate is computed in closed form. The results show that this simple method performs on the same level as other state-of-the-art algorithms.
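A 1-D analogue of this scheme (EM-fitted Gaussian mixture on the noisy samples, then a closed-form MMSE estimate mixing per-component Wiener filters) can be sketched as follows; the toy signal, parameter names, and the noise-variance subtraction step are our own illustrative assumptions, not the authors' patch-based implementation:

```python
import math, random

def gauss(x, mu, var):
    # Univariate normal density.
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm(data, k=2, iters=50):
    # Fit a 1-D Gaussian mixture to the noisy observations by EM.
    n = len(data)
    mus = [min(data), max(data)][:k]        # deterministic init for the sketch
    vars_ = [1.0] * k
    pis = [1.0 / k] * k
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each sample.
        R = []
        for x in data:
            p = [pis[j] * gauss(x, mus[j], vars_[j]) for j in range(k)]
            s = sum(p)
            R.append([pj / s for pj in p])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in R)
            pis[j] = nj / n
            mus[j] = sum(r[j] * x for r, x in zip(R, data)) / nj
            vars_[j] = sum(r[j] * (x - mus[j]) ** 2
                           for r, x in zip(R, data)) / nj + 1e-6
    return pis, mus, vars_

def mmse_denoise(y, pis, mus, vars_, noise_var):
    # Closed-form MMSE under the mixture prior: each fitted variance is
    # (signal + noise) variance, so subtract the noise part before the
    # per-component Wiener estimate, then mix by posterior responsibility.
    r = [pis[j] * gauss(y, mus[j], vars_[j]) for j in range(len(pis))]
    s = sum(r)
    est = 0.0
    for j in range(len(pis)):
        sv = max(vars_[j] - noise_var, 0.0)
        est += (r[j] / s) * (mus[j] + sv / (sv + noise_var) * (y - mus[j]))
    return est

# Toy "image": two flat regions (0 and 10) corrupted by unit-variance noise.
random.seed(1)
clean = [0.0] * 200 + [10.0] * 200
noisy = [x + random.gauss(0.0, 1.0) for x in clean]
pis, mus, vars_ = em_gmm(noisy)
```

In the paper the same two ingredients operate on image patches with multivariate Gaussians; the 1-D version above only illustrates the estimate-from-the-noisy-data-then-filter loop.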

Paper Nr: 111
Title:

Interactive 3D Modeling - A Survey-based Perspective on Interactive 3D Reconstruction

Authors:

Julius Schöning and Gunther Heidemann

Abstract: 3D reconstruction and modeling techniques based on computer vision have improved significantly in recent decades. Despite their great variety, a majority of these techniques depend on specific photographic collections or video footage. For example, most are designed for large data collections, overlapping photos, captures from turntables, or photos with many detectable features such as edges. If the input does not fit the particular specification, however, most techniques can no longer create reasonable 3D reconstructions. We review the research area of 3D reconstruction and 3D modeling with a focus on the specific capabilities of these methods and their possible drawbacks. Within this literature review, we discuss practical usability, focusing on the input data (collections of photographs or videos) and on the resulting models. On this basis, we introduce our position on interactive 3D reconstruction and modeling as a possible way of lifting current restrictions from these techniques, which opens the possibility of creating CAD-ready models in the future.

Paper Nr: 122
Title:

An Unsupervised Method for Suspicious Regions Detection in Mammogram Images

Authors:

Marco Insalaco, Alessandro Bruno, Alfonso Farruggia, Salvatore Vitabile and Edoardo Ardizzone

Abstract: Over the past years, many researchers have proposed biomedical imaging methods for computer-aided detection and classification of suspicious regions in mammograms. Mammogram interpretation is performed by radiologists through visual inspection, and the large volume of mammograms to be analysed makes such readings labour-intensive and often inaccurate. For this purpose, in this paper we propose a new unsupervised method to automatically detect suspicious regions in mammogram images. The method consists mainly of two steps: preprocessing, followed by feature extraction and selection. The preprocessing step separates the background from the breast profile region. In greater detail, gray-level mapping transforms and histogram specification are used to enhance the visual representation of mammogram details. Then, local keypoints and descriptors such as SURF are extracted from the breast profile region. The extracted keypoints are filtered by proper parameter tuning to detect suspicious regions. The results, in terms of sensitivity and confidence interval, are very encouraging.

Paper Nr: 129
Title:

Objective and Subjective Metrics for 3D Display Perception Evaluation

Authors:

Andrea Albarelli, Luca Cosmo, Filippo Bergamasco and Andrea Gasparetto

Abstract: Many modern professional 3D display systems adopt stereo vision and viewer-dependent rendering in order to offer an immersive experience and to enable complex interaction models. Within these scenarios, the ability of the user to effectively perform a task depends both on the correct rendering of the scene and on his ability to perceive it. These factors, in turn, are affected by several error sources, such as the accuracy of the user position estimate or lags between tracking and rendering. With this paper, we introduce a practical and sound method to quantitatively assess the accuracy of any view-dependent display approach and the effects of the different error sources. This is obtained by defining a number of metrics that can be used to analyze the results of a set of experiments specially crafted to probe different aspects of the system. This fills a clear shortcoming of the evaluation methods for 3D displays found in the literature, which are, for the most part, qualitative.

Paper Nr: 130
Title:

Motion Field Regularization for Sliding Objects Using Global Linear Optimization

Authors:

Gustaf Johansson, Mats Andersson and Hans Knutsson

Abstract: In image registration, it is often necessary to employ regularization in one form or another to find a plausible displacement field. In medical applications, it is useful to define different constraints for different areas of the data, for instance to measure whether organs have moved as expected after a finished treatment. One common problem is how to find plausible motion vectors far away from known motion. This paper introduces a new method to build and solve a Global Linear Optimization (GLO) problem with a novel set of terms that enable the specification of border areas so as to allow sliding motion. The GLO approach is important especially because it allows the simultaneous incorporation of several different constraints using information from medical atlases, such as the localization and properties of organs. The power and validity of the method are demonstrated using two simple but relevant 2D test images. Conceptual comparisons with previous methods are also made to highlight the contributions of this paper. The discussion outlines important future work and experiments as well as promising improvements to the GLO framework.

Paper Nr: 138
Title:

Resilient Propagation for Multivariate Wind Power Prediction

Authors:

Jannes Stubbemann, Nils Andre Treiber and Oliver Kramer

Abstract: Wind power prediction based on statistical learning has the potential to outperform classical physical weather prediction models. Neural networks have been successfully applied to wind prediction in the past. In this paper, we apply neural networks to our previously proposed spatio-temporal prediction model. We concentrate on a comparison between classical backpropagation and the more advanced resilient propagation (RPROP) variants. The analysis is based on time series data from the NREL Western Wind data set. The experimental results show that RPROP+ and iRPROP+ significantly outperform the classical backpropagation variants.
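The RPROP+ update rule compared here (Riedmiller's sign-based scheme with weight backtracking) can be sketched for a single weight as follows; the toy quadratic objective and the hyper-parameter values are illustrative defaults, not the paper's training setup:

```python
def sign(x):
    return (x > 0) - (x < 0)

def rprop_step(grad, prev_grad, step, weight, prev_dw,
               eta_plus=1.2, eta_minus=0.5, step_max=50.0, step_min=1e-6):
    # One RPROP+ update: the step size adapts to the sign agreement of
    # successive gradients; gradient magnitudes are ignored entirely.
    if grad * prev_grad > 0:            # same sign: accelerate
        step = min(step * eta_plus, step_max)
        dw = -sign(grad) * step
        weight += dw
    elif grad * prev_grad < 0:          # sign flip: overshoot, backtrack
        step = max(step * eta_minus, step_min)
        weight -= prev_dw               # RPROP+ weight backtracking
        dw, grad = 0.0, 0.0             # suppress adaptation next step
    else:                               # first step or right after a backtrack
        dw = -sign(grad) * step
        weight += dw
    return weight, step, grad, dw

# Minimize f(w) = w^2 (gradient 2w) starting from w = 5.
w, step, pg, pdw = 5.0, 0.5, 0.0, 0.0
for _ in range(100):
    w, step, pg, pdw = rprop_step(2 * w, pg, step, w, pdw)
```

iRPROP+ differs only in when the backtracking is applied (just when the error increases); in a network, the same per-weight rule runs independently for every parameter.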