ICPRAM 2017 Abstracts


Area 1 - Theory and Methods

Full Papers
Paper Nr: 2
Title:

Optimized Linear Imputation

Authors:

Yehezkel S. Resheff and Daphna Weinshal

Abstract: Often in real-world datasets, especially high-dimensional ones, some feature values are missing. Since most data analysis and statistical methods do not handle missing values gracefully, the first step in the analysis requires the imputation of missing values. Indeed, there has been a long-standing interest in methods for the imputation of missing values as a pre-processing step. One recent and effective approach, the IRMI stepwise regression imputation method, fits a linear regression model for each real-valued feature on the basis of all other features in the dataset. However, the proposed iterative formulation lacks a convergence guarantee. Here we propose a closely related method, stated as a single optimization problem, together with a block coordinate-descent solution that is guaranteed to converge to a local minimum. On both synthetic and benchmark datasets, our results are comparable to those of the IRMI method whenever it converges. However, while IRMI often diverges in the experiments described here, the performance of our method is markedly superior in comparison with other methods.
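
A minimal sketch of this style of regression-based imputation (each incomplete column repeatedly regressed on all the others) is shown below; the function name and stopping rule are our own, and the paper's single-objective formulation and convergence proof are not reproduced here:

```python
import numpy as np

def linear_impute(X, n_iter=20):
    """Fill NaNs by repeatedly regressing each incomplete column on
    all other columns (illustrative regression-imputation sketch)."""
    X = X.astype(float).copy()
    mask = np.isnan(X)
    col_means = np.nanmean(X, axis=0)
    X[mask] = np.take(col_means, np.where(mask)[1])   # warm start
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            rows = mask[:, j]
            if not rows.any():
                continue
            others = np.delete(X, j, axis=1)
            A = np.hstack([others, np.ones((X.shape[0], 1))])
            # least-squares fit on rows where column j was observed
            coef, *_ = np.linalg.lstsq(A[~rows], X[~rows, j], rcond=None)
            X[rows, j] = A[rows] @ coef                # re-impute
    return X
```

On strongly linearly related columns, a single pass already recovers the missing entries almost exactly.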

Paper Nr: 6
Title:

PSF Smooth Method based on Simple Lens Imaging

Authors:

Dazhi Zhan, Weili Li, Zhihui Xiong, Mi Wang and Maojun Zhang

Abstract: Compared with modern camera lenses, a simple lens system can be more attractive for many scientific applications in terms of cost and weight. However, the simple lens system suffers from optical aberrations which limit its applicability. Recent research has combined single-lens optics with complex post-capture correction methods to correct these artifacts. In this study, we first estimate the spatially varying point spread function (PSF) through blind image deconvolution with total variation (TV) regularization. The PSF is then smoothed to enhance robustness. A sharp image is finally recovered through fast non-blind deconvolution. Experimental results show that our method is on par with state-of-the-art deconvolution approaches and has an advantage in suppressing ringing artifacts.

Paper Nr: 8
Title:

Preliminary Evaluation of Symbolic Regression Methods for Energy Consumption Modelling

Authors:

R. Rueda, M. P. Cuéllar, M. Delgado and M. C. Pegalajar

Abstract: In the last few years, energy efficiency has become a research field of high interest for governments and industry. In order to understand consumption data and provide useful information for high-level decision-making processes in energy efficiency, the problem of information modelling and knowledge discovery from a set of energy consumption sensors must be addressed. This paper focuses on this problem and explores the use of symbolic regression techniques able to find patterns in data that can be used to extract an analytical formula explaining the behaviour of energy consumption in a set of public buildings. More specifically, we test the feasibility of different representations, such as trees and straight line programs, for the implementation of genetic programming algorithms, to find out whether a building's consumption data can be suitably explained by the energy consumption data of other, similar buildings. Our experimental study suggests that the straight line program representation may overcome the limitations of traditional tree-based representations and provides accurate energy consumption models.

Paper Nr: 10
Title:

Semantic Pattern-based Retrieval of Architectural Floor Plans with Case-based and Graph-based Searching Techniques and their Evaluation and Visualization

Authors:

Qamer Uddin Sabri, Johannes Bayer, Viktor Ayzenshtadt, Syed Saqib Bukhari, Klaus-Dieter Althoff and Andreas Dengel

Abstract: To this day, for the conceptual design of architectural floor plans, architects widely follow the traditional pen-and-paper method to draw conceptual floor plans and retrieve similar floor plans from printed reference collections. In this paper we present a complete end-to-end system that helps architects retrieve similar floor plans in early design phases. This work makes a three-fold contribution. Firstly, we have adapted three state-of-the-art techniques to retrieve similar floor plans: case-based reasoning (CBR), exact graph matching, and inexact graph matching. Secondly, we conducted a test to detect the computational limits of the searching techniques. Finally, we performed a qualitative analysis by running more realistic test cases created by architects while keeping the computational limits in mind. For visualization of results, we have integrated an advanced version of our previously implemented web-based user interface. The qualitative analysis showed that exact graph matching in general gives better results for a majority of test cases than the other two methods. The novelty of our approach is that it combines CBR, exact, and inexact graph matching in one system in the domain of architectural floor plan retrieval.

Paper Nr: 11
Title:

Mapping Distance Graph Kernels using Bipartite Matching

Authors:

Tetsuya Kataoka, Eimi Shiotsuki and Akihiro Inokuchi

Abstract: The objective of graph classification is to classify graphs of similar structures into the same class. This problem is of key importance in areas such as cheminformatics and bioinformatics. Support Vector Machines can efficiently classify graphs if graph kernels are used instead of feature vectors. In this paper, we propose two novel and efficient graph kernels called Mapping Distance Kernel with Stars (MDKS) and Mapping Distance Kernel with Vectors (MDKV). MDKS approximately measures the graph edit distance using star structures of height one. The method runs in $O(\upsilon^3)$ time, where $\upsilon$ is the maximum number of vertices in the graphs. However, when the height of the star structures is increased to avoid structural information loss, this graph kernel is no longer efficient. Hence, MDKV represents star structures of height greater than one as vectors and sums their Euclidean distances. It runs in $O(h(\upsilon^3 +|\Sigma|\upsilon^2))$ time, where $\Sigma$ is a set of vertex labels and graphs are iteratively relabeled $h$ times. We verify the computational efficiency of the proposed graph kernels on artificially generated datasets. Further, results on three real-world datasets show that the classification accuracy of the proposed graph kernels is higher than that of three conventional graph kernel methods.
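
The star-matching idea behind MDKS can be sketched with the Hungarian algorithm: each vertex is summarized by its height-one star, and an optimal bipartite assignment between the two star sets approximates the edit distance. The star edit-cost below is a simplified assumption, not the paper's exact cost:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from collections import Counter

def star(adj, labels, v):
    # height-one star: the vertex label plus the multiset of neighbour labels
    return labels[v], Counter(labels[u] for u in np.flatnonzero(adj[v]))

def star_distance(s1, s2):
    (l1, n1), (l2, n2) = s1, s2
    d = int(l1 != l2)
    # edit cost between neighbour multisets (insert/delete/relabel)
    inter = sum((n1 & n2).values())
    d += max(sum(n1.values()), sum(n2.values())) - inter
    return d

def mapping_distance(adj1, lab1, adj2, lab2):
    """Approximate graph edit distance by optimally matching
    height-one stars with the Hungarian algorithm (simplified
    sketch of the MDKS idea)."""
    n, m = len(lab1), len(lab2)
    size = max(n, m)
    C = np.zeros((size, size))
    for i in range(size):
        for j in range(size):
            if i < n and j < m:
                C[i, j] = star_distance(star(adj1, lab1, i), star(adj2, lab2, j))
            elif i < n:
                C[i, j] = 1 + sum(star(adj1, lab1, i)[1].values())  # delete star
            elif j < m:
                C[i, j] = 1 + sum(star(adj2, lab2, j)[1].values())  # insert star
    r, c = linear_sum_assignment(C)
    return C[r, c].sum()
```

Identical graphs yield distance zero; structurally different graphs yield a positive cost that can feed a kernel.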

Paper Nr: 14
Title:

Shape-based Trajectory Clustering

Authors:

Telmo J. P. Pires and Mário A. T. Figueiredo

Abstract: Automatic trajectory classification has countless applications, ranging from the natural sciences, such as zoology and meteorology, to urban planning, sports analysis, and surveillance, and has generated great research interest. This paper proposes and evaluates three new methods for trajectory clustering, strictly based on the trajectory shapes, thus invariant under changes in spatial position and scale (and, optionally, orientation). To extract shape information, the trajectories are first uniformly resampled using splines, and then described by the sequence of tangent angles at the resampled points. Dealing with angular data is challenging, namely due to its periodic nature, which needs to be taken into account when designing any clustering technique. In this context, we propose three methods: a variant of the k-means algorithm, based on a dissimilarity measure that is adequate for angular data; a finite mixture of multivariate von Mises distributions, which is fitted using an EM algorithm; and sparse nonnegative matrix factorization, using a complex representation of the angular data. Methods for the automatic selection of the number of clusters are also introduced. Finally, these techniques are tested and compared on both real and synthetic data, demonstrating their viability.
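
The first of the three methods, a k-means variant with an angle-appropriate dissimilarity, can be sketched as follows; the dissimilarity 1 - cos(theta - mu), circular-mean centroids, and the farthest-point initialisation are our own illustrative choices:

```python
import numpy as np

def angular_kmeans(X, k, n_iter=100, seed=0):
    """k-means for sequences of tangent angles, using the circular
    dissimilarity 1 - cos(theta - mu) and circular means as
    centroids (illustrative sketch of the angular k-means variant)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X)                      # shape (n_samples, seq_len)
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):                  # farthest-point initialisation
        D = (1 - np.cos(X[:, None, :] - np.array(centers)[None, :, :])).sum(axis=2)
        centers.append(X[D.min(axis=1).argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        D = (1 - np.cos(X[:, None, :] - centers[None, :, :])).sum(axis=2)
        labels = D.argmin(axis=1)
        # circular mean per cluster: atan2 of mean sine and cosine
        new = np.array([np.arctan2(np.sin(X[labels == j]).mean(axis=0),
                                   np.cos(X[labels == j]).mean(axis=0))
                        if (labels == j).any() else centers[j]
                        for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```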

Paper Nr: 16
Title:

Spatially Constrained Clustering to Define Geographical Rating Territories

Authors:

Shengkun Xie, Anna T. Lawniczak and Zizhen Wang

Abstract: In this work, spatially constrained clustering of insurance loss cost is studied. The study demonstrates that spatially constrained clustering is a promising technique for defining geographical rating territories using auto insurance loss data, as it is able to satisfy the contiguity constraint while implementing clustering. In the presented work, to ensure statistically sound clustering, advanced statistical approaches, including the average silhouette statistic and the Gap statistic, were used to determine the number of clusters. The proposed method can also be applied to demographic data analysis and real estate data clustering due to the nature of the spatial constraint.
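
Choosing the number of territories by the average silhouette statistic, one of the two criteria the paper uses, can be sketched as below; plain k-means stands in for the spatially constrained clustering step, and all parameter values are illustrative:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def choose_k_silhouette(X, k_range=range(2, 8), seed=0):
    """Pick the number of clusters by maximising the average
    silhouette statistic over candidate K values (sketch; the
    spatial contiguity constraint is omitted here)."""
    scores = {k: silhouette_score(
                  X, KMeans(n_clusters=k, n_init=10,
                            random_state=seed).fit_predict(X))
              for k in k_range}
    return max(scores, key=scores.get)
```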

Paper Nr: 18
Title:

Domain Adaptation Transfer Learning by SVM Subject to a Maximum-Mean-Discrepancy-like Constraint

Authors:

Xiaoyi Chen and Régis Lengellé

Abstract: This paper is a contribution to solving the domain adaptation problem where no labeled target data is available. A new SVM approach is proposed by imposing a zero-valued Maximum Mean Discrepancy-like constraint. This heuristic allows us to expect a good similarity between source and target data, after projection onto an efficient subspace of a Reproducing Kernel Hilbert Space. Accordingly, the classifier will perform well on source and target data. We show that this constraint does not modify the quadratic nature of the optimization problem encountered in classic SVM, so standard quadratic optimization tools can be used. Experimental results demonstrate the competitiveness and efficiency of our method.
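
The quantity being constrained to zero, the empirical Maximum Mean Discrepancy between source and target samples in an RKHS, can be computed as follows; this is the standard biased estimator with an RBF kernel, not the authors' constrained SVM solver:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gram matrix of the Gaussian RBF kernel between row sets A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def mmd2(Xs, Xt, gamma=1.0):
    """Biased empirical squared MMD between a source and a target
    sample (the similarity measure the paper's constraint is built
    on; illustrative sketch)."""
    Kss = rbf_kernel(Xs, Xs, gamma).mean()
    Ktt = rbf_kernel(Xt, Xt, gamma).mean()
    Kst = rbf_kernel(Xs, Xt, gamma).mean()
    return Kss + Ktt - 2 * Kst
```

Identical samples give an MMD of zero; the further apart the two distributions, the larger the value.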

Paper Nr: 19
Title:

Recognition of Handwritten Music Symbols using Meta-features Obtained from Weak Classifiers based on Nearest Neighbor

Authors:

Jorge Calvo-Zaragoza, Jose J. Valero-Mas and Juan R. Rico-Juan

Abstract: The classification of musical symbols is an important step for Optical Music Recognition systems. However, little progress has been made so far in the recognition of handwritten notation. This paper considers a scheme that combines ideas from ensemble classifiers and dissimilarity space to improve the classification of handwritten musical symbols. Several sets of features are extracted from the input. Instead of combining them, each set of features is used to train a weak classifier that gives a confidence for each possible category of the task based on distance-based probability estimation. These confidences are not combined directly but used to build a new set of features called Confidence Matrix, which eventually feeds a final classifier. Our work demonstrates that using this set of features as input to the classifiers significantly improves the classification results of handwritten music symbols with respect to other features directly retrieved from the image.

Paper Nr: 20
Title:

How New Information Criteria WAIC and WBIC Worked for MLP Model Selection

Authors:

Seiya Satoh and Ryohei Nakano

Abstract: The present paper evaluates newly proposed information criteria for singular models. Well-known criteria such as AIC and BIC are valid for regular statistical models, but their validity for singular models is not guaranteed. Statistical models such as multilayer perceptrons (MLPs), RBF networks, and HMMs are singular models. Recently, WAIC and WBIC have been proposed as new information criteria for singular models. They are developed on a strict mathematical basis but need empirical evaluation. This paper experimentally evaluates how WAIC and WBIC work for MLP model selection using conventional and new learning methods.
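
For reference, WAIC is computed from pointwise log-likelihoods over posterior samples; the sketch below implements the standard formula (log pointwise predictive density minus an effective-parameter penalty) and does not reproduce the paper's MLP-specific experiments:

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (S, n) matrix of pointwise log-likelihoods over
    S posterior samples and n observations (standard formula)."""
    # log pointwise predictive density, computed stably (log-sum-exp)
    m = log_lik.max(axis=0)
    lppd = (m + np.log(np.exp(log_lik - m).mean(axis=0))).sum()
    # effective number of parameters: posterior variance of the log-likelihood
    p_waic = log_lik.var(axis=0, ddof=1).sum()
    return -2 * (lppd - p_waic)
```

When all posterior samples agree, the penalty vanishes and WAIC reduces to minus twice the log-likelihood.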

Paper Nr: 48
Title:

Face Class Modeling based on Local Appearance for Recognition

Authors:

Mokhtar Taffar and Serge Miguet

Abstract: This work proposes a new formulation of object modeling that combines geometry and appearance. The location of the object's local appearance is referenced with respect to an invariant, namely a geometric landmark. The appearance (shape and texture) is a combination of the Harris-Laplace descriptor and the local binary pattern (LBP), all described by the invariant local appearance model (ILAM). We applied the model to describe and learn facial appearances and to recognize them. Given the visual traits extracted from a test image, ILAM is used to predict the features most similar to the facial appearance, first by estimating the highest facial probability, then in terms of an LBP histogram-based measure. Finally, a geometric computation of the invariant locates the appearance in the image. We evaluate the model by testing it on different image databases. The experiments show that the model achieves high detection accuracy and provides acceptable tolerance to appearance variability.

Paper Nr: 54
Title:

Random Projections with Control Variates

Authors:

Keegan Kang and Giles Hooker

Abstract: Random projections are used to estimate parameters of interest in large scale data sets by projecting data into a lower dimensional space. Some parameters of interest between pairs of vectors are the Euclidean distance and the inner product, while parameters of interest for the whole data set could be its singular values or singular vectors. We show how we can borrow an idea from Monte Carlo integration by using control variates to reduce the variance of the estimates of Euclidean distances and inner products by storing marginal information of our data set. We demonstrate this variance reduction through experiments on synthetic data as well as the colon and kos datasets. We hope that this inspires future work which incorporates control variates in further random projection applications.
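
The control-variate idea can be sketched for inner-product estimation: per-coordinate products of the projected vectors estimate the inner product, while the projected squared norms, whose true means are the stored marginal norms, serve as a control variate. The empirical fitting of the CV coefficient below is our own illustrative choice, not necessarily the paper's construction:

```python
import numpy as np

def rp_inner_product(x, y, k, rng, use_cv=True):
    """Estimate <x, y> from a k-dimensional Gaussian random
    projection; optionally apply a control variate built from the
    stored marginal norms ||x||^2 and ||y||^2 (sketch)."""
    R = rng.normal(size=(k, len(x)))
    u, v = R @ x, R @ y
    a = u * v                       # per-coordinate estimates of <x, y>
    if not use_cv:
        return a.mean()
    b = u * u + v * v               # known mean: ||x||^2 + ||y||^2
    # empirically fitted control-variate coefficient
    c = np.cov(a, b)[0, 1] / np.var(b, ddof=1)
    return a.mean() - c * (b.mean() - (x @ x + y @ y))
```

For correlated vectors the control variate removes most of the estimator's variance at the cost of storing two scalars per vector.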

Paper Nr: 65
Title:

Extracting Latent Behavior Patterns of People from Probe Request Data: A Non-negative Tensor Factorization Approach

Authors:

Kaito Oka, Masaki Igarashi, Atsushi Shimada and Rin-ichiro Taniguchi

Abstract: Although people flow analysis is widely studied because of its importance, there are some difficulties with previous methods, such as the cost of sensors, person re-identification, and the need to spread smartphone applications for collecting data. Today, Probe Request sensing for people flow analysis is gathering attention because it overcomes many of the difficulties of previous methods. We propose a framework for Probe Request data analysis for extracting the latent behavior patterns of people. To make the extracted patterns understandable, we apply Non-negative Tensor Factorization with a sparsity constraint and initialization with prior knowledge. Experimental results showed that our framework helps the interpretation of Probe Request data.

Paper Nr: 73
Title:

Orthogonal Neighborhood Preserving Projection using L1-norm Minimization

Authors:

Purvi A. Koringa and Suman K. Mitra

Abstract: Subspace analysis or dimensionality reduction techniques are becoming very popular for many computer vision tasks, including face recognition and image recognition in general. Most such techniques deal with optimizing a cost function using the L2-norm. Recently, however, due to its capability of handling outliers, optimizing such cost functions using the L1-norm has been drawing the attention of researchers. The present work is the first attempt towards this goal in which the Orthogonal Neighbourhood Preserving Projection (ONPP) technique is optimized using the L1-norm. In particular, the relation between ONPP and PCA is established in the light of the L2-norm, and ONPP is then optimized using an already proposed L1-PCA mechanism. Extensive experiments are performed on synthetic as well as real data. It has been observed that L1-ONPP outperforms its counterpart L2-ONPP.

Paper Nr: 76
Title:

EINCKM: An Enhanced Prototype-based Method for Clustering Evolving Data Streams in Big Data

Authors:

Ammar Al Abd Alazeez, Sabah Jassim and Hongbo Du

Abstract: Data stream clustering is becoming an active research area in big data. It refers to grouping constantly arriving new data records in large chunks to enable dynamic analysis and updating of the information patterns conveyed by the existing clusters, the outliers, and the newly arriving data chunk. Prototype-based algorithms for solving the problem are promising for their simplicity and efficiency. However, existing implementations have limitations with regard to the quality of clusters, the ability to discover outliers, and little consideration of possible new patterns in different chunks. In this paper, a new incremental algorithm called Enhanced Incremental K-Means (EINCKM) is developed. The algorithm is designed to detect new clusters in an incoming data chunk, merge new clusters and existing outliers into the currently existing clusters, and generate modified clusters and outliers ready for the next round. The algorithm applies a heuristic-based method to estimate the number of clusters (K), a radius-based technique to determine and merge overlapped clusters, and a variance-based mechanism to discover the outliers. The algorithm was evaluated on synthetic and real-life datasets. The experimental results indicate improved clustering correctness with a time complexity comparable to existing methods dealing with the same kind of problems.

Paper Nr: 82
Title:

A Tracking Approach for Text Line Segmentation in Handwritten Documents

Authors:

Insaf Setitra, Zineb Hadjadj and Abdelkrim Meziane

Abstract: Tracking objects in videos consists of giving a label to the same object moving in different frames. This labelling is performed by predicting the position of the object given the set of its features observed in previous frames. In this work, we apply the same rationale by considering each connected component in the manuscript as a moving object and tracking it so as to minimize the distance and angle of the connected component to its nearest neighbour. The approach was applied to images of the ICDAR 2013 handwriting segmentation contest and proved to be robust against text orientation, size and writing script.

Paper Nr: 98
Title:

Three-dimensional Object Recognition via Subspace Representation on a Grassmann Manifold

Authors:

Ryoma Yataka and Kazuhiro Fukui

Abstract: In this paper, we propose a method for recognizing three-dimensional (3D) objects using multi-view depth images. To derive the essential 3D shape information extracted from these images for stable and accurate 3D object recognition, we need to consider how to integrate partial shapes of a 3D object. To address this issue, we introduce two ideas. The first idea is to represent a partial shape of the 3D object by a three-dimensional subspace in a high-dimensional vector space. The second idea is to represent a set of the shape subspaces as a subspace on a Grassmann manifold, which reflects the 3D shape of the object more completely. Further, we measure the similarity between two subspaces on the Grassmann manifold by using the canonical angles between them. This measurement enables us to construct a more stable and accurate method based on richer information about the 3D shape. We refer to this method based on subspaces on a Grassmann manifold as the Grassmann mutual subspace method (GMSM). To further enhance the performance of the GMSM, we equip it with powerful feature-extraction capabilities. The validity of the proposed method is demonstrated through experimental comparisons with several conventional methods on a hand-depth image dataset.
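
The similarity measure at the heart of this approach, the canonical angles between two subspaces, can be computed generically from the SVD of the product of their orthonormal bases; this is a standard computation, not the paper's full GMSM pipeline:

```python
import numpy as np

def canonical_angles(A, B):
    """Canonical (principal) angles between the column spaces of A
    and B, via the singular values of Q_A^T Q_B (sketch of the
    Grassmann similarity measure)."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    # singular values are cosines of the canonical angles
    return np.arccos(np.clip(s, -1.0, 1.0))
```

Identical subspaces give all-zero angles; orthogonal subspaces give angles of pi/2.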

Short Papers
Paper Nr: 12
Title:

Margin-based Refinement for Support-Vector-Machine Classification

Authors:

Helene Dörksen and Volker Lohweg

Abstract: In real-world scenarios it is not always possible to generate an appropriate number of measured objects for machine learning tasks. At the learning stage, for small or incomplete datasets it is nonetheless often possible to obtain high accuracies with several arbitrarily chosen classifiers. The fact is that many classifiers might perform accurately, yet their decision boundaries might be inadequate. In this situation, decisions supported by margin-like characteristics for the discrimination of classes should be taken into account; accuracy as an exclusive measure is often not sufficient. To contribute to the solution of this problem, we present a margin-based approach that originates from an existing refinement procedure. In our method, the margin value is used as the optimisation criterion for the refinement of SVM models. The performance of the approach is evaluated on a real-world motor drive diagnosis dataset coming from the field of intelligent autonomous systems in the context of the Industry 4.0 paradigm, as well as on several UCI Repository samples with different numbers of features and objects.

Paper Nr: 42
Title:

Linear Discriminant Analysis based on Fast Approximate SVD

Authors:

Nassara Elhadji Ille Gado, Edith Grall-Maës and Malika Kharouf

Abstract: We present an approach for performing linear discriminant analysis (LDA) in the contemporary challenging context of high dimensionality. The projection matrix of LDA is usually obtained by simultaneously maximizing the between-class covariance and minimizing the within-class covariance. However, this involves a matrix eigendecomposition which is computationally expensive in both time and memory when the number of samples and the number of features are large. To deal with this complexity, we propose to use a recent dimension reduction method. The technique is based on fast approximate singular value decomposition (SVD), which has deep connections with low-rank approximation of the data matrix. The proposed approach, appSVD+LDA, consists of two stages. The first stage produces a set of artificial features based on the original data. The second stage is classical LDA. The foundation of our approach is presented, and its performance in terms of accuracy and computation time is compared with some state-of-the-art techniques on different real data sets.
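
The first stage, artificial features from a fast approximate SVD, can be sketched with the randomized range-finder of Halko et al.; the function name and oversampling value are our own assumptions, and the second (classical LDA) stage is omitted:

```python
import numpy as np

def approx_svd_features(X, r, n_oversample=10, seed=0):
    """Project data onto r approximate right singular vectors
    obtained by randomized SVD (sketch of the fast approximate SVD
    first stage)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # range finding: sketch the column space with a Gaussian test matrix
    Y = X @ rng.normal(size=(d, r + n_oversample))
    Q, _ = np.linalg.qr(Y)
    # exact SVD of the small projected matrix
    _, _, Vt = np.linalg.svd(Q.T @ X, full_matrices=False)
    return X @ Vt[:r].T          # reduced artificial features
```

For a matrix of exact rank r, the sketch captures the range almost surely and the reduced features preserve the full Frobenius norm.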

Paper Nr: 55
Title:

Action Sequence Matching of Team Managers

Authors:

Olaf Flak, Cong Yang and Marcin Grzegorzek

Abstract: Traditionally, team managers are analysed and compared based on human perception with data collected from surveys and questionnaires. These methods normally have low efficiency especially in dynamic and complex environments. In order to improve the accuracy and stability of manager analysis in management science, in this paper, we propose a novel manager representation method which is general and flexible enough to cover most types of managers. For manager analysis, we introduce the first manager matching algorithm that calculates the global similarity between managers. The proposed matching algorithm not only returns robust and stable manager similarities, but also details the matched parts among managerial action sequences. With this, the proposed methods provide more research possibilities in management science.

Paper Nr: 85
Title:

A Novel Dictionary Learning based Multiple Instance Learning Approach to Action Recognition from Videos

Authors:

Abhinaba Roy, Biplab Banerjee and Vittorio Murino

Abstract: In this paper we deal with the problem of action recognition from unconstrained videos under the notion of multiple instance learning (MIL). The traditional MIL paradigm considers the data items as bags of instances with the constraint that the positive bags contain some class-specific instances whereas the negative bags consist of instances only from negative classes. A classifier is then further constructed using the bag level annotations and a distance metric between the bags. However, such an approach is not robust to outliers and is time consuming for a moderately large dataset. In contrast, we propose a dictionary learning based strategy to MIL which first identifies class-specific discriminative codewords, and then projects the bag-level instances into a probabilistic embedding space with respect to the selected codewords. This essentially generates a fixed-length vector representation of the bags which is specifically dominated by the properties of the class-specific instances. We introduce a novel exhaustive search strategy using a support vector machine classifier in order to highlight the class-specific codewords. The standard multiclass classification pipeline is followed henceforth in the new embedded feature space for the sake of action recognition. We validate the proposed framework on the challenging KTH and Weizmann datasets, and the results obtained are promising and comparable to representative techniques from the literature.

Paper Nr: 93
Title:

Retrieving Similar X-ray Images from Big Image Data using Radon Barcodes with Single Projections

Authors:

Morteza Babaie, H. R. Tizhoosh, Shujin Zhu and M. E. Shiri

Abstract: The idea of Radon barcodes (RBC) has been introduced recently. In this paper, we propose a content-based image retrieval approach for big datasets based on Radon barcodes. Our method (Single Projection Radon Barcode, or SP-RBC) uses only a few single Radon projections for each image as global features that can serve as a basis for weak learners. This is our most important contribution in this work, and it improves the results of the RBC considerably. As a matter of fact, only one projection of an image, as short as a single SURF feature vector, can already achieve acceptable results. Nevertheless, using multiple projections in a long vector will not deliver the anticipated improvements. To exploit the information inherent in each projection, our method uses the outcome of each projection separately and then applies a more precise local search on the small subset of retrieved images. We have tested our method on the IRMA 2009 dataset with 14,400 x-ray images, as part of the ImageCLEF initiative. Our approach leads to a substantial decrease in the error rate in comparison with other non-learning methods.
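
The single-projection barcode idea can be sketched as follows: take one parallel projection of the image, resample it to a fixed length, and binarise at the median. The 0-degree projection, bin count, and median threshold are illustrative assumptions, not the exact RBC recipe:

```python
import numpy as np

def single_projection_barcode(img, n_bins=32):
    """Binary barcode from a single 0-degree Radon projection
    (sketch of the RBC/SP-RBC idea)."""
    proj = img.sum(axis=0).astype(float)        # one parallel projection
    # resample the projection to a fixed length
    xs = np.linspace(0, len(proj) - 1, n_bins)
    proj = np.interp(xs, np.arange(len(proj)), proj)
    return (proj >= np.median(proj)).astype(np.uint8)
```

Retrieval then compares barcodes by Hamming distance, which is fast enough for big-data candidate filtering.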

Paper Nr: 97
Title:

Localization and Mapping of Cheap Educational Robot with Low Bandwidth Noisy IR Sensors

Authors:

Muhammad Habib Mahmood and Pere Ridao Rodriguez

Abstract: The advancements in robotics have given rise to the manufacture of affordable educational mobile robots. Due to their size and cost, they possess limited global localization and mapping capability. The purpose of producing these robots is not fully realized if advanced algorithms cannot be demonstrated on them. In this paper, we address this limitation using just dead reckoning and low-bandwidth noisy infrared sensors for localization in an unknown environment. We demonstrate an Extended Kalman Filter implementation, produce a map of the unknown environment by occupancy grid mapping and, based on this map, perform particle filtering for Monte Carlo localization. In our implementation, we use the low-cost e-puck mobile robot to perform these tasks. We also put forth an empirical evaluation of the results, which shows convergence. The presented results provide a base on which to further build navigation and path-planning solutions.

Paper Nr: 102
Title:

Adaptive Initialization of Cluster Centers using Ant Colony Optimization: Application to Medical Images

Authors:

B. S. Harish, S. V. Aruna Kumar, Francesco Masulli and Stefano Rovetta

Abstract: Segmentation is a fundamental preprocessing step in medical imaging for diagnosis and surgical planning. The popular Fuzzy C-Means (FCM) clustering algorithm performs well in the absence of noise, but it is not robust to noise, as it makes use of the Euclidean distance and does not exploit the spatial information of the image. These limitations can be addressed by using the Robust Spatial Kernel FCM (RSKFCM) algorithm, which takes advantage of spatial information and uses a Gaussian kernel function to calculate the distance between the center and data points. Though RSKFCM gives good results, the main drawback of this method is its inability to obtain good minima of the objective function, as happens for many other clustering algorithms. To improve the efficiency of the RSKFCM method, in this paper we propose the Ant Colony Optimization based RSKFCM (ACORSKFCM). Using Ant Colony Optimization, RSKFCM initializes the cluster centers and reaches good minima of the objective function. Experiments were carried out on standard medical datasets, including brain, lung, liver and breast images. The results show that the proposed approach outperforms many other FCM variants.

Paper Nr: 106
Title:

A Tensor-based Technique for Structure-aware Image Inpainting

Authors:

Adib Akl and Charles Yaacoub

Abstract: Image inpainting is an active area of study in computer graphics, computer vision and image processing. Different image inpainting algorithms have been proposed recently. Most of them have shown their efficiency with different image types. However, failure cases still exist, especially when dealing with local image variations. This paper presents an image inpainting approach based on structure layer modeling, where the structure layer is represented by the second-moment matrix, also known as the structure tensor. The structure layer of the image is first inpainted using the non-parametric synthesis algorithm of Wei and Levoy; the inpainted field of second-moment matrices is then used to constrain the inpainting of the image itself. Results show that by using the structural information, relevant local patterns can be better inpainted compared to the standard intensity-based approach.

Paper Nr: 111
Title:

Discrete Wavelet Transform based Watermarking for Image Content Authentication

Authors:

Obaid Ur-Rehman and Natasa Zivic

Abstract: A watermarking scheme based on the discrete wavelet transform for content-based image authentication is proposed in this paper. The proposed scheme is tolerant to minor modifications which could be due to legitimate image processing operations. The tolerance is obtained by protecting the low-frequency data of the wavelet transform using approximate message authentication codes. Major modifications of the image content are identified as forgery attacks. Simulation results are given for unintentional modifications, such as channel noise, and for intentional modifications, such as object insertion and deletion. A security analysis is given at the end to assess the strength of the proposed image authentication scheme.
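
Embedding in the low-frequency wavelet band while tolerating mild processing can be illustrated with a one-level Haar transform and quantisation index modulation; this is a generic DWT-domain sketch of our own, since the paper protects the low-frequency data with approximate MACs rather than QIM:

```python
import numpy as np

def haar_ll(img):
    # LL band of a one-level 2D Haar transform: 2x2 block averages
    return img.reshape(img.shape[0] // 2, 2, img.shape[1] // 2, 2).mean(axis=(1, 3))

def embed_bit(img, bit, step=8.0):
    """Embed one bit into an LL coefficient by quantisation index
    modulation (generic DWT-domain watermarking sketch)."""
    img = img.astype(float).copy()
    ll = haar_ll(img)
    # bit 0 -> even multiples of step/2, bit 1 -> odd multiples
    target = (np.round(ll[0, 0] / step - bit / 2) + bit / 2) * step
    img[0:2, 0:2] += target - ll[0, 0]   # shift the 2x2 block mean
    return img

def extract_bit(img, step=8.0):
    ll = haar_ll(img.astype(float))
    return int(round(ll[0, 0] / (step / 2))) % 2
```

Because extraction only checks the quantisation bin, the bit survives perturbations smaller than a quarter of the quantisation step, which is the kind of tolerance to legitimate processing the abstract describes.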

Paper Nr: 116
Title:

Post Lasso Stability Selection for High Dimensional Linear Models

Authors:

Niharika Gauraha, Tatyana Pavlenko and Swapan k. Parui

Abstract: Lasso and sub-sampling based techniques (e.g. Stability Selection) are nowadays the most commonly used methods for detecting the set of active predictors in high-dimensional linear models. The consistency of Lasso-based variable selection requires the strong irrepresentable condition on the design matrix to be fulfilled, and repeated sampling procedures with a large feature set make Stability Selection slow in terms of computation time. Alternatively, two-stage procedures (e.g. thresholding or the adaptive Lasso) are used to achieve consistent variable selection under weaker conditions (sparse eigenvalues). Such two-step procedures involve choosing several tuning parameters, which seems easy in principle but is difficult in practice. To address these problems efficiently, we propose a new two-step procedure, called Post Lasso Stability Selection (PLSS). At the first step, Lasso screening is applied with a small regularization parameter to generate a candidate subset of active features. At the second step, Stability Selection using a weighted Lasso is applied to recover the most stable features from the candidate subset. We show that under a mild (generalized irrepresentable) condition, this approach yields a consistent variable selection method that is computationally fast even for a very large number of variables. Promising performance properties of the proposed PLSS technique are also demonstrated numerically using both simulated and real data examples.
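
The two-step procedure can be sketched with scikit-learn; a plain Lasso stands in for the weighted Lasso of the second step, and all penalty values, subsample counts, and the threshold are illustrative assumptions:

```python
import numpy as np
from sklearn.linear_model import Lasso

def plss(X, y, alpha_screen=0.01, alpha=0.1, n_subsamples=50,
         threshold=0.7, seed=0):
    """Post Lasso Stability Selection sketch: (1) Lasso screening
    with a small penalty yields a candidate set; (2) Lasso on
    random half-samples restricted to the candidates keeps features
    selected in at least `threshold` of the runs."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    cand = np.flatnonzero(Lasso(alpha=alpha_screen).fit(X, y).coef_ != 0)
    counts = np.zeros(len(cand))
    for _ in range(n_subsamples):
        idx = rng.choice(n, n // 2, replace=False)
        coef = Lasso(alpha=alpha).fit(X[idx][:, cand], y[idx]).coef_
        counts += coef != 0            # selection frequency per candidate
    return cand[counts / n_subsamples >= threshold]
```

Restricting the stability runs to the screened candidates is what makes the procedure fast when the number of variables is very large.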

Paper Nr: 136
Title:

Dynamic Selection of Exemplar-SVMs for Watch-list Screening through Domain Adaptation

Authors:

Saman Bashbaghi, Eric Granger, Robert Sabourin and Guillaume-Alexandre Bilodeau

Abstract: Still-to-video face recognition (FR) plays an important role in video surveillance, allowing individuals of interest to be recognized over a network of video cameras. Watch-list screening is a challenging video surveillance application, because faces captured during enrollment (with a still camera) may differ significantly from those captured during operations (with surveillance cameras) under uncontrolled capture conditions (with variations in, e.g., pose, scale, illumination, occlusion, and blur). Moreover, the facial models used for matching are typically designed a priori with a limited number of reference stills. In this paper, a multi-classifier system is proposed that exploits domain adaptation and multiple representations of face captures. An individual-specific ensemble of exemplar-SVM (e-SVM) classifiers is designed to model the single reference still of each target individual, where different random subspaces, patches, and face descriptors are employed to generate a diverse pool of classifiers. To improve the robustness of face models, e-SVMs are trained using the limited number of labeled faces in reference stills from the enrollment domain, and an abundance of unlabeled faces in calibration videos from the operational domain. Given the availability of a single reference target still, a specialized distance-based criterion is proposed based on properties of e-SVMs for dynamic selection of the most competent classifiers per probe face. The proposed approach has been compared to reference systems for still-to-video FR on videos from the COX-S2V dataset. Results indicate that ensembles of e-SVMs designed using calibration videos for domain adaptation and dynamic ensemble selection yield a high level of FR accuracy and computational efficiency.

Posters
Paper Nr: 5
Title:

A Simple Node Ordering Method for the K2 Algorithm based on the Factor Analysis

Authors:

Vahid Rezaei Tabar

Abstract: In this paper, we use Factor Analysis (FA) to determine the node ordering used as an input for the K2 algorithm in the task of learning Bayesian network structure. For this purpose, we use the communality concept of factor analysis. Communality indicates the proportion of each variable's variance that can be explained by the retained factors. This method is much easier than ordering-based approaches, which explore the ordering space, because it depends only on the correlation matrix. Moreover, experimental results on the benchmark networks ‘Alarm’ and ‘Hailfinder’ show that our new method has higher accuracy and a better degree of data matching.

Paper Nr: 13
Title:

A Simplified Low Rank and Sparse Model for Visual Tracking

Authors:

Mi Wang, Huaxin Xiao, Yu Liu, Wei Xu and Maojun Zhang

Abstract: Object tracking is the process of determining the states of a target in consecutive video frames based on properties of motion and appearance consistency. Numerous tracking methods using low-rank and sparse constraints perform well in visual tracking. However, these methods cannot reasonably balance the two characteristics: sparsity always pursues a solution sparse enough to ignore the low-rank structure, and vice versa. Therefore, this paper replaces the separate low-rank and sparse constraints with the ℓ2,1 norm. A simplified low-rank and sparse model for visual tracking (LRSVT), built upon the particle filter framework, is proposed in this paper. The proposed method first prunes particles that differ from the object and selects candidate particles for efficiency. A dictionary is then constructed to represent the candidate particles. The proposed LRSVT algorithm is evaluated against three related tracking methods on a set of seven challenging image sequences. Experimental results show that the LRSVT algorithm performs favorably against state-of-the-art tracking methods with regard to accuracy and execution time.
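For reference, the ℓ2,1 norm mentioned in the abstract is simply the sum of the row-wise Euclidean norms of a matrix, which encourages entire rows to shrink to zero; a minimal sketch in plain Python (the function name is illustrative):

```python
import math

def l21_norm(X):
    """l2,1 norm: sum of the Euclidean (l2) norms of the rows of X.
    Penalizing it drives whole rows to zero, giving structured
    (row-wise) sparsity rather than scattered entry-wise sparsity."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in X)

# Rows (3, 4) and (0, 0): norms 5.0 and 0.0.
print(l21_norm([[3.0, 4.0], [0.0, 0.0]]))  # -> 5.0
```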

Paper Nr: 17
Title:

Synthetic Data Generation for Deep Learning in Counting Pedestrians

Authors:

Hadi Keivan Ekbatani, Oriol Pujol and Santi Segui

Abstract: One of the main limitations in applying Deep Learning (DL) algorithms arises when dealing with problems with small data. One workaround for this issue is the use of synthetic data generators. In this framework, we explore the benefits of synthetic data generation as a surrogate for the lack of large data when applying DL algorithms. In this paper, we address the problem of learning to count pedestrians using synthetic images as a substitute for real images. To this end, we introduce an algorithm to create synthetic images that are fed to a purpose-designed Deep Convolutional Neural Network (DCNN) to learn from. The resulting model is capable of accurately counting the number of individuals in a real scene.

Paper Nr: 38
Title:

Research on Seamless Image Stitching based on Depth Map

Authors:

Chengming Zou, Pei Wu and Zeqian Xu

Abstract: Considering the slow speed of panorama image stitching and the ghosting produced by traditional image stitching algorithms, we propose a solution by improving the classical image stitching algorithm. Firstly, a SIFT algorithm based on block matching, proposed in our previously published paper, is used for feature matching. Then, the collaborative stitching of the color and depth cameras is applied to further enhance the accuracy of image matching. Finally, using a multi-band blending algorithm, we obtain a high-quality panoramic image through image fusion. The proposed algorithm addresses two problems in feature-based image stitching: real-time performance and ghosting. A series of experiments shows that the accuracy and reliability of the improved algorithm are increased. Besides, a comparison with the AutoStitch algorithm illustrates the advantage of the improved algorithm in efficiency and quality of stitching.

Paper Nr: 49
Title:

New Features for the Recognition of German-Kurrent-Handwriting with HMM-based Offline Systems

Authors:

Klaus Prätel

Abstract: In 2007, the project Herbar-Digital was launched at the University of Applied Sciences and Arts Hannover (Steinke, K.-H., Dzido, R., Gehrke, M., Prätel, K., 2008). The aim of this project is to realize a global herbarium, so that findings can be compared and cataloged quickly. There are many herbaria, i.e. collections of herbarium specimens, worldwide. Herbarium specimens are paper pages onto which botanical elements are glued. These specimens are provided with some important clues, such as the name of the submitter, a barcode, a color table, a flag for first record, and a description of the findings, often handwritten. All information on the herbarium specimens should be evaluated digitally. Since a number of discoveries took place in the 19th century (Alexander von Humboldt is to be mentioned here), the challenge is to identify specific manuscripts from this period. This paper describes the recognition of old German handwriting (cursive).

Paper Nr: 58
Title:

Concatenated Decision Paths Classification for Datasets with Small Number of Class Labels

Authors:

Ivan Mitzev and Nicolas H. Younan

Abstract: In recent years, the amount of collected information has rapidly increased, which has led to growing interest in time series data mining and in particular in the classification of such data. Traditional methods for classification are based mostly on distance measures between time series and 1-NN classification. A recent development, classification based on time series shapelets, proposes using small sub-sections of the entire time series, which appear to be most representative for certain classes. In addition, the shapelets-based classification method produces higher accuracies on some datasets because global features are more sensitive to noise than local ones. Despite its advantages, the shapelets method has an apparent disadvantage: slow training time. A variety of algorithms has been proposed to tackle this problem, one of which is the concatenated decision paths (CDP) algorithm. As initially proposed, this algorithm works only with datasets with more than five class labels. In this paper, we investigate the possibility of using CDP for datasets with fewer than five classes. We also introduce improvements that shorten the overall training time of the CDP method.
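The traditional distance-plus-1-NN baseline the abstract contrasts with is commonly instantiated with dynamic time warping (DTW); a self-contained sketch with illustrative names (not code from the paper):

```python
def dtw(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance between
    two numeric sequences, allowing elastic alignment in time."""
    INF = float("inf")
    n, m = len(a), len(b)
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of the three possible alignments
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]

def nn1_classify(query, train):
    """1-NN classification: return the label of the training series
    closest to the query under DTW. `train` is a list of (label, series)."""
    return min(train, key=lambda lt: dtw(query, lt[1]))[0]

train = [("flat", [0, 0, 0, 0]), ("ramp", [0, 1, 2, 3])]
print(nn1_classify([0, 0, 1, 2, 3], train))  # -> ramp
```

Shapelet-based methods trade this whole-series comparison for matching short, class-discriminative subsequences, which is where the training-time cost arises.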

Paper Nr: 68
Title:

Acoustic Detection of Violence in Real and Fictional Environments

Authors:

Marta Bautista-Durán, Joaquín García-Gómez, Roberto Gil-Pita, Héctor Sánchez-Hevia, Inma Mohino-Herranz and Manuel Rosa-Zurera

Abstract: Detecting violence is an important task due to the number of people who suffer its effects daily. There is a tendency to focus the problem either on real situations or on fictional ones, but both are useful in their own right. To date there has been no clear effort to relate both environments. In this work we try to detect violent situations on two different acoustic databases through the use of information crossed from one of them into the other. The system is divided into three stages: feature extraction, feature selection based on genetic algorithms, and classification to take a binary decision. Results focus on comparing the performance loss when a database is evaluated with features selected on itself versus features selected on the other database. In general, complex classifiers tend to suffer higher losses, whereas simple classifiers, such as linear and quadratic detectors, offer less than a 10% loss in most situations.

Paper Nr: 77
Title:

Lorentzian Distance Classifier for Multiple Features

Authors:

Yerzhan Kerimbekov and Hasan Şakir Bilge

Abstract: Machine learning has been one of the most frequently studied topics of the last decade, and a major part of this research area concerns classification. In this study, we propose a novel Lorentzian Distance Classifier for Multiple Features (LDCMF). The proposed classifier is based on the special metric of Lorentzian space and is adapted to more than two features. In order to improve the performance of the Lorentzian Distance Classifier (LDC), a new Feature Selection in Lorentzian Space (FSLS) method is developed. The FSLS method selects significant feature-pair subsets by a discriminative criterion rebuilt according to the Lorentzian metric. In addition, a data compression (pre-processing) step is used that makes the data suitable for Lorentzian space, and the covariance matrix calculation in Lorentzian space is defined. The performance of the proposed classifier is tested on the public GESTURE, SEEDS, TELESCOPE, WINE and WISCONSIN data sets. The experimental results show that the proposed LDCMF classifier is superior to classical classifiers.
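For orientation, the "special metric of Lorentzian space" is built on an indefinite inner product. The sketch below uses the common (-, +, +, ...) signature convention; this convention and the function names are assumptions for illustration, not the paper's exact definitions.

```python
def lorentzian_inner(x, y):
    """Lorentzian inner product with signature (-, +, +, ...): the
    first coordinate carries the minus sign (assumed convention)."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))

def lorentzian_sq_dist(x, y):
    """Squared Lorentzian 'distance' of the difference vector. Unlike
    a Euclidean distance it can be zero or negative for distinct
    points, which is why a pre-processing step is needed to make data
    suitable for Lorentzian space."""
    d = [a - b for a, b in zip(x, y)]
    return lorentzian_inner(d, d)

print(lorentzian_sq_dist([1.0, 2.0], [0.0, 0.0]))  # -> 3.0
print(lorentzian_sq_dist([1.0, 1.0], [0.0, 0.0]))  # -> 0.0 (light-like)
```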

Paper Nr: 81
Title:

A Hierarchical Tree Distance Measure for Classification

Authors:

Kent Munthe Caspersen, Martin Bjeldbak Madsen, Andreas Berre Eriksen and Bo Thiesson

Abstract: In this paper, we explore the problem of classification where class labels exhibit a hierarchical tree structure. Many multiclass classification algorithms assume a flat label space, where hierarchical structures are ignored. We take advantage of hierarchical structures and the interdependencies between labels. In our setting, labels are structured in a product and service hierarchy, with a focus on spend analysis. We define a novel distance measure between classes in a hierarchical label tree. This measure penalizes paths through high levels in the hierarchy. We use a known classification algorithm that aims to minimize the distance between labels, given any symmetric distance measure. The approach is global in that it constructs a single classifier for an entire hierarchy by embedding hierarchical distances into a lower-dimensional space. Results show that combining our novel distance measure with the classifier induces a trade-off between accuracy and lower hierarchical distances on misclassifications. This is useful in a setting where erroneous predictions vastly change the context of a label.

Paper Nr: 110
Title:

Electromagnetismlike Mechanism Descriptor with Fourier Transform for a Passive Copy-move Forgery Detection in Digital Image Forensics

Authors:

Sajjad Dadkhah, Mario Köppen, Hamid A. Jalab, Somayeh Sadeghi, Azizah Abdul Manaf and Diaa Uliyan

Abstract: Copy-move forgery is a special type of forgery that involves duplicating one region of an image by covering it with a copy of another region from the same image. This study develops a simple and powerful descriptor, called the Electromagnetismlike mechanism descriptor (EMag), for locating tampered areas in copy-move forgery on the basis of the Fourier transform within a reasonable amount of time. EMag is based on the collective attraction-repulsion mechanism, which considers each image pixel as an electrical charge. The main component of EMag is the degree of the attraction-repulsion force between the current pixel and its neighbours. In the proposed algorithm, the image is divided into similar non-overlapping blocks, and the final force for each block is then evaluated and used to construct the feature vector of the tampered image. The experimental results demonstrate the efficiency of the proposed algorithm in terms of detection time and detection accuracy. The detection rate of the proposed algorithm is improved by a reduction of the false positive rate (FPR) and an increase of the true positive rate (TPR).

Paper Nr: 115
Title:

3D Face and Ear Recognition based on Partial MARS Map

Authors:

Tingting Zhang, Zhichun Mu, Yihang Li, Qing Liu and Yi Zhang

Abstract: This paper proposes a 3D face recognition approach based on facial pose estimation, which is robust to large pose variations in unconstrained scenes. A deep learning method is used for facial pose estimation, and the generation of a partial MARS (Multimodal fAce and eaR Spherical) map reduces the probability of feature points appearing in the deformed region. We then extract features from the depth and texture maps. Finally, the matching scores from the two types of maps are combined by Bayes decision to generate the final result. Under large pose variations, the recognition rate of the method in this paper is 94.6%. The experimental results show that our approach has superior performance to the existing methods used on the MARS map, and has the potential to deal with 3D face recognition in unconstrained scenes.

Paper Nr: 129
Title:

The Effect of SIFT Features Properties in Descriptors Matching for Near-duplicate Retrieval Tasks

Authors:

Afra'a Ahmad Alyosef and Andreas Nürnberger

Abstract: The scale invariant feature transformation algorithm (SIFT) has been widely used for near-duplicate retrieval tasks. Most studies and evaluations published so far focused on increasing retrieval accuracy by improving descriptor properties and similarity measures. Contrast, scale and orientation properties of the SIFT features were used in computing the SIFT descriptor, but their explicit influence in the feature matching step was not studied. Moreover, it has not been studied yet how to specify an appropriate criterion to extract (almost) the same number of SIFT features (respectively keypoints) of all images in a database. In this work, we study the effects of contrast and scale properties of SIFT features when ranking and truncating the extracted descriptors. In addition, we evaluate if scale, contrast and orientation features can be used to bias the descriptor matching scores, i.e., if the keypoints are quite similar in these features, we enforce a higher similarity in descriptor matching. We provide results of a benchmark data study using the proposed modifications in the original SIFT-128D and on the region compressed SIFT (RC-SIFT-64D) descriptors. The results indicate that using contrast and orientation features to bias feature matching can improve near-duplicate retrieval performance.
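The idea of biasing descriptor matching by keypoint similarity can be sketched as follows; the particular bias function, the weight `w`, and the keypoint tuple layout are illustrative assumptions, not the formula from the paper.

```python
import math

def descriptor_dist(d1, d2):
    """Plain Euclidean distance between two SIFT-style descriptors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(d1, d2)))

def biased_match_score(d1, d2, kp1, kp2, w=0.5):
    """Shrink the descriptor distance when the two keypoints have
    similar scale and orientation, i.e. enforce a higher similarity
    for keypoints that agree in these properties.
    kp = (scale, orientation_in_radians); lower score = better match."""
    scale_ratio = min(kp1[0], kp2[0]) / max(kp1[0], kp2[0])      # in (0, 1]
    dtheta = abs(kp1[1] - kp2[1]) % (2 * math.pi)
    ori_sim = 1.0 - min(dtheta, 2 * math.pi - dtheta) / math.pi  # in [0, 1]
    bias = 1.0 - w * 0.5 * (scale_ratio + ori_sim)               # in [1 - w, 1]
    return descriptor_dist(d1, d2) * bias

# Identical keypoint properties shrink the raw distance 5.0 to 2.5.
print(biased_match_score([0.0, 0.0], [3.0, 4.0], (1.0, 0.0), (1.0, 0.0)))
```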

Area 2 - Applications

Full Papers
Paper Nr: 4
Title:

Analysis of Regionlets for Pedestrian Detection

Authors:

Niels Ole Salscheider, Eike Rehder and Martin Lauer

Abstract: Human detection is an important task for many autonomous robots as well as automated driving systems. The Regionlets detector was one of the best-performing approaches for pedestrian detection on the KITTI dataset during 2015. We analysed the Regionlets detector and its performance. This paper discusses the improvements in accuracy that were achieved by the different ideas of the Regionlets detector. It also analyses what the boosting algorithm learns and how this relates to the expectations. We found that the random generation of regionlet configurations can be replaced by a regular grid of regionlets. Doing so reduces the dimensionality of the feature space drastically but does not decrease detection performance. This translates into a decrease in memory consumption and computing time during training.

Paper Nr: 40
Title:

Enhancing Emotion Recognition from ECG Signals using Supervised Dimensionality Reduction

Authors:

Hany Ferdinando, Tapio Seppänen and Esko Alasaarela

Abstract: Dimensionality reduction (DR) is an important issue in classification and pattern recognition. Using features with lower dimensionality helps machine learning algorithms work more efficiently, and it can also improve the performance of the system. This paper explores supervised dimensionality reduction methods, LDA (Linear Discriminant Analysis), NCA (Neighbourhood Components Analysis), and MCML (Maximally Collapsing Metric Learning), in emotion recognition based on ECG signals from the Mahnob-HCI database. It is a 3-class problem of valence and arousal. Features for kNN (k-nearest neighbour) are based on the statistical distribution of dominant frequencies after applying a bivariate empirical mode decomposition. The results were validated using 10-fold cross-validation and LOSO (leave-one-subject-out) validation. Among LDA, NCA, and MCML, NCA outperformed the other methods. The experiments showed that the accuracy for valence was improved from 55.8% to 64.1%, and for arousal from 59.7% to 66.1%, using 10-fold cross-validation after transforming the features with projection matrices from NCA. For LOSO validation, there is no significant improvement for valence, while the improvement for arousal is significant, i.e. from 58.7% to 69.6%.

Paper Nr: 47
Title:

Opinion Retrieval from Product Review by Exploiting Review Helpfulness

Authors:

Jintao Du, Huafei Zheng, Wen Chan and Xiangdong Zhou

Abstract: Opinion mining of product reviews has attracted a lot of research interest. The volume of online product reviews is increasingly large; however, the quality of the reviews varies. In this paper we present a novel method for the opinion retrieval task that exploits review quality. Specifically, we present a novel opinion retrieval and reranking method that leverages the helpfulness of the reviews and query expansion. The experimental results on more than 70k review documents from 4 domains demonstrate the effectiveness of our approach compared to strong baselines and state-of-the-art methods.

Paper Nr: 64
Title:

A Robust Method for Blood Vessel Extraction in Endoscopic Images with SVM-based Scene Classification

Authors:

Mayank Golhar, Yuji Iwahori, M. K. Bhuyan, Kenji Funahashi and Kunio Kasugai

Abstract: This paper proposes a model for blood vessel detection in endoscopic images. A novel SVM-based scene classification of endoscopic images is used. This SVM-based model classifies images into four classes on the basis of dye content and blood vessel presence in the scene, using various colour, edge and texture based features. After classification, a vessel extraction method is proposed which is based on the Frangi vesselness approach. In the original Frangi vesselness results, it is observed that many non-blood-vessel edges are inaccurately detected as blood vessels. Therefore, two additions are proposed, background subtraction and a novel dissimilarity-detecting filtering procedure, which are able to discriminate between blood vessel and non-blood-vessel edges by exploiting the symmetric nature of blood vessels. It was found that the proposed approach gave better blood vessel extraction accuracy when compared with the vanilla Frangi vesselness approach and the BCOSFIRE filter, another state-of-the-art vessel delineation approach.

Paper Nr: 78
Title:

A Virtual Glove System for the Hand Rehabilitation based on Two Orthogonal LEAP Motion Controllers

Authors:

Giuseppe Placidi, Luigi Cinque, Andrea Petracca, Matteo Polsinelli and Matteo Spezialetti

Abstract: Hand rehabilitation therapy is fundamental in the recovery process of patients suffering from post-stroke or post-surgery impairments. Traditional approaches require the presence of a therapist during the sessions, involving high costs and subjective measurements of the patients’ abilities and progress. Recently, several alternative approaches have been proposed. Mechanical devices are often expensive, cumbersome and patient-specific, while virtual devices are not subject to these limitations but, especially if based on a single sensor, can suffer from occlusions. In this paper a novel multi-sensor approach, based on the simultaneous use of two LEAP motion controllers, is proposed. The hardware and software design is illustrated, and the measurement error induced by the mutual infrared interference is discussed. Finally, a calibration procedure, a tracking model prototype based on sensor turnover, and preliminary experimental results are presented.

Paper Nr: 92
Title:

Unsupervised Data-driven Hidden Markov Modeling for Text-dependent Speaker Verification

Authors:

Dijana Petrovska-Delacrétaz and Houssemeddine Khemiri

Abstract: We present a text-dependent speaker verification system based on unsupervised data-driven Hidden Markov Models (HMMs) in order to take into account the temporal information of speech data. The originality of our proposal is to train unsupervised HMMs on raw speech only, without transcriptions; these models provide a pseudo-phonetic segmentation of the speech data. The proposed text-dependent system is composed of the following steps. First, generic unsupervised HMMs are trained. Then the enrollment speech data for each target speaker is segmented with the generic models, and further processing is done in order to obtain speaker- and text-adapted HMMs that represent each speaker. During the test phase, in order to verify the claimed identity of the speaker, the test speech is segmented with the generic and the speaker-dependent HMMs. Finally, two approaches based on the log-likelihood ratio and concurrent scoring are proposed to compute the score between the test utterance and the speaker’s model. The system is evaluated on Part 1 of the RSR2015 database with the Equal Error Rate (EER) on the development set and the Half Total Error Rate (HTER) on the evaluation set. An average EER of 1.29% is achieved on the development set, while for the evaluation part the average HTER is equal to 1.32%.

Paper Nr: 99
Title:

Compression Techniques for Deep Fisher Vectors

Authors:

Sarah Ahmed and Tayyaba Azim

Abstract: This paper investigates the use of efficient compression techniques for Fisher vectors derived from deep architectures such as the Restricted Boltzmann Machine (RBM). Fisher representations have recently created a surge of interest by proving their worth for large-scale object recognition and retrieval problems; however, they suffer from large dimensionality and have some intrinsic properties that distinguish them from conventional bag of visual words (BoW) features. We show empirically which of the normalisation and state-of-the-art compression techniques are well suited for deep Fisher vectors, making them amenable for large-scale visual retrieval with a reduced memory footprint. We further show that the compressed Fisher vectors give impressive classification results even with inexpensive classifiers like k-nearest neighbour.

Paper Nr: 100
Title:

Analysis of Wi-Fi-based and Perceptual Congestion

Authors:

Masaki Igarashi, Atsushi Shimada, Kaito Oka and Rin-ichiro Taniguchi

Abstract: Conventional work on congestion estimation focuses on estimating quantitative congestion (e.g., the actual number of people or mobile devices, or the crowd density). In contrast, we focus on perceptual rather than quantitative congestion, with the aim of providing perceptual congestion information, and we analyze the relationship between quantitative and perceptual congestion. For this analysis, we construct a system for estimating and visualizing congestion and collecting user reports about congestion. We use the number of mobile devices, obtained from Wi-Fi packet sensors, as a quantitative congestion measurement, and user-report-based congestion, collected via our Web service, as a perceptual congestion measurement. Based on the obtained quantitative and perceptual congestion, we investigate the relationship between these values.

Paper Nr: 101
Title:

Prediction of User Opinion for Products - A Bag-of-Words and Collaborative Filtering based Approach

Authors:

Esteban García-Cuesta, Daniel Gómez-Vergel, Luis Gracias Expósito and María Vela-Pérez

Abstract: The rapid proliferation of social network services (SNS) gives people the opportunity to express their thoughts, opinions, and tastes on a wide variety of subjects such as movies or commercial items. Most item shopping websites currently provide SNS systems to collect users’ opinions, including rating and text reviews. In this context, user modeling and hyper-personalization of contents reduce information overload and improve both the efficiency of the marketing process and the user’s overall satisfaction. As is well known, users’ behavior is usually subject to sparsity and their preferences remain hidden in a latent subspace. A majority of recommendation systems focus on ranking the items by describing this subspace appropriately but neglect to properly justify why they should be recommended based on the user’s opinion. In this paper, we intend to extract the intrinsic opinion subspace from users’ text reviews (by means of collaborative filtering techniques) in order to capture their tastes and predict their future opinions on items not yet reviewed. We will show how users’ reviews can be predicted by using a set of words related to their opinions.

Paper Nr: 103
Title:

Error Correction over Optical Transmission

Authors:

Weam M. Binjumah, Alexey Redyuk, Rod Adams, Neil Davey and Yi Sun

Abstract: Reducing the bit error rate and improving the performance of modern coherent optical communication systems is a significant issue. As the distance travelled by the information signal increases, the bit error rate degrades. Support Vector Machines are an up-to-date machine learning method for error correction in optical transmission systems, and the wavelet transform is a popular method for signal processing. In this study, the widely used Haar and Daubechies wavelets are applied for signal correction. Our results show that the bit error rate can be improved by using classification based on wavelet transforms (WT) and a support vector machine (SVM).
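One level of the Haar transform underlying such signal features can be sketched in a few lines (an illustrative sketch, not the authors' implementation):

```python
import math

def haar_step(signal):
    """One level of the Haar wavelet transform: pairwise sums
    (approximation) and pairwise differences (detail), both scaled by
    1/sqrt(2) so the transform preserves energy. Detail coefficients
    highlight abrupt changes such as noise-induced distortions, which
    makes them natural inputs for an SVM-based corrector."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s
              for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s
              for i in range(0, len(signal) - 1, 2)]
    return approx, detail

# A pairwise-constant signal has zero detail coefficients.
print(haar_step([1.0, 1.0, 2.0, 2.0])[1])  # -> [0.0, 0.0]
```

Applying the step recursively to the approximation coefficients yields the full multi-level decomposition; Daubechies wavelets follow the same filter-bank scheme with longer filters.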

Paper Nr: 104
Title:

Evaluating a New Conversive Hidden non-Markovian Model Approach for Online Movement Trajectory Verification

Authors:

Tim Dittmar, Claudia Krull and Graham Horton

Abstract: This paper presents further research on an implemented classification and verification system that employs a novel approach to stochastically modelling movement trajectories. The models are based on Conversive Hidden non-Markovian Models, which are especially suited to mimic the temporal dynamics of time series because, in contrast to the related Hidden Markov Models (HMM) and the dynamic time warping (DTW) method, timestamp information is an integral part of the data. The system is able to create trajectory models from examples and is tested on signatures, doodles and pseudo-signatures for its verification performance. By using publicly available databases, comparisons are made to evaluate the potential of the system. The results reveal that the system already performs similarly to a general DTW approach on doodles and pseudo-signatures, but does not reach the performance of specialized HMM systems for signatures. Further possibilities for improving the results are discussed.

Paper Nr: 117
Title:

Behavior Recognition in Mouse Videos using Contextual Features Encoded by Spatial-temporal Stacked Fisher Vectors

Authors:

Zheheng Jiang, Danny Crookes, Brian Desmond Green, Shengping Zhang and Huiyu Zhou

Abstract: Manual measurement of mouse behavior is highly labor intensive and prone to error. This investigation aims to efficiently and accurately recognize individual mouse behaviors in action videos and continuous videos. In our system each mouse action video is expressed as the collection of a set of interest points. We extract both appearance and contextual features from the interest points collected from the training datasets, and then obtain two Gaussian Mixture Model (GMM) dictionaries for the visual and contextual features. The two GMM dictionaries are leveraged by our spatial-temporal stacked Fisher Vector (FV) to represent each mouse action video. A neural network is used to classify mouse action and finally applied to annotate continuous video. The novelty of our proposed approach is: (i) our method exploits contextual features from spatio-temporal interest points, leading to enhanced performance, (ii) we encode contextual features and then fuse them with appearance features, and (iii) location information of a mouse is extracted from spatio-temporal interest points to support mouse behavior recognition. We evaluate our method against the database of Jhuang et al. (Jhuang et al., 2010) and the results show that our method outperforms several state-of-the-art approaches.

Short Papers
Paper Nr: 7
Title:

Motion Error Classification for Assisted Physical Therapy - A Novel Approach using Incremental Dynamic Time Warping and Normalised Hierarchical Skeleton Joint Data

Authors:

Julia Richter, Christian Wiede, Bharat Shinde and Gangolf Hirtz

Abstract: Preventive and therapeutic measures can contribute to maintain or to regain physical abilities. In Germany, the growing number of elderly people is posing serious challenges for the therapeutic sector. Therefore, the objective that has been pursued in recent research is to assist patients during their medical training by reproducing therapists' feedback. Extant systems have been limited to feedback that is based on the evaluation of only the similarity between a pre-recorded reference and the currently performed motion. To date, very little is known about feedback generation that exceeds such similarity evaluations. Moreover, current systems require a personalised, pre-recorded reference for each patient in order to compare the reference against the motion performed during the exercise and to generate feedback. The aim of this study is to develop and evaluate an error classification algorithm for therapy exercises using Incremental Dynamic Time Warping and 3-D skeleton joint information. Furthermore, a normalisation method that allows the utilisation of non-personalised references has been investigated. In our experiments, we were able to successfully identify errors, even for non-personalised reference data, by using normalised hierarchical coordinates.

Paper Nr: 9
Title:

Banknote Simulator for Aging and Soiling Banknotes using Gaussian Models and Perlin Noise

Authors:

Sangwook Baek, Sanghun Lee, Euison Choi, Yoonkil Baek and Chulhee Lee

Abstract: In this paper, we propose a banknote simulator that generates aged and soiled banknotes. By analyzing the characteristics of circulating banknotes, we developed Gaussian brightness models for gray level changes of circulating banknotes. In addition, the Perlin noise model was used to simulate soiling. The proposed algorithm was tested using US Dollars (USD) and the experimental results show that the proposed method effectively simulated soiled banknote images from new banknote images.

Paper Nr: 26
Title:

Identification of Types of Corrosion through Electrochemical Noise using Machine Learning Techniques

Authors:

Lorraine Marques Alves, Romulo Almeida Cotta and Patrick Marques Ciarelli

Abstract: Many industrial systems, such as machines, structures and other equipment, are subject to the effects of corrosion. As a consequence, corrosion can damage structures and equipment, causing financial losses and accidents. These consequences can be reduced considerably by methods for the detection, analysis and monitoring of corrosion in hazardous areas, which provide useful information for maintenance planning and accident prevention. In this paper, we analyze features extracted from electrochemical noise to identify types of corrosion, and we use machine learning techniques to perform this task. Experimental results show that features obtained using the wavelet transform are effective for this problem, and all five evaluated classifiers achieved an average accuracy above 90%.

Paper Nr: 39
Title:

An Interval Distribution Analysis for RTI+

Authors:

Fabian Witter, Timo Klerx and Artus Krohn-Grimberghe

Abstract: The algorithm RTI+ learns a Probabilistic Deterministic Real-Time Automaton (PDRTA) from unlabeled timed sequences. RTI+ is an efficient algorithm that runs in polynomial time and can be applied to a variety of real-world behavior identification problems. Nevertheless, we uncover a lack of accuracy in identifying the intervals (or time guards) of the PDRTA. This inaccuracy can lead to wrong predictions of timed sequences in the learned model. We show by example that segments in intervals that are not covered by training data are responsible for this effect. We call those segments gaps and name three types of gaps that can appear. Two of the types cause wrong predictions of sequences and should thus be removed from the model. Therefore, we introduce our novel Interval Distribution Analysis (IDA) which utilizes statistical outlier detection to identify and remove gaps. In the context of ATM fraud detection, we show that IDA can improve the results of RTI+ in a real-world scenario.

Paper Nr: 53
Title:

Artistic Style Characterization of Vincent Van Gogh’s Paintings using Extracted Features from Visible Brush Strokes

Authors:

Tieta Putri, Ramakrishnan Mukundan and Kourosh Neshatian

Abstract: This paper outlines important methods used for brush stroke region extraction for quantifying artistic style of Vincent Van Gogh’s paintings. After performing the region extraction, stroke-related features such as colour and texture features are extracted from the visible brush stroke regions. We then test the features by performing a binary classification between painters from different art movements and painters from the same art movement.

Paper Nr: 56
Title:

Real-Time Gesture Recognition using a Particle Filtering Approach

Authors:

Frédéric Li, Lukas Köping, Sebastian Schmitz and Marcin Grzegorzek

Abstract: In this paper we present an approach for real-time gesture recognition using exclusively 1-D sensor data, based on Particle Filters and Dynamic Time Warping Barycenter Averaging (DBA). In a training phase, sensor records of users performing different gestures are acquired. For each gesture, the associated sensor records are processed by the DBA method to produce one average record called the template gesture. Once trained, our system classifies a gesture performed in real time by computing, using particle filters, an estimate of its probability of belonging to each class, based on comparing the sensor values acquired in real time to those of the template gestures. Our method is tested on the accelerometer data of the Multimodal Human Activities Dataset (MHAD) using leave-one-out cross-validation, and compared with state-of-the-art approaches (SVMs, neural networks) adapted for real-time gesture recognition. It achieves an 85.30% average accuracy and outperforms the others, without the need to define hyper-parameters whose choice could be restricted by real-time implementation considerations.
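For orientation, the core comparison primitive in the abstract above is Dynamic Time Warping. The following is a minimal, illustrative DTW distance for 1-D sequences; it is not the authors' implementation, which additionally involves DBA template averaging and particle filtering:

```python
import math

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # match step
    return cost[n][m]

# A time-warped copy of a signal is far closer under DTW than a different shape.
ramp = [0, 1, 2, 3, 4, 5]
slow_ramp = [0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
flat = [3, 3, 3, 3, 3, 3]
print(dtw_distance(ramp, slow_ramp))  # 0.0 — same shape, different speed
print(dtw_distance(ramp, flat))       # 9.0
```

Because DTW aligns sequences non-linearly in time, a slowed-down copy of a gesture scores a distance of zero against the original, which is what makes a single averaged template per gesture class workable.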

Paper Nr: 57
Title:

Automatic Polyp Detection from Endoscope Image using Likelihood Map based on Edge Information

Authors:

Yuji Iwahori, Hiroaki Hagi, Hiroyasu Usami, Robert J. Woodham, Aili Wang, M. K. Bhuyan and Kunio Kasugai

Abstract: An endoscope is a medical instrument that acquires images inside the human body. This paper proposes a new approach for the automatic detection of polyp regions in endoscope images: a likelihood map is generated from both edge and color information, so that the probability becomes high around polyp candidate regions. Next, Histograms of Oriented Gradients (HOG) features are extracted from each detected region and random forests are applied to classify whether the detected region is a polyp region or not. It is shown that the proposed approach detects polyps with high accuracy, and its usefulness is confirmed through computer experiments with endoscope images.

Paper Nr: 72
Title:

PaperClip: Automated Dossier Reorganizing

Authors:

Wessel Stoop, Iris Hendrickx and Tom van Ees

Abstract: We investigate the creation of a robust algorithm for document identification and page ordering in a digital mail room in the banking sector. PaperClip is a system that takes files containing pages of various documents as input, and returns multiple files that each contain all the pages of one document in the correct order. PaperClip performs (1) document type classification and (2) page number classification on each page, and then (3) merges the results. We experimented with various algorithms and methods for these three steps and performed an elaborate evaluation to measure different aspects of the methods. The best performing setup achieved a cut F-score of 86% and a V-measure of 0.91. This is high enough to fulfill the business needs of the banking sector.

Paper Nr: 83
Title:

Iterative Adaptive Sparse Sampling Method for Magnetic Resonance Imaging

Authors:

Giuseppe Placidi, Luigi Cinque, Andrea Petracca, Matteo Polsinelli and Matteo Spezialetti

Abstract: Magnetic Resonance Imaging (MRI) is a major imaging modality owing to its low invasiveness and its suitability for real-time and functional applications. The acquisition of radial directions is often used, but a complete examination always requires long acquisition times, and the only way to reduce acquisition time is undersampling. We present an iterative adaptive acquisition method (AAM) for radial sampling/reconstruction MRI that uses the information collected during the sequential acquisition process, about the inherent structure of the underlying image, to calculate the next most informative directions. A full description of AAM is given and experimental results are reported; a comparison between AAM and a weighted compressed sensing (CS) strategy is performed on numerical data. The results demonstrate that AAM converges faster than CS and that it has a good termination criterion for the acquisition process.

Paper Nr: 89
Title:

Segmentation of Bone Structures by Removal of Skin and using a Convex Relaxation Technique

Authors:

José A. Pérez-Carrasco, Begoña Acha, C. Suárez and Carmen Serrano

Abstract: This paper describes an algorithm to extract the skin and segment the bone structures of patients in CT volumes. The skin is extracted using an adaptive region growing algorithm followed by morphological operations. The segmentation of bone structures is implemented by minimizing an energy function with a convex relaxation algorithm. The cost terms in the energy function are computed using the distance between the mean and variance parameters within bone structures in a training set and the mean and variance parameters computed locally at each voxel position (x,y,z) in a test dataset. Several performance metrics have been computed to assess the algorithm. Comparisons with two techniques (thresholding and level sets) have been carried out, and the results show that the proposed algorithm clearly outperforms both techniques in terms of delineation accuracy.

Paper Nr: 112
Title:

Perspectively Correct Construction of Virtual Views

Authors:

Christian Fuchs and Dietrich Paulus

Abstract: The computation of virtual camera views is a common requirement in the development of computer vision appliances. We present a method for the perspectively correct computation of configurable virtual cameras using depth data gained from stereo correspondences. It avoids unnatural warping of 3-D objects as caused by homography-based approaches. Our method is tested using different stereo datasets.

Paper Nr: 119
Title:

Electrical Appliances Identification and Clustering using Novel Turn-on Transient Features

Authors:

Mohamed Nait Meziane, Abdenour Hacine-Gharbi, Philippe Ravier, Guy Lamarque, Jean-Charles Le Bunetel and Yves Raingeaud

Abstract: Due to the growing need for detailed consumption information in the context of energy efficiency, different energy disaggregation methods, also called Non-Intrusive Load Monitoring (NILM), have been proposed. These methods may be subdivided into supervised and unsupervised approaches. Electrical appliance classification is one of the tasks a NILM system should perform. Depending on the chosen NILM approach, the classification task consists of either identifying the appliances or grouping them into clusters. In this paper, we present the results of appliance identification and clustering using the Controlled On/Off Loads Library (COOLL) dataset. We use novel features extracted from a recently proposed turn-on transient current model for both identification and clustering. The results show that the amplitude-related features of this model are the most suited for appliance identification (giving a classification rate (CR) of 98.57%), whereas the envelope-related features are the most suited for appliance clustering.

Paper Nr: 121
Title:

Deep Learning Approach for Classification of Mild Cognitive Impairment Subtypes

Authors:

Upul Senanayake, Arcot Sowmya, Laughlin Dawes, Nicole A. Kochan, Wei Wen and Perminder Sachdev

Abstract: Timely intervention in individuals at risk of dementia is often emphasized, and Mild Cognitive Impairment (MCI) is considered to be an effective precursor to Alzheimer's disease (AD), which can be used as an intervention criterion. This paper attempts to use deep learning techniques to recognise MCI in the elderly. Deep learning has recently come to attention for its superior expressive power and performance over conventional machine learning algorithms. The current study uses variations of auto-encoders trained on neuropsychological test scores to discriminate between cognitively normal individuals and those with MCI in a cohort of community-dwelling individuals aged 70-90 years. The performance of the auto-encoder classifier is further optimized by creating an ensemble of such classifiers, thereby also improving generalizability. In addition to results comparable to those of conventional machine learning algorithms, the auto-encoder based classifiers eliminate the need for separate feature extraction and selection while allowing seamless integration of features from multiple modalities.

Paper Nr: 126
Title:

Deep Learning-based Prediction Method for People Flows and Their Anomalies

Authors:

Shigeru Takano, Maiya Hori, Takayuki Goto, Seiichi Uchida, Ryo Kurazume and Rin-ichiro Taniguchi

Abstract: This paper proposes prediction methods for people flows and anomalies in people flows on a university campus. The proposed methods are based on deep learning frameworks. By predicting the statistics of people flow conditions on a university campus, it becomes possible to create applications that predict future crowded places and the time when congestion will disappear. Our prediction methods will be useful for developing applications for solving problems in cities.

Paper Nr: 127
Title:

Triangular Curvature Approximation of Surfaces - Filtering the Spurious Mode

Authors:

Paavo Nevalainen, Ivan Jambor, Jonne Pohjankukka, Jukka Heikkonen and Tapio Pahikkala

Abstract: The curvature spectrum is a useful feature in surface classification but is difficult to apply to cases with high noise, typical e.g. of natural resource point clouds. We propose two methods to estimate the mean and the Gaussian curvature with filtering properties specific to triangulated surfaces. The methods completely filter the highest shape mode away but leave single vertical spikes only partially dampened. Moreover, the elaborate computation of nodal dual areas used by the Laplace-Beltrami mean curvature can be avoided. All computation is based on a triangular setting, and a weighted summation procedure using projected tip angles sums up the vertex values. A simplified principal curvature direction definition is given to avoid computation of the full second fundamental form. Qualitative evaluation is based on numerical experiments on two synthetic examples and a prostate tumor example. The results indicate that the proposed methods are more robust to the presence of noise than four other reference formulations.
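For context, the classical discrete Gaussian-curvature estimator that tip-angle formulations build on is the angle defect: 2π minus the summed tip angles around a vertex, normalised by one third of the incident triangle area. The sketch below is illustrative only; the closed-fan input convention and helper names are assumptions, and it does not reproduce the paper's weighted projected-tip-angle summation:

```python
import math

def tip_angle(p, q, r):
    """Angle at vertex p in triangle (p, q, r); all points are 3-D tuples."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    dot = sum(ui * vi for ui, vi in zip(u, v))
    nu = math.sqrt(sum(ui * ui for ui in u))
    nv = math.sqrt(sum(vi * vi for vi in v))
    return math.acos(dot / (nu * nv))

def angle_defect_curvature(vertex, ring):
    """Discrete Gaussian curvature at `vertex`, whose ordered neighbours
    `ring` form a closed triangle fan: (2*pi - summed tip angles),
    normalised by one third of the incident triangle area."""
    angles, area = 0.0, 0.0
    for a, b in zip(ring, ring[1:] + ring[:1]):
        angles += tip_angle(vertex, a, b)
        # triangle area from the cross product of the two edge vectors
        u = [a[i] - vertex[i] for i in range(3)]
        v = [b[i] - vertex[i] for i in range(3)]
        cx = u[1] * v[2] - u[2] * v[1]
        cy = u[2] * v[0] - u[0] * v[2]
        cz = u[0] * v[1] - u[1] * v[0]
        area += 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)
    return (2.0 * math.pi - angles) / (area / 3.0)

# A flat hexagonal fan has zero curvature; lifting its centre makes it positive.
flat_ring = [(math.cos(t), math.sin(t), 0.0)
             for t in (k * math.pi / 3 for k in range(6))]
print(angle_defect_curvature((0.0, 0.0, 0.0), flat_ring))      # ~0
print(angle_defect_curvature((0.0, 0.0, 0.5), flat_ring) > 0)  # True
```

The normalisation by the barycentric dual area (one third of the incident triangle area) is exactly the nodal dual-area bookkeeping that the abstract says its formulation manages to avoid.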

Paper Nr: 128
Title:

A Digital Palaeographic Approach towards Writer Identification in the Dead Sea Scrolls

Authors:

Maruf A. Dhali, Sheng He, Mladen Popović, Eibert Tigchelaar and Lambert Schomaker

Abstract: To understand the historical context of an ancient manuscript, scholars rely on the prior knowledge of writer and date of that document. In this paper, we study the Dead Sea Scrolls, a collection of ancient manuscripts with immense historical, religious, and linguistic significance, which was discovered in the mid-20th century near the Dead Sea. Most of the manuscripts of this collection have become digitally available only recently and techniques from the pattern recognition field can be applied to revise existing hypotheses on the writers and dates of these scrolls. This paper presents our ongoing work which aims to introduce digital palaeography to the field and generate fresh empirical data by means of pattern recognition and artificial intelligence. Challenges in analyzing the Dead Sea Scrolls are highlighted by a pilot experiment identifying the writers using several dedicated features. Finally, we discuss whether to use specifically-designed shape features for writer identification or to use the Deep Learning methods on a relatively limited ancient manuscript collection which is degraded over the course of time and is not labeled, as in the case of the Dead Sea Scrolls.

Paper Nr: 130
Title:

Local and Global Feature Selection for Prosodic Classification of the Word’s Uses

Authors:

Abdenour Hacine-Gharbi, Philippe Ravier and François Nemo

Abstract: The aim of this study is to evaluate the ability of local and global prosodic features to classify a word's uses. Uses of the French word “oui” in spontaneous discourse can be identified as belonging to the class “convinced (CV)” or “lack of conviction (NCV)”. Statistics of classical prosodic patterns are considered for the classification task. Local features are those computed on single phonemes; global features are computed on the whole word. The results show that 10 features, selected with the Max-Relevance Min-Redundancy filter selection strategy, completely explain the two clusters CV and NCV identified by linguistic experts. The duration of the phoneme /w/ is found to be highly relevant for all the investigated classification systems, and local features are predominantly more relevant than global ones. The system was validated by building classification systems in speaker-dependent and speaker-independent modes, with both manual and automatic phoneme segmentation. In the most favorable case (speaker-dependent mode with manual phoneme segmentation), the classification rate reached 87.72%. It reached 78.57% in the speaker-independent mode with automatic phoneme segmentation, a system configuration close to an industrial one.

Paper Nr: 132
Title:

Measuring Physical Activity of Older Adults via Smartwatch and Stigmergic Receptive Fields

Authors:

Antonio L. Alfeo, Mario G. C. A. Cimino and Gigliola Vaglini

Abstract: Physical activity level (PAL) in older adults can enhance healthy aging, improve functional capacity, and prevent diseases. It is known that human annotations of PAL can be affected by subjectivity and inaccuracy. Recently developed smart devices allow non-invasive, analytic, and continuous gathering of physiological signals. We present an innovative computational system fed by heartbeat rate, wrist motion and pedometer signals sensed by a smartwatch. More specifically, samples of each signal are aggregated by functional structures called trails. The trailing process is inspired by stigmergy, an insect coordination mechanism, and is managed by computational units called stigmergic receptive fields (SRFs). SRFs, which compute the similarity between trails, are arranged in a stigmergic perceptron to detect a collection of micro-behaviours of the raw signal, called archetypes. An SRF is adaptive to subjects: its structural parameters are tuned by a differential evolution algorithm. SRFs are used in a multilayer architecture, providing further levels of processing to realize macro analyses in the application domain. As a result, the architecture provides a daily PAL, useful for detecting behavioural shifts indicating initial signs of disease or deviations in performance. As a proof of concept, the approach has been tested on three subjects.

Paper Nr: 133
Title:

Spikiness Assessment of Term Occurrences in Microblogs: An Approach based on Computational Stigmergy

Authors:

Mario G. C. A. Cimino, Federico Galatolo, Alessandro Lazzeri, Witold Pedrycz and Gigliola Vaglini

Abstract: A significant phenomenon in microblogging is that certain occurrences of terms self-produce increasing mentions in the unfolding event, whereas other terms manifest a spike for each moment of interest, resulting in a wake-up-and-sleep dynamic. Since spike morphology and background vary widely between events, detecting spikes in microblogs is a challenge. An alternative is to detect a spikiness feature rather than individual spikes. We present an approach which detects and aggregates spikiness contributions through a combination of spike patterns, called archetypes. The soft similarity between each archetype and the time series of term occurrences is based on computational stigmergy, a bio-inspired scalar and temporal aggregation of samples. Archetypes are arranged into an architectural module called a Stigmergic Receptive Field (SRF). The final spikiness indicator is computed through a linear combination of SRFs, whose weights are determined by least-squares error minimization on a spikiness training set. The structural parameters of the SRFs are instead determined with the Differential Evolution algorithm, minimizing the error on a training set of archetypal series. Experimental studies have generated a spikiness indicator in a real-world scenario. The indicator has enhanced a cloud representation of social discussion topics, where the more spiky cloud terms are more blurred.

Paper Nr: 140
Title:

An Action Unit based Hierarchical Random Forest Model to Facial Expression Recognition

Authors:

Jingying Chen, Mulan Zhang, Xianglong Xue, Ruyi Xu and Kun Zhang

Abstract: Facial expression recognition is important in natural human-computer interaction, and research in this direction has made great progress. However, recognition in noisy environments remains challenging. To improve the efficiency and accuracy of expression recognition in noisy environments, this paper presents a hierarchical random forest model based on facial action units (AUs). First, an AU-based feature extraction method is proposed to extract facial features effectively; second, a hierarchical random forest model based on different AU regions is developed to recognize expressions in a coarse-to-fine way. The experimental results show that the proposed approach performs well in different environments.

Posters
Paper Nr: 15
Title:

Algorithms for Telemetry Data Mining using Discrete Attributes

Authors:

Roy B. Ofer, Adi Eldar, Adi Shalev and Yehezkel S. Resheff

Abstract: As the cost of collecting and storing large amounts of data continues to drop, we see a constant rise in the amount of telemetry data collected by software applications and services. With the data mounting up, there is an increasing need for algorithms to automatically and efficiently mine insights from the collected data. One interesting case is the description of large tables using frequently occurring patterns, with implications for failure analysis and customer engagement. Finding frequently occurring patterns has applications both in interactive usage, where an analyst repeatedly queries the data, and in a completely automated process that queries the data periodically and generates alerts and/or reports based on the mining. Here we propose two novel mining algorithms for computing such predominant patterns in relational data. The first method is a fast heuristic search, and the second is based on an adaptation of the apriori algorithm. Our methods are demonstrated on real-world datasets, and extensions to some additional fundamental mining tasks are discussed.
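As an illustration of the kind of pattern mining described above, a minimal level-wise (apriori-style) search for frequent attribute-value patterns over discrete rows might look like the sketch below. The helper names and the toy table are invented, not taken from the paper, and a full apriori implementation would additionally prune candidates whose subsets are infrequent:

```python
from itertools import combinations

def frequent_patterns(rows, min_support):
    """Level-wise search for attribute-value patterns occurring in at
    least a `min_support` fraction of rows (each row: dict of discrete
    attributes). Returns {pattern: support}."""
    n = len(rows)
    items = {(k, v) for row in rows for k, v in row.items()}

    def support(pattern):
        return sum(all(row.get(k) == v for k, v in pattern)
                   for row in rows) / n

    # level 1: frequent single attribute-value pairs
    level = [frozenset([it]) for it in items
             if support(frozenset([it])) >= min_support]
    result = {p: support(p) for p in level}
    while level:
        # candidate generation: join frequent k-patterns differing in one item
        candidates = {a | b for a, b in combinations(level, 2)
                      if len(a | b) == len(a) + 1}
        level = [c for c in candidates if support(c) >= min_support]
        result.update({p: support(p) for p in level})
    return result

# Toy telemetry table of discrete attributes.
rows = [
    {"os": "linux",   "region": "eu", "status": "fail"},
    {"os": "linux",   "region": "eu", "status": "fail"},
    {"os": "linux",   "region": "us", "status": "ok"},
    {"os": "windows", "region": "eu", "status": "ok"},
]
patterns = frequent_patterns(rows, min_support=0.5)
print(patterns[frozenset({("os", "linux"), ("region", "eu")})])  # 0.5
```

Patterns such as `os=linux AND region=eu AND status=fail` surface directly from the table, which is the sense in which frequent patterns "describe" a large table for failure analysis.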

Paper Nr: 21
Title:

A High Resolution Optical Satellite Image Dataset for Ship Recognition and Some New Baselines

Authors:

Zikun Liu, Liu Yuan, Lubin Weng and Yiping Yang

Abstract: Ship recognition in high-resolution optical satellite images is an important task. However, it is difficult to recognize ships against complex backgrounds, which is the main bottleneck for ship recognition and needs to be explored further. As far as we know, there is no public remote sensing ship dataset and little open-source work. To facilitate future ship recognition research, in this paper we present a public high-resolution ship dataset, ``HRSC2016'', that provides not only axis-aligned bounding-box labels but also rotated bounding boxes, with three-level classes: ship, ship category and ship type. We also provide the ship head position for all ships with ``V''-shaped heads and a segmentation mask for every image in the ``Test set''. Besides, we contribute a ship annotation tool and some development tools. Given these rich annotations, we perform a detailed analysis of some state-of-the-art methods, introduce a novel metric, the separation fitness (SF), for evaluating the performance of the sea-land segmentation task, and build some new baselines for recognition. The latest dataset can be downloaded from ``http://www.escience.cn/people/liuzikun/DataSet.html''.

Paper Nr: 51
Title:

Candidate Oil Spill Detection in SLAR Data - A Recurrent Neural Network-based Approach

Authors:

Sergiu-Ovidiu Oprea, Pablo Gil, Damian Mira and Beatriz Alacid

Abstract: Intentional oil pollution damages marine ecosystems, so society and governments require maritime surveillance for early oil spill detection. A fast response in the detection process helps to identify the offenders in the vast majority of cases. Nowadays, it is a human operator who is trained to carry out oil spill detection. Operators usually use image processing techniques and analysis of optical, thermal or radar data acquired from aerial vehicles or satellites. The current trend is to automate the oil spill detection process so that it can filter candidate oil spills from an aircraft, serving as a decision support system for human operators. In this work, a robust and automated system for candidate oil spill detection based on a Recurrent Neural Network (RNN) is presented. The aim is to provide faster identification of candidate oil spills from SLAR scanned sequences. So far, most research on oil spill detection has focused on the classification between real oil spills and look-alikes, using SAR or optical images but not SLAR. Furthermore, in the state-of-the-art research the overall decision is usually taken by an operator, mainly because the wide variety of look-alikes causes false positives in the detection process when traditional neural networks are used. This work provides an RNN-based approach for candidate oil spill detection using SLAR data, in contrast with the traditional Multilayer Perceptron Neural Network (MPNN). The system is tested with temporal data acquired from a SLAR sensor mounted on an aircraft and achieves a detection success rate of 97%.

Paper Nr: 59
Title:

High Level Shape Representation in Printed Gujarati Character

Authors:

Mukesh M. Goswami and Suman K. Mitra

Abstract: This paper presents the extraction and identification of high-level strokes (HLS) from printed Gujarati characters. The HLS feature describes a character as a sequence of predefined high-level strokes. Such a high-level shape representation enables approximate shape-similarity computation between characters and can easily be extended to the word level. Shape-similarity-based character and word matching has extensive applications in word-spotting-based document image retrieval and character classification. The proposed features were therefore tested on a printed Gujarati character database consisting of 12000 samples from 42 different symbol classes. Classification is performed using k-nearest neighbors with a shape similarity measure. A shape-similarity-based printed Gujarati word matching experiment is also reported on a small word image database, and the initial results are encouraging.

Paper Nr: 60
Title:

Joint Depth and Alpha Matte Optimization via Stereo

Authors:

Junlei Ma, Dianle Zhou, Chen Chen and Wei Wang

Abstract: This study presents a novel iterative algorithm of joint depth and alpha matte optimization via stereo (JDMOS). This algorithm realizes simultaneous estimation of depth map and matting image to obtain final convergence. The depth map provides depth information to realize automatic image matting, whereas the border details generated from the image matting can refine the depth map in boundary areas. Compared with monocular matting methods, another advantage offered by JDMOS is that the image matting process is completely automatic, and the result is significantly more robust when depth information is introduced. The major contribution of JDMOS is adding image matting information to the cost function, thereby refining the depth map, especially in the scene boundary. Similarly, optimized disparity information is stitched into the matting algorithm as prior knowledge to make the foreground–background segmentation more accurate. Experimental results on Middlebury datasets demonstrate the effectiveness of JDMOS.

Paper Nr: 61
Title:

EEG and Eye Movement Maps of Chess Players

Authors:

Laercio R. Silva Junior, Fabio H. G. Cesar, Fabio T. Rocha and Carlos E. Thomaz

Abstract: Owing to a number of advantages of working in the chess environment and its cognitively complex nature, this game has been widely used in scientific experiments to study human cognitive processes. This article describes the steps for acquiring and processing electroencephalography (EEG) signals and eye tracking of volunteers with different levels of proficiency in chess; after the application of mathematical and statistical methods, maps are generated to discuss and verify patterns among the chess players. Results show neural activations in different brain areas as well as distinct eye movements for the investigated chess questions and the volunteers who participated in this study.

Paper Nr: 62
Title:

Oil Spill Detection using Segmentation based Approaches

Authors:

D. Mira, P. Gil, B. Alacid and F. Torres

Abstract: This paper presents a description and comparison of two segmentation methods for oil spill detection on the sea surface. SLAR sensors acquire video sequences from which snapshots are extracted for the detection of oil spills. The two segmentation approaches are based on graph techniques and on J-images, respectively. The aim of applying both approaches to SLAR snapshots, as shown, is to detect the largest part of the oil slick while minimizing false detections of the spill.

Paper Nr: 67
Title:

Using Deep Convolutional Neural Networks to Predict Goal-scoring Opportunities in Soccer

Authors:

Martijn Wagenaar, Emmanuel Okafor, Wouter Frencken and Marco A. Wiering

Abstract: Deep learning approaches have been applied successfully to several image recognition tasks, such as face, object, animal and plant classification. However, almost no research has examined how to use machine learning to predict goal-scoring opportunities in soccer from position data. In this paper, we propose the use of deep convolutional neural networks (DCNNs) for this problem. This aim is realized in the following steps: 1) development of novel algorithms for finding goal-scoring opportunities and ball possession, which are used to obtain positive and negative examples; the dataset consists of position data from 29 matches played by a German Bundesliga team. 2) These examples are used to create original and enhanced images (which contain object trails of soccer positions) with a resolution of 256x256 pixels. 3) Both the original and enhanced images are fed independently as input to two DCNN methods: instances of both GoogLeNet and a 3-layered CNN architecture. A k-nearest neighbor classifier trained and evaluated on ball positions served as a baseline experiment. The results show that the GoogLeNet architecture outperforms all other methods with an accuracy of 67.1%.

Paper Nr: 71
Title:

ECG-based Biometrics using a Deep Autoencoder for Feature Learning - An Empirical Study on Transferability

Authors:

Afonso Eduardo, Helena Aidos and Ana Fred

Abstract: Biometric identification is the task of recognizing an individual using biological or behavioral traits and, recently, electrocardiogram has emerged as a prominent trait. In addition, deep learning is a fast-paced research field where several models, training schemes and applications are being actively investigated. In this paper, an ECG-based biometric system using a deep autoencoder to learn a lower dimensional representation of heartbeat templates is proposed. A superior identification performance is achieved, validating the expressiveness of such representation. A transfer learning setting is also explored and results show practically no loss of performance, suggesting that these deep learning methods can be deployed in systems with offline training.

Paper Nr: 74
Title:

Comparing Local Descriptors and Bags of Visual Words to Deep Convolutional Neural Networks for Plant Recognition

Authors:

Pornntiwa Pawara, Emmanuel Okafor, Olarik Surinta, Lambert Schomaker and Marco Wiering

Abstract: The use of machine learning and computer vision methods for recognizing different plants from images has attracted considerable attention from the community. This paper compares local feature descriptors and bags of visual words with different classifiers to deep convolutional neural networks (CNNs) on three plant datasets: AgrilPlant, LeafSnap, and Folio. To achieve this, we study both scratch-trained and fine-tuned versions of the GoogLeNet and AlexNet architectures and compare them to a local feature descriptor with k-nearest neighbors and to bags of visual words with histograms of oriented gradients combined with either support vector machines or multi-layer perceptrons. The results show that the deep CNN methods outperform the hand-crafted features. The CNN techniques can also learn well on a relatively small dataset, Folio.

Paper Nr: 75
Title:

Light Intensity Automatic Adjustment Technology and Its Application in Palmprint Recognition System

Authors:

Guangming Lu, Xu Liang and Bing Xie

Abstract: A palmprint acquisition device with constant light intensity is easily influenced by ambient light. Captured palmprint images are frequently too bright or too dark, which leads to failures in key point location. To ensure that captured palmprint images are of good quality, an automatic light intensity adjustment system based on PWM control technology is designed. It adjusts the light source according to the quality evaluation results of the current image. In our study, a new palmprint dataset was established under different light intensity levels. Then, an optimal mean gray interval is learned from the dataset. Finally, the automatic light control module is designed to guarantee that the brightness of the captured image falls into the optimal interval. Experimental results show that the technology is effective for palmprint key point location and ROI (Region of Interest) segmentation, and it also markedly improves palmprint recognition accuracy.

Paper Nr: 86
Title:

Eating and Drinking Recognition via Integrated Information of Head Directions and Joint Positions in a Group

Authors:

Naoto Ienaga, Yuko Ozasa and Hideo Saito

Abstract: Recent years have seen the introduction of service robots as waiters or waitresses in restaurants and cafes. In such venues, it is common for customers to visit in groups and to engage in conversation while eating and drinking. It is important for robotic serving staff to understand whether customers are eating and drinking or not, in order to wait on tables at appropriate times. In this paper, we present a method by which robots can recognize eating and drinking actions performed by individuals in a group. Our approach uses the positions of joints in the human body as features and long short-term memory to perform recognition on time-series data. We also used head directions in our method, as we assumed they are effective for recognition in a group. The information garnered from head directions and joint positions is integrated via logistic regression and employed in recognition. The results show that this integration yielded the highest accuracy, improving the effectiveness of the robots’ tasks.

Paper Nr: 87
Title:

Computer-Aided Diagnosis for Endotracheal Intubation Confirmation using Video-image Classification

Authors:

Dror Lederman

Abstract: In this paper, a Computer-Aided Diagnosis (CAD) system for endotracheal tube position confirmation and detection of intubation positioning errors is presented. Endotracheal intubation is a complex procedure which requires high skill and the use of secondary confirmation devices to ensure correct positioning of the tube. Our novel confirmation approach is based on video image classification, specifically on identification of specific anatomical landmarks, including the esophagus, the upper trachea and the main bifurcation of the trachea into the two primary bronchi (the “carina”), as indicators of correct or incorrect tube insertion and positioning. Classification of the images is performed using a neural network classifier. The performance of the proposed approach was evaluated using a dataset of cow-intubation videos and a dataset of human-intubation videos. Each video image was manually (visually) classified by a medical expert into one of three categories: upper tracheal intubation, correct (carina) intubation and esophageal intubation. The image classification algorithm was applied off-line using a leave-one-case-out method. The results show that the system correctly classified 1567 out of 1600 (97.9%) of the cow-intubation images and 349 out of 358 (97.5%) of the human-intubation images.

Paper Nr: 88
Title:

CFS-InfoGain based Combined Shape-based Feature Vector for Signer Independent ISL Database

Authors:

Garima Joshi, Renu Vig and Sukhwinder Singh

Abstract: In a Sign Language Recognition (SLR) system, signs are identified on the basis of hand shapes. Zernike Moments (ZM), derived from the orthogonal Zernike polynomials, are an effective shape descriptor in the field of pattern recognition. The characteristics of the Zernike polynomials change as the order and repetition parameters are varied; observing their behaviour gives insight into the selection of particular ZM values as part of an optimal feature vector. The performance of ZMs can be improved by combining them with other features; therefore, ZMs are combined with Hu Moments (HM) and Geometric Features (GF). An optimal feature vector of size 56 is proposed for the ISL dataset. The importance of internal edge details in addressing hand-over-hand occlusion is also highlighted in the paper. The proposed feature set gives high accuracy for Support Vector Machine (SVM), Logistic Model Tree (LMT) and Multilayer Perceptron (MLP) classifiers. However, the accuracy of Bayes Net (BN), Naïve Bayes (NB), J48 and k-Nearest Neighbour (k-NN) improves significantly with the InfoGain-based normalized feature set.
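The Hu Moments combined into the feature vector above can be sketched from first principles. This follows the standard seven-invariant formulas, not the authors' code, and the L-shaped test silhouette is a made-up stand-in for a segmented hand shape.

```python
import numpy as np

def hu_moments(img):
    """Seven Hu moment invariants of a 2-D grayscale/binary image."""
    y, x = np.mgrid[: img.shape[0], : img.shape[1]]
    m00 = img.sum()
    xc, yc = (x * img).sum() / m00, (y * img).sum() / m00

    def mu(p, q):              # central moments (translation invariant)
        return ((x - xc) ** p * (y - yc) ** q * img).sum()

    def eta(p, q):             # scale-normalized central moments
        return mu(p, q) / m00 ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    n30, n03, n21, n12 = eta(3, 0), eta(0, 3), eta(2, 1), eta(1, 2)
    return np.array([
        n20 + n02,
        (n20 - n02) ** 2 + 4 * n11 ** 2,
        (n30 - 3 * n12) ** 2 + (3 * n21 - n03) ** 2,
        (n30 + n12) ** 2 + (n21 + n03) ** 2,
        (n30 - 3 * n12) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
        + (3 * n21 - n03) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2),
        (n20 - n02) * ((n30 + n12) ** 2 - (n21 + n03) ** 2)
        + 4 * n11 * (n30 + n12) * (n21 + n03),
        (3 * n21 - n03) * (n30 + n12) * ((n30 + n12) ** 2 - 3 * (n21 + n03) ** 2)
        - (n30 - 3 * n12) * (n21 + n03) * (3 * (n30 + n12) ** 2 - (n21 + n03) ** 2),
    ])

# A small L-shaped "hand silhouette" and a translated copy: the central
# moments make the seven invariants identical under translation.
img = np.zeros((32, 32))
img[5:20, 5:10] = 1.0
img[15:20, 5:25] = 1.0
shifted = np.roll(img, (6, 4), axis=(0, 1))
```

In a full SLR pipeline these seven values would be concatenated with the ZM and geometric features before feature selection.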

Paper Nr: 94
Title:

HMM-based Activity Recognition with a Ceiling RGB-D Camera

Authors:

Daniele Liciotti, Emanuele Frontoni, Primo Zingaretti, Nicola Bellotto and Tom Duckett

Abstract: Automated recognition of Activities of Daily Living (ADLs) makes it possible to identify possible health problems and apply corrective strategies in Ambient Assisted Living (AAL). ADL analysis can provide very useful information for elder care and long-term care services. This paper presents an automated RGB-D video analysis system that recognises human ADLs related to common daily actions. The main goal is to predict the probability of an analysed subject's action, so that abnormal behaviour can be detected. Activity detection and recognition are performed using an affordable RGB-D camera. Human activities, despite their unstructured nature, tend to have a natural hierarchical structure; for instance, making a coffee generally involves a three-step process of turning on the coffee machine, putting sugar in the cup and opening the fridge for milk. Action sequence recognition is then handled using a discriminative Hidden Markov Model (HMM). RADiaL, a dataset with RGB-D images and the 3D position of each person, has been built for training and evaluating the HMM and made publicly available.
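The HMM decoding step behind such action sequence recognition can be illustrated with a minimal Viterbi sketch over a hypothetical three-state "making coffee" model. The state set, probabilities and observation coding below are invented for illustration; they are not the paper's discriminative HMM or its features.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely hidden-state path for a discrete-observation HMM.

    pi: initial state probs (S,), A: transition probs (S, S),
    B: emission probs (S, V), obs: sequence of observation indices.
    """
    S, T = len(pi), len(obs)
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)    # scores[from, to]
        back[t] = scores.argmax(axis=0)       # best predecessor per state
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):             # backtrack from the best end
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Hypothetical model: state 0 = "turn on machine", 1 = "add sugar",
# 2 = "open fridge"; observations are quantised pose features.
pi = np.array([0.8, 0.1, 0.1])
A = np.array([[0.6, 0.3, 0.1],
              [0.1, 0.6, 0.3],
              [0.1, 0.1, 0.8]])
B = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.7, 0.2],
              [0.1, 0.2, 0.7]])
obs = [0, 0, 1, 1, 2, 2]
states = viterbi(obs, pi, A, B)
```

With emissions this strongly tied to the observations, the decoded path follows the observation sequence through the three sub-actions.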

Paper Nr: 96
Title:

The Impact of Memory Dependency on Precision Forecast - An Analysis on Different Types of Time Series Databases

Authors:

Ricardo Moraes Muniz da Silva, Mauricio Kugler and Taizo Umezaki

Abstract: Time series forecasting is an important type of quantitative method in which past observations of a set of variables are used to develop a model describing their relationship. The Autoregressive Integrated Moving Average (ARIMA) model is a commonly used method for modelling time series. It is applied when the data show evidence of nonstationarity, which is removed by an initial differencing step. Alternatively, for time series whose autocorrelation decays more slowly than exponentially, the Autoregressive Fractionally Integrated Moving Average (ARFIMA) model is used. One important issue in time series forecasting is the distinction between short- and long-memory dependency, which corresponds to how much past history is necessary to make a better prediction. It is not always clear whether a process is stationary, or what the influence of past samples on a future value is, and thus which of the two models is the best choice for a given time series. The objective of this research is to better understand this dependency in order to make accurate predictions. Several datasets from different contexts were processed using both models, and the prediction accuracy and memory dependency were compared.
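The initial differencing step that gives ARIMA its "I" can be sketched in a few lines on synthetic data: a random walk is nonstationary (its spread grows over time), and first differencing recovers the stationary innovations.

```python
import numpy as np

rng = np.random.default_rng(1)

# A random walk y_t = y_{t-1} + eps_t is nonstationary; the "integrated"
# step of ARIMA removes this by taking first differences.
eps = rng.normal(size=2000)
walk = np.cumsum(eps)
diff = np.diff(walk)          # recovers the stationary eps_t series

def half_stds(y):
    """Spread of the first and last halves of a series: roughly equal
    for a stationary series, typically very different for the walk."""
    h = len(y) // 2
    return np.std(y[:h]), np.std(y[h:])
```

For the differenced series the two half-series spreads agree closely, which is the informal stationarity check the differencing step is meant to pass.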

Paper Nr: 108
Title:

Recognition of Oracle Bone Inscriptions by Extracting Line Features on Image Processing

Authors:

Lin Meng

Abstract: Oracle bone inscriptions are characters that were inscribed on cattle bones or turtle shells with sharp objects about 3000 years ago. Understanding these inscriptions can give us a lot of insight into world history, character evolution, global weather shifts, etc. However, for political reasons, the inscriptions remained buried in ruins until their discovery about 120 years ago, and the aging process has made them less legible. In this work, we design a system and propose a method for recognizing an oracle bone inscription as a template image from an oracle bone inscription database, using the line features of the inscriptions. First, we use Gaussian filtering and labeling to reduce noise, and affine transformation and thinning to extract the skeleton. Then we apply the Hough transform, together with a proposed clustering method, to extract line feature points. Finally, we calculate the minimum distance between the line feature points of the original image and those of the template images to perform the recognition. Experimental results show that for almost 80% of the inscriptions, the correct template yields the minimum or second-minimum distance, and that the proposed method recognizes well even when the original images are noisy or tilted.
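The Hough-transform voting that underlies the line-feature extraction can be sketched as follows. This is the textbook accumulator, not the paper's clustering variant, and the "stroke" image is synthetic: each foreground pixel votes for every (rho, theta) line passing through it, and collinear pixels pile their votes into one bin.

```python
import numpy as np

def hough_lines(img, n_theta=180):
    """Accumulate (rho, theta) votes for the foreground pixels of a
    binary image, with rho = x*cos(theta) + y*sin(theta)."""
    thetas = np.deg2rad(np.arange(n_theta))
    diag = int(np.ceil(np.hypot(*img.shape)))
    acc = np.zeros((2 * diag + 1, n_theta), dtype=int)
    ys, xs = np.nonzero(img)
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        acc[rhos + diag, np.arange(n_theta)] += 1   # one vote per theta
    return acc, diag

# Synthetic skeleton "stroke": a horizontal line at y = 12.
img = np.zeros((32, 32), dtype=bool)
img[12, 4:28] = True
acc, diag = hough_lines(img)
rho_idx, theta_idx = np.unravel_index(acc.argmax(), acc.shape)
rho, theta_deg = rho_idx - diag, theta_idx
```

The peak bin sits at theta near 90 degrees and rho = 12, i.e. the horizontal line y = 12, with one vote from each of the stroke's 24 pixels.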

Paper Nr: 122
Title:

Prediction of Protein X-ray Crystallisation Trial Image Time-courses

Authors:

B. M. Thamali Lekamge, Arcot Sowmya and Janet Newman

Abstract: This paper presents an algorithm to predict the outcome of a protein x-ray crystallisation trial. Results obtained from classification of individual images in a time-course are used, along with random forests, to make a prediction of the time-course outcome. Experiments on multiple datasets show that the first 8 frames of each time-course are quite sufficient to predict the final outcome.

Paper Nr: 123
Title:

Anomaly Detection for an Elderly Person Watching System using Multiple Power Consumption Models

Authors:

Maiya Hori, Tatsuro Harada and Rin-ichiro Taniguchi

Abstract: We propose an anomaly detection method for watching over elderly people using only the power data acquired by a smart meter. In a conventional system that uses only power data, a warning is issued if the power consumption does not increase after the wake-up time or if the amount of power does not change for a long time. These methods require a wake-up time and a power threshold to be set for each user, and false warnings are issued while residents are away from home. In our method, multiple common power consumption models are created for each household and each short time zone, and a watching system is constructed by treating the gaps between these models and newly observed data as anomaly values. This can be automatically applied to various situations such as “during sleep,” “during home activity” and “time zones with frequent outings in the daytime.”
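The gap-to-model idea can be sketched with a per-hour mean/std model and a z-score threshold. The data, the hourly "time zones" and the threshold below are all hypothetical simplifications of the paper's multiple-model scheme.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical training data: 30 days x 24 hourly power readings (kWh),
# with low night-time consumption and higher daytime consumption.
# One simple model (mean, std) is fitted per hour-of-day "time zone".
profile = np.where(np.arange(24) < 6, 0.2, 1.0)
history = profile + rng.normal(scale=0.05, size=(30, 24))

mu, sigma = history.mean(axis=0), history.std(axis=0)

def anomalous(day, z=4.0):
    """Flag hours whose gap from the per-hour model exceeds z std devs."""
    return np.abs(day - mu) / sigma > z

# A new day where daytime activity never starts (a possible incident):
# power stays flat at the night-time level all day.
quiet_day = np.full(24, 0.2) + rng.normal(scale=0.02, size=24)
flags = anomalous(quiet_day)
```

Every daytime hour is flagged because its gap from the daytime model is many standard deviations, while the night hours match their model and raise no warning.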

Paper Nr: 131
Title:

Identification of Corrosive Substances through Electrochemical Noise using Wavelet and Recurrence Quantification Analysis

Authors:

Lorraine Marques Alves, Romulo A. Cotta, Adilson Ribeiro Prado and Patrick Marques Ciarelli

Abstract: There are many types of corrosive substances that are used in industrial processes or that result from chemical reactions and, over time or due to process failures, these substances can damage machines, structures and equipment through corrosion. As a consequence, they can cause financial losses and accidents. Such consequences can be reduced considerably with methods for identifying corrosive substances, which can provide useful information for maintenance planning and accident prevention. In this paper, we analyze two methods that use the electrochemical noise signal to identify the corrosive substance that is acting on a metal surface and causing corrosion. The first method is based on the Wavelet Transform, and the second on Recurrence Quantification Analysis. Both methods were applied to a dataset with six types of substances, and experimental results showed that, for some classification techniques, both methods achieved an average accuracy above 90%. The obtained results indicate that both methods are promising.
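The Recurrence Quantification idea can be sketched minimally by computing just one RQA measure, the recurrence rate of a delay-embedded signal. The embedding parameters, threshold and test signals are illustrative, not the authors' setup: a structured (periodic) signal revisits its own states far more often than unstructured noise.

```python
import numpy as np

def recurrence_rate(signal, dim=3, delay=1, eps=0.5):
    """Fraction of point pairs of a delay-embedded signal that fall
    within distance eps of each other (the simplest RQA measure)."""
    n = len(signal) - (dim - 1) * delay
    emb = np.column_stack(
        [signal[i * delay : i * delay + n] for i in range(dim)]
    )
    d = np.linalg.norm(emb[:, None, :] - emb[None, :, :], axis=-1)
    return (d < eps).mean()

t = np.linspace(0, 8 * np.pi, 400)
periodic = np.sin(t)                     # structured signal: many recurrences
rng = np.random.default_rng(3)
noisy = rng.normal(size=400)             # unstructured signal: few recurrences
```

A classifier would consume several such RQA measures (recurrence rate, determinism, laminarity, ...) as features describing the electrochemical noise.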

Paper Nr: 139
Title:

Cell Trajectory Clustering: Towards the Automated Identification of Morphogenetic Fields in Animal Embryogenesis

Authors:

Juan Raphael Diaz Simões, Paul Bourgine, Denis Grebenkov and Nadine Peyriéras

Abstract: The recent availability of complete cell lineages from live imaging data opens the way to novel methodologies for the automated analysis of cell dynamics in animal embryogenesis. We propose a method for the calculation of measure-based dissimilarities between cells. These dissimilarity measures allow the use of clustering algorithms for the inference of time-persistent patterns. The method is applied to the digital cell lineages reconstructed from live zebrafish embryos imaged from 6 to 13 hours post fertilization. We show that the position and velocity of cells are sufficient to identify relevant morphological features including bilateral symmetry and coherent cell domains. The method is flexible enough to readily integrate larger sets of measures opening the way to the automated identification of morphogenetic fields.

Paper Nr: 142
Title:

Smart Lifelogging: Recognizing Human Activities using PHASOR

Authors:

Minh-Son Dao, Duc-Tien Dang-Nguyen, Michael Riegler and Cathal Gurrin

Abstract: This paper introduces a new idea for sensor data analytics, named PHASOR, that can recognize and stream individual human activities online. The proposed sensor concept can be utilized to solve emerging problems in the smart-city domain, such as health care, urban mobility, or security, by creating a lifelog of human activities. PHASOR is created from three components: ID, model, and sensor. The first component identifies which sensor is used to monitor which object (e.g., a group of users, individual users, or a type of smartphone). The second component decides on suitable classifiers for human activity recognition. The last one includes two types: (1) physical sensors that utilize the sensors embedded in smartphones to recognize human activities, and (2) human factors that use human interaction to personally increase the accuracy of the detection. The advantage of PHASOR is that its error signal is inversely proportional to its lifetime, which is well-suited for lifelogging applications. The proposed concept is evaluated on de-facto standard datasets and compared with the state of the art in smartphone-based Human Activity Recognition (HAR), confirming that applying PHASOR improves the accuracy of HAR.