VISAPP 2014 Abstracts


Area 1 - Image Formation and Preprocessing

Full Papers
Paper Nr: 109
Title:

Active Contour Segmentation with Affine Coordinate-based Parametrization

Authors:

Q. Xue, L. Igual, A. Berenguel, M. Guerrieri and L. Garrido

Abstract: In this paper, we present a new framework for image segmentation based on parametrized active contours. The contour and the points of the image space are parametrized using a reduced set of control points that must form a closed polygon in two-dimensional problems and a closed surface in three-dimensional problems. The active contour evolves as the control points are moved. We use mean value coordinates as the parametrization tool for the interface, which allows any point of the space to be parametrized, whether inside or outside the closed polygon or surface. Region-based energies such as the one proposed by Chan and Vese can be easily implemented in both two- and three-dimensional segmentation problems. We show the usefulness of our approach with several experiments.
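
As a rough illustration of the parametrization named in this abstract (not the authors' implementation), the following Python sketch computes 2D mean value coordinates of a point with respect to a closed polygon, using the standard tangent half-angle formula. The function names `mean_value_coordinates` and `reconstruct` and the sample square are assumptions made for this example. The sketch assumes the query point does not coincide with a vertex or lie on the line through an edge.

```python
import math

def mean_value_coordinates(point, polygon):
    """Return normalized mean value coordinates of `point` w.r.t. `polygon`.

    `polygon` is a list of (x, y) control points in order around a closed polygon.
    Assumption: `point` is not a vertex and not collinear with any edge.
    """
    px, py = point
    n = len(polygon)
    # Vectors from the query point to each control point, and their lengths.
    d = [(vx - px, vy - py) for vx, vy in polygon]
    r = [math.hypot(dx, dy) for dx, dy in d]
    # t[i] = tan(alpha_i / 2), where alpha_i is the signed angle at the query
    # point between control points i and i+1, via (1 - cos) / sin.
    t = []
    for i in range(n):
        j = (i + 1) % n
        cross = d[i][0] * d[j][1] - d[i][1] * d[j][0]
        dot = d[i][0] * d[j][0] + d[i][1] * d[j][1]
        t.append((r[i] * r[j] - dot) / cross)
    # Unnormalized weights: w_i = (tan(alpha_{i-1}/2) + tan(alpha_i/2)) / r_i.
    w = [(t[i - 1] + t[i]) / r[i] for i in range(n)]
    total = sum(w)
    return [wi / total for wi in w]

def reconstruct(weights, polygon):
    """Map coordinates back to a position for the given control points."""
    x = sum(l * v[0] for l, v in zip(weights, polygon))
    y = sum(l * v[1] for l, v in zip(weights, polygon))
    return (x, y)

if __name__ == "__main__":
    square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    lam = mean_value_coordinates((0.5, 1.2), square)
    # Moving one control point moves every point parametrized by the polygon.
    moved = [(0.0, 0.0), (3.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
    print(reconstruct(lam, square), reconstruct(lam, moved))
```

Because the coordinates are computed once and then reused, displacing a control point only requires re-evaluating the weighted sum, which is what lets the contour follow the control points.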

Paper Nr: 334
Title:

Region-constrained Feature Matching with Hierarchical Agglomerative Clustering

Authors:

Jung Whan Jang, Mostafiz Mehebuba Hossain and Hyuk-Jae Lee

Abstract: Local feature matching is one of the most fundamental issues in computer vision. Hierarchical agglomerative clustering (HAC) has been effectively used to distinguish inliers from outliers. The drawback of HAC is its large computational complexity, which increases rapidly as the number of feature correspondences increases. To overcome this drawback, this paper proposes a region-constrained feature matching method in which an image is segmented into small regions and feature correspondences are clustered inside each region. Adjacent segmented regions are merged to form larger regions if the correspondences inside the regions are similar. The merge may increase the accuracy of clustering and, consequently, the accuracy of the matching operations as well. The proposed region-constrained clustering reduces the execution time by a factor of up to 500 compared with the previous clustering while achieving similar matching accuracy.

Short Papers
Paper Nr: 72
Title:

Using Channel Representations in Regularization Terms - A Case Study on Image Diffusion

Authors:

Christian Heinemann, Freddie Åström, George Baravdish, Kai Krajsek, Michael Felsberg and Hanno Scharr

Abstract: In this work we propose a novel non-linear diffusion filtering approach for images based on their channel representation. To derive the diffusion update scheme we formulate a novel energy functional using a soft-histogram representation of image pixel neighborhoods obtained from the channel encoding. The resulting Euler-Lagrange equation yields a non-linear robust diffusion scheme with additional weighting terms stemming from the channel representation which steer the diffusion process. We apply this novel energy formulation to image reconstruction problems, showing good performance in the presence of mixtures of Gaussian and impulse-like noise, e.g. missing data. In denoising experiments on common scalar-valued images, our approach performs competitively with other diffusion schemes as well as with state-of-the-art denoising methods for the considered noise types.

Paper Nr: 93
Title:

Calibrating Focal Length for Paracatadioptric Camera from One Circle Image

Authors:

Huixian Duan, Lin Mei, Yanfeng Shang and Chuanping Hu

Abstract: Camera calibration from circles has great advantages, but for paracatadioptric cameras, the estimation of intrinsic parameters using circle images is still an open and challenging problem. Previous work proved that the paracatadioptric projection of a circle is a quartic curve. However, due to partial occlusion, only part of the quartic curve is visible on the image plane. Consequently, the circle image cannot be directly estimated using image points extracted from the visible part, and the camera parameters cannot be calibrated. To solve this problem, in this paper we study the properties of the paracatadioptric circle image and their application in calibrating the focal length for the case in which the aspect ratio is 1 and the skew is 0. Firstly, we derive the necessary and sufficient conditions that must be satisfied by a paracatadioptric circle image. Next, based on these conditions, a new objective function is presented to correctly estimate the circle image. Then, we show that the focal length can be computed from the estimated paracatadioptric circle image and the principal point, which is estimated from the projected contour of the parabolic mirror. Experimental results on both simulated and real image data demonstrate the effectiveness of our method.

Paper Nr: 112
Title:

Speeding Up Object Detection - Fast Resizing in the Integral Image Domain

Authors:

Michael Gschwandtner, Andreas Uhl and Andreas Unterweger

Abstract: In this paper, we present an approach to resize integral images directly in the integral image domain. For the special case of resizing by a power of two, we propose a highly parallelizable variant of our approach, which is identical to bilinear resizing in the image domain in terms of results, but requires fewer operations per pixel. Furthermore, we modify a parallelized state-of-the-art object detection algorithm which makes use of integral images on multiple scales so that it uses our approach and compare it to the unmodified implementation. We demonstrate that our modification allows for an average speedup of 6.38% on a dual-core processor with hyper-threading and 12.6% on a 64-core multi-processor system, respectively, without impacting the overall detection performance. Moreover, we show that these results can be extended to a whole class of object detection algorithms.
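
The idea of working directly in the integral image domain can be illustrated with a small identity. The Python sketch below is not the authors' algorithm; it only shows, under the assumption of even image dimensions and plain 2x2 box averaging, that the integral image of the half-size image can be read off from the original integral image without going back to pixel values. All function names are hypothetical.

```python
def integral_image(img):
    """Inclusive integral image: ii[y][x] = sum of img[0..y][0..x]."""
    h, w = len(img), len(img[0])
    ii = [[0] * w for _ in range(h)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y][x] = row_sum + (ii[y - 1][x] if y > 0 else 0)
    return ii

def downsample_2x2_mean(img):
    """Reference path: halve the image in the pixel domain by 2x2 averaging."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [[(img[2 * y][2 * x] + img[2 * y][2 * x + 1]
              + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4
             for x in range(w)] for y in range(h)]

def halve_integral_image(ii):
    """Integral-domain path: sample every second entry and rescale.

    Each entry is independent of the others, so the loop is trivially parallel.
    """
    h, w = len(ii) // 2, len(ii[0]) // 2
    return [[ii[2 * y + 1][2 * x + 1] / 4 for x in range(w)] for y in range(h)]

if __name__ == "__main__":
    img = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    via_pixels = integral_image(downsample_2x2_mean(img))
    via_integral = halve_integral_image(integral_image(img))
    print(via_pixels == via_integral)
```

The integral-domain path needs one read and one scale per output entry, whereas the pixel-domain path must first average four pixels and then rebuild the integral image.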

Paper Nr: 115
Title:

Oriented Half Gaussian Kernels and Anisotropic Diffusion

Authors:

Baptiste Magnier and Philippe Montesinos

Abstract: Nonlinear PDEs (partial differential equations) offer a convenient formal framework for image regularization and are at the origin of several efficient algorithms. In this paper, we present a new approach which is based on (i) a set of half Gaussian kernel filters and (ii) a nonlinear anisotropic PDE diffusion. On the one hand, half Gaussian kernels provide oriented filters whose flexibility enables edges to be detected with great accuracy. On the other hand, a nonlinear anisotropic diffusion scheme offers a means to smooth images while preserving fine structures or details, e.g. lines, corners and junctions. Based on the computation of the gradient magnitude and two diffusion directions, we construct a diffusion control function able to achieve precise image regularization. Quantified experimental results compared with existing PDE approaches and a discussion of the parameterization of the method are presented.

Paper Nr: 136
Title:

Restoration of Old Document Images using Different Color Spaces - Restoration of Old Document Images

Authors:

Ederson Marcos Sgarbi, Wellington Aparecido Della Mura, Nikolas Moya, Jacques Facon and Horacio A. Legal Ayala

Abstract: An obstacle in old document interpretation comes from the lack of image quality. Old documents frequently appear with digitization errors, uneven backgrounds and bleed-through effects. A new approach based on morphological color operators to restore the color text is presented. The morphological tools are based on three color spaces: HSI, well known in morphological processes, and YCrCb and YIQ, rarely used in morphological procedures. Experimental results obtained on 100 old documents have shown that using YCrCb and YIQ is as effective as using HSI to recover ancient texts in uneven and foxed background images, without presenting problems in hue ordering.

Paper Nr: 160
Title:

Dynamic Multiscale Visualization of Flight Data

Authors:

Tijmen Klein, Matthew van der Zwan and Alexandru Telea

Abstract: We present a novel set of techniques for visualization of very large data sets encoding flight information obtained from Air Traffic Control. The aims of our visualization are to provide a smooth way to explore the available information and to find outlier spatio-temporal patterns by navigating between fine-scale, detailed views of the data and coarse overviews of large areas and long time periods. To achieve this, we extend and adapt several image-based visualization techniques, including animation, density maps, and bundled graphs. In contrast to previous methods, we are able to visualize significantly more information on a single screen, with limited clutter, and also create real-time animations of the data. For computational scalability, we implement our method using GPU-accelerated techniques. We demonstrate our results on several real-world data sets ranging from hours over a country to one month over the entire world.

Paper Nr: 178
Title:

Hand Veins Recognition System

Authors:

João Ricardo Gonçalves Neves and Paulo Lobato Correia

Abstract: Accurate protection systems capable of replacing traditional passwords and ID cards are essential, for reasons of both convenience and security. A hand-vein pattern recognition system is just one of a vast group of biometric techniques under research with the aim of becoming the reference recognition system. This paper presents a hand vein biometric recognition system that uses the pattern of the hand's blood vessels to identify an individual. All biometric systems have immense application potential, as they present advantages over traditional identification systems. They work with patterns that are very hard to duplicate, since they differ from person to person, and it is also impossible to lose or forget them, since the biometric characteristics are intrinsically attached to the human body. The developed approach was created with the intent of providing an effective protection system despite having been designed and implemented using inexpensive hardware, in comparison with the biometric recognition systems presently offered at a commercial level. The results show that a reliable system can be produced at a low cost and can be used standalone or in combination with other systems.

Paper Nr: 222
Title:

Tone Mapping for Single-shot HDR Imaging

Authors:

Johannes Herwig, Matthias Sobczyk and Josef Pauli

Abstract: The problem of tone mapping for HDR (high dynamic range) to LDR (low dynamic range) conversion is introduced through a unified framework that considers all the usual processing steps. Then the specific problem of single-shot HDR is outlined, with special emphasis placed on the effect of the greater noise floor of those images when compared to the usual exposure bracketing approach to HDR. We tailor the popular tone mapping operators proposed by Reinhard for single-shot HDR. A region-based approach for preprocessing any HDR image in order to increase SNR and perceptual sharpness is introduced as an extension to our initial tone mapping framework. The results are compared with specially developed baseline tone mappers, and an extensive subjective evaluation is performed.
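
For orientation, the basic global tone mapping operator commonly attributed to Reinhard et al. can be sketched as follows. This is the plain textbook form, not the single-shot variant tailored in the paper; the function name `reinhard_global`, the default key value 0.18 and the small `eps` guard are assumptions of this example.

```python
import math

def reinhard_global(luminances, key=0.18, eps=1e-6):
    """Map HDR luminances to the range [0, 1) with L_d = L / (1 + L).

    `luminances` is a flat list of non-negative world luminance values.
    """
    # Log-average luminance of the scene; eps avoids log(0) for black pixels.
    log_avg = math.exp(
        sum(math.log(eps + lw) for lw in luminances) / len(luminances)
    )
    # Scale the scene so that its log-average maps to the chosen key value.
    scaled = [key * lw / log_avg for lw in luminances]
    # Compress: small values stay nearly linear, large values approach 1.
    return [l / (1.0 + l) for l in scaled]

if __name__ == "__main__":
    print(reinhard_global([0.01, 1.0, 100.0]))
```

The compression step amplifies dark regions relative to bright ones, which is one reason a raised noise floor, as in single-shot HDR, matters for the visual result.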

Paper Nr: 278
Title:

Study of Interference Noise in Multi-Kinect Set-up

Authors:

Tanwi Mallick, Partha Pratim Das and Arun Kumar Majumdar

Abstract: Kinect™, a low-cost multimedia sensing device, has revolutionized human computer interaction (HCI) by making various applications of human activity tracking affordable and widely available. Multiple Kinects are often used in imaging applications to improve on the field of view, depth of field and uni-directional vision of a single Kinect. Unfortunately, multiple Kinects lead to IR Interference Noise (IR Noise, in short) in the depth map. In this paper we analyse the estimators for interference noise, survey various imaging techniques to mitigate the interference at source, and characterize them in parallel to a well-known classification system in the telecom industry. Finally, we compare their performance as reported in the literature and outline our ongoing research to control interference noise by software shuttering.

Paper Nr: 287
Title:

Computational Models of Object Recognition - Goal, Role and Success

Authors:

Tayyaba Azim

Abstract: This paper surveys the learning algorithms for visual feature representation and the computational modelling approaches proposed with the aim of developing better artificial object recognition systems. It turns out that most of the learning theories and schemas have been developed either in the spirit of understanding biological facts of vision or of designing machines whose perception power is better than or competitive with that of humans. In this study, we discuss and analyse the impact of notable statistical approaches that formally map cognitive neural activity at the macro level, as well as those that work independently, without any biological inspiration, towards the goal of developing better classifiers. With the ultimate objective of classification in hand, the dimensions of research in computer vision, and in AI in general, have expanded so much that it has become important to understand whether our goals and diagnostics of visual input learning are correct. We first highlight the mainstream approaches that have been proposed to solve the classification task since the advent of the field, and then suggest some criteria of success that can guide the direction of future research.

Paper Nr: 343
Title:

Towards Relative Altitude Estimation in Topological Navigation Tasks using the Global Appearance of Visual Information

Authors:

Francisco Amorós, Luis Payá, Oscar Reinoso, David Valiente and Lorenzo Fernández

Abstract: In this work, we present a collection of different techniques for altitude estimation in topological visual navigation tasks. All the methods use descriptors based on the global appearance of the scenes. The techniques are tested using our own experimental database, which is composed of a set of omnidirectional images captured in real lighting conditions at several locations and altitudes. We use different representations of the visual information, including the panoramic and orthographic views, and the projection of the omnidirectional image onto the unit sphere. The experimental results demonstrate the effectiveness of some of the techniques.

Paper Nr: 373
Title:

Expression, Pose, and Illumination Invariant Face Recognition using Lower Order Pseudo Zernike Moments

Authors:

Madeena Sultana, Marina Gavrilova and Svetlana Yanushkevich

Abstract: Face recognition is an extremely challenging task in the presence of expression, orientation, and lighting variation. This paper presents a novel expression and pose invariant feature descriptor that combines the Daubechies discrete wavelet transform and lower order pseudo Zernike moments. A novel normalization method is also proposed to obtain illumination invariance. The proposed method can recognize face images regardless of facial orientation, expression, and illumination variation using a small number of features. An extensive experimental investigation is conducted using a large variation of facial orientation, expression, and illumination to evaluate the performance of the proposed method. Experimental results confirm that the proposed approach obtains high recognition accuracy and computational efficiency under different pose, expression, and illumination conditions.

Paper Nr: 385
Title:

Fuzzy-rule-embedded Reduction Image Construction Method for Image Enlargement with High Magnification

Authors:

Hakaru Tamukoh, Noriaki Suetake, Hideaki Kawano, Ryosuke Kubota, Byungki Cha and Takashi Aso

Abstract: This paper proposes a fuzzy-rule-embedded reduction image construction method for image enlargement. A fuzzy rule is generated by considering the distribution of pixel values around a target pixel. The generated rule is embedded into the target pixel in a reduction image. The embedded fuzzy rule is used in a fuzzy inference to generate a highly magnified image from the reduction image. Experimental results with scale factors of three and four show that the proposed method achieves high-quality image enlargement in terms of both objective and subjective evaluations in comparison with conventional methods.

Paper Nr: 402
Title:

GPU based Parallel Image Processing Library for Embedded Systems

Authors:

Mustafa Cavus, Hakkı Doganer Sumerkan, Osman Seckin Simsek, Hasan Hassan, Abdullah Giray Yaglikci and Oguz Ergin

Abstract: Embedded image processing systems face many challenges due to large computational requirements and other physical, power, and environmental constraints. However, recent mobile devices include a graphics processing unit (GPU) in order to offer a better user interface in terms of graphics. Some of these embedded GPUs also support OpenCL, which allows the computation capacity of embedded GPUs to be used for general purpose computing. With this OpenCL support, the challenges of image processing in embedded systems become easier to handle. In this paper, we present a new OpenCL-based image processing library, named TRABZ-10, which is specifically designed to run on an embedded platform. Our results show that the functions of TRABZ-10 achieve, on average, a 7X speedup on an embedded platform over the functions of OpenCV.

Paper Nr: 26
Title:

Kernel-based Adaptive Image Sampling

Authors:

Jianxiong Liu, Christos Bouganis and Peter Y. K. Cheung

Abstract: This paper presents an adaptive progressive image acquisition algorithm based on the concept of kernel construction. The algorithm takes the conventional route of blind progressive sampling to sample and reconstruct the ground truth image in an iterative manner. During each iteration, an equivalent kernel is built for each unsampled pixel to capture the spatial structure of its local neighborhood. The kernel is normalized by the estimated sample strength in the local area and used as the projection of the influence of this unsampled pixel onto the subsequent sampling procedure. The sampling priority of a candidate unsampled pixel is the sum of such projections from other unsampled pixels in the local area. Pixel locations with the highest priority are sampled in the next iteration. The algorithm does not require pre-processing or compressing the ground truth image and can therefore be used in various situations where such procedures are not possible. The experiments show that the proposed algorithm is able to capture the local structure of images and achieves better reconstruction quality than existing methods.

Paper Nr: 44
Title:

Multi-spectral Flash Imaging under Low-light Condition using Optimization with Weight Map

Authors:

Bong-Seok Choi, Dae-Chul Kim, Wang-Jun Kyung and Yeong-Ho Ha

Abstract: Long-exposure shots and flash lights are generally used to acquire images in low-light environments. However, flash lights often induce color distortion and the red-eye effect, and they can disturb the subject. On the other hand, long-exposure shots are prone to motion blur due to camera shake or subject motion. Recently, multi-spectral flash imaging has been introduced to overcome the limitations of traditional low-light photography. Multi-spectral flash imaging is performed by combining invisible and visible spectrum information. However, common multi-spectral flash approaches induce color distortion due to the lower accuracy of the invisible spectrum image. In this paper, we propose a multi-spectral flash imaging algorithm using optimization with a weight map in order to improve the color accuracy and brightness of the image. The UV/IR and visible spectrum images are first captured. Then, to compensate the luminance values under low-light conditions, tone reproduction is performed using an adaptive curve based on image features, obtained from the Naka-Rushton formula. Next, to discriminate uniform regions from detail regions, a weight map is generated using the Canny operator. Finally, the optimization objective function takes into account the output likelihood with respect to the visible light image, the sparsity of image gradients, and the spectral constraints for the IR-red and UV-blue channels. The performance of the proposed method has been subjectively evaluated using z-scores, and we also show that the output images have improved color accuracy and lower noise compared with other methods.

Paper Nr: 55
Title:

Tetrachromatic Metamerism - A Discrete, Mathematical Characterization

Authors:

Alfredo Restrepo Palacios

Abstract: Two light beams that are seen as having the same colour but that have different spectra are said to be metameric. The colour of a light beam is based on the readings of several photodetectors with different spectral responses, and metamerism results when a set of photodetectors is unable to resolve two spectra. The spectra are then said to be metameric. We are interested in exploring the concept of metamerism in the tetrachromatic case. Applications are in computer vision, computational photography and satellite imagery, for example.
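
A small discrete example may make the definition concrete. The Python sketch below uses four hypothetical photodetectors sampled over five wavelength bands (the sensitivity values are invented for illustration and are not taken from the paper) and shows two different spectra that produce identical readings, i.e. a metameric pair in a tetrachromatic setting.

```python
# Hypothetical spectral sensitivities: 4 photodetectors x 5 wavelength bands.
SENSORS = [
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1],
]

def readings(spectrum, sensors=SENSORS):
    """Each reading is the inner product of a sensitivity row and the spectrum."""
    return [sum(s * p for s, p in zip(row, spectrum)) for row in sensors]

def are_metameric(spectrum_a, spectrum_b, sensors=SENSORS):
    """True when the spectra differ but the detectors cannot tell them apart."""
    return (spectrum_a != spectrum_b
            and readings(spectrum_a, sensors) == readings(spectrum_b, sensors))

if __name__ == "__main__":
    flat = [2, 2, 2, 2, 2]
    # Adding the vector (1, -1, 1, -1, 1), which every sensor row maps to zero,
    # changes the spectrum without changing any reading.
    wavy = [3, 1, 3, 1, 3]
    print(readings(flat), readings(wavy), are_metameric(flat, wavy))
```

With more bands than detectors, the sensor matrix always has a non-trivial null space, so such pairs exist for any finite set of detectors as long as the perturbed spectrum stays non-negative.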

Paper Nr: 137
Title:

A Combined Calibration of 2D and 3D Sensors - A Novel Calibration for Laser Triangulation Sensors based on Point Correspondences

Authors:

Alexander Walch and Christian Eitzinger

Abstract: In this paper we describe a 2D/3D vision sensor, which consists of a laser triangulation sensor and a matrix colour camera. The output of this sensor is the fusion of the 3D data delivered by the laser triangulation sensor and the colour information of the matrix camera in the form of a coloured point cloud. For this purpose, a novel calibration method for the laser triangulation sensor was developed, which makes it possible to use one common calibration object for both cameras and provides their relative spatial position. A sensor system with a SICK Ranger E55 profile scanner and a DALSA Genie color camera was set up to test the calibration in terms of the quality of the match between the color information and the 3D point cloud.

Paper Nr: 159
Title:

Converting Underwater Imaging into Imaging in Air

Authors:

Tim Dolereit and Arjan Kuijper

Abstract: The application of imaging devices in underwater environments has become common practice. Protecting the camera's electrical components against water leads to refractive effects emanating from the water-glass-air transition of light rays. These non-linear distortions cannot be modeled by the pinhole camera model. For our new approach we focus on flat interface systems. By handling refractive effects properly, we are able to convert the problem to imaging conditions in air. We show that, based on the location of virtual object points in water, virtual parameters of a camera following the pinhole camera model can be computed per image ray. This enables us to image the same object as if it were situated in air. Our novel approach works for an arbitrary camera orientation relative to the refractive interface. We show experimentally that our adopted physical methods can be used for the computation of 3D object points by a stereo camera system with much higher precision than with a naive in-situ calibration.

Paper Nr: 164
Title:

Exemplar-based Human Body Super-resolution for Surveillance Camera Systems

Authors:

Kento Nishibori, Tomokazu Takahashi, Daisuke Deguchi, Ichiro Ide and Hiroshi Murase

Abstract: In this paper, we propose an exemplar-based super-resolution method applied to a human body in a surveillance video. Since persons are usually captured as low-resolution images by a video surveillance system, it is sometimes necessary to detect and identify persons not only from the face but also from the appearance of the body. Super-resolution of a human body image is difficult because the appearance of person images varies according to the color of the clothing and the posture of the person. Thus, we focus on high-frequency components that can restore the lost high-frequency components of the low-resolution image regardless of the variation of the clothing. The purpose of the work presented in this paper is therefore to apply exemplar-based super-resolution using high-frequency components to a low-resolution human body image in order to generate a high-resolution human body image, so that both computer systems and humans can identify persons more accurately. Experiments confirmed the effectiveness of the proposed super-resolution method.

Paper Nr: 179
Title:

Stabilization of Endoscopic Videos using Camera Path from Global Motion Vectors

Authors:

Navya Amin, Thomas Gross, Marvin C. Offiah, Susanne Rosenthal, Nail El-Sourani and Markus Borschbach

Abstract: Many algorithms for video stabilization have been proposed so far. However, few digital video stabilization procedures for endoscopic videos have been discussed. Endoscopic videos contain immense shakes and distortions as a result of internal factors such as body movements or secretion of body fluids, as well as external factors such as manual handling of endoscopic devices, introduction of surgical devices into the body, luminance changes, etc. The feature detection and tracking approaches that successfully stabilize non-endoscopic videos might not give similar results for endoscopic videos due to the presence of these distortions. Our research focuses on developing a stabilization algorithm for such videos. This paper focuses on a special motion estimation method which uses global motion vectors for tracking, applied to different endoscopic types (while taking into account the endoscopic region of interest). It presents a robust video processing and stabilization technique that we have developed and the results of comparing it with state-of-the-art video stabilization tools. It also discusses the problems specific to endoscopic videos and the processing techniques that were necessary for such videos, unlike for real-world videos.

Paper Nr: 219
Title:

A Block Size Optimization Algorithm for Parallel Image Processing

Authors:

J. Alvaro Fernandez and M. Dolores Moreno

Abstract: The aim of this work is to define a strategy for rectangular block partitioning that can be adapted to the number of available processing units in a parallel processing machine, regardless of the input data size. With this motivation, an algorithm for optimal vector block partitioning is introduced and tested in a typical parallel image application. The proposed algorithm provides a novel partition method that reduces data sharing between blocks and maintains block sizes as equal as possible for any input size.
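
The requirement of keeping block sizes as equal as possible for any input size can be illustrated in one dimension. The sketch below is a generic balanced split, not the optimization algorithm proposed in the paper; the function name `balanced_blocks` is an assumption of this example. A rectangular partition can be obtained by applying the same split independently to rows and columns.

```python
def balanced_blocks(length, parts):
    """Split range(length) into `parts` contiguous blocks whose sizes differ by at most 1.

    Returns a list of (start, stop) index pairs, with stop exclusive.
    """
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(length, parts)
    bounds = []
    start = 0
    for i in range(parts):
        # The first `extra` blocks absorb one leftover element each.
        size = base + (1 if i < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds

if __name__ == "__main__":
    rows = balanced_blocks(10, 3)
    cols = balanced_blocks(7, 2)
    # Rectangular blocks as (row_range, col_range) pairs, one per processing unit.
    print([(r, c) for r in rows for c in cols])
```

Contiguous blocks also keep the shared border between neighbours short, which matters for filters that need pixels from adjacent blocks.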

Paper Nr: 243
Title:

Local Regression based Colorization Coding

Authors:

Paul Oh, Suk Ho Lee and Moon Gi Kang

Abstract: A new coding technique for color images based on a colorization method is proposed. In colorization-based image coding, the encoder selects the colorization coefficients according to the basis made from the luminance channel. Then, in the decoder, the chrominance channels are reconstructed using the luminance channel and the colorization coefficients sent from the encoder. The main issue in colorization-based coding is to extract colorization coefficients such that the compression rate and the quality of the reconstructed color become good enough. In this paper, we use a local regression method to extract the correlated features between the luminance channel and the chrominance channels. The local regions are obtained by performing image segmentation on the luminance channel in both the encoder and the decoder. Then, in the decoder, the chrominance values in each local region are reconstructed via a local regression method. The use of the correlated features helps to colorize the image with more detail. The experimental results show that the proposed algorithm performs better than JPEG and JPEG2000 in terms of the compression rate and the PSNR value.

Paper Nr: 257
Title:

A Novel Fusion Algorithm for Visible and Infrared Image using Non-subsampled Contourlet Transform and Pulse-coupled Neural Network

Authors:

Chihiro Ikuta, Songjun Zhang, Yoko Uwate, Guoan Yang and Yoshifumi Nishio

Abstract: Fusing visible and infrared images is a significant task for computer vision applications such as multi-sensor systems. A visible image is clear and can be perceived well by the naked eye, but it often suffers from noise, whereas an infrared image is unclear but has a high anti-noise property. In this paper, we propose a novel image fusion algorithm for visible and infrared images using a non-subsampled contourlet transform (NSCT) and a pulse-coupled neural network (PCNN). First, we decompose the two original images into low- and high-frequency coefficients using the NSCT. Next, the low-frequency coefficients of both images are duplicated at multiple scales and processed by a Laplacian filter and an average filter, respectively. We then fuse the normalized coefficients using the PCNN. Finally, we reconstruct a fused image from the fused low- and high-frequency coefficients using the inverse NSCT. Experimental results show that the proposed image fusion algorithm surpasses conventional and state-of-the-art image fusion algorithms.

Paper Nr: 259
Title:

Multi-scale Regions from Edge Fragments - A Graph Theory Approach

Authors:

Wajahat Kazmi and Hans Jørgen Andersen

Abstract: In this article we introduce a novel method for detecting multi-scale salient regions around edges using a graph-based image compression algorithm. Images are recursively decomposed into triangles arranged into a binary tree using linear interpolation. The entropy of any local region of the image is inherent in the areas of the triangles and the tree depth. We introduce twin leaves as nodes whose siblings share the same characteristics. Triangles corresponding to the twin leaves are filtered out from the binary tree. Graph connectivity is exploited to obtain clusters of triangles, followed by ellipse fitting to estimate regions. Salient regions are thus formed as stable regions around edges. The tree hierarchy is then used to generate multi-scale regions. We evaluate our detector by performing image retrieval tests on our building database, which show that, combined with Spin Images (Lazebnik et al., 2003), its performance is comparable to SIFT (Lowe, 2004). We also show that when the regions are used together with MSERs (Matas et al., 2002), the performance of MSERs is boosted.

Paper Nr: 306
Title:

3D Object Emphasis using Multiple Projectors

Authors:

Shohei Takada, Fumihiko Sakaue and Jun Sato

Abstract: In this paper, we propose a method for emphasizing 3D shapes by using patterned light projection from multiple projectors. In this method, we project patterned light from multiple projectors, and the patterned light is mixed at the surface of objects. As a result, object regions which differ from preregistered 3D shapes are colored and emphasized visually. The method does not need any computation for image processing, since the image processing is achieved by mixing the light projected from multiple projectors. Furthermore, we do not need to find image correspondences in order to obtain 3D information about objects. We also propose a method for generating projection patterns for visualizing small differences in 3D shapes, such as shape defects. The efficiency of the proposed method is tested using multiple projectors.

Paper Nr: 353
Title:

SKen: A Statistical Test for Removing Outliers in Optical Flow - A 3D Reconstruction Case

Authors:

Samuel Macedo, Luis Vasconcelos, Vinicius Cesar, Saulo Pessoa and Judith Kelner

Abstract: 3D reconstruction can be employed in several areas, such as markerless augmented reality, manipulation of interactive virtual objects, and handling the occlusion of virtual objects by real ones. However, many improvements can still be made to the 3D reconstruction pipeline to increase its efficiency. In this context, this paper proposes a filter for optimizing a 3D reconstruction pipeline. It presents the SKen technique, a statistical hypothesis test that classifies features by checking the smoothness of their trajectories. Although it was not mathematically proven that inlier features follow smooth paths, this work shows some evidence of a relationship between smoothness and inliers. By removing features that did not present smooth paths, the quality of the 3D reconstruction was enhanced.

Paper Nr: 365
Title:

Synopsis of an Engineering Solution for a Painful Problem - Phantom Limb Pain

Authors:

A. Mousavi, J. Cole, T. Kalganova, R. Stone, J. Zhang, S. Pettifer, R. Walker, P. Nikopoulou-Smyrni, D. Henderson Slater, A. Aggoun, S. Von Rump and S. Naylor

Abstract: This paper is a synopsis of a recently proposed solution for treating patients who suffer from Phantom Limb Pain (PLP). The underpinning approach of this research and development project is based on an extension of “mirror box” therapy which has had some promising results in pain reduction. An outline of an immersive individually tailored environment giving the patient a virtually realised limb presence, as a means to pain reduction is provided. The virtual 3D holographic environment is meant to produce immersive, engaging and creative environments and tasks to encourage and maintain patients’ interest, an important aspect in two of the more challenging populations under consideration (over-60s and war veterans). The system is hoped to reduce PLP by more than 3 points on an 11 point Visual Analog Scale (VAS), when a score less than 3 could be attributed to distraction alone.

Paper Nr: 384
Title:

Switching Median Filter with Signal Dependent Thresholds Designed by using Genetic Algorithm

Authors:

Ryosuke Kubota, Keisuke Onaga and Noriaki Suetake

Abstract: In this paper, we propose a new switching median filter with signal dependent thresholds designed by a genetic algorithm (GA). The switching median filter detects noise-corrupted pixels based on a threshold. Then it restores only the detected pixels. The present switching median filter deals with random-valued impulse noise, whose distribution is ideally assumed to be uniform. In the present method, a switching median filter with two kinds of thresholds is introduced. One kind is the switching thresholds used to detect the noise, and the other is the selecting thresholds used to choose the suitable switching threshold. As the suitable selecting threshold, a variance of signals is used. Then all of the switching and selecting thresholds of the proposed switching median filter are automatically optimized by using GA. To optimize the thresholds with GA, the distribution distance between the assumed and the detected noise is employed as a fitness function. The validity and effectiveness of the proposed method are verified by some experiments.
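As a rough illustration of the switching idea described in this abstract, the following Python sketch replaces a pixel by its local median only when the pixel deviates from that median by more than a threshold. The single fixed threshold, the 3x3 window and the function name `switching_median` are assumptions made for illustration; the signal-dependent thresholds and their GA optimization are not reproduced here.

```python
from statistics import median

def switching_median(img, threshold=40):
    """Replace a pixel by its 3x3 median only if it deviates by more than threshold."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            # Collect the 3x3 neighborhood, clamping coordinates at the borders.
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            m = median(window)
            # Switching step: only pixels flagged as noisy are restored.
            if abs(img[y][x] - m) > threshold:
                out[y][x] = m
    return out

# A flat patch with one impulse in the middle (hypothetical data).
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
restored = switching_median(noisy)
```

Pixels that pass the threshold test are left untouched, which is what distinguishes a switching filter from a plain median filter.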

Area 2 - Image and Video Analysis

Full Papers
Paper Nr: 19
Title:

Polygonal Approximation of an Object Contour by Detecting Edge Dominant Corners using Iterative Corner Suppression

Authors:

Rabih Al Nachar, Elie Inaty, Patrick J. Bonnin and Yasser Alayli

Abstract: A new algorithm to detect straight edge parts which form the contour of an object presented in an image is discussed in this paper. This algorithm is very robust and can detect true straight edges even when their pixels' locations are not straight due to natural noise at the object borders. These straight edges are then used to report and classify the contour's corners according to their angles and the lengths of their adjacent segments. A new technique for polygonal approximation is also presented to find the best set among these corners to construct the polygon vertices that best describe the approximating contour. It starts by eliminating the corners one after the other using an Iterative Corner Suppression (ICS) process. This in turn enables us to obtain the smallest possible error in the approximation. Experimental results demonstrate the efficiency of this technique in comparison with recently proposed algorithms.
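The idea of eliminating corners one after the other can be sketched with a simple greedy vertex-removal loop in Python. This is a stand-in under stated assumptions, not the ICS process itself: the removal criterion (distance of a vertex to the chord joining its neighbors) and the names `point_segment_distance` and `suppress_corners` are hypothetical.

```python
def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return ((px - ax) ** 2 + (py - ay) ** 2) ** 0.5
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    cx, cy = ax + t * dx, ay + t * dy
    return ((px - cx) ** 2 + (py - cy) ** 2) ** 0.5

def suppress_corners(polygon, target_count):
    """Greedily drop the corner whose removal changes the closed contour least."""
    pts = list(polygon)
    while len(pts) > max(target_count, 3):
        n = len(pts)
        # Error of removing vertex i: its distance to the chord of its two neighbors.
        errors = [
            point_segment_distance(pts[i], pts[i - 1], pts[(i + 1) % n])
            for i in range(n)
        ]
        del pts[errors.index(min(errors))]
    return pts

# A square with one nearly collinear extra corner (hypothetical data).
contour = [(0, 0), (5, 0.1), (10, 0), (10, 10), (0, 10)]
simplified = suppress_corners(contour, target_count=4)
```

The nearly collinear corner is removed first because suppressing it introduces the smallest approximation error.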

Paper Nr: 52
Title:

The World vs. SCOTT - Synthesis of COncealment Two-level Texture

Authors:

Julien Gosseaume, Kidiyo Kpalma and Joseph Ronsin

Abstract: We propose an original method of Synthesis of COncealment Two-level Texture (SCOTT). SCOTT was designed according to the Human Visual System so that the concealment texture is faithful to the visual environment it will be placed in, in terms of forms and colors. The results of simulation prove that the concealment texture is efficient although it is made of simple forms and only a few colors. Even if SCOTT has initially been designed for an application of reducing the visual pollution caused by man-made equipment (antennas, electrical cabinets, distributor boxes, repeater shelters, etc.), it may be used in many applications, such as inpainting, and even in image compression.

Paper Nr: 103
Title:

A Saliency-based Framework for 2D-3D Registration

Authors:

Mark Brown, Jean-Yves Guillemaut and David Windridge

Abstract: Here we propose a saliency-based filtering approach to the problem of registering an untextured 3D object to a single monocular image. The principle of saliency can be applied to a range of modalities and domains to find intrinsically descriptive entities from amongst detected entities, making it a rigorous approach to multi-modal registration. We build on the Kadir-Brady saliency framework due to its principled information-theoretic approach which enables us to naturally extend it to the 3D domain. The salient points from each domain are initially aligned using the SoftPosit algorithm. This is subsequently refined by aligning the silhouette with contours extracted from the image. Whereas other point based registration algorithms focus on corners or straight lines, our saliency-based approach is more general as it is more widely applicable e.g. to curved surfaces where a corner detector would fail. We compare our salient point detector to the Harris corner and SIFT keypoint detectors and show it generally achieves superior registration accuracy.

Paper Nr: 120
Title:

Generic and Real-time Detection of Specular Reflections in Images

Authors:

Alexandre Morgand and Mohamed Tamaazousti

Abstract: In this paper, we propose a generic and efficient method for real-time detection of specular reflections in images. The method relies on a new thresholding technique applied in the Hue-Saturation-Value (HSV) color space. A detailed experimental study was conducted in this color space to highlight specular reflections’ properties. Current state-of-the-art methods have difficulties with lighting jumps, being either too specific or too computationally expensive for real-time applications. Our method addresses this problem using the following three steps: an adaptation of the contrast of the image to handle lighting jumps, an automatic thresholding to isolate specular reflections and a post-processing step to further reduce the number of false detections. This method has been compared with the state-of-the-art according to our two proposed experimental protocols, based on contours and gravity center, and offers fast and accurate results in real-time without a priori knowledge of the image.
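A minimal Python sketch of thresholding in HSV space is given below, assuming that specular pixels are bright and weakly saturated. The fixed thresholds `s_max` and `v_min` and the function name `specular_mask` are hypothetical; the contrast adaptation, automatic threshold selection and post-processing steps named in the abstract are not reproduced.

```python
import colorsys

def specular_mask(rgb_image, s_max=0.2, v_min=0.9):
    """Flag pixels with low saturation and high value as specular candidates."""
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            # colorsys expects channel values in [0, 1].
            _, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(s < s_max and v > v_min)
        mask.append(mask_row)
    return mask

# Hypothetical 2x2 image: near-white highlight, saturated red, dark gray, pure white.
image = [
    [(250, 250, 245), (200, 30, 30)],
    [(20, 20, 20), (255, 255, 255)],
]
mask = specular_mask(image)
```

Fixed thresholds like these are exactly what tends to break under lighting jumps, which is the motivation the abstract gives for adapting the contrast first.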

Paper Nr: 121
Title:

On the Segmentation and Classification of Water in Videos

Authors:

Pascal Mettes, Robby T. Tan and Remco Veltkamp

Abstract: The automatic recognition of water entails a wide range of applications, yet little attention has been paid to solve this specific problem. Current literature generally treats the problem as a part of more general recognition tasks, such as material recognition and dynamic texture recognition, without distinctively analyzing and characterizing the visual properties of water. The algorithm presented here introduces a hybrid descriptor based on the joint spatial and temporal local behaviour of water surfaces in videos. The temporal behaviour is quantified based on temporal brightness signals of local patches, while the spatial behaviour is characterized by Local Binary Pattern histograms. Based on the hybrid descriptor, the probability of a small region of being water is calculated using a Decision Forest. Furthermore, binary Markov Random Fields are used to segment the image frames. Experimental results on a new and publicly available water database and a subset of the DynTex database show the effectiveness of the method for discriminating water from other dynamic and static surfaces and objects.
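The spatial part of such a descriptor can be illustrated with a basic 8-neighbor Local Binary Pattern histogram in Python. This is a sketch of the standard LBP operator only; the neighbor ordering, the `>=` comparison and the function names are assumptions, and the temporal brightness signals, Decision Forest and Markov Random Field stages are not shown.

```python
def lbp_code(img, y, x):
    """8-bit LBP code of the pixel at (y, x), comparing neighbors to the center."""
    center = img[y][x]
    # Neighbors visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dy, dx) in enumerate(offsets):
        if img[y + dy][x + dx] >= center:
            code |= 1 << bit
    return code

def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for y in range(1, len(img) - 1):
        for x in range(1, len(img[0]) - 1):
            hist[lbp_code(img, y, x)] += 1
    return hist

# Hypothetical 3x3 patches: an isolated bright peak and a flat region.
peak = [[5, 5, 5], [5, 9, 5], [5, 5, 5]]
flat = [[7, 7, 7], [7, 7, 7], [7, 7, 7]]
```

In a patch-based setting, one histogram per local patch would serve as the spatial feature vector passed to a classifier.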

Paper Nr: 122
Title:

Segmentation of Optic Disc in Retina Images using Texture

Authors:

Suraya Mohammad, D. T. Morris and Neil Thacker

Abstract: The paper describes our work on the segmentation of the optic disc in retinal images. Our approach comprises two main steps: a pixel classification method to identify pixels that may belong to the optic disc boundary and a circular template matching method to estimate the circular approximation of the optic disc boundary. The features used are based on texture, calculated using the intensity differences of local image patches. This was adapted from Binary Robust Independent Elementary Features (BRIEF). BRIEF is inherently invariant to image illumination and has a lower degree of computational complexity compared to other existing texture measurement methods. Fuzzy C-Means (FCM) and Naive Bayes are the clustering method and the classifier used to cluster/classify the image pixels. The method was tested on a set of 196 images composed of 110 healthy retina images and 86 glaucomatous images. The average mean overlap ratio between the true optic disc region and segmented region is 0.81 for both FCM and Naive Bayes. Comparison with a method based on the Hough Transform is also provided.

Paper Nr: 127
Title:

Comparison of Different Color Spaces for Image Segmentation using Graph-cut

Authors:

Xi Wang, Ronny Hänsch, Lizhuang Ma and Olaf Hellwich

Abstract: Graph-cut optimization has been successfully applied in many image segmentation tasks. Within this framework color information has been extensively used as a perceptual property of objects to segment the foreground object from background. There are different representations of color in digital images, each with special characteristics. Previous work on segmentation lacks a systematic study of which color space is better suited for image segmentation. This work applies the Graph Cut algorithm for image segmentation based on five different, widespread color spaces and evaluates their performance on public benchmark datasets. Most of the tested color spaces lead to similar results. Segmentations based on L*a*b* color space are of slightly higher or similar quality as all the other methods. In contrast, RGB-based segmentations are mostly worse than a segmentation based on any other tested color space.

Paper Nr: 171
Title:

Fast Segmentation for Texture-based Cartography of whole Slide Images

Authors:

Grégory Apou, Benoît Naegel, Germain Forestier, Friedrich Feuerhake and Cédric Wemmert

Abstract: In recent years, new optical microscopes have been developed, providing very high spatial resolution images called Whole Slide Images (WSI). The fast and accurate display of such images for visual analysis by pathologists and the conventional automated analysis remain challenging, mainly due to the image size (sometimes billions of pixels) and the need to analyze certain image features at high resolution. To propose a decision support tool to help the pathologist interpret the information contained by the WSI, we present a new approach to establish an automatic cartography of WSI in reasonable time. The method is based on an original segmentation algorithm and on a supervised multiclass classification using a textural characterization of the regions computed by the segmentation. Application to breast cancer WSI shows promising results in terms of speed and quality.

Paper Nr: 183
Title:

Watershed from Propagated Markers based on Morphological Hierarchical Segmentation and Graph Matching

Authors:

André Roberto Ortoncelli and Franklin César Flores

Abstract: Watershed from propagated markers is a generic method for interactive segmentation of objects in image sequences, obtained by combining the classical watershed from markers technique with motion estimation. The segmentation mask, given by the segmentation of the object in the previous frame, is the main parameter used to compute a set of markers to segment the same object in the current frame. This paper introduces a new version of the watershed from propagated markers. In this proposal, the set of markers and its associated model graph are constructed as a function of the segmentation mask. The input graph is constructed from the hierarchical segmentation of the next frame. The graph matching between the model graph and the input graph provides a pre-segmentation mask that is used to compute the initial markers for the next frame. Experiments were done to illustrate the performance of the new version and to compare it with methods found in the literature and with previous versions of the watershed from propagated markers.

Paper Nr: 187
Title:

Image Compensation for Improving Extraction of Driver’s Facial Features

Authors:

Jung-Ming Wang, Han-Ping Chou, Sei-Wang Chen and Chiou-Shann Fuh

Abstract: Extracting a driver’s facial features helps to identify the vigilance level of the driver. Some research on facial feature extraction has also been developed for the controlled interface of a vehicle. To acquire the facial features of drivers, research using various visual sensors has been reported. However, potential challenges to such work include rapid illumination variation resulting from ambient lights, abrupt lighting changes (e.g., entering/exiting tunnels and sunshine/shadow), and partial occlusion. In this paper, we propose an image compensation method for improving the extraction of a driver’s facial features. This method has the advantages of fast processing and high adaptation. Our experiments show that the extraction of a driver’s facial features can be improved significantly.

Paper Nr: 239
Title:

Edge-based Foreground Detection with Higher Order Derivative Local Binary Patterns for Low-resolution Video Processing

Authors:

Francis Deboeverie, Gianni Allebosch, Dirk Van Haerenborgh, Peter Veelaert and Wilfried Philips

Abstract: Foreground segmentation is an important task in many computer vision applications and a commonly used approach to separate foreground objects from the background. Extremely low-resolution foreground segmentation, e.g. on video with resolution of 30x30 pixels, requires modifications of traditional high-resolution methods. In this paper, we adapt a texture-based foreground segmentation algorithm based on Local Binary Patterns (LBPs) into an edge-based method for low-resolution video processing. The edge information in the background model is introduced by a novel LBP strategy with higher order derivatives. Therefore, we propose two new LBP operators. Similar to the gradient operator and the Laplacian operator, the edge information is obtained by the magnitudes of First Order Derivative LBPs (FOD-LBPs) and the signs of Second Order Derivative LBPs (SOD-LBPs). Posterior to background subtraction, foreground corresponds to edges on moving objects. The method is implemented and tested on low-resolution images produced by monochromatic smart sensors. In the presence of illumination changes, the edge-based method outperforms texture-based foreground segmentation at low resolutions. In this work, we demonstrate that edge information becomes more relevant than texture information when the image resolution scales down.

Paper Nr: 252
Title:

Hierarchical Bayesian Modelling of Visual Attention

Authors:

Jinhua Xu

Abstract: The brain employs interacting bottom-up and top-down processes to speed up searching and recognizing visual targets relevant to specific behavioral tasks. In this paper, we proposed a Bayesian model of visual attention that optimally integrates top-down, goal-driven attention and bottom-up, stimulus-driven visual saliency. In this approach, we formulated a multi-scale hierarchical model of objects in natural contexts, where the computing nodes at the higher levels have lower resolutions and larger sizes than the nodes at the lower levels, and provide local contexts for the nodes at the lower levels. The conditional probability of a visual variable given its context is calculated in an efficient way. The model entails several existing models of visual attention as its special cases. We tested this model as a predictor of human fixations in free-viewing and object searching tasks in natural scenes and found that the model performed very well.

Paper Nr: 260
Title:

Shape Similarity based Surface Registration

Authors:

Manuel Frei and Simon Winkelbach

Abstract: In the last 20 years many approaches for the registration and localization of surfaces were developed. Most of them generate solutions by minimizing point distances or maximizing contact areas between surface points. Other algorithms try to detect corresponding points on the two surfaces by searching for points with same features and align them. However, aligning and localizing self-similar surfaces or surfaces having large regions with approximately constant curvature is still a complex problem. In this paper a new algorithm for registration and matching of surfaces is introduced, which extends an approach maximizing the contact area between the surfaces by surface-based dissimilarity features and thereby solves the problem of registering the problematic surfaces described above. Our evaluation shows the great potential of our approach regarding efficiency, accuracy and robustness for various applications like scan alignment, pottery assembly or bone reduction.

Paper Nr: 290
Title:

M5AIE - A Method for Body Part Detection and Tracking using RGB-D Images

Authors:

Andre Brandao, Leandro A. F. Fernandes and Esteban Clua

Abstract: The automatic detection and tracking of human body parts in color images is highly sensitive to appearance features such as illumination, skin color and clothes. As a result, the use of depth images has been shown to be an attractive alternative to color images due to its invariance to lighting conditions. However, body part detection and tracking is still a challenging problem, mainly because the shape and depth of the imaged body can change depending on the perspective. We present a hybrid approach, called M5AIE, that uses both color and depth information to perform body part detection, tracking and pose classification. We have developed a modified Accumulative Geodesic Extrema (AGEX) approach for detecting body part candidates. We have also used the Affine-SIFT (ASIFT) algorithm for feature extraction, and we have adapted the conventional matching method to perform tracking and labeling of body parts in a sequence of images that has color and depth information. The results produced by our tracking system were used with the C4.5 Gain Ratio Decision Tree, the naïve Bayes and the KNN classification algorithms for the identification of the user’s pose.

Paper Nr: 295
Title:

Real-time Emotion Recognition - Novel Method for Geometrical Facial Features Extraction

Authors:

Claudio Loconsole, Catarina Runa Miranda, Gustavo Augusto, Antonio Frisoli and Verónica Costa Orvalho

Abstract: Facial emotions provide an essential source of information commonly used in human communication. For humans, their recognition is automatic and is done by exploiting the real-time variations of facial features. However, the replication of this natural process using computer vision systems is still a challenge, since automation and real-time system requirements are compromised in order to achieve an accurate emotion detection. In this work, we propose and validate a novel methodology for facial feature extraction to automatically recognize facial emotions, achieving an accurate degree of detection. This methodology uses a real-time face tracker output to define and extract two new types of features: eccentricity and linear features. Then, the features are used to train a machine learning classifier. As a result, we obtain a processing pipeline that allows classification of Ekman’s six basic emotions (plus Contemptuous and Neutral) in real-time, not requiring any manual intervention or prior information of facial traits.

Paper Nr: 323
Title:

Key-point Detection with Multi-layer Center-surround Inhibition

Authors:

Foti Coleca, Sabrina Zîrnovean, Thomas Käster, Thomas Martinetz and Erhardt Barth

Abstract: We present a biologically inspired algorithm for key-point detection based on multi-layer and nonlinear center-surround inhibition. A Bag-of-Visual-Words framework is used to evaluate the performance of the detector on the Oxford-IIIT Pet Dataset for pet recognition. The results demonstrate an increased performance of our algorithm compared to the SIFT key-point detector. We further improve the recognition rate by separately training codebooks for the ON- and OFF-type key points. The results show that our key-point detection algorithms outperform the SIFT detector by having a lower recognition-error rate over a whole range of different key-point densities. Randomly selected key-points are also outperformed.

Paper Nr: 324
Title:

Delineation of Rock Fragments by Classification of Image Patches using Compressed Random Features

Authors:

Geoff Bull, Junbin Gao and Michael Antolovich

Abstract: Monitoring of rock fragmentation is a commercially important problem for the mining industry. Existing analysis methods either resort to physically sieving rock samples or use image analysis software. The currently available software systems for this problem typically work with 2D images and often require a significant amount of time by skilled human operators, particularly to accurately delineate rock fragments. Recent research into 3D image processing promises to overcome many of the issues with analysis of 2D images of rock fragments. However, for many mines it is not feasible to replace their existing image collection systems and there is still a need to improve on methods used for analysing 2D images. This paper proposes a method for delineation of rock fragments using compressed Haar-like features extracted from small image patches, with classification by a support vector machine. The optimum size of image patches and the numbers of compressed features have been determined empirically. Delineation results for images of rocks were superior to those obtained using the watershed algorithm with manually assigned markers. Using compressed features is demonstrated to improve the computational efficiency such that a machine learning solution is viable.

Paper Nr: 331
Title:

Monte-Carlo Image Retargeting

Authors:

Roberto Gallea, Edoardo Ardizzone and Roberto Pirrone

Abstract: In this paper an efficient method for image retargeting is proposed. It relies on a Monte Carlo model that makes use of image saliency. Each random sample is drawn from a suitably defined deformation probability mass function, and shrinks or enlarges the image by a fixed size. The shape of the function, which determines which regions of the image are affected by the deformations, depends on the image saliency. Highly informative regions are less likely to be chosen, while low-saliency regions are more likely. Such a model does not require any optimization, since its solution is obtained by repeatedly drawing random samples, and it allows real-time application even for large images. Computation time can be further improved using a parallel implementation. The approach is fully automatic, though it can be improved by interactively providing cues such as geometric constraints and/or automatic or manual labeling of relevant objects. The results show that the presented method achieves results comparable or superior to reference methods, while improving efficiency.
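The sampling idea can be sketched in Python as repeated draws from a probability mass function over image columns, where low-saliency columns are more likely to be removed. Everything specific here is an assumption for illustration: the per-column saliency input, the inverse-saliency weighting, the one-column shrink per sample and the function name `retarget_width`. The abstract does not specify the actual deformation function.

```python
import random

def retarget_width(image, column_saliency, new_width, seed=0):
    """Shrink an image by repeatedly sampling a column to delete.

    Columns are drawn from a probability mass function that favors
    low-saliency columns (hypothetical weighting: 1 / saliency).
    """
    rng = random.Random(seed)
    image = [row[:] for row in image]  # work on a copy
    saliency = list(column_saliency)
    while len(saliency) > new_width:
        # Low-saliency columns receive a high removal weight.
        weights = [1.0 / (1e-6 + s) for s in saliency]
        col = rng.choices(range(len(saliency)), weights=weights, k=1)[0]
        for row in image:
            del row[col]
        del saliency[col]
    return image

# Hypothetical 2x6 image whose two middle columns are highly salient.
image = [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4, 5]]
saliency = [0.1, 0.1, 10.0, 10.0, 0.1, 0.1]
narrow = retarget_width(image, saliency, new_width=4)
```

No optimization problem is solved at any point; each step is a single weighted random draw, which is what keeps this kind of scheme cheap.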

Short Papers
Paper Nr: 48
Title:

Statistical Models of Shape and Spatial Relation-application to Hippocampus Segmentation

Authors:

Saïd Ettaïeb, Kamel Hamrouni and Su Ruan

Abstract: This paper presents a new method based on both an Active Shape Model (ASM) and a spatial distance model to segment brain structures. It combines two types of a priori knowledge: the structure shapes and the distances between them. This knowledge consists of shape and distance variability which are estimated during a training step. Then, the obtained models are used to simultaneously guide the evolution of initial structure shapes towards the target contours. The proposed models are applied to extract two hippocampal regions on coronal MRI of the brain. The obtained results are encouraging and show the performance of the proposed model.

Paper Nr: 86
Title:

Focus Evaluation Approach for Retinal Images

Authors:

Diana Veiga, Carla Pereira, Manuel Ferreira, Luís Gonçalves and João Monteiro

Abstract: Digital fundus photographs are often used to provide clinical diagnostic information about several pathologies such as diabetes, glaucoma, macular degeneration and vascular and neurologic disorders. To allow a precise analysis, digital fundus image quality should be assessed to evaluate if minimum requirements are present. Focus is one of the causes of low image quality. This paper describes a method that automatically classifies fundus images as focused or defocused. Various focus measures described in literature were tested and included in a feature vector for the classification step. A neural network classifier was used. HEI-MED and MESSIDOR image sets were utilized in the training and testing phase, respectively. All images were correctly classified by the proposed algorithm.

Paper Nr: 91
Title:

A New Algorithm for Objective Video Quality Assessment on Eye Tracking Data

Authors:

Maria Grazia Albanesi and Riccardo Amadeo

Abstract: In this paper, we present an innovative algorithm based on a voting process approach, to analyse the data provided by an eye tracker during tasks of user evaluation of video quality. The algorithm relies on the hypothesis that a lower quality video is more “challenging” for the Human Visual System (HVS) than a high quality one, and therefore visual impairments influence the user viewing strategy. The goal is to generate a map of saliency of the human gaze on video signals, in order to create a No Reference objective video quality assessment metric. We consider the impairment of video compression (H.264/AVC algorithm) to generate different versions of video quality. We propose a protocol that assigns different playlists to different user groups, in order to avoid any effect of memorization of the visual stimuli on strategy. We applied our algorithm to data generated on a heterogeneous set of video clips, and the final result is the computation of statistical measures which provide a rank of the videos according to the perceived quality. Experimental results show that there is a strong correlation between the metric we propose and the quality of impaired video, and this fact confirms the initial hypothesis.

Paper Nr: 96
Title:

Depth-Scale Method in 3D Registration of RGB-D Sensor Outputs

Authors:

Ismail Bozkurt and Egemen Özden

Abstract: Automatic registration of 3D scans with RGB data is studied in this paper. In contrast to the bulk of research in the field, which relies on 3D geometry consistency, local RGB image feature matches are used to solve for the unknown 3D rigid transformation. The key novelty in this work is the introduction of a new, simple measure, which we call the “Depth-scale measure”, that represents the size of the local image features in the 3D world, thanks to the availability of depth data from the sensor. We show through experimental results that, depending on the operating characteristics of the target application, this measure can be useful and efficient in eliminating outliers. System-level details are also given to help scientists who want to build a similar system.
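One plausible reading of such a measure, assuming a pinhole camera, is that the physical size of a feature is roughly its image scale multiplied by its depth and divided by the focal length. The Python sketch below uses that assumption to reject matches whose two features disagree in physical size. The formula, the tolerance and the names `depth_scale` and `is_consistent` are hypothetical and are not taken from the paper.

```python
def depth_scale(scale_px, depth_m, focal_px):
    """Approximate physical size of a feature under a pinhole camera model."""
    return scale_px * depth_m / focal_px

def is_consistent(match, focal_px, tolerance=0.25):
    """Keep a match only if both features have similar physical size."""
    (scale_a, depth_a), (scale_b, depth_b) = match
    size_a = depth_scale(scale_a, depth_a, focal_px)
    size_b = depth_scale(scale_b, depth_b, focal_px)
    return abs(size_a - size_b) <= tolerance * max(size_a, size_b)

# Each match pairs (feature scale in pixels, depth in meters) from two scans.
matches = [
    ((8.0, 1.0), (4.0, 2.0)),  # same physical size seen from twice the distance
    ((8.0, 1.0), (8.0, 3.0)),  # same image scale at very different depths
]
inliers = [m for m in matches if is_consistent(m, focal_px=525.0)]
```

A check of this kind needs only per-feature scale and depth, so it can run before any rigid transformation is estimated.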

Paper Nr: 118
Title:

A Comparative Evaluation of 3D Keypoint Detectors in a RGB-D Object Dataset

Authors:

Silvio Filipe and Luís A. Alexandre

Abstract: When processing 3D point cloud data, features must be extracted from a small set of points, usually called keypoints. This is done to avoid the computational complexity required to extract features from all points in a point cloud. There are many keypoint detectors, which suggests the need for a comparative evaluation. When the keypoint detectors are applied to 3D objects, the aim is to detect a few salient structures which can be used, instead of the whole object, for applications like object registration, retrieval and data simplification. In this paper, we describe and evaluate existing keypoint detectors in a publicly available point cloud library with real objects and perform a comparative evaluation on 3D point clouds. We evaluate the invariance of the 3D keypoint detectors according to rotations, scale changes and translations. The evaluation criteria used are the absolute and the relative repeatability rate. Using these criteria, we evaluate the robustness of the detectors with respect to changes of point-of-view. In our experiments, the method that achieved the best repeatability rate was the ISS3D method.
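The two criteria can be sketched in Python under a common definition: a reference keypoint counts as repeatable if some keypoint detected on the transformed cloud, once mapped back into the reference frame, lies within a distance `eps` of it. The absolute rate is the number of such keypoints and the relative rate is that number divided by the number of reference keypoints. The exact definition used in the paper is not stated in the abstract, so this definition, the tolerance and the function name `repeatability` are assumptions.

```python
import math

def repeatability(reference_kps, transformed_kps, eps=0.01):
    """Return (absolute, relative) repeatability of 3D keypoints.

    transformed_kps are assumed to be already mapped back into the
    reference coordinate frame.
    """
    repeatable = sum(
        1
        for p in reference_kps
        if any(math.dist(p, q) <= eps for q in transformed_kps)
    )
    relative = repeatable / len(reference_kps) if reference_kps else 0.0
    return repeatable, relative

# Hypothetical keypoints: two of the four reference points are re-detected.
reference = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
redetected = [(0, 0, 0.005), (1, 0, 0), (5, 5, 5)]
absolute, relative = repeatability(reference, redetected)
```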

Paper Nr: 125
Title:

2D Shape Matching based on B-spline Curves and Dynamic Programming

Authors:

Nacéra Laiche and Slimane Larabi

Abstract: In this paper, we propose an approach for two-dimensional shape representation and matching using B-spline modelling and Dynamic Programming (DP), which is robust with respect to affine transformations such as translation, rotation, scale change and some distortions. The boundary shape is first split into distinct parts based on the curvature. Curvature points are critical attributes for shape description, allowing the representation of the concave and convex parts of an object, which are obtained by the polygonal approximation algorithm in our approach. After that, each part is approximated by a normalized B-spline curve using some global features including the arc length, the centroid of the shape and moments. Finally, matching and retrieval of similar shapes are obtained using a similarity measure defined on their normalized curves with Dynamic Programming. Dynamic programming not only recovers the best matching, but also identifies the most similar boundary parts. The experimental results on some benchmark databases validate the proposed approach.
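The matching step can be illustrated in Python with a dynamic-time-warping-style recurrence over two sequences of boundary parts. This is a generic stand-in, not the similarity measure of the paper: each part is reduced to a single hypothetical scalar descriptor (for example a normalized arc length), and the function name `dp_match` is an assumption.

```python
def dp_match(parts_a, parts_b, distance):
    """Minimum cumulative cost of aligning two ordered sequences of parts."""
    n, m = len(parts_a), len(parts_b)
    inf = float("inf")
    # cost[i][j]: best cost of aligning the first i parts of A with the first j of B.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = distance(parts_a[i - 1], parts_b[j - 1])
            # Extend the cheapest of: one-to-one match, or reuse of a part on either side.
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    return cost[n][m]

def abs_diff(u, v):
    return abs(u - v)

# Hypothetical per-part descriptors of three shapes.
shape_a = [0.2, 0.5, 0.3]
shape_b = [0.2, 0.5, 0.5, 0.3]  # same parts, one of them split in two
shape_c = [0.2, 0.5, 0.4]
```

Backtracking through the `cost` table would additionally recover which parts were paired, which corresponds to identifying the most similar boundary parts.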

Paper Nr: 191
Title:

About the Impact of Pre-processing Tools on Segmentation Methods - Applied for Tree Leaves Extraction

Authors:

Manuel Grand-Brochier, Antoine Vacavant, Robin Strand, Guillaume Cerutti and Laure Tougne

Abstract: In this paper, we present a comparative study highlighting the improvements provided by pre-processing tools, such as input strokes or the use of a distance map, for segmentation approaches. We propose in particular to highlight new methods for calculating a distance map based on the prediction of changes in local color (published by G. Cerutti et al. in ReVeS Participation - Tree Species Classification Using Random Forests and Botanical Features. CLEF 2012). We study different methods using thresholding, clustering, or even active contours, tested on a tree leaf extraction problem. The observation criteria, such as the Dice index, SSIM or MAD, allow us to analyze the performance obtained by each approach and in particular that of the GAC method, which is better in this context.
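Of the observation criteria mentioned, the Dice index is the simplest to state: twice the size of the overlap between two binary masks, divided by the sum of their sizes. A minimal Python sketch follows; the function name `dice_index` and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_index(mask_a, mask_b):
    """Dice index 2|A ∩ B| / (|A| + |B|) of two binary masks of equal shape."""
    a = [bool(v) for row in mask_a for v in row]
    b = [bool(v) for row in mask_b for v in row]
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    # Two empty masks are treated as a perfect match (a convention, not a rule).
    return 2.0 * intersection / total if total else 1.0

# Hypothetical segmentation result and ground truth.
segmented = [[1, 1, 0], [0, 1, 0]]
ground_truth = [[1, 0, 0], [0, 1, 1]]
score = dice_index(segmented, ground_truth)
```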
Download
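
The Dice index named in the abstract above measures the overlap between a segmentation and its ground truth as twice the shared area divided by the sum of both areas. The following is only a minimal sketch of that standard formula; the mask layout (nested lists of 0/1) and the function name `dice_index` are assumptions made for illustration and are not taken from the paper.

```python
def dice_index(segmentation, ground_truth):
    """Dice index between two binary masks given as nested lists of 0/1."""
    seg = [bool(v) for row in segmentation for v in row]
    ref = [bool(v) for row in ground_truth for v in row]
    if len(seg) != len(ref):
        raise ValueError("masks must have the same number of pixels")
    # Pixels labelled as foreground in both masks.
    overlap = sum(1 for s, r in zip(seg, ref) if s and r)
    total = sum(seg) + sum(ref)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * overlap / total


# Hypothetical 2x3 masks: three predicted leaf pixels, two reference pixels.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 0]]
score = dice_index(predicted, reference)
print(score)
```

A value of 1 means the two masks coincide and 0 means they share no foreground pixel.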

Paper Nr: 201
Title:

A Visibility Graph based Shape Decomposition Technique

Authors:

Foteini Fotopoulou and Emmanouil Z. Psarakis

Abstract: In this paper, a new shape decomposition method named Visibility Shape Decomposition (VSD) is presented. Inspired by an idealization of the visibility matrix having a block diagonal form, we propose the definition of a neighborhood-based visibility graph and present a two-step iterative algorithm that transforms it into a block diagonal form, which can be used for a visually meaningful decomposition of the candidate shape. Although the proposed technique is applied to shapes of the MPEG7 database, it can be extended to 3D objects. The preliminary results we have obtained are promising.
Download

Paper Nr: 241
Title:

Performance Evaluation of Feature Point Descriptors in the Infrared Domain

Authors:

Pablo Ricaurte, Carmen Chilán, Cristhian A. Aguilera-Carrasco, Boris X. Vintimilla and Angel D. Sappa

Abstract: This paper presents a comparative evaluation of classical feature point descriptors when they are used in the long-wave infrared spectral band. Robustness to changes in rotation, scaling, blur, and additive noise is evaluated using a state-of-the-art framework. Statistical results using an outdoor image data set are presented together with a discussion about the differences with respect to the results obtained when images from the visible spectrum are considered.
Download

Paper Nr: 263
Title:

1-D Temporal Segments Analysis for Traffic Video Surveillance

Authors:

M. Brulin, C. Maillet and H. Nicolas

Abstract: Traffic video surveillance is an important topic for security purposes and for improving traffic flow management. Video surveillance can be used for different purposes such as counting vehicles or detecting their speed and behaviors. In this context, it is often important to be able to analyze the video in real time. The huge amount of data generated by the increasing number of cameras is an obstacle to reaching this goal. A solution consists in selecting in the video only the regions of interest, essentially the vehicles on the road areas. In this paper, we propose to extract significant segments of the regions of interest and to analyze them temporally to count vehicles and to define their behaviors. Experiments on real data show that precise vehicle counting and high recall and precision are obtained for vehicle behavior and traffic analysis.
Download

Paper Nr: 282
Title:

Face Verification using LBP Feature and Clustering

Authors:

Chenqi Wang, Kevin Lin and Yi-Ping Hung

Abstract: In this paper, we present a mechanism to extract certain special faces, called LBP-Faces, which are designed to represent different kinds of faces around the world, and utilize them as the basis to verify other faces. In particular, we show how our idea can be integrated with Local Binary Pattern (LBP) and improve its performance. Most previous LBP-variant approaches, whether they try to improve the coding mechanism or optimize the neighbourhood sizes, first divide a face into patch-level regions (e.g. 7×7 patches), concatenate the histograms calculated in each patch to derive a rather high-dimensional vector, and then apply PCA for dimension reduction. In contrast, our work uses the original LBP histograms, trying to retain major properties such as discriminability and invariance, but in much bigger component-level regions (we divide faces into 7 components). In each component, we cluster LBP descriptors, in the form of histograms, to derive N cluster centroids, which we define as LBP-Faces. Then, for any input face, we calculate its similarities with all these N LBP-Faces and use the similarities as final features to verify the face. In effect, we project the face image into a new feature space, the LBP-Faces space. The intuition behind this is that when we depict an unknown face, we tend to describe how similar the face’s eye or nose is to a known one. Results of our experiment on the Labeled Faces in the Wild (LFW) database show that our method outperforms LBP in face verification.
Download
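
The projection step described in the abstract above, where a face component is represented by its similarities to N cluster centroids, can be illustrated with a small sketch. The abstract does not state which similarity measure is used, so histogram intersection is assumed here purely for illustration; the names `normalize`, `histogram_intersection` and `similarity_features`, and the toy 4-bin histograms, are hypothetical.

```python
def normalize(histogram):
    """Scale a histogram so that its bins sum to 1 (left as-is if empty)."""
    total = sum(histogram)
    return [v / total for v in histogram] if total else list(histogram)


def histogram_intersection(h1, h2):
    """Similarity in [0, 1] for two normalized histograms (assumed measure)."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def similarity_features(histogram, centroids):
    """Describe one component histogram by its similarity to each centroid."""
    return [histogram_intersection(histogram, c) for c in centroids]


# Two hypothetical centroids for one face component, with 4 bins each.
centroids = [[0.5, 0.5, 0.0, 0.0],
             [0.0, 0.0, 0.5, 0.5]]
component_histogram = normalize([3, 1, 0, 0])
features = similarity_features(component_histogram, centroids)
print(features)
```

With N centroids per component and 7 components, concatenating the per-component vectors would give a feature of length 7N, far shorter than a concatenation of patch-level histograms.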

Paper Nr: 305
Title:

Local Texton Dissimilarity with Applications on Biomass Classification

Authors:

Radu Tudor Ionescu, Andreea-Lavinia Popescu, Dan Popescu and Marius Popescu

Abstract: Texture classification, texture synthesis, or similar tasks are an active topic in computer vision and pattern recognition. This paper aims to present a novel texture dissimilarity measure based on textons, namely the Local Texton Dissimilarity (LTD), inspired by (Dinu et al., 2012). Textons are represented as a set of features extracted from image patches. The proposed dissimilarity measure is applied to biomass type identification. A new data set of biomass texture images is provided by this work, which is available at http://biomass.herokuapp.com. Images are separated into three classes, each one representing a type of biomass. Biomass type identification and quality assessment are of great importance when the biomass industry needs to produce another energy product, such as biofuel. Two more experiments are conducted on popular texture classification data sets, namely Brodatz and UIUCTex. The proposed method benefits from a faster computational time compared to (Dinu et al., 2012) and a better accuracy when used for texture classification. The performance level of the machine learning methods based on LTD is comparable to the state-of-the-art methods.
Download

Paper Nr: 310
Title:

A Fast Leaf Recognition Algorithm based on SVM Classifier and High Dimensional Feature Vector

Authors:

Cecilia Di Ruberto and Lorenzo Putzu

Abstract: Plants are fundamental for human beings, so it is very important to catalog and preserve all plant species. Identifying an unknown plant species is not a simple task. Automatic image processing techniques based on leaf recognition can help to find the best features useful for plant representation and classification. Many methods in the literature use only a small and complex set of features, often extracted from the binary images or the boundary of the leaf. In this work we propose a leaf recognition method which uses a new feature set that incorporates shape, color and texture features. A total of 138 features are extracted and used for training an SVM model. The method has been tested on the Flavia dataset, showing excellent performance both in terms of accuracy, which often reaches 100%, and in terms of speed, taking less than a second to process and extract features from an image.
Download

Paper Nr: 315
Title:

Automatic Analysis of In-the-Wild Mobile Eye-tracking Experiments using Object, Face and Person Detection

Authors:

Stijn De Beugher, Geert Brône and Toon Goedemé

Abstract: In this paper we present a novel method for the automatic analysis of mobile eye-tracking data in natural environments. Mobile eye-trackers generate large amounts of data, making manual analysis very time-consuming. Available solutions, such as marker-based analysis, minimize the manual labour but require experimental control, making real-life experiments practically unfeasible. We present a novel method for processing this mobile eye-tracking data by applying object, face and person detection algorithms. Furthermore, we present a temporal smoothing technique to improve the detection rate, and we trained a new detection model for occluded person and face detections. This enables the analysis to be performed on the object level rather than the traditionally used coordinate level. We present speed and accuracy results of our novel detection scheme on challenging, large-scale real-life experiments.
Download

Paper Nr: 328
Title:

Contour Localization based on Matching Dense HexHoG Descriptors

Authors:

Yuan Liu and Paul Siebert

Abstract: The ability to detect and localize an object of interest from a captured image containing a cluttered background is an essential function for an autonomous robot operating in an unconstrained environment. In this paper, we present a novel approach to refining the pose estimate of an object and directly labelling its contours by dense local feature matching. We perform this task using a new image descriptor we have developed called the HexHoG. Our key novel contribution is the formulation of HexHoG descriptors comprising hierarchical groupings of rotationally invariant (S)HoG fields, sampled on a hexagonal grid. These HexHoG groups are centred on detected edges and therefore sample the image relatively densely. This formulation allows arbitrary levels of rotation-invariant HexHoG grouped descriptors to be implemented efficiently by recursion. We present the results of an evaluation based on the ALOI image dataset which demonstrates that our proposed approach can significantly improve an initial pose estimation based on image matching using standard SIFT descriptors. In addition, this investigation presents promising contour labelling results based on processing 2892 images derived from the 1000 image ALOI dataset.
Download

Paper Nr: 344
Title:

Effortless Scanning of 3D Object Models by Boundary Aligning and Stitching

Authors:

Susana Brandão, João P. Costeira and Manuela Veloso

Abstract: We contribute a novel algorithm for the digitization of complete 3D object models that requires little preparation effort from the user. Notably, the presented algorithm, Joint Alignment and Stitching of Non-Overlapping Meshes (JASNOM), completes 3D object models by aligning and stitching two 3D meshes by their boundaries and does not require any previous registration between them. JASNOM's only requirement is the lack of overlap between meshes, which is simple to achieve for most man-made objects. JASNOM takes advantage of the fact that both meshes can only be connected by their boundaries to reframe the alignment problem as a search for the best assignment between boundary vertices. To make the problem tractable, JASNOM reduces the search space considerably by imposing strong constraints on valid assignments that transform the original combinatorial problem into a discrete linear problem. By not requiring previous camera registration and by not depending on shape features, JASNOM's contributions range from quick modeling of 3D objects to hole filling in meshes.
Download

Paper Nr: 387
Title:

Audiovisual Data Fusion for Successive Speakers Tracking

Authors:

Quentin Labourey, Olivier Aycard, Denis Pellerin and Michele Rombaut

Abstract: In this paper, a human speaker tracking method on audio and video data is presented. It is applied to conversation tracking with a robot. Audiovisual data fusion is performed in a two-step process. Detection is first performed independently on each modality: face detection based on skin color on video data and sound source localization based on the time delay of arrival on audio data. The results of those detection processes are then fused using an adaptation of the Bayesian filter to detect the speaker. The robot is able to detect the face of the talking person and to detect a new speaker in a conversation.
Download
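
Sound source localization from the time delay of arrival, as mentioned in the abstract above, rests on estimating how many samples one microphone signal lags behind another. The sketch below does this with a brute-force cross-correlation; it is an illustrative assumption rather than the authors' method, and the function name `estimate_delay`, the toy signals and the 16 kHz sample rate are hypothetical.

```python
def estimate_delay(reference, delayed, max_lag):
    """Return the lag in samples at which `delayed` best matches `reference`.

    The lag maximizing the cross-correlation is chosen; a positive value
    means `delayed` arrives later than `reference`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(delayed):
                score += value * delayed[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# A short pulse reaching the second microphone three samples later.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
lag = estimate_delay(mic_a, mic_b, max_lag=5)

sample_rate = 16000  # Hz, assumed for the example
delay_seconds = lag / sample_rate
print(lag, delay_seconds)
```

Given the microphone spacing and the speed of sound, such a delay constrains the direction of the source, which is the quantity a fusion stage would combine with the face detections.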

Paper Nr: 388
Title:

Motion Characterization of a Dynamic Scene

Authors:

Arun Balajee Vasudevan, Srikanth Muralidharan, Shiva Pratheek Chintapalli and Shanmuganathan Raman

Abstract: Given a video, there are many algorithms to separate static and dynamic objects present in the scene. The proposed work is focused on classifying the dynamic objects further as having either repetitive or non-repetitive motion. In this work, we propose a novel approach to achieve this challenging task by processing the optical flow fields corresponding to the video frames of a dynamic natural scene. We design an unsupervised learning algorithm which uses functions of the flow vectors to design the feature vector. The proposed algorithm is shown to be effective in classifying a scene into static, repetitive, and non-repetitive regions. The proposed approach finds significance in various vision and computational photography tasks such as video editing, video synopsis, and motion magnification.
Download

Paper Nr: 399
Title:

Graph Cut and Image Segmentation using Mean Cut by Means of an Agglomerative Algorithm

Authors:

Elaine Ayumi Chiba, Marco Antonio Garcia Carvalho and André Luís Costa

Abstract: Graph partitioning, or graph cut, has been studied by several authors as a tool for image segmentation. It refers to partitioning a graph into several subgraphs such that each of them represents a meaningful object of interest in the image. In this work we propose a hierarchical agglomerative clustering algorithm driven by the cut and mean cut criteria. Some preliminary experiments were performed using the benchmark of Berkeley BSDS500 with promising results.
Download
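
For readers unfamiliar with the two criteria named in the abstract above, the sketch below evaluates them for a given bipartition of a small weighted graph. It assumes the commonly stated definitions, where the cut is the total weight of edges crossing the partition and the mean cut is that total divided by the number of crossing edges; the paper's exact formulation may differ, and the function name and toy graph are hypothetical.

```python
def cut_and_mean_cut(edges, part_a):
    """Return (cut, mean cut) for the bipartition (part_a, remaining nodes).

    edges:  dict mapping an undirected edge (u, v) to its weight.
    part_a: set of nodes forming one side of the partition.
    """
    # An edge crosses the partition when exactly one endpoint is in part_a.
    crossing = [w for (u, v), w in edges.items()
                if (u in part_a) != (v in part_a)]
    cut = sum(crossing)
    mean_cut = cut / len(crossing) if crossing else 0.0
    return cut, mean_cut


# Hypothetical 4-node region graph with similarity weights on the edges.
edges = {("a", "b"): 0.875,
         ("b", "c"): 0.25,
         ("a", "c"): 0.5,
         ("c", "d"): 0.75}
cut, mean_cut = cut_and_mean_cut(edges, {"a", "b"})
print(cut, mean_cut)
```

An agglomerative scheme would start from many small regions and repeatedly merge the pair selected by such a criterion, producing a hierarchy of segmentations.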

Paper Nr: 8
Title:

Image Analysis through Shifted Orthogonal Polynomial Moments

Authors:

Rajarshi Biswas and Sambhunath Biswas

Abstract: Image analysis is significant from the standpoint of image description. A well described image has merits in different research areas, e.g., image compression, machine learning, computer vision etc. This paper is an attempt to analyze graylevel images through shifted orthogonal polynomial moments, computed on a discrete disc. This removes the difficulty of computing the moments on an analytic disc. Excellent rotational invariance as well as illumination invariance is observed.
Download

Paper Nr: 12
Title:

Contour based Split and Merge Segmentation and Pre-classification of Zooplankton in Very Large Images

Authors:

Enrico Gutzeit, Christian Scheel, Tim Dolereit and Matthias Rust

Abstract: Zooplankton is an important component in the water ecosystem and food chain. To understand the influence of zooplankton on the ecosystem, data collection is necessary. In research, the automatic image-based recognition of zooplankton is of growing interest. Several systems have been developed for zooplankton recognition on low-resolution images. For large images, approaches are rare. Images of this size easily exceed the main memory of standard computers. Our novel automatic segmentation approach is able to handle these large images. We developed a contour-based Split & Merge approach for segmentation and, to reduce the non-zooplankton segments, combine it with a pre-classification of the segments with reference to their shape. The latter includes a detection of quasi-round segments and a novel one for thin segments. Experimental results on several huge images show that we are able to handle these huge images satisfactorily.
Download

Paper Nr: 13
Title:

Experimental Comparison of Vasculature Segmentation Methods

Authors:

Yuchun Ding and Li Bai

Abstract: Vessel segmentation algorithms play a very important role in vascular disease diagnosis and prediction. Current vessel segmentation research uses mostly images of large vessels, which are relatively easy to extract, but segmenting microvasculature is more challenging and very important for analysing vascular diseases such as Alzheimer’s disease. The aim of this paper is to report experimental results of several common vessel image segmentation methods. The retinal vessel image database DRIVE is used for 2D experiments and a micro-CT image is used for 3D experiments.
Download

Paper Nr: 20
Title:

Image Registration based on Edge Dominant Corners

Authors:

Rabih Al Nachar, Elie Inaty, Patrick J. Bonnin and Yasser Alayli

Abstract: This paper presents a new algorithm for image registration working on an image sequence using dominant corners located on the image's edges, under the assumption that the deformation between successive images can be modeled by an affine transformation. To guarantee this assumption, the time interval between acquired images should be small, like the time interval in a video sequence. In the edge image, dominant corners are extracted per linked contour and form a polygon that best approximates the current linked contour. The number of these dominant corners per contour is derived automatically given an approximation error. These dominant corners are shown to be very repeatable under affine transformation. Then, a primitive is constructed from four dominant corners. The invariant measure that characterizes each primitive is the ratio of the areas of two triangles constructed from two triplets selected from these four corners.
Download
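
The invariant used in the abstract above relies on a standard fact: an affine map scales every area by the same factor (the absolute determinant of its linear part), so the ratio of two triangle areas is unchanged. The sketch below checks this numerically. The choice of triplets (corners 1-2-3 and 1-3-4), the function names and the sample coordinates are assumptions for illustration, since the abstract does not say how the triplets are selected.

```python
def triangle_area(p, q, r):
    """Area of the triangle pqr from the 2D cross product."""
    return abs((q[0] - p[0]) * (r[1] - p[1])
               - (r[0] - p[0]) * (q[1] - p[1])) / 2.0


def area_ratio(corners):
    """Ratio of the areas of two triangles built from four corners."""
    a, b, c, d = corners
    return triangle_area(a, b, c) / triangle_area(a, c, d)


def apply_affine(point, matrix, offset):
    """Apply x' = M x + t to a 2D point."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + offset[0],
            matrix[1][0] * x + matrix[1][1] * y + offset[1])


# Four hypothetical dominant corners forming one primitive.
corners = [(0, 0), (4, 0), (5, 3), (1, 2)]
# An arbitrary non-singular affine map (shear, scale and translation).
matrix, offset = [[2.0, 1.0], [0.5, 3.0]], (10.0, -5.0)
moved = [apply_affine(p, matrix, offset) for p in corners]

ratio_before = area_ratio(corners)
ratio_after = area_ratio(moved)
print(ratio_before, ratio_after)
```

Both ratios agree up to floating-point rounding, which is what allows primitives to be matched between successive frames without knowing the transformation.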

Paper Nr: 35
Title:

A Recursive Approach For Multiclass Support Vector Machine - Application to Automatic Classification of Endomicroscopic Videos

Authors:

Alexis Zubiolo, Grégoire Malandain, Barbara André and Éric Debreuve

Abstract: The two classical steps of image or video classification are: image signature extraction and assignment of a class based on this image signature. The class assignment rule can be learned from a training set composed of sample images manually classified by experts. This is known as supervised statistical learning. The well-known Support Vector Machine (SVM) learning method was designed for two classes. Among the proposed extensions to multiclass (three classes or more), the one-versus-one and one-versus-all approaches are the most popular ones. This work presents an alternative approach to extending the original SVM method to multiclass. A tree of SVMs is built using a recursive learning strategy, achieving a linear worst-case complexity in terms of number of classes for classification. During learning, at each node of the tree, a bi-partition of the current set of classes is determined to optimally separate the current classification problem into two sub-problems. Rather than relying on an exhaustive search among all possible subsets of classes, the partition is obtained by building a graph representing the current problem and looking for a minimum cut of it. The proposed method is applied to classification of endomicroscopic videos and compared to classical multiclass approaches.
Download

Paper Nr: 161
Title:

Energy based Descriptors and their Application for Car Detection

Authors:

Radovan Fusek, Eduard Sojka, Karel Mozdřeň and Milan Šurkala

Abstract: In this paper, we propose a novel technique for object description. The proposed method is based on investigation of energy distribution (in the image) that describes the properties of objects. The energy distribution is encoded into a vector of features and the vector is then used as an input for the SVM classifier. Generally, the technique can be used for detecting arbitrary objects. In this paper, however, we demonstrate the robustness of the proposed descriptors for solving the problem of car detection. Compared with the state-of-the-art descriptors (e.g. HOG, Haar-like features), the proposed approach achieved better results, especially from the viewpoint of dimensionality of the feature vector; the proposed approach is able to successfully describe the objects of interest with a relatively small set of numbers without the use of methods for the reduction of feature vector.
Download

Paper Nr: 173
Title:

Efficient Inference of Spatial Hierarchical Models

Authors:

Jan Mačák and Ondřej Drbohlav

Abstract: The long-term goal of artificial intelligence and computer vision is to be able to build models of the world automatically and to use them for interpretation of new situations. It is natural that such models are efficiently organized in a hierarchical manner; a model is built from sub-models, these sub-models are in turn built from other models, and so on. These building blocks are usually shareable; different objects may consist of the same components. In this paper, we describe a hierarchical probabilistic model for the visual domain and propose a method for its efficient inference based on data partitioning and dynamic programming. We show the behaviour of the model, which is in this case made manually, and of the inference method on a controlled yet challenging dataset consisting of rotated, scaled and occluded letters. The experiments show that the proposed model is robust to all above-mentioned aspects.
Download

Paper Nr: 210
Title:

Saliency Detection in Images using Graph-based Rarity, Spatial Compactness and Background Prior

Authors:

Sudeshna Roy and Sukhendu Das

Abstract: Bottom-up saliency detection techniques extract salient regions in an image while free-viewing the image. We have approached the problem with three different low-level cues: graph-based rarity, spatial compactness and background prior. First, the image is broken into similarly colored patches, called superpixels. To measure rarity, we represent the image as a graph with superpixels as nodes and exponential color difference as the edge weights between the nodes. Eigenvectors of the Laplacian of the graph are then used, similar to spectral clustering (Ng et al., 2001). Each superpixel is associated with a descriptor formed from these eigenvectors, and the rarity or uniqueness of the superpixels is found using these descriptors. Spatial compactness is computed by combining disparity in color and spatial distance between superpixels. The concept of background prior is implemented by finding the weighted Mahalanobis distance of the superpixels from the statistically modeled mean background color. These cues in combination give the proposed saliency map. Experimental results demonstrate that our method outperforms many of the recent state-of-the-art methods both in terms of accuracy and speed.
Download

Paper Nr: 223
Title:

Liquid Crystal Image Analysis by Image Descriptors

Authors:

Guilherme Enoc Egas de Carvalho, Franklin César Flores, Fernando Carlos Messias Freire and Anderson Reginaldo Sampaio

Abstract: Liquid crystals are substances with high technological impact; new substances have been discovered and the properties of these materials need to be examined. When viewed under a microscope using a polarized light source, different liquid crystal phases will appear to have distinct textures and colors. Digital image processing and computer vision are beginning to be used in the analysis of these materials. The goal of this work is to propose methods, based on visual descriptors, which are able to identify phase transitions and classify phases in liquid crystals from a sequence of images.
Download

Paper Nr: 232
Title:

Automatic Detection of MEO Satellite Streaks from Single Long Exposure Astronomic Images

Authors:

Anca Ciurte and Radu Danescu

Abstract: Nowadays, there is an increased interest in achieving an accurate surveillance of the sky, since the number of objects in Earth’s orbit (active satellites and debris) is continuously increasing. The satellites constantly need to be supervised in order to notice their deviations from their trajectories and update their coordinates. This paper presents a new method for satellite detection in 2D astronomic images acquired with a cheap, easy to set up optical surveillance system. The proposed method uses the Radon Transform to identify satellite streaks in images, followed by a set of decision rules to decide whether the streak is a satellite or not. The method was tested on multiple sequences of astronomic images, and was found to have a very high detection rate, along with a very low false positive rate.
Download

Paper Nr: 253
Title:

Hand Pose Recognition by using Masked Zernike Moments

Authors:

JungSoo Park, Hyo-Rim Choi, JunYoung Kim and TaeYong Kim

Abstract: In this paper we present a novel way of applying Zernike moments for image matching. Zernike moments are obtained by projecting image information under a circumscribed circle onto the Zernike basis functions. However, the problem is that the power of discrimination may be reduced because hand images include a lot of overlapped information due to their shape characteristics. On the other hand, in pose discrimination, shape information of hands excluding the overlapped area can increase the power of discrimination. In order to solve the overlapped information problem, we present a way of applying subtraction masks. Internal mask R1 eliminates overlapped information in hand images, while external mask R2 weighs outstanding features of hand images. Mask R3 combines the results from the image masked by R1 and the image masked by R2. The moments obtained by the R3 mask increase the accuracy of discrimination for hand poses, which is shown in experiments by comparison with conventional methods.
Download

Paper Nr: 267
Title:

Analysis of Widely-used Descriptors for Finger-vein Recognition

Authors:

Fariba Yousefi, Erdal Sivri, Ozgur Kaya, Selma Suloglu and Sinan Kalkan

Abstract: For finger-vein recognition, many successful methods, such as Line Tracking (LT), Maximum Curvature (MC) and Wide Line Detector (WL), have been proposed. Among these, LT has a very slow matching and feature-extraction phase, and LT, MC and WL are translation and rotation dependent. Moreover, as we show in the paper, they are affected by noise. To overcome these drawbacks, we propose using popular feature descriptors widely used for several Computer Vision or Pattern Recognition (CVPR) problems in the literature. The CVPR descriptors we test include Histogram of Oriented Gradients (HOG), Fourier Descriptors (FD), Zernike Moments (ZM), Local Binary Patterns (LBP) and Global Binary Patterns (GBP), which have not been applied to the finger-vein recognition problem before. We compare these descriptors against LT, MC, and WL and evaluate their running times, performance and resilience against noise, rotation and translation. We report that the accuracies of the LT and WL methods are comparable to each other, that WL gives the best accuracy, and that LT is the slowest method. Our results indicate that WL can be used together with ZM and GBP in case of rotation and noise, respectively.

Paper Nr: 291
Title:

Non-rigid Surface Registration using Cover Tree based Clustering and Nearest Neighbor Search

Authors:

Manal H. Alassaf, Yeny Yim and James K. Hahn

Abstract: We propose a novel non-rigid registration method that computes the correspondences of two deformable surfaces using the cover tree. The aim is to find the correct correspondences without landmark selection and to reduce the computational complexity. The source surface S is initially aligned to the target surface T to generate a cover tree from the densely distributed surface points. The cover tree is constructed by taking into account the positions and normal vectors of the points and used for hierarchical clustering and nearest neighbor search. The cover tree based clustering divides the two surfaces into several clusters based on the geometric features, and each cluster on the source surface is transformed to its corresponding cluster on the target. The nearest neighbor search from the cover tree reduces the search space for correspondence computation, and the source surface is deformed to the target by optimizing the point pairs. The correct correspondence of a given source point is determined by choosing one target point with the best correspondence measure from the k nearest neighbors. The proposed energy function with Jacobian penalty allows deforming the surface accurately and with less deformation folding.
Download

Paper Nr: 301
Title:

Unsupervised Segmentation of Hyperspectral Images based on Dominant Edges

Authors:

Sangwook Lee, Sanghun Lee and Chulhee Lee

Abstract: In this paper, we propose a new unsupervised segmentation method for hyperspectral images based on dominant edge information. In the proposed algorithm, we first apply principal component analysis and select the dominant eigenimages. Then edge operators and the histogram equalizer are applied to the selected eigenimages, which produces edge images. By combining these edge images, we obtain a binary edge image. Morphological operations are then applied to this binary edge image to remove erroneous edges. Experimental results show that the proposed algorithm produced satisfactory results without any user input.
Download

Paper Nr: 311
Title:

Statistical Features for Image Retrieval - A Quantitative Comparison

Authors:

Cecilia Di Ruberto and Giuseppe Fodde

Abstract: In this paper we present a comparison between various statistical descriptors and analyze their effectiveness in classifying textural images. The chosen statistical descriptors have been proposed by Tamura, Battiato and Haralick. In this work we also test a combination of the three descriptors for texture analysis. The databases used in our study are the well-known Brodatz’s album and DDSM (Heath et al., 1998). The computed features are classified using the Naive Bayes, the RBF, the KNN, the Random Forest and Random Tree models. The results obtained from this study show that we can achieve a high classification accuracy if the descriptors are used all together.
Download

Paper Nr: 312
Title:

Event Clustering of Lifelog Image Sequence using Emotional and Image Similarity Features

Authors:

Photchara Ratsamee, Yasushi Mae, Masaru Kojima, Mitsuhiro Horade, Kazuto Kamiyama and Tatsuo Arai

Abstract: Lifelog image clustering is the process of grouping images into events based on image similarities. Until now, groups of images with low variance can be easily clustered, but clustering images with high variance is still a problem. In this paper, we tackle the problem of high variance, and present a methodology to accurately cluster images into their corresponding events. We introduce a new approach based on rank-order distance techniques using a combination of image similarity and an emotional feature measured from a biosensor. We demonstrate that emotional features along with rank-order distance based clustering can be used to cluster groups of images with low, medium, and high variance. Experimental evidence suggests that, compared to the average clustering precision rate (65.2%) of approaches that only consider image visual features, our technique achieves a higher precision rate (85.5%) when emotional features are integrated.
Download

Paper Nr: 319
Title:

High Definition Visual Attention based Video Summarization

Authors:

Yiming Qian and Matthew Kyan

Abstract: A High Definition visual attention based video summarization algorithm is proposed to extract feature frames and create a video summary. It uses a colour histogram shot detection algorithm to separate the video into shots, then applies a novel high definition visual attention algorithm to construct a saliency map for each frame. A multivariate mutual information algorithm is applied to select a feature frame to represent each shot. Finally, those feature frames are processed by a self-organizing map to remove the redundant frames. The algorithm was assessed against manual key frame summaries presented with tested datasets from www.open-video.org. Of the frames selected by the algorithm, 27.8% to 68.1% were in agreement with the manual frame summaries, depending on the category and length of the video.
Download
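
The first stage named in the abstract above, colour histogram shot detection, is commonly implemented by comparing the histograms of consecutive frames and declaring a shot boundary where they differ strongly. The sketch below follows that common scheme under stated assumptions: frames are already reduced to normalized colour histograms, the L1 distance is used, and the threshold of 0.6 is arbitrary. None of these choices are taken from the paper.

```python
def shot_boundaries(histograms, threshold=0.6):
    """Return indices of frames that start a new shot.

    histograms: one normalized colour histogram (list of floats) per frame.
    A boundary is reported at frame i when the L1 distance between the
    histograms of frames i-1 and i exceeds `threshold`.
    """
    boundaries = []
    for i in range(1, len(histograms)):
        distance = sum(abs(a - b)
                       for a, b in zip(histograms[i - 1], histograms[i]))
        if distance > threshold:
            boundaries.append(i)
    return boundaries


# Four hypothetical frames with 3-bin histograms: a small lighting change
# between frames 0 and 1, then an abrupt change of content at frame 2.
frames = [[0.5, 0.5, 0.0],
          [0.5, 0.25, 0.25],
          [0.0, 0.25, 0.75],
          [0.0, 0.25, 0.75]]
cuts = shot_boundaries(frames)
print(cuts)
```

Each resulting shot would then be passed on to the later stages, which pick one representative frame per shot.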

Paper Nr: 322
Title:

Optimization of Image Interpolation based on Nearest Neighbour Algorithm

Authors:

Olivier Rukundo and B. T. Maharaj

Abstract: This paper proposes an optimization scheme for image interpolation algorithms, in particular the bilinear algorithm. The only original point is a decision step that checks whether the four neighbouring pixels have the same value; if so, the conventional bilinear interpolation is replaced by a nearest neighbour interpolation. The experimental results corroborated the efficiency of the proposed scheme over conventional bilinear interpolation and showed improvements in terms of speed and quality, especially when images with less grainy textures were interpolated.
Download
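
The decision step described in the abstract above can be sketched as a single branch inside an otherwise ordinary bilinear sampler. This is an illustrative reading of the abstract, not the authors' implementation; the function name `interpolate`, the list-of-rows image layout and the coordinate convention (x is the column, y is the row) are assumptions.

```python
def interpolate(image, x, y):
    """Sample a grayscale image (list of rows) at real-valued column x, row y.

    Assumes 0 <= x <= width - 1 and 0 <= y <= height - 1.
    """
    height, width = len(image), len(image[0])
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0

    p00, p10 = image[y0][x0], image[y0][x1]
    p01, p11 = image[y1][x0], image[y1][x1]

    # Decision step: a flat neighbourhood needs no weighted average,
    # so the nearest of the four pixels is returned directly.
    if p00 == p10 == p01 == p11:
        nearest_x = x0 if fx < 0.5 else x1
        nearest_y = y0 if fy < 0.5 else y1
        return image[nearest_y][nearest_x]

    # Conventional bilinear interpolation otherwise.
    top = p00 * (1 - fx) + p10 * fx
    bottom = p01 * (1 - fx) + p11 * fx
    return top * (1 - fy) + bottom * fy


image = [[10, 10, 20],
         [10, 10, 40]]
flat_value = interpolate(image, 0.5, 0.5)    # four equal neighbours
mixed_value = interpolate(image, 1.5, 0.5)   # neighbours 10, 20, 10, 40
print(flat_value, mixed_value)
```

In the flat case both methods return the same value, so the branch only saves the multiplications, which is consistent with the larger gains the abstract reports for images with smooth textures.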

Paper Nr: 326
Title:

Evaluation of Color Spaces for Robust Image Segmentation

Authors:

Alexander Jungmann, Jan Jatzkowski and Bernd Kleinjohann

Abstract: In this paper, we evaluate the robustness of our color-based segmentation approach in combination with different color spaces, namely RGB, L*a*b*, HSV, and log-chromaticity (LCCS). For this purpose, we describe our deterministic segmentation algorithm, including its gradual transformation of pixel-precise image data into a less error-prone and therefore more robust statistical representation in terms of moments. To investigate the robustness of a specific segmentation setting, we introduce our evaluation framework that works directly on the statistical representation. It is based on two different types of robustness measures, namely relative and absolute robustness. While relative robustness measures the stability of segmentation results over time, absolute robustness measures stability regarding varying illumination by comparing results with ground truth data. The significance of these robustness measures is shown by evaluating our segmentation approach with different color spaces. For the evaluation process, an artificial scene was chosen as representative of application scenarios based on artificial landmarks.
Download

Paper Nr: 352
Title:

An Investigation on Local Wrinkle-based Extractor of Age Estimation

Authors:

Choon-Ching Ng, Moi Hoon Yap, Nicholas Costen and Baihua Li

Abstract: Research related to age estimation using face images has become increasingly important due to its potential use in various applications such as age group estimation in advertising and age estimation in access control. In contrast to other facial variations, age variation has several unique characteristics which make it a challenging task. As we age, the most pronounced facial changes are the appearance of wrinkles (skin creases), which is the focus of ageing research in cosmetic and nutrition studies. This paper investigates an algorithm for wrinkle detection and the use of wrinkle data as an age predictor. A novel method for detecting and classifying facial age groups based on a local wrinkle-based extractor (LOWEX) is introduced. First, each face image is divided into several convex regions representing wrinkle distribution areas. Secondly, these areas are analysed using a Canny filter and then concatenated into an enhanced feature vector. Finally, the face is classified into an age group using a supervised learning algorithm. The experimental results show that the accuracy of the proposed method is 80% on the FG-NET dataset. This investigation shows that local wrinkle-based features have great potential in age estimation. We conclude that wrinkles can produce a prominent ageing descriptor and identify some future research challenges.

Paper Nr: 361
Title:

Uncertainty Fusion based Object Recognition and Tracking in Maritime Scenes using Spatiotemporal Active Contours

Authors:

Ikhlef Bechar, Frederic Bouchara, Thibault Lelore, Vincente Guis and Michel Grimaldi

Abstract: This article addresses the problem of near real time video analysis of a maritime scene using a (moving) airborne RGB video camera with the goal of detecting and eventually recognizing a target maritime vessel. This is a very challenging problem mainly due to the high level of uncertainty of a maritime scene including a dynamic and noisy background, camera’s and target’s motions, and broad variability of background’s versus target’s appearances. We propose an approach which attempts to combine several types of spatiotemporal uncertainty in a single probabilistic framework. This makes it possible to obtain a likelihood ratio with respect to any possible spatiotemporal configuration of the 2D+T video volume. Using the MAP estimation criterion, such a problem can be recast as an energy minimization problem that we solve efficiently using a spatiotemporal active contour approach. We demonstrate the feasibility of the proposed approach using real maritime videos.
Download

Paper Nr: 369
Title:

General Purpose Segmentation for Microorganisms in Microscopy Images

Authors:

S. N. Jensen, R. Irani, T. B. Moeslund and Christian Rankl

Abstract: In this paper, we propose an approach for achieving generalized segmentation of microorganisms in microscopy images. It employs a pixel-wise classification strategy based on local features. Multilayer perceptrons are utilized for classification of the local features and are trained for each specific segmentation problem using supervised learning. This approach was tested on five different segmentation problems in bright field, differential interference contrast, fluorescence and laser confocal scanning microscopy. In all instances, good results were achieved, with the segmentation quality scoring a Dice coefficient of 0.831 or higher.
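
For readers unfamiliar with the quality measure quoted here, the Dice coefficient of a predicted mask A against a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for binary masks stored as nested lists; the function name and the convention of returning 1.0 for two empty masks are assumptions, not details taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice overlap between two binary masks given as lists of rows."""
    pred_flat = [bool(v) for row in pred for v in row]
    truth_flat = [bool(v) for row in truth for v in row]
    if len(pred_flat) != len(truth_flat):
        raise ValueError("masks must have the same number of pixels")

    # Pixels labelled foreground in both masks.
    intersection = sum(p and t for p, t in zip(pred_flat, truth_flat))
    total = sum(pred_flat) + sum(truth_flat)

    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no foreground pixel.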
Download

Area 3 - Image and Video Understanding

Full Papers
Paper Nr: 24
Title:

Can 3D Shape of the Face Reveal your Age?

Authors:

Baiqiang Xia, Boulbaba Ben Amor, Mohamed Daoudi and Hassen Drira

Abstract: Age reflects the continuous accumulation of durable effects from the past since birth. Human faces deform with time non-inversely and thus contain their aging information. In addition to its richness in anatomical information, the 3D shape of faces could have the advantage of being less dependent on pose and independent of illumination, although this has not been noted in the literature. Thus, in this work we investigate the age estimation problem from the 3D shape of the face. With several descriptions grounded in Riemannian shape analysis of facial curves, we first extract features from the ideas of face Averageness, face Symmetry, and its shape variations with Spatial and Gradient descriptors. Then, using Random Forest-based Regression, experiments are carried out following the Leaving-One-Person-Out (LOPO) protocol on the FRGCv2 dataset. The proposed approach performs with a Mean Absolute Error (MAE) of 3.29 years using a gender-general test protocol. Finally, with the gender-specific experiments, which first separate the 3D scans into Female and Male subsets, then train and test on each gender-specific subset in LOPO fashion, we improve the MAE to 3.15 years, which confirms the idea that the aging effect differs with gender.
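
The evaluation protocol named in this abstract (leave one person out, then report the mean absolute error) can be sketched as follows. This is only an outline of the protocol: the function names are hypothetical, and a trivial mean-age predictor stands in for the Random Forest regressor so that the example stays self-contained.

```python
def lopo_splits(person_ids):
    """Yield (held_out_person, train_indices, test_indices) per person."""
    for person in sorted(set(person_ids)):
        train = [i for i, p in enumerate(person_ids) if p != person]
        test = [i for i, p in enumerate(person_ids) if p == person]
        yield person, train, test


def mean_absolute_error(true_ages, predicted_ages):
    """Average absolute difference between true and predicted ages."""
    errors = [abs(t - p) for t, p in zip(true_ages, predicted_ages)]
    return sum(errors) / len(errors)


def lopo_mae(features, ages, person_ids, fit_predict):
    """Run LOPO with any fit_predict(train_x, train_y, test_x) callable."""
    predictions = [None] * len(ages)
    for _, train, test in lopo_splits(person_ids):
        preds = fit_predict([features[i] for i in train],
                            [ages[i] for i in train],
                            [features[i] for i in test])
        for i, p in zip(test, preds):
            predictions[i] = p
    return mean_absolute_error(ages, predictions)


def mean_age_baseline(train_x, train_y, test_x):
    """Placeholder regressor (an assumption): always predict the mean age."""
    mean_age = sum(train_y) / len(train_y)
    return [mean_age] * len(test_x)
```

Grouping by person matters because several scans of the same subject would otherwise appear in both the training and the test set, which would make the reported error optimistic.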
Download

Paper Nr: 36
Title:

Exploiting Scene Cues for Dropped Object Detection

Authors:

Adolfo Lopez-Mendez, Florent Monay and Jean-Marc Odobez

Abstract: This paper presents a method for the automated detection of dropped objects in surveillance scenarios, which is a very important task for abandoned object detection. Our method works in single views and exploits prior information of the scene, such as geometry or the fact that a number of false alarms are caused by known objects, such as humans. The proposed approach builds dropped object candidates by analyzing blobs obtained with a multi-layer background subtraction approach. The created dropped object candidates are then characterized both by appearance and by temporal aspects such as the estimated drop time. Next, we incorporate prior knowledge about the possible sizes and positions of dropped objects through an efficient filtering approach. Finally, the output of a human detector is exploited in order to filter out static objects that are likely to be humans that remain still. Experimental results on the publicly available PETS2006 datasets and on several long sequences recorded in metro stations show the effectiveness of the proposed approach. Furthermore, our approach can operate in real-time.
Download

Paper Nr: 38
Title:

Egocentric Activity Recognition using Histograms of Oriented Pairwise Relations

Authors:

Ardhendu Behera, Matthew Chapman, Anthony G. Cohn and David C. Hogg

Abstract: This paper presents an approach for recognising activities using video from an egocentric (first-person view) setup. Our approach infers activity from the interactions of objects and hands. In contrast to previous approaches to activity recognition, we do not require an intermediate step such as object detection, pose estimation, etc. Recently, it has been shown that modelling the spatial distribution of visual words corresponding to local features further improves the performance of activity recognition using the bag-of-visual words representation. Influenced and inspired by this philosophy, our method is based on global spatio-temporal relationships between visual words. We consider the interaction between visual words by encoding their spatial distances, orientations and alignments. These interactions are encoded using a histogram that we name the Histogram of Oriented Pairwise Relations (HOPR). The proposed approach is robust to occlusion and background variation and is evaluated on two challenging egocentric activity datasets consisting of manipulative tasks. We introduce a novel representation of activities based on interactions of local features and experimentally demonstrate its superior performance in comparison to standard activity representations such as bag-of-visual words.
Download

Paper Nr: 80
Title:

Revisiting Pose Estimation with Foreshortening Compensation and Color Information

Authors:

Achint Setia, Anoop R. Katti and Anurag Mittal

Abstract: This paper addresses the problem of upper body pose estimation. The task is to detect and estimate 2D human configuration in static images for six parts: head, torso, and left-right upper and lower arms. The common approach to solve this has been the Pictorial Structure method (Felzenszwalb and Huttenlocher, 2005). We present this as a graphical model inference problem and use the loopy belief propagation algorithm for inference. When a human appears in a fronto-parallel plane, fixed size part detectors are sufficient and give reliable detection. But when parts like lower and upper arms move out of the plane, we observe foreshortening and the part detectors become erroneous. We propose an approach that compensates for foreshortening in the upper and lower arms, and effectively prunes the search state space of each part. Additionally, we introduce two extra pairwise constraints to exploit the color similarity information between parts during inference to get better localization of the upper and lower arms. Finally, we present experiments and results on two challenging datasets (Buffy and ETHZ Pascal), showing improvements in lower-arm accuracy and comparable results for other parts.
Download

Paper Nr: 88
Title:

Action Categorization based on Arm Pose Modeling

Authors:

Chongguo Li and Nelson H. C. Yung

Abstract: This paper proposes a novel method to categorize human action based on arm pose modeling. Traditionally, human action categorization relies heavily on features extracted from video or images. In this research, we exploit the relationship between action categorization and arm pose modeling, which can be visualized in a graphical model. Given visual observations, both states can be estimated by maximum a posteriori (MAP) in that arm poses are first estimated under the hypothesis of action category by dynamic programming, and then the action category hypothesis is validated by a soft-max model based on the estimated arm poses. The prior distribution for every action is estimated by a semi-parametric estimator in advance, and pixel-based dense features including LBP, SIFT, colour-SIFT, and texton are utilized to enhance the likelihood computation by the joint Adaboosting algorithm. The proposed method has been evaluated on videos of walking, waving and jogging from the HumanEva-I dataset. It is found to have arm pose modeling performance better than the method of mixtures of parts, and an action categorization success rate of 96.69%.
Download

Paper Nr: 126
Title:

Learning Semantic Attributes via a Common Latent Space

Authors:

Ziad Al-Halah, Tobias Gehrig and Rainer Stiefelhagen

Abstract: Semantic attributes represent adequate knowledge that can be easily transferred to other domains where information and training samples are lacking. However, in the classical object recognition case, where training data is abundant, attribute-based recognition usually results in poor performance compared to methods that use image features directly. We introduce a generic framework that boosts the performance of semantic attributes considerably in traditional classification and knowledge transfer tasks, such as zero-shot learning. It incorporates the discriminative power of the visual features and the semantic meaning of the attributes by learning a common latent space that joins both spaces. We also specifically account for the presence of attribute correlations in the source dataset to generalize more efficiently across domains. Our evaluation of the proposed approach on standard public datasets shows that it is not only simple and computationally efficient but also performs remarkably better than the common direct attribute model.
Download

Paper Nr: 139
Title:

Absolute Spatial Context-aware Visual Feature Descriptors for Outdoor Handheld Camera Localization - Overcoming Visual Repetitiveness in Urban Environments

Authors:

Daniel Kurz, Peter Georg Meier, Alexander Plopski and Gudrun Klinker

Abstract: We present a framework that enables 6DoF camera localization in outdoor environments by providing visual feature descriptors with an Absolute Spatial Context (ASPAC). These descriptors combine visual information from the image patch around a feature with spatial information, based on a model of the environment and the readings of sensors attached to the camera, such as GPS, accelerometers, and a digital compass. The result is a more distinct description of features in the camera image, which correspond to 3D points in the environment. This is particularly helpful in urban environments containing large amounts of repetitive visual features. Additionally, we describe the first comprehensive test database for outdoor handheld camera localization comprising over 45,000 real camera images of an urban environment, captured under natural camera motions and different illumination settings. For all these images, the dataset not only contains readings of the sensors attached to the camera, but also ground truth information on the full 6DoF camera pose, and the geometry and texture of the environment. Based on this dataset, which we have made available to the public, we show that using our proposed framework provides both faster matching and better localization results compared to state-of-the-art methods.
Download

Paper Nr: 146
Title:

Combining Dense Features with Interest Regions for Efficient Part-based Image Matching

Authors:

Priyadarshi Bhattacharya and Marina L. Gavrilova

Abstract: One of the most popular approaches for object recognition is bag-of-words, which represents an image as a histogram of the frequency of occurrence of visual words. But it has some disadvantages. Besides requiring computationally expensive geometric verification to compensate for the lack of spatial information in the representation, it is particularly unsuitable for sub-image retrieval problems because any noise, background clutter or other objects in the vicinity influence the histogram representation. In our previous work, we addressed this issue by developing a novel part-based image matching framework that utilizes the spatial layout of dense features within interest regions to vastly improve recognition rates for landmarks. In this paper, we improve upon the previously published recognition results by more than 12% and achieve significant reductions in computation time. A region of interest (ROI) selection strategy is proposed along with a new voting mechanism for ROIs. Also, inverse document frequency weighting is introduced in our image matching framework for both ROIs and dense features inside the ROIs. We provide experimental results for various vocabulary sizes on the benchmark Oxford 5K and INRIA Holidays datasets.
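
As background for the two building blocks this abstract mentions, the sketch below builds a visual-word histogram and applies the textbook inverse document frequency weight log(N / df). It is not the authors' ROI-based scheme: the function names are hypothetical, and each image is assumed to be already quantised into a list of visual-word ids.

```python
import math
from collections import Counter


def idf_weights(documents):
    """Map each visual word to log(N / number of images containing it)."""
    n = len(documents)
    # Count each word at most once per image (document frequency).
    doc_freq = Counter(word for doc in documents for word in set(doc))
    return {word: math.log(n / count) for word, count in doc_freq.items()}


def weighted_histogram(doc, idf):
    """Bag-of-words histogram with term counts scaled by IDF.

    Words missing from the IDF table get weight 0.0 (an assumed convention).
    """
    term_freq = Counter(doc)
    return {word: count * idf.get(word, 0.0)
            for word, count in term_freq.items()}
```

Words that occur in every image receive a weight of zero, so only words that help tell images apart contribute to the match score.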
Download

Paper Nr: 165
Title:

Active Learning in Social Context for Image Classification

Authors:

Elisavet Chatzilari, Spiros Nikolopoulos, Yiannis Kompatsiaris and Josef Kittler

Abstract: Motivated by the widespread adoption of social networks and the abundant availability of user-generated multimedia content, our purpose in this work is to investigate how the known principles of active learning for image classification fit in this newly developed context. The process of active learning can be fully automated in this social context by replacing the human oracle with the user-tagged images obtained from social networks. However, the noisy nature of user-contributed tags adds further complexity to the problem of sample selection since, apart from their informativeness, our confidence about their actual content should also be maximized. The contribution of this work is a probabilistic approach for jointly maximizing the two aforementioned quantities with a view to automating the process of active learning. Experimental results show the superiority of the proposed method against various baselines and verify the assumption that significant performance improvement cannot be achieved unless we jointly consider the samples’ informativeness and the oracle’s confidence.
Download

Paper Nr: 169
Title:

Surface Area Analysis for People Number Estimation

Authors:

Hiroyuki Arai, Naoki Ito and Yukinobu Taniguchi

Abstract: An important property of surface areas of objects as observed by a calibrated monocular camera is introduced; also improved techniques to apply the property to people number estimation are proposed. Standard surface area (SSA) is defined as the surface area of the reverse projection of an image-pixel onto a plane at specific height in the real world. SSA is calculated for each pixel according to camera calibration parameters. When the target object is bound to a certain plane, for example the floor plane, the sum of SSA along with the foreground pixels of one target object becomes constant. Therefore, simple foreground detection and SSA summation yield the number of target objects. This basic idea was proposed in a prior article, but there were two major limitations. One is that the original model could not be applied to the area directly below the camera. The other is that the silhouette of the target object was limited to a simple rectangle. In this paper we propose improved techniques that remove the limitations. Slant silhouette analysis removes the first limitation, and silhouette decomposition the second. The validity and the effectiveness of the techniques are confirmed by experiments.
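
The basic counting idea stated in this abstract (sum the SSA over foreground pixels, then divide by the constant sum contributed by one person) can be written down in a few lines. The sketch assumes the per-pixel SSA map, the foreground mask, and the per-person constant are already available; the names are hypothetical, and the slant silhouette analysis and silhouette decomposition refinements are not modelled.

```python
def estimate_people_count(ssa_map, foreground_mask, area_per_person):
    """Estimate the number of people from summed standard surface area.

    ssa_map:          per-pixel SSA values (list of rows), assumed to come
                      from camera calibration.
    foreground_mask:  same shape, truthy where a pixel is foreground.
    area_per_person:  the constant SSA sum of one target object (assumed known).
    """
    if area_per_person <= 0:
        raise ValueError("area_per_person must be positive")
    total_area = sum(
        ssa
        for ssa_row, mask_row in zip(ssa_map, foreground_mask)
        for ssa, is_foreground in zip(ssa_row, mask_row)
        if is_foreground
    )
    return total_area / area_per_person
```

The result is a real number; rounding it to the nearest integer is one simple way to turn it into a head count.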
Download

Paper Nr: 174
Title:

Fast Self-supervised On-line Training for Object Recognition Specifically for Robotic Applications

Authors:

Markus Schoeler, Simon Christoph Stein, Jeremie Papon, Alexey Abramov and Florentin Woergoetter

Abstract: Today most recognition pipelines are trained at an off-line stage, providing systems with pre-segmented images and predefined objects, or at an on-line stage, which requires a human supervisor to tediously control the learning. Self-Supervised on-line training of recognition pipelines without human intervention is a highly desirable goal, as it allows systems to learn unknown, environment specific objects on-the-fly. We propose a fast and automatic system, which can extract and learn unknown objects with minimal human intervention by employing a two-level pipeline combining the advantages of RGB-D sensors for object extraction and high-resolution cameras for object recognition. Furthermore, we significantly improve recognition results with local features by implementing a novel keypoint orientation scheme, which leads to highly invariant but discriminative object signatures. Using only one image per object for training, our system is able to achieve a recognition rate of 79% for 18 objects, benchmarked on 42 scenes with random poses, scales and occlusion, while only taking 7 seconds for the training. Additionally, we evaluate our orientation scheme on the state-of-the-art 56-object SDU-dataset boosting accuracy for one training view per object by +37% to 78% and peaking at a performance of 98% for 11 training views.
Download

Paper Nr: 203
Title:

Joint Learning for Multi-class Object Detection

Authors:

Hamidreza Odabai Fard, Mohamed Chaouch, Quoc-cuong Pham, Antoine Vacavant and Thierry Chateau

Abstract: In practice, multiple objects in images are located by consecutively applying one detector for each class and taking the most confident score. In this work, we show the advantage of grouping similar object classes into a hierarchical structure. While this approach has found interest in image classification, it has not been analyzed for the object detection task. Each node in the hierarchy represents one decision line. All the decision lines are learned jointly using a novel problem formulation. Based on experiments using the PASCAL VOC 2007 dataset, we show that our approach improves detection performance compared to a baseline approach.
Download

Paper Nr: 216
Title:

Subtasks of Unconstrained Face Recognition

Authors:

Joel Z. Leibo, Qianli Liao and Tomaso Poggio

Abstract: Unconstrained face recognition remains a challenging computer vision problem despite recent exceptionally high results (~95% accuracy) on the current gold standard evaluation dataset: Labeled Faces in the Wild (LFW). We offer a decomposition of the unconstrained problem into subtasks based on the idea that invariance to identity-preserving transformations is the crux of recognition. Each of the subtasks in the Subtasks of Unconstrained Face Recognition (SUFR) challenge consists of a same-different face-matching problem on a set of 400 individual synthetic faces rendered so as to isolate a specific transformation or set of transformations. We characterized the performance of 9 different models (8 previously published) on each of the subtasks. One notable finding was that the HMAX-C2 feature was not nearly as clutter-resistant as had been suggested by previous publications. Next we considered LFW and argued that it is too easy of a task to continue to be regarded as a measure of progress on unconstrained face recognition. In particular, strong performance on LFW requires almost no invariance, yet it cannot be considered a fair approximation of the outcome of a detection → alignment pipeline since it does not contain the kinds of variability that realistic alignment systems produce when working on non-frontal faces. We offer a new, more difficult, natural image dataset: SUFR-in-the-Wild (SUFR-W), which we created using a protocol that was similar to LFW, but with a few differences designed to produce more need for transformation invariance. We present baseline results for eight different face recognition systems on the new dataset and argue that it is time to retire LFW and move on to more difficult evaluations for unconstrained face recognition.
Download

Paper Nr: 230
Title:

Classifying and Visualizing Motion Capture Sequences using Deep Neural Networks

Authors:

Kyunghyun Cho and Xi Chen

Abstract: Gesture recognition using motion capture data and depth sensors has recently drawn more attention in vision recognition. Currently most systems only classify datasets with a couple of dozen different actions. Moreover, feature extraction from the data is often computationally complex. In this paper, we propose a novel system to recognize actions from skeleton data with simple, but effective, features using deep neural networks. Features are extracted for each frame based on the relative positions of joints (PO), temporal differences (TD), and normalized trajectories of motion (NT). Given these features, a hybrid multi-layer perceptron is trained, which simultaneously classifies and reconstructs input data. We use a deep autoencoder to visualize learnt features. The experiments show that deep neural networks can capture more discriminative information than, for instance, principal component analysis can. We test our system on a public database with 65 classes and more than 2,000 motion sequences. We obtain an accuracy above 95% which is, to our knowledge, the state of the art result for such a large dataset.
Download

Paper Nr: 236
Title:

Multi-feature Real Time Pedestrian Detection from Dense Stereo SORT-SGM Reconstructed Urban Traffic Scenarios

Authors:

Ion Giosan and Sergiu Nedevschi

Abstract: In this paper, a real-time system for pedestrian detection in traffic scenes is proposed. It takes advantage of a pair of stereo video-cameras for acquiring the image frames and uses a sub-pixel level optimized semi-global matching (SORT-SGM) based stereo reconstruction for computing the dense 3D points map with high accuracy. A multiple paradigm detection module considering 2D, 3D and optical flow information is used for segmenting the candidate obstacles from the scene background. Novel features like texture dissimilarity, humans’ body specific features, distance related measures and speed are introduced and combined in a feature vector with traditional features like HoG score, template matching contour score and dimensions. A random forest (RF) classifier is trained and then applied in each frame for distinguishing the pedestrians from other obstacles based on the feature vector. A k-NN algorithm on the classification results over the last frames is applied for improving the accuracy and stability of the tracked obstacles. Finally, two comparisons are made: the first between the classification results obtained by using the new SORT-SGM and the older local matching approach for stereo reconstruction, and the second between the RF classification results for the different features and other classifiers’ results.
Download

Paper Nr: 284
Title:

Image-based Object Classification of Defects in Steel using Data-driven Machine Learning Optimization

Authors:

Fabian Bürger, Christoph Buck, Josef Pauli and Wolfram Luther

Abstract: In this paper we study the optimization process of an object classification task for an image-based steel quality measurement system. The goal is to distinguish hollow from solid defects inside of steel samples by using texture and shape features of reconstructed 3D objects. In order to optimize the classification results we propose a holistic machine learning framework that should automatically answer the question "How well do state-of-the-art machine learning methods work for my classification problem?" The framework consists of three layers, namely feature subset selection, feature transform and classifier which subsequently reduce the data dimensionality. A system configuration is defined by feature subset, feature transform function, classifier concept and corresponding parameters. In order to find the configuration with the highest classifier accuracies, the user only needs to provide a set of feature vectors and ground truth labels. The framework performs a totally data-driven optimization using partly heuristic grid search. We incorporate several popular machine learning concepts, such as Principal Component Analysis (PCA), Support Vector Machines (SVM) with different kernels, random trees and neural networks. We show that with our framework even non-experts can automatically generate a ready for use classifier system with a significantly higher accuracy compared to a manually arranged system.
Download

Paper Nr: 288
Title:

PhotoCluster - A Multi-clustering Technique for Near-duplicate Detection in Personal Photo Collections

Authors:

Vassilios Vonikakis, Amornched Jinda-Apiraksa and Stefan Winkler

Abstract: This paper presents PhotoCluster, a new technique for identifying non-identical near-duplicate images in personal photo collections. Contrary to existing methods, PhotoCluster estimates the probability that a pair of images may be considered near-duplicate. Its main thrust is a multiple clustering step that produces a non-binary near-duplicate probability for each image pair, which exhibits correlation with the average observer opinion. First, PhotoCluster partitions the photolibrary into groups of semantically similar photos, using global features. Then, the multiple clustering step is applied within the images of these groups, using a combination of global and local features. Computationally expensive comparisons between local features take place only on a limited part of the library, resulting in a low overall computational cost. Evaluation with two publicly available datasets shows that PhotoCluster outperforms existing methods, especially in identifying ambiguous near-duplicate cases.
Download

Paper Nr: 294
Title:

Who is the Hero? - Semi-supervised Person Re-identification in Videos

Authors:

Umar Iqbal, Igor D. D. Curcio and Moncef Gabbouj

Abstract: Given a crowd-sourced set of videos of a crowded public event, this paper addresses the problem of detecting and re-identifying all appearances of every individual in the scene. The persons are ranked according to the frequency of their appearance and the rank of a person is considered as the measure of his/her importance. Grouping appearances of every person from such videos is a very challenging task. This is due to unavailability of prior information or training data, large changes in illumination, huge variations in camera viewpoints, severe occlusions and videos from different photographers. These problems are made tractable by exploiting a variety of visual and contextual cues i.e., appearance, sensor data and co-occurrence of people. A unified framework is proposed for efficient person matching across videos followed by their ranking. Experimental results on two challenging video data sets demonstrate the effectiveness of the proposed algorithm.
Download

Paper Nr: 320
Title:

Detecting Events in Crowded Scenes using Tracklet Plots

Authors:

Pau Climent-Pérez, Alexandre Mauduit, Dorothy N. Monekosso and Paolo Remagnino

Abstract: The main contribution of this paper is a compact representation of the ‘short tracks’ or tracklets present in a time window of a given video input, which allows different crowd events to be analysed and detected. To proceed, first, tracklets are extracted from a time window using a particle filter multi-target tracker. After noise removal, the tracklets are plotted into a square image by normalising their lengths to the size of the image. Different histograms are then applied to this compact representation. Thus, different events in a crowd are detected via Bag-of-words modelling. Novel video sequences can then be analysed to detect whether an abnormal or chaotic situation is present. The whole algorithm is tested with our own dataset, also introduced in the paper.
Download

Paper Nr: 341
Title:

Impact of Facial Cosmetics on Automatic Gender and Age Estimation Algorithms

Authors:

Cunjian Chen, Antitza Dantcheva and Arun Ross

Abstract: Recent research has established the negative impact of facial cosmetics on the matching accuracy of automated face recognition systems. In this paper, we analyze the impact of cosmetics on automated gender and age estimation algorithms. In this regard, we consider the use of facial cosmetics for (a) gender spoofing where male subjects attempt to look like females and vice versa, and (b) age alteration where female subjects attempt to look younger or older than they actually are. While such transformations are known to impact human perception, their impact on computer vision algorithms has not been studied. Our findings suggest that facial cosmetics can potentially be used to confound automated gender and age estimation schemes.
Download

Paper Nr: 346
Title:

Exploring Residual and Spatial Consistency for Object Detection

Authors:

Hao Wang, Ya Zhang and Zhe Xu

Abstract: Local image features show a high degree of repeatability, while their local appearance usually does not provide a pattern discriminative enough to obtain reliable matching. In this paper, we present a new object matching algorithm based on a novel robust estimation of residual consensus and a flexible spatial consistency filter. We evaluate the similarity between different homography models via a two-parameter integrated Weibull distribution and inlier probability estimates, which can select an uncontaminated model to help eliminate outliers. The spatial consistency test is encoded by the geometric relationships of domain knowledge in two directions, which is invariant to scale, rotation, and translation, and especially robust to flipped images. Experimental results on natural images with cluttered backgrounds demonstrate the effectiveness and robustness of our method.
Download

Short Papers
Paper Nr: 2
Title:

Image Flower Recognition based on a New Method for Color Feature Extraction

Authors:

Amira Ben Mabrouk, Asma Najjar and Ezzeddine Zagrouba

Abstract: In this paper, we first present a new method for color feature extraction based on SURF detectors. Then, we demonstrate its efficiency for flower image classification. To this end, we describe the visual content of the flower images using compact and accurate descriptors. These features are combined and the learning process is performed using a multiple kernel framework with an SVM classifier. The proposed method has been tested on the dataset provided by the University of Oxford and achieved better results than our implementation of the method proposed by Nilsback and Zisserman (Nilsback and Zisserman, 2008) in terms of classification rate and execution time.
Download

Paper Nr: 14
Title:

Driver Drowsiness Estimation from Facial Expression Features - Computer Vision Feature Investigation using a CG Model

Authors:

Taro Nakamura, Akinobu Maejima and Shigeo Morishima

Abstract: We propose a method for estimating the degree of a driver’s drowsiness on the basis of changes in facial expressions captured by an IR camera. Typically, drowsiness is accompanied by drooping eyelids. Therefore, most related studies have focused on tracking eyelid movement by monitoring facial feature points. However, drowsiness shows not only in eyelid movements but also in other facial expressions. To estimate drowsiness more precisely, we must select other effective features. In this study, we detected a new drowsiness feature by comparing a video image and a CG model that are applied to the existing feature point information. In addition, we propose a more precise method for estimating the degree of drowsiness that uses wrinkle changes and local edge intensity on the face, which express drowsiness more directly in its initial stage.
Download

Paper Nr: 16
Title:

Enhanced Hierarchical Conditional Random Field Model for Semantic Image Segmentation

Authors:

Li-Li Wang, Shan-Shan Zhu and N. H. C. Yung

Abstract: Pairwise and higher order potentials in the Hierarchical Conditional Random Field (HCRF) model play a vital role in smoothing region boundaries and extracting actual object contours in the labeling space. However, the pairwise potential evaluated from color information tends to over-smooth small regions that are similar to their neighbors in the color space, and the higher order potential associated with multiple segments is prone to give incorrect guidance to inference, especially for objects with features similar to the background. To overcome these problems, this paper proposes two enhanced potentials in the HCRF model that are able to reduce over-smoothing by propagating the believed labeling from the unary potential and to perform coherent inference by ensuring reliable segment consistency. Experimental results on the MSRC-21 data set demonstrate that the enhanced HCRF model achieves pleasing visual results, as well as significant improvements, with a global accuracy of 87.52% and an average accuracy of 80.18%, outperforming other algorithms reported in the literature so far.
Download

Paper Nr: 17
Title:

Discriminant Boosted Dynamic Time Warping and Its Application to Gesture Recognition

Authors:

Tarik Arici, Sait Celebi, Ali Selman Aydin and Talha Tarik Temiz

Abstract: Dynamic time warping (DTW) measures similarity between two data sequences by minimizing an accumulated distance between two sequence samples at each iteration; the resulting cost assesses the level of similarity. The DTW cost may then be used to assign a sequence to a class if the problem is a classification problem. In machine learning, classification problems are solved using features with good discrimination power, which are generated by exploiting the distribution of data vectors. Linear Discriminant Analysis (LDA) is such a technique and finds discriminative projection directions which are used to generate features as projections of sequence vectors onto these directions. Unfortunately, these techniques are not applicable to warped sequences because the mapping between the test sequences and the training sequences is not known. To solve this problem, we propose a constrained LDA framework that produces direction vectors that repeat unit vectors that have dimensions equal to the dimensions of a single sequence sample. Such projection vectors can be used without knowing the mapping of test sequence vectors to training sequence vectors. Experimental results show that generating features by discriminant analysis improves the performance significantly.
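
As a point of reference for the abstract above, the following is a minimal sketch of plain DTW with a nearest-template classifier. It is not the discriminant boosted variant proposed in the paper; the function names `dtw_cost` and `classify`, the scalar samples, and the absolute-difference distance are all assumptions made for illustration.

```python
def dtw_cost(a, b, dist=lambda x, y: abs(x - y)):
    """Accumulated DTW cost between sequences a and b (classic O(len(a)*len(b)) recurrence)."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance when aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # extend the cheapest of the three admissible predecessor alignments
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify(query, templates):
    """Assign query to the label of the template with the lowest DTW cost.

    templates: dict mapping a label to a list of training sequences (hypothetical layout).
    """
    return min(
        (dtw_cost(query, seq), label)
        for label, seqs in templates.items()
        for seq in seqs
    )[1]
```

In this sketch a repeated sample costs nothing to absorb, so a slowed-down gesture still matches its template, which is the warping behaviour the abstract relies on.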

Paper Nr: 22
Title:

Towards Reliable Real-time Person Detection

Authors:

Silviu-Tudor Serban, Srinidhi Mukanahallipatna Simha, Vasanth Bathrinarayanan, Etienne Corvee and Francois Bremond

Abstract: We propose a robust real-time person detection system, which aims to serve as a solid foundation for developing solutions at an elevated level of reliability. Our belief is that clever handling of input data, combined with efficacious training algorithms, is key to obtaining top performance. We introduce a comprehensive training method based on random sampling that compiles optimal classifiers with minimal bias and overfitting. Building upon recent advances in multi-scale feature computations, our approach attains state-of-the-art accuracy while running at a high frame rate.
Download

Paper Nr: 25
Title:

Hidden Conditional Random Fields for Action Recognition

Authors:

Lifang Chen, Nico van der Aa, Robby T. Tan and Remco C. Veltkamp

Abstract: In the field of action recognition, the design of features has been explored extensively, but the choice of action classification methods is limited. Commonly used classification methods like k-Nearest Neighbors and Support Vector Machines assume conditional independence between features. In contrast, Hidden Conditional Random Fields (HCRFs) model the spatial or temporal dependencies of features, which makes them better suited to rich, overlapping features. In this paper, we investigate the performance of HCRF and Max-Margin HCRF (MMHCRF) and their baseline versions, the root model and Multi-class SVM, respectively, for action recognition on the Weizmann dataset. We introduce the Part Labels method, which explicitly uses the part labels learned by HCRF as a new set of local features. We show that modelling only spatial structures in 2D space is not sufficient to justify the additional complexity of HCRF, MMHCRF or the Part Labels method for action recognition.
Download

Paper Nr: 32
Title:

Application of Dynamic Distributional Clauses for Multi-hypothesis Initialization in Model-based Object Tracking

Authors:

D. Nitti, G. Chliveros, M. Pateraki, L. De Raedt, E. Hourdakis and P. Trahanias

Abstract: In this position paper we propose the use of the Distributional Clauses Particle Filter in conjunction with a model-based 3D object tracking method in monocular camera sequences. We describe the model-based object tracking method, which uses contour and edge features for relative 3D pose estimation. We also describe the application of the Distributional Clauses Particle Filter that takes into account inputs from object tracking. We argue that objects’ dynamics can be modeled via probabilistic rules, which makes it possible to predict and utilise a pose hypothesis space for fully occluded or ‘invisible’ (hidden-away) objects that may re-appear in the camera field of view. Important issues, such as losing track of the object in a ‘total occlusion’ scenario, are discussed.
Download

Paper Nr: 33
Title:

Subsign Detection and Classification System for Automated Traffic-sign Inventory Systems

Authors:

Lykele Hazelhoff, Ron op het Veld, Ivo Creusen and Peter H. N. de With

Abstract: Road safety is influenced by the accurate placement and visibility of road signs, which are maintained based on inventories of traffic signs. These inventories are created (semi-)automatically from street-level images, based on object detection and classification. These systems often neglect the complementary signs (subsigns) that are present, although these are clearly important for the meaning and validity of signs. This paper presents a generic, learning-based approach for both detection and classification of subsigns, which is based on the same principles as the system employed for finding traffic signs and can be used as an extension to automated inventory systems. The system starts with detection of subsigns in a region below each detected sign, followed by analysis of the results obtained for all captures of the same sign. When a subsign is found, the corresponding pixel regions are extracted and subjected to classification. This recognition system is evaluated on 3,104 signs (397 with subsign) identified by an existing inventory system. At a detection rate of 98%, only 757 signs (24.4% of the signs) are labeled as containing a subsign, while 91.4% of the subsigns of a class known to our classifier are also classified correctly.
Download

Paper Nr: 34
Title:

Dictionary based Pooling for Object Categorization

Authors:

Sean Ryan Fanello, Nicoletta Noceti, Giorgio Metta and Francesca Odone

Abstract: It is well known that image representations learned through ad-hoc dictionaries improve the overall results in object categorization problems. Following the widely accepted coding-pooling visual recognition pipeline, these representations are often tightly coupled with a coding stage. In this paper we show how to exploit ad-hoc representations both within the coding and the pooling phases. We learn a dictionary for each object class and then use local descriptors encoded with the learned atoms to guide the pooling operator. We exhaustively evaluate the proposed approach in both single instance object recognition and object categorization problems. From the applications standpoint we consider a classical image retrieval scenario with the Caltech 101, as well as a typical robot vision task with data acquired by the iCub humanoid robot.
Download

Paper Nr: 43
Title:

Temporally Consistent Snow Cover Estimation from Noisy, Irregularly Sampled Measurements

Authors:

Dominic Rüfenacht, Matthew Brown, Jan Beutel and Sabine Süsstrunk

Abstract: We propose a method for accurate and temporally consistent surface classification in the presence of noisy, irregularly sampled measurements, and apply it to the estimation of snow coverage over time. The input imagery is extremely challenging, with large variations in lighting and weather distorting the measurements. Initial snow cover estimations are obtained using a Gaussian Mixture Model of color. To achieve a temporally consistent snow cover estimation, we use a Markov Random Field that penalizes rapid fluctuations in the snow state, and show that the penalty term needs to be quite large, resulting in slow reactivity to changes. We thus propose a classifier to separate good from uninformative images, which allows the use of a smaller penalty term. We show that the incorporation of domain knowledge to discard uninformative images leads to better reactivity to changes in snow coverage as well as more accurate snow cover estimations.
Download
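
The trade-off described in the abstract above (a large fluctuation penalty gives consistent but slowly reacting estimates) can be illustrated with a minimal sketch. It assumes a simple chain model over time with per-frame label costs and a fixed cost per state change, solved exactly by dynamic programming; the function name `smooth_labels`, the cost values, and the chain structure are assumptions for illustration and are not taken from the paper.

```python
def smooth_labels(unary, penalty):
    """Minimum-energy labeling of a temporal chain.

    unary[t][s] is the cost of assigning state s at time t (e.g. 0 = no snow, 1 = snow);
    penalty is the cost paid each time the state changes between consecutive frames.
    """
    num_states = len(unary[0])
    best = [list(unary[0])]  # best[t][s]: cheapest energy of a labeling ending in state s
    back = []                # back-pointers used to recover the optimal labeling
    for t in range(1, len(unary)):
        row, ptr = [], []
        for s in range(num_states):
            # cheapest predecessor, paying the penalty if the state changes
            prev = min(
                range(num_states),
                key=lambda p: best[-1][p] + (penalty if p != s else 0.0),
            )
            row.append(unary[t][s] + best[-1][prev] + (penalty if prev != s else 0.0))
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    state = min(range(num_states), key=lambda k: best[-1][k])
    labels = [state]
    for ptr in reversed(back):
        state = ptr[state]
        labels.append(state)
    return labels[::-1]
```

With one deviating frame in an otherwise stable sequence, a large penalty suppresses the change and a small penalty follows it, which is why discarding uninformative frames beforehand permits the smaller penalty.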

Paper Nr: 47
Title:

Toward Object Recognition with Proto-objects and Proto-scenes

Authors:

Fabian Nasse, Rene Grzeszick and Gernot A. Fink

Abstract: In this paper a bottom-up approach for detecting and recognizing objects in complex scenes is presented. In contrast to top-down methods, no prior knowledge about the objects is required beforehand. Instead, two different views on the data are computed: First, a GIST descriptor is used for clustering scenes with a similar global appearance which produces a set of Proto-Scenes. Second, a visual attention model that is based on hierarchical multi-scale segmentation and feature integration is proposed. Regions of Interest that are likely to contain an arbitrary object, a Proto-Object, are determined. These Proto-Object regions are then represented by a Bag-of-Features using Spatial Visual Words. The bottom-up approach makes the detection and recognition tasks more challenging but also more efficient and easier to apply to an arbitrary set of objects. This is an important step toward analyzing complex scenes in an unsupervised manner. The bottom-up knowledge is combined with an informed system that associates Proto-Scenes with objects that may occur in them and an object classifier is trained for recognizing the Proto-Objects. In the experiments on the VOC2011 database the proposed multi-scale visual attention model is compared with current state-of-the-art models for Proto-Object detection. Additionally, the Proto-Objects are classified with respect to the VOC object set.
Download

Paper Nr: 63
Title:

Large-scale Image Retrieval based on the Vocabulary Tree

Authors:

Bo Cheng, Li Zhuo, Pei Zhang and Jing Zhang

Abstract: In this paper, a vocabulary-tree-based large-scale image retrieval scheme is proposed that can achieve higher accuracy and speed. The novelty of this paper can be summarized as follows. First, because traditional Scale Invariant Feature Transform (SIFT) descriptors are excessively concentrated in some areas of images, the extraction process of SIFT features is optimized to reduce their number. Then, in combination with the optimized SIFT, a color histogram in the Hue, Saturation, Value (HSV) color space is extracted as another image feature. Moreover, Local Fisher Discriminant Analysis (LFDA) is applied to reduce the dimension of SIFT and color features, which helps to shorten feature-clustering time. Finally, the dimension-reduced features are used to generate vocabulary trees, which are used for large-scale image retrieval. The experimental results on several image datasets show that the proposed method can achieve satisfactory retrieval precision.
Download

Paper Nr: 92
Title:

Image Retrieval with Reciprocal and Shared Nearest Neighbors

Authors:

Agni Delvinioti, Hervé Jégou, Laurent Amsaleg and Michael Houle

Abstract: Content-based image retrieval systems typically rely on a similarity measure between image vector representations, such as in bag-of-words, to rank the database images in decreasing order of expected relevance to the query. However, the inherent asymmetry of k-nearest neighborhoods is not properly accounted for by traditional similarity measures, possibly leading to a loss of retrieval accuracy. This paper addresses this issue by proposing similarity measures that use neighborhood information to assess the relationship between images. First, we extend previous work on k-reciprocal nearest neighbors to produce new measures that improve over the original primary metric. Second, we propose measures defined on sets of shared nearest neighbors for reranking the shortlist. Both these methods are simple, yet they significantly improve the accuracy of image search engines on standard benchmark datasets.
Download
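
The asymmetry of k-nearest neighborhoods mentioned in the abstract above can be made concrete with a minimal sketch. It assumes a precomputed symmetric distance matrix; the function names `knn`, `reciprocal_neighbors` and `shared_neighbor_count` are hypothetical, and the sketch shows only the basic neighborhood relations, not the similarity measures proposed in the paper.

```python
def knn(dist, i, k):
    """Indices of the k nearest neighbours of item i (excluding i itself)."""
    order = sorted((j for j in range(len(dist)) if j != i), key=lambda j: dist[i][j])
    return set(order[:k])


def reciprocal_neighbors(dist, i, k):
    """Neighbours j of i such that i is also among the k nearest neighbours of j."""
    return {j for j in knn(dist, i, k) if i in knn(dist, j, k)}


def shared_neighbor_count(dist, i, j, k):
    """Number of items that appear in the k-neighbourhoods of both i and j."""
    return len(knn(dist, i, k) & knn(dist, j, k))
```

For points at positions 0, 1, 3 and 10 on a line with k = 1, the point at 10 has the point at 3 as its nearest neighbour, but not the other way round, so its reciprocal neighbourhood is empty.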

Paper Nr: 99
Title:

Learning a Loopy Model Exactly

Authors:

Andreas Christian Müller and Sven Behnke

Abstract: Learning structured models using maximum margin techniques has become an indispensable tool for computer vision researchers, as many computer vision applications can be cast naturally as an image labeling problem. Pixel-based or superpixel-based conditional random fields are particularly popular examples. Typically, neighborhood graphs, which contain a large number of cycles, are used. As exact inference in loopy graphs is NP-hard in general, learning these models without approximations is usually deemed infeasible. In this work we show that, despite the theoretical hardness, it is possible to learn loopy models exactly in practical applications. To this end, we analyze the use of multiple approximate inference techniques together with cutting plane training of structural SVMs. We show that our proposed method yields exact solutions with optimality guarantees in a computer vision application, for little additional computational cost. We also propose a dynamic caching scheme to accelerate training further, yielding runtimes that are comparable with approximate methods. We hope that this insight can lead to a reconsideration of the tractability of loopy models in computer vision.
Download

Paper Nr: 106
Title:

Approximate Image Matching using Strings of Bag-of-Visual Words Representation

Authors:

Hong-Thinh Nguyen, Cécile Barat and Christophe Ducottet

Abstract: The Spatial Pyramid Matching approach has become very popular for modeling images as sets of local bags-of-words. The image comparison is then done region-by-region with an intersection kernel. Despite its success, this model presents some limitations: the grid partitioning is predefined and identical for all images and the matching is sensitive to intra- and inter-class variations. In this paper, we propose a novel approach based on approximate string matching to overcome these limitations and improve the results. First, we introduce a new image representation as strings of ordered bags-of-words. Second, we present a new edit distance specifically adapted to strings of histograms in the context of image comparison. This distance identifies local alignments between subregions and allows sequences of similar subregions to be removed to better match two images. Experiments on 15 Scenes and Caltech 101 show that the proposed approach outperforms the classical spatial pyramid representation and most competing classification methods presented in recent years.
Download

Paper Nr: 107
Title:

Environment Adaptive Pedestrian Detection using In-vehicle Camera and GPS

Authors:

Daichi Suzuo, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase, Hiroyuki Ishida and Yoshiko Kojima

Abstract: In recent years, accurate pedestrian detection from in-vehicle camera images has received attention for the development of driving safety assistance systems. Currently, successful methods are based on statistical learning. However, such methods require a large number of training images, and a decrease in the number of training images degrades the detection accuracy. That is, in driving environments with few or no training images, it is difficult to detect pedestrians accurately. Therefore, we propose an approach that collects training images automatically to build classifiers for various driving environments. This is expected to realize highly accurate pedestrian detection by using an appropriate classifier corresponding to the current location. The proposed method consists of three steps: classification of driving scenes; collection of non-pedestrian images and training of classifiers for each scene class; and association of a scene-class-specific classifier with GPS location information. Through experiments, we confirmed the effectiveness of the method compared to baseline methods.
Download

Paper Nr: 119
Title:

Active Shape Models with SIFT Descriptors and MARS

Authors:

Stephen Milborrow and Fred Nicolls

Abstract: We present a technique for locating landmarks in images of human faces. We replace the 1D gradient profiles of the classical Active Shape Model (ASM) (Cootes and Taylor, 1993) with a simplified form of SIFT descriptors (Lowe, 2004), and use Multivariate Adaptive Regression Splines (MARS) (Friedman, 1991) for descriptor matching. This modified ASM is fast and performs well against existing techniques for automatic face landmarking on frontal faces.
Download

Paper Nr: 133
Title:

Towards Unsupervised Sudden Group Movement Discovery for Video Surveillance

Authors:

Sofia Zaidenberg, Piotr Bilinski and François Brémond

Abstract: This paper presents a novel, unsupervised approach for discovering “sudden” movements in surveillance videos. The proposed approach automatically detects quick motions in a video, corresponding to any action. A set of possible actions is not required and the proposed method successfully detects potentially alarm-raising actions without training or camera calibration. Moreover, the system uses a group detection and event recognition framework to relate detected sudden movements and groups of people, and provide a semantic interpretation of the scene. We have tested our approach on a dataset of nearly 8 hours of videos recorded from two cameras in the Parisian subway for a European Project. For evaluation, we annotated 1 hour of sequences containing 50 sudden movements.
Download

Paper Nr: 150
Title:

A Pattern Recognition System for Detecting Use of Mobile Phones While Driving

Authors:

Rafael A. Berri, Alexandre G. Silva, Rafael S. Parpinelli, Elaine Girardi and Rangel Arthur

Abstract: It is estimated that 80% of crashes and 65% of near collisions involved drivers inattentive to traffic for three seconds before the event. This paper develops an algorithm for extracting features that allow the identification of cell phone use while driving a vehicle. Experiments were performed on sets of images with 100 positive images (with phone) and 100 negative images (no phone), containing frontal images of the driver. A Support Vector Machine (SVM) with a polynomial kernel is the most advantageous classifier for the features provided by the algorithm, obtaining a success rate of 91.57% for the vision system. Tests on videos show that it is possible to use the image datasets for training classifiers in real situations. Periods of 3 seconds were correctly classified in 87.43% of cases.
Download

Paper Nr: 195
Title:

Paper Substrate Classification based on 3D Surface Micro-geometry

Authors:

Hossein Malekmohamadi, Khemraj Emrith, Stephen Pollard, Guy Adams, Melvyn Smith and Steve Simske

Abstract: This paper presents an approach to derive a novel 3D signature based on the micro-geometry of paper surfaces so as to uniquely characterise and classify different paper substrates. This procedure is extremely important for countering different forms of tampering with valuable documents. We use a 4-light source photometric stereo (PS) method to recover dense 3D geometry of paper surfaces captured using an ultra-high resolution sensing device. We derive a unique signature for each paper type based on the shape index (SI) map generated from the surface normals of the 3D data. We show that the proposed signature can robustly and accurately classify paper substrates with different physical properties and different surface textures. Additionally, we present results demonstrating that our classification model using the 3D signature performs significantly better than conventional 2D image-based descriptors extracted from both printed and non-printed paper surfaces. The accuracy of the proposed method is validated on a dataset comprising 21 printed and 22 non-printed paper types, and a classification success rate of over 92% is achieved in both cases (92.5% for printed surfaces and 96% for the non-printed ones).
Download

Paper Nr: 202
Title:

Invariant Shape Prior Knowledge for an Edge-based Active Contours - Invariant Shape Prior for Active Contours

Authors:

Mohamed Amine Mezghich, Slim M’Hiri and Faouzi Ghorbel

Abstract: In this paper, we propose a new method to incorporate a geometric shape prior into edge-based active contours for robust object detection in the presence of partial occlusions, low contrast and noise. A shape registration method based on phase correlation of binary images, associated with level set functions of the active contour and a reference shape, is used to define prior knowledge, making the model invariant with respect to Euclidean transformations. In the case of several templates, a set of complete invariant shape descriptors is used to select the most suitable one according to the evolving contour. Experimental results show the ability of the proposed approach to constrain an evolving curve towards target shapes that may be occluded and cluttered under rigid transformations.
Download

Paper Nr: 209
Title:

Multi-viewpoint Visibility Coverage Estimation for 3D Environment Perception - Volumetric Representation as a Gateway to High Resolution Data

Authors:

Marek Ososinski and Frédéric Labrosse

Abstract: Estimation of visibility is a crucial element of coverage estimation of large, complex environments. This nonprobabilistic problem is often tackled in a 2D context. We present an algorithm that can estimate the visibility of a high resolution scene from a low resolution 3D representation. An octree-based voxel representation provides a dataset that is easy to process. Voxel occupancy properties ensure a good approximation of visibility at high resolution. Our system is capable of producing a reasonable solution to the viewpoint placement issue of the Art gallery problem.
Download

Paper Nr: 211
Title:

Pose Recognition in Indoor Environments using a Fisheye Camera and a Parametric Human Model

Authors:

K. K. Delibasis, V. P. Plagianakos and I. Maglogiannis

Abstract: In this paper we present a system that uses computer vision techniques and a deformable 3D human model, in order to recognize the posture of a monitored person, given the segmented human silhouette from the background. The video data are acquired indoors from a fixed fish-eye camera placed in the living environment. The implemented 3D human model collaborates with a fish-eye camera model, allowing the calculation of the real human position in the 3D-space and consequently recognizing the posture of the monitored person. The paper discusses the details of the human model and fish-eye camera model, as well as the posture recognition methodology. Initial results are also presented for a small number of video sequences, of walking or standing humans.
Download

Paper Nr: 229
Title:

A Multi-stage Segmentation based on Inner-class Relation with Discriminative Learning

Authors:

Haoqi Fan, Yuanshi Zhang and Guoyu Zuo

Abstract: In this paper, we propose a segmentation approach that not only segments an object of interest but also labels the different semantic parts of the object. A discriminative model is presented that describes an object in real-world images as multiple, disparate and correlated parts. We propose a multi-stage segmentation approach to perform inference on the segments of an object, and train it under the latent structural SVM learning framework. We show that our method yields an average improvement of about 5% on the ETHZ Shape Classes dataset and 4% on the INRIA horses dataset. Finally, extensive experiments with intricate occlusions on the INRIA horses dataset show that the approach achieves state-of-the-art performance under occlusion and deformation.
Download

Paper Nr: 240
Title:

A Method of Weather Recognition based on Outdoor Images

Authors:

Qian Li, Yi Kong and Shi-ming Xia

Abstract: To improve the quality of outdoor video surveillance and to acquire weather conditions automatically, a method for recognizing weather phenomena from outdoor images is presented. Our method has three characteristics: first, features such as the power spectrum slope, contrast, noise and saturation are extracted after analysing the effect of weather conditions on images; second, a decision tree is constructed according to the distance between the features; third, when each SVM classifier at a non-leaf node of the decision tree is constructed, features are selected by assigning weights. The experimental results show that the proposed method can effectively recognize outdoor weather conditions.
Download

Paper Nr: 256
Title:

An Improved Approach for Depth Data based Face Pose Estimation using Particle Swarm Optimization

Authors:

Xiaozheng Mou and Han Wang

Abstract: This paper presents an improved approach for face pose estimation based on depth data using particle swarm optimization (PSO). In this approach, the frontal face of the system user is first initialized and its depth image is taken as a person-specific template. Each query face of that user is rotated and translated with respect to its centroid using PSO to match the template. Since the centroid of each query face changes with the face pose, a common reference point has to be defined to measure the exact transformation of the query face. Thus, the nose tips of the optimal transformed face and the query face are localized to recompute the transformation from the query face to the optimal transformed face that matches the template. Finally, using the recomputed rotation and translation information, the pose of the query face can be approximated by the relative pose between the query face and the template face. Experiments on a public database show that the accuracy of this new method is more than 99%, which is much higher than the best performance (< 91%) of existing work.
Download

Paper Nr: 269
Title:

Beyond SIFT for Image Classification

Authors:

Sébastien Paris, Xanadu Halkias and Hervé Glotin

Abstract: In classifying images, scenes or objects, the most popular approach is based on the feature extraction-coding-pooling framework, which generates discriminative and robust image representations from densely extracted local patches, mainly SIFT/HOG ones. The majority of the latest research focuses on how to improve the coding and pooling parts. In this work, we show that substantial improvements can also be obtained by coding information closer to the pixel level, in the same way that deep-learning architectures do. We introduce a two-layer, stacked, coder-pooler architecture where the first layer is specifically dedicated to extracting, from our so-called Differential Vectors (DV) patches, local low-level features that are more discriminative and efficient than their classic handcrafted counterparts. This first layer can advantageously replace any classic dense SIFT/HOG patch extraction stage. We demonstrate the effectiveness of our approach on three datasets: UIUC-Sports, Scene 15 and Caltech 101. We achieve excellent performance with simple linear classification while using basic coding and pooling schemes for both layers, i.e. Sparse Coding (SC) and Max-Pooling (MP) respectively.

Paper Nr: 272
Title:

Learning Weighted Joint-based Features for Action Recognition using Depth Camera

Authors:

Guang Chen, Daniel Clarke and Alois Knoll

Abstract: Human action recognition based on joints is a challenging task. The 3D positions of the tracked joints are very noisy if occlusions occur, which increases the intra-class variations in the actions. In this paper, we propose a novel approach to recognize human actions with weighted joint-based features. Previous work has focused on hand-tuned joint-based features, which are difficult and time-consuming to extend to other modalities. In contrast, we compute the joint-based features using an unsupervised learning approach. To capture the intra-class variance, a multiple kernel learning approach is employed to learn the skeleton structure that combines these joint-based features. We test our algorithm in an action recognition application using the Microsoft Research Action3D (MSRAction3D) dataset. Experimental evaluation shows that the proposed approach outperforms state-of-the-art action recognition algorithms on depth videos.
Download

Paper Nr: 277
Title:

Multi-objective Optimization for Characterization of Optical Flow Methods

Authors:

Jose Delpiano, Luis Pizarro, Rodrigo Verschae and Javier Ruiz-del-Solar

Abstract: Optical flow methods are among the most accurate techniques for estimating displacement and velocity fields in a number of applications that range from neuroscience to robotics. The performance of any optical flow method will naturally depend on the configuration of its parameters. Beyond the standard practice of manual (ad-hoc) selection of parameters for a specific application, in this article we propose a framework for automatic parameter setting that allows searching for an approximated Pareto-optimal set of configurations in the whole parameter space. This final Pareto front characterizes each specific method, enabling proper method comparison. We define two performance criteria, namely the accuracy and speed of the optical flow methods.
Download
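
The notion of a Pareto front over the two criteria named in the abstract above (accuracy and speed) can be sketched as follows. The sketch only filters an already evaluated list of parameter configurations; it does not perform the parameter-space search the paper proposes. The function name `pareto_front`, the tuple layout, and the use of error and runtime as lower-is-better quantities are assumptions for illustration.

```python
def pareto_front(configs):
    """Return the names of non-dominated configurations.

    configs: list of (name, error, runtime) tuples, where lower is better for both.
    A configuration is dominated if another one is at least as good on both
    criteria and strictly better on at least one.
    """
    front = []
    for name, err, rt in configs:
        dominated = any(
            (e2 <= err and r2 <= rt) and (e2 < err or r2 < rt)
            for _, e2, r2 in configs
        )
        if not dominated:
            front.append(name)
    return front
```

Comparing two methods then amounts to comparing their fronts rather than two hand-picked parameter settings.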

Paper Nr: 302
Title:

High Resolution Light Field Photography from Split Ray Imaging and Coded Aperture

Authors:

Shota Taki, Fumihiko Sakaue and Jun Sato

Abstract: In this paper, we propose a method for obtaining high resolution 4D light fields by using low resolution camera sensors and controllable coded apertures. Recently, 4D light field acquisition has been studied extensively in the field of computational photography. Since the 4D light field contains much more information than an ordinary 2D image, super high resolution camera sensors are normally needed to obtain high resolution 4D light fields. Our method instead combines the standard light field imaging technique with a coded aperture. By using these techniques together, we can obtain high resolution 4D light fields from low resolution cameras with a small number of image acquisitions. The efficiency of the proposed method is tested on real images.
Download

Paper Nr: 304
Title:

Assisting Navigation in Homogenous Fog

Authors:

Mihai Negru and Sergiu Nedevschi

Abstract: An important cause of road accidents is the reduced visibility due to the presence of fog or haze. For this reason, there is a fundamental need for Advanced Driving Assistance Systems (ADAS) based on efficient real time algorithms able to detect the presence of fog, estimate the fog’s density, determine the visibility distance and inform the driver about the maximum speed at which the vehicle should be traveling. Our solution improves on existing fog detection methods through the temporal integration of the horizon line and inflection point in the image. Our method runs in real time, at approximately 50 frames per second. It is based on a single in-vehicle camera and is able to detect daytime fog in a wide range of scenarios, including urban scenarios.
Download

Paper Nr: 349
Title:

Preliminary Study on the Design of a Low-cost Movement Analysis System - Reliability Measurement of Timed Up and Go Test

Authors:

Asma Hassani, Alexandre Kubicki, Vincent Brost and Fan Yang

Abstract: In this paper, we present experiments on the design of a novel movement analysis system for real-time balance assessment in the frail elderly. Using Microsoft Kinect sensors, we capture TUG (Timed Up and Go) tests and mainly analyze the sitting-to-standing and back-to-sitting transfers, which are two of the most commonly executed human movements. Nine spatio-temporal parameters were extracted from recorded joint positions by 3D skeletal sequence processing. In order to validate and evaluate the developed system, practical experiments were performed on ten healthy young subjects, who were asked to perform the TUG under three different conditions: normal, cognitive and motor. The results showed good measurement reliability and reproducibility with high precision. In addition, we observed that even for young healthy subjects, there is a significant difference in movement parameters between the normal condition and the cognitive condition, which is a stimulating result for the dual task paradigm field. This preliminary study opens a new avenue of research and development for geriatric health that covers multiple aspects: user-friendliness, hygiene, low cost, a home-based environment, and automatic autonomy assessment.
Download

Paper Nr: 354
Title:

From Text Vocabularies to Visual Vocabularies - What Basis?

Authors:

Jean Martinet

Abstract: The popular "bag-of-visual-words" approach for representing and searching visual documents consists in describing images (or video keyframes) using a set of descriptors that correspond to quantized low-level features. Most existing approaches for visual words are inspired by work in text indexing, based on the implicit assumption that visual words can be handled the same way as text words. More specifically, these techniques implicitly rely on the same postulate as in text information retrieval, stating that the word distribution for a natural language globally follows Zipf's law, that is to say, words from a natural language appear in a corpus with a frequency inversely proportional to their rank. However, our study shows that the visual word distribution depends on the choice of low-level features, and especially on the choice of the clustering method. We also show that when the visual word distribution is close to that of text words, the results of an image retrieval system are improved. To the best of our knowledge, no prior study has yet been carried out to compare the distributions of text words and visual words, with the objective of establishing the theoretical foundations of visual vocabularies.
Download
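
The Zipf's-law postulate stated in the abstract above (frequency inversely proportional to rank) can be checked on any list of word or visual-word occurrences by fitting a line to the rank-frequency curve in log-log space; a slope close to -1 indicates a Zipf-like distribution. The sketch below is only an illustration of that check, not the procedure used in the paper; `zipf_slope` and the synthetic token list are hypothetical.

```python
import math
from collections import Counter


def zipf_slope(tokens):
    """Least-squares slope of log(frequency) against log(rank).

    Requires at least two distinct tokens.
    """
    counts = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Synthetic corpus whose word frequencies are exactly 60 / rank.
tokens = []
for rank in range(1, 6):
    tokens.extend([f"w{rank}"] * (60 // rank))

slope = zipf_slope(tokens)  # close to -1 for this synthetic corpus
```

For visual words, `tokens` would be the sequence of codebook indices assigned to the local descriptors of a collection.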

Paper Nr: 356
Title:

A Survey of Extended Methods to the Bag of Visual Words for Image Categorization and Retrieval

Authors:

Mouna Dammak, Mahmoud Mejdoub and Chokri Ben Amar

Abstract: The semantic gap is a crucial issue in computer vision. Users wish to retrieve images at a semantic level, but image descriptors can only provide low-level similarity. As a result, building an intermediate stage between high-level semantic concepts and low-level visual features is a stimulating task. A recent approach, called Bag of Visual Words (BoW), has arisen to resolve this difficulty in greater generality through techniques that learn semantic vocabularies. In spite of the clarity and effectiveness of BoW, building the codebook is a critical step, which is ordinarily followed by a coding and pooling step. Yet it is still difficult to build a compact codebook at a reduced computational cost. Several approaches therefore try to overcome these difficulties and to improve image representation. In this paper, we present a survey that addresses the lack of a full description of the most important published approaches for image categorization and retrieval.
Download

Paper Nr: 401
Title:

Weighted SIFT Feature Learning with Hamming Distance for Face Recognition

Authors:

Guoyu Lu, Yingjie Hu and Chandra Kambhamettu

Abstract: The scale-invariant feature transform (SIFT) feature has been successfully utilized for face recognition because of its tolerance to changes in image scale, rotation and distortion. However, a major concern with using the original SIFT feature for face recognition is its high dimensionality, which leads to slow image matching. In addition, a large memory capacity is required to store high dimensional SIFT features. Aiming to find an efficient approach to solve these issues, we propose a new integrated method for face recognition in this paper. The new method consists of two novel functional modules. A projection function transforms the original SIFT features into a low dimensional Hamming feature space, in which each bit of the Hamming descriptor is ranked based on its discrimination power. Furthermore, a weighting function assigns different weights to the correctly matched features based on their matching times. Our proposed face recognition method has been applied to two benchmark facial image datasets: the ORL and Yale datasets. The experimental results show that the new method is able to produce a good image recognition rate with much improved computational speed.
Download

Paper Nr: 29
Title:

A Complete Framework for Fully-automatic People Indexing in Generic Videos

Authors:

Dario Cazzato, Marco Leo and Cosimo Distante

Abstract: Face indexing is a very popular research topic and it has been investigated over the last 10 years. It can be used for a wide range of applications such as automatic video content analysis, data mining, video annotation and labeling, etc. In this work, a fully automated framework that can detect how many people are present in a generic video (even one with low resolution and/or taken from a mobile camera) is presented. It also extracts the intervals of frames in which each person appears. The main contribution of the proposed work is that neither initialization nor a priori knowledge about the scene contents is required. Moreover, this approach introduces a generalized version of the k-means method that, through different statistical indices, automatically determines the number of people in the scene.
Download

Paper Nr: 61
Title:

Obstacle and Planar Object Detection using Sparse 3D Information for a Smart Walker

Authors:

Séverine Cloix, Viviana Weiss, Guido Bologna, Thierry Pun and David Hasler

Abstract: With the increasing proportion of senior citizens, many mobility aid devices have been developed such as the rollator. However, under some circumstances, the latter may cause accidents. The EyeWalker project aims to develop a small and autonomous device for rollators to help elderly people, especially those with some degree of visual impairment, avoid common dangers like obstacles and hazardous ground changes, both outdoors and indoors. We propose a method of real-time stereo obstacle detection using sparse 3D information. Working with sparse 3D points, as opposed to dense 3D maps, is computationally more efficient and more appropriate for long battery life. In our approach, 3D data are extracted from a stereo-rig of two 2D high dynamic range cameras developed at the CSEM (Centre Suisse d'Electronique et de Microtechnique) and processed to perform a boosting classification. We also present a deformable 3D object detector for which the 3D points are combined in several different ways and result in a set of pose estimates used to execute a less ill-posed classification. The evaluation, carried out on real stereo images of obstacles described with both 2D and 3D features, shows promising results for future use in real-world conditions.
Download

Paper Nr: 69
Title:

A Robust, Real-time Ground Change Detector for a “Smart” Walker

Authors:

Viviana Weiss, Séverine Cloix, Guido Bologna, David Hasler and Thierry Pun

Abstract: Nowadays, there are many different types of mobility aids for elderly people. Nevertheless, these devices may lead to accidents, depending on the terrain where they are being used. In this paper, we present a robust ground change detector that will warn the user of potentially risky situations. Specifically, we propose a robust classification algorithm to detect ground changes based on colour histograms and texture descriptors. In our design, we compare the current frame and the average of the k previous frames using different colour systems and Local Edge Patterns. To assess the performance of our algorithm, we evaluated different Artificial Neural Network architectures. The best results were obtained by feeding the input neurons with measures related to Histogram Intersections, Kolmogorov-Smirnov distance, Cumulative Integrals and Earth mover’s distance. Under real environmental conditions, our results indicate that the proposed detector can accurately distinguish ground changes in real time.
Download
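
Two of the histogram measures named in the abstract above, histogram intersection and the Kolmogorov-Smirnov distance, can be sketched as follows. These are the standard textbook definitions applied to normalized histograms; the exact variants, bin counts and colour systems used by the authors are not given in the abstract, so the function names and sample histograms here are assumptions.

```python
def normalize(hist):
    """Scale a histogram so that its bins sum to 1."""
    total = sum(hist)
    return [value / total for value in hist]


def histogram_intersection(h1, h2):
    """Sum of bin-wise minima of two normalized histograms (1.0 means identical)."""
    return sum(min(a, b) for a, b in zip(normalize(h1), normalize(h2)))


def ks_distance(h1, h2):
    """Largest absolute gap between the two cumulative histograms (0.0 means identical)."""
    cum1 = cum2 = 0.0
    largest = 0.0
    for a, b in zip(normalize(h1), normalize(h2)):
        cum1 += a
        cum2 += b
        largest = max(largest, abs(cum1 - cum2))
    return largest


# Hypothetical 4-bin histograms: the current frame and the average of previous frames.
current = [4, 0, 0, 0]
previous_average = [0, 0, 0, 4]
similarity = histogram_intersection(current, previous_average)
distance = ks_distance(current, previous_average)
```

In a detector of this kind, such values would be computed per colour channel and passed to the classifier as input features.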

Paper Nr: 83
Title:

Detecting Unusual Inactivity by Introducing Activity Histogram Comparisons

Authors:

Rainer Planinc and Martin Kampel

Abstract: Unusual inactivity in the homes of elderly people is evidence that help is needed. Hence, the automatic detection of abnormal behaviour with a low number of false positives is desired. The aim of this work is to improve the accuracy of inactivity detection by introducing a new approach based on histogram comparison in order to reliably detect abnormal behaviour in the homes of elderly people. The proposed approach compares activity histograms with a pre-trained reference histogram and detects deviations from normal behaviour. Evaluation is performed on a dataset containing 103 days of activity, where six days were reported as containing "unusual" inactivity (i.e., longer absence from home) by an elderly couple.
Download

Paper Nr: 95
Title:

Photo Rating of Facial Pictures based on Image Segmentation

Authors:

Arnaud Lienhard, Marion Reinhard, Alice Caplier and Patricia Ladret

Abstract: A single glance at a face is enough to infer a first impression about someone. With the increasing number of pictures available, selecting the most suitable picture for a given use is a difficult task. This work focuses on the estimation of the image quality of facial portraits. Several image quality features, such as blur, color representation and illumination, are extracted, and it is shown that, for facial picture rating, it is better to estimate each feature separately on the different picture parts (background and foreground). The performance of the proposed image quality estimator is evaluated and compared with a subjective facial picture quality estimation experiment.
Download

Paper Nr: 108
Title:

VabCut: A Video Extension of GrabCut for Unsupervised Video Foreground Object Segmentation

Authors:

Sebastien Poullot and Shin'Ichi Satoh

Abstract: This paper introduces VabCut, a video extension of GrabCut and an original unsupervised solution to the video foreground object segmentation task. VabCut works on an extension of the RGB colour domain to RGBM, where M is the motion. It requires a prior step: the computation of the motion layer (M-layer) of the frame to segment. In order to compute this layer, we propose to intersect the frame to segment with N temporally close aligned frames. This paper also introduces a new iterative and collaborative method for optimal frame alignment, based on points of interest and RANSAC, which automatically discards outliers and refines the homographies in turns. The whole method is fully automatic and can handle standard video, i.e. non-professional, shaky or blurry footage. We tested VabCut on the SegTrack 2011 benchmark and demonstrated its effectiveness: it outperforms the state of the art methods while being faster.
Download

Paper Nr: 114
Title:

Regional SVM Classifiers with a Spatial Model for Object Detection

Authors:

Zhu Teng, Baopeng Zhang, Onecue Kim and Dong-Joong Kang

Abstract: This paper presents regional Support Vector Machine (SVM) classifiers with a spatial model for object detection. The conventional SVM maps all the features of training examples into a feature space, treats these features individually, and ignores the spatial relationship of the features. The regional SVMs with a spatial model we propose in this paper take into account a 3-dimensional relationship of features. A one-dimensional relationship is incorporated into the regional SVMs. The other, two-dimensional relationship is the pairwise relationship of regional SVM classifiers acting on features, and is modelled by a simple conditional random field (CRF). The object detection system based on the regional SVM classifiers with the spatial model is demonstrated on several public datasets, and the performance is compared with that of other object detection algorithms.
Download

Paper Nr: 134
Title:

Group Tracking and Behavior Recognition in Long Video Surveillance Sequences

Authors:

Carolina Gárate, Sofia Zaidenberg, Julien Badie and François Brémond

Abstract: This paper makes use of recent advances in group tracking and behavior recognition to process large amounts of video surveillance data from an underground railway station and perform a statistical analysis. The most important advantages of our approach are its robustness when processing long videos and its capacity to recognize several different events at once. This analysis automatically brings forward data about the usage of the station and the various behaviors of groups at different hours of the day. This data would be very hard to obtain without an automatic group tracking and behavior recognition method. We present the results and interpretation of one month of processed data from a video surveillance camera in the Torino subway.
Download

Paper Nr: 142
Title:

Collaborative Vision Network for Personalized Office Ergonomics

Authors:

Tommi Määttä, Chih-Wei Chen, Aki Härmä and Hamid Aghajan

Abstract: This paper proposes a collaborative vision network that leverages a personal webcam and cameras of the workplace to provide feedback relating to an office-worker’s adherence to ergonomic guidelines. This can lead to increased well-being for the individual and better productivity in their work. The proposed system is evaluated with a recorded multi-camera dataset from a regular office environment. First, analysis results on various ergonomic issues are presented based on personal webcams of the two workers. Second, both personal and ambient cameras are used through sensor fusion to infer the mobility state of one of the workers. Results for various fusion approaches are shown and their impact on vision network design is briefly discussed.
Download

Paper Nr: 156
Title:

In Search of a Car - Utilizing a 3D Model with Context for Object Detection

Authors:

Mikael Nilsson and Håkan Ardö

Abstract: Automatic video analysis of interactions between road users is desired for city and road planning. A first step of such a system is object localization of road users. In this work, we present a method of detecting a specific car in an intersection from a monocular camera image. A camera calibration and a segmentation are used as inputs by the method in order to detect a car. Using these inputs, a sampled search space in the ground plane, including rotations, is explored with a 3D model of a car in order to produce output in the form of rectangle detections in the ground plane. Evaluation on real recorded data, with ground truth for one car using GPS, indicates that a car can be detected over 90% of the time with an average error of around 0.5m.
Download

Paper Nr: 163
Title:

SVM-based Video Segmentation and Annotation of Lectures and Conferences

Authors:

Stefano Masneri and Oliver Schreer

Abstract: This paper presents a classification system for video lectures and conferences based on Support Vector Machines (SVM). The aim is to classify videos into four different classes (talk, presentation, blackboard, mix). On top of this, the system further analyses presentation segments to detect slide transitions, animations and dynamic content such as video inside the presentation. The developed approach uses various colour and facial features from two different datasets of several hundred hours of video to train an SVM classifier. The system performs the classification on a frame-by-frame basis and does not require pre-computed shotcut information. To avoid over-segmentation and to take advantage of the temporal correlation of succeeding frames, the results are merged every 50 frames into a single class. The presented results demonstrate the robustness and accuracy of the algorithm. Given the generality of the approach, the system can be easily adapted to other lecture datasets.
Download
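
The temporal merging step described in the abstract above (frame-level results merged every 50 frames into a single class) can be sketched as a windowed majority vote. The abstract does not state the merging rule, so the majority vote, the function name `merge_labels` and the sample labels are assumptions made for illustration.

```python
from collections import Counter


def merge_labels(frame_labels, window=50):
    """Replace each block of `window` frame labels with its most frequent label."""
    merged = []
    for start in range(0, len(frame_labels), window):
        chunk = frame_labels[start:start + window]
        # most_common(1) returns the label with the highest count in this block.
        merged.append(Counter(chunk).most_common(1)[0][0])
    return merged


# Hypothetical per-frame predictions for 100 frames.
frame_labels = ["talk"] * 30 + ["presentation"] * 20 + ["blackboard"] * 50
segments = merge_labels(frame_labels)
```

A short burst of misclassified frames inside a window is absorbed by the vote, which is one way to limit over-segmentation.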

Paper Nr: 177
Title:

What to Show? - Automatic Stream Selection among Multiple Sensors

Authors:

Rémi Emonet, E. Oberzaucher and J.-M. Odobez

Abstract: The installation of surveillance networks has been growing exponentially in the last decade. In practice, videos from large surveillance networks are almost never watched, and it is frequent to see surveillance video wall monitors showing empty scenes. There is thus a need to design methods to continuously select streams to be shown to human operators. This paper addresses this issue and makes three main contributions: it introduces and investigates, for the first time in the literature, the live stream selection task; based on the theory of social attention, it formalizes a way of obtaining some ground truth for the task and hence a way of evaluating stream selection algorithms; and finally, it proposes a two-step approach to solve this task and compares different approaches for interestingness rating using our framework. Experiments conducted on 9 cameras from a metro station and 5 hours of data randomly selected over one week show that, while complex unsupervised activity modeling algorithms achieve good performance, simpler approaches based on amount of motion perform almost as well for this type of indoor setting.
Download

Paper Nr: 182
Title:

Dense Segmentation of Textured Fruits in Video Sequences

Authors:

Waqar S. Qureshi, Shin'ichi Satoh, Matthew N. Dailey and Mongkol Ekpanyapong

Abstract: Autonomous monitoring of fruit crops based on mobile camera sensors requires methods to segment fruit regions from the background in images. Previous methods based on color and shape cues have been successful in some cases, but the detection of textured green fruits among green plant material remains a challenging problem. A recently proposed method uses sparse keypoint detection, keypoint descriptor computation, and keypoint descriptor classification followed by morphological techniques to fill the gaps between positively classified keypoints. We propose a textured fruit segmentation method based on super-pixel oversegmentation, dense SIFT descriptors, and bag-of-visual-word histogram classification within each super-pixel. An empirical evaluation of the proposed technique for textured fruit segmentation yields a 96.67% detection rate, a per-pixel accuracy of 97.657%, and a per-frame false alarm rate of 0.645%, compared to a detection rate of 90.0%, accuracy of 84.94%, and false alarm rate of 0.887% for the baseline sparse keypoint-based method. We conclude that super-pixel oversegmentation, dense SIFT descriptors, and bag-of-visual-word histogram classification are effective for in-field segmentation of textured green fruits from the background.
Download

Paper Nr: 220
Title:

Fast Violence Detection in Video

Authors:

Oscar Deniz, Ismael Serrano, Gloria Bueno and Tae-Kyun Kim

Abstract: Whereas the action recognition problem has become a hot topic within computer vision, the detection of fights or in general aggressive behavior has been comparatively less studied. Such capability may be extremely useful in some video surveillance scenarios like in prisons, psychiatric centers or even in camera phones. Recent work has considered the well-known Bag-of-Words framework often used in generic action recognition for the specific problem of fight detection. Under this framework, spatio-temporal features are extracted from the video sequences and used for classification. Despite encouraging results in which near 90% accuracy rates were achieved, the computational cost of extracting such features is prohibitive for practical applications, particularly in surveillance and media rating systems. The task of violence detection may have, however, specific features that can be leveraged. Inspired by results that suggest that kinematic features alone are discriminant for specific actions, this work proposes a novel method which uses extreme acceleration patterns as the main feature. These extreme accelerations are efficiently estimated by applying the Radon transform to the power spectrum of consecutive frames. Experiments show that accuracy improvements of up to 12% are achieved with respect to state-of-the-art action recognition methods. Most importantly, the proposed method is at least 15 times faster.
Download

Paper Nr: 231
Title:

A Bottom-up Approach to Class-dependent Feature Selection for Material Classification

Authors:

Pascal Mettes, Robby Tan and Remco Veltkamp

Abstract: In this work, the merits of class-dependent image feature selection for real-world material classification are investigated. Current state-of-the-art approaches to material classification attempt to discriminate materials based on their surface properties by using a rich set of heterogeneous local features. The primary foundation of these approaches is the hypothesis that materials can be optimally discriminated using a single combination of features. Here, a method for determining the optimal subset of features for each material category separately is introduced. Furthermore, translation and scale-invariant polar grids have been designed in this work to show that, although materials are not restricted to a specific shape, there is a clear structure in the spatial allocation of local features. Experimental evaluation on a database of real-world materials indicates that each material category indeed has its own preference. The use of both the class-dependent feature selection and polar grids results in recognition rates which exceed the current state-of-the-art results.
Download

Paper Nr: 238
Title:

Self-learning Voxel-based Multi-camera Occlusion Maps for 3D Reconstruction

Authors:

Maarten Slembrouck, Dimitri Van Cauwelaert, David Van Hamme, Dirk Van Haerenborgh, Peter Van Hese, Peter Veelaert and Wilfried Philips

Abstract: The quality of a shape-from-silhouettes 3D reconstruction technique strongly depends on the completeness of the silhouettes from each of the cameras. Static occlusion, due to e.g. furniture, makes reconstruction difficult, as we assume no prior knowledge concerning shape and size of occluding objects in the scene. In this paper we present a self-learning algorithm that is able to build an occlusion map for each camera from a voxel perspective. This information is then used to determine which cameras need to be evaluated when reconstructing the 3D model at every voxel in the scene. We show promising results in a multi-camera setup with seven cameras where the object is significantly better reconstructed compared to the state of the art methods, despite the occluding object in the center of the room.
Download

Paper Nr: 248
Title:

Constructing Facial Expression Log from Video Sequences using Face Quality Assessment

Authors:

Mohammad A. Haque, Kamal Nasrollahi and Thomas B. Moeslund

Abstract: Facial expression logs from long video sequences effectively provide the opportunity to analyse facial expression changes for medical diagnosis, behaviour analysis, and smart home management. Generating a facial expression log involves expression recognition from each frame of a video. However, expression recognition performance greatly depends on the quality of the face image in the video. When a facial video is captured, it can be subject to problems like low resolution, pose variation, low brightness, and motion blur. Thus, this paper proposes a system for constructing a facial expression log by employing a face quality assessment method and investigates its influence on the representations of facial expression logs of long video sequences. A framework is defined to incorporate face quality assessment into a facial expression recognition and logging system. While assessing the face quality, a face-completeness metric is used along with some other state-of-the-art metrics. Instead of discarding all of the low quality faces from a video sequence, a windowing approach has been applied to select the best quality faces at regular intervals. Experimental results show a good agreement between the expression logs generated from all face frames and the expression logs generated by selecting the best faces at regular intervals.
Download

Paper Nr: 249
Title:

Vision based System for Vacant Parking Lot Detection: VPLD

Authors:

Imen Masmoudi, Ali Wali, Anis Jamoussi and Adel M. Alimi

Abstract: The proposed system falls within the context of intelligent parking lot management and presents an approach for vacant parking spot detection and localization. Our system provides a camera-based solution, which can deal with outdoor parking lots. It returns the real time states of the parking lots, providing the number of available vacant places and their specific positions in order to guide the drivers through the roads. In order to address real world challenges, we propose a combination of the Adaptive Background Subtraction algorithm, to overcome the problems of changing lighting and shadow effects, with the Speeded Up Robust Features algorithm, to benefit from its robustness to scale changes and rotation. Our approach also introduces a new state, "Transition", for the classification of parking place states.
Download

Paper Nr: 275
Title:

Dynamic Scene Recognition based on Improved Visual Vocabulary Model

Authors:

Lin Yan-Hao and Lu-Fang GAO

Abstract: In this paper, we present a scene recognition framework, which can process images and recognize the scene in them. We demonstrate and evaluate the performance of our system on a dataset of Oxford typical landmarks. We put forward a novel method of local k-meriod for building a vocabulary and introduce a novel quantization method of soft-assignment based on the Gaussian mixture model. We also introduce the Gaussian model in order to classify the images into different scenes by calculating the probability that an image belongs to the scene, and we further improve the model by drawing out the consistent features and filtering out the noise features. Our experiment shows that these methods improve the classification performance.
Download

Paper Nr: 279
Title:

2D-3D Face Recognition via Restricted Boltzmann Machines

Authors:

Xiaolong Wang, Vincent Ly, Rui Guo and Chandra Kambhamettu

Abstract: This paper proposes a new scheme for the 2D-3D face recognition problem. Our proposed framework mainly consists of Restricted Boltzmann Machines (RBMs) and a correlation learning model. In the framework, a single-layer network based on RBMs is adopted to extract latent features over two different modalities. Furthermore, the latent hidden layer features of different models are projected to formulate a shared space based on correlation learning. Then several different correlation learning schemes are evaluated against the proposed scheme. We evaluate the advocated approach on a popular face dataset, FRGCV2.0. Experimental results demonstrate that the latent features extracted using RBMs are effective in improving the performance of correlation mapping for 2D-3D face recognition.
Download

Paper Nr: 289
Title:

Training Optimum-Path Forest on Graphics Processing Units

Authors:

Adriana S. Iwashita, Marcos V. T. Romero, Alexandro Baldassin, Kelton A. P. Costa and Joao P. Papa

Abstract: In this paper, we present a Graphics Processing Unit (GPU)-based training algorithm for the Optimum-Path Forest (OPF) classifier. The proposed approach employs the idea of a vector-matrix multiplication to speed up both the traditional OPF training algorithm and a recently proposed Central Processing Unit (CPU)-based OPF training algorithm. Experiments on several public datasets have shown the efficiency of the proposed approach, which proved to be up to 14 times faster for some datasets. To the best of our knowledge, this is the first GPU-based implementation of the OPF training algorithm.
Download

Paper Nr: 293
Title:

Face Recognition based on Binary Images for Link Selection

Authors:

Sanghun Lee, Soochang Kim, Young-hoon Kim and Chulhee Lee

Abstract: A face recognition system which utilizes binary facial images and a bitwise similarity calculation method is proposed for link selection between mobile devices. As a pre-processing step, normalized differences of Gaussian and facial region estimation were used to handle illumination conditions. Binary images were used to extract facial feature sets that did not exceed 700 bytes. Scale pyramids and XNOR+AND similarity scores were used for fast feature matching between reference data sets and pre-processed test data. The proposed method achieved about an 85.9% recognition rate with a database that consisted of 135 facial images with various head poses, obtained by enrolling one reference data set per subject.
Download
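
The bitwise similarity named in the abstract above (XNOR+AND scores on binary images) can be illustrated with the sketch below, in which each binary feature set is packed into a Python integer. XNOR marks the bit positions where two feature sets agree. The abstract does not say what the AND is applied to, so combining the agreement bits with a validity mask is an assumption, as are the function name `xnor_and_score` and the normalization by the number of mask bits.

```python
def xnor_and_score(a, b, mask, n_bits):
    """Fraction of masked bit positions at which `a` and `b` agree."""
    full = (1 << n_bits) - 1
    agree = ~(a ^ b) & full      # XNOR, truncated to n_bits
    valid_agree = agree & mask   # AND with the (assumed) validity mask
    considered = bin(mask & full).count("1")
    if considered == 0:
        return 0.0
    return bin(valid_agree).count("1") / considered


# Hypothetical 4-bit feature sets.
reference = 0b1100
probe = 0b1010
score_all_bits = xnor_and_score(reference, probe, mask=0b1111, n_bits=4)
score_masked = xnor_and_score(reference, probe, mask=0b1001, n_bits=4)
```

Because the score reduces to a few bitwise operations and a population count, it is cheap enough for matching on mobile devices, which is the setting the abstract targets.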

Paper Nr: 296
Title:

Boosted Random Forest

Authors:

Yohei Mishina, Masamitsu Tsuchiya and Hironobu Fujiyoshi

Abstract: The generalization ability of random forests is higher than that of other multi-class classifiers because of the effect of bagging and feature selection. Since random forests, which are based on ensemble learning, require a lot of decision trees to obtain high performance, they are not suitable for implementation on small-scale hardware such as embedded systems. In this paper, we propose boosted random forests, in which a boosting algorithm is introduced into random forests. Experimental results show that the proposed method, which consists of fewer decision trees, has a higher generalization ability compared to the conventional method.
Download

Paper Nr: 297
Title:

Deformable Part Model based Multiple Pedestrian Detection for Video Surveillance in Crowded Scenes

Authors:

Lu Wang, Xiaoli Ji, Qingxu Deng and Mingxing Jia

Abstract: Pedestrian detection is a challenging task for video surveillance. The problem becomes more difficult when occlusion is prevalent. In this paper, we extend a deformable part-based pedestrian detector to pedestrian detection in crowded scenes by considering both body part detection responses and detections' mutual spatial relationship. Specifically, we first decompose the full body detector into several body part detectors, whose detection responses can be computed efficiently from the response of the full body detector. Then, given the detection responses of the body part detectors, hypotheses are nominated by considering both detection scores and responses’ mutual spatial relationship. Finally, a local optimization process is applied to make the final decision, where an objective function encouraging detections with high confidence, high discriminability and low conflict with other detections is proposed to select the best candidate detections. Experimental results show the effectiveness of the proposed approach.

Paper Nr: 303
Title:

Logos Detection from Moving Vehicles

Authors:

A. Ben Hamida, M. Brulin and H. Nicolas

Abstract: One possible way to deal with road accidents, especially those caused by trucks carrying dangerous products, is to monitor these vehicles’ passage. We aim to develop a software technique that confirms that every vehicle entering a tunnel has safely exited it, to guarantee that no accident or breakdown has occurred inside. To implement such a solution, we identify the ingoing and outgoing trucks by extracting their significant marks, which help to differentiate each vehicle from the others. These marks are the mounted logos, such as license plates and pictograms. To confirm the safe exit of a truck, we look for the similarity between the images of the ingoing and outgoing vehicles by comparing their detected symbols. In this paper, we present a control system capable of extracting logos from moving trucks to verify their safe entries and exits. Both theoretical analyses and experimental results are provided to show the performance of the proposed system.

Paper Nr: 308
Title:

Fast Optimum-Path Forest Classification on Graphics Processors

Authors:

Marcos V. T. Romero, Adriana S. Iwashita, Luciene P. Papa, André N. Souza and João P. Papa

Abstract: Some pattern recognition techniques may incur a high computational cost for learning samples’ behaviour. The Optimum-Path Forest (OPF) classifier has recently been developed in order to overcome such drawbacks. Although it can achieve faster training steps when compared to some state-of-the-art techniques, OPF can be slower for testing in some situations. Therefore, we propose in this paper an implementation of OPF classification on graphics cards, which proved to be more efficient than traditional OPF with similar accuracy.

Paper Nr: 313
Title:

Ghost Pruning for People Localization in Overlapping Multicamera Systems

Authors:

Muhammad Owais Mehmood, Sebastien Ambellouis and Catherine Achard

Abstract: In this paper, we propose a novel ghost pruning technique for multicamera people localization in overlapping scenarios. First, a synergy map is obtained from multiplanar projections across multiple overlapping cameras. Second, an occupancy map is generated by back projection from the synergy map across various image layers. This back-projected occupancy map is combined with constraints to remove ghosts. The novelty of this paper is the introduction of an intuitive ghost pruning technique that does not require any temporal information. Experiments on a sequence of the PETS 2009 dataset show a significant reduction in the number of ghosts. Although this paper focuses on the ghost pruning module, detection metrics show results comparable to those of complete, state-of-the-art multicamera object detection systems.

Paper Nr: 321
Title:

Fall Detection using Ceiling-mounted 3D Depth Camera

Authors:

Michal Kepski and Bogdan Kwolek

Abstract: This paper proposes an algorithm for fall detection using a ceiling-mounted 3D depth camera. The lying pose is separated from common daily activities by a k-NN classifier, which was trained on features expressing head-floor distance, person area and the ratio of the shape’s major length to its width. In order to distinguish between intentional lying postures and accidental falls, the algorithm also employs motion between static postures. The experimental validation of the algorithm was conducted on realistic depth image sequences of daily activities and simulated falls. It was evaluated on more than 45000 depth images and gave 0% error. To reduce the processing load, an accelerometer was used to indicate the potential impact of the person and to trigger the analysis of depth images.
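
As a rough illustration of the classification step mentioned in this abstract, the sketch below runs a k-NN majority vote over three-dimensional feature vectors (head-floor distance, person area, length-to-width ratio). The function name `knn_predict`, the feature values, the labels, the Euclidean distance and the choice k = 3 are all assumptions made for this example; none of them is taken from the paper.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training samples nearest to query.

    train: list of (feature_tuple, label) pairs; query: feature tuple.
    """
    # Rank training samples by Euclidean distance to the query and keep k.
    neighbours = sorted(train, key=lambda s: math.dist(s[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical samples: (head-floor distance [m], person area [m^2], length/width ratio)
TRAIN = [
    ((1.60, 0.20, 1.2), "upright"),
    ((1.50, 0.25, 1.3), "upright"),
    ((1.10, 0.30, 1.1), "sitting"),
    ((0.20, 0.80, 3.0), "lying"),
    ((0.25, 0.75, 2.8), "lying"),
    ((0.30, 0.70, 3.2), "lying"),
]

print(knn_predict(TRAIN, (0.22, 0.78, 2.9)))
```

In practice the features would probably need to be scaled to comparable ranges before distances are computed; that step is omitted here for brevity.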

Paper Nr: 339
Title:

Novel Parallel Algorithm for Object Recognition with the Ensemble of Classifiers based on the Higher-Order Singular Value Decomposition of Prototype Pattern Tensors

Authors:

Boguslaw Cyganek and Katarzyna Socha

Abstract: In this paper, a novel parallel algorithm for tensor-based classifiers for object recognition in digital images is presented. Classification is performed with an ensemble of base classifiers, each operating in the orthogonal subspaces obtained with the Higher-Order Singular Value Decomposition (HOSVD) of the prototype pattern tensors. Parallelism of the system is realized through functional and data decompositions on different levels of computation. First, the parallel implementation of the HOSVD is presented. Then, the second level of parallelism is gained by partitioning the input dataset. Each of the partitions is used to train a separate tensor classifier of the ensemble. In addition to the computational speed-up and lower memory requirements, the accuracy of the ensemble proved to be higher than that of a single classifier. The method was tested in the context of object recognition in computer vision. The experiments show high accuracy and accelerated performance in both the training and classification stages.

Paper Nr: 348
Title:

A Robust Metric for the Evaluation of Visual Saliency Models

Authors:

Puneet Sharma and Ali Alsam

Abstract: Finding a robust metric for evaluating visual saliency algorithms has been the subject of research for decades. Motivated by the shuffled AUC metric, in this paper we propose a robust AUC metric that uses the statistical analysis of the fixation data to better judge the goodness of the different saliency algorithms. To calculate the robust AUC metric, we use the first eigenvector obtained from the statistical analysis to define the area from which non-fixations are selected, thus mitigating the effect of the center bias. Our results show that the proposed metric yields performance similar to that of the shuffled AUC metric, but given that the proposed metric is derived from the statistics of the data set, we believe that it is more robust.
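
For orientation, an AUC score of the general kind discussed in this abstract can be computed from saliency values sampled at fixated and non-fixated locations. The sketch below uses the rank-based (Mann-Whitney) formulation of the AUC. The function name `auc` and the sample values are hypothetical, and the sketch does not implement the eigenvector-based selection of non-fixations that the paper proposes; how the non-fixated samples are chosen is left to the caller.

```python
def auc(fixated, non_fixated):
    """Estimate the probability that a fixated location is more salient than a
    non-fixated one. Ties count as half a win."""
    wins = 0.0
    for f in fixated:
        for n in non_fixated:
            if f > n:
                wins += 1.0
            elif f == n:
                wins += 0.5
    return wins / (len(fixated) * len(non_fixated))


# Hypothetical saliency values read from a saliency map.
fix = [0.9, 0.8, 0.4]
non = [0.5, 0.3, 0.1]
print(auc(fix, non))
```

A score of 0.5 corresponds to chance level and 1.0 to perfect separation of fixated from non-fixated locations.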

Paper Nr: 398
Title:

Texture Classification with Fisher Kernel Extracted from the Continuous Models of RBM

Authors:

Tayyaba Azim and Mahesan Niranjan

Abstract: In this paper, we introduce a novel technique of deriving Fisher kernels from the Gaussian Bernoulli restricted Boltzmann machine (GBRBM) and the factored 3-way restricted Boltzmann machine (FRBM) to yield better texture classification results. Both GBRBM and FRBM are stochastic probabilistic models that have already shown their suitability for modelling real-valued continuous data; however, they are not efficient models for classification based on their likelihood performances (Jaakkola and Haussler, 1999; Azim and Niranjan, 2013). We induce discrimination in these models with the help of a Fisher kernel that is constructed from the gradients of the parameters of the generative model. From the empirical results shown on two different texture data sets, i.e. Emphysema and Brodatz, we demonstrate how a useful texture classifier could be built from a very compact generative model that represents the data in the Fisher score space discriminately. The proposed discriminative technique allows us to achieve competitive classification performance on texture data sets without expanding the size of the generative model with a large number of hidden units. Also, comparative analysis shows that the factored 3-way RBM is a good representative model of textures, giving rise to a Fisher score space that is less sparse and efficient for classification.

Area 4 - Applications and Services

Full Papers
Paper Nr: 131
Title:

On the Usage of Sensor Pattern Noise for Picture-to-Identity Linking through Social Network Accounts

Authors:

Riccardo Satta and Pasquale Stirparo

Abstract: Digital imaging devices have gained an important role in everyone’s life, due to continuously decreasing prices and the growing interest in photo sharing through social networks. As a result, everyone continuously leaves visual “traces” of his/her presence and life on the Internet, which can constitute precious data for forensic investigators. Digital Image Forensics is the task of analysing such digital images to collect evidence. In this field, the recent introduction of techniques able to extract a unique “fingerprint” of the source camera of a picture, e.g. based on the Sensor Pattern Noise (SPN), has paved the way for a series of useful tools for the forensic investigator. In this paper, we propose a novel usage of SPN: finding social network accounts belonging to a certain person of interest who has shot a given photo. This task, which we name Picture-to-Identity linking, can be useful in a variety of forensic cases, e.g., finding stolen camera devices, cyber-bullying, or on-line child abuse. We experimentally test a method for Picture-to-Identity linking on a benchmark data set of publicly accessible social network accounts collected from the Internet. We report promising results, which show that such a technique has practical value for forensic practitioners.

Paper Nr: 198
Title:

Towards a Heuristic based Real Time Hybrid Rendering - A Strategy to Improve Real Time Rendering Quality using Heuristics and Ray Tracing

Authors:

Paulo Andrade, Thales Sabino and Esteban Clua

Abstract: Hybrid rendering combines the speed of raster-based rendering with the photorealism of ray-traced rendering in order to achieve both speed and visual quality for interactive applications. Since ray tracing images is a demanding task, a hybrid renderer must use ray tracing carefully in order to maintain an acceptable frame rate. Fixed solutions, where only shadows or reflective objects are ray traced, not only cannot guarantee real-time performance, but can also waste processing if the final result differs only minimally from a raster-only result. In our work, we present a method to improve hybrid rendering by analysing the scene in real time and deciding what should be ray traced, in order to provide the best visual experience within acceptable frame rates.

Paper Nr: 200
Title:

Iris Liveness Detection Methods in Mobile Applications

Authors:

Ana F. Sequeira, Juliano Murari and Jaime S. Cardoso

Abstract: Biometric systems are vulnerable to different kinds of attacks. In particular, systems based on the iris are vulnerable to direct attacks, which consist of presenting a fake iris to the sensor in an attempt to access the system as if it came from a legitimate user. The analysis of some countermeasures against this type of attack is the problem addressed in the present paper. Several state-of-the-art methods were implemented and included in a feature selection framework so as to determine the best cardinality and the best subset that leads to the highest classification rate. Three different classifiers were used: Discriminant analysis, K nearest neighbours and Support Vector Machines. The implemented methods were tested on existing databases for iris liveness purposes (Biosec and Clarkson) and on a new fake database which was constructed for the evaluation of iris liveness detection methods in the mobile scenario. The results suggest that this new database is more challenging than the others. Therefore, improvements are required in this line of research to achieve good performance in real-world mobile applications.

Paper Nr: 244
Title:

Virtual Touch Screen “VIRTOS” - Implementing Virtual Touch Buttons and Virtual Sliders using a Projector and Camera

Authors:

Takashi Homma and Katsuto Nakajima

Abstract: We propose a large interactive display with virtual touch buttons and sliders on a pale-colored flat wall. Our easy-to-install system consists of a front projector and a single commodity camera. A button touch is detected based on the area of the shadow cast by the user’s hand; this shadow becomes very small when the button is touched. The shadow area is segmented by a brief change of the button to a different color when a large foreground (i.e., the hand and its shadow) covers the button region. Therefore, no time consuming operations, such as morphing or shape analysis, are required. Background subtraction is used to extract the foreground region. The reference image for the background is continuously adjusted to match the ambient light. Our virtual slider is based on this touch-button mechanism. When tested, our scheme proved robust to differences in illumination. The response time for touch detection was about 150 ms. Our virtual slider has a quick response and proved suitable as a controller for a Breakout-style game.
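
The touch test described in this abstract (a large foreground over the button, but only a very small shadow) can be sketched as a simple area check. The function names, the boolean-mask representation and both threshold values below are hypothetical choices for illustration; the abstract does not state how the authors set these quantities.

```python
def area_fraction(mask):
    """Fraction of truthy pixels in a 2D mask given as a list of rows."""
    total = sum(len(row) for row in mask)
    return sum(1 for row in mask for p in row if p) / total


def is_touch(foreground_mask, shadow_mask, min_foreground=0.3, max_shadow=0.05):
    """Report a touch when the button region is largely covered by foreground
    (hand plus shadow) while the shadow alone is very small.

    Both thresholds are assumed values, not figures from the paper.
    """
    covered = area_fraction(foreground_mask) >= min_foreground
    small_shadow = area_fraction(shadow_mask) <= max_shadow
    return covered and small_shadow


# Hypothetical 4x5 button region: 1 marks a foreground or shadow pixel.
foreground = [[1, 1, 1, 0, 0]] * 4                      # 60% of the region covered
hover_shadow = [[0, 0, 1, 1, 0]] * 4                    # hand above the wall: wide shadow
touch_shadow = [[0, 0, 0, 0, 0]] * 3 + [[0, 0, 1, 0, 0]]  # hand on the wall: thin shadow

print(is_touch(foreground, hover_shadow), is_touch(foreground, touch_shadow))
```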

Paper Nr: 245
Title:

Quantitative Analysis of Pulmonary Emphysema using Isotropic Gaussian Markov Random Fields

Authors:

Chathurika Dharmagunawardhana, Sasan Mahmoodi, Michael Bennett and Mahesan Niranjan

Abstract: A novel texture feature based on isotropic Gaussian Markov random fields is proposed for diagnosis and quantification of emphysema and its subtypes. Spatially varying parameters of isotropic Gaussian Markov random fields are estimated, and their local distributions, constructed using normalized histograms, are used as effective texture features. These features integrate the essence of both statistical and structural properties of the texture. Isotropic Gaussian Markov random field parameter estimation is computationally more efficient than methods using other MRF models and is suitable for classification of emphysema and its subtypes. Results show that the novel texture features can perform well in discriminating different lung tissues, giving results comparable with the current state of the art in texture-based emphysema quantification. Furthermore, supervised lung parenchyma tissue segmentation is carried out, and effective pathology extents and successful tissue quantification are achieved.

Paper Nr: 286
Title:

Event-driven Dynamic Platform Selection for Power-aware Real-time Anomaly Detection in Video

Authors:

Calum G Blair and Neil M Robertson

Abstract: In surveillance and scene awareness applications using power-constrained or battery-powered equipment, performance characteristics of processing hardware must be considered. We describe a novel framework for moving processing platform selection from a single design-time choice to a continuous run-time one, greatly increasing flexibility and responsiveness. Using Histogram of Oriented Gradients (HOG) object detectors and Mixture of Gaussians (MoG) motion detectors running on 3 platforms (FPGA, GPU, CPU), we characterise processing time, power consumption and accuracy of each task. Using a dynamic anomaly measure based on contextual object behaviour, we reallocate these tasks between processors to provide faster, more accurate detections when an increased anomaly level is seen, and reduced power consumption in routine or static scenes. We compare power- and speed-optimised processing arrangements with automatic event-driven platform selection, showing the power and accuracy trade-offs between them. Real-time performance is evaluated on a parked vehicle detection scenario using the i-LIDS dataset. Automatic selection is 10% more accurate than power-optimised selection, at the cost of 12W higher average power consumption in a desktop system.

Paper Nr: 340
Title:

Optimization of Endoscopic Video Stabilization by Local Motion Exclusion

Authors:

Thomas Gross, Navya Amin, Marvin C. Offiah, Susanne Rosenthal, Nail El-Sourani and Markus Borschbach

Abstract: To date, video stabilization algorithms have been proposed for different types of videos. Our work focuses mainly on developing stabilization algorithms for endoscopic videos, which include distortions peculiar to endoscopy. In this paper, we deal with the optimization of the motion detection procedure, which is the most important step in the development of a video stabilization algorithm. We present a robust motion estimation procedure to enhance the quality of the outcome. The outcome of the later steps in the stabilization, namely motion compensation and image composition, depends on the level of precision of the motion estimation step. The results of a previous version of the stabilization algorithm are compared here to those of a new optimized version. Furthermore, the improvements in the outcomes, as assessed with video quality estimation methods, are also discussed.

Short Papers
Paper Nr: 30
Title:

Mastering the Art of Persuasion - Intelligent Tutoring System for Presenters

Authors:

Anh-Tuan Nguyen, Wei Chen and Matthias Rauterberg

Abstract: Public speaking is a non-trivial task since it is affected by how nonverbal behaviors are expressed. Practicing the delivery of appropriate expressions is difficult because they are mostly produced subconsciously. This paper presents our empirical study on the nonverbal behaviors of presenters. Such information was used as the ground truth to develop an intelligent tutoring system. The system can capture bodily characteristics of presenters via a depth camera, interpret this information in order to assess the quality of the presentation, and then give feedback to users. Feedback is delivered immediately through a virtual conference room, in which the reactions of the simulated avatars can be controlled based on the performance of presenters.

Paper Nr: 78
Title:

Improved Pulse Detection from Head Motions using DCT

Authors:

Ramin Irani, Kamal Nasrollahi and Thomas B. Moeslund

Abstract: The pulsation of the heart sends blood throughout the body. The rate at which the heart performs this vital task, the heartbeat rate, is of crucial importance to the body. Therefore, measuring the heartbeat rate, a.k.a. pulse detection, is very important in many applications, especially medical ones. To measure it, physicians traditionally either sense the pulsations of some blood vessels or install sensors on the body. In either case, physical contact between the sensor and the body is needed to obtain the heartbeat rate. This might not always be feasible, for example, in applications like remote patient monitoring. In such cases, contactless sensors, mostly based on computer vision techniques, are emerging as interesting alternatives. This paper proposes such a system, in which the heartbeats (pulses) are detected from subtle motions that appear on the face due to blood circulation. The proposed system has been tested under different facial expressions. The experimental results show that the proposed system is correct and robust and outperforms the state of the art.

Paper Nr: 94
Title:

Client-side Mobile Visual Search

Authors:

Andreas Hartl, Dieter Schmalstieg and Gerhard Reitmayr

Abstract: Visual search systems present a simple way to obtain information about our surroundings, our location or an object of interest. Typically, mobile applications of visual search remotely connect to large-scale systems capable of dealing with millions of images. Querying such systems may induce considerable delays, which can severely harm usability or even lead to complete rejection by the user. In this paper, we investigate an interim solution and system design using a local visual search system for embedded devices. We optimized a traditional visual search system to decrease runtime and storage space in order to scale to thousands of training images on current off-the-shelf smartphones. We demonstrate practical applicability in a prototype for mobile visual search on the same target platform. Compared with the unmodified version of the pipeline, we achieve up to a two-fold speed-up in runtime, save 85% of storage space and provide substantially increased recognition performance. In addition, we integrate the pipeline with a popular Augmented Reality SDK on Android devices and use it as a pre-selector for tracking datasets. This allows a large number of tracking targets to be used instantly, without requiring user intervention or costly server-side recognition.

Paper Nr: 116
Title:

MobBIO: A Multimodal Database Captured with a Portable Handheld Device

Authors:

Ana F. Sequeira, João C. Monteiro, Ana Rebelo and Hélder P. Oliveira

Abstract: Biometrics represents a return to a natural way of identification: testing someone by what (s)he is, instead of relying on something (s)he owns or knows, seems likely to be the way forward. Biometric systems that include multiple sources of information are known as multimodal. Such systems are generally regarded as an alternative that addresses a variety of problems that all unimodal systems stumble upon. One of the main challenges in the development of biometric recognition systems is the shortage of publicly available databases acquired under real, unconstrained working conditions. Motivated by this need, the MobBIO database was created using an Asus EeePad Transformer tablet, with mobile biometric systems in mind. The proposed database is composed of three modalities: iris, face and voice.

Paper Nr: 132
Title:

Towards Fully Automated Person Re-identification

Authors:

Matteo Taiana, Dario Figueira, Athira Nambiar, Jacinto Nascimento and Alexandre Bernardino

Abstract: In this work we propose an architecture for fully automated person re-identification in camera networks. Most works on re-identification operate with manually cropped images both for the gallery (training) and the probe (test) set. However, in a fully automated system, re-identification algorithms must work in series with person detection algorithms, whose output may contain false positives, detections of partially occluded people and detections with bounding boxes misaligned to the people. These effects, when left untreated, may significantly jeopardise the performance of the re-identification system. To tackle this problem we propose modifications to classical person detection and re-identification algorithms, which enable the full system to deal with occlusions and false positives. We show the advantages of the proposed method on a fully labelled video data set acquired by 8 high-resolution cameras in a typical office scenario at working hours.

Paper Nr: 135
Title:

Appearance-based Eye Control System by Manifold Learning

Authors:

Ke Liang, Youssef Chahir, Michèle Molina, Charles Tijus and François Jouen

Abstract: Eye movements are increasingly employed to study usability issues in HCI (Human-Computer Interaction) contexts. In this paper, we introduce our appearance-based eye control system, which utilizes 5 specific eye movements for HCI applications: the closed-eye movement and eye movements with gaze fixation at four positions (up, down, right, left). In order to measure these eye movements, we employ a fast appearance-based gaze tracking method with a manifold learning technique. First, we propose to concatenate the local eye appearance Center-Symmetric Local Binary Pattern (CS-LBP) descriptors of each subregion of the eye image to form an eye appearance feature vector. The calibration phase is then introduced to construct a training set by spectral clustering. After that, Laplacian Eigenmaps are applied to the training set and the unseen input together to obtain the structure of the eye manifolds. Finally, we can infer the eye movement of the new input from its distances to the clusters in the training set. Experimental results demonstrate that our system, with a quick 4-point calibration, not only reduces the run-time cost but also provides another way to measure eye movements without measuring gaze coordinates for an HCI application such as our eye control system.

Paper Nr: 176
Title:

Shape Segmentation using Medial Point Clouds with Applications to Dental Cast Analysis

Authors:

Jacek Kustra, Andrei Jalba and Alexandru Telea

Abstract: We present an automatic surface segmentation method for dental cast scans based on the point density properties of the surface skeleton of such shapes. We produce quasi-flat segments separated by soft ridges, in contrast to classical surface segmentation methods that require sharp ridges. We compute the surface skeleton by a fast 3D skeletonization technique followed by its regularization using surface geodesics. We segment the resulting skeleton by a mean-shift approach and transfer the segmentation results back to the surface. We demonstrate our results on an industrial dental-cast segmentation application and several generic 3D shape models.

Paper Nr: 261
Title:

Automated Arteriole and Venule Recognition in Retinal Images using Ensemble Classification

Authors:

M. M. Fraz, A. R. Rudincka, C. G. Owen, D. P. Strachan and S. A. Barman

Abstract: The shape and size of retinal vessels have been prospectively associated with cardiovascular outcomes in adult life, and with cardiovascular precursors in early life, suggesting life course patterning of vascular development. However, the shape and size of arterioles and venules may show similar or opposing associations with disease precursors / outcomes. Hence, accurate detection of vessel type is important when considering cardio-metabolic influences on vascular health. This paper presents an automated method of identifying arterioles and venules, based on colour features, using an ensemble classifier of bootstrapped decision trees. The classifier utilizes pixel-based features, vessel-profile-based features and vessel-segment-based features from both RGB and HIS colour spaces. To the best of our knowledge, this is the first time a decision-tree-based ensemble classifier has been used for arteriole/venule classification. The classification is performed across the entire image, including the optic disc. The methodology is evaluated on 3149 vessel segments from 40 colour fundus images acquired from an adult population-based study in the UK (EPIC Norfolk), resulting in an 83% detection rate. This methodology can be further developed into an automated system for measurement of the arterio-venous ratio and quantification of arterio-venous nicking in retinal images, which may be of use in identifying those at high risk of cardiovascular events and in need of early intervention.

Paper Nr: 276
Title:

Image Registration to Assist the Diagnosis of Pelvic Floor Disorder in MR Defecography

Authors:

Cicero L. Costa, Marcos A. Batista, Denise Guliato, Tulio A. A. Macedo and Celia Z. Barcelos

Abstract: Over the last decades, interest in the use of defecography for the investigation of defecation problems and pelvic floor disorders has increased. MR defecography assists in the diagnosis of pelvic floor weakening, fecal incontinence, painful defecation and genital prolapse. To identify an abnormal morphological variation of the structures relevant for the diagnosis, the radiologist derives several static measures at different moments and during different maneuvers in the exam. However, there is poor agreement between independent observers in the measurement of the anorectal angle, which is a critical parameter for the interpretation of the defecography. With the aim of reducing inter-observer variability and assisting the radiologist in the interpretation of MR defecography for the diagnosis of fecal incontinence, we propose calculating the dynamic changes of the anorectal junction during the defecation activity. To that end, we propose to automatically propagate the location of pre-defined landmarks throughout the frames of the MR defecography, for each maneuver, via image registration based on a variational model. The analysis of the results shows that our proposal succeeded in propagating the initial landmarks and in calculating the dynamic changes during each maneuver.