VISAPP 2012 Abstracts


Area 1 - Image Formation and Preprocessing

Full Papers
Paper Nr: 82
Title:

DYNAMIC WEIGHTING BASED ACTIVE CURVE PROPAGATION METHOD FOR VIDEO OBJECT SELECTION

Authors:

Marwen Nouri, Emmanuel Marilly, Olivier Martinot and Nicole Vincent

Abstract: Improving the video user experience is an essential task, allowing video-based algorithms and systems to be more user-friendly. This paper addresses the problem of video object selection by introducing a new interactive framework based on the minimization of the Active Curve energy. Prior assumptions and supervised learning can be used to segment images using both color and morphological information. To deal with the segmentation of arbitrary high-level objects, user interaction is needed to avoid the semantic gap. Hard constraints such as scribbles can be drawn by the user on the first video frame to roughly mark the object of interest, and they are then automatically propagated to designate the same object in the remainder of the sequence. The resulting scribbles can be used as hard constraints to achieve the whole segmentation process. The active curve model is adapted and new forces are included to govern the evolution of the curves frame by frame. A spatiotemporal optimization is used to ensure a coherent propagation. To avoid the weight definition problem found in classical active curve based algorithms, a new concept of dynamically adjusted weighting is introduced in order to improve the robustness of our curve propagation.

Paper Nr: 106
Title:

ARTIFACT-FREE JPEG DECOMPRESSION WITH TOTAL GENERALIZED VARIATION

Authors:

Kristian Bredies and Martin Holler

Abstract: We propose a new model for the improved reconstruction of JPEG (Joint Photographic Experts Group) images. Given a JPEG compressed image, our method first determines the set of possible source images and then specifically chooses one of these source images satisfying additional regularity properties. This is realized by employing the recently introduced Total Generalized Variation (TGV) as regularization term and solving a constrained minimization problem. In order to obtain an optimal solution numerically, we propose a primal-dual algorithm. We have developed a parallel implementation of this algorithm for the CPU and the GPU, using OpenMP and Nvidia’s Cuda, respectively. Finally, experiments have been performed, confirming a good visual reconstruction quality as well as the suitability for real-time application.
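
One way to read "the set of possible source images": JPEG stores each DCT coefficient as an integer obtained by dividing by a quantization step and rounding, so each coefficient of a candidate source image has to lie in an interval around the dequantized value. The following is a minimal Python sketch of that constraint only, assuming round-to-nearest quantization and ignoring color subsampling and pixel clipping. The function names are hypothetical and not taken from the paper, and the TGV regularization and primal-dual solver are not shown.

```python
def coefficient_interval(q_index, q_step):
    """Range of DCT coefficient values that round to q_index at step q_step."""
    center = q_index * q_step
    return (center - 0.5 * q_step, center + 0.5 * q_step)


def project_onto_interval(value, q_index, q_step):
    """Clamp a candidate coefficient back into its admissible interval."""
    lo, hi = coefficient_interval(q_index, q_step)
    return min(max(value, lo), hi)
```

A constrained solver could call a projection of this kind on every coefficient after each update, so that the iterate stays consistent with the compressed file.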

Paper Nr: 148
Title:

DEPTH INPAINTING WITH TENSOR VOTING USING LOCAL GEOMETRY

Authors:

Mandar Kulkarni, A. N. Rajagopalan and Gerhard Rigoll

Abstract: Range images captured from range scanning devices or reconstructed from optical cameras often suffer from missing regions due to occlusions, reflectivity, limited scanning area, sensor imperfections, etc. In this paper, we propose a fast and simple algorithm for range map inpainting using the Tensor Voting (TV) framework. From a single range image, we gather and analyze geometric information so as to estimate missing depth values. To deal with large missing regions, TV-based segmentation is initially employed as a cue for region filling. Subsequently, we use 3D tensor voting for estimating different plane equations and pass depth estimates from all possible local planes that pass through a missing region. A final pass of tensor voting is performed to choose the best depth estimate for each point in the missing region. We demonstrate the effectiveness of our approach on synthetic as well as real data.

Paper Nr: 149
Title:

GENERATION OF HDR IMAGES IN NON-STATIC CONDITIONS BASED ON GRADIENT FUSION

Authors:

Sira Ferradans, Marcelo Bertalmío, Edoardo Provenzi and Vicent Caselles

Abstract: We present a new method for the generation of HDR images in non-static conditions, i.e. hand-held camera and/or dynamic scenes, based on gradient fusion. Given a reference image selected from a set of LDR pictures of the same scene taken with multiple exposure times, our method improves the detail rendition of its radiance map by adding information suitably selected and interpolated from the companion images. The proposed technique is free from ghosting and bleeding, two typical artifacts of HDR images built through image fusion in non-static conditions. The advantages provided by the gradient fusion approach are supported by a comparison between our results and those of the state of the art.

Paper Nr: 150
Title:

HIGH RESOLUTION SURVEILLANCE VIDEO COMPRESSION - Using JPEG2000 Compression of Random Variables

Authors:

Octavian Biris and Joseph L. Mundy

Abstract: This paper proposes a scheme for efficient compression of wide-area aerial video collectors (WAVC) data, based on background modeling and foreground detection using a Gaussian mixture at each pixel. The method implements the novel approach of treating the pixel intensities and wavelet coefficients as random variables. A modified JPEG 2000 algorithm based on the algebra of random variables is then used to perform the compression on the model. This approach leads to a very compact model which is selectively decompressed only in foreground regions. The resulting compression ratio is on the order of 16:1 with minimal loss of detail for moving objects.

Paper Nr: 156
Title:

DESIGN OF A COMPRESSIVE REMOTE IMAGING SYSTEM COMPENSATING A HIGHLY LIGHTWEIGHT ENCODING WITH A REFINED DECODING SCHEME

Authors:

George Tzagkarakis, Arnaud Woiselle, Panagiotis Tsakalides and Jean-Luc Starck

Abstract: Lightweight remote imaging systems have been increasingly used in surveillance and reconnaissance. Nevertheless, the limited power, processing and bandwidth resources are a major issue for the existing solutions, not well addressed by the standard video compression techniques. On the one hand, the MPEGx family achieves a balance between the reconstruction quality and the required bit-rate by exploiting potential intra- and inter-frame redundancies at the encoder, but at the cost of increased memory and processing demands. On the other hand, the M-JPEG approach consists of a computationally efficient encoding process, with the drawback of resulting in much higher bit-rates. In this paper, we cope with the growing compression ratios required for all remote imaging applications by exploiting the inherent property of compressive sensing (CS) of acting simultaneously as a sensing and compression framework. The proposed compressive video sensing (CVS) system incorporates the advantages of a very simple CS-based encoding process, while putting the main computational burden on the decoder, which combines the efficiency of a motion compensation procedure for the extraction of inter-frame correlations with an additional super-resolution step to enhance the quality of reconstructed frames. The experimental results reveal a significant improvement of the reconstruction quality when compared with M-JPEG, at equal or even lower bit-rates.

Paper Nr: 201
Title:

A NEW REGION-BASED PDE FOR PERCEPTUAL IMAGE RESTORATION

Authors:

Baptiste Magnier, Philippe Montesinos and Daniel Diep

Abstract: In this paper, we present a new image regularization method using a rotating smoothing filter. The novelty of this approach resides in the mixing of ideas coming both from pixel classification, which determines roughly whether a pixel belongs to a homogeneous region or an edge, and from an anisotropic perceptual edge detector, which computes two precise diffusion directions. These directions are used by an anisotropic diffusion scheme. This anisotropic diffusion is accurately controlled near edges and corners, while isotropic diffusion is applied to smooth homogeneous and highly noisy regions. Our results and a comparison with anisotropic diffusion methods applied to a real image show that our model is able to efficiently regularize images and to control the diffusion.

Paper Nr: 234
Title:

IMAGE RESTORATION VIA HUMAN PERCEPTION AND LIE GROUPS

Authors:

Vittoria Bruni, Elisa Rossi and Domenico Vitulano

Abstract: This paper presents a novel and user-independent model for image restoration based on Lie group transformations. These make it possible to define a redundant set of transformations, from which the ones that best invert the physical formation of the defect are automatically selected. Hence restoration consists of gradually reducing the visual perception of the defect. Extensive experimental results on original photographs from the Alinari Archive affected by semi-transparent blotches show the potential of the proposed approach in removing degradation in different contexts without altering the original content of artworks.

Short Papers
Paper Nr: 13
Title:

FAST SINGLE IMAGE DEFOGGING ALGORITHM BASED ON GUIDED FILTER

Authors:

Li Peng, Chuangbai Xiao, Juan Duan and Xiaoliang Yu

Abstract: In this paper, a fast algorithm for removing haze from a single image is presented, based on a physical model. The main contributions of our algorithm compared with others are that it reduces halo effects physically and that it is fast. In this algorithm, we replace the median of median filter with the guided filter to obtain the atmospheric veil and handle the details of object edges. As the guided filter is a fast and efficient edge-preserving filter, our algorithm has lower time and space complexity. Several experiments show that our algorithm can recover the image quickly, avoid halo and blocking effects physically, and greatly improve the visibility of the image.

Paper Nr: 36
Title:

PEOPLE DETECTION AND RE-IDENTIFICATION FOR MULTI SURVEILLANCE CAMERAS

Authors:

Etienne Corvee, Slawomir Bak and François Brémond

Abstract: Re-identifying people in a network of non-overlapping cameras requires people to be accurately detected and tracked in order to build a strong visual signature of their appearance. Traditional surveillance cameras do not provide high enough image resolution for iris recognition algorithms. State-of-the-art face recognition cannot be easily applied to surveillance videos, as people need to be facing the camera at close range. The different lighting environment in each camera scene and the strong illumination variability occurring as people walk through a scene induce great variability in their appearance. In addition, images of people occlude each other on the image plane, making people detection difficult to achieve. We propose novel simplified Local Binary Pattern features to detect people, heads and faces. A Mean Riemannian Covariance Grid (MRCG) is used to model the appearance of tracked people to obtain a highly discriminative human signature. The methods are evaluated and compared with state-of-the-art algorithms. We have created a new dataset from a network of 2 cameras, showing the usefulness of our system to detect, track and re-identify people using appearance and face features.
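
For readers unfamiliar with Local Binary Patterns, the sketch below computes the classic 8-neighbour LBP code of the centre pixel of a 3x3 patch in Python. It illustrates the standard operator only. The simplified variant proposed in the paper is not specified in the abstract, and the function name and neighbour ordering here are assumptions made for illustration.

```python
def lbp_code(patch):
    """Classic 8-neighbour LBP code of the centre pixel of a 3x3 patch.

    patch is a list of three rows of three intensity values each.
    """
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    order = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(order):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code
```

A detector would typically histogram such codes over image cells and feed the histograms to a classifier.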

Paper Nr: 40
Title:

ITERATIVE DISCRETE STEERING OF THREE-DIMENSIONAL RECONSTRUCTIONS FROM X-RAY IMAGES WITH LIMITED ANGLE

Authors:

Anja Frost, Eike Renners and Michael Hötter

Abstract: Computer Tomography aims to calculate a three-dimensional reconstruction of the inside of an object from a series of X-ray images. This calculation corresponds to the solution of a system of linear equations, in which the equations arise from the measured X-rays and the variables from the voxels of the reconstruction volume, or more precisely, their density values. Unfortunately, some applications do not supply enough equations. In that case, the system is underdetermined. The reconstructed object, being only an estimate, appears stretched. As a few voxels already represent the object faithfully, it is possible to exclude these variables from the system of equations. The number of variables then decreases and, ideally, the system becomes solvable. In this paper we concentrate on the detection of all well-reconstructed voxels, i.e. we introduce a quality measure, called Accuratio, to evaluate the volume voxel by voxel. In our experimental results we show the reliability of Accuratio by applying it to an iterative reconstruction algorithm. In each iteration step the whole volume is evaluated, voxels with high Accuratio are excluded and the new system of equations is reconstructed again. Steadily the reconstructed object becomes "destretched".
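
The idea of excluding trusted voxels from the linear system can be sketched in a few lines of Python. The sketch assumes the system is given as a dense matrix A and right-hand side b, and that the values of the trusted variables are supplied in a dictionary. The function name is hypothetical, and the Accuratio measure that decides which voxels to trust is not modelled here.

```python
def exclude_known(A, b, known):
    """Move the columns of known variables to the right-hand side.

    A is a list of rows, b a list of measurements, and known maps a
    column index to its trusted value. Returns the reduced matrix, the
    reduced right-hand side, and the indices of the remaining variables.
    """
    free = [j for j in range(len(A[0])) if j not in known]
    reduced_A = [[row[j] for j in free] for row in A]
    reduced_b = [bi - sum(row[j] * v for j, v in known.items())
                 for row, bi in zip(A, b)]
    return reduced_A, reduced_b, free
```

For example, two equations in three unknowns are underdetermined, but fixing one unknown leaves a two-by-two system that can be solved directly.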

Paper Nr: 45
Title:

IMAGE QUALITY EVALUATION FOR IMPROVING IRIS RECOGNITION SYSTEMS

Authors:

Nadia Feddaoui, Hela Mahersia and Kamel Hamrouni

Abstract: In a classical iris biometric system, images of low quality considerably limit the efficiency of the process. There is no perfect method for recognition. Commonly, effectiveness depends on the segmentation and characterization results given by the intermediary steps, which may affect the final recognition process. To circumvent these defects, we propose a complete quality evaluation method integrating units at different steps of the recognition system. The first unit is used after image acquisition to detect and discard out-of-focus and motion-blurred images. The second unit, added to the segmentation and normalisation step, aims to eliminate improperly segmented images and those which do not have sufficient textural information. In addition, an important feature of the proposed method is that a measure quantifying the texture information is taken into account in the decision-making. Indeed, we have the possibility of deciding the dissimilarity of two irises in the verification mode without the need for a feature extraction stage. We have integrated the proposed quality evaluation method into an iris recognition system based on multi-channel Gabor filters. Tests were carried out on left-eye and right-eye images of the Casia-v3-interval iris database. Experimental results show that the proposed quality measures are relevant to improve the system performance and optimize the system in decision.

Paper Nr: 84
Title:

CODED APERTURE STEREO - For Extension of Depth of Field and Refocusing

Authors:

Yuichi Takeda, Shinsaku Hiura and Kosuke Sato

Abstract: Image acquisition techniques using coded apertures have been intensively investigated to improve the performance of image deblurring and depth estimation. Generally, estimation of the scene depth is a key issue in the recovery of optical blur because the size of the blur kernel varies according to the depth. However, since it is hard to estimate the depth of a scene with a single image, most successful methods use several images with different optical parameters captured by a specially developed camera with expensive internal optics. On the other hand, a stereo camera configuration is widely used to obtain the depth map of a scene. Therefore, in this paper, we propose a method for deblurring and depth estimation using a stereo camera with coded apertures. Our system configuration offers several advantages. First, coded apertures make not only deconvolution but also stereo matching very robust, because the loss of high spatial frequency domain information in the blurred image is well suppressed. Second, the size of the blur kernel is linear with the disparity of the stereo images, making calibration of the system very easy. The proof of this linearity is given in this paper together with several experimental results showing the advantages of our method.

Paper Nr: 170
Title:

A NEW APPROACH FOR DENOISING IMAGES BASED ON WEIGHTS OPTIMIZATION

Authors:

Qiyu Jin, Ion Grama and Quansheng Liu

Abstract: We propose a new algorithm to restore an image contaminated by Gaussian white noise. Our approach is based on the weighted average of the observations in a neighborhood, as in the case of the Non-Local Means Filter. But in contrast to the Non-Local Means Filter, we choose the weights by minimizing a tight upper bound of the Mean Square Error. Our theoretical results show that some "oracle" weights defined by a triangular kernel are optimal. To construct a computable filter, the "oracle" weights are replaced by some estimates. The implementation of the proposed algorithm is straightforward. The simulations show that our approach is very competitive.
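
To illustrate what a weighted neighbourhood average with a triangular kernel looks like, here is a minimal one-dimensional Python sketch. It is not the authors' filter: the abstract does not give the weight formula, so the sketch assumes weights of the form max(0, 1 - d/width), where d is the root mean square difference between patches taken from the noisy signal. The function and parameter names are hypothetical.

```python
def nonlocal_triangular_filter(signal, patch_radius=1, search_radius=3, width=1.0):
    """Weighted average of nearby samples using triangular-kernel weights."""
    n = len(signal)

    def patch(i):
        # Patch around i, with indices clamped at the signal borders.
        return [signal[min(max(i + k, 0), n - 1)]
                for k in range(-patch_radius, patch_radius + 1)]

    output = []
    for i in range(n):
        p_i = patch(i)
        total, norm = 0.0, 0.0
        for j in range(max(0, i - search_radius), min(n, i + search_radius + 1)):
            p_j = patch(j)
            dist = (sum((a - b) ** 2 for a, b in zip(p_i, p_j)) / len(p_i)) ** 0.5
            weight = max(0.0, 1.0 - dist / width)  # triangular kernel
            total += weight * signal[j]
            norm += weight
        # norm is at least 1 because the sample always matches itself.
        output.append(total / norm)
    return output
```

Because the triangular kernel drops to exactly zero beyond the chosen width, samples whose patches differ strongly contribute nothing, which is why a sharp step survives the averaging.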

Paper Nr: 173
Title:

IMPACT OF H.264 ADVANCED VIDEO CODING INTER-FRAME BLOCK SIZES ON VIDEO QUALITY

Authors:

Harilaos Koumaras, Michail-Alexandros Kourtis, Drakoulis Martakos and Christian Timmerer

Abstract: In this paper, we present a perceptual-based encoding benchmark of the H.264 Advanced Video Coding (AVC) inter-frame prediction variable block sizes for various spatial and temporal contents. In order to quantify the impact on video quality of the AVC inter-frame variable block sizes and the responsible prediction algorithm, the motion estimation mechanism of the encoder is disabled and each block size is selected manually. Thus, each time, only one block size out of the total seven is available to be searched for each macroblock (MB), and it is possible to examine the video quality impact of each block size independently of the remaining ones. The scope of this paper is to study whether the use of sophisticated prediction algorithms and variable block sizes enhances the perceived quality of the encoded video signal, or whether there is no significant quality degradation when the option of variable size is disabled.

Paper Nr: 204
Title:

LOCAL BLUR ASSESSMENT IN NATURAL IMAGES

Authors:

Loreta Adriana Suta, Mihaela Scuturici, Serge Miguet, Laure Tougne and Mircea-Florin Vaida

Abstract: This paper presents a local no-reference blur assessment method for natural macro-like images. The purpose is to decide the blurriness of the object of interest. In our case, it represents the first step of a plant recognition system. Blur detection works on small non-overlapping blocks using wavelet decomposition and edge classification. At the block level the number of edges is smaller than in global images. A new set of rules is obtained by a supervised decision tree algorithm trained on a manually labelled base of 1500 blurred/un-blurred images. Our purpose is to achieve a qualitative decision on the blurriness/sharpness of the object of interest, making it the first step towards a segmentation process. Experimental results show that this method outperforms two other methods found in the literature, even when applied on a block basis. Together with a pre-segmentation step, the method makes it possible to decide whether the object of interest (leaf, flower) is sharp, in order to extract precise botanical key identification features (e.g. leaf border).

Paper Nr: 205
Title:

IMPROVED BM3D FOR CORRELATED NOISE REMOVAL

Authors:

Marcella Matrecano, Giovanni Poggi and Luisa Verdoliva

Abstract: Most of the literature on denoising focuses on the additive white Gaussian noise (AWGN) model. However, in many important application fields, images are typically affected by non-Gaussian and/or colored noise, in which case AWGN-based techniques fall well short of their promise. In this paper, we propose a new denoising technique for correlated noise based on the non-local approach. We start from the well-known BM3D algorithm, which can be considered to be the state of the art in AWGN denoising, and modify it in various critical steps in order to take into account the non-whiteness of noise. Experimental results on several test images corrupted by correlated noise confirm the potential of the proposed technique.

Paper Nr: 241
Title:

IMAGE SEQUENCE SUPER-RESOLUTION BASED ON LEARNING USING FEATURE DESCRIPTORS

Authors:

Ana Carolina Correia Rézio, William Robson Schwartz and Helio Pedrini

Abstract: There is currently a growing demand for high-resolution images and videos in several domains of knowledge, such as surveillance, remote sensing, medicine, industrial automation, microscopy, among others. High resolution images provide details that are important to tasks of analysis and visualization of data present in the images. However, due to the cost of high precision sensors and the limitations that exist for reducing the size of the image pixels in the sensor itself, high-resolution images have been acquired from super-resolution methods. This work proposes a method for super-resolving a sequence of images from the compensation residual learned by the features extracted in the residual image and the training set. The results are compared with some methods available in the literature. Quantitative and qualitative measures are used to compare the results obtained with super-resolution techniques considered in the experiments.

Paper Nr: 255
Title:

NON-LOCAL FILTER FOR REMOVING A MIXTURE OF GAUSSIAN AND IMPULSE NOISES

Authors:

Haijuan Hu, Bing Li and Quansheng Liu

Abstract: In this paper we first present two convergence theorems which give a theoretical justification of the Non-Local Means Filter. Based on these theorems, we propose a new filter, called Non-Local Mixed Filter, to remove a mixture of Gaussian and random impulse noises. This filter combines the essential ideas of the Trilateral Filter and the Non-Local Means Filter. It improves the Trilateral Filter and extends the Non-Local Means Filter. Our experiments show that the new filter generally outperforms two other recently proposed methods. A careful discussion and simple formulas are given for the choice of parameters for the proposed filter.

Paper Nr: 262
Title:

MULTIMODALITY AND MULTIRESOLUTION IMAGE FUSION

Authors:

Paul M. de Zeeuw, Eric J. E. M. Pauwels and Jungong Han

Abstract: Standard multiresolution image fusion of multimodal images may yield an output image with artifacts due to the occurrence of opposite contrast in the input images. Equal but opposite contrast leads to noisy patches that are unstable with respect to slight changes in the input images. Unequal and opposite contrast leads to uncertainty about how to interpret the modality of the result. In this paper a biased fusion is proposed to remedy this, where the bias is towards one image, the so-called iconic image, in a preferred spectrum. A nonlinear fusion rule is proposed to prevent the fused image from reversing the local contrasts as seen in the iconic image. The rule involves saliency and a local match measure. The method is demonstrated on artificial and real-life examples.

Paper Nr: 51
Title:

ONLINE SUNFLICKER REMOVAL USING DYNAMIC TEXTURE PREDICTION

Authors:

A. S. M. Shihavuddin, Nuno Gracias and Rafael Garcia

Abstract: An underwater vision system operating in shallow water faces unique challenges, which often degrade the quality of the acquired data. One of these challenges is the sunflicker effect, created by refracted sunlight casting fast moving patterns on the seafloor. Surprisingly few previous works address this topic. The best performing available method mitigates the sunflicker effect using offline motion compensated filtering. In the present work, we propose an online sunflicker removal method targeted at producing better registration accuracy. The illumination field of the sunflicker effect is considered as a dynamic texture, since it produces repetitive dynamic patterns. With that assumption, the dynamic model of the sunflicker is learned from the registered illumination fields of the previous frames and is used for predicting that of the next incoming frame. Such prediction allows for removing the sunflicker patterns from the new frame and successfully registering it against previous frames. Comparative results are presented using challenging test sequences, which illustrate the better performance of the approach against the closest related method in the literature.

Paper Nr: 73
Title:

CODED PIXELS - Random Coding of Pixel Shape for Super-resolution

Authors:

Tomoki Sasao, Shinsaku Hiura and Kosuke Sato

Abstract: In this paper, we propose a technique to improve the performance of super-resolution by changing the effective shape of each pixel on the image sensor. Since the sampling of the incoming light by the usual image sensors is not impulse-shaped but rectangular, the high spatial frequency component of the latent image is lost through the integration effect of the pixel area. Therefore, by spraying black powder onto the image sensor we give each pixel shape a random code, which jointly aggregates the latent information of the observed scene. Experimental results show that the proposed random code greatly improves the quality of the reconstructed image.

Paper Nr: 96
Title:

ITERATIVE IMAGE INTERPOLATION FOR IRREGULARLY SAMPLED IMAGE

Authors:

Jonghwa Lee and Chulhee Lee

Abstract: For irregularly sampled color images, an iterative interpolation algorithm utilizing a wavelet shrinkage denoising technique is proposed. Exploiting the non-local information from neighboring blocks, the reconstruction performance converges as the iteration of the proposed algorithm is repeated. Experimental results show that the proposed algorithm outperforms the conventional algorithms in terms of subjective quality and objective measures. The proposed algorithm correctly reconstructs edges and provides perceptually good performance with a randomly chosen 25% of the pixels.

Paper Nr: 140
Title:

CONTOURLET BASED MULTI-EXPOSURE IMAGE FUSION WITH COMPENSATION FOR MULTI-DIMENSIONAL CAMERA SHAKE

Authors:

Sara Saravi and E. A. Edirisinghe

Abstract: Multi-exposure image fusion algorithms are used for enhancing the perceptual quality of an image captured by sensors of limited dynamic range by rendering multiple images captured at different exposure settings. One practical problem overlooked by existing algorithms is the compensation required for image de-registration due to possible multi-dimensional camera shake that occurs within the time gap of capturing the multiple exposure images. In our approach, the RANdom SAmple Consensus (RANSAC) algorithm is used to identify inliers among key-points identified by the Scale Invariant Feature Transform (SIFT) approach, subsequently to the use of the Coherent Point Drift (CPD) algorithm to register the images based on the selected set of key points. We provide experimental results on a set of images with multi-dimensional (translational and rotational) camera shake to demonstrate the proposed algorithm's capability to register and fuse multiple exposure images taken in the presence of camera shake, providing subjectively enhanced output images.

Paper Nr: 143
Title:

VIBRO-VISUAL IMAGE FEATURE EXTRACTION WITH CORRELATION IMAGE SENSOR - Circular and Doubly Circular Vibration for Arbitrary Complex Differentials

Authors:

Shigeru Ando and Toru Kurihara

Abstract: of the correlation image sensor (Ando et al., 2009). In this paper, using the newest version of this device, we present a theory and experimental evaluation of a bio-inspired vibro-visual correlation imager with various feature extraction capabilities. Mimicking the involuntary movement (microsaccade) of human eyes, it rapidly and finely vibrates a mirror in its visual axis so as to generate an equivalent vibration of every pixel in a doubly circular locus. The time-varying intensity is captured by a correlation image sensor (CIS) with reference signals synchronous to the vibration, and complex first/second order differentials and the Laplacian are obtained as the image features. General theoretical foundations and an implementation result of this system using a novel 640×512 pixel device are presented. Several experimental results, including real-time control of resolution and edge detection from a combined use of the first and second order differentials, are shown.

Paper Nr: 160
Title:

ARDO: AUTOMATIC REMOVAL OF DYNAMIC OBJECTS - Exclude Moving Objects Automatically from a Collection of Images using a Consumer Camera

Authors:

Patrik Goorts and Philippe Bekaert

Abstract: We present a novel method to remove tourists, cars and other moving objects from photos of monuments, buildings and other attractions. Typically, this is a manual process. Our method, in contrast, is fully automatic and requires neither image stabilization nor advanced hardware during acquisition. The method requires a set of images taken from roughly the same location using any consumer photo or video camera and synthesizes the decluttered result in two phases. In the first phase, these images are aligned onto each other using image features and affine transformations. The second phase merges the images into the final result using median-based per-pixel filtering. The results are pleasant looking, even with moving clouds and trees, and outperform other techniques in terms of quality and manual intervention.
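
The second phase can be illustrated with a plain per-pixel median in Python. The sketch assumes the images are already aligned, grayscale, of equal size, and stored as nested lists. The function name is hypothetical, and the paper's filter is described only as "median-based", so it may differ in detail.

```python
from statistics import median


def median_merge(aligned_images):
    """Per-pixel median across a list of equally sized grayscale images."""
    rows = len(aligned_images[0])
    cols = len(aligned_images[0][0])
    return [[median(img[r][c] for img in aligned_images) for c in range(cols)]
            for r in range(rows)]
```

A pixel covered by a passer-by in only a minority of the shots keeps its background value, because the median ignores the outlying samples.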

Paper Nr: 238
Title:

SIMULTANEOUS RECONSTRUCTION AND RECOGNITION OF NOISY CHARACTER-LIKE SYMBOLS

Authors:

László Czúni, Ágnes Lipovits and Dávid Papp

Abstract: In our article we deal with the simultaneous problem of reconstruction and recognition of binary symbols corrupted by heavy additive noise. We introduce a Markov Random Field (MRF) model in which a shape energy term is responsible for finding a solution similar to a tested hypothesis. In this way we can increase the precision of the reconstruction process; the only question is how to find the right hypothesis, the one that helps the reconstruction best. Fortunately, the new energy term gives us the answer: the tested hypothesis with the minimal shape energy component designates the right shape.

Paper Nr: 261
Title:

MOVING OBJECT ANALYSIS IN VIDEO SEQUENCES USING SPACE-TIME INTEREST POINTS

Authors:

Alain Simac-Lejeune

Abstract: Among all the features which can be extracted from videos, we propose to use Space-Time Interest Points (STIPs). STIPs are particularly interesting because they are simple and robust low-level features providing an efficient characterization of moving objects within videos. In this paper, after defining STIPs and after giving some of their properties, we will use STIPs to detect moving objects and to characterize specific changes in the movements of these objects. Proposed results are obtained from two very different types of videos, namely athletic videos and animation movies.

Paper Nr: 301
Title:

AUTOMATIC LOCALIZATION OF INDOOR SOCCER PLAYERS FROM MULTIPLE CAMERAS

Authors:

Erikson Freitas de Morais, Siome Goldenstein and Anderson Rocha

Abstract: Nowadays, there is an ever growing quest for finding sophisticated performance evaluation tools by team sports that could give them an additional inch or a quarter of a second of advantage in a competition. Using cameras to shoot the events of a game, for instance, the teams can analyze the performance of the athletes and even extrapolate the data to obtain semantical information about the behavior of the teams themselves at relatively low costs. In this context, this paper introduces a new approach for better estimating the positions of indoor soccer players using multiple cameras at all moments of a game. The setup consists of four stationary cameras set around the soccer court. Our solution relies on individual object detectors (one per camera) working in the image coordinates and a robust fusion approach working in the world coordinates in a plane that represents the soccer court. The fusion approach relies on a gradient ascent algorithm over a multimodal bidimensional mixture of Gaussians function representing all the players in the soccer court. In the experiments, we show that the proposed solution improves standard object detector approaches and greatly reduces the mean error rate of soccer player detection to a few centimeters with respect to the actual positions of the players.
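
The fusion step can be sketched as gradient ascent on a sum of two-dimensional Gaussians, one per projected detection. The Python sketch below assumes isotropic Gaussians with a common, hand-picked standard deviation and a fixed step size. These choices and all names are hypothetical, since the abstract does not specify them.

```python
import math


def mixture_gradient(x, y, detections, sigma):
    """Gradient of a sum of isotropic 2D Gaussians centred on the detections."""
    gx = gy = 0.0
    for (dx, dy) in detections:
        w = math.exp(-((x - dx) ** 2 + (y - dy) ** 2) / (2.0 * sigma ** 2))
        gx += w * (dx - x) / sigma ** 2
        gy += w * (dy - y) / sigma ** 2
    return gx, gy


def ascend(start, detections, sigma=0.5, step=0.1, iterations=200):
    """Follow the gradient uphill from start towards a local mode."""
    x, y = start
    for _ in range(iterations):
        gx, gy = mixture_gradient(x, y, detections, sigma)
        x += step * gx
        y += step * gy
    return x, y
```

Started from any single camera's estimate, the ascent settles on a mode that pools nearby detections from the other cameras into one court position.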

Paper Nr: 304
Title:

VISIBILITY RESTORATION FROM SINGLE IMAGE BASED OPTICAL MODEL

Authors:

Qieshi Zhang and Sei-ichiro Kamata

Abstract: In this paper, we propose a segmentation-based method to estimate the haze-free image using the optical model. We estimate the atmospheric light with the color barycenter hexagon (CBH) model and use the watershed to segment the image in order to calculate the transmission map from dark pixels in a single image. First, the non-color region is segmented by the CBH model and the atmospheric light is calculated. Then, the watershed with the rang component of the CBH model is used to segment the color image into several sub-regions, and the transmission map is estimated. Finally, the optical model is used with these parameters to restore the haze-free image. The experimental results show that our method is more effective and gives better results than the other single-image methods compared.
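
The abstract does not write out the optical model. A common choice in single-image dehazing is the atmospheric scattering model I = J t + A (1 - t), where I is the observed intensity, J the haze-free radiance, t the transmission and A the atmospheric light. The Python sketch below assumes that model for a single scalar pixel. The function names and the lower bound t_min are hypothetical.

```python
def add_haze(radiance, transmission, airlight):
    """Forward model: observed intensity of a pixel seen through haze."""
    return radiance * transmission + airlight * (1.0 - transmission)


def remove_haze(observed, transmission, airlight, t_min=0.1):
    """Invert the model; t_min guards against division by tiny transmissions."""
    t = max(transmission, t_min)
    return (observed - airlight) / t + airlight
```

Given estimates of the atmospheric light and the transmission map, applying the inverse to every pixel and color channel yields the restored image.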
Download

Area 2 - Image and Video Analysis

Full Papers
Paper Nr: 75
Title:

SHADOW AND SPECULAR REMOVAL BY PHOTOMETRIC LINEARIZATION BASED ON PCA WITH OUTLIER EXCLUSION

Authors:

Takahiro Mori, Shinsaku Hiura and Kosuke Sato

Abstract: The photometric linearization method converts real images, including various photometric components such as diffuse reflection, specular reflection, attached and cast shadow, into images with diffuse reflection components only, which satisfy the Lambertian law. The conventional method (Mukaigawa et al., 2007) based on a random sampling framework successfully achieves the task; however, it has two problems. The first is that the three basis images selected from the input images by the user seriously affect the linearization result quality. The other is that it takes a long time to process the enormous number of random samples needed to find the correct answer probabilistically. We therefore propose a novel algorithm using the PCA (principal component analysis) method with outlier exclusion. We used knowledge of photometric phenomena for the outlier detection and the experiments show that the method provides fast and precise linearization results.
Download

Paper Nr: 102
Title:

CONTINUOUS REGION-BASED PROCESSING OF SPATIOTEMPORAL SALIENCY

Authors:

Jan Tünnermann and Bärbel Mertsching

Abstract: This paper describes a region-based attention approach on motion saliency, which is important for systems that perceive and interact with dynamic environments. Frames are collected to create volumes, which are sliced into stacks of spatiotemporal images. Color segmentation is applied to these images. The orientations of the resulting regions are used to calculate their prominence in a spatiotemporal context. Saliency is projected back into image space. Tests with different inputs produced results comparable with other state-of-the-art methods. We also demonstrate how top-down influence can affect the processing in order to attend objects that move in a particular direction. The model constitutes a framework for later integration of spatiotemporal and spatial saliency as independent streams, which respect different requirements in resolution and timing.
Download

Paper Nr: 113
Title:

EXTRACTION OF REGION BOUNDARY PATTERNS WITH ACTIVE CONTOURS

Authors:

Mohamed Ben Salah and Amar Mitiche

Abstract: In this study we address the problem of recovering region boundary patterns consistent with a given pattern. A level set method formulated in the variational framework evolves an active contour towards regions of interest boundaries while omitting the others. The curve evolution results from the minimization of a functional which measures the similarity between the distribution of an image-based geometric feature on the curve and a model distribution. The corresponding curve evolution equation can be viewed as a geodesic active contour flow having a variable stopping function. This affords a global representation of the objects boundaries which can effectively drive active curve segmentation in a variety of otherwise adverse conditions. We ran several experiments supported by quantitative performance evaluations using various examples of segmentation and tracking.
Download

Paper Nr: 147
Title:

ORIENTATION-BASED SEGMENTATION OF TEXTURED IMAGES BY ENERGY MINIMIZATION

Authors:

Maria Sagrebin-Mitzel and Til Aach

Abstract: We consider textured images, where the textures are composed of different numbers of additively superimposed oriented patterns. Our aim is to develop an energy minimization approach to segment these images into regions according to the number of patterns superimposed. The number of superimposed patterns can be inferred by testing orientation tensors for rank deficiency. In particular, the hypothesis that a local image patch exhibits a given number of superimposed oriented patterns holds if the corresponding orientation tensor is rank deficient by one. The tests can be carried out based on quantities computed from the eigenvalues of the orientation tensors, or equivalently from invariants such as determinant, minors and trace. Direct thresholding of these quantities leads, however, to non-robust segmentation results. We therefore develop energy functions which consist of a data term evaluating tensor rank, and a smoothness term which assesses smoothness of the segmentation results. As the orientation tensors and thus the data term depend on the number of orientations tested for, we derive a hierarchical algorithm for approximate energy minimization using graph cuts. We show the robustness of the approach using both synthetic and real image data.
Download

Paper Nr: 161
Title:

MINIMALLY OVERLAPPING PATHS SETS FOR CLOSED CONTOUR EXTRACTION

Authors:

Julien Mille, Sébastien Bougleux and Laurent Cohen

Abstract: Active contours and minimal paths have been extensively studied as theoretical tools for image segmentation. The recent geodesically linked active contour model, which basically consists of a set of vertices connected by paths of minimal cost, blends the benefits of both concepts. This makes up a closed piecewise-defined curve, over which an edge or region energy functional can be formulated. As an important shortcoming, the geodesically linked active contour model in its initial formulation is not guaranteed to represent a simple curve, consistent with the purpose of segmentation. In this paper, we propose to extract a similarly piecewise-defined curve from a set of possible paths, such that the resulting structure is guaranteed to represent a relevant closed curve. Toward this goal, we introduce a global constraint penalizing excessive overlap between paths.
Download

Paper Nr: 177
Title:

GROUP-WISE SPARSE CORRESPONDENCES BETWEEN IMAGES BASED ON A COMMON LABELLING APPROACH

Authors:

Albert Solé-Ribalta, Gerard Sanromà, Francesc Serratosa and René Alquézar

Abstract: Finding sparse correspondences between two images is a common process needed for several higher-level computer vision tasks. For instance, in robot positioning, it is common to use images that the robot captures from its cameras to guide the localisation or reduce the intrinsic ambiguity of a specific localisation obtained by other methods. Nevertheless, obtaining good correspondence between two images with a high degree of dissimilarity is a complex task that may lead to significant positioning errors. With the aim of increasing the accuracy with respect to the pair-wise image matching approaches, we present a new method to compute group-wise correspondences among a set of images. Thus, pair-wise errors are compensated and better correspondences between images are obtained. These correspondences can be used as a less-noisy input for the localisation process. Group-wise correspondences are computed by finding the common labelling of a set of salient points obtained from the images. Results show a clear increase in effectiveness with respect to methods that use only two images.
Download

Paper Nr: 186
Title:

RENDERING VIRTUAL OBJECTS INTO REAL SCENE BY USING A DIFFUSE SPHERE

Authors:

Yuji Kanda, Fumihiko Sakaue, Jun Sato, Kazuhisa Ishimaru and Masayuki Imanishi

Abstract: In this paper, we propose an efficient method for rendering virtual objects into real scene images. We put a diffuse sphere into the scene as a reference object, and use the direct relationship between the intensity of the diffuse sphere and the intensity of virtual objects for rendering the virtual objects in images. We also generate shadows of the virtual objects by using shadows of the reference sphere. As a result, arbitrary virtual objects and their shadows can be rendered quite efficiently from a single diffuse sphere put in the scene. The proposed method is tested in real image experiments, and evaluated quantitatively by comparison with the existing method.
Download

Paper Nr: 307
Title:

REGION GROWING: ADOLESCENCE AND ADULTHOOD - Two Visions of Region Growing: in Feature Space and Variational Framework

Authors:

C. Revol-Muller, T. Grenier, J. L. Rose, A. Pacureanu, F. Peyrin and C. Odet

Abstract: Region growing is one of the most intuitive techniques for image segmentation. Starting from one or more seeds, it seeks to extract a meaningful object by iteratively aggregating surrounding pixels. Starting from this simple description, we propose to show how the region growing technique can be elevated to the same rank as more recent and sophisticated methods. Two formalisms are presented to describe the process. The first one, derived from non-parametric estimation, relies upon feature space and kernel functions. The second one derives from the variational framework. By describing the region evolution as a process that minimizes an energy functional, it proves the convergence of the process and takes advantage of the large body of existing work on energy functionals. In the last part, we illustrate the interest of both formalisms in the context of life imaging. Three segmentation applications are considered using various modalities such as whole body PET imaging, small animal µCT imaging and experimental Synchrotron Radiation µCT imaging. We thus demonstrate that, over the last decade, region growing has reached a maturity that opens up many application perspectives for the method.
Download

Short Papers
Paper Nr: 15
Title:

REGION-BASED SKIN COLOR DETECTION

Authors:

Rudra P. K. Poudel, Hammadi Nait-Charif, Jian J. Zhang and David Liu

Abstract: Skin color provides a powerful cue for complex computer vision applications. Although skin color detection has been an active research area for decades, the mainstream technology is based on individual pixels. This paper presents a new region-based technique for skin color detection which outperforms the current state-of-the-art pixel-based skin color detection method on the popular Compaq dataset (Jones and Rehg, 2002). A clustering technique based on color and spatial distance is used to extract the regions from the images, also known as superpixels. In the first step, our technique uses the state-of-the-art non-parametric pixel-based skin color classifier (Jones and Rehg, 2002), which we call the basic skin color classifier. The pixel-based skin color evidence is then aggregated to classify the superpixels. Finally, the Conditional Random Field (CRF) is applied to further improve the results. As the CRF operates over superpixels, the computational overhead is minimal. Our technique achieves a 91.17% true positive rate with a 13.12% false positive rate on the Compaq dataset, tested over approximately 14,000 web images.
Download

Paper Nr: 17
Title:

ROBUST ROAD SIGNS SEGMENTATION IN COLOR IMAGES

Authors:

Bishesh Khanal, Sharib Ali and Désiré Sidibé

Abstract: This paper presents an efficient method for road signs segmentation in color images. Color segmentation of road signs is a difficult task due to variations in the image acquisition conditions. Therefore, a color constancy algorithm is usually applied prior to segmentation, which increases the computation time. The proposed method is based on a log-chromaticity color space which shows good invariance properties to changing illumination. Thus, the method is simple and fast since it does not require color constancy algorithms. Experiments with a large dataset and comparison with other approaches, show the robustness and accuracy of the method in detecting road signs in various conditions.
Download
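
As a rough illustration of why a log-chromaticity space tolerates illumination changes, the sketch below maps an RGB triple to two log ratios. The abstract above does not specify which ratios are used, so taking green as the reference channel, the function name `log_chromaticity`, and the `eps` guard are assumptions. A uniform change in brightness multiplies all three channels by the same factor, which cancels in each ratio.

```python
import math

def log_chromaticity(r, g, b, eps=1e-6):
    """Map an RGB triple to (log(R/G), log(B/G)).

    Green as the reference channel is an illustrative assumption;
    eps clamps zero channels so the logarithm stays defined.
    """
    r, g, b = (max(float(c), eps) for c in (r, g, b))
    return math.log(r / g), math.log(b / g)

# The same surface color at two brightness levels (hypothetical values)
# maps to the same point in the log-chromaticity plane.
bright = log_chromaticity(200, 50, 40)
dim = log_chromaticity(100, 25, 20)
```

A segmentation threshold placed in this two-dimensional space therefore does not need to be re-tuned when only the overall intensity of the illumination changes; changes in illuminant color are a separate matter that this sketch does not address.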

Paper Nr: 21
Title:

QUESTIONING HU’S INVARIANTS - Bad or Good Enough?

Authors:

Diego Martinoia

Abstract: Although Hu’s invariants were proven long ago to be neither independent nor complete, their use in computer vision applications is still broad, mainly because of their availability in common CV libraries and their ease of use by inexperienced users. In this paper I investigate whether, given their mathematical flaws, they are nevertheless good enough to justify such wide adoption, also considering that more sophisticated tools have been developed over the years. To do this, I test the robustness of Hu’s invariants comparatively against the more modern wavelet invariants in a hand gesture recognition application. Finally, basing my considerations on the experimental data, I discuss whether Hu’s invariants are still a viable option for small-scale, amateur applications, or whether the time has come to abandon them for more effective solutions.
Download
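
For readers unfamiliar with the invariants discussed in the abstract above, the following sketch computes the first two of Hu's seven moment invariants from normalized central moments. It is a minimal illustration, not the paper's test setup: the function name `hu_first_two` and the toy L-shaped image are assumptions, and the remaining five invariants (which involve third-order moments) are omitted.

```python
def hu_first_two(image):
    """Return Hu's first two moment invariants for a 2D list of intensities."""
    # Raw moments up to order one give the mass and the centroid.
    m00 = m10 = m01 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            m00 += v
            m10 += x * v
            m01 += y * v
    xc, yc = m10 / m00, m01 / m00

    # Second-order central moments (translation invariant).
    mu20 = mu02 = mu11 = 0.0
    for y, row in enumerate(image):
        for x, v in enumerate(row):
            dx, dy = x - xc, y - yc
            mu20 += dx * dx * v
            mu02 += dy * dy * v
            mu11 += dx * dy * v

    # Scale normalization: eta_pq = mu_pq / m00^(1 + (p + q) / 2), with p + q = 2.
    n20, n02, n11 = mu20 / m00 ** 2, mu02 / m00 ** 2, mu11 / m00 ** 2

    phi1 = n20 + n02
    phi2 = (n20 - n02) ** 2 + 4.0 * n11 ** 2
    return phi1, phi2

# Toy binary shape and the same shape rotated by 90 degrees.
shape = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
rotated = [list(r) for r in zip(*shape[::-1])]
original_phi = hu_first_two(shape)
rotated_phi = hu_first_two(rotated)
```

A 90 degree rotation swaps the two second-order moments along the axes and flips the sign of the mixed moment, so both invariants are unchanged up to floating-point error. For arbitrary rotation angles on a pixel grid, resampling introduces small deviations.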

Paper Nr: 29
Title:

AN ARTIFACT-FREE WAVELET MODEL FOR PERCEPTUAL CONTRAST ENHANCEMENT OF COLOR IMAGES

Authors:

Edoardo Provenzi and Vicent Caselles

Abstract: Contrast enhancement of color images can prove to be a difficult task because artifacts and unnatural colors can appear after the process. In this paper we propose a wavelet-based variational framework in which contrast enhancement is obtained through the minimization of a suitable energy functional of wavelet coefficients. We will show that this new approach has certain advantages with respect to the usual spatial techniques, owing to the fact that the wavelet representation is intrinsically local, multiscale and sparse. The Euler-Lagrange equations of the model are implicit equations involving the detail wavelet coefficients of the image. These equations can be quickly solved by Newton’s method, so that the algorithm can rapidly compute the enhanced detail coefficients. We will discuss the influence of the parameters and present tests on natural images to show that the method is artifact-free within an ample range of variability of its parameters.
Download

Paper Nr: 69
Title:

MONOCULAR DEPTH-BASED BACKGROUND ESTIMATION

Authors:

Diego Cheda, Daniel Ponsa and Antonio M. López

Abstract: In this paper, we address the problem of reconstructing the background of a scene from a video sequence with occluding objects. The images are taken by hand-held cameras. Our method composes the background by selecting the appropriate pixels from previously aligned input images. To do that, we minimize a cost function that penalizes the deviations from the following assumptions: background represents objects whose distance to the camera is maximal, and background objects are stationary. Distance information is roughly obtained by a supervised learning approach that allows us to distinguish between close and distant image regions. Moving foreground objects are filtered out by using stationariness and motion boundary constancy measurements. The cost function is minimized by a graph cuts method. We demonstrate the applicability of our approach to recover an occlusion-free background in a set of sequences.
Download

Paper Nr: 77
Title:

DEPTH PERCEPTION MODEL EXPLOITING BLURRING CAUSED BY RANDOM SMALL CAMERA MOTIONS

Authors:

Norio Tagawa, Yuya Iida and Kan Okubo

Abstract: The small vibration of the eyeball, which occurs when we fix our gaze on an object, is called “fixational eye movement.” It has been reported that this vibration may work not only as a fundamental function to preserve photosensitivity but also as a clue to image analysis, for example contrast enhancement and edge detection. This mechanism can be interpreted as an instance of stochastic resonance, which is inspired by biology, more specifically by neuron dynamics. Moreover, research on depth recovery methods using camera motions based on an analogy of fixational eye movement is in progress. In this study, using camera motions corresponding to the smallest type of fixational eye movement, called “tremor,” we have constructed algorithms defined in a differential form, i.e. the spatio-temporal derivatives of two successive images are analyzed. However, in these methods, observation noise in the derivatives causes serious recovery errors. Therefore, we examine a new method in which many images captured with the same camera motions are integrated and the observed local image blurring is analyzed to extract depth information, and we confirm its effectiveness.
Download

Paper Nr: 117
Title:

PAVEMENT TEXTURE SEGMENTATION USING LBP AND GLCM FOR VISUALLY IMPAIRED PERSON

Authors:

Sun-Hee Weon, Sung-Il Joo and Hyung Il Choi

Abstract: This paper proposes a method for region segmentation and texture extraction to classify pavement and roadway regions in images acquired from cameras worn by a visually impaired person while walking. First, a road boundary line is detected with a line detection technique based on the Hough transform, yielding candidate regions for pavement and roadway. Second, texture features are extracted from the segmented candidate regions, and the pavement and roadway regions are separated into three levels according to the perspective scope in a triangular model. Rotation-invariant LBP and GLCM are used to compare the texture features of pavement, which contains various precast pavers, with those of the roadway, which is relatively monotonous. The experiments verified that the analytical performance at nighttime did not deteriorate in comparison with the daytime results, and that region segmentation performed very well in complex images containing various obstacles and pedestrians.
Download
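
The rotation-invariant LBP mentioned in the abstract above can be sketched for the basic 3x3, 8-neighbour case. The abstract does not state the radius, neighbourhood size or exact LBP variant used, so those choices, and the function names `lbp_code` and `rotation_invariant`, are assumptions. The rotation-invariant code is taken here as the minimum over all circular bit rotations of the plain LBP code.

```python
def lbp_code(patch):
    """8-bit LBP code of a 3x3 patch (list of 3 rows), thresholded at the centre."""
    c = patch[1][1]
    # Neighbours in circular order: start at the top-left corner, go clockwise.
    ring = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
            patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for i, v in enumerate(ring):
        if v >= c:
            code |= 1 << i
    return code

def rotation_invariant(code, bits=8):
    """Smallest value among all circular bit rotations of code."""
    mask = (1 << bits) - 1
    return min(((code >> k) | (code << (bits - k))) & mask for k in range(bits))

# A patch with one bright corner, and the same patch rotated by 90 degrees.
patch = [
    [9, 1, 1],
    [1, 5, 1],
    [1, 1, 1],
]
rotated = [list(r) for r in zip(*patch[::-1])]
```

Rotating the patch by 90 degrees shifts the neighbour ring by two positions, so the plain codes differ while the rotation-invariant codes agree. A texture descriptor would then be a histogram of these codes over a region.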

Paper Nr: 141
Title:

AN ALTERNATIVE TO SCALE-SPACE REPRESENTATION FOR EXTRACTING LOCAL FEATURES IN IMAGE RECOGNITION

Authors:

Hans Jørgen Andersen and Giang Phuong Nguyen

Abstract: In image recognition, the common approach for extracting local features using a scale-space representation usually has three main steps: first, interest points are extracted at different scales; next, from a patch around each interest point the rotation is calculated with corresponding orientation and compensation; and finally a descriptor is computed for the derived patch (i.e. feature of the patch). To avoid the memory- and computation-intensive process of constructing the scale-space, we use a method where no scale-space is required. This is done by dividing the given image into a number of triangles with sizes dependent on the content of the image at the location of each triangle. In this paper, we will demonstrate that by rotation of the interest regions at the triangles it is possible in grey scale images to achieve a recognition precision comparable with that of MOPS. The test of the proposed method is performed on two data sets of buildings.
Download

Paper Nr: 216
Title:

A GREAT CIRCLE ARC DETECTOR IN EQUIRECTANGULAR IMAGES

Authors:

Seon Ho Oh and Soon Ki Jung

Abstract: We propose a great circle arc detector for a scene represented by an equirectangular image, i.e. a spherical image with a field of view of 360° in longitude and 180° in latitude. Since straight lines appear curved in equirectangular images, the standard line detection algorithm cannot be used directly in this context. We extend the LSD method (Gioi et al., 2010) to deal with equirectangular images instead of planar images. The proposed method thus retains most of the advantages of the LSD method, which gives accurate results with a controlled number of false detections but requires no parameter tuning. The algorithm is tested and compared to other algorithms on a wide set of images.
Download

Paper Nr: 219
Title:

REFA3D: ROBUST SPATIO-TEMPORAL ANALYSIS OF VIDEO SEQUENCES

Authors:

Manuel Grand-Brochier, Christophe Tilmant and Michel Dhome

Abstract: This article proposes a generalization of our approach REFA (Grand-Brochier et al., 2011) to the spatio-temporal domain. Our new method, REFA3D, is based mainly on the hes-STIP detector and E-HOG3D. SIFT3D and HOG/HOF are the two most used methods for space-time analysis and give good results. Studying them allows us to understand their construction and to extract some components to improve our approach. The mask of analysis used by REFA is modified and therefore relies on the use of ellipsoids. The validation tests are based on video clips from synthetic transformations as well as real sequences from a simulator or an onboard camera. Our system (detection, description and matching) must be as invariant as possible to image transformations (rotations, scales, time-scaling). We also study the performance obtained for registration of subsequences, a process often used for localization, for example. All the parameters (analysis shape, thresholds) and changes to the space-time generalization will be detailed in this article.
Download

Paper Nr: 260
Title:

HEIGHT ESTIMATION FROM A SINGLE CAMERA VIEW

Authors:

Mahdi Momeni-K., Sotirios Ch. Diamantas, Fabio Ruggiero and Bruno Siciliano

Abstract: This paper presents a novel technique for the estimation of the height of an object using a single camera view. In the proposed method, the only information required is the knowledge about the pose of the camera with respect to the world (i.e., height and pitch angle of the camera with respect to the ground) and a vanishing point. In the developed theory, the focal length may also be known, but in the proposed experiments it has not been employed: an approximation for small pitch angles has been taken into account and the consequent committed error has been then analysed. The presented method gives accurate results for any object placed in unstructured environments, regardless of the relative distance from the camera. The method has been tested in a series of outdoor and indoor environments, and the experimental results are presented in this paper.
Download

Paper Nr: 264
Title:

ROBUST ONLINE SEGMENTATION OF UNKNOWN OBJECTS FOR MOBILE ROBOTS

Authors:

Xin Wang, Maja Rudinac and Pieter P. Jonker

Abstract: In this paper we present a novel vision system for object-driven and online learning based segmentation of unknown objects in a scene. The main application of this system is for mobile robots exploring unknown environments, where unknown objects need to be inspected and segmented from multiple viewpoints. In an initial step, objects are detected using a bottom-up segmentation method based on salient information. The cluster with the most salient points is assumed to be the most dominant object in the scene and serves as an initial model for online segmentation. Then the dominant object is tracked by a Lucas-Kanade tracker and the object model is constantly updated and learned online based on Random Forests classifier. To refine the model a two-step object segmentation using Gaussian Mixture Models and graph cuts is applied. As a result, the detailed contour information of the dominant unknown object is obtained and can further be used for object grasping and recognition. We tested our system in very challenging conditions with multiple identical objects, severe occlusions, illumination changes and cluttered background and acquired very promising results. In comparison with other methods, our system works online and requires no input from users.
Download

Paper Nr: 275
Title:

GRAPH BASED DESCRIPTOR EVALUATION TOWARDS AUTOMATIC FEATURE SELECTION

Authors:

Anita Keszler, Levente Kovács and Tamás Szirányi

Abstract: This paper presents the first steps towards an automated image and video feature descriptor evaluation framework, based on several points of view. First, the distance distributions of images and videos are evaluated for several descriptors; then a graph-based representation of database contents is built and the appearance of the giant component is evaluated. The goal is to lay the foundations for an evaluation framework where different descriptors and their combinations can be analyzed, with a view to later application in automatic feature selection.
Download

Paper Nr: 308
Title:

GIBBS-WEIGHTED K-MEANS SEGMENTATION APPROACH WITH INTENSITY INHOMOGENEITY CORRECTION

Authors:

Chia-Yen Lee, Chiun-Sheng Huang, Yeun-Chung Chang, Yi-Hong Chou and Chung-Ming Chen

Abstract: Intensity inhomogeneity caused by an ultrasonic attenuation beam within the body results in an artifact effect. It frequently degrades the boundary and texture information of a lesion in a breast sonogram. A new Gibbs-weighted K-means segmentation approach with intensity inhomogeneity correction is proposed to cluster the prominent components provided by the fuzzy cell competition algorithm, in order to segment lesion boundaries automatically while reducing the influence of the intensity inhomogeneity. The information from fuzzy C-means, normalized cut, and the cell-based fuzzy cell competition algorithm is combined as the feature vector for cell-based clustering. 49 breast sonograms with intensity inhomogeneity, each from a different subject, are randomly selected for performance analysis. The mean distance between the lesion boundaries attained by the proposed algorithm and the corresponding manually delineated boundaries defined by two radiologists is 1.571±0.513 pixels. (Assessing the Chan and Vese level set method for intensity inhomogeneity-correction segmentation in the same way, the mean distance error is 3.299±1.203 pixels for the 49 images.) The results show that the Gibbs-weighted K-means segmentation approach with intensity inhomogeneity correction could not only correct the intensity inhomogeneity effect but also improve the segmentation results.
Download

Paper Nr: 9
Title:

AN INTELLIGENT-AGENT FOR OBJECT RETRIEVAL APPLIED TO MAMMOGRAM SEGMENTATION

Authors:

Charbel Julien and Charbel Fares

Abstract: In this article, we present a novel and efficient approach for object retrieval in digital images. We apply an intelligent agent to digital mammograms in order to retrieve the breast border. The agent navigates over the image’s pixels and has the ability to modify its form and behavior, which are, in turn, controlled by an automaton. While the agent navigates over the pixels it retrieves a list of points that represent the breast contour. The mammogram is first smoothed and filtered, resulting in a filtered image. Afterwards, the agent designed for breast border retrieval is applied to the filtered image in order to get a list of pixels that represents the breast border. The approach was tested on a database of more than 400 digitized mammograms, giving segmentation results that were identical to manual segmentation. A similar agent can also be used on other types of digital images for object retrieval and object recognition. The agent is a hand-crafted tool that represents the prior knowledge of the designer, who encodes the basic object shape or appearance. In this work the breast shape is considered, since the automaton represents our prior knowledge of the breast shape.

Paper Nr: 16
Title:

ROAD SIGN DETECTION AND SHAPE RECONSTRUCTION USING GIELIS CURVES

Authors:

Valentine Véga, Désiré Sidibé and Yohan Fougerolle

Abstract: Road signs are among the most important navigation tools in transportation systems. The identification of road signs in images is usually based on first detecting road sign locations using color and shape information. In this paper, we introduce such a two-stage detection method. Road signs are located in images based on color segmentation, and their corresponding shape is retrieved using a unified shape representation based on Gielis curves. The contribution of our approach is the shape reconstruction method, which makes it possible to detect any common road sign shape, i.e. circle, triangle, rectangle and octagon, with a single algorithm and without any training phase. Experimental results with a dataset of 130 images containing 174 road signs of various shapes show an accurate detection rate and a correct shape retrieval rate of 81.01% and 80.85% respectively.
Download

Paper Nr: 66
Title:

ACCURATE SIMILARITY MEASURES FOR SILHOUETTES RECOGNITION

Authors:

Saliha Aouat and Slimane Larabi

Abstract: In this paper, we propose a new method to recognize silhouettes of objects. Models of silhouettes are stored in the database using their textual descriptors. Textual descriptors are written following the part-based method published in (Larabi et al, 2003). The main issue with the textual description is its sensitivity to noise. To overcome this issue, we previously applied a convolution of the initial outline shape with a Gaussian filter at different scales (Aouat and Larabi, 2010). The approach was very interesting for shape matching and indexing (Aouat and Larabi, 2009), but unfortunately it is not appropriate for the recognition process because it uses no similarity measures to select the best model for a query silhouette. In this paper, we compute part areas and geometric quasi-invariants to find the best model for the given query; they are efficient similarity measures for performing the recognition process.
Download

Paper Nr: 83
Title:

VIDEO BASED HUMAN ACTIVITY RECOGNITION USING WAVELET TRANSFORM AND HIDDEN CONDITIONAL RANDOM FIELDS (HCRF)

Authors:

Muhammad Hameed Siddiqi, La The Vinh and Adil Mehmood Khan

Abstract: In this research, we test and validate the accuracy of employing the wavelet transform and the Hidden Conditional Random Field (HCRF) for video based activity recognition. For feature extraction, the Symlet wavelet was tested and decomposed up to 4 levels, and some of the highest coefficients were extracted from each level of decomposition. These coefficients were based on the average frequency of each video frame and the time difference between each frame. Finally, a novel HCRF model was applied for recognition. The proposed method was tested on a database of ten activities, where the data were collected from nine different people, and compared with one of the existing techniques. The overall recognition rate, using the Symlet wavelet family (Symlet 4), was 93%, an improvement of 13% in performance.
Download

Paper Nr: 92
Title:

3D CONFOCAL MICROSCOPY DATA ANALYSIS USING LEVEL-SET SEGMENTATION WITH ALPHA DIVERGENCE SIMILARITY MEASURE

Authors:

Leila Meziou, Aymeric Histace, Frédéric Precioso, Bogdan J. Matuszewski and Franck Carreiras

Abstract: Segmentation of cellular structures is of primary interest in cell imaging for a 3D reconstruction of cell shape. Such an analysis provides crucial information about cell morphology and is instrumental in understanding of biological processes leading to development of a particular pathology. The work presented in this paper reports on a novel method for segmentation of cellular structures (nuclei and cell boundaries) from 3D single channel actin tagged fluorescence confocal microscopy images. The proposed segmentation method uses histogram-based image similarity measure in a level-set active-contour framework. The novelty of the method is in application of the alpha-divergence distance measure which can be seen as a generalization of classic Kullback-Leibler and χ² measures. The resulting alpha-divergence level-set formulation leads to a single front evolution formula for both nuclei and cell boundaries segmentation, with no requirements for any enhancement or preprocessing of acquired cell images (a monolayer of human cells (PNT2) culture).
Download

Paper Nr: 118
Title:

AN EFFECTIVE METHOD FOR IMAGE MATCHING BASED ON MODIFIED LBP AND SIFT

Authors:

Yinan Wang, Nuo Zhang, Toshinori Watanabe and Hisashi Koga

Abstract: Scale Invariant Feature Transform (SIFT) is a very powerful and popular descriptor for image registration, which is commonly used in feature matching. However, there is still a need for improvement with respect to the matching accuracy of SIFT. In this paper, we present a method that combines a modified LBP with SIFT for more reliable feature matching. The main idea of the proposed method is to extract spatially enhanced image features with a modified Local Binary Pattern (LBP) from the images before applying the Difference-of-Gaussian (DoG) step in SIFT. The proposed method is also robust to illumination changes, rotation and scaling of images. Experimental results show significant improvement over the original SIFT.
Download

Paper Nr: 154
Title:

A GRAPH-BASED SIGNATURE GENERATION FOR PEOPLE RE-IDENTIFICATION IN A MULTI-CAMERA SURVEILLANCE SYSTEM

Authors:

T. D'Orazio and C. Guaragnella

Abstract: In this paper we investigate the problem of people re-identification in the case of non-overlapping, uncalibrated cameras. We propose a novel method for signature generation that considers both color and spatial features along a video sequence and a distance measure to estimate the similarity between silhouettes. A graph-based representation has been introduced to model different people, in which uniform regions represent nodes and contiguities among regions represent edges. Comparisons with a standard approach based on histogram similarity have been provided to evaluate the proposed methodology.
Download

Paper Nr: 167
Title:

RIGID TEMPLATE REGISTRATION IN MET IMAGES USING CUDA

Authors:

Lennart Svensson, Johan Nysjö, Anders Brun, Ingela Nyström and Ida-Maria Sintorn

Abstract: Rigid registration is a basic tool in many applications, especially in Molecular Electron Tomography (MET), and also in, e.g., registration of rigid implants in medical images and as initialization for deformable registration. As MET volumes have a low signal-to-noise ratio, a complete search of the six-dimensional (6D) parameter space is often employed. In this paper, we describe how rigid registration with normalized cross-correlation can be implemented on the GPU using NVIDIA’s parallel computing architecture CUDA. We compare the performance to the Colores software and two Matlab implementations, one of which uses the GPU-accelerated JACKET library. With well-aligned padding and using CUDA, the performance increases by an order of magnitude, making it feasible to work with three-dimensional fitness landscapes, here denoted scoring volumes, that are generated on the fly. This will eventually enable biologists to interactively register macromolecule chains in MET volumes piece by piece.
Download
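
Illustrative sketch (not taken from the paper): the scoring-volume idea in Paper Nr 167 can be pictured with a plain CPU version of normalized cross-correlation. The sketch below assumes a translation-only search, whereas the abstract describes a full 6D search that also covers rotations, and it uses a brute-force NumPy loop rather than CUDA. The function names are hypothetical.

```python
import numpy as np


def normalized_cross_correlation(patch, template):
    """Zero-mean normalized cross-correlation of two equally shaped arrays."""
    a = patch.astype(np.float64) - patch.mean()
    b = template.astype(np.float64) - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    if denom == 0.0:
        # A constant patch or template carries no structure to correlate.
        return 0.0
    return float((a * b).sum() / denom)


def scoring_volume(volume, template):
    """Brute-force NCC score for every valid translation of the template.

    Assumption: rotations are ignored here; a 6D search would repeat this
    for each sampled rotation of the template.
    """
    tz, ty, tx = template.shape
    out_shape = tuple(v - t + 1 for v, t in zip(volume.shape, template.shape))
    scores = np.empty(out_shape)
    for z in range(out_shape[0]):
        for y in range(out_shape[1]):
            for x in range(out_shape[2]):
                patch = volume[z:z + tz, y:y + ty, x:x + tx]
                scores[z, y, x] = normalized_cross_correlation(patch, template)
    return scores
```

The position of the maximum in the returned array is the best translation under this score. The triple loop is the part a GPU implementation would parallelize.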

Paper Nr: 175
Title:

TETRACHROMATIC COLOUR SPACES - Spherical and Toroidal Hue Spaces

Authors:

Alfredo Restrepo Palacios

Abstract: From a 4-hypercube colour space, spaces of the type hue-saturation-luminance are derived. The hue component may have the topology of a 2-sphere, a 2-torus or a 3-sphere, in several possibilities we consider.
Download

Paper Nr: 178
Title:

COMPUTATIONAL MODEL FOR PROBABILITY PREDICTION OF SCAN PATHS IN STATIC SCENES

Authors:

Yorie Nakahira and Minoru Nakayama

Abstract: We develop a computational model of scan paths when viewing static images. The proposed scan path model generates a dynamic distribution of visual attention using multiple image processing algorithms based on biological principles. The probability of any scan path is computed from this distribution of visual attention at each subsequent numbered fixation. The validity of our model is tested using eye movement data. Our results verify the possibility of conventionally infeasible modeling of the scan paths for static images.
Download

Paper Nr: 183
Title:

A COMPARISON BETWEEN BACKGROUND SUBTRACTION ALGORITHMS USING A CONSUMER DEPTH CAMERA

Authors:

Klaus Greff, André Brandão, Stephan Krauß, Didier Stricker and Esteban Clua

Abstract: Background subtraction is an important preprocessing step in many modern Computer Vision systems. Much work has been done, especially in the field of color-image-based foreground segmentation. But the task is not an easy one, so state-of-the-art background subtraction algorithms are complex both in programming logic and in run time. Depth cameras might offer a compelling alternative to those approaches, because depth information seems to be better suited for the task. But this topic has not been studied much yet, even though the release of Microsoft’s Kinect has brought depth cameras to public attention. In this paper we strive to fill this gap by examining some well-known background subtraction algorithms for use with depth images. We propose some necessary adaptations and evaluate them on three different video sequences using ground truth data. The best choice turns out to be a very simple and fast method that we call minimum background.
Download

Paper Nr: 193
Title:

TEXTURE IMAGE ANALYSIS USING LBP AND DATA COMPRESSION

Authors:

Nuo Zhang and Toshinori Watanabe

Abstract: Texture classification is an important technology widely applied in many fields of image processing. In this study, a novel representation method for texture images is proposed. The proposed approach uses data compression to search for the essential features of frequent patterns in texture images. Furthermore, to deal with the difficulties caused by varying photographic conditions, the local binary pattern (LBP) is introduced into the proposed approach to reduce the number of pattern varieties in a texture image. A compressibility vector space is adopted in this study instead of a learning phase. Based on the patterns extracted by the LBP operator, which are invariant to monotonic gray-level transformations, data compression helps extract the longest and most frequent features. These features provide high analytical ability for texture images. The simulation results show good performance of our approach.
Download
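
Illustrative sketch (not taken from the paper): the two ingredients named in Paper Nr 193 can be tried out in a few lines. The sketch below assumes the basic 3x3, 8-neighbour LBP operator and uses the zlib-compressed size of the LBP code map as a stand-in for compressibility. The paper's compressibility vector space is not specified in the abstract and is not reproduced here, and the function names are hypothetical.

```python
import zlib

import numpy as np


def lbp_8(image):
    """Basic 3x3 LBP codes for the interior pixels of a 2D gray-level image."""
    img = np.asarray(image)
    h, w = img.shape
    center = img[1:-1, 1:-1]
    # Neighbour offsets, clockwise starting from the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= center).astype(np.uint8) * np.uint8(1 << bit)
    return codes


def compressed_size(codes):
    """Size in bytes of the zlib-compressed code map (assumed proxy only)."""
    return len(zlib.compress(np.ascontiguousarray(codes).tobytes()))
```

Because each code depends only on the ordering of a pixel and its neighbours, any strictly increasing gray-level mapping leaves the code map unchanged, which is the invariance the abstract refers to. A regular texture then yields a smaller compressed size than an irregular one.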

Paper Nr: 198
Title:

IMAGE SEGMENTATION FOR OBJECT DETECTION ON A DEEPLY EMBEDDED MINIATURE ROBOT

Authors:

Alexander Jungmann, Thomas Schierbaum and Bernd Kleinjohann

Abstract: In this paper, an image segmentation approach for object detection on the miniature robot BeBot - a deeply embedded system - is presented. In order to enable the robot to detect and identify objects in its environment by means of its camera, an efficient image segmentation approach was developed. The fundamental algorithm is based on the region growing and region merging concept and identifies homogeneous regions consisting of adjacent pixels with similar color. By internally representing a contiguous block of pixels in terms of run lengths, the computational effort of both the region growing and the region merging operation is minimized. Finally, for subsequent object detection processes, a region is efficiently translated into a statistical feature representation based on discretized moments.
Download

Paper Nr: 208
Title:

DIRECT COMPUTATION OF DEPTH FROM SHADING FOR PERSPECTIVE PROJECTION

Authors:

Kousuke Wakabayashi, Norio Tagawa and Kan Okubo

Abstract: We present a method for recovering shape from shading in which the surface depth is directly computed. The previously proposed method for the same problem assumes that images are captured under parallel projection, and hence it can be used correctly only for objects that are relatively thin compared with their distance from the camera. If this method is formally extended to full perspective projection, complicated differential calculations are required, which gives rise to unstable recovery. In this study, we examine an extension of this method that treats the perspective projection approximately. In order to keep the simplicity of the original method, we propose a simple approximation of the derivative of the surface with respect to the image coordinates.
Download

Paper Nr: 221
Title:

3D TOF CAMERA BASED OBJECT METROLOGY

Authors:

Mohammed Ibrahim M. and Peddaiah Thappeta

Abstract: Range cameras that determine both range and intensity at each pixel have matured in the last decade and are on the verge of revolutionizing the metrology market in retail, automotive, aerospace and many other sectors. In this paper, we present an algorithm for measuring the 3D geometry (height, width and depth) of a rigid object using a Time of Flight (TOF) camera. The method exploits the geometrical structure of the object such that the intensity and range images complement each other for a reliable measurement. We discuss the performance of the algorithm under varying operating conditions.
Download

Paper Nr: 222
Title:

VIDEO BASED FLAME DETECTION - Using Spatio-temporal Features and SVM Classification

Authors:

Kosmas Dimitropoulos, Filareti Tsalakanidou and Nikos Grammalidis

Abstract: Video-based surveillance systems can be used for early fire detection and localization in order to minimize the damage and casualties caused by wildfires. However, reliability of these systems is an important issue and therefore early detection versus false alarm rate has to be considered. In this paper, we present a new algorithm for video based flame detection, which identifies spatio-temporal features of fire such as colour probability, contour irregularity, spatial energy, flickering and spatio-temporal energy. For each candidate region of an image a feature vector is generated and used as input to an SVM classifier, which discriminates between fire and fire-coloured regions. Experimental results show that the proposed methodology provides high fire detection rates with a reasonable false alarm ratio.
Download

Paper Nr: 229
Title:

A NONLINEAR FEATURE FUSION BY VARIADIC NEURAL NETWORK IN SALIENCY-BASED VISUAL ATTENTION

Authors:

Zahra Kouchaki and Ali Motie Nasrabadi

Abstract: This study presents a novel combinational visual attention system which applies both bottom-up and top-down information. It can be employed in further processing such as object detection and recognition. This biologically plausible model uses nonlinear fusion of feature maps instead of simple superposition by employing a specific Artificial Neural Network (ANN) as the combination operator. After 42 feature maps are extracted by Itti’s model, they are weighted purposefully through several training images with their corresponding target masks to highlight the target in the final saliency map. In fact, the weights of the 42 feature maps are proportional to their influence on finding the target in the final saliency map. The lack of bottom-up information is compensated by applying top-down information with available target masks. Our model can automatically detect the conceptual features of the desired object only by considering the target information. We have tried to model the process of combining 42 feature maps to form a saliency map by applying a neural network which resembles a biological neural network. Experimental results, comparing our model with the basic saliency model on a test dataset of 32 images, indicate a noticeable improvement in finding the target in the first hit.
Download

Paper Nr: 244
Title:

AUTOMATIC DETECTION OF PALE PATH AND OVERLAPS IN CHROMOSOME IMAGES USING ADAPTIVE SEARCH TECHNIQUE AND RE-THRESHOLDING

Authors:

Rajeev Ranjan, Akila Subasinghe A., Jagath Samarabandu, Peter K. Rogan and Joan H. M. Knoll

Abstract: Detection and separation of overlapping and touching chromosomes is a critical issue in image analysis applications for cytogenetics where accurately segmented chromosomes are essential. We present a novel method of automatic pale path detection for all types of stained chromosome images. The optimum number of cut-points for each cluster of touching or overlapping chromosomes is obtained and analysed sequentially for the pale paths. A self-adaptive search window searches for the minimum grayscale intensity beginning from the vicinity of a cut-point and propagates gradually until the end of the pale path. Efficient image and area thresholding restricts the faulty detection of touch or overlap in a chromosome cluster.
Download

Paper Nr: 253
Title:

A COMPREHENSIVE AND COMPARATIVE SURVEY OF THE SIFT ALGORITHM - Feature Detection, Description, and Characterization

Authors:

L. Younes, B. Romaniuk and E. Bittar

Abstract: The SIFT feature extractor was introduced by Lowe in 1999. This algorithm provides invariant features and the corresponding local descriptors. The descriptors are then used in the image matching process. We propose an overview of this algorithm: the methodology and the tricky steps of its implementation, properties of the detector and descriptor. We analyze the structure of detected features. We finally compare our implementation to others, including Lowe’s.
Download

Paper Nr: 257
Title:

TOWARDS A BIO-INSPIRED APPROACH TO CLUSTERING FACIAL GESTURE IN SUSPECT/NON-SUSPECT PERSONS

Authors:

Manuel Hernandez Hernandez, Pedro Luis Sanchez Orellana, Claudio Castellanos Sanchez and Luis Carlos Ortega Tamez

Abstract: In this paper we present the first steps towards a bio-inspired approach for clustering facial gestures in suspect and non-suspect persons. After detecting the corners of the eyes and mouth, we build an anthropomorphical grid on the face. Next, a convolution inspired by the neurons of V1-MT yields the active neuron ratios and the two principal moments of active neurons for each region in our anthropomorphical grid. The suspect/non-suspect classification emerges from the difference between the ratios of symmetrical regions in the face.

Paper Nr: 269
Title:

IMAGE QUALITY ASSESSMENT BY SALIENCY MAPS

Authors:

Edoardo Ardizzone and Alessandro Bruno

Abstract: Image Quality Assessment (IQA) is an interesting challenge for image processing applications. The goal of IQA is to replace human judgement of perceived image quality with a machine evaluation. A large number of methods have been proposed to evaluate the quality of an image which may be corrupted by noise, distorted during acquisition, transmission, compression, etc. Many methods, in some cases, do not agree with human judgment because they are not correlated with human visual perception. In recent years, the most modern IQA models and metrics have considered visual saliency a fundamental issue. The aim of visual saliency is to produce a saliency map that replicates the behaviour of the human visual system (HVS) in the visual attention process. In this paper we show the relationship between different kinds of visual saliency maps and IQA measures. In particular, we perform extensive comparisons between saliency-based IQA measures and traditional objective IQA measures. The saliency literature offers many different approaches to computing saliency maps, and we investigate which is best suited to IQA metrics.
Download

Paper Nr: 273
Title:

A ROBUST BACKGROUND SUBTRACTION ALGORITHM USING THE A ∑-∆ ESTIMATION - Applied to the Visual Analysis of Human Motion

Authors:

Juan Carlos León, Fabio Martínez and Eduardo Romero

Abstract: This paper introduces a novel method for segmenting the human silhouette in video sequences, based on a local version of the classical ∑-∆ filter. A main difference of our approach is that the filter is not pixelwise oriented, but rather region-wise adjusted by using scaled estimations of both the pixel intensity and the horizontal (vertical) gradient, i.e., a multiresolution wavelet decomposition using Haar functions. The classical ∑-∆ filter is independently applied to each component of the obtained feature vector, previously normalized, and a single scalar value is associated with the pixel by averaging the feature vector components. The background is estimated by setting a threshold in a histogram constructed with these integrated values, attempting to maximize the interclass variance. This strategy was evaluated on a set of 6 videos taken from the HumanEva data set. Results show that the proposed algorithm provides a better segmentation of the human silhouette, especially in the limbs, which are critical for human movement analysis.
Download
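
Illustrative sketch (not taken from the paper): the classical pixelwise ∑-∆ background estimate that Paper Nr 273 builds on can be written as a one-line update, in which each background pixel moves by one gray level toward the current frame. The region-wise, wavelet-based variant and the histogram threshold described in the abstract are not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def sigma_delta_update(background, frame):
    """One classical sigma-delta step: move each pixel one level toward the frame."""
    background = np.asarray(background, dtype=np.int32)
    frame = np.asarray(frame, dtype=np.int32)
    return background + np.sign(frame - background)


def estimate_background(frames):
    """Run the update over a sequence of frames, starting from the first frame."""
    frames = iter(frames)
    background = np.asarray(next(frames), dtype=np.int32)
    for frame in frames:
        background = sigma_delta_update(background, frame)
    return background


def absolute_difference(background, frame):
    """Per-pixel |frame - background|, the quantity a detector would threshold."""
    return np.abs(np.asarray(frame, dtype=np.int32) - background)
```

Since the estimate changes by at most one gray level per frame, a briefly passing object barely disturbs it, while a persistent change in the scene is absorbed gradually.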

Paper Nr: 281
Title:

FEATURE-DRIVEN MAXIMALLY STABLE EXTREMAL REGIONS

Authors:

P. Martins, C. Gatta and P. Carvalho

Abstract: The high repeatability of Maximally Stable Extremal Regions (MSERs) on structured images along with their suitability to be combined with either photometric or shape descriptors to solve image matching problems have contributed to establish the MSER detector as one of the most prominent affine covariant detectors. However, the so-called affine covariance that characterizes MSERs relies on the assumption that objects possess smooth boundaries, a premiss that is not always valid. We introduce an alternative domain for MSER detection in which boundary-related features are highlighted and simultaneously delineated under smooth transitions. Detection results on common benchmarks show improvements that are discussed.
Download

Paper Nr: 290
Title:

THE UNIFORM DECOMPOSITION - Properties and Applications

Authors:

Franklin César Flores and Airton Marco Polidorio

Abstract: The uniform decomposition is a hierarchical segmentation method whose main property is the preservation of a given characteristic for each subset of the partition provided by the decomposition. Its computation is incremental and, in each iteration, a candidate region is split only if a valid split is found in which each new subset respects an input criterion, for instance size or shape. This paper discusses some properties of the uniform decomposition and proposes several applications for this decomposition such as hierarchical segmentation, stereo vision pre-processing and interactive segmentation.
Download

Paper Nr: 294
Title:

HIERARCHICAL AND SPATIAL-COLORIMETRIC MODEL TO DETECT MOVING TARGETS

Authors:

C. Gabard, C. Achard, L. Lucat and P. Sayd

Abstract: Background subtraction is often one of the first tasks involved in video surveillance applications. Classical methods only use temporal modelling of the background pixels. Using pixel blocks of fixed size allows robust detection, but these approaches lead to a loss of precision. We propose in this paper a model of the scene which combines a temporal and local model with a spatial model. This whole representation of the scene models both fixed elements (background) and mobile ones. This improves detection accuracy by transforming the detection problem into a two-class classification problem.
Download

Area 3 - Image and Video Understanding

Full Papers
Paper Nr: 100
Title:

AUTOMATIC PROCESS TO BUILD A CONTEXTUALIZED DETECTOR

Authors:

Thierry Chesnais, Nicolas Allezard, Yoann Dhome and Thierry Chateau

Abstract: This article tackles the real-time pedestrian detection problem using a stationary uncalibrated camera. More precisely, we try to specialize a classifier by taking into account the context of the scene. To achieve this goal, we introduce an offline semi-supervised approach which uses an oracle. The latter must automatically label a video in order to obtain contextualized training data. The proposed oracle is composed of several detectors. Each of them is trained on a different signal: appearance, background subtraction and optical flow signals. Then we merge their responses and keep the most confident detections. A specialized detector is then built on the resulting dataset. Designed for improving the camera network installation procedure, the presented method is completely automatic and does not need any knowledge about the scene.
Download

Paper Nr: 138
Title:

LEARNING OBJECT SEGMENTATION USING A MULTI NETWORK SEGMENT CLASSIFICATION APPROACH

Authors:

S. Albertini, I. Gallo, M. Vanetti and A. Nodari

Abstract: In this study we propose a new strategy to perform object segmentation using a multi neural network approach. We started by extending our previously presented object detection method with a new segment-based classification strategy. The result obtained is a segmentation map post-processed by a phase that exploits the GrabCut algorithm to obtain fairly precise and sharp edges of the object of interest in a fully automatic way. We tested the new strategy on a commercial clothing dataset, obtaining a substantial improvement in the quality of the segmentation results compared with our previous method. The segment classification approach we propose achieves the same improvement on a subset of the Pascal VOC 2011 dataset, which is a recent standard segmentation dataset, obtaining a result which is in line with the state of the art.
Download

Paper Nr: 182
Title:

HUMAN AGE ESTIMATION VIA GEOMETRIC AND TEXTURAL FEATURES

Authors:

Merve Kilinc and Yusuf Sinan Akgul

Abstract: The aging process of a person is influenced by many factors such as genetics, health, lifestyle, and even weather conditions. Therefore human age estimation from a face image is a challenging problem. Aging causes significant variations in facial shape and texture across years. In order to construct a general age classifier, shape and texture information of the human face should be used together. In this paper, we propose a new age estimation system that uses a number of overlapping age groups and a classifier that combines geometric and textural facial features. The classifier scoring results are interpolated to produce the estimated age. We tested many geometric and textural facial features with age group classifiers. Comparative experiments show that the best performance is obtained using the fusion of Local Gabor Binary Patterns and Geometric features.
Download

Paper Nr: 223
Title:

HETEROGENEOUS ADABOOST WITH REAL-TIME CONSTRAINTS - Application to the Detection of Pedestrians by Stereovision

Authors:

Loïc Jourdheuil, Nicolas Allezard, Thierry Chateau and Thierry Chesnais

Abstract: This paper presents a learning-based method for pedestrian detection, combining appearance and depth map descriptors. Recent works have presented the added value of this combination. We propose two contributions: 1) a comparative study of various depth descriptors including a fast descriptor based on average depth in a sub-window of the tested area and 2) an adaptation of the Adaboost algorithm in order to handle heterogeneous descriptors in terms of computational cost. Our goal is to build a detector balancing detection rate and execution time. We show the relevance of the proposed algorithm on real video data.
Download

Paper Nr: 249
Title:

DETECTOR OF FACIAL LANDMARKS LEARNED BY THE STRUCTURED OUTPUT SVM

Authors:

Michal Uřičař, Vojtěch Franc and Václav Hlaváč

Abstract: In this paper we describe a detector of facial landmarks based on the Deformable Part Models. We treat the task of landmark detection as an instance of the structured output classification problem. We propose to learn the parameters of the detector from data by the Structured Output Support Vector Machines algorithm. In contrast to the previous works, the objective function of the learning algorithm is directly related to the performance of the resulting detector which is controlled by a user-defined loss function. The resulting detector is real-time on a standard PC, simple to implement and it can be easily modified for detection of a different set of landmarks. We evaluate performance of the proposed landmark detector on a challenging “Labeled Faces in the Wild” (LFW) database. The empirical results demonstrate that the proposed detector is consistently more accurate than two public domain implementations based on the Active Appearance Models and the Deformable Part Models. We provide an open-source implementation of the proposed detector and the manual annotation of the facial landmarks for all images in the LFW database.
Download

Paper Nr: 271
Title:

STATIC POSE ESTIMATION FROM DEPTH IMAGES USING RANDOM REGRESSION FORESTS AND HOUGH VOTING

Authors:

Brian Holt and Richard Bowden

Abstract: Robust and fast algorithms for estimating the pose of a human given an image would have a far reaching impact on many fields in and outside of computer vision. We address the problem using depth data that can be captured inexpensively using consumer depth cameras such as the Kinect sensor. To achieve robustness and speed on a small training dataset, we formulate the pose estimation task within a regression and Hough voting framework. Our approach uses random regression forests to predict joint locations from each pixel and accumulate these predictions with Hough voting. The Hough accumulator images are treated as likelihood distributions where maxima correspond to joint location hypotheses. We demonstrate our approach and compare to the state-of-the-art on a publicly available dataset.
Download
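
Illustrative sketch (not taken from the paper): the accumulation step described in Paper Nr 271 can be pictured as summing per-pixel location votes into an image and reading off the maximum. The sketch below assumes the votes are already available as (row, column) predictions, standing in for the output of the random regression forests, and handles a single joint. The function names are hypothetical.

```python
import numpy as np


def hough_accumulate(votes, shape, weights=None):
    """Sum (row, col) votes into an accumulator image of the given shape."""
    votes = np.asarray(votes, dtype=float)
    if weights is None:
        weights = np.ones(len(votes))
    weights = np.asarray(weights, dtype=float)
    rows = np.rint(votes[:, 0]).astype(int)
    cols = np.rint(votes[:, 1]).astype(int)
    # Votes that fall outside the image are discarded rather than clipped.
    inside = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])
    accumulator = np.zeros(shape)
    # np.add.at accumulates correctly when several votes hit the same cell.
    np.add.at(accumulator, (rows[inside], cols[inside]), weights[inside])
    return accumulator


def strongest_hypothesis(accumulator):
    """Location of the accumulator maximum, read as the joint hypothesis."""
    index = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return tuple(int(i) for i in index)
```

In practice the accumulator is usually smoothed before the maximum is taken, so that nearby votes reinforce each other. That step is omitted here.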

Short Papers
Paper Nr: 76
Title:

INTEGRATION OF GENERATIVE LEARNING AND MULTIPLE POSE CLASSIFIERS FOR PEDESTRIAN DETECTION

Authors:

Hidefumi Yoshida, Daisuke Deguchi, Ichiro Ide, Hiroshi Murase, Kunihiro Goto, Yoshikatsu Kimura and Takashi Naito

Abstract: Recently, pedestrian detection from in-vehicle camera images is becoming an important technology in ITS (Intelligent Transportation System). However, it is difficult to detect pedestrians stably due to the variety of their poses and their backgrounds. To tackle this problem, we propose a method to detect various pedestrians from in-vehicle camera images by using multiple classifiers corresponding to various pedestrian pose classes. Since pedestrians’ pose varies widely, it is difficult to construct a single classifier that can detect pedestrians with various poses stably. Therefore, this paper constructs multiple classifiers optimized for variously posed pedestrians by classifying pedestrian images into multiple pose classes. Also, to reduce the bias and the cost for preparing numerous pedestrian images for each pose class for learning, the proposed method employs a generative learning method. Finally, the proposed method constructs multiple classifiers by using the synthesized pedestrian images. Experimental results showed that the detection accuracy of the proposed method outperformed comparative methods, and we confirmed that the proposed method could detect variously posed pedestrians stably.
Download

Paper Nr: 80
Title:

AUTOMATIC OBJECTS DETECTION FOR MODELING INDOOR ENVIRONMENTS

Authors:

Marion Decrouez, Romain Dupont, François Gaspard and James L. Crowley

Abstract: In this paper we describe a new solution for constructing a model of a scene and its objects using various explorations of a single camera in an unknown environment. Object motion presents a difficult challenge to scene modeling. The proposed method combines metric localization and place recognition to detect and model objects without a priori knowledge and to incrementally extend a scene model by adding new places and objects. We demonstrate the quality of our approach with results from image sequences taken from two different scenes.
Download

Paper Nr: 98
Title:

A BIO-INSPIRED LEARNING AND CLASSIFICATION METHOD FOR SUBCELLULAR LOCALIZATION OF A PLASMA MEMBRANE PROTEIN

Authors:

Wafa Bel Haj Ali, Paolo Piro, Lydie Crescence, Dario Giampaglia, Oumelkheir Ferhat, Jacques Darcourt, Thierry Pourcher and Michel Barlaud

Abstract: High-content cellular imaging is an emerging technology for studying many biological phenomena. Statistical analyses on large populations (more than thousands) of cells are required. Hence, classification of cells by experts is a very time-consuming and poorly reproducible task. In order to overcome such limitations, we propose an automatic supervised classification method. Our new cell classification method consists of two steps: the first is an indexing process based on specific bio-inspired features using contrast information distributions on cell sub-regions. The second is a supervised learning process to select prototypical samples (those that best represent the cell categories), which are used in a leveraged k-NN framework to predict the class of unlabeled cells. In this paper we have tested our new learning algorithm on cellular images acquired for the analysis of changes in the subcellular localization of a membrane protein (the sodium iodide symporter). In order to evaluate the automatic classification performance, we tested our algorithm on a significantly large database of cellular images annotated by experts of our group. Results in terms of Mean Average Precision (MAP) are very promising, with precision above 87% on average, thus suggesting our method as a valuable decision-support tool in such cellular imaging applications. Such a supervised classification method has many other applications in cell imaging in the areas of research in basic biology and medicine, but also in clinical histology.
Download

Paper Nr: 105
Title:

AN AFFINE INVARIANT DESCRIPTOR FOR ACTION RECOGNITION

Authors:

Parvin Razzaghi, Niloofar Gheissari and Maziar Palhang

Abstract: The aim of this paper is to introduce an affine invariant descriptor for the action volume. Human motion is completely represented by the action volume, which is constructed by stacking human silhouettes in consecutive frames (without any need to compute point correspondence). The action volume comprehensively contains spatial and temporal information about an action. An affine invariant action descriptor based upon functions of spherical harmonics coefficients is introduced to describe this spatio-temporal volume (STV). The harmonic nature of this descriptor makes it robust to noise and outliers. We applied the proposed descriptor to the KTH, Weizmann and IXMAS datasets and compared the performance of our algorithm to competing methods available in the literature. The results of our experiments show that our method has a superior or comparable performance to the most successful and recent existing algorithms.

Paper Nr: 119
Title:

DETECTION AND LOCALISATION OF STATIONARY OBJECTS WITH A PAIR OF PTZ CAMERAS

Authors:

Constant Guillot, Quoc-Cuong Pham, Patrick Sayd, Christophe Tilmant and Jean-Marc Lavest

Abstract: We propose a novel approach for detecting and localising stationary objects using a pair of PTZ cameras monitoring a wide scene. First we propose a stationary object detection and labelling algorithm. It relies on the re-identification of foreground blocks of the image and an MRF framework to detect and separate the stationary objects of the scene. Second we propose a geometric approach for robustly matching the detected silhouettes of stationary objects from a pair of PTZ cameras. Our system is tested on challenging sequences which prove its robustness to occlusions even in an unknown non planar 3D scene.
Download

Paper Nr: 130
Title:

SEMANTIC SEGMENTATION USING GRABCUT

Authors:

Christoph Göring, Björn Fröhlich and Joachim Denzler

Abstract: This work analyzes how to utilize the power of the popular GrabCut algorithm for the task of pixel-wise labeling of images, which is also known as semantic segmentation and is an important step for scene understanding in various application domains. In contrast to the original GrabCut, the aim of the presented methods is to segment objects in images in a completely automatic manner and label them as one of the previously learned object categories. In this paper, we introduce and analyze two different approaches that extend GrabCut to make use of training images. C-GrabCut generates multiple class-specific segmentations and classifies them by using shape and color information. L-GrabCut uses as a first step an object localization algorithm, which returns a classified bounding box as a hypothesis of an object in the image. Afterwards, this hypothesis is used as an initialization for the GrabCut algorithm. In our experiments, we show that both methods lead to similar results and demonstrate their benefits compared to semantic segmentation methods only based on local features.
Download

Paper Nr: 189
Title:

HUMAN RE-IDENTIFICATION THROUGH DISTANCE METRIC LEARNING BASED ON JENSEN-SHANNON KERNEL

Authors:

Yoshihisa Ijiri, Shihong Lao, Tony X. Han and Hiroshi Murase

Abstract: Human re-identification, i.e., human identification across cameras without an overlapping view, has important applications in video surveillance. The problem is very challenging due to color and illumination variations among cameras as well as the pose variations of people. Assuming that the color of human clothing does not change quickly, previous work relied on color histogram matching of clothing. However, naive color histogram matching across a camera network is not robust enough for human re-identification. Therefore, we learned an optimal distance metric between color histograms using a training dataset. The Jensen-Shannon kernel is proposed to learn nonlinear distance metrics. The effectiveness of the proposed method is validated by experimental results.
Download
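
As a rough illustration of the quantities named in the abstract above, the following sketch computes the Jensen-Shannon divergence between two color histograms and turns it into a similarity value. The exponential form of the kernel, the `gamma` parameter, and all function names are assumptions made for this example; the abstract does not state which kernel definition or metric learning procedure the authors use.

```python
import math


def normalize(hist):
    """Scale a non-negative histogram so that its bins sum to 1."""
    total = float(sum(hist))
    if total <= 0:
        raise ValueError("histogram must have positive mass")
    return [h / total for h in hist]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q); bins with p_i = 0 contribute 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence (natural log), bounded by ln 2."""
    m = [(pi + qi) / 2.0 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_kernel(p, q, gamma=1.0):
    """Hypothetical kernel form: exp(-gamma * JSD). An assumption, not the paper's definition."""
    return math.exp(-gamma * js_divergence(p, q))


# Two small example histograms (made-up bin counts).
hist_a = normalize([8, 1, 1])
hist_b = normalize([1, 1, 8])
similarity = js_kernel(hist_a, hist_b)
```

Identical histograms give a kernel value of 1, and the value decreases as the color distributions diverge.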

Paper Nr: 190
Title:

A FAST VOTING-BASED TECHNIQUE FOR HUMAN ACTION RECOGNITION IN VIDEO SEQUENCES

Authors:

Duc-Hieu Tran and Wooi-Boon Goh

Abstract: Human action recognition has been an active research area in recent years. However, building a robust human action recognition system still remains a challenging task due to the large variations in action classes, varying human appearances, illumination changes, camera motion, occlusions and background clutter. Most previous work focuses on the goal of improving recognition rates. This paper describes a computationally fast voting-based approach for human action recognition, in which the action in the video sequence is recognized based on the support of the local spatio-temporal features. The proposed technique requires no parameter tuning and can produce recognition rates that are comparable to those in recently published literature. Moreover, the technique can localize the single human action in the video sequence without much additional computation. Recognition results on the KTH and Weizmann action datasets are presented.
Download

Paper Nr: 213
Title:

TRAFFIC LIGHTS DETECTION IN ADVERSE CONDITIONS USING COLOR, SYMMETRY AND SPATIOTEMPORAL INFORMATION

Authors:

George Siogkas, Evangelos Skodras and Evangelos Dermatas

Abstract: This paper proposes the use of a monocular video camera for traffic lights detection, in a variety of conditions, including adverse weather and illumination. The system incorporates a color pre-processing module to enhance the discrimination of red and green regions in the image and handle the “blooming effect” that is often observed in such scenes. The fast radial symmetry transform is utilized for the detection of traffic light candidates and finally false positive results are minimized using spatiotemporal persistency verification. The system is qualitatively assessed in various conditions, including driving in the rain, at night and on city roads with dense traffic, as well as combinations of these conditions. It is also quantitatively assessed on a publicly available manually annotated database, scoring high detection rates.
Download

Paper Nr: 239
Title:

UNITY IN DIVERSITY: DISCOVERING TOPICS FROM WORDS - Information Theoretic Co-clustering for Visual Categorization

Authors:

Ashish Gupta and Richard Bowden

Abstract: This paper presents a novel approach to learning a codebook for visual categorization that resolves the key issue of intra-category appearance variation found in complex real world datasets. The codebook of visual-topics (semantically equivalent descriptors) is made by grouping visual-words (syntactically equivalent descriptors) that are scattered in feature space. We analyze the joint distribution of images and visual-words using information theoretic co-clustering to discover visual-topics. Our approach is compared with the standard ‘Bag-of-Words’ approach. The statistically significant performance improvement in all the datasets utilized (Pascal VOC 2006; VOC 2007; VOC 2010; Scene-15) establishes the efficacy of our approach.
Download

Paper Nr: 240
Title:

ON-LINE ACTION RECOGNITION FROM SPARSE FEATURE FLOW

Authors:

Hildegard Kuehne, Dirk Gehrig, Tanja Schultz and Rainer Stiefelhagen

Abstract: The fast and robust recognition of human actions is an important aspect for many video-based applications in the field of human computer interaction and surveillance. Although current recognition algorithms provide more and more advanced results, their usability for on-line applications is still limited. To bridge this gap, an on-line video-based action recognition system is presented that combines histograms of sparse feature point flow with an HMM-based action recognition. The usage of feature point motion is computationally more efficient than the more common histograms of optical flow (HoF) while reaching a similar recognition accuracy. For recognition we use low-level action units that are modeled by Hidden-Markov-Models (HMM). They are assembled by a context-free grammar to recognize complex activities. The concatenation of small action units to higher level tasks allows the robust recognition of action sequences as well as a continuous on-line evaluation of the ongoing activity. The average runtime is around 34 ms for processing one frame and around 20 ms for calculating one hypothesis for the current action. Assuming that one hypothesis per second is needed, the system can provide a mean capacity of 25 fps. The system's accuracy is compared with state of the art recognition results on a common benchmark dataset as well as with a marker-based recognition system, showing similar results for the given evaluation scenario. The presented approach can be seen as a step towards the on-line evaluation and recognition of human motion directly from video data.
Download

Paper Nr: 276
Title:

SEMANTIC CLASS DETECTORS IN VIDEO GENRE RECOGNITION

Authors:

Michal Hradiš, Ivo Řezníček and Kamil Behúň

Abstract: This paper presents our approach to video genre recognition, which we developed for the MediaEval 2011 evaluation. We treat the genre recognition task as a classification problem. We encode visual information in a standard way using local features and a Bag of Words (BOW) representation. The audio channel is parameterized in a similar way, starting from its spectrogram. Further, we exploit available automatic speech transcripts and user-generated meta-data, for which we compute BOW representations as well. It is reasonable to expect that the semantic content of a video is strongly related to its genre, and if this semantic information were available it would make genre recognition simpler and more reliable. To this end, we used annotations for 345 semantic classes from the TRECVID 2011 semantic indexing task to train semantic class detectors. Responses of these detectors were then used as features for genre recognition. The paper explains the approach in detail, shows the relative performance of the individual features and their combinations measured on the MediaEval 2011 genre recognition dataset, and sketches possible future research. The results show that, although meta-data is more informative than the content-based features, results are improved by adding content-based information to the meta-data. Despite the fact that the semantic detectors were trained on a completely different dataset, using them as feature extractors on the target dataset provides better results than the original low-level audio and video features.
Download

Paper Nr: 277
Title:

A COMPREHENSIVE ANALYSIS OF HUMAN MOTION CAPTURE DATA FOR ACTION RECOGNITION

Authors:

Valsamis Ntouskos, Panagiotis Papadakis and Fiora Pirri

Abstract: In this paper, we present an analysis of human motion that can assist the recognition of human actions irrespective of the selection of particular features. We begin with an analysis of the entire set of preclassified motions in order to derive the generic characteristics of articulated human motion and complement the analysis with a more detailed inter-class analysis. The statistical analysis concerns features that describe the significance/contribution of the human joints in performing an action. Furthermore, we adopt a hierarchical analysis of the human body itself in the study of different actions, by grouping joints that share common characteristics. We present our experiments on standard databases for human motion capture data as well as a new commercial dataset with additional classes of human motion and highlight certain interesting results.
Download

Paper Nr: 3
Title:

AUTOMATIC SUSPICIOUS BEHAVIOR DETECTION FROM A SMALL BOOTSTRAP SET

Authors:

Kan Ouivirach, Shashi Gharti and Matthew N. Dailey

Abstract: We propose and evaluate a new method for automatic identification of suspicious behavior in video surveillance data. It partitions the bootstrap set into clusters, then assigns new observation sequences to clusters based on statistical tests of HMM log likelihood scores. In an evaluation on a real-world testbed video surveillance data set, the method achieves a false alarm rate of 7.4% at a 100% hit rate. It is thus a practical and effective solution to the problem of inducing scene-specific statistical models useful for bringing suspicious behavior to the attention of human security personnel.
Download

Paper Nr: 8
Title:

ANALYZING INVARIANCE OF FREQUENCY DOMAIN BASED FEATURES FROM VIDEOS WITH REPEATING MOTION

Authors:

Kahraman Ayyildiz and Stefan Conrad

Abstract: This paper discusses an approach that allows videos to be classified by their frequency spectra. Many videos contain activities with repeating movements. Sports videos, home improvement videos, or videos showing mechanical motion are some example areas. Motion in these areas usually repeats with a certain main frequency and several side frequencies. Transforming repeating motion to its frequency domain via FFT reveals these frequency features. In this paper we explain how to compute frequency features for video clips and how to use them for classification. The experimental stage of this work focuses on the invariance of these features with respect to rotation, reflection, scaling, translation and time shift.
Download
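
The idea of reading a main frequency off the spectrum of a repeating motion signal can be sketched as follows. This is only an illustration under stated assumptions: the one-dimensional per-frame motion signal, the function names, and the frame rate are hypothetical, and a naive discrete Fourier transform from the standard library stands in for the FFT mentioned in the abstract (it yields the same spectrum, only more slowly).

```python
import cmath
import math


def magnitude_spectrum(signal):
    """Magnitudes of the discrete Fourier transform of a real-valued signal."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal)))
        for k in range(n)
    ]


def dominant_frequency(signal, fps):
    """Frequency in Hz of the strongest non-DC bin, given the frame rate."""
    n = len(signal)
    mags = magnitude_spectrum(signal)
    k = max(range(1, n // 2 + 1), key=lambda i: mags[i])
    return k * fps / n


# Hypothetical motion signal: 64 frames at 32 fps containing 5 full cycles.
frames = 64
motion = [math.sin(2 * math.pi * 5 * t / frames) for t in range(frames)]
main_freq = dominant_frequency(motion, fps=32.0)

# A circular time shift changes the phases but not the magnitudes.
shifted = motion[10:] + motion[:10]
```

The magnitude spectrum of `shifted` matches that of `motion` up to rounding, which is the kind of time-shift invariance the abstract refers to.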

Paper Nr: 39
Title:

FAST NEAREST NEIGHBOR SEARCH IN PSEUDOSEMIMETRIC SPACES

Authors:

Markus Lessmann and Rolf P. Würtz

Abstract: Nearest neighbor search in metric spaces is an important task in pattern recognition because it allows a query pattern to be associated with a known pattern from a learned dataset. In low-dimensional spaces a lot of good solutions exist that minimize the number of comparisons between patterns by partitioning the search space using tree structures. In high-dimensional spaces tree methods become useless because they fail to prevent scanning almost the complete dataset. Locality sensitive hashing methods solve the task approximately by grouping patterns that are nearby in search space into buckets. Therefore an appropriate hash function has to be known that is highly likely to assign a query pattern to the same bucket as its nearest neighbor. This works fine as long as all the patterns are of the same dimensionality and exist in the same vector space with a complete metric. Here, we propose a locality-sensitive hashing-scheme that is able to process patterns which are built up of several possibly missing subpatterns causing the patterns to be in vector spaces of different dimensionality. These patterns can only be compared using a pseudosemimetric.
Download

Paper Nr: 61
Title:

A REAL WORLD DETECTION SYSTEM - Combining Color, Shape and Appearance to Enable Real-time Road Sign Detection

Authors:

Peng Wang, Jianmin Li and Bo Zhang

Abstract: Although specific object detection has undergone great advances in recent years, its application to critical real-time circumstances like those in automated vehicle control is still limited, especially when facing strict speed and precision requirements. This paper uses a hybrid of various computer vision techniques, including color space analysis, a Haar-wavelet cascade detector, contour analysis and RANSAC shape-fitting, to achieve a real-time detection speed while maintaining a reasonable precision and false-alarm level. The result is a practical system that outperformed most rivals in an automated vehicle contest and an indication of the feasibility of applying computer vision to speed-critical areas.
Download

Paper Nr: 79
Title:

SPARSE REPRESENTATIONS AND INVARIANT SEQUENCE-FEATURE EXTRACTION FOR EVENT DETECTION

Authors:

Alexandru P. Condurache and Alfred Mertins

Abstract: We address the problem of detecting unusual actions performed by a human in a video. Broadly speaking, we achieve our goal by matching the observed action to a set of a-priori known actions. If the observed action cannot be matched to any of the known actions (representing the normal case), we conclude that an event has taken place. In this contribution we show how sparse representations of actions can be used for event detection. Our input data are video sequences showing different actions. Special care is taken to extract features from these sequences. The features are chosen such that the sparse-representations paradigm can be applied and they exhibit a set of invariance properties needed for detecting unusual human actions. We test our methods on sequences showing different people performing various actions such as walking or running.
Download

Paper Nr: 95
Title:

GPU ACCELERATED REAL-TIME OBJECT DETECTION ON HIGH RESOLUTION VIDEOS USING MODIFIED CENSUS TRANSFORM

Authors:

Salih Cihan Tek and Muhittin Gökmen

Abstract: This paper presents a novel GPU accelerated object detection system using CUDA. Because of its detection accuracy, speed and robustness to illumination variations, a boosting based approach with Modified Census Transform features is used. Results are given on the face detection problem for evaluation. Results show that even our single-GPU implementation can run in real-time on high resolution video streams without sacrificing accuracy and outperforms the single-threaded and multi-threaded CPU implementations for resolutions ranging from 640×480 to 1920×1080 by a factor of 12-18x and 4-6x, respectively.
Download
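
For readers unfamiliar with the feature named in the abstract above, here is a minimal CPU sketch of a Modified Census Transform as it is commonly described: each pixel of a 3×3 neighborhood, center included, is compared with the neighborhood mean, giving a 9-bit code. The bit ordering, the strict comparison, and the function name are assumptions for illustration; the sketch says nothing about the authors' CUDA implementation or their boosted classifier.

```python
def modified_census_transform(img):
    """Return 9-bit MCT codes for every interior pixel of a 2D intensity grid.

    img is a list of equal-length rows of integer intensities. The output has
    two fewer rows and columns because border pixels lack a full 3x3 patch.
    """
    rows, cols = len(img), len(img[0])
    out = [[0] * (cols - 2) for _ in range(rows - 2)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            patch = [img[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)]
            total = sum(patch)
            code = 0
            for value in patch:
                # value > mean, written as 9 * value > total to stay in integers
                code = (code << 1) | (1 if 9 * value > total else 0)
            out[r - 1][c - 1] = code
    return out


# A bright center pixel on a dark background sets only the center bit.
patch = [[10, 10, 10],
         [10, 50, 10],
         [10, 10, 10]]
codes = modified_census_transform(patch)

# Applying a gain and an offset to the intensities leaves the codes unchanged.
brighter = [[2 * v + 10 for v in row] for row in patch]
```

Because each bit depends only on whether a pixel lies above its local mean, the codes are unaffected by a positive gain and an offset, which is one reason such features are considered robust to illumination changes.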

Paper Nr: 135
Title:

MCMC MODULAR ENSEMBLE TRACKING

Authors:

Thomas Penne, Christophe Tilmant, Thierry Chateau and Vincent Barra

Abstract: Object tracking has become a recurrent problem in video surveillance and is an important domain in computer vision. It was recently approached using classification techniques and still more recently using boosting methods. We propose here a new object tracking method, based on Ensemble Tracking and integrating two main improvements. The first one lies in the separation of the heterogeneous feature space into a set of homogeneous subspaces (modules) and in the application of an Ensemble Tracking-based algorithm on each module. The second one deals with the new tracking problem induced by this separation by building a specific particle filter, weighting each module in order to estimate both position and dimensions of the tracked object and the linear combination of modular decisions leading to the most discriminative observation. Our method is tested on challenging sequences. We demonstrate its performance and compare its robustness with the state of the art.

Paper Nr: 139
Title:

FACIAL ACTION UNIT RECOGNITION AND INFERENCE FOR FACIAL EXPRESSION ANALYSIS

Authors:

Yu Li Xue, Xia Mao and Qing Chang

Abstract: Human facial expressions are extremely varied and can be described by numerous facial action units. Recognizing facial action units helps capture the inner emotion or intention of a person. In this paper, we propose a novel method for facial action unit recognition and inference. We used Gabor wavelets and optical flow for feature extraction, and used a support vector machine and a dynamic Bayesian network for classification and inference, respectively. We combined the advantages of both global and local feature extraction, recognized the most discriminant AUs with multiple classifiers to achieve a high recognition rate, and then inferred the related AUs. Experiments were conducted on the Cohn-Kanade AU-Coded database. The results demonstrated that, compared to earlier research on facial action unit recognition, our method is capable of recognizing more action units and achieves good performance.
Download

Paper Nr: 158
Title:

MELON AUTHENTICATION BY AGRI-BIOMETRICS - Identifying Individual Fruits using a Single Image of Rind Pattern

Authors:

Rui Ishiyama, Yoichi Nakamura, Akira Monden, Lei Huang and Seiji Yoshimoto

Abstract: We propose a new method of biometric authentication, called agri-biometrics, that identifies an individual fruit using a single image of its rind patterns. Our proposed method normalizes the rotations in depth of the fruit and extracts a set of image features, which are compatible with the 'minutiae', from the normalized image; thus, it enables us to apply a state-of-the-art technique of fingerprint matching to identify the rind patterns of fruit. We conducted large-scale experiments to identify/verify 1,776 individual melons in practical situations where the images were taken under different pose and illumination conditions on different days. In the experiments, our method achieved excellent recognition performance, with EER=0.06%. The agri-biometric authentication we propose accomplishes 'verifiable' agri-food traceability and brand protection; once the producers register pictures of their products into the database, anyone can verify the products on hand with the camera of a mobile phone.
Download

Paper Nr: 171
Title:

PASSENGER COUNTING IN PUBLIC RAIL TRANSPORT - Using Head-Shoulder Contour Tracking

Authors:

Pieterjan De Potter, Philippe Belet, Chris Poppe, Steven Verstockt, Peter Lambert and Rik Van de Walle

Abstract: Automated people counting has multiple applications: referring passengers to vehicles with empty seats, gathering statistical information for railway companies to improve their distribution of vehicles, etc. In this paper, a people counting algorithm for public transport vehicles is presented. First, head-shoulder contours are detected by AdaBoost classification of a combination of histogram of oriented gradients features and a color histogram. An integral histogram and integral image are used to speed up the extraction of these features. The results of the classification process are clustered and these clusters are tracked by a Kalman filter using a custom error covariance matrix. Finally, the path followed by an observed person is evaluated in order to count passengers entering and exiting the vehicle. Evaluation shows that this approach performs better than previous approaches, especially in scenarios with occlusions.
Download
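
The speed-up mentioned in the abstract above rests on a standard construction: once an integral image is built, the sum over any axis-aligned rectangle costs four lookups regardless of its size. The sketch below shows that construction only; the function names and the half-open rectangle convention are choices made for this example and are not taken from the paper.

```python
def integral_image(img):
    """Build a summed-area table with one extra leading row and column of zeros.

    ii[r][c] holds the sum of img over rows [0, r) and columns [0, c).
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the original image over rows [top, bottom) and columns [left, right)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
table = integral_image(image)
lower_right = box_sum(table, 1, 1, 3, 3)  # 5 + 6 + 8 + 9
```

An integral histogram applies the same idea with one such table per histogram bin, so the histogram of any window can be read off in constant time per bin.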

Paper Nr: 207
Title:

FOOD REGION DETECTION USING BAG-OF-FEATURES REPRESENTATION AND COLOR FEATURE

Authors:

Ruiko Miyano, Yuko Uematsu and Hideo Saito

Abstract: Food image processing has recently attracted attention, because many people take photos of food. For food image processing, recognition of captured food is an essential technology, but region detection of the food area from captured photos is also a very important procedure for food recognition. In this paper, we propose a novel method for automatic region detection of food from photos using two kinds of features in the input image. To detect food regions, we use a method which is widely used in generic object recognition. We divide an image into small subregions and represent each subregion as a Bag-of-Features representation using local feature descriptors and a color feature. Using the two features, we recognize food subregions and finally connect them as food regions. Our experiments show that the proposed method can detect food regions with high accuracy.
Download

Paper Nr: 227
Title:

A NEW DESCRIPTOR BASED ON 2D DCT FOR IMAGE RETRIEVAL

Authors:

Cong Bai, Kidiyo Kpalma and Joseph Ronsin

Abstract: Content-based image retrieval relies on feature comparison between images, so the selection of the feature vector is important. As many images are compressed by transforms, constructing the feature vector directly in the transform domain is a very popular topic. We propose a new feature vector in the DCT domain. Our method selects part of the DCT coefficients inside each block to construct an AC-Pattern and uses DC coefficients between neighboring blocks to construct a DC-Pattern. Two histograms are formed and parts of them are used to build a descriptor vector integrating features for image retrieval. Experiments are done on both face image databases and a texture image database. Compared to other methods, results show that the proposed method achieves better performance on both face and texture databases.
Download

Paper Nr: 228
Title:

ON THE EFFECT OF PERSPECTIVE DISTORTIONS IN FACE RECOGNITION

Authors:

Zahid Riaz and Michael Beetz

Abstract: Face recognition is one of the most widely studied topics in the image processing and pattern recognition literature. Generally, for face images, the distance between camera and face is larger than the face size; hence, in practice, the effects of perspective distortion on the face edges are often ignored by researchers. However, these effects become more prominent if faces are viewed from different angles. In this paper, we study the effects of perspective distortion and obtain improved results for face recognition against varying view-points. The approach proceeds by fitting a 3D model to the face images and creating a texture map by texture rectification at each triangle level. We compare our results with active appearance models (AAM) on two standard face databases.
Download

Paper Nr: 230
Title:

BRINGING ORDER IN THE BAG OF WORDS

Authors:

Shihong Zhang, Rahat Khan, Damien Muselet and Alain Trémeau

Abstract: This paper presents a method to infuse spatial information in the bag of words (BOW) framework for object categorization. The main idea is to account for the local spatial distribution of the visual words. Rather than finding rigid local patterns, we consider the visual words in close spatial proximity as a pouch of words and we represent the image as a bag of word-pouches. For this purpose, sub-windows are extracted from the images and characterized by local bags of words. Then a clustering step is applied in the local bag of words space to construct the word-pouches. We show that this representation is complementary to the classical BOW. Thus a concatenation of these two representations is used as the final descriptor. Experiments are conducted on two very well known image datasets.
Download

Paper Nr: 288
Title:

A COMBINED TECHNIQUE FOR DETECTING OBJECTS IN MULTIMODAL IMAGES OF PAINTINGS

Authors:

Dmitry Murashov

Abstract: A combined technique for detecting objects in multimodal images based on specific object detectors and image difference measure is presented. The information-theoretical measures of image difference are proposed. The conditions of applicability of these measures for detecting artefacts in multimodal images are formulated. The technique based on the proposed measures is successfully used for detecting repainting and retouching areas in the images of fine-art paintings. It requires segmentation of only one of the analyzed images.
Download

Paper Nr: 292
Title:

REAL-TIME LOCALIZATION OF OBJECTS IN TIME-OF-FLIGHT DEPTH IMAGES

Authors:

Ulrike Thomas

Abstract: In this paper a Random Sample Consensus (Ransac) based algorithm for object localization in time-of-flight depth images is presented. In contrast to many other approaches for pose estimation, the algorithm does not need an initial guess of the object’s pose, yet it is able to find objects in real time. This is achieved by hashing suitable object features in a pre-processing step. The approach is model based and only needs point clouds of objects, which can either be provided by a CAD system or acquired from previously taken measurements. The implemented approach is not a simple Ransac approach, because the algorithm makes use of a more progressive sampling strategy; hence the algorithm presented here is rather a Progressive Sampling Consensus (Prosac) approach. As a consequence, the number of necessary iterations is reduced. The implementation has been evaluated with a couple of exemplary scenarios as they occur in real robotic applications. On the one hand, industrial parts are picked out of a bin and on the other hand everyday objects are located on a table.
Download

Paper Nr: 295
Title:

EXPERIMENTS ON FACIAL CLASSIFICATION IN LOW LIGHT CONDITIONS

Authors:

Peter Paul and Yuheng Wang

Abstract: Robustness of SNoW based face detection using local SMQT features to low light conditions is examined through experimental investigation. Low light conditions are emulated by varying camera aperture and camera exposure time to a xenon flash device in night time conditions. For face detection in the context of vehicle occupancy detection, it was found that reducing the illumination to 25% of that required for a properly exposed image to a human observer resulted in a reduction in face classification score that did not significantly reduce classification performance.
Download

Area 4 - Applications and Services

Full Papers
Paper Nr: 116
Title:

FAST BEAD DETECTION AND INEXACT MICROARRAY PATTERN MATCHING FOR IN-SITU ENCODED BEAD-BASED ARRAY

Authors:

Soumita Ghosh, Andreas Schmidt and Dieter Trau

Abstract: This paper presents an automatic bead detection and bead array pattern matching technique developed for the In-situ Encoded Bead-based Array (IEBA) technology. A supervised learning based bead detection technique robust to irregular illumination variations and noise is developed. An efficient and effective graph matching technique that combines graph spectral analysis and sequence alignment is used to match bead array patterns. The matching algorithm proposed is rotation and scale-invariant. The pattern matching algorithm performs inexact matching and is capable of handling very large numbers of outliers in the target graph as well as a large number of occlusions in the template graph. The matching algorithm uses dynamic programming and can give good time performance dependent only on the number of nodes in the template and target graphs, irrespective of the number of outliers and occlusions. The algorithm can detect and match large numbers of beads in a few seconds.
Download

Paper Nr: 134
Title:

A STEREOSCOPIC AUGMENTED REALITY SYSTEM FOR THE VERIDICAL PERCEPTION OF THE 3D SCENE LAYOUT

Authors:

M. Chessa, M. Garibotti, A. Canessa, A. Gibaldi, S. P. Sabatini and F. Solari

Abstract: The recent diffusion of stereoscopic 3D technologies has led to the development of affordable, everyday-use devices for the visualization of such information. This has paved the way for powerful human-computer interaction systems based on augmented reality environments where humans can interact with both virtual and real tools. However, an observer freely moving in front of a 3D display could experience a misperception of the depth and of the shape of virtual objects. Such distortions can have serious consequences in scientific and medical fields, where a veridical perception is required, and they can cause visual fatigue in consumer and entertainment applications. Here, we propose a novel augmented reality system capable of correctly rendering 3D virtual objects, without adding significant delay, to an observer who changes his position in the real world and acts in the virtual scenario. The correct perception of the scene layout is assessed through two experimental sessions with several observers.
Download

Paper Nr: 285
Title:

3D RECONSTRUCTION OF PLANT ROOTS FROM MRI IMAGES

Authors:

Hannes Schulz, Johannes A. Postma, Dagmar van Dusschoten, Hanno Scharr and Sven Behnke

Abstract: We present a novel method for deriving a structural model of a plant root system from 3D Magnetic Resonance Imaging (MRI) data of soil grown plants. The structural model allows calculation of physiologically relevant parameters. Roughly speaking, MRI images show local water content of the investigated sample. The small, local amounts of water in roots require a relatively high resolution, which results in low SNR images. However, the spatial resolution of the MRI images remains coarse relative to the diameter of typical fine roots, causing many gaps in the visible root system. To reconstruct the root structure, we propose a three step approach: 1) detect tubular structures, 2) connect all pixels to the base of the root using Dijkstra’s algorithm, and 3) prune the tree using two signal strength related thresholds. Dijkstra’s algorithm determines the shortest path of each voxel to the base of the plant root, weighing the Euclidean distance measure by a multi-scale vesselness measure. As a result, paths running within good root candidates are preferred over paths in bare soil. We test this method using both virtually generated MRI images of Maize and real MRI images of Barley roots. In experiments on synthetic data, we show limitations of our algorithm with regard to resolution and noise levels. In addition we show how to use our reconstruction for root phenotyping on real MRI data of Barley roots in soil.
Download
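
The second step described in the abstract above, shortest paths to the root base with distances weighted by vesselness, can be sketched as follows. Several simplifications are assumptions of this example and not details from the paper: the grid is 2D instead of 3D, the step cost is the Euclidean step length divided by the vesselness of the cell being entered (plus a small `eps`), and the vesselness values themselves are made up.

```python
import heapq
import math


def shortest_paths_to_base(vesselness, base, eps=1e-3):
    """Dijkstra's algorithm on an 8-connected grid, starting from the base cell.

    Returns (dist, parent): the cost of the cheapest path from the base to each
    cell, and each cell's predecessor on that path.
    """
    rows, cols = len(vesselness), len(vesselness[0])
    dist = {base: 0.0}
    parent = {base: None}
    heap = [(0.0, base)]
    while heap:
        d, (r, c) = heapq.heappop(heap)
        if d > dist[(r, c)]:
            continue  # stale queue entry
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                # Cheap to move through root-like cells, expensive through soil.
                step = math.hypot(dr, dc) / (vesselness[nr][nc] + eps)
                nd = d + step
                if nd < dist.get((nr, nc), math.inf):
                    dist[(nr, nc)] = nd
                    parent[(nr, nc)] = (r, c)
                    heapq.heappush(heap, (nd, (nr, nc)))
    return dist, parent


def path_to_base(parent, node):
    """Follow predecessor links from a cell back to the base."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


# Made-up vesselness map: an L-shaped "root" along the left and bottom edges.
vessel = [[1.0, 0.01, 0.01],
          [1.0, 0.01, 0.01],
          [1.0, 1.0, 1.0]]
dist, parent = shortest_paths_to_base(vessel, base=(0, 0))
route = path_to_base(parent, (2, 2))
```

In this toy map the cheapest route from the far corner stays on high-vesselness cells instead of cutting straight through the low-vesselness region, which mirrors the abstract's remark that paths within good root candidates are preferred over paths in bare soil.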

Short Papers
Paper Nr: 43
Title:

IMAGE HIDING USING THE DIFFERENCE OF MODIFIED HISTOGRAM AND PREDICTOR

Authors:

Hui-Yu Huang and Shih-Hsu Chang

Abstract: This paper proposes a high-capacity reversible image hiding scheme. It uses pixel prediction and the difference of the modified histogram to compute the best embedding point. Secret data are embedded based on the histogram difference between the current point and its prediction point. This hiding scheme provides a mechanism to embed secret data into a host image and can recover the original image without any distortion after the hidden data have been extracted. Based on the prediction mechanism, our proposed approach can improve the predictive accuracy and increase the pixel difference, thereby gaining hiding capacity. Experimental results demonstrate that, at the same average hiding capacity, the proposed method retains higher image quality than other reversible hiding methods.

Paper Nr: 91
Title:

SEGMENTATION OF PLANAR STRUCTURES IN BIOIMAGING

Authors:

A. Martinez-Sanchez, I. Garcia and J. J. Fernandez

Abstract: This work presents an approach to detection of planar structures in three-dimensional (3D) datasets obtained by different bioimaging modalities. The strategy has already turned out to be effective to segment membranes from 3D volumes in the field of electron tomography, an emerging and powerful technique in structural and cellular biology. This approach can also be useful to detect planar structures in general in other bioimaging modalities. The goal of this position paper is to present this approach to the computer vision community and illustrate the performance on a number of representative bioimaging datasets.
Download

Paper Nr: 122
Title:

REAL-TIME FACE SWAPPING IN VIDEO SEQUENCES - Magic Mirror

Authors:

Nuri Murat Arar, Fatma Güney, Nasuh Kaan Bekmezci, Hua Gao and Hazım Kemal Ekenel

Abstract: Magic Mirror is a face swapping tool that replaces the user’s face with a selected famous person’s face from a database. The system includes a user interface from which the user can select one of the celebrities listed. Upon selection, model fitting automatically starts using the results of face detection and facial feature localization. Model fitting results in a set of points which describe the estimated shape of the user’s face. By using the shape information, the user’s face is replaced with the selected celebrity’s face. After some post-processing for color and lighting adjustments is applied, the final output is displayed to the user. The proposed system is able to run in real time and generates satisfactory face swapping, which can be applied for face de-identification in videos or in entertainment applications.
Download

Paper Nr: 226
Title:

RELEVANCE FEEDBACK AS AN INTERACTIVE NAVIGATION TOOL

Authors:

Daniele Borghesani, Costantino Grana and Rita Cucchiara

Abstract: Image collections are searched in common retrieval systems in many different ways, but the typical presentation is by means of a grid styled view. In this paper we try to suggest a novel use of relevance feedback as a tool to warp the view and allow the user to spatially navigate the image collection, and at the same time focus on his retrieval aim. This is obtained by the use of a distance based space warping on the 2D projection of the distance matrix.
Download

Paper Nr: 250
Title:

3D TEXTURE SYNTHESIS FOR MODELING REALISTIC ORGANIC TISSUES

Authors:

Juan-Carlos Prieto, Chantal Revol-Muller, Françoise Peyrin, Patrizia Camelliti and Christophe Odet

Abstract: Virtual anatomy models show the characteristics of the human body systems in detail. These models are based on a surface representation of the structures and lack information from the interior of the object. Creating models that represent both the surface and the interior of the object and are able to provide pathological information is the current challenge of research in life sciences. We present a method to synthesize realistic three-dimensional organic tissues starting from bidimensional textured multi-channel samples. The method relies on an energy function that measures the difference between the reference texture and the synthesized object, through a distance metric that compares perpendicular neighborhoods in the object to neighborhoods in the sample. When this function is minimized by IRLS, the result is a solid object that resembles the sample at every slice. In some cases, the optimization might be aided by adding the feature distance transform, calculated from a given binary mask. This allows large textured areas to be coded. Multiple textures can also be provided to the optimization in order to create anisotropic textures. We apply our method starting from various micrometric images such as histology images or slices of Synchrotron Radiation Computed Micro-Tomography (SRµCT) images. A major advantage of our method is that it extends 2D histological information to a 3D representation. We demonstrate the accuracy of the generated texture by comparing statistical and morphological parameters computed from the synthetic object with those obtained from the real object underlying the reference images.
Download

Paper Nr: 267
Title:

TEXTURE OVERLAY ONTO NON-RIGID SURFACE USING COMMODITY DEPTH CAMERA

Authors:

Tomoki Hayashi, Francois de Sorbier and Hideo Saito

Abstract: We present a method for overlaying a texture onto a non-rigid surface using a commodity depth camera. Depth cameras are able to capture 3-D data of a surface in real-time, and have several advantages compared with methods using only standard color cameras. However, it is not easy to register a 3-D deformable mesh to a point cloud of the non-rigid surface while keeping its geometrical topology. In order to solve this problem, our method starts by learning many representative meshes to generate surface deformation models. Then, while capturing 3-D data, we register a feasible 3-D mesh to the target surface and overlay a template texture onto the registered mesh. Even if the depth data are noisy or sparse, the learning-based method provides us with a smooth surface mesh. In addition, our method can be applied to real-time applications. In our experiments, we show some augmented reality results of texture overlay onto a non-textured T-shirt.
Download

Paper Nr: 30
Title:

OPTICALLY WRITTEN WATERMARKING TECHNOLOGY USING ONE DIMENSIONAL HIGH FREQUENCY PATTERN

Authors:

Kazutake Uehira and Mizuho Komori

Abstract: We propose a new optically written watermarking technology that uses a one dimensional high frequency pattern to protect the portrait rights of 3-D shaped real objects by preventing the use of images captured illegally with cameras. We conducted experiments using a manikin’s face as a real 3-D object, assuming that this technology would be applied to human faces in the future. We utilized the phase difference between two color component patterns, i.e., binary information was expressed by whether the phases of the high frequency patterns were the same or opposite. The experimental results demonstrated that this technique was robust against the pattern being deformed by the curved surface of the 3-D shaped object, and that an accuracy of 100% in reading out the embedded data was possible by optimizing the conditions under which data were embedded. As a result, we could confirm that the technique we propose is feasible.
Download

Paper Nr: 97
Title:

FAST CALIBRATION METHOD FOR ACTIVE CAMERAS

Authors:

Piero Donaggio and Stefano Ghidoni

Abstract: In this paper a model for active cameras that considers complex camera dynamics and lens distortion is presented. This model is particularly suited for real-time applications, thanks to the low computational load required when the active camera is moved. In addition, a simple technique for interpolating calibration parameters is described, resulting in very accurate calibration over the full range of focal lengths. The proposed system can be employed to enhance the patrolling activity performed by a network of active cameras that supervise large areas. Experiments are also presented, showing the improvement provided over traditional pin-hole camera models.
Download

Paper Nr: 114
Title:

INTEGRITY AUTHENTICATION METHOD FOR JPEG IMAGES USING REVERSIBLE WATERMARKING

Authors:

Hyun-Wu Jo, Dong-Gyu Yeo and Hae-Yeoun Lee

Abstract: With the increasing importance of multimedia security, various multimedia security techniques are being studied. In this paper, we propose a content authentication algorithm based on reversible watermarking which supports JPEG compression, commonly used for multimedia contents. After the image is split into blocks, a specific authentication code for each block is extracted and embedded into the quantized coefficients of JPEG compression, which are preserved against lossy processing. In the decoding process, the watermarked JPEG image is authenticated by extracting the embedded code and restored to the original image quality. To evaluate the performance of the proposed algorithm, we analyzed image quality and compression ratio on various test images. The average PSNR value and compression ratio of the watermarked JPEG image were 33.13dB and 90.65%, respectively, which differ from those of standard JPEG compression by 2.44dB and 1.62%.
Download
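The abstract above reports image quality as PSNR. As a reminder of what that figure measures, here is a minimal sketch of the standard PSNR definition for 8-bit images; it is not taken from the paper, and the function name `psnr` and the flat-list pixel representation are assumptions made for illustration.

```python
import math


def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists.

    Standard definition: 10 * log10(peak^2 / MSE). `peak` is the maximum
    possible pixel value (255 for 8-bit images).
    """
    if len(original) != len(distorted) or not original:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 4-pixel example: small errors give a high PSNR.
print(psnr([100, 120, 140, 160], [101, 119, 141, 159]))
```

A higher value means the watermarked image is closer to the reference; identical images yield infinity under this definition.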

Paper Nr: 163
Title:

CONTENT-BASED IMAGE RESIZING ON MOBILE DEVICES

Authors:

Sebastiano Battiato, Giovanni Maria Farinella, Nicolò Grippaldi and Giovanni Puglisi

Abstract: Content-aware image resizing algorithms take the visual content of images into account during the resizing process. Despite the technological advances in the context of mobile devices, content-aware image resizing algorithms are still far from being usable on a hand held device due to the computational resources needed during the resizing. In this paper we address this problem by employing a method which has linear complexity with respect to the number of lines (rows/columns) to be reduced/augmented. The method has been tested, both qualitatively and quantitatively, on a mobile platform.
Download

Area 5 - Motion, Tracking and Stereo Vision

Full Papers
Paper Nr: 81
Title:

TRACKING-BY-REIDENTIFICATION IN A NON-OVERLAPPING FIELDS OF VIEW CAMERAS NETWORK

Authors:

Boris Meden, Frédéric Lerasle, Patrick Sayd and Christophe Gabard

Abstract: This article tackles the problem of automatic multi-pedestrian tracking in non-overlapping fields of view camera networks, using monocular, uncalibrated cameras. Tracking is locally addressed by a Tracking-by-Detection and reidentification algorithm. We propose here to introduce the concept of global identity into a multi-target tracking algorithm, qualifying people at the network level, to allow us to bridge observation discontinuities. We embed that identity into the tracking loop thanks to the mixed-state particle filter framework, thus including it in the search space. Doing so, each tracker maintains a multi-modality on the identity of its target in the network. We increase the decision strength by introducing a high level decision scheme which integrates the hypotheses of all the trackers over all the cameras of the network with previous reidentification results and the topology of the network. The tracking and reidentification module is first tested with a single camera. We then evaluate the whole framework on a network of 3 non-overlapping fields of view with 7 identities. The only a priori knowledge assumed is a topological map of the network.
Download

Paper Nr: 87
Title:

TRACKING INTERACTING OBJECTS IN COMPLEX SITUATIONS BY USING CONTEXTUAL REASONING

Authors:

Rosario Di Lascio, Pasquale Foggia, Alessia Saggese and Mario Vento

Abstract: In this paper we propose a novel real-time tracking algorithm robust with respect to several common errors occurring in object detection systems, especially in the presence of total or partial occlusions. The algorithm takes into account the history of each object, whereas most other methods base their decisions on only the last few frames. More precisely, it associates each object with a state encoding the relevant information of its past history, which enables the most appropriate way of assigning an identity to the object on the basis of its current and past conditions. Thus, strategies that are more complex but also riskier are only applied when the algorithm is confident that it is appropriate to do so. An experimental evaluation of the algorithm has been performed using the PETS2010 database, comparing the obtained performance with the results of the PETS 2010 contest participants.
Download

Paper Nr: 90
Title:

SELF-CONSISTENT 3D SURFACE RECONSTRUCTION AND REFLECTANCE MODEL ESTIMATION OF METALLIC SURFACES

Authors:

Steffen Herbort and Christian Wöhler

Abstract: 3D surface reconstruction data measured with active range scanners typically suffer from high-frequency noise on small scales. This poses a problem for highly demanding surface inspection tasks and all other applications that require a high accuracy of the depth data. One way to achieve increased 3D reconstruction accuracy is the fusion of active range scanning data and photometric image information. Typically, this requires modeling of the surface reflectance behavior, which, in turn, implies the surface to be known with high accuracy to determine valid reflectance parameters as long as no calibration object is available. In this study, we propose an approach that provides a detailed 3D surface reconstruction along with simultaneously estimated parameters of the reflectance model. For 3D surface reconstruction, we employ an algorithm that combines active range scanning data for large-scale accuracy with image-based information for small-scale accuracy. For inferring the reflectance function, we incorporate the estimation of the reflectance model into a self-consistent computational scheme that successively increases the resolution and thus determines the reflectance parameters based on refined depth information. We present results for a homogeneous dark rough metallic surface, which is reconstructed based on a single coarse 3D scan and 12 images acquired under different illumination conditions.
Download

Paper Nr: 125
Title:

FAST PROTOTYPING OF EMBEDDED IMAGE PROCESSING APPLICATION ON HOMOGENOUS SYSTEM - A Parallel Particle Filter Tracking Method on Homogeneous Network of Communicating Processors (HNCP)

Authors:

Hanen Chenini, Jean Pierre Derutin and Thierry Chateau

Abstract: This article discusses the design of an application specific MP-SoC (Multi-Processors System on Chip) architecture dedicated to a face tracking algorithm. The proposed algorithm tracks a Region-Of-Interest (ROI) by determining the similarity measures between the reference and the target frames. In our approach, this measure is the estimation of the Kullback-Leibler divergence from the K-nearest neighbor (KNN) framework. The metric between pixels is a Euclidean norm in a joint geometric and radiometric space. The adopted measure allows us to check if the regions have similar colors and also if these colors appear at the same location. Considering the amount of computation required, we propose a parallel hardware implementation of the developed algorithm on an MP-SoC architecture. Creating multiple processors in one system is hard for software developers using traditional hardware design approaches due to the complexity of designing software models suitable for such FPGA implementations. In order to deal with this problem, we have introduced the CubeGen tool to spare the designer tedious manual editing operations. This new methodology enables us to instantiate a generic Homogeneous Network of Communicating Processors (called HNCP) tailored for our targeted application. Our implementations are demonstrated using the Xilinx FPGA chip XC6VLX240T.
Download

Paper Nr: 152
Title:

OPTIMAL CONTROL THEORY FOR MULTI-RESOLUTION PROBLEMS IN COMPUTER VISION - Application to Optical-flow Estimation

Authors:

Pascal Zille and Thomas Corpetti

Abstract: This paper is concerned with the multi-resolution issue encountered in many computer vision applications. Such approaches are very popular to optimize a cost function that, in most situations, has been linearized for reasons of mathematical convenience. In general, a multi-resolution setup consists in a redefinition of the problem at a different resolution level where the mathematical assumptions (usually linearity) hold. Following a coarse-to-fine strategy, a usual process consists in 1) optimizing the large scales and 2) using this result as an initial condition for the estimation at finer scales. This process is repeated until the full image resolution is reached. One of the main drawbacks of such a downscaling approach is its incapacity to correct possible errors that have been made at larger scales. These errors are indeed propagated along the scales and disturb the final result. In this paper, we suggest a new formulation of the multi-resolution setup where we exploit some smoothing techniques issued from optimal control theory and in particular variational data assimilation. The time is here artificial and is related to the various scales we are dealing with. Following a consistent mathematical framework, we define an original downscaling/upscaling technique to perform the multi-resolution. We validate this approach by defining a simple optical flow estimation technique based on Lucas-Kanade. Experimental results on synthetic data demonstrate the efficiency of this new methodology.
Download

Paper Nr: 166
Title:

A NOVEL STATE PARAMETRIZATION FOR STEREO-SLAM

Authors:

Arne Petersen and Reinhard Koch

Abstract: This paper proposes a novel parametrization for probabilistic stereo SLAM algorithms. It is optimized to fulfill the assumption of Gaussian probability distributions for system errors. Moreover, it makes full use of the constraints induced by stereo vision and provides a close to linear observation model. Therefore, the position and orientation are estimated incrementally. The parametrization of landmarks is chosen as the landmark's projection in the master camera and its disparity to the projection in the slave camera. This way a minimal parametrization is given that is well suited to linear probabilistic estimators.
Download
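To illustrate why a (projection, disparity) landmark parametrization is minimal and close to linear in the observation, the sketch below converts between a 3D point and its image coordinates plus disparity. It assumes an ideal rectified pinhole stereo pair, which the abstract does not state; the function names and the parameters `f`, `cx`, `cy` and `baseline` are hypothetical and not taken from the paper.

```python
def project(x, y, z, f, cx, cy, baseline):
    """Map a 3D point in the master camera frame to (u, v, d).

    Assumes a rectified pinhole stereo pair: focal length f in pixels,
    principal point (cx, cy), horizontal baseline in metres.
    """
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    u = f * x / z + cx
    v = f * y / z + cy
    d = f * baseline / z  # disparity between master and slave projections
    return u, v, d


def backproject(u, v, d, f, cx, cy, baseline):
    """Recover the 3D point from its master projection (u, v) and disparity d."""
    if d <= 0:
        raise ValueError("disparity must be positive")
    z = f * baseline / d
    x = (u - cx) * z / f
    y = (v - cy) * z / f
    return x, y, z


# Hypothetical camera: f = 500 px, principal point (320, 240), 10 cm baseline.
print(backproject(420.0, 240.0, 25.0, f=500.0, cx=320.0, cy=240.0, baseline=0.1))
```

With this parametrization the landmark state (u, v, d) coincides with what the stereo rig measures, so under these assumptions the observation of a landmark from its defining frame is the identity, and the non-linearity is confined to the back-projection.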

Paper Nr: 192
Title:

AN EVALUATION METHODOLOGY FOR STEREO CORRESPONDENCE ALGORITHMS

Authors:

Ivan Cabezas, Maria Trujillo and Margaret Florian

Abstract: A comparison of stereo correspondence algorithms can be conducted by a quantitative evaluation of disparity maps. Among the existing evaluation methodologies, the Middlebury methodology is commonly used. However, the Middlebury methodology has shortcomings in the evaluation model and the error measure. These shortcomings may bias the evaluation results, and make a fair judgment about the accuracy of algorithms difficult. An alternative, the A* methodology, is based on a multiobjective optimisation model that only provides a subset of algorithms with comparable accuracy. In this paper, a quantitative evaluation of disparity maps is proposed. It performs an exhaustive assessment of the entire set of algorithms. As an innovative aspect, evaluation results are shown and analysed as disjoint groups of stereo correspondence algorithms with comparable accuracy. This innovation is obtained by a partitioning and grouping algorithm. In addition, the error measure used offers advantages over the error measure used in the Middlebury methodology. The experimental validation is based on the Middlebury test-bed and algorithms repository. The obtained results show seven groups with different accuracies. Moreover, the top-ranked stereo correspondence algorithms by the Middlebury methodology are not necessarily the most accurate in the proposed methodology.
Download

Paper Nr: 203
Title:

MCMC PARTICLE FILTER WITH OVERRELAXATED SLICE SAMPLING FOR ACCURATE RAIL INSPECTION

Authors:

Marcos Nieto, Andoni Cortés, Oihana Otaegui and Iñigo Etxabe

Abstract: This paper introduces a rail inspection system which detects rail flaws using computer vision algorithms. Unlike other methods designed for the same purpose, we propose a method that automatically fits a 3D rail model to the observations during regular services and normal traffic conditions. The proposed strategy is based on a novel application of the slice sampling technique with overrelaxation in the framework of MCMC (Markov Chain Monte Carlo) particle filters. This combination allows us to efficiently exploit the temporal coherence of observations and to obtain more accurate estimates than with other techniques such as importance sampling or Metropolis-Hastings. The results show that the system is able to efficiently and robustly obtain measurements of the wear of the rails, and also that it is possible to introduce the slice sampling technique into MCMC particle filters.
Download

Paper Nr: 218
Title:

OCCLUSION HANDLING FOR THE INTEGRATION OF VIRTUAL OBJECTS INTO VIDEO

Authors:

Kai Cordes, Björn Scheuermann, Bodo Rosenhahn and Jörn Ostermann

Abstract: This paper demonstrates how to effectively exploit occlusion and reappearance information of feature points in structure and motion recovery from video. Due to temporary occlusion with foreground objects, feature tracks discontinue. If these features reappear after their occlusion, they are connected to the correct previously discontinued trajectory during sequential camera and scene estimation. The combination of optical flow for features in consecutive frames and SIFT matching for the wide baseline feature connection provides accurate and stable feature tracking. The knowledge of occluded parts of a connected feature track is used to feed a segmentation algorithm which crops the foreground image regions automatically. The resulting segmentation provides an important step in scene understanding which eases integration of virtual objects into video significantly. The presented approach enables the automatic occlusion of integrated virtual objects with foreground regions of the video. Demonstrations show very realistic results in augmented reality.
Download

Paper Nr: 224
Title:

REAL-TIME POSE ESTIMATION USING TREE STRUCTURES BUILT FROM SKELETONISED VOLUME SEQUENCES

Authors:

Rune Havnung Bakken and Adrian Hilton

Abstract: Pose estimation in the context of human motion analysis is the process of approximating the body configuration in each frame of a motion sequence. We propose a novel pose estimation method based on constructing tree structures from skeletonised visual hulls reconstructed from multi-view video. The pose is estimated independently in each frame, so the method can recover from errors in previous frames, which overcomes the problems of tracking. Publicly available datasets were used to evaluate the method. On real data the method performs at a framerate of 15–64 fps depending on the resolution of the volume. Using synthetic data the positions of the extremities were determined with a mean error of 47–53 mm depending on the resolution.
Download

Paper Nr: 245
Title:

OBJECT VOLUMETRIC ESTIMATION BASED ON GENERIC FITTED PRIMITIVES FOR SERVICE ROBOTICS

Authors:

Tiberiu T. Cociaș, Sorin M. Grigorescu and Florin Moldoveanu

Abstract: This paper presents an approach for object surface estimation from a single perspective using a stereo camera configuration. The goal of the method is to capture the particularity of an object of interest by fitting a generic primitive that best models the recognized shape. The shape modeling process is performed on 3D Regions of Interest (ROI) obtained by classifying the objects present in disparity maps. The principle uses a number of control points, calculated from the primitive Point Distribution Model (PDM). These control points drive the modeling behavior in the disparity point cloud data based on the principle of active contours, or snakes. Finally, a compact 3D object mesh can be generated using Delaunay triangulation. The obtained PDM models are intended to be used for precise object manipulation in service robotics applications.
Download

Short Papers
Paper Nr: 27
Title:

ENERGY-MINIMIZATION BASED MOTION ESTIMATION USING ADAPTIVE SMOOTHNESS PRIORS

Authors:

Tarik Arici and Vural Aksakalli

Abstract: Energy minimization algorithms are used in low-level computer vision applications for labeling tasks such as stereo-disparity estimation, image restoration, motion estimation, and optical flow. The energy function involves terms that evaluate the goodness of a solution in terms of prior knowledge in addition to data terms. The most widely used priors are smoothness-based priors, which enhance the quality significantly. However, the smoothness assumption is not valid across discontinuities (e.g. motion boundaries). We present a method to update the weights of smoothness terms using the dual problem when the approximation algorithm is iterative. The dual of the primal energy minimization problem is used to infer the validity of the smoothness prior and impose it more correctly at each iteration. We demonstrate the effectiveness of this method against the state-of-the-art in the optical flow literature.
Download

Paper Nr: 33
Title:

ENHANCING MEMORY-BASED PARTICLE FILTER WITH DETECTION-BASED MEMORY ACQUISITION FOR ROBUSTNESS UNDER SEVERE OCCLUSION

Authors:

Dan Mikami, Kazuhiro Otsuka, Shiro Kumano and Junji Yamato

Abstract: A novel enhancement for the memory-based particle filter is proposed for visual pose tracking under severe occlusions. The enhancement is the addition of a detection-based memory acquisition mechanism. The memory-based particle filter, M-PF, is a particle filter that predicts prior distributions from the past history of the target state, which achieves high robustness against complex dynamics of a tracking target. Such high performance requires sufficient history stored in memory. Conventionally, M-PF conducts online memory acquisition, which assumes simple target dynamics without occlusions to guarantee high quality histories. This requirement of memory acquisition narrows the coverage of M-PF in practice. In this paper, we propose a new memory acquisition mechanism for M-PF. The key idea is to use a target detector that can produce an additional prior distribution of the target state. We call it M-PFDMA for M-PF with detection-based memory acquisition. The detection-based prior distribution predicts the possible target position/pose well even in limited visibility conditions caused by occlusions. Such better prior distributions contribute to stable estimation of the target state, which is then added to the memorized data. As a result, M-PFDMA can start with no memory entries but soon achieve stable tracking even under severe occlusions. Experiments confirm M-PFDMA’s good performance in such conditions.
Download

Paper Nr: 55
Title:

DENSE PIXEL MATCHING BETWEEN UNRECTIFIED AND DISTORTED IMAGES USING DYNAMIC PROGRAMMING

Authors:

Jerome Thevenon, Jesus Martinez-del-Rincon, Romain Dieny and Jean-Christophe Nebel

Abstract: In this paper, a novel framework for dense pixel matching based on dynamic programming is introduced. Unlike most techniques proposed in the literature, our approach assumes neither known camera geometry nor the availability of rectified images. Under such conditions, the matching task cannot be reduced to finding correspondences between a pair of scanlines. We propose to extend existing dynamic programming methodologies to a larger dimensional space by using a 3D scoring matrix so that correspondences between a line and a whole image can be calculated. After assessing our framework on a standard evaluation dataset of rectified stereo images, experiments are conducted on unrectified and non-linearly distorted images. Results validate our new approach and reveal the versatility of our algorithm.
Download
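For readers unfamiliar with the scanline case that the abstract says it generalizes, the following is a minimal sketch of classic dynamic-programming matching between two 1D scanlines. It is the textbook baseline, not the authors' 3D scoring-matrix extension; the function name, the absolute-difference matching cost and the fixed `occlusion_cost` are assumptions chosen for illustration.

```python
def match_scanlines(left, right, occlusion_cost=5.0):
    """Align two intensity scanlines with dynamic programming.

    Returns (matches, total_cost), where matches is a list of
    (left_index, right_index) pairs in increasing order.
    """
    n, m = len(left), len(right)
    # cost[i][j]: best cost of aligning left[:i] with right[:j].
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion_cost
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]),  # match
                cost[i - 1][j] + occlusion_cost,  # left pixel unmatched
                cost[i][j - 1] + occlusion_cost,  # right pixel unmatched
            )
    # Backtrack from the bottom-right corner to recover the matches.
    matches = []
    i, j = n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1]):
            matches.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + occlusion_cost:
            i -= 1
        else:
            j -= 1
    matches.reverse()
    return matches, cost[n][m]


# Hypothetical scanlines: the right line is the left line shifted by one pixel.
print(match_scanlines([10, 20, 30, 40], [20, 30, 40, 50]))
```

This 2D table only works when corresponding pixels are known to lie on the same pair of scanlines, which is the restriction the abstract removes by adding a third dimension to the scoring matrix.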

Paper Nr: 60
Title:

ACCELERATED PEOPLE TRACKING USING TEXTURE IN A CAMERA NETWORK

Authors:

Wasit Limprasert, Andrew Wallace and Greg Michaelson

Abstract: We present an approach to tracking multiple human subjects within a camera network. A particle filter framework is used in which we combine foreground-background subtraction with a novel approach to texture learning and likelihood computation based on an ellipsoid model. As there are inevitable problems with multiple subjects due to occlusion and crossing, we include a robust method to suppress distraction between subjects. To achieve real-time performance, we have also developed our code on a graphics processing unit to achieve a 10-fold reduction in processing time with an approximate frame rate of 10 frames per second.
Download

Paper Nr: 68
Title:

IMPROVEMENT OF MOTION ESTIMATION BY ASSESSING THE ERRORS ON THE EVOLUTION EQUATION

Authors:

Isabelle Herlin, Dominique Béréziat and Nicolas Mercier

Abstract: Image assimilation methods are nowadays widely used to retrieve motion from image sequences with heuristics on the underlying dynamics. A mathematical model on the temporal evolution of the motion field has to be chosen, according to these heuristics, that approximately describes the evolution of the velocity at a pixel over the sequence. In order to quantify this approximation, we add an error term in the evolution equation of the motion field and design a weak formulation of 4D-Var image assimilation. The designed cost function simultaneously depends on the initial motion field and on the error value at each time step. The BFGS solver performs minimization to retrieve both motion field and errors. The method is evaluated and quantified on twin experiments, as no ground truth would be available for real data. The results demonstrate that the motion field is better estimated thanks to the error control.
Download

Paper Nr: 85
Title:

BAYESIAN PERSPECTIVE-PLANE (BPP) FOR LOCALIZATION

Authors:

Zhaozheng Hu and Takashi Matsuyama

Abstract: The "perspective-plane" problem proposed in this paper is similar to the "perspective-n-point (PnP)" or "perspective-n-line (PnL)" problems, yet with broader applications and potentials, since planar scenes are more widely available than control points or lines in practice. We address this problem in the Bayesian framework and propose the "Bayesian perspective-plane (BPP)" algorithm, which can deal with more generalized constraints rather than type-specific ones to determine the plane for localization. Computation of the plane normal is formulated as a maximum likelihood problem, and is solved by using the Maximum Likelihood Searching Model (MLS-M). Two searching modes of 2D and 1D are presented. With the computed normal, the plane distance and the position of the object or camera can be computed readily. The BPP algorithm has been tested with real image data by using different types of scene constraints. The 2D and 1D searching modes were illustrated for plane normal computation. The results demonstrate that the algorithm is accurate and generalized for object localization.
Download

Paper Nr: 107
Title:

NON-PARAMETRIC ACQUISITION OF NEAR-DIRAC PIXEL CORRESPONDENCES

Authors:

Bradley Atcheson and Wolfgang Heidrich

Abstract: Many computer vision and graphics applications require the acquisition of correspondences between the pixels of a 2D illumination pattern and those of captured 2D photographs. Trivial cases with only one-to-one correspondences require only a few measurements. In more general scenes containing complex inter-reflections, capturing the full reflectance field requires more extensive sampling and complex processing schemes. We present a method that addresses the middle-ground: scenes where each pixel maps to a small, compact set of pixels that cannot easily be modeled parametrically. The coding method is based on optically-constructed Bloom filters and frequency coding. It is non-adaptive, allowing fast acquisition, robust to measurement noise, and can be decoded with only moderate computational power. It requires fewer measurements and scales up to higher resolutions more efficiently than previous methods.
Download

Paper Nr: 108
Title:

GPU-FRIENDLY MULTI-VIEW STEREO FOR OUTDOOR PLANAR SCENE RECONSTRUCTION

Authors:

Hyojin Kim, Quinn Hunter, Mark Duchaineau, Kenneth Joy and Nelson Max

Abstract: This paper presents a new multi-view stereo approach that reconstructs aerial or outdoor scenes in both a planar and a point representation. One of the key features is to integrate two heterogeneous schemes for planar and non-planar reconstruction, given a color segmentation where each segment is classified as either planar or non-planar. In planar reconstruction, an optimal plane for each segment is chosen among possible plane candidates by comparing the remapped reference segment region with multiple target images in parallel on a GPU. In point reconstruction for non-planar objects, remapped pixel descriptors along an epipolar line pair are efficiently matched on a GPU. Our method also detects and discards incorrect segment planes and outliers that have a large 3D discontinuity with the neighboring segment planes. Several aerial and outdoor scene reconstruction results with quantitative analyses are provided.
Download

Paper Nr: 112
Title:

A PATCH RECTIFICATION STRATEGY FOR MULTI-HOMOGRAPHY ESTIMATION

Authors:

Mario Santes

Abstract: In this work, we describe how to recover the perspective functions induced by the dominant faces of a rigid polyhedral scene captured by two cameras at different positions. The faces of any polyhedron can be dealt with as if they were planar surfaces, therefore we analyze this problem in a multi-planar scene context. We assume that the scene is populated with polyhedral objects and that a regular grid is imposed on the reference image. We describe a robust strategy that is able to merge patches that belong to the same planar surface and to impose the epipolar geometry constraints to the recovered perspective functions.

Paper Nr: 120
Title:

PARAMETER AND CONFIGURATION ANALYSIS FOR NON-LINEAR POSE ESTIMATION WITH POINTS AND LINES

Authors:

Martin Schumann, Bernhard Reinert and Stefan Mueller

Abstract: In markerless model-based tracking approaches, image features such as points or straight lines are used to estimate the pose. We introduce an analysis of parametrizations of the pose data as well as of error measurements between 2D image features and 3D model data. Further, we give a review of critical geometrical configurations as they can appear in the input data. From these results, the best parameter choice for a non-linear pose estimator is proposed that is optimal by construction to handle a combined input of feature correspondences and works on an arbitrary number and choice of feature types. It uses the knowledge of the 3D model to analyze the input data for critical geometrical configurations.
Download

Paper Nr: 162
Title:

OCCUPANCY ANALYSIS OF SPORTS ARENAS USING THERMAL IMAGING

Authors:

Rikke Gade, Anders Jørgensen and Thomas B. Moeslund

Abstract: This paper presents a system for automatic analysis of the occupancy of sports arenas. By using a thermal camera for image capturing, the number of persons and their location on the court are found without violating any privacy issues. The images are binarised with an automatic threshold method. Reflections due to shiny surfaces are eliminated by analysing symmetric patterns. Occlusions are dealt with through a concavity analysis of the binary regions. The system is tested in five different sports arenas, for more than three full weeks altogether. These tests showed that after a short initialisation routine the system operates independently of the different environments. The system can very precisely distinguish between zero, some or many persons on the court and give a good indication of which parts of the court have been used.
Download

Paper Nr: 209
Title:

HEAD SHAPE ESTIMATION USING A PARTICLE FILTER INCLUDING UNKNOWN STATIC PARAMETERS

Authors:

Catherine Herold, Vincent Despiegel, Stéphane Gentric, Séverine Dubuisson and Isabelle Bloch

Abstract: We present a particle filter algorithm to optimize the static shape parameters of a given face observed under multiple views and over time. Our goal is to determine the 3D shape of the head given these observations, by selecting the most suitable deformation parameters. The main idea of our method is to integrate the unknown static parameters in the particle filter hidden state and to filter and modify these parameter values given the recursively incoming observations. We propose here a comparative study of different variants of this approach evaluated on synthetic data. These results show the potential of this type of particle based method, which has mainly been presented from a theoretical point of view until now. We conclude with a discussion on the adaptation of these methods to real data sequences.
Download

Paper Nr: 242
Title:

3D HUMAN BODY POSE ESTIMATION BY SUPERQUADRICS

Authors:

Ilya Afanasyev, Massimo Lunardelli, Nicolò Biasi, Luca Baglivo, Mattia Tavernini, Francesco Setti and Mariolino De Cecco

Abstract: This paper presents a method for 3D Human Body pose estimation. 3D real data of the searched object is acquired by a multi-camera system and segmented by a special preprocessing algorithm based on clothing analysis. The human body model is built by nine SuperQuadrics (SQ) with a-priori known anthropometric scaling and shape parameters. The pose is estimated hierarchically by a RANSAC-object search with a least-squares fitting of the 3D point cloud to the SQ models: at first the body, and then the limbs. The solution is verified by evaluating the matching score, i.e. the number of inliers corresponding to an a-priori chosen distance threshold, and comparing this score with the admissible inlier threshold for the body and limbs. This method can be used for 3D object recognition, localization and pose estimation of the Human Body.
Download

Paper Nr: 243
Title:

REAL TIME OBJECT TRACKING ON GPGPU

Authors:

Maciej Chociej and Adam Polak

Abstract: We propose a system for tracking objects in a video stream from a stationary camera. Our method, as is often the case, involves foreground-background separation and optical flow calculation. The major finding is a fast feedback process that leads to an accurate detection of background-object and object-object boundaries and maintains them during object occlusions. The contribution of this paper also includes improvements to computing dense optical flow and foreground separation. The methods described were implemented on a GPGPU and yield performance results sufficient for real time processing. Additionally, our approach makes no a priori assumptions on the characteristics of tracked objects and can be utilized to track both rigid and deformable objects of various shapes and sizes.
Download

Paper Nr: 246
Title:

HIGH RESOLUTION POINT CLOUD GENERATION FROM KINECT AND HD CAMERAS USING GRAPH CUT

Authors:

Suvam Patra, Brojeshwar Bhowmick, Subhashis Banerjee and Prem Kalra

Abstract: This paper describes a methodology for obtaining a high resolution dense point cloud using Kinect (Smisek et al., 2011) and HD cameras. Kinect produces a VGA resolution photograph and a noisy point cloud. But high resolution images of the same scene can easily be obtained using additional HD cameras. We combine the information to generate a high resolution dense point cloud. First, we do a joint calibration of Kinect and the HD cameras using traditional epipolar geometry (Hartley and Zisserman, 2004). Then we use the sparse point cloud obtained from Kinect and the high resolution information from the HD cameras to produce a dense point cloud in a registered frame using graph cut optimization. Experimental results show that this approach can significantly enhance the resolution of the Kinect point cloud.
Download

Paper Nr: 265
Title:

SINGLE SNAPSHOT SYSTEM FOR THE FAST 3D MODELING USING DYNAMIC TIME WARPING

Authors:

Luis Ruiz, Xavier Mateo, Ciro Gràcia and Xavier Binefa

Abstract: In this work we explore the automatic 3D modeling of a person using images acquired from a range camera. Using only one range camera and two mirrors, the objective is to obtain a full 3D model with one single snapshot. The combination of the camera and the two mirrors gives us three non-overlapping meshes, making it impossible to use common zippering algorithms based on overlapping meshes. Therefore, the Dynamic Time Warping algorithm is used to find the best matching between boundaries of the meshes. Experimental results and error evaluations are given to show the robustness and efficiency of our method.
Download
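
As a rough illustration of the Dynamic Time Warping step mentioned in the abstract above, the following is a minimal sketch of the textbook DTW recurrence on two scalar sequences. The abstract does not say which boundary descriptors are matched, so the sequences here are hypothetical stand-ins (for example, a per-vertex value sampled along each mesh boundary), and the function name `dtw_distance` and the absolute-difference cost are assumptions made for this example, not the authors' implementation.

```python
from math import inf


def dtw_distance(a, b, cost=lambda x, y: abs(x - y)):
    """Textbook dynamic time warping cost between two sequences.

    `a` and `b` are hypothetical 1D boundary descriptors; `cost` is an
    assumed local distance (absolute difference by default).
    """
    n, m = len(a), len(b)
    # d[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c = cost(a[i - 1], b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            d[i][j] = c + min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
    return d[n][m]
```

Because one sample may be aligned with several samples of the other sequence, two boundaries sampled at different densities can still match with zero cost, which is the property that makes DTW attractive when the meshes do not overlap.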

Paper Nr: 298
Title:

ANALYSIS OF DEFORMATION PROCESSES USING BLOCK-MATCHING TECHNIQUES

Authors:

Alvaro Rodriguez, Carlos Fernandez-Lozano, Jose-Antonio Seoane, Juan R. Rabuñal and Julian Dorado

Abstract: Non-rigid motion estimation is one of the main issues in computer vision. Its applications range from civil engineering or traffic systems to medical image analysis. The challenge consists in processing a sequence of images of a physical body subjected to deformation processes and extracting its displacement field. In this paper, an iterative Block-Matching technique is proposed to measure displacements in deformable surfaces. This technique is based on successive interpolation and smoothing phases to calculate the dense displacement field of a body. The proposed technique was experimentally validated by studying the Yosemite sequence and it was tested in the analysis of strength test and biomedical images.
Download

Paper Nr: 1
Title:

SOLVING THE THREE-POINT CAMERA POSE PROBLEM IN THE VICINITY OF THE DANGER CYLINDER

Authors:

Michael Q. Rieck

Abstract: A new theorem in solid geometry is introduced and shown to be quite useful for solving the Perspective 3-Point Pose Problem (P3P) in the general vicinity of the danger cylinder. Also resulting from this is a criterion for partially deciding which mathematical solution is the correct physical solution. Simulations have demonstrated the greater accuracy of the new method for solving P3P, over a standard classical method, under the following condition. The distance from the camera’s optical center to the axis of the danger cylinder must be sufficiently small, compared with the distance from the optical center to the plane containing the control points.
Download

Paper Nr: 19
Title:

PUTATIVE MATCH ANALYSIS - A Repeatable Alternative to RANSAC for Matching of Aerial Images

Authors:

Anders Hast and Andrea Marchetti

Abstract: One disadvantage with RANSAC is that it is based on randomness and will therefore often yield a different set of inliers in each run, especially if the dataset contains a large number of outliers. A repeatable algorithm for finding both matches and the homography is proposed that will yield the same set of matches every time and is therefore a useful tool when trying to evaluate other algorithms involved and their parameters.
Download

Paper Nr: 34
Title:

SEMI-AUTOMATIC FLOOR-PLAN RECONSTRUCTION FROM A 360º PANORAMIC IMAGE

Authors:

Alexander Tarnavsky, Alexander Yusupov and Dmitry Rudoy

Abstract: The ease of panorama creation has made it very popular. Although it is a very convenient way to convey the environment, panoramic images can often be confusing. This discomfort has a major influence in a 360 degree indoor panorama, where the viewer is forced to look in all directions at the same time. In this paper we propose an alternative approach for visualization of the indoor environment. Instead of using the panorama directly, our method reconstructs a floor-plan from it and displays the created 3D model. During the reconstruction the user is only required to mark the corners in the original image. For the wall planes we use unwrapped texture mapping. Our experiments show that the proposed approach copes very well with complex environments that include large spaces and corridors.
Download

Paper Nr: 70
Title:

A SIMPLE DERIVATION TO IMPLEMENT TRACKING DISTRIBUTIONS

Authors:

Wei Yu, Jifeng Ning, Nan Geng and Jiong Zhang

Abstract: We present a simple and straightforward derivation to implement active contours for tracking distributions (Freedman and Zhang, 2004) and its improvement, i.e., distribution tracking through background mismatch (Zhang and Freedman, 2005). In the original work, two steps are performed in order to derive the tracking evolution equations. In the first step, curve flows are derived using Green’s Theorem, and in the second step the level set method is used to implement the curve flows, which is somewhat complex. In our implementation, tracking evolution equations are derived directly by using variational theory. This helps in understanding the tracking method better. The final tracking evolution equations are identical to those of the previous work (Freedman and Zhang, 2004; Zhang and Freedman, 2005).
Download

Paper Nr: 88
Title:

STEREO VISION-BASED 3D CAMERA POSE AND OBJECT STRUCTURE ESTIMATION - An Application to Service Robotics

Authors:

Sorin M. Grigorescu, Tiberiu T. Cociaș, Gigel Maceșanu and Florin Moldoveanu

Abstract: In this paper, a robotic pose (position and orientation) estimation and volumetric object modeling system is proposed. The main goal of the methods is to reliably detect the structure of objects of interest present in a visualized robotic scene, together with a precise estimation of the robot’s pose with respect to the detected objects. The robustness of the robotic pose estimation module is achieved by filtering the 2D correspondence matches in order to detect false positives. Once the pose of the robot is obtained, the volumetric structure of the imaged objects of interest is reconstructed through 3D shape primitives and a 3D Region of Interest (ROI).
Download

Paper Nr: 104
Title:

REAL-TIME 3D MULTIPLE HUMAN TRACKING WITH ROBUSTNESS ENHANCEMENT THROUGH MACHINE LEARNING

Authors:

Suraj Nair, Emmanuel Dean-Leon and Alois Knoll

Abstract: This paper presents a novel and robust vision-based real-time 3D multiple human tracking system. It is capable of automatically detecting and tracking multiple humans in real-time even when they occlude each other. Furthermore, it is robust towards drastically changing lighting conditions. The system consists of two parts: (1) a vision based human tracking system using multiple visual cues with a robust occlusion handling module, and (2) a machine learning based module for intelligent multi-modal fusion and self-adaptation of the system to drastic light changes. The paper also proposes an approach to validate the system through zero-error ground truth data obtained from virtual environments. The system is also validated in real-world scenarios.
Download

Paper Nr: 133
Title:

FRAME INTERPOLATION WITH OCCLUSION DETECTION USING A TIME COHERENT SEGMENTATION

Authors:

Rida Sadek, Coloma Ballester, Luis Garrido, Enric Meinhardt and Vicent Caselles

Abstract: In this paper we propose an interpolation method to produce a sequence of plausible intermediate frames between two input images. The main feature of the proposed method is the handling of occlusions using a time coherent video segmentation into spatio-temporal regions. Occlusions and disocclusions are defined as points in a frame where a region ends or starts, respectively. Out of these points, forward and backward motion fields are used to interpolate the intermediate frames. After motion-based interpolation, there may still be some holes which are filled using a hole filling algorithm. We illustrate the proposed method with some experiments.
Download

Paper Nr: 137
Title:

KERNEL SELECTION BY MUTUAL INFORMATION FOR NONPARAMETRIC OBJECT TRACKING

Authors:

J. M. Berthommé, T. Chateau and M. Dhome

Abstract: This paper presents a method to select kernels for the subsampling of nonparametric models used in real-time object tracking in video streams. We propose a method based on mutual information, inspired by the CMIM algorithm (Fleuret, 2004) for the selection of binary features. This incrementally builds an appearance model of the object to follow, based on representative and independent kernels taken from points of that object. Experiments show gains in accuracy compared to other sampling strategies.
Download

Paper Nr: 169
Title:

FAST STEREO MATCHING METHOD BASED ON OPTIMIZED CORRELATION ALGORITHM FOR FACE DEPTH ESTIMATION

Authors:

Amel Aissaoui, Rémi Auguste, Tarek Yahiaoui, Jean Martinet and Chabane Djeraba

Abstract: In this paper, we introduce a novel approach for face reconstruction based on stereo vision. The approach is based on real time generation of a facial disparity map, requiring neither expensive devices nor a generic face model. An algorithm that incorporates topological information of the face in the disparity estimation process is proposed to enhance the result of the 3D reconstruction. Some experimental results are presented to demonstrate the reconstruction accuracy of the proposed method.
Download

Paper Nr: 211
Title:

VEHICLE SPEED ESTIMATION FROM TWO IMAGES FOR LIDAR SECOND ASSESSMENT

Authors:

Charles Beumier

Abstract: Vehicle speed control has been traditionally carried out by RADAR and more recently by LIDAR systems. We present a solution that derives the speed from two images acquired by a static camera and one real dimension from the vehicle. It was designed to serve the purpose of second assessment in case of legal dispute about a LIDAR speed measurement. The approach follows a stereo paradigm, considering the equivalent problem of a stationary vehicle captured by a moving camera. 3D coordinates of vehicle points are obtained as the intersection of 3D lines emanating from corresponding points in both images, using the camera pinhole model. The displacement, approximated by a translation, is derived from the best match of reconstructed 3D points, minimising the residual error of 3D line intersection and the deviation from the known dimensions of the licence plate. A graphical interface lets the user select and refine vehicle points, starting with the 4 corners of the licence plate. The plate dimension is selected from a list or typed in. More than 100 speed estimation results confirmed the hypothesis about the translation approximation and showed a maximal deviation from the LIDAR speed of less than +/- 10 % as required by the application.
Download
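
The "intersection of 3D lines" in the abstract above can be illustrated with a small sketch. Two back-projected rays rarely meet exactly, so a common choice (assumed here, not confirmed by the abstract) is to take the midpoint of the shortest segment joining them and to use the length of that segment as the residual error. The function name `closest_point_between_lines` and the tuple-based vector representation are hypothetical.

```python
from math import sqrt


def closest_point_between_lines(p1, d1, p2, d2, eps=1e-12):
    """Midpoint of the shortest segment between two 3D lines, plus its length.

    Line k is given by a point p_k and a direction d_k (3-tuples).
    Returns (midpoint, residual); raises ValueError for (near-)parallel lines.
    """
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    w = tuple(a - b for a, b in zip(p1, p2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("lines are parallel; no unique closest point")
    # Line parameters of the mutually closest points.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    q1 = tuple(p + s * v for p, v in zip(p1, d1))
    q2 = tuple(p + t * v for p, v in zip(p2, d2))
    midpoint = tuple((x + y) / 2 for x, y in zip(q1, q2))
    residual = sqrt(sum((x - y) ** 2 for x, y in zip(q1, q2)))
    return midpoint, residual
```

In a setting like the one described, each ray would start at a camera centre and pass through an image point under the pinhole model; the residual gives one possible measure of how well a hypothesised vehicle translation explains the correspondences.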

Paper Nr: 247
Title:

PARAMETRIC DATA ASSOCIATION PRIOR FOR MULTI-TARGET TRACKING BASED ON RAO-BLACKWELLIZED MONTE CARLO DATA ASSOCIATION

Authors:

Oliver Greß and Stefan Posch

Abstract: Association of observations to underlying targets is a crucial task in probabilistic tracking of multiple targets. The Rao-Blackwellized Monte Carlo Data Association (RBMCDA) framework circumvents the combinatorial explosion by approximating the joint distribution of targets and association variables by Monte Carlo samples in the space of association variables. We present a parametric data association prior distribution required by RBMCDA, which models the formation of observations. To sample from this distribution an efficient algorithm is developed. The Interacting Multiple Models (IMM) filter is integrated into the RBMCDA framework to model the changing dynamics of targets, aiming at tracking small particles in microscopy images. The proposed method is evaluated in a proof of concept using synthetic data.
Download

Paper Nr: 254
Title:

COMPARISON OF GLOBAL-APPEARANCE TECHNIQUES APPLIED TO VISUAL MAP BUILDING AND LOCALIZATION

Authors:

Francisco Amorós, Luis Payá, Oscar Reinoso and Luis M. Jiménez

Abstract: Techniques based on the global appearance of visual information have proved to be a robust alternative in the field of robotic mapping and localization. However, they present some critical issues that must be studied when trying to build an application that works in real time. In this paper, we review and compare several methods to build a global descriptor of panoramic scenes and we study the critical parameters that make them applicable or not in real mapping and localization tasks, such as invariance against rotations, computational costs and accuracy in robot localization. All the experiments have been carried out with omnidirectional images captured in a real environment under realistic lighting conditions.
Download

Paper Nr: 256
Title:

ADAPTIVE SEARCH-BASED HIERARCHICAL MOTION ESTIMATION USING SPATIAL PRIORS

Authors:

Michael Santoro, Ghassan Al-Regib and Yucel Altunbasak

Abstract: Since motion estimation via block matching is an ill-posed problem, it requires the use of smoothness constraints to regularize the motion field. The block matching error and smoothness constraints together form an energy expression to be minimized. Motion vectors (MVs) from a candidate set are used to determine which MV minimizes the overall energy. These MVs, which may consist of spatial or temporal MVs, determine the quality of the motion field. Therefore, to ensure a high-quality motion field, we propose a new method to improve the quality of the MVs. The proposed method uses a novel approach to incorporate prior spatial MVs into block matching. By incorporating these MVs into block matching, we significantly reduce the size of the candidate set and improve the quality of the motion field.
Download
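
To make the energy formulation in the abstract above concrete, here is a minimal sketch of selecting a motion vector from a candidate set by minimising a block matching error plus a smoothness term. The specific choices are assumptions for illustration only: the sum of absolute differences (SAD) as matching error, an L1 distance to neighbouring MVs as smoothness term, the weight `lam`, and all function names. The abstract does not specify the authors' actual terms or how their spatial priors shrink the candidate set.

```python
def sad(ref, tgt, top, left, size, mv):
    """Sum of absolute differences between a reference block and the
    target block displaced by mv = (dy, dx). Images are lists of rows.
    The caller must keep the displaced block inside the target image."""
    dy, dx = mv
    return sum(
        abs(ref[top + y][left + x] - tgt[top + y + dy][left + x + dx])
        for y in range(size)
        for x in range(size)
    )


def smoothness(mv, neighbour_mvs):
    """L1 distance between mv and each neighbouring motion vector."""
    return sum(abs(mv[0] - n[0]) + abs(mv[1] - n[1]) for n in neighbour_mvs)


def best_mv(ref, tgt, top, left, size, candidates, neighbour_mvs, lam=1.0):
    """Return the candidate MV with minimal energy = SAD + lam * smoothness."""
    def energy(mv):
        return sad(ref, tgt, top, left, size, mv) + lam * smoothness(mv, neighbour_mvs)
    return min(candidates, key=energy)
```

With a small `lam` the data term dominates and the true displacement is chosen; with a large `lam` the result is pulled towards the neighbouring MVs, which is the regularising effect the abstract refers to.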

Paper Nr: 258
Title:

PEOPLE COUNTING WITH STEREO CAMERAS - Two Template-based Solutions

Authors:

Tim van Oosterhout, Ben Kröse and Gwenn Englebienne

Abstract: People counting is a challenging task with many applications. We propose a method with a fixed stereo camera that is based on projecting a template onto the depth image. The method was tested on a challenging outdoor dataset with good results and runs in real time.
Download

Paper Nr: 263
Title:

COVARIANCE BASED FISH TRACKING IN REAL-LIFE UNDERWATER ENVIRONMENT

Authors:

Concetto Spampinato, Simone Palazzo, Daniela Giordano, Isaak Kavasidis, Fang-Pang Lin and Yun-Te Lin

Abstract: In this paper we present a covariance based tracking algorithm for intelligent video analysis to assist marine biologists in understanding the complex marine ecosystem in the Ken-Ding sub-tropical coral reef in Taiwan by processing underwater real-time videos recorded in open ocean. One of the most important aspects of marine biology research is the investigation of fish trajectories to identify events of interest such as fish preying, mating, schooling, etc. This task, of course, requires a reliable tracking algorithm able to deal with 1) the difficulties of following fish that have multiple degrees of freedom and 2) the possible varying conditions of the underwater environment. To accommodate these needs, we have developed a tracking algorithm that exploits covariance representation to describe the object’s appearance and statistical information and also to join different types of features such as location, color intensities, derivatives, etc. The accuracy of the algorithm was evaluated by using hand-labeled ground truth data on 30000 frames belonging to ten different videos, achieving an average performance of about 94%, estimated using multiple ratios that indicate how good a tracking algorithm is both globally (e.g. counting objects in a fixed range of time) and locally (e.g. in distinguishing occlusions among objects).
Download
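
The covariance representation mentioned in the abstract above can be sketched as follows: each pixel of a region contributes a feature vector, and the region is described by the sample covariance matrix of those vectors. The feature set used here (x, y, intensity) is a reduced, assumed subset of the "location, color intensities, derivatives" listed in the abstract, and the function names are hypothetical. How the authors compare two such matrices during tracking is not stated in the abstract and is not shown.

```python
def pixel_features(image):
    """Assumed per-pixel features [x, y, intensity] for a grey-level region
    given as a list of rows."""
    return [
        [float(x), float(y), float(value)]
        for y, row in enumerate(image)
        for x, value in enumerate(row)
    ]


def region_covariance(features):
    """Sample covariance matrix (n - 1 normalisation) of a list of
    equal-length feature vectors; needs at least two vectors."""
    n = len(features)
    if n < 2:
        raise ValueError("need at least two feature vectors")
    dim = len(features[0])
    mean = [sum(f[k] for f in features) / n for k in range(dim)]
    cov = [[0.0] * dim for _ in range(dim)]
    for f in features:
        centred = [f[k] - mean[k] for k in range(dim)]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += centred[i] * centred[j]
    return [[c / (n - 1) for c in row] for row in cov]
```

The resulting matrix has a fixed size regardless of the region's pixel count, which is one reason such descriptors are convenient for objects whose shape and size change between frames.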

Paper Nr: 268
Title:

UTILIZATION AND PERFORMANCE CONSIDERATIONS IN RESOURCE OPTIMIZED STEREO MATCHING FOR REAL-TIME RECONFIGURABLE HARDWARE

Authors:

Fredrik Ekstrand, Carl Ahlberg, Mikael Ekström, Lars Asplund and Giacomo Spampinato

Abstract: This paper presents a quantitative evaluation of a set of approaches for increasing the accuracy of an area-based stereo matching method. It targets real-time FPGA systems focused on low resource usage and maximized improvement per cost unit to enable concurrent processing. The approaches are applied to a resource optimized correspondence implementation and the individual and cumulative costs and improvements are assessed. A combination of the implemented approaches performs close to other area-matching implementations, but at substantially lower resource usage. Additionally, the limitation in image size associated with standard methods is removed. As fully piped complete on-chip solutions, all improvements are highly suitable for real-time stereo-vision systems.
Download

Paper Nr: 278
Title:

THE EFFECT OF FEATURE COMPOSITION ON THE LOCALIZATION ACCURACY OF VISUAL SLAM SYSTEMS

Authors:

Mohamed Heshmat and Mohamed Abdellatif

Abstract: Simultaneous Localization and Mapping (SLAM) for mobile robots using a single camera has attracted several researchers in recent years. In this paper, we study the effect of feature point geometrical composition on the associated localization errors. The study will help to design an efficient feature management strategy that can reach high accuracy using fewer features. The basic idea is inspired by the camera calibration literature, which requires calibration target points to have a significant perspective effect to derive accurate camera parameters. When the scene has a significant perspective effect, it is expected that this will reduce the errors since it implicitly complies with the utilized perspective projection model. Experiments were done to explore the effect of scene feature composition on the localization errors using the state of the art visual Mono SLAM algorithm.
Download