Detailed List of C3I Ph.D. Research

Ph.D. Theses (C3I Group)

2007

Advances in Automated Image Categorization: Sorting Images using Person Recognition Techniques

Gabriel Costache  (PC as supervisor & PI) Thesis PDF here  employed in Xperi, Galway

The core problem addressed by this thesis is to provide practical tools for the automatic sorting and cataloging of a typical consumer collection of digital images. The thesis presents a complete system solution, comprising (i) automated detection of face regions in images; (ii) multiple automated face recognition modules; (iii) automated colour and texture analysis of regions peripheral to the detected face regions; and (iv) a decision fusion module combining the outputs of each recognition or peripheral-region analysis module and producing an output measure of similarity between each detected face region and a user-selected reference. Each system component is implemented and tested independently. The complete system is then tested, and initial results indicate that the combined performance of two independent face recognition modules (DCT and PCA) offers a measurable improvement over a single recognition module when applied to typical consumer collections of images. The analysis of peripheral regions using the colour correlogram is shown to further improve the accuracy of the face-recognition-based modules and to enable more granular sorting of the images containing a particular individual based on distinctive hair features or clothing. Techniques to improve the robustness of the system to variations in illumination and facial pose are investigated, tested and verified. A technique to significantly accelerate the retraining of basis functions for PCA-based face recognition is presented with initial test results. Several working computer and Web applications based on, and illustrating features of, the core system components are described and documented in detail.
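As a rough illustration of how such a decision fusion module might combine scores, the sketch below uses a weighted sum of normalised similarity scores from hypothetical DCT, PCA and peripheral-region modules; the weights, score values and names are illustrative assumptions, not figures from the thesis.

```python
# Hypothetical sketch of score-level decision fusion: two face-recognition
# scores (DCT- and PCA-based) and one peripheral-region colour score are
# combined into a single similarity measure. Weights are assumed values.
import numpy as np

def fuse_similarity(dct_score: float, pca_score: float,
                    peripheral_score: float,
                    weights=(0.4, 0.4, 0.2)) -> float:
    """Weighted-sum fusion of normalised [0, 1] similarity scores."""
    scores = np.array([dct_score, pca_score, peripheral_score])
    w = np.array(weights)
    return float(np.dot(w, scores) / w.sum())

# Rank candidate face regions against a user-selected reference face.
candidates = {"face_01": (0.82, 0.75, 0.60), "face_02": (0.35, 0.41, 0.22)}
ranked = sorted(candidates.items(),
                key=lambda kv: fuse_similarity(*kv[1]), reverse=True)
print(ranked[0][0])  # best match for the reference
```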

2009

Advances in the Modelling of Facial Sub-Regions and Facial Expressions using Active Appearance Techniques

Ioana Bacivarov (PC as supervisor & PI) Thesis PDF here

Advances are presented in the modelling of facial sub-regions, in the use of enhanced whole-face models and, based on these sub-region models, in the determination of facial expressions. Models are derived using techniques from the field of active appearance modelling (AAM). A technical description and review of such techniques, and of a number of additional state-of-the-art techniques for face detection and face region analysis, is provided. A detailed literature review covering a range of topics relating to facial expression analysis is provided. In particular, the prior use of AAM techniques for facial feature extraction is reviewed, as is a range of methodologies for classifying facial expressions. Improved eye-region and lips-region models are presented. These models employ the concept of overlapping landmark points, enabling the resulting models to handle eye gaze, different degrees of closure of the eye, and texture variations in the lips due to the appearance of teeth when the mouth opens in a smile. The eye model is further improved by a component-AAM implementation enabling independent modelling of the state of each eye. Initialization of the lips model is improved using a hue-based pre-filter. A whole-face component-AAM model is provided, combining the improved eye and lips models in an overall framework which significantly increases the accuracy of fitting the AAM to facial expressions. A range of experiments is performed to tune and test this model for the purpose of accurate classification of facial expressions in unseen images. Both nearest-neighbour (NN) and support vector machine (SVM) classification methodologies are used. Testing of the system on classifying the six universal emotions and the neutral face state shows that an accuracy of 83% can be achieved using SVM classification. Preliminary investigations of additional enhancements to improve on this performance are provided, including the use of (i) pre-filters for gender, race, and age, (ii) person-specific AAM models, and (iii) expression tracking across multiple images in a video sequence. All of these techniques are shown to have the potential to further enhance the accuracy of expression recognition of the underlying component-AAM face model with eye and lips sub-region models.
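For context, the sketch below shows how expression classification from fitted AAM parameter vectors might look with an SVM in scikit-learn. The feature dimensionality and placeholder data are assumptions; only the seven-class setup (six universal emotions plus neutral) comes from the abstract above.

```python
# Minimal sketch of expression classification from AAM parameter vectors,
# assuming each face has already been fitted by the component-AAM and
# reduced to a fixed-length parameter vector. Data here are placeholders.
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

EMOTIONS = ["anger", "disgust", "fear", "happiness",
            "sadness", "surprise", "neutral"]

X = np.random.randn(700, 40)           # 40-D AAM parameter vectors (assumed)
y = np.random.randint(0, 7, size=700)  # one expression label per image

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
pred = clf.predict(X_te[:1])[0]
print("predicted:", EMOTIONS[pred], "| held-out accuracy:", clf.score(X_te, y_te))
```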

2014

Advances in the testing of stereo image acquisition devices

Istvan Andorko (PC as supervisor & PI) Thesis PDF here  employed in Xperi, Galway

The core problem addressed in this thesis is to provide a novel test bed for depth map generation and stereo image-based algorithms. The test bed comprises elements such as (i) the choice and calibration of stereo image acquisition devices; (ii) the choice and setup of objects in the test scenes; (iii) light intensity and temperature settings; (iv) camera-object and stereo-baseline distance setups; and (v) a repeatable test scene. In order to determine each element of the test bed, a number of experiments are presented. During the experimental process, four depth map algorithms are selected which, in conjunction with four different test scenes, are used for the initial measurements, in which (i) the influence of light intensity on algorithm performance is determined; (ii) the performance of the algorithms at different camera-object distances is determined; and (iii) the influence of similar objects in the test scene is described. To acquire stereo images, a number of devices are used, from off-the-shelf stereo cameras to custom-built stereo devices. During stereo image acquisition, a number of possible error sources are identified, and the problems are mitigated. The experiments show that slight changes in light intensity, light direction, camera-object distance and stereo base length can have a significant influence on the results of the depth map algorithms. With the help of these results, the details of the test bed are determined, and an on-line database is provided, containing details of the test scene and example stereo test images derived from the scene. This is available online for other researchers to download at http://www.andorko.com/stereo.html. The proposed test scene is provided with a view to having a standardized test scene that can be easily replicated by other researchers for testing stereo acquisition systems and associated depth map acquisition algorithms.
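As one concrete example of the kind of depth map algorithm such a test bed evaluates, the following sketch runs OpenCV's standard block-matching stereo correspondence on a rectified stereo pair. The file names and matcher parameters are illustrative assumptions, not the algorithms or settings used in the thesis.

```python
# Illustrative block-matching stereo depth on a calibrated, rectified pair.
import cv2

left = cv2.imread("scene_left.png", cv2.IMREAD_GRAYSCALE)    # placeholder file
right = cv2.imread("scene_right.png", cv2.IMREAD_GRAYSCALE)  # placeholder file

# Block matcher: numDisparities must be a multiple of 16; blockSize odd.
stereo = cv2.StereoBM_create(numDisparities=64, blockSize=15)
disparity = stereo.compute(left, right)  # fixed-point disparity map

# Depth is inversely proportional to disparity:
#   depth = focal_length_px * baseline_m / disparity
cv2.imwrite("disparity.png", cv2.normalize(
    disparity, None, 0, 255, cv2.NORM_MINMAX).astype("uint8"))
```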

2016

Contributions to practical iris biometrics on smartphones

Shejin Thavalengal (PC as supervisor & PI) Thesis PDF here  Data Science Manager at H&M, Sweden

This thesis investigates the practical adoption of iris biometrics on smartphones. Iris recognition is a mature and widely deployed technology which can provide the high security demanded by next-generation smartphones. Practical challenges in widely adopting this technology on smartphones are identified. Based on this, a number of design strategies are presented for constraint-free, high-performing iris biometrics on smartphones. A prototype smartphone form-factor device, intended for use as a front-facing camera, is presented. Analysis of its optical properties and iris imaging capabilities shows that such a device, with improved optics and sensors, could be used to implement iris recognition in the next generation of smartphones. A novel iris liveness detection technique is presented to prevent spoofing attacks on such a system. The social impact of wider adoption of this technology is also discussed. Iris pattern obfuscation is presented to address the various security and privacy concerns which may arise as iris recognition becomes part of daily life.


2018

Contributions to the measurement of depth in consumer imaging

Hossein Javidnia (PC as supervisor & PI) Thesis PDF here  Faculty member at Trinity College Dublin

This thesis examines and investigates methods that could potentially utilize images captured by consumer cameras, such as those in smartphones, to estimate depth and generate a 3D structure. After more than a century of research in depth sensing and 3D reconstruction, there are still open and unsolved challenges, and ultimately a practical solution to each problem will have to rely on combining a range of techniques, as there is no single best solution which can satisfy all the requirements of a depth sensing application. Based on this, a number of methods and frameworks are presented to take advantage of existing consumer cameras in depth sensing applications. A method is presented to post-process depth maps with respect to the geometrical structure of the scene. This method is later adopted to evaluate the effectiveness of deep learning approaches to monocular depth estimation. To utilize the mono cameras currently available on smartphones, a framework is presented which uses small pre-capture motions for 3D reconstruction and depth sensing applications. Similarly, a mono camera can be used to capture a sequence of images in different focal planes, known as a focal stack. A framework is designed to estimate a dense depth map from a focal stack in a reasonably fast processing time for high-resolution images. Lastly, to investigate the potential of current consumer multi-camera arrays, a framework is proposed to estimate dense depth maps from these cameras. The advanced capabilities of today's smartphones bring hope that we can arrive at a consensus depth-sensing imaging system in the next decade or so, and hopefully some of the contributions of this research will form part of that solution.
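To make the focal-stack idea concrete, here is a minimal depth-from-focus sketch in the generic style of such frameworks (not the thesis's algorithm): each pixel is assigned the stack index at which a local Laplacian focus measure peaks. The file names, stack size and smoothing window are assumptions.

```python
# Coarse depth-from-focus over a focal stack: sharper pixels imply the
# scene point lies near that image's focal plane.
import cv2
import numpy as np

stack = [cv2.imread(f"focal_{i:02d}.png", cv2.IMREAD_GRAYSCALE)
         for i in range(10)]  # images focused at increasing distances

focus = []
for img in stack:
    lap = cv2.Laplacian(img.astype(np.float64), cv2.CV_64F)
    # Local focus measure: squared Laplacian, smoothed over a window.
    focus.append(cv2.GaussianBlur(lap * lap, (9, 9), 0))

# Index of maximal sharpness per pixel ~ coarse depth label.
depth_index = np.argmax(np.stack(focus, axis=0), axis=0)
```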

Contributions to deep learning methodologies

Shabab Bazrafkan (PC as supervisor & PI) Thesis PDF here  Head of Machine Learning at blackshark.ai

In recent years, Deep Neural Networks (DNNs) have been used widely across a broad range of machine learning and data-mining applications. This pattern recognition approach can handle highly nonlinear problems. In this work, three main contributions to DNNs are presented: (i) a method called Semi Parallel Deep Neural Networks (SPDNN) is introduced, wherein several deep architectures are mixed and merged using a graph contraction technique to take advantage of all the parent networks; (ii) the importance of data is investigated in several studies, and an augmentation technique known as Smart Augmentation is presented; (iii) to extract more information from a database, multiple works on Generative Adversarial Networks (GANs) are presented, wherein the joint distribution of data and its ground truth is approximated, and in other projects conditional generators for classification and regression problems are trained and tested.
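A conceptual sketch of a Smart-Augmentation-style training loop is given below, assuming PyTorch: a small augmentor network blends two same-class samples into a new training sample, and the classifier's loss back-propagates through the augmentor so it learns useful blends. The architectures, tensor sizes and single-loss formulation are simplified assumptions, not the published configuration.

```python
# Toy Smart-Augmentation-style setup: augmentor and classifier trained jointly.
import torch
import torch.nn as nn

class Augmentor(nn.Module):
    """Maps two stacked grayscale images to one synthetic sample."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid())
    def forward(self, a, b):
        return self.net(torch.cat([a, b], dim=1))

augmentor = Augmentor()
classifier = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 10))
opt = torch.optim.Adam(
    list(augmentor.parameters()) + list(classifier.parameters()), lr=1e-3)
ce = nn.CrossEntropyLoss()

a = torch.rand(8, 1, 32, 32)           # sample 1 (placeholder batch)
b = torch.rand(8, 1, 32, 32)           # sample 2, same classes as `a`
labels = torch.randint(0, 10, (8,))

synthetic = augmentor(a, b)            # learned blend of each pair
# The classifier loss back-propagates through the augmentor, so the
# augmentor learns blends that help the classifier.
loss = ce(classifier(synthetic), labels)
opt.zero_grad(); loss.backward(); opt.step()
```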

2019

Design and development of a performance evaluation framework for remote eye gaze estimation systems

Anuradh Kar (PC as supervisor & PI) Thesis PDF here  employed in Aramis Lab, Sorbonne University, France

In this dissertation, a comprehensive evaluation framework is developed for remote eye gaze estimation systems implemented in consumer electronics applications. Firstly, a detailed literature review was conducted, providing deep insights into the current state-of-the-art in eye gaze estimation algorithms and applications by categorizing eye gaze research into different consumer use cases. Existing gaze estimation algorithms were classified, and their applications in interdisciplinary areas such as human-computer interaction, cognitive studies and consumer electronics platforms, including automotive, handheld devices, and augmented and virtual reality, were summarised. The review further identified the major challenges faced by contemporary remote gaze estimation systems, which include variable operating conditions, such as user distance from the tracker, viewing angle, head pose and platform movement, that have a significant impact on a gaze tracker's performance. Other issues include the lack of common evaluation methodologies, standard metrics, and comprehensive tools or software for quantitatively evaluating gaze data quality and studying the impact of the various challenging operating conditions on gaze estimation accuracy. Based on the outcomes of this review, the concept of a dedicated performance evaluation framework for remote eye gaze estimation systems was formulated. This framework was implemented through the following steps: a) defining new experimental protocols for collecting data from a remote eye tracker operating under several challenging operating conditions; b) collecting gaze data from a number of participants using a commercial remote eye tracker under variable operating conditions; c) developing a set of numerical metrics and visualization methods that use the collected data to express gaze tracking accuracy in homogeneous units and to quantitatively explore gaze data characteristics and quality; d) implementing machine learning models on the collected gaze datasets to identify and predict the error patterns produced in gaze data by different operating conditions; e) developing a software and web application that incorporates the developed metrics and visualization methods into user-friendly graphical interfaces; and f) creating open-source code and data repositories containing the performance evaluation tools and methods developed in this thesis, so that they can be used by researchers and engineers working with remote gaze estimation systems. The aim of this dissertation is to present a set of methods, data, tools and algorithms as analytical resources that the eye gaze research community can use to better understand eye tracking data quality, detect anomalous gaze data and predict likely error levels under the various operating conditions of an eye tracker. Overall, these methods are envisioned to improve the quality and reliability of eye tracking systems operating under practical and challenging scenarios in current and future consumer applications.
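As an example of expressing gaze accuracy in homogeneous units, the sketch below converts an on-screen gaze error from pixels into degrees of visual angle, given the display's pixel density and the user-tracker distance. The screen parameters and sample values are illustrative assumptions, not the framework's actual configuration.

```python
# Convert on-screen gaze error (pixels) to degrees of visual angle.
import numpy as np

def angular_error_deg(gaze_px, target_px, px_per_mm, user_dist_mm):
    """Angular gaze error (degrees) between estimate and true target."""
    err_mm = np.linalg.norm(
        np.asarray(gaze_px) - np.asarray(target_px)) / px_per_mm
    return np.degrees(np.arctan2(err_mm, user_dist_mm))

# Example: 60 px error on a ~3.78 px/mm display, user seated at 600 mm.
print(angular_error_deg((980, 540), (1040, 540), 3.78, 600.0))  # ~1.5 deg
```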

 

Contributions to unconstrained palmprint recognition on smartphones

Adrian Ungureanu (PC as supervisor & PI) Thesis PDF here  employed in Adobe, Romania.

This thesis investigates the suitability of unconstrained palmprint as a biometric modality for handheld devices equipped with a camera, such as smartphones. A detailed literature survey is provided, covering existing datasets, methods for region of interest (ROI) extraction, and feature extraction from palmprints. Following a series of exploratory experiments, a novel dataset of palmprints from 81 subjects, acquired using 5 different smartphone cameras, is developed. Details are provided of the initial data acquisitions, the final acquisition and management protocol, and the associated Ethics Application. The dataset was collected in several acquisition phases over a period of 8 months. A set of baseline matching experiments is also detailed, and manual mark-up of the palmprint data is included with the dataset. The accurate extraction of the palmprint ROI was identified as a key component in the biometric recognition pipeline, but the mark-up needed for ROI extraction is not available in existing palmprint datasets. Thus, a second dataset was acquired using a 3D sensor and an aligned camera, with suitable mark-up data. Over 25,000 images were acquired from 26 subjects over the course of 1 year. Corresponding experiments were designed to evaluate a range of machine learning approaches to ROI extraction, and a new quality measure was developed to compare the accuracy of ROI extraction for palmprints. Detailed experiments compared various ROI extraction techniques and demonstrated that unconstrained palmprint can serve as a practical means of biometric authentication using standard smartphone cameras, without the need for specialized fingerprint or 3D face sensors.
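A generic overlap-based quality measure for ROI extraction might look like the sketch below, comparing a predicted ROI mask against the manual mark-up via intersection-over-union. This is a standard IoU formulation for illustration, not necessarily the thesis's quality measure.

```python
# Overlap quality between a predicted palmprint ROI and the ground-truth
# mark-up, as binary masks of the same shape.
import numpy as np

def roi_quality(pred_mask: np.ndarray, gt_mask: np.ndarray) -> float:
    """Intersection-over-union between binary ROI masks, in [0, 1]."""
    pred, gt = pred_mask.astype(bool), gt_mask.astype(bool)
    union = np.logical_or(pred, gt).sum()
    return np.logical_and(pred, gt).sum() / union if union else 1.0
```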

On extending depth of field in fast photographic lenses

Niamh Fitzgerald (AG as supervisor; PC as project PI) Thesis PDF here 

Modern smartphone cameras are equipped with multiple camera modules. These modules are typically five millimetres in length, limiting the focal length of the lens. To preserve resolution, the entrance pupils of these lenses are relatively large compared to the focal length. Large apertures are subject to large amplitudes of optical aberrations. Within a short total track length, these perturbations can only be compensated by the use of high-order polynomial surfaces. Fast lenses characteristically have a shallow depth of field, whereby only a small region in object space is adequately in focus. This can be an undesirable effect for spatially-dependent imaging applications such as biometric sensing and user authentication. A large depth of field can reduce the need for mechanical refocusing and computationally intensive focusing algorithms. Traditionally in photographic optics, reducing the pupil diameter directly results in an increased depth of field. However, due to resolution requirements the aperture cannot be stopped down. This thesis proposes novel approaches to extending depth of field for fast miniaturised camera lenses by employing optical design methods in whole or in major part. Utilising the lens parameters to increase the usable depth of field offers low-cost solutions that could possibly replace voice coil motors and reduce the power consumption of camera modules.
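For readers unfamiliar with the trade-off, the sketch below evaluates the classical thin-lens depth-of-field relations: the hyperfocal distance H = f²/(N·c) + f and the near/far in-focus limits. It shows why stopping down (increasing the f-number N) extends depth of field, which is precisely what the resolution requirement forbids here. The numerical values are merely illustrative of a fast miniature lens, not a design from the thesis.

```python
# Thin-lens depth-of-field limits from focal length, f-number, and
# circle of confusion. All distances in millimetres.
def dof_limits(f_mm, N, coc_mm, s_mm):
    """Near/far in-focus limits for focal length f, f-number N,
    circle of confusion coc, and subject distance s (thin-lens model)."""
    H = f_mm ** 2 / (N * coc_mm) + f_mm          # hyperfocal distance
    near = s_mm * (H - f_mm) / (H + s_mm - 2 * f_mm)
    far = s_mm * (H - f_mm) / (H - s_mm) if s_mm < H else float("inf")
    return near, far

# f = 4 mm lens at f/1.4, 2 um circle of confusion, subject at 400 mm:
print(dof_limits(4.0, 1.4, 0.002, 400.0))  # ~ (374 mm, 430 mm): shallow DoF
```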

2020

Deep learning techniques in data augmentation and neural network design

Joe Lemley (PC as supervisor & PI) Thesis PDF here  employed in FotoNation Ireland Ltd., Parkmore, Galway

In recent years, deep learning has revolutionized computer vision and has been applied to a range of problems where it often achieves accuracies equal to or greater than those obtainable by individual human experts. This research improves on the state-of-the-art by proposing, implementing, and testing new models, architectures, and training methods that are more efficient while maintaining or improving the accuracy of previous methods. Special attention is given to improvements that address the specific needs of resource-constrained devices such as smartphones and embedded systems, and to cases where obtaining sufficient data is difficult. For this reason, the topic of data augmentation is a major theme of this work. Due to the ever greater need for smarter embedded devices, my research has focused on novel network designs and data augmentation techniques for a wide range of diverse tasks, connected only by the need for more efficient architectures and more data, in many cases improving on the accuracy of previous works in the process.

Depth of field reduction for miniature-camera imaging

Timothee Cognard  (CD as supervisor; PC as project PI) Thesis PDF here

This thesis investigates the reduction of depth of field for miniature camera systems, such as those embedded in smartphones. Work on digitally simulated shallow depth of field is already mature and is implemented in most modern phones in the form of image processing algorithms. Yet little research has been performed to provide optical design solutions for depth-of-field management. This thesis proposes a theoretical framework for understanding depth of field in the particular case of miniature cameras, and two practical solutions to reduce the depth of field. The first is an apodization phase-mask designed to achieve depth super-resolution, with about 20% lobe-width reduction. The second comprises f/1.0 and f/1.4 lens designs. Both solutions are presented and compared. Finally, some preliminary work is presented on how image processing could further improve on the hardware contributions of this thesis.

2022

Contributions to data augmentation techniques and synthetic data for training deep neural networks

Viktor Varkarakis  (PC as supervisor & PI) Thesis PDF here employed in FotoNation Ireland Ltd., Parkmore, Galway

In recent years deep learning has become increasingly popular and is applied in a variety of fields, yielding outstanding results across different machine learning applications. Deep learning based solutions thrive when a large amount of data is available for a specific problem, but data availability and preparation are the biggest bottlenecks in deep learning pipelines. With the fast-changing technology environment, new unique problems arise daily. To realise solutions in many of these specific problem domains there is a growing need to build custom datasets, tailored for a particular use case, with matching ground truth data. Acquiring such datasets at the scale required for training today's AI systems, and subsequently annotating them with an accurate ground truth, is challenging. Furthermore, with the recent introduction of GDPR and its associated complications, industry now faces additional challenges in the collection of training data that is linked to individual persons. This dissertation focuses on ways to overcome the unavailability of real data and to avoid the challenges that come with a data acquisition process. More specifically, data augmentation techniques are proposed to overcome the unavailability of real data, improve performance and allow the use of low-complexity models suitable for implementation in edge devices. Furthermore, the idea of using AI tools to build large synthetic datasets is considered as an alternative to real data samples. The first steps towards building and incorporating synthetic datasets effectively into deep learning training pipelines are: building AI tools that generate a large amount of new data and/or augment existing data samples; and creating methodologies and techniques to validate that the generated data behave like real data and to measure whether their use is effective when incorporated in the training pipelines. This dissertation contributes to both of these steps.
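One common way to check whether synthetic samples "behave like" real ones is to compare feature distributions. The sketch below computes the Fréchet distance between Gaussian fits of real and synthetic feature vectors (the statistic underlying FID); the choice of embedding network is left open, and this is a generic validation sketch rather than the thesis's specific methodology.

```python
# Frechet distance between Gaussian fits of real vs. synthetic features.
# Feature vectors are assumed to come from any fixed embedding network.
import numpy as np
from scipy import linalg

def frechet_distance(real_feats: np.ndarray, synth_feats: np.ndarray) -> float:
    mu_r, mu_s = real_feats.mean(0), synth_feats.mean(0)
    cov_r = np.cov(real_feats, rowvar=False)
    cov_s = np.cov(synth_feats, rowvar=False)
    covmean = linalg.sqrtm(cov_r @ cov_s).real  # drop numerical imaginaries
    diff = mu_r - mu_s
    return float(diff @ diff + np.trace(cov_r + cov_s - 2 * covmean))
```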

Evaluations of thermal imaging technology for automotive use cases

Muhammad Ali Farooq  (PC as supervisor & PI) Thesis PDF here employed in C3I as a postdoctoral researcher

Thermal imaging has been widely used in high-end applications, for instance industrial and military applications, as it provides superior and effective results in challenging environments and weather conditions, such as low-light scenarios, and offers greater immunity to visual limitations, thus providing increased situational awareness. This research explores the potential of thermal imaging for smart vehicular systems, including both in-cabin and out-of-cabin applications, using uncooled LWIR thermal imaging technology. Novel thermal datasets are collected in indoor and road-side environments using a specially designed low-cost yet effective prototype thermal camera module developed under the Heliaus project. The collected data, along with public datasets, are further used to generate large-scale thermal synthetic data using a composite structure of advanced machine learning algorithms. The next phase of this work focuses on designing AI-based smart imaging pipelines, which include a driver gender classification system and object detection in the thermal spectrum. The performance of these systems is evaluated using various quantitative metrics, including overall accuracy, sensitivity, specificity, precision-recall curves, mean average precision, and frames per second. Furthermore, the neural architectures trained and fine-tuned on thermal data are deployed on Edge-GPU embedded devices for real-time onboard feasibility validation tests. This is accomplished by optimizing the converged deep learning models using state-of-the-art (SoA) neural accelerators to achieve reduced inference times and higher FPS rates.
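As a simple illustration of the quantitative metrics mentioned above, the sketch below computes precision and recall from detection counts and measures frames per second for a detector callable. Here `run_detector` is a hypothetical stand-in for the trained thermal object detector, not an API from the thesis.

```python
# Detection metrics from counts, plus throughput measurement on-device.
import time

def precision_recall(tp: int, fp: int, fn: int):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def measure_fps(run_detector, frames):
    """Average inference throughput over a list of frames."""
    start = time.perf_counter()
    for frame in frames:
        run_detector(frame)
    return len(frames) / (time.perf_counter() - start)
```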

Aplanatic lens correctors for imaging optics

Michelle Rocha (AG as supervisor; PC as project PI) Thesis PDF here

This thesis proposes analytically developed aplanatic field correctors. To begin, an astigmatism correction method for aplanatic Gregorian telescopes was developed, employing two lenses with spherical surfaces and a spherical GRIN medium. Second, to expand the field of view of RC telescopes, a meniscus lens with aspherical surfaces has been designed. This meniscus can flatten the image surface while correcting astigmatism. The meniscus, on the other hand, suffers from lateral color. Finally, using the same principle as the meniscus, a refractive pair has also been designed to increase the FoV of RC telescopes. It does, however, correct the lateral color at the expense of introducing axial color. By adjusting the distance between its elements, this pair may function for various spectral bands individually. All of the methods mentioned above may be added to or removed from an existing aplanatic telescope without compromising image quality or changing the original telescope design. Furthermore, their employment maintains the telescope's aplanatic properties.

2023

Contributions to neural network models and training datasets for facial depth

Faisal Khan (PC as supervisor & PI) Thesis PDF here   employed in Valeo, Tuam, Co. Galway

The depth estimation problem has made significant progress due to recent improvements in Convolutional Neural Networks (CNNs) and the incorporation of traditional methodologies into these deep learning systems. Depth estimation is one of the fundamental computer vision tasks, as it involves the inverse problem of reconstructing the three-dimensional scene structure from two-dimensional projections. Due to the compactness and low cost of monocular cameras, there has been a significant and increasing interest in depth estimation from a single RGB image. Current single-view depth estimation techniques, however, are extremely slow for real-time inference on an embedded platform and are based on fairly large deep neural networks that require very large training sets. Due to the difficulties in obtaining dense ground-truth depth at scale across various environments, a range of datasets with distinctive features and biases has emerged. This thesis firstly provides a summary of the depth estimation datasets, techniques, studies, patterns, difficulties, loss functions and opportunities that remain open for research. For effective depth estimation from a single image frame, a method is proposed to generate high-accuracy synthetic human facial depth from synthetic 3D face models, enabling CNN models to be trained on facial depth estimation challenges. To validate the synthetic facial depth data, a comparative analysis of cutting-edge depth estimation algorithms on individual image frames from the generated synthetic dataset is presented. Following that, two different lightweight encoder-decoder-based neural networks are proposed for training on the generated dataset; when tested and evaluated across four public datasets, the proposed networks are shown to be computationally efficient and to outperform the current state-of-the-art. The proposed models are low-complexity, making them suitable for implementation on edge devices. Synthetic human facial depth data can help overcome the lack of real data and can increase the performance of deep learning methods for depth maps.
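To make the lightweight encoder-decoder idea concrete, a minimal PyTorch sketch of that general architecture family is shown below: RGB in, single-channel depth map out. The layer sizes are illustrative assumptions and do not reproduce the networks proposed in the thesis.

```python
# Tiny encoder-decoder for dense depth regression from a single RGB image.
import torch
import torch.nn as nn

class TinyDepthNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1))
    def forward(self, rgb):
        return self.decoder(self.encoder(rgb))

net = TinyDepthNet()
depth = net(torch.rand(1, 3, 128, 128))  # -> (1, 1, 128, 128) depth map
# Training would regress `depth` against the synthetic ground-truth depth,
# e.g. with an L1 loss: nn.L1Loss()(depth, gt_depth).
```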

 

Contributions Towards 3D Synthetic Facial Data Generation And 3D Face Analysis With Weak Supervision

Shubhajit Basak (MS as supervisor; PC as co-supervisor & PI) Thesis PDF here  employed in FotoNation Ireland Ltd., Parkmore, Galway

In this dissertation, we address the issue of the unavailability of high-quality, accurate real face data by applying two approaches: synthetic data generation and weakly supervised learning. With the help of low-cost digital asset creation software and an open-source computer graphics tool, we first build a pipeline to create a large synthetic face dataset. We rendered around 300k synthetic face images with extensive data diversity, such as different scene illuminations, backgrounds and facial expressions, together with ground truth annotations such as the 3D head pose and raw facial depth. We validate the synthetic data on two different facial analysis tasks: head pose estimation and face depth estimation. While learning head pose from the synthetic images, we propose an unsupervised domain-adversarial learning methodology to reduce the domain gap between real and synthetic face images. We show that our method achieves near-state-of-the-art (SOTA) results compared to methods that train solely on real data. Furthermore, to address the scarcity of 3D face data, we propose a weakly supervised approach to extract 3D face information from a single 2D face image. For this 3D face reconstruction task, we use a popular vision transformer with hierarchical feature fusion as the feature extractor module and train our network with a differentiable renderer in an unsupervised fashion, without any real 3D face scan data. Though this approach is able to generate an accurate 3D face shape from a single 2D face image, the model is large and requires substantial computational resources, making it unsuitable for low-cost consumer electronic devices or processing at the edge. So, in the last section of this thesis, we propose a pipeline to build 3D facial dense landmarks with 520 key points that cover the entire face and carry information about the overall facial structure. To show that the data generated by our proposed method preserves the 3D information, we train a dense face landmark predictor with this data. The trained model achieves results comparable to other SOTA methods on the 3D facial alignment task.

2024

Expected completions from Rishabh Jain, Wang Yao, Dan Bigioi