Advances in Deep Learning for Vision, with Applications to Industrial Inspection

Dean's Office - Faculty of Informatics

Start date: 27 March 2014

End date: 28 March 2014

You are cordially invited to attend the PhD Dissertation Defense of Jonathan MASCI on Thursday, March 27th, 2014 at 15:30 in room SI-008 (Informatics building).
 
Abstract:
Learning features for object detection and recognition with deep learning has received increasing attention in the past several years and recently attained widespread popularity.
In this PhD thesis we investigate its applications to the automatic surface inspection system of our industrial partner ArcelorMittal, for classification and segmentation problems. The algorithms currently employed use fixed feature extractors, which are hard to tune and require extensive prior knowledge.
Our work instead focuses on learnable systems that can be used to improve recognition and detection without requiring hard-to-obtain, task-specific domain knowledge.
For image classification we propose extensions to max-pooling convolutional networks, so that they can be applied to the general defect classification problem via new pooling and feature-encoding schemes.
State-of-the-art deep learning algorithms for object detection/segmentation have reached outstanding performance. Unfortunately, they do not meet the processing speeds required by the steel industry. We propose an architecture that does not suffer from the same computational bottleneck (1500-fold speed-up) while retaining equal performance.
To further advance the field we study the learning of morphological operators, which are widely used in industry. Only a few attempts have been reported in the literature, and no approach has considered the problem in its generality because it is hard to formulate. We tackle it from a different perspective and introduce a learnable framework which seamlessly integrates morphological operators, thus bringing these powerful tools to deep learning for the first time.
Re-engineering an industrial system requires time. In order to deliver an immediate return we investigate metric learning problems to boost the performance of currently used features. Our multimodal similarity-sensitive hashing model scales well to web-scale datasets and, thanks to the binary representation, requires little storage and involves a cheap distance computation. It outperforms previous state-of-the-art approaches without requiring additional resources.
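
As an illustration of why a binary representation makes the distance computation cheap, the following is a minimal sketch, not the model from the thesis. It assumes each item has already been mapped to a fixed-length binary code stored as an integer; the function names and the 8-bit codes are hypothetical. The Hamming distance between two codes then reduces to an XOR followed by a bit count.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    # XOR leaves a 1 wherever the two codes disagree; count those bits.
    return bin(code_a ^ code_b).count("1")


def nearest_code(query: int, database: list[int]) -> int:
    """Index of the database code closest to the query in Hamming distance."""
    return min(range(len(database)),
               key=lambda i: hamming_distance(query, database[i]))


# Hypothetical 8-bit codes; a real system would obtain them from a learned hash.
database = [0b10110010, 0b01001101, 0b10110110]
query = 0b10110111
best = nearest_code(query, database)
```

Each code here occupies a single integer, which is the source of the low storage cost; the linear scan in `nearest_code` is only for illustration.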

Dissertation Committee:

  • Prof. Jürgen Schmidhuber, Università della Svizzera italiana, Switzerland (Research Advisor)
  • Prof. Illia Horenko, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Michael Bronstein, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Yann LeCun, New York University, USA (External Member)
  • Prof. Hugues Talbot, Université Paris-Est, France (External Member)