Seminars at the Faculty of Informatics
Multimodal Document Annotation
This presentation discusses the processing and annotation of multimodal documents. Recent years have brought a vast and constantly growing amount of heterogeneous data. The immense quantity of video now available through television and the Internet can be a source of very useful information. To handle such data and use it correctly, it must be indexed and annotated. Because the data is complex and multimodal, a human annotator is usually needed. At the same time, manually annotating such a large quantity of video is not feasible because of the cost in both time and manpower. To address this problem, techniques are being developed that select the most suitable instances of a dataset for annotation. Active learning is a family of such methods that try to determine the most informative and relevant samples for manual annotation. The talk proposes an approach that aims to reduce human annotator involvement: active learning with annotation propagation for multimodal data. In particular, the method addresses how to name numerous persons within a video with the least possible human annotation. The results show the advantages of using both overlaid names (as the source of labels for the cold start) and propagation of annotations within clusters. Additionally, cross-modal effects are visible when annotation is given for just a single modality. Additional experiments are performed to test this framework, which includes speaker annotation and speaker identification model training.
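As a rough illustration of the idea in the abstract, the sketch below combines a cold start from seed labels (standing in for overlaid names), one round of querying a human annotator, and propagation of labels within clusters. It is not the speaker's implementation. The function names, the majority-vote propagation rule, and the "largest unlabeled cluster first" query heuristic are all assumptions made for this example, as are the toy track identifiers and person names.

```python
from collections import Counter


def propagate_within_clusters(cluster_of, labels):
    """Give each unlabeled item the majority label of its cluster, if one exists.

    Majority voting is an assumption of this sketch, not a rule from the talk.
    """
    votes = {}
    for item, label in labels.items():
        votes.setdefault(cluster_of[item], Counter())[label] += 1
    propagated = dict(labels)
    for item, cluster in cluster_of.items():
        if item not in propagated and cluster in votes:
            propagated[item] = votes[cluster].most_common(1)[0][0]
    return propagated


def pick_query(cluster_of, labels):
    """Pick one item from the largest cluster that has no label yet.

    Hypothetical selection heuristic: labeling a big cluster lets one
    manual annotation propagate to many items.
    """
    labeled_clusters = {cluster_of[item] for item in labels}
    sizes = Counter(c for c in cluster_of.values() if c not in labeled_clusters)
    if not sizes:
        return None
    target = max(sorted(sizes), key=lambda c: sizes[c])
    return min(item for item, c in cluster_of.items() if c == target)


def annotate(cluster_of, seed_labels, oracle, budget):
    """Ask the oracle (the human annotator) for at most `budget` labels, then propagate."""
    labels = dict(seed_labels)
    for _ in range(budget):
        query = pick_query(cluster_of, labels)
        if query is None:
            break
        labels[query] = oracle(query)
    return propagate_within_clusters(cluster_of, labels)


# Toy data: six face or speech tracks grouped into three clusters.
cluster_of = {"t1": 0, "t2": 0, "t3": 0, "t4": 1, "t5": 1, "t6": 2}
truth = {"t1": "Bob", "t2": "Bob", "t3": "Bob",
         "t4": "Alice", "t5": "Alice", "t6": "Carol"}
seed_labels = {"t4": "Alice"}  # cold start, e.g. from an overlaid name

result = annotate(cluster_of, seed_labels, oracle=truth.get, budget=1)
print(result)
```

With a budget of one manual label, the sketch queries a track from the largest unlabeled cluster and names five of the six tracks. The singleton cluster stays unnamed because no label reaches it.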
Mateusz Budnik received his B.Sc. in Computer Science from Wroclaw University of Technology in 2011. He received a double M.Sc. degree in Computer Science and Control Theory from Wroclaw University of Technology and Coventry University in 2012. He is currently a PhD candidate at the University of Grenoble-Alpes and CNRS, with his defense scheduled for December 2016. He is a member of the MRIM and GETALP groups at the Laboratoire d'Informatique de Grenoble.
During his PhD, he has worked on multimedia annotation and label propagation, active learning, and deep learning with applications to multimedia. He has authored or co-authored 17 papers, including 2 journal papers and 12 peer-reviewed conference/workshop papers.