Seminars at the Faculty of Informatics

The Faculty of Informatics is pleased to announce a seminar given by Paul Groth

TITLE: The Origin of Data: Determining the Provenance of Data in Multi-Institutional Scientific Systems
SPEAKER: Paul Groth
DATE: Wednesday, August 29, 2007
PLACE: USI Lugano, Room SI - 006, Informatics Building (Via G. Buffi 13)
TIME: 11.00-12.00

Science is changing. Increasingly, it can no longer be done in the confines of a single scientist's lab or for that matter in a single department, research institute or university. As scientists become reliant on data, processing, and apparatus provided by a variety of different institutions connected via complex workflows, it becomes difficult for them to understand, reproduce, and verify their results. Fundamentally, scientists lack a mechanism for determining the provenance of data in these environments.
To address this problem, we propose a new definition of provenance specifically suited to the computational model (service-oriented architecture) that underpins such multi-institutional applications.
The provenance of a result is the process that led to that result.
We have conceived a computer-based representation of provenance that allows us to perform useful reasoning about the origin of data.
We examine the nature of such a computer-based representation, which is articulated around the documentation of process.
We then examine the architecture of a provenance system. This architecture is centered around the notion of a store designed to support the provenance life cycle. The life cycle consists of two phases: a recording phase and a reasoning phase. In the recording phase, process documentation is archived in the store; in the reasoning phase, that documentation is reasoned over. We then show how this system can be used to answer provenance use cases from a Grid-enabled bioinformatics application.
The presentation will draw upon our experience in the PASOA and EU Provenance projects and will rely on use cases from the domains of bioinformatics, high energy physics, organ transplant management and aerospace engineering.
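As a purely illustrative aid, the two-phase life cycle described in the abstract can be sketched in a few lines of Python. This is not the speaker's system or its API: the names ProcessRecord, ProvenanceStore, record and provenance_of, and the example services and data identifiers, are all hypothetical and chosen only for this sketch. It assumes each piece of process documentation states that a service turned some inputs into one output.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ProcessRecord:
    """Hypothetical unit of process documentation: a service produced an output from inputs."""
    service: str
    inputs: tuple
    output: str


class ProvenanceStore:
    """Hypothetical store supporting a recording phase and a reasoning phase."""

    def __init__(self):
        # Index the documentation by the data item each record produced.
        self._by_output = defaultdict(list)

    def record(self, service, inputs, output):
        """Recording phase: archive one piece of process documentation."""
        rec = ProcessRecord(service, tuple(inputs), output)
        self._by_output[output].append(rec)
        return rec

    def provenance_of(self, data_id):
        """Reasoning phase: collect the records of the process that led to data_id."""
        seen, trace, stack = set(), [], [data_id]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            # .get avoids creating empty entries for raw inputs.
            for rec in self._by_output.get(current, []):
                trace.append(rec)
                stack.extend(rec.inputs)
        return trace


# Recording phase (assumed example data).
store = ProvenanceStore()
store.record("align", ["seq_a", "seq_b"], "alignment")
store.record("score", ["alignment"], "result")
store.record("unrelated", ["x"], "y")

# Reasoning phase: which services contributed to "result"?
services = [rec.service for rec in store.provenance_of("result")]
print(services)
```

In this sketch the query walks backwards from a result through the recorded inputs, so documentation of unrelated processes in the same store is left out of the answer.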

Paul Groth recently completed his PhD in computer science at the Intelligence, Agents, Multimedia Group at the University of Southampton. His research interests include provenance, large-scale distributed systems, HCI, and e-Science. Paul has previously done research at the Fraunhofer Institute for Manufacturing Engineering and Automation, the University of Ulm and the Institute for Human and Machine Cognition. He holds a bachelor's degree in Computer Science from the University of West Florida.