Faculty of Informatics

Informatics Seminar on Wednesday, December 10th at 13.30 - Thomas Heinis

Dean's Office - Faculty of Informatics

Start date: 10 December 2008

End date: 11 December 2008

The Faculty of Informatics is pleased to announce a seminar given by Thomas Heinis.

TITLE: Efficient Lineage Tracking for Scientific Workflows

SPEAKER: Thomas Heinis, Institute for Pervasive Computing, ETH Zurich

DATE: Wednesday, December 10th, 2008

PLACE: USI Università della Svizzera italiana, room SI-006, Informatics building (Via G. Buffi 13)

TIME: 13.30

ABSTRACT:

Data lineage and data provenance have been identified as key problems in the management of scientific data. Not knowing the exact provenance and processing pipeline used to produce a derived data set often renders the data set useless from a scientific point of view. On the positive side, capturing provenance information is facilitated by the widespread use of workflow tools for processing scientific data, since the workflow process describes all the steps involved in producing a given data set. On the negative side, efficiently storing and querying such information has until now proven to be difficult. Known solutions use recursive queries and even recursive tables to represent scientific workflows. Such solutions do not scale and are rather inefficient.

In this talk I will present our approach to the problem. We use a space- and query-efficient interval representation for dependency graphs and show how to transform arbitrary workflow processes into graphs that can be stored using such a representation. The approach is very efficient with respect to the time required to encode the graph and to answer lineage-related questions. We have benchmarked our approach by using it to store the data lineage of several different scientific workflows.
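To give a rough idea of what an interval representation of a dependency graph can look like, here is a minimal sketch of classic pre/post-order interval labeling on a tree. It is not the speaker's method, which the abstract does not specify; the function names (`label_intervals`, `derives_from`) and the example data set names are hypothetical, and the sketch assumes each data set has a single direct source, so it does not cover general workflow graphs with shared inputs.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]


def label_intervals(children: Dict[str, List[str]], root: str) -> Dict[str, Interval]:
    """Assign each node a (start, end) interval via an iterative depth-first walk.

    A node's interval strictly encloses the intervals of everything derived from it.
    Assumption: `children` describes a tree (one direct source per data set).
    """
    intervals: Dict[str, Interval] = {}
    start: Dict[str, int] = {}
    counter = 0
    stack = [(root, False)]  # (node, True once its children have been visited)
    while stack:
        node, visited = stack.pop()
        if visited:
            intervals[node] = (start[node], counter)
        else:
            start[node] = counter
            stack.append((node, True))
            for child in reversed(children.get(node, [])):
                stack.append((child, False))
        counter += 1
    return intervals


def derives_from(intervals: Dict[str, Interval], derived: str, source: str) -> bool:
    """Answer a lineage question with two comparisons, without any recursion."""
    s_start, s_end = intervals[source]
    d_start, d_end = intervals[derived]
    return s_start < d_start and d_end < s_end


# Hypothetical lineage: "raw" is processed into "filtered" and "calibrated";
# "filtered" is further processed into "peaks" and "stats".
lineage = {"raw": ["filtered", "calibrated"], "filtered": ["peaks", "stats"]}
intervals = label_intervals(lineage, "raw")

print(derives_from(intervals, "peaks", "raw"))         # True
print(derives_from(intervals, "peaks", "calibrated"))  # False
```

Once the intervals are stored, checking whether one data set derives from another is an interval-containment test rather than a recursive traversal, which is the general property that makes interval encodings attractive for lineage queries.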

In the remainder of the talk I will discuss how we have put our method to use in Sisyphus, a tool we have developed to process, manage and visualize proteomics data. Experiment data processing in Sisyphus is subject to constant change as the focus of the experiments changes and different or new processing algorithms must be considered. Clearly, with a perpetually changing data processing pipeline, tracking the lineage of the data becomes critically important. Tracking the lineage of data in Sisyphus is, however, difficult: data dependencies are intricate and the amount of lineage data is vast, making scalable and efficient tracking mechanisms mandatory.

BIO:

Thomas Heinis received a master's degree in computer science from the Federal Institute of Technology (ETH) in Zurich, Switzerland, in March 2002. After working in industry, he visited Purdue University from August 2003 to May 2004, where he carried out research in the area of Grid computing. Since June 2004 he has been a research assistant at the Information and Communication Group in the Institute for Pervasive Computing at ETH Zurich, Switzerland. His main research interests are in the field of grid and autonomic computing as well as data lineage in the context of scientific applications.

HOST: Prof. Cesare Pautasso

Contacts

Dean's Office - Faculty of Informatics

+41 58 666 46 90

[email protected]

Quicklink

http://www.inf.unisi.ch

Faculty of Informatics
Università della Svizzera italiana
Via Buffi 13
6900 Lugano, Switzerland
tel +41 58 666 46 90
fax +41 58 666 45 36
e-mail [email protected]