Democratizing Graph Analytics

Staff - Faculty of Informatics

Start date: 10 March 2017

End date: 11 March 2017

Speaker:

Marco Serafini

Hamad bin Khalifa University

Date:

Friday, March 10, 2017

Place:

USI Lugano Campus, room SI-006, Informatics building (Via G. Buffi 13)

Time:

09:30

Abstract:

Graphs are a natural and increasingly popular data representation in a large number of fields, from the Web and advertising to biology and metadata. There is a rich and growing literature on algorithms for graph analytics and graph mining. However, handling graph data, building effective graph analytics pipelines, and selecting the right analysis algorithms still require specific expertise that is rare among practitioners. One key element in making graph analytics more accessible is better systems support. Systems should have programming abstractions that simplify the implementation of graph analytics tasks and the interpretation of their results. This talk will present two efforts in this direction. The first is Arabesque, a system for distributed graph mining. Arabesque defines a high-level filter-process computational model that simplifies the development of scalable graph mining algorithms such as finding frequent subgraphs or cliques. Implementations on top of Arabesque require a handful of lines of code, scale to trillions of subgraphs, and in some cases are the first available distributed solutions. The talk will then present QGraph, a system for parallel graph search that distributes sequential graph search algorithms, balances load, and minimizes coordination. QGraph supports "heavy" searches that return a very large number of results, making graph search a viable filtering step in graph analytics pipelines.

Biography:

Marco Serafini is a Scientist at the Qatar Computing Research Institute (QCRI), where he develops programming abstractions and systems for scalable graph search, exploration, and mining. He also works on elasticity and load balancing for real-time distributed data management systems, as well as on distributed coordination. His work has appeared in venues such as SOSP, NSDI, VLDB, ICDE, DSN, and PODC. He serves or has served as a PC member of SOSP, VLDB, EuroSys, ICDE, ICDCS, and WWW, among others, and he co-chaired the PaPoC workshop, which is co-located with EuroSys. Before QCRI he was with Yahoo! Research, where he worked on the ZooKeeper coordination system. Marco received his PhD from TU Darmstadt, Germany.

Host:

Prof. Kai Hormann