Seminars at the Faculty of Informatics
Recommender Systems in the Wild
In recent years, the research field of Recommender Systems (RecSys) has experienced a surge in complex methods and advanced techniques. While such efforts have tremendously improved the capabilities of RecSys, they have also made the task of designing one more daunting than ever, as the space of options has increased dramatically.
In this talk I will present two cases of Recommender Systems "in the wild". Their designs were heavily influenced by a thorough analysis of the data generated by the target platforms. In both cases, we show that previous work, driven by intuitive but wrong assumptions, cannot match the performance of our systems.
I will first introduce "Bartering Books to Beers: A Recommender System for Exchange Platforms", our latest paper, published at WSDM 2017. In this work we propose new models for bartering-based recommendation, for which we introduce three novel datasets from online bartering platforms. Surprisingly, we find that existing methods (based on matching algorithms) perform poorly on real-world platforms, as they rely on idealized assumptions that are not supported by real bartering data. We develop approaches based on Matrix Factorization to model the reciprocal interest between users and each other's items. We also find that the social ties between members have a strong influence, as does the time at which they trade; we therefore extend our model to be socially and temporally aware.
In the second part of the talk I will introduce an ongoing project in collaboration with the Wikimedia Foundation. As of today, 37% of the articles in the English Wikipedia are flagged as "stubs" (e.g., not enough content, subpar quality, etc.). It would therefore be extremely useful to provide Wikipedia editors with a RecSys that could guide them in improving the content and quality of such articles. Departing from previous work, which is often based on natural language generative models, we are currently developing a set of techniques to mine and recommend article templates (i.e., ordered lists of sections).
Michele Catasta is a research scientist and lecturer in Data Science at EPFL, Switzerland. In a few months he will move to Stanford (under Prof. Leskovec's supervision) to pursue his research agenda in data science and machine learning on massive datasets. Michele was in the founding team of Sindice.com, the largest Semantic Web search engine (now SIREn Solutions). He also worked for the MIT Media Lab, Google and Yahoo Labs. In the past years, he has received several awards and recognitions, among them a focused grant from Samsung Research USA.