Fine-Grained Code Summarization

Software Institute

Date: 3 March 2022 / 16:30 - 17:30

Online


Speaker
Luca Pascarella, Software Institute, USI

Abstract
Code comments provide information about the underlying implementation and help developers understand source code. To help developers write and maintain code comments, researchers have proposed automatic source code summarization techniques. For example, researchers exploit deep learning (DL) models trained on large datasets of comment-code pairs created at the function level. Although it is easy to automatically create a training dataset featuring, for example, Java methods, the performance of DL techniques on automatic documentation tasks is discouraging. For this reason, it would be worthwhile to start addressing a 'simpler problem', namely the generation of comments documenting small snippets of code rather than entire functions. Nonetheless, this requires the creation of a training dataset featuring examples of documented code snippets (i.e., code statements with their related comments), something difficult to accomplish automatically due to the lack of accurate linking techniques able to associate comments with code statements. In this talk, I am going to present our ongoing research aimed at creating a large-scale, fine-grained dataset for automated code summarization. This dataset opens the way to the development of techniques that automatically link code statements with their relevant comments, enabling applications such as automatic fine-grained code summarization.

Biography
Luca Pascarella is a postdoctoral researcher at the Università della Svizzera italiana (USI), Switzerland, where he is part of the Software Institute. He received his Ph.D. in Computer Science from the Delft University of Technology (TU Delft), The Netherlands, in 2020. His broader mission is to ease engineering tasks through data-driven algorithms that leverage the large amount of information recorded during modern engineering processes. His research interests include empirical software engineering, mining software repositories, and code review. He received an ACM SIGSOFT Distinguished Paper Award at MSR 2017 and a Best Paper Award Honorable Mention at CSCW 2018. More information is available at: https://lucapascarella.com.

Chair
Tahereh Zohdinasab
 

*************************

In February 2019, the Software Institute started its SI Seminar Series. Every Thursday afternoon, a researcher of the Institute gives a short public talk on a software engineering topic of their choice. Examples include, but are not limited to, novel interesting papers, seminal papers, personal research overviews, discussions of preliminary research ideas, tutorials, and small experiments.

On our YouTube playlist you can watch some of the past seminars. More details on the next seminar, the upcoming seminars, and an archive of the past speakers are available here.