Analysis and Optimization of Task Granularity on the Java Virtual Machine

Staff - Faculty of Informatics

Date: 6 July 2018 / 16:30 - 18:00

USI Lugano Campus, room SI-003, Informatics building (Via G. Buffi 13)

 

You are cordially invited to attend the PhD Dissertation Defense of Andrea ROSÀ on Friday, July 6, 2018 at 16:30 in room SI-003 (Informatics building).

 

Abstract:

Task granularity, i.e., the amount of work performed by parallel tasks, is a key performance attribute of parallel applications. On the one hand, fine-grained tasks (i.e., small tasks carrying out few computations) may introduce considerable parallelization overheads. On the other hand, coarse-grained tasks (i.e., large tasks performing substantial computations) may not fully utilize the available CPU cores, leading to missed parallelization opportunities.

We focus on task-parallel applications running in a single Java Virtual Machine on a shared-memory multicore. Although their performance may depend considerably on the granularity of their tasks, this topic has received little attention in the literature. Our work fills this gap by analyzing and optimizing the task granularity of such applications.

In this dissertation, we present a new methodology to accurately and efficiently collect the granularity of each executed task, implemented in a novel profiler. Our profiler collects carefully selected metrics from the whole system stack with low overhead. Our tool helps developers locate performance and scalability problems, and identifies classes and methods where optimizations related to task granularity are needed, guiding developers towards useful optimizations.

Moreover, we introduce a novel technique to drastically reduce the overhead of task-granularity profiling by reifying the class hierarchy of the target application within a separate instrumentation process. Our approach allows the instrumentation process to instrument only the classes representing tasks, inserting more efficient instrumentation code that decreases the overhead of task detection. Our technique significantly speeds up task-granularity profiling, thereby enabling the collection of accurate metrics with low overhead.

We use our novel techniques to analyze task granularity in the DaCapo, ScalaBench, and Spark Perf benchmark suites. We reveal inefficiencies related to fine-grained and coarse-grained tasks in several workloads. We demonstrate that the collected task-granularity profiles are actionable by optimizing task granularity in numerous benchmarks, performing optimizations in the classes and methods indicated by our tool. Our optimizations result in significant speedups (up to 5.90x) in numerous workloads suffering from coarse- and fine-grained tasks in different environments. Our results highlight the importance of analyzing and optimizing task granularity on the Java Virtual Machine.

Dissertation Committee:

  • Prof. Walter Binder, Università della Svizzera italiana, Switzerland (Research Advisor)
  • Prof. Fernando Pedone, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Robert Soulé, Università della Svizzera italiana, Switzerland (Internal Member)
  • Prof. Giuseppe Serazzi, Politecnico di Milano, Italy (External Member)
  • Prof. Petr Tuma, Charles University in Prague, Czech Republic (External Member)