Towards a Scalable Supercomputer Architecture

Staff - Faculty of Informatics

Start date: 27 January 2010

End date: 28 January 2010

The Faculty of Informatics is pleased to announce a seminar given by Christoph Lenzen.

DATE: Wednesday, January 27th, 2010
PLACE: USI Università della Svizzera italiana, room SI-008, Informatics building (Via G. Buffi 13)
TIME: 16.30

Many of today's supercomputers provide point-to-point communication between processors on top of a three-dimensional torus topology. This leads to a significant scalability issue: the effective bandwidth a node can utilize deteriorates like 1/n^(1/3) in a network of n nodes. Since the next generation of supercomputers will feature a million processors or more, this means that less than 1% of the available communication capacity can be used. In contrast, a fully connected network would suffer only a small-factor loss in bandwidth utilization, but is prohibitively expensive.
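The 1% figure follows directly from the 1/n^(1/3) scaling. The short Python sketch below reproduces the arithmetic; the function name and the `dims` parameter are illustrative assumptions, and constant factors are ignored, so the values indicate orders of magnitude rather than measured bandwidth.

```python
def torus_bandwidth_fraction(n: int, dims: int = 3) -> float:
    """Rough usable-bandwidth fraction per node in a dims-dimensional torus.

    Assumption: the fraction scales like 1 / n**(1/dims), with all
    constant factors dropped (the abstract only states the scaling).
    """
    if n < 1 or dims < 1:
        raise ValueError("n and dims must be positive")
    return n ** (-1.0 / dims)


if __name__ == "__main__":
    # System sizes mentioned in the abstract, plus a smaller one for contrast.
    for n in (1_000, 250_000, 1_000_000):
        print(f"n = {n:>9,}: fraction ~ {torus_bandwidth_fraction(n):.4f}")
```

For n = 1,000,000 the estimate is about 0.01, i.e. 1%, and it keeps shrinking as n grows.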

Therefore, we propose the CLEX architecture, which strives for an efficient compromise between these two extremes. Analysis shows that in a CLEX system, each node can exploit a constant fraction of its available bandwidth for point-to-point communication to or from arbitrary destinations. This is achieved at reasonable node degrees of n^eps, for an arbitrarily small constant eps > 0. Furthermore, messages are forwarded only a constant number of times and travel a physical distance proportional to the physical network diameter; thus, in an uncongested setting, message delays are asymptotically optimal. We evaluated the practical merit of our proposal through simulation of two systems comprising roughly 250,000 and 1,000,000 processors, respectively. The results indicate that CLEX systems of these sizes would outperform a torus architecture by at least an order of magnitude, at node degrees of 66 and 35, respectively.

In 2007, Christoph Lenzen received his diploma degree in mathematics from the University of Bonn, Germany. His thesis dealt with multiscale simulation in structural mechanics. Afterwards, he started his Ph.D. studies in distributed computing at ETH Zurich, advised by Roger Wattenhofer. So far, he has focused on graph algorithms, clock synchronization, and parallel load balancing problems.

HOST: Prof. Rolf Krause