NICTA Embedded Systems Public Seminar
From Efficient Cloud Infrastructure to Simulating Clouds Efficiently
Peter Strazdins, PhD, School of Computer Science, ANU College of Engineering and Computer Science
Time/Venue
Friday 4 February 2011, 14:30
NICTA, Neville Roach Laboratory, Level 1 Seminar Room West, 223 Anzac Parade (Building L5), Kensington NSW 2052
Abstract
The first part of this talk will give an overview of some present and future work related to infrastructure for cloud computing for HPC. This includes optimizing live migration and communication performance of Xen, a dynamic scheduling framework for heterogeneous clouds, a resilient and heterogeneity-oblivious HPC programming paradigm, clouds for large-scale scientific data sets, and intra-processor network profiling tools (what has that got to do with clouds? The Single Chip Cloud Computer :).
The second part of this talk describes a profiling methodology and performance tuning of the Met Office Unified Model for weather and climate simulation.
We develop an efficient profiling methodology and scalability analysis of the MetUM version 7.5 at both low-scale and high-scale atmosphere grid resolutions. Variability within the execution of the MetUM and variability of the run-time of identical jobs on a highly shared cluster are taken into account. The methodology uses a lightweight profiler internal to the MetUM, which we have enhanced to have minimal overhead; it enables accurate profiling with a relatively modest usage of processor time.
At high-scale resolution, the MetUM scaled to core counts of 2000, with load imbalance accounting for a significant fraction of the loss from ideal performance. This was on the NCI vayu cluster (Nehalem X5570 nodes with InfiniBand). Recent patches have removed two relatively small identified sources of inefficiency. Process and NUMA affinity, made available in recent versions of OpenMPI, had a surprisingly large impact on scalability.
Internal segment size parameters gave a modest performance improvement at low-scale resolution (such as is used in climate simulation); this, however, was not significant at higher scales. Near-square process grid configurations tended to give the best performance. Byte-swapping optimizations vastly improved I/O performance, which in turn has a large impact on performance in operational runs.
Biography
Peter Strazdins received a PhD in Computer Science from the Australian National University in 1990. Since then, he has been with the School of Computer Science at the Australian National University, and was closely associated with the ANU-Fujitsu CAP Parallel Computing Project from 1990 to 2002. His research interests include parallel numerical algorithms and libraries, computer architecture and operating systems for high performance computers, computer simulation, and performance modelling and analysis. Peter is currently a Senior Lecturer and Associate Head (Coursework) at the School of Computer Science.

