JUNE 18–22, 2017

Session Details

Name: Tutorial 09: Advanced OpenMP: Performance & 4.5 Features
Time: Sunday, June 18, 2017
09:00 am - 01:00 pm
Room:   Extrakt  
Breaks:   08:00 am - 10:00 am Welcome Coffee
11:00 am - 11:30 am Coffee Break
01:00 pm - 02:00 pm Lunch
Presenter:   Bronis R. de Supinski, LLNL
  Michael Klemm, Intel
  Eric Stotzer, Texas Instruments
  Christian Terboven, RWTH Aachen University
Abstract:   With the increasing prevalence of multicore processors, shared-memory programming models are essential. OpenMP is a popular, portable, widely supported, and easy-to-use shared-memory model. Developers usually find OpenMP easy to learn. However, they are often disappointed with the performance and scalability of the resulting code. This disappointment stems not from shortcomings of OpenMP but rather from the lack of depth with which it is employed. Our Advanced OpenMP Programming tutorial addresses this critical need by exploring the implications of possible OpenMP parallelization strategies, both in terms of correctness and performance. While we quickly review the basics of OpenMP programming, we assume attendees understand basic parallelization concepts and will easily grasp those basics. In two parts, we discuss language features in depth, with emphasis on advanced features such as tasking, vectorization, and compute acceleration. In the first part, we focus on performance aspects, such as data and thread locality on NUMA architectures, and on exploiting the comparatively new language features. The second part presents the directives for attached compute accelerators.

Content Level 
10% introductory, 50% intermediate, 40% advanced

Targeted Audience 
Our primary target is HPC programmers with some knowledge of OpenMP who want to implement efficient shared-memory code for multicore NUMA systems and accelerated systems.