|Name:||Tutorial 11: InfiniBand & High-Speed Ethernet: Advanced Features, Challenges in Designing HEC Systems & Usage|
|Time:||Sunday, June 18, 2017
02:00 pm - 06:00 pm
|Breaks:||04:00 pm - 04:30 pm Coffee Break|
|Presenters:||Dhabaleswar K. Panda, Ohio State University|
|Hari Subramoni, Ohio State University|
|Abstract:||As InfiniBand (IB) and High-Speed Ethernet (HSE) technologies mature, they are being used to design and deploy different kinds of High-End Computing (HEC) systems: HPC clusters with accelerators (GPGPUs and Xeon Phi) supporting MPI, Storage and Parallel File Systems, Cloud Computing systems with SR-IOV Virtualization, Big Data systems with Hadoop (HDFS, MapReduce and HBase) and Spark, Multi-tier Datacenters with Web 2.0 (memcached), Deep Learning middleware and Grid Computing systems. These systems are bringing new challenges in terms of performance, scalability, portability, reliability and network congestion. Many scientists, engineers, researchers, managers and system administrators are becoming interested in learning about these challenges, the approaches being used to solve them, and the associated impact on performance and scalability. This tutorial will start with an overview of these systems. Advanced hardware and software features of IB, HSE and RoCE, and their capabilities to address these challenges, will be emphasized. Next, we will focus on RDMA programming (OpenFabrics and Libfabric), and on the network management infrastructure and tools needed to use these systems effectively. A common set of challenges faced while designing these systems will be presented. Finally, case studies focusing on domain-specific challenges in designing these systems, their solutions, and sample performance numbers will be presented.
The content level will be as follows: 10% beginner, 50% intermediate, and 40% advanced.
This tutorial is targeted at scientists, engineers, developers, researchers, and system administrators working on the design, development and maintenance of high-end computing systems, high performance communication and I/O, storage, networking, middleware, virtualization, and applications related to high-end and cloud computing systems.
Attendees are expected to know the basic features and workings of IB or HSE (or any other high-speed networking) technologies. For attendees not familiar with any of these, taking the complementary basic tutorial (titled “IB and High-speed Ethernet for Dummies”) is recommended.