Vector Hardware
Steve Lantz with contributions from Aaron Birkland and the Texas Advanced Computing Center
Cornell Center for Advanced Computing and Texas Advanced Computing Center
Revisions: 9/2022, 5/2021, 1/2021, 5/2018, 6/2017, 10/2013 (original)
This section provides a basic overview of SIMD hardware and instruction sets found in modern CPUs and coprocessors, including those in large-scale HPC systems such as Stampede2.
Objectives
After you complete this topic, you should be able to:
- Explain how the width of vector registers affects SIMD operations
- Describe the role of a core's vector processing units in executing vector instructions
- Explain why Intel's Knights Landing (KNL), Skylake (SKX), and Ice Lake (ICX) processors support overlapping but non-identical subsets of the AVX-512 instruction set
- Distinguish between vector parallelism and multithreading parallelism
- Compare and contrast the performance benefits of vectorization and multithreading
Prerequisites
- Knowledge of C and/or Fortran, as well as a basic understanding of what assembly language is
- Familiarity with batch job submission on large compute clusters