Introduction

Steve Lantz with contributions from Aaron Birkland and the Texas Advanced Computing Center
Cornell Center for Advanced Computing and Texas Advanced Computing Center

Revisions: 3/2023, 9/2022, 5/2021, 1/2021, 5/2018, 6/2017, 10/2013 (original)

Vectorization is a process by which floating-point computations in scientific code are compiled into special instructions that execute elementary operations (+, -, *, etc.) or functions (exp, cos, etc.) in parallel on fixed-size vector arrays. The ultimate goal of vectorization is an increase in floating-point performance (and possibly integer and logical performance as well) through hardware parallelism.
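
As a rough illustration of the idea, the C sketch below shows the kind of simple loop a compiler may be able to vectorize, followed by a hand-written picture of what "operating on fixed-size vectors" means. The function names vec_add and vec_add_chunked and the vector length VLEN = 4 are hypothetical choices made for this example; real vector lengths depend on the hardware, and whether a given loop is actually vectorized depends on the compiler and its optimization settings.

```c
#include <stddef.h>

/* Scalar form: one addition per loop iteration.
 * The iterations are independent of one another, and the restrict
 * qualifiers promise that the arrays do not overlap. Both properties
 * make this loop a natural candidate for automatic vectorization. */
void vec_add(const float *restrict a, const float *restrict b,
             float *restrict c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}

/* Assumed vector length, for illustration only. */
#define VLEN 4

/* Conceptual picture of the vectorized loop, written in plain C.
 * Each pass of the outer loop stands in for ONE vector instruction
 * that would add VLEN pairs of elements at once. The inner loop is
 * only an emulation of that single instruction. */
void vec_add_chunked(const float *restrict a, const float *restrict b,
                     float *restrict c, size_t n)
{
    size_t i = 0;

    /* Main loop: full groups of VLEN elements. */
    for (; i + VLEN <= n; i += VLEN) {
        for (size_t j = 0; j < VLEN; j++) {
            c[i + j] = a[i + j] + b[i + j];
        }
    }

    /* Remainder loop: leftover elements when n is not a multiple of VLEN. */
    for (; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

Both functions compute the same result. The second form only makes visible the grouping that vector hardware performs, including the leftover elements that do not fill a complete vector.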

This topic is a general introduction to the vectorization process, focusing on what vectorization is and how it increases performance.

Objectives

After you complete this topic, you should be able to:

  • Describe the concept of vectorization and the motivation for making use of it in your application
  • Explain what vectorization involves from the hardware, compiler, and user perspectives
  • Define SIMD, and relate this term to the execution of vector instructions
  • Discuss the effect of vector length on speedup
  • Give several reasons why ideal speedup may not be realized in application performance as a whole
  • Define vector intrinsics
  • Describe how a simple loop can be vectorized automatically by a compiler
  • Explain how a fused multiply-add instruction improves vector performance
Prerequisites
 