Introduction

Steve Lantz with contributions from Aaron Birkland and the Texas Advanced Computing Center
Cornell Center for Advanced Computing and Texas Advanced Computing Center

Revisions: 3/2023, 9/2022, 5/2021, 1/2021, 5/2018, 6/2017, 10/2013 (original)

Vectorization is a process by which floating-point computations in scientific code are compiled into special instructions that execute elementary operations (+, -, *, etc.) or functions (exp, cos, etc.) in parallel on fixed-size vector arrays. The ultimate goal of vectorization is an increase in floating-point performance (and possibly integer and logical performance as well) through hardware parallelism.

This topic is a general introduction to the vectorization process, focusing on what vectorization is and how it increases performance.
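To make the definition above concrete, here is a minimal sketch in C of the kind of loop a compiler can often vectorize automatically. The function name saxpy and its signature are illustrative assumptions, not code from this workshop. Whether vector instructions are actually generated depends on the compiler, the optimization flags, and the target hardware.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration, not workshop code):
 * computes y[i] = a * x[i] + y[i] for i = 0 .. n-1.
 *
 * Every iteration is independent of the others, so a vectorizing compiler
 * may process several consecutive elements with one vector instruction.
 * The restrict qualifiers promise that x and y do not overlap in memory,
 * which removes one common obstacle to automatic vectorization. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++) {
        /* One multiply and one add per element; on hardware that supports
         * it, the compiler may combine them into a fused multiply-add. */
        y[i] = a * x[i] + y[i];
    }
}
```

With GCC, for example, automatic vectorization is enabled at -O3, and the -fopt-info-vec option reports which loops were vectorized. Other compilers provide comparable reports under different option names.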

Objectives

After you complete this topic, you should be able to:

Describe the concept of vectorization and the motivation for making use of it in your application
Explain what vectorization involves from the hardware, compiler, and user perspectives
Define SIMD, and relate this term to the execution of vector instructions
Discuss the effect of vector length on speedup
Give several reasons why ideal speedup may not be realized in application performance as a whole
Define vector intrinsics
Describe how a simple loop can be vectorized automatically by a compiler
Explain how a fused multiply-add instruction improves vector performance

Prerequisites

Knowledge of C and/or Fortran, as well as a basic knowledge of what assembly language is
Familiarity with batch job submission on large compute clusters