This module provides an introduction to concepts of visualization with a focus on parallel computing techniques to handle large datasets. It should clarify some of the choices in configuring tools such as ParaView for use on Stampede2.

Visualization encompasses all of the steps necessary to understand a given dataset. When a scientific simulation completes, or when sensors produce a file on disk, those files can be far from intelligible. Each file has its own data format containing fields of values defined on mathematical spaces; units and physical meaning are assigned to these values. If the data is an image from a microscope, it can be displayed immediately. If it instead describes all of the products of combustion in a flame, many decisions and further processing steps are needed to transform that information into colored pixels on a screen. Visualization steps fall into two groups: those that transform the data into colored polygons, and those that take those polygons and display them on the screen.
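As a rough illustration of the first group of steps, the sketch below assigns a color to each scalar value in a dataset. It is a minimal example under stated assumptions, not how ParaView implements color mapping: the function name scalar_to_rgb, the blue-to-red ramp, and the sample temperature values are all hypothetical.

```python
def scalar_to_rgb(value, vmin, vmax):
    """Map a scalar to an (r, g, b) tuple on a simple blue-to-red ramp.

    Hypothetical helper for illustration only; real tools offer many
    color maps and interpolation schemes.
    """
    if vmax <= vmin:
        raise ValueError("vmax must be greater than vmin")
    # Normalize to [0, 1], clamping values that fall outside the range.
    t = (value - vmin) / (vmax - vmin)
    t = min(1.0, max(0.0, t))
    red = round(255 * t)
    blue = round(255 * (1.0 - t))
    return (red, 0, blue)


# Made-up sample values standing in for a temperature field.
temperatures = [300.0, 1150.0, 2000.0]
colors = [scalar_to_rgb(t, 300.0, 2000.0) for t in temperatures]
```

Choosing the range (vmin, vmax) is one of the decisions mentioned above: the same data can look very different depending on which values are mapped to the ends of the ramp.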

The graphics pipeline is the set of steps to display a given set of polygons as pixels on a screen. Understanding this pipeline can explain why your screen is unexpectedly all black or why data seems to flicker in and out.

Every stage of the visualization pipeline can be parallelized in various ways. An overview of the types of parallelization shows the basic options. The graphics pipeline is parallelized in very specific ways because polygons must be sorted before they are displayed.
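One way to see why ordering matters when rendering in parallel is depth compositing, in which each worker renders only its share of the polygons and the partial images are merged by keeping, for every pixel, the fragment nearest the viewer. The sketch below is a simplified illustration of that idea and assumes this particular strategy; the names composite, worker_a, and worker_b are hypothetical, and images are reduced to short lists of (color, depth) pairs.

```python
FAR = float("inf")  # depth assigned to pixels where nothing was drawn


def composite(partials):
    """Merge per-worker images by keeping the nearest fragment per pixel.

    Each image is a list of (color, depth) pairs, one pair per pixel.
    Smaller depth means closer to the viewer.
    """
    pixel_count = len(partials[0])
    result = [(None, FAR)] * pixel_count  # start from an empty background
    for image in partials:
        result = [
            new if new[1] < old[1] else old
            for old, new in zip(result, image)
        ]
    return result


# Two hypothetical workers, each having rendered a different subset of polygons.
worker_a = [("red", 2.0), ("red", 5.0), (None, FAR)]
worker_b = [("blue", 3.0), ("blue", 1.0), (None, FAR)]

final = composite([worker_a, worker_b])
colors = [color for color, _ in final]
```

Here the first pixel ends up red and the second blue because those fragments are nearer the viewer, and the third stays empty. Without the depth comparison, the result would depend on which worker's image happened to be merged last.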

Given these techniques, the interface for ParaView and other visualization packages should become clearer.

Originally developed January 2009
Last updated October 2014

Aaron Birkland (original author), Adam Brazier
Cornell Center for Advanced Computing