Visualization Overview
In this topic, we first provide a brief overview of some visualization concepts, and then highlight a few key Python libraries for data visualization: Matplotlib, the mainstay of Python graphics, which provides a low-level but powerful set of visualization operations; Seaborn, a library geared towards data science applications with a higher-level API, sitting on top of Matplotlib and integrating nicely with Pandas; and Bokeh, a library supporting web-based, interactive visualizations, including pan/zoom, object selection, metadata tags, user-definable callbacks, and much more.
Plotting with Matplotlib can be accomplished in several ways. The most straightforward and widely used is the matplotlib.pyplot module, which is conventionally imported under a shorthand name: import matplotlib.pyplot as plt. The plt module contains many functions for generating different kinds of plots and customizing their layout and appearance. These functions accept data stored in a number of different data structures, such as Python lists, NumPy arrays, and Pandas DataFrames. In addition, Pandas defines plotting methods on DataFrame and Series objects, which call the underlying Matplotlib functions. Seaborn is more focused on working with data in DataFrames, but uses the same Matplotlib primitives underneath. Thus one can use the higher-level Seaborn API to generate plots of interest, and then use the lower-level Matplotlib API to customize those plots as desired.