A Map of Hitting in Baseball

Each point represents a single player in the history of Major League Baseball, summarized by career batting statistics expressed as rates per at-bat for different hitting outcomes, averaged over the career. Blue dots represent pitchers (specifically, people listed in the pitching dataset who appeared as a pitcher in at least 10 games over their career), while red dots represent position players (referred to simply as "players" here). Large dots represent people who have been inducted into the Baseball Hall of Fame (HOF), while small dots represent people who have not. Because the map is built from hitting statistics, it is not surprising that pitchers and position players tend to be separated from one another: pitchers are not primarily valued for their hitting skills and have historically been poor hitters. The map is interactive: hover over a point to see that player's name and some career batting statistics. You can also pan and zoom by selecting tools in the right-hand panel to examine subgroups in more detail. To return to the full map after zooming in, select the reset tool in the same panel.
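As a rough illustration of the rate-per-at-bat summary described above, here is a minimal Python sketch. It assumes that "averaged over their career" means career totals divided by career at-bats (averaging season-level rates would give different numbers). The row layout, the outcome names, and the player IDs are hypothetical, and the numbers are made up rather than real statistics.

```python
from collections import defaultdict

# Hypothetical season-level rows: (player_id, at_bats, hits, home_runs, strikeouts).
# These values are invented for illustration and are not real statistics.
SEASONS = [
    ("player_a", 500, 150, 20, 80),
    ("player_a", 400, 100, 10, 70),
    ("player_b", 50, 5, 0, 25),
]

# Assumed outcome columns, in the same order as the counts in each row.
OUTCOMES = ("hits", "home_runs", "strikeouts")


def career_rates(seasons):
    """Sum each player's seasons, then divide every outcome by career at-bats."""
    totals = defaultdict(lambda: [0] * (1 + len(OUTCOMES)))
    for player_id, at_bats, *counts in seasons:
        row = totals[player_id]
        row[0] += at_bats
        for i, count in enumerate(counts, start=1):
            row[i] += count

    rates = {}
    for player_id, (at_bats, *counts) in totals.items():
        if at_bats == 0:
            continue  # no at-bats means no defined rate, so skip the player
        rates[player_id] = {
            name: count / at_bats for name, count in zip(OUTCOMES, counts)
        }
    return rates


rates = career_rates(SEASONS)
for player_id, player_rates in rates.items():
    print(player_id, player_rates)
```

Each player then becomes one fixed-length vector of rates, which is the kind of input a dimensionality-reduction method can place on a two-dimensional map.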

Our arbitrary and not particularly systematic choice to identify anyone as a "Pitcher" if they appeared in at least 10 games as a pitcher over their career has interesting implications. For example, Jimmie Foxx, one of the greatest sluggers in the history of baseball, more or less came out of retirement in 1945, when many other players were fighting in World War II, and played as both a position player and a pitcher, just reaching the 10-game threshold. He shows up in the hitting map alongside other Hall of Fame sluggers, including Babe Ruth, who did pitch a lot, and very effectively, in the early part of his career before being converted to a full-time position player.
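The threshold rule discussed above can be sketched in a few lines of Python. The function name, the dictionary of career totals, and the player IDs are hypothetical. Only the 10-game cutoff comes from the text.

```python
# Career games pitched needed to be labeled a pitcher (the cutoff stated in the text).
PITCHER_GAME_THRESHOLD = 10


def label_role(career_games_pitched, threshold=PITCHER_GAME_THRESHOLD):
    """Return "pitcher" at or above the threshold, otherwise "player"."""
    return "pitcher" if career_games_pitched >= threshold else "player"


# Hypothetical career totals of games pitched, invented for illustration.
career_games_pitched = {
    "player_a": 0,    # never pitched
    "player_b": 10,   # exactly at the threshold, so labeled a pitcher
    "player_c": 9,    # one game short, so labeled a player
    "player_d": 350,  # career pitcher
}

roles = {pid: label_role(games) for pid, games in career_games_pitched.items()}
print(roles)
```

Because the comparison is inclusive, a slugger with exactly 10 career pitching appearances lands in the pitcher group, which is the edge case the Foxx example illustrates.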

While the details of t-SNE plots such as this are difficult to interpret and change stochastically from run to run, the fact that many HOF position players are bunched along the edges of the map presumably indicates that they are "outliers" with regard to their hitting skills, perhaps providing insight into their inclusion in the Hall of Fame. There are different subsets of HOF players, which perhaps reflect different hitting characteristics (e.g., sluggers vs. not), different eras, or both. Some of the HOF position players situated more toward the interior of the diagram seem to be lighter-hitting players valued more for their fielding skills. If you poke around the southernmost part of the map, you will find that the statistics of Tris Speaker (in the HOF) and Shoeless Joe Jackson (not) are extremely similar. Shoeless Joe's exclusion from the Hall of Fame is something that ought to be rectified.
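To make the run-to-run variation mentioned above concrete, here is a minimal sketch of a t-SNE embedding with an explicit random seed. It assumes scikit-learn's TSNE, which may not be the implementation behind this map. The feature matrix is synthetic random data standing in for the per-at-bat rates, and the perplexity value and the embed helper are illustrative choices, not settings taken from the original analysis.

```python
import numpy as np
from sklearn.manifold import TSNE

# Synthetic stand-in for the rate matrix: 60 "players" x 5 outcome rates.
# Random numbers only; not real batting data.
rng = np.random.default_rng(0)
X = rng.random((60, 5))


def embed(features, seed):
    """Project the feature rows to 2-D with t-SNE using a fixed random seed."""
    tsne = TSNE(
        n_components=2,
        perplexity=15,      # illustrative value; must be smaller than the row count
        init="pca",
        random_state=seed,  # a different seed generally yields a different layout
    )
    return tsne.fit_transform(features)


embedding = embed(X, seed=0)
print(embedding.shape)
```

Changing the seed typically moves and rotates clusters, so directions such as "southernmost" describe one particular run of the map rather than a stable property of the data. Which points end up near each other is usually more stable than where they sit.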