SQL JOINs
Combining data from more than one table with INNER JOINs
Because of the normalization process, data of interest are often spread across multiple tables, which are related (directly or via intermediary tables) by foreign keys. SQL has JOIN functionality to combine the data from different tables; effectively this constructs a table from the requested columns, with the rows from multiple tables matched up via the ON keyword. The returned data can be filtered using WHERE as described previously.
Consider the following tables in a database:
Consider a simple question one might ask of the instruments, such as "what sort of measurement unit (C, F, mm, m/s, etc.) does each Magnotron_2016 instrument take?" The information to answer this question is stored in two tables—Instrument and Instrument_models—which are related together by instrument_model being the primary key of Instruments_models and being a column in the Instruments table, with a foreign key between the two. Imagine doing this by hand; for each instrument_id, you'd get the instrument model and then look that up in Instrument_models to get the corresponding measurement_type, for each instrument_id. This is what an INNER JOIN, the JOIN we examine in this topic, does, all in one query. For the query in question, it would look like this:
If the two tables in question contain these data:
Then the output will look like this:
It is possible to JOIN more tables. If we wanted all of the measurements for each Magotron_2016 model, we'd also need to join to the Measurements table, as with this table:
Here, "M" was already taken as an alias, so we used "X" instead; sometimes we might have used "I_M" for Instrument_models or just used the full, unaliased names. We have freedom to order the column list as we wish them to be returned, and the order of listing the table JOINs is unimportant for INNER JOINs.
Other JOINs
The INNER JOIN is the most commonly-used SQL JOIN, and you may often see it written as just JOIN. However, there are other two other forms of JOIN in use:
- LEFT OUTER JOIN or RIGHT OUTER JOIN: these have the form "<table1> LEFT/RIGHT OUTER JOIN <table2> ON <table1>.<column> = <table2>.<column>" where one table is the base table for the JOIN and even if there is no match in the other table, there will be a row returned with NULL values in any requested components from the JOINed table. In a LEFT OUTER JOIN <table1>, named first, is the base table, and <table2> is the base table for a RIGHT OUTER JOIN. This sort of query is often used to find unmatched values across tables and is often followed with a "WHERE ... IS NULL/NOT NULL" clause. With OUTER JOINs, the order in which the tables are presented does matter.
- CROSS JOIN. This forms a cartesian product of the tables' rows and doesn't specify or attempt to match on any fields. It is not used as often as INNER and OUTER JOINs.