First Normal Form (1NF)
As normally written, 1NF imposes three main requirements on the database:
- Only atomic values: this means there should be one value in each column, per row
- No repeating groups: this means you shouldn't have multiple columns containing the same sort of thing (observer1, observer2, etc.)
- No repeated rows: in practice, we generally achieve this by specifying a primary key and will do so here
The upshot of this process generally results in more tables than we started with; for this reason, it's often best to leave the designation of the primary keys until the new tables are created, as the last step so the database can be in 1NF'. Each of the tables created as a result of this process must themselves be in 1NF.
The first constraint of atomicity, is broken by the columns instrument_names and instrument_types. This can be resolved by creating an Instruments table where each individual instrument will instead occupy its own row (you can imagine that those non-atomic columns are actually like tables within the Measurements table). The second constraint, of no repeating groups, is failed by the three measurement columns, with one column for each of rain_, wind_ and temperature_ measurement. This is actually fairly arguable, as they each have different units, but we can put them as rows in their own table and it would allow for us to have one or two of those measurements only without resorting to NULLable columns; furthermore, if we get new measurement types, we shan't need to add columns to the table (adding rows is a natural process in relational databases; adding columns should be much rarer). In general, in a normalized database, NULLable columns in one table can be avoided by creating a new table in which a missing value would mean a missing row instead. So, remembering to specify primary keys (indicated by "PK"), we might end up with this:
Our database is now in 1NF, where we have introduced a instrument_id as PK for the Instruments table—this could be a surrogate key or perhaps a serial number, uniquely identifying a given instrument—and have a compound primary key in the measurements table, with the three columns required to uniquely specify a measurement. Note also that we recognised the need for measurement units for the measurements and added it to the instruments table on the assumption that the units of measurement are part of the instrument's' specification. When normalization creates the need for extra tables, we typically also pick up a foreign key connecting them, which has been shown on the diagram for explanatory purposes although in practice it wouldn't be created until the database was fully normalized.