Relational databases are a widely used and powerful way to store data: they support robust querying and help keep the stored data consistent. Sometimes called "SQL databases", they are queried using the Structured Query Language (SQL), which can also be used to create, populate, and alter a database. Relational database management systems (RDBMSs) and their SQL dialects have much in common; advanced features may require consulting the documentation for a specific RDBMS, but most operations can be done with standard SQL.
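
As a rough illustration of the four uses of SQL mentioned above (create, populate, alter, query), here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name `samples`, its columns, and the data values are hypothetical and chosen only for this example. The SQL statements stick to widely supported syntax, but the `?` placeholder style is specific to the sqlite3 driver, and other RDBMS drivers may use a different one.

```python
import sqlite3

# In-memory SQLite database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create: define a table (hypothetical name and columns).
cur.execute("""
    CREATE TABLE samples (
        id INTEGER PRIMARY KEY,
        site VARCHAR(50) NOT NULL,
        reading REAL
    )
""")

# Populate: insert a few made-up rows.
cur.executemany(
    "INSERT INTO samples (id, site, reading) VALUES (?, ?, ?)",
    [(1, "north", 2.5), (2, "south", 4.0), (3, "north", 3.5)],
)

# Alter: add a column to the existing table, then fill it in.
cur.execute("ALTER TABLE samples ADD COLUMN units VARCHAR(10)")
cur.execute("UPDATE samples SET units = 'mg/L'")
conn.commit()

# Query: average reading per site.
cur.execute(
    "SELECT site, AVG(reading) FROM samples GROUP BY site ORDER BY site"
)
rows = cur.fetchall()
print(rows)

conn.close()
```

SQLite is used here only because it ships with Python and needs no server; the same statements should carry over to other common RDBMSs with little or no change.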

Objectives

After you complete this workshop, you should be able to:

  • Understand a relational database which has already been created
  • Query a database which already exists, including loading more data
  • Design a normalized relational database
  • Incorporate relational databases into your workflow

Prerequisites

Running the examples in this workshop requires access to a relational database management system (RDBMS). Nearly all of the SQL used here should work in any common RDBMS.

There are no formal prerequisites for this Virtual Workshop topic. The following forms of preparation may be helpful for contextualizing the material in this workshop with real-world examples:

  • Familiarity with a dataset of interest (sample datasets can be used instead)
  • Familiarity with a processing/analysis pipeline

Requirements

System/compute requirements to complete this topic:

  • A relational database management system (RDBMS) installed and accessible
  • Permissions in the RDBMS sufficient for the tasks you intend to perform (e.g., querying, creating databases, populating tables, writing to the database)
© Cornell University | Center for Advanced Computing