cloud-init and user_data

Many cloud platforms provide a mechanism called "cloud-init" for initializing new instances. Cloud-init allows a "user_data" script to be run when the instance first boots, and these scripts are often used for things like configuring the instance settings and adding user accounts. The OpenStack Terraform provider allows you to specify such a user_data script as part of an openstack_compute_instance_v2 instance configuration.

Expert cloud-init users who seek finer control over cloud-init may prefer to use Terraform's template_cloudinit_config data source (since superseded by the cloudinit_config data source in the hashicorp/cloudinit provider). But for most users, the simpler (but limited) user_data approach described here will be sufficient.

A user_data Example

In this example for a hypothetical application, we are adding a user_data script to be run on our "follower" instances. The script passes the IP address of our "leader" instance to every follower so that the follower is able to connect to the leader. It also refreshes the operating system's package lists, installs Python's "pip" installer, and installs the Python package "numpy".

We will discuss each of the commands individually, but it should be noted that their ordering here is not accidental and may be fragile depending on the operating system of your instance. You must be prepared to modify it if needed. In general, you should expect to work iteratively when developing your own user_data provisioning scripts, as getting these commands to succeed can be fussier than you might expect.

To work with this example, download a new version of the followers.tf file (first seen on the Create Multiple Instances page), replacing the existing file. This new version adds a "user_data" argument to the resource block, starting with the line "user_data = <<-EOF" and ending with the closing "EOF" (see below).


resource "openstack_compute_instance_v2" "followers" {
  count           = var.num_followers
  name            = "terratest_follower_${count.index}"
  image_id        = var.image_id
  flavor_name     = var.follower_flavor
  key_pair        = openstack_compute_keypair_v2.this.name
  security_groups = [openstack_compute_secgroup_v2.ssh_ping.name]

  network {
    name = openstack_networking_network_v2.private.name
  }

  user_data = <<-EOF
    #! /bin/bash
    LOG="/opt/terratest.log"
    sudo apt-get update > $LOG
    echo ${openstack_networking_floatingip_v2.leader.address} > /opt/leader_fip.txt
    echo '* libraries/restart-without-asking boolean true' | sudo debconf-set-selections
    sudo apt-get -y install python3-pip >> $LOG
    pip3 install numpy >> $LOG
  EOF
}

If you are unfamiliar with this "heredoc" syntax, it allows multiple lines of text to be treated as a single string in a shell script or other application (such as a Terraform configuration file!). All of the lines after the "<<-EOF" and before the terminating "EOF" line will be assigned as one string to the "user_data" argument. This is a convenient way to keep all of your configuration text in one file, but you also have the option of storing the script in its own file and using it with an argument like user_data = file("install_numpy.sh"). Be aware that file() reads the script verbatim, so a "${...}" reference like the one in this example would not be replaced; Terraform's templatefile() function is the usual way to pass values into a separate script. Let's look at each section of this script in turn:


#! /bin/bash
LOG="/opt/terratest.log"
sudo apt-get update > $LOG

Because this text will be executed as a script, we need to declare the shell that should run it (line 1). Bash is a safe choice. User_data scripts can be difficult to debug because they run at boot, where you can't watch them. A good aid is to define a log file (line 2) and redirect the output of each command into it. After the instance is up and running, you can log in and read the log file to figure out what went wrong. Note that ">" and ">>" capture only standard output; appending "2>&1" to a command would also capture its error messages. Once you are done debugging the script, you have the option of removing the logging. Our first real command refreshes the package lists (line 3). "apt-get update" does not upgrade any installed packages, but it is often necessary so that later install commands can find current package versions.
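
As a rough sketch of the logging idea, the helper below records both standard output and error messages for each command it runs. The function name run_logged and the temporary log path are illustrative assumptions, not part of the followers.tf script, which writes to /opt/terratest.log.

```shell
#!/bin/bash
# Illustrative log location (an assumption); the tutorial's script
# uses /opt/terratest.log instead.
LOG="$(mktemp)"

# Hypothetical helper: record the command line, then append both
# stdout and stderr of the command to the log file.
run_logged() {
  echo "+ $*" >> "$LOG"
  "$@" >> "$LOG" 2>&1
}

run_logged echo "package lists refreshed"
# A failing command leaves its error message in the log as well.
run_logged ls /nonexistent-path || echo "a command failed; see $LOG"
```

Wrapping each command this way keeps the log complete without repeating the redirection on every line, at the cost of one extra function in the script.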


echo ${openstack_networking_floatingip_v2.leader.address} > /opt/leader_fip.txt

This line takes a value from a Terraform attribute and writes it to a file. From there, an application on the follower instance could read the value and use it to connect to the leader. Enclosing the name of a Terraform attribute in "${}" substitutes the value of that attribute into the string. Such a reference creates a dependency in Terraform between this instance and the floating IP resource that provides the address, because the reference can only be replaced by its value once that address has been allocated. Plain shell variables such as $LOG are left untouched, since Terraform interpolates only the "${...}" form; a shell expression that needs braces must be escaped as "$${...}".
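
To illustrate the consuming side, here is one way an application's startup script on a follower might read the stored address. This is only a sketch: the function name read_leader_ip is hypothetical, and the demo uses a temporary file with a documentation-only address in place of /opt/leader_fip.txt.

```shell
#!/bin/bash
# Hypothetical helper: print the leader address stored in a file,
# failing if the file is unreadable or empty.
read_leader_ip() {
  local file="$1" ip
  [ -r "$file" ] || return 1
  ip="$(tr -d '[:space:]' < "$file")"
  [ -n "$ip" ] || return 1
  printf '%s\n' "$ip"
}

# Demo with a temporary file standing in for /opt/leader_fip.txt;
# 203.0.113.10 is a documentation-only example address.
demo_file="$(mktemp)"
echo "203.0.113.10" > "$demo_file"
leader_ip="$(read_leader_ip "$demo_file")" && echo "leader is at $leader_ip"
```

On a real follower, the call would presumably be read_leader_ip /opt/leader_fip.txt, made after the user_data script has finished.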


echo '* libraries/restart-without-asking boolean true' | sudo debconf-set-selections
sudo apt-get -y install python3-pip >> $LOG
pip3 install numpy >> $LOG

The idea here is to install the "numpy" package for Python 3 because our imaginary application needs it (line 3). The image used in the "follower" example is for Ubuntu 22 and already includes pip3, but for demonstration purposes we are installing it again (line 2). The "apt-get" installer used in that command will sometimes prompt you if it installs components that require services to be restarted. So, we are changing a system setting before anything else to make sure that no interactive prompt throws off our script (line 1). These are just a few examples of the kinds of challenges you may face when writing user_data provisioning scripts.
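
Since the image may already provide pip3, one possible refinement is to install a package only when its command is missing. The sketch below assumes a Debian-style system and that the script runs as root (as cloud-init user scripts normally do), so it omits sudo. The function name ensure_command and the DRY_RUN switch are hypothetical; DEBIAN_FRONTEND=noninteractive is the standard environment variable for asking debconf not to prompt.

```shell
#!/bin/bash
# DRY_RUN=1 (the default in this sketch) only reports what would happen.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: install a package only when its command is missing.
ensure_command() {
  local cmd="$1" pkg="$2"
  if command -v "$cmd" >/dev/null 2>&1; then
    echo "$cmd already present"
  elif [ "$DRY_RUN" = "1" ]; then
    echo "$cmd missing; would install $pkg"
  else
    # Ask debconf not to show interactive prompts during the install.
    DEBIAN_FRONTEND=noninteractive apt-get -y install "$pkg"
  fi
}

ensure_command pip3 python3-pip
```

Setting DRY_RUN=0 in the real user_data script would perform the install; leaving it at 1 is a safe way to try the logic on your own machine first.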

Deploying and Connecting

You can apply the changes using terraform apply -var-file=terratest.tfvars. If that is successful, you will see that Terraform understood that adding (or changing) a user_data script requires that the instance be recreated, while the same allocated IP address can then be associated with the new instance. This is efficient and good, but it can lead to a problem.

Once Terraform is done deploying and the instance has had time to boot up (which may take longer than before because of the new user_data script), you can try to SSH into the instance. If you had previously SSH'd into an earlier version of this follower instance at this same IP address, you may see an error message that begins: "WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!" This happens because SSH remembered information about the previous instance it reached at this IP address, and now that information has changed (of course it has; it's a new instance!). If this happens to you, you can tell SSH not to worry about it by adding this option to your SSH command: -o StrictHostKeyChecking=no. Be aware that this option turns off a safety check, so use it only for instances you have just created yourself.
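
An alternative to disabling the check is to delete the stale entry so that SSH asks you to accept the new instance's key on the next connection. The sketch below wraps ssh-keygen's -R option, which removes all saved keys for a host; the function name forget_host and the placeholder address are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: remove any saved host keys for one address.
# The second argument is optional and defaults to ~/.ssh/known_hosts.
forget_host() {
  local host="$1" known_hosts="${2:-$HOME/.ssh/known_hosts}"
  ssh-keygen -R "$host" -f "$known_hosts"
}

# Example call (placeholder address, substitute your follower's IP):
#   forget_host 203.0.113.10
```

ssh-keygen keeps a backup of the previous file alongside the edited one, so the change can be undone if you remove the wrong host.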

After successfully logging in to the follower instance, have a look around to see if the script worked correctly. You may need to wait a bit for the user_data script to finish before you will see the desired effects. Check to see:

  • Is there a leader_fip.txt file and a log file in /opt? Does the log file indicate that pip3 and numpy were installed?
  • Can you run pip3?
  • In Python 3, can you import numpy?

If any of these did not work correctly, check the log file, adjust your user_data script and deploy again.
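
The checks listed above can be collected into a small script to run on the follower after logging in. This is a minimal sketch: the helper name check is hypothetical, and the paths are the ones used by the example user_data script.

```shell
#!/bin/bash
# Hypothetical helper: run a command quietly and report PASS or FAIL.
check() {
  local label="$1"
  shift
  if "$@" >/dev/null 2>&1; then
    echo "PASS: $label"
  else
    echo "FAIL: $label"
  fi
}

check "leader_fip.txt is present and non-empty" test -s /opt/leader_fip.txt
check "log file exists" test -f /opt/terratest.log
check "pip3 runs" pip3 --version
check "numpy imports in Python 3" python3 -c "import numpy"
```

If a check fails shortly after boot, the script may simply not have finished yet. On images that ship the cloud-init command-line tool, running cloud-init status --wait should block until cloud-init is done, and on Ubuntu the output of user_data scripts is typically also captured in /var/log/cloud-init-output.log.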

 
© Cornell University | Center for Advanced Computing