Creating RHEL7 Deployment Templates

There is a critical step when creating OS images or templates for use in image-based provisioning systems such as those embedded in most virtualization platforms: cleaning up residual instance-specific data from the base, or golden, image. In Windows administration terminology this step is called ‘sysprep’. Failing to perform it can lead to various problems, such as provisioned hosts failing to boot, failing to automatically gain network connectivity, or subtle identification issues when trying to integrate the provisioned host into network-wide distributed systems.

Recent changes in the low-level plumbing of Linux systems, mostly due to the switch from System-V based management to systemd-based system and service management, necessitate some updates to the procedures used to perform golden image cleanup. This post documents the various steps needed to clean up a RHEL7 golden image. While I haven’t tested them directly on such systems, similar steps should apply to Fedora and CentOS 7 systems as well.

Erasing network configuration data

Failing to perform this step may result in issues ranging from provisioned hosts failing to automatically gain network connectivity to duplicate IPs on the network created during the provisioning process. To perform this step one must delete all the files that match the ‘ifcfg-*‘ pattern that reside under the following path, except the ‘ifcfg-lo‘ file:

/etc/sysconfig/network-scripts
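
The deletion described above could be sketched as a small shell function. This is only an illustration, not a tested procedure: the function name ‘erase_ifcfg’ and its optional directory argument are my own assumptions, added so the function can be tried on a scratch directory first.

```shell
#!/bin/bash
# erase_ifcfg [DIR]: delete every ifcfg-* file directly under DIR,
# keeping ifcfg-lo. DIR defaults to the path named in the text.
# (The function name and the DIR argument are hypothetical.)
erase_ifcfg() {
    local dir="${1:-/etc/sysconfig/network-scripts}"
    find "$dir" -maxdepth 1 -type f -name 'ifcfg-*' ! -name 'ifcfg-lo' -delete
}

# Usage, as root on the golden image host:
#   erase_ifcfg
```

Other files in the directory, such as the ‘ifup-*’ helper scripts, do not match the pattern and are left alone.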

Some provisioning systems create network configuration during the provisioning process; however, I have found this to be somewhat fragile. You can increase the chances of a provisioned host successfully gaining network connectivity, especially in DHCP-managed networks, by creating a generic configuration for the first network interface. You do this by creating the ‘ifcfg-eth0’ file with the following configuration:

DEVICE=eth0
ONBOOT=yes
TYPE=Ethernet
BOOTPROTO=dhcp

Note that the last line above (‘BOOTPROTO=dhcp’) was not needed in RHEL6, as DHCP was used by default, but it seems to be needed in RHEL7. Apart from the interface configuration files, all host-specific configuration should be removed from the ‘/etc/sysconfig/network’ file. I find it easiest to simply empty the file, but others might prefer to leave some generic configuration, such as disabling the IPv6 stack or setting up a DNS domain search suffix. One could also simply erase the ‘/etc/sysconfig/network’ file, but I found that this causes unneeded error messages to be printed during the boot process.
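
As a rough sketch, writing the generic ‘ifcfg-eth0’ file and emptying ‘/etc/sysconfig/network’ could be scripted as follows. The function name and the optional ROOT prefix are assumptions of mine (the prefix only exists so the function can be exercised against a scratch tree); the file contents are the four lines shown above.

```shell
#!/bin/bash
# write_generic_network_config [ROOT]: create a generic DHCP config for
# eth0 and truncate the global network file. ROOT is a hypothetical
# prefix for testing; leave it out to act on the live system.
write_generic_network_config() {
    local root="${1:-}"
    cat > "$root/etc/sysconfig/network-scripts/ifcfg-eth0" <<'EOF'
DEVICE=eth0
ONBOOT=yes
TYPE=Ethernet
BOOTPROTO=dhcp
EOF
    # Truncate instead of deleting, to avoid the boot-time error messages.
    : > "$root/etc/sysconfig/network"
}

# Usage, as root on the golden image host:
#   write_generic_network_config
```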

Erasing network interface hardware association

This step is needed so that the network interface created for the provisioned hosts will end up being called ‘eth0’ by the OS, so that it will pick up the basic configuration created in the previous step. To perform this step one must delete the following file if it exists:

/etc/udev/rules.d/70-persistent-net.rules

Sysadmins should be familiar with this step as a similar step was also needed in older RHEL versions. In my testing I have found that this file is not always created on RHEL7. But this might be a function of the virtualization environment I was testing with (RHEV/oVirt).

Erasing hostname

This step is needed because ‘systemd’ works differently than previous init systems. One needs to erase the following file:

/etc/hostname

Failing to delete this file will result in the provisioned hosts retaining the hostname of the golden image host.

Erasing machine-id

‘/etc/machine-id’ is an important file for ‘systemd’. Removing this file will not only cause a host to fail to boot, but will also cause systemd to behave erratically, yielding many errors and taking a very long time to reach even single-user mode or the rescue console. The code in this file can be used by various systems to uniquely identify a host on the network, therefore steps must be taken to make sure a new code is generated for provisioned hosts.

The ‘systemd’ package provides the ‘systemd-machine-id-setup’ tool that can be used by provisioning systems to generate a new ‘machine-id’ file. However, not all image-based provisioning systems provide a means to have such a tool run during the provisioning process. One way to clear out the ‘machine-id’ code from the golden image host is to empty out the ‘/etc/machine-id’ file with the following command:

> /etc/machine-id

Running the above command will cause ‘systemd’ to create a temporary machine-id code every time the provisioned host boots up. It may be desirable to retain the machine-id generated during the first boot-up of a provisioned host over the lifetime of that host. For that purpose some sort of first-boot script needs to be written. The ‘systemd’ developers have already implemented such a script, but it is not yet included in RHEL7 or even Fedora 20.
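
A first-boot script of this kind might look roughly like the sketch below. It rests on an assumption worth stating loudly: that the temporary ID is mounted over ‘/etc/machine-id’ from a RAM file system, so the script copies the ID aside, unmounts, and writes the copy into the real file. The function name is hypothetical, and how the script gets run on first boot is left open.

```shell
#!/bin/bash
# persist_machine_id [FILE]: make a transient machine ID permanent.
# Assumes the transient ID is mounted over FILE (default
# /etc/machine-id). Does nothing when no such mount exists.
persist_machine_id() {
    local target="${1:-/etc/machine-id}"
    local copy

    # Field 5 of /proc/self/mountinfo is the mount point.
    if ! awk -v t="$target" '$5 == t { found = 1 } END { exit !found }' \
            /proc/self/mountinfo 2>/dev/null; then
        return 0    # nothing mounted over the file, nothing to persist
    fi

    copy=$(mktemp) || return 1
    cat "$target" > "$copy" || return 1   # copy the transient ID aside
    umount "$target" || return 1          # expose the empty file on disk
    cat "$copy" > "$target" || return 1   # write the ID into the real file
    rm -f "$copy"
}

# Usage, as root during the first boot of a provisioned host:
#   persist_machine_id
```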

In some cases the environment running the provisioned hosts may take care of this ‘machine-id’ issue for you. The ‘systemd-machine-id-setup’ tool, used by ‘systemd’ to generate the machine-id, supports using a UUID that can be passed into the virtual machine in KVM-based environments. I can confirm this indeed happens in ‘oVirt’, thereby creating the situation where each VM gets its own unique and persistent ID when the ‘/etc/machine-id’ file is empty. The same tool also supports container environments as detailed in this document.

Erasing SSH host keys

The SSH service is configured by default in such a way that when the SSH host keys, used to uniquely identify a host for the SSH client, are erased, a new set is generated during service start-up. Failing to erase the SSH host keys will result in all provisioned hosts using the same set of keys, which is harmful for SSH protocol security. To erase the SSH host keys one must simply erase all files matching the following pattern:

/etc/ssh/ssh_host_*
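
A minimal sketch of that deletion, wrapped in a function so it can be tried on a scratch directory first (the function name and the directory argument are my own assumptions):

```shell
#!/bin/bash
# erase_ssh_host_keys [DIR]: remove all host key files (private and
# .pub) from DIR, leaving sshd_config and everything else in place.
erase_ssh_host_keys() {
    local dir="${1:-/etc/ssh}"
    rm -f "$dir"/ssh_host_*
}

# Usage, as root on the golden image host:
#   erase_ssh_host_keys
```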

Erasing RHN system ID

The RHN system ID is used to uniquely identify a host in the RHN or when managing hosts with tools such as Spacewalk or Satellite. Failing to erase this system ID will cause all provisioned hosts to be recognized as the same host by the management system, rendering it useless. To erase the RHN system ID and cause it to be re-generated one must simply erase the following file:

/etc/sysconfig/rhn/systemid

Regenerating ‘initrd’ image

In RHEL7, ‘systemd’ is started from the ‘initrd’ image even before the root file system has been mounted. Therefore, many of the basic system configuration files such as ‘/etc/hostname’ are actually read from the ‘initrd’ image and not from the root file system. In order to properly clean up those files, a generic ‘initrd’ file needs to be generated with the following command:

dracut --no-hostonly --force

Note: The installation process sets up a host-specific ‘initrd’ file by default. This means that hosts that boot with a generic ‘initrd’ will work differently in subtle ways and may boot more slowly. In particular, removing the ‘/etc/hostname’ file from the ‘initrd’ image will cause systemd to initialize the hostname to ‘localhost’. That value will be retained as the hostname until NetworkManager starts up and sets the hostname using a reverse-DNS lookup (by default). This can be seen in the system logs, for example. This behavior might influence various services that start up before the network is set up. It is advisable to use some kind of first-boot script to create the host-specific configuration files and the host-specific ‘initrd’ image once the provisioned host has booted and detected its final configuration.
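
The ‘initrd’ part of such a first-boot script could be sketched as below. It simply mirrors the command above with ‘--hostonly’ in place of ‘--no-hostonly’ and uses a marker file so the rebuild happens only once. The marker path, the function name and the ‘DRACUT’ override variable are hypothetical, and the sketch does not cover creating the host-specific configuration files themselves.

```shell
#!/bin/bash
# rebuild_initrd_once [MARKER]: regenerate a host-specific initrd the
# first time it is called, then record that in MARKER.
# DRACUT (hypothetical) lets a different command be substituted.
rebuild_initrd_once() {
    local marker="${1:-/var/lib/initrd-rebuilt}"   # hypothetical marker path
    local dracut_cmd="${DRACUT:-dracut}"

    if [ -e "$marker" ]; then
        return 0    # already rebuilt on an earlier boot
    fi
    "$dracut_cmd" --hostonly --force && touch "$marker"
}

# Usage, as root from a first-boot hook on the provisioned host:
#   rebuild_initrd_once
```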

Cleaning up log files

While not essential, it is better to have the provisioned hosts free of log messages created when the golden image host was installed. One needs to erase all the log files in ‘/var/log’ and in other log directories (if any).

Once the files are erased, some processes may continue writing to the orphaned inodes of those files. The files will, however, be removed once those processes have been shut down along with the system. When a provisioned host boots up, a new, empty set of files will be created for it.

The above is needed only if your system is running ‘rsyslog’ (which is set up by default on RHEL7), an equivalent service, or another service that writes its logs directly to ‘/var/log’. Unless the ‘/var/log/journal’ directory is manually created, the ‘journald’ service saves its journal file to ‘/run/log/journal’, which is erased automatically on boot.
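
One possible way to erase the log files is sketched here. It removes regular files only and keeps the directory tree, on the assumption that services expect their log directories to exist; the function name and the directory argument are my own.

```shell
#!/bin/bash
# erase_logs [DIR]: delete every regular file under DIR (default
# /var/log) while leaving the directories themselves in place.
erase_logs() {
    local dir="${1:-/var/log}"
    find "$dir" -type f -delete
}

# Usage, as root on the golden image host:
#   erase_logs
```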

Triggering manual first-boot setup (anti-step)

I believe this step should not be performed in practice, since it creates a situation in which a sysadmin must manually configure various details for each and every provisioned host (in my view, this defeats the purpose of having images in the first place). However, since I’ve seen this step mentioned in various guides about creating system images, I mention it here, if only to discourage others from performing it. This step essentially consists of creating the ‘/.unconfigured’ file or running the ‘sys-unconfig’ script. As stated, this causes the provisioned host to interactively ask for various configuration details, such as the keyboard layout, the root password and the timezone, when it boots.

Automating the process

Here is a GitHub repository where you can find a small script I wrote to perform all the steps mentioned above. This script is written in such a way that, when run without parameters, it only displays what is about to be done, without actually committing any changes. To make the script actually clean up the host-specific data, one needs to run it with the ‘-f’ command-line argument.

I simply copy this script to ‘/usr/local/sbin’ on the golden host. That way I can run it on that host when it is ready to be turned into an image. I can also run it on a provisioned host when I decide that it has reached a point where an image should be made from it.
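
To illustrate the dry-run-by-default behavior, here is a minimal sketch of the pattern. It is not the script from the repository: the function name and the optional ROOT prefix are hypothetical, and it covers only the three single-file deletions described in this post.

```shell
#!/bin/bash
# seal_image [-f] [ROOT]: print the removals that would be performed;
# actually perform them only when -f is given. ROOT is a hypothetical
# prefix that allows a trial run against a scratch tree.
seal_image() {
    local force=no
    if [ "${1:-}" = "-f" ]; then
        force=yes
        shift
    fi
    local root="${1:-}"
    local f
    for f in \
        "$root/etc/udev/rules.d/70-persistent-net.rules" \
        "$root/etc/hostname" \
        "$root/etc/sysconfig/rhn/systemid"
    do
        echo "rm -f $f"
        if [ "$force" = yes ]; then
            rm -f "$f"
        fi
    done
}

# Usage:
#   seal_image        # show what would be removed
#   seal_image -f     # actually remove the files
```

Making the destructive mode opt-in means that running the script by accident on a production host changes nothing.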

15 thoughts on “Creating RHEL7 Deployment Templates”

  1. I see a few problems:

    1 > /etc/machine-id

    You need to run the command without the “1”, otherwise it will just fail and you will not get an empty machine-id file.

    touch /.unconfigured

    This is something I specifically said NOT to do. It makes the machine boot into a graphical configuration console instead of booting normally, and this is probably what is preventing the other files from being created (it is waiting for your answers to create them).

    The SSH configuration files will not be created unless the SSHD service is started which may or may not happen depending on your configuration.

    Note that systemd can boot the host properly without most files (For example, if your host’s IP is backwards-resolvable in DNS, you don’t need an /etc/hostname file), so the files will not necessarily be created.

    • I copied the 1 in ‘1>/etc/machine-id’ into the reply by mistake. I actually ran only: > /etc/machine-id

      Okay I removed /.unconfigured.
      but now when I run:
      > /etc/machine-id
      I got:
      -bash /etc/machine-id: Read-only file system

      Thanks again.

      • Unless your ‘/’ and ‘/etc’ are on different file systems (unlikely) it seems impossible that you could remove ‘/.unconfigured’ and yet get that error message. Are you experiencing some kind of storage problems that have caused your OS to switch the ‘/’ file system into read-only mode?

        BTW, if you just need a simple minimal-install image of RHEL, you can download a pre-created image from redhat.com. If your virtualization environment does not enable you to use cloud-init, you can reset the image’s root password using this procedure: https://access.redhat.com/discussions/664843.
        Note that you will probably want to disable the cloud-init service in that image as well if you’re not using it.

  2. No file system problem.
    It’s because systemd mounts /etc/machine-id from a different filesystem and it stays in this state.
    I read in the man page:
    “This tool will execute no operation if /etc/machine-id doesn’t
    contain any valid machine ID, isn’t mounted as an independent
    temporary file system, or /etc is read-only. If those conditions are
    met, it will then write current machine ID to disk and unmount the
    transient /etc/machine-id file in a race-free manner to ensure that
    this file is always valid for other processes.”

    I cannot use cloud-init.

    I did not manage to make it work – help welcome.

    • Ah, I’m sorry, I wasn’t paying attention. You are right, when you boot a machine with an empty /etc/machine-id, systemd generates a new machine-id and mounts it from a read-only ram file-system.
      THIS IS THE DESIRED OUTCOME. It means your image is fine.
      But note what I wrote in the post after the command. Depending on your virtualization environment, the “generated” machine ID might be permanent and unique to the VM (which is what you want) or change on every boot, in which case you might want to write a script that will make it permanent by copying the file aside, unmounting the ram filesystem and placing the copied file back. That script should run during the first time the VM boots.

      • If I write such a script, will the server show a network configuration menu and prompt to change the root password after booting for the first time?

  3. Having the server display a setup menu kinda beats the point of having a template in the first place IMO. The procedure described here is meant to allow you to create templates that you can clone in large quantities without human intervention. But you do need an environment with properly configured DNS and DHCP servers to support that.
    If you do want the menu, creating ‘/.unconfigured’ will cause the VM to display it. The script to set the machine ID has nothing to do with this.

    • I think you may want to have the NIC names be persistent in the VMs you provision from the template, so just deleting the persistent name file before creating the template may be preferable to linking it to ‘/dev/null’.
