Managing a large set of systems has always been a hassle. Although tools like Ansible and Puppet simplify the process significantly, it can still be time-consuming when managing tens of systems.
Appliances designed to run specific software often start as a simple Debian installation. The initial installation and configuration take a few hours, and updates usually need to be performed manually. While this small investment of time is perfectly fine for a handful of systems, growing to tens of machines requires automating installations and enabling automatic (security) updates. However, the moment you want—or need—to upgrade the Debian installation to a new major version, it can become a true nightmare.
Luckily, solutions have existed for years. Instead of starting with a standard Linux installation and configuring it afterward, you can create an image with a preconfigured installation. This reduces installation time to a few minutes, and updates become nothing more than uploading a new image. Instead of repeating every installation, configuration, or update step, managing tens of systems is reduced to managing a single image.
This approach has been the standard for Android for years. SteamOS, Flatcar, and Tesla cars are all based on this concept. And who doesn’t remember the era of Linux live CDs like KNOPPIX?
To support robust updates, a system utilizes two separate partitions: one for the running system image and the other for the new update. Updates can be downloaded and installed while the system is still running the old image, reducing downtime to a single reboot.
Interested in how easy it would be to set up a system like this, I created a simple template for building a modern, immutable Linux distribution.
Read-Only Operating System (ROOS)
ROOS is a template using mkosi with additional scripts to create a feature-complete, immutable Linux image.
Unlike some other desktop-focused immutable Linux distributions, this template is meant for server appliances; therefore, the /etc directory is read-only by default.
Out of the box, the generated image provides a Linux system that can be accessed using SSH and nothing more.
Additional functionality can be added by enabling profiles during build time.
systemd and mkosi
Mkosi is a tool for generating customized disk images. It wraps the most commonly used package managers and is developed as part of the systemd project. Since systemd was introduced around 15 years ago, it has become the init system for most Linux distributions.
While systemd has been criticized for being very opinionated and deviating from commonly used Unix startup scripts, it provides significant functionality. It has standardized behavior across distributions and made creating immutable systems much easier. For example, in recent years, systemd added:
- systemd-boot: An EFI bootloader with boot assessment and counting.
- systemd-repart: A tool to repartition the disk.
- systemd-sysupdate: A tool for handling atomic updates.
These are available on most recent Linux distributions. ROOS therefore relies heavily on functionality native to systemd to reduce dependencies on other packages. For more information, read the blog of Lennart Poettering, the creator of systemd and mkosi.
Base Image
Mkosi uses systemd-repart for creating disk images.
The default configuration for ROOS is provided in mkosi.conf.d/roos-base/mkosi.repart.
It creates a disk image with four partitions.
The two most important are the ESP partition for the EFI bootloader and the root partition.
The other two are used by dm-verity for integrity and validation of the root partition.
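The idea behind dm-verity can be illustrated with a small hash tree: every data block is hashed, and the hashes are combined until a single root hash remains. The following is only a simplified sketch of that principle. The real dm-verity on-disk format (salt, tree layout, superblock) is different, and the block size and function name are assumptions made for this example.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size for this sketch


def merkle_root(data: bytes, block_size: int = BLOCK_SIZE) -> str:
    """Return a hex root hash over fixed-size blocks of `data`."""
    # Leaf level: one SHA-256 digest per data block (at least one block).
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)] or [b""]
    level = [hashlib.sha256(block).digest() for block in blocks]
    # Combine digests pairwise until a single root remains.
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last digest on odd levels
        level = [
            hashlib.sha256(level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()


image = bytes(3 * BLOCK_SIZE)        # stand-in for a root file system image
tampered = image[:-1] + b"\x01"      # same image with one byte changed
# Any changed byte results in a different root hash.
print(merkle_root(image) != merkle_root(tampered))
```

Because only the root hash has to be trusted, a single value is enough to validate the whole read-only partition.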
The root partition is provided as a read-only image using the EROFS file system. This is a more modern format than SquashFS and is quite popular in Android and containerized environments. By using zstd compression, the images remain small; however, to keep room for additional applications, the default size of the root partition is set to 4 GB.
The additional partitions required for the A/B setup and the /var partition are created on boot by the systemd-repart service.
The configuration is included in mkosi.images/roos-base/mkosi.extra/usr/lib/repart.d.
As an additional security measure, the /var partition is encrypted, and the encryption key is stored in the TPM.
While the TPM is far from perfect at protecting against attackers with physical access, it raises the bar significantly.
The keys can also be bound to specific PCRs to ensure the system is booted into ROOS and into a specific state.
This can detect tampering with the BIOS; however, since there is no standard provision for a recovery key (as BitLocker has), the key is currently bound only to the Secure Boot state to prevent data loss.
The A/B partition setup is enabled through a profile, so the image can be tested in a VM without it and without requiring large disks.
A Note on Partition Mounting and Machine IDs
Mounting partitions depends completely on the UAPI Group Discoverable Partitions Specification.
Partitions are mounted based on their UUIDs.
According to the specification, the UUID of the /var partition is a hash derived from the machine-id.
Normally, the machine-id is generated on first boot and stored in /etc.
Since /etc is immutable on ROOS, this standard approach doesn’t work.
Therefore, the systemd.machine_id=firmware kernel option is used to ensure the machine-id is derived from the firmware.
Additionally, the systemd-repart service is altered in the initrd (mkosi.images/initrd) to ensure this machine ID is used as the seed for creating the /var partition.
If no machine ID can be found, the system falls back to a random value.
This results in a failed boot: a random ID is generated when setting up /var, and a new, mismatched ID is generated when trying to mount it.
This issue is easily replicated in QEMU, as emulated machines often lack a firmware UUID by default.
To fix this, the --uuid option must be set (avoid using all zeros, as that is interpreted as unset).
Also, be aware that some cheap NUC-like machines do not have unique machine IDs, as I experienced during development.
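To illustrate why a changing machine ID breaks the boot, the sketch below derives a /var partition UUID the way I understand the Discoverable Partitions Specification: an HMAC-SHA256 of the /var partition type UUID, keyed by the machine ID. The truncation to 128 bits and the version-4 stamping are assumptions about the details, and the machine IDs are made up.

```python
import hashlib
import hmac
import uuid

# Partition type UUID for /var from the Discoverable Partitions Specification.
VAR_TYPE_UUID = uuid.UUID("4d21b016-b534-45c2-a9fb-5c16e091fd2d")


def var_partition_uuid(machine_id: str) -> uuid.UUID:
    """Derive the expected /var partition UUID for a 32-hex-digit machine ID."""
    key = bytes.fromhex(machine_id)
    digest = hmac.new(key, VAR_TYPE_UUID.bytes, hashlib.sha256).digest()
    # Assumption: the first 128 bits are kept and stamped as a version-4 UUID.
    return uuid.UUID(bytes=digest[:16], version=4)


# Made-up machine IDs: one used when /var was created, one seen at mount time.
created_with = var_partition_uuid("0123456789abcdef0123456789abcdef")
mounted_with = var_partition_uuid("fedcba9876543210fedcba9876543210")
# Different machine IDs give different UUIDs, so the partition is not found.
print(created_with, mounted_with, created_with == mounted_with)
```

With a stable firmware-derived machine ID, both calls use the same key and the lookup succeeds.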
The /var partition uses the copy-on-write file system BTRFS.
This file system supports reflinks (cheap copy-on-write file copies) and subvolumes.
While not directly used by ROOS basics, this is relevant for running containers or creating snapshots of the system configuration.
BTRFS is also advised for building via mkosi, as mkosi can use reflinks and subvolume snapshots to speed up the build process.
Unified Kernel Images (UKIs)
With EFI and the systemd EFI stub, it is possible to create EFI executables that pack the initrd, kernel, microcode, and kernel command line into a single file. These Unified Kernel Images (UKIs) are automatically detected by the systemd-boot bootloader, providing a simple method for updates.
When a new version of the root partition image is provided, a new UKI is installed.
The kernel command line inside the UKI contains a reference to the hash of the root partition via the roothash= parameter.
Based on this hash, the system can automatically detect the corresponding root partition.
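The link between the hash and the partitions comes from the Discoverable Partitions Specification: the root partition UUID is the first 128 bits of the root hash, and the verity partition UUID is the last 128 bits. A minimal sketch of that mapping follows; the function name and the example hash are made up for illustration.

```python
import uuid


def verity_partition_uuids(roothash: str) -> tuple[uuid.UUID, uuid.UUID]:
    """Split a hex root hash into (root partition UUID, verity partition UUID)."""
    raw = bytes.fromhex(roothash)
    if len(raw) < 32:
        raise ValueError("expected a root hash of at least 256 bits")
    # First 128 bits identify the data partition, last 128 bits the hash partition.
    return uuid.UUID(bytes=raw[:16]), uuid.UUID(bytes=raw[-16:])


# Made-up value standing in for the roothash= kernel parameter.
example = "00112233445566778899aabbccddeeff" + "ffeeddccbbaa99887766554433221100"
root_uuid, verity_uuid = verity_partition_uuids(example)
print(root_uuid)
print(verity_uuid)
```

This is why installing a new UKI is enough to switch root partitions: the command line inside it already names the partition pair it belongs to.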
Systemd-boot also supports automatic boot assessment and uses the UKI filename to count the number of boot attempts. After booting, systemd marks a UKI as “blessed” if it boots successfully. If booting fails, it will automatically revert to the old version after a few attempts.
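The counting scheme can be sketched as a rename of the UKI file. As I understand the boot assessment documentation, a counter of the form +LEFT or +LEFT-DONE sits just before the file suffix, is updated on every boot attempt, and is dropped once the boot is marked as good. The file names and function names below are hypothetical.

```python
import re

# stem, "+" tries left, optional "-" tries done, then the file suffix
COUNTER = re.compile(
    r"^(?P<stem>.+)\+(?P<left>\d+)(?:-(?P<done>\d+))?(?P<suffix>\.[^.+]+)$"
)


def record_attempt(filename: str) -> str:
    """Return the file name after one more boot attempt has been counted."""
    m = COUNTER.match(filename)
    if m is None:
        return filename  # no counter: entry is not subject to boot counting
    left, done = int(m["left"]), int(m["done"] or 0)
    if left == 0:
        return filename  # no tries left: entry is considered bad
    return f"{m['stem']}+{left - 1}-{done + 1}{m['suffix']}"


def bless(filename: str) -> str:
    """Return the file name after a successful boot (counter removed)."""
    m = COUNTER.match(filename)
    return filename if m is None else f"{m['stem']}{m['suffix']}"


name = "roos_1.0.0+3.efi"  # hypothetical UKI with three tries
for _ in range(3):
    name = record_attempt(name)
    print(name)
print(bless("roos_1.0.0+2-1.efi"))
```

An entry whose counter reaches zero tries left is sorted behind the working entries, which gives the automatic fallback to the previous version.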
To support different kernel command lines, a UKI can provide multiple profiles. ROOS includes two additional profiles shown in the boot menu:
- Factory Reset: Cleans the /var partition and creates a new empty one.
- Storage Target Mode: Makes the machine’s disks available over NVMe-over-TCP. Note that there is no authentication for this mode, but since the /var partition is encrypted, it does not automatically grant access to data. This mode is mainly provided to facilitate image replacement if an update breaks the system.
systemd-sysupdate
Sysupdate is a component of systemd that downloads and installs disk images and files, such as the root partition and UKIs.
Configuration is stored in /usr/lib/sysupdate.d, and it relies on versions encoded in filenames and partition labels to detect installed versions.
If sysupdate detects that a UKI and a corresponding root partition are present, it marks that version as installed.
Based on the currently running version, the active partition and UKI are protected from deletion.
It will also remove old UKI versions if the corresponding partition needs to be overwritten.
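The version detection described above can be sketched as pattern matching. Sysupdate patterns use @v as the version placeholder; the concrete patterns, file names, and partition labels below are assumptions for illustration, and the real matching supports more placeholders than this sketch.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn a pattern with one `@v` version placeholder into a regex."""
    before, _, after = pattern.partition("@v")
    return re.compile(re.escape(before) + r"(?P<version>.+)" + re.escape(after) + r"$")


def versions(names, pattern):
    """Collect the versions of all names that match the pattern."""
    rx = pattern_to_regex(pattern)
    return {m["version"] for name in names if (m := rx.match(name))}


def installed_versions(uki_files, partition_labels):
    """A version counts as installed only if both UKI and root partition exist."""
    ukis = versions(uki_files, "roos_@v.efi")       # hypothetical UKI pattern
    roots = versions(partition_labels, "roos_@v")   # hypothetical label pattern
    return ukis & roots


print(installed_versions(
    ["roos_1.0.0.efi", "roos_1.1.0.efi"],  # UKIs found on the ESP
    ["roos_1.0.0", "_empty"],              # labels of the A/B root partitions
))
```

In this example, version 1.1.0 has a UKI but no matching root partition, so it is not reported as installed.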
The systemd-sysupdate.timer automatically checks for updates every 8 hours.
Setting up the Update Server
To provide updates, you only need a simple HTTP server.
Sysupdate downloads the SHA256SUMS file and determines the available versions based on the filenames listed in it.
Mkosi creates the necessary SHA256SUMS file by default, but prefixes it with the image name and ID.
Mkosi can also start an HTTP server to serve the mkosi.output directory using mkosi serve, but you must rename or symlink the checksum file before sysupdate can use it.
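The symlink step could be automated with a small helper such as the one below. This is a sketch that assumes the output directory contains exactly one prefixed manifest whose name ends in SHA256SUMS; the function names and the demo file names are made up.

```python
import tempfile
from pathlib import Path


def link_checksum_manifest(output_dir: Path) -> Path:
    """Expose a prefixed checksum manifest under the plain name SHA256SUMS."""
    link = output_dir / "SHA256SUMS"
    # Assumption: exactly one prefixed manifest exists, ending in "SHA256SUMS".
    candidates = [p for p in output_dir.glob("*SHA256SUMS") if p.name != link.name]
    if len(candidates) != 1:
        raise FileNotFoundError(f"expected one prefixed manifest, found {len(candidates)}")
    if link.is_symlink() or link.exists():
        link.unlink()  # replace a stale link from a previous build
    link.symlink_to(candidates[0].name)  # relative link inside the same directory
    return link


def listed_files(manifest: Path) -> list[str]:
    """Return the filenames listed in a sha256sum-style manifest."""
    names = []
    for line in manifest.read_text().splitlines():
        if line.strip():
            _digest, name = line.split(maxsplit=1)
            names.append(name.lstrip("*"))
    return names


# Demo with a made-up manifest in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "roos_1.0.0.SHA256SUMS").write_text(
        "0" * 64 + "  roos_1.0.0.efi\n" + "1" * 64 + "  roos_1.0.0.root.raw\n"
    )
    print(listed_files(link_checksum_manifest(out)))
```

The filenames printed here are what sysupdate would match against its patterns to find the available versions.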
While this file can be signed using GPG, this is disabled for now to simplify the setup, assuming updates are provided over HTTPS.
ROOS allows you to configure the update server as UPDATE_URL in the mkosi.env file.
This file stores simple configuration options, such as the URL and default username.
An example is provided as mkosi.env.example.
When UPDATE_URL is set, the build process creates the necessary update files for the UKI and root partition using templates from roos.resources/sysupdate.
Users and SSH
The default user is configured using the DEFAULT_USER key.
Because only the /var partition is writable, the home directory is stored in /var/home rather than on the root partition.
Since /etc is immutable, the passwd and shadow files are read-only, meaning passwords cannot be changed.
While systemd provides systemd-homed and userdb for dynamic users, a single predefined user is sufficient for ROOS.
As most appliances are managed over the network, no passwords are set; logins depend completely on public keys.
Since the home directory is empty and created only on the first boot, SSH is configured to read authorized keys from /etc/authorized_keys.
You can provide an authorized keys file to ROOS by creating a config/authorized_keys file in your build tree.
Host keys are stored in /var/lib/ssh/etc/ssh/ and are generated on boot as normal.
Sudo is used for privilege escalation and is configured to allow running commands as root without a password.
The DEFAULT_USER is added to the wheel group for this purpose.
Systemd also provides its own replacement called run0, and Ubuntu has adopted the Rust-based sudo-rs, so this may change in the future.
Network Configuration
Most systems simply depend on DHCP for network configuration. To support this basic setup, a systemd-networkd configuration file is provided to automatically enable DHCP for all wired connections. Additional or advanced network configuration can be achieved using configuration extensions and credentials (covered in the sections below).
To ensure all machines running the same image have unique hostnames, the hostname is derived from the machine ID by default.
This is similar to embedded systems that contain random-looking characters in their hostnames.
The default hostname is set to roos-???????, where the question marks are replaced with a hashed value of the machine ID.
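The substitution can be pictured as follows. This sketch only shows the idea of filling the question marks with characters from a hash of the machine ID; the hash used here (plain SHA-256) is an assumption and will not reproduce the hostnames that systemd actually generates.

```python
import hashlib


def fill_hostname_pattern(pattern: str, machine_id: str) -> str:
    """Replace each '?' in the pattern with a hex digit derived from the machine ID."""
    # Assumption: plain SHA-256 over the machine ID; systemd uses its own derivation.
    digest = hashlib.sha256(bytes.fromhex(machine_id)).hexdigest()
    chars = iter(digest)
    return "".join(next(chars) if c == "?" else c for c in pattern)


# Made-up machine ID; the same ID always yields the same hostname.
print(fill_hostname_pattern("roos-???????", "0123456789abcdef0123456789abcdef"))
```

Because the value is derived rather than random, a machine keeps its hostname across reboots and image updates.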
Terminal
Without passwords, logging in from a physical terminal would be trivial and insecure. Therefore, the login terminal is disabled. Instead, fastfetch is used to display system information—such as the IP address and disk usage—when a monitor is connected.
This can be inconvenient during development and testing.
You can override this behavior by creating a file at /usr/lib/systemd/system-preset/70-enable-login.preset (in the mkosi.extra directory of the main image) with the following content:
disable fastfetch.service
enable getty@tty1.service
This undoes the default preset that disabled getty and enabled fastfetch.
Extensibility
A completely read-only system is often not very usable. One of the main barriers to adopting immutable images is the requirement to make configuration adjustments per system. ROOS addresses this through several methods.
Custom Images
The main image configuration can be edited by creating mkosi.conf in the root of the ROOS repository.
Here, you can add packages to the image and configure which profiles should be included.
Additional configuration files can be added to the mkosi.extra directory; the content of this directory is copied into the image after all packages are installed.
By default, all disk images include the ROOS configuration mkosi.conf/roos and the files from the mkosi.images/roos-base image.
If you wish to create multiple images at once from a single repository, you can create subdirectories in mkosi.images.
System and Configuration Extensions
Systemd provides methods to apply additional configuration on top of the immutable image: systemd-sysext and systemd-confext.
- sysext: Provides disk overlays with additional files for /usr and /opt.
- confext: Provides disk overlays for /etc.
These images can be created by systemd-repart and are mounted during boot using overlayfs.
Since they are mounted during boot, the files are not visible during the very early boot stages.
Services must be explicitly configured to run after systemd-sysext and systemd-confext to ensure they see the additional files.
Every extension contains an extension-release file that is checked to ensure the ID, IMAGE_ID, SYSEXT_LEVEL, or CONFEXT_LEVEL matches the system’s values.
This links extensions to specific versions of the main image, allowing them to be updated together.
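A simplified version of this compatibility check is sketched below. It compares only ID and either the extension level or VERSION_ID; the real check in systemd covers more fields (such as the architecture), and the release data shown is made up.

```python
def parse_release(text: str) -> dict[str, str]:
    """Parse os-release style KEY=value lines into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"')
    return fields


def extension_matches(host: dict, ext: dict, level_key: str = "SYSEXT_LEVEL") -> bool:
    """Simplified check whether an extension may be applied to this host."""
    if ext.get("ID") == "_any":
        return True  # extension declares itself independent of the host OS
    if ext.get("ID") != host.get("ID"):
        return False
    # Prefer the extension level when both sides declare one.
    if host.get(level_key) and ext.get(level_key):
        return host[level_key] == ext[level_key]
    return host.get("VERSION_ID") == ext.get("VERSION_ID")


host = parse_release('ID=roos\nVERSION_ID="1.0.0"\nSYSEXT_LEVEL=1\n')   # made-up host
good = parse_release("ID=roos\nSYSEXT_LEVEL=1\n")                       # made-up extension
stale = parse_release("ID=roos\nSYSEXT_LEVEL=0\n")
print(extension_matches(host, good), extension_matches(host, stale))
```

For a confext, the same function could be called with level_key set to CONFEXT_LEVEL.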
Examples of a sysext and a confext are provided in the extensions/ and confexts/ folders, including a build script to create signed extensions.
Extensions can be stored in /var/extensions and /var/confexts.
It is also possible to install them on the EFI boot partition or use them to extend the initrd of a UKI.
For more information, see Lennart Poettering’s blog post about extensions.
Credentials
Another configuration method is the use of credentials.
These are options, optionally encrypted, that are read by systemd services; encrypted credentials are decrypted with a key stored on the /var partition or in the TPM.
They can be stored on the /var partition or the EFI boot partition.
Systemd defines a set of well-known credential names that its services read. For example, you can add systemd-networkd configuration files, VPN configurations, or SSH keys. Note that for SSH, the configuration is already overridden by ROOS and may conflict with the options provided by systemd.
For more information, see the CREDENTIALS page of systemd.
Security
Without going into excessive detail, it is important to note that by default, anyone can create and install extensions and credentials. While credentials can be encrypted, non-encrypted credentials are also accepted.
Both extensions and credentials depend on Secure Boot to fully lock down the system. While UKIs and images are signed by a user-configured key, you must manually add this key and enable Secure Boot. Currently, auto-configuring Secure Boot using custom keys is not possible due to the risk of soft-bricking devices.
Furthermore, on many systems, you must include a Microsoft signing key (and a vendor signing key) to load EFI drivers from the BIOS. Consequently, you end up with a system that trusts code signed by both you and Microsoft, unless you add the hashes of the EFI drivers manually instead of trusting the generic keys. However, when Secure Boot is correctly configured, extensions and credentials will only be loaded if they are signed by a trusted key.
Usage
To create your own image, follow the README.md provided in the ROOS repository.
It contains all the steps required to set up and start building your own custom appliances.
