The art of writing Ansible roles

Introduction

Tools such as CFEngine, Puppet, Chef, and Ansible have radically changed infrastructure automation by providing a structured framework for organizing and sharing configuration management code, with Ansible being one of the newer tools. Ansible is widely used to automate the configuration of IT environments and is highly flexible. Although its creators position Ansible as a tool that "provides open-source automation that is simple, flexible, and powerful," it is not always clear what exactly makes Ansible code robust. And as simple as Ansible sounds, it is not always simple in practice, especially in more complex environments and setups.
Within Ansible, roles are a concept that allows you to encapsulate code. Roles contain tasks, variables, handlers, and files, brought together in a modular, reusable unit that, if written correctly, is easy to share between projects and teams. Whether you manage a handful of servers or orchestrate complex (multi-)cloud environments, roles help you write cleaner, more maintainable code. At the same time, there are many challenges, and it is difficult to design a role that you would label as robust and "complete." In this blog, we therefore introduce a number of rules of thumb for what characterizes a robust Ansible role, so that you can write better Ansible roles.

A few rules of thumb

The rules of thumb in this blog are:

  1. The main.yml of a role must only be used to include or import tasks.
  2. All tasks must be optimized for efficiency.
  3. Data and code must be completely separate.
  4. Tasks must run on an include basis.
  5. Roles must handle check mode correctly.
  6. All tasks must be idempotent.
  7. Variables must have the correct namespace and be hierarchical.
  8. Tags must be implemented and documented.
  9. Roles must be validated.
  10. Roles must be documented.

Please note: these are rules of thumb. Depending on the use and requirements of a role, some rules may be less relevant or additional rules may be necessary. In general, they are applicable in many situations. Below, we will elaborate on each one, with context and examples.

Illustrations with a role: linux_user

To illustrate the rules of thumb, we will create a role that can create Linux users. We do this as follows:

# Creation of a basic role structure using ansible-galaxy
ansible-galaxy role init linux_user

# Navigate into the new role
cd linux_user/

# Create a supplementary tasks file
touch tasks/linux_user.yml
                                                        


Our directory structure now looks like this:

# ls ./*
./README.md

./defaults:
main.yml

./files:

./handlers:
main.yml

./meta:
main.yml

./tasks:
linux_user.yml  main.yml

./templates:

./tests:
inventory  test.yml

./vars:
main.yml
                                                        


This provides us with a basic structure to work with. In tasks/main.yml, we include the task file that contains all user-related tasks:

# cat tasks/main.yml
---
- name: Include Linux user tasks
  loop: "{{ linux_users }}"
  loop_control:
    loop_var: user
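    # Show only the user name, so that passwords never end up in the log output
    label: "{{ user.name }}"
  tags:
    - linux_user
    - linux_user_authorized_keys
    - linux_user_create
  when:
    - ansible_facts["os_family"] in linux_user_allowed_os_families
    # Compare the length so that the condition yields an explicit boolean
    - linux_user_allowed_target_hostgroups | intersect(group_names) | length > 0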
  ansible.builtin.include_tasks: linux_user.yml
                                                        


The linux_user.yml file included above contains the actual tasks for creating users:

# cat tasks/linux_user.yml
---
- name: Create user "{{ user.name }}"
  become: true
  notify:
    - "Send mail notification about user creation"
  tags:
    - linux_user
    - linux_user_create
  ansible.builtin.user:
    # Required
    name: "{{ user.name }}"

    # Optional
    append: "{{ user.append | default(omit) }}"
    comment: "{{ user.comment | default(omit) }}"
    create_home: "{{ user.create_home | default(omit) }}"
    force: "{{ user.force | default(omit) }}"
    group: "{{ user.group | default(omit) }}"
    groups: "{{ user.groups | default(omit) }}"
    hidden: "{{ user.hidden | default(omit) }}"
    home: "{{ user.home | default(omit) }}"
    password: "{{ user.password | default(omit) }}"
    password_expire_account_disable: "{{ user.password_expire_account_disable | default(omit) }}"
    password_expire_max: "{{ user.password_expire_max | default(omit) }}"
    password_expire_min: "{{ user.password_expire_min | default(omit) }}"
    password_expire_warn: "{{ user.password_expire_warn | default(omit) }}"
    password_lock: "{{ user.password_lock | default(omit) }}"
    remove: "{{ user.remove | default(omit) }}"
    shell: "{{ user.shell | default(omit) }}"
    state: "{{ user.state | default(omit) }}"
    system: "{{ user.system | default(omit) }}"
    uid: "{{ user.uid | default(omit) }}"
    umask: "{{ user.umask | default(omit) }}"
    update_password: "{{ user.update_password | default(omit) }}"

- name: Set authorized keys for user "{{ user.name }}"
  become: true
  loop: "{{ user.authorized_keys | default(linux_user_default_authorized_keys) }}"
  loop_control:
    loop_var: key
    label: "{{ key.key }}"
  tags:
    - linux_user
    - linux_user_authorized_keys
  ansible.posix.authorized_key:
    # Required
    key: "{{ key.key }}"
    user: "{{ user.name }}"

    # Optional
    comment: "{{ key.comment | default(omit) }}"
    exclusive: "{{ key.exclusive | default(omit) }}"
    key_options: "{{ key.key_options | default(omit) }}"
    path: "{{ key.path | default(omit) }}"
    state: "{{ key.state | default(omit) }}"
                                                        


The task file above notifies a handler named "Send mail notification about user creation", which looks like this:

# cat handlers/main.yml
---
- name: Send mail notification about user creation
  delegate_to: localhost
  when: linux_user_smtp_enabled
  community.general.mail:
    body: "{{ linux_user_smtp_body }}"
    from: "{{ linux_user_smtp_mail_from }}"
    host: "{{ linux_user_smtp_host }}"
    port: "{{ linux_user_smtp_port }}"
    subject: "{{ linux_user_smtp_subject }}"
    to: "{{ linux_user_smtp_mail_to }}"
                                                        


As you can see, many variables are used. We define them in defaults and vars:

# cat defaults/main.yml
---
# Users
linux_user_default_authorized_keys: []
linux_users: []

# SMTP
linux_user_smtp_enabled: false
linux_user_smtp_body: "User mutations executed on {{ ansible_facts.hostname }}"
linux_user_smtp_host: ""
linux_user_smtp_mail_from: ""
linux_user_smtp_mail_to: ""
linux_user_smtp_port: ""
linux_user_smtp_subject: "User mutations"

# cat vars/main.yml
---
# Control
linux_user_allowed_os_families:
  - "RedHat"
linux_user_allowed_target_hostgroups:
  - "dev_servers"
                                                        


Rules of thumb in practice

1. Only use main.yml to include/import tasks.

If you use main.yml directly to define many tasks, a role quickly becomes chaotic and difficult to read. A best practice is therefore to use main.yml only for including or importing other tasks, as in the example.
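
As a minimal sketch of how this scales, a main.yml that dispatches to more than one task file could look like the following. The second task file, linux_user_sudo.yml, and the linux_user_sudo tag are hypothetical and not part of the example role; they only show that main.yml stays a short table of contents while the real work lives elsewhere.

```yaml
# tasks/main.yml (sketch; linux_user_sudo.yml is a hypothetical extra task file)
---
- name: Include Linux user tasks
  loop: "{{ linux_users }}"
  loop_control:
    loop_var: user
    label: "{{ user.name }}"
  tags:
    - linux_user
  ansible.builtin.include_tasks: linux_user.yml

- name: Include sudo configuration tasks
  tags:
    - linux_user
    - linux_user_sudo
  ansible.builtin.include_tasks: linux_user_sudo.yml
```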

2. Optimize tasks for efficiency

Tasks must be efficient. In our example, we use include_tasks instead of import_tasks. Importing is usually faster because Ansible preprocesses imported tasks when the playbook is parsed, while including happens at runtime.

So why use include_tasks? Because import_tasks does not support a loop on the import statement itself. If you work with import_tasks, you have to loop inside linux_user.yml. And because we also configure authorized keys per user, the second task then needs a nested loop over the users and their keys. Depending on the number of users and hosts in your run, this can actually be slower than a single runtime loop with include_tasks. In this scenario, include_tasks is therefore the more efficient choice. For comparison, the import_tasks variant would look like this:

# cat tasks/main.yml
---
- name: Import Linux user tasks
  tags:
    - linux_user
    - linux_user_authorized_keys
    - linux_user_create
  when:
    - ansible_facts["os_family"] in linux_user_allowed_os_families
    # Compare the length so that the condition yields an explicit boolean
    - linux_user_allowed_target_hostgroups | intersect(group_names) | length > 0
  ansible.builtin.import_tasks: linux_user.yml

# cat tasks/linux_user.yml
---
- name: Create user "{{ user.name }}"
  become: true
  # The loop moves into the task file, because import_tasks cannot loop
  loop: "{{ linux_users }}"
  loop_control:
    loop_var: user
    label: "{{ user.name }}"
  notify:
    - "Send mail notification about user creation"
  tags:
    - linux_user
    - linux_user_create
  ansible.builtin.user:
{...}

This preprocesses the user creation task and would indeed be faster if we only had that task. However, in our case, we also have a task that configures authorized SSH keys for every user. That task would have to iterate over the users again, because the iteration no longer happens on the import statement. Iterating over the users twice, even when preprocessed, can ultimately take more time than iterating once at runtime, depending on the number of users and hosts. In this situation, we therefore choose include_tasks to keep our tasks optimized for efficiency.
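
To make the second iteration concrete, here is a rough sketch of what the authorized keys task could look like in the import_tasks variant. It assumes the subelements filter from ansible.builtin is used to pair every user with each of its keys; this is one possible approach, not necessarily how the role would be written in practice.

```yaml
# tasks/linux_user.yml (import_tasks variant, sketch only)
- name: Set authorized keys for all users
  become: true
  # subelements pairs every user with each entry of its authorized_keys list;
  # skip_missing=True skips users that define no authorized_keys at all.
  loop: "{{ linux_users | subelements('authorized_keys', skip_missing=True) }}"
  loop_control:
    label: "{{ item.0.name }}"
  tags:
    - linux_user
    - linux_user_authorized_keys
  ansible.posix.authorized_key:
    user: "{{ item.0.name }}"   # item.0 is the user dictionary
    key: "{{ item.1.key }}"     # item.1 is one entry of authorized_keys
    state: "{{ item.1.state | default(omit) }}"
```

Note that this sketch does not fall back to linux_user_default_authorized_keys for users without their own keys, which is one more thing the include_tasks variant handles with less effort.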

3. Completely separate data and code

Keep user data outside the role. Variables must be provided externally (via playbooks, inventory, group_vars/host_vars, or other methods). For example:

---
- name: Run role
  hosts: localhost
  vars:
    linux_user_smtp_host: "smtp.sue.nl"
    linux_user_smtp_mail_from: "source@sue.nl"
    linux_user_smtp_mail_to: "recipient@sue.nl"
    linux_user_smtp_port: "25"
    linux_users:
      - name: user1
        state: absent
      - name: user2
  roles:
    - role: sue.generic.linux_user
                                                        

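Because the data now comes from outside the role, it can help to check it on entry. The sketch below assumes a meta/argument_specs.yml file is added to the role, which ansible-core uses to validate role arguments before the first task runs. The file is not part of the example role, and only two keys per user are listed to keep it short.

```yaml
# meta/argument_specs.yml (hypothetical addition to the linux_user role)
---
argument_specs:
  main:
    short_description: Create Linux users and their authorized keys
    options:
      linux_users:
        description: Users to manage, supplied from outside the role.
        type: list
        elements: dict
        default: []
        options:
          name:
            description: Login name of the user.
            type: str
            required: true
          state:
            description: Whether the user should exist.
            type: str
            choices:
              - present
              - absent
      linux_user_smtp_enabled:
        description: Send a mail notification after user changes.
        type: bool
        default: false
```

A real specification would list every key the role accepts per user. To our understanding, keys that are missing from a nested options block are rejected during validation, so an incomplete list would break otherwise valid input.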

4. Run tasks on an include basis

Automation is powerful, but it can also cause damage if you accidentally target the wrong machines. In the example, we only include user tasks if certain conditions are met:

  when:
    - ansible_facts["os_family"] in linux_user_allowed_os_families
    - linux_user_allowed_target_hostgroups | intersect(group_names) | length > 0
                                                        


These conditionals ensure that tasks are only performed on explicitly permitted OS families and explicitly permitted host groups. By conditionally including tasks where necessary, rather than running them on all machines and excluding certain machines, we prevent users from being accidentally created on systems to which they should not have access. This increases security.
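
To see how the host group condition plays out, consider the small inventory sketched below. The host names and the prod_servers group are made up for illustration. For dev01.example.com, group_names contains dev_servers, so the intersection with linux_user_allowed_target_hostgroups is not empty and the user tasks are included (provided the OS family matches as well). For prod01.example.com the intersection is empty and the tasks are skipped.

```yaml
# inventory.yml (hypothetical hosts and groups)
---
all:
  children:
    dev_servers:
      hosts:
        dev01.example.com:
    prod_servers:
      hosts:
        prod01.example.com:
```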

5. Roles must handle check mode correctly

Check mode handling must be implemented per task, and check mode must approximate the actual run as closely as possible: catching the same errors and producing the same output. To achieve this, you must consider the check mode behavior of each individual task. Sometimes this means, for example, that you must disable check_mode for a task.

To illustrate this: suppose we expect a local file to exist with our users' passwords and we want to read this file with the shell module. There are, of course, better ways to do this, and there are also better ways to store secrets, such as a dynamically rotating secret as described in our blog post "Creating Automatically Rotating Secrets Using Terraform." But by using the shell module here, we clearly show why check mode should always approximate the actual run.

There is one caveat to this scenario: the file containing the password does not actually exist. We could retrieve the password with something like:

- name: Retrieve user password
  changed_when: false
  check_mode: false
  failed_when: user_password_result.rc != 0
  register: user_password_result
  vars:
    password_file: "/tmp/password.txt"
  ansible.builtin.shell: |
    if [ -f "{{ password_file }}" ]; then
      cat "{{ password_file }}"
    else
      echo "Password file not found: {{ password_file }}"
      exit 1
    fi
                                                        


Note that check_mode is set to false here, which means that this task will always be executed. Why? Because the shell module does not run in check mode, and that can cause a difference between check mode and the actual run. If the password file does not exist and we are running in check mode, the task will not generate an error (because the script is not executed and therefore does not fail due to the file being missing). When running without check mode, the task will fail because the file does not exist. If we do not explicitly set check_mode to false in this task, check mode would not accurately reflect the actual run.
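
The same pattern applies to any read-only lookup. The following sketch is not part of the example role and assumes it runs inside linux_user.yml, where the loop variable user is available. It looks up the user with getent, which is safe to run in check mode, and then uses the ansible_check_mode magic variable to report what a real run would do. The variable name linux_user_lookup_result is our own choice.

```yaml
- name: Look up whether the user already exists
  # Read-only lookup, so it is safe to run for real even in check mode
  check_mode: false
  changed_when: false
  # getent exits with 2 when the key is not found; only other codes are errors
  failed_when: linux_user_lookup_result.rc not in [0, 2]
  register: linux_user_lookup_result
  ansible.builtin.command:
    argv:
      - getent
      - passwd
      - "{{ user.name }}"

- name: Report users that a real run would create
  when:
    - ansible_check_mode
    - linux_user_lookup_result.rc == 2
    - user.state | default('present') == 'present'
  ansible.builtin.debug:
    msg: "User {{ user.name }} does not exist yet and would be created"
```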

6. Make tasks idempotent

All tasks must be able to run more than once and have the same result if nothing changes. We call a task with this property idempotent. Usually, you don't have to worry about this, because most modules already handle idempotency. In edge cases—for example, when using the command, shell, or lineinfile modules, or when writing your own modules—it is important to explicitly include idempotency. In the case of our role, for example, we want to add an env var to the .bashrc file of the newly created users:

- name: Ensure custom environment variable is set in .bashrc for user "{{ user.name }}"
  become: true
  tags:
    - linux_user
    - linux_user_bashrc
  ansible.builtin.lineinfile:
    path: "{{ user.home | default('/home/' + user.name) }}/.bashrc"
    regexp: '^export MY_CUSTOM_VAR='
    line: 'export MY_CUSTOM_VAR="hello-world"'
    state: present
    create: true
    owner: "{{ user.name }}"
    group: "{{ user.group | default(user.name) }}"
    mode: '0644'
                                                        


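Note that the regexp is what keeps this task idempotent even when the value changes: lineinfile replaces the existing line that matches ^export MY_CUSTOM_VAR= instead of adding a new one, so the file always contains exactly one definition. Without regexp, lineinfile only checks whether the exact line is already present. Repeated runs with an unchanged value are then still safe, but as soon as the value changes, a task such as the one below appends a second export line instead of updating the first, and the file no longer converges to a single definition.

- name: Append line to .bashrc for user "{{ user.name }}" (duplicates the export when the value changes)
  become: true
  tags:
    - linux_user
    - linux_user_bashrc
  ansible.builtin.lineinfile:
    path: "{{ user.home | default('/home/' + user.name) }}/.bashrc"
    line: 'export MY_CUSTOM_VAR="hello-world"'
    insertafter: EOF
    state: present
    create: true
    owner: "{{ user.name }}"
    group: "{{ user.group | default(user.name) }}"
    mode: '0644'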
                                                        

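For the command and shell modules, idempotency has to be added by hand. One common way is the creates argument, which skips the command once a given file exists. The sketch below is a hypothetical extension of the role, not something it currently does: it enables systemd lingering for a user and relies on the marker file that systemd keeps per lingering user.

```yaml
- name: Enable lingering for user "{{ user.name }}"
  become: true
  tags:
    - linux_user
  ansible.builtin.command:
    argv:
      - loginctl
      - enable-linger
      - "{{ user.name }}"
    # systemd records lingering users as files in this directory, so the
    # command is skipped on later runs once the marker file exists.
    creates: "/var/lib/systemd/linger/{{ user.name }}"
```

Without creates, the task would report "changed" on every run, even though nothing on the system changes after the first one.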

7. Namespace variables and place them correctly in the hierarchy

All variables of a role must be correctly namespaced to prevent unexpected overrides and conflicting variables, especially in larger environments where many variables are present. This is usually done by prefixing each variable of a role with the name of the role, as we have done in our role (linux_user).

In addition to namespacing, variables must also be placed at the right level of the variable hierarchy. All variables, with the exception of linux_user_allowed_os_families and linux_user_allowed_target_hostgroups, are defined in the role's defaults. This is the lowest level in the variable hierarchy, making them easy to override. For variables within roles that are important to keep at a fixed value, such as our linux_user_allowed_os_families variable, vars should be used instead of defaults. This makes it more difficult to accidentally override these variables.

For a complete overview of variable precedence, please refer to the official Ansible documentation on variable precedence.
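
As a small illustration of the difference, consider the hypothetical inventory file below. The file name and the values are assumptions made for this sketch. Inventory variables rank above role defaults but below role vars, so the first assignment takes effect and the second one is silently ignored by the role.

```yaml
# group_vars/dev_servers.yml (hypothetical inventory file)
---
# Overrides the role default, because inventory variables rank above role defaults
linux_user_smtp_enabled: true

# Has no effect on the role: vars/main.yml ranks above inventory variables,
# so linux_user_allowed_os_families stays ["RedHat"]
linux_user_allowed_os_families:
  - "Debian"
```

Extra vars passed on the command line still rank above role vars, so a deliberate override remains possible.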

8. Implement and document tags

Tags enable targeted runs. In the example: linux_user_create for user creation only and linux_user_authorized_keys for SSH keys only. Always document tags so that their purpose is clear.
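
One detail is easy to miss with include_tasks: tags on the include statement select only the include itself, not the tasks inside the included file. Our role therefore tags every task in linux_user.yml individually. When a single tag should cover a whole file, the apply argument of include_tasks is an alternative, sketched here under the assumption that all tasks in linux_user.yml should carry the linux_user tag.

```yaml
- name: Include Linux user tasks
  # These tags select the include statement itself
  tags:
    - linux_user
  loop: "{{ linux_users }}"
  loop_control:
    loop_var: user
    label: "{{ user.name }}"
  ansible.builtin.include_tasks:
    file: linux_user.yml
    # apply passes the tags on to every task inside linux_user.yml
    apply:
      tags:
        - linux_user
```

A targeted run is then started by passing the tag to ansible-playbook with the --tags option.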

9. Validate roles

Always validate your role with linting and testing. ansible-lint is the standard tool for linting. There are several options for testing, with Molecule being a well-known choice. (A full explanation is beyond the scope of this blog.)
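
To give an impression of what such a test can check, here is a minimal sketch of a verification playbook. The file location follows a common Molecule scenario layout and is an assumption, as is the premise that the role was applied beforehand with user_1 present and user_2 absent, like in the example playbook of the README.

```yaml
# molecule/default/verify.yml (hypothetical scenario layout)
---
- name: Verify linux_user role
  hosts: all
  gather_facts: false
  tasks:
    - name: Read the passwd database
      ansible.builtin.getent:
        database: passwd

    - name: Assert that the expected users exist or are absent
      ansible.builtin.assert:
        that:
          - "'user_1' in ansible_facts.getent_passwd"
          - "'user_2' not in ansible_facts.getent_passwd"
        fail_msg: "linux_user did not converge to the expected set of users"
```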

10. Document roles

Documentation is often overlooked, but it is just as important as the role itself. A clear README with requirements, variables, sample playbook, and tags makes reuse much easier. Below is an example README for our role.

# cat README.md
# linux_user
Ansible role for creating Linux users

# Requirements
This role requires the following collections to be present:
- ansible.builtin
- ansible.posix
- community.general

# Role Variables
## Defaults
User related defaults:
`linux_user_default_authorized_keys`: []
`linux_users`: []

Each entry in `linux_users` accepts the following keys, whose values are passed to the [ansible user module](https://docs.ansible.com/ansible/latest/collections/ansible/builtin/user_module.html):
`linux_users.user.name` (required)
`linux_users.user.append`
`linux_users.user.comment`
`linux_users.user.create_home`
`linux_users.user.force`
`linux_users.user.group`
`linux_users.user.groups`
`linux_users.user.hidden`
`linux_users.user.home`
`linux_users.user.password`
`linux_users.user.password_expire_account_disable`
`linux_users.user.password_expire_max`
`linux_users.user.password_expire_min`
`linux_users.user.password_expire_warn`
`linux_users.user.password_lock`
`linux_users.user.remove`
`linux_users.user.shell`
`linux_users.user.state`
`linux_users.user.system`
`linux_users.user.uid`
`linux_users.user.umask`
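`linux_users.user.update_password`
`linux_users.user.authorized_keys` (list of keys; falls back to `linux_user_default_authorized_keys`)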

Optionally, SMTP notifications can be configured:
`linux_user_smtp_enabled`: false
`linux_user_smtp_body`: "User mutations executed on {{ ansible_facts.hostname }}"
`linux_user_smtp_host`: ""
`linux_user_smtp_mail_from`: ""
`linux_user_smtp_mail_to`: ""
`linux_user_smtp_port`: ""
`linux_user_smtp_subject`: "User mutations"

## Variables
`linux_user_allowed_os_families`: ["RedHat"]
`linux_user_allowed_target_hostgroups`: ["dev_servers"]

# Example Playbook
```yaml
---
- name: Run linux_user role
  hosts: localhost
  vars:
    linux_user_smtp_body: "A user mutation has been done in the sue.nl domain!"
    linux_user_smtp_host: "smtp.sue.nl"
    linux_user_smtp_mail_from: "source@sue.nl"
    linux_user_smtp_mail_to: "recipient@sue.nl"
    linux_user_smtp_port: "25"
    linux_user_smtp_subject: "User mutations in sue.nl"
    linux_users:
      - name: user_1
      - name: user_2
        state: absent
      - name: svc_user_1
        create_home: false
        shell: "/bin/bash"
        uid: 15001
      - name: svc_user_2
        authorized_keys:
          - key: "key1"
          - key: "key2"
  roles:
    - role: sue.generic.linux_user
```

# Tags
This role supports multiple tags:
- `linux_user`: runs all tasks
- `linux_user_create`: only create users
- `linux_user_authorized_keys`: only configure authorized keys for a user

# Supported
Tested and working on the following operating systems:
- AlmaLinux 9.5 (Teal Serval)

# License
MIT

# Author Information
- Nathan van Buuren (Sue B.V.)
                                                        


Conclusion

And those were the rules of thumb. Again, these are not hard and fast rules, but guidelines to help you write more consistent, robust, and reusable Ansible roles.

Ansible is a powerful automation tool with many applications. Would you like to learn more about writing your own Ansible modules, setting up your Ansible configuration correctly, or when to use Ansible and when not to? Feel free to contact us, we are happy to help.
