Introduction
Tools like CFEngine, Puppet, Chef and Ansible have revolutionized infrastructure automation by providing a structured framework for organizing and sharing configuration management code, with Ansible being one of the more recent tools. Ansible is widely used for automating the configuration of IT environments, and it is highly flexible. The creators of Ansible position it as a tool that ‘offers open-source automation that is simple, flexible, and powerful’, but it is not always clear what makes for robust Ansible code, and things are often less simple than advertised, especially in more complex environments and setups.
Within Ansible, roles are the concept used to encapsulate code. A role bundles tasks, variables, handlers, and files into a modular, reusable unit that, if written correctly, can easily be shared across projects and teams. Whether you’re managing a handful of servers or orchestrating complex (multi-)cloud environments, roles enable you to write cleaner, more maintainable code. Writing roles has many challenges, however, and it can be quite difficult to conceive an Ansible role that one would consider robust and ‘complete’. In this blog post, we would like to introduce some rules of thumb on what a robust Ansible role consists of, which will hopefully enable you to write better Ansible roles.
Some Rules of Thumb
The rules of thumb proposed here are as follows:
- The main.yml of a role should only be used for including and importing tasks
- All tasks should be optimized for efficiency
- Data and code should be fully separated
- Tasks should run on an include basis
- Roles should handle check mode correctly
- All tasks should be idempotent
- Variables should be properly namespaced and hierarchized
- Tags should be implemented and documented
- Roles should be validated
- Roles should be documented
Note that these are rules of thumb only. Depending on the usage and requirements of the role, some rules should be left out or added. Generally speaking, however, these will be applicable. Below, we will expand on all of these rules with illustrations and context to show you what we mean by them.
Illustrations with a role: linux_user
To illustrate the rules of thumb given above, we will create a role that can create Linux users. We set it up as follows:
# Creation of a basic role structure using ansible-galaxy
ansible-galaxy role init linux_user
# Navigate into the new role
cd linux_user/
# Create a supplementary tasks file
touch tasks/linux_user.yml
Our directory structure now looks like this:
# ls ./*
./README.md
./defaults:
main.yml
./files:
./handlers:
main.yml
./meta:
main.yml
./tasks:
linux_user.yml main.yml
./templates:
./tests:
inventory test.yml
./vars:
main.yml
This gives us a basic structure to work with. In main.yml, we include the tasks file that holds all user-related tasks:
# cat tasks/main.yml
---
- name: Include Linux user tasks
  loop: "{{ linux_users }}"
  loop_control:
    loop_var: user
    label: "{{ user }}"
  tags:
    - linux_user
    - linux_user_authorized_keys
    - linux_user_create
  when:
    - ansible_facts["os_family"] in linux_user_allowed_os_families
    - linux_user_allowed_target_hostgroups | intersect(group_names)
  ansible.builtin.include_tasks: linux_user.yml
The linux_user.yml file included above contains the actual user creation tasks.
# cat tasks/linux_user.yml
---
- name: Create user "{{ user.name }}"
  become: true
  notify:
    - "Send mail notification about user creation"
  tags:
    - linux_user
    - linux_user_create
  ansible.builtin.user:
    # Required
    name: "{{ user.name }}"
    # Optional
    append: "{{ user.append | default(omit) }}"
    comment: "{{ user.comment | default(omit) }}"
    create_home: "{{ user.create_home | default(omit) }}"
    force: "{{ user.force | default(omit) }}"
    group: "{{ user.group | default(omit) }}"
    groups: "{{ user.groups | default(omit) }}"
    hidden: "{{ user.hidden | default(omit) }}"
    home: "{{ user.home | default(omit) }}"
    password: "{{ user.password | default(omit) }}"
    password_expire_account_disable: "{{ user.password_expire_account_disable | default(omit) }}"
    password_expire_max: "{{ user.password_expire_max | default(omit) }}"
    password_expire_min: "{{ user.password_expire_min | default(omit) }}"
    password_expire_warn: "{{ user.password_expire_warn | default(omit) }}"
    password_lock: "{{ user.password_lock | default(omit) }}"
    remove: "{{ user.remove | default(omit) }}"
    shell: "{{ user.shell | default(omit) }}"
    state: "{{ user.state | default(omit) }}"
    system: "{{ user.system | default(omit) }}"
    uid: "{{ user.uid | default(omit) }}"
    umask: "{{ user.umask | default(omit) }}"
    update_password: "{{ user.update_password | default(omit) }}"

- name: Set authorized keys for user "{{ user.name }}"
  become: true
  loop: "{{ user.authorized_keys | default(linux_user_default_authorized_keys) }}"
  loop_control:
    loop_var: key
    label: "{{ key.key }}"
  tags:
    - linux_user
    - linux_user_authorized_keys
  ansible.posix.authorized_key:
    # Required
    key: "{{ key.key }}"
    user: "{{ user.name }}"
    # Optional
    comment: "{{ key.comment | default(omit) }}"
    exclusive: "{{ key.exclusive | default(omit) }}"
    key_options: "{{ key.key_options | default(omit) }}"
    path: "{{ key.path | default(omit) }}"
    state: "{{ key.state | default(omit) }}"
In the above tasks file, a handler is configured called ‘Send mail notification about user creation’, which looks like this:
# cat handlers/main.yml
---
- name: Send mail notification about user creation
  delegate_to: localhost
  when: linux_user_smtp_enabled
  community.general.mail:
    body: "{{ linux_user_smtp_body }}"
    from: "{{ linux_user_smtp_mail_from }}"
    host: "{{ linux_user_smtp_host }}"
    port: "{{ linux_user_smtp_port }}"
    subject: "{{ linux_user_smtp_subject }}"
    to: "{{ linux_user_smtp_mail_to }}"
As you can see, many variables are used in the above files. These are configured in the defaults and vars files of the role:
# cat defaults/main.yml
---
# Users
linux_user_default_authorized_keys: []
linux_users: []
# SMTP
linux_user_smtp_enabled: false
linux_user_smtp_body: "User mutations executed on {{ ansible_facts.hostname }}"
linux_user_smtp_host: ""
linux_user_smtp_mail_from: ""
linux_user_smtp_mail_to: ""
linux_user_smtp_port: ""
linux_user_smtp_subject: "User mutations"
# cat vars/main.yml
---
# Control
linux_user_allowed_os_families:
  - "RedHat"
linux_user_allowed_target_hostgroups:
  - "dev_servers"
Rules of Thumb in Practice
1. The main.yml of a role should only be used for including and importing tasks
When using the main.yml directly for defining tasks, especially with roles that contain many tasks, the role becomes very unorganized, harder to read, and harder to understand. It is therefore a good practice to use the main.yml only for including tasks, as we do in the role given above.
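As a role grows, the same pattern scales to several task files. The following is a minimal sketch only: validate.yml and cleanup.yml are hypothetical task files that are not part of the example role, and the conditionals and tags from the real main.yml are left out for brevity.

```yaml
# tasks/main.yml (sketch): only includes and imports, no inline module calls
---
- name: Validate role input                 # validate.yml is a hypothetical file
  ansible.builtin.import_tasks: validate.yml

- name: Include Linux user tasks            # the task file used in the example role
  ansible.builtin.include_tasks: linux_user.yml
  loop: "{{ linux_users }}"
  loop_control:
    loop_var: user

- name: Clean up temporary files            # cleanup.yml is a hypothetical file
  ansible.builtin.import_tasks: cleanup.yml
```

Each task file can then be read and maintained on its own, while main.yml acts as a table of contents for the role.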
2. All tasks should be optimized for efficiency
Tasks should be optimized for efficiency. As stated before, the main.yml should only be used for importing or including tasks, and the choice between the two affects how the role is processed.
In our example role, we include tasks from linux_user.yml, as opposed to importing them. Importing is usually the faster of the two. The difference lies in the way Ansible processes the role: an include is dynamic and is evaluated at runtime, when the play reaches the statement, whereas an import is static and is preprocessed when the playbook is parsed, before any task runs. Static imports are faster most of the time, but how much faster depends heavily on how much work is done at runtime.
Why then does our main.yml use include instead of import? This is because importing doesn’t support looping over variables on the import statement itself. So, if we wanted to speed up our user creation tasks, we could use import_tasks in the main.yml, and iterate over the users in linux_user.yml, something like the following:
# cat tasks/main.yml
---
- name: Import Linux user tasks
  tags:
    - linux_user
    - linux_user_authorized_keys
    - linux_user_create
  when:
    - ansible_facts["os_family"] in linux_user_allowed_os_families
    - linux_user_allowed_target_hostgroups | intersect(group_names)
  ansible.builtin.import_tasks: linux_user.yml
# cat tasks/linux_user.yml
---
# The loop variable is not available in the task name, so the name is static here
- name: Create Linux users
  become: true
  loop: "{{ linux_users }}"
  loop_control:
    loop_var: user
    label: "{{ user }}"
  notify:
    - "Send mail notification about user creation"
  tags:
    - linux_user
    - linux_user_create
  ansible.builtin.user:
    {...}
This preprocesses the user creation task and would indeed be faster if we only had the user creation task. But in our case, we also have a task that configures authorized SSH keys for each user. That task would now need its own iteration over the users, since the loop is no longer on the import statement. Iterating over the users twice in statically imported tasks can actually cost more time than iterating once at runtime (again depending on how much work is done at runtime), so include_tasks is the faster choice in this situation and our tasks are optimized for efficiency.
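For reference, the second iteration in the import variant could look roughly like the task below. This is a sketch under assumptions, not the role's actual code: it uses the subelements filter to pair every user with each of its keys, the loop variable name user_key is made up for this example, and unlike the original task it does not fall back to linux_user_default_authorized_keys.

```yaml
# tasks/linux_user.yml (import variant, second task; sketch only)
- name: Set authorized keys for users
  become: true
  # subelements yields [user, key] pairs; users without authorized_keys are skipped
  loop: "{{ linux_users | subelements('authorized_keys', skip_missing=True) }}"
  loop_control:
    loop_var: user_key                 # hypothetical name: user_key.0 is the user, user_key.1 the key
    label: "{{ user_key.0.name }}"
  tags:
    - linux_user
    - linux_user_authorized_keys
  ansible.posix.authorized_key:
    user: "{{ user_key.0.name }}"
    key: "{{ user_key.1.key }}"
    state: "{{ user_key.1.state | default(omit) }}"
```

The extra pass over linux_users in this task is exactly the second iteration that the include variant avoids.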
3. Data and code should be fully separated
To keep our role fully modular and scalable, all user data should be left out of the role itself. Depending on your Ansible setup, variables should be passed to the role externally in some way. Here is an example of passing vars straight to the role through a playbook that runs the role. Note that I published my role into an Ansible Galaxy collection called ‘sue.generic’:
---
- name: Run role
  hosts: localhost
  vars:
    linux_user_smtp_host: "smtp.sue.nl"
    linux_user_smtp_mail_from: "source@sue.nl"
    linux_user_smtp_mail_to: "recipient@sue.nl"
    linux_user_smtp_port: "25"
    linux_users:
      - name: user1
        state: absent
      - name: user2
  roles:
    - role: sue.generic.linux_user
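The same data can also live in the inventory instead of the playbook. A minimal sketch, assuming a group_vars directory next to the inventory; the file path and the comment value are illustrative and not part of the role:

```yaml
# group_vars/dev_servers/linux_user.yml (hypothetical path)
---
linux_user_smtp_enabled: true
linux_user_smtp_host: "smtp.sue.nl"
linux_user_smtp_mail_from: "source@sue.nl"
linux_user_smtp_mail_to: "recipient@sue.nl"
linux_user_smtp_port: "25"
linux_users:
  - name: user1
    state: absent
  - name: user2
    comment: "Example user"   # illustrative value
```

The playbook then only needs to reference the role, and the role itself stays free of environment-specific data.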
4. Tasks should run on an include basis
An automation tool like Ansible is very effective at running actions such as pushing configuration files, and for the same reason it can be very effective at polluting or destroying your environment. Running the wrong tasks on a machine or set of machines can do serious damage to your IT infrastructure. In our example role, we set the following conditionals on the inclusion of the Linux user tasks:
when:
  - ansible_facts["os_family"] in linux_user_allowed_os_families
  - linux_user_allowed_target_hostgroups | intersect(group_names)
These conditionals limit the tasks to only run against some predefined allowed OS families and some predefined allowed hostgroups of servers. By conditionally including tasks where needed as opposed to running tasks against all machines and excluding certain machines, we prevent accidentally pushing users to systems they don’t need access to, enhancing security.
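To make the hostgroup conditional concrete, consider the small YAML inventory below. The host and group names are hypothetical; only dev_servers matches the value configured in the role's vars.

```yaml
# inventory.yml (hypothetical hosts and groups)
all:
  children:
    dev_servers:
      hosts:
        dev01.example.com:
    prod_servers:
      hosts:
        prod01.example.com:
```

For dev01.example.com, group_names contains dev_servers, so the intersection with linux_user_allowed_target_hostgroups is non-empty and the tasks are included (provided the OS family matches as well). For prod01.example.com the intersection is empty and the include is skipped, even when a playbook targets all hosts.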
5. Roles should handle check mode correctly
Check mode handling should be implemented for each task, and check mode should fully resemble the actual run: it should catch the same errors and produce the same output. To achieve this, one needs to consider the check mode behaviour of each and every task. Sometimes, for example, check_mode should be disabled for a task.
To illustrate this, let’s say that we expect a local file containing our users’ password to exist and that we want to read this file using the shell module. There are of course better ways to do this, and there are also better ways of storing secrets, like having a dynamically rotating secret as described in our blog post called ‘Creating Automatically Rotating Secrets Using Terraform’, but using the shell module here illustrates the point that check mode should always resemble the actual run.
There is a caveat to this scenario: the file containing the password doesn’t actually exist. We could retrieve the password using something along the lines of:
- name: Retrieve user password
  changed_when: false
  check_mode: false
  failed_when: user_password_result.rc != 0
  register: user_password_result
  vars:
    password_file: "/tmp/password.txt"
  ansible.builtin.shell: |
    if [ -f "{{ password_file }}" ]; then
      cat "{{ password_file }}"
    else
      echo "Password file not found: {{ password_file }}"
      exit 1
    fi
Note that check_mode is set to false here, causing this task to always run, even in check mode. Why? Because the shell module is skipped in check mode by default, and that can cause a mismatch between check mode and the actual run. If the password file doesn’t exist and we run in check mode, the task would not generate an error (the script doesn’t run and therefore cannot fail on the missing file). When running without check mode, it would fail because the file doesn’t exist. So if we don’t explicitly set check_mode to false on this task, check mode would not properly resemble the actual run.
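A side effect of always running the retrieval task is that later tasks can rely on the registered result in both modes. The follow-up task below is a minimal sketch and not part of the original role; it assumes we want to verify that the file actually contained something.

```yaml
# Sketch of a follow-up task; it behaves the same in check mode and in a real run,
# because user_password_result.stdout is populated in both cases.
- name: Verify that a password was read
  ansible.builtin.assert:
    that:
      - user_password_result.stdout | length > 0
    fail_msg: "Password file is empty"
    quiet: true
```

Had the retrieval task been skipped in check mode, the registered result would not contain stdout and this assertion would fail in check mode only.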
6. All tasks should be idempotent
All tasks should be able to run more than once and have the same outcome if nothing changes. A task that has this property is said to be idempotent. Most of the time, you don’t need to worry about this as most modules will have idempotency handling in them. In edge cases, however, like when using the command, shell or lineinfile modules or when writing your own modules, having idempotency in mind is important. In the case of our role, we might want to add an env var to the newly created users’ .bashrc file:
- name: Ensure custom environment variable is set in .bashrc for user "{{ user.name }}"
  become: true
  tags:
    - linux_user
    - linux_user_bashrc
  ansible.builtin.lineinfile:
    path: "{{ user.home | default('/home/' + user.name) }}/.bashrc"
    regexp: '^export MY_CUSTOM_VAR='
    line: 'export MY_CUSTOM_VAR="hello-world"'
    state: present
    create: true
    owner: "{{ user.name }}"
    group: "{{ user.group | default(user.name) }}"
    mode: '0644'
Note that the regexp is what keeps this task idempotent when the value changes: an existing line matching the pattern is replaced, so the variable is never defined more than once. Without regexp, as in the task below, lineinfile still leaves an identical line alone, but as soon as the value changes it appends a second export and leaves the stale one in the users’ .bashrc file:
- name: Set line in .bashrc without regexp for user "{{ user.name }}" (appends a second export when the value changes)
  become: true
  tags:
    - linux_user
    - linux_user_bashrc
  ansible.builtin.lineinfile:
    path: "{{ user.home | default('/home/' + user.name) }}/.bashrc"
    line: 'export MY_CUSTOM_VAR="hello-world"'
    insertafter: EOF
    state: present
    create: true
    owner: "{{ user.name }}"
    group: "{{ user.group | default(user.name) }}"
    mode: '0644'
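The command and shell modules need the same care, since Ansible cannot know what an arbitrary command does. A minimal sketch, assuming a one-time initialisation script: both the script path and the marker file are hypothetical and only serve to show the creates argument.

```yaml
# Sketch: the command only runs while the marker file does not exist yet,
# so repeated runs report "ok" instead of "changed".
- name: Initialise workspace for user "{{ user.name }}"
  become: true
  ansible.builtin.command:
    cmd: /usr/local/bin/init-workspace --user "{{ user.name }}"    # hypothetical script
    creates: "/var/lib/workspaces/{{ user.name }}/.initialized"    # hypothetical marker file
```

When no such marker exists, changed_when (as used in the password retrieval task earlier) is the usual way to tell Ansible whether a command actually changed anything.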
7. Variables should be properly namespaced and hierarchized
All role variables should be properly namespaced to avoid unforeseen variable overrides and conflicting variables, especially in bigger environments where many variables are present. This is usually done by prefixing every variable of the role with the role name, like we have done in our role (`linux_user`).
Next to being namespaced, the variables need to be properly hierarchized. All of the variables, except for linux_user_allowed_os_families and linux_user_allowed_target_hostgroups, are defined in the role’s defaults, which have the lowest precedence in the variable hierarchy, making them very easy to override. If a role contains variables that should not be changed lightly, like our linux_user_allowed_os_families variable, they belong in vars instead of defaults, which makes them harder to override. For the exact order, refer to the official Ansible documentation on variable precedence.
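Harder does not mean impossible. Role parameters rank above role vars in Ansible's precedence order, so a deliberate override is still available, while play-level vars and inventory variables would not take effect. A minimal sketch; the added "Debian" entry is purely illustrative, and the role has not been tested on that OS family.

```yaml
---
- name: Run linux_user with a deliberate override of a role var
  hosts: dev_servers
  roles:
    - role: sue.generic.linux_user
      vars:                                  # role parameters outrank vars/main.yml
        linux_user_allowed_os_families:
          - "RedHat"
          - "Debian"                         # illustrative; untested with this role
```

Setting the same variable under the play's own vars keyword would be ignored, because play vars rank below role vars.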
8. Tags should be implemented and documented
All roles should support tagging for fine-grained, targeted runs. In our role, for instance, we can run only the user creation task by using the linux_user_create tag, or only the authorized keys task by using the linux_user_authorized_keys tag. Each tag should be documented so that its purpose is clear.
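In practice, the tags are selected on the command line when running a playbook that uses the role. A minimal sketch, assuming a playbook called site.yml (a hypothetical name); the invocations are shown as comments.

```yaml
# site.yml (hypothetical playbook name)
#
# Run everything in the role:
#   ansible-playbook site.yml --tags linux_user
# Only create users:
#   ansible-playbook site.yml --tags linux_user_create
# Only configure authorized keys:
#   ansible-playbook site.yml --tags linux_user_authorized_keys
---
- name: Manage Linux users
  hosts: dev_servers
  roles:
    - role: sue.generic.linux_user
```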
9. Roles should be validated
When finalizing your new role, always validate it by at least linting and testing it. For linting, ansible-lint is most commonly used. For testing, there are multiple options, a popular one being Ansible Molecule. Covering linting and testing in this post would be way too extensive, so we’re just mentioning them and not including any examples. If you would like to see examples of linting and testing, however, make sure to reach out to us!
10. Roles should be documented
Our new role isn’t complete without documentation on how to use it. Documentation is often neglected, but it is no less important than properly writing the role itself. There are tools to auto-generate the README.md for an Ansible role, but we provide a custom-written one below:
# cat README.md
# linux_user
Ansible role for creating Linux users
# Requirements
This role requires the following collections to be present:
- ansible.builtin
- ansible.posix
- community.general
# Role Variables
## Defaults
User related defaults:
`linux_user_default_authorized_keys`: []
`linux_users`: []
Where users can be configured as follows, where the values come from the [ansible user module](https://docs.ansible.com/ansible/latest/collections/ansible/builtin/user_module.html):
`linux_users.user.name` (required)
`linux_users.user.append`
`linux_users.user.comment`
`linux_users.user.create_home`
`linux_users.user.force`
`linux_users.user.group`
`linux_users.user.groups`
`linux_users.user.hidden`
`linux_users.user.home`
`linux_users.user.password`
`linux_users.user.password_expire_account_disable`
`linux_users.user.password_expire_max`
`linux_users.user.password_expire_min`
`linux_users.user.password_expire_warn`
`linux_users.user.password_lock`
`linux_users.user.remove`
`linux_users.user.shell`
`linux_users.user.state`
`linux_users.user.system`
`linux_users.user.uid`
`linux_users.user.umask`
`linux_users.user.update_password`
Optionally, SMTP notifications can be configured:
`linux_user_smtp_enabled`: false
`linux_user_smtp_body`: "User mutations executed on {{ ansible_facts.hostname }}"
`linux_user_smtp_host`: ""
`linux_user_smtp_mail_from`: ""
`linux_user_smtp_mail_to`: ""
`linux_user_smtp_port`: ""
`linux_user_smtp_subject`: "User mutations"
## Variables
`linux_user_allowed_os_families`: ["RedHat"]
`linux_user_allowed_target_hostgroups`: ["dev_servers"]
# Example Playbook
```yaml
---
- name: Run linux_user role
  hosts: localhost
  vars:
    linux_user_smtp_body: "A user mutation has been done in the sue.nl domain!"
    linux_user_smtp_host: "smtp.sue.nl"
    linux_user_smtp_mail_from: "source@sue.nl"
    linux_user_smtp_mail_to: "recipient@sue.nl"
    linux_user_smtp_port: "25"
    linux_user_smtp_subject: "User mutations in sue.nl"
    linux_users:
      - name: user_1
      - name: user_2
        state: absent
      - name: svc_user_1
        create_home: false
        shell: "/bin/bash"
        uid: 15001
      - name: svc_user_2
        # Each entry is a mapping, because the role reads key.key
        authorized_keys:
          - key: "key1"
          - key: "key2"
  roles:
    - role: sue.generic.linux_user
```
# Tags
This role supports multiple tags:
- `linux_user`: runs all tasks
- `linux_user_create`: only create users
- `linux_user_authorized_keys`: only configure authorized keys for a user
# Supported
Tested and working on the following operating systems:
- AlmaLinux 9.5 (Teal Serval)
# License
MIT
# Author Information
- Nathan van Buuren (Sue B.V.)
Conclusion
And that concludes the rules of thumb. Again, these are by no means fixed rules, but rather guidelines that will hopefully help you write better, more consistent and more useful Ansible roles.
Ansible is an incredibly powerful automation tool that can be used for many purposes. Are you interested in how to write your own Ansible modules, how to manage your Ansible setup properly, when or when not to use Ansible, or any other topic covered in this post? Make sure to reach out to us!