Ansible for DevOps: Server and configuration management for humans
Summary #
Author: Jeff Geerling #
Last annotated on: 2021-05-15 #
Highlights count: 66 #
Notes count: 9 #
Highlights and notes: #
Loc: #
color: orange. note:
Vagrant, a server provisioning tool, and VirtualBox, a local virtualization environment, make a potent combination for testing infrastructure and individual server configurations locally.
Loc: #
color: orange. note:
up the first time, Vagrant automatically provisions the newly-minted VM using whatever provisioner you have configured in the Vagrantfile. You can also run vagrant provision after the VM has been created to explicitly run the provisioner again. It’s this last feature that is most important for us. Ansible is one of many provisioners integrated with Vagrant (others include basic shell scripts, Chef, Docker, Puppet, and Salt).
Loc: #
color: orange. note:
playbook, step by step: 1 --- This first line is a marker showing that the rest of the document will be formatted in YAML (read a getting started guide for YAML).
Loc: #
color: orange. note:
2 - hosts: all This line tells Ansible to which hosts this playbook applies. all works here, since Vagrant is invisibly using its own Ansible inventory file (instead of the one we created earlier in /etc/ansible/hosts), which just defines the Vagrant VM.
Loc: #
color: yellow. note:
Discover Ansible’s parallel nature First, I want to make sure Vagrant configured the VMs with the right hostnames. Use ansible with the -a argument 'hostname' to run hostname against all the servers: $ ansible multi -a "hostname"
Loc: #
color: orange. note:
By default, Ansible will run your commands in parallel, using multiple process forks, so the command will complete more quickly.
Loc: #
color: orange. note:
To get an exhaustive list of all the environment details (‘facts’, in Ansible’s lingo) for a particular server (or for a group of servers), use the command ansible [host-or-group] -m setup.
Loc: #
color: orange. note: adhoc command, multi is group in inventory
ansible multi -b -m service -a "name=ntpd state=started enabled=yes"
Loc: #
color: orange. note: adhoc command to run command, without module
Check to make sure Django is installed and working correctly. $ ansible app -a "python -c 'import django; print django.get_version()'" 192.168.60.5 | SUCCESS | rc=0 >> 1.11.12 192.168.60.4 | SUCCESS | rc=0 >> 1.11.12
Loc: #
color: orange. note:
One thing that is universal to all of Ansible’s SSH connection methods is that Ansible uses the connection to transfer one or a few files defining a play or command to the remote server, then runs the play/command, then deletes the transferred file(s), and reports back the results. A fast, stable, and secure SSH connection is of paramount importance to Ansible.
Loc: #
color: yellow. note:
Playbooks are written in YAML, a simple human-readable syntax popular for defining configuration.
Loc: #
color: yellow. note:
Ad-hoc commands alone make Ansible a powerful tool; playbooks turn Ansible into a top-notch server provisioning and configuration management tool.
Loc: #
color: yellow. note:
The greater-than sign (>) immediately following the command: module directive tells YAML “automatically quote the next set of indented lines as one long string, with each line separated by a space”.
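A minimal sketch of that folded-scalar behavior (the script path and arguments are invented for illustration):

```yaml
# ">" folds the indented lines into one string, joining them
# with spaces, so this runs as a single long command.
- name: Run a long command without one unwieldy line.
  command: >
    /usr/local/bin/make_database.sh
    db_user db_pass db_name
```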
Loc: #
color: yellow. note:
The first line, ---, is how we mark this document as using YAML syntax (like using <html> at the top of an HTML document, or <?php at the top of a block of PHP code).
Loc: #
color: yellow. note:
The third line, become: yes tells Ansible to run all the commands through sudo, so the commands will be run as the root user.
Loc: #
color: yellow. note:
Ansible allows lists of variables to be passed into tasks using with_items: Define a list of items and each one will be passed into the play, referenced using the item variable (e.g. {{ item }}).
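A minimal with_items sketch (the package names are arbitrary examples):

```yaml
# Each list entry is passed into the task as the "item" variable.
- name: Ensure some common utilities are installed.
  yum:
    name: "{{ item }}"
    state: present
  with_items:
    - git
    - curl
    - ntp
```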
Loc: #
color: yellow. note: Limit hosts
You can also limit the hosts on which the playbook is run via the ansible-playbook command: $ ansible-playbook playbook.yml --limit webservers
Loc: #
color: yellow. note:
You could also limit the playbook to one particular host: $ ansible-playbook playbook.yml --limit xyz.example.com
Loc: #
color: yellow. note:
If you want to see a list of hosts that would be affected by your playbook before you actually run it, use --list-hosts: $ ansible-playbook playbook.yml --list-hosts
Loc: #
color: yellow. note:
If no remote_user is defined alongside the hosts in a playbook, Ansible assumes you’ll connect as the user defined in your inventory file for a particular host, and then will fall back to your local user account name. You can explicitly define a remote user to use for remote plays using the --user (-u) option: $ ansible-playbook playbook.yml --user=johndoe
Loc: #
color: yellow. note:
--connection=TYPE (-c TYPE): The type of connection which will be used (this defaults to ssh; you might sometimes want to use local to run a playbook on your local machine, or on a remote server via cron).
Loc: #
color: yellow. note:
Ansible lets you run tasks before or after the main tasks (defined in tasks:) or roles (defined in roles:—we’ll get to roles later) using pre_tasks and post_tasks, respectively.
Loc: #
color: yellow. note:
handlers are special kinds of tasks you run at the end of a play by adding the notify option to any of the tasks in that group. The handler will only be called if one of the tasks notifying the handler makes a change to the server (and doesn’t fail), and it will only be notified at the end of the play.
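A sketch of the notify/handler relationship (the file names and service name are assumptions):

```yaml
---
- hosts: webservers
  become: yes
  tasks:
    - name: Copy the Apache config into place.
      copy:
        src: httpd.conf
        dest: /etc/httpd/conf/httpd.conf
      # Only fires the handler if this task actually changed the file.
      notify: restart apache
  handlers:
    - name: restart apache
      service:
        name: httpd
        state: restarted
```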
Loc: #
color: yellow. note:
Modifying PHP’s configuration is a perfect way to demonstrate lineinfile’s simplicity and usefulness:
- name: Adjust OpCache memory setting.
  lineinfile:
    dest: "/etc/php/7.1/apache2/conf.d/10-opcache.ini"
    regexp: "^opcache.memory_consumption"
    line: "opcache.memory_consumption = 96"
    state: present
  notify: restart apache
Ansible’s lineinfile module does a simple task: ensures that a particular line of text exists (or doesn’t exist) in a file.
Loc: #
color: yellow. note:
You can also pass in extra variables using quoted JSON, YAML, or even by passing a JSON or YAML file directly, like --extra-vars "@even_more_vars.json" or --extra-vars "@even_more_vars.yml", but at this point, you might be better off using one of the other methods below.
Loc: #
color: yellow. note:
Magic variables with host and group variables and information If you ever need to retrieve a specific host’s variables from another host, Ansible provides a magic hostvars variable containing all the defined host variables (from inventory files and any discovered YAML files inside host_vars directories). # From any host, returns "jane". hostvars['host1']['admin_user'] There are a variety of other variables Ansible provides that you may need to use from time to time: groups: A list of all group names in the inventory. group_names: A list of all the groups of which the current host is a part. inventory_hostname: The hostname of the current host, according to the inventory (this can differ from ansible_hostname, which is the hostname reported by the system). inventory_hostname_short: The first part of inventory_hostname, up to the first period. play_hosts: All hosts on which the current play will be run.
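A short sketch using these magic variables (it assumes host1 defines admin_user somewhere in inventory or host_vars):

```yaml
- name: Show another host's variable plus this host's identity.
  debug:
    msg: >-
      host1 admin is {{ hostvars['host1']['admin_user'] }};
      I am {{ inventory_hostname }}, in groups {{ group_names }}.
```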
Loc: #
color: yellow. note:
Facts (Variables derived from system information) By default, whenever you run an Ansible playbook, Ansible first gathers information (“facts”) about each host in the play. You may have noticed this whenever we ran playbooks in earlier chapters: $ ansible-playbook playbook.yml PLAY [group] ** GATHERING FACTS * ok: [host1] ok: [host2] ok: [host3] Facts can be extremely helpful when you’re running playbooks; you can use gathered information like host IP addresses, CPU type, disk space, operating system information, and network interface information to change when certain tasks are run, or to change certain information used in configuration files.
Loc: #
color: yellow. note:
register In Ansible, any play can ‘register’ a variable, and once registered, that variable will be available to all subsequent tasks. Registered variables work just like normal variables or host facts.
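A minimal register sketch:

```yaml
- name: Capture a command's output.
  command: whoami
  register: whoami_result

- name: Use the registered variable in a later task.
  debug:
    msg: "Ansible connected as {{ whoami_result.stdout }}"
```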
Loc: #
color: yellow. note:
Many command-line utilities print results to stderr instead of stdout, so failed_when can be used to tell Ansible when a task has actually failed and is not just reporting its results in the wrong way.
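A failed_when sketch (the tool path and its stderr convention are hypothetical):

```yaml
- name: Run a tool that reports its results on stderr.
  command: /usr/local/bin/noisy-tool --status
  register: tool_result
  # Treat the run as failed only when stderr contains a real
  # error marker, instead of relying on the return code alone.
  failed_when: "'ERROR' in tool_result.stderr"
```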
Loc: #
color: yellow. note: Running locally a playbook
As a quick example, here’s a short playbook that you can run with the command ansible-playbook test.yml --connection=local:
Loc: #
color: yellow. note:
Blocks Introduced in Ansible 2.0.0, Blocks allow you to group related tasks together and apply particular task parameters on the block level. They also allow you to handle errors inside the blocks in a way similar to most programming languages’ exception handling.
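A block sketch showing block-level parameters plus rescue/always error handling (the command path is made up):

```yaml
tasks:
  - name: Group related tasks and share parameters.
    block:
      - name: A task that might fail.
        command: /usr/local/bin/flaky-deploy
    rescue:
      - debug:
          msg: "Runs only if a task in the block failed."
    always:
      - debug:
          msg: "Runs whether the block succeeded or failed."
    become: yes   # block-level parameter, applied to every task above
```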
Loc: #
color: yellow. note:
Tasks can easily be included in a similar way. In the tasks: section of your playbook, you can add import_tasks directives like so: tasks: - import_tasks: imported-tasks.yml
Loc: #
color: yellow. note:
If you need to have included tasks that are dynamic—that is, they need to do different things depending on how the rest of the playbook runs—then you can use include_tasks rather than import_tasks.
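A dynamic include_tasks sketch (the per-OS file naming scheme is an assumption):

```yaml
tasks:
  # The filename is resolved at runtime from a gathered fact,
  # which the static import_tasks could not do.
  - include_tasks: "setup-{{ ansible_os_family | lower }}.yml"
```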
Loc: #
color: yellow. note:
Handler imports and includes Handlers can be imported or included just like tasks, within a playbook’s handlers section. For example: handlers: - import_tasks: handlers.yml
Loc: #
color: yellow. note:
- hosts: all remote_user: root tasks: […] - import_playbook: web.yml - import_playbook: db.yml
Loc: #
color: yellow. note:
Including playbooks inside other playbooks makes your playbook organization a little more sane, but once you start wrapping up your entire infrastructure’s configuration in playbooks, you might end up with something resembling Russian nesting dolls.
Loc: #
color: yellow. note:
Wouldn’t it be nice if there were a way to take bits of related configuration, and package them together nicely? Additionally, what if we could take these packages (often configuring the same thing on many different servers) and make them flexible so that we can use the same package throughout our infrastructure, with slightly different settings on individual servers or groups of servers? Ansible Roles can do all that and more!
Loc: #
color: yellow. note: Role basic structure
There are only two directories required to make a working Ansible role: role_name/ meta/ tasks/ If you create a directory structure like the one shown above, with a main.yml file in each directory, Ansible will run all the tasks defined in tasks/main.yml if you call the role from your playbook using the following syntax:
---
- hosts: all
  roles:
    - role_name
Loc: #
color: yellow. note:
Inside the meta folder, add a simple main.yml file with the following contents:
---
dependencies: []
Loc: #
color: yellow. note:
When running a role’s tasks, Ansible picks up variables defined in a role’s vars/main.yml file and defaults/main.yml (I’ll get to the differences between the two later), but will allow your playbooks to override the defaults or other role-provided variables if you want.
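A sketch of that override behavior (the role name, variable, and values are invented):

```yaml
---
# roles/role_name/defaults/main.yml — easily overridden defaults:
app_port: 8080
---
# playbook.yml — the play overrides the role's default:
- hosts: all
  roles:
    - role: role_name
      app_port: 9090
```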
Loc: #
color: yellow. note:
But for many organizations, basic CLI use is inadequate: The business needs detailed reporting of infrastructure deployments and failures, especially for audit purposes. Team-based infrastructure management requires varying levels of involvement in playbook management, inventory management, and key and password access. A thorough visual overview of the current and historical playbook runs and server health helps identify potential issues before they affect the bottom line. Playbook scheduling ensures infrastructure remains in a known state. Ansible Tower fulfills these requirements—and many more—and provides a great mechanism for team-based Ansible usage.
Loc: #
color: yellow. note: at ICTC we used buildbot for this purpose.
Tower Alternatives Ansible Tower is purpose-built for use with Ansible playbooks, but there are many other ways to run playbooks on your servers with a solid workflow. If price is a major concern, and you don’t need all the bells and whistles Tower provides, you can use other popular tools like Jenkins, Rundeck, or Go CI.
Loc: #
color: yellow. note:
Testing
Loc: #
color: yellow. note:
Since all your infrastructure is defined in code, you can automate unit, functional, and integration tests on your infrastructure, just like you do for your applications. This chapter covers different levels of infrastructure testing, and highlights tools and techniques that help you test and develop Ansible content.
Loc: #
color: yellow. note:
Linting YAML with yamllint Once you have a playbook written, it’s a good idea to make sure the basic YAML syntax is correct. YAML parsers can be forgiving, but many of the most common errors in Ansible playbooks, especially for beginners, are whitespace issues. yamllint is a simple YAML lint tool which can be installed via Pip: pip3 install yamllint
Loc: #
color: yellow. note:
Performing a --syntax-check Syntax checking is similarly straightforward, and only requires a few seconds for even larger, more complex playbooks with dozens or hundreds of includes. When you run a playbook with --syntax-check, the plays are not run; instead, Ansible loads the entire playbook statically and ensures everything can be loaded without a fatal error. If you are missing an imported task file, misspelled a module name, or are supplying a module with invalid parameters, --syntax-check will quickly identify the problem.
Loc: #
color: yellow. note:
But there are some aspects to this playbook which could be improved. Let’s see if ansible-lint can highlight them. Install it via Pip with: pip3 install ansible-lint
Loc: #
color: yellow. note:
Vagrant can help with this process somewhat, and it is well-suited to the task, but Vagrant can be a little slow, and it doesn’t work well in CI or lightweight environments.
Loc: #
color: yellow. note:
And the best part? Everything in Molecule is controlled by Ansible playbooks! Molecule is easy to install: pip3 install molecule
Loc: #
color: yellow. note:
This directory’s presence indicates there are one or more Molecule scenarios available for testing and development purposes. Let’s take a look at what’s inside the default scenario: molecule/ default/ INSTALL.rst converge.yml molecule.yml verify.yml
Loc: #
color: yellow. note:
The converge.yml file is an Ansible playbook, and Molecule runs it on the test environment immediately after setup is complete.
Loc: #
color: yellow. note:
The verify.yml file is another Ansible playbook, which is run after Molecule runs the converge.yml playbook and tests idempotence. It is meant for verification tests, e.g. ensuring a web service your role installs responds properly, or a certain application is configured correctly.
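A minimal verify.yml sketch (the URL and expected status are assumptions about the role under test):

```yaml
---
- name: Verify
  hosts: all
  tasks:
    - name: Ensure the web service responds successfully.
      uri:
        url: http://localhost/
        status_code: 200
```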
Loc: #
color: yellow. note: Workflow
For automation development, I usually have a workflow like the following: Create a new role with a Molecule test environment. Start working on the tasks in the role. Add a fail: task where I want to set a ‘breakpoint’, and run molecule converge. After the playbook runs and hits my fail task, log into the environment with molecule login. Explore the environment, check my configuration files, do some extra sleuthing if needed. Go back to my role, work on the rest of the role’s automation tasks. Run molecule converge again. (If there are any issues or I get my environment in a broken state, run molecule destroy to wipe away the environment then molecule converge to bring it back again.) Once I feel satisfied, run molecule test to run the full test cycle and make sure my automation works flawlessly and with idempotence.
Loc: #
color: yellow. note:
Testing a playbook with Molecule Molecule’s useful for testing more than just roles. I regularly use Molecule to test playbooks, collections, and even Kubernetes Operators!
Loc: #
color: yellow. note:
So first, run molecule init scenario to initialize a default Molecule scenario in the playbook’s folder:
Loc: #
color: yellow. note:
This playbook is pretty simple, though, and it doesn’t seem like there are any errors with it. We should debug this problem by logging into the test environment, and checking out what’s wrong with the httpd service: $ molecule login [root@instance /]# systemctl status httpd Failed to get D-Bus connection: Operation not permitted
Loc: #
color: yellow. note:
Molecule allows almost infinite flexibility, when it comes to configuring the test environment. In our case, we need to be able to test services running in Docker containers, meaning the Docker containers need to be able to run an init system (in this case, systemd).
Loc: #
color: yellow. note:
This configuration makes four changes to the default Molecule file: one change (the image), and three additions: Set the image to a dynamically-defined image that I maintain, which has Python and Ansible installed on it, as well as a properly configured systemd, so I can run services inside the container. Molecule allows bash-style variables and defaults, so I set ${MOLECULE_DISTRO:-centos8} in the image name. This will allow substitution for other distros, like Debian, later. Override the command Molecule sets for the Docker container, so the container image uses its own preconfigured COMMAND, which in this case starts the systemd init system and keeps the container running. Add a necessary volume mount to allow processes to be managed inside the container. Set the privileged flag on the container, so systemd can initialize properly.
Loc: #
color: yellow. note: **
The privileged flag should be used with care; don’t run Docker images or software you don’t trust using privileged, because that mode allows software running inside the Docker image to run as if it were running on the host machine directly, bypassing many of the security benefits of using containers. It is convenient but potentially dangerous. If necessary, consider maintaining your own testing container images if you need to run with privileged and wish to test with Docker.
Loc: #
color: yellow. note:
When developing or debugging with Molecule, you can run only the verify step using molecule verify. As I’ve stated earlier, you could put this test in the Ansible playbook itself, so the test is always run as part of your automation. But the structure of your tests may dictate adding extra validation into Molecule’s verify.yml playbook, like we did here.
Loc: #
color: yellow. note:
Running your playbook in check mode One step beyond local integration testing is running your playbook with --check, which runs the entire playbook on your live infrastructure, but without performing any changes. Instead, Ansible highlights tasks that would’ve resulted in a change to show what will happen when you actually run the playbook later.
Loc: #
color: yellow. note:
For even more detailed information about what changes would occur, add the --diff option, and Ansible will output changes that would’ve been made to your servers line-by-line. This option produces a lot of output if check mode makes a lot of changes, so use it conservatively unless you want to scroll through a lot of text!
Loc: #
color: yellow. note:
Serverspec and Testinfra tests can be run locally, via SSH, through Docker’s APIs, or through other means, without the need for an agent installed on your servers, so they’re lightweight tools for testing your infrastructure (just like Ansible is a lightweight tool for managing your infrastructure).
Loc: #
color: yellow. note:
There’s a lot of debate over whether well-written Ansible playbooks themselves (especially along with the dry-run --check mode, and Molecule for CI) are adequate for well-tested infrastructure, but many teams are more comfortable maintaining infrastructure tests separately (especially if the team is already familiar with another tool!).
Loc: #
color: yellow. note: Argument for not using or not needing testinra.
Consider this: a truly idempotent Ansible playbook is already a great testing tool if it uses Ansible’s robust core modules and fail, assert, wait_for and other tests to ensure a specific state for your server. If you use Ansible’s user module to ensure a given user exists and is in a given group, and run the same playbook with --check and get ok for the same task, isn’t that a good enough test that your server is configured correctly?
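A sketch of that playbook-as-test idea, using wait_for and assert (the port, variable name, and version string are hypothetical):

```yaml
- name: Wait until the application port is accepting connections.
  wait_for:
    port: 8080
    timeout: 30

- name: Assert the deployed version is what we expect.
  assert:
    that:
      - app_version.stdout == '1.2.3'   # app_version registered by an earlier task
```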