Scaling automation with ansible-pull

Ansible is a wonderful piece of software for automatically configuring your systems. By default, Ansible works in push mode.


Ansible Push

That means that from your own box, using only SSH and Python, you can configure your fleet of machines.


Ansible is imperative. You define tasks in your playbooks and roles, and they run serially on the remote machines. Each task first checks whether it needs to run and skips the action if it does not. And although we can use conditionals to skip actions, every task still performs its checks. For that reason Ansible can seem slow compared to other configuration tools. Ansible runs the tasks serially, but in pseudo-parallel against the remote servers to increase speed. But sometimes you need to gather facts (gather_facts), and that costs execution time. There are solutions for caching the Ansible facts in Redis (an in-memory key-value store), but even then you need to find a workaround to speed up your deployments.
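
As a rough illustration of the points above (a sketch, not something taken from my own playbooks), a play can switch off automatic fact gathering, collect only the minimal fact subset when it needs one, and still pay for evaluating a `when` condition on every run. The `vim` package and the Debian condition are assumptions made up for the example.

```yaml
# Hypothetical playbook sketch; the package and the condition are illustrative only.
- hosts: all
  gather_facts: false            # skip the implicit fact-gathering step

  tasks:
    - name: Gather only the minimal fact subset
      setup:
        gather_subset:
          - "!all"               # keep the minimal set, drop hardware/network/virtual facts

    - name: Install vim on Debian-family hosts only
      apt:
        name: vim
        state: present
      # The condition is still evaluated on every host, on every run.
      when: ansible_facts['os_family'] == "Debian"
```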

But there is another way: pull mode!


Useful Reading Materials

To learn more on the subject, you can start by reading these two articles on ansible-pull.


Pull Mode

So here is how it looks:

Ansible Pull


You will first notice that your Ansible repository has moved from your local machine to an online git repository. For me, this is GitLab. As my git repo is private, I have created a read-only, time-limited Deploy Token.

In that scenario, our (ephemeral or not) VMs pull their Ansible configuration from the git repo and run the tasks locally. I usually build my infrastructure with Terraform by HashiCorp and make use of cloud-init for the initial configuration.


The tail of my user-data.yml looks pretty much like this:

# Install packages
packages:
  - ansible

# Run ansible-pull
runcmd:
  - ansible-pull -U 


You can either create a playbook named after the hostname of the remote server, e.g. node1.yml, or use local.yml, the default playbook name.
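
For illustration, the sketch below shows how an explicit playbook name can be passed to ansible-pull from cloud-init instead of relying on the hostname or local.yml lookup. The repository URL, the `main` branch and the `node1.yml` file name are placeholders I made up for the example, not values from my setup.

```yaml
#cloud-config
# Sketch only: URL, branch and playbook name are hypothetical placeholders.
packages:
  - ansible
  - git

runcmd:
  # -U: repository to clone
  # -C: branch, tag or commit to check out
  # last argument: playbook to run from that checkout
  - ansible-pull -U https://git.example.com/ops/ansible.git -C main node1.yml
```

If the last argument is omitted, ansible-pull looks for a playbook named after the host's fully-qualified domain name, then after its short hostname, and finally falls back to local.yml.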

Here is an example that also puts ansible-pull into a cron entry. This is very useful because it checks for changes in the git repo every 15 minutes and runs Ansible again.

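- hosts: localhost
  tasks:

    # "*/15" fires every 15 minutes; a plain "15" would fire once an hour.
    # cron runs jobs with /bin/sh, so use the portable redirect, not "&>".
    - name: Ensure ansible-pull is running every 15 minutes
      cron:
        name: "ansible-pull"
        minute: "*/15"
        job: "ansible-pull -U > /dev/null 2>&1"

    # Add the line to the file, creating the file if it does not exist
    - name: Create a custom local vimrc file
      lineinfile:
        path: /etc/vim/vimrc.local
        line: 'set modeline'
        create: yes

    # cloud-init is only needed for the first boot
    - name: Remove "cloud-init" package
      apt:
        name: "cloud-init"
        purge: yes
        state: absent

    - name: Remove useless packages from the cache
      apt:
        autoclean: yes

    - name: Remove dependencies that are no longer required
      apt:
        autoremove: yes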

# vim: sts=2 sw=2 ts=2 et
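
A possible variation of the cron task, sketched here under assumptions of my own: with `-o` (`--only-if-changed`), ansible-pull runs the playbook only when the repository has changed, and the output is appended to a log file rather than discarded. The repository URL and the log path are placeholders.

```yaml
# Hypothetical variant of the cron task; URL and log path are placeholders.
- hosts: localhost
  tasks:
    - name: Run ansible-pull every 15 minutes, only when the repo changed
      cron:
        name: "ansible-pull"
        minute: "*/15"
        job: "ansible-pull -o -U https://git.example.com/ops/ansible.git >> /var/log/ansible-pull.log 2>&1"
```

Keeping a log makes it easier to see why a node did not converge, at the cost of having to rotate the file.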