Reuse your tdd skills to build an application cluster with vagrant and chef: iteration 3

This is the last of the series (iteration 1, iteration 2) ‘Reuse your tdd skills to build an application cluster with vagrant and chef’.

I created a fully functional example on my repository that will serve as support for this post. Let’s explore the cookbook:

It is divided into 4 recipes:

  • a data recipe that configures our data storage: creates a VM, installs MySQL, creates a user and a schema
  • a search recipe: creates a VM, installs Java and an Elasticsearch instance
  • an appserver recipe: creates 2 VMs, installs Java and Jetty
  • a proxy recipe: creates a VM, installs HAProxy and configures its members
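Those four recipes map onto five VMs. As a sketch (the box name and IPs below are hypothetical; the repository’s actual Vagrantfile may differ), the cluster can be declared like this:

```ruby
# Hypothetical Vagrantfile sketch for the 5-machine cluster.
Vagrant.configure('2') do |config| = 'precise64'

  nodes = {
    'data'   => { :ip => '', :recipe => 'limber::data' },
    'search' => { :ip => '', :recipe => 'limber::search' },
    'app1'   => { :ip => '', :recipe => 'limber::appserver' },
    'app2'   => { :ip => '', :recipe => 'limber::appserver' },
    'proxy'  => { :ip => '', :recipe => 'limber::proxy' }
  }

  nodes.each do |name, opts|
    config.vm.define name do |machine| :private_network, :ip => opts[:ip]
      machine.vm.provision 'chef_solo' do |chef|
        chef.cookbooks_path = %w(cookbooks site-cookbooks)
        chef.add_recipe opts[:recipe]
      end
    end
  end
end
```

Each VM gets a private IP so the proxy can reach the appservers and the appservers can reach the data and search nodes.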

To reach our goal we have to understand the Chef cookbook layout.

Understanding the layout
A cookbook follows a precise directory layout. In the last post we created that layout manually. The knife tool makes this process less tedious.
First, install knife:

$ gem install knife-solo
$ knife cookbook create limber -C 'Louis Gueye' -I apachev2 -m '' -o site-cookbooks -r md
WARNING: No knife configuration file found
** Creating cookbook limber
** Creating README for cookbook: limber
** Creating CHANGELOG for cookbook: limber
** Creating metadata for cookbook: limber
$ tree site-cookbooks/limber
├── attributes
├── definitions
├── files
│   └── default
├── libraries
├── metadata.rb
├── providers
├── recipes
│   └── default.rb
├── resources
└── templates
    └── default

10 directories, 4 files

Knife just created a classic cookbook layout. You may not need everything, but it saves time. For example, creating a resource is clearly an advanced feature: you won’t need it for a long time. So don’t forget to delete unused directories at the end, because they mislead the cookbook’s users. Don’t try to understand everything upfront. Just follow the methodology, test first, and you’ll be confronted with the concepts you need soon enough.

The recipe name drives most naming: say you have a data recipe, then you’ll end up with the following related files:

  • recipes/data.rb: the recipe itself
  • attributes/data.rb: its attributes
  • files/default/data_test.rb: its tests

Where do I write my tests?
To write tests for a recipe named data, we have to create site-cookbooks/limber/files/default/data_test.rb.

require 'minitest/spec'

describe_recipe 'limber::data' do

  include MiniTest::Chef::Assertions
  include MiniTest::Chef::Resources

  MiniTest::Chef::Resources.register_resource :mysql_database, :connection

  it "selecting on database 'limber' with user 'limber' should succeed" do
    # resource and provider classes are provided by the 'database' cookbook
    resource ='limber')
    resource.connection({:host => 'localhost',
                         :username => 'limber',
                         :password => '*mysql-limber@0'})
    provider =, nil).tap(&:load_current_resource)
    row = provider.send(:db).query('select 1 from dual where 1 = 1')
    assert row
  end
end

Implement your test in your recipe
The implementation that satisfies the above test would be:

include_recipe 'limber::default'
include_recipe 'database::mysql'

mysql_connection = {:host => 'localhost',
                    :username => 'root',
                    :password => '*mysql-root@0'}

mysql_database 'limber' do
  connection mysql_connection
  action :create
end

mysql_database_user 'limber' do
  connection mysql_connection
  password '*mysql-limber@0'
  database_name 'limber'
  host 'localhost'
  privileges [:select, :update, :insert, :delete]
  action [:create, :grant]
end

mysql_database_user node['app']['db']['user'] do
  connection mysql_connection
  password '*mysql-limber@0'
  database_name 'limber'
  host '%'
  privileges [:select, :update, :insert, :delete]
  action [:grant]
end

include_recipe 'minitest-handler'

Refactor in attributes
The above implementation works but is not resilient to change. The obvious first refactor is to extract the user attributes, database attributes, etc., and reuse them in both test and implementation. Attributes files were created for that exact purpose.
Below, the limber/attributes/data.rb file:

include_attribute "limber::default"
include_attribute "mysql::server"

node.set['mysql']['server_root_password'] = '*mysql-root@0'
node.set['mysql']['server_debian_password'] = '*mysql-root@0'
node.set['mysql']['server_repl_password'] = '*mysql-root@0'

node.set['app']['db']['schema'] = node['app']['name']
node.set['app']['db']['user'] = node['app']['name']
node.set['app']['db']['password'] = '*mysql-limber@0'
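With these attributes in place, the data recipe can drop its literals and reference the node instead. A sketch of the refactored recipe (the test refactors the same way):

```ruby
# Refactored data recipe: literals replaced by the attributes defined above.
include_recipe 'limber::default'
include_recipe 'database::mysql'

mysql_connection = {:host => 'localhost',
                    :username => 'root',
                    :password => node['mysql']['server_root_password']}

mysql_database node['app']['db']['schema'] do
  connection mysql_connection
  action :create
end

mysql_database_user node['app']['db']['user'] do
  connection mysql_connection
  password node['app']['db']['password']
  database_name node['app']['db']['schema']
  host 'localhost'
  privileges [:select, :update, :insert, :delete]
  action [:create, :grant]
end
```

Now renaming the application, schema or user means touching a single attributes file.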

Use templates if needed
When needed, we can take advantage of ruby templating mechanism to replace default values with node values. This is a really powerful customization tool.
Say you want to configure Elasticsearch: create a template under templates/default/elasticsearch.yml.erb: <%= node['elasticsearch'][:cluster][:name] %>
bootstrap.mlockall: <%= node['elasticsearch'][:bootstrap][:mlockall] %> <%= node['elasticsearch'][:discovery][:zen][:ping][:multicast][:enabled] %>

Then interpolate it in your recipe (recipes/search.rb) with the template resource:

template '/etc/elasticsearch/elasticsearch.yml' do
  source 'elasticsearch.yml.erb'
  mode '0644'
  owner 'root'
  group 'root'
end

include_recipe 'minitest-handler'

Elasticsearch’s default values will then be replaced with the node values.
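The template can be tested like any other resource. A sketch of files/default/search_test.rb, assuming minitest-chef-handler’s file assertions:

```ruby
require 'minitest/spec'

describe_recipe 'limber::search' do

  it "should render the cluster name into the elasticsearch config" do
    # must_include checks the rendered file content (minitest-chef-handler)
    file('/etc/elasticsearch/elasticsearch.yml').
      must_include node['elasticsearch'][:cluster][:name]
  end
end
```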
Reuse recipes
A good practice is to favor recipe reuse. For instance, everyone knows the java recipe is not trivial: the ecosystem is at war, and the required license agreement has made the automation process a bit tricky. So don’t try to create your own java recipe; use the provided one and contribute if it doesn’t exactly fit your needs.
Another good practice is to gather all common behaviour in the default recipe and reuse it in the other ones: in our default recipe we ask Chef to install basic packages like vim, curl, tree and htop.

include_recipe 'limber::default'
include_recipe 'java'

Use foodcritic
Foodcritic is a lint tool that reveals weaknesses and syntactic code smells. It should help you write cleaner cookbooks. I did not find a way to automatically run it from Vagrant, and I don’t think that’s the best approach anyway. A better one would be to include Vagrant provisioning in a more generic build and, after Vagrant completes, launch foodcritic. I guess it would be easily done with Rake.

$ sudo gem install foodcritic
$ echo "gem 'foodcritic', '>= 2.2.0'" >> Gemfile
$ bundle install
$ bundle exec foodcritic site-cookbooks/limber
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/appserver.rb:47
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/data.rb:38
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/default.rb:1
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/default.rb:2
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/proxy.rb:3
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/search.rb:24

Correct those warnings to make foodcritic happy.
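FC007 means a recipe calls include_recipe on a cookbook that metadata.rb never declares. The fix is a depends line per external cookbook; the list below is inferred from the recipes in this series, so adjust it to what your recipes actually include:

```ruby
# site-cookbooks/limber/metadata.rb (excerpt): declare external dependencies
depends 'database'
depends 'java'
depends 'haproxy'
depends 'minitest-handler'
```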
This completes our attempt to learn Chef using our TDD skills. We saw that the tool is not trivial to understand, but once grasped it’s flexible enough to build powerful recipes. The testing framework is really a plus.

We used Vagrant to create fresh virtual machines with the VirtualBox provider and chef-solo to provision them. This is one usage, but it barely scratches the surface of configuration management, and there is far more to explore:

  • use Chef in server mode: our example seems quite complex just to create and provision a few machines. The server mode’s added value lies in its ability to update/upgrade those machines’ configuration at each commit.
  • use puppet instead of chef
  • use the very promising saltstack instead of chef
  • use Linux containers: Docker or raw LXC might be a better alternative to fat virtual machines like VirtualBox or VMware, as they are really faster. vagrant-lxc and vagrant-docker are at an early development stage
  • use IaaS: AWS or Digital Ocean are remote providers that also free you from worrying about the network configuration, but you have to learn about the solution and its limitations (communication between machines in the same cluster, shared file system, users, etc.)

I really enjoyed (and still do) what I’ve learned so far, because agility expands to operations, and that is excellent news for end products: we are (some more than others) on the right path to deliver high-quality software with nearly zero-downtime deployments. Infrastructure as code really takes it to the next level!

J Timberman blog: full of interesting material

Seth Vargo’s blog: test oriented

Shawn Dahlen: quite complete

Robert Fox: useful tips

Bryan Berry: very inspiring

Reuse your tdd skills to build an application cluster with vagrant and chef: iteration 2

This is the 2nd iteration of our attempt to build an application cluster.

So far we’ve got familiar with vagrant, chef and its plugins. We’ve provisioned our machine with the ‘curl’ package.

The goal of this iteration is to use a more complex cookbook, configure it and test it.

There are many testing tools in Ruby, so you benefit from the whole testing ecosystem. But you’d better use a tool that has the simplest integration with Chef, and it’s even better if it is expressive. minitest-chef-handler was the clear winner.

Minitest, developed by the Seattle Ruby user group, is an expressive testing framework that allows two syntaxes: spec-style or classic assertions. David Calavera built custom Chef assertions (suitable for configuration management) on top of minitest. Bryan MacLellan went one step further: he encapsulated minitest-chef-handler into a cookbook embracing the Chef execution lifecycle. You can run that cookbook like any other, typically after all other cookbooks have completed. The Chef run will fail if the tests don’t pass, and of course the Chef context is available in your tests. Exactly what I was looking for: how convenient! OSS, I love you…
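To make the two-syntaxes point concrete, here is a tiny standalone illustration, unrelated to Chef (note that recent minitest versions wrap spec expectations with `_()`):

```ruby
require 'minitest/autorun'

# Classic, assertion-style minitest
class ArithmeticTest < Minitest::Test
  def test_addition
    assert_equal 4, 2 + 2
  end
end

# Spec-style minitest: the same check with a different syntax
describe 'arithmetic' do
  it 'adds numbers' do
    _(2 + 2).must_equal 4
  end
end
```

Running the file with ruby should report 2 runs, 2 assertions and 0 failures.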

Setting up minitest-handler-cookbook is like any other cookbook:

  • add the cookbook location to Berksfile,
    cookbook 'minitest-handler', :git => ''
  • use the cookbook in the provision section of your Vagrantfile (remove your old implementation to note the failure)
    config.vm.provision 'chef_solo' do |chef|
      chef.add_recipe 'minitest-handler'
    end

You can run vagrant to validate the minitest configuration:

$ vagrant reload
[2013-08-23T13:14:06-03:00] INFO: Chef Run complete in 30.675623573 seconds
[2013-08-23T13:14:06-03:00] INFO: Running report handlers
Run options: -v --seed 55076

# Running tests:

Finished tests in 0.001199s, 0.0000 tests/s, 0.0000 assertions/s.

0 tests, 0 assertions, 0 failures, 0 errors, 0 skips
[2013-08-23T13:14:06-03:00] INFO: Report handlers complete

We might think that we can just write some xx_test.rb files and minitest will find them. Sadly, minitest-handler-cookbook only works with cookbooks. You first have to know a bit about cookbooks. So we’ll first create a minimal but functional cookbook; we’ll come back to the topic later.
A cookbook is a catalog that groups related, coherent and self-contained sets of instructions (recipes).
For example the mysql cookbook is composed of independent recipes: install mysql server recipe, install mysql client recipe, create a user recipe, create a database recipe.
Every cookbook has a default recipe. So we’ll create one with a default recipe which will do nothing for now, write our test, note the failure, then make it pass.
The convention recommends that custom cookbooks be placed in a folder named site-cookbooks. Inside that directory you can create your cookbook, say myapp (because this cookbook is about creating/provisioning the ‘myapp’ cluster):

$ mkdir -p site-cookbooks/myapp/{recipes,files}
$ mkdir -p site-cookbooks/myapp/files/default/test
$ touch site-cookbooks/myapp/metadata.rb
$ touch site-cookbooks/myapp/
$ touch site-cookbooks/myapp/recipes/default.rb
$ touch site-cookbooks/myapp/files/default/test/default_test.rb

Note that the test file is named after the recipe: the test file of the recipe foo.rb would be foo_test.rb. You can customize that behaviour, but it’s often good practice to follow the convention.

Now write your test in default_test.rb:

require 'minitest/spec'

describe_recipe 'myapp::default' do

  it "should install the curl package" do
    package('curl').must_be_installed
  end
end

The first required file in a cookbook is the metadata file:

$ sudo vi site-cookbooks/myapp/metadata.rb


name             'myapp'
maintainer       'Louis Gueye'
maintainer_email ''
license          'Apache 2.0'
description      'Installs/Configures my app'
long_description, ''))
version          '0.1.0'

recipe 'myapp::default', 'Configures my app'

depends 'curl'

Don’t forget to tell Chef where your cookbook is and to call your cookbook in your Vagrantfile:

config.vm.provision 'chef_solo' do |chef|
  chef.cookbooks_path = %w(cookbooks site-cookbooks)
  chef.add_recipe 'myapp' # short form for 'myapp::default' which is the actual recipe
  chef.add_recipe 'minitest-handler'
end

And remember: any cookbook should be uploaded to your server, so referencing your cookbook in the Berksfile is mandatory:

cookbook 'myapp', :path => './site-cookbooks/myapp'

Note failure:

# Remove previous curl install
$ vagrant ssh -c 'sudo aptitude purge curl'
$ vagrant reload
[2013-08-26T06:02:25-03:00] INFO: Chef Run complete in 0.227265756 seconds
[2013-08-26T06:02:25-03:00] INFO: Running report handlers
Run options: -v --seed 35660

# Running tests:

recipe::myapp::default#test_0001_should install the curl package =
0.16 s = F

Finished tests in 0.161410s, 6.1954 tests/s, 6.1954 assertions/s.

  1) Failure:
recipe::myapp::default#test_0001_should install the curl package [/var/chef/minitest/myapp/default_test.rb:6]:
Expected package 'curl' to be installed

Finally, make it pass:

$ echo "include_recipe 'curl'" >> site-cookbooks/myapp/recipes/default.rb
$ vagrant reload
[2013-08-26T06:04:45-03:00] INFO: Running report handlers
Run options: -v --seed 4601

# Running tests:

recipe::myapp::default#test_0001_should install the curl package =
0.03 s = .

Finished tests in 0.030814s, 32.4528 tests/s, 32.4528 assertions/s.

1 tests, 1 assertions, 0 failures, 0 errors, 0 skips
[2013-08-26T06:04:45-03:00] INFO: Report handlers complete

Nice: a full TDD cycle for a package install. We’re armed to test/code/refactor Chef cookbooks!

In the next post we’ll write a cookbook for the whole cluster in tdd.

Reuse your tdd skills to build an application cluster with vagrant and chef: iteration 1

Lately, at the office, my team and I were willing to improve our release process and reach Continuous Delivery state.

So automating our application’s infrastructure (1 database, 2 appservers, 1 search engine, 1 proxy) meant:

  • automating the database migration: index, column, constraints, table, view, data, etc,
  • automating the webapp deployment: the easy part,
  • automating the search engine migration: aliases, indices, mappings, full reindex, partial reindex.
  • automating the proxy configuration: register/unregister members.

When you add a Zero Downtime (french blog entry with english references) constraint you’re basically left with 2 choices:

  • act on the existing cluster: migrate the database, then the index, then sequentially migrate the webapps. This technique is known as Blue-Green Deployment. It works as long as version N+1 and version N are fully compatible. You must maintain at least 2 versions of your code until version N is no longer used. Only then will you be able to clean up what remains of version N in your code.
  • act on a brand new cluster: reproduce an identical cluster, populate the new database and the new index with the current data, then switch to the new cluster (Immutable Server à la Netflix). A little easier, as you don’t have to maintain 2 versions of the codebase. You still need to write search-engine and database migration scripts.

While the first method is used pretty often, the second requires more recent tools and skills (mostly ops ones), making most dev teams uncomfortable with them. Should they really be? Being able to reproduce identical environments at will really looked like magic to me just 3 weeks ago (I have a strong Java background). I got dragged into the huge but totally addictive automation and configuration management ecosystem. I chose to give Chef and Vagrant a try to build the above cluster while using my TDD knowledge.

Continue reading

Spice-up your application: add elasticsearch geo feature

Lately I’ve been busy working on elasticsearch features for my company.
In the process I came across the shiny “geo search” feature. While not being that sensitive to shiny new technologies (don’t get me wrong, I don’t like dusty ones either) I still wanted to test Elasticsearch’s geo capabilities for further adoption.
The reference documentation on the subject is a post from Shay Banon on the elasticsearch website. Geo search is made possible by indexing coordinates that conform to the “geo_point” structure, then sorting by “_geo_distance” or filtering by “geo_distance”.
The other very useful resource was that post from Gauthier Lemoine’s blog. The author’s app could find the nearest stations to the Eiffel Tower. The example is clear and includes feeding the index from the RATP’s open data, then a search based on the Eiffel Tower coordinates. It is written in Python.
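To get an intuition for what “_geo_distance” computes under the hood, here is a small standalone Ruby sketch of the haversine great-circle distance (the coordinates below are approximate and purely illustrative):

```ruby
# Haversine great-circle distance: the kind of computation behind
# Elasticsearch's _geo_distance sorting. Coordinates are approximate.
EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2)
  to_rad = ->(deg) { deg * Math::PI / 180 }
  dlat = to_rad.(lat2 - lat1)
  dlon = to_rad.(lon2 - lon1)
  a = Math.sin(dlat / 2)**2 +
      Math.cos(to_rad.(lat1)) * Math.cos(to_rad.(lat2)) * Math.sin(dlon / 2)**2
  2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(a))
end

eiffel_tower = [48.8584, 2.2945]
chatelet     = [48.8583, 2.3470]

puts format('%.2f km', haversine_km(*eiffel_tower, *chatelet))
```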

I added a geocoding capability that allows a user to provide any location, which is a little more convenient. I also tried to use Maven’s exec plugin to set up/feed the index based on the file, available online, which contains the complete list of Paris stations: trains, subways and buses.
The Maven build drops/creates the index, generates the stations data and bulk-inserts them, starts the webapp, then searches against a provided location like “35 avenue daumesnil, 75012 paris”.

Let’s take a look at the relevant parts of the solution
Continue reading

Safely re-index with elasticsearch

I’ve been using Elasticsearch in production for about 1 year. The component does a very good job at indexing/searching but lacks a built-in solution for continuity of service: you can’t hold incoming write requests and resume them at will.

In my team, we started thinking about a home-made solution. We really wanted to keep our webapp as robust as possible and avoid too many sub-system dependencies such as queues, webservices, etc. Elasticsearch was already a big one from an ops perspective (mainly because it’s young). We also really needed the interruption-free feature, because we could not afford even a minute of service interruption.

After some time we had to face it: with no built-in solution around, we had to build one, and guess what? We could no longer avoid queues, because they really are robust components and a perfect match for a {pause/resume|consumer/producer} paradigm.

Continue reading

Selenium from scratch

Hi reader,

If you’re concerned with testing then you’ve already come across the difficult task of UI testing.
We all agree that the fat-client model is different from the thin client’s, but they share common characteristics: screens and transitions between screens. Testing them is different but seems equally difficult. We will cover thin-UI testing, i.e. web site pages.
Before thinking about the “how”, we’ll think of the “what”.
Every UI is a set of screens (web pages in our case). Pages display data in UI components. Displaying data as the result of an action is something we might want to test:

Given some persisted messages
When I navigate to the "list messages" page
Then the UI should display the persisted messages

Given I input the "create message" form with invalid phone number
When I submit the "create message" form
Then the UI should display "Invalid phone number" error

The other very important behaviour to test is the transitions between screens.
In an HTTP context, requesting a resource with the “GET” method might not be allowed. In a security context, accessing a resource might not be allowed at all. From the UI perspective, preventing a transition from screen A to B often translates into removing/hiding a link/button from the page:

Given I provide "admin/admin" authentication
When I navigate to the "catalog"
Then the page should display the "delete" link

Given I provide "user/user" authentication
When I navigate to the "catalog"
Then the page should not display the "delete" link

We will test 2 transitions: a successful form submission and a failure.
Continue reading

Android from scratch, part 2: use android-maven-plugin

Hi people,

I wasn’t sure I’d write about Android again, because the first time I tried the platform I identified 3 knowledge levels:
– academic: knowing what Android is without ever building anything on the platform;
– basic: reading a lot about the topic and sorting information to finally extract the minimum/essential (there is a lot of incomplete/inaccurate information, unusable ‘as is’) required to build and deploy an app on a device (cf. previous post);
– professional: involves build skills (stable, repeatable, customized per env), development skills (fat-client knowledge is a plus), release/distribution skills (branding, publishing on Google Play, etc.)

That last step is meant for professionals, but I’ll try to present the building part. With the right tools we can reach a pretty nice result.
Continue reading