Dear organizations

It’s been a while since I last wrote anything. There’s a good reason for that: I’ve been busy trying to understand a new business (derivatives in general, and derivatives margin computation in particular). I left my comfort zone, so I had to learn a lot about the financial domain and its mechanisms.

I inherited a project context that needed healing at many levels. I’m not allowed to share the specifics, but more than ever, the following recommendations apply to that context.

Depending on your company and what its information system (in the broad sense) suffers from, you might want to take different actions. But I suspect the following pieces of advice won’t harm:

  • Invest in training and research

Accurately understand your process and your business. More importantly, make sure that no less than 80% of your collaborators know that business at a satisfying level: it helps everyone stay on the same page. When you understand the problem you are more likely to find wise solutions. Take the time to provide simple, accurate documentation. Provide internal training. Newcomers should first understand the business before trying to improve it with external training. Take the time to gather feedback from various sources and to analyze it: most of the time, different problems share a common cause.

  • Provide a plan to improve

Hire a smart collaborator to provide and share the common vision. Smart does not necessarily mean the most technically skilled: most of the time it’s a collaborator who can tackle problems from the right perspective and fully leverage his team members’ skills. He must understand them and be methodical. Once that’s done, he will come up with a plan, a vision and steps to reach the goal.

Provide time-framed objectives: “By September 2015 all applications should be migrated to Awesome FWK 2.0. Consequently, OLD FWK should be decommissioned by then”. It keeps people motivated toward longer-term goals.

Provide one big goal per year and steps to achieve it: “This year we will build a continuous delivery framework. The year after, all new applications will use it. The year after that, all legacy applications should use it too”.

  • Hire smart and skilled collaborators

It’s better to have a few highly skilled collaborators than an army of inexperienced ones. I can’t stress that requirement enough: it’s at the heart of your success.

Also, don’t be in a hurry to choose your collaborators. Take your time. By choosing them carefully, both candidates and teams will thank you: they are more likely to build something for the long term.

  • Make it simple

Simplicity is the ultimate sophistication (Leonardo da Vinci): it takes a lot of skill and work to transform a complex concept into something accessible to the majority. It also requires skill to extract the essence of a problem and remove peripheral noise. Think of all the complex products we use on a daily basis that remarkably abstract complexity away for the user’s sake. They require several areas of expertise (cars, ATMs, planes), yet all this knowledge is hidden from us: this is sophistication, not the other way around. Think of WhatsApp: very simple features but an impressive backend infrastructure.

Divide and conquer: break the complex problem into dozens of simpler ones. Think of Git: many simple tools that don’t do much individually but do it remarkably well.

  • Iterate

A simple yet powerful concept. Worst case scenario: you realize after a few iterations that you are on the wrong path. You can either abort or course-correct.

Many projects owe their success solely to that very principle. It’s an excellent risk management practice.

The idea is so powerful that it has been borrowed many times in the software industry:

    • Make it work, make it right, make it fast
    • Minimum viable product
    • Test, code, refactor
    • Baby steps
    • Plan do check adjust (PDCA)
  • Make it extensible and maintainable

Change and evolution are natural processes. Why fight them? It’s a common mistake to limit the cost of a software system to its “build/construction” phase. Actually, most of its cost is attributable to its evolution stage, during which it frequently needs to integrate new constraints and new features while still supporting all previous ones. Without extension in mind, evolution is really costly and painful.

Track down duplication at every level: class, package, module, application, component. You’ll get a more stable as well as cohesive information system.

Duplication weakens your reliability and your cohesion because it generates synchronization overhead and, most of the time, double verification. It really has a cost.

Separation of concerns/responsibilities is the fundamental design principle that will make your software sustainable and extensible over time.

That principle, pushed to the organizational level, led to a “not so new lately” architectural style: microservices.

  • Make it predictable

Test (not going to elaborate on that obvious one)


    • Let your IT teams concentrate on the process and the business, not on time-consuming and error-prone technical tasks: automate as often as possible. If a process is automated, the teams will have time to improve it, evolve it and refine it as close as possible to business needs.
    • People often make the mistake of thinking that automation threatens their jobs. Allow me to strongly doubt that belief. A business process cannot improve by itself, and neither can an automation process keep itself in sync with the business process. Both need human brains to refine, polish and evolve them.

Reuse your tdd skills to build an application cluster with vagrant and chef: iteration 3

This is the last of the series (iteration 1, iteration 2) ‘Reuse your tdd skills to build an application cluster with vagrant and chef’.

I created, on my repository, a fully functional example that will serve as support for this post. Let’s explore the cookbook:

It is divided into 4 recipes:

  • a data recipe that configures our data storage: creates a vm, installs mysql, creates a user and a schema
  • a search recipe: creates a vm, installs java and an elasticsearch instance
  • an appserver recipe: creates 2 vms, installs java and jetty
  • a proxy recipe: creates a vm, installs haproxy and configures the members

To reach our goal we have to understand the Chef layout.

Understanding the layout
A cookbook follows a precise directory layout. In the last post we created that cookbook manually. The knife tool makes this process less tedious.
First, install knife:

$ gem install knife-solo
$ knife cookbook create limber -C 'Louis Gueye' -I apachev2 -m '' -o site-cookbooks -r md
WARNING: No knife configuration file found
** Creating cookbook limber
** Creating README for cookbook: limber
** Creating CHANGELOG for cookbook: limber
** Creating metadata for cookbook: limber
$ tree site-cookbooks/limber
├── attributes
├── definitions
├── files
│   └── default
├── libraries
├── metadata.rb
├── providers
├── recipes
│   └── default.rb
├── resources
└── templates
    └── default

10 directories, 4 files

Knife just created a classic cookbook layout. You may not need everything, but it saves time. For example, creating a resource is clearly an advanced feature: you won’t need it for a long time. So don’t forget to delete unused directories at the end, because they mislead the cookbook’s users. Don’t try to understand everything upfront. Just follow the methodology (test first) and you’ll soon enough be confronted with the concepts you need.

Recipe names drive most file naming: say you have a data recipe, then you’ll end up with the following related files:

  • recipes/data.rb: the recipe itself
  • attributes/data.rb: its attributes
  • files/default/data_test.rb: its tests

Where do I write my tests?
To write tests for a recipe named data, we have to create site-cookbooks/limber/files/default/data_test.rb.

require 'minitest/spec'

describe_recipe 'limber::data' do

  include MiniTest::Chef::Assertions
  include MiniTest::Chef::Resources

  MiniTest::Chef::Resources.register_resource :mysql_database, :connection

  it "selecting on database 'limber' with user 'limber' should succeed" do
    resource = mysql_database('limber')
    resource.connection({:host => 'localhost',
                         :username => 'limber',
                         :password => '*mysql-limber@0'})
    provider = Chef::Provider::Database::Mysql.new(resource, nil).tap(&:load_current_resource)
    row = provider.send(:db).query('select 1 from dual where 1 = 1')
    assert row
  end
end


Implement your test in your recipe
The implementation satisfying the above test would be:

include_recipe 'limber::default'
include_recipe 'database::mysql'

mysql_connection = {:host => 'localhost',
                    :username => 'root',
                    :password => '*mysql-root@0'}

mysql_database 'limber' do
  connection mysql_connection
  action :create
end

mysql_database_user 'limber' do
  connection mysql_connection
  password '*mysql-limber@0'
  database_name 'limber'
  host 'localhost'
  privileges [:select, :update, :insert, :delete]
  action [:create, :grant]
end

mysql_database_user node['app']['db']['user'] do
  connection mysql_connection
  password '*mysql-limber@0'
  database_name 'limber'
  host '%'
  privileges [:select, :update, :insert, :delete]
  action [:grant]
end

include_recipe 'minitest-handler'

Refactor in attributes
The above implementation works but is not resilient to change. The obvious first refactoring is to extract user attributes, database attributes, etc., and reuse them in both the test and the implementation. The attributes files were created for that exact purpose.
Below, the limber/attributes/data.rb file:

include_attribute "limber::default"
include_attribute "mysql::server"

node.set['mysql']['server_root_password'] = '*mysql-root@0'
node.set['mysql']['server_debian_password'] = '*mysql-root@0'
node.set['mysql']['server_repl_password'] = '*mysql-root@0'

node.set['app']['db']['schema'] = node['app']['name']
node.set['app']['db']['user'] = node['app']['name']
node.set['app']['db']['password'] = '*mysql-limber@0'
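With those attributes in place, both the recipe and its test can reference node values instead of hard-coded literals. A sketch of what the data recipe excerpt becomes (the attribute names are the ones defined above; the rest of the recipe is unchanged):

```ruby
# recipes/data.rb (excerpt): credentials now come from attributes,
# so changing a password or schema name only touches attributes/data.rb
mysql_connection = {:host => 'localhost',
                    :username => 'root',
                    :password => node['mysql']['server_root_password']}

mysql_database node['app']['db']['schema'] do
  connection mysql_connection
  action :create
end
```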

Use templates if needed
When needed, we can take advantage of ruby’s templating mechanism to replace default values with node values. This is a really powerful customization tool.
Say you want to configure elasticsearch: create a template under templates/default/elasticsearch.yml.erb:

cluster.name: <%= node['elasticsearch'][:cluster][:name] %>
bootstrap.mlockall: <%= node['elasticsearch'][:bootstrap][:mlockall] %>
discovery.zen.ping.multicast.enabled: <%= node['elasticsearch'][:discovery][:zen][:ping][:multicast][:enabled] %>
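The interpolation itself is plain ERB, so you can see it work outside Chef. A minimal standalone illustration (the node hash below just mimics Chef’s node object, with an illustrative cluster name):

```ruby
require 'erb'

# A plain hash standing in for Chef's node object (illustrative values)
node = {'elasticsearch' => {'cluster' => {'name' => 'limber'}}}

# Same expression as in the template above, rendered against our fake node
template = ERB.new("cluster.name: <%= node['elasticsearch']['cluster']['name'] %>\n")
puts template.result(binding)
```

Running it prints `cluster.name: limber`, which is exactly what the template resource does with the real node attributes.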

Then interpolate it in your recipe (recipes/search.rb) with the template resource:

template '/etc/elasticsearch/elasticsearch.yml' do
  source 'elasticsearch.yml.erb'
  mode '0644'
  owner 'root'
  group 'root'
end

include_recipe 'minitest-handler'

Elasticsearch’s default values will be replaced.
Reuse recipes
A good practice is to favor recipe reuse. For instance, everyone knows the java recipe is not trivial, because the ecosystem is at war and the required license agreement has made the automation process a bit tricky. So don’t try to create your own java recipe: use the provided one and contribute if it doesn’t exactly fit your needs.
Another good practice is to gather all common behaviour in the default recipe and reuse it in the other ones: in our default recipe we ask Chef to install basic packages like vim, curl, tree and htop.

include_recipe 'limber::default'
include_recipe 'java'

Use foodcritic
Foodcritic is a lint tool that reveals weaknesses and syntactic code smells. It should help you write cleaner cookbooks. I did not find a way to automatically run it from vagrant, and I don’t think that would be the best approach anyway. A better one would be to include vagrant provisioning in a more generic build and, after vagrant completes, launch foodcritic. I guess it would be easily done with Rake.
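A minimal Rakefile sketch of that idea (the task names are illustrative, not from the project; foodcritic’s -f any flag turns any warning into a non-zero exit status so the build fails):

```ruby
# Rakefile (hypothetical): provision with Vagrant, then lint the cookbook.
# `sh` aborts the build if the underlying command fails.
task :provision do
  sh 'vagrant up --provision'
end

task :lint => :provision do
  sh 'bundle exec foodcritic -f any site-cookbooks/limber'
end

task :default => :lint
```

Running `rake` would then provision the machines and fail the build on any foodcritic warning.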

$ sudo gem install foodcritic
$ echo "gem 'foodcritic', '>= 2.2.0'" >> Gemfile
$ bundle install
$ bundle exec foodcritic site-cookbooks/limber
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/appserver.rb:47
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/data.rb:38
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/default.rb:1
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/default.rb:2
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/proxy.rb:3
FC007: Ensure recipe dependencies are reflected in cookbook metadata: site-cookbooks/limber/recipes/search.rb:24

Correct those warnings to make foodcritic happy.
This completes our attempt to learn Chef using our tdd skills. We saw that the tool is not trivial to understand, but once apprehended it’s flexible enough to build powerful recipes. The testing framework is really a plus.

We used vagrant to create fresh virtual machines with the virtualbox provider and used chef-solo to provision them. This is one usage, but it barely scratches the surface of configuration management and there is far more to explore:

  • use chef in server mode: our example seems quite complex just to create and provision empty machines. The server mode’s added value lies in its ability to update/upgrade those machines’ configuration at each commit.
  • use puppet instead of chef
  • use the very promising saltstack instead of chef
  • use linux containers: docker or raw lxc might be a better alternative to fat virtualization like virtualbox or vmware as they are really faster. vagrant-lxc and vagrant-docker are at an early development stage.
  • use IaaS: AWS or Digital Ocean are remote providers that also free you from worrying about network configuration, but you have to learn about the solution and its limitations (communication between machines in the same cluster, shared file system, users, etc.)

I really enjoyed (and still do) what I’ve learned so far, because agility is expanding to operations, and that is excellent news for end products: we are (some more than others) on the right path to deliver high quality software with nearly zero downtime deployments. Infrastructure as code really takes it to the next level!

J Timberman blog: full of interesting material

Seth Vargo’s blog: test oriented

Shawn Dahlen: quite complete

Robert Fox: useful tips

Bryan Berry: very inspiring


Reuse your tdd skills to build an application cluster with vagrant and chef: iteration 2

This is the 2nd iteration of our attempt to build an application cluster.

So far we’ve become familiar with vagrant, chef and its plugins. We’ve provisioned our machine with the ‘curl’ package.

The goal of this iteration is to use a more complex cookbook, configure it and test it.

There are many testing tools in Ruby, so you benefit from a whole testing ecosystem. But you’d better use a tool that has the simplest integration with Chef, and it’s even better if it is expressive. minitest-chef-handler was the clear winner. Minitest is an expressive testing framework that offers two syntaxes: spec style or classic xUnit style. It was developed by the Seattle ruby user group. David Calavera built custom chef assertions (suitable for configuration management) on top of minitest. Bryan MacLellan went one step further: he encapsulated minitest-chef-handler in a cookbook embracing the chef execution lifecycle. You can run that cookbook like any other, typically after all other cookbooks have completed. The chef run will fail if the tests don’t pass, and of course the chef context is available in your tests. Exactly what I was looking for: how convenient! OSS I love you…

Setting up minitest-handler-cookbook is like setting up any other cookbook:

  • add the cookbook location to Berksfile,
    cookbook 'minitest-handler', :git => ''
  • use the cookbook in the provision section of your Vagrantfile (remove your old implementation to note failure)
    config.vm.provision 'chef_solo' do |chef|
      chef.add_recipe 'minitest-handler'
    end

You can run vagrant to validate the minitest configuration:

$ vagrant reload
[2013-08-23T13:14:06-03:00] INFO: Chef Run complete in 30.675623573 seconds
[2013-08-23T13:14:06-03:00] INFO: Running report handlers
Run options: -v --seed 55076

# Running tests:

Finished tests in 0.001199s, 0.0000 tests/s, 0.0000 assertions/s.

0 tests, 0 assertions, 0 failures, 0 errors, 0 skips
[2013-08-23T13:14:06-03:00] INFO: Report handlers complete

We might think that we can just write some xx_test.rb files and minitest will find them. Sadly, minitest-handler-cookbook only works with cookbooks, so you first have to know a bit about them. We’ll start by creating a minimal but functional cookbook and come back to the topic later.
A cookbook is a catalog that groups related, coherent and self-contained sets of instructions (recipes).
For example the mysql cookbook is composed of independent recipes: install mysql server recipe, install mysql client recipe, create a user recipe, create a database recipe.
Every cookbook has a default recipe. So we’ll create a cookbook whose default recipe does nothing for now, write our test, note the failure, then make it pass.
The convention recommends placing custom cookbooks in a folder named site-cookbooks. Inside that directory you can create your cookbook, say myapp (because this cookbook is about creating/provisioning the ‘myapp’ cluster):

$ mkdir -p site-cookbooks/myapp/{recipes,files}
$ mkdir -p site-cookbooks/myapp/files/default/test
$ touch site-cookbooks/myapp/metadata.rb
$ touch site-cookbooks/myapp/
$ touch site-cookbooks/myapp/recipes/default.rb
$ touch site-cookbooks/myapp/files/default/test/default_test.rb

Note that the test file is named after the recipe: the test file of recipe foo.rb would be foo_test.rb. You can customize that behaviour but it’s good practice to follow the convention.

Now write your test in default_test.rb:

require 'minitest/spec'

describe_recipe 'myapp::default' do

  it "should install the curl package" do
    package('curl').must_be_installed
  end
end


The first required file in a cookbook is the metadata file:

$ sudo vi site-cookbooks/myapp/metadata.rb


name             'myapp'
maintainer       'Louis Gueye'
maintainer_email ''
license          'Apache 2.0'
description      'Installs/Configures my app'
long_description IO.read(File.join(File.dirname(__FILE__), 'README.md'))
version          '0.1.0'

recipe 'myapp::default', 'Configures my app'

depends 'curl'

Don’t forget to tell chef where your cookbook is and to call it in your Vagrantfile:

config.vm.provision 'chef_solo' do |chef|
  chef.cookbooks_path = %w(cookbooks site-cookbooks)
  chef.add_recipe 'myapp' # short form for 'myapp::default' which is the actual recipe
  chef.add_recipe 'minitest-handler'
end

And remember: any cookbook should be uploaded to your server, so referencing your cookbook in the Berksfile is mandatory:

cookbook 'myapp', :path => './site-cookbooks/myapp'

Note failure:

# Remove previous curl install
$ vagrant ssh -c 'sudo aptitude purge curl'
$ vagrant reload
[2013-08-26T06:02:25-03:00] INFO: Chef Run complete in 0.227265756 seconds
[2013-08-26T06:02:25-03:00] INFO: Running report handlers
Run options: -v --seed 35660

# Running tests:

recipe::myapp::default#test_0001_should install the curl package =
0.16 s = F

Finished tests in 0.161410s, 6.1954 tests/s, 6.1954 assertions/s.

  1) Failure:
recipe::myapp::default#test_0001_should install the curl package [/var/chef/minitest/myapp/default_test.rb:6]:
Expected package 'curl' to be installed

Finally, make it pass:

$ echo "include_recipe 'curl'" >> site-cookbooks/myapp/recipes/default.rb
$ vagrant reload
[2013-08-26T06:04:45-03:00] INFO: Running report handlers
Run options: -v --seed 4601

# Running tests:

recipe::myapp::default#test_0001_should install the curl package =
0.03 s = .

Finished tests in 0.030814s, 32.4528 tests/s, 32.4528 assertions/s.

1 tests, 1 assertions, 0 failures, 0 errors, 0 skips
[2013-08-26T06:04:45-03:00] INFO: Report handlers complete

Nice: a full tdd cycle for a package install. We’re armed to test/code/refactor chef cookbooks!

In the next post we’ll write a cookbook for the whole cluster in tdd.


Reuse your tdd skills to build an application cluster with vagrant and chef: iteration 1

Lately, at the office, my team and I wanted to improve our release process and reach a Continuous Delivery state.

So automating our application’s infrastructure (1 database, 2 appservers, 1 search engine, 1 proxy) meant:

  • automating the database migration: index, column, constraints, table, view, data, etc,
  • automating the webapp deployment: the easy part,
  • automating the search engine migration: aliases, indices, mappings, full reindex, partial reindex.
  • automating the proxy configuration: register/unregister members.

When you add a Zero Downtime (french blog entry with english references) constraint you’re basically left with 2 choices:

  • act on the existing cluster: migrate the database, then the index, then sequentially migrate the webapps. This is known as Blue Green Deployment. It works as long as versions N and N+1 are fully compatible. You must maintain at least 2 versions of your code until version N is no longer used. Only then can you clean up what remains of version N in your code.
  • act on a brand new cluster: reproduce an identical cluster, populate the new database and the new index with the current data, then switch to the new cluster (Immutable Server à la Netflix). A little easier, as you don’t have to maintain 2 versions of the codebase. You still need to write search engine and database migration scripts.

While the first method is used pretty often, the second requires more recent tools and skills (mostly ops ones), making most dev teams uncomfortable with them. Should they really be? Being able to reproduce identical environments at will really looked like magic to me just 3 weeks ago (I have a strong java background). I got dragged into the huge but totally addictive automation and configuration management ecosystem. I chose to give Chef and Vagrant a try to build the above cluster while using my TDD knowledge.

Continue reading


Spice-up your application: add elasticsearch geo feature

Lately I’ve been busy working on elasticsearch features for my company.
In the process I came accross the shiny “geo search” feature. While not being that sensitive to shiny and new technologies (don’t get me wrong, I don’l like dusty ones) I still wanted to test elasticsearch geo capabilities for further adoption.
The reference documentation on the subject is a post from Shay Banon on the elasticsearch website. Geo search is made possible by indexing coordinates that conform to the “geo_point” structure, then sorting by “_geo_distance” or filtering by “geo_distance”.
The other very useful resource was a post from Gauthier Lemoine’s blog. The author’s app could find the stations nearest to the Eiffel Tower.
The example is clear and includes feeding the index from the RATP’s open data, then a search based on the Eiffel Tower’s coordinates. It is written in python.

I added a geocoding capability that allows a user to provide any location, which is a little more convenient. I also tried to use maven’s exec plugin to set up and feed the index based on the file available on , which contains the complete list of paris stations: trains, subways and buses.
The maven build drops/creates the index, generates the stations data and bulk-inserts it, starts the webapp, then searches against a provided location like “35 avenue daumesnil, 75012 paris”.
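Before diving into the code, the shape of a distance-sorted query is worth sketching. Here it is as a plain Ruby hash serialized to JSON (the field name location and the coordinates are illustrative; they must match the geo_point field of your own mapping):

```ruby
require 'json'

# A search body sorting results by distance from a reference point,
# assuming the indexed documents carry a geo_point field named 'location'
query = {
  'sort' => [
    {'_geo_distance' => {
      'location' => {'lat' => 48.8584, 'lon' => 2.2945}, # Eiffel Tower
      'order'    => 'asc',
      'unit'     => 'km'
    }}
  ]
}

puts JSON.pretty_generate(query)
```

POSTed to the index’s _search endpoint, a body like this returns the closest stations first.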

Let’s take a look at the relevant parts of the solution
Continue reading


Safely re-index with elasticsearch

I’ve been using Elasticsearch in production for about 1 year. The component does a very good job at indexing/searching but lacks a built-in solution for continuity of service: you can’t hold incoming write requests and resume them at will.

In my team, we started thinking about a home-made solution. We really wanted to keep our webapp as robust as possible and avoid too many subsystem dependencies such as queues, webservices, etc. Elasticsearch was already a big one from an ops perspective (mainly because it’s young). We also really needed the interruption-free feature because we could not afford even a minute of service interruption.

After some time we had to face it: with no built-in solution around, we had to build one, and guess what? We could no longer avoid queues, because they really are robust components and a perfect match for a {pause/resume|consumer/producer} paradigm.

Continue reading


Selenium from scratch

Hi reader,

If you’re concerned with testing then you’ve already come across the difficult task of UI testing.
We all agree that the fat client model is different from the thin client’s, but they share common characteristics: screens and transitions between screens. Testing them is different but seems equally difficult. We will cover thin UI testing, i.e. web site pages.
Before thinking about the “how”, we’ll think about the “what”.
Every UI is a set of screens (web pages in our case). Pages display data in UI components. Displaying data as the result of an action is something we might want to test:

Given some persisted messages
When I navigate to the "list messages" page
Then the UI should display the persisted messages

Given I input the "create message" form with invalid phone number
When I submit the "create message" form
Then the UI should display "Invalid phone number" error

The other very important behaviour to test is the transitions between screens.
In an HTTP context, requesting a resource with the “GET” method might not be allowed. In a security context, accessing a resource might not be allowed at all. From the UI perspective, preventing a transition from screen A to B often translates into removing/hiding a link/button from the page:

Given I provide "admin/admin" authentication
When I navigate to the "catalog"
Then the page should display the "delete" link

Given I provide "user/user" authentication
When I navigate to the "catalog"
Then the page should not display the "delete" link

We will test 2 transitions: a successful form submission and a failure.
Continue reading