Puppet or Chef?

Posted: 2012-10-28
Category: DevOps

Back in the UK at PHPNE this May I saw an awesome talk from Ian Chilton, who explained very simply why using Vagrant for your development environments was a good idea. He briefly mentioned server provisioning but didn't get fully into it, and suggested we go out and play with Puppet and Chef to see which fit our needs.

I started off using Puppet to build up development boxes, with the use-case of testing how PyroCMS worked with different combinations of PHP 5.3 and PHP 5.4, Apache, Nginx, PHP-FPM, Postgres, etc. without having to utterly bastardise the built-in OSX LAMP stack or try to make MAMP do something useful.

So, if you're hopping on the "devops" train and would like to learn a server provisioning tool, which should you learn?

Syntax

Puppet uses a proprietary custom language, while Chef uses Ruby. This straight off the bat means a lot of Ruby developers prefer Chef as they can build out in a familiar language, but the Puppet language is so incredibly simple it should not be seen as a barrier to entry.

Update: Turns out there is a Ruby DSL for Puppet too, released in November 2010. I missed that entirely.

Often the same code written in Puppet or Chef will look cleaner in Puppet, but in Chef you can do some pretty crazy stuff. As it's using Ruby you can use programming structures like loops and if statements, which is not possible with Puppet.
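To make the "it's just Ruby" point concrete, here is a minimal sketch in plain Ruby. The `package` method below is a toy stand-in written for this example so it runs without Chef installed; it only records what would be installed. In a real recipe `package` is one of Chef's built-in resources. The package names and the `use_nginx` flag are made up for illustration.

```ruby
# Toy stand-in for Chef's `package` resource so this runs in plain Ruby.
# It only records the request; nothing is actually installed.
INSTALLED = []

def package(name)
  INSTALLED << name
end

# Hypothetical switch, purely for illustration.
use_nginx = true

# Because a recipe is just Ruby, an ordinary loop works.
%w[php5-cli php5-curl php5-pgsql].each do |pkg|
  package pkg
end

# ...and so does an ordinary conditional.
if use_nginx
  package 'nginx'
else
  package 'apache2'
end

puts INSTALLED.inspect
```

The loop and the conditional are plain Ruby rather than anything Chef-specific, which is exactly why Ruby developers tend to feel at home.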

Manifest / Cookbook Flow

Manifests (Puppet) and Cookbooks (Chef) are essentially the same thing. You write in these documents what you want to ensure is installed, created, running, etc. and, through the glory of idempotence, these can be run as many times as you like without barfing all over your server.

While the two ideas are similar, they have one major difference. Puppet will evaluate all of the manifest files then run it in whatever order it deems best to ensure requirements are met. Chef will simply do things in whatever order it's defined, running top to bottom through each of your cookbooks - which I'm pretty sure run alphabetically for each node. For many programmers Chef is more logical, while the Puppet approach takes a little mental acrobatics to understand at first but makes a lot of sense when you get the hang of it.
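The difference in ordering can be sketched in a few lines of plain Ruby. This is only an illustration of the idea, not how either tool is implemented: the resource names and the `RESOURCES` hash are hypothetical, and the topological sort from Ruby's standard `tsort` library stands in for Puppet's dependency resolution.

```ruby
require 'tsort'

# Hypothetical resources in the order they were written down.
# Each key maps to the resources it requires.
RESOURCES = {
  'service[nginx]'   => ['file[nginx.conf]'],
  'file[nginx.conf]' => ['package[nginx]'],
  'package[nginx]'   => []
}.freeze

# Chef-style: run top to bottom, exactly as written.
def declared_order(resources)
  resources.keys
end

# Puppet-style: work out an order that satisfies every requirement.
def dependency_order(resources)
  each_node  = ->(&block) { resources.each_key(&block) }
  each_child = ->(name, &block) { resources.fetch(name).each(&block) }
  TSort.tsort(each_node, each_child)
end

puts declared_order(RESOURCES).inspect
puts dependency_order(RESOURCES).inspect
```

With the resources deliberately written in the "wrong" order, the declared order would try to start the service before the package exists, while the dependency order puts the package first, then the config file, then the service.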

I myself am not sure which I prefer here. The "do X then Y" approach does make debugging a very simple process, but you need to be careful that you're actually creating idempotent recipes - which is the whole point of this.
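If idempotence is new to you, here is a rough sketch of what it means in practice, in plain Ruby rather than either tool. The `ensure_line` helper and the hosts entry are made up for this example; the point is that the step checks the current state first and only makes a change when one is needed.

```ruby
require 'tmpdir'

# Append `line` to the file at `path` only if it is not already there.
# Returns true when the file was changed, false when nothing needed doing.
def ensure_line(path, line)
  existing = File.exist?(path) ? File.readlines(path, chomp: true) : []
  return false if existing.include?(line)

  File.open(path, 'a') { |f| f.puts(line) }
  true
end

# Run the same step three times against a throwaway file.
Dir.mktmpdir do |dir|
  hosts = File.join(dir, 'hosts')
  3.times { ensure_line(hosts, '127.0.0.1 dev.example.com') }
  puts File.read(hosts)
end
```

A blind append would leave three copies of the line after three runs. Checking first means the second and third runs change nothing, which is the property you want from every step in a recipe.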

Modules

Both systems have a f**kload of modules available for software like PHP, MySQL, RabbitMQ, Postgres, whatever. Most of the modules are third-party and have been configured to work nicely with Debian, Ubuntu, Fedora, RedHat, etc using several layers of abstraction.

The only real difference here is that Chef has a community repository where people can submit these modules and show ratings and installation counts, whereas Puppet relies on a Google search and probably a relatively out of date GitHub repository. I've had to send multiple pull requests to various popular Puppet modules just to get them to actually install on an Ubuntu 12.04 box.

Update: Yet another error on my part. Puppet has a Forge, which is the same as the Chef community repository. That said, I don't remember ever landing on it when looking for modules. Searching "Puppet PHP" puts a result in 6th place, with GitHub being all of the results before it. Sorry about that.

Environments

Most projects have multiple environments, unless you're a cowboy coder who enjoys life on the edge and does all of his coding using vim through an SSH tunnel.

Puppet doesn't really provide any sort of solution for this. I'm not hating on Puppet, I just don't think this was ever really one of its goals in life.

Update: Puppet does have environments, but you won't get that working with Vagrant. The Puppet / Vagrant integration only allows for the "manifests" and "modules" folders, but environments sit in their own folder at that level - meaning you can only work with overrides for your other environments like Staging and Production. This might be ok for you, but it should be watched out for.

Chef offers "environments", which are little Ruby files where you give each environment a name and maybe a description. I have dev.rb, stag.rb and prod.rb. Prod and staging don't really do anything, but my dev environment has a few overridden attributes:

name "dev"
description "The development environment"

override_attributes({
    "api" => {
        "db_host" => "localhost",
        "server_name" => "dev.api.example.re",
        "docroot" => "/vagrant/www/api/public",
        "config_dir" => "/vagrant/www/api/fuel/app/config"
    },
    "frontend" => {
        "db_host" => "localhost",
        "server_name" => "dev.example.com",
        "docroot" => "/vagrant/www/frontend/public",
        "config_dir" => "/vagrant/www/frontend/fuel/app/config"
    }
})

ENV['FUEL_ENV'] = 'development'

Awesome sauce.
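As a rough mental model of what those overrides do, here is a sketch in plain Ruby: a cookbook's default attributes deep-merged with the environment's overrides. Chef's real attribute precedence has more levels than this, and the `deep_merge` helper and the default values below are assumptions made up for illustration.

```ruby
# Recursively merge `overrides` into `defaults`; override values win.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Hypothetical cookbook defaults.
defaults = {
  'api' => {
    'db_host' => 'db.internal',
    'docroot' => '/var/www/api/public',
    'port'    => 80
  }
}

# A subset of the dev environment overrides from the file above.
dev_overrides = {
  'api' => {
    'db_host' => 'localhost',
    'docroot' => '/vagrant/www/api/public'
  }
}

attributes = deep_merge(defaults, dev_overrides)
puts attributes['api'].inspect
```

Anything the environment doesn't mention (the made-up `port` here) keeps its default, so each environment file only needs to list what is different.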

Multiple Tools

Puppet & Chef Solo

Both Puppet and Chef Solo run on a single machine and that is that. Chef Solo and Puppet are for the same use case, but the difference is really this: Puppet's language is simplistic and often quite elegant, whereas Chef is a pile of really powerful functions, methods and arrays.

Puppet Master & Chef Server

This at first seems a little crazy, but with Chef Server you have your local "workstation" which has a command line tool called "knife". You use knife to move your cookbooks, recipes, roles, environments, etc. up onto the "Chef Server". This Chef Server can be your own EC2 instance, VPS, whatever running Chef - or you can pay to use Opscode's servers. I have friends running their own Chef Server to avoid paying the monthly subscription, but no servers anywhere are free. For now I'm using the service and it's only gone down the once - but that was on Friday while half of the internet seemed to be broken so I can't really blame them.

I haven't played with Puppet Master at all; I only found out it existed when it got linked up in the comments on Reddit. Sure, it's on the homepage, but it's below the fold and that is a huge site with a lot of information. Also this.

Encrypted Data Bags

There are lots of funky little extras in Chef Server, one of which is called Encrypted Data Bags. Instead of storing API credentials and passwords in git where anyone in the team can see them, you can put them in a secure location. Why is that a big deal? Well, maybe that guy you fired still has a copy of the code at home and wants to play a "hilarious prank" on you.

I expect Puppet can do this too, but it looks like you need to do some manual work to get it going. I didn't need to use this feature for my Puppet project.

Knife

Knife is amazing. I don't mean that in the over-used "everything good is amazing" sort of way, I mean I was genuinely amazed by this tool. Knife works with Chef Server to manage your servers. Here are some cool commands:

$ knife ec2 server create \
    -S projectname -i ~/.ssh/projectname.pem \
    -G www,default \
    -x ubuntu \
    -d ubuntu12.04-gems \
    -E prod \
    -I ami-82fa58eb \
    -f m1.small \
    -r "role[base],role[frontend]" 

That makes me a frontend production server with the exact specs that I need. I walk off and get a coffee, come back and that server is sat there ready to go.

$ knife ssh -E stag "roles:base" "sudo chef-client"

Update my entire staging environment, with the latest code pulled from GitHub and whatever software I need to run.

$ knife ec2 server list

Gets a list of all servers on the account. This is amazeballs when you have a whole bunch of servers hiding behind load balancers and you want to know what's what.

Summary

Server provisioning is brilliant. While some people think it might be overcomplicating setting up a server, it's amazingly useful and I would be having a hard time doing my new job properly without it.

It means I can provision my local VM to be identical to my staging servers, use the exact same versions of software, keep my workstation clean, package and bundle all of the tech for the whole company into one repo with a few submodules, distribute that same code to multiple servers and set up freelancers no matter what OS they have in minutes instead of hours or days, all thanks to some crazy Ruby code.

As for "Puppet or Chef" there is no real answer, they are two different tools that do the same thing in slightly different ways, to make a better environment for yourself than just running (W|X|M)AMP and assuming your code will work when it's deployed. Ideally you'd be provisioning your production site too (and I know it's not always possible). Provisioning a large network of sites with Chef Server does seem to work very nicely, especially if you are using EC2 with the knife plugin. Give them both a whirl and see what tickles your fancy.

Comments

Lowe

2012-10-29

Some minor corrections;

Puppet's language isn't proprietary, as the whole thing is Apache (2) licensed. Puppet also supports environments. Puppet does have if statements (but not for/while in the Puppet language, as it is declarative). Puppet also has a Ruby-based DSL for writing manifests (which with version 3 is actually a first-class citizen in its own right). Puppet has the Forge for community and Puppet Labs maintained modules. Puppet has a master client set up as well. Just as chef.

Honestly, you don't seem to have done that much research into Puppet, at all.

2012-10-29

Lowe: For me proprietary == custom, which it is. It's a new language built by them, but glad to hear it's open source.

As for everything else. Shitballs. Environments I had no idea.

http://docs.puppetlabs.com/guides/environment.html

Puppet Forge: I seem to remember using it, but it was about 4 months ago when I used Puppet and Google didn't help me. The SEO must be awful, because not one single time while I was building my stack with Puppet did I end up on their site.

http://forge.puppetlabs.com/

If statements.... kinda, but in the same way that you'd pass a callback as an array value in Ruby - it's not an ACTUAL if statement. I don't feel like I was wrong on that.

"Puppet has a master client set up as well. Just as chef."

Cool, where?

I'm happy to be corrected on a bunch of this stuff, as it helps me learn. I didn't just Google Puppet a bit then write a blog, I have used it for months.

A lot of the documentation could do with improving on both Chef and Puppet, as while there are plenty of people who know a lot about both, pretty much every developer I talk to about Puppet or Chef is still in the dark. I'll make some corrections to help everyone learn. Let me know if it's enough.

Jtimberman

2012-10-29

Chef environments are primarily used to "pin" cookbook version constraints. So you can say "In production, we'll use app cookbook version 1, but in dev we use 1.2". Certainly setting environment specific attributes is useful, but the version constraints are one of the best features of environments IMO.

  • http://wiki.opscode.com/display/chef/Environments
  • http://wiki.opscode.com/display/chef/Version+Constraints

John Fuller

2012-10-30

And for the majority who don't need all the features of Puppet and Chef, take 15 minutes to learn and setup Ansible and move on with your life.

http://ansible.cc/

Tom Jones

2012-10-30

At work we use puppet to build our 20 or so AWS environments. Our configs are 4000+ lines spread over 77 files. It is an absolute nightmare to work with. Myself and colleagues regularly have to spend 3 days rerunning builds to try to debug them. The indeterminate run order means that a particular bug may occur on one run, then on the next run there may be 6 bugs. Run again and it drops to 2. Debugging is close to impossible when it takes half an hour to provision a server and each time steps are performed in a different order.

I've tinkered with chef at home, and while I'm sure it will have its difficulties, the single fact that each time it runs it runs deterministically makes it vastly easier to work with.

As far as I'm concerned, puppet is deprecated.

Aleksey Tsalolikhin

2012-10-30

You might want to also take a look at CFEngine, which inspired a lot of the ideas found in Puppet and Chef.

See "Relative Origins of Puppet, Chef and CFEngine" http://verticalsysadmin.com/blog/uncategorized/relative-origins-of-cfengine-chef-and-puppet

Using any configuration management system properly is better than not using configuration management at all.

CFEngine is being used to manage hundreds of thousands of servers in large companies. It is designed to be highly scalable.

Alex North-keys

2012-10-30

Fix ticket http://tickets.opscode.com/browse/CHEF-13 (deployment verification a.k.a. --noop support) and Chef would be useful - actually there's finally progress being made here as of 2012-07, 3.5 years after the ticket was created. Get Puppet to be deterministically, predictably ordered in production and it would be useful, and for those steeped in Puppet I mean that the graph-based model should resolve to predictable, not just repeatable, procedural sequence. "Properly expressed dependencies in Puppet provide ordering where you care to have it" is not enough.

Doesn't mean they aren't powerful and flexible, it just means I don't really want to use them in production.

Essentially, any deployment system that exposes a language that allows the user to write directly to the filesystem will find it difficult to solidly separate non-modifying verification of a system's configuration from making a modifying run. And if you can't safely verify the deployed system's config, your tools aren't mature. Having an in-configurator kernel of operations that can be controlled with respect to modify/no-modify is critical, and Chef just doesn't have it. Further, solving CHEF-13 essentially could mean making direct Ruby use a thing of the past - and I don't see its developer culture as being able to take that step.

Puppet at least has got the create-a-declarative-language thing in place, so they're ahead of Chef in theory even if using Puppet can break down in practice. Don't even get me started on whether Puppet can even complete a run in the default number of passes or how to fully define dependencies to the point it shouldn't matter - just tell me it can now precompute the full ordering and complete in one pass in a predictable order, or don't tell me at all. Unless Puppet has improved to that point, it's only useful in development and testing, not in production. Maybe it already has improved that far - I'll check in on it again in a few years after the memory has faded a bit more.

Of the two, I admit I enjoyed Chef more than Puppet, but it's Puppet that clearly has an easier path to fixing its problems.

Dag Wieers

2012-10-30

You should definitely check out Ansible (http://ansible.cc/). It's the new kid on the block, but given it's only 8 months old, it's a viable alternative with some compelling and unique features. Multi-tier orchestration, complex inventory and flexible provisioning.

Steven T

2012-11-04

I have used Puppet in production but have recently started to use Ansible and it is so much easier to learn and implement.