Wednesday, 18 September 2013

Installing puppetmaster with puppet

While working on a demo for a Puppet masterclass, I decided to use the puppet agent itself to install and configure puppet-server and puppetdb. This post explains the process.

Demo VM

I used Vagrant and VirtualBox to deploy a VM for the demo. Once both are installed, run the following:

mkdir demopuppet
cd demopuppet
vagrant box add centos_i386 http://developer.nrel.gov/downloads/vagrant-boxes/CentOS-6.4-i386-v20130427.box
vagrant init centos_i386
vagrant up


You can find more boxes here; pick one with the puppet agent already installed.

Yum repositories

I will explain the code by functionality.
First of all, the yum repositories. My first step was to install the epel and puppetlabs release RPMs by hand, and then I realized I could use puppet itself to describe them.

puppet resource yumrepo epel >> puppetinstall.pp
puppet resource yumrepo puppetlabs-products >> puppetinstall.pp
puppet resource yumrepo puppetlabs-deps >> puppetinstall.pp

Looking at that output, I realized that for a demo solution the content had to be edited to disable gpgcheck (installing keys with puppet is a little more complicated if you want a single .pp file as the result). So the final result was:

#Repositories needed to install puppetmaster
yumrepo { 'epel':
  descr          => 'Extra Packages for Enterprise Linux 6 - $basearch',
  enabled        => '1',
  failovermethod => 'priority',
  gpgcheck       => '0',
  mirrorlist     => 'https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch',
}
yumrepo { 'puppetlabs-deps':
  baseurl  => 'http://yum.puppetlabs.com/el/6/dependencies/$basearch',
  descr    => 'Puppet Labs Dependencies El 6 - $basearch',
  enabled  => '1',
  gpgcheck => '0',
}
yumrepo { 'puppetlabs-products':
  baseurl  => 'http://yum.puppetlabs.com/el/6/products/$basearch',
  descr    => 'Puppet Labs Products El 6 - $basearch',
  enabled  => '1',
  gpgcheck => '0',
}

But I didn't like a solution that lowers quality, so although yumrepo is good to use, I think the best solution is the following one:

package {'epel-release':
  ensure => 'installed',
  source => 'http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm',
  provider => 'rpm',
}
package {'puppetlabs-release':
  ensure => 'installed',
  source => 'http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm',
  provider => 'rpm',
}

Packages 

All these packages have to be installed after the repositories are set up, so we will declare a dependency. Vim and screen are not really needed, but I love them :-).

package{
  ['vim-enhanced',
   'screen',
   'puppet-server',
   'puppetdb',
   'puppetdb-terminus',
  ]:
    ensure => installed,
    require => [ Package['puppetlabs-release'], Package['epel-release'] ],
}

Puppet configurations

Remember that this is a demo procedure, so these changes skip some good (but manual) practices. The first one is enabling certificate autosigning, which lets any node authenticate against the puppetmaster without human intervention.

# Enables autosign node certificates
file {'/etc/puppet/autosign.conf':
  ensure  => present,
  content => "*\n",
  require => Package['puppet-server'],
  notify  => Service['puppetmaster'],
}

The next one is setting a new hostname to work with the defaults. The puppetmaster and the nodes look for a host called 'puppet', so if we want to work with defaults, we have to set the correct hostname. I searched a little and found the following (lost the source, sorry):

# Set the master hostname to puppet.localdomain
file { '/etc/hostname':
  ensure  => present,
  owner   => 'root',
  group   => 'root',
  mode    => '0644',
  content => "puppet.localdomain\n",
  notify  => Exec['set-hostname'],
}
# Apply the new hostname only when it differs from the running one
exec { 'set-hostname':
  command => '/bin/hostname -F /etc/hostname',
  unless  => '/usr/bin/test `hostname` = `/bin/cat /etc/hostname`',
}

# Set some more names to localhost
host { 'localhost':
  ensure       => 'absent',
}
host { 'localhost4':
  ensure       => 'present',
  host_aliases => ['puppet', 'puppet.localdomain', 'puppetdb', 'localhost', 'localhost.localdomain', 'localhost4.localdomain4'],
  ip           => '127.0.0.1',
  target       => '/etc/hosts',
  require      => Exec['set-hostname'],
}
host { 'localhost6':
  ensure       => 'present',
  host_aliases => ['puppet', 'puppet.localdomain', 'puppetdb', 'localhost', 'localhost.localdomain',  'localhost6.localdomain6'],
  ip           => '::1',
  target       => '/etc/hosts',
  require      => Exec['set-hostname'],
}

Puppet Service

With the configuration in place, we can now set up the service.

# Configure puppetmaster to be started
service {'puppetmaster':
  ensure => running,
  enable => true,
  require => Exec['set-hostname'],
}

PuppetDB configurations

As we can see here, installing puppetdb is not as easy as 'yum install': we need to manually create the files routes.yaml and puppetdb.conf and add some settings to puppet.conf. I have also seen that on a first install, for some strange reason, HTTPS access is not set up, so I had to add an exec to ensure this configuration.

#PuppetDB service configuration
service {'puppetdb':
  ensure => running,
  enable => true,
  require => Service['puppetmaster']
}
$puppetdb_conf = '[main]
server = puppet
'
file{"/etc/puppet/puppetdb.conf":
  content => $puppetdb_conf,
  require => Package['puppetdb-terminus'],
  notify  => Service['puppetmaster'],
}
$puppetdb_route = '---
master:
  facts:
    terminus: puppetdb
    cache: yaml
'
file{"/etc/puppet/routes.yaml":
  content => $puppetdb_route,
  require => Package['puppetdb-terminus'],
  notify  => Service['puppetmaster'],
}
#Workaround since there is no lens in Augeas for puppet.conf
exec{"config-puppetdb":
  command => 'echo "[master]" >> /etc/puppet/puppet.conf; echo "  storeconfigs = true" >> /etc/puppet/puppet.conf; echo "  storeconfigs_backend = puppetdb" >> /etc/puppet/puppet.conf;',
  unless => 'grep "\[master\]" /etc/puppet/puppet.conf',
  path => ['/bin'],
  require => Package['puppetdb-terminus'],
  notify  => Service['puppetmaster'],
}
#Workaround for the first install problem
exec{"puppetdb-ssl-setup":
  command => 'puppetdb-ssl-setup',
  path => ['/bin', '/sbin', '/usr/sbin', '/usr/bin'],
  require => Package['puppetdb'],
  notify => Service['puppetdb'],
}

Execution

So, once all this code is in place, we just need to run this command:

sudo puppet apply /vagrant/puppetinstall.pp

And let the magic flow...
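Before applying for real, it may be worth checking the manifest first. The following is only a sketch: it assumes the manifest sits at /vagrant/puppetinstall.pp as in the command above, and the `check_manifest` helper name is made up for illustration. `puppet parser validate` checks syntax only, and `puppet apply --noop` reports what would change without changing it.

```shell
#!/bin/bash
# Assumed path: the manifest lives in the shared /vagrant folder.
MANIFEST="${MANIFEST:-/vagrant/puppetinstall.pp}"

# Hypothetical helper: syntax check first, then a dry run.
check_manifest() {
  # Syntax check only; nothing on the system is touched
  puppet parser validate "$1" || return 1
  # Dry run: lists the resources that would change
  sudo puppet apply --noop "$1"
}

# Run the check only where puppet is actually installed
if command -v puppet >/dev/null 2>&1; then
  check_manifest "$MANIFEST"
fi
```

If the dry run looks reasonable, the real `puppet apply` can follow.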

Demo

After all this, the real demo can be a script like this:

#!/bin/bash
sudo puppet module install puppetlabs-apache
sudo puppet module install puppetlabs-firewall
sudo touch /etc/puppet/manifests/site.pp
sudo chmod a+w /etc/puppet/manifests/site.pp
cat > /etc/puppet/manifests/site.pp <<EOF
node 'puppet' {
  require firewall
  #PuppetDB access and apache standard ports
  firewall{ '100 allow apache and puppetdb access':
    port   => [80,8080,443],
    proto  => tcp,
    action => accept,
  }
  class{ 'apache': }
}
EOF
sudo chmod 644 /etc/puppet/manifests/site.pp
sudo puppet agent -t -d

And let people see your puppetdb instance working (show them some API calls) and your brand new apache working.
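The API calls mentioned above could look like the sketch below. It assumes a PuppetDB 1.x release answering on plain HTTP port 8080 with the v2 query API; other releases may use a different path or port. The helper names `pdb_endpoint` and `pdb_query` are made up for illustration.

```shell
#!/bin/bash
# Assumptions: PuppetDB 1.x on plain HTTP port 8080, v2 query API.
PUPPETDB_URL="${PUPPETDB_URL:-http://localhost:8080}"

# Hypothetical helper: build the full URL for an endpoint such as v2/nodes
pdb_endpoint() {
  echo "${PUPPETDB_URL}/$1"
}

# Hypothetical helper: query an endpoint and print the JSON answer
pdb_query() {
  curl -s -H 'Accept: application/json' "$(pdb_endpoint "$1")"
}

# Usage, on the master once puppetdb is running:
#   pdb_query v2/nodes                   # nodes known to PuppetDB
#   pdb_query v2/facts/operatingsystem   # one fact across all nodes
```

Nothing is queried until one of the usage lines is run by hand, so the sketch is safe to source on a machine without PuppetDB.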

Here is my gist with both files:

Thursday, 12 September 2013

Thoughts about Git-flow

Some time ago, I started to use a workflow similar to Git-flow (based on it but a little more complicated). The other day I had the opportunity to audit release management procedures at a client, and I took it as a chance to think through many questions about git-flow and how to apply it in an organization. They were in the process of migrating their model to git-flow and, of course, some questions arose.

In my opinion, Git-flow is a very good model for git projects, especially when you are developing a product with fairly fixed release timings. But when you are working on a service, the model may need to be questioned and extended with some more flows if you want a more continuous delivery workflow.

Bug definition

One thing that is usually a headache to define is bug categories: not every bug belongs to the same category. Using priority as the category, some bugs are very critical and need a fast response and deployment. Everyone should agree to tag those as hotfixes, and they have to be deployed as fast as possible.
But for less critical bugs the response has to be different: firefighter developers should be a necessary evil, not a desirable one.

As a reminder, bug priority should be defined using two elements:

  • Urgency: Normally defined by project manager/product owner, it's sometimes a political decision.
  • Severity: Normally defined by QA/development. Can be defined this way.

There is also another classification method (compatible with previous one):

  • Development bugs: Bugs detected in the next version of the product.
  • Production bugs: Bugs detected in the production version.

Git-flow bug definition

The Git-flow model defines bugs only by this second classification:

  • Production bug fixes are called hotfixes; they are branched from the production branch and merged back into it (and propagated to the development branch).
  • Development bugs are fixed directly on the release branch (and propagated to the development and production branches when the release is delivered).
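In plain git commands, the hotfix path could be sketched as follows. Branch names follow the usual git-flow defaults (master, develop, hotfix/...); the repository, the file and the version number are throwaway examples, not taken from any real project.

```shell
#!/bin/bash
# Throwaway repository to play with; nothing here touches a real project.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.email "demo@example.com"
git config user.name "Demo"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is master
echo "v1" > app.txt
git add app.txt
git commit -q -m "Release 1.0"
git branch develop

# Production bug: branch from master and fix it there
git checkout -q -b hotfix/1.0.1 master
echo "v1 fixed" > app.txt
git commit -q -am "Fix production bug"

# Merge the fix into production and tag the new version
git checkout -q master
git merge -q --no-ff -m "Merge hotfix/1.0.1" hotfix/1.0.1
git tag 1.0.1

# Propagate the same fix to the development branch
git checkout -q develop
git merge -q --no-ff -m "Merge hotfix/1.0.1 into develop" hotfix/1.0.1
git branch -q -d hotfix/1.0.1
```

After the script runs, both master and develop contain the fix and the hotfix branch is gone, which is the whole point of the hotfix flow.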

Limits

As you can imagine, not all production bugs have the same priority or complexity. Using only this definition, you may not be very comfortable applying a fast deployment workflow to some of the bugs you need to resolve. I found these questions interesting:
  • Do we have to define two release procedures for hotfix integration depending on its priority?
  • Do we have to deliver a new release on every hotfix integration?
  • So, is it a good idea to treat all production bugs as hotfixes?
  • How can we manage bugs of different priority with git-flow?
As you can see, the git-flow model alone will not be enough for everyone.

Git-flow and continuous delivery

If you are a DevOps fan, you are surely familiar with the concept of continuous delivery. Defining it in an extremely crude and simple way, we can say that we want to deliver our product as fast as automation permits.

With this objective in mind, we can see that git-flow offers a very good workflow to accomplish it. But again, we need to ask a few more questions which need an answer:
  • When is it desirable to have a "candidate" version packaged?
  • How can we reflect our own project status flow in Git?
  • How can we derive versions from git-flow to put into packages?

The answers depend mostly on the organization, but a generic answer is probably possible. I will try to give some answers based on my experience in the near future.

Monday, 9 September 2013

MongoDB Architectures

When you talk about database clustering, the first things you look for are an architecture and a failover solution. Many services offer some kind of arbiter to manage them, whose main purpose is deciding which instance has to be the master.
Take Redis, for example: since version 2.4 there is a Sentinel component which monitors HA instances. MySQL also needs an external script to monitor master failure and promote a slave. So when you think about MongoDB clustering, it is easy to assume that an arbiter is always needed for automatic failover promotion. This is a HUGE error...

Let's talk first about MongoDB replication. You can set two kinds of replication:
- Replica Sets: Standard master-slaves architecture.
- Sharded clusters: A data partitioning solution with Replica Sets.
MongoDB gives an automatic solution for failover promotion: all members know each other, and when a cluster starts an election is held to decide who has to be the primary. The whole procedure is well documented. This is good, but also a bit of a headache, because if you want to be sure some instance wins, you will have to deploy an odd number of instances.
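The odd-number rule follows from majority arithmetic: a primary needs the votes of more than half of the voting members. A small sketch of that arithmetic (the function names are made up for illustration):

```shell
#!/bin/bash
# Votes needed to elect a primary: a strict majority of the voting members.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Members that can fail while a majority can still be reached.
fault_tolerance() {
  echo $(( $1 - $1 / 2 - 1 ))
}

for members in 2 3 4 5; do
  echo "members=$members majority=$(majority "$members") tolerated_failures=$(fault_tolerance "$members")"
done
```

Two members tolerate no failure at all, and four members tolerate exactly as many failures as three, which is why an even-sized set buys nothing in terms of fault tolerance.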

So let's look at some possible architectures and analyze them.

Think of this minimal HA architecture:


Obviously, you will need an external component to decide when to promote: if the primary fails, the secondary cannot hold an election by itself, because it has no one to vote with. Your system will go down.

To avoid this, we can add a new Secondary to act as an arbiter:



Now you have a real HA solution without configuring anything extra. But... if you choose this solution, you are probably not the one paying. ARE YOU SERIOUS?? Paying for a full machine that only pings the other two instances... Please, be a little more careful with expenses.



OK, so you need HA, you care about costs and want to minimize instances, but your DB has heavy read needs and requires some more secondary instances... so you decide to add a new instance:



Wait, we saw that an odd number of members is needed, and now we have four. Will it work? Well, as you can see, it seems it will work, but the arbiter will be of little use. So, being a little HA paranoid, why keep a 3-server schema and not get better fault tolerance? Try this:



From here on, add as many secondaries as you want and play with your arbiters to keep an odd set.

Nice, now you know how to scale a read-heavy cluster, but what if we need better write performance? Some will say... MIGRATE TO CASSANDRA!! Really? You are probably reading this because you have done your research, compared some NoSQL solutions, and chosen MongoDB, so... find a way with MongoDB!!

If you need write performance, write to multiple instances. We talked earlier about sharding (be sure that you REALLY need it). See this architecture:



As you can see, we are using the same number of servers, but complexity has grown a lot. We have new components; let me explain them:
- mongos: Router to the DB; your app has to access the cluster through them.
- CSx: Config Server, a mongod instance that only holds metadata about the cluster.

So, as you can see, you can write to all servers, but of course each one will hold different data. Each document can only be written to one server and read from two of them, but if you do a good job with shard keys, you will probably multiply your write performance.

If you choose a sharding solution, go through some tutorials, work with your development team, and be sure to partition your data smartly.
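To get a feeling for why the shard key matters, here is a toy model only. It is not how mongos works internally: mongos routes by shard key ranges (chunks) recorded on the config servers, while this sketch simply reduces a checksum of the key modulo an assumed number of shards. The `shard_for` name, the shard count and the user names are all made up for illustration.

```shell
#!/bin/bash
# Toy model: a checksum of the shard key picks the shard that owns the document.
SHARDS=2   # assumed number of shards

shard_for() {
  # cksum prints "<crc> <bytes>"; keep the CRC and reduce it modulo SHARDS
  local crc
  crc=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo $(( crc % SHARDS ))
}

# Each key always maps to the same single shard
for user in alice bob carol dave; do
  echo "user=$user -> shard$(shard_for "$user")"
done
```

A key whose values spread evenly keeps every shard busy with writes; a key where most documents share the same value would send nearly everything to one shard and lose the benefit.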