dijous, 18 de juny del 2015

Using Docker as a Jenkins slave orchestrator

After several years working with Jenkins and enjoying its plugins, I am well aware that there are other, more modern CI tools that are prettier and more pipeline oriented (see Travis CI, GoCD, and a bit of googling will turn up more), but Jenkins gives me something important: in-house control, thousands of plugins and programmatic customization.

After several months working on consolidating the roughly 100 standalone Jenkins servers my organization runs into a single platform, we finally came up with a Jenkins architecture based on Docker slaves that gives us a good offering for our teams. Let me explain a little about what we needed and how we finally implemented it.

Requirements

  • Offer a common CI platform with LDAP integration
  • Each group has unique requirements
  • No one should be able to modify another group's work
  • Build parallelization has to be on demand and scalable
  • Each group has to be able to get started on its own, without human interaction (that is, without asking an administrator to create the infrastructure for them)

First approaches

Virtual Machine slaves

All our teams already know how to work with Jenkins and have active projects on it, which was an important reason not to be disruptive and switch to another platform. So the first approach was to deploy Jenkins on a virtual server with Puppet and create different kinds of slaves (Linux, Windows, macOS) on virtual machines, also configured with Puppet. This wasn't bad, but as you probably know, starting VMs on demand is slow, even though the vSphere plugin makes it easy.

Plugins needed

  • vSphere plugin

Problems

  • Potential collisions: giving each project control (via Puppet) over what gets installed on the slaves led to conflicts between them.
  • Security risks: slaves running several executors were shared between projects, and a malicious user could delete all workspace content, including other projects'.

Predefined dockers on independent Virtual Machines

After hearing about Docker, I didn't hesitate to give it a try: I started learning it and prepared a private registry in our official binary repository manager, Artifactory. At first it seemed hard to use it for Jenkins, since nobody in the company knew Docker yet while almost everyone knew Puppet. But once we hit the problems described above, it looked like a desirable solution, perhaps using Puppet to configure the containers. We started with CoreOS images deployed on VMware, but we ran into problems such as Docker not starting again after an automatic update, or issues with some npm packages on AUFS. So we finally decided to use Project Atomic images.
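
To give an idea of the moving parts, here is a sketch of how a team-specific slave image travels through that private registry; the registry hostname and image name below are placeholders, not our real ones:

# Log in to the private Docker registry hosted in Artifactory (hostname is a placeholder)
docker login docker.artifactory.example.com
# Build a team-specific slave image and push it so any Docker host can pull it
docker build -t docker.artifactory.example.com/ci/team-a-slave:latest .
docker push docker.artifactory.example.com/ci/team-a-slave:latest
# On a slave host (CoreOS or Project Atomic), pull the image before running it
docker pull docker.artifactory.example.com/ci/team-a-slave:latest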

Plugins needed

  • Docker plugin

Problems

  • The Docker plugin is not as functional as we'd like (we tested 0.9.3): preconfigured slaves work fine, but it fails when trying to automate builds or add templates programmatically.
  • Puppet is not the right way to configure containers; it is possible, but difficult through a master, and complete over-engineering for our needs.
  • Configuring containers for each Docker daemon is not easy because of the plugin's problems, and scaling out is not exactly smooth.

So we were on the right track, but we needed to solve some problems and improve the architecture, until we discovered a couple of gems.

Final (or not, we are DevOps) solution

After seeing how much good Docker could do for us, we started to improve the Jenkins experience, dividing the portal per project team and trying to simplify their start. So we used these plugins:
  • Authorization plugins: LDAP, Matrix-based security
  • Folder plugin: the most effective way to isolate jobs between groups
  • Groovy plugin: when a plugin is not good enough, the best way to modify Jenkins itself
  • DSL plugin: the best possible way to create predefined job structures like wizards
  • Plugins to access our other services (Jira, Artifactory, Sonar, etc.)
  • Some other useful stuff like builders, archiving, etc.
With them we can offer:

  • A wizard for each team to create its own project folders and get exclusive permissions on them, since teams have no global permissions (a usage sketch follows this list).
  • A wizard to create a structure per component (Git repo), which includes:
    • a job to build the component's own container
    • a preconfigured pipeline with a job for each phase, each one triggering the next
  • Some Groovy scripts to work around the docker-plugin problems
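
To give an idea of how teams use the first wizard, here is a sketch of triggering it remotely through the Jenkins REST API; the job name and parameter are hypothetical, and teams can of course simply use "Build with Parameters" in the UI instead:

# Trigger the (hypothetical) project-folder wizard job via the Jenkins REST API.
# Replace the URL, credentials, job name and parameter with your own values;
# depending on your CSRF settings you may also need to send a crumb header.
JENKINS_URL="https://jenkins.example.com"
curl -fsS -X POST \
     --user "myuser:my-api-token" \
     "$JENKINS_URL/job/wizard-create-project/buildWithParameters?TEAM=mygroup"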


And so on... I'm finishing the article





dijous, 10 d’octubre del 2013

Tip: Solve problems with rpmbuild between RH5 and RH6

Recently I have been building RPMs on CentOS 6.4 to be installed on RHEL 5.8, and due to the different rpm versions (4.4.2 vs 4.8.0), the packages created weren't compatible. After a lot of googling without finding a solution to my problem, I'm posting it here to share it.

If you ever find an RPM with these dependencies (rpm -qp package.rpm -R):
rpmlib(CompressedFileNames) <= 3.0.4-1
rpmlib(FileDigests) <= 4.6.0-1
rpmlib(PayloadFilesHavePrefix) <= 4.0-1
rpmlib(PayloadIsXz) <= 5.2-1


You won't be able to install it on RHEL 5.

Solving the problem is as easy as erasing one package, redhat-rpm-config, on the machine where rpmbuild is installed.
This is because the build configuration provided by that package produces RPMs that are only compatible with rpm 4.8.0, since the FileDigests and PayloadIsXz features are newer.
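
In practice, the check and the fix look roughly like this on the CentOS 6 build host (the spec file name is just a placeholder):

# Check which rpmlib features a built package requires (same as rpm -qp ... -R)
rpm -qp --requires mypackage.rpm | grep rpmlib
# Remove the package that forces the rpm 4.8.0 defaults, then rebuild
sudo yum remove -y redhat-rpm-config
rpmbuild -ba mypackage.spec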


dimecres, 18 de setembre del 2013

Installing puppetmaster with puppet

While working on a demo for a Puppet masterclass, I decided to use a Puppet agent to install and configure puppet-server and puppetdb. Here is the process.

Demo VM

I used Vagrant and VirtualBox to deploy a VM for the demo. Once both are installed, run the following:

mkdir demopuppet
cd demopuppet
vagrant box add centos_i386 http://developer.nrel.gov/downloads/vagrant-boxes/CentOS-6.4-i386-v20130427.box
vagrant init centos_i386
vagrant up


You can find more boxes here; get one with the Puppet agent already installed.

Yum repositories

I will explain the code by functionality.
First of all, the Yum repositories. The first thing I did was install the EPEL and Puppet Labs RPMs by hand, and then I realized I could use Puppet itself to install them.

puppet resource yumrepo epel >> puppetinstall.pp
puppet resource yumrepo puppetlabs-products >> puppetinstall.pp
puppet resource yumrepo puppetlabs-deps >> puppetinstall.pp

Then, looking at that output, I realized that for a demo solution the content had to be edited to skip gpgcheck (installing GPG keys with Puppet is a little more complicated if you only want a .pp file as output). So the final result was:

#Repositories needed to install puppetmaster
yumrepo { 'epel':
  descr          => 'Extra Packages for Enterprise Linux 6 - $basearch',
  enabled        => '1',
  failovermethod => 'priority',
  gpgcheck       => '0',
  mirrorlist     => 'https://mirrors.fedoraproject.org/metalink?repo=epel-6&arch=$basearch',
}
yumrepo { 'puppetlabs-deps':
  baseurl  => 'http://yum.puppetlabs.com/el/6/dependencies/$basearch',
  descr    => 'Puppet Labs Dependencies El 6 - $basearch',
  enabled  => '1',
  gpgcheck => '0',
}
yumrepo { 'puppetlabs-products':
  baseurl  => 'http://yum.puppetlabs.com/el/6/products/$basearch',
  descr    => 'Puppet Labs Products El 6 - $basearch',
  enabled  => '1',
  gpgcheck => '0',
}

But I didn't like a solution that reduces quality, so although using yumrepo is fine, I think the best solution is the following:

package {'epel-release':
  ensure => 'installed',
  source => 'http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm',
  provider => 'rpm',
}
package {'puppetlabs-release':
  ensure => 'installed',
  source => 'http://yum.puppetlabs.com/el/6/products/i386/puppetlabs-release-6-7.noarch.rpm',
  provider => 'rpm',
}

Packages 

All these packages have to be installed after the repositories are set up, so we will declare a dependency. Vim and screen are not really needed, but I love them :-).

package{
  ['vim-enhanced',
   'screen',
   'puppet-server',
   'puppetdb',
   'puppetdb-terminus',
  ]:
    ensure => installed,
    require => [ Package['puppetlabs-release'], Package['epel-release'] ],
}

Puppet configurations

Remember that we are working on a demo procedure, so these changes are there to skip some good (but manual) practices. The first one is enabling certificate autosigning, to allow any node to authenticate against the puppetmaster without human intervention.

# Enables autosign node certificates
file {'/etc/puppet/autosign.conf':
  ensure  => present,
  content => "*\n",
  require => Package['puppet-server'],
  notify  => Service['puppetmaster'],
}

The next step is setting a new hostname to work with the defaults. The puppetmaster and the nodes look for a host named 'puppet', so if we want to work with the defaults, we have to set the correct hostname. I searched a little and found the following (source lost, sorry):

#Set master hostname into puppet.localdomain
file { "/etc/hostname":
    ensure => present,
    owner => root,
    group => root,
    mode => 644,
    content => "puppet.localdomain\n",
    notify => Exec["set-hostname"],
  }
  exec { "set-hostname":
    command => "/bin/hostname -F /etc/hostname",
    unless => "/usr/bin/test `hostname` = `/bin/cat /etc/hostname`",
  }

# Set some more names to localhost
host { 'localhost':
  ensure => 'absent',
}
host { 'localhost4':
  ensure       => 'present',
  host_aliases => ['puppet', 'puppet.localdomain', 'puppetdb', 'localhost', 'localhost.localdomain', 'localhost4.localdomain4'],
  ip           => '127.0.0.1',
  target       => '/etc/hosts',
  require      => Exec['set-hostname'],
}
host { 'localhost6':
  ensure       => 'present',
  host_aliases => ['puppet', 'puppet.localdomain', 'puppetdb', 'localhost', 'localhost.localdomain', 'localhost6.localdomain6'],
  ip           => '::1',
  target       => '/etc/hosts',
  require      => Exec['set-hostname'],
}

Puppet Service

We've got the configuration sorted out; now we can set up the service.

# Configure puppetmaster to be started
service {'puppetmaster':
  ensure => running,
  enable => true,
  require => Exec['set-hostname'],
}

PuppetDB configurations

As we can see here, installing PuppetDB is not as easy as a 'yum install': we need to manually create the files routes.yaml and puppetdb.conf, and add some settings to puppet.conf. I have also seen that, on a first install, for some strange reason HTTPS access is not set up, so I had to add an exec to make sure that configuration is applied.

#PuppetDB service configuration
service {'puppetdb':
  ensure => running,
  enable => true,
  require => Service['puppetmaster']
}
$puppetdb_conf = '[main]
server = puppet
'
file{"/etc/puppet/puppetdb.conf":
  content => $puppetdb_conf,
  require => Package['puppetdb-terminus'],
  notify  => Service['puppetmaster'],
}
$puppetdb_route = '---
master:
  facts:
    terminus: puppetdb
    cache: yaml
'
file{"/etc/puppet/routes.yaml":
  content => $puppetdb_route,
  require => Package['puppetdb-terminus'],
  notify  => Service['puppetmaster'],
}
#Workaround since there is no Augeas lens for puppet.conf
exec{"config-puppetdb":
  command => 'echo "[master]" >> /etc/puppet/puppet.conf; echo "  storeconfigs = true" >> /etc/puppet/puppet.conf; echo "  storeconfigs_backend = puppetdb" >> /etc/puppet/puppet.conf;',
  unless => 'grep "\[master\]" /etc/puppet/puppet.conf',
  path => ['/bin'],
  require => Package['puppetdb-terminus'],
  notify  => Service['puppetmaster'],
}
#Workaround for the first-install problem
exec{"puppetdb-ssl-setup":
  command => 'puppetdb-ssl-setup',
  path => ['/bin', '/sbin', '/usr/sbin', '/usr/bin'],
  require => Package['puppetdb'],
  notify => Service['puppetdb'],
}

Execution

So, once we finally have all this code, we just need to run this command:

sudo puppet apply /vagrant/puppetinstall.pp

And let the magic flow...
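
As a quick sanity check (assuming the default ports), you can verify that everything came up:

# Verify that both services are running
sudo service puppetmaster status
sudo service puppetdb status
# PuppetDB answers HTTP on port 8080 by default
curl -s -o /dev/null http://localhost:8080/ && echo "PuppetDB is answering on port 8080"
# And run the agent once against the local master
sudo puppet agent -t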

Demo

After all this, the actual demo can be a script like this:

#!/bin/bash
sudo puppet module install puppetlabs-apache
sudo puppet module install puppetlabs-firewall
sudo touch /etc/puppet/manifests/site.pp
sudo chmod a+w /etc/puppet/manifests/site.pp
cat > /etc/puppet/manifests/site.pp <<EOF
node 'puppet' {
  require firewall
  #PuppetDB access and apache standard ports
  firewall{ '100 allow apache and puppetdb access':
    port   => [80,8080,443],
    proto  => tcp,
    action => accept,
  }
  class{ 'apache': }
}
EOF
sudo chmod 644 /etc/puppet/manifests/site.pp
sudo puppet agent -t -d

And let people see your PuppetDB instance working (show them some API calls) and your brand-new Apache running.

Here is my gist with both files:

dijous, 12 de setembre del 2013

Thoughts about Git-flow

Some time ago I started to use a workflow similar to Git-flow (based on it, but a little more complicated). The other day, when I had the opportunity to audit the release management procedures of a client, I got the chance to think about many questions regarding Git-flow and how to apply it in an organization. They were in the process of migrating their model to Git-flow and, of course, some questions arose.

In my opinion, Git-flow is a very good model for Git projects, especially when you are developing a product, with fairly static release timing. But when you are working on a service, this model may need to be questioned and extended with some more flows if you want a more continuous delivery workflow.

Bug definition

One thing that is usually a headache to define is bug categories: not every bug falls in the same category. Using priority as the category, some bugs are critical and need a fast response and immediate deployment. Everyone will agree to tag those as hotfixes and deploy them as fast as possible.
But for less critical bugs the response has to be different: firefighter developers should be a necessary evil, not a desirable one.

Just as a reminder, bug priority should be defined using a couple of elements:

  • Urgency: normally defined by the project manager/product owner; it is sometimes a political decision.
  • Severity: normally defined by QA/development. It can be defined this way.

There is also another classification method (compatible with the previous one):

  • Development bugs: bugs detected in the next, not yet released, version of the product.
  • Production bugs: bugs detected in the production version.

Git-flow bug definition

The Git-flow model defines bugs only by this second classification:

  • Production bug fixes are called hotfixes; they are branched from the production branch and merged back into it (and then propagated to the development branch), as sketched below.
  • Development bugs are fixed directly on the release branch (and propagated to the development and production branches when that release is delivered).
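
As a reminder of how that hotfix flow looks with plain Git commands (branch names follow the standard Git-flow convention; the version number is just an example):

# Branch the hotfix off the production branch (master in standard Git-flow)
git checkout -b hotfix/1.2.1 master
# ...commit the fix...
git commit -am "Fix critical production bug"
# Merge back into production, tag it, and propagate to development
git checkout master
git merge --no-ff hotfix/1.2.1
git tag -a 1.2.1 -m "Hotfix 1.2.1"
git checkout develop
git merge --no-ff hotfix/1.2.1
git branch -d hotfix/1.2.1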

Limits

As you can imagine, not all production bugs have the same priority or complexity. Using only this definition, you may not be comfortable using a fast deployment workflow for every bug you need to resolve. I find these questions interesting:
  • Do we have to define two release procedures for hotfix integration, depending on priority?
  • Do we have to deliver a new release on every hotfix integration?
  • So, is it a good idea to treat all production bugs as hotfixes?
  • How can we manage bugs of different priorities with Git-flow?
As you can see, not everyone will find the plain Git-flow model sufficient.

Git-flow and continuous delivery

If you are a DevOps fan, you are surely familiar with the concept of continuous delivery. Defining it in an extremely crude and simple way, we can say that we want to deliver our product as fast as automation permits.

With this objective in mind, we can see that Git-flow offers a very good workflow to accomplish it. But again, there are a few more questions that need an answer:
  • When is it desirable to have a "candidate" version packaged?
  • How can we reflect our own project status flow in Git?
  • How can we get versions out of Git-flow to stamp into packages?

The answers depend mostly on the organization, but a generic answer is probably possible. I will try to give some answers based on my experience in the near future; for the last question, one option is sketched right below.
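
One approach I have seen work for getting versions into packages (assuming releases are tagged on the production branch, as Git-flow recommends) is deriving the version from git describe:

# Turn the latest release tag plus the distance from it into a package-friendly version,
# e.g. 1.4.0-3-g2f9c1a0 becomes 1.4.0.3.2f9c1a0
VERSION=$(git describe --tags --long | sed 's/-/./;s/-g/./')
echo "Building packages with version ${VERSION}"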

dilluns, 9 de setembre del 2013

MongoDB Architectures

When you talk about database clustering, the first thing you look for is an architecture and a failover solution. Many services offer some kind of arbiter to manage the cluster, and its main purpose is deciding which instance has to be the master.
Take Redis, for example: since version 2.4 there is a Sentinel component that monitors HA instances. MySQL also needs an external script to monitor master failure and promote a slave. So when you think about MongoDB clustering, it is easy to assume that an external arbiter is always needed to provide automatic failover promotion. This is a HUGE error...

Let's talk first about MongoDB replication. You can set up two kinds of replication:
- Replica sets: the standard master-slaves (primary/secondaries) architecture.
- Sharded clusters: a data-partitioning solution built on top of replica sets.
MongoDB provides an automatic solution for failover promotion: all members know each other and, when the cluster starts, an election is held to decide which one becomes the primary. The whole procedure is well documented. This is good, but also a bit of a headache, because if you want elections to always produce a winner, you have to deploy an odd number of voting members.
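
As a minimal sketch (hostnames are placeholders), a three-member replica set is initiated from the mongo shell once every mongod has been started with the same --replSet name:

# Start each mongod with the same replica set name, one per server:
#   mongod --replSet rs0 --dbpath /data/db --port 27017
# Then, from any member, initiate the set with an odd number of voting members
mongo --host mongo1.example.com --eval '
rs.initiate({
  _id: "rs0",
  members: [
    { _id: 0, host: "mongo1.example.com:27017" },
    { _id: 1, host: "mongo2.example.com:27017" },
    { _id: 2, host: "mongo3.example.com:27017" }
  ]
});'
# Check who won the election
mongo --host mongo1.example.com --eval 'rs.status()'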

So let's look at some possible architectures and analyze them.

Consider this minimal HA architecture:


Obviously, you would need to build an external component to decide when to promote: if the primary fails, the secondary by itself won't be able to hold an election, since it has no one to vote with. Your system will go down.

To avoid this, we can add a third member to act as an arbiter:



Now you have a real HA solution without configuring anything extra. But... if you build it this way, you are probably not the one paying the bill. ARE YOU SERIOUS?? Paying for a full machine just to ping the other two instances... Please be a little more careful with expenses.



OK, so you need HA, you care about costs and want to minimize instances, but your DB has heavy read needs and requires some more secondaries... so you decide to add a new instance:



Wait, we saw that an odd number of members is needed, and now we have four. Will it work? Well, as you can see, it will, but the arbiter becomes rather useless. So, being a little HA-paranoid, why run a three-server scheme and not get better fault tolerance out of it? Try this:



From here on, add as many secondaries as you want and play with your arbiters to keep an odd number of voters.
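
As a rough sketch (hostnames and ports are placeholders), adding an extra secondary plus an arbiter that shares a server with an existing member looks like this:

# On an existing server, start a lightweight mongod to act as the arbiter:
#   mongod --replSet rs0 --port 27018 --dbpath /data/arb --nojournal --smallfiles
# From the primary, add the new data-bearing secondary and the arbiter
mongo --host mongo1.example.com --eval '
rs.add("mongo4.example.com:27017");
rs.addArb("mongo2.example.com:27018");'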

Nice, now you know how to scale a read-heavy cluster, but what if we need better write performance? Some will say... MIGRATE TO CASSANDRA!! Really? You are probably reading this because you did a search, compared some NoSQL solutions and chose MongoDB, so... find a way with MongoDB!!

If you need write performance, write to multiple instances. We talked earlier about sharding (make sure you REALLY need it). Look at this architecture:



As you can see, we are using the same number of servers, but the complexity has grown a lot. There are new components; let me explain them:
- mongos: the router to the database; your application has to connect through them.
- CSx: the config servers, mongod instances that only hold metadata about the cluster.

So, as you can see, you can write to all the servers, though of course each holds different data. Each document can only be written to one server and read from two of them, but if you do a good job with your shard keys, you will probably multiply your write performance.
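
As a minimal sketch (the database, collection, and shard key below are made up for illustration), sharding is enabled per database and per collection from a mongos router:

# Connect to a mongos, declare the shards, then shard a collection on a chosen key
mongo --host mongos1.example.com --eval '
sh.addShard("rs0/mongo1.example.com:27017");
sh.addShard("rs1/mongo4.example.com:27017");
sh.enableSharding("mydb");
sh.shardCollection("mydb.events", { customerId: 1 });'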

If you choose a sharding solution, read some tutorials, work with your development team, and make sure you design a smart data partitioning.