Tatva-Artha

meaning of "it"

Monitoring rails processes (apache, passenger, delayed_job) using god and capistrano



When it comes to monitoring servers and processes in production, there are quite a few open source tools: nagios and hyperic are proven solutions for enterprise apps, while monit and god are the favorites of the ruby/rails community. God has been around for a few years now and most critical issues with it seem resolved. So, I decided to use it on my project.

If you’ve worked with any deployment related stuff before, you know that it hardly ever goes as planned. With a few machines involved, a few processes, file/folder permissions and process permissions, there are bound to be things that take longer than expected. God is no different, and no wonder you will find wiki/forum posts from folks who’ve been through it. I had my share of those, and in the end I am happy with the config I arrived at.

Here is the journey as it went:

Goal-1: Monitor delayed_job using god:

# run with: god -c /path/to/file.god -D
RAILS_ROOT = "/home/sjain/projects/pfa/omt"
 
# Watch for delayed_job
God.watch do |w|
 
  w.name = "delayed_job"
  w.interval = 30.seconds # default
  w.dir = RAILS_ROOT
  w.start = File.join(RAILS_ROOT, "script/delayed_job start")
  w.stop = File.join(RAILS_ROOT, "script/delayed_job stop")
  w.restart = File.join(RAILS_ROOT, "script/delayed_job restart")
  w.start_grace = 10.seconds
  w.restart_grace = 10.seconds
  w.pid_file = File.join(RAILS_ROOT, "tmp/pids/delayed_job.pid")
  w.log = File.join(RAILS_ROOT, 'log/god.log')
 
  w.behavior(:clean_pid_file)
 
  w.start_if do |on|
    on.condition(:process_running) do |c|
      c.interval = 5.seconds
      c.running = false
    end
  end
 
  # lifecycle stuff goes here ...
end

This is fairly standard, straight from god’s website. The only interesting thing here is that I kept RAILS_ROOT as a string constant at the top, since I knew I would have to make it configurable as part of my capistrano deploy script. With this, I could run the following command and have god monitor my delayed_job process:

$ god -c /var/www/html/application/current/config/app.conf
# Verify with 'god status'

Goal-2: Monitor apache using god:

# Watch for apache
God.watch do |w|
 
  watch_name = "app_server"
  w.name = watch_name
  w.interval = 30.seconds # default
  w.dir = RAILS_ROOT
  # init script paths are absolute, so they must not be joined onto RAILS_ROOT
  w.start = "/etc/init.d/httpd start"
  w.stop = "/etc/init.d/httpd stop"
  w.restart = "/etc/init.d/httpd restart"
  w.start_grace = 10.seconds
  w.restart_grace = 10.seconds
 
  w.start_if do |on|
    on.condition(:http_response_code) do |c|
      c.host = 'localhost'
      c.port = 3000
      c.path = '/'
      c.code_is_not = 200
      c.timeout = 10.seconds
      c.times = [2, 5]
    end
  end
end

This was a straight modification from the delayed_job config, which didn’t work for a few reasons:

1) Since I was starting god under my own account, it couldn’t execute the start/stop commands specified in the watch file. For those to work, I had to start god with sudo.

$ sudo god -c /var/www/html/application/current/config/app.conf
# Once you start god as sudo, all other commands for god need sudo as well ... 'sudo god status'

2) The other issue with the above apache monitoring config was that once I start god under sudo, I still need my apache process to run under a restricted user on the destination machine. Since we use CentOS for our production environment, this user is ‘www-data’. So, I updated my god config with the following:

# Watch for apache
God.watch do |w|
  ...
  w.uid = 'www-data'
  w.gid = 'www-data'
  ...
end

Also, now that I am starting god as sudo, all watches now need to specify uid/gid. So, I had to go back and update my delayed_job watch with uid/gid to be that of my ‘railsdeploy’ user. My rails app gets deployed under ‘railsdeploy’ user account and delayed_job process runs under this user as well.

# Watch for delayed_job
USER = "railsdeploy"
GROUP = "railsdeploy"
 
God.watch do |w|
  ...
  w.uid = USER
  w.gid = GROUP
  ...
end

Notice how we are not hard-coding user/group directly in the watch. We know that eventually our capistrano script will need to replace those values with configurable ones, so I kept all such things at the top of my .god file. There are a few more to come.
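A typo in a user or group name tends to surface only when god tries to spawn the process, so a quick sanity check before loading the config can save a round trip. The following is a minimal sketch using Ruby's standard Etc module; the helper names user_exists? and group_exists? are made up for illustration and are not part of god.

```ruby
require 'etc'

# Hypothetical helpers (not part of god): true when the account or group
# is known to the machine this script runs on.
def user_exists?(name)
  Etc.getpwnam(name)
  true
rescue ArgumentError
  false
end

def group_exists?(name)
  Etc.getgrnam(name)
  true
rescue ArgumentError
  false
end

deploy_user  = "railsdeploy"
deploy_group = "railsdeploy"

warn "unknown user: #{deploy_user}"   unless user_exists?(deploy_user)
warn "unknown group: #{deploy_group}" unless group_exists?(deploy_group)
```

Running this on the destination machine prints a warning for each name that the system does not know about, and nothing otherwise.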

3) No, that was not all. The next problem I ran into was related to the user environment. Since I am starting god under sudo now, it was not guaranteed that my environment variables would still be the same under that user. I found this out the hard way (another 1/2 hr.): in my case the default PATH under sudo was set up to give access to a different ruby version than the one we use for our production environment. So, I had to update both of my watches to set a few necessary environment variables.

RAILS_ROOT = "/var/www/html/application/current"
RAILS_ENV = "sandbox"
RUBY_PATH = "/opt/ruby-enterprise/bin"
 
God.watch do |w|
  ...
  w.dir = RAILS_ROOT
  w.env = {
    'RAILS_ROOT' => RAILS_ROOT,
    'RAILS_ENV' => RAILS_ENV,
    'PATH' => "#{RUBY_PATH}:/usr/bin:/bin"
  }
  ...
end
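To see which ruby a given PATH would actually resolve to, something like the following can help. It is only a sketch: the which helper is hypothetical (not part of god or ruby), and the path value is simply the one used in this post.

```ruby
# Hypothetical helper (not part of god): return the first executable named
# cmd found in the colon-separated path, or nil when there is none.
def which(cmd, path = ENV['PATH'].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    candidate = File.join(dir, cmd)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

ruby_path  = "/opt/ruby-enterprise/bin"   # value used in this post
watch_path = "#{ruby_path}:/usr/bin:/bin" # same PATH the watch sets

puts "ruby resolves to: #{which('ruby', watch_path).inspect}"
```

Running it once as your own user and once under sudo, with the default PATH argument, shows whether the two environments pick up different rubies.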

A trick to remember when working with god is where to look for clues in case something doesn’t work. I found the following things useful.

Adding log-level of debug when firing up god:

$ sudo god --log-level debug -c /var/www/html/application/current/config/app.conf
# with this, god prints information each time it does a check on your monitored processes. This is a big help to validate your poll frequency and tweak your retry limits etc.

If you don’t specify the log file location for god on the command line, its messages go to the system log. This can be /var/log/messages on CentOS or /var/log/syslog on Ubuntu (I guess). You can also specify the log file on the command line:

$ sudo god --log-level debug -c /var/www/html/application/current/config/app.conf -l /var/log/god.log

Or, during your test setup, you can run god in non-daemon mode to have it print all messages in console.

# -D option for non-daemon mode
$ sudo god --log-level debug -c /var/www/html/application/current/config/app.conf -D

So, with this, I have 2 processes being monitored and each with different user/group setting. So far, so good.
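One setting in the apache watch that is easy to gloss over is c.times = [2, 5], which god's documentation describes as "2 out of 5" checks. The snippet below is only an illustration of that "N of the last M" idea in plain ruby. It is not god's implementation, and threshold_met? is a made-up name.

```ruby
# Illustration only: true when at least `needed` of the last `window`
# recorded checks failed (true = failed check).
def threshold_met?(history, needed, window)
  history.last(window).count(true) >= needed
end

history = []
checks  = [false, true, false, false, true] # made-up check results

checks.each do |failed|
  history << failed
  puts "failed=#{failed} trigger=#{threshold_met?(history, 2, 5)}"
end
```

With these made-up results, a single failed check is not enough to trigger the start command; only the second failure within the window does.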

Goal-3: Monitor passenger using god (failed):

Since we run apache+passenger combo in our environments, I wanted to see if I can restart passenger also if http-response didn’t work as part of apache watch. So I attempted this:

# adding passenger to apache monitoring
God.watch do |w|
  ...
  apache_init = "/etc/init.d/httpd"
  passenger_restart = "touch #{RAILS_ROOT}/tmp/restart.txt" # restart.txt lives under the app root
  w.start = "#{apache_init} start; #{passenger_restart}"
  w.stop = "#{apache_init} stop; #{passenger_restart}"
  w.restart = "#{apache_init} restart; #{passenger_restart}"
  ...
end

So, what I was trying to do here is run 2 commands as part of one watch. Apparently, this doesn’t work. I believe this has more to do with how god hands those commands over to the underlying system than anything else. I couldn’t figure this one out, and I am not aware of a good way to monitor passenger outside of apache, so I left it at that. My guess is that if the http-response check is failing and I restart apache, it should technically restart passenger as well and fix the issue.
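One idea that might be worth trying is to hand god a single command that invokes a shell explicitly, so that the chaining happens inside that shell rather than depending on how god launches the command. This is untested here, and whether god accepts it is an assumption. The shell_chain helper below is hypothetical and only builds the command string; the paths are the ones used in this post.

```ruby
require 'shellwords'

# Hypothetical helper (not part of god): join several commands with && and
# wrap them in one explicit /bin/sh -c invocation.
def shell_chain(*commands)
  "/bin/sh -c #{Shellwords.escape(commands.join(' && '))}"
end

rails_root        = "/var/www/html/application/current"
apache_init       = "/etc/init.d/httpd"
passenger_restart = "touch #{rails_root}/tmp/restart.txt"

# This string could then be assigned to w.restart in the watch.
puts shell_chain("#{apache_init} restart", passenger_restart)
```

Because the commands are joined with &&, the touch only runs if the apache command exits with status 0.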

Goal-4: Leverage capistrano to help with god configuration:

Since I don’t even run apache locally for development (I run mongrel), I had to test all these things on a sandbox machine. I leveraged capistrano to help me out with deploying these god config files each time I modified them locally. Here is what the capistrano tasks looked like:

require 'erb'

namespace :god do
  task :start, :roles => :app do
    god_config_file = "#{latest_release}/config/omt.god"
    sudo "god --log-level debug -c #{god_config_file}"
  end
  task :stop, :roles => :app do
    # ignore the error raised when god is not running
    sudo "god terminate" rescue nil
  end
  task :restart, :roles => :app do
    god.stop
    god.start
  end
  task :status, :roles => :app do
    sudo "god status"
  end
  task :log, :roles => :app do
    sudo "tail -f /var/log/messages"
  end
  # render the ERB template locally and upload the resulting .god file
  task :deploy_config, :roles => :app do
    god_config_file = "#{latest_release}/config/omt.god"
    god_script_template = File.dirname(__FILE__) + "/deploy/templates/omt.god.erb.rb"
    data = ERB.new(IO.read(god_script_template)).result(binding)
    put data, god_config_file
  end
  # load the uploaded config into the running god process
  task :load_config, :roles => :app do
    sudo "god load #{latest_release}/config/omt.god"
  end
  task :redeploy, :roles => :app do
    god.deploy_config
    god.load_config
  end
end

Most tasks are straightforward and deal with managing god itself (start/stop/restart/log). The deploy_config task worked great with the following template file to generate the proper .god config file on the destination machine:

# run with: god -c /path/to/file.god -D
RAILS_ROOT = "<%= latest_release %>"
RAILS_ENV = "<%= rails_env %>"
USER = "<%= user %>"
GROUP = "<%= group %>"
RUBY_PATH = "<%= ruby_home %>/bin"
GOD_ENVIRONMENT = {
  'RAILS_ROOT' => RAILS_ROOT,
  'RAILS_ENV' => RAILS_ENV,
  'PATH' => "#{RUBY_PATH}:/usr/bin:/bin"
}
 
God.watch do |w|
  ...
  w.dir = RAILS_ROOT
  w.env = GOD_ENVIRONMENT
  w.uid = USER
  w.gid = GROUP
  ...
end

As you can see above, the constants defined at the top of the .god.erb.rb template file pick up values from the capistrano script variables, which are substituted into the template to generate the actual .god config file. Now, I could use this one template to generate the config for each environment and regenerate it with each god deploy. The god config, once set up, doesn’t change often, but I still like to regenerate it with each capistrano deploy, just as I would generate the apache vhost file using capistrano with each deploy. This is debatable, but I just like it that way.
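To make the substitution step concrete, here is a small standalone sketch of the same ERB technique outside capistrano. The local variables stand in for the capistrano variables and their values are made up; the template is a shortened copy of the one above.

```ruby
require 'erb'

# Stand-ins for the capistrano variables; the values are made up.
latest_release = "/var/www/html/application/releases/20100901"
rails_env      = "sandbox"
user           = "railsdeploy"
group          = "railsdeploy"
ruby_home      = "/opt/ruby-enterprise"

# Single-quoted heredoc: ruby itself does not interpolate anything here.
template = <<-'TEMPLATE'
RAILS_ROOT = "<%= latest_release %>"
RAILS_ENV = "<%= rails_env %>"
USER = "<%= user %>"
GROUP = "<%= group %>"
RUBY_PATH = "<%= ruby_home %>/bin"
GOD_ENVIRONMENT = { 'PATH' => "#{RUBY_PATH}:/usr/bin:/bin" }
TEMPLATE

# result(binding) lets the template see the local variables defined above.
rendered = ERB.new(template).result(binding)
puts rendered
```

Only the <%= ... %> tags are filled in at render time. The #{RUBY_PATH} interpolation passes through ERB untouched and is evaluated later, when god reads the generated file.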

Goal-5: Configure god as init script

No monitoring solution is complete until it takes care of starting/stopping itself automatically. God has to be configured to start automatically on server reboot, so it was time to create an init script for god itself. After some trial and error I ended up with the following init script template:

#!/bin/bash
#
# God
#
# chkconfig: - 99 1
# description: start, stop, restart God (bet you feel powerful)
#
 
# source function library
. /etc/rc.d/init.d/functions
 
RUBY_PATH="<%= ruby_home %>/bin"
PATH=$RUBY_PATH:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=$RUBY_PATH/god
PIDFILE=/var/run/god.pid
LOGFILE=/var/log/god.log
SCRIPTNAME=/etc/init.d/god
CONFIGFILEDIR=/etc/god
 
#DEBUG_OPTIONS="--log-level debug"
DEBUG_OPTIONS=""
 
# Gracefully exit if 'god' gem is not available.
test -x $DAEMON || exit 0
 
RETVAL=0
 
god_start() {
  start_cmd="$DAEMON -l $LOGFILE -P $PIDFILE $DEBUG_OPTIONS"
  #stop_cmd="kill -QUIT `cat $PIDFILE`"
  echo $start_cmd
  $start_cmd
  RETVAL=$? # capture god's own exit status, not that of a later echo
  if [ "$RETVAL" -eq 0 ]; then
    sleep 2 # wait for server to load before loading config files
    if [ -d $CONFIGFILEDIR ]; then
      for file in $CONFIGFILEDIR/*.god; do
        [ -e "$file" ] || continue # no *.god files present
        echo "god: loading $file ..."
        $DAEMON load $file
      done
    fi
  else
    echo "god already running"
  fi
  return $RETVAL
}
 
god_stop() {
  stop_cmd="god terminate"
  echo $stop_cmd
  $stop_cmd || echo -en "god not running"
}
 
case "$1" in
  start)
    god_start
    RETVAL=$?
    ;;
  stop)
    god_stop
    RETVAL=$?
    ;;
  restart)
    god_stop
    god_start
    RETVAL=$?
    ;;
  status)
    $DAEMON status
    RETVAL=$?
    ;;
  *)
    echo "Usage: god {start|stop|restart|status}"
    exit 1
    ;;
esac
 
exit $RETVAL

A few things to note here. If you look carefully at the god_start() function, it starts the god process as a daemon and then loads any *.god files in the /etc/god/ folder. This has several benefits. When the server reboots, god doesn’t need to hunt around for config files in various places, and any application that needs to be monitored can have its config file installed here for god to pick up. The capistrano deploy can now be updated to deploy a .god file in this folder and then call “god load app.god” to have it loaded during each deploy. The next time the machine or god restarts, it will pick up monitoring of the configured apps automatically.

The capistrano task could look like:

namespace :god do
  task :deploy_config, :roles => :app do
    # deploy omt.god file
    god_config_file = "#{latest_release}/config/omt.god"
    god_script_template = File.dirname(__FILE__) + "/deploy/templates/omt.god.erb.rb"
    data = ERB.new(IO.read(god_script_template)).result(binding)
    put data, god_config_file
    sudo "mkdir -p /etc/god", :pty => true
    sudo "ln -sf #{god_config_file} /etc/god/omt.god", :pty => true
    # load file into god service (assumes god is already running)
    sudo "god load #{god_config_file}"
  end
end

Since each deployed environment (sandbox, staging, production) can have ruby-home in a different folder, I couldn’t use a static init script file. I leveraged capistrano to render a template file (with the ERB engine) and replace variables on the fly. The capistrano task for deploying the god init script looks as follows:

namespace :god do
  task :deploy_init_script, :roles => :app do
    # Note: god gem (version >= 0.11.0) must be installed as system gem for this to work
    god_init_template = File.dirname(__FILE__) + "/deploy/templates/god.init.template.sh"
    data = ERB.new(IO.read(god_init_template)).result(binding)
    put(data, "/tmp/god_init_script", :via => :scp, :mode => "755") # put command doesn't support sudo, so we can't directly copy to /etc/init.d folder
    sudo "mv /tmp/god_init_script /etc/init.d/god"
    sudo "/sbin/chkconfig --level 35 god on" # enable run levels 3 and 5
  end
end

As you can see above, we copy the init script to /etc/init.d/god and then run the chkconfig command to have it start automatically in unix run-levels 3 and 5. Now, assuming god is already set up on your server to run as an init script, we can modify our capistrano tasks for the god start/stop/restart/status commands to leverage unix’s standard “service” command.

# capistrano tasks: god:start, god:stop, god:restart, god:status
namespace :god do
  ["start", "stop", "restart", "status"].each do |cmd|
    task cmd.to_sym, :roles => :app do
      sudo "service god #{cmd}"
    end
  end
end

There are still a few more things I have left to do. For example, monitoring the mysql database server. Since my database in production resides on a different server than the app server, this will require installing a different script on the mysql server machine or somehow monitoring it remotely from one of the app servers…

As I mentioned previously, deployment tasks never go as planned. There are always hiccups. But it is fun to see it all working at the end of the day!


Written by Sharad

September 1st, 2010 at 6:06 pm

One Response to 'Monitoring rails processes (apache, passenger, delayed_job) using god and capistrano'


  1. Thanks for the article, I am looking to do this exact setup in the next few days.

    For the bit about apache and passenger, have you tried something like this?

    "#{apache_init} start && #{passenger_restart}"

    This will execute the second command only if the first command exits with a status of 0.

    Karl

    16 Sep 10 at 9:40 pm
