
Web developers often get into the rut of thinking about every programming task in the context of a request and a response. A request comes in for a URL, content is retrieved and converted into useful output, and then it is sent back to the client. Lather, rinse, repeat.
But there are also other types of programming tasks that don’t fit into that cycle. What about tasks that need to happen
Here are some examples from my applications:
Previously, I approached most of these problems with a few rake tasks and a cron job that ran every minute. While it worked, it wasn’t as fast as it could be and felt a bit hackish (a delay of even one minute is too slow sometimes).
For a while, I’ve wanted to learn more about messaging queues. I love tools that don’t only enhance something I’m already doing, but completely change the way I think about designing an application.

Queues are a great tool for some tasks. Being able to send something off to a queue can solve some of these problems, and it also gives you another option for speeding up normal HTTP responses.
Initially, I decided to try this out on my blog. I use the remote Akismet service to check comments for SPAM. To be honest, Akismet is usually fast enough that I could make the call in the middle of the request without any problems, but I wanted to try out the message queue before deploying a similar idea at PeepCode.com.
In the application, every comment starts out with a received state (using acts_as_state_machine). The Comment controller will fire off a job and a separate worker will handle the SPAM-checking so the web process can respond quickly and get back to work responding to other web requests.
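To make that flow concrete, here is a minimal plain-Ruby sketch of the lifecycle. It does not use acts_as_state_machine; the class name `SketchComment`, the `:approved` and `:spam` states, and the array standing in for the queue are all assumptions for illustration. Only the initial `received` state comes from the application itself.

```ruby
# Plain-Ruby sketch of the comment lifecycle. State names other than
# :received are assumptions made for illustration.
class SketchComment
  attr_reader :id, :state

  def initialize(id)
    @id = id
    @state = :received   # every comment starts here
  end

  # Called later by the worker, outside the web request.
  def classify!(is_spam)
    raise "already classified" unless @state == :received
    @state = is_spam ? :spam : :approved
  end
end

# The web process only records the comment and enqueues its id.
pending_jobs = []                       # hypothetical stand-in for the real queue
comment = SketchComment.new(1)
pending_jobs << { :type => "comment", :id => comment.id }

# Later, a separate worker picks the job up and runs the slow check.
job = pending_jobs.shift
comment.classify!(false) if job[:id] == comment.id
puts comment.state
```

The point is only that the web request finishes as soon as the job is queued; the state change happens whenever the worker gets to it.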

There’s been some fresh activity even just over the last few weeks in this area. Ara Howard compared some of these recently. I haven’t evaluated all of these packages, but here are a few I’ve looked at:
| Product | Features | Drawbacks |
| --- | --- | --- |
| beanstalkd | Fast, simple, designed to mirror the style of memcached. Rails plugin available, or usable with a simple Ruby-based API. Server written in C, but is very easy to install. | Memory only…jobs are not persistent. New, so the internal protocol may change. Workers may be difficult to manage. |
| bj | Rails plugin. Self-spawning. | Can only send shell commands. Jobs start a full copy of your Rails app on every execution. |
| BackgroundRB | Ruby-based. Can be polled for incremental feedback on the progress of a job. Recently rewritten. | |
| Amazon SQS | Runs on Amazon’s cluster, so it can handle a ton of traffic. | Operated by Amazon, so it doesn’t run locally. Not open source. |
| Apache ActiveMQ | Well-known. Persistent. | Requires several installation steps and database tables. |
| ActiveMessaging | Rails plugin. Works with ActiveMQ and others. | Requires external job server. |
| BBQ | Nothing to install…involves only a single line of code! | Doesn’t work on Windows NT4. |
For this blog, I chose to try beanstalkd.
Download the beanstalkd server and compile it. Use make for production or make debug for your development copy (to print out extra messages as it’s working). There’s no task to install it, but you can just copy the executable to /usr/local/bin.
Start the server (use -h to see other possible arguments):
% beanstalkd
beanstalkd: net.c:90 in unbrake: releasing the brakes
Install the beanstalk-client gem. For this blog, I chose to use the gem directly.
sudo gem install beanstalk-client
In merb_init.rb (or config/environment.rb), I set up a connection to the beanstalk server.
BEANSTALK = Beanstalk::Pool.new(['localhost:11300'])
In the Comments controller, I put a comment job into the queue, using the id of the new comment.
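# Comments controller
def create
  @comment = Comment.new(params[:comment])
  if @comment.save
    # Queue the SPAM check. If the queue is unreachable, carry on anyway
    # so the comment is still saved and the visitor gets a response.
    BEANSTALK.yput({:type => "comment", :id => @comment.id}) rescue nil
    # Then redirect and return
  end
end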
The yput method uses YAML to serialize any arguments and put them into the queue.
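As a rough illustration of what that serialization looks like, the sketch below round-trips the same kind of hash through Ruby's standard YAML library without touching the queue at all. The id `42` is made up, and I am assuming the client does little more than dump and load YAML around the job body.

```ruby
require 'yaml'

job = { :type => "comment", :id => 42 }   # 42 is a made-up id

# Roughly what travels through the queue as the job body: a YAML string.
payload = YAML.dump(job)
puts payload

# What the worker sees after deserializing. safe_load refuses symbols
# unless the Symbol class is explicitly permitted.
restored = YAML.safe_load(payload, permitted_classes: [Symbol])
puts restored == job
```

Because the body is just text, it pays to keep it small, which is one more reason to send an id rather than a whole record.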
Finally, I wrote a rake task to function as the worker.
loop do
  job = BEANSTALK.reserve
  # ybody deserializes the job
  job_hash = job.ybody
  case job_hash[:type]
  when "comment"
    if Comment.check_for_spam(job_hash[:id])
      job.delete
    else
      job.bury
    end
  else
    puts "Don't know what type of job this is: #{job_hash.inspect}"
  end
end
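One way to make the loop body easier to test is to pull it into a method and feed it stand-in jobs. The sketch below does that; `FakeJob`, `handle_job`, and the `spam_checker` lambda are hypothetical names, and `FakeJob` mimics only the three job methods used above. It also buries unknown or failing jobs, which is my own assumption rather than what the loop above does: as far as I understand beanstalkd, a job that is neither deleted nor buried returns to the ready queue once its time-to-run expires, so it would keep coming back.

```ruby
# FakeJob is a hypothetical in-memory stand-in that mimics only the three
# job methods used in this post: ybody, delete and bury.
class FakeJob
  attr_reader :ybody, :outcome

  def initialize(body)
    @ybody = body
    @outcome = :reserved
  end

  def delete; @outcome = :deleted; end
  def bury;   @outcome = :buried;  end
end

# One iteration of the worker loop. `spam_checker` is any callable that
# returns true when the check completed.
def handle_job(job, spam_checker)
  body = job.ybody
  case body[:type]
  when "comment"
    if spam_checker.call(body[:id])
      job.delete   # finished: remove it from the queue
    else
      job.bury     # could not finish: set it aside for inspection
    end
  else
    warn "Don't know what type of job this is: #{body.inspect}"
    job.bury       # assumption: park unknown jobs instead of leaving them reserved
  end
rescue StandardError => e
  warn "Job failed: #{e.message}"
  job.bury         # keep the worker alive when a single job blows up
end

ok      = FakeJob.new({ :type => "comment", :id => 1 })
failed  = FakeJob.new({ :type => "comment", :id => 2 })
unknown = FakeJob.new({ :type => "email",   :id => 3 })
checker = lambda { |id| id == 1 }   # pretend only comment 1 can be checked

[ok, failed, unknown].each { |job| handle_job(job, checker) }
puts [ok, failed, unknown].map(&:outcome).inspect
```

In the real worker, the loop would shrink to reserving a job and handing it to a method like this one, with `Comment.check_for_spam` wrapped as the checker.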
In the future, I’d like to look into using daemonize or some other method for running the worker. In the meantime, I’m using god to start the worker and keep it running.
The details are a bit of a hack, but here is the god.conf if you want to try it. The benefit is that god keeps the worker running and daemonizes it so it runs in the background.
sudo god start -c /var/www/apps/mysite.com/current/config/god.conf
I can also call god restart beanstalk-worker from a Capistrano task to restart it and keep the code fresh.

In practice over the past week or so, this has been very reliable. The message passing is so fast that sometimes it actually runs the SPAM check before the redirect back to the article page is done!
It was fairly simple to set up and now provides me with a tool for accomplishing tasks that don’t need to be completed in the scope of an HTTP response.
Pass an id and some kind of identifier, not an entire model. I could have stored the entire contents of the comment, but it’s more efficient to just pass the id of the comment and let the worker get a current copy from the database.
By putting the SPAM check in the Comment model, I keep the code in one place (even though it will be executed in different contexts).
Now that it’s working smoothly here, I hope to use it on PeepCode. Some possibilities:
BackgroundRB just reached 1.0. It looks very cool and is under very active development. I’ve had good success with it.
As ever, an interesting and useful read – thanks!
Can be easily done in a BackgrounDRb worker:
Love your posts. Very informative.
Have you considered AP4R as well?
The link to Rubyforge is below:
http://rubyforge.org/projects/ap4r/
I am not involved in this project, but it sounds like a good choice for these kinds of tasks as well.
Cheers, Venkat.
You may be interested in the recent alpha release of the InlineBBQ Queue service ;)
I’m using ap4r to do this. I find it a little more convenient because it just calls right back into whatever controller I tell it. No need to worry about rake tasks.
For instance, this:
will work just as you expect: the queue action will queue the download request and return immediately. Later (configurable), ap4r calls the download action with :story and :url. You can, of course, tell it to call a completely different controller if you want. And it’s load-balanced like any other incoming request.
Pretty handy.
Sparrow is a pretty interesting project.
http://code.google.com/p/sparrow/
It speaks the memcached protocol and is written in Ruby with EventMachine, so it should be pretty bulletproof.
“# ybody unserializes the job”
Ouch my ears! “deserializes” please.
I know, this is very trivial.
Question:
So I understand that you add a job to the queue, but I’m confused about when the worker picks it up.
So do you simply start your rake task and it runs forever, checking the queue with the do loop?
How’s this with memory/cpu usage?
@Alex: Fixed…thanks.
Yes, the rake task sits in a do loop. It’s been running constantly for over a week now and CPU usage is at 0.02. So it doesn’t max out the CPU as one would expect.
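The reason, as far as I can tell, is that reserve blocks until a job arrives instead of polling. Here is a rough illustration of the same idea using Ruby's built-in Queue as a stand-in for beanstalkd (an assumption for demonstration only; the real client waits on a network socket, and the `:stop` marker is made up so the example can exit).

```ruby
require 'thread'   # Queue is built in on modern Ruby; this is harmless

jobs = Queue.new
handled = []

worker = Thread.new do
  loop do
    job = jobs.pop          # blocks here, using no CPU, until a job arrives
    break if job == :stop   # made-up marker so the example can finish
    handled << job
  end
end

sleep 0.1                   # the worker is idle, parked inside pop
jobs << { :type => "comment", :id => 1 }
jobs << :stop
worker.join
puts handled.inspect
```

The loop never spins: each pass waits inside the blocking call, so an idle worker costs almost nothing.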
I get a 503 error with this link:
http://pastie.textmate.org/private/ovgxu2ihoicli2ktrwtbew
I would tend to agree with John Topley and say that it can be easily done in a BackgrounDRb worker:
Thanks for this post—Queueing is something that I’ve been wondering about ever since Amazon introduced their queueing service (really expensive!)
I just implemented this because backgroundrb was being finicky, and so far I like it better (simpler and cleaner and easier to test). If you don’t want to fire up ruby at the end of the beanstalk stop command, try:
Interesting post.
What’s the difference between job.delete and job.bury ? I couldn’t find any information about that on the beanstalkd or beanstalk-client sites.
@Ivan: Delete removes a job altogether, while bury puts it in a deferred queue that can be accessed separately.
If a job completed successfully, it can be deleted. If it had a problem or couldn’t be done right away for some reason, it can go into the buried queue.
it’s not quite accurate to say that bj can only run shell commands. it’s actually possible to have bj load the rails app initially and then to evaluate the job from the db. so you can simply submit ruby code to run. the reason it does not do this is for robustness: rails apps leak memory like crazy, everyone knows it, so stacking up a background daemon that loads a rails app and runs jobs through it effectively doubles the processes that are leaking and the memory requirements of the system. by loading only on demand per process the system stays robust and has minimal memory requirements at the expense of cpu. of course, as i said, bj allows you to do this if you want to, but it is not the default. another feature of bj is that it allows you to cluster backend processes – it’s very easy to set up 10 background job runners. just an fyi.
Thanks for the clarification, Ara.
Message queueing systems are popping up all over the place. Shopify wrote their own, too:
http://blog.leetsoft.com/2008/2/17/delayed-job-dj
This:
def MyController
  def queue
    ap4r.async_to({:action => 'download'}, {:story => story.id, :url => params[:url]})
  end

  def download
    # long-running task
  end
end
doesn’t work. Are you sure everything is fine?
Thanks Lukas Kalender
Hi! I am using somewhat the same setup as you are on one of my sites and I have a question regarding your god config file. I hope you want to answer. How do you manage to monitor the beanstalkd process from god? Monitoring the ruby scripts that interact with beanstalkd isn’t a problem, but the message queue system itself seems harder to monitor. How do you do that? Are you also monitoring, for example, nginx from your god setup? I would love to hear how you are doing those two things!
If you don’t want to write about it here, you can also send me an email directly!
Thanks.
Best regards Sebastian