Archive for the ‘General’ Category

When and Where to use Caching In Rails

Tuesday, May 11th, 2010

(Alternative title to this post: Rails Caching Enlightenment Through Perl)

This entire post will either make you think I’m a horrible web programmer, or hopefully, show you the deep and meaningful insights that I’ve managed to eke out from the experience.

Server Crashes Are Bad
Last night I was up coding, but sadly, not Rails code. Yesterday I had a call from a client telling me that the website had crashed, and their clients were pissed off because this happened just when they had put out their weekly newsletter. The server (a virtual server hosted at GoDaddy) had locked up tight, rendering all the sites on it unreachable.

The server was rebooted, and digging into it there was no clear reason why it had crashed, which is of course not what they wanted to hear. I had a development server set up at home, so I did a bit of testing to see if I could duplicate it. The server was old and slow (not even dual core), but usable as a Linux workstation, so I figured that if I could make the site feel faster here, the gains on the production server would be huge.

Determining A Baseline
First up was hitting it with the trusty Apache Benchmark utility (ab). I started with a perfectly reasonable 10 concurrent connections and 1000 requests against the landing page that was sent out in the newsletter, which is probably one of the more intensive pages on the site, both DB and logic wise.

$ ab -c 10 -n 1000 "http://site.com/page.html?id=117"

Looking at my system monitor, I immediately saw my CPU use jump to 100%, disk access go from a blip here and there to constant, and the machine ground to a halt.

I hit ctrl-c pretty fast.

OK, problem found: the site is a vampire and sucks the life out of the server. I played with a few different settings and eventually ended up using 5 concurrent connections and 20 hits (“-c 5 -n 20”), which gave me an average of 2000-4000ms to serve a page, or about 1 request a second.

Horrible right? Before you put me against the wall to keep me from doing any web programming again, remember this is a really old server. Please? Maybe one smoke before it’s time for me to go?

Small Fixes, Small Gains
So there were three things that I figured I needed to look at:

  1. Use the YSlow plugin to find some small gains
  2. See if there is any plain bad code that could be refactored out: loops within loops, useless re-calculation, etc.
  3. Re-examine the number of queries going on in the page

YSlow gave me a few things to do: setting the Expires header for images, moving CSS to the top of the page and JS to the bottom, and a couple of other minor things. None of them gave me any real gains via ab.

Surprisingly, there wasn’t any low-hanging fruit in the form of bad code or useless loops within loops. This sucked, mostly because it meant going back through the SQL and refactoring it, and I’m not sure about you, but that doesn’t sound like fun to me.
Somewhat more surprisingly, there were only a couple of extra queries, mostly related to the ORM I was using (Class::DBI), plus some just silly things. Sadly, fixing these didn’t give me any more gains.

One thing I did find was where the issues were. When I commented out the main grid of items that is the focus of the page, the response time went from 2000-4000ms to about 200ms. Hmm… so what to do with this? What if I could make it so the time to generate the main product grid didn’t happen? So I commented out the dynamic code and pasted in the HTML it produced (from the view source window in Firefox) to see if the dynamic generation was the problem (it wasn’t really that complex from what I could see). Again, 200-400ms, serving 9-10 requests a second, with almost no CPU or disk impact.

OK, so I thought what if there was a way to pre-generate the HTML periodically, and then have the perl code load that instead of doing it dynamically each time. That almost sounds like….. “caching“. Huh, almost like something that should be built in.

Enter Caching
Honestly my experiences with caching have been minimal, most of the time I am trying to prevent caching (for re-uploaded images with the same filename, that sort of thing), and also it just hadn’t come up yet, probably because most of the sites I have worked on don’t get huge enough traffic to require it. Luckily I had just read something about Caching in the HTML::Mason developer docs while finding some information for something else.

HTML::Mason has the concept of “components”, similar to partials in the rails world. You’d call something like this:

blah blah
<& "/comp/gallery.mc", id => $id, page => $cur_page, title => "TiR" &>
blah blah

When the page is rendered it calls the gallery.mc component with the given arguments and renders it, running whatever code is in there (HTML::Mason isn’t the nice separated MVC that Rails is, so there’s potentially lots of controller code in your pages and components), and replaces the <& &> tag with the output. The docs have a nice section on the built-in page and component (think fragment) caching, where all you need to do is add this code to the top of your component’s “init” section:

return if $m->cache_self(key => 'fookey', expires_in => '3 hours', [other options...] );

This lets your component see if it’s already in the cache, and not expired, and if it is, serves that, and if not, renders and then caches itself with the given key. The only tricky part is figuring out the right cache key to ensure it’s unique for each section of code. I ended up writing something like this:

my $key = "gallery|$id|$cur_page|$title";
return if $m->cache_self(key => $key, expires_in => '10 minutes', [other options...] );

This builds the cache key by joining the arguments sent to the component, ensuring that each differently rendered version of the page will get a different cache key. Not perfect I’m sure, but a nice mix of good caching and safety.
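For the Ruby folks, here is a rough sketch of the same “serve it from the cache unless it has expired” idea. Everything in it is made up for illustration: TinyCache, gallery_cache_key, render_gallery and cached_gallery are hypothetical names, the cache lives in process memory, and none of it is how Mason’s cache_self or any Rails API is actually implemented.

```ruby
# Minimal in-memory cache with per-entry expiry. Hypothetical helper for
# illustration only; not Mason's cache_self and not a Rails API.
class TinyCache
  Entry = Struct.new(:value, :expires_at)

  # The clock is injectable so expiry can be exercised without sleeping.
  def initialize(clock = -> { Time.now.to_f })
    @clock = clock
    @entries = {}
  end

  # Return the cached value for key if it is present and not expired;
  # otherwise run the block, store its result, and return it.
  def fetch(key, expires_in)
    now = @clock.call
    entry = @entries[key]
    return entry.value if entry && entry.expires_at > now

    value = yield
    @entries[key] = Entry.new(value, now + expires_in)
    value
  end
end

# Build the key from every argument that changes the rendered output,
# mirroring "gallery|$id|$cur_page|$title" above.
def gallery_cache_key(id, page, title)
  ["gallery", id, page, title].join("|")
end

# Stand-in for the expensive part (queries plus templating).
def render_gallery(id, page, title)
  "<div class=\"gallery\">#{title}: item #{id}, page #{page}</div>"
end

# Render through the cache, keeping each entry for 10 minutes (600 seconds).
def cached_gallery(cache, id, page, title)
  cache.fetch(gallery_cache_key(id, page, title), 600) do
    render_gallery(id, page, title)
  end
end
```

The one design point worth copying is the key: anything that changes the output has to be in it, otherwise two different pages end up sharing one cached copy.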

Running ‘ab’ again I found that while the first couple of requests still took 2000-4000ms to run, subsequent pages were served in the 200-400ms range, and the CPU and disk load was way down.

WTF – This is a Rails Blog
So why am I telling you all this Perl stuff? It’s because this is related more to web programming and programmer mindset than Perl or HTML::Mason. You could replace “perl” with “ruby”, “component” with “partial” and “HTML::Mason” with “Rails” and get the same idea.

Because everything ran fine when the site was under development and only two or three people were hitting it, I didn’t have to worry about caching or performance issues. In fact, I didn’t even think about performance because things “just worked”. When things did go badly (again, server crashes == pissed off clients), I had to scramble to find a solution (luckily only one night of work).

I’m still doing testing with the new caching code, but I expect to put it online tonight or tomorrow, and look forward to the before and after numbers on the production server.

Lessons Learned
My lessons learned:

  • Watch from the start for cacheable pieces of code. Big complex SQL queries or complex logic whose results can be generated once a week, once a day or even once a minute are candidates. In Rails it can be as simple as surrounding the code with <% cache do %> .. <% end %>.
  • Test performance from the outset. Learn to love Apache Benchmark and start hitting your site’s potential hot pages from the start, and watch and learn what causes responsiveness to go down.
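If ab isn’t handy, the second lesson can be roughed out in plain Ruby. The sketch below is only that, a sketch: time_requests and summarize are hypothetical helpers, the work being timed is whatever block you hand in, and it is nowhere near as thorough as ab.

```ruby
require "benchmark"

# Run `total` calls of the given block spread across `concurrency` threads
# and return the latency of each call in milliseconds.
def time_requests(total, concurrency, &work)
  jobs = Queue.new
  total.times { |i| jobs << i }
  latencies = Queue.new

  workers = Array.new(concurrency) do
    Thread.new do
      loop do
        begin
          jobs.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break
        end
        latencies << Benchmark.realtime { work.call } * 1000.0
      end
    end
  end
  workers.each(&:join)

  Array.new(latencies.size) { latencies.pop }
end

# Boil a list of latencies down to the few numbers worth watching.
def summarize(latencies_ms)
  return { count: 0, mean_ms: nil, max_ms: nil } if latencies_ms.empty?

  {
    count: latencies_ms.size,
    mean_ms: latencies_ms.sum / latencies_ms.size,
    max_ms: latencies_ms.max
  }
end

# Usage against a local dev server (the URL is a placeholder):
#   require "net/http"
#   uri = URI("http://localhost:3000/page.html?id=117")
#   p summarize(time_requests(20, 5) { Net::HTTP.get_response(uri) })
```

Running it before and after a change, with the same numbers (say 20 requests, 5 at a time), gives a crude but honest before/after comparison.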

Resources
For those of you wanting some actual rails resources to learn more about this stuff, have a look at the following:

  • Understanding ‘ab’ results – Nice resource for how to read that output.
  • Caching with Rails – The rails guides documentation with details on page, fragment, action caching and everything in between.
  • Rails 2.1 Caching – A bit older, but a nice list of the caching capabilities introduced and available in Rails 2.1, still pretty relevant.
  • The Scaling Rails Podcast Series – Fantastic information in here, I recommend watching all of them, if you can’t, hit #2, 3, 5, 6, 7 for caching, and then #15 and 16 for load testing with ab and friends.
Any other resources or hints as to how to deal with caching in Rails (or Perl for that matter! :) ?

Programmer Cartoons

Wednesday, April 28th, 2010

OK, just another silly diversion, and a bit of a silly list.  Andrew mentioned Abstruse Goose to me in the #fv.rb channel, I responded with Not Invented Here, and well, I figured it’d be good to put a few down.

  • XKCD – A given, should not even be mentioned.  I insult you by even thinking that you wouldn’t already be subscribed.  I’d also be insulting you if I implied you didn’t know that there are also forums and other community goodies that are just as good as the comic.
  • Not Invented Here – Great cartoon about a programming shop, with a lot of “does he spy on me” moments lately.  Case in point.  Best of all it’s not that old so you can go and catch up fairly easily.
  • Dilbert.  An old staple of the pain that programmers working in the big tall buildings in the big companies feel.  I fell out of love with Dilbert a while back, but something pulled me back in with a few cases where I’m pretty sure Scott Adams is working in my company.  Here’s a prime example.
  • I just found Abstruse Goose through this great R2D2 Hacking The Death Star post.  Hooked and subscribed.
  • Not exactly a comic, but the What Is Your Favorite Programmer Comic on StackOverflow has some awesome gems as well as some that aren’t part of any series.
  • Joy Of Tech is an Apple oriented cartoon you’re sure to enjoy.
  • Virtual Shackles is fairly new to me as well, but beautifully drawn, and funny as hell.
  • Geek and Poke is a look at the modern socialnetworkconnectedtwittering world.
  • Another not-programmer specific one, but you have to love The Oatmeal.
  • Just a Bit Off.  What more can you say about a comic that has a comic like this one. Oh my….
  • Userfriendly is a site and community I’ve been involved with for a long time, and even though Illiad is in a “rerun” stage, this is a classic strip that deserves your attention.
  • Generally speaking, programmers have completed some form of post-secondary education, meaning that PHD Comics will have some relatable situations.
  • New: Out of print, but great to browse the archives is Hackles.
  • New: I can’t believe I missed General Protection Fault!
  • New: Designer oriented, still looks good: The Brads – example flash bashing :)  - @andrewvit
  • New: Ubersoft

Anyone got some suggestions of any that I’ve missed?

Updated: Thanks for the suggestions, I’ve added those to the list!

Quiet Day of Reading

Tuesday, April 27th, 2010

Been a bit busy the last couple of days. I have a great post on Rails podcasts sitting chambered and ready to be formatted and have images added, and I’ll post that tonight.  The last couple of lunch hours I’ve taken a bit of my own advice and have been hiding in the lunch room and reading through The Pickaxe Book, and four chapters in I’m glad I did.  It’s nice to get back to basics and examine some of the principles that I’m using in more detail, as a lot of the time when I’m doing Rails coding I really don’t know what’s going on down in the core of Ruby that may have to do with inheritance, class methods, and so on.

A New Project – Post to Your Blog From Twitter

Saturday, April 24th, 2010

So I’ve had a bit of a challenge finding a huge amount of inspiration for my referee site, so I decided to do something a bit fun and do something I’ve wanted to do for a while, and create a connection between twitter and blogging.  Well, also I just wanted to try something different and challenging.

The idea is that during the day it’s pretty easy to bang off a quick post from tweetdeck, whereas blogging means going to your blog control panel, add new post, put in a title, tab, put in your post, link, formatting, etc, then hit publish, wait 2, 3, 4, and then continue on.  Then sometimes you’re tweeting things out as well, so you either install the annoying “tweet what I just blogged” plugin (*cough*) or you re-type something similar but different and post it out.  I’m sure there are better workflows for this of course, but that’s how I work during the day (no, never on work hours of course).

A better workflow I saw was you have a wee little bot running on a server somewhere that watches for posts from a certain user with a certain tag and a certain format.  Then it scoops them up, reformats them, and connects to your blog, posts the article, and voila!

So I could do something like this:

@thinkingonrails Cool article about Posting From Twitter @ http://bit.ly/thir - Man this is awesome, I wish I'd read it yesterday! #tirp

And then have it show up on the blog as something like:

Title: Posting From Twitter
Body: Cool article about Posting From Twitter.  Man this is awesome, I wish I'd read it yesterday!  Posted from [insert awesome name for the software I haven't thought of yet].

Obviously some thought would have to go into the pre-formatting, or maybe it would just take the first N characters as the title and dump it all into the body, I’m not really sure.
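Here is a minimal sketch of the simple fallback (first N characters as the title, everything in the body). The rules are guesses rather than a spec: tweet_to_post is a made-up name, the trigger tag is assumed to be #tirp as in the example, and leading @mentions, links and hashtags are simply thrown away.

```ruby
# Turn a tagged tweet into a blog post title and body.
# Assumptions for illustration: the trigger hashtag defaults to "#tirp",
# leading @mentions, links and hashtags are dropped, and the title is just
# the first `title_length` characters of the cleaned-up text.
def tweet_to_post(tweet, tag = "#tirp", title_length = 40)
  # Ignore tweets that do not carry the trigger tag.
  return nil unless tweet.split.include?(tag)

  text = tweet.sub(/\A(@\w+\s+)+/, "")           # drop leading @mentions
  text = text.gsub(/\s*@?\s*https?:\/\/\S+/, "") # drop links and a preceding "@"
  text = text.gsub(/\s*#\w+/, "")                # drop hashtags
  text = text.gsub(/\s+/, " ").strip             # tidy up whitespace

  { title: text[0, title_length].strip, body: text }
end
```

A fixed-length title will happily cut a word in half, so a real version would probably want to break on a word boundary or on some marker inside the tweet.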

Unfortunately I haven’t gotten too far :(

I don’t want to put a shell of a program up on github just yet, and I need to do some looking to find the right libraries.  So far I’ve found twibot, which is a bit of a pain to install (but if you read the instructions here you’ll find it easy), but looks like a nice shell to work with.  Unfortunately most of my time tonight has been fighting with something stupidly simple… getting a gem to load.

Update – So you can basically ignore everything below.  Turns out that I had forgotten (or never learned) that you need to also require ‘rubygems’ in your console programs, as apparently IRB already does that for you. Also I discovered that as soon as you publicly post about your issues, you’ll find the solution, making your public post an embarrassing reminder of how you learned something today :)  So you can sort of ignore everything below…
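For anyone landing here with the same error, the fix looks roughly like this. The rescue block is an addition for illustration, so the script says something useful when the gem really is missing; the only line that matters is the first require.

```ruby
#!/usr/bin/env ruby
# On Ruby 1.8 a standalone script has to load RubyGems itself before it
# can require an installed gem. On Ruby 1.9 and later RubyGems is loaded
# by default, so this line is harmless there.
require 'rubygems'

twibot_loaded =
  begin
    require 'twibot'
    true
  rescue LoadError => e
    # Illustrative only: report the problem instead of dying with a backtrace.
    warn "twibot is not installed for this Ruby: #{e.message}"
    false
  end
```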

No, seriously. I’m sure it has something to do with using RVM, but I’ve been too stubborn to install the gem in the system path to see if that fixes it.

I’ve got a list of gems installed, including twibot.

[alan:~/code/ruby/twitterpost]$ gem list | grep twibot
twibot (0.1.7)
[alan:~/code/ruby/twitterpost]$

In irb I can load the gem just dandy.

[alan:~/code/ruby/twitterpost]$ irb
ruby-1.8.7-p249 > require 'twibot'
 => true
ruby-1.8.7-p249 >

But as soon as I try to run it in a .rb file (with env ruby or the full path to my ruby in the #! line)… I just get this:

[alan:~/code/ruby/twitterpost]$ ./twitterpost.rb
./twitterpost.rb:2:in `require': no such file to load -- twibot (LoadError)
	from ./twitterpost.rb:2
[alan:~/code/ruby/twitterpost]$

*sigh*

Running it as a script, running “ruby ./twitterpost.rb”, and running it from RubyMine 2.0.2 (which supports RVM BTW, and shows the required gems loaded, and the correct RVM environment) all give the same results.

I really wish this post was full of code, and me telling you how there is a project up on github, and hey, I also figured out how to use the ruby XML-RPC library… but no, it’s as a friend of mine says, just a bunch of fail.  Hopefully I’ll get this worked out tonight some more so I can post with some more success soon.

Quick Tip – Go Back and Re-Watch Things You Didn’t Understand Before

Thursday, April 22nd, 2010

Today I finally got around to watching the first Rails Dispatch episode of building a blog under Rails 3, and was surprised by a couple of things.

First was that Rails 3 isn’t that different from Rails 2 (which BTW, is apparently obsolete :) at least for the “normal” tasks of CRUD-like operations in the demo application.  A few bits of different syntax here and there, but for the most part it was all pretty recognizable.

I was secondly surprised at how much I understood.  Not that I’m dumb of course, but a lot of things in rails I just let wash over me and accept that I Just Don’t Get Yet(tm)(r)(c).  This stuff however, was perfectly recognizable and grokkable, and because of that I actually picked up a couple of new things.

So my personal revelation of the day (I’m not going to call them tips cause that’d indicate that other people don’t all know this stuff already) is to go back and re-watch, re-read, or re-listen to things that you maybe didn’t fully grasp, or didn’t grasp that fully, and see if the second viewing with a few weeks or months of time gives you a better grasp of it.

Rails Compared To Other Frameworks

Wednesday, April 21st, 2010

This is just a quick post before I hit the hay for the night.  I was doing my job today and fighting through modifying a corporate website to add some fairly simple functionality.  After literally half a day of fighting with it (granted, with being distracted and pulled away as I often am during the course of my day job), I realized just some of how awesome Ruby on Rails is.

I came from a world with no frameworks, and my only real frame (hehehe) of reference was in the Ruby on Rails introduction where the concept of “convention over configuration” is presented, and it’s compared to having to deal with miles of XML configuration files.  When I first checked out Rails it was cool, and has been until now, several years later.  However in that time I had never really used any other web frameworks, so I just sort of assumed that most other ones were like Rails but not as good, or with a slightly different way of doing things, or something.

Today I realized just how fully bad it could be, and this was with a “good” non-Rails framework.  I won’t name said framework as I’m sure its fans would be out in droves to tell me how I was wrong, but deciphering how to display a page, or finding decent documentation on how to display a page, made me really appreciate the work that the Rails community has put into creating decent documentation and examples, as well as the simplicity of Rails’ directory layout.

You really don’t appreciate how nice it is to have a setup with myapp/config/, myapp/app/views, myapp/app/controllers and myapp/app/models until you have to dig into figuring out that apps/frontend/modules/[something]/templates/[somethingelse] is where you need to be, and you still have to muck with routing and view config setups before it’ll even work.  Argh!

That’s not to say that rails can’t get complex, or that maybe if I’d have spent more than a couple of days with this other setup I couldn’t figure it out and it would all make perfect and complete sense, but I think that even the first time I saw rails 1.0 back in 2004 I could pretty much grok what was going on, even without knowing Ruby.  Just showing some static pages was a battle today (and in doing so I discovered that I can’t just display them as static pages because I need to have them respond to POSTs for some minor functionality they have on them, but that’s another story).

Course, if you use something else other than Rails you are not Wrong, and I’m sure it works just dandy for you, but the difference I see in Rails is that when you are used to something else, Rails seems a heck of a lot easier to jump into than going in the other direction.