Ruby Performance :: Case-When vs. If-Else

Posted by Chris Blackburn Mon, 24 Nov 2008 21:38:00 GMT

Another interesting set of timings are the differences between the branching control statements case-when and if-else. Here is a quick example:

require 'profilings'
include PeepcodeProfiler

###
# Case-when vs. If-Elsif-Else
###

time_this("case-when using strings") {
  x = "10"
  case x
  when "1"
    puts "it is #{x}"
  when "2"
    puts "it is #{x}"
  when "3"
    puts "it is #{x}"
  when "4"
    puts "it is #{x}"
  when "5"
    puts "it is #{x}"
  when "6"
    puts "it is #{x}"
  when "7"
    puts "it is #{x}"
  when "8"
    puts "it is #{x}"
  when "9"
    puts "it is #{x}"
  when "10"
    puts "it is #{x}"
  end
}
Timings for case when using strings
Thread ID: 218880
Total: 0.000113

 %self     total     self     wait    child    calls  name
 36.28      0.00     0.00     0.00     0.00       10  Kernel#=== (ruby_runtime:0}
 31.86      0.00     0.00     0.00     0.00        2  IO#write (ruby_runtime:0}
 22.12      0.00     0.00     0.00     0.00       10  String#== (ruby_runtime:0}
  9.73      0.00     0.00     0.00     0.00        1  Kernel#puts (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        0  PeepcodeProfiler#time_this (./profilings.rb:8}

...and again the faster if-elsif-else:

time_this("if-elsif-else using strings") {
  x = "10"
  if x == "1"
    puts "it is #{x}"
  elsif x ==  "2"
    puts "it is #{x}"
  elsif x ==  "3"
    puts "it is #{x}"
  elsif x ==  "4"
    puts "it is #{x}"
  elsif x ==  "5"
    puts "it is #{x}"
  elsif x ==  "6"
    puts "it is #{x}"
  elsif x ==  "7"
    puts "it is #{x}"
  elsif x ==  "8"
    puts "it is #{x}"
  elsif x ==  "9"
    puts "it is #{x}"
  elsif x ==  "10"
    puts "it is #{x}"
  else
    puts "it is #{x}"
  end
}
Timings for if elsif else using strings
Thread ID: 218880
Total: 0.000049

 %self     total     self     wait    child    calls  name
 46.94      0.00     0.00     0.00     0.00       10  String#== (ruby_runtime:0}
 30.61      0.00     0.00     0.00     0.00        2  IO#write (ruby_runtime:0}
 22.45      0.00     0.00     0.00     0.00        1  Kernel#puts (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        0  PeepcodeProfiler#time_this (./profilings.rb:8}

OK, obviously the case-when version is slower because it does a compare for each value using the === Kernel operator, which in turn calls the correct == operator for the value type. Whereas the if-elseif-else version doesn’t incur the overhead of the Kernel triple-equal operator for each comparison.

So I wonder if the same thing happens with a different object type:

time_this("case when using integers") {
  x = 10
  case x
  when 1
    puts "it is #{x}"
  when 2
    puts "it is #{x}"
  when 3
    puts "it is #{x}"
  when 4
    puts "it is #{x}"
  when 5
    puts "it is #{x}"
  when 6
    puts "it is #{x}"
  when 7
    puts "it is #{x}"
  when 8
    puts "it is #{x}"
  when 9
    puts "it is #{x}"
  when 10
    puts "it is #{x}"
  end
}
Timings for case when using integers
Thread ID: 218880
Total: 0.000104

 %self     total     self     wait    child    calls  name
 46.15      0.00     0.00     0.00     0.00       10  Kernel#=== (ruby_runtime:0}
 19.23      0.00     0.00     0.00     0.00        9  Fixnum#== (ruby_runtime:0}
 15.38      0.00     0.00     0.00     0.00        2  IO#write (ruby_runtime:0}
 15.38      0.00     0.00     0.00     0.00        1  Kernel#puts (ruby_runtime:0}
  3.85      0.00     0.00     0.00     0.00        1  Fixnum#to_s (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        0  PeepcodeProfiler#time_this (./profilings.rb:8}

Hmmm, bad news. Here we see the Kernel === operator in the call stack again.

time_this("if elsif else using integers") {
  x = 10
  if x == 1
    puts "it is #{x}"
  elsif x ==  2
    puts "it is #{x}"
  elsif x ==  3
    puts "it is #{x}"
  elsif x ==  4
    puts "it is #{x}"
  elsif x ==  5
    puts "it is #{x}"
  elsif x ==  6
    puts "it is #{x}"
  elsif x ==  7
    puts "it is #{x}"
  elsif x ==  8
    puts "it is #{x}"
  elsif x ==  9
    puts "it is #{x}"
  elsif x ==  10
    puts "it is #{x}"
  else
    puts "it is #{x}"
  end
}
Timings for if elsif else using integers
Thread ID: 218880
Total: 0.000057

 %self     total     self     wait    child    calls  name
 50.88      0.00     0.00     0.00     0.00       10  Fixnum#== (ruby_runtime:0}
 26.32      0.00     0.00     0.00     0.00        2  IO#write (ruby_runtime:0}
 15.79      0.00     0.00     0.00     0.00        1  Kernel#puts (ruby_runtime:0}
  7.02      0.00     0.00     0.00     0.00        1  Fixnum#to_s (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        0  PeepcodeProfiler#time_this (./profilings.rb:8}

Interesting. Even for a different object type, the case statement incurs the overhead of the Kernel === operator.

Now I’m not saying to always avoid the case statement. It is really a judgement call. I typically prefer cleaner code over slight optimization benefits. However, in this case (no pun intended) if-elsif-else is consistently about twice as fast as the case-when statement. You be the judge for your own particular situation.

Take a look here for the Peepcode profiler that we use when doing our metrics. If you come up with interesting metrics please let us know. We are happy to publish them here, giving you the credit, or link to them from here.

Happy Hacking!!!

UPDATE: Peter Wagenet brought up a good point, noting that our metrics did not completely determine if the slower case statement is indeed slower merely because it uses === operator, or if there is more overhead involved with case. So here is our test…

time_this("if elsif else using integers with ===") {
  x = 10
  if x === 1
    puts "it is #{x}"
  elsif x ===  2
    puts "it is #{x}"
  elsif x ===  3
    puts "it is #{x}"
  elsif x ===  4
    puts "it is #{x}"
  elsif x ===  5
    puts "it is #{x}"
  elsif x ===  6
    puts "it is #{x}"
  elsif x ===  7
    puts "it is #{x}"
  elsif x ===  8
    puts "it is #{x}"
  elsif x ===  9
    puts "it is #{x}"
  elsif x ===  10
    puts "it is #{x}"
  else
    puts "it is #{x}"
  end
}
Timings for if elsif else using integers with ===
Thread ID: 218880
Total: 0.000081

 %self     total     self     wait    child    calls  name
 50.62      0.00     0.00     0.00     0.00       10  Kernel#=== (ruby_runtime:0}
 23.46      0.00     0.00     0.00     0.00        9  Fixnum#== (ruby_runtime:0}
 11.11      0.00     0.00     0.00     0.00        1  Kernel#puts (ruby_runtime:0}
 11.11      0.00     0.00     0.00     0.00        2  IO#write (ruby_runtime:0}
  3.70      0.00     0.00     0.00     0.00        1  Fixnum#to_s (ruby_runtime:0}
  0.00      0.00     0.00     0.00     0.00        0  PeepcodeProfiler#time_this (./profilings.rb:8}

After running the test several times I have determined that using the === operator is obviously, and consistently slower than == alone. However, the case statement is consistently slower still, than if-elsif using the === operator. Therefore case does incur overhead beyond the simple use of === by about 20%.

Posted in  | Tags , , , , ,  | 4 comments | no trackbacks

Being a Pragmatic Programmer

Posted by Chris Blackburn Mon, 24 Nov 2008 18:28:00 GMT

My friend and colleague Peter Wagenet has written an excellent article about the true meaning of “pragmatic programming”, and how to achieve it. Check it out!

Posted in  | Tags  | no comments | no trackbacks

The Peepcode Profiler

Posted by Chris Blackburn Tue, 18 Nov 2008 21:15:00 GMT

A few people have asked me for the module I’ve been using to profile my code in my Double Quotes vs. Single Quotes article. It is merely a wrapper around RubyProf, dead simple and nothing worth commenting on, but since some keep asking—here it is:

require 'rubygems'
require 'ruby-prof'

module PeepcodeProfiler
  def time_this(comment, &block)
    RubyProf.measure_mode = RubyProf::PROCESS_TIME
    RubyProf.start
    yield
    result = RubyProf.stop
    puts "\nTimings for #{comment}"
    printer = RubyProf::FlatPrinter.new(result)
    printer.print(STDOUT, 0)
  end
end

I wrote this for some work on a “Scaling Rails” minibook for Peepcode, which I seem to be unable to finish. I hope Geoffrey forgives me.

Posted in  | Tags , , , ,  | 3 comments | no trackbacks

Rails and How to Index Your DB Tables

Posted by Chris Blackburn Thu, 06 Nov 2008 22:48:00 GMT

Between the time your migration files get generated and deploying to production with your new application you will likely need to add indexes to one or more of your tables. How do you know which tables need indexes and which fields in those tables to index?

Here are some tips to help figure out where the likely bottlenecks will occur with regard to your database tables:

1. Always index *_id foreign_key fields. Let’s say you have 2 tables, users and addresses. You may have migration file code in ./db/migrate that looks like this:

  ...
  def self.up
    create_table "users", :force => true do |t|
      t.column :login, :email             :string
    end
  end
  ...
  ...
  def self.up
    create_table "addresses", :force => true do |t|
      t.column :user_id       :integer
      t.column :address       :string
    end
  end
  ...

You will always want to index the foreign_key user_id, with few exceptions. So simply add the add_index line like this:

  ...
  def self.up
    create_table "addresses", :force => true do |t|
      t.column :user_id       :integer
      t.column :address       :string
    end
    add_index "addresses", :user_id
  end
  ...

This will facilitate much faster associations when you lookup related models like this view code:

  <%=h user.address.address %>

  # OR as the case may be...

  <%=h address.user.email %>

2. As a rule of thumb index any column for which you will be using to lookup a row or set of rows. For example, if you will be doing something like this:

  u = User.find_by_email('test@example.com')

... then you will want to index the email field.

3. Run explain plans on your associations to determine where indexes would benefit the queries. Now this one is more tricky unless you know about the query_analyzer plugin. This nifty little tool will automatically dump out explain plans into your logfiles in development mode that look like this example from the README file:

# development.log

P Load (0.008669)
 => SELECT p.* FROM p INNER JOIN d ON p.id = d.p_id WHERE (d.p_id = 2 AND ((d.type = 'P')))

Analyzing P Load

select_type | key_len | type   | Extra       | id | possible_keys | rows | table  | ref    | key
 ----------------------------------------------------------------------------------------------------
SIMPLE      |         | ALL    | Using where | 1  |               | 74   |    d   |        |
SIMPLE      | 4       | eq_ref | Using where | 1  | PRIMARY       | 1    |    p   | d.p_id | PRIMARY

Now what this tells you is that the table d is missing an index because MySQL is having to look at “ALL” the records. Whenever you see a type of ALL and/or possible_keys is empty it should be a red flag telling you to index something.

If we had followed tips 1 and 2 this explain plan would be much different, showing reference types of ref and eq_ref which is typically a good thing.

For more information see the README file here: http://agilewebdevelopment.com/plugins/query_analyzer.

Posted in  | Tags , , , , , ,  | 2 comments | 1 trackback

Older posts: 1 2 3 4