There's this thing they say about Ruby - everything is an object. It's true, with very few exceptions, one of them being the block. Well guess what, this little gem of an inconsistency came back to bite me when I was trying to do something involving dynamic redefinition of methods.
The context: I recently wrote a little method decorator to help me figure out the execution times of the methods in a class. Nothing complicated - for a given class, alias each method, then redefine it; the new method invokes the original method while measuring the execution time. Here's a pseudocode-ish example to clarify:
define_method method do |*args|
  t = Time.now
  result = self.send(aliased_original_method, *args)
  diff = Time.now - t
  puts "#{klass}##{method} took #{diff} s" if diff > 0
  return result
end
You may have already noticed that the pseudocode above doesn't handle methods which accept blocks - if I tried to decorate Array, then Array#each would fail to execute. My actual solution did handle this, and I'll publish that in another post; maybe others will find it useful.
Anyways, this didn't take long, but once I was done, I was intrigued by the notion of a generic method decorator. It would be pretty cool if I could include my Decorator module into a class, pass it an arbitrary block to do any of those AOP-ish things like logging or, as I said, measuring execution times, and have all the methods decorated by that block. All the decorator block should have to do to execute the original method would be to yield.
So this led me to try to figure out the whole deal with blocks. Simply put, there are two ways to handle blocks as parameters - implicitly and explicitly.
Implicitly passing and invoking blocks
This is the usual way in which blocks are passed to methods. Here's what it looks like:
def foo(*args)
  yield(args.join(' '))
end
foo('Sidu', 'Ponnappa'){|name| puts "Hello #{name}"} # => "Hello Sidu Ponnappa"
*args allows us to handle an arbitrary number of parameters - they're made available inside the method as an array, where we join them and pass them to our block via yield.
The block is passed to the method by enclosing it in curly braces and placing it after the method invocation. Only one block can be passed to a method in this manner.
Most importantly, the block is never bound, and so is not available as an object. It is implicitly invoked by calling yield within the method.
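Because the block is implicit, the method signature does not say whether one was passed. A minimal sketch of guarding the yield with Ruby's built-in block_given? follows; the method name greet is hypothetical and not part of the original examples.

```ruby
# Hypothetical helper: yields the joined arguments if a block was passed,
# otherwise returns the joined string unchanged.
def greet(*args)
  name = args.join(' ')
  return name unless block_given? # no block: skip the yield entirely
  yield(name)
end

with_block    = greet('Sidu', 'Ponnappa') { |name| "Hello #{name}" }
without_block = greet('Sidu', 'Ponnappa')

puts with_block    # Hello Sidu Ponnappa
puts without_block # Sidu Ponnappa
```

Without the guard, the second call would raise a LocalJumpError, since yield would have nothing to invoke.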
Explicitly passing, binding and invoking blocks
We go this route if we want a handle to the block. Here's a code example - it's similar to the one above, but we bind to the block and then invoke it explicitly.
def foo(*args, &blk)
  blk.call(args.join(' '))
end
foo('Sidu', 'Ponnappa'){|name| puts "Hello #{name}"} # => "Hello Sidu Ponnappa"
The & binds the block to the variable blk, making it available as a Proc object.
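To see what the & parameter actually holds, here is a small sketch. The method name capture is made up for illustration; it simply returns whatever the block was bound to.

```ruby
# Hypothetical helper that returns whatever the & parameter was bound to.
def capture(&blk)
  blk
end

bound   = capture { |name| "Hello #{name}" }
unbound = capture # called without a block

puts bound.class                 # Proc
puts bound.call('Sidu Ponnappa') # Hello Sidu Ponnappa
puts unbound.nil?                # true: no block was passed, so blk is nil
```

Since the variable is nil when no block is given, calling blk.call unconditionally would fail with a NoMethodError in that case.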
An even more explicit style involves first binding a variable to the block and then passing it to the method as an argument (as opposed to using & and having Ruby do it automagically). This style is often used when doing functional programming - Reg Braithwaite has a beautiful article covering this style of programming in Ruby.
Anyways, here's the example:
def foo(*args)
  # We know that the last argument is the bound block
  blk = args.delete_at(-1)
  blk.call(args.join(' '))
end
the_block = lambda {|name| puts "Hello #{name}"}
foo('Sidu', 'Ponnappa', the_block) # => "Hello Sidu Ponnappa"
As you can see, we bind the block to the_block using the built-in Ruby method lambda and pass it as a regular argument. No magic like the previous examples - the block (now a Proc object) is treated like any other object would be. This, to my eyes, is the most consistent way to use blocks (everything should be an object). It has a significant disadvantage, however, as we'll see in the next section.
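For completeness, a Proc object can also be handed back to a method that expects an implicit block by prefixing it with & at the call site. This is a sketch reusing the foo and the_block names from the examples above, with the puts dropped so the result can be inspected.

```ruby
# Same implicit-yield method as in the first example.
def foo(*args)
  yield(args.join(' '))
end

# Bind the block to a variable first...
the_block = lambda { |name| "Hello #{name}" }

# ...then pass it with & so the method receives it as its block.
result = foo('Sidu', 'Ponnappa', &the_block)
puts result # Hello Sidu Ponnappa
```

Here the method itself stays unaware that the block was ever an object; only the caller deals with the Proc.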
The difference - implicit invocation is much faster
The reason why there are two approaches is simple - performance. Binding a block takes time, so we try to avoid it by going the implicit invocation route. Let's get a handle on the actual differences in performance, though, by benchmarking the examples above (modified slightly to avoid 100000 calls to puts). I've renamed the three different example methods to foo, bar and ooga respectively.
require 'benchmark'

# Implicit
def foo(*args)
  yield(args.join(' '))
end
puts foo('Sidu', 'Ponnappa'){|name| "Hello #{name}"} # => "Hello Sidu Ponnappa"

# Explicitly binds block when passed
def bar(*args, &block)
  block.call(args.join(' '))
end
puts bar('Sidu', 'Ponnappa'){|name| "Hello #{name}"} # => "Hello Sidu Ponnappa"

# Explicitly binds block before passing
def ooga(*args)
  blk = args.delete_at(-1)
  blk.call(args.join(' '))
end
the_block = lambda {|name| "Hello #{name}"}
puts ooga('Sidu', 'Ponnappa', the_block) # => "Hello Sidu Ponnappa"

puts "Starting benchmark"
n = 100000
Benchmark.bmbm(10) do |rpt|
  rpt.report("foo") do
    n.times {foo('Sidu', 'Ponnappa'){|name| "Hello #{name}"}}
  end
  rpt.report("bar") do
    n.times {bar('Sidu', 'Ponnappa'){|name| "Hello #{name}"}}
  end
  rpt.report("ooga") do
    n.times {
      the_block = lambda {|name| "Hello #{name}"}
      ooga('Sidu', 'Ponnappa', the_block)
    }
  end
end
Output:
Hello Sidu Ponnappa
Hello Sidu Ponnappa
Hello Sidu Ponnappa
Starting benchmark
Rehearsal ---------------------------------------------
foo         0.781000   0.000000   0.781000 (  0.782000)
bar         1.406000   0.000000   1.406000 (  1.406000)
ooga        1.438000   0.016000   1.454000 (  1.453000)
------------------------------------ total: 3.641000sec
                user     system      total        real
foo         0.782000   0.000000   0.782000 (  0.781000)
bar         1.375000   0.015000   1.390000 (  1.406000)
ooga        1.453000   0.032000   1.485000 (  1.485000)
As you can see, bar, which uses an explicit invocation, is approximately 75% slower than foo. ooga, where the block is bound right at the beginning and passed as a parameter, is the slowest.
TANSTAAFL, I guess.
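The relative slowdown can be worked out from the 'total' column of the second (non-rehearsal) run. The snippet below is only that arithmetic, using the figures copied from the output above; the variable names are illustrative.

```ruby
# Totals (in seconds) copied from the second, non-rehearsal run above.
totals = { 'foo' => 0.782, 'bar' => 1.390, 'ooga' => 1.485 }
baseline = totals['foo']

# Percentage by which each method is slower than the implicit version.
slowdown = {}
totals.each do |name, seconds|
  slowdown[name] = ((seconds / baseline - 1) * 100).round(1)
end

slowdown.each { |name, pct| puts "#{name}: #{pct}% slower than foo" }
```

On these numbers bar comes out at roughly 78% slower and ooga at roughly 90% slower, which is in line with the 'approximately 75%' figure quoted above.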
This trick of benchmarking is borrowed from Joel VanderWerf, who posted a similar benchmark involving all permutations of implicit and explicit invocations over at the Ruby forum.
The catch - implicitly invoking a block from within another block does not work
As a direct consequence of this performance benefit, most of the Ruby code I've seen takes the implicit route. Unfortunately, it is not possible to dynamically redefine methods which expect blocks as implicit parameters and have them continue to behave as before. I know that sounds weird, but read on to the example and all shall be made clear. Hah, always wanted to say that. Ahem.
Getting back to the point, if you dynamically define a method using define_method, the method body is passed to it as a block. You cannot pass a block to this dynamically defined method implicitly - at least not that I could find. If there is a way, please let me know - it would help me get a lot of stuff done neatly. In the meantime, here's an example demonstrating this inconsistent behaviour.
class SandBox
  def abc(*args)
    yield(*args)
  end

  define_method :xyz do |*args|
    yield(*args)
  end
end

SandBox.new.abc(1,2,3){|*args| p args} # => [1, 2, 3]
SandBox.new.xyz(4,5,6){|*args| p args} # => no block given (LocalJumpError)
SandBox.new.method(:abc).call(1,2,3){|*args| p args} # => [1, 2, 3]
SandBox.new.method(:xyz).call(4,5,6){|*args| p args} # => no block given (LocalJumpError)
The calls to abc succeed, but those to xyz raise a LocalJumpError. There seems to be some fundamental difference in the methods created by def and define_method, with the latter being unable to handle implicitly passed blocks. Here's something else which I tried, which didn't work either:
lmbda = lambda{|*args| yield(*args)}
prc = Proc.new{|*args| yield(*args)}
lmbda.call(7, 8, 9){|*args| p args} # => no block given (LocalJumpError)
prc.call(10,11,12){|*args| p args} # => no block given (LocalJumpError)
Note that while lambda and Proc.new both bind a block, creating a Proc object, lambda causes the bound block to behave more like a method. It also has some differences in the scope available to the bound block. Proc.new is mildly deprecated in favour of lambda.
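Two of the better-known ways in which a lambda behaves 'more like a method' are argument checking and the meaning of return. The sketch below illustrates both; the names strict, lenient, return_from_lambda and return_from_proc are made up for this example.

```ruby
strict  = lambda   { |a, b| [a, b] }
lenient = Proc.new { |a, b| [a, b] }

# Proc.new fills missing arguments with nil...
lenient_result = lenient.call(1) # [1, nil]

# ...whereas a lambda checks its argument count, like a method does.
lambda_raised = begin
  strict.call(1)
  false
rescue ArgumentError
  true
end

# return inside a lambda only leaves the lambda itself.
def return_from_lambda
  lambda { return :from_lambda }.call
  :after_lambda
end

# return inside a Proc.new block leaves the enclosing method.
def return_from_proc
  Proc.new { return :from_proc }.call
  :after_proc # never reached
end

p lenient_result     # [1, nil]
p lambda_raised      # true
p return_from_lambda # :after_lambda
p return_from_proc   # :from_proc
```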
To Summarise
- Blocks violate the 'everything is an object' rule in Ruby for performance reasons. They only become objects when bound to a variable.
- Implicit invocation of a block using yield is much faster than alternatives involving binding the block to a variable.
- Most Ruby code uses implicit block passing to avoid binding blocks.
- Blocks cannot themselves accept a block as an implicit parameter (rather, I couldn't find any way to do this - suggestions welcome).
- If you define a method using define_method, the method body is passed in as a block. This new method cannot itself make use of yield to invoke an unbound block passed to it implicitly.
- This is inconsistent behaviour, which, if I haven't missed something, kinda sucks.
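One possible way around the define_method limitation, offered as a sketch and not as the solution this post was looking for: generate the method from a string with class_eval. A string-evaluated def produces an ordinary method, so yield works inside it. The method_name variable below is an illustrative stand-in for a name computed at runtime.

```ruby
class SandBox
  # Illustrative stand-in for a method name that is only known at runtime.
  method_name = :xyz

  # Evaluating a string containing `def` creates a regular method,
  # so the generated method can yield to an implicitly passed block.
  class_eval <<-RUBY
    def #{method_name}(*args)
      yield(*args)
    end
  RUBY
end

result = SandBox.new.xyz(4, 5, 6) { |*args| args }
p result # [4, 5, 6]
```

The trade-off is that the method body is now a string, so it can no longer close over local variables the way a define_method block can.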
While searching for a solution to my problem, I came across Paul Cantrell's exhaustive documentation of the different flavours of blocks/closures in Ruby, as well as their little eccentricities. It's well worth a read.
Update 2007-11-27:
As an anonymous commenter pointed out, Ruby 1.9 will indeed fix this inconsistency. The details can be found here.
You may also want to read: Ruby blocks redux: Ruby 1.9.0, Ruby 1.8.6 and JRuby 1.0.3, which was posted after the release of 1.9.0
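On Ruby 1.9 and later, a block passed to define_method may itself declare a & parameter, which makes the timing decorator from the start of this post workable for methods that take blocks. What follows is a rough sketch under that assumption. The names Timed, timed, Greeter and the _without_timing suffix are hypothetical and are not the actual decorator described above.

```ruby
# Hypothetical module: wraps the named methods with a timing report.
module Timed
  def timed(*method_names)
    method_names.each do |name|
      aliased = "#{name}_without_timing" # hypothetical alias naming scheme
      alias_method aliased, name

      # Requires Ruby 1.9+: the block accepts its own &blk parameter.
      define_method(name) do |*args, &blk|
        started = Time.now
        result = send(aliased, *args, &blk) # forward the caller's block
        elapsed = Time.now - started
        puts "#{self.class}##{name} took #{elapsed} s"
        result
      end
    end
  end
end

class Greeter
  extend Timed

  def greet(*names)
    yield(names.join(' '))
  end

  timed :greet
end

greeting = Greeter.new.greet('Sidu', 'Ponnappa') { |name| "Hello #{name}" }
puts greeting # Hello Sidu Ponnappa
```

The redefined method still cannot use a bare yield; it has to capture the block as blk and pass it along explicitly, which brings back the binding cost measured in the benchmark.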