Ruby's new as a factory

Ruby's new is often described as being the perfect implementation of a factory method. In C++/Java/C#, you're forced to do something like User.build() or User.create() instead of new User() because there's no way you can change the way the new keyword behaves. In Ruby on the other hand, new is simply a class method on User and can be arbitrarily overwritten. Note that I'm saying overwritten, not overridden - I don't mean override in a sub-class but actually overwrite - replace - a method. You can overwrite User.new to do just about anything - typical uses would be to implement object pooling, the Singleton pattern (overwrite new to return the same instance every time), stuff like that. What's just as important is that you're sticking to the convention (new() instead of an arbitrary choice like create()) which makes life easier for everyone because it's natural and transparent to the consumers of your classes.

Ola coincidentally happened to cover the same topic a couple of days ago when talking about Steve Yegge's post on code bloat, so I'll just link to his post and skip the introduction. Look to the second half for a description of how to use new as a factory. So, let's move on to the example which got me interested in this. I've constructed a sample problem which has roughly the same structure as what I was working with - if some bits of it look rather contrived, it's because they are ;-).

The problem is this - I have a base Operation class which has some state and some logic. There are two sub-classes of Operation, Add and Multiply. Here is the code for these classes - take a moment to look them over.
class Operation
def initialize(a, b)
@a = a
@b = b
end

def to_s
"#{self.class}(#{@a}, #{@b})"
end
end

class Add < Operation
def do
@a+@b
end
end

class Multiply < Operation
def do
@a*@b
end
end
The state is represented by @a and @b and the logic, such as it is, by to_s().

Input in the form of strings like add 2 5 and multiply 3 7. These strings need to be parsed and the appropriate sub-class of Operator constructed with the numbers as its state. Operator is however never instantiated because it doesn't make sense to do so - a perfect candidate for an abstract class if such a thing existed in Ruby. The sub-classes of Operator expose a standard interface in the form of the do() method which is responsible for returning the result of that operation on the numbers it contains. Yup, you're right, what you're seeing is the command pattern.

There is a controller class (yes, all right, I admit it was a Rails app which spawned this post) which handles the bit which involves receiving commands and constructing command objects from them. It looks something like this:
class Controller
def execute(commands)
operations = build_operations(commands)
operations.each{|operation| puts "#{operation}: #{operation.do()}"}
end

private

# Iterate over parsed commands and use them to construct
# appropriately initialised operation objects
def build_operations(commands)
parse_operations_and_values(commands).collect{|operation, a, b|
Kernel.const_get(operation).new(a.to_i, b.to_i)
}
end

# Iterate over a collection of commands and extract an
# operation and the values on which it operates from each command
# ['Add 2 5', 'Multiply 3 7'] when parsed returns
# [['Add', '2', '5'], ['Multiply', '3', '7']]
def parse_operations_and_values(commands)
commands.collect{|command| command.split}
end
end
If you're wondering about all the Array magic in build_operations(), remember that Ruby automatically decomposes Arrays, so if I do
operation, a, b = ['Add', '2', '5']
Ruby figures out that 'Add' goes into operation, '2' into a and '5' into b. This nifty ability (called destructuring assignment) also allows us to make it look like we're returning more than one value from a method when we're actually returning a collection and having Ruby assign elements from it automatically.

const_get() returns the value of the named constant passed to it. When invoked on Kernel (or Object) it ends up returning the class of that name. So Kernel.const_get('Add') returns the class Add (remember that classes are also objects in Ruby).

Let's try executing the lot like so:
commands = ['Add 2 5', 'Multiply 3 7']
Controller.new.execute(commands)
The output looks like this:
Add(2, 5): 7
Multiply(3, 7): 21


As you've realised, this is an excellent candidate for a factory - most of the code in Controller can be moved into Operation so that the Controller is no longer involved in the details of parsing commands and building Operations. But instead of simply adding a create() method to Operation, what I'd really like to be able to do is something like commands.collect{|command| Operation.new(command)} and get a neat little collection of Adds and Multiplys. The whole thing is completely transparent to the consumer who never really cared about whether the objects were Adds or Multiplys so long as they exposed the do() interface. Let's try to work toward this form of Operation.

If you've read Ola's post then you already know that the default implementation of new() looks something like this:
def self.new(*args, &block)
obj = self.allocate
obj.send :initialize, *args, &block
obj
end
Of course, this doesn't work for us because we don't want to ever instantiate Operation. What we want is for Operation to look like this:
class Operation
def self.new(command)
operation, a, b = parse(command)
operation_class = Kernel.const_get(operation)
operation_class.new(a.to_i, b.to_i)
end

def self.parse(command)
command.split
end

def initialize(a, b)
@a = a
@b = b
end

def to_s
"#{self.class}(#{@a}, #{@b})"
end
end
The catch with this implementation is that overwriting new() modifies it even for the Add and Multiply sub-classes. Since the signatures of Operation's and Add/Multiply's constructors are different, this piece of code is dead in the water - not something we wanted. We could of course re-implement new() in both sub-classes to get around this, but that's tedious, repetitive and plain ugly.

So the trick here is to alias (or copy) the original new() in Operation before overwriting it. Now that we have a copy of new(), we use the inherited() object life-cycle hook to listen for points in the code where Operation is subclassed. When we detect that some class is inheriting from Operation, we simply replace the modified new() with the original.

See for yourself. This is the completed solution, so you should be able to simply copy it and run it.
class Operation
class << self
alias :__new__ :new

def inherited(subclass)
puts "#{subclass} has inherited #{self}"
class << subclass
alias :new :__new__
end
end
end

def self.new(command)
operation, a, b = parse(command)
operation_class = Kernel.const_get(operation)
operation_class.new(a.to_i, b.to_i)
end

def self.parse(command)
command.split
end

def initialize(a, b)
@a = a
@b = b
end

def to_s
"#{self.class}(#{@a}, #{@b})"
end
end

class Add < Operation
def do
@a+@b
end
end

class Multiply < Operation
def do
@a*@b
end
end


class Controller
def execute(commands)
commands.collect{|command|
Operation.new(command)
}.each{|operation|
puts "#{operation}: #{operation.do()}"
}
end
end


commands = ['Add 2 5', 'Multiply 3 7']
Controller.new.execute(commands)
The extra magic can be seen right at the beginning of Operation where we alias/copy the new method into the __new__ method. When we detect an inherited() event, we simply reverse the aliasing.

Running this produces the following output:
Add has inherited Operation
Multiply has inherited Operation
Add(2, 5): 7
Multiply(3, 7): 21


Controllers should always act as routers between the UI and the domain layer and contain as little as possible of the business logic. As you can see, the changes we have made has slimmed Controller down considerably, so that's one benefit right away. Also, consumers of Operation now deal with a much simpler (and non-arbitrary) interface when building Operations from commands.

5 comments:

Rich M said...

"In C++/Java/C#, you're forced to do something like User.build() or User.create() instead of new User() because there's no way you can change the way the new keyword behaves."

Sorry mate, you're misinformed. You can do exactly that in Dot Net.

Unknown said...

Thanks for cross-posting from DZone, Richard. Here's my response, again cross-posted.

You've been confused by the constructor in VB.Net. In VB.Net both the constructor and the object initializer (for lack of a better word) have the same name, 'New'. However, if you look up the examples in the link you posted, there is no way to modify the process of creating an instance of SingletonForm, only how it is initialised after it has been created. You have no control over allocating memory for it, or over the type of object created (it will always be a SingletonForm), whereas my post demonstrates how you can over-write Operation.new to return instances of Add and Multiply.

A clear indicator of these limitations is the need to mark the New() constructor as protected (typical Singleton implementation even in Java and C#). Another is that the New() method is an instance method (if it's an instance method, then someone else has already created the instance). A third is that you can never do 'New SingletonForm()' and have it return a SingletonApplicationForm, if you get my drift.

Unknown said...

Oh, and if you read that post on singletons you've linked to, you'll see that once he's done he can no longer say

Dim myForm As SingletonForm = New SingletonForm()

but he's instead forced to say

Dim myForm As SingletonForm = SingletonForm.GetInstance()

Anonymous said...

really cool stuff, i was looking for something like this. thanks for this post.
/linh

Ethan said...

Couldn't you avoid aliasing around Class#new by using #allocate? Something like:


class Operation
  def self.new(command)
    operation, a, b = parse(command)
    operation_class = Kernel.const_get(operation)
    operation_instance=operation_class.allocate
    operation_instance.send(:initialize, a.to_i, b.to_i)
    operation_instance
  end


That does have the downside of basically reimplementing Class#new in your own new method, but it's only a couple of lines.