Sunday, August 19, 2007

Ruby require idiom

Ruby's Kernel#require(filename) reads and executes the specified file only once, no matter how many times we call it, as opposed to Kernel#load(filename), which reloads the file on every call. In an application with many files that depend on one another, a single file would otherwise be read and executed multiple times, which is undesirable from both a functional and a performance perspective. Kernel#require(filename) has an issue, however: after loading, it puts the path of the required file in the global array $" and cannot tell when we refer to the same file using a different path, e.g.
#file : lib/foo.rb
puts "loading #{__FILE__}"

#file : lib/bar.rb
require File.dirname(__FILE__) + '/foo'
puts "loading #{__FILE__}"

#file : app/runme.rb

require File.dirname(__FILE__) +'/../lib/foo'
require File.dirname(__FILE__) +'/../lib/bar'
puts $"
Then run
ruby app/runme.rb

loading ./app/../lib/foo.rb
loading ./app/../lib/foo.rb
loading ./app/../lib/bar.rb
["app/../lib/foo.rb", "./app/../lib/foo.rb", "app/../lib/bar.rb"]
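The duplicate load above comes down to comparing paths as plain strings. The following is a minimal sketch of that idea only; `naive_seen?` is a hypothetical helper written for illustration and is not Ruby's actual implementation of require.

```ruby
# Hypothetical model of "already loaded?" bookkeeping that compares raw strings.
# This is an illustration, not how Kernel#require is really implemented.
def naive_seen?(loaded, path)
  return true if loaded.include?(path)  # exact string match only
  loaded << path                        # remember the spelling we were given
  false
end

loaded = []
first  = naive_seen?(loaded, 'app/../lib/foo.rb')
second = naive_seen?(loaded, './app/../lib/foo.rb')  # same file, different spelling
third  = naive_seen?(loaded, 'app/../lib/foo.rb')    # identical spelling

puts first   # false: not seen before
puts second  # false: treated as a new file because the string differs
puts third   # true: only an identical string is recognised
puts loaded.inspect
```

Two spellings of the same file end up as two entries, which mirrors the two foo.rb entries printed by puts $" above.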
There are basically a few well-known techniques to deal with this problem:
1. Using absolute path
2. Modifying $LOAD_PATH
3. Using defined?

USING ABSOLUTE PATH
In this variant we always call Kernel#require with an absolute path; File.expand_path is used to remove the '..' segments that represent the parent directory, e.g.
#file : lib/bar.rb
require File.expand_path(File.dirname(__FILE__)) + '/foo'
puts "loading #{__FILE__}"

#file : app/runme.rb
require File.expand_path(File.dirname(__FILE__)+'/../lib')+'/foo'
require File.expand_path(File.dirname(__FILE__)+'/../lib')+'/bar'
puts $"
run
ruby app/runme.rb

loading D:/huy/rubyapp/require_1/lib/foo.rb
loading D:/huy/rubyapp/require_1/lib/bar.rb
["D:/huy/rubyapp/require_1/lib/foo.rb", "D:/huy/rubyapp/require_1/lib/bar.rb"]
This method is described in the post "ruby require idiom".

MODIFYING $LOAD_PATH
The second quite popular technique is to modify $LOAD_PATH directly e.g.
#file : lib/bar.rb

# bar.rb already lives in lib/, so its own directory is the library path
libpath=File.expand_path(File.dirname(__FILE__))
$LOAD_PATH.unshift(libpath) unless $LOAD_PATH.first==libpath

require 'foo'
puts "loading #{__FILE__}"

#file : app/runme.rb

# runme.rb lives in app/, so lib/ is a sibling of its directory
libpath=File.expand_path(File.dirname(__FILE__)+'/../lib')
$LOAD_PATH.unshift(libpath) unless $LOAD_PATH.first==libpath

require 'foo'
require 'bar'
puts $"
In both of the techniques above, File.expand_path is used to get the absolute path of a file or directory, so that the same file or directory is stored only once in the relevant global variable ($" or $LOAD_PATH).
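As a rough illustration of why File.expand_path helps, the sketch below normalizes several spellings of the same path and then applies a guard to a plain array standing in for $LOAD_PATH. The base directory /tmp/require_demo is a made-up example path; nothing needs to exist on disk, because File.expand_path works on the string alone.

```ruby
base = '/tmp/require_demo'  # hypothetical project root, used only for illustration

# Three spellings that all point at the same file.
spellings = [
  base + '/app/../lib/foo',
  base + '/lib/foo',
  base + '/lib/../lib/./foo'
]

# expand_path resolves '.' and '..', so every spelling becomes the same string.
normalized = spellings.map { |path| File.expand_path(path) }
puts normalized.uniq.size

# A plain array stands in for $LOAD_PATH so the demo does not touch the real one.
load_path = []
libdir = File.expand_path(base + '/app/../lib')
2.times { load_path.unshift(libdir) unless load_path.include?(libdir) }
puts load_path.size
```

The guard here uses include? rather than comparing against the first element, which is a small variation on the listing above: it still avoids a duplicate when some other directory has been put in front in the meantime.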

USING defined?
This technique is very old and is frequently used by C programmers to guard a header file against being included multiple times, e.g.
#file foo.h

#ifndef __FOO_H

#define __FOO_H

...

#endif
In Ruby, I have seen this technique applied using the defined? keyword, e.g.

#file : lib/foo.rb
unless defined?(FooDefined)
   FooDefined=true
   puts "loading #{__FILE__}"
end

#file : lib/bar.rb
unless defined?(BarDefined)
  BarDefined=true
  require File.dirname(__FILE__) + '/foo'
  puts "loading #{__FILE__}"
end

#file : app/runme.rb

require File.dirname(__FILE__)+'/../lib/foo'
require File.dirname(__FILE__)+'/../lib/bar'
puts $"
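To check that such a guard really protects the body, one option is to load a guarded file twice with Kernel#load, which re-reads the file on every call. This is a small sketch under assumptions: the constant GuardedDemoDefined, the counter $guarded_runs and the file name guarded.rb are all invented for the demo.

```ruby
require 'tmpdir'

$guarded_runs = 0  # demo-only counter, incremented by the guarded file body

Dir.mktmpdir do |dir|
  file = File.join(dir, 'guarded.rb')

  # Write a file that uses the same guard pattern as lib/foo.rb above.
  File.write(file, <<~RUBY)
    unless defined?(GuardedDemoDefined)
      GuardedDemoDefined = true
      $guarded_runs += 1
    end
  RUBY

  # load executes the file each time, so only the guard can stop a second run.
  load file
  load file
end

puts $guarded_runs
```

The drawback of this approach is that every file needs its own uniquely named constant, and the whole file body has to sit inside the unless block.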
UPDATE on 26-12-2007
The new Kernel#require in Ruby 1.9 stores the full path in $", which makes this article obsolete.
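On Ruby 1.9 or later this can be checked with a short script that requires the same file through two different spellings. The sketch below builds a throwaway directory layout; the file name require_idiom_demo_foo.rb and the counter $foo_runs are made up for the demo.

```ruby
require 'tmpdir'

$foo_runs = 0  # demo-only counter, incremented each time the file body runs
first = second = nil

Dir.mktmpdir do |tmp|
  dir = File.realpath(tmp)  # resolve symlinks so both spellings share one root
  Dir.mkdir(File.join(dir, 'lib'))
  Dir.mkdir(File.join(dir, 'app'))
  File.write(File.join(dir, 'lib', 'require_idiom_demo_foo.rb'), "$foo_runs += 1\n")

  # Same file, two spellings: one goes through app/.., the other is direct.
  first  = require(dir + '/app/../lib/require_idiom_demo_foo')
  second = require(dir + '/lib/require_idiom_demo_foo')
end

puts first      # true when the file was loaded by this call
puts second     # false when require considers it already loaded
puts $foo_runs
```

If the second call returns false and the counter stays at 1, the two spellings were recognised as the same file without any of the workarounds described above.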