Ruby on Rails
TextSearch (Version #91)

What does it do?

It’s pretty common to want to search for a string across multiple fields in a table, or even across multiple joined tables. The ActiveRecord model already knows how the tables are joined together, so can we take advantage of that information to write a search method? Yes! Here’s what the proposed solution looks like:


require_dependency "search"

class Movie < ActiveRecord::Base
   has_and_belongs_to_many :genres
   searches_on :all
end

Note that :all is not a field, but a directive indicating that all text/varchar fields in my ‘movies’ table will be searched. I could have explicitly told it to search on :title and :description like this:


class Movie < ActiveRecord::Base
   has_and_belongs_to_many :genres
   searches_on :title, :description
end

Now, all future calls to Movie.search will only search the “title” and “description” fields.

Further Usage Examples

In the following examples, all calls to the search method return an array of Movie objects.


# Make a case insensitive search for "star" within all text (or
varchar) fields of "movies"
  @movies = Movie.search "star"  

# Same, but case sensitive
  @movies = Movie.search "star", :case => :sensitive

# Search all text (or varchar) fields within Movie
# and Genre, according to the associations indicated
# by those models' classes.  For example, if genre 
# has_many movies and Genre has a "name" and
# a "description" field, then this search will return
# all movies containing "star" as well as all movies
# belonging to any genre whose description or name
# contains "star":
  @movies = Movie.search "star", :include => [:genre]  

# The search above could also be flipped around if 
# you want to ask the question "What genres 
# contain movies whose titles or descriptions contain
# 'star'?"
  @genres = Genre.search "star", :include => [:movie], :only =>
["movies.title", "movies.description"]

# Search for "star" within a restricted set of 
# text/varchar fields:
  @movies = Movie.search "star", :only => ["title", "description"]
  @movies = Movie.search "star", :except => ["title"]

# Search for "star" with certain other conditions, 
# such as within a particular genre:
  @movies = Movie.search "star", :conditions => "genre_id = 1"

# Search for "star" with other conditions in a
# joined table:
  @movies = Movie.search "star", :join_include => [:genre],
:conditions => "genres.hidden = 0"

The Search Library’s Code

Add the following code to a file called “search.rb” in your application’s “lib” directory:


# This is an add-on to the ActiveRecord::Base class.  It allows simple searching to be
# accomplished by using, for example, @movies = Movie.search("text")
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end
    
    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?
      
      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []
      string_columns = klass.columns.select { |c| c.type == :text or c.type == :string }
      fields = string_columns.collect { |c| klass.table_name + "." + c.name }

      if not tables.empty?
        tables.each do |table|
          klass = eval table.to_s.classify
          fields += searchable_fields([], klass)
        end
      end
      
      return fields
    end

    # Search the movie database for the given parameters:
    #   text = a string to search for
    #   :only => an array of fields in which to search for the text;
    #     default is 'all text or string columns'
    #   :except => an array of fields to exclude from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(text = nil, options = {})
      options.assert_valid_keys(:only, :except, :case, :include,
                        :join_include, :conditions, :offset, :limit, :order)
      case_insensitive = true unless options[:case] == :sensitive
      
      # The fields to search (default is all text fields)
      fields = options[:only] || searchable_fields(options[:include])
      fields -= options[:except] if not options[:except].nil?

      # Now build the SQL for the search if there is text to search for
      condition_list = []
      unless text.nil?
        text_condition = if case_insensitive
          fields.collect { |f| "UCASE(#{f}) LIKE #{sanitize('%'+text.upcase+'%')}" }.join " OR "
        else
          fields.collect { |f| "#{f} LIKE #{sanitize('%'+text+'%')}" }.join " OR "
        end

        # Add the text search term's SQL to the conditions string unless
        # the text was nil to begin with.
        condition_list << "(" + text_condition + ")"
      end
      condition_list << "#{options[:conditions]}" if options[:conditions]
      conditions = condition_list.join " AND "
      conditions = nil if conditions.empty?

      includes = (options[:include] || []) + (options[:join_include] || [])
      includes = nil if includes.size == 0
      
      find :all, :include => includes, :conditions => conditions,
           :offset => options[:offset], :limit => options[:limit], :order => options[:order]
    end
  end
end

NOTE If your text editor doesn’t support fancy quotes like “these”, it may replace them with dots or question marks if you copy/paste the above code. Look carefully.

Author

You can contact Duane Johnson if you have suggestions, modifications or questions regarding this code contribution.

See Also

—-
This will not work on databases where UCASE is not supported, instead change UCASE to UPPER and you can then use this on databases like sqlite3 or postgresql.
Jason Toy

You can handle multiple words (using an AND until someone refactors some more) by making the following modification:


...
      def self.search(keywords = [], options = {})
        keywords = keywords.split if keywords.is_a? String

...

        unless keywords.nil?
          keywords.each do |text|
            text_condition = if case_insensitive
              fields.collect do |f|
                "UCASE(#{f}) LIKE '%#{text.upcase}%'" 
              end.join " OR "
            else
              fields.collect { |f| "#{f} LIKE '%#{text}%'" }.join " OR "
            end

            # Add the text search term's SQL to the conditions string unless
            # the text was nil to begin with.
            condition_list << "(" + text_condition + ")" unless text_condition.nil?
            end
          end
...

Michel Valdrighi?
note: fixed in above code: keywords.split! (doesn’t exist) keywords.empty? to keywords.nil?. added unless text_condition.nil? to end of snippet

i have a problem with this lib. I got this error:

wrong number of arguments (1 for 0)

maybe need a update for the Rails version? :)
Thnx
David jgc?

I was getting the same “wrong number of arguments (1 for 0)” error using MySQL and searches_on :all. However, I got it to work by specifying the columns to use:

searches_on :name, :description

it appears the wiki is removing double equals in this line

string_columns = klass.columns.select { |c| c.type == :text or c.type == :string }
which is causing people to get the “wrong number of arguments” errors it should be
string_columns = klass.columns.select { |c| c.type == :text or c.type \=\= :string }
without the backslashes

Here’s a revision that adds soundex support:


# This is an add-on to the ActiveRecord::Base class.  It allows simple searching to be
# accomplished by using, for example, @movies = Movie.search("text")
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end

    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?

      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []
      string_columns = klass.columns.select { |c| c.type == :text or c.type  :string }
      fields = string_columns.collect { |c| klass.table_name + "." + c.name }

      if not tables.empty?
        tables.each do |table|
          klass = eval table.to_s.classify
          fields += searchable_fields([], klass)
        end
      end

      return fields
    end

    # Search the movie database for the given parameters:
    #   text = a string to search for
    #   :only => an array of fields in which to search for the text;
    #     default is 'all text or string columns'
    #   :except => an array of fields to exclude from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :soundex => true or false
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(keywords = [], options = {})
      options.assert_valid_keys(:only, :except, :case, :include, :soundex,
                        :join_include, :conditions, :offset, :limit, :order)
      case_insensitive = true unless options[:case]  :sensitive
      soundex = true unless options[:soundex]  false

      # The fields to search (default is all text fields)
      fields = options[:only] || searchable_fields(options[:include])
      fields -= options[:except] if not options[:except].nil?

      # Now build the SQL for the search if there is text to search for
      condition_list = []
      unless keywords.nil?
        keywords.each do |text|
          text_condition = if soundex and case_insensitive
            fields.collect { |f| ”#{f} SOUNDS LIKE #{sanitize(text)} OR UPPER LIKE #{sanitize(’text.upcase’)}” }.join ” OR ” 
          elsif soundex
            fields.collect { |f| ”#{f} SOUNDS LIKE #{sanitize(text)}” }.join ” OR ” 
          elsif case_insensitive
            fields.collect { |f| “UPPER LIKE #{sanitize(’text.upcase’)}” }.join ” OR ” 
          else
            fields.collect { |f| ”#{f} LIKE #{sanitize(’text’)}” }.join ” OR ” 
          end

          # Add the text search term’s SQL to the conditions string unless
          # the text was nil to begin with.
          condition_list << “(”  text_condition  “)” 
        end
      end
      condition_list << ”#{options[:conditions]}” if options[:conditions]
      conditions = condition_list.join ” AND ” 
      conditions = nil if conditions.empty?

      includes = (options[:include] || []) + (options[:join_include] || [])
      includes = nil if includes.size == 0

      find :all, :include => includes, :conditions => conditions,
           :offset => options[:offset], :limit => options[:limit], :order => options[:order]
    end
  end
end


I’d like to get this to support bridging fields, too. (Yes, I’m making that term up.) For example, I’d like to search first_name and surname together as one field. Anyone have a patch for that?

You might want to look into UnderstandingAggregation

If you are using MySQL, you can modify it to use the in built-in full text index (as long as it is already in the db) with


condition_list &lt;&lt; "(" + "MATCH(#{fields.collect.join(",")}) AGAINST (#{sanitize(text)} IN BOOLEAN MODE)" + ")"

adding the index can be done with SQL like this

ALTER TABLE `dbname`.`table` ADD INDEX `idx_fulltext`(`field1`, `feild2`); </pre><br />

I also need to search through date fields in the database. Is it possible using this?

—-

Query support

Here’s a version that supports a simple query language. You can do queries such as

  • emmy or “golden globe”
  • (“julia lau” or “ana porter”) music
  • basketball -lakers

The wiki yucks things up in a few spots, so you might want to click the Edit button and copy and paste from there.


# Adds search method to ActiveRecord::Base.
# The query language supports the operators
# (), not, and, or
# Precedence in that order.
# - is an alias for not.
# If no operator is present, and is assumed.
# Lastly, anything within double quotes is treated as 
# a single search term.
#
# For example,
#  ruby rails => records where both ruby and rails appear
#  "ruby on rails" => records where "ruby on rails" appears
#  ruby or rails => records where ruby or rails (or both) appears
#  ruby or chunky bacon => records where ruby appears or both chunky and bacon appear
#  not dead or alive => records where alive appears or dead is absent
#  -(ruby or rails) => records where neither ruby nor rails appears
#  (ruby or rails) -"ruby on rails" => records where ruby or rails appears but not the phrase "ruby on rails" 
#
# Query feature by Nate McNamara (<a href="mailto:nate@mcnamara.net">nate@mcnamara.net</a>)
# Original TextSearch library by Duane Johnson.
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end

    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?

      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []

      string_columns = klass.columns.select { |c|
        c.type  :text or c.type  :string
      }

      fields = string_columns.collect { |c|
        klass.table_name  ”.”  c.name
      }

      if not tables.empty?
        tables.each do |table|
          klass = eval table.to_s.classify
          fields += searchable_fields([], klass)
        end
      end

      return fields
    end

    # Search the model’s text and varchar fields
    #   text = a set of words to search for
    #   :only => an array of fields in which to search for the text;
    #     default is ‘all text or string columns’

    #   :except => an array of fields to exclude
    #     from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(text = nil, options = {})
      fields = options[:only] || searchable_fields(options[:include])
      if options[:except]
        fields -= options[:except]
      end

      unless options[:case] == :sensitive
        text.downcase!
        fields.map! { |f| “lower(#{f})” }
      end

      condition_list = []
      unless text.nil?
        condition_list << build_text_condition(fields, text)
      end
      if options[:conditions]
        condition_list << ”#{options[:conditions]}” 
      end
      conditions = condition_list.join ” AND ” 

      includes = (options[:include] || []) + (options[:join_include] || [])

      find(:all,
           :include => includes.empty? ? nil : includes,
           :conditions => conditions.empty? ? nil : conditions,
           :offset => options[:offset],
           :limit => options[:limit],
           :order => options[:order])
    end

    private

    # A chunk is a string of non-whitespace,
    # except that anything inside double quotes
    # is a chunk, including whitespace
    def self.make_chunks(s)
      chunks = []
      while s.length > 0
        next_interesting_index = (s = /\s|\”/)
        if next_interesting_index
          if next_interesting_index > 0
            chunks << s[0…next_interesting_index]
            s = s[next_interesting_index..-1]
          else
            if s = /\”/
              s = s[1..-1]
              next_interesting_index = (s = /[\”]/)
              if next_interesting_index
                chunks << s[0…next_interesting_index]
                s = s[next_interesting_index+1..-1]
              elsif s.length > 0
                chunks << s
                s = ’’

              end
            else
              next_interesting_index = (s = /\S/)
              s = s[next_interesting_index..-1]
            end
          end
        else
          chunks << s
          s = ’’
        end
      end

      chunks
    end

    def self.process_chunk(chunk)
      case chunk
      when /-/
        if chunk.length  1
          [:not]
        else
          [:not, *process_chunk(chunk[1..-1])]
        end
      when /^\(.*\)$/
        if chunk.length  2
          [:left_paren, :right_paren]
       else          
[:left_paren].concat(process_chunk(chunk[1..-2])) << :right_paren end when /^\(/ if chunk.length 1 [:left_paren] else [:left_paren].concat(process_chunk(chunk[1..-1])) end when /\)$/ if chunk.length 1 [:right_paren] else process_chunk(chunk[0..-2]) << :right_paren end when ‘and’ [:and] when ‘or’ [:or] when ‘not’ [:not] else [chunk] end end def self.lex(s) tokens = [] make_chunks(s).each { |chunk| tokens.concat(process_chunk(chunk)) } tokens end def self.parse_paren_expr(tokens) expr_tokens = [] while !tokens.empty? && tokens0 != :right_paren expr_tokens << tokens.shift end if !tokens.empty? tokens.shift end parse_expr(expr_tokens) end def self.parse_term(tokens) if tokens.empty? return ’’ end token = tokens.shift case token when :not [:not, parse_term(tokens)] when :left_paren parse_paren_expr(tokens) when :right_paren ’’ # skip bogus token when :and ’’ # skip bogus token when :or ’’ # skip bogus token else token end end def self.parse_and_expr(tokens, operand) if (tokens0 :and) tokens.shift end # Even if :and is missing, :and is implicit [:and, operand, parse_term(tokens)] end def self.parse_or_expr(tokens, operand) if (tokens[0] :or) tokens.shift [:or, operand, parse_expr(tokens)] else parse_and_expr(tokens, operand) end end def self.parse_expr(tokens) if tokens.empty? return ’’ end expr = parse_term(tokens) while !tokens.empty? expr = parse_or_expr(tokens, expr) end expr end def self.parse_tokens(tokens) tree = parse_expr(tokens) tree.kind_of?(Array)? tree : [tree] end def self.parse(text) parse_tokens(lex(text)) end def self.apply_demorgans(tree) if tree == [] return [] end token = tree.kind_of?(Array)? tree0 : tree case token when :not if (tree1.kind_of?(Array)) subtree = tree1 if subtree0 :and [:or, apply_demorgans([:not, subtree[1]]), apply_demorgans([:not, subtree[2]])] elsif tree[1][0] :or [:and, apply_demorgans([:not, subtree1]), apply_demorgans([:not, subtree2])] else # assert tree1[0] == :not apply_demorgans(subtree1) end else tree end when :and apply_demorgans(tree2)] when :or apply_demorgans(tree2)] else tree end end def self.demorganize(tree) result = apply_demorgans(tree) result.kind_of?(Array)? result : [result] end def self.sql_escape(s) s.gsub(’’, ’\’).gsub(’’, ’\’) end def self.compound_tc(fields, tree) ‘(’ + build_tc_from_tree(fields, tree1) + ’ ’ tree0.to_s ’ ’ + build_tc_from_tree(fields, tree2) + ‘)’ end def self.build_tc_from_tree(fields, tree) token = tree.kind_of?(Array)? tree0 : tree case token when :and compound_tc(fields, tree) when :or compound_tc(fields, tree) when :not # assert tree1.kind_of?(String) “(” + fields.map { |f| “(#{f} is null or #{f} not like #{sanitize(’sql_escape(tree1)’)})” }.join(” and “) + “)” else “(” + fields.map { |f| ”#{f} like #{sanitize(’sql_escape(token)’)}” }.join(” or “) + “)” end end def self.build_text_condition(fields, text) build_tc_from_tree(fields, demorganize(parse(text))) end end end

—-

Second-order associations

I added second-order association support to Mr McNamara’s code. —Moritz Heidkamp


# Adds search method to ActiveRecord::Base.
# The query language supports the operators
# (), not, and, or
# Precedence in that order.
# - is an alias for not.
# If no operator is present, and is assumed.
# Lastly, anything within double quotes is treated as 
# a single search term.
#
# For example,
#  ruby rails => records where both ruby and rails appear
#  "ruby on rails" => records where "ruby on rails" appears
#  ruby or rails => records where ruby or rails (or both) appears
#  ruby or chunky bacon => records where ruby appears or both chunky and bacon appear
#  not dead or alive => records where alive appears or dead is absent
#  -(ruby or rails) => records where neither ruby nor rails appears
#  (ruby or rails) -"ruby on rails" => records where ruby or rails appears but not the phrase "ruby on rails" 
#
# Query feature by Nate McNamara (nate@mcnamara.net)
# Support for second-order associations by Moritz Heidkamp
# Original TextSearch library by Duane Johnson.
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end

    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?

      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []
      tables = [tables] unless tables.kind_of? Array

      string_columns = klass.columns.select { |c|
        c.type  :text or c.type  :string
      }

      fields = string_columns.collect { |c|
        klass.table_name + "." + c.name
      }

      # also collect searchable fields of eagerly loaded associations
      fields += searchable_fields_in_associations(tables)

      return fields
    end

    def self.searchable_fields_in_associations(tables)
      fields = []

      if not tables.empty?
        tables.each do |table, assocs|
          if table.kind_of? Hash
            fields += searchable_fields_in_associations(table)
          else
            klass = eval table.to_s.classify
            fields += searchable_fields(assocs, klass)
          end
        end
      end

      return fields
    end

    # Search the model's text and varchar fields
    #   text = a set of words to search for
    #   :only => an array of fields in which to search for the text;
    #     default is 'all text or string columns'
    #   :except => an array of fields to exclude
    #     from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(text = nil, options = {})      
      if options[:include].kind_of? Array
        includes = options[:include] || []
      else
        includes = [options[:include]] || []
      end

      fields = options[:only] || searchable_fields(includes)
      if options[:except]
        fields -= options[:except]
      end

      unless options[:case] == :sensitive
        text.downcase!
        fields.map! { |f| "lower(#{f})" }
      end

      condition_list = []
      unless text.nil?
        condition_list << build_text_condition(fields, text)
      end
      if options[:conditions]
        condition_list << "#{options[:conditions]}" 
      end
      conditions = condition_list.join " AND " 

      includes += options[:join_include] || []

      find(:all,
           :include => includes.empty? ? nil : includes,
           :conditions => conditions.empty? ? nil : conditions,
           :offset => options[:offset],
           :limit => options[:limit],
           :order => options[:order])
    end

    private

    # A chunk is a string of non-whitespace,
    # except that anything inside double quotes
    # is a chunk, including whitespace
    def self.make_chunks(s)
      chunks = []
      while s.length > 0
        next_interesting_index = (s =~ /\s|\"/)
        if next_interesting_index
          if next_interesting_index > 0
            chunks << s[0...next_interesting_index]
            s = s[next_interesting_index..-1]
          else
            if s =~ /^\"/
              s = s[1..-1]
              next_interesting_index = (s =~ /[\"]/)
              if next_interesting_index
                chunks << s[0...next_interesting_index]
                s = s[next_interesting_index+1..-1]
              elsif s.length > 0
                chunks << s
                s = ''
              end
            else
              next_interesting_index = (s =~ /\S/)
              s = s[next_interesting_index..-1]
            end
          end
        else
          chunks << s
          s = ''
        end
      end

      chunks
    end

    def self.process_chunk(chunk)
      case chunk
      when /^-/
        if chunk.length  1
          [:not]
        else
          [:not, *process_chunk(chunk[1..-1])]
        end
      when /^\(.*\)$/
        if chunk.length  2
          [:left_paren, :right_paren]
       else          
[:left_paren].concat(process_chunk(chunk[1..-2])) << :right_paren
        end
      when /^\(/
        if chunk.length  1
          [:left_paren]
        else
          [:left_paren].concat(process_chunk(chunk[1..-1]))
        end
      when /\)$/
        if chunk.length  1
          [:right_paren]
        else
          process_chunk(chunk[0..-2]) << :right_paren
        end
      when 'and'
        [:and]
      when 'or'
        [:or]
      when 'not'
        [:not]
      else
        [chunk]
      end
    end

    def self.lex(s)
      tokens = []

      make_chunks(s).each { |chunk|
        tokens.concat(process_chunk(chunk))
      }

      tokens
    end

    def self.parse_paren_expr(tokens)
      expr_tokens = []
      while !tokens.empty? && tokens[0] != :right_paren
        expr_tokens << tokens.shift
      end

      if !tokens.empty?
        tokens.shift
      end

      parse_expr(expr_tokens)
    end

    def self.parse_term(tokens)
      if tokens.empty?
        return ''
      end

      token = tokens.shift
      case token
      when :not
          [:not, parse_term(tokens)]
      when :left_paren
        parse_paren_expr(tokens)
      when :right_paren
        '' # skip bogus token
      when :and
          '' # skip bogus token
      when :or
          '' # skip bogus token
      else
        token
      end
    end

    def self.parse_and_expr(tokens, operand)
      if (tokens[0]  == :and)
        tokens.shift
      end
      # Even if :and is missing, :and is implicit
      [:and, operand, parse_term(tokens)]
    end

    def self.parse_or_expr(tokens, operand)
      if (tokens[0]  == :or)
        tokens.shift
        [:or, operand, parse_expr(tokens)]
      else
        parse_and_expr(tokens, operand)
      end
    end

    def self.parse_expr(tokens)
      if tokens.empty?
        return ''
      end

      expr = parse_term(tokens)
      while !tokens.empty?
        expr = parse_or_expr(tokens, expr)
      end

      expr
    end

    def self.parse_tokens(tokens)
      tree = parse_expr(tokens)
      tree.kind_of?(Array)? tree : [tree]
    end

    def self.parse(text)
      parse_tokens(lex(text))
    end

    def self.apply_demorgans(tree)
      if tree == []
        return []
      end

      token = tree.kind_of?(Array)? tree[0] : tree
      case token
      when :not
          if (tree[1].kind_of?(Array))
            subtree = tree[1]
            if subtree[0] == :and
                [:or,
                 apply_demorgans([:not, subtree[1]]),
                 apply_demorgans([:not, subtree[2]])]
            elsif tree[1][0] == :or
                [:and,
                 apply_demorgans([:not, subtree[1]]),
                 apply_demorgans([:not, subtree[2]])]
            else
              # assert tree[1][0] == :not
              apply_demorgans(subtree[1])
            end
          else
            tree
          end
      when :and
          [:and, apply_demorgans(tree[1]), apply_demorgans(tree[2])]
      when :or
          [:or, apply_demorgans(tree[1]), apply_demorgans(tree[2])]
      else
        tree
      end
    end

    def self.demorganize(tree)
      result = apply_demorgans(tree)
      result.kind_of?(Array)? result : [result]
    end

    def self.sql_escape(s)
      s.gsub('%', '\%').gsub('_', '\_')
    end

    def self.compound_tc(fields, tree)
      '(' +
        build_tc_from_tree(fields, tree[1]) +
        ' ' + tree[0].to_s + ' ' +
        build_tc_from_tree(fields, tree[2]) +
        ')'
    end

    def self.build_tc_from_tree(fields, tree)
      token = tree.kind_of?(Array)? tree[0] : tree
      case token
      when :and
          compound_tc(fields, tree)
      when :or
          compound_tc(fields, tree)
      when :not
          # assert tree[1].kind_of?(String)
        "(" +
        fields.map { |f|
          "(#{f} is null or #{f} not like #{sanitize('%'+sql_escape(tree[1])+'%')})" 
        }.join(" and ") +
          ")" 
      else
        "(" +
        fields.map { |f|
          "#{f} like #{sanitize('%'+sql_escape(token)+'%')}" 
        }.join(" or ") +
          ")" 
      end
    end

    def self.build_text_condition(fields, text)
      build_tc_from_tree(fields, demorganize(parse(text)))
    end
  end
end

—-

Question:
Using this solution, I got the following error:

*{RAILS_ROOT}/lib/search.rb:37:in `searchable_fields’
*{RAILS_ROOT}/lib/search.rb:61:in `searchable_fields_in_associations’
*{RAILS_ROOT}/lib/search.rb:56:in `searchable_fields_in_associations’
*{RAILS_ROOT}/lib/search.rb:46:in `searchable_fields’
*{RAILS_ROOT}/lib/search.rb:92:in `search’
*{RAILS_ROOT}/app/controllers/search_controller.rb:59:in `executeSearch’
-e:4

Any idea? I just used [model].search("xy"). Of course, I added the require and searches_on lines into the model files.

Thanks for help and this great lib!

Answer:
The wiki code section seems to omit two equal signs next to one another. You’re going to have to go through and add them where they should be, or it will give you errors like that.

Question: How do i limit the amount of results retured by the search? Say if i olny wanted the first 10????

Answer: the quick and dirty solution is to add “LIMIT X” to the search SQL generated by the above code. Of course, this doesn’t show the X most relevant hits based on the search, but that a much bigger problem to solve and may even depend on the type of data you are searching, which is domain-specific.

Question: I want to paginate the results, i tried several plugins but they only support the ‘find’ method of Rails. Any tips how about that?

Question: Is there any other place to get the code. The Wiki is messing everything up. I tried copying from Edit, but that doesn’t work well either. Specifically, I’m interested in the code for the query language. BTW, it’s more than just == that gets confused. [] gets dropped in many places, single and double quotes. It get’s really interesting when escape sequences are needed in the code.

What does it do?

It’s pretty common to want to search for a string across multiple fields in a table, or even across multiple joined tables. The ActiveRecord model already knows how the tables are joined together, so can we take advantage of that information to write a search method? Yes! Here’s what the proposed solution looks like:


require_dependency "search"

class Movie < ActiveRecord::Base
   has_and_belongs_to_many :genres
   searches_on :all
end

Note that :all is not a field, but a directive indicating that all text/varchar fields in my ‘movies’ table will be searched. I could have explicitly told it to search on :title and :description like this:


class Movie < ActiveRecord::Base
   has_and_belongs_to_many :genres
   searches_on :title, :description
end

Now, all future calls to Movie.search will only search the “title” and “description” fields.

Further Usage Examples

In the following examples, all calls to the search method return an array of Movie objects.


# Make a case insensitive search for "star" within all text (or
varchar) fields of "movies"
  @movies = Movie.search "star"  

# Same, but case sensitive
  @movies = Movie.search "star", :case => :sensitive

# Search all text (or varchar) fields within Movie
# and Genre, according to the associations indicated
# by those models' classes.  For example, if genre 
# has_many movies and Genre has a "name" and
# a "description" field, then this search will return
# all movies containing "star" as well as all movies
# belonging to any genre whose description or name
# contains "star":
  @movies = Movie.search "star", :include => [:genre]  

# The search above could also be flipped around if 
# you want to ask the question "What genres 
# contain movies whose titles or descriptions contain
# 'star'?"
  @genres = Genre.search "star", :include => [:movie], :only =>
["movies.title", "movies.description"]

# Search for "star" within a restricted set of 
# text/varchar fields:
  @movies = Movie.search "star", :only => ["title", "description"]
  @movies = Movie.search "star", :except => ["title"]

# Search for "star" with certain other conditions, 
# such as within a particular genre:
  @movies = Movie.search "star", :conditions => "genre_id = 1"

# Search for "star" with other conditions in a
# joined table:
  @movies = Movie.search "star", :join_include => [:genre],
:conditions => "genres.hidden = 0"

The Search Library’s Code

Add the following code to a file called “search.rb” in your application’s “lib” directory:


# This is an add-on to the ActiveRecord::Base class.  It allows simple searching to be
# accomplished by using, for example, @movies = Movie.search("text")
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end
    
    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?
      
      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []
      string_columns = klass.columns.select { |c| c.type == :text or c.type == :string }
      fields = string_columns.collect { |c| klass.table_name + "." + c.name }

      if not tables.empty?
        tables.each do |table|
          klass = eval table.to_s.classify
          fields += searchable_fields([], klass)
        end
      end
      
      return fields
    end

    # Search the movie database for the given parameters:
    #   text = a string to search for
    #   :only => an array of fields in which to search for the text;
    #     default is 'all text or string columns'
    #   :except => an array of fields to exclude from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(text = nil, options = {})
      options.assert_valid_keys(:only, :except, :case, :include,
                        :join_include, :conditions, :offset, :limit, :order)
      case_insensitive = true unless options[:case] == :sensitive
      
      # The fields to search (default is all text fields)
      fields = options[:only] || searchable_fields(options[:include])
      fields -= options[:except] if not options[:except].nil?

      # Now build the SQL for the search if there is text to search for
      condition_list = []
      unless text.nil?
        text_condition = if case_insensitive
          fields.collect { |f| "UCASE(#{f}) LIKE #{sanitize('%'+text.upcase+'%')}" }.join " OR "
        else
          fields.collect { |f| "#{f} LIKE #{sanitize('%'+text+'%')}" }.join " OR "
        end

        # Add the text search term's SQL to the conditions string unless
        # the text was nil to begin with.
        condition_list << "(" + text_condition + ")"
      end
      condition_list << "#{options[:conditions]}" if options[:conditions]
      conditions = condition_list.join " AND "
      conditions = nil if conditions.empty?

      includes = (options[:include] || []) + (options[:join_include] || [])
      includes = nil if includes.size == 0
      
      find :all, :include => includes, :conditions => conditions,
           :offset => options[:offset], :limit => options[:limit], :order => options[:order]
    end
  end
end

NOTE If your text editor doesn’t support fancy quotes like “these”, it may replace them with dots or question marks if you copy/paste the above code. Look carefully.

Author

You can contact Duane Johnson if you have suggestions, modifications or questions regarding this code contribution.

See Also

—-
This will not work on databases where UCASE is not supported, instead change UCASE to UPPER and you can then use this on databases like sqlite3 or postgresql.
Jason Toy

You can handle multiple words (using an AND until someone refactors some more) by making the following modification:


...
      def self.search(keywords = [], options = {})
        keywords = keywords.split if keywords.is_a? String

...

        unless keywords.nil?
          keywords.each do |text|
            text_condition = if case_insensitive
              fields.collect do |f|
                "UCASE(#{f}) LIKE '%#{text.upcase}%'" 
              end.join " OR "
            else
              fields.collect { |f| "#{f} LIKE '%#{text}%'" }.join " OR "
            end

            # Add the text search term's SQL to the conditions string unless
            # the text was nil to begin with.
            condition_list << "(" + text_condition + ")" unless text_condition.nil?
            end
          end
...

Michel Valdrighi?
note: fixed in above code: keywords.split! (doesn’t exist) keywords.empty? to keywords.nil?. added unless text_condition.nil? to end of snippet

i have a problem with this lib. I got this error:

wrong number of arguments (1 for 0)

maybe need a update for the Rails version? :)
Thnx
David jgc?

I was getting the same “wrong number of arguments (1 for 0)” error using MySQL and searches_on :all. However, I got it to work by specifying the columns to use:

searches_on :name, :description

it appears the wiki is removing double equals in this line

string_columns = klass.columns.select { |c| c.type == :text or c.type == :string }
which is causing people to get the “wrong number of arguments” errors it should be
string_columns = klass.columns.select { |c| c.type == :text or c.type \=\= :string }
without the backslashes

Here’s a revision that adds soundex support:


# This is an add-on to the ActiveRecord::Base class.  It allows simple searching to be
# accomplished by using, for example, @movies = Movie.search("text")
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end

    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?

      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []
      string_columns = klass.columns.select { |c| c.type == :text or c.type  :string }
      fields = string_columns.collect { |c| klass.table_name + "." + c.name }

      if not tables.empty?
        tables.each do |table|
          klass = eval table.to_s.classify
          fields += searchable_fields([], klass)
        end
      end

      return fields
    end

    # Search the movie database for the given parameters:
    #   text = a string to search for
    #   :only => an array of fields in which to search for the text;
    #     default is 'all text or string columns'
    #   :except => an array of fields to exclude from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :soundex => true or false
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(keywords = [], options = {})
      options.assert_valid_keys(:only, :except, :case, :include, :soundex,
                        :join_include, :conditions, :offset, :limit, :order)
      case_insensitive = true unless options[:case]  :sensitive
      soundex = true unless options[:soundex]  false

      # The fields to search (default is all text fields)
      fields = options[:only] || searchable_fields(options[:include])
      fields -= options[:except] if not options[:except].nil?

      # Now build the SQL for the search if there is text to search for
      condition_list = []
      unless keywords.nil?
        keywords.each do |text|
          text_condition = if soundex and case_insensitive
            fields.collect { |f| ”#{f} SOUNDS LIKE #{sanitize(text)} OR UPPER LIKE #{sanitize(’text.upcase’)}” }.join ” OR ” 
          elsif soundex
            fields.collect { |f| ”#{f} SOUNDS LIKE #{sanitize(text)}” }.join ” OR ” 
          elsif case_insensitive
            fields.collect { |f| “UPPER LIKE #{sanitize(’text.upcase’)}” }.join ” OR ” 
          else
            fields.collect { |f| ”#{f} LIKE #{sanitize(’text’)}” }.join ” OR ” 
          end

          # Add the text search term’s SQL to the conditions string unless
          # the text was nil to begin with.
          condition_list << “(”  text_condition  “)” 
        end
      end
      condition_list << ”#{options[:conditions]}” if options[:conditions]
      conditions = condition_list.join ” AND ” 
      conditions = nil if conditions.empty?

      includes = (options[:include] || []) + (options[:join_include] || [])
      includes = nil if includes.size == 0

      find :all, :include => includes, :conditions => conditions,
           :offset => options[:offset], :limit => options[:limit], :order => options[:order]
    end
  end
end


I’d like to get this to support bridging fields, too. (Yes, I’m making that term up.) For example, I’d like to search first_name and surname together as one field. Anyone have a patch for that?

You might want to look into UnderstandingAggregation

If you are using MySQL, you can modify it to use the in built-in full text index (as long as it is already in the db) with


condition_list &lt;&lt; "(" + "MATCH(#{fields.collect.join(",")}) AGAINST (#{sanitize(text)} IN BOOLEAN MODE)" + ")"

adding the index can be done with SQL like this

ALTER TABLE `dbname`.`table` ADD INDEX `idx_fulltext`(`field1`, `feild2`); </pre><br />

I also need to search through date fields in the database. Is it possible using this?

—-

Query support

Here’s a version that supports a simple query language. You can do queries such as

  • emmy or “golden globe”
  • (“julia lau” or “ana porter”) music
  • basketball -lakers

The wiki yucks things up in a few spots, so you might want to click the Edit button and copy and paste from there.


# Adds search method to ActiveRecord::Base.
# The query language supports the operators
# (), not, and, or
# Precedence in that order.
# - is an alias for not.
# If no operator is present, and is assumed.
# Lastly, anything within double quotes is treated as 
# a single search term.
#
# For example,
#  ruby rails => records where both ruby and rails appear
#  "ruby on rails" => records where "ruby on rails" appears
#  ruby or rails => records where ruby or rails (or both) appears
#  ruby or chunky bacon => records where ruby appears or both chunky and bacon appear
#  not dead or alive => records where alive appears or dead is absent
#  -(ruby or rails) => records where neither ruby nor rails appears
#  (ruby or rails) -"ruby on rails" => records where ruby or rails appears but not the phrase "ruby on rails" 
#
# Query feature by Nate McNamara (<a href="mailto:nate@mcnamara.net">nate@mcnamara.net</a>)
# Original TextSearch library by Duane Johnson.
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end

    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?

      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []

      string_columns = klass.columns.select { |c|
        c.type  :text or c.type  :string
      }

      fields = string_columns.collect { |c|
        klass.table_name  ”.”  c.name
      }

      if not tables.empty?
        tables.each do |table|
          klass = eval table.to_s.classify
          fields += searchable_fields([], klass)
        end
      end

      return fields
    end

    # Search the model’s text and varchar fields
    #   text = a set of words to search for
    #   :only => an array of fields in which to search for the text;
    #     default is ‘all text or string columns’

    #   :except => an array of fields to exclude
    #     from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(text = nil, options = {})
      fields = options[:only] || searchable_fields(options[:include])
      if options[:except]
        fields -= options[:except]
      end

      unless options[:case] == :sensitive
        text.downcase!
        fields.map! { |f| “lower(#{f})” }
      end

      condition_list = []
      unless text.nil?
        condition_list << build_text_condition(fields, text)
      end
      if options[:conditions]
        condition_list << ”#{options[:conditions]}” 
      end
      conditions = condition_list.join ” AND ” 

      includes = (options[:include] || []) + (options[:join_include] || [])

      find(:all,
           :include => includes.empty? ? nil : includes,
           :conditions => conditions.empty? ? nil : conditions,
           :offset => options[:offset],
           :limit => options[:limit],
           :order => options[:order])
    end

    private

    # A chunk is a string of non-whitespace,
    # except that anything inside double quotes
    # is a chunk, including whitespace
    def self.make_chunks(s)
      chunks = []
      while s.length > 0
        next_interesting_index = (s = /\s|\”/)
        if next_interesting_index
          if next_interesting_index > 0
            chunks << s[0…next_interesting_index]
            s = s[next_interesting_index..-1]
          else
            if s = /\”/
              s = s[1..-1]
              next_interesting_index = (s = /[\”]/)
              if next_interesting_index
                chunks << s[0…next_interesting_index]
                s = s[next_interesting_index+1..-1]
              elsif s.length > 0
                chunks << s
                s = ’’

              end
            else
              next_interesting_index = (s = /\S/)
              s = s[next_interesting_index..-1]
            end
          end
        else
          chunks << s
          s = ’’
        end
      end

      chunks
    end

    def self.process_chunk(chunk)
      case chunk
      when /-/
        if chunk.length  1
          [:not]
        else
          [:not, *process_chunk(chunk[1..-1])]
        end
      when /^\(.*\)$/
        if chunk.length  2
          [:left_paren, :right_paren]
       else          
[:left_paren].concat(process_chunk(chunk[1..-2])) << :right_paren end when /^\(/ if chunk.length 1 [:left_paren] else [:left_paren].concat(process_chunk(chunk[1..-1])) end when /\)$/ if chunk.length 1 [:right_paren] else process_chunk(chunk[0..-2]) << :right_paren end when ‘and’ [:and] when ‘or’ [:or] when ‘not’ [:not] else [chunk] end end def self.lex(s) tokens = [] make_chunks(s).each { |chunk| tokens.concat(process_chunk(chunk)) } tokens end def self.parse_paren_expr(tokens) expr_tokens = [] while !tokens.empty? && tokens0 != :right_paren expr_tokens << tokens.shift end if !tokens.empty? tokens.shift end parse_expr(expr_tokens) end def self.parse_term(tokens) if tokens.empty? return ’’ end token = tokens.shift case token when :not [:not, parse_term(tokens)] when :left_paren parse_paren_expr(tokens) when :right_paren ’’ # skip bogus token when :and ’’ # skip bogus token when :or ’’ # skip bogus token else token end end def self.parse_and_expr(tokens, operand) if (tokens0 :and) tokens.shift end # Even if :and is missing, :and is implicit [:and, operand, parse_term(tokens)] end def self.parse_or_expr(tokens, operand) if (tokens[0] :or) tokens.shift [:or, operand, parse_expr(tokens)] else parse_and_expr(tokens, operand) end end def self.parse_expr(tokens) if tokens.empty? return ’’ end expr = parse_term(tokens) while !tokens.empty? expr = parse_or_expr(tokens, expr) end expr end def self.parse_tokens(tokens) tree = parse_expr(tokens) tree.kind_of?(Array)? tree : [tree] end def self.parse(text) parse_tokens(lex(text)) end def self.apply_demorgans(tree) if tree == [] return [] end token = tree.kind_of?(Array)? tree0 : tree case token when :not if (tree1.kind_of?(Array)) subtree = tree1 if subtree0 :and [:or, apply_demorgans([:not, subtree[1]]), apply_demorgans([:not, subtree[2]])] elsif tree[1][0] :or [:and, apply_demorgans([:not, subtree1]), apply_demorgans([:not, subtree2])] else # assert tree1[0] == :not apply_demorgans(subtree1) end else tree end when :and apply_demorgans(tree2)] when :or apply_demorgans(tree2)] else tree end end def self.demorganize(tree) result = apply_demorgans(tree) result.kind_of?(Array)? result : [result] end def self.sql_escape(s) s.gsub(’’, ’\’).gsub(’’, ’\’) end def self.compound_tc(fields, tree) ‘(’ + build_tc_from_tree(fields, tree1) + ’ ’ tree0.to_s ’ ’ + build_tc_from_tree(fields, tree2) + ‘)’ end def self.build_tc_from_tree(fields, tree) token = tree.kind_of?(Array)? tree0 : tree case token when :and compound_tc(fields, tree) when :or compound_tc(fields, tree) when :not # assert tree1.kind_of?(String) “(” + fields.map { |f| “(#{f} is null or #{f} not like #{sanitize(’sql_escape(tree1)’)})” }.join(” and “) + “)” else “(” + fields.map { |f| ”#{f} like #{sanitize(’sql_escape(token)’)}” }.join(” or “) + “)” end end def self.build_text_condition(fields, text) build_tc_from_tree(fields, demorganize(parse(text))) end end end

—-

Second-order associations

I added second-order association support to Mr McNamara’s code. —Moritz Heidkamp


# Adds search method to ActiveRecord::Base.
# The query language supports the operators
# (), not, and, or
# Precedence in that order.
# - is an alias for not.
# If no operator is present, and is assumed.
# Lastly, anything within double quotes is treated as 
# a single search term.
#
# For example,
#  ruby rails => records where both ruby and rails appear
#  "ruby on rails" => records where "ruby on rails" appears
#  ruby or rails => records where ruby or rails (or both) appears
#  ruby or chunky bacon => records where ruby appears or both chunky and bacon appear
#  not dead or alive => records where alive appears or dead is absent
#  -(ruby or rails) => records where neither ruby nor rails appears
#  (ruby or rails) -"ruby on rails" => records where ruby or rails appears but not the phrase "ruby on rails" 
#
# Query feature by Nate McNamara (nate@mcnamara.net)
# Support for second-order associations by Moritz Heidkamp
# Original TextSearch library by Duane Johnson.
module ActiveRecord
  class Base
    # Allow the user to set the default searchable fields
    def self.searches_on(*args)
      if not args.empty? and args.first != :all
        @searchable_fields = args.collect { |f| f.to_s }
      end
    end

    # Return the default set of fields to search on
    def self.searchable_fields(tables = nil, klass = self)
      # If the model has declared what it searches_on, then use that...
      return @searchable_fields unless @searchable_fields.nil?

      # ... otherwise, use all text/varchar fields as the default
      fields = []
      tables ||= []
      tables = [tables] unless tables.kind_of? Array

      string_columns = klass.columns.select { |c|
        c.type  :text or c.type  :string
      }

      fields = string_columns.collect { |c|
        klass.table_name + "." + c.name
      }

      # also collect searchable fields of eagerly loaded associations
      fields += searchable_fields_in_associations(tables)

      return fields
    end

    def self.searchable_fields_in_associations(tables)
      fields = []

      if not tables.empty?
        tables.each do |table, assocs|
          if table.kind_of? Hash
            fields += searchable_fields_in_associations(table)
          else
            klass = eval table.to_s.classify
            fields += searchable_fields(assocs, klass)
          end
        end
      end

      return fields
    end

    # Search the model's text and varchar fields
    #   text = a set of words to search for
    #   :only => an array of fields in which to search for the text;
    #     default is 'all text or string columns'
    #   :except => an array of fields to exclude
    #     from the default searchable columns
    #   :case => :sensitive or :insensitive
    #   :include => an array of tables to include in the joins.  Fields that
    #     have searchable text will automatically be included in the default
    #     set of :search_columns.
    #   :join_include => an array of tables to include in the joins, but only
    #     for joining. (Searchable fields will not automatically be included.)
    #   :conditions => a string of additional conditions (constraints)
    #   :offset => paging offset (integer)
    #   :limit => number of rows to return (integer)
    #   :order => sort order (order_by SQL snippet)
    def self.search(text = nil, options = {})      
      if options[:include].kind_of? Array
        includes = options[:include] || []
      else
        includes = [options[:include]] || []
      end

      fields = options[:only] || searchable_fields(includes)
      if options[:except]
        fields -= options[:except]
      end

      unless options[:case] == :sensitive
        text.downcase!
        fields.map! { |f| "lower(#{f})" }
      end

      condition_list = []
      unless text.nil?
        condition_list << build_text_condition(fields, text)
      end
      if options[:conditions]
        condition_list << "#{options[:conditions]}" 
      end
      conditions = condition_list.join " AND " 

      includes += options[:join_include] || []

      find(:all,
           :include => includes.empty? ? nil : includes,
           :conditions => conditions.empty? ? nil : conditions,
           :offset => options[:offset],
           :limit => options[:limit],
           :order => options[:order])
    end

    private

    # A chunk is a string of non-whitespace,
    # except that anything inside double quotes
    # is a chunk, including whitespace
    def self.make_chunks(s)
      chunks = []
      while s.length > 0
        next_interesting_index = (s =~ /\s|\"/)
        if next_interesting_index
          if next_interesting_index > 0
            chunks << s[0...next_interesting_index]
            s = s[next_interesting_index..-1]
          else
            if s =~ /^\"/
              s = s[1..-1]
              next_interesting_index = (s =~ /[\"]/)
              if next_interesting_index
                chunks << s[0...next_interesting_index]
                s = s[next_interesting_index+1..-1]
              elsif s.length > 0
                chunks << s
                s = ''
              end
            else
              next_interesting_index = (s =~ /\S/)
              s = s[next_interesting_index..-1]
            end
          end
        else
          chunks << s
          s = ''
        end
      end

      chunks
    end

    def self.process_chunk(chunk)
      case chunk
      when /^-/
        if chunk.length  1
          [:not]
        else
          [:not, *process_chunk(chunk[1..-1])]
        end
      when /^\(.*\)$/
        if chunk.length  2
          [:left_paren, :right_paren]
       else          
[:left_paren].concat(process_chunk(chunk[1..-2])) << :right_paren
        end
      when /^\(/
        if chunk.length  1
          [:left_paren]
        else
          [:left_paren].concat(process_chunk(chunk[1..-1]))
        end
      when /\)$/
        if chunk.length  1
          [:right_paren]
        else
          process_chunk(chunk[0..-2]) << :right_paren
        end
      when 'and'
        [:and]
      when 'or'
        [:or]
      when 'not'
        [:not]
      else
        [chunk]
      end
    end

    def self.lex(s)
      tokens = []

      make_chunks(s).each { |chunk|
        tokens.concat(process_chunk(chunk))
      }

      tokens
    end

    def self.parse_paren_expr(tokens)
      expr_tokens = []
      while !tokens.empty? && tokens[0] != :right_paren
        expr_tokens << tokens.shift
      end

      if !tokens.empty?
        tokens.shift
      end

      parse_expr(expr_tokens)
    end

    def self.parse_term(tokens)
      if tokens.empty?
        return ''
      end

      token = tokens.shift
      case token
      when :not
          [:not, parse_term(tokens)]
      when :left_paren
        parse_paren_expr(tokens)
      when :right_paren
        '' # skip bogus token
      when :and
          '' # skip bogus token
      when :or
          '' # skip bogus token
      else
        token
      end
    end

    def self.parse_and_expr(tokens, operand)
      if (tokens[0]  == :and)
        tokens.shift
      end
      # Even if :and is missing, :and is implicit
      [:and, operand, parse_term(tokens)]
    end

    def self.parse_or_expr(tokens, operand)
      if (tokens[0]  == :or)
        tokens.shift
        [:or, operand, parse_expr(tokens)]
      else
        parse_and_expr(tokens, operand)
      end
    end

    def self.parse_expr(tokens)
      if tokens.empty?
        return ''
      end

      expr = parse_term(tokens)
      while !tokens.empty?
        expr = parse_or_expr(tokens, expr)
      end

      expr
    end

    def self.parse_tokens(tokens)
      tree = parse_expr(tokens)
      tree.kind_of?(Array)? tree : [tree]
    end

    def self.parse(text)
      parse_tokens(lex(text))
    end

    def self.apply_demorgans(tree)
      if tree == []
        return []
      end

      token = tree.kind_of?(Array)? tree[0] : tree
      case token
      when :not
          if (tree[1].kind_of?(Array))
            subtree = tree[1]
            if subtree[0] == :and
                [:or,
                 apply_demorgans([:not, subtree[1]]),
                 apply_demorgans([:not, subtree[2]])]
            elsif tree[1][0] == :or
                [:and,
                 apply_demorgans([:not, subtree[1]]),
                 apply_demorgans([:not, subtree[2]])]
            else
              # assert tree[1][0] == :not
              apply_demorgans(subtree[1])
            end
          else
            tree
          end
      when :and
          [:and, apply_demorgans(tree[1]), apply_demorgans(tree[2])]
      when :or
          [:or, apply_demorgans(tree[1]), apply_demorgans(tree[2])]
      else
        tree
      end
    end

    def self.demorganize(tree)
      result = apply_demorgans(tree)
      result.kind_of?(Array)? result : [result]
    end

    def self.sql_escape(s)
      s.gsub('%', '\%').gsub('_', '\_')
    end

    def self.compound_tc(fields, tree)
      '(' +
        build_tc_from_tree(fields, tree[1]) +
        ' ' + tree[0].to_s + ' ' +
        build_tc_from_tree(fields, tree[2]) +
        ')'
    end

    def self.build_tc_from_tree(fields, tree)
      token = tree.kind_of?(Array)? tree[0] : tree
      case token
      when :and
          compound_tc(fields, tree)
      when :or
          compound_tc(fields, tree)
      when :not
          # assert tree[1].kind_of?(String)
        "(" +
        fields.map { |f|
          "(#{f} is null or #{f} not like #{sanitize('%'+sql_escape(tree[1])+'%')})" 
        }.join(" and ") +
          ")" 
      else
        "(" +
        fields.map { |f|
          "#{f} like #{sanitize('%'+sql_escape(token)+'%')}" 
        }.join(" or ") +
          ")" 
      end
    end

    def self.build_text_condition(fields, text)
      build_tc_from_tree(fields, demorganize(parse(text)))
    end
  end
end

—-

Question:
Using this solution, I got the following error:

*{RAILS_ROOT}/lib/search.rb:37:in `searchable_fields’
*{RAILS_ROOT}/lib/search.rb:61:in `searchable_fields_in_associations’
*{RAILS_ROOT}/lib/search.rb:56:in `searchable_fields_in_associations’
*{RAILS_ROOT}/lib/search.rb:46:in `searchable_fields’
*{RAILS_ROOT}/lib/search.rb:92:in `search’
*{RAILS_ROOT}/app/controllers/search_controller.rb:59:in `executeSearch’
-e:4

Any idea? I just used [model].search("xy"). Of course, I added the require and searches_on lines into the model files.

Thanks for help and this great lib!

Answer:
The wiki code section seems to omit two equal signs next to one another. You’re going to have to go through and add them where they should be, or it will give you errors like that.

Question: How do i limit the amount of results retured by the search? Say if i olny wanted the first 10????

Answer: the quick and dirty solution is to add “LIMIT X” to the search SQL generated by the above code. Of course, this doesn’t show the X most relevant hits based on the search, but that a much bigger problem to solve and may even depend on the type of data you are searching, which is domain-specific.

Question: I want to paginate the results, i tried several plugins but they only support the ‘find’ method of Rails. Any tips how about that?

Question: Is there any other place to get the code. The Wiki is messing everything up. I tried copying from Edit, but that doesn’t work well either. Specifically, I’m interested in the code for the query language. BTW, it’s more than just == that gets confused. [] gets dropped in many places, single and double quotes. It get’s really interesting when escape sequences are needed in the code.