Ruby on Rails
HowtoUseMultipleDatabases (Version #29)

Rails can handle multiple DBs. The “Connection to multiple databases in different models” section of the ActiveRecord API Doc explains how to setup class-specific connection. Also see this helpful tip on proper configuration.

The rest of this document examines how to determine which database to connect to on a per-request basis.

So you want to connect to multiple databases eh? What are you, crazy? :)

Naturally there are pros and cons to this approach. I’ll show you how it is done first, then leave it up to you to decide if you really want to do it.

This guide will only explain how to determine which database to connect to per request.

This differs from the default where each rails application uses only one production database. This way, you need only change your config/database.yml file and you’re good to go. With multiple databases, this is no longer acceptable.

app/controllers/abstract_application.rb defines a class called \AbstractApplicationController. All other user-created controllers descend through this. We’re going to place a before_filter on this sucker to step in front of the normal db mechanism. Because \ActiveRecord’s connection works on a first-come/first-serve basis, if we’re able to set the establish_connection before the default does, we win.

Got it? Good.


class ==AbstractApplicationController== < ==ActionController==::Base
  # here, we hop into the front of the request-handling
  # pipeline to run a method called hijack_db
  before_filter :hijack_db
 
  # here, we're going to manually establish a connection
  # to the database we feel like connecting to.
  def hijack_db
    # completely ridiculous condition
    is_odd = Time.now.hour % 2 == 1
 
    # determine the database name
    db_name = is_odd ? "dis_odd_db" : "dat_even_db"
 
    # lets manually connect to the proper db
		    <a href="http://wiki.rubyonrails.com/rails/pages/ActiveRecord" class="existingWikiWord">ActiveRecord</a>::Base.establish_connection(
      :adapter  => "mysql",
      :host     => "localhost",
      :username => "mr_roboto",
      :password => "secret_secret_i_got_a_secret",
      :database => db_name
    )
  end
end

Of course, you would have better database determination logic than whether the hour is an odd number or not. You might want to switch the database depending on the URL. For example:

  • bears.footballteamsrus.com maps to the database called “bears”
  • swans.footballteamsrus.com maps to the database called “swans”
  • etc.

Perhaps you’re creating database administration tool with Rails (wink, nudge) and need to switch the database based on a query string variable (a.k.a. @params).

Perhaps, you’re a web host and you’d like to dynamically choose a database based on which client’s site is being loaded.

Whatever the reason, that code will make it so.

Why the Hell would I want to do this?

As mentioned, there are pros and cons to this approach. I’m going to list what I can and leave it up to you to decide.

Cons (because everyone always lists the pros first)

  1. When changing the database schema, you’ve got to change it in all the databases
  2. When backing up, you’ve got multiple databases to manage
  3. Database caching is non-existant (read: small performance hit)
  4. Creating statistics and aggregating data becomes more of a challenge
  5. If you hijack the database connection logic, your functional and unit tests won’t be able to switch to the test database properly
  6. Even if you don’t hijack the database connection logic, functional and unit tests expect to push the data into the main test db, which may cause Unexpected Results.

Pros

  1. Complete isolation – you can have different database schemas and different functionality without affecting the others
  2. The \ActiveRecord code requires fewer relationships and the SQL is much cleaner when you don’t have to add `WHERE team_id = 123` to every query (granted, much of this \ActiveRecord handles more than gracefully)
  3. Less data and smaller indices mean faster SQL queries (in theory)
  4. No central database means each database could be located on a different box
  5. Backing up a single football team (in the above example) means dumping one database, not writing writing sql that says `WHERE team_id = 69`
  6. If you go the extra step and deploy one Rails app per customer, taking someone offline doesn’t mean taking everyone offline. Furthermore, if you are deployed on multiple isolated machines and one dies, you get fewer simultaneous irate support calls.
  7. If you’re wildly successful and want to scale up by having a bunch of read-only databases and a master writable database, this technique might be the way to go. See the \LiveJournal scaling article
  8. You are crreating an app that integrates multiple existing databases into a single web front end

Which way do you recommend?

David strongly recommends the single database approach (link?). Personally, I agree. \ActiveRecord was designed for the 1 database scenario. Life is simpler with 1 database. Aim for the single database approach unless you know you need multiple databases, or you’d just like to be different. :)

what-a-day with the invaluable help of #rubyonrails

I have a situation where I have one database that controls authentication for a mess of other applications (some rails… some not). In my rails applications that use this authentication scheme I have my database.yml configured to connect to the central authentication database (called general_security). Then, since the database specific to the rails app live on the same server (username,password,host etc… all the same just the database is different) I handle the different database name in my model files.

class Mpq < ==ActiveRecord==::Base
  def self.table_name() "mpq.mpqs" end
end

This of course wouldn’t work if the username,password, or host was different. Also note that this doesn’t work with PostgreSQL. — AustinMoody

What if my session contains model objects?

I kept running into problems using the above examples verbatim. I inserted the hijack_db method and before_filter into my application controller. The problem was that I also do authentication and carry a User object in the session. So I was forced to include “model :user” inside the application controller. This caused Rails to barf if I didn’t have a connection set up.

So what I had to do was stick the users table in the DB for my environment (the DB I would be using if I stuck to the config DBs). Its ugly, but it works. So I always hijack the DB before anything important happens, but I do need to provide tables for model objects carried in the session in the DB setup by database.yml.

Check out ActiveRecordCaveats for a note about using multiple databases with inheritance.
Check here
for example code of multiple databases.

Accessing Multiple Databases

What if you need to connect to a second (or more) databases from your application? For instance, you must use a database other than your application database to verify a login.

Add additional records into your database.yml file for each database. For a “security� database, create this entry for development, test, and production environments.

security_development: adapter: postgresql database: security host: localhost username: uuuu password: xxxxx

Next, edit the ./config/environment.rb file, just under the ActiveRecord::Base.establish_connection line add a line to connect from your model class to the name you used above.

The RAILS_ENV variable contains either “development”, “test”, or “production”, so we can use that to differentiate depending on the environment.

Say we have a model “Authorization� that needs to access a table in the security database, so we add:

Authorization.establish_connection "security_#{RAILS_ENV}�

Unknown: Do we have to do this for every table we need to access in non-application databases? Would it create multiple connections or cache the connection by its name? Does the security_test database get deleted and reconstructed during tests?

Sharing External ActiveRecord Connections by Dave Thomas suggesets this:

class LegacyBase < ActiveRecord::Base self.abstract_class = true establish_connection “legacy_#{RAILS_ENV}” end class LegacyOrder < LegacyBase … end class LegacyLineItem < LegacyBase … end

Note the trick done to prevent ActiveRecord to look for the the LegacyBase, by marking is as a abstract class.

Overriding only the database name

ActiveRecord::Base.configurations seems to be undocumented. It can be used for overriding only a database name, but keeping the same host and login information. For example:

config = ActiveRecord::Base.configurations[RAILS_ENV] class LegacyModel < ActiveRecord::Base establish_connection ( :database => “someOtherDatabase”, :adapter => config‘adapter’, :username => config‘username’, … ) end

This would be useful if you know the database is going to be stored on the same host with the same information. (Or, if you need to know information about your current database configuration for other purposes)

What about fixtures?

Here is how I made my fixtures use the proper database. Keep in mind there may be a better way of doing this. My table name is ‘words’ with model Word and in word.rb I have
Word.establish_connection(:genMarkov)

Now in your unit_test instead of using:

fixtures :words

use:

def setup Fixtures.create_fixtures(File.join(RAILS_ROOT, ‘test’, ’fixtures’), ’words’) { Word.connection } end

category:Howto

Rails can handle multiple DBs. The “Connection to multiple databases in different models” section of the ActiveRecord API Doc explains how to setup class-specific connection. Also see this helpful tip on proper configuration.

The rest of this document examines how to determine which database to connect to on a per-request basis.

So you want to connect to multiple databases eh? What are you, crazy? :)

Naturally there are pros and cons to this approach. I’ll show you how it is done first, then leave it up to you to decide if you really want to do it.

This guide will only explain how to determine which database to connect to per request.

This differs from the default where each rails application uses only one production database. This way, you need only change your config/database.yml file and you’re good to go. With multiple databases, this is no longer acceptable.

app/controllers/abstract_application.rb defines a class called \AbstractApplicationController. All other user-created controllers descend through this. We’re going to place a before_filter on this sucker to step in front of the normal db mechanism. Because \ActiveRecord’s connection works on a first-come/first-serve basis, if we’re able to set the establish_connection before the default does, we win.

Got it? Good.


class ==AbstractApplicationController== < ==ActionController==::Base
  # here, we hop into the front of the request-handling
  # pipeline to run a method called hijack_db
  before_filter :hijack_db
 
  # here, we're going to manually establish a connection
  # to the database we feel like connecting to.
  def hijack_db
    # completely ridiculous condition
    is_odd = Time.now.hour % 2 == 1
 
    # determine the database name
    db_name = is_odd ? "dis_odd_db" : "dat_even_db"
 
    # lets manually connect to the proper db
		    <a href="http://wiki.rubyonrails.com/rails/pages/ActiveRecord" class="existingWikiWord">ActiveRecord</a>::Base.establish_connection(
      :adapter  => "mysql",
      :host     => "localhost",
      :username => "mr_roboto",
      :password => "secret_secret_i_got_a_secret",
      :database => db_name
    )
  end
end

Of course, you would have better database determination logic than whether the hour is an odd number or not. You might want to switch the database depending on the URL. For example:

  • bears.footballteamsrus.com maps to the database called “bears”
  • swans.footballteamsrus.com maps to the database called “swans”
  • etc.

Perhaps you’re creating database administration tool with Rails (wink, nudge) and need to switch the database based on a query string variable (a.k.a. @params).

Perhaps, you’re a web host and you’d like to dynamically choose a database based on which client’s site is being loaded.

Whatever the reason, that code will make it so.

Why the Hell would I want to do this?

As mentioned, there are pros and cons to this approach. I’m going to list what I can and leave it up to you to decide.

Cons (because everyone always lists the pros first)

  1. When changing the database schema, you’ve got to change it in all the databases
  2. When backing up, you’ve got multiple databases to manage
  3. Database caching is non-existant (read: small performance hit)
  4. Creating statistics and aggregating data becomes more of a challenge
  5. If you hijack the database connection logic, your functional and unit tests won’t be able to switch to the test database properly
  6. Even if you don’t hijack the database connection logic, functional and unit tests expect to push the data into the main test db, which may cause Unexpected Results.

Pros

  1. Complete isolation – you can have different database schemas and different functionality without affecting the others
  2. The \ActiveRecord code requires fewer relationships and the SQL is much cleaner when you don’t have to add `WHERE team_id = 123` to every query (granted, much of this \ActiveRecord handles more than gracefully)
  3. Less data and smaller indices mean faster SQL queries (in theory)
  4. No central database means each database could be located on a different box
  5. Backing up a single football team (in the above example) means dumping one database, not writing writing sql that says `WHERE team_id = 69`
  6. If you go the extra step and deploy one Rails app per customer, taking someone offline doesn’t mean taking everyone offline. Furthermore, if you are deployed on multiple isolated machines and one dies, you get fewer simultaneous irate support calls.
  7. If you’re wildly successful and want to scale up by having a bunch of read-only databases and a master writable database, this technique might be the way to go. See the \LiveJournal scaling article
  8. You are crreating an app that integrates multiple existing databases into a single web front end

Which way do you recommend?

David strongly recommends the single database approach (link?). Personally, I agree. \ActiveRecord was designed for the 1 database scenario. Life is simpler with 1 database. Aim for the single database approach unless you know you need multiple databases, or you’d just like to be different. :)

what-a-day with the invaluable help of #rubyonrails

I have a situation where I have one database that controls authentication for a mess of other applications (some rails… some not). In my rails applications that use this authentication scheme I have my database.yml configured to connect to the central authentication database (called general_security). Then, since the database specific to the rails app live on the same server (username,password,host etc… all the same just the database is different) I handle the different database name in my model files.

class Mpq < ==ActiveRecord==::Base
  def self.table_name() "mpq.mpqs" end
end

This of course wouldn’t work if the username,password, or host was different. Also note that this doesn’t work with PostgreSQL. — AustinMoody

What if my session contains model objects?

I kept running into problems using the above examples verbatim. I inserted the hijack_db method and before_filter into my application controller. The problem was that I also do authentication and carry a User object in the session. So I was forced to include “model :user” inside the application controller. This caused Rails to barf if I didn’t have a connection set up.

So what I had to do was stick the users table in the DB for my environment (the DB I would be using if I stuck to the config DBs). Its ugly, but it works. So I always hijack the DB before anything important happens, but I do need to provide tables for model objects carried in the session in the DB setup by database.yml.

Check out ActiveRecordCaveats for a note about using multiple databases with inheritance.
Check here
for example code of multiple databases.

Accessing Multiple Databases

What if you need to connect to a second (or more) databases from your application? For instance, you must use a database other than your application database to verify a login.

Add additional records into your database.yml file for each database. For a “security� database, create this entry for development, test, and production environments.

security_development: adapter: postgresql database: security host: localhost username: uuuu password: xxxxx

Next, edit the ./config/environment.rb file, just under the ActiveRecord::Base.establish_connection line add a line to connect from your model class to the name you used above.

The RAILS_ENV variable contains either “development”, “test”, or “production”, so we can use that to differentiate depending on the environment.

Say we have a model “Authorization� that needs to access a table in the security database, so we add:

Authorization.establish_connection "security_#{RAILS_ENV}�

Unknown: Do we have to do this for every table we need to access in non-application databases? Would it create multiple connections or cache the connection by its name? Does the security_test database get deleted and reconstructed during tests?

Sharing External ActiveRecord Connections by Dave Thomas suggesets this:

class LegacyBase < ActiveRecord::Base self.abstract_class = true establish_connection “legacy_#{RAILS_ENV}” end class LegacyOrder < LegacyBase … end class LegacyLineItem < LegacyBase … end

Note the trick done to prevent ActiveRecord to look for the the LegacyBase, by marking is as a abstract class.

Overriding only the database name

ActiveRecord::Base.configurations seems to be undocumented. It can be used for overriding only a database name, but keeping the same host and login information. For example:

config = ActiveRecord::Base.configurations[RAILS_ENV] class LegacyModel < ActiveRecord::Base establish_connection ( :database => “someOtherDatabase”, :adapter => config‘adapter’, :username => config‘username’, … ) end

This would be useful if you know the database is going to be stored on the same host with the same information. (Or, if you need to know information about your current database configuration for other purposes)

What about fixtures?

Here is how I made my fixtures use the proper database. Keep in mind there may be a better way of doing this. My table name is ‘words’ with model Word and in word.rb I have
Word.establish_connection(:genMarkov)

Now in your unit_test instead of using:

fixtures :words

use:

def setup Fixtures.create_fixtures(File.join(RAILS_ROOT, ‘test’, ’fixtures’), ’words’) { Word.connection } end

category:Howto