Want to see the full-length video right now for free?
 
  In this lesson, we're going to learn advanced techniques for querying our
has_many associations.
We'll start with the same domain model that we had in the previous lesson, but
this time add Location into the mix:
class Person < ActiveRecord::Base
  belongs_to :location
  belongs_to :role
end
class Role < ActiveRecord::Base
  has_many :people
end
class Location < ActiveRecord::Base
  has_many :people
end
Locations and people have the same kind of 1-N relationship that we previously had between roles and people.
Here's the current state of affairs in our database:
Role.all
| id | name | billable | 
|---|---|---|
| 1 | Developer | t | 
| 2 | Manager | f | 
| 3 | Unassigned | f | 
Location.all
| id | name | billable | 
|---|---|---|
| 1 | Boston | 1 | 
| 2 | New York | 1 | 
| 3 | Denver | 2 | 
Person.all
| id | name | role_id | location_id | 
|---|---|---|---|
| 1 | Wendell | 1 | 1 | 
| 2 | Christie | 1 | 1 | 
| 3 | Sandy | 1 | 3 | 
| 4 | Eve | 2 | 2 | 
This time, we want to find all distinct locations with at least one person who belongs to a billable role.
In other words, find billable locations, where a billable location is a location that has at least one billable person, and a billable person is somebody whose role is billable. The word "distinct" is in there for reasons that will soon become clear.
joins methodJust like with a belongs_to association, we can [tell ActiveRecord to join a has_many association][]:
Location.joins(:people)
which generates similar SQL (but with the primary and foreign keys flipped):
SELECT "locations".*
FROM "locations"
INNER JOIN "people"
  ON "people"."location_id" = "locations"."id";
and retrieves data like this:
| locations | people | |||||
|---|---|---|---|---|---|---|
| id | name | region_id | id | name | role_id | location_id | 
| 1 | Boston | 1 | 1 | Wendell | 1 | 1 | 
| 1 | Boston | 1 | 2 | Christie | 1 | 1 | 
| 3 | Denver | 2 | 3 | Sandy | 1 | 3 | 
| 2 | New York | 1 | 4 | Eve | 2 | 2 | 
The output is similar, but the place where it gets interesting is that we end
up with Boston more than once. This is because we joined all the people that
belonged to a location onto the location, and there is more than one person
with a location_id of 1, which corresponds to Boston.
If you're used to thinking about your associated collections more like a tree than a table, then this takes a little getting used to; but having everything in this two-dimensional world of joins and tables is actually very useful and lets us efficiently reason about and query our data.
[tell ActiveRecord to join a has_many association]: http://guides.rubyonrails.org/active_record_querying.html#using-array-hash-of-named-associations
However, in addition to joining a direct association, ActiveRecord allows us to
go even further and join indirect associations. This is almost like doing a
has_many/through on the fly:
Location.joins(people: :role)
which generates SQL like this:
SELECT "locations".*
FROM "locations"
INNER JOIN "people"
  ON "people"."location_id" = "locations"."id"
INNER JOIN "roles"
  ON "roles"."id" = "people"."role_id";
| locations | people | roles | |||||||
|---|---|---|---|---|---|---|---|---|---|
| id | name | region_id | id | name | role_id | location_id | id | name | billable | 
| 1 | Boston | 1 | 1 | Wendell | 1 | 1 | 1 | Developer | t | 
| 1 | Boston | 1 | 2 | Christie | 1 | 1 | 1 | Developer | t | 
| 3 | Denver | 2 | 3 | Sandy | 1 | 3 | 1 | Developer | t | 
| 2 | New York | 1 | 4 | Eve | 2 | 2 | 2 | Manager | f | 
You can see that we are now gluing three tables together: we join roles to
people (which doesn't result in duplicated rows since each person only belongs
to one role), and then we join role-enhanced people to each location.
So currently the result set would be four ActiveRecord Location objects, two of which are Boston; but we have all of the attributes of a location's people and their roles available to query against should we wish to.
where methodWe can now filter the way we want to, with [ActiveRecord's where method][]:
Location.joins(people: :role).where(roles: { billable: true })
which generates SQL like this:
SELECT "locations".*
FROM "locations"
INNER JOIN "people"
  ON "people"."location_id" = "locations"."id"
INNER JOIN "roles"
  ON "roles"."id" = "people"."role_id"
WHERE "roles"."billable" = 't';
and retrieves data like this:
| id | name | region_id | id | name | role_id | location_id | id | name | billable | 
|---|---|---|---|---|---|---|---|---|---|
| 1 | Boston | 1 | 1 | Wendell | 1 | 1 | 1 | Developer | t | 
| 1 | Boston | 1 | 2 | Christie | 1 | 1 | 1 | Developer | t | 
| 3 | Denver | 2 | 3 | Sandy | 1 | 3 | 1 | Developer | t | 
| people | roles | locations | |||||||
|---|---|---|---|---|---|---|---|---|---|
| id | name | region_id | id | name | role_id | location_id | id | name | billable | 
| 1 | Boston | 1 | 1 | Wendell | 1 | 1 | 1 | Developer | t | 
| 1 | Boston | 1 | 2 | Christie | 1 | 1 | 1 | Developer | t | 
| 3 | Denver | 2 | 3 | Sandy | 1 | 3 | 1 | Developer | t | 
but, again, we're seeing Boston twice. Let's fix that.
[ActiveRecord's where method]: http://guides.rubyonrails.org/active_record_querying.html#specifying-conditions-on-the-joined-tables
distinct methodFortunately, ActiveRecord has a nifty method called distinct, which throws in
the DISTINCT keyword (at least, in Postgres; other databases might work
differently, but ActiveRecord will handle it):
Location.joins(people: :role).where(roles: { billable: true }).distinct
which generates SQL like this:
SELECT DISTINCT "locations".*
FROM "locations"
INNER JOIN "people"
  ON "people"."location_id" = "locations"."id"
INNER JOIN "roles"
  ON "roles"."id" = "people"."role_id"
WHERE "roles"."billable" = 't';
and retrieves data like this:
| id | name | region_id | 
|---|---|---|
| 3 | Denver | 2 | 
| 1 | Boston | 1 | 
Mission accomplished!
Just like last time, we can now encapsulate our lovely query in a method:
class Location < ActiveRecord::Base
  def self.billable
    joins(people: :role).where(roles: { billable: true }).distinct
  end
end
In general, we want to avoid letting these ActiveRecord deeper queries move outside of our ActiveRecord objects. We could have queries like this scattered throughout our controllers, for example, but we should move them into a nicely named method or scope and then call that method from everywhere else.
Even in this simple example, we would probably define Role.billable, and use
that to define People.billable, and then only reference People.billable in
Location.billable. But we'll leave that as an exercise for the reader, since
it's easier to see here all in one place.
Now, we want to order the billable locations by region name, then by location
name. Which means we need a Region model:
class Location < ActiveRecord::Base
  belongs_to :region
end
class Region < ActiveRecord::Base
  has_many :locations
end
We can start with things we've already learned to get the ordering part working for all locations:
Location.joins(:region).merge(Region.order(:name)).order(:name)
which generates SQL like this:
SELECT "locations".*
FROM "locations"
INNER JOIN "regions"
  ON "regions"."id" = "locations"."region_id"
ORDER BY "regions"."name" ASC, "locations"."name" ASC;
and retrieves data like this:
| locations | regions | |||
|---|---|---|---|---|
| id | name | region_id | id | name | 
| 1 | Boston | 1 | 1 | East | 
| 2 | New York | 1 | 1 | East | 
| 3 | Denver | 2 | 2 | West | 
We can see it joins the tables correctly, and merges the scopes correctly.
Let's pull this logic into its own method, by_region_and_location_name:
class Location < ActiveRecord::Base
  def self.billable
    joins(people: :role).where(roles: { billable: true }).distinct
  end
  def self.by_region_and_location_name
    joins(:region).merge(Region.order(:name)).order(:name)
  end
end
It gets tricky when we want to combine our two Location scopes. If we try
something like this:
Location.billable.by_region_and_location_name
which generates SQL like this:
SELECT DISTINCT "locations".*
FROM "locations"
INNER JOIN "people"
  ON "people"."location_id" = "locations"."id"
INNER JOIN "roles"
  ON "roles"."id" = "people"."role_id"
INNER JOIN "regions"
  ON "regions"."id" = "locations"."region_id"
WHERE "roles"."billable" = 't'
ORDER BY "regions"."name" ASC, "locations"."name" ASC;
then Postgres throws an error:
PG::InvalidColumnReference: ERROR:  for SELECT DISTINCT, ORDER BY expressions must
appear in select list
LINE 1: ...gion_id" WHERE "roles"."billable" = 't'  ORDER BY "regions"....
This can be frustrating, but there are good reasons for it. Basically, it's an
order of operations issue: we have to make sure that when we eliminate some
rows with DISTINCT, we don't lose important information that we needed for
ordering. So we need to be explicit about how to end up with distinct things
first, and then order them.
from methodTo resolve this, we're first going to use [ActiveRecord's from method][] to
create a sub-query that returns distinct billable locations:
Location.from(Location.billable, :locations)
which generates SQL like this:
SELECT "locations".*
FROM (
  SELECT DISTINCT "locations".*
  FROM "locations"
  INNER JOIN "people"
    ON "people"."location_id" = "locations"."id"
  INNER JOIN "roles"
    ON "roles"."id" = "people"."role_id"
  WHERE "roles"."billable" = 't'
) locations;
and retrieves data like this:
| id | name | region_id | 
|---|---|---|
| 3 | Denver | 2 | 
| 1 | Boston | 1 | 
The second argument to the from method is the alias that we want for our
virtual table that results from the sub-query; we specify the same old
conventional name so that subsequent queries will get what they expect.
[ActiveRecord's from method]: http://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-from
Finally, we can put it all together:
Location.from(Location.billable, :locations).by_region_and_location_name
which generates SQL like this:
SELECT "locations".*
FROM (
  SELECT DISTINCT "locations".*
  FROM "locations"
  INNER JOIN "people"
    ON "people"."location_id" = "locations"."id"
  INNER JOIN "roles"
    ON "roles"."id" = "people"."role_id"
  WHERE "roles"."billable" = 't'
) locations
INNER JOIN "regions"
  ON "regions"."id" = "locations"."region_id"
ORDER BY "regions"."name" ASC, "locations"."name" ASC;
and retrieves data like this:
| locations | regions | |||
|---|---|---|---|---|
| id | name | region_id | id | name | 
| 1 | Boston | 1 | 1 | East | 
| 3 | Denver | 2 | 2 | West | 
Success!
We very frequently come across codebases that do this kind of work using plain old Ruby, and therefore make lots of extra database hits and use lots of unnecessary memory building unnecessary ActiveRecord objects.
Your database is amazingly good at querying and ordering your data: don't
reinvent the wheel if you don't have to. Just about anything that you can do
with Ruby's Enumerable, the database can do better.
And that's it for querying has_many associations! We have an exercise for you
to practice some of these techniques, and then we'll see you back for our next
adventure: Custom Joins with ActiveRecord.
Good luck!