Database Joins: A Quick Primer

The following is a simple introduction to various ways of joining database tables. The example code should work identically in MySQL and PostgreSQL unless otherwise noted.

Set Up

First let's lay foundation for the data we'll be joining. We'll use a simple company <> people pattern.

We have a table of three companies.

SELECT * FROM companies;

 company_id |  company   
------------+---------
          1 | iMarc
          2 | Apple
         10 | Startup

We also have a table of people. These people, if employed, can be associated with one of the companies.

SELECT * FROM people;

 person_id | company_id |      name      
-----------+------------+----------------
         1 |          1 | Dave Tufts
         3 |          2 | Steve Jobs
         4 |          2 | The Woz
        20 |            | Bob Unemployed
  • Three of the people are employed at a company.
  • Bob Unemployed isn't associated with any company.
  • One of the companies listed above, Startup, doesn't have any employees.

I've also included the SQL I used to create and populate these tables below. Feel free to try these joins in your own database.

The Joins

Now we'll look at various ways of pulling back the data.

The Straight Join (or Plain Join, or Regular Join)

This only returns data that can be referenced in both tables.

SELECT * 
FROM people, companies 
WHERE people.company_id = companies.company_id;
-- ===================================================

 person_id | company_id |    name    | company_id | company  
-----------+------------+------------+------------+---------
         1 |          1 | Dave Tufts |          1 | iMarc
         3 |          2 | Steve Jobs |          2 | Apple
         4 |          2 | The Woz    |          2 | Apple

The Left Join

Use a left join if you want data that can be referenced in the first table of your select statement regardless if there is any associated data in the second table. Reading left-to-right, if we reference people first, a LEFT JOIN will return all people. If the person is associated with a company, the company info will be returned. If they are not associated with a company, NULL values will be returned.

SELECT * 
FROM people 
LEFT JOIN companies 
  USING (company_id);
-- ===================================================

 company_id | person_id |      name      | company  
------------+-----------+----------------+---------
          1 |         1 | Dave Tufts     | iMarc
          2 |         3 | Steve Jobs     | Apple
          2 |         4 | The Woz        | Apple
            |        20 | Bob Unemployed | 

Both PostgreSQL and MySQL treat a simple LEFT OUTER JOIN identically to a LEFT JOIN. In the example above, I could have added the word 'OUTER' and got back the same results

The Right Join

Use a right join if you want data that can be referenced in the second table of your select regardless if there is any associated data in the first table.

SELECT * 
FROM people 
RIGHT JOIN companies 
  USING (company_id);
-- ===================================================

 company_id | person_id   |    name    | company   
------------+-------------+------------+---------
          1 |           1 | Dave Tufts | iMarc
          2 |           3 | Steve Jobs | Apple
          2 |           4 | The Woz    | Apple
          0 |             |            | Startup

Like the left join, MySQL and PostgreSQL would return the identical result if I used "RIGHT OUTER JOIN" above.

Full Outer Join (PostgreSQL-only)

The full outer join returns a single row for all full data sets from both tables. As far as I know, this syntax only works in real databases. However, MySQL has many storage engines, maybe one of them supports the FULL OUTER JOIN.

SELECT * 
FROM companies 
FULL OUTER JOIN people 
  ON people.company_id=companies.company_id;
-- ===================================================

 company_id | company | person_id | company_id |      name      
------------+---------+-----------+------------+----------------
          1 | iMarc   |         1 |          1 | Dave Tufts
          2 | Apple   |         3 |          2 | Steve Jobs
          2 | Apple   |         4 |          2 | The Woz
            |         |        20 |            | Bob Unemployed
         10 | Startup |           |            | 

Another fun* thing you can do with the FULL OUTER JOIN is only return data that is incomplete — data that's in one table but not the other. Basically the opposite of a straight join.

SELECT * 
FROM companies 
FULL OUTER JOIN people 
  ON people.company_id=companies.company_id 
WHERE companies.company_id IS NULL 
  OR people.company_id IS NULL;
-- ===================================================

 company_id | company | person_id | company_id |      name      
------------+---------+-----------+------------+----------------
            |         |        20 |            | Bob Unemployed
         10 | Startup |           |            | 

* By fun, I don't really mean fun. There's nothing fun about a full outer join.

The Dreaded Cartesian Join

I can't think of any reason why you'd would to this, but a cartesian join returns every possibility of data relation between the two tables. If you have tables with millions of rows, you should try this now!

SELECT * FROM people, companies;
-- ===================================================

 person_id | company_id |      name      | company_id | company   
-----------+------------+----------------+------------+---------
         1 |          1 | Dave Tufts     |          1 | iMarc
         3 |          2 | Steve Jobs     |          1 | iMarc
         4 |          2 | The Woz        |          1 | iMarc
        20 |            | Bob Unemployed |          1 | iMarc
         1 |          1 | Dave Tufts     |          2 | Apple
         3 |          2 | Steve Jobs     |          2 | Apple
         4 |          2 | The Woz        |          2 | Apple
        20 |            | Bob Unemployed |          2 | Apple
         1 |          1 | Dave Tufts     |         10 | Startup
         3 |          2 | Steve Jobs     |         10 | Startup
         4 |          2 | The Woz        |         10 | Startup
        20 |            | Bob Unemployed |         10 | Startup

That's all. I hope that made sense. If you want to try these SELECT and JOIN statements on your own database, here's the SQL I used to create and populate the tables.

CREATE TABLE companies (
    company_id serial PRIMARY KEY,
    company varchar(32) NOT NULL
);
INSERT INTO companies (company_id, company) VALUES (1, 'iMarc');
INSERT INTO companies (company_id, company) VALUES (2, 'Apple');
INSERT INTO companies (company_id, company) VALUES (10, 'Startup');

CREATE TABLE people (
  	person_id serial PRIMARY KEY,
  	company_id integer DEFAULT NULL,
  	name varchar(32) NOT NULL
);
INSERT INTO people (person_id, company_id, name) 
    VALUES (1, 1, 'Dave Tufts');
INSERT INTO people (person_id, company_id, name) 
    VALUES (3, 2, 'Steve Jobs');
INSERT INTO people (person_id, company_id, name) 
    VALUES (4, 2, 'The Woz');
INSERT INTO people (person_id, company_id, name) 
    VALUES (20, NULL, 'Bob Unemployed');

Comments

Wednesday, Jul 8, 2009 / 8:49pm Errol Sayre said…

Thanks for this, now I have a bookmarked page to send people to when they ask me these sorts of questions ;-)

Wednesday, Jul 8, 2009 / 8:50pm Errol Sayre said…

Of course you know a real employment database would have a separate table to identify the relations between person and company so that a given person could have multiple jobs across multiple companies ;-)

Thursday, Jul 9, 2009 / 9:36am Robert Mohns said…

@errol: beware feature creep and "that's easy" assumptions... http://blog.bitquabit.com/2009/07/01/one-which-i-call-out-hacker-news/

:-)

Thursday, Jul 9, 2009 / 10:29am Nick said…

Good stuff. If you'd like the Oracle equivalent let me know.

Thursday, Jul 9, 2009 / 10:41am Dave Tufts said…

@Nick - Sure. I think if you load the same data in Oracle, you should get back the same results back. I think Oracle also supports the same FULL OUTER JOIN syntax as PostgreSQL. But yes, I'd love to see the results from Oracle, just to confirm they're the same. Feel free to email me [my-first-name] @ [our-domain] - thanks

Thursday, Jul 9, 2009 / 1:01pm Will Bond said…

Another good mention is the NATURAL JOIN, which automatically joins on column names shared between the two tables. This syntax is supported by most of the big names except for SQL Server.

The , operator in your first example is equivalent to what is known as a CROSS JOIN, which is also called a cartesian join, although that term is never used in the SQL language. Since you have a WHERE clause you are in essentially getting a JOIN or INNER JOIN where the ON clause is the same as your WHERE clause.

To further muddy the waters, PostgreSQL (and I believe SQLite, Oracle and SQL Server) properly considers JOIN an INNER JOIN and requires an ON clause. MySQL, liking to play fast and loose, will treat JOIN as a CROSS JOIN if no ON clause is included.

SQLite does not support RIGHT or FULL JOINs, however a RIGHT JOIN is the same as a LEFT JOIN with the tables switched.

One little interesting parts of USING conditions is that they remove one copy of each column listed from any SELECT * queries. If you are using SQL Server, unfortunately it does not support USING conditions.

Unfortunately there are tons of little inconsistencies between the big RDBMSs. I have documented a number of them at the Flourish SQL page.

Wednesday, Jul 15, 2009 / 10:54am Errol Sayre said…

@robert

You have a good point, though I am generally of the opinion that writing a flexible setup once and reusing it is better than writing a pared down each time its needed.

Still, my original comment was more about being a smart-ass then design philosophy ;-)

Tuesday, Sep 15, 2009 / 7:24pm Adam Pittman said…

Good stuff, Dave. For those who are interested in doing a full outer join in a dbms not supporting it (sql server 2005 and up do actually support it), keep in mind that a full outer join is simply the union of a left join with a right join. In place of Dave's first example we could write:

SELECT * FROM people LEFT JOIN companies ON people.company_id = companies.company_id UNION SELECT * FROM people RIGHT JOIN companies ON people.company_id = companies.company_id

...and get the same results. Likewise, the second example could be written:

SELECT * FROM people LEFT JOIN companies ON people.company_id = companies.company_id UNION SELECT * FROM people RIGHT JOIN companies ON people.company_id = companies.company_id EXCEPT SELECT * FROM people, companies WHERE people.company_id = companies.company_id

...although of course neither of these is as concise as Dave's examples employing the full outer join keyword. Then again, every kind of join keyword is really just short-hand; the same results can always be had using some combination of subqueries and set operators (union, except ["minus" with some vendors], intersect).

Comments have been turned off on this blog.
Read something more recent.

Statements and opinions expressed in this blog and any comments made are the private opinions of the respective poster, and, as such, iMarc LLC is neither responsible nor liable for such content.

Meet The Author

Dave Tufts

Vice President, Director of Technology

Search

Recent Blog Posts

Recent Comments

  • Changes

    Jean Fitzgerald commented: Congratulations Jeff! It's a great move for you, the company will never regret it. You are one of the most creative people I know too.

  • Changes

    Claire Turcotte commented: Jeffrey! Congratulations. VERY PROUD OF YOU. Send me an answer............ Love, Memere

  • What To Know Before You SEO

    seo course bangalore commented: Wow, awesome blog layout! How long have you been blogging for? you made blogging look easy. The overall look of your website is fantastic, as well as the content! http://seocoursevideo.com/worry/

  • Changes

    Nick Hill commented: Congrats to Will and Jeff!

  • Scrolling, clicking, and the fold

    jay commented: Yeah ,,people will scroll up and down no matter what, now a days a lot of sites have horizontal scroll ..Personally i think scrolling is better than clicking..no time to waste lookin for the link and then clicking..scrolling offers all in one go.

We heart Visitors

  • iMarc
  • 14 Inn Street
  • Newburyport, MA 01950
  • Phone: (978) 462-8848
  • Fax: (978) 462-8807
  • Directions

Contact Us

Whether you have a huge project or just want to talk about updating your site, we’re here to help. Fill out the form, and we’ll get right back to you.

Contact Us
  • All Fields Required

Close