Finding Candidates’ Similarities with Self-Join
by Dave Tufts - March 6, 2008 / 11:47pm
In database-land, SQL self-joins are queries where the table is compared to itself. Meanwhile, in political-land, people often complain of homogeneous candidates and lack of choices.
Using a self-join on the public voting records of Hillary Clinton, John McCain, and Barack Obama, we can see just how similar these candidates are. VoteSmart.org has compiled the complete voting records for all members of Congress. As of March 2008, the three remaining likely presidential candidates are all senators, making it easy to compare their records.
I made a quick database table and imported all the data from VoteSmart.org. The database table included the candidate's name, the issue they voted on, and the vote that was cast. Every senatorial vote is a "Y" (yes/for vote), "N" (no/against vote), or "NV" (No Vote was cast).
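A table like this is easy to reproduce for experimenting. The sketch below is not the author's setup (the original looks like PostgreSQL); it assumes SQLite via Python's standard library, uses the `candidate`, `issue`, and `vote` columns from the article's table definition, and loads a few invented rows. The senators and bills are placeholders, not real voting records.

```python
import sqlite3

# In-memory SQLite database; an assumption made so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")

# Same three columns as the article's table. The UNIQUE constraint mirrors
# the unique index on (candidate, issue, vote).
conn.execute("""
    CREATE TABLE votes (
        candidate VARCHAR(255) NOT NULL,
        issue     VARCHAR(255) NOT NULL,
        vote      VARCHAR(5)   NOT NULL,
        UNIQUE (candidate, issue, vote)
    )
""")

# Invented placeholder rows, NOT real voting records.
toy_rows = [
    ("Sen A", "Bill 1", "Y"),
    ("Sen A", "Bill 2", "NV"),
    ("Sen B", "Bill 1", "N"),
]
conn.executemany("INSERT INTO votes VALUES (?, ?, ?)", toy_rows)

# Importing the same row twice violates the unique constraint.
try:
    conn.execute("INSERT INTO votes VALUES ('Sen A', 'Bill 1', 'Y')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT count(*) FROM votes").fetchone()[0]
print(row_count, duplicate_rejected)
```

The unique constraint is a cheap guard against importing the same roll call twice, which would otherwise inflate every count that follows.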
Table "votes"
    Column   |          Type          | Modifiers
-------------+------------------------+-----------
 candidate   | character varying(255) | not null
 issue       | character varying(255) | not null
 vote        | character varying(5)   | not null
Indexes:
    "votes_issue_key" UNIQUE, btree (candidate, issue, vote)
First I wanted to see how responsible each candidate was. The Constitution clearly outlines a Senator's job description. Senators propose and vote on Bills. Bills that pass become laws. If you're a Senator and you're not voting on Bills, you're not doing your job.
A quick SQL count showed that Obama was actually the least responsible candidate, missing almost 1/5th of his roll calls (45 of 255). I'm not complaining. Who wouldn't want to take 1/5th of their work week off...like maybe every Friday?
SELECT
    vote,
    count(*)
FROM
    votes
WHERE
    candidate = 'Obama'
GROUP BY
    vote;

 vote | count
------+-------
 NV   |    45
 Y    |   153
 N    |    57
In terms of not missing votes, Hillary is the most responsible, only missing 9% of her roll calls.
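The missed-vote percentage can also be computed for every candidate in one query instead of counting each one separately. This is a sketch under assumptions: SQLite through Python's standard library, the `candidate` column from the table definition, and invented placeholder rows rather than real records. The last lines redo the article's own arithmetic from the counts shown for Obama.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes (candidate TEXT NOT NULL, issue TEXT NOT NULL, vote TEXT NOT NULL)"
)

# Invented placeholder rows, NOT real voting records.
toy_rows = [
    ("Sen A", "Bill 1", "Y"), ("Sen A", "Bill 2", "N"),
    ("Sen A", "Bill 3", "NV"), ("Sen A", "Bill 4", "Y"),
    ("Sen B", "Bill 1", "NV"), ("Sen B", "Bill 2", "N"),
    ("Sen B", "Bill 3", "NV"), ("Sen B", "Bill 4", "Y"),
]
conn.executemany("INSERT INTO votes VALUES (?, ?, ?)", toy_rows)

# Share of roll calls each candidate missed, worst attendance first.
missed = conn.execute("""
    SELECT
        candidate,
        100.0 * sum(CASE WHEN vote = 'NV' THEN 1 ELSE 0 END) / count(*) AS pct_missed
    FROM votes
    GROUP BY candidate
    ORDER BY pct_missed DESC
""").fetchall()
print(missed)

# The article's counts for Obama: 45 NV, 153 Y, 57 N.
obama_pct_missed = round(100 * 45 / (45 + 153 + 57), 1)
print(obama_pct_missed)  # 17.6, i.e. "almost 1/5th"
```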
View the full report for all the details.
Analyzing Clinton, McCain, and Obama's records, we can also determine how similar they are. To do this, we bust out the self-join.
First, we'll count how many times two candidates both cast a vote on the same issue. To do this, we:
- pick two candidates
- ignore missed votes (NV) by either candidate
- self-join on the issue
SELECT
    count(t1.*)
FROM
    votes t1,
    votes t2
WHERE
    t1.candidate = 'McCain'
    AND t2.candidate = 'Obama'
    AND t1.vote != 'NV'
    AND t2.vote != 'NV'
    AND t1.issue = t2.issue;

 count
-------
   189
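The comma-separated FROM list above is an implicit join. The same self-join can be written with an explicit JOIN ... ON clause, which some readers find easier to follow because the join condition sits next to the tables it connects. The sketch below runs both forms and checks that they agree. It assumes SQLite via Python's standard library, the `candidate` column from the table definition, and invented placeholder rows. SQLite does not accept `count(t1.*)`, so `count(*)` is used instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes (candidate TEXT NOT NULL, issue TEXT NOT NULL, vote TEXT NOT NULL)"
)

# Invented placeholder rows, NOT real voting records.
toy_rows = [
    ("Sen A", "Bill 1", "Y"),  ("Sen B", "Bill 1", "Y"),
    ("Sen A", "Bill 2", "Y"),  ("Sen B", "Bill 2", "N"),
    ("Sen A", "Bill 3", "NV"), ("Sen B", "Bill 3", "Y"),
    ("Sen A", "Bill 4", "N"),  ("Sen B", "Bill 4", "N"),
]
conn.executemany("INSERT INTO votes VALUES (?, ?, ?)", toy_rows)

# Implicit join, as in the article: both copies of the table in FROM.
implicit_count = conn.execute("""
    SELECT count(*)
    FROM votes t1, votes t2
    WHERE t1.candidate = 'Sen A'
      AND t2.candidate = 'Sen B'
      AND t1.vote != 'NV'
      AND t2.vote != 'NV'
      AND t1.issue = t2.issue
""").fetchone()[0]

# Explicit join: the issue match moves into the ON clause.
explicit_count = conn.execute("""
    SELECT count(*)
    FROM votes t1
    JOIN votes t2 ON t2.issue = t1.issue
    WHERE t1.candidate = 'Sen A'
      AND t2.candidate = 'Sen B'
      AND t1.vote != 'NV'
      AND t2.vote != 'NV'
""").fetchone()[0]

print(implicit_count, explicit_count)
```

In the toy data, Bill 3 drops out because Sen A did not vote, leaving three shared opportunities in both forms.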
Now, to find similarities, we execute the same self-joining select but only include results where both candidates voted the same way. That means adding an equality comparison on the vote column across both sides of the self-join.
SELECT
    count(t1.*)
FROM
    votes t1,
    votes t2
WHERE
    t1.candidate = 'McCain'
    AND t2.candidate = 'Obama'
    AND t1.vote != 'NV'
    AND t2.vote != 'NV'
    AND t1.issue = t2.issue
    AND t1.vote = t2.vote;

 count
-------
    60
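The two queries can be folded into one: count every shared opportunity, and use a CASE expression to tally only the matching votes. The helper name `agreement` is hypothetical, and the sketch assumes SQLite via Python's standard library, the `candidate` column from the table definition, and invented placeholder rows. The final lines apply the same division to the article's McCain and Obama counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes (candidate TEXT NOT NULL, issue TEXT NOT NULL, vote TEXT NOT NULL)"
)

# Invented placeholder rows, NOT real voting records.
toy_rows = [
    ("Sen A", "Bill 1", "Y"),  ("Sen B", "Bill 1", "Y"),
    ("Sen A", "Bill 2", "Y"),  ("Sen B", "Bill 2", "N"),
    ("Sen A", "Bill 3", "NV"), ("Sen B", "Bill 3", "Y"),
    ("Sen A", "Bill 4", "N"),  ("Sen B", "Bill 4", "N"),
]
conn.executemany("INSERT INTO votes VALUES (?, ?, ?)", toy_rows)


def agreement(conn, first, second):
    """Return (shared opportunities, identical votes) for two candidates."""
    return conn.execute("""
        SELECT
            count(*),
            sum(CASE WHEN t1.vote = t2.vote THEN 1 ELSE 0 END)
        FROM votes t1
        JOIN votes t2 ON t2.issue = t1.issue
        WHERE t1.candidate = ?
          AND t2.candidate = ?
          AND t1.vote != 'NV'
          AND t2.vote != 'NV'
    """, (first, second)).fetchone()


opportunities, same = agreement(conn, "Sen A", "Sen B")
toy_pct = round(100 * same / opportunities, 1)
print(opportunities, same, toy_pct)

# The article's counts: 60 identical votes in 189 shared opportunities.
mccain_obama_pct = round(100 * 60 / 189, 1)
print(mccain_obama_pct)  # 31.7
```

Passing the names as query parameters makes it easy to rerun the comparison for any pair without editing the SQL.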
Results
- Clinton and Obama cast the same vote 92% of the time.
- Clinton and McCain cast the same vote 47% of the time.
- McCain and Obama cast the same vote 31% of the time.
- The three candidates were all present for the same roll call 191 times. 29% of the time they all voted the same way — either all for or all against the issue at hand.
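The three-way figure needs one more copy of the table in the self-join. The article does not show that query, so the following is only a sketch of how it might look, assuming SQLite via Python's standard library, the `candidate` column from the table definition, and invented placeholder rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE votes (candidate TEXT NOT NULL, issue TEXT NOT NULL, vote TEXT NOT NULL)"
)

# Invented placeholder rows, NOT real voting records.
# Each tuple: issue, then the votes of Sen A, Sen B, Sen C.
toy_ballots = [
    ("Bill 1", "Y",  "Y", "Y"),   # all present, unanimous
    ("Bill 2", "Y",  "N", "Y"),   # all present, split
    ("Bill 3", "NV", "Y", "Y"),   # Sen A absent, so it is skipped
    ("Bill 4", "N",  "N", "N"),   # all present, unanimous
    ("Bill 5", "Y",  "Y", "N"),   # all present, split
]
for issue, a, b, c in toy_ballots:
    conn.executemany(
        "INSERT INTO votes VALUES (?, ?, ?)",
        [("Sen A", issue, a), ("Sen B", issue, b), ("Sen C", issue, c)],
    )

# Join the table to itself twice: one alias per candidate.
all_present, unanimous = conn.execute("""
    SELECT
        count(*),
        sum(CASE WHEN t1.vote = t2.vote AND t2.vote = t3.vote THEN 1 ELSE 0 END)
    FROM votes t1
    JOIN votes t2 ON t2.issue = t1.issue
    JOIN votes t3 ON t3.issue = t1.issue
    WHERE t1.candidate = 'Sen A'
      AND t2.candidate = 'Sen B'
      AND t3.candidate = 'Sen C'
      AND t1.vote != 'NV'
      AND t2.vote != 'NV'
      AND t3.vote != 'NV'
""").fetchone()

unanimous_pct = round(100 * unanimous / all_present, 1)
print(all_present, unanimous, unanimous_pct)
```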
(This part is edited from the original post after Jess Turcotte pointed out the errors in my conclusion. Originally, I thought it was Clinton's record that changed. I was wrong.)
One question that stood out: how could Clinton and Obama be nearly identical (92% the same) while differing by 16 percentage points when each is compared to McCain (47% versus 31%)? Since Clinton and McCain were both in office four years before Obama, they must have been more similar when it was just the two of them. One of them must have drastically changed their voting pattern around 2005.
John McCain did.
- Jan 2001–Jan 2005, Clinton and McCain were 70% similar. They cast the same vote 76 times in 109 opportunities.
- After 2005, Clinton and McCain are only 34% similar, casting the same vote 65 times in 190 opportunities.
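As a sanity check, the two periods should add back up to the overall Clinton and McCain figure from the results. The short calculation below uses only the counts quoted in the two bullets; the variable names are mine.

```python
# Counts quoted in the article: (same votes, shared opportunities).
before_2005 = (76, 109)
after_2005 = (65, 190)


def percent(same, opportunities):
    """Share of shared roll calls where both senators voted the same way."""
    return round(100 * same / opportunities, 1)


before_pct = percent(*before_2005)
after_pct = percent(*after_2005)

# Combine both periods to recover the overall Clinton/McCain similarity.
total_same = before_2005[0] + after_2005[0]
total_opportunities = before_2005[1] + after_2005[1]
overall_pct = percent(total_same, total_opportunities)

print(before_pct, after_pct, overall_pct)
```

The first period works out to about 70%, the second to about 34%, and the combined 141 matches in 299 opportunities give roughly 47%, in line with the overall result.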
Comparing each candidate to a control subject makes it pretty obvious that Hillary was consistent from 2001–2008, and John McCain changed notably after 2005. Before 2005, McCain voted in line with either Clinton or Ted Kennedy (my control subject) 60%–70% of the time. After 2005, he voted similarly with those two Democrats only 32% of the time.
What made McCain’s voting record change so drastically?
Conclusion
The results suggest that the Democratic candidate will be at least 50% different from the Republican candidate this coming November. To paraphrase Ralph Nader circa 2000, a vote for Obama or Clinton will not be a vote for McCain…unless McCain reverts to his pre-2005 voting patterns.
For more political detail and less technical detail, view the full report.
7 Comments
RIP Mitt.
Hillary is consistent, I'll give her that, but that's very much the part that scares me - she HAS the ability if put in power to get things done along her agenda, and to be honest - her agenda is socialism - something I'm not a fan of. What we need, is an Ayn Rand like candidate - someone who stands for individual responsibility - someone who believes that the best type of government is the one which governs the least. Unfortunately, it appears that such a view is simply not fashionable any longer - it's easier to be a looter...a leech on the remaining few who try to do the right thing...and then get raped in taxes... I'm all for helping people who really need it - but those who need it need to want to help themselves, and I'm afraid that's more rare than ever these days.
As usual - I'm way off topic...my bad.