Tuesday, November 20, 2012

Voter Fraud in Montana's Senate race? Not so much.

In a recent article published in Explore Big Sky, Gallatin County Vice Chair Tammy Hall made the claim that Tester may have won because of vote fraud. Here’s the part of the article that's relevant:

"Tammy Hall, first vice president for Gallatin County Republican Women, felt the local Republican effort to get out the vote was strong, but said voter fraud in the county was a factor in the outcomes.
“When you’ve got 8,000 people registering to vote on the same day, you’ve lost control,” said Hall, also one of 22 delegates representing Montana at the Republican National Convention in August."

As a political scientist, I find nothing more annoying than someone who lobs claims of voter fraud into the body politic without a shred of empirical evidence to back them up. Trust is vital in republican government, and without it, our elected officials cannot hope to govern. Part of the reason Congress cannot function is the historically low level of trust voters have in the institution. Claiming voter fraud when your candidate or party doesn’t win, instead of engaging in a profound retrospection based upon real data, does not help matters. And it certainly doesn’t help your party or candidate plan for the next election if you just put your head in the sand and scream that the other side cheated.

Voter fraud is serious business, but it can be detected using some rather simple statistical tests. Walter Mebane, a political methodologist at the University of Michigan, has written extensively on the matter. Go to his website and see his research. In a chapter in the book Election Fraud (edited by Mike Alvarez, Thad Hall, and Susan Hyde), Mebane proposes the use of a simple statistical test, called the Second Digit Benford’s Law Test for Vote Counts, to detect the likelihood of fraud. Simply put, the digits in a group of numbers are likely to have a particular distribution. Mebane explains:

“Benford’s Law states that in a list of statistical data, such as vote tallies from different precincts, the digits of the numbers that make up those data points follow a specific distribution. In each significant digit position, smaller numerals appear more often than larger numerals. In a set of data that follow Benford’s Law, the first significant digit is the number 1 roughly 30 percent of the time. There is what I call a second-digit Benford’s Law (2BL) distribution when the first digits have no particular pattern but the second digits of the data points do follow the pattern given by Benford’s Law. In this case the second significant digit is 0 (zero) about 12 percent of the time” (Mebane 2008).

Very simply put, if the distribution of the second digit in precinct-level vote returns deviates from Benford’s Law, we have possible evidence of voter fraud. It is not proof of fraud, but it raises the possibility that monkey business may be at work.
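For readers who want to see where the "about 12 percent" figure comes from, here is a minimal Python sketch of the standard second-digit Benford probabilities. It is an illustration only: it is not taken from Mebane's chapter or from the Stata routine, and the function name is my own.

```python
import math

def second_digit_benford_probs():
    """Return the expected share of each second digit (0-9) under Benford's Law.

    Uses the standard formula
        P(second digit = d) = sum over k = 1..9 of log10(1 + 1 / (10*k + d)),
    where k ranges over the possible first digits.
    """
    return [
        sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    ]

if __name__ == "__main__":
    for digit, prob in enumerate(second_digit_benford_probs()):
        print(f"second digit {digit}: {prob:.4f}")
```

The share for a second digit of 0 comes out at roughly 12 percent, in line with the passage quoted above, and the shares decline slowly as the digit gets larger.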

Because I’m not a political methodologist and not as smart as Mebane, I searched the web for a routine that would let me examine precinct-level data in Stata, a statistical software package that political scientists, economists, and sociologists use. I found such a routine, called Digdis. I installed the routine, downloaded precinct vote totals for the Senate race from the Montana Secretary of State’s website, and ran a quick analysis on the second significant digit of the precinct totals. Here’s a screen capture of the Stata output:

The data show the number of 1-9 digits in the second significant position, the percent expected value in the distribution according to Benford’s Law, the variation from that expected percentage, and the statistical significance of said variation. In no instance does the difference between the actual value and the expected value rise to conventional levels of statistical significance (p<.05).
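The Stata output reports each digit separately. A related way to summarize the same comparison, and as I understand it the form Mebane's 2BL test takes, is a single Pearson chi-square statistic across all ten second digits. The Python sketch below is my own illustration under that assumption; it is not the Digdis routine, the function and constant names are hypothetical, and it has to be supplied with a list of precinct totals.

```python
import math
from collections import Counter

# Upper 5% critical value of the chi-square distribution with 9 degrees of freedom.
CHI2_CRIT_9DF_05 = 16.919

def second_digit_benford_probs():
    """Expected share of each second digit (0-9) under Benford's Law."""
    return [
        sum(math.log10(1 + 1 / (10 * k + d)) for k in range(1, 10))
        for d in range(10)
    ]

def second_digit(count):
    """Return the second significant digit of a vote count, or None if it has only one digit."""
    text = str(abs(int(count)))
    return int(text[1]) if len(text) >= 2 else None

def chi_square_2bl(counts):
    """Pearson chi-square statistic comparing observed second digits with Benford's Law."""
    digits = [d for d in map(second_digit, counts) if d is not None]
    n = len(digits)
    if n == 0:
        raise ValueError("no counts with at least two digits")
    observed = Counter(digits)
    return sum(
        (observed.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in enumerate(second_digit_benford_probs())
    )
```

A statistic above 16.919 would signal a departure from the expected distribution at the p<.05 level; a value below it would not. Precinct totals under 10 have no second digit and are skipped.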

In other words, across the 794 voting precincts, the vote totals reported in the Senate race show NO STATISTICALLY DISCERNIBLE EVIDENCE OF VOTE FRAUD.

Before screaming about fraud, it might make some sense to learn a little about the likelihood of it and build a solid empirical case based upon quantitative evidence to demonstrate some proof of those claims. To paraphrase Dan Rather, voter fraud in the Senate race? “That dog just don’t hunt”.


Ryan Brady said...

Unfortunately, this analysis does not support your conclusion that vote fraud played no role.

Benford's Law can only detect irregularities in digit distribution for data sets with certain qualities. Those data sets must be governed by at least one rule in which numbers are assigned. For example, in a list of prices for goods sold, Benford's Law would detect the irregularity of the recurring "9" at the end of most prices. Take humans assigning numbers as another example: people subconsciously repeat certain digits, which undermines randomness and allows Benford's Law to detect tendencies.

Benford's Law cannot, however, detect irregularities in data sets comprised of numbers produced from mathematical combinations--because there are no irregularities.

Your statistical analysis shows no irregularities in the last digit of the precinct totals. But this implies that someone would have been picking numbers for precinct totals. In that case, yes, your method would detect the irregularities due to humans' inability to model randomness when picking numbers.

But not all vote fraud has to be the product of the human mind. To limit your analysis only to this, and then to conclude no vote fraud took place, is simply illogical. Take, for example, a formula where n = reported precinct vote total for candidate x, and r = the actual precinct vote total for candidate x. Now let's say n = r + 1. In this formula, candidate x's vote total would be fraudulently increased by 1 at every precinct. If every final digit is skewed up by 1, Benford's Law will have absolutely nothing to say on whether the actual precinct vote total was altered, because it will still detect the same variation of digits in the last placeholder.

More realistically, a computerized vote fraud model might flip every 20th vote for candidate x to candidate y. If that's the case, the data set will still be comprised of numbers produced purely from mathematical combination, and no irregularities will be detectable.

You have only proven a human did not pick numbers out of the air. You have not proven vote fraud didn't happen. Granted, I'd like to think Ms. Hall is mistaken, anyway(!), but there's nothing in your analysis to affirmatively show this.

David Parker said...


Excellent post. Yes, the 2BL test would be unable to detect all fraud (particularly fraud linked to computer software). Even a positive test does not prove fraud--only that fraud may exist. The results I report do suggest, however, that the type of fraud Ms. Hall was suggesting is likely not present. Just for kicks, I ran the analysis separately on precincts won by the Democrats and precincts won by the Republicans. Again, the results did not detect any statistically significant deviation from the expected distribution of significant digits.


Ryan Brady said...

The guy that authored that study is a retired NSA scientist.