Monday, October 21, 2013

Yikes! If you can't trust anonymous people on the internet, who can you trust?

This Seattle PI article, published in September, makes the argument that roughly 20% of all Yelp.com reviews are either fake or "astroturfed" (artificially created to appear natural in origin). It cites a 400% increase in the proportion of fake reviews over just a six-year span.

It references a recent Harvard Business School publication that systematically breaks down the reviews, analyzing the incidence of both high and low reviews of restaurants, services, and other establishments reviewed on Yelp.com. It is understandable that this market for positive reviews has become so widespread: a Harvard Magazine article from 2011 found that even "a one-star increase in Yelp rating leads to a 5-9 percent increase in revenue."

The analysis in the business school publication corrects for many sources of error, such as erroneous values and Yelp.com's own review-filtering scheme; however, one in five reviews being fake still seems extremely high. To test this figure, I could run a quick check of my own, a one-proportion z-test, using the following equation:
z = (p̂ - p) / √(pq / n)
where:
p̂ is my sample proportion
p is the population proportion (0.20, per the article's claim)
q = 1 - p
n is my sample size

From a sample of, say, 100 reviews, what is the probability of finding only 15% of the sample reviews to be fake? Putting the values into the equation, I find:
z = (0.15 - 0.20) / √((0.20)(0.80) / 100) = -0.05 / 0.04 = -1.25

which gives a z value of -1.25. From the standard normal table, the area between the mean and z = -1.25 is .3944; since -1.25 sits in the lower tail of the distribution, the tail probability is .5000 - .3944 = .1056.
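
If you'd rather not dig out a z-table, here is a minimal Python sketch (my own check, not part of the original article or the HBS analysis) that reproduces the same z statistic and lower-tail probability. The values of p, p̂, and n are the hypothetical ones used above.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical values from the discussion above
p = 0.20      # population proportion of fake reviews claimed by the article
p_hat = 0.15  # proportion of fake reviews found in my hypothetical sample
n = 100       # sample size
q = 1 - p

# One-proportion z statistic: z = (p_hat - p) / sqrt(p*q / n)
z = (p_hat - p) / sqrt(p * q / n)

# Probability of observing a sample proportion this low or lower
# (lower tail of the standard normal distribution)
tail_prob = NormalDist().cdf(z)

print(f"z = {z:.2f}")                         # -1.25
print(f"P(p_hat <= 0.15) = {tail_prob:.4f}")  # about 0.1056
```

NormalDist ships with Python's standard library (3.8+), so no extra packages are needed to verify the arithmetic.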


This means I would have only about a 10% chance of finding just 15 fake reviews in my sample of 100 if the data from the original article are correct. Now to start cold calling random, anonymous people from the internet in the hopes they tell me the truth...
