A paper by CIS PhD student David W. Vinson and his advisor, Rick Dale, was selected by Yelp’s engineers as a recipient of Yelp’s Academic Dataset grand prize. As the graduate first author, David will receive a monetary award, and the paper is being featured on Yelp’s website. The award competition invites researchers to download hundreds of thousands of reviews of businesses in Phoenix, AZ, and do something interesting with the data.

David and his advisor applied ideas from information theory, which are becoming prominent in psycholinguistics, to reveal patterns in the Yelp dataset. They found that reviews expressing more extreme opinions, whether more negative or more positive, tended to encode more information. “Information” here refers, in general terms, to the idea that reviewers tended to use less likely words in less likely combinations when they wanted to express a more extreme opinion about a business. Vinson and Dale’s results suggest that ideas from psycholinguistics may relate in interesting ways to large datasets gleaned from online sources, such as Yelp’s Phoenix dataset.
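For readers curious how “less likely words” can be turned into a number, here is a minimal sketch. It is not the method used in the paper. It assumes a simple unigram model with add-one smoothing, in which each word carries -log2 p(word) bits of information (its surprisal) and a review is scored by the average over its words. The function names and the toy corpus are hypothetical, made up for illustration.

```python
import math
from collections import Counter


def unigram_model(corpus_tokens):
    """Return a function giving add-one-smoothed unigram probabilities."""
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot stands in for unseen words

    def prob(word):
        # Counter returns 0 for words it has never seen.
        return (counts[word] + 1) / (total + vocab_size)

    return prob


def mean_surprisal(review_tokens, prob):
    """Average information per word, in bits: the mean of -log2 p(word)."""
    if not review_tokens:
        raise ValueError("review has no tokens")
    return sum(-math.log2(prob(w)) for w in review_tokens) / len(review_tokens)


# Toy corpus, invented for illustration (not Yelp data).
corpus = "the food was good the service was good the place was fine".split()
prob = unigram_model(corpus)

common = "the food was good".split()
rare = "the food was abysmal".split()

print(mean_surprisal(common, prob))
print(mean_surprisal(rare, prob))  # higher: "abysmal" is less likely than "good"
```

A unigram model only captures how unlikely individual words are. Capturing “less likely combinations” would need a model that conditions each word on its neighbors, such as a bigram model, which this sketch leaves out.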
Click here to learn about Yelp’s Dataset Challenge.
Click here for Vinson & Dale’s submission.