Big Data Much Better Than People at Determining Keywords


“Humans, I’m going to make the case, are horrible at picking keywords,” claimed Gary King, the director of Harvard’s Institute for Quantitative Social Science, during a talk last year on Capitol Hill. “… I’m going to convince you that you are horrible at doing Google searches.”

King was in Washington, D.C., to explain to a select audience of policymakers and fellow academics “the big deal about big data.” As part of that talk, hosted by SAGE Publishing with co-sponsors the American Political Science Association and the American Statistical Association, King gave a series of examples demonstrating the utility of big data in conducting social science and the necessity of solid, innovative data analysis to make sense of that data.

In this example, he detailed work that asked a couple dozen Harvard students to plow through Twitter posts containing the word “Boston.” Their task was to identify the keywords that would appear in tweets about the Boston Marathon bombing – and not in tweets about anything else. The students came up with 149 unique keywords – which they then failed to recall with astounding regularity. Humans, King explains, are generally bad at recall but generally good at recognition. That creates problems in many arenas, such as the analysis of vast amounts of text.

So he and his team developed technology they call Thresher, which conducts automated text analysis. As a forthcoming paper by King, Patrick Lam, and Margaret Roberts in the American Journal of Political Science explains, “We develop a computer-assisted (as opposed to fully automated) statistical approach that suggests keywords from available text without needing structured data as inputs.” The benefit: “This framing poses the statistical problem in a new way, which leads to a widely applicable algorithm.” In short, a win for computational social science.
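To make the idea of computer-assisted keyword suggestion concrete, here is a minimal sketch in Python. It is not Thresher and not the algorithm from the paper. It is a deliberately simple stand-in that ranks words by how much more often they appear in a set of on-topic posts than in a set of off-topic posts, using a smoothed log-ratio of document frequencies. The function names, the scoring rule, and the toy posts are all assumptions invented for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def score_keywords(reference_docs, background_docs, smoothing=1.0):
    """Score each word seen in reference_docs (hypothetical helper).

    The score is the log of the smoothed share of reference documents
    containing the word divided by the smoothed share of background
    documents containing it. Higher means more distinctive of the
    reference set. This scoring rule is an illustrative assumption.
    """
    ref_counts = Counter()
    for doc in reference_docs:
        ref_counts.update(set(tokenize(doc)))  # count each word once per doc
    bg_counts = Counter()
    for doc in background_docs:
        bg_counts.update(set(tokenize(doc)))

    n_ref = len(reference_docs)
    n_bg = len(background_docs)
    scores = {}
    for word, count in ref_counts.items():
        p_ref = (count + smoothing) / (n_ref + 2 * smoothing)
        p_bg = (bg_counts[word] + smoothing) / (n_bg + 2 * smoothing)
        scores[word] = math.log(p_ref / p_bg)
    return scores


def suggest_keywords(reference_docs, background_docs, top_n=5):
    """Return the top_n candidate keywords for a human to review."""
    scores = score_keywords(reference_docs, background_docs)
    # Sort by descending score, breaking ties alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:top_n]]


# Toy posts, invented for this example (not real tweets).
on_topic = [
    "boston marathon bombing suspect",
    "marathon finish line explosion in boston",
    "praying for boston after the marathon bombing",
]
off_topic = [
    "boston red sox win tonight",
    "great coffee in boston today",
    "boston traffic is terrible",
]

suggestions = suggest_keywords(on_topic, off_topic, top_n=3)
print(suggestions)
```

In this sketch the machine only proposes candidates, and a person accepts or rejects each one. That division of labor leans on the point King makes: people are poor at recalling keywords unprompted but good at recognizing relevant ones when shown them. Note that "boston" scores zero here because it appears in every post in both sets, so it does nothing to separate the two.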

You can see this seventh video from his talk below.

Videos in the series

  1. Ziyad Marar On The Opportunities That Big Data Provides Social Scientists
  2. What Makes Big Data Valuable?
  3. Examples: Exciting Data That Is Useless Without Analytics
  4. Example: Social Scientists Determine Cause Of Death At A Distance
  5. Example: Analysis Rids Social Security Forecasts Of Bias
  6. Example: New Data Methods Combat Gerrymandering
  7. Example: Big Data Much Better Than People At Determining Keywords
  8. Example: Watching Chinese Citizens Get Around Censorship
  9. Example: Learning The Real Reason For Chinese Censorship
  10. The Spectacular Success Of Quantitative Social Science


