One of political scientist Gary King’s interests is the study of censorship in China. This involves several moving parts, including how the central government can keep tabs on a population well over a billion strong, and how that same population can overcome the scrupulous attentions of the country’s censors.
Dealing with China-scale research is in a sense the beau ideal of big data, and a great proof of concept for any scholarship (read ‘data analysis’) that tries to make sense of this data. In this video, King details some of the big data-oriented challenges and successes of studying censorship in China using a bespoke algorithm developed at the King-led Institute for Quantitative Social Science at Harvard.
As the website for Thresher Ventures, the private company that King spun out of the research, explains: “Thresher helps expert analysts find what they are looking for in unstructured text even when authors are being creative, trying to shape the dialogue, or outright trying to hide. Thresher works in any language and does not require search logs or ontologies.”
As King explains in this portion of “The big deal about big data,” a talk he gave on Capitol Hill last year to policymakers and fellow academics, there are a number of ways that Chinese social media users get creative to air their true feelings. If you use the word ‘freedom’ on some social media websites in China, it will be filtered out. So you substitute a character that’s similar in appearance to the banned one, a homograph. Hence we see ‘eye field’ being used instead of the characters for ‘freedom.’ Or sly social commentators will use a homophone, which means written criticism of the official ‘harmonious society’ policy is replaced by potshots at a ‘river crab,’ since both are pronounced ‘hexie.’
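To make the problem concrete, here is a toy Python sketch of why a fixed keyword list misses these substitutions. It is not Thresher's algorithm or any real filter; the keyword set, the substitution table and the function names are all assumptions made up for illustration. The four Chinese terms are the standard written forms of the examples above.

```python
# Illustrative only: a hypothetical keyword list an analyst might start with.
KEYWORDS = {"自由", "和谐"}  # "freedom", "harmonious"

# Hypothetical table mapping known evasions back to the term they replace.
SUBSTITUTIONS = {
    "目田": "自由",  # homograph: "eye field" looks like "freedom"
    "河蟹": "和谐",  # homophone: "river crab" sounds like "harmonious" (hexie)
}


def naive_match(post: str) -> bool:
    """Return True if the post contains any keyword exactly as listed."""
    return any(term in post for term in KEYWORDS)


def normalize(post: str) -> str:
    """Rewrite known substitutions back to the terms they stand in for."""
    for variant, original in SUBSTITUTIONS.items():
        post = post.replace(variant, original)
    return post


def match_with_variants(post: str) -> bool:
    """Match keywords after undoing the known substitutions."""
    return naive_match(normalize(post))


if __name__ == "__main__":
    for post in ["我们要自由", "我们要目田", "河蟹社会"]:
        print(post, naive_match(post), match_with_variants(post))
```

The catch is that a table like this only covers substitutions someone has already spotted, and writers keep inventing new ones.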
King’s goal wasn’t to offer those assembled a primer on how to fly under the radar of Chinese censors, but rather to show the difficulty of ploughing through this raw data looking for trends. In the next video in this series, he’ll explain some of the scholarly payoffs from this work.
King’s talk was hosted by SAGE Publishing with co-sponsors the American Political Science Association and the American Statistical Association. His goal was to give a series of examples demonstrating the utility of big data in conducting social science and the necessity of solid and innovative data analysis to make sense of anything.
You can see this eighth video from his talk below.
Videos in the series:
- Ziyad Marar On The Opportunities That Big Data Provides Social Scientists
- What Makes Big Data Valuable?
- Examples: Exciting Data That Is Useless Without Analytics
- Example: Social Scientists Determine Cause Of Death At A Distance
- Example: Analysis Rids Social Security Forecasts Of Bias
- Example: New Data Methods Combat Gerrymandering
- Example: Big Data Much Better Than People at Determining Keywords
- Example: Watching Chinese Citizens Get Around Censorship
- Example: Learning The Real Reason For Chinese Censorship
- The Spectacular Success Of Quantitative Social Science