In the long term, the implications of this will probably be as profound as the invention of statistics was in the late 17th century. The rise of "big data" provides far greater opportunities for quantitative analysis than any amount of polling or statistical modelling. But it is not just the quantity of data that is different. It represents an entirely different type of knowledge, accompanied by a new mode of expertise.
First, there is no fixed scale of analysis (such as the nation) nor any settled categories (such as "unemployed"). These vast new data sets can be mined in search of patterns, trends, correlations and emergent moods. It becomes a way of tracking the identities that people bestow upon themselves (such as "ImwithCorbyn" or "entrepreneur") rather than imposing classifications upon them. This is a form of aggregation suitable to a more fluid political age, in which not everything can be reliably referred back to some Enlightenment ideal of the nation state as guardian of the public interest.
Second, the majority of us are entirely oblivious to what all this data says about us, either individually or collectively. There is no equivalent of an Office for National Statistics for commercially collected big data. We live in an age in which our feelings, identities and affiliations can be tracked and analysed with unprecedented speed and sensitivity – but there is nothing that anchors this new capacity in the public interest or public debate. There are data analysts who work for Google and Facebook, but they are not "experts" of the sort who generate statistics and who are now so widely condemned. The anonymity and secrecy of the new analysts potentially make them far more politically powerful than any social scientist.
A company such as Facebook has the capacity to carry out quantitative social science on hundreds of millions of people, at very low cost. But it has very little incentive to reveal the results. In 2014, when Facebook researchers published results of a study of "emotional contagion" that they had carried out on their users – in which they altered news feeds to see how it affected the content that users then shared in response – there was an outcry that people were being unwittingly experimented on. So, from Facebook's point of view, why go to all the hassle of publishing? Why not just do the study and keep quiet?