Some people think hospitals are the most dangerous places on earth because more people die inside them than anywhere else.
有些人认为医院是世界上最危险的地方,医院里死亡的人比其他地方都多。
When people say things like that, they are technically right, but they may have been fooled by what's known as selection bias.
当人们这样说时,严格来说他们是对的,但他们可能被所谓的选择偏差欺骗了。
Selection bias occurs when we look at information that is not fully representative of the data we intend to study.
当我们看到的信息不能完全代表要研究的数据时,就会出现选择偏差。
As a result of the biased sample, we then draw a false conclusion. One story that explains the bias perhaps better than any other is that of Abraham Wald.
由于样本有偏差,我们得出了错误的结论。亚伯拉罕·瓦尔德的故事可能比其他任何故事都更能解释这种偏差。
Wald was a brilliant mathematician, who, after the Nazis persecuted him and his family as Jews in Austria, fled to the United States in 1938.
瓦尔德是一位才华横溢的数学家,他和家人因身为犹太人在奥地利遭到纳粹迫害,之后于1938年逃往美国。
During World War II, Wald was invited to become a member of the Statistical Research Group, an elite think tank to aid the American war effort against Nazi Germany.
二战期间,瓦尔德被邀请成为统计研究小组的成员,这是一个精英智囊团,旨在协助美国对抗纳粹德国。
One day the US Air Force came to Wald and his colleagues with a problem. Many of their planes got shot down due to a lack of armor.
有一天,美国空军向瓦尔德及其同事提出了一个问题。他们的许多飞机由于缺乏装甲而被击落。
The officers presented Wald with data for all the aircraft that made it back from their missions.
军官们向瓦尔德展示了所有完成任务后成功返航的飞机的数据。
The planes had lots of holes in the body and wings but fewer below the engines.
飞机的机身和机翼上有很多弹孔,但发动机下方的弹孔较少。
The officers then asked the mathematicians to compute the optimal protection by concentrating the armor where the planes were getting hit the most.
然后,军官们要求数学家计算出最佳防护,方法是将装甲集中在飞机被击中最多的地方。
After studying the problem, Wald suggested something unexpected. The armor, he said, doesn't go where the bullet holes are. It goes where the bullet holes aren't.
在研究了这个问题之后,瓦尔德提出了一个意想不到的建议。他说,装甲不应该加在有弹孔的地方,而应该加在没有弹孔的地方。
The officers didn't understand, because they were looking at a biased sample. Wald wasn't.
军官们不明白,因为他们看到的是一个有偏差的样本。瓦尔德则不然。
He realized that to get representative data to analyze, he needed to include the missing holes, the missing planes, the missing information.
他意识到,要获得有代表性的数据进行分析,他需要包括缺失的弹孔、缺失的飞机和缺失的信息。
The reason planes were coming back with fewer hits to the engine is that planes that got hit in the engine weren't coming back, he explained.
他解释说,飞机回来时发动机被击中较少的原因是,发动机被击中的飞机没有回来。
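Wald's reasoning can be checked with a small simulation. The sketch below is only an illustration: the section names, hit shares, and loss probabilities are made-up assumptions, not historical data. It compares where hits land on all planes with where hits are visible on the planes that return.

```python
import random

# Hypothetical parameters, chosen only for illustration (not historical data).
SECTIONS = ["body", "wings", "engine"]
HIT_WEIGHTS = [0.5, 0.3, 0.2]  # assumed share of hits landing on each section
LOSS_PROB = {"body": 0.05, "wings": 0.05, "engine": 0.60}  # assumed chance that one hit downs the plane


def simulate(n_planes=10_000, hits_per_plane=3, seed=0):
    """Count hits per section on all planes and on returning planes only."""
    rng = random.Random(seed)
    all_hits = {s: 0 for s in SECTIONS}
    returned_hits = {s: 0 for s in SECTIONS}
    for _ in range(n_planes):
        hits = rng.choices(SECTIONS, weights=HIT_WEIGHTS, k=hits_per_plane)
        # The plane returns only if it survives every single hit.
        survived = all(rng.random() > LOSS_PROB[s] for s in hits)
        for s in hits:
            all_hits[s] += 1
            if survived:
                returned_hits[s] += 1
    return all_hits, returned_hits


def shares(counts):
    """Convert raw counts into fractions of the total."""
    total = sum(counts.values())
    return {s: counts[s] / total for s in counts}


all_hits, returned_hits = simulate()
true_share = shares(all_hits)
observed_share = shares(returned_hits)
for s in SECTIONS:
    print(f"{s:7s} all planes: {true_share[s]:.2f}   returned only: {observed_share[s]:.2f}")
```

Under these assumptions, engine hits make up a much smaller share of the holes on returning planes than of the hits on all planes, because most planes hit in the engine never come back to be counted.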
But selection bias isn't just the result of missing information. Simpson's paradox is a phenomenon in which a trend appears in several groups of data but disappears or reverses when the groups are combined.
但选择偏差不仅仅是信息缺失的结果。辛普森悖论是一种现象:某种趋势出现在数据的各个分组中,但当这些分组合并在一起时,趋势却消失甚至逆转。
It shows the importance of really understanding the data we select for analysis.
这表明了,真正理解我们选择用于分析的数据,有多么重要。
One famous example came from students applying to the University of California, Berkeley in 1973.
一个著名的例子来自1973年申请加州大学伯克利分校的学生。
The data showed that males applying were more likely to be accepted than females. People thought that the institution was discriminating against women.
数据显示,申请的男性比女性更容易被录取。人们认为该机构歧视女性。
When researchers dug deeper into the data, they found that men had tended to apply to less competitive departments with higher rates of admission.
当研究人员深入研究数据时,他们发现男性申请的部门竞争不那么激烈,录取率更高。
Women chose more competitive departments with fewer available spots. After correcting for this, the data showed a small but statistically significant bias in favor of women, not men.
女性选择了竞争更激烈、可用名额更少的院系。在校正这一因素后,数据显示存在小幅但在统计上显著的偏向,受益的是女性,而不是男性。
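The reversal is easier to see with numbers. The figures below are invented for illustration and are not the actual 1973 Berkeley data. They describe two hypothetical departments in which women are admitted at a higher rate than men in each department, yet at a lower rate overall.

```python
# Hypothetical numbers, NOT the real Berkeley figures: (applicants, admitted)
applications = {
    "easy department": {"men": (800, 480), "women": (100, 65)},
    "hard department": {"men": (200, 40), "women": (900, 225)},
}


def rate(applied, admitted):
    """Admission rate as a fraction."""
    return admitted / applied


def overall_rate(group):
    """Admission rate for one group after pooling all departments."""
    applied = sum(dept[group][0] for dept in applications.values())
    admitted = sum(dept[group][1] for dept in applications.values())
    return rate(applied, admitted)


for name, dept in applications.items():
    print(f"{name}: men {rate(*dept['men']):.0%}, women {rate(*dept['women']):.0%}")
print(f"combined: men {overall_rate('men'):.0%}, women {overall_rate('women'):.0%}")
# easy department: men 60%, women 65%
# hard department: men 20%, women 25%
# combined: men 52%, women 29%
```

Women do better in both departments, but because most of them applied to the hard department, their combined rate is pulled down toward that department's low admission rate.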
Planes that were shot in the engines were not analyzed. Women at Berkeley weren't discriminated against, but instead picked more competitive departments.
引擎被击中的飞机没有被分析。伯克利的女性没有受到歧视,而是选择了竞争更激烈的院系。
People who die in hospitals are often already sick when they are admitted. What are your thoughts? Is selection bias corrupting your decision-making?
在医院去世的人往往在入院时就已经生病了。你的想法是什么?选择偏差会破坏你的决策吗?
And if it's fooling you, how about the people behind the “research” you see being published in popular media?
如果它能欺骗你,那么你在大众媒体上看到的那些“研究”背后的人呢?
Did they really make sure they selected an unbiased, fully representative sample? Share your thoughts in the comments below!
他们真的确保选择了一个公正、完全具有代表性的样本吗?在下面的评论中分享你的想法!
And if you still don't quite understand it, here is a simple challenge to experience it firsthand!
如果你仍然不太明白,这里有一个简单的挑战,让你亲身体验它!
Go out of your house, knock on the doors of your 10 nearest neighbors, and ask those who open if they are afraid of strangers.
走出家门,敲附近10个邻居的门,问问那些开门的人是否害怕陌生人。
After you are done, report your findings in the comments below and explain to us: what can your research tell us about your community?
完成后,在下面的评论中报告你的发现并向我们解释:你的研究能告诉我们关于社区的什么信息?
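For anyone who would rather not knock on doors, here is a rough simulation of the challenge. Every number in it is an assumption made up for illustration: half of the neighbors are afraid of strangers, and those who are afraid rarely open the door. It uses many more than 10 doors so that the estimate is stable.

```python
import random


def knock_on_doors(n_neighbors=10_000, share_afraid=0.5,
                   p_open_if_afraid=0.1, p_open_if_not=0.8, seed=1):
    """Return the share of door-openers who say they are afraid, and how many opened.

    All parameters are hypothetical and exist only to illustrate the bias.
    """
    rng = random.Random(seed)
    answers = []
    for _ in range(n_neighbors):
        afraid = rng.random() < share_afraid
        p_open = p_open_if_afraid if afraid else p_open_if_not
        if rng.random() < p_open:
            # Only neighbors who open the door end up in the sample.
            answers.append(afraid)
    return sum(answers) / len(answers), len(answers)


observed_share, n_opened = knock_on_doors()
print(f"doors opened: {n_opened}")
print(f"afraid among those who opened: {observed_share:.0%} (assumed true share: 50%)")
```

Under these assumptions, the people who answer are mostly the ones who are not afraid, so the survey badly understates how common the fear is in the whole neighborhood.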
If you like how we explain complicated ideas in simple cartoon animation, you can support us! Visit patreon.com/sprouts!
如果你喜欢我们用简单的卡通动画解释复杂的想法,你可以支持我们!访问patreon.com/sprouts!
Just visit us, learn how it works and what's in it for you. We hope to see you there. And if you are a parent or educator, check out our website sproutsschools.com.
只需访问我们,了解它的工作原理以及它对你有什么好处。我们希望在那里见到你。如果你是家长或教育工作者,请访问我们的网站sproutsschools.com。
There you can find this and other video lessons, additional resources and classroom activities.
您可以在那里找到本课程和其他视频课程、额外资源和课堂活动。