Andrew Ng is hunched over his smartphone, in a pantomime of key-pecking, squinting, typo-ridden discomfort. “This is how we do it today,” he says.
吴恩达(Andrew Ng)驼着背低着头,略带夸张地在他的智能手机上比划着不停点击屏幕、眯着眼却仍然错字连篇的那种不自在的样子。“我们如今是这样做的,”他称。
“And this is how we should be doing it,” says the chief scientist for Baidu, China’s largest search engine. He sits back in his chair, speaking to no one in particular with his phone placed on the table. The one-finger typing agony of millions of smartphone users should one day become a thing of the past, he says. All it would take is the creation of a reasonably accurate, pocket-sized electronic version of a human brain.
“而我们应该这样做,”这位百度(Baidu)的首席科学家称。他靠在座位上,没有特定对象地说着话,手机放在桌子上。他说,数百万智能手机用户用一个手指敲字的痛苦有一天应该成为过去。而这只需要创造一种达到合理精确度、与口袋大小相当的电子版人类大脑。百度是中国最大的搜索引擎。
Mr Ng is an expert in deep learning, a branch of artificial intelligence that focuses on teaching computers how to talk, listen, read, and think like us. The area is fast becoming a priority for the world’s biggest technology companies, including Baidu as it tackles the era of the mobile internet.
吴恩达是深度学习(deep learning)领域的专家,该领域是人工智能的一个分支,专注于让计算机学习如何像我们一样听、说、读、思。由于该领域与移动互联网时代紧密相连,它正迅速成为包括百度在内的全球最大科技公司的优先发展领域。
“The whole world is switching to mobile devices but no one has created a usable interface to input into the devices,” he says. With the development of artificial intelligence, “soon you’ll be able to order food and just say ‘Can I have some food delivered to my house before I get home?’ out loud.”
“整个世界都在转向移动设备,但是还没人创造出向移动设备输入指令的有用接口,”他称。随着人工智能的发展,“很快你将可以在订购食物时只需要大声说一句‘能在我回家前送些食物到我家中吗?’”
“It won’t even feel like technology, it will just be in the background.”
“感觉上甚至都不像是科技,而就在后台里。”
In addition to better voice recognition, AI is being talked about for any number of uses from predicting advertising clicks to recognising faces.
除了更好的语音识别,从预测广告点击量到人脸识别技术的很多领域都在讨论使用人工智能。
Since joining Baidu last year, Mr Ng has been steadily working to implement this vision. A UK native with Chinese roots, he founded Google Brain, the US technology company’s deep learning project, in 2011 and led it until he joined the Chinese company last year. Poaching him was regarded as a coup in the technology world.
自从去年加入百度以来,吴恩达一直在为实现这个愿景而稳扎稳打。作为一名出生在英国的华人,他在2011年创建了“谷歌大脑”(Google Brain)——谷歌的深度学习项目,并且在去年加入百度前一直领导着该项目。百度撬走吴恩达被认为是科技界的一次政变。
He describes the advanced computers at Baidu’s Sunnyvale, California, lab as “rocket engines” whose software can be taught to mimic the functioning of the human mind. Their “fuel” is data, which he gets from Baidu’s trove of online video and audio output as he works to teach the electronic brain to listen and speak.
他把百度位于加州森尼韦尔(Sunnyvale)实验室中的先进计算机比作“火箭引擎”,计算机中的软件可以学习模拟人类思想的功能。在吴恩达教电子大脑听和说时,它们的“燃料”就是他从百度在线视频和音频输出资料库中得到的数据。
The company has an advantage in deep-learning algorithms for speech recognition in that most video and audio in China is accompanied by text: nearly all news clips, television shows and films are closed-captioned, and almost all are available to Baidu and Iqiyi, its video affiliate.
百度在语音识别深度学习算法方面具有优势,因为中国大多数视频和音频都伴有文本——几乎所有新闻剪辑、电视节目及电影都有详细的字幕,而百度及其视频子公司爱奇艺(Iqiyi)可以获得几乎所有此类内容。
While a typical academic project uses 2,000 hours of audio data to train voice recognition, says Mr Ng, the troves of data available to China’s version of Google mean he is able to use 100,000 hours.
吴恩达说,一个典型的学术项目会利用2000小时的音频数据来训练语音识别,但百度——中国版谷歌——拥有的庞大数据库意味着他可以利用10万小时。
He declines to specify just how much the extra 98,000 hours improves the accuracy of his project, but insists it is vital.
他拒绝详细说明额外9.8万小时在多大程度上提升了其项目的精确度,但坚称这至关重要。
“A lot of people underestimate the difference between 95 per cent and 99 per cent accuracy. It’s not an ‘incremental’ improvement of 4 per cent; it’s the difference between using it occasionally versus using it all the time,” he says.
“许多人低估了95%精确度与99%精确度之间的区别。这不是4%的“增量”提升;这是偶尔使用与始终使用之间的区别,”他说。
Thanks to the strides made in Chinese language voice recognition, a particular challenge because of the number of homonyms and the importance of context, Baidu will soon roll out Deepspeech, voice recognition software similar to Apple’s Siri.
由于在汉语语音识别方面取得了巨大进步(汉语中的大量同音异义词和语境的重要性使之极具挑战),百度即将推出Deepspeech——一款类似于苹果(Apple)的Siri的语音识别软件。
Other Chinese companies including Alibaba and Tencent are also making advances in AI, but thanks largely to Mr Ng’s reputation, Baidu is now judged by industry experts to be ahead of its domestic peers, ranking alongside US rivals Facebook, Google, and IBM.
包括阿里巴巴(Alibaba)、腾讯(Tencent)在内的其他中国企业在人工智能方面也取得了进步,但主要得益于吴恩达的声望,行业专家如今认为百度要领先于国内同行,可与美国竞争对手Facebook、谷歌和IBM比肩。
“Artificial intelligence is an oligopoly,” says Yang Jing, founder of AI Era, an association for the artificial intelligence industry in China. “It’s a game for the titans.”
“人工智能是寡头垄断行业,”中国人工智能行业协会新智元(AI Era)创始人杨静说,“这是一个巨头间的游戏。”
Baidu already saves Rmb17m ($2.7m) per day at its data centres by using deep-learning algorithms to predict hard drive malfunctions, and it is also using AI to optimise the use of advertisements and photos to improve clickthrough rates. It would not reveal how much it is spending on AI development overall.
百度通过在数据中心利用深度学习算法预测硬盘故障已经可以每天节省1700万元人民币(合270万美元),而且还利用人工智能优化广告和相片的使用来提升点击率。该公司并未透露在人工智能开发上共计投入多少资金。
But in spite of lofty long-term ambitions, translating deep learning into money-making projects is still largely on the horizon.
尽管雄心勃勃,但要将深度学习转变成赚钱的项目仍有很长一段路要走。
Mr Ng is undaunted. “There’s no question that [AI] is creating huge economic value; there’s no question that this will continue to create huge advances,” he says. “There is still a huge gap between the way machines learn and the way humans learn.”
吴恩达毫无畏惧。“毫无疑问,(人工智能)正在创造巨大的经济价值;毫无疑问,这将继续创造巨大的进步,”他说,“机器的学习方式与人类的学习方式之间仍存在巨大差距。”