Teaching Machines to Understand Us, Part 1: Introduction

Teaching Machines to Understand Us, by Tom Simonite. MIT Technology Review, Vol. 118, No. 5, 2015



A reincarnation of one of the oldest ideas in artificial intelligence could finally make it possible to truly converse with our computers. And Facebook has a chance to make it happen first.


The first time Yann LeCun revolutionized artificial intelligence, it was a false dawn. It was 1995, and for almost a decade, the young Frenchman had been dedicated to what many computer scientists considered a bad idea: that crudely mimicking certain features of the brain was the best way to bring about intelligent machines. But LeCun had shown that this approach could produce something strikingly smart and useful. Working at Bell Labs, he made software that roughly simulated neurons and learned to read handwritten text by looking at many different examples. Bell Labs’ corporate parent, AT&T, used it to sell the first machines capable of reading the handwriting on checks and written forms. To LeCun and a few fellow believers in artificial neural networks, it seemed to mark the beginning of an era in which machines could learn many other skills previously limited to humans. It wasn’t.


“This whole project kind of disappeared on the day of its biggest success,” says LeCun. On the same day he celebrated the launch of bank machines that could read thousands of checks per hour, AT&T announced it was splitting into three companies dedicated to different markets in communications and computing. LeCun became head of research at a slimmer AT&T and was directed to work on other things; in 2002 he would leave AT&T, soon to become a professor at New York University. Meanwhile, researchers elsewhere found that they could not apply LeCun’s breakthrough to other computing problems. The brain-inspired approach to AI went back to being a fringe interest.



LeCun, now a stocky 55-year-old with a ready smile and a sideways sweep of dark hair touched with gray, never stopped pursuing that fringe interest. And remarkably, the rest of the world has come around. The ideas that he and a few others nurtured in the face of over two decades of apathy and sometimes outright rejection have in the past few years produced striking results in areas like face and speech recognition. Deep learning, as the field is now known, has become a new battleground between Google and other leading technology companies that are racing to use it in consumer services. One such company is Facebook, which hired LeCun from NYU in December 2013 and put him in charge of a new artificial-intelligence research group, FAIR, that today has 50 researchers but will grow to 100. LeCun’s lab is Facebook’s first significant investment in fundamental research, and it could be crucial to the company’s attempts to become more than just a virtual social venue. It might also reshape our expectations of what machines can do.


Deep Learning’s Leaders

Geoff Hinton
Working in: Google & University of Toronto
Did his PhD on artificial neural networks in the 1970s. Showed how to train larger, “deep” neural networks on large data sets in the 2000s, and proved their power for speech and image recognition.

Yann LeCun
Got interested in neural networks as an undergraduate, and later pioneered the use of deep learning for image recognition. Now leads a group at Facebook trying to create software that understands text and can hold conversations.

Yoshua Bengio
Working in: IBM & University of Montreal
Started working on artificial neural networks after meeting LeCun at Bell Labs in the 1980s. Was one of the first to apply the technique to understanding words and language. Now working with IBM to improve its Watson software.

Andrew Ng
Led a project at Google that worked out how neural networks could be trained on millions of pieces of data, allowing greater accuracy. Now oversees research at Baidu, which is working on improved speech recognition.

Demis Hassabis
Worked on AI in the games industry, then researched neuroscience to get ideas about building intelligence. He founded DeepMind, which Google bought last year and runs as a quasi-independent unit.










Facebook and other companies, including Google, IBM, and Microsoft, have moved quickly to get into this area in the past few years because deep learning is far better than previous AI techniques at getting computers to pick up skills that challenge machines, like understanding photos. Those more established techniques require human experts to laboriously program certain abilities, such as how to detect lines and corners in images. Deep-learning software figures out how to make sense of data for itself, without any such programming. Some systems can now recognize images or faces about as accurately as humans.
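To make the contrast concrete, here is a minimal sketch of what "laboriously programming" an ability can look like. It is an illustration only, not code from any system mentioned in the article: the function name and the toy image are hypothetical, and the filter is the standard Sobel kernel, whose weights a person picks by hand. A deep-learning system would instead learn comparable filter weights from example images.

```python
# Hand-coded feature: a vertical-edge detector.
# The kernel weights are fixed by a person, not learned from data.
SOBEL_X = [
    [-1, 0, 1],
    [-2, 0, 2],
    [-1, 0, 1],
]


def vertical_edge_strength(image):
    """Return absolute horizontal-gradient responses for the interior pixels."""
    height, width = len(image), len(image[0])
    out = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            total = 0
            # Weighted sum over the 3x3 neighbourhood centred on (r, c).
            for dr in range(3):
                for dc in range(3):
                    total += SOBEL_X[dr][dc] * image[r + dr - 1][c + dc - 1]
            row.append(abs(total))
        out.append(row)
    return out


# Toy 4x4 grayscale image (hypothetical): dark left half, bright right half.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
edges = vertical_edge_strength(image)
flat = vertical_edge_strength([[5] * 4 for _ in range(4)])
```

The detector responds strongly where dark meets bright and gives zero on a uniform image, but it only ever finds what its author thought to encode.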


Now LeCun is aiming for something much more powerful. He wants to deliver software with the language skills and common sense needed for basic conversation. Instead of having to communicate with machines by clicking buttons or entering carefully chosen search terms, we could just tell them what we want as if we were talking to another person. “Our relationship with the digital world will completely change due to intelligent agents you can interact with,” he predicts. He thinks deep learning can produce software that understands our sentences and can respond with appropriate answers, clarifying questions, or suggestions of its own.


Agents that answer factual questions or book restaurants for us are one obvious — if not exactly world-changing — application. It’s also easy to see how such software might lead to more stimulating video-game characters or improve online learning. More provocatively, LeCun says systems that grasp ordinary language could get to know us well enough to understand what’s good for us. “Systems like this should be able to understand not just what people would be entertained by but what they need to see regardless of whether they will enjoy it,” he says. Such feats aren’t possible using the techniques behind the search engines, spam filters, and virtual assistants that try to understand us today. They often ignore the order of words and get by with statistical tricks like matching and counting keywords. Apple’s Siri, for example, tries to fit what you say into a small number of categories that trigger scripted responses. “They don’t really understand the text,” says LeCun. “It’s amazing that it works at all.” Meanwhile, systems that seem to have mastered complex language tasks, such as IBM’s Jeopardy! winner Watson, do it by being super-specialized to a particular format. “It’s cute as a demonstration, but not work that would really translate to any other situation,” he says.
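The keyword-counting approach described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how Siri or any real assistant is implemented: the categories, keywords, and scripted replies are all invented for the example.

```python
import re
from collections import Counter

# Hypothetical categories and trigger keywords, invented for illustration.
INTENT_KEYWORDS = {
    "weather": {"weather", "rain", "forecast"},
    "restaurant": {"book", "table", "restaurant", "dinner"},
}

# Hypothetical scripted responses, one per category, plus a fallback.
SCRIPTED_REPLIES = {
    "weather": "Here is today's forecast.",
    "restaurant": "Which restaurant should I book?",
    None: "Sorry, I didn't catch that.",
}


def bag_of_words(text):
    """Count lowercase word occurrences; word order is discarded."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def classify(text):
    """Pick the category whose keywords occur most often, or None if none match."""
    counts = bag_of_words(text)
    scores = {
        intent: sum(counts[word] for word in words)
        for intent, words in INTENT_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


def reply(text):
    """Return the scripted response for the best-matching category."""
    return SCRIPTED_REPLIES[classify(text)]
```

Because only word counts are kept, "dog bites man" and "man bites dog" look identical to this program, which is the sense in which such systems do not really understand the text.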


In contrast, deep-learning software may be able to make sense of language more the way humans do. Researchers at Facebook, Google, and elsewhere are developing software that has shown progress toward understanding what words mean. LeCun’s team has a system capable of reading simple stories and answering questions about them, drawing on faculties like logical deduction and a rudimentary understanding of time.


However, as LeCun knows firsthand, artificial intelligence is notorious for blips of progress that stoke predictions of big leaps forward but ultimately change very little. Creating software that can handle the dazzling complexities of language is a bigger challenge than training it to recognize objects in pictures. Deep learning’s usefulness for speech recognition and image detection is beyond doubt, but it’s still just a guess that it will master language and transform our lives more radically. We don’t yet know for sure whether deep learning is a blip that will turn out to be something much bigger.


