By Han Yimin, Tencent Technology
On May 2, 2017, Tencent announced the appointment of speech recognition expert Dr. Yu Dong as deputy director of its AI Lab (artificial intelligence laboratory), along with the establishment of an AI laboratory in Seattle, in the United States.
Dr. Yu Dong will be responsible for the operation and management of the Seattle AI Lab and will advance Tencent's basic research in speech recognition and natural language understanding.
This is Tencent's third major move in artificial intelligence in recent months.
More than a month ago, on March 19, Fine Art (Jueyi), the Go AI developed by Tencent AI Lab, won the championship at the UEC Cup computer Go tournament in Japan. It was the first time Tencent AI Lab had shown its research results publicly after nearly a year of low-key operation.
The Fine Art team at the UEC Cup
Shortly after Fine Art's victory, on March 23, Tencent announced the appointment of Dr. Zhang Tong, a leading scientist in artificial intelligence, as director of Tencent AI Lab (Tencent artificial intelligence laboratory).
Today, another heavyweight has joined Tencent's artificial intelligence camp.
Commenting on Yu Dong's arrival, Tencent AI Lab director Dr. Zhang Tong said: "Dr. Yu Dong is an expert in speech recognition and deep learning. We are pleased to welcome Dr. Yu to Tencent AI Lab, and I believe his arrival will greatly strengthen Tencent's AI capabilities. We hope Tencent AI Lab will be not only a laboratory but also a connector: by bringing the world's best talent together, it can keep advancing basic AI research and its application in more scenarios, so that AI is everywhere."
Dr. Yu Dong, deputy director of Tencent AI Lab, said: "I am very pleased to join Tencent AI Lab. Over the past 10 years, Tencent has accumulated rich application scenarios, massive data, strong computing power and first-class technical talent. These are an important foundation for in-depth AI research and application, and an important reason Tencent attracts global talent. I believe that with the establishment of the Seattle AI Lab, more first-class talent will join Tencent AI Lab in the future to jointly advance AI technology worldwide."
In the international speech recognition research community, Yu Dong is a name that cannot be ignored.
Prior to joining Tencent, Yu Dong was a principal researcher in the Speech and Dialog Group at Microsoft Research, as well as an adjunct professor at Zhejiang University, a guest professor at the University of Science and Technology of China and a visiting researcher at Shanghai Jiao Tong University.
The context-dependent deep neural network hidden Markov model (CD-DNN-HMM), which he developed together with Dr. George Dahl and Dr. Deng Li, was the first successful application of deep learning to large-vocabulary speech recognition. Their breakthrough work won the 2013 IEEE Signal Processing Society (IEEE SPS) Best Paper Award, changed the direction of large-vocabulary speech recognition research, and greatly advanced speech recognition technology.
Meanwhile, Tencent has been increasing its investment in artificial intelligence in recent years. Tencent AI Lab was established in April 2016 and is headquartered in Shenzhen. As Tencent's company-level AI laboratory, AI Lab combines basic research with applied exploration, is committed to improving AI's capabilities in decision-making, understanding and creativity, and provides AI technical support for Tencent's products and services.
Tencent AI Lab is led by Dr. Zhang Tong, an expert in machine learning and big data, and has more than 50 AI scientists from world-renowned institutions (90% of them holding doctorates) and more than 200 application engineers. The arrival of Yu Dong, a top expert in speech recognition, means that Tencent's artificial intelligence strategy will extend further into basic research.
On the occasion of Yu Dong joining Tencent, we conducted an exclusive interview with him. In getting to know the head of Tencent's Seattle AI Lab, we also caught a glimpse of more pieces of Tencent's artificial intelligence puzzle.
A key called AI
Amid the artificial intelligence boom sweeping global industry, speech recognition is among the technologies most likely to become the first mass-market applications. That owes much to progress in basic speech recognition research, and Yu Dong is a key figure behind the breakthroughs in that research.
In late summer, on August 28, experts and scholars from around the world gathered in Florence, Italy. Over the next three days, the 12th annual conference of the International Speech Communication Association (ISCA), Interspeech 2011, would be held in the birthplace of the Renaissance.
As one of the two most important international conferences in the speech field (the other being ICASSP), Interspeech each year draws practitioners from academia and industry to exchange the latest technology and research.
On the second day of the conference, a paper titled "Conversational Speech Transcription Using Context-Dependent Deep Neural Networks" was presented and quickly attracted the attention of the research community.
The paper proposed a new speech recognition method based on artificial neural networks. Experimental results showed that the new method could greatly reduce the recognition error rate. This meant that artificial neural networks, which had set off a boom in the late 1980s before fading, were brought back into speech recognition, opening the deep learning era of the field.
Yu Dong was one of the main researchers behind this achievement.
In 2011, today's artificial intelligence boom was still six years away. Artificial neural networks had been through several ups and downs in academia, and few were optimistic about them at the time.
Recalling that time for Tencent Technology, Yu Dong still has mixed feelings: "At the beginning, this work (bringing deep learning methods into speech recognition) met with a lot of skepticism. Many colleagues and friends had lived through the late 1980s and early 1990s, when neural networks went from a peak to a trough, so they had doubts about it."
But Yu Dong and his team broke new ground and answered the doubts with actual results. "Within about two years, many companies had reproduced our work and found that it really did help recognition accuracy, and it soon became the industry standard. Before our work, papers of this kind were actually somewhat difficult to publish. Two years later it was the other way around: papers that did not use deep learning became difficult to publish."
The key of deep learning opened a new door for speech recognition research, and since entering the deep learning era, speech recognition has kept achieving breakthroughs.
In mid-September 2016, Microsoft reported a new milestone in speech recognition: its new system's word error rate on the Switchboard benchmark test set fell to 6.3%. A month later, Microsoft announced a historic breakthrough on the same benchmark: its speech recognition system's word error rate (WER) reached 5.9%, on par with or even lower than that of professional transcribers.
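For readers unfamiliar with the metric, word error rate is conventionally computed as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only. The function name `word_error_rate` and the example sentences are hypothetical, and this is not the scoring tool used for the Switchboard results cited above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Illustrative helper (hypothetical name); words are split on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
print(f"{wer:.1%}")
```

Under this definition, a WER of 5.9% corresponds to roughly six word errors for every hundred words in the reference transcript.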
What these numbers mean is that near-field speech recognition accuracy has crossed the threshold of practical usability and can be applied in many scenarios, such as voice-to-text in WeChat, voice input methods, and the voice input boxes of various apps.
At the practical level, voice input has become an essential feature of many apps. At the research level, however, speech recognition still has many difficulties to overcome.
According to Yu Dong, recognition in difficult conditions, such as far-field speech, high noise, or accented speech, remains an unsolved problem. Adaptation methods are also an important research direction.
Working at the forefront of the speech field, Yu Dong is pushing his research toward deeper problems. After joining Tencent AI Lab, he will lead his team to focus on two directions: far-field speech recognition and natural language understanding. In Tencent AI Lab's artificial intelligence strategy, the Seattle AI Lab led by Yu Dong will be another home base.
An offer from Tencent
As one of the earliest researchers to apply deep learning to speech recognition, Yu Dong is one of the top experts in speech recognition and deep learning. He has published two monographs and more than 160 papers, is the inventor of 60 patents, and is an initiator and one of the main authors of the open-source deep learning toolkit CNTK. He won the IEEE Signal Processing Society Best Paper Award in 2013 and 2016. He is a member of the IEEE Speech and Language Processing Technical Committee and has served on the editorial boards of IEEE/ACM Transactions on Audio, Speech, and Language Processing, IEEE Signal Processing Magazine and other journals.
In 2016, the book "Analyzing Deep Learning: Speech Recognition Practice," co-authored by Yu Dong and his Microsoft Research colleague Deng Li, was published in China. It was the first book devoted to how deep learning methods, especially deep neural network (DNN) technology, are applied to automatic speech recognition (ASR).
Yu Dong, who joined Microsoft in 1998, has witnessed the Emerald City's rise into a hub of artificial intelligence research in the United States outside Silicon Valley.
Microsoft, the veteran IT giant headquartered in Seattle, invested relatively heavily in artificial intelligence from early on and has cultivated a large number of experienced AI professionals. Microsoft's Artificial Intelligence and Research group now has five to six thousand people, which is a huge talent pool.
A few years ago, Microsoft's investment gradually drew technology giants including Google, Facebook and Apple to set up large R&D centers in Seattle. Today Amazon, also headquartered in Seattle, has built an artificial intelligence team of more than a thousand people as well.
With so many technology giants clustered together, Seattle's appeal to talent keeps growing. This clustering effect has drawn many professionals from Silicon Valley and elsewhere, and Seattle now sees a large population inflow every year.
As a top international speech researcher, Yu Dong was offered many olive branches. Before he chose Tencent, many companies approached him, but none won him over.
Yu Dong finally chose Tencent because it has the advantages needed to carry out speech recognition research.
Yu Dong told Tencent Technology that he chose Tencent for several reasons. First, speech recognition requires large data sources, large computing power, and an outlet with a feedback mechanism for then optimizing the product; that is, it must have real deployment scenarios. Second, he personally prefers research and likes to solve challenging problems, and other companies each have their own strengths and weaknesses, for example having products but lacking research. Tencent can satisfy all the conditions Yu Dong values.
A Seattle lab without KPIs
Tencent's determination to build a first-class AI laboratory, Seattle's pool of artificial intelligence talent, and Yu Dong's eventual arrival made the establishment of the Seattle AI Lab a matter of course. And this laboratory has no KPIs.
The reason for not having a KPI is related to the positioning of the Seattle lab.
In April 2016, Tencent set up AI Lab (Tencent artificial intelligence laboratory), which is committed to basic research in artificial intelligence as well as in-depth exploration of application areas, pursuing "academic impact, industrial output".
At present the laboratory has more than 50 AI scientists from world-renowned universities (90% of them holding doctorates) and more than 200 experienced engineers working on basic research and applied exploration.
AI Lab focuses on four areas of basic research: computer vision, speech recognition, natural language processing and machine learning, striving for full coverage and deeper expansion of cutting-edge AI capabilities. At the same time, it develops AI applications in four business scenarios characteristic of Tencent: content AI, social AI, game AI and platform tool AI.
Tencent AI Lab research directions
Within Tencent AI Lab's research system, the Seattle AI Lab will take on some of the basic and cutting-edge research in speech recognition and NLP and try to solve the harder problems in these areas. The AI Lab in Shenzhen will continue to combine basic research with rapid application, bringing research and technology in the four areas into real-world scenarios faster.
In short, the Seattle lab focuses on basic research, while the team at the Shenzhen headquarters also needs to attend to applied research. In fact, though, there is no strict boundary between basic and applied research, and it is sometimes hard to say whether something is basic or applied. For example, when a basic research team solves a key problem in a particular technology, the result can be applied in products immediately. But such problems are generally harder, and it is difficult to predict when they will be solved.
As a result, the Seattle lab's progress is not held to strict timelines, which means more patience is needed, along with more innovative ideas and algorithms.
Having done research in speech recognition for more than 20 years, Yu Dong deeply understands the patience and investment that basic research requires. After a number of discussions with Tencent, a consensus has also formed internally on long-term, patient research.
"If you want to make breakthroughs in technical research, you really need some patience and relatively long-term, stable investment, and Seattle basically upholds this idea. We hope that over the long run we can innovatively solve the key, major technical problems and achieve large performance improvements in real application scenarios. But because we cannot predict when we will ultimately succeed, we hope to make some progress at every stage. That is the only way we can define progress, but how large that progress will be, we cannot say more precisely."
Years of research have given Yu Dong a meticulous way of speaking. In the interview, his answers to technical questions about artificial intelligence were very rigorous: before giving a conclusion, he would lay out the reasons and the various factors clearly.
But on where speech recognition research could connect with Tencent's businesses, Yu Dong's judgment is very optimistic: the Internet of Things, games, WeChat, QQ and many other business scenarios use speech recognition, and the room for combining semantic understanding with Tencent's social applications will be even bigger.
Today, the Seattle lab has only just been set up. As its head, Yu Dong's main task at present is recruiting talent and building a team.
Yu Dong hopes to build a team of about 20 people and to attract talent with solid research ability. He said the lab will look at both established researchers and people with potential, and is now seeking the right people through various channels.
Going forward, Yu Dong will lead the Seattle lab, which leans more toward research and is closer in nature to the research institutes of large American companies, as it continues to explore speech recognition and semantic understanding. This work requires long-term commitment, but Yu Dong, with more than 20 years of research behind him, and Tencent are fully prepared.
"We are patient."