The AI boom has spawned many startups, a good share of them AI chip companies. So far, both giants and startups have focused mainly on AI vision chips, but since the first half of 2018 AI voice chips have also started to appear. On January 4, 2019, Si Bi Chi (AISpeech) officially released its first-generation AI voice chip, TAIHANG, in Beijing. Notably, the chip comes from Shanghai Shen Cong Intelligent, a joint venture between Si Bi Chi and SMIC Juyuan, a subsidiary of SMIC. What is the story behind it?
Is AI voice technology really less challenging than images?
Image and voice are the two main directions of applied AI. Of the two, images currently receive more attention, driven on the one hand by applications such as security and autonomous vehicles and on the other by policy support. At the same time, some people believe that voice is technically less challenging than images and that existing chips can already meet the needs of AI voice. Is this a misunderstanding?
Zhou Weida, CTO of Si Bi Chi and CEO of Shen Cong Intelligent, told Lei Feng.com in an interview: “At present, most AI image processing uses CNNs. The biggest bottleneck for CNNs is not bandwidth or storage but parallel computation, and parallel multiplication can be accelerated relatively well on a von Neumann CPU architecture. AI voice uses DNNs and the RNN family, such as LSTM and BLSTM, which actually pose more challenges than images. One is parallel multiplication; the other is the larger number of model parameters, which causes a severe bandwidth bottleneck on current CPU architectures.”
“AI images were the first to receive industry attention partly because of market demand, and partly because academia and industry believe that optimizing hardware for CNNs is less difficult than for LSTM's large-parameter models,” Zhou Weida further explained.
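Zhou Weida's bandwidth argument can be illustrated with a rough count of how often each weight is reused once it has been fetched from memory. The sketch below is a back-of-the-envelope illustration only: the layer sizes are assumed for the example and do not come from the interview, biases are ignored, and the LSTM is evaluated on a single audio stream one time step at a time.

```python
def conv2d_stats(c_in, c_out, kernel, h_out, w_out):
    """Weights and multiply-accumulates (MACs) for one 2D convolution layer."""
    params = c_in * c_out * kernel * kernel
    # Every weight is applied at every output position of the feature map.
    macs = params * h_out * w_out
    return params, macs


def lstm_stats(input_size, hidden_size, steps=1):
    """Weights and MACs for one LSTM layer processing `steps` time steps."""
    # Four gates, each with an input matrix and a recurrent matrix.
    params = 4 * hidden_size * (input_size + hidden_size)
    # Every weight is used exactly once per time step.
    macs = params * steps
    return params, macs


def macs_per_weight(params, macs):
    """How many MACs each fetched weight contributes to (weight reuse)."""
    return macs / params


# Illustrative sizes only (assumptions, not figures from the article).
conv_params, conv_macs = conv2d_stats(c_in=64, c_out=64, kernel=3, h_out=56, w_out=56)
lstm_params, lstm_macs = lstm_stats(input_size=512, hidden_size=512, steps=1)

print("conv weights:", conv_params, "reuse:", macs_per_weight(conv_params, conv_macs))
print("lstm weights:", lstm_params, "reuse:", macs_per_weight(lstm_params, lstm_macs))
```

Under these assumptions, each convolution weight is reused thousands of times per image, while each LSTM weight is used once per time step. The LSTM therefore has to stream its whole, much larger, parameter set from memory for every step, which is the kind of bandwidth pressure the quote describes.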
It can therefore be clarified that since deep learning algorithms were proposed in 2005, they have brought significant improvements over traditional algorithms in both speech recognition and image vision. Existing chips can handle both AI image and AI voice workloads, but not efficiently enough, which is why dedicated AI chips are needed.
Si Bi Chi CTO / Shen Cong Intelligent CEO Zhou Weida
Why choose self-developed AI chips?
The AI boom has brought many AI chip startups onto the market, but chips are a technology-intensive, talent-intensive, capital-intensive industry with a high barrier to entry. Real success for a chip does not end at mass production; it also takes subsequent deployment in applications and constant iteration, which makes the payback period for chip investment longer. Why would a company known for its algorithms and software be determined to develop its own chips?
Zhou Weida gave an example at the 2019 Si Bi Chi AI chip and strategy conference. He said that in one of the most complex scenarios Si Bi Chi has handled, the algorithm ran on a 4-core ARM chip and took up 50%-60% of the chip's computing capacity. For voice, which is only a means of interaction, to consume that much computation is unacceptable in many application scenarios.
Si Bi Chi therefore decided to build a dedicated AI voice chip, hoping to solve three closed-loop problems of general-purpose chips: the general-purpose chip cannot connect with the data; it cannot connect with the market; and, more importantly, it has no algorithm. The algorithm is the soul and the chip is the frame. A frame without a soul cannot produce value, and the future potential of chips must be realized by dedicated chips.
The need for dedicated chips in AI voice is beyond question, but market demand is the stronger driving force. Si Bi Chi CEO Gao Shixing said at the press conference that the company has opened up full-link dialogue technology, deepened the integration of software and hardware, promoted customization at scale with the DUI platform, and provides interactive information services through its “session wizard” to achieve rapid deployment. He said its market share ranks first in connected cars, smart speakers, children's tablets and story machines, knowledge robots and other fields, and that it is growing quickly in key areas such as factory-installed in-car systems, TVs and white goods, and intelligent customer service. In the future, the company will combine its intelligent terminal solution capabilities with the session wizard's intelligent service capabilities into an All In One solution, entering industries such as hotels, real estate, logistics, elderly care, healthcare, education, security and community services.
Lei Feng.com also learned that Si Bi Chi broke even in 2017 and turned a profit in 2018. Behind this is the market's rapid growth in voice demand. For example, smart speaker shipments are expected to grow from 18 million in 2018 to 20 million, story machines shipped more than 20 million units in 2018, and smartphones added voice assistant functions in 2018. Of course, the company's market share across these fields was the ultimate reason it chose to enter the chip field. However, Zhou Weida said in the interview that because the company's strengths are algorithms and software, it had many concerns before finally deciding to make AI chips.
Why take a different path to AI chips?
Once the decision to develop its own AI chip was made, how to do it became the next key question. Zhou Weida told Lei Feng.com that Si Bi Chi spent a year on research starting in 2017. At first it wanted to cooperate with IP providers, including porting Si Bi Chi's algorithms to their CPUs and DSPs, but it ultimately found that the hardware-level optimization fell short.
The improvement from cooperating with IP providers was far below the company's expectations. Next, Si Bi Chi approached chip design outsourcing companies, which can design a chip to a given set of requirements. However, because these companies work on a project basis, they do not continue with PPA (Performance, Power, Area) optimization once the chip is implemented. Yet implementing a high-performance chip may take only 20% of the effort, while the subsequent optimization takes 80%.
With outsourcing ruled out, Si Bi Chi then tried cooperating with SoC companies that have rich chip design and market experience. But even with Si Bi Chi contributing its algorithm team free of charge to design and optimize jointly with the SoC company, computing efficiency improved by only 20% in one year.
Si Bi Chi concluded that this form of cooperation with chip design companies would not work. It then considered a deeper partnership: the chip company would contribute a design team, Si Bi Chi would contribute an algorithm team, and together they would form a joint venture. But this also faced many difficulties, such as the SoC companies' lack of willingness and future questions over intellectual property.
Si Bi Chi also considered the same deep cooperation model with a chip foundry. A foundry is very experienced in IP verification and chip production and is well connected with the upstream and downstream of the chip industry chain, so mass production, yield, orders and delivery cycles are well assured. That made it a good choice for the company. Of course, foundries prefer to work with mature IP.
In the end, after researching and considering nearly 100 companies across the chip industry chain, Si Bi Chi teamed up in March 2018 with SMIC Juyuan, a subsidiary of SMIC, to jointly invest in and establish Shanghai Shen Cong Semiconductor Co., Ltd. (“Shen Cong Intelligent”), officially setting out on the road to making chips. The chip taped out in August and was verified in November.
First-generation AI chip reaches milliwatt-level power consumption in typical working scenarios
Building on its earlier research, by December 2017 Si Bi Chi had completed porting and optimizing all of its algorithms on the conventional platforms it worked with. After Shen Cong Intelligent was established, it formally defined Si Bi Chi's first AI voice chip, completing the full chip specification in one month. In April, underlying technology development, integration verification, simulation and optimization began. In less than five months, the company's first chip taped out successfully on August 7, and the silicon lit up on the day it was first brought up.
However, AI algorithms are still evolving, which makes designing AI chips for end devices more challenging. On this point, Zhou Weida said: “Thanks to the large number of IoT smart devices we have integrated with in the market, we understand market demand very well. In addition, 14 of our papers were accepted at ICASSP, setting a new record for national independent innovation capability. Our algorithm research is advanced, and we have forecasts and plans for the next two to three years or longer. Si Bi Chi's AI chip was designed with this in mind: first, to ensure that current algorithms can be ported quickly; second, to reserve room for optimization over the next two to three years.”
This quickly taped-out AI chip is Shen Cong's TH1520. According to Zhu Congyu, CTO of Shen Cong Intelligent, the TH1520 is optimized in hardware for the algorithms. It is based on a dual-DSP architecture and integrates a codec and a large-capacity on-chip memory. With AI instruction set extensions and hardware acceleration of the algorithms, it is more than 10x as efficient as a conventional general-purpose chip. In addition, the TH1520's architecture reserves computing power and storage resources to support future algorithm upgrades and extensions.
The TH1520 balances low power consumption with practicality. With a multi-level wake-up mode and low-power IP, its power consumption in the always-on listening phase is on the order of milliwatts. Typical operating scenarios consume only tens of milliwatts, and peak power consumption in extreme scenarios does not exceed 100 milliwatts.
“By contrast, with an Arm chip, power consumption in working scenarios can be optimized down to about 500 milliwatts at best, and some run at watt level,” Zhou Weida said in the interview. “It is no exaggeration to say that when we jointly optimized the AI algorithms and hardware, the algorithms were optimized down to the instruction set level and memory down to the byte level. Of course, along the way the hardware and software teams also went from getting to know each other to working together seamlessly.”
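To see why a multi-level wake-up scheme matters for always-on devices, it helps to estimate average power as a time-weighted mix of the stages. The sketch below is a rough illustration under stated assumptions: the per-stage power values are placeholders chosen within the ranges quoted above (milliwatts while listening, tens of milliwatts in typical operation, at most 100 milliwatts at peak), and the share of time spent in each stage is entirely hypothetical.

```python
import math


def average_power_mw(stages):
    """Time-weighted average power for (fraction_of_time, power_mw) stages."""
    total_fraction = sum(fraction for fraction, _ in stages)
    if not math.isclose(total_fraction, 1.0):
        raise ValueError("time fractions must sum to 1")
    return sum(fraction * power for fraction, power in stages)


# Hypothetical duty cycle; none of these numbers are measured values.
ASSUMED_STAGES = [
    (0.90, 2.0),    # always-on listening, assumed 2 mW
    (0.09, 30.0),   # typical operation after wake-up, assumed 30 mW
    (0.01, 100.0),  # extreme scenario, at the quoted 100 mW ceiling
]

staged_avg = average_power_mw(ASSUMED_STAGES)

# Reference point from the interview: roughly 500 mW for an Arm chip
# in the working scenario, treated here as a constant draw.
ARM_WORKING_MW = 500.0

print(f"staged average: {staged_avg:.1f} mW")
print(f"ratio vs. 500 mW reference: {ARM_WORKING_MW / staged_avg:.0f}x")
```

With this assumed duty cycle the average stays in the single-digit milliwatt range, because the device spends most of its time in the cheapest listening stage. A different usage pattern would shift the result, so the figure is only an illustration of the mechanism.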
In addition, the TH1520 supports a full range of microphone arrays, including single-mic, dual-mic, linear 4-mic, circular 4-mic and circular 6-mic configurations. It also supports application interfaces such as USB/SPI/UART/I2S/I2C/GPIO and reference audio in multiple formats, allowing flexible deployment across a wide range of IoT products.
Zhou Weida also said that the TH1520 is positioned for all kinds of terminal devices and that its algorithms are optimized for the home environment, so it can be deployed quickly, reduce cost and power consumption, and greatly improve the user experience of TVs, set-top boxes, white goods, tablets, lamps and other products.
At the press conference, Si Bi Chi showed three TH1520 demos, in a speaker, a TV and a dishwasher, and said that more advanced features are still being debugged.
In addition, Zhou Weida revealed to Lei Feng.com that the TH1520 will enter mass production by Q2 of this year at the latest. He also said that Si Bi Chi has a clear roadmap for its chips, that the goal is to develop a suitable brain-like chip, and that some progress has been made.
Openness is Si Bi Chi's attitude
With its self-developed AI chip, Si Bi Chi can improve voice processing on the device. This not only enables a richer offline voice experience and reduces data transmission to the cloud, but also better protects user privacy through chip-level encryption added to the chip.
Of course, more importantly, Si Bi Chi and Shen Cong will build a “Cloud+Core” overall solution for AI interaction that better fits product needs. Asked whether this means Si Bi Chi prefers to provide complete solutions, Zhou Weida said: “We can provide anything from software and hardware IP and chips to turnkey solutions, depending on customer needs. We also hope to attract users with the product's price/performance ratio. I hope everyone can first make the voice market bigger.”
Si Bi Chi's openness is not only reflected in its final product offerings. Zhou Weida said that Shen Cong Intelligent also hopes to cooperate with all parties in the chip industry chain, including IP providers and chip design outsourcing companies. In addition, Shen Cong Intelligent hopes to go beyond human-machine voice interaction and into images in the future. It welcomes vision and imaging companies, and even AI finance companies, to work with it on high-performance AI computing chips that are smarter and closer to the human brain.
Lei Feng.com summary
Chips are one of the key elements in the development of AI, and the pursuit of computing power has made more chip companies aware of the need to cooperate with algorithm and software companies. Si Bi Chi, known for its AI voice algorithms, came at the chip market from the other direction, starting from voice algorithms and market demand. After long research and careful consideration, it ultimately completed the development of its AI chip through a joint venture, overcoming many challenges along the way. It is still hard to say whether this model will achieve the best results, but it was the most reasonable choice after that search.
During the interview, I could clearly feel Si Bi Chi's open attitude, not only in its openness to cooperation on AI chip R&D, but also in its willingness to work with more people to popularize AI voice technology.