The earlier Kirin 950 was an impressive chip that used what were, at the time, the most advanced TSMC 16nm FF+ process and the Cortex A72 architecture. This combination of performance and power efficiency gave the Kirin 950 a strong competitive edge and paid off handsomely for Huawei.
The later Kirin 960 and Kirin 970 showed the risky side of this strategy. The Kirin 960 was manufactured on the 16nm FFC process (a step below 16nm FF+) and was outclassed by the Qualcomm Snapdragon 835 and Samsung Exynos 8895 of the same period, both built on the 10nm LPE process. The Kirin 970 moved to TSMC's 10nm process but still used only the Cortex A73 architecture, while the Snapdragon 845 used the Kryo 385 architecture derived from the Cortex A75. In addition, the GPU energy efficiency of both generations of Kirin processors was held back by the weak Arm Mali G71/G72 architecture.
When designing the Kirin 980, Huawei was once again in a very favorable position. Arm's new Cortex A76 and Mali G76 architectures made a great leap in energy efficiency, and TSMC was pushing ahead with its 7nm manufacturing process.
As seen at yesterday's press conference, the Kirin 980 uses a DynamIQ CPU cluster of Cortex A76 and Cortex A55 cores, together with a Mali G76 GPU.
In previous designs, all the cores in a cluster ran on the same clock and voltage. If one demanding thread required a high-performance state, the cores running other threads had to raise their frequency as well, which dragged energy efficiency down. This time, HiSilicon took full advantage of Arm's new DSU cluster and asynchronous CPU configuration and subdivided the Kirin 980's high-performance Cortex A76 cluster into two groups, each operating at its own frequency and voltage, which can effectively improve energy efficiency in actual use.
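The effect of splitting a cluster into separate clock domains can be illustrated with a toy model. This is only a sketch: the per-core frequency demands are made-up values, the function name is hypothetical, and real schedulers and DVFS governors are far more involved.

```python
# Toy model (illustrative only): which frequency each core ends up running at
# when cores share one clock domain versus being split into separate domains.

def effective_freqs(requested_ghz, domains):
    """Every core in a domain runs at the highest frequency requested in that domain."""
    result = list(requested_ghz)
    for domain in domains:
        domain_freq = max(requested_ghz[i] for i in domain)
        for i in domain:
            result[i] = domain_freq
    return result

# Hypothetical per-core demand in GHz: one heavy thread and three lighter ones.
requested = [2.6, 1.2, 1.9, 1.0]

shared = effective_freqs(requested, [[0, 1, 2, 3]])   # one domain for all four cores
split = effective_freqs(requested, [[0, 1], [2, 3]])  # two groups of two cores

print("shared domain:", shared)
print("split domains:", split)
```

In the shared case every core is pulled up to the heaviest thread's frequency, while in the split case the second group only rises to its own highest demand.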
Of the two Cortex A76 groups, the high-frequency group operates at 2.6GHz, which is much lower than Arm's previously announced 3GHz target but higher than the conservative 2.5GHz that had been predicted. Even so, the new CPU architecture brings significant performance improvements, and the Cortex A76 still performs very well at 2.6GHz. The other Cortex A76 group runs at 1.92GHz, which should be a good balance point between performance and power. The two groups can be used flexibly according to the usage scenario.
For caching, all Cortex A76 cores come with the recommended 512KB L2 cache configuration, while the A55 cores use a 128KB cache. In the latest DynamIQ cluster configuration, the L2 cache is private to each CPU core. The L3 cache in the DSU is a 4MB shared design with twice the capacity of the Snapdragon 845's.
CPU performance increased by 75%, energy efficiency increased by 58%
Huawei said that the Kirin 980 delivers a 75% performance improvement over the Kirin 970 and 58% better energy efficiency. A footnote in Huawei's presentation slides shows that the energy efficiency data is based on Dhrystone, which focuses heavily on the CPU core itself and puts little pressure on other parts of the SoC such as memory.
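As a rough check of what these two figures imply together, relative power can be estimated as relative performance divided by relative efficiency. This sketch assumes both claims refer to the same operating point, which Huawei did not state, so the result is only indicative.

```python
# Assumption: both of Huawei's claims describe the same (peak) operating point.

def implied_power_ratio(perf_gain, efficiency_gain):
    """Relative power = relative performance / relative energy efficiency."""
    return (1.0 + perf_gain) / (1.0 + efficiency_gain)

# Claimed gains over the Kirin 970: +75% performance, +58% energy efficiency.
cpu_power = implied_power_ratio(0.75, 0.58)
print(f"Implied CPU power vs Kirin 970: {cpu_power:.2f}x")
```

Under that assumption, peak CPU power would be roughly 11% higher than on the Kirin 970, with the extra performance more than covering the increase.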
The big.middle.little three-tier structure used by the Kirin 980 also brings scheduling complexity. HiSilicon says it has adopted a new "flexible scheduling" mechanism, but unfortunately gave no further detail. Until the first Kirin 980 devices are actually in hand, performance and power figures are limited to estimates.
After some simple calculations, the performance and power of the Kirin 980 are basically in line with what Arm projected for the Cortex A76 architecture. The 2.6GHz Cortex A76 should have no trouble matching or exceeding the Samsung Exynos 9810 at 2.7GHz, and is about 30% ahead of the Snapdragon 845. On the 7nm process, the Kirin 980's 2.6GHz Cortex A76 core even consumes less power than the Samsung Exynos 9810 at 1.8GHz, giving it the best energy efficiency in the field.
10 core Mali G76 ≥ 20 core Mali G72
The Kirin 980 is also the first SoC to use Arm's new Mali G76 GPU. The Mali G76 is a significant departure from previous Bifrost designs, with a dramatically changed internal core structure.
To improve the performance and area efficiency of the architecture, the Mali G76 doubles the size of the basic compute block in the GPU, with 8 FMA and ADD/SF pipelines in a single execution unit. In effect, the computing resources of one Mali G76 core are equivalent to those of two Mali G72 cores.
On paper, the Kirin 980's Mali G76 MP10 may at first glance look smaller than the Mali G72 MP12 in the Kirin 970, but it is in fact equivalent to a Mali G72 MP20. That is about 66% more computing resources, a figure that does not include the new architecture's improved computational efficiency.
The GPU in the Kirin 970 is clocked as high as 747MHz, but energy efficiency at this frequency is poor, so in actual use the operating frequency often drops because of high power consumption and heat. According to HiSilicon, the GPU in the Kirin 980 runs at 720MHz, with 46% higher performance than the Kirin 970 and a 178% improvement in energy efficiency.
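The GPU figures quoted so far can be combined in a back-of-the-envelope sketch. It assumes that one Mali G76 core counts as two Mali G72 cores, that throughput scales linearly with core count and clock, and that the performance and efficiency claims refer to the same operating point. None of this is confirmed by Huawei, and the variable names are purely illustrative.

```python
# Rough estimate from the figures quoted above; all scaling assumptions are simplifications.

g76_cores, g72_cores = 10, 12       # Kirin 980 (G76 MP10) vs Kirin 970 (G72 MP12)
g72_equivalent_cores = g76_cores * 2  # one G76 core ~ two G72 cores

resource_ratio = g72_equivalent_cores / g72_cores   # compute resources, ~1.67x
clock_ratio = 720 / 747                             # Kirin 980 vs Kirin 970 GPU clock
paper_throughput_ratio = resource_ratio * clock_ratio

claimed_perf_ratio = 1.0 + 0.46        # +46% performance
claimed_efficiency_ratio = 1.0 + 1.78  # +178% energy efficiency
implied_power_ratio = claimed_perf_ratio / claimed_efficiency_ratio

print(f"G72-equivalent cores:      {g72_equivalent_cores}")
print(f"Resource ratio:            {resource_ratio:.2f}x")
print(f"Paper throughput ratio:    {paper_throughput_ratio:.2f}x")
print(f"Implied power vs Kirin 970: {implied_power_ratio:.2f}x")
```

On these assumptions, the claimed 46% gain sits below the roughly 60% increase in paper throughput, while the implied GPU power is only about half that of the Kirin 970.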
Based on the GFXBench Manhattan test commonly used in the industry, after some simple calculations, the following data can be obtained:
Mali G76 configurations differ in size, and results vary from test to test because each places a different load on pixel fill rate and arithmetic logic. Even so, it must be said that, at least in the GFXBench Manhattan benchmark, the absolute performance of the Kirin 980 is still inferior to that of the Adreno 630 in the Snapdragon 845.
However, GPU energy efficiency has indeed made a considerable leap. Huawei's 178% figure is also consistent with earlier estimates made when analyzing the Mali G76. If this level holds on actual devices, it means the Kirin 980 has climbed out of the pit dug by the Mali G71/G72, and sustainable power consumption has returned to the healthy level of the Kirin 950.
Faster memory controller
The Kirin 980 uses an LPDDR4X memory controller and claims to be the first in the industry to support 2133MHz memory, for a 13% increase in bandwidth. One of the shortcomings of the Kirin 970 was that its memory controller consumed a lot of power at higher frequencies. Hopefully the Kirin 980 solves this problem and can sustain high performance at high frequencies.
Huawei announced some memory latency and bandwidth data at the press conference: in the GeekBench 4 test, the Kirin 980 has a latency of 138ns and the Snapdragon 845 a latency of 176ns. However, the basis for these figures is yet to be confirmed, because the latency of the Kirin 970 and Snapdragon 835 is also around 138ns, and the Samsung Exynos 9810 does better than either, at only 78ns.
New ISP and its imaging features
The Kirin 980 uses a new ISP that delivers up to 46% more image processing throughput and supports higher-resolution camera data. One improvement in the new ISP is a 10-bit image processing pipeline for HDR, a feature widely added to SoCs this year. However, because color management and screens in actual phones have not kept up, this capability is of very little use in practice.
Another improvement in the new ISP is support for "multi-channel noise reduction" technology. This sounds a lot like the multi-frame noise reduction introduced in this year's Snapdragon 845: the noise reduction works across frames in time rather than across neighboring pixels, which gives a better result without the blurring side effect. In addition, the new ISP has a new video encoding pipeline that reduces video capture latency by 33%.
However, the Kirin 980's video encoding capability remains at 4K@30fps, which is a competitive disadvantage for a new SoC.
New dual-core NPU, throughput *2
The Kirin 980 continues to improve its neural network inference acceleration architecture and introduces a new dual-core NPU. The new dual-core NPU still processes only one model kernel at a time, so the gain shows up as up to doubled single-model inference speed rather than as two models running in parallel. Huawei said that the new dual-core NPU of the Kirin 980 is 2.2 times faster than the NPU of the Kirin 970 and can achieve 4,500 inferences per minute.
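For a sense of scale, the quoted NPU figures can be converted into per-second throughput and an implied Kirin 970 baseline. This is simple arithmetic on Huawei's numbers; it assumes the 2.2x claim applies to the same workload as the 4,500 figure, which the presentation did not spell out.

```python
# Arithmetic on the figures quoted above; variable names are illustrative.

kirin980_per_minute = 4500
speedup_vs_kirin970 = 2.2

per_second = kirin980_per_minute / 60        # inferences per second
ms_per_inference = 1000 / per_second         # average time per inference
kirin970_per_minute = kirin980_per_minute / speedup_vs_kirin970  # implied baseline

print(f"Kirin 980: {per_second:.0f} inferences/s, {ms_per_inference:.1f} ms each")
print(f"Implied Kirin 970: about {kirin970_per_minute:.0f} inferences/min")
```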
Faster Cat.21 4G baseband and Balong 5000 5G baseband
The new Cat.21 4G baseband in the Kirin 980 supports 4x4 MIMO with a 1.4Gbps download rate and 2x2 MIMO with a 200Mbps upload rate, along with 5CA, 256QAM and 3x carrier aggregation, making it the strongest baseband before 5G.
In addition to the 4G baseband, Huawei also talked about the Balong 5000 5G baseband that can be paired with the Kirin 980. This pairing is similar to how the next-generation Qualcomm Snapdragon SoC will be paired with the X50 baseband. However, Huawei did not disclose how the two chips will be packaged, saying only that devices using the Kirin 980 + Balong 5000 will appear sometime next year.
1732Mbps high speed WiFi module
In previous SoCs, Huawei usually used Broadcom's WiFi module. Broadcom has always been considered the industry leader, and most of the flagship devices on the market use Broadcom's WiFi solution.
However, the Kirin 980 unexpectedly used the new Hi1103 WiFi module, supporting 802.11ac standard, 2x2 MIMO and 160MHz bandwidth, speed up to 1732Mbps. At the same time, Hi1103 also supports L1+L5 dual-frequency GPS positioning, and the positioning accuracy can be increased by 10 times in the L5 frequency band.
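The headline WiFi figure matches the nominal top 802.11ac PHY rate for this configuration. The sketch below assumes the number refers to VHT MCS9 (256-QAM, rate 5/6) with a short guard interval, which the announcement did not state. The constants are the standard 802.11ac values for a 160MHz channel, and the function name is illustrative.

```python
# Nominal 802.11ac (VHT) PHY rate; assumes MCS9 with short guard interval.

def vht_phy_rate_mbps(data_subcarriers, bits_per_subcarrier, coding_rate,
                      spatial_streams, symbol_duration_us):
    """Data bits carried per OFDM symbol divided by symbol time (bits/us == Mbps)."""
    bits_per_symbol = (data_subcarriers * bits_per_subcarrier
                       * coding_rate * spatial_streams)
    return bits_per_symbol / symbol_duration_us

rate = vht_phy_rate_mbps(
    data_subcarriers=468,    # 160MHz channel
    bits_per_subcarrier=8,   # 256-QAM
    coding_rate=5 / 6,       # MCS9
    spatial_streams=2,       # 2x2 MIMO
    symbol_duration_us=3.6,  # 3.2us symbol + 0.4us short guard interval
)
print(f"Nominal PHY rate: {rate:.1f} Mbps")
```

The result is about 1733Mbps, which vendors commonly quote as either 1733 or 1732Mbps. It is a link-layer signaling rate, so real-world throughput will be lower.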
Many strengths gathered in one chip
The previous Kirin 960 and Kirin 970 suffered in the market because of their process and architecture disadvantages. This time, the Kirin 980 aims to solve these problems and makes major improvements across the board. It uses TSMC's latest 7nm process along with Arm's latest Cortex A76 CPU architecture and Mali G76 GPU architecture. Together with improvements to the memory controller, ISP, NPU and other modules, this makes the Kirin 980 look like a truly well-balanced SoC in an excellent competitive position. The first Mate 20 with a Kirin 980 processor will be launched on October 16 and should be a device worth looking forward to.