Smartphone SoC design trends in 2023: the AI processor factors in

Date: 15/05/2023
Date: 15/05/2023
The smartphone SoC is one of the most complex semiconductor chips in the world. Though these chips are made in the billions, each design goes through an enormous amount of high-quality chip design work before it reaches mass production. The chip must consume extremely low power while delivering consistent, glitch-free performance. Only a few companies have mastered the skill of designing such complex silicon brains in the short timeframe demanded by the narrow window of a massive global market opportunity.

The performance of a smartphone depends directly on the power of the processor cores inside the SoC. The semiconductor technology node at which these chips are manufactured has been a key factor, since a smaller node allows more processor cores to be packed into the same space. It is a selling point for smartphone vendors: product reviews of popular best-selling phones routinely mention the foundry and the process node the chip was made on. With both Samsung and TSMC having a tough time scaling beyond 3 nm, processor performance is expected to be defined more by the AI engines inside the chips. In that sense, the AI performance of a chip is now a multiplying factor.

Huawei took an early lead in this area, followed by Apple, Qualcomm, MediaTek and Samsung, who are racing against each other to add AI to their mobile SoCs. For these companies it is a series of battles in a highly competitive market. There is no visible economic advantage in monolithic scaling beyond 3 nanometers; the most economical way to enhance performance is heterogeneous computing, along with heterogeneous chiplet-based 3D integration. Customized or generic AI processing elements, either as part of a monolithic chip or as separate chiplets, are a trend in all the latest designs. AI is not new, but it has now come of age and gone mainstream, pushing traditional CPU elements into a corner of the SoC.

More and more AI makes smartphones genuinely smarter. Phones can sense their surroundings and behave and alert the user accordingly. An AI-powered device can also take advantage of its owner, feeding useful data back to its maker and the maker's tech partners. The user, however savvy, has little choice but to own such a smart device to live today's tech-dependent life. With that said, let's get into the growing neural elements inside the smartphone SoC.

The requirements for AI-style processing:
An AI processor can be employed for image processing tasks such as classification, enhancement, super-resolution, optical character recognition (OCR), object tracking, visual scene understanding, face detection and recognition, human activity recognition, gesture recognition, sleep monitoring, gaze tracking, and language translation, as well as writing tasks such as sentence completion, sentiment analysis, and interactive chatbots. On-chip AI that simulates human-like information processing enables mobile devices to automate the user's mundane daily tasks and increase productivity.

Graphics processing, particularly in gaming applications, can be significantly improved by AI processing. Vector processing accelerators deliver excellent latency, better resource management, and optimal power consumption.

It is challenging to identify workloads that AI processors can manage exclusively, and it is equally important to bring in the traditional CPU as and when required, so that the two work seamlessly.
Today's smartphones do more than take pictures and capture audio; they carry sensors that monitor vibration, temperature, position, and other aspects of the environment and user behavior. All of these are signals that AI processors can read and infer from quickly, despite the huge, continuous data streams.

The role of AI in smartphones is rising exponentially, getting applied wherever it finds reliable use. Name any streaming or sensing application, and you can find a better way to process its data using AI. The simple reason is that the data a smartphone reads is much the same as what a human being processes every day. As an example of how an AI processing element outperforms a general-purpose CPU: an AI processor can recognize thousands of images per minute while using around 95 percent less power than a conventional loop-and-branch processor doing the same task. AI processors in today's smartphones deliver performance exceeding 26 trillion operations per second (TOPS). JPEG encoding and decoding, as well as video encoding and decoding, can be done hundreds of times faster using AI blocks than on a traditional CPU.
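To put a figure like 26 TOPS in perspective, a rough back-of-envelope calculation helps. The sketch below uses the 26 TOPS number from the text; the frame rate and per-inference cost are illustrative assumptions, not vendor specifications:

```python
# Back-of-envelope: operations budget per camera frame at 26 TOPS.
# The 26 TOPS figure is from the article; the frame rate and the
# per-inference cost below are assumed, order-of-magnitude values.

TOPS = 26                         # trillions of operations per second
ops_per_second = TOPS * 10**12

fps = 60                          # assumed camera preview frame rate
ops_per_frame = ops_per_second // fps

# A MobileNet-class image classifier needs very roughly 1e9
# multiply-accumulates (~2e9 ops) per inference -- an assumption.
ops_per_inference = 2 * 10**9
inferences_per_frame = ops_per_frame // ops_per_inference

print(f"Ops available per frame: {ops_per_frame:.3e}")
print(f"Classifier inferences possible per frame: {inferences_per_frame}")
```

Even under these rough assumptions, the NPU has budget for hundreds of full classifier passes within a single frame interval, which is why real-time camera and vision features are feasible on-device.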

By embedding an AI processing component in the camera module, it is possible to post-process the video signal for image stabilization, depth mapping, warping, and color correction. AI processors provide fast and accurate autofocus, auto-exposure, and auto white balance while taking pictures. Neural processors embedded in the camera module can optimize white balance and exposure, enhancing the realism and richness of skin tones in all lighting conditions. A microphone with an embedded neural network processor can likewise do wonders in language processing, speech recognition, and language translation.
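As an illustration of the arithmetic behind auto white balance, here is a minimal gray-world sketch in plain Python. The tiny "image" and the algorithm choice are purely illustrative; real camera pipelines are far more sophisticated and run on dedicated ISP/NPU hardware:

```python
# Minimal gray-world auto white balance sketch (illustrative only).
# Assumption behind the gray-world heuristic: under neutral lighting
# the scene average is gray, so each channel is scaled until the
# channel means match.

def gray_world_balance(pixels):
    """pixels: list of (r, g, b) tuples; returns a white-balanced copy."""
    n = len(pixels)
    avg_r = sum(p[0] for p in pixels) / n
    avg_g = sum(p[1] for p in pixels) / n
    avg_b = sum(p[2] for p in pixels) / n
    # Scale red and blue so their means match the green mean.
    gain_r, gain_b = avg_g / avg_r, avg_g / avg_b
    return [(min(255.0, p[0] * gain_r), p[1], min(255.0, p[2] * gain_b))
            for p in pixels]

# Tiny 2x2 image with a warm (reddish) color cast.
warm = [(200, 120, 80), (180, 110, 70), (190, 115, 75), (170, 105, 65)]
balanced = gray_world_balance(warm)
```

After balancing, the red and blue channel means equal the green mean, removing the warm cast; an NPU-based pipeline would instead learn scene-dependent gains rather than apply one global heuristic.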

In this highly volatile environment, the ability of smartphone SoC designers to integrate AI processing elements effectively and rapidly, while keeping the rest of the chip design cutting edge, defines the chip's success in the market.

The speed at which AI processors process data while consuming less power makes them quintessential elements of a chip. AI processors learn from complex data and anticipate future actions automatically, categorizing data based on patterns with only basic supervision.

They can be trained through trial and error, much as human beings learn, and can expand and improve over time without explicit programming.

By using deep learning neural networks, AI processors in smartphones can recognize images and objects with dependable accuracy. Deep learning identifies patterns in data for image, speech, and natural language processing.
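The core workload an NPU accelerates is the multiply-accumulate (MAC) operation at the heart of every neural network layer. A minimal sketch in plain Python of one fully connected layer with a ReLU activation, using made-up toy weights:

```python
# One fully connected neural-network layer with ReLU, written out
# explicitly to show the multiply-accumulate (MAC) pattern that an NPU
# executes thousands of times in parallel. Weights and inputs are toy
# values chosen for illustration.

def dense_relu(inputs, weights, biases):
    outputs = []
    for neuron_w, b in zip(weights, biases):
        acc = b
        for x, w in zip(inputs, neuron_w):
            acc += x * w               # the MAC: multiply, then accumulate
        outputs.append(max(0.0, acc))  # ReLU: clip negatives to zero
    return outputs

x = [0.5, -1.0, 2.0]                        # toy input vector
W = [[0.2, 0.4, 0.1], [-0.3, 0.1, 0.5]]     # 2 neurons x 3 inputs
b = [0.1, -0.2]
y = dense_relu(x, W, b)
```

A CPU runs these MACs a few at a time; an NPU lays out arrays of MAC units so whole layers execute in one pass, which is where the power-per-operation advantage comes from.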

In a virtual reality environment, near-photorealistic images are created using various graphics processing techniques. To deliver realistic-looking virtual reality, whether in games or in other VR applications, smartphone SoCs have started supporting innovative rendering capabilities such as ray tracing.
Evolution from CPU to APU: Early cell phones relied heavily on DSP chips from Analog Devices and TI. With ARM's arrival, SoCs were developed with CPU, GPU, and DSP cores. Today's chips pack a neural AI fabric as well: mobile SoCs now fully employ heterogeneous computing involving the CPU, GPU, neural processing elements, and accelerators. AI blocks need cache and other memory as close to them as possible, with fast read/write speeds. In-memory computing is emerging fast, but no major smartphone chip has yet disclosed its use.
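One way to picture heterogeneous computing is a scheduler that routes each workload to the compute unit best suited to it. The sketch below is purely conceptual; the routing table is invented for illustration and does not reflect any vendor's real scheduler:

```python
# Conceptual sketch of heterogeneous workload dispatch (illustrative only).
# The routing table is invented; real SoC schedulers also weigh power,
# thermal headroom, memory locality and latency, not just workload type.

ROUTING = {
    "control_flow":  "CPU",   # branchy, serial logic
    "pixel_shading": "GPU",   # wide data-parallel graphics work
    "nn_inference":  "NPU",   # dense multiply-accumulate tensor math
    "signal_filter": "DSP",   # streaming vector math
}

def dispatch(workloads):
    """Map each (name, kind) workload to a compute unit."""
    return {name: ROUTING.get(kind, "CPU")   # unknown kinds fall back to CPU
            for name, kind in workloads}

plan = dispatch([("ui_loop", "control_flow"),
                 ("game_frame", "pixel_shading"),
                 ("face_unlock", "nn_inference"),
                 ("mic_cleanup", "signal_filter")])
```

The design point is that each unit handles what it is most power-efficient at, with the CPU as the general-purpose fallback.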

The main challenges in AI chip development lie in training: finding datasets and writing an optimal inference algorithm. A lot of companies are emerging to fill this gap. SoC designers work closely with app developers to identify data with clearly defined patterns and behavior, and then build a dedicated processor component running a fixed math-function algorithm for that data.

AI processor integrated chips in the market:
The five main mobile phone chip makers, Apple, Qualcomm, Huawei, MediaTek and Samsung, are all in this game. There was a lot of talk about Huawei's Kirin 990 scoring better than Apple's A13 Bionic chipset, but with the A16, Apple has taken a big lead, according to somewhat older sources. Arm supports them all with its latest cores.

Apple's A16 Bionic SoC:
Apple Neural Engine: Apple's A-series chips, such as the latest A16 Bionic in the iPhone 14 Pro, include a 16-core Neural Engine designed to deliver 17 trillion operations per second (TOPS). The first version, in the Apple A11 Bionic, had a throughput of 0.6 TOPS. Apple claims the fastest CPU and GPU embedded in a smartphone. The chip also packs an image processing feature known as Deep Fusion, which uses machine learning to improve low- to medium-light photography.

Qualcomm's Hexagon DSP-based Snapdragon:
Qualcomm Hexagon DSP: Qualcomm Snapdragon SoCs incorporate the Hexagon Digital Signal Processor (DSP). The Hexagon architecture, originally optimized for mobile multimedia and communications, processes AI workloads efficiently. This VLIW-based DSP provides a dedicated AI engine, including vector processors and tensor accelerators, for tasks such as neural network inference, voice recognition, image enhancement, computer vision, and sensor data processing. For Qualcomm, it is an old horse evolving to meet today's requirements.

Huawei Neural Processing Unit (NPU):
Huawei's Kirin SoCs integrate NPUs specifically designed for AI tasks. The NPU offers high-performance AI computing, enabling features such as AI-powered photography, real-time object recognition, and on-device language translation on Huawei smartphones. The CPU in the Kirin 990 is a tri-cluster octa-core design, and the GPU is the 16-core Mali-G76 MP16.
Samsung's Exynos with dedicated NPU:
Samsung Neural Processing Unit (NPU): Samsung's Exynos processors employ dedicated NPUs to accelerate AI computations. The Exynos 2100 and Exynos 2200 SoCs feature a tri-core NPU, with the Exynos 2100 delivering up to 26 TOPS.

MediaTek's Dimensity with APU:
MediaTek APU (AI Processing Unit): MediaTek's Dimensity series of SoCs use the APU to handle AI workloads. The APU combines CPU, GPU, and dedicated AI processing elements to deliver optimized AI performance for tasks including image recognition and natural language processing. MediaTek's fifth-generation APU in the Dimensity 9000 uses AI noise reduction (AI-NR) that is over 16x faster than the previously used MFNR, within the same power envelope.
The sixth-generation APU powers the latest Dimensity 9200+, launched just a few days before this article was first published. Few details on the MediaTek APU are available in the public domain. The APU packs components such as deep learning accelerators, visual processing units (flexible cores), and a hardware-based multicore scheduler.
MediaTek was also first with a hybrid AI-GPU approach that pairs the GPU and APU so data goes to whichever is better suited. This hybrid processing reduces GPU load and/or provides new image quality (IQ) enhancement features, ensuring faster gameplay while turning up in-game visuals. MediaTek's HyperEngine 5.0 also introduces the first ray tracing solution on an Android platform using an AI-based post-processing denoiser in Vulkan.

Arm and other silicon IP vendors already offer AI engine IPs for specific processing workloads. Arm developed the Ethos-N78 NPU for machine learning acceleration; it scales up to 8 NPUs in a cluster and 64 NPUs in a mesh. The Arm Immortalis-G715 GPU can also be used for AI loads.

Author: Srinivasa Reddy N