
What is an AI chip? Storage and computing in one - AI chip architecture in the post-Moore era

 Integrated storage and computing, also called in-memory computing, shifts the traditional von Neumann architecture from a computation-centric design to a storage-centric one: the memory itself is used to perform computation. This avoids the "storage wall" and "power wall" created by moving data between memory and processor, and greatly improves parallelism and energy efficiency. The architecture is especially suitable for end devices that need high computing power at low power consumption, such as wearables, mobile devices, and smart-home devices.

 

1. Limitations of the von Neumann architecture

 

 The first is performance.

 

In the classical von Neumann architecture, data storage and computation are separated, and data is exchanged between the processor (CPU) and memory over the data bus. However, because processors and memories differ in internal structure, process technology, and packaging, their performance also differs greatly. Since 1980 the performance gap between processor and memory has kept widening: memory access speed falls far behind the CPU's data processing speed, which puts a "storage wall" between memory and processor and seriously restricts the overall performance of the chip.
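One common way to picture the storage wall is a roofline-style bound: attainable throughput is the smaller of the compute peak and the memory bandwidth multiplied by the work done per byte fetched. The sketch below is only an illustration. The function name and the chip figures (1000 GFLOP/s peak, 50 GB/s bandwidth) are assumptions chosen for round numbers, not data from this article.

```python
def attainable_gflops(peak_gflops, mem_bandwidth_gbs, flops_per_byte):
    """Roofline-style bound: throughput is capped by compute or by memory traffic."""
    return min(peak_gflops, mem_bandwidth_gbs * flops_per_byte)


# Hypothetical chip (assumed numbers): 1000 GFLOP/s peak, 50 GB/s memory bandwidth.
PEAK = 1000.0
BANDWIDTH = 50.0

for intensity in (0.25, 1.0, 4.0, 20.0, 100.0):
    perf = attainable_gflops(PEAK, BANDWIDTH, intensity)
    limiter = "memory" if perf < PEAK else "compute"
    print(f"{intensity:>6} FLOP/byte -> {perf:7.1f} GFLOP/s ({limiter}-bound)")
```

With these assumed figures, a workload has to perform at least 20 floating-point operations per byte fetched before the processor, rather than the memory, becomes the limit.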

 

The second is power consumption.

 

As mentioned above, because processor and memory are separate, data must first be carried from memory to the processor over the bus and, once processing is complete, carried back to memory for storage. The energy consumed by this data movement is 4 to 1000 times that of a floating-point operation. As semiconductor processes advance, overall power consumption decreases, but the share taken by data movement keeps growing. According to one study, in the 7nm era memory access and communication together account for more than 63% of a chip's total power consumption.
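To see why the movement-to-compute energy ratio matters so much, the following sketch normalises the energy of one floating-point operation to 1 and computes the share of total energy spent on data movement. It is a back-of-the-envelope model, not a measurement: `data_movement_share` and the `moves_per_op` parameter (how many off-chip transfers each operation needs on average) are hypothetical names introduced here.

```python
def data_movement_share(move_to_compute_ratio, moves_per_op=1.0):
    """Fraction of total energy spent moving data.

    Compute energy is normalised to 1 per operation; each operation is
    assumed to trigger `moves_per_op` transfers, each costing
    `move_to_compute_ratio` units of energy.
    """
    movement = move_to_compute_ratio * moves_per_op
    return movement / (movement + 1.0)


# The 4x to 1000x range quoted above, first with one transfer per operation,
# then with heavy data reuse (one transfer per 100 operations, an assumed value).
for ratio in (4, 100, 1000):
    no_reuse = data_movement_share(ratio)
    reuse = data_movement_share(ratio, moves_per_op=0.01)
    print(f"ratio {ratio:>4}: {no_reuse:.1%} without reuse, {reuse:.1%} with reuse")
```

Even in this simplified model, data movement dominates unless operands are reused many times once fetched, which is the motivation for bringing computation closer to, or into, the memory.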

 


Because of these storage wall and power wall bottlenecks, the traditional von Neumann architecture is no longer a good fit for AIoT scenarios centered on big-data computing, and new computing architectures are needed.

 


2. Solution ideas

 

For the design of new computing architectures, researchers have proposed a variety of solutions, which are broadly classified into three categories.

 

(1) High-bandwidth data communication, including optical interconnection and 2.5D/3D stacking.

 

High-bandwidth data communication alleviates the storage wall problem mainly by increasing communication bandwidth. Optical interconnect technology enables high-speed data transmission at lower power consumption. 2.5D/3D stacking technology stacks multiple chips together and raises communication bandwidth by widening the parallel interface or by using serial transmission.
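As a rough illustration of the two options just mentioned, the raw bandwidth of a link is the number of lanes times the rate per lane. The lane counts and per-lane rates below are made-up round numbers, not figures for any real product, and `link_bandwidth_gbytes` is a hypothetical helper.

```python
def link_bandwidth_gbytes(lanes, gbits_per_lane):
    """Raw link bandwidth in GB/s: lanes x per-lane rate (Gbit/s) / 8 bits per byte."""
    return lanes * gbits_per_lane / 8.0


# Hypothetical wide, slow parallel interface between stacked dies.
wide = link_bandwidth_gbytes(lanes=1024, gbits_per_lane=2.0)
# Hypothetical narrow, fast serial interface.
serial = link_bandwidth_gbytes(lanes=8, gbits_per_lane=32.0)
print(f"wide parallel: {wide:.0f} GB/s, fast serial: {serial:.0f} GB/s")
```

Stacking makes very wide interfaces practical because the connections between dies are short and dense, whereas serial links raise the rate per lane instead.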

 

(2) Near-storage computing.

 

The basic approach of near-storage computing is to store data as close as possible to the computing unit, thus reducing the latency and power consumption of data handling. Currently, the architecture of near-storage computing mainly includes multi-level cache architecture and high-density on-chip storage.

 

(3) Integrated storage and computing (in-memory computing), i.e., embedding the algorithm in the memory itself.

 

The core idea of in-memory computing is that, by embedding the algorithm in the memory cell itself, computation can be done inside the memory cell.

 

Power Consumption Comparison

 

A comparison of the power consumption of traditional off-chip storage, near-storage computing, and in-storage computing can be found in the following figure.

 

 


 

3. Memory-computing chip features

 

As mentioned earlier, the core idea of memory-computing integration is to embed the algorithm in the memory unit itself. Specifically, the weights of the AI model are stored in the memory cells, and the core memory circuit is designed so that, as data flows through it, the input data and the weights are multiplied and summed in the analog domain. This dot product is a multiply-accumulate of inputs with weights, which is the basic step of a convolution. Since convolution is a core component of deep learning algorithms, integrated storage and computation is well suited to deep learning. Because the weights never leave the memory, the architecture removes the latency of fetching them and greatly reduces power consumption, which makes it a true convergence of storage and computation. At the same time, because computation is fully coupled to storage, finer-grained parallelism can be exploited for higher performance and energy efficiency.
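The multiply-accumulate described above can be sketched numerically. The model below assumes an idealised memory array in which each cell stores a weight as a conductance, inputs are applied as row voltages, and each column sums the resulting currents. It ignores noise, nonlinearity, signed weights, and analog-to-digital conversion, and the names `crossbar_mac` and `conv1d_via_crossbar` are hypothetical.

```python
def crossbar_mac(voltages, conductances):
    """Column currents of an idealised array: I_j = sum_i V_i * G[i][j]."""
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


def conv1d_via_crossbar(signal, kernel):
    """Valid 1-D convolution (the sliding dot product used in deep learning).

    The kernel is stored once as a single array column; each window of the
    signal is applied as the input voltages.
    """
    k = len(kernel)
    column = [[w] for w in kernel]  # one stored weight per row
    return [
        crossbar_mac(signal[i:i + k], column)[0]
        for i in range(len(signal) - k + 1)
    ]


# Assumed example values: 3 inputs, 2 output columns.
weights = [[1.0, 0.5],
           [2.0, 1.0],
           [0.0, 3.0]]
inputs = [1.0, 2.0, 3.0]
print(crossbar_mac(inputs, weights))  # [5.0, 11.5]

print(conv1d_via_crossbar([1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 2.0]))  # [7.0, 10.0]
```

In this picture every column produces its dot product at the same time and the weights are never read out, which is where the parallelism and energy savings come from.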

 


 

4. Status of the integrated storage and computing chip

 

(1) Technology implementation method

 

According to whether the memory is volatile, implementations of in-memory computing can be roughly divided into two kinds.

 

Volatile: implementations based on SRAM and DRAM, whose processes are mature.

 

Non-volatile: implementations based on newer memories such as phase change memory (PCM), resistive memory or memristors (RRAM, also written ReRAM), floating-gate devices, or flash memory.

 

Volatile SRAM and DRAM are built on mature processes and are the main memory products commercially available today, so many vendors and research institutes have begun research on in-memory computing based on them. However, because memories and processors are currently made with different manufacturing processes, a good balance between processing performance and storage capacity has not yet been achieved.

 

Non-volatile memories include spin-transfer torque magnetic memory (STT-RAM), phase change memory (PCM), resistive memory (RRAM), and others. These memories have developed rapidly over the last decade or so, their capacity keeps increasing, and they lend themselves naturally to integrating computation and storage. However, because the corresponding vendors and processes are still immature, they remain some distance from real commercialization.

 

(2) Competitive landscape

 

In recent years, a number of domestic and international in-memory computing startups have emerged.

 

Some of the better-known foreign in-memory computing startups include Mythic and Syntiant, while the established giant Samsung has also developed its own in-memory computing technology based on HBM2 DRAM.

 

Domestic companies are even more numerous, including ZhiCun Technology (based on Flash), Flash Billion Semiconductor (based on memristor PLRAM), XinYi Technology (based on RRAM), HengShuo Semiconductor (based on NOR Flash), HouMo Intelligent (research directions include SRAM, MRAM, and RRAM), and JiuTianRexin (based on SRAM). In addition, there is Ali Pinto (3D bonding stack based on DRAM).
