A new brain-inspired chip for faster artificial intelligence

by time news

2023-10-20 12:48:27

Researchers at the IBM Research Almaden laboratory (California) have presented the NorthPole chip in the latest issue of Science. It is a brain-inspired architecture that combines computing with memory to process data efficiently at low energy cost.

Since its inception, computing has focused on the processor, separating memory from calculation. However, transporting large amounts of data between memory and computation comes at a high cost in terms of power consumption, bandwidth, and processing speed.

This is especially evident for emerging and advanced real-time artificial intelligence (AI) applications, such as facial recognition, object detection, and behavioral monitoring, which require rapid access to large amounts of data.

As a result, most contemporary computing architectures are rapidly reaching physical and processing bottlenecks and are at risk of becoming economically, technically and environmentally unsustainable given the increasing energy costs they entail.

However, the new prototype chip, which has been in development for almost two decades, has the potential to dramatically change the way in which powerful AI hardware systems can be effectively scaled, the authors of the work note.

Since the birth of the semiconductor industry, computer chips have mainly followed the same basic structure, in which the processing units and the memory that stores the information to be processed are kept separate.

The NorthPole chip on a PCIe card. / IBM Research

Although this structure has allowed for simpler and more scalable designs over the decades, it has created what is called the von Neumann bottleneck, in which it takes time and energy to continually exchange data between memory, processing, and any other devices on a chip.

The work of Dharmendra Modha, from IBM Research, and his colleagues aims to change this by taking inspiration from the way the brain computes. “It forges a completely different path than the von Neumann architecture,” according to Modha.

For the past eight years, Modha has been working on a new type of digital AI chip for neural inference, which he calls NorthPole. It is an extension of TrueNorth, the last brain-inspired chip that Modha worked on before 2014. In tests with the popular ResNet-50 image recognition model and the YOLOv4 object detection model, the new prototype chip has demonstrated greater energy and space efficiency and lower latency than any other chip currently on the market, and it is about 4,000 times faster than TrueNorth, according to IBM Research.

“Breakthrough in chip architecture”

According to Modha, NorthPole is a breakthrough in chip architecture that offers huge improvements in energy, spatial and temporal efficiency. Using the ResNet-50 model, the chip is significantly more efficient than 12nm graphics processing units (GPUs) and central processing units (CPUs). In both cases, the prototype is 25 times more energy efficient, measured as the number of frames processed per joule of energy required.

The chip has been manufactured using a 12nm node process and contains 22 billion transistors in 800 square millimeters. It has 256 cores and can perform 2,048 operations per core per cycle with 8-bit precision, with the potential to double and quadruple the number of operations with 4-bit and 2-bit precision, respectively.
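The throughput figures quoted above reduce to simple arithmetic. The following Python sketch multiplies them out. The constant and function names are illustrative only, and the clock frequency is left as a caller-supplied parameter because the article does not state it.

```python
# Figures quoted in the article; the names are illustrative, not IBM's.
CORES = 256
OPS_PER_CORE_PER_CYCLE_8BIT = 2_048
TRANSISTORS = 22_000_000_000
DIE_AREA_MM2 = 800

# Throughput multiplier relative to 8-bit precision, as described in the article.
PRECISION_SCALE = {8: 1, 4: 2, 2: 4}


def ops_per_cycle(bits: int) -> int:
    """Peak operations per clock cycle across all cores at the given precision."""
    if bits not in PRECISION_SCALE:
        raise ValueError(f"unsupported precision: {bits} bits")
    return CORES * OPS_PER_CORE_PER_CYCLE_8BIT * PRECISION_SCALE[bits]


def ops_per_second(bits: int, clock_hz: float) -> float:
    """Peak operations per second; clock_hz is an assumption supplied by the caller."""
    return ops_per_cycle(bits) * clock_hz


def transistors_per_mm2() -> float:
    """Average transistor density over the whole die."""
    return TRANSISTORS / DIE_AREA_MM2


if __name__ == "__main__":
    for bits in (8, 4, 2):
        print(f"{bits}-bit: {ops_per_cycle(bits):,} operations per cycle")
    print(f"density: {transistors_per_mm2():,.0f} transistors per mm^2")
```

At 8-bit precision this works out to 524,288 operations per cycle across the whole chip, and an average density of 27.5 million transistors per square millimeter. These are peak figures derived from the quoted specifications, not measured performance.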

NorthPole also outperformed them on latency and on the space needed to compute, measured in frames processed per second per billion transistors required. According to Modha, on ResNet-50 the chip outperforms mainstream architectures, even those that use more advanced process technology, such as a GPU implemented using a 4nm process.
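The two efficiency metrics mentioned in this section, frames per joule and frames per second per billion transistors, can be written as short formulas. The Python sketch below is only an illustration of how such ratios are computed. The function names are hypothetical, and the numbers in the usage example are placeholders chosen to reproduce a 25x ratio, not measurements from the paper.

```python
def frames_per_joule(frames: float, energy_joules: float) -> float:
    """Energy efficiency: frames processed per joule of energy consumed."""
    if energy_joules <= 0:
        raise ValueError("energy_joules must be positive")
    return frames / energy_joules


def fps_per_billion_transistors(frames_per_second: float, transistors: float) -> float:
    """Space efficiency: throughput normalized by transistor count (in billions)."""
    if transistors <= 0:
        raise ValueError("transistors must be positive")
    return frames_per_second / (transistors / 1e9)


def relative_gain(candidate: float, baseline: float) -> float:
    """How many times better the candidate scores than the baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return candidate / baseline


if __name__ == "__main__":
    # Placeholder values, NOT measured data: same workload, different energy use.
    chip = frames_per_joule(frames=500.0, energy_joules=4.0)
    baseline = frames_per_joule(frames=500.0, energy_joules=100.0)
    print(f"relative energy efficiency: {relative_gain(chip, baseline):.0f}x")
```

Normalizing by transistor count rather than by die area lets chips built on different process nodes be compared on roughly equal terms, which is presumably why the comparison with a 4nm GPU is expressed this way.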

Modha, in the center, with most of the team working on NorthPole.

One of the biggest advantages of NorthPole, the authors explain, is that all of the device’s memory is on the chip itself, rather than connected separately. Without that von Neumann bottleneck, it can perform AI inference considerably faster than other chips already on the market.

Blurring the line between compute and memory

“The NorthPole architecture blurs the line between compute and memory,” says Modha. “At the individual core level, it appears as ‘near compute memory’ and from off-chip, at the I/O level, it appears as active memory.” This makes it easy to integrate into systems and significantly reduces the load on the host machine.

But the chip’s biggest advantage is also a limitation: it can only easily draw on the memory it has built in. Any possible speed increase on the chip would be diminished if it had to access the information from another location.

Thanks to an approach called ‘scale-out’, NorthPole can support larger neural networks by dividing them into smaller subnetworks that fit into the chip’s model memory and connecting these subnetworks together across multiple NorthPole chips.
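The general idea of splitting a network so that each piece fits in one chip's memory can be illustrated with a simple greedy partition. This Python sketch is an assumption-laden toy, not IBM's actual mapping method: it treats the network as a plain sequence of layers, and the layer sizes, the capacity value and the function name `partition_layers` are all hypothetical.

```python
from typing import List


def partition_layers(layer_sizes: List[int], chip_capacity: int) -> List[List[int]]:
    """Greedily group consecutive layers so each group fits in one chip's memory.

    layer_sizes: memory footprint of each layer, in arbitrary units (hypothetical).
    chip_capacity: memory available on a single chip, in the same units (hypothetical).
    Returns a list of groups, each a list of layer indices assigned to one chip.
    """
    groups: List[List[int]] = []
    current: List[int] = []
    used = 0
    for index, size in enumerate(layer_sizes):
        if size > chip_capacity:
            raise ValueError(f"layer {index} does not fit on a single chip")
        if used + size > chip_capacity:
            # The current chip is full: close this group and start a new one.
            groups.append(current)
            current, used = [], 0
        current.append(index)
        used += size
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    # Five layers with made-up sizes, and a made-up capacity of 8 units per chip.
    print(partition_layers([4, 3, 5, 2, 6], chip_capacity=8))
```

Keeping layers in their original order means each chip only needs to pass its output to the next one in the chain. A real mapping would also have to account for activations, inter-chip bandwidth and layers too large for a single chip, which this sketch simply rejects.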

So, although there is enough memory in one of these chips (or a set of them) for many of the models useful for specific applications, this chip is not intended to be an all-rounder. “We can’t run GPT-4 with it, but we can run many of the models that companies need,” says Modha.

This efficiency means that the device also does not need bulky liquid cooling systems to operate – fans and heatsinks are more than enough – meaning it could be deployed in some smallish spaces.

While research on the NorthPole chip is ongoing, its structure lends itself to emerging AI use cases as well as more established ones.

Self-driving cars, robotics and satellites

In testing, the NorthPole team focused primarily on uses related to computer vision, in part because funding for the project came from the US Department of Defense.

Some of the main applications were detection, image segmentation and video classification. But it was also tested in other areas, such as natural language processing (with the encoder-only BERT model) and speech recognition (with DeepSpeech2). The team is currently exploring mapping large decoder-only language models to scalable NorthPole systems.

The developers believe this chip could also be used in many types of edge applications that require massive amounts of real-time data processing. “For example,” they comment, “it could be the type of device needed for autonomous vehicles to go from machines that need maps and fixed routes to operate on a small scale to ones capable of thinking and reacting to the extreme situations that make real-world navigation a challenge even for expert human drivers.”

It could also be used in satellites that monitor agriculture and manage wild animal populations, control vehicles and the transport of goods, handle robots safely and detect cyber threats, the authors highlight.

Reference:

Dharmendra Modha et al. “Neural inference at the frontier of energy, space, and time”. Science (October 2023)
