Researchers can now use the world’s largest chip, which supports GPT-3, for free.

Since ChatGPT swept the world, companies have opened their wallets to explore the technology, setting off a new wave of enthusiasm for artificial intelligence. The frenzy has also drawn attention to OpenAI’s GPT-3, the large language model (LLM) behind the phenomenal chatbot, which produces human-like responses to queries.

However, the hardware for training such models (mainly NVIDIA GPUs) is in short supply and expensive, so in practice only companies with deep pockets, such as Microsoft, which has built GPT-3-based AI into its Bing search engine, can use it. The Pittsburgh Supercomputing Center hopes to change that by giving researchers free access to the world’s largest chip to run Transformer models.

The Pittsburgh Supercomputing Center (PSC), a joint venture of Carnegie Mellon University and the University of Pittsburgh, is opening a new round of access to Cerebras’ CS-2 system, which is widely used to train large language models. The CS-2 is the AI accelerator in PSC’s Neocortex supercomputing system.

The CS-2 is built around Cerebras’ Wafer Scale Engine, considered the world’s largest chip, with 850,000 cores. The chip is used both for artificial intelligence (AI) and for traditional supercomputing, such as a decarbonization project at the National Energy Technology Laboratory, where it has been used to produce more accurate results.

PSC welcomes high-performance computing proposals mainly for artificial intelligence and machine learning workloads rather than traditional HPC applications, and the program encourages applicants to use AI algorithms to run traditional HPC workloads. Proposals must be non-proprietary and can only come from US research institutions or nonprofit organizations.

In addition to GPT-3, the CS-2 system also supports the popular BERT Transformer model.
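For context, here is a minimal sketch of what running BERT looks like in practice, using the open-source Hugging Face transformers library on ordinary hardware; this is generic PyTorch code, not the Cerebras-specific workflow, and the example sentence is illustrative:

```python
# Minimal, illustrative sketch of running BERT for masked-word prediction.
# Uses the open-source Hugging Face `transformers` library; this is generic
# code, not the Cerebras CS-2 workflow described in the article.
from transformers import pipeline

# `bert-base-uncased` is the standard publicly released BERT checkpoint.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

# BERT predicts the token hidden behind [MASK].
for candidate in fill_mask("The largest chip in the world has 850,000 [MASK]."):
    print(candidate["token_str"], round(candidate["score"], 3))
```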

Andrew Feldman, CEO of Cerebras Systems, said: "There is a lot of discussion about large language models and very large language models. However, beyond a given scale (for example, 13 billion parameters), the number of organizations that actually train a language model from scratch is very small, because you need to solve a lot of hardware problems first."

Generally, running Transformer models requires large amounts of data, data science knowledge, and deep machine learning expertise. Training can take several weeks, and the hardware options include GPUs and TPUs (Microsoft uses GPUs for AI in Bing, while Google uses TPUs for its Bard chatbot).
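To make the hardware dependency concrete, here is a rough sketch of a generic training step in PyTorch, showing where the accelerator enters the picture. The model, data, and hyperparameters are placeholders invented for illustration, not details from the article:

```python
# Sketch of a generic PyTorch training step, showing where the accelerator
# (GPU, or TPU/wafer-scale engine via other backends) enters the picture.
# Model, batch size, and learning rate are placeholders.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # pick the accelerator

model = nn.Linear(768, 2).to(device)          # tiny stand-in for a Transformer
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):                        # real LLM training runs for weeks
    x = torch.randn(32, 768, device=device)   # placeholder batch of features
    y = torch.randint(0, 2, (32,), device=device)
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
```

Every tensor and parameter in a loop like this lives on the accelerator, which is why the choice and availability of that hardware dominates the cost of training.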

PSC’s Neocortex has been used for AI research in biotechnology, cosmic mapping, and health sciences. The system is also used to study methods of deploying AI on the new memory model in the CS-2.

Cerebras’ WSE-2 chip in the CS-2 contains 2.6 trillion transistors, packing all the relevant AI hardware onto a single chip. The system has 40GB of on-chip SRAM, 20 petabytes per second of memory bandwidth, and twelve 100 Gigabit Ethernet ports. The CS-2 is connected to an HPE Superdome Flex system with 32 Intel Xeon Platinum 8280L chips, each with 28 cores running at up to 4GHz.

In contrast, NVIDIA GPUs are usually distributed across many systems. Nvidia’s GPUs rely on the CUDA parallel programming toolkit, which is widely used in AI research. Feldman said: "I think we have a great chance to replace GPUs and TPUs, and to do it in a more effective way, achieving better results with fewer errors."
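To illustrate that contrast, a minimal sketch of replicating a model across all local NVIDIA GPUs in PyTorch is shown below. It is illustrative only: real large-model training typically spans many machines with DistributedDataParallel or model parallelism, whereas on a CS-2 the whole model instead lives on the single wafer-scale chip:

```python
# Illustrative only: replicating a model across all local NVIDIA GPUs with
# PyTorch's DataParallel. Production LLM training uses more elaborate schemes
# (DistributedDataParallel, model/pipeline parallelism) across many nodes.
import torch
import torch.nn as nn

model = nn.Linear(768, 768)                 # placeholder model
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)          # splits each batch across the GPUs
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
```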

The demand for ChatGPT has already overwhelmed OpenAI’s systems, a sign that demand for AI computing will be overwhelming in the future.