The design can run a large neural network more efficiently than banks of GPUs wired together. But manufacturing and running the chip is a challenge, requiring new methods for etching silicon features, a design that includes redundancies to account for manufacturing defects, and a new water system to keep the giant chip cooled.
To build a cluster of WSE-2 chips capable of running record-sized artificial intelligence models, Cerebras had to solve another engineering challenge: how to get data in and out of the chip efficiently. Normal chips have their own memory on board, but Cerebras developed an off-chip memory box called MemoryX. The company also created software that allows a neural network to be stored partially in this off-chip memory, with only the computations streamed over to the silicon chip. And it built a hardware and software system called SwarmX that connects everything together.
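The idea of keeping parameters off-chip and shipping only the computation to the silicon can be sketched in a few lines. The code below is an illustrative toy, not Cerebras's actual API: `memory_x` and `stream_to_chip` are hypothetical stand-ins for the MemoryX store and the SwarmX fabric, and the "chip" is simulated with NumPy.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical off-chip parameter store: all layer weights live here,
# playing the role of a MemoryX-style memory box.
memory_x = [rng.standard_normal((64, 64)) * 0.01 for _ in range(8)]

def stream_to_chip(weights):
    """Stand-in for shipping one layer's weights over a SwarmX-style
    fabric; here it is simply a copy into simulated on-chip memory."""
    return weights.copy()

def forward(x):
    # Stream and apply one layer at a time, so only a single layer's
    # weights are resident "on chip" at any moment -- the on-chip
    # memory never has to hold the whole model.
    for layer_weights in memory_x:
        w = stream_to_chip(layer_weights)
        x = np.maximum(x @ w, 0.0)  # linear layer + ReLU
    return x

activations = forward(rng.standard_normal((4, 64)))
print(activations.shape)  # prints (4, 64)
```

The point of the pattern is that model size is bounded by the external store rather than by on-chip memory; the trade-off is the bandwidth needed to stream each layer's weights in time.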
“They can improve the scalability of large-scale training beyond what anyone does today,” says Mike Demler, senior analyst at the Linley Group and senior editor of The Microprocessor Report.
Demler says it’s still unclear how much of a market there will be for the cluster, mostly because some potential customers are already designing their own, more specialized chips. He adds that the actual performance of the chip, in terms of speed, efficiency, and cost, is still unclear. Cerebras has not published any benchmark results so far.
“There’s a lot of impressive engineering in the new MemoryX and SwarmX technologies,” Demler says. “But, like the processor, this is a very specialized thing; it only makes sense for training the very largest models.”
Cerebras chips have so far been adopted by laboratories that need supercomputing power. Early clients include Argonne National Labs, Lawrence Livermore National Lab, pharmaceutical companies such as GlaxoSmithKline and AstraZeneca, and what Feldman describes as “military intelligence” organizations.
This shows that the Cerebras chip can be used for more than powering neural networks; the calculations these laboratories run involve similarly massive parallel mathematical operations. “And they always have a thirst for more computing power,” says Demler, who adds that the chip could become important for the future of supercomputing.
David Kanter, an analyst at Real World Technologies and CEO of MLCommons, an organization that measures the performance of different algorithms and artificial intelligence hardware, says he sees a future market for much larger artificial intelligence models. “I usually tend to believe in data-centric ML [machine learning], so we want bigger data sets that allow us to build bigger models with more parameters,” says Kanter.