Home General Various News The 5 technical challenges Cerebras overcame in constructing

The 5 technical challenges Cerebras overcame in constructing

217


Superlatives abound at Cerebras, the until-today stealthy next-generation silicon chip firm seeking to make coaching a deep studying mannequin as fast as shopping for toothpaste from Amazon. Launching after virtually three years of quiet growth, Cerebras launched its new chip immediately — and it’s a doozy. The “Wafer Scale Engine” is 1.2 trillion transistors (probably the most ever), 46,225 sq. millimeters (the biggest ever), and consists of 18 gigabytes of on-chip reminiscence (probably the most of any chip available on the market immediately) and 400,000 processing cores (guess the superlative).

CS Wafer Keyboard Comparison

Cerebras’ Wafer Scale Engine is bigger than a typical Mac keyboard (through Cerebras Systems)

It’s made a giant splash right here at Stanford University on the Hot Chips convention, one of many silicon business’s massive confabs for product introductions and roadmaps, with varied ranges of oohs and aahs amongst attendees. You can learn extra in regards to the chip from Tiernan Ray at Fortune and browse the white paper from Cerebras itself.

Superlatives apart although, the technical challenges that Cerebras needed to overcome to succeed in this milestone I believe is the extra attention-grabbing story right here. I sat down with founder and CEO Andrew Feldman this afternoon to debate what his 173 engineers have been constructing quietly simply down the road right here these previous few years with $112 million in enterprise capital funding from Benchmark and others.

Going massive means nothing however challenges

First, a fast background on how the chips that energy your telephones and computer systems get made. Fabs like TSMC take standard-sized silicon wafers and divide them into particular person chips by utilizing gentle to etch the transistors into the chip. Wafers are circles and chips are squares, and so there may be some fundamental geometry concerned in subdividing that circle into a transparent array of particular person chips.

One massive problem on this lithography course of is that errors can creep into the manufacturing course of, requiring intensive testing to confirm high quality and forcing fabs to throw away poorly performing chips. The smaller and extra compact the chip, the much less possible any particular person chip might be inoperative, and the upper the yield for the fab. Higher yield equals larger earnings.

Cerebras throws out the concept of etching a bunch of particular person chips onto a single wafer in lieu of simply utilizing the entire wafer itself as one gigantic chip. That permits all of these particular person cores to attach with each other straight — vastly dashing up the vital suggestions loops utilized in deep studying algorithms — however comes at the price of enormous manufacturing and design challenges to create and handle these chips.

CS Wafer Sean

Cerebras’ technical structure and design was led by co-founder Sean Lie. Feldman and Lie labored collectively on a earlier startup referred to as SeaMicro, which offered to AMD in 2012 for $334 million. (Via Cerebras Systems)

The first problem the crew bumped into based on Feldman was dealing with communication throughout the “scribe lines.” While Cerebras chip encompasses a full wafer, immediately’s lithography gear nonetheless has to behave like there are particular person chips being etched into the silicon wafer. So the corporate needed to invent new strategies to permit every of these particular person chips to speak with one another throughout the entire wafer. Working with TSMC, they not solely invented new channels for communication, but additionally needed to write new software program to deal with chips with trillion plus transistors.

The second problem was yield. With a chip protecting a whole silicon wafer, a single imperfection within the etching of that wafer might render the whole chip inoperative. This has been the block for many years on entire wafer know-how: as a result of legal guidelines of physics, it’s basically not possible to etch a trillion transistors with excellent accuracy repeatedly.

Cerebras approached the issue utilizing redundancy by including further cores all through the chip that might be used as backup within the occasion that an error appeared in that core’s neighborhood on the wafer. “You have to carry just one%, 1.5% of those…



Source hyperlink

LEAVE A REPLY

Please enter your comment!
Please enter your name here