Cerebras interview pt. 2: why wafer scale over GPU?


Cerebras has been heavily featured in the news following the company’s blockbuster Nasdaq listing, one of the largest semiconductor IPOs in history at about $5.5 billion. Not long after, Cerebras reached a major performance milestone, running the massive 1T-parameter open-weight AI model Kimi K2.6 at 981 output tokens per second, 6.7 times faster than the next-fastest GPU-based cloud provider, according to Artificial Analysis.

In Part II of my RCRTV AI TechTalk with Cerebras co-founder Jean-Philippe Fricker, we delve into what “wafer scale” really means, not only for tokens per second but also for AI inference, thermal management and energy efficiency. For which workloads can wafer-scale architecture beat GPUs, and what could that mean for AI and LLMs, especially inference? Watch the interview and read the highlights here.

For insight into how the Cerebras founders stayed on course despite skeptics who, understandably, felt they were “crazy,” check out Part I here.


Susana Schwartz
Technology Editor
RCRTech

 

AI Infrastructure Top Stories

AI expansion grid: Pacific subsea networks are reshaping routing, with a reciprocal infrastructure deal between Telstra and Google, a new India-Singapore route from FLAG, and a record-breaking fiber transmission test from Japan’s NICT. 

Connectivity for AI sovereignty: Dedicated, high-capacity connectivity directly affects both the economics and competitiveness of large-scale AI facilities, says Craig Tavares, president and COO of BUZZ High Performance Computing.



AI Today: What You Need to Know

Alphabet’s $80 Billion AI Raise: Alphabet launched its first stock sale since 2005 to raise $80 billion in equity to scale AI infrastructure. Berkshire Hathaway is backing the expansion with a $10 billion private placement. 

Nvidia-Akamai partnership expansion: Targeting “AI factories,” Nvidia and Akamai announced that they will embed security directly into the infrastructure layer of AI systems, an effort to protect workloads moving to edge environments.

Intel AI Portfolio: CEO Lip-Bu Tan’s keynote at Computex focused on CPUs as the heart of modern AI infrastructure, along with rack-scale AI infrastructure news such as the commercial availability of Xeon 6+ DC processors built on the Intel 18A node.

Trump signs AI ‘security’ EO: The new executive order requires AI developers to give the federal government 30-day early access to advanced frontier models before public release – a departure from the administration’s laissez-faire approach.

8 data centers in Hood County?: Hood County, Texas, is a rural community of 62,000 people. Developers have proposed eight data centers spanning more than 7,600 acres, some of which might be powered by a new on-site gas plant.

Nvidia, TSMC bring AI into fabs: TSMC is using Nvidia accelerated computing and AI for semiconductor design and manufacturing, with CUDA-X libraries and AI models accelerating TSMC workloads across lithography, transistor and process simulation.

 
 

RCR Events

Telco AI Forum, June 16th
Telco AI Forum brings together operators, vendors, hyperscalers, and academia to explore how the evolution of the industry and partnership ecosystems is laying the foundations for AI-native 6G networks and unlocking ROI. Register now

Quantum Safe Networks Forum, July 14th
Quantum Safe Networks Forum brings together telecom operators, cybersecurity experts, and industry analysts to explore how to build resilient, future-ready infrastructure in the face of quantum disruption. Register now

RCR Roundtables AI Infrastructure, October 21st, Dallas, Texas
Join 50 senior data center, energy and AI leaders at the Ritz-Carlton Dallas on October 21 for invitation-only roundtables on powering and scaling AI. Request your invitation 

 

Industry Resources

Webinar, June 9th: Agentic RAN Management: Delivering OPEX efficiency and a path to 6G

Report: AI in testing: Developing trust, delivering results

Report: Test, measurement and service assurance in the AI era

Whitepaper: Powering sovereign AI at scale

Whitepaper: Scalable database design for 5G and beyond

Report: Scaling AIOPs from insight to action

Summit Access: GSMA Device Enablement Summit: How operators can fix device-network fragmentation
