In sum, what to know:
- Critical shift in data center builds: Power and efficiency are no longer an afterthought in data center design.
- U.S.-China semiconductor race: Optical interconnects between modules and silicon photonics could be game changers in the competition for dominance.
- Affordability and accessibility: Access to emerging technologies remains a challenge for enterprises amid the supply-chain crunch.
Before Tuesday’s panel discussion about “designing scalable AI clusters,” RCRTech talked with chip design engineer Raymond Chik, vice-chair of the IEEE AI Hardware & Infrastructure Working Group, about the insatiable need for power and energy in data center buildouts.
“Most people focus on speed and performance, but now the architecture has to be configured around power and efficiency,” said Chik. “Power, energy, and silicon are going to be key issues to focus on, with optical connectivity and silicon photonics becoming technologies of interest.”
In terms of compute, next-gen architecture has to be built around energy-efficient compute, as well as connectivity between chips and from board to board. “Architecturally, there are many ways to connect things, and not everything gives you the same power or performance, so the architecture has to be configured not only from a performance and latency perspective, but also from the perspective of power constraints,” said Chik.
That means a greater focus on minimizing energy consumption and on choosing connectivity technologies and network topologies that minimize traffic and align with the overall architecture.
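To make that trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. The traffic volume, cluster size, and per-bit energy figures are illustrative placeholders chosen for this sketch, not measured or vendor numbers, and the helper function is hypothetical.

```python
# Illustrative sketch: how interconnect choice shapes the energy cost of moving
# a fixed volume of traffic. All figures below are assumed placeholders.

TRAFFIC_PER_GPU_TB = 50          # assumed data moved per GPU over some training window, in terabytes
NUM_GPUS = 1024                  # assumed cluster size

# Assumed energy per bit moved across a link, in picojoules (illustrative only)
ENERGY_PJ_PER_BIT = {
    "copper (electrical SerDes)": 5.0,
    "optical (pluggable module)": 15.0,
    "co-packaged optics / silicon photonics": 3.0,
}

def interconnect_energy_kwh(traffic_tb: float, pj_per_bit: float) -> float:
    """Convert traffic volume and per-bit energy into kilowatt-hours."""
    bits = traffic_tb * 1e12 * 8            # terabytes -> bits
    joules = bits * pj_per_bit * 1e-12      # picojoules -> joules
    return joules / 3.6e6                   # joules -> kWh

total_traffic_tb = TRAFFIC_PER_GPU_TB * NUM_GPUS
for link, pj in ENERGY_PJ_PER_BIT.items():
    kwh = interconnect_energy_kwh(total_traffic_tb, pj)
    print(f"{link:40s} ~{kwh:6.2f} kWh for {total_traffic_tb:,} TB moved")
```

Swapping in different per-bit costs or traffic patterns makes the point of the quote concrete: the same workload lands very differently on the power budget depending on which link technology and topology carries the traffic.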
In terms of what Chik sees on the horizon for innovation, he said optical interconnects between modules (versus copper) and silicon photonics (optical I/O interconnects) for connectivity among chips and between boards and racks are the way to get higher bandwidth, lower power consumption, longer reach, and lower latency. As Moore’s Law slows and process sizes approach physical limits, it will soon no longer be feasible to rely on increasing transistor density for performance. That’s where Chik believes silicon photonics is going to innovate, with photons (light) rather than electrons carrying information, and photonic components fabricated directly onto silicon. It’s a technology that some believe will change the contours of the U.S.-China semiconductor race, with China investing heavily in its growing photonics industry in the hopes of overtaking the U.S. in semiconductors.
“The reality is that when you integrate silicon and photonics-connectivity elements on the same silicon chip, it can increase the chance that something will go wrong, and you can take away the ‘pluggability’ of the optics,” Chik cautioned. “You want to be able to unplug and replug.”
As a result, the use of light rather than electricity to transmit data, which allows for significantly higher bandwidth than electrical signals traveling through copper wires, is still nascent but rapidly evolving. Advancements keep coming in how silicon photonics can meet future data center demands, as with Imec and NLM Photonics recently achieving 400-gigabit-per-second-per-lane data rates.
The explosive growth of AI training clusters and other compute-intensive applications will continue to drive innovation toward higher bandwidth, higher performance, and improved efficiency. For example, Taiwan Semiconductor Manufacturing Company (TSMC) is working with Avicena to produce microLED-based interconnects.
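To put the per-lane figure in context, here is a quick arithmetic sketch; the lane and port counts are assumptions for illustration, not details from the Imec and NLM Photonics work.

```python
# Simple arithmetic sketch: what a per-lane rate means at the port and node level.
# Lane and port counts below are assumptions for illustration only.

LANE_RATE_GBPS = 400        # per-lane data rate cited in the article
LANES_PER_PORT = 8          # assumed lanes per optical port
PORTS_PER_NODE = 4          # assumed optical ports per accelerator node

port_bw_tbps = LANE_RATE_GBPS * LANES_PER_PORT / 1000
node_bw_tbps = port_bw_tbps * PORTS_PER_NODE
print(f"Per-port bandwidth: {port_bw_tbps:.1f} Tbps")   # 3.2 Tbps under these assumptions
print(f"Per-node bandwidth: {node_bw_tbps:.1f} Tbps")   # 12.8 Tbps under these assumptions
```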
As these technologies evolve, there will also be a push to make them more commercially available. For companies smaller than Nvidia, there is a bottleneck hampering access to high bandwidth memory. “You want high bandwidth between memory and AI compute, especially with LLMs, and that is a big challenge right now,” acknowledged Chik. “Nvidia and larger GPU companies are using high bandwidth memory (HBM) right now, but HBMs are very expensive, so if you are a smaller company, you can’t get access from Micron or major suppliers, as they are catering to the largest players right now.”
Because HBM is not meeting the needs of applications like LLMs, Chik and others are working on 3D stacked memory and logic chiplets. “When you combine individual chips into a single, vertically integrated package, you significantly increase performance, bandwidth, and power efficiency. It’s like building a lot of elevators in parallel to reduce latency and improve throughput.” He noted that HBM is itself a stacked technology, with four to eight dies on top of each other and an I/O die at the bottom that shifts bits out to the GPU horizontally, but now the compute die can sit directly below the memory stack. “You have better connectivity and you’re not limited to shipping your I/Os or data lines along the edge of the chip.”
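As a rough sketch of why vertical stacking relieves that edge bottleneck, the example below compares how many I/O sites fit along a die’s perimeter versus across its full area as through-silicon vias. The die dimensions and pitches are illustrative assumptions, not figures from Chik or any particular HBM product.

```python
# Illustrative sketch: edge (perimeter) I/O vs. area (3D-stacked) I/O.
# All dimensions and pitches are assumed round numbers for illustration only.

DIE_WIDTH_MM = 10.0          # assumed die width
DIE_HEIGHT_MM = 10.0         # assumed die height
EDGE_BUMP_PITCH_UM = 100.0   # assumed spacing of I/O sites along the die edge
TSV_PITCH_UM = 50.0          # assumed spacing of through-silicon vias over the die area

def edge_io_count(width_mm: float, height_mm: float, pitch_um: float) -> int:
    """I/O sites along the die perimeter (edge-limited, i.e. shipping data off-chip horizontally)."""
    perimeter_um = 2 * (width_mm + height_mm) * 1000
    return int(perimeter_um / pitch_um)

def area_io_count(width_mm: float, height_mm: float, pitch_um: float) -> int:
    """I/O sites across the full die area when dies are stacked vertically."""
    area_um2 = (width_mm * 1000) * (height_mm * 1000)
    return int(area_um2 / (pitch_um ** 2))

edge = edge_io_count(DIE_WIDTH_MM, DIE_HEIGHT_MM, EDGE_BUMP_PITCH_UM)
area = area_io_count(DIE_WIDTH_MM, DIE_HEIGHT_MM, TSV_PITCH_UM)
print(f"Edge-limited I/O sites: {edge:,}")        # 400 under these assumptions
print(f"Area (stacked) I/O sites: {area:,}")      # 40,000 under these assumptions
print(f"Ratio: ~{area / edge:.0f}x more parallel 'elevators' when stacking")
```

Under these assumed numbers, stacking offers roughly two orders of magnitude more connection points, which is the intuition behind the elevator analogy.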