Artificial intelligence (AI) is driving a fundamental shift in how computing power is deployed and consumed, as enterprises redesign their digital infrastructure to support new types of workloads and smaller, specialized models.
This transition toward distributed and edge-based computing is redefining strategies around cost, location, and scalability, Phil Wong, partner at KPMG, told RCR Wireless News.
“As enterprises move from initial experimentation and then users of AI to more pervasive use of AI, agentic AI, we start to see the workload, the type of workload, change,” Wong said. “Moving more and more from large computation to closer and closer to the edge, in order to reduce latency and drive speed in terms of computation as well as storage.”
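To make the latency trade-off Wong describes concrete, here is a minimal sketch of latency-aware routing between cloud, edge, and on-device inference endpoints. The endpoint names, latency figures, and the `route_request` helper are illustrative assumptions for this article, not a description of any KPMG or vendor system.

```python
# Minimal sketch: pick the inference site with the lowest total latency
# that still meets a response deadline. All figures are hypothetical.
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    network_latency_ms: float   # round-trip time to the site
    compute_latency_ms: float   # model execution time at the site


def route_request(endpoints: list[Endpoint], deadline_ms: float) -> Endpoint:
    """Choose the endpoint with the lowest total latency that meets the
    deadline; fall back to the fastest overall if none does."""
    def total(e: Endpoint) -> float:
        return e.network_latency_ms + e.compute_latency_ms

    within_deadline = [e for e in endpoints if total(e) <= deadline_ms]
    pool = within_deadline or endpoints
    return min(pool, key=total)


if __name__ == "__main__":
    sites = [
        Endpoint("hyperscale-region", network_latency_ms=45.0, compute_latency_ms=20.0),
        Endpoint("metro-edge", network_latency_ms=8.0, compute_latency_ms=35.0),
        Endpoint("on-device", network_latency_ms=0.0, compute_latency_ms=120.0),
    ]
    choice = route_request(sites, deadline_ms=50.0)
    print(f"Routing to {choice.name}")  # metro-edge: 43 ms total, under the deadline
```

Under these assumed numbers, the metro edge site wins even though the hyperscale region has faster hardware, which is the dynamic pushing workloads "closer and closer to the edge."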
He explained that this shift reflects both technical evolution and business adaptation. “Some of the models can actually be run in smaller locations, smaller sets of computation resources closer to the client or customers,” Wong said. “The other thing that we are now also seeing is some enterprises, especially the ones who are more sophisticated and advanced, are now creating their own small language models that are trained and learn in the context — whether it’s the process or the organization.”
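The pattern Wong points to, adapting a small language model to an organization's own processes, typically amounts to fine-tuning a compact base model on internal text. The sketch below assumes the Hugging Face `transformers` and `datasets` libraries; the `distilgpt2` model choice and the `internal_docs.txt` file are placeholders, not anything Wong or KPMG specified.

```python
# A minimal fine-tuning sketch for a small, organization-specific language
# model. Assumes: pip install transformers datasets, and a plain-text file
# of internal documents (one passage per line). All names are illustrative.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "distilgpt2"  # any small causal LM works for this pattern
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2-family models lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Hypothetical corpus of in-house process documentation.
dataset = load_dataset("text", data_files={"train": "internal_docs.txt"})


def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)


tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="slm-checkpoint",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    # mlm=False makes the collator build causal-LM labels from the inputs.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

A model this size can then be served from the "smaller sets of computation resources" Wong mentions, rather than requiring hyperscale capacity.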
This diversification of model types — from massive multimodal systems to task-specific small language models — is driving a more distributed computing environment. “Certainly, the hyperscaler, the large computation, continues to be required,” Wong added. “But you see more and more of the workload getting distributed to the edge, in some cases on devices as well.”
The expansion of AI infrastructure, however, comes with significant financial and logistical challenges. “If you are data center operators or hyperscalers who are actually investing in the physical infrastructure, there’s certainly a lot of cost that is still tied to the acquisition of the land, the construction or the permitting that’s required, and then on top of that, the connectivity, the power, and then the actual equipment,” Wong said.
He noted that supply chain issues are also pushing up costs. “We’re starting to see some constraints, whether it’s constraint[s] of sourcing power or getting power connected to these sites, or supply chain constraints — whether it’s just availability of GPUs or, in some cases, construction materials,” he explained.
For enterprises that consume AI computing capacity rather than build it, the challenge is managing costs efficiently. “From an enterprise perspective, a lot of the cost is around managing the computation and managing the storage costs,” Wong said.
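A back-of-the-envelope model illustrates the computation-versus-storage budgeting Wong attributes to enterprise consumers of AI capacity. The rates, GPU-hour counts, and the centralized-versus-split comparison below are illustrative assumptions, not market prices or a KPMG methodology.

```python
# Illustrative monthly cost comparison: fully centralized inference versus
# splitting half the workload across three edge sites. All rates hypothetical.
def monthly_cost(gpu_hours: float, gpu_rate: float,
                 storage_gb: float, storage_rate: float) -> float:
    """Total monthly spend: compute (GPU-hours) plus storage (GB-months)."""
    return gpu_hours * gpu_rate + storage_gb * storage_rate


# Centralized: all 2,000 GPU-hours and 10 TB of data in one region.
central = monthly_cost(gpu_hours=2_000, gpu_rate=2.50,
                       storage_gb=10_000, storage_rate=0.023)

# Split: half the compute moves to cheaper edge capacity, but a 2 TB working
# set must be replicated at each of three edge sites at a higher storage rate.
split = (monthly_cost(gpu_hours=1_000, gpu_rate=2.50,
                      storage_gb=10_000, storage_rate=0.023)
         + monthly_cost(gpu_hours=1_000, gpu_rate=1.80,
                        storage_gb=3 * 2_000, storage_rate=0.05))

print(f"centralized: ${central:,.2f}/mo  split: ${split:,.2f}/mo")
```

The point of the exercise is not the specific numbers but the structure: distributing workloads can lower compute spend while raising replicated-storage costs, which is why managing both together is the enterprise challenge Wong identifies.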
Beyond cost, regulatory complexity also remains a concern. “It comes back to permitting. It comes back to working with the local communities and utilities to get access to power,” he added.