Google yesterday touted its TurboQuant as a significant efficiency breakthrough for “extreme compression,” news that convinced many investors that physical memory and HBM might not be so crucial anymore and sent shares of companies such as SK Hynix, Samsung, Western Digital, SanDisk, and Micron lower. What’s the big deal? The single biggest hurdle for long-context AI is the KV cache bottleneck, which causes persistent GPU memory, scaling, and inference problems for LLMs. If TurboQuant truly delivers super-efficient AI memory compression, reducing the cache memory usage of LLMs by at least 6x and improving performance by 8x, all without loss of accuracy, then that’s a pretty big deal.

So far, these are early results, but by shrinking the “working memory,” TurboQuant is changing the fundamentals of how “memory” is stored. It does so using PolarQuant, which rotates data into a high-dimensional, grid-like structure so that information is stored in just 3 bits, instead of 16 or 8. That’s a vast improvement over simply squeezing numbers into small “buckets,” which sometimes leads to AI hallucinations. Another “trick” TurboQuant employs is the QJL (Quantized Johnson-Lindenstrauss) layer, which, rather than storing a whole memory, stores a 1-bit “shadow” of the original data: just the “corrections” needed to restore the accuracy of the model following the extreme data compression.

RCRTech will follow up on this innovation in a story next week, so check back for a deeper dive. In the meantime, click through to our “top stories” on Alibaba, Hive, and edge AI, and check out the day’s AI infrastructure news in “What you need to know.”
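For readers who want a feel for the compression idea described above, here is a toy sketch in Python. It is not PolarQuant, QJL, or anything from Google's code: it assumes a plain uniform 3-bit quantizer plus a generic 1-bit "sign of the leftover error" correction, and every function name in it is hypothetical. It only illustrates why a coarse code plus a tiny correction can land closer to the original values than the coarse code alone.

```python
import random

def quantize(values, bits):
    """Map floats to integer codes in [0, 2**bits - 1] on a uniform grid."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    step = (hi - lo) / levels or 1.0  # fall back to 1.0 if all values are equal
    codes = [round((v - lo) / step) for v in values]
    return codes, lo, step

def dequantize(codes, lo, step):
    """Turn integer codes back into approximate floats."""
    return [lo + c * step for c in codes]

def sign_correction(values, approx):
    """Keep 1 bit per value (the sign of the leftover error) plus one shared size."""
    residual = [v - a for v, a in zip(values, approx)]
    scale = sum(abs(r) for r in residual) / len(residual)  # mean absolute error
    signs = [1 if r >= 0 else -1 for r in residual]
    return signs, scale

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

# Hypothetical stand-in for one slice of cached values (not real model data).
random.seed(0)
vec = [random.gauss(0.0, 1.0) for _ in range(1024)]

# Step 1: coarse 3-bit codes (8 "buckets").
codes, lo, step = quantize(vec, bits=3)
coarse = dequantize(codes, lo, step)

# Step 2: 1-bit correction that nudges each value toward the original.
signs, scale = sign_correction(vec, coarse)
refined = [c + s * scale for c, s in zip(coarse, signs)]

print("error with 3-bit codes only:   ", mse(vec, coarse))
print("error with 1-bit correction too:", mse(vec, refined))
```

The corrected version is always at least as close as the coarse one here, because each value is nudged by the average error size in the right direction. The real methods reportedly work on rotated, high-dimensional data rather than on raw values, which this sketch does not attempt.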

Susana Schwartz
Technology Editor
RCRTech
AI Infrastructure Top Stories
Alibaba targets $100B AI, Cloud revenue: Alibaba CEO Eddie Wu told investors that demand for AI capabilities is expanding rapidly across industries, touting a full-stack approach, spanning chips, cloud infrastructure, models and applications.
Hive expands HPC infra: HIVE Digital Technologies has brought online a GPU-based AI cloud deployment in Asunción, Paraguay, marking the first operational cluster under its plan to expand HPC infrastructure in the country.
Reader forum on Edge AI: Paessler GmbH’s David Montoya, a manufacturing and IT/OT convergence expert, reveals why governance frameworks are crucial to Edge AI. He explores the risks of inefficiency, security gaps, and IT/OT conflicts.
AI Today: What You Need to Know
Arm claims a doubling of per-rack performance: The newly announced Arm AGI CPU, designed to address AI inference bottlenecks, shows Arm is pivoting toward high-performance AI data center silicon, with a focus on agentic AI workloads.
OpenAI boosts Broadcom projections: Broadcom projects its AI semiconductor revenue will double in 2026 to $8.2 billion, driven largely by its new customer OpenAI, which, like many companies, is looking to avoid Nvidia lock-in.
“Behind-the-Meter” Struggles: At S&P’s CERAWeek, industry leaders discussed the trend of data centers attempting to build their own on-site power generation to avoid grid delays, though some deem this a “temporary solution at best”.
Jobs for Microsoft’s 15 new DCs in WI: Microsoft secured approval for 15 new data centers at the former Foxconn site in Wisconsin, valued at over $13 billion. Here’s what 3 Wisconsin professors say it could mean for jobs in the area.
Meta job layoffs: Meta laid off approximately 700 employees yesterday across Facebook, global operations, recruiting, sales and Reality Labs. The company is refocusing capex on AI, with between $115 billion and $135 billion projected for 2026.
Another attempt to smuggle AI chips to China: A new case, yet again involving Supermicro, centers on a plot to smuggle $170 million worth of Nvidia chips to China. One Chinese national from Hong Kong and two U.S. citizens have been charged.
Upcoming Events
AI Infrastructure Forum 2025: This one-day virtual event will discuss the critical issues and challenges impacting the AI infrastructure ecosystem, examining the growth and evolution of the AI ecosystem as it scales and the need for flexible, sustainable solutions.
Industry Resources
Report: AI infrastructure will power the next economic revolution
Report: How to test and assure telco AI infrastructure
On-demand webinar: How to test and assure telco AI infrastructure