Research Note: Improving Inference with NVIDIA’s ‘ICMS’ Inference Context Memory Storage Platform

NVIDIA Vera Rubin

At NVIDIA Live at CES 2026, NVIDIA introduced its Inference Context Memory Storage (ICMS) platform as part of its Rubin AI infrastructure architecture. ICMS addresses KV cache scaling challenges in LLM inference workloads.

The technology targets a specific gap in existing memory hierarchies: GPU high-bandwidth memory is too limited for growing context requirements, while general-purpose network storage introduces latency and power-consumption penalties that degrade inference efficiency.
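To see why KV cache capacity outgrows GPU memory, a back-of-envelope sizing helps. The sketch below uses the standard KV cache formula (2 tensors × layers × KV heads × head dimension × tokens × bytes per element); the model shape is an assumption chosen for illustration (roughly a 70B-class model with grouped-query attention), not a figure from NVIDIA's announcement.

```python
# Back-of-envelope KV cache sizing. Model shape is an illustrative
# assumption, not an NVIDIA-published configuration.

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    # Factor of 2: one tensor for keys, one for values.
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem

GIB = 1024 ** 3

# Assumed shape: 80 layers, 8 KV heads (GQA), 128-dim heads, FP16 values.
for ctx in (8_192, 128_000, 1_000_000):
    size = kv_cache_bytes(80, 8, 128, ctx, batch=1)
    print(f"{ctx:>9,} tokens -> {size / GIB:6.1f} GiB per request")
```

Even a single million-token request lands in the hundreds of gigabytes under these assumptions, well beyond one GPU's HBM, which is the gap a dedicated context-memory tier is meant to fill.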

Research Note: AWS Releases Trainium3, Teases Trainium4

AWS Trainium3

At its recent AWS re:Invent event, AWS moved its custom AI accelerator strategy into a new phase with the general availability of EC2 Trn3 UltraServers based on the Trainium3 chip and the public preview of its next-generation Trainium4.

Research Note: HPE Announcements at Discover Barcelona 2025

At HPE Discover Barcelona 2025, Hewlett Packard Enterprise centered its announcements on three core pillars of enterprise infrastructure: networking for AI workloads, hybrid cloud and virtualization enhancements, and AI infrastructure systems at scale.

Research Note: Dell Adds 20+ Features to its AI Factory

Dell AI

Dell Technologies announced more than 20 updates to its AI Factory portfolio ahead of next week’s SC25 event, spanning compute, storage, networking, and cooling infrastructure. The announcements center on three primary themes: expanded support for NVIDIA Blackwell GPUs across multiple server platforms, introduction of AMD MI355X-based systems, and deeper integration of automation tools across the infrastructure stack.

Research Note: IBM and AMD Collaborate on Classical-Quantum Computing

IBM Heron Quantum Processor

IBM and AMD recently announced a strategic collaboration to develop quantum-centric supercomputing architectures that combine quantum computers with high-performance computing infrastructure. The partnership is based on a memorandum of understanding between the companies, with no immediate financial exchange.

Research Note: HPE’s Updated AI Factory

At its recent Discover event, HPE announced an expansion of its NVIDIA-based AI Computing portfolio with three distinct AI factory configurations targeting enterprise, service provider, and sovereign deployment scenarios. The offerings center on the upgraded HPE Private Cloud AI platform, which integrates NVIDIA Blackwell GPUs with HPE ProLiant Gen12 servers, custom storage solutions, and orchestration software.

Research Note: AMD Raises its Game at its Advancing AI 2025 Event

AMD Instinct MI350

AMD announced a comprehensive portfolio of AI infrastructure solutions at its recent Advancing AI 2025 event, positioning itself as a full-stack competitor to NVIDIA.

The announcements include the immediate availability of MI350 Series GPUs with 4x generational performance improvements, the ROCm 7.0 software platform achieving 3.5x gains in inference, and the AMD Developer Cloud for broader ecosystem access.

AMD also previewed its 2026 “Helios” rack solution, which integrates MI400 GPUs, EPYC “Venice” CPUs, and Pensando “Vulcano” NICs.

Research Note: NVIDIA NIM Agent Blueprints

NVIDIA

NVIDIA launched its new NIM Agent Blueprints, a catalog of pre-trained, customizable AI workflows to help enterprise developers quickly build and deploy generative AI applications for critical use cases, such as customer service, drug discovery, and data extraction from PDFs.

Research Note: IBM Telum II & Spyre AI Accelerators

IBM Telum II & Spyre

At the Hot Chips 2024 conference in Palo Alto, California, IBM unveiled the next generation of its enterprise AI solutions: the IBM Telum II processor and the IBM Spyre Accelerator. Both are designed to meet the demands of the AI era with enhanced performance, scalability, and AI capabilities, and both are expected to be available in 2025.

Research Note: Inside the IBM Research NorthPole Accelerator

IBM Research

IBM Research has developed and released details on a groundbreaking AI chip called NorthPole, which could revolutionize AI hardware systems. Unlike traditional computer chips, NorthPole integrates processing units and memory on the same chip, eliminating the von Neumann bottleneck and significantly improving efficiency.
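The von Neumann bottleneck claim can be made concrete with a simple roofline-style estimate: when weights live off-chip, token latency is bound by memory bandwidth rather than compute. All numbers below are illustrative assumptions, not IBM-published NorthPole specifications.

```python
# Toy roofline estimate of the von Neumann bottleneck that co-locating
# memory and compute is meant to avoid. Bandwidth and FLOP figures are
# assumptions for illustration only.

def step_time_seconds(flops, bytes_moved, peak_flops, mem_bw):
    # Execution time is bounded by the slower of compute and data movement.
    return max(flops / peak_flops, bytes_moved / mem_bw)

# Assumed workload: a 1B-weight int8 model generating one token
# (~2 FLOPs per weight; every weight crosses the memory interface once).
flops = 2e9
weights_bytes = 1e9

# Same compute throughput; only the memory tier changes.
t_offchip = step_time_seconds(flops, weights_bytes, peak_flops=100e12, mem_bw=100e9)
t_onchip = step_time_seconds(flops, weights_bytes, peak_flops=100e12, mem_bw=10e12)
print(f"off-chip DRAM bound: {t_offchip * 1e3:.2f} ms/token")
print(f"on-chip SRAM bound:  {t_onchip * 1e3:.3f} ms/token")
```

Under these assumed numbers the off-chip case is memory-bound by two orders of magnitude, which is the efficiency argument behind NorthPole's on-chip memory integration.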