SC25: Beyond Supercomputing

Supercomputing 2025 delivered a clear message to enterprise IT leaders: the infrastructure conversation has fundamentally changed. The announcements from SC25 weren't about incremental component upgrades; they were about architectural transformation.

From rack-scale designs to quantum integration to facility-level engineering, the building blocks of large-scale AI and HPC systems are being reimagined.

NVIDIA Expands Into Quantum-GPU Integration

NVIDIA introduced NVQLink, enabling microsecond-scale connections between GPUs and quantum processors. Combined with its Quantum-X Photonics platform, this represents a significant step toward hybrid quantum-classical computing architectures.

Quantum computing is transitioning from a research curiosity to a practical accelerator. National labs are already incorporating quantum processors into their supercomputing roadmaps. Enterprise organizations should begin tracking quantum integration timelines, particularly for optimization and materials science workloads.
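To make the coupling concrete, here is a minimal sketch of the loop these links accelerate: a hybrid variational algorithm in which a classical optimizer repeatedly dispatches a parameterized circuit to a QPU and updates its parameters from the measured result. The sketch is deliberately SDK-agnostic; `qpu_expectation` is a hypothetical stand-in for a real quantum call (in NVIDIA's stack, CUDA-Q would fill that role), and the noisy cosine landscape is purely illustrative.

```python
import math
import random

def qpu_expectation(theta: float) -> float:
    """Hypothetical stand-in for running a parameterized circuit on a QPU
    and returning a measured expectation value; the cosine landscape plus
    shot noise is purely illustrative."""
    return math.cos(theta) + random.gauss(0.0, 0.1)

def hybrid_variational_loop(steps: int = 200, lr: float = 0.1) -> float:
    """Classical optimizer steering a quantum subroutine.

    Each step makes two QPU round trips (parameter-shift rule), so
    wall-clock time is dominated by the latency of the classical-to-QPU
    link, which is exactly the path NVQLink aims to cut to microseconds.
    """
    theta = random.uniform(0.0, 2.0 * math.pi)
    for _ in range(steps):
        # Parameter-shift gradient estimate: exact for a cosine observable.
        grad = (qpu_expectation(theta + math.pi / 2)
                - qpu_expectation(theta - math.pi / 2)) / 2.0
        theta -= lr * grad  # classical update between quantum dispatches
    return theta

if __name__ == "__main__":
    print(f"parameter after optimization: {hybrid_variational_loop():.2f}")
```

At 200 steps and two dispatches per step, the loop above makes 400 round trips; at millisecond latencies that is nearly half a second of pure waiting, which is why microsecond-scale links change which hybrid algorithms are practical.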

Dell Technologies Unveils Comprehensive AI Infrastructure Stack

Dell showcased end-to-end AI infrastructure solutions featuring 102.4 Tb/s networking switches, integrated storage, validated reference architectures, and AI automation capabilities.

The company emphasized NVIDIA-optimized systems and complete, tested solutions rather than component sales.

Storage Vendors Position for AI-Centric Workloads

DDN introduced DDN Core, a unified data engine designed for AI factory architectures.

WEKA announced next-generation WekaPod appliances for AI storage and unveiled an augmented memory grid on NeuralMesh for scalable inference workloads. WEKA’s memory grid approach addresses a critical inference bottleneck by extending available memory across the fabric.
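The mechanics are easiest to see in miniature. Below is a toy two-tier cache in the spirit of that design, assuming a small fast tier standing in for GPU memory backed by a large fabric-attached tier, with least-recently-used blocks spilling outward. The class and method names are hypothetical, not WEKA's published interface; production systems hold inference state such as KV caches and move it over RDMA paths, not Python dicts.

```python
from collections import OrderedDict

class TieredKVCache:
    """Toy two-tier cache illustrating the 'augmented memory' idea: a small
    fast tier (GPU HBM) backed by a large fabric-attached tier. Names and
    the eviction policy are illustrative, not any vendor's API."""

    def __init__(self, fast_capacity: int):
        self.fast = OrderedDict()   # stands in for GPU memory (LRU order)
        self.fabric = {}            # stands in for the fabric-attached tier
        self.fast_capacity = fast_capacity

    def put(self, key, value):
        self.fast[key] = value
        self.fast.move_to_end(key)
        # Spill least-recently-used blocks over the fabric when HBM fills.
        while len(self.fast) > self.fast_capacity:
            old_key, old_val = self.fast.popitem(last=False)
            self.fabric[old_key] = old_val

    def get(self, key):
        if key in self.fast:              # HBM hit: fastest path
            self.fast.move_to_end(key)
            return self.fast[key]
        if key in self.fabric:            # fabric hit: slower, but far
            value = self.fabric.pop(key)  # cheaper than recomputing the
            self.put(key, value)          # inference state from scratch
            return value
        return None                       # miss: caller must recompute

# Example: hold state for more sequences than HBM alone could fit.
cache = TieredKVCache(fast_capacity=2)
for seq_id in range(4):
    cache.put(f"seq-{seq_id}", f"kv-block-{seq_id}")
assert cache.get("seq-0") is not None   # served from the fabric tier
```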

Storage is transitioning from a capacity-driven conversation to a data workflow conversation. The focus is shifting to metadata scale, dataset versioning, high-throughput parallel access, and low-latency support for training, checkpoint, and inference operations.

Infrastructure Fundamentals Move to the Forefront

Multiple vendors highlighted advances in optical networking, liquid cooling systems, power delivery architectures, and cable plant design. The focus reflected growing recognition that traditional data center infrastructure cannot support next-generation AI workloads.

Power, cooling, and networking are now first-order design constraints for AI infrastructure. Organizations planning significant AI deployments need early coordination between compute architects and facilities engineering teams. Infrastructure readiness—not just GPU availability—will determine deployment timelines.

SC25 Strategic Themes

Several key themes emerged from the event:

Rack-Scale Architecture Is the New Standard

The fundamental unit of AI infrastructure has shifted from the server to the rack. NVIDIA’s NVL72 architecture, combined with OEM implementations from Dell, Supermicro, ASUS, and MiTAC, establishes fully integrated racks as the baseline for large-scale deployments.

These systems combine compute, networking, cooling, and power delivery into unified, validated configurations.

Implication: Infrastructure planning must now address rack-level throughput, fabric utilization, and thermal management. Organizations thinking in server-level terms will face integration challenges and suboptimal performance.
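A back-of-the-envelope sketch shows why the rack is the right planning unit. The figures below are illustrative assumptions drawn from publicly cited NVL72-class numbers (72 GPUs, roughly 120 kW per rack, NVLink bandwidth around 1.8 TB/s per GPU); substitute your vendor's datasheet values before drawing conclusions.

```python
# Back-of-the-envelope rack-level planning with illustrative figures.
GPUS_PER_RACK = 72          # NVL72 NVLink domain size
RACK_POWER_KW = 120.0       # commonly cited class figure; varies by SKU
NVLINK_TBS_PER_GPU = 1.8    # per-GPU scale-up bandwidth, TB/s
NIC_GBPS_PER_GPU = 800      # one 800G NIC per GPU as a planning baseline

intra_rack_tbs = GPUS_PER_RACK * NVLINK_TBS_PER_GPU           # scale-up fabric
scale_out_tbs = GPUS_PER_RACK * NIC_GBPS_PER_GPU / 8 / 1000   # Gb/s -> TB/s

print(f"per-GPU power budget : {RACK_POWER_KW / GPUS_PER_RACK:.2f} kW")
print(f"intra-rack bandwidth : {intra_rack_tbs:.1f} TB/s")    # ~129.6
print(f"scale-out bandwidth  : {scale_out_tbs:.1f} TB/s")     # ~7.2
print(f"scale-up vs scale-out: {intra_rack_tbs / scale_out_tbs:.0f}x")
```

The roughly 18x gap between in-rack and out-of-rack bandwidth is the arithmetic reason workloads are scheduled, and systems are sold, at rack granularity.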

Networking Emerges as the Critical Constraint

SC25 made clear that networking fabric is now the primary bottleneck for AI scale-out. 800G networks are the new planning baseline, with 102.4 Tb/s switches addressing higher radix and bandwidth requirements.
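The arithmetic behind those figures is worth running for any proposed fabric, as in the sketch below: a 102.4 Tb/s switch at 800G per port yields a radix of 128, and standard non-blocking leaf-spine math then bounds the size of a two-tier fabric. Real designs add oversubscription and rail-optimized layouts, so treat the result as a ceiling, not a plan.

```python
# Switch radix arithmetic: aggregate switch bandwidth divided by port
# speed sets the radix, and radix bounds how flat the fabric can be.
SWITCH_TBPS = 102.4
PORT_GBPS = 800

radix = int(SWITCH_TBPS * 1000 // PORT_GBPS)   # ports per switch: 128

# Non-blocking two-tier leaf-spine: each leaf splits its ports evenly
# between hosts and spines, so max endpoints = radix^2 / 2.
max_two_tier_endpoints = radix * radix // 2    # 8,192 at radix 128

print(f"radix at {PORT_GBPS}G ports: {radix}")
print(f"max non-blocking 2-tier endpoints: {max_two_tier_endpoints}")
```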

Both InfiniBand and advanced Ethernet approaches are being positioned as viable options, while co-packaged and near-packaged optics are gaining attention due to density and power challenges.

Implication: GPU performance improvements no longer translate directly to cluster performance if networking infrastructure cannot scale accordingly. IT buyers must weigh fabric architecture, optical interconnect strategies, and cable plant complexity alongside compute specifications.

Storage Architecture Shifts to AI Data Workflows

Storage vendors moved beyond traditional capacity metrics to address AI-specific requirements. DDN’s unified data engine and WEKA’s memory grid architecture represent a fundamental shift toward storage systems optimized for AI training pipelines, inference workloads, and data versioning.

The emphasis is on metadata management, parallel access patterns, and integration with AI frameworks rather than raw throughput alone.

Implication: Organizations deploying AI infrastructure need storage solutions purpose-built for AI workloads. Traditional enterprise storage architectures may create bottlenecks in training and inference pipelines. Storage buyers need to look at metadata scalability, framework integration, and support for AI-specific data access patterns.
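The access pattern at issue is easy to illustrate. The sketch below shows the shape of a sharded checkpoint write: many ranks streaming shards concurrently, with a small JSON manifest acting as the metadata layer that makes the checkpoint versionable. The paths, layout, and thread-pool stand-in for distributed ranks are all illustrative, not any particular vendor's format.

```python
import json
import os
from concurrent.futures import ThreadPoolExecutor

CHECKPOINT_DIR = "ckpt-step-1000"   # hypothetical path and layout

def write_shard(rank: int, tensor_bytes: bytes) -> str:
    path = os.path.join(CHECKPOINT_DIR, f"shard-{rank:05d}.bin")
    with open(path, "wb") as f:
        f.write(tensor_bytes)       # real systems stream via a parallel FS
    return path

def save_checkpoint(shards: dict[int, bytes]) -> None:
    os.makedirs(CHECKPOINT_DIR, exist_ok=True)
    # Every rank writes concurrently: throughput scales with the number of
    # writers, so storage must sustain parallel streams, not one hose.
    with ThreadPoolExecutor(max_workers=8) as pool:
        paths = list(pool.map(lambda kv: write_shard(*kv), shards.items()))
    # The manifest is the metadata layer: it is what turns a pile of
    # files into a versioned, restorable checkpoint.
    manifest = {"step": 1000, "shards": sorted(paths)}
    with open(os.path.join(CHECKPOINT_DIR, "manifest.json"), "w") as f:
        json.dump(manifest, f, indent=2)

if __name__ == "__main__":
    save_checkpoint({rank: os.urandom(1 << 20) for rank in range(8)})
```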

Facility Engineering Evolves

Power and cooling are now strategic design variables. SC25 featured megawatt-scale coolant distribution units, high-voltage DC bus architectures, and immersion-cooled optical interconnects. These technologies reflect the reality that next-generation AI systems will exceed traditional data center capabilities.
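The underlying physics is easy to check. Heat removal follows Q = ṁ · c_p · ΔT, so the coolant flow required for a megawatt of rack heat falls out of three numbers. The inputs below are illustrative assumptions (water coolant, a 10 K temperature rise); real designs hinge on facility-specific coolants, approach temperatures, and redundancy.

```python
# Coolant flow needed to remove rack heat: Q = m_dot * c_p * delta_T.
HEAT_LOAD_W = 1_000_000   # 1 MW: on the order of 8 NVL72-class racks
CP_WATER = 4186.0         # specific heat of water, J/(kg*K)
DELTA_T = 10.0            # coolant temperature rise across the racks, K

mass_flow_kg_s = HEAT_LOAD_W / (CP_WATER * DELTA_T)   # ~23.9 kg/s
liters_per_min = mass_flow_kg_s * 60                  # water: ~1 kg/L

print(f"mass flow: {mass_flow_kg_s:.1f} kg/s (~{liters_per_min:.0f} L/min)")
```

Moving roughly 1,400 liters per minute through a row of racks is plumbing at industrial scale, which is why coolant distribution units now appear on the SC show floor alongside switches and servers.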

Implication: Infrastructure teams need to assess facility readiness before committing to large AI deployments. Power delivery, cooling capacity, and thermal management must be addressed at the planning stage, not during implementation.

Analyst’s Take

SC25 marked the point where AI infrastructure planning became genuinely multi-disciplinary. Success will require alignment between compute architects, network engineers, storage specialists, facilities teams, and procurement organizations.

The storage announcements from DDN and WEKA reflect a broader shift: every layer of the infrastructure stack is being reimagined for AI workloads. Organizations that continue evaluating AI infrastructure through traditional enterprise IT frameworks (whether for compute, networking, or storage) will struggle to scale effectively.

The architectural shifts visible at SC25 represent a fundamental change in how large-scale AI systems are designed, deployed, and operated.

Disclosure: The author is an industry analyst with NAND Research, an industry analyst firm that engages in, or has engaged in, research, analysis, and advisory services with many technology companies, which may include those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.