
In a packed hall in Washington, D.C., Nvidia’s GTC revealed more than new technology. It revealed a vision for the future of computing itself.
And within that vision, one partnership stood out: Cisco and Nvidia joining forces to reshape how enterprises build and run AI infrastructure. Together, they’re building a blueprint for data centers built for speed, scalability, and security in the AI era.
Why AI demands a new kind of data center
Deploying AI infrastructure requires an ecosystem of companies working together.
The market leader in networking is partnering with the leader in accelerated computing to tackle the challenge of running large-scale AI systems, which is far more complex than simply deploying a few GPU-enabled servers into the data center.
Historically, a server could have been considered the central “unit of computing.” However, today, the entire data center has become a single computer, and in some cases, multiple data centers. Traditional data centers weren’t designed to handle the massive data flows or the tight coordination that modern AI workloads demand.
The network must transfer data quickly enough so that no cluster of GPUs is overloaded while others remain idle.
The Cisco N9100 and Nvidia’s shared blueprint for AI infrastructure
At GTC, I sat down with Will Eddington, Cisco’s senior vice president of networking engineering, and Gilad Shainer, Nvidia’s senior vice president of networking, who shared their common vision of turning the data center into one powerful, AI-ready computing system.
Shainer was formerly with Mellanox Technologies, an Israeli high-performance networking company acquired by Nvidia in 2020. This was a visionary move that made networking central to Nvidia’s AI strategy and highlighted to the industry how important the network is to AI.
“The strength of the collaboration between Nvidia and Cisco is we build the infrastructure for AI. That’s our focus, and we need to do it very quickly because…every year there is a new generation of systems coming out. Working with Cisco, we can bring capabilities into the enterprise,” said Shainer.
The vision is starting to take shape through new products, specifically the Cisco N9100 series, the first data center switch built using Nvidia’s Spectrum-X Ethernet technology. It can run either Cisco’s NX-OS software or the open-source SONiC operating system, giving cloud providers and enterprises flexibility in how they build their networks. The switch also anchors a new Nvidia Cloud Partner (NCP) compliant reference architecture, a blueprint for combining Cisco’s networking, management, and security tools with Nvidia’s hardware.
With the N9100 as the foundation, the shared blueprint lays out exactly how the pieces fit together. For example, the N9100 switch transfers data between servers equipped with Nvidia GPUs. The design incorporates Nvidia BlueField data processing units (DPUs) and ConnectX SuperNICs, which handle security and data transfer at high speeds. Everything can be managed through Cisco’s Nexus Dashboard, providing IT teams with a unified way to oversee their AI infrastructure.
“We have customers, especially in the neocloud space, who really want that NCP certification aspect. The N9100 switch allows us to support the back-end compute and then the rest of the Cisco Ethernet solutions for front-end storage. Inside, it has Nvidia Spectrum-X Ethernet silicon. In the management layer, it includes things like the Nexus Dashboard, which helps customers manage the fabric and lifecycle, as well as the Nexus Hyperfabric, which is a new cluster management platform,” said Eddington.
As Eddington noted, Cisco introduced Hyperfabric AI, a cloud-managed network fabric that brings all the pieces of the AI infrastructure together in one place. Now available, the system combines compute, networking, and DPUs with built-in management tools for deploying entire AI clusters. Hyperfabric AI assists IT teams in planning their AI infrastructure, including determining the necessary storage and processing requirements.
Cisco is also expanding its Secure AI Factory, a joint platform with Nvidia for running AI workloads. The latest updates add stronger protection and monitoring features. They include Cisco AI Defense, which has been integrated with Nvidia NeMo Guardrails to keep sensitive data from being exposed outside company boundaries. Meanwhile, Splunk Observability Cloud and Splunk Enterprise Security provide a live view of how AI systems are performing, allowing IT teams to pinpoint issues quickly.
From reference architecture to real-world impact
Shainer explained that before Cisco and Nvidia release any new AI infrastructure design, they build and test it internally to make sure every layer, whether compute, storage, networking, or security, works together as intended. The goal is to give customers something they can deploy quickly, rather than spending months working through countless errors on their own.
“By building that reference architecture, we enable our customers to go and copy that one-to-one. And when they copy that, they can bring it up very quickly. Customers can also take elements of what we build. That reference is fully optimized, fully set. They can take the reference as-is, or they can take pieces of it and use those pieces with other components of their own,” said Shainer.
One of the first Cisco customers to adopt the reference architecture using the N9100 switch is Blue Sky Compute. The neocloud startup runs a high-speed, Ethernet-based network that connects large GPU clusters used for both training and inference. In an interview with me, Blue Sky CEO Ian Hartley said the company is focused on helping enterprises “turn AI into ROI,” with Cisco and Nvidia providing the backbone that makes it possible.
“There are so many neoclouds, each one of them is selling the raw ingredients of AI. What we do at Blue Sky is focus on tying it together to produce value for the enterprise. That means we’ll consult on data services, model selection, model implementation, of course, the compute, and finally, orchestrating it all to ensure that it does exactly what you want,” said Hartley.
For Blue Sky, the benefits go beyond performance. The company is expanding rapidly, with data centers on both US coasts and plans for nearly 100 global locations. Blue Sky aims to build a global inference platform, a network of sites where companies can run AI models near their users while maintaining a consistent experience everywhere. Cisco’s networking expertise makes that possible, Hartley said.
Blue Sky is a prime example of what the next phase of AI adoption looks like. It’s not just about training models but about building smarter infrastructure. As Hartley put it to other companies doing the same: “Don’t focus on proof of concept, focus on proof of value.”
IT pros should heed Hartley’s words. Many studies have highlighted how organizations are not getting the anticipated ROI from AI. The much-publicized MIT report is one such example: it found that 95% of companies aren’t getting the expected return from their AI projects. The problem isn’t the technology, as there are plenty of companies that have succeeded with it. The issue is that companies attempt to assemble AI projects piecemeal.
Deploying AI requires companies to examine processes end to end while thinking holistically about the infrastructure, from the underlying network to the compute stack and the applications. This is where blueprints and validated designs, such as the Nvidia NCP reference architecture, help: they cut the tweaking and tuning of infrastructure deployments from months to just a few days, enabling companies to realize ROI sooner.
Check out this related article: “Nvidia Becomes First Company Valued Above $5 Trillion” — a powerful reminder of how high the stakes have become in AI infrastructure.