Alan Weckel  

Great. Thanks, everybody, so much for joining us today. We're going to talk about Demystifying AI Networking. It's my pleasure to have Hardev and Martin with me today to dive into what we're seeing in this fast-moving AI market. So we'll start with some introductions, then go through some data defining how big these markets are, and then get into a further discussion. It's going to be a great time, and I'm looking forward to it.

So with that, I'm Alan Weckel, technology analyst at 650 Group, and we're going to kind of dive into the market. So first, let's do some introductions of Hardev and Martin. Hardev, why don't you kick off with your background and role at Arista?

Hardev Singh  

Sure. Thank you. My name is Hardev Singh. I'm the General Manager for Cloud Titans and AI, and I've been at Arista for just over 4 years now.

Martin Hull  

And I'm Martin Hull. I'm the Vice President and General Manager of the platform product management team, so I'm responsible for the data center class products, which, of course, are the same as the AI class of products. And I've been at Arista a little bit over 14 years.

Alan Weckel  

Awesome. Great. Martin, I still remember the first time we met in person. So time flies there, for sure. Before we dive into the topic at hand, let's talk a little bit about AI networking: what it is and how it's different. I find the question very interesting because I think everyone has a good handle on traditional DC switching. And then suddenly, we had AI show up. And now we've got multiple AI networks, from front end to back end, Scale Up to Scale Out, and then we have the protocols as well, like Ethernet, InfiniBand, NVLink, UALink, et cetera. You talk a lot to customers, Hardev. Can you break this down a little bit: what exactly is Scale Up and Scale Out, what's the same as the past, what's different, and what are customers asking you about?

Hardev Singh  

Sure, Alan. If you look at the high-level picture, you have the Scale Up network and the Scale Out network. Let me try and go after Scale Up first. The Scale Up network is typically a network within a rack. These are accelerators within a rack that connect to this network at very high bandwidth. And you can imagine, if you have multiple of these accelerators connected to HBM, which is High Bandwidth Memory, and they want to access the memory of different accelerators, you really need a very high-bandwidth network. So Scale Up, again, is the network connecting the GPUs within a rack.

Now today, that can be tens of GPUs going into hundreds of GPUs. With the latest Tomahawk 6 that was announced earlier this week, you can theoretically go up to 512 GPUs, right? So that's the Scale Up network. Scale Out is the network that connects multiple of these accelerator servers together. So that's when you're talking about connecting maybe hundreds, thousands or tens of thousands of these GPUs.

And within the Scale Out network, you have this concept of a back-end network as well as a front-end network. So let's see what the distinction is there. The back end is the network that directly connects to the GPUs. This is 100% RDMA traffic, GPUs talking to each other. It's typically 400-gig, now 800-gig connectivity, and the round trip time is pretty small, about 10 microseconds. And this is where the training happens, right? These GPUs are working on AI workloads with these communication libraries, exchanging gradients; it's really a distributed computing problem.

The front-end network is the network where you have storage and CPU-based compute. So this would be the compute that feeds data into the back end for training or even inference, as well as connectivity to the WAN or the metro. Here, the traffic patterns are slightly different: you'll have some RDMA traffic, you'll have TCP traffic, you'll have NVMe traffic for storage. And this is similar to the classic data center. So at a high level, that's the distinction between back end, front end, Scale Out and Scale Up. From an architecture perspective, it doesn't change much. You have the same leaf-spine architecture; it's just that the bandwidth is different between the back end, the front end and the Scale Up.
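
As a rough illustration of the Scale Up sizing Hardev describes, here is a minimal sketch (the 102.4 Tb/s figure corresponds to a Tomahawk 6-class ASIC he mentions, and the per-GPU link speeds are assumptions for the example): dividing a single switch ASIC's capacity by the per-accelerator link speed gives the theoretical one-hop Scale Up domain size.

```python
# Rough, illustrative Scale Up sizing: how many accelerators can hang off a
# single switch ASIC at a given per-GPU link speed. Figures are examples,
# not vendor specifications.

def scale_up_domain_size(asic_tbps: float, gpu_link_gbps: int) -> int:
    """Theoretical one-hop Scale Up domain size for a single switch ASIC."""
    return int((asic_tbps * 1000) // gpu_link_gbps)

if __name__ == "__main__":
    for link in (200, 400, 800):
        n = scale_up_domain_size(102.4, link)   # 102.4 Tb/s class ASIC
        print(f"{link} Gb/s per GPU -> up to {n} GPUs in one Scale Up hop")
    # At 200 Gb/s per GPU this works out to 512, matching the figure above.
```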

Martin Hull  

So the other way to think about it, Alan, is that what many customers think about as their existing data center, that's the classic data center. That is the front end. So what customers have been building out for the last decade or so is now classic, believe it or not; it's like the Coca-Cola commercials. The back-end network is almost a new introduction, a new area of their networking, but really, it's not that fundamentally different from their existing front-end or classic networks. You're still going to be using high-performance, open networking technologies to connect high-performance GPUs, DPUs and accelerators into a cluster. So we're happy to be here to talk about how we demystify some of this terminology, and let's get started.

Alan Weckel  

Yes, absolutely. I think it's interesting because, as you mentioned, we used to have one network, and we understood that well. And now we're adding in this new domain, this brand-new network. And not only are we doing it, we're doing it quickly, and the bandwidth requirements of that network are really amazing. So the pace of innovation is occurring faster, and I think that causes some confusion. So I'm going to share some slides. I think it's a good intro to do some market sizing, which we all know I love to do.

And we'll kind of start there and set the stage for AI here. So where we are today in AI is the agentic wave, and we can argue exactly when that started. But over the next couple of years, we're going to spend over $1 trillion on infrastructure equipment, so compute, storage and networking, to support this agentic wave. And to me, that's a mind-blowing number. So I tried to quantify that a little bit differently.

And it means that since we just started talking, we've shipped thousands of switch ports into the data center. So good job, guys. You've shipped thousands of ports since we started here. It's just a tremendous new volume compared to what we were used to in the past. And the good news of that spend is that networking is really going to be the glue that connects it all together. So we're talking about a huge amount of money being spent here, and ultimately, a significant amount of that will be on networking in order to stitch these GPU and xPU clusters together. And what that means, going back to this pace of innovation, is that by the end of the decade, the vast majority of infrastructure in the data center is going to be AI or accelerated.

So we've got a good handle on traditional compute. And by the end of the decade, I think we're going to have a really good handle on AI. It's going to be the dominant amount of spend in the data center. And that's going to change considerably how we think about the Scale Up and Scale Out network opportunity, which we'll get to in a second. So speaking of the AI network and diving into that, everyone loves to talk about the data center as if it's one fabric. But Hardev and Martin, you did a really good job earlier highlighting that it's different and expanding rapidly.

And this chart helps to kind of frame that. If we look at bandwidth growing in the data center, you can see that AI is growing at nearly 100% per year. In other words, what we're throwing at AI now is twice as much as we did a year ago. And next year, it's going to be twice as much as we're doing today. And what that ends up looking like is the chart on the right, where most of the traffic in the data center is going to be AI-related very, very rapidly.

It also brings up the point that all these other networks are ultimately going to get a very large tailwind from this, whether we're talking about DCI and connecting these facilities together or about traditional compute; everything is going to have to be brought up to a certain standard or a certain speed in order to support what we're doing in these AI clusters. So let's have a little bit of a conversation here on this chart. If we look at what's occurring, we used to have that traditional network, and now we have both Scale Up and Scale Out.

And within those domains, in Scale Out, that's where we have the InfiniBand versus Ethernet debate. And then when we talk about Scale Up, that's where we get NVLink versus Ethernet, and we see UALink and the Ultra Ethernet spec playing an important role there. So it's not just about one network; it's really about these multiple networks. So I'll ask both of you a question: when was the first time you began hearing from customers about more than one network in AI? For yourselves, I'm sure it was a few years ago. But what was the defining moment where, in your mind, there was going to be more than one network?

Martin Hull  

So it's difficult to really put a pin in that one and get an exact date. I think back to when we launched our 400-gig networking technology; I can't remember if it was 4 years ago or 5 years ago. Mark Foss, I know, will always correct me on when we did a launch. When we first introduced 400-gig, we said the primary use case was data center interconnect, and everybody agreed.

We said the secondary use case of that 400-gig technology was going to be AI and ML networks. And at that time, people were questioning what I meant by an AI and ML network. So 4 or 5 years ago, when we introduced our 400-gig technologies, is when I think people started to identify that there were dedicated use cases for dedicated AI networks.

Since then, I mean, it's been a little bit over 2 years, 2.5 years since ChatGPT burst onto the scene. And I think we can almost date this era from when ChatGPT arrived and all of a sudden everybody was talking about AI networks. So whether it's 4 or 5 years ago or whether it's 2.5 years ago, around that time frame is when I think we started to think about AI as being different from the classic traditional data center networks.

Hardev Singh  

Yes. I mean, to add to Martin's point, we were actually working with our large customers on AI networks before AI even became such a big thing. We were working with them on features like load balancing; a few of these large customers had AI workloads with the first generation of GPUs, the A100. So we really built our software stack to support those AI networks, whether it was features around load balancing or features around telemetry and visibility down to the NIC level. So we, in a way, got a head start on working on AI before it really exploded.

And to your question, Alan, the difference between this AI network and what you call the classic cloud: from a network architecture standpoint, it's not all that different. It's the same leaf-spine architecture, but the speeds are radically different. If you look at the AI back end, you're mostly at 400-gig or 800-gig, most of the traffic is RDMA, and the round trip times are small, around 10 microseconds, compared to the front-end network, which is more classic-like, where you have storage, you have NVMe traffic, you have TCP/IP traffic, right? You have CPU-based compute, you have wide-area connectivity. So from an architecture standpoint they're similar, but the speeds are quite different between the back end and the front end.

Alan Weckel  

Yes, absolutely. I've never been at a point in time where you could create a forecast like the one on this chart, where we go from a market that was zero, or really just related to HPC, to one where we're going to be reaching $100 billion in the not-so-distant future beyond this forecast. It's absolutely incredible. So I think this chart helps frame that: as we look at AI, that front-end network, the more traditional one, has a huge expansive role as we talk about these clusters getting larger.

We've done a good job talking about Scale Out, and then we see Scale Up coming in for those GPU-to-GPU cache-coherent networks. It becomes a very, very large TAM. But I think behind the scenes, things get a little bit more confusing. So these charts are looking at the exact same data, just with a little bit of a different lens on it.

The left one is looking at it by component: the networking, which we call switches for the most part, the transceivers and cables, and the NICs themselves. And the right one is looking at it from a protocol perspective, with Ethernet, InfiniBand and NVLink being the dominant ones, but you can see there's more than just those on the chart. And I think this is important, and we're trying to build that out as we go to the next chart here.

If we slice and dice where we are, we can see that Ethernet, the non-blue part of the charts, ultimately becomes the dominant technology. And again, this is that move from historic AI, which was very much HPC-oriented, to one where AI is really about cloud and scale. And so Ethernet takes over there. But we can also look at the chart and look at the different relationships between optics and switching or NICs and switching. And I think that's an important one, right?

If we look at these data points out there, and not to spend too much time on your competitors, but when a Cisco talks about this market, they're including not only their switching and silicon, but they're also talking about their transceiver business, especially Acacia and some of those longer DCI links. And then when we talk about NVIDIA and what they're including, they include not only Ethernet switches but also NICs and optics. And the key thing that I've seen in tracking this market for over 20 years is that each vendor has a different ratio of all those components.

So it's not apples-to-apples when we hear everything out there; a really large part of it is the NICs. And on this chart in particular, I think it's interesting: I had to compress a lot of the data categories. We're actually tracking nearly 20 different categories in AI to count all the different permutations out there: how vendors buy equipment, how they would like things to be architected, how they'd like it to be consumed. So it really is a bunch of different things that get added up to the total market; it's not one $70 billion market where everything is exactly the same across vendors.

Hardev Singh  

Yes. No, that's a good point. Alan, if you look at your previous slide where you had NVLink and Ethernet, it's probably a good opportunity to explain to the audience here that NVLink is really 100% Scale Up, right? And in your Ethernet forecast for the next few years, you have both the Scale Up and the Scale Out portions in that same Ethernet bucket, right?

Alan Weckel  

Yes, exactly. And so again, you can slice and dice this differently, but I think it goes back to the comments you made earlier, which is that the speed necessary is significantly higher, right? We kind of started AI at 400-gig, and then obviously we're moving to 800-gig and 1.6T, whereas on the traditional compute side, for the most part, we were good at 25-gig or 100-gig. So the starting point on AI is just significantly higher than what we've seen in the past.

Martin Hull  

[ Would you agree ] with me, Alan, that the AI generation started with 400 gig?

Alan Weckel  

I would agree with you. I think it started with 400. The problem there is that we still talk about 10 gig in the data center for some customers. Yes, it's absolutely a different time frame. And I think if we start at 400, it helps frame the future, right? We're not going to go too far into next decade. But if we're starting at 400, then 3.2T and 6.4T shouldn't scare us, right? They're kind of a natural next step, the next evolution in the speeds-and-feeds part of the conversation.

Hardev Singh  

Alan, how do you see the xPU ecosystem and its growth? I mean, today we have a dominant vendor, but how do you see xPUs growing beyond the incumbent?

Alan Weckel  

Yes. You're going to see a diversity of xPUs and GPUs going forward, and it's really about finding the right processor for the right workload. And so you can imagine, if we're talking about social media and more of a human-engagement workload, that type of ASIC, whether it's a GPU or some of these xPUs that are coming out, will look a lot different from what we're going to try to do for infrastructure as a service or multi-tenancy in the enterprise. So you're going to see both of them thrive, and thrive significantly. But as we move forward, we should expect a more diverse and larger set of xPUs in the market.

Hardev Singh  

Which then, Martin, really opens up the opportunity for Scale Up for Ethernet vendors, right? It's a...

Martin Hull  

Scale Up and Scale Out.

Hardev Singh  

And Scale Out, of course. Yes.

Alan Weckel  

Yes, absolutely. So when we think xPU, we should really think Ethernet as the preferred mechanism there for both Scale Out and Scale Up. I don't know. I would assume that's what you guys see when you talk to customers.

Martin Hull  

Yes.

Hardev Singh  

Yes. Great.

Alan Weckel  

Right. So I just have 1 or 2 more slides to talk through, and then I think we can dive into these architecture conversations, setting the stage with some absolute numbers. So we've talked about the pie shifting already towards Ethernet and how every vendor is a little bit different in what they sell. And then if we take a little bit of a snapshot here, removing the optics, what we can see on the left is Ethernet versus InfiniBand.

This is kind of the Scale Out conversation. And then I really like the chart on the right. If we look at the market today, we're right at that crossover. And again, I'm a little bit backward-facing, looking at quarterly shipments. But we absolutely see the point where Ethernet has caught up and is about to surpass InfiniBand in a Scale Out topology, for example. So I think it's probably a lot of the good work that you yourselves have been doing with customers, but ultimately, we're at that point. So are you seeing the same thing? We're kind of past the point of InfiniBand versus Ethernet, and now we're on to how Ethernet can get us to where we need to be.

Martin Hull  

What I find interesting about this chart, and I'm very happy to see the crossover, is that if you go back 4 or 5 quarters, we were facing questions from many people about InfiniBand versus Ethernet, asking whether Ethernet would win through, whether Ethernet could win through. And you fast forward 4 quarters and the answer becomes a very simple, well, apparently, yes. We're an Ethernet-only vendor, so I can't tell you what we're shipping in Ethernet versus anything else. You know what we're shipping, and these charts are reflecting that high-volume transition from InfiniBand to Ethernet.

Hardev Singh  

And InfiniBand has been around for some time. It's very popular with HPC. And when these small clusters first started building up, InfiniBand was probably good enough for the networking. But what we're seeing now when we talk to customers is that they're really looking to build these large-sized clusters, and you're talking tens of thousands, even hundreds of thousands, of these GPUs.

At that point, InfiniBand runs into performance challenges. You have a subnet manager that controls the whole data plane communication, and once the cluster size becomes large, you can have convergence issues; if you have link flaps or certain parts of the network with performance issues, it really becomes challenging for this controller to scale.

So that's where Ethernet really took on that role for these large clusters. And customers also like the fact that it's open, open standards; the network teams are used to Ethernet. And then as the applications and data flow from the back end, where training happens, to the front end and inferencing, it just makes the whole AI cluster performant.

Alan Weckel  

I know we could spend a whole hour talking about 1 million xPU clusters. But before we get there, let's go down to the connectivity side a little bit, at a smaller unit. One of the questions I get a ton is: how many optics are there per xPU, and how much is copper?

How much is fiber? So, a question to both of you. In Scale Out networking, I kind of think about this as: you have one approach, which is top of rack or middle of the rack, and you have another approach, which is end of row. They're very distinct. But you're talking to a lot more customers out there. Can you talk about Scale Out architectures, what you're seeing, and what types of switches actually get deployed in each of those?

Martin Hull  

Yes. So I think you're getting the same questions we're getting. But if you think about how customers deploy networks, we moved rapidly from a 32 by 400-gig switch to 64 by 400-gig and then 64 by 800-gig. So if you take your unit of networking as 64 ports of 800-gig, we're going to use half of those interfaces for local connectivity to compute. You can't get 32 ports of 800-gig compute in a rack. So the natural conclusion there is that I put my network switches in the middle of a row, the middle of 2 racks, the middle of 3 racks, whatever fits for you, just because we've advanced the silicon scaling so rapidly that the network switch is no longer a single switch per rack. And then the other aspect of that: now that I've used my 32 or 30 ports for local connectivity, the remainder of the ports on that device is what I'm using for the full mesh connectivity back to the spine.

So the reason that we want a high-radix switch as my leaf is so that I can have a very high network diameter at the spine and scale up from hundreds to thousands to tens of thousands of GPUs. If you don't have that, you end up needing a third tier, maybe a fourth or a fifth tier in the network, and we'll get on to that a little bit later. But the reason that customers are thinking and worrying about where to put that leaf switch is that once you're outside the rack, you get into the network cable length conversation. Copper gives you a 2-meter reach, maybe 2.5; you can go to active copper at 5, 7, 10 meters. 10 meters is pretty good for a 3-rack, 4-rack configuration.
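
As a toy sketch of the placement reasoning Martin is walking through (the GPUs-per-rack figures and reach thresholds below are assumptions for illustration, not recommendations): split a 64-port leaf half down and half up, see how many racks those downlinks have to span, and let the resulting run length suggest the cable type.

```python
# Toy illustration of the leaf-placement reasoning above. All numbers
# (GPUs per rack, cable reaches) are assumptions for the example.

LEAF_PORTS = 64               # e.g. a 64 x 800G fixed leaf
DOWNLINKS = LEAF_PORTS // 2   # half down to compute, half up to the spine

def racks_per_leaf(gpus_per_rack: int, downlinks: int = DOWNLINKS) -> int:
    """How many racks one middle-of-row leaf serves, rounded up."""
    return -(-downlinks // gpus_per_rack)   # ceiling division

def cable_choice(reach_m: float) -> str:
    """Pick an interconnect type from an assumed reach budget."""
    if reach_m <= 2.5:
        return "passive copper (DAC)"
    if reach_m <= 10:
        return "active copper (AEC/ACC)"
    return "fiber + optics"

if __name__ == "__main__":
    for gpus in (8, 16, 32):
        print(f"{gpus} GPUs/rack -> leaf spans ~{racks_per_leaf(gpus)} racks")
    for d in (1.5, 7, 30):
        print(f"{d:>5} m run -> {cable_choice(d)}")
```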

And then, yes, you're going to drop into a fiber connection with an optic or a transceiver. And at that point, you're not very far from doing an end of row. Six rows, 8 rows -- sorry, 6 racks, 8 racks, and you end up with an end-of-row hybrid configuration. And the other aspect is that you're probably going to want 2 switches, 4 switches, so maybe you have a network rack in the row, and that's where you would deploy the network connectivity, and that's where you can actually have A planes, B planes, C planes and D planes coming out of the compute. So the radix of the leaf switch has driven some of these connectivity conversations, which then turns into a conversation about the rack arrangement. Hardev?

Hardev Singh  

Yes. I'll give you an example. If you take an 8,000 GPU cluster, like Martin said, and you take a 2-tier network, you're looking at almost 30,000 to 32,000 optics, right? So the optics really add up, and the power of the optics in a cluster of that size makes a big difference. So when we talk to customers, what they really care about is: how can I have a performant network, a highly reliable network, and how can I bring my power down? Any power that's saved on the network side, whether through very efficiently designed hardware or through the optics, gives them extra power they can put into more GPUs, which adds business value, right?
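
Hardev's roughly 32,000-optic figure follows from simple counting. A minimal sketch, assuming a non-blocking two-tier fabric in which every GPU-to-leaf and leaf-to-spine link is optical, with a transceiver at each end:

```python
# Back-of-the-envelope optics count for a two-tier back-end cluster,
# assuming every GPU-to-leaf and leaf-to-spine link is optical with a
# transceiver at each end. Illustrative only.

def optics_count(gpus: int, oversubscription: float = 1.0) -> int:
    gpu_leaf_links = gpus                              # one back-end port per GPU
    leaf_spine_links = int(gpus / oversubscription)    # non-blocking: 1:1
    return 2 * (gpu_leaf_links + leaf_spine_links)     # 2 transceivers per link

if __name__ == "__main__":
    print(optics_count(8_000))    # -> 32000, matching the figure quoted above
```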

So at Arista, we are pioneers of LPO optics technology. At 800-gig, we now have at least a couple of large customers who are deploying 800-gig LPO optics in decent, sizable volumes.

And they really see the TCO benefit: with the power saved on the optics, they are able to have more GPUs, because even when these data centers are getting built, the power envelope is defined, right? You cannot exceed that power. So any power savings on the network side, on the interconnect and the optics, gives them the option to have more compute, which directly translates to business value.

Alan Weckel

Yes. I have a couple more copper questions. But since you brought up the LPO/LRO conversation, can you talk about what that actually changes in the architecture if you're going to do LPO versus a standard transceiver?

Hardev Singh  

So let me quickly explain what we mean by LPO. The LPO optic is the optic without the DSP, right? In the 800-gig generation, the electrical speeds match the optical speeds: you have 8 by 100 electrical matching 8 by 100 optical, so you don't need that conversion. And the SerDes on these chipsets are powerful enough to push the signal without the DSP. So now when you take the DSP out, you're reducing the cost by 30% to 40% and reducing the power of that optic by maybe 40% to 50%.

So that's really attractive to customers. At a high level, that's the distinction between LPO and DSP-based optics, the regular optics we talk about. LRO optics are retimed in one direction only, so think of them as sitting in between LPO and DSP optics.

Martin Hull  

And in terms of the network infrastructure, it doesn't change the design. You can use the same fiber infrastructure, parallel or duplex, and the same link lengths inside these large-scale warehouses where you've got 2-kilometer runs; you've got LPO optics at 2 kilometers. So it doesn't change the physical layer, layer one. What it does is transform the power that's being consumed by the optics. On our analysis, if you can drop the optics power by 50%, you can drop the system power by 25%. And as Hardev said, that feeds through to the bottom line.
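
To put rough numbers on that relationship (the per-module wattages and the optics share of system power below are assumptions for illustration, not measured or vendor figures):

```python
# Illustrative LPO power arithmetic. Per-module wattages and the optics
# share of system power are assumptions, not measured or vendor figures.

DSP_OPTIC_W = 15.0       # assumed 800G DSP-based module
LPO_OPTIC_W = 7.5        # assumed ~50% lower without the DSP
OPTICS_SHARE = 0.5       # assume optics are ~half of total system power

def cluster_optics_savings_kw(num_optics: int) -> float:
    """Total power saved by swapping DSP modules for LPO, in kW."""
    return num_optics * (DSP_OPTIC_W - LPO_OPTIC_W) / 1000.0

def system_power_drop(optics_drop: float, optics_share: float = OPTICS_SHARE) -> float:
    """Fractional system-level saving from a fractional optics saving."""
    return optics_drop * optics_share

if __name__ == "__main__":
    print(f"32,000 optics: ~{cluster_optics_savings_kw(32_000):.0f} kW saved")
    print(f"50% optics saving -> ~{system_power_drop(0.5):.0%} system saving")
```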

Alan Weckel  

Okay. And then, what architectures are more popular? You mentioned you have some customers doing LPO. There are still transceivers, there's still copper. There's end of row, top of rack, et cetera. Is the industry moving towards one topology? Or which ones are the more popular choices you see?

Hardev Singh  

I mean, you see very different architectures depending on the customer and on the size of the cluster, from smaller sizes to larger sizes. For a smaller cluster, yes, you can optimize cost. You can have copper connectivity; like Martin said, you have 2- to 3-meter reach, so you could have the switches at the top of the rack or the middle of the rack, and then have AEC cables or even AOCs to go out to 7 meters or 30 meters, and then go to optics for a longer reach.

For customers who are building much larger-sized clusters, they are mostly going to standardize on a couple of optics. They'll probably have a 50-meter optic to connect to the NICs on the GPUs and then maybe a 50- or a 500-meter optic for connectivity between leaf and spine or spine and super spine, right?

At the end of the day, from an architecture perspective, you want to stay in as few tiers as possible. And the key differentiation that Arista brings with our AI Etherlink portfolio is that we give customers the choice of selecting from not one but three different product families. We have the fixed switches, the 7060 series; we have our flagship AI spine, the 7800 modular chassis; and then we have the newest one, the 7700, the Distributed Etherlink Switch, which really takes the chassis architecture and distributes it, right? So even with a 2-tier network, it behaves like a single hop, which reduces latency and power and improves the performance of the network.

Alan Weckel  

Yes. I'm kind of seeing a similar thing, where every customer is a little bit different. And I think that's an important distinction to make, because when I look at the market data, if we talk about, say, a fixed system that's top of rack, there's a lot of copper content. So going back to that DSP question, there are fewer DSPs and fewer transceivers there.

When you go to end of row, or you're all fiber, there are far more transceivers in the network. And in my data, it's kind of a crazy spectrum. What it means is that in some AI data centers, the optics can be 40% or 50% of the cost, because you're all transceivers and you're going in that direction. In others, if you're using LPO or the first hop is copper, you can be down in the 20% range. So you get this wide dispersion of answers. Every customer is really unique, and I think both of you just said the same thing, right? We're not getting common architectures; everyone is unique. I want to talk about 100,000 and 1 million xPU and GPU clusters.

So let's go there because everyone loves that million number. What's different when we talk about a few thousand xPUs, 100,000, 200,000 and then 1 million? How does the architecture change?

Martin Hull  

So I know you want to talk about 100,000 and 1 million. Can we talk about a few hundred first?

Alan Weckel  

Absolutely.

Martin Hull  

So we get a lot of customers coming in to talk to us, and for a lot of enterprise customers or infrastructure providers, hundreds of GPUs is a significant milestone. And so for customers like that, we can address their network requirements effectively in a single hop, with either a fixed or a modular architecture. So there is a vast number of small- to medium-sized GPU clusters, measured in the hundreds of xPUs, hundreds of accelerators, that we can address in a single-tier network.

Once you want to go up from 1k to 2k to 4k to 8k to 16k, we're getting into real networking at that point. So within a 2-tier network design, based on either the 7060X fixed configuration or a combination of the 7060X with the 7800, we can get to 30,000, 32,000, depending on the port speed of the accelerator, 200-gig or 400-gig; your numbers will vary a little bit. Once you get beyond 32k, up to 64k, up to 100k, we're talking about some limits in terms of the radix, the diameter of that first-hop switch that then feeds into the second tier of the network.

So we've just seen Broadcom announce their new Tomahawk 6 silicon, which is the world's first 100-terabit chip. That 100-terabit chip doubles the I/O of that first-hop switch. So by simple mathematical modeling, I can now double the diameter of my network. A customer that maybe was hitting a limit of 30,000 can now go to 60,000, and if that's 200-gig attached, you can now break through that 100,000 GPU, xPU limit using 2-tier fixed configuration systems. Once you start talking about going beyond 100,000 to 0.25 million, 0.5 million, you're going to hit a number of challenges. The first challenge I see customers hitting is space and power. It's not a networking problem at this point.
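
The scale figures Martin quotes fall out of a simple radix calculation. Here is a hedged sketch; the 576-port modular spine, the 1:1 down/up split and the breakout ratios are assumptions chosen to make the arithmetic concrete, and real designs will vary:

```python
# Rough two-tier (leaf/spine) scale model. Port counts and breakout ratios
# are assumptions for illustration; real designs vary with oversubscription,
# rail arrangement and the spine platform chosen.

def max_accelerators(leaf_ports_800g: int, spine_ports_800g: int,
                     gpu_link_gbps: int) -> int:
    downlinks = leaf_ports_800g // 2                    # half down, half up (1:1)
    gpus_per_leaf = downlinks * (800 // gpu_link_gbps)  # breakout per 800G port
    max_leaves = spine_ports_800g                       # each leaf uses one port on every spine
    return gpus_per_leaf * max_leaves

if __name__ == "__main__":
    # 64 x 800G leaf with an assumed 576-port modular spine:
    print(max_accelerators(64, 576, 400))    # ~36,864  (the "30-32k" class)
    # Doubling the leaf radix with a 100T-class chip (128 x 800G):
    print(max_accelerators(128, 576, 400))   # ~73,728  (the "60k" class)
    print(max_accelerators(128, 576, 200))   # ~147,456 (past 100k at 200G attach)
```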

I can build a 3-tier network that can address the physical requirements of connecting everything in a non-blocking or even undersubscribed design. But you're going to hit a space and power constraint. Some customers are going to put in a second floor in their data center. They're going to put a second data center, a third data center, close to the first one.

So now we're going to start talking about data center interconnect technologies being used between back-end networks to create this super cluster, based on space, power and cooling, rather than just how many ports you have on your switch and how long your fiber strands are.

So Alan, in your research, you call out how many customers there are that have more than 1 million servers. Turn that around: realistically, how many customers are there that are going to have more than 1 million accelerators in the short to medium term? So I'm going to turn that one back around on you, right? We can build a network that can handle 1 million. Can customers deploy them in a data center?

Alan Weckel  

Yes. I think it's unlikely. We're seeing a few million-GPU, million-xPU clusters this decade, but they're measured in the fingers on your hand. There's not that many.

Martin Hull  

And we will be more than happy to talk to any and all of them about their network requirements and how we can align there. But realistically, 30,000, 60,000, 100,000 is kind of where that architecture is today. And it's not necessarily a networking limitation. I think it's those additional challenges of civil engineering, power delivery, cooling and all the other challenges that come with that.

Alan Weckel  

Yes. And you're even seeing customers just move a few miles down the road because that lets them tap a different municipality's power grid, because, as you said, power is the constraining factor. So it's not even like, "Oh, you could book the building next door or go to a second floor." There's no power in that location. You've got to move elsewhere.

Martin Hull  

Right? So then we'll build some trenches and drop some fiber in, and I'll build you one super data center connected together.

Alan Weckel  

Yes, absolutely. I mean...

Martin Hull  

[ Not quite that ] simple, but you get the point.

Alan Weckel  

Yes, it's not that simple, but it is pretty close, right? It's kind of like, if you can put in a trench, you can do it. I've seen, in Virginia as an example, the policy there seems to be that you can build as many new data centers as you want, but bring your own power. And therefore, you're not going to build many. I saw you both kind of giggle there. So are you seeing that? Obviously, Virginia is the famous spot that's power limited, but are you seeing power limitations making us move to other locations?

Martin Hull  

We're seeing power limitations. In the public media, we're talking about small nuclear reactors, creative ways of getting access to power temporarily or even medium term, how you can just get more and more and more power. It is one of the limitations. I think the industry, the global economy if you will, is stepping up to that. But I think it becomes one of the short-term limiting factors on the scale and number of data centers that are going to get built.

Hardev Singh  

Yes. And to add to that, there are a handful of customers who are really building these 100,000 GPU clusters or going to 1 million GPU clusters. And these are really the foundation models that are getting trained, where you need such a large cluster size. And when we talk to these customers, like you said, the power is defined at a site, right? And typically, they only allocate about 10% of that power for the network, or maybe you can include storage along with the network; 90% of that power is for the GPUs.

So what these customers care about is really a best-of-breed network to make sure that those GPUs are performing well, right? And where Arista again differentiates is with our modular chassis, with the AI spine and now DES; it really gives customers the flexibility to architect the network with as few tiers as possible. With fewer tiers, you have less hardware and fewer optical interconnects, which really enables them to stay within that 10% power envelope and move more power to the GPUs.
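
As a back-of-the-envelope illustration of that power trade (the site power, the 10% network allocation and the per-GPU wattage below are example assumptions only):

```python
# Illustrative site power budgeting along the lines Hardev describes.
# The site power, the 10% network allocation and the per-GPU wattage are
# example assumptions, not customer figures.

def extra_gpus_from_network_savings(site_mw: float,
                                    network_share: float,
                                    network_savings_pct: float,
                                    gpu_kw: float) -> int:
    """GPUs that fit into the power freed up by a more efficient network."""
    network_budget_kw = site_mw * 1000 * network_share
    freed_kw = network_budget_kw * network_savings_pct
    return int(freed_kw // gpu_kw)

if __name__ == "__main__":
    # 50 MW site, 10% for network, 25% network power saved, ~1.2 kW per GPU:
    print(extra_gpus_from_network_savings(50, 0.10, 0.25, 1.2))  # ~1041 extra GPUs
```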

Martin Hull  

And avoiding idle time.

Hardev Singh  

And accelerators.

Martin Hull  

Accelerators cost a lot, right? I'd like to get 100% utilization out of them rather than 70% utilization. So anything you can do in the network to avoid a bottleneck, to avoid drops, to avoid link-level issues; anything you do in the network is the network getting out of the way and allowing the accelerators to do the job you've assigned them.

Hardev Singh  

Yes. Any slowdown in job completion time is a loss of revenue. So again, coming back to the network being critical, that's where software plays an even more important role. Customers not only want a very reliable, low-power network but also the features to support these workloads. And these workloads can really vary, from the LLMs we've been talking about to custom AI workloads. With our EOS on our hardware, we have developed a bunch of features for load balancing; for example, we have DLB, and then we have our Arista cluster load balancing feature, which takes the performance of load balancing up by, let's say, 8% to 10%.

And what CLB really does is that when you have this traffic coming out of the GPUs, hitting the leaf switch and then getting distributed across the spine switches, with CLB you're able to have that even distribution not only from the leaf to the spine, but also from the spine coming back to the leaf.
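
As a toy illustration of the problem such load-balancing features address (this is not Arista's DLB or CLB implementation, just a sketch of why hashing a few large RDMA flows can go wrong): random per-flow hashing can pile several elephant flows onto the same uplink, while an even assignment keeps every uplink equally loaded.

```python
# Toy illustration of ECMP hash imbalance versus even flow placement.
# NOT Arista's DLB/CLB implementation, just the underlying problem:
# a small number of large RDMA flows hashes unevenly across uplinks.

import random
from collections import Counter

UPLINKS = 8
FLOWS = 16   # few, elephant-sized RDMA flows

def random_hash_placement(flows: int, uplinks: int) -> Counter:
    return Counter(random.randrange(uplinks) for _ in range(flows))

def even_placement(flows: int, uplinks: int) -> Counter:
    return Counter(i % uplinks for i in range(flows))

if __name__ == "__main__":
    random.seed(1)
    hashed = random_hash_placement(FLOWS, UPLINKS)
    even = even_placement(FLOWS, UPLINKS)
    print("hashed :", [hashed[i] for i in range(UPLINKS)])  # lumpy
    print("even   :", [even[i] for i in range(UPLINKS)])    # flat
```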

So now you have a perfectly hashed, well-load-balanced network. To add to that, visibility and telemetry are equally important to these customers. They want to know what's going on, and really, debugging job performance is a hard thing. And believe it or not, many times the network teams get blamed incorrectly. So we've developed a bunch of features in, for example, CloudVision, which is our management software, where customers can see where the jobs are slowing down, whether a link failed, and how they can improve on link failures.

So all of that visibility is provided on the CloudVision dashboard. We've actually worked closely with a large customer to develop custom dashboards in CloudVision that address visibility, monitoring and deployment of these AI networks.

Alan Weckel  

So I think you're going to tell me that Arista always planned for the size of 1 million xPUs, and all joking aside, you've done some amazing things. But how is software evolving to get to that scale? You mentioned load balancing, but I'm assuming it's a lot more complicated than just adding 1 or 2 more features in order to build at that scale.

Hardev Singh  

So load balancing is definitely one of the important features, but then there's congestion control. There are a bunch of knobs and features we are working on with DCQCN, whether it's PFC or ECN. So congestion control is important, load balancing, like I said, visibility and telemetry, and then, of course, routing and the routing protocols are important as well.
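
A conceptual sketch of how the ECN-marking and PFC thresholds Hardev mentions relate on a single queue (the threshold values are made up for illustration, and real DCQCN also involves NIC-side rate reduction that this does not model):

```python
# Toy sketch of ECN-marking and PFC-pause thresholds on a single queue.
# Threshold values are made up for illustration; real DCQCN additionally
# involves NIC-side rate reduction (CNPs) that this does not model.

ECN_MARK_KB = 200    # start marking packets above this queue depth
PFC_PAUSE_KB = 800   # send a PFC pause above this depth (lossless backstop)

def queue_action(depth_kb: int) -> str:
    if depth_kb >= PFC_PAUSE_KB:
        return "send PFC pause (lossless backstop)"
    if depth_kb >= ECN_MARK_KB:
        return "ECN-mark packets (sender should slow down)"
    return "forward normally"

if __name__ == "__main__":
    for depth in (50, 300, 900):
        print(f"queue depth {depth:>4} KB -> {queue_action(depth)}")
```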

Martin Hull  

And the other thing, Alan: I don't know that we planned for 1 million. What you do is plan the foundation of the house. You plan to have a high-quality operating system that's feature-rich, reliable and doesn't cause an outage. So you start with that strong foundation of a solid, quality software stack, and as you get the newer, higher-performance switches, you don't switch horses onto a new operating system written from the ground up.

We're still relying on the same operating system we've always shipped on our switches and routers. So I think it's not about planning for 1 million; it's about putting the foundation in place so that as we evolve our journey and as customers evolve theirs, at least we're going on this journey together.

Hardev Singh  

And like we touched on earlier, the 1 million is not going to be in 1 physical location, right?

Martin Hull  

Not yet.

Hardev Singh  

I mean, Martin hinted at that, too. So you're going to have these distributed data centers at different physical locations connected via a DC interconnect to really get that one big cluster size.

Alan Weckel  

Yes. It feels like, from where we just went, we're going to have the foundational models going to 1 million, 1 million-plus, and that's multisite, maybe somebody gets to a single site. We're going to have the small business or large enterprise with hundreds of GPUs, kind of measured in racks, and then there's probably a sweet spot in between for inference at scale. Not that we're going to have only 3 types, but it feels like the industry is heading towards right-sized GPU clusters and a right-sized network for the job, right?

Martin Hull  

Yes.

Hardev Singh  

And in the enterprise, also, there are certain verticals we're seeing, whether it's financials, research, health care or autonomous driving. So in all these applications where they have a ton of data, we see these customers starting to build, like you say, at a small size, but we are in conversations with them where we see them expanding as their data increases and they see business value out of it. So I think there's a lot of action in the enterprise with AI as well.

Alan Weckel  

I always -- I'm curious about what's going to happen with all these new specs. So we've got Ultra Ethernet. There's UALink. There are a few others. How do these new standards impact the network?

Hardev Singh  

So today, Alan, when we look at our products, the Etherlink AI portfolio, whether it's the fixed boxes, the chassis or DES, we are NIC agnostic. Whether the NIC can do packet spraying or not, whether it can do out-of-order packet delivery, whether you have DPU-based NICs: any kind of NIC, our network is agnostic to that.

With UET, we will have the capability in these next-generation NICs to do all these things at the protocol level. So UET, the Ultra Ethernet Transport protocol, will add features like packet spraying, out-of-order packet delivery and better congestion control at the protocol level. So we're hoping NICs from, say, Broadcom or AMD will have these features going forward, which will then make the network even more performant, right?
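
As a conceptual sketch of what packet spraying and out-of-order delivery mean (a toy, not the UET wire protocol): packets of one flow are spread round-robin across all available paths, and the receiver restores order by sequence number, so no single path has to carry the whole flow.

```python
# Conceptual toy of packet spraying with receiver-side reordering.
# Not the UET wire protocol, just the idea: spread one flow's packets over
# all paths and reassemble by sequence number at the receiver.

from typing import List, Tuple

def spray(packets: List[str], num_paths: int) -> List[List[Tuple[int, str]]]:
    """Round-robin packets (tagged with sequence numbers) across paths."""
    paths: List[List[Tuple[int, str]]] = [[] for _ in range(num_paths)]
    for seq, pkt in enumerate(packets):
        paths[seq % num_paths].append((seq, pkt))
    return paths

def reassemble(paths: List[List[Tuple[int, str]]]) -> List[str]:
    """Receiver: merge per-path arrivals and restore original order."""
    arrived = [item for path in paths for item in path]
    return [pkt for _, pkt in sorted(arrived)]

if __name__ == "__main__":
    msg = [f"p{i}" for i in range(8)]
    paths = spray(msg, num_paths=4)
    assert reassemble(paths) == msg   # order restored despite spraying
    print(paths)
```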

Alan Weckel  

Yes. Do those standards lead us to Ethernet for Scale Up? Or is Ethernet going to end up in Scale Up anyway, because that's what Ethernet always does: it takes on those additional hard problems?

Martin Hull  

So again, with the Broadcom announcement of the Tomahawk 6, they absolutely made it clear that Tomahawk 6 is built for a Scale Up Ethernet architecture. If you look again at the size of the racks, the density of the compute and the amount of power going into them, there's a benefit to having locality between a rack's worth of compute. So you build these high-speed interconnects, and it's a natural place for an Ethernet technology, because it's multivendor and open, so it gives you flexibility. So Ethernet does have a role to play in the Scale Up aspects of these networks in the same way that Ethernet has a role to play in the back end.

So I like to say it's the back end for the back end, and it gets very confusing, because then the front end is the classic. But the Scale Up network is the back end for the back end. These networks don't necessarily talk to each other, so you end up with dedicated, isolated networks, but the features, the tools, the troubleshooting, the monitoring, the telemetry, they don't change. People still want that same end-to-end visibility into the front-end network, the back-end network and the Scale Up network. So there's no reason why Ethernet cannot be the significant technology in that area of the network as well.

Hardev Singh  

Yes. While UEC is really for the Scale Out network, the opportunity in Scale Up is tremendous, right? Like we discussed earlier, as the xPU ecosystem expands, Ethernet, with its reliability and openness, will be an option for Scale Up networking as well, and we're pretty excited about that.

Alan Weckel  

Yes. I think it's cool. So I think about this a little bit differently. We're on the cusp of every person in North America having more than one DC switch port associated with them. So the volumes here, as Ethernet grows, are truly amazing. All of us are probably a switch in and of ourselves; between our personal lives and our social media lives, we're multiple ports.

Hardev Singh  

Yes. I mean, we have 4 of us in the house and probably 5 devices for each person. So there are just a lot of devices per person, yes. I believe that statistic. Yes.

Alan Weckel  

So we've talked about the customer. You two sit in some admirable roles, talking to a ton of customers and working with the latest and greatest silicon, but I think there are still a lot of blind spots out there. So maybe I could ask each of you individually: if you could give one piece of advice or one piece of coaching to the customer on their AI journey to make it better, what would that be?

Martin Hull  

So I'll take that question first, Alan. My guidance to any customer, whether they're building an AI network or a front-end network, is to plan early, plan often, and keep coming back and revisiting the plans. But do trust that the networking vendors understand the requirements. We understand how to build these networks, so engage with us, work with us. And the features that Hardev has described in terms of traffic management, telemetry and visibility, whilst a lot of them are directly applicable to these new back-end AI networks, they're actually equally applicable to the classic front-end network.

So don't throw out the baby with the bathwater: use Ethernet technologies, use your best-of-breed networking vendors. But there are some new, unique challenges around scale and performance, and some of the network architectures are going to change, things we've talked about with LPO optics or reach. Some of those things are a little bit different. That's why it's important to get ahead of this and not have it come at you without any planning.

Hardev Singh  

Yes. I would say, depending on your AI workloads, depending on what you're trying to get out of this network, pick the right platform. At Arista, I believe we have close to 20 products for AI, again across back end and front end, compared to some of our competitors who probably have one.

Martin Hull  

Much fewer.

Hardev Singh  

Much fewer, and they're trying to take that same box and fit it everywhere. But the back-end network is higher bandwidth, even though the feature set is very similar to the front end. Pick the best-of-breed network. LPO is a promising technology that will help reduce your CapEx as well as your OpEx, so the overall TCO comes down, and at the end of the day, you get better job completion times. That would be my take on that.

Alan Weckel  

Yes. I would chime in there. I would say the network is not only the glue that's going to provide all the connectivity; it's also going to be the enabler of the technology. Whether we're monetizing it at an application level or a consumer level, the network is going to provide us with that user experience into these AI clusters. And so without the network, we won't have self-driving cars or robots or agents. So that would be my thinking: we need the network to enable these applications, and we're all going to be part of that transformation into the AI world.

Martin Hull  

Remind me again, Alan, who was it that said the network is the computer?

Alan Weckel  

You.

Martin Hull  

No, no, definitely not. I think it was one of the founders of Sun Microsystems.

Alan Weckel  

Yes. There are some famous founders with their hands in Ethernet, that's for sure. So I think that's a great place to wrap up, Martin and Hardev, so thank you so much. I learned a bunch, and I hope our audience did as well. I appreciate your time, and it's always a bunch of fun going back and forth with you. So thank you very much.

Martin Hull  

Thank you, Alan.

Hardev Singh  

Thank you, Alan.

Martin Hull  

Lot of fun.