On this episode of the Futurum Tech Webcast – Interview Series, I am joined by Kavitha Prasad, VP & GM of Data Center, AI & Cloud Execution and Strategy at Intel. Our conversation explores the expansion of AI and how Intel’s CPU architecture provides infrastructure for this rapidly developing market.
In our conversation, we discussed the following:
- The proliferation of AI in the cloud services industry
- Intel’s vision for advancing Xeon CPU architecture
- Intel’s focus on purpose-built workload acceleration
- Intel’s 4th Gen Scalable Processor and its associated AI platforms and products
- How their workload acceleration features will be leveraged and deliver computing business results for AI workloads in the cloud
If you are interested in learning more about CPU architecture and Intel’s plans for the future, make sure to give this episode a listen. To learn more about Intel, check out their website here.
This webcast is sponsored by Intel.
Disclaimer: The Futurum Tech Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we do not ask that you treat us as such.
Daniel Newman: Hi everyone. Welcome back to another episode of the Futurum Tech Podcast. I’m your host, Daniel Newman, principal analyst, founding partner at Futurum Research. Very excited about this interview series podcast that we have today. I’ll be bringing in Kavitha Prasad from Intel and we’re going to be talking about something I love, near and dear to my heart, and that’s going to be artificial intelligence and some of the work that Intel is doing.
It’s going to be broad, we’re going to cover a lot of ground. We will be talking about workload acceleration, but you really can’t talk about that without talking about broader adoption, talking about on-prem and cloud, talking about experience and data. So we’re going to talk about a lot of things. Without further ado, welcome Kavitha to this show. Great to see you.
Kavitha Prasad: Thank you so much Daniel. Looking forward to this conversation. Appreciate you having me on this show.
Daniel Newman: Yeah, it’s a lot of fun. I always enjoy talking to you and your teammates over at Intel. Quick introduction of yourself and your role at Intel. Just kind of set the stage for those that haven’t been following your work.
Kavitha Prasad: Yeah. I’m responsible for AI strategy and execution at Intel, Daniel. I have been in this role for almost a year and a half. The role is about looking at AI holistically from an Intel perspective, all the way from the Edge to the Cloud. Intel plays a holistic role in making AI available across this entire continuum, and that’s what I’m working on.
Daniel Newman: That’s a big job. I’ve been at a few events, I don’t know, like 70 this year. And the one thing I would say is whether it’s hardware or software, whether it’s kind of Edge and devices or data and infrastructure in the data center, the AI story continues to rise to the top. You kind of are hearing pivots away from other things and more and more about really how AI is enabling. And that can be everything from faster connectivity across networks to more usability with access to data and databases. So there’s so much going on there and I think it’s so important.
But here’s a question I guess, I’ll ask. Let’s start big. As I mentioned, AI is becoming sort of pervasive in everything that we talk about in tech, but I think we still have a long way to go.
It’s kind of like Cloud, right? We talk about Cloud that way, but we’re still really early innings for Cloud. Is it similar with AI? Are we in the early innings of AI adoption? And of course from an Intel perspective, what’s your role in where we’re at with the adoption of AI?
Kavitha Prasad: That’s a very good question, Daniel. And in fact, based on even the analyst reports and what we are seeing outside, AI is still in the early stages of development and deployment. So according to the reports, 80% of the companies are investing in AI, but only 20% of those investments are actually resulting in meaningful business outcomes.
So if you think about it, the reason is AI is such a continuous workload. To your point, development, deployment, where do I deploy it? How do I make sure I maintain the continuum after I deploy it? How do I monitor the drift? How do I make sure the results that are getting predicted are close to the ground truth that I have trained for? So there are a lot of these scenarios that make it extremely difficult.
And AI to a great extent today, we think is deep learning, but if you think about AI holistically, it’s multimodal. It is traditional AI, traditional methods, traditional machine learning methods or probabilistic methods or deep learning or domain specific AI. All of these need to come together to get meaningful business outcomes. So a lot of these applications are at the intersection of two or three of these modalities, and that’s what makes it extremely complicated.
So a lot of focus today has been on training these large scale workloads, but meaningful deployment that gets the business outcomes is still a little far off. And that’s where at Intel we are focused on figuring out how we address these business outcomes. Start with the customer problems and make sure that we are delivering meaningful solutions so that AI can be deployed much more broadly and customers are able to reap the benefits of AI.
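To make the "how do I monitor the drift" question above concrete, here is a minimal sketch of one very simple drift check: compare the mean of live inputs against the training baseline, measured in training standard deviations. This is an illustration only, not a method described in the conversation; the function names, the sample values, and the threshold of 2.0 are all hypothetical.

```python
from statistics import mean, stdev


def drift_score(training_values, live_values):
    """Return how far the live mean has moved from the training mean,
    expressed in units of the training standard deviation."""
    baseline_mean = mean(training_values)
    baseline_std = stdev(training_values)
    if baseline_std == 0:
        raise ValueError("training data has no spread; score is undefined")
    return abs(mean(live_values) - baseline_mean) / baseline_std


def has_drifted(training_values, live_values, threshold=2.0):
    """Flag drift when the score exceeds a (hypothetical) threshold."""
    return drift_score(training_values, live_values) > threshold


# Hypothetical feature values seen at training time and in production.
training = [10.0, 11.0, 9.0, 10.5, 9.5]
live_ok = [10.2, 9.8, 10.1]
live_shifted = [15.0, 16.0, 14.5]

print(has_drifted(training, live_ok))       # False
print(has_drifted(training, live_shifted))  # True
```

Production monitoring would typically look at full distributions and at prediction quality against ground truth, not just a single mean, but the shape of the check is the same: a stored baseline, a live sample, and a threshold.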
Daniel Newman: Yes. You make a lot of great points. And I’ve been listening to Intel talk about this for some time. I can remember years back, and the name has changed a little bit. You’ve had Data-Centric and obviously now, it’s DCAI. The company had DCG, I mean there have been a bunch of different nomenclatures, but taxonomy aside, I still remember listening to Lisa Spelman a few years ago, talking about accelerating the workloads that matter.
Meaning that, yes, you can do some things on GPUs, and they are going to be incredible for training, but in general, most companies have very specific needs when it comes to building AI and acceleration, and Intel obviously has a vision for this. They have a vision within Xeon to build for the future. But also you’ve identified for years now that this is going to be a thing.
Talk a little bit about how this vision is evolving, given the fact that you’ve been on the forefront of this workload acceleration now for more than half a decade.
Kavitha Prasad: Well, that’s a very good question, Daniel. If you really think about AI, like I mentioned, it is multimodal, right? And each modality requires a different kind of compute. So if you look at the probabilistic methods or even the traditional ML or domain specific, they do run very well on CPUs and on Xeons. And then if you look at the deep learning side of things, yes, the training requires GPUs, but inference also still runs a lot more on CPUs.
So from that perspective, Intel has always realized the need for heterogeneous compute to actually go address this AI market. So when you look at the entire portfolio of Intel products, we do have CPUs that address these traditional ML methods. We are investing in GPUs to address the HPC plus AI kind of workloads. And for workloads that need the dense compute required for training, we have Habana accelerators. Say, for example, you want to get the best performance per TCO for large transformer models, which is where the world is converging; we have Habana accelerators for that.
And then there are a lot of workloads that need to be deployed where AI is a part of the workload, where you need automatic acceleration, but it is a much more complex application. Xeons are best suited for those because, along with addressing the general purpose nature of these applications, we have acceleration built into the next Gen Intel Xeon processors, where you get these dense matrix multiplications to address the linear algebra nature of AI, so that it is encompassed within the Xeon and you get the performance out of the box.
I think that also we realized heterogeneous compute is great and the hardware is important, but AI, like all of us know, is a software problem first. So how do we make sure we homogenize the software layer across all these hardware components, so that customers can benefit from the goodness of all these compute elements right out of the box, right out of the frameworks, right out of the higher level software stack? That is another area Intel is extremely focused on.
So looking at software-defined, hardware-enabled products is how Intel is trying to address these problems to enable AI adoption broadly.
Daniel Newman: And of course, the company has been pretty outwardly vocal, whether it’s been oneAPI, whether it’s been the acquisitions that it’s been making, whether it’s been all the work with OpenVINO, with some of the different developer platforms. The company’s definitely doubled down, even at Innovation, the event, right?
Pat Gelsinger spent a lot of time talking about Intel’s relationship with developers, and that just sort of reiterates what you’re saying Kavitha about AI and software because obviously in the semi space, we tend to be grouped as very hardware-centric, but if you actually look at the most successful semiconductor companies of the past several years, a lot of it’s been led with software.
And so Intel not only has focused a lot on software, which is important, but I think there’s still a lot of opportunity for the company to be recognized for the work that it’s doing with software, because that probably could be better appreciated. Which brings me to my next question for you.
You’ve made a lot of improvements and there’s been a lot of evolution in terms of the AI and the advancements between your 3rd and 4th generation of Xeon Scalable. Talk a little bit about what’s happened, those advancements that are taking place in 4th Gen.
Kavitha Prasad: So in 4th Gen, what we realized is that more of the compute is going to be defined by AI as and when these workloads mature, which means that you need acceleration that is dedicated towards AI. And that’s why in 4th Gen Intel Xeon Scalable processors, we have developed what is called AMX, which is an accelerator that sits close to our Cores and is capable of accelerating the dense matrix multiplications that AI workloads are built on.
It’s large dense matrix multiplications that are used from an AI perspective. And by putting that acceleration very close to the Cores, customers can get the benefit out of those accelerations because we are up-leveling all the capabilities into the frameworks. We are making sure that through this software stack, it’s available to the customers at PyTorch level, TensorFlow level, so that they can benefit from it right out of the box by using our 4th Gen Xeon Scalable processors.
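As a rough illustration of the dense matrix multiplication being described, the sketch below computes a matrix product tile by tile with a running accumulator, which is the general pattern that matrix-multiply hardware speeds up. It is plain Python for readability, not AMX code and not how the frameworks invoke it; the function name `tiled_matmul` and the tile size of 2 are assumptions chosen for the example.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply matrices a (n x k) and b (k x m) block by block.

    Each (i0, j0, k0) step multiplies one small tile of `a` by one small
    tile of `b` and accumulates into the matching tile of the result.
    """
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions of a and b do not match")

    c = [[0] * m for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for k0 in range(0, k, tile):
                # Multiply-accumulate within the current tile.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for kk in range(k0, min(k0 + tile, k)):
                            acc += a[i][kk] * b[kk][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3],
     [4, 5, 6]]
b = [[7, 8],
     [9, 10],
     [11, 12]]

print(tiled_matmul(a, b))  # [[58, 64], [139, 154]]
```

In practice a developer would not write this loop at all; as the conversation notes, the point is that the acceleration is reached through the frameworks rather than through hand-written kernels.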
Daniel Newman: So let’s ask the question though that everybody really wants to know, and that is, how does this technological advancement really translate to business value? I mean, look, when we’re accelerating and trying to get more out of our compute, the technical part of my life as an analyst, the journalists, we all want to know how do you squeeze the highest performance out of the lowest power? That’s been the benchmark of semis forever.
We know the five-in-four roadmap, and we know that beyond two nanometers, you can only slap so many transistors on a board. So we need to use other things, and AI is one of those other things. So how do you see all this, what you just presented, really being enabled to help businesses drive more value with AI workloads?
Kavitha Prasad: Just by having AMX incorporated into our next gen processors, Daniel, we are able to get four to seven X better performance on AI, gen on gen. That is just the raw performance numbers. But now let’s look at it from the application perspective, which is very important for us. Take any application: you need to ingest the data, you need to pre-process the data, you need to then run the AI on that data, and then you need to post-process the data and give meaningful analytics out of that data.
This is a complex pipeline. When you are doing the pre-processing and all of that stuff, CPU general purpose compute works extremely well for that stage of the workload. And when you come to the deep learning side of things or the inference side of things, that’s where AMX combined with the Cores gets the best performance. And then when you are doing your post AI analytics, again, you need the general purpose compute.
So by building this AMX along with the general purpose compute of the Cores, we are able to bring meaningful performance to the customers, real-time performance, so that the customers can benefit from it when they deploy these AI workloads on our Xeon processors.
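The four stages described above (ingest, pre-process, run the AI, post-process) can be sketched as a small pipeline. This is a toy under stated assumptions: every function name is hypothetical, and `run_model` is a stand-in linear formula rather than a real model. What it shows is that only one of the four stages is the AI step; the rest is general purpose work.

```python
def ingest(raw_lines):
    """Stage 1: read raw records, dropping blank lines."""
    return [line.strip() for line in raw_lines if line.strip()]


def preprocess(records):
    """Stage 2: convert text records into numeric features."""
    return [float(record) for record in records]


def run_model(features, weight=2.0, bias=1.0):
    """Stage 3: stand-in for inference (a toy linear model, not a real one)."""
    return [weight * x + bias for x in features]


def postprocess(predictions):
    """Stage 4: turn raw predictions into summary analytics."""
    if not predictions:
        raise ValueError("no predictions to summarize")
    return {
        "count": len(predictions),
        "mean": sum(predictions) / len(predictions),
        "max": max(predictions),
    }


def run_pipeline(raw_lines):
    """Chain the four stages end to end."""
    return postprocess(run_model(preprocess(ingest(raw_lines))))


summary = run_pipeline(["1.0\n", " 2.0 ", "", "3.0\n"])
print(summary)  # {'count': 3, 'mean': 5.0, 'max': 7.0}
```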
Daniel Newman: It’s interesting, but maybe the easiest way to explain or come back to this is that software engineers don’t really want to be hardware experts or engineers. I don’t mean to create a camp or a wall, but it’s kind of those working on software just kind of want to know that the hardware is going to be there to support their needs.
And so you’re in an era where… And we could actually say perhaps this is going to play really well into your hands, but we’re in an era right now where the markets are a little bit tougher. Companies are going to be a little more cautious with CapEx, and enterprises are going to want to look at their existing hardware landscape and say, “How do we implement and get value out of what we already bought?”
If I had to say, where does Intel really stack up? Well, it’s going to be in those cases where it’s like, “No, you can’t buy another mountain of GPUs to do this.” And, “No, you’re not going to be able to take all this to the Cloud because consumption is really expensive.” Cloud economics might be efficient, but they’re rarely economical.
So I mean, is that maybe one of the really big opportunities for enterprises out there is, “Hey, you can do more with what you have because of what Intel basically enables you to do with these existing servers that you’ve implemented.”
Kavitha Prasad: You put it very well, Daniel, because Intel Xeons are ubiquitous. If you think about it, they’re everywhere. And the fact that you are able to get the AI performance with what you already have and, to your point, not invest in another set of GPUs or another set of accelerators, but get the best performance with what you already have, that does play to a great advantage for Intel.
And making sure we are building the software stack on top of our hardware products, so the customers can benefit from the goodness of the hardware without actually having to get into the depths of what the hardware is capable of, does play to a great advantage for us, because a lot of these data scientists, a lot of these application developers do sit at the higher levels of the software stack and they want to get the best performance out of the box, through the software.
Daniel Newman: All right. So let’s play competitive games here a little bit. As an analyst, one of my responsibilities of course, is to pay attention to the whole landscape. And I don’t think anybody can say with a straight face that AMD hasn’t become a more and more viable competitor to Intel in the past few years.
But I’m always interested in your take because I’m sure leading this particular part of Intel’s business, you’re looking very closely, you know that there was a recent Genoa launch, AMD Genoa.
In terms of your part of the business, how are you thinking about the ability for the 4th Gen Intel Xeon to stack up against AMD’s Genoa with what they’ve recently announced?
Kavitha Prasad: From an AI perspective, Intel does provide the acceleration needed in our next gen processors, Daniel. And if you look at the MLPerf results that were published recently, Intel was the only company that actually participated from a Xeon perspective and published the results out there.
So we don’t have any competitive metrics from the competition to actually go compare and say where we land with regards to AI. But Intel definitely is at the forefront when it comes to making sure AI workloads are accelerated on our CPUs.
Daniel Newman: Yeah. I think that’s going to be a really interesting area to watch. I mean, I can tell you having been watching from the third party, that it’s certainly been more in focus for AMD. Now, this is where being early and, of course, your significant market penetration are going to be used to your advantage. I think with AI, obviously, it’s about protecting your flanks, watching the market share, and letting the customers that have large deployments of Intel, and of course those that are upgrading to the Gen 4, know that, “Hey, you can leverage this in its current form without massive additional costs, taking advantage of our accelerators to grow your AI projects and implement more of your goals as an enterprise.”
Which is going to be super important because we’re in an era of deflationary tech. The next few years, with the markets inflated, as companies are spending less and getting rid of additional and unnecessary headcount, digital transformation projects are going to be more important. And the success of those projects is going to depend on the ability for companies to implement automation and workflow optimization, and to implement AI and analytics tools to automate everything from quote-to-purchase systems to e-commerce tools and chatbots.
AI is going to be at the core of this, and so being able to do this without spinning up major CapEx is going to be really important. So I hope you guys can land that. And for everybody out there, you’re hearing it from me, I think this is a really interesting application for your 4th Gen Xeon.
Kavitha Prasad: No, that’s very true. That’s so true Daniel. And in that context, if you think about it, Intel has launched reference kits to address the same enterprise market. To your point, be it predictive analytics in the energy sector or in manufacturing.
We have launched reference kits with reference designs that show how you ingest the data, how you perform compute on the data, how you run AI, and how you get the post analytics. So these are reference kits that we are enabling, which we have even open sourced, so that customers can deploy their solutions by looking at these reference kits.
So Intel is making sure that not only are we investing in hardware, we are investing in software. And we believe a lot in open ecosystem because we believe for us to democratize AI, we cannot be walled gardens, we have to enable the ecosystem. And what better way than to open source whatever our innovations are, into the open source ecosystem so the customers and the developers can get the benefit of Intel goodness right out of the box. And that’s where our focus has been, Daniel. Just to make sure that we bring this adoption, we accelerate this adoption in the market.
Daniel Newman: Early days, like I said, lots and lots of opportunities for AI to be implemented on existing workloads, let alone the bigger scale. But let me end here because I do want to talk about, you are launching a data center focused GPU, Ponte Vecchio I believe has been the code name.
Talk a little bit about that because, I guess, there are workloads that maybe need a little more, or need more dedicated resources. Talk about what’s kind of going on there and how you see that being a potential value creator alongside the Gen 4 Xeon.
Kavitha Prasad: So again, for more AI deep learning kind of workloads or HPC plus AI kind of workloads, where it necessitates GPUs, we are investing in the Max Series (Ponte Vecchio is one of the code names) from a training perspective. And from an inference perspective, we have the Flex Series, which addresses the inference needs of the market.
So we are working on the GPUs. Again, with the homogenized AI software stack on top of it, so that between CPUs and GPUs you get the best performance with the homogenized software. We are launching the GPU products as well to address these market needs, Daniel.
Daniel Newman: And with oneAPI, I mean the idea is you’re going to create software and centralized tools for developers across the hardware landscape?
Kavitha Prasad: Correct. So that will act as the homogenizing layer for all of our heterogeneous compute elements, so that customers or developers can be as hardware agnostic as possible.
Daniel Newman: So to summarize everything, as I see it Kavitha, the future of every enterprise and just about every app we use is going to see layers of artificial intelligence implemented to improve performance, to also increase efficiency, and then of course, to make meaningful steps forward in experiences. And this is kind of that multi-layered cake of what business is; it’s a little bit like the layer cake of semis, right? More performance, less power.
Well, for every enterprise it’s about more performance, more efficiency, more scale, and AI is just such a straightforward toolbox that’s now available, and most businesses are really in their earliest innings.
So I started off asking you the question about how far along we are, and my take is very early innings. On the other hand, I just want to maybe end here with you: how much do you see acceleration of AI happening in just the next couple years?
Kavitha Prasad: It is going to accelerate at a fast pace, Daniel, to your point, right? Everything is getting infused with AI, be it email sentence completion or be it on the client devices. AI is just proliferating way too fast. And as we move from predictive AI to generative AI, the rate of adoption and the scope of enabling AI for workloads are just limitless. We are going to see it take off pretty exponentially.
But again, the key will be how do you make sure from a software perspective, the right tools are available, the right performance, and right KPIs are met? Because these KPIs also do vary whether I’m doing it on the Edge on the client side, or am I doing it on the Cloud side, the KPIs vary a lot.
So how do you get meaningful performance out of the box, meeting the KPIs of the customers, with ease of use so that your development to deployment time is limited? All these are going to play a huge role, but it is going to get adopted at a much faster rate.
Daniel Newman: Well, Kavitha, I love the conversation. Really enjoyed having the opportunity to talk about this. Anybody out there that’s not paying attention to the opportunities with AI is probably not paying attention, period.
It’s impacting our lives, whether it’s on our mobile devices or in the apps we use personally, but of course, in the enterprise it’s going to be such an important deflationary technology that’s going to drive companies into the future.
So with that in mind, I want to thank you so much for tuning in and joining me everybody. Kavitha, thanks for being on the show. Let’s have you back soon.
Kavitha Prasad: Thank you so much Daniel. Thanks for the opportunity.
Daniel Newman: All right, if you like what you see here on this Futurum Tech Podcast interview series, hit that subscribe button, join us for future episodes. I’ll put more information about all the Intel products and features that Kavitha mentioned in the show notes. So feel free to go down there, scroll around and click there.
We’ve done other episodes with the folks at Intel. We really enjoyed the interviews. Feel free to click on those. We’ll put those in the show notes too. For this episode, it’s time to say goodbye. Thanks for tuning in to the Futurum Tech Podcast interview series. I’m Daniel Newman. We’ll see you really soon.
About the Author
Daniel Newman is the Principal Analyst of Futurum Research and the CEO of Broadsuite Media Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.