On this episode of The Six Five – On The Road, hosts Patrick Moorhead and Daniel Newman are joined by Bratin Saha, VP and General Manager, Machine Learning Services, Amazon AI. Their conversation covers the latest in machine learning, which is the M in re:MARS.
Their conversation revolves around the following:
- Machine learning is no longer a thing of the future
- How machine learning can be used to solve previously unsolvable issues
- The exponentially increasing job opportunities in the growing ML field
- How AWS is making ML accessible to everyone
Disclaimer: The Six Five Webcast is for information and entertainment purposes only. Over the course of this webcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we do not ask that you treat us as such.
Daniel Newman: Hey everybody. Welcome back to another episode of the Six Five podcast, on the road here at Amazon re:MARS. re:MARS, Pat, is a fun, high tech, fast moving, futuristic event. And we have had the honor of having some really great interviews and we have another one today, but before I introduce our special guest, you ready to rock and roll?
Patrick Moorhead: I’m ready. Yeah. My name is Patrick Moorhead. Co-founder of the Six Five and founder of Moor Insights & Strategy, but I got to tell you, Daniel, I am super excited that we’re doing this in person, but listen, I want to get to our guest. Bratin, how are you?
Bratin Saha: I’m good. Thank you, Patrick. Thank you for having me.
Patrick Moorhead: Yeah. Are you having a good week?
Bratin Saha: Yes, it’s been a lot of good news and lots of excitement, exciting announcements.
Patrick Moorhead: I mean, an event like this with all these cool technologies, I mean, it’s, I feel like I’m a kid in a candy store. It’s just been great, but listen, we had made our introductions. We know each other, but would you mind talking about what you do for AWS and how your history, prior history, got you prepared for this role?
Bratin Saha: Sure. So at AWS today, I manage, I lead all of the AI and machine learning services. I’m VP for all of AI and ML services, spanning all three layers of the stack. As you know, the ML infrastructure and the engines and the SageMaker layer and all of the AI services and solutions we have. And then prior to that, I was at Nvidia, VP of Software Infrastructure. And then prior to that, I was at Intel working on a variety of products on distributed computing, image analytics, cognitive computing, and so on.
Patrick Moorhead: Gosh, how did we never meet before? Right? I spent 11 years at AMD and yeah, it’s pretty wild.
Bratin Saha: Yeah. And then prior to that, I did my PhD at Yale in computer science. And then after that, while I was at Intel, I also went to Harvard Business School. So I’ve been through research, products, built businesses, and now I’m leading AI and ML at Amazon.
Patrick Moorhead: You’re one of these really scary people who is super technical and also has a business degree. And that kind of seems to be the winning combination there. But now I’m going to… This is pretty wild, and I was even in manufacturing for 10 years of systems where Intel was… I was Intel’s biggest customer. So here we are.
Bratin Saha: Yeah. I mean, it’s great to meet you and especially at this place where we are seeing a lot of the future unfold.
Daniel Newman: Yeah. It’s for sure. It’s great how they’re tying it all together, too. Like I said, the coolest things I’ve seen here tend to be a bit of a culmination of everything that MARS is, the machine learning, automation, robotics, space, and other things, by the way, things, quantum… I mean, this is at a little bit of a quantum tune going on here. And by the way, also a whole bunch of things that require Harvard and Yale and an education. I can’t quite say. I’m not going to read off mine. I mean, I do have an MBA, but I’m super impressed. I’m just, I got to be honest. I’m slightly intimidated, but super impressed. But we interview a lot of really smart people. I mean, we talk to probably 50 of the Fortune CEOs every year on this show. So Bratin, you are super smart. No question about it.
We’re going to ask you some hopefully thoughtful questions and we’re going to take it in stride here, and we’re going to get that little, like, what do they call it? That real world PhD you get from talking to smart people all the time and learning? But one of the things I heard you say, you mentioned something along the lines in the executive briefing that you did for the analysts was kind of about, we keep iterating about the future, but ML is really the now. I mean, ML is hot. It’s happening. Talk a little bit about the ML space and the problems that are being solved today, because I sometimes think people don’t realize just how prevalent ML is in our everyday life.
Bratin Saha: Yes. And I like to say that ML, machine learning, is no longer the future. It’s the present and here and now. And customers across every industry, financial services, healthcare, media, software, entertainment, are using machine learning in a variety of ways. And today, more machine learning happens on AWS than anywhere else. And we have more than 100,000 customers using our services and they’re using it in all kinds of ways. It’s really, really pervasive. So I’ll give you a few examples. Georgia Pacific uses machine learning to help their operators determine the speed at which the paper roll should be used. And they’ve been able to improve paper quality by 40%. Then Koch, one of the largest private companies in the world, has been using machine learning to do predictive maintenance, where they use one of our services, Amazon Monitron, that uses machine learning to detect abnormal vibrations. And then, we have a lot of customers using SageMaker to do machine learning for all kinds of use cases in finance and personalization and all that.
And so what we are seeing… We have Anthem that uses Amazon Textract, which is a document processing system, and they have been able to automate almost 80% of all the claims processing. And then Intuit, they have been analyzing more than 200 million minutes of customer interactions each year using our machine learning services. And then Discovery. Discovery is using Amazon Personalize to deliver a personalized experience on the Discovery+ app. So what we are seeing now is a very broad, pervasive use of machine learning across multiple industries and across all geos.
Patrick Moorhead: Yeah. Incredible progress. But I have this hunch that having 100,000 customers is not where you want to stop. It’s a great, maybe if we’re talking about a baseball game, maybe it’s the second inning.
Bratin Saha: Or the first innings.
Patrick Moorhead: Yeah. Yeah. But we have made a lot of progress now. Not only do we have a global supply chain issue, we also have a supply issue with skilled workers who are smart enough. We talked a lot about the education of that in AI and ML. How is that impacting the rate of innovation and the uptake of that?
Bratin Saha: So there are multiple things that we are doing to help make machine learning easier. There is an aspect around educating and getting people up to speed with machine learning. There’s a second aspect around how do you make machine learning easier, right? And so, for example, last year at re:Invent, we launched SageMaker Canvas. That’s a no-code tool for doing machine learning. You can do machine learning without writing a single line of code. And we have been very pleasantly surprised by the traction of Amazon’s-
Patrick Moorhead: [inaudible 00:07:00] Okay. So it’s no code?
Bratin Saha: It’s no code.
Patrick Moorhead: Okay.
Bratin Saha: And so the system does all the data preparation for you. It does all the model training for you, does all the model deployment for you, and you, as a user, have to write zero lines of code. Okay? And we have seen the… We have been really pleasantly surprised with the uptake of it. Then, there are all of these AI services, like I mentioned, Amazon Monitron, Textract, Transcribe, where you… It exposes APIs. So as a software developer, you can use it, but you need no machine learning expertise. So that is all of the stuff we are doing to make our products easier to use.
Then, we have the Machine Learning University, where all of the courses that we use internally for Amazon engineers, we are making them available for free to everyone. And then we are partnering with Coursera, for example, and Udacity to provide training for machine learning. Then we have the Machine Learning Solutions Lab, where our data scientists work with customers to make this happen. So really, we are looking at it very broadly, starting from making machine learning easy. Let machine learning be done with no coding. Let machine learning be done with no machine learning expertise, then the training and the education aspect of it.
Daniel Newman: Yeah. So there’s a ton to unpack there and you kind of alluded to what I was going to ask you. So I’m going to go off the script a little bit and make everyone in the room a little uncomfortable, but you said ask me anything.
Bratin Saha: Yeah, totally.
Daniel Newman: All right. So totally. So what should I eat late… No. Kidding. Kidding. In all seriousness, though, offline, we were having a little bit of a debate. For instance, something like COVID’s been a fascinating topic for machine learning and AI because we’ve got this huge problem, which by the way, is still pervasive. We’re doing a little better now, but we’re still dealing with spreading. And we were talking in a little group about how we’ve got all this data. We’ve got millions and millions and millions of cases now. And we talk all the time about high performance computing. We talk about the data services, databases, AI, ML.
That’s a practical problem. We talk about making things simpler. How do we take something like that? Where’s ML’s role in taking something like that and actually starting to give us better information about why do some people get it and why don’t? Because it seems like it’s been a few years now and the mystery around it maybe has gotten bigger, not smaller. And I know that’s the kind of the scientific equation and that does happen, but why are we not getting a little bit more consensus? Because if machine learning, at least as I’ve learned about it, so much of it is having enough data. Seems like we should.
Bratin Saha: So machine learning is actually playing a big role in medicine now, in healthcare now. And we have services there as well, Amazon HealthLake and Comprehend Medical. And we are seeing machine learning being increasingly used for precision medicine. So it’s treatments that are geared towards individuals where you can actually take genetic information and couple that with the history, with the health record of an individual, and provide targeted treatments. Even in the case of COVID, machine learning was actually used because you have so many different permutations of what can be done. Machine learning was actually used in the development of some of these medicines. So a lot of the progress, and the medical community obviously did an amazing job with the vaccines and the medicines, but a lot of it was also fueled and powered by machine learning. And we are going to see even more of it because precision medicine, where machine learning is being used to help couple your genetic information with other health records and then make predictions about your health, that is actually also being worked on by a lot of companies.
Daniel Newman: It seems I didn’t put him on the spot enough. He had a pretty good answer.
Patrick Moorhead: Oh, I know, ask him a harder question. He seems to-
Daniel Newman: [inaudible 00:10:46] Well, it seemed reasonable. I mean, we were out there having a little debate about this I’m sitting here.
Patrick Moorhead: But that was you and I, though, not with him.
Daniel Newman: No. And that’s what I said. I mean, he literally, he wasn’t prepped at all. He just came out strong. I like it. I mean, I’m still at the end. I still want to know if I’m totally immune, because I’ve been exposed so many times and never got it. So I just want to know if there’s a machine… Is there an algorithm that we can do that can quickly get me that answer?
Patrick Moorhead: That. Well, he’s going to answer that, but now we’re going to move to another real question here.
Daniel Newman: Yeah, yeah.
Patrick Moorhead: I’ve heard you use the term industrializing AI and ML. And I think I know what you mean by that, which is to scale it, but, and you talked… You rattled off a few great things that you’re doing. What else are you doing to industrialize it? Let’s get to the million customers, right? You talked about, you have the layer cake, right, with AI services. You have your middle layer and then you have bare bones IaaS, right, for people who have data scientists and who know how to do all this stuff, leave the driving to them, to at the top of your stack, leave the driving to you, or a little bit more of it, the more I’ll call them PaaS services. But how are you going to industrialize this?
Bratin Saha: Yeah. And so let’s take a step back and see why we need to industrialize and then I’m going to get to that.
Patrick Moorhead: Okay.
Bratin Saha: So four or five years back when we started, customers would deploy maybe half a dozen models, maybe a dozen models at most. Today, we have customers who want to deploy tens of thousands of models. Four or five years back, if you were at the bleeding edge, the state of the art models, they had 10 million parameters, 20 million parameters. Today, it’s hundreds of billions of parameters. And we are getting maybe two trillion parameters. And today, we label more than a million objects a day. Okay? So when you get to millions and billions and hundreds of billions, that is where the industrialization comes in. And I kind of look at it as what the auto industry went through, what software went through.
So we are looking at it along a couple of different vectors. One is how do you make it easy to just automate the whole thing, make it repeatable so that you reduce errors in it. And that is where we are building in product features like SageMaker Pipelines that allow you to automate the whole thing, checkpoint it, introduce various kinds of checks, make it repeatable. The second aspect of it is as machine learning is being done on bigger models, more data, you need to be able to make it more performant because the amount of complexity in these models and data is increasing by an order of magnitude every year. So what we are doing is working on performance optimizations, lowering the cost. That’s the other aspect of it, because if you have to do it at scale, you have to lower the cost.
And then the third aspect of it is making products easier. So when I alluded to SageMaker Canvas before, previously our products were, let’s say, geared towards machine learning practitioners or software developers, but you have analysts, you have marketing professionals, finance professionals who want to use this, who don’t want to write the code. So the third aspect of it is how do you get the products to reach a different persona who deal with data, but they may not have the expertise to do machine learning, right? And then, of course, the fourth part, the final part, is the education aspect of it with Machine Learning University and so on. So it’s really about first, make it possible to do machine learning at scale with the right performance and cost. That’s one aspect. Get our products into new customer personas who may not have coding skills. And then third, have a broad based education, so we can get more people involved with doing machine learning and get to the next million customers.
Daniel Newman: So I have a million questions, but we only have about five minutes with you. So I want to get to your announcements. You had a couple of exciting things to share. I’m going to give you the platform and then maybe give us a chance to react.
Bratin Saha: Sure. Thanks for that. So one of the things we are doing is synthetic data generation with SageMaker Ground Truth. Customers often today… If you have to train your models, you need a lot of data. Now acquiring a lot of this data can be difficult. And so we are allowing customers now to generate data synthetically. So they come up with some amount of ground truth data, and then they generate more of it. So think of it, for example, in manufacturing use cases, you want to generate data. You want to generate a lot of examples of manufacturing defects, and so that you can generate synthetically.
The next is a continuation of a journey of introducing machine learning for software development. And so for that, we are launching CodeWhisperer. CodeWhisperer has been trained on a lot of open source models, Amazon models, and so on. And it looks at the code that you’re writing. It looks at the comments that you have, it understands the context, and then it automatically suggests code pieces for you. So it’s almost like a pair programmer. You have a companion that is looking at what you have done, understanding what you have done, and then suggesting code pieces that you can add in.
Patrick Moorhead: Mind blowing. I mean, in a way, I mean, you’re using AI to create AI with CodeWhisperer?
Daniel Newman: Code that sort of helps write itself?
Bratin Saha: AI too. We are using AI to help you create code. And you, as a programmer, can just accept all the suggestions or you can make some edits and accept the suggestions. So yeah, we are basically giving you a code companion that makes your job much easier.
Daniel Newman: The low code analogy, Pat, is when we write now, most of those apps will kind of try to finish your sentences for you. It’s sort of the analogy as for coders, as they’re writing, it’s sort of helping them finish their sentences. And by the way, who wouldn’t like that. Of course, you’re going to need all these developers to train it. They’re going to keep training it because it’s like, “Did it get it right? And did it put the period or the, whatever, the hash sign in the right spot.” But I was reading it and I’m like, “Oh, I could probably…” This actually would be the way I would learn because I always used to, when I learned to code, Pat, I would reverse engineer. I would look at HTML or something, how they coded it. And then I would kind of put it like, “Oh, this made this happen.” So it’s pretty neat, though.
Bratin Saha: Yeah. It’s a very innovative piece of work and it just pushes… When we launched CodeGuru a couple of years back, that was using machine learning to detect bugs. And so this just takes a real step, a leap forward, where we are using machine learning to generate code.
Patrick Moorhead: And is it in real time for the user?
Bratin Saha: It’s interactive. So as you’re typing things, it’ll keep giving you suggestions. So it’s not something that you do offline. It happens as you’re coding.
Daniel Newman: Yeah. It looks like the suggestions you get when you’re writing.
Patrick Moorhead: Right.
Daniel Newman: It’s actually how it looks.
Patrick Moorhead: See, I didn’t want to use that type of thing; it kind of demeans it in a way.
Daniel Newman: Yeah, I know. I mean, I dumbed it down to my level.
Patrick Moorhead: No, but I get it. And by the way, on the synthetic data generation, the first thing I thought of was what the auto industry is dealing with right now for self-driving cars. And at least the industry says, and in the trucking, autonomous trucking, is that you needed eight million miles of driving to be able to do that. Now that’s really challenging. Okay? And you don’t just end there. It’s this perpetual thing. And for, let’s say non-automotive, right? You talked about widget defects in a manufacturing line. Being able to accelerate that with synthetic data seems like a really good thing to accelerate model creation and actually implementation. Am I looking too deeply into that or-
Bratin Saha: [inaudible 00:18:31] No, that’s really… I mean, in many situations getting enough labeled data is the bottleneck and you can’t really go in and create a lot of defects, right? So if you’re able to synthetically generate data, you’re effectively getting labeled data much more quickly and more cost effectively. And if you can get data quickly and cost effectively, that really accelerates your machine learning innovation, because ultimately, it’s about data.
Daniel Newman: And synthetic data is going to be the key to expeditiously getting to this metaverse opportunity, especially in the industrial applications. We need more data to be able to move faster in terms of product development or creating spaces and environments or engineering product, building buildings, all kinds of things.
Bratin, so great to talk to you. Time went fast. I think Pat and I are sitting here thinking I could ask you five or 10 more questions, but we’ve got a schedule. I’m sure you have a schedule with a whole bunch of meetings, but man, what a terrific Amazon re:MARS. Six Five On the Road at the show. So glad to have the opportunity to sit down with you, some of your peers, and just discuss what’s going on in machine learning, automation, robotics, and space.
Bratin Saha: [inaudible 00:19:42] Thank you. Thank you for having me.
Patrick Moorhead: Thanks for coming on. We’d love to talk to you next year to get a point on how far you’ve come.
Bratin Saha: Thank you. Look forward to that.
Patrick Moorhead: Sounds good. And we also have re:Invent. We’ll be there, too.
Bratin Saha: Yeah. Yeah. Well, I hope to meet both of you there.
Daniel Newman: We’ll see you there, Bratin.
Patrick Moorhead: Excellent.
Daniel Newman: Thanks. I’ll see you all later.
Bratin Saha: Thank you. Bye-bye.
Daniel Newman is the Principal Analyst of Futurum Research and the CEO of Broadsuite Media Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.