How IBM is Using Supercomputing to Fight COVID-19 – The Six Five Podcast Insiders Edition
On this special episode of The Six Five – Insiders Edition, hosts Patrick Moorhead and Daniel Newman welcome Dave Turek, Vice President, HPC and OpenPOWER for IBM Cognitive Systems, to discuss IBM’s response to COVID-19 and how it will shape our future.
Tech for Good
IBM has always been a company built around solving problems. The corporate philosophy has always included the pursuit of tech for good. It’s fair to say that we are facing an unprecedented situation in our lifetime. IBM is working to coalesce not only product assets but services assets too that can help with the fight against COVID-19.
Many people might not know that IBM has a team of basic researchers in a wide variety of science fields. From chemistry and physics to electronics and astronomy, IBM researchers in facilities all over the world are working to target this problem.
Last month, the president announced the COVID-19 High Performance Computing Consortium, an effort sponsored by IBM, the White House and the Department of Energy. This consortium will bring supercomputing assets from around the world together for researchers to work on solutions for COVID-19. Not all of the assets are from IBM, but everyone involved is one team working toward a common goal.
History of Supercomputing at IBM
IBM has been working on supercomputing since the 1980s, building machines that have been used to solve the complex problems of the day. But, as Dave noted, IBM has also pioneered approaches that turned conventional thinking about supercomputer design on its head. In the late 90s IBM introduced Blue Gene, which ushered in a new era of high performance computing focused on energy efficiency. Blue Gene used tons of low-power processors to gang tackle the problems at hand, and it used only a fraction of the energy and floor space that other supercomputers required.
Roadrunner, which came a few years later, was the first petascale computer. It was built from AMD and IBM microprocessors, with Cell processors, originally developed for game consoles, serving as accelerators. Now, it’s quite common to see GPUs in supercomputers.
In 2011, IBM looked at future system design and concluded that it should revolve around data. Big data was starting to become popular in business. As a result, Summit at Oak Ridge brought together the concepts of high-performance computing, machine learning, and deep learning on one platform.
Summit and COVID-19: Supercomputing to Fight the Virus
Summit was launched in 2018 at Oak Ridge, where researchers set out to determine how they could take classical HPC methods and apply artificial intelligence to solve problems. Some of the early problems that researchers used Summit to solve were in the fields of biology and microbiology, so as COVID-19 manifested itself, it made sense to the researchers at Oak Ridge to look into the ways HPC and artificial intelligence could be used to understand the virus and find a cure.
Specifically, researchers are using Summit to identify the ways molecules and atoms interact in the virus. They can simulate the ways different compounds interact with it, leveraging supercomputing capabilities that drastically reduce the amount of time needed to understand the virus.
Researchers at Oak Ridge and the University of Tennessee have used the Summit machine and molecular dynamics to look at which compounds can inhibit the virus from replicating itself. They started with 8000 compounds and narrowed the list down to 77. That’s a 99% reduction, with the eliminated compounds ruled out computationally. These researchers are now able to refine their results and run more computational experiments that will ultimately guide the laboratory experiments needed to determine a cure or treatment plans for COVID-19.
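As a quick check on the arithmetic behind that 99% figure, here is a minimal sketch in Python. The function name `percent_reduction` is purely illustrative and is not part of anything the researchers used; only the two counts, 8000 and 77, come from the episode.

```python
def percent_reduction(initial: int, remaining: int) -> float:
    """Return the percentage of candidates eliminated from an initial pool."""
    if initial <= 0:
        raise ValueError("initial must be positive")
    if not 0 <= remaining <= initial:
        raise ValueError("remaining must be between 0 and initial")
    return 100.0 * (initial - remaining) / initial


# Counts quoted in the episode: 8000 starting compounds, 77 left after screening.
reduction = percent_reduction(8000, 77)
print(f"{reduction:.2f}% of compounds eliminated")  # prints "99.04% of compounds eliminated"
```

In other words, roughly 99 out of every 100 candidate compounds no longer need laboratory attention.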
Looking Toward the Future
The future of supercomputing won’t necessarily be about building something faster and better, but about taking the existing technologies that we have and reconsidering them to solve new problems. There will likely be a huge amount of innovation in novel ways of thinking about and characterizing problems. These approaches will have value somewhat independent of the underlying hardware, and the magnitude of what is returned to the user will be so great that the hardware won’t matter.
If you’d like to learn more about IBM and their response to the COVID-19 pandemic be sure to check out their website and explore the projects and resources they have. Also listen to the full episode below and while you’re at it, be sure to hit subscribe so you never miss an episode of The Six Five podcast.
Patrick Moorhead: Welcome to The Six Five Podcast – Insiders Edition. I’m Patrick Moorhead with Moor Insights and Strategy and I’m joined by our cohost, ever present cohost Daniel Newman with Futurum Research. And just as a reminder, on the Six Five Insider Editions we interview executives from the most relevant and influential companies across the globe. Daniel, how the heck are you doing?
Daniel Newman: I’m super excited to be here. I’m glad that we have this opportunity to get on. We love having some of these companies on here. In this particular episode we have, I’m going to steal your thunder, we have IBM. And this is going to be a really interesting discussion as a whole Pat. For anyone that’s listening to this show and has had a chance to listen to some of these interviews that we’ve done recently, you probably are aware, we are in the middle of March 2020, and this is one of the most unprecedented times in our history. We’re all at home.
We’ve talked endlessly on our different shows about how typically we’re not at home. Typically you and I are somewhere, anywhere else but at home, trying to get these episodes in, but COVID-19, the coronavirus, and its spread into a global pandemic have basically grounded us. It’s grounded many industries and it’s changing the world, and I think we’ll have a chance to talk a little bit more about that today. But besides that, Pat, I’m feeling great.
Patrick Moorhead: Awesome. So let me introduce our guest. Daniel, you introduced IBM but I want to introduce our guest Dave Turek, Vice President of HPC and OpenPOWER for IBM Cognitive Systems. And by the way, if any of you have followed supercomputing for the last few decades, you will know Dave, but for those people who don’t know Dave: Dave, can you introduce yourself?
Dave Turek: Sure. Thank you. So I’ve been involved in supercomputing at IBM since the late 1980s. We were doing things with vector processors then but transitioned and really popularized parallel computing in the 90s with the SP program. A lot of your listeners may remember that as Deep Blue, the system that was modified to beat Kasparov in chess. Subsequently, I was involved in helping to start Linux at IBM, Linux clusters at IBM, grid computing at IBM. I’ve been involved from the very beginning with the Department of Energy on the ASCI program, the Accelerated Strategic Computing Initiative, which has deployed supercomputers to model nuclear weapons for stockpile stewardship. I did the Roadrunner system at Los Alamos, the first petaflop system, the Blue Gene family, and the more recent CORAL program systems at Oak Ridge and Lawrence Livermore.
Patrick Moorhead: So it’s safe to say that if you have been familiar with supercomputing, you know Dave. And if you don’t, you do now, and you know it is incredible when you look at the history of what supercomputers have been able to do. They truly have revolutionized science and really brought out some of the biggest discoveries that we may just take for granted right now.
Daniel Newman: So Dave I’m going to go ahead and jump in here. We have a few different things we want to chat with you about here. And I want to start off talking a little bit about IBM, as a whole, so we are in, you know, I mentioned earlier in these unprecedented times. Things have totally changed. IBM has always been a company that’s believed in tech for good. So as we’ve sort of entered this new era, this remote work era, this lockdown phase where society is trying to find its bearings what’s going on at IBM? How is the company approaching this? As much as you can share internally, externally, what’s the status?
Dave Turek: So, you’re correct. IBM has always been a company that’s been oriented to solving big problems. This is probably the biggest problem in our lifetime and maybe a few lifetimes before that. It’s in our wheelhouse in terms of the complexity and the effort required. What we’ve been doing is we’ve been coalescing around a set of teams in IBM, that have been harvesting different kinds of assets that can be brought to bear against the problem. We have not only product assets, if you think about it, but services assets. One of the interesting things in COVID-19 is issues with respect to system resiliency. How do you keep computing infrastructure up and running, especially for critical parts of the economy. So we have a lot of offerings and a lot of effort being brought to bear on the problem through mechanisms like that.
We also have substantial research facilities all over the world. And all those researchers are really getting geared and directed to work on COVID-19 related problems. Many of your listeners may not know this, but we do basic research in physics, chemistry, material science, obviously electronics, even astronomy. It’s a whole vast array of scientific endeavors and it’s that diversity of technical skill that can really be brought together and synthesized to target a problem with this degree of complexity. We have thousands of mathematicians, and they’re all capable of making significant contributions to this.
So, it’s the assets that we have, but it’s also a conceptualization of new ways to make these assets available. On Sunday this past week, the President, in his address to the nation, made comments about the launch of the so-called COVID-19 High Performance Computing Consortium, which is an effort that was sponsored and initiated by IBM, the White House and the Department of Energy, to bring supercomputing assets from around the world together, to be made available to researchers to work on COVID-19. Not all of those assets are ours; it doesn’t matter. We’re all one team here, and we’re just trying to find a way to bring our expertise to bear and help everybody get to the bottom of the solution to this problem as quickly as possible.
Patrick Moorhead: Dave, I’m glad you brought that out because it really does take a village here. I’ve been doing this for a mere 30 years, but I’ve been tracking IBM’s supercomputing history, and maybe it would help a bit just to do a really quick hit on things. Like, you mentioned Blue Gene, but also Roadrunner and Sequoia, and then we can transition directly into Summit.
Dave Turek: I think when you look at the history of supercomputing, to me, there are generations that mark degrees of progress. Parallelism, which is the way we use supercomputing today, and which was probably pioneered most notably by Thinking Machines back in the late 80s, is the pervasive way in which people approach these scientific problems today. So if you think, going back to let’s say 1987 or so, that’s a paradigm that’s been in use from that point to now. Prior to that there were vector approaches and other approaches pioneered by the original Cray company and companies that no longer exist, for that matter, CDC, etc.
Two other ideas, I think, have come forward in the last 20 years or so. Blue Gene really popularized the notion of using tons of low power processors to gang tackle the problems in front of us. If you look at dates, Blue Gene effectively was launched in the summer of 1999. At that time, everybody thought the way you build supercomputers was with big, powerful processors. And this was 180 degrees opposite from that. By virtue of doing that, though, we managed to really bring to the fore the notion of energy efficiency, and to really popularize this radical scaling to extraordinary degrees, so you now had tens of thousands of processors working on a problem.
Roadrunner, which came a few years after that, was the first system that we deployed that really used accelerators. We built that system out of AMD and IBM microprocessors, and the accelerator was the Cell processor that, at the time, IBM, Toshiba and Sony had designed for game consoles. We reapplied that to supercomputing applications. As a result, Roadrunner became the first petascale computer on the planet. Then post that, of course, we’ve been leveraging those ideas and evolving them, so now it’s quite customary, for example, to see GPUs used in supercomputer designs.
I would say there’s one other event that I think is significant in the evolution of supercomputing, and I’ll trace this back, at least in IBM terms, to 2011. In our conversations with the Department of Energy about so-called Exascale computing, the next big hurdle that everybody wanted to get over, we made the determination that the design parameter for the future, more than anything, would revolve around data. Big data had become kind of the watchword at that time. We looked at it and said, what does that mean in the context of supercomputing? It was fortunate that we did that, because by virtue of the architectural changes and things we implemented in systems like the Summit system at Oak Ridge, you now have a system that, courtesy of the popularity of machine learning and deep learning, is able to bring concepts of high performance computing, machine learning and deep learning together on one platform. It was our observation of how critical data would be that led us to make those architectural changes inside the systems to really make that kind of an accommodation.
Daniel Newman: So that takes us to the current system, Summit. You know, this is kind of what’s new and what IBM is really reacting to and talking about in the market now. Can you talk a little bit about what’s going on with COVID-19 and how Summit’s playing a role?
Dave Turek: Right, so Summit was installed operationally I think in June of ’18, part of the Department of Energy’s CORAL program, Exascale program. And from the very beginning, the researchers at Oak Ridge (and Oak Ridge, by the way, is the most significant science lab in the Department of Energy portfolio of laboratories, with maybe thousands of researchers there covering a vast, vast array of scientific inquiry) immediately began, upon deployment, to explore how they could take classical kind of HPC methods, which are characterized as simulation and system representations in mathematics that need to be fathomed to get an understanding of how a process or system works, and also begin to explore the possibility of bringing artificial intelligence to bear on these classes of problems.
So historically, for example, you might look at solving a problem in chemistry by applying computational chemistry approaches. And now people are saying, well, what if we use AI approaches instead and put a focus on data, more so than even mathematics. Using the techniques of AI to tackle problems.
Oak Ridge saw that opportunity from the very beginning and, from the onset of our delivery, began to explore the interplay of artificial intelligence and HPC, HPC defined in the classical sense. And some of the very early applications of that were in the biology area, material science and so on. So, it’s natural then that as COVID-19 manifested itself, the people interested in exploring that would look at the experience on Summit and the experience in the fields of biology, chemistry, biochemistry and even physics, to begin to think about ways in which supercomputing computational techniques, or even artificial intelligence techniques, could be brought to bear on that problem.
So, researchers at Oak Ridge in Tennessee got together and they began to look at COVID-19 to see, well, what kind of compounds might be really effective in treating infection from COVID-19? That’s an English language statement, but the approach taken was molecular dynamics, so they go down and they look at the interplay of molecules and atoms, at the timescales at which those things interact, and they map onto that all the different forces that are in play in terms of how molecules and atoms interact. A huge amount of computing is required for that.
And they begin to simulate the behavior of different compounds in the context of the COVID-19 virus. That computational effort is just massive, and being able to leverage the computational capabilities of Summit was really instrumental in terms of dramatically compressing the amount of time required to begin to get an understanding of that problem.
Patrick Moorhead: So essentially, the previous generations before Summit were all about flops, and with Summit, sure, flops are important, but to be able to better size and assess the data, machine learning came into play, where you might be able to do more because the system is smart enough to focus on a smaller set of data. Is that how that works?
Dave Turek: Well, I guess the way I would characterize it is this: we’re in an era in supercomputing that I would characterize as an era of decomposition of problems. Some people use the term workflow, but it doesn’t matter, it’s all the same thing. And what I mean is that as you look at a workflow, or at an HPC problem, it’s not the case that every element of that problem is best suited for a solution with the same technology. So what you want to do is cut it up into pieces and say, look, this is really good in terms of a parallel approach to solving this particular algorithm, but you know, this part of the problem is all about I/O and data and data manipulation. I don’t need that kind of instrumentation to solve that problem. And then there are other parts of the problem where you look at it and say, I really need an accelerator here, this really is a max flops kind of issue. And then you have another approach that says, now we’re going to forget the flops entirely. What we’re going to do is deploy a machine learning model, and just teach the system how to interpret what it’s observing by just hammering it with tons and tons of data. So it gets back to this big data notion that we embedded into the CORAL system to make sure that you have the opportunity to always handle voluminous amounts of data.
Patrick Moorhead: That’s awesome. It’s probably going to change the landscape of supercomputing for the next 20 years, maybe before we get to quantum. Bigger picture: the initiative, the team led by Dario Gil, can you talk through what the mission and goals are? It’s simple to say find a cure, but there’s a lot of sub-objectives I’m sure before we get there.
Dave Turek: That’s right I mean I think the overarching ambition is finding a cure or finding therapeutic agents or prophylaxis, anything that will remedy the situation that we’re in, and those are not mutually exclusive. Scientifically they could all be connected and need to be explored that way. Dario’s mission is quite broad. So as I alluded to earlier, he’s got people looking at system resiliency, how do you keep things up and running, security issues that are related to the outbreak, the fragility of different kinds of computing infrastructure deployed to different kinds of clients around the world. How do you deal with these kinds of things?
So COVID-19 is not simply a biological problem. If you think about it, it’s a real shock, in terms of, obviously the way people work and behave, but also the way we think about the systems that we operate to conduct society. You know you can talk about for example, deploying a system to help in law enforcement, but if you suddenly have a model where you no longer put law enforcement officers in the field, well, that’s an idea that needs to be rethought. So there’s a real radical reconsideration of just about everything you can think of. And I will tell you, anecdotally, I sat in a meeting, it might have been Monday this week, time is getting stretched and compressed in interesting ways. We looked at a collection of assets that IBM had to bring to bear on COVID-19, and it was 80 pages long. 8-0. That’s just a tremendous set of different kinds of assets that can be brought to bear. And the trick is, well one of the most important things, what’s needed first, where can we have the greatest impact. And that’s the sort of spirit that imbues this whole effort.
Daniel Newman: So, just curious. I mean, it’s still early days, but we’re in the middle of the storm, I would say, to some extent. And if the answer is not yet, I totally understand, but is there any sort of progress that IBM, or that you, are starting to see? Because like I said, most people are kind of looking for the answer and, as you said, it’s very much a process and it’s going to have a lot of small steps along the way.
So those 80 pages, which, like you said, is a lot (unless it’s a legislative bill, and then it’s not very much), is a ton of resources. And I’ve actually had a number of arguments, Dave, with people outside of the tech community over the last few weeks about how what we do doesn’t matter, you know, it doesn’t matter right now. Tech isn’t important right now in the middle of a biological world pandemic crisis. I’m talking about lawyers and doctors and people that are not in the tech field, but I keep telling them the tech is super important. What we do makes a big difference. So even a little progress would just be great to hear, and like I said, if it’s too soon I totally understand.
Dave Turek: So, yes, there’s progress going on. I’ll give you a little vignette of what I mean by that. So Jeremy Smith at Oak Ridge National Lab and Micholas Smith of the University of Tennessee got engaged to use the Summit machine to look at this issue of molecular dynamics applied to therapeutic agents for the treatment of COVID-19. What that means is, what compounds, what molecules can be brought to bear that will, for example, inhibit the ability of the virus to replicate itself. If it doesn’t replicate, you know, you’re not going to get sick. Not terribly different, by the way, from the strategy used against HIV, you know, a few decades ago. The disease is here, all right, so if we can’t get rid of the virus completely, can we preclude it from replicating itself and causing health problems?
So, here you have a bunch of tech guys, and they’ve got access to the biggest supercomputer in the world. They start looking at this and they start with a list of 8000 compounds, and by applying these techniques and molecular dynamics they cut it down to 77. That’s a phenomenally important step. Because the difference between 8000 and 77 is 99% right? So you got a 99% reduction of things you no longer have to look at, because you’ve already shown computationally that the likelihood of having any useful impact is close to zero.
So now you say, all right, with all the resources I have at my disposal, I can now get focused, I can focus on 77. And I can just continue to add more resources to that and progressively do more and more refinement on the behavior of those compounds, until I get clarification to a degree where I can start actually going back to a conventional laboratory, if you will, and start testing the compounds against the live virus. That whole, I’m going to use the word a little bit out of context, that whole sort of supply chain, if you will, is in place. So the computational scientists at the front end, they’re examining this data. They’re making judgments. They’re refining it, they’re getting input from a bigger scientific community reviewing their results. They’re running more computational experiments, and they’re getting to the point where eventually, and they’ve already made agreements with the laboratory in Tennessee, they’ll be able to give a laboratory insights on the kinds of experiments to run, based on their computational analysis of compounds that might be very effective in treatment of COVID-19. So, I can’t think of anything more important than that right now from a tech point of view.
Patrick Moorhead: Yeah, it’s a lot about education and just getting out there, and I think that’s part of what the Six Five Insider Podcast is about, and we’re blessed to have you on, Dave. Now, I’m going to assume that we’re going to figure this out. But it does beg the question, and there’s two problems: there’s the short term problem right now, and then there’s the long term. Can you talk a little bit about the future? What are you thinking of in terms of post-COVID, and the opportunities for supercomputing? I mean, I’m not a rocket scientist, but on Twitter I gave my top 10 things that accelerate and things that decelerate. I don’t think there’s any question that supercomputing will accelerate, but there’s other technologies that I know that IBM is involved in that could be even more useful than the von Neumann type of computing that we do today.
Dave Turek: So, I think there are a couple of different ways to respond to the question. First of all, as we look over the course of time, we think the fundamental paradigm, the future of computing, is embodied in this notion of classical computing, quantum computing and artificial intelligence. So bits, qubits and neurons, to put it in the vernacular. And that gets back to my comments earlier about looking at problems from a perspective of how do you best decompose them to get the best tool at your disposal to work on the problem at hand. Our view has been that these things will work in concert as we go, in time. You’re not going to see a standalone quantum computer just doing quantum stuff.
The more appropriate sort of analog would be to think of it as an accelerator to a classical computer, because by and large, the vast, vast majority of all computing that we’ll do into the future is going to be classical in nature. But then again, you look at the infusion of AI, neurons, neuromorphic designs, all these different kinds of things a lot of people are working on, including IBM, and you see how that begins to play a role to kind of reshape what the nature of the classical environment is and how it approaches problems.
So, let me try to make this a little more concrete. Supercomputing for the last N number of decades has been, you know, crudely put, go fast computing: I have an application today, I’m going to make it run faster tomorrow. And that paradigm has caused the emphasis on floating point, networking and so on and so forth. Okay, well, we’re getting close to the end of the line in terms of technology. Right, so we know how to get to five nanometers and maybe three, but you also know that the cost of making these jumps is getting horrific, and one could argue how profound the benefit is, but we do know one thing: there is no zero nanometer technology. So there’s this sort of barrier that we’re coming up against. And you say, well, so, how do I make things work better? We’ve deployed machine learning software based on Bayesian methods that will look at ensembles of simulations. By the way, molecular dynamics is run with ensembles of simulations. And by applying Bayesian techniques, we’ve been able to reduce the number of simulations in some cases by 90%. So you look at that and you say, well, that’s a pretty rich thing to consider, because I didn’t have to consider technology. I didn’t have to make multibillion dollar investments in fabs. I didn’t have to worry about new materials. I just reconceptualized the problem in a way that allows me to leverage the kinds of technology that we’re familiar with today to get a profoundly speeded up kind of result. And I think you’re going to see a huge amount of innovation in that area, the incorporation of novel ways of thinking and characterizing problems that are going to be shown as having value somewhat independent of the underlying hardware, but the magnitude of what is returned to the user will be so great, it won’t matter. Right?
Patrick Moorhead: Dave, we’re coming up on time right now and I have to say, after about 30 minutes my brain is about two shoe sizes bigger, and even though you and I meet a couple times a year and talk, it’s always a journey. And I’ll admit, I feel a lot safer and a lot better with people like you and companies like IBM doing this, and this isn’t some fad, it’s something that IBM has been doing forever, and I salute you and everything that you do. I really appreciate that. To all the engineers at IBM who are tackling this, hats off to all of you.
Dave Turek: Thank you.
Daniel Newman: I think it was really fascinating, too, to learn, because I would say the average person out there probably has no idea, Dave, about the things that you mentioned: astronomy, you know, working on traditional chemistry and biology, and these challenges that the tech community and the largest companies in the tech community are involved in. Because as we’re talking about solving problems like this, it is really a combination of industries, a combination of capabilities, of science and opinion and so many other fields, right? And that’s just it, as I watch this world kind of evolve, Dave. You really do hope that we can all come together, because I think that’s probably one of the biggest challenges right now: as information and social media have changed, getting to the bottom of good information has become harder than ever. So, one thing I think we really do hope, as we debate the world and we all talk to people and have our sides and our views, is that we can come to agree on science as the basis of information, which sometimes I’m puzzled about how hard it actually is to get there.
Patrick Moorhead: Daniel, we’re going to cut this off right now but for everybody who tuned in, we really appreciate that. And we appreciate IBM sponsoring this podcast. Don’t forget to check out the show notes that are down at the very bottom. We think they’re valuable resources if you want to know more about what IBM is doing in the space. So if you liked what you heard, click that subscribe button. Until next time, this is it for the Six Five Insider.
Disclaimer: The Six Five Insiders Podcast is for information and entertainment purposes only. Over the course of this podcast, we may talk about companies that are publicly traded and we may even reference that fact and their equity share price, but please do not take anything that we say as a recommendation about what you should do with your investment dollars. We are not investment advisors and we do not ask that you treat us as such.
Image Credit: Forbes
Daniel Newman is the Chief Analyst of Futurum Research and the CEO of The Futurum Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.