AWS Announces NVIDIA GPU-Powered G4 Instances for its Elastic Compute Cloud

The News: On Friday, September 20, news came out of Amazon Web Services (AWS) about a renewed and expanded partnership with industry-leading Graphics Processing Unit (GPU) manufacturer NVIDIA to offer improved GPU-based cloud instances.  The adoption of GPU technology has expanded from the use of these specialist processors solely for graphics acceleration to other commercial uses, ranging from blockchain mining to the inference engines at the heart of Machine Learning applications.

Analyst Take: This is an important announcement, as we see best-in-class cloud services meet best-in-class machine learning capabilities, bringing AI-as-a-Service into a consumption model that continues to grow in favor.

Let’s unpack the news piece by piece.

GPU-Powered Elastic Compute Cloud (EC2) Instances

The announcement centered on general availability of the new G4 instances, a new GPU-powered Amazon Elastic Compute Cloud (EC2) instance type designed to accelerate machine learning inference and graphics-intensive workloads.  These computationally heavy tasks increasingly demand different compute architectures to operate at the performance levels mandated by Machine Learning use cases.  According to the announcement, G4 instances provide the industry’s most cost-effective machine learning inference for applications such as object detection, metadata tagging for images, automated speech recognition and language translation.

Analyst View – Building on the previous partnership between AWS and NVIDIA, I am excited to see these industry leaders double down on specialist processing infrastructure as a service.  With the use cases for GPU-based architectures becoming increasingly foundational to Machine Learning applications, I see this type of service becoming more widespread through growing enterprise adoption, as well as through adoption of similar offerings from Azure, GCP, Oracle and others.

Machine Learning-as-a-Service

Machine Learning involves two processes that require increasingly specialized compute – training and inference.  Training involves using labeled data to create models that are capable of making predictions.  Inference is the process of using a trained model to actually make those predictions. Inference workloads typically involve processing many small compute jobs simultaneously, a task which can be most cost-effectively handled by accelerating the compute with energy-efficient GPUs, a market in which NVIDIA is the leading provider. Back in 2017, AWS led the market by being the first to introduce cloud instances optimized for Machine Learning, powered by NVIDIA V100 Tensor Core GPUs.  That service enabled customers to reduce Machine Learning training times from days to hours. Matt Garman, Vice President of Compute Services at AWS, summed up the developer perspective on the announcement by succinctly stating, “with the new G4 instances powered by T4 GPUs we’re making it more affordable to put machine learning in the hands of every developer.”
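The training/inference split described above can be sketched in a few lines of Python. Here, ordinary least squares stands in for a far heavier GPU-accelerated training job; the data and model are purely illustrative assumptions, not anything from the announcement:

```python
import numpy as np

# "Training": fit model parameters from labeled examples.
# The data below is exactly linear (y = 2x + 1), so the fit is exact.
X_train = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])  # bias column + feature
y_train = np.array([1.0, 3.0, 5.0, 7.0])                              # labels
w, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)                 # learned weights ~ [1, 2]

# "Inference": apply the trained parameters to new, unlabeled inputs.
# In production this step runs continuously over many small requests --
# the workload profile the new G4 instances target.
X_new = np.array([[1.0, 4.0], [1.0, 5.0]])
predictions = X_new @ w
print(predictions)  # ≈ [9., 11.]
```

Training is a one-off, heavyweight fit; inference is the cheap-per-call step repeated millions of times, which is why its cumulative cost dominates ML operations.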

Analyst View: As human-machine partnerships (check out my latest book for more on this topic) expand across multiple industries, the need for improved processing power is also increasing.  With the previous G3 instances on AWS, clients were finding that Machine Learning inference could still account for as much as 90% of the overall operational costs of running Machine Learning applications.  With these new G4 instances, AWS and NVIDIA are tackling this issue head on with raw, specialized compute horsepower as a service.

Deep Dive Into The New Offering

The new G4 instances feature NVIDIA T4 GPUs, the latest generation of the vendor’s product range, paired with custom 2nd-generation Intel Xeon Scalable (Cascade Lake) processors, up to 100 Gbps of networking throughput and up to 1.8 TB of local NVMe storage.  AWS claims this configuration delivers the most cost-effective GPU instances available on the market today.  According to AWS, the new G4 instances offer up to a 1.8x increase in graphics performance and up to 2x the video transcoding capability of the previous G3 instances.
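As an illustration of how a developer might request one of these instances, the sketch below shows the kind of parameters passed to EC2’s RunInstances API (for example via boto3’s ec2.run_instances). The AMI ID is a hypothetical placeholder, and g4dn.xlarge is the smallest size in the G4 family:

```python
# Illustrative parameters for EC2's RunInstances API call.
# The AMI ID below is a placeholder, not a real image.
run_instances_params = {
    "ImageId": "ami-xxxxxxxxxxxxxxxxx",   # placeholder: e.g. a Deep Learning AMI
    "InstanceType": "g4dn.xlarge",        # smallest G4 size, backed by an NVIDIA T4 GPU
    "MinCount": 1,
    "MaxCount": 1,
}
# With boto3 this dict would be passed as:
#   boto3.client("ec2").run_instances(**run_instances_params)
print(run_instances_params["InstanceType"])
```

The point of the consumption model is visible here: swapping the instance type string is all it takes to move a workload onto the latest GPU hardware.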

Analyst View – With this new level of performance provided via an easy-to-consume cloud model, AWS and NVIDIA are enabling clients to use a remote workstation in the cloud to run graphics-intensive applications, as well as to efficiently create high-resolution 3D content for games and movies.  Having this capability in a scalable, off-premises, cloud-provided model with such leading performance will open up new opportunities for content creators in the creative arts.

Game Developer Nirvana?

In the press announcement, Electronic Arts (EA) was quoted heavily. EA is a global leader in digital interactive entertainment, delivering games, content and online services to hundreds of millions of players globally.  Erik Zigman, EA’s Vice President of Cloud, Social Marketplace and Cloud Gaming Engineering, was quoted as saying, “leveraging the power of the cloud with providers such as AWS has revolutionized how we create games and how players experience them.” He went on to say, “working with AWS’ G4 instance has enabled us to build cost-effective and powerful services that are optimized for bringing online gaming to a wide range of devices.”

Analyst View – While the industry buzz will focus on how these new G4 instances can be leveraged for Machine Learning applications, the impact on the gaming and content industries should not be underestimated. As the workstation on the desk is replaced by the Workstation-as-a-Service in the cloud, the implications for the cost of provision will be felt positively on IT budgets, while performance will improve for developers through access to the latest and greatest GPU-powered hardware.

Summary – I see this as a very positive announcement for AWS, but perhaps even more so for NVIDIA.  By bringing its industry-leading GPU processors to the cloud, NVIDIA is able to further cement its leadership position in the ever-expanding GPU market, especially as it pertains to inference, the emerging opportunity for AI leadership.  With the multiple use cases this capability can support, spanning everything from content creation and games development through to enterprise Machine Learning, I predict increased adoption of the AWS instance model. AWS users will be able to leverage a single instance to accelerate multiple types of production workloads seamlessly and, more importantly, at a reduced cost compared to previous architecture choices.

Read more Analysis from Futurum Research:

Pure Storage Accelerate: Flash, Cloud, AI and Everything-as-a-Service

Qualcomm Introduces Cloud AI 100 Powering Cloud Inference

IBM z15: New Data Privacy Passports Ready to Power the Encryption Everywhere Blueprint and Vision

Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.

Images: AWS

Daniel Newman

Daniel Newman is the Principal Analyst of Futurum Research and the CEO of Broadsuite Media Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise. From Big Data to IoT to Cloud Computing, Newman makes the connections between business, people and tech that are required for companies to benefit most from their technology projects, which leads to his ideas regularly being cited by CNBC, Barron’s, Business Insider and hundreds of other sites across the world. He is a 7x best-selling author, most recently of “Human/Machine.” Daniel is also a Forbes and MarketWatch (Dow Jones) contributor. An MBA holder and graduate adjunct professor, Daniel Newman is a Chicago native, and his speaking takes him around the world each year as he shares his vision of the role technology will play in our future.