The News: This week at GTC in China, NVIDIA introduced groundbreaking inference software that developers everywhere can use to deliver conversational AI applications, slashing inference latency that until now has impeded true, interactive engagement.
NVIDIA TensorRT™ 7 — the seventh generation of the company’s inference software development kit — opens the door to smarter human-to-AI interactions, enabling real-time engagement with applications such as voice agents, chatbots and recommendation engines.
It is estimated that there are 3.25 billion digital voice assistants being used in devices around the world, according to Juniper Research. By 2023, that number is expected to reach 8 billion, more than the world’s total population.
TensorRT 7 features a new deep learning compiler designed to automatically optimize and accelerate the increasingly complex recurrent and transformer-based neural networks needed for AI speech applications. This speeds the components of conversational AI by more than 10x compared to when run on CPUs, driving latency below the 300-millisecond threshold considered necessary for real-time interactions. Read the full news release here.
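To make the latency arithmetic concrete, here is a minimal sketch of how a 10x speedup interacts with the 300-millisecond threshold. The three pipeline stages and their CPU latencies are hypothetical placeholders chosen purely for illustration; they are not NVIDIA measurements, and the uniform speedup across stages is a simplifying assumption.

```python
# Illustrative latency-budget check for a conversational AI pipeline.
# All stage names and CPU latencies below are hypothetical assumptions,
# not figures from NVIDIA or from any benchmark.

REALTIME_BUDGET_MS = 300.0  # threshold cited in the announcement

cpu_latency_ms = {
    "speech_recognition": 900.0,
    "language_understanding": 600.0,
    "speech_synthesis": 1000.0,
}


def total_latency_ms(stage_latency_ms, speedup=1.0):
    """Sum the stage latencies after dividing each by a uniform speedup."""
    return sum(latency / speedup for latency in stage_latency_ms.values())


def within_budget(stage_latency_ms, speedup=1.0, budget_ms=REALTIME_BUDGET_MS):
    """Return True if the end-to-end latency is below the real-time budget."""
    return total_latency_ms(stage_latency_ms, speedup) < budget_ms


if __name__ == "__main__":
    for label, speedup in (("CPU baseline", 1.0), ("10x accelerated", 10.0)):
        total = total_latency_ms(cpu_latency_ms, speedup)
        verdict = "meets" if within_budget(cpu_latency_ms, speedup) else "misses"
        print(f"{label}: {total:.0f} ms, {verdict} the {REALTIME_BUDGET_MS:.0f} ms budget")
```

Under these assumed numbers, a pipeline that takes 2.5 seconds end to end on CPUs would land at 250 milliseconds with a 10x speedup, which is why an order-of-magnitude gain can be the difference between a stilted exchange and a real-time one.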
Analyst Take: This week has been a big one for NVIDIA, which shouldn’t surprise anyone, as GTC China is one of the company’s signature events and commonly includes announcements across its key businesses, including automotive, gaming and enterprise. While gaming and autonomous vehicles often win the lion’s share of headlines, the launch of NVIDIA TensorRT 7 caught my attention, as I’ve been focused on the implications AI will have for the enterprise across mobile, desktop and applications where natural language processing and the contextualization of chat via AI inference are set to make a big impact.
Why NVIDIA TensorRT 7 Will Propel AI Inference in the Enterprise
It is really quite simple: conversational AI is one of the best-understood use cases for AI today. As more people engage in conversations with their Android device, with Siri or with Alexa, familiarity with natural language processing is growing and users are becoming more comfortable with the technology. Unfortunately, we are still hampered by the fact that most conversational interfaces remain pretty rudimentary. (When you ask Siri a question, it generally just pings the search engine and really isn’t doing much more than speech to text.) While this still helps move AI forward in the enterprise, what is being sought is a much more natural conversation between human and machine, one that would be more like two people speaking. That really isn’t happening yet, but with the release of TensorRT 7 it becomes increasingly possible as latency is reduced and multi-turn conversations become achievable.
With the emergence of real-time natural language processing, the logical evolution is real-time inference and response, so that a customer receives not just a canned response but a contextual one, based upon deep neural networks that have been trained on large volumes of interactions. This added contextual capability addresses the training, inference and real-time requirements that allow AI to provide real value in service support or in recommending products and services. As NVIDIA TensorRT 7 proves itself within applications, as earlier versions have done for the likes of WeChat, I believe enterprises will increasingly adopt this SDK to support development of AI-driven customer interfaces in retail, banking and customer service functions across just about any industry.
I remain confident that NVIDIA will be at the center of the conversational AI movement over the next 2-3 years. As I’ve oft-stated, it isn’t just the GPUs that make NVIDIA such a powerful player in enterprise AI, but the software/SDKs that the company continues to iterate and innovate upon to enable its users to build capable software leveraging the power of AI.
With the growth of enterprise AI, specifically conversational AI via voice agents, chatbots and recommendation engines, it is becoming increasingly clear that scalable adoption comes down to the speed and context that can be delivered at the user interface. This, again, is why the software is so important, and it is also why NVIDIA is so well positioned to capitalize on the market opportunity.
Futurum Research provides industry research and analysis. These columns are for educational purposes only and should not be considered in any way investment advice.
Image Credit: NVIDIA
Daniel Newman is the Principal Analyst of Futurum Research and the CEO of Broadsuite Media Group. Living his life at the intersection of people and technology, Daniel works with the world’s largest technology brands exploring Digital Transformation and how it is influencing the enterprise.