NVIDIA Omniverse Avatar Enables Real-Time Conversational AI Assistants

SANTA CLARA, Calif., Nov. 09, 2021 (GLOBE NEWSWIRE) -- GTC—NVIDIA today announced NVIDIA Omniverse Avatar, a technology platform for generating interactive AI avatars.

Omniverse Avatar connects the company’s technologies in speech AI, computer vision, natural language understanding, recommendation engines and simulation. Avatars created in the platform are interactive characters with ray-traced 3D graphics that can see, speak, converse on a wide range of subjects, and understand naturally spoken intent.

Omniverse Avatar opens the door to the creation of AI assistants that are easily customizable for virtually any industry. These could help with the billions of daily customer service interactions — restaurant orders, banking transactions, making personal appointments and reservations, and more — leading to greater business opportunities and improved customer satisfaction.

“The dawn of intelligent virtual assistants has arrived,” said Jensen Huang, founder and CEO of NVIDIA. “Omniverse Avatar combines NVIDIA’s foundational graphics, simulation and AI technologies to make some of the most complex real-time applications ever created. The use cases of collaborative robots and virtual assistants are incredible and far reaching.”

Omniverse Avatar is part of NVIDIA Omniverse™, a virtual world simulation and collaboration platform for 3D workflows currently in open beta with over 70,000 users.

In his keynote address at NVIDIA GTC, Huang shared various examples of Omniverse Avatar: Project Tokkio for customer support, NVIDIA DRIVE Concierge for always-on, intelligent services in vehicles, and Project Maxine for video conferencing.

In the first demonstration of Project Tokkio, Huang showed colleagues engaging in a real-time conversation with an avatar crafted as a toy replica of himself — conversing on such topics as biology and climate science.         

In a second Project Tokkio demo, he highlighted a customer-service avatar in a restaurant kiosk, able to see, converse with and understand two customers as they ordered veggie burgers, fries and drinks. The demonstrations were powered by NVIDIA AI software and Megatron 530B, which is currently the world’s largest customizable language model.

In a demo of the DRIVE Concierge AI platform, a digital assistant on the center dashboard screen helps a driver select the best driving mode to reach his destination on time, and then follows his request to set a reminder once the car’s range drops below 100 miles.

Separately, Huang showed Project Maxine’s ability to add state-of-the-art video and audio features to virtual collaboration and content creation applications. An English-language speaker is shown on a video call in a noisy cafe, yet can be heard clearly without background noise. As she speaks, her words are both transcribed and translated in real time into German, French and Spanish in her own voice and intonation.

Omniverse Avatar Key Elements
Omniverse Avatar uses elements from speech AI, computer vision, natural language understanding, recommendation engines, facial animation and graphics, delivered through the following technologies:

  • Its speech recognition is based on NVIDIA Riva, a software development kit that recognizes speech across multiple languages. Riva is also used to generate human-like speech responses using text-to-speech capabilities.
  • Its natural language understanding is based on Megatron 530B, a large language model that can recognize, understand and generate human language. Megatron 530B is a pretrained model that can, with little or no additional training, complete sentences, answer questions across a broad range of subjects, summarize long and complex stories, translate into other languages, and handle many tasks it was not specifically trained to do.
  • Its recommendation engine is provided by NVIDIA Merlin™, a framework that allows businesses to build deep learning recommender systems capable of handling large amounts of data to make smarter suggestions.  
  • Its perception capabilities are enabled by NVIDIA Metropolis, a computer vision framework for video analytics.
  • Its avatar animation is powered by NVIDIA Video2Face and Audio2Face™, 2D and 3D AI-driven facial animation and rendering technologies.

These technologies are composed into an application and processed in real time using the NVIDIA Unified Compute Framework. Packaged as scalable, customizable microservices, the skills can be securely deployed, managed and orchestrated across multiple locations by NVIDIA Fleet Command™.
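To make the composition concrete, the Python sketch below shows how such a pipeline might be wired together for a single conversational turn. It is an illustrative assumption, not NVIDIA code: every function name and data shape is a hypothetical placeholder for the microservice it labels, and the real components (Riva, Megatron 530B, Merlin, Audio2Face) would be deployed and orchestrated separately.

```python
# Purely illustrative sketch: the function names and data shapes below are
# hypothetical placeholders, NOT NVIDIA APIs. Each stage stands in for the
# corresponding service described above (Riva for speech recognition and
# synthesis, Megatron 530B for dialogue, Merlin for recommendations,
# Audio2Face for facial animation), which in practice would run as separate
# microservices composed by the Unified Compute Framework.

from dataclasses import dataclass
from typing import List


@dataclass
class AvatarResponse:
    text: str        # reply produced by the language-model stage
    audio: bytes     # synthesized speech for the reply
    animation: dict  # facial-animation data driving the 3D avatar


def transcribe(audio: bytes) -> str:
    """Stand-in for a speech-recognition call (e.g. a Riva ASR service)."""
    return "one veggie burger and fries, please"


def generate_reply(utterance: str, history: List[str]) -> str:
    """Stand-in for a large-language-model dialogue call (e.g. Megatron 530B)."""
    return "Sure, adding that to your order. Anything to drink?"


def recommend(history: List[str]) -> str:
    """Stand-in for a recommender call (e.g. a Merlin-based service)."""
    return "A lemonade pairs well with that."


def synthesize(text: str) -> bytes:
    """Stand-in for text-to-speech (e.g. Riva TTS)."""
    return text.encode("utf-8")


def animate(audio: bytes) -> dict:
    """Stand-in for audio-driven facial animation (e.g. Audio2Face)."""
    return {"num_frames": len(audio) // 160}


def avatar_turn(customer_audio: bytes, history: List[str]) -> AvatarResponse:
    """One conversational turn: perceive, understand, respond, render."""
    utterance = transcribe(customer_audio)
    history.append(utterance)
    reply = generate_reply(utterance, history) + " " + recommend(history)
    speech = synthesize(reply)
    return AvatarResponse(text=reply, audio=speech, animation=animate(speech))


if __name__ == "__main__":
    print(avatar_turn(b"<microphone audio>", history=[]).text)
```

In a real deployment, each stage would be a network call to its own containerized service rather than a local function, which is what allows the skills to be scaled and orchestrated independently across locations.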

Learn more about Omniverse Avatar.

Register for free to learn more about NVIDIA Omniverse during NVIDIA GTC, taking place online through Nov. 11. Watch Huang’s GTC keynote address streaming on Nov. 9 and in replay.

About NVIDIA
NVIDIA’s (NASDAQ: NVDA) invention of the GPU in 1999 sparked the growth of the PC gaming market and has redefined modern computer graphics, high performance computing, and artificial intelligence. The company’s pioneering work in accelerated computing and AI is reshaping trillion-dollar industries, such as transportation, healthcare and manufacturing, and fueling the growth of many others. More information at https://nvidianews.nvidia.com/.

For further information, contact:
Kristin Uchiyama
Senior PR Manager
NVIDIA Corporation
+1-408-313-0448
kuchiyama@nvidia.com

Certain statements in this press release including, but not limited to, statements as to: the benefits, impact, and features of NVIDIA Omniverse Avatar, Project Tokkio, DRIVE Concierge, Project Maxine, NVIDIA Riva, Megatron 530B, NVIDIA Merlin, NVIDIA Metropolis, NVIDIA Video2Face and Audio2Face, the NVIDIA Unified Compute Framework and NVIDIA Fleet Command; Omniverse Avatar opening the door to the creation of AI assistants that are easily customizable for virtually any industry; the help of AI assistants leading to greater business opportunities and improved customer satisfaction; and the use cases of collaborative robots and virtual assistants are forward-looking statements that are subject to risks and uncertainties that could cause results to be materially different than expectations. Important factors that could cause actual results to differ materially include: global economic conditions; our reliance on third parties to manufacture, assemble, package and test our products; the impact of technological development and competition; development of new products and technologies or enhancements to our existing product and technologies; market acceptance of our products or our partners' products; design, manufacturing or software defects; changes in consumer preferences or demands; changes in industry standards and interfaces; unexpected loss of performance of our products or technologies when integrated into systems; as well as other factors detailed from time to time in the most recent reports NVIDIA files with the Securities and Exchange Commission, or SEC, including, but not limited to, its annual report on Form 10-K and quarterly reports on Form 10-Q. Copies of reports filed with the SEC are posted on the company's website and are available from NVIDIA without charge. These forward-looking statements are not guarantees of future performance and speak only as of the date hereof, and, except as required by law, NVIDIA disclaims any obligation to update these forward-looking statements to reflect future events or circumstances.

© 2021 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, Audio2Face, Maxine, NGC, NVIDIA DRIVE, NVIDIA Fleet Command, NVIDIA Merlin and NVIDIA Omniverse are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. All other trademarks and copyrights are the property of their respective owners. Features, pricing, availability, and specifications are subject to change without notice.

A photo accompanying this announcement is available at https://www.globenewswire.com/NewsRoom/AttachmentNg/35c4d67a-361e-4693-b500-289ff3c9dbc0