NVIDIA Launches TensorRT™ 8, the Eighth Generation of its AI Software
July 20, 2021 at 09:00 am EDT
NVIDIA launched TensorRT™ 8, the eighth generation of the company’s AI software, which cuts inference time in half for language queries, enabling developers to build the world’s best-performing search engines, ad recommendations and chatbots and offer them from the cloud to the edge.
TensorRT 8’s optimizations deliver record-setting speed for language applications, running BERT-Large, one of the world’s most widely used transformer-based models, in 1.2 milliseconds. In the past, companies had to reduce their model size, which made results significantly less accurate. Now, with TensorRT 8, companies can double or triple their model size to achieve dramatic improvements in accuracy.
In addition to transformer optimizations, TensorRT 8’s breakthroughs in AI inference are made possible by two other key features. Sparsity is a new performance technique in NVIDIA Ampere architecture GPUs that increases efficiency, allowing developers to accelerate their neural networks by reducing computational operations. Quantization-aware training enables developers to use trained models to run inference in INT8 precision without losing accuracy, which significantly reduces compute and storage overhead for efficient inference on Tensor Cores.
Industry leaders have embraced TensorRT for their deep learning inference applications in conversational AI and across a range of other fields. Hugging Face is an open-source AI leader relied on by the world’s largest AI service providers across multiple industries. The company is working closely with NVIDIA to introduce AI services that enable text analysis, neural search and conversational applications at scale.
TensorRT 8 is now generally available and free of charge to members of the NVIDIA Developer program. The latest versions of plug-ins, parsers and samples are also available as open source from the TensorRT GitHub repository.
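To make the two features named above more concrete, here is a minimal pure-Python sketch of the underlying ideas. It is not TensorRT code and does not use any NVIDIA API. The function names `prune_2_of_4` and `fake_quantize_int8` are hypothetical, the 2-of-4 pruning pattern is an assumption based on how structured sparsity on Ampere GPUs is commonly described (the announcement does not specify a pattern), and the symmetric per-tensor INT8 scale is just one simple quantization scheme chosen for illustration.

```python
def prune_2_of_4(weights):
    """Zero the two smallest-magnitude values in every group of four.

    Illustrative only: assumes a 2-of-4 structured sparsity pattern.
    """
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest-magnitude entries in this group.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def fake_quantize_int8(values):
    """Round values onto a symmetric INT8 grid, then map them back to floats.

    Illustrative only: one scale for the whole list, codes clamped to [-127, 127].
    Returns (integer codes, dequantized floats, scale).
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, [c * scale for c in codes], scale


weights = [0.9, -0.1, 0.05, -0.7, 0.2, 0.3, -0.25, 0.01]
sparse = prune_2_of_4(weights)
codes, approx, scale = fake_quantize_int8(sparse)
print(sparse)  # half of the entries are now exactly zero
print(codes)   # each surviving weight is stored as a small integer
```

In this sketch, half of the multiplications can be skipped because their weights are zero, and each remaining weight fits in one byte instead of four. Quantization-aware training, as described in the announcement, applies a rounding step of this kind during training so the model learns to tolerate the INT8 grid.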
NVIDIA Corporation is the world leader in the design, development, and marketing of programmable graphics processors. The group also develops associated software. Net sales break down by family of products as follows:
- computing and networking solutions (55.9%): data center platforms and infrastructure, Ethernet interconnect solutions, high-performance computing solutions, platforms and solutions for autonomous and intelligent vehicles, solutions for enterprise artificial intelligence infrastructure, crypto-currency mining processors, embedded computer boards for robotics, teaching, learning and artificial intelligence development, etc.;
- graphics processors (44.1%): for PCs, game consoles, video game streaming platforms, workstations, etc. (GeForce, NVIDIA RTX, Quadro brands, etc.). The group also offers laptops, desktops, gaming computers, computer peripherals (monitors, mice, joysticks, remote controls, etc.), software for visual and virtual computing, platforms for automotive infotainment systems and cloud collaboration platforms.
Net sales break down by industry between data storage (55.6%), gaming (33.6%), professional visualization (5.7%), automotive (3.4%) and other (1.7%).
Net sales are distributed geographically as follows: the United States (30.7%), Taiwan (25.9%), China (21.5%) and other (21.9%).