NVIDIA announced two new large language model (LLM) cloud AI services, the NVIDIA NeMo Large Language Model Service and the NVIDIA BioNeMo LLM Service, that enable developers to easily adapt LLMs and deploy customized AI applications for content generation, text summarization, chatbots, code development, protein structure prediction, biomolecular property prediction and more. The NeMo LLM Service allows developers to rapidly tailor a number of pretrained foundation models on NVIDIA-managed infrastructure using a training method called prompt learning. The BioNeMo LLM Service is a cloud application programming interface (API) that expands LLM use cases beyond language and into scientific applications to accelerate drug discovery for pharma and biotech companies.

NeMo LLM Service Boosts Accuracy With Prompt Learning, Accelerates Deployments

With the NeMo LLM Service, developers can use their own training data to customize foundation models ranging from 3 billion parameters up to Megatron 530B, one of the world's largest LLMs. The process takes just minutes to hours compared with the weeks or months required to train a model from scratch. Models are customized with prompt learning, which uses a technique called p-tuning. This allows developers to use just a few hundred examples to rapidly tailor foundation models that were originally trained with billions of data points.

The customization process generates task-specific prompt tokens, which are then combined with the foundation models to deliver higher accuracy and more relevant responses for specific use cases. Developers can customize for multiple use cases using the same model and generate many different prompt tokens. A playground feature provides a no-code option to easily experiment and interact with models, further boosting the effectiveness and accessibility of LLMs for industry-specific use cases.
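The idea of one shared foundation model serving several use cases through different prompt tokens can be illustrated with a minimal sketch. This is not the NeMo LLM Service API or NVIDIA's implementation: the names `make_frozen_model`, `new_prompt_tokens`, `build_model_input` and the embedding size are hypothetical, the vectors are random placeholders, and the training step that would actually learn the prompt tokens is omitted. It only shows the data flow, assuming prompt tokens are small sets of vectors prepended to the unchanged model's input embeddings.

```python
import random

EMBED_DIM = 8  # hypothetical embedding size; real LLMs use thousands of dimensions


def make_frozen_model(vocab_words, seed=0):
    """Stand-in for a pretrained foundation model: a fixed word -> vector table."""
    rng = random.Random(seed)
    return {
        word: tuple(rng.uniform(-1.0, 1.0) for _ in range(EMBED_DIM))
        for word in vocab_words
    }


def new_prompt_tokens(num_virtual_tokens, seed):
    """Placeholder for task-specific prompt tokens.

    In real prompt learning these vectors are trained on a few hundred
    examples; here they are random values, used only to show the shapes.
    """
    rng = random.Random(seed)
    return [
        tuple(rng.uniform(-0.1, 0.1) for _ in range(EMBED_DIM))
        for _ in range(num_virtual_tokens)
    ]


def build_model_input(frozen_model, prompt_tokens, words):
    """Prepend a task's prompt tokens to the frozen embeddings of the input."""
    return list(prompt_tokens) + [frozen_model[word] for word in words]


# One shared model, never modified after creation.
frozen_model = make_frozen_model(["summarize", "this", "report", "translate"])

# Several tasks, each with its own small set of prompt tokens.
task_prompts = {
    "summarization": new_prompt_tokens(4, seed=1),
    "chatbot": new_prompt_tokens(6, seed=2),
}

words = ["summarize", "this", "report"]
summarization_input = build_model_input(frozen_model, task_prompts["summarization"], words)
chatbot_input = build_model_input(frozen_model, task_prompts["chatbot"], words)

print(len(summarization_input), len(chatbot_input))  # 7 9
```

In this sketch, switching use cases means swapping a short list of vectors rather than retraining or copying the model, which is consistent with the short customization times described above.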

Once ready to deploy, the tuned models can run on cloud instances, on-premises systems or through an API.

BioNeMo LLM Service Enables Researchers to Tap Power of Massive Models

The BioNeMo LLM Service includes two new BioNeMo language models for chemistry and biology applications. It provides support for protein, DNA and biochemical data to help researchers discover patterns and insights in biological sequences.

BioNeMo enables researchers to expand the scope of their work by leveraging models that contain billions of parameters. These larger models can store more information about the structure of proteins and the evolutionary relationships between genes, and can even generate novel biomolecules for therapeutic applications.

Cloud API Provides Access to Megatron 530B, Other Ready-Made Models

In addition to tuning foundation models, the LLM services include the option to use ready-made and custom models through a cloud API.

This gives developers access to a broad range of pretrained LLMs, including Megatron 530B. It also provides access to T5 and GPT-3 models created with the NVIDIA NeMo Megatron framework — now available in open beta — to support a broad range of applications and multilingual service requirements. Leaders in automotive, computing, education, healthcare, telecommunications and other industries are using NeMo Megatron to pioneer services for customers in Chinese, English, Korean, Swedish and other languages.