CoreWeave, Inc. announced the launch of Serverless RL, a fast and easy way to train AI agents using reinforcement learning (RL). The first publicly available fully managed RL capability, Serverless RL scales seamlessly to dozens of GPUs, requires only a Weights & Biases account and API key to get started, and delivers faster feedback loops with lower barriers to entry for developers. This new capability launches just weeks after CoreWeave's acquisition of OpenPipe, combining OpenPipe's leading RL tools with the Weights & Biases AI developer platform, powered by CoreWeave's AI cloud.
Reinforcement learning is critical to making AI agents more reliable, but it has historically been out of reach for most organizations. Running RL ordinarily requires access to costly infrastructure, deep expertise, and time-consuming workflows that slow down iteration. Serverless RL helps remove those constraints so more enterprises can continuously improve AI agents and deliver a better customer experience.
Benchmarks show nearly 1.4x faster training times and 40% lower costs compared to local H100 GPU environments, with no impact on model quality. This was achieved by solving the long-standing "straggler problem" in RL training. By multiplexing many training runs across CoreWeave's production-grade cluster environment, the system maintains high utilization in aggregate and only charges for incremental tokens generated.
This delivers both improved throughput and significantly reduced cost in practice. Customers ranging from AI-native companies to Fortune 500 enterprises are already showing strong interest in Serverless RL. For example, SquadStack.ai, an AI-powered contact center platform that delivers hyper-personalized experiences for leading consumer brands, will use Serverless RL to enhance customer engagement.
QA Wolf, a hybrid platform that helps technology teams ship better software faster, also plans to use it.
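The "straggler problem" and the multiplexing approach described above can be illustrated with a toy scheduling model. This is a hypothetical sketch for intuition only, not CoreWeave's actual scheduler; every function name, distribution, and parameter here is invented.

```python
import heapq
import random

random.seed(0)

def rollout_times(n):
    # Heavy-tailed rollout durations: a few "stragglers" take much
    # longer than the rest (illustrative distribution, not real data).
    return [random.lognormvariate(0, 0.8) for _ in range(n)]

def single_run_utilization(num_workers=8, num_steps=50):
    """One RL run on a dedicated pool: every training step barriers on
    the slowest rollout, so fast workers sit idle until the straggler
    finishes. Returns useful-work / consumed-capacity."""
    busy = capacity = 0.0
    for _ in range(num_steps):
        times = rollout_times(num_workers)
        busy += sum(times)                   # useful work performed
        capacity += max(times) * num_workers  # capacity held until slowest finishes
    return busy / capacity

def multiplexed_utilization(num_runs=16, num_workers=8, num_steps=50):
    """Many independent runs share one pool: a worker stalled by one
    run's straggler picks up another run's rollouts instead, so
    aggregate utilization approaches 1. Tasks are placed greedily on
    the least-loaded worker via a heap of per-worker finish times."""
    tasks = []
    for _ in range(num_runs):
        for _ in range(num_steps):
            tasks.extend(rollout_times(num_workers))
    workers = [0.0] * num_workers  # per-worker finish times
    for t in tasks:
        heapq.heappush(workers, heapq.heappop(workers) + t)
    makespan = max(workers)
    return sum(tasks) / (makespan * num_workers)

print(f"single run:  {single_run_utilization():.2f}")
print(f"multiplexed: {multiplexed_utilization():.2f}")
```

In this idealized model the multiplexed pool stays close to fully utilized because there is almost always another run's rollout ready to fill an idle slot, which is the intuition behind charging only for the incremental tokens each run actually generates.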