Box, Inc. announced the general availability of Box Extract. Powered by leading generative AI models from companies like Google, Anthropic, and OpenAI, and combined with advanced agentic capabilities, Box Extract enables enterprises to intelligently and securely pull the most valuable information from content and save it as metadata in Box. With Box Extract, it is now easier than ever for enterprises to automate workflows, accelerate decision-making, and power faster access to information and insights.
Box Extract enables organizations to turn unstructured content into structured, usable data, so that their content actively works for them across their most important lines of business. It pairs enterprise-grade security controls with the deep subject matter expertise of employees to securely extract data from unstructured sources and transform it into actionable insights. By mining information from sources such as account forms, insurance illustrations, and commission statements, organizations have achieved gains in both efficiency and accuracy.
Box Extract combines the latest AI models from Google, Anthropic, and OpenAI with advanced agentic capabilities to deliver accurate extraction from complex documents. Box's agentic approach enables Box Extract to understand a document's structure and meaning, break the document down into components such as paragraphs, tables, or charts, and then pull out the most important information. Teams can create custom Extract Agents tailored to their business needs and deploy them securely at scale across a wide range of content.
These Box Extract Agents give customers the flexibility to store structured data alongside unstructured content as custom metadata, which can also be exported or synced to other systems such as Databricks and Snowflake. Because the information pulled out by Box Extract is stored alongside content on Box, enterprises can quickly make decisions using metadata-powered dashboards and views within Box Apps; automate workflows end-to-end on Box with Box Relay today, and with Box Automate in the future; streamline content discovery and speed up search for every Box user; and surface and extend the use of metadata in third-party and custom applications.
Financial services firms can use Box Extract for loan origination, extracting due dates and loan terms to accelerate payments, reconciliation, and loan servicing. Government and public sector agencies can use Box Extract on permits, public records, grants, contracts, and benefits documents to extract important details such as permit types, fees, and inspection dates, streamlining compliance and accelerating service delivery across departments. Media and entertainment teams can use Box Extract to automatically extract details such as titles, writers, versions, rights holders, and scene keywords from production files and creative assets, including scripts, talent agreements, and client briefs, so they can quickly search for specific scenes and efficiently manage digital assets. Insurance carriers can use Box Extract to automatically extract critical information from accident reports and hospital bills and apply it as metadata in Box, helping investigators move faster, reduce manual review, and accelerate claim creation. Legal teams can use Box Extract to process long contracts and identify different language and clauses. Box Extract automatically captures key contract details, such as counterparty names, expiration dates, renewal terms, clauses, and obligation deadlines, and applies them as metadata, enabling enhanced contract management.
The ability to create and manage custom Extract Agents with Box Extract is now available to Box customers through the Enterprise Advanced plan. Within the Box Extract offering, customers can choose between the Standard Extract Agent, which streamlines simple data capture for faster and more cost-efficient results, and the Enhanced Extract Agent, which takes dedicated steps based on multimodal document structure, delivers deeper reasoning, and can handle large, complex, or highly variable documents.