Box, Inc. (NYSE: BOX), a provider of intelligent content management solutions, has announced the general availability of Box Extract. The new offering combines generative AI models from companies such as Google, Anthropic, and OpenAI with agentic capabilities to securely extract information from enterprise content and store it as metadata within Box.
According to Aaron Levie, co-founder and CEO of Box, enterprises possess a significant amount of untapped data within their content. Box Extract aims to unlock this information, enabling businesses to transform how they analyze data and make decisions by converting unstructured content into structured, usable data, thereby delivering real-world impact across various lines of business.
Valmark Financial Group, a financial services firm, has implemented Box Extract to manage sensitive data. Geoff Moore, CIO at Valmark, noted that the platform pairs enterprise-grade security controls with employee expertise to extract data from unstructured sources such as account forms, insurance illustrations, and commission statements. According to Moore, this has improved both efficiency and accuracy.
Similarly, Wendy Barron, CIO at the Texas Department of Motor Vehicles, reported that Box Extract, powered by Box AI, allows the agency to automatically extract key information from forms and records. This reduces manual review and accelerates workflows while adhering to public agency security and compliance standards, so the team can focus more on timely public service delivery.
Previously, extracting insights from unstructured data often relied on manual processes or legacy tools that were costly and difficult to scale. Box Extract addresses these challenges by combining AI models with agentic capabilities to extract data accurately from complex documents. Whereas older tools primarily extract text, Box’s agentic approach lets Box Extract interpret a document’s structure and meaning: it breaks each document into components such as paragraphs, tables, and charts before pulling out key information.
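The two-stage idea described above, segmenting a document into components first and extracting fields second, can be illustrated with a deliberately simple sketch. It uses plain rules in place of AI models and is not Box's implementation; the function names (`segment`, `extract_fields`), the sample text, and the field names are all assumptions made for illustration.

```python
import re

# Hypothetical sample document: blocks are separated by blank lines.
SAMPLE = """Loan Agreement
This loan is due on 2026-03-15 and carries a term of 36 months.

Item | Amount
Principal | 25000
Origination fee | 250
"""

def segment(text):
    """Split text into blocks and label each as 'table' or 'paragraph'."""
    components = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Toy rule: a block whose every line contains '|' is treated as a table.
        kind = "table" if all("|" in line for line in lines) else "paragraph"
        components.append({"kind": kind, "lines": lines})
    return components

def extract_fields(components):
    """Pull a few illustrative fields out of the labeled components."""
    fields = {}
    for comp in components:
        if comp["kind"] == "paragraph":
            text = " ".join(comp["lines"])
            date = re.search(r"due on (\d{4}-\d{2}-\d{2})", text)
            term = re.search(r"term of (\d+) months", text)
            if date:
                fields["due_date"] = date.group(1)
            if term:
                fields["term_months"] = int(term.group(1))
        else:
            # Skip the header row, then read 'label | value' rows.
            for row in comp["lines"][1:]:
                label, value = [cell.strip() for cell in row.split("|")]
                fields[label.lower().replace(" ", "_")] = int(value)
    return fields

components = segment(SAMPLE)
fields = extract_fields(components)
print(fields)
```

In a real system the labeling and extraction rules would be replaced by model calls; the point of the sketch is only that each component type is handled differently once the structure is known.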
Organizations can also create custom Extract Agents tailored to specific business requirements, deploying them securely at scale. The structured data can be stored alongside unstructured content as custom metadata, which can also be exported or synchronized to other systems such as Databricks and Snowflake.
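As a rough illustration of structured data living alongside content and being exported, the sketch below attaches extracted fields to a file reference and flattens them into CSV rows of the kind a data warehouse could load. It does not use the Box API or any Databricks or Snowflake connector; the `ContentItem` class, the `to_csv` helper, and the sample permit fields are hypothetical.

```python
import csv
import io
from dataclasses import dataclass, field

@dataclass
class ContentItem:
    """A file reference with extracted fields attached as metadata (hypothetical model)."""
    file_name: str
    metadata: dict = field(default_factory=dict)

def to_csv(items, columns):
    """Flatten each item's metadata into one CSV row per file."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["file_name"] + columns, lineterminator="\n"
    )
    writer.writeheader()
    for item in items:
        row = {"file_name": item.file_name}
        # Missing fields become empty cells rather than raising an error.
        row.update({col: item.metadata.get(col, "") for col in columns})
        writer.writerow(row)
    return buffer.getvalue()

# Hypothetical sample data: two permit files with partially filled metadata.
items = [
    ContentItem("permit_001.pdf", {"permit_type": "building", "fee": 120}),
    ContentItem("permit_002.pdf", {"permit_type": "electrical"}),
]
csv_text = to_csv(items, ["permit_type", "fee"])
print(csv_text)
```

Keeping the file name in every row preserves the link between each structured record and the unstructured document it came from.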
The metadata that Box Extract stores supports several operational uses. It speeds decision-making through metadata-powered dashboards and views within Box Apps. It enables end-to-end workflow automation with Box Relay, with integration with Box Automate planned. It also streamlines content discovery and accelerates search for all Box users, and it lets metadata be surfaced and used in third-party and custom applications.
Box cites applications in several industries. Financial services firms can use Box Extract in loan origination, extracting due dates and loan terms to expedite payments, reconciliation, and loan servicing. Government and public sector agencies can apply it to permits, public records, grants, contracts, and benefits documents to extract details such as permit types, fees, and inspection dates, improving compliance and service delivery. Media and entertainment teams can extract details such as titles, writers, versions, rights holders, and scene keywords from production files and creative assets to streamline search and digital asset management. Insurance carriers can automatically extract critical information from accident reports and hospital bills and apply it as metadata, assisting investigators and accelerating claim creation. Legal teams can process lengthy contracts, identifying specific language, clauses, counterparty names, expiration dates, renewal terms, and obligation deadlines to improve contract management.
Matt Renner, President and Chief Revenue Officer of Google Cloud, stated that integrating Google’s Gemini into Box Extract helps customers instantly convert large volumes of content into structured, actionable intelligence. He highlighted Gemini’s capability to process complex, high-volume data, enabling workflow automation and speeding up processes like loan approvals and contract management.
Box Extract, including the ability to create and manage custom Extract Agents, is now available to Box customers on the Enterprise Advanced plan. Customers can select between the Standard Extract Agent, designed for simple and cost-efficient data capture, and the Enhanced Extract Agent, which offers deeper reasoning, handles multimodal document structures, and is suitable for large, complex, or highly variable documents.
Founded in 2005, Box is headquartered in Redwood City, CA, and provides its platform to global organizations including JLL, Morgan Stanley, and Nationwide.