Shapefin

CrowdStrike and Meta Introduce Open-Source Benchmarks for Evaluating AI in Cybersecurity Operations

At Fal.Con 2025 in Las Vegas, CrowdStrike (NASDAQ: CRWD) announced a partnership with Meta to introduce CyberSOCEval, a new open-source suite of benchmarks that assesses how AI systems, particularly large language models (LLMs), perform in real-world security operations. The goal is to strengthen cyber defense against evolving threats.

Cyber defenders contend with a high volume of security alerts and with sophisticated, evolving threats. To counter adversaries effectively, organizations are increasingly adopting advanced AI technologies. However, many security teams are still in the early stages of integrating AI, especially LLMs, to automate tasks and improve efficiency in security operations. Without standardized benchmarks, it remains difficult to determine which AI systems, use cases, and performance metrics truly offer an advantage against real-world cyberattacks.

Meta and CrowdStrike are addressing this challenge through CyberSOCEval, a benchmark suite built upon Meta’s open-source CyberSecEval framework and CrowdStrike’s extensive threat intelligence and cybersecurity AI data expertise. This suite establishes a new framework for testing, selecting, and leveraging LLMs within a security operations center (SOC), helping to define what constitutes effective AI for cyber defense.

CyberSOCEval evaluates LLMs across critical security workflows, including incident response, malware analysis, and threat analysis comprehension. By testing AI systems against a combination of real-world adversary tradecraft and expert-designed security reasoning scenarios based on observed adversarial tactics, organizations can validate system performance under pressure and confirm operational readiness. These benchmarks enable security teams to identify areas where AI delivers maximum value, while providing model developers with clear guidance for enhancing capabilities that improve return on investment and SOC effectiveness.

Vincent Gonguet, Director of Product, GenAI at Superintelligence Labs at Meta, stated, “At Meta, we are dedicated to advancing and maximizing the benefits of open source AI, particularly as large language models become powerful tools for organizations of all sizes. Our collaboration with CrowdStrike introduces a new open-source benchmark suite to evaluate the capabilities of LLMs in real-world security scenarios. With these benchmarks in place, and open for the security and AI community to further improve, we can more quickly work as an industry to unlock the potential of AI in protecting against advanced attacks, including AI-based threats.”

Daniel Bernard, Chief Business Officer at CrowdStrike, commented, “When two leaders like CrowdStrike and Meta come together, it’s larger than collaboration; it’s about setting the direction of cybersecurity for the AI era. By combining CrowdStrike’s adversary intelligence and leadership in AI-native cybersecurity with Meta’s AI research expertise and vast dataset, we’re helping customers – and cybersecurity as a sector – adopt AI systems with confidence. This partnership sets a new bar for how AI in the SOC should be built and deployed, empowering defenders to stay ahead of the adversary.”

The CyberSOCEval open-source benchmark suite is now available to the AI and security community for evaluating model capabilities. The benchmarks can be accessed through Meta’s CyberSecEval framework.
