As AI systems advance in capability, they have the potential to accelerate scientific discovery and drive economic growth. Yet alongside those benefits, they also pose a distinct challenge: highly capable frontier AI systems may introduce or elevate large-scale risks to public safety and national security, including those related to advanced cyber and chemical, biological, radiological, and nuclear (CBRN) threats.
Frontier AI safety frameworks (FAISFs) have emerged as a critical tool for managing such risks and ensuring the responsible development and deployment of advanced AI systems.1 Since late 2023, leading frontier AI developers and research organizations have been refining frontier AI frameworks and elaborating on their use in managing the most severe risks from frontier models. In May 2024, 16 companies formally committed to developing and publishing frameworks as part of the Frontier AI Safety Commitments at the AI Seoul Summit. To date, 12 major AI developers have published FAISFs, demonstrating a growing industry consensus on responsible development practices.2
In a series of technical reports over the coming months, the Frontier Model Forum will examine how these frameworks can be implemented effectively across different organizational contexts. The series will provide detailed insight into key components of FAISFs, incorporating lessons from early adopters while acknowledging areas where best practices continue to evolve. Today, we’re introducing the first report in this series.
Framework Implementation
Frontier AI safety frameworks address a novel challenge in technology governance: managing risks from systems whose capabilities are rapidly evolving and may eventually exceed human understanding in key domains. These frameworks provide systematic processes for proactively identifying technological thresholds beyond which significant new risks emerge, assessing when systems approach those thresholds, developing and implementing proportional safeguards, and ensuring organizational accountability throughout the AI development lifecycle.
Frontier frameworks typically include several core elements:
- Risk Identification: Forward-looking processes for identifying high-severity risks, including those related to CBRN capabilities and advanced cyber threats
- Risk and Capability Thresholds: Processes for setting capability thresholds that trigger enhanced safeguards or development constraints
- Risk and Capability Assessments: Methodologies for determining whether models possess capabilities that could enable significant harm, including approaches to rigorously evaluate capabilities against established thresholds
- Risk Mitigations: Defense-in-depth strategies to prevent misuse of frontier capabilities, including harm refusal mechanisms, monitoring systems, and processes to verify their effectiveness against sophisticated adversaries
- Risk Governance: Organizational structures and processes, including third-party assessments and related transparency mechanisms, that ensure frontier frameworks are properly implemented, maintained, and improved
Although there is a growing consensus about the high-level structure of frontier frameworks, there is still significant variation in how they are implemented.3 Our technical report series aims to inform stakeholders across the frontier AI ecosystem about current practices in the implementation of FAISFs and support efforts to develop standards for their implementation.
Harmonizing approaches to implementing frontier frameworks creates substantial benefits: it establishes clear, consistent expectations for what constitutes adequate safety and security practices, and it enables meaningful comparisons between different organizations’ assurances about the security and reliability of their most advanced models and systems. Most importantly, harmonizing the implementation of FAISFs helps to translate high-level security principles into consistent, effective practices that can be evaluated and improved over time.
Looking Ahead and Forthcoming Reports
As AI capabilities continue to advance, robust and effective FAISFs will become increasingly important for maintaining public trust in frontier AI. Our inaugural report focuses on Frontier Capability Assessments—procedures and methodologies for determining whether models possess capabilities that could pose significant risks.
Future reports in this series will address the following elements of frontier frameworks:
- Risk taxonomies and thresholds
- Mitigations and safeguards
- Third-party assessments
We welcome engagement with these technical reports from across the frontier AI safety and security ecosystem. Researchers and organizations interested in further refining and harmonizing the implementation of frontier frameworks are invited to reach out to the Frontier Model Forum.
We will update this post in the coming months as further technical reports are published.
1. See METR’s Update on Responsible Scaling Policies (September 2023) and Anthropic’s original announcement of their Responsible Scaling Policy (September 2023).
2. Published frameworks include: Amazon’s Frontier Model Safety Framework, Anthropic’s Responsible Scaling Policy, Cohere’s Secure AI Frontier Model Framework, G42’s Frontier AI Safety Framework, Google DeepMind’s Frontier Safety Framework, Magic’s AGI Readiness Framework, Meta’s Frontier AI Framework, Microsoft’s Frontier Governance Framework, NAVER’s AI Safety Framework, NVIDIA’s Frontier AI Risk Assessment, OpenAI’s Preparedness Framework, and xAI’s Risk Management Framework.
3. For example, see METR, Common Elements of Frontier AI Safety Policies; the FMF, Components of Safety Frameworks; and Buhl, Bucknell, and Masterson, Emerging Practices in Frontier AI Safety Frameworks.