OpenAI has updated its Preparedness Framework — the internal system it uses to assess the safety of AI models and determine necessary safeguards during development and deployment. In the update, OpenAI stated that it may “adjust” its safety requirements if a competing AI lab releases a “high-risk” system without similar protections in place.
The change reflects the increasing competitive pressures on commercial AI developers to deploy models quickly. OpenAI has been accused of lowering safety standards in favor of faster releases, and of failing to deliver timely reports detailing its safety testing. Last week, 12 former OpenAI employees filed a brief in Elon Musk’s case against OpenAI, arguing the company would be encouraged to cut even more corners on safety should it complete its planned corporate restructuring.
Perhaps anticipating criticism, OpenAI claims that it wouldn’t make these policy adjustments lightly, and that it would keep its safeguards at “a level more protective.”
“If another frontier AI developer releases a high-risk system without comparable safeguards, we may adjust our requirements,” wrote OpenAI in a blog post published Tuesday afternoon. “However, we would first rigorously confirm that the risk landscape has actually changed, publicly acknowledge that we are making an adjustment, assess that the adjustment does not meaningfully increase the overall risk of severe harm, and still keep safeguards at a level more protective.”
The refreshed Preparedness Framework also makes clear that OpenAI is relying more heavily on automated evaluations to speed up product development. The company says that while it hasn’t abandoned human-led testing altogether, it has built “a growing suite of automated evaluations” that can supposedly “keep up with [a] faster [release] cadence.”
Some reports contradict this. According to the Financial Times, OpenAI gave testers less than a week to run safety checks on an upcoming major model — a compressed timeline compared to previous releases. The publication's sources also alleged that many of OpenAI's safety tests are now conducted on earlier versions of models rather than the versions released to the public.
In statements, OpenAI has disputed the notion that it’s compromising on safety.
Other changes to OpenAI’s framework pertain to how the company categorizes models according to risk, including models that can conceal their capabilities, evade safeguards, prevent their shutdown, and even self-replicate. OpenAI says that it’ll now focus on whether models meet one of two thresholds: “high” capability or “critical” capability.
OpenAI defines the former as models that could “amplify existing pathways to severe harm.” The latter are models that “introduce unprecedented new pathways to severe harm,” per the company.
“Covered systems that reach high capability must have safeguards that sufficiently minimize the associated risk of severe harm before they are deployed,” wrote OpenAI in its blog post. “Systems that reach critical capability also require safeguards that sufficiently minimize associated risks during development.”
The updates are the first OpenAI has made to the Preparedness Framework since 2023.