DeepSeek quietly updates open-source model that handles maths proofs

The Chinese start-up has released the Prover-V2 model a day after Alibaba released Qwen3, and ahead of an anticipated release of DeepSeek-R2

Reading Time: 2 minutes
DeepSeek’s quiet release of its latest open-source model for maths proofs comes amid a flurry of releases from Chinese AI firms. Photo: AFP
Ben Jiang in Beijing
Chinese start-up DeepSeek quietly open-sourced a new specialist artificial intelligence (AI) model on Wednesday, just a day after Alibaba unveiled the third generation of its Qwen family, as competition heats up in the race to advance generative AI capabilities.

Hangzhou-based DeepSeek uploaded its latest open-source Prover-V2 model to Hugging Face, the world’s largest open-source AI community, without making any announcements on its official social media channels. This comes amid growing anticipation for its new R2 reasoning model, which is expected to launch soon.

DeepSeek’s Prover series consists of domain-specific models designed to solve math-related problems.

The company has yet to provide any details about the model on its Hugging Face page. Uploaded files viewed by the Post suggest that it was built on top of DeepSeek’s V3 model, which has 671 billion parameters and adopts a mixture-of-experts architecture for cost-efficient training and operation.

The development of a math-focused model that could enhance a general-purpose foundational model’s mathematical skills has fueled speculation that DeepSeek will soon launch additional models.

DeepSeek did not immediately respond to a request for comment on Wednesday.

The stealth launch of Prover-V2 came on the heels of Alibaba’s release of Qwen3. The e-commerce giant said, citing benchmarks, that its newest model outperformed DeepSeek-R1 and OpenAI’s o1 reasoning models.

Prover-V2 is an update to its predecessor, Prover-V1.5, which debuted in August – four months before DeepSeek stunned the world with its V3 model. The company said that V3 was developed at a fraction of the cost and energy used by Western peers in training advanced AI models.

In a technical report for Prover-V1.5, DeepSeek said its work on pre-training the specialist model had advanced its base model’s capabilities in formal theorem proving and mathematical reasoning.

While DeepSeek has not openly shared its timeline or progress for new models, the company has regularly published its latest research results, including updates to the Prover model.

Last month, DeepSeek also launched an update to its V3 foundational model, which features enhanced reasoning, optimised coding and upgraded Chinese writing capabilities, according to a notice on the company’s website.

Ben Jiang
Ben is a Beijing-based technology reporter for the Post focusing on emerging start-ups. He has previously covered Chinese tech for publications including KrAsia and TechNode.

Smartphone giant Xiaomi unveils AI model, joining fierce competition in China

Xiaomi says its open-source MiMo reasoning model, trained completely in-house, rivals the performance of OpenAI’s o1-mini and Alibaba’s QwQ-32B

Xiaomi, which recently broke into China’s electric vehicle market, is now making its own artificial intelligence models as it looks to infuse its hardware with generative AI. Photo: AFP
Chinese smartphone and electric vehicle maker Xiaomi on Friday unveiled a new reasoning artificial intelligence (AI) model developed in-house, underscoring the company’s ambition to integrate its hardware products with home-grown generative AI.
The open-source MiMo model has 7 billion parameters and outperformed OpenAI’s o1-mini and Alibaba Group Holding’s QwQ-32B-Preview, part of the Qwen series of models, in maths reasoning and coding, Xiaomi said in a statement. Alibaba owns the South China Morning Post.

MiMo is Xiaomi’s first large language model (LLM), and the company said it was developed using reinforcement learning by its specialised AI task force, known as Core.

Xiaomi’s stock price in Hong Kong rose 5.3 per cent on Friday to HK$49.95, while shares of Kingsoft Cloud Holdings – in which Xiaomi holds a 10 per cent stake and Xiaomi CEO Lei Jun holds 11 per cent – jumped 14.2 per cent to HK$7.4.

The launch of the model aligns with earlier reports that Xiaomi had been building up its computing resources. According to a report by local media outlet Jiemian in December, Xiaomi bought about 10,000 graphics processing units to train its models.

Xiaomi’s AI ambitions were evident when the company made an offer to hire Luo Fuli, China’s AI “genius girl” from DeepSeek. Luo, a key contributor to the DeepSeek-V2 model, ultimately declined the offer.