How Alibaba Helped China Take the Lead From the U.S. in Open-Source AI

From left: Eddie Wu, Jack Ma and Zhou Jingren. Photo: Getty


When Alibaba Group’s cloud business unit announced the first generation of artificial intelligence models developed by its own team in 2023, the Chinese tech giant said in a statement that it would integrate the models, known as Qwen, into its various businesses “in the near future.”
The reality wasn’t so simple. Each of the company’s six business units—including its cash-cow China e-commerce business and its online entertainment services unit—was making its own technology-buying decisions. And some teams building AI applications were so unimpressed by Qwen’s capabilities that they kept using other companies’ AI models, such as Meta Platforms’ Llama, well into 2024, according to employees with knowledge of the matter. More recently, some Alibaba apps chose to use DeepSeek’s R1 to power their AI features.
The Takeaway
• Alibaba first found it difficult to get its own units to use its Qwen models
• Qwen models are now regarded as better than Llama
• Co-founder Jack Ma got personally involved to encourage Qwen team
A lot has changed since then. Alibaba is now in the lead in open-source AI globally, ahead of Meta Platforms’ Llama on several benchmarks. And while Alibaba’s biggest model is neck and neck with DeepSeek’s R1 model, business users say they prefer Alibaba’s because it offers a broader lineup of models, including smaller ones that cost less to run than DeepSeek’s most up-to-date R1 model. Alibaba’s own business units have switched over completely to Qwen. At the same time, Alibaba is winning over outside businesses as it establishes itself as China’s biggest provider of open-source AI models.
As of January, more than 290,000 customers were using its Qwen models, in various industries such as automotive, healthcare, education and agriculture, according to the company. Some AI application startups are now opting for the Alibaba-developed models over Llama when building their software. Alibaba Cloud is also trying to increase the global presence of Qwen models. In Japan, for example, Tokyo-based AI developer Abeja has used Qwen to develop multiple large language models this year that are specially designed for the Japanese language.
The success of Qwen and DeepSeek demonstrates how Chinese firms are starting to take the lead from the U.S. in open-source AI, one major front in the international AI race. That has enormous implications, as the low cost of open-source AI software means businesses are more likely to adopt it. Chinese tech giants like Alibaba could reshape the global AI software ecosystem if more developers around the world use Chinese open-source models.
“Focusing on open-source AI models could enable Chinese companies to have a global impact. Popular open-source models can tap into the collective knowledge of developers and researchers around the world who use the models, and constant feedback from those communities can help accelerate improvements,” said Martin Saerbeck, co-founder and chief technology officer of Aiquris, a Singapore-based firm that helps global businesses adopt AI and manage potential risks.
Chinese open-source models like those from Alibaba and DeepSeek could also help accelerate the adoption of AI in China and trigger a proliferation of domestic AI applications, both for enterprises and consumers. The potential impact is huge, given China’s vast market and the growing acceptance of open-source AI solutions among state-owned enterprises and government agencies.
Last week, Nvidia CEO Jensen Huang said during the company’s earnings conference call that DeepSeek and Alibaba’s Qwen are “among the best open-source AI models.” Huang also talked about how the U.S. can benefit from those Chinese open-source models by deploying and optimizing them on U.S. platforms. “America wins when models like DeepSeek and Qwen run best on American infrastructure,” he said.
When Nvidia’s AI research team recently developed new AI models called Cosmos-Reason1 that could be used for robots, autonomous vehicles and other applications that require an ability to understand the physical world, the team used an Alibaba open-source model as the basis for one of the Cosmos-Reason1 models, according to a paper published by Nvidia last month.
For Alibaba Cloud, China’s largest cloud service provider, a broad lineup of open-source Qwen models that come in all sizes and specifications could motivate more businesses to start using Alibaba’s cloud computing platform, according to employees.
How Alibaba took the lead in open-source AI is a lesson for U.S. tech giants, including Amazon, Microsoft and Google, which operate in a more centralized fashion than the Chinese company. Alibaba made its decision to allow its different business units to operate autonomously as a prelude to a breakup of the company that didn’t end up happening. But it proved to be a lucky break for Alibaba, forcing its AI engineers to work harder at making the models more appealing.
The engineers realized that if they couldn’t convince Alibaba’s own business units that Qwen models were the best, they wouldn’t be able to convince outside customers, either.
A Thousand Questions
Alibaba was one of China’s early movers in AI model development. In 2021, a year before OpenAI released ChatGPT, Alibaba’s research institute, Damo Academy, launched an AI model called M6. It was based on the transformer architecture that Google engineers had developed and that OpenAI used for its GPT generative AI models such as GPT-2, released in 2019.
In late 2022, when OpenAI released ChatGPT, sparking a wave of excitement in the tech industry around the world, Alibaba ramped up its efforts. It promoted Alibaba executive Zhou Jingren, a Microsoft veteran who had joined Alibaba in 2015 and had worked on M6, to be Alibaba Cloud’s chief technology officer.
Zhou set about developing a new generation of AI models under the name Tongyi Qianwen, or Qwen for short. In Mandarin, “Tongyi” means “extensive knowledge” and “Qianwen” means “a thousand questions.” Together, the moniker represents Alibaba’s ambition in the LLM space.
Alibaba Cloud unveiled the first version in April 2023 and the second, Qwen2, six months later.
At the time, China’s domestic race to develop LLMs was still in the early stages. Alibaba and other Chinese companies were trying to catch up with U.S. leaders like OpenAI, Anthropic, Google and Meta. Dozens of local players, tech giants and startups alike, were rushing to build their foundation models. The market was so crowded and the competition so intense that the Chinese media dubbed the phenomenon “the war of a hundred models.”
While Alibaba was grappling with the intensifying AI race, the company went through a historic shake-up. In early 2023—in the wake of the Chinese government’s antitrust crackdown, which landed Alibaba a record $2.8 billion fine—the company announced it would split itself into six highly independent business groups under a holding company. It was both a response to Chinese regulators’ unhappiness with big tech conglomerates and an effort to rejuvenate growth within the company. Alibaba at the time said the split would allow each business unit to respond more quickly to market changes.
In September 2023, Alibaba’s then-CEO, Daniel Zhang, stepped down, replaced by Eddie Wu, one of the 18 founding members who built the company in 1999. Wu, an engineer who had served as chief technology officer of multiple Alibaba businesses, focused his attention primarily on AI strategy once he took the helm.
In the first half of 2024, Alibaba Cloud stepped up its efforts to persuade the other business units to use Qwen models for all of their AI products. Alibaba Cloud employees reached out to various units and tried to talk to teams that were working on AI applications and features. But after the 2023 reorganization, business units communicated less. Employees of one unit often had little knowledge of other units’ organizational structures or who was in charge of what.
At that time, the company’s AI development work focused as heavily on proprietary versions of Qwen models as on open-source versions. But over the past year, Alibaba’s priorities gradually shifted toward open-source models, as Qwen’s open-source versions began to receive more feedback from AI developer communities in both China and the U.S., and startups, academic researchers and doctoral students started using them to build their own custom AI models.
In contrast, the proprietary Qwen models, which were up against the best models from OpenAI, Anthropic and Google, as well as from Chinese competitors like ByteDance, didn’t attract as much attention.
The Qwen team’s first major breakthrough in terms of public recognition came in late 2024, after the release of its Qwen2.5 open-source models, which received positive feedback from developers in China and the U.S. and helped establish Alibaba as one of the leaders in open-source models. Inside Alibaba, many teams developing AI applications also adopted Qwen2.5.
The open-source versions of Qwen2.5, released in September last year, “significantly outperformed” Llama 3, which had come out earlier in the year, said Tony Ren, founder of agentic AI startup ReOrc.
But the success of DeepSeek quickly overshadowed the brief excitement over Qwen2.5. A two-year-old offshoot of a Chinese quantitative hedge fund, DeepSeek shot to global stardom in early February as its R1 open-source reasoning model shocked the global tech industry with its strong performance and low development cost.
Many of Alibaba’s cloud services customers asked to use the DeepSeek model, so Alibaba Cloud added R1 to its offerings of AI models. Some of Alibaba’s own AI applications and features also adopted DeepSeek. For example, Alibaba’s popular travel app, Fliggy, decided to use R1 to build its new AI travel assistant feature, AskMe, launched in April this year, according to an employee with knowledge of the matter.
Alibaba.com, which helps merchants outside China find products from Chinese suppliers, also integrated R1 into its AI search app, Accio. Some of Alibaba’s business intelligence teams also adopted R1 in their internal analytical tools.
Jack Ma’s Attention
DeepSeek’s success put enormous pressure on the Qwen team. Even Jack Ma, Alibaba’s iconic founder, who had stepped down from executive and board roles six years ago, frequently asked Zhou, the Alibaba Cloud CTO, to provide updates on the progress of Qwen3 development, according to two people with knowledge of the matter. Ma’s attention reminded the Qwen team’s members that Qwen3 was the top priority not only for Alibaba Cloud but for Alibaba as a whole.
Adding to the pressure, Alibaba wanted the new models to come out before DeepSeek launched its highly anticipated successor to R1. In the office, Qwen team members sometimes took turns taking power naps at night on mattresses kept under their desks. During the final week before Qwen3’s April launch, some members slept only five or six hours in total for the whole week, according to an employee.
Meanwhile, Meta’s AI team, responsible for the company’s Llama models, was working just as hard to catch up to DeepSeek and other rivals. In early April, Meta unveiled Llama 4, the latest generation of its open-source AI models, which received a lukewarm reception from some critics who said improvements from the previous generation were too incremental. That was a relief for Alibaba’s Qwen team, which internally tested Llama 4, according to two employees. They became more confident that their upcoming Qwen3 models would receive positive feedback from global AI developer communities.
In late April, Alibaba finally released Qwen3, a suite of eight models that come in various sizes and specifications. And all eight of them were open-source models, highlighting Alibaba’s strategic priority. The company said Qwen3 can switch between “thinking mode” for performing complex tasks like math and coding, and “nonthinking mode” for quick responses to simpler prompts, depending on users’ preferences. Wu, the Alibaba CEO, said during an earnings call last month that the company is firmly committed to open-source AI. “We believe the full open sourcing of Qwen3 will drive innovation and the new applications by developers, startups and enterprises,” he said.
Several versions of Alibaba’s latest-generation Qwen3 models, released in late April, outperform Meta’s latest Llama 4 models, according to AI model leaderboards LiveBench and Artificial Analysis. The largest version of Qwen3 initially surpassed DeepSeek’s R1 on those leaderboards, but DeepSeek last week released an updated version of R1, which once again surpassed Qwen3.
Alibaba’s own AI products that previously used DeepSeek are now relying on Qwen. Fliggy, the Alibaba travel app, is switching the foundation model for its AskMe AI travel assistant from R1 to Qwen3, according to the employee with knowledge of the matter. Accio, the AI search app for merchants, is also adopting Qwen3 while phasing out its usage of R1.
Ren, of ReOrc, which is building enterprise AI agents for customers both within and outside China, said he sees big potential to develop enterprise agents on Qwen3 for overseas customers.
Though Alibaba’s business units continue to operate independently, the growing importance of Qwen is helping to bring them closer. Many teams from various business units are now talking to Alibaba Cloud about their plans to develop more-capable AI agents powered by Qwen3. Employees of multiple units are also discussing potential future collaborations where units can access each other’s AI agents, so all of the agents can perform more diverse tasks for users, according to an employee with knowledge of such discussions.
Juro Osawa is a reporter covering tech in Asia, from Alibaba and Tencent to startups. He previously worked for The Wall Street Journal. He is based in Hong Kong and can be found on Twitter at @JuroOsawa.
Qianer Liu is a reporter for The Information covering semiconductors and AI in Asia. She is based in Hong Kong and can be reached at qianer@theinformation.com or @QianerLiu on X.