Chinese AI firm DeepSeek has emerged as a potential challenger to U.S. AI companies, demonstrating breakthrough models that claim to offer performance comparable to leading offerings at a fraction of the cost. The company's mobile app, released in early January, has recently topped the App Store charts across major markets, including the U.S., UK, and China, though doubts remain about whether its claims hold up.
Founded in 2023 by Liang Wenfeng, the former chief of AI-driven quant hedge fund High-Flyer, DeepSeek releases its models as open source; they incorporate a reasoning feature that articulates the model's thinking before it provides a response.
Wall Street’s reactions have been mixed. While brokerage firm Jefferies warns that DeepSeek’s efficient approach “punctures some of the capex euphoria” following recent spending commitments from Meta and Microsoft — each exceeding $60 billion this year — Citi is questioning whether such results were actually achieved without advanced GPUs.
Goldman Sachs sees broader implications, suggesting the development could reshape competition between established tech giants and startups by lowering barriers to entry.
Here’s how Wall Street analysts are reacting to DeepSeek, in their own words (emphasis ours):
Jefferies
DeepSeek’s power implications for AI training punctures some of the capex euphoria which followed major commitments from Stargate and Meta last week. With DeepSeek delivering performance comparable to GPT-4o for a fraction of the computing power, there are potential negative implications for the builders, as pressure on AI players to justify ever increasing capex plans could ultimately lead to a lower trajectory for data center revenue and profit growth.
If smaller models can work well, it is potentially positive for smartphone. We are bearish on AI smartphone as AI has gained no traction with consumers. More hardware upgrade (adv pkg+fast DRAM) is needed to run bigger models on the phone, which will raise costs. AAPL’s model is in fact based on MoE, but 3bn data parameters are still too small to make the services useful to consumers. Hence DeepSeek’s success offers some hope but there is no impact on AI smartphone’s near-term outlook.
China is the only market that pursues LLM efficiency owing to chip constraint. Trump/Musk likely recognize the risk of further restrictions is to force China to innovate faster. Therefore, we think it likely Trump will relax the AI Diffusion policy.
Citi
While DeepSeek’s achievement could be groundbreaking, we question the notion that its feats were done without the use of advanced GPUs to fine tune it and/or build the underlying LLMs the final model is based on through the Distillation technique. While the dominance of the US companies on the most advanced AI models could be potentially challenged, that said, we estimate that in an inevitably more restrictive environment, US’ access to more advanced chips is an advantage. Thus, we don’t expect leading AI companies would move away from more advanced GPUs which provide more attractive $/TFLOPs at scale. We see the recent AI capex announcements like Stargate as a nod to the need for advanced chips.
Bernstein
In short, we believe that 1) DeepSeek DID NOT “build OpenAI for $5M”; 2) the models look fantastic but we don’t think they are miracles; and 3) the resulting Twitterverse panic over the weekend seems overblown.
Our own initial reaction does not include panic (far from it). If we acknowledge that DeepSeek may have reduced costs of achieving equivalent model performance by, say, 10x, we also note that current model cost trajectories are increasing by about that much every year anyway (the infamous “scaling laws…”) which can’t continue forever. In that context, we NEED innovations like this (MoE, distillation, mixed precision etc) if AI is to continue progressing. And for those looking for AI adoption, as semi analysts we are firm believers in the Jevons paradox (i.e. that efficiency gains generate a net increase in demand), and believe any new compute capacity unlocked is far more likely to get absorbed due to usage and demand increase vs impacting long term spending outlook at this point, as we do not believe compute needs are anywhere close to reaching their limit in AI. It also seems like a stretch to think the innovations being deployed by DeepSeek are completely unknown by the vast number of top tier AI researchers at the world’s other numerous AI labs (frankly we don’t know what the large closed labs have been using to develop and deploy their own models, but we just can’t believe that they have not considered or even perhaps used similar strategies themselves).
Morgan Stanley
We have not confirmed the veracity of these reports, but if they are accurate, and advanced LLM are indeed able to be developed for a fraction of previous investment, we could see generative AI run eventually on smaller and smaller computers (downsizing from supercomputers to workstations, office computers, and finally personal computers) and the SPE industry could benefit from the accompanying increase in demand for related products (chips and SPE) as demand for generative AI spreads.
Goldman Sachs
With the latest developments, we also see 1) potential competition between capital-rich internet giants vs. start-ups, given lowering barriers to entry, especially with recent new models developed at a fraction of the cost of existing ones; 2) from training to more inferencing, with increased emphasis on post-training (including reasoning capabilities and reinforcement capabilities) that requires significantly lower computational resources vs. pre-training; and 3) the potential for further global expansion for Chinese players, given their performance and cost/price competitiveness.
We continue to expect the race for AI application/AI agents to continue in China, especially amongst To-C applications, where China companies have been pioneers in mobile applications in the internet era, e.g., Tencent’s creation of the Weixin (WeChat) super-app. Amongst To-C applications, ByteDance has been leading the way by launching 32 AI applications over the past year. Amongst them, Doubao has been the most popular AI Chatbot thus far in China with the highest MAU (c.70mn), which has recently been upgraded with its Doubao 1.5 Pro model. We believe incremental revenue streams (subscription, advertising) and eventual/sustainable path to monetization/positive unit economics amongst applications/agents will be key.
For the infrastructure layer, investor focus has centered around whether there will be a near-term mismatch between market expectations on AI capex and computing demand, in the event of significant improvements in cost/model computing efficiencies. For Chinese cloud/data center players, we continue to believe the focus for 2025 will center around chip availability and the ability of CSP (cloud service providers) to deliver improving revenue contribution from AI-driven cloud revenue growth, and beyond infrastructure/GPU renting, how AI workloads & AI related services could contribute to growth and margins going forward. We remain positive on long-term AI computing demand growth as a further lowering of computing/training/inference costs could drive higher AI adoption. See also Theme #5 of our key themes report for our base/bear scenarios for BBAT capex estimates depending on chip availability, where we expect aggregate capex growth of BBAT to continue in 2025E in our base case (GSe: +38% yoy) albeit at a slightly more moderate pace vs. a strong 2024 (GSe: +61% yoy), driven by ongoing investment into AI infrastructure.
J.P. Morgan
Above all, much is made of DeepSeek’s research papers, and of their models’ efficiency. It’s unclear to what extent DeepSeek is leveraging High-Flyer’s ~50k hopper GPUs (similar in size to the cluster on which OpenAI is believed to be training GPT-5), but what seems likely is that they’re dramatically reducing costs (inference costs for their V2 model, for example, are claimed to be 1/7 that of GPT-4 Turbo). Their subversive (though not new) claim – that started to hit the US AI names this week – is that “more investments do not equal more innovation.” Liang: “Right now I don’t see any new approaches, but big firms do not have a clear upper hand. Big firms have existing customers, but their cash-flow businesses are also their burden, and this makes them vulnerable to disruption at any time.” And when asked about the fact that GPT5 has still not been released: “OpenAI is not a god, they won’t necessarily always be at the forefront.”
UBS
Throughout 2024, the first year we saw massive AI training workload in China, more than 80-90% IDC demand was driven by AI training and concentrated in 1-2 hyperscaler customers, which translated to wholesale hyperscale IDC demand in relatively remote area (as power-consuming AI training is sensitive to utility cost rather than user latency).
If AI training and inference cost is significantly lower, we would expect more end users would leverage AI to improve their business or develop new use cases, especially retail customers. Such IDC demand means more focus on location (as user latency is more important than utility cost), and thus greater pricing power for IDC operators that have abundant resources in tier 1 and satellite cities. Meanwhile, a more diversified customer portfolio would also imply greater pricing power.
We’ll update the story as more analysts react.