Claude Sonnet 3.7 is the leading LLM for AI SEO: Report

Benchmark reveals which LLMs you can use for some SEO tasks. It also reminds us that humans are more reliable than AI (for now anyway).

Danny Goodwin on April 29, 2025 at 10:50 am | Reading time: 2 minutes

Claude Sonnet 3.7 is the top-performing large language model (LLM) for SEO tasks, outperforming competitors like Google’s Gemini, Meta’s Llama, and xAI’s Grok. That’s according to SEO agency Previsible’s new AI SEO Benchmark report.

By the numbers. Claude Sonnet 3.7 “performed the best across the board,” earning an 83% score. But that score fell short of human SEOs, who scored 89%.

LLMs averaged:

85% on content tasks.
79% on technical SEO.
63% on ecommerce SEO.

Here’s how the other language models scored:

Perplexity: 82%
Gemini 2.5: 81%
ChatGPT 4o: 79%
ChatGPT o3-mini: 78%
Copilot: 78%
Deepseek: 78%
Gemini 2.0 Flash: 71%
Llama 4: 71%
Grok 3: 71%

Why we care. AI is getting better at handling various routine SEO tasks (e.g., content generation, keyword mapping). However, the real value in SEO comes from human expertise: strategic planning, technical execution, cross-discipline collaboration, and creative problem-solving. Relying too heavily on LLMs could expose brands to costly SEO mistakes and lost search visibility.

Persona helps. One interesting finding was that adding a persona to a prompt (e.g., “you are an SEO expert”) improves performance by 2.8%, on average.

What doesn’t help. Allowing LLMs to use web search resulted in 3.2% worse performance on average. Also, deep research resulted in 5.7% worse performance, on average.

About the data. Previsible created a 50-question SEO test set covering key categories like content, technical SEO, and ecommerce. Each question had objectively correct answers based on established best practices and was independently scored by multiple SEO experts to ensure consistency.

The benchmark measures accuracy – so an 83% score means a model answered 83% of questions correctly. All models were tested across different modes (e.g., with and without SEO personas, web search access) to evaluate how various features impacted performance.
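
A rough illustration. The scoring described above can be sketched in a few lines of Python. Everything in this sketch is an assumption made for illustration: the `results` data is invented (10 questions rather than 50), the mode names are hypothetical, and the difference is reported in percentage points, which may not match how Previsible calculated its 2.8%, 3.2%, and 5.7% figures.

```python
# Hypothetical graded answers: one True/False per question, per prompt mode.
# These values are invented for illustration; they are NOT Previsible's data.
results = {
    "baseline":   [True, True, False, True, False, True, True, True, False, True],
    "persona":    [True, True, True, True, False, True, True, True, False, True],
    "web_search": [True, False, False, True, False, True, True, True, False, True],
}


def accuracy(graded):
    """Return the share of questions answered correctly, as a percentage."""
    return 100 * sum(graded) / len(graded)


# Accuracy for each mode.
scores = {mode: accuracy(graded) for mode, graded in results.items()}

# Difference from the baseline mode, in percentage points.
baseline = scores["baseline"]
deltas = {
    mode: score - baseline
    for mode, score in scores.items()
    if mode != "baseline"
}

for mode, score in scores.items():
    print(f"{mode}: {score:.1f}%")
for mode, delta in deltas.items():
    print(f"{mode} vs. baseline: {delta:+.1f} points")
```

With this toy data, the persona mode lands above the baseline and the web search mode below it, mirroring the direction (not the size) of the effects the report describes.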

Between the lines. The core flaw of using LLMs for SEO? AI is probabilistic – it predicts, it doesn’t know.

“Until [models] are 99%+ reliable, it’s impossible to rely too heavily on them. Your best bet is using them for what they’re good at – like building content briefs or identifying internal link opportunities using embeddings,” according to David Bell, Previsible SEO co-founder.
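
What that could look like. Bell doesn’t describe an implementation, so the following is only a minimal sketch of one common way to use embeddings for internal linking: compare page vectors with cosine similarity and suggest links between the closest pages. The URLs, the three-dimensional vectors, the `link_candidates` helper, and the 0.8 threshold are all hypothetical. In practice the vectors would come from an embedding model and have hundreds of dimensions.

```python
import math

# Hypothetical page embeddings, shortened to 3 dimensions for readability.
# Real embeddings would be produced by an embedding model.
pages = {
    "/guides/technical-seo": [0.9, 0.1, 0.0],
    "/guides/site-speed": [0.8, 0.2, 0.1],
    "/blog/holiday-recipes": [0.0, 0.1, 0.9],
}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def link_candidates(source, pages, threshold=0.8):
    """Return (url, similarity) pairs at or above the threshold, best first."""
    src = pages[source]
    scored = [
        (url, cosine(src, vec))
        for url, vec in pages.items()
        if url != source
    ]
    matches = [(url, sim) for url, sim in scored if sim >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


candidates = link_candidates("/guides/technical-seo", pages)
for url, sim in candidates:
    print(f"{url}: {sim:.2f}")
```

Here only the site speed guide clears the threshold as a link candidate for the technical SEO guide. A human would still need to review each suggestion before adding a link.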

What’s next. Previsible plans to update its AI SEO Benchmark here.

The report. Leaderboard Launch: Previsible’s New AI SEO Benchmark

About the author

Danny Goodwin

Danny Goodwin is Editorial Director of Search Engine Land & Search Marketing Expo - SMX. He joined Search Engine Land in 2022 as Senior Editor. In addition to reporting on the latest search marketing news, he manages Search Engine Land’s SME (Subject Matter Expert) program. He also helps program U.S. SMX events.

Goodwin has been editing and writing about the latest developments and trends in search and digital marketing since 2007. He was previously Executive Editor of Search Engine Journal (from 2017 to 2022), managing editor of Momentology (from 2014 to 2016), and editor of Search Engine Watch (from 2007 to 2014). He has spoken at many major search conferences and virtual events, and has been cited for his expertise by a wide range of publications and podcasts.