An Anonymous Source Shared Thousands of Leaked Google Search API Documents with Me; Everyone in SEO Should See Them

On Sunday, May 5th, I received an email from a person claiming to have access to a massive leak of API documentation from inside Google’s Search division.

The email further claimed that these leaked documents were confirmed as authentic by ex-Google employees, and that those ex-employees and others had shared additional, private information about Google’s search operations.

Many of their claims directly contradict public statements made by Googlers over the years, in particular the company’s repeated denial that click-centric user signals are employed, denial that subdomains are considered separately in rankings, denials of a sandbox for newer websites, denials that a domain’s age is collected or considered, and more. 

Naturally, I was skeptical. The claims made by this source (who asked to remain anonymous) seemed extraordinary–claims like:

  • In their early years, Google’s search team recognized a need for full clickstream data (every URL visited by a browser) for a large percent of web users to improve their search engine’s result quality.
  • A system called “NavBoost” (cited by VP of Search, Pandu Nayak, in his DOJ case testimony) initially gathered data from Google’s Toolbar PageRank, and desire for more clickstream data served as the key motivation for creation of the Chrome browser (launched in 2008).
  • NavBoost uses the number of searches for a given keyword to identify trending search demand, the number of clicks on a search result (I ran several experiments on this from 2013-2015), and long clicks versus short clicks (which I presented theories about in this 2015 video).
  • Google utilizes cookie history, logged-in Chrome data, and pattern detection (referred to in the leak as “unsquashed” clicks versus “squashed” clicks) as effective means for fighting manual & automated click spam.
  • NavBoost also scores queries for user intent. For example, certain thresholds of attention and clicks on videos or images will trigger video or image features for that query and related, NavBoost-associated queries.
  • Google examines clicks and engagement on searches both during and after the main query (referred to as a “NavBoost query”).

    For instance, if many users search for “Rand Fishkin,” don’t find SparkToro, and immediately change their query to “SparkToro” and click SparkToro.com in the search result, SparkToro.com (and websites mentioning “SparkToro”) will receive a boost in the search results for the “Rand Fishkin” keyword.
  • NavBoost’s data is used at the host level for evaluating a site’s overall quality (my anonymous source speculated that this could be what Google and SEOs called “Panda”). This evaluation can result in a boost or a demotion.
  • Other minor factors such as penalties for domain names that exactly match unbranded search queries (e.g. mens-luxury-watches.com or milwaukee-homes-for-sale.net), a newer “BabyPanda” score, and spam signals are also considered during the quality evaluation process.
  • NavBoost geo-fences click data, taking into account country and state/province levels, as well as mobile versus desktop usage. However, if Google lacks data for certain regions or user-agents, they may apply the process universally to the query results.
  • During the Covid-19 pandemic, Google employed whitelists for websites that could appear high in the results for Covid-related searches
  • Similarly, during democratic elections, Google employed whitelists for sites that should be shown (or demoted) for election-related information

And these are only the tip of the iceberg.

Extraordinary claims require extraordinary evidence. And while some of these overlap with information revealed during the Google/DOJ case (some of which you can read about on this thread from 2020), many are novel and suggest insider knowledge.

So, this past Friday, May 24th (following several emails), I had a video call with the anonymous source.

Update (5/28 at 10:00am Pacific): The anonymous source has decided to come forward. This video announces their identity, Erfan Azimi, an SEO practitioner and the founder of EA Eagle Digital.

Prior to the email and call, I had neither met nor heard of Erfan. He asked that his identity remain veiled, and that I merely include the quote below:

An eagle uses the storm to reach unimaginable heights.

– Matshona Dhliwayo

After the call I was able to confirm details of Erfan’s work history, people we both know from the marketing world, and several of their claims about being at particular events with industry insiders (including Googlers), though I cannot confirm details of the meetings nor the content of discussions they claim to have had.

During our call, Erfan showed me the leak itself: more than 2,500 pages of API documentation containing 14,014 attributes (API features) that appear to come from Google’s internal “Content API Warehouse.” Based on the document’s commit history, this code was uploaded to GitHub on Mar 27, 2024 and not removed until May 7, 2024.

(Note: because this piece was, post-publishing, edited to reflect Erfan’s identity, he’s referred to below as “the anonymous source”).

This documentation doesn’t show things like the weight of particular elements in the search ranking algorithm, nor does it prove which elements are used in the ranking systems. But, it does show incredible details about data Google collects.

Here’s an example of the document format:

After walking me through a handful of these API modules, the source explained their motivations (around transparency, holding Google to account, etc.) and their hope: that I would publish an article sharing this leak, revealing some of the many interesting pieces of data it contained, and refuting some “lies” Googlers “had been spreading for years.”

Is this API Leak Authentic? Can We Trust It?

A critical next step in the process was verifying the authenticity of the API Content Warehouse documents. So, I reached out to some ex-Googler friends, shared the leaked docs, and asked for their thoughts.

Three ex-Googlers wrote back: one said they didn’t feel comfortable looking at or commenting on it. The other two shared the following (off the record and anonymously):

  • “I didn’t have access to this code when I worked there. But this certainly looks legit.”
  • “It has all the hallmarks of an internal Google API.”
  • “It’s a Java-based API. And someone spent a lot of time adhering to Google’s own internal standards for documentation and naming.”
  • “I’d need more time to be sure, but this matches internal documentation I’m familiar with.”
  • “Nothing I saw in a brief review suggests this is anything but legit.”

Next, I needed help analyzing and deciphering the naming conventions and more technical aspects of the documentation. I’ve worked with APIs a bit, but it’s been 20 years since I wrote code and 6 years since I practiced SEO professionally.

So, I reached out to one of the world’s foremost technical SEOs: Mike King, founder of iPullRank.

During a 40-minute phone call on Friday afternoon, Mike reviewed the leak and confirmed my suspicions: this appears to be a legitimate set of documents from inside Google’s Search division, and contains an extraordinary amount of previously-unconfirmed information about Google’s inner workings.

2,500 technical documents is an unreasonable amount of material to ask one man (a dad, husband, and entrepreneur, no less) to review in a single weekend. But, that didn’t stop Mike from doing his best.

He’s put together an exceptionally detailed initial review of the Google API leak here, which I’ll reference more in the findings below. And he’s also agreed to join us at SparkTogether 2024 in Seattle, WA on Oct. 8, where he’ll present the fully transparent story of this leak in far greater detail, and with the benefit of the next few months of analysis.


Qualifications and Motivations for this Post

Before we go further, a few disclaimers: I no longer work in the SEO field. My knowledge of and experience with SEO is 6+ years out of date.

I don’t have the technical expertise or knowledge of Google’s internal operations to analyze an API documentation leak and confirm with certainty whether it’s authentic (hence getting Mike’s help and the input of ex-Googlers).

So why publish on this topic?

Because when I spoke to the party that sent me this information, I found them credible, thoughtful, and deeply knowledgeable. Despite going into the conversation deeply skeptical, I could identify no red flags, nor any malicious motivation.

This person’s sole aim appeared quite aligned with my own: to hold Google accountable for public statements that conflict with private conversations and leaked documentation, and to bring greater transparency to the field of search marketing.

And they believed that, despite my years removed from SEO, I was the best person to share this publicly.

These are goals I cared about deeply for almost two decades. And while my professional life has moved on (I now run two companies: SparkToro, which makes audience research software, and Snackbar Studio, an indie video game developer), my interest in and connections to the world of Search Engine Optimization remain strong. I feel a deep obligation to share information about how the world’s dominant search engine works, especially information Google would prefer to keep quiet.

And sadly, I’m not sure where else to send something this potentially groundbreaking.

Years ago, before he left journalism to become Google’s Search Liaison, Danny Sullivan would have been my go-to source for a leak of this magnitude. He had the gravitas, resume, knowledge, and experience to examine a claim like this and present it fairly in the court of public opinion.

There have been so many times in the last few years I’ve wished for Danny’s calm, even-handed, tough-but-fair-on-Google approach to newsworthy pieces like this–pieces that could reach as far as the company’s statements on the witness stand (e.g. his eloquent writing on Google’s indefensible privacy claims about organic keyword data).

Whatever Google’s paying him, it isn’t nearly enough.

Apologies that instead of Danny, dear reader, you’re stuck with me. But since you are, I’m going to assume you may not be familiar with my background or credentials, and briefly share those.

  • I started doing SEO for small businesses in the Seattle area in 2001, and co-founded the SEO consultancy that would become Moz (originally called SEOmoz) in 2003.
  • For the next 15 years, I worked in the search marketing industry and was often recognized as an influential leader in that field. I authored/co-authored Lost and Founder: A Painfully Honest Field Guide to the Startup World, The Art of SEO, and Inbound Marketing and SEO.
  • Publications including the WSJ, Inc, Forbes, and hundreds more have written about and quoted me on the world of SEO and Google search, many of them citing a popular weekly video series I hosted for a decade: Whiteboard Friday.
  • Moz grew to 35,000+ paying customers of its SEO software, revenues of $50M+, and a team of ~200 before being sold to a private equity buyer in 2021. I left in 2018 and started SparkToro, and in 2023, Snackbar Studio.
  • I dropped out of college at the University of Washington in 2001 and do not hold a degree, yet my work on Google and SEO has been cited by the United States Congress, the US Federal Trade Commission, the Wall Street Journal, New York Times, and John Oliver’s Last Week Tonight, among dozens of others.
  • I hold several patents around the design of a web scale link index, and am the creator of numerous link-index metrics, including Domain Authority, a machine-learning based score commonly used in the digital marketing world to assess a website’s capability to rank in Google’s search engine.

OK. Back to the Google leak.


What is the Google API Content Warehouse?

When looking through the massive trove of API documentation, the first reasonable set of questions might be: “What is this? What is it used for? Why does it exist in the first place?”

The leak appears to come from GitHub, and the most credible explanation for its exposure matches what my anonymous source told me on our call: these documents were inadvertently and briefly made public (many links in the documentation point to private GitHub repositories and internal pages on Google’s corporate site that require specific, Google-credentialed logins).

During this probably-accidental, public period between March and May of 2024, the API documentation was spread to Hexdocs (which indexes public GitHub repos) and found/circulated by other sources (I’m certain that others have a copy, though it’s odd that I could find no public discourse until now).

According to my ex-Googler sources, documentation like this exists on almost every Google team, explaining various API attributes and modules to help familiarize those working on a project with the data elements available.

This leak matches others in public GitHub repositories and on Google’s Cloud API documentation, using the same notation style, formatting, and even process/module/feature names and references.

If that all sounds like a technical mouthful, think of this as instructions for members of Google’s search engine team. It’s like an inventory of books in a library, a card catalogue of sorts, telling those employees who need to know what’s available and how they can get it.

But, whereas libraries are public, Google search is one of the most secretive, closely-guarded black boxes in the world. In the last quarter century, no leak of this magnitude or detail has ever been reported from Google’s search division.

How certain can we be that Google’s search engine uses everything detailed in these API docs?

That’s open to interpretation. Google could have retired some of these, used others exclusively for testing or internal projects, or may even have made API features available that were never employed.

However, there are references in the documentation to deprecated features and specific notes on others indicating they should no longer be used. That strongly suggests those not marked with such details were still in active use as of the March, 2024 leak.

We also can’t say for certain whether the March leak is of the most recent version of this documentation. The most recent date I can find referenced in the API docs is August of 2023:

The relevant text reads:

“The domain-level display name of the website, such as “Google” for google.com. See go/site-display-name for more details. As of Aug 2023, this field is being deprecated in favor of info.[AlternativeTitlesResponse].site_display_name_response field, which also contains host-level site display names with additional information.”

A reasonable reader would conclude that the documentation was up-to-date as of last summer (references to other changes in 2023 and earlier years, all the way back to 2005, are also present), and possibly even up-to-date as of the March 2024 date of disclosure.

Google search obviously changes massively from year to year, and recent introductions like their much-maligned AI Overviews do not make an appearance in this leak. Which of the items mentioned are actively used today in Google’s ranking systems? That’s open to speculation. This trove contains fascinating references, many that will be entirely new to non-Google-search-engineers.

But, I would urge readers not to point to a particular API feature in this leak and say: “SEE! That’s proof Google uses XYZ in their rankings.” It’s not quite proof. It’s a strong indication, stronger than patent applications or public statements from Googlers, but still no guarantee.

That said, it’s as close to a smoking gun as anything since Google’s execs testified in the DOJ trial last year. And, speaking of that testimony, much of it is corroborated and expanded on in the document leak, as Mike details in his post. 👀

What can we learn from the Data Warehouse Leak?

I expect that interesting and marketing-applicable insights will be mined from this massive file set for years to come. It’s simply too big and too dense to think that a weekend of browsing could unearth a comprehensive set of takeaways, or even come close.

However, I will share five of the most interesting, early discoveries in my perusal, some that shed new light on things Google has long been assumed to be doing, and others that suggest the company’s public statements (especially those on what they “collect”) have been erroneous.

Because doing so would be tedious and could be perceived as personal grievances (given Google’s historic attacks on my work), I won’t bother showing side-by-sides of what Googlers said vs. what this document insinuates. Besides, Mike did a great job of that in his post.

Instead, I’ll focus on interesting and/or useful takeaways, and my conclusions from the whole of the modules I’ve been able to review, Mike’s piece on the leak, and how this combines with other things we know to be true of Google.

#1: Navboost and the use of clicks, CTR, long vs. short clicks, and user data

A handful of modules in the documentation make reference to features like “goodClicks,” “badClicks,” “lastLongestClicks,” impressions, squashed, unsquashed, and unicorn clicks. These are tied to Navboost and Glue, two words that may be familiar to folks who reviewed Google’s DOJ testimony. Here’s a relevant excerpt from DOJ attorney Kenneth Dintzer’s cross-examination of Pandu Nayak, VP of Search on the Search Quality team:

Q. So remind me, is navboost all the way back to 2005?

A. It’s somewhere in that range. It might even be before that.

Q. And it’s been updated. It’s not the same old navboost that it was back then?

A. No.

Q. And another one is glue, right?

A. Glue is just another name for navboost that includes all of the other features on the page.

Q. Right. I was going to get there later, but we can do that now. Navboost does web results, just like we discussed, right?

A. Yes.

Q. And glue does everything else that’s on the page that’s not web results, right?

A. That is correct.

Q. Together they help find the stuff and rank the stuff that ultimately shows up on our SERP?

A. That is true. They’re both signals into that, yes.

A savvy reader of these API documents would find they support Mr. Nayak’s testimony (and align with Google’s patent on site quality):

Google appears to have ways to filter out clicks they don’t want to count in their ranking systems, and include ones they do. They also seem to measure length of clicks (i.e. pogo-sticking – when a searcher clicks a result and then quickly clicks the back button, unsatisfied by the answer they found) and impressions.

Plenty has already been written about Google’s use of click data, so I won’t belabor the point. What matters is that Google has named and described features for that measurement, adding even more evidence to the pile.
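
To make the long-versus-short click idea concrete, here is a minimal sketch of how clicks might be bucketed by dwell time. Everything in it is an assumption for illustration: the 30-second threshold, the `Click` type, and the `summarize_clicks` helper are hypothetical, and the leak does not reveal how Google actually defines or weights features such as goodClicks or badClicks.

```python
from dataclasses import dataclass

# Hypothetical threshold: the leak does not say how long a "long" click is.
LONG_CLICK_SECONDS = 30.0


@dataclass
class Click:
    url: str
    dwell_seconds: float  # time spent before returning to the results page


def summarize_clicks(clicks):
    """Count long and short clicks per URL (illustrative only)."""
    summary = {}
    for click in clicks:
        counts = summary.setdefault(click.url, {"long": 0, "short": 0})
        key = "long" if click.dwell_seconds >= LONG_CLICK_SECONDS else "short"
        counts[key] += 1
    return summary


# Made-up sample data: one satisfied visit and two quick bounces.
clicks = [
    Click("example.com/a", 95.0),
    Click("example.com/a", 4.0),
    Click("example.com/b", 2.5),
]
summary = summarize_clicks(clicks)
```

A quick bounce back to the results page (pogo-sticking) lands in the "short" bucket here; how such counts would feed a ranking system is not something the documentation shows.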

#2: Use of Chrome browser clickstreams to power Google Search

My anonymous source claimed that way back in 2005, Google wanted the full clickstream of billions of Internet users, and with Chrome, they’ve now got it.

The API documents suggest Google calculates several types of metrics that can be called using Chrome views related to both individual pages and entire domains.

This document, describing the features around how Google creates Sitelinks, is particularly interesting.

It showcases a call named topUrl, which is “A list of top urls with highest two_level_score, i.e., chrome_trans_clicks.” My read is that Google likely uses the number of clicks on pages in Chrome browsers to determine the most popular/important URLs on a site, which then feed the calculation of which URLs to include in the sitelinks feature.

E.G. In the above screenshot from Google’s results, pages like “Pricing,” the “Blog,” and the “Login” pages are our most-visited, and Google knows this through their tracking of billions of Chrome users’ clickstreams.
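
As a rough sketch of that reading, picking sitelink candidates could be as simple as sorting a site's URLs by click count and keeping the top few. The `top_urls` helper, the limit of three, and the click numbers below are all invented for illustration; only the names topUrl and chrome_trans_clicks come from the leaked documentation, and the real calculation is unknown.

```python
def top_urls(click_counts, limit=3):
    """Return the `limit` URLs with the most clicks, highest first."""
    ranked = sorted(click_counts.items(), key=lambda item: item[1], reverse=True)
    return [url for url, _ in ranked[:limit]]


# Made-up click counts for a hypothetical site.
chrome_trans_clicks = {
    "/pricing": 5200,
    "/blog": 4100,
    "/login": 3900,
    "/careers": 300,
}
sitelink_candidates = top_urls(chrome_trans_clicks)
```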

#3: Whitelists in Travel, Covid, and Politics

A module on “Good Quality Travel Sites” would lead reasonable readers to conclude that a whitelist exists for Google in the travel sector (unclear if this is exclusively for Google’s “Travel” search tab, or web search more broadly).

References in several places to flags for “isCovidLocalAuthority” and “isElectionAuthority” further suggest that Google is whitelisting particular domains that are appropriate to show for highly controversial or potentially problematic queries.

For example, following the 2020 US Presidential election, one candidate claimed (without evidence) that the election had been stolen, and encouraged their followers to storm the Capitol and take potentially violent action against lawmakers, i.e. commit an insurrection.

Google would almost certainly be one of the first places people turned to for information about this event, and if their search engine returned propaganda websites that inaccurately portrayed the election evidence, that could directly lead to more contention, violence, or even the end of US democracy.

Those of us who want free and fair elections to continue should be very grateful Google’s engineers are employing whitelists in this case.

#4: Employing Quality Rater Feedback

Google has long had a quality rating platform called EWOK (Cyrus Shepard, a notable leader in the SEO space, spent several years contributing to this and wrote about it here). We now have evidence that some elements from the quality raters are used in the search systems.

How influential these rater-based signals are, and what precisely they’re used for is unclear to me in an initial read, but I suspect some thoughtful SEO detectives will dig into the leak, learn, and publish more about it.

What I find fascinating is that scores and data generated by EWOK’s quality raters may be directly involved in Google’s search system, rather than simply a training set for experiments. Of course, it’s possible these are “just for testing,” but as you browse through the leaked documents, you’ll find that when that’s true, it’s specifically called out in the notes and module details.

This one calls out a “per document relevance rating” sourced from evaluations done via EWOK. There’s no detailed notation, but it’s not much of a logic-leap to imagine how important those human evaluations of websites really are.

This one calls out “Human Ratings (e.g. ratings from EWOK)” and notes that they’re “typically only populated in the evaluation pipelines,” which suggests they may be primarily training data in this module (I’d argue that’s still a hugely important role, and marketers shouldn’t dismiss how important it is that quality raters perceive and rate their websites well).

#5: Google Uses Click Data to Determine How to Weight Links in Rankings

This one’s fascinating, and comes directly from the anonymous source who first shared the leak. In their words: “Google has three buckets/tiers for classifying their link indexes (low, medium, high quality).
這個很有趣,並且直接來自最初分享洩漏資訊的匿名來源。用他們的話說:“Google 有三個分類等級來分類他們的連結索引(低、中、高品質)。

Click data is used to determine which link graph index tier a document belongs to. See SourceType here, and TotalClicks here.” In summary:
點選資料用於確定文件屬於哪個連結圖索引層級。請參閱此處的 SourceType 和此處的 TotalClicks。” 總結:

  • If Forbes.com/Cats/ has no clicks it goes into the low-quality index and the link is ignored
    如果 Forbes.com/Cats/ 沒有點擊,它會進入低質量索引,並且該鏈接會被忽略。
  • If Forbes.com/Dogs/ has a high volume of clicks from verifiable devices (all the Chrome-related data discussed previously), it goes into the high-quality index and the link passes ranking signals
    如果 Forbes.com/Dogs/來自可驗證設備(所有之前討論過的與 Chrome 相關的數據)的點擊量很高,它會進入高質量索引,並且該鏈接會傳遞排名信號。

Once the link becomes “trusted” because it belongs to a higher tier index, it can flow PageRank and anchors, or be filtered/demoted by link spam systems. Links from the low-quality link index won’t hurt a site’s ranking; they are merely ignored.
一旦連結因屬於較高層級索引而變得“受信任”,它可以傳遞 PageRank 和錨點,或者被連結垃圾郵件系統過濾/降級。來自低品質連結索引的連結不會影響網站排名;它們只是被忽略。
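
To make the source's description concrete, here's a minimal Python sketch of click-based link tiering. Only the three tier names and the "low tier is ignored, not penalized" rule come from the source's quote. The `LinkSource` type, the function names, and both click thresholds are hypothetical; the leak doesn't disclose how tiers are actually computed, and how the medium tier is treated isn't stated, so the sketch leaves it unmodeled.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass
class LinkSource:
    url: str
    total_clicks: int  # hypothetical stand-in for a click count on the linking page


# Hypothetical thresholds, chosen only for illustration.
MEDIUM_MIN_CLICKS = 1
HIGH_MIN_CLICKS = 1000


def classify_tier(page: LinkSource) -> Tier:
    """Assign the linking page to a link-index tier based on its clicks."""
    if page.total_clicks >= HIGH_MIN_CLICKS:
        return Tier.HIGH
    if page.total_clicks >= MEDIUM_MIN_CLICKS:
        return Tier.MEDIUM
    return Tier.LOW


def link_is_ignored(page: LinkSource) -> bool:
    """Per the source, links from the low-quality index are merely ignored."""
    return classify_tier(page) is Tier.LOW


cats = LinkSource("forbes.com/cats/", total_clicks=0)
dogs = LinkSource("forbes.com/dogs/", total_clicks=50_000)

for page in (cats, dogs):
    print(page.url, classify_tier(page).value, "ignored:", link_is_ignored(page))
```

The point of the sketch is the asymmetry: a link from a page nobody clicks does nothing, while a link from a heavily clicked page becomes eligible to pass signals (or to be caught by spam systems).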


Big Picture Takeaways for Marketers who Care About Organic Search Traffic
關注自然搜尋流量的行銷人員的大圖景要點

If you care strategically about the value of organic search traffic, but don’t have much use for the technical details of how Google works, this section’s for you.
如果你戰略性地關注自然搜尋流量的價值,但對於 Google 的運作技術細節不太感興趣,那麼這部分內容適合你。

It’s my attempt to sum up much of Google’s evolution from the period this leak covers: 2005 – 2023, and I won’t limit myself exclusively to confirmed elements of the leak.
這是我嘗試總結從這次洩漏涵蓋的時期:2005-2023 年,Google的許多演變,我不會僅限於洩漏的確認元素。

  1. Brand matters more than anything else
    品牌比任何其他事物都重要

    Google has numerous ways to identify entities, sort, rank, filter, and employ them. Entities include brands (brand names, their official websites, associated social accounts, etc.), and as we’ve seen in our clickstream research with Datos, they’ve been on an inexorable path toward exclusively ranking and sending traffic to big, powerful brands that dominate the web, rather than small, independent sites and businesses.
    Google 有許多方法來識別實體、排序、排名、過濾和使用它們。實體包括品牌(品牌名稱、其官方網站、相關的社交帳號等),正如我們在與 Datos 的點選流研究中所見,他們一直在不可避免地走向只排名和引導流量到主導網路的大型、強大品牌,而不是小型、獨立的網站和企業。


    If there was one universal piece of advice I had for marketers seeking to broadly improve their organic search rankings and traffic, it would be: “Build a notable, popular, well-recognized brand in your space, outside of Google search.”
    如果我對尋求廣泛提升其自然搜尋排名和流量的行銷人員有一條普遍的建議,那就是:「在你的領域內建立一個顯著的、受歡迎的、知名的品牌,超越Google搜尋。」

  2. Experience, expertise, authoritativeness, and trustworthiness (“E-E-A-T”) might not matter as directly as some SEOs think.
    經驗、專業知識、權威性和可信度(“E-E-A-T”)可能不像一些 SEO 專家所想的那樣直接重要。

    The only mention of topical expertise in the leak we’ve found so far is a brief notation about Google Maps review contributions. The other aspects of E-E-A-T are either buried, indirect, labeled in hard-to-identify ways, or, more likely (in my opinion), correlated with things Google uses and cares about, but not specific elements of the ranking systems.
    到目前為止,我們在洩漏中發現的唯一提及專業知識的地方是關於 Google 地圖評論貢獻的簡短註釋。E-E-A-T 的其他方面要麼被埋藏、間接標記、難以識別,要麼(在我看來更有可能)與 Google 使用和關心的事物相關,但不是排名系統的具體元素。


    As Mike noted in his article, there is documentation in the leak suggesting Google can identify authors and treats them as entities in the system. Building up one’s influence as an author online may indeed lead to ranking benefits in Google.
    正如麥克在他的文章中指出的那樣,洩漏的檔案顯示Google可以識別作者並將其視為系統中的實體。作為線上作者建立影響力確實可能帶來Google排名的好處。

    But what exactly in the ranking systems makes up “E-E-A-T” and how powerful those elements are is an open question. I’m a bit worried that E-E-A-T is 80% propaganda, 20% substance.
    但究竟在排名系統中構成“E-E-A-T”的是什麼,以及這些元素有多強大,這是一個懸而未決的問題。我有點擔心 E-E-A-T 是 80%的宣傳,20%的實質內容。

    There are plenty of powerful brands that rank remarkably well in Google and have very little experience, expertise, authoritativeness, or trustworthiness, as HouseFresh’s recent, viral article details in depth.
    有許多強大的品牌在Google上的排名非常高,但幾乎沒有經驗、專業知識、權威性或可信度,正如 HouseFresh 最近的病毒式文章中詳細描述的那樣。

  3. Content and links are secondary when user intention around navigation (and the patterns that intent creates) is present.
    當使用者在導航時的意圖(以及該意圖所創造的模式)存在時,內容和連結是次要的。

    Let’s say, for example, that many people in the Seattle area search for “Lehman Brothers” and scroll to page 2, 3, or 4 of the search results until they find the theatre listing for the Lehman Brothers stage production, then click that result.
    假設,例如,許多在西雅圖地區的人搜尋「雷曼兄弟」,並滾動到搜尋結果的第 2、3 或 4 頁,直到找到雷曼兄弟舞台劇的劇院列表,然後點選該結果。

    Fairly quickly, Google will learn that’s what searchers for those words in that area want.
    相當快地,Google會學到那個地區搜尋那些詞語的使用者想要什麼。


    Even if the Wikipedia article about Lehman Brothers’ role in the financial crisis of 2008 were to invest heavily in link building and content optimization, it’s unlikely they could outrank the user-intent signals (calculated from queries and clicks) of Seattle’s theatre-goers.
    即使關於雷曼兄弟在 2008 年金融危機中角色的維基百科文章大力投資於連結建設和內容優化,他們也不太可能超越西雅圖戲劇觀眾的使用者意圖訊號(從查詢和點選計算)。


    Extending this example to the broader web and search as a whole, if you can create demand for your website among enough likely searchers in the regions you’re targeting, you may be able to end-around the need for classic on-and-off-page SEO signals like links, anchor text, optimized content, and the like.
    將此示範擴展到更廣泛的網路和搜尋整體,如果您能在目標地區中為您的網站創造足夠的潛在搜尋者需求,您可能可以繞過對傳統內外頁 SEO 訊號(如連結、錨文字、優化內容等)的需求

    Navboost and the intent of users are likely the most powerful ranking factors in Google’s systems. As Google VP Alexander Grushetsky put it in a 2019 email to other Google execs (including Danny Sullivan and Pandu Nayak):
    Navboost 的力量和使用者的意圖可能是 Google 系統中最強大的排名因素。正如 Google 副總裁 Alexander Grushetsky 在 2019 年給其他 Google 高管(包括 Danny Sullivan 和 Pandu Nayak)的電子郵件中所說:


    We already know, one signal could be more powerful than the whole big system on a given metric.
    我們已經知道,一個訊號在某個特定指標上可能比整個大系統更強大。

    For example, I’m pretty sure that NavBoost alone was / is more positive on clicks (and likely even on precision / utility metrics) by itself than the rest of ranking (BTW, engineers outside of Navboost team used to be also not happy about the power of Navboost, and the fact it was “stealing wins”)
    例如,我相當確定單單 NavBoost 就比其他排序方法在點選率上(甚至可能在精確度/效用指標上)更為正面(順便說一下,NavBoost 團隊以外的工程師過去也對 NavBoost 的強大感到不滿,並且認為它在“竊取勝利”)

    Those seeking even more confirmation could review Google engineer Paul Haahr’s detailed resume, which states:
    那些尋求更多確認的人可以查看 Google 工程師 Paul Haahr 的詳細履歷,其中指出:


    “I’m the manager for logs-based ranking projects. The team’s efforts are currently split among four areas: 1) Navboost. This is already one of Google’s strongest ranking signals. Current work is on automation in building new navboost data;”
    我是基於日誌排名項目的經理。團隊的工作目前分為四個領域:1) Navboost。這已經是 Google 最強的排名訊號之一。目前的工作是自動化建構新的 navboost 資料;

  4. Classic ranking factors: PageRank, anchors (topical PageRank based on the anchor text of the link), and text-matching have been waning in importance for years. But Page Titles are still quite important.
    經典排名因素:PageRank、錨文字(基於連結錨文字的主題 PageRank)和文字匹配的重要性多年來一直在減弱。但頁面標題仍然相當重要。

    This is a finding from Mike’s excellent analysis that I’d be foolish not to call out here. PageRank still appears to have a place in search indexing and rankings, but it’s almost certainly evolved from the original 1998 paper.
    這是來自麥克出色分析的一個發現,我若不在此提及將是愚蠢的。PageRank 似乎在搜尋索引和排名中仍然有一席之地,但幾乎可以肯定它已經從 1998 年的原始論文中演變而來。

    The document leak suggests that multiple versions of PageRank (rawPagerank, a deprecated PageRank referencing “nearest seeds,” firstCoveragePageRank from when the document was first served, etc.) have been created and discarded over the years. And anchor text links, while present in the leak, don’t seem to be as crucial or omnipresent as I’d have expected from my earlier years in SEO.
    檔案洩漏暗示了多個版本的 PageRank(rawPagerank、一個參考“最近種子”的已棄用 PageRank、檔案首次提供時的 firstCoveragePageRank 等)在多年來被建立和丟棄。而錨文字連結雖然在洩漏中存在,但似乎並不像我在早期 SEO 時期所預期的那樣重要或無處不在。

  5. For most small and medium businesses and newer creators/publishers, SEO is likely to show poor returns until you’ve established credibility, navigational demand, and a strong reputation among a sizable audience.
    對於大多數中小型企業和新興創作者/出版商來說,在建立信譽、導航需求和在大量受眾中樹立強大聲譽之前,SEO 可能會顯示出較差的回報。

    SEO is a big-brand, popular-domain game.
    SEO 是一個大品牌,受歡迎的領域遊戲。

    As an entrepreneur, I’m not ignoring SEO, but I strongly expect that for the years ahead, until/unless SparkToro becomes a much larger, more popular, more searched-for and clicked-on brand in its industry, this website will continue to be outranked, even for its original content, by aggregators and publishers who’ve existed for 10+ years.
    作為一名企業家,我並沒有忽視 SEO,但我強烈預期在未來的幾年裡,除非 SparkToro 成為行業中更大、更受歡迎、更被搜尋和點選的品牌,否則這個網站即使是其原創內容,仍將繼續被存在超過 10 年的聚合器和出版商超越。


    This is almost certainly true for other creators, publishers, and SMBs. The content you create is unlikely to perform well in Google if competition from big, popular websites with well-known brands exists.
    這幾乎可以肯定地適用於其他創作者、出版商和中小企業。如果存在來自大型知名網站的競爭,你建立的內容在 Google 上表現不太可能良好。

    Google no longer rewards scrappy, clever, SEO-savvy operators who know all the right tricks. They reward established brands, search-measurable forms of popularity, and established domains that searchers already know and click.
    Google 不再獎勵那些精明、聰明、精通 SEO 的經營者,他們知道所有正確的技巧。它們獎勵的是知名品牌、可搜尋的受歡迎形式,以及搜尋者已經知道並點選的知名域名。

    From 1998 – 2018 (or so), one could reasonably start a powerful marketing flywheel with SEO for Google. In 2024, I don’t think that’s realistic, at least, not on the English-language web in competitive sectors.
    從 1998 年到 2018 年左右,人們可以合理地通過Google的 SEO 啟動一個強大的行銷飛輪。在 2024 年,我認為這不現實,至少在競爭激烈的英語網路領域是這樣。
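
For readers who think in code, the Lehman Brothers example from takeaway 3 can be sketched as a toy click-based re-ranker in Python. Everything here is an assumption made for illustration: the `record_click` and `rerank` functions, the placeholder URLs, and the simple "sort by local click count" rule. Navboost's real mechanics aren't public beyond what the testimony and the leak describe.

```python
from collections import Counter, defaultdict

# Hypothetical aggregate of click logs: (query, region) -> clicks per URL.
clicks = defaultdict(Counter)


def record_click(query, region, url):
    """Log one click on `url` for `query` from a searcher in `region`."""
    clicks[(query.lower(), region)][url] += 1


def rerank(query, region, baseline):
    """Reorder baseline results by local click count, highest first.

    Python's sort is stable, so results with equal click counts keep
    their baseline (content- and link-based) order.
    """
    counts = clicks[(query.lower(), region)]
    return sorted(baseline, key=lambda url: -counts[url])


# Placeholder URLs, not real pages.
WIKI = "encyclopedia.example/lehman-brothers"
FINANCE = "finance.example/lehman-collapse"
THEATRE = "theatre.example/seattle/lehman-show"

# Baseline order, as if produced by classic content and link signals.
baseline = [WIKI, FINANCE, THEATRE]

# Seattle searchers keep digging for the theatre listing and clicking it.
for _ in range(40):
    record_click("Lehman Brothers", "seattle", THEATRE)
for _ in range(3):
    record_click("Lehman Brothers", "seattle", WIKI)

print(rerank("lehman brothers", "seattle", baseline))
print(rerank("lehman brothers", "new-york", baseline))
```

In this toy version, the theatre listing moves to the top for Seattle searchers only; with no local click data, the New York ordering falls back to the baseline. That mirrors the argument above: demand expressed through queries and clicks can outweigh classic on-page and link signals.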

Next Steps for the Search Industry
搜尋行業的下一步

I’m excited to see how practitioners with more recent experience and deeper technical knowledge go about analyzing this leak.
我很期待看到擁有最新經驗和更深技術知識的從業者如何分析這次洩漏。

I encourage anyone curious to dig into the documentation, attempt to connect it to other public documents, statements, testimony, and ranking experiments, then publish their findings.
我鼓勵任何好奇的人深入研究檔案,嘗試將其與其他公開檔案、聲明、證詞和排名實驗聯絡起來,然後發表他們的發現。

Historically, some of the search industry’s loudest voices and most prolific publishers have been happy to uncritically repeat Google’s public statements. They write headlines like “Google says XYZ is true,” rather than “Google Claims XYZ; Evidence Suggests Otherwise.”
歷史上,搜尋行業中一些最響亮的聲音和最多產的出版商一直樂於不加批判地重複Google的公開聲明。他們寫的標題是“Google說 XYZ 是真的”,而不是“Google聲稱 XYZ;證據表明並非如此。”

The SEO industry doesn’t benefit from these kinds of headlines.
SEO 行業不會從這些標題中受益

Please, do better. If this leak and the DOJ trial can create just one change, I hope this is it.
請做得更好。如果這次洩漏和司法部的審判能帶來一個改變,我希望就是這個。

When those new to the field read Search Engine Roundtable, Search Engine Land, SE Journal, and the many agency blogs and websites that cover the SEO field’s news, they don’t necessarily know how seriously to take Google’s statements.
當那些新手閱讀 Search Engine Roundtable、Search Engine Land、SE Journal 以及許多涵蓋 SEO 領域新聞的代理部落格和網站時,他們不一定知道應該多認真看待 Google 的聲明。

Journalists and authors should not presume that readers are savvy enough to know that dozens or hundreds of past public comments by Google’s official representatives were later proven wrong.
記者和作者不應假設讀者足夠精明,知道Google官方代表過去的數十或數百條公開評論後來被證明是錯誤的。

This obligation isn’t just about helping the search industry—it’s about helping the whole world. Google is one of the most powerful, influential forces for the spread of information and commerce on this planet.
這項義務不僅僅是為了幫助搜尋行業——它是為了幫助整個世界。Google是這個星球上傳播資訊和商業的最強大、最有影響力的力量之一。

Only recently have they been held to some account by governments and reporters.
直到最近,他們才開始被政府和記者追究責任。

The work of journalists and writers in the search marketing field carries weight in the courts of public opinion, in the halls of elected officials, and in the hearts of Google employees, all of whom have the power to change things for the better or ignore them at our collective peril.
記者和作家在搜尋行銷領域的工作在公眾輿論的法庭上、在民選官員的廳堂裡以及在Google員工的心中都具有份量,他們都有能力改變現狀使其變得更好,或者忽視它們而使我們集體面臨危險。


Thank you to Mike King for his invaluable help on this document leak story, to Amanda Natividad for editing help, and to the anonymous source who shared this leak with me. I expect that updates to this piece may arrive over the next few days and weeks as it reaches more eyeballs.
感謝麥克·金在這篇檔案洩漏報導中的寶貴幫助,感謝阿曼達·納蒂維達的編輯幫助,還有那位與我分享這次洩漏的匿名消息來源。我預計隨著這篇文章被更多人看到,未來幾天和幾週內可能會有更新。

If you have findings that support or contradict statements I’ve made here, please feel free to share them in the comments below.
如果你有支援或反駁我在此所做陳述的發現,請隨時在下方留言分享。