PDF scientific paper translation and bilingual comparison library.
- Online Service: Beta version launched. Immersive Translate - BabelDOC offers 1000 free pages per month.
- Self-deployment: PDFMathTranslate 1.9.3+ has experimental support for BabelDOC and is available for self-deployment, with a WebUI and more translation services.
- Provides a simple command line interface.
- Provides a Python API.
- Mainly designed to be embedded into other programs, but can also be used directly for simple translation tasks.
We recommend using the Tool feature of uv to install yadt.
- First, refer to the uv installation guide to install uv and set up the PATH environment variable as prompted.
- Use the following command to install yadt:
uv tool install --python 3.12 BabelDOC
babeldoc --help
- Use the babeldoc command. For example:
babeldoc --bing --files example.pdf
# multiple files
babeldoc --bing --files example1.pdf --files example2.pdf
We still recommend using uv to manage virtual environments.
- First, refer to the uv installation guide to install uv and set up the PATH environment variable as prompted.
- Use the following command to install yadt:
# clone the project
git clone https://github.com/funstory-ai/BabelDOC
# enter the project directory
cd BabelDOC
# install dependencies and run babeldoc
uv run babeldoc --help
- Use the uv run babeldoc command. For example:
uv run babeldoc --files example.pdf --openai --openai-model "gpt-4o-mini" --openai-base-url "https://api.openai.com/v1" --openai-api-key "your-api-key-here"
# multiple files
uv run babeldoc --files example.pdf --files example2.pdf --openai --openai-model "gpt-4o-mini" --openai-base-url "https://api.openai.com/v1" --openai-api-key "your-api-key-here"
Tip
The absolute path is recommended.
Note
This CLI is mainly for debugging purposes. Although end users can use this CLI to translate files, we do not provide any technical support for this purpose.
End users should use the Online Service directly: Immersive Translate - BabelDOC (Beta), with 1000 free pages per month.
End users who need self-deployment should use PDFMathTranslate.
If an option is not listed below, it is a debugging option for maintainers. Please do not use these options.
--lang-in, -li: Source language code (default: en)
--lang-out, -lo: Target language code (default: zh)
Tip
Currently, this project mainly focuses on English-to-Chinese translation; other scenarios have not been tested yet.
(2025.3.1 update): Basic English target language support has been added, primarily to minimize line breaks within words ([0-9A-Za-z]+).
HELP WANTED: Collecting word regular expressions for more languages
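As a rough illustration of why a per-language word pattern matters, the sketch below uses the [0-9A-Za-z]+ expression quoted above to split text into unbreakable word tokens and freely breakable characters, then wraps lines only between tokens. The function names and the greedy wrapping strategy are assumptions made for this example; they are not BabelDOC's typesetting code.

```python
import re

# Word pattern quoted in the note above; other languages would need their own.
WORD_RE = re.compile(r"[0-9A-Za-z]+")


def tokenize(text: str) -> list[str]:
    """Split text into word tokens (kept whole) and single other characters."""
    tokens = []
    pos = 0
    for match in WORD_RE.finditer(text):
        # Characters between words (spaces, punctuation, CJK) may break freely.
        tokens.extend(text[pos:match.start()])
        tokens.append(match.group())
        pos = match.end()
    tokens.extend(text[pos:])
    return tokens


def wrap(text: str, width: int) -> list[str]:
    """Greedy wrap that never places a line break inside a word token."""
    lines, current = [], ""
    for token in tokenize(text):
        if current and len(current) + len(token) > width:
            lines.append(current)
            current = ""
        current += token
    if current:
        lines.append(current)
    return lines
```

A word longer than the width is placed on its own line rather than split, which is the behavior the pattern is meant to protect.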
--files: One or more file paths to input PDF documents.
--pages, -p: Specify pages to translate (e.g., "1,2,1-,-3,3-5"). If not set, all pages are translated.
--split-short-lines: Force split short lines into different paragraphs (may cause poor typesetting and bugs).
--short-line-split-factor: Split threshold factor (default: 0.8). The actual threshold is the median length of all lines on the current page multiplied by this factor.
--skip-clean: Skip the PDF cleaning step.
--dual-translate-first: Put translated pages first in dual PDF mode (default: original pages first).
--disable-rich-text-translate: Disable rich text translation (may help improve compatibility with some PDFs).
--enhance-compatibility: Enable all compatibility enhancement options (equivalent to --skip-clean --dual-translate-first --disable-rich-text-translate).
--use-alternating-pages-dual: Use alternating pages mode for dual PDF. When enabled, original and translated pages are arranged in alternate order. When disabled (default), original and translated pages are shown side by side on the same page.
--watermark-output-mode: Control watermark output mode: 'watermarked' (default) adds a watermark to the translated PDF, 'no_watermark' adds no watermark, 'both' outputs both versions.
--max-pages-per-part: Maximum number of pages per part for split translation. If not set, no splitting is performed.
--no-watermark: [DEPRECATED] Use --watermark-output-mode=no_watermark instead.
--translate-table-text: Translate table text (experimental, default: False).
--skip-scanned-detection: Skip scanned document detection (default: False). When using split translation, only the first part performs detection if not skipped.
--ocr-workaround: Use OCR workaround (default: False). When enabled, the tool uses OCR to detect text and fill the background for scanned PDFs.
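The --pages syntax can be read as a comma-separated list of single pages and ranges. The sketch below shows one plausible interpretation: it assumes pages are 1-based, that "A-" runs to the last page, and that "-B" starts at the first page. The function parse_pages is hypothetical and is not BabelDOC's own parser.

```python
def parse_pages(spec: str, total_pages: int) -> list[int]:
    """Expand a page spec such as "1,2,1-,-3,3-5" into sorted 1-based page numbers.

    Assumed semantics: "N" is a single page, "A-B" an inclusive range,
    "A-" runs to the last page, and "-B" starts at the first page.
    """
    selected = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text) if start_text else 1
            end = int(end_text) if end_text else total_pages
        else:
            start = end = int(part)
        # Clamp to the document so open or oversized ranges stay valid.
        start, end = max(start, 1), min(end, total_pages)
        selected.update(range(start, end + 1))
    return sorted(selected)
```

Overlapping entries are merged, so a spec that mentions the same page twice still selects it once.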
Tip
- Both --skip-clean and --dual-translate-first may help improve compatibility with some PDF readers.
- --disable-rich-text-translate can also help with compatibility by simplifying translation input.
- However, using --skip-clean will result in larger file sizes.
- If you encounter any compatibility issues, try using --enhance-compatibility first.
- Use --max-pages-per-part for large documents to split them into smaller parts for translation and automatically merge them back.
- Use --skip-scanned-detection to speed up processing when you know your document is not a scanned PDF.
- Use --ocr-workaround to fill the background for scanned PDFs. (Current assumption: the background is pure white and the text is pure black. This option also automatically enables --skip-scanned-detection.)
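To make the effect of --max-pages-per-part concrete, the following sketch computes the page ranges a document would be divided into. How BabelDOC actually splits and merges parts is not described here, so split_into_parts is a hypothetical helper that only illustrates the arithmetic.

```python
def split_into_parts(total_pages: int, max_pages_per_part: int) -> list[tuple[int, int]]:
    """Return inclusive 1-based (first_page, last_page) pairs covering the document."""
    if max_pages_per_part <= 0:
        raise ValueError("max_pages_per_part must be positive")
    parts = []
    first = 1
    while first <= total_pages:
        # The final part may be shorter than the limit.
        last = min(first + max_pages_per_part - 1, total_pages)
        parts.append((first, last))
        first = last + 1
    return parts
```

With the value 50 used in the example configuration, a 120-page document would yield three parts, the last one covering pages 101 to 120.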
--qps: QPS (Queries Per Second) limit for the translation service (default: 4)
--ignore-cache: Ignore the translation cache and force retranslation
--no-dual: Do not output bilingual PDF files
--no-mono: Do not output monolingual PDF files
--min-text-length: Minimum text length to translate (default: 5)
--openai: Use OpenAI for translation (default: False)
--custom-system-prompt: Custom system prompt for translation.
--add-formula-placehold-hint: Add a formula placeholder hint for translation. (Currently not recommended because it may affect translation quality; default: False)
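A QPS limit such as --qps 4 means requests to the translation service are spaced out so that no more than four start per second. The sketch below shows one generic way to pace calls with a fixed interval. The QpsLimiter class is an assumption for illustration; BabelDOC's internal rate limiting is not documented here and may work differently.

```python
import time


class QpsLimiter:
    """Space calls so that at most `qps` start per second (fixed-interval pacing)."""

    def __init__(self, qps: float, clock=time.monotonic, sleep=time.sleep):
        if qps <= 0:
            raise ValueError("qps must be positive")
        self.interval = 1.0 / qps
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next call may start

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds waited."""
        now = self._clock()
        start = now if self._next_allowed is None else max(now, self._next_allowed)
        delay = start - now
        if delay > 0:
            self._sleep(delay)
        self._next_allowed = start + self.interval
        return delay
```

Calling limiter.wait() before each request keeps a burst of requests at the configured rate, while idle periods incur no delay. The clock and sleep parameters exist only so the pacing can be exercised without real waiting.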
Tip
- Currently, only OpenAI-compatible LLMs are supported. For more translator support, please use PDFMathTranslate.
- It is recommended to use models with strong compatibility with OpenAI, such as glm-4-flash, deepseek-chat, etc.
- Currently, the tool has not been optimized for traditional translation engines like Bing/Google; using LLMs is recommended.
- You can use litellm to access multiple models.
- --custom-system-prompt: Mainly used to add the /no_think instruction of Qwen 3 to the prompt. For example: --custom-system-prompt "/no_think You are a professional, authentic machine translation engine."
--openai-model: OpenAI model to use (default: gpt-4o-mini)
--openai-base-url: Base URL for the OpenAI API
--openai-api-key: API key for the OpenAI service
Tip
- This tool supports any OpenAI-compatible API endpoint. Just set the correct base URL (e.g. https://xxx.custom.xxx/v1) and API key.
- For local models like Ollama, you can use any value as the API key (e.g. --openai-api-key a).
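When calling the CLI from another program, it can help to assemble the argument list in code. The sketch below builds a babeldoc command line using only options listed above. The helper name build_babeldoc_command, the model name, the file path, and the local base URL are illustrative assumptions; substitute the values for your own endpoint.

```python
import shlex


def build_babeldoc_command(files, model, base_url, api_key, output_dir=None):
    """Assemble a babeldoc argument list using only options documented above."""
    argv = [
        "babeldoc", "--openai",
        "--openai-model", model,
        "--openai-base-url", base_url,
        "--openai-api-key", api_key,
    ]
    for path in files:
        # --files may be repeated, one occurrence per input PDF.
        argv += ["--files", str(path)]
    if output_dir is not None:
        argv += ["--output", str(output_dir)]
    return argv


# Hypothetical local endpoint and model name, shown for illustration only.
command = build_babeldoc_command(
    files=["/abs/path/example.pdf"],
    model="my-local-model",
    base_url="http://localhost:11434/v1",
    api_key="a",  # any value is accepted for local servers, per the tip above
)
print(shlex.join(command))
```

Keeping the arguments as a list avoids shell quoting problems when paths contain spaces.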
--output, -o: Output directory for translated files. If not set, the current working directory is used.
--debug, -d: Enable debug logging level and export detailed intermediate results in ~/.cache/yadt/working.
--report-interval: Progress report interval in seconds (default: 0.1).
--generate-offline-assets: Generate an offline assets package in the specified directory. This creates a zip file containing all required models and fonts.
--restore-offline-assets: Restore an offline assets package from the specified file. This extracts models and fonts from a previously generated package.
Tip
- Offline assets packages are useful for environments without internet access or for speeding up installation on multiple machines.
- Generate a package once with babeldoc --generate-offline-assets /path/to/output/dir and then distribute it.
- Restore the package on target machines with babeldoc --restore-offline-assets /path/to/offline_assets_*.zip.
- The offline assets package name cannot be modified because the file list hash is encoded in the name.
- If you provide a directory path to --restore-offline-assets, the tool will automatically look for the correct offline assets package file in that directory.
- The package contains all fonts and models required for document processing, ensuring consistent results across different environments.
- The integrity of all assets is verified using SHA3-256 hashes during both packaging and restoration.
- If you're deploying in an air-gapped environment, make sure to generate the package on a machine with internet access first.
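The integrity check mentioned above relies on SHA3-256, which Python exposes through hashlib. The sketch below shows what verifying a set of files against expected hashes can look like. The manifest layout (a mapping from relative path to hex digest) and both function names are assumptions for illustration, not the format BabelDOC uses internally.

```python
import hashlib
from pathlib import Path


def sha3_256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files need not fit in memory."""
    digest = hashlib.sha3_256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_assets(expected: dict[str, str], root: Path) -> list[str]:
    """Return the relative paths that are missing or whose hash differs."""
    failures = []
    for relative_path, expected_hash in expected.items():
        candidate = root / relative_path
        if not candidate.is_file() or sha3_256_of_file(candidate) != expected_hash:
            failures.append(relative_path)
    return failures
```

An empty result means every listed file is present and matches its expected digest.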
--config, -c: Configuration file path. Use the TOML format.
Example Configuration:
[babeldoc]
# Basic settings
debug = true
lang-in = "en-US"
lang-out = "zh-CN"
qps = 10
output = "/path/to/output/dir"
# PDF processing options
split-short-lines = false
short-line-split-factor = 0.8
skip-clean = false
dual-translate-first = false
disable-rich-text-translate = false
use-alternating-pages-dual = false
watermark-output-mode = "watermarked" # Choices: "watermarked", "no_watermark", "both"
max-pages-per-part = 50 # Automatically split the document for translation and merge it back.
# no-watermark = false # DEPRECATED: Use watermark-output-mode instead
skip-scanned-detection = false # Skip scanned document detection for faster processing
# Translation service
openai = true
openai-model = "gpt-4o-mini"
openai-base-url = "https://api.openai.com/v1"
openai-api-key = "your-api-key-here"
# Output control
no-dual = false
no-mono = false
min-text-length = 5
report-interval = 0.5
# Offline assets management
# Uncomment one of these options as needed:
# generate-offline-assets = "/path/to/output/dir"
# restore-offline-assets = "/path/to/offline_assets_package.zip"
Tip
- Before pdf2zh 2.0 is released, you can temporarily use BabelDOC's Python API. After pdf2zh 2.0 is released, please use pdf2zh's Python API directly.
- This project's Python API does not guarantee any compatibility. The Python API from pdf2zh, however, will guarantee a certain level of compatibility.
- We do not provide any technical support for the BabelDOC API.
- When performing secondary development, please refer to pdf2zh 2.0 high level and ensure that BabelDOC runs in a subprocess.
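One simple way to keep BabelDOC in a separate process is to invoke its CLI through Python's subprocess module. The sketch below is a generic pattern and does not reproduce how pdf2zh 2.0 does it. The helper run_isolated and the timeout value are assumptions, and a stand-in child command is used so the example runs without BabelDOC installed.

```python
import subprocess
import sys


def run_isolated(argv: list[str], timeout_seconds: float = 3600.0) -> subprocess.CompletedProcess:
    """Run a command in a child process and capture its output as text."""
    return subprocess.run(
        argv,
        capture_output=True,
        text=True,
        timeout=timeout_seconds,
        check=False,  # inspect returncode instead of raising on failure
    )


# Stand-in child process; in practice argv would be a babeldoc command line.
result = run_isolated([sys.executable, "-c", "print('translated')"])
```

Running the translation in a child process means a crash or memory growth during PDF processing does not take down the host application; the caller only inspects result.returncode and the captured output.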
You can refer to the example in main.py to use BabelDOC's Python API.
Please note:
- Make sure to call babeldoc.high_level.init() before using the API.
- The current TranslationConfig does not fully validate input parameters, so you need to ensure the validity of input parameters yourself.
- For offline assets management, you can use the following functions:
# Generate an offline assets package
from pathlib import Path
import babeldoc.assets.assets
# Generate package to a specific directory
# path is optional, default is ~/.cache/babeldoc/assets/offline_assets_{hash}.zip
babeldoc.assets.assets.generate_offline_assets_package(Path("/path/to/output/dir"))
# Restore from a package file
# path is optional, default is ~/.cache/babeldoc/assets/offline_assets_{hash}.zip
babeldoc.assets.assets.restore_offline_assets_package(Path("/path/to/offline_assets_package.zip"))
# You can also restore from a directory containing the offline assets package
# The tool will automatically find the correct package file based on the hash
babeldoc.assets.assets.restore_offline_assets_package(Path("/path/to/directory"))
Tip
- The offline assets package name cannot be modified because the file list hash is encoded in the name.
- For production environments, it is recommended to pre-generate the assets package and include it with your application distribution.
- The package verification ensures that all required assets are intact and match their expected checksums.
There are many projects and teams working to make document editing and translation easier.
There are also some solutions that address specific parts of the problem, such as:
- layoutreader: the reading order of the text blocks in a PDF
- Surya: the structure of the PDF
This project hopes to promote a standard pipeline and interface to solve the problem.
In fact, a PDF parser or translator has two main stages:
- Parsing: obtaining the structure of the PDF, such as text blocks, images, tables, etc.
- Rendering: rendering that structure into a new PDF or another format.
For a service like mathpix, the PDF is parsed into a structure, possibly in an XML format, and then rendered in a single-column reading order, as layoutreader does. The bad news is that the original structure is lost.
Some people use Adobe PDF Parser because it generates a Word document and keeps the original structure, but it is somewhat expensive. Also, a PDF or Word document is not a good format for reading on mobile devices.
We offer an intermediate representation of the parser results that can be rendered into a new PDF or another format. The pipeline is also a plugin-based system to which anyone can add a new model, OCR, renderer, etc.
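The parse, transform, render idea described above can be pictured with a toy intermediate representation. Everything in the sketch below (TextBlock, Document, the stage functions, and the assumption that smaller y values are nearer the top of the page) is hypothetical and far simpler than BabelDOC's real data model. It only shows how plugins can rewrite content while layout information is preserved for the renderer.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class TextBlock:
    page: int
    bbox: tuple[float, float, float, float]  # x0, y0, x1, y1
    text: str


@dataclass
class Document:
    """Hypothetical intermediate representation produced by a parser."""
    blocks: list[TextBlock] = field(default_factory=list)


Stage = Callable[[Document], Document]


def translate_stage(translate: Callable[[str], str]) -> Stage:
    """Build a plugin that rewrites block text but leaves layout untouched."""
    def stage(doc: Document) -> Document:
        return Document([TextBlock(b.page, b.bbox, translate(b.text)) for b in doc.blocks])
    return stage


def render_plain_text(doc: Document) -> str:
    """Toy renderer: one line per block, ordered by page and then top edge."""
    ordered = sorted(doc.blocks, key=lambda b: (b.page, b.bbox[1]))
    return "\n".join(b.text for b in ordered)


def run_pipeline(doc: Document, stages: list[Stage]) -> Document:
    """Apply each plugin stage in order to the intermediate representation."""
    for stage in stages:
        doc = stage(doc)
    return doc


# str.upper stands in for a real translation function.
parsed = Document([
    TextBlock(1, (72, 200, 300, 220), "second"),
    TextBlock(1, (72, 100, 300, 120), "first"),
])
output = render_plain_text(run_pipeline(parsed, [translate_stage(str.upper)]))
```

Because each stage takes and returns the same representation, a new model, OCR step, or renderer can be added without changing the others.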
- Add line support
- Add table support
- Add cross-page/cross-column paragraph support
- More advanced typesetting features
- Outline support
- ...
Our goal for the first 1.0 version is to finish a translation of PDF Reference, Version 1.7 into the following languages:
- Simplified Chinese
- Traditional Chinese
- Japanese
- Spanish
And meet the following requirements:
- layout error less than 1%
- content loss less than 1%
- Parsing errors in the author and reference sections; they get merged into one paragraph after translation.
- Lines are not supported.
- Drop caps are not supported.
- Large pages will be skipped.
We encourage you to contribute to YADT! Please check out the CONTRIBUTING guide.
Everyone interacting in YADT and its sub-projects' codebases, issue trackers, chat rooms, and mailing lists is expected to follow the YADT Code of Conduct.
Immersive Translation sponsors monthly Pro membership redemption codes for active contributors to this project. See details at: CONTRIBUTOR_REWARD.md