rhasspy/piper

A fast, local neural text to speech system

A fast, local neural text to speech system that sounds great and is optimized for the Raspberry Pi 4. Piper is used in a variety of projects.

echo 'Welcome to the world of speech synthesis!' | \
  ./piper --model en_US-lessac-medium.onnx --output_file welcome.wav

Listen to voice samples and check out a video tutorial by Thorsten Müller

Voices are trained with VITS and exported to the onnxruntime.

This is a project of the Open Home Foundation.

Voices

Our goal is to support Home Assistant and the Year of Voice.

Download voices for the supported languages:

  • Arabic (ar_JO)
  • Catalan (ca_ES)
  • Czech (cs_CZ)
  • Danish (da_DK)
  • German (de_DE)
  • Greek (el_GR)
  • English (en_GB, en_US)
  • Spanish (es_ES, es_MX)
  • Finnish (fi_FI)
  • French (fr_FR)
  • Hungarian (hu_HU)
  • Icelandic (is_IS)
  • Italian (it_IT)
  • Georgian (ka_GE)
  • Kazakh (kk_KZ)
  • Luxembourgish (lb_LU)
  • Nepali (ne_NP)
  • Dutch (nl_BE, nl_NL)
  • Norwegian (no_NO)
  • Polish (pl_PL)
  • Portuguese (pt_BR, pt_PT)
  • Romanian (ro_RO)
  • Russian (ru_RU)
  • Serbian (sr_RS)
  • Swedish (sv_SE)
  • Swahili (sw_CD)
  • Turkish (tr_TR)
  • Ukrainian (uk_UA)
  • Vietnamese (vi_VN)
  • Chinese (zh_CN)

You will need two files per voice:

  1. A .onnx model file, such as en_US-lessac-medium.onnx
  2. A .onnx.json config file, such as en_US-lessac-medium.onnx.json
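
Besides identifying the model, the .onnx.json config carries the settings a player or wrapper needs, notably the audio sample rate. A minimal sketch of reading those fields with Python's standard json module (the audio.sample_rate and num_speakers keys follow published Piper voice configs, but the trimmed config text here is a stand-in, not a complete file):

```python
import json

# Trimmed stand-in for a voice config such as en_US-lessac-medium.onnx.json
# (real configs carry many more fields).
config_text = '{"audio": {"sample_rate": 22050}, "num_speakers": 1}'

config = json.loads(config_text)
sample_rate = config["audio"]["sample_rate"]   # needed to play raw output correctly
num_speakers = config.get("num_speakers", 1)   # > 1 for multi-speaker voices

print(sample_rate)
print(num_speakers)
```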

The MODEL_CARD file for each voice contains important licensing information. Piper is intended for text to speech research, and does not impose any additional restrictions on voice models. Some voices may have restrictive licenses, however, so please review them carefully!

Installation

You can run Piper with Python or download a binary release:

  • amd64 (64-bit desktop Linux)
  • arm64 (64-bit Raspberry Pi 4)
  • armv7 (32-bit Raspberry Pi 3/4)

If you want to build from source, see the Makefile and C++ source. You must download and extract piper-phonemize to lib/Linux-$(uname -m)/piper_phonemize before building. For example, lib/Linux-x86_64/piper_phonemize/lib/libpiper_phonemize.so should exist for AMD/Intel machines (as well as everything else from libpiper_phonemize-amd64.tar.gz).

Usage

  1. Download a voice and extract the .onnx and .onnx.json files
  2. Run the piper binary with text on standard input, --model /path/to/your-voice.onnx, and --output_file output.wav

For example:

echo 'Welcome to the world of speech synthesis!' | \
  ./piper --model en_US-lessac-medium.onnx --output_file welcome.wav

For multi-speaker models, use --speaker <number> to change speakers (default: 0).

See piper --help for more options.

Streaming Audio

Piper can stream raw audio to stdout as it's produced:

echo 'This sentence is spoken first. This sentence is synthesized while the first sentence is spoken.' | \
  ./piper --model en_US-lessac-medium.onnx --output-raw | \
  aplay -r 22050 -f S16_LE -t raw -

This is raw audio and not a WAV file, so make sure your audio player is set to play 16-bit mono PCM samples at the correct sample rate for the voice.
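
If you capture the raw stream to a file instead of playing it live, you can add the WAV header yourself. A minimal sketch with Python's standard wave module (one second of silence stands in for bytes captured from --output-raw; 22050 Hz, 16-bit mono matches common medium-quality voices, but always take the rate from the voice's .onnx.json):

```python
import wave

SAMPLE_RATE = 22050  # must match the voice's .onnx.json config
SAMPLE_WIDTH = 2     # 16-bit samples
CHANNELS = 1         # mono

# Stand-in for bytes captured from `piper --output-raw` (1 second of silence).
raw_pcm = b"\x00\x00" * SAMPLE_RATE

with wave.open("welcome.wav", "wb") as wav:
    wav.setnchannels(CHANNELS)
    wav.setsampwidth(SAMPLE_WIDTH)
    wav.setframerate(SAMPLE_RATE)
    wav.writeframes(raw_pcm)
```

The resulting welcome.wav then plays in any WAV-capable player without extra flags.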

JSON Input

The piper executable can accept JSON input when using the --json-input flag. Each line of input must be a JSON object with a text field. For example:

{ "text": "First sentence to speak." }
{ "text": "Second sentence to speak." }

Optional fields include:

  • speaker - string
    • Name of the speaker to use from speaker_id_map in config (multi-speaker voices only)
  • speaker_id - number
    • Id of speaker to use from 0 to number of speakers - 1 (multi-speaker voices only, overrides "speaker")
  • output_file - string
    • Path to output WAV file

The following example writes two sentences with different speakers to different files:

{ "text": "First speaker.", "speaker_id": 0, "output_file": "/tmp/speaker_0.wav" }
{ "text": "Second speaker.", "speaker_id": 1, "output_file": "/tmp/speaker_1.wav" }
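
Input lines like these can also be generated programmatically. A minimal sketch with Python's standard json module (the sentences and output paths mirror the hypothetical example above; the printed lines would be piped into piper --json-input):

```python
import json

# One synthesis request per line, as --json-input expects.
requests = [
    {"text": "First speaker.", "speaker_id": 0, "output_file": "/tmp/speaker_0.wav"},
    {"text": "Second speaker.", "speaker_id": 1, "output_file": "/tmp/speaker_1.wav"},
]

# json.dumps keeps each object on a single line, so joining with
# newlines yields valid line-delimited JSON input.
lines = "\n".join(json.dumps(req) for req in requests)
print(lines)
```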

People using Piper

Piper has been used in the following projects/papers:

Training

See the training guide and the source code.

Pretrained checkpoints are available on Hugging Face.

Running in Python

See src/python_run

Install with pip:

pip install piper-tts

and then run:

echo 'Welcome to the world of speech synthesis!' | piper \
  --model en_US-lessac-medium \
  --output_file welcome.wav

This will automatically download voice files the first time they're used. Use --data-dir and --download-dir to adjust where voices are found/downloaded.

If you'd like to use a GPU, install the onnxruntime-gpu package:

.venv/bin/pip3 install onnxruntime-gpu

and then run piper with the --cuda argument. You will need to have a functioning CUDA environment, such as what's available in NVIDIA's PyTorch containers.