PaddleSpeech Installation, Deployment, and Quick Start Guide

This article is more systematic than the official guide. Original link: https://blog.csdn.net/gitblog_00090/article/details/151531945

1. ASR: Testing the Downloaded Audio File

1.1 Testing from the command line

paddlespeech asr --lang zh --input zh.wav --rtf

(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# paddlespeech asr --lang zh --input zh.wav --rtf
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Looking in indexes: https://mirrors.cloud.aliyuncs.com/pypi/simple
ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
ERROR: No matching distribution found for paddlespeech_ctcdecoders
我认为跑步最重要的就是给我带来了身体健康

After installing by following the guide, some of the errors in the output can be ignored, for example:

ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
ERROR: No matching distribution found for paddlespeech_ctcdecoders

This happens because PaddleSpeech tries to install paddlespeech_ctcdecoders (an acceleration package for CTC decoding) automatically at runtime, and that installation fails.

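Why does this happen?
paddlespeech_ctcdecoders is an optional dependency that speeds up decoding (especially in streaming ASR).
It has to compile C++ code, so it places requirements on the system environment (gcc, cmake, libsndfile1-dev, and so on).
The installation fails if your environment (for example, an Alibaba Cloud DSW instance) lacks these build dependencies, or if no prebuilt wheel supports the current platform (for example, there is no ready-made wheel for the Python 3.11 + Linux combination).
⚠️ This does not affect basic ASR, though! PaddleSpeech automatically falls back to a pure Python decoder (slightly slower, but it works).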
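
To see whether the optional package is present before running ASR, a quick check like the one below may help. It is only a sketch: it assumes the import name is the same as the pip package name (paddlespeech_ctcdecoders), and the helper name has_module is made up for this example.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumption: the import name matches the pip package name.
    name = "paddlespeech_ctcdecoders"
    status = "installed" if has_module(name) else "not installed (optional)"
    print(f"{name}: {status}")
```

If it reports "not installed", the two pip errors above are expected and can be ignored.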

1.2 Calling from Python code

Paddle_ASR.py

from paddlespeech.cli.asr.infer import ASRExecutor

# Initialize the ASR executor
asr_executor = ASRExecutor()

# Chinese speech recognition
result = asr_executor(
    audio_file='zh.wav',
    model='conformer_wenetspeech',
    lang='zh',
    sample_rate=16000,
    device='cpu'
)
print(f"识别结果: {result}")

# English speech recognition
# Note: the PaddleSpeech README uses transformer_librispeech as the English model name
result_en = asr_executor(
    audio_file='en.wav',
    model='transformer_librispeech',
    lang='en',
    sample_rate=16000
)
print(f"Recognition result: {result_en}")

Run the Python code:

(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# python ../Paddle_ASR.py
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Looking in indexes: https://mirrors.cloud.aliyuncs.com/pypi/simple
ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
ERROR: No matching distribution found for paddlespeech_ctcdecoders
2025-09-24 10:19:29.358 | INFO     | paddlespeech.s2t.modules.ctc:<module>:45 - paddlespeech_ctcdecoders not installed!
2025-09-24 10:19:29.531 | INFO     | paddlespeech.s2t.modules.embedding:__init__:153 - max len: 5000
识别结果: 我认为跑步最重要的就是给我带来了身体健康
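
The script passes sample_rate=16000, so it can be worth confirming that the input file really is a 16 kHz WAV before recognition. The following is a minimal sketch using only the standard library wave module. The function name check_wav is hypothetical, and it assumes an uncompressed PCM WAV file.

```python
import wave


def check_wav(path: str, expected_rate: int = 16000) -> dict:
    """Read the WAV header and report whether the sample rate matches."""
    with wave.open(path, "rb") as wav:
        info = {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "sample_width_bytes": wav.getsampwidth(),
        }
    # True only when the file's sample rate equals the expected one.
    info["rate_ok"] = info["sample_rate"] == expected_rate
    return info
```

For example, check_wav('zh.wav') returns a dictionary whose rate_ok field is True when the file is sampled at 16000 Hz.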

2. TTS Test

2.1 Python code:

import os
import nltk

# Set the NLTK data directory (avoids permission problems).
# This has to run before paddlespeech is imported, because g2p_en
# looks up the NLTK data at import time.
nltk_data_dir = "/root/nltk_data"
os.makedirs(nltk_data_dir, exist_ok=True)
nltk.data.path.insert(0, nltk_data_dir)

# Download the required data automatically (if missing)
try:
    nltk.data.find('taggers/averaged_perceptron_tagger')
except LookupError:
    nltk.download('averaged_perceptron_tagger', download_dir=nltk_data_dir)

try:
    nltk.data.find('corpora/cmudict')
except LookupError:
    nltk.download('cmudict', download_dir=nltk_data_dir)

from paddlespeech.cli.tts.infer import TTSExecutor

# Initialize the TTS executor
tts_executor = TTSExecutor()

# Chinese speech synthesis
tts_executor(
    text="欢迎使用PaddleSpeech语音合成技术",
    output='output_zh.wav',
    am='fastspeech2_csmsc',
    voc='hifigan_csmsc',
    lang='zh',
    spk_id=0
)

# English speech synthesis
tts_executor(
    text="Welcome to use PaddleSpeech text to speech",
    output='output_en.wav',
    am='fastspeech2_ljspeech',
    voc='hifigan_ljspeech',
    lang='en'
)

# Mixed Chinese-English speech synthesis
tts_executor(
    text="Hello 世界，这是中英文混合合成",
    output='output_mix.wav',
    am='fastspeech2_mix',
    voc='hifigan_csmsc',
    lang='mix',
    spk_id=174
)

2.2 Run the Python code

python ../Paddle_TTS.py

(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# python ../Paddle_TTS.py
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
/mnt/workspace/myvenv/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  import pkg_resources
[2025-09-24 11:41:32,873] [    INFO] - tokenizer config file saved in /root/.paddlenlp/models/bert-base-chinese/tokenizer_config.json
[2025-09-24 11:41:32,873] [    INFO] - Special tokens file saved in /root/.paddlenlp/models/bert-base-chinese/special_tokens_map.json
Building prefix dict from the default dictionary ...
[2025-09-24 11:41:37,535] [   DEBUG] __init__.py:113 - Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
[2025-09-24 11:41:37,535] [   DEBUG] __init__.py:132 - Loading model from cache /tmp/jieba.cache
Loading model cost 0.635 seconds.
[2025-09-24 11:41:38,170] [   DEBUG] __init__.py:164 - Loading model cost 0.635 seconds.
Prefix dict has been built successfully.
[2025-09-24 11:41:38,170] [   DEBUG] __init__.py:166 - Prefix dict has been built successfully.
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# ls -l /mnt/workspace/myvenv/output_*.wav
-rw-r--r-- 1 root root  78044  9月 24 11:41 /mnt/workspace/myvenv/output_en.wav
-rw-r--r-- 1 root root 132044  9月 24 11:41 /mnt/workspace/myvenv/output_mix.wav
-rw-r--r-- 1 root root 130244  9月 24 11:41 /mnt/workspace/myvenv/output_zh.wav

2.3 Audio files generated successfully

Three audio files were generated.
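
Besides listing the files with ls, one way to sanity-check the results is to print the duration of each generated file. This is a rough sketch that assumes the outputs are standard PCM WAV files readable by the standard library wave module. The helper name wav_duration_seconds is made up for this example.

```python
import glob
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


if __name__ == "__main__":
    # Run from the directory that contains output_zh.wav, output_en.wav, ...
    for path in sorted(glob.glob("output_*.wav")):
        print(f"{path}: {wav_duration_seconds(path):.2f} s")
```

A duration of zero, or an error from wave.open, would suggest that synthesis did not actually produce usable audio.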

2.3.1 Error handling

This step may fail with the following error:

(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# python ../Paddle_TTS.py
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
  warnings.warn(
/mnt/workspace/myvenv/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
  import pkg_resources
Traceback (most recent call last):
  File "/mnt/workspace/myvenv/../Paddle_TTS.py", line 1, in <module>
    from paddlespeech.cli.tts.infer import TTSExecutor
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/cli/tts/__init__.py", line 14, in <module>
    from .infer import TTSExecutor
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/cli/tts/infer.py", line 33, in <module>
    from paddlespeech.t2s.exps.syn_utils import get_am_inference
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/t2s/exps/syn_utils.py", line 38, in <module>
    from paddlespeech.t2s.frontend.en_frontend import English
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/t2s/frontend/en_frontend.py", line 14, in <module>
    from .phonectic import English
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/t2s/frontend/phonectic.py", line 20, in <module>
    from g2p_en import G2p
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/g2p_en/__init__.py", line 1, in <module>
    from .g2p import G2p
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/g2p_en/g2p.py", line 22, in <module>
    nltk.data.find('taggers/averaged_perceptron_tagger.zip')
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/nltk/data.py", line 538, in find
    return ZipFilePathPointer(p, zipentry)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/nltk/data.py", line 391, in __init__
    zipfile = OpenOnDemandZipFile(os.path.abspath(zipfile))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/mnt/workspace/myvenv/lib/python3.11/site-packages/nltk/data.py", line 1020, in __init__
    zipfile.ZipFile.__init__(self, filename)
  File "/usr/local/lib/python3.11/zipfile.py", line 1313, in __init__
    self._RealGetContents()
  File "/usr/local/lib/python3.11/zipfile.py", line 1380, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv#

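Fixing the error:

The traceback shows that the cached averaged_perceptron_tagger.zip is not a valid ZIP file. Downloading it again through NLTK did not help: the download printed [nltk_data] Error with downloaded zip file, which means the ZIP file was still corrupted or incomplete.
Unpacking cmudict.zip eventually triggered a BadZipFile error similar to the one above (the log is truncated, but this can be inferred).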

2.3.2 Root cause

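When NLTK downloads data in a restricted network environment (such as Alibaba Cloud DSW, some Docker containers, or behind a corporate proxy), the request may be redirected to an HTML error page (e.g. 403/404). NLTK still saves that page as a .zip file, so the later unzip step fails.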
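
To confirm this diagnosis on your own machine, you can inspect the downloaded file directly. The sketch below uses only the standard library. The function name describe_download and the HTML detection heuristic (looking at the first bytes of the file) are assumptions made for illustration, not part of NLTK.

```python
import zipfile


def describe_download(path: str) -> str:
    """Classify a downloaded file as 'zip', 'html', or 'unknown'."""
    if zipfile.is_zipfile(path):
        return "zip"
    # Not a ZIP: peek at the start of the file to see if it looks like HTML.
    with open(path, "rb") as f:
        head = f.read(512).lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"
```

For example, describe_download('/root/nltk_data/taggers/averaged_perceptron_tagger.zip') returning 'html' would match the root cause described above. In that case, delete the file and replace it with a good copy.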

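2.3.3 Final solution: manual download + offline installation of the NLTK data

Step 1: Manually download the required data packages

On a machine with normal network access (or on the current machine, using wget/curl), download the following two files: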

# 1. averaged_perceptron_tagger
wget https://github.com/nltk/nltk_data/raw/gh-pages/packages/taggers/averaged_perceptron_tagger.zip

# 2. cmudict
wget https://github.com/nltk/nltk_data/raw/gh-pages/packages/corpora/cmudict.zip

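If github.com is slow to reach, try a mirror or a proxy, or a download accelerator hosted in China (such as https://npmmirror.com/mirrors/nltk/, though the exact path needs to be confirmed).

Step 2: Create the NLTK data directory structure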

mkdir -p /root/nltk_data/taggers
mkdir -p /root/nltk_data/corpora

Step 3: Put the ZIP files into the matching directories and unzip them

# Move and unzip the tagger
mv averaged_perceptron_tagger.zip /root/nltk_data/taggers/
cd /root/nltk_data/taggers
unzip averaged_perceptron_tagger.zip

# Move and unzip cmudict
mv cmudict.zip /root/nltk_data/corpora/
cd /root/nltk_data/corpora
unzip cmudict.zip

✅ Make sure the directory structure after unzipping is:

/root/nltk_data/taggers/averaged_perceptron_tagger/...

/root/nltk_data/corpora/cmudict/...
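
Steps 2 and 3 can also be done from Python, which is handy where unzip is not installed. This is a minimal sketch using only the standard library. The function name install_nltk_zip is hypothetical, and it assumes each archive contains a top-level folder named after the package. Like the shell commands above, it leaves the .zip file next to the extracted folder, because the traceback shows g2p_en looking up taggers/averaged_perceptron_tagger.zip.

```python
import shutil
import zipfile
from pathlib import Path


def install_nltk_zip(zip_path: str, nltk_data_dir: str, category: str) -> Path:
    """Copy a data ZIP into <nltk_data_dir>/<category>/ and extract it there."""
    src = Path(zip_path)
    # Refuse corrupted downloads (e.g. an HTML error page saved as .zip).
    if not zipfile.is_zipfile(src):
        raise ValueError(f"{src} is not a valid ZIP file")
    target_dir = Path(nltk_data_dir) / category
    target_dir.mkdir(parents=True, exist_ok=True)
    dest = target_dir / src.name
    if src.resolve() != dest.resolve():
        shutil.copy2(src, dest)
    with zipfile.ZipFile(dest) as zf:
        zf.extractall(target_dir)
    # Assumed layout: the archive holds a folder named after the package.
    return target_dir / src.stem
```

For example, install_nltk_zip('cmudict.zip', '/root/nltk_data', 'corpora') should leave both /root/nltk_data/corpora/cmudict.zip and /root/nltk_data/corpora/cmudict/ in place.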

Step 4: Set the environment variable (recommended)

export NLTK_DATA=/root/nltk_data

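Or add the following at the top of your Paddle_TTS.py, before the paddlespeech import (this is the approach we used):

import os
import nltk
# Set the NLTK data directory (avoids permission problems)
nltk_data_dir = "/root/nltk_data"
os.makedirs(nltk_data_dir, exist_ok=True)
nltk.data.path.insert(0, nltk_data_dir)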

Step 5: Run the script

python ../Paddle_TTS.py

Verify that the NLTK data is available

Run the following command to test:

python -c "
import nltk
nltk.data.find('taggers/averaged_perceptron_tagger')
nltk.data.find('corpora/cmudict')
print('✅ NLTK data OK')
"

If no error is raised, the configuration works.