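PaddleSpeech Installation, Deployment, and Quick-Start Guide
This article is more systematic than the official guide. Original link: https://blog.csdn.net/gitblog_00090/article/details/151531945
1. ASR Test with the Downloaded Audio File
1.1 Testing from the Command Line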
paddlespeech asr --lang zh --input zh.wav --rtf
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# paddlespeech asr --lang zh --input zh.wav --rtf
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
warnings.warn(
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Looking in indexes: https://mirrors.cloud.aliyuncs.com/pypi/simple
ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
ERROR: No matching distribution found for paddlespeech_ctcdecoders
我认为跑步最重要的就是给我带来了身体健康
After installing by following the guide, some errors can be safely ignored, for example:
ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
ERROR: No matching distribution found for paddlespeech_ctcdecoders
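This happens because PaddleSpeech tries to install paddlespeech_ctcdecoders (an acceleration package for CTC decoding) automatically at runtime, and that installation fails.
Why does it occur?
paddlespeech_ctcdecoders is an optional dependency that speeds up decoding (especially in streaming ASR).
It requires compiling C++ code, so it places demands on the system environment (gcc, cmake, libsndfile1-dev, and so on).
In some environments (such as an Alibaba Cloud DSW instance), the build dependencies are missing, or no prebuilt wheel supports the current platform (for example, there is no ready-made wheel for the Python 3.11 + Linux combination), so the automatic installation fails.
⚠️ This does not affect basic ASR functionality! PaddleSpeech automatically falls back to a pure-Python decoder (slightly slower, but usable).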
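To see for yourself whether the optional decoder package is present, a small check along these lines may help. It is only a sketch: is_installed is a hypothetical helper, and it assumes the importable module name is the same as the pip package name, paddlespeech_ctcdecoders.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the module is importable under the same name as the pip package.
if is_installed("paddlespeech_ctcdecoders"):
    print("paddlespeech_ctcdecoders is available")
else:
    print("paddlespeech_ctcdecoders is missing; ASR still works without it")
```

Because find_spec only locates the module, the check does not trigger PaddleSpeech's automatic pip install.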
1.2 Calling ASR from Python Code
Paddle_ASR.py
from paddlespeech.cli.asr.infer import ASRExecutor
# Initialize the ASR executor
asr_executor = ASRExecutor()
# Chinese speech recognition
result = asr_executor(
    audio_file='zh.wav',
    model='conformer_wenetspeech',
    lang='zh',
    sample_rate=16000,
    device='cpu'
)
print(f"识别结果: {result}")
# English speech recognition
# (transformer_librispeech is the English model used in the official PaddleSpeech examples)
result_en = asr_executor(
    audio_file='en.wav',
    model='transformer_librispeech',
    lang='en',
    sample_rate=16000
)
print(f"Recognition result: {result_en}")
Run the Python code:
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# python ../Paddle_ASR.py
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
warnings.warn(
WARNING: pip is being invoked by an old script wrapper. This will fail in a future version of pip.
Please see https://github.com/pypa/pip/issues/5599 for advice on fixing the underlying issue.
To avoid this problem you can invoke Python with '-m pip' instead of running pip directly.
Looking in indexes: https://mirrors.cloud.aliyuncs.com/pypi/simple
ERROR: Could not find a version that satisfies the requirement paddlespeech_ctcdecoders (from versions: none)
ERROR: No matching distribution found for paddlespeech_ctcdecoders
2025-09-24 10:19:29.358 | INFO | paddlespeech.s2t.modules.ctc:<module>:45 - paddlespeech_ctcdecoders not installed!
2025-09-24 10:19:29.531 | INFO | paddlespeech.s2t.modules.embedding:__init__:153 - max len: 5000
识别结果: 我认为跑步最重要的就是给我带来了身体健康
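The calls above pass sample_rate=16000, so it can be worth confirming that an input file really is a 16 kHz WAV before handing it to the executor. The following is a minimal sketch using only the standard-library wave module; wav_info and matches_sample_rate are hypothetical helper names, and the check only works for uncompressed PCM WAV files.

```python
import wave


def wav_info(path: str) -> dict:
    """Read basic format facts from the header of a PCM WAV file."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "channels": wf.getnchannels(),
            "sample_rate": rate,
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": frames / rate if rate else 0.0,
        }


def matches_sample_rate(path: str, expected: int = 16000) -> bool:
    """Return True if the file's sample rate equals the expected value."""
    return wav_info(path)["sample_rate"] == expected
```

For example, matches_sample_rate('zh.wav') should return True for the sample file if it was recorded at 16 kHz; if it returns False, resample the audio first.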
2. TTS Test
2.1 Python Code (Paddle_TTS.py):
import os
import nltk
# Set the NLTK data directory (avoids permission problems)
nltk_data_dir = "/root/nltk_data"
os.makedirs(nltk_data_dir, exist_ok=True)
nltk.data.path.insert(0, nltk_data_dir)
# Download the required data automatically if it is missing
try:
    nltk.data.find('taggers/averaged_perceptron_tagger')
except LookupError:
    nltk.download('averaged_perceptron_tagger', download_dir=nltk_data_dir)
try:
    nltk.data.find('corpora/cmudict')
except LookupError:
    nltk.download('cmudict', download_dir=nltk_data_dir)
# Import TTSExecutor only after the NLTK setup: this import pulls in g2p_en,
# which looks up the NLTK data at import time
from paddlespeech.cli.tts.infer import TTSExecutor
# Initialize the TTS executor
tts_executor = TTSExecutor()
# Chinese speech synthesis
tts_executor(
    text="欢迎使用PaddleSpeech语音合成技术",
    output='output_zh.wav',
    am='fastspeech2_csmsc',
    voc='hifigan_csmsc',
    lang='zh',
    spk_id=0
)
# English speech synthesis
tts_executor(
    text="Welcome to use PaddleSpeech text to speech",
    output='output_en.wav',
    am='fastspeech2_ljspeech',
    voc='hifigan_ljspeech',
    lang='en'
)
# Mixed Chinese-English speech synthesis
tts_executor(
    text="Hello 世界,这是中英文混合合成",
    output='output_mix.wav',
    am='fastspeech2_mix',
    voc='hifigan_csmsc',
    lang='mix',
    spk_id=174
)
2.2 Running the Python Code
python ../Paddle_TTS.py
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# python ../Paddle_TTS.py
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
warnings.warn(
/mnt/workspace/myvenv/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
[2025-09-24 11:41:32,873] [ INFO] - tokenizer config file saved in /root/.paddlenlp/models/bert-base-chinese/tokenizer_config.json
[2025-09-24 11:41:32,873] [ INFO] - Special tokens file saved in /root/.paddlenlp/models/bert-base-chinese/special_tokens_map.json
Building prefix dict from the default dictionary ...
[2025-09-24 11:41:37,535] [ DEBUG] __init__.py:113 - Building prefix dict from the default dictionary ...
Loading model from cache /tmp/jieba.cache
[2025-09-24 11:41:37,535] [ DEBUG] __init__.py:132 - Loading model from cache /tmp/jieba.cache
Loading model cost 0.635 seconds.
[2025-09-24 11:41:38,170] [ DEBUG] __init__.py:164 - Loading model cost 0.635 seconds.
Prefix dict has been built successfully.
[2025-09-24 11:41:38,170] [ DEBUG] __init__.py:166 - Prefix dict has been built successfully.
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# ls -l /mnt/workspace/myvenv/output_*.wav
-rw-r--r-- 1 root root 78044 9月 24 11:41 /mnt/workspace/myvenv/output_en.wav
-rw-r--r-- 1 root root 132044 9月 24 11:41 /mnt/workspace/myvenv/output_mix.wav
-rw-r--r-- 1 root root 130244 9月 24 11:41 /mnt/workspace/myvenv/output_zh.wav
2.3 Audio Files Generated Successfully
Three audio files were generated.
2.3.1 Error Handling
The run may fail with the following error:
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv# python ../Paddle_TTS.py
/mnt/workspace/myvenv/lib/python3.11/site-packages/_distutils_hack/__init__.py:53: UserWarning: Reliance on distutils from stdlib is deprecated. Users must rely on setuptools to provide the distutils module. Avoid importing distutils or import setuptools first, and avoid setting SETUPTOOLS_USE_DISTUTILS=stdlib. Register concerns at https://github.com/pypa/setuptools/issues/new?template=distutils-deprecation.yml
warnings.warn(
/mnt/workspace/myvenv/lib/python3.11/site-packages/jieba/_compat.py:18: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
import pkg_resources
Traceback (most recent call last):
File "/mnt/workspace/myvenv/../Paddle_TTS.py", line 1, in <module>
from paddlespeech.cli.tts.infer import TTSExecutor
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/cli/tts/__init__.py", line 14, in <module>
from .infer import TTSExecutor
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/cli/tts/infer.py", line 33, in <module>
from paddlespeech.t2s.exps.syn_utils import get_am_inference
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/t2s/exps/syn_utils.py", line 38, in <module>
from paddlespeech.t2s.frontend.en_frontend import English
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/t2s/frontend/en_frontend.py", line 14, in <module>
from .phonectic import English
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/paddlespeech/t2s/frontend/phonectic.py", line 20, in <module>
from g2p_en import G2p
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/g2p_en/__init__.py", line 1, in <module>
from .g2p import G2p
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/g2p_en/g2p.py", line 22, in <module>
nltk.data.find('taggers/averaged_perceptron_tagger.zip')
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/nltk/data.py", line 538, in find
return ZipFilePathPointer(p, zipentry)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/nltk/data.py", line 391, in __init__
zipfile = OpenOnDemandZipFile(os.path.abspath(zipfile))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/mnt/workspace/myvenv/lib/python3.11/site-packages/nltk/data.py", line 1020, in __init__
zipfile.ZipFile.__init__(self, filename)
File "/usr/local/lib/python3.11/zipfile.py", line 1313, in __init__
self._RealGetContents()
File "/usr/local/lib/python3.11/zipfile.py", line 1380, in _RealGetContents
raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
(myvenv) root@dsw-1361954-68c7649f94-c7dlr:/mnt/workspace/myvenv#
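Resolving the error:
- During the download, the message "[nltk_data] Error with downloaded zip file" appeared, which means the ZIP file is still corrupted or incomplete.
- Unpacking cmudict.zip then triggered another BadZipFile error like the earlier one (the log is truncated, but this can be inferred).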
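2.3.2 Root Cause
When NLTK downloads data in a restricted network environment (Alibaba Cloud DSW, some Docker containers, a corporate proxy), the request may be redirected to an HTML error page (such as 403/404). NLTK still saves the response as a .zip file, so the later unzip step fails.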
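One way to confirm this diagnosis is to look at what was actually saved under the .zip name. The sketch below uses the standard-library zipfile module; inspect_download is a hypothetical helper, and the HTML detection is a simple heuristic on the first bytes of the file, not a full parser.

```python
import zipfile


def inspect_download(path: str) -> str:
    """Classify a downloaded file as 'zip', 'html', or 'unknown'."""
    if zipfile.is_zipfile(path):
        return "zip"
    with open(path, "rb") as fh:
        # Heuristic: an error page usually starts with an HTML doctype or tag.
        head = fh.read(512).lstrip().lower()
    if head.startswith((b"<!doctype html", b"<html")):
        return "html"
    return "unknown"
```

If the result for a file such as /root/nltk_data/taggers/averaged_perceptron_tagger.zip is 'html' or 'unknown', delete the file and fetch it again by another route.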
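2.3.3 Definitive Fix: Manual Download + Offline Installation of the NLTK Data
Step 1: Manually download the required packages
On a machine with normal network access (or on the current machine with wget/curl), download the following two files: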
# 1. averaged_perceptron_tagger
wget https://github.com/nltk/nltk_data/raw/gh-pages/packages/taggers/averaged_perceptron_tagger.zip
# 2. cmudict
wget https://github.com/nltk/nltk_data/raw/gh-pages/packages/corpora/cmudict.zip
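If github.com is slow to reach, try a mirror or a proxy, or a China-based accelerator (such as https://npmmirror.com/mirrors/nltk/, but confirm the path first).
Step 2: Create the NLTK data directory structure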
mkdir -p /root/nltk_data/taggers
mkdir -p /root/nltk_data/corpora
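Step 3: Move the ZIP files into the matching directories and unzip them
# Run both blocks from the directory where the ZIP files were downloaded.
# Move and unzip the tagger (-d extracts into the target directory, so no cd is needed)
mv averaged_perceptron_tagger.zip /root/nltk_data/taggers/
unzip /root/nltk_data/taggers/averaged_perceptron_tagger.zip -d /root/nltk_data/taggers
# Move and unzip cmudict
mv cmudict.zip /root/nltk_data/corpora/
unzip /root/nltk_data/corpora/cmudict.zip -d /root/nltk_data/corpora
✅ Make sure the directory structure after unzipping looks like this: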
/root/nltk_data/taggers/averaged_perceptron_tagger/...
/root/nltk_data/corpora/cmudict/...
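Steps 2 and 3 can also be done from Python, which is convenient where unzip is not installed. This is a sketch under stated assumptions: install_nltk_zip is a hypothetical helper, it copies the archive instead of moving it, and it assumes each archive contains a top-level folder named after the package, as in the layout shown above.

```python
import shutil
import zipfile
from pathlib import Path


def install_nltk_zip(zip_path: str, category: str,
                     data_dir: str = "/root/nltk_data") -> Path:
    """Copy a downloaded NLTK package into data_dir/category and unpack it there."""
    src = Path(zip_path)
    if not zipfile.is_zipfile(src):
        raise ValueError(f"{src} is not a valid ZIP archive; download it again")
    target_dir = Path(data_dir) / category
    target_dir.mkdir(parents=True, exist_ok=True)
    dest = target_dir / src.name
    if src.resolve() != dest.resolve():
        # Keep the .zip next to the extracted folder, as the shell steps do.
        shutil.copy2(src, dest)
    with zipfile.ZipFile(dest) as zf:
        zf.extractall(target_dir)
    # Assumed layout: the archive unpacks into a folder named after the package.
    return target_dir / src.stem
```

Typical usage would be install_nltk_zip('averaged_perceptron_tagger.zip', 'taggers') followed by install_nltk_zip('cmudict.zip', 'corpora'). The up-front validity check means a saved HTML error page is rejected before anything is copied.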
Step 4: Set the environment variable (recommended)
export NLTK_DATA=/root/nltk_data
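Or add the following at the top of your Paddle_TTS.py (the approach used here):
import os
import nltk
# Set the NLTK data directory (avoids permission problems)
nltk_data_dir = "/root/nltk_data"
os.makedirs(nltk_data_dir, exist_ok=True)
nltk.data.path.insert(0, nltk_data_dir)
Step 5: Run the script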
python ../Paddle_TTS.py
Verifying that the NLTK data is available
Run the following command to test:
python -c "
import nltk
nltk.data.find('taggers/averaged_perceptron_tagger')
nltk.data.find('corpora/cmudict')
print('✅ NLTK data OK')
"
If no error is reported, the configuration succeeded.
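As a complementary check that does not import nltk at all, the expected files can be looked up directly on disk. This is a sketch: missing_nltk_files is a hypothetical helper, and the list of paths is partly an assumption. The traceback earlier shows g2p_en looking up taggers/averaged_perceptron_tagger.zip, so the .zip files are included; that cmudict is looked up in the same way is assumed, not confirmed by the log.

```python
from pathlib import Path

# Paths expected under the NLTK data directory after the offline installation.
# The cmudict.zip entry is an assumption (see the lead-in).
EXPECTED = (
    "taggers/averaged_perceptron_tagger.zip",
    "taggers/averaged_perceptron_tagger",
    "corpora/cmudict.zip",
    "corpora/cmudict",
)


def missing_nltk_files(data_dir: str = "/root/nltk_data") -> list:
    """Return the expected relative paths that do not exist under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED if not (root / rel).exists()]
```

An empty list from missing_nltk_files() means every expected file and folder is in place; any entry it returns points to the step that needs repeating.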