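Introduction

Xinference is an open-source inference framework for large models, developed by the Xorbits team. It focuses on efficient, flexible inference, supports many model types, including but not limited to large language models (LLMs) and multimodal models, and covers scenarios from local deployment to distributed environments.

Xinference has the following features:

Xinference supports multiple inference engines, such as Transformers, vLLM, llama.cpp, SGLang, and MLX.
Xinference supports a broad range of model types: large language models, embedding models, image models, audio models, reranker models, and video models. This is more comprehensive than what Ollama, vLLM, or SGLang supports on its own.
Distributed cluster inference is built in and needs no additional tools.

This article uses the deployment of a speech recognition model as an example to show how to use Xinference.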

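Installing and running with Docker

Docker deployment does not depend on the host operating system or environment, which makes installation flexible and simple. Docker is the recommended deployment method.

CPU-only mode: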

docker pull xprobe/xinference:latest-cpu

GPU mode:

docker pull xprobe/xinference:latest

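The GPU image can also run in CPU-only mode, but it takes up more disk space than the CPU image.

After the image has been pulled, start Xinference (single-node mode) with the following command: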

docker run -e XINFERENCE_MODEL_SRC=modelscope -p 9998:9997 xprobe/xinference:latest-cpu xinference-local -H 0.0.0.0 --log-level debug

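Docker can also run Xinference in cluster mode: replace the startup command that follows docker run with xinference-supervisor or xinference-worker. For the detailed commands, see "Running in cluster mode" in the bare-metal installation section below.

Note that Xinference stores downloaded models under ~/.xinference and ~/.cache by default. It is advisable to map these directories to host directories when starting the container, so that the models are easier to manage.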
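The directory mapping described above can be sketched as a small helper that assembles the docker run command with two volume mounts. The container-side paths /root/.xinference and /root/.cache assume that the container runs as root, and the host directory /data/xinference is a hypothetical example; adjust both to the actual environment.

```python
import shlex


def docker_run_command(host_dir, image="xprobe/xinference:latest-cpu", host_port=9998):
    """Build a `docker run` command that keeps Xinference data on the host."""
    args = [
        "docker", "run",
        "-e", "XINFERENCE_MODEL_SRC=modelscope",
        # Assumption: the container user is root, so ~ resolves to /root.
        "-v", f"{host_dir}/xinference:/root/.xinference",
        "-v", f"{host_dir}/cache:/root/.cache",
        "-p", f"{host_port}:9997",
        image,
        "xinference-local", "-H", "0.0.0.0", "--log-level", "debug",
    ]
    # shlex.join quotes each argument so the result can be pasted into a shell.
    return shlex.join(args)


# /data/xinference is a hypothetical host directory used for illustration.
print(docker_run_command("/data/xinference"))
```

The printed command is the single-node command from this section with the two -v options added, so models downloaded inside the container survive container removal.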

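Bare-metal installation and running

Environment requirements

Use Python 3.11.x for the deployment environment. Later Python versions cause problems.

Audio and video models require ffmpeg, which must be installed beforehand.

Installation steps

Follow the steps below to download or build the dependencies Xinference needs.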

uv init xorbits --python 3.11
cd xorbits

# Install Xinference with all engines
uv add "xinference[all]==v1.5.0.post2" -v -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

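Some packages may fail to build or may hit dependency-ordering problems along the way. They can be installed separately, as shown below: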

uv add "torch==2.6.0" -v --no-build-isolation -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
uv add setuptools -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
uv add "scikit_build_core" -v --no-build-isolation -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
uv add "llama-cpp-python" -v -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple --no-build-isolation

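Building llama-cpp-python requires cmake, gcc, gcc-c++, and python-devel. Install them first if the system does not have them.

After the dependency-ordering problems have been resolved by installing those packages first, run the full installation again: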

uv add "xinference[all]==v1.5.0.post2" -v --no-build-isolation -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple

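The dependency tree of the final installation (viewable with uv tree) is listed in the appendix of this article, for reference when installation or runtime problems occur.

Running in single-node mode

Activate the virtual environment: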

source .venv/bin/activate

Start the service:

XINFERENCE_MODEL_SRC=modelscope xinference-local --host=0.0.0.0 --port 9997

Where:

XINFERENCE_MODEL_SRC=modelscope downloads models from ModelScope.
--host sets the IP address to bind.
--port sets the service port.

Load a model:

XINFERENCE_MODEL_SRC=modelscope xinference launch --model-name Belle-whisper-large-v3-zh --model-type audio

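After this command runs, Xinference automatically downloads the model from ModelScope and starts it.

Where:

--model-name specifies the model name. It must be the model's real name, otherwise the model cannot be downloaded from ModelScope.
--model-type specifies the model type; audio denotes an audio model.
In addition, the XINFERENCE_HOME environment variable sets where models and logs are stored. The default is <HOME>/.xinference.

In an offline environment, a custom model (one that has already been downloaded) can be loaded as follows: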

xinference launch --model-path <model_file_path> --model-engine <engine> -n <model_name>

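Where:

--model-path is the local path where the model is stored.
--model-engine selects the engine used to run the model.
-n is the model name; it may differ from the model's actual name.

For example: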

xinference launch --model-path /root/.cache/modelscope/hub/models/Xorbits/Belle-whisper-large-v3-zh -n Belle-whisper-large-v3-zh --model-type audio

Note: if xinference-local was started with a custom port via --port, every subsequent xinference command must pass the endpoint with -e; otherwise the command cannot find the Xinference service.

For example:

xinference-local --host=0.0.0.0 --port 10097
xinference launch --model-path /home/paul/Documents/model/Belle-whisper-large-v3-zh -n Belle-whisper-large-v3-zh --model-type audio -e http://127.0.0.1:10097
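The same endpoint can also be reached from a script. The following is a minimal sketch that builds the endpoint URL and queries the OpenAI-compatible /v1/models route with the standard library. The helper names are hypothetical, and the exact shape of the returned JSON is not assumed here, so the function simply returns whatever the server sends.

```python
import json
import urllib.request


def endpoint_url(host, port=9997):
    """Build the endpoint string that the -e option expects."""
    return f"http://{host}:{port}"


def list_running_models(endpoint):
    """Fetch the running models from the /v1/models route of the service."""
    with urllib.request.urlopen(f"{endpoint}/v1/models", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call against the service started above (needs a running server):
#   list_running_models(endpoint_url("127.0.0.1", 10097))
print(endpoint_url("127.0.0.1", 10097))
```

If the port passed here does not match the --port value given to xinference-local, the request fails to connect, which is the scripted equivalent of the CLI not finding the service.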

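Running in cluster mode

Official documentation: Using Xinference

A Xinference cluster has two roles, supervisor and worker, which handle coordination and inference workloads respectively. Before starting the cluster, install Xinference on every cluster node by following the bare-metal installation steps above.

On the node chosen as the supervisor, start the supervisor with:

xinference-supervisor -H <supervisor_host> --port <supervisor_port>

On each node chosen as a worker, start the worker with:

xinference-worker -e "http://<supervisor_host>:<supervisor_port>" -H <worker_host>

Note that in cluster mode every xinference command must include -e http://<supervisor_host>:<supervisor_port>, as described in the previous section.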

Using the xinference command

Launching a model

Belle-whisper-large-v3-zh is used as the example.

The launch command is:

xinference launch --model-name Belle-whisper-large-v3-zh --model-type audio
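Once the model is running, audio can be sent to it over HTTP. The sketch below posts a file to the OpenAI-compatible /v1/audio/transcriptions route using only the standard library. The form field names model and file follow the OpenAI API convention; the helper names, the audio.wav file name, and the assumption that the model UID equals the model name are illustrative, not taken from the original article.

```python
import json
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            (f"--{boundary}\r\n"
             f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
             f"{value}\r\n").encode("utf-8")
        )
    parts.append(
        (f"--{boundary}\r\n"
         f'Content-Disposition: form-data; name="{file_field}"; '
         f'filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
        + file_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(endpoint, model_uid, audio_path):
    """POST an audio file to the transcription route and return the JSON reply."""
    with open(audio_path, "rb") as f:
        audio = f.read()
    body, content_type = build_multipart({"model": model_uid}, "file", "audio.wav", audio)
    request = urllib.request.Request(
        f"{endpoint}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (needs a running server and a real audio file):
#   transcribe("http://127.0.0.1:9997", "Belle-whisper-large-v3-zh", "speech.wav")
```

Decoding the upload on the server side relies on ffmpeg, so on a bare-metal deployment this call is where a missing ffmpeg installation usually surfaces.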

Listing running models

xinference list

Terminating a running model

xinference terminate --model-uid <model_uid>

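Monitoring

In local mode (xinference-local), the supervisor's metrics address is http://{host}:{port}/metrics, where host and port are the arguments passed to xinference-local.
The worker's metrics port is random. The xinference-local startup log contains output similar to the following.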

xinference.core.worker 230437 INFO     Metrics server is started at: http://0.0.0.0:35117

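This shows that the worker's metrics address is http://{host}:35117/metrics, with the same host value as above.
In a distributed deployment, replace host with the real address of the supervisor or worker.

To avoid a random worker port, use the --metrics-exporter-host and --metrics-exporter-port options to set the exporter's host address and port.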
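The metrics addresses above serve plain text in the Prometheus exposition format, so they can be inspected without a full Prometheus setup. The sketch below fetches a /metrics page and parses it into a dictionary. The metric name in the sample is hypothetical and only demonstrates the parser; the real metric names come from the running service.

```python
import urllib.request


def parse_metrics(text):
    """Parse Prometheus text exposition lines into {sample_name: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        close = line.rfind("}")
        if close != -1:
            # Keep the label set as part of the sample name.
            name, rest = line[: close + 1], line[close + 1:].split()
        else:
            name, *rest = line.split()
        if not rest:
            continue
        try:
            samples[name] = float(rest[0])  # an optional timestamp is ignored
        except ValueError:
            continue
    return samples


def fetch_metrics(base_url):
    """Download and parse the /metrics page of a supervisor or worker."""
    with urllib.request.urlopen(f"{base_url}/metrics", timeout=10) as resp:
        return parse_metrics(resp.read().decode("utf-8"))


# Hypothetical sample used only to demonstrate the parser.
sample = """# HELP demo_requests_total Hypothetical counter
# TYPE demo_requests_total counter
demo_requests_total{model="whisper"} 3
"""
print(parse_metrics(sample))
```

For the worker from the log line above, the call would be fetch_metrics("http://{host}:35117") with host replaced by the real address.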

Official documentation: Metrics (Xinference)

Troubleshooting

ffmpeg not found during speech recognition on a bare-metal deployment

The error message is:

ffmpeg was not found but is required to load audio files from filename

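The fix is to install ffmpeg on the system. On Fedora:

dnf install ffmpeg-free

On Ubuntu:

apt install ffmpeg

On other systems, search for ffmpeg to find the package name, then install it. If the system does not provide a package, consider building from source or deploying with Docker.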

Error when loading a model

After running xinference launch, Xinference reports the following error:

ERROR    [request b01de71e-2f97-11f0-841a-8b37737f7b27] Leave launch_builtin_model, error: MainActorPool.append_sub_pool() got an unexpected keyword argument 'start_method', elapsed time: 0 s
Traceback (most recent call last):
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/utils.py", line 93, in wrapped
    ret = await func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py", line 1018, in launch_builtin_model
    subpool_address, devices = await self._create_subpool(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py", line 618, in _create_subpool
    subpool_address = await self._main_pool.append_sub_pool(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: MainActorPool.append_sub_pool() got an unexpected keyword argument 'start_method'
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-49' coro=<SupervisorActor.launch_builtin_model.<locals>._launch_model() done, defined at /path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/supervisor.py:1109> exception=TypeError()>
Traceback (most recent call last):
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/supervisor.py", line 1134, in _launch_model
    subpool_address = await _launch_one_model(
                      ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/supervisor.py", line 1088, in _launch_one_model
    subpool_address = await worker_ref.launch_builtin_model(
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xoscar/backends/context.py", line 262, in send
    return self._process_result_message(result)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xoscar/backends/context.py", line 111, in _process_result_message
    raise message.as_instanceof_cause()
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xoscar/backends/pool.py", line 689, in send
    result = await self._run_coro(message.message_id, coro)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xoscar/backends/pool.py", line 389, in _run_coro
    return await coro
           ^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xoscar/api.py", line 418, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "xoscar/core.pyx", line 564, in __on_receive__
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
  File "xoscar/core.pyx", line 527, in xoscar.core._BaseActor.__on_receive__
  File "xoscar/core.pyx", line 532, in xoscar.core._BaseActor.__on_receive__
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/utils.py", line 93, in wrapped
    ret = await func(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py", line 1018, in launch_builtin_model
    subpool_address, devices = await self._create_subpool(
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py", line 618, in _create_subpool
    subpool_address = await self._main_pool.append_sub_pool(
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: [address=0.0.0.0:55916, pid=104356] MainActorPool.append_sub_pool() got an unexpected keyword argument 'start_method'
^C
Aborted!

Look at line 618 of /path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py, shown below:

        subpool_address = await self._main_pool.append_sub_pool(
            env=env, start_method=start_method
        )

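The error means that the keyword argument start_method=start_method is not accepted. The most likely cause is that the definition of append_sub_pool on the self._main_pool object has changed, that is, the version of the dependency in use has changed. Tracing further shows that self._main_pool is of type MainActorPoolType, which comes from xoscar, as the following line near the top of the file shows.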

from xoscar import MainActorPoolType
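This kind of mismatch can be confirmed without reading the library source by inspecting the method signature at runtime. The sketch below uses two hypothetical stand-in classes rather than the real xoscar pool; accepts_kwarg is an illustrative helper, not part of Xinference or xoscar.

```python
import inspect


class PoolWithoutStartMethod:
    """Hypothetical stand-in: append_sub_pool does not take start_method."""

    def append_sub_pool(self, env=None):
        return "subpool-address"


class PoolWithStartMethod:
    """Hypothetical stand-in: append_sub_pool takes start_method."""

    def append_sub_pool(self, env=None, start_method=None):
        return "subpool-address"


def accepts_kwarg(func, name):
    """Return True if func can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        return True
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


for pool in (PoolWithoutStartMethod(), PoolWithStartMethod()):
    ok = accepts_kwarg(pool.append_sub_pool, "start_method")
    print(type(pool).__name__, "accepts start_method:", ok)

# Calling the first variant with the extra keyword raises the same kind of
# TypeError as in the traceback above.
try:
    PoolWithoutStartMethod().append_sub_pool(env={}, start_method="spawn")
except TypeError as exc:
    print("TypeError:", exc)
```

Running the same accepts_kwarg check against the append_sub_pool method of the installed library tells whether the caller and the library agree on the signature.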

Next, run uv tree | grep xoscar to check the xoscar version:

(xorbits) root@paul-virtual-machine:/path/to/xorbits# uv tree | grep xoscar
Resolved 353 packages in 43ms
    ├── xoscar v0.7.1

The installed version is v0.7.1, whereas the appendix lists xoscar v0.6.2.
That suggests a fix: pin xoscar back to v0.6.2 by running uv add xoscar==v0.6.2 -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple:

(xorbits) root@paul-virtual-machine:/path/to/xorbits# uv add xoscar==v0.6.2 -i https://mirrors.tuna.tsinghua.edu.cn/pypi/web/simple
warning: Indexes specified via `--index-url` will not be persisted to the `pyproject.toml` file; use `--default-index` instead.
Resolved 353 packages in 6.29s
Prepared 1 package in 5.22s
Uninstalled 1 package in 75ms
Installed 1 package in 7ms
 - xoscar==0.7.1
 + xoscar==0.6.2

After xoscar v0.6.2 is installed, reload the model; the problem is resolved.
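To catch this kind of version drift before launching a model, the installed version can be compared with the version listed in the appendix. This is a minimal sketch using importlib.metadata from the standard library; the helper names are hypothetical, and it should be run inside the project's virtual environment.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def matches_pin(package, pinned):
    """Return True only if the package is installed at exactly `pinned`."""
    found = installed_version(package)
    return found is not None and found == pinned


print("xoscar installed:", installed_version("xoscar"))
print("matches 0.6.2:", matches_pin("xoscar", "0.6.2"))
```

The same check works for any other package in the dependency tree when a later install silently upgrades it.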

Timeout error when starting the Xinference service

The following error appears:

Traceback (most recent call last):
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py", line 1347, in report_status
    status = await asyncio.to_thread(gather_node_info)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py", line 1346, in report_status
    async with timeout(2):
  File "/usr/lib/python3.11/asyncio/timeouts.py", line 115, in __aexit__
    raise TimeoutError from exc_val
TimeoutError
2025-06-05 14:54:08,892 xinference.core.worker 1791636 ERROR    Report status got error.
Traceback (most recent call last):
  File "/path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py", line 1347, in report_status
    status = await asyncio.to_thread(gather_node_info)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
asyncio.exceptions.CancelledError

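The cause is that the call to gather_node_info times out. Analysis suggests that the pynvml library is probably too slow at collecting GPU information.

The fix is to edit /path/to/xorbits/.venv/lib/python3.11/site-packages/xinference/core/worker.py (assuming /path/to/xorbits/ is the Xinference installation path). Find line 1346 and change async with timeout(2) to async with timeout(30), as shown below: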

async def report_status(self):
    status = dict()
    try:
        # asyncio.timeout is only available in Python >= 3.11
        # the next line is line 1346
        async with timeout(30):
            status = await asyncio.to_thread(gather_node_info)
    except asyncio.CancelledError:
        raise
    except Exception:
        logger.exception("Report status got error.")
    supervisor_ref = await self.get_supervisor_ref()
    await supervisor_ref.report_worker_status(self.address, status)

After saving the file, restart the service; it then runs normally.
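The failure mode can be reproduced in isolation. The sketch below wraps a slow blocking function in a thread and applies a timeout, first too short and then generous. It uses asyncio.wait_for instead of the asyncio.timeout context manager from worker.py so that it also runs on Python versions before 3.11; slow_gather_node_info is a hypothetical stand-in for the real status probe.

```python
import asyncio
import time


def slow_gather_node_info(delay):
    """Hypothetical stand-in for a slow, blocking status probe."""
    time.sleep(delay)
    return {"gpu_count": 0}


async def report_status(timeout_s, delay):
    """Run the blocking probe in a thread and give up after timeout_s seconds."""
    try:
        return await asyncio.wait_for(
            asyncio.to_thread(slow_gather_node_info, delay), timeout=timeout_s
        )
    except asyncio.TimeoutError:
        # The probe took longer than the limit, as in the traceback above.
        return None


# Limit shorter than the probe: the call times out.
print(asyncio.run(report_status(timeout_s=0.05, delay=0.3)))
# Limit longer than the probe: the status is returned.
print(asyncio.run(report_status(timeout_s=1.0, delay=0.05)))
```

Raising the limit only hides the slowness; the probe still takes as long as it takes, so a very slow GPU query is worth investigating separately.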

CUDA out of memory when loading a model in a multi-GPU environment

Loading a model with xinference launch fails with the following error:

torch.OutOfMemoryError: [address=0.0.0.0:42491, pid=299597] CUDA out of memory. Tried to allocate 14.00 MiB. GPU 0 has a total capacity of 44.40 GiB of which 8.19 MiB is free. Process 13039 has 1.43 GiB memory in use. Process 15300 has 39.07 GiB memory in use. Process 473909 has 2.31 GiB memory in use. Including non-PyTorch memory, this process has 1.55 GiB memory in use. Of the allocated memory 1.16 GiB is allocated by PyTorch, and 115.39 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)

Investigation showed that GPU 0's memory was fully occupied while GPU 1 was still idle.

The fix is to specify the GPU when loading the model:

xinference launch --model-path /home/paul/Belle-whisper-large-v3-zh -n Belle-whisper-large-v3-zh --model-type audio --gpu-idx 1

The --gpu-idx option specifies which GPU the model is loaded onto. GPU IDs can be looked up with the nvidia-smi command.
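Choosing the index by hand can be replaced by a small script that picks the GPU with the most free memory. The sketch below parses the CSV output of nvidia-smi --query-gpu=index,memory.free --format=csv,noheader,nounits. The helper names are hypothetical, and the sample values are illustrative numbers, not measurements from the machine in this article.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.free",
    "--format=csv,noheader,nounits",
]


def parse_free_memory(csv_text):
    """Return {gpu_index: free_memory_in_MiB} from the nvidia-smi CSV output."""
    free = {}
    for line in csv_text.strip().splitlines():
        index, mem = (field.strip() for field in line.split(","))
        free[int(index)] = int(mem)
    return free


def pick_gpu(csv_text):
    """Return the index of the GPU with the most free memory."""
    free = parse_free_memory(csv_text)
    return max(free, key=free.get)


def query_gpus():
    """Run nvidia-smi and return its raw CSV output (needs an NVIDIA driver)."""
    return subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout


# Illustrative sample: GPU 0 is nearly full, GPU 1 is mostly free.
sample = "0, 8\n1, 45000\n"
print(pick_gpu(sample))
```

On a real machine, pick_gpu(query_gpus()) returns the value to pass to --gpu-idx.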

Appendix

Dependency tree and versions

xorbits v0.1.0
└── xinference[all] v1.5.0.post2
    ├── aioprometheus[starlette] v23.12.0
    │   ├── orjson v3.10.16
    │   ├── quantile-python v1.1
    │   └── starlette v0.46.2 (extra: starlette)
    │       └── anyio v4.9.0
    │           ├── idna v3.10
    │           ├── sniffio v1.3.1
    │           └── typing-extensions v4.13.2
    ├── async-timeout v5.0.1
    ├── click v8.1.8
    ├── fastapi v0.115.12
    │   ├── pydantic v2.11.3
    │   │   ├── annotated-types v0.7.0
    │   │   ├── pydantic-core v2.33.1
    │   │   │   └── typing-extensions v4.13.2
    │   │   ├── typing-extensions v4.13.2
    │   │   └── typing-inspection v0.4.0
    │   │       └── typing-extensions v4.13.2
    │   ├── starlette v0.46.2 (*)
    │   ├── typing-extensions v4.13.2
    │   ├── email-validator v2.2.0 (extra: standard)
    │   │   ├── dnspython v2.7.0
    │   │   └── idna v3.10
    │   ├── fastapi-cli[standard] v0.0.7 (extra: standard)
    │   │   ├── rich-toolkit v0.14.1
    │   │   │   ├── click v8.1.8
    │   │   │   ├── rich v14.0.0
    │   │   │   │   ├── markdown-it-py v3.0.0
    │   │   │   │   │   └── mdurl v0.1.2
    │   │   │   │   └── pygments v2.19.1
    │   │   │   └── typing-extensions v4.13.2
    │   │   ├── typer v0.15.2
    │   │   │   ├── click v8.1.8
    │   │   │   ├── rich v14.0.0 (*)
    │   │   │   ├── shellingham v1.5.4
    │   │   │   └── typing-extensions v4.13.2
    │   │   ├── uvicorn[standard] v0.34.2
    │   │   │   ├── click v8.1.8
    │   │   │   ├── h11 v0.14.0
    │   │   │   ├── httptools v0.6.4 (extra: standard)
    │   │   │   ├── python-dotenv v1.1.0 (extra: standard)
    │   │   │   ├── pyyaml v6.0.2 (extra: standard)
    │   │   │   ├── uvloop v0.21.0 (extra: standard)
    │   │   │   ├── watchfiles v1.0.5 (extra: standard)
    │   │   │   │   └── anyio v4.9.0 (*)
    │   │   │   └── websockets v15.0.1 (extra: standard)
    │   │   └── uvicorn[standard] v0.34.2 (extra: standard) (*)
    │   ├── httpx v0.28.1 (extra: standard)
    │   │   ├── anyio v4.9.0 (*)
    │   │   ├── certifi v2025.1.31
    │   │   ├── httpcore v1.0.8
    │   │   │   ├── certifi v2025.1.31
    │   │   │   └── h11 v0.14.0
    │   │   └── idna v3.10
    │   ├── jinja2 v3.1.6 (extra: standard)
    │   │   └── markupsafe v3.0.2
    │   ├── python-multipart v0.0.20 (extra: standard)
    │   └── uvicorn[standard] v0.34.2 (extra: standard) (*)
    ├── gradio v5.25.2
    │   ├── aiofiles v24.1.0
    │   ├── anyio v4.9.0 (*)
    │   ├── fastapi v0.115.12 (*)
    │   ├── ffmpy v0.5.0
    │   ├── gradio-client v1.8.0
    │   │   ├── fsspec v2024.12.0
    │   │   │   └── aiohttp v3.11.18 (extra: http)
    │   │   │       ├── aiohappyeyeballs v2.6.1
    │   │   │       ├── aiosignal v1.3.2
    │   │   │       │   └── frozenlist v1.6.0
    │   │   │       ├── attrs v25.3.0
    │   │   │       ├── frozenlist v1.6.0
    │   │   │       ├── multidict v6.4.3
    │   │   │       ├── propcache v0.3.1
    │   │   │       └── yarl v1.20.0
    │   │   │           ├── idna v3.10
    │   │   │           ├── multidict v6.4.3
    │   │   │           └── propcache v0.3.1
    │   │   ├── httpx v0.28.1 (*)
    │   │   ├── huggingface-hub v0.30.2
    │   │   │   ├── filelock v3.18.0
    │   │   │   ├── fsspec v2024.12.0 (*)
    │   │   │   ├── packaging v24.2
    │   │   │   ├── pyyaml v6.0.2
    │   │   │   ├── requests v2.32.3
    │   │   │   │   ├── certifi v2025.1.31
    │   │   │   │   ├── charset-normalizer v3.4.1
    │   │   │   │   ├── idna v3.10
    │   │   │   │   ├── urllib3 v2.0.7
    │   │   │   │   └── pysocks v1.7.1 (extra: socks)
    │   │   │   ├── tqdm v4.67.1
    │   │   │   ├── typing-extensions v4.13.2
    │   │   │   └── hf-xet v1.0.3 (extra: hf-xet)
    │   │   ├── packaging v24.2
    │   │   ├── typing-extensions v4.13.2
    │   │   └── websockets v15.0.1
    │   ├── groovy v0.1.2
    │   ├── httpx v0.28.1 (*)
    │   ├── huggingface-hub v0.30.2 (*)
    │   ├── jinja2 v3.1.6 (*)
    │   ├── markupsafe v3.0.2
    │   ├── numpy v1.26.4
    │   ├── orjson v3.10.16
    │   ├── packaging v24.2
    │   ├── pandas v2.2.3
    │   │   ├── numpy v1.26.4
    │   │   ├── python-dateutil v2.9.0.post0
    │   │   │   └── six v1.17.0
    │   │   ├── pytz v2025.2
    │   │   └── tzdata v2025.2
    │   ├── pillow v11.2.1
    │   ├── pydantic v2.11.3 (*)
    │   ├── pydub v0.25.1
    │   ├── python-multipart v0.0.20
    │   ├── pyyaml v6.0.2
    │   ├── ruff v0.11.6
    │   ├── safehttpx v0.1.6
    │   │   └── httpx v0.28.1 (*)
    │   ├── semantic-version v2.10.0
    │   ├── starlette v0.46.2 (*)
    │   ├── tomlkit v0.13.2
    │   ├── typer v0.15.2 (*)
    │   ├── typing-extensions v4.13.2
    │   └── uvicorn v0.34.2 (*)
    ├── huggingface-hub v0.30.2 (*)
    ├── modelscope v1.25.0
    │   ├── requests v2.32.3 (*)
    │   ├── tqdm v4.67.1
    │   └── urllib3 v2.0.7
    ├── nvidia-ml-py v12.570.86
    ├── openai v1.75.0
    │   ├── anyio v4.9.0 (*)
    │   ├── distro v1.9.0
    │   ├── httpx v0.28.1 (*)
    │   ├── jiter v0.9.0
    │   ├── pydantic v2.11.3 (*)
    │   ├── sniffio v1.3.1
    │   ├── tqdm v4.67.1
    │   └── typing-extensions v4.13.2
    ├── passlib[bcrypt] v1.7.4
    │   └── bcrypt v4.3.0 (extra: bcrypt)
    ├── peft v0.15.2
    │   ├── accelerate v1.6.0
    │   │   ├── huggingface-hub v0.30.2 (*)
    │   │   ├── numpy v1.26.4
    │   │   ├── packaging v24.2
    │   │   ├── psutil v7.0.0
    │   │   ├── pyyaml v6.0.2
    │   │   ├── safetensors v0.5.3
    │   │   └── torch v2.6.0
    │   │       ├── filelock v3.18.0
    │   │       ├── fsspec v2024.12.0 (*)
    │   │       ├── jinja2 v3.1.6 (*)
    │   │       ├── networkx v3.4.2
    │   │       ├── nvidia-cublas-cu12 v12.4.5.8
    │   │       ├── nvidia-cuda-cupti-cu12 v12.4.127
    │   │       ├── nvidia-cuda-nvrtc-cu12 v12.4.127
    │   │       ├── nvidia-cuda-runtime-cu12 v12.4.127
    │   │       ├── nvidia-cudnn-cu12 v9.1.0.70
    │   │       │   └── nvidia-cublas-cu12 v12.4.5.8
    │   │       ├── nvidia-cufft-cu12 v11.2.1.3
    │   │       │   └── nvidia-nvjitlink-cu12 v12.4.127
    │   │       ├── nvidia-curand-cu12 v10.3.5.147
    │   │       ├── nvidia-cusolver-cu12 v11.6.1.9
    │   │       │   ├── nvidia-cublas-cu12 v12.4.5.8
    │   │       │   ├── nvidia-cusparse-cu12 v12.3.1.170
    │   │       │   │   └── nvidia-nvjitlink-cu12 v12.4.127
    │   │       │   └── nvidia-nvjitlink-cu12 v12.4.127
    │   │       ├── nvidia-cusparse-cu12 v12.3.1.170 (*)
    │   │       ├── nvidia-cusparselt-cu12 v0.6.2
    │   │       ├── nvidia-nccl-cu12 v2.21.5
    │   │       ├── nvidia-nvjitlink-cu12 v12.4.127
    │   │       ├── nvidia-nvtx-cu12 v12.4.127
    │   │       ├── sympy v1.13.1
    │   │       │   └── mpmath v1.3.0
    │   │       ├── triton v3.2.0
    │   │       └── typing-extensions v4.13.2
    │   ├── huggingface-hub v0.30.2 (*)
    │   ├── numpy v1.26.4
    │   ├── packaging v24.2
    │   ├── psutil v7.0.0
    │   ├── pyyaml v6.0.2
    │   ├── safetensors v0.5.3
    │   ├── torch v2.6.0 (*)
    │   ├── tqdm v4.67.1
    │   └── transformers v4.51.1
    │       ├── filelock v3.18.0
    │       ├── huggingface-hub v0.30.2 (*)
    │       ├── numpy v1.26.4
    │       ├── packaging v24.2
    │       ├── pyyaml v6.0.2
    │       ├── regex v2024.11.6
    │       ├── requests v2.32.3 (*)
    │       ├── safetensors v0.5.3
    │       ├── tokenizers v0.21.1
    │       │   └── huggingface-hub v0.30.2 (*)
    │       └── tqdm v4.67.1
    ├── pillow v11.2.1
    ├── pydantic v2.11.3 (*)
    ├── pynvml v12.0.0
    │   └── nvidia-ml-py v12.570.86
    ├── python-jose[cryptography] v3.4.0
    │   ├── ecdsa v0.19.1
    │   │   └── six v1.17.0
    │   ├── pyasn1 v0.4.8
    │   ├── rsa v4.9.1
    │   │   └── pyasn1 v0.4.8
    │   └── cryptography v44.0.2 (extra: cryptography)
    │       └── cffi v1.17.1
    │           └── pycparser v2.22
    ├── requests v2.32.3 (*)
    ├── setproctitle v1.3.5
    ├── sse-starlette v2.2.1
    │   ├── anyio v4.9.0 (*)
    │   └── starlette v0.46.2 (*)
    ├── tabulate v0.9.0
    ├── timm v1.0.15
    │   ├── huggingface-hub v0.30.2 (*)
    │   ├── pyyaml v6.0.2
    │   ├── safetensors v0.5.3
    │   ├── torch v2.6.0 (*)
    │   └── torchvision v0.21.0
    │       ├── numpy v1.26.4
    │       ├── pillow v11.2.1
    │       └── torch v2.6.0 (*)
    ├── torch v2.6.0 (*)
    ├── tqdm v4.67.1
    ├── typing-extensions v4.13.2
    ├── uvicorn v0.34.2 (*)
    ├── xoscar v0.6.2
    │   ├── cloudpickle v3.1.1
    │   ├── numpy v1.26.4
    │   ├── packaging v24.2
    │   ├── pandas v2.2.3 (*)
    │   ├── psutil v7.0.0
    │   ├── scipy v1.13.1
    │   │   └── numpy v1.26.4
    │   ├── tblib v3.1.0
    │   └── uvloop v0.21.0
    ├── accelerate v1.6.0 (extra: all) (*)
    ├── attrdict v2.0.1 (extra: all)
    │   └── six v1.17.0
    ├── autoawq v0.2.5 (extra: all)
    │   ├── accelerate v1.6.0 (*)
    │   ├── autoawq-kernels v0.0.9
    │   │   └── torch v2.6.0 (*)
    │   ├── datasets v3.5.0
    │   │   ├── aiohttp v3.11.18 (*)
    │   │   ├── dill v0.3.8
    │   │   ├── filelock v3.18.0
    │   │   ├── fsspec[http] v2024.12.0 (*)
    │   │   ├── huggingface-hub v0.30.2 (*)
    │   │   ├── multiprocess v0.70.16
    │   │   │   └── dill v0.3.8
    │   │   ├── numpy v1.26.4
    │   │   ├── packaging v24.2
    │   │   ├── pandas v2.2.3 (*)
    │   │   ├── pyarrow v19.0.1
    │   │   ├── pyyaml v6.0.2
    │   │   ├── requests v2.32.3 (*)
    │   │   ├── tqdm v4.67.1
    │   │   └── xxhash v3.5.0
    │   ├── tokenizers v0.21.1 (*)
    │   ├── torch v2.6.0 (*)
    │   ├── transformers v4.51.1 (*)
    │   ├── typing-extensions v4.13.2
    │   └── zstandard v0.23.0
    ├── bitsandbytes v0.45.5 (extra: all)
    │   ├── numpy v1.26.4
    │   └── torch v2.6.0 (*)
    ├── blobfile v3.0.0 (extra: all)
    │   ├── filelock v3.18.0
    │   ├── lxml v5.4.0
    │   ├── pycryptodomex v3.22.0
    │   └── urllib3 v2.0.7
    ├── boto3 v1.28.64 (extra: all)
    │   ├── botocore v1.31.85
    │   │   ├── jmespath v1.0.1
    │   │   ├── python-dateutil v2.9.0.post0 (*)
    │   │   └── urllib3 v2.0.7
    │   ├── jmespath v1.0.1
    │   └── s3transfer v0.7.0
    │       └── botocore v1.31.85 (*)
    ├── cachetools v5.5.2 (extra: all)
    ├── chattts v0.2.3 (extra: all)
    │   ├── numba v0.61.0
    │   │   ├── llvmlite v0.44.0
    │   │   └── numpy v1.26.4
    │   ├── numpy v1.26.4
    │   ├── pybase16384 v0.3.8
    │   │   └── cffi v1.17.1 (*)
    │   ├── torch v2.6.0 (*)
    │   ├── torchaudio v2.6.0
    │   │   └── torch v2.6.0 (*)
    │   ├── tqdm v4.67.1
    │   ├── transformers v4.51.1 (*)
    │   ├── vector-quantize-pytorch v1.17.3
    │   │   ├── einops v0.8.1
    │   │   ├── einx v0.3.0
    │   │   │   ├── frozendict v2.4.6
    │   │   │   ├── numpy v1.26.4
    │   │   │   └── sympy v1.13.1 (*)
    │   │   └── torch v2.6.0 (*)
    │   └── vocos v0.1.0
    │       ├── einops v0.8.1
    │       ├── encodec v0.1.1
    │       │   ├── einops v0.8.1
    │       │   ├── numpy v1.26.4
    │       │   ├── torch v2.6.0 (*)
    │       │   └── torchaudio v2.6.0 (*)
    │       ├── huggingface-hub v0.30.2 (*)
    │       ├── numpy v1.26.4
    │       ├── pyyaml v6.0.2
    │       ├── scipy v1.13.1 (*)
    │       ├── torch v2.6.0 (*)
    │       └── torchaudio v2.6.0 (*)
    ├── conformer v0.3.2 (extra: all)
    │   ├── einops v0.8.1
    │   └── torch v2.6.0 (*)
    ├── controlnet-aux v0.0.7 (extra: all)
    │   ├── einops v0.8.1
    │   ├── filelock v3.18.0
    │   ├── huggingface-hub v0.30.2 (*)
    │   ├── importlib-metadata v8.6.1
    │   │   └── zipp v3.21.0
    │   ├── numpy v1.26.4
    │   ├── opencv-python v4.10.0.84
    │   │   └── numpy v1.26.4
    │   ├── pillow v11.2.1
    │   ├── scikit-image v0.25.2
    │   │   ├── imageio v2.37.0
    │   │   │   ├── numpy v1.26.4
    │   │   │   └── pillow v11.2.1
    │   │   ├── lazy-loader v0.4
    │   │   │   └── packaging v24.2
    │   │   ├── networkx v3.4.2
    │   │   ├── numpy v1.26.4
    │   │   ├── packaging v24.2
    │   │   ├── pillow v11.2.1
    │   │   ├── scipy v1.13.1 (*)
    │   │   └── tifffile v2025.3.30
    │   │       └── numpy v1.26.4
    │   ├── scipy v1.13.1 (*)
    │   ├── timm v1.0.15 (*)
    │   ├── torch v2.6.0 (*)
    │   └── torchvision v0.21.0 (*)
    ├── datamodel-code-generator v0.30.0 (extra: all)
    │   ├── argcomplete v3.6.2
    │   ├── black v25.1.0
    │   │   ├── click v8.1.8
    │   │   ├── mypy-extensions v1.1.0
    │   │   ├── packaging v24.2
    │   │   ├── pathspec v0.12.1
    │   │   └── platformdirs v4.3.7
    │   ├── genson v1.3.0
    │   ├── inflect v7.5.0
    │   │   ├── more-itertools v10.7.0
    │   │   └── typeguard v4.4.2
    │   │       └── typing-extensions v4.13.2
    │   ├── isort v6.0.1
    │   ├── jinja2 v3.1.6 (*)
    │   ├── packaging v24.2
    │   ├── pydantic v2.11.3 (*)
    │   ├── pyyaml v6.0.2
    │   └── tomli v2.2.1
    ├── diffusers v0.33.1 (extra: all)
    │   ├── filelock v3.18.0
    │   ├── huggingface-hub v0.30.2 (*)
    │   ├── importlib-metadata v8.6.1 (*)
    │   ├── numpy v1.26.4
    │   ├── pillow v11.2.1
    │   ├── regex v2024.11.6
    │   ├── requests v2.32.3 (*)
    │   └── safetensors v0.5.3
    ├── einops v0.8.1 (extra: all)
    ├── eva-decord v0.6.1 (extra: all)
    │   └── numpy v1.26.4
    ├── flagembedding v1.3.4 (extra: all)
    │   ├── accelerate v1.6.0 (*)
    │   ├── datasets v3.5.0 (*)
    │   ├── ir-datasets v0.5.10
    │   │   ├── beautifulsoup4 v4.13.4
    │   │   │   ├── soupsieve v2.7
    │   │   │   └── typing-extensions v4.13.2
    │   │   ├── ijson v3.3.0
    │   │   ├── inscriptis v2.6.0
    │   │   │   ├── lxml v5.4.0
    │   │   │   └── requests v2.32.3 (*)
    │   │   ├── lxml v5.4.0
    │   │   ├── lz4 v4.4.4
    │   │   ├── numpy v1.26.4
    │   │   ├── pyarrow v19.0.1
    │   │   ├── pyyaml v6.0.2
    │   │   ├── requests v2.32.3 (*)
    │   │   ├── tqdm v4.67.1
    │   │   ├── trec-car-tools v2.6
    │   │   │   ├── cbor v1.0.0
    │   │   │   └── numpy v1.26.4
    │   │   ├── unlzw3 v0.2.3
    │   │   ├── warc3-wet v0.2.5
    │   │   ├── warc3-wet-clueweb09 v0.2.5
    │   │   └── zlib-state v0.1.9
    │   ├── peft v0.15.2 (*)
    │   ├── protobuf v6.30.2
    │   ├── sentence-transformers v4.1.0
    │   │   ├── huggingface-hub v0.30.2 (*)
    │   │   ├── pillow v11.2.1
    │   │   ├── scikit-learn v1.6.1
    │   │   │   ├── joblib v1.4.2
    │   │   │   ├── numpy v1.26.4
    │   │   │   ├── scipy v1.13.1 (*)
    │   │   │   └── threadpoolctl v3.6.0
    │   │   ├── scipy v1.13.1 (*)
    │   │   ├── torch v2.6.0 (*)
    │   │   ├── tqdm v4.67.1
    │   │   ├── transformers v4.51.1 (*)
    │   │   └── typing-extensions v4.13.2
    │   ├── sentencepiece v0.2.0
    │   ├── torch v2.6.0 (*)
    │   └── transformers v4.51.1 (*)
    ├── funasr v1.1.16 (extra: all)
    │   ├── editdistance v0.8.1
    │   ├── hydra-core v1.3.2
    │   │   ├── antlr4-python3-runtime v4.9.3
    │   │   ├── omegaconf v2.3.0
    │   │   │   ├── antlr4-python3-runtime v4.9.3
    │   │   │   └── pyyaml v6.0.2
    │   │   └── packaging v24.2
    │   ├── jaconv v0.4.0
    │   ├── jamo v0.4.1
    │   ├── jieba v0.42.1
    │   ├── kaldiio v2.18.1
    │   │   └── numpy v1.26.4
    │   ├── librosa v0.11.0
    │   │   ├── audioread v3.0.1
    │   │   ├── decorator v5.2.1
    │   │   ├── joblib v1.4.2
    │   │   ├── lazy-loader v0.4 (*)
    │   │   ├── msgpack v1.1.0
    │   │   ├── numba v0.61.0 (*)
    │   │   ├── numpy v1.26.4
    │   │   ├── pooch v1.8.2
    │   │   │   ├── packaging v24.2
    │   │   │   ├── platformdirs v4.3.7
    │   │   │   └── requests v2.32.3 (*)
    │   │   ├── scikit-learn v1.6.1 (*)
    │   │   ├── scipy v1.13.1 (*)
    │   │   ├── soundfile v0.13.1
    │   │   │   ├── cffi v1.17.1 (*)
    │   │   │   └── numpy v1.26.4
    │   │   ├── soxr v0.5.0.post1
    │   │   │   └── numpy v1.26.4
    │   │   └── typing-extensions v4.13.2
    │   ├── modelscope v1.25.0 (*)
    │   ├── oss2 v2.13.1
    │   │   ├── aliyun-python-sdk-core-v3 v2.11.5
    │   │   │   └── pycryptodome v3.22.0
    │   │   ├── aliyun-python-sdk-kms v2.16.5
    │   │   │   └── aliyun-python-sdk-core v2.11.5
    │   │   │       └── pycryptodome v3.22.0
    │   │   ├── crcmod v1.7
    │   │   ├── pycryptodome v3.22.0
    │   │   ├── requests v2.32.3 (*)
    │   │   └── six v1.17.0
    │   ├── pytorch-wpe v0.0.1
    │   │   └── numpy v1.26.4
    │   ├── pyyaml v6.0.2
    │   ├── requests v2.32.3 (*)
    │   ├── scipy v1.13.1 (*)
    │   ├── sentencepiece v0.2.0
    │   ├── soundfile v0.13.1 (*)
    │   ├── tensorboardx v2.6.2.2
    │   │   ├── numpy v1.26.4
    │   │   ├── packaging v24.2
    │   │   └── protobuf v6.30.2
    │   ├── torch-complex v0.4.4
    │   │   ├── numpy v1.26.4
    │   │   └── packaging v24.2
    │   ├── tqdm v4.67.1
    │   └── umap-learn v0.5.7
    │       ├── numba v0.61.0 (*)
    │       ├── numpy v1.26.4
    │       ├── pynndescent v0.5.13
    │       │   ├── joblib v1.4.2
    │       │   ├── llvmlite v0.44.0
    │       │   ├── numba v0.61.0 (*)
    │       │   ├── scikit-learn v1.6.1 (*)
    │       │   └── scipy v1.13.1 (*)
    │       ├── scikit-learn v1.6.1 (*)
    │       ├── scipy v1.13.1 (*)
    │       └── tqdm v4.67.1
    ├── gdown v5.2.0 (extra: all)
    │   ├── beautifulsoup4 v4.13.4 (*)
    │   ├── filelock v3.18.0
    │   ├── requests[socks] v2.32.3 (*)
    │   └── tqdm v4.67.1
    ├── gguf v0.10.0 (extra: all)
    │   ├── numpy v1.26.4
    │   ├── pyyaml v6.0.2
    │   └── tqdm v4.67.1
    ├── gptqmodel v2.2.0 (extra: all)
    ├── hydra-core v1.3.2 (extra: all) (*)
    ├── hyperpyyaml v1.2.2 (extra: all)
    │   ├── pyyaml v6.0.2
    │   └── ruamel-yaml v0.18.10
    │       └── ruamel-yaml-clib v0.2.12
    ├── imageio-ffmpeg v0.6.0 (extra: all)
    ├── inflect v7.5.0 (extra: all) (*)
    ├── jieba v0.42.1 (extra: all)
    ├── jj-pytorchvideo v0.1.5 (extra: all)
    │   ├── av v14.3.0
    │   ├── fvcore v0.1.5.post20221221
    │   │   ├── iopath v0.1.10
    │   │   │   ├── portalocker v3.1.1
    │   │   │   ├── tqdm v4.67.1
    │   │   │   └── typing-extensions v4.13.2
    │   │   ├── numpy v1.26.4
    │   │   ├── pillow v11.2.1
    │   │   ├── pyyaml v6.0.2
    │   │   ├── tabulate v0.9.0
    │   │   ├── termcolor v3.0.1
    │   │   ├── tqdm v4.67.1
    │   │   └── yacs v0.1.8
    │   │       └── pyyaml v6.0.2
    │   ├── iopath v0.1.10 (*)
    │   ├── networkx v3.4.2
    │   └── parameterized v0.9.0
    ├── jsonschema v4.23.0 (extra: all)
    │   ├── attrs v25.3.0
    │   ├── jsonschema-specifications v2024.10.1
    │   │   └── referencing v0.36.2
    │   │       ├── attrs v25.3.0
    │   │       ├── rpds-py v0.24.0
    │   │       └── typing-extensions v4.13.2
    │   ├── referencing v0.36.2 (*)
    │   └── rpds-py v0.24.0
    ├── langdetect v1.0.9 (extra: all)
    │   └── six v1.17.0
    ├── librosa v0.11.0 (extra: all) (*)
    ├── lightning v2.5.1 (extra: all)
    │   ├── fsspec[http] v2024.12.0 (*)
    │   ├── lightning-utilities v0.14.3
    │   │   ├── packaging v24.2
    │   │   ├── setuptools v79.0.0
    │   │   └── typing-extensions v4.13.2
    │   ├── packaging v24.2
    │   ├── pytorch-lightning v2.5.1
    │   │   ├── fsspec[http] v2024.12.0 (*)
    │   │   ├── lightning-utilities v0.14.3 (*)
    │   │   ├── packaging v24.2
    │   │   ├── pyyaml v6.0.2
    │   │   ├── torch v2.6.0 (*)
    │   │   ├── torchmetrics v1.7.1
    │   │   │   ├── lightning-utilities v0.14.3 (*)
    │   │   │   ├── numpy v1.26.4
    │   │   │   ├── packaging v24.2
    │   │   │   └── torch v2.6.0 (*)
    │   │   ├── tqdm v4.67.1
    │   │   └── typing-extensions v4.13.2
    │   ├── pyyaml v6.0.2
    │   ├── torch v2.6.0 (*)
    │   ├── torchmetrics v1.7.1 (*)
    │   ├── tqdm v4.67.1
    │   └── typing-extensions v4.13.2
    ├── llama-cpp-python v0.3.8 (extra: all)
    │   ├── diskcache v5.6.3
    │   ├── jinja2 v3.1.6 (*)
    │   ├── numpy v1.26.4
    │   └── typing-extensions v4.13.2
    ├── loguru v0.7.3 (extra: all)
    ├── loralib v0.1.2 (extra: all)
    ├── natsort v8.4.0 (extra: all)
    ├── nemo-text-processing v1.0.2 (extra: all)
    │   ├── cdifflib v1.2.9
    │   ├── editdistance v0.8.1
    │   ├── inflect v7.5.0 (*)
    │   ├── joblib v1.4.2
    │   ├── pandas v2.2.3 (*)
    │   ├── pynini v2.1.5
    │   │   └── cython v3.0.12
    │   ├── regex v2024.11.6
    │   ├── sacremoses v0.1.1
    │   │   ├── click v8.1.8
    │   │   ├── joblib v1.4.2
    │   │   ├── regex v2024.11.6
    │   │   └── tqdm v4.67.1
    │   ├── setuptools v79.0.0
    │   ├── tqdm v4.67.1
    │   ├── transformers v4.51.1 (*)
    │   ├── wget v3.2
    │   └── wrapt v1.17.2
    ├── omegaconf v2.3.0 (extra: all) (*)
    ├── onnxruntime v1.21.1 (extra: all)
    │   ├── coloredlogs v15.0.1
    │   │   └── humanfriendly v10.0
    │   ├── flatbuffers v25.2.10
    │   ├── numpy v1.26.4
    │   ├── packaging v24.2
    │   ├── protobuf v6.30.2
    │   └── sympy v1.13.1 (*)
    ├── optimum v1.24.0 (extra: all)
    │   ├── huggingface-hub v0.30.2 (*)
    │   ├── numpy v1.26.4
    │   ├── packaging v24.2
    │   ├── torch v2.6.0 (*)
    │   └── transformers v4.51.1 (*)
    ├── orjson v3.10.16 (extra: all)
    ├── ormsgpack v1.9.1 (extra: all)
    ├── outlines v0.1.11 (extra: all)
    │   ├── airportsdata v20250224
    │   ├── cloudpickle v3.1.1
    │   ├── diskcache v5.6.3
    │   ├── interegular v0.3.3
    │   ├── jinja2 v3.1.6 (*)
    │   ├── jsonschema v4.23.0 (*)
    │   ├── lark v1.2.2
    │   ├── nest-asyncio v1.6.0
    │   ├── numpy v1.26.4
    │   ├── outlines-core v0.1.26
    │   │   ├── interegular v0.3.3
    │   │   └── jsonschema v4.23.0 (*)
    │   ├── pycountry v24.6.1
    │   ├── pydantic v2.11.3 (*)
    │   ├── referencing v0.36.2 (*)
    │   ├── requests v2.32.3 (*)
    │   ├── torch v2.6.0 (*)
    │   ├── tqdm v4.67.1
    │   └── typing-extensions v4.13.2
    ├── protobuf v6.30.2 (extra: all)
    ├── pyarrow v19.0.1 (extra: all)
    ├── pyloudnorm v0.1.1 (extra: all)
    │   ├── future v1.0.0
    │   ├── numpy v1.26.4
    │   └── scipy v1.13.1 (*)
    ├── pypinyin v0.54.0 (extra: all)
    ├── qwen-omni-utils v0.0.4 (extra: all)
    │   ├── av v14.3.0
    │   ├── librosa v0.11.0 (*)
    │   ├── packaging v24.2
    │   ├── pillow v11.2.1
    │   └── requests v2.32.3 (*)
    ├── qwen-vl-utils v0.0.11 (extra: all)
    │   ├── av v14.3.0
    │   ├── packaging v24.2
    │   ├── pillow v11.2.1
    │   └── requests v2.32.3 (*)
    ├── sentence-transformers v4.1.0 (extra: all) (*)
    ├── sentencepiece v0.2.0 (extra: all)
    ├── sglang[srt] v0.4.5.post3 (extra: all)
    │   ├── aiohttp v3.11.18 (*)
    │   ├── ipython v9.1.0
    │   │   ├── decorator v5.2.1
    │   │   ├── ipython-pygments-lexers v1.1.1
    │   │   │   └── pygments v2.19.1
    │   │   ├── jedi v0.19.2
    │   │   │   └── parso v0.8.4
    │   │   ├── matplotlib-inline v0.1.7
    │   │   │   └── traitlets v5.14.3
    │   │   ├── pexpect v4.9.0
    │   │   │   └── ptyprocess v0.7.0
    │   │   ├── prompt-toolkit v3.0.51
    │   │   │   └── wcwidth v0.2.13
    │   │   ├── pygments v2.19.1
    │   │   ├── stack-data v0.6.3
    │   │   │   ├── asttokens v3.0.0
    │   │   │   ├── executing v2.2.0
    │   │   │   └── pure-eval v0.2.3
    │   │   ├── traitlets v5.14.3
    │   │   └── typing-extensions v4.13.2
    │   ├── numpy v1.26.4
    │   ├── requests v2.32.3 (*)
    │   ├── setproctitle v1.3.5
    │   ├── tqdm v4.67.1
    │   ├── compressed-tensors v0.9.2 (extra: srt)
    │   │   ├── pydantic v2.11.3 (*)
    │   │   ├── torch v2.6.0 (*)
    │   │   └── transformers v4.51.1 (*)
    │   ├── cuda-python v12.8.0 (extra: srt)
    │   │   └── cuda-bindings v12.8.0
    │   ├── datasets v3.5.0 (extra: srt) (*)
    │   ├── decord v0.6.0 (extra: srt)
    │   │   └── numpy v1.26.4
    │   ├── einops v0.8.1 (extra: srt)
    │   ├── fastapi v0.115.12 (extra: srt) (*)
    │   ├── flashinfer-python v0.2.3 (extra: srt)
    │   │   ├── ninja v1.11.1.4
    │   │   ├── numpy v1.26.4
    │   │   └── torch v2.6.0 (*)
    │   ├── hf-transfer v0.1.9 (extra: srt)
    │   ├── huggingface-hub v0.30.2 (extra: srt) (*)
    │   ├── interegular v0.3.3 (extra: srt)
    │   ├── llguidance v0.7.18 (extra: srt)
    │   ├── modelscope v1.25.0 (extra: srt) (*)
    │   ├── ninja v1.11.1.4 (extra: srt)
    │   ├── orjson v3.10.16 (extra: srt)
    │   ├── outlines v0.1.11 (extra: srt) (*)
    │   ├── packaging v24.2 (extra: srt)
    │   ├── partial-json-parser v0.2.1.1.post5 (extra: srt)
    │   ├── pillow v11.2.1 (extra: srt)
    │   ├── prometheus-client v0.21.1 (extra: srt)
    │   ├── psutil v7.0.0 (extra: srt)
    │   ├── pydantic v2.11.3 (extra: srt) (*)
    │   ├── pynvml v12.0.0 (extra: srt) (*)
    │   ├── python-multipart v0.0.20 (extra: srt)
    │   ├── pyzmq v26.4.0 (extra: srt)
    │   ├── sgl-kernel v0.0.9.post2 (extra: srt)
    │   ├── soundfile v0.13.1 (extra: srt) (*)
    │   ├── torch v2.6.0 (extra: srt) (*)
    │   ├── torchao v0.10.0 (extra: srt)
    │   ├── torchvision v0.21.0 (extra: srt) (*)
    │   ├── transformers v4.51.1 (extra: srt) (*)
    │   ├── uvicorn v0.34.2 (extra: srt) (*)
    │   ├── uvloop v0.21.0 (extra: srt)
    │   └── xgrammar v0.1.17 (extra: srt)
    │       ├── nanobind v2.7.0
    │       ├── ninja v1.11.1.4
    │       ├── pydantic v2.11.3 (*)
    │       ├── sentencepiece v0.2.0
    │       ├── tiktoken v0.9.0
    │       │   ├── regex v2024.11.6
    │       │   └── requests v2.32.3 (*)
    │       ├── torch v2.6.0 (*)
    │       └── transformers v4.51.1 (*)
    ├── silero-vad v5.1.2 (extra: all)
    │   ├── onnxruntime v1.21.1 (*)
    │   ├── torch v2.6.0 (*)
    │   └── torchaudio v2.6.0 (*)
    ├── soundfile v0.13.1 (extra: all) (*)
    ├── tensorizer v2.9.2 (extra: all)
    │   ├── boto3 v1.28.64 (*)
    │   ├── hiredis v3.1.0
    │   ├── libnacl v2.1.0
    │   ├── numpy v1.26.4
    │   ├── protobuf v6.30.2
    │   ├── psutil v7.0.0
    │   ├── redis v5.2.1
    │   │   └── async-timeout v5.0.1
    │   └── torch v2.6.0 (*)
    ├── tiktoken v0.9.0 (extra: all) (*)
    ├── timm v1.0.15 (extra: all) (*)
    ├── tomli v2.2.1 (extra: all)
    ├── torch v2.6.0 (extra: all) (*)
    ├── torchaudio v2.6.0 (extra: all) (*)
    ├── torchdiffeq v0.2.5 (extra: all)
    │   ├── scipy v1.13.1 (*)
    │   └── torch v2.6.0 (*)
    ├── torchvision v0.21.0 (extra: all) (*)
    ├── transformers v4.51.1 (extra: all) (*)
    ├── transformers-stream-generator v0.0.5 (extra: all)
    │   └── transformers v4.51.1 (*)
    ├── uv v0.6.16 (extra: all)
    ├── vector-quantize-pytorch v1.17.3 (extra: all) (*)
    ├── verovio v5.1.0 (extra: all)
    ├── vllm v0.8.3 (extra: all)
    │   ├── aiohttp v3.11.18 (*)
    │   ├── blake3 v1.0.4
    │   ├── cachetools v5.5.2
    │   ├── cloudpickle v3.1.1
    │   ├── compressed-tensors v0.9.2 (*)
    │   ├── depyf v0.18.0
    │   │   ├── astor v0.8.1
    │   │   └── dill v0.3.8
    │   ├── einops v0.8.1
    │   ├── fastapi[standard] v0.115.12 (*)
    │   ├── filelock v3.18.0
    │   ├── gguf v0.10.0 (*)
    │   ├── huggingface-hub[hf-xet] v0.30.2 (*)
    │   ├── importlib-metadata v8.6.1 (*)
    │   ├── lark v1.2.2
    │   ├── llguidance v0.7.18
    │   ├── lm-format-enforcer v0.10.11
    │   │   ├── interegular v0.3.3
    │   │   ├── packaging v24.2
    │   │   ├── pydantic v2.11.3 (*)
    │   │   └── pyyaml v6.0.2
    │   ├── mistral-common[opencv] v1.5.4
    │   │   ├── jsonschema v4.23.0 (*)
    │   │   ├── numpy v1.26.4
    │   │   ├── pillow v11.2.1
    │   │   ├── pydantic v2.11.3 (*)
    │   │   ├── requests v2.32.3 (*)
    │   │   ├── sentencepiece v0.2.0
    │   │   ├── tiktoken v0.9.0 (*)
    │   │   ├── typing-extensions v4.13.2
    │   │   └── opencv-python-headless v4.11.0.86 (extra: opencv)
    │   │       └── numpy v1.26.4
    │   ├── msgspec v0.19.0
    │   ├── ninja v1.11.1.4
    │   ├── numba v0.61.0 (*)
    │   ├── numpy v1.26.4
    │   ├── openai v1.75.0 (*)
    │   ├── opencv-python-headless v4.11.0.86 (*)
    │   ├── outlines v0.1.11 (*)
    │   ├── partial-json-parser v0.2.1.1.post5
    │   ├── pillow v11.2.1
    │   ├── prometheus-client v0.21.1
    │   ├── prometheus-fastapi-instrumentator v7.1.0
    │   │   ├── prometheus-client v0.21.1
    │   │   └── starlette v0.46.2 (*)
    │   ├── protobuf v6.30.2
    │   ├── psutil v7.0.0
    │   ├── py-cpuinfo v9.0.0
    │   ├── pydantic v2.11.3 (*)
    │   ├── python-json-logger v3.3.0
    │   ├── pyyaml v6.0.2
    │   ├── pyzmq v26.4.0
    │   ├── ray[cgraph] v2.43.0
    │   │   ├── aiosignal v1.3.2 (*)
    │   │   ├── click v8.1.8
    │   │   ├── filelock v3.18.0
    │   │   ├── frozenlist v1.6.0
    │   │   ├── jsonschema v4.23.0 (*)
    │   │   ├── msgpack v1.1.0
    │   │   ├── packaging v24.2
    │   │   ├── protobuf v6.30.2
    │   │   ├── pyyaml v6.0.2
    │   │   ├── requests v2.32.3 (*)
    │   │   └── cupy-cuda12x v13.4.1 (extra: cgraph)
    │   │       ├── fastrlock v0.8.3
    │   │       └── numpy v1.26.4
    │   ├── requests v2.32.3 (*)
    │   ├── scipy v1.13.1 (*)
    │   ├── sentencepiece v0.2.0
    │   ├── tiktoken v0.9.0 (*)
    │   ├── tokenizers v0.21.1 (*)
    │   ├── torch v2.6.0 (*)
    │   ├── torchaudio v2.6.0 (*)
    │   ├── torchvision v0.21.0 (*)
    │   ├── tqdm v4.67.1
    │   ├── transformers v4.51.1 (*)
    │   ├── typing-extensions v4.13.2
    │   ├── watchfiles v1.0.5 (*)
    │   ├── xformers v0.0.29.post2
    │   │   ├── numpy v1.26.4
    │   │   └── torch v2.6.0 (*)
    │   └── xgrammar v0.1.17 (*)
    ├── vocos v0.1.0 (extra: all) (*)
    ├── wetextprocessing v1.0.3 (extra: all)
    │   ├── importlib-resources v6.5.2
    │   └── pynini v2.1.5 (*)
    ├── x-transformers v2.2.12 (extra: all)
    │   ├── einops v0.8.1
    │   ├── einx v0.3.0 (*)
    │   ├── loguru v0.7.3
    │   ├── packaging v24.2
    │   └── torch v2.6.0 (*)
    ├── xllamacpp v0.1.14 (extra: all)
    └── xxhash v3.5.0 (extra: all)
(*) Package tree already displayed
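When an installation misbehaves, it can help to compare the local environment against the reference tree above. The following is a minimal sketch, not part of Xinference or uv: the helper names `parse_uv_tree` and `diff_versions` are hypothetical, and the parser assumes each entry looks like the lines shown in this appendix (`name[extras] vX.Y.Z`, optionally followed by `(extra: ...)` or `(*)`).

```python
import re

# Matches the "name[extras] vX.Y.Z" part of one `uv tree` line, skipping any
# tree-drawing prefix. The format is assumed from the listing in this appendix.
_ENTRY = re.compile(r"([A-Za-z0-9][A-Za-z0-9._-]*)(?:\[[^\]]*\])?\s+v(\S+)")


def parse_uv_tree(text):
    """Return {package_name: version} for every entry found in `uv tree` output."""
    versions = {}
    for line in text.splitlines():
        match = _ENTRY.search(line)
        if match:
            # Lower-case names so the comparison is case-insensitive.
            versions[match.group(1).lower()] = match.group(2)
    return versions


def diff_versions(reference, local):
    """Return {name: (reference_version, local_version)} for entries that differ.

    A package missing from the local environment is reported with None.
    """
    return {
        name: (ref, local.get(name))
        for name, ref in reference.items()
        if local.get(name) != ref
    }
```

To use it, save this appendix to one file and the output of `uv tree` from the problem environment to another, parse both with `parse_uv_tree`, and pass the two dictionaries to `diff_versions`. Packages whose versions differ from the reference (for example `torch`, `transformers`, or `vllm`) are a reasonable first place to look.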
