概述

ChatGLM3 是由智谱AI和清华大学 KEG 实验室联合发布的新一代对话预训练模型。ChatGLM3-6B 是 ChatGLM3 系列中的开源模型，继承了前两代模型对话流畅、部署门槛低等众多优秀特性，并在此基础上进行了全面的性能提升和创新性功能扩展。

系统要求

操作系统：Windows、Linux 或 macOS。本教程使用Windows进行安装。
python 版本推荐3.10.12
transformers 库版本推荐为 4.30.2
torch 推荐使用 2.0 及以上的版本，以获得最佳的推理性能
CUDA：如果你打算在 GPU 上运行模型，需要安装 CUDA（仅限 Windows 和 Linux）

部署

部署gpu驱动

#下载rtx4060驱动
https://www.nvidia.cn/drivers/lookup/ 

#安装基础依赖环境
yum -y install gcc kernel-devel kernel-headers

#内核版本和源码版本
ls /boot | grep vmlinu 
rpm -aq |grep kernel-devel

# 屏蔽默认带有的nouveau，并追加两条
vim /lib/modprobe.d/dist-blacklist.conf
#blacklist nvidiafb
blacklist nouveau
options nouveau modeset=0

#重建 initramfs image 步骤
mv /boot/initramfs-$(uname -r).img /boot/initramfs-$(uname -r).img.bak
dracut /boot/initramfs-$(uname -r).img $(uname -r)

#修改运行级别为文本模式
systemctl set-default multi-user.target
reboot
#检查，如果没有显示相关的内容，说明已禁用
ls mod | grep nouveau

#安装驱动
chmod 777 NVIDIA-Linux-x86_64-550.142.run
./NVIDIA-Linux-x86_64-550.142.run --add-this-kernel --kernel-source-path=/usr/src/kernels/3.10.0-1160.119.1.el7.x86_64

步骤 1：创建虚拟环境

打开终端cmd，安装并创建一个新的 Anaconda 环境。这将有助于隔离项目依赖项。
Anaconda 下载地址：

#配置yum源
mv /etc/yum.repos.d/* /tmp
curl -o /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
curl -o /etc/yum.repos.d/epel.repo http://mirrors.aliyun.com/repo/epel-7.repo

yum install -y wget
wget https://repo.anaconda.com/archive/Anaconda3-2023.03-1-Linux-x86_64.sh
bash Anaconda3-2023.03-1-Linux-x86_64.sh
选择yes

vi ~/.bashrc
export PATH=$PATH:/root/anaconda3/bin
source ~/.bashrc

#使用conda list查看conda清单
conda list
#不激活base
conda config --set auto_activate_base false

执行命令：

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple/
conda create -n py3.10 python=3.10
conda activate py3.10

步骤 2：安装依赖项

安装NVIDIA驱动以及CUDA Toolkit 11.8，地址如下：
https://developer.nvidia.com/cuda-11-8-0-download-archive?target_os=Linux。选择对应的安装包进行下载并安装

安装NVIDIA驱动以及CUDA Toolkit 11.8

安装PyTorch，到此地址
https://pytorch.org/get-started/locally/并根据本机硬件选择的版本，如下图所示：

安装PyTorch

conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia

#确认安装成功
(py3.10) [root@centos ChatGLM3-main]# python
Python 3.10.16 (main, Dec 11 2024, 16:24:50) [GCC 11.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
>>> print(torch.__version__)
2.3.0+cu121
>>> torch.cuda.is_available()
True

步骤 3：下载 ChatGLM3-6B 模型

yum install -y git
git clone https://github.com/THUDM/ChatGLM3
cd ChatGLM3

步骤 4：安装模型依赖项

pip install -r requirements.txt

步骤 5：下载模型文件

mkdir THUDM
cd THUDM
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.rpm.sh | sudo bash
yum install git-lfs
git clone https://www.modelscope.cn/ZhipuAI/chatglm3-6b.git

步骤 6：运行模型

命令界面

python basic_demo/cli_demo.py

Streamlit 界面
在浏览器中打开 http://localhost:8501 来访问 Streamlit 界面。

pip install streamlit
streamlit run basic_demo\web_demo_streamlit.py

在浏览器中打开 http://localhost:8501 来访问 Streamlit 界面。

REST API

python openai_api_demo\api_server.py

ChatGLM3-6B大模型Centos7部署