Notes on Converting Model Files (gguf, safetensors)

1. gguf to safetensors

Using a model running under ollama as an example.
a.) Find the model file path
ollama show <model-name> --modelfile
The FROM xxxxxx line in the output is the model file (it is in GGUF format, even though the displayed file may not carry a .gguf extension).
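Once you have the FROM path, you can copy the blob out under a .gguf name to work with it directly. A hypothetical example (the blob path below is illustrative; the actual location depends on your ollama installation, commonly ~/.ollama/models/blobs on Linux/macOS):

cp ~/.ollama/models/blobs/sha256-xxxxxx mywait.gguf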

b.) Download the conversion tool
https://github.com/purinnohito/gguf_to_safetensors
Install the required dependencies in the repository directory:
pip3 install -r requirements.txt

c.) Run the conversion
python3 gguf_to_safetensors.py --input mywait.gguf --output mywait.safetensors

Extracted 389 tensors from GGUF file
Using FP16 conversion.
dequantize tensor: token_embd.weight | Shape: torch.Size([250002, 1024]) | Type: torch.float16
Using FP16 conversion.
dequantize tensor: position_embd.weight | Shape: torch.Size([8192, 1024]) | Type: torch.float16
......
......
dequantize tensor: blk.23.layer_output_norm.weight | Shape: torch.Size([1024]) | Type: torch.float16
Using FP16 conversion.
dequantize tensor: blk.23.layer_output_norm.bias | Shape: torch.Size([1024]) | Type: torch.float16
Conversion complete!
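To sanity-check the result, you can list the tensors in the converted file with the safetensors library. A minimal sketch (the file name matches the command above):

from safetensors import safe_open

# Open the converted file lazily and print each tensor's name, shape and dtype
with safe_open("mywait.safetensors", framework="pt", device="cpu") as f:
    for name in f.keys():
        t = f.get_tensor(name)
        print(name, tuple(t.shape), t.dtype)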

2. safetensors to gguf

a.) Download the conversion tool
https://github.com/ggml-org/llama.cpp
Install the required dependencies in the repository directory:
pip3 install -r requirements.txt

b.) Run the conversion
[llama.cpp]
python3 convert_hf_to_gguf.py <model-directory> --outfile dsr1-qw-1.5b.gguf

INFO:hf-to-gguf:Loading model: DeepSeek-R1-Distill-Qwen-1.5B
INFO:gguf.gguf_writer:gguf: This GGUF file is for Little Endian only
INFO:hf-to-gguf:Exporting model...
INFO:hf-to-gguf:gguf: loading model part 'model.safetensors'
INFO:hf-to-gguf:output.weight,             torch.bfloat16 --> F16, shape = {1536, 151936}
.....
.....
INFO:hf-to-gguf:Set model quantization version
INFO:gguf.gguf_writer:Writing the following files:
INFO:gguf.gguf_writer:dsr1-qw-1.5b.gguf: n_tensors = 339, total_size = 3.6G
INFO:hf-to-gguf:Model successfully exported to dsr1-qw-1.5b.gguf
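You can likewise inspect the exported file with the gguf Python package. A minimal sketch, assuming the gguf package shipped with llama.cpp (gguf-py) is installed:

from gguf import GGUFReader

# Read the GGUF header and tensor table without loading the full weights
reader = GGUFReader("dsr1-qw-1.5b.gguf")
print(f"n_tensors = {len(reader.tensors)}")
for t in reader.tensors[:5]:
    print(t.name, t.shape, t.tensor_type)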

[Custom conversion tool]
Sometimes you only have a single .safetensors model file; in that case, the following script can perform the conversion.
Python script: safetensors_to_gguf.py

import argparse

import torch
from safetensors.torch import load_file
import gguf

parser = argparse.ArgumentParser(description='Convert a .safetensors file to GGUF')
parser.add_argument("--input", type=str, required=True)
parser.add_argument("--output", type=str, required=True)
parser.add_argument("--ft", default="0", type=str,
                    help="Output float type: 1 - float32, 0 - float16; default float16")

args = parser.parse_args()
floatType = args.ft

def convert_tensor_to_supported_dtype(tensor):
    """Convert a tensor to a data type supported by GGUF (F16 or F32)."""
    if tensor.dtype in [torch.float16, torch.bfloat16, torch.float32, torch.float64]:
        if floatType == "1":
            return tensor.to(dtype=torch.float32)  # F32 for maximum compatibility
        else:
            return tensor.to(dtype=torch.float16)
    elif tensor.dtype in [torch.int8, torch.int16, torch.int32, torch.int64]:
        return tensor  # integer tensors are passed through unchanged
    else:
        raise ValueError(f"Unsupported tensor dtype: {tensor.dtype}")

# Load the .safetensors file
model_path = args.input
model = load_file(model_path, device="cpu")

# Create a GGUF writer ("llama" is the architecture string written to the header)
output_path = args.output
gguf_writer = gguf.GGUFWriter(output_path, "llama")

# Convert tensors to a supported dtype, then register them with the writer
converted_model = {k: convert_tensor_to_supported_dtype(v) for k, v in model.items()}

for key, tensor in converted_model.items():
    gguf_writer.add_tensor(key, tensor.numpy())

# Finalize the GGUF file (the header and key/value metadata must be
# written before the tensor data)
gguf_writer.write_header_to_file()
gguf_writer.write_kv_data_to_file()
gguf_writer.write_tensors_to_file()
gguf_writer.close()

Command:
python3 safetensors_to_gguf.py --input bge-m3_latest.safetensors --output /bge-m3.gguf --ft 1
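Note that this script writes only the tensors plus the architecture string; it attaches no tokenizer or hyperparameter metadata, so llama.cpp will generally not be able to load the result as-is. If needed, metadata can be added through the GGUFWriter before the write_*_to_file calls. A minimal sketch with hypothetical values (take the real ones from the model's config.json):

# Insert before gguf_writer.write_header_to_file() in the script above.
# All values below are placeholders for illustration only.
gguf_writer.add_name("my-model")
gguf_writer.add_context_length(8192)
gguf_writer.add_embedding_length(1024)
gguf_writer.add_block_count(24)
gguf_writer.add_head_count(16)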
