Compiling ollama on Ubuntu for the Radeon 780M

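Preface: on Ubuntu 22.04 with ROCm, the VRAM available to ollama is capped at the 4 GB allocated in the BIOS. According to a GitHub issue, building ollama manually works around this.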
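
To see the limit described above on your own machine, the amdgpu driver exposes memory totals in sysfs. The sketch below prints the dedicated VRAM and the GTT (system memory the GPU can map) for each amdgpu device. The `to_mib` helper is a name made up for this example, and the assumption that the patched branch relies on GTT comes only from its name, AMD_APU_GTT_memory.

```shell
#!/bin/sh
# Convert a byte count to whole MiB (hypothetical helper for this sketch).
to_mib() {
  echo $(( $1 / 1048576 ))
}

# Print VRAM and GTT totals for every device that exposes the amdgpu files.
for dev in /sys/class/drm/card*/device; do
  [ -r "$dev/mem_info_vram_total" ] || continue
  echo "$dev: VRAM $(to_mib "$(cat "$dev/mem_info_vram_total")") MiB," \
       "GTT $(to_mib "$(cat "$dev/mem_info_gtt_total")") MiB"
done
```

If the VRAM figure is about 4096 MiB while GTT is much larger, the machine matches the situation this post is about.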

  1. Upgrade the kernel to 6.10 (IMPORTANT: it must be 6.9.9 or later)
# First enter the BIOS and disable Secure Boot
sudo apt update
sudo apt install -y libc6-dev libelf1 libssl3
sudo apt --fix-broken install

sudo add-apt-repository ppa:cappelikan/ppa -y
sudo apt update && sudo apt install mainline -y
sudo mainline install 6.10.14
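
After rebooting, it is worth confirming that the machine is actually running the new kernel. A minimal sketch, assuming GNU `sort -V` is available (it is on Ubuntu); `version_ge` is a helper name invented for this example:

```shell
#!/bin/sh
# Return 0 when version $1 is greater than or equal to version $2.
# Uses `sort -V` (version sort): if $2 sorts first, then $1 >= $2.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Strip the distro suffix, e.g. "6.10.14-061014-generic" -> "6.10.14"
running=$(uname -r | cut -d- -f1)

if version_ge "$running" "6.9.9"; then
  echo "Kernel $running is new enough"
else
  echo "Kernel $running is older than 6.9.9; reboot into the new kernel"
fi
```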
  2. Install Go 1.23
    https://go.dev/dl/
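
The official Go instructions extract the downloaded tarball to /usr/local and add /usr/local/go/bin to PATH. The sketch below only checks which Go version the shell finds afterwards; `parse_go_version` is a hypothetical helper, and the sample line in its comment assumes the usual `go version` output format.

```shell
#!/bin/sh
# Extract "1.23.4" from a line such as "go version go1.23.4 linux/amd64".
parse_go_version() {
  echo "$1" | sed -n 's/^go version go\([0-9][0-9.]*\).*/\1/p'
}

if command -v go >/dev/null 2>&1; then
  echo "Found Go $(parse_go_version "$(go version)")"
else
  echo "go not found; check that /usr/local/go/bin is on PATH"
fi
```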

  3. Following this GitHub pull request, check out the patched ollama branch and build it manually
    https://github.com/ollama/ollama/pull/6282

git clone https://github.com/Maciej-Mogilany/ollama.git
cd ollama
git checkout AMD_APU_GTT_memory
make -j 12
export HSA_OVERRIDE_GFX_VERSION=11.0.1  # for 780M (mine is 11.0.2)
sudo systemctl stop ollama  # stop the original ollama for now
./ollama serve
In another terminal:
./ollama run <model-name>
If everything works, you may replace the original ollama binary with the one built from source and add HSA_OVERRIDE_GFX_VERSION=11.0.1 to the ollama service for convenience.
sudo systemctl start ollama  # start the original ollama again
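
One way to add the variable to the service is a systemd drop-in file. This is a sketch, not the only option: it assumes the unit is named ollama.service, as the systemctl commands above suggest, and `write_gfx_override` is a helper invented for this example.

```shell
#!/bin/sh
# Write a systemd drop-in that sets HSA_OVERRIDE_GFX_VERSION for a unit.
#   $1: drop-in directory, e.g. /etc/systemd/system/ollama.service.d
#   $2: GFX version to override, e.g. 11.0.1
write_gfx_override() {
  mkdir -p "$1" || return 1
  printf '[Service]\nEnvironment="HSA_OVERRIDE_GFX_VERSION=%s"\n' "$2" \
    > "$1/override.conf"
}

# Typical use (as root), followed by a reload and restart:
#   write_gfx_override /etc/systemd/system/ollama.service.d 11.0.1
#   systemctl daemon-reload && systemctl restart ollama
```

A drop-in keeps the change separate from the unit file itself, so reinstalling ollama should not silently remove it.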
  4. Script for pulling models (works around interrupted downloads)
#!/bin/sh

# Set the speed threshold in KB/s
THRESHOLD=500

# Variable to track consecutive slow speed occurrences
slow_count=0

while true; do
  # Measure connection speed by downloading a 100 MB test file
  speed=$(curl -s -w '%{speed_download}' -o /dev/null https://speed.hetzner.de/100MB.bin)
  
  # Check if speed is empty or invalid
  if [ -z "$speed" ] || ! echo "$speed" | grep -Eq '^[0-9]+(\.[0-9]+)?$'; then
    echo "Failed to measure speed. Retrying..."
    sleep 10
    continue
  fi

  # Convert speed from bytes/s to KB/s
  speed_kbps=$(echo "$speed" | awk '{printf "%.0f", $1 / 1024}')  # awk handles the fractional value

  echo "Current speed: ${speed_kbps} KB/s"

  if [ "$speed_kbps" -lt "$THRESHOLD" ]; then
    slow_count=$((slow_count + 1))
    echo "Speed below threshold ($THRESHOLD KB/s). Slow count: $slow_count"
  else
    slow_count=0  # Reset the counter if speed is above threshold
  fi

  # Retry the pull if slow speed is detected twice in a row
  if [ "$slow_count" -ge 2 ]; then
    echo "Connection speed is slow twice in a row. Retrying pull..."
    timeout 10s ollama pull deepseek-r1:8b
    slow_count=0  # Reset the counter after a pull attempt
  fi

  # Wait for a few seconds before checking again
  sleep 7
done
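
A simpler alternative to the speed-based script above, shown as a sketch, is to re-run the pull until it exits successfully. It relies on the assumption that a repeated `ollama pull` resumes the partial download rather than starting over. The `retry` helper is a name invented for this example.

```shell
#!/bin/sh
# Re-run a command until it exits successfully, pausing between attempts.
#   $1: maximum number of attempts
#   $2: seconds to wait between attempts
#   remaining arguments: the command to run
retry() {
  max=$1; delay=$2; shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Example: retry 20 5 ollama pull deepseek-r1:8b
```

Unlike the script above, this does not measure speed, so it only helps when the pull fails outright instead of stalling.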

  5. If problems remain, build a container image following this guide (not yet verified)
    https://blog.machinezoo.com/Running_Ollama_on_AMD_iGPU
cd ~
mkdir ollama-gtt
cd ollama-gtt
git clone \
    -b AMD_APU_GTT_memory \
    --recurse-submodules \
    https://github.com/Maciej-Mogilany/ollama.git \
    .
podman image prune -f
rm -rf /var/tmp/buildah-cache-1000
podman build \
    -f Dockerfile \
    --no-cache \
    --platform=linux/amd64 \
    --target runtime-rocm \
    --build-arg=OLLAMA_SKIP_CUDA_GENERATE=1 \
    -t ollama-gtt

podman run -d \
    --name ollama \
    --replace \
    --pull=always \
    --restart=always \
    --stop-signal=SIGKILL \
    -p 127.0.0.1:11434:11434 \
    -v ollama:/root/.ollama \
    -e OLLAMA_MAX_LOADED_MODELS=1 \
    -e OLLAMA_NUM_PARALLEL=1 \
    --device /dev/dri \
    --device /dev/kfd \
    -e HSA_OVERRIDE_GFX_VERSION=11.0.1 \
    ollama-gtt