Environment
OS:
Windows 10
Model:
Llama3:8b. This guide runs on a personal laptop, so the lighter 8B model is used; if your machine has higher specs, you can use the 70B model instead.
Because the Hugging Face website is normally unreachable without a proxy, downloading from a domestic mirror is recommended:
Official repo: https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/tree/main
Domestic mirror: https://hf-mirror.com/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/tree/main
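The GGUF file can be fetched from the mirror directly on the command line; a minimal sketch, assuming the Q4_K_M file sits at the path shown in the repository listing above:

```shell
# Build the mirror download URL for the quantized model
# (repo and filename taken from the listing above; adjust if the layout changes).
REPO="QuantFactory/Meta-Llama-3-8B-Instruct-GGUF"
FILE="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf"
URL="https://hf-mirror.com/${REPO}/resolve/main/${FILE}"
echo "$URL"
# Download with resume support (the file is several GB):
# curl -L -C - -o "$FILE" "$URL"
```

The `-C -` flag lets curl resume a partial download, which helps on flaky connections.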
Installation tool:
ollama, a very convenient deployment tool that supports macOS, Linux, and Windows
Official site: https://ollama.com/
Installation
Download the ollama installer
Open https://ollama.com/ in a browser, go to the download page, and pick the installer for your OS. This step rarely causes any trouble; the site normally opens without a proxy.
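By default Ollama stores downloaded models under the user profile (on Windows, `C:\Users\<name>\.ollama\models`). If your system drive is tight on space, the storage location can be moved with the `OLLAMA_MODELS` environment variable before pulling large models; the path below is just an example:

```shell
# Point Ollama's model storage at a roomier location (example path).
# On Windows, set it persistently with:  setx OLLAMA_MODELS "D:\ollama\models"
export OLLAMA_MODELS="$HOME/ollama-models"
echo "$OLLAMA_MODELS"
```

The variable is read by the Ollama server on startup, so restart it after changing the value.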
Download the model
I chose the Llama3:8b model here; you can pick any model that Ollama supports as needed. Browse the list at https://ollama.com/library and choose one.
As the site instructs, select a model, then copy and run the command it shows:
ollama run llama3:8b
This downloads and runs the model in one step. In my case the download kept timing out, so I looked for an offline workaround and found one in the article "如何使用Ollama离线部署LLM大语言模型" (How to Deploy LLMs Offline with Ollama). That article builds its Modelfile for the qwen:0.5b model; following the same method, I created a corresponding Modelfile based on Ollama's official llama3 configuration:
FROM ./Meta-Llama-3-8B-Instruct.Q4_K_M.gguf
TEMPLATE """
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
{{ $.Tools }}
{{- end }}
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}
{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
Put the model file downloaded from the domestic mirror in the same directory as the Modelfile, then run the create command:
# Create the model from the offline files
ollama create llama3:8b -f Modelfile
# Confirm the model is listed
ollama list
# Start the model
ollama run llama3:8b
>>> hi
Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?
>>> who are you
I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I'm not a
human, but rather a computer program designed to simulate conversation, answer questions, and even generate text on my own.
I was trained on a massive dataset of text from the internet and can generate responses to a wide range of topics and questions. My
training data includes a large corpus of text, which allows me to understand and respond to natural language inputs.
Some of the things I can do include:
* Answering questions on a wide range of topics
* Generating text based on prompts or topics
* Chatting with you about your interests and hobbies
* Translating text from one language to another
Once it starts normally, you are in an interactive command-line session; type hi and the model replies, as shown above.
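Besides the interactive prompt, a running model can also be queried over Ollama's local HTTP API, which listens on port 11434 by default. A minimal sketch of the request:

```shell
# JSON request body for Ollama's /api/generate endpoint;
# "stream": false returns the whole answer in a single response.
PAYLOAD='{"model": "llama3:8b", "prompt": "hi", "stream": false}'
echo "$PAYLOAD"
# With the model running locally:
# curl -s http://localhost:11434/api/generate -d "$PAYLOAD"
```

This is handy for wiring the local model into scripts or other applications instead of typing into the REPL.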