Environment
OS:
Windows 10
Model:
Llama3:8b. This guide runs on a personal laptop, so the lighter 8B model is used; if your machine has higher specs, you can use the 70B model instead.
Because the Hugging Face website is normally unreachable without a proxy, downloading from a domestic mirror is recommended:
Official repo: https://huggingface.co/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/tree/main
Domestic mirror: https://hf-mirror.com/QuantFactory/Meta-Llama-3-8B-Instruct-GGUF/tree/main
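The GGUF file can be fetched from the mirror directly on the command line; a minimal sketch, assuming the Q4_K_M file sits at the path shown in the repository listing above:

```shell
# Build the mirror download URL for the quantized model
# (repo and filename taken from the listing above; adjust if the layout changes).
REPO="QuantFactory/Meta-Llama-3-8B-Instruct-GGUF"
FILE="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf"
URL="https://hf-mirror.com/${REPO}/resolve/main/${FILE}"
echo "$URL"
# Download with resume support (the file is several GB):
# curl -L -C - -o "$FILE" "$URL"
```

The `-C -` flag lets curl resume a partial download, which helps on flaky connections.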
Installation tool:
ollama, a very convenient deployment tool that supports macOS, Linux, and Windows
Official site: https://ollama.com/
Installation
Download the ollama installer
Open https://ollama.com/ in a browser, go to the download page, and pick the installer for your OS. This step rarely causes any trouble; the site normally opens without a proxy.
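By default Ollama stores downloaded models under the user profile (on Windows, `C:\Users\<name>\.ollama\models`). If your system drive is tight on space, the storage location can be moved with the `OLLAMA_MODELS` environment variable before pulling large models; the path below is just an example:

```shell
# Point Ollama's model storage at a roomier location (example path).
# On Windows, set it persistently with:  setx OLLAMA_MODELS "D:\ollama\models"
export OLLAMA_MODELS="$HOME/ollama-models"
echo "$OLLAMA_MODELS"
```

The variable is read by the Ollama server on startup, so restart it after changing the value.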
Download the model
I chose the Llama3:8b model here; you can pick any model that Ollama supports as needed. Browse the list at https://ollama.com/library and choose one.
As the site instructs, select a model, then copy and run the command it shows:
ollama run llama3:8b
This downloads and runs the model in one step. In my case the download kept timing out, so I looked for an offline workaround and found one in the article "如何使用Ollama离线部署LLM大语言模型" (How to Deploy LLMs Offline with Ollama). That article builds its Modelfile for the qwen:0.5b model; following the same method, I created a corresponding Modelfile based on Ollama's official llama3 configuration:
FROM ./Meta-Llama-3-8B-Instruct.Q4_K_M.gguf
TEMPLATE """
{{ if .Messages }}
{{- if or .System .Tools }}<|start_header_id|>system<|end_header_id|>
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}
You are a helpful assistant with tool calling capabilities. When you receive a tool call response, use the output to format an answer to the original user question.
{{- end }}
{{- end }}<|eot_id|>
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 }}
{{- if eq .Role "user" }}<|start_header_id|>user<|end_header_id|>
{{- if and $.Tools $last }}
Given the following functions, please respond with a JSON for a function call with its proper arguments that best answers the given prompt.
Respond in the format {"name": function name, "parameters": dictionary of argument name and its value}. Do not use variables.
{{ $.Tools }}
{{- end }}
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- else if eq .Role "assistant" }}<|start_header_id|>assistant<|end_header_id|>
{{- if .ToolCalls }}
{{- range .ToolCalls }}{"name": "{{ .Function.Name }}", "parameters": {{ .Function.Arguments }}}{{ end }}
{{- else }}
{{ .Content }}{{ if not $last }}<|eot_id|>{{ end }}
{{- end }}
{{- else if eq .Role "tool" }}<|start_header_id|>ipython<|end_header_id|>
{{ .Content }}<|eot_id|>{{ if $last }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}
{{- end }}
{{- end }}
{{- else }}
{{- if .System }}<|start_header_id|>system<|end_header_id|>
{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>
{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>
{{ end }}{{ .Response }}{{ if .Response }}<|eot_id|>{{ end }}
"""
PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"
Put the model file downloaded from the domestic mirror in the same directory as the Modelfile, then run the create command:
# Create the model from the offline files
ollama create llama3:8b -f Modelfile
# Confirm the model is listed
ollama list
# Start the model
ollama run llama3:8b
>>> hi
Hi! It's nice to meet you. Is there something I can help you with or would you like to chat?
>>> who are you
I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I'm not a
human, but rather a computer program designed to simulate conversation, answer questions, and even generate text on my own.
I was trained on a massive dataset of text from the internet and can generate responses to a wide range of topics and questions. My
training data includes a large corpus of text, which allows me to understand and respond to natural language inputs.
Some of the things I can do include:
* Answering questions on a wide range of topics
* Generating text based on prompts or topics
* Chatting with you about your interests and hobbies
* Translating text from one language to another
Once it starts normally, you are in an interactive command-line session; type hi and the model replies, as shown above.
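Besides the interactive prompt, a running model can also be queried over Ollama's local HTTP API, which listens on port 11434 by default. A minimal sketch of the request:

```shell
# JSON request body for Ollama's /api/generate endpoint;
# "stream": false returns the whole answer in a single response.
PAYLOAD='{"model": "llama3:8b", "prompt": "hi", "stream": false}'
echo "$PAYLOAD"
# With the model running locally:
# curl -s http://localhost:11434/api/generate -d "$PAYLOAD"
```

This is handy for wiring the local model into scripts or other applications instead of typing into the REPL.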