Deploying Large AI Models Locally: A Translation of the Ollama Documentation
Preface
This is the README.md from the Ollama GitHub project. The other documents it references have not been translated, but the content here is sufficient for deploying large models locally.
Ollama
Get up and running with large language models.
macOS
Download
Windows (preview)
Download
Linux
curl -fsSL https://ollama.com/install.sh | sh
Manual install instructions
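After the script finishes, you can sanity-check the install: the CLI prints its version, and the background server answers on port 11434 by default. This is an illustrative check rather than part of the official instructions, and the exact output will vary by version.

# print the installed client version
ollama --version

# the local API server listens on port 11434 by default
curl http://localhost:11434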
Docker
The official Ollama Docker image ollama/ollama is available on Docker Hub.
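As a sketch of typical CPU-only usage (the container and volume names follow the conventions shown on the image page; GPU setups need the extra flags described there):

# start the server in a container, keeping downloaded models in a named volume
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

# run a model inside the running container
docker exec -it ollama ollama run llama3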
Libraries
ollama-python
ollama-js
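Both clients are published to the usual package registries, so installation is normally a one-liner (package names as listed on the respective project pages):

# Python client
pip install ollama

# JavaScript/TypeScript client
npm install ollama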
Quickstart
To run and chat with Llama 3 locally:
ollama run llama3
Model library
Ollama supports a list of models available on ollama.com/library
Here are some example models that can be downloaded:
Model | Parameters | Size | Download |
---|---|---|---|
Llama 3 | 8B | 4.7GB | ollama run llama3 |
Llama 3 | 70B | 40GB | ollama run llama3:70b |
Phi 3 Mini | 3.8B | 2.3GB | ollama run phi3 |
Phi 3 Medium | 14B | 7.9GB | ollama run phi3:medium |
Gemma | 2B | 1.4GB | ollama run gemma:2b |
Gemma | 7B | 4.8GB | ollama run gemma:7b |
Mistral | 7B | 4.1GB | ollama run mistral |
Moondream 2 | 1.4B | 829MB | ollama run moondream |
Neural Chat | 7B | 4.1GB | ollama run neural-chat |
Starling | 7B | 4.1GB | ollama run starling-lm |
Code Llama | 7B | 3.8GB | ollama run codellama |
Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
LLaVA | 7B | 4.5GB | ollama run llava |
Solar | 10.7B | 6.1GB | ollama run solar |
Note: You should have at least 8 GB of RAM to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
Customize a model
Import from GGUF
Ollama supports importing GGUF models in the Modelfile:
Create a file named Modelfile, with a FROM instruction that points to the local file path of the model you want to import:
FROM ./vicuna-33b.Q4_0.gguf
Create the model in Ollama:
ollama create example -f Modelfile
Run the model:
ollama run example
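Putting the three steps together, a minimal end-to-end sketch looks like this (the GGUF file name is the placeholder from the example above; substitute the path to your own model):

# 1. write a Modelfile that points at the local GGUF file
cat > Modelfile <<'EOF'
FROM ./vicuna-33b.Q4_0.gguf
EOF

# 2. register the model with Ollama
ollama create example -f Modelfile

# 3. chat with it
ollama run example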
Import from PyTorch or Safetensors
See the import guide for more information on importing models. (Not included in this translation.)
Customize a prompt
Models from the Ollama library can be customized with a prompt. For example, to customize the llama3 model:
ollama pull llama3
Create a Modelfile:
FROM llama3

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """You are Mario from Super Mario Bros. Answer as Mario, the assistant, only."""
Next, create and run the model:
ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.
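A Modelfile can set other options besides temperature; the sketch below adds a context-window parameter (num_ctx is a documented PARAMETER, but the values, system message, and the model name custom-llama3 are only illustrative):

cat > Modelfile <<'EOF'
FROM llama3
# sampling temperature: lower is more deterministic
PARAMETER temperature 0.7
# context window size in tokens
PARAMETER num_ctx 4096
SYSTEM """You are a concise technical assistant."""
EOF

ollama create custom-llama3 -f ./Modelfile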
For more examples, see the examples directory. For more information on working with a Modelfile, see the Modelfile documentation. (Not included in this translation.)
CLI Reference
Create a model
ollama create
This command is used to create a model from a Modelfile.
ollama create mymodel -f ./Modelfile
Pull a model
ollama pull llama3
This command can also be used to update a local model. Only the diff will be pulled.
Remove a model
ollama rm llama3
Copy a model
ollama cp llama3 my-model
Multiline input
For multiline input, you can wrap text with """:
>>> """Hello,... world!... """I'm a basic program that prints the famous "Hello, world!" message to the console.
Multimodal models
>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.
Pass the prompt as an argument
$ ollama run llama3 "Summarize this file: $(cat README.md)"
Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
List models on your computer
ollama list
Start Ollama
ollama serve
Use ollama serve when you want to start Ollama without running the desktop application.
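The server also reads a few environment variables at startup; for example, OLLAMA_HOST controls the bind address. A minimal sketch (the address and port below only illustrate the syntax):

# listen on all interfaces instead of only localhost
OLLAMA_HOST=0.0.0.0:11434 ollama serve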
Building
See the developer guide.
Running local builds
Next, start the server:
./ollama serve
Finally, in a separate shell, run a model:
./ollama run llama3
REST API
Ollama has a REST API for running and managing models.
Generate a response
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?"
}'
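By default this endpoint streams the answer back as a series of JSON objects. If you want a single JSON response instead, the API accepts a stream flag (see the API documentation for the full option list):

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'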
Chat with a model
curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
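The messages array carries the whole conversation, so system and earlier assistant turns can be included as well. A sketch with a system message and a non-streaming reply (the message contents are made up for illustration):

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "system", "content": "Answer in one short sentence." },
    { "role": "user", "content": "why is the sky blue?" }
  ],
  "stream": false
}'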
See the API documentation for all endpoints.
Community Integrations
Web & Desktop
- Open WebUI
- Enchanted (macOS native)
- Hollama
- Lollms-Webui
- LibreChat
- Bionic GPT
- HTML UI
- Saddle
- Chatbot UI
- Chatbot UI v2
- Typescript UI
- Minimalistic React UI for Ollama Models
- Ollamac
- big-AGI
- Cheshire Cat assistant framework
- Amica
- chatd
- Ollama-SwiftUI
- Dify.AI
- MindMac
- NextJS Web Interface for Ollama
- Msty
- Chatbox
- WinForm Ollama Copilot
- NextChat with Get Started Doc
- Alpaca WebUI
- OllamaGUI
- OpenAOE
- Odin Runes
- LLM-X (Progressive Web App)
- AnythingLLM (Docker + MacOS/Windows/Linux native app)
- Ollama Basic Chat: Uses HyperDiv Reactive UI
- Ollama-chats RPG
- QA-Pilot (Chat with Code Repository)
- ChatOllama (Open Source Chatbot based on Ollama with Knowledge Bases)
- CRAG Ollama Chat (Simple Web Search with Corrective RAG)
- RAGFlow (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
- StreamDeploy (LLM Application Scaffold)
- chat (chat web app for teams)
- Lobe Chat with Integrating Doc
- Ollama RAG Chatbot (Local Chat with multiple PDFs using Ollama and RAG)
- BrainSoup (Flexible native client with RAG & multi-agent automation)
- macai (macOS client for Ollama, ChatGPT, and other compatible API back-ends)
- Olpaka (User-friendly Flutter Web App for Ollama)
- OllamaSpring (Ollama Client for macOS)
- LLocal.in (Easy to use Electron Desktop Client for Ollama)
Terminal
- oterm
- Ellama Emacs client
- Emacs client
- gen.nvim
- ollama.nvim
- ollero.nvim
- ollama-chat.nvim
- ogpt.nvim
- gptel Emacs client
- Oatmeal
- cmdh
- ooo
- shell-pilot
- tenere
- llm-ollama for Datasette's LLM CLI
- typechat-cli
- ShellOracle
- tlm
- podman-ollama
- gollama
Database
- MindsDB (Connects Ollama models with nearly 200 data platforms and apps)
- chromem-go with example
Package managers
- Pacman
- Helm Chart
- Guix channel
Libraries
- LangChain and LangChain.js with example
- LangChainGo with example
- LangChain4j with example
- LangChainRust with example
- LlamaIndex
- LiteLLM
- OllamaSharp for .NET
- Ollama for Ruby
- Ollama-rs for Rust
- Ollama4j for Java
- ModelFusion Typescript Library
- OllamaKit for Swift
- Ollama for Dart
- Ollama for Laravel
- LangChainDart
- Semantic Kernel - Python
- Haystack
- Elixir LangChain
- Ollama for R - rollama
- Ollama for R - ollama-r
- Ollama-ex for Elixir
- Ollama Connector for SAP ABAP
- Testcontainers
- Portkey
- PromptingTools.jl with an example
- LlamaScript
Mobile
- Enchanted
- Maid
Extensions & Plugins
- Raycast extension
- Discollama (Discord bot inside the Ollama discord channel)
- Continue
- Obsidian Ollama plugin
- Logseq Ollama plugin
- NotesOllama (Apple Notes Ollama plugin)
- Dagger Chatbot
- Discord AI Bot
- Ollama Telegram Bot
- Hass Ollama Conversation
- Rivet plugin
- Obsidian BMO Chatbot plugin
- Cliobot (Telegram bot with Ollama support)
- Copilot for Obsidian plugin
- Obsidian Local GPT plugin
- Open Interpreter
- Llama Coder (Copilot alternative using Ollama)
- Ollama Copilot (Proxy that allows you to use Ollama as a copilot like GitHub Copilot)
- twinny (Copilot and Copilot chat alternative using Ollama)
- Wingman-AI (Copilot code and chat alternative using Ollama and HuggingFace)
- Page Assist (Chrome Extension)
- AI Telegram Bot (Telegram bot using Ollama in backend)
- AI ST Completion (Sublime Text 4 AI assistant plugin with Ollama support)
- Discord-Ollama Chat Bot (Generalized TypeScript Discord Bot w/ Tuning Documentation)
- Discord AI chat/moderation bot (chat/moderation bot written in Python; uses Ollama to create personalities)
- Headless Ollama (Scripts to automatically install Ollama client & models on any OS for apps that depend on the Ollama server)
Supported backends
- llama.cpp project founded by Georgi Gerganov.