Deploying Large AI Models Locally: A Translation of the Ollama Documentation

红雨随心翻作浪 2024-06-13 16:31:03

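Preface

This is a translation of the README.md from the Ollama GitHub project. The other documents it links to are not translated, but the README alone is enough to deploy a large model locally.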


Ollama

Get up and running with large language models.

macOS

Download

Windows preview

Download

Linux

curl -fsSL https://ollama.com/install.sh | sh

Manual install instructions

Docker

The official Ollama Docker image ollama/ollama is available on Docker Hub.

Libraries

ollama-python
ollama-js

Quickstart

To run and chat with Llama 3 locally:

ollama run llama3

Model library

The full list of models that Ollama supports is available at ollama.com/library

Here are some example models that can be downloaded:

Model               Parameters  Size   Download
Llama 3             8B          4.7GB  ollama run llama3
Llama 3             70B         40GB   ollama run llama3:70b
Phi 3 Mini          3.8B        2.3GB  ollama run phi3
Phi 3 Medium        14B         7.9GB  ollama run phi3:medium
Gemma               2B          1.4GB  ollama run gemma:2b
Gemma               7B          4.8GB  ollama run gemma:7b
Mistral             7B          4.1GB  ollama run mistral
Moondream 2         1.4B        829MB  ollama run moondream
Neural Chat         7B          4.1GB  ollama run neural-chat
Starling            7B          4.1GB  ollama run starling-lm
Code Llama          7B          3.8GB  ollama run codellama
Llama 2 Uncensored  7B          3.8GB  ollama run llama2-uncensored
LLaVA               7B          4.5GB  ollama run llava
Solar               10.7B       6.1GB  ollama run solar

Note: You need at least 8 GB of RAM to run the 7B models, 16 GB to run the 13B models, and 32 GB to run the 33B models.
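The memory guidance in the note can be turned into a quick lookup before pulling a model. The sketch below is only an illustration: the name `min_ram_gb` is hypothetical, the three thresholds are exactly the ones in the note, and rounding an in-between size (such as 8B) up to the next tier is an assumption rather than something the note states.

```python
from typing import Optional

# (largest model size in billions of parameters, RAM in GB), from the note.
RAM_GUIDE = [(7, 8), (13, 16), (33, 32)]


def min_ram_gb(params_billions: float) -> Optional[int]:
    """Return the RAM tier from the note that covers a model of this size.

    Assumption: sizes between two tiers are rounded up to the larger tier.
    """
    for max_params, ram_gb in RAM_GUIDE:
        if params_billions <= max_params:
            return ram_gb
    return None  # the note gives no figure for models larger than 33B


for size in (7, 13, 33, 70):
    print(size, min_ram_gb(size))
```

A `None` result only means the note is silent on that size, not that the model cannot run.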

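Customize a model

Import from GGUF

Ollama supports importing GGUF models through a Modelfile:

Create a file named Modelfile containing a FROM instruction with the local file path of the model you want to import.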

FROM ./vicuna-33b.Q4_0.gguf

Create the model in Ollama

ollama create example -f Modelfile

Run the model

ollama run example
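The three steps above (write a Modelfile, create the model, run it) can be scripted. This is a minimal sketch, assuming the `ollama` CLI is on your PATH and the GGUF file exists locally; the helper names `gguf_modelfile_text`, `gguf_commands`, and `import_gguf` are hypothetical and not part of Ollama.

```python
import subprocess
from pathlib import Path
from typing import List


def gguf_modelfile_text(gguf_path: str) -> str:
    """Return a Modelfile whose only instruction is FROM <local GGUF path>."""
    return f"FROM {gguf_path}\n"


def gguf_commands(name: str, modelfile: str) -> List[List[str]]:
    """Build the `ollama create` and `ollama run` commands as argv lists."""
    return [
        ["ollama", "create", name, "-f", modelfile],
        ["ollama", "run", name],
    ]


def import_gguf(name: str, gguf_path: str, modelfile: str = "Modelfile") -> None:
    """Write the Modelfile, then create and run the model.

    Not called here: it needs the ollama CLI and the GGUF file to exist.
    """
    Path(modelfile).write_text(gguf_modelfile_text(gguf_path), encoding="utf-8")
    for command in gguf_commands(name, modelfile):
        subprocess.run(command, check=True)
```

Calling `import_gguf("example", "./vicuna-33b.Q4_0.gguf")` from a terminal would reproduce the commands above; `ollama run` then takes over the terminal for the interactive session.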

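Import from PyTorch or Safetensors

See the guide on importing models for more information. (No Chinese version of that guide is available.)

Customize a prompt

Models from the Ollama library can be customized with a prompt. For example, to customize the llama3 model: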

ollama pull llama3

Create a Modelfile:

FROM llama3

# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

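Next, create and run the model:

ollama create mario -f ./Modelfile
ollama run mario
>>> hi
Hello! It's your friend Mario.

For more examples, see the examples directory. For more information on working with a Modelfile, see the Modelfile documentation. (Neither is translated here.)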
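If you create several customized models, it can help to generate the Modelfile text from a template. The sketch below uses only the three instructions shown in this section (FROM, PARAMETER temperature, SYSTEM); the function name `render_modelfile` is hypothetical, and rejecting a prompt that contains `"""` is a precaution of this sketch, not a documented Ollama rule.

```python
def render_modelfile(base: str, system: str, temperature: float = 1) -> str:
    """Render a Modelfile with FROM, PARAMETER temperature, and SYSTEM."""
    if '"""' in system:
        # Precaution: the prompt would otherwise end the SYSTEM block early.
        raise ValueError('system prompt must not contain """')
    return (
        f"FROM {base}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """\n{system}\n"""\n'
    )


text = render_modelfile(
    "llama3",
    "You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.",
)
print(text)
```

Save the printed text as `Modelfile` and pass it to `ollama create` as shown above.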

Command reference

Create a model

ollama create is used to create a model from a Modelfile.

ollama create mymodel -f ./Modelfile

Pull a model

ollama pull llama3

This command can also be used to update a local model. Only the parts that differ are downloaded.

Remove a model

ollama rm llama3

Copy a model

ollama cp llama3 my-model

Multiline input

For multiline input, wrap the text in """:

>>> """Hello,
... world!
... """
I'm a basic program that prints the famous "Hello, world!" message to the console.

Multimodal models

>>> What's in this image? /Users/jmorgan/Desktop/smile.png
The image features a yellow smiley face, which is likely the central focus of the picture.

Pass the prompt as an argument

$ ollama run llama3 "Summarize this file: $(cat README.md)"
 Ollama is a lightweight, extensible framework for building and running language models on the local machine. It provides a simple API for creating, running, and managing models, as well as a library of pre-built models that can be easily used in a variety of applications.
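The same one-shot call can be made from a script instead of a shell. This is a minimal sketch, assuming the `ollama` CLI is on your PATH and the model has already been pulled; `build_run_command` and `summarize_file` are hypothetical helper names. Passing the command as a list avoids the shell quoting that `$(cat README.md)` relies on.

```python
import subprocess
from pathlib import Path
from typing import List


def build_run_command(model: str, instruction: str, file_text: str) -> List[str]:
    """Build `ollama run <model> "<instruction> <file contents>"` as an argv list."""
    return ["ollama", "run", model, f"{instruction} {file_text}"]


def summarize_file(model: str, path: str) -> str:
    """Run the model once on a file's contents and return what it printed.

    Not called here: it needs the ollama CLI and a pulled model.
    """
    file_text = Path(path).read_text(encoding="utf-8")
    command = build_run_command(model, "Summarize this file:", file_text)
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return result.stdout
```

For example, `summarize_file("llama3", "README.md")` mirrors the shell command above.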

List models on your computer

ollama list

Start Ollama

ollama serve is used when you want to start Ollama without running the desktop application.

Building

See the developer guide.

Running local builds

Next, start the server:

./ollama serve

Finally, in a separate shell, run a model:

./ollama run llama3

REST API

Ollama has a REST API for running and managing models.

Generate a response

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt":"Why is the sky blue?"
}'
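The curl call above can be reproduced with Python's standard library. This is a sketch under stated assumptions, not the official client (that is ollama-python): the helper names are hypothetical, and it assumes the server's default behavior of streaming the answer as one JSON object per line, each carrying a "response" fragment, with "done": true on the last one.

```python
import json
import urllib.request
from typing import Iterable

OLLAMA_URL = "http://localhost:11434"  # address used in the curl example


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def join_stream(lines: Iterable[bytes]) -> str:
    """Concatenate the "response" fragments of a line-delimited JSON stream.

    Assumption: each line is a JSON object; the final one has "done": true.
    """
    parts = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        chunk = json.loads(line)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def generate(model: str, prompt: str) -> str:
    """Send the request to a running Ollama server and return the full text.

    Not called here: it needs `ollama serve` (or the desktop app) running.
    """
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return join_stream(resp)
```

With the server running, `generate("llama3", "Why is the sky blue?")` returns the whole answer as one string.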

Chat with a model

curl http://localhost:11434/api/chat -d '{
  "model": "llama3",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
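Because the chat endpoint takes a list of messages, a multi-turn conversation amounts to resending the earlier turns along with the new one. The sketch below only builds the request body; the helper names are hypothetical, and storing the model's replies under the "assistant" role is an assumption that the curl example does not show.

```python
import json
from typing import Dict, List

Message = Dict[str, str]


def chat_body(model: str, history: List[Message], user_text: str) -> bytes:
    """Build the JSON body for /api/chat: earlier turns plus the new user turn."""
    messages = history + [{"role": "user", "content": user_text}]
    return json.dumps({"model": model, "messages": messages}).encode("utf-8")


def extend_history(history: List[Message], user_text: str, reply: str) -> List[Message]:
    """Return a new history with the user turn and the model's reply appended.

    Assumption: replies are recorded under the "assistant" role.
    """
    return history + [
        {"role": "user", "content": user_text},
        {"role": "assistant", "content": reply},
    ]


history: List[Message] = []
print(chat_body("llama3", history, "why is the sky blue?").decode("utf-8"))
```

POST the returned bytes to http://localhost:11434/api/chat, then pass the reply text to `extend_history` before building the next turn.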

See the API documentation for all endpoints.

Community integrations

Web & Desktop

Open WebUI
Enchanted (macOS native)
Hollama
Lollms-Webui
LibreChat
Bionic GPT
HTML UI
Saddle
Chatbot UI
Chatbot UI v2
Typescript UI
Minimalistic React UI for Ollama Models
Ollamac
big-AGI
Cheshire Cat assistant framework
Amica
chatd
Ollama-SwiftUI
Dify.AI
MindMac
NextJS Web Interface for Ollama
Msty
Chatbox
WinForm Ollama Copilot
NextChat with Get Started Doc
Alpaca WebUI
OllamaGUI
OpenAOE
Odin Runes
LLM-X (Progressive Web App)
AnythingLLM (Docker + MacOs/Windows/Linux native app)
Ollama Basic Chat: Uses HyperDiv Reactive UI
Ollama-chats RPG
QA-Pilot (Chat with Code Repository)
ChatOllama (Open Source Chatbot based on Ollama with Knowledge Bases)
CRAG Ollama Chat (Simple Web Search with Corrective RAG)
RAGFlow (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
StreamDeploy (LLM Application Scaffold)
chat (chat web app for teams)
Lobe Chat with Integrating Doc
Ollama RAG Chatbot (Local Chat with multiple PDFs using Ollama and RAG)
BrainSoup (Flexible native client with RAG & multi-agent automation)
macai (macOS client for Ollama, ChatGPT, and other compatible API back-ends)
Olpaka (User-friendly Flutter Web App for Ollama)
OllamaSpring (Ollama Client for macOS)
LLocal.in (Easy to use Electron Desktop Client for Ollama)

Terminal

oterm
Ellama Emacs client
Emacs client
gen.nvim
ollama.nvim
ollero.nvim
ollama-chat.nvim
ogpt.nvim
gptel Emacs client
Oatmeal
cmdh
ooo
shell-pilot
tenere
llm-ollama for Datasette’s LLM CLI.
typechat-cli
ShellOracle
tlm
podman-ollama
gollama

Database

MindsDB (Connects Ollama models with nearly 200 data platforms and apps)
chromem-go with example

Package managers

Pacman
Helm Chart
Guix channel

Libraries

LangChain and LangChain.js with example
LangChainGo with example
LangChain4j with example
LangChainRust with example
LlamaIndex
LiteLLM
OllamaSharp for .NET
Ollama for Ruby
Ollama-rs for Rust
Ollama4j for Java
ModelFusion Typescript Library
OllamaKit for Swift
Ollama for Dart
Ollama for Laravel
LangChainDart
Semantic Kernel - Python
Haystack
Elixir LangChain
Ollama for R - rollama
Ollama for R - ollama-r
Ollama-ex for Elixir
Ollama Connector for SAP ABAP
Testcontainers
Portkey
PromptingTools.jl with an example
LlamaScript

Mobile

Enchanted
Maid

Extensions & Plugins

Raycast extension
Discollama (Discord bot inside the Ollama discord channel)
Continue
Obsidian Ollama plugin
Logseq Ollama plugin
NotesOllama (Apple Notes Ollama plugin)
Dagger Chatbot
Discord AI Bot
Ollama Telegram Bot
Hass Ollama Conversation
Rivet plugin
Obsidian BMO Chatbot plugin
Cliobot (Telegram bot with Ollama support)
Copilot for Obsidian plugin
Obsidian Local GPT plugin
Open Interpreter
Llama Coder (Copilot alternative using Ollama)
Ollama Copilot (Proxy that allows you to use ollama as a copilot like Github copilot)
twinny (Copilot and Copilot chat alternative using Ollama)
Wingman-AI (Copilot code and chat alternative using Ollama and HuggingFace)
Page Assist (Chrome Extension)
AI Telegram Bot (Telegram bot using Ollama in backend)
AI ST Completion (Sublime Text 4 AI assistant plugin with Ollama support)
Discord-Ollama Chat Bot (Generalized TypeScript Discord Bot w/ Tuning Documentation)
Discord AI chat/moderation bot Chat/moderation bot written in python. Uses Ollama to create personalities.
Headless Ollama (Scripts to automatically install ollama client & models on any OS for apps that depends on ollama server)

Supported backends

llama.cpp project founded by Georgi Gerganov.

