Ollama exposes a complete RESTful API, which by default listens on http://localhost:11434. The main endpoints are:
1. Generate text (POST /api/generate):

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello, how are you?",
  "stream": false
}'
```
2. Chat (POST /api/chat):

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there!" },
    { "role": "user", "content": "How are you?" }
  ]
}'
```
3. List local models (GET /api/tags):

```bash
curl http://localhost:11434/api/tags
```
4. Show model information (POST /api/show):

```bash
curl http://localhost:11434/api/show -d '{ "name": "llama2" }'
```
5. Copy a model (POST /api/copy):

```bash
curl http://localhost:11434/api/copy -d '{ "source": "llama2", "destination": "my-llama2" }'
```
6. Delete a model (DELETE /api/delete):

```bash
curl -X DELETE http://localhost:11434/api/delete -d '{ "name": "llama2" }'
```
7. Pull a model (POST /api/pull):

```bash
curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
```
Streaming responses:
With "stream": true (the API's default), the server returns a stream of newline-delimited JSON objects, which is well suited to displaying generated content in real time; set "stream": false to receive a single JSON reply instead, as in the examples above.
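In the streamed format, each JSON line carries a partial "response" fragment, and the final line has "done": true. A minimal Python sketch of consuming such a stream; the `collect_stream` helper and the sample lines below are illustrative (the samples stand in for a live server response):

```python
import json

def collect_stream(lines):
    """Concatenate the partial 'response' chunks from an NDJSON stream."""
    parts = []
    for raw in lines:
        if not raw:
            continue  # skip keep-alive blank lines
        obj = json.loads(raw)
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)

# Simulated stream, mirroring the NDJSON lines Ollama emits:
sample = [
    '{"model":"llama2","response":"Hel","done":false}',
    '{"model":"llama2","response":"lo!","done":true}',
]
print(collect_stream(sample))  # Hello!

# Against a live server, the same helper works on the raw response lines:
# import requests
# r = requests.post("http://localhost:11434/api/generate",
#                   json={"model": "llama2", "prompt": "Hello", "stream": True},
#                   stream=True)
# print(collect_stream(r.iter_lines()))
```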
Python integration example:

```python
import requests

response = requests.post(
    'http://localhost:11434/api/generate',
    json={'model': 'llama2', 'prompt': 'Tell me a joke', 'stream': False},
)
print(response.json()['response'])
```
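The /api/chat endpoint can be wrapped the same way for multi-turn use: send the accumulated message history, then append the assistant's reply to it before the next turn. A hedged sketch of that bookkeeping; the helper names here are illustrative, not part of any library:

```python
def build_chat_request(messages, model="llama2"):
    """Payload for POST /api/chat; streaming disabled for a single JSON reply."""
    return {"model": model, "messages": messages, "stream": False}

def apply_reply(messages, response_json):
    """Append the assistant message from a /api/chat response to the history."""
    reply = response_json["message"]  # {"role": "assistant", "content": "..."}
    messages.append(reply)
    return reply["content"]

# Against a live server (requires the requests package and a running Ollama):
# import requests
# history = [{"role": "user", "content": "Hello!"}]
# r = requests.post("http://localhost:11434/api/chat",
#                   json=build_chat_request(history))
# print(apply_reply(history, r.json()))
```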