Ollama exposes a complete RESTful API, which by default listens on http://localhost:11434. The main endpoints are:
1. Generate text (POST /api/generate):

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello, how are you?",
  "stream": false
}'
```
2. Chat (POST /api/chat):

```bash
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    { "role": "user", "content": "Hello!" },
    { "role": "assistant", "content": "Hi there!" },
    { "role": "user", "content": "How are you?" }
  ]
}'
```
3. List local models (GET /api/tags):

```bash
curl http://localhost:11434/api/tags
```
4. Show model information (POST /api/show):

```bash
curl http://localhost:11434/api/show -d '{ "name": "llama2" }'
```
5. Copy a model (POST /api/copy):

```bash
curl http://localhost:11434/api/copy -d '{ "source": "llama2", "destination": "my-llama2" }'
```
6. Delete a model (DELETE /api/delete):

```bash
curl -X DELETE http://localhost:11434/api/delete -d '{ "name": "llama2" }'
```
7. Pull a model (POST /api/pull):

```bash
curl http://localhost:11434/api/pull -d '{ "name": "llama2" }'
```
Streaming responses:
With "stream": true (the API's default), the server returns a stream of newline-delimited JSON objects, which is well suited to displaying generated content in real time; set "stream": false to receive a single JSON reply instead, as in the examples above.
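In the streamed format, each JSON line carries a partial "response" fragment, and the final line has "done": true. A minimal Python sketch of consuming such a stream; the `collect_stream` helper and the sample lines below are illustrative (the samples stand in for a live server response):

```python
import json

def collect_stream(lines):
    """Concatenate the partial 'response' chunks from an NDJSON stream."""
    parts = []
    for raw in lines:
        if not raw:
            continue  # skip keep-alive blank lines
        obj = json.loads(raw)
        parts.append(obj.get("response", ""))
        if obj.get("done"):
            break
    return "".join(parts)

# Simulated stream, mirroring the NDJSON lines Ollama emits:
sample = [
    '{"model":"llama2","response":"Hel","done":false}',
    '{"model":"llama2","response":"lo!","done":true}',
]
print(collect_stream(sample))  # Hello!

# Against a live server, the same helper works on the raw response lines:
# import requests
# r = requests.post("http://localhost:11434/api/generate",
#                   json={"model": "llama2", "prompt": "Hello", "stream": True},
#                   stream=True)
# print(collect_stream(r.iter_lines()))
```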
Python integration example:

```python
import requests

response = requests.post(
    'http://localhost:11434/api/generate',
    json={'model': 'llama2', 'prompt': 'Tell me a joke', 'stream': False},
)
print(response.json()['response'])
```
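The /api/chat endpoint can be wrapped the same way for multi-turn use: send the accumulated message history, then append the assistant's reply to it before the next turn. A hedged sketch of that bookkeeping; the helper names here are illustrative, not part of any library:

```python
def build_chat_request(messages, model="llama2"):
    """Payload for POST /api/chat; streaming disabled for a single JSON reply."""
    return {"model": model, "messages": messages, "stream": False}

def apply_reply(messages, response_json):
    """Append the assistant message from a /api/chat response to the history."""
    reply = response_json["message"]  # {"role": "assistant", "content": "..."}
    messages.append(reply)
    return reply["content"]

# Against a live server (requires the requests package and a running Ollama):
# import requests
# history = [{"role": "user", "content": "Hello!"}]
# r = requests.post("http://localhost:11434/api/chat",
#                   json=build_chat_request(history))
# print(apply_reply(history, r.json()))
```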