
How do you use SSH for automated operations? What are the common automation tools and scripts?

March 6, 21:32

SSH automation is a core part of modern DevOps and operations work. With it you can manage servers in bulk, automate deployments, and drive monitoring and alerting.

SSH Automation Tools

1. Ansible

Ansible is one of the most popular SSH automation tools; it needs no agent installed on the target servers.

Installation

```bash
# Ubuntu/Debian
sudo apt-get install ansible

# CentOS/RHEL
sudo yum install ansible

# macOS
brew install ansible

# pip
pip install ansible
```

Configuration

```ini
# /etc/ansible/hosts
[webservers]
web1.example.com
web2.example.com
web3.example.com

[dbservers]
db1.example.com
db2.example.com

[all:vars]
ansible_user=admin
ansible_ssh_private_key_file=~/.ssh/ansible_key
```

Usage examples

```bash
# Run a command
ansible webservers -m shell -a "uptime"

# Copy a file
ansible webservers -m copy -a "src=/tmp/file dest=/tmp/"

# Install a package
ansible all -m apt -a "name=nginx state=present"

# Run a playbook
ansible-playbook deploy.yml
```

Playbook example

```yaml
# deploy.yml
---
- hosts: webservers
  become: yes
  tasks:
    - name: Update apt cache
      apt:
        update_cache: yes

    - name: Install nginx
      apt:
        name: nginx
        state: present

    - name: Start nginx service
      service:
        name: nginx
        state: started
        enabled: yes

    - name: Copy configuration file
      copy:
        src: nginx.conf
        dest: /etc/nginx/nginx.conf
      notify: restart nginx

  handlers:
    - name: restart nginx
      service:
        name: nginx
        state: restarted
```

2. Fabric

Fabric is a Python library that simplifies SSH automation tasks.

Installation

```bash
pip install fabric
```

Usage example

```python
# fabfile.py
from fabric import Connection, task

@task
def deploy(c):
    """Deploy the application to the server."""
    with Connection('user@server') as conn:
        # Update code
        conn.run('git pull origin main')
        # Install dependencies
        conn.run('pip install -r requirements.txt')
        # Restart the service
        conn.sudo('systemctl restart myapp')

@task
def update(c, server):
    """Update a single server (the host is passed as a task argument)."""
    with Connection(f'user@{server}') as conn:
        conn.sudo('apt-get update && apt-get upgrade -y')

@task
def backup(c):
    """Back up the database and download the dump."""
    with Connection('user@server') as conn:
        conn.run('mysqldump -u root -p database > backup.sql')
        conn.get('backup.sql', './backups/backup.sql')
```

Running

```bash
# Run a single task
fab deploy

# Run a task with an argument (Fabric 2.x flag syntax)
fab update --server=web1.example.com

# Run multiple tasks
fab deploy backup
```

3. SSH Batch Scripts

A shell script is enough for simple SSH batch processing.

Example script

```bash
#!/bin/bash
# batch_ssh.sh

SERVERS=(
    "user@server1.example.com"
    "user@server2.example.com"
    "user@server3.example.com"
)

COMMAND="uptime"

for server in "${SERVERS[@]}"; do
    echo "=== $server ==="
    ssh "$server" "$COMMAND"
    echo ""
done
```

Advanced script

```bash
#!/bin/bash
# advanced_batch_ssh.sh

# Configuration (use $HOME: a quoted "~" is not expanded by the shell)
SERVERS_FILE="servers.txt"
SSH_KEY="$HOME/.ssh/batch_key"
SSH_USER="admin"
TIMEOUT=10

# Run a command on one server
execute_command() {
    local server=$1
    local command=$2
    echo "Executing on $server: $command"
    if timeout "$TIMEOUT" ssh -i "$SSH_KEY" -o StrictHostKeyChecking=no \
        "$SSH_USER@$server" "$command"; then
        echo "Success"
    else
        echo "Failed"
    fi
}

# Run a command on all servers in parallel
parallel_execute() {
    local command=$1
    while read -r server; do
        execute_command "$server" "$command" &
    done < "$SERVERS_FILE"
    wait
}

# Main
case "$1" in
    update)
        parallel_execute "apt-get update && apt-get upgrade -y"
        ;;
    restart)
        parallel_execute "systemctl restart nginx"
        ;;
    status)
        parallel_execute "systemctl status nginx"
        ;;
    *)
        echo "Usage: $0 {update|restart|status}"
        exit 1
        ;;
esac
```

4. Pexpect

Pexpect is a Python module for automating interactive programs.

Installation

```bash
pip install pexpect
```

Usage example

```python
import pexpect

def ssh_interactive(host, user, password, command):
    """Automate an interactive SSH session."""
    ssh = pexpect.spawn(f'ssh {user}@{host}')

    # Answer the password prompt
    ssh.expect('password:')
    ssh.sendline(password)

    # Wait for the shell prompt, then run the command
    # (expect() takes a regex, so a literal '$' must be escaped)
    ssh.expect(r'\$')
    ssh.sendline(command)

    # Capture the output
    ssh.expect(r'\$')
    output = ssh.before.decode()
    print(output)

    ssh.close()

# Usage
ssh_interactive('server.example.com', 'user', 'password', 'ls -la')
```

Automation Scenarios

Scenario 1: Batch deployment

```bash
#!/bin/bash
# deploy.sh

APP_DIR="/var/www/myapp"
REPO="https://github.com/user/myapp.git"
BRANCH="main"

# Server list
SERVERS=(
    "web1.example.com"
    "web2.example.com"
    "web3.example.com"
)

for server in "${SERVERS[@]}"; do
    echo "Deploying to $server..."
    ssh "admin@$server" << EOF
cd $APP_DIR
git pull origin $BRANCH
npm install
npm run build
pm2 restart myapp
EOF
    echo "Deployment to $server completed"
done
```

Scenario 2: Batch monitoring

```python
#!/usr/bin/env python3
# monitor.py
import time

import paramiko

SERVERS = [
    {'host': 'server1.example.com', 'user': 'admin'},
    {'host': 'server2.example.com', 'user': 'admin'},
    {'host': 'server3.example.com', 'user': 'admin'},
]

def check_server(server):
    """Check server status."""
    ssh = paramiko.SSHClient()
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        ssh.connect(server['host'], username=server['user'])

        # Check CPU
        stdin, stdout, stderr = ssh.exec_command('top -bn1 | grep "Cpu(s)"')
        cpu_usage = stdout.read().decode()

        # Check memory
        stdin, stdout, stderr = ssh.exec_command('free -m')
        memory = stdout.read().decode()

        # Check disk
        stdin, stdout, stderr = ssh.exec_command('df -h')
        disk = stdout.read().decode()

        print(f"=== {server['host']} ===")
        print(f"CPU: {cpu_usage.strip()}")
        print(f"Memory: {memory.strip()}")
        print(f"Disk: {disk.strip()}")
        print("")
    except Exception as e:
        print(f"Error connecting to {server['host']}: {e}")
    finally:
        ssh.close()

# Main loop
while True:
    for server in SERVERS:
        check_server(server)
    time.sleep(300)  # check every 5 minutes

Scenario 3: Automated backups

```bash
#!/bin/bash
# backup.sh

BACKUP_DIR="/backups"
DATE=$(date +%Y%m%d)
RETENTION_DAYS=7

SERVERS=(
    "db1.example.com"
    "db2.example.com"
)

for server in "${SERVERS[@]}"; do
    echo "Backing up $server..."

    # Create the backup directory
    mkdir -p "$BACKUP_DIR/$server"

    # Dump the database
    ssh "admin@$server" "mysqldump -u root -p'password' database | gzip" > \
        "$BACKUP_DIR/$server/database_$DATE.sql.gz"

    # Back up files
    rsync -avz --delete "admin@$server:/var/www/" "$BACKUP_DIR/$server/files/"

    echo "Backup of $server completed"
done

# Remove old backups
find "$BACKUP_DIR" -name "*.sql.gz" -mtime +"$RETENTION_DAYS" -delete
```

Best Practices

1. Security

```bash
# Use key-based authentication; disable password authentication
ssh-keygen -t ed25519 -f ~/.ssh/automation_key
ssh-copy-id -i ~/.ssh/automation_key.pub user@server

# Restrict what the key can do
# ~/.ssh/authorized_keys
command="/usr/local/bin/automation-wrapper.sh",no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAI...
```

2. Error handling

```bash
#!/bin/bash
# Error-handling example

set -e          # exit immediately on error
set -u          # error on undefined variables
set -o pipefail # fail a pipeline if any command in it fails

# Error-handling function
error_exit() {
    echo "Error: $1" >&2
    exit 1
}

# Usage
ssh user@server "command" || error_exit "SSH command failed"
```

3. Logging

```bash
#!/bin/bash
# Logging example

LOG_FILE="/var/log/automation.log"

log() {
    local message=$1
    echo "[$(date '+%Y-%m-%d %H:%M:%S')] $message" | tee -a "$LOG_FILE"
}

# Usage
log "Starting deployment"
ssh user@server "command"
log "Deployment completed"
```

4. Configuration management

```ini
# Use a configuration file
# config.ini
[general]
user=admin
key=~/.ssh/automation_key
timeout=30

[servers]
web1=web1.example.com
web2=web2.example.com
db1=db1.example.com
```

5. Idempotency

Make sure automation tasks can be run repeatedly without side effects.

```bash
#!/bin/bash
# Idempotency example

# Install and start nginx only if it is not already running
if ! systemctl is-active --quiet nginx; then
    apt-get install -y nginx
fi

# Update the configuration only if it has actually changed
if ! diff -q nginx.conf /etc/nginx/nginx.conf > /dev/null; then
    cp nginx.conf /etc/nginx/nginx.conf
    systemctl reload nginx
fi
```

Monitoring and Alerting

1. Automated monitoring

```bash
#!/bin/bash
# Monitoring script

ALERT_EMAIL="admin@example.com"
ALERT_SUBJECT="SSH Automation Alert"

check_service() {
    local server=$1
    local service=$2
    if ! ssh "admin@$server" "systemctl is-active --quiet $service"; then
        send_alert "$service is down on $server"
    fi
}

send_alert() {
    local message=$1
    echo "$message" | mail -s "$ALERT_SUBJECT" "$ALERT_EMAIL"
}

# Main
for server in web1 web2 web3; do
    check_service "$server.example.com" nginx
    check_service "$server.example.com" mysql
done
```

2. Integrating monitoring tools

```yaml
# Prometheus + Grafana
# prometheus.yml
scrape_configs:
  - job_name: 'ssh_automation'
    static_configs:
      - targets: ['localhost:9090']
```

SSH automation can greatly improve operational efficiency, but security, reliability, and maintainability all need deliberate attention.

Tags: SSH