May 30, 00:10
What are the common text processing tools in Shell scripts? How to use grep, sed, awk, and cut?
Common text processing tools in Shell scripts include grep, sed, awk, and cut.
grep - Text Search Tool
Basic Usage
```bash
# Search for text in a file
grep "pattern" file.txt

# Search multiple files
grep "pattern" file1.txt file2.txt

# Recursive search in a directory
grep -r "pattern" /path/to/directory

# Case-insensitive search
grep -i "pattern" file.txt

# Show line numbers
grep -n "pattern" file.txt

# Invert match (exclude)
grep -v "pattern" file.txt

# Show only the names of files that contain a match
grep -l "pattern" *.txt

# Count matching lines
grep -c "pattern" file.txt
```
Regular Expressions
```bash
# Match start of line
grep "^start" file.txt

# Match end of line
grep "end$" file.txt

# Match digits
grep "[0-9]" file.txt

# Match a specific count
grep "a\{3\}" file.txt    # Match three consecutive a's

# Use extended regular expressions
grep -E "pattern1|pattern2" file.txt
```
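The difference between the basic and extended syntax is easy to check on a throwaway file. The following is a minimal sketch; the temporary file and its sample words are made up for illustration.

```bash
# Sample input (hypothetical data, for illustration only).
tmp=$(mktemp)
printf '%s\n' 'aa' 'aaa' 'cat' 'dog' 'bird' > "$tmp"

# Basic regular expressions (the default): the braces must be escaped.
bre_count=$(grep -c 'a\{3\}' "$tmp")

# Extended regular expressions (-E): braces, parentheses and | work unescaped.
ere_count=$(grep -c -E 'a{3}' "$tmp")
alt_matches=$(grep -E '^(cat|dog)$' "$tmp")

echo "BRE count: $bre_count, ERE count: $ere_count"
echo "$alt_matches"

rm -f "$tmp"
```

Both counts refer to the same line, so the two spellings are interchangeable here; `-E` mainly saves backslashes once a pattern uses grouping or alternation.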
Practical Applications
```bash
# Find a process
ps aux | grep "nginx"

# Find errors in logs
grep "ERROR" /var/log/syslog

# Find files containing specific content
grep -r "TODO" ./src

# Count lines in each Python file
grep -c "^" *.py
```
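To see how the counting, case-insensitive, line-number and invert options behave together, the sketch below runs them against a small log file. The file and its contents are hypothetical and exist only for this demonstration.

```bash
# Hypothetical log file, created only for this demonstration.
log=$(mktemp)
cat > "$log" <<'EOF'
INFO  service started
ERROR disk full
warn  cache miss
error retry failed
INFO  service stopped
EOF

# -c counts matching lines; -i ignores case, so ERROR and error both match.
error_count=$(grep -ci 'error' "$log")

# -n prefixes each match with its line number.
first_error=$(grep -n 'ERROR' "$log")

# -v keeps only the lines that do NOT match; combined with -c it counts them.
non_info=$(grep -vc '^INFO' "$log")

echo "errors: $error_count"
echo "match: $first_error"
echo "non-INFO lines: $non_info"

rm -f "$log"
```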
sed - Stream Editor
Basic Usage
```bash
# Replace text (first occurrence on each line)
sed 's/old/new/' file.txt

# Global replacement (every occurrence on each line)
sed 's/old/new/g' file.txt

# Delete lines
sed '3d' file.txt          # Delete line 3
sed '/pattern/d' file.txt  # Delete matching lines

# Print specific lines
sed -n '5p' file.txt       # Print line 5
sed -n '1,5p' file.txt     # Print lines 1-5

# Insert and append (one-line form supported by GNU sed)
sed '2i\new line' file.txt # Insert before line 2
sed '2a\new line' file.txt # Append after line 2
```
Advanced Usage
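```bash
# Use regular expressions (\+ is a GNU extension; -E with + is more portable)
sed 's/[0-9]\+//g' file.txt

# Multiple replacements
sed -e 's/old1/new1/g' -e 's/old2/new2/g' file.txt

# In-place editing (modifies the original file; BSD/macOS sed needs -i '')
sed -i 's/old/new/g' file.txt

# Edit with backup (original saved as file.txt.bak)
sed -i.bak 's/old/new/g' file.txt

# Use variables (double quotes let the shell expand $var)
var="pattern"
sed "s/$var/replacement/g" file.txt
```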
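Substituting shell variables into a sed expression breaks when a value contains the `/` delimiter. One common workaround, sketched here with hypothetical path values, is to pick a delimiter that does not occur in the data.

```bash
# Hypothetical values: both contain "/", which would break s/old/new/.
old_path="/usr/local/app"
new_path="/opt/app"

# Using | as the delimiter avoids escaping the slashes in the variables.
# Double quotes let the shell expand the variables before sed runs.
result=$(printf '%s\n' 'home=/usr/local/app/bin' | sed "s|$old_path|$new_path|g")

echo "$result"
```

This only sidesteps the delimiter problem. A value that contains regex metacharacters, `&`, or the chosen delimiter itself would still need escaping.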
Practical Applications
```bash
# Replace values in a config file
sed -i 's/port=8080/port=9090/' config.ini

# Delete comment lines
sed '/^#/d' file.txt

# Delete empty lines
sed '/^$/d' file.txt

# Collapse runs of whitespace into one space (\s and \+ are GNU extensions)
sed 's/\s\+/ /g' file.txt
```
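In-place edits are safer with a backup suffix. The sketch below applies the port change to a scratch copy; the directory and file contents are assumptions made for the example, not a real configuration.

```bash
# Work in a scratch directory so no real configuration is touched.
workdir=$(mktemp -d)
conf="$workdir/config.ini"   # hypothetical file, mirroring the name used above
printf '%s\n' '# sample config' 'host=localhost' 'port=8080' > "$conf"

# -i.bak edits the file in place and keeps the original as config.ini.bak.
# Anchoring with ^ and $ avoids touching lines such as "export_port=8080".
sed -i.bak 's/^port=8080$/port=9090/' "$conf"

new_port=$(grep '^port=' "$conf")
old_port=$(grep '^port=' "$conf.bak")
echo "now: $new_port (backup: $old_port)"

rm -rf "$workdir"
```

Writing the suffix directly after `-i` (no space) is the form accepted by both GNU and BSD sed.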
awk - Text Processing Tool
Basic Usage
```bash
# Print specific columns
awk '{print $1}' file.txt

# Print multiple columns
awk '{print $1, $3}' file.txt

# Specify delimiter
awk -F: '{print $1}' /etc/passwd

# Print line numbers
awk '{print NR, $0}' file.txt

# Conditional printing
awk '$3 > 100 {print $0}' file.txt
```
Built-in Variables
```bash
NR      # Current record number (line number)
NF      # Number of fields in current record
$0      # Complete record
$1, $2  # 1st, 2nd fields
FS      # Field separator (default: a single space, which splits on runs of spaces and tabs)
OFS     # Output field separator (default: a single space)
RS      # Record separator (default: newline)
ORS     # Output record separator (default: newline)
```
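A short run makes the roles of these variables concrete. The records below are hypothetical, and the ` | ` output separator is chosen only to make `OFS` visible.

```bash
# Hypothetical colon-separated records (name:uid:shell).
data=$(printf '%s\n' 'root:0:/bin/bash' 'alice:1000:/bin/zsh')

# FS splits input on ":", OFS joins the printed fields with " | ".
# NR is the record number and NF the number of fields in that record,
# so $NF refers to the last field.
report=$(printf '%s\n' "$data" |
  awk 'BEGIN { FS = ":"; OFS = " | " } { print NR, NF, $1, $NF }')

echo "$report"
```

`OFS` is only inserted between arguments separated by commas in `print`; writing `print $1 $2` concatenates the fields with nothing in between.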
Patterns and Actions
```bash
# Pattern matching
awk '/pattern/ {print $0}' file.txt

# BEGIN and END blocks
awk 'BEGIN {print "Start"} {print $0} END {print "End"}' file.txt

# Calculate sum
awk '{sum += $1} END {print sum}' file.txt

# Calculate average
awk '{sum += $1; count++} END {print sum/count}' file.txt
```
Practical Applications
```bash
# Calculate total file size
ls -l | awk '{sum += $5} END {print sum}'

# Find maximum value (seeded from the first line so negative values work)
awk 'NR == 1 || $1 > max {max = $1} END {print max}' file.txt

# Format output
awk '{printf "%-10s %10s\n", $1, $2}' file.txt

# Process CSV file (simple CSV without quoted commas)
awk -F, '{print $1, $3}' data.csv
```
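The sum, maximum and `printf` techniques above can be combined into one pass over a file. The sketch below assumes a small CSV with a header row; the column names and values are invented for the example.

```bash
# Hypothetical CSV with a header row: name,dept,salary
csv=$(printf '%s\n' 'name,dept,salary' 'ann,eng,100' 'bob,ops,80' 'cy,eng,120')

# Skip the header, total the third column, and track the maximum.
# The maximum is seeded from the first data row so negative values also work,
# and the END block guards against dividing by zero on empty input.
summary=$(printf '%s\n' "$csv" | awk -F, '
  NR == 1 { next }
  { sum += $3; n++ }
  n == 1 || $3 > max { max = $3 }
  END { if (n > 0) printf "rows=%d total=%d avg=%.1f max=%d\n", n, sum, sum / n, max }
')

echo "$summary"
```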
cut - Text Cutting Tool
Basic Usage
```bash
# Cut by characters
cut -c 1-5 file.txt      # Extract characters 1-5
cut -c 1,5,10 file.txt   # Extract characters 1, 5, 10

# Cut by bytes
cut -b 1-10 file.txt

# Cut by fields
cut -d: -f1 /etc/passwd    # Extract 1st field
cut -d: -f1,3 /etc/passwd  # Extract 1st and 3rd fields
```
Practical Applications
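```bash
# Extract usernames
cut -d: -f1 /etc/passwd

# Extract IP address (matches the older "inet addr:1.2.3.4" ifconfig output)
ifconfig | grep "inet " | cut -d: -f2 | cut -d' ' -f1

# Extract file extension (works when the name contains a single dot)
echo "file.txt" | cut -d. -f2
```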
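Because cut always counts fields from the left, it cannot ask for "the last field". The sketch below shows field selection on hypothetical passwd-style records and contrasts cut with awk's `$NF` for a file name with two dots.

```bash
# Hypothetical passwd-style records.
records=$(printf '%s\n' \
  'root:x:0:0:root:/root:/bin/bash' \
  'alice:x:1000:1000::/home/alice:/bin/zsh')

# Fields 1 and 3 (user name and UID), still joined by the input delimiter.
users=$(printf '%s\n' "$records" | cut -d: -f1,3)

# cut counts from the left, so -f2 is the middle part of a two-dot name.
second_field=$(echo 'archive.tar.gz' | cut -d. -f2)

# awk can address the last field directly with $NF.
last_field=$(echo 'archive.tar.gz' | awk -F. '{ print $NF }')

echo "$users"
echo "cut -f2: $second_field, awk \$NF: $last_field"
```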
Combined Usage Examples
Log Analysis
```bash
# Count errors
grep "ERROR" /var/log/app.log | wc -l

# Find logs for a specific time period
sed -n '/2024-01-01 10:00/,/2024-01-01 11:00/p' /var/log/app.log

# Extract IP addresses
grep "ERROR" /var/log/app.log | awk '{print $5}' | cut -d: -f2

# Count error types
grep "ERROR" /var/log/app.log | awk '{print $6}' | sort | uniq -c
```
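The field numbers in these pipelines depend entirely on the log layout. The sketch below assumes a hypothetical layout of `date time level component client:ip error-type`, which puts the address in field 5 and the error type in field 6 as in the commands above.

```bash
# Hypothetical log layout: date time level component client:ip error-type
log=$(mktemp)
cat > "$log" <<'EOF'
2024-01-01 10:05:01 ERROR app client:10.0.0.1 Timeout
2024-01-01 10:06:13 INFO app client:10.0.0.2 Ok
2024-01-01 10:07:45 ERROR app client:10.0.0.3 Timeout
2024-01-01 10:09:02 ERROR app client:10.0.0.1 Refused
EOF

# grep -c counts matching lines without a separate wc -l.
error_total=$(grep -c 'ERROR' "$log")

# Field 5 is "client:ip"; cut keeps the part after the colon.
error_ips=$(grep 'ERROR' "$log" | awk '{ print $5 }' | cut -d: -f2 | sort -u)

# Field 6 is the error type; uniq -c needs sorted input to group duplicates.
error_types=$(grep 'ERROR' "$log" | awk '{ print $6 }' | sort | uniq -c | sort -rn)

echo "total errors: $error_total"
echo "$error_ips"
echo "$error_types"

rm -f "$log"
```

The final `sort -rn` orders the error types from most to least frequent. The width of the count column printed by `uniq -c` differs between implementations, so scripts that parse it should split on whitespace.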
Text Processing
```bash
# Delete empty lines and comments
sed '/^$/d; /^#/d' file.txt

# Replace multiple spaces with a single space
sed 's/\s\+/ /g' file.txt

# Extract a specific column and deduplicate
awk '{print $1}' file.txt | sort -u

# Calculate average
awk '{sum += $1} END {print sum/NR}' file.txt
```
System Administration
```bash
# Find processes with highest CPU usage (numeric sort on column 3 only)
ps aux | sort -k3,3nr | head -n 5

# Find processes with highest memory usage (numeric sort on column 4 only)
ps aux | sort -k4,4nr | head -n 5

# Count processes per user (NR > 1 skips the header row)
ps aux | awk 'NR > 1 {print $1}' | sort | uniq -c

# Find process using a specific port (tail skips the header row)
lsof -i :8080 | awk '{print $2}' | tail -n +2
```
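Sorting `ps` output needs a numeric key restricted to one column; a plain text sort would rank 9.5 above 100.0. Since live `ps` output changes from run to run, the sketch below uses a fixed, hypothetical excerpt in a simplified `ps aux`-like column layout.

```bash
# Hypothetical excerpt with simplified columns: USER PID %CPU %MEM COMMAND
sample=$(printf '%s\n' \
  'USER PID %CPU %MEM COMMAND' \
  'root 1 9.5 0.1 init' \
  'alice 210 12.0 2.3 python' \
  'alice 305 100.0 1.0 ffmpeg' \
  'bob 411 0.3 4.2 nginx')

# Skip the header, then sort numerically (n) in reverse (r) on column 3 only.
top_cpu=$(printf '%s\n' "$sample" | awk 'NR > 1' | sort -k3,3nr | head -n 1)

# Count processes per user, again without the header row.
per_user=$(printf '%s\n' "$sample" | awk 'NR > 1 { print $1 }' | sort | uniq -c)

echo "$top_cpu"
echo "$per_user"
```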
Best Practices
- Combine tools with pipes: grep | awk | sort | uniq
- Prefer grep for searching: Fastest for simple searches
- Use sed for replacement: Preferred tool for text replacement
- Use awk for column data: Best choice for structured text
- Use cut for fixed positions: Simple text cutting tasks
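- Be aware of regex syntax: grep and sed default to basic regular expressions (BRE) and need -E for extended syntax (ERE), which awk uses by default; \+ and \s are GNU extensions
- Test commands first: Run them without -i, or on a copy, before modifying important files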
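One way to follow the last practice is to preview a sed change with diff before applying it in place. This is a minimal sketch on a scratch file; the file name and contents are hypothetical.

```bash
# Scratch copy of a hypothetical file; nothing outside the temp directory is touched.
workdir=$(mktemp -d)
target="$workdir/settings.conf"
printf '%s\n' 'mode=old' 'name=demo' > "$target"

# Step 1: run sed WITHOUT -i and compare its output with the original file.
# diff exits with status 1 when the inputs differ, which is expected here.
preview=$(sed 's/old/new/g' "$target" | diff "$target" - || true)
echo "$preview"

# Step 2: once the preview looks right, apply the same expression in place,
# keeping a backup of the original.
sed -i.bak 's/old/new/g' "$target"
applied=$(cat "$target")
echo "$applied"

rm -rf "$workdir"
```

In the diff output, lines starting with `<` come from the original file and lines starting with `>` show what sed would write.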