
What are the common usages and regular expressions of grep, sed, and awk commands in Linux text processing?

February 17, 23:37

Linux text processing is an important skill for system administration and data analysis. Mastering these tools can greatly improve work efficiency.

grep (text search):

  • Basic usage: grep "pattern" file
  • Common options:
    • -i: ignore case
    • -v: invert match (show non-matching lines)
    • -n: show line numbers
    • -c: count matching lines
    • -r: recursive directory search
    • -l: show only filenames containing matches
    • -A n: show matching line and n lines after
    • -B n: show matching line and n lines before
    • -C n: show matching line and n lines before and after
  • Regular expressions: grep -E "pattern1|pattern2" file (extended regex)
  • Example: grep -rn "error" /var/log (recursively search for errors in logs)
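A quick sketch of the options above, run against a hypothetical sample log (`/tmp/sample.log` is made up here for illustration):

```shell
# Create a small sample log (hypothetical data for illustration)
cat > /tmp/sample.log <<'EOF'
2024-01-01 INFO  service started
2024-01-01 ERROR disk full
2024-01-02 WARN  retrying
2024-01-02 error timeout
EOF

# Case-insensitive search with line numbers (-i, -n)
grep -in "error" /tmp/sample.log

# Count matching lines, case-insensitively (-c)
grep -ic "error" /tmp/sample.log

# Show each match plus one line of trailing context (-A 1)
grep -A 1 "ERROR" /tmp/sample.log

# Extended regex: match either of two patterns (-E)
grep -E "ERROR|WARN" /tmp/sample.log
```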

sed (stream editor):

  • Basic usage: sed 'command' file
  • Common commands:
    • s/pattern/replacement/: replace (first match on each line only)
    • s/pattern/replacement/g: global replace (replace all matches)
    • d: delete line
    • p: print line
    • n: read next line
  • Common options:
    • -i: modify file directly
    • -n: suppress automatic output
    • -e: execute multiple commands
  • Examples:
    • sed 's/old/new/g' file (global replace)
    • sed '/pattern/d' file (delete matching lines)
    • sed -i 's/foo/bar/g' file (modify file directly)
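The commands above can be tried on a throwaway file (`/tmp/hosts.txt` is a hypothetical example; note that `-i` as shown is GNU sed syntax, while BSD/macOS sed requires `-i ''`):

```shell
# Hypothetical sample file
cat > /tmp/hosts.txt <<'EOF'
web01 10.0.0.1
web02 10.0.0.2
db01 10.0.0.3
EOF

# Global replace, written to stdout (the file is unchanged)
sed 's/10\.0\.0/192.168.1/g' /tmp/hosts.txt

# Delete lines matching a pattern
sed '/^db/d' /tmp/hosts.txt

# Print only lines 1-2: -n suppresses automatic output, p prints
sed -n '1,2p' /tmp/hosts.txt

# Edit the file in place (GNU sed)
sed -i 's/web/app/g' /tmp/hosts.txt
```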

awk (text processing tool):

  • Basic usage: awk 'pattern {action}' file
  • Built-in variables:
    • $0: entire line
    • $1, $2, ...: 1st, 2nd, ... fields
    • NF: number of fields
    • NR: record (line) number, cumulative across all input files
    • FNR: record number within the current file
    • FS: field separator (default: whitespace)
    • OFS: output field separator
    • RS: record separator (default newline)
    • ORS: output record separator
  • Common functions:
    • print: print
    • printf: formatted output
    • length(): string length
    • substr(): substring
    • split(): split string
    • gsub(): global replace
  • Examples:
    • awk '{print $1}' file (print first field)
    • awk -F: '{print $1, $3}' /etc/passwd (colon as separator)
    • awk 'NR==1,NR==10' file (print lines 1-10)
    • awk '{sum+=$1} END {print sum}' file (calculate sum of first column)
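A short sketch tying the variables and examples together, using a made-up data file (`/tmp/scores.txt` is hypothetical):

```shell
# Hypothetical two-column data: name score
cat > /tmp/scores.txt <<'EOF'
alice 90
bob 75
carol 88
EOF

# Print fields joined by a custom output separator (OFS)
awk 'BEGIN {OFS=":"} {print $1, $2}' /tmp/scores.txt

# Sum the second column; END runs after all lines are read
awk '{sum += $2} END {print sum}' /tmp/scores.txt   # prints 253

# Filter rows by a numeric condition on a field
awk '$2 >= 85 {print $1}' /tmp/scores.txt
```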

cut (cut tool):

  • Basic usage: cut [options] file
  • Common options:
    • -d: specify delimiter
    • -f: specify fields
    • -c: specify character positions
  • Examples:
    • cut -d: -f1 /etc/passwd (extract first field with colon as delimiter)
    • cut -c1-10 file (extract first 10 characters of each line)
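The same options on a small passwd-style sample (`/tmp/users.txt` is a hypothetical stand-in for `/etc/passwd`):

```shell
# Hypothetical colon-delimited data
printf 'root:x:0:0\nalice:x:1000:1000\n' > /tmp/users.txt

# First field, with : as the delimiter
cut -d: -f1 /tmp/users.txt

# Fields 1 and 3 together
cut -d: -f1,3 /tmp/users.txt

# First four characters of each line
cut -c1-4 /tmp/users.txt
```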

sort (sorting tool):

  • Basic usage: sort [options] file
  • Common options:
    • -n: numeric sort
    • -r: reverse order
    • -k: specify sort field
    • -t: specify delimiter
    • -u: unique
  • Examples:
    • sort -n -k2 file (sort by second column numerically)
    • sort -t: -k3 -n /etc/passwd (sort by third column numerically)
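A minimal demonstration of field-based numeric sorting on made-up data (`/tmp/ages.txt` is hypothetical):

```shell
# Hypothetical data: name age
printf 'bob 30\nalice 5\ncarol 12\n' > /tmp/ages.txt

# Numeric sort on the second field (-n -k2);
# without -n, "5" would sort after "30" lexically
sort -n -k2 /tmp/ages.txt

# Reverse numeric sort (-r)
sort -rn -k2 /tmp/ages.txt
```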

uniq (deduplication tool):

  • Basic usage: uniq [options] file
  • Common options:
    • -c: count duplicates
    • -d: show only duplicate lines
    • -u: show only unique lines
  • Examples:
    • sort file | uniq -c (sort and count duplicates)
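uniq only collapses adjacent duplicates, which is why the input is sorted first. A sketch on hypothetical data (`/tmp/fruit.txt`):

```shell
# Hypothetical data with repeated lines
printf 'apple\nbanana\napple\napple\ncherry\n' > /tmp/fruit.txt

# Sort so duplicates are adjacent, then count occurrences (-c)
sort /tmp/fruit.txt | uniq -c

# Only lines that appear more than once (-d)
sort /tmp/fruit.txt | uniq -d

# Only lines that appear exactly once (-u)
sort /tmp/fruit.txt | uniq -u
```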

Combined usage:

  • Count errors in logs: grep "error" log | wc -l
  • Find most visited IPs: awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
  • Extract emails from file: grep -Eo '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' file (the dot before the top-level domain must be escaped, or it matches any character)
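The "most visited IPs" pipeline can be tested end to end on a fabricated log (`/tmp/access.log` and its contents are hypothetical; real access logs have more fields):

```shell
# Hypothetical access log: client IP is the first field
cat > /tmp/access.log <<'EOF'
10.0.0.1 GET /index
10.0.0.2 GET /index
10.0.0.1 GET /about
10.0.0.3 GET /index
10.0.0.1 GET /contact
EOF

# Extract IPs, group duplicates, count, rank, keep the top 10
awk '{print $1}' /tmp/access.log | sort | uniq -c | sort -rn | head -10
```

Each stage does one job: awk extracts the field, sort makes duplicates adjacent, uniq -c counts them, and sort -rn ranks by count.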
Tags: Linux