Text processing on Linux is a core skill for system administration and data analysis. Mastering these tools greatly improves day-to-day efficiency.
grep (text search):
- Basic usage: grep "pattern" file
- Common options:
  - -i: ignore case
  - -v: invert match (show non-matching lines)
  - -n: show line numbers
  - -c: count matching lines
  - -r: search directories recursively
  - -l: show only the names of files that contain matches
  - -A n: show each matching line and n lines after it
  - -B n: show each matching line and n lines before it
  - -C n: show each matching line with n lines of context before and after
- Regular expressions: grep -E "pattern1|pattern2" file (extended regex)
- Example: grep -rn "error" /var/log (recursively search for errors in logs)
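The options above are easiest to compare side by side on a throwaway file. The log lines below (sample.log) are invented for illustration:

```shell
# Create a small sample log (contents are made up for this demo).
printf 'INFO start\nerror: disk full\nWARN retry\nError: timeout\n' > sample.log

grep -n "error" sample.log     # case-sensitive: only line 2 matches
grep -ic "error" sample.log    # case-insensitive count: matches lines 2 and 4
grep -v "error" sample.log     # lines NOT containing lowercase "error"
grep -i -A1 "disk" sample.log  # matching line plus 1 line of trailing context
```

Note how -i changes the result set: "Error: timeout" matches only when case is ignored.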
sed (stream editor):
- Basic usage: sed 'command' file
- Common commands:
  - s/pattern/replacement/: substitute (first match on each line)
  - s/pattern/replacement/g: global substitute (all matches on each line)
  - d: delete line
  - p: print line
  - n: read the next line into the pattern space
- Common options:
  - -i: edit the file in place (GNU sed; BSD/macOS sed requires a suffix argument, e.g. -i '')
  - -n: suppress automatic output (usually combined with p)
  - -e: specify multiple editing commands
- Examples:
  - sed 's/old/new/g' file (global replace)
  - sed '/pattern/d' file (delete matching lines)
  - sed -i 's/foo/bar/g' file (edit the file in place)
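A short runnable walk-through of the commands above; demo.txt and its contents are fabricated for the example. The -i.bak form (in-place edit with a backup suffix) works on both GNU and BSD sed:

```shell
# Sample file (made-up content) to demonstrate substitution and deletion.
printf 'foo one\nfoo two\nskip me\n' > demo.txt

sed 's/foo/bar/' demo.txt          # replace first match on each line (stdout only)
sed '/skip/d' demo.txt             # delete lines matching /skip/ (stdout only)
sed -n '2p' demo.txt               # with -n, print only line 2
sed -i.bak 's/foo/bar/g' demo.txt  # in-place edit; original saved as demo.txt.bak
```

Everything before the last line writes to stdout and leaves the file untouched; only -i actually modifies demo.txt.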
awk (text processing tool):
- Basic usage: awk 'pattern {action}' file
- Built-in variables:
  - $0: entire line
  - $1, $2, ...: 1st, 2nd, ... fields
  - NF: number of fields in the current record
  - NR: current record (line) number, cumulative across all input files
  - FNR: record number within the current file
  - FS: input field separator (default: whitespace)
  - OFS: output field separator
  - RS: record separator (default: newline)
  - ORS: output record separator
- Common functions:
  - print: print values
  - printf: formatted output
  - length(): string length
  - substr(): extract a substring
  - split(): split a string into an array
  - gsub(): global substitution
- Examples:
  - awk '{print $1}' file (print first field)
  - awk -F: '{print $1, $3}' /etc/passwd (colon as field separator)
  - awk 'NR==1,NR==10' file (print lines 1-10)
  - awk '{sum+=$1} END {print sum}' file (sum the first column)
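The variables and patterns above can be exercised on a tiny dataset; scores.txt and its values are invented for the demo:

```shell
# Tiny whitespace-separated dataset (names and numbers are made up).
printf 'alice 10\nbob 20\ncarol 30\n' > scores.txt

awk '{print $1}' scores.txt                   # first field of each line
awk '{sum += $2} END {print sum}' scores.txt  # sum of the second column
awk 'NR==2 {print NR, NF, $0}' scores.txt     # record number, field count, whole line
awk 'BEGIN {OFS=","} {print $1, $2}' scores.txt  # commas in output via OFS
```

Note that OFS only takes effect when print receives comma-separated arguments, as in the last command.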
cut (cut tool):
- Basic usage: cut [options] file
- Common options:
  - -d: specify the delimiter (tab by default)
  - -f: select fields
  - -c: select character positions
- Examples:
  - cut -d: -f1 /etc/passwd (extract first field, colon as delimiter)
  - cut -c1-10 file (first 10 characters of each line)
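A quick demonstration of field versus character selection; users.txt below is a fabricated line in /etc/passwd format:

```shell
# Colon-delimited sample resembling an /etc/passwd entry (fabricated).
printf 'alice:x:1001:1001::/home/alice:/bin/bash\n' > users.txt

cut -d: -f1 users.txt    # first field: the username
cut -d: -f1,3 users.txt  # fields 1 and 3: username and UID
cut -c1-5 users.txt      # first five characters of the line
```

-f works on delimiter-separated fields, while -c ignores delimiters entirely and counts raw character positions.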
sort (sorting tool):
- Basic usage: sort [options] file
- Common options:
  - -n: numeric sort
  - -r: reverse order
  - -k: specify the sort field (key)
  - -t: specify the field delimiter
  - -u: output only unique lines
- Examples:
  - sort -n -k2 file (sort numerically by the second column)
  - sort -t: -k3 -n /etc/passwd (sort numerically by the third field)
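The key options combine freely; nums.txt below is sample data with a duplicate key in the second column:

```shell
# Unsorted sample data; the second column is the sort key.
printf 'b 10\na 2\nc 10\n' > nums.txt

sort -k2 -n nums.txt     # numeric sort on the second field
sort -k2 -n -r nums.txt  # same, descending
sort -k2 -n -u nums.txt  # -u drops lines whose sort key duplicates an earlier one
```

Note that with -k, -u compares only the sort key, so "b 10" and "c 10" count as duplicates of each other even though the full lines differ.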
uniq (deduplication tool):
- Basic usage: uniq [options] file
- Common options:
- -c: count duplicates
- -d: show only duplicate lines
- -u: show only unique lines
- Examples:
- sort file | uniq -c (sort and count duplicates)
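The adjacency caveat is easy to demonstrate; fruit.txt below is made up so that its duplicates are not adjacent:

```shell
# Duplicates that are NOT adjacent, to show why uniq needs sorted input.
printf 'apple\nbanana\napple\n' > fruit.txt

uniq fruit.txt            # removes nothing: the two "apple" lines are not adjacent
sort fruit.txt | uniq -c  # sort first, then count occurrences
sort fruit.txt | uniq -d  # only the line that appears more than once
```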
Combined usage:
- Count errors in logs: grep -c "error" log (equivalent to grep "error" log | wc -l)
- Find the most frequent client IPs: awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
- Extract email addresses: grep -Eo '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' file
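The combined pipelines can be tried end to end on fabricated inputs; access_demo.log and contacts.txt below, and everything in them, are invented for the demo:

```shell
# Fabricated access-log lines (the client IP is the first field).
printf '1.2.3.4 GET /\n5.6.7.8 GET /a\n1.2.3.4 GET /b\n' > access_demo.log

# Top client IPs by request count: extract the field, group, count, rank.
awk '{print $1}' access_demo.log | sort | uniq -c | sort -rn | head -10

# Extract email addresses with -o, which prints only the matched text.
printf 'contact: alice@example.com\nno address here\n' > contacts.txt
grep -Eo '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' contacts.txt
```

The IP pipeline is the general "group and count" idiom: sort brings equal keys together, uniq -c counts each run, and sort -rn ranks the counts.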