
What are the common usages and regular expressions of grep, sed, and awk commands in Linux text processing?

February 17, 23:37

Linux text processing is an important skill for system administration and data analysis. Mastering these tools can greatly improve work efficiency.

grep (text search):

  • Basic usage: grep "pattern" file
  • Common options:
    • -i: ignore case
    • -v: invert match (show non-matching lines)
    • -n: show line numbers
    • -c: count matching lines
    • -r: recursive directory search
    • -l: show only filenames containing matches
    • -A n: show matching line and n lines after
    • -B n: show matching line and n lines before
    • -C n: show matching line and n lines before and after
  • Regular expressions: grep -E "pattern1|pattern2" file (extended regex)
  • Example: grep -rn "error" /var/log (recursively search for errors in logs)
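A quick sketch of the options above, run against a hypothetical sample log (`/tmp/sample.log` is made up here for illustration):

```shell
# Create a small sample log (hypothetical data for illustration)
cat > /tmp/sample.log <<'EOF'
2024-01-01 INFO  service started
2024-01-01 ERROR disk full
2024-01-02 WARN  retrying
2024-01-02 error timeout
EOF

# Case-insensitive search with line numbers (-i, -n)
grep -in "error" /tmp/sample.log

# Count matching lines, case-insensitively (-c)
grep -ic "error" /tmp/sample.log

# Show each match plus one line of trailing context (-A 1)
grep -A 1 "ERROR" /tmp/sample.log

# Extended regex: match either of two patterns (-E)
grep -E "ERROR|WARN" /tmp/sample.log
```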

sed (stream editor):

  • Basic usage: sed 'command' file
  • Common commands:
    • s/pattern/replacement/: replace (first match on each line only)
    • s/pattern/replacement/g: global replace (replace all matches)
    • d: delete line
    • p: print line
    • n: read next line
  • Common options:
    • -i: modify file directly
    • -n: suppress automatic output
    • -e: execute multiple commands
  • Examples:
    • sed 's/old/new/g' file (global replace)
    • sed '/pattern/d' file (delete matching lines)
    • sed -i 's/foo/bar/g' file (modify file directly)
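The commands above can be tried on a throwaway file (`/tmp/hosts.txt` is a hypothetical example; note that `-i` as shown is GNU sed syntax, while BSD/macOS sed requires `-i ''`):

```shell
# Hypothetical sample file
cat > /tmp/hosts.txt <<'EOF'
web01 10.0.0.1
web02 10.0.0.2
db01 10.0.0.3
EOF

# Global replace, written to stdout (the file is unchanged)
sed 's/10\.0\.0/192.168.1/g' /tmp/hosts.txt

# Delete lines matching a pattern
sed '/^db/d' /tmp/hosts.txt

# Print only lines 1-2: -n suppresses automatic output, p prints
sed -n '1,2p' /tmp/hosts.txt

# Edit the file in place (GNU sed)
sed -i 's/web/app/g' /tmp/hosts.txt
```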

awk (text processing tool):

  • Basic usage: awk 'pattern {action}' file
  • Built-in variables:
    • $0: entire line
    • $1, $2, ...: 1st, 2nd, ... fields
    • NF: number of fields
    • NR: record (line) number, cumulative across all input files
    • FNR: record number within the current file
    • FS: field separator (default: whitespace)
    • OFS: output field separator
    • RS: record separator (default newline)
    • ORS: output record separator
  • Common functions:
    • print: print
    • printf: formatted output
    • length(): string length
    • substr(): substring
    • split(): split string
    • gsub(): global replace
  • Examples:
    • awk '{print $1}' file (print first field)
    • awk -F: '{print $1, $3}' /etc/passwd (colon as separator)
    • awk 'NR==1,NR==10' file (print lines 1-10)
    • awk '{sum+=$1} END {print sum}' file (calculate sum of first column)
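A short sketch tying the variables and examples together, using a made-up data file (`/tmp/scores.txt` is hypothetical):

```shell
# Hypothetical two-column data: name score
cat > /tmp/scores.txt <<'EOF'
alice 90
bob 75
carol 88
EOF

# Print fields joined by a custom output separator (OFS)
awk 'BEGIN {OFS=":"} {print $1, $2}' /tmp/scores.txt

# Sum the second column; END runs after all lines are read
awk '{sum += $2} END {print sum}' /tmp/scores.txt   # prints 253

# Filter rows by a numeric condition on a field
awk '$2 >= 85 {print $1}' /tmp/scores.txt
```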

cut (cut tool):

  • Basic usage: cut [options] file
  • Common options:
    • -d: specify delimiter
    • -f: specify fields
    • -c: specify character positions
  • Examples:
    • cut -d: -f1 /etc/passwd (extract first field with colon as delimiter)
    • cut -c1-10 file (extract first 10 characters of each line)
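The same options on a small passwd-style sample (`/tmp/users.txt` is a hypothetical stand-in for `/etc/passwd`):

```shell
# Hypothetical colon-delimited data
printf 'root:x:0:0\nalice:x:1000:1000\n' > /tmp/users.txt

# First field, with : as the delimiter
cut -d: -f1 /tmp/users.txt

# Fields 1 and 3 together
cut -d: -f1,3 /tmp/users.txt

# First four characters of each line
cut -c1-4 /tmp/users.txt
```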

sort (sorting tool):

  • Basic usage: sort [options] file
  • Common options:
    • -n: numeric sort
    • -r: reverse order
    • -k: specify sort field
    • -t: specify delimiter
    • -u: unique
  • Examples:
    • sort -n -k2 file (sort by second column numerically)
    • sort -t: -k3 -n /etc/passwd (sort by third column numerically)
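A minimal demonstration of field-based numeric sorting on made-up data (`/tmp/ages.txt` is hypothetical):

```shell
# Hypothetical data: name age
printf 'bob 30\nalice 5\ncarol 12\n' > /tmp/ages.txt

# Numeric sort on the second field (-n -k2);
# without -n, "5" would sort after "30" lexically
sort -n -k2 /tmp/ages.txt

# Reverse numeric sort (-r)
sort -rn -k2 /tmp/ages.txt
```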

uniq (deduplication tool):

  • Basic usage: uniq [options] file
  • Common options:
    • -c: count duplicates
    • -d: show only duplicate lines
    • -u: show only unique lines
  • Examples:
    • sort file | uniq -c (sort and count duplicates)
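uniq only collapses adjacent duplicates, which is why the input is sorted first. A sketch on hypothetical data (`/tmp/fruit.txt`):

```shell
# Hypothetical data with repeated lines
printf 'apple\nbanana\napple\napple\ncherry\n' > /tmp/fruit.txt

# Sort so duplicates are adjacent, then count occurrences (-c)
sort /tmp/fruit.txt | uniq -c

# Only lines that appear more than once (-d)
sort /tmp/fruit.txt | uniq -d

# Only lines that appear exactly once (-u)
sort /tmp/fruit.txt | uniq -u
```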

Combined usage:

  • Count errors in logs: grep "error" log | wc -l
  • Find most visited IPs: awk '{print $1}' access.log | sort | uniq -c | sort -rn | head -10
  • Extract emails from file: grep -Eo '[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}' file (the dot before the top-level domain must be escaped, or it matches any character)
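The "most visited IPs" pipeline can be tested end to end on a fabricated log (`/tmp/access.log` and its contents are hypothetical; real access logs have more fields):

```shell
# Hypothetical access log: client IP is the first field
cat > /tmp/access.log <<'EOF'
10.0.0.1 GET /index
10.0.0.2 GET /index
10.0.0.1 GET /about
10.0.0.3 GET /index
10.0.0.1 GET /contact
EOF

# Extract IPs, group duplicates, count, rank, keep the top 10
awk '{print $1}' /tmp/access.log | sort | uniq -c | sort -rn | head -10
```

Each stage does one job: awk extracts the field, sort makes duplicates adjacent, uniq -c counts them, and sort -rn ranks by count.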
Tags: Linux