Feb 27


awk is a powerful text-processing tool available on Linux and other UNIX-like systems. It is primarily used for pattern scanning, text manipulation, and reporting. Whether you need to extract specific columns, filter lines, or perform arithmetic operations, awk can do it efficiently.
Let's break it down with simple examples and outputs.
The general syntax of awk is:
[root@siddhesh ~]# awk 'pattern { action }' file
pattern: Specifies when the action should be applied.
action: Defines what should be done when the pattern matches.
file: The input file to be processed.
If no pattern is given, awk applies the action to all lines. If no action is given, awk prints every line that matches the pattern.
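The three forms of the syntax (pattern only, action only, and both together) can be compared side by side. This is a minimal sketch that assumes a made-up three-line input fed through a pipe; the fruit names and numbers are illustrative only and do not come from the files used later.

```shell
# Hypothetical input: a name and a number on each line.
input='apple 3
banana 7
cherry 5'

# Pattern only: the default action prints each matching line in full.
pattern_only=$(printf '%s\n' "$input" | awk '$2 > 4')

# Action only: the action runs on every line.
action_only=$(printf '%s\n' "$input" | awk '{ print $1 }')

# Pattern and action: the action runs only on matching lines.
both=$(printf '%s\n' "$input" | awk '$2 > 4 { print $1 }')

echo "$pattern_only"
echo "$action_only"
echo "$both"
```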
Consider a sample file data.txt with the following content:
1 Siddhesh DevOps
2 Rajesh Admin
3 Akash Developer
Now, to extract only the second column (names):
[root@siddhesh ~]# awk '{print $2}' data.txt
This command prints the second column ($2) of each line in data.txt.
Siddhesh
Rajesh
Akash
Each field in awk is represented as $1, $2, $3, and so on, while $0 refers to the entire line.
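Alongside the numbered fields, awk keeps the field count of the current line in the built-in variable NF, so $NF is always the last field. A short sketch, assuming one line taken from the sample file:

```shell
line='3 Akash Developer'

whole=$(echo "$line" | awk '{ print $0 }')   # the entire line
count=$(echo "$line" | awk '{ print NF }')   # number of fields on the line
last=$(echo "$line" | awk '{ print $NF }')   # the last field

echo "$whole"   # 3 Akash Developer
echo "$count"   # 3
echo "$last"    # Developer
```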
If you want to print only the lines where the third column is exactly "DevOps":
[root@siddhesh ~]# awk '$3 == "DevOps" {print $0}' data.txt
This command prints only the lines whose third column ($3) equals "DevOps"; all other lines are skipped.
1 Siddhesh DevOps
By default, awk assumes fields are separated by spaces or tabs. If your file uses another delimiter (e.g., :), specify it using -F:
Consider users.txt:
root:x:0:0:root:/root:/bin/bash
siddhesh:x:1001:1001::/home/siddhesh:/bin/bash
Extract only usernames (first column):
[root@siddhesh ~]# awk -F: '{print $1}' users.txt
The -F: option tells awk to treat : as the field separator.
root
siddhesh
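When more than one field is printed, the output separator can be set as well. The sketch below assumes the same two passwd-style lines, written to a temporary file so it runs anywhere; the " -> " separator is an arbitrary choice for illustration.

```shell
# Temporary copy of the two sample lines.
users=$(mktemp)
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' \
              'siddhesh:x:1001:1001::/home/siddhesh:/bin/bash' > "$users"

# -F: splits input on ":"; -v OFS sets the string that print
# places between comma-separated arguments.
login_shells=$(awk -F: -v OFS=' -> ' '{ print $1, $7 }' "$users")

echo "$login_shells"
rm -f "$users"
```

Note that the empty fifth field in the second line still counts as a field, so the shell stays in $7.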
Consider numbers.txt:
10 20
30 40
50 60
To add both columns and display the sum:
[root@siddhesh ~]# awk '{print $1 + $2}' numbers.txt
This command adds the values of the first and second columns.
30
70
110
If you want to print only the lines where the first column is greater than 20:
[root@siddhesh ~]# awk '{if ($1 > 20) print $0}' numbers.txt
The if statement checks if the value in the first column ($1) is greater than 20.
30 40
50 60
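The same filter can also be written as a bare pattern, which is the more common awk idiom. A minimal sketch, assuming a temporary copy of numbers.txt, that runs both forms so they can be compared:

```shell
# Temporary copy of the sample numbers.
nums=$(mktemp)
printf '10 20\n30 40\n50 60\n' > "$nums"

# Explicit if statement inside an action.
with_if=$(awk '{ if ($1 > 20) print $0 }' "$nums")

# Condition used as the pattern; the default action prints the line.
as_pattern=$(awk '$1 > 20' "$nums")

echo "$with_if"
echo "$as_pattern"
rm -f "$nums"
```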
[root@siddhesh ~]# awk 'END {print "Total lines:", NR}' data.txt
This counts the lines in a file. NR holds the number of lines read so far, and the END block runs once after the last line, so the final count is printed.
Total lines: 3
[root@siddhesh ~]# awk '{for(i=1; i<=NF; i++) words[$i]++} END {for (w in words) print w}' data.txt
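This script uses every word as a key in an associative array, so each distinct word is printed exactly once. The order in which for (w in words) visits the keys is not specified, so the output order may vary.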
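Because the array values are counters, the same loop can report how often each word appears. This is a sketch on a made-up two-line input (not one of the sample files), with the result piped through sort to get a stable order:

```shell
counts=$(printf 'a b a\nc b a\n' |
  awk '{ for (i = 1; i <= NF; i++) words[$i]++ }       # count every field
       END { for (w in words) print w, words[w] }' |   # word, then its count
  LC_ALL=C sort)                                       # array order is unspecified, so sort

echo "$counts"
```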
[root@siddhesh ~]# awk '{print NR, $0}' data.txt
NR (line number) is printed before each line.
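[root@siddhesh ~]# awk 'NR == 1 || $1 > max {max = $1} END {print "Max:", max}' numbers.txt
Iterates through the first column, tracking the highest value. Seeding max from the first line (NR == 1) keeps the result correct even when every value is negative.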
[root@siddhesh ~]# awk '{sum += $1} END {print "Sum:", sum}' numbers.txt
Adds values in the first column and prints the total sum.
[root@siddhesh ~]# awk '{gsub("Rajesh", "Ramesh"); print}' data.txt
gsub replaces all instances of "Rajesh" with "Ramesh".
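gsub also accepts an optional third argument that limits the replacement to one field, and it returns the number of substitutions it made. A sketch, assuming a temporary copy of data.txt; the variable names n and total are illustrative:

```shell
# Temporary copy of the sample data.
data=$(mktemp)
printf '1 Siddhesh DevOps\n2 Rajesh Admin\n3 Akash Developer\n' > "$data"

# Replace only within the second field and keep a running total of replacements.
replaced=$(awk '{ n = gsub(/Rajesh/, "Ramesh", $2); total += n; print }
                END { print "Replacements:", total }' "$data")

echo "$replaced"
rm -f "$data"
```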
[root@siddhesh ~]# awk 'NR%2==0' data.txt # Even lines
[root@siddhesh ~]# awk 'NR%2==1' data.txt # Odd lines
Prints the even- or odd-numbered lines using the modulus operator. Since no action is given, awk prints each line for which the pattern is true.
[root@siddhesh ~]# awk '{print $1}' numbers.txt | sort -n
Extracts the first column and sorts it numerically.
[root@siddhesh ~]# awk '{sum+=$1} END {print "Average:", sum/NR}' numbers.txt
Calculates the average by summing column values and dividing by the line count.
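Dividing by NR counts blank lines too and fails with a division by zero on empty input. One possible guard, sketched here on inline input with a deliberate blank line, is to count only non-empty lines; the counter name n and the "No data" message are assumptions for illustration.

```shell
# NF is 0 on blank lines, so using it as the pattern skips them.
avg=$(printf '10 20\n\n30 40\n50 60\n' |
  awk 'NF { sum += $1; n++ }
       END { if (n) print "Average:", sum / n; else print "No data" }')

# With no input at all, n stays unset and the else branch runs.
empty=$(awk 'NF { sum += $1; n++ }
             END { if (n) print "Average:", sum / n; else print "No data" }' < /dev/null)

echo "$avg"     # Average: 30
echo "$empty"   # No data
```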
[root@siddhesh ~]# awk '{for(i=NF; i>=1; i--) printf "%s ", $i; print ""}' data.txt
This prints fields in reverse order for each line.
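The printf version leaves a trailing space at the end of every line. If that matters, one alternative is to build the reversed line in a variable first; this is a sketch, with out as an illustrative variable name:

```shell
reversed=$(printf '1 Siddhesh DevOps\n2 Rajesh Admin\n' |
  awk '{ out = $NF                                   # start with the last field
         for (i = NF - 1; i >= 1; i--) out = out " " $i
         print out }')

echo "$reversed"
```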
Conclusion
awk is a simple yet powerful tool for text manipulation. With its pattern-matching capabilities and built-in variables, it can handle various tasks efficiently. Practice these examples, and soon you’ll be comfortable using awk in your scripts!
Happy coding! 🚀