awk is a interpreted programming language that processing text file line-by-line, which usually used in the shell to accomplish simple field-based text processing task. There are two common usages of awk command
awk 'program' [file]
awk -f [program-file] [file]
By providing the option -f
, we provide the awk script through a file, which avoid some compatibility problems when using the first inline form, the details is provided in awk_inline_problems. ==TODO: Create this.==
Another common approach is processing the output of some shell command through piping:
cmd | awk 'program'
The general format of awk program is
[pattern] [action]
The [pattern]
specifies which lines should be processed by the given [action]
. One of them can be omitted. If the pattern
is omitted, every line of file will be processed; if the action
is omitted, then awk will simply print the line obeys pattern.
For multiple [pattern] [action]
combinations given in one file, awk will run them sequentially and independently. For example, the awk program file with content
#! /bin/awk -f
/12/ {print $0}
/21/ {print $0}
will print all the lines contains 12
or 21
, and those lines with both patterns will be printed twice.
Each line is contructed by many fields, seperated by space characters by default, and the seperator can be specified by assigning variable FS
, or setting the -F/--file-separator
option (which actually also change FS
).
The special variables $1, $2, $3
represents the fields, numbering starts from the left. Specially, the variable $0
represents the entire line. Take an example,
ls -l | awk '{print $9 " " $3 " " $1}'
This will printed the 9th, 3rd, 1s field of the output of ls -l
, which is the filename, owner’s name, and rwx
mode respectively.
Besides the constant, the field number can be provided by variable, for example,
print
The print
statement print its following object.
awk '{print $0}' data
Since no pattern is provided, this print applied print $0
to entire line, that is, print entire line. Hence this command actually print the entire file.
variable can be used directly without assigned: