AWK

From TBP Wiki
Jump to: navigation, search

The awk utility shall execute programs written in the awk programming language, which is specialized for textual data manipulation. An awk program is a sequence of patterns and corresponding actions. When input is read that matches a pattern, the action associated with that pattern is carried out.

Input shall be interpreted as a sequence of records. By default, a record is a line, less its terminating <newline>, but this can be changed by using the RS built-in variable. Each record of input shall be matched in turn against each pattern in the program. For each pattern matched, the associated action shall be executed.

The awk utility shall interpret each input record as a sequence of fields where, by default, a field is a string of non-<blank> non-<newline> characters. This default <blank> and <newline> field delimiter can be changed by using the FS built-in variable or the −F sepstring option. The awk utility shall denote the first field in a record $1, the second $2, and so on. The symbol $0 shall refer to the entire record; setting any other field causes the re-evaluation of $0. Assigning to $0 shall reset the values of all other fields and the NF built-in variable.

Using awk

Remove duplicates from a file

   awk '!a[$0]++'

Example:

   cat filename.txt | awk '!a[$0]++' >> newfile.txt

Print the second line in something. This can be piped.

   awk '{print $2}'

Print multiple items. This can be rearranged in any manner.

    awk '{print $6,$2,$9,$1}'

This can substitute "foo" with "bar" within a file.

   awk '{gsub(/foo/,"bar")}' FILENAME

Print all new lines to one line with a space between each item:

   awk 'BEGIN { ORS=" " }; { print $2 }'


Print only up until first instance of "." for output of ls and grep:

   ls /dir1/dir2/ | grep us | awk -F "." '{print $1}'