bash-awk
# Resources:
# Usecase:
`awk` is a scripting language for text manipulation that also supports arithmetic and conditional logic.
It’s commonly used to:
- Validate or search through text.
- Transform text input to produce formatted text files.
- Perform calculations.
- Loop through data and apply conditional logic.
TL;DR:
- `awk` runs line by line. Each line is split into fields, which by default are the strings separated by whitespace.
- The delimiter that defines a “line” or a “field” can be changed.
- `awk` is capable of arithmetic, often combined with variable assignment.
- `awk` is also capable of `if-else`, `for` and `while` logic.
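As a small illustration of these points, the sketch below sums the fields of each input line with a `for` loop. The input numbers are invented for the example.

```shell
# Each input line is one record; NF is the number of fields on it.
printf '1 2 3\n10 20\n' | awk '{
    sum = 0
    for (i = 1; i <= NF; i++) sum += $i   # add up every field on the line
    print "line " NR ": " sum
}'
# prints:
# line 1: 6
# line 2: 30
```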
# Syntax:
# Basic syntax:
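- `selection_criteria` is the pattern matched against each line or its fields (strings).
- The whole program is enclosed in single quotes (`''`) so that the shell passes it to `awk` unchanged. A regular-expression pattern is written between slashes (`/pattern/`).
- Examples of `{actions}` include:
  - `print`: print the line or selected fields.
  - `printf`: formatted printing with a C-style format string, similar to `%` formatting in Python.
  - `delete`: remove an element from an array. It does not act on fields or lines directly; lines are dropped by not printing them and fields by blanking or skipping them.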
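The general shape of an invocation is `awk 'selection_criteria {action}' input-file`. The sketch below shows it with `print` and `printf`; the sample input and the pattern `/bob/` are made up for illustration.

```shell
# Print the first field of every line that matches /bob/.
printf 'alice 30\nbob 25\n' | awk '/bob/ {print $1}'
# prints: bob

# printf takes a format string; the trailing newline must be written explicitly.
printf 'alice 30\nbob 25\n' | awk '{printf "%s is %d years old\n", $1, $2}'
# prints:
# alice is 30 years old
# bob is 25 years old
```

With no `selection_criteria`, the action runs on every line; with no action, matching lines are printed unchanged.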
# Quick Cheatsheet:
Description | Syntax |
---|---|
Replace all occurrences of a pattern with a replacement string | `{gsub(/pattern/, "replacement"); print}` |
Delete all lines that match a pattern | `{if ($0 ~ /pattern/) next; print $0}` |
Print only the lines that match a pattern | `{if ($0 ~ /pattern/) print $0}` |
Insert a line (held in the variable `line`) before or after the line that matches a pattern | `/pattern/ {print line} {print}` (before) or `{print} /pattern/ {print line}` (after) |
Append a line (held in the variable `line`) to the end of the output | `{print} END {print line}` |
Change the field delimiter from whitespace to another character | `awk -F'new_char_here' '{ command }' file.txt` |
Print only the last field of each line | `awk '{print $NF}' file.txt` |
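A few of these tasks can be tried end to end as sketched below. This version uses `gsub` for replacement and an `END` block for appending; the input text and the value passed in as `line` are invented for the example.

```shell
# Replace every occurrence of "cat" with "dog" on each line.
printf 'cat and cat\nbird\n' | awk '{gsub(/cat/, "dog"); print}'
# prints:
# dog and dog
# bird

# Insert a line before each line that matches /bird/.
printf 'cat\nbird\n' | awk -v line='--- marker ---' '/bird/ {print line} {print}'

# Append a line after the last input line.
printf 'cat\nbird\n' | awk -v line='the end' '{print} END {print line}'
```

Note that `awk` writes the result to standard output; the input file itself is not modified.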
Common commands/arguments: Example syntax incorporation:
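Two command-line options that come up often are `-F` (set the field separator) and `-v` (assign a variable before the program runs). A minimal sketch, assuming comma-separated input and a hypothetical variable named `min`:

```shell
# -F sets the field separator; here fields are split on commas.
printf 'alice,30\nbob,25\n' | awk -F',' '{print $2}'
# prints:
# 30
# 25

# -v assigns an awk variable before the program starts.
printf 'alice,30\nbob,25\n' | awk -F',' -v min=28 '$2 >= min {print $1}'
# prints: alice
```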
# User-defined and Shell Generated Variables
`END`:
- `END` is a special block that specifies an action to run once all records have been processed. It’s useful for validating an entire file, for example by counting the number of lines.
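The line-counting case can be sketched as follows, together with a simple whole-file check. The input and the rule "every line must have two fields" are assumptions made for the example.

```shell
# Count the lines of the input; in END, NR holds the total number of records.
printf 'a\nb\nc\n' | awk 'END {print NR}'
# prints: 3

# Count lines that do not have exactly two fields and report once at the end.
printf 'a 1\nb\nc 3\n' | awk 'NF != 2 {bad++} END {printf "%d malformed line(s)\n", bad}'
# prints: 1 malformed line(s)
```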
Positional variables:
- When `awk` processes a line/record, it splits it at each delimiter and assigns every resulting string (field) to a variable.
- `$1`, `$2`, `$3` etc. refer to each field within the record; numbering starts at 1, NOT at 0.
- `$0` refers to the entire input record (or line).
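A short sketch of the positional variables, using a made-up three-field line:

```shell
# $1 and $2 are the first and second fields; $0 is the whole line.
printf 'alice 30 london\n' | awk '{print $2; print $1; print $0}'
# prints:
# 30
# alice
# alice 30 london
```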
Variable assignment:
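One possible illustration of variable assignment is a running total. The variable name `total` and the input are hypothetical.

```shell
# Variables need no declaration; an unset variable acts as 0 in arithmetic.
printf 'apple 3\npear 4\n' | awk '{total = total + $2} END {print "total:", total}'
# prints: total: 7
```

Variables keep their value from one line to the next, which is why the sum is available in the `END` block.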
Common special variables:
- `NR`: a variable holding the number of input records read so far.
- `NF`: a variable holding the number of fields in the current input record.
- `FS`: the field separator used to divide the input line into fields. The default is whitespace, meaning space and tab characters.
- `RS`: the record separator. Since an input line is the input record by default, the default record separator is a newline.
- `OFS`: the output field separator, placed between fields when `awk` prints them. The default is a single space.
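These variables can be observed with a few one-liners. The inputs and the separators `:`, `-` and `;` are chosen only for illustration.

```shell
# NR is the record number, NF the number of fields on the current record.
printf 'a b c\nd e\n' | awk '{print NR, NF}'
# prints:
# 1 3
# 2 2

# FS (input separator) and OFS (output separator) are commonly set in BEGIN.
printf 'a:b:c\n' | awk 'BEGIN {FS = ":"; OFS = "-"} {print $1, $3}'
# prints: a-c

# RS changes what counts as a record; here records are separated by ";".
printf 'x;y;z' | awk 'BEGIN {RS = ";"} {print NR ": " $0}'
```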
# Deleting with awk
`awk` can be used to delete both fields and lines.
Deleting per pattern:
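One common way to drop lines by pattern is to print only the lines that do not match. A sketch with made-up input and the hypothetical pattern `/error/`:

```shell
# Keep only the lines that do NOT match /error/.
printf 'ok 1\nerror 2\nok 3\n' | awk '!/error/'
# prints:
# ok 1
# ok 3
```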
Deleting per positional variable:
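A field can be removed from the output either by blanking it or by printing only the fields to keep. Both variants are sketched below on an invented three-field line.

```shell
# Blank out the second field; awk rebuilds the line with OFS, leaving two spaces.
printf 'a b c\n' | awk '{$2 = ""; print}'
# prints: a  c

# Alternatively, print only the fields that should remain.
printf 'a b c\n' | awk '{print $1, $3}'
# prints: a c
```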
Notes:
- The `delete` statement removes elements from an array (`delete array[key]`); it does not operate on fields or lines directly.
- It takes effect immediately, at the point in the program where it is executed.
- It works on any existing array, including one filled by `split`.
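For reference, the `delete` statement defined by POSIX awk operates on array elements. A minimal sketch, assuming a comma-separated input line and a hypothetical array named `parts`:

```shell
# split fills the array parts; delete removes one element straight away.
echo 'a,b,c' | awk '{
    n = split($0, parts, ",")
    delete parts[2]
    for (i = 1; i <= n; i++)
        if (i in parts) print i, parts[i]
}'
# prints:
# 1 a
# 3 c
```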
# Conditional logic:
Example if-else logic:
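A minimal sketch of `if-else` inside an action. The threshold of 18 and the labels are assumptions made for the example.

```shell
# Label each line by comparing the second field with a threshold.
printf 'alice 30\nbob 15\n' | awk '{
    if ($2 >= 18) {
        print $1, "adult"
    } else {
        print $1, "minor"
    }
}'
# prints:
# alice adult
# bob minor
```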