| AWK(1) | General Commands Manual | AWK(1) | 
awk —
awk [-F fs] [-v var=value] [-safe] [-d[N]] [prog | -f progfile] file ...
awk -version
awk is the Bell Labs implementation of the AWK
  programming language as described in The AWK Programming
  Language by A.V. Aho, B.W. Kernighan, and
  P.J. Weinberger.
awk scans each input
    file for lines that match any of a set of patterns
    specified literally in prog or in one or more files
    specified as -f progfile. With
    each pattern there can be an associated action that will be performed when a
    line of a file matches the pattern. Each line is
    matched against the pattern portion of every pattern-action statement; the
    associated action is performed for each matched pattern. The file name
    - means the standard input. Any
    file of the form
    var=value
    is treated as an assignment, not a filename, and is executed at the time it
    would have been opened if it were a filename. The option
    -v followed by
    var=value
    is an assignment to be done before prog is executed;
    any number of -v options may be present. The
    -F fs option defines the input
    field separator to be the regular expression fs.
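The difference between a -v assignment and an assignment given as a file operand can be sketched as follows. This is a minimal illustration, not part of the original page; the variable names n and x are hypothetical, and the input is read from standard input through the - file name.

```shell
# n is set with -v, so it is visible in BEGIN.
# x=2 is a file operand, so it is assigned only when awk reaches it,
# just before the next operand (here "-", standard input) is opened.
out=$(printf 'a b\n' | awk -v n=1 '
BEGIN { print "BEGIN:", n, "[" x "]" }   # x is still unset here
      { print "main:", n, x }            # x was assigned before "-" was read
' x=2 -)
printf '%s\n' "$out"
```

The BEGIN line shows an empty value for x, while the main rule sees x set to 2.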
The options are as follows:
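-d[N]
-f filename
    Read the program from the file filename. Multiple -f options may be
    specified.
-F fs
    Define the input field separator to be the regular expression fs.
-mr NNN, -mf NNN
-safe
    Functions such as system() make
    the program abort (with a warning message).
-v var=value
    Assign value to var before prog is executed. Any number of -v options
    may be present.
-version
    Print awk version on standard output and
    exit.

An input line is normally made up of fields separated by white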
    exit.An input line is normally made up of fields separated by white
    space, or by the regular expression the built-in variable
    FS is set to. If FS is null, the
    input line is split into one field per character. The fields are denoted
    $1,
    $2, ..., while
    $0 refers to the entire line.
    Setting any other field causes the re-evaluation of
    $0. Assigning to
    $0 resets the values of all
    other fields and the NF built-in variable.
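The interaction between field assignment and $0 can be sketched with a one-line input. This is an illustrative example only; the input text and replacement values are arbitrary.

```shell
out=$(printf 'a b c\n' | awk '{
    $2 = "X"            # assigning a field causes $0 to be rebuilt
    print               # prints the rebuilt record
    $0 = "p q"          # assigning $0 re-splits the fields and resets NF
    print NF, $1
}')
printf '%s\n' "$out"
```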
A pattern-action statement has the form

    pattern { action }

A missing { action } means print the line; a missing pattern always
    matches. Pattern-action statements are separated by newlines or
  semicolons.
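The three cases above (missing action, missing pattern, and statements separated by a semicolon) can be sketched as follows. The input words are arbitrary sample data.

```shell
# Missing action: matching lines are printed.
out1=$(printf 'one\ntwo\nthree\n' | awk '/^t/')
# Missing pattern: the action runs for every line.
out2=$(printf 'one\ntwo\nthree\n' | awk '{ print NR ": " $0 }')
# Two pattern-action statements separated by a semicolon.
out3=$(printf 'one\ntwo\nthree\n' |
    awk '/one/ { print "first" }; /three/ { print "last" }')
printf '%s\n' "$out1" "$out2" "$out3"
```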
An action is a sequence of statements. Statements are terminated
    by semicolons, newlines or right braces. An empty
    expression-list stands for
    $0. String constants are
    quoted "", with the usual C escapes
    recognized within. Expressions take on string or numeric values as
    appropriate, and are built using the
    Operators (see next subsection).
    Variables may be scalars, array elements (denoted
    x[i]) or fields. Variables are
    initialized to the null string. Array subscripts may be any string, not
    necessarily numeric; this allows for a form of associative memory. Multiple
    subscripts such as
    [i,j,k]
    are permitted; the constituents are concatenated, separated by the value of
    SUBSEP.
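A minimal sketch of multiple subscripts, assuming a hypothetical array named cell: the subscripts are joined with SUBSEP into a single string key, which split() can take apart again.

```shell
out=$(awk 'BEGIN {
    cell[2, 3] = "x"                 # stored under the key "2" SUBSEP "3"
    for (key in cell) {
        split(key, part, SUBSEP)     # recover the individual subscripts
        print part[1], part[2], cell[key]
    }
    if ((2, 3) in cell)              # membership test with the same key form
        print "found"
}')
printf '%s\n' "$out"
```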
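awk operators, in order of decreasing precedence, are:

(...)
    Grouping.
$
    Field reference.
++ --
    Increment and decrement.
^
    Exponentiation (the ** form is also supported, and
    **= for the assignment operator).
+ - !
    Unary plus, unary minus and logical negation.
* / %
    Multiplication, division and modulus.
+ -
    Addition and subtraction.
space
    String concatenation.
< > <= >= != ==
    Relational operators.
~ !~
    Regular expression match and non-match.
in
    Array membership.
&&
    Logical AND.
||
    Logical OR.
?:
    C conditional expression, used as expr1 ? expr2
    : expr3. If
    expr1 is true, the result value is
    expr2, otherwise it is expr3.
    Only one of expr2 and expr3 is
    evaluated.
= += -= *= /= %= ^=
    Assignment and operator-assignment.

A statement is one of the following:

if (expression) statement [else statement]
while (expression) statement
for (expression; expression; expression) statement
for (var in array) statement
do statement while (expression)
break
continue
{ [statement ...] }
expression
    Commonly var = expression.
return [expression]
next
nextfile
delete array[expression]
delete array
exit [expression]
close(expr)
fflush(expr)
getline [var]
    Set var (or $0 if
    var is not specified) to the next input record from
    the current input file. getline returns 1 for a
    successful input, 0 for end of file, and -1 for an error.
getline [var] < file
    Set var (or $0 if
    var is not specified) to the next input record from
    the specified file.
expr | getline
    Pipe the output of expr into getline; each call of
    getline returns the next line of output from
    expr.
print [expr-list] [redirection]
printf format[, expr-list] [redirection]

Both print and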
    printf statements write to standard output by
    default. The output is written to the file or pipe specified by
    redirection if one is supplied, as follows:
    > file,
     >>
    file, or
    | expr. Both
    file and expr may be literal
    names or parenthesized expressions; identical string values in different
    statements denote the same open file. For that purpose the file names
    /dev/stdin, /dev/stdout, and
    /dev/stderr refer to the program's
    stdin, stdout, and
    stderr respectively (and are unrelated to the
    fd(4) devices of the same
  names).
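The redirection forms can be sketched as below. This example assumes a sort(1) utility is available on the system; the command string "sort -n" is chosen only for illustration.

```shell
out=$(printf '3\n1\n2\n' | awk '
{ print $1 | "sort -n" }       # identical strings denote the same open pipe
END {
    close("sort -n")           # wait for sort to finish and emit its output
    print "done" > "/dev/stderr"
}' 2>/dev/null)
printf '%s\n' "$out"
```

Because every print statement uses the same string for the pipe, all three records go to a single sort process rather than to three separate ones.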
atan2(x, y)
    Arctangent of x/y in
    radians. See also
    atan2(3).
cos(expr)
exp(expr)
int(expr)
log(expr)
rand()
sin(expr)
sqrt(expr)
srand([expr])
    Set the seed for the random number generator (rand()) and
    return the previous seed.
gensub(r, s, h [, t])
    Search the target string t for matches of the regular expression r.
    If h is a string beginning with ‘g’ or
    ‘G’, then replace all matches of
    r with s. Otherwise,
    h is a number indicating which match of
    r to replace. If no t is
    supplied, $0 is used
    instead. Unlike sub() and
    gsub(), the modified string is returned as the
    result of the function, and the original target is not
    changed. Note that the ‘\n’
    sequences within replacement string s supported by
    GNU awk are not supported at
    this moment.
gsub(r, s [, t])
    Same as sub() except that all occurrences of the
    regular expression are replaced; sub() and
    gsub() return the number of replacements.
index(s, t)
length[([string])]
    The length of string, or of $0 if no argument.
match(s, r)
split(s, a [, fs])
    Split the string s into array elements a[1],
    a[2], ...,
    a[n],
    and return n. The separation is done with the
    regular expression fs or with the field separator
    FS if fs is not given. An
    empty string as field separator splits the string into one array element
    per character.
sprintf(fmt, expr, ...)
sub(r, s [, t])
    Substitute s for the first occurrence of the regular expression r
    in the string t. If t is not given, $0 is used.
substr(s, m [, n])
tolower(str)
toupper(str)

awk provides the following two functions for
  obtaining time stamps and formatting them:
systime()
strftime([format [, timestamp]])
    Format timestamp according to format; timestamp is in the form
    returned by systime(). If
    timestamp is missing, current time is used. If
    format is missing, a default format equivalent to
    the output of date(1) would be
    used. See the specification of ANSI C
    strftime(3) for the format
    conversions which are supported.
system(cmd)

Patterns are arbitrary Boolean combinations (with ! ||
  &&) of regular expressions and relational expressions. Regular
  expressions are as in egrep(1).
  Isolated regular expressions in a pattern apply to the entire line. Regular
  expressions may also occur in relational expressions, using the operators
  ~ and !~.
  /re/ is
  a constant regular expression; any string (constant or variable) may be used
  as a regular expression, except in the position of an isolated regular
  expression in a pattern.
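The difference between an isolated regular expression and a match against a string value can be sketched as follows. The variable name re and the sample records are hypothetical.

```shell
out=$(printf 'apple 1\nbanana 2\ncherry 3\n' | awk -v re='^b' '
/an/         { print "isolated:", $0 }   # constant regex, tested against the whole line
$1 ~ re      { print "dynamic:", $1 }    # string variable used as a regex
$2 !~ /[12]/ { print "no match:", $2 }   # negated match on one field
')
printf '%s\n' "$out"
```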
A pattern may consist of two patterns separated by a comma; in this case, the action is performed for all lines from an occurrence of the first pattern through an occurrence of the second.
A relational expression is one of the following:
expression matchop regular-expression
expression relop expression
expression in array-name
(expr, expr, ... ) in array-name

where a relop is any of the six relational
    operators in C, and a matchop is either
    ~ (matches) or !~ (does not
    match). A conditional is an arithmetic expression, a relational expression,
    or a Boolean combination of these.
The special patterns BEGIN and
    END may be used to capture control before the first
    input line is read and after the last. BEGIN and
    END do not combine with other patterns.
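A minimal sketch of BEGIN and END around an ordinary rule, using a hypothetical counter named n:

```shell
out=$(printf 'a\nb\n' | awk '
BEGIN { print "start" }        # runs before the first input line is read
      { n++ }                  # runs once per input line
END   { print n, "lines" }     # runs after the last input line
')
printf '%s\n' "$out"
```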
If an awk program consists of only actions with the pattern
    BEGIN, and the BEGIN action
    contains no getline statement, awk exits without
    reading its input when the last statement in the last
    BEGIN action is executed. If an awk program consists
    of only actions with the pattern END or only actions
    with the patterns BEGIN and
    END, the input is read before the statements in the
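    END actions are executed.

Built-in variables include:

CONVFMT
    Conversion format when converting numbers (default "%.6g").
FS
    Regular expression used to separate fields; also settable by option -F fs.
OFMT
    Output format for numbers (default "%.6g").
RSTART
    Position of the first character matched by match(); 0 if no match.
RLENGTH
    Length of the string matched by match(); -1 if no
    match.
SUBSEP
    Separates multiple subscripts (default 034).
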
function foo(a, b, c) { ...; return x }
Parameters are passed by value if scalar and by reference if array name; functions may be called recursively. Parameters are local to the function; all other variables are global. Thus local variables may be created by providing excess parameters in the function definition.
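The parameter rules can be sketched as follows. The function and variable names (sum, bump, v, i, total) are hypothetical; the wide gap in the parameter list is only a common convention for marking the excess parameters that serve as locals.

```shell
out=$(awk '
# i and total are excess parameters, used here as local variables.
function sum(arr, n,    i, total) {
    for (i = 1; i <= n; i++)
        total += arr[i]
    return total
}
# Arrays are passed by reference, so the caller sees this change.
function bump(arr) { arr[1]++ }
BEGIN {
    v[1] = 10; v[2] = 20
    i = "global"
    bump(v)
    print sum(v, 2), i      # the global i is untouched by sum()
}')
printf '%s\n' "$out"
```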
length() defaults
  to $0 and the empty parens can
  also be omitted in this case:
length > 72
Print first two fields in opposite order:
{ print $2, $1 }
Same, with input fields separated by comma and/or blanks and tabs:
BEGIN { FS = ",[ \t]*|[ \t]+" }
      { print $2, $1 }
Add up first column, print sum and average:
{ s += $1 }
END { print "sum is", s, "average is", s/NR }
Print all lines between start/stop pairs:
/start/, /stop/
Simulate echo(1):
BEGIN {
        for (i = 1; i < ARGC; ++i)
                printf("%s%s", ARGV[i], i==ARGC-1?"\n":" ")
}
Another way to do the same that demonstrates field assignment and
    $0 re-evaluation:
BEGIN { for (i = 1; i < ARGC; ++i)
  $i = ARGV[i]; print }
Print an error message to standard error:
{ print "error!" > "/dev/stderr" }
A.V. Aho, B.W. Kernighan, P.J. Weinberger, The AWK Programming Language, Addison-Wesley, 1988. ISBN 0-201-07981-X
AWK Language Programming, Edition 1.0, published by the Free Software Foundation, 1995
nawk has been the default system
  awk since NetBSD 2.0,
  replacing the previously used GNU awk.
The scope rules for variables in functions are a botch; the syntax is worse.
Only eight-bit character sets are handled correctly.
| July 5, 2022 | NetBSD 10.0 |