awk
awk is a scripting language for data extraction, often used in tandem with sed and grep for complex reporting needs. While large awk programs are possible, the language was intended primarily for one-liners that filter data and perform simple computations.
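As a minimal sketch of such one-liners, the following filters whitespace-separated records and totals one column. The sample data and the threshold of 100 are made up for illustration.

```shell
# Hypothetical sample data: a name and a size per line.
data='alpha 120
beta 80
gamma 300'

# Filter: print the name of every record whose second field exceeds 100.
big=$(printf '%s\n' "$data" | awk '$2 > 100 { print $1 }')

# Simple computation: sum the second field over all records.
total=$(printf '%s\n' "$data" | awk '{ sum += $2 } END { print sum }')

printf 'large entries:\n%s\n' "$big"
printf 'total: %s\n' "$total"
```

The first program is a pattern with an action, the second an action for every line plus an END block that runs once the input is exhausted.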
Several versions of awk are provided with Gentoo. A specific implementation may be selected with Project:Base/Alternatives. By default sys-apps/gawk will be pulled in, and that implementation will be used as an example for this article.
Installation
Installation usually happens when unpacking the stage tarball.
USE flags
The app-alternatives/awk USE flags select which version of awk to pull in:
USE flags for app-alternatives/awk (/bin/awk and /usr/bin/awk symlinks)
By default, sys-apps/gawk will be pulled in. To use a different implementation of awk, set the appropriate USE flags on app-alternatives/awk in package.use.
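A minimal sketch of such a package.use entry might look as follows. The file name under /etc/portage/package.use/ is arbitrary, and the flag names gawk and mawk are assumptions made for illustration; check the flags the package actually offers (for example with equery uses app-alternatives/awk) before copying this.

```
# /etc/portage/package.use/awk  (file name is an assumption)
# Assumed flag names: turn off the default gawk provider and select mawk instead.
app-alternatives/awk -gawk mawk
```

After changing the flags, re-emerging app-alternatives/awk should update the /bin/awk and /usr/bin/awk symlinks.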
USE flags for sys-apps/gawk (GNU awk pattern-matching language):
- mpfr: Use dev-libs/mpfr for high precision arithmetic (-M / --bignum)
- nls: Add Native Language Support (using gettext - GNU locale utilities)
- pma: Experimental Persistent Memory Allocator (PMA) support, which allows persistence of variables, arrays, and user-defined functions across runs
- readline: Enable support for libreadline, a GNU line-editing library that almost everyone wants
- verify-sig: Verify upstream signatures on distfiles
Emerge
Install awk:
root #
emerge --ask app-alternatives/awk
Configuration
Environment variables
- AWKPATH a list of directories searched for program files named at runtime with the --file option.
- AWKLIBPATH a list of directories searched for extension libraries named at runtime with the --load option.
- GAWK_READ_TIMEOUT the amount of time (in milliseconds) awk waits for input before giving up. (for sys-apps/gawk)
- GAWK_SOCK_RETRIES the number of times awk retries a socket connection before giving up. (for sys-apps/gawk)
- GAWK_MSEC_SLEEP the amount of time (in milliseconds) awk sleeps between retries. (for sys-apps/gawk)
- POSIXLY_CORRECT if set, has the same effect as the --posix switch.
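The effect of AWKPATH can be sketched as follows, assuming sys-apps/gawk is installed and callable as gawk. The temporary directory and the file name hello.awk are hypothetical and exist only for this demonstration.

```shell
# Create a throwaway directory holding a small awk program (hypothetical name).
libdir=$(mktemp -d)
printf 'BEGIN { print "loaded via AWKPATH" }\n' > "$libdir/hello.awk"

# The file name contains no slash, so gawk looks it up in the
# directories listed in AWKPATH. gawk is called explicitly because
# AWKPATH is a gawk feature.
out=$(AWKPATH="$libdir" gawk -f hello.awk)
printf '%s\n' "$out"

rm -r "$libdir"
```

Setting the variable only for the single command, as done here, avoids changing the search path for the rest of the shell session.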
Usage
Invocation
Invocation information for sys-apps/gawk:
user $
awk --help
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:          GNU long options: (standard)
        -f progfile             --file=progfile
        -F fs                   --field-separator=fs
        -v var=val              --assign=var=val
Short options:          GNU long options: (extensions)
        -b                      --characters-as-bytes
        -c                      --traditional
        -C                      --copyright
        -d[file]                --dump-variables[=file]
        -D[file]                --debug[=file]
        -e 'program-text'       --source='program-text'
        -E file                 --exec=file
        -g                      --gen-pot
        -h                      --help
        -i includefile          --include=includefile
        -I                      --trace
        -l library              --load=library
        -L[fatal|invalid|no-ext]        --lint[=fatal|invalid|no-ext]
        -M                      --bignum
        -N                      --use-lc-numeric
        -n                      --non-decimal-data
        -o[file]                --pretty-print[=file]
        -O                      --optimize
        -p[file]                --profile[=file]
        -P                      --posix
        -r                      --re-interval
        -s                      --no-optimize
        -S                      --sandbox
        -t                      --lint-old
        -V                      --version

To report bugs, see node `Bugs' in `gawk.info'
which is section `Reporting Problems and Bugs' in the
printed version.  This same information may be found at
https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.
PLEASE do NOT try to report bugs by posting in comp.lang.awk,
or by using a web forum such as Stack Overflow.

gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.

Examples:
        awk '{ sum += $1 }; END { print sum }' file
        awk -F: '{ print $1 }' /etc/passwd
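As a small sketch of how the standard options combine, the following uses -F to set the field separator and -v to pass a value into the program. The passwd-style records and the variable name min are invented for the example.

```shell
# Hypothetical passwd-style records: name:password:uid:gid
records='root:x:0:0
daemon:x:1:1
larry:x:1000:1000'

# -F: splits each line on colons; -v min=1000 assigns the awk
# variable "min" before the program starts reading input.
users=$(printf '%s\n' "$records" | awk -F: -v min=1000 '$3 >= min { print $1 }')
printf '%s\n' "$users"
```

Passing the threshold with -v keeps the program text constant, which avoids splicing shell variables into the quoted awk source.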
Usage in ebuilds
src_prepare() {
	default
	# Split the file in two parts, one for each run-protoc call
	awk '/--java_out/{x="test-sources-build-"++i;}{print > x;}' \
		java/core/generate-test-sources-build.xml || die
}
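The splitting idiom used in the ebuild above can be tried outside of Portage. The following is a sketch under stated assumptions: the input file, the --marker pattern, and the part- output names are all hypothetical, and the first input line is assumed to match the pattern so that an output file name is always set before the first print.

```shell
workdir=$(mktemp -d)

# Hypothetical input: each "--marker" line starts a new part.
printf '%s\n' '--marker one' 'a' '--marker two' 'b' 'c' > "$workdir/input.txt"

# On every marker line, switch to a new output file name; every line
# (the marker line included) is then written to the current part.
awk -v dir="$workdir" '
    /--marker/ { out = dir "/part-" (++n) }
    { print > out }
' "$workdir/input.txt"

first=$(cat "$workdir/part-1")
second=$(cat "$workdir/part-2")
printf 'part-1:\n%s\npart-2:\n%s\n' "$first" "$second"

rm -r "$workdir"
```

For inputs that produce many parts, calling close() on the previous file name before switching would keep the number of open files bounded.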
Removal
Removal is not recommended since awk is a member of @system.
Unmerge
Uninstall package:
root #
emerge --ask --depclean --verbose app-alternatives/awk
See also
- ed — a line-based text editor with support for regular expressions
- sed — a program that uses regular expressions to programmatically modify streams of text
- grep — a tool for searching text files with regular expressions
- Perl — a general purpose interpreted programming language with a powerful regular expression engine.
- Raku — a high-level, general-purpose, and gradually typed programming language with low boilerplate objects, optionally immutable data structures, and an advanced macro system.