awk

awk is a scripting language for data extraction, often used together with sed and grep for reporting tasks. Large awk programs are possible, but the language was intended primarily for short programs, typically one-liners, that filter data and perform simple computations.
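
As a minimal sketch of such a one-liner, the following filters lines on a condition and computes an average over the matching ones. The sample data and the variable names (sample, n, total, result) are made up for illustration.

```shell
# Hypothetical sample data: a name and a size in two whitespace-separated columns.
sample='alpha 10
beta 25
gamma 40'

# An awk rule has the form: pattern { action }.
# The pattern $2 > 20 selects lines, the action accumulates a count and a sum.
# The END rule runs once after the last input line has been read.
result=$(printf '%s\n' "$sample" | awk '$2 > 20 { n++; total += $2 } END { print n, total / n }')

echo "$result"
```

Two of the three lines pass the filter, so the command prints the count 2 followed by the average 32.5.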

Several implementations of awk are available in Gentoo. A specific implementation may be selected with Project:Base/Alternatives. By default sys-apps/gawk is pulled in, and this article uses that implementation for its examples.

Installation

awk is usually already installed, since it is included when unpacking the stage tarball.

USE flags

The app-alternatives/awk USE flags select which version of awk to pull in:

USE flags for app-alternatives/awk (/bin/awk and /usr/bin/awk symlinks)

+gawk Symlink to sys-apps/gawk
busybox Symlink to sys-apps/busybox
mawk Symlink to sys-apps/mawk (warning: mawk is not fully POSIX-compliant)
nawk Symlink to sys-apps/nawk
split-usr Enable behavior to support maintaining /bin, /lib*, /sbin and /usr/sbin separately from /usr/bin and /usr/lib*

By default, sys-apps/gawk will be pulled in. To use a different implementation of awk, set the appropriate USE flags on app-alternatives/awk in package.use.
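
As a sketch, selecting mawk instead of gawk could look like the entry below. The file name /etc/portage/package.use/awk is an arbitrary choice and assumes package.use is a directory; the entry also assumes that only one implementation flag may be enabled at a time, so the default gawk flag is switched off explicitly.

```
# /etc/portage/package.use/awk (hypothetical file name)
app-alternatives/awk -gawk mawk
```

The change takes effect once app-alternatives/awk has been rebuilt, for example with emerge --ask --oneshot app-alternatives/awk.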

The sys-apps/gawk package has its own USE flags:

USE flags for sys-apps/gawk (GNU awk pattern-matching language)

mpfr Use dev-libs/mpfr for high precision arithmetic (-M / --bignum)
nls Add Native Language Support (using gettext - GNU locale utilities)
pma Experimental Persistent Memory Allocator (PMA) support which allows persistence of variables, arrays, and user-defined functions across runs.
readline Enable support for libreadline, a GNU line-editing library that almost everyone wants
verify-sig Verify upstream signatures on distfiles

Emerge

Install awk:

root #emerge --ask app-alternatives/awk
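
Since app-alternatives/awk only manages symlinks, one way to check which implementation is currently active is to resolve the awk found in PATH. This is a sketch that assumes readlink from GNU coreutils; the variable name awk_target is made up for illustration.

```shell
# Resolve the awk found in PATH to the binary it ultimately points at.
awk_target=$(readlink -f "$(command -v awk)")

echo "$awk_target"
```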

Configuration

Environment variables

  • AWKPATH: a list of directories searched for program files named at runtime with the --file option.
  • AWKLIBPATH: a list of directories searched for extension libraries named at runtime with the --load option.
  • GAWK_READ_TIMEOUT: the amount of time (in milliseconds) awk waits for input before giving up (for sys-apps/gawk).
  • GAWK_SOCK_RETRIES: the total number of retries when reading data from a socket (for sys-apps/gawk).
  • GAWK_MSEC_SLEEP: the amount of time (in milliseconds) awk sleeps between retries (for sys-apps/gawk).
  • POSIXLY_CORRECT: has the same effect as the --posix switch.

Usage

Invocation

Invocation information for sys-apps/gawk:

user $awk --help
Usage: awk [POSIX or GNU style options] -f progfile [--] file ...
Usage: awk [POSIX or GNU style options] [--] 'program' file ...
POSIX options:		GNU long options: (standard)
	-f progfile		--file=progfile
	-F fs			--field-separator=fs
	-v var=val		--assign=var=val
Short options:		GNU long options: (extensions)
	-b			--characters-as-bytes
	-c			--traditional
	-C			--copyright
	-d[file]		--dump-variables[=file]
	-D[file]		--debug[=file]
	-e 'program-text'	--source='program-text'
	-E file			--exec=file
	-g			--gen-pot
	-h			--help
	-i includefile		--include=includefile
	-I			--trace
	-l library		--load=library
	-L[fatal|invalid|no-ext]	--lint[=fatal|invalid|no-ext]
	-M			--bignum
	-N			--use-lc-numeric
	-n			--non-decimal-data
	-o[file]		--pretty-print[=file]
	-O			--optimize
	-p[file]		--profile[=file]
	-P			--posix
	-r			--re-interval
	-s			--no-optimize
	-S			--sandbox
	-t			--lint-old
	-V			--version

To report bugs, see node `Bugs' in `gawk.info'
which is section `Reporting Problems and Bugs' in the
printed version.  This same information may be found at
https://www.gnu.org/software/gawk/manual/html_node/Bugs.html.
PLEASE do NOT try to report bugs by posting in comp.lang.awk,
or by using a web forum such as Stack Overflow.

gawk is a pattern scanning and processing language.
By default it reads standard input and writes standard output.

Examples:
	awk '{ sum += $1 }; END { print sum }' file
	awk -F: '{ print $1 }' /etc/passwd
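
The two examples at the end of the help output can be tried on small inputs. The following sketch feeds them made-up data through a pipe instead of reading a file; the passwd-style records and the variable names are hypothetical.

```shell
# Hypothetical input: one number per line. The rule without a pattern runs
# for every line, and the END rule prints the accumulated sum.
total=$(printf '3\n4\n5\n' | awk '{ sum += $1 }; END { print sum }')

# Hypothetical passwd-style records. -F: makes the colon the field separator,
# so $1 is the login name.
logins=$(printf 'root:x:0:0::/root:/bin/bash\nlarry:x:1000:1000::/home/larry:/bin/bash\n' \
	| awk -F: '{ print $1 }')

echo "$total"
echo "$logins"
```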

Usage in ebuilds

CODE Example for using awk in an ebuild (dev-java/protobuf-java)
src_prepare() {
	default

	# Split the file in two parts, one for each run-protoc call
	awk '/--java_out/{x="test-sources-build-"++i;}{print > x;}' \
		java/core/generate-test-sources-build.xml || die
}
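
The splitting technique used in the ebuild can be tried outside of Portage. The sketch below uses a made-up input file and a hypothetical part- prefix instead of the real protobuf build file, and writes into a temporary directory.

```shell
# Work in a scratch directory so the generated files are easy to clean up.
work_dir=$(mktemp -d)

# Hypothetical input: each line containing "--java_out" starts a new section.
printf '%s\n' 'run --java_out=a' 'file1' 'run --java_out=b' 'file2' 'file3' \
	> "$work_dir/input.txt"

# Same rule structure as in the ebuild: a matching line switches the output
# file name, then every line is written to the currently selected file.
awk -v prefix="$work_dir/part-" \
	'/--java_out/ { x = prefix (++i) } { print > x }' "$work_dir/input.txt"

first=$(cat "$work_dir/part-1")
second=$(cat "$work_dir/part-2")

echo "$first"
echo "$second"

# Clean up the scratch directory.
rm -r "$work_dir"
```

In this sketch the input starts with a matching line, so x is always set before the first print; lines that appear before the first match would have no output file name to go to.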

Removal

Removal is not recommended since awk is a member of @system.

Unmerge

Uninstall package:

root #emerge --ask --depclean --verbose app-alternatives/awk

See also

  • ed — a line-based text editor with support for regular expressions
  • sed — a program that uses regular expressions to programmatically modify streams of text
  • grep — a tool for searching text files with regular expressions
  • Perl — a general purpose interpreted programming language with a powerful regular expression engine.
  • Raku — a high-level, general-purpose, and gradually typed programming language with low boilerplate objects, optionally immutable data structures, and an advanced macro system.

External resources