From Bash to Z Shell: Conquering the Command Line

Introduction to Shells

What’s a Shell?

There are various definitions of the term shell. When we say “shell”, we’re talking about an interface between the user and the computer’s operating system. You can think of a shell as a “wrapper” around the operating system, one that protects you from the system. It lets you use the computer without needing to understand the low-level details that programmers do.

Shell Types and Versions

Which shell are you using? If you aren’t sure, here are three command lines to try:

echo $0
echo $SHELL
ps -p $$

Expansion and Substitution

Shells handle wildcards by expansion.

There’s one more important question here: what happens if expansion fails? For instance, what if you type a wildcard pattern that doesn’t match? The answer is: it depends. Shells have different ways to handle this; some can be configured, and others just use their default behavior. If a wildcarded argument doesn’t match any pathname, Bourne-type shells generally pass that unexpanded argument on to the program. On C-type shells, if no argument matches, they print an error and won’t run the command line at all. In C-type shells, if some of the arguments expand and others don’t, unmatched arguments are removed:

echo zz*zz
# echo: No match.
echo *conf zz*zz
# fsconf linuxconf netconf userconf
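The Bourne-type behavior described above is easy to check for yourself. The following is only a sketch, assuming a Bourne-type shell such as bash with its default settings; the scratch directory and the file name zz-demo-zz are made up for the illustration.

```shell
# Work in an empty scratch directory so the pattern cannot match anything.
start_dir=$PWD
scratch=$(mktemp -d)
cd "$scratch" || exit 1

# Let the shell try to expand the pattern into the positional parameters.
set -- zz*zz
unmatched=$1          # nothing matched, so the pattern is passed on unchanged

# Once a matching file exists, the same pattern expands to its name.
touch zz-demo-zz
set -- zz*zz
matched=$1

# Clean up and report.
cd "$start_dir" || exit 1
rm -rf "$scratch"
echo "no match: $unmatched"
echo "match:    $matched"
```

A program that receives the unexpanded pattern usually complains that no such file exists, which is how you tend to notice the failed match.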

Command History

When you enter a command line, the shell saves it in memory or in a disk file. Each line is assigned a number. To recall a previous command line, type the history expansion character ! followed by the history number or the first few letters of the command name. There are other shortcuts too – including !! to recall all of the previous command line, and ^x to recall the previous command line and remove x.

cal 1222 1995
# cal: illegal month value: use 1-12
^22
# cal 12 1995
# December 1995
# Su Mo Tu We Th Fr Sa
# ...
pwd
# /u/alison
ls /usr/local/src
!!/bigproj
# ls /usr/local/src/bigproj
cp !$/somefile .
# cp /usr/local/src/bigproj/somefile .
!ca
# cal 12 1995
# December 1995
# Su Mo Tu We Th Fr Sa
# ...
ls -l !37:$
# ls -l /usr/local/src/bigproj

Using Shell Features Together

for and foreach Loops

for dir in `echo $PATH | tr ':' ' '`; do
  cd "$dir"
  for file in *; do
    echo "$file"
  done
done | sort > proglist

C-type shells:

foreach f (*)
  cp -i "${f}" "OLD-${f}"
end

Building Our Script

Bourne shells also have an if command that’s designed to work with test – and, actually, with many Unix programs. if lets the shell make decisions based on the results of other commands. By the way, the C shells’ if works differently.

The exit status isn’t displayed on the screen (unless your shell is configured that way). The exit status of the previous command line is available from $? in Bourne-type shells and $status in C shells.
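To make the link between if and the exit status concrete, here is a minimal sketch for Bourne-type shells. The function name describe_path and the variable names are our own inventions, not part of any shell.

```shell
# Describe a path; `if` branches on the exit status of `test`.
describe_path() {
  if test -d "$1"; then
    echo "directory"
  elif test -e "$1"; then
    echo "other file"
  else
    echo "missing"
  fi
}

# $? holds the exit status of the most recent command; 0 means success.
true
status_of_true=$?

# After `||`, $? still holds the status of the command that failed.
false || status_of_false=$?

describe_path /
echo "true gave $status_of_true, false gave $status_of_false"
```

Any command can take the place of test here, because if looks only at the exit status and not at what the command prints.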

Entering and Editing the Command Line

Beyond Keystrokes: Commands and Bindings

Names and Extended Commands

In zsh you can execute a command simply by knowing its name. To do this, you press Esc x, then the name of the editing command.

Configuration and Key Binding: readline and zle

bash uses the GNU project’s standard library, readline, which does exactly what its name suggests. readline has its own configuration file, from which you can configure a lot of different tools; the sidebar “readline Configuration” has more information on this. You can still configure readline settings from your ~/.bashrc.

READLINE CONFIGURATION

Configuration for the readline library goes in a file named ~/.inputrc. A system-wide /etc/inputrc file also exists. Each line takes the same form as an argument to the bind built-in. So to bind a key you might, for example, put the following line in the file:

"\C-u": kill-whole-line

A number of readline variables allow other aspects of readline’s behavior to be configured.

set mark-modified-lines on

You can see what the current settings for all the variables are by typing bind -v. You can also set readline variables direct from bash using the bind command:

bind 'set mark-modified-lines on'

zsh has its own “zsh line editor”, or “zle” for short. Because it’s always used within zsh, the only way to set up zle is with standard shell commands.

zle -la

You can find out what commands are bound to keys already. Here’s how to do that in bash:

bind -p
bind -p | grep '"\\C-y"'
bind '"\M-p": history-search-backward'

The bash command is bind and the zsh command is bindkey; they have rather different syntaxes.

bindkey -L
bindkey '\C-y'
bindkey '\ep' history-beginning-search-backward

Finding the Key Sequences for Function Keys

Shells only have the ability to read strings of characters. The terminal emulator generates a string of characters (often called an escape sequence) for each special key. The simplest, most general way we know of to find out which sequence a key sends is to type read, then press Return, then the key combination you want to investigate. You’ll see the characters the terminal sends. Let’s try the function key F1.

read
# press F1, output ^[OP

The ^[ represents an escape character. The shells know that as \e, which is a little more obvious.

Binding Strings

Both shells have ways of binding a string instead of a command to a key sequence. This means that when you type the key sequence the string appears on the command line.

bind '"\C-xb": "bind"'
bindkey -s '\C-xb' bindkey

Multiple Key Sequences

bind '"\C-xdd": kill-word'
bindkey '\C-xdd' kill-word

Conflicts between Prefixes and Editor Commands

bind '"\C-xd": backward-kill-word'

This binding conflicts with the example using kill-word at the end of the previous section: After Ctrl-x-d, the shell doesn’t know whether you’re going to type a d next, to get kill-word. bash resolves this simply by waiting to see what you type next, so if you don’t type anything, nothing happens; if you then type a d, you get kill-word, and if you type anything else, the shell executes backward-kill-word, followed by the editor commands corresponding to whatever else you typed.

zsh works like bash if you type something immediately after the prefix. However, it has a bit of extra magic to avoid the shell waiting forever to see what character comes after the Ctrl-x-d. The shell variable KEYTIMEOUT specifies a value in hundredths of a second for which the shell will wait for the next key when the keys typed so far form a prefix.

Keymaps

	bash	zsh
emacs	set -o emacs	bindkey -e
vi	set -o vi	bindkey -v

To bind keys in a particular keymap, the clearest way is to use the keymap’s name.

bind -m vi-command '"B": backward-word'
bindkey -M vicmd B backward-word

Multiline Editing and the zsh Editor Stack

zsh has powerful handling for editing commands that span more than one line.

mv file1 old_file1<escape><return>
mv file2 old_file2<return>

The Buffer Stack

Every time you press Esc-q on a complete buffer, the buffer is pushed onto the end of the stack. Every time you press Return, the last piece of text pushed onto the end of the stack is popped off and loaded back into the line editor.

ls -L here/there/everywhere<escape><q>
man ls<return>
#> ls -L here/there/everywhere

A Quick Way of Getting Help on a Command

You can look up the documentation for a command you are entering by simply pressing Esc-h, without clearing the command line. This pushes the line for you, then runs the command run-help, which by default is an alias for man. Afterwards, the command line appears from the buffer stack.

Other Tips on Terminals

Unix terminal drivers – the part of the system that handles input and output for a terminal or terminal emulator – are slightly weird things.

We’ve already met some occasions where certain special keys are swallowed up. They were Ctrl-d at the start of the line, which meant end-of-file (EOF), Ctrl-s to stop output, and Ctrl-q to start it again.

There’s a program called stty that controls these settings.

Starting the Shell

Which files a shell executes at startup depends on two things: whether it’s a login shell, and whether it’s an interactive shell.

The shell actually has two ways of deciding whether to act as a login shell. One is if it’s given an option, -l, or --login in bash. The shell can also be started under a name that begins with a dash (-). This probably seems a bit odd; it’s an old Unix convention. The Unix login program, which runs when you log into a Unix system, uses this convention to tell the shell that it should act like a login shell.
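If you want to see which kind of shell you are in, something like the following sketch may help. It assumes that bash reports login status through its login_shell option and zsh through its login option; for any other shell it falls back on the leading-dash convention. The variable names interactive and login are ours.

```shell
# $- lists the single-letter options in effect; "i" marks an interactive shell.
case $- in
  *i*) interactive=yes ;;
  *)   interactive=no ;;
esac

# Ask the shell itself where possible; otherwise look for the leading dash.
if [ -n "${BASH_VERSION-}" ]; then
  if shopt -q login_shell; then login=yes; else login=no; fi
elif [ -n "${ZSH_VERSION-}" ]; then
  if [[ -o login ]]; then login=yes; else login=no; fi
else
  case $0 in
    -*) login=yes ;;
    *)  login=no ;;
  esac
fi

echo "interactive: $interactive, login: $login"
```

Typed at a prompt in a terminal window this normally reports an interactive shell; run from a script it normally reports neither.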

bash Startup Files

bash behaves differently according to which type of shell is running. If it is a login shell, it executes the code in the following files, in this order:

/etc/profile
~/.bash_profile, and if that doesn’t exist, bash executes the file ~/.bash_login instead. If that doesn’t exist either, it looks for ~/.profile and executes that if it exists.

Be particularly careful with ~/.profile, if you use that file to contain bash code. It is executed by the Bourne shell and all its direct descendants such as ksh. Putting bash-specific code in ~/.profile can cause problems.

If the shell isn’t a login shell but is interactive, bash executes the single file ~/.bashrc.

Note that no startup file is read by both login and non-login shells. If you want all interactive bash shells to run ~/.bashrc, you can put an appropriate command in ~/.bash_profile.

. ~/.bashrc

The first “.” is actually a command that tells the shell to read the named file like a startup file. That is, all code will be run inside the current shell instead of in a separate process.
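The difference between reading a file with “.” and running it as a separate process can be seen in a small experiment. This is a sketch only; the temporary file stands in for a startup file, and the variable greeting is invented for the purpose.

```shell
# Write a tiny stand-in for a startup file that sets one variable.
rcfile=$(mktemp)
echo 'greeting="set by startup file"' > "$rcfile"

# Run it as a separate process: the assignment dies with that process.
greeting=""
sh "$rcfile"
after_separate_process=$greeting

# Read it with ".": the assignment happens in the current shell and stays.
. "$rcfile"
after_dot=$greeting

rm -f "$rcfile"
echo "separate process: '$after_separate_process'"
echo "dot command:      '$after_dot'"
```

In a real ~/.bash_profile it is common to guard the line with a test such as [ -f ~/.bashrc ] so that a missing file doesn’t produce an error.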

If the shell is not interactive, bash does not automatically read any of the files. Instead, it looks for an environment variable called BASH_ENV. If it’s set, that variable should contain the full path to the startup file.

export BASH_ENV=~/.bashrc

One more file needs mentioning for completeness: ~/.bash_logout is run at the end of a login shell. There is no specific file run at the end of shells that aren’t login shells, but you can make the shell run ~/.bash_logout or any other file by putting the following in ~/.bashrc:

trap '. ~/.bash_logout' EXIT

A trap specifies code, within the quotes, for the shell to execute when something happens. Here we’ve told the shell to source ~/.bash_logout when it exits.
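You can watch an EXIT trap fire without logging out. The following sketch starts a child shell that sets a trap and then finishes normally; the messages are placeholders.

```shell
# The child shell registers an EXIT trap, does its work, and exits.
# The trap body runs after the last ordinary command.
trap_output=$(sh -c 'trap "echo cleanup ran" EXIT; echo working')
echo "$trap_output"
```

The trap text is printed last, which is the same ordering you get when ~/.bash_logout is sourced from a trap in ~/.bashrc.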

zsh Startup Files

When zsh starts, it can execute up to eight different files. Many of the files correspond to those in bash, and so have similar names. Also, the .login and .profile variants have come from different predecessors of zsh: csh and sh. Here is the complete list, in the order in which the shell considers whether to execute each file:

/etc/zshenv. This is the only file that is always executed.
~/.zshenv. This is executed as long as the option rcs is turned on, which is true by default.
/etc/zprofile. This is executed if the options rcs, globalrcs, and login are set. The option globalrcs is normally turned on, like rcs.
~/.zprofile. This is executed if the options login and rcs are turned on.
/etc/zshrc. This is executed if the options interactive, rcs, and globalrcs are turned on.
~/.zshrc. This is executed if the options interactive and rcs are turned on.
/etc/zlogin. This is executed if the options login, rcs, and globalrcs are turned on.
~/.zlogin. This is executed if the options login and rcs are turned on.

The options login and interactive are the same as the -l and -i options you can pass to the shell.
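The rules in the list can be summarized as a small model. This is not how zsh is implemented; it is a sketch written as an ordinary shell function, assuming the options rcs and globalrcs are left turned on. The function name zsh_startup_files is hypothetical.

```shell
# Print the startup files zsh would execute, in order.
# Usage: zsh_startup_files <login: yes|no> <interactive: yes|no>
# Assumes the options rcs and globalrcs keep their default (on) settings.
zsh_startup_files() {
  echo "/etc/zshenv"
  echo "~/.zshenv"
  if [ "$1" = yes ]; then          # login shells
    echo "/etc/zprofile"
    echo "~/.zprofile"
  fi
  if [ "$2" = yes ]; then          # interactive shells
    echo "/etc/zshrc"
    echo "~/.zshrc"
  fi
  if [ "$1" = yes ]; then          # login shells again, after the zshrc files
    echo "/etc/zlogin"
    echo "~/.zlogin"
  fi
}

# A terminal window usually starts an interactive, non-login shell.
zsh_startup_files no yes
```

The model makes one point easy to see: the zprofile files run before the zshrc files and the zlogin files run after them, so you can choose which side of ~/.zshrc your login-time settings belong on.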

Shell Options

Options are one of two main ways to control the shell; shell variables (including environment variables) are the other way.

Setting Options with bash and zsh

Sometimes you’d like the shell to start up without running your startup files.

In bash, the option is --norc.

In zsh, the corresponding option works like the following:

zsh -f

Setting Options with set

Both bash and zsh use the command set to turn on options. set accepts the same options you can give the shell at startup time. Note, though, that bash doesn’t let you set any of the options beginning with -- this way.

Here’s an example, this time using an option common to both shells: -v, for “verbose”. This option tells the shell to echo (show) what it’s about to execute. First, on the command line as you start the shell:

zsh -v
echo you will see this line twice
# echo you will see this line twice
# you will see this line twice

Second, we’ll set the option after starting the shell:

zsh
set -v
echo you will see this line twice
# echo you will see this line twice
# you will see this line twice

To turn the option off again, use +v instead of -v:

set +v
# set +v
echo I will show this only once.
# I will show this only once.

Since set has other purposes as well as option setting, you need to tell the shell there is a named option coming up, which you do with a -o. To turn off that option you use +o instead.

set -o verbose
set +o verbose

The -o has another use: if you type set -o on its own, not followed by anything, the shell will show which options are turned on.
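Here is a short sketch of named options in use, assuming bash, where the listing from set -o shows each option name followed by on or off. The helper option_state is our own, and noglob is used only because turning it on and off has no visible side effects here.

```shell
# Report whether a named option is on or off, using the listing from `set -o`.
option_state() {
  set -o | awk -v name="$1" '$1 == name { print $2 }'
}

set -o noglob                       # turn the named option on
state_after_on=$(option_state noglob)

set +o noglob                       # and off again
state_after_off=$(option_state noglob)

echo "after set -o: $state_after_on"
echo "after set +o: $state_after_off"
```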

Setting zsh Options with setopt

There is some freedom in zsh about how you spell named options. This can make options even more readable. It may not be obvious what histexpiredupsfirst means, but Hist_Expire_Dups_First is clearer. zsh ignores all the underscores as well as the string’s case. In other words, yet another way of writing this option is hist_expire_dups_first.

To invert a zsh option, you can place no (or NO, or No_, and so on) in front of the option name.

setopt verify
setopt noverify

bash has a command shopt that’s a bit like setopt. You can turn options on with shopt -s and turn them off with shopt -u. The options are different from the ones you set with set -o, although you can use shopt -o to manipulate that other set. You can set shopt options when starting the shell by using -O to introduce it instead of -o. Once the shell has been started, you must use shopt.

More About Shell History

Setting Up Variables

Variable in bash	Variable in zsh	Purpose
HISTSIZE	Same	Number of lines stored within the shell
HISTFILE	Same	File where history is saved for future use
HISTFILESIZE	SAVEHIST	Maximum number of lines in $HISTFILE

“Bang” History: The Use of Exclamation Marks

Both bash and zsh have a history mechanism inherited from the C shell that uses the ! character to retrieve history lines.

When csh was written, this feature was the only way of using lines from the shell’s history. There was no command-line editing except for the few simple operations configured with stty. You’ll probably use the line editor to do many of the tasks represented by complex history substitutions like !1023:1:t.

For commands before the last one, you can use a negative number. !! is the same as !-1, !-2 is the command immediately before that, and so on.

History Words

There are two more arguments you can have in a history expansion: one to select the word in the history, the next to modify the retrieved text in some way. Neither needs to be there, but you can have all three. The arguments are separated by colons.

To restrict the retrieved text to one or more words in the history line, you can use one of the following:

Numbers, where 0 is the command name and 1 is its first argument.
* for everything except the command name.
$ for the last argument on the command line.
Two numbers, or a number and a $, with a - in between. This selects a range.

echo History is bunk.
# History is bunk.
!!:0
#> echo
#
!!:0-1
#> echo History
# History
echo I said !!:$
#> echo I said bunk.
# I said bunk.
echo The arguments were !!:*
#> echo The arguments were History is bunk.
# The arguments were History is bunk.

You can shorten the !! to a single ! if it’s followed by one of these forms.

In zsh, a single ! refers to the last line you referred to, if you had previous history substitutions on the line.

perl -e 'printf "%c\n", 65;'
# A
perl -e 'printf "%c\n", 48;'
# 0
perl !-2:* !:*
#> perl -e 'printf "%c\n", 65;' -e 'printf "%c\n", 65;'
# A
# A

You can turn this feature off by setting the shell option csh_junkie_history.

Modifiers

The third part of a history substitution is a “modifier” because it modifies the word you’ve picked out so far.

Modifiers that deal with filenames assume a standard Unix-style path. This means if you’re working under Cygwin you won’t get the right effect from Windows-style paths with backslashes.

ls -l /home/pws/zsh/sourceforge/zsh/Src/Zle/zle_main.c
ls !:2:h
#> ls /home/pws/zsh/sourceforge/zsh/Src/Zle
echo !-2:2:t >>source_files.lis
#> echo zle_main.c >>source_files.lis
echo !-3:2:r
#> echo /home/pws/zsh/sourceforge/zsh/Src/Zle/zle_main
echo There\'s no substitute for hard work.
!!:s/substitute/accounting/
#> echo There\'s no accounting for hard work.
echo There are substitutes for butter.
!!:&
#> echo There are accountings for butter.

The substitution was remembered. It is kept until the next :s, or until you exit the shell.

Changing something in the last line is so common that there’s a shorthand for it:

echo There\'s no substitute for hard work.
^hard work^taste
#> echo There\'s no substitute for taste.
echo There\'s no substitute for hard work.
^hard work^taste^:s/substitute/accounting/
#> echo There\'s no accounting for taste.

Substitutions usually just substitute once in each line:

echo this line is not a large line
^line^lion
#> echo this lion is not a large line

The way to do global replacements is to use :gs in place of :s:

echo this line is not a large line
!!:gs/line/lion/
#> echo this lion is not a large lion
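History modifiers only work on an interactive command line, so they are awkward to experiment with in a script. As a rough analogy (not the same mechanism), the sketch below reproduces the effect of :h, :t, :r, :s, and :gs with standard parameter expansion and sed. The variable names are ours.

```shell
file=/home/pws/zsh/sourceforge/zsh/Src/Zle/zle_main.c

head_part=${file%/*}     # like :h, strip the last path component
tail_part=${file##*/}    # like :t, keep only the last path component
root_part=${file%.*}     # like :r, strip the filename extension

line="this line is not a large line"
once=$(printf '%s\n' "$line" | sed 's/line/lion/')      # like :s, first match only
global=$(printf '%s\n' "$line" | sed 's/line/lion/g')   # like :gs, every match

echo "$head_part"
echo "$tail_part"
echo "$root_part"
echo "$once"
echo "$global"
```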

More Options for Manipulating History

Appending to the History List

It can often happen that you have several shells, each running in a different window. When you log out, all the shells exit at once and they all try to save their history at the same time. The default behavior means that the last to exit will win, and will overwrite the output saved by the others. You can fix this in bash by using the following setting:

shopt -s histappend

The equivalent option in zsh is set like this:

setopt append_history

A compromise option, inc_append_history, saves the line to the history file immediately but doesn’t attempt to read back lines from the history file. You can issue the command fc -R in another shell to read the history file.

Pruning and Massaging the History List

Both shells have mechanisms that give you more control over what is saved in the history list. This avoids the frustration of having a large number of useless lines remembered while the line you actually want has already been forgotten. The shells do this in their own ways.

bash Features for Pruning History

The variable HISTCONTROL can take various values.

ignorespace: Lines beginning with a space are not to be saved in the history.
ignoredups: If the same line occurs twice in a row, only one instance is saved.
ignoreboth: Combines the effect of the two previous values.
erasedups: When a line is added to the history any duplicate of the line is erased from the history.

The other variable is HISTIGNORE. This requires a bit more thought. It is a colon-separated list of patterns that are matched against the complete line. You can use the wildcard * to match any text. A matched line is not saved to the history file. There’s one shortcut, which is that & matches the last line saved in the history before the one being considered.

HISTIGNORE=" *:&"

is equivalent to setting HISTCONTROL to ignoreboth.
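The rule expressed by ignoreboth, or by the HISTIGNORE value above, can be modeled in a few lines. This sketch is not bash’s own code; would_be_ignored is a hypothetical helper that only mimics the decision.

```shell
# Model of HISTCONTROL=ignoreboth (or HISTIGNORE=" *:&").
# $1 is the new command line, $2 is the line most recently saved.
# Returns 0 (true) if the new line would be left out of the history.
would_be_ignored() {
  case $1 in
    " "*) return 0 ;;      # starts with a space: the " *" pattern
  esac
  [ "$1" = "$2" ]          # identical to the previous line: the "&" pattern
}

if would_be_ignored " rm secret.txt" "ls"; then
  echo "dropped: leading space"
fi
if would_be_ignored "ls" "ls"; then
  echo "dropped: repeat of previous line"
fi
if ! would_be_ignored "pwd" "ls"; then
  echo "kept: pwd"
fi
```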

zsh Features for Pruning History

zsh doesn’t have the general history control of bash’s HISTIGNORE at the moment; everything is controlled by shell options.

hist_ignore_dups: Consecutive duplicates are not saved. Actually, the most recent version of the line is always kept.
hist_ignore_all_dups
hist_save_no_dups
hist_expire_dups_first
hist_find_no_dups
hist_ignore_space
hist_no_functions: Function definitions aren’t saved to the history.

A Few More History Tricks

Forcing History to be Read or Written

# bash
history -r
history -w
# zsh
fc -R
fc -W

Prompts

Basic Prompting

Prompt Name	bash or zsh	Purpose
PS1	Both	Normal command prompt
PS2	Both	Continuation command prompt
PS3	Both	Prompt for select built-in
PS4	Both	Debugging with xtrace (-x) option
RPS1	zsh	Normal command prompt at right
RPS2	zsh 4.2	Continuation command prompt on the right
SPROMPT	zsh	Used when offering to correct spelling
TIMEFORMAT	bash	Output from time built-in
TIMEFMT	zsh	Output from time built-in
WATCHFMT	zsh	Information on logins and logouts

Prompts in bash

bash Form	zsh Form	Expansion
\$	%#	# if superuser, else $ (bash) or % (zsh)
\u	%n	Username
\h	%m	Host (machine) name
\W	%.	Last part of current directory
\w	%~	Full name of current directory
\A	%T	Time in the 24-hour format
\@	%@ (or %t)	Time in the 12-hour format
\t	%*	Time in the 24-hour format with seconds
\T	%D{string}	Time in the 12-hour format with seconds (in zsh, via a strftime-style format string)

Color	Code for Foreground	Code for Background
Black	30	40
Red	31	41
Green	32	42
Yellow	33	43
Blue	34	44
Purple	35	45
Cyan	36	46
White	37	47
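The numbers in the table are the codes used in ANSI escape sequences of the form Escape, [, codes separated by semicolons, then m. The sketch below builds such a sequence; color_seq and reset_seq are names we made up, and code 0 is assumed to reset the terminal to its normal colors.

```shell
# Build an escape sequence from a foreground and a background code.
color_seq() {
  printf '\033[%s;%sm' "$1" "$2"
}

# Code 0 switches all attributes back to normal.
reset_seq=$(printf '\033[0m')

# Red text (31) on a black background (40), then back to normal.
printf '%sWarning%s\n' "$(color_seq 31 40)" "$reset_seq"
```

Inside a prompt, the sequence needs to be marked as taking up no space on the screen: bash uses \[ and \] around it, and zsh uses %{ and %}.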

Prompts in zsh

For percent escapes to be replaced, the option prompt_percent has to be turned on, as it usually is.

To be able to include variables and other standard substitutions in your prompts, the option prompt_subst has to be turned on. This is not on by default, since very old versions of zsh didn’t have the option. It’s probably a good idea just to turn it on. If you do, and you want a literal $ to appear in the prompt, you can simply put a backslash in front of the $ in the prompt definition.

Previewing zsh Prompts

print -rP '%j%# '

Prompt Escapes in zsh

Showing Information About What the Shell Is Executing

You’ve already learned about the %_ reference for inserting the name of the activity or activities the shell is performing. This is the magic in PS2 that makes the shell display what syntactical structure it’s waiting to complete. The default $PS2 is %_>. Remember that PS2 is the continuation prompt; the shell outputs this when the first piece of input wasn’t a complete command, so that it is waiting for more.

Visual Effects

%B and %b start and stop bold mode
%S and %s start and stop standout mode
%U and %u start and stop underline mode

Advanced Examples

Ternary Expressions

true
print -rP 'The last status was %(?.zero.non-zero)'
# The last status was zero
false
print -rP 'The last status was %(?.zero.non-zero)'
# The last status was non-zero
false
print -rP 'The last status was %(1?/one/not one)'
# The last status was one

Variables in Prompts

We have already mentioned that if you want variables or command substitutions to be expanded inside prompts you need to set the option prompt_subst. There’s one special case that works without that option: the variable psvar. It’s actually an array.

psvar=(one two three four)
print -rP '%1v and a %2v'
# one and a two

Some of the more interesting variable substitutions you might want to consider putting into prompts for one reason or another are $SECONDS, which tells you how many seconds the shell has been running; $COLUMNS, which tells you the width of the terminal; and $RANDOM, which is just a random number.

expr='%$(( RANDOM & 1 ))(?./.\\)'
PS1="$expr$expr$expr$expr%# "
#> //\\%
#> ///\%
#> \\\/%

Prompt Themes

autoload -U promptinit
promptinit
prompt -l
prompt -p suse
prompt suse

Prompts for Spell Checking

zsh can correct the first word of the name of a command, if you misspell it, and it can correct words in the name of a file. This feature is taken from tcsh.

You can activate spell checking in various ways:

By pressing Esc-s, which tells the line editor to correct the word. However, you are probably better off relying on the completion system. If you have it loaded, you can use Ctrl-x-c to correct the word. This is usually bound to the editor function _correct_word.
By setting the option correct. Then when you press Return the shell looks at the command word, and if it doesn’t recognize it as a command it tries to find a correction, which it then offers to you.
By setting the option correct_all. This is like correct, but it checks arguments after the command. However, it simply assumes they are files, and tries to correct the words to filenames.

In the second and third cases, the shell prompts you for what to do using the variable SPROMPT. However, there are two additional prompt escapes: %R turns into the original string, the one the shell wants to correct, and %r turns into what it wants to correct it to.

The default value is zsh: correct '%R' to '%r' [nyae]?. The letters in square brackets indicate the letters you can type.

Watching Logins and Logouts

A long-standing zsh feature taken straight out of tcsh is the ability for the shell to tell you when other people log into or out of the computer.

If one of those users logs in or out, the shell then prompts you with the variable WATCHFMT, which has its own prompt escapes. This means that WATCHFMT does not understand the escapes we described for the PS strings. This always happens right before a prompt is printed, and only if at least $LOGCHECK seconds (default is 60) have elapsed since the last check. You can set LOGCHECK to zero so it checks each time.

WATCHFMT="The %(a.eager beaver.lazy so-and-so) %n has \
%(a.appeared.vanished) at %t."

Files and Directories

Finding Commands and Files

Finding a Command

If you simply want to find a command in your path, you can use the built-in type, common to both bash and zsh.

bash is a bit more verbose about functions. You can achieve a similar effect in zsh with one of the following commands or options:

The functions command, which lists only functions but lists them in full.
The which command. This is inherited from the C shell, but has slightly different behavior. This function also exists in recent versions of bash.
The option type -f lists functions in full, but still lists other types of commands, too.

Actually, the basic command in zsh that provides all the other ways of finding commands is whence. It’s used by which and type, although functions is different since you can also use it to set attributes for shell functions.

We haven’t talked about what happens when you add a new command somewhere in the command search path. Usually, this isn’t a problem; the shell will search for the new command. However, if there is already a command of the same name later in the search path, the shell will use the old command instead of the new one. This is because the shell keeps a list of commands so that it can quickly find out the location from the name. The list of commands is called a hash table from the way it is stored. You can issue the command hash -r (or, in zsh only, rehash) to fix this.
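The effect of the hash table can be reproduced in a sandbox. This sketch assumes bash; the command name greet_demo and the directory names early and late are invented, and the whole thing lives in a temporary directory that is removed at the end.

```shell
# Two directories on the path; at first only the later one has the command.
sandbox=$(mktemp -d)
mkdir "$sandbox/early" "$sandbox/late"
printf '#!/bin/sh\necho old\n' > "$sandbox/late/greet_demo"
chmod +x "$sandbox/late/greet_demo"

saved_path=$PATH
PATH="$sandbox/early:$sandbox/late:$PATH"

# Run it in the current shell so that its location is remembered.
greet_demo > "$sandbox/first.out"
first_run=$(cat "$sandbox/first.out")

# Now add a command of the same name earlier in the search path.
printf '#!/bin/sh\necho new\n' > "$sandbox/early/greet_demo"
chmod +x "$sandbox/early/greet_demo"

# The shell may still use the remembered location here.
greet_demo > "$sandbox/cached.out"
before_rehash=$(cat "$sandbox/cached.out")

# Throw away the remembered locations; the next lookup searches the path again.
hash -r
greet_demo > "$sandbox/rehashed.out"
after_rehash=$(cat "$sandbox/rehashed.out")

PATH=$saved_path
rm -rf "$sandbox"
echo "first: $first_run, before hash -r: $before_rehash, after hash -r: $after_rehash"
```

The output is redirected to files rather than captured with command substitution because a command substitution runs in a subshell, and a location remembered there is not passed back to the shell you are typing in.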

Managing Directories with the Shell

Shorthand for Referring to Directories

Example	Meaning
~	The user’s own home directory
~pws	The home directory of user pws
~var	The directory path given by the variable var (zsh only).
~+	The current directory, the same as $PWD
~-	The previous directory, the same as $OLDPWD
~+2	The second directory on the directory stack
~2	A shorter form for the second directory on the directory stack
~-0	The final directory on the directory stack

The Directory Stack

pwd
# /home/pws
pushd zsh/projects/zshbook
# ~/zsh/projects/zshbook ~
pushd ~/tmp
# ~/tmp ~/zsh/projects/zshbook ~
pushd ~/src/zsh
# ~/src/zsh ~/tmp ~/zsh/projects/zshbook ~
pushd ~/elisp
# ~/elisp ~/src/zsh ~/tmp ~/zsh/projects/zshbook ~
popd
# ~/src/zsh ~/tmp ~/zsh/projects/zshbook ~
dirs -v
# 0       ~/src/zsh
# 1       ~/tmp
# 2       ~/zsh/projects/zshbook
# 3       ~
pushd +2
# zsh  -> ~/zsh/projects/zshbook ~ ~/src/zsh ~/tmp
# bash -> ~/zsh/projects/zshbook ~/src/zsh ~/tmp  ~

More Argument Handling: Braces

A list of comma-separated items in braces expands to those items. This works in all versions of zsh and recent versions of bash.

ls myfile.{c,h}
# myfile.c myfile.h
mv brace_expansion.xml{,.bak}
# mv brace_expansion.xml brace_expansion.xml.bak
tar cf project_cheapskate{.tar,}
# tar cf project_cheapskate.tar project_cheapskate

Generating Numbers with Braces

echo {01..10}
# 01 02 03 04 05 06 07 08 09 10
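One way to get a feel for brace expansion is to capture its result without running a real command. This is a sketch, assuming bash or zsh; none of the files named need to exist, because brace expansion generates words rather than matching filenames.

```shell
# Capture the expanded words in the positional parameters, then join them.
set -- myfile.{c,h}
pair="$*"

# An empty alternative is allowed, which is what makes the backup idiom work.
set -- brace_expansion.xml{,.bak}
rename_args="$*"

# A numeric range expands to each number in turn.
set -- {1..5}
numbers="$*"

echo "$pair"
echo "$rename_args"
echo "$numbers"
```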

Redirection

Preventing Files from Being Clobbered

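Ordinarily, if the file logfile.txt already existed, a redirection such as >logfile.txt would overwrite it. Both bash and zsh offer an option noclobber that you can set to stop this happening. In the following example the option is on, so once the first command has created the file, the second is refused:

set -o noclobber
echo "**** Start of log file ****" >logfile.txt
echo "**** More log output ****" >logfile.txt
# logfile.txt: cannot overwrite existing file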

If you want to overwrite the file, you can delete or rename it, or you can use a special syntax that tells the shell you don’t care whether the file already exists:

echo "**** More log output ****" >|logfile.txt
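The whole sequence can be tried safely on a temporary file. The following is a sketch for Bourne-type shells; the variable names are ours, and the refused redirection is run in a subshell so that its error message can be discarded.

```shell
# Start a log file, then protect existing files from plain redirection.
logfile=$(mktemp)
echo "**** Start of log file ****" > "$logfile"
set -o noclobber

# The plain redirection is refused because the file already exists.
if (echo "**** More log output ****" > "$logfile") 2>/dev/null; then
  blocked=no
else
  blocked=yes
fi
content_after_refusal=$(cat "$logfile")

# The >| form overrides noclobber for this one redirection.
echo "**** More log output ****" >| "$logfile"
content_after_override=$(cat "$logfile")

set +o noclobber
rm -f "$logfile"
echo "blocked: $blocked"
```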

zsh tries to be clever if you set the option hist_allow_clobber. You’ll see what happens if you repeat the example without the pipe symbol. zsh gives you the error, just as before. However, if you use the Up Arrow key to go back in the history, you’ll find the shell inserted the pipe.

Pattern Matching

Basic Globbing

Operator	Meaning
*	Zero or more characters
?	Any single character
[abc]	Any of the characters a, b, or c
[a-z]	Any character that lies between a and z
[^a-z] or [!a-z]	Any character other than one that lies between a and z

bash has an option, nocaseglob, which if turned on makes the matching case insensitive. zsh 4.2 and greater also has the option no_case_glob.

Unfortunately, as character sets have been extended to allow for further characters, problems have become apparent. For instance, [A-Z] would not match an uppercase letter with an accent, which lies outside the ASCII range. Also, even in the English language, conventional dictionary order would not put Z before a.

The modern way of handling a character class involves using the collation order for the current locale. zsh still does things the old way, but even if you use zsh, you may find that character classes in regular expressions handled by commands like grep work differently from how they work in the shell. The collation order is even used by commands like ls to determine the order in which files are listed.

To allow things like uppercase letters to be matched, most modern Bourne-shell variants have an extension that lets you use names for character classes. You simply include an extra set of brackets, with a colon before and after the name, giving the complete result [[:name:]].

Name	Explanation
alnum	A letter or number.
alpha	A letter.
ascii	A character in the ASCII set.
blank	A space or tab. This is a GNU extension and might not always be available.
cntrl	A control character.
digit	A decimal digit (number).
graph	A printable character, but not a space.
lower	A lowercase character.
print	A printable character including space.
punct	A punctuation character, meaning any printable character that is neither alphanumeric nor one of the space characters.
space	A white-space character, usually space, (horizontal) tab, newline, carriage return plus the less usual vertical tab and form feed.
upper	An uppercase letter.
xdigit	A hexadecimal digit; an upper- or lowercase “a” to “f”, or a digit.

echo [[:digit:]m]*
# 1foo msg.txt
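Named character classes work anywhere the shell matches patterns, not just in filename generation. The sketch below uses them in a case statement, assuming a shell such as bash that supports the [[:name:]] syntax; classify_first_char is a hypothetical helper.

```shell
# Say what kind of character a word starts with.
classify_first_char() {
  case $1 in
    [[:digit:]]*) echo digit ;;
    [[:upper:]]*) echo upper ;;
    [[:lower:]]*) echo lower ;;
    *)            echo other ;;
  esac
}

for word in 1foo Msg msg.txt _hidden; do
  echo "$word: $(classify_first_char "$word")"
done
```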

Internationalization and Locales

The default locale is called C, as in the programming language. It will always exist and so can be selected as a way of disabling the special behaviors associated with internationalization.

Two variables particularly affect the shell:

LC_COLLATE determines the sort order of characters.
LC_CTYPE determines the character handling and classification behavior.

You can override all of the LC_ variables by storing a value in LC_ALL. Furthermore, by storing a value in LANG, you can specify a fallback for any variable that is left unset.
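Because LC_ALL overrides the other variables, setting it for a single command is a convenient way to get the plain C behavior on demand. In this sketch the four letters are arbitrary sample data; in the C locale, sort places all uppercase letters before all lowercase ones.

```shell
# Sort four letters using the C locale's collation order.
c_order=$(printf '%s\n' b A a B | LC_ALL=C sort | tr -d '\n')
echo "C locale order: $c_order"
```

Running the same pipeline without LC_ALL=C may give a different order, depending on the locale your system is set up to use.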

Globbing in Bash

Extended Globbing

If you find that you need some more sophisticated forms of pattern matching, you can turn on the extglob option. That makes available a set of patterns all with the same form: a special character, followed by an expression in parentheses. The expression in parentheses can contain vertical bars (or pipe symbols), which in this context separate pattern alternatives. The presence of the surrounding pattern delimiters stops the shell from recognizing the bar as a real pipe, since it’s meaningless in the middle of a pattern. This style of extended globbing comes originally from the Korn shell.

echo *.@(out|txt)
# ham.txt msg.out msg.txt spam.txt

In this case the alternatives were simple strings, but they could be patterns, too. An advantage of the parentheses is that they make it easy to bury patterns recursively inside others.

echo OUTPUT*.@(@(txt|log).old|bak)

A list of syntactical variations of this form follows:

@(expr) matches exactly one of the alternatives in expr.
?(expr) matches either nothing or one of the alternatives in expr.
*(expr) matches any number of repetitions, including zero, of the patterns in expr.
+(expr) matches one or more repetitions of the patterns in expr.
!(expr) matches anything except the patterns in expr.
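
To see a few of these forms side by side, here is a small sketch for bash. The scratch directory and file names are made up for the example, and the globs run in subshells so that the current directory is left alone:

```shell
#!/usr/bin/env bash
# extglob must be switched on before the shell parses any line that uses it.
shopt -s extglob

# Scratch directory with illustrative file names.
demo_dir=$(mktemp -d)
touch "$demo_dir"/ham.txt "$demo_dir"/msg.out "$demo_dir"/msg.txt "$demo_dir"/spam.log

not_txt=$(cd "$demo_dir" && echo !(*.txt))          # everything except *.txt
out_or_log=$(cd "$demo_dir" && echo *.@(out|log))   # exactly one alternative
msg_files=$(cd "$demo_dir" && echo msg.+([a-z]))    # one or more lowercase letters

echo "$not_txt"      # msg.out spam.log
echo "$out_or_log"   # msg.out spam.log
echo "$msg_files"    # msg.out msg.txt

rm -r "$demo_dir"
```

Because bash parses a whole function body when the function is defined, the shopt line needs to come before any function that contains these patterns.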

Ignoring Patterns

bash gives you a way of overriding the patterns matched by globbing. The variable GLOBIGNORE works like PATH in that it consists of different elements joined with colons.

echo *
# ham ham~ msg msg.bak msg~ newmsg
export GLOBIGNORE='*~'
echo *
# .msg.list ham msg msg.bak newmsg

Globbing in Zsh

Option	echo file1.* file2.*	echo file2.*
Default	Error	Error
no_nomatch	file1.c file1.h file2.*	file2.*
csh_null_glob	file1.c file1.h	Error
null_glob	file1.c file1.h	Empty argument list

Special Patterns in zsh

zsh’s additional globbing operators fall into two classes: those usually enabled, and those for which the option extended_glob must be enabled. The first set extends the operation of characters already special to the shell and is enabled by default, since it isn’t likely to cause problems with compatibility. The second set uses characters that aren’t special inside patterns unless the option is set.

Grouping Alternatives

Like ksh and bash (with the extglob option set), zsh can interpret pattern alternatives that are surrounded by parentheses and separated by a vertical bar. The difference from ksh and bash is that the parentheses aren’t preceded by a special character.

echo d*.(out|txt)

The uses we’ve shown for parentheses in zsh would cause other shells derived from the Bourne shell, such as bash, to report a syntax error. If you prefer to use only the syntax common to bash and zsh, you can set the option sh_glob to turn off this use of parentheses. The combination of sh_glob and ksh_glob turned on and extended_glob turned off makes zsh work very much like both bash and ksh. A problem you might notice when the ksh_glob option is in effect is that the form !(pattern) can be taken as a history reference. For better compatibility with other shells, you can turn off the option bang_hist.

Recursive Searching

The pattern **/ matches any number of directories (including none) to any depth.

Matching Numeric Ranges

echo data<1-9>
# data1 data2 data3

Either of the numbers can be omitted. In the simplest case, <->, the expression matches any set of digits. It even matches a number that’s usually too large for shell arithmetic, because the shell doesn’t need to do arithmetic; it just needs to match all the digits it encounters without turning them into an integer. If the first number in the range is omitted, it’s taken as 0, and if the second in the range is omitted, it’s effectively infinity.

Usually files are sorted as strings, even if they contain numbers. However, it’s possible to set the option numeric_glob_sort so that any ranges of digits within the pattern are sorted numerically.

Extended Globbing in zsh

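When you set the option extended_glob, the characters ^, ~, and # become special wherever they appear unquoted. The characters ~ and # already have meanings at the start of a word (home directory expansion and comments); conveniently, their new uses apply only when they don’t appear at the start of a word, so there is no clash.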

Negated Matches

echo A^r*
# AAreadme

Keep in mind that the ^ character is used to indicate control characters for the system’s stty command and the shell’s bindkey command. Unfortunately, with the extended_glob option set, the text ^c expands to every file in the directory except one called c. Make sure you put quotes around words containing ^ if you use extended globbing.

Pattern Exceptions

In zsh you can specify exceptions to patterns. This means that you can give the shell a pattern that it must match, and also a pattern that it must not match. The syntax is the pattern that must be matched, followed by a tilde character, ~, followed by the pattern that must not be matched.

echo m*
# msg msg.out msg.txt
echo m*~*.txt
# msg msg.out

Multiple Matches

The character # placed after a character or group allows that character or group to be matched any number of times, including zero.

echo [ab]#
# abba
echo (H?)#
# HeHiHo

A very close relative of # is ##. It indicates one or more occurrences of the preceding pattern.

echo A#readme
# AAreadme Areadme readme
echo A##readme
# AAreadme Areadme

Glob Qualifiers in Zsh

A glob qualifier is a brief set of characters in parentheses at the end of a pattern that provides some restriction on the type of file to be matched.

Single-Character Qualifiers

grep mypat Src/^*.o(.)

File Types

The following is a list of the most useful qualifiers for file types:

(.) matches a regular file; that is, one that is not a directory, link, or one of the other types of special files.
(/) matches a directory.
(*) matches an executable regular file.
(@) matches a symbolic link.

File Owners and Permissions

The qualifiers r, w, and x indicate files that are readable, writable, or executable by the current user. The qualifiers R, W, and X indicate files that are readable, writable, or executable by everyone on the system.

Combining Qualifiers

echo *(*rw)
echo *(*r^w)
echo *(/,*)

More Complicated Qualifiers: Numeric Arguments

File Sizes

ls -l ^*.o(Lk+100)

File Timestamps

ls -l *(mh-1)

Counting Links

ls -l *(.l2)

More Complicated Qualifiers: String Arguments

Specifying the File Owner

ls -ld /var/*(^u:root:)

File Permissions (Complicated)

echo *(f:u+rx:)
echo *(f:u+rx,o-x:)

Qualifiers for Ordering and Selecting

Normally the matches generated by globbing are sorted into the order of the names, with the effect of the option numeric_glob_sort taken into account. You can turn that option on for one pattern only with the qualifier n.

Ordering Files

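Using (on) produces the default order. If you turn that into (On), the order is reversed. The effect is visible only with a command such as echo that keeps its arguments in the order given; ls sorts its arguments itself.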

ls *(On)
# 1foo MSG ham.txt msg msg.out msg.txt newmsg spam.txt
echo *(On)
# spam.txt newmsg msg.txt msg.out msg ham.txt MSG 1foo

Changing What Is Displayed Using Qualifiers

echo *(M)
# dir/ prog
echo *(T)
# dir/ prog*

Colon Modifiers as Qualifiers

echo ~/.z*(:t:s/z/ZED/)
# .ZEDcompdump .ZEDlogout .ZEDshenv .ZEDshrc .ZEDtcp_sessions

Globbing Flags in Zsh

zsh allows various globbing flags that can appear anywhere in the pattern and affect the way it is interpreted. They always have the form (#X), where X is a code character, possibly followed by a numeric argument.

Case-Insensitive Matching

print (#i)readme
# README READme ReAdMe
print (#i)read(#I)me
# READme

Completion

Getting Started with Completion

bash_completion

. /etc/bash_completion

zsh’s compinit

autoload -U compinit
compinit

Jobs and Processes

Mastering Job Control

Letting Sleeping (or Background) Jobs Lie

By default, zsh kills any running jobs when you exit the shell. It does this by sending each job the SIGHUP signal. In contrast to this, bash leaves running jobs alone by default. Stopped jobs are treated differently: the operating system itself kills them. Actually, bash will kill any stopped jobs first (using the SIGTERM signal), but the key point is that they are always killed.

There are various ways in zsh of preventing the shell from killing jobs on exit:

Set the option nohup. Then the shell won’t kill running jobs when it exits. Stopped jobs will be killed by the operating system anyway, though the shell won’t do this deliberately. The shell will still warn you about stopped and background jobs, however. You can disable the warning, too, by unsetting the option check_jobs.
If you want some but not all commands to be left running when the shell exits, you can put nohup in front of any command line. This tells the command to ignore any SIGHUP sent to it, so that the command will be left running when the shell exits. Note that nohup doesn’t automatically run the command in the background so you still need to end the command line with &.
You can use the shell’s disown command to tell the shell to ignore the job completely. The shell won’t send SIGHUP to the job. This method has the side effect that no job control commands will work. Even jobs won’t show the command any more. If this isn’t a problem, you can start the job in the background with &! at the end of the line instead of &. This immediately disowns the job.

bash works a bit differently. If you set (with shopt -s) the huponexit option, bash will also send the SIGHUP signal to running jobs. Note, however, that the option only has an effect from an interactive login shell.

With huponexit set, nohup, which is a standard external command, is also useful in bash. Since nohup is not part of the shell, you can’t use it with commands that are built into the shell or with shell functions. It will work with scripts, however.

The disown command is also available in bash and works like it does in zsh, removing the job completely from the shell’s list of jobs. However, in bash you can also execute disown -h %num to tell the shell that you don’t want the job to be sent SIGHUP when the shell exits. In this case you can still do job control. The &! syntax doesn’t exist, unfortunately.

High-Power Command Substitutions

Command Substitution and Command Arguments

An expression in backquotes, `...`, is equivalent to the same expression inside $(...). The second form wasn’t understood by older shells.

strip $(file $(cat filenames) | grep 'not stripped' | cut -d: -f1)
strip `file \`cat filenames\` | grep 'not stripped' | cut -d: -f1`

Quoted Command Substitution

The output from command substitution doesn’t get re-evaluated. For example, if the output contains $ characters, they won’t be used to introduce substitutions, because the shell has already dealt with those. They’ll simply be inserted onto the command line as literal $ characters.

It’s not quite that simple, in fact. The real rule is that the shell continues with substitutions in the same order it always does them. In bash, that means that pattern characters that come from command substitution are active, because globbing occurs after command substitution.
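
As a small illustration of that rule in bash, the sketch below uses a scratch directory (the names are made up) and lets an inner command substitution produce a bare * character:

```shell
#!/usr/bin/env bash
# Scratch directory with two illustrative files.
demo_dir=$(mktemp -d)
touch "$demo_dir"/ham "$demo_dir"/msg

# Unquoted: the * produced by the inner command is globbed afterwards.
unquoted=$(cd "$demo_dir" && echo $(echo '*'))
# Quoted: double quotes stop globbing of the substituted text.
quoted=$(cd "$demo_dir" && echo "$(echo '*')")

echo "$unquoted"   # ham msg
echo "$quoted"     # *

rm -r "$demo_dir"
```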

If you try that in zsh, you’ll just see *. However, zsh does the substitutions in the same order as bash. The difference is that characters that come from substitutions are never special in zsh; they are treated as if they were quoted. You can switch to the bash behavior by setting the option glob_subst.

Process Substitution

Process substitution is a specialized form of command substitution. Like ordinary command substitution, special syntax is used to indicate a command to be executed. Unlike ordinary command substitution, what is substituted is not the output of the command.

<(commands): The commands are executed. The expression is replaced with the name of a file that can be read for the output of the commands.
>(commands): The commands are executed and the expression is replaced by a filename, as before. However, this time the file is for writing: anything sent to it is used as input for the commands. The output of commands is not captured in any way.
=(commands): This works just the same as the <(...) form. The only difference is that the filename is guaranteed to be a regular file. The other two forms use special files.

bash, zsh, and ksh93 all support the first two forms. The third form is a zsh extension that is not available in the other shells.

paste <(cut -d: -f1 /etc/passwd) <(cut -d: -f5 /etc/passwd)
# root root
make 2> >(grep Error >logfile.txt)

There is one important difference between ordinary pipes and the >(...) form. In the second case the shell won’t wait for the command to finish.
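
Returning to the <(...) form, a common use is feeding two command outputs to a program that insists on filenames. The following is a sketch only; common_lines is a hypothetical helper and the sample data is invented:

```shell
#!/usr/bin/env bash
# common_lines is a hypothetical helper: print the lines found in both files.
# comm needs sorted input, so each file is sorted on the fly.
common_lines() {
  comm -12 <(sort "$1") <(sort "$2")
}

list_a=$(mktemp)
list_b=$(mktemp)
printf '%s\n' pear apple fig > "$list_a"
printf '%s\n' fig kiwi apple > "$list_b"

common_lines "$list_a" "$list_b"   # prints apple, then fig
```

Without process substitution you would need two temporary files for the sorted copies and would have to remember to remove them afterwards.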

Finding the Full Path to a Command in zsh

zsh has a simple shorthand form for this: =command (no parentheses this time). The command must live somewhere in your PATH. The expansion doesn’t work for shell functions, aliases, or commands built into the shell.

If you find this feature annoying and don’t need the special behavior, set the option no_equals to turn it off.

Resource Limits

Unix-style systems allow limits on the resources used by a process or user. They are enforced by the operating system since every process, not just the shell, has limits. However, there are shell commands to alter the values of the limits. Like the current directory, these values are inherited by any processes started from the shell.

Both shells have the command ulimit for manipulating limits. zsh also has the command limit, which it inherited from the C shell.

Variables

Shell variables can be scalar or nonscalar. A scalar is a single unit of information, typically either a number or a string of one or more characters.

Arrays

Constructing an Array

There are two ways to assign values to an array: the older set -A form and the newer parenthesized form. The 1988 version of ksh only supports the old way. bash allows only the new way. zsh and ksh93 support both.

set -A arr one two three
arr=( one two three )

Inside an array assignment, you can use any shell substitution or expansion.

src_files=( $(find . -name '*.[ch]' -print) )

To see how many elements are in the arr array, we can do the following:

arr=(
  'first element'
  'second element'
  'third element'
)
echo ${#arr[@]}
# 3

Accessing Array Elements

Array variable expansions are written just like regular variable expansions except that they are followed by an index in brackets. The brackets are called the subscript operator and the index inside them is commonly referred to as the subscript.

echo ${arr[1]}

The first element of a zsh array is numbered 1 while bash starts from 0. If you want zsh to act like bash or ksh, you can turn the ksh_arrays option on.

Note that we place braces around the expansion to force the index to be considered a part of it. zsh actually doesn’t require braces around array indexes.

echo $arr[1]

In ksh, regular string variables are actually treated as arrays containing just one element. For this reason, if you leave the subscript out in an array expansion, just the first element of the array will be expanded.

bash behaves likewise. zsh will expand all the elements of the array instead.

It is also possible to use array indexes in assignments. This allows you to change just one element of an array at a time.

arr[1]=zwei

If the array doesn’t exist, it will be created. In zsh, you can even replace a range of values in an array:

arr[2,3]=( zwei drei )

What this actually does is first remove elements 2 and 3 and then insert the new elements in the array.

zsh’s array ranges tend to be most useful when accessing an array. You can use the value -1 to indicate the last element in the array.

echo $arr[-1]

If you want to retrieve the last element of an array in older versions of bash, you can’t use an index of -1. Instead, you can compute the index from the number of elements, which ${#arr[@]} gives you.

echo ${arr[${#arr[@]}-1]}

Array Attributes

The declare built-in is also referred to as typeset. In bash and zsh, it doesn’t matter which of the two names you use – they are both the same command. In ksh, you don’t have the luxury of being able to choose – only typeset is available.

The declare command allows you to specify the type of a variable or to specify one or more attributes for a variable. To specify that a variable is an array, you use the -a option.

declare -a arr

If arr already existed as another type of variable, declare -a converts it to an array.

var=string
declare -a var
declare -p var
# declare -a var='([0]="string")'
declare -a arr=( [0]=one [1]=two [2]=three )

Word Splitting

alias showargs="printf '>>%s<<\n'"
showargs one 'two three'
# >>one<<
# >>two three<<

# bash
var='one two'
showargs $var
# >>one<<
# >>two<<
showargs "$var"
# >>one two<<

# zsh
var='one two'
showargs $var
# >>one two<<
showargs $=var
# >>one<<
# >>two<<

In zsh, you can emulate the Bourne shell behavior by turning the sh_word_split option on.

Array Expansions

There are two methods for retrieving all the elements of an array: ${arr[@]} and ${arr[*]}. Inside double quotes, the first form results in each element of the array being a separate word, while the second form amounts to joining all the array elements together with a space between each element and treating the resulting string like a scalar variable expansion.

arr=( one 'two three' four )
showargs ${arr[*]}
# >>one<<
# >>two<<
# >>three<<
# >>four<<
showargs "${arr[*]}"
# >>one two three four<<
showargs "${arr[@]}"
# >>one<<
# >>two three<<
# >>four<<

Variable Attributes

There are a number of attributes you can set for variables to achieve useful effects. You set these attributes using options to the declare command.

declare -u -R 10 greeting='hello'
echo $greeting
#      HELLO
declare +u +R greeting
echo $greeting
# hello

The export command is the same as using declare with the -x option. There are quite a few commands such as export that are specialized forms of declare, each corresponding to a particular option. Another is readonly, which is equivalent to declare -r. That allows you to protect a variable from having its value changed.

If you want to be sure of removing all attributes from a variable, the best way is to unset it. There is an unset built-in that lets you do exactly this.

That is generally useful as a better alternative to assigning an empty value to a variable. It won’t work for a read-only variable, however: that is the one attribute you have to remove manually.

Numeric Variables and Arithmetic

In the past, to perform mathematical calculations from the command line, separate calculator programs such as bc were necessary. From shell scripts, the external expr command was typically used (and still is where portability is an issue). ksh88 added the let built-in command to do calculations directly in the shell.

let product='3*4'

Although you can use let, the following syntax is preferred because it avoids the need for quoting:

(( product=3*4 ))

In zsh, if the variable product didn’t exist before, it will be created with an integer type. While integer variables act in every way like string variables, they are more efficient because no conversion back and forth between ASCII and binary representations takes place.

You can also declare integer variables with the declare built-in or (in zsh) with integer:

declare -i product

If you don’t want to assign the result of a calculation to a variable but want it to appear in place like a variable expansion, you can do that by adding $ before the opening parenthesis:

echo $(( 3 * 4 ))

Array subscripts for example can be any mathematical expression.

# zsh (array indexes start at 1)
a=( one two three four )
let i=1
echo ${a[ 3 * i ]}
# three

Operators	Description
+ - ! ~ ++ --	Unary plus and minus, logical NOT, bitwise NOT, increment, decrement
<< >>	Bitwise shift left, right
&	Bitwise AND
^	Bitwise XOR
|	Bitwise OR
**	Exponentiation
* / %	Multiplication, division, remainder
+ -	Addition, subtraction
< > <= >=	Comparison: less, greater, less or equal, greater or equal
== !=	Comparison: equal, unequal
&&	Logical AND
|| ^^	Logical OR, XOR
x?y:z	If x then y else z
= += -= *= /= %= >>= <<= &= ^= |=	Assignment
, (comma)	Sequence separator

Number Bases

echo $(( 0xff ))
# 255
echo $(( 0377 ))
# 255
echo $(( 12#193 ))
# 255

The standard C convention for octal is disabled by default in zsh, though. This is because it is inconvenient when parsing strings with initial zeros as is common for time strings. You can enable this feature by turning on the octal_zeroes option.

zsh goes a step further and allows you to output numbers in a different base.

echo $(( [#16] 255 ))
# 16#FF
declare -i 16 i=255
echo $i
# 16#FF

Floating-Point Numbers

echo $(( 1 / 3 ))
# 0
echo $(( 1. / 3 ))
# 0.33333333333333331

Floating-point variables are defined with an option to declare or with a variant of it named float. There are actually two such options to declare: -F and -E. The difference between them relates to the output format: -F produces fixed-point output, while -E produces scientific notation.

declare -F f='1.0/3'
declare -E e='1.0/3'
echo $f
# 0.3333333333
echo $e
# 3.333333333e-01

zsh also has a number of more complex mathematical functions such as the common trigonometry functions. They are in a separate loadable module, which you need to first load. Modules are loaded with the zmodload command:

zmodload zsh/mathfunc
(( pi = 4 * atan(1) ))

Complex Variable Expansions

Alternative and Default Values

To specify a default value in an expansion, you can use ${variable:-default}.

A similar substitution performs the inverse: the word on the right is substituted only when the variable is set and nonempty. This takes the form ${variable:+alternative}.
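
The two forms combine naturally. In this sketch, greet is a hypothetical function that falls back to a default name and adds a suffix only when a name was actually supplied:

```shell
#!/usr/bin/env bash
# greet is a hypothetical helper used only for illustration.
greet() {
  local name=$1
  # :- supplies a default; :+ adds text only if name is set and nonempty.
  echo "Hello, ${name:-stranger}${name:+, welcome back}"
}

greet          # Hello, stranger
greet Alison   # Hello, Alison, welcome back
```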

Patterns

Let’s look first at the operators ${variable#pattern}, ${variable##pattern}, ${variable%pattern}, and ${variable%%pattern}. These operators expand variable, removing text that matches pattern. The operators # and ## remove text from the left side (beginning) of the value; % and %% remove text from the right side (end). The single-character operators # and % remove the least text possible; the double-character operators ## and %% remove the most text possible.
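
A worked example may make the four operators easier to tell apart. The path below is invented for the illustration; the expansions themselves work the same way in bash and zsh:

```shell
# An illustrative path; nothing needs to exist on disk.
src=/usr/local/src/bigproj/archive.tar.gz

echo "${src##*/}"    # archive.tar.gz           (longest match of */ removed from the left)
echo "${src%/*}"     # /usr/local/src/bigproj   (shortest match of /* removed from the right)

file=${src##*/}
echo "${file%.*}"    # archive.tar   (shortest .* from the right)
echo "${file%%.*}"   # archive       (longest .* from the right)
echo "${file#*.}"    # tar.gz        (shortest *. from the left)
echo "${file##*.}"   # gz            (longest *. from the left)
```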

Modern shells also have a third form that is not anchored. This form is more similar to sed’s s command. It looks like ${variable/pattern/string}.

There is another similar form that looks like ${variable//pattern/string}. This doesn’t enable greedy matching as you might expect; greedy matching is actually the default for both ${variable/pattern/string} and ${variable//pattern/string} substitutions. Instead, this form causes all occurrences of the pattern to be replaced. So in bash you might use ${PATH//:/ } to return the directories of your path split into separate words. Remember that in zsh you need to add an equals to enable word splitting (${=PATH//:/ }) – or you can just use $path.
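
The difference between the two forms, and the greedy matching they share, can be seen with a couple of sample strings (the values are made up for the example):

```shell
# A colon-separated list standing in for PATH.
dirs='/usr/bin:/bin:/usr/local/bin'
echo "${dirs/:/ }"    # /usr/bin /bin:/usr/local/bin   (first colon only)
echo "${dirs//:/ }"   # /usr/bin /bin /usr/local/bin   (every colon)

word=banana
echo "${word/a*/X}"   # bX       (a* is greedy: it swallows "anana")
echo "${word//a/X}"   # bXnXnX   (each a replaced separately)
```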

Substrings

# zsh
a='123456789'
echo $a[3,5]
# 345

# bash
a='123456789'
echo ${a:2:3}
# 345

Expansion Flags

zsh offers various flags that can be specified inside variable expansions to enable optional behavior with respect to the expansion.

Flag	Description
L	Lowercase
U	Uppercase
C	Capitalize initial letters of words
a	Array index order (useful with O)
i	Case-independent ordering
o	Ascending order
O	Descending order
f	Split at lines
s	Split at a specified character
z	Split into words, taking account of any shell quoting
F	Join words using newlines
j	Join words using a specified character
c	Count characters
w	Count words
W	Count words, including empty ones
Q	Remove one level of quoting
q	Quote the resulting words
V	Make special characters visible
l	Pad words on the left
r	Pad words on the right
t	Return the type of a variable
k	Return associative array keys
v	Return associative array values
%	Perform prompt expansion on the value
e	Perform shell expansions on the value
P	Reinterpret value as a further variable name

Converting Strings to Upper or Lower Case

for f in *; do
  mv $f ${(L)f}
done

Matching Patterns Against Arrays

files=( /lib/lib* )
echo ${files[@]##*/}

Associative Arrays

To create an associative array, you need to invoke declare with the -A option. Assignments then look like regular array assignments. The values in the assignment are expected to alternate between keys and values, so there must be an even number of them.

declare -A people
people=(pres Pamela vicepres Victor secr Sam)
people[treas]=Tammy
unset 'people[pres]'
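
In bash, associative arrays need version 4 or later, and a common assignment style there puts each key in brackets. The following sketch reuses the names above and assumes such a bash:

```shell
#!/usr/bin/env bash
# Requires bash 4 or later for declare -A.
declare -A people=( [pres]=Pamela [vicepres]=Victor [secr]=Sam )
people[treas]=Tammy      # add one entry
unset 'people[pres]'     # remove one entry

count=${#people[@]}
treasurer=${people[treas]}
echo "$count entries; treasurer is $treasurer"   # 3 entries; treasurer is Tammy

# "${!people[@]}" expands to the keys; their order is not guaranteed.
for role in "${!people[@]}"; do
  printf '%s=%s\n' "$role" "${people[$role]}"
done
```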

Scripting and Functions

Programming with the Shell

Control Flow

if grep -q word file; then
  echo file contained word
fi

Condition Tests

There is one command that is used more often than any other in conditions: that is the test command and its synonym [.

if test ${file##*.} = txt; then
  echo file has .txt extension
fi

Note that because test is like any other command, it can’t tell the difference between an expanded variable and a literal string. So if your variable happens to be empty or its value looks like an option, you may run into problems.

value=''
[ $value = val ]
# bash: [: =: unary operator expected

Using double quotes around variable expansions solves this dilemma. Quoting doesn’t help where you have a value that might look like an option to test. For this reason you may see a condition where an extra character has been added on both sides of the comparison operator.

if [ X$1 = X-z ]; then

Due to issues such as this, there is a newer way of doing conditions. These are instead delimited by double brackets ([[ ... ]]). They are given special handling by the shell so that they work better.

Operator	Purpose
-b file	Tests if file is a block special file.
-c file	Tests if file is a character special file.
-d file	Tests if file exists and is a directory.
-e file	Tests if file exists.
-f file	Tests if file exists and is an ordinary file.
-g file	Tests if file exists and has its setgid bit set.
-k file	Tests if file exists and has its sticky bit set.
-n string	Tests if string is nonempty.
-o option	Tests if option is turned on.
-p file	Tests if file exists and is a named pipe (fifo).
-r file	Tests if file exists and is readable.
-s file	Tests if file exists and has a size greater than zero.
-t file descriptor	Tests if file descriptor is open and associated with a terminal device.
-u file	Tests if file exists and has its setuid bit set.
-w file	Tests if file exists and is writable.
-x file	Tests if file exists and is executable.
-z string	Tests if string is empty (length zero).
-G file	Tests if file exists and is owned by the current group.
-L file	Tests if file exists and is a symbolic link.
-O file	Tests if file exists and is owned by the current user.
-S file	Tests if file exists and is a socket.
file1 -ef file2	Tests if the two filenames refer to the same file.
file1 -nt file2	Tests if file1 is newer than file2.
file1 -ot file2	Tests if file1 is older than file2.
string == pattern	Tests if the string matches the pattern.
string != pattern	Tests if the string doesn’t match the pattern.
string1 > string2	Tests if string1 sorts after string2, comparing by ASCII values.
string1 < string2	Tests if string1 sorts before string2, comparing by ASCII values.
string =~ regex	Tests if the string matches the regular expression (bash 3 and later).

The == operator (which can also be written as just =) is worth a special mention because it is one of the most useful.

[[ $PWD = $HOME/* ]]

Note that it isn’t necessary to quote the pattern to protect it from filename generation. [[ ... ]] style conditions are handled specially by the shell. Often, you will only want to compare against a literal string instead of a pattern. To do this, just quote any characters that have special meanings in patterns.
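
The effect of quoting the right-hand side can be sketched with two hypothetical helpers. This shows bash behavior, where an unquoted variable on the right is used as a pattern; the function names are invented for the example:

```shell
#!/usr/bin/env bash
# Hypothetical helpers contrasting an unquoted and a quoted right-hand side.
matches_pattern() { [[ $1 == $2 ]]; }     # $2 is used as a pattern
matches_literal() { [[ $1 == "$2" ]]; }   # $2 is compared character for character

matches_pattern msg.txt '*.txt' && echo "pattern match"      # pattern match
matches_literal msg.txt '*.txt' || echo "no literal match"   # no literal match
matches_literal '*.txt' '*.txt' && echo "literal match"      # literal match
```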

Note that the notions of true and false for math evaluation are similar to those used by the C programming language. This is the reverse of how the exit status of a Unix command is interpreted. This means that if the number resulting from the math evaluation is nonzero, the return status will be zero.

if (( ! ${#array[@]} )); then
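
To make the inversion concrete, the sketch below uses a hypothetical helper, math_status, that evaluates an expression with (( ... )) and prints the resulting return status:

```shell
#!/usr/bin/env bash
# math_status is a hypothetical helper: print the return status of (( expr )).
math_status() {
  local status=0
  (( $1 )) || status=$?
  echo "$status"
}

math_status 7         # 0  (nonzero result counts as true, so the status is 0)
math_status 0         # 1  (zero result counts as false, so the status is 1)
math_status '3 > 5'   # 1  (the comparison evaluates to 0)
```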

Control Operators

grep -q word file && echo file contained word

grep -q word file || echo "file didn't contain word"

if ! grep -q word file ; then
  echo "file didn't contain word"
fi

Case Statement

case $TERM in
  (aixterm|iris-ansi*)
    bindkey '\e[139q' overwrite-mode
  ;;
  (xterm|dtterm)
    bindkey '\e[2~' overwrite-mode
  ;;
esac

Each pattern is tried in turn until one matches. If you want a catch-all condition at the end, use the pattern *. The double semicolon is used to terminate the commands for each branch of the case statement. If you want a particular case to fall through and also run the commands for the next case, ksh and zsh allow you to use ;& instead.

More Looping

Like most programming languages, the shell offers a while loop. The condition is similar in structure to that used by the if statement. It is evaluated first and before each subsequent iteration of the loop. Looping only continues when the condition evaluates to true. There is a variant of this: the until loop, which is identical except that looping continues for as long as the condition is false.
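
As a brief sketch of the two loops, here are a pair of hypothetical counting functions, one written with while and one with until:

```shell
#!/usr/bin/env bash
# count_down and count_up are hypothetical helpers used only for illustration.
count_down() {
  local n=$1
  while (( n > 0 )); do      # keep looping while the condition is true
    echo "$n"
    n=$(( n - 1 ))
  done
}

count_up() {
  local i=1
  until (( i > $1 )); do     # keep looping until the condition becomes true
    echo "$i"
    i=$(( i + 1 ))
  done
}

count_down 3   # prints 3, 2, 1 on separate lines
count_up 3     # prints 1, 2, 3 on separate lines
```

Since the condition is checked before the first iteration, count_down 0 prints nothing at all.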

Modern shells offer another type of for loop.

for ((i=1;i<5;i++)); do
  a[$i]=$i
done

zsh offers another loop construct, inherited from C shell. It is known as the repeat loop:

repeat 5; p=${p#*/}

Note that we have omitted the do and done keywords in this loop. If you have the short_loops option turned on, zsh allows you to do this with for and repeat loops if they contain only one command.

The break and continue Statements

The break statement is used to exit immediately from a loop. It is often useful if an error occurs.

If you want to break out of an outer loop, you can pass a number to break specifying how many levels you want to break out of.

The continue statement causes control to advance to the next iteration of the loop skipping any following commands. Like break, continue can be passed a numeric argument to allow execution to skip to the next iteration of an outer loop.
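
The numeric argument is easiest to see with nested loops. In this sketch, find_pair is a hypothetical function that searches for two factors and uses break 2 to leave both loops as soon as it finds a pair:

```shell
#!/usr/bin/env bash
# find_pair is a hypothetical helper: print the first i x j (1..3) equal to $1.
find_pair() {
  local target=$1 i j
  for i in 1 2 3; do
    for j in 1 2 3; do
      if (( i * j == target )); then
        echo "$i x $j"
        break 2        # leave the inner and the outer loop
      fi
    done
  done
}

find_pair 6   # 2 x 3
find_pair 4   # 2 x 2
```

With a plain break, the outer loop would carry on and find_pair 6 would also print 3 x 2.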

Parsed Comments

There is another command functionally equivalent to the true command mentioned in the last section; it consists of just a single colon. Like true, the command itself doesn’t perform any useful action but the shell parses and evaluates its arguments.

When we place a colon before the command name all the lines are, in effect, commented out. Be careful if you have any command substitutions, though: command substitutions are still evaluated and may have unwanted side effects. Note that you can’t use a colon to comment out multiline structures that consist of several commands such as a while loop or pipeline: it applies to one command only. You can use a colon to comment out a single command in a series, however.

grep -q word file && grep -q word otherfile && echo files both contain word
grep -q word file && : grep -q word otherfile && echo files both contain word

if true; then
  :
fi

Grouping and Subshells

Sometimes, it can be useful to group several commands together.

{
  ufsdump 0f /dev/rmt/0n /dev/dsk/c0t1d0s0
  ufsdump 0f /dev/rmt/0n /dev/dsk/c0t1d0s1
} > logfile

You may also have seen something similar to this that uses parentheses instead of braces. This introduces a subshell: the shell forks creating a copy of itself as a separate process. A subshell inherits just about everything from its parent but inside it, changes to things like the current directory, traps, or any variables are lost.

tar cvf - . | ( cd /somewhere/else; tar xvf - )
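
The difference between braces and parentheses shows up as soon as a variable is changed inside the group. A minimal sketch, with made-up variable names:

```shell
counter=1

{ counter=2; }              # braces: runs in the current shell
after_group=$counter

( counter=3; cd / )         # parentheses: runs in a forked copy of the shell
after_subshell=$counter

echo "$after_group $after_subshell"   # 2 2
```

The assignment and the cd inside the parentheses vanish when the subshell ends, which is exactly why the tar pipeline above can change directory without affecting the rest of the script.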

Functions and Variable Scope

findpgm() {
  grep "$1" $HOME/proglist | column
}

function findpgm {
  grep "$1" $HOME/proglist | column
}

Only the former syntax is defined in the POSIX specification, but either will work in modern shells.

One thing to be aware of if you write ksh scripts is that in ksh the two syntaxes have different semantics. With the latter syntax, variables and traps are local to the function, making the functions more script-like.

In both bash and zsh, all functions follow the simpler semantics. You have to state explicitly if you want local variables or traps. Note that the positional parameters (such as $1) are always local, though. Local variables are declared using the local built-in, which is yet another variant of declare.

Local variables in both bash and zsh have what is called dynamic scope.

Note that there is no such thing as a local function. If you declare a function inside another function, it will be available globally.
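
Dynamic scope means that a function sees the local variables of whichever function called it. The sketch below, using hypothetical function names, shows the same function printing different values depending on its caller:

```shell
#!/usr/bin/env bash
# show_level and outer are hypothetical helpers used only for illustration.
show_level() {
  echo "level=$level"
}

outer() {
  local level=inner   # local to outer, yet visible to functions it calls
  show_level
}

level=global
outer        # level=inner
show_level   # level=global
```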

Porting Scripts

As you may have noticed, zsh offers a number of options that make it behave in ways more similar to other shells. bash doesn’t have any equivalent options, but like many GNU programs, it does look at the POSIXLY_CORRECT environment variable. If it is set, it will alter a few minor things to achieve better compliance with the POSIX specification. You can also use set -o posix or invoke bash with the --posix option to achieve the same effect.

Input and Output

Exit Statuses

Normally when a function or shell script finishes, it passes on the exit status of the last command executed. It is important to be aware of this because it can cause your shell script to return a nonzero status even when it has done its job successfully, for example when the last command is a test that happens to fail. To avoid this, you need to explicitly specify your exit status. From a shell script you do this with the exit command.

The exit command terminates the currently running shell process. Functions don’t run in a separate process so if you try this from a function, it will cause your shell to exit. Functions, therefore, have a separate command: return. For this reason, the status after leaving a function is often referred to as the return status.
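As a rough illustration of the pitfall, consider two functions that differ only in whether they end with an explicit return. The function names and the path tested are hypothetical; the path is assumed not to exist.

```shell
# Both functions and the path are made up for this sketch.
report_implicit() {
  [ -e /no/such/path/for/this/demo ] && echo "found it"
  # No return here: the function's status is that of the failed test.
}

report_explicit() {
  [ -e /no/such/path/for/this/demo ] && echo "found it"
  return 0              # state the status explicitly
}

implicit=0
report_implicit || implicit=$?
explicit=0
report_explicit || explicit=$?
echo "implicit: $implicit, explicit: $explicit"
```

Neither function has actually failed at anything, yet only the second reports success to its caller.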

Positional Parameters

The $1 parameter contains the first argument passed to our function or script. There is a whole collection of these: $2, $3, and so on up to $9, each containing the next parameter. After 9 you need to use braces: ${10}. The $# variable contains the number of parameters that have been supplied. You can also access all the positional parameters together with the $* string or $@ array.

The difference between $* and $@ is exactly the same as for the ${arr[*]} and ${arr[@]} array expansions.
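The distinction only shows up inside double quotes, which a short sketch can make concrete. The helper names count_args and show are invented for the example.

```shell
# count_args and show are hypothetical helpers.
count_args() {
  echo $#               # how many arguments were received
}

show() {
  count_args "$*"       # all parameters joined into a single word
  count_args "$@"       # each parameter kept as a separate word
}

show "two words" three  # prints 1, then 2
```

Because "$@" preserves each parameter exactly, including any spaces it contains, it is usually the form you want when passing arguments on to another command.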

The $0 parameter contains the name of the script itself as it was typed on the command line.

In bash, you have to use the FUNCNAME special variable instead to get the name of the function. Note that as of version 3, FUNCNAME is an array containing the names of all functions in the current call stack. This means that the first element contains the name of the currently executing function, the second element is the name of the function that called that, and so on. zsh has a similar array named funcstack provided by the zsh/parameter module.
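A brief sketch of the call stack in bash, assuming version 3 or later. The function names are hypothetical.

```shell
# where_am_i and caller_func are made-up names; FUNCNAME is bash-specific.
where_am_i() {
  # Element 0 is the running function, element 1 is whoever called it.
  echo "${FUNCNAME[0]} called by ${FUNCNAME[1]}"
}

caller_func() {
  where_am_i
}

caller_func
```

This can be handy in error messages, where naming the function that failed saves some hunting around.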

One more feature provided by zsh is that the positional parameters can also be accessed via argv and ARGC special variables. These come from C shell, but zsh is really too unlike C shell for them to ever be useful in running C shell scripts. argv is an array corresponding to $@ while ARGC is a scalar and corresponds to $#.

Option Parsing

In addition to this, there is a shell built-in named getopts.

To use getopts, you need to give it a specification describing the expected options. This is just a list of option letters. Any option that expects an argument should be followed by a colon. Because each option is a single letter, getopts is of little use if you want to have long option names.

while getopts "abc:" par; do
  case $par in
    (a) aopt=1;;
    (b) bopt=1;;
    (c) carg=$OPTARG;;
    (?) exit 1;;
  esac
done
shift $(( OPTIND - 1 ))

Positioned as the condition of the while loop, getopts controls the loop, causing it to stop when it reaches a nonoption argument or the argument --. It uses three variables to provide the status. The first contains the current option with the leading dash removed. This is the one named with an argument to getopts – in this case par. The other two variables have fixed names: OPTARG provides any argument to an option such as for the -c option here and OPTIND is an index into the positional parameters and serves as the loop iterator. If passed a numeric parameter, shift will shift the parameters by that many places.
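To see the loop in action, it can be wrapped in a function and called with a sample command line. This is only a sketch for bash: the function name parse_demo is invented, and declaring OPTIND local is an assumption made so that the function can be called more than once.

```shell
# parse_demo is a hypothetical wrapper around the loop shown above.
parse_demo() {
  local OPTIND=1 par aopt=0 bopt=0 carg=
  while getopts "abc:" par; do
    case $par in
      (a) aopt=1;;
      (b) bopt=1;;
      (c) carg=$OPTARG;;
      (?) return 1;;    # getopts has already printed an error message
    esac
  done
  shift $(( OPTIND - 1 ))   # discard the options that were parsed
  echo "a=$aopt b=$bopt c=$carg rest=$*"
}

parse_demo -a -c value file1 file2
```

After the shift, only the nonoption arguments remain in the positional parameters, so the rest of the function can treat $1 as the first file name.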

zsh has one further built-in – named zparseopts – for parsing options.

Reading Input

input="$(cat config.ini)"
input="$(<config.ini)"

while read; do
  echo $REPLY
done < config.ini

ps | while read; do
  echo $REPLY
done

The shell’s word splitting isn’t confined to splitting up words at space and tab characters. It actually uses the characters listed in the IFS (internal field separator) variable as separators when splitting things into words.

while IFS=: read user pw uid gid name home shell; do
  echo $user $name
done </etc/passwd
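The same technique can be tried without touching the real password file. In this sketch the function name list_shells and the sample record are invented; the record simply follows the usual seven-field layout.

```shell
# list_shells and the sample record are hypothetical.
list_shells() {
  local user pw uid gid name home shell
  # IFS is changed only for the duration of each read command.
  while IFS=: read -r user pw uid gid name home shell; do
    echo "$user uses $shell"
  done
}

sample='alison:x:1001:100:Alison Example:/u/alison:/bin/zsh'
echo "$sample" | list_shells
```

Placing the IFS assignment directly in front of read means the rest of the script continues to split words on whitespace as usual.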

Backreferences

zsh has another, very powerful, way of doing this: backreferences. They are enabled in a pattern with the (#b) globbing flag. When enabled, parentheses in the pattern are active and can be referred back to by using shell variables. Each parenthesis matched where backreferences are turned on causes the shell to set an entry in the array match to the substring matched, and entries in the arrays mbegin and mend to the position of the substring inside the full string. Up to a maximum of nine parentheses are matched.

declare -A results
while read line; do
  case $line in
    (|\#*) ;;
    ((#b)([^=]##)=(*))
      results[$match[1]]="$match[2]"
    ;;
    *)
      echo "Syntax error" >&2
      exit 1
    ;;
  esac
done < config.ini

If you use backreferences in a function, remember the feature sets three arrays. It’s usually sensible to make all three invisible to anything outside the function by including the statement at the top of the function:

local match mbegin mend

bash 3 adds support for a form of backreferences but using regular expressions instead of shell patterns. An additional =~ operator is available in condition tests allowing a value to be matched against a regular expression. After matching, the BASH_REMATCH array contains any matching portions of the value.

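re='^([^=]+)=(.*)$'
[[ $line =~ $re ]]

Note that the regular expression is stored in a variable and expanded without quotes. In bash 3.0 and 3.1 the expression could be quoted in place, but from bash 3.2 onward quoting any part of it causes that part to be matched as a literal string.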
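Putting the pieces together, a line of the form key=value can be taken apart as follows. This is a sketch assuming bash 3.2 or later, where the regular expression is kept in a variable so that it is not treated as a literal string; the function name parse_setting is made up.

```shell
# parse_setting is a hypothetical helper; BASH_REMATCH is bash-specific.
parse_setting() {
  local re='^([^=]+)=(.*)$'
  if [[ $1 =~ $re ]]; then
    # Element 0 holds the whole match; 1 and 2 hold the parenthesized parts.
    echo "key=${BASH_REMATCH[1]} value=${BASH_REMATCH[2]}"
  else
    return 1
  fi
}

parse_setting 'editor=vi'
```

Lines without an equals sign make the function return 1, which a caller could use to report a syntax error in the same way as the zsh loop above.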

Asking the User for Input

The -r option to read stands for raw. It prevents backslash escapes being interpreted in the input. Because displaying a prompt is very common, you can specify it directly from read. In ksh or zsh, this would be:

read -r 'address?Enter e-mail address: '

For bash, the prompt is specified with a -p option:

read -r -p 'Enter e-mail address: ' address

If you use the -e option, bash will use readline. In zsh, the read command doesn’t have an equivalent option; -e does something entirely different. Instead, you can use the vared built-in.

Propagating Functions

Exporting Functions

bash allows functions to be exported. This means that function definitions are kept in memory so when a new shell starts, they can be loaded quickly. The declare command applies to functions if passed the -f option so that can be combined with -x for exporting.

foo() {
  echo in foo
}
declare -fx foo
bash
foo
# in foo
zsh
foo
# command not found: foo
echo $foo
# () { echo in foo
# }

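What bash is actually doing when running a command is treating the function as if it were a normal environment variable whose value begins with (). When the second instance of bash executes, it sees this environment variable and handles it specially, creating a function. Since the security fixes made to bash in 2014, the definition is passed under a specially formed name (BASH_FUNC_foo%% for a function named foo) rather than under the plain name, so with current versions the echo $foo shown above no longer displays the function.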

Autoloadable Functions

The function autoloading mechanism of ksh and zsh takes the idea of loading commands when they are executed and applies it to functions. The functions are found by searching directories listed in $FPATH for a file that has the same name as the function.

In a similar fashion to the path array dual of PATH, zsh has an fpath array that is somewhat easier to manipulate than the FPATH string.

fpath=( ~/.zfunc $fpath )
autoload newfunc

Traps and Special Functions

Traps let you specify code to be run whenever a signal is sent to the current process. Special functions are invoked directly by the shell when certain events occur, such as a new prompt being printed or the current directory changing.

Trapping Signals

trap 'rm /tmp/temp-$$; exit' INT TERM

Note how we include an exit command to ensure the script still exits after clearing up the temporary file. If you want to ignore a signal, disabling any default behavior, you can specify the empty string as the command argument to trap.

trap '' HUP

If you subsequently want to restore the default handling of a signal, you need to remove the trap. This is done by passing a dash (-) as the command argument to trap.

trap - HUP
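The effect of handling and ignoring a signal can be observed safely by running each case in a child shell that sends the signal to itself. This is a sketch for bash; the variable names handled and ignored are invented, and TERM is used in place of HUP purely for the demonstration.

```shell
# Each case runs in a child bash so the signal never reaches the calling shell.

# A handler: the trap runs once the kill command finishes, and then exits.
handled=$(bash -c 'trap "echo cleaning up; exit 0" TERM; kill -TERM $$; echo "not reached"')

# An empty command string: the signal is ignored and the script carries on.
ignored=$(bash -c 'trap "" TERM; kill -TERM $$; echo "still running"')

echo "$handled / $ignored"
```

Without the exit in the first handler, the child shell would run the trap and then continue with the next command, which is rarely what you want after an interrupt.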

Two signals, USR1 and USR2, are made available for user-defined purposes. If you want to use a signal for some custom purpose, you can use one of these. It isn’t possible to pass any information with a signal, though, so they are only useful for synchronization, perhaps to tell another process when a file is ready for it to read.

The shell also makes available a set of fake signals. These are signals that aren’t known to the operating system and can’t be sent with the kill command but that the shell allows you to trap. They allow you to intercept certain operations of the shell.

Similar to the EXIT trap, bash version 3 has a RETURN fake trap that is triggered whenever a function or sourced script returns. In zsh, if you define the EXIT trap within a function, it will be triggered when the function exits. zsh will also restore any previous EXIT trap when the function exits. In general, traps are not restored when a function returns. zsh allows all traps to be made local to a function by turning the local_traps option on.

Two further fake signals are DEBUG and ERR, which are triggered after every command and after commands returning a nonzero exit status, respectively. In zsh, the ERR trap is instead named ZERR because some operating systems already have a signal named ERR.
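A small sketch shows the ERR trap at work in bash. The script is fed to a child shell so that the trap does not linger in the current one; the variable name err_log and the message text are made up.

```shell
# err_log is a hypothetical variable collecting the child's output.
err_log=$(bash <<'EOF'
trap 'echo "command failed with status $?"' ERR
false        # triggers the trap with status 1
true         # succeeds, so the trap stays silent
( exit 3 )   # triggers the trap with status 3
true
EOF
)
echo "$err_log"
```

In zsh the same idea applies, but the trap would be set on ZERR instead.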

Replacing Built-in Commands

In bash and zsh, functions come before built-ins, so we just need to create a function with the same name.
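As a sketch of the idea, here is a replacement for cd that reports the new directory. The message text is invented; the builtin command, available in both bash and zsh, is what stops the function from calling itself.

```shell
# A hypothetical wrapper: same name as the built-in, so it takes precedence.
cd() {
  # "builtin" bypasses the function and runs the real cd.
  builtin cd "$@" && echo "now in $PWD"
}

( cd / )    # run in a subshell so the demonstration leaves our directory alone
```

The wrapper can be removed again with unset -f cd, after which the built-in is used directly.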

Note that ksh resolves built-ins before functions. Aliases are resolved before built-ins, though, so you can use an alias to call your function.

Debugging Scripts

Trace Information

With the xtrace option on, the shell shows each command line as it is executed.

One thing you may be wondering about is our use of set -x to turn the option on here. You can use set -o xtrace or, in zsh, setopt xtrace if you prefer.

In zsh, you can also enable tracing for a function by setting the trace attribute for it:

functions -t func

There are two other similar options that can be useful in debugging:

noexec (or -n) – This prevents commands from being run and just checks the syntax.
verbose (or -v) – This causes each command line to be printed exactly as it appears in the script before any expansion.
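These options can also be given on the command line when starting a shell, which makes them easy to try out. The following sketch uses bash; the variable names trace and unrun are invented, and PS4 is set explicitly only so that the trace prefix is predictable.

```shell
# trace and unrun are hypothetical variable names.

# xtrace: each command is written to standard error after expansion.
trace=$(PS4='+ ' bash -x -c 'name=world; echo "hello $name"' 2>&1)
echo "$trace"

# noexec: the commands are parsed but never run, so nothing is printed.
unrun=$(bash -n -c 'echo "never printed"')
echo "noexec produced: '$unrun'"
```

Running a script with noexec before the real thing is a cheap way to catch a missing fi or done.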

The bash Debugger

There are a few such debuggers available. A ksh debugger has been around for some time. bash users will find the bashdb script.
