Bring the power of the Linux command line into your application development process. As a novice software developer, the one thing I look for when choosing a programming language is this: is there a library that allows me to interface with the system to accomplish a task? If Python didn't have Flask, I might choose a different language to write a web application. For this same reason, I've begun to develop many, admittedly small, applications with Bash. Although Python, for example, has many modules to import and extend functionality, Bash has access to thousands of commands that perform a variety of tasks, including string manipulation, mathematical computation, encryption and database operations. In this article, I take a look at these capabilities and how to use them easily within a Bash application.
Reusable Code Snippets
Bash provides three features that I've found particularly useful when creating reusable code: aliases, functions and command substitution. An alias is a command-line shortcut for a long command. Here's an example:
alias getloadavg='cat /proc/loadavg'
The alias for this example is getloadavg. Once defined, it can be executed as any other Linux command. In this instance, getloadavg will dump the contents of the /proc/loadavg file. Something to keep in mind is that this is a static command alias. No matter how many times it is executed, it always will dump the contents of the same file. If there is a need to vary the way a command is executed (by passing arguments, for instance), you can create a function. A function in Bash works the same way as a function in any other language: arguments are evaluated, and commands within the function are executed. Here's an example function:
getfilecontent() {
    # Quoting "$1" keeps filenames that contain spaces intact
    if [ -f "$1" ]; then
        cat "$1"
    else
        echo "usage: getfilecontent <filename>"
    fi
}
This function declaration defines the function name as getfilecontent. The if/else statement checks whether the file specified as the first function argument ($1) exists. If it does, the contents of the file are output. If not, usage text is displayed. Because of the incorporation of the argument, the output of this function will vary based on the argument provided.
The final feature I want to cover is command substitution. This is a mechanism for capturing the output of a command so it can be reused elsewhere. This example assigns the output to a variable:
LOADAVG="$(cat /proc/loadavg)"
The syntax for command substitution is $(command), where "command" is the command to be executed. In this example, the LOADAVG variable will have the contents of the /proc/loadavg file stored in it. At this point, the variable can be evaluated, manipulated or simply echoed to the console.
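Command substitution also can be used inline, without an intermediate variable. The following is a minimal sketch; word_count is a hypothetical helper name I made up for illustration, not something defined elsewhere in this article:

```shell
# word_count is a hypothetical helper used only for illustration.
word_count() {
    # The output of the inner pipeline is substituted directly into the string.
    # tr strips the padding that some wc implementations add.
    echo "Words: $(printf '%s' "$1" | wc -w | tr -d ' ')"
}
```

Calling word_count "my example string" should report three words.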
Text Manipulation
If there is one feature that sets scripting on UNIX apart from other environments, it is the robust ability to process text. Although many text processing mechanisms are available when scripting in Linux, here I'm looking at grep, awk, sed and variable-based operations. The grep command allows for searching through text whether in a file or piped from another command. Here's a grep example:
alias searchdate='grep "[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]"'
The alias created here will search through data for a date in the YYYY-MM-DD format. As with the grep command itself, text can be provided either as piped data or as a file path following the command. As the example shows, search syntax for the grep command includes the use of regular expressions (or regex).
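If only the matching dates are wanted rather than the whole line, a function along these lines may help. This is a sketch, assuming a grep that supports the -E (extended regex) and -o (print only the match) options, as GNU and BSD grep do; extractdates is a hypothetical name:

```shell
# extractdates is a hypothetical helper. It prints each YYYY-MM-DD match on
# its own line. With no file arguments, grep reads from piped input.
extractdates() {
    grep -Eo '[0-9]{4}-[0-9]{2}-[0-9]{2}' "$@"
}
```

For example, piping "released 2013-04-01 and patched 2013-05-15" into extractdates should print the two dates on separate lines.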
When processing lines of text for the purpose of pulling out delimited fields, awk is the easiest tool for the job. You can use awk to create verbose output of the /proc/loadavg file:
awk '{ printf("1-minute: %s\n5-minute: %s\n15-minute: %s\n",$1,$2,$3); }' /proc/loadavg
For the purpose of this example, let's examine the structure of the /proc/loadavg file. It is a single-line file, and there are typically five space-delimited fields, although this example uses only the first three fields. Much like Bash function arguments, fields in awk are referenced as variables named by their position in the line ($1 is the first field and so on). In this example, the first three fields are referenced as arguments to the printf statement. The printf statement will display three lines, and each line will contain a description of the data and the data itself. Note that each %s is substituted with the corresponding parameter to the printf function.
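Fields do not have to be separated by spaces. As a sketch, the -F option tells awk which delimiter to use; the userinfo name and the passwd-style sample line below are my own assumptions for illustration:

```shell
# userinfo is a hypothetical helper. It expects colon-delimited,
# passwd-style lines on standard input and prints the first and seventh fields.
userinfo() {
    awk -F: '{ printf("user: %s\nshell: %s\n", $1, $7); }'
}
```

Piping the line root:x:0:0:root:/root:/bin/bash into userinfo should print the user name and the login shell on separate lines.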
Among all of the commands available for text processing on Linux, sed may be considered the Swiss army knife. Like grep, sed uses regex. The specific operation I'm looking at here involves regex substitution. For an accurate comparison, let's re-create the previous awk example using sed:
sed 's/^\([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\).*$/1-minute: \1\n5-minute: \2\n15-minute: \3/g' /proc/loadavg
Since this is a long example, I'm going to separate this into smaller parts. As I mentioned, this example uses regex substitution, which follows this syntax: s/search/replace/g. The "s" begins the definition of the substitution statement. The "search" value defines the text pattern you want to search for, and the "replace" value defines what you want to replace the search value with. The "g" at the end is a flag that denotes global substitution, meaning every match on a line is replaced rather than only the first, and is one of many flags available with the substitute statement. The search pattern in this example is:
^\([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\) \([0-9]\+\.[0-9]\+\).*$
The caret (^) at the beginning of the string denotes the beginning of a line of text being processed, and the dollar sign ($) at the end of the string denotes the end of a line of text. Four things are being searched for within this example. The first three items are:
\([0-9]\+\.[0-9]\+\)
This entire string is enclosed with escaped parentheses, which makes the value within available for use in the replace value. Just like the grep example, the [0-9] will match a single numeric character. When followed by an escaped plus sign, it will match one or more numeric characters. The escaped period will match a single period. When you put this whole expression together, you get a pattern for a decimal number.
The fourth item in the search value is simply a period followed by an asterisk. The period will match any character, and the asterisk will match zero or more of whatever preceded it. The replace value of the example is:
1-minute: \1\n5-minute: \2\n15-minute: \3
This is largely composed of plain text; however, it contains five special items. Two are newline characters, each represented by a backslash followed by "n" (\n). The other three are backslashes followed by a number. Each number corresponds to one of the patterns in the search value surrounded by escaped parentheses: \1 is the first pattern in parentheses, \2 is the second and so on. With GNU sed, which supports \+ in the pattern and \n in the replacement, the output of this sed command will be exactly the same as the awk command from earlier.
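The same capture-group idea can be tried on a shorter pattern. The following is a sketch rather than a drop-in tool, assuming a sed that accepts -E for extended regex (GNU and BSD sed both do), which removes the need to escape the parentheses; flipdate is a hypothetical name:

```shell
# flipdate is a hypothetical helper. It rewrites YYYY-MM-DD dates as
# DD/MM/YYYY. Using # as the delimiter avoids escaping the slashes
# in the replacement.
flipdate() {
    sed -E 's#([0-9]{4})-([0-9]{2})-([0-9]{2})#\3/\2/\1#g'
}
```

Here \1, \2 and \3 refer to the year, month and day groups, so the replacement simply emits them in reverse order.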
The final mechanism for string manipulation that I want to discuss involves using Bash variables to manipulate strings. Although this is much less powerful than traditional regex, it provides a number of ways to manipulate text. Here are a few examples using Bash variables:
MYTEXT="my example string"
echo "String Length: ${#MYTEXT}"
echo "First 5 Characters: ${MYTEXT:0:5}"
echo "Remove \"example\": ${MYTEXT/ example/}"
The variable named MYTEXT is the sample string this example works with. The first echo command shows how to determine the length of a string variable. The second echo command will return the first five characters of the string. This substring syntax involves the beginning character index (in this case, zero) and the length of the substring (in this case, five). The third echo command removes the word "example" along with a leading space.
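A few more variable-based operations are worth knowing. This sketch uses a made-up path purely as sample data and shows prefix removal, suffix removal and replace-all:

```shell
# FILEPATH is sample data chosen for illustration only.
FILEPATH="/var/log/app/error.log"

# ${var##pattern} removes the longest match of the pattern from the front.
FILENAME="${FILEPATH##*/}"

# ${var%pattern} removes the shortest match of the pattern from the end.
BASENAME="${FILENAME%.log}"

# ${var//search/replace} replaces every occurrence, not just the first.
UPDATED="${FILEPATH//log/LOG}"

echo "File name: $FILENAME"
echo "Without extension: $BASENAME"
echo "Replace all: $UPDATED"
```

With this sample path, the three variables should hold error.log, error and /var/LOG/app/error.LOG, respectively.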
Mathematical Computation
Although text processing might be what makes Bash scripting great, the need to do mathematics still exists. Basic math problems can be evaluated using bc, awk or Bash arithmetic expansion. The bc command has the ability to evaluate math problems via an interactive console interface and piped input. For the purpose of this article, let's look at evaluating piped data. Consider the following:
pow() {
    # Both the base and the exponent are required
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: pow <base> <exponent>"
    else
        echo "$1^$2" | bc
    fi
}
This example shows an implementation of the pow function from C++. The function requires two arguments. The result of the function will be the first number raised to the power of the second number. The math statement of "$1^$2" is piped into the bc command for calculation.
Although awk does provide the ability to do basic math calculation, the ability for awk to iterate through lines of text makes it especially useful for creating summary data. For instance, if you want to calculate the total size of all files within a folder, you might use something like this:
foldersize() {
    if [ -d "$1" ]; then
        # Keep only regular files (lines starting with "-") and total field 5
        ls -alRF "$1"/ | grep '^-' | awk 'BEGIN {tot=0} { tot=tot+$5 } END { print tot }'
    else
        echo "$1: folder does not exist"
    fi
}
This function will do a recursive long-listing for all entries underneath the folder supplied as an argument. It then will search for all lines beginning with a dash (this will select all files). The final step is to use awk to iterate through the output and calculate the combined size of all files.
Here is how the awk statement breaks down. Before processing of the piped data begins, the BEGIN block sets a variable named tot to zero. Then for each line, the next block is executed. This block will add to tot the value of the fifth field in each line, which is the file size. Finally, after the piped data has been processed, the END block then will print the value of tot.
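The same accumulate-then-print pattern works for any column of numbers. As a sketch, the hypothetical sumcolumn function below takes the field number as an argument and passes it to awk with the -v option:

```shell
# sumcolumn is a hypothetical helper. $1 is the 1-based number of the
# field to total; the data is read from standard input.
sumcolumn() {
    # tot + 0 forces numeric output, so empty input prints 0.
    awk -v col="$1" '{ tot += $col } END { print tot + 0 }'
}
```

For example, printf '1 10\n2 20\n3 30\n' | sumcolumn 2 totals the second column.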
The other way to perform basic math is through arithmetic expansion. Its syntax looks similar to that of command substitution. Let's rewrite the previous example using arithmetic expansion:
pow() {
    # Both the base and the exponent are required
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: pow <base> <exponent>"
    else
        echo "$[$1**$2]"
    fi
}
The syntax for arithmetic expansion used here is $[expression], where expression is a mathematical expression. This bracket form is an older syntax that Bash still accepts but has deprecated; $((expression)) is the current equivalent. Notice that instead of using the caret operator for exponents, this example uses a double asterisk. Although this method has limitations (Bash arithmetic works only on integers, for example), the syntax can be more intuitive than piping data to the bc command.
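For reference, here is a sketch of the same function written with the $(( )) form of arithmetic expansion. The name pow2 is hypothetical and is used only to keep it distinct from the earlier versions:

```shell
# pow2 is a hypothetical name; it behaves like pow but uses $(( )).
pow2() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: pow2 <base> <exponent>"
    else
        # Bash arithmetic is integer-only; ** is the exponent operator.
        echo "$(( $1 ** $2 ))"
    fi
}
```

Because the arithmetic is integer-only, an expression such as $(( 7 / 2 )) evaluates to 3 rather than 3.5, which is one case where bc remains the better choice.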
Cryptography
The ability to perform cryptographic operations on data may be necessary depending on the needs of an application. If a string needs to be hashed, a file needs to be encrypted, or data needs to be base64-encoded, this all can be accomplished using the openssl command. Although openssl provides a large set of ciphers, hashing algorithms and other functions, I cover only a few here.
The first example shows encrypting a file using the blowfish cipher:
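bf-enc() {
    if [ -f "$1" ] && [ -n "$2" ]; then
        cat "$1" | openssl enc -blowfish -pass pass:"$2" > "$1.enc"
    else
        echo "usage: bf-enc <file> <password>"
    fi
}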
This function requires two arguments: a file to encrypt and the password to use to encrypt it. After running, this script produces a file named the same as your original but with the file extension of "enc".
Once you have the data encrypted, you need a function to decrypt it. Here's the decryption function:
bf-dec() {
    if [ -f "$1" ] && [ -n "$2" ]; then
        # ${1%%.enc} strips the ".enc" suffix to name the decrypted file
        cat "$1" | openssl enc -d -blowfish -pass pass:"$2" > "${1%%.enc}"
    else
        echo "usage: bf-dec <file> <password>"
    fi
}
The syntax for the decryption function is almost identical to the encryption function with the addition of "-d" to decrypt the piped data and the syntax to remove ".enc" from the end of the decrypted filename.
Another piece of functionality provided by openssl is the ability to create hashes. Although files may be hashed using openssl, I'm going to focus on hashing strings here. Let's make a function to create an MD5 hash of a string:
md5hash() {
    if [ -z "$1" ]; then
        echo "usage: md5hash <string>"
    else
        # printf '%s' hashes the string itself, without the newline echo would append
        printf '%s' "$1" | openssl dgst -md5 | sed 's/^.*= //g'
    fi
}
This function will take the string argument provided to the function and generate an MD5 hash of that string. The sed statement at the end of the command will strip off text that openssl puts at the beginning of the command output, so that the only text returned by the function is the hash itself.
The way that you would validate a hash (as opposed to decrypting it) is to create a new hash and compare it to the old hash. If the hashes match, the original strings will match.
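That comparison can be wrapped in a function of its own. The following is a minimal sketch; md5matches is a hypothetical name, and it hashes the string without a trailing newline, so the stored hash must have been created the same way:

```shell
# md5matches is a hypothetical helper. $1 is the candidate string and
# $2 is the previously stored MD5 hash. The exit status reports the result.
md5matches() {
    local actual
    # Hash the candidate, stripping the label openssl prints before the hash.
    actual=$(printf '%s' "$1" | openssl dgst -md5 | sed 's/^.*= //')
    [ "$actual" = "$2" ]
}
```

Because the function communicates through its exit status, it can be used directly in a condition, for example: if md5matches "$input" "$stored"; then echo "match"; fi.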
I also want to discuss the ability to create a base64-encoded string of data. One particular application that I have found this useful for is creating an HTTP basic authentication header string (this contains username:password). Here is a function that accomplishes this:
basicauth() {
    if [ -z "$1" ]; then
        echo "usage: basicauth <username>"
    else
        # printf '%s' keeps a trailing newline out of the encoded credentials
        printf '%s' "$1:$(read -r -s -p "Enter password: " pass ; echo "$pass")" | openssl enc -base64
    fi
}
This function will take the user name provided as the first function argument and the password provided by user input through command substitution and use openssl to base64-encode the string. This string then can be added to an HTTP authorization header field.
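For scripted use where prompting is not possible, a non-interactive variant might look like the following sketch. The basicauth_header name is hypothetical, and passing the password as an argument is an assumption made for brevity; it is visible in the process list, so treat this as an illustration rather than a recommendation:

```shell
# basicauth_header is a hypothetical helper. $1 is the user name and
# $2 is the password; the complete header line is printed.
basicauth_header() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: basicauth_header <username> <password>"
    else
        # printf '%s' avoids encoding a trailing newline, and -A asks
        # openssl to emit the base64 text on a single line.
        printf 'Authorization: Basic %s\n' "$(printf '%s' "$1:$2" | openssl enc -base64 -A)"
    fi
}
```

The printed line can then be passed to an HTTP client that accepts custom headers.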
Database Operations
An application is only as useful as the data that sits behind it. Although there are command-line tools to interact with database server software, here I focus on the SQLite file-based database. Something that can be difficult when moving an application from one computer to another is that depending on the version of SQLite, the executable may be named differently (typically either sqlite or sqlite3). Using command substitution, you can create a more portable way of calling sqlite:
$(ls /usr/bin/sqlite* | grep 'sqlite[0-9]*$' | head -n1)
This will return the full file path of the sqlite executable available on a system.
Consider an application that, upon first execution, creates an empty database. If this syntax is used to invoke the sqlite binary, the empty database always will be created using the correct version of sqlite on that system.
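The lookup above assumes the executable lives in /usr/bin. As an alternative sketch, the hypothetical find_sqlite function below searches the whole PATH using the command -v builtin, preferring sqlite3 when both names exist:

```shell
# find_sqlite is a hypothetical helper. It prints the path of the first
# SQLite shell found on PATH and returns non-zero if none is found.
find_sqlite() {
    local candidate
    for candidate in sqlite3 sqlite; do
        if command -v "$candidate" >/dev/null 2>&1; then
            command -v "$candidate"
            return 0
        fi
    done
    echo "find_sqlite: no sqlite executable found" >&2
    return 1
}
```

It can be used the same way as the ls-based lookup, for example "$(find_sqlite)" test.db followed by a quoted SQL statement.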
Here's an example of how to create a new database with a table for personal information:
$(ls /usr/bin/sqlite* | grep 'sqlite[0-9]*$' | head -n1) test.db "CREATE TABLE people(fname text, lname text, age int)"
This will create a database file named test.db and will create the people table as described. This same syntax could be used to perform any SQL operations that SQLite provides, including SELECT, INSERT, DELETE, DROP and many more.
This article barely scratches the surface of commands available to develop console applications on Linux. There are a number of great resources for learning more in-depth scripting techniques, whether in Bash, awk, sed or any other console-based toolset. See the Resources section for links to more helpful information.
Resources