2020-02-02

Removing variable prefixes and suffixes from other variables in Bash

Just been bitten by this…

If you have a variable txt in Bash, you can strip a given prefix or suffix from it like this:

$ txt=a/b.d/c.jpg
$ echo "${txt%.*}"
a/b.d/c
$ echo "${txt##*.}"
jpg
$ echo "${txt%%.*}"
a/b
$ echo "${txt#*.}"
d/c.jpg

The % operator strips off the shortest matching suffix, and .* matches .jpg, so that gets removed. %% strips off the longest matching suffix. Similarly, # and ## strip off the shortest and longest matching prefix, respectively. Asterisks, square brackets and other characters are special, following the rules under Pattern Matching in the Bash manual page.
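
As a small sketch of those pattern rules in action (the file name and variable names here are made up for illustration), ? matches exactly one character and a bracket expression matches one character from a set:

```shell
# Hypothetical file name, chosen only to exercise the glob characters.
img=photos/IMG_0042.jpeg

# '?' matches exactly one character: strip a dot followed by four characters.
no_ext="${img%.????}"

# A bracket expression matches one character from the set:
# strip the longest suffix that starts with a digit.
before_digits="${img%%[0-9]*}"

echo "$no_ext"          # photos/IMG_0042
echo "$before_digits"   # photos/IMG_
```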

You can also use literal strings as the patterns, i.e., no special characters:

$ txt=a/b.d/c.jpg
$ echo "${txt%.jpg}"
a/b.d/c
$ echo "${txt%.png}"
a/b.d/c.jpg
$ echo "${txt#a/b.d/}"
c.jpg
$ echo "${txt#c/b.d/}"
a/b.d/c.jpg

Note that, if the prefix or suffix doesn't match (whether you use special characters or not), you get the whole string returned.
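
That no-match behaviour gives a cheap way to test for a suffix: strip it and see whether anything changed. A minimal sketch, where has_suffix is a helper name invented for this example rather than anything built into Bash:

```shell
# has_suffix STRING SUFFIX
# Succeeds if STRING ends with the non-empty literal SUFFIX.
# (Hypothetical helper, not a Bash builtin.)
has_suffix() {
    # If nothing is stripped, the expansion equals the original string.
    # An empty SUFFIX strips nothing, so it is reported as "no match".
    [ "${1%"$2"}" != "$1" ]
}

txt=a/b.d/c.jpg
if has_suffix "$txt" .jpg; then echo "jpg: yes"; else echo "jpg: no"; fi
if has_suffix "$txt" .png; then echo "png: yes"; else echo "png: no"; fi
```

The same idea works for prefixes by swapping % for #.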

These operations are useful for traversing pathnames:

$ path="/home/john/file.jpg"
$ echo Leaf is "${path##*/}"
Leaf is file.jpg
$ echo Dir is "${path%/*}"
Dir is /home/john

You have to be careful if your input doesn't contain the separator:

$ input1=path/to/file.jpg
$ input2=file.jpg
$ echo Input 1 dir "[${input1%/*}]" leaf "[${input1##*/}]"
Input 1 dir [path/to] leaf [file.jpg]
$ echo Input 2 dir "[${input2%/*}]" leaf "[${input2##*/}]"
Input 2 dir [file.jpg] leaf [file.jpg]

To avoid this special case, I thought I could do this:

input1=path/to/file.jpg
input2=file.jpg
input1leaf="${input1##*/}"
input1dir="${input1%${input1leaf}}"
input2leaf="${input2##*/}"
input2dir="${input2%${input2leaf}}"
echo "[${input1dir}]" "[${input1leaf}]"
echo "[${input2dir}]" "[${input2leaf}]"

…which leads to this:

[path/to/] [file.jpg]
[] [file.jpg]

However, I hadn't noticed that special characters in the expanded value are still interpreted as pattern characters:

input3="path/to/file [2002].jpg"
input3leaf="${input3##*/}"
input3dir="${input3%${input3leaf}}"
echo "[${input3dir}]" "[${input3leaf}]"

The square brackets are taken as a bracket expression (matching a single 2 or 0), so the pattern fails to match the literal value and nothing is stripped:

[path/to/file [2002].jpg] [file [2002].jpg]
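
To see what the unquoted pattern does match, here is a small sketch with two made-up paths (plain and bracketed are names invented for this example). The leaf gets stripped from a different file name entirely:

```shell
leaf="file [2002].jpg"

# Hypothetical inputs: one without the brackets, one with them.
plain="path/to/file 2.jpg"
bracketed="path/to/file [2002].jpg"

# ${leaf} is unquoted inside the expansion, so it is treated as a pattern:
# [2002] matches one character that is either 2 or 0.
plain_dir="${plain%${leaf}}"
bracketed_dir="${bracketed%${leaf}}"

echo "[${plain_dir}]"       # [path/to/]
echo "[${bracketed_dir}]"   # [path/to/file [2002].jpg]
```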

The trick is to quote again:

input3="path/to/file [2002].jpg"
input3leaf="${input3##*/}"
input3dir="${input3%"${input3leaf}"}"
echo "[${input3dir}]" "[${input3leaf}]"

Now you get the intended result:

[path/to/] [file [2002].jpg]
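
Wrapped up as a function, the quoted form might look like the sketch below. The name split_path and the globals dir and leaf are assumptions made for this example, not anything standard:

```shell
# split_path PATH
# Sets the globals 'dir' (everything up to and including the last slash,
# or empty if there is none) and 'leaf' (everything after it).
# (Hypothetical helper for illustration.)
split_path() {
    leaf="${1##*/}"
    # The inner quotes make the leaf match literally, whatever it contains.
    dir="${1%"${leaf}"}"
}

split_path "path/to/file [2002].jpg"
echo "[${dir}]" "[${leaf}]"   # [path/to/] [file [2002].jpg]

split_path "file*.jpg"
echo "[${dir}]" "[${leaf}]"   # [] [file*.jpg]
```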

An alternative technique would be to use the length of your prefix/suffix in a substring operation, but it's less convenient and more error-prone if you want to make small adjustments to a prefix or suffix before applying it.
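
For comparison, here is a sketch of that length-based alternative, reusing the input3 example and Bash's ${var:offset:length} substring expansion. It relies on the leaf really being a suffix of the input, so the computed length is never negative:

```shell
input3="path/to/file [2002].jpg"
input3leaf="${input3##*/}"

# Keep everything except the last ${#input3leaf} characters.
# No pattern matching is involved here, so the brackets cause no trouble.
input3dir="${input3:0:${#input3}-${#input3leaf}}"

echo "[${input3dir}]" "[${input3leaf}]"   # [path/to/] [file [2002].jpg]
```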

Anyway, in summary, if you're going to use Bash's prefix/suffix removal with a computed pattern, put the inner expansion in quotes!
