2020-02-02

Removing variable prefixes and suffixes from other variables in Bash

Just been bitten by this…

If you have a variable txt in Bash, you can strip a given prefix or suffix from it like this:

$ txt=a/b.d/c.jpg
$ echo "${txt%.*}"
a/b.d/c
$ echo "${txt##*.}"
jpg
$ echo "${txt%%.*}"
a/b
$ echo "${txt#*.}"
d/c.jpg

The % operator strips off the shortest matching suffix, and .* matches .jpg, so that gets removed. %% strips off the longest matching suffix. Similarly, # and ## strip off the shortest and longest matching prefix, respectively. Asterisks, question marks, square brackets and other characters are special, following the rules described under Pattern Matching in the Bash manual page.
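
As a small illustration of special characters other than the asterisk, here is a sketch using question marks and a bracket expression. The variable name and its value are made up for the example:

```shell
name=report-2020.tar.gz
# ? matches exactly one character, so ??? drops the last three (".gz")
echo "${name%???}"       # report-2020.tar
# [0-9] matches one digit; %% strips from the first digit to the end
echo "${name%%[0-9]*}"   # report-
# # strips the shortest prefix ending in a hyphen
echo "${name#*-}"        # 2020.tar.gz
```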

You can also use literal strings as the patterns, i.e., no special characters:

$ txt=a/b.d/c.jpg
$ echo "${txt%.jpg}"
a/b.d/c
$ echo "${txt%.png}"
a/b.d/c.jpg
$ echo "${txt#a/b.d/}"
c.jpg
$ echo "${txt#c/b.d/}"
a/b.d/c.jpg

Note that, if the prefix or suffix doesn't match (whether you use special characters or not), you get the whole string returned.
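
Since a failed match hands back the original string, one way to tell whether anything was stripped is to compare the result with the input. A minimal sketch (the variable stripped is just a name chosen for the example):

```shell
txt=a/b.d/c.jpg
stripped="${txt%.png}"
# An unchanged value means the pattern matched nothing
# (or matched only the empty string).
if [ "$stripped" = "$txt" ]
then
  echo "no .png suffix on $txt"
else
  echo "stripped to $stripped"
fi
```

With the value above, this takes the first branch.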

These operations are useful for traversing pathnames:

$ path="/home/john/file.jpg"
$ echo Leaf is "${path##*/}"
Leaf is file.jpg
$ echo Dir is "${path%/*}"
Dir is /home/john

You have to be careful if your input doesn't contain the separator:

$ input1=path/to/file.jpg
$ input2=file.jpg
$ echo Input 1 dir "[${input1%/*}]" leaf "[${input1##*/}]"
Input 1 dir [path/to] leaf [file.jpg]
$ echo Input 2 dir "[${input2%/*}]" leaf "[${input2##*/}]"
Input 2 dir [file.jpg] leaf [file.jpg]
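
The missing-separator case can also be detected explicitly with a case statement. In this sketch, split_path is a hypothetical helper (not a Bash built-in), and reporting "." for a bare filename is an assumption borrowed from what dirname does. It does not treat a leading slash specially, so /file.jpg yields an empty dir:

```shell
# split_path PATHNAME: sets the global variables dir and leaf.
# Hypothetical helper for illustration only.
split_path() {
  leaf="${1##*/}"
  case "$1" in
    */*) dir="${1%/*}" ;;   # there is a separator, so strip the leaf
    *)   dir="." ;;         # no separator: assume the current directory
  esac
}

split_path path/to/file.jpg
echo "dir [$dir] leaf [$leaf]"   # dir [path/to] leaf [file.jpg]
split_path file.jpg
echo "dir [$dir] leaf [$leaf]"   # dir [.] leaf [file.jpg]
```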

To avoid this special case, I thought I could do this:

input1=path/to/file.jpg
input2=file.jpg
input1leaf="${input1##*/}"
input1dir="${input1%${input1leaf}}"
input2leaf="${input2##*/}"
input2dir="${input2%${input2leaf}}"
echo "[${input1dir}]" "[${input1leaf}]"
echo "[${input2dir}]" "[${input2leaf}]"

…which leads to this:

[path/to/] [file.jpg]
[] [file.jpg]

However, I hadn't noticed that special characters in the inner variable's value are still interpreted as pattern characters once it has been expanded:

input3="path/to/file [2002].jpg"
input3leaf="${input3##*/}"
input3dir="${input3%${input3leaf}}"
echo "[${input3dir}]" "[${input3leaf}]"

The square brackets are taken as a bracket expression matching a single character (2 or 0), so the pattern fails to match the literal value and nothing is stripped:

[path/to/file [2002].jpg] [file [2002].jpg]

The trick is to quote again:

input3="path/to/file [2002].jpg"
input3leaf="${input3##*/}"
input3dir="${input3%"${input3leaf}"}"
echo "[${input3dir}]" "[${input3leaf}]"

Now you get the intended result:

[path/to/] [file [2002].jpg]

An alternative technique would be to use the length of your prefix/suffix in a substring operation, but it's less convenient and more error-prone if you want to make small adjustments to a prefix or suffix before applying it.
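
For comparison, here is a sketch of that length-based alternative, using Bash's ${var:offset:length} substring expansion. Note that it never checks that the string really ends with the leaf; it just cuts off that many characters:

```shell
input3="path/to/file [2002].jpg"
input3leaf="${input3##*/}"
# Keep everything except the last ${#input3leaf} characters.
# The length field is evaluated as arithmetic, so the subtraction works.
input3dir="${input3:0:${#input3}-${#input3leaf}}"
echo "[${input3dir}]" "[${input3leaf}]"   # [path/to/] [file [2002].jpg]
```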

Anyway, in summary, if you're going to use Bash's prefix/suffix removal with a computed pattern, put the inner expansion in double quotes!

Bash redirection with descriptor in variable, and locking

A recommended way to acquire a lock in Bash is to open the lock file for a group command, and call flock on the open descriptor before doing anything dangerous:

{
  echo waiting
  flock -x 9
  echo in
  sleep 10
  echo done
} 9> /tmp/lock

Try it in two independent terminals. The second will print waiting straight away, but proceeds only once the first has finished.

However, one should never have to pick an arbitrary file descriptor (9 in this case). Fortunately, you can get Bash to choose an available descriptor, using {var} in place of the literal descriptor number:

unset lfd
{
  echo waiting
  flock -x $lfd
  echo in
  sleep 10
  echo done
} {lfd}> /tmp/lock

Problem solved!

No, wait. The descriptor doesn't get closed at the end of the group command, so the first shell keeps holding the lock, and your second invocation hangs until that descriptor is closed. Once the first terminal has finished, if you manually close the descriptor there, the second proceeds:

exec {lfd}>&-

Looks like you have to do things more explicitly (and the group command is no longer useful):

echo waiting
unset lfd
exec {lfd}> /tmp/lock
flock -x $lfd
echo in
sleep 10
echo done
exec {lfd}>&-
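
One way to keep the explicit open and close in a single place is to wrap them in a function. This is only a sketch: with_lock is a hypothetical helper name, it assumes the flock utility is installed, and the lock file comes from mktemp here as a stand-in for /tmp/lock:

```shell
# with_lock LOCKFILE COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND while holding an exclusive lock.
with_lock() {
  local lockfile=$1 lfd rc
  shift
  exec {lfd}> "$lockfile" || return 1             # Bash picks a free descriptor
  flock -x "$lfd" || { exec {lfd}>&- ; return 1 ; }
  "$@"                                            # run the protected command
  rc=$?
  exec {lfd}>&-                                   # closing releases the lock
  return "$rc"
}

lockfile=$(mktemp)   # stand-in for /tmp/lock
with_lock "$lockfile" echo in
rm -f "$lockfile"
```

The protected work has to be a single command or a shell function, which is less flexible than a group command but keeps the descriptor handling out of the caller.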

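This is inconvenient if you want to break or continue out of an enclosing loop, because you would have to close the descriptor yourself before each one. With a literal descriptor, the group command takes care of that: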

for i in $(seq 1 10)
do
  {
    echo waiting
    flock -x 9
    echo in
    sleep 5
    if something_went_wrong ; then continue ; fi
    sleep 5
    echo done
  } 9> /tmp/lock
done
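
For contrast, here is a sketch of the same kind of loop with a Bash-chosen descriptor, where every exit path from the iteration has to close it by hand. It assumes flock is installed; the mktemp lock file stands in for /tmp/lock, the test on $i stands in for something_went_wrong, and processed is a variable added just to show which iterations completed:

```shell
lockfile=$(mktemp)   # stand-in for /tmp/lock
processed=""
for i in 1 2 3
do
  exec {lfd}> "$lockfile"
  flock -x "$lfd"
  if [ "$i" -eq 2 ] ; then   # stand-in for something_went_wrong
    exec {lfd}>&-            # must close before leaving the iteration
    continue
  fi
  processed="$processed $i"
  exec {lfd}>&-              # and again on the normal path
done
rm -f "$lockfile"
echo "processed:$processed"  # processed: 1 3
```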

Is this a bug, a feature, or a mistake on my part? (Bash version 4.4.20(1)-release.)