Something all bash scripters need to know (and most of us don’t)

2012-02-12

Calling all bash users. This is a public service announcement.

Here’s something you need to know if you want to write bash scripts that work reliably, and that you probably don’t know.

Recommendations

For script authors: Every bash script that uses the cd command with a relative path needs to call unset CDPATH, or else it may not work correctly. Scripts that don’t use cd should probably do it anyway, in case someone puts a cd in later.
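As a minimal sketch, the guard can go at the very top of a script, before any cd. The scratch directory and the name data below are placeholders for illustration only.

```shell
# Discard any CDPATH inherited from the caller's environment, so that
# relative cd targets are resolved against the current directory only.
unset CDPATH

# Illustration only: a throwaway directory tree (names are placeholders).
workdir=$(mktemp -d)
mkdir "$workdir/data"

cd "$workdir" || exit 1
cd data || exit 1    # can only mean ./data now, and prints nothing
```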

For users: Never export CDPATH from your shell to the environment. If you use CDPATH then set it in your .bashrc file and don’t export it, so that it’s only set in interactive shells.
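For instance, the relevant line in .bashrc might look like the sketch below. The directory $HOME/projects is a placeholder for whatever directories you actually want cd to search.

```shell
# ~/.bashrc
# Plain assignment with no "export": the variable is visible to this
# interactive shell but is not passed on to the scripts it runs.
CDPATH=.:$HOME/projects    # placeholder directory list
```

If an earlier startup file has already exported CDPATH, a plain assignment keeps the export attribute. In bash, export -n CDPATH removes it.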

For the bash implementers: Change bash to ignore the inherited value of CDPATH in shell scripts (as opposed to interactive shells).

Update

Since I wrote this, thanks to commenters here and on Reddit I’ve learnt two interesting things:

  • CDPATH is not a bash-specific feature; it’s actually specified by POSIX.
  • You can avoid it in some cases by using cd ./foo, which does not consult CDPATH. But this is not a panacea: it can’t easily be used with paths that might be absolute or relative, such as `dirname "$0"`, so I think unsetting CDPATH is still the best way to deal with it.
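
A small sketch of the second point, using throwaway directories (all names here are made up for illustration). The first part shows that a leading ./ bypasses CDPATH. The second shows the extra case statement needed when a path may be either absolute or relative, which is the awkwardness mentioned above.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/here/foo" "$scratch/elsewhere/foo"

# Part 1: with a leading "./", cd does not search CDPATH.
dotslash=$(
    cd "$scratch/here" || exit 1
    CDPATH=$scratch/elsewhere
    cd ./foo || exit 1
    pwd
)
# dotslash now names the foo under "here", not the one under "elsewhere".

# Part 2: a path that may be absolute or relative needs a guard before
# "./" can be prepended. to_explicit_path is a hypothetical helper.
to_explicit_path() {
    case $1 in
        /*) printf '%s\n' "$1" ;;      # absolute: leave alone
        *)  printf '%s\n' "./$1" ;;    # relative: force a "./" prefix
    esac
}
```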

What you need to know

The bash shell has a little-known feature that might occasionally be handy in interactive use, but is never useful in a script and acts as a brutal trap for the unwary scripter. The variable CDPATH can be set to a colon-separated list of directories, and then whenever cd somewhere is called with a relative path the directories in CDPATH are tested in turn to see whether somewhere exists in any of them. If it does, the current working directory is changed to that directory and the fully-qualified path of the new working directory is printed to standard output.

For example:

-bash-3.2$ cd                  # Change to my home directory
-bash-3.2$ mkdir foo /tmp/foo  # Create directory "foo" here and in /tmp
-bash-3.2$ CDPATH=/tmp:.       # Set CDPATH
-bash-3.2$ cd foo              # Call cd
/tmp/foo                       # cd changes to /tmp/foo, and prints it out

Here running cd foo changes to /tmp/foo rather than ~/foo, because /tmp precedes . in the CDPATH.
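
The same experiment can be reproduced without touching your home directory or /tmp/foo. This sketch uses a throwaway directory tree (the names are placeholders) and captures whatever cd writes to standard output:

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/home/foo" "$scratch/tmp/foo"

# Run in a subshell so the caller's directory and CDPATH are untouched.
printed=$(
    cd "$scratch/home" || exit 1
    CDPATH=$scratch/tmp:.
    cd foo    # found via the first CDPATH entry, so cd prints the path
)
echo "cd printed: $printed"
```

The captured text is the path of the foo directory under the stand-in tmp, not the one under the stand-in home.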

If CDPATH is set in the environment, e.g. exported from a shell, then it may cause the cd command to behave unexpectedly in shell scripts. By the robustness principle, users should not export CDPATH, and scripts should be written to work even if they do.

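In case you doubted it, it’s very common to see scripting idioms that may not work properly if CDPATH is exported. Even the common cd "$(dirname "$0")" falls into this category: if the script is invoked by a relative path such as bin/myscript, then dirname "$0" yields bin, a relative path that cd will look up in CDPATH.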
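
To see how this can go wrong, here is a sketch with a hypothetical script where.sh, invoked by a relative path while CDPATH is exported. All directory and file names are invented for the demonstration. The output of cd is discarded, so the only visible symptom is that the script ends up in the wrong directory.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/proj/bin" "$scratch/decoy/bin"

# Hypothetical script that reports which directory it ended up in.
cat > "$scratch/proj/bin/where.sh" <<'EOF'
cd "$(dirname "$0")" >/dev/null || exit 1
pwd
EOF

# The same script with the recommended guard as its first line.
{ echo 'unset CDPATH'; cat "$scratch/proj/bin/where.sh"; } \
    > "$scratch/proj/bin/where-safe.sh"

# Invoke both as "bin/..." from the project root, with CDPATH placed in
# the environment of the child shell.
unsafe=$(cd "$scratch/proj" && CDPATH=$scratch/decoy sh bin/where.sh)
safe=$(cd "$scratch/proj" && CDPATH=$scratch/decoy sh bin/where-safe.sh)

echo "without guard: $unsafe"
echo "with guard:    $safe"
```

Without the guard the script lands in the decoy tree, because bin is found there first. With unset CDPATH it stays in the project's own bin directory.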

How I discovered this trap (the hard way)

I’ve been writing bash scripts for almost half my life, but bash still has the capacity to surprise me.

At work we have a library of shell code that is used for things like configuration management across several different projects. Because the library is included as a git submodule in many different projects, and these projects themselves will be installed by different people in different places, the library code can’t use a hard-coded path to itself; but sometimes it does need to know where it’s installed so that library functions can invoke scripts from the same package.

Note that this is a library of functions that will be included into other shell scripts using the source command, so we can’t assume we’re in "$(dirname "$0")" as a straightforward shell script could. But that’s okay, because bash has a special variable $BASH_SOURCE that any function can use to find the filename of the file that function is defined in. So I wrote this:

_mysociety_commonlib_directory() {
    (
      cd "$(dirname "${BASH_SOURCE[0]}")"/..
      pwd
    )
}
MYSOCIETY_COMMONLIB_DIR=$(_mysociety_commonlib_directory)

which sets $MYSOCIETY_COMMONLIB_DIR to the fully-qualified pathname of the parent directory of the directory containing the file in which this function is defined. I was happy with the neatness of this solution, and it worked fine in all my tests.

A few days ago, though, a user reported a bug that we eventually traced back to this function. It turned out that the cd command was also printing the name of the directory to standard output, where the command substitution captured it along with the output of pwd, and so $MYSOCIETY_COMMONLIB_DIR ended up containing the directory name twice.

I can only suppose that the user must have CDPATH set in the environment. But what a nasty trap.
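
One way to make a function like this immune to the problem is to unset CDPATH inside the subshell, sketched below. The name _parent_of_containing_dir is hypothetical, and the file path is passed as an argument rather than read from BASH_SOURCE so that the sketch also runs in shells other than bash.

```shell
# Print the absolute path of the parent of the directory containing $1.
# The parenthesised body runs in a subshell, so neither the cd nor the
# unset affects the caller.
_parent_of_containing_dir() (
    unset CDPATH
    cd -- "$(dirname -- "$1")/.." && pwd
)
```

In the library this could be called as MYSOCIETY_COMMONLIB_DIR=$(_parent_of_containing_dir "${BASH_SOURCE[0]}"). Redirecting the output of cd to /dev/null would also hide the stray line, but it would not stop cd from choosing a directory found through CDPATH.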

You are currently reading Something all bash scripters need to know (and most of us don’t) at Bosker Blog.
