Fun with Mac/Unix binaries

$PATH and its directories

As of Mac OS X 10.7, there is a magic file named /private/etc/paths that contains the initial list of directories for the $PATH variable. It looks like this on a clean install:

/usr/bin
/bin
/usr/sbin
/sbin
/usr/local/bin

Each newline is turned into a ‘:’ character so that $PATH looks like /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin.

There is also a directory named /private/etc/paths.d/, which contains an arbitrary number of files that also contain entries for the $PATH variable. The files are read in alphabetical order and their contents are appended to the $PATH variable. On my system, I have a 50-X11 file and a git file, because I installed X11 (probably when I installed Mac OS X 10.7) and then I installed a new version of git from https://code.google.com/p/git-osx-installer/. As a result, my $PATH looks like this: /usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin:/usr/local/git/bin.
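Here is a minimal Python sketch of that assembly, assuming the behavior is exactly as described above: one directory per line, the main file first, then the paths.d files in sorted name order. The function names are mine and purely illustrative; this models the result, not how the system actually builds $PATH.

```python
import os


def read_entries(path):
    """Return the non-empty lines of one paths file, in file order."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def build_path(paths_file, paths_dir):
    """Model how $PATH is assembled: entries from paths_file first, then
    entries from each file in paths_dir (files taken in sorted name order),
    all joined with ':'."""
    entries = read_entries(paths_file)
    if os.path.isdir(paths_dir):
        for name in sorted(os.listdir(paths_dir)):
            entries.extend(read_entries(os.path.join(paths_dir, name)))
    return ":".join(entries)
```

On a Mac you would call it as build_path("/private/etc/paths", "/private/etc/paths.d"). Note that sorted() puts 50-X11 ahead of git, which is presumably why the X11 file carries a numeric prefix.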

Some people would suggest that /usr/local/bin come first, but that really doesn’t fix problems, at least not for complex programs like git, because, as you can see, the git that I installed to /usr/local was actually installed as a directory named /usr/local/git, and that folder has a bin folder that needs to be in the path. And I like that there is a paths.d directory so that new programs can be installed and removed easily.

What this means is that a newer copy of a program installed somewhere else won’t be picked up while an older copy sits earlier in the $PATH. Xcode 4.0 installed git, and this is a good thing. But it installed git to the main set of directories: /usr/bin, /usr/libexec and so on. So, if I want my new git to take precedence, I have to either install it on top of the existing one (a bit messy), mess with a bunch of paths to get it to be seen first, or… remove the old one. See below for that, but first a note on some other Unix filesystem bits.
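To see which copy wins, it helps to list every copy of a program on the $PATH in lookup order; the first one is what the shell runs. A rough Python sketch (find_all is a name I made up for illustration):

```python
import os


def find_all(program, path):
    """Return every executable file named `program` found in the
    ':'-separated directory list `path`, in lookup order."""
    matches = []
    for directory in path.split(":"):
        candidate = os.path.join(directory, program)
        # Only count regular files that the current user may execute.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            matches.append(candidate)
    return matches
```

Calling find_all("git", os.environ["PATH"]) returns all the gits on the path; anything after the first entry is shadowed.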

MAN pages

In olden days of yore, all the man pages were installed to /usr/share/man. However, that was then and this is now – man pages have a system like $PATH where new programs can keep their man pages in their own hierarchy, but stitch them together so that the man viewer can find them.

First off, there is a config file for man, located by default at /private/etc/man.conf. This contains the default list of directories for man to search. As with the other parts, you can edit this directly, but you then run two risks: your changes can be wiped out by someone else changing this file, and you can’t easily uninstall the man pages of a specific program.

Second, there is a file just for man page paths, at /private/etc/manpaths, and there is a directory containing files that contain man page paths at /private/etc/manpaths.d. This is the same mechanism as used to set $PATH, just with different config files, and it means that man.conf should never need to be edited. My /private/etc/manpaths looks like this:

/usr/share/man
/usr/local/share/man

We have the same issue with man pages that we do with $PATH: if we install man pages for a newer version of a program that’s already installed, we won’t see our new man pages if the already-installed program’s directory comes earlier in the man path list. And the solutions are the same as above: install on top, fiddle with the manpaths file and the files in manpaths.d, or remove the older program.

libexec

Note that this is BSD-centric, and that includes Mac; many Linux distributions don’t do this, and libexec is absent from FHS 2.3 (FHS 3.0 has since brought /usr/libexec back as an optional directory). On Debian-style Linux systems, for example, git-core is in /usr/lib/git-core/, and not /usr/libexec/git-core/.

There is a hierarchy of programs that actually run from a libexec path, and here I don’t know how this is extended. In git, for example, all the old “git-something” programs are in libexec, and most of them are just symlinks or hardlinks to the main git executable. When Apple built and installed git, it used hardlinks, whereas the googlecode Mac installer uses symlinks; the effect is the same.
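If you want to check which flavor a given install uses, something like this Python sketch will tell you. The function name and the three labels are my own; the paths you pass in depend on where your git lives.

```python
import os


def link_kind(entry, main_binary):
    """Classify a libexec entry relative to the main executable."""
    if os.path.islink(entry):
        return "symlink"
    # samefile() compares device and inode numbers, so two directory
    # entries for the same inode (hard links) compare equal.
    if os.path.samefile(entry, main_binary):
        return "hardlink"
    return "separate file"
```

Passing the main binary itself as the entry also reports "hardlink", since it trivially shares its own inode.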

The original intention of libexec was “a directory that contains daemons and utilities that can’t be used directly by the user”. These are not in the $PATH; they are magically located by other programs that just know where they are. I’m assuming that these other programs have the paths hardcoded in source, or are working from paths relative to their location. And knowing how Unix programs typically work, the paths are probably determined at build time and built into binaries.
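For git specifically, running git --exec-path prints the directory it uses for these helpers. As for the "relative to their location" idea, it could look something like the Python sketch below. The function, its parameters and the default subdirectory are all hypothetical; I'm not claiming git or any other program does exactly this.

```python
import os


def helper_dir(binary_path, build_time_default, subdir="libexec/git-core"):
    """Guess where a program's helpers live.

    First try <prefix>/<subdir>, where <prefix> is the parent of the
    directory holding the binary (e.g. /usr/local/git for
    /usr/local/git/bin/git). If that directory does not exist, fall back
    to a path that was fixed at build time.
    """
    real_binary = os.path.realpath(binary_path)
    prefix = os.path.dirname(os.path.dirname(real_binary))
    relative = os.path.join(prefix, subdir)
    return relative if os.path.isdir(relative) else build_time_default
```

The relative lookup is what would let a whole tree like /usr/local/git be moved without rebuilding; the hardcoded fallback is the "built into binaries" case.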

Updating git in 10.7

With all that said, there are really only two good ways to update git in 10.7 once you’ve installed Xcode 4.

  • build from source and install into /usr/bin
  • run the googlecode installer and delete the existing version in /usr/bin

I decided to initially do the latter. What I actually did was write a quick script to move the git in the system folders out of the way. I used a script because there are too many files scattered in too many folders to want to do it by hand, and because I could then theoretically put this version of git back in place if necessary (maybe an Xcode upgrade would be confused if it saw files missing?).

Writing this gets interesting if you want to preserve hard links (Apple’s install of git uses hardlinks from the libexec git aliases to the git binary, instead of symlinks), and if you want to rewrite absolute symlinks so that they still point to the same object within the moved set.

For symlinks, there are several cases:

  • The symlink is an absolute path to something outside of the set of files you are moving. In that case, you want to leave the symlink alone.
  • The symlink is a relative path to something in the set of files you are moving. As long as you are moving the whole set somewhere else (e.g. preserving the local hierarchy, just moving the root), you want to leave the symlink alone.
  • The symlink is an absolute path to something inside the set of files you are moving. In that case, you need to adjust the symlink so that it points to the new location of its target.
  • The symlink is a relative path to something outside the set of files you are moving. This is probably an error, in that you should have moved the target too, but if not, you need to either turn this into an absolute path, or adjust the relative path so it is still valid.
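The four cases can be sketched in Python like this. It is a simplification under a couple of assumptions of mine: the moved set is one whole tree under old_root (the git files above are actually scattered across several directories), paths are compared lexically without resolving intermediate symlinks, and a relative link pointing outside the set is turned into an absolute path. The names is_inside and retarget are made up.

```python
import os


def is_inside(path, root):
    """True if absolute `path` is `root` itself or lies beneath it."""
    root = os.path.normpath(root)
    return os.path.commonpath([os.path.normpath(path), root]) == root


def retarget(link_path, target, old_root, new_root):
    """Return the target a symlink should have once the tree at old_root
    has moved to new_root.

    link_path: absolute location of the symlink before the move.
    target:    the link's current contents, as os.readlink() returns it.
    """
    absolute = os.path.isabs(target)
    if absolute:
        resolved = os.path.normpath(target)
    else:
        # Relative targets are relative to the directory holding the link.
        resolved = os.path.normpath(
            os.path.join(os.path.dirname(link_path), target))
    inside = is_inside(resolved, old_root)

    if absolute and not inside:
        return target    # absolute, outside the set: leave alone
    if not absolute and inside:
        return target    # relative, inside the set: leave alone
    if absolute and inside:
        # Absolute, inside the set: point at the target's new location.
        rel = os.path.relpath(resolved, old_root)
        return os.path.normpath(os.path.join(new_root, rel))
    # Relative, outside the set: pin it down as an absolute path.
    return resolved
```

A mover script would call os.readlink() on each symlink, pass the result through retarget(), and create the new link with os.symlink() at the destination.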

For hardlinks, it’s easy if the destination is on the same filesystem as the source; a mv will just move the directory entry and leave it pointing to the same inode. If you’re moving across filesystems, it’s a lot more interesting; you need to pick one file as the original and copy it, and then hardlink all the other entries to that new inode.
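The cross-filesystem case might look like this in Python. This is only a sketch, assuming every path is a regular file under old_root; the function name is mine, and it copies rather than moves, so removing the originals afterwards is left out.

```python
import os
import shutil


def copy_preserving_hardlinks(paths, old_root, new_root):
    """Copy regular files from under old_root to the same relative
    locations under new_root, recreating hard links among them."""
    first_copy = {}  # (device, inode) of a source -> its first destination
    for src in paths:
        dest = os.path.join(new_root, os.path.relpath(src, old_root))
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        st = os.stat(src)
        key = (st.st_dev, st.st_ino)
        if key in first_copy:
            # Same inode as a file already copied: link to that copy
            # instead of duplicating the data.
            os.link(first_copy[key], dest)
        else:
            # First time this inode is seen: copy data and metadata.
            shutil.copy2(src, dest)
            first_copy[key] = dest
```

Grouping by (st_dev, st_ino) is what identifies entries that share an inode; whichever one comes first in the list becomes "the original" on the destination side.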

I’ll have to create a separate writeup for this, because these are likely the same kinds of things that archive programs have to do, and it would be interesting to abstract this out into a new kind of file operation: moving a related group of files. It’s less common than a plain move, but it’s still something people periodically do. Preserving as much metadata as possible is always a good thing.