Various links

How to Make Your Open Source Project Really Awesome – a pretty reasonable to-do list for any released project, not just open source.

checkedthreads is a fork-join parallelism framework for C++.

An advanced CS class from Brian Kernighan.

An insane liquid bash/zsh prompt.


Federico Mena Quintero on “Software that has the Quality Without A Name”

This is a pretty good article that, while not a substitute for reading Christopher Alexander, conveys the meaning quite well.

Software that has the Quality Without A Name

Christopher Alexander eventually did give “Quality Without A Name” (often written as QWAN) a name: “wholeness”. This is a good shorthand, at least in English, for the things he was trying to get across about QWAN. For example, “It is a subtle kind of freedom from inner contradictions”.

Great software works in the real world, with no excuses as to “bad environments” or “user error”; both of those are just part of the real world. His method to help develop great works (great architecture, in his case) was to observe the patterns that have developed over time; long-lasting patterns tend to be stable and resilient, so if you can’t figure out resilience from first principles, base your designs on patterns that have emerged over time.

This had a huge impact on some people in the software world, and this is where Design Patterns came from. We are still in our infancy, however, because we haven’t yet had enough time for bad patterns to be winnowed out and good patterns to emerge. We have to help this process along by going deeper: more analysis, more observation, and above all, less defensiveness. If we make something we know in our heart of hearts is great, and yet it fails to work in many cases in the real world, then we missed something, or we used a bad pattern in its development.

Stop circling the wagons, and learn. Software can be practically perfect.

Xcode and dev tools – install and uninstall

Uninstalling Xcode

Xcode 1 through Xcode 4.2

Before Xcode 4.3, uninstalling meant running a script. Assuming you installed Xcode to /Developer, you would do this:

brian-mac-pro:bfitz$ sudo /Developer/Library/uninstall-devtools --mode=all

This actually runs a handful of other scripts to uninstall the various pieces that the Xcode installer put on your system. Those scripts turn out to be little more than wrappers for pkgutil invocations that remove files and package receipts.
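As a rough illustration of what such a wrapper does, here is a minimal dry-run sketch in shell. It only prints the commands it would run and deletes nothing. The function name print_uninstall_plan is hypothetical, and the sketch assumes the package was installed relative to /, which is not true of every package; --only-files, --files and --forget are pkgutil options.

```shell
#!/bin/sh
# Dry run: print the commands that would remove a package's files and
# its receipt. Nothing is deleted.
# Assumption: the package's install location is "/" (check with
# `pkgutil --pkg-info`). print_uninstall_plan is a hypothetical helper.
print_uninstall_plan() {
    pkg_id=$1
    # --only-files --files lists the package's files, one per line,
    # relative to the install location.
    pkgutil --only-files --files "$pkg_id" | while IFS= read -r rel; do
        printf 'rm -f "/%s"\n' "$rel"
    done
    # --forget discards the receipt once the files are gone.
    printf 'pkgutil --forget "%s"\n' "$pkg_id"
}
```

Calling print_uninstall_plan with a package ID from pkgutil --pkgs lets you review the plan before deciding whether to run any of it.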

Xcode 4.3 and up

As of Xcode 4.3, Xcode is self-contained, so uninstalling it is a matter of throwing away the application bundle. Mostly: if you installed the separate command-line tools, those need to be uninstalled by hand or by script. One such script uses the package receipts and lsbom to find and remove all the files in the package. Note that this isn’t really the “right” way to do it; you’re supposed to discover the installed files and iterate through them with pkgutil.

Note that you don’t actually need to install the separate command-line tools if you’re willing to use the ones located inside the Xcode bundle. You use xcrun to find and run one; for example, xcrun git would find and run the copy of git inside the Xcode bundle under /Applications/, assuming that your Xcode is in /Applications.
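A small shell sketch of that lookup, assuming nothing beyond xcrun's --find option: it asks xcrun where a tool lives and falls back to the ordinary PATH lookup on machines where xcrun is missing or does not know the tool. The helper name find_tool is hypothetical.

```shell
#!/bin/sh
# Print the full path of a developer tool.
# xcrun --find asks the active Xcode where the tool lives; if xcrun is
# missing or fails, fall back to the ordinary PATH lookup.
# find_tool is a hypothetical helper name.
find_tool() {
    xcrun --find "$1" 2>/dev/null || command -v "$1"
}

find_tool git || echo "git not found"
```

Once you know the tool is there, xcrun git --version (or any other arguments) runs it directly without putting it on your PATH.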

Non-Xcode toolchain for Mac

Xcode is mostly free, but it’s large – it occupies multiple gigabytes on your hard disk. If you just want to build code and just want compilers and SDKs, it’s a lot of overhead.

Several years ago, Kenneth Reitz put together a cut-down version that was Xcode minus Xcode – basically GCC and all the headers and libraries that weren’t Apple-licensed. He called this OSX-GCC-Installer, hosted it on GitHub, and it became fairly popular. It became popular enough that people from Apple became interested in it, and eventually Apple released Command Line Tools for Xcode. It’s free, although it does require an Apple ID to download it.



Fun with pkgutil

With Mac OS X, Apple started out in the GUI world, but over time it has moved toward a more traditional Unix world with command-line tools, without forcing this on most users. For example, you can perform installs and access install information from the command line as well as through the GUI programs Apple supplies.

The standard on Mac OS X is “the package”, and this is for atomic entities, such as applications, libraries and frameworks. Installers add packages to the system; users run them (a .app is a folder that is a package that masquerades as a file).

The pkgutil command-line program gives you access to information about installed packages. man pkgutil tells us a little bit:

pkgutil(1)		  BSD General Commands Manual		    pkgutil(1)

     pkgutil -- Query and manipulate Mac OS X Installer packages and receipts.

     pkgutil [options] [commands]

     pkgutil reads and manipulates Mac OS X Installer flat packages, and pro-
     vides access to the ``receipt'' database used by the Installer. Options
     are processed first, and affect the operation of all commands. Multiple
     commands are performed sequentially in the given order.

First off, you can just get a list of all the packages installed to a specific volume. For the most part, packages are installed to the root volume /, and if you don’t pass in a --volume option, pkgutil will default to /.

brian-mac-pro:~ bfitz$ pkgutil --packages
brian-mac-pro:~ bfitz$ pkgutil --packages | wc
      87      87    2549

My Mac currently has 87 packages installed on it (I don’t install a lot of things, sorry).

You can list packages that match a pattern – for example, to find all packages with the string Xcode:

brian-mac-pro:~ bfitz$ pkgutil --pkgs=.\+Xcode.\+

The trick with the regular expression is that it must cover the entire name, because an implied start and end anchor is applied to the regex. You also need to escape or quote characters that the shell might interpret (like <, *, \ and so on); escaping + as above is harmless, though bash does not treat it specially. For example, if your regex needs a backslash, then you need a backslash for that backslash.
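The two points can be checked without pkgutil at all. The sketch below first shows what the shell actually hands to a program after processing the backslashes, then imitates the implied anchors with grep -x, which also requires the pattern to match the whole line. The three package IDs are made up for illustration, and grep's regex dialect is only a stand-in for whatever pkgutil uses internally.

```shell
#!/bin/sh
# What a program receives after the shell has processed the argument:
# the backslashes are consumed, leaving a plain regex.
printf '%s\n' --pkgs=.\+Xcode.\+      # prints --pkgs=.+Xcode.+

# Single quotes pass the pattern through untouched, so no escaping is needed.
printf '%s\n' --pkgs='.+Xcode.+'      # prints --pkgs=.+Xcode.+

# Whole-name matching, imitated with grep -x (match the entire line).
# The three package IDs are hypothetical.
matches=$(printf '%s\n' com.example.Xcode.tools Xcode com.example.git |
    grep -E -x '.+Xcode.+')
echo "$matches"                       # prints com.example.Xcode.tools
```

The bare name Xcode is rejected because .+ needs at least one character on each side, which is exactly the behaviour the implied anchors produce.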

One of the most useful commands is “I have a file on my hard disk, what installed it”. For example, something installed git to /usr/bin/git – what was it?

brian-mac-pro:~ bfitz$ pkgutil --file-info /usr/bin/git
volume: /
path: /usr/bin/git

install-time: 1316396966
uid: 0
gid: 0
mode: 755
brian-mac-pro:~ bfitz$
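If you only want the owning package ID, you can filter the output. This is a minimal sketch that assumes pkgutil --file-info reports each owner on a line of the form "pkgid: <package id>"; the helper name owning_pkgs is hypothetical.

```shell
#!/bin/sh
# Print the ID of each package that claims a given file.
# Assumption: `pkgutil --file-info` reports each owner on a line of the
# form "pkgid: <package id>". owning_pkgs is a hypothetical helper name.
owning_pkgs() {
    pkgutil --file-info "$1" | awk -F': ' '$1 == "pkgid" { print $2 }'
}
```

For example, owning_pkgs /usr/bin/git would print one package ID per line, ready to feed into the other pkgutil commands.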

Evidently, when I said “install command-line versions of tools” in Xcode, it installed git to the global system folder. So, what else did it install? The --files option lists all the files installed by a package, and --only-files makes it list just the files, not the directories that were created to hold those files.

brian-mac-pro:~ bfitz$ pkgutil --only-files  --files
brian-mac-pro:~ bfitz$ pkgutil --only-files  --files | wc
    1852    1856  102002

It installed 1852 files, and the files are as expected, command line programs and man pages and even a suite of test code.

You can get more information about a package with --pkg-info.

brian-mac-pro:~ bfitz$ pkgutil --pkg-info
volume: /
location: /
install-time: 1316396966
brian-mac-pro:~ bfitz$ date -r 1316396966
Sun Sep 18 18:49:26 PDT 2011

The install-time field is in Unix seconds (seconds since January 1, 1970 UTC), which I turned into a human-readable date with the date command-line tool, so you can see that I installed this package (which came from an install of Xcode) on September 18, 2011.
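The date -r form above is the BSD/macOS spelling; GNU date spells the same conversion -d @seconds. A minimal sketch that tries both and prints UTC, so the result does not depend on the machine's time zone (the helper name epoch_to_utc is hypothetical):

```shell
#!/bin/sh
# Convert an install-time value (seconds since 1970) to a readable UTC time.
# GNU date takes -d @seconds; BSD/macOS date takes the seconds with -r.
# epoch_to_utc is a hypothetical helper name.
epoch_to_utc() {
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S' 2>/dev/null ||
        date -u -r "$1" '+%Y-%m-%d %H:%M:%S'
}

epoch_to_utc 1316396966   # prints 2011-09-19 01:49:26
```

That is the same instant as the PDT time shown above, seven hours later on the clock because it is expressed in UTC.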

Here we see that the package is part of several groups, and we can discover what other packages are in a group by using --group-pkgs:

brian-mac-pro:~ bfitz$ pkgutil --group-pkgs

Of course, at this point you’re reverse-engineering what some developer has as their plan for how to organize software, and you’re not likely to find this documented anywhere.

R and ggplot2

The relatively new graphing package ggplot2 looks better and is more fully featured than R’s built-in plot. Here are some translations of typical graphs from plot to ggplot2. ggplot2 takes its direction from Leland Wilkinson’s grammar of graphics: there is data (what we want to visualize), geometry (the geometric objects, or geoms, used to represent data), and aesthetic attributes, which are visual properties of geoms such as position, color, shape and so on.

FYI, you’ll need to install it (which you can do inside R with install.packages("ggplot2")), and then in each R session you need to load it with library(ggplot2).


Line graphs

In ggplot2

ggplot(two50, aes(x=Days, y=Likelihood)) + geom_line()

Cumulative probability distribution graph

In plot

n <- length(data$var)
plot(sort(data$var), (1:n)/n, type="s")

In ggplot2, the literal translation would be

n <- length(data$var)
qplot(sort(data$var), (1:n)/n, stat="ecdf", geom="step")

but the idiomatic version is

ggplot(data, aes(x=var)) + stat_ecdf()

The stat_ecdf function here is from the ggplot2 package, and is the same function used in the qplot line above.