Category Archives: Software Development

A build layout – updated

This seems to be the cleanest layout so far.

Build/
  Win32-Debug/
    Obj/
      sub1/
      sub2/
    Lib/
      sub1.lib
      sub2.lib
    program.exe
    program.ilk
    program.pdb
  x64-Release/
    ..

This keeps all the object files in one place, with one sub-folder per project; all the libraries in another place (named after the project, so presumably unique); and the executables in the main target folder. Instead of nesting platform and target, there’s a flat hierarchy of targets, with each target name sufficiently disambiguated. For example, if there’s only one architecture, it might be left out of the name, but once there are several architectures, all of them are named.

I haven’t yet figured out how to do this in a single property sheet in Visual Studio. For now there are two: one for projects that make executables, and the other for projects that make libraries. This is because several properties have to agree with each other, and their values differ between the two kinds of project.

Here’s a sample property sheet for libraries:

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <OutDir>$(SolutionDir)Build\$(Platform)-$(Configuration)\Lib\</OutDir>
    <IntDir>$(SolutionDir)Build\$(Platform)-$(Configuration)\Obj\$(ProjectName)\</IntDir>
  </PropertyGroup>
  <ItemDefinitionGroup>
    <Lib>
      <TargetPath>$(SolutionDir)Build\$(Platform)-$(Configuration)\Lib\</TargetPath>
      <OutputFile>$(SolutionDir)Build\$(Platform)-$(Configuration)\Lib\$(TargetName)$(TargetExt)</OutputFile>
    </Lib>
  </ItemDefinitionGroup>
</Project>

And here’s a sample property sheet for executables:

<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <OutDir>$(SolutionDir)Build\$(Platform)-$(Configuration)\</OutDir>
    <IntDir>$(SolutionDir)Build\$(Platform)-$(Configuration)\Obj\$(ProjectName)\</IntDir>
  </PropertyGroup>
</Project>

It’s possible that I have some redundancy here, because I don’t completely understand the interaction between $(OutDir), $(TargetPath) and $(OutputFile). I should also define them in terms of a base path for better readability, something like $(BuildBase).
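As a rough sketch of that last idea, here is the executable property sheet rewritten around a base path. BuildBase is a hypothetical user-defined property (it is not a built-in Visual Studio macro), and this assumes the property is defined before the properties that use it:

```xml
<?xml version="1.0" encoding="utf-8"?>
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- BuildBase is a made-up name for illustration; define it first so later properties can use it -->
  <PropertyGroup Label="UserMacros">
    <BuildBase>$(SolutionDir)Build\$(Platform)-$(Configuration)\</BuildBase>
  </PropertyGroup>
  <PropertyGroup>
    <!-- Executables land directly in the target folder -->
    <OutDir>$(BuildBase)</OutDir>
    <!-- Object files get one sub-folder per project -->
    <IntDir>$(BuildBase)Obj\$(ProjectName)\</IntDir>
  </PropertyGroup>
</Project>
```

The library sheet would presumably differ only in using $(BuildBase)Lib\ for its output paths, which at least keeps the repeated part of the path in one place.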

Now, if only the other Visual Studio artifacts like *.suo, *.sdf, and *.user could be tucked away into a build folder like that. The ideal would be that “get from source and build” does not litter your working directory with artifact files all over the place, but that they are in one place.

Even more ideally, the build output folder would not be inside the source working folder. That’s probably too much to ask, given both existing practices and existing tools. But the very existence of an “ignore list” is a result of such scattering of temp files. And ignore lists are bad, because they hide things, and they are usually based on regular expressions against names, which can have unwanted side effects (e.g. an ignore on “Debug” as a temp folder name, but then an un-ignore needed if you have a folder named “Debug” in source control).

A build layout

Suggestion for Visual Studio builds – and generalized to all builds, including cross-platform builds into the same folder.

Build/
  Obj/
    Win32-Debug/
      main.obj
    x64-Release/
      main.obj
  Win32-Debug/
    program.exe
    program.pdb
  x64-Release/
    program.exe
    program.pdb

This puts all build artifacts in a single folder named Build, with object files separated from final build files.

Alternatively, the build folder could have Obj directories per target, like this:

Build/
  Win32-Debug/
    Obj/
      main.obj
    program.exe
    program.pdb
  x64-Release/
    Obj/
      main.obj
    program.exe
    program.pdb

This is probably a little more logical, except that it will give someone the impression that you could clean a single target easily. “Real” projects often have some pieces compiled in debug and others compiled in release, or with optimizations.

The main idea is that object files and binaries get unique paths based on build settings, including platform and target, but also perhaps including smaller subdivisions like optimization settings.

The main downside is that even a simple project requires several levels of folders to get to the binary, and that any test data would either need to be duplicated or located by some more complicated means than “next to the executable”. When launched from Visual Studio, the working directory defaults to the project folder ($(ProjectDir)), but when run directly (for example, by double-clicking in Explorer), the working directory defaults to the executable location. There is no obvious easy answer here.

Making this work in Visual Studio involves setting OutDir and IntDir variables like so:

OutDir: $(SolutionDir)Build\$(Platform)-$(Configuration)\
IntDir: $(SolutionDir)Build\Obj\$(Platform)-$(Configuration)\

and this can be done in the IDE (where, in English, OutDir is named “Output Directory” and IntDir is named “Intermediate Directory”) or directly in the vcxproj files.

  <PropertyGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
    <LinkIncremental>true</LinkIncremental>
    <OutDir>$(SolutionDir)Build\$(Platform)-$(Configuration)\</OutDir>
    <IntDir>$(SolutionDir)Build\Obj\$(Platform)-$(Configuration)\</IntDir>
  </PropertyGroup>
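Since both values are built entirely from macros, the same pair should not need to be repeated for every configuration. A minimal sketch, assuming the group sits at the same point in the vcxproj as the conditioned groups and that LinkIncremental stays in its own per-configuration group:

```xml
<!-- Sketch: no Condition attribute, so this group applies to every configuration and platform -->
<PropertyGroup>
  <OutDir>$(SolutionDir)Build\$(Platform)-$(Configuration)\</OutDir>
  <IntDir>$(SolutionDir)Build\Obj\$(Platform)-$(Configuration)\</IntDir>
</PropertyGroup>
```

The $(Platform) and $(Configuration) macros expand differently for each build, so one group covers Win32-Debug, x64-Release and anything added later.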

MD5 implementations

I’m going to try to collect all the open-source MD5 implementations that I know of in this post. I’ll extend this to all useful cryptographic hash functions. I’m going to supply them directly as links here (for posterity’s sake), but also include links to the original sources. One reason to do this is to find or create a really fast one; all of the ones I’ve seen so far are reasonable, but not speedy.

MD5 Homepage (unofficial) is Mordechai Abzug’s collection of MD5 information.

1991, RSA (Ron Rivest)

RFC 1321: The MD5 Message-Digest Algorithm

Ron Rivest and RSA released MD5 in an open-source fashion in 1992 in the appendix of RFC 1321, along with test vectors. The interface used in the RFC source (a context structure plus init, update and finalize functions) has been followed pretty faithfully ever since.

1993, Colin Plumb

MD5 Command Line Message Digest Utility

Colin Plumb wrote the first public-domain implementation of the MD5 algorithm in 1993, and his code was “hacked slightly by John Walker to turn it back into K&R”. This might need to be credited to the combination of Colin Plumb, Branko Lankester, Ian Jackson and Galen Hazelwood.

1999, L. Peter Deutsch/Aladdin

libmd5-rfc

L. Peter Deutsch also wrote his own version of MD5 from RFC 1321, in order to have something with a more BSD-like license.

2001, Alexander Peslyak

A portable, fast, and free implementation of the MD5 Message-Digest Algorithm (RFC 1321)

Alexander Peslyak (posting as Solar Designer, solar@openwall.com) wrote an OpenSSL-compatible version of MD5 in 2001.

2012, Nayuki Minase

Fast MD5 hash implementation in x86 assembly

Nayuki Minase optimized an MD5 implementation for speed. The assembly version is 10% faster than the C version, but the impressive part is that the C version runs at 390 MiB/sec, just short of OpenSSL’s speed of 410 MiB/sec. The speeds reported are on a Core 2 Quad Q6600 with a 2.4 GHz clock speed, but the limiting factor might be memory access speed, not the CPU itself.

The code is specifically not open-source, although the source code is available, so it’s really more of a data point, and not something that can be used (without licensing).

Django REST framework

There is an updated toolkit for Django called “Django REST framework”.

http://django-rest-framework.org/

This is worth reading through and exploring even if you are not a Django developer, because at first glance it’s well thought out and well documented. In case you don’t know, Django is a web application framework written in Python. Django REST framework is a library on top of Django that makes it easier to build Web APIs. And of course, good Web APIs are REST APIs.

If, after further reading of my own, I recant this opinion, I’ll come back and update this post.

Build systems

Google “Build in the cloud” – http://google-engtools.blogspot.com/2011/08/build-in-cloud-how-build-system-works.html

Google has a FUSE filesystem that keeps track of file digests as an attribute. This is something that needs to be the default in all filesystems. http://google-engtools.blogspot.com/2011/06/build-in-cloud-accessing-source-code.html. Also, they build everything from head.

http://www.cs.virginia.edu/~dww4s/articles/build_systems.html

Revision control systems

This is more about the theory and implementation of revision control systems, and not really about their use. I’m interested in how various concepts came into being and how they evolved.

One early system (that died quite thoroughly, evidently it never got used much) was OpenCM, billed as a “secure and safe CVS replacement”. I found a copy of a user’s manual from 2002 that describes some of the concepts (I’ll see if I can mirror it locally before it too disappears into obscurity).

OpenCM User’s Guide

It hit upon the idea of giving each file a universal name, but it does so by generating “random” names based on your machine name, so it loses some of the benefit of doing content-based names (like using the SHA-1 of the file contents as Monotone, Git and Mercurial do).

Here’s a snapshot of what existed in version-control land back in 2007, at least as far as open source went.

Appendix A. Free Version Control Systems from Producing Open Source Software by Karl Fogel.

Here’s a slideshow history of revision control: http://www.win.tue.nl/~aserebre/2IS55/2009-2010/stijn.pdf. Well, somewhat – it skips 90% of revision control systems and only talks about the open-source ones. This may be appropriate, since there hasn’t been a lot of cross-pollination from the closed-source revision-control systems.

Monotone was created by Graydon Hoare, and its first release came out on April 6, 2003 (according to LWN and Wikipedia). Monotone was rejected by Linus Torvalds as being too slow, and this led directly to the creation of Git.

Veracity is a distributed version control system from Eric Sink’s company, SourceGear. http://veracity-scm.com/

Petr Baudis’ Bachelor Thesis (2008) was on Current Concepts in Version Control Systems. He contributed a lot to Git development, starting days after Linus Torvalds’ first release by building a front end for Git (git-pasky, later Cogito, some of which was then folded into the Git core).