Dan Wood: The Eponymous Weblog (Archives)

Dan Wood is co-owner of Karelia Software, creating programs for the Macintosh computer. He is the father of two kids, lives in the Bay Area of California USA, and prefers bicycles to cars. This site is his older weblog, which mostly covers geeky topics like Macs and Mac Programming.

Useful Tidbits and Egotistical Musings from Dan Wood


Wed, 13 Feb 2008

Filtering out junk in a FileMerge compare

Today I had the unfortunate occasion to use FileMerge to compare a lot of source code files. These files had been checked into our Subversion repository in separate branches, and we were having trouble merging, so we wound up having to deal with this by hand.

Almost immediately I noticed that just about all the source files were marked as different from each other, even though we hadn't made any changes. What was up with that?


Well, it turns out that FileMerge was (correctly) noticing the differences in the special text that is substituted for source repository information. Something like this:

// $Id: DotMacConnection.m 597 2005-03-03 23:43:39Z ttalbot $
// $Date:$

Fortunately, I was able to dig up a script that I had found and adapted in the past for CVS; I updated it trivially for Subversion by changing its name! It's a Perl script that reads a file and converts patterns like the above culprits to a non-substituted version, sending the resulting text to standard output. The above lines would be substituted with this:

// $Id$
// $Date:$
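For readers who would rather not touch Perl, here is a rough Python sketch of the same idea. It is not the svnclear.pl script itself; the keyword list and the function name collapse_keywords are my own assumptions, so adjust them to match your repository. Note that this sketch also collapses the empty form `$Date:$` to `$Date$`, which differs slightly from the output shown above.

```python
#!/usr/bin/env python3
"""Collapse expanded Subversion/CVS keywords, e.g. "$Id: Foo.m 597 ... $" becomes "$Id$"."""
import re
import sys

# Keywords to collapse. This list is an assumption; extend it as needed.
KEYWORDS = (b"Id", b"Date", b"Revision", b"Rev", b"Author", b"HeadURL", b"URL",
            b"Header", b"LastChangedDate", b"LastChangedRevision", b"LastChangedBy")

# Matches "$Keyword:" followed by anything except "$" or a line break, then "$".
EXPANDED = re.compile(rb"\$(" + b"|".join(KEYWORDS) + rb"):[^$\r\n]*\$")


def collapse_keywords(data):
    """Return the bytes with every expanded keyword reduced to "$Keyword$"."""
    return EXPANDED.sub(rb"$\1$", data)


if __name__ == "__main__" and len(sys.argv) == 2:
    # The file to filter is expected as the single command-line argument.
    # Work on raw bytes so that files in any text encoding pass through intact.
    with open(sys.argv[1], "rb") as handle:
        sys.stdout.buffer.write(collapse_keywords(handle.read()))
```

Saved under a name of your choosing (say, a hypothetical svnclear.py) and made executable, it would be invoked with the file to filter as its only argument and would write the filtered text to standard output.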

FileMerge has a nifty facility for filtering (processing) files that it compares. This is handy for dealing with non-textual files and packages such as nibs, RTF, and so forth, but why not use it for text as well? So with the above script (in my case, installed in the "bin" directory within my home directory), I just needed to add two filters in FileMerge's preferences:

Screenshot: the FileMerge preferences dialog with two new entries, for the 'h' and 'm' extensions, each with Filter set to '$(HOME)/bin/svnclear.pl $(FILE)', Display set to 'Original', and Apply set to 'Yes'.

Here is the svnclear.pl script; put it wherever you keep your command line scripts, be sure to set it to be executable, and make sure that your FileMerge preferences point to its installed location. Oh, and be sure to include the $(FILE) argument, or all your source files will appear identical because you will be comparing identical error messages! Also, the "Display" column should show "Original", not "Filtered"; this will allow the merged files to be saved.

With this script installed, files with different '$' tags are considered equal, which is exactly what I wanted. Imagine extending this even further, and coming up with other text processing scripts to help almost-equal files be considered equal, for instance, removing comments. The possibilities are nearly endless.
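As one illustration of the comment-removal idea, here is a minimal Python sketch of a filter that strips `//` and `/* ... */` comments from C-style source while leaving string and character literals alone. The function name strip_comments is hypothetical, and this is only a sketch: it does not handle every corner of the language (trigraphs, for example), and it leaves behind whatever whitespace preceded a comment.

```python
#!/usr/bin/env python3
"""Strip // and /* */ comments from C-style source, leaving literals untouched."""
import sys


def strip_comments(source):
    out = []
    i, n = 0, len(source)
    while i < n:
        ch = source[i]
        if ch in "\"'":
            # Copy a string or character literal verbatim, honoring backslash
            # escapes; give up at end of line if the literal is unterminated.
            j = i + 1
            while j < n and source[j] != ch and source[j] != "\n":
                j += 2 if source[j] == "\\" else 1
            out.append(source[i:j + 1])
            i = j + 1
        elif source.startswith("//", i):
            # Line comment: skip up to, but not including, the newline.
            j = source.find("\n", i)
            i = n if j == -1 else j
        elif source.startswith("/*", i):
            # Block comment: skip through the closing delimiter.
            j = source.find("*/", i + 2)
            i = n if j == -1 else j + 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Decode as Latin-1 so every byte round-trips unchanged, whatever the
    # file's real encoding is.
    with open(sys.argv[1], "rb") as handle:
        text = handle.read().decode("latin-1")
    sys.stdout.buffer.write(strip_comments(text).encode("latin-1"))
```

Because the filtered text is only used for the comparison (with "Display" set to "Original"), the comments themselves would still be there in whatever you save.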

BTW, please no comments or questions about Perl! I don't really know much about Perl!