Dan Wood: The Eponymous Weblog (Archives)

Dan Wood is co-owner of Karelia Software, creating programs for the Macintosh computer. He is the father of two kids, lives in the Bay Area of California, USA, and prefers bicycles to cars. This site is his older weblog, which mostly covers geeky topics like Macs and Mac programming. Go visit the current blog here.

Useful Tidbits and Egotistical Musings from Dan Wood


Wed, 20 Dec 2006

iTunes 7: turning some artists to "Various Artists"?

Is this a bug in iTunes 7? Anybody else having this happen?

I've noticed that many albums in my library have lost their "Artist" value, being replaced with "Various Artists". So what used to be albums of songs now are split into multiple albums, or in the case here, just two songs of an album have lost their artist:

[iTunes screenshot]

Karelia's technique for building "Apple Help" files

One of the cool new features of Sandvox 1.1 is that we now have an Apple Help book.

Before, choosing "Sandvox Help" from the Help menu would just direct the user to view our online wiki. (Some people have wondered why we didn't choose Sandvox to build our help. The answer is that although it is technically possible, Sandvox just isn't the right tool for that kind of content.) We had originally chosen a wiki (MediaWiki) as the authoring system for our help because it is good at managing content: just by creating a link to a new page, that page shows up on a list of orphan pages, so there's an automatic to-do list as you start authoring pages. It's pretty easy to create a site without links to nonexistent pages this way. During the early stages, we were also able to rely on some user contributions to the site. As we started dedicating some resources to building the help, we were able to edit the site "live" so that the help got better and better as time went by.

At some point, we decided that we were ready to start documenting features that were not part of the current 1.0.x release. To do that, we made a clone of the wiki and put it on a new subdomain, where only the official authors of the site (myself, Terrence, and Mike) could view and edit it. A private wiki seems kind of strange, but we were documenting unreleased software, and I didn't want to confuse users of the released version.

Many users were probably more confused by being directed to a wiki than helped by it. There is a lot of extraneous information on a MediaWiki page that really wasn't needed. So I decided to work on exporting the wiki to a simpler page look.

I built up some shell scripts to automate the process. The first step is to download the wiki, as rendered as HTML, onto the local computer. The second step, the bulk of the work, is to clean up the HTML so that only the essential content remains. The third step just merges the cleaned-up HTML into our source tree so it will be part of the application.

Getting the HTML, the first step, is fairly brute-force. There might be some cleverer way to extract information from the MediaWiki database, but I wanted something simple. It boils down to a single wget command (simplified here just a bit):

wget --domains=private_domain --level=2 \
	--no-parent --convert-links --html-extension \
	--recursive --reject "*\?*" \
	http://private_domain/Special:Allpages

wget is a great little utility; it's too bad it's not included with Mac OS X by default. One thing that I find annoying is that it seems to match the reject patterns after downloading a URL, rather than before. So it uses up a lot of time and bandwidth on the many special links on a MediaWiki page that I don't actually want. The download takes a while, but at least it works for putting a static copy of the main pages of the wiki onto my local system, ready for processing.

The second step is just a big shell script that operates on the files. It performs the following tasks:

  1. Copy the downloaded wiki into a new directory for editing the files (so that I don't have to re-download the original files if my script isn't quite right)
  2. Remove certain pages and directories that I don't want for the user documentation (developer & designer pages, MediaWiki "meta" pages, the top-level pages that are replaced in the Apple Help pages, etc.)
  3. Loop through all the pages and build up an index page, properly taking redirect pages into account
  4. Repair all links to "redirect" pages to point to the proper target pages
  5. Using the sips utility, try converting the PNG files from the wiki to JPEG 2000 format. Only the files that are actually smaller are kept as JP2. (We also have a hand-maintained blacklist of files to exclude from this process because some of the images look terrible when converted to JP2.) Overall, this technique shrinks the images from 8.4 megabytes to 4.3 megabytes!
  6. The remaining PNG files are run through optipng to shrink them down as much as possible. This shrinks the files down just a bit more.
  7. Using perl -pi -e 's/source/destination/g' ..., do a bunch of substitutions on the .html files to remove the junk we don't want, move keywords that we have explicitly defined at the bottom of each page into the <meta> tags, etc. There are actually a lot of sub-steps here that I won't get into, but I will note that in order for this to really work, I first changed \n to \r so that it would essentially treat the file as a single line of text. I just couldn't get perl to do its substitution across \n line breaks.
  8. Run tidy on the pages so the HTML is readable
  9. Do a check for dead links.

The final step merges the edited files into our source tree, taking care not to clobber subversion tags; it also leaves alone the hand-built, static pages (such as the initial page and "Discover Sandvox"). With the CSS there, the final web site looks and behaves a lot like many of the other help books that come from Apple.
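
A merge like that can be sketched in a few lines of shell. This is an illustration under assumptions, not the actual script: it assumes "subversion tags" means the .svn metadata directories, and the function name merge_help, the KEEP variable, and the file names in it are all hypothetical.

```shell
#!/bin/sh
# Hypothetical list of hand-built pages (paths relative to the help folder)
# that must never be overwritten by generated files.
KEEP="index.html discover.html"

# merge_help SRC DEST: copy generated files from SRC into DEST.
# Nothing in DEST is ever deleted, so existing .svn directories survive.
merge_help() {
    src=$1
    dest=$2
    # List regular files in SRC, ignoring anything inside a .svn directory.
    ( cd "$src" && find . -type f ! -path '*/.svn/*' ) | while IFS= read -r f; do
        rel=${f#./}
        # Skip pages that are maintained by hand in the destination.
        case " $KEEP " in
            *" $rel "*) continue ;;
        esac
        mkdir -p "$dest/$(dirname "$rel")"
        cp "$src/$rel" "$dest/$rel"
    done
}

# Example call (paths are placeholders):
#   merge_help cleaned_html Sandvox/Help
```

Because the function only ever adds or overwrites individual files, the version-control metadata and the hand-built pages in the destination are left exactly as they were.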

I've replicated this process on our own server too, so that the website docs.karelia.com also contains our help pages. There are a few differences with the home page, and we don't remove the special categories such as the Developer and Designer pages. This allows us to link to specific pages on the web when communicating with Sandvox users who need help. (Now if Google would just index docs.karelia.com, that website would be searchable as well!)

On the pages for Apple Help, the trickiest part was getting everything just right so that Apple Help would do the right things: list Sandvox in its list of applications, show the correct initial page, be searchable, and so forth. The documentation for Apple Help is sorely lacking (actually, this "preliminary" document from 2004 is probably the most useful), but their mailing list fills in the gaps.

OK, this post had nothing to do with Cocoa. Sosumi.

"The Luxurious Flower Shedding"

I had to share this with the world. My mom recently bought this water pitcher with the most bizarre label on it:

Product this peculiar overlengthy groove is it design to surface, can stretch reach flowering shrubs preventing rivers from overflow effective when watering flowers.

[Image: the luxurious flower shedding]

Full size scan @ flickr: The Luxurious Flower Shedding

Nintendo Day & Real Shooting...

Today I kept reading blog posts and noticing iChat status messages from people I know who are so excited about the Nintendo Wii (pronounced "why"; I don't care for computer gaming) ...

... But this story was certainly a sobering take on the phenomenon.

"Oh, you're buying those Nintendo tapes? You've got those ones that are all war.. I've been there. I've seen it." A few people were perplexed, but still curious about what the guy had to say.
...
"You remember that, when you're playing your little tapes," the man said as he gestured playing around with a controller, "you remember that there are people really doing that. They're shooting and they're getting shot at."
...
Not only was it a moment of intense frustration, but also introspection - I was sitting in line for a f[...] Nintendo while there are people dying for no reason. I'm programmed to buy the latest crap just because it's the latest crap, and play games that mock the reality of the horrifying environment of war. You can try your best to change the status quo sonny, but it's not gonna work - so fire up your Sony NintendoBox 2000 and shut the hell up.

Read the whole story about the early-morning encounter at the mall.