"SCRUZIA" ... a very infrequently updated weblog ...


A small metadata adventure

My nephew Ben, his friend Chance, and I went on a backpacking trip in mid-July (7/12 through 7/16), from Twin Lakes (northeast of Bridgeport) into Yosemite. Here are the pictures from the Twin Lakes trip:
http://picasaweb.google.com/dlandauer/IgLTwinLakes The actual trip report will be in another posting. [Linked here]

I had a small problem to solve once we got back: Ben had one of my cameras, and I had another. Each one records a "creation time" in the JPEG file's metadata, but the two cameras' internal clocks were not synchronized, and I wanted to upload the photos to Picasa in chronological order. The rest of this weblog entry isn't really about the backpacking trip (read the pictures' captions for that, or the forthcoming trip report), but more about how I solved the time-synchronization problem.

[This turns out to be a problem others have run into, and there's a Linux program called "jhead" that can do this kind of date manipulation. But writing the small Python program was quick and somewhat entertaining...]

I did some bulk file renaming to make the filenames lowercase, remove some leading zeros, and make the camera names (Lumix and Sony) clearer. I run Mac OS X, which has a command called "sips" (the "scriptable image processing system"); I used it to extract the recorded time for each photo, and wrote each time and its corresponding filename to a text file. After another couple of global substitutions, I had a Python list containing the filenames and their nominal times, like this:

pix_and_times = [
  [ 'lmx50553.jpg', '2009:07:10 07:06:45' ],
  [ 'lmx50555.jpg', '2009:07:11 12:15:30' ],
  [ 'lmx50795.jpg', '2009:07:16 16:08:07' ],
  [ 'lmx50796.jpg', '2009:07:16 16:50:09' ],

  [ 'sony6308.jpg', '2009:07:12 05:36:12' ],
  [ 'sony6309.jpg', '2009:07:12 05:36:22' ],
  [ 'sony6524.jpg', '2009:07:16 06:06:31' ],
  [ 'sony6528.jpg', '2009:07:16 06:07:48' ],
  [ 'sony6529.jpg', '2009:07:16 06:08:25' ]
]
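I built that list with sips and some editor substitutions, but the same step could be done in Python. Here is a minimal sketch (Python 3), assuming the output of `sips -g creation` is a file path on one line followed by an indented `creation:` line. That layout, the sample paths, and the `parse_sips_output` name are my assumptions for illustration, so check them against what your sips actually prints.

```python
import re

# Hypothetical sample of `sips -g creation` output. The two-line layout and
# the /Users/me/trip paths are assumptions, not copied from a real run.
SAMPLE = """\
/Users/me/trip/lmx50553.jpg
  creation: 2009:07:10 07:06:45
/Users/me/trip/sony6308.jpg
  creation: 2009:07:12 05:36:12
"""

def parse_sips_output(text):
    """Return [[filename, timestamp], ...] from path/creation line pairs."""
    pairs = []
    current = None
    for line in text.splitlines():
        match = re.match(r"\s+creation:\s+(\S+ \S+)\s*$", line)
        if match and current is not None:
            # Indented "creation:" line: pair it with the path seen just before.
            pairs.append([current, match.group(1)])
            current = None
        elif line.strip() and not line[0].isspace():
            # An unindented line is taken to be a file path; keep its basename.
            current = line.strip().rsplit("/", 1)[-1]
    return pairs

pix_and_times = parse_sips_output(SAMPLE)
```

The result has the same shape as the hand-built list: one `[filename, timestamp]` pair per photo.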

The next question was, when and where did we take any pictures that I knew were from around the same time? What I came up with was that on our last hiking day, in Kerrick Meadow, we saw a pair of Mountain Chickadee chicks, running around on a muddy streambed, and I knew we had both taken pictures of them. So I found a couple of the corresponding photos (the Lumix chick and the Sony chick) and figured out that their timestamps were 12 hours and 23 minutes apart.
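The offset itself is just a subtraction of two timestamps. A small sketch (Python 3) of that arithmetic follows. The two chick timestamps below are made up to give the 12:23 difference, since the real ones aren't listed in this post, and which camera was ahead is likewise an assumption.

```python
from datetime import datetime

# Timestamp layout used in the photo metadata, e.g. '2009:07:16 06:08:25'.
EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"

def clock_offset(reference_stamp, other_stamp):
    """Return (reference - other) as a timedelta."""
    reference = datetime.strptime(reference_stamp, EXIF_FORMAT)
    other = datetime.strptime(other_stamp, EXIF_FORMAT)
    return reference - other

# Hypothetical timestamps for the two chick photos (not the real values).
lumix_chick = "2009:07:15 21:40:10"
sony_chick = "2009:07:15 09:17:10"

offset = clock_offset(lumix_chick, sony_chick)
print(offset)  # prints 12:23:00
```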

Armed with that knowledge, I wrote 50-some lines of Python that renamed the files so that their filenames, sorted alphabetically, would also be in correct chronological order. The guts of the code:

  # Make an object for each photo
  pixobs = map(Pixob, pix_and_times)

  # Parse the dates of the photos, and adjust my notion of the creation
  # time for only the sony ones.
  for ob in pixobs:

  # Sort by adjusted date
  pixobs.sort( date_cmp )

  # Print out a "mv" command to rename each file with a name
  # that will sort the way we want.
  for nr, ob in enumerate(pixobs):
    print 'mv ' + ob.fn + ' ' + aatrans(nr) + ob.fn
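
The excerpt above leaves out the Pixob class and the date adjustment. Here is a self-contained sketch of how those pieces might look in Python 3 (where `sort` takes a `key` rather than a comparison function). The `when` attribute, the `sorted_pixobs` helper, and the direction of the offset (Sony behind Lumix) are assumptions for illustration, not the original code.

```python
from datetime import datetime, timedelta

EXIF_FORMAT = "%Y:%m:%d %H:%M:%S"

# Assumption: the Sony clock ran 12h23m behind the Lumix. Flip the sign if
# it was the other way around.
SONY_OFFSET = timedelta(hours=12, minutes=23)

class Pixob:
    """One photo: its filename (fn) and its adjusted creation time (when)."""
    def __init__(self, pair):
        self.fn, stamp = pair
        self.when = datetime.strptime(stamp, EXIF_FORMAT)
        # Only the Sony photos are shifted onto the Lumix clock.
        if self.fn.startswith("sony"):
            self.when += SONY_OFFSET

def sorted_pixobs(pix_and_times):
    """Build a Pixob per photo and sort them by adjusted time."""
    return sorted(map(Pixob, pix_and_times), key=lambda ob: ob.when)

# Three rows taken from the list earlier in this post.
sample = [
    ["lmx50795.jpg", "2009:07:16 16:08:07"],
    ["sony6524.jpg", "2009:07:16 06:06:31"],
    ["lmx50796.jpg", "2009:07:16 16:50:09"],
]
ordered = [ob.fn for ob in sorted_pixobs(sample)]
```

Under that assumed offset the Sony photo lands after both Lumix photos, even though its nominal timestamp is the earliest of the three.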

The aatrans function translates an integer between 0 and 675 into 'aa', 'ab', 'ac', ... 'zx', 'zy', 'zz':

  def aatrans ( nr ) :
    qq = nr // 26
    rr = nr % 26
    return atoz[qq] + atoz[rr]
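
For anyone who wants to try the prefix trick on its own, here is a standalone Python 3 sketch. It assumes the alphabet string is simply the lowercase letters, and the range check is my addition: with only two letters there are 26 * 26 = 676 prefixes, so the scheme works for at most 676 photos.

```python
import string

# Assumed definition of the alphabet lookup string.
ATOZ = string.ascii_lowercase

def aatrans(nr):
    """Map 0..675 to 'aa'..'zz', i.e. a two-digit base-26 number."""
    if not 0 <= nr < 26 * 26:
        raise ValueError("only 676 two-letter prefixes are available")
    return ATOZ[nr // 26] + ATOZ[nr % 26]

prefixes = [aatrans(n) for n in range(26 * 26)]
```

Because every prefix has the same width, sorting the prefixed filenames alphabetically gives the same order as sorting by the original integer.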

Finally, I ran the shell on that output, resulting in files with names like these interleaved ones, from somewhere in the middle:

(At some point, I'll figure out how to put the Python source code somewhere on the web. Note that it doesn't include the bulk renames and sips data extraction.)

My favorite part of this is how well the time-alignment was demonstrated. After aligning the photos via the chicks, I went back and looked at this sequence of photos taken as we were climbing the snow and rocks towards Burro Pass. I took a photo of Ben, way up on the rocks above us, then I zoomed way in and took another one, nine seconds later. In between those two shots, the sorting program placed the picture that Ben took, of me taking one of those pictures! Here's that sequence: