Hoist by My Own Petard

The former incarnation of this website was a homegrown godawful mess of a CMS that I cobbled together using PHP and MySQL. It did a so-so job of separating content from presentation.

Not that I bothered to get a backup of the actual MySQL database before abandoning the site. I think it was one of those things that got lost in the shuffle of preparations for the move back east.

I did manage to grab a more-or-less complete mirror of the old site with wget, which let me brute-force my way through the old static HTML files (thankfully generated from a somewhat thoughtfully commented template) and kluge the contents into Movable Type's import format.

I took a closer look at my archive files tonight and realized that all of the old photo pages are intact as well. The gears started turning, and here it is 1:21 AM and I'm up to my elbows in a procedural PHP script to process the old pages into something I can reincorporate into my new site without too much trouble. My plan of attack was something like this:

  1. Find a text pattern distinguishing the photo pages and retrieve a comprehensive list using grep
  2. For each file in that list, read the contents into memory and use regexes to extract the title, photo filename, and description data
  3. Read the EXIF data from the photo file to get the photo date and time
  4. Generate thumbnails using GD
  5. Assemble the resulting data into a Movable Type import file
  6. Upload all the photos and thumbnails
  7. Run the import
  8. Rebuild and revel in how L337 I am
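
For the curious, here is a rough sketch of steps 1, 2, and 5. It is written in Python rather than the PHP I was actually hacking on, and it is not the real script. The marker string, the comment-delimited title and description patterns, and the function names are all hypothetical stand-ins, since the real template's comments aren't reproduced here. The entry layout follows Movable Type's plain-text import format as I understand it: metadata lines, a five-dash separator after each section, and an eight-dash line closing the entry.

```python
import re
from pathlib import Path

# Hypothetical marker and patterns: stand-ins for whatever the real
# template comments looked like.
PHOTO_MARKER = "<!-- photo page -->"
TITLE_RE = re.compile(r"<!-- title -->(.*?)<!-- /title -->", re.S)
PHOTO_RE = re.compile(r'<img[^>]*\bsrc="([^"]+\.jpe?g)"', re.I)
DESC_RE = re.compile(r"<!-- description -->(.*?)<!-- /description -->", re.S)


def find_photo_pages(root):
    """Step 1: list the HTML files under root that contain the marker."""
    pages = []
    for path in sorted(Path(root).rglob("*.html")):
        if PHOTO_MARKER in path.read_text(encoding="utf-8", errors="replace"):
            pages.append(path)
    return pages


def extract_fields(html):
    """Step 2: pull title, photo filename, and description out of one page."""
    title = TITLE_RE.search(html)
    photo = PHOTO_RE.search(html)
    desc = DESC_RE.search(html)
    if not (title and photo and desc):
        return None  # page does not match the assumed template
    return {
        "title": title.group(1).strip(),
        "photo": photo.group(1).rsplit("/", 1)[-1],  # keep the bare filename
        "description": desc.group(1).strip(),
    }


def to_mt_entry(fields, date_str):
    """Step 5: format one entry for a Movable Type import file."""
    body = '<img src="{photo}" alt="{title}" />\n{description}'.format(**fields)
    return (
        "TITLE: {}\n"
        "DATE: {}\n"
        "-----\n"
        "BODY:\n"
        "{}\n"
        "-----\n"
        "--------\n"
    ).format(fields["title"], date_str, body)
```

Concatenating the output of to_mt_entry for every page returned by find_photo_pages would give the import file; date_str is expected in MM/DD/YYYY hh:mm:ss AM/PM form.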

The only catch is that the FileDateTime info in the headers was set to the date on which I copied the photos to my desktop from whatever old CD-ROM I found them on, not the dates the photos were actually taken.

Fortunately, the dates are contained in the photo pages themselves - a bit more regex hacking will take care of that. I don't have the times, so by miraculous coincidence each one will appear to have been taken at 12:00 PM precisely.
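
The date fallback might look something like the sketch below, again in Python rather than the PHP I was writing. The "Taken: June 15, 2003" wording is a made-up example of how a page might show its date, and photo_date_for_mt is a hypothetical name; the real pages would need their own pattern.

```python
import re
from datetime import datetime

# Hypothetical: assumes each page shows its date as "Taken: June 15, 2003".
DATE_RE = re.compile(r"Taken:\s*([A-Z][a-z]+ \d{1,2}, \d{4})")


def photo_date_for_mt(html):
    """Return the page's date as an MT-style timestamp pinned to noon."""
    match = DATE_RE.search(html)
    if match is None:
        return None  # no date on this page
    # %B parses English month names under Python's default "C" locale.
    day = datetime.strptime(match.group(1), "%B %d, %Y").date()
    # No time of day is available, so every photo gets 12:00:00 PM.
    return day.strftime("%m/%d/%Y") + " 12:00:00 PM"
```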

Some of the thumbnails are missing, so I'm now in the process of downloading the ImageMagick binaries for Darwin... I could probably put together a PHP script to do it with GD in the time the download will take, but I don't feel like burning the mental energy.
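
Filling in the missing thumbnails with ImageMagick could be sketched roughly as follows, in Python for illustration only. The "_thumb" naming convention, the 150x150 size, and the function names are assumptions of mine; the one real piece is ImageMagick's convert command with its -thumbnail option, which has to be on the PATH for the last function to work.

```python
import os
import subprocess

THUMB_SUFFIX = "_thumb"  # hypothetical naming: photo.jpg -> photo_thumb.jpg


def thumb_name(photo_name):
    """Map a photo filename to its assumed thumbnail filename."""
    stem, ext = os.path.splitext(photo_name)
    return stem + THUMB_SUFFIX + ext


def missing_thumbnails(photo_names, thumb_names):
    """Return the photos (sorted) that have no matching thumbnail yet."""
    existing = set(thumb_names)
    return [n for n in sorted(photo_names) if thumb_name(n) not in existing]


def thumbnail_command(src, dest, size="150x150"):
    """Build the ImageMagick command that scales src down to fit size."""
    return ["convert", src, "-thumbnail", size, dest]


def make_missing_thumbnails(photo_dir, thumb_dir):
    """Run convert once for every photo that lacks a thumbnail."""
    photos = [n for n in os.listdir(photo_dir)
              if n.lower().endswith((".jpg", ".jpeg"))]
    for name in missing_thumbnails(photos, os.listdir(thumb_dir)):
        cmd = thumbnail_command(os.path.join(photo_dir, name),
                                os.path.join(thumb_dir, thumb_name(name)))
        subprocess.run(cmd, check=True)  # raises if convert fails
```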

The biggest problem of the whole process will probably be uploading all these photos to the server.

Contact

Andy Chase
(978) 297-6402
andychase [at] gmail.com