This Site Is Such a Hack

Hacking in a suite at clarion
Photo © 2010 Johan Nilsson CC BY-NC 2.0

I’ve been wanting to set up a software-development blog for some time. And for some time I’ve been wrestling with the pains of managing multiple WordPress blogs. Back in the day, when I had just my writing blog and a LiveJournal, it wasn’t such a big deal. Since then, I’ve retired the LiveJournal—sorry, LJ friends—and have set up a number of other WordPress sites, for different niches in which I write. In particular, I have a personal blog, a political blog (because I decided I should not pollute my personal blog with political posts), and now a software-development blog (which you are currently reading).

When I spun off the political blog, I got a taste of what it meant to migrate posts from one WordPress instance to another. And my experience with this SD blog was ten times worse, because there were ten times as many posts involved.

The short version of the story: WordPress is broken, and it will always be broken.

The long version of the story… is a little more of an acerbic joke.

The irony is that if WordPress were at least designed well, I could have fixed it (after all, I am a developer myself), at least well enough for my purposes. In fact, many months ago, when I first started thinking about the set of problems I face, fixing WordPress was the first possibility I considered. No chance of that ever becoming a reality.

What I really want is to manage all my posts, across all the sites in my web universe, through a single, unified admin interface. I want to select which post is displayed on which site, and then have all the other sites respond with a 301 redirect if someone tries to access the post through the wrong URL. Similar to what Lineage and Resonate do for Plone, except without having to run a Plone site. None of this is rocket science, of course. But unfortunately, it seems to be beyond the architectural capabilities of most CMS programmers. It would require loose coupling between the data model and business rules, for example.
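To make the idea concrete, here is a minimal sketch of that routing rule in Python. It is only an illustration under assumptions: the `CANONICAL_SITE` table, the host names, and the `resolve` function are all hypothetical, and nothing like this exists in WordPress as far as this post is concerned. Each post has exactly one home site; every other site answers with a 301 pointing there.

```python
from typing import Optional, Tuple

# Hypothetical registry: each post slug maps to the one site that displays it.
# The slugs and host names are made up for illustration.
CANONICAL_SITE = {
    "this-site-is-such-a-hack": "dev.example.com",
    "weekend-in-vermont": "personal.example.com",
}


def resolve(host: str, slug: str) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, redirect target) for a request for `slug` on `host`."""
    home = CANONICAL_SITE.get(slug)
    if home is None:
        return 404, None  # no site in the universe has this post
    if home == host:
        return 200, None  # right site: serve the post here
    # Wrong site: send visitors and search engines to the post's home.
    return 301, f"https://{home}/{slug}/"
```

The point of the sketch is the loose coupling: the decision depends only on a small data table, not on which site's code happens to be running.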

Failing that, I figured I could export my SD posts from my personal blog and import them into this blog… kinda.

WordPress doesn’t make it easy to export only a select set of posts from a blog. But it is possible.

I started by selecting all the posts I wanted to export, bulk-editing them, and then adding a new tag to them, called “to export.” I hit OK, and only then, after I had done all that work, did WordPress inform me that the edit data was too long for it (or maybe for my host’s PHP installation) to handle.

Second attempt: I selected only a few posts that I wanted to export, bulk-edited them, and then added a new “to export” tag to them. I hit OK, and WordPress changed the last-modified date of the draft posts in the set. It wasn’t supposed to do that, because I hadn’t actually modified the post data. It also failed to add the “to export” tag to those posts. Several more experiments verified that I cannot add tags to a post using WordPress’s bulk-edit feature, because WordPress silently fails. Wonderful F*ing software, eh?

Attempt number three: I created a new category, called “- to export” (with the hyphen in front so that it always would appear first in the alphabetized checkbox list of categories). Then I selected the posts I wanted to export, in batches, bulk-edited them, and checked the “- to export” category. Finally, that worked.

I used WordPress’s export feature to export the “- to export” posts. Then I imported the posts into this SD blog. The posts showed up okay, with all comments, but without any of the revision history, because WordPress does not know how to migrate revision history. It also does not know how to migrate image attachments, unless you export the entire content of the site. Again, WordPress gives no indication of this; it silently fails. As a quick Google search revealed, this is a known problem, one that has been known for years, and it has bitten others who have tried to do exactly the same sort of thing that I wanted to do.
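A possible alternative, sketched here and not something tested above, would be to export the entire site and pick out the marked posts offline. The Python below assumes the export file is RSS-style XML in which each post is an `<item>` carrying `<category domain="category" nicename="...">` elements. The sample document, the `to-export` nicename, and the function name are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from typing import List

# Tiny stand-in for an export file; real exports carry many more fields.
SAMPLE_EXPORT = """<rss version="2.0"><channel>
  <item>
    <title>Refactoring a monster function</title>
    <category domain="category" nicename="to-export"><![CDATA[- to export]]></category>
    <category domain="post_tag" nicename="php"><![CDATA[php]]></category>
  </item>
  <item>
    <title>Weekend in Vermont</title>
    <category domain="category" nicename="travel"><![CDATA[Travel]]></category>
  </item>
</channel></rss>"""


def titles_in_category(xml_text: str, nicename: str) -> List[str]:
    """List the titles of items filed under the category with this nicename."""
    root = ET.fromstring(xml_text)
    selected = []
    for item in root.iter("item"):
        for cat in item.findall("category"):
            # Tags use the same element with domain="post_tag"; skip those.
            if cat.get("domain") == "category" and cat.get("nicename") == nicename:
                selected.append(item.findtext("title", default=""))
                break
    return selected
```

Calling `titles_in_category(SAMPLE_EXPORT, "to-export")` picks out only the first sample post. Writing the selected items back out as a new import file is left out of this sketch.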

So I had all the posts, but all the attached images were still over at the old site. I tried the Import External Images plugin, which thought the full-size and scaled-down version of each image were actually two separate images, and insisted on importing them both. Not that it mattered much, because the plugin consistently hung while importing the first post it tried. Every time. I glanced at the code but quickly gave up. I didn’t have time for this.
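The double-import problem looks avoidable in principle. WordPress names scaled-down copies by appending a `-WIDTHxHEIGHT` suffix to the file name, so an importer could fold each scaled copy back onto its original before fetching anything. Here is a minimal Python sketch of that idea, assuming the post body is plain HTML; the class and function names are hypothetical and this is not how the plugin works.

```python
import re
from html.parser import HTMLParser
from typing import List

# Matches a size suffix such as "-300x200" sitting right before the extension.
SIZE_SUFFIX = re.compile(r"-\d+x\d+(?=\.[A-Za-z0-9]+$)")
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif")


class ImageCollector(HTMLParser):
    """Collect image URLs from <img src> and from <a href> links to image files."""

    def __init__(self) -> None:
        super().__init__()
        self.urls: List[str] = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and attributes.get("src"):
            self.urls.append(attributes["src"])
        elif tag == "a":
            href = attributes.get("href") or ""
            if href.lower().endswith(IMAGE_EXTENSIONS):
                self.urls.append(href)


def original_images(html: str) -> List[str]:
    """Return each distinct full-size image URL once, in order of appearance."""
    collector = ImageCollector()
    collector.feed(html)
    seen: List[str] = []
    for url in collector.urls:
        full = SIZE_SUFFIX.sub("", url)  # fold scaled copies onto the original
        if full not in seen:
            seen.append(full)
    return seen
```

One caveat: an original file whose own name happens to end in something like `-100x100` would be folded incorrectly, so a real importer would need to check that the stripped URL actually exists.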

So I gave up. All the images on all the imported posts are still hosted from over on the other blog, on the corresponding (no longer maintained) posts over there.

That’s okay, I guess, because I wanted to redirect from the old posts over to this blog, for search engines and other external links. I used the Redirect plugin, a simple plugin that I have hacked to work correctly, but which is no longer maintained by the original developer, who seems to have disappeared off the Internet. No matter, the (hacked version of the) plugin does the job. I have to set the redirect code and target manually for each and every post on the old blog, but at least it worked.

Of course, then I had to hide the old posts from listings on the old blog. For that, I used the WP Hide Post plugin. Actually, I thought I might be able to add code to the Redirect plugin to automagically hide posts that I had redirected… but this would have required WordPress to have been engineered with something approximating an actual architecture.

I would have had to hook into the posts_where_paged and posts_join_paged filters, which WP Hide Post actually does, and I take my hat off to its developers! The hooks are undocumented. I did look at WordPress’s get_posts() function, which calls these hooks. Sigh. I did once refactor a function that was almost as badly written as this one is. But that was at least a week’s worth of work, full-time; and this was a task I needed to finish up in an afternoon.

Again, I had to manually hide each and every post I had migrated over to the other blog, editing each one and checking off all 7 ways in which I wanted to hide it. That blew through another episode or two of Fringe.

(Actually, I think I was watching Cheers at the time. Or maybe old episodes of Magnum, P.I. Or The Rockford Files. One of those.)

Only then did I discover that this did not hide the posts from the XML sitemap. (Oy vey!) To remove the posts from the sitemap, I had to copy and paste the post ID of each and every post into an edit box in the appropriate settings page.

But I finally got the job done… kinda.

As I said, pieces of the content are still hosted through a different URL. And I never did find a way to redirect comment feeds from the old site to the new one. So there are holes.

The thing is, this is a job that ought to have taken 5 minutes. As I said, I really want a unified system where I can just say, “This post goes over on that site,” and it just happens. Or failing that, I want to be able to spend an afternoon with a code editor to add the feature. I should not need to wrestle with features that don’t work as advertised, and do all this manual hacking on the data, all for a solution held together with bandage tape, which should make any real software engineer throw up just a little.

But that, dear friends, is the nature of the most popular blogging platform in the world.

-TimK
