Migrating from a WordPress Backup

I have been using Hugo to generate this site for quite a while now and I really like it[^1]. But I needed to migrate my old WordPress blog into a static framework.

Fortunately, WP practices ethical software development and makes it easy to get your data out of their software. You can get a MySQL database dump, which is useful for migrating from one hosting provider to another, or get a dump in the form of a JSON array containing all your posts. Each post looks like this:

{
  "ID": "1",
  "post_author": "1",
  "post_date": "2013-03-16 20:15:14",
  "post_date_gmt": "2013-03-16 20:15:14",
  "post_content": "Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!",
  "post_title": "Hello world!",
  "post_excerpt": "",
  "post_status": "publish",
  "comment_status": "open",
  "ping_status": "open",
  "post_password": "",
  "post_name": "hello-world",
  "to_ping": "",
  "pinged": "",
  "post_modified": "2013-03-16 20:15:14",
  "post_modified_gmt": "2013-03-16 20:15:14",
  "post_content_filtered": "",
  "post_parent": "0",
  "guid": "http://jpfairbanks.net/blog/?p=1",
  "menu_order": "0",
  "post_type": "post",
  "post_mime_type": "",
  "comment_count": "0"
}
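A real dump usually holds more than published posts: WordPress keeps pages, revisions, and attachments in the same table, and `post_status` separates drafts from published entries. Here is a minimal Python sketch of narrowing the array down to published posts, assuming the dump has been loaded as a list of objects shaped like the one above. The function name `published_posts` and the inline sample data are made up for illustration and are not part of the export.

```python
import json

def published_posts(posts):
    """Return only entries that are regular posts with status 'publish'."""
    return [
        p for p in posts
        if p.get("post_type") == "post" and p.get("post_status") == "publish"
    ]

# Tiny inline sample standing in for the real dump (hypothetical data).
sample = json.loads("""
[
  {"ID": "1", "post_title": "Hello world!", "post_type": "post", "post_status": "publish"},
  {"ID": "2", "post_title": "About", "post_type": "page", "post_status": "publish"},
  {"ID": "3", "post_title": "Unfinished", "post_type": "post", "post_status": "draft"}
]
""")

titles = [p["post_title"] for p in published_posts(sample)]
print(titles)
```

Filtering early like this keeps pages and unfinished drafts from turning into stray files later on.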

Each post object contains all the information you need to generate a Hugo markdown file for your site. And you can extract it with my favorite UNIX tool invented after Y2K, jq[^2].

jq '.[] |
  {status:.post_status,
  title:.post_title,
  date: .post_date,
  content:.post_content,
  type:.post_type,
  # build the output path: drop punctuation, replace spaces, lowercase
  path: (.post_title |
          gsub("[!@#$]"; "") |
          gsub(" "; "_") |
          "content/post/"+.+".md" |
          ascii_downcase)
  }'

Which will output something like:

{
  "status": "publish",
  "title": "Hello world!",
  "date": "2013-03-16 20:15:14",
  "content": "Welcome to WordPress. This is your first post. Edit or delete it, then start blogging!",
  "type": "post",
  "path": "content/post/hello_world.md"
}
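To make the string handling in the `path` expression easier to follow, here is a rough Python sketch of the same steps. It assumes the intent of the first `gsub` is to drop the individual characters `!`, `@`, `#`, and `$`. The name `post_path` is invented for this illustration.

```python
import re

def post_path(title):
    """Mirror the jq path expression: strip punctuation, swap spaces, lowercase."""
    slug = re.sub(r"[!@#$]", "", title)  # drop the characters !, @, # and $
    slug = slug.replace(" ", "_")         # spaces become underscores
    # Note: str.lower() also lowercases non-ASCII letters,
    # whereas jq's ascii_downcase only touches A-Z.
    return ("content/post/" + slug + ".md").lower()

print(post_path("Hello world!"))
```

Titles that contain other characters, such as `/` or `?`, pass through untouched in both versions, so a longer character class may be needed for a real blog.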

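Now you need to get those posts into separate files, so set the -c flag for compact output (one JSON object per line) on your jq script and start writing in a real programming language. For this case we use Python.

"""gen.py takes a JSON description of the posts and writes them out as Hugo input files."""
import json
import os
import sys

def render(obj):
    "render an object with the template"
    # The trailing backslash makes the file start with the front matter
    # delimiter rather than a blank line. json.dumps quotes the title and
    # escapes embedded double quotes and backslashes.
    out = """\
+++
date = "{}"
title = {}
tags = []
highlight = true
math = false
draft = true

[header]
  caption = ""
  image = ""

+++
{}""".format(obj['date'], json.dumps(obj['title'], ensure_ascii=False), obj['content'])
    return out

def main():
    "write the rendered template for each line of stdin"
    for line in sys.stdin:
        if not line.strip():
            continue  # ignore blank lines in the input
        obj = json.loads(line)
        string = render(obj)
        path = obj['path']
        # make sure the target directory exists before writing
        os.makedirs(os.path.dirname(path) or '.', exist_ok=True)
        print("Writing to {}".format(path))
        with open(path, 'w', encoding='utf-8') as filp:
            print(string, file=filp)

if __name__ == "__main__":
    main()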
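The template above marks every post as a draft and copies the WordPress timestamp through unchanged. One possible refinement, sketched here under the assumption that each input object carries the `status` and `date` keys produced by the jq filter, is to derive the `draft` flag from the status and rewrite the timestamp in ISO 8601 form. The helper name `front_matter_fields` is hypothetical.

```python
from datetime import datetime

def front_matter_fields(obj):
    """Derive a draft flag and an ISO 8601 date from one jq output object."""
    parsed = datetime.strptime(obj["date"], "%Y-%m-%d %H:%M:%S")
    return {
        "date": parsed.isoformat(),           # "T" separator instead of a space
        "draft": obj["status"] != "publish",  # anything unpublished stays a draft
    }

fields = front_matter_fields({"status": "publish", "date": "2013-03-16 20:15:14"})
print(fields)
```

TOML booleans are lowercase, so the flag would have to be written as `true` or `false` rather than Python's `True` or `False` when it is substituted into the template.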

In this case we wrote a program to take a database dump and generate input files for the static site generator to make the site. You could certainly do this in Go, but a quick Python script was easy enough. It would be fun to write a full-featured tool that took a WP dump and converted it to a Hugo input directory. Maybe this would make a good WordPress plugin for someone who knows either WordPress or Hugo and wants to learn the other really well.

[^1]: In fact, static site generation inspired me to design my QueryGarden project based on static AOT SQL query generation.

[^2]: This requires jq v1.5+