Hum – A Personal URL shortener for WordPress

While I haven’t had much time over the last year or so to spend actually writing code for DiSo, I’ve been really interested in the new direction Tantek has been taking things with his DiSo 2.0 concepts. Many of the early efforts in DiSo were focused just on how to move social data around the web (data formats, protocols, authentication mechanisms, etc.). Tantek is taking a slightly different approach by first emphasizing the importance of data ownership. It’s not enough to simply pull a copy of your content from social networks into your local repository. In order to truly own your data, the original should be on your site, with copies pushed out to whatever social networks you use and links pointing back to the original where appropriate. It may sound like a purely academic distinction, but it’s the difference between sharecropping and homesteading.

So just to prove that I don’t actually spend all of my time in the belly of the beast, I’m happy to announce a new project I’ve been working on – Hum. It’s a personal URL shortener for WordPress, inspired by Whistle. It’s important to note that this is not a traditional URL shortener like bit.ly or goo.gl that is designed for creating links to any arbitrary site on the web. Instead, a personal URL shortener is intended to link to your own content… your blog posts, your status updates, your photos. When pushing content from your site out to social networks, you often need the ability to link back to the original. Thanks to Twitter this means short URLs, and following DiSo principles, this means controlling those URLs. A URL shortener may not be sexy, but it’s necessary infrastructure for DiSo 2.0.

Hum is like Whistle in a lot of ways. It’s designed to be run on a personal domain. It uses NewBase60 encodings to keep the URLs both short and human readable. And it uses the same content-type partitioning of the URL space. The key difference is in the use of database keys. One of the design principles of Whistle is to have algorithmically reversible URLs. That means that anyone who knows the algorithm can convert the short URL back to the fully expanded form without actually having to make an HTTP request. It also means that you don’t need any kind of datastore to look up the mapping. You can easily store everything in flat files. This results in a more stable and overall faster system. But because we’re building on WordPress, which is tied to a database anyway, there wasn’t as much of a benefit in avoiding the use of database keys in the short URLs.
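
For the curious, here is a minimal sketch of NewBase60 encoding and decoding. It is written in Python for illustration and is not the PHP that Hum runs; the function names are made up for this example. The alphabet leaves out capital I, capital O and lowercase l so that codes are hard to misread.

```python
# NewBase60 alphabet: digits, then letters with the look-alikes I, O and l left out.
ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"


def to_newbase60(n):
    """Encode a non-negative integer as a NewBase60 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, rem = divmod(n, 60)
        digits.append(ALPHABET[rem])
    # Digits were collected least significant first, so reverse them.
    return "".join(reversed(digits))


def from_newbase60(code):
    """Decode a NewBase60 string back to an integer."""
    n = 0
    for ch in code:
        # str.index raises ValueError for characters outside the alphabet.
        n = n * 60 + ALPHABET.index(ch)
    return n


print(to_newbase60(918))     # FJ
print(from_newbase60("FJ"))  # 918
```

A two-character code covers 3,600 values and a three-character code covers 216,000, which is why these URLs stay so short.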

How it Works

Hum is actually pretty simple. It’s very lightweight and currently uses no data storage of its own, as there are no configuration options. It registers a few URL patterns like /b/* and then handles any requests to those paths. For example, the short URL for this blog post is http://wjn.me/b/FJ. It also hooks into the built-in WordPress shortlink functionality to expose these new shortlinks in the metadata for each page.
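
To make the URL handling a little more concrete, here is a rough sketch of how a path like /b/FJ could be split into a content type and a numeric ID. It is in Python rather than Hum's own PHP, the function name parse_short_path is hypothetical, and it assumes the code after the type prefix is simply the NewBase60 form of a database ID.

```python
import re

ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"

# One single-letter segment for the content type, one segment for the encoded ID.
SHORT_PATH = re.compile(r"^/([A-Za-z])/([0-9A-Za-z_]+)$")


def parse_short_path(path):
    """Return (content_type, numeric_id), or None if the path is not a short URL."""
    match = SHORT_PATH.match(path)
    if match is None:
        return None
    content_type, code = match.groups()
    numeric_id = 0
    for ch in code:
        if ch not in ALPHABET:  # I, O and l are not NewBase60 digits
            return None
        numeric_id = numeric_id * 60 + ALPHABET.index(ch)
    return content_type, numeric_id


print(parse_short_path("/b/FJ"))         # ('b', 918)
print(parse_short_path("/2011/05/hum"))  # None
```

Anything that doesn't match the pattern falls through untouched, so regular permalinks keep working as before.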

Doing nothing else, this should give you reasonably short URLs, depending on your domain name. But the real value comes when you couple it with a personal short domain, and it’s incredibly simple to do. Buy a short domain, and set it to redirect to your primary domain. I did this by putting the following in my .htaccess for my short domain:

RewriteEngine On
RewriteBase /
RewriteRule (.*) http://willnorris.com/$1 [L,R=permanent]

Then you want to tell Hum that you have a short domain that it should use for generating URLs. To do that, add something like the following to your theme’s functions.php file:

add_filter('hum_shortlink_base', function () { return 'http://wjn.me/'; });

And that’s it. You now have simple short URLs for all of your WordPress content. Hum includes additional hooks to make it very easy to link to offsite content, which I’ll hopefully cover in a future post. In the meantime, read the source… it’s pretty well documented.

This is only the first step in what I’d like to build for a WordPress implementation of DiSo 2.0, but a necessary one. If you’re interested in this, please contact me. You should also consider coming to IndieWebCamp in Portland, Oregon where we’ll be discussing this stuff for a full weekend.

Download Hum on GitHub: https://github.com/willnorris/wordpress-hum
Also on WordPress Extend: http://wordpress.org/extend/plugins/hum/

Comments and responses

This is similar to my old Shorten plugin. The / after the b (something my impl also did) is superfluous, you can take it out if you only use a single char for partitioning.

On algorithmic reversing: for sngpl.ma I chose to use algo shortening (same algo as Whistle) for one reason: durability. What if I leave WordPress? What if my DB (and backups) get lost? What if my site goes offline? With an algo shortener you can look up the post in the Internet Archive or similar even if my site goes away.

Tantek seems to be following precedent regarding the trailing slash:

b - blog post specific short URL design: /b/SSSn (prefer /x/ design per Flic.kr precedent, extensibility)

I also considered the durability of URLs, particularly with "what if I leave WordPress". For me personally, I'm not willing to change the permalink structure of my blog to something that allows for fully algorithmic URL expansion. I'm pretty heavily invested in WordPress already, and don't see myself switching anytime soon. If and when I do, I'll deal with it then :). It's certainly a trade-off, but one I made consciously.

I'll add, I don't actually agree 100% with Tantek's DiSo 2.0 concepts, but fortunately I don't have to. There is still a lot of value to be gained in terms of data ownership with an 80% solution.

I haven't changed my permalink structure at all. The algo link gives the epoch day (and thus the year and month) and from that one can look through posts on that day and find the nth one.
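
For anyone following along, the epoch-day scheme discussed above can be sketched roughly as follows. This is an illustration in Python, not Whistle's actual code. It assumes that SSS in /b/SSSn is the number of days since 1970-01-01 written as three NewBase60 digits, and that n is the 1-based position of the post on that day. The function names are hypothetical.

```python
from datetime import date, timedelta

ALPHABET = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"
EPOCH = date(1970, 1, 1)


def encode(n, width=1):
    """NewBase60-encode n, left-padded with '0' to at least `width` digits."""
    digits = ""
    while n > 0:
        n, rem = divmod(n, 60)
        digits = ALPHABET[rem] + digits
    return digits.rjust(width, "0")


def decode(code):
    """Turn a NewBase60 string back into an integer."""
    n = 0
    for ch in code:
        n = n * 60 + ALPHABET.index(ch)
    return n


def short_code(day, nth):
    """Build the SSSn part of a short URL from a date and the nth post that day."""
    return encode((day - EPOCH).days, 3) + encode(nth)


def expand(code):
    """Recover (date, nth) from an SSSn code with no database lookup."""
    return EPOCH + timedelta(days=decode(code[:3])), decode(code[3:])


code = short_code(date(2011, 1, 1), 1)
print(code)  # 49a1
day, nth = expand(code)
print(day.isoformat(), nth)  # 2011-01-01 1
```

The date alone is enough to find the right day in an archive, and the trailing digit picks out the post, which is the durability argument made above.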