Building a plugin for Jekyll

The general layout of a plugin for Jekyll

Background *

If you haven't noticed, every header on this site is a link. This is so that you can share a specific portion of a post, much in the same way Wikipedia works. Now, when I built this site, I wanted the markup to have as much semantic meaning as possible, meaning that every id, every class and basically every tag has to make sense without CSS or Javascript (I'm not there yet, though! Try visiting the website through lynx).

One of the weakest links of what I've had the time to do as of today, was the page intralink functionality I spoke of above. The way it was set up, was that when the page is loaded and ready, [a small script][1] inserted links to every header on the page.

Now, you might wonder: "But Marcus, how do browsers such as lynx, or Google Bot see these links?" Truth is: They don't.

The first thing I did to remedy the situation, was to throw the Javascript "solution" out the window. It was a quick hack at best, and I honestly don't know what I thought when I pushed it live.

Ruby to the rescue *

Jekyll is built using Ruby. By extension, so is the site you're reading now. Jekyll is extensible, and any files matching _plugins/*.rb from within your source directory will be loaded into the generation runtime.

Using this pattern you can create Generators that generate new content, Converters that convert source content into something else, and bascially anything you can do using Ruby (Read: anything, click here).

My solution to our little problem was to plop a Converter into the _plugins directory, monkey patch the Markdown engine I use (RDiscount) and overload the convert method.

The structure of plugins for Jekyll *

For starters, add this to your file:

#_plugins/converter.rb:

module Jekyll
    module Converters
        class Markdown
            class RDiscountParser

            end
        end
    end
end

Go ahead and open up a parser of your choice. I'm using RDiscount, so that's what I used. Your parser might not reside in Jekyll::Converters::Markdown, so check your documentation!

While we're at it, toss this into the mix:

#_plugins/converter.rb:
# [..]

alias_method :original_convert, :convert

def convert(content)
    return original_convert(content)
end

# [..]

What this does, is that it creates an alias to the original convert method so we can use our own version as well.

NOTE: I know many old-timers with Ruby will scream and bang their head at me for using the return statement. But since I use a multitude of languages every single day, I prefer to stick to a syntax that's familiar. Readability over standards in my case, unfortunately.

Next, add these two methods so Jekyll will know which files we should process:

#_plugins/converter.rb:
# [..]

def matches(ext)
    return ext =~ /^\.html$/i
end

def output_ext(ext)
    return ext
end

# [..]

Simple enough. Match all .html files and output an extension we want (of course, modify these however you want).

This is really all you need to know to extend your Parser. Put your logic in convert and modify content however you see fit. You can find my particular solution further down the page.

Remark *

Worth noting, is that even though I use this on my own site, it's not in the wild. I use this every 50003432343 year or so when I create a new post and have to regenerate my site. I would never use this somewhat hacky solution on something that's "public facing", i.e. in a RoR controller or something, for that I'd opt for a full-fleshed parser (perhaps Nokogiri?) instead of some homegrown regular expression.

#
# If you use this code, put a link somewhere to this post. 
# Otherwise you're free to do whatever you want.
# If you make a gazillion dollars with it, hire me!
#
module Jekyll
    module Converters
        class Markdown
            class RDiscountParser

                #Matches any h2-h6 tag with or without attributes
                START_PATTERN = /\<(?<tag>h[2-6])\s*(?<attr>(\w+(\=(\"|\')(\w|\s)*(\"|\'))*\s*)*)\>/

                #Matches any ending h2-h6 tag
                END_PATTERN = /\<\/\s*(?<tag>h[0-6])\s*\>/

                #The tag that was matched in START_PATTER, END_PATTER and the content between them
                CONTENT = /\<(?<tag>h[2-6])\s*(?<attr>(\w+(\=(\"|\')(\w|\s)*(\"|\'))*\s*)*)\>(?<content>(\w|\s|\d|\n|[\!\"\#\&\(\)\=\?\\\/\.-])*)?\<\/\s*(?<tag>h[0-6])\s*\>/

                def matches(ext)
                    ext =~ /^\.html$/i
                end

                def output_ext(ext)
                    ext
                end

                alias_method :original_convert, :convert
                def convert(content)

                    return original_convert(content).gsub CONTENT do |match|
                        #Remember
                        #content = the content between the tag
                        #attr = the attributes of the tag
                        #tag = the tag itself

                        tag = $~[:tag]
                        attr = $~[:attr]
                        content = $~[:content]
                        id = content.gsub /\s+/, '-'

                        # After formatting:
                        # <hx x="y">
                        #  <a href="#Some-Header">Some Header</a>
                        #  <a href="#top"> *</a>
                        # </hx>
                        <<-eos
                            <#{tag} #{attr} id=\"#{id}\">
                                <a href=\"\##{id}\">#{content}</a>
                                <a href="#top"> *</a>
                            </#{tag}>"
                        eos
                    end
                end
            end
        end
    end
end

This post is one in a series of posts on building a site with Jekyll. These are the articles I have written so far, in no particular order: