Django: Highlighting in HTML using Pygments and Beautiful Soup

Two months ago I wrote about Django, Pygments and Beautiful Soup.

A few days after that post I switched from HTML markup to reStructuredText, so I no longer needed my filter and forgot to post more about it.

Today I received a comment asking me to publish the source code of the template filter—so here it is:

# encoding: utf-8

"""
A filter to highlight code blocks in html with Pygments and BeautifulSoup.

    {% load highlight_code %}

    {{ var_with_code|highlight|safe }}
"""

from BeautifulSoup import BeautifulSoup
from django import template
from django.template.defaultfilters import stringfilter
import pygments
import pygments.formatters
import pygments.lexers


register = template.Library()

@register.filter
@stringfilter
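def highlight(html):
    """Highlight every <pre class="lexer-name"> block found in ``html``."""
    soup = BeautifulSoup(html)
    for block in soup.findAll('pre'):
        # Only blocks that name a lexer via their class attribute are touched.
        if not block.has_key('class'):
            continue
        # Serialize the block's children back into one source string.
        code = ''.join([unicode(item) for item in block.contents])
        # Raises pygments.util.ClassNotFound for an unknown lexer name.
        lexer = pygments.lexers.get_lexer_by_name(block['class'])
        formatter = pygments.formatters.HtmlFormatter()
        code_hl = pygments.highlight(code, lexer, formatter)
        # Replace the original content with the Pygments markup and
        # rename the outer tag to <code>.
        block.contents = [BeautifulSoup(code_hl)]
        block.name = 'code'
    return unicode(soup)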

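Copy the code into a file called templatetags/highlight_code.py within a new or an existing Django app. The templatetags directory must also contain an empty __init__.py so that Python treats it as a package, and the app must be listed in INSTALLED_APPS.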

highlight() searches the passed HTML for <pre> tags whose class attribute names the Pygments lexer to use, e.g.:

<p><em>Hello World!</em> in Python:</p>
<pre class="python">
print 'Hello World'
</pre>
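
To see what happens to such a block, here is a minimal sketch of the Pygments step on its own, without Django or Beautiful Soup. The helper name highlight_snippet and the fallback to the plain-text lexer are assumptions made for illustration; the filter above has no fallback and simply lets the error for an unknown lexer name propagate.

```python
from pygments import highlight
from pygments.formatters import HtmlFormatter
from pygments.lexers import TextLexer, get_lexer_by_name
from pygments.util import ClassNotFound


def highlight_snippet(code, lexer_name):
    """Return Pygments HTML for ``code``.

    Hypothetical helper for illustration, not part of the filter.
    """
    try:
        # The same lookup the filter performs with the <pre> class value.
        lexer = get_lexer_by_name(lexer_name)
    except ClassNotFound:
        # Assumption: fall back to plain text instead of raising.
        lexer = TextLexer()
    return highlight(code, lexer, HtmlFormatter())


# The result is wrapped in <div class="highlight"><pre>...</pre></div>,
# with each token in a <span> carrying a short CSS class.
html = highlight_snippet("print('Hello World')", "python")
fallback_html = highlight_snippet("just some text", "no-such-lexer")
```

Whether an unknown class should fall back to plain text or raise is a design choice; raising, as the filter does, makes typos in the class name visible immediately.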

You will also want a CSS file with the Pygments style definitions in your static media directory, because by default the formatter emits only CSS class names and no inline styles. The following Python script generates it:

# Call it this way:
# python gen_css.py pygments.css
import sys

from pygments.formatters import HtmlFormatter

# You can change style and the html class here:
css = HtmlFormatter(style='colorful').get_style_defs('.highlight')

# The with block closes the file even if writing fails.
with open(sys.argv[1], 'w') as f:
    f.write(css)
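
If you are not sure which style names you can pass to the formatter, Pygments can list them. The following is a small sketch that only inspects what the script would write; the variable names are placeholders of mine.

```python
from pygments.formatters import HtmlFormatter
from pygments.styles import get_all_styles

# Names accepted by HtmlFormatter(style=...), for example 'colorful'.
available_styles = sorted(get_all_styles())

# The CSS text that the script above writes to the file.
css = HtmlFormatter(style='colorful').get_style_defs('.highlight')

# The token rules are prefixed with the selector passed to
# get_style_defs(), so they only apply inside the highlighted block.
token_rules = [line for line in css.splitlines()
               if line.startswith('.highlight')]
```

Printing available_styles in a Python shell is a quick way to pick a style before generating the file.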

I hope I explained all this well enough—if not, leave a comment and I’ll update this post. :-)