Category Python
 

Things about Python or projects I do with Python

Subcategories:

Django (13)
Posts about things I’ve done with Django (especially this site)
Scientific (5)
Posts about scientific Python packages or about using Python in science in general.

Designing and Testing PyZMQ Applications – Part 1

ZeroMQ (or ØMQ or ZMQ) is an intelligent messaging framework, often described as “sockets on steroids”. That is, its sockets look like normal TCP sockets but actually work as you’d expect sockets to work. PyZMQ adds even more convenience to them, which makes it a really good choice if you want to implement a distributed application. Another big plus for ØMQ is that you can integrate sub-systems written in C, Java or any other language ØMQ supports (and there are a lot of them).

If you’ve never heard of ØMQ before, I recommend reading ZeroMQ an Introduction by Nicholas Piël before you go on with this article.

The ØMQ Guide and PyZMQ’s documentation are really good, so you can easily get started. However, when we began to implement a larger application with it (a distributed simulation framework), several questions arose which were not covered by the documentation:

  • What’s the best way to design our application?
  • How can we keep it readable, flexible and maintainable?
  • How do we test it?

I didn’t find anything like a best-practice article that answered my questions. So in this series of articles, I’m going to talk about what I’ve learned over the last months. I’m not a PyZMQ expert (yet ;-)), but what I’ve done so far works quite well, and I have never had more tests in a project than I have now.

You’ll find the source for the examples at bitbucket. They are written in Python 3.2 and tested under Mac OS X Lion, Ubuntu 11.10 and Windows 7, 64 bit in each case. If you have any suggestions or improvements, please fork me or just leave a comment.

In this first article, I’m going to talk a bit about how you could generally design your application to be flexible, maintainable and testable. The second part will be about unit testing, and finally, I’ll cover process and system testing.

Comparison of Different Approaches

There are basically three possible ways to implement a PyZMQ application: one that’s easy but limited in practical use, one that’s more flexible but not really pythonic, and one that needs a bit more setup but is flexible and pythonic.

All three examples feature a simple ping process and a pong process with varying complexity. I use multiprocessing to run the pong process, because that’s what you should usually do in real PyZMQ applications (you don’t want to use threads and if both processes are running on the same machine, there’s no need to invoke both of them separately).

All of the examples will have the following output:

(zmq)$ python blocking_recv.py
Pong got request: ping 0
Ping got reply: pong 0
...
Pong got request: ping 4
Ping got reply: pong 4

Let’s start with the easy one first. You just use one of the socket’s recv methods in a loop:

# blocking_recv.py
import multiprocessing
import zmq


addr = 'tcp://127.0.0.1:5678'


def ping():
    """Sends ping requests and waits for replies."""
    context = zmq.Context()
    sock = context.socket(zmq.REQ)
    sock.bind(addr)

    for i in range(5):
        sock.send_unicode('ping %s' % i)
        rep = sock.recv_unicode()  # This blocks until we get something
        print('Ping got reply:', rep)


def pong():
    """Waits for ping requests and replies with a pong."""
    context = zmq.Context()
    sock = context.socket(zmq.REP)
    sock.connect(addr)

    for i in range(5):
        req = sock.recv_unicode()  # This also blocks
        print('Pong got request:', req)
        sock.send_unicode('pong %s' % i)


if __name__ == '__main__':
    pong_proc = multiprocessing.Process(target=pong)
    pong_proc.start()

    ping()

    pong_proc.join()

So this is very easy and not that much code. The problem is that it only works well if your process uses just one socket. Unfortunately, in larger applications that is rarely the case.

A way to handle multiple sockets per process is polling. In addition to your context and socket(s), you need a poller. You also have to tell it which events on which socket you are going to poll:

# polling.py
def pong():
    """Waits for ping requests and replies with a pong."""
    context = zmq.Context()
    sock = context.socket(zmq.REP)
    sock.bind(addr)

    # Create a poller and register the events we want to poll
    poller = zmq.Poller()
    poller.register(sock, zmq.POLLIN|zmq.POLLOUT)

    for i in range(10):
        # Get all sockets that can do something
        socks = dict(poller.poll())

        # Check if we can receive something
        if sock in socks and socks[sock] == zmq.POLLIN:
            req = sock.recv_unicode()
            print('Pong got request:', req)

        # Check if we can send something
        if sock in socks and socks[sock] == zmq.POLLOUT:
            sock.send_unicode('pong %s' % (i // 2))

    poller.unregister(sock)

As you can see, our pong function got pretty ugly. You need 10 iterations to do five ping-pongs, because in each iteration you can either receive or send. And each socket you add to your process adds two more if-statements. You could improve that design by creating a base class that wraps the polling loop, and then just registering sockets and callbacks in an inheriting class.
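
A minimal sketch of such a wrapper could look like the following. The names CallbackPoller and FakePoller are hypothetical and not part of PyZMQ. The sketch only assumes that the poller offers register(sock, flags) and poll(timeout) like zmq.Poller does, so the fake stand-in lets it run without PyZMQ installed:

```python
class CallbackPoller:
    """Maps readable sockets to callbacks (hypothetical helper)."""

    def __init__(self, poller, pollin):
        # poller: object with register(sock, flags) and poll(timeout),
        #         for example zmq.Poller().
        # pollin: the "readable" event flag, for example zmq.POLLIN.
        self.poller = poller
        self.pollin = pollin
        self.callbacks = {}

    def register(self, sock, callback):
        """Polls *sock* for incoming messages and remembers its callback."""
        self.poller.register(sock, self.pollin)
        self.callbacks[sock] = callback

    def run_once(self, timeout=None):
        """Polls once and calls the callback of every readable socket."""
        for sock, event in self.poller.poll(timeout):
            if event & self.pollin:
                self.callbacks[sock](sock)


class FakePoller:
    """Stand-in for zmq.Poller so the sketch runs without PyZMQ."""

    def __init__(self):
        self.registered = []

    def register(self, sock, flags):
        self.registered.append((sock, flags))

    def poll(self, timeout=None):
        # Pretend that every registered socket is readable.
        return list(self.registered)


received = []
loop = CallbackPoller(FakePoller(), pollin=1)
loop.register('sock_a', received.append)
loop.register('sock_b', received.append)
loop.run_once()
print(received)
```

With this design, adding a socket means one more register() call instead of two more if-statements in the loop.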

That brings us to our final example. PyZMQ comes with an adapted Tornado event loop that handles the polling and works with ZMQStreams, which wrap sockets and add some functionality:

# eventloop.py
from zmq.eventloop import ioloop, zmqstream


class Pong(multiprocessing.Process):
    """Waits for ping requests and replies with a pong."""
    def __init__(self):
        super().__init__()
        self.loop = None
        self.stream = None
        self.i = 0

    def run(self):
        """
        Initializes the event loop, creates the sockets/streams and
        starts the (blocking) loop.

        """
        context = zmq.Context()
        self.loop = ioloop.IOLoop.instance()  # This is the event loop

        sock = context.socket(zmq.REP)
        sock.bind(addr)
        # We need to create a stream from our socket and
        # register a callback for recv events.
        self.stream = zmqstream.ZMQStream(sock, self.loop)
        self.stream.on_recv(self.handle_ping)

        # Start the loop. It runs until we stop it.
        self.loop.start()

    def handle_ping(self, msg):
        """Handles ping requests and sends back a pong."""
        # msg is a list of byte objects
        req = msg[0].decode()
        print('Pong got request:', req)
        self.stream.send_unicode('pong %s' % self.i)

        # We’ll stop the loop after 5 pings
        self.i += 1
        if self.i == 5:
            self.stream.flush()
            self.loop.stop()

This adds even more boilerplate code, but it will pay off if you use more sockets, and most of that stuff in run() can be put into a base class. Another drawback is that the IOLoop only uses recv_multipart(), so you always get a list of byte strings which you have to decode or deserialize on your own. However, you can use all the send methods the socket offers (like send_unicode() or send_json()). You can also stop the loop from within a message handler.
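
To illustrate the decoding step, here is a small sketch of two helpers you might call at the top of an on_recv callback. The function names decode_frames and load_json_frame are my own and not part of PyZMQ. I also assume UTF-8 encoded text and that the payload sits in the last frame:

```python
import json


def decode_frames(frames, encoding='utf-8'):
    """Decodes a list of byte strings into a list of str."""
    return [frame.decode(encoding) for frame in frames]


def load_json_frame(frames):
    """Deserializes the last frame of a multipart message from JSON."""
    return json.loads(frames[-1].decode('utf-8'))


# What a callback might receive for a message sent with send_unicode()
msg = [b'ping 0']
text = decode_frames(msg)

# What a callback might receive for a message sent with send_json()
json_msg = [b'["ping", 3]']
msg_type, count = load_json_frame(json_msg)
print(text, msg_type, count)
```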

In the next sections, I’ll discuss how you could implement a PyZMQ process that uses the event loop.

Communication Design

Before you start to implement anything, you should think about what kind of processes you need in your application and which messages they exchange. You should also decide what kind of message format and serialization you want to use.

PyZMQ has built-in support for Unicode (send sends plain C strings which map to Python byte objects, so there’s a separate method to send Unicode strings), JSON and Pickle.

JSON is nice, because it’s fast and lets you integrate processes written in other languages into your application. It’s also a bit safer, because you cannot receive arbitrary objects as with pickle. The most straightforward syntax for JSON messages is to let them be triples [msg_type, args, kwargs], where msg_type maps to a method name and args and kwargs get passed as positional and keyword arguments.
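
As a rough sketch of the triple format, the following uses only the standard json module. Calculator, encode and dispatch are hypothetical names chosen for illustration. Rejecting message types that start with an underscore is my own precaution, not something PyZMQ does for you:

```python
import json


class Calculator:
    """Hypothetical handler; its public methods are the message types."""

    def add(self, a, b=0):
        return a + b


def encode(msg_type, *args, **kwargs):
    """Serializes a message as the JSON triple [msg_type, args, kwargs]."""
    return json.dumps([msg_type, args, kwargs]).encode('utf-8')


def dispatch(handler, raw):
    """Decodes a triple and calls the handler method named by msg_type."""
    msg_type, args, kwargs = json.loads(raw.decode('utf-8'))
    if msg_type.startswith('_'):
        # Don't let remote peers call private or special methods.
        raise ValueError('Refusing to call private method: %s' % msg_type)
    method = getattr(handler, msg_type)
    return method(*args, **kwargs)


raw = encode('add', 2, b=3)
result = dispatch(Calculator(), raw)
print(raw, result)
```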

I strongly recommend documenting each chain of messages your application sends to perform a certain task. I do this with fancy PowerPoint graphics and with even fancier ASCII art in Sphinx. Here is how I would document our ping-pong:

Sending pings
-------------

* If the ping process sends a *ping*, the pong process responds with a
  *pong*.
* The number of pings (and pongs) is counted. The current ping count is
  sent with each message.

::

    PingProc      PongProc
     [REQ] ---1--> [REP]
           <--2---


    1 IN : ['ping, count']
    1 OUT: ['ping, count']

    2 IN : ['pong, count']
    2 OUT: ['pong, count']

First, I write some bullet points that explain how the processes behave and why they behave this way. This is followed by some kind of sequence diagram that shows which process sends which message, when, and using which socket type. Finally, I write down what the messages look like. # IN is what you would pass to send_multipart and # OUT is what is received on the other side by recv_multipart. If one of the participating sockets is a ROUTER or DEALER, IN and OUT will differ (though that’s not the case in this example). Everything in single quotation marks (') represents a JSON serialized list.

If our pong process used a ROUTER socket instead of the REP socket, it would look like this:

1 IN : ['ping, count']
1 OUT: [ping_uuid, '', 'ping, count']

2 IN : [ping_uuid, '', 'pong, count']
2 OUT: ['pong, count']

This seems like a lot of tedious work, but trust me, it really helps a lot when you need to change something a few weeks later!
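
The ROUTER case above means that your handler has to deal with the extra envelope frames. Here is a minimal sketch, in plain Python without PyZMQ, of how you might split and re-attach them. The helper names and the identity b'ping-uuid' are made up for illustration:

```python
def split_envelope(frames):
    """Splits a ROUTER message into (envelope, payload) at the empty frame."""
    delimiter = frames.index(b'')
    return frames[:delimiter], frames[delimiter + 1:]


def make_reply(envelope, payload):
    """Re-attaches the envelope so the ROUTER can route the reply back."""
    return envelope + [b''] + payload


# Hypothetical identity frame followed by the empty delimiter and the payload
incoming = [b'ping-uuid', b'', b'["ping", 1]']
envelope, payload = split_envelope(incoming)
reply = make_reply(envelope, [b'["pong", 1]'])
print(envelope, payload, reply)
```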

Application Design

In the examples above, the Pong process was responsible for setting everything up, for receiving/sending messages and for the actual application logic (counting incoming pings and creating a pong).

Obviously, this is not a very good design. What we can do about this is to put most of that nasty setup stuff into a base class which all your processes can inherit from, and to put all the actual application logic into a separate (PyZMQ independent) class.

ZmqProcess – The Base Class for all Processes

The base class basically implements two things:

  • a setup method that creates a context and a loop
  • a stream factory method for streams with an on_recv callback. It creates a socket and can connect/bind it to a given address or bind it to a random port (that’s why it returns the port number in addition to the stream itself).

It also inherits multiprocessing.Process so that it is easier to spawn it as a sub-process. Of course, you can also just call its run() method from your main().

# zmqproc.py
import multiprocessing

from zmq.eventloop import ioloop, zmqstream
import zmq


class ZmqProcess(multiprocessing.Process):
    """
    This is the base for all processes and offers utility functions
    for setup and creating new streams.

    """
    def __init__(self):
        super().__init__()

        self.context = None
        """The ØMQ :class:`~zmq.Context` instance."""

        self.loop = None
        """PyZMQ's event loop (:class:`~zmq.eventloop.ioloop.IOLoop`)."""

    def setup(self):
        """
        Creates a :attr:`context` and an event :attr:`loop` for the process.

        """
        self.context = zmq.Context()
        self.loop = ioloop.IOLoop.instance()

    def stream(self, sock_type, addr, bind, callback=None, subscribe=b''):
        """
        Creates a :class:`~zmq.eventloop.zmqstream.ZMQStream`.

        :param sock_type: The ØMQ socket type (e.g. ``zmq.REQ``)
        :param addr: Address to bind or connect to formatted as *host:port*,
                *(host, port)* or *host* (bind to random port).
                If *bind* is ``True``, *host* may be:

                - the wild-card ``*``, meaning all available interfaces,
                - the primary IPv4 address assigned to the interface, in its
                  numeric representation or
                - the interface name as defined by the operating system.

                If *bind* is ``False``, *host* may be:

                - the DNS name of the peer or
                - the IPv4 address of the peer, in its numeric representation.

                If *addr* is just a host name without a port and *bind* is
                ``True``, the socket will be bound to a random port.
        :param bind: Binds to *addr* if ``True`` or tries to connect to it
                otherwise.
        :param callback: A callback for
                :meth:`~zmq.eventloop.zmqstream.ZMQStream.on_recv`, optional
        :param subscribe: Subscription pattern for *SUB* sockets, optional,
                defaults to ``b''``.
        :returns: A tuple containing the stream and the port number.

        """
        sock = self.context.socket(sock_type)

        # addr may be 'host:port' or ('host', port)
        if isinstance(addr, str):
            addr = addr.split(':')
        host, port = addr if len(addr) == 2 else (addr[0], None)

        # Bind/connect the socket
        if bind:
            if port:
                sock.bind('tcp://%s:%s' % (host, port))
            else:
                port = sock.bind_to_random_port('tcp://%s' % host)
        else:
            sock.connect('tcp://%s:%s' % (host, port))

        # Add a default subscription for SUB sockets
        if sock_type == zmq.SUB:
            sock.setsockopt(zmq.SUBSCRIBE, subscribe)

        # Create the stream and add the callback
        stream = zmqstream.ZMQStream(sock, self.loop)
        if callback:
            stream.on_recv(callback)

        return stream, int(port)
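
The address handling in stream() is easy to try out in isolation. The following sketch copies just that part into a stand-alone function. The name split_addr is hypothetical and does not appear in zmqproc.py:

```python
def split_addr(addr):
    """Returns (host, port); port is None if addr contains no port."""
    # addr may be 'host:port', ('host', port) or just 'host'
    if isinstance(addr, str):
        addr = addr.split(':')
    host, port = addr if len(addr) == 2 else (addr[0], None)
    return host, port


with_port = split_addr('127.0.0.1:5678')
from_tuple = split_addr(('127.0.0.1', 5678))
host_only = split_addr('*')
print(with_port, from_tuple, host_only)
```

Note that a port parsed from a string stays a string, which is why stream() converts it with int() before returning it.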

PongProc – The Actual Process

The PongProc inherits ZmqProcess and is the main class for our process. It creates the streams, starts the event loop and dispatches all messages to the appropriate handlers:

# pongproc.py
from zmq.utils import jsonapi as json
import zmq

import zmqproc


host = '127.0.0.1'
port = 5678


class PongProc(zmqproc.ZmqProcess):
    """
    Main process for the Ponger. It handles ping requests and sends back
    a pong.

    """
    def __init__(self, bind_addr):
        super().__init__()

        self.bind_addr = bind_addr
        self.rep_stream = None

        # Make sure this is pickle-able (e.g., not using threads)
        # or it won't work on Windows. If it's not pickle-able, instantiate
        # it in setup().
        self.ping_handler = PingHandler()

    def setup(self):
        """Sets up PyZMQ and creates all streams."""
        super().setup()

        self.rep_stream, _ = self.stream(zmq.REP, self.bind_addr, bind=True,
                callback=self.handle_rep_stream)

    def run(self):
        """Sets up everything and starts the event loop."""
        self.setup()
        self.loop.start()

    def stop(self):
        """Stops the event loop."""
        self.loop.stop()

    def handle_rep_stream(self, msg):
        """
        Handles messages from a Pinger:

        *ping*
            Send back a pong.

        *plzdiekthxbye*
            Stop the ioloop and exit.

        """
        msg_type, data = json.loads(msg[0])

        if msg_type == 'ping':
            rep = self.ping_handler.make_pong(data)
            self.rep_stream.send_json(rep)

        elif msg_type == 'plzdiekthxbye':
            self.stop()

        else:
            raise RuntimeError('Received unknown message type: %s' % msg_type)

There are a couple of things to note here:

  • I instantiated the PingHandler in the process’ __init__ method. If you are going to start this process as a sub-process via start, make sure everything you instantiate in __init__ is pickle-able or it won’t work on Windows. Linux and Mac OS X use fork to create a sub-process; fork just makes a copy of the main process and gives it a new process ID. On Windows, there is no fork, so the context of your main process is pickled and sent to the sub-process.

  • In setup, call super().setup() before you create a stream or you won’t have a loop instance for them. You don’t call setup in the process’ __init__, because the context must be created within the new system process. So we call setup in run.

  • The stop method is not really necessary in this example, but it can be used to send stop messages to sub-processes when the main process terminates and to do other kinds of clean-up. You can also execute it if you catch a KeyboardInterrupt after calling run.

  • handle_rep_stream is the message dispatcher for the process’ REP stream. It parses the message and calls the appropriate handler for that message (or raises an error if the message type is invalid). If your if and elif statements all do the same thing, you might consider replacing them with a dict that contains the handlers for each message type:

    handlers = {
        'msg': self.handler_for_msg,
    }
    try:
        rep = handlers[msg_type](data)
        self.rep_stream.send_multipart(rep)
    except KeyError:
        raise RuntimeError('Received unknown message.')
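
To make the dict-based dispatch from the last point a bit more concrete, here is a self-contained sketch without any PyZMQ parts. The Dispatcher class and its handler methods are hypothetical. Unlike the snippet above, it looks up the handler before calling it, which is my own design choice so that a KeyError raised inside a handler is not mistaken for an unknown message type:

```python
class Dispatcher:
    """Hypothetical dispatcher that maps message types to handler methods."""

    def __init__(self):
        self.handlers = {
            'ping': self.handle_ping,
            'echo': self.handle_echo,
        }

    def dispatch(self, msg_type, data):
        """Calls the handler registered for *msg_type* with *data*."""
        # Only the lookup is guarded, not the handler call itself.
        try:
            handler = self.handlers[msg_type]
        except KeyError:
            raise RuntimeError('Received unknown message type: %s' % msg_type)
        return handler(data)

    def handle_ping(self, count):
        return ['pong', count]

    def handle_echo(self, data):
        return ['echo', data]


dispatcher = Dispatcher()
rep = dispatcher.dispatch('ping', 2)
print(rep)
```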
    

PingHandler – The Application Logic

The PingHandler contains the actual application logic (which is not much, in this example). The make_pong method just gets the number of pings sent with the ping message and creates a new pong message. The serialization is done by PongProc, so our Handler does not depend on PyZMQ:

class PingHandler(object):

    def make_pong(self, num_pings):
        """Creates and returns a pong message."""
        print('Pong got request number %s' % num_pings)

        return ['pong', num_pings]
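
Because the handler is plain Python, you can try it out without a context, a socket or an event loop. The sketch below repeats the class (without the print call) so that it runs on its own. Using json.dumps here is only an approximation of what send_json would put on the wire:

```python
import json


class PingHandler(object):

    def make_pong(self, num_pings):
        """Creates and returns a pong message."""
        return ['pong', num_pings]


handler = PingHandler()
rep = handler.make_pong(3)

# Roughly the serialization step that PongProc delegates to send_json
wire = json.dumps(rep)
print(rep, wire)
```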

Summary

Okay, that’s it for now. I showed you three ways to use PyZMQ. If you have a very simple process with only one socket, you can easily use its blocking recv methods. If you need more than one socket, I recommend using the event loop. And polling … you don’t want to use that.

If you decide to use PyZMQ’s event loop, you should separate the application logic from all the PyZMQ stuff (like creating streams, sending/receiving messages and dispatching them). If your application consists of more than one process (which is usually the case), you should also create a base class with shared functionality for them.

In the next part, I’m going to talk about how you can test your application.

Book review: NumPy 1.5 Beginner’s Guide

I recently got the chance to review the book NumPy 1.5 Beginner’s Guide by Ivan Idris and published by Packt Publishing.

It covers many aspects of NumPy and also introduces SciPy as well as Matplotlib. The author includes a lot of examples and exercises and also shows the effects of some not-so-easy-to-understand functions using matplotlib graphs.

The book is easy to read, so you should make fast progress in learning NumPy. Overall, it’s a good read for NumPy beginners. Advanced NumPy users who just want to look up how specific things work are better off with NumPy’s documentation, though.

SimPy 2.2

SimPy is a process-based discrete-event simulation library written in pure Python.

Ontje Lünsdorf and I have already contributed to prior versions of SimPy and have now become members of the SimPy team.

In SimPy 2.2, we have restructured the package layout to conform to the Hitchhiker’s Guide to Packaging. We ported the unit tests (524 test cases) to pytest. We improved and cleaned up parts of the documentation, and finally, we fixed the behavior of Store._put (thanks to Johannes Koomer for the heads-up and the fix).

You can download SimPy from PyPI.

django-lastfm 1.0.1

Django-lastfm is a small Django app that allows you to embed your charts or recently listened tracks from Last.fm into your website. You can see the widget in action in the sidebar of this website.

Version 1.0.1 includes a few minor bugfixes and a more comprehensive test coverage.

You can find django-lastfm in the Cheese Shop or at Bitbucket.

Building NumPy, SciPy & Matplotlib for Python 2.7 on Snow Leopard

A few days ago I wrote about how to build SciPy for Python 2.7 on Mac OS 10.6 Snow Leopard.

Usually you want to install NumPy, SciPy and Matplotlib. After reading Installing SciPy/Mac OS X, the Matplotlib installation instructions and the HJBlog, you might come to the conclusion that it’s not trivial to build them on your own and that you had better use the 32-bit binaries for Python 2.6 or get them via MacPorts.

But actually it’s really easy. The only dependencies that you need to install are Xcode (for gcc and X11) and gfortran.

To simplify the installation, I wrote a small Makefile that downloads all packages (except for Xcode) and builds/installs them:

  1. Rename* the downloaded makefile to Makefile, open Terminal and cd to the directory with the makefile.

  2. Download gfortran and start the graphical installer:

    $ make fortran
    
  3. Download and install NumPy, SciPy and Matplotlib:

    $ make
    
  4. Run NumPy’s and SciPy’s test suites and delete the temporary build directory:

    $ make test clean
    

This is what the makefile looks like:

# Download all packages to this directory.
TMP_DIR=./PYTMP

# Select Python version.
PYVERSION=2.7
PYTHON=python${PYVERSION}

# Package to download from http://r.research.att.com/
FORTRANPACKAGE=gfortran-42-5664.pkg

# Select which versions of the packages you want to install.
NUMPYVERSION=1.5.0
SCIPYVERSION=0.8.0
MATPLOTLIBMAJORVERSION=1.0
MATPLOTLIBVERSION=${MATPLOTLIBMAJORVERSION}.0

# Normally, you shouldn’t need to change this.
OSX_SDK_VER=10.6
ARCH_FLAGS=-arch i386 -arch x86_64

# Values for some environment variables. You shouldn’t need to change this.
MACOSX_DEPLOYMENT_TARGET=${OSX_SDK_VER}
CFLAGS="${ARCH_FLAGS} -I/usr/X11/include -I/usr/X11/include/freetype2 -isysroot /Developer/SDKs/MacOSX${OSX_SDK_VER}.sdk"
LDFLAGS="-Wall -undefined dynamic_lookup -bundle ${ARCH_FLAGS} -L/usr/X11/lib -syslibroot,/Developer/SDKs/MacOSX${OSX_SDK_VER}.sdk"
FFLAGS="${ARCH_FLAGS}"


default: all

# Create a temporary directory for the build process
_tmp_dir:
    mkdir -p ${TMP_DIR}

# Clean the temporary build directory
clean:
    rm -rf ${TMP_DIR}

# Download gfortran and start the installer
fortran: _tmp_dir
    cd ${TMP_DIR} && \
    ${PYTHON} -c "import urllib; urllib.urlretrieve('http://r.research.att.com/${FORTRANPACKAGE}', '${FORTRANPACKAGE}')" && \
    open ${FORTRANPACKAGE}

# Download all required packages
fetch: _tmp_dir
    cd ${TMP_DIR} && \
    ${PYTHON} -c "import urllib; urllib.urlretrieve('http://sourceforge.net/projects/numpy/files/NumPy/${NUMPYVERSION}/numpy-${NUMPYVERSION}.tar.gz/download', 'numpy-${NUMPYVERSION}.tar.gz')" && \
    ${PYTHON} -c "import urllib; urllib.urlretrieve('http://sourceforge.net/projects/scipy/files/scipy/${SCIPYVERSION}/scipy-${SCIPYVERSION}.tar.gz/download', 'scipy-${SCIPYVERSION}.tar.gz')" && \
    ${PYTHON} -c "import urllib; urllib.urlretrieve('http://sourceforge.net/projects/matplotlib/files/matplotlib/matplotlib-${MATPLOTLIBMAJORVERSION}/matplotlib-${MATPLOTLIBVERSION}.tar.gz/download', 'matplotlib-${MATPLOTLIBVERSION}.tar.gz')"

# Extract, build and install NumPy
numpy:
    cd ${TMP_DIR} && \
    rm -rf numpy-${NUMPYVERSION} && \
    tar -xf numpy-${NUMPYVERSION}.tar.gz && \
    cd numpy-${NUMPYVERSION} && \
    export MACOSX_DEPLOYMENT_TARGET=${MACOSX_DEPLOYMENT_TARGET} && \
    export CFLAGS=${CFLAGS} && \
    export LDFLAGS=${LDFLAGS} && \
    export FFLAGS=${FFLAGS} && \
    ${PYTHON} setup.py build && \
    ${PYTHON} setup.py install && \
    cd .. && \
    ${PYTHON} -c "import numpy; print 'Installed NumPy', numpy.__version__"

# Extract, build and install SciPy
scipy:
    cd ${TMP_DIR} && \
    rm -rf scipy-${SCIPYVERSION} && \
    tar -xf scipy-${SCIPYVERSION}.tar.gz && \
    cd scipy-${SCIPYVERSION} && \
    export MACOSX_DEPLOYMENT_TARGET=${MACOSX_DEPLOYMENT_TARGET} && \
    export CFLAGS=${CFLAGS} && \
    export LDFLAGS=${LDFLAGS} && \
    export FFLAGS=${FFLAGS} && \
    ${PYTHON} setup.py build && \
    ${PYTHON} setup.py install && \
    cd .. && \
    ${PYTHON} -c "import scipy; print 'Installed SciPy', scipy.__version__"

# Extract, build and install Matplotlib
matplotlib:
    cd ${TMP_DIR} && \
    rm -rf matplotlib-${MATPLOTLIBVERSION} && \
    tar -xf matplotlib-${MATPLOTLIBVERSION}.tar.gz && \
    cd matplotlib-${MATPLOTLIBVERSION} && \
    export MACOSX_DEPLOYMENT_TARGET=${MACOSX_DEPLOYMENT_TARGET} && \
    export CFLAGS=${CFLAGS} && \
    export LDFLAGS=${LDFLAGS} && \
    ${PYTHON} setup.py build && \
    ${PYTHON} setup.py install && \
    cd .. && \
    ${PYTHON} -c "import matplotlib; print 'Installed Matplotlib', matplotlib.__version__"

# Fetch and build NumPy, SciPy and Matplotlib
all: fetch numpy scipy matplotlib
    echo "all done"

# Execute tests for NumPy and SciPy
test: _tmp_dir
    export MACOSX_DEPLOYMENT_TARGET=${MACOSX_DEPLOYMENT_TARGET} && \
    export CFLAGS=${CFLAGS} && \
    export LDFLAGS=${LDFLAGS} && \
    export FFLAGS=${FFLAGS} && \
    cd ${TMP_DIR} && \
    ${PYTHON} -c "import numpy; numpy.test('1', '10')" && \
    ${PYTHON} -c "import scipy; scipy.test('1', '10')"

So far, I have only tested it on my local machine. I hope it helps and you don’t run into any trouble.

* If you don’t rename the makefile, you must pass its name to make: make -f Makefile_nsm <target>.

Building SciPy on Snow Leopard with Python 2.7

I recently struggled a bit to build and install SciPy 0.8.0 on Mac OS X 10.6 Snow Leopard, but actually it’s quite easy.

I used the instructions on scipy.org and the HJBlog as source. Since there are builds of NumPy 1.5.0 for Python 2.7 available, you don’t need to install fftw and umfpack manually.

You only need to install gfortran from this site. Pick the latest build for Snow Leopard (e.g. this one).

Next, install NumPy with the disk image from SourceForge or with pip:

$ pip install numpy

To build and install SciPy, download the latest version from SourceForge and do the following:

$ tar -xf scipy-0.8.0.tar.gz
$ cd scipy-0.8.0
$ export CFLAGS="-arch i386 -arch x86_64"
$ export LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386 -arch x86_64"
$ export FFLAGS="-arch i386 -arch x86_64"
$ export MACOSX_DEPLOYMENT_TARGET=10.6
$ python setup.py build
$ python setup.py install

Everything in one command:

$ CFLAGS="-arch i386 -arch x86_64" LDFLAGS="-Wall -undefined dynamic_lookup -bundle -arch i386 -arch x86_64" FFLAGS="-arch i386 -arch x86_64" MACOSX_DEPLOYMENT_TARGET=10.6 python setup.py build install

I hope this works for you as well as it did for me.

django-lastfm 1.0

Django-lastfm is a small Django app that allows you to embed your charts or recently listened tracks from Last.fm into your website. You can see the widget in action in the sidebar of this website.

I raised its version to 1.0 since there have been no problems for a long time and there are also no features I want to include.

You can find django-lastfm in the Cheese Shop or at Bitbucket.

django-sphinxdoc 1.0

Most Python projects use Sphinx for their documentation. And many (most?) Python powered websites use Django as framework.

So there might be some people who use both Sphinx and Django. If you belong to this group and want to integrate the documentation of your projects into your Django powered website, django-sphinxdoc might be the app you’re searching for.

Django-sphinxdoc can build and import your Sphinx documentation and provides views for browsing and searching it. You can see django-sphinxdoc in action by reading its documentation.

What’s new in this version?

  • You can now search the documentation (via Haystack).
  • New management command updatedoc for importing and building JSON files from your documentation and updating Haystack’s search index.
  • New model Document for JSON files.
  • Renamed the App model to Project

What’s planned for the future?

  • Allow users to comment on the documentation.

You can find django-sphinxdoc in the Cheese Shop or at Bitbucket.

Collectors 1.0

It took me nearly three months to fix five small issues with the documentation. But now I finally released Collectors v1.0. :-)

You can read everything important in the RC1 posting.

Collectors 1.0-rc1

Collectors made a huge jump from v0.1 to v1.0 over the last weeks, since we made lots of changes and consider what we’ve done stable. If we don’t find any bugs, we’ll release the final v1.0 at the end of next week. The changes are:

  • [NEW] Documentation!
  • [NEW] Tests!
  • [NEW] Collectors can use different storage backends now
  • [NEW] Storage backend for PyTables
  • [NEW] Storage backend for MS Excel
  • [NEW] Collector.collect() as alias to Collector.__call__()
  • [CHANGE] Monitor is now called Collector
  • [CHANGE] Shortcut functions moved to shortcuts package

You can download and install Collectors via PyPI. If you find any bugs or have ideas for further enhancement, please let us know.

The Documentation can be found here.

Updates for django-lastfm and django-sphinxdoc

After reading The Hitchhiker’s Guide to Packaging I updated my packages for django-lastfm and django-sphinxdoc.

There are no functional improvements, so you don’t need to update them.

Django: Highlighting with ReST using Pygments

In my last post I wrote about code highlighting in HTML formatted text, but since I’m now using reStructuredText for markup, I want to share how I added syntax highlighting for it.

Within your project directory (the same that contains the settings.py) create a file called rst_directive.py and fill it with the following code:

# -*- coding: utf-8 -*-
"""
    The Pygments reStructuredText directive
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    This fragment is a Docutils_ 0.5 directive that renders source code
    (to HTML only, currently) via Pygments.

    To use it, adjust the options below and copy the code into a module
    that you import on initialization.  The code then automatically
    registers a ``sourcecode`` directive that you can use instead of
    normal code blocks like this::

        .. sourcecode:: python

            My code goes here.

    If you want to have different code styles, e.g. one with line numbers
    and one without, add formatters with their names in the VARIANTS dict
    below.  You can invoke them instead of the DEFAULT one by using a
    directive option::

        .. sourcecode:: python
            :linenos:

            My code goes here.

    Look at the `directive documentation`_ to get all the gory details.

    .. _Docutils: http://docutils.sf.net/
    .. _directive documentation:
       http://docutils.sourceforge.net/docs/howto/rst-directives.html

    :copyright: Copyright 2006-2009 by the Pygments team, see AUTHORS.
    :license: BSD, see LICENSE for details.
"""

# Options
# ~~~~~~~

# Set to True if you want inline CSS styles instead of classes
INLINESTYLES = False

from pygments.formatters import HtmlFormatter

# The default formatter
DEFAULT = HtmlFormatter(noclasses=INLINESTYLES)

# Add name -> formatter pairs for every variant you want to use
VARIANTS = {
    # 'linenos': HtmlFormatter(noclasses=INLINESTYLES, linenos=True),
}


from docutils import nodes
from docutils.parsers.rst import directives, Directive

from pygments import highlight
from pygments.lexers import get_lexer_by_name, TextLexer

class Pygments(Directive):
    """ Source code syntax highlighting.
    """
    required_arguments = 1
    optional_arguments = 0
    final_argument_whitespace = True
    option_spec = dict([(key, directives.flag) for key in VARIANTS])
    has_content = True

    def run(self):
        self.assert_has_content()
        try:
            lexer = get_lexer_by_name(self.arguments[0])
        except ValueError:
            # no lexer found - use the text one instead of an exception
            lexer = TextLexer()
        # take an arbitrary option if more than one is given
        # (next(iter(...)) also works on Python 3, where keys()[0] fails)
        formatter = self.options and VARIANTS[next(iter(self.options))] or DEFAULT
        parsed = highlight(u'\n'.join(self.content), lexer, formatter)
        return [nodes.raw('', parsed, format='html')]

directives.register_directive('sourcecode', Pygments)

Open the file __init__.py in the same directory and add the following line:

import rst_directive

Furthermore, you might want to create a CSS file containing all the style definitions in your static media directory. The following Python script will do the job:

# Call it this way:
# python gen_css.py pygments.css
import sys

from pygments.formatters import HtmlFormatter

# You can change style and the html class here:
css = HtmlFormatter(style='colorful').get_style_defs('.highlight')

# The with-block closes the file even if writing fails
with open(sys.argv[1], 'w') as f:
    f.write(css)

Now you can highlight code using the sourcecode directive:

*Hello World!* in Python is so easy:

.. sourcecode:: python

    print 'Hello World!'

That’s it! If you have any questions, leave a comment.

Django: Highlighting in HTML using Pygments and Beautiful Soup

Two months ago I wrote about Django, Pygments and Beautiful Soup.

A few days after the post I switched from HTML markup to reStructuredText and thus didn’t need my filter anymore and forgot to post more about it.

Today I received a comment asking me to publish the source code of the template filter—so here it is:

# encoding: utf-8

"""
A filter to highlight code blocks in html with Pygments and BeautifulSoup.

    {% load highlight_code %}

    {{ var_with_code|highlight|safe }}
"""

from BeautifulSoup import BeautifulSoup
from django import template
from django.template.defaultfilters import stringfilter
import pygments
import pygments.formatters
import pygments.lexers


register = template.Library()

@register.filter
@stringfilter
def highlight(html):
    soup = BeautifulSoup(html)
    codeblocks = soup.findAll('pre')
    for block in codeblocks:
        if block.has_key('class'):
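            # Join the block's children back into one string to highlight
            code = ''.join([unicode(item) for item in block.contents])
            # The class attribute names the Pygments lexer, e.g. "python"
            lexer = pygments.lexers.get_lexer_by_name(block['class'])
            formatter = pygments.formatters.HtmlFormatter()
            code_hl = pygments.highlight(code, lexer, formatter)
            # Swap the original contents for the highlighted markup
            block.contents = [BeautifulSoup(code_hl)]
            block.name = 'code'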
    return unicode(soup)

Copy the code into a file called templatetags/highlight_code.py within a new or an existing Django app.

highlight() searches the passed HTML for <pre> tags whose class attribute names the lexer to be used, e.g.:

<p><em>Hello World!</em> in Python:</p>
<pre class="python">
print 'Hello World'
</pre>
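To make the lookup step more concrete, here is a minimal sketch of how such <pre> blocks and their lexer names could be collected. It is not the filter's actual code: it uses the standard library's html.parser instead of BeautifulSoup, and the names PreBlockFinder and find_code_blocks are made up for this illustration. Unlike the filter, it keeps only the text inside a block and drops any nested tags.

```python
from html.parser import HTMLParser


class PreBlockFinder(HTMLParser):
    """Collect (lexer_name, code) pairs from <pre class="..."> blocks.

    Hypothetical helper for illustration; the real filter uses BeautifulSoup.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []    # finished (lexer_name, code) pairs
        self._lexer = None  # class of the <pre> we are currently inside
        self._parts = []    # text chunks of the current block

    def handle_starttag(self, tag, attrs):
        if tag == 'pre':
            # A <pre> without a class attribute is skipped, like in the filter.
            self._lexer = dict(attrs).get('class')
            self._parts = []

    def handle_data(self, data):
        if self._lexer is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == 'pre' and self._lexer is not None:
            self.blocks.append((self._lexer, ''.join(self._parts)))
            self._lexer = None


def find_code_blocks(html):
    """Return a list of (lexer_name, code) tuples found in the HTML string."""
    finder = PreBlockFinder()
    finder.feed(html)
    finder.close()
    return finder.blocks
```

Each pair could then be handed to pygments.lexers.get_lexer_by_name() and pygments.highlight(), as the filter above does.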

Furthermore, you might want to create a CSS file containing all the style definitions in your static media directory. The following Python script will do the job:

# Call it this way:
# python gen_css.py pygments.css
import sys

from pygments.formatters import HtmlFormatter

# You can change style and the html class here:
css = HtmlFormatter(style='colorful').get_style_defs('.highlight')

# The with-block closes the file even if writing fails
with open(sys.argv[1], 'w') as f:
    f.write(css)

I hope I explained all this well enough—if not, leave a comment and I’ll update this post. :-)

Django-sphinxdoc 0.2 now with documentation!

Just released version 0.2 of django-sphinxdoc. It can now display the documentation itself, the general index and the module index. The source and static views as well as the search functionality are not implemented yet.

I’m still pondering whether to use a Google custom search or the Sphinx JavaScript search. Maybe it would be even better if I used Haystack.

You can find a quick guide as well as some other guides in django-sphinxdoc’s documentation section (which is of course powered by django-sphinxdoc ;-)).

The download can be found at bitbucket.org.

Documentation for django-lastfm with django-sphinxdoc

Yesterday I finished a first version of django-sphinxdoc, which integrates Sphinx documentation into a Django-powered website. It’s based on Django’s documentation app, but can manage the documentation for more than one app. I’ll post more on this later.

More importantly, I have put the documentation for django-lastfm online with it. :-)

django-lastfm

On my old Wordpress blog I had a widget that let me display some of my last.fm stats. Since there was no such widget for Django powered sites (or I didn’t search well enough), I created my own as you can see in the right column.

You currently can choose between your recently listened tracks, your weekly artist chart and your top artists. I’ve created a bitbucket project for its further development. So go clone it and give some feedback.

PS: I’ll update/write the documentation within the next few days … ;-)

A BeautifulSoup with Django and Pygments

Just added syntax highlighting using BeautifulSoup and Pygments. I took the SaltyCrane Blog for inspiration, but in contrast to it, I implemented it as a template filter in a separate application. This is surely not the most performant way, but I’m planning to use memcache, so I think this is ok.

Here is an example of how to use the filter:

{% load highlight_code %}
{{ my_var_with_code|highlight|safe }}
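The caching idea mentioned above could look roughly like this. It is only a sketch under assumptions: a plain dict stands in for a real memcached client, and cached_highlight and its highlight_func parameter are hypothetical names, not part of the filter. The point is that the expensive highlighting runs once per distinct piece of HTML.

```python
import hashlib

# Assumption: a plain dict stands in for a memcached client here.
_cache = {}


def cached_highlight(html, highlight_func):
    """Return highlight_func(html), computing it once per distinct input."""
    # Hash the content so the key stays short, whatever the post length.
    key = 'highlight:' + hashlib.sha1(html.encode('utf-8')).hexdigest()
    if key not in _cache:
        _cache[key] = highlight_func(html)
    return _cache[key]
```

With a real cache backend the dict lookup and assignment would become get and set calls on the client, and an expiry time would keep stale entries from piling up.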

More Feeds

Just added Atom Feeds for post categories and post comments. They are available in the category detail and post detail views.

Comments and an Atom Feed

Small update for this site: Comments are now working and there is an Atom Feed for the latest posts in this blog.

The behaviour and the style of the comments is not very polished yet, but I’ll work on that later. I’ll also add more feeds, e.g. for post comments, categories and tags.

Magic! Python! Django! Whee!

This blog is now run by Django. I didn’t really like Wordpress, which I used before. I also don’t like PHP anymore (I really liked it some years ago, but everything changed when I learned Python …).

When I told a friend that I wanted to switch from Wordpress to something else, he just said: «Use Django.» So I took a look at the Tutorial and was instantly thrilled.

At first, I wanted to use an existing weblog app, but I also wanted to code an app of my own, and since I didn’t find a weblog app that I liked 100%, I just decided to write my own. So here it is (far from finished though)!

Features so far:

  • Basic post model (title, slug, pub date, modify date, status, category, HTML body)
  • Post manager for post counts (per year, month, category (and tag))
  • Hierarchical category model – imho, this is the highlight of my app.
  • Archive and Categories
  • Some special template tags.
  • Usage of a recursive template tag for the categories
  • Unit- and doctests for the models and template tags
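To illustrate the hierarchical categories and the recursive rendering from the list above, here is a small framework-free sketch. It is not the app's actual model: Category and render_tree are plain-Python stand-ins, assuming that each category only stores a reference to its parent, whereas the real app uses a Django model and a template tag.

```python
class Category:
    """A category that may have a parent and any number of children.

    Hypothetical stand-in for the Django model described above.
    """

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def path(self):
        """Return the names from the root category down to this one."""
        names = []
        node = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


def render_tree(categories, depth=0):
    """Recursively render categories as lines of an indented list."""
    lines = []
    for category in categories:
        lines.append('  ' * depth + '- ' + category.name)
        # Recurse into the children, one indentation level deeper.
        lines.extend(render_tree(category.children, depth + 1))
    return lines
```

For example, with python = Category('Python') and django = Category('Django', python), calling django.path() walks up the parent links, and render_tree([python]) produces one line per category, indented by depth.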

Features to come:

  • Comments
  • Atom feeds
  • Sidebar
  • Static pages
  • Trac integration for my projects
  • Search
  • Last.fm sidebar widget
  • Tags for posts (via django-tagging?)
  • CKEditor integration
  • Pygments integration
  • Improved image/media support for posts
  • Admin actions for posts (e.g. change status and category)
  • Preview function for new posts

That’s it so far. When I have Trac running, I’ll also publish the source of my applications. :-)
