A blog for odd things and odd thoughts.

Python’s doctest considered harmful.

This is a message from the Julian of 2010, to any Julian of the future who is starting out a new software project.

Message begins: Doctest is a interesting and clever testing harness built right in to Python. Try not to use it. Message ends.

On second thoughts, maybe it is a mistake to assume that messages to the future have limited bandwidth. I probably have room to explain further.

As you know, Python allows you to annotate each module, class and method with a “docstring” – a special multi-line comment which describes it, and is available to various meta-tools.

One way to describe a function is to provide examples on how to call it and what you might expect when it returns.

Here’s a snippet from the example in the doctest documentation:

def factorial(n):
    """Return the factorial of n, an exact integer >= 0.

    If the result is small enough to fit in an int, return an int.
    Else return a long.

    >>> [factorial(n) for n in range(6)]
    [1, 1, 2, 6, 24, 120]
    >>> [factorial(long(n)) for n in range(6)]
    [1, 1, 2, 6, 24, 120]

Doctest is built around an interesting idea.

Doctest offers a way of testing these examples. Call doctest.testmod() and it will examine the docstrings in the current module, looking for specially formatted examples (note the >>> prefix), execute the code and check it against the provided example results.

Python has an idiom for test cases.

If you put code like this at the bottom of your module:

if __name__ == '__main__':
    import doctest

it will be executed if you run the module as a program, but if you import it as a library for another module to use, it won’t be executed. It’s a good place to invoke unit-tests, so you can run them immediately when you edit the code, but they stay out of the way when the unit actually gets used in anger.

So doctest is a great way to check your code’s comments don’t get out of date. Well, one very limited aspect of your code’s comments. If that’s all you are using it for, that’s fine.

But, Julian, it’s not how you use it, is it? For simple functions with just two test cases, and two examples, why would you bother creating a whole unit-test, when you have everything you need already there?

Another test case? Why not just throw it in as another example, rather creating a separate unit-test. Two more? How confusing could it be?

Oh what’s that? You now have ten examples of how to call your code and it turns out it can be very confusing? That’s okay, because doctest has a great facility to extend testing into a separate file. Put all your examples in a file called “examples.doctest” and call doctest.testfile("examples.doctest").

Oh, don’t forget you’ll need to add an “import” statement, to load your original module… and probably more for all the types that appear in function parameters. But still, how about that! A really quick unit-test framework, for practically no effort!

Except that every time you try to execute the current document in your IDE, it will get shirty. Windows doesn’t know how to run a doctest, and nor does Python. You need to find the code file being tested, and run that instead.

After you have made that mistake 50 times on any a particular file, you’ll be tempted to hack around it. Put """ marks around the whole doctest file to make it a docstring, append the magic idiom for unit-tests, above. Then some magic cookies to the top for your Unix code, or teach your editor to run it as a Python file in Windows, or even rename it with a .py extension. Now you have a fully executable test code.

Oh, except every reference to the newline character, \n (or other special characters) needs to be double-slashed as \\n, but only when it is executed as testmod (when it is treated as a text file), and not as testfile (when it is treated as a docstring) so good luck working around that.

More importantly, look at what you’ve done here.

You’ve started using a new language. It is similar to Python, but not quite. Every line needs to be prefixed with >>> or ... to show it should be executed, the escaping rules are different, and, most importantly, all your fancy language-based tools can’t figure it out.

No more syntax-colouring.

No more more autocompletion.

No more pylint suggestions.

All that, just to avoid using a real unit-testing framework?

Not worth it. Just not worth it.

So, dear Julian of the future (sorry, may I call you Julian, or should I be addressing you as Mr President?), please don’t do that again, okay?

Oh, and if you are going to reply to this message, please remember to include this week’s lotto numbers. Thanks.


  1. The way I see it, docstrings are documentation for the method/class/module. They can contain code samples to illustrate the API. Running the code samples as “doctests” is simply a way that the documentation is correct.

    This doesn’t replace your test suite.

    When you say:

    Another test case? Why not just throw it in as another example, rather creating a separate unit-test. Two more? How confusing could it be?

    you should realize that you need a separate test suite.
    But that test suite is exactly that – a test suite. It’s not API documentation, and is not meant to be read by the user.

  2. Eyal,

    As far as I can tell, your comment is saying “Yes, Julian of 2010, everything you say is consistent with my understanding. You are right to have learnt this lesson.”

    Thanks! I hope future Julian is smart enough to listen to us both.

  3. I propose a new law of computer science:
    Any device or idea which you are tempted to illustrate using the factorial function is fine in theory but useless in practice.

    Examples: recursion, doctest. Others?

  4. Andrew,

    Err… I’m wondering whether your theory is, itself, being illustrated by the factorial function… 🙂

    Besides, I maintain recursion is an eminently practical way to negotiate trees, and that trees are a common and practical data-structure.

  5. I’ve been thinking of this since you posted it.
    Disclaimer: I never even heard of doctest before I read your blog post.

    It seems that doctest is a very good framework, for making sure your comments are correct up-to-date. It has *nothing* to do with testing though, and we all know what happens when libraries are used for things that they are not meant to do.

    Recursion is one of the most useful things I’ve ever learned, ranking right up there with functions and arrays. It is an important cornerstone of programming, and any programmer who would shy away from recursion simply because it is what it is, is not a very good programmer in my mind.

    Suppose you need to traverse one of the most complicated tree structures – a control tree in a Winforms application. Which is simpler?

    void Recursive(Control container) {
    // do something to the container control

    foreach (Control control in container.Controls)


    void NotRecursive(Control container) {
    Queue queue = new Queue();

    while (queue.Count > 0) {
    Control parent = queue.Pop();

    // do that same something to parent

    foreach (Control subcontrol in parent.Controls)

    There are many more complex examples, but the point is that many data structure are inherently recursive in programming: trees, XML data, parse trees (for compilers), etc. And while it’s possible to traverse a recursive data type with a non-recursive function, it’s a bit awkward.

    Also, there are many languages where recursion is useful even for normal operation. Take Prolog, or lisp, as examples. They both have highly-optimized tail-recursion, and that is a very common practice in both of them.

  6. 2, 15, 37, 29, 9, 14.

  7. It worked! I’m rich!

  8. Julian of the future: So far, you will have forgotten to say which draw that is. (Which tense is that again?)

    Julian/Configurator: Let it not be thought that I shy away from recursion. Some of my best friends are recursive. I’m in the middle of implementing something heavily recursive right now.

    I’m joking about the usual practice (when teaching recursion) of explaining how to implement factorial with it. If the most useful thing you can think of to do with your new toy is factorials… think some more.

  9. In a decent editor, you can have Python syntax highlighting in the docstrings:

    This is gVim with the “Slate“ colorscheme.

  10. Martin, I am afraid the link to the image site didn’t come through. I got the name of the site, but not the particular URL. Please submit it again, if you like.

Leave a comment

You must be logged in to post a comment.