OddThinking

A blog for odd things and odd thoughts.

Python Message-Buffering Logging Handler Design Notes

In this article, I am rubber-ducking with my blog readers, to help me think my way through this design issue. It worked. I overturned some decisions when I saw my justification written down.

Motivation

You may recall I had a serious performance issue with the design of Python logging when it was configured to send email.

I turned off the emails, and added a mid-priority item to my To Do list to come up with a solution. In the mean-time, I have been working on higher priority items to do with data collection and processing.

Yesterday, I discovered that a long-running thread had failed weeks ago. It logged a critical error – which wasn’t emailed – and got rapidly lost in a log-file of hundreds of thousands of entries. Meanwhile the application appeared to be doing its job. It wasn’t until I started to process the collected data that I found that the data collected was hopelessly corrupted without this thread. Gaah! That was a huge disappointment; I have a lot of theories to test with that data.

So, now I find myself with the both the strong motivation to fix this problem and some unexpected free time, because I don’t have any data to process.

Goal

I am starting with a “stretch goal”. If I can’t achieve that in a reasonable time, I have a fall-back goal. If even that is too hard, I have a fallback, fallback goal.

Initial Goal

The initial goal is to produce a customisation of the Python logging component that can be used by anyone who has the same general needs as me – that is, a general need for logging and real-time performance goals that preclude unpredictable delays on I/O.

It should hide the random I/O delays of the logging subsystem from the client, by putting them in a separate thread. It should follow the customisation framework provided by Python logging system, and work with any other customisations in that framework.

If I do a good job here, I can post the source-code in some public source repository, and lobby the component maintainer (Vinay Sanjip, who commented on the issue here earlier) to include a similar component in future Python releases.

Fall-Back Goal

The first fall-back goal is to produce a solution that works with most, typical users of the logging framework, even if it has some constraints that it can’t handle the more esoteric customisations.

If I do a good job here, I can post the source code on this blog, and solve the problem for a few others.

Fall-Back, Fall-Back Goal

I can hard-code a solution that only works for me, and bugger the whole giving-back-to-the-community idea.

Background – Simplified Logging Flow

Just to bring the newbies up to speed:

  1. Client gets logger from logging module by name.
  2. Client calls logger to log a message.
    1. Logger applies filters, if any.
    2. Logger informs handlers, if any.
      1. Handler locks for thread-safety.
      2. Handler applies more filters, if any.
      3. Handler applies formatter.
      4. Handler does work of emitting (storing/sending) the message.
    3. Logger recursively propagates to parent loggers. (Optional)
This is plagiarised from an upcoming presentation I might be giving at a local Python user’s group. In that talk, I give a lot of context. Sorry if there isn’t enough here.

Basic Design

Within this framework, my goal becomes to reduce the time that the handler has the lock, and ensure it isn’t dependent on I/O.

Ideally, the goal would be to make that operation time-bounded, but I can’t promise that within Python.

I have no goal to optimise the filters; they don’t normally depend on I/O.

The basic design is to have a new subclass of Handler. It acts as a proxy to the real (or “protected”) handler, which is passed in as a parameter to the constructor. The proxy handler accepts requests, pushes them to a queue and returns immediately. A consumer thread monitors the queue, and makes the appropriate calls on the protected handler. That is a very simple design.

Issue: No Return Values

The simple design does have a drawback; it cannot support methods that return values. Most methods on the Handler object are procedures, not functions (sorry, is my Pascal showing?) so that isn’t a big problem. The only method that returns a value is format(). It isn’t something a client would normally call; normally it is only called from the handler itself; it is a de-facto protected method. So, the first constraint to be added is:

Constraint 1: The proxy handler does not offer the format() method to the client. It is still available to the protected handler. If the client maintains a reference to that handler, it could still be called, but that may not be thread-safe.

Issue: Access to Mutable Objects

Log messages may have parameters passed in with them. These parameters are normally processed by the formatters – typically just calling their str() function. However, they may also be be accessed by the filters and handlers.

Those parameters are often constant strings, which is fine. However, if they are mutable objects, two problems may occur.

Firstly, we cannot guarantee that the objects are thread-safe, and hence won’t be corrupted by being accessed by the handler function at the same time that the client code is continuing to operate.

This breaks Julian’s first rule of logging systems: First, do no harm.

Secondly, even if they are thread-safe, they may have changed values between the time the message was logged and the time the item was emitted.

Consider this code:

if speaker.volume() > max:
   logging.warning("Muting speaker is louder than %d dB: %s", speaker, max)
   speaker.mute()

If speaker.__str()__ includes the volume in its output, we risk it outputting:

Warning: Muting speaker that is louder than 30 dB: Speaker 1, volume= 0dB.

This is misleading, and breaks Julian’s second rule of logging systems: Reveal, don’t obscure, what happened.

The remedy is for the proxy to immediately evaluate the mutable object’s str() functions in the client’s thread, before passing them on to the protected proxy. In particular, that means parameters explicitly passed in the log call. (Implicit data, such as source file names and line numbers are (believed) immutable, and can be safely used.)

This, however, leads to some more constraints.

Constraint 2: The str() function will be called on parameter objects passed to the logging call, in the client’s thread, even if they are never emitted, due to filters. This may have performance impact on the main thread, especially for debug-level messages.

Constraint 3: Handlers, formatters and filters may not rely on parameters that are explicitly passed in with the logging command. They will be replaced with an already-formatted sub-string.

Constraint 4: Similarly, the int() function will be called on parameters passed to setlevel()

Hmmm… These constraints are tricky to reason about and rather severe. Can they be softened?

One way to soften them is to tackle the filtering. It could be improved by moving filtering over to be done in the main thread. Rather than passing the handle() call over to the protected handler, it could be executed in the client thread. Then the filtering would be done in the client thread too.

That assumes that the protected handler doesn’t override the handle method. Looking at some examples of subclasses, that doesn’t seem to happen. The handle() method is de-facto private.

It also allows the str() functions to only be called on items that pass the logger’s filter and the handler’s filter.

Now, we can replace the last batch of constraints.

Constraint 2A: The str() function will be called on objects, in the client’s thread, only if they are to be emitted. They should not call I/O operations, as that would have an unpredictable performance impact.

That is a trivial constraint. I am happy with that.

Constraint 3A: Handlers and formatters may not rely on parameters that are explicitly passed in with the logging command. They will be replaced with an already-formatted sub-string. Filters will be run from a different thread to handles and formatters.

I still think we might be failing Goal 1, but that is an improvement.

Now that we are performing the filtering in the proxy, it raises the question of where the record of filters should be kept: in the proxy handler or in the protected handler.

Pros of storing it in the protected handler

  • Changes made to the filters by the client using a direct reference to the protected handler will take effect.

Cons of storing it in the protected handler

  • Changes to filters will not take effect immediately, until they are processed by the other thread.
  • The filter() method isn’t thread-safe if it is run at the same time as addFilter() or removeFilter() – unless you count relying on the Global Interpreter Lock.

On the balance, I would rather go with the filters being stored in the proxy, not the protected handler.

Summary

Method Handling

Methods on Handler Behaviour in Proxy handler Behaviour in Protected handler
Filterer.__init__() Called, but re-initialise list of filters from protected handler. As is.
Filterer.addFilter(), Filterer.removeFilter(), Untouched, run on proxy As is. Ignored after proxy handler is initialised.
Handler.__init__() Called, but re-initialise level from protected, and forget formatter. As is.
createLock(), acquire() and release() Untouched, run on proxy. As is.
format() Removed. Constrained to only receive partially formatted string and no explicit parameters.
emit() Partially evaluate message string. Push onto queue for later execution by other thread. As is.
handle() Used as is Never called.
setFormatter() Push onto queue for later execution by other thread As is.
flush() Push onto queue for later execution by other thread As is.
close() Push sentinel to queue, to tell it to shut down thread now. Call super.close() As is.
__del__() If not closed, push sentinel to queue. As is.
handleError() As is. This is a separate method to that used by protected handler. This one is used by either thread, when in the proxy handler code. As is

Constraints, Restated

Formatters will not receive parameters passed on the command line – they will only receive the resulting message string, formatted according to the client’s provided format string.

The user-provided filters should be fast. I/O-bound filters and formatters are outside of the scope of this project.

If the client passes objects as parameters to the log functions, and they have user-defined __str__() methods, these methods should be fast. I/O-bound __str__() methods are outside the scope of this project.

Clients should not directly call format() on the proxy handler.

Protected handlers should not override handle(). Neither the protected handler nor clients should call handle() on the protected handler.

Once the proxy handler is initialised, any further calls to setLevel(), addFilter and removeFilter() made onto the protected handler are ignored by the proxy handler. Conversely, the protected handler is not informed of calls made onto the proxy handler. Best practice would be to only update the proxy handler, and not use the values stored in the protected handler.

Clients should give each of the protected and proxy handlers appropriate error handlers, to taste.

Re-stated, these don’t seem too bad.
3 CommentsCategories: S/W Dev

Comments

  1. Incidentally, don’t go up to a Briton and ask to Rubber Duck her. Importantly, if she accepts, do not then sit her down and tell her about Python logging.

  2. Python’s new concurrent.futures (futures on pypi) core module (3.2+, the pypi package supports 2.6+) should handle all of the shenanigans for you, including making thread safe copies of data, thread pooling, etc.

Leave a comment

You must be logged in to post a comment.

Web Mentions

  1. OddThinking » Python Consumer Thread Shutdown Design Notes