OddThinking

A blog for odd things and odd thoughts.

Introduction to a Puzzle-Solving Architecture

In the past, I have blogged about several projects to solve puzzles with a common theme: Nonograms, Kakuro, Sudoku, Solitaire Battleships and Slitherlinks [in press].

My solutions shared a similar architecture – one that has been evolving over time.

The architecture is based on a common Artificial Intelligence approach of having a large number of different objects representing part of the puzzle (e.g. a coloured line or a square in the grid). Each object is trying to independently resolve some internal constraints. As they make progress and discover some information, they share the current information with their neighbours. Those neighbours use that to improve their situation, recursively, until, hopefully, a solution is found. The whole system is decentralised, with no master object driving the algorithm.

The area of the architecture that has changed the most is the message-passing mechanism.

Scope

This article focuses on that message-passing system and the variants I have tried.

The question I am trying to answer is: Each object knows about its direct neighbours. When it comes time to tell those neighbours some new information, how should it go about that?

I am sorry if this isn’t of general interest, but writing about it will probably clear it up in my head, and help me in future projects.

Recursion

The obvious solution is for the object simply to call each of the neighbours in turn, and have them check their constraints immediately, while the original object waits. If the constraints lead to new information, the neighbours recursively call their neighbours, until there is no new information to be shared, the stack unwinds and the original object completes.
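As a minimal sketch, assuming a hypothetical `Cell` class that tracks candidate values and forwards a newly-solved value to its neighbours:

```python
# Recursive message-passing sketch. When a cell is reduced to a single
# candidate, it immediately tells its neighbours, which recurse in turn.

class Cell:
    def __init__(self, name, candidates):
        self.name = name
        self.candidates = set(candidates)
        self.neighbours = []

    def eliminate(self, value):
        # Only propagate genuinely new information, or the recursion
        # would never terminate.
        if value not in self.candidates:
            return
        self.candidates.discard(value)
        if len(self.candidates) == 1:
            solved = next(iter(self.candidates))
            for neighbour in self.neighbours:
                neighbour.eliminate(solved)  # caller waits on the stack
```

Note that a cycle of neighbours (e.g. two cells pointing at each other) only terminates because of the guard at the top.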

This system is conceptually fairly straightforward, and it is easy to get the basic implementation working.

It is helpful in debugging, as the stack trace shows a complete history of how the information was passed around. (If you are debugging why a piece of wrong information was passed to an object, you can see who the culprit was.)

It is inefficient. For example, if the cell_1 object passes a message to line_1 and cell_2 objects, and then the line_1 object updates some data and passes another message to cell_2, then cell_2 is going to get two calls to its constraint checking. It will check its constraint twice rather than once.

Care must be taken to ensure that functions support re-entrancy. An object’s methods may be called back while it is still half-way through one of its own functions, so its invariants must all be met before it passes a message to its neighbours.

Animation hooks are a bit trickier. Many of these projects have an associated game board that is displayed to show progress. Rendering these graphics is CPU-intensive. Typically, I would refresh the screen after every constraint check during early debugging but only after every 50-or-so constraint checks during the final runs. Having a recursive solution means putting calls to the animation system throughout the code for the object. It couples the algorithm to the rendering.

If the puzzle is solved before all the constraints are checked, it is difficult to finish early. Similar to the animation hooks, checks for puzzle-completeness are required throughout the code.

Command Queue

Another implementation of message passing is to use a “command queue”.

Rather than using the run-time stack to keep track of the neighbours, a separate queue is created.

Rather than calling the neighbours directly, a request is pushed on the command queue for each of the neighbours. Each constraint check is now a short (probably constant time) operation that is guaranteed to return quickly.

The calling software needs to pop the first command off the queue, execute it (which calls the appropriate object) and repeat until the queue is empty or the puzzle is solved.

A minor improvement to reduce coupling is to move the command queue push operation into the neighbour’s code. So if a Line object needs to pass a message to a Cell object, it simply calls a method on the Cell object, which pushes the correct command type onto the queue, and returns immediately. This decoupling allows you to switch between the different message-passing techniques, without your client objects necessarily needing to know.

Animation and puzzle-completeness hooks can be moved into the calling code. They can perform the updates/checks after executing every command or every `n` commands, as desired.
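Putting those pieces together, here is a minimal sketch (the `Cell` class, its method names and the hook protocol are all illustrative): the decoupled push lives in `notify_eliminate`, and the calling loop owns the hooks.

```python
from collections import deque

queue = deque()

class Cell:
    def __init__(self, name, candidates):
        self.name = name
        self.candidates = set(candidates)
        self.neighbours = []

    def notify_eliminate(self, value):
        # Decoupled push: the neighbour enqueues the command itself
        # and returns immediately, so callers never recurse.
        queue.append((self.eliminate, value))

    def eliminate(self, value):
        if value not in self.candidates:
            return
        self.candidates.discard(value)
        if len(self.candidates) == 1:
            solved = next(iter(self.candidates))
            for neighbour in self.neighbours:
                neighbour.notify_eliminate(solved)

def run(hook=None, hook_every=50):
    # The calling software drives the algorithm and owns the
    # animation/completeness hooks.
    executed = 0
    while queue:
        command, value = queue.popleft()
        command(value)
        executed += 1
        if hook and executed % hook_every == 0:
            hook()
    return executed
```

A puzzle-completeness check would slot into the same loop, alongside (or instead of) the animation hook.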

Coding can afford to be a bit sloppier, if you know you are using this technique. You can afford to trigger the call to the neighbour object even before you have finished updating your local data to maintain the invariants; the code no longer needs to be re-entrant.

While I haven’t tried it, I think this technique supports multi-threading better. The items in the queue can be distributed amongst different threads – perhaps based on some partition of the affected objects to minimise the amount of locking required. I’ve been meaning to try this.

Again, I haven’t tried this, but if you have computer-assisted solving of the puzzles and a slow algorithm (e.g. brute force), it allows an opportunity for the user to make an update before the entire algorithm has completed. So the human can play in real-time, with the computer just slowly adding information as it discovers it.

The effect of an update is divorced from its cause. During debugging, I have found incorrect information lined up in the command queue, and it has been difficult to trace it back to the code that incorrectly created it.

The system remains inefficient. The example above of cell_2 updating twice continues to apply.

The purity of the bottom-up, decentralised nature of the implementation is tainted. There is now a centralised master object (i.e. the command queue) that every object needs to know about – either it is a global or it needs to be passed around to almost every object in the system.

Prioritised Command Queue

Some messages are more useful than others.

Consider two messages to a Sudoku cell. One says “Your value is not an 8. Please eliminate that from your list of possibilities.” The other says “Your value is a 3.” The latter message has more information content.

By giving each type of message a relative priority, and keeping the command queue sorted by that priority, you can ensure the more useful messages are processed first.

The less useful messages will either be abandoned (if the puzzle is solved before they are processed), ignored (once an object has resolved all of its own constraints using the more useful messages, it can discard further information quickly) or eventually processed (if the object is still looking for more information.)
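A sketch of the mechanics, using Python’s `heapq`: the priority constants are illustrative, lower numbers are processed first, and a counter breaks ties so equal-priority commands stay in FIFO order (and payloads are never compared).

```python
import heapq
import itertools

SET_VALUE = 0   # "your value is a 3" -- more information
ELIMINATE = 1   # "you are not an 8"  -- less information

_counter = itertools.count()
_heap = []

def push(priority, command):
    # The counter makes each entry unique, preserving insertion
    # order among commands of equal priority.
    heapq.heappush(_heap, (priority, next(_counter), command))

def pop():
    # Always returns the highest-priority (lowest-numbered) command.
    return heapq.heappop(_heap)[2]
```

The heap gives O(log n) pushes and pops, which is the "small cost in keeping the queue sorted" mentioned below.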

I also used this technique where some objects had multi-stage initialisation (e.g. before and after their neighbours were associated) but I eventually moved this responsibility to the object that created them, because it was simpler.

Prioritised Command Queues are an optimisation that is only applicable in some situations. There is a small cost in keeping the queue sorted.

Command Set

Looking at the command queue data reveals that there are generally many duplicate commands, like the cell_1 example, above.

Converting the command queue into a set can greatly reduce the number of redundant calls. Duplicates are simply discarded as they are added to the set.
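One way to sketch this, assuming each command can be keyed on its target and message type: a dict (insertion-ordered in Python 3.7+) gives both FIFO behaviour and duplicate removal.

```python
# Command-set sketch: duplicates are discarded on insertion, while the
# surviving commands keep their first-in-first-out order.

class CommandSet:
    def __init__(self):
        self._pending = {}  # dicts preserve insertion order (Python 3.7+)

    def push(self, target, message):
        # setdefault ignores the push if an identical command is pending.
        self._pending.setdefault((id(target), message), (target, message))

    def pop(self):
        key = next(iter(self._pending))
        return self._pending.pop(key)

    def __bool__(self):
        return bool(self._pending)
```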

Note: Compare these two different styles of message-passing:

1. “Some data has changed. Please fetch the latest data from your neighbours and update your constraints.”
2. “This is line_1. I can no longer be blue. Please use this information to update your constraints.”

If fetching data from neighbours is expensive, the latter might be more efficient. However, if the evaluation of the data is the expensive part, then the former style, plus the command set design, can save a large number of constraint checks.

A Prioritised Command Set is also possible, combining the duplication-removal features of the set with the best-data first features of the prioritised queue.

Iteration

During the Kakuro project, I made a gross simplification.

I dropped the message-passing between neighbours entirely. Instead, I had the calling program simply iterate repeatedly over every cell object, and tell it to check its constraints, whether its neighbours had changed or not.

This contained an inherent inefficiency: cells were called on to update even when their environment hadn’t changed in the meantime.

However, the code was much, much simpler.

If the neighbourhood connectivity is high compared to the number of objects, if it takes a large number of messages to an object before it can solve itself, and if there is only one message type, this is a worthwhile consideration.
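The whole scheme fits in a few lines; here is a sketch, with `check_constraints` standing in for whatever per-cell logic the puzzle needs:

```python
# Iteration sketch: sweep every cell repeatedly, asking each to check
# its constraints, until a full sweep produces no change (or, in a
# real solver, until the puzzle is solved).

def solve_by_iteration(cells, check_constraints):
    # check_constraints(cell) returns True if the cell changed state.
    sweeps = 0
    while True:
        sweeps += 1
        changed = False
        for cell in cells:
            if check_constraints(cell):
                changed = True
        if not changed:
            return sweeps
```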

Conclusion

I have experimented with a number of different designs. The Prioritised Command Set is the default that I reach for now. Now I can use this article to justify that decision to myself!

If I was ever hired to take these disposable prototypes and replace them with massively over-engineered production code, I could certainly see the opportunity for a puzzle-solving framework (including a Prioritised Command Set) to do most of the infrastructure work.

By the way, if you like this sort of stuff, you’d probably like Douglas Hofstadter’s Fluid Concepts and Creative Analogies book. It’s a lot more “cognitive” and serious than my attempts, but it’s interesting seeing a lot of different puzzle implementations by different people that also have a common theme.

1. Somehow all of these strike me as inelegant (although – perhaps surprisingly – Iteration the least so). What I dislike about all of them is that they have this arbitrary timing quality: elements will arrive at solutions at arbitrary times, making for an arbitrary call path through the problem space mainly dependent upon the particular problem under consideration.

The fact that you update the accompanying graphics at more or less arbitrary points is an obvious symptom.

I think I’d do something of which both the Command Set and Iteration approaches are (differently) degenerate forms. How it works is that there are two message queues, and the main loop pings every element in turn for which there are messages in the old queue. The element picks up all of its messages, updates itself, and then casts new messages into the new queue as required. At the end of the iteration the old queue flushes and the queues swap places. Elements that have arrived at their final value unregister themselves from the main loop so they won’t get pinged again.
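A sketch of the two-queue scheme just described, under the assumption that messages can be grouped by recipient: messages cast during one cycle are delivered only on the next, so information crosses the board at a fixed "speed of light", one hop per heartbeat.

```python
def run_heartbeat(deliver, initial_messages, max_cycles=100):
    # deliver(element, msgs) handles an element's inbox and returns
    # (element, message) pairs destined for the *next* cycle.
    old_queue = list(initial_messages)
    cycles = 0
    while old_queue and cycles < max_cycles:
        # Group this cycle's messages by recipient.
        inbox = {}
        for element, msg in old_queue:
            inbox.setdefault(element, []).append(msg)
        new_queue = []
        for element, msgs in inbox.items():
            new_queue.extend(deliver(element, msgs))
        old_queue = new_queue  # the queues swap; the old one flushes
        cycles += 1
    return cycles
```

Unregistering solved elements amounts to `deliver` returning nothing for them.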

That gives you a very clear arrow of time, a speed of light by which information travels across the board, and a very obvious hook updating the display.

Maybe I’m just permanently brain-damaged from implementing too many cellular automata in my free time in a former life.

2. Aristotle,

Some interesting thoughts, thank you.

Let me start at the sentence that struck me first:

The fact that you update the accompanying graphics at more or less arbitrary points is an obvious symptom.

OMG! You are right! I render the graphics by iterating through each of the items and asking them to render themselves onto a static image object. When they are complete, I display the image object. Each of the items has a “dirty bit” to indicate whether they have been updated since the last rendering. If not, they can skip re-rendering.

I immediately realised, after reading your comment, I should rather have had the object re-render itself as soon as it realises it is dirty. If that is too frequent, the object should at least register itself with some “sweep” mechanism as being dirty, rather than being polled. The more I can make the overriding manager object (generally called the “Grid” object in most of the projects) merely a collection of autonomous objects and not something that understands the rules of the game, the happier I would be.

I pondered for a minute why that wasn’t the original architecture. I realised it was an artifact of how the rendering system came about.

I originally just took a snapshot of the board state and created a (Python Imaging Library) image. I stored it to a file or displayed it as a static item on the screen. Every object needed to be rendered at the same time. Later, I learnt just enough of PyGame to display a series of static images on the screen and receive mouse-presses back, to do simple animation of the algorithm.

I obviously never bothered to learn to go fully native with PyGame and have parts of the screen update asynchronously, without consideration of the rest of the board. It is now on my To Do list. Thank you.

Elements that have arrived at their final value unregister themselves from the main loop so they won’t get pinged again.

I know at least one of the projects (Kakuro?) went some way down that path. Cells that considered themselves “solved” deregistered themselves from their neighbours’ lists.

Most of the projects have eschewed this sophistication and have simply had `if (solved) return;` at the top of their notification routines. (Both the methods that add the commands to the command queue, and the methods that are called when a command is popped from the queue.)

That gives you a very clear arrow of time, a speed of light by which information travels across the board

I certainly understand this motivation to have the solution proceed in a logical fashion.

When writing the detail of the Nonogram implementation, I wrote:

Another possibility that I did not implement was to try to make the order of the processing seem “more natural” to a human observing the animation. By focussing the processing on, for example, the top-left corner before processing equivalent commands on the bottom-right, I thought it might make the puzzle’s solution tend to appear as a gradual, yet intermittent (perhaps the right word is “punctuated”?) wipe, rather than the whole puzzle gradually being brought into focus with a series of quick waves.

I considered (briefly) how you could determine an “area of interest” for a human. Your solution (as with mine) would have ripples of change that radiate out from a clue. I think a human would tend not to work that way, but instead follow more linearly along one seam of information, possibly (but not definitely – with a probability reducing over time) returning to the centre once the seam was mined. Occasionally, if there was little progress in one area, they might jump randomly to another area out of boredom. (Actually quitting the puzzle in frustration and going to read the newspaper instead wasn’t the behaviour I was thinking of modelling!)

I mentioned Hofstadter’s book in the above article. That work was focussed on modelling the human approach to problem solving, rather than solving it in the fastest time.

To conclude, I see the ordering of the solution as an interesting area, and I certainly think your approach has merits. However, I don’t perceive my approach as being any less elegant (graphics rendering aside).

My breadth of experience with cellular automata is narrow – only Conway’s Game of Life – but I hope there are some out there that don’t use Conway’s firm drum-beat of generations, and instead have them procreate more evenly over time. That would be more like the way my code has been solving these problems.

3. I haven’t come across a cellular automaton that didn’t have a global clock cycle. If you did that, you’d introduce a bias into your simulated universe whereby messages in certain directions travel into the future. If you have a simple 2D grid, say, and you iterate from left to right (and top to bottom), then state changes will have a rightward effect within the current cycle while they can only have a leftward effect on the next cycle. So the order in which you ping elements affects the outcome. You need the past/future segregation of propagating state changes to remove this bias.

Of course such bias is not a problem in your case – it only affects the path taken towards the solution, not the final outcome. But I still dislike it on principle. There are other instances of similar phenomena, eg. simpleminded iterative substitution template engines, where it’s undesirable for modifications from the last and current iterations to mix. (Iterative substitution, of course, is naught but a very special case of a cellular automaton…)

I guess in my mind, once you cast a problem as a collection of actors, it is more like a universe governed by a set of laws to which the constituents are subject. It doesn’t have the same “feel” anymore as a search algorithm, which it would be if you cast it that way.

Another possibility that I did not implement was to try to make the order of the processing seem “more natural” to a human observing the animation.

I think this too is an artifact of your approach, and likewise is your realisation that you should let each agent update its display whenever it changed state. I think iterating over the entire board to redraw it at once is just fine, and artificially making the solution path more human-like is an uninteresting problem.

If the human likeness was natural, ie. a second-order effect arising from the fact that the underlying solver works the same way as human cognition, that would be interesting – but that is rather a large problem to tackle, and one I doubt is either in the scope of your interest or your reach. (Otherwise, I’m sure there are a few neuroscientists who’d like to talk to you…)

If you have a global heartbeat as I outlined, the solver algorithm won’t go on a recursive deep dive in one spot before it randomly resurfaces and wanders over to somewhere else. It would be quite natural to redraw the entire board once at the end of one main loop iteration, too. The solution finding would then proceed apace across the entire board, so “is anything happening?” phases are much less likely. What’s more, state changes that fix a value which happens to be a strong constraint in the particular problem at hand would cause a ripple effect that you could actually see. With an unconstrained depth-first recursion approach, the processing order is biased so that some agents affected by a state change will not be visited until long after it happened, concealing such ripple effects in many problem configurations.

4. Excellent article and discussion.

I have noticed that in more recent puzzle solving attempts you have played around with different approaches (eg the recent Virus puzzle). This would seem to be a key goal for the software design and you seem to have moved towards a design which enabled this, and yet it does not feature explicitly in your descriptions above.

In other words, there are two sets of rules that you wish to model: the rules of the puzzle itself and the rules of the puzzle solving algorithm. The former is fixed, and the latter needs to vary. The ease by which this can happen is a key success measure of the overall design.

(In some ways this conflicts with your everything-is-decentralised aesthetic, although it’s probably a matter of opinion.)

Anyway: It seems to me that your initial “simple” solutions such as recursion do not succeed in separating puzzle rules from the solution algorithm, and so can be considered undesirable for this reason (as well as the others). On the other hand, the prioritised command set approach does seem to allow this, as the subsequent discussion has revealed.

Thanks also for the pointer to Hofstadter – I haven’t picked it up in years but will do so now.

5. Aristotle,

I haven’t come across a cellular automaton that didn’t have a global clock cycle. If you did that, you’d introduce a bias into your simulated universe whereby messages in certain directions travel into the future.

That wasn’t quite what I was thinking. Imagine a global clock, but with real (i.e. floating-point, not integer) time.

Imagine cells with floating-point energy levels (hit points?), that are drained or charged by the presence of neighbours which may appear and disappear at any time. Cells might be born with the average of the initial hit-points of their parents, so they are spread out over the real number line.

You could even write the rules to have it degenerate to a standard (well-known) cellular automaton if the initial conditions happen to be that all the cells start with integer values.

I need to emphasize that I don’t have much experience with cellular automata, so I imagine I am reinventing the wheel here. It’s probably been done before.

6. Alastair,

This would seem to be a key goal for the software design and you seem to have moved towards a design which enabled this, and yet it does not feature explicitly in your descriptions above.

This is interesting; I think you have leapt further upwards with your thinking than I have.

You are suggesting that flexibility is a key goal. I am not sure I have been deliberately coding towards flexibility. I have just been coding towards what I thought was a better design. As I did, the inflexible pieces kept breaking off, and being replaced with more flexible components. There’s a difference there, I think. I didn’t start with the idea of a framework; but I re-used ideas often enough, it started to appear.

That’s probably why I didn’t mention it; I didn’t realise it was important.

there are two sets of rules that you wish to model: the rules of the puzzle itself and the rules of the puzzle solving algorithm.

I pondered that for a bit – where had I encoded the rules of the puzzle? I now see four parts to the code:

• The general puzzle-independent code that drives the algorithm, such as this message-passing stuff.
• The logic code (the rules of the puzzle-solving algorithm) that says “This is Sudoku and there is a 4 in this row, you can’t be a four.” or “This is Solitaire Battleships, and there is a ship north-east of you, so you can’t be a ship.”
• There is the game-board initialisation code which says “This is Solitaire Battleships, so initialise the game with 64 cells, and link up the neighbourhood links like an 8×8 square grid.” or “This is Virus 2, so initialise the game with 320 cells, and link up the neighbourhood like a 16×20 hex grid.” These are part of the rules of the game, but only the static part – what the board looks like at the beginning.
• Finally, there are the actual rules of the game. “Solitaire Battleships: you can’t have two different ships touching.”, “Sudoku: You can’t have two identical numbers in a single row.” My initial thought was “Hey, this is weird, but I never actually code those rules.” Later, I realised I did code those rules, but only in `assert` statements!

7. Aristotle,

Artificially making the solution path more human-like is an uninteresting problem.

If the human likeness was natural, ie. a second-order effect arising from the fact that the underlying solver works the same way as human cognition, that would be interesting

I had trouble with these statements, but I couldn’t place it.

The second statement I absolutely agree with. That’s the territory Hofstadter explores.

But I couldn’t simply dismiss “artificially human-like” as an uninteresting problem.

Today, I came up with a counter-example!

Mycin was an Artificial Intelligence experiment from the 1970s. It was a rule-based expert system (in the domain of blood-borne bacteria). It would fire off simple questions to the doctor about the patient’s signs and symptoms, and develop an idea of the most likely diagnosis and best treatment. The questions asked would depend on the previous answers.

Unlike a human, the software didn’t work by picking one bacterial species (or group of bacteria) and then testing to confirm or eliminate that conjecture. It was calculating the likelihood of every possible bacterium at once.

However, the developers found that if Mycin’s questions jumped around too much (tell me about Blood Test A, tell me about Blood Test B, tell me more about Blood Test A), it confused or worried the users. They made it seem more natural by changing the question order to look like it was following up a hypothesis, perhaps abandoning it based on the evidence and trying another hypothesis.

So, here was a case where they made no attempt to emulate human cognition at the lower-levels (it was still a rules-based engine, not a neural net or the like) but they made it “artificially human-like” in order to improve usability.

I clearly think it is an interesting case, because I still remember it well over a decade-and-a-half after hearing about it. (On the other hand, don’t trust my description of the project too closely for the same reason!)

8. Right, but the scales don’t quite compare – a puzzle solver vs. a medical expert system? I think the latter has a whole raft of additional concerns and then some – after all, it’s not an end unto itself. That might be an invalid argument if approaches to human-likeness developed in the context of the puzzle solver were applicable to a medical expert system, but that seems like more than a stretch to me.

What’s more, you skipped the part where I argued that the problem you were seeing is caused by giving each agent unbounded processing time and letting it decide when and which neighbours to yield any to. With that approach there’s no way to prevent the solution progression from degenerating into a series of local recursive deep dives. If instead you iterate over the entire board on each heartbeat, bounding the progress that each agent can make in a single cycle, then you get a steady solution progression across all the board, with all effects of any single event spreading in a uniform temporal progression, as opposed to being dependent on when an affected agent happens to be granted processing time by a neighbour.

(However, I see now that this post was written in the context of a puzzle helper, rather than a puzzle solver, which rather affects the concerns. Of course, when developing a helper, you want something that will behave not unlike a human, so that it and the user can coöperate more effectively. Just like in the case of the medical expert system, human-likeness is (essentially) a usability concern.)