This is a project that I did some years ago for performing computer simulations based on discrete events. The program pretends to be versatile and very configurable in order to allow the user to simulate anything. By the way, using Python as its extension language, the user can define very complex scenarios with event triggers and processing functions. The result is a powerful simulation engine focused in the richness of scene definition rather speed. Therefore, it is a nice tool to try experimental simulation models before building a more specific simulator program.
First at all, I will explain how the simulator works. On the one hand, there is an event scheduler that processes the existing events by chronological order. On the other hand, there is an interface between the simulator core and the extension language interpreter, so it is possible to run some code associated to an event when processed. Following that explanation come a sample script and some concluding remarks.
How it works
When we talk about computer simulators, we can distinguish between continuous or discrete models. Regarding the last ones, the operation of the modelled system is represented as a chronological sequence of events.
Every event in that sequence has a timestamp, this is, the time instant when the event occours. All the events, when generated, are pushed into a special queue type called priority queue, waiting to be processed.
The program core
The core of the program is a class called Scheduler, and its mission is to run the event actions by the order imposed by the timestamp. This class uses the STL’s template for the priority queues, called priority_queue; and defines a comparison function in order to determine, given two events e1 and e2, which of these must be processed first. Thus, the events are stored in the priority queue as they are generated and the scheduler gets them, one by one, ordered by timestamp.
In order to model the events to be processed by the scheduler, there is a C++ class called Event. That is an abstract class, with no more functionality than to serve as basis for other classes derived from Event. Therefore, the simulator can handle many types of events; this is done by extending the base class and defining new event subclases with new behaviors.
Types of events
At this time, there are two types of events defined with their correspondent classes: CallbackEvent and CREvent. The first one is just handles a callback function that is called by the scheduler in the right instant. The last one is used to schedule pieces of code from a coroutine: a special function definition that stops its execution several times along its code to simulate pauses or time lapses of any kind.
The callback-type events
A callback is a function, defined in the extension language (Python), that models the behavior of an event. The user can use an event of this type to run arbitrary code in a given simulation time instant. When the simulator clock reaches that time, the scheduler will call that function.
The coroutine-type events
A coroutine is also a function, but there is a big difference between a callback event and a coroutine event: the first one will call a function and all is done, while the last one schedules an event for each pause -i.e. each yield statement- the coroutine does.
The CREvent class models an event of this type. Each time a coroutine executes a yield statement, a new event is scheduled just to awake again the coroutine function at the end of the time lapse. Then, the coroutine yields the execution control to the simulator core. Thus, the scheduler can now continue dispatching events.
The extension language

This simulator embeds a Python interpreter in order to serve the modelling needs by the user. The Python language became very popular in the last years and now it is being used in a lot of disciplines, from scientific applications to web development. This reason and the ease of the programming in this language qualify Python as a preferent choice for the extension language of this simulator. Additionally, the API for extending and embedding the language is clear and easy to understand.
Defining the primitive functions
There are four functions defined in this version; these are C++ static functions that use an special type called PyObject for their arguments and returning value. These functions must be registered in a table by the Py_InitModule function. They will appear into the Python interpreter as a module named simul.
The four functions defined are:
- start(): Starts the scheduler; the simulation begins.
- schedule_process(g): Schedules the coroutine g.
- schedule_callback(t, fn, data): Schedules the callback function fn to be called at time t and passing data as argument to it.
- clock(): Returns the current simulation time value.
Ending words
I have developed a tool quite simple. It is a versatile discrete event simulator very suitable for building preliminary simulation scenarios with no care of optimization nor speed issues. By the way, it is also a technical demonstration of the architecture of these type of simulation tools, so the simplicity is a desired feature.
There are several software packages for doing discrete event simulations: a good example is the SimPy package. People often develop their own simulation tools; these are very efficient and specialized programs. Finally, the added value of some of the most popular open source simulation tools is debt to the contributions of a community; these projects started as a simple but extensible tool and a lot of people began using it for their work by adding custom modules.
In future posts, I will extend this simulator with some useful features and I will expose some typical uses of this kind of tools.
Resources
Source code
I decided to write this library in 2000, after watching “The Simpsons” episode “Treehouse of Horror VI”.
In the last story of this episode, Homer Simpson jumps into a three-dimensional world. The full story, and specially this parallel world, are full of mathematical tributes: geometrical shapes, the Euler’s identity, a reference to the P versus NP problem, and so on. Among these mathematical references, a disturbing numerical formula attracted our attention: 178212 + 184112 = 192212

Homer Simpson in a three-dimensional world
The previous affirmation is obviously a joke, since it must be false according to Fermat’s Last Theorem. But if we do the operation with a calculator with less than 10 significant digits, one can believe in its truthfulness, since both sides of the equality match up in the 9 first of their 40 digits.
The equation falseness can only be proved with a calculator capable to handle the enough significant digits, and even though there are numerous tools that can perform calculus with arbitrary precision (like the phyton language, or bc, into the GNU project), I found interesting to design a single and generic library to deal with arbitrary-size numbers, which could solve this kind of problems and another similar challenges.
In this post I will present the library code and its basic structure, and in future publications we will try to improve the algorithms efficiency and we will show some practical examples.
Library specification
A natural binary-coded decimal (NBCD) system has been chosen, with sign and module agreement, so an arbitrarily big integer can be coded with a long enough BCD digit sequence.
Furthermore, in order to handle fractional numbers, the library uses fixed-point arithmetic, therefore the format (the BCD sequence) is divided into whole and fractional parts. Thus, fractional numbers can be approximated with arbitrary precision by simply allocating enough memory in the fractional part. This scheme can deal with arbitrary-size numbers, but in a static way, i.e., having previously allocated the needed memory.
Once the format has been chosen, let us describe some simple arithmetic operations:
The code
The library is coded in C++, but almost none of the language features are used in this limited version.
A number is coded in BigNumber class, which consists basically in a byte array and a sign flag. For simplicity and clarity, only one decimal digit is coded in each byte. The file bignum.h contains the declaration and bignum.cc contains some basic functions implementation.
The file config.h is a configuration point, where we can set the number of digits assigned to the whole and the fractional parts in the format.
The file test.cc contains a main function with an example application: evaluates both sides of the former equation 178212 + 184112 = 192212, and verifies its falseness. At least 40 digits are needed for doing the calculus (the default configuration reserves 50 digits for both whole and fractional parts).
The rest of the files codify several operations.
- addsub.cc: addition and subtraction.
- mul.cc: multiplication.
- div.cc: division.
- convert.cc: conversion between bignumbers and floating point numbers.
Links
Source code
Legal Notice: The Simpsons TM and (C) Fox and its related companies. All rights reserved. Any reproduction, duplication, or distribution in any form is expressly prohibited.
Disclaimer: This web site, its operators, and any content contained on this site relating to The Simpsons are not specifically authorized by Fox.
Related to the original post, I’m going to propose another way to solve a sudoku puzzle.
This program will try to find all the puzzle solutions in a more efficient way using a constraint oriented algorithm.
The basic idea and the source code dates from the year 2005, when the puzzle came into fashion.
The algorithm basis
The key factors are the uncertainty and the uniqueness constraints.
In a sudoku puzzle, we have 9×9 squares, each of them belonging to 3 uniqueness constraint groups: a row, a column and a 3×3 sub-grid. There are 3×9 restriction groups in total: 9 rows, 9 columns and 9 sub-grids.
Each square has a state: the digit set that can be fitted without violating the 3 uniqueness constraints, so the total table state will be the set of all the 81 individual states. Without any information, the initial state is the one with the maximum uncertainty: all the digits are possible in each square.
Furthermore, each square state gives us an uncertainty metric: the number of possible digits, so considering the uncertainty of all the squares, we can obtain a global uncertainty metric. Our strategy will be to reduce this uncertainty until reach the minimum state: the solution.
We have two ways to perform this reduction:
- Using direct rules: given a correct state, we move to other correct states that have progressively less uncertainty. No errors are made in this one-way type of movements.
- Using hypothesis: given a correct state, we make an assumption which takes us to another less uncertainty state that we suppose as correct . If we reach to an impossible state, we rule out the last hypothesis. These are the kind of rules considered in the original post.
Using the second kind rules in a standalone way involves a typical brute-force algorithm. But in this algorithm, combining the two types will allows us to reduce the space into which the hypothesis are made: the less the uncertainty is, the less assumptions we’ll need.
So the basic idea is to give priority to type 1 rules before making any hypothesis, but, how these rules are?.
The basic rule is: given a restriction group C, i.e., a set of 9 squares, let A the state of one of them (the set of possible digits of that square) and let B the set of squares of C with their state contained in A. If the cardinal of A (see cardinal number) is the same as the one of B, then the squares of C which aren’t contained in B (C\B) can’t contain in their states any element of A, so those elements can be removed from their states, and, consequently, the uncertainty is reduced.
In other words, if we want to distribute a set of N digits (A) among the squares of a restriction group (C), and there are exactly N squares (B) that accepts only digits from that set, my N digits necessarily have to be distributed among those N squares, and the other 9 – N (C\B) squares can’t receive any of them. A simple example: if a square of a row has its state fixed to one digit (no uncertainty), that digit can’t be placed on any of the 8 remaining squares of the same row.
One interesting thing is that when we reduce the uncertainty state of a square, we can trigger recursively a new reduction in their related squares, i.e., the other squares belonging to its restriction groups. This yields a notable global uncertainty reduction, given an initial information, without the necessity of making assumptions.
The algorithm is as follows:
- We start from the maximum uncertainty state: a void table where all the digits (1..9) are possible in all the squares.
- Use the initial table information to make type 1 reductions. If the puzzle is solved, we exit (many cases can be solved using only this technique).
- Make a possible assumption over the next square with uncertainty. We have three possible results:
- If the puzzle is solved, we exit.
- If we arrive at an inconsistent state, the last hypothesis was incorrect and we try the next possible one over the same square.
- Otherwise (the puzzle is not solved yet), we jump to the point 3.
The code
The program is written in C and has been tested under GNU Linux. More details about its usage and table format can be found in the README file.
The code is quite clear, so I’ll just give some guidelines to help its understanding:
- About the data model, the structures square_t and square_group_t models, respectively, a square and a restriction group. No more complex data structures are needed.
- As the square state is coded as a bit field (an unsigned short where the first 9 bits represent the possible digits), the following bitwise operations can be done:
- An uncertainty reduction operation consists in removing a bit subset from a bit set: a = a and not(b) (the set b is removed from a).
- A set inclusion can be detected checking the equality: a or b == b? (is a contained in b?)
- The number of considered digits (the number of set bits) can be computed with the built-in function __builtin_popcount.
- The digit coded by a set of bits with only one set bit can be obtained with the built-in function __builtin_ffs, wich returns the first set bit index.
- The method uncertainty_reduction performs a type 1 rule over a square (the recursion enables to trigger other reductions in the related squares), and the method make_assumption performs a type 2 rule over a square.
Results
Whereas the original post presented us a simple algorithm where some basic AI techniques and general concepts were used, this algorithm uses a more specific and optimized approach, and therefore the performance is higher for common sudokus.
Links
First Ordered List Element.
- Unordered List element.
- And another.
Some years ago, newspapers and magazines worldwide introduced us a novelty puzzle: the sudoku. The game gained popularity from its first publishing, in 1979, becoming an international hit in 2005.

A sudoku puzzle.
Then, we decided to design a computer program to solve the puzzle automatically, using AI techniques — just a programming challenge. In this post we show you how the program works; explaining first the theory and then how was it coded.
The algorithm basis
The method used to solve a sudoku is inspired on a classic AI technique: a heuristic search procedure.
Consider the whole space of sudoku puzzles, this is, 9×9 matrixes, satisfying or not the yet well-known game rules — let’s call ‘state’ to any of these matrixes. Consider also transitions among states representing the possible settings of a given digit for a given empty cell in a sudoku table. Therefore, there are some sequences of states —differing only a digit between a given state and the next one— that lead to the final state: the puzzle solved.
But think about it: there are millions of possibilities, and only a few ones are valid. How can we find the good ones avoiding to run thru the whole tree of states? Using an heuristic is the answer.
In order to let the algorithm to find a good sequence of states (to solve the puzzle), we must define an utility function. That functions gives us an estimation of the goodness of any possible movement. Then we can use it applied on any empty cell to know how good is to try filling that cell with any valid digit. Choosing the right cell to fill is very important and filling it should reduce the most uncertainty as possible.
So we need an heuristic with an utility function that depends on the number of possibilities going from a given state to another valid one. The function we have choosen is the one that gives the number of candidates for a given empty cell in the puzzle, so if a cell can fit any of N digits without violate the puzzle rules the function will return N.
Then we operate recursively on the puzzle matrix, choosing the empty cell with less candidates and trying to fill it with each one of those digits. If we reach a state with empty cells but no candidates we must undo some actions and keep on trying with other candidates and other cells. At the end, we get a state with no empty cell; that means the puzzle is solved.
The code
The program’s code is simple and easily understandable. There are three major functions, working together, to perform the heuristic algorithm. We defined a procedure to look for empty cells in a sudoku table, another procedure that tries to fill those empty cells with the candidates given by the third function: the heuristic itself. These functions are called, respectively, do_sudoku, try_a_number and find_candidates.
The do_sudoku function looks for blanks in a given sudoku, calculates the candidate digits for those blanks and tries to solve the puzzle filling the empty cell that has less candidates. All the work is done applying this rule recursively.
The try_a_number function iterates recursively the algorithm, trying to fit a candidate digit in a given empty cell. If it reaches a ‘dead-end’ state, returns undoing that try.
The find_candidates function looks for all fittable digits for a given cell. It checks if a digit is not being used in the same row, the same column and the same 3×3 block; in this case, it’s a valid candidate.
Results
After trying the program with several sudokus, we have observed that:
- This is a very simple algorithm with no optimizations. With the most of the cases it finds a solution quickly. Nothing is done in order to shortcut some trivial or repetitive cases.
- Memory consumption: Tables are backuped at each recursion. Therefore, the memory allocated is proportional to the maximum level of recursion achieved; this is equivalent to the number of empty cells at the initial state.
Links