Minsky Machines (part 2)

April 4, 2026


A framework for executing a Minsky machine

Welcome back to the series about my journey into Minsky machines. If you haven't already, check out the previous post. In this post I want to show the framework I built for simulating Minsky machines and how I used it to build a set of operations that ultimately allows converting any Turing machine into an equivalent Minsky machine.

Recall from the previous post that the Minsky machine itself has only two operations, acting on registers with non-negative integer values:

  1. Inc(r), which adds 1 to the register r
  2. Cdec(r), which checks whether r > 0: if so, it subtracts one and continues to operation n, while if r == 0 it immediately continues to operation p

To simulate a Turing machine, I needed a set of operations to simulate the tape, tape head movement and manipulations of the tape. I ended up with this list of arithmetic operations:

| Operation | Definition |
| --- | --- |
| $\mathtt{Add_N(r)}$ | $\mathtt{r \gets r + N}$ for some constant $\mathtt{N}$ |
| $\mathtt{Mul_N(r)}$ | $\mathtt{r \gets r * N}$ for some constant $\mathtt{N}$ |
| $\mathtt{Div_N(r)}$ | $\mathtt{r \gets \lfloor \frac{r}{N} \rfloor}$ for some constant $\mathtt{N}$ |
| $\mathtt{Add(r_1, r_2, r_r)}$ | $\mathtt{r_r \gets r_1 + r_2}$ |
| $\mathtt{Mul(r_1, r_2, r_r)}$ | $\mathtt{r_r \gets r_1 * r_2}$ |
| $\mathtt{Shl(r_1, r_2)}$ | $\mathtt{r_1 \gets r_1 << r_2}$, which is equivalent to $\mathtt{r_1 \gets r_1 * 2^{r_2}}$ |
| $\mathtt{Log2Ceil(r_1, r_r)}$ | $\mathtt{r_r \gets \lceil \log_2{r_1} \rceil}$ (with $\mathtt{\log_2{0} := 0}$) |
| $\mathtt{Clear(r)}$ | $\mathtt{r \gets 0}$ |
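As a sanity check, the intended semantics of the constant-parameter operations can be written down on plain integers. This is just a reference sketch (the function names are mine for illustration, not the post's API); the point is to pin down edge cases like flooring division:

```rust
// Plain-integer reference semantics (names are mine, for illustration only).
fn add_n(r: usize, n: usize) -> usize { r + n }  // Add_N(r)
fn mul_n(r: usize, n: usize) -> usize { r * n }  // Mul_N(r)
fn div_n(r: usize, n: usize) -> usize { r / n }  // Div_N(r): usize division already floors
fn clear(_r: usize) -> usize { 0 }               // Clear(r)
```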

I also found three conditional operations useful:

| Operation | Definition |
| --- | --- |
| $\mathtt{IsZero(r)}$ | $\mathtt{r == 0}$ |
| $\mathtt{IsEven(r)}$ | $\mathtt{r \mod 2 == 0}$ |
| $\mathtt{GE_N(r)}$ | $\mathtt{r >= N}$ for some constant $\mathtt{N}$ |

The cool thing about the Minsky machine is that we can build self-contained Minsky machines (you might call them functions) from the two basic operations, and these then act as fundamental operations for more complex machines. Here is an example for $\mathtt{Add_N(r)}$ with the specific value $\mathtt{Add_3(r_0)}$:

I added a start state to make the control flow clearer. Since $\mathtt{N}$ is a (compile-time) constant, the corresponding Minsky machine depends on the value of $\mathtt{N}$ and is made up of three basic $\mathtt{Inc(r_0)}$ operations. This can then be referenced from a more high-level operation, for example $\mathtt{SetTo3(r_0)}$, which sets the value of register $\mathtt{r_0}$ to $\mathtt{3}$:

This is the basic setup: building progressively more complex operations out of simpler ones. In order to get a feel for these operations and also to ensure their correctness, I wrote a program that simulates arbitrary Minsky machines and lets me build arbitrary operations. I'll first show how I implemented this and then I'll demonstrate how to build the various operations. You might want to try the latter out yourself with pen and paper first: it wasn't very hard and was quite satisfying.

A program for executing Minsky machines

I wrote everything in Rust because that is what I'm most comfortable with. I also wrote the code myself1 because I enjoyed the process. First I needed a representation for Minsky machines, i.e. some structure for storing programs as data. For parsing programming languages we typically use trees, but for Minsky machines directed graphs are more appropriate. I used the petgraph library because it had everything I needed (probably more, to be honest). Here is the basic code for encoding Minsky machines:

use petgraph::graph::Graph;

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum Order {
    Inc { register: usize },
    Cdec { register: usize },
    Halt,
}

#[derive(Copy, Clone, Debug, PartialEq, Eq)]
pub enum CoreEdgeType {
    Always,
    CdecZero,
    CdecNonzero,
}

#[derive(Clone, Debug)]
pub struct MinskyMachine {
    graph: Graph<Order, CoreEdgeType>,
}

The Order type is clear, I only added Order::Halt to represent the halting state explicitly. The CoreEdgeType is more interesting, as there are three types of edges! CoreEdgeType::Always is for edges that are always taken, like the edge out of an Inc(r) order. Cdec(r) on the other hand is a conditional operation, so we need a way to indicate whether the zero or the non-zero branch was taken, which is what the CdecZero and CdecNonzero variants do. We store everything as a Graph<Order, CoreEdgeType>, meaning the nodes are Orders and the edges have the aforementioned edge type. Since petgraph supports exporting graphs in the DOT format, I can render them directly with graphviz, yielding the images you see in this blog.

Now I needed to support nested operations. For this my idea was to introduce a second machine type called AbstractMachine which can store arbitrary operations that are either Orders or more complex sets of nested operations. I think my design is a little bit too complicated because I introduced an Operation trait and there is some overlap between AbstractMachine and this trait, but it works well enough. Here is the relevant code:

use std::borrow::Cow;
use std::collections::BTreeSet;
use std::fmt::Display;
use std::sync::Arc;

pub trait Operation: std::fmt::Debug + Display {
    fn required_registers(&self) -> BTreeSet<RegisterID>;
    fn flatten(&self) -> AbstractMachine;
    fn graph(&self) -> Cow<'_, AbstractMachine>;
    fn output_arity(&self) -> usize;
}

#[derive(Debug, Clone)]
pub enum AbstractOp {
    Order(Order),
    Operation(Arc<dyn Operation>),
    NamedHalt(String),
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum EdgeType {
    Core(CoreEdgeType),
    Custom(String),
}

#[derive(Debug, Clone, Default)]
pub struct AbstractMachine {
    graph: Graph<AbstractOp, EdgeType>,
}

Starting from the bottom, we now have a graph whose nodes are of the AbstractOp type, which wraps regular Orders, custom Operation implementations, and named halting states; the latter enable operations with multiple conditional halting states (like $\mathtt{IsEven(r)}$). To work with these named halting states we also need named edges, hence the EdgeType. An Operation can return a basic graph representation via fn graph(&self), but it can also return a flattened graph, which recursively converts every node into a graph made up entirely of the basic Orders. Most of the thought went into fn flatten(&self), because it needs to correctly link nested operations together, collapse their halting states, etc.
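To make the linking step concrete, here is a heavily simplified sketch of how two flattened machines can be spliced together in sequence: every edge into the first machine's HALT node is redirected to the start node of the second machine, keeping its edge kind. The types and the splice function here are my own stand-ins on a hand-rolled graph, not the actual petgraph-based implementation:

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Node {
    Inc(usize),
    Cdec(usize),
    Halt,
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum Edge {
    Always,
    Zero,
    NonZero,
}

#[derive(Clone, Debug)]
struct Machine {
    nodes: Vec<Node>,                 // node 0 is the start state
    edges: Vec<(usize, usize, Edge)>, // (from, to, kind)
}

// Sequence two machines: every edge into a's HALT is redirected to b's
// start node, keeping its original edge kind.
fn splice(a: &Machine, b: &Machine) -> Machine {
    let off = a.nodes.len();
    let mut nodes = a.nodes.clone();
    nodes.extend(b.nodes.iter().copied());
    let mut edges: Vec<(usize, usize, Edge)> = a
        .edges
        .iter()
        .map(|&(u, v, e)| if a.nodes[v] == Node::Halt { (u, off, e) } else { (u, v, e) })
        .collect();
    edges.extend(b.edges.iter().map(|&(u, v, e)| (u + off, v + off, e)));
    // a's old HALT node is now unreachable; real code would also remove it.
    Machine { nodes, edges }
}
```

For example, splicing a Clear(r0) machine with a single Inc(r0) redirects Clear's Zero edge from its HALT node to the Inc node.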

With that, we can execute any AbstractMachine by converting it into a MinskyMachine using flatten() and then running the following pseudocode:

regs ← (0, 0, …)
state ← START
while state ≠ HALT do
    if state is Inc(j) then
        regs[j] ← regs[j] + 1
        state ← nextNode(Always, state)
    else if state is Cdec(j) then
        if regs[j] > 0 then
            regs[j] ← regs[j] − 1
            state ← nextNode(NonZero, state)
        else
            state ← nextNode(Zero, state)
        end if
    end if
end while

It is straightforward: execute each Order (here referred to as $\mathtt{state}$) until we hit the HALT state. After executing an Order, find the next node in the graph, depending on which type of Order we executed. The Cdec node follows either the $\mathtt{Zero}$ or the $\mathtt{NonZero}$ edge depending on the register value. The $\mathtt{nextNode}$ operation is implemented using Graph::edges_directed from petgraph, filtering for the edge type and asserting that there is exactly one outgoing edge of that type2.
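A stripped-down version of this loop fits in a few lines of Rust. For the sketch I store explicit successor indices in each instruction instead of looking up edges in a petgraph graph, and I include a step budget; everything here is my own simplification, not the real code:

```rust
// Minimal sketch of the execution loop, with successor indices baked
// into each instruction instead of graph edges.
#[derive(Clone, Copy)]
enum Instr {
    Inc { register: usize, next: usize },
    Cdec { register: usize, nonzero: usize, zero: usize },
    Halt,
}

/// Runs the program starting at node 0. Returns true if the machine
/// halted within `max_steps`, false if the step budget was exhausted.
fn run(program: &[Instr], regs: &mut [usize], max_steps: usize) -> bool {
    let mut state = 0;
    for _ in 0..max_steps {
        match program[state] {
            Instr::Inc { register, next } => {
                regs[register] += 1;
                state = next;
            }
            Instr::Cdec { register, nonzero, zero } => {
                if regs[register] > 0 {
                    regs[register] -= 1;
                    state = nonzero;
                } else {
                    state = zero;
                }
            }
            Instr::Halt => return true,
        }
    }
    false // possibly non-halting; we cannot know for sure
}
```

Running the two-node Clear(r0) machine (a Cdec looping on itself, with the Zero edge leading to Halt) drives the register down to 0 and halts.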

I also haven't yet talked about the registers, mainly because they are quite boring: I represent them as a zero-indexed array of unsigned integers (usize). Putting it all together, an AbstractMachine or MinskyMachine starts at node zero and executes in a loop until a halting state is encountered. In the real code I also added a maximum number of steps to try to detect non-halting machines. Actually detecting this is of course impossible in general due to the halting problem, but setting an upper limit on the number of iterations is helpful for testing. The real code also handles the named halting states, so I can run an AbstractMachine with multiple outputs (for example the $\mathtt{IsEven}$ operation) and know in which state the machine halted ($\mathtt{True}$ or $\mathtt{False}$ in the case of $\mathtt{IsEven}$).

Our first nested operation

Now it is time to create our first nested operation using this framework! Let's start with the $\mathtt{Clear(r)}$ operation already mentioned above, which performs $\mathtt{r \gets 0}$. Its graph visualization looks like this for $\mathtt{r} = \mathtt{r_0}$:

Its implementation is very simple: Execute Cdec(r) until r == 0, then HALT. Building the corresponding graph looks like this in code:

let mut graph = Graph::new();
let csub = graph.add_node(Order::Cdec { register });
let halt = graph.add_node(Order::Halt);
graph.add_edge(csub, csub, CoreEdgeType::CdecNonzero);
graph.add_edge(csub, halt, CoreEdgeType::CdecZero);

Using the edge types, the Cdec node can be linked to two next nodes (itself on $\mathtt{NonZero}$ and the HALT node on $\mathtt{Zero}$). This is perhaps the simplest non-trivial operation.

Here is a more complicated one, which I called $\mathtt{MoveN(r_s, r_{d_1}, r_{d_2}, ...)}$. This is a variadic operation taking one source register $\mathtt{r_s}$ and any number of destination registers $\mathtt{r_{d_i}}$. I already showed this operation in the previous post: it adds the value in $\mathtt{r_s}$ to every destination register and then clears $\mathtt{r_s}$. This image shows $\mathtt{MoveN(r_0, r_1)}$:

Minsky machine program that moves the value from register 0 to register 1

$\mathtt{MoveN}$ becomes interesting when we have more than one destination register, because then we can duplicate an existing value, which will become important for implementing the $\mathtt{Copy(r_s, r_d)}$ operation! Suppose we want to overwrite the value in register $\mathtt{r_d}$ with that in register $\mathtt{r_s}$, i.e. do exactly $\mathtt{r_d \gets r_s}$. This is different from $\mathtt{MoveN}$, since the latter clears the source register and adds its value onto the existing value in the destination register. But we can use the $\mathtt{Clear}$ operation together with $\mathtt{MoveN}$ to realize $\mathtt{Copy}$, like so for $\mathtt{Copy(r_0, r_1)}$:

Minsky machine program that copies the value from register 0 to register 1

A $\mathtt{Clear}$ followed by two $\mathtt{MoveN}$ operations! It is interesting that this operation requires three registers to overwrite a single register. Here the register $\mathtt{r_2}$ is only used to memorize the value of the source register $\mathtt{r_0}$ so that we can later restore it, since $\mathtt{MoveN}$ drains the source register. In my simulation engine these temporary registers are called scratch registers, and I defined all operations so that any scratch register must start with a value of 0 and will end up with a value of 0 after the operation has completed. Since Minsky machines have an infinite number of registers, we can always find some unused register and use it as scratch space!
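The register-level effect of this construction can be sketched directly in Rust (my own illustrative functions, operating on a register file rather than building graphs). Note how the scratch register starts and ends at 0:

```rust
// Semantics sketch of Clear, MoveN and Copy on a register file.
fn clear(regs: &mut [usize], r: usize) {
    regs[r] = 0;
}

// Drain src, adding its value onto every destination register.
fn move_n(regs: &mut [usize], src: usize, dsts: &[usize]) {
    while regs[src] > 0 {
        regs[src] -= 1;
        for &d in dsts {
            regs[d] += 1;
        }
    }
}

// Copy(src, dst) = Clear(dst); MoveN(src, dst, scratch); MoveN(scratch, src)
fn copy(regs: &mut [usize], src: usize, dst: usize, scratch: usize) {
    debug_assert_eq!(regs[scratch], 0); // scratch must start at 0
    clear(regs, dst);
    move_n(regs, src, &[dst, scratch]); // duplicate src into dst and scratch
    move_n(regs, scratch, &[src]);      // restore src; scratch ends at 0
}
```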

Now it is time for the payoff, because we can flatten() the $\mathtt{Copy(r_0, r_1)}$ operation into a Minsky machine made up entirely of Inc and Cdec operations:

The flattened Copy(r_0, r_1) operation

The colored nodes indicate which high-level operation each Inc and Cdec corresponds to! I also experimented with a nested visualization that shows both the high-level operations and their respective graphs, but I'm not happy with how cluttered this is. I'll still include these pictures in this post because they give a decent idea of how we get from the high-level graph to the final Minsky machine for each operation:

The nested visualization of the Copy(r_0, r_1) operation

Graph replacement algorithms

The nested visualization illustrates a challenge that I stumbled upon while implementing the flatten() function: each individual operation has its own HALT state, which we need to remove, instead linking from it to the start node of the next operation. For $\mathtt{Copy(r_0, r_1)}$, this is still somewhat simple:

The graph replacement step for the Copy(r_0, r_1) operation

First, link every HALT state of operation $i$ to the first state of operation $k$ wherever $(i, k)$ is an edge in the high-level graph. The new temporary edges are indicated with the red dashed arrows. Then remove the HALT state and collapse the edges, keeping the edge type of the incoming edges of the HALT state (the green arrows). This is simple for operations that have no nested branching, but gets complicated once we introduce nested operations with multiple halting states into the mix. Enter $\mathtt{IsZero}$:

The IsZero(r_0) operation

This operation has two HALT states, which breaks the previous algorithm! Suppose we want to implement an operation $\mathtt{Cinc(r_c, r_1, r_2)}$ that calculates:

$$\mathtt{Cinc(r_c, r_1, r_2)} := \begin{cases} r_2 \gets r_2 + 1, & \text{if } r_c == 0 \\ r_1 \gets r_1 + 1, & \text{otherwise} \end{cases}$$
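Written directly in Rust, the intended semantics are simply (an illustrative sketch of the definition, not the graph code):

```rust
// Cinc(r_c, r_1, r_2): increment r_2 if r_c is zero, otherwise r_1.
fn cinc(regs: &mut [usize], rc: usize, r1: usize, r2: usize) {
    if regs[rc] == 0 {
        regs[r2] += 1; // the IsZero "true" halting state leads here
    } else {
        regs[r1] += 1; // the IsZero "false" halting state leads here
    }
}
```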

We can use $\mathtt{IsZero}$ for this like so:

Using the IsZero operation

Graph replacement with named edges

Cinc flattened operation

The first image shows the high-level implementation of $\mathtt{Cinc(r_0, r_1, r_2)}$, the middle image shows the revised graph replacement algorithm that matches halting nodes with their next nodes through named edges, and the last image shows the final Minsky machine. I implemented this through simple string matching, which requires custom named edges while building the graph:

let mut graph = Graph::new();

let is_zero = graph.add_node(IsZero::new(0).into());
let add1 = graph.add_node(Order::Inc { register: 1 }.into());
let add2 = graph.add_node(Order::Inc { register: 2 }.into());
let halt = graph.add_node(Order::Halt.into());

graph.add_edge(
    is_zero,
    add1,
    // Named edge:
    EdgeType::Custom(IsZero::HALT_STATE_FALSE.to_string()),
);
graph.add_edge(
    is_zero,
    add2,
    // Named edge:
    EdgeType::Custom(IsZero::HALT_STATE_TRUE.to_string()),
);
graph.add_edge(add1, halt, EdgeType::ALWAYS);
graph.add_edge(add2, halt, EdgeType::ALWAYS);

I'm not a fan of string identifiers for matching up stuff like this because mistakes can become hard to detect, but it works well enough for now.

All the operations

I'll conclude this long post by showing off the implementations of the various instructions needed for simulating a Turing machine. We already saw $\mathtt{Add_N(j)}$ as well as the very useful $\mathtt{Clear(j)}$, $\mathtt{MoveN(r_s, r_{d_1}, r_{d_2}, ...)}$ and $\mathtt{Copy(r_1, r_2)}$. Here are the remaining ones, always in the order "high-level graph" -> "nested graph" -> "flattened graph":

$\mathtt{Mul_N(r_0)}$ for $N=3$

The high-level graph for Mul_3(r_0)

The nested graph for Mul_3(r_0)

The flattened graph for Mul_3(r_0)

$\mathtt{GE_N(r_0)}$ for $N=3$

The high-level graph for GE_3(r_0)

The nested graph for GE_3(r_0)

The flattened graph for GE_3(r_0)

This is one of the more complex operations. The main issue is that the cleanup phase (restoring the value of $r_0$ and clearing the scratch register $r_1$) has to be duplicated, since it happens after the actual decision and must lead to two different halting states (TRUE and FALSE). Notice the identical high-level nodes that lead into HALT(TRUE) and HALT(FALSE)! A simpler version could store a single true/false bit in another scratch register and then use a single $\mathtt{Cdec}$ operation to decide; I didn't implement that because the idea only came to me while writing this post :)
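The decide-then-cleanup structure can be sketched on a register file like this (my own illustrative code; the duplication of the cleanup phase disappears here only because Rust has a bool to stash the result in, which the Minsky machine does not):

```rust
// GE_N(r): decide r >= n by peeling off up to n units into a scratch
// register, then restore r during cleanup.
fn ge_n(regs: &mut [usize], r: usize, n: usize, scratch: usize) -> bool {
    debug_assert_eq!(regs[scratch], 0); // scratch must start at 0
    let mut taken = 0;
    while taken < n && regs[r] > 0 {
        regs[r] -= 1; // one Cdec per unit
        regs[scratch] += 1;
        taken += 1;
    }
    let result = taken == n; // we managed to subtract n times, so r >= n
    while regs[scratch] > 0 {
        // cleanup: restore r, leave scratch at 0
        regs[scratch] -= 1;
        regs[r] += 1;
    }
    result
}
```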

$\mathtt{Div_N(r_0)}$ for $N=3$

The high-level graph for Div_3(r_0)

The nested graph for Div_3(r_0)

The flattened graph for Div_3(r_0)

$\mathtt{Add(r_0, r_1, r_2)}$

The high-level graph for Add(r_0, r_1, r_2)

The nested graph for Add(r_0, r_1, r_2)

The flattened graph for Add(r_0, r_1, r_2)

$\mathtt{Mul(r_0, r_1, r_2)}$

The high-level graph for Mul(r_0, r_1, r_2)

The nested graph for Mul(r_0, r_1, r_2)

The flattened graph for Mul(r_0, r_1, r_2)

This one is interesting because it builds on $\mathtt{Add(r_0, r_1, r_2)}$ using the identity $a * b = \underbrace{a + a + ... + a}_{b\ \text{times}}$! It is implemented using a loop and therefore needs one more scratch register than $\mathtt{Add(r_0, r_1, r_2)}$ (3 instead of 2). We can build all the functions of the Hyperoperation Sequence in this exact way, but every step up the hierarchy increases the number of scratch registers by one. Because register indices are hardcoded in Minsky machines, I believe this approach cannot compute $f_{\omega}(n) = f_n(n)$ from the Fast-growing hierarchy. It would be an interesting problem to figure out a way to compute $f_{\omega}(n)$ on a Minsky machine :)
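The pattern of building each level from a loop over the previous one is easy to see in ordinary code (an illustrative sketch of the semantics, not the graph implementation):

```rust
// Each hyperoperation level is one loop over the level below, which is
// why each level costs one extra loop counter (scratch register).
fn add(a: usize, b: usize) -> usize {
    let mut r = a;
    for _ in 0..b {
        r += 1; // b Inc steps
    }
    r
}

fn mul(a: usize, b: usize) -> usize {
    let mut acc = 0;
    for _ in 0..b {
        acc = add(acc, a); // b additions of a
    }
    acc
}

fn pow(a: usize, b: usize) -> usize {
    let mut acc = 1;
    for _ in 0..b {
        acc = mul(acc, a); // b multiplications by a
    }
    acc
}
```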

$\mathtt{Shl(r_1, r_2)}$

The high-level graph for Shl(r_1, r_2)

The nested graph for Shl(r_1, r_2)

The flattened graph for Shl(r_1, r_2)

Even though this operation performs exponentiation, since the base is a constant the resulting graph is simpler than one might expect! We also don't need to worry about division since we explicitly shift to the left, which only multiplies.
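Semantically, the operation is just $\mathtt{r_2}$ doubling steps (a sketch of the semantics, assuming no overflow; the post's version is of course a Minsky machine graph):

```rust
// Shl(r1, r2): double r1 once per unit of r2, i.e. r1 * 2^r2.
fn shl(mut r1: usize, r2: usize) -> usize {
    for _ in 0..r2 {
        r1 += r1; // one Mul_2 step
    }
    r1
}
```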

$\mathtt{Log2Ceil(r_1, r_r)}$

The high-level graph for Log2Ceil(r_1, r_r)

The nested graph for Log2Ceil(r_1, r_r)

The flattened graph for Log2Ceil(r_1, r_r)

Here the operations start to get pretty big. $\mathtt{Log2Ceil(r_1, r_r)}$ also has some peculiarities, mainly because we don't have a good way of representing partial functions with the Minsky machine3. The biggest deviation from the traditional logarithm is that I set $\log_2{0} := 0$ to make it a total function (over $\mathbb{N}_0$). I also implemented only $\mathtt{Log2Ceil}$ and no $\mathtt{Log2Floor}$, because the former is what I needed for the tape simulation code.
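With the total-function convention, the semantics amount to finding the smallest $k$ with $2^k \ge r$, which can be done by repeated doubling (my own illustrative sketch):

```rust
// Log2Ceil with the convention log2(0) := 0: smallest k with 2^k >= r.
fn log2_ceil(r: usize) -> usize {
    if r == 0 {
        return 0; // convention to make the function total
    }
    let (mut k, mut p) = (0usize, 1usize);
    while p < r {
        p *= 2; // one doubling per increment of k
        k += 1;
    }
    k
}
```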

Conclusion

And there we have it, a bunch of mathematical operations built on the Minsky machine. I'm gonna leave today's post with a teaser for the final Turing machine simulation, showing only the high-level graph :)

Teaser for the Turing machine simulation graph

Footnotes

  1. Sad that I felt the need to point that out, but also a good indicator for how much programming has changed in the last two years. ↩

  2. For incoming edges there can be more than one candidate per node, but this is not a problem, do you know why? :) ↩

  3. We could enter an infinite loop, but that didn't seem like such a great idea because it is hard to detect. ↩