A short note on composition in TLA+

art yerkes
6 min readJun 19, 2022


I’ve now read Practial TLA+ thoroughly and if you’re going to do _anything_ with TLA+, you should read it. Most of your questions will be answered via anecdotes and examples in the book. One thing I haven’t seen anyone talk about (even to the point people on mailing lists _discourage_ looking into it) is composition.

TLA+ is a specification language but it also at the end of the day represents an ideosyncratic kind of program, and those used to existing software have been told many times not to repeat themselves, but composition works in other ways too; since my interest in TLA+ is via output from another programming language (and that, with stacks too deep to facilitate single step evaluation by TLA’s java code and multiple communicating nodes besides), before giving up and emitting a tangle of interwoven code for multiple parts, I wanted to learn to isolate individual parts in TLA+. I’m not certain what I found is the best way, but there seems to be little information about how to do it so this might be useful anyway even if someone finds something better.

My goal was to enable a single queue process to service multiple users, some producers and some consumers in an idempotent at-least-once delivery scenario. The language whose programs form part of the spec don’t do IO (they purely produce data structures from inputs) so multiple delivery would be OK and I don’t need TLA+ to enforce any kind of rules in that regard. Ultimately I’d like to be able to produce these outputs in isolation from each other and have a smart, hand made queue system shuffle messages between aggregators, the TLA+ processes implementing the individual programs and each other in any possible runtime topology.

Another good reason to be able to do this is so that recompilation can produce this output again without the user having to recover or save out their own changes manually. If I can compose here, the user can compose this big piece of output with their own rule checking in the containing module.

The queue looks like this (i think it works but might not be perfect):

------ MODULE Queue ------

EXTENDS Sequences, TLC, Integers, FiniteSets


RECURSIVE ReplaceInQueueDataInner(_,_,_,_)

OUT_I == 2
TMPQ == 4

ReplaceInQueueDataInner(N,V,Val,Vec) ==
IF N = V
THEN <<Val>> \o Tail(Vec)
ELSE <<Vec[1]>> \o ReplaceInQueueDataInner(N+1,V,Val,Tail(Vec))

ReplaceInQueueData(V,Val,Vec) == ReplaceInQueueDataInner(1,V,Val,Vec)

InitQueueState == << << >>, 0, << >>, {}, << >> >>

\* incoming message format is <<cmd,sender,...>
\* where cmd is
\* 0 - erase my output
\* 1 - enqueue for recipient
(*--algorithm queue

inqueue, \* ordered messages in
out = {}, \* repository for live

queue_data = InitQueueState

process QueueRunner = QueueId

while TRUE do
await inqueue /= << >>;

queue_data := ReplaceInQueueData(LOCAL_DATA, Head(inqueue), queue_data);

inqueue := Tail(inqueue);

\* Remove output for this target.
if queue_data[LOCAL_DATA][1] = 0 then
out := out \ {(CHOOSE x \in out: x[1] = queue_data[LOCAL_DATA][2])};
queue_data := ReplaceInQueueData(OUTQUEUE, Append(queue_data[OUTQUEUE], Tail(queue_data[LOCAL_DATA])), queue_data);
end if;

\* Try to place new outputs
queue_data := ReplaceInQueueData(TMPQ, {}, queue_data);
queue_data := ReplaceInQueueData(OUT_I, 1, queue_data);

while queue_data[OUT_I] <= Len(queue_data[OUTQUEUE]) do
if {o \in out: o[1] = queue_data[OUTQUEUE][queue_data[OUT_I]][1]} = {} then
queue_data := ReplaceInQueueData(TMPQ, queue_data[TMPQ] \union {queue_data[OUT_I]}, queue_data);
out := out \union {queue_data[OUTQUEUE][queue_data[OUT_I]]};
end if;
queue_data := ReplaceInQueueData(OUT_I, queue_data[OUT_I] + 1, queue_data);
end while;

\* Erase dequeued entries
queue_data := ReplaceInQueueData(TMPOUT, << >>, queue_data);
queue_data := ReplaceInQueueData(OUT_I, 1, queue_data);

while queue_data[OUT_I] <= Len(queue_data[OUTQUEUE]) do
if {i \in queue_data[TMPQ]: i = queue_data[OUT_I]} = {} then
queue_data := ReplaceInQueueData(TMPOUT, Append(queue_data[TMPOUT], queue_data[OUTQUEUE][queue_data[OUT_I]]), queue_data);
end if;
queue_data := ReplaceInQueueData(OUT_I, queue_data[OUT_I] + 1, queue_data);
end while;

queue_data := ReplaceInQueueData(OUTQUEUE, queue_data[TMPOUT], queue_data);

end while;

end process;

end algorithm;*)

So anything that has access to its “inqueue” and “out” members can push a message onto inqueue to be routed to out when out doesn’t contain an element x such that x[1] equals the target process’ queue id or request that its own output slot is cleared (and another message could be delivered).

Meanwhile any process can check for its own process id in “out” and process the message (if it can access “out”).

QueueProducer and QueueConsumer are (relatively) short because they’re relying on being tied in:

----- MODULE QueueProducer -----

EXTENDS FiniteSets, Sequences, Integers

CONSTANT ProcessId, TargetId

(*--algorithm producer

process Producer = ProcessId
out = {}
inqueue := Append(inqueue, <<1, TargetId, "hithere">>);
end process;

end algorithm; --*)
\* snip



----- MODULE QueueConsumer -----
EXTENDS TLC, Sequences, FiniteSets


(*--algorithm consumer

process Consumer = ProcessId
out = {}
await {x \in out: x[1] = ProcessId} /= {};
myitem := CHOOSE x \in out: x[1] = ProcessId;
inqueue := Append(inqueue, <<0, ProcessId>>);
end process;

end algorithm;--*)
\* snip


One thing to observe about pluscal output is that it always produces “Init” and “Next” rules. If we just used EXTENDS with more than one module that contains pluscal output, we’d have multiple of these rules in scope and compilation would fail.

TLA+ contains INSTANCE but documentation about it focuses on how to use instantiated libraries as sources of extra TLA+ level functions, not for their algorithmic pluscal output.

So then we need to figure out how to write a module that simulates simultaneous runs of these. Some caveats:

  • Variables that are initialized to values other than our defaultInitValue must be preinitialized at the level above, otherwise initial setup will fail.
  • The parent module ultimately instantiates _every_ piece of state used by each subset, so things like queue_data in Queue allow utilities to shorten that complexity to just one state variable and their formal inputs and outputs, but it all needs to be visible to the consumer.
  • Since these runnable parts of the spec are distinct subsets of the state space, we’ll get messages like “Next is never enabled” from TLA+ if we don’t describe to TLA+ exactly what we want; that each INSTANCE’ Next rule is evaluated distinct from the others (\/ “or”) at the top level, and that each contains UNCHANGED rules for all the variables that belong to state it _isn’t_ involved in.
  • pluscal uses a “pc” variable to drive the state machine. We must provide working space for it for each instance.

So here’s what I came up with:

------ MODULE Test -------
EXTENDS Naturals, TLC, Sequences

VARIABLES inqueue, out, queue_pc, queue_data, producer_pc, consumer_pc, consumer_item

Producer == INSTANCE QueueProducer WITH ProcessId <- 2, TargetId <- 3, inqueue <- inqueue, out <- out, pc <- producer_pc, defaultInitValue <- << >>
Consumer == INSTANCE QueueConsumer WITH ProcessId <- 3, pc <- consumer_pc, myitem <- consumer_item, defaultInitValue <- << >>
Queue == INSTANCE Queue WITH QueueId <- 1, inqueue <- inqueue, out <- out, pc <- queue_pc, queue_data <- queue_data, defaultInitValue <- << >>

Init == out = {} /\ Queue!Init /\ Producer!Init /\ Consumer!Init

Next ==
\/ (Queue!Next /\ UNCHANGED << producer_pc, consumer_pc, consumer_item >>)
\/ (Producer!Next /\ UNCHANGED << queue_data, queue_pc, consumer_pc, consumer_item >>)
\/ (Consumer!Next /\ UNCHANGED << queue_data, queue_pc, producer_pc >>)


The Init rule uses “and” to ensure that it encompasses all included module initializations. I expected that to be enough but for reasons I haven’t fully explored, we were still able to escape to Next with out not initialized as a tuple (most likely due to something i overlooked), so this Init ensures that it gets initialized regardless of anything else happening.

The Next rule uses “or” to ensure that one of the subprocesses is stepped, and each clause recognizes parts of the state not used by that module using UNCHANGED (if one didn’t do this, the whole rule would fail). This does allow these three separate modules to run pluscal specified processes and share “inqueue” and “out” between them, which I think is kind of amazing and certainly provides me what I want.

I hope this is helpful and/or sparks some discussion that documents how people have done this in TLA+ or better alternatives.