Goroutines¶
Note
This example was originally published on my blog and is adapted here with minor revisions.
In Even in Go, concurrency is still not easy, Chris Siebenmann gives an example of Go code which deadlocks:
1 func FindAll() []P { //P = process data
2 pss, err := ps.Processes()
3 [...]
4 found := make(chan P)
5 limitCh := make(chan struct{}, concurrencyProcesses)
6
7 for _, pr := range pss {
8 limitCh <- struct{}{}
9 pr := pr
10 go func() {
11 defer func() { <-limitCh }()
12 [... get a P with some error checking ...]
13 found <- P
14 }()
15 }
16 [...]
17
18 var results []P
19 for p := range found {
20 results = append(results, p)
21 }
22 return results
23 }
In his words:
The bug is that the goroutines only receive from limitCh to release their token after sending their result to the unbuffered found channel, while the main code only starts receiving from found after running through the entire loop, and the main code takes the token in the loop and blocks if no tokens are available.
This is a good example of directly verifying code with a TLA+ spec, instead of verifying just a design.
Planning Ahead¶
It’s good to start with some upfront thinking of how we are going to approach this problem. There’s two parts to formally specifying something: describing the system and describing the properties of the system. In this case we can ignore the second part, since we’re only looking for deadlocks. It would be good modeling practice to add sanity check properties, like type invariants, but they are not strictly necessary.
First, we want to pick whether we’ll model in TLA+ or PlusCal. Since the Go code is highly sequential, I will use PlusCal. This will cause a bit of impedance mismatch down the road for unbuffered channels but overall it’s a net benefit.
There are three “complex bits” to Chris’s code sample: go
,
defer
, and the nature of Go channels. defer
runs cleanup code
when a goroutine is finished running. For now we’ll represent this by
moving the deferred code to its own label, but we could also use a
PlusCal procedure to be more accurate. go
spawns a new goroutine.
Since PlusCal requires us to define all of our processes in advance, we
can’t “spawn” a new goroutine. What we can do instead is define each
process but prevent them from running. Then we add a flag that says
whether or not the goroutine was initialized in the behavior yet. It
would look something like this:
variables
initialized = [w \in Routines |-> FALSE];
\* ...
process goroutine \in Routines
begin
Work:
await initialized[self];
\* ...
So each goroutine awaits
being initialized by the main process
before it does anything. This is how we can emulate spawning new
processes.
That leaves the channels, which are the most complicated part to specify. There are two kinds of Go channels: buffered and unbuffered. Sends to a buffered channel are blocked if the channel is full. Receives from a buffered channel are blocked if the channel is empty. Both of these are representable with PlusCal macros:
macro send_buffered(chan) begin
await channels[chan] < buffered[chan];
channels[chan] := channels[chan] + 1;
end macro;
macro receive_buffered(chan) begin
await channels[chan] > 0;
channels[chan] := channels[chan] - 1;
end macro;
That covers buffered channels. Unbuffered channels, by contrast, always block unless there is both a sender and receiver. In pure TLA+ this wouldn’t be too tricky to specify, but PlusCal assumes each step of the behavior is one process doing one thing. Unbuffered channels can’t be represented natively without adding some annoying bookkeeping, as we need to have one process block “first”. For that we can use a Procedures.
So now that we know a rough approach and what the pain points are likely to be, let’s write the spec.
The spec¶
First the breakdown, then the whole spec.
---- MODULE channels ----
EXTENDS Integers, TLC, Sequences
CONSTANTS NumRoutines, NumTokens
Routines == 1..NumRoutines
(* --algorithm channels
variables
channels = [tokens |-> 0, found |-> {}];
buffered = [tokens |-> NumTokens];
initialized = [w \in Routines |-> FALSE];
channels
is the current contents of each channel. For buffered
channels, we treat their contents as a single number and store the
maximum capacity in a separate buffered
variable. For unbuffered
channels, we instead store the set of senders waiting for a receiver.
initialized
is for emulating goroutines.
macro go(routine) begin
initialized[routine] := TRUE;
end macro
An extra macro I added to more closely match the Go syntax.
macro write_buffered(chan) begin
await channels[chan] < buffered[chan];
channels[chan] := channels[chan] + 1;
end macro;
macro receive_channel(chan) begin
if chan \in DOMAIN buffered then
await channels[chan] > 0;
channels[chan] := channels[chan] - 1;
else
await channels[chan] # {};
with w \in channels[chan] do
channels[chan] := channels[chan] \ {w}
end with;
end if;
end macro;
This is a change from our old read_buffered
because it handles both
buffered and unbuffered channels. Buffered channels work as expected.
For unbuffered channels, we wait for the set of blocked writers to be
nonempty and nondeterministically declare that we read from one of
them.1
procedure write_unbuffered(chan) begin
DeclareSend:
channels[chan] := channels[chan] \union {self};
Send:
await self \notin channels[chan];
return;
end procedure
To model unbuffered channels we can either put state on senders or put state on receivers. I opted to place it on the sender because Go permits reading from multiple unbuffered channels at once.2 In two separate temporal steps we 1) add the process to the set of channel senders and 2) wait to be removed from that set by a receiver.
process goroutine \in Routines
begin
A:
await initialized[self];
call write_unbuffered("found");
B:
receive_channel("tokens");
end process;
Our goroutine process is a straightforward translation of the Go code.
First we wait for the goroutine to be initialized, corresponding to line
10. Then we write to the found
channel (line 13). If I was trying to
be more faithful I would write special defer
semantics, but for this
I’m happy to just stick it on a label at the end of the process.
process main = 0
variables i = 1;
begin
Main:
while i <= NumRoutines do
write_buffered("tokens");
go(i);
i := i + 1;
end while;
Get:
while i > 1 do
i := i - 1;
receive_channel("found");
end while;
end process;
end algorithm; *)
Our emulation uses one token to initialize each goroutine.
Since write_channel
has an await
in it, it will block if there
are more goroutines than tokens. It will then stay blocked until a
goroutine releases a token.3 Final spec:
---- MODULE channels ----
EXTENDS Integers, TLC, Sequences
CONSTANTS NumRoutines, NumTokens
Routines == 1..NumRoutines
(* --algorithm channels
variables
channels = [limitCh |-> 0, found |-> {}];
buffered = [limitCh |-> NumTokens];
initialized = [w \in Routines |-> FALSE];
macro send_buffered(chan) begin
await channels[chan] < buffered[chan];
channels[chan] := channels[chan] + 1;
end macro;
macro receive_channel(chan) begin
if chan \in DOMAIN buffered then
await channels[chan] > 0;
channels[chan] := channels[chan] - 1;
else
await channels[chan] # {};
with w \in channels[chan] do
channels[chan] := channels[chan] \ {w}
end with;
end if;
end macro;
macro go(routine) begin
initialized[routine] := TRUE;
end macro
procedure send_unbuffered(chan) begin
DeclareSend:
channels[chan] := channels[chan] \union {self};
Send:
await self \notin channels[chan];
return;
end procedure
process goroutine \in Routines
begin
A:
await initialized[self];
call send_unbuffered("found");
B:
receive_channel("limitCh");
end process;
process main = 0
variables i = 1;
begin
Main:
while i <= NumRoutines do
send_buffered("limitCh");
go(i);
i := i + 1;
end while;
Get:
while i > 1 do
i := i - 1;
receive_channel("found");
end while;
end process;
end algorithm; *)
====
Now that we have a full spec, we can use the model checker, TLC, to see if it satisfies any properties. We didn’t specify any, but TLC will check for deadlocks by default.
Finding Deadlocks¶
To make this deadlock, I checked it with NumRoutines <- 3, NumTokens <- 2
. {{TODO state space}}. Unsurprisingly, this deadlocks:5
State 1: <Initial predicate>
/\ buffered = [limitCh |-> 2]
/\ channels = [limitCh |-> 0, found |-> {}]
/\ i = 1
/\ pc = (0 :> "Main" @@ 1 :> "A" @@ 2 :> "A" @@ 3 :> "A")
/\ initialized = <<FALSE, FALSE, FALSE>>
State 2: <Main line 128, col 9 to line 137, col 48 of module base>
/\ buffered = [limitCh |-> 2]
/\ channels = [limitCh |-> 1, found |-> {}]
/\ i = 2
/\ pc = (0 :> "Main" @@ 1 :> "A" @@ 2 :> "A" @@ 3 :> "A")
/\ initialized = <<TRUE, FALSE, FALSE>>
State 3: <Main line 128, col 9 to line 137, col 48 of module base>
/\ buffered = [limitCh |-> 2]
/\ channels = [limitCh |-> 2, found |-> {}]
/\ i = 3
/\ pc = (0 :> "Main" @@ 1 :> "A" @@ 2 :> "A" @@ 3 :> "A")
/\ initialized = <<TRUE, TRUE, FALSE>>
State 4: <A line 106, col 12 to line 114, col 64 of module base>
/\ buffered = [limitCh |-> 2]
/\ channels = [limitCh |-> 2, found |-> {}]
/\ i = 3
/\ pc = (0 :> "Main" @@ 1 :> "A" @@ 2 :> "DeclareSend" @@ 3 :> "A")
/\ initialized = <<TRUE, TRUE, FALSE>>
State 5: <A line 106, col 12 to line 114, col 64 of module base>
/\ buffered = [limitCh |-> 2]
/\ channels = [limitCh |-> 2, found |-> {}]
/\ i = 3
/\ pc = (0 :> "Main" @@ 1 :> "DeclareSend" @@ 2 :> "DeclareSend" @@ 3 :> "A")
/\ initialized = <<TRUE, TRUE, FALSE>>
State 6: <DeclareSend line 92, col 22 to line 95, col 77 of module base>
/\ buffered = [limitCh |-> 2]
/\ channels = [limitCh |-> 2, found |-> {1}]
/\ i = 3
/\ pc = (0 :> "Main" @@ 1 :> "Send" @@ 2 :> "DeclareSend" @@ 3 :> "A")
/\ initialized = <<TRUE, TRUE, FALSE>>
State 7: <DeclareSend line 92, col 22 to line 95, col 77 of module base>
/\ buffered = [limitCh |-> 2]
/\ channels = [limitCh |-> 2, found |-> {1, 2}]
/\ i = 3
/\ pc = (0 :> "Main" @@ 1 :> "Send" @@ 2 :> "Send" @@ 3 :> "A")
/\ initialized = <<TRUE, TRUE, FALSE>>
It’s the same issue that Chris had. The goroutines can only return their
tokens if there is a receiver on the found
channel, the only
receiver of that channel is main
, main
only reads after it
initializes all the goroutines, and main
will block if there are
more goroutines than tokens. The goroutines can’t return tokens until
all goroutines are initialized, and main can’t initialize all goroutines
until some goroutines have returned their tokens.
Fixing This¶
Chris suggests three possible ways of fixing this. We can test each of the three by modifying our spec:
If the goroutines took the token by sending to
limitCh
instead of the main for loop doing it, the bug would not exist;
process goroutine \in Routines
begin
A:
await initialized[self];
+ write_buffered("limitCh");
\* ...
while i <= NumRoutines do
- write_buffered("limitCh");
initialized[i] := TRUE;
i := i + 1;
end while;
This passes model checking.
If the goroutines received from
limitCh
to release their token before sending tofound
, it wouldn’t exist (but because of error handling, it’s simpler and more reliable to do the receive in a defer).
process goroutine \in Routines
begin
A:
await initialized[self];
+ receive_channel("limitCh");
- call write_unbuffered("found");
B:
- receive_channel("limitCh");
+ call write_unbuffered("found");
end process;
This passes model checking.
And if the entire for loop was in an additional goroutine…
This one’s a little more complicated. We create a new process for the
loop and add its identifier to initialized
. I’ll use -1
to
represent the for-loop.
initialized = [w \in Routines \union {-1} |-> FALSE];
\* After goroutines
process for_loop = -1
variables i = 1;
begin
Loop:
while i <= NumRoutines do
write_buffered("limitCh");
go(i);
i := i + 1;
end while;
end process;
Then we modify main
to initialize this instead of doing the loop
itself:
process main = 0
variables i = NumRoutines;
begin
Main:
go(-1);
Get:
while i > 0 do
i := i - 1;
receive_channel("found");
end while;
end process;
This passes model checking.
Discussion¶
Ultimately we wrote about 75 lines of specification to test 20 lines of Go code. Over half the spec is channel logic which we can now reuse in other specs. Discounting those puts us a little closer, though I’ll admit that a real TLA+ spec would be a lot longer because you’d be writing a lot more sanity checking properties. Nonetheless, writing the TLA+ version wouldn’t be significantly more effort than writing the original version and could save you net time if it caught the deadlock before production.
- 1
with
blocks if the set is empty, making theawait
statement above it redundant. I added it purely for clarity.- 2
You can also use a
select
to send to multiple channels, but I think that’s less common? This is where being able to write a raw TLA+ action could be really helpful.- 3
Get
is an inaccurate representation of how channel receives in arange
work: it should loop until the channel is closed. I left that out here because the rest of the spec doesn’t depend on closing channels and I didn’t want to add extra complexity to this example.- 5
I removed the
stack
andchan
variables from the trace to make it a little clearer.