Operators and Values¶
Operators¶
Operators are what you’d think of as a function in a programming language. They take arguments and evaluate to expressions.
EXTENDS Integers
MinutesToSeconds(m) == m * 60
Troubleshooting
If you see an error like
Was expecting “=====” or more Module Body, encountered
"$Operator"
at line $n…
It’s likely because you used a single =
instead of a double ==
when defining the operator.
Operators can take any number of arguments. There are no default values, overloaded definitions, or optional arguments. If an operator has two parameters, it must always take two values. If an operator takes no values, you can write it without the parens. In this case, it’s effectively a constant.
SecondsPerMinute == 60
Note
There are a few special forms of operators, like higher-order operators, recursive operators, and lambdas, which we will cover later.
The right-hand side of an operator is called an expression.
IF-THEN-ELSE¶
We have three keywords for structuring expressions. These are LET statements, case statements, and conditionals. We’ll be introducing the first two later: LET statements are much more useful after we can write some basic operators, while case statements aren’t used very often. That leaves conditionals:
Abs(x) == IF x < 0 THEN -x ELSE x
Expressions always equal something, so there must always be an ELSE
branch. Otherwise, this works as you’d expect it to.
Values¶
TLA+ is an “untyped” language, due to its roots in mathematics. In practice, the model checker recognizes four primitive types and four complex ones.
Primitive: strings, booleans, integers, and model values.
Complex: sets, sequences, structures, and functions.
Every value type in TLA+ has its own operators, with no overlap or overloading. The two exceptions to this are =
and #
, which “equals” and “not equals”, respectively. Values of different types cannot be tested for equality, and that will throw an error.
Troubleshooting
If you see an error like
Encountered “Beginning of Definition” at line $n…
It’s likely because you used a double ==
instead of a single =
when testing for equality.
We’ll leave the more sophisticated types for later and focus now on just strings, booleans, integers, and sets.
The obvious ones¶
Integers and strings. To get the basic addition operators, you need EXTENDS Integers
. Strings must use “double quotes” and cannot use single quotes. There are no operators for strings except =
and #
. In practice, they are used as opaque identifiers, similar to how some languages have a :symbol type. If your system needs to manipulate strings, we instead store them in a sequence.
Note there is not a float type. Floats have complex semantics that are extremely hard to represent. Usually you can abstract them out, but if you absolutely need floats then TLA+ is the wrong tool for the job.
Booleans¶
The booleans are TRUE
and FALSE
.
So why do they get their own section? First of all, the boolean operators look like mathematical symbols, not programming symbols. They are:
Logic |
TLA+ Symbol |
Math Symbol |
---|---|---|
and |
|
\(\wedge\) |
or |
|
\(\vee\) |
not |
|
\(\neg\) |
A quick mnemonic: ~
is a little flippy thing, so it’s “not”. /\
looks like an “A”, so it’s “and”. \/
is the other one. We can use these to build the other operators, like Xor
:
Xor(A, B) == A = ~B
There is one more boolean operator of note: =>
, or “implication”. A => B
means “A is at least as true as B”. A => B = FALSE
when false if A is true and B is false, otherwise it’s true. B is true or A is false (or both). You don’t see this very often in programming, as it’s pretty useless for control flow. But it’s extremely important for any kind of specification work. We’ll go into more detail about it later.2
Exercise: Contrapositives
Rewrite
A => B
using the “regular three” programming operators.
Solution
~A \/ B
For what values of
A
andB
is~B => ~A
true?
Solution
Using the same rewrite rule as part 1, we get
~~B \/ ~A
, aka~A \/ B
, which means it has the same truth values asA => B
.~B => ~A
is called the contrapositive ofA => B
.
The other thing is that TLA+ has a “bullet point notation” for boolean logic. Let’s say you need an expression like A /\ (B \/ C) /\ (D \/ (E /\ F))
. That’s really hard to read! So in TLA+ you can instead write it as:
/\ A
/\ \/ B
\/ C
/\ \/ D
\/ /\ E
/\ F
That makes it much clearer. Notice that we have an extra /\
before A
. That’s not necessary, but it makes the shape more pleasing, so we do it. This is also the only place in the language where whitespace matters. Lets say I instead wrote
/\ A
/\ \/ B
\/ C
/\ \/ D
\/ /\ E
/\ F
That means something different! It’s now A /\ (B \/ C) /\ (D \/ E) /\ F
.
Tip
“Why would you even want something like that?” It makes complex invariants much easier to read.
Sequences¶
A sequence is like a list in any other language. You write it like <<a, b, c>>
, and the elements can be any other values (including other sequences). As with most other languages, you look up a value of the sequence with seq[n]
, except that instead of the indices being 0..Len(seq)-1
, it’s 1..Len(seq)
. So yeah, they’re 1-indexed.
Warning
Did I mention they’re 1-indexed? Because they’re 1-indexed.
There’s also a Sequences
module. If you EXTENDS Sequences
, you also get (letting S == <<"a">>
):
Expression |
Gives |
---|---|
|
|
|
|
|
|
|
|
|
|
|
|
Troubleshooting
If you see an error like
Encountered “EXTENDS” at line 3, column 1 and token “Sequences”
It’s because you wrote two separate EXTENDS
lines. TLA+ can only have one EXTENDS
line per spec, but you can have multiple modules (separated by commas) on it. So instead write EXTENDS Integers, Sequences
and you should be fine.
With sequences, we can represent a 24-hour clock as <<hour, minute, second>>
.
EXTENDS Integers, Sequences
ToSeconds(time) == time[1]*3600 + time[2]*60 + time[3]
Earlier(t1, t2) == ToSeconds(t1) < ToSeconds(t2)
Note
Fixed-length sequences are also called “tuples”. It’s the same syntax either way.
Sets¶
A set is a collection of unordered, unique values. You write them with braces, like {1, 2, 3}
or {<<"a">>, <<"b", "c">>}
. You can even have sets inside other sets, like {{1}, {2}, {3}}
.
Sets cannot contain elements of different types; {1, "a"}
is invalid.
Set Operators¶
To check if x
is an element of set
, we write x \in set
. \in
is used in a few other places as syntax, not just as an operator. There’s also the inverse, \notin
. set1 \subseteq set2
tests if every element of set1
is also an element of set2
.
Note
That’s “subset or equals”. It’s a way to sidestep the question “Is a set a subset of itself?”
We also have ways of slicing and dicing sets:
set1 \union set2
is the set of all elements inset1
orset2
(or both).set1 \intersect set2
is the set of all elements in both sets.set1 \ set2
, or “set difference” is the set of all elements inset1
but notset2
.
Note
You might see \cup
and \cap
instead of \union
and \intersect
. This comes from the mathematical symbols for set union and intersection, which are \(\cup\) and \(\cap\).
>>> {1, 3} \union {1, 5}
{1, 3, 5}
>>> {1, 3} \intersect {1, 5}
{1}
>>> {1, 3} \ {1, 5}
{3}
If you EXTEND FiniteSets
, you also get Cardinality(set)
, which is the number of elements in the set.
Tip
The easiest way to test if a set is empty is by writing set = {}
. Similarly, you can test if a sequence is empty by writing seq = <<>>
.
Sets of Values¶
Now imagine we’re writing a spec which uses clock values, and we want a quick operator to add times. I might write this as
AddTimes(t1, t2) == <<t1[1] + t2[1], t1[2] + t2[2], t1[3] + t2[3]>>
Then AddTimes(<<2, 0, 1>>, <<1, 2, 3>>) = <<3, 2, 4>>
, and AddTimes(<<2, 0, 1>>, <<1, 2, 80>>) = <<3, 2, 81>>
.
Wait, 81 seconds? Our clock can’t show 81 seconds, the answer should be <<3, 3, 21>>
. You can think of there being a set of valid clock values, all the way from <<0, 0, 0>>
to <<23, 59, 59>>
, and AddTimes
should always return some value in that set, almost like it has a type signature. We can enforce this in TLA+, but first we need a way of generating sets of values from values. Fortunately, for every type of value in TLA+, there’s a method to generate sets of those values.1
Let’s start with the easiest: to get the set of all booleans, just write BOOLEAN
. That’s the set {TRUE, FALSE}
. For integers, a..b
is the set {a, a+1, a+2, ... , b}
. You need EXTENDS Integers
for this to work.
Tip
If a > b
, then a..b
is empty. This makes a lot of things a lot simpler. For example, 1..Len(seq)
is the set of the indices of seq
. If seq = <<>>
, you get 1..0 = {}
, which is what you’d expect.
Now for sequences. The Cartesian product of two sets S and T is the set of all sequences where the first element is in S and the second is in T. It’s written with \X
. For example, consider LoginAttempt
containing who’s logging in, the time they attempted the login, and if it was successful or not. I can represent the set of all possible such values as LoginAttempt == Person \X Time \X BOOLEAN
.
Note
\X
is not associative:
S == 1..3 >> <<1, 2, 3>> \in S \X S \X S TRUE >> <<1, 2, 3>> \in (S \X S) \X S FALSE >> <<<<1, 2>>, 3>> \in (S \X S) \X S TRUE
Speaking of Time
, we can combine \X
and ..
to finally get our clock type:
ClockType == (0..23) \X (0..59) \X (0..59)
As a quick sanity check, run Cardinality(ClockType)
in your Making a Scratchfile (remember, you’ll need EXTENDS FiniteSets
). You should see it has 86400 elements. We’re now one step closer to having a property for AddTimes
: we want the result of it to always return a value in ClockType
.
Finally, we can get all subsets of a set with SUBSET S
. SUBSET ClockType
will be all the sets containing a bunch of clock values… all 2^86400 of them. Don’t do this.
Tip
I often see beginners try to test if “S is a subset of T” by writing S \in SUBSET T
. This works but is very inefficient. Write S \subseteq T
instead.
Map and Filter¶
Sets can be mapped and filtered.
\* Map
Squares == {x*x: x \in 1..4}
\* Filter
Evens == {x \in 1..4: x % 2 = 0 }
I’ve found that the best way to remember which is which is by reading the colon as a “where”. So the map is “x squared where x in 1..4”, while the filter is “x in 1..4 where x is even”.
To get all the times in the second half of each hour, we could write:
{t \in ClockType: t[2] >= 30 /\ t[3] = 0}
Map and filter are great for utility, too. The range of a sequence is the set of all elements in the sequence. We can get that with a set map:
Range(seq) == {seq[i]: i \in 1..Len(seq)}
CHOOSE¶
Getting the number of seconds past midnight from a clock value is straightforward. But what about going the other way? If we have a time in seconds, we can get the clock time by
Floor divide by 3600 to get the total hours.
Floor divide again the remainder by 60 to get the total minutes.
Take the remainder of the second division as seconds.
This constructs a clock value from the total seconds. This is how we’d do it in a programming language, where we are implementing algorithms to do things. But it’s also error-prone. What happens if I pass in 90,000? Then this would give me <<25, 0, 0>>
— a value outside of our ClockType
.
Here’s another thing we could do:
Take the set of all possible clock values.
Pick the element in the set that, when converted to seconds, gives us the value.
We don’t do it this way because “the set of all possible clock values” is over 80,000 elements long and doing a find on an 80,000 element list is a waste of resources. But it more closely matches the definition of the conversion, making it more useful for specification. In TLA+ we can write the selection like this:
ToClock(seconds) == CHOOSE x \in ClockType: ToSeconds(x) = seconds
CHOOSE x \in set: P(x)
is the generic “selection” syntax. Try it in Making a Scratchfile.
CHOOSE is useful whenever we need to pull a value from a set.
Now what happens if we write ToClock(86401)
? There are no clock times that have 86,401 seconds. If you try this, TLC will raise an error. This is in contrast to the implementation solution, which will instead give us a nonsense value. 99% of the time if it can’t find a corresponding element of the set, that’s a bug in the specification, an edge case you didn’t consider. Better to harden up the operator:
ToClock(seconds) == CHOOSE x \in ClockType: ToSeconds(x) = seconds % 86400
Troubleshooting
If you see an error like
Attempted to compute the value of an expression of formCHOOSE x in S: P, but no element of S satisfied P.
It’s because you wrote a CHOOSE
that couldn’t find any values. Sometimes this just means you got the expression wrong. But other times, it points to an actual flaw in your system: you expected a value to exist, and it did not. Better write some error-handling logic or you’ll get a nasty surprise in production.
Warning
What if multiple values satisfy CHOOSE
? In this case the only requirement is that the result is deterministic: the engine must always return the same value, no matter what. In practice this means that TLC will always choose the lowest value that matches the set.
LET¶
As you can imagine, TLA+ operators can get quite complex! To make them easier to follow, we can break them into suboperators, using LET
:
ToClock(seconds) ==
LET seconds_per_day == 86400
IN CHOOSE x \in ClockType: ToSeconds(x) = seconds % seconds_per_day
The LET gives us a new definition, locally scoped to ToClock
. seconds_per_day
is an operator that only exists in the definition of this one.
Wait, operator? Yes, we can add parameterized operators in LET
, too!
ThreeMax(a, b, c) ==
LET
Max(x, y) == IF x > y THEN x ELSE y
IN
Max(Max(a, b), c)
And you can define multiple operators in the same LET:
ThreeMax(a, b, c) ==
LET
Max(x, y) == IF x > y THEN x ELSE y
maxab == Max(a, b)
IN
Max(maxab, c)
Each operator in the LET can refer to previously defined operators in that scope. With this we can construct solutions step-by-step.
Let’s calculate ToClock
the “programming way”:
ToClock2(seconds) == LET h == seconds \div 3600 h_left == seconds % 3600 m == h_left \div 60 m_left == h_left % 60 s == m_left IN <<h, m, s>>
>>> ToClock2(90000)
<<25, 0, 0>>
If you have to write a complex operator, breaking it into steps with LET is a great way to make it more understandable.
Summary¶
Operators are top-level “functions”, and evaluate to expressions. They are written
Op(a, b) == expr
, with two equal-signs.Operators can have conditions with
IF-THEN-ELSE
, and suboperators withLET-IN
.
Sequences are collections of ordered values, and are 1-indexed.
Logic is
/\
for “and”,\/
for “or”, and~
for “not”.Logical statements can be written “bullet-points” style.
Sets are collections of unordered, unique values.
We can test if an element is
\in
a set or if a set is a\subseteq
of another set.We can
\union
,\intersect
, and (set difference) two sets.We can CHOOSE elements of sets.
All types have “sets of” that type. For integers it’s
a..b
, for booleans it’sBOOLEAN
, for sets it’sSUBSET
, and for sequences it’sS1 \X S2
.We can map and filter sets.