- Design principles
- Create and plot a binary decision diagram
- CUDD interface: `dd.cudd`
- Lower level: `dd.bdd`
- Example: Reachability analysis
- Syntax for quantified Boolean formulas
- Multi-valued decision diagrams (MDD)
- Installation of C extension modules
- Installing the development version
- Footnotes
- Copying
The interface is in Python. The representation depends on what you want and have installed. For solving small to medium-sized problems, say for teaching or for prototyping new algorithms, pure Python can be more convenient. To work with larger problems, it is better to install the C library CUDD. Let's call these “backends”.
The same user code can run with both the Python and C backends.
You only need to change the import from `import dd.autoref as _bdd` to `import dd.cudd as _bdd`, or import the best available interface:
try:
    import dd.cudd as _bdd
except ImportError:
    import dd.autoref as _bdd
The following sections describe how to use the high-level interface (almost identical in `autoref` and `cudd`). The lower-level interface to the pure-Python implementation in `dd.bdd` is also described, for those interested in more details.
The starting point for using a BDD library is a shared reduced ordered binary decision diagram. Implementations call this a manager. The adjectives mean:
- binary: each node represents a propositional formula
- ordered: variables have a fixed order that changes in a controlled way
- reduced: given variable order, equivalent propositional formulas are represented by a unique diagram
- shared: common subformulas are represented using common elements in memory.
The manager is a directed graph, with each node representing a formula.
Each formula can be understood as a collection of assignments to variables.
As mentioned above, the variables are ordered.
The level of a variable is its index in this order, starting from 0.
The terminal nodes correspond to `TRUE` and `FALSE`, with maximal index.
Each manager is a `BDD` class, residing in a different module:
- `dd.autoref.BDD`: high-level interface to the pure-Python implementation
- `dd.cudd.BDD`: same high-level interface to a C implementation
- `dd.sylvan.BDD`: interface to another C implementation (multi-core)
- `dd.bdd.BDD`: low-level interface to the pure-Python implementation (wrapped by `dd.autoref.BDD`).
The main difference between these modules is how a BDD node is represented:
- `autoref`: `autoref.Function`
- `cudd`: `cudd.Function`
- `sylvan`: `sylvan.Function`
- `bdd`: `int`
In `autoref` and `cudd`, a `Function` class represents a node.
In `bdd`, a signed integer represents a node.
All implementations use negated edges, so logical negation takes constant time.
Roughly four kinds of operations suffice to perform most tasks:
- creating BDDs from formulas
- quantification (`forall`, `exist`)
- substitution (`let`)
- enumerating models that satisfy a BDD
First, instantiate a manager and declare variables
import dd.autoref as _bdd
bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
To create a BDD node for a propositional formula, call the parser
u = bdd.add_expr(r'x /\ y') # conjunction
v = bdd.add_expr(r'z \/ ~ y') # disjunction and negation
w = u & ~ v
The formulas above are in TLA+ syntax.
If you prefer the syntax `&`, `|`, `!`, the parser recognizes those operators too.
The inverse of `BDD.add_expr` is `BDD.to_expr`:
s = bdd.to_expr(u)
print(s)
# 'ite(x, y, False)'
Let's create the BDD of a more colorful Boolean formula
s = r'(x /\ y) <=> (~ z \/ ~ (y <=> x))'
v = bdd.add_expr(s)
In natural language, the expression `s` reads:
“(x and y) if and only if ((not z) or (y xor x))”.
The Boolean constants are `bdd.false` and `bdd.true`, and in syntax `FALSE` and `TRUE`.
Variables can be quantified by calling the methods `exist` and `forall`
u = bdd.add_expr(r'x /\ y')
v = bdd.exist(['x'], u)
or by writing quantified formulas
# there exists a value of x, such that (x and y)
u = bdd.add_expr(r'\E x: x /\ y')
y = bdd.add_expr('y')
assert u == y, (u, y)
# forall x, there exists y, such that (y or z)
u = bdd.add_expr(r'\A x: \E y: y \/ z')
assert u == bdd.true, u
`dd` supports "inline BDD references" via the `@` operator. Each BDD node `u` has an integer representation `int(u)` and a string representation `str(u)`. For example, if the integer representation is `5`, then the string representation is `@5`. These enable you to mention existing BDD nodes in formulas, without the need to expand them as formulas. For example:
u = bdd.add_expr(r'y \/ z')
s = rf'x /\ {u}'
v = bdd.add_expr(s)
v_ = bdd.add_expr(r'x /\ (y \/ z)')
assert v == v_
Substitution comes in several forms:
- replace some variable names by other variable names
- replace some variable names by Boolean constants
- replace some variable names by BDD nodes, so by arbitrary formulas
All these kinds of substitution are performed via the method `let`, which takes a `dict` that maps the variable names to replacements.
To substitute some variables for some other variables
bdd.declare('x', 'p', 'y', 'q', 'z', 'w')
u = bdd.add_expr(r'x \/ (y /\ z)')
# substitute variables for variables (rename)
d = dict(x='p', y='q')
v = bdd.let(d, u)
print(f'support = {v.support}')
# support = {'p', 'q', 'z'}
The other forms are similar
# substitute constants for variables (cofactor)
values = dict(x=True, y=False)
v = bdd.let(values, u)
# substitute BDDs for variables (compose)
d = dict(x=bdd.add_expr(r'z \/ w'))
v = bdd.let(d, u)
A BDD represents a formula, a syntactic object. Semantics is about how syntax is used to describe the world. We could interpret the same formula using different semantics and reach different conclusions.
A formula is usually understood as describing some assignments of values to
variables. Such an assignment is also called a model.
A model is represented as a `dict` that maps variable names to values.
u = bdd.add_expr(r'x \/ y')
assignment = bdd.pick(u) # choose an assignment, `u.pick()` works too
print(f'{assignment = }')
# assignment = {'x': False, 'y': True}
When working with BDDs, two issues arise:
- which variable names are present in an assignment?
- what values do the variables take?
The values are Boolean, because the BDD machinery is designed to reason about that case only; `1 /\ 5` is a formula outside the realm of BDD reasoning.
The choice of variable names is a matter we discuss below.
Consider the example
bdd.declare('x', 'y', 'z')
u = bdd.add_expr(r'x \/ y')
support = u.support
print(f'{support = }')
# support = {'x', 'y'}
This tells us that the variables `x` and `y` occur in the formula that the BDD node `u` represents. Knowing what (Boolean) values a model assigns to the variables `x` and `y` suffices to decide whether the model satisfies `u`. In other words, the values of other variables, like `z`, are irrelevant to evaluating the expression `x \/ y`.
The choice of semantics is yours. Which variables you want an assignment to mention depends on what you are doing with the assignment in your algorithm.
u = bdd.add_expr('x')
# default: variables in support(u)
models = list(bdd.pick_iter(u))
print(models)
# [{'x': True}]
# variables in `care_vars`
models = list(bdd.pick_iter(u, care_vars=['x', 'y']))
print(models)
# [{'x': True, 'y': False}, {'x': True, 'y': True}]
By default, `pick_iter` returns assignments to all variables in the support of the BDD node `u` given as input. In this example, the support of `u` contains one variable: `x` (because the value of the expression `'x'` is independent of the variable `y`).
We can use the argument `care_vars` to specify the variables that we want the assignment to include. The assignments returned will include all variables in `care_vars`, plus the variables that appear along each path traversed in the BDD. Variables in `care_vars` that are unassigned along a path will be exhaustively enumerated (i.e., all combinations of `True` and `False`).
For example, if `care_vars == []`, then the assignments will contain only those variables that appear along the recursive traversal of the BDD. If `care_vars == support(u)`, then the result equals the default result. For `care_vars > support(u)` we will observe more variables in each assignment than the variables in the support.
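For illustration, a minimal sketch of the `care_vars == []` case (the exact enumeration order may differ):
import dd.autoref as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
u = bdd.add_expr(r'x \/ y')
# only the variables along each BDD path appear
models = list(bdd.pick_iter(u, care_vars=[]))
print(models)
# e.g. [{'x': True}, {'x': False, 'y': True}]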
We can also count how many assignments satisfy a BDD. The number depends on how many variables occur in an assignment. The default is the number of variables in the support of that node. You can pass a larger number
bdd.declare('x', 'y', 'z')
u = bdd.add_expr('x')
count = u.count()
print(f'{count = }')
# count = 1
models = list(bdd.pick_iter(u))
print(models)
# [{'x': True}]
# pass a larger number of variables
count = u.count(nvars=3)
print(f'{count = }')
# count = 4
models = list(bdd.pick_iter(u, ['x', 'y', 'z']))
print(models)
# [{'x': True, 'y': False, 'z': False},
# {'x': True, 'y': True, 'z': False},
# {'x': True, 'y': False, 'z': True},
# {'x': True, 'y': True, 'z': True}]
A convenience method for creating a BDD from an assignment `dict` is
d = dict(x=True, y=False, z=True)
u = bdd.cube(d)
v = bdd.add_expr(r'x /\ ~ y /\ z')
assert u == v, (u, v)
The interface is specified in the module `dd._abc`. Although internal, you may want to take a look at the `_abc` module.
Above we discussed semantics from a proof-theoretic viewpoint. The same discussion can be rephrased in terms of function domains containing assignments to variables, so a model-theoretic viewpoint.
You can dump a PDF of all nodes in the manager as follows
import dd.autoref as _bdd
bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
u = bdd.add_expr(r'(x /\ y) \/ ~ z')
bdd.collect_garbage() # optional
bdd.dump('awesome.pdf')
The result is shown below, with the meaning:
- integers on the left signify levels (thus variables)
- each node is annotated with a variable name (like `x`), a dash, and the node index in `dd.bdd.BDD._succ` (mainly for debugging purposes)
- solid arcs represent the “if” branches
- dashed arcs represent the “else” branches
- only an “else” branch can be negated, signified by a `-1` annotation
Negated edges mean that logical negation, i.e., `~`, is applied to the node that is pointed to.
Negated edges and BDD theory won't be discussed here; please refer to a reference from those listed in the docstring of the module `dd.bdd`.
For example, this document
by Fabio Somenzi (CUDD's author).
In the following diagram, the BDD rooted at node `x-7` represents the Boolean function `(x /\ y) \/ ~ z`. For example, for the assignment `dict(x=False)`, the dashed arc from node `x-7` leads to the negation (due to the negated edge, signified by a `-1`) of the node `z-5`. The BDD rooted at `z-5` represents the Boolean function `z`, so its negation is `~ z`.
The nodes `x-2`, `x-4`, `y-3` are intermediate results that arise while constructing the BDD for the Boolean function `(x /\ y) \/ ~ z`. The BDD rooted at node `x-2` represents the Boolean function `x`, and the BDD rooted at node `x-4` represents the Boolean function `x /\ y`.
An external reference to a BDD is an arc that points to a node.
For example, `u` above is an external reference. An external reference can be a complemented arc. External references can be included in a BDD diagram by using the argument `roots` of the method `BDD.dump`. For example
import dd.autoref as _bdd
bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
u = bdd.add_expr(r'(x /\ y) \/ ~ z')
print(u.negated)
v = ~ u
print(v.negated)
bdd.collect_garbage()
bdd.dump('rooted.pdf', roots=[v])
The result is the following diagram, where the node `@-7` is the external reference `v`, which is a complemented arc.
It is instructive to dump the `bdd` with and without collecting garbage.
As mentioned above, there are various ways to apply propositional operators
x = bdd.var('x')
y = bdd.var('y')
u = x & y
u = bdd.apply('and', x, y)
u = bdd.apply('/\\', x, y) # TLA+ syntax
u = bdd.apply('&', x, y) # Promela syntax
Infix Python operators work for BDD nodes in `dd.autoref` and `dd.cudd`, not in `dd.bdd`, because nodes there are plain integers (`int`).
Besides the method `apply`, there is also the ternary conditional method `ite`, but that is more commonly used internally.
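As an illustration, here is a small sketch of `ite` (the arguments are the condition, the “then” case, and the “else” case):
import dd.autoref as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
c = bdd.var('x')
a = bdd.var('y')
b = bdd.var('z')
# if x then y else z
u = bdd.ite(c, a, b)
u_ = bdd.add_expr('ite(x, y, z)')
assert u == u_, (u, u_)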
For single variables, the following are equivalent
u = bdd.add_expr('x')
u = bdd.var('x') # faster
In `autoref`, a few functions (not methods) are available for efficient operations that arise naturally in fixpoint algorithms when transition relations are involved:
- `image`, `preimage`: rename some variables, conjoin, existentially quantify, and rename some variables, all at once
- `copy_vars`: copy the variables of one BDD manager to another manager (a sketch follows below)
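A minimal sketch of `copy_vars`, assuming it is called as `copy_vars(source, target)`:
import dd.autoref as _bdd

source = _bdd.BDD()
source.declare('x', 'y', 'z')
target = _bdd.BDD()
# copy the variable declarations (and their levels) into `target`
_bdd.copy_vars(source, target)
print(target.vars)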
The Python syntax for Boolean operations (`u & v`) and the method `apply` are faster than the method `add_expr`, because the latter invokes the parser (generated using `ply.yacc`).
Using `add_expr` is slower, but quicker to write and more readable.
In practice, prototype new algorithms using `add_expr`, then profile, and if it matters convert the code to use `~`, `&`, `|`, and to call directly `apply`, `exist`, `let`, and other methods.
In the future, a compiler may be added, to compile expressions into functions that can be called multiple times, without invoking again the parser.
The number of nodes in the manager `bdd` is `len(bdd)`.
As noted earlier, each variable corresponds to a level, which is an index in
the variable order. This mapping can be obtained with
level = bdd.level_of_var('x')
var = bdd.var_at_level(level)
assert var == 'x', var
In `autoref.BDD`, the `dict` that maps each defined variable to its corresponding level can also be obtained from the attribute `BDD.vars`
print(bdd.vars)
# {'x': 0, 'y': 1, 'z': 2}
To copy a node from one BDD manager to another manager
a = _bdd.BDD()
a.declare('x', 'y', 'z')
u = a.add_expr(r'(x /\ y) \/ z')
# copy to another BDD manager
b = _bdd.BDD()
b.declare(*a.vars)
v = a.copy(u, b)
In each of the modules `dd.autoref` and `dd.cudd`, references to BDD nodes are represented with a class named `Function`. When `Function` objects are not referenced by any Python variable, CPython deallocates them, thus in the next BDD garbage collection the relevant BDD nodes can be deallocated. For this reason, it is useful to avoid unnecessary references to nodes. This includes the underscore variable `_`, for example:
import dd.autoref
bdd = dd.autoref.BDD()
bdd.declare('x', 'y')
c = [bdd.add_expr(r'x /\ y'), bdd.add_expr(r'x \/ ~ y')]
u, _ = c # `_` is assigned the `Function` that references
# the root of the BDD that represents x \/ ~ y
c = list() # Python deallocates the `list` object created above
# so `u` refers to the root of the BDD that represents x /\ y,
# as expected,
# but `_` still refers to the BDD that represents x \/ ~ y
print(bdd.to_expr(_))
The Python reference held by `_` in the previous example can be avoided by indexing, i.e., `u = c[0]`.
The `dd.autoref` and `dd.bdd` modules can dump the manager to a pickle file and load it back
bdd.dump('manager.p')
and later, or in another run:
bdd = BDD.load('manager.p')
As mentioned earlier, the main difference between the main `dd` modules is what type of object appears at the user interface as a “node”:
- `dd.bdd` gives the user signed integers as nodes
- `dd.autoref` and `dd.cudd` give the user `Function` objects as nodes.
Seldom should this make a difference to the user.
However, for integers, the meaning of the Python operators `~`, `&`, `|`, `^` is unrelated to the BDD manager.
So, if `u = -3` and `v = 25` are nodes in the `dd.bdd.BDD` manager `bdd`, you cannot write `w = u & v` to get the correct result.
You have to use either:
- `BDD.apply('and', u, v)`, or
- `BDD.add_expr(rf'{u} /\ {v}')`.
Unlike `dd.bdd`, the nodes in `autoref` and `cudd` are of class `Function`.
This abstracts away the underlying node representation, so that you can run the same code in both pure Python (with `dd.bdd.BDD` underneath as manager), as well as C (with the `struct` named `DdManager` in `cudd.h` underneath as manager).
The magic methods for `~`, `&`, `|`, `^` implemented by `Function` are its most frequently used aspect.
Two methods called `implies` and `equiv` are available, but it is more readable to use `bdd.apply` or `bdd.add_expr`, or just `v | ~ u` for `u.implies(v)` and `~ (u ^ v)` for `u.equiv(v)`.
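For example, a small check of these equivalences:
import dd.autoref as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y')
u = bdd.add_expr('x')
v = bdd.add_expr(r'x \/ y')
# `x => (x \/ y)` is valid, so both forms equal TRUE
assert u.implies(v) == bdd.true
assert (v | ~ u) == bdd.true
# `equiv` compared with the xor-based form
assert u.equiv(v) == ~ (u ^ v)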
If `u` and `v` are instances of `Function`, and `u == v`, then `u` represents the same BDD as `v`.
This is NOT true for nodes in `dd.bdd`, because `u` and `v` may be nodes from different manager instances.
So, `u == -3 == v` does not suffice to deduce that `u` and `v` represent the same BDD. In this case, we have to ensure that `u` and `v` originate from the same manager. Thus, using `dd.bdd` offers less protection against subtle errors that can go undetected.
Information about a node `u` can be read with a few attributes and magic methods:
- `len(u)` is the number of nodes in the graph that represents `u` in memory
- `u.var` is the name (`str`) of the variable in `BDD.support(u)` with minimal level (recall that variables are ordered)
- `u.level` is the level of `u.var`
- `u.ref` is the reference count of `u`, meaning the number of other nodes `v` with an edge `(v, u)`, plus external references to `u` by the user.
We say that the node `u` is “labeled” with variable `u.var`.
At this point, recall that these diagrams track decisions.
At each node, we decide where to go next, depending on the value of `u.var`.
If `u.var` is true, then we go to node `u.high`, else to node `u.low`.
For this reason, `u.high` is also called the “then” child of `u`, and `u.low` the “else” child.
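The following sketch reads these attributes for a small example:
import dd.autoref as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y')
u = bdd.add_expr(r'x /\ y')
print(len(u))    # number of nodes in the graph of `u`
print(u.var)     # variable labeling the root of `u`
print(u.level)   # level of `u.var`
print(u.ref)     # reference count of the root node
print(u.low)     # "else" child (a `Function`)
print(u.high)    # "then" child (a `Function`)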
This object-oriented appearance is only an external interface for the user's convenience. Typically, the bulk of the nodes aren't referenced externally.
Internally, the nodes are managed efficiently en masse; no `Function`s there.
A `Function.to_expr` method is present, but using `BDD.to_expr` looks tidier.
We said earlier that you can develop with `autoref`, deferring usage of CUDD for when really needed. This raises two questions:
- Why not use `cudd` from the start?
- When should you switch from `autoref` to `cudd`?
The answer to the second question is simple: when your problem takes more time and memory than available.
For light to moderate use, `cudd` probably won't be needed.
Regarding the first question, `dd.cudd` requires you to:
- compile CUDD, and
- cythonize, compile, and link `dd/cudd.pyx`.
The `setup.py` of `dd` can do these for you, as described in the file `README.md`.
However, this may require more attention than appropriate for the occasion.
An example is teaching BDDs in a class on data structures, with the objective for students to play with BDDs, not with `gcc` and linking errors (that's enlightening too, but in the realm of a slightly different class).
If you are interested in tuning CUDD to get most out of it (or because some problem demands it due to its size), then use:
- `BDD.statistics` to obtain the information described in the CUDD Programmer's manual, section “Gathering and interpreting statistics”
- `BDD.configure` to read and set the parameters “max memory”, “loose up to”, “max cache hard”, “min hit”, and “max growth” (a sketch follows below).
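A small sketch of how these could be called; the exact set of returned statistics and the accepted keyword names follow the CUDD documentation, so treat the keyword name below as an assumption:
import dd.cudd as cudd

bdd = cudd.BDD()
bdd.declare('x', 'y')
u = bdd.add_expr(r'x /\ y')
# read statistics (a `dict`)
stats = bdd.statistics()
print(stats)
# read the current configuration, then change a parameter
config = bdd.configure()
print(config)
bdd.configure(max_growth=1.5)  # assumed keyword name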
Due to how CUDD manages variables, the method `add_var` takes as keyword argument the variable index, not the level (which `autoref` does).
The level can still be set with the method `insert_var`.
The methods `dump` and `load` store the BDD of a selected node in a DDDMP file.
Pickling and PDF plotting are not available yet in `dd.cudd`.
An interface to the BuDDy C library also exists, as `dd.buddy`. However, experimentation suggests that BuDDy's heuristics for deciding when to invoke reordering are not as successful.
CUDD is initialized with a `memory_estimate` of 1 GiB (1 gibibyte). If the machine has less RAM, then `cudd.BDD` will raise an error. In this case, pass a smaller initial memory estimate, for example
cudd.BDD(memory_estimate=0.5 * 2**30)
Note that `2**30` bytes is 1 gibibyte (GiB), not 1 gigabyte (GB). Relevant reading: gigabyte, IEC prefixes for binary multiples, and the ISO/IEC 80000 standard.
The functions `and_exists`, `or_forall` in `dd.cudd` offer the functionality of relational products (meaning neighboring variable substitution, conjunction, and quantification, all in one pass over BDDs). This functionality is implemented with `image`, `preimage` in `dd.autoref`.
Note that (pre)image contains substitution, unlike `and_exists`.
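A sketch of `and_exists`, assuming the module-level signature `and_exists(u, v, qvars)`:
import dd.cudd as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
u = bdd.add_expr(r'x /\ y')
v = bdd.add_expr(r'y /\ z')
# \E y:  (x /\ y) /\ (y /\ z), computed in one pass
r = _bdd.and_exists(u, v, ['y'])
r_ = bdd.add_expr(r'\E y:  (x /\ y) /\ (y /\ z)')
assert r == r_, (r, r_)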
The function `cudd.reorder` is similar to `autoref.reorder`, but does not default to invoking automated reordering.
Typical use of CUDD enables dynamic reordering.
When a BDD manager `dd.cudd.BDD` is deallocated, it asserts that no BDD nodes have nonzero reference count in CUDD. By default, this assertion should never fail, because automated reference counting makes it impossible.
If the assertion fails, then the exception is ignored, a message is printed instead, and Python continues execution (read also the Cython documentation of `__dealloc__`, the Python documentation of "Finalization and De-allocation", and of `tp_dealloc`).
In case the user decides to explicitly modify the reference counts, ignoring exceptions can make it easier for reference counting errors to go unnoticed. To make Python exit when reference counting errors exist before a BDD manager is deallocated, use:
import dd.cudd as cudd
bdd = cudd.BDD()
# ... statements ...
# raise `AssertionError` if any nodes have nonzero reference count
# just before deallocating the BDD manager
assert len(bdd) == 0, len(bdd)
Note that the meaning of `len` for the class `dd.autoref.BDD` is slightly different. As a result, the code for checking that no BDD nodes have nonzero reference count in `dd.autoref` is:
import dd.autoref as _bdd
bdd = _bdd.BDD()
# ... statements ...
# raise `AssertionError` if any nodes have nonzero reference count
# just before deallocating the BDD manager
bdd._bdd.__del__() # directly calling `__del__` does raise
# any exception raised inside `__del__`
Note that if an assertion fails inside `__del__`, then the exception is ignored, a message is printed to `sys.stderr` instead, and Python continues execution. This is similar to what happens with exceptions raised inside `__dealloc__` of extension types in Cython.
When `__del__` is called directly, exceptions raised inside it are not ignored.
We now discuss some more details about the pure-Python implementation in `dd.bdd`. Two interfaces are available:
- convenience: the module `dd.autoref` wraps `dd.bdd` and takes care of reference counting using `__del__`.
- "low level": the module `dd.bdd` requires that the user in/decrement the reference counters associated with nodes that are used outside of a `BDD`.
The pure-Python module `dd.bdd` can be used directly, which allows access more extensive than `dd.autoref`.
The `n` variables in a `dd.bdd.BDD` are ordered from `0` (top level) to `n - 1` (bottom level).
The terminal node `1` is at level `n`.
The constant `TRUE` is represented by `+1`, and `FALSE` by `-1`.
To avoid running out of memory, a BDD manager deletes nodes when they are not used anymore. This is called garbage collection. So, two things need to happen:
- keep track of node “usage”
- invoke garbage collection
Garbage collection is triggered either explicitly by the user, or when invoking the reordering algorithm. To prevent nodes from being garbage collected, their reference counts should be incremented, which is discussed in the next section.
Node usage is tracked with reference counting, for each node.
In `autoref`, the reference counts are maintained by the constructor and destructor methods of `Function` (hence the “auto”). These methods are invoked when the `Function` object is not referenced any more by variables, so Python decides to delete it.
In `dd.bdd`, you have to perform the reference counting by suitably adding to and subtracting from the counter associated with the node you reference.
Also, garbage collection is invoked either explicitly or by reordering (explicit or dynamic). So if you don't need to collect garbage, then you can skip the reference counting (not recommended).
The reference counters live inside `dd.bdd.BDD._ref`.
To guard against corner cases, like attempting to decrement a zero counter, use
- `BDD.incref(u)`: +1 to the counter of node `u`
- `BDD.decref(u)`: -1 to the same counter.
The method names `incref` and `decref` originate from the Python reference counting implementation. If we want node `u` to persist after garbage collection, then it needs to be actively referenced at least once
u = bdd.add_expr(s)
bdd.incref(u)
Revisiting an earlier example, manual reference counting looks like:
import dd.bdd as _bdd
bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
s = r'(x /\ y) \/ ~ z' # TLA+ syntax
s = '(x & y) | ! z' # Promela syntax
u = bdd.add_expr(s)
bdd.incref(u)
bdd.dump('before_collections.pdf')
bdd.collect_garbage()
bdd.dump('middle.pdf')
bdd.decref(u)
bdd.collect_garbage()
bdd.dump('after_collections.pdf')
A formula may depend on a variable, or not. There are two ways to find out
u = bdd.add_expr(r'x /\ y') # TLA+ syntax
u = bdd.add_expr('x & y') # Promela syntax
c = 'x' in bdd.support(u) # more readable
c_ = bdd.is_essential(u, 'x') # slightly more efficient
assert c == True, c
assert c == c_, (c, c_)
c = 'z' in bdd.support(u)
assert c == False, c
Given a BDD, the size of its graph representation depends on the variable order. Reordering changes the variable order. Reordering optimization searches for a variable order better than the current one. Dynamic reordering is the automated invocation of reordering optimization. BDD managers use heuristics to decide when to invoke reordering, because it is NP-hard to find a variable order that minimizes a given BDD.
The function `dd.bdd.reorder` implements Rudell's sifting algorithm.
This reordering heuristic is the most commonly used, also in CUDD.
Dynamic variable reordering can be enabled by calling:
import dd.bdd as _bdd
bdd = _bdd.BDD()
bdd.configure(reordering=True)
By default, dynamic reordering in `dd.bdd.BDD` is disabled.
This default is unlike `dd.cudd`, and will change in the future to enabled.
You can also invoke reordering explicitly when desired, besides dynamic invocation. For example:
import dd.bdd as _bdd
bdd = _bdd.BDD()
vrs = [f'x{i}' for i in range(3)]
vrs.extend(f'y{i}' for i in range(3))
bdd.declare(*vrs)
print(bdd.vars)
# {'x0': 0, 'x1': 1, 'x2': 2, 'y0': 3, 'y1': 4, 'y2': 5}
s = r'(x0 /\ y0) \/ (x1 /\ y1) \/ (x2 /\ y2)' # TLA+ syntax
s = '(x0 & y0) | (x1 & y1) | (x2 & y2)' # Promela syntax
u = bdd.add_expr(s)
bdd.incref(u)
number_of_nodes = len(bdd)
print(f'{number_of_nodes = }')
# number_of_nodes = 22
# collect intermediate results produced while making u
bdd.collect_garbage()
number_of_nodes = len(bdd)
print(f'{number_of_nodes = }')
# number_of_nodes = 15
bdd.dump('before_reordering.pdf')
# invoke variable order optimization by sifting
_bdd.reorder(bdd)
number_of_nodes = len(bdd)
print(f'{number_of_nodes = }')
# number_of_nodes = 7
print(bdd.vars)
# {'x0': 0, 'x1': 3, 'x2': 5, 'y0': 1, 'y1': 2, 'y2': 4}
bdd.dump('after_reordering.pdf')
If you want to obtain a particular variable order, then give the desired variable order as a `dict` to the function `reorder`.
my_favorite_order = dict(
x0=0, x1=1, x2=2,
y0=3, y1=4, y2=5)
number_of_nodes = len(bdd)
print(f'{number_of_nodes = }')
# number_of_nodes = 7
_bdd.reorder(bdd, my_favorite_order)
number_of_nodes = len(bdd)
print(f'{number_of_nodes = }')
# number_of_nodes = 15
(Request such inefficient reordering only if you have some special purpose.)
You can turn logging to `DEBUG`, if you want to watch reordering in action.
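For example, a minimal sketch, assuming the reordering messages are emitted on the module's `'dd.bdd'` logger:
import logging

logger = logging.getLogger('dd.bdd')
logger.setLevel(logging.DEBUG)
logger.addHandler(logging.StreamHandler())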
In some cases you might want to make some pairs of variables adjacent to
each other, but don't care about the location of each pair in the
variable order (e.g., this enables efficient variable renaming).
Use `reorder_to_pairs`.
All reordering algorithms rely on the elementary operation of swapping two adjacent levels in the manager. You can do this by calling `BDD.swap`, so you can implement a reordering optimization algorithm different from Rudell's.
The difficult part to implement is `swap`, not the optimization heuristic.
The garbage collector in `dd.bdd.BDD.collect_garbage` works by scanning all nodes, marking the unreferenced ones, then collecting those (mark-and-sweep).
The function `dd.bdd.to_nx(bdd, roots)` converts the subgraph of `bdd` rooted at `roots` to a `networkx.MultiDiGraph`.
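For example (this requires that `networkx` be installed):
import dd.bdd as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y')
u = bdd.add_expr(r'x /\ y')
g = _bdd.to_nx(bdd, [u])
print(type(g))  # a networkx.MultiDiGraph
print(len(g))   # number of nodes in the converted subgraph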
The remaining methods of `dd.bdd.BDD` will be of interest more to developers of algorithms that manipulate or read the graph of BDD nodes itself.
For example, say you wanted to write a little function that explores the BDD graph rooted at node `u`.
def print_descendants_forgetful(bdd, u):
    i, v, w = bdd._succ[abs(u)]
    print(u)
    # u is terminal ?
    if v is None:
        return
    print_descendants_forgetful(bdd, v)
    print_descendants_forgetful(bdd, w)
In the worst case, this can take time exponential in the number of nodes of `bdd`.
To make sure that it takes linear time, we have to remember visited nodes
def print_descendants(bdd, u, visited):
    p = abs(u)
    i, v, w = bdd._succ[p]
    # visited ?
    if p in visited:
        return
    # remember
    visited.add(p)
    print(u)
    # u is terminal ?
    if v is None:
        return
    print_descendants(bdd, v, visited)
    print_descendants(bdd, w, visited)
Run it with `visited = set()`.
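For example, using the function `print_descendants` defined above:
import dd.bdd as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y', 'z')
u = bdd.add_expr(r'(x /\ y) \/ z')
print_descendants(bdd, u, visited=set())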
New nodes are created with `BDD.find_or_add(level, low, high)`.
Always use this method to make a new node, because it first checks in the unique table `BDD._pred` whether the node at `level` with successors `(low, high)` already exists. This uniqueness is at the heart of reduced ordered BDDs, and the reason for their efficiency.
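A minimal sketch of building nodes this way (complement-edge normalization is handled internally; the reference counts of the new nodes still need to be managed as described above):
import dd.bdd as _bdd

bdd = _bdd.BDD()
bdd.declare('x', 'y')
# node for variable `y`: level 1, low -> FALSE (-1), high -> TRUE (1)
v = bdd.find_or_add(1, -1, 1)
assert v == bdd.var('y'), v
# node for `x /\ y`: level 0, low -> FALSE, high -> the node for `y`
u = bdd.find_or_add(0, -1, v)
assert u == bdd.add_expr(r'x /\ y'), u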
Throughout `dd.bdd`, nodes are frequently referred to as edges. The reason is that all nodes stored are positive integers. Negative integers signify negation, and appear only as either edges to successors (negated edges), or references given to the user (because a negation is moved above the root node, for reasons of representation uniqueness).
The method `BDD.levels` returns a generator of tuples `(u, i, v, w)`, over all nodes `u` in `BDD._succ`, where:
- `u` is a node in the iteration
- `i` is the level that `u` lives at
- `v` is the low (“else”) successor
- `w` is the high (“then”) successor
The iteration starts from the bottom (largest level), just above the terminal node for “true” and “false” (which is `-1` in `dd.bdd`). It scans each level, moving upwards, until it reaches the top level (indexed with 0).
The nodes contained in the graph rooted at node `u` are `bdd.descendants([u])`.
An efficient implementation of `let` that works only for variables with level <= the level of any variable in the support of node `u` is `_top_cofactor(u, level)`.
Finally, `BDD.reduction` is of only academic interest. It takes a binary decision diagram that contains redundancy in its graph representation, and reduces it to the non-redundant, canonical form that corresponds to the chosen variable order. This is the function described originally by Bryant.
It is never used, because all BDD graphs are constructed bottom-up to be reduced. To observe `reduction` in action, you have to manually create a BDD graph that is not reduced.
We have been talking about BDDs, but you're probably here because you want to use them. A common application is manipulation of Boolean functions in the context of relations that represent dynamics, sometimes called transition relations.
Suppose that we have an elevator that moves between three floors. We are interested in the elevator's location, which can be at one of the three floors. So, we can pretend that 0, 1, 2 are the three floors.
Using bits, we need at least two bits to represent the three values `{0, 1, 2}`.
Two bits can take one value too many, so we should tell the computer that 3 is not a possible floor in our model of the three floors.
Suppose that now the elevator is at the floor encoded by the bits `x0`, `x1`, and next at the floor encoded by `x0'`, `x1'` (read “x prime”).
The identifiers `["x0", "x1", "x0'", "x1'"]` are just four bits.
The elevator can move as follows
import dd.autoref as _bdd
bdd = _bdd.BDD()
bdd.declare("x0", "x0'", "x1", "x1'")
# TLA+ syntax
s = (
r"((~ x0 /\ ~ x1) => ( (~ x0' /\ ~ x1') \/ (x0' /\ ~ x1') )) /\ "
r"((x0 /\ ~ x1) => ~ (x0' /\ x1')) /\ "
r"((~ x0 /\ x1) => ( (~ x0' /\ x1') \/ (x0' /\ ~ x1') )) /\ "
r" ~ (x0 /\ x1)")
transitions = bdd.add_expr(s)
We can now find from which floors the elevator can reach floor 2. To compute this, we find the floors that either:
- are already inside the set `{2}`, or
- can reach `q` after one transition.
This looks for existence of floors, hence the existential quantification.
We enlarge `q` by the floors we found, and repeat.
We continue this backward iteration until reaching a least fixpoint (meaning that two successive iterates are equal).
# target is the set {2}
target = bdd.add_expr(r'~ x0 /\ x1')
# start from empty set
q = bdd.false
qold = None
prime = {"x0": "x0'", "x1": "x1'"}
qvars = {"x0'", "x1'"}
# fixpoint reached ?
while q != qold:
qold = q
next_q = bdd.let(prime, q)
u = transitions & next_q
# existential quantification over x0', x1'
pred = bdd.quantify(u, qvars, forall=False)
q = q | pred | target
At the end, we obtain
expr = q.to_expr()
print(expr)
# '(! ite(x0, x1, False))'
which is the set `{0, 1, 2}` (it does not contain 3, because that would evaluate to `! ite(True, True, False)`, which equals `! True`, so `False`).
More about building symbolic algorithms, together with infrastructure for arithmetic and automata, and examples, can be found in the package `omega`.
The method `BDD.add_expr` parses the following grammar.
The TLA+ module `dd_expression_grammar` extends the TLA+ module `BNFGrammars`, which is defined on page 184 of the book "Specifying Systems".
------- MODULE dd_expression_grammar -------
(* Grammar of expressions parsed by the
function `dd._parser.Parser.parse`.
*)
EXTENDS
BNFGrammars
COMMA == tok(",")
maybe(x) ==
| Nil
| x
comma1(x) ==
x & (COMMA & x)^*
NUMERAL == OneOf("0123456789")
is_dd_lexer_grammar(L) ==
/\ L.A = tok("\\A")
/\ L.E = tok("\\E")
/\ L.S = tok("\\S")
/\ L.COLON = tok(":")
/\ L.COMMA = COMMA
/\ L.NOT = tok("~")
/\ L.AND = tok("/\\")
/\ L.OR = tok("\\/")
/\ L.IMPLIES = tok("=>")
/\ L.IFF = tok("<=>")
/\ L.EQ = tok("=")
/\ L.NEQ = tok("#")
/\ L.EXCLAMATION = tok("!")
/\ L.ET = tok("&")
/\ L.PIPE = tok("|")
/\ L.RARROW = tok("->")
/\ L.LR_ARROW = tok("<->")
/\ L.CIRCUMFLEX = tok("^")
/\ L.SLASH = tok("/")
/\ L.AT = tok("@")
/\ L.LPAREN = tok("(")
/\ L.RPAREN = tok(")")
/\ L.FALSE = Tok({
"FALSE",
"false"
})
/\ L.TRUE = Tok({
"TRUE",
"true"
})
/\ L.ITE = tok("ite")
/\ L.NAME =
LET
LETTER ==
| OneOf("abcdefghijklmnopqrstuvwxyz")
| OneOf("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
UNDERSCORE == tok("_")
start ==
| LETTER
| UNDERSCORE
DOT == tok(".")
PRIME == tok("'")
symbol ==
| start
| NUMERAL
| DOT
| PRIME
tail == symbol^*
IN
start & tail
/\ L.INTEGER =
LET
DASH == tok("-")
_dash == maybe(DASH)
numerals == NUMERAL^+
IN
_dash & numerals
is_dd_parser_grammar(L, G) ==
/\ \A symbol \in DOMAIN L:
G[symbol] = L[symbol]
/\ G.expr =
(* predicate logic *)
| L.A & G.names & L.COLON & G.expr
(* universal quantification ("forall") *)
| L.E & G.names & L.COLON & G.expr
(* existential quantification ("exists") *)
| L.S & G.pairs & G.COLON & G.expr
(* renaming of variables ("substitution") *)
(* propositional *)
(* TLA+ syntax *)
| G.NOT & G.expr
(* negation ("not") *)
| G.expr & G.AND & G.expr
(* conjunction ("and") *)
| G.expr & G.OR & G.expr
(* disjunction ("or") *)
| G.expr & G.IMPLIES & G.expr
(* implication ("implies") *)
| G.expr & G.IFF & G.expr
(* equivalence ("if and only if") *)
| G.expr & G.NEQ & G.expr
(* difference (negation of `<=>`) *)
(* Promela syntax *)
| L.EXCLAMATION & G.expr
(* negation *)
| G.expr & L.ET & G.expr
(* conjunction *)
| G.expr & L.PIPE & G.expr
(* disjunction *)
| G.expr & L.RARROW & G.expr
(* implication *)
| G.expr & L.LR_ARROW & G.expr
(* equivalence *)
(* other *)
| G.expr & L.CIRCUMFLEX & G.expr
(* xor (exclusive disjunction) *)
| L.ITE & L.LPAREN &
G.expr & L.COMMA &
G.expr & L.COMMA &
G.expr &
L.RPAREN
(* ternary conditional
(if-then-else) *)
| G.expr & L.EQ & G.expr
| L.LPAREN & G.expr & L.RPAREN
(* parentheses *)
| L.NAME
(* identifier (bit variable) *)
| L.AT & L.INTEGER
(* BDD node reference *)
| L.FALSE
| L.TRUE
(* Boolean constants *)
/\ G.names =
comma1(L.NAME)
/\ G.pairs =
comma1(G.pair)
/\ G.pair =
L.NAME & L.SLASH & L.NAME
dd_grammar ==
LET
L == LeastGrammar(is_dd_lexer_grammar)
is_parser_grammar(G) ==
is_dd_parser_grammar(L, G)
IN
LeastGrammar(is_parser_grammar)
============================================
Comments are written using TLA+ syntax:
(* this is a doubly-delimited comment *)
\* this is a trailing comment
Doubly-delimited comments can span multiple lines.
The token precedence (lowest to highest) and associativity is:
- `:` (left)
- `<=>`, `<->` (left)
- `=>`, `->` (left)
- `-` (left)
- `#`, `^` (left)
- `\/`, `|` (left)
- `/\`, `&` (left)
- `=` (left)
- `~`, `!`
- `-` unary minus, as in `-5`
The meaning of a number of operators, assuming `a`, `b`, and `c` take Boolean values (these identities are checked with BDDs below):
- `a => b` means `b \/ ~ a`
- `a <=> b` means `(a /\ b) \/ (~ a /\ ~ b)`
- `a # b` means `(a /\ ~ b) \/ (b /\ ~ a)`
- `ite(a, b, c)` means `(a /\ b) \/ (~ a /\ c)`
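For example, these identities can be checked directly with BDDs:
import dd.autoref as _bdd

bdd = _bdd.BDD()
bdd.declare('a', 'b', 'c')
assert bdd.add_expr('a => b') == bdd.add_expr(r'b \/ ~ a')
assert bdd.add_expr('a <=> b') == bdd.add_expr(r'(a /\ b) \/ (~ a /\ ~ b)')
assert bdd.add_expr('a # b') == bdd.add_expr(r'(a /\ ~ b) \/ (b /\ ~ a)')
assert bdd.add_expr('ite(a, b, c)') == bdd.add_expr(r'(a /\ b) \/ (~ a /\ c)')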
Both `setup.py` and a developer may want to force a rebuild of the parser table. For this purpose, each module that contains a parser also has a function `_rewrite_tables` that deletes and rewrites the tables.
If the module is run as a script, then the `__main__` stanza calls this function to delete and then write the parser tables to the current directory.
The parsers use `astutils`.
MDDs are a representation for functions from integer variables to Boolean values.
The primary motivation for implementing MDDs was to produce more readable string and graphical representations of BDDs.
MDDs are implemented in pure Python.
The interface is “low”, similar to `dd.bdd`, with reference counting managed by the user.
A core of necessary methods has been implemented, named as the `BDD` methods with the same functionality in other `dd` modules.
Complemented edges are used here too.
BDDs are predicates over binary variables (two-valued).
MDDs are predicates over integer variables (multi-valued).
As expected, the data structure and algorithms for representing an MDD are
a slight generalization of those for BDDs.
For example, compare the body of `dd.mdd.MDD.find_or_add` with `dd.bdd.BDD.find_or_add`.
The variables are defined by a `dict` argument to the constructor (in the future, dynamic variable addition may be implemented, by adding a method `MDD.add_var`)
import dd.mdd as _mdd
dvars = dict(
x=dict(level=0, len=4),
y=dict(level=1, len=2))
mdd = _mdd.MDD(dvars)
So, variable `x` is an integer that can take values in `range(4)`.
Currently, the main use of an MDD is for a more comprehensive representation of a predicate stored in a BDD. This is achieved with the function `dd.mdd.bdd_to_mdd`, which takes a `dd.bdd.BDD` and a mapping from MDD integers to BDD bits. Referencing of BDD nodes is necessary, because `bdd_to_mdd` invokes garbage collection on the BDD.
import dd.bdd as _bdd
import dd.mdd as _mdd
bits = dict(x=0, y0=1, y1=2)
bdd = _bdd.BDD(bits)
u = bdd.add_expr(r'x \/ (~ y0 /\ y1)')
bdd.incref(u)
# convert BDD to MDD
ints = dict(
x=dict(level=1, len=2, bitnames=['x']),
y=dict(level=0, len=4, bitnames=['y0', 'y1']))
mdd, umap = _mdd.bdd_to_mdd(bdd, ints)
# map node `u` from BDD to MDD
v = umap[abs(u)]
# complemented ?
if u < 0:
v = - v
print(v)
# -3
expr = mdd.to_expr(v)
print(expr)
# (! if (y in set([0, 1, 3])): (if (x = 0): 1,
# elif (x = 1): 0),
# elif (y = 2): 0)
# plot MDD with graphviz
mdd.dump('mdd.pdf')
Note that the `MDD` node `v` is complemented (`-3 < 0`), so the predicate is the negation of the value computed for node `y-3` in the next image.
By default, the package `dd` installs only its Python modules. You can select to install Cython extensions using environment variables:
- `DD_FETCH=1`: download CUDD v3.0.0 sources from the internet, check the tarball's hash, unpack the tarball, and `make` CUDD
- `DD_CUDD=1`: build module `dd.cudd`, for CUDD BDDs
- `DD_CUDD_ZDD=1`: build module `dd.cudd_zdd`, for CUDD ZDDs
- `DD_SYLVAN=1`: build module `dd.sylvan`, for Sylvan BDDs
- `DD_BUDDY=1`: build module `dd.buddy`, for BuDDy BDDs
Example scripts are available that fetch and install the Cython bindings.
Activating the Cython build by directly running `python setup.py` is an alternative to using environment variables (e.g., `export DD_CUDD=1`, etc.). The relevant command-line options of `setup.py` are:
- `--fetch`: same effect as `DD_FETCH=1`
- `--cudd`: same effect as `DD_CUDD=1`
- `--cudd_zdd`: same effect as `DD_CUDD_ZDD=1`
- `--sylvan`: same effect as `DD_SYLVAN=1`
- `--buddy`: same effect as `DD_BUDDY=1`
These options work for `python setup.py sdist` and `python setup.py install`, but directly running `python setup.py` is deprecated by `setuptools >= 58.3.0`.
Example:
pip download dd --no-deps
tar xzf dd-*.tar.gz
pushd dd-*/
# `pushd` means `cd`
python setup.py install --fetch --cudd --cudd_zdd
popd
`pushd directory` is akin to `stack.append(directory)` in Python, and `popd` to `stack.pop()`.
The path to an existing CUDD build directory can be passed as an argument, for example:
python setup.py install \
--fetch \
--cudd="/home/user/cudd"
The following also works for building source tarballs and wheels:
pip install cython
export DD_FETCH=1 DD_CUDD=1
python -m build --no-isolation
To build a source tarball:
DD_CUDD=1 python -m build --sdist --no-isolation
If you build and install CUDD, Sylvan, or BuDDy yourself, then ensure that:
- the header files and libraries are present, and
- the compiler is configured appropriately (include, linking, and library configuration),
either by setting environment variables prior to calling `pip`, or by editing the file `download.py`.
Currently, `download.py` expects to find Sylvan under `dd/sylvan` and built with Autotools (for an example, read `.github/workflows/setup_build_env.sh`). If the path differs in your environment, remember to update it.
If you prefer defining installation directories, then follow Cython's instructions to define `CFLAGS` and `LDFLAGS` before installing. You need to have copied `CuddInt.h` to the installation's include location (CUDD omits it).
For installing the development version of `dd` from the `git` repository, an alternative to cloning the repository and installing from the cloned repository is to use `pip`:
pip install https://github.com/tulip-control/dd/archive/main.tar.gz
or with pip
using git
(this alternative requires that git
be installed):
pip install git+https://github.com/tulip-control/dd
A `git` URL can also be passed to `pip download`, for example:
pip download --no-deps https://github.com/tulip-control/dd/archive/main.tar.gz
The extension `.zip` can also be used for the name of the archive file in the URL. Analogously, with `pip` using `git`:
pip download --no-deps git+https://github.com/tulip-control/dd
Note that the naming of paths within the archive file downloaded from GitHub in this way will differ, depending on whether `https://` or `git+https://` is used.
- The `Makefile` contains the rules `sdist` and `wheel` that create distributions for uploading to PyPI with `twine`.
- Press `Ctrl + \` on Linux and Darwin to quit the Python process when CUDD computations take a long time. Read `stty -a` for your settings.
- If you are interested in exploring other decision diagram packages, you can find a list at `github.com/johnyf/tool_lists/`.
This document is copyright 2015-2022 by California Institute of Technology. All rights reserved. Licensed under 3-clause BSD.