Information and Computation 204 (2006) 1526­1574
www.elsevier.com/locate/ic
Decision procedures for term algebras with integer
constraints
Ting Zhang *, Henny B. Sipma, Zohar Manna
Computer Science Department, Stanford University, USA
Received 1 January 2005; revised 30 May 2005
Available online 25 July 2006
Abstract
Term algebras can model recursive data structures which are widely used in programming languages. To
verify programs we must be able to reason about these structures. However, as programming languages often
involve multiple data domains, in program verification decision procedures for a single theory are usually not
applicable. An important class of mixed constraints consists of combinations of data structures with integer
constraints on the size of data structures. Such constraints can express memory safety properties such as ab-
sence of memory overflow and out-of-bound array access, which are crucial for program correctness. In this
paper we extend the theory of term algebras with the length function which maps a term to its size, resulting in a
combined theory of term algebras and Presburger arithmetic. This arithmetic extension provides a natural but
tight coupling between the two theories, and hence the general purpose combination methods like Nelson-Op-
pen combination are not applicable. We present decision procedures for quantifier-free theories in structures
with an infinite constant domain and with a finite constant domain. We also present a quantifier elimination
procedure for the extended first-order theory that can remove a block of existential quantifiers in one step.
 2006 Elsevier Inc. All rights reserved
Keywords: Decision procedures; Recursive data structures; Term algebras; Presburger arithmetic; Quantifier elimination;
Combination of satisfiability procedures
This research was supported in part by NSF Grants CCR-01-21403, CCR-02-20134, CCR-02-09237, CNS-0411363,
and CCF-0430102, by ARO Grant DAAD19-01-1-0723, and by NAVY/ONR contract N00014-03-1-0939.

Corresponding author.
E-mail addresses: tingz@cs.stanford.edu (T. Zhang), sipma@cs.stanford.edu (H.B. Sipma), zm@cs.stanford.edu
(Z. Manna).
0890-5401/$ - see front matter  2006 Elsevier Inc. All rights reserved
doi:10.1016/j.ic.2006.03.004
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1527
1. Introduction
Recursively defined data structures are essential constructs in programming languages.
Intuitively, a data structure is recursively defined if it is partially composed of smaller or simpler
instances of the same structure. Examples include lists, stacks, counters, trees, records and queues.
To verify programs containing recursively defined data structures we must be able to reason about
these data structures. Decision procedures for several data structures exist. However, in program
verification decision procedures for a single theory are usually not applicable as programming lan-
guages often involve multiple data domains, resulting in verification conditions that span multiple
theories. Common examples of such mixed constraints are combinations of data structures with
integer constraints on the size of those structures. Such constraints can express memory safety
properties such as absence of memory overflow and out-of-bound array access, which are crucial
to program correctness.
In this paper we consider the integration of Presburger arithmetic with term algebras which can
represent an important class of recursively defined data structures known as recursive data struc-
tures. This class of structures satisfies the following two properties of term algebras: (i) the data
domain is the set of data objects generated exclusively by applying constructors, and (ii) each data
object is uniquely generated. Examples of such structures include lists, stacks, counters, trees and
records; queues do not belong to this class as they are not uniquely generated: they can grow at
both ends.
Our language of the integrated theory has two sorts; the integer sort and the term sort . The
language is the set-theoretic union of the language of term algebras and the language of Presburger
arithmetic plus the additional length function |  | :  . Formulas are formed from term literals
and integer literals in the usual way. Term literals are exactly the literals in the theory of term alge-
bras. Integer literals are those that can be built up from primitive integer terms (the length function
applied to -terms), addition and the other usual arithmetic functions and relations.
We present decision procedures for the quantifier-free and the first-order theory of term algebras
with length function and integer constraints, for structures with both finite and infinite constant
domain. In the rest of the paper we will use the following notation for these theories. Th
(TA) and
Th
(TA ) denote the quantifier-free theory of, respectively, pure term algebras and term algebras
with a length function and Presburger arithmetic constraints. Similarly, Th(TA) and Th(TA ) de-
note the full first-order theory of pure term algebras and term algebras with a length function and
Presburger arithmetic constraints. When we separately consider decision procedures for structures
with infinite constant domain, we add an  superscript, for example, Th
(TA
).
The decision procedures for Th
(TA ) are based on Oppen's algorithm for acyclic recursive data
structures with infinite data domain (which, essentially, is Th
(TA
)) [26]. To decide satisfiability of
a term constraint , Oppen's procedure constructs a DAG for , extracts from this DAG all implied
equalities between terms, and then checks for inconsistencies with disequalities in . We extend this
procedure to Th
(TA ) by extracting an implied length constraint from the term constraint. We
show that for structures with infinite constant domain such a length constraint, which is satisfiable
if and only if the term constraint  is satisfiable, can be effectively computed and is expressible
by a quantifier-free Presburger formula linear in the size of . For structures with finite constant
domain we introduce an additional counting constraint to account for the fact that with finitely
many constants the number of distinct terms of a particular length is bounded. We show that also
1528 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
this counting constraint is expressible by a quantifier-free Presburger formula. The latter decision
procedure directly extends Oppen's decision procedure for infinite data domains to finite domains.
For the first-order theory, we first present a new quantifier elimination procedure for Th(TA)
and then extend it to an elimination procedure for Th(TA ). Our elimination procedure for Th(TA)
is based on the elimination procedure in [13], but can eliminate blocks of quantifiers of the same
kind in one step. We extend it to a decision procedure for Th(TA ) by, again, extracting integer
constraints from term constraints combined with a reduction of quantifiers on term variables to
quantifiers on integer variables.
The decision procedures for Th
(TA ) and Th(TA ) were first published, without proofs, in
[38]. The improved version that allows elimination of blocks of quantifiers was published in [39]. In
that paper we showed that the complexity of our decision procedures was 2k-fold exponential for k
quantifier alternations for Th(TA ). This paper provides an extended presentation of the results in
both papers, improves the complexity of the decision procedure for Th(TA ) to k-fold exponential
for k quantifier alternations, and includes all the proofs.
Related work and comparison. Our component theories are both decidable. Presburger arithmetic
was first shown to be decidable in 1929 by quantifier elimination [10]. A more efficient algorithm
was later discovered by Cooper [7] and further improved by Reddy and Loveland [28].
It is well-known that recursive data structures can be modeled as term algebras which were shown
to be decidable by Mal'cev using quantifier elimination [23]. This result was proved again several
times in different settings [21,6,13,5,2,30,18,19,38].
Quantifier elimination has been used to obtain decidability results for various extensions of term
algebras. Maher showed the decidability of the theory of infinite and rational trees [21]. Comon
and Delor presented an elimination procedure for term algebras with membership predicate in the
regular tree language [5]. Backofen presented an elimination procedure for structures of feature
trees with arity constraints [2]. Rybina and Voronkov showed the decidability of term algebras with
queues [30]. Kuncak and Rinard showed the decidability of term powers, which are term algebras
augmented with coordinate-wise defined predicates [18].
Decision procedures for the quantifier-free theory of recursive data structures were discovered
by Nelson, Oppen, Downey, Sethi and Tarjan [24,26,9]. Oppen gave a linear algorithm for acyclic
structures [26] and (with Nelson) a quadratic algorithm for cyclic structures [24]. If the values of
the selector functions on constants are specified, then the problem is NP-complete [26].
A general combination method for decision procedures for quantifier-free theories was developed
by Nelson and Oppen in 1979 [25]. The method requires that component theories be loosely cou-
pled, that is, have disjoint signatures, and are stably infinite1 [32]. Tinelli and Ringeissen presented a
general theoretical framework for combining satisfiability procedures of theories with non-disjoint
signatures [33]. Tinelli and Zarba generalized Nelson­Oppen's method to theories in multisorted
languages [34]. Armando et al. presented a uniform framework using superposition for deriving
decision procedures for certain combined theories [1]. Ghilardi presented a set of model-theoretical
conditions for the existence of Nelson­Oppen combination schema on theories having non-disjoint
signatures [12]. However, none of these general purpose combination methods are applicable to the
1 A theory is stably infinite if a quantifier-free formula in the theory is satisfiable if and only if it is satisfiable in an
infinite model.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1529
combination of our component theories, which is a multisorted theory with a function mapping
elements in one sort to another.
Zarba constructed decision procedures for a combined theory of sets and integers [36] and a
theory of multisets and integers [37]. The integration of Presburger arithmetic with recursive data
structures was discussed by Bjrner [4] and an incomplete procedure was implemented in STeP
(Stanford Temporal Prover) [3].
Integer constraints not only arise in the combination of decision procedures, but they are also
useful as an auxiliary extension to encode properties on data structures. This line of investigation
goes back to Skolem who showed the decidability of the first-order theory of Boolean algebras by
reducing constraints on sets to constraints on the cardinality of sets [31]. It readily follows from
the reduction technique that the first order theory of sets with cardinality constraints in Presburg-
er arithmetic is decidable [11]. Recently, Revesz [29], and Kuncak and Rinard [17] independently
presented decision procedures for this theory. A combination of Presburger arithmetic and term
algebras was used by Korovin and Voronkov to show that the quantifier-free theory of term alge-
bras with Knuth-Bendix order is NP-complete [15,16]. Along this line of investigation we proved the
decidability of the first-order theory of Knuth-Bendix orders [40] using quantifier elimination. The
elimination procedure makes extensive use of Presburger arithmetic in the reduction of quantifiers
on term variables to quantifiers on integer variables.
Paper organization. Section 2 presents the notation and terminology. Section 3 introduces the lan-
guage and structure of term algebras. Section 4 describes Oppen's algorithm for recursive data
structures. Section 5 presents our decision procedures for the quantifier-free theory of term alge-
bras augmented with a length function and Presburger arithmetic. In Section 5.1 we first describe
the theory and then in Section 5.2 we outline our approach for constructing the decision proce-
dures by introducing the concepts of implied length constraints and presenting a generic decision
procedure. In Section 5.3 we specialize this procedure for structures with infinite constant domain,
and in Section 5.4 we refine it further for structures with finite constant domain. Section 5.5 out-
lines the approach to obtain decision procedures for structures whose constant domain contains
relations besides equality. Section 5.6 discusses the complexity of the decision procedures for the
quantifier-free theories.
Section 6 presents a new decision procedure for the first-order theory of term algebras. Section 7
presents the decision procedures for the first-order theory of term algebras with integer constraints.
We first introduce the technical machinery for the construction of a quantifier elimination proce-
dure and then, in Section 7.2, we extend the elimination procedure for term algebras presented in
Section 6 to term algebras with integers. Section 7.3 further generalizes the result to structures whose
constant domain has an internal structure and admits quantifier elimination. In Section 7.4 we dis-
cuss how this procedure can be adapted for theories with infinite languages. Section 8 concludes
with some ideas for future work. Most proofs are provided in the Appendix.
2. Preliminaries
We assume the first-order syntactic notions of variables, parameters and quantifiers, and semantic
notions of structures, satisfiability and validity as in [10].
1530 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
A signature is a set of function symbols and predicate symbols each of which is associated with
an arity. The function symbols with arity 0 are also called constants. The set of -terms T ( , X)
is recursively defined by: (i) every constant c  or variable x  X is a term, and (ii) if f  is
an n-place function symbol and t1, . . . , tn are terms, then f(t1, . . . , tn) is a term. Equality = is always
included as a binary predicate symbol. If  is a formula, we use T () to denote the set of terms
occurring in , V() to denote the set of variables in .
An atomic formula is a formula of the form P(t1, . . . , tn) where P is an n-place predicate symbol
and t1, . . . , tn are terms. A literal is an atomic formula or its negation. A variable occurs free in a
formula if it is not in the scope of a quantifier. A formula without quantifiers is called quantifier-free.
A ground formula is a formula with no variables. A sentence is a formula in which no variable occurs
free. Every quantifier-free formula can be put into disjunctive normal form, that is, a disjunction of
conjunctions of literals. A formula ( x) can be put into prenex form Q1y1, . . . , Qnyn ( x, y1, . . . , yn),
where Qi's are either  or  and ( x, y1, . . . , yn) is quantifier-free, called the matrix of .
A -structure A is a tuple A, I where A is a non-empty domain and I is a function that associ-
ates each n-place function symbol f (respectively, predicate symbol P) with an n-place function f A
(respectively, relation PA
) on A. We use Gothic letters (like A) for structures and Roman letters (like
A) for the underlying domain. We usually denote A by A; which is called the signature of A. A
variable assignment (in A) is a function that assigns each variable an element of A. We use [[x]] to
denote the assigned value of x under and [[]] for the truth value of  under . A |= [[]] means
 is true under . A formula  is satisfiable (in A), denoted by A |= , if A |= [[]] for some ; is
unsatisfiable (in A), denoted by A |= , if A |= [[]] for no ; is valid (in A), denoted by A |= ,
if A |= [[]] for any . A formula  is valid if and only if  is unsatisfiable.
A is a model of a set T of sentences if every sentence in T is true in A. A sentence  is (logically)
implied by T (or T -valid), written T |= , if  is true in every model of T . Similarly, we say that
 is T -satisfiable if  is true in some model of T and it is T -unsatisfiable otherwise. The notions
of (T -)validity and (T -)satisfiability naturally extend to a set of formulas. A theory T is a set of
sentences that is closed under logical implication, that is, if T |= , then   T . The theory of A,
written Th(A), is the set of all true sentences in A. By a quantifier-free theory of A, written Th
(A),
we mean the quantifier-free subclass of Th(A).
We use x to denote a sequence of variables, say, x1, . . . , xn, and  x (respectively,  x) as an abbrevi-
ation of x1, . . . , xn (respectively, x1, . . . , xn). We write ( x) to indicate that all variables occurring
freely in  are among x. ( x)( x) and ( x)( x) are called existential closure and universal closure
of ( x), respectively. If ( x) is quantifier-free, then ( x)( x) is called 1 and ( x)( x) is called 1.
The quantifier-free (respectively, 1, 1) fragment of a language is the subclass of quantifier-free
(respectively, 1, 1) sentences in the language. By satisfiability and validity of a quantifier-free for-
mula, we actually mean the validity of the corresponding 1 and 1 formulas, respectively. Similarly,
the satisfiability problem and the validity problem of a quantifier-free theory (or more precisely,
the quantifier-free fragment), respectively, refer to the validity problem of the corresponding 1
fragment and 1 fragment.
A theory T is said to admit quantifier elimination if any formula can be equivalently (modulo T )
and effectively transformed into a quantifier-free formula. If a theory admits quantifier elimination,
then the truth value of any sentence is reducible to the truth value of a ground formula.
All above notions naturally generalize to multi-sorted logics. In a multi-sorted logic, we have
a non-empty set S of sorts. Variables, constants, equality symbols and quantifiers are indexed by
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1531
s  S. A n-ary function symbol f is associated with a n + 1-tuple s1, . . . , sn+1 , called the type of f .
Similarly, the type of a n-ary predicate symbol p is a n-tuple s1, . . . , sn . For 1 i n, we say that
the ith-place of f (or p) has sort si. We require the type of the equality symbol of sort s be s, s .
A term t is of sort s if (i) t is a variable or a constant of sort s, or (ii) t is of the form f(t1, . . . , tn)
such that the type of f is s1, . . . , sn, s , and t1, . . . , tn are of sorts s1, . . . , sn, respectively. A formula is
well-formed if in addition it is well-typed in the sense that (i) a term of sort s only occurs in a place of
sort s in a function or predicate symbol, and (ii) variables of sort s are only quantified by s and s.
A multi-sorted structure A is a tuple {A}S, S, I where S is the set of sorts, {A}S are mutually
disjoint sets (domains) indexed by S, and I is an interpretation such that (i) each n-ary function
symbol f with type s1, . . . , sn+1 is assigned a function F : As1 ×    × Asn  Asn+1 ; (ii) each n-ary
predicate symbol p with type s1, . . . , sn is assigned a relation P  As1 ×    × Asn . A variable as-
signment assigns a variable of sort s an element in As. Satisfiability, unsatisfiability and validity are
defined as above with s (respectively, s) being interpreted as "for all (respectively, some) elements
in the domain As".
Presburger arithmetic is the first-order theory of addition in the arithmetic of integers. The corre-
sponding structure is denoted by PA = ; 0, +, < . We use L to denote the formal language of PA.
We use  for syntactic equality. We use interval notation for integer sets. For example, the closed
interval [m, n] means {i | m i n}. Similarly for open intervals (m..n) and half-open intervals
[m..n) and (m..n].
3. Term algebras
We present a general language and structure of term algebras. For simplicity, we do not dis-
tinguish syntactic terms in the language from semantic terms in the corresponding structure. The
meaning should be clear from the context.
Definition 1. A term algebra TA : ; C, A, S, T consists of
(1) : The term domain, which exclusively consists of terms recursively built up from constants
by applying non-nullary constructors. Objects in are called TA-terms. The type of a term t,
denoted by type(t), is the outermost constructor symbol of t. We say that t is -typed (or is an
-term) if type(t) = .
(2) C: A set of constructors: , , , . . . The arity of is denoted by ar( ).
(3) A: A set of constants: a, b, c, . . . We require A =  and A  C. For a  A, ar(a) = 0 and
type(a) = a.
(4) S: A set of selectors. For a constructor with arity k > 0, there are k selectors s1 , . . . , sk in S.
We call si (1 i k) the ith -selector. For a term x, si (x) returns the ith component of x if x
is an -term and x itself otherwise.
(5) T : A set of testers. For each constructor there is a corresponding tester Is . For a term x,
Is (x) is true if and only if x is an -term. For a constant a, Isa(x) is just x = a. In addition
there is a special tester IsA such that IsA(x) is true if and only if x is a constant.
We use L to denote the language of term algebras.
1532 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
A term algebra has two essential properties: (i) the domain is exclusively generated by recur-
sively applying constructors; (ii) each object in is uniquely constructed. Note that the term domain
is the (ground) Herbrand domain in the language consisting of only constructors. Selectors and
testers are introduced into the formal language for our study. A more general definition of term
algebras would include a variable base X. For our purpose of modeling recursive data structures,
however, it suffices to assume the term domain only consists of ground terms; i.e., the base X = .
Nevertheless, our method can be modified to deal with a non-ground term domain by treating
variables as special constants.
For simplicity, in the rest of this paper we assume that L is finite except in Section 5.3 where we
present an algorithm for structures with an infinite constant domain as the basis for algorithms for
structures with a finite constant domain. The main techniques used for finite languages, however,
can be easily generalized to handle the case of infinite languages. Actually, the decision problems
become considerably easier if we allow L to have infinitely many constants. We defer the detailed
discussion to Section 7.4.
Example 2. Consider the LISP list structure
List = list; {cons, nil}, {nil}, {car, cdr}, {Iscons, Isnil, IsA}
where list denotes the domain, nil denotes the empty list, cons is a binary constructor (pairing func-
tion) and car and cdr are the corresponding left and right selectors (projectors), respectively. It is
not difficult to verify that List is an instance of term algebras.
The theory of term algebras is axiomatizable. Let z denote z1, . . . , zar( ). The following formula
schemes, in which variables are implicitly universally quantified over , axiomatize Th(TA).
A1. t(x) = x, if t is built solely by constructors and t properly contains x.
A2. (x1, . . . , xar( )) = (y1, . . . , yar()), if ,   C and  .
A3. (x1, . . . , xar( )) = (y1, . . . , yar( ))  1 i ar( ) xi = yi.
A4. Is (x)  z (z ) = x for  C. In particular, Isa(x)  x = a for a  A.
A5. IsA(x)  aA Isa(x).
A6. si (x) = y  z ( (z ) = x  y = zi))  (z ( (z ) = x)  x = y) .
This set of axioms is a variant of the axiomatization given in [13]. In general, selectors and testers
can be defined by constructors and vice versa. One direction has been shown by (A4), (A5) and
(A6), which are purely definitional axioms. The other direction follows from the equivalence of
k
i=1 si (x) = xi  Is (x) and x = (x1, . . . , xk).
We write = (s1 , . . . , sk ) to indicate that is a non-nullary constructor with ar( ) = k and
s1 , . . . , sk are the corresponding selectors of . A term t is called a constructor term if t is a variable
or the outermost function symbol of t is a constructor. A constructor term not containing selectors
is called pure. For example, constants are pure constructor terms. A term t is called a selector term
if either t is a variable or the outermost function symbol of t is a selector. Note that variables are
both constructor terms and selector terms. We assume that no constructors appear immediately
inside selectors as simplification can always be done. For example, si ( (x1, . . . , xk)) simplifies to xi
(1 i k) and s

j ( (x1, . . . , xk)) simplifies to (x1, . . . , xk) for  . As a consequence, a selector
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1533
term has the form s1(. . . (sn(x) . . .)) for n 0. We use L, F , G, H, . . . to denote (possibly empty) se-
lector sequences. So s1(. . . (sn(x) . . .)) can be abbreviated as Lx for L = s1, . . . , sn. The depth of x in
Lx is |L|, the length of L. The depth of x in a formula  is the maximum depth of x in the selector
terms in , denoted by depth(x). We use Is (t1, . . . , tn) as an abbreviation for n
i=1 Is (t). We say a
selector term si (t) is proper in a formula  if Is (t) is a conjunct of . We can make selector terms
proper with type information.
Definition 3 (Type completeness). A conjunction of literals is type complete if for any selector term
t occurring in , exactly one type of tester predicate Is (t) (  C \ A  {A}) is a conjunct of .
For a type complete containing a term t, we write type(t) = to indicate that Is (t) is a conjunct
of . A type complete can be simplified so that any selector term is proper.
Example 4. Let us consider in List the type complete constraint
y = cons(x, car(x))  IsA(x, car(x))  Iscons(y). (1)
Thanks to the type information, it can be simplified to
y = cons(x, x)  IsA(x)  Iscons(y). (2)
We could have defined the notion of type completeness only for terms that occur inside selectors. In
this way, a type complete formula may contain terms of unspecified types; they are either variables
or selector terms that are not embedded inside selectors. This will lead to more efficient algorithms
in practice. We choose the above definition, however, because it simplifies descriptions of algorithms
presented in the following sections, and in addition, it does not affect the worst-case complexity for
those algorithms.
Given a quantifier-free formula , we can obtain a type complete  from  by adding exactly one
tester predicate Is (t) (  C) for each term t occurring in . We call  so obtained a type completion
of .
Example 5.LetusrevisitExample4.Itiseasilyseenthat(1)isatypecompletionofy = cons(x, car(x)).
Example 6. Let ,   C,   and = (s1 ). A possible type completion for y = s1 (x) is
y = s1 (x)  Is(x, s1 (x), y), (3)
which, by Axioms (A4) and (A6), simplifies to y = x  Is(x, y). Another type completion is
y = s1 (x)  Is (x)  Is(s1 (x), y), (4)
in which the selector term s1 (x) is proper and no simplification is possible. As a third possibility, we
can have the type completion
y = s1 (x)  Is (x, s1 (x))  Is(y), (5)
which simplifies to false because of type conflicts.
As shown by the above example, a type completion of a satisfiable formula may be contradictory
due to type conflicts. A type completion  of  is compatible with  if the satisfiability of  implies
1534 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
the satisfiability of  . Obviously,  is satisfiable if and only if it has a satisfiable compatible type
completion.
In the following sections, we present nondeterministic algorithms that rely on the successful guess
of a satisfiable compatible type completion. Unless stated otherwise, we assume that any quantifi-
er-free formula is type complete, and all occurring selector terms are simplified to proper ones. We
assume that equalities (respectively, disequalities) between terms with conflicting types are simpli-
fied to false (respectively, true). For example, in List the appearance of car(x) = y should be read
as car(x) = y  Iscons(x). We also assume that formulas do not have conflicting type literals. For
example, we never encounter formulas containing subformulas of the form
x = (t1, . . . , tar( ))  x = (t1, . . . , tar())
for  , because at least one conjunct would have been simplified to true. But to save notation,
we omit test literals and treat them as implicit side conditions.
4. Decision procedures for Th(TA)
This section and the next section present decision procedures for quantifier-free theories. All our
decision procedures for quantifier-free theories are refutation-based; to determine the validity of a
formula , a procedure determines the unsatisfiability of , which further reduces to determining
the unsatisfiability of each disjunct in the disjunctive normal form of . Henceforth, in discussions
related to quantifier-free theories, a quantifier-free formula (or an input formula) always refers to a
conjunction of literals. To emphasize this, we use capital symbols , , . . . to denote conjunctions
of literals. We identify a conjunction of literals with the set of all its conjuncts. By A  we
mean A is a conjunct of and by  A (or more formally  {A}) we mean  A. We present
algorithms in a nondeterministic manner; by saying "guess" , we mean to split on a disjunction
that is valid in the context and contains  as one of the disjuncts, and we work on each resultant
constraint "simultaneously".
In [26] Oppen presented a decision procedure for the quantifier-free theory of acyclic recursive
data structures which is essentially Th
(TA). The basic idea of the decision procedure is to generate
all equalities implied by the input formula and check for inconsistencies with disequalities given in
the input. The decision procedure relies on the fact that Th
(TA) is convex in a language without
selectors. In fact, the convexity was shown implicitly in the correctness proof of Oppen's algorithm
[26].
Definition 7 (Convexity). A theory is convex if whenever a conjunction of literals implies a disjunction
of atomic formulas, it also implies one of the disjuncts.
Let be a conjunction of equalities and a disjunction of equalities. The convexity of Th
(TA)
can be rephrased as follows:   is satisfiable if and only if for each of conjuncts s = t   ,
 s = t is satisfiable, i.e., |= s = t. The idea of Oppen's algorithm is to discover all logically
implied equalities (between terms in and ) by constructing a directed acyclic graph (DAG) with
terms as vertexes and computing an equivalence relation on the nodes based on equality of children
and ancestor nodes (bidirectional closure), formally described below.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1535
Fig. 1. The DAG of (6).
Definition 8 (DAG representation of terms). A term t can be naturally represented by an ordered
tree Tt such that (i) if t is a constant, then Tt is a leaf vertex labeled by t, and (ii) if t is in the form
(t1, . . . , tk), then Tt is the tree with root labeled by and Tt1 , . . . , Ttk as its subtrees. A directed acyclic
graph (DAG) Gt of t is obtained from Tt by merging all identical subgraphs of Tt.
The DAG of a formula is the DAG representing all terms in the formula. For example, Fig. 1
shows the DAG for
cons(y, z) = cons(x, z)  cons(x, y) = z. (6)
We assume DAGs are sibling complete in the sense that a node and all of its siblings coexist. For
example, car(x) appears if and only if cdr(x) does. A sibling completion can be easily obtained by
adding trivial equality literals like t = t to the original formula. For a vertex u, let (u) denote the
outgoing degree and u[i] (1 i (u)) the ith successor of u.
Let R be a binary relation on the vertexes of a DAG and let u, v be any two vertexes.
Definition 9 (Unification closure). We say that R is the unification closure of R (denoted by R) if R
is the smallest equivalence relation extending R such that (u, v)  R and type(u) = type(v) implies
(u[i], v[i])  R , for every i  [1.. (u)].
Definition 10 (Congruence closure). We say that R is the congruence closure of R (denoted by R) if
R is the smallest equivalence relation extending R such that (u[i], v[i])  R (for every i  [1.. (u)])
and type(u) = type(v) implies (u, v)  R .
If R is both unification and congruence closed (with respect to R), we call it the bidirectional
closure, denoted by R .
Let R be the set of all pairs asserted equal in . It was shown that R represents all equalities
logically implied by [26]. Therefore, is unsatisfiable if and only if there exist t and s such that
t = s  and (t, s)  R .
Algorithm 1 (Oppen's decision procedure for Th
(TA) [26]). Input:
: q1 = r1  . . .  qk = rk  s1 = t1  . . .  sl = tl,
where qi, ri, si and ti are pure constructor terms.
(1) Construct the DAG G of .
(2) Compute on G the bidirectional closure R of R = {(qi, ri) | 1 i k}.
(3) Return FAIL if i(si, ti)  R , or ((u, v)  R )[type(u) = type(v)].
Return SUCCESS otherwise.
1536 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Oppen's original algorithm is given for the theory of LISP lists, which only has one non-nullary
constructor (cons). It is straightforward, however, to generalize it to term algebras with an arbitrary
number of non-nullary constructors; in Algorithm 1, we added type checking which is not present
in Oppen's original algorithm.
In our setting, the language contains selectors and values of -selectors on non -terms are spec-
ified, e.g., si (x) = x if x is not an -term. It was shown that for such structures the decision problem
is NP-complete [24]. The complication is that it is not known a priori whether s(x) is a proper
subterm of x and hence it is not possible to use the DAG representation directly. A solution to
this problem is to guess the type information of terms occurring inside selectors before applying
Algorithm 1.
Algorithm 2 (Decision procedure for Th
(TA) with selectors).
Input: , a conjunction of equalities and disequalities.
(1) Guess a type completion c of and simplify selector terms accordingly.
(2) Call Algorithm 1 on c.
Example 11. Fig. 2 shows the DAG representation of the LISP list formula
cons(y, z) = cons(cdr(x), z)  cons(car(x), y) = x (7)
under the guess that x is not a constant. Initially R = {(v3, v4)} as v3 and v4 are asserted equal in (7).
(For simplicity reflexive pairs are not listed.) By the unification algorithm (v6, v7) are merged, which
gives R = {(v3, v4), (v6, v7)}. Then by the congruence algorithm (v1, v2) are merged, resulting in
R = { (v1, v2), (v3, v4), (v6, v7) }.
(Here we used the property that R = (R ) .) Obviously this branch fails as v1 = v2 is asserted by
(7). The remaining branch (with presence of IsA(x)) simplifies to IsA(x)  x = y which is clearly
satisfiable, and therefore so is (7).
Note that the correctness of both Algorithm 1 and 2 relies on the (implicit) assumption that the
constant domain is infinite, since otherwise the theory is not convex. As a counter-example, for the
structure TA with domain A = {a, b}, we have IsA(x) |= x = a  x = b, but neither IsA(x) |= x = a
nor IsA(x) |= x = b. We shall see (in Section 5.4) that our algorithm extends Oppen's algorithm to
structures with finite constant domain.
Fig. 2. The DAG of (7) under the assumption Iscons(x).
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1537
5. Decision procedures for quantifier-free theories
In this section we present decision procedures for Th
(TA ), the quantifier-free theory of term
algebras augmented with a length function and Presburger arithmetic. First, in Section 5.1 we de-
scribe the theory, and then, in Section 5.2, outline our generic approach for constructing decision
procedures for this theory. This approach is refined in Section 5.3 into a decision procedure for
structures with an infinite constant domain, Th
(TA
) and then further refined for structures with
finite constant domain, Th
(TA ) in Section 5.4. Section 5.5 outlines the approach to obtain deci-
sion procedures for structures whose constant domain contains relations besides equality. Section
5.6 discusses the complexity of the decision procedures for the quantifier-free theories.
5.1. Term algebras with integers
Definition 12. The structure of term algebras with integers is
TA = TA; PA; |  | : 
where TA is a term algebra, PA is Presburger arithmetic, and |  | denotes the length function
defined recursively by: (i) for any constant a, |a| = 1, and (ii) for a term (t1, . . . , tk), | (t1, . . . , tk)|
= k
i=1 |ti| + 1.
The extended language is denoted by L .
The length function given in Definition 12 was chosen for ease of presentation. Generalizing it
into a weight function that assigns an arbitrary nonnegative integer to each symbol, or a height
function that gives the length of the maximum path does not require any essential changes to our
techniques.
We use subscripts , (or prefixes TA-, PA-) to denote notions related to term sort and inte-
ger sort, respectively. We also use "term" for "TA" when there is no confusion. For example,
denotes a formula in L , the language of pure term algebras, denotes a formula in L , the
language of Presburger arithmetic, V denotes the collection of variables in L and V denotes the
collection of variables in L . Although L contains two equality predicates: term equality = in
L and integer equality = in L , we use = in both cases unless there is a chance of confusion.
A TA-term can occur inside the length function. Such occurrence is called an integer occurrence
to be distinguished from the normal term occurrence. From now on, we freely use integer terms |t|
to form Presburger formulas. For example, car(x) is a TA-term and |car(x)| is a PA-term. The first
occurrence of car(x) is a term occurrence and the second one is an integer occurrence.
To save space, we use an integer term |t( x)| in two ways; one as the length function |  | applied
to t( x) (when t( x) is in discussion), and the other as a special integer variable (called pseudo-integer
variable). In the latter case, if x  V , then | x| = z denotes the formula of the form i |ti( x)| = zi
where z  V and ti(x) enumerates all TA-terms containing x in the context.
Example 13. Consider the formula ( x) : x1 = cons(x2, x3). In the context of this formula | x| = z
denotes
|x1| = z1  |x2| = z2  |x3| = z3  |cons(x2, x3)| = z4.
1538 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Example 14. In the context of the formula ( x) : |x1| = |car(cdr(x2))|, | x| = z denotes
|x1| = z1  |x2| = z2  |cdr(x2)| = z3  |car(cdr(x2))| = z4.
Suppose ( x, y) is in a context in which all occurrences of x  V are integer occurrences. Then
by (z, y) we mean the formula obtained by substituting a true integer variable z (z  z) for each
pseudo-integer variable |t( x)|. (Here we actually overload the symbol , but the risk of confusion is
minimal.) We use | x|  z to denote such a substitution. For example, in Example 14 (z) denotes
z1 = z4. If is an assignment for V , then | | denotes the corresponding assignment for pseudo-in-
teger variables. For example, if is x1 = cons(x2, x3) and is {x1 := nil, x2 := cons(nil, nil), x3 :=
nil}, then | | is {|x1| := 1, |x2| := 3, |x3| := 1}.
It is easily seen that the general purpose combination method in [25] is not directly applicable to
TA due to the presence of the length function.
Example 15. The constraints
list : x = cons(car(y), y), : |x| < 2|car(x)|
are clearly satisfiable, respectively, in List and PA. However, since list implies that car(x) = car(y),
x contains two copies of car(y) and so its length should be at least two times the length of car(x).
Therefore,  list is unsatisfiable in List .
A simple but crucial observation is that induces an implicit length constraint , in addi-
tion to the explicit constraint given in the input. The unsatisfiability is due to the fact that
contradicts . In Example 15, list in fact implies |x| 2|car(x)|, resulting in a contradiction to
.
Implicit length constraints are induced not only by the structures of objects, but also by the size
of the constant domain.
Example 16. Consider the constraint
list : x = cons(cons(nil, nil), nil)  x = cons(nil, cons(nil, nil)), : |x| = 5
in List and in PA, respectively. Clearly, both are satisfiable in the respective structures. In the com-
bined structure List , however, there are exactly two term trees with length 5 and list states that x
is not equal to either of them. As a consequence, list implies : |x| = 5, contradicting |x| = 5.
Intuitively, if we can extract from the implicit that exactly characterizes the solution set
of , then the satisfiability of  reduces to the satisfiability of  . As a consequence,
we can derive decision procedures for the combined theory by utilizing the decision procedures for
PA and TA.
5.2. A generic decision procedure for Th
(TA )
Given a term constraint our objective is to construct a Presburger formula , called a Length
Constraint Completion, that is satisfiable if and only if is satisfiable.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1539
Definition 17 (Length constraint completion (LCC)). An L -formula ( x) is a length constraint
completion (LCC) for ( x) if the following formulas are valid:
( x : ) ( x)  (z : ) (z)  | x| = z , (8)
(z : ) (z)  ( x : ) ( x)  | x| = z . (9)
Let be an LCC for . Condition (8) says that for any satisfying assignment of , | | is
a satisfying assignment for . In other words, |  | maps a satisfying assignment for in TA to
a satisfying assignment for in PA. We say that satisfying (8) is sound with respect to .
On the other hand, condition (9) says that for any satisfying assignment of there exists a
satisfying assignment of such that | | = . In other words, any satisfying assignment in PA
is the image under |  | of a satisfying assignment in TA. We say that satisfying (9) is realizable
by . In particular, if is unsatisfiable, then so is .
Let be a formula satisfying both (8) and (9). Let + and -, respectively, be any two
formulas satisfying (8) and (9) (when in place of ). If we identify these constraints with their
corresponding solution sets, we have
-   +. (10)
Thus is the exact projection of from TA to PA, while +, - are over and under approxi-
mations of respectively. Let and both be LLCs for . By (10) we have  
and hence = (with respect to the corresponding solution sets). Therefore, for a term con-
straint there exists a unique LCC up to equivalence.
Example 18. Consider in List the formulas list : cons(x, y) = z. The constraint
+ : |z| > |x|  |z| > |y|  |x| > 0  |y| > 0  2 |x|  2 |y|
is sound but it is not realizable for list, as the integer assignment
: { |x| := 3, |y| := 3, |z| := 4 }
cannot be realized. On the other hand, the constraint
- : |x| + |y| + 1 = |z|  |x| > 5  |y| > 0  2 |x|  2 |y|
is realizable for , but it is not sound because it is not satisfied by the data assignment
: { x := nil, y := nil, z := cons(nil, nil) }.
Finally, the constraint
: |x| + |y| + 1 = |z|  |x| > 0  |y| > 0  2 |x|  2 |y|
is both sound and realizable, and hence is the induced length constraint of list.
We have a decision procedure for Th
(TA ) if can be effectively computed from .
1540 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Theorem 19 ([38]). Let be an LCC for . Then TA |=  if and only if PA |=  .
Proof. Conditions (8) and (9) give the "" and "" directions, respectively.
By this theorem the decision problem for quantifier-free theories reduces to computing the LCC
in Presburger arithmetic. To obtain an LCC, we need a normalization process to transform to an
equivalent disjunction in which each disjunct is of the form  Â . We call such a transformation
a partitioning and each disjunct a partition. (We do not require partitions to be mutually exclusive.)
In the subsequent sections, we shall show in detail each of the normalization procedures. First, we
extend Definition 17 to deal with newly generated integer constraints in the normalization.
Definition 20 (Relativized LCC (RLCC)). A formula ( x) is a length constraint completion for
( x) relativized to Â ( x), (in short, ( x) is an RLCC for ( x)/Â ( x)), if the following formulas
are valid:
( x : ) ( x)  Â ( x)  (z : ) (z)  | x| = z , (11)
(z : ) (z)  ( x : ) ( x)  Â ( x)  | x| = z . (12)
Example 21. Consider List. Let
x : {x1, x2, x3}, ( x) : cons(x1, x2) = x3, Â ( x) : |x1| < |x2|.
Consider the following formulas:
div : |x1| > 0  |x2| > 0  2 |x1|  2 |x2|,
: |x1| + |x2| + 1 = |x3|  |x1| < |x2|  div,
+ : |x1| < |x3|  |x2| < |x3|  |x1| < |x2|  div,
- : |x1| + |x2| + 1 = |x3|  |x1| 3  |x2| > 3  div.
It is not hard to prove that is an RLCC for ( x)/Â ( x). However, neither + nor - is
such an RLCC. Although + satisfies (11), it does not satisfy (12), as the assignment
{ |x1| := 2, |x2| := 3, |x3| = 4 }
cannot be realized by any assignment for x. On the other hand, - satisfies (12), but not (11), as
the assignment
{ x1 := nil, x2 := cons(nil, nil), x3 := cons(nil, cons(nil, nil)) }
falsifies -.
Comparing (8) and (9) with (11) and (12), we see that an LCC is an RLCC with Â  true. Like
LCCs, up to equivalence, there exists a unique RLCC with respect to ( x)/Â ( x). In addition,
RLCCs have the following easily proved "additive" property.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1541
Proposition 22. If is an RLCC for /Â , then for any Â ,  Â is an RLCC for /(Â  Â ).
Inparticular,bylettingÂ := andÂ := true,weseethatif isanLCCfor ,then 
is an RLCC for / . So Theorem 19 is in fact a special case of the following
theorem.
Theorem 23. Let be an RLCC for /Â . Then TA |=  Â if and only if PA |= .
Proof. Conditions (11) and (12) give the "" and "" directions, respectively.
This theorem motivates the strategy of our decision procedures. In the normalization process,
with introduction of auxiliary integer constraints, we partition the original search space for such
that  i
(i)
 Â
(i)
, until we easily compute the RLCC
(i)
for each
(i)
/Â
(i)
. By Proposition
22,
(i)
 is an RLCC for
(i)
/(Â
(i)
 ). Then TA |=  if and only if for some i,
TA |=
(i)
 Â
(i)
 ,which,byTheorem23(set :=
(i)
, :=
(i)
 , :=  Â
(i)
),
reduces to determining whether PA |=
(i)
 . Note that is not involved in computing an
RLCC. Therefore, we can assume that Â
(i)
includes constraints relevant to the corresponding par-
titioning and other constraints generated during the normalization procedure have been merged
into .
Algorithm 3 (Generic decision procedure). Input:  .
(1) Return FAIL if TA |= .
(2) For each partition
(i)
 Â
(i)
of :
(a) Compute an RLCC
(i)
for
(i)
/Â
(i)
.
(b) Return SUCCESS if PA |=
(i)
 .
(3) Return FAIL.
5.3. A decision procedure for Th
(TA
)
The easiest arithmetic extension of term algebras is Th
(TA
), the quantifier-free theory of
term algebras with integers and with an infinite constant domain. In TA
, an LCC can be derived
directly from the DAG for the formula. (We do not need the notion of RLCC in this case.) Be-
fore we present the algorithm we define the following integer predicates on lengths of terms in a
DAG:
Tree(x) : x1, . . . , xn 0 x = 1 +
n
i=1
dixi , (13)
Node (x, x ) : x = 1 +
ar( )
i=1
xi, (14)
Tree (x) : x

Node (x, x ) 
ar( )
i=1
Tree(xi)

 , (15)
1542 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
where x denotes x1, . . . , xar( ) and d1, . . . , dn are the distinct arities of non-nullary construc-
tors. The predicate Tree(x) is true if and only if x is the length of a well-formed tree, since
whenever a leaf expands one level with outgoing degree d, the length of the tree increases by
d. The second predicate expresses that the length of an -typed node with known children
is the sum of its children's lengths plus 1. The last predicate states the length constraint for
an -typed tree. With these predicates the construction of an LCC is given by the following
algorithm.
Algorithm 4 (Computation of LCC in TA
). Let be a type-complete term constraint, G the
DAG of and R the bidirectional closure obtained by Algorithm 1. Initially set = . For
each term t add the following to .
ˇ |t| = 1, if t is a constant or asserted to be a constant (i.e., IsA(t) is in );
ˇ |t| = |s|, if (t, s)  R ;
ˇ Node (|t|, |t1|, . . . , |tar( )|), if t is an -typed vertex with children t1, . . . , tar( );
ˇ Tree (|t|), if t is an -typed leaf vertex.
Proposition 24. obtained by Algorithm 4 is expressible in a quantifier-free Presburger formula
linear in the size of .
Theorem 25. obtained by Algorithm 4 is an LCC for .
Algorithm 5 (Decision procedure for Th
(TA
)). Input:  where is type-complete.
(1) Call Algorithm 1 on ; return FAIL if TA |= .
(2) Construct from the DAG G using Algorithm 4.
ˇ Return SUCCESS if PA |=  .
ˇ Return FAIL otherwise.
The correctness of the algorithm follows from Theorem 19 and Theorem 25.
Example 26. Fig. 3 shows the DAG of
x = cons(car(y), y)  |cons(car(y), y)| < 2|car(x)|, (16)
assuming that y is not a constant. The computed R is { (v1, v2), (v3, v5), (v4, v6) }.
Fig. 3. The DAG of (16) under the assumption Iscons(y).
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1543
By Algorithm 4 is
|x| = |cons(car(y), y)|  |car(x)| = |car(y)|  |cdr(x)| = |y|

|x| = |car(x)| + |cdr(x)| + 1  |y| = |car(y)| + |cdr(y)| + 1

|cons(car(y), y)| = |car(y)| + |y| + 1

2 |car(x)|  2 |cdr(x)|  2 |car(y)|  2 |cdr(y)|

|car(x)| 1  |cdr(x)| 1  |car(y)| 1  |cdr(y)| 1
(17)
which implies
|cons(car(y), y)| 2|car(x)|. (18)
contradicting |cons(car(y), y)| < 2|car(x)|. If y is a constant, v3, v6, v7 are merged. In this case also
implies (18), and therefore (16) is unsatisfiable.
Note that the last two lines of (17) are the result of simplification of constraints of the form
Tree(); according to our definition of the length function, the length of any term (tree) in List is a
positive odd number.
5.4. A decision procedure for Th
(TA )
Algorithm 4 can produce an incorrect LCC in Th
(TA ), the quantifier-free theory of term alge-
bras with integers and with a finite constant domain, as illustrated by the following
example.
Example 27. Consider List with A = {nil}. The constraint
|x| = 5  IsA(y)  x = cons(cons(y, y), y)  x = cons(y, cons(y, y)) (19)
is unsatisfiable while obtained by Algorithm 4 is
|y| = 1  |cons(y, y)| = 3  |cons(cons(y, y), y)| = 5  |cons(y, cons(y, y)| = 5
which is obviously satisfiable together with |x| = 5.
The reason is that if A is finite, then there are only finitely many terms of length n for any n > 0.
If a term t is forced to be distinct from all of them, then t cannot have length n. Therefore needs
to include constraints that count the number of distinct terms of a certain length.
Definition 28 (Counting constraint). A counting constraint is a predicate CNTk,n(x) (k > 0, n 0)
that is true if and only if there are at least n+1 different -terms of length x in TA with |A| = k.
CNTk,n(x) is similarly defined with -terms replaced by TA-terms.
Example 29. For List with A = {nil}, CNTcons
1,n (x) is x 2m - 1  2 m where m is the least number
such that the m-th Catalan number
1544 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Cm =
1
m
2m - 2
m - 1
is greater than n. This is not surprising as Cm gives the number of binary trees with m leaves (that
tree has 2m - 1 nodes).
The following two monotonicity properties are easily proven: for any l k > 0 and m n > 0,
CNTk,n(x)  CNTl,n(x), CNTk,m(x)  CNTk,n(x).
In general we have the following result.
Proposition 30. CNTk,n(x) and CNTk,n(x) are expressible by quantifier-free Presburger formulas that
can be computed in time O(n).
In order to construct counting constraints, we need equality information between terms.
Definition 31 (Equality completeness).  Â is equality complete if for any two terms u and v in
ˇ either u = v or u = v (but not both) is in , and
ˇ either |u| = |v| or |u| = |v| (but not both) is in Â .
Equality completeness is a syntactical notion similar to a variable partition in the Nelson­
Oppen combination method. We can make a quantifier-free formula  Â (which does not
contain contradictory literals) equality complete by adding exactly one of u = v and u = v to ,
and exactly one of |u| = |v| and |u| = |v| to Â . Let us call the resulting formula an equality comple-
tion of  Â . Similarly, we can define equality completion for sets of terms. Like type completion,
 Â is a compatible equality completion of  Â if the satisfiability of  Â implies the
satisfiability of  Â .
Example 32. Let list be y = cons(x, z)  Iscons(x, y, z). A possible equality completion of
list (Â = ) is
|y| = |cons(x, z)|  |x| = |z|  |y| = |x|  Iscons(x, y, z) 
t,t S;tt
t = t , (20)
where S = {x, y, z, cons(x, z)}.
Strictly speaking, the formula (20) is not an equality completion of ; to save space, we omitted
equalities and disequalities that follow from equality substitution. In general, equality completion
could add O(n2) literals. However, it can be more succinctly represented as assertions of the form
eq(t1, . . . , tn) or neq(t1, . . . , tn) that state that a set of terms are all equal or all pairwise distinct,
respectively.
We partition the search space for by computation of equality completion. To save notation,
and Â always refer to the updated version for one of the partitions. By CLSn(x0, x1, . . . , xn)
we denote the conjunction of literals expressing that x0, . . . , xn are -typed terms having the same
length but are pairwise distinct.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1545
Algorithm 6 (Computation of an RLCC in TA ). Input:  Â (type and equality complete); k,
the cardinality of A.
(1) Call Algorithm 4 to obtain .
(2) Set :=  Â .
(3) Add to the constraints CNTk,n(|t|) if CLSn(t, t1, . . . , tn) is induced by  Â for some
t1, . . . , tn.
Note that CNTk,n(|t|)  CNTk,m(|t|) for n m. In step (3), therefore, it suffices to add CNTk,n(|t|)
only if the set {t, t1, . . . , tn} is maximal in the sense that CLSn(t, t1, . . . , tn) occurs in  Â , but for
no t1, . . . , tn+1 does CLSn+1(t, t1, . . . , tn+1) also occur in  Â . In addition, due to symmetry, there
is no need to add CNTk,m(|t1|), . . ., CNTk,m(|tn|).
Proposition 33. obtained by Algorithm 6 is expressible in a quantifier-free Presburger formula of
size linear in the size of  Â .
Theorem 34. obtained by Algorithm 6 is an RLCC for /Â .
Algorithm 7 (Decision procedure for Th
(TA )). Input :  .
(1) Guess a type and equality completion of , denoted by  Â .
(2) Call Algorithm 1 on . Return FAIL if TA |= .
(3) Construct from  using Algorithm 6.
ˇ Return SUCCESS if PA |=  .
ˇ Return FAIL otherwise.
The correctness of Algorithm 7 follows from Theorems 23 and 34. Notice that, when is empty,
the algorithm can be viewed as an extension of Oppen's original algorithm for structures with a
finite constant domain.
Example 35. Let us return to Example 27. Constraint (19) has exactly one compatible completion,
namely
CLScons
2 (x, cons(cons(y, y), y), cons(y, cons(y, y))).
This results in the counting constraint CNTcons
1,2 (|x|) : |x| > 5  2 |x|, contradicting |x| = 5 in (19).
5.5. Richer theories on constant domain
Up to now we assumed that the constant domain is purely equational, i.e., we can only express
equality and disequality between constants. It is fairly easy, however, to relax this assumption and
allow a constant domain with richer constructs provided the enriched structure is decidable and
the signature of this structure is disjoint with the signature of TA except for constants. (In fact,
disjointness is trivially satisfied because in L we only have the equality predicate on constants.)
We outline the approach below.
1546 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Let Ac denote the new constant structure and Lc the corresponding language. Let TA+
denote
the extended structure of term algebras with integers and + a constraint in TA+
. Without loss of
generality, we assume that both + is equality and type complete with respect to TA-terms. There
is a standard way to purify + to  c  Abs where is a constraint in LTA, c is a constraint
in Lc, and Abs is the set of equalities of the form {vi = ti}i where vi are fresh variables and ti are
TA-terms in which are of constant type but are not constants directly. Both and c contains
the same equality completion (up to the isomorphic mapping {vi  ti}i) for terms of constant type.
It follows from Nelson­Oppen combination method that TA+
|=
+ if and only if TA |=
and Ac |= c.
5.6. Complexity
The complexity of the decision problems for the quantifier-free theories is NP-complete. Let n be
the input size. First it is not hard to see that decision problems for both Th
(TA ) and Th
(TA
)
are NP-hard as they are super-theories of Th
(TA) and Th
(PA), both of which are NP-complete
[26], [14, pp. 336­340]. Second, Algorithm 4 computes in O(n) (see Proposition 24) and so does
Algorithm 6. Third, the size of any type and equality completion of is bounded by O(n2) as there
are at most n2 pairs of terms. By the nondeterministic nature of our algorithms, we see that each
branch of computation (in Algorithms 5 and 7, respectively) is in P. Therefore both Th
(TA
) and
Th(TA ) are NP-complete.
6. A new quantifier elimination procedure for Th(TA)
In this section we present a new quantifier elimination algorithm for Th(TA), the first-order
theory of term algebras, and show that the algorithm only needs exponential time to eliminate a
block of quantifiers of the same kind. The algorithm works mainly in the constructor language
while using selectors as auxiliary tools. It is the basis for the elimination procedure for the extended
theory presented in the next section. In this section we assume a finite constant domain. We drop
the subscript in this section.
Normal form. It is well-known that eliminating arbitrary quantifiers reduces to eliminating exis-
tential quantifiers from formulas in the form
( x) A1( x, y)      An( x, y) , (21)
where Ai( x, y) (1 i n) are literals [13]. We can assume that the literals Ai are not of the form
x = t when x does not appear inside selectors. For x(x = t  (x, y)) simplifies to (t, y) if x does
not occur in t, to x (x, y) if t  x, and to false by Axiom (A1) if t is a term properly containing x.
In all algorithms we assume these simplifications are performed in each step to restore the normal
form.
Nondeterminism. In the rest of this paper all transformations are done on formulas of the form
(21). Again as in the presentation for quantifier-free theories, whenever we say "guess ," we mean
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1547
to add a valid (with respect to the context) disjunction i i (where is one of the disjuncts) to the
matrix of (21). When we replace by i i or directly introduce i i, it should be understood that
an implicit disjunctive splitting is carried out and we work on each resultant disjunct of the form
(21) "simultaneously."
Simplification. For simplicity, in the description of algorithms, we omit tester literals unless they
are needed for the correctness proof. We may also assume that the matrix of (21) is type complete
and basic simplifications are carried out whenever applicable: for a nonempty selector sequence L,
we replace Lx = x by true and Lx = x by false; if t(x) is a term properly containing x and x does not
appear in selector terms, we replace t(x) = x by true and t(x) = x by false.
Notation. In the algorithm we use the following notation: x denote the set of existentially quantified
variables; y denote the set of parameters (implicitly universally quantified variables); s, t, u denote
TA-terms; L, F , G, H denote (possibly empty) selector sequences; f , g, h, p, q denote index functions
with ranges clear from the context; i, j, k, l denote indexes; numerical superscripts are parenthesized.
Index functions are used to differentiate multiple occurrences of the same variables.
In each step of the transformations the algorithm manipulates the formula ( x) ( x, y) to pro-
duce a version of the same form (or multiple versions of the same form in case disjunctions are
introduced), and thus in each step ( x) ( x, y) refers to the updated version rather than to the
original input formula.
Outline. The elimination is performed as follows. A sequence of equivalence-preserving transforma-
tions will bring the input formula into a disjunction of formulas in solved form which have solutions
under any instantiation of parameters. Therefore, the whole block of existential quantifiers  x can
be eliminated by removing all literals containing x in the matrix.
Definition 36 (Solved form). We say ( x, y) is solved in x, if x do not appear in equalities, are not
asserted to be constants and are not inside selector terms. We say ( x) ( x, y) is in solved form if
( x, y) is solved in x.
A solved form can be obtained by the following normalization procedure. The normalization
can be viewed as an explicit syntactical procedure comparable with the DAG construction and
computation of the bidirectional closure.
Algorithm 8 (Normalization in TA). Input: ( x) ( x, y).
(1) Type Completion. Guess a type completion of ( x, y) and simplify every selector term to a
proper one.
(2) Selector Elimination. Replace all selector terms containing x by the corresponding equivalent
constructor terms according to Axiom (A6).
(3) Decomposition. Call Algorithm 9 to decompose equalities and disequalities between construc-
tor terms and equalities containing x.
(4) Constant Elimination. If some x  x is asserted to be a constant (i.e., IsA(x) appears), we
instantiate x to each constant to eliminate x since A is finite.
1548 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
After step (1) of Algorithm 8 terms can have three forms:
(i) Lx, (ii) Ly, (iii) (t1( x, y), . . . , tar( )( x, y)),
where L may be empty. Terms of the form (iii) are constructor terms built recursively from terms
in (i)-(ii) using non-nullary constructors with ti( x, y) of the form (i), (ii) or (iii). These three forms
give rise to the following six types of equality literals:
Lx = L x , (22)
Lx = L y, (23)
Lx = (t1( x, y), . . . , tar( )( x, y)), (24)
Ly = L y , (25)
Ly = (t1( x, y), . . . , tar( )( x, y)), (26)
(t1( x, y), . . . , tar( )( x, y)) = (t1( x, y), . . . , tar( )( x, y)). (27)
Similarly, we have six types of disequalities, the negations of (22)­(27).
Step (2) transforms equalities of the forms (22­24), and similarly for disequalities of the same
form. Thus after application of this step, we can assume that x does not appear inside selector terms,
that is, equality literals have the forms
Ly = L y , (28)
Ly = (t1( x, y), . . . , tar( )( x, y)), (29)
(t1( x, y), . . . , tar( )( x, y)) = (t1( x, y), . . . , tar( )( x, y)), (30)
and disequality literals are in the forms of the negations of (28)­(30) and in the forms of
x = x , x = Ly, x = (t1( x, y), . . . , tar( )( x, y)).
Step (2) may generate literals like x = t( x, y). The reinstatement of normal form, however, does
not put any of x inside selector terms. Step (2) may also linearly increase the size of the matrix. In
general, elimination of selectors adds more existential quantifiers of sort term. The newly added
quantifiers, however, will be removed in one step together with the original ones. The following
example illustrates step (2).
Example 37. Step (2), selector elimination, first converts the formula
(x) car(x) = y2  cdr(x) = y2  x = y3 (31)
into
(x1x2) x1 = y2  x2 = y2  cons(x1, x2) = y3 , (32)
which, by substitution of x1 for y2, simplifies to
(x2) x2 = y2  cons(y2, x2) = y3 . (33)
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1549
Step (3) converts equalities and disequalities of form (30) and equalities of form (29). After step
(3) the matrix, if it did not simplify to false, contains only literals of the form
x = t( x, y), Ly = t( x, y), Ly = L y . (34)
where t is: (i) existentially quantified variables x, (ii) implicitly universally quantified parameters
y, (iii) selector terms of parameters in the form Ly (y  y), (iv) constants in A, or (v) constructor
terms built recursively from terms in (i)­(iv) using non-nullary constructors.
Algorithm 9 (Decomposition).
(1) Decomposition of Equalities between Constructor Terms. Replace
(t1, . . . , tar( )) = (t1, . . . , tar( )) (35)
by 1 i ar( ) ti = ti. Repeat until no equality of the form (35) appears.
(2) Decomposition of Disequalities between Constructor Terms. Replace
(t1, . . . , tar( )) = (t1, . . . , tar( )) (36)
by 1 i ar( ) ti = ti. Repeat until no disequality of the form (36) appears.
(3) Decomposition of Equalities Containing x. Solve equalities of the form Ly = t( x, y), where
t( x, y) is a constructor term containing x, in terms of Ly such that the result is a set of equalities
in the selector language.
In step (1) of Algorithm 9, recall that literals like x = t( x, y) can always be eliminated together with
(x) and hence after this step, we can assume no such literals appear in the matrix.
After step (2) we can assume that literals have one of the following forms:
Ly = L y , (37)
Ly = L y , (38)
x = t( x, y), (39)
Ly = (t1, . . . , tar( )), (40)
Ly = (t1, . . . , tar( )). (41)
Step (3) solves equalities of the form (41), and thus we are left only with literals of the forms
(37)­(40). Notice that these are the same as those in (34), where Ly = t( x, y) represents both (38)
and (40).
The following example illustrates how equalities are solved.
Example 38. The literal cdr(y) = cons(cons(x1, y1), y2) is converted to the solution set
x1 = car(car(cdr(y))), y1 = cdr(car(cdr(y))), y2 = cdr(cdr(y)).
1550 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Given an input formula ( x) ( x, y), the normalization procedure in Algorithm 8 lets us effectively
eliminate all quantifiers ( x) in one step. Indeed, after application of Algorithm 8 we can assume,
according to (34), that ( x) ( x, y) is in the form
( x)
i
xp(i) = ti( x, y) 
j
Ljyq(j) = sj( x, y)

k
Fkyf(k) = uk( y) 
l
Glyg(l) = Hlyh(l). (42)
Here ti, sj are:
(i) existentially quantified variables x,
(ii) implicitly universally quantified parameters y,
(iii) selector terms of parameters in the form Ly (y  y),
(iv) constants in A, or
(v) constructor terms built recursively from terms in (i)­(iv) using constructors,
where we require that sj contain at least one occurrence of a variable in x (otherwise, the corre-
sponding literal should have been moved out of the scope of ( x)). The term uk( y) can be one of
(ii)­(v) above, where in (v) recursion is limited to (ii)­(v).
We claim that the first part of (42)
( x)
i
xp(i) = ti( x, y) 
j
Ljyq(j) = sj( x, y) , (43)
is valid, and hence (42) is equivalent to
k
Fkyf(k) = uk( y) 
l
Glyg(l) = Hlyh(l). (44)
Thus the algorithm for elimination of quantifiers can be given as
Algorithm 10 (Elimination of quantifiers). Input: ( x) ( x, y).
(1) Call Algorithm 8 to normalize ( x) ( x, y).
(2) Remove ( x) and all literals containing x.
Theorem 39. All transformations in Algorithm 10 preserve equivalence.
Theorem 40. Algorithm 10 removes a block of quantifiers in time 2O(n).
Example 41. Let us look at an example in List. Consider a type complete formula
(x) cons(car(x), y1) = y2  y2 = x . (45)
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1551
(Because of the assumption of type completeness, we know Iscons(y2) and Iscons(x) must be present
in the side condition. We omit type information of y1, because it is irrelevant here.) Step (2) of
Algorithm 8 removes the selectors on x by converting (45) into
(x1x2) cons(x1, y1) = y2  y2 = cons(x1, x2) . (46)
Step (3) of Algorithm 8 (Algorithm 9) solves the equality cons(x1, y1) = y2, with solution set
{ x1 = car(y2), y1 = cdr(y2) },
resulting in
(x1x2) car(y2) = x1  cdr(y2) = y1  y2 = cons(x1, x2) , (47)
which, by standard substitution and quantifier manipulation, reduces to
(x2) y2 = cons(car(y2), x2)  cdr(y2) = y1. (48)
As (x2)[ y2 = cons(car(y2), x2) ] is in solved form, it is valid, and hence (48) reduces to cdr(y2) = y1,
or more formally, to cdr(y2) = y1  Iscons(y2).
7. Decision procedures for quantified theories
In this section we present decision procedures for the theory Th(TA ) of term algebras with
integers and parameters. In Section 7.1 we first refine the notions and techniques from Section 5 for
the construction of a quantifier elimination procedure for Th(TA ), and then, in Section 7.2, pres-
ent the quantifier elimination procedure itself. Throughout Sections 7.1 and 7.2 we assume a finite
language. In Section 7.3 we further generalize the result to structures whose constant domain has
an internal structure and admits quantifier elimination. In Section 7.4 we discuss how the procedure
can be adapted for infinite languages.
7.1. Term algebras with integers and parameters
In this section we refine the notions and techniques from Section 5 for the construction of a
quantifier elimination procedure for Th(TA ), which is given in the next section.
We first refine the notion of equality completion. As we have seen, to get an RLCC (LCC) we need
to express in Presburger arithmetic the set of legitimate lengths such that a certain number of distinct
terms of any length in the set can co-exist. This can be supported by equality completion. Equality
completion, however, in general introduces more literals, especially disequalities, which may again
destroy the completion because it may cause generation of new terms in the subsequent operation
(see disequality splitting in Algorithm 13). To avoid compromising convergence, we introduce the
notion of clusters which is weaker than equality completion but contains sufficient information to
allow extracting counting constraints. Intuitively, it suffices to have the equality information only
between terms of the same length and of the same type.
1552 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Definition 42 (Clusters). Let be a conjunction of literals in L . Term equality literals ap-
pearing in induce equivalence classes on TA-terms occurring in . Let [t] denote such an
equivalence class containing the term t. We say that C = { [t0], . . . , [tn] } is an -cluster (of )
if CLSn (t0, . . . , tn) is induced by , i.e., expresses that t0, . . . , tn are pairwise distinct -terms
of the same length.
The notion of clusters is syntactic modulo = and = (equality substitution). For example,
t = s  s = u  u = v  |t| = |v|  Is (t, s, u, v), (49)
induces the -cluster {[t], [v]} (where [t] = {t, s, u}, [v] = {v}). Formally speaking, a cluster C is
induced by the closure of under equality substitution. We chose this definition to limit the form
of disequalities generated in transformations of Algorithm 13.
Below we will drop the subscript if is clear from the context. For clarity, we view a cluster as
a set consisting of terms that are chosen representatives of their corresponding equivalence classes.
The choice of representatives is arbitrary unless stated otherwise. For example, { [t0], . . . , [tn] } will
also be written simply as { t0, . . . , tn }.
A cluster is maximal if no superset of it is a cluster. A cluster C is closed if C is maximal and
it is disjoint with any other maximal clusters. Two clusters C and C are called connected if there
exists t  C and t  C (or more formally [t]  C and [t ]  C ) such that either (i) t = t occurs
in the defining formula (i.e., C and C intersect), or (ii) t = t occurs in the defining formula while
|t| = |t | does not. Two clusters are called mutually independent if they are not connected. The size
of a cluster is the number of equivalence classes it contains. The rank of a cluster C, written rk(C),
is |t| for an arbitrarily chosen term t occurring in C. Clusters are partially ordered by their ranks;
for two clusters C, C we write C < C if rk(C) = t, rk(C ) = t and |t| < |t | is logically implied
by .
Example 43. Consider again the formula from Example 32,
|y| = |cons(x, z)|  |x| = |z|  |y| = |x|  Iscons(x, y, z) 
t,t S;tt
t = t , (50)
where S = {x, y, z, cons(x, z)}. This formula induces two mutually independent cons-clusters,
C1 : { x, z }, C2 : { y, cons(x, z) }
with rk(C1) < rk(C2).
A conjunction of literals is cluster complete if all maximal clusters of are mutually indepen-
dent. A conjunction of literals is a cluster completion of if  and is cluster complete.
is compatible with if satisfiability of implies satisfiability of . Like equality completion,
we are only interested in compatible cluster completions.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1553
Example 44. The List formula :
x = y  x = z  x = w  |x| = |y|  |x| = |z|  Iscons(x, y, z, w) (51)
gives three maximal cons-clusters
C1 : { x, y }, C2 : { x, z }, C3 : { w }.
C3 is closed, but neither C1 nor C2 is. If we conjoin (51) with y = z, C1 and C2 are merged, making the
resulting cluster { x, y, z } maximal. If we conjoin (51) with y = z, then C1 and C2 become identical.
Neither strengthening, however, results in a cluster completion of (51), because C3 is unaffected by
either changes, and in both cases is still connected to other maximal clusters. The first case can be
made cluster complete by conjoining it with |w| = |x| or |w| = |x|  w = y  w = z. Similarly for
the second case.
As demonstrated in Example 43, equality completeness implies cluster completeness. The con-
verse, however, does not hold: it is not necessary to have complete equality and type information
on all terms to induce a set of mutually independent clusters. For example,
n
i=1
|xi| = |yi|  xi = yi  Is (xi, yi) ,
is not equality complete, but induces n maximal and mutually independent -clusters, namely
{ {xi, yi} | 1 i n }. This in fact is what we want: a constraint weaker than an equality and type
completion.
We also need to refine the notion of RLCC to deal with parameters.
Definition 45 (RLCC with parameters). Consider
( x : ) ( x, y)  Â ( x, y) ,
where y are parameters. Let
(2)
( y) be the maximum subset of ( x, y) not containing x and
(1)
( x, y) = ( x, y) \
(2)
( y). A formula ( x, y) is an RLCC in x for ( x, y) relativized to
Â ( x, y), (in short, ( x, y) is an RLCC for ( x, y)/ x/Â ( x, y)), if the following formulas are valid:
( x, y : ) ( x, y)  Â ( x, y)  (z : ) (z, y)  | x| = z , (52)
( y : )(z : )
(2)
( y)  (z, y)( x : ) ( x, y)  Â ( x, y)  | x| = z . (53)
In the viewpoint of logical equivalence, we can assume
(2)
( y) is empty as it can be moved out
of the scope of ( x). But we cannot make such an assumption with respect to the computation of
RLCC, as its existence in general affects cluster completeness. For example,
(x : ) x = y1  x = y2  Is (x, y1, y2)  |x| = |y1|  |x| = |y2|
is not cluster complete without equality information between y1 and y2.
1554 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
For the construction of an RLCC, we require that ( x, y)  Â ( x, y) be cluster complete and
in strongly solved form.
Definition 46 (Strongly solved form). We say that ( x, y)  Â ( x, y) is strongly solved in x if ( x, y)
is solved in x and all literals of the form Ly = t( x, y), where y  y and t( x, y) is a constructor term
(properly) containing x, are redundant. We say that ( x : )[ ( x, y)  Â ( x, y)] is in strongly
solved form if ( x, y)  Â ( x, y) is strongly solved in x.
Example 47. The formula list  Â
Iscons(y)y = cons(x, z) 
t,t S;tt
t = t |y| = |cons(x, z)|  |x| = |z|  |y| = |x|, (54)
with S = {x, y, z, cons(x, z)} is not in strongly solved form. It can be made into strongly solved
form, however, by adding car(y) = x or cdr(y) = z to list, or by changing |cons(x, z)| = |y| to
|cons(x, z)| = |y| in Â . Either one will make the literal y = cons(x, z) redundant.
Recall that CLSn(x0, x1, . . . , xn) states that x0, . . . , xn is a -cluster of n + 1 elements. We
claim that Algorithm 6 computes an RLCC (with parameters y being treated as ordinary
variables).
Theorem 48. If ( x, y)  Â ( x, y) is strongly solved in x and cluster complete, then ( x, y)
computed by Algorithm 6 is an RLCC for ( x, y)/ x/Â ( x, y).
7.2. A quantifier elimination procedure for Th(TA )
In this section we expand Algorithm 8 to an elimination procedure for Th(TA ), the first-order
theory of term algebras with integers. Since L has two sorts, namely and , we need to show
elimination of integer quantifiers as well as term quantifiers.
7.2.1. Elimination of quantifiers on integer variables
We assume that formulas with quantifiers on integer variables are in the form
(z : ) ( x)  ( x, y, z) , (55)
where y, z are integer variables and x are term variables. Since ( x) is in L , we can move ( x)
out of the scope of z, obtaining
( x)  (z : ) ( x, y, z). (56)
For this reason, neither ( x) nor ( x, y, z) is required to be quantifier-free. Now we can put
(z : ) ( x, y, z) into a quantifier-free form using Cooper's method [7,28]. For the sake of effi-
ciency, however, we can defer the actual elimination on (z : ) ( x, y, z) until all term quantifiers
have been eliminated. The reason, as we shall see soon, is that the elimination of term quantifiers
does not require the integer constraint in (57) to be quantifier-free.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1555
7.2.2. Elimination of quantifiers on term variables
We assume that formulas with quantifiers on term variables are in the form
( x : ) ( x, y)  ( x, y, z) , (57)
where x, y are term variables, z are integer variables, ( x, y) is quantifier-free. The following
algorithm is based on Algorithm 8.
Algorithm 11 (Normalization with parameters). Input: (57). Apply the following subprocedures suc-
cessively.
(1) Basic Normalization. Apply Algorithm 8 to ( x, y) and update ( x, y, z) accordingly.
(2) Cluster Completion. Normalize (57) to a cluster completion (Algorithm 12).
(3) Decomposition of Disequalities. Decompose disequalities of the Form Ly = t( x, y) (Algo-
rithm 13).
The purpose of steps (1­3) is to transform (57) into a formula which is in strongly solved form
and cluster complete, such that we can use Algorithm 6 to construct an RLCC. Having the RLCC
allows us to reduce term quantifiers to integer quantifiers. Step (1) gives us a formula of the form
(omitting integer literals) (43) (copied below)
( x : )
i
xp(i) = ti( x, y) 
j
Ljyq(j) = sj( x, y) . (43)
Algorithm 12 produces cluster completions of the input by merging mutually dependent maximal
clusters.
Algorithm 12 (Cluster completion). Input: (43). Apply the following subprocedures repeatedly until
cluster completeness is obtained.
(1) Select Clusters. Let
C1 = {t1, . . . , tn}, C2 = {s1, . . . , sm}
be two connected maximal clusters with witness either ti = sj or ti = sj.
(2) Merge Clusters.
(a) If ti = sj occurs. Guess an equality completion for C1  C2.
(b) If ti = sj occurs, but not |ti| = |sj|. Split on
|ti| = |sj|  |ti| = |sj|.
For the |ti| = |sj| branch, guess an equality completion for C1  C2.
(3) Renormalization. Apply Algorithm 9 to ( x, y) and update ( x, y, z) accordingly.
1556 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
The termination of Algorithm 12 can be seen as follows. Step (2) may increase the number of
literals and hence the number of term occurrences, but it does not increase the number of equiva-
lence classes on TA-terms. Step (3) (Algorithm 9) is called afterwards to restore the normal form.
Clearly, steps (1­2) of Algorithm 9 do not introduce new terms. Step (3) of Algorithm 9 solves
equalities containing existentially quantified variables x in terms of parameters y. However, it will
only introduce new terms into existing equivalence classes. In particular, it removes at least one
variable in x by substitution. So there can be at most n rounds of the run of step (2) where n is the
number of distinct terms in (43). It is easily seen that the resulting formula is still in the form of (43)
(omitting integer literals).
Algorithm 13 decomposes disequalities in that are of the form Ly = t( x, y) such that |Ly| =
|t( x, y)| is implied by and t( x, y) is a constructor term (properly) containing x. The decomposi-
tion consists of a sequence of disjunctive splittings, where in each step the matrix of (57) is updated
accordingly. It is assumed that when Algorithm 13 is called, the matrix  of (57) is cluster
complete. The algorithm preserves this completeness: all resulting branches are cluster complete.
Algorithm 13 (Decomposition of disequalities containing x). Input: Set D of disequalities of the form
Ly = t( x, y) such that |Ly| = |t( x, y)|, in the context of (57).
Repeat until D is empty.
(1) Disequality Removal. Select D  D such that for some Ly
D = { Ly = (t
(i)
1 ( x, y), . . . , t
(i)
k ( x, y)) | 1 i m }
(a) Disequality Splitting. Remove D from D and add to ( x, y)
m
i=1
sp(i)Ly = t
(i)
p(i)( x, y) 
1 j<p(i)
sj Ly = t
(i)
j ( x, y) , (58)
where p is an index function with domain [1..m] and range [1..k].
(b) Disequality Rewriting. Let  [1..m], I  [1..m] be such that
p( ) = sup { p(i) | 1 i m },
I = { i  [1..m] | p(i) < p( ) }.
For any i  I, replace
sp(i)Ly = t
(i)
p(i)( x, y) by t
( )
p(i)( x, y) = t
(i)
p(i)( x, y).
(2) Cluster Completion. Call Algorithm 12 to restore cluster completeness.
In step (1a) we rewrite Ly = (t
(i)
1 ( x, y), . . . , t
(i)
k ( x, y)) to
1 j k
sj Ly = t
(i)
j ( x, y) 
1 l<j
sl Ly = t
(i)
l ( x, y)
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1557
instead of more directly to
1 j k
sj Ly = t
(i)
j ( x, y) .
Introduction of equalities helps limit the number of generated terms by consolidating most of them
with existing ones. Thus, among newly generated terms, exactly one (sp( )Ly) is asserted unequal to
old terms that may contain x thanks to the rewriting in step (1a) and an additional requirement on
clusters explained below. Each run of step (1a) replaces a set of disequalities
{ Ly = (t
(i)
1 ( x, y), . . . , t
(i)
k ( x, y)) | i  [1..m] }
with a new set of disequalities
{ sp( )Ly = t
(i)
p( )( x, y) | i  I }.
Clearly, sp( )Ly must reside in a cluster of lower rank than that of Ly. As rank ordering is well-
founded, the size of D will eventually decrease, and hence Algorithm 13 terminates. In addition, it
can be shown that the total number of newly generated terms is bounded quadratically by the input
size. As a result, we can obtain one exponential upper bound on the complexity of Algorithm 13.
Besides the rewriting in step (1a), we also need to prevent step (2) (Algorithm 12) from generat-
ing unwanted disequalities. This can be done by requiring, upon calling Algorithm 12, that newly
generated terms are not made representatives for existing equivalence classes.
Example 49. To illustrate step (1), assume = (s1 , s2, s3, s4), and
D =
y = (u1, u2, u3, u4), y = (v1, v2, v3, v4)
y = (w1, w2, w3, w4), y = (t1, t2, t3, t4)
.
Let us consider below one of the disjuncts obtained by step (1a), written in matrix style with the
conjunction connective omitted:


s1 y = u1 s2y = u2 s3y = u3
s1 y = v1 s2y = v2 s3y = v3 s4y = v4
s1 y = w1 s2y = w2
s1 y = t1


 .
It follows that = 2 and p( ) = 4. In step (1b) we use v1, v2, v3, respectively, to replace s1 y, s2y, s3y
that occur in newly generated disequalities, obtaining a new matrix of conjuncts


s1 y = u1 s2y = u2 v3 = u3
s1 y = v1 s2y = v2 s3y = v3 s4y = v4
s1 y = w1 v2 = w2
v1 = t1


 .
1558 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
In this case only s4y (among the newly generated terms), having a lower rank than y, is part of a
disequality that may need to be split further.
Proposition 50. Algorithm 11 produces a formula that is in strongly solved form and cluster complete.
After normalization (Algorithm 11) and removal of redundant disequalities, the resulting formula
is in the form
( x : )
(1)
( x, y) 
(2)
( y)  Â ( x, y)  ( x, y, z) , (59)
where
(1)
( x, y) is of the form i xf(i) = ti( x, y) and
(2)
( y) does not contain x. Â ( x, y) is the
integer constraint obtained from runs of Algorithm 12 (step (2) of Algorithm 11 and step (2) of Al-
gorithm 13). The resulting formula is guaranteed to be in strongly solved form and cluster complete,
and hence by Theorem 48, we can compute an RLCC ( x, y) for
(1)
( x, y) 
(2)
( y)/ x/Â ( x, y),
producing the equivalent
( x : )
(1)
( x, y) 
(2)
( y)  ( x, y)  ( x, y, z) , (60)
which reduces to
(2)
( y)  ( u : ) ( u, y)  ( u, y, z) , (61)
because
(1)
( x, y) has been completely characterized by , and all occurrences of x in are
integer occurrences and hence integer variables.
Thus, the final algorithm for elimination of term quantifiers can be given as follows.
Algorithm 14 (Reduction of term quantifiers to integer quantifiers). Input: (57)
(1) Call Algorithm 11 to normalize the input, obtaining (59).
(2) Call Algorithm 6 to get the RLCC ( x, y) for ( x, y)/ x/Â ( x, y) and add it to the matrix
of (59), obtaining (60).
(3) Remove
(1)
( x, y) from the matrix of (60), pull
(2)
( x, y) out of ( x : ) and then reduce
( x : ) to (z : ), obtaining (61).
Theorem 51. All transformations in Algorithm 14 preserve equivalence.
Theorem 52. Algorithm 14 eliminates a block of quantifiers in time 2O(n2 lg n).
Example 53. Let us modify Example 41 by conjoining the matrix of (45) with an integer constraint
 (x). (The concrete form of  (x) is irrelevant.) Now we have
(x : list) cons(car(x), y1) = y2  y2 = x   (x) . (62)
Running Algorithm 8 on (62) and updating integer constraints accordingly, we arrive at
cdr(y2) = y1  (x2 : list) y2 = cons(car(y2), x2)  (|car(y2)| + |x2| + 1) (63)
(cf. (48)). Below we will follow a single branch produced in the disjunctive splittings.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1559
We run Algorithm 12 to obtain a compatible cluster completion of (63), producing
cdr(y2) = y1  (x2 : list) y2 = cons(car(y2), x2) (64)
 |y2| = |car(y2)| + |x2| + 1   (|car(y2)| + |x2| + 1) .
Here we omitted obvious disequalities implied by literals listed. The matrix of (64) induces three
mutually independent clusters:
{ x2 }, { car(y2) }, { y2, cons(car(y2), x2) }.
The formula (64), however, is not in strongly solved form as x2 appears in the disequality y2 =
cons(car(y2), x2), and hence we need to run Algorithm 13. Choosing cdr(y2) = x2 in step (1a), we
obtain
cdr(y2) = y1  (x2 : list) cdr(y2) = x2  |y2| = |car(y2)| + |x2| + 1 (65)
  (|car(y2)| + |x2| + 1) ,
which induces four clusters:
{ car(y2) }, { x2 }, { cdr(y2) }, { y2 }
that are not mutually independent as neither {x2} nor {cdr(y2)} is closed. Then step (2) of
Algorithm 13 (Algorithm 12) is called upon to fix the problem of mutual independence. Choos-
ing |cdr(y2)| = |x2| we have
cdr(y2) = y1  (x2 : list) cdr(y2) = x2  |y2| = |car(y2)| + |x2| + 1 |cdr(y2)| (66)
= |x2|   (|car(y2) + |x2| + 1) ,
which is in strongly solved form and induces three mutually independent clusters:
{ car(y2) }, { x2, cdr(y2) }, { y2 }.
Identifying (66) with (59), we have
( x, y) :
(1)
( x, y) 
(2)
( y),
(1)
( x, y) : cdr(y2) = x2,
(2)
( y) : cdr(y2) = y1,
Â ( x, y) : |y2| = |car(y2)| + |x2| + 1  |cdr(y2)| = |x2|,
( x, y, z) :  (|car(y2)| + |x2| + 1).
We can now call Algorithm 6 to get the RLCC ( x, y) for ( x, y)/ x/Â ( x, y), which simplifies
to
Â ( x, y)  |x2| > 3  2 |x2|.
1560 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
By step (2) of Algorithm 14, (66) is equivalent to
cdr(y2) = y1  (x2 : list) cdr(y2) = x2 (67)
 |y2| = |car(y2)| + |x2| + 1  |cdr(y2)| = |x2|
 |x2| > 3  2 |x2|   (|car(y2)| + |x2| + 1) ,
which, by the reduction of term quantifiers to integer quantifiers, transforms to
cdr(y2) = y1  (z : ) |y2| = |car(y2)| + z + 1  |cdr(y2)| = z (68)
 z > 3  2 z   (|car(y2)| + z + 1) .
Bysubstituting|cdr(y2)|forz wecanremove(z: ).Usingthefactthat|y2| = |car(y2)| + |cdr(y2)| +
1, we finally obtain
cdr(y2) = y1  |cdr(y2)| > 3  2 |cdr(y2)|  (|car(y2)| + |cdr(y2)| + 1). (69)
It is easy to verify that (69) implies (62). The reverse direction, however, does not hold because we
only showed one branch of the reduction.
7.3. Richer theories on constant domain
Similar to Section 5.5, we can have quantifier elimination for TA+
provided Ac also admits
quantifier elimination. It is easily seen that no change is needed for the elimination of quantifiers
on integer variables. For the elimination of term quantifiers, we assume formulas are of the form
( x : ) c( x, y)  ( x, y)  ( x, y, z) , (70)
where c( x, y) is a formula in Lc. Thanks to step (2) in Algorithm 8, we can assume that none of
x appears inside selectors. Without loss of generality, we also assume x are either all asserted to be
non-nullary constructor terms or all asserted to be constants. In the former case, we can assume
c( x, y) is just c( y) because x do not properly appear inside selectors and hence they do not have
any place in c. So we can simply move c( y) out of  x, obtaining a quantified formula in the
same form as (57). In the latter case, we can assume that x do not properly occur inside constructor
terms either, because we can decompose constructor terms properly containing x in the same way
as in Algorithms 9 and 13. So literals containing x in ( x, y) are equalities or disequalities on
terms of constant type, and hence we can rewrite ( x, y) as
(a)
( x, y) 
(b)
( y) with
(a)
( x, y)
being a constraint in Lc. We can then assume that
(a)
( x, y) is a part of c( x, y). We also simplify
( x, y, z) to ( y, z) by instantiating all |xi| to 1. Now it is clear that (70) is equivalent to
(b)
( y)  ( y, z)  ( x : ) c( x, y), (71)
where the quantifiers can be handled by the quantifier elimination procedure for Th(Ac).
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1561
7.4. Adaptation for infinite languages
In this section we describe adaptations to aforementioned algorithms needed to deal with the-
ories in a (countably) infinite language. We require the number of distinct arities of the language
be finite, which is not a real restriction for practically interesting theories. We distinguish three
types of infinite languages according to the cardinalities of the constant domain and non-nullary
constructor domain:
(1) infinitely many constants, finitely many non-nullary constructors;
(2) finitely many constants, infinitely many non-nullary constructors;
(3) infinitely many constants, infinitely many non-nullary constructors.
A finite language is called type 0.
Below we discuss the three aspects of our algorithms that are affected by an infinite signature:
type completion, counting constraints, and constant instantiation.
7.4.1. Type completion
Type completion is affected only by the cardinality of the domain of non-nullary constructors,
since for type completion the identity of constants is not important; only the fact that a term is
a constant is relevant, which is provided by IsA. Thus, no adaptation is necessary for languages
of type 1. For languages of type 2 and 3, note that a given formula  can only contain finitely
many non-nullary constructors. For type completion we can collect all non-nullary constructors
not occurring in  into one pseudo-constructor type U, thus reducing a type 2 (respectively, type
3) language a type 0 (respectively, type 1) language. Below, we show how to define the counting
constraints for the new constructor type U.
7.4.2. Counting constraints
Both an infinite constant domain and an infinite number of non-nullary constructors allow
counting constraints to be relaxed. For languages with an infinite constant domain (type 1 and 3),
CNT,n(x) (respectively, CNT,n(x)) is equivalent to Tree (x) (respectively, Tree(x)), as there are
infinitely many trees of any legitimate tree length.
For languages with only an infinite number of non-nullary constructors (type 2), the situa-
tion is slightly more complicated, because, depending on the arities of those non-nullary construc-
tors, an infinite subset of those non-nullary constructors may or may not be useful in forming
terms of a particular length. First consider the simplest case in which the language has an in-
finite number of non-nullary constructors of arity i only. Then the number of distinct terms of
length x is infinite if and only if x - i is a legitimate tree length (and thus x is also a legitimate
tree length). If x - i is not a legitimate tree length, the counting constraint reduces to that of a
finite language. Thus in this case the counting constraint CNTk,n,i(x), where the additional sub-
script i denotes the arity with an infinite number of non-nullary constructors, can be defined
as
CNTk,n,i(x) : Tree(x - i)  CNTk,n(x).
1562 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
In general there can be a finite number of arities for which the number of non-nullary constructors
is infinite. Let I be the set of these arities. Then the counting constraint CNTk,n,I (x) can be expressed
as
CNTk,n,I (x) :
iI
Tree(x - i)  CNTk,n(x),
and the counting constraint CNTk,n,I (x) can be expressed as
CNTk,n,I (x) :
iI
Tree (x - i)  CNTk,n(x).
Note that CNTk,n(x) and CNTk,n(x) above are defined with respect to a finite sub-language con-
taining all constants and all non-nullary constructors not having arities in I. It does not matter
if the sub-language contains any non-nullary constructors having arities in I, because the pres-
ence of those constructors does not affect the truth value of CNTk,n(x) (respectively, CNTk,n(x))
when iI Tree(x - i) (respectively, iI Tree (x - i)) is false. Let S denote the signature of such
a sub-language. For the pseudo-constructor type U introduced above we have
CNTU
k,n,I (x) :
iI
Tree(x - i) 
S
CNTk,n(x),
since U must necessarily include all non-nullary constructors of arities in I.
7.4.3. Constant instantiation
Having an infinite number of constants prohibits the application of the constant elimination
(step (4) of Algorithm 8). In that step all variables x that are asserted to be constants are nondeter-
ministically instantiated with constants, thereby allowing the elimination of the corresponding  x.
With an infinite number of constants, direct instantiation obviously is not possible. These variables,
however, can still be eliminated as follows. At step (4) of Algorithm 8 all existentially quantified
variables appear in disequalities only (cf. (34)). Let xc  x be the set of existentially quantified vari-
ables that are asserted to be constants. Then for any given assignment of the parameters y and the
remaining variables x\ xc, all disequalities containing variables in xc can be simultaneously satisfied
by assigning them distinct constants occurring neither in [[ y]] nor in the formula. Therefore all
variables in xc can be removed from the formula (43), as desired.
8. Conclusion
We presented decision procedures for term algebras augmented with Presburger arithmetic, for
quantifier-free theories and quantified theories. Our technique is based on the extraction of exact
integer constraints from term constraints, and in case of quantified theories, combined with the
reduction of term quantifiers to integer quantifiers.
We have extended our results to queues, a type of non-recursive data structure in which an ob-
ject can be constructed in more than one way [41]. We plan to extend this work to reason about
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1563
the combination of data structures with integers in richer languages such as the theory of term
algebras with subterm relation [35], the theory of concatenation [22] and the theory of queues with
sub-sequence relations (including subqueue, prefix or suffix relation) [4].
We also plan to extend our results to allow more than one integer function on data structures,
Such extensions allow us to model a wide range of augmented data structures, such as balanced
tree structures like AVL trees and red-black trees [8]. For example, the theory of term algebras with
two length functions, which respectively give the length of the maximum path and the length of the
minimum path, can express red-black trees [8].
9. Acknowledgments
We thank the anonymous referees for their careful reading and suggestions. The detailed com-
ments that pointed out several inaccuracies in an earlier version were extremely valuable.
Appendix A
In the following proofs we need to express legitimate lengths of trees. To support those ex-
pressions we use the following lemma, which is a well-known result following from the Euclidean
algorithm for computing greatest common divisors.
Lemma 54. Let d1, . . . , dn be positive integers and gcd their greatest common divisor. Then
if x1, . . . , xn 0 x =
n
i=1
xidi then gcd | x. (A.1)
In addition, there exists Nd such that for any x Nd the reverse also holds, that is,
if gcd | x then x1, . . . , xn 0 x =
n
i=1
xidi . (A.2)
For 1 x < Nd , however, there is no closed formula to decide if x is a non-negative linear combi-
nation of d1, . . . , dn. Finding the smallest Nd for gcd = 1 is known as the Frobenius Coin Problem,
and has been shown to be NP-hard.
Below we present the proofs not included in the main text. For ease of reference we restate the
propositions and theorems.
Proposition 24. obtained by Algorithm 4 is expressible in a quantifier-free Presburger formula
linear in the size of .
Proof. Let n be the size of . Then the size of G is O(n). For each node in G we add (on average)
at most four integer constraints. For an equivalence class {t1, . . . , tn}, we only need to add n - 1
equalities, namely |t1| = |t2| =    = |tn|. So it suffices to show that the integer constraints Tree(|t|)
and Tree (|t|) can be expressed in quantifier-free Presburger formulas linear in the size of .
1564 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Let d1, . . . , dn be the distinct arities of the non-nullary constructors in the given language. Since
n and di (1 i n) are constant values (in the given language), the predicates Tree(x) and Tree (x)
are of constant length (with quantifiers). By Cooper's method, there exist equivalent quantifier-free
formulas with length at most triple exponential in the length of Tree(x) (or Tree (x)) [27]. Of course,
these equivalent quantifier-free formulas are still of constant length.
In fact Tree(x) can be expressed directly by a quantifier-free formula. Indeed, Tree(x) states that
x - 1 is a non-negative linear combination of d1, . . . , dn. By Lemma 54 this condition is equivalent
to gcd | x - 1 provided x - 1 Nd where gcd and Nd are as stated in the lemma. Therefore Tree(x)
is equivalent to
sS
(x - 1 = s)  x - 1 Nd  gcd | x - 1 .
where S  {1, . . . , Nd - 1} can be precomputed for the given language to contain exactly those values
1 s < Nd such that
x1, . . . , xn 0 s =
n
i=1
xidi .
Theorem 25. obtained by Algorithm 4 is an LCC for .
Proof. To show that computed by Algorithm 4 is an LCC for , we need to show, by Definition
17, the validity of
( x : ) ( x)  (z : ) (z)  | x| = z , (8)
(z : ) (z)  ( x : ) ( x)  | x| = z . (9)
Clearly, from the description of Algorithm 4, for any variable assignment satisfying , | | satisfies
and thus (8) holds.
To demonstrate the validity of (9), let be a satisfying assignment of . We have to show
that there exists a variable assignment such that | | = and satisfies . Let be an
arbitrary variable assignment such that | | = . Clearly such an assignment exists; it may not,
however, satisfy . We show how can be transformed into an assignment that is guaranteed
to satisfy . Let G be the DAG of and R be the bidirectional closure induced by . Let
G be the extension of G that represents the variable assignment , that is G is obtained from G
by replacing each leaf labeled by a variable v by the ground tree representing (v). Without loss of
generality we assume that all leaf vertexes in G are labeled by either constants or variables; this
can be achieved by variable abstraction of selector terms, as illustrated in Example 55 below.
To obtain from , apply the following two steps:
(1) substitute each variable asserted to be a constant in by a fresh constant. This is possible,
since TA
has infinitely many constants.
(2) for each equivalence class {v1, . . . , vk} set (vk) =    = (v2) = (v1).
We claim that is a satisfying assignment for . Clearly | | = | | = since the transformation
does not affect the lengths of the terms. Moreover, respects R , that is, for any terms t and s,
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1565
Fig. A. 1. The DAG of in Example 55.
(t, s)  R implies (t) = (s) (or Gt = Gs). It remains to show that for any two nonequivalent
vertexes s and t in G , Gs = Gt.
Let s and t be two vertexes such that s = t in , but suppose, for a proof by contradiction,
that Gs = Gt. Let ht(s) denote the height of Gs and similar for ht(t). Without loss of generality, as-
sume that among all pairs of vertices t1, t2 such that t1 = t2 in , but Gt1
= Gt2
, h = ht(t1) = ht(t2)
is minimal. If h = 1, then both s and t are variables in distinct equivalence classes, and thus they
were assigned distinct constants in step (1), a contradiction. If h > 1, then (t) = (s) implies that
(t[1]) = (s[1]). But ht(t[1]) = ht(s[1]) = h - 1 < h, contradicting the minimality of h.
Example 55. To illustrate the variable abstraction and the construction of a satisfying variable
assignment consider the following constraint
: Iscons(x)  cons(y, z) = cons(cdr(x), z)  car(x) = z. (A.3)
The DAG of , shown in Fig. A.1, is the same as that of Example 18 and as in that example, R
is
{ (v1, v2), (v3, v4), (v6, v7) }.
To eliminate the selector terms labeling the leaf nodes v5 and v6, we introduce two new variables,
x1 and x2, and let x1 = car(x) and x2 = cdr(x). Now x can be replaced by cons(x1, x2) and thus is not
part of the variable assignment.
A satisfying integer assignment for is
: { |x1| := 1, |x2| := 5, |y| := 5, |z| := 1 }.
A corresponding term assignment such that | | = is
: { x1 := a, x2 := cons(cons(a, a), a), y := cons(a, cons(a, a)), z := a }.
The first step of the transformation produces
{ x1 := a1, x2 := cons(cons(a2, b2), c2), y := cons(a3, cons(b3, c3)), z := a4 }.
Since x2 and y are in the same equivalence class, the second step of the transformation produces
:{ x1 := a1, x2 := cons(cons(a2, b2), c2), y := cons(cons(a2, b2), c2), z := a4 }.
which satisfies .
1566 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Proposition 30. CNTk,n(x) and CNTk,n(x) are expressible by quantifier-free Presburger formulas that
can be computed in time O(n).
Proof. Let the language L have m non-nullary constructors 1, . . . , m with arities d1, . . . , dm. Let
N(x) denote the number of distinct term trees of length x. Recall that CNTk,n(x) holds if and only
if N(x) > n for a constant domain of size k. Thus it suffices to express N(x) > n as quantifier-free
Presburger formula of size O(n).
For x > 1, N(x) can be expressed as a recurrence relation:
N(x) =
m
i=1 x1, . . . , xdi
|
x1 +    + xdi
= x - 1 
x1, . . . , xdi
> 0
di
j=1
N(xj). (A.4)
The relation can be explained as follows: there are m ways to label the root of a tree; for a root with
d children with lengths x1, . . . , xd , respectively, there are d
j=1 N(xj) combinations. Using dynamic
programming we can compute N(1), N(2), . . ., with N(1) = k, until we reach the first xmin such that
N(xmin) > n.
We first consider the special case that d1 =    = dm, where we have for any x1, x2 > 0 the following
monotonicity property:
Tree(x1)  Tree(x2)  x1 > x2  N(x1) > N(x2). (A.5)
The reason is that in this case term trees always "grow continuously", that is, the next larger tree is
always obtained by expanding one of the vertexes of the previous tree.
For this special case N(x) > n reduces to Tree(x)  x xmin. To show that Tree(x)  x xmin
can be computed in O(n) time, let d be the maximum arity. There are O(xd-1) different sequences
of positive numbers with sum x - 1, and thus N(x) can be obtained by O(xd ) arithmetic operations.
As N(x) grows exponentially in x, xmin is at the scale of O(lg n). Moreover, as all integers in the
computation are less than n, any arithmetic operation costs time O(lg n). Therefore the search for
such xmin can be done in O(n) time.
Unfortunately, (A.5) does not hold in general when arities are different. For example, for
d1 = 3, d2 = 10 and k = 1, N(10) = 12, while N(11) = 1. The problem is that in this case term
trees do not necessarily grow continuously. Indeed, a tree of length 10 must consist of a root
with three children, with lengths either 4, 4, and 1, or 7, 1, and 1. A tree of length 11, on
the other hand, can only consist of a root with 10 children each of size 1. Consequently, a
tree of length 10 cannot "grow" into a tree of length 11, and therefore N(10) and N(11) are
completely unrelated.
However, there exists Nd such that for any x xmin + Nd ,
N(x) > N(xmin) iff Tree(x).
Let Nd be as in Lemma 54 and let gcd be the greatest common divisor of d1, . . . , dm. The "only if"
direction is trivial. For the "if" direction, suppose Tree(x) holds, that is, x - 1 can be expressed as
a non-negative linear combination of d1, . . . , dm. By Lemma 54, (A.1), gcd | x - 1. For the same
reason, we have gcd | xmin - 1, and since x - xmin = (x - 1) - (xmin - 1) we have gcd | x - xmin.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1567
By assumption, x - xmin Nd , and therefore, by Lemma 54, (A.2), x - xmin can be expressed as a
non-negative linear combination of d1, . . . , dm, and thus a tree of length xmin can grow into a tree of
length x by replacing one of its vertexes with a tree of size x - xmin + 1. Therefore, N(x) > n, and
thus CNTk,n(x) can be expressed by
sSn
(x = s)  x xmin + Nd  Tree(x) ,
where
Sn = {xmin s < xmin + Nd | N(s) > n}.
Since Nd and the size of Sn are constant, CNTk,n(x) can be computed in O(n) time as desired.
The proof for CNTk,n(x) is similar.
Proposition 33. obtained by Algorithm 6 is expressible in a quantifier-free Presburger formula of
size linear in the size of  Â .
Proof. By Proposition 24, the call to Algorithm 4 to obtain takes time O(n) where n is the size
of the input  Â . Next, for each -cluster of size m, Algorithm 6 adds CNTk,m(x) to . By
Proposition 30, CNTk,m(x) can be computed in time O(m). Since the sum of sizes of all clusters is
O(n), it follows that can be computed in O(n), and hence the size of is O(n).
Theorem 34. obtained by Algorithm 6 is an RLCC for /Â .
Proof. Let ( x) be a type- and equality-complete term constraint and Â (| x|) be a Presburger
formula. To show that computed by Algorithm 6 is an RLCC for /Â , we need to show, by
Definition 20, the validity of
( x : ) ( x)  Â ( x)  (z : ) (z)  | x| = z , (11)
(z : ) (z)  ( x : ) ( x)  Â ( x)  | x| = z . (12)
For (11) consider an arbitrary variable assignment such that  Â is true. By Algorithm 6,
consists of ,4  Â  (CNTk,n)i, where ,4 is the constraint computed by Algorithm 4 and
(CNTk,n)i are the counting constraints added in step (3). By Theorem 25, there exists an integer
assignment such that = | | such that ,4 is true, and obviously this assignment also makes
Â true. Finally, the counting constraints impose a restriction on the length of terms. By Algo-
rithm 6, for any term t such that CLSn(t, t1, . . . , tn) is implied by  Â , includes the counting
constraint CNTk,n(t, t1, . . . , tn), or equivalently, N(|t|) > n, with N(|t|) as before. Since satisfies
 Â , it must have assigned different terms to t, t1, . . . , tn, and thus their length necessarily satisfies
N(|t|) > n.
For (12) consider an arbitrary integer assignment , assigning z := d, such that ( d) holds.
We have to show that there exists a term assignment , assigning x := t such that |t| = d and
(t) and Â (t) hold. In contrast with the proof of Theorem 25, it is not immediately obvious that
such an assignment exists, because we no longer assume an infinite constant domain. Therefore we
incrementally construct , starting with terms of smallest length.
1568 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Let G be the DAG and R the bidirectional closure for as constructed in Algorithm 1. We
assume, as in the proof of Theorem 25, that all selectors have been eliminated and thus leaf nodes
of G are labeled by either constants or variables. Let G be identical to G except that all nodes
corresponding to equivalence classes in R have been merged. Since is equality-complete any
two distinct vertices in G must correspond to distinct terms.
To construct , we order all term lengths according to the values assigned by :
|t
(1)
0 | =    = |t(1)
n1
|
block 1
< |t
(2)
0 | =    = |t(2)
n2
|
block 2
<       < |t
(k)
0 | =    = |t(k)
nk
|
block k
.
Note that any term t in appears in and hence |t| is assigned a length by and appears in the
above sequence. Let li be the length of the terms in block i. For simplicity, we assume all terms in
block i are of type i. Starting with the terms in the first block, that is, t
(1)
1 , . . . , t
(1)
n1 , we incrementally
construct a satisfying assignment for such that | | = . Obviously t
(1)
i (0 i n1) is either
a constant or a term variable as its length is the smallest, and we only need to consider t
(1)
i which are
term variables. By Algorithm 7 we know that CNT 1
k,n1
(l1) is in . As satisfies , there are at
least n1+1 different 1-terms of length l1. Therefore we can simply assign each t
(1)
i (if it is a variable)
a distinct term. Now we proceed to the (i + 1)th block assuming that the terms in the ith block have
been assigned. At this time the values of all non-variable terms in the (i + 1)th block have been
fixed because variables (if any) in those terms have length less than li and those variables have been
assigned by the ith round. By the same argument as before, due to the presence of CNT i+1
k,ni+1
(li+1)
in , we are able to assign each variable in the (i + 1)th block a different tree of length li+1. The
assignment in each round will not create any equality between terms in G simply because terms of
the same type and the same length are assigned to different values. By induction we can eventually
construct a satisfying assignment for such that | | = .
Since includes Â , Â ( d) holds, and therefore also satisfies Â , as required.
Theorem 39. All transformations in Algorithm 10 preserve equivalence.
Proof. Recall that Algorithm 10 first transforms a formula into one in solved form, that is in the
form
( x)
i
xp(i) = ti( x, y) 
j
Ljyq(j) = sj( x, y) (42)

k
Fkyf(k) = uk( y) 
l
Glyg(l) = Hlyh(l).
using Algorithm 8. It then eliminates the quantifiers by removing all literals containing x. It is
straightforward to show that all transformations in Algorithm 8 preserve equivalence, and we omit
the proof. It remains to show that (42) is equivalent to
k
Fkyf(k) = uk( y) 
l
Glyg(l) = Hlyh(l),
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1569
that is, that
( y : )( x : )
i
xp(i) = ti( x, y) 
j
Ljyq(j) = sj( x, y) . (43)
is valid, or that
i
xp(i) = ti( x, b) 
j
Ljbq(j) = sj( x, b), (A.6)
is satisfiable for an arbitrary sequence b of fixed TA-terms. For simplicity, we assume that all non-
nullary constructors have the same arity. The general case can be proved via a minor modification.
Let x be x1, . . . , xn. Let be the maximal length of terms not containing x in (A.6) and let >
be such that there exist at least n distinct nonconstant terms d = d1, . . . , dn of length . We claim
that d satisfies (A.6). Observe that, due to step (4) of Algorithm 8, none of the xi are constrained to
be constants, and thus all variables can be assigned nonconstant terms.
To show that Ljbq(j) = sj( d, b) holds, we assume that the constructor term sj actually contains x.
If not, the literal Ljyq(j) = sj( x, y) can be moved out of the scope of  x. Since sj is a constructor term,
|sj( d, b)| > . On the other hand, Lj is a selector sequence and thus |Ljbq(j)| . Therefore, for
all j, Ljbq(j) = sj( d, b).
To show that dp(i) = ti( d, b), consider the following four cases depending on the structure of ti.
(1) ti does not contain any variable in x. Then |ti(b)| while |dp(i)| = > , and hence
dp(i) = ti(b).
(2) ti properly contains a variable in x. Then |dp(i)| = but |ti( d, b)| > , and hence dp(i) = ti(b).
(3) ti is a constant ci. Then dp(i) = ci holds since none of terms in d is a constant.
(4) ti is a variable in x. Then ti  xp(i ) where p(i ) = p(i) (or xp(i) = xp(i) simplifies to false). Since
p(i) = p(i ), dp(i) = dp(i ) holds by the selection of d.
Theorem 40. Algorithm 10 removes a block of quantifiers in time 2O(n).
Proof. First note that we need not be concerned with the increase of the matrix size by the sub-
stitution, since we can represent a conjunction of literals efficiently using the DAG representation
(Definition 8), in which substitution can simply be done by rearranging edges in the graph. For
example, consider the following sequence of formulas
x1 = (x2, x2), x2 = (x3, x3), . . . , xn = (xm+1, xm+1).
Instead of generating a formula of size 2O(m), the substitution only gives a linear "double-edged"
path from x1 to xm+1. For details see [26].
It suffices to analyze each step of Algorithm 8. Let n be the size of the matrix. Step (1) (type
completion) generates at most 2O(n) disjuncts, as for a formula containing n selector terms, there
are at most 2O(n) combinations of tester literals. Step (2) (selector elimination) can be done in O(n)
1570 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
as a selector term can be transformed to a formula in the constructor language in linear size. Step
(3) (decomposition, Algorithm 9) generates at most 2O(n) disjuncts. In addition, it can increase the
matrix size to O(n2), due to solving x in terms of y (step (3) of Algorithm 9); the total lengths of
all paths in a tree of size n is bound by n2. Note that the size increase comes from the increase of
term occurrences and the number of distinct terms is still in O(n). Step (4) (constant elimination)
produces at most 2O(n) disjuncts. All put together, steps (1­4) generate 2O(n) disjuncts each of which
has size O(n2) and contains O(n) distinct terms.
Theorem 48. If ( x, y)  Â ( x, y) is strongly solved in x and cluster complete, then ( x, y)
computed by Algorithm 6 is an RLCC for ( x, y)/ x/Â ( x, y).
Proof. To show that computed by Algorithm 6 is an RLCC for
( x, y)/ x/Â ( x, y), we need to show, by Definition 45, the validity of
( x, y : ) ( x, y)  Â ( x, y)  (z : ) (z, y)  | x| = z , (52)
( y : )(z : )
(2)
( y)  (z, y)  ( x : ) ( x, y)  Â ( x, y)  | x| = z . (53)
The validity of (52) can be shown by a similar argument as was given for (11) in the proof of Theorem
34.
To prove (53) consider arbitrary assignments and
(y)
, assigning z := p and y := s respec-
tively, such that ( p, s) and
(2)
(s) hold. We have to show that there exists a term assignment
(x)
, assigning x := t such that |t| = | p| and (t, s) and Â (t, s) hold. As in the proof of Theorem
34, we incrementally construct
(x)
, starting with terms of smallest length.
To construct
(x)
, we order all term lengths according to the values assigned by :
|u
(1)
0 | =    = |u(1)
n1
|
block 1
< |u
(2)
0 | =    = |u(2)
n2
|
block 2
<       < |u
(j)
0 | =    = |u
(j)
nj |
block j
.
Let li be the length of terms in the ith block. For any TA-term t occurring in ( x, y), |t| ap-
pears in ( x, y) and hence in the above sequence. In general a block can contain more than one
cluster for each type. However, by Proposition 50, maximal clusters are mutually independent,
therefore, without loss of generality, we can assume each block i consists of only one maximal
i-cluster.
Beginning with terms in the first block, namely u
(1)
1 , . . . , u
(1)
n1 , we incrementally construct a sat-
isfying assignment for Â ( x, s). We only need to consider u
(j)
nj which contains x. Obviously u
(1)
i
(0 i n1) is either a constant or a variable in x as its length is the smallest. By Algorithm 6
we know that CNT 1
k,n1
(l1) is in ( x, y). As p satisfies ( x, s), there are at least n1 + 1 differ-
ent terms of length l1. Therefore we can simply assign each u
(1)
i (if it is a variable) a distinct 1-
term.
Now suppose that all variables in the ith block have been assigned. Consider the (i+1)th block.
At this time values of all non-variable terms in the (i + 1)th block have been determined, because x
only appear inside constructor terms, and hence have been assigned values by the ith round. For ex-
ample, suppose that t(x) is a constructor term in the (i+1)th block. Since |x| < |t(x)|, x was assigned
by the ith round and so is the value of t(x). Due to the presence of CNT i+1
k,ni+1
(li+1) in ( x, y), we
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1571
are able to assign each variable in the (i + 1)th block a different term of length li+1. The variable
assignment in each round does not create any equality between terms in a block, i.e., it does not
violate any disequalities.
Up to now the proof is essentially the same as the one for Theorem 34. The only problematic
case is that a cluster may have selector terms containing y as well as constructor terms containing x.
For example, suppose that a cluster C in the ith block contains both Ly and i(t1( x, y), . . . , tk( x, y))
for ar( i) = k. [[Ly]] is fixed by
(y)
and [[t1( x, y)]], . . . , [[tk( x, y)]] have been determined before
the ith round and hence [[ i(t1( x, y), . . . , tk( x, y))]] is determined too. By Definition 46, however,
the disequality Ly = t( x, y) is redundant, which can only be the case if for some j (1 j k),
s i
j Ly = tj( x, y) is implied by . Since |tj( x, y)| < | i(tj( x, y), . . . , tk( x, y))|, tj( x, y) has been as-
signed before the ith round such that [[s i
j Ly]] = [[tj( x, y)]]. As a consequence we have [[Ly]] =
[[ i(t1( x, y), . . . , tk( x, y))]].
Since at each step we can build a satisfying partial assignment for x, by induction, we can even-
tually construct a satisfying assignment t such that (t, s) and |t| = p. It is clear that Â (t, s) also
holds as implies Â (thanks to step (2) of Algorithm 6).
Proposition 50. Algorithm 11 produces a formula which is in strongly solved form and cluster complete.
Proof. The production of a strongly solved form is guaranteed by Algorithm 13. It suffices to show
that Algorithm 13 preserves cluster completeness. Step (1a) of Algorithm 13 may generate new
equalities as well as new disequality literals of the form Ly = t ( x, y), which in general destroys the
mutual independence of clusters. Step (2) (Algorithm 12), however, is called after each run of step
(1a) to restore cluster completeness. The termination argument is given in the description of the
algorithm.
Theorem 51. All transformations in Algorithm 14 preserve equivalence.
Proof. Clearly, Algorithm 11 preserves equivalence. So, it suffices to show the equivalence
between
( x : )
(1)
( x, y) 
(2)
( y)  Â ( x, y)  ( x, y, z) (59)
and
(2)
( y)  ( u : ) ( u, y)  ( u, y, z) . (61)
By Proposition 50 ( x, y)  Â ( x, y) is in strongly solved form and cluster complete. It then fol-
lows from Theorem 48 that ( x, y) is an RLCC for ( x, y)/ x/Â ( x, y). By Proposition 22,
( x, y)  ( x, y, z) is an RLCC for ( x, y)/ x/(Â ( x, y)  ( x, y, z)).
Toshowthat(59)implies(61),assumethatforall y andforall z thereexists x suchthat
(1)
( x, y) 
(2)
( y)  Â ( x, y)  ( x, y, z) holds. Then by Definition 45, (52), there exists u = | x| such that
( u, y)  ( u, y, z) holds, and hence (61) holds.
To show that (61) implies (59), assume that for all y and for all z there exists u such that
(2)
( y) 
( u, y)  ( u, y, z) holds. Then by Definition 45, (53), there exists x such that | x| = u and
( x, y)  Â ( x, y)  ( x, y, z) holds, and hence, noting that ( x, y) 
(1)
( x, y) 
(2)
( y), (59)
holds, as required.
1572 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
Theorem 52. Algorithm 14 eliminates a block of quantifiers in time 2O(n2 lg n).
Proof. We only need to analyze each step of Algorithm 11. As shown in the proof of Theorem 40,
step (1) (basic normalization, Algorithm 8) generates 2O(n) disjuncts each of which contains O(n)
distinct terms.
Step (2) (cluster completion, Algorithm 12) produces 2O(nlg n) cluster completions. An equality
completion is a valid product of a partition of terms on syntactic equality and a partition on terms
on length equality. For a set of size n the number of distinct partitions is equal to the Bell number,
B(n). An asymptotical expression for B(n) is 1
n
(n)n+ 1
2 e (n)-n-1, where (n) is implicitly defined
by (n) lg (n) = n [20]. Obviously B(n) is bounded by 2O(nlg n), and so is the number of cluster
completions.
Step (3) (Algorithm 13) is a bit more costly. Let the input to Algorithm 13 be a formula of the
form (43) that induces n clusters containing variables in x. Assign to the rank r of each of these
clusters a number p(r) in [1..n] such that if r1 < r2, then p(r1) < p(r2). The sum of all rank numbers
is bounded by n2. Each run of step (1-2) reduces the sum by at least 1, and generates at most one new
term appearing in disequalities. The total number of distinct terms after Algorithm 13 is bounded by
O(n2). The number of cluster completions is therefore bounded by 2O(n2 lg n2) = 2O(n2 lg n), including
the cost of Algorithm 12 and Algorithm 9, both of which may be called O(n2) times in the run of
Algorithm 13.
References
[1] A. Armando, S. Ranise, M. Rusinowitch, Uniform derivation of decision procedures by superposition, in: Lecture
Notes in Computer Science, vol. 2142, 2001, pp. 513­527 .
[2] R. Backofen, A complete axiomatization of a theory with feature and arity constraints, Journal of Logical Program-
ming 24 (1 and 2) (1995) 37­71.
[3] N.S. Bjrner, A. Browne, M. Colón, B. Finkbeiner, Z. Manna, H.B. Sipma, T.E. Uribe, Verifying temporal properties
of reactive systems: a STeP tutorial, Formal Methods in System Design 16 (3) (2000) 227­270.
[4] N.S. Bjrner, Integrating decision procedures for temporal verification, Ph.D. thesis, Computer Science Department,
Stanford University, 1998.
[5] H. Comon, C. Delor, Equational formulae with membership constraints, Information and Computation 112 (2) (1994)
167­216.
[6] H. Comon, P. Lescanne, Equational problems and disunification, Journal of Symbolic Computation 7 (1989) 371­425.
[7] D.C. Cooper, Theorem proving in arithmetic without multiplication, in: Machine Intelligence, vol. 7, American
Elsevier, 1972, pp. 91­99.
[8] T. Zhang, Arithmetic Integration of Decision Procedures, PhD thesis, Computer Science Department, Stanford
University, June 2006.
[9] J. Downey, R. Sethi, R.E. Tarjan, Variations of the common subexpression problem, Journal of ACM 27 (1980)
758­771.
[10] H.B. Enderton, A Mathematical Introduction to Logic, Academic Press, New York, 2001.
[11] S. Feferman, R. Vaught, The first order properties of products of algebraic systems, Fundamenta Mathematicae 47
(1959) 57­103.
[12] S. Ghilardi, Model-theoretic methods in combined constraint satisfiability, Journal of Automated Reasoning 33 (3-4)
(2005) 221­249.
[13] W. Hodges, Model Theory, Cambridge University Press, Cambridge, UK, 1993.
T. Zhang et al. / Information and Computation 204 (2006) 1526­1574 1573
[14] J. Hopcroft, J. Ullman, Introduction to Automata Theory, Langages, and Computation, Addison-Wesley Publishing
Company, Reading, MA, 1979.
[15] K. Korovin, A. Voronkov, A decision procedure for the existential theory of term algebras with the Knuth-
Bendix ordering, in: Proceedings of the 15th Annual Symposium on Logic in Computer Science, 2000,
pp.291­302.
[16] K. Korovin, A. Voronkov, Knuth-Bendix constraint solving is NP-complete, in: Proceedings of 28th International
Colloquium on Automata, Languages and Programming (ICALP'01), Lecture Notes in Computer Science, vol. 2076,
Springer-Verlag, Berlin, 2001, pp. 979­992.
[17] V. Kuncak, M. Rinard, An algorithm for deciding BAPA: Boolean algebra with Presburger arithmetic, in: R.
Nieuwenhuis (Ed.), 20th International Conference on Automated Deduction (CADE'05), Lecture Notes in Computer
Science, vol. 3632, Springer-Verlag, Berlin, 2005, pp. 260­277.
[18] V. Kuncak, M. Rinard, The structural subtyping of non-recursive types is decidable, in: Proceedings of the 18th
Annual Symposium on Logic in Computer Science, IEEE Computer Society Press, Sliver Spring, MD, 2003,
pp. 96­107.
[19] V. Kuncak, M. Rinard, On the theory of structural subtyping, Technical Report MIT-LCS-TR-879, Massachusetts
Institute of Technology, 2003.
[20] L. Lovász, Combinatorial Problems and Exercises, Elsevier, North-Holland, 1993.
[21] M.J. Maher, Complete axiomatizations of the algebras of finite, rational and infinite tree, in: Proceedings of the
3rd Annual Symposium on Logic in Computer Science, IEEE Computer Society Press, Sliver Spring, MD, 1988,
pp. 348­357.
[22] G.S. Makanin, The problem of solvability of equations in a free semigroup, Math. Sbornik 103 (1977) 147­236.
[23] A.I. Mal'cev, Axiomatizable classes of locally free algebras of various types, in: The Metamathematics of Algebraic
Systems, Collected Papers, North Holland, 1971, Ch. 23, pp. 262­281.
[24] G. Nelson, D.C. Oppen, Fast decision procedures based on congruence closure, Journal of ACM 27 (2) (1980) 356­364.
[25] G. Nelson, D.C. Oppen, Simplification by cooperating decision procedures, ACM Transactions on Programming
Languages and Systems 1 (2) (1979) 245­257.
[26] D.C. Oppen, Reasoning about recursively defined data structures, Journal of ACM 27 (3) (1980) 403­411.
[27] D.C. Oppen, Elementary bounds for Presburger arithmetic, in: Proceedings of 5th ACM Symposium on Theory of
Computing, 1973, pp. 34­37.
[28] C.R. Reddy, D.W. Loveland, Presburger arithmetic with bounded quantifier alternation, in: Proceedings of the 10th
Annual Symposium on Theory of Computing, ACM Press, New York, 1978, pp. 320­325.
[29] P. Revesz, Quantifier-elimination for the first-order theory of Boolean algebras with linear cardinality constraints, in:
Proceedings of Advances in Databases and Information Systems (ADBIS'04), Lecture Notes in Computer Science,
vol. 3255, Springer-Verlag, Berlin, 2004, pp. 1­21.
[30] T. Rybina, A. Voronkov, A decision procedure for term algebras with queues, ACM Transactions on Computational
Logic 2 (2) (2001) 155­181.
[31] T. Skolem, Untersuchungen über die axiome des klassenkalküls und über produktations- und summationsprobleme,
welche gewisse klassen von aussagen betreffen, in: J.E. Fenstad (Ed.), Selected Works in Logic, Universitetsforlaget,
1970, pp. 67­101.
[32] C. Tinelli, M.T. Harandi, A new correctness proof of the Nelson­Oppen combination procedure, in: F. Baader, K.U.
Schulz (Eds.), Frontiers of Combining Systems: Proceedings of the 1st International Workshop (Munich, Germany),
Applied Logic, Kluwer Academic Publishers, Dordrecht, 1996, pp. 103­120.
[33] C. Tinelli, C. Ringeissen, Unions of non-disjoint theories and combinations of satisfiability procedures, Theoretical
Computer Science 290 (1) (2003) 291­353.
[34] C.Tinelli,C.G.Zarba,Combiningdecisionproceduresforsortedtheories,in:J.J.Alferes,J.A.Leite(Eds.),Proceedings
of the 9th European Conference on Logic in Artificial Intelligence (JELIA'04), Lecture Notes in Computer Science,
vol. 3229, Springer, Berlin, 2004, pp. 641­653.
[35] K.N. Venkataraman, Decidability of the purely existential fragment of the theory of term algebras, Journal of ACM
34 (2) (1987) 492­510.
[36] C.G. Zarba, Combining sets with integers, in: A. Armando (Ed.), Frontiers of Combining Systems, Lecture Notes in
Artificial Intelligence, vol. 2309, Springer, Berlin, 2002, pp. 103­116.
1574 T. Zhang et al. / Information and Computation 204 (2006) 1526­1574
[37] C.G. Zarba, Combining multisets with integers, in: A. Voronkov (Ed.), Proceedings of the 18th International Confer-
ence on Automated Deduction, Lecture Notes in Artificial Intelligence, vol. 2392, Springer, Berlin, 2002, pp. 363­376.
[38] T. Zhang, H.B. Sipma, Z. Manna, Decision procedures for recursive data structures with integer constraints, in:
2th International Joint Conference on Automated Reasoning (IJCAR'04), Lecture Notes in Computer Science,
vol. 3097, Springer-Verlag, Berlin, 2004, pp. 152­167.
[39] T. Zhang, H.B. Sipma, Z. Manna, Term algebras with length function and bounded quantifier alternation, in: The 17th
International Conference on Theorem Proving in Higher Order Logics (TPHOLs'04), Lecture Notes in Computer
Science, vol. 3223, Springer-Verlag, Berlin, 2004, pp. 321­336.
[40] T. Zhang, H.B. Sipma, Z. Manna, The decidability of the first-order theory of term algebras with Knuth-Bendix
order, in: R. Nieuwenhuis (Ed.), The 20th International Conference on Automated Deduction (CADE'05), Lecture
Notes in Computer Science, vol. 3632, Springer-Verlag, Berlin, 2005, pp. 131­148.
[41] T. Zhang, H.B. Sipma, Z. Manna, Decision procedures for queues with integer constraints, in: R. Ramanujam,
S. Sen (Eds.), Proceedings of the 25th International Conference on the Foundatations of Software Technology
and Theoretical Computer Science (FSTTCS'05), Lecture Notes in Computer Science, vol. 3821, Springer Verlag,
Hyderabad, India, 2005, pp. 225­237.